SLIDES & TRANSCRIPTS
Tuesday, February 15, 2000

What Model in Clinical Design Should We Use to Examine New Cytostatic Compounds?
Edward Korn, PhD

Slide 1:

DR. MEROPOL: I would like to introduce Ed Korn, who will discuss some of the statistical issues that we might face in the development of new agents with cytostatic potential.

DR. KORN: Thanks, Neal.

I would like to thank Dr. Meropol for inviting me to add some statistical considerations to what has been previously discussed.


Slide 2:

After talking about Phase I dose-finding trials, I am going to talk about preliminary efficacy trials and comparative efficacy trials, and I use these words rather than Phase II and Phase III trials because those words have started to get confusing.

Slide 3:

As has been mentioned, in the Phase I trial, rather than using toxicity, you could consider using an alternative biologic end point, and we have heard many discussed today already. However, if you do that, the end point should be able to be measured reproducibly, and it also should be sufficient to determine a dose where efficacy testing is appropriate, and that really has two parts to it. One is that if the biologic end point was actually achieved before you started seeing toxicity, you would call it quits and go ahead with that dose of the agent for further testing. You wouldn't extend the dose higher.

The other part is that if you actually got to an MTD for toxicity and you didn't see the amount of biologic response you were looking for, you would say, okay, I don't want to proceed with this agent anymore. If you don't really satisfy those two conditions, then you probably are going to end up doing a Phase I trial to toxicity, and that would be the appropriate thing to do.

Slide 4:

If you decide to use a biologic end point, you have your sample size issues, and what sample size you use usually depends on exactly what question you are asking, and there are, it seems, just three questions you might ask. You could ask, is there a dose response? You could ask, is there a "minimum effective dose"? You could ask, is there an "optimum biologic dose"? I put those last two terms in quotes because different people define them in different ways.

No matter which of these three questions you ask though you are going to be into a larger sample size than in the traditional Phase I trial. I will give you some examples of that now.

Slide 5:

Let us follow the kind of general scheme that Dr. Eisenhauer mentioned, which is, let us say you do a somewhat quick escalation to MTD, one to three patients at each dose level, and now you want to ask the question, is there a dose response? At first I want to consider a binary yes/no end point, and assume that at the MTD you actually have 90 percent of the patients who have that biologic response.

Then you pick another dose quite a bit smaller than the first one, and you treat a bunch of patients there, too, and you want to see whether the proportion of patients there who have the response is lower.

Now depending upon how different you think those proportions are determines your sample size. So if you are in a case where you didn't actually think there was that big a difference, 90 percent of the patients at the MTD and 70 percent at this lower dose would have this biologic response, then you would need 76 patients per arm, 152 patients in total to have any sort of reasonable power to detect that dose response.

Of course, if you hypothesize a much larger difference in the biologic response, then you could get by with a much smaller sample size.

Sometimes it is said or implied that, well, why don't we test at a bunch of doses, and then we will need fewer patients. That is not really quite right. Suppose you tested it at three doses instead of two doses. It is true that you would need fewer patients at each dose, but in fact, the total number of patients you would need for detecting the same difference would actually be larger. So instead of 152, you would need 216 or, in this larger difference case, instead of 34 you would need 42.

Now, of course, it is interesting to test at more than two doses because you get a feel for what the dose-response curve looks like, but it doesn't save you in the total number of patients you have to accrue.
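
To make the dependence on the hypothesized difference concrete, here is a minimal sketch of a two-arm sample size calculation for a binary end point. It uses a normal approximation without continuity correction and assumes a one-sided alpha of 0.05 and 90 percent power. Those settings, the function name, and the 50 percent comparison are illustrative assumptions; the talk does not state how the quoted figures were computed, so the output will not necessarily match them.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm_two_proportions(p_high, p_low, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect p_high vs p_low.

    One-sided test, normal approximation, no continuity correction.
    alpha and power are assumed values, not taken from the talk.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_high + p_low) / 2
    # Variance term under the null (pooled) and under the alternative.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p_high * (1 - p_high) + p_low * (1 - p_low))
    return ceil(((null_term + alt_term) / (p_high - p_low)) ** 2)


# 90% at the MTD versus 70% (from the talk) and a hypothetical 50%.
for p_low in (0.70, 0.50):
    n = n_per_arm_two_proportions(0.90, p_low)
    print(f"90% vs {p_low:.0%}: about {n} per arm, {2 * n} in total")
```

The point of the sketch is the shape of the relationship: the required number of patients grows roughly with the inverse square of the difference in response proportions, so halving the difference roughly quadruples the trial.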

Slide 6:

Instead of using the binary response, you might use a continuous response which would normally be a good idea because continuous measurements have more information than binary yes/no measurements. The same idea, you might escalate to MTD and then back off and do a randomized study at the MTD and a lower dose, and if you thought that the mean response was 100 at the MTD and 80 at the lower dose and standard deviation was about 30 across patients at each dose, then you would need 84 patients to detect that dose response.

This sample size is very sensitive to the standard deviation here. So, in fact, if you had a standard deviation of 20 instead of 30 you would get by with a much smaller sample size or if you thought that the difference in mean response was quite a bit bigger then you could get by with a smaller sample size.

It is important when you go ahead and try to design these studies, or better yet have your statistician try to design these studies, that you have some idea what the standard deviation is, and a good place to get that information is when you are doing that initial escalation up to the MTD, even though you are using toxicity to drive that escalation, go ahead and record your biologic variables so you can get some background data on how variable they are.

You would be surprised at how many studies we see that people want to do this kind of analysis and have no idea what the background variability of the data is because they have yet to actually record it on patients.
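
The sensitivity to the standard deviation can be seen in a minimal sketch of the usual normal-approximation formula for comparing two means. The one-sided alpha of 0.05, the 90 percent power, and the function name are assumptions made for illustration; the talk does not give the settings behind its figure of 84, so the numbers here are only expected to be of the same order.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_two_means(delta, sd, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect a difference in means of delta.

    One-sided test with a common standard deviation sd in both arms.
    alpha and power are assumed values, not taken from the talk.
    """
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return ceil(2 * (z * sd / delta) ** 2)


# Mean response 100 at the MTD versus 80 at the lower dose.
for sd in (30, 20):
    n = n_per_arm_two_means(delta=20, sd=sd)
    print(f"SD {sd}: about {n} per arm, {2 * n} in total")
```

Because the sample size is proportional to the square of the standard deviation, going from a standard deviation of 30 to 20 cuts the requirement by a factor of about (20/30) squared, a little under half, which is why a poor guess at the background variability can badly mis-size the trial.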

Slide 7:

So in summary of the Phase I issues, the alternative endpoint should be able to be measured reproducibly. It should be sufficient to determine where efficacy testing would be appropriate. Trials with alternative end points are going to require larger sample sizes, and if the alternative end point is continuous rather than binary, then background variability data will be necessary to give a sample size calculation.

Slide 8:

I would like to switch over now to preliminary efficacy trials and discuss seven different options. The first one: if the agent has some cytotoxic activity, then you could use a standard Phase II two-stage design and get by with a relatively small number of patients and move on. So before you throw away this option, and I think Elizabeth stressed this also, give it serious thought. Maybe you have some responses there, and you can go ahead and use the standard design.

Slide 9:

If not, and you can identify meaningful survival or progression-free survival targets, one could still use a standard Phase II design. So you can target 30 percent versus 10 percent progression-free survival at one year, provided that you know and have data that would suggest strongly that the progression-free survival rate without the agent would be 10 percent at one year. So you are trying to improve it to 30 percent. Or you could target 12 versus 6 months median survival, again, if you have background historical data that 6 months is about what you would expect without the agent. The sample size here would be relatively small. It is a usual Phase II kind of sample size, 30 to 50 patients.

You can still use two-stage designs here, but you have to be a little careful because you probably don't want to wait around after you have accrued the first stage to see what their survival experience is before you go on to the second stage, and there are some statistical designs that allow for over accrual at the first stage. So you don't have to shut things down, and another option to consider is if you actually saw any objective tumor responses in the first stage go ahead to the second stage no matter what, again, so you don't shut things down while you are waiting for that survival experience.
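
As a rough check on the "30 to 50 patients" range, the sketch below searches for the smallest single-stage, one-arm design that distinguishes a 10 percent from a 30 percent one-year progression-free rate using exact binomial probabilities. It is a simplification under stated assumptions: each patient is treated as a simple yes/no at one year (no censoring), the design is single-stage rather than two-stage, and the one-sided alpha of 0.05, the 90 percent power, and the function names are illustrative choices not taken from the talk.

```python
from math import comb


def upper_tail(n, r, p):
    """P(X >= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def single_stage_design(p0, p1, alpha=0.05, power=0.90, max_n=200):
    """Smallest n, with cutoff r, such that declaring the agent promising
    when at least r of n patients are progression-free has type I error
    <= alpha under p0 and power >= power under p1. Returns None if no
    design is found up to max_n.
    """
    for n in range(1, max_n + 1):
        for r in range(n + 1):
            if upper_tail(n, r, p0) <= alpha:
                # Smallest cutoff meeting alpha gives the best power for this n.
                if upper_tail(n, r, p1) >= power:
                    return n, r
                break
    return None


n, r = single_stage_design(p0=0.10, p1=0.30)
print(f"Treat {n} patients; call the agent promising if at least {r} "
      f"are progression-free at one year")
```

The whole calculation rests on the 10 percent historical rate being right, which is why the quality of the historical data matters so much for this option.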

Slide 10:

So in order to use this kind of standard design, targeting a survival or progression-free survival difference, you need some historical data on survival or progression-free survival in a group of patients, and here comes my dream, like Dan Sargent's this morning: with similar stage of disease, similar amount of prior treatment, similar organ function and performance status, for whom the same procedures were used for monitoring disease progression (if you are using progression-free survival), and treated at the same institutions with the same referral patterns and in a recent era. So this is the ideal. If you had data like that, I think people would feel comfortable in going ahead and using a standard Phase II type design.

Slide 11:

Some other options: if you can identify a meaningful clinical benefit target, say, improvement in quality of life or reduction in pain, you could also use a standard Phase II design, the idea being that you would be unlikely to see this clinical benefit without some active agent. So any benefit you see is due to the agent.

However, if you do that, you have to assume that the agent in question is going to affect the clinical end point in question. You are going to have to restrict eligibility to patients for whom the clinical parameters can be measured. So if you look at pain reduction, obviously they have to have pain, and there could be placebo effects here that might give you a false positive trial, but that would not be the end of the world here because you are just using this as a screening trial. If you make it through this hump, it is going to go to a randomized trial with a control arm.

Slide 12:

The fourth option is to identify a meaningful biologic end point, say, some sort of immune response, where here again you could get by with a small trial. However, as Dr. Eisenhauer mentioned, there are a bunch of difficulties with this. It may be difficult to accrue patients for whom the biologic end point can be measured. If the agent acts by a different mechanism than hypothesized, the trial may lead to the wrong conclusions. Biologic studies can be resource intensive, and even if an agent demonstrates biologic activity, some may not consider that sufficient evidence to move on to a randomized trial in the absence of any evidence of clinical benefit.

Slide 13:

The fifth option: you can compare the tumor growth before and after the beginning of treatment, using each patient as his own control. However, tumor growth may slow down as the tumor gets larger, even in the absence of an effective agent, giving you a misleading result.

You need to have a measure of tumor growth before the agent is started which may lead to logistic difficulties if the patients are being referred to another institution for the new agent. You have to have recorded their tumor growth before they are actually registered on the study.

Explicit rules need to be developed for definition of successful outcome, here especially in the presence of new mets, either before or after the agent, and how do you factor that into your tumor growth measurements?

Care must be taken about measurement error and regression to the mean. In the simplest case you can think of, suppose the tumors are growing at a constant rate even after you start the cytostatic agent. By chance, with any sort of measurement error, it is going to look like half of the tumors have slowed down and half have gotten larger. So you have to be a little bit careful about that.
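
The "half look slowed down" point can be illustrated with a small simulation. This is a sketch under simple assumptions that are not from the talk: three equally spaced measurements per patient, linear growth in arbitrary units, independent Gaussian measurement error, and hypothetical parameter values and function name.

```python
import random


def fraction_appearing_slowed(n_patients=10_000, true_growth=1.0,
                              noise_sd=0.5, seed=0):
    """Simulate tumors that grow at a constant rate with no treatment effect.

    Each tumor is measured at times 0, 1 and 2, with treatment starting
    at time 1. Returns the fraction whose measured growth after treatment
    is smaller than their measured growth before treatment.
    """
    rng = random.Random(seed)
    slowed = 0
    for _ in range(n_patients):
        # True size is true_growth * t; each reading adds measurement error.
        sizes = [true_growth * t + rng.gauss(0, noise_sd) for t in (0, 1, 2)]
        growth_before = sizes[1] - sizes[0]
        growth_after = sizes[2] - sizes[1]
        if growth_after < growth_before:
            slowed += 1
    return slowed / n_patients


print(f"Fraction apparently slowed: {fraction_appearing_slowed():.2f}")
```

The middle measurement enters both growth estimates with opposite signs, so a reading that happens to be high at the start of treatment inflates the apparent growth before and deflates it after. A success rule based on apparent slowing therefore has to be calibrated against this chance rate of about one half, not against zero.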

Slide 14:

So far all the options have involved one-arm trials. You could also do a multi-arm selection design with a relatively small number of patients per arm. The idea with selection designs is to pick the best arm, the arm with the best clinical outcome, and when you are done with this kind of trial, you are not sure that you have really gotten the best of the agents, but you can be sure that you haven't gotten a really bad agent. So sometimes this is a useful design when you have a lot of agents available and you want to move forward with one of them.

However, drug companies and investigators may not want to stop development of an agent just because it is not a great deal better than a competing agent. So if there are agents from different drug companies this is probably not going to work because they are not going to stop their development.

In Langdon's case, since his company is starting to own all the other companies, this probably wouldn't be an issue, but I guess this design would work if you were trying to test a bunch of different schedules with the same agent, because then it is one investigator and one company involved.

The design should probably include rules for not going forward with any of the arms or any of the agents if you see insufficient activity, but now you are back into the same problems you had with the one-arm trial -- how do you define insufficient activity?

Slide 15:

The last option for preliminary efficacy trials is not to do them and just go straight to the comparative randomized trial. As I will discuss in a moment, that kind of trial doesn't have to be super large.

Slide 16:

So just a review of the Phase II options. If you ask me how I would rate these things, if you think you can get objective response rates, I would go for one and use a standard design. If you don't, and you have any sort of reasonable historical data, then I think I would go for two and use a standard design again, but now based on the historical survival or progression-free survival rate. If neither of those two options applies, I think I would probably go directly to a small randomized trial, as I will now describe.

Slide 17:

So in comparative efficacy trials you actually compare the efficacy of the two arms now, and there are different ways you could set these up. You could have a cytostatic agent versus placebo. You could have a cytotoxic agent plus or minus a cytostatic agent, and I will end with just a brief mention of factorial designs.

Slide 18:

If you are doing a cytostatic agent versus placebo, there are really two modes you can do this in. You can do it in a screening mode or definitive mode, and the real difference is what alpha level or how much evidence you require to say that the agent is a successful improvement over placebo.

With screening mode you might use an alpha of .20 and in definitive mode an alpha of .05. This would mean in a screening mode, if at the end of the trial you saw a P value of .15, rather than saying, too bad, it didn't work, you would say, no, this looks promising, let's go on with it.

The price you pay for that is that you have to go on to another trial in the screening mode. The benefit you get from the screening mode is you can get by with a smaller sample size. For instance, if you were targeting improvement in median survival or progression-free survival from 6 to 9 months, instead of needing 260 patients, you could get by with 135 patients.
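
The rough halving between the two modes can be reproduced with a minimal sketch based on the widely used Schoenfeld approximation for the number of events needed in a two-arm survival comparison. The assumptions here are not from the talk: exponential survival (so the hazard ratio equals the ratio of medians), 1:1 randomization, one-sided alpha levels, 90 percent power, and a hypothetical function name. The sketch counts events rather than patients; converting to patients needs accrual and follow-up assumptions that the talk does not give.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(median_control, median_treated, alpha, power=0.90):
    """Approximate total events needed (both arms combined).

    Schoenfeld approximation with 1:1 randomization and a one-sided test.
    Assumes exponential survival, so hazard ratio = ratio of medians.
    """
    hazard_ratio = median_control / median_treated
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return ceil(4 * z**2 / log(hazard_ratio) ** 2)


definitive = required_events(6, 9, alpha=0.05)
screening = required_events(6, 9, alpha=0.20)
print(f"Definitive mode (alpha .05): about {definitive} events")
print(f"Screening mode (alpha .20): about {screening} events")
print(f"Ratio: {screening / definitive:.2f}")
```

Under these assumptions, relaxing alpha from .05 to .20 cuts the required number of events roughly in half, which is consistent in proportion with the drop from 260 to 135 patients quoted above.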

Slide 19:

In trying to choose which mode to use, if there has already been a preliminary efficacy trial done, obviously you don't need to do it again, and you might go to the definitive trial.

If the preclinical evidence is really strong, then you might go to a definitive trial. If you have many agents available right now for testing, you might want to go into a screening mode, since you don't have enough patients to do definitive trials in all of them. Again, if there are few patients available at a limited number of institutions, you are probably going to be in the screening mode just because you cannot mount a large trial there.

Slide 20:

There is the issue of what the appropriate end point is, whether you choose survival or progression-free survival. Survival is usually our gold standard here, but in this trial of cytostatic agent versus placebo there may be a reluctance of patients to participate in a trial that doesn't have active therapy in one of the arms, and if you choose progression-free survival as the end point, you can cross the patients over to the cytostatic agent at progression. So I think that makes it more acceptable.

If you use progression-free survival you have to be careful about bias. So you might want to run that trial double blind if you can or include other mechanisms to avoid bias there.


Slide 21:

In the trial of a cytotoxic agent plus or minus the cytostatic agent, you are going to have concerns perhaps about interactions between the agents, especially a negative interaction, which would give you a negative result of the trial. You can try cycling the agents or giving the cytotoxic first. That may not be the optimal way to treat patients with the cytostatic. Ideally, if you are going to do this, the efficacy interactions would be studied in preclinical models to give you a handle on whether you are going to have this interaction or not, and if you are worried about PK interactions, they could be studied in a small number of patients.

Slide 22:

Let me just end with a slide mentioning factorial designs because I don't think we see enough of them. If you have a trial that is testing two different cytotoxics, to add on a factorial question of a cytostatic doesn't really increase the sample size by very much. You get the answer to two questions for the price of one, almost. It also has the advantage that all of the patients are getting active therapy on all the arms. So you also don't have that problem of just the cytostatic versus placebo.

Let me stop here and turn it back over to Neal.

(Applause.)
