SLIDES & TRANSCRIPTS
Tuesday, February 15, 2000
What Model in Clinical Design Should We Use to Examine New Cytostatic Compounds?
Edward Korn, PhD
Slide 1:
DR. MEROPOL: I would like to introduce Ed Korn who will discuss some
of the statistical issues that we might face in the development
of new agents with cytostatic potential.
DR. KORN: Thanks, Neal.
I would like
to thank Dr. Meropol for inviting me to add some statistical considerations
to what has been previously discussed.
Slide 2:
After talking about Phase I dose-finding trials, I am going to talk about
preliminary efficacy trials and comparative efficacy trials, and
I use these words rather than Phase II and Phase III trials because
those words have started to get confusing.
Slide 3:
As has been mentioned, in the Phase I trial, rather than using toxicity
you could consider using an alternative biologic end point, and
we have heard many discussed today already. However, if you do that
the end point should be able to be measured reproducibly, and it also
should be sufficient to determine a dose where efficacy testing
is appropriate, and that really has kind of two parts to it. One
is that if the biologic end point was actually achieved before you
started seeing toxicity, you would call it quits and go ahead with
that dose of the agent for further testing.
You wouldn't
extend the dose higher, and the other part of that is that if you
actually got to an MTD for toxicity and you didn't see the biologic
response you were looking for, the amount of it, you would say,
okay, I don't want to proceed with this agent anymore, and if you
don't really satisfy those two conditions then you probably are
going to end up doing a Phase I trial to toxicity, and that would
be the appropriate thing to do.
Slide 4:
If you decide to use a biologic end point, you have your sample size
issues, and what sample size you use usually depends on exactly what
the question you are asking is, and there are, it seems, just
three questions you might ask. You could ask, is there a dose response?
You could ask, is there a "minimum effective dose"? You could ask,
is there an "optimum biologic dose"? I put those last two terms
in quotes because different people define them in different ways.
No matter which
of these three questions you ask though you are going to be into
a larger sample size than in the traditional Phase I trial. I will
give you some examples of that now.
Slide 5:
Let us follow the kind of general scheme that Dr. Eisenhauer mentioned,
which is, let us say you do a somewhat quick escalation to MTD, one
to three patients at each dose level, and now you want to ask the
question is there a dose response? At first I want to consider a
binary yes/no end point, and assume that at the MTD you actually
have 90 percent of the patients who have that biologic response.
Then you pick another dose quite a bit smaller than the first one, and you treat
a bunch of patients there, too, and you want to see whether a lower
proportion of patients there have the response.
Now depending
upon how different you think those proportions are determines your
sample size. So if you are in a case where you didn't actually think
there was that big a difference, 90 percent of the patients at the
MTD and 70 percent at this lower dose would have this biologic response,
then you would need 76 patients per arm, 152 patients in total to
have any sort of reasonable power to detect that dose response.
Of course, if
you hypothesize a much larger difference in the biologic response,
then you could get by with a much smaller sample size.
Sometimes it
is said or implied that, well, why don't we test at a bunch of doses,
and then we will need less patients. That is not really quite right.
Suppose you tested it at three doses instead of two doses. It is
true that you would need less patients at each dose, but in fact,
the total number of patients you would need for detecting the same
difference would actually be larger. So instead of 152, you would
need 216 or, in this larger difference case, instead of 34 you would
need 42.
Now, of course,
it is interesting to test at more than two doses because you get
a feel for what the dose-response curve looks like, but it doesn't
save you in the total number of patients you have to accrue.
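Editorial sketch: the two-dose comparison above (90 percent versus 70 percent responding) can be turned into a rough per-arm sample size with a normal approximation. The talk does not state the error rates behind its figures, so the one-sided alpha of 0.05, the power of 0.90, the continuity correction, and the function name below are all assumptions made for illustration. The result should land near the figure quoted in the talk but need not equal it.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm_two_proportions(p_high, p_low, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect p_high versus p_low.

    Assumptions (not from the talk): one-sided alpha, normal approximation
    with a pooled variance under the null, and a continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    diff = abs(p_high - p_low)
    p_bar = (p_high + p_low) / 2

    # Uncorrected sample size from the usual two-proportion formula.
    uncorrected = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_high * (1 - p_high) + p_low * (1 - p_low))
    ) ** 2 / diff ** 2

    # Continuity correction inflates the size, most noticeably for small trials.
    corrected = uncorrected / 4 * (1 + sqrt(1 + 4 / (uncorrected * diff))) ** 2
    return ceil(corrected)


if __name__ == "__main__":
    print(n_per_arm_two_proportions(0.90, 0.70))  # modest difference
    print(n_per_arm_two_proportions(0.90, 0.50))  # larger difference, fewer patients
```

Running it with a larger hypothesized difference shows the point made in the talk: the required number of patients drops quickly as the difference between doses grows.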
Slide 6:
Instead of using the binary response, you might use a continuous response
which would normally be a good idea because continuous measurements
have more information than binary yes/no measurements. The same
idea, you might escalate to MTD and then back off and do a randomized
study at the MTD and a lower dose, and if you thought that the mean
response was 100 at the MTD and 80 at the lower dose and standard
deviation was about 30 across patients at each dose, then you would
need 84 patients to detect that dose response.
This sample
size is very sensitive to the standard deviation here. So, in fact,
if you had a standard deviation of 20 instead of 30 you would get
by with a much smaller sample size or if you thought that the difference
in mean response was quite a bit bigger then you could get by with
a smaller sample size.
It is important
when you go ahead and try to design these studies, or better yet
have your statistician try to design these studies, that you have
some idea what the standard deviation is, and a good place to get
that information is when you are doing that initial escalation up
to the MTD, even though you are using toxicity to drive that escalation,
go ahead and record your biologic variables so you can get some
background data on how variable they are.
You would be
surprised at how many studies we see that people want to do this
kind of analysis and have no idea what the background variability
of the data is because they have yet to actually record it on patients.
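Editorial sketch: the sensitivity to the standard deviation described above follows from the standard two-sample formula, in which the sample size grows with the square of the standard deviation divided by the difference in means. The one-sided alpha of 0.05, the power of 0.90, and the function name are assumptions, since the talk does not give the settings behind its figure of 84 patients, so the numbers produced here will not necessarily match the slide.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_two_means(delta, sd, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect a difference in means of delta.

    Assumptions (not from the talk): one-sided alpha, equal standard deviation
    in both arms, and a normal approximation to the two-sample test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2)


if __name__ == "__main__":
    # Mean response 100 at the MTD versus 80 at the lower dose.
    for sd in (30, 20):
        print(sd, n_per_arm_two_means(delta=20, sd=sd))
```

Dropping the standard deviation from 30 to 20 cuts the required size by a factor of about (30/20)^2 = 2.25, which is why background variability data matter so much at the design stage.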
Slide 7:
So in summary of the Phase I issues, the alternative end point should
be able to be measured reproducibly. It should be sufficient to
determine where efficacy testing would be appropriate. Trials with
alternative end points are going to require larger sample sizes,
and if the alternative end point is continuous rather than binary,
then background variability data will be necessary to give a sample
size calculation.
Slide 8:
I would like to switch over now to preliminary efficacy trials and
discuss seven different options. The first one, if the agent has
some cytotoxic activity then you could use a standard Phase II two-stage
design and get by with a relatively small number of patients and
move on. So before you throw away this option, and I think Elizabeth
stressed this also, give it serious thought. Maybe you have some responses
there, and you can go ahead and use the standard design.
Slide 9:
If not, and you can identify meaningful survival, progression-free
survival targets, one could still use a standard Phase II design.
So you can target 30 percent versus 10 percent progression-free
survival at one year provided that you know and have data that would
suggest strongly that the progression-free survival rate without
the agent would be 10 percent at one year. So you are trying to
improve it to 30 percent or you could target 12 versus 6 months
median survival, and again, if you have background historical data
that 6 months is about what you would expect without the agent,
and the sample size here would be relatively small. It is a usual
Phase II kind of sample size, 30 to 50 patients.
You can still
use two-stage designs here, but you have to be a little careful
because you probably don't want to wait around after you have accrued
the first stage to see what their survival experience is before
you go on to the second stage, and there are some statistical designs
that allow for over accrual at the first stage. So you don't have
to shut things down, and another option to consider is if you actually
saw any objective tumor responses in the first stage go ahead to
the second stage no matter what, again, so you don't shut things
down while you are waiting for that survival experience.
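Editorial sketch: the 30 percent versus 10 percent progression-free survival target above can be sized as a simple one-arm exact binomial test. This is a single-stage design, not the two-stage design mentioned in the talk, and it assumes every patient is followed for the full year so that the end point is a plain yes/no. The alpha of 0.10, the power of 0.90, and the function names are assumptions made for illustration.

```python
from math import comb


def upper_tail(n, r, p):
    """P(X >= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(r, n + 1))


def smallest_single_stage_design(p0, p1, alpha=0.10, power=0.90, max_n=200):
    """Smallest n, with cutoff r, for a one-arm exact binomial design.

    The agent is declared promising if at least r of n patients are
    progression-free. Returns (n, r), or None if no design is found.
    """
    for n in range(1, max_n + 1):
        # Smallest cutoff that keeps the false-positive rate at or below alpha.
        # r = n + 1 always qualifies (empty tail), so the loop always breaks.
        for r in range(n + 2):
            if upper_tail(n, r, p0) <= alpha:
                break
        if upper_tail(n, r, p1) >= power:
            return n, r
    return None


if __name__ == "__main__":
    print(smallest_single_stage_design(p0=0.10, p1=0.30))
```

The search simply walks up the sample size until some cutoff satisfies both error-rate constraints; tighter error rates or a smaller targeted improvement push the answer toward the upper end of the usual Phase II range.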
Slide 10:
So in order to use this kind of standard design, targeting survival
or progression-free survival difference, you need some historical
data of survival, progression-free survival on a group of patients,
and here comes my dream, like Dan Sargent's this morning, with similar
stage of disease, similar amount of prior treatment, similar organ
function performance status, for whom the same procedures were used
for monitoring disease progression, if you are using progression-free
survival and treated at the same institutions with the same referral
patterns and in a recent era. So this is the ideal. If you had data
like that I think people would feel comfortable in going ahead and
using a standard Phase II type design.
Slide 11:
Some other options, if you can identify a meaningful clinical benefit
target, say, improvement in quality of life or reduction in pain,
you could also use a standard Phase II design, the idea being that
you would be unlikely to see this clinical benefit without some active
agent. So any benefit you see is due to the agent.
However, if
you do that, you have to assume that the agent in question is going
to affect the clinical end point in question. You are going to have
to restrict eligibility to patients for whom the clinical parameters
can be measured. So if you look at pain reduction, obviously they
have to have pain, and there could be placebo effects here that
might give you a false positive trial, but that would not be the
end of the world here because you are just using this as a screening
trial. If you make it through this hump, it is going to go to a
randomized trial with a control arm.
Slide 12:
The fourth option is to identify a meaningful biologic end point, say,
some sort of immune response where here again you could get by with
a small trial. However, as Dr. Eisenhauer mentioned, there are a
bunch of problems, difficulties with this. It may be difficult to
accrue patients for whom the biologic end point can be measured.
If the agent acts by a different mechanism than hypothesized the
trial may lead to the wrong conclusions. Biologic studies can be
resource intensive, and even if an agent demonstrates biologic activity,
some may not consider that sufficient evidence to move on to a randomized
trial in the absence of any evidence of clinical benefit.
Slide 13:
The fifth option, you can compare the tumor growth before and after
the beginning of treatment using each patient as his own control.
However, tumor growth may slow down as the tumor gets larger, even
in the absence of an effective agent, giving you a misleading result.
You need to
have a measure of tumor growth before the agent is started which
may lead to logistic difficulties if the patients are being referred
to another institution for the new agent. You have to have recorded
their tumor growth before they are actually registered on the study.
Explicit rules
need to be developed for definition of successful outcome, here
especially in the presence of new mets, either before or after the
agent, and how do you factor that into your tumor growth measurements?
Care must be taken about measurement error and regression to the mean, and in
the simplest case you can think of, suppose the tumors are growing
at a constant rate even after you start the cytostatic agent? By
chance with any sort of measurement error it is going to look like
half of the tumors have slowed down and half have gotten larger.
So you have to be a little bit careful about that.
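Editorial sketch: the constant-growth example above can be checked with a small simulation. Each tumor is measured at three equally spaced visits (before referral, at the start of the agent, and after treatment), the true growth per interval never changes, and only measurement error is added. The visit schedule, the noise level, and the function name are assumptions made for illustration.

```python
import random


def fraction_appearing_slowed(n_tumors=10_000, growth_per_interval=1.0,
                              measurement_sd=0.5, seed=0):
    """Fraction of tumors whose measured growth looks slower on treatment,
    even though the true growth rate is identical before and after."""
    rng = random.Random(seed)
    slowed = 0
    for _ in range(n_tumors):
        # True sizes are 0, g, 2g relative to an arbitrary baseline;
        # each measurement carries independent Gaussian error.
        m_before = rng.gauss(0.0, measurement_sd)
        m_start = growth_per_interval + rng.gauss(0.0, measurement_sd)
        m_after = 2 * growth_per_interval + rng.gauss(0.0, measurement_sd)

        pre_growth = m_start - m_before
        post_growth = m_after - m_start
        if post_growth < pre_growth:
            slowed += 1
    return slowed / n_tumors


if __name__ == "__main__":
    print(fraction_appearing_slowed())
```

Because the difference between the measured post-treatment and pre-treatment growth is pure noise centered on zero, about half of the tumors appear to slow down with an agent that does nothing, which is why an explicit definition of a successful outcome is needed.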
Slide 14:
So far all the options have involved one-arm trials. You could also
do a multi-arm selection design with a relatively small number of
patients per arm. The idea with selection designs is to pick the
best arm, the arm with the best clinical outcome, and when you are
done with this kind of trial, you are not sure that you have really
gotten the best of the agents, but you can be sure that you haven't
gotten a really bad agent. So sometimes this is a useful design
when you have a lot of agents available and you want to move forward
with one of them.
However, drug
companies and investigators may not want to stop development of
an agent just because it is not a great deal better than a competing
agent. So if there are agents from different drug companies this
is probably not going to work because they are not going to stop
their development.
In Langdon's
case, since his company is starting to own all the other companies,
this probably wouldn't be an issue, but I guess if you wanted a
test, this would work if you were trying to test a bunch of different
schedules with the same agent because then it is one investigator
and one company involved.
The design should
probably include rules for not going forward with any of the arms
or any of the agents if you see insufficient activity, but now you
are back into the same problems you had with the one-arm trial --
how do you define insufficient activity?
Slide 15:
The last option for preliminary efficacy trials is not to do them and
just go straight to the comparative randomized trial. As I will
discuss in a moment, that kind of trial doesn't have to be super
large.
Slide 16:
So just a review of the Phase II options, if you ask me how I would
rate these things, if you think you can get objective response rates
I would go for one and use a standard design. If you don't, and
you have any sort of reasonable historical data, then I think I
would go for two and use a standard design again but now based on
historical survival, progression-free survival rate. If neither
of those two options, I think I would probably go directly to a
small randomized trial as I will now describe.
Slide 17:
So in comparative efficacy trials you actually compare the efficacy
of the two arms now, and there are different ways you could set
these up. You could have a cytostatic agent versus placebo. You
could have a cytotoxic agent plus or minus a cytostatic agent, and
I will end with just a brief mention of factorial designs.
Slide 18:
If you are doing a cytostatic agent versus placebo, there are really
two modes you can do this in. You can do it in a screening mode
or definitive mode, and the real difference is what alpha level
or how much evidence you require to say that the agent is a successful
improvement over placebo.
With screening
mode you might use an alpha of .20 and in definitive mode alpha
.05. This would mean in a screening mode if at the end of the trial
you saw a P value of .15, before saying, too bad, it didn't work,
you would say, no, this looks promising, let's go on with it.
The price you
pay for that is that you have to go on to another trial in the screening
mode. The benefit you get from the screening mode is you can get
by with a smaller sample size. For instance, if you were targeting
improvement in median survival or progression-free survival from
6 to 9 months, instead of needing 260 patients, you could get by
with 135 patients.
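Editorial sketch: the saving from the screening mode can be illustrated with the usual approximation for the number of events a log-rank comparison needs. The calculation assumes exponential survival (so the hazard ratio equals the ratio of medians), 1:1 randomization, a one-sided alpha, and a power of 0.90; none of these settings is stated in the talk, and the function name is hypothetical. The output is a count of events (deaths or progressions), not patients. The patient numbers quoted in the talk also depend on accrual rate and follow-up, which are not given.

```python
from math import ceil, log
from statistics import NormalDist


def events_needed(median_control, median_treated, alpha, power=0.90):
    """Approximate total events for a two-arm log-rank test.

    Assumptions (not from the talk): exponential survival, equal allocation,
    one-sided alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    log_hazard_ratio = log(median_control / median_treated)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log_hazard_ratio ** 2)


if __name__ == "__main__":
    # Targeting an improvement in median from 6 to 9 months.
    print("definitive, alpha .05:", events_needed(6, 9, alpha=0.05))
    print("screening,  alpha .20:", events_needed(6, 9, alpha=0.20))
```

Under these assumptions the screening mode needs roughly half as many events as the definitive mode, in line with the roughly two-to-one ratio of patient numbers quoted above.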
Slide 19:
In trying to choose which mode to use, if there has already been a
preliminary efficacy trial done obviously you don't need to do it
again, and you might go to the definitive trial.
If the preclinical
evidence is really strong, then you might go to a definitive trial.
If you have many agents available right now for testing, you might
want to go into a screening mode since you don't have enough patients
to do definitive trials in all of them. Again, if there are few
patients available at a limited number of institutions you are probably
going to be in the screening mode just because you cannot mount
a large trial there.
Slide 20:
There is the issue of what the appropriate end point is, whether
you choose survival or progression-free survival. Survival is
usually our gold standard here, but in this trial of cytostatic
agent versus placebo there may be a reluctance of patients to
participate in a trial that doesn't have active therapy in one
of the arms, and if you choose progression-free survival as the
end point, you can cross the patients over to the cytostatic agent
at progression. So I think that makes it more acceptable.
If you use
progression-free survival you have to be careful about bias. So
you might want to run that trial double blind if you can or include
other mechanisms to avoid bias there.
Slide 21:
The trial of a cytotoxic agent plus or minus the cytostatic agent --
you are going to have concerns perhaps here about interactions between
the agents, especially negative interaction which would give you
a negative result of the trial. You can try cycling the agents or
giving the cytotoxic first. That may not be the optimal way to keep
patients with the cytostatic. Ideally if you are going to do this,
the efficacy interactions would be studied in preclinical models to
give you a kind of a handle whether you are going to have this interaction
or not, and if you are worried about PK interactions they could
be studied in a small number of patients.
Slide 22:
Let me just end with a slide mentioning factorial designs because I
don't think we see enough of them. If you have a trial that is testing
two different cytotoxics, to add on a factorial question of a cytostatic
doesn't really increase the sample size by very much. You get the
answer to two questions for the price of one almost. It also has
the advantage that all of the patients are getting active therapy
on all the arms. So you also don't have that problem of just the
cytostatic versus placebo.
Let me stop
here and turn it back over to Neal.
(Applause.)