No Evidence for Taxane/Platinum Pharmacogenetic Markers: Just Lack of Power?

Werner Vach, Department of Statistics, University of Southern Denmark, Odense, Denmark
Troels K. Bergmann, Clinical Pharmacology, Institute of Public Health, University of Southern Denmark, Odense, Denmark
Kim Brøsen, Clinical Pharmacology, Institute of Public Health, University of Southern Denmark, Odense, Denmark

To the Editor:

Recently, Marsh et al1 sought to validate 27 selected polymorphisms in 16 key genes using data from a clinical trial with about 900 patients. For none of the polymorphisms were they able to show a significant association with toxicity or outcome. Hence, they concluded that there are no clear candidates for taxane/platinum pharmacogenetic markers.

At first sight, a clinical trial with about 900 patients seems appropriate for a convincing validation study. However, a closer look reveals that the authors used a validation strategy so conservative that the study actually has poor power to detect genes with clinically relevant effects on toxicity. Therefore, the main conclusion of Marsh et al1 cannot be justified.

Several aspects of the validation strategy chosen by Marsh et al1 contribute to rather poor power, despite the starting point of about 900 patients:

  1. The authors analyze the two treatment arms separately, such that they actually start with about 450 patients.

  2. These 450 patients are divided into a development set with about 300 patients and a validation set with about 150 patients.

  3. The five binary outcomes considered for toxicity all have a rather low prevalence according to Table 1 in Marsh et al,1 ranging from 12.4% to 32.4%.

  4. The authors correct for multiple testing by applying a technique that controls the false discovery rate. Table 3 in Marsh et al1 implies that P values above .01 are not regarded as significant, so the authors effectively require a P value below .01 to pass the first stage, which is based on the development set.

  5. The authors apply a χ² test comparing the groups defined by the three genotypes. With 300 patients in the development set, such a test does not have impressive power. For example, if there are 150 patients homozygous for the wild type, 100 heterozygous patients, and 50 patients homozygous for the variant, and the true probability of toxicity increases from 14% to 31.5%, the power to obtain a P value below .01 is only 47%.

  6. The authors do not explicitly state the criteria to be fulfilled in the validation set. However, in the Results section they argue from lack of significance, so implicitly they require a P value below .05. With only 150 patients available, there is only moderate power to achieve this. For the example above, the power is only 41%.

  7. Because the authors require that a gene pass both the analysis based on the development set and the analysis based on the validation set, the power of the whole study is poor: the two sets are independent, so the power of the two analyses must be multiplied. In our example, the overall power is only 19%.
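The figures in points 5 to 7 can be approximated with a short calculation. The sketch below is not the authors' code. It assumes the usual noncentral chi-square approximation to the power of the Pearson test, an intermediate toxicity rate of 21% for heterozygous patients (a 50% relative increase per variant allele, which reproduces the stated 14% and 31.5% endpoints), and a validation set with the same 3:2:1 genotype split (75/50/25). All function names are illustrative.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper tail P(X > x) of a central chi-square with even df (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def power_chi2_2df(n, p, alpha):
    """Approximate power of the 2-df Pearson test on a 3 x 2 table.

    n: patients per genotype group, p: true toxicity rate per group.
    """
    total_n = sum(n)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / total_n
    # Noncentrality parameter of the Pearson statistic under the alternative
    lam = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / (p_bar * (1 - p_bar))
    # Chi-square(2) critical value: its upper tail is exp(-x / 2)
    crit = -2.0 * math.log(alpha)
    # Noncentral chi-square = Poisson(lam / 2) mixture of central chi-squares
    power, weight = 0.0, math.exp(-lam / 2.0)
    for j in range(200):
        power += weight * chi2_sf_even_df(crit, 2 + 2 * j)
        weight *= (lam / 2.0) / (j + 1)
    return power

# Assumed rates: 14% baseline, +50% per variant allele (21% is an assumption)
rates = [0.14, 0.14 * 1.5, 0.14 * 1.5 ** 2]

dev = power_chi2_2df([150, 100, 50], rates, alpha=0.01)  # development set
val = power_chi2_2df([75, 50, 25], rates, alpha=0.05)    # validation set (assumed split)
overall = dev * val                                      # both stages must be passed

print(f"development: {dev:.2f}, validation: {val:.2f}, overall: {overall:.2f}")
```

Under these assumptions the approximation lands near the 47%, 41%, and 19% quoted above. Small differences are to be expected, because the letter does not say how those figures were computed.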

Table 1. Simulated Power of the Strategy of Marsh et al1

We simulated the power of the strategy of Marsh et al1 for each single gene and for the five different toxicity outcomes, assuming a 50% increase in the true rate of toxicity when comparing heterozygous patients with patients homozygous for the wild type, and a further 50% increase for patients homozygous for the variant. We used the allele frequencies shown in Table 2 and the overall prevalences shown in Table 1 of Marsh et al.1 The results are shown in Table 1.
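A simulation of this kind can be sketched as follows. This is a minimal illustration, not the code behind Table 1. It assumes Hardy-Weinberg genotype frequencies derived from a single variant allele frequency, sets of exactly 300 and 150 patients, and thresholds of .01 and .05 in the two stages. The allele frequency of 0.4 in the usage line is hypothetical; the prevalence of 32.4% is the highest value quoted above.

```python
import math
import random

def p_value(counts, events):
    """Pearson chi-square P value for toxicity by genotype; empty groups are dropped."""
    groups = [(n, e) for n, e in zip(counts, events) if n > 0]
    total_n = sum(n for n, _ in groups)
    total_e = sum(e for _, e in groups)
    if len(groups) < 2 or total_e in (0, total_n):
        return 1.0  # test undefined: treat as not significant
    p_bar = total_e / total_n
    stat = sum((e - n * p_bar) ** 2 / (n * p_bar * (1 - p_bar)) for n, e in groups)
    if len(groups) == 3:
        return math.exp(-stat / 2.0)         # upper tail of chi-square, 2 df
    return math.erfc(math.sqrt(stat / 2.0))  # upper tail of chi-square, 1 df

def simulate_set(n, geno_freq, rates, rng):
    """Draw genotypes and toxicity outcomes for n patients."""
    counts, events = [0, 0, 0], [0, 0, 0]
    for g in rng.choices((0, 1, 2), weights=geno_freq, k=n):
        counts[g] += 1
        events[g] += rng.random() < rates[g]
    return counts, events

def overall_power(maf, prevalence, ratio=1.5, reps=2000, seed=1):
    """Share of replicates passing both the development and the validation stage."""
    rng = random.Random(seed)
    # Hardy-Weinberg genotype frequencies (an assumption of this sketch)
    freq = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]
    # Baseline rate chosen so that the overall prevalence matches
    base = prevalence / (freq[0] + ratio * freq[1] + ratio ** 2 * freq[2])
    rates = [base, base * ratio, base * ratio ** 2]
    hits = 0
    for _ in range(reps):
        p_dev = p_value(*simulate_set(300, freq, rates, rng))
        p_val = p_value(*simulate_set(150, freq, rates, rng))
        hits += p_dev < 0.01 and p_val < 0.05
    return hits / reps

# Hypothetical allele frequency 0.4 with the most frequent toxicity (32.4%)
print(overall_power(maf=0.4, prevalence=0.324))
```

Setting `ratio=1.0` gives the behavior under no true effect, which is a quick check that the two-stage rule is as strict as intended.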

Because the genotype distribution is highly skewed for most genes, the power is often lower than in the example considered above. Even for the most frequent toxicity, with a prevalence above 30%, the overall power never exceeds 48% and averages only 26%. For the toxicities with prevalences around 20%, the overall power never exceeds 20% and averages only 10%. For the toxicities with the lowest prevalence, the power is below 5% for most genes. Detecting an association then becomes a matter of pure chance.

Because the authors used a validation strategy with limited power to detect genes with a clinically relevant effect on toxicity, it is incorrect to conclude from the lack of significance that the genes considered have no effect on toxicity.

To reach a final evaluation of the evidence for a relation between the polymorphisms and toxicity or outcome after platinum plus taxane chemotherapy, we strongly recommend reanalyzing the data with a strategy that has sufficient power to show clinically relevant effects. Splitting the data into a development and a validation set is useful to avoid overoptimistic results. However, if the development set has insufficient power to establish clinically relevant associations, data splitting should be avoided. Hence, we feel that it is most appropriate to use the complete data in each treatment arm to analyze the association for each polymorphism. If one additionally follows the advice of Sasieni2 to treat the genotypes as an ordinal scale and to use logistic regression to increase power, our simulations showed sufficient power at least for the most frequent toxicities. However, for the two toxicities with the lowest prevalence, alternative methods are still needed (eg, using ordinal outcome variables).
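The gain from treating genotype as ordinal can be illustrated with the earlier example, scaled to a full treatment arm. The sketch below is an assumption-laden illustration, not the reanalysis proposed here. It uses the Cochran-Armitage trend test with scores 0, 1, 2 as a stand-in for logistic regression on an ordinal genotype (it is the corresponding score test), a 225/150/75 genotype split, toxicity rates of 14%, 21%, and 31.5%, and a threshold of .01. Both power values are large-sample approximations.

```python
import math
from statistics import NormalDist

def power_2df(n, p, alpha):
    """Approximate power of the 2-df Pearson test (noncentral chi-square)."""
    total = sum(n)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / total
    lam = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / (p_bar * (1 - p_bar))
    half_crit = -math.log(alpha)             # half the chi-square(2) critical value
    power, weight = 0.0, math.exp(-lam / 2)  # Poisson(lam / 2) mixture weights
    tail_sum, term = 1.0, 1.0                # running sum of half_crit**i / i!
    for j in range(200):
        power += weight * alpha * tail_sum   # tail of central chi-square, df = 2 + 2j
        weight *= (lam / 2) / (j + 1)
        term *= half_crit / (j + 1)
        tail_sum += term
    return power

def power_trend(n, p, alpha, scores=(0, 1, 2)):
    """Approximate power of the 1-df Cochran-Armitage trend test."""
    total = sum(n)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / total
    x_bar = sum(ni * xi for ni, xi in zip(n, scores)) / total
    num = sum(ni * (pi - p_bar) * (xi - x_bar) for ni, pi, xi in zip(n, p, scores)) ** 2
    den = p_bar * (1 - p_bar) * sum(ni * (xi - x_bar) ** 2 for ni, xi in zip(n, scores))
    root = math.sqrt(num / den)              # square root of the noncentrality
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha / 2)          # two-sided critical value
    return norm.cdf(root - z) + norm.cdf(-root - z)

# Assumed full-arm example: about 450 patients, 3:2:1 genotype split
arm = [225, 150, 75]
rates = [0.14, 0.21, 0.315]

two_df = power_2df(arm, rates, alpha=0.01)
trend = power_trend(arm, rates, alpha=0.01)
print(f"2-df test: {two_df:.2f}, trend test: {trend:.2f}")
```

When the true effect is monotone across genotypes, as assumed here, the single-degree-of-freedom trend test concentrates the evidence in one direction and should show the higher power of the two.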

If one succeeds in applying a convincing statistical method to the data of this study, we agree with Maitland et al3 that the study has indeed the potential to become “a model for how collection of whole blood samples for DNA in a phase III clinical trial enables high quality pharmacogenetic research.”

AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST

The author(s) indicated no potential conflicts of interest.

REFERENCES
