Am Fam Physician. 2013;87(12):872-873
Author disclosure: No relevant financial affiliations.
In AFP Journal Club, three presenters review an interesting journal article in a conversational manner. These articles involve “hot topics” that affect family physicians or “bust” commonly held medical myths. The presenters give their opinions about the clinical value of the individual study discussed. The opinions reflect the views of the presenters, not those of AFP or the AAFP.
Büller HR, Prins MH, Lensin AW, et al.; EINSTEIN–PE Investigators. Oral rivaroxaban for the treatment of symptomatic pulmonary embolism. N Engl J Med. 2012;366(14):1287–1297.
What does this article say?
Mark: This is a drug company–supported noninferiority trial including 4,832 patients with a history of acute symptomatic pulmonary embolism (PE), with or without concurrent deep venous thrombosis (DVT). The study compared the new factor Xa inhibitor rivaroxaban (Xarelto) with the traditional therapy of enoxaparin (Lovenox) followed by warfarin (Coumadin). Patients were randomized (open-label) to rivaroxaban (15 mg twice daily for three weeks, followed by 20 mg once daily) or to enoxaparin followed by dose-adjusted warfarin. Treatment duration of three, six, or 12 months was determined by the treating physician at the time of randomization. The main outcome was symptomatic, recurrent PE or DVT. This required that the patient recognize symptoms on an at-home checklist, although occasional office visits were required. Because the initial determination of who had PE or DVT occurred before submitting the patient's information for blind adjudication, patient and researcher perceptions could have been affected, leading to bias in referral for adjudication (see the study's supplemental protocol at http://www.nejm.org/doi/full/10.1056/NEJMoa1113572).
The safety outcome was major or clinically relevant bleeding. In theory, the authors used intention-to-treat analysis. However, patients who didn't complete the study were analyzed using only data collected up to when contact was lost. We don't know how many of them died, were admitted to another hospital, etc.
I want to avoid numbers at this point, because they are not relevant to our discussion yet, but the study authors concluded that “rivaroxaban was noninferior to standard therapy (noninferiority margin, 2.0; P = .003) for the primary efficacy outcome.” The hazard ratio (HR) for the safety outcome showed no difference between the groups (equal safety), although the researchers spin it to say rivaroxaban has a “potentially improved benefit-risk profile.” Note that the P value of .003 is for noninferiority, not for equivalence or superiority as in a usual study.
Should we believe this study?
Mark: The answer is complicated, but we will try to make it simple. A standard head-to-head trial attempts to show that a new drug is better or worse than a standard drug or placebo. In a noninferiority trial, the researchers choose what is considered an acceptable clinical difference (noninferiority margin). For example, consider the statement “rivaroxaban was noninferior to standard therapy (noninferiority margin, 2.0; P = .003).” What does “the margin” mean? In this study, the authors decided that if the HR for rivaroxaban was less than 2 when compared with warfarin (the margin being 2), they would call rivaroxaban noninferior. This means that patients in the rivaroxaban group could have had twice the number of PEs and rivaroxaban still would have been called noninferior by these authors. In this type of trial, it is possible to make any drug look noninferior just by setting the margin high enough. In other words, the margin allows the researcher to choose how badly (not how well) a drug can perform and still be called noninferior.
Bob: As you can see, the researcher has a great amount of control over whether a drug will be called noninferior. Because of potential researcher bias in defining “clinically significantly inferior,” and thus the noninferiority margin, a useful starting point is often the lower limit of efficacy of the standard drug as determined in the literature. When this standard is used to determine the margin, and the study drug meets the margin, it does not mean the drugs have equal efficacy, only that the new drug meets the minimum expected benefit from the traditional drug.
Jill: Most studies test the null hypothesis that two treatments are the same. Rejecting the null hypothesis means that one drug is superior to the other. In noninferiority trials, the null hypothesis is that the study drug is inferior to the control drug, and the researchers are trying to prove that the drugs are the same within their set margin. Rejecting the null hypothesis means that the study drug is noninferior to the control drug within this margin. Again, this does not prove equivalence.
Bob: In this study, there were more thromboembolic events in the rivaroxaban group than in the warfarin group (50 vs. 44). There were also more deaths with rivaroxaban than with warfarin (58 vs. 50). Neither of these differences reached statistical significance, although they might have in a larger study.
Jill: As the results turned out, the HR of rivaroxaban relative to warfarin for PE or DVT was 1.12 (95% confidence interval, 0.75 to 1.68). The confidence interval includes 1, the value that indicates no difference between the drugs. This means that rivaroxaban could be a bit better than warfarin (lower limit, HR = 0.75) or a bit worse than warfarin (upper limit, HR = 1.68). So, these two drugs should be similar in clinical practice, although safety is an issue.
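Jill's reading of the confidence interval can be sketched in a few lines of Python. This is only an illustration, not the trial's statistical analysis. It assumes the common convention that noninferiority is declared when the upper confidence limit falls below the margin, and the stricter margins of 1.25 and 1.5 are hypothetical values chosen to show how the verdict depends on the margin. The function names are made up for this example.

```python
# Figures quoted in this discussion (rivaroxaban vs. warfarin, recurrent PE or DVT).
HR_POINT = 1.12
CI_LOW, CI_HIGH = 0.75, 1.68
TRIAL_MARGIN = 2.0  # noninferiority margin chosen by the study authors


def includes_no_difference(ci_low: float, ci_high: float) -> bool:
    """An HR of 1 means no difference; an interval spanning 1 is not significant."""
    return ci_low <= 1.0 <= ci_high


def is_noninferior(ci_upper: float, margin: float) -> bool:
    """Assumed decision rule: noninferior if the upper limit is below the margin."""
    return ci_upper < margin


def verdicts(ci_upper: float, margins: list[float]) -> dict[float, bool]:
    """Apply the same upper limit to several candidate margins."""
    return {m: is_noninferior(ci_upper, m) for m in margins}


if __name__ == "__main__":
    print("Interval spans HR = 1:", includes_no_difference(CI_LOW, CI_HIGH))
    # 1.25 and 1.5 are hypothetical stricter margins, not values from the trial.
    for margin, ok in verdicts(CI_HIGH, [1.25, 1.5, TRIAL_MARGIN]).items():
        print(f"margin {margin}: {'noninferior' if ok else 'not shown noninferior'}")
```

With the figures quoted here, the interval spans 1 and clears the margin of 2.0, but it would not clear either of the stricter hypothetical margins.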
Mark: There is no proof of an improved benefit-risk profile in this study as claimed. There was less major bleeding (P = .003), but no difference in the combined end point of major bleeding and clinically significant nonmajor bleeding. What we care about is the sum total of major and clinically significant minor bleeding, and there was no difference in this measure. In fact, there was no difference in prolonged hospitalizations, need to discontinue the drug, serious events, or emergent events. Finally, there were more deaths with rivaroxaban than with warfarin (58 vs. 50), a nonsignificant difference favoring warfarin.
What should the family physician do?
Bob: Warfarin still has the best data supporting its use for preventing PE. Its effects are also reversible, whereas the effects of rivaroxaban are not. This study does not change our practice. When applied to a larger population (effectiveness vs. efficacy), the differences in mortality and PE rates, favoring warfarin, may become significant.
Mark: Be skeptical when you see a noninferiority study. First, there are many potential biases beyond what we have discussed. Second, remember that noninferior is not the same as equivalent or superior. In this study, the drugs seemed similar (HR of rivaroxaban compared with warfarin of 1.12; 95% confidence interval, 0.75 to 1.68). However, allowing the researcher to decide what margin is clinically significant is problematic, especially when the margin is larger than what most clinicians would accept. Table 1 includes resources for more information about noninferiority trials.
Table 1. Resources for more information about noninferiority trials
European Medicines Agency. Guideline on the choice of non-inferiority margin. http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003636.pdf. Accessed March 20, 2013.
Piaggio G, Elbourne DR, Pocock SJ, Evans SJ, Altman DG; CONSORT Group. Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT 2010 statement. JAMA. 2012;308(24):2594–2604.
Snapinn SM. Noninferiority trials. Curr Control Trials Cardiovasc Med. 2000;1(1):19–21.
In a population with previous thromboembolic disease, 2.1% of patients taking rivaroxaban had a second thromboembolic event, compared with 1.8% of those taking warfarin (an absolute difference of 0.3 percentage points favoring warfarin). There were also fewer deaths in the warfarin group. However, the differences between the drugs were not statistically significant. We worry about the increased death risk with rivaroxaban when applied to a larger population.
Rivaroxaban is not reversible, and thus warfarin may be a better choice for thromboembolic disease.
Differences in adverse events are unclear, although the rates were similar in this study. However, deaths were not included in the adverse events analysis.
Showing that one drug is noninferior to another does not mean that these drugs are equivalent.
The noninferiority margin allows researchers to choose their own benchmark for what is considered a clinically significant difference between two drugs. This can lead to a drug being called noninferior when others not associated with the study would call it inferior.
Open-label studies can lead to bias, especially when the researcher is adjudicating an outcome, such as recurrent PE or DVT, as in this study.
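The recurrence figures in the points above (2.1% with rivaroxaban vs. 1.8% with warfarin) can be turned into absolute and relative terms with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, it uses just the two rounded percentages quoted here, and it says nothing about statistical significance, which the study did not show for this difference.

```python
def risk_summary(rate_new_pct: float, rate_std_pct: float) -> tuple[float, float, float]:
    """Return (absolute difference in percentage points,
    relative risk, extra events per 1,000 patients)."""
    abs_diff = round(rate_new_pct - rate_std_pct, 2)
    relative_risk = round(rate_new_pct / rate_std_pct, 2)
    # 1 percentage point corresponds to 10 events per 1,000 patients.
    extra_per_1000 = round((rate_new_pct - rate_std_pct) * 10, 1)
    return abs_diff, relative_risk, extra_per_1000


if __name__ == "__main__":
    # Rounded recurrence rates quoted in the text: rivaroxaban 2.1%, warfarin 1.8%.
    abs_diff, rr, per_1000 = risk_summary(2.1, 1.8)
    print(f"Absolute difference: {abs_diff} percentage points")
    print(f"Relative risk: {rr}")
    print(f"Extra events per 1,000 patients: {per_1000}")
```

Because the inputs are rounded percentages, the relative risk computed this way differs slightly from the reported HR of 1.12.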