
This is a corrected version of the AFP Journal Club that appeared in print.

Am Fam Physician. 2011;84(2):177-179

Related letter: Clarification Regarding Epidemiologic Concepts

Author disclosure: No relevant financial affiliations to disclose.

Purpose
In AFP Journal Club, three presenters review an interesting journal article in a conversational manner. These articles involve “hot topics” that affect family physicians or “bust” commonly held medical myths. The presenters give their opinions about the clinical value of the individual study discussed. The opinions reflect the views of the presenters, not those of AFP or the AAFP.
Article
Manzano S, Bailey B, Girodias JB, Galetto-Lacour A, Cousineau J, Delvin E. Impact of procalcitonin on the management of children aged 1 to 36 months presenting with fever without source: a randomized controlled trial. Am J Emerg Med. 2010;28(6):647–653.

Procalcitonin is a precursor to calcitonin that is elevated only in the presence of a bacterial infection. Measurement of procalcitonin is being promoted as a way to determine the presence or absence of bacterial infection and, therefore, as a way to reduce antibiotic use. But how well does it work in the patients whom family physicians see? Why are the results of this study not as convincingly positive as the results of other studies?

What does this article say?

Mark: This was a study of 457 consecutive patients one to 36 months of age in the emergency department. Patients had a rectal temperature of greater than 100.4˚F (38˚C); had no source of infection found on history and physical examination; and had urinalysis, complete blood count, and blood and urine cultures performed. All patients had procalcitonin measurements, and based on randomization, some of the results were available to the attending physician and some were not. Physicians could order other tests (e.g., chest radiography, spinal tap), admit the patient, or treat the patient with antibiotics at their discretion. Overall, 17 parents refused to allow their child to participate, and 28 children in each group could not have blood drawn or had a blood sample that was lost on the way to the laboratory.

The primary end point was the difference in antibiotic prescribing between the group that had procalcitonin levels available to the attending physician and the group that did not, excluding those patients with diagnosed serious bacterial infection or neutropenia following workup in the emergency department. From the perspective of our discussion, a more important outcome was the ability of procalcitonin measurement to predict serious bacterial illness.

Serious bacterial illness or neutropenia was diagnosed in 72 of 384 patients (19 percent). There were 158 children without serious bacterial illness or neutropenia in the group that had procalcitonin levels available to the physician and 154 in the group that did not. Of these “healthy” patients, 9 percent of patients in the group with available procalcitonin levels received antibiotics versus 10 percent in the group without available levels (no significant difference). However, had the physicians provided treatment based on the procalcitonin level (greater than 0.5 ng per mL), it would have increased unnecessary antibiotic use by 24 percent (95% confidence interval, 15 to 33).

An important secondary outcome is that a procalcitonin level greater than 0.5 ng per mL was only 77 percent sensitive and 64 percent specific for serious bacterial illness based on other test results (e.g., culture, chest radiography).

Should we believe this study?

Mark: Yes. Having the procalcitonin measurement available did not change antibiotic prescribing in this group of physicians. If the procalcitonin measurement had guided prescribing, it would have resulted in more antibiotic use rather than less.

Yet, procalcitonin measurement has been shown to do a pretty good job of differentiating between bacterial and viral disease in patients admitted to the hospital with pneumonia or meningitis.1,2 Why, then, wasn't it useful in this group of outpatients? There are three answers. First, the characteristics of a test depend on the prevalence of disease in the population. For example, if you do HIV screening in nuns, it is likely you will get more false-positive results than true-positive results. If you do HIV screening in intravenous drug users in sub-Saharan Africa, you will get more true-positive results than false-positive results. [corrected] To state this succinctly, the positive predictive value of a test changes depending on the prevalence of disease in the population in which it is used.
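Mark's point about prevalence can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name, the 99 percent sensitivity and specificity, and the two prevalence figures are assumptions chosen for the example, not values from the study or from any real HIV test.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return the share of positive results that are true positives (Bayes' theorem)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics, held constant across both populations.
SENSITIVITY = 0.99
SPECIFICITY = 0.99

# Hypothetical prevalences: a very low-risk group and a high-risk group.
populations = [("low-risk group", 0.0001), ("high-risk group", 0.20)]

for label, prevalence in populations:
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"{label}: prevalence {prevalence:.2%}, positive predictive value {ppv:.1%}")
```

With the same test in both groups, the positive predictive value falls below 50 percent in the low-risk group (more false positives than true positives) and rises well above 50 percent in the high-risk group.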

Andrea: Second, there is spectrum bias in many studies of procalcitonin measurement. This study looked at patients in an emergency department. Most of the patients weren't very sick and didn't have a serious bacterial illness. So, the sensitivity of the procalcitonin measurement was lower (the patients are on the less severe spectrum of serious bacterial illness and thus can be expected to have lower procalcitonin levels).

Now, if we did a study of procalcitonin measurement in children with bacterial illness in an intensive care unit, it would likely be closer to 100 percent sensitive for serious bacterial illness—the patients would be sicker and the procalcitonin level would be higher. This is why procalcitonin measurement is useful for differentiating bacterial from viral illness in patients with meningitis, for example.

Bob: You can't blindly test patients and expect the tests to perform as they do in the literature. For example, if you obtain abdominal computed tomography in 100 patients in whom you are not sure what is going on (“fishing”), a positive result for appendicitis is likely to be a false positive, whereas if you obtain computed tomography in 100 patients with fever, elevated white blood cell count, and three days of tenderness at McBurney point, a positive result is likely to be a true positive. As Mark pointed out, the characteristics of a test depend on the prevalence and severity of disease in the group to whom it is applied. If you perform tests in a “no-risk” patient, any positive results you get will most likely be false positives. This is what has occurred with the d-dimer test and pulmonary embolism, leading to more computed tomography scans with the attendant risk.

Mark: The third reason that procalcitonin measurement has performed well in some studies is that the study authors figure out an optimal “positive” cutoff retrospectively. They look at their data and decide what to use as positive and negative cutoffs for a lab test. The test will never perform this well in another population because the authors have optimized the cutoff for their study. So, anytime you see a cutoff that was calculated for a particular study, you know it doesn't apply generally. A tip-off is that there is usually a receiver operating characteristic curve that helps to find the optimal sensitivity and specificity.

Legend has it that receiver operating characteristic curves were first devised in Britain during World War II. Some radar operators never missed a German air attack, but also called attacks when there was a flock of birds (very sensitive, not specific), and some radar operators missed German air attacks, but were almost always correct when they did call them (very specific, not sensitive).

Bob: In this study, they used a standard definition of a positive procalcitonin measurement (greater than 0.5 ng per mL). This is another reason procalcitonin measurement didn't perform well. They could have made the procalcitonin measurement more sensitive by, for example, reducing the positive result to 0.4 ng per mL. However, this also would have led to more false-positive results (lower specificity).
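The cutoff trade-off that Mark and Bob describe can be sketched by sweeping a cutoff across a small sample. Everything here is assumed for illustration: the procalcitonin values are made up (they are not data from the study), and the function name is hypothetical. Each printed row corresponds to one point on a receiver operating characteristic curve.

```python
# Hypothetical procalcitonin values in ng per mL; NOT data from the study.
infected = [0.3, 0.45, 0.6, 0.9, 2.0]
not_infected = [0.1, 0.2, 0.35, 0.45, 0.7]


def sensitivity_specificity(cutoff, diseased, healthy):
    """Count values strictly greater than the cutoff as positive results."""
    true_positives = sum(value > cutoff for value in diseased)
    true_negatives = sum(value <= cutoff for value in healthy)
    return true_positives / len(diseased), true_negatives / len(healthy)


# Sweep several candidate cutoffs and report the trade-off at each one.
for cutoff in [0.2, 0.3, 0.4, 0.5, 0.6]:
    sens, spec = sensitivity_specificity(cutoff, infected, not_infected)
    print(f"cutoff {cutoff} ng per mL: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this made-up sample, lowering the cutoff from 0.5 to 0.4 ng per mL raises sensitivity and lowers specificity. Picking whichever row looks best after seeing the data is the retrospective step Mark warns about: the chosen cutoff is tuned to this sample and may not hold up in another group of patients.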

What should the family physician do?

Andrea: When you look at a study, make sure that the population is the same as your patient population. Characteristics of a test, such as sensitivity and specificity, will differ if the study population is different from your own.

If you see a cutoff calculated for a study, realize that the test will never work that well again. The second group you test it on will always differ from the original group in some characteristic.

Bob: Don't use procalcitonin measurement for febrile infants in your office. Although it works well for very sick patients (e.g., those with meningitis or hospitalized with pneumonia), it doesn't work so well in typical infants who have a fever without a source. In fact, other studies have found similar or lower sensitivities in outpatient populations.3–5

Mark: To be fair, not all outpatient studies are negative. But, many do a post hoc analysis. For example, one study of procalcitonin measurement in febrile infants (which claims to be positive) found a sensitivity of 95 percent, but a specificity of only 25 percent.6 However, the study authors used a cutoff of 0.2 ng per mL determined post hoc to make the results look good. This translates into a 75 percent false-positive rate, which is hardly what you want when you are trying to avoid antibiotics.
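The arithmetic behind that 75 percent figure is simply 100 minus the specificity. As a rough sketch, the snippet below puts the two sets of numbers quoted in this discussion side by side; the function name and the "per 100 children" framing are assumptions made for illustration.

```python
def errors_per_100(sensitivity_pct, specificity_pct):
    """Return (missed infections per 100 infected children,
    false positives per 100 children without infection)."""
    return 100 - sensitivity_pct, 100 - specificity_pct


# Sensitivity and specificity (in percent) as quoted in the discussion above.
reported = {
    "cutoff 0.5 ng per mL (Manzano study)": (77, 64),
    "cutoff 0.2 ng per mL (post hoc study)": (95, 25),
}

for label, (sensitivity_pct, specificity_pct) in reported.items():
    missed, false_alarms = errors_per_100(sensitivity_pct, specificity_pct)
    print(f"{label}: {missed} missed per 100 infected, "
          f"{false_alarms} false positives per 100 uninfected")
```

The lower cutoff misses fewer infections but flags three out of four children who do not have a serious bacterial illness.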

Main Points
  • Procalcitonin measurement is not good at differentiating bacterial from viral illness in outpatient febrile infants. It is in sicker inpatients, but these patients will likely be treated with antibiotics anyway, based on their degree of illness, while awaiting more definitive tests, such as culture. Procalcitonin measurement is neither 100 percent sensitive nor 100 percent specific, even in inpatients. The results of procalcitonin measurement should be used as only one part of your decision-making process.

  • Our take on the use of procalcitonin measurement is that it is not yet ready for prime time. Others may disagree.

EBM Points
  • The prevalence of a disease in a population changes how a test performs in practice, particularly its positive predictive value.

  • Spectrum bias occurs when the group (in whom you are doing a test, for example) is either sicker or not as sick as the patients you see in your office. You cannot apply a test standardized in an inpatient population to your outpatient population (or vice versa) and expect it to have the same sensitivity and specificity.

  • In studies of tests (e.g., procalcitonin measurement), post hoc cutoff values are often selected to maximize the sensitivity and specificity of a test. The test will not perform as well in another group of patients. Receiver operating characteristic curves are used to figure out the optimal sensitivity vs. specificity.

  • Only positive studies get press. You likely have heard (or will hear) how great procalcitonin measurement is, but you have been told only part of the story.

Copyright © 2011 by the American Academy of Family Physicians.

This content is owned by the AAFP. A person viewing it online may make one printout of the material and may use that printout only for his or her personal, non-commercial reference. This material may not otherwise be downloaded, copied, printed, stored, transmitted or reproduced in any medium, whether now known or later invented, except as authorized in writing by the AAFP.  See permissions for copyright questions and/or permission requests.