Efficient Identification of Adults with Depression and Dementia

JANE M. THIBAULT; ROBERT WILLIAM PRASAAD STEINER

JANE M. THIBAULT, M.S.S.W., PH.D., AND ROBERT WILLIAM PRASAAD STEINER, M.D., PH.D.

Am Fam Physician. 2004;70(6):1101-1110

Family physicians must decide how to screen for depression or dementia and which patients to screen. Mental health questionnaires can be helpful. In practice-based screening, questionnaires are administered to all patients, regardless of risk status. In case-finding screening, questionnaires are administered only when depression or dementia is suspected. The 2002 U.S. Preventive Services Task Force report recommends screening adults for depression to improve detection and patient outcomes but does not suggest the use of any particular screening instrument. Serial or sequential testing with the Patient Health Questionnaire-2 and the Patient Health Questionnaire-9 is a good strategy for detecting major depressive episodes in primary care settings. The Patient Health Questionnaire-2 consists of two questions that assess the presence of anhedonia and dysphoria. If a patient answers “yes” to either question, the more specific Patient Health Questionnaire-9 is administered to assess the severity of depressive symptoms and to ascertain the presence of major depressive episode. The Patient Health Questionnaire-9 also can be used to monitor symptom severity and treatment response. The 2003 U.S. Preventive Services Task Force report does not recommend for or against routine screening for dementia in older adults. However, the report does assert that cognitive function should be assessed when impairment is suspected. The Folstein Mini-Mental State Examination and the Functional Activities Questionnaire are suggested tools. The Clock Drawing Test also has been shown to be useful in primary care settings.

Family physicians need efficient methods for identifying patients with depression or dementia. Mental health questionnaires can improve the accuracy of diagnosis. In practice-based screening, questionnaires are administered to all patients, regardless of risk status. However, this approach is associated with a high rate of false-positive screening results (i.e., depression or dementia may be identified mistakenly in patients who do not have the condition in question). In case finding, questionnaires are administered to selected patients when the physician suspects that a disorder is present. With this approach, the probability of disease in the group being tested is higher; therefore, it is more likely that a patient with a positive (or abnormal) test has the suspected disorder. Once depression or dementia is identified, questionnaires can be used to evaluate the effect of therapy or the natural history of the disorder, and can provide useful prognostic information.

Key clinical recommendations	Label	References
Screening for depression is recommended as long as there are adequate systems in place for treatment and follow-up.	B	^7,12
Serial or sequential screening for depression with the PHQ-2 followed by the PHQ-9 has been shown to be effective in identifying patients with depression in a primary care setting.	B	²⁵
There is insufficient evidence regarding routine screening for dementia to make a firm recommendation.	C	³¹

Assessment of Depressive Symptoms and Depression in Adult Patients

Despite the high prevalence of depression,^1–3 family physicians may fail to recognize 30 to 50 percent of patients with major depressive episodes.^4–6 The 2002 U.S. Preventive Services Task Force (USPSTF) report⁷ recommends screening adults for depression to improve detection and patient outcomes, provided that effective systems are in place to ensure accurate diagnosis, effective treatment, and appropriate follow-up. However, the report offers few suggestions for selecting screening instruments.

Standardized screening questionnaires for depressive symptoms and major depressive episodes have been reviewed extensively.^8–11 Some of the older questionnaires are too cumbersome, time-consuming, or inaccurate for routine use in clinical settings (Table 1).^10,12–22

Instrument	Date of introduction	Number of items	Time frame of questions	Score range	Usual cutoff point*	Literacy level†	Administration time (minutes)
After publication of DSM-III-R (1987)
PHQ-2^13,14	2003	2	Past two weeks	0 to 6	3	Average	< 1
PHQ-9¹⁵	1999	9	Past two weeks	0 to 27	10	Average	< 3
Medical Outcomes Study–Depression test¹⁶	1995	8	Past week	0 to 1	0.06	Average	< 2
Before publication of DSM-III-R
Beck Depression Inventory^17,18	1961	21	Today	0 to 63	10 (mild), 20 (moderate), 30 (severe)	Easy	2 to 5
Center for Epidemiologic Study–Depression scale¹⁹	1977	20	Past week	0 to 60	16	Easy	2 to 5
General Health Questionnaire^20,21	1972	28	Past few weeks	0 to 28	4	Easy	5 to 10
Zung Self-Assessment Depression scale²²	1983	20	Recently	25 to 100	50 (mild), 60 (moderate), 70 (severe)	Easy	2 to 5

SCREENING AND CASE FINDING

The rating scales developed before the 1987 publication of the Diagnostic and Statistical Manual of Mental Disorders, 3d ed. rev. (DSM-III-R) contain items that are not as highly correlated with current diagnostic standards as the items in newer questionnaires. Based on summary data from a meta-analysis¹⁰ of instruments for depression screening and the assumption of a 15 percent probability of major depressive episodes, only about 35 percent of patients identified as depressed by the older screening questionnaires actually have major depressive episode.

In short, the older questionnaires perform poorly as screening tools. However, these questionnaires serve reasonably well as case-finding instruments when the probability of depression is higher. For example, in a group of patients with suspected major depressive episode, where the pretest probability for depression is 50 percent, the positive predictive value for identifying major depressive episode is 67 percent with the Beck Depression Inventory, 72 percent with the General Health Questionnaire, 75 percent with the Center for Epidemiologic Study–Depression scale, and 75 percent with the Zung Self-Assessment Depression scale.¹⁰

The items in the newly revised Patient Health Questionnaire-9 (PHQ-9) were designed to correspond with the criteria for major depressive episode given in the Diagnostic and Statistical Manual of Mental Disorders, 4th ed. (DSM-IV).²³ Consequently, the PHQ-9 has excellent sensitivity (88 percent) and specificity (88 percent).¹⁵ However, with practice-based screening, the probability that a patient with a positive result actually has major depressive episode is only 56 percent when the pretest probability of depression is assumed to be 15 percent (Table 2).^13–16 Thus, even new and well-designed questionnaires for detecting depressive symptoms have limitations in practice-based screening. The newer questionnaires perform better than the older ones in case finding, with positive predictive values of 88 percent for the PHQ-9 and 79 percent for the Medical Outcomes Study–Depression instrument in a case-finding scenario like the one described above (pretest probability of 50 percent).

			Probability of depression
	Test characteristics		Screening test result†		Case-finding test result‡
Instrument	Sensitivity (%)	Specificity (%)	Positive (%)	Negative (%)	Positive (%)	Negative (%)
PHQ-2 with yes/no scoring§¹³	96	57	28	1	69	7
PHQ-2 with point scoring||¹⁴	83	92	65	3	91	16
PHQ-9¹⁵	88	88	56	2	88	12
Medical Outcomes Study–Depression test¹⁶	75	80	40	5	79	24
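
The positive and negative columns in Table 2 can be derived from each instrument's sensitivity and specificity with Bayes' theorem. The following sketch is an illustration, not the authors' own calculation. It assumes the pretest probabilities quoted in the text (15 percent for practice-based screening and 50 percent for case finding), and the function name `post_test_probability` is hypothetical.

```python
def post_test_probability(sensitivity, specificity, pretest, positive=True):
    """Probability of disease after a positive or a negative test result."""
    if positive:
        true_pos = sensitivity * pretest
        false_pos = (1 - specificity) * (1 - pretest)
        return true_pos / (true_pos + false_pos)
    false_neg = (1 - sensitivity) * pretest
    true_neg = specificity * (1 - pretest)
    return false_neg / (false_neg + true_neg)


# Sensitivity and specificity as listed in Table 2.
instruments = {
    "PHQ-2 (yes/no scoring)": (0.96, 0.57),
    "PHQ-2 (point scoring)": (0.83, 0.92),
    "PHQ-9": (0.88, 0.88),
    "Medical Outcomes Study-Depression": (0.75, 0.80),
}

for name, (sens, spec) in instruments.items():
    row = [
        post_test_probability(sens, spec, pretest, positive)
        for pretest in (0.15, 0.50)      # screening first, then case finding
        for positive in (True, False)    # positive result, then negative result
    ]
    print(name, [round(100 * p) for p in row])
```

For the PHQ-9 this yields 56, 2, 88, and 12 percent, matching the corresponding row of Table 2. The same instrument is far more informative at a 50 percent pretest probability than at 15 percent, which is the argument for case finding.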

SERIAL OR SEQUENTIAL SCREENING

Use of two questionnaires serially or sequentially is more efficient and provides more accurate results than use of a single questionnaire.^24,25 The first screening instrument should be short and easy to score. It also must have high sensitivity to ensure that most patients with probable major depressive episode are included for a second-stage confirmatory test. The second screening instrument can be somewhat longer, because it is administered only to patients with a positive result on the first test. The second-stage questionnaire must be more specific to minimize false-positive results, increase positive predictive value, and improve overall accuracy of the screening process.

Serial testing with the Patient Health Questionnaire-2 (PHQ-2) and the PHQ-9 is perhaps the best validated two-stage strategy to detect major depressive episode in primary care settings. The brief PHQ-2 is given to all adult patients for initial screening. The confirmatory PHQ-9 questionnaire is administered only to patients with a positive stage-one screen.

PHQ-2 for Initial Screening

The PHQ-2 contains two simple screening questions, adapted from the original Primary Care Evaluation of Medical Disorders instrument,²⁶ to assess the presence of anhedonia and dysphoria (diagnostic criteria for major depressive episode²³).

One version of the PHQ-2 calls for simple “yes” or “no” responses, with a “yes” response to either question constituting a positive screen.¹³ The two questions are as follows: (1) “Over the past month, have you often had little interest or pleasure in doing things?” and (2) “Over the past month, have you often been bothered by feeling down, depressed, or hopeless?”¹³ The simplicity of this version in clinical interviews is appealing. The questionnaire has a sensitivity of 96 percent, but a specificity of only 57 percent (Table 2).^13–16 Because of this low specificity, the questionnaire also yields a high number of patients for stage-two screening (510 of every 1,000 patients screened; Table 3).

Instruments used in two-stage screening	Patients screened	All positive stage-one tests	All true positive tests	All true-negative tests	All false-positive tests	All false-negative tests	Overall accuracy (%)†
PHQ-2 with yes/no scoring followed by PHQ-9	1,000	510	127	806	44	23	93.3
PHQ-2 with point scoring followed by PHQ-9	1,000	193	110	842	8	40	95.1
PHQ-2 with point scoring followed by Medical Outcomes Study–Depression test	1,000	193	93	836	14	57	93.0

Another version of the PHQ-2 questionnaire, which uses different time frames, responses, and scoring, has greater accuracy (Figure 1).¹⁴ A score of three points or more on this version of the PHQ-2 has a sensitivity of 83 percent and a specificity of 92 percent for major depressive episode¹⁴ (Table 2).^13–16

PHQ-9 for Stage-Two Confirmation of Diagnosis

The PHQ-9 is an excellent questionnaire for confirming the diagnosis of major depressive episode.¹⁵ A score of 10 points or higher indicates the presence of major depressive episode.¹⁵ Updated versions of the PHQ-9^26,27 are available on the Internet^28,29; one version is shown in Figure 2.²⁸

Two-stage screening with the point-scored PHQ-2 as the initial screening instrument and the PHQ-9 for confirmation of the diagnosis of major depressive episode yields accurate overall results (95.1 percent) and an acceptable number of patients for stage-two screening (only 19.3 percent of screened patients require further testing; Table 3).
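
The arithmetic behind Table 3 can be sketched as follows. This is an illustrative reconstruction, not the authors' published method. It assumes a 15 percent prevalence of major depressive episode (the figure used elsewhere in the text) and that the two questionnaires err independently. The function name `two_stage_screen` and its dictionary keys are hypothetical.

```python
def two_stage_screen(n, prevalence, stage1, stage2):
    """Expected counts when stage two is given only to stage-one positives.

    stage1 and stage2 are (sensitivity, specificity) pairs. Independence of
    errors between the two tests is a simplifying assumption.
    """
    sens1, spec1 = stage1
    sens2, spec2 = stage2
    diseased = n * prevalence
    healthy = n - diseased

    tp1 = diseased * sens1         # depressed, positive at stage one
    fp1 = healthy * (1 - spec1)    # not depressed, positive at stage one

    tp = tp1 * sens2               # confirmed at stage two
    fp = fp1 * (1 - spec2)         # still falsely positive after stage two
    fn = diseased - tp             # missed at either stage
    tn = healthy - fp              # correctly cleared at either stage
    return {
        "stage_one_positives": tp1 + fp1,
        "true_pos": tp,
        "false_pos": fp,
        "false_neg": fn,
        "true_neg": tn,
        "accuracy": (tp + tn) / n,
    }


# Point-scored PHQ-2 (83/92) followed by PHQ-9 (88/88), 1,000 patients.
result = two_stage_screen(1000, 0.15, stage1=(0.83, 0.92), stage2=(0.88, 0.88))
print({key: round(value, 3) for key, value in result.items()})
```

Under these assumptions the sketch gives about 193 stage-one positives and an overall accuracy of 95.1 percent, in agreement with Table 3 to within rounding. Substituting `stage1=(0.96, 0.57)` reproduces the yes/no PHQ-2 row in the same way.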

Family physicians who use the results of the PHQ-9 as supportive evidence for the presence of major depressive episode still should confirm the diagnosis by ruling out physical causes of depression, differentiating anxiety and physical symptoms that may mimic depression, and eliciting any history of manic episodes or bereavement that could confound the diagnosis.

Stage-two screening with instruments that have high specificity, including the Medical Outcomes Study–Depression questionnaire and others, also produces good results³⁰ (Table 3). Other conditions that mimic depression still need to be ruled out.

MONITORING

The PHQ-9 can be used to monitor the severity of depressive symptoms and assess response to treatment. PHQ-9 scores of 15 points or higher reliably indicate moderate to severe impairment from depression.
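
The cutoff points mentioned above lend themselves to a simple scoring helper. The sketch below is illustrative only. It assumes each item is scored from 0 to 3, which is consistent with the score ranges in Table 1 (0 to 6 for two items, 0 to 27 for nine items). The function names and category labels are hypothetical, and a score is supportive evidence rather than a diagnosis.

```python
PHQ2_CUTOFF = 3                 # point-scored PHQ-2: positive initial screen
PHQ9_CUTOFF = 10                # PHQ-9: indicates major depressive episode
PHQ9_MODERATE_TO_SEVERE = 15    # PHQ-9: moderate to severe impairment


def phq_total(item_scores, expected_items):
    """Sum the item scores after checking their count and assumed 0-3 range."""
    if len(item_scores) != expected_items:
        raise ValueError(f"expected {expected_items} items, got {len(item_scores)}")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def phq2_positive(item_scores):
    """True if the point-scored PHQ-2 calls for a stage-two PHQ-9."""
    return phq_total(item_scores, 2) >= PHQ2_CUTOFF


def phq9_category(item_scores):
    """Return the PHQ-9 total and a label based on the cutoffs in the text."""
    score = phq_total(item_scores, 9)
    if score >= PHQ9_MODERATE_TO_SEVERE:
        return score, "moderate to severe impairment"
    if score >= PHQ9_CUTOFF:
        return score, "positive for major depressive episode"
    return score, "below cutoff"


# Hypothetical scores from three visits, used to follow treatment response.
visits = [[3, 3, 2, 2, 2, 1, 1, 1, 1], [2, 2, 1, 1, 1, 1, 1, 1, 0], [1, 1, 1, 0, 0, 0, 0, 0, 0]]
for items in visits:
    print(phq9_category(items))
```

Recording the total at each visit, rather than only the category, makes a partial response visible even when the score has not yet crossed a cutoff.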

Clinical Interview

Family physicians who do not want to use formal questionnaires can ask patients about specific DSM-IV criteria for major depressive episode (Table 4).¹¹ This structured interview can be used to monitor response to treatment and to assess patients for recurrence or relapse of major depressive episode.

The rightsholder did not grant rights to reproduce this item in electronic media. For the missing item, see the original print version of this publication.

Assessment of Cognition in Older Adults

Age is the most significant risk factor for dementia, a syndrome characterized by a decline in memory and in at least one of the following areas: language, visuospatial skills, and executive functioning. From 3 to 11 percent of persons older than 65 years and 25 to 47 percent of persons older than 85 years have a dementing disorder. Alzheimer’s disease and cerebrovascular ischemia are common causes of dementing disorders.³¹

The 2003 USPSTF report³¹ concluded that evidence is insufficient to support a recommendation for or against routine screening for dementia in older adults. The report qualifies this conclusion by noting that physicians “should assess cognitive function whenever cognitive impairment or deterioration is suspected, based on direct observation, patient report, or concerns raised by family members, friends, or caretakers.”³¹ Thus, a case-finding approach is more appropriate than a population-based approach in screening for dementia.

A positive response to a question about symptoms (e.g., “Do you have any memory or thinking problems that are interfering with your daily life?”) suggests the need for formal mental status testing. Although a serial approach to screening may be effective, it has not undergone formal evaluation.

Not all patients who experience memory loss have dementia. For example, delirium, medication use, and psychiatric illnesses such as amnestic disorders may be associated with cognitive impairment. These possible causes of memory loss need to be ruled out before the diagnosis of dementia is made. Note that dismissive statements, such as “That’s normal for your age,” are inappropriate, even if they are intended to reassure the patient.

MINI-MENTAL STATE EXAMINATION

Because many short cognitive tests are available, choosing the most appropriate test can be difficult. The Mini-Mental State Examination (MMSE) is the best studied instrument. The USPSTF report³¹ notes that the sensitivity of the MMSE for dementia ranges from 71 to 92 percent, and the specificity ranges from 56 to 96 percent.^32,33 Therefore, the predictive value of a positive test may range from 15 to 72 percent,³⁴ depending on the population to which the MMSE is applied and the cutoff score that is used to define an abnormal test. The accuracy of the MMSE and other screening tests for dementia is summarized in Table 5.^32–40

Instrument	Sensitivity (%)	Specificity (%)	Positive predictive value (%)	Negative predictive value (%)
Mini-Mental State Examination	71 to 92	56 to 96	15 to 72	95 to 99
Functional Activities Questionnaire	90	90	50	99
Blessed Information Memory Concentration	90	65 to 90	22 to 50	98 to 99
Blessed Orientation Memory Concentration	69	90	43	96
Short Test of Mental Status	81	90	47	98

Initially published in 1975, the MMSE is a 30-item screening instrument that also can be used to monitor the effectiveness of treatment.^41,42 The developers of this instrument recently provided definitive instructions for its administration and included new references to population norms that guide interpretation of scores.⁴³ When properly administered, the MMSE is a valid and reliable test for identifying cognitive impairments in high-risk patients. The MMSE takes five to 10 minutes to administer and score.

OTHER INSTRUMENTS

The Clock Drawing Test is a brief, easily understood psychometric instrument that sometimes is combined with a test of the ability to make correct monetary change.³¹ Recent evidence suggests that the Clock Drawing Test may have value in the assessment of visuospatial and executive functions (areas that the MMSE does not test well).^44,45 In the Clock Drawing Test, the patient is given a blank sheet of paper and told to draw the face of a clock with the numbers in their correct positions. The patient then is instructed to draw in clock hands to show a time of 11:10. There are variations in scoring the test. The simplest method includes four equally weighted components: one point each for drawing a closed circle, including all 12 numbers, correctly placing the numbers, and placing clock hands at the designated time. The test takes three to four minutes to administer.
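
The simplest scoring method described above can be written down directly. This is a minimal sketch. The class and field names are hypothetical, and no cutoff score is applied because none is given in the text.

```python
from dataclasses import dataclass


@dataclass
class ClockDrawing:
    """One point for each of four equally weighted components."""
    closed_circle: bool
    all_12_numbers: bool
    numbers_correctly_placed: bool
    hands_show_requested_time: bool   # 11:10 in the version described here

    def score(self) -> int:
        """Total score from 0 (no component correct) to 4 (all correct)."""
        return sum([
            self.closed_circle,
            self.all_12_numbers,
            self.numbers_correctly_placed,
            self.hands_show_requested_time,
        ])


drawing = ClockDrawing(
    closed_circle=True,
    all_12_numbers=True,
    numbers_correctly_placed=False,
    hands_show_requested_time=False,
)
print(drawing.score())  # prints 2
```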

In the Time and Change Test,^32,46 the patient first is shown a clock face set at 11:10 and asked to tell the time. Response time is measured with a stopwatch. The patient is allowed two tries for a correct response within a one-minute period. For the change-making task, three quarters, seven dimes, and seven nickels are placed in front of the patient, who then is asked to provide one dollar in change. The patient is allowed two tries within a two-minute period. Reducing the time limit to 12 seconds makes the test highly sensitive but less specific.⁴⁶ Incorrect responses on either or both tasks are scored as a positive result, indicating dementia. A correct response on both tasks is scored as a negative result.
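
The scoring rule for the Time and Change Test can be sketched as follows. This is one reading of the description above, not the published protocol. It assumes that each attempt is recorded as the seconds elapsed since the task began together with whether the answer was correct, and all names are hypothetical.

```python
TIME_LIMIT_SECONDS = {"telling_time": 60, "making_change": 120}
MAX_TRIES = 2


def task_passed(task, attempts):
    """True if one of the first two attempts is correct and within the limit.

    attempts is a list of (seconds_elapsed, is_correct) in the order made.
    """
    limit = TIME_LIMIT_SECONDS[task]
    return any(
        is_correct and seconds_elapsed <= limit
        for seconds_elapsed, is_correct in attempts[:MAX_TRIES]
    )


def time_and_change_positive(time_attempts, change_attempts):
    """True means a positive (abnormal) result: at least one task was failed."""
    return not (
        task_passed("telling_time", time_attempts)
        and task_passed("making_change", change_attempts)
    )


# Correct time on the first try; correct change on the second try at 90 seconds.
print(time_and_change_positive([(5, True)], [(30, False), (90, True)]))  # False
```

Lowering the values in `TIME_LIMIT_SECONDS` corresponds to the stricter time limit mentioned in the text, which increases sensitivity at the cost of specificity.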

Difficulty performing routine activities of daily living suggests cognitive decline. The Katz Index of Activities of Daily Living Scale and the Instrumental Activities of Daily Living Scale are observer-dependent descriptive tools that have been used for many years.⁴⁷ These scales provide a framework for assessing level of performance and rate of cognitive decline.⁴⁷ The USPSTF report³¹ suggests use of the Functional Activities Questionnaire (Figure 3).⁴⁸ The questionnaire is reported to be 90 percent sensitive and specific for the identification of dementia.^31,32 The primary disadvantage of all functional assessments is that they depend on caregiver observation and report, and not all patients have caregivers.

More intensive neuropsychologic testing is indicated when a patient has sensory losses, when test scores are normal but function is abnormal, or when impairment is present in one area of cognition.

References

Regier DA, Myers JK, Kramer M, Robins LN, Blazer DG, Hough RL, et al. The NIMH Epidemiologic Catchment Area program. Historical context, major objectives, and study population characteristics. Arch Gen Psychiatry. 1984;41:934-41.

Kessler RC, McGonagle KA, Zhao S, Nelson CB, Hughes M, Eshleman S, et al. Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States. Results from the National Comorbidity Study. Arch Gen Psychiatry. 1994;51:8-19.

Weissman MM, Leaf PJ, Tischler GL, Blazer DG, Karno M, Bruce ML, et al. Affective disorders in five United States communities. Psychol Med. 1988;18:141-53.

Simon GE, VonKorff M. Recognition, management, and outcomes of depression in primary care. Arch Fam Med. 1995;4:99-105.

Gerber PD, Barrett J, Barrett J, Manheimer E, Whiting R, Smith R. Recognition of depression by internists in primary care: a comparison of internist and “gold standard” psychiatric assessments. J Gen Intern Med. 1989;4:7-13.

Susman JL, Crabtree BF, Essink G. Depression in rural family practice. Easy to recognize, difficult to diagnose. Arch Fam Med. 1995;4:427-31.

Pignone MP, Gaynes BN, Rushton JL, Burchell CM, Orleans CT, Mulrow CD, et al. Screening for depression in adults: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med. 2002;136:765-76.

Williams JW, Mulrow CD, Kroenke K, Dhanda R, Badgett RG, Omori D, et al. Case-finding for depression in primary care: a randomized trial. Am J Med. 1999;106:36-43.

Klinkman MS, Coyne JC, Gallo S, Schwenk TL. Can case-finding instruments be used to improve physician detection of depression in primary care? Arch Fam Med. 1997;6:567-73.

Mulrow CD, Williams JW, Gerety MB, Ramirez G, Montiel OM, Kerber C. Case-finding instruments for depression in primary care settings [published correction appears in Ann Intern Med 1995;123:966]. Ann Intern Med. 1995;122:913-21.

Williams JW, Noel PH, Cordes JA, Ramirez G, Pignone M. Is this patient clinically depressed? JAMA. 2002;287:1160-70.

Pignone M, Gaynes BN, Rushton JL, Mulrow CD, Orleans CT, Mills C, et al. Screening for depression systematic evidence review. Rockville, Md.: Agency for Healthcare Research and Quality; 2002. AHRQ Systematic Evidence Review, no. 6. Accessed online June 10, 2004, at: http://www.ahrq.gov/clinic/serfiles.htm#download.

Whooley MA, Avins AL, Miranda J, Browner WS. Case-finding instruments for depression. Two questions are as good as many. J Gen Intern Med. 1997;12:439-45.

Kroenke K, Spitzer RL, Williams JB. The Patient Health Questionnaire-2: validity of a two-item depression screener. Med Care. 2003;41:1284-92.

Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606-13.

Berwick DM, Murphy JM, Goldman PA, Ware JE, Barsky AJ, Weinstein MC. Performance of a five-item mental health screening test. Med Care. 1991;29:169-76.

Beck AT, Ward CH, Mendelson M, Mock J, Erbaugh J. An inventory for measuring depression. Arch Gen Psychiatry. 1961;4:561-7.

Beck AT, Steer RA, Garbin MG. Psychometric properties of the Beck Depression Inventory: twenty-five years of evaluation. Clin Psychol Rev. 1988;8:77-100.

Radloff LS. The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Meas. 1977;1:385-401.

Goldberg D. Manual of the general health questionnaire. Windsor, England: NFER, 1978.

Goldberg DP, Blackwell B. Psychiatric illness in general practice. A detailed study using a new method of case identification. Br Med J. 1970;1:439-43.

Zung WW, King RE. Identification and treatment of masked depression in a general medical practice. J Clin Psychiatry. 1983;44:365-8.

American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 4th ed. Washington, D.C.: American Psychiatric Association, 1994:327.

Nease DE, Klinkman MS, Volk RJ. Improved detection of depression in primary care through severity evaluation. J Fam Pract. 2002;51:1065-70.

Nease DE, Maloin JM. Depression screening: a practical strategy. J Fam Pract. 2003;52:118-24.

Spitzer RL, Williams JB, Kroenke K, Linzer M, deGruy FV, Hahn SR, et al. Utility of a new procedure for diagnosing mental disorders in primary care. The PRIME-MD 1000 study. JAMA. 1994;272:1749-56.

Spitzer RL, Kroenke K, Williams JB. Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. Primary Care Evaluation of Mental Disorders. Patient Health Questionnaire. JAMA. 1999;282:1737-44.

Dobscha SK, Gerrity MS, Ward MF. Effectiveness of an intervention to improve primary care provider recognition of depression. Eff Clin Pract. 2001;4:163-71. Accessed online June 15, 2004, at: http://www.acponline.org/journals/ecp/julaug01/dobscha_apdxtb.gif.

Clinical Regional Advisory Network. Patient health questionnaire. Accessed online June 15, 2004, at: http://www.cran3.org/downloads/PatientHealthQuestionnaire.pdf.

Burnam MA, Wells KB, Leake B, Landsverk J. Development of a brief screening instrument for detecting depressive disorders. Med Care. 1988;26:775-89.

U.S. Preventive Services Task Force. Screening for dementia: recommendations and rationale. June 2003. Accessed online June 15, 2004, at: http://www.ahcpr.gov/clinic/3rduspstf/dementia/dementrr.htm.

Early identification of Alzheimer’s disease and related dementias. Agency for Health Care Policy and Research. Clin Pract Guide Quick Ref Guide Clin. 1996(19):1-28.

Heun R, Papassotiropoulos A, Jennssen F. The validity of psychometric instruments for detection of dementia in the elderly general population. Int J Geriatr Psychiatry. 1998;13:368-80.

Boustani M, Peterson B, Hanson L, Harris R, Lohr KN. Screening for dementia in primary care: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med. 2003;138:927-37.

Wilder D, Cross P, Chen J, Gurland B, et al. Operating characteristics of brief screens for dementia in a multicultural population. Am J Geriatr Psychiatry. 1995;3:96-107.

Jitapunkul S, Lailert C, Worakul P, Srikiatkhachorn A, et al. Chula mental test: a screening test for elderly people in less developed countries. Int J Geriatr Psychiatry. 1996;11:715-20.

McDowell I, Kristjansson B, Hill GB, Hebert R. Community screening for dementia: the Mini Mental State Exam (MMSE) and Modified Mini-Mental State Exam (3MS) compared. J Clin Epidemiol. 1997;50:377-83.

Lindeboom J, Launer LJ, Schmand BA, Hooyer C, Jonker C. Effects of adjustment on the case-finding potential of cognitive tests. J Clin Epidemiol. 1996;49:691-5.

Law S, Wolfson C. Validation of a French version of an informant-based questionnaire as a screening test for Alzheimer’s disease. Br J Psychiatry. 1995;167:541-4.

Braekhus A, Laake K, Engedal K. A low ‘normal’ score on the Mini-Mental State Examination predicts development of dementia after three years. J Am Geriatr Soc. 1995;43:656-61.

Patterson CJ, Gass DA. Screening for cognitive impairment and dementia in the elderly. Can J Neurol Sci. 2001;28(suppl 1):S42-51.

Ercoli LM, Siddarth P, Dunkin JJ, Bramen J, Small GW. MMSE items predict cognitive decline in persons with genetic risk for Alzheimer’s disease. J Geriatr Psychiatry Neurol. 2003;16:67-73.

Folstein MF, Folstein SE, McHugh PR, Fanjiang G. Mini-Mental State Examination: MMSE user’s guide. Odessa, Fla.: Psychological Assessment Resources, 2000:2-12.

Juby A, Tench S, Baker V. The value of clock drawing in identifying executive cognitive dysfunction in people with a normal Mini-Mental State Examination score. CMAJ. 2002;167:859-64.

Schramm U, Berger G, Muller R, Kratzsch T, Peters J, Frolich L. Psychometric properties of Clock Drawing Test and MMSE or Short Performance Test (SKT) in dementia screening in a memory clinic population. Int J Geriatr Psychiatry. 2002;17:254-60.

Froehlich TE, Robison JT, Inouye SK. Screening for dementia in the outpatient setting: the time and change test. J Am Geriatr Soc. 1998;46:1506-11.

Gallo JJ, Fulmer T, Paveza GJ, Reichel W. Functional assessment. In: Gallo JJ, Fulmer T, Paveza GJ, Reichel W, eds. Handbook of geriatric assessment. 3d ed. Gaithersburg, Md.: Aspen, 2000:101–57.

Pfeffer RI, Kurosaki TT, Harrah CH, Chance JM, Filos S. Measurement of functional activities in older adults in the community. J Gerontol. 1982;37:323-9.