
Am Fam Physician. 2004;70(6):1101-1110

Family physicians must decide how to screen for depression or dementia and which patients to screen. Mental health questionnaires can be helpful. In practice-based screening, questionnaires are administered to all patients, regardless of risk status. In case-finding screening, questionnaires are administered only when depression or dementia is suspected. The 2002 U.S. Preventive Services Task Force report recommends screening adults for depression to improve detection and patient outcomes but does not suggest the use of any particular screening instrument. Serial or sequential testing with the Patient Health Questionnaire-2 and the Patient Health Questionnaire-9 is a good strategy for detecting major depressive episodes in primary care settings. The Patient Health Questionnaire-2 consists of two questions that assess the presence of anhedonia and dysphoria. If a patient answers “yes” to either question, the more specific Patient Health Questionnaire-9 is administered to assess the severity of depressive symptoms and to ascertain the presence of major depressive episode. The Patient Health Questionnaire-9 also can be used to monitor symptom severity and treatment response. The 2003 U.S. Preventive Services Task Force report does not recommend for or against routine screening for dementia in older adults. However, the report does assert that cognitive function should be assessed when impairment is suspected. The Folstein Mini-Mental State Examination and the Functional Activities Questionnaire are suggested tools. The Clock Drawing Test also has been shown to be useful in primary care settings.

Family physicians need efficient methods for identifying patients with depression or dementia. Mental health questionnaires can improve the accuracy of diagnosis. In practice-based screening, questionnaires are administered to all patients, regardless of risk status. However, this approach is associated with high false-positive screening results (i.e., depression or dementia may be identified mistakenly in patients who do not have the condition in question). In case finding, questionnaires are administered to selected patients when the physician suspects that a disorder is present. With this approach, the probability of disease in the group being tested is higher; therefore, it is more likely that a patient with a positive (or abnormal) test has the suspected disorder. Once depression or dementia is identified, questionnaires can be used to evaluate the effect of therapy or the natural history of the disorder, and can provide useful prognostic information.

| Key clinical recommendations | Label | References |
| --- | --- | --- |
| Screening for depression is recommended as long as there are adequate systems in place for treatment and follow-up. | B | 7,12 |
| Serial or sequential screening for depression with the PHQ-2 followed by the PHQ-9 has been shown to be effective in identifying patients with depression in a primary care setting. | B | 25 |
| There is insufficient evidence regarding routine screening for dementia to make a firm recommendation. | C | 31 |

Assessment of Depressive Symptoms and Depression in Adult Patients

Despite the high prevalence of depression,1–3 family physicians may fail to recognize 30 to 50 percent of patients with major depressive episodes.4–6 The 2002 U.S. Preventive Services Task Force (USPSTF) report7 recommends screening adults for depression to improve detection and patient outcomes, provided that effective systems are in place to ensure accurate diagnosis, effective treatment, and appropriate follow-up. However, the report offers few suggestions for selecting screening instruments.

Standardized screening questionnaires for depressive symptoms and major depressive episodes have been reviewed extensively.8–11 Some of the older questionnaires are too cumbersome, time-consuming, or inaccurate for routine use in clinical settings (Table 1).10,12–22

| Instrument | References | Date of introduction | Number of items | Time frame of questions | Score range | Usual cutoff point* | Literacy level | Administration time (minutes) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| After publication of DSM-III-R (1987) | | | | | | | | |
| PHQ-2 | 13,14 | 2003 | 2 | Past two weeks | 0 to 6 | 3 | Average | < 1 |
| PHQ-9 | 15 | 1999 | 9 | Past two weeks | 0 to 27 | 10 | Average | < 3 |
| Medical Outcomes Study–Depression test | 16 | 1995 | 8 | Past week | 0 to 1 | 0.06 | Average | < 2 |
| Before publication of DSM-III-R | | | | | | | | |
| Beck Depression Inventory | 17,18 | 1961 | 21 | Today | 0 to 63 | 10 (mild), 20 (moderate), 30 (severe) | Easy | 2 to 5 |
| Center for Epidemiologic Study–Depression scale | 19 | 1977 | 20 | Past week | 0 to 60 | 16 | Easy | 2 to 5 |
| General Health Questionnaire | 20,21 | 1972 | 28 | Past few weeks | 0 to 28 | 4 | Easy | 5 to 10 |
| Zung Self-Assessment Depression scale | 22 | 1983 | 20 | Recently | 25 to 100 | 50 (mild), 60 (moderate), 70 (severe) | Easy | 2 to 5 |


The rating scales developed before the 1987 publication of the Diagnostic and Statistical Manual of Mental Disorders, 3d ed. rev. (DSM-III-R) contain items that are not as highly correlated with current diagnostic standards as the items in newer questionnaires. Based on summary data from a meta-analysis10 of instruments for depression screening and the assumption of a 15 percent probability of major depressive episodes, only about 35 percent of patients identified as depressed by the older screening questionnaires actually have major depressive episode.

In short, the older questionnaires perform poorly as screening tools. However, these questionnaires serve reasonably well as case-finding instruments when the probability of depression is higher. For example, in a group of patients with suspected major depressive episode, where the pretest probability for depression is 50 percent, the positive predictive value for identifying major depressive episode is 67 percent with the Beck Depression Inventory, 72 percent with the General Health Questionnaire, 75 percent with the Center for Epidemiologic Study–Depression scale, and 75 percent with the Zung Self-Assessment Depression scale.10

The items in the newly revised Patient Health Questionnaire-9 (PHQ-9) were designed to correspond with the criteria for major depressive episode given in the Diagnostic and Statistical Manual of Mental Disorders, 4th ed. (DSM-IV).23 Consequently, the PHQ-9 has excellent sensitivity (88 percent) and specificity (88 percent).15 However, with practice-based screening, the probability of detecting major depressive episode is only 56 percent when the probability of depression is assumed to be 15 percent (Table 2).13–16 Thus, even new and well-designed questionnaires for detecting depressive symptoms have limitations in practice-based screening. The newer questionnaires perform better than the older ones in case finding, with positive predictive values of 88 percent for the PHQ-9 and 79 percent for the Medical Outcomes Study–Depression instrument in testing scenarios similar to those described above.

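| Instrument | References | Sensitivity (%) | Specificity (%) | Probability of depression, screening test result†: positive (%) | Probability of depression, screening test result†: negative (%) | Probability of depression, case-finding test result‡: positive (%) | Probability of depression, case-finding test result‡: negative (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PHQ-2 with yes/no scoring§ | 13 | 96 | 57 | 28 | 1 | 69 | 7 |
| PHQ-2 with point scoring‖ | 14 | 83 | 92 | 65 | 3 | 91 | 16 |
| PHQ-9 | 15 | 88 | 88 | 56 | 2 | 88 | 12 |
| Medical Outcomes Study–Depression test | 16 | 75 | 80 | 40 | 5 | 79 | 24 |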
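The probabilities in Table 2 follow from each instrument's sensitivity and specificity once a pretest probability is fixed. As a minimal sketch (the function name `post_test_probabilities` is illustrative and not from the article; the 15 and 50 percent pretest probabilities are the ones used in the text), Bayes' theorem reproduces the PHQ-9 row:

```python
def post_test_probabilities(sensitivity, specificity, pretest):
    """Return (P(depression | positive test), P(depression | negative test))."""
    true_pos = sensitivity * pretest
    false_pos = (1 - specificity) * (1 - pretest)
    false_neg = (1 - sensitivity) * pretest
    true_neg = specificity * (1 - pretest)
    after_positive = true_pos / (true_pos + false_pos)
    after_negative = false_neg / (false_neg + true_neg)
    return after_positive, after_negative


# PHQ-9: sensitivity 88 percent, specificity 88 percent (values from the text)
for label, pretest in [("screening", 0.15), ("case finding", 0.50)]:
    pos, neg = post_test_probabilities(0.88, 0.88, pretest)
    print(f"{label}: {pos:.0%} after a positive test, {neg:.0%} after a negative test")
```

With these inputs the sketch gives about 56 and 2 percent for screening and 88 and 12 percent for case finding, in line with the PHQ-9 row. The same function can be applied to the other rows by substituting their sensitivity and specificity.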


Use of two questionnaires serially or sequentially is more efficient and provides more accurate results than use of a single questionnaire.24,25 The first screening instrument should be short and easy to score. It also must have high sensitivity to ensure that most patients with probable major depressive episode are included for a second-stage confirmatory test. The second screening instrument can be somewhat longer, because it is administered only to patients with a positive result on the first test. The second-stage questionnaire must be more specific to minimize false-positive results, increase positive predictive value, and improve overall accuracy of the screening process.

Serial testing with the Patient Health Questionnaire-2 (PHQ-2) and the PHQ-9 is perhaps the best validated two-stage strategy to detect major depressive episode in primary care settings. The brief PHQ-2 is given to all adult patients for initial screening. The confirmatory PHQ-9 questionnaire is administered only to patients with a positive stage-one screen.

PHQ-2 for Initial Screening

The PHQ-2 contains two simple screening questions, adapted from the original Primary Care Evaluation of Medical Disorders instrument,26 to assess the presence of anhedonia and dysphoria (diagnostic criteria for major depressive episode23).

One version of the PHQ-2 calls for simple “yes” or “no” responses, with a “yes” response to either question constituting a positive screen.13 The questions are as follows: Over the past month, have you often had little interest or pleasure in doing things? (Yes/No) Over the past month, have you often been bothered by feeling down, depressed, or hopeless? (Yes/No).13 The simplicity of this version in clinical interviews is appealing. The questionnaire has a sensitivity of 96 percent, but a specificity of only 57 percent (Table 2).13–16 This questionnaire also yields a high number of patients for stage-two screening (Table 3).

| Instruments used in two-stage screening | Patients screened | All positive stage-one tests | All true-positive tests | All true-negative tests | All false-positive tests | All false-negative tests | Overall accuracy (%)† |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PHQ-2 with yes/no scoring followed by PHQ-9 | 1,000 | 510 | 127 | 806 | 44 | 23 | 93.3 |
| PHQ-2 with point scoring followed by PHQ-9 | 1,000 | 193 | 110 | 842 | 8 | 40 | 95.1 |
| PHQ-2 with point scoring followed by Medical Outcomes Study–Depression test | 1,000 | 193 | 93 | 836 | 14 | 57 | 93.0 |

Another version of the PHQ-2 questionnaire, which uses different time frames, responses, and scoring, has greater accuracy (Figure 1).14 A score of three points or more on this version of the PHQ-2 has a sensitivity of 83 percent and a specificity of 92 percent for major depressive episode14 (Table 2).13–16
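Both versions of the PHQ-2 reduce to a simple decision rule. The sketch below is illustrative only: the function names are hypothetical, and the 0-to-3 rating of each item in the point-scored version is an assumption inferred from the 0-to-6 score range in Table 1, because Figure 1 is not reproduced here.

```python
def phq2_yes_no_positive(little_interest: bool, feeling_down: bool) -> bool:
    """Yes/no version: a "yes" to either question is a positive screen."""
    return little_interest or feeling_down


def phq2_point_score_positive(little_interest: int, feeling_down: int,
                              cutoff: int = 3) -> bool:
    """Point-scored version: each item is rated 0 to 3 (assumed), and the
    total (0 to 6) is compared with the three-point cutoff from the text."""
    for item in (little_interest, feeling_down):
        if item not in (0, 1, 2, 3):
            raise ValueError("each PHQ-2 item must be scored 0, 1, 2, or 3")
    return little_interest + feeling_down >= cutoff


# A patient reporting loss of interest only screens positive on the yes/no version.
yes_no_result = phq2_yes_no_positive(little_interest=True, feeling_down=False)

# Item scores of 2 and 1 reach the cutoff; scores of 1 and 1 do not.
point_result_high = phq2_point_score_positive(2, 1)
point_result_low = phq2_point_score_positive(1, 1)
```

In either version, a positive result is the trigger for administering the stage-two questionnaire rather than a diagnosis in itself.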

PHQ-9 for Stage-Two Confirmation of Diagnosis

The PHQ-9 is an excellent questionnaire for confirming the diagnosis of major depressive episode.15 A score of 10 points or higher indicates the presence of major depressive episode.15 Updated versions of the PHQ-9 questionnaire26,27 are available on the Internet28,29; one version is shown in Figure 2.28

Two-stage screening with the point-scored PHQ-2 as the initial screening instrument and the PHQ-9 for confirmation of the diagnosis of major depressive episode yields accurate overall results (95.1 percent) and an acceptable number of patients for stage-two screening (only 19.3 percent of screened patients require further testing; Table 3).
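The two-stage figures can be approximated from the sensitivities and specificities in Table 2. The following is a sketch under stated assumptions, not the authors' published method: a cohort of 1,000 patients, the 15 percent probability of depression used elsewhere in the text, and errors on the two questionnaires that are independent of each other. The function name `two_stage_screen` is hypothetical.

```python
def two_stage_screen(n, prevalence, stage1, stage2):
    """Expected counts when stage two is given only to stage-one positives.

    stage1 and stage2 are (sensitivity, specificity) pairs. The two tests are
    assumed to err independently of each other, which is a simplification.
    """
    sens1, spec1 = stage1
    sens2, spec2 = stage2
    depressed = n * prevalence
    not_depressed = n - depressed

    # Stage one: everyone is screened.
    stage1_true_pos = depressed * sens1
    stage1_false_pos = not_depressed * (1 - spec1)

    # Stage two: only stage-one positives are retested.
    true_pos = stage1_true_pos * sens2
    false_pos = stage1_false_pos * (1 - spec2)
    false_neg = depressed - true_pos
    true_neg = not_depressed - false_pos

    return {
        "stage_one_positives": stage1_true_pos + stage1_false_pos,
        "true_pos": true_pos,
        "true_neg": true_neg,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "accuracy_pct": 100 * (true_pos + true_neg) / n,
    }


# Point-scored PHQ-2 (83/92 percent) followed by the PHQ-9 (88/88 percent)
result = two_stage_screen(1000, 0.15, stage1=(0.83, 0.92), stage2=(0.88, 0.88))
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the sketch yields roughly 192.5 stage-one positives and an overall accuracy near 95.1 percent, which agrees with Table 3 after rounding.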

Family physicians who use the results of the PHQ-9 as supportive evidence for the presence of major depressive episode still should confirm the diagnosis by ruling out physical causes of depression, differentiating anxiety and physical symptoms that may mimic depression, and eliciting any history of manic episodes or bereavement that could confound the diagnosis.

Stage-two screening with instruments that have high specificity, including the Medical Outcomes Study–Depression questionnaire and others, also produces good results30 (Table 3). Other conditions that mimic depression still need to be ruled out.


The PHQ-9 can be used to monitor the severity of depressive symptoms and assess response to treatment. PHQ-9 scores of 15 points or higher reliably indicate moderate to severe impairment from depression.
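Scoring and monitoring with the PHQ-9 can be sketched as follows. The helper names are hypothetical, the 0-to-3 rating per item is inferred from the nine items and 0-to-27 range in Table 1, and the two cutoffs are the ones quoted in the text (10 points and 15 points). The example scores are invented for illustration.

```python
def phq9_total(item_scores):
    """Sum nine item scores, each rated 0 to 3, giving a total of 0 to 27."""
    if len(item_scores) != 9:
        raise ValueError("the PHQ-9 has nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def phq9_flags(total):
    """Apply the two cutoffs mentioned in the text to a PHQ-9 total."""
    return {
        "suggests_major_depressive_episode": total >= 10,
        "moderate_to_severe_impairment": total >= 15,
    }


# Illustrative (made-up) scores at baseline and at a follow-up visit
baseline = phq9_total([2, 2, 1, 2, 1, 2, 2, 2, 2])   # 16 points
follow_up = phq9_total([1, 1, 1, 1, 0, 1, 1, 1, 1])  # 8 points
change = follow_up - baseline                         # negative means fewer symptoms
```

A falling total across visits is one simple way to track response to treatment; the flags support the clinical assessment described above and do not replace it.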

Clinical Interview

Family physicians who do not want to use formal questionnaires can ask patients about specific DSM-IV criteria for major depressive episode (Table 4).11 This structured interview can be used to monitor response to treatment and to assess patients for recurrence or relapse of major depressive episode.

The rightsholder did not grant rights to reproduce this item in electronic media. For the missing item, see the original print version of this publication.

Assessment of Cognition in Older Adults

Age is the most significant risk factor for dementia, a syndrome characterized by a decline in memory and in at least one of the following areas: language, visuospatial skills, and executive functioning. From 3 to 11 percent of persons older than 65 years and 25 to 47 percent of persons older than 85 years have a dementing disorder. Alzheimer’s disease and cerebrovascular ischemia are common causes of dementing disorders.31

The 2003 USPSTF report31 concluded that evidence is insufficient to support a recommendation for or against routine screening for dementia in older adults. The report qualifies this conclusion by noting that physicians “should assess cognitive function whenever cognitive impairment or deterioration is suspected, based on direct observation, patient report, or concerns raised by family members, friends, or caretakers.”31 Thus, a case-finding approach is more appropriate than a population-based approach in screening for dementia.

A positive response to a question about symptoms (e.g., “Do you have any memory or thinking problems that are interfering with your daily life?”) suggests the need for formal mental status testing. Although a serial approach to screening may be effective, it has not undergone formal evaluation.

Not all patients who experience memory loss have dementia. For example, delirium, medication use, and psychiatric illnesses such as amnestic disorders may be associated with cognitive impairment. These possible causes of memory loss need to be ruled out before the diagnosis of dementia is made. Note that dismissive statements, such as “That’s normal for your age,” are inappropriate, even if they are intended to reassure the patient.


Because many short cognitive tests are available, choosing the most appropriate test can be difficult. The Mini-Mental State Examination (MMSE) is the best-studied instrument. The USPSTF report31 notes that the sensitivity of the MMSE for dementia ranges from 71 to 92 percent, and the specificity ranges from 56 to 96 percent.32,33 Therefore, the predictive value of a positive test may range from 15 to 72 percent,34 depending on the population to which the MMSE is applied and the cutoff score that is used to define an abnormal test. Table 5 summarizes the accuracy of the MMSE and other screening tests for dementia.32–40

| Instrument | Sensitivity (%) | Specificity (%) | Positive predictive value (%) | Negative predictive value (%) |
| --- | --- | --- | --- | --- |
| Mini-Mental State Examination | 71 to 92 | 56 to 96 | 15 to 72 | 95 to 99 |
| Functional Activities Questionnaire | 90 | 90 | 50 | 99 |
| Blessed Information Memory Concentration | 90 | 65 to 90 | 22 to 50 | 98 to 99 |
| Blessed Orientation Memory Concentration | 69 | 90 | 43 | 96 |
| Short Test of Mental Status | 81 | 90 | 47 | 98 |
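Predictive values such as those in Table 5 depend on the prevalence of dementia in the tested population, which this section does not state. As a rough check, an assumed prevalence of 10 percent (an inference from the numbers, not a figure given in the text) reproduces the single-value rows of the table. The function name and the `ASSUMED_PREVALENCE` constant are illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


ASSUMED_PREVALENCE = 0.10  # assumption, not stated in the text

# (sensitivity, specificity) pairs taken from Table 5
instruments = {
    "Functional Activities Questionnaire": (0.90, 0.90),
    "Blessed Orientation Memory Concentration": (0.69, 0.90),
    "Short Test of Mental Status": (0.81, 0.90),
}
for name, (sens, spec) in instruments.items():
    ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE)
    print(f"{name}: PPV {ppv:.0%}, NPV {npv:.0%}")
```

Changing `ASSUMED_PREVALENCE` shows how quickly the positive predictive value falls when the same instrument is applied to a lower-risk group, which is the argument for case finding over population-based screening.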

Initially published in 1975, the MMSE is a 30-item screening instrument that also can be used to monitor the effectiveness of treatment.41,42 The developers of this instrument recently provided definitive instructions for its administration and included new references to population norms that guide interpretation of scores.43 When properly administered, the MMSE is a valid and reliable test for identifying cognitive impairments in high-risk patients. The MMSE takes five to 10 minutes to administer and score.


The Clock Drawing Test is a brief, easily understood psychometric instrument that sometimes is combined with a test of the ability to make correct monetary change.31 Recent evidence suggests that the Clock Drawing Test may have value in the assessment of visuospatial and executive functions (areas that the MMSE does not test well).44,45 In the Clock Drawing Test, the patient is given a blank sheet of paper and told to draw the face of a clock with the numbers in their correct positions. The patient then is instructed to draw in clock hands to show a time of 11:10. There are variations in scoring the test. The simplest method includes four equally weighted components: one point each for drawing a closed circle, including all 12 numbers, correctly placing the numbers, and placing clock hands at the designated time. The test takes three to four minutes to administer.
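The simplest scoring method described above can be written as a four-point tally. This is only a sketch: the function and parameter names are hypothetical, and no cutoff is applied because the text does not give one for this method.

```python
def clock_drawing_score(closed_circle: bool, all_12_numbers: bool,
                        numbers_correctly_placed: bool,
                        hands_show_11_10: bool) -> int:
    """Four equally weighted components, one point each (score 0 to 4)."""
    components = [closed_circle, all_12_numbers,
                  numbers_correctly_placed, hands_show_11_10]
    return sum(components)


# Example: a closed circle with all 12 numbers, but with the numbers misplaced
# and the hands not set to 11:10, scores two of four points.
score = clock_drawing_score(True, True, False, False)
```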

In the Time and Change Test,32,46 the patient first is shown a clock face set at 11:10 and asked to tell the time. Response time is measured with a stopwatch. The patient is allowed two tries for a correct response within a one-minute period. For the change-making task, three quarters, seven dimes, and seven nickels are placed in front of the patient, who then is asked to provide one dollar in change. The patient is allowed two tries within a two-minute period. Reducing the time limit to 12 seconds makes the test highly sensitive but less specific.46 Incorrect responses on either or both tasks are scored as a positive result, indicating dementia. A correct response on both tasks is scored as a negative result.
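The scoring rule for the Time and Change Test can be sketched in a few lines. The representation is an assumption made for illustration: each attempt is recorded as a pair of (correct, seconds elapsed since the task began), only the first two tries count, and the default limits are the one-minute and two-minute periods given in the text. The function names are hypothetical.

```python
def task_passed(attempts, time_limit_seconds):
    """Return True if either of the first two tries is correct within the limit.

    attempts is a list of (correct, elapsed_seconds) pairs, with elapsed time
    measured from the start of the task (an assumption of this sketch).
    """
    return any(correct and elapsed <= time_limit_seconds
               for correct, elapsed in attempts[:2])


def time_and_change_positive(telling_time_attempts, making_change_attempts,
                             time_limit=60, change_limit=120):
    """Positive (abnormal) result if either task is failed; negative if both pass."""
    time_ok = task_passed(telling_time_attempts, time_limit)
    change_ok = task_passed(making_change_attempts, change_limit)
    return not (time_ok and change_ok)


# Reads the clock on the first try, makes change correctly on the second try.
negative_example = time_and_change_positive([(True, 10)], [(False, 30), (True, 90)])

# Fails to read the clock in two tries, even though change-making is correct.
positive_example = time_and_change_positive([(False, 20), (False, 50)], [(True, 40)])
```

Passing shorter limits through `time_limit` and `change_limit` models the stricter timing mentioned above, which trades specificity for sensitivity.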

Difficulty performing routine activities of daily living suggests cognitive decline. The Katz Index of Activities of Daily Living Scale and the Instrumental Activities of Daily Living Scale are observer-dependent descriptive tools that have been used for many years.47 These scales provide a framework for assessing level of performance and rate of cognitive decline.47 The USPSTF report31 suggests use of the Functional Activities Questionnaire (Figure 3).48 The questionnaire is reported to be 90 percent sensitive and specific for the identification of dementia.31,32 The primary disadvantage of all functional assessments is that they depend on caregiver observation and report, and not all patients have caregivers.

More intensive neuropsychologic testing is indicated when a patient suffers from sensory losses, test scores are normal but function is abnormal, and impairment is present in one area of cognition.

Copyright © 2004 by the American Academy of Family Physicians.

This content is owned by the AAFP. A person viewing it online may make one printout of the material and may use that printout only for his or her personal, non-commercial reference. This material may not otherwise be downloaded, copied, printed, stored, transmitted or reproduced in any medium, whether now known or later invented, except as authorized in writing by the AAFP.  See permissions for copyright questions and/or permission requests.