Items are listed in the order in which they appear in the NASAL-7 smell test, with their respective substitutions. The NASAL-3 test uses only items 1, 2, and 7. If both the item and its substitution were unavailable, an unused substitution was taken as a replacement at the end of the test.
Horizontal lines indicate the median values; boxes, the range between the first and third quartiles of the data set; whiskers, minimum and maximum values; dots, outliers.
eTable. Performance of 45 Items on Household Survey
eFigure 1. Clusters of 174 Smell Descriptors
eFigure 2. 45-Item Household Survey Experience
eFigure 3. Boxplots of the CGI-S Response Categories with Total NASAL-7 and NASAL-3 Scores
Gupta S, Kallogjeri D, Farrell NF, et al. Development and Validation of a Novel At-home Smell Assessment. JAMA Otolaryngol Head Neck Surg. 2022;148(3):252–258. doi:10.1001/jamaoto.2021.3994
Can an olfactory dysfunction be assessed via commonly available household items?
In this diagnostic study of 115 adults with self-reported olfactory dysfunction, 2 short, patient-reported assessments using common household items were easy to use and had moderate accuracy when detecting olfactory dysfunction.
These findings suggest that these novel assessments can be used by people seeking to test their sense of smell and by health care professionals to detect olfactory dysfunction.
Current tools for diagnosis of olfactory dysfunction (OD) are costly, time-consuming, and often require clinician administration.
To develop and validate a simple screening assessment for OD using common household items.
Design, Setting, and Participants
This fully virtual diagnostic study included adults with self-reported OD from any cause throughout the US. Data were collected from December 2020 to April 2021 and analyzed from May 2021 to July 2021.
Main Outcomes and Measures
Participants with self-reported olfactory dysfunction took a survey assessing smell perception of 45 household items and completed the Clinical Global Impression–Severity (CGI-S) smell questionnaire, the University of Pennsylvania Smell Identification Test (UPSIT), and the 36-item Short Form Survey (SF-36). Psychometric and clinimetric analyses were used to consolidate 45 household items into 2 short Novel Anosmia Screening at Leisure (NASAL) assessments, NASAL-7 (range, 0-14; lower score indicating greater anosmia) and NASAL-3 (range, 0-6; lower score indicating greater anosmia).
A total of 115 participants were included in the study, with a median (range) age of 42 (19-70) years, 92 (80%) women, and 97 (84%) White individuals. There was a moderate correlation between the UPSIT scores and both the NASAL-7 and NASAL-3 scores (NASAL-7: ρ = 0.484; NASAL-3: ρ = 0.404). Both NASAL-7 and NASAL-3 had moderate accuracy in identifying participants with anosmia as defined by UPSIT (NASAL-7 area under the receiver operating characteristic curve [AUC], 0.706; 95% CI, 0.551-0.862; NASAL-3 AUC, 0.658; 95% CI, 0.503-0.814). Scoring 7 or less on the NASAL-7 had 70% (95% CI, 48%-86%) sensitivity and 53% (95% CI, 43%-63%) specificity in discriminating participants with anosmia from participants without. Scoring 2 or less on the NASAL-3 had 57% (95% CI, 36%-76%) sensitivity and 78% (95% CI, 69%-85%) specificity in discriminating participants with anosmia from participants without. There was moderate agreement between UPSIT-defined OD categories and those defined by NASAL-7 (weighted κ = 0.496; 95% CI, 0.343-0.649) and those defined by NASAL-3 (weighted κ = 0.365; 95% CI, 0.187-0.543). Agreement between self-reported severity of olfactory dysfunction, as measured by CGI-S, and the NASAL-7 and NASAL-3 was moderate, with a weighted κ of 0.590 (95% CI, 0.474-0.707) for the NASAL-7 and 0.597 (95% CI, 0.481-0.712) for the NASAL-3.
Conclusion and Relevance
The findings of this diagnostic study suggest that NASAL-7 and NASAL-3, inexpensive and brief patient-reported assessments, can be used to identify individuals with OD. As the burden of COVID-19–associated OD increases, these assessments may prove beneficial as screening and diagnostic tools. Future work will explore whether the NASAL assessments are sensitive to change and how much of a change is clinically important.
The ongoing COVID-19 pandemic reaffirms the need to diagnose postviral olfactory dysfunction (OD) effectively, as OD is one of the most common presenting symptoms of SARS-CoV-2 infection. Recent observational studies have found that OD may be the symptom most predictive of a positive COVID-19 test.1-4 Emerging data on COVID-19–related OD suggest that as many as 10% of adults with COVID-19–associated acute olfactory loss may develop permanent OD, thus affecting millions of people worldwide. Many patients also experience dysosmia, such as phantosmias and parosmias, a few months following their initial infection with COVID-19.5,6
OD plays an important role in both quality of life and as a marker of other diseases. In adults who experienced symptoms like shortness of breath, fever, cough, and rhinorrhea as a part of their COVID-19 infection, only loss of smell and/or taste was associated with increased depression and anxiety.7 OD has also been identified as a preclinical marker for Alzheimer disease and Parkinson disease.8,9 Screening for OD among older adults may provide an opportunity for early intervention and prevent further deterioration even before the presence of mild cognitive impairment.10,11 There is also evidence to suggest that OD can be a predictor of 5-year mortality among older adults, for whom healthier habits can be reinforced.12,13 Assessing smell function also helps determine the usefulness of possible treatments for OD, such as budesonide irrigations or smell training.14-17 Therefore, it is imperative to create robust diagnostic tools to determine the presence and extent of an individual’s OD.
The current criterion standard for the detection of OD is the University of Pennsylvania Smell Identification Test (UPSIT).18-20 The UPSIT is a 40-question, forced-choice odor identification test in which microencapsulated odorants on a strip are released by scratching. While validated for OD identification, the UPSIT is time consuming, requires a clinician to interpret the results, and costs approximately $27 per test. The Sniffin’ Sticks,21 a more elaborate smell test, assesses odor threshold, discrimination, and identification. However, Sniffin’ Sticks must be administered by a trained clinician. In light of the difficulty in measuring smell function with existing tools, the need for another option became evident.
A cost-effective, self-administered, at-home diagnostic smell test can be used to identify individuals with OD. Our aim was to develop and validate such an assessment, the Novel Anosmia Screening at Leisure (NASAL), which uses commonly found household items.
This prospective study was conducted online. The protocol was approved by the Washington University institutional review board. Participants provided electronic consent and all responses to surveys online via REDCap version 9.0. Participants were eligible if they were aged 18 to 70 years, lived in the United States, and reported an impaired ability to smell. Participants were excluded if they were unable to provide electronic consent, were pregnant, or had their first symptom of COVID-19 or other viral infection in the last 4 weeks. This last exclusion was because of possible fluctuations in smell for the duration of the study. Participants were recruited via social media groups for anosmia and national research participant databases. This study followed the Standards for Reporting of Diagnostic Accuracy (STARD) reporting guideline.
The household items used for the smell test were identified based on the odorant clustering system developed by Castro and colleagues via nonnegative matrix factorization.22-24 The established 174 smell descriptors were sorted into 8 distinct categories, ranging from pungent to fragrant (eFigure 1 in the Supplement). Forty-six broad or inappropriate items for a smell test, such as “dry” or “sewer,” were removed. Items thought to stimulate trigeminal response were removed. The remaining 128 items were presented to research students, faculty, staff, and patients at Washington University in an anonymous survey to determine the availability of each item in one’s house. Based on responses from 55 participants, 45 items that were available in at least 50% of households and/or represented odorant clusters important for a smell test were identified.
The 45-item household survey was developed with specific instructions on how to smell each representative item. The response categories were “Smells like normal,” “Smells less strong than normal,” “Smells different than normal,” “Cannot smell,” and “Do not have access to this item.” Items were organized by location in a household, such as kitchen or bathroom (eFigure 2 in the Supplement).
Participants completed the 45-item household survey, the UPSIT (2020 revision), Clinical Global Impression–Severity (CGI-S) scale for perceived smell dysfunction,25,26 and the 36-item Short Form Health Survey (SF-36).27 Participants received the UPSIT packages in the mail. They were instructed to expect an email providing them with a link to the REDCap survey for recording their responses to sniffing each of the 40 UPSIT items. The scoring of the results was performed in our laboratory based on each participant’s recorded responses to each item. The CGI-S poses the question, “Overall, what is your current ability to smell?” with 6 response options (absent, poor, fair, good, very good, or excellent), adapted from a validated patient-reported outcome measure commonly used in psychometric studies. The SF-36 is a self-reported measure of general health status in 8 different domains.
Standard descriptive statistics were used to describe the distribution of demographic variables, clinical characteristics, and olfactory test results of the study population. Items in the 45-item household survey were ranked based on availability. Sensitivity, specificity, and Youden J statistic were used to assess each item’s performance for assessing presence of OD as defined by UPSIT and CGI-S. Item reduction to develop the 7-item and 3-item NASAL tests was performed with special focus on content validity and face validity. For content validity, the clinical relevance of each item was assessed with attention to item availability and to limiting redundancy within odor clusters. Face validity of items useful in a smell test was established with 2 otolaryngologists (N.F.F. and J.F.P.) treating anosmia at Washington University. Internal consistency was evaluated using Cronbach α. NASAL-7 and NASAL-3 underwent further psychometric and clinimetric analyses to establish convergent validity and criterion validity with UPSIT and CGI-S as well as discriminant validity with the SF-36. Spearman correlation coefficient was used to evaluate the correlation of scores on the NASAL-7 and NASAL-3 with UPSIT total score. Receiver operating characteristic (ROC) curve analysis was used to determine diagnostic accuracy of each of the NASAL instruments in discriminating participants with anosmia as defined by a UPSIT score between 6 and 18. Sensitivity and specificity were calculated to evaluate the performance of an optimal cutoff point for NASAL-7 and NASAL-3 in identifying participants with anosmia or OD of at least moderate degree, as defined by UPSIT. Spearman correlation coefficient was used to establish discriminant validity by exploring the correlation between SF-36 domain scores and the 2 NASAL scores.
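As a rough illustration of the ROC analysis described above, the area under the ROC curve can be computed as the probability that a randomly chosen participant with anosmia scores lower on a NASAL instrument than a randomly chosen participant without anosmia, with ties counted as one half. The sketch below assumes this rank-based formulation; it is not the SPSS procedure used in the study, and the function name and the scores are hypothetical.

```python
def auc_lower_score_is_positive(scores_with_condition, scores_without_condition):
    """Rank-based AUC for a test where LOWER scores indicate the condition.

    Counts, over all (with, without) pairs, how often the participant with
    the condition has the lower score; ties contribute one half.
    """
    if not scores_with_condition or not scores_without_condition:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for with_score in scores_with_condition:
        for without_score in scores_without_condition:
            if with_score < without_score:
                concordant += 1.0
            elif with_score == without_score:
                concordant += 0.5
    return concordant / (len(scores_with_condition) * len(scores_without_condition))


# Hypothetical NASAL-7 totals (illustrative only, not study data)
anosmia_scores = [2, 4, 7, 9]
non_anosmia_scores = [6, 8, 10, 12, 14]
example_auc = auc_lower_score_is_positive(anosmia_scores, non_anosmia_scores)
print(round(example_auc, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.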
An OD severity classification system was established using box and whisker plots displaying the distribution of total scores on the NASAL-7 and NASAL-3 through CGI-S categories. The distribution of UPSIT scores through each of the categories derived from the NASAL-7 and NASAL-3 was compared using the Kruskal-Wallis test. The system’s performance was evaluated by determining agreement with classification of participants through CGI-S and UPSIT categories using weighted κ statistic with quadratic weights. Only participants who had completed all surveys were included in the analysis. Statistical analyses were performed using SPSS statistical software version 27 (IBM Corp). All statistical tests were 2-tailed and evaluated at α = .05.
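For readers unfamiliar with the weighted κ statistic with quadratic weights, the following is a minimal sketch of the standard formula, κ = 1 − Σ w·O / Σ w·E, where O and E are the observed and chance-expected proportions and w = (i − j)² / (k − 1)² penalizes larger disagreements more heavily. Categories are assumed to be coded as ordered integers 0 to k − 1; the function name and example ratings are hypothetical and are not taken from the study data.

```python
def quadratic_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Weighted kappa with quadratic weights for two ordinal classifications.

    ratings_a and ratings_b are equal-length lists of integer category codes
    in the range 0 .. n_categories - 1 (for example, 0 = anosmia).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = n_categories
    n = len(ratings_a)

    # Cross-tabulate the two classifications.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j] / n
            weighted_expected += weight * row_totals[i] * col_totals[j] / n ** 2
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical 4-level classifications (illustrative only, not study data)
perfect = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4)
reversed_order = quadratic_weighted_kappa([0, 1, 2, 3], [3, 2, 1, 0], 4)
print(perfect, reversed_order)
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.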
A total of 495 participants were screened, and 261 met the eligibility criteria. From the 165 participants enrolled from November 2020 to March 2021, 115 completed the study (Figure 1). Patient demographic characteristics (median [range] age, 42 [19-70] years; 92 [80%] women, 97 [84%] White individuals) are described in Table 1. Dysosmia was common among this sample, in which 67 participants (58%) experienced parosmia or phantosmia or both. Dysgeusia was reported in 59 participants (51%). Most participants (70 [61%]) had OD for less than 1 year. The most common cause given for OD was COVID-19 (53 participants [46%]). Eight participants (7%) reported currently smoking.
Based on the UPSIT scores, 38 participants (33%) had normosmia, 56 (48%) had some degree of hyposmia (28 mild, 13 moderate, and 15 severe), and 21 (18%) had anosmia (Table 2). Based on self-reported severity of smell dysfunction using CGI-S, the most frequent response was poor (53 participants [46%]), followed by fair (32 [28%]) (Table 2). The SF-36 scores were lowest for the energy/fatigue domain, highest for the role limitations due to physical health domain, and representative of published national normative scores.28
From the 45 household items (eTable in the Supplement), soap was the most available item, found in the homes of 114 of 115 participants (99%). Overall, 21 items with 87% availability or less were excluded. A list of 14 items was established via item reduction based on content validity. Each item had a Youden J of 0.44 or greater. Seven items representing different clusters were prioritized to create NASAL-7. The other 7 were used as substitute items if the original item was not available to increase the feasibility of the NASAL-7 without diminishing face validity and discriminative power.
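The Youden J statistic used for item selection is sensitivity + specificity − 1, computed from a 2 × 2 table of item response against the reference classification. The sketch below assumes an item counts as "positive" when the participant reports an abnormal smell for it; that dichotomization, the function name, and the counts are hypothetical and are shown only to make the arithmetic concrete.

```python
def sensitivity_specificity_youden(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, Youden J) from 2x2 table counts."""
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("each reference group must contain at least one participant")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    youden_j = sensitivity + specificity - 1.0
    return sensitivity, specificity, youden_j


# Hypothetical counts for a single household item (not study data):
# 40 of 50 participants with OD reported the item as abnormal,
# 32 of 40 participants without OD reported it as normal.
sens, spec, j = sensitivity_specificity_youden(
    true_pos=40, false_neg=10, true_neg=32, false_pos=8
)
print(round(sens, 2), round(spec, 2), round(j, 2))
```

J ranges from −1 to 1; a value of 0 means the item discriminates no better than chance, so the 0.44 threshold reported above retains only items with substantial discriminative power.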
Each of the items in the NASAL assessment was scored as 0 for cannot smell, 1 for smells less strong/different than normal, and 2 for smells normal. The NASAL-7 contains 7 items (Figure 2), with a total score ranging from 0 to 14. Scores were obtained for all but 1 participant. Cronbach α for the NASAL-7 was 0.893. The NASAL-3 was created by selecting 3 of the 7 items on the NASAL-7 (Figure 2) while trying to preserve face and content validity. The NASAL-3 was scored the same way and had a total score ranging from 0 to 6. Cronbach α was 0.809. The NASAL-7 took approximately 3 to 10 minutes to complete, whereas NASAL-3 took 1 to 5 minutes.
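The scoring rule above can be sketched in a few lines. The response labels are those of the household survey; the function name and the example responses are hypothetical, and the sketch assumes that an unavailable item has already been replaced by its substitution before scoring.

```python
# Points per response, following the 0 / 1 / 2 scoring rule described above.
ITEM_POINTS = {
    "Cannot smell": 0,
    "Smells less strong than normal": 1,
    "Smells different than normal": 1,
    "Smells like normal": 2,
}


def nasal_total(responses):
    """Sum item points for a list of responses (7 for NASAL-7, 3 for NASAL-3)."""
    total = 0
    for response in responses:
        if response not in ITEM_POINTS:
            # For example "Do not have access to this item": score a substitute instead.
            raise ValueError(f"unscorable response: {response!r}")
        total += ITEM_POINTS[response]
    return total


# Hypothetical NASAL-7 responses (illustrative only)
example_responses = [
    "Smells like normal",
    "Smells like normal",
    "Smells less strong than normal",
    "Cannot smell",
    "Smells different than normal",
    "Smells like normal",
    "Cannot smell",
]
example_total = nasal_total(example_responses)
print(example_total)  # 2 + 2 + 1 + 0 + 1 + 2 + 0 = 8
```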
There was a moderate correlation between the overall UPSIT and the scores on NASAL-7 (ρ = 0.484; 95% CI, 0.325-0.617). The accuracy of NASAL-7 total score to identify participants with anosmia as defined by UPSIT was moderate (area under the ROC curve [AUC], 0.706; 95% CI, 0.551-0.862). When participants were dichotomized as having anosmia based on a NASAL-7 score of 7 or less and as not having anosmia with scores greater than 7, the NASAL-7 had 70% (95% CI, 48%-86%) sensitivity and 53% (95% CI, 43%-63%) specificity in discriminating those with anosmia from the rest of the participants. The accuracy of NASAL-7 improved when discriminating participants with at least moderate hyposmia, as defined by UPSIT (AUC, 0.814; 95% CI, 0.727-0.900). A cutoff value of 7 was optimal for classification of participants, and when individuals were categorized as having a NASAL-7 score of 7 or less vs greater than 7, the test had 79% (95% CI, 66%-89%) sensitivity and 70% (95% CI, 58%-80%) specificity in discriminating participants with at least moderate hyposmia from the rest of the participants. The NASAL-7 had good accuracy in discriminating participants identified from UPSIT as having anosmia (20 participants) from those with normosmia (38 participants), with an AUC of 0.77 (95% CI, 0.62-0.92), a sensitivity of 70% (95% CI, 48%-86%), and a specificity of 76% (95% CI, 61%-88%). Among the 38 participants defined as having normosmia, the NASAL-7 identified 29 participants (76%) as having normosmia, 8 with severe hyposmia, and 1 with anosmia. Among the 21 participants defined as having anosmia, 17 participants (75%) were defined as having anosmia by the NASAL-7.
There was a moderate correlation between the overall UPSIT and NASAL-3 scores (ρ = 0.404; 95% CI, 0.233-0.550). The accuracy of NASAL-3 to identify participants with anosmia defined by UPSIT was moderate (AUC, 0.658; 95% CI, 0.503-0.814). When participants were dichotomized as having anosmia based on a NASAL-3 score of 2 or less, the test had 57% (95% CI, 36%-76%) sensitivity and 78% (95% CI, 69%-85%) specificity. The NASAL-3 had good accuracy in discriminating participants with at least moderate hyposmia as defined by UPSIT (AUC, 0.775; 95% CI, 0.683-0.867) and had 59% (95% CI, 45%-72%) sensitivity and 94% (95% CI, 86%-98%) specificity.
The NASAL-3 had good accuracy in discriminating participants identified from UPSIT as having anosmia (21 participants) from those with normosmia (38 participants). The AUC was 0.73 (95% CI, 0.57-0.88), and it had 57% (95% CI, 36%-76%) sensitivity and 95% (95% CI, 84%-99%) specificity. Among 38 participants defined as having normosmia based on the UPSIT test, NASAL-3 identified 1 participant with anosmia and 21 participants with hyposmia. Among the 21 participants defined as having anosmia based on UPSIT, only 4 participants (25%) were defined as having normosmia by the NASAL instruments.
The distributions of total scores from the NASAL-7 and NASAL-3 assessments within CGI-S categories were used to determine categories of OD severity (eFigure 3 in the Supplement). For the NASAL-7 (eFigure 3A in the Supplement), the following 4 categories were determined: anosmia (score, 0-4), severe dysfunction (score, 5-7), mild dysfunction (score, 8-10), and normosmia (score, 11-14). For the NASAL-3 (eFigure 3B in the Supplement), the following 3 categories were determined: anosmia (score, 0-1), hyposmia or dysosmia (score, 2-4), and normosmia (score, 5-6).
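The severity categories above amount to a simple lookup on the total score. The following sketch encodes the reported cutoffs; the function names are hypothetical.

```python
def nasal7_category(total):
    """Map a NASAL-7 total (0-14) to the 4 severity categories reported above."""
    if not 0 <= total <= 14:
        raise ValueError("NASAL-7 total must be between 0 and 14")
    if total <= 4:
        return "anosmia"
    if total <= 7:
        return "severe dysfunction"
    if total <= 10:
        return "mild dysfunction"
    return "normosmia"


def nasal3_category(total):
    """Map a NASAL-3 total (0-6) to the 3 severity categories reported above."""
    if not 0 <= total <= 6:
        raise ValueError("NASAL-3 total must be between 0 and 6")
    if total <= 1:
        return "anosmia"
    if total <= 4:
        return "hyposmia or dysosmia"
    return "normosmia"


print(nasal7_category(8))  # mild dysfunction
print(nasal3_category(2))  # hyposmia or dysosmia
```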
Next, UPSIT score distribution was examined within the 4 categories of NASAL-7 (Figure 3A) and the 3 categories of NASAL-3 (Figure 3B). Kruskal-Wallis test showed that there was a significant difference (H3 = 32.303; P < .001) in UPSIT scores between the 4 categories of NASAL-7 (Figure 3A) and between the 3 categories of NASAL-3 (H2 = 21.479; P < .001) (Figure 3B).
Agreement between the 4 OD categories defined by the NASAL-7 and the UPSIT categories (anosmia, severe/moderate hyposmia, mild hyposmia, and normosmia) was moderate (weighted κ = 0.496; 95% CI, 0.343-0.649). Agreement between the 3 OD categories defined by the NASAL-3 and the corresponding UPSIT categories (anosmia, severe/moderate/mild hyposmia, and normosmia) was poor to moderate (weighted κ = 0.365; 95% CI, 0.187-0.543).
The agreement between the classification of participants based on NASAL-7 and NASAL-3 assessments and self-reported severity of OD as measured by CGI-S was moderate, with a weighted κ of 0.590 (95% CI, 0.474-0.707) for the NASAL-7 and 0.597 (95% CI, 0.481-0.712) for the NASAL-3. Correlations between each of the 8 domains of SF-36 and NASAL-7 or NASAL-3 were weak, with the strongest correlations observed between the emotion domain and NASAL-7 (ρ = 0.16) and NASAL-3 (ρ = 0.15).
In this diagnostic study, a new brief, self-administered smell assessment was developed using commonly available household items. Reduction of a 45-item household survey via psychometric and clinimetric analyses resulted in 2 valid assessments: the NASAL-7 and NASAL-3.
NASAL-7 and the NASAL-3 have face and content validity and are internally consistent. The assessments use items with diverse smells, have broad availability in US homes, and minimize trigeminal stimulation from smells that could be painful to some patients. Both assessments were significantly correlated with performance on the CGI-S and the UPSIT. ROC analysis suggests that both NASAL assessments have moderate accuracy in detecting anosmia.
Participants in this study all self-reported some level of subjective smell dysfunction; however, 33% scored in the normosmic range on the UPSIT. The observed lack of agreement between the CGI-S and the UPSIT is not surprising, given that subjective self-reported measures do not correlate well with objective measures in multiple clinical conditions. Both the CGI-S and UPSIT were used to internally validate NASAL-7 and NASAL-3 because no current standard for detecting OD is optimal.
Importantly, the NASAL assessments address the limitations posed by the current assessments of OD. The NASAL assessments were developed to be inexpensive and brief, but they are also more granular than the CGI-S, can test an individual’s ability to smell items at normal odor concentrations, and allow for participants to report when items smell less strong (a sign of hyposmia or desensitization) or different (a sign of dysosmia). These options serve as a preliminary measure of concerns with odor identification and threshold along with discrimination. The ease of administration is another benefit. Both the NASAL-7 and the NASAL-3 can be performed and scored at home by the patient without need for a clinician. Included items are easily available and universal, making the NASAL assessments particularly inexpensive.
The NASAL-7 and the NASAL-3 each contribute unique information for people who wish to test their sense of smell. In comparison to NASAL-7, the NASAL-3 includes 4 fewer items, requires less time to complete, and had slightly greater agreement with CGI-S. In comparison with NASAL-3, the NASAL-7 correlated more strongly with both CGI-S and UPSIT and was slightly more accurate in detecting anosmia and normosmia. NASAL-7 was validated to detect nuanced levels of disease, particularly discriminating between severe and mild dysfunction. People can take the test at home and share their results in clinic or smell the items in the office while waiting for an appointment. The NASAL assessments can be used by health care workers to detect OD and may be able to assess efficacy of different treatments for the rapidly growing population with COVID-19–associated OD. In future research, external validation will be needed in other OD populations, and sensitivity to change, test-retest reliability, and determination of minimally clinically important difference values will need to be defined.
This study has limitations. As a result of the ongoing COVID-19 pandemic, this study was performed virtually, without participants visiting the Clinical Outcomes Research Office at Washington University School of Medicine in St Louis. The assessments were completed by individuals at home without supervision, the same environment in which the NASAL assessments will be used. We enrolled participants from around the US for all causes of OD, thus increasing generalizability. As with any survey, this study may be limited by respondent bias. We were not able to validate the NASAL instrument in disorders associated with aging, such as Parkinson disease and Alzheimer disease, because of the upper age limit of the study and the online-focused recruitment method. Other etiologies of OD, such as chronic rhinosinusitis and nasal polyps, could not be sufficiently evaluated by a health care professional. The interpretation of NASAL-7 and NASAL-3 performance was limited by the lack of a clear criterion standard for OD. In the future, comparison of the performance of the NASAL assessments with in-person clinical tests, such as the Sniffin’ Sticks, would be valuable.
To address the limitations posed by current diagnostic tests of olfactory function, the NASAL assessments in this diagnostic study were developed and validated as simple and inexpensive patient-reported tools to identify people with OD. The validated NASAL-7 and NASAL-3 assessments do not require special materials or clinician assessment and can be used widely by individuals who may be at risk of postviral OD or OD from other pathologies. Health care workers and researchers may also wish to incorporate the NASAL assessment in routine clinical practice. Future research is required to define sensitivity to change and define the minimal difference that is clinically important.
Accepted for Publication: November 15, 2021.
Published Online: January 13, 2022. doi:10.1001/jamaoto.2021.3994
Corresponding Author: Jay F. Piccirillo, MD, Clinical Outcomes Research Office, Department of Otolaryngology–Head & Neck Surgery, Washington University School of Medicine in St. Louis, 660 South Euclid Avenue, Campus Box 8115, St Louis, MO 63110 (firstname.lastname@example.org).
Author Contributions: Ms Gupta and Dr Kallogjeri had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Gupta, Kallogjeri, Smith, Khan, Piccirillo.
Acquisition, analysis, or interpretation of data: Gupta, Kallogjeri, Farrell, Lee, Piccirillo.
Drafting of the manuscript: Gupta.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Gupta, Kallogjeri, Smith, Khan.
Obtained funding: Piccirillo.
Administrative, technical, or material support: Gupta, Lee, Piccirillo.
Supervision: Gupta, Farrell, Lee, Piccirillo.
Conflict of Interest Disclosures: Dr Kallogjeri reported owning stock in Potentia Metrics and receiving personal fees for serving as a statistics editor for JAMA Otolaryngology–Head & Neck Surgery outside the submitted work. Dr Piccirillo reported receiving royalty payments for Sino-Nasal Outcome Test, licensed by Washington University. Dr Piccirillo also received fees for serving as the Editor of JAMA Otolaryngology–Head & Neck Surgery. No other disclosures were reported.
Funding/Support: Research reported in this publication was supported by grant T32DC000022 from the Development of Clinician/Researchers in Academic ENT of the National Institute on Deafness and Other Communication Disorders (Ms Gupta, Mr Smith, and Dr Lee) and grant TL1TR002344 from the National Center for Advancing Translational Sciences of the National Institutes of Health (Mr Khan). Recruitment for this study was supported by ResearchMatch and the Recruitment Enhancement Core (Volunteers for Health). ResearchMatch is a national health volunteer registry that was created by several academic institutions and supported by the National Institutes of Health as part of the Clinical Translational Science Award program. ResearchMatch has a large population of volunteers who have consented to be contacted by researchers about health studies for which they may be eligible. The Recruitment Enhancement Core in the Regulatory Support Center of the Institute of Clinical and Translational Sciences, Washington University School of Medicine, is supported by grant UL1TR002345 from the National Center for Advancing Translational Sciences of the National Institutes of Health. Recruitment was also supported by the group AbScent, led by Chrissi Kelly.
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: Drs Kallogjeri and Piccirillo are Statistics Editor and Editor, respectively, of JAMA Otolaryngology–Head & Neck Surgery, but they were not involved in any of the decisions regarding review of the manuscript or its acceptance. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Additional Contributions: Thanks to Amber Perrin, BS, and Zach Jaeger, BS, of the Washington University School of Medicine for their support throughout this project. Ms Perrin received additional compensation.