Douglas RS, Tsirbas A, Gordon M, Lee D, Khadavi N, Garneau HC, Goldberg RA, Cahill K, Dolman PJ, Elner V, Feldon S, Lucarelli M, Uddin J, Kazim M, Smith TJ, Khanna D. Development of Criteria for Evaluating Clinical Response in Thyroid Eye Disease Using a Modified Delphi Technique. Arch Ophthalmol. 2009;127(9):1155-1160. doi:10.1001/archophthalmol.2009.232
Copyright 2009 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
To identify components of a provisional clinical response index for thyroid eye disease using a modified Delphi technique.
The International Thyroid Eye Disease Society conducted a structured, 3-round Delphi exercise establishing consensus for a core set of measures for clinical trials in thyroid eye disease. The steering committee discussed the results in a face-to-face meeting (nominal group technique) and evaluated each criterion with respect to its feasibility, reliability, redundancy, and validity. Redundant measures were consolidated or excluded.
Criteria were parsed into 11 domains for the Delphi surveys. Eighty-four respondents participated in the Delphi 1 survey, providing 220 unique items. Ninety-two members (100% of the respondents from Delphi 1 plus 8 new participants) responded in Delphi 2 and rated the same 220 items. Sixty-four members (76% of participants) rated 153 criteria in Delphi 3 (67 criteria were excluded because of redundancy). Criteria with a mean greater than 6 (1 = least appropriate to 9 = most appropriate) were further evaluated by the nominal group technique and provisional core measures were chosen.
Using a Delphi exercise, we developed provisional core measures for assessing disease activity and severity in clinical trials of therapies for thyroid eye disease. These measures will be iteratively refined for use in multicenter clinical trials.
Graves disease is a multisystem autoimmune disorder targeting the thyroid, orbit, and skin. The orbital disease runs a biphasic course. The initial active phase can be dominated by inflammation of orbital soft tissues and expansion of fat accompanied or followed by progressive fibrosis and its attendant abnormalities of eye motility. Thyroid eye disease (TED), also referred to as thyroid-associated ophthalmopathy and Graves orbitopathy, encompasses the orbital and periorbital manifestations of Graves disease and causes substantial morbidity and reduced quality of life.1-4
The initial manifestations of TED can precede, coincide with, or follow the onset of thyroid dysfunction. Moreover, the severity and duration of TED are unpredictable. The characteristic clinical course in which the disease often waxes and wanes challenges attempts to develop adequate outcome measures required for randomized therapeutic trials.5 Typically, the active inflammatory disease transitions to a chronic stable phase within 1 to 5 years of onset.6,7 Treatment options for active TED have been limited by the absence of an evidence-based clinical response index (CRI).
Several immunomodulatory agents have been introduced over the past decade that target key putative components of autoimmune inflammation.8,9 These have found increasing utility in the therapy of diseases associated with tissue remodeling that at least superficially resemble TED. They include antibody-based agents that either abrogate cytokine function or deplete specific subsets of lymphocytes.10-15 A CRI-TED would aid in defining clinical disease states, including remission and progression. Similar hurdles were overcome in allied diseases, including rheumatoid arthritis and scleroderma, following development of a CRI using the Delphi and nominal group methods.16-18 These tools have allowed precise definition of disease activity, severity, remission, and response to treatment. By iteratively evaluating the feasibility, validity, and sensitivity of each criterion composing the CRI, progressively more sensitive evaluation and discrimination of therapeutic effects have become possible. Well-validated, widely accepted combined-response indexes should prove more sensitive than individual measures. This flexibility should facilitate drug development and improve assessment of therapeutic agents. Scleroderma and TED present with similarly complex and heterogeneous clinical manifestations deriving from the inflammatory and fibrotic nature they share. While still in a developmental stage, a scleroderma-specific CRI has already culminated in therapeutic trials.19
Development of a CRI-TED reflecting practice patterns would complement and extend important earlier work, such as that conducted by the European Group on Graves' Orbitopathy.20-24 In addition, several disease activity score systems have been developed over the years, including NOSPECS, CAS, and VISA.25-28 Many individual parameters included in these scales have been validated and they can now be included as essential elements in a CRI-TED. In particular, the CAS system has been validated in patients receiving anti-inflammatory therapy. However, “activity scores” are confusing and a single version may be difficult to implement. In general, they are limited in scope, are not widely used, and have not benefited from iterative refinement. They have failed to predict therapeutic efficacy. Regardless of the relative merits of each system, without universal implementation, no system for patient evaluation can be optimally developed and refined. Our ultimate goal was to overcome these limitations by developing a CRI-TED that reflects with great fidelity the qualitative and quantitative features of TED pathophysiology. Since the assessment parameters destined to be included were identified during a process where participating clinicians were solicited, broad acceptance became more likely. Ideally, CRIs achieve the necessary sensitivity for grading disease without becoming cumbersome. Refinement and validation of the CRI-TED resulted from an iterative and data-driven process where parameters that were redundant or offered no incremental value were removed. As additional therapies targeting inflammation and fibrosis become available, an adequately validated and inclusive CRI-TED will become indispensable in their evaluation. Herein, we report the first steps in developing a CRI-TED using a modified Delphi technique. This technique allowed proposal and assessment of criteria relevant in TED.
These criteria spanned subjective features (symptoms), objective measures (divided anatomically into skin, conjunctiva, cornea, and orbit), functional measures (vision and motility), imaging measures, quality of life, global health, biomarkers, previous grading scales, genetics, and other items not further specified.
A provisional set of criteria was developed using both Delphi and nominal group techniques. The Delphi technique systematically solicits and collates judgments on a particular topic through a series of sequential questionnaires. Between questionnaires, feedback is summarized and items are modified on the basis of written responses obtained from individuals rather than from assembled groups of participants. This technique facilitates consensus building among experts on topics where exact definitions may be unavailable.29,30 The essential elements include (1) anonymous response from participating individuals, (2) interaction between participants following each round of input from questionnaire responses and controlled feedback to participants, and (3) statistical group response. In contrast, the nominal group technique is a structured face-to-face meeting led by a facilitator. After discussion, a voting process determines the fate of the proposed items by grading their value. Our application of these techniques to the development of the CRI-TED follows success in instrument building for other autoimmune diseases, including adult and juvenile rheumatoid arthritis, gout, and scleroderma.29-35
The International Thyroid Eye Disease Society (ITEDS) steering committee consists of ophthalmologists with specialized expertise in TED. Prior to the Delphi exercise, this committee proposed “domains” relevant in TED that included subjective features (symptoms), objective measures (divided anatomically into skin, conjunctiva, cornea, and orbit), functional measures (vision and motility), imaging measures, quality of life, global health, biomarkers, previous grading scales, and genetics and others (items not further specified).
Endocrinologists and ophthalmologists were invited to participate in the development of a CRI using e-mail questionnaires addressed to members of the Endocrine Society, American Society of Ophthalmic Plastic and Reconstructive Surgery (ASOPRS), European Society of Ophthalmic Plastic and Reconstructive Surgery, Orbital Society, North American Neuro-Ophthalmology Society, and European Group on Graves' Orbitopathy as obtained from published member lists. Respondents were invited to participate as members of ITEDS, which is an independent, international, nonprofit organization dedicated to advancing clinical trial methods and developing treatments for TED (www.iteds.net). The organization has no financial conflicts of interest or affiliation with other organizations. However, unrestricted funding was provided by the ASOPRS Foundation and North American Neuro-Ophthalmology Society.
The Delphi exercise was initiated via e-mail in December 2006 by asking a cohort of clinicians who focus on TED to identify parameters that they believed could be used to develop a CRI-TED applicable to a 1-year multicenter clinical trial. This was followed by 2 reminder e-mails sent at 3-week intervals.
The 84 respondents to the first request were subsequently sent a second questionnaire (Delphi 2) and asked to rate the importance of each item for a “CRI-TED in a hypothetical 1-year, prospective, longitudinal clinical trial.” Eight additional experts chose to participate at this stage, for a total of 92 investigators in Delphi 2. Each respondent rated the criteria on a scale of 1 (extremely inappropriate for a combined measure) to 9 (extremely appropriate for a combined measure). Each investigator was reminded twice by e-mail to complete the questionnaire. Descriptive statistics were calculated and a report containing the final questionnaire was sent to participants (Delphi 3). This report provided feedback to the respondents, reminding them of their previous ratings in Delphi 2 for each criterion compared with a group mean (standard deviation). The questionnaire requested that each participant again rate the criteria after they considered the mean group response. Each investigator was again reminded twice by e-mail to complete the questionnaire.
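The feedback step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the criterion names, respondent identifiers, ratings, and the `feedback_report` helper are hypothetical and are not taken from the study materials.

```python
from statistics import mean, stdev

# Hypothetical Delphi 2 ratings: criterion -> {respondent id: rating on the 1-9 scale}.
ratings = {
    "criterion A": {"r1": 8, "r2": 7, "r3": 9},
    "criterion B": {"r1": 3, "r2": 5, "r3": 4},
}

def feedback_report(ratings, respondent_id):
    """Return one line per criterion: the respondent's own rating and the group mean (SD)."""
    lines = []
    for criterion, by_respondent in ratings.items():
        scores = list(by_respondent.values())
        group_mean = mean(scores)
        group_sd = stdev(scores)  # sample standard deviation; needs at least 2 ratings
        own = by_respondent.get(respondent_id)  # None if this respondent skipped the item
        lines.append(
            f"{criterion}: your rating {own}; "
            f"group mean {group_mean:.1f} (SD {group_sd:.1f})"
        )
    return lines

for line in feedback_report(ratings, "r1"):
    print(line)
```

Each respondent would receive such a summary before rating the criteria again in the next round.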
Statistical analyses were conducted by calculating mode, mean, median, standard deviation, and 25th/75th percentile response values. The median of the sample constitutes the most “typical” value in the range of data, where no more than 50% of the data fall above or below. Following the RAND/University of California, Los Angeles, appropriateness method, median scores falling in the 1 to 3 range during Delphi 3 were excluded from further consideration. Those in the 4 to 6 range were considered potentially appropriate, and scores between 7 and 9 were included in the CRI-TED. Exceptions included median scores lower than 5 but with multimodal distribution, where a small group of respondents felt the criterion was of high importance. Additional determinations were undertaken by members of the steering committee, who attempted to err on the side of overinclusion. This practice pertained to criteria with low median, multimodal values and those that were viewed as surprising or inconsistent with the committee's collective expectations.
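As a minimal sketch of the descriptive statistics and median-based bands described above, the following Python uses only the standard library. The function names and example ratings are hypothetical. The handling of half-point medians (for example 3.5 or 6.5, placed here in the middle band) and the choice of percentile method are assumptions, since the text does not specify them. The multimodal exception and the committee's discretionary inclusions are not modeled.

```python
from statistics import mean, median, multimode, quantiles, stdev

def describe(scores):
    """Descriptive statistics named in the text for one criterion's 1-9 ratings."""
    # Quartile cut points; the "inclusive" method is an assumption.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return {
        "modes": multimode(scores),  # all values tied for most frequent
        "mean": mean(scores),
        "median": median(scores),
        "sd": stdev(scores),
        "p25": q1,
        "p75": q3,
    }

def classify(median_score):
    """Map a Delphi 3 median onto the three bands described in the text."""
    if median_score <= 3:
        return "excluded"
    if median_score >= 7:
        return "included"
    # Assumption: medians strictly between 3 and 7 fall in the middle band.
    return "potentially appropriate"

# Hypothetical ratings for a single criterion.
scores = [7, 8, 8, 9, 6, 7, 8]
summary = describe(scores)
print(summary["median"], classify(summary["median"]))
```

Classifying on the median rather than the mean keeps a few extreme ratings from moving a criterion across a band boundary.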
The ITEDS steering committee members met in April 2008 to consider the feasibility, reliability, redundancy, validity, and sensitivity to change of each proposed parameter for use in a 1-year clinical trial. The committee focused on responses from Delphi 3 (153 items) with a median score of 6 or greater. After discussion, the committee voted on each. Consensus among 80% of those attending was required for acceptance.
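The acceptance rule can be expressed as a simple check. The sketch below assumes that "consensus among 80%" means at least 80% of attending members voting in favor; the function name and the vote counts in the usage lines are hypothetical, since the number of members attending is not reported.

```python
def accepted(votes_for, attending, threshold_percent=80):
    """True when the share of attending members voting in favor reaches the threshold."""
    if attending <= 0:
        raise ValueError("attending must be positive")
    if not 0 <= votes_for <= attending:
        raise ValueError("votes_for must be between 0 and attending")
    # Integer comparison avoids floating-point rounding at the exact boundary.
    return votes_for * 100 >= threshold_percent * attending

# Hypothetical meeting with 8 members attending.
print(accepted(7, 8))  # 87.5% in favor
print(accepted(6, 8))  # 75% in favor
```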
Eighty-four respondents participated in Delphi 1 and collectively provided 220 unique parameters for 11 domains (available on request). Ninety-two participants (100% response rate) responded in Delphi 2 (including 8 additional clinicians), rating the 220 parameters on a scale from 1 (least appropriate) to 9 (most appropriate). The criteria were reviewed by the steering committee, which eliminated 67 redundant items. One hundred fifty-three criteria grouped in the 11 domains were selected for presentation in Delphi 3. Sixty-four investigators (76%) rated the 153 parameters included in Delphi 3. Criteria with a mean greater than 6 (136 items) were further evaluated by the steering committee for use in a 1-year multicenter trial (Delphi 3 parameters available on request).
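The participation and item counts reported above can be checked with simple arithmetic. In the sketch below the variable names are illustrative, and the choice of denominator for the Delphi 3 response rate is an inference: the reported 76% matches 64 of the 84 original Delphi 1 respondents rather than 64 of the 92 Delphi 2 participants.

```python
# Counts as reported in the text.
delphi1_respondents = 84
new_in_delphi2 = 8
items_generated = 220
redundant_items = 67
delphi3_raters = 64

# Derived quantities.
delphi2_respondents = delphi1_respondents + new_in_delphi2
delphi3_items = items_generated - redundant_items
# Assumption: the rate is taken relative to the Delphi 1 respondents.
delphi3_rate = round(100 * delphi3_raters / delphi1_respondents)

print(delphi2_respondents, delphi3_items, delphi3_rate)
```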
The committee evaluated each proposed parameter. The final provisional core measures are shown in the Table. Those with high content validity, as provided in the Delphi exercise (ie, mean ≥6), but minimal evidence of validity in the literature were included in research criteria. Additional criteria, including a visual analog scale assessing disease severity and interval change from patients' and physicians' perspectives, were also included. These were considered important, both by the committee and the respondents of Delphi 3, although currently they are not data driven with respect to reliability and validity.
The ability to evaluate therapeutic intervention in several autoimmune diseases has been enhanced by the development of CRIs using the Delphi technique. Thus, we have begun to develop a CRI-TED using this same strategy. For the first time, to our knowledge, the combined clinical judgment of a relatively large cohort of clinicians with expertise in TED was incorporated. This, we believe, has resulted in a set of core parameters that will allow meaningful assessment of therapeutic efficacy in TED. We have solicited and analyzed the judgment of interested clinicians. However, our results should be considered preliminary. They must now be validated in a 1-year observational trial to assess their feasibility, reliability, and validity. Unlike previous scales of TED activity, the CRI-TED will undergo iterative refinement. This instrument should allow meaningful assessment of potential therapies by facilitating the standardization, conduct, reporting, and interpretation of clinical trials.
Many criteria are considered “standard of care,” including examination for inflammation, visual acuity, and thyrotropin levels. Others, such as the visual analog scale, have been used to assess disease activity and have been found feasible, reliable, valid, and responsive to change in recent multicenter clinical trials. Domains and parameters that lacked feasibility or could not be easily measured in a multicenter study were avoided. For instance, ultrasonographic assessment of extraocular muscles was avoided because wide interreader variability makes its standardization difficult.
Our primary goal was to develop an instrument that can be used in prospective, longitudinal, observational clinical trials. Based on the natural history of TED, a 1-year trial seems appropriate. At its conclusion, further assessment of the CRI-TED will be undertaken to assure that it reflects disease progression. A panel of experts should define domains demonstrating 30% or more improvement or worsening of TED. This process will involve a paper-based case study.
Our study has both strengths and limitations. First, we successfully conducted a Delphi exercise using the broad knowledge base of endocrinologists, ophthalmologists, and orbital surgeons. Each of the 3 exercises generated high response rates. The principal limitation was an initial reliance on processes associated with consensus building. However, iterative refinement of the CRI-TED parameters will be data driven. Acknowledging this limitation, the current study, for the first time to our knowledge, obtains combined clinical input from investigators on several continents. Development of core parameters provides the first step toward a robust and easily administered CRI-TED. The resulting instrument should greatly enhance our ability to conduct meaningful therapeutic trials.
Correspondence: Raymond S. Douglas, MD, PhD, Department of Ophthalmology, Jules Stein Eye Institute, Los Angeles, CA 90095 (firstname.lastname@example.org) or Dinesh Khanna, MD, MS, Department of Medicine, David Geffen School of Medicine, Los Angeles, CA 90095 (email@example.com).
Submitted for Publication: December 23, 2008; final revision received February 4, 2009; accepted February 10, 2009.
Financial Disclosure: None reported.
Funding/Support: This study was supported in part by the American Society of Ophthalmic Plastic and Reconstructive Surgery; North American Neuro-Ophthalmology Society; National Institutes of Health grants EY008976, EY011708, DK063121, EY016339, and RR00425; an unrestricted grant from Research to Prevent Blindness; a Research to Prevent Blindness Career Development Award; and the Bell Charitable Foundation.
Members Who Completed Questionnaires
Malena Amato, Kaiser Permanente Medical Group; James Antoszyk, Charlotte Eye, Ear, Nose, and Throat Associates; Saj Ataullah, Anne Cook, Manchester Royal Eye Hospital; Rebecca Bahn, Mayo Clinic; Giuseppe Barbesino, Massachusetts General Hospital; Evan Black, Wayne State University School of Medicine; Bert Bowden, Eye for God; Wade Brock, Arkansas Oculoplastic Surgery; Kenneth Cahill, Ohio State University; Carolyn Cates, West Suffolk Hospital; Steven Chen, Glendale, Arizona; David Cheung, Codere Francois McGill University; Marc Criden, University of Texas; Philip Custer, Washington University in Saint Louis; Roger Dailey, John Ng, Oregon Health and Science University; Jane Dale, The Dudley Group of Hospitals; Monte Del Monte, Victor Elner, Bartley Frueh, Alon Kahana, Christine Nelson, University of Michigan; Jean-Louis DeSousa, Adam Gajdatsy, Lions Eye Institute; Peter Dolman, University of British Columbia; Guido Dorner, Medizinische Universität Wien; Raymond Douglas, Andrew Gianoukakis, Robert Goldberg, Lynn Gordon, Cathy Hwang, Howard Krauss, Ronald Mancini, Mehryar Taban, Angelo Tsirbas, Jules Stein Eye Institute, University of California, Los Angeles; Jean Paul Dray, Tel Aviv Medical Center; Vikram Durairaj, University of Colorado, Denver, School of Medicine; Jeff Edelstein, Arizona Orthopedic Surgical Hospital; Robert Fante, Michael Hawes, University of Colorado Health Sciences Center; Aaron Fay, Harvard University; Ken Feldman, Southern California Permanente Medical Group and University of California, Irvine; Steven Feldon, University of Rochester; James Fleming, The Hamilton Eye Institute; Tamara Fountain, Ophthalmology Partners Ltd; Suzanne Freitag, Boston University Eye Association; Tim Fulcher, Mater Misericordiae Hospital; Gregg Gayre, Kaiser Permanente, San Rafael, California; Scott Goldstein, Wills Eye Hospital Oculoplastic Service; Russell Gonnering, Medical College of Wisconsin; C. Miguel Gonzales, IMO Barcelona; Mark Gordon, Loyola Marymount University; Andrew Harrison, University of Minnesota; Laszlo Hegedüs, Odense University Hospital; Peter Hildebrand, University of Oklahoma Health Sciences Center; David Holck, Wilford Hall Medical Center; Don Hollsten, University of Texas Health Sciences Center; David Hughes; David Jordan, University of Ottawa Eye Institute; Randy Kardon, Jeffrey Nerad, University of Iowa; Safak Sisli Karslioglu, Etfal Hospital; James Katowitz, Children's Hospital of Philadelphia; Michael Kazim, Hindola Konrad, Columbia University; Yoon-Duck Kim, Sungkyunkwan University; Stephen Klapper, Klapper Eyelid and Facial Plastic Surgery; John Koh, Mindlin Koh Center, Ophthalmic Surgery; Kenneth Krantz, Santa Ana, California; Vladimir Kratky, Queens University; Carol Lane, Cardiff and Vale NHS Trust; Simeon Lauer, New York Eye and Ear Infirmary; John Linberg, Health Sciences Center of West Virginia University; Richard Lisman, New York University; Gary Lissner, Northwestern's Feinberg School for Medicine; Mark Lucarelli, University of Wisconsin–Madison; David Lyon, Eye Foundation of Kansas City; Hunter Maclean, Portsmouth Hospitals; Polyzois Makras, Leiden University Medical; Ruth Manners, Southampton University Hospitals; Geva Mannor, Scripps Clinic Rancho Bernardo; John McCann, Center for Facial Appearances; Polly McKinstry, Orange County, California; Michael Migliori, Warren Alpert Medical School at Brown University; Ilse Mombaerts, University Hospitals Leuven; Maarten Mourits, Academic Medical Center of the University of Amsterdam; James Oestreicher, University of Toronto; Jay Older, Older and Slonim Eye Lid Institute; James Orcutt, University of Washington; Naser Owji, Shiraz University of Medical Sciences; Ben Parkin, The Royal Bournemouth and Christchurch Hospitals; Ron Pelton, Memorial Hospital System; Marleen Pigeaud, Vanderbilt University Medical Center; Allen Putterman, University of Illinois College of Medicine; Raymond Radford, Lancashire Teaching Hospitals; Richard Redmond, Scarborough and North East Yorkshire Healthcare; Geoff Rose, Moorfields Eye Hospital; Stuart Seiff, University of California, San Francisco; Marc Shields, University of Virginia Health OPH; Rona Silkiss, California Pacific Medical Center; Terry Smith, University of California, Los Angeles; Stephen Soll, Camden Ophthalmology; Marius Stan, Mayo Clinic; Tim Sullivan, Queensland Hospital; Vladimir Thaller, Royal Eye Infirmary; Jimmy Uddin, Moorfields Eye Hospital; Nicholas Volpe, Scheie Eye Institute, University of Pennsylvania; Ted Wojno, Emory University; Kyung In Woo, Samsung Medical Center; Patrick R. Yeatts, Wake Forest University School of Medicine; Michael Yen, Baylor College of Medicine
Steering Committee Members
Raymond S. Douglas, MD, PhD, associate professor of Ophthalmology, Division of Orbital and Ophthalmic Plastic Surgery, Jules Stein Eye Institute/University of California, Los Angeles, director, Orbital and Ophthalmic Plastic and Reconstructive Surgery, Greater Los Angeles Veterans Administration Hospital; Michael Kazim, MD, clinical professor of Ophthalmology and Surgery, Columbia University; Kenneth Cahill, MD, clinical professor of Ophthalmology, codirector of the Oculoplastic Surgery Service, Ohio State University; Mark Lucarelli, MD, University of Wisconsin–Madison; Steve Feldon, MD, University of Rochester School of Medicine and Dentistry, University of Rochester Eye Institute; Victor Elner, MD, PhD, University of Michigan Health Systems; Peter J. Dolman, MD, FRCSC, clinical professor, University of British Columbia; Jimmy Uddin, MA, FRCOphth, consultant ophthalmic surgeon, Moorfields Eye Hospital.