Nordyke RA, Reppun TS, Madanay LD, Woods JC, Goldstein AP, Miyamoto LA. Alternative Sequences of Thyrotropin and Free Thyroxine Assays for Routine Thyroid Function Testing: Quality and Cost. Arch Intern Med. 1998;158(3):266–272. doi:10.1001/archinte.158.3.266
Current guidelines and practices for thyroid function testing are strongly affected by the usually higher patient billing charges and Medicare reimbursement for thyrotropin (TSH) vs free thyroxine (FT4) tests, despite their comparable direct costs.
To reexamine, in light of recently reduced laboratory costs, the effectiveness and cost of alternative test sequences.
Alternative test sequences involve using the TSH test first, followed, if the TSH test result is abnormal, by the FT4 test; the FT4 test first, followed by the TSH test; and doing both tests together. We applied these strategies to consecutive patients referred for any thyroid function test to a health maintenance organization, a multispecialty fee-for-service group, a military hospital, and a commercial laboratory. Effectiveness was determined from a literature review. The cost was determined from direct costs and the distribution of diagnostic categories.
The TSH and FT4 tests have similar sensitivities for detecting clinical hyperthyroidism and hypothyroidism. The TSH test detects subclinical function, and it monitors thyroxine treatment better; the FT4 test detects central hypothyroidism, and it monitors rapidly changing function better. Direct costs for both were equal, but charges for the TSH test were higher. The average direct cost per patient, starting with the FT4 test, was $4.61; starting with the TSH test, $5.09; and starting with both tests together, $6.50. Medicare reimbursements correlated poorly with costs.
Starting with the TSH test and reflexing to the FT4 test provides a better first-line all-purpose sequence than the reverse. In managed care settings, the slightly higher direct cost of this approach is offset by greater clinical effectiveness. In fee-for-service settings, cost differences can be nearly eliminated by equalizing TSH and FT4 charges to reflect current direct-cost realities. Obtaining both tests together overcomes the disadvantages of each at a slightly higher direct cost.
PROFESSIONAL societies have issued different guidelines for thyroid function testing,1-6 and complex variations are advised depending on the clinical settings and the characteristics of individual patients.2-4,7,8 Much of the controversy involves the trade-off between quality and cost,3,8-11 with cost information derived from older technology. Recent laboratory improvements have reduced and equalized the direct costs of these tests, suggesting the need to reconsider their use in routine clinical practice.
Our search was for an optimal testing strategy applicable to nearly all patients in both managed care and fee-for-service settings. Determining effectiveness from reviewing the literature and obtaining current direct costs from 4 laboratories that provide testing for 85% of Hawaii's population, we compared alternative strategies: starting with the thyrotropin (TSH) test, followed, if the TSH level was abnormal, by the free thyroxine (FT4) immunoassay; starting with the FT4 test, followed, if that result was abnormal, by the TSH test; and doing both tests together.
Four laboratories cooperated in this study: a multispecialty fee-for-service group practice (Straub Clinic and Hospital, Inc [Straub]), a health maintenance organization (Kaiser-Permanente Medical Care Program [Kaiser]), a military hospital (Tripler Army Medical Center [Tripler]), and a large private laboratory (Diagnostic Laboratory Services, Inc [DLS])—all of them located in Honolulu, Hawaii. Each served both outpatient and hospitalized patients for separate physician clientele. Patients included those being screened for thyroid disease and those suspected of or having thyroid disease. These conditions were not separated out for the purposes of this study. Each laboratory performed TSH and FT4 tests on 500 consecutive patients presenting with any laboratory thyroid test request ordered by the patient's physician within the same 3-month period in 1993. The population studied, therefore, represents all those for whom thyroid tests were ordered.
Kaiser used the semiautomated Baxter (now Dade International, Deerfield, Ill) Stratus II fluorometric assay for all tests.12 Tripler and DLS used the Boehringer Mannheim Corporation (Indianapolis, Ind) ES 300 enzyme-linked immunosorbent assay for all tests.13 Straub used the Becton Dickinson Immunodiagnostics (now ICN Pharmaceuticals, Diagnostic Division, Orangeburg, NY) Simultrac solid-phase system for simultaneous TSH and FT4 tests,14,15 Nuclear Medical Laboratories–Organon Teknika Corporation (Durham, NC) radioimmunoassay kits for total thyroxine level and triiodothyronine uptake, and Diagnostic Products Corporation (Los Angeles, Calif) radioimmunoassay kits for total triiodothyronine level. None of the laboratories measured FT4 levels by equilibrium dialysis. Although the immunoassay methods may not be as accurate as the equilibrium dialysis method, they are nevertheless the tests routinely used by all laboratories in the study, which includes 85% of tests done in Hawaii.
For each laboratory, the assays performed well within the manufacturers' precision specifications. Each laboratory participated successfully in the College of American Pathologists' Ligand Assay Series proficiency testing program.
Direct costs of reagents and personnel per reportable test result were calculated for each of the 4 laboratories. Reagent costs were based on the manufacturer's list price uncorrected for volume discounts.
Normal ranges are listed in Table 1 for TSH and FT4 levels, as reported by each laboratory and recommended by the reagent manufacturers. These vary, as would be expected, with different assays and patient populations.
Using an average normal range from Table 1 for TSH (0.4-4.8 mIU/L) and FT4 levels (10.6-25.2 pmol/L), we categorized the test results of 500 consecutive patients at each laboratory by test combinations into normal, hyperthyroid, hypothyroid, subclinical hyperthyroid, subclinical hypothyroid, and "other" (Table 2). Table 3 displays the average distribution of results for all 2000 patients and identifies the contents of the "other" category.
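The categorization by test combination might look like the sketch below. The exact rules are those of Table 2, which is not reproduced here, so the mapping is an assumption based on the conventional TSH and FT4 patterns; in particular, the discussion suggests that low but detectable TSH values were counted as "other", which this simplified version does not model. The function names are hypothetical.

```python
# Average normal ranges from the text (assumed cutoffs for this sketch).
TSH_LOW, TSH_HIGH = 0.4, 4.8
FT4_LOW, FT4_HIGH = 10.6, 25.2


def level(value, low, high):
    """Label a result as low, normal, or high against an inclusive range."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def classify(tsh, ft4):
    """Map a (TSH, FT4) pair to a diagnostic category (assumed rules)."""
    patterns = {
        ("normal", "normal"): "normal",
        ("low", "high"): "hyperthyroid",
        ("high", "low"): "hypothyroid",
        ("low", "normal"): "subclinical hyperthyroid",
        ("high", "normal"): "subclinical hypothyroid",
    }
    key = (level(tsh, TSH_LOW, TSH_HIGH), level(ft4, FT4_LOW, FT4_HIGH))
    # Any combination not listed above falls into the catch-all category.
    return patterns.get(key, "other")
```

A lookup table keeps the six categories visible in one place, and any discordant pair, such as a normal TSH level with a high FT4 level, falls through to "other".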
Direct costs (reagents and personnel) per reportable test for TSH and FT4 at Tripler, Straub, and DLS were similar (Table 4), ie, between $3.06 and $3.61. Costs at Kaiser were higher ($6.81), consistent with higher batch frequency (3/d) and smaller batch sizes. The average costs of the TSH and FT4 tests were identical ($4.15). The average cost of the FT4 test was lower than that of the FT4 index ($4.15 vs $6.92) because the FT4 index required both the total thyroxine level and the triiodothyronine uptake. Reagent costs per test in the United States by companies that provided more than 5% of all reagents averaged $1.10 for the FT4 (range, $0.35-$2.29) and $1.92 for the TSH (range, $0.45-$2.79).16 These national reagent costs are less than ours, but are not directly comparable because we counted each reportable result as an individual test, and IMS America, Ltd (a health care and marketing research company in Plymouth Meeting, Pa), counted each step to achieve a reportable result as a test, whether standard, duplicate, repeat, or quality control.
Applying the FT4-first protocol to all 2000 patients resulted in an average of 88.8% of patients being classified as euthyroid; 3.2%, hyperthyroid; 3.2%, hypothyroid; and 4.8%, other (Table 3). The subclinical categories (11.1%) were subsumed within the normal group. Direct laboratory costs include FT4 tests on all patients and TSH assays on an additional 11.2%, or a total of $4.61 per patient (Table 5).
Applying the TSH-first protocol to all 2000 patients resulted in an average of 77.4% being classified as euthyroid; 3.2%, hyperthyroid; 3.2%, hypothyroid; 2.7%, subclinically hyperthyroid; 8.4%, subclinically hypothyroid; and 5.1%, other (Table 3). Direct laboratory costs include TSH tests on all patients and FT4 assays on an additional 22.6%, a total of $5.09 per patient, or $0.48 more than starting with the FT4 test (Table 5).
Obtaining both tests together for all 2000 patients resulted in an average of 73.6% being classified as euthyroid; 3.2%, hyperthyroid; 3.2%, hypothyroid; 2.7%, subclinically hyperthyroid; 8.4%, subclinically hypothyroid; and 8.9%, other (Table 3). The direct cost by simple addition ($4.15 + $4.15) is $8.30. Obtaining the 2 tests together, however, has a lower direct cost: $6.18 by the Becton Dickinson Simultrac method at Straub (not including a small additional cost for radioactive waste disposal), $6.12 at DLS, $7.22 at Tripler, and $13.62 at Kaiser. When the Kaiser costs are excluded because of its special problems with multiple daily runs, the average direct cost of doing both tests concurrently at Straub, Tripler, and DLS was $6.50 (Table 5).
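The per-patient figures for the three strategies can be reproduced with a few lines of arithmetic. The sketch below assumes that the expected cost of a sequential strategy is the cost of the first test plus the cost of the second test weighted by the fraction of patients who need it; the function and variable names are hypothetical, and all dollar amounts and fractions are taken from the text.

```python
COST_PER_TEST = 4.15  # average direct cost of either the TSH or the FT4 test


def sequential_cost(first_cost, second_cost, reflex_fraction):
    """Expected direct cost per patient when only a fraction get the second test."""
    return first_cost + reflex_fraction * second_cost


# FT4 on everyone, TSH on the additional 11.2% with an abnormal FT4 result.
ft4_first = sequential_cost(COST_PER_TEST, COST_PER_TEST, 0.112)

# TSH on everyone, FT4 on the additional 22.6% with an abnormal TSH result.
tsh_first = sequential_cost(COST_PER_TEST, COST_PER_TEST, 0.226)

# Concurrent testing: mean of the three laboratories, with Kaiser excluded.
concurrent_costs = {"Straub": 6.18, "DLS": 6.12, "Tripler": 7.22}
both_together = sum(concurrent_costs.values()) / len(concurrent_costs)

print(f"FT4 first: ${ft4_first:.2f}")
print(f"TSH first: ${tsh_first:.2f}")
print(f"Both together: ${both_together:.2f}")
```

The two sequential results round to the reported $4.61 and $5.09. The concurrent mean computed from the rounded per-laboratory costs lands within a cent or two of the reported $6.50; the underlying unrounded laboratory figures are not given in the text.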
Test ordering by the independent physician clientele of DLS (Table 6) involved potentially duplicative tests. Combinations of the FT4 test and the FT4 index were ordered together on 9.9% of patient visits in December 1993 and 8.4% in December 1995. Ordering patterns for December 1993, 1994, and 1995 indicate a shift to more effective combinations, with a substantial increase in orders for the TSH test and decreases for the thyroxine assay and the FT4 index. Improvement was accomplished by actively educating physicians; setting patient billing charges for the TSH test equal to those for the FT4 test; routinely starting with the TSH test and then doing the FT4 test for the largest private hospital in the study, which was the major client of DLS; restricting orders for the FT4 test and the FT4 index on the same specimen; and providing pathologist consultation if the TSH and FT4 tests were not the primary tests ordered.
From the viewpoint of individual physicians and patients, equalizing the charges nearly eliminates the influence of cost as a basis of choice between tests. From the socioeconomic perspective,17 starting with the TSH test may still cost more because the detection of subclinical function categories by the TSH test leads to more secondary testing.
Substantial discrepancies were found between Medicare reimbursements and direct costs of laboratory tests (Table 5). Reimbursements for individual tests in December 1993 ranged from $9.95 for the total thyroxine test to $24.12 for the TSH test (Table 6), a difference of 142%, whereas the direct costs ranged from $3.52 to $4.15 (Table 4), a difference of 18%. Reimbursement for the TSH test was $24.12 and for the FT4 test, $13.06 (Table 6), a difference of 85%, whereas there was no difference in direct costs. Medicare reimbursed laboratories for each test when charges for the FT4 and the FT4 index were submitted together.
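The percentage differences quoted above follow from a simple relative-difference calculation. The sketch below assumes each percentage is expressed relative to the lower of the two values, which is the reading that reproduces the reported numbers; the function name is hypothetical.

```python
def percent_difference(low, high):
    """Difference between two amounts as a percentage of the lower one."""
    return (high - low) / low * 100


# Medicare reimbursement: total thyroxine ($9.95) vs TSH ($24.12).
reimbursement_spread = percent_difference(9.95, 24.12)

# Direct cost: lowest ($3.52) vs highest ($4.15) individual test.
direct_cost_spread = percent_difference(3.52, 4.15)

# Medicare reimbursement: FT4 ($13.06) vs TSH ($24.12).
tsh_vs_ft4_reimbursement = percent_difference(13.06, 24.12)

print(round(reimbursement_spread), round(direct_cost_spread),
      round(tsh_vs_ft4_reimbursement))
```

Rounded to whole percentages, the three results match the 142%, 18%, and 85% stated in the text, which makes the contrast between the reimbursement spread and the direct-cost spread easy to check.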
Clinicians order thyroid function tests to detect functional abnormalities in patients with possible, suspected, or known thyroid disease. The ideal test or combination of tests is sensitive and specific for hyperthyroidism and hypothyroidism, independent of thyroid-binding protein changes, not falsely altered by thyroxine treatment, and relatively inexpensive. The tests most often used are the FT4 index, FT4, and TSH. Arguments are presented for applying dual testing1,18 or for sequencing them by starting with 1 test and following it, if the results are abnormal, with a second test.19-28 Guidelines often suggest variations, depending on the reasons for testing, such as screening, case finding, or monitoring in the contexts of stable or rapidly changing function. It is generally conceded that the TSH assay is the best test overall, but much of the controversy involves the trade-off between quality and cost3,8-11 because charges for the TSH test are routinely higher than for the FT4.
Whether used as the primary or secondary test, either the FT4 test or the FT4 index is often needed to confirm diagnoses, classify subclinical functional states, monitor thyroxine treatment with the objective of suppressing TSH, observe the course of rapidly changing thyroid function, and detect and observe instances of pituitary or hypothalamic dysfunction. To choose between them, we considered diagnostic accuracy and cost. The sensitivities of the FT4 assay and the FT4 index for detecting clinical hyperthyroidism and hypothyroidism are similar.18 In our study, the FT4 index had a higher direct cost per reportable result ($6.92 vs $4.15). In the United States in 1993, the FT4 index had a higher reagent cost per test ($2.05 vs $1.10) and was used 6.1 times more often.16 If the FT4 test had been substituted for the FT4 index that year, annual direct cost savings for reagents alone would have been about $22 million. For these reasons, the FT4 test became our choice to complement the TSH test, as it has been in many other studies.1,2,18,21-23,25,26
Several studies of cost-effectiveness have concluded that the thyroxine-based tests are preferable to the TSH test as the first test in a sequence.3,4,7,10,19 It is argued that subclinical hyperthyroidism and hypothyroidism, detected only by the TSH test, are not harmful and that tests that do not detect them are sufficient to identify patients who will benefit from treatment.29,30 Triiodothyronine toxicosis will be missed, but many disorders are identifiable clinically. The rare but important cases of central hypothyroidism are detectable by the FT4 test but not by the TSH test.31 The sometimes confusing miscellaneous categories that occur when starting with the TSH test32 are reduced. The diagnostic accuracy of the FT4 assay is better than the TSH when the thyroid function status is rapidly changing, especially after the recent treatment of hyperthyroidism.33,34 The strongest argument for choosing the FT4 as the routine first test is its lower cost because charges for the FT4 are consistently lower than for the TSH.
The introduction of second-generation ("sensitive") TSH tests brought a reconsideration of testing strategies.25,27,28,35 They accurately identify both hyperthyroidism and hypothyroidism, are not falsely affected by protein-binding variations or thyroxine treatment, and are important, if not critical, for monitoring replacement and suppressive therapy with thyroxine.36,37 Excluding special situations as noted earlier, a TSH test result within normal limits best defines euthyroidism.6,18,35
The TSH is the only test capable of detecting subclinical hypothyroidism and hyperthyroidism. If these conditions are important and common, there is further reason to start with this test. Data are accumulating that sustained subclinical hypothyroidism results in subtle signs and symptoms,38-40 cardiac abnormalities,38,40 and disturbed lipid metabolism,41-43 all relieved by treatment. Together with positive results on antithyroid antibody tests, it predicts the delayed onset of overt hypothyroidism.44,45 Treatment may22 or may not46 increase the health-related quality of life. Sustained subclinical hyperthyroidism (determined by undetectable second-generation TSH test results and a normal FT4 level) has been reported to result in correctable clinical signs and symptoms,47 abnormalities of the heart,47-49 and an increased risk of osteoporosis.50-54 Although the consequences of these subclinical states are under investigation, it seems reasonable to avoid these conditions whenever possible. Truly subclinical hyperthyroidism can be most accurately determined by third- and fourth-generation TSH tests with a normal FT4 level by equilibrium dialysis. These more accurate tests are not widely available, however, and are likely to be more costly.
The prevalence of subclinical categories in our study of 2000 consecutive patients who had any thyroid test ordered averaged 11.1% of all patients. Others have obtained similar or higher results, depending on the practice settings.
The 5.1% "other" category we found (Table 3) may be confusing. Most of these had a low but detectable TSH level (between 0.1 and 0.5 mIU/L). Many such results occur in patients being treated with thyroxine; in these, the TSH information alone is usually sufficient. In patients who have no clinical indications of thyroid disease, low but detectable levels rarely indicate or predict the presence of hyperthyroidism.55,56 Without clinical evidence of thyroid dysfunction, these groups usually need no further testing.
The argument most effectively raised against routinely starting with the TSH test is that current charges for it are higher than for the FT4.7,11,22 Using a variety of methods in our 4 laboratories, we found only minor direct cost differences, and the national survey16 confirms the small differences in reagent costs.
A single first test with high sensitivity and specificity is appropriate for most purposes. Although our choice is the TSH test, obtaining TSH and FT4 test results together circumvents the limitations of either test used separately. The arguments for combining the tests are strongest for patients at risk for thyroid dysfunction before treatment is started; for those with changing thyroid function, as frequently occurs in the months following the treatment of hyperthyroidism; for those with nonthyroidal illnesses (euthyroid sick syndrome); for those taking drugs known to affect the test results; and for those with possible or known central hypothyroidism.31 Both test results are available immediately without further decisions by the physician or laboratory. They provide reassuring confirmation of euthyroidism and often new information in patients more likely to have thyroid dysfunction.1,18,32 In our study, the direct cost of this approach was $6.50 per patient compared with $5.09 for starting with the TSH test and $4.61 starting with the FT4 assay. We have not analyzed it, but the direct cost may not be substantially greater than a strategy of waiting for results from the first test, producing a subset of abnormal samples, running the second test separately, and collating reports.
New cost information suggests a reconsideration of routine thyroid function testing strategies. Because direct costs between widely used sensitive TSH and FT4 tests do not differ substantially, recommendations for sequencing these tests can be based more clearly on professional judgment. This applies both to managed care settings and to fee-for-service settings willing to equalize charges of the TSH and FT4 tests to reflect equal costs.
For routine testing, we currently favor a protocol that starts with the TSH test, followed by an FT4 assay if the TSH test result is abnormal. This strategy is applicable to nearly all routine screening, diagnostic testing, and treatment monitoring. Major cautions are in patients with rapidly changing thyroid function and those suspected of having or known to have central hypothyroidism. It optimizes diagnostic accuracy, reduces laboratory work, and simplifies ordering by clinicians. The small additional cost of starting with the TSH test compared with starting with the FT4 assay is outweighed by the added value of detecting the subclinical function states and by more effective thyroxine therapy monitoring. Obtaining these tests together, especially in patients at higher risk for thyroid dysfunction, increases efficiency and diagnostic accuracy at a slightly greater cost, but individual requirements should be accommodated.
Accepted for publication June 5, 1997.
This work was supported in part by grants from the Agency for Health Care Policy and Research through the Hawaii MEDTEP Research Center, Pacific Health Research Institute, and the Straub Foundation. The Diagnostics Division of Miles Laboratories provided laboratory materials needed for this study at Tripler Army Medical Center. These funding sources had no role in the collection of data, analysis or interpretation, or right of approval or disapproval of the submitted paper.
The conclusions and opinions expressed herein are those of the authors and do not necessarily reflect the position or policy of the US government, the Department of Defense, the Department of the Army, or the US Army Medical Command.
David B. Johnson, PhD, Hawaii MEDTEP Research Center, Pacific Health Research Institute, provided biostatistical support, and Casimir Kulikowski, PhD, Department of Computer Science, Rutgers University, New Brunswick, NJ, reviewed and commented on the manuscript.
Reprints: Lynn D. Madanay, MD, 846 S Hotel St, Suite 303, Honolulu, HI 96813.