Box plot comparing log B-type natriuretic peptide (ln[BNP]) levels in patients with (+S3) and without (−S3) a third heart sound by observer. The bottom line indicates the 25th percentile; the middle line, the median; and the top line, the 75th percentile. Asterisks represent outliers; error bars, ±1.5 times the interquartile range.
Sensitivity (A) and specificity (B) of the third heart sound for detecting abnormal measurements of cardiac function. BNP indicates B-type natriuretic peptide; LVEDP, left ventricular end-diastolic pressure; and LVEF, left ventricular ejection fraction.
Marcus G, Vessey J, Jordan MV, Huddleston M, McKeown B, Gerber IL, Foster E, Chatterjee K, McCulloch CE, Michaels AD. Relationship Between Accurate Auscultation of a Clinically Useful Third Heart Sound and Level of Experience. Arch Intern Med. 2006;166(6):617–622. doi:10.1001/archinte.166.6.617
Poor performance by physicians-in-training and interobserver variability between physicians have diminished clinicians' confidence in the value of the third heart sound (S3).
To determine whether auscultation of a clinically useful S3 improves with advancing levels of experience, we performed a prospective, blinded, observational study of 100 patients undergoing left-sided heart catheterization. Patients underwent blinded auscultation by 4 physicians (each from 1 of 4 different levels of experience), phonocardiography, measurement of blood B-type natriuretic peptide levels, echocardiography for measurement of left ventricular ejection fraction, and cardiac catheterization for measurement of left ventricular end-diastolic pressure.
Whereas residents' and interns' auscultatory findings demonstrated no significant agreement with phonocardiographic findings, an S3 auscultated by cardiology fellows (κ = 0.37; P<.001) and cardiology attendings (κ = 0.29; P = .003) agreed with phonocardiographic findings. Although the sensitivities of the S3 were low (13%-52%) for identifying patients with abnormal measures of left ventricular function, the specificities were high (85%-95%), with the best test characteristics exhibited by phonocardiography and more experienced physicians. The S3 detected by attendings and fellows was superior in distinguishing an elevated B-type natriuretic peptide level, a depressed left ventricular ejection fraction, or an elevated left ventricular end-diastolic pressure (P = .002-.02 for attendings and .02-.03 for fellows) compared with residents (P = .02-.47) or interns (P = .09-.64).
The S3 auscultated by more experienced physicians demonstrated fair agreement with phonocardiographic findings. Although correlations were superior for phonocardiography, the associations between the S3 and abnormal markers of left ventricular function improved with each level of auscultator experience.
The third heart sound (S3) is a soft, low-frequency vibration that can be auscultated in early diastole. Its origins have been investigated for more than a century.1 Extensive studies in animal models and human studies involving older patients have demonstrated that the S3 reflects elevated ventricular diastolic pressure,2,3 rapid early diastolic filling,4 increased ventricular stiffness,5 volume overload, and an abrupt deceleration of early diastolic filling.6 In addition, the presence of an S3 seems to be clinically meaningful; it portends increased risk in patients undergoing noncardiac surgery,7 and it is associated with adverse cardiovascular outcomes in patients with heart failure8 and acute myocardial infarction.9,10
Identification of this valuable physical finding requires relatively little time, and it is accessible to any physician with a stethoscope. However, studies have demonstrated very poor interobserver agreement among physicians11 and have suggested that trainees may not receive adequate instruction in auscultation.12 There is concern that the teaching and art of physical diagnosis may deteriorate as physicians increasingly rely on more sophisticated technology.13,14 To our knowledge, no previous study has examined agreement in auscultation of the S3 across groups of individuals representing different training levels or used both an objective correlate of the sound itself (eg, phonocardiography) and objective diagnostic testing of cardiac function.
With the hypothesis that auscultation of the S3 improves with level of experience, we sought to compare auscultative abilities of physicians of multiple training levels with the presence of an S3 detected by computerized heart sound analysis. To assess the clinical relevance of this physical examination finding, we also compared the auscultative and phonocardiographic findings with objective independent measures of left ventricular function.
Adults scheduled to undergo nonemergency left-sided heart catheterization for a clinical indication at the University of California, San Francisco, were eligible for enrollment. Exclusion criteria included age younger than 18 years, hypotension (systolic blood pressure <90 mm Hg), vasopressor or inotropic pharmacotherapy, cardiac rhythm other than a sinus or paced atrial rhythm, severe mitral regurgitation or stenosis, constrictive pericarditis, a serum creatinine level of 4.0 mg/dL or greater (≥354 μmol/L), severe pulmonary hypertension (mean pulmonary artery pressure >50 mm Hg), and mechanical ventilation.
Before angiography, left ventricular diastolic pressure was measured. Blood samples were obtained from the arterial sheath for measurement of the B-type natriuretic peptide (BNP) level using a membrane immunofluorescence assay (Biosite Inc, San Diego, Calif). A BNP level greater than 100 pg/mL was prospectively specified as abnormal.15 Within 2 hours before or after cardiac catheterization, the patients underwent auscultation, computerized heart sound analysis, and transthoracic echocardiography. The auscultators were blinded to any other clinical findings at the time of auscultation.
From a review of the clinical record, the patients' primary cardiac diagnosis and significant comorbidities were recorded, including coronary artery disease (defined as ≥1 coronary artery with ≥70% diameter stenosis), systemic hypertension, cardiomyopathy (left ventricular ejection fraction [LVEF] ≤40%), moderate-to-severe aortic stenosis, moderate-to-severe mitral regurgitation, chronic renal insufficiency (creatinine clearance ≤30 mL/min [≤0.50 mL/s]), hypertrophic obstructive cardiomyopathy, and chronic obstructive pulmonary disease. The protocol was approved by the University of California, San Francisco, Committee on Human Research, and all the patients gave written informed consent before enrollment.
Patients underwent recording of left ventricular end-diastolic pressure (LVEDP) using a 6-French pigtail catheter and a properly zeroed fluid-filled pressure transducer. Pressure was recorded using a 50–mm Hg scale at 50–mm/s paper speed. One of 2 blinded physicians (G.M. and J.V.) measured the post–A wave pressure. A minimum of 5 consecutive cardiac cycles were used to determine the mean LVEDP. An LVEDP greater than 15 mm Hg was prospectively specified as abnormal.16
Transthoracic echocardiographic data were obtained by an experienced echocardiographer using 1 of 2 ultrasound systems (Acuson Sequoia; Siemens, Malvern, Pa, or SONOS 5500; Philips Medical Systems, Andover, Mass). Echocardiographic contrast (Optison; Amersham, Little Chalfont, England) (0.3-0.5 mL injected into a peripheral vein) was administered when required to improve endocardial border detection. Echocardiographic data were stored on magneto-optical disks and were analyzed off-line by a single experienced reader (B.M.) blinded to any clinical or study data. End-diastolic and end-systolic volumes were calculated using the biplane method of discs,17 and they were then indexed to body surface area. These volumes were used to calculate LVEF. An LVEF less than 50% was prospectively specified as abnormal.18
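The three prospectively specified cutoffs (BNP level >100 pg/mL, LVEDP >15 mm Hg, and LVEF <50%) and the derivation of LVEF from the measured volumes can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical and are not study data; the LVEF formula is the standard volumetric definition and is assumed here because the text does not spell it out.

```python
from typing import Dict, Sequence

# Cutoffs as prospectively specified in the text.
BNP_ABNORMAL_ABOVE_PG_ML = 100.0
LVEDP_ABNORMAL_ABOVE_MM_HG = 15.0
LVEF_ABNORMAL_BELOW_PCT = 50.0


def lvef_percent(edv: float, esv: float) -> float:
    """Ejection fraction (%) = (EDV - ESV) / EDV * 100.

    Assumed standard definition. Indexing both volumes to body surface
    area divides each by the same number, so the ratio is unchanged.
    """
    if edv <= 0 or esv < 0 or esv > edv:
        raise ValueError("volumes must satisfy EDV > 0 and 0 <= ESV <= EDV")
    return (edv - esv) / edv * 100.0


def mean_lvedp(cycles_mm_hg: Sequence[float]) -> float:
    """Mean post-A wave pressure over at least 5 consecutive cycles."""
    if len(cycles_mm_hg) < 5:
        raise ValueError("at least 5 consecutive cardiac cycles are required")
    return sum(cycles_mm_hg) / len(cycles_mm_hg)


def abnormal_flags(bnp_pg_ml: float, lvef_pct: float, lvedp_mm_hg: float) -> Dict[str, bool]:
    """Apply the three cutoffs; values exactly at a cutoff count as normal."""
    return {
        "bnp": bnp_pg_ml > BNP_ABNORMAL_ABOVE_PG_ML,
        "lvef": lvef_pct < LVEF_ABNORMAL_BELOW_PCT,
        "lvedp": lvedp_mm_hg > LVEDP_ABNORMAL_ABOVE_MM_HG,
    }


# Hypothetical patient, for illustration only.
example_lvef = lvef_percent(edv=120.0, esv=60.0)
example_lvedp = mean_lvedp([14.0, 15.0, 16.0, 15.0, 15.0])
example_flags = abnormal_flags(bnp_pg_ml=101.0, lvef_pct=example_lvef, lvedp_mm_hg=example_lvedp)
```

Because the cutoffs are strict inequalities, a patient with an LVEF of exactly 50% or an LVEDP of exactly 15 mm Hg is classified as normal in this sketch.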
Each study participant was auscultated for a left ventricular S3 by 1 physician from each of 4 groups: board-certified cardiology attendings (n = 26), cardiology fellows (n = 18), internal medicine residents (n = 54), and internal medicine interns (n = 48). Auscultation was performed in a quiet room at and around the apex in the supine and left lateral decubitus positions. The auscultators were not permitted to elicit any patient history or to perform other components of the physical examination. An intermittent S3 was considered positive. All the auscultators were blinded to the patients' conditions and to all study test results.
A 3-minute audioelectric phonocardiographic tracing (Audicor; Inovise Medical Inc, Portland, Ore) was obtained. Audioelectrocardiographic leads were attached at the V3 and V4 positions and connected to an electrocardiography machine (Marquette MAC 5000; General Electric Healthcare Technologies, Waukesha, Wis). The audioelectrocardiographic data were stored electronically on a compact disc. A 10-second segment free of artifact was selected off-line by a blinded electrical and biomedical engineer for a computer-generated report determining the presence of an S3.
Data are given as mean ± SD for continuous variables. Comparisons between groups were assessed using Mann-Whitney and Fisher exact tests, where appropriate. κ Statistics were calculated for the degree of agreement among phonocardiography and the different auscultation groups. Generally, κ>0.7 identifies good agreement, κ = 0 is equal to guessing, and κ<0 is worse than guessing.19 Clustering of ratings by individual physicians was checked using a χ2 test in a random-effects logistic regression of physicians' rating on the phonocardiography rating. Two-tailed P<.05 was considered statistically significant. Statistical computations were performed using a software program (Stata version 9.1; StataCorp, College Station, Tex).
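As an illustration of the agreement statistic, the following minimal Python sketch computes an unweighted Cohen κ from a 2 × 2 table of paired ratings (for example, auscultation vs phonocardiography). The choice of the unweighted Cohen κ is an assumption, since the text specifies only "κ statistics"; the function name is hypothetical and the counts are invented, not study data.

```python
def cohen_kappa(both_pos: int, a_only: int, b_only: int, both_neg: int) -> float:
    """Unweighted Cohen kappa for two raters giving a binary rating.

    both_pos: both raters report an S3
    a_only:   only rater A reports an S3
    b_only:   only rater B reports an S3
    both_neg: neither rater reports an S3
    """
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("the table is empty")
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + a_only) / n
    b_pos = (both_pos + b_only) / n
    # Agreement expected by chance if the two raters were independent.
    expected = a_pos * b_pos + (1.0 - a_pos) * (1.0 - b_pos)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts, for illustration only.
kappa_example = cohen_kappa(both_pos=8, a_only=12, b_only=10, both_neg=60)
```

The statistic is 1 for perfect agreement, 0 when observed agreement equals that expected by chance, and negative when agreement is worse than chance, which matches the interpretation given above.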
One hundred patients were enrolled. The mean age was 62 ± 14 years (range, 24-91 years), and 65 were men. Twenty-nine of these patients had treated diabetes mellitus, 81 had systemic hypertension, 36 had a clinical diagnosis of heart failure, 68 had coronary artery disease, and 17 were hospitalized for an acute coronary syndrome. Seven patients had moderate-to-severe aortic stenosis, and 3 had severe hypertrophic obstructive cardiomyopathy. The mean creatinine level was 1.52 ± 1.39 mg/dL (134 ± 123 μmol/L) (>2.0 mg/dL [>177 μmol/L] in 12 patients).
One hundred patients had invasive measurement of central hemodynamic variables and LVEDP. The mean heart rate was 69 ± 12 bpm. The mean central aortic pressure was 131 ± 25 mm Hg systolic and 66 ± 12 mm Hg diastolic. The mean LVEDP was 15.1 ± 7.7 mm Hg (range, 1-31 mm Hg). Forty-six patients had an abnormal LVEDP (>15 mm Hg). Eighty-eight patients underwent adequate assessment of LVEF. The mean LVEF was 57% ± 19% (range, 7%-85%). Twenty-six patients had an abnormal LVEF (<50%). Ninety-eight patients underwent BNP measurement. The mean BNP level was 487 ± 914 pg/mL (range, 5-4490 pg/mL). Fifty-eight patients had an abnormal BNP level (>100 pg/mL).
Ninety-eight patients were auscultated by 1 of 26 board-certified cardiologists. Ninety-nine patients were auscultated by 1 of 18 cardiology fellows. Ninety-four patients were auscultated by 1 of 54 internal medicine residents. Eighty-six patients were auscultated by 1 of 48 internal medicine interns. The maximum number of patients auscultated by a single physician from each group was 19 for attendings and fellows and 6 for residents and interns. The percentage of study patients in whom an S3 was auscultated is given by level of experience in Table 1. Ninety patients had adequate phonocardiographic data: 2 were excluded because of atrial pacing and 8 owing to poor sound quality tracings. Phonocardiography detected an S3 in 21 (23%) of the 90 patients (Table 1).
The fellows' and attendings' auscultation of an S3 had fair agreement with phonocardiographic findings (κ = 0.37; P < .001 and κ = 0.29; P = .003, respectively). Using phonocardiographic S3 as the reference standard, fellows tended to have fewer false-positive S3s (30%) compared with attendings (55%) but more false-negative S3s (61% for fellows vs 48% for attendings). The agreement was higher with residents than with interns, but there was no significant agreement between either interns or residents with phonocardiographic findings (κ = 0.04; P = .36 and κ = 0.13; P = .11, respectively). None of the 8 tests for clustering of ratings by individual physician (S3 for each category of physician) was significant (P>.15 by χ2 test for all).
Patients with an S3 detected by phonocardiography had significantly higher BNP, lower LVEF, and higher LVEDP measurements than those without an S3 detected by phonocardiography (Figure 1 and Table 2). With decreasing levels of experience, each of the clinical auscultator groups performed progressively less well compared with phonocardiography for separating patients based on these functional and hemodynamic measurements (Table 2). Phonocardiography and attendings separated patients based on all 3 measures of ventricular function. Fellow detection of an S3 separated patients based on LVEF and LVEDP but not BNP level. Resident auscultation of an S3 separated patients based only on LVEDP and not BNP level or LVEF. Intern assessment of an S3 performed most poorly, with no significant separation based on any of these objective measures of cardiac function.
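The separation of patients by a continuous measure can be sketched with the Mann-Whitney U statistic, assuming this was the test applied to the continuous markers (the Methods state only that Mann-Whitney and Fisher exact tests were used where appropriate). The function name is hypothetical, the ln(BNP) values are invented for illustration, and no P value is computed.

```python
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x: the number of (x_i, y_j) pairs with
    x_i > y_j, counting each tied pair as one half."""
    if len(x) == 0 or len(y) == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical ln(BNP) values, not study data.
with_s3 = [6.1, 5.8, 7.0, 6.4]
without_s3 = [4.2, 5.0, 3.9, 5.8, 4.7]

u_stat = mann_whitney_u(with_s3, without_s3)
# Fraction of patient pairs in which the +S3 patient has the higher value.
prob_superiority = u_stat / (len(with_s3) * len(without_s3))
```

A U value near the product of the two sample sizes indicates that nearly every patient with an S3 had a higher value than every patient without one; a value near half that product indicates no separation.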
Using prospectively defined cutoff points for abnormality in each hemodynamic measurement, the presence of an S3 as detected by phonocardiography had sensitivities of 33% to 52% and specificities of 88% to 92% (Figure 2). An S3 detected by fellow auscultation yielded consistently lower sensitivities than one detected by attendings (13%-30% for fellows vs 33%-50% for attendings) but higher specificities (93%-95% for fellows vs 85%-89% for attendings). Auscultation of an S3 by interns and residents demonstrated low sensitivities (30%-38%). The interns' and residents' S3 had somewhat high specificities (79%-88%) that generally remained lower than those obtained by the groups with more advanced levels of training. Phonocardiography, attendings, and fellows detected an S3 more frequently in those with elevated LVEDP and reduced LVEF than did residents and interns (Table 3).
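For readers who wish to reproduce test characteristics of this kind, a minimal Python sketch follows. It treats the S3 as the test and an abnormal marker (for example, LVEDP >15 mm Hg) as the reference standard. The function name is hypothetical and the counts are invented; they are not taken from Figure 2 or the tables.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from 2 x 2 counts.

    tp: S3 present, marker abnormal
    fn: S3 absent,  marker abnormal
    fp: S3 present, marker normal
    tn: S3 absent,  marker normal
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one abnormal and one normal patient")
    sensitivity = tp / (tp + fn)  # proportion of abnormal patients with an S3
    specificity = tn / (tn + fp)  # proportion of normal patients without an S3
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=12, fn=28, fp=5, tn=45)
```

With these invented counts, the sensitivity is 30% and the specificity is 90%, a pattern of low sensitivity and high specificity similar in kind to that reported above.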
Although the clinical utility and prognostic power of the S3 have been demonstrated,7,8,20 previous literature focusing on the actual auscultation of the sound has not been promising. Physicians-in-training were shown to have inadequate cardiac auscultatory proficiency,12 and interobserver agreement among small numbers of fully trained physicians has been poor.11 Lok et al21 examined the accuracy and interobserver agreement of 2 cardiologists, 1 general internist, 3 senior residents, and 2 junior residents in auscultating an S3 in 40 patients. They found “slight agreement” among auscultators (κ = 0.18). There was no improvement in accuracy (comparing auscultative findings with those from phonocardiography) or interobserver agreement with more advanced levels of training.
Unlike previous studies, we examined groups of physicians representing each level of training and included a larger number of patients. A fair degree of agreement between an S3 detected by phonocardiography and one detected by cardiology fellows (κ = 0.37) or cardiology attendings (κ = 0.29) was observed. There was poor agreement between phonocardiography and medicine residents (κ = 0.13) and medicine interns (κ = 0.04). In contrast to the findings of Lok et al,21 we demonstrated that the agreement with phonocardiography improved with a greater level of experience. We recently showed that the phonocardiographic S3 has fair sensitivities but high specificities for evidence of left ventricular dysfunction.22
Although several animal and human studies have elucidated the origins of the S3,2-6 few have compared the sound with objective markers of hemodynamic aberration routinely used in clinical practice. In 1968, Shah et al2 demonstrated that the mean left atrial pressure was elevated in 10 patients with an S3 detected by phonocardiography; however, correlation with physician auscultation was not performed.
B-type natriuretic peptide is a neurohormone secreted from the myocardium in response to myocyte stretch. With higher BNP levels indicating volume overload, this serologic marker has been shown to be useful in the clinical diagnosis of heart failure, and it is becoming a commonly used tool in determining the cause of dyspnea.15,23 It was previously demonstrated that an S3 auscultated by a senior cardiologist is highly specific for an elevated level of BNP.24
Comparing findings from each group of auscultators and phonocardiography with BNP levels, LVEF, and LVEDP, we found a consistent association between the presence of an S3 and greater hemodynamic abnormality. The presence of an S3 correlated with markers of poor left ventricular function, volume overload, or both. Again, performance improved with more advanced levels of training. Whereas the mean levels of each of the 3 objective markers of cardiac function (BNP, LVEF, and LVEDP) were significantly more abnormal in patients with an S3 detected by phonocardiography and the attendings, 2 markers (LVEF and LVEDP) were significantly more abnormal in those with an S3 detected by fellows, 1 marker (LVEDP) was significantly more abnormal in those with an S3 detected by residents, and none were significantly more abnormal in those with an S3 detected by interns.
No formal assessment of the educational curriculum for physical diagnostic skills for each level of training was performed. The clear improvement in auscultatory accuracy by the fellows compared with the residents and interns may be due in part to the emphasis on the cardiac physical examination and regular bedside teaching by senior cardiologists provided to the cardiology fellows at the University of California, San Francisco. It is also possible that individuals with greater interest in or skill at clinical auscultation may pursue cardiology specialty training.
Finally, the test characteristics of the S3 demonstrated clinical utility and reflected improvement with higher levels of experience. In general, the sensitivity of the S3 to detect an abnormal hemodynamic marker was poor, regardless of the level of the auscultator. Therefore, the absence of an S3 is likely not helpful to exclude abnormal ventricular hemodynamics or function. However, the specificity of the S3 for an abnormal hemodynamic marker was consistently high: the specificities of an S3 detected by interns for a hemodynamic marker in the abnormal range were 79% to 84%, and the specificities of an S3 detected by residents were 82% to 88%. The presence of an S3 detected by fellows and attendings had specificities of 85% to 95% for detecting a marker of cardiac function in the abnormal range. Because auscultation of an S3 requires that the heart sound have sufficient amplitude, frequency, and coupling to the chest wall, this physical finding is an insensitive, but highly specific, sign of ventricular dysfunction.25 Therefore, auscultation of an S3 by individuals with advanced training can be useful to rule in an abnormally elevated BNP level, a depressed LVEF, or an elevated LVEDP.
Although correlation with phonocardiographic findings improved with each level of experience, the κ statistics for fellows (κ = 0.37) and attendings (κ = 0.29) were only fair. Rather than reflecting only fair auscultation of the S3 by these physicians, this level of agreement must be interpreted with the understanding that the phonocardiographic S3 also had modest sensitivity for detecting patients with abnormal left ventricular function.22 Using the abnormal levels of the objective markers of ventricular function (BNP, LVEF, and LVEDP) as reference standards, the sensitivity of the phonocardiographic S3 was low (33%-52%), demonstrating that a false-negative phonocardiographic S3 might commonly occur. The specificity of the phonocardiographic S3 was high. Using the same markers as reference standards, the test specificities of the attendings' and fellows' auscultated S3 (85%-95%) were similar to those of the phonocardiographic S3 (88%-92%). The attendings' and fellows' S3 matched the phonocardiographic S3 in successfully distinguishing patients with abnormal levels of each of the 3 markers (Table 2). The increasing correlation with the phonocardiogram with each level of experience is encouraging; however, the lack of a robust κ value should not dissuade anyone from perceiving the physician-auscultated S3 as a clinically useful finding.
Insofar as these findings demonstrate the capacity for physicians to auscultate a clinically important S3, we believe they can be generalized to the practicing physician and physician-in-training. The full realization of that capacity requires both continuing interest on the part of the learner and mentorship and teaching by those with expertise.
Because we did not follow the auscultative abilities of any given individual across time and with further training, we cannot absolutely conclude that more training is sufficient to improve physical examination skills. However, lack of evidence for interdependence among an individual physician's ratings reduced the likelihood that the superior performance of the cardiology fellows and attendings was due to a selection of particular physicians with an interest in or affinity for the cardiovascular examination (rather than a general improvement with more training). Although auscultators were blinded to patient histories and test results, they had the opportunity to “eyeball” the patient during auscultation. The appearance of the patient and visible physical examination findings may have biased the auscultator; however, in true clinical practice, the examining physician will always be guided by the general appearance and condition of the patient. Finally, although patients with pulmonary hypertension were excluded, a right ventricular S3 might have been detected.
In conclusion, compared with an objective measure of the sound itself (phonocardiography), auscultation of the S3 improves with more advanced levels of training. Compared with clinically useful objective markers of left ventricular function (BNP, LVEF, and LVEDP), the specificity of the S3 is high, with the best performance demonstrated by physicians representing more advanced levels of training. Phonocardiography performed better than any of the auscultator groups in correlating with measures of ventricular function.
Correspondence: Andrew D. Michaels, MD, Division of Cardiology, University of California, San Francisco, Medical Center, 505 Parnassus Ave, Box 0124, San Francisco, CA 94143-0124 (firstname.lastname@example.org).
Accepted for Publication: November 11, 2005.
Financial Disclosure: None.
Funding/Support: This study was supported by an unrestricted educational grant from Inovise Medical Inc (Dr Michaels) and internal research funds from the Division of Cardiology, University of California, San Francisco.
Acknowledgment: We thank the patients who participated in the study; the staff of the Cardiac Catheterization Laboratory, University of California, San Francisco, for their technical assistance; Patti Arand, PhD, Nancy Forman, RN, BSN, and Robert Warner, MD, of Inovise Medical Inc for training and technical assistance; and Inovise Medical Inc for free use of the phonocardiographic equipment and interpretation of the tracings.