Facial landmarks and distances. Na indicates nasal; F, frontal; Io, infraorbital; A, alar; M, mouth; Ls, superior lip; and Li, inferior lip. The landmarks F, Io, A, and M are bilateral, whereas landmarks Na, Ls, and Li are midline, bringing the total number to 11. Distance measurements are straight lines between 2 points; eg, FNa is the distance between F and Na (dotted line).
Facial area measures. The eye area indicates the F-Na-Io triangle (ΣEYE); lateral area, the A-M-Io triangle (ΣLATERAL); paranasal area, the A-Na-Io triangle (ΣPARANASAL); upper lip area, the A-Ls-M triangle (ΣUPPERLIP); and mouth area, the Ls-Li-M triangle (ΣMOUTH). Landmarks used to measure the areas are described in the legend to Figure 1.
Dulguerov P, Wang D, Perneger TV, Marchal F, Lehmann W. Videomimicography: The Standards of Normal Revised. Arch Otolaryngol Head Neck Surg. 2003;129(9):960-965. doi:10.1001/archotol.129.9.960
Copyright 2003 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Studies aiming to objectively evaluate facial movements have focused on the technique of measurement, whereas the most pertinent measurements of basic facial movements have not been well characterized.
To determine the best normal measures of 5 basic facial movements in healthy subjects.
In 5 healthy subjects, 11 facial landmarks were placed on the face, and 5 movements (forehead lift, eye closure, nose wrinkling, lip puckering, and smiling) with maximal contraction force were requested. Each subject repeated each movement 3 times, and the entire session was repeated on 4 different days. No specific immobilization of the head was performed. The session was filmed with a digital camera, and the frames with maximal movement were selected. Measurements were performed with Osiris public domain image analysis software. For each measure, the change from rest was computed. Intersubject and intrasubject variability were determined by a multivariate analysis of variance.
In all movements, surface changes (mean ± SD) were higher than distance changes. For forehead lifting and eye closure, the best measure was the eye surface changes of 13% ± 5% and −32% ± 9%, respectively. For nasal wrinkling, lip puckering, and smiling, the best measures were the paranasal area (change, –28% ± 9%), upper lip area (change, –23% ± 8%), and mouth area (change, 63% ± 21%), respectively. Most distance changes were below 10%. Same-day repeatability variation was below 15%, and day-to-day repeatability variation was below 7%. In healthy subjects, more than 80% of the total variation was accounted for by the intersubject variability.
Videomimicography is a simple and objective linear measurement system based on facial surface changes. The measures exhibit good reliability.
Evaluation of the motor facial nerve function requires that movements of the facial musculature be elicited, either by external electrical stimulation or by voluntary contraction, which is usually solicited by a verbal command.1 Electrical stimulation tests have definite shortcomings when used in incomplete facial nerve paralysis, mainly because they lack the necessary dynamic range for quantifying the residual facial motor function. Although never clearly spelled out, the 2 stimulation methods result in tests addressing opposite ends of the facial neuromuscular dysfunction scale. The electrical stimulation tests are used for patients with little residual (0%) facial nerve function, and voluntary evoked movements for patients with good residual (100%) facial function.1
In tests of facial neuromuscular function evoked by voluntary contraction, the evaluation of facial movements can be classified as subjective or objective. Subjective evaluation methods correspond to the various facial nerve grading systems, of which the most widely used is the House-Brackmann system.2 Objective facial nerve evaluation methods use some kind of measurement technique in the hope of reducing errors and avoiding observer biases inherent to the different grading systems. Although these methods are still experimental, they can be subdivided, according to the technique of measurement used, into the following 3 main groups: linear measurement, image subtraction, and miscellaneous techniques.1 Linear measurement techniques use facial landmarks and movement-associated changes in distance between these landmarks to derive an index of the facial function.3-6 Image subtraction techniques use digitized images and rely on a computer to perform a subtraction between frames obtained at rest and after facial movement; the number of pixels with a different "color" in a given facial area is summed and used to calculate facial movements.7-9 Miscellaneous techniques include surface electromyography,10 microscaling,11 and moiré topography.12
Despite the multiplicity of reports, mostly focused on the technique of measurement, fundamental questions remain unanswered. The facial movements used should be clearly defined and standardized. In addition, the exact pertinent measures for typical facial movements remain to be specified. The basic questions are (1) what should be measured and (2) how should these measurements be performed? These questions should be first addressed in healthy subjects and the defined measures evaluated in patients with facial paralysis.
We herein report a new, objective facial nerve evaluation method, called videomimicography (VMG), based on a systematic evaluation of these questions.1 We evaluated distances adapted from previously published linear measurements and facial surfaces that are the basic measurements of image subtraction techniques. In this study, our goal was to establish the best normal measures of 5 standard facial movements in healthy subjects.
Five adults (2 men and 3 women) with an average age of 41.2 years and no history of facial paralysis were used as our healthy control group.
During VMG, the subject sits comfortably on a custom chair. The chair has a headrest, providing head support and minimizing head movements in the anteroposterior and lateral directions. In addition, a 10-cm scale is positioned just superior to the forehead by means of an adjustable frame and used for the calibration of all measurements.
Landmarks are placed on the face with a blue eyeliner pen. The 5 facial movements chosen are routinely used in clinical facial nerve testing and include forehead lifting, eye closure, nose wrinkling, lip puckering, and smiling. Before recording, the requested movements were explained to the subject, and a few trials were performed. For each movement, the subject is actively stimulated by verbal commands to produce the maximal possible movement and to keep this position for a few seconds. A digital video camera was used (AG-EZ1; Panasonic, Osaka, Japan) to record the VMG session on a videocassette (DV10000; Panasonic). The total procedure took about 30 seconds for 1 repetition of all 5 facial movements in all cases.
Before the video recording, 11 facial landmarks were placed (Figure 1). These landmarks were similar, but not identical, to the ones proposed by Burres,3 including the following:
Nasal (Na) was placed in the midline, on the nasal bone, slightly below the nasion. This point was found by Frey et al4 to correspond to an immobile point during facial movements and was, therefore, used as a reference point for the measurement of other facial landmarks.
Frontal (F) was placed about 2 cm above the eyebrow, above the pupil.
Infraorbital (Io) was placed at the level of the skin overlying the orbital rim on a vertical line passing from the pupil.
Alar (A) was placed a few millimeters lateral to the inferior edge of the nasal alae.
Mouth (M) was placed a few millimeters lateral to the corner of the mouth.
Superior lip (Ls) was placed in the midline, in the deepest point of the philtrum. It is usually a few millimeters above the Cupid bow.
Inferior lip (Li) was placed in the midline, in the deepest point of the chin. It is usually about 10 mm from the inferior vermilion border, at the midline.
Landmarks F, Io, A, and M are bilateral, whereas Na, Ls, and Li are at the midline. The landmarks were placed according to these descriptions, although somewhat arbitrarily, without exact measurement of their location.
For each movement, 10 distances and 5 areas were evaluated on both halves of the faces. Distance measurements correspond to the length of the straight line between 2 points, eg, the distance FNa corresponds to the length of a straight line drawn between points F and Na (Figure 1). The distances studied were ALs (from A to Ls), AM (from A to M), FIo (from F to Io), FNa (from F to Na), IoA (from Io to A), IoM (from Io to M), MLi (from M to Li), MLs (from M to Ls), NaA (from Na to A), and NaIo (from Na to Io). Area measurements correspond to the surface of a triangle between 3 points, eg, the eye area is the surface of the F-Na-Io triangle (Figure 2). The 5 areas evaluated included the eye (ΣEYE), lateral (ΣLATERAL), paranasal (ΣPARANASAL), upper lip (ΣUPPERLIP), and mouth (ΣMOUTH) areas. The percentage of change (ΔX) of each of the above 15 variables between rest and maximal movements was computed using the following formula:
ΔX = ([X_MOVEMENT − X_REST] / X_REST) × 100,
where X represents a given measure.
When a given measure decreased during the requested movement, ΔX was negative, whereas an increase led to a positive ΔX. For each facial movement, the best measure was defined as the measure exhibiting the largest change relative to the rest value (high absolute ΔX) together with a small variation (SD) of that change.
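The distance, area, and percentage-of-change definitions above can be sketched in a few lines. This is only an illustration, not the authors' modified Osiris code: the function names are hypothetical, the triangle area uses the standard shoelace formula, and the landmark coordinates are invented values in centimeters chosen to keep the arithmetic easy to follow.

```python
import math


def distance(p, q):
    """Length of the straight line between two (x, y) landmarks."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def triangle_area(p, q, r):
    """Area of the triangle p-q-r (shoelace formula)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) / 2.0


def percent_change(x_movement, x_rest):
    """Delta X = ((X_movement - X_rest) / X_rest) * 100."""
    return (x_movement - x_rest) / x_rest * 100.0


# Hypothetical landmark coordinates (cm) for one side of the face.
rest = {"F": (3.0, 6.0), "Na": (0.0, 4.0), "Io": (3.0, 2.0)}
eye_closure = {"F": (3.0, 5.0), "Na": (0.0, 4.0), "Io": (3.0, 2.0)}

# Eye area is the F-Na-Io triangle.
eye_rest = triangle_area(rest["F"], rest["Na"], rest["Io"])
eye_closed = triangle_area(eye_closure["F"], eye_closure["Na"], eye_closure["Io"])
delta_eye = percent_change(eye_closed, eye_rest)

# FNa distance change for the same pair of frames.
delta_fna = percent_change(
    distance(eye_closure["F"], eye_closure["Na"]),
    distance(rest["F"], rest["Na"]),
)

print(f"eye area change: {delta_eye:.1f}%")
print(f"FNa distance change: {delta_fna:.1f}%")
```

In this invented example the area shrinks, so its ΔX is negative, which matches the sign convention described above.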
The entire video was replayed on a digital videocassette recorder. For each subject, 3 rest frames and 3 frames for each facial movement were selected and then fed to a personal computer under software control (DV Studio; Panasonic). Graphic measurements of the distances and areas were performed with Osiris public domain image analysis software. We used a custom modification that allowed us to send the coordinates of the marked point to a spreadsheet file by clicking on them. This modification also performs the special calculation in terms of distances and areas and their calibration in centimeters. The modified version of the Osiris software can be requested from the authors or the Division of Medical Information of the Geneva University Hospital, Geneva, Switzerland (available at: http://www.expasy.ch/www/UIN/html1/projects/osiris/osiris.html).
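The calibration step can be illustrated as follows. This is a sketch under stated assumptions, not a description of the customized Osiris software: it assumes that the two ends of the 10-cm scale are clicked in the same frame as the landmarks, and that a single scale factor applies across the image. The function names and pixel coordinates are hypothetical.

```python
import math

SCALE_LENGTH_CM = 10.0  # length of the scale positioned above the forehead


def cm_per_pixel(scale_start_px, scale_end_px, scale_length_cm=SCALE_LENGTH_CM):
    """Calibration factor derived from the two clicked ends of the scale."""
    length_px = math.hypot(
        scale_end_px[0] - scale_start_px[0],
        scale_end_px[1] - scale_start_px[1],
    )
    if length_px == 0:
        raise ValueError("scale end points must be distinct")
    return scale_length_cm / length_px


def to_cm(point_px, factor):
    """Convert a clicked pixel coordinate to centimeters."""
    return (point_px[0] * factor, point_px[1] * factor)


# Hypothetical clicks: the scale spans 400 pixels in this frame.
factor = cm_per_pixel((100, 50), (500, 50))
na_cm = to_cm((300, 200), factor)

print(f"{factor:.3f} cm per pixel, Na at {na_cm[0]:.2f}, {na_cm[1]:.2f} cm")
```

Because distances scale with the factor and areas with its square, the percentage changes relative to rest are unaffected by the calibration as long as the same factor applies to the rest and movement frames.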
The measures were averaged across subjects, including repetitions during the same day and on different days, and across sides (left and right). The coefficient of variation was computed using the standard formula SD/mean. We assessed the reliability of VMG by evaluating same-day and day-to-day variability. The recordings for each subject were repeated on 4 different days (days 0, 1, 7, and 8), with 3 repetitions each day. Side-to-side, day-to-day, retest variability (intrasubject variability), and intersubject variability were assessed by analysis of variance (ANOVA). We used SPSS for Windows software, version 9.0 (SPSS Inc, Chicago, Ill), for statistical tests. Unless otherwise indicated, data are expressed as mean ± SD.
Except for nose wrinkling, which was sometimes difficult to achieve, subjects had no trouble producing the requested facial movements. For every facial movement studied, the measures exhibiting the highest changes relative to rest (ΔX) were always measures involving surfaces rather than distances (Table 1). For eye closure and forehead lifting, the measure with the largest percentage of change relative to the rest value was ΣEYE, with −31.86% ± 8.54% and 12.73% ± 4.83%, respectively. For nose wrinkling, the best measure was ΣPARANASAL, with an average change of −28.08% ± 9.49%. For lip puckering, the best measure was ΣUPPERLIP, with an average change of −22.89% ± 8.29%. For smiling, the best measure was ΣMOUTH, with an average change of 63.48% ± 21.27%. For all of these 5 best measures, the coefficient of variation was smaller than 0.4.
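The coefficient-of-variation statement can be checked directly from the means and SDs reported above. The short sketch below applies SD/mean to the 5 best measures; taking the absolute value of the mean for the negative changes is an assumption made here so that the ratio stays positive, and the dictionary keys are illustrative labels.

```python
# (mean change in %, SD in %) for the best measure of each movement,
# as reported in the text above.
reported = {
    "eye_closure": (-31.86, 8.54),       # eye area
    "forehead_lifting": (12.73, 4.83),   # eye area
    "nose_wrinkling": (-28.08, 9.49),    # paranasal area
    "lip_puckering": (-22.89, 8.29),     # upper lip area
    "smiling": (63.48, 21.27),           # mouth area
}


def coefficient_of_variation(mean, sd):
    """SD / mean, using |mean| so negative changes give a positive ratio."""
    return sd / abs(mean)


cvs = {name: coefficient_of_variation(m, s) for name, (m, s) in reported.items()}

for name, cv in cvs.items():
    print(f"{name}: CV = {cv:.2f}")
```

The ratios come out between roughly 0.27 and 0.38, which agrees with the stated bound of 0.4.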
Results of ANOVA showed that, for most of the measures, the majority of the total variation resulted from intersubject variability, whereas side-to-side, day-to-day, and retest (intrasubject) variability were negligible (P>.05). For ΣEYE in eye closure and forehead lifting, ΣPARANASAL in nose wrinkling, ΣUPPERLIP in lip puckering, and ΣMOUTH in smiling, the ANOVA showed that intersubject variation was responsible for 91%, 73%, 97%, 82%, and 90%, respectively, of the total variability. Depending on the best measure studied (Table 2), the percentage of total variability due to side variation ranged from 0.2% to 2%; due to same-day retest variation, 1% to 13%; and due to different-days retest variation, 1% to 12%. Only the intersubject variability reached statistical significance (P≤.001).
For intrasubject variability, most high values were found in forehead lifting, with intermediate values in lip puckering and low values in eye closure, nose wrinkling, and smiling.
To develop any test, the first prerequisite is to decide exactly what the test should measure. For objective facial nerve evaluation methods, this means specification of what facial movements should be studied, how the movements should be performed, and what should be measured for each movement. Only then can issues about the way to perform these measures be raised.
Any published facial evaluation method, subjective or objective, has assumed that the production of facial movements is a reliable representation of facial neuromuscular function. We also assumed that a variation of the standard facial movements used in clinical evaluation should be tested. Although the movements we chose cover most of the facial mimetic musculature, from the forehead to the chin, the pertinence of these movements should be addressed by a correlation with some form of facial disability evaluation, of which disability questionnaires13-17 are an example. Such a study has yet to be performed.
Once the facial movements to be assessed are determined, it remains to specify how these movements should be performed. Few objective facial nerve evaluation methods have specified maximal contractions for each facial movement.5,6 Experimental arguments favoring the use of maximal contraction can be derived from the studies by Burres,3 ie, the measures assessed for soft eye closure had a higher variability than those for tight eye closure. Probably because it seems the only simple way to standardize the production of facial movements, maximal contraction should be recommended. During VMG, we requested the maximal possible effort and provided active verbal stimulation for the subjects.
In previous reports, few studies have compared different measures to assess the best measure for each movement. In his pioneering study, Burres3 recommended the following distances for different facial movements: forehead movements (FIo), eye closure (FIo and NaIo), nose wrinkle (AM and NaA), kissing (M to lateral canthus), and smiling (M to midmouth). Frey et al4 used a sophisticated setup with 4 different cameras and somewhat different movements and landmarks and reached similar conclusions.
Previous linear measurement techniques have all used facial distances as measures, whereas image subtraction methods,7,9 although based on a different technology, could be regarded as measuring areas. We studied distance and area measures at the same time. Ten distances and 5 areas were analyzed in each frame (18 frames per subject). The best measure for each facial movement studied was determined as follows: for eye closure and forehead lifting, ΣEYE; for nose wrinkling, ΣPARANASAL; for lip puckering, ΣUPPERLIP; and for smiling, ΣMOUTH. These measures exhibited the largest changes relative to rest and the smallest coefficient of variation. Area measures were found to better estimate the 5 routine facial movements than distance measures in healthy subjects.
In all 5 movements, the measures with the highest changes were located close to the moving facial area, whereas those located far from the moving facial area had low changes or no real change (Table 1).
In order of importance, an ideal technique should first not impede facial movements; therefore, the face should not be touched during the movements or for the measurements. Second, it should be reproducible for a given individual, both in normal and pathologic cases. Third, it should provide synchronous data from the left and right sides of the face, for comparison. Fourth, it should not require the observer to make the measurements, avoiding manipulation errors and bias. Fifth, it should be rapid, simple, and low cost. Sixth, it should be well tolerated by patients. Seventh, it should provide absolute values, not just percentages. Eighth, it should be stored in some form for later comparison, evaluation by other examiners, or further studies. Finally, it should not require markings on the face.1
Although image subtraction methods have the inherent advantage of not requiring contact with the patient's face, for linear measurements most authors have so far used a manual technique,3,4,18,19 which requires touching the patient's face. Several authors have used still photographs,6 taped images,5 or a complicated digital setup,4 but a simple digital technique has yet to be successfully applied.
Videomimicography is a measurement system in which the patient's face is not touched. A digital video image is obtained, and frames or movies can be directly visualized or fed into a computer for analysis.
Few previous studies have assessed reproducibility. Wood et al11 used a microscaling technique to examine 11 healthy subjects performing 2 facial movements (brow lift and smiling). The average test-retest variability was 4% and 5%, respectively; day-to-day variability, 5% and 6%, respectively; side-to-side variability, 6% and 14%, respectively; and intersubject variability, 25% and 23%, respectively. Neely et al20 used a general linear model to examine the variability in their image subtraction method. The model predicted 82% to 95% of the observed variability, and 70% to 85% of the variability was due to intersubject differences, whereas intrasubject variability was less than 2%. That study used only 1 repetition the same day and did not explore day-to-day repeatability.
Our ANOVA is a general linear model in which we examined the contribution to the total variability of same-day test-retest (3 repetitions), day-to-day retest (4 days), side-to-side (2 hemifaces), and intersubject (5 healthy subjects) variability. Our findings confirm the results of Wood et al11 and Neely et al20 that intrasubject variability is low compared with intersubject differences. In general, intersubject differences are responsible for 80% to 95% of the total variation. Therefore, it could be concluded that intrasubject variability of VMG is very low.
In VMG and other objective methods based on video recordings of facial movements, both sides of the face could undergo evaluation simultaneously. Techniques based on direct measures on the patient's face usually use asynchronous facial movements for their measurements.
Several techniques proposed involve complicated manipulations by the examinee, and their objectivity is questionable.6,11,12 All techniques involving direct measurements on the face3,18,19 carry inherent observer bias. In other described linear measurement techniques, it is simply unclear how the measurements are performed5 and sometimes what is measured.21 Image subtraction methods are, by definition, digital; however, the facial areas to be analyzed are rarely clearly defined,9,20,22 and the threshold for "color change" has remained arbitrary.7,9,20,22 Finally, some studies,12,23 including ours, although giving the impression of being completely automatic, require at least some observer input.
Even in its present state, VMG requires minimal observer intervention. A manual pointing with the mouse on the computer screen is necessary to obtain the coordinates of the different landmarks. With the custom-modified Osiris software, the coordinates and all required calculation are sent to a spreadsheet or other mathematical software. We are in the process of implementing a completely digital system in which the entire VMG session is fed directly into a computer. The main technical difficulty is that the software must track the landmarks across frames.
In general, the more sophisticated and objective a technique, the more complicated it is in terms of time and cost of the involved equipment. Simple systems involving direct linear measurements on the face3,4,18,19 require little equipment, but have numerous shortcomings, as previously discussed.1 Other linear measurement methods tend to involve numerous time-consuming observer manipulations5,6 or extremely expensive equipment.4 Image subtraction methods7,9 could be criticized because they require special lighting equipment and ambient luminosity control, fixed subject-camera distance, long duration of the procedure (10 minutes), almost absolute head immobilization, and rather expensive computer setup.
For VMG, videotape frames have been individually fed into the computer directly from a digital video recorder. With the expected fall in prices of digital video recorders and the advent of IEEE 1394 (FireWire) I/O ports as a computer standard, the system remains relatively simple. The system requires a chair with a headrest, an eyeliner pen and remover, a digital camera, and a computer. The analysis software can be freely downloaded.
Patient acceptance is a major drawback for image subtraction methods, in which head immobilization in a special head holder for 10 minutes9 is required. In VMG, the recording session takes about 30 seconds and has been well tolerated by patients who often have spontaneously requested an evaluation to "assess their improvement."
Image subtraction methods have been expressed in numbers of pixels, and absolute values have not been available. Simple systems involving direct linear measurements on the face3,4,18,19 can obviously directly obtain real distance, although the published results are often in percentages. In VMG, absolute and relative measures are available. However, and contrary to most previous reports, to diminish errors derived from the placement of landmarks and the size of faces, relative changes of the measures were derived by comparing movement frames and rest frames. This manipulation is also analogous to the measures of image subtraction methods.
Storage of data for later comparison and evaluations is obvious and, with present video facilities, could only be neglected for simplicity reasons.
Ideally, the patient's face should not be touched at all. Nevertheless, except for image subtraction techniques, all methods have used some kind of facial landmarks. Eyeliner landmarks are easy to remove with makeup removal solutions and have not been a problem.
A test that satisfies all these requirements might be useful in discriminating patients with different degrees of facial paralysis. Preliminary data from our institution in 29 patients with facial nerve paralysis of various etiologies and different House-Brackmann grades (9 patients, grade II; 8, grade III; 5, grade IV; 2, grade V; and 5, grade VI) showed that the area measurements are almost linearly related to the degree of facial paralysis (Pearson coefficients, 0.6-0.8; P<.001). A global index of facial paralysis could be derived with a correlation of 0.94.24
Videomimicography is a new objective, quantitative, reproducible, and relatively simple method for evaluating facial nerve function. Area measures are better than distance measures in evaluating facial movements. In our study, the best measure for eye closure and forehead lifting was ΣEYE; for nose wrinkling, ΣPARANASAL; for lip puckering, ΣUPPERLIP; and for smiling, ΣMOUTH. Most of the variability in healthy subjects was due to intersubject variability, whereas the retest variability was low.
Corresponding author: Pavel Dulguerov, MD, Department of Otolaryngology–Head and Neck Surgery, Geneva University Hospital, 24 rue Micheli-du-Crest, 1211 Geneva 14, Switzerland (e-mail: firstname.lastname@example.org).
Submitted for publication June 10, 2002; final revision received September 26, 2002; accepted February 5, 2003.
This study was presented at the First International Congress on Salivary Gland Diseases; January 28, 2002; Geneva, Switzerland.
We thank Ms M. Logean from the Division of Medical Information of the Geneva University Hospital for customizing the Osiris software.