Figure 1. Images of the MacBeth ColorChecker Chart (GretagMacBeth, Regensdorf, Switzerland), before (A) and after (B) calibration.
Figure 2. Comparison of some colors between the 2 most divergent uncalibrated images of the MacBeth ColorChecker Chart (GretagMacBeth, Regensdorf, Switzerland), before (A) and after (B) calibration.
Figure 3. Distribution of the color differences for robustness (repeated calibration of the same image), reproducibility with calibration, and reproducibility without calibration.
Figure 4. Distribution of the color differences between a color measurement and its “real” spectrophotometric value, with and without calibration.
Figure 5. Comparison of some images, without (A) and with (B) calibration. Note the large differences in exposure between the uncalibrated and calibrated images.
Figure 6. Comparison between 2 skin images taken a few seconds after each other, without (A) and with (B) calibration.
Vander Haeghen Y, Naeyaert JM. Consistent Cutaneous Imaging With Commercial Digital Cameras. Arch Dermatol. 2006;142(1):42-46. doi:10.1001/archderm.142.1.42
Copyright 2006 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
To demonstrate how to improve the reproducibility and accuracy of digital images of the skin taken with commercially available digital cameras by transforming them to a standard color space, sRGB.
Our computer algorithm transforms digital images to the standard sRGB color space. It is based on a card with a number of color squares with known colorimetric properties that is included in the image, thereby removing any ambiguity about the color information in the image. Reproducibility and accuracy of the method were assessed by comparing images of color squares with known colorimetric properties taken with different digital cameras at different exposures and zoom settings.
Although calibrated images exhibit markedly improved precision and accuracy compared with noncalibrated images, not all variability of the imaging process can be eliminated.
With a little care and effort, a calibrated color chart, and computer software, it is possible to greatly improve the quality of clinical imaging in dermatology and possibly other fields of medicine.
In dermatology, digital imaging, with its ease of use and directly visible results, has steadily taken over traditional photography. An important problem with traditional photography is the difficulty of obtaining reproducible color content because of differences in film, lighting, exposure, and development. All of this means that even qualitative comparison of 2 photographs (eg, to assess the evolution of a skin lesion over time during treatment) is tricky at best and excludes any color-based quantitative measurements and comparisons.
Color is not a physical phenomenon like light; rather, it is the interpretation by the human visual system of light entering the eye. The term color thus always assumes a human observer. Because of the way the human eye is built, almost all colors can be reconstructed by a suitable combination of 3 base colors: red, green, and blue (RGB). This phenomenon is readily exploited in a computer monitor or a television, which uses these as its base colors; however, no 2 display devices are equal. Consequently, a color defined by certain amounts of RGB on one device may look completely different on another device. Basically, each device has its own device-dependent RGB color space. Unfortunately, the same holds true for digital imaging devices, so it is no wonder then that no 2 images of the same subject look alike. Thus, eliminating film and development has not improved reproducibility. Despite the lack of reproducibility, articles1-4 report measurements from digital images of the skin, sometimes with some kind of calibration but mostly without it. Our initial calibrated computer-controlled imaging system5 for pigmented lesions had a limited field of view. Clearly, a generally applicable, simple calibration method to obtain reproducible and accurate imaging of larger areas using commercially available digital cameras would be a benefit.
Since 1931, the Commission Internationale de l’Eclairage (CIE) has developed a number of well-defined, standardized color spaces that are directly related to human vision (device-independent, or colorimetric, color spaces). Of those, we will retain only CIE L*a*b* because this color space includes a color difference metric called dE*ab, which is approximately proportional to the difference between 2 colors as seen by a human observer. This allows quantification of color differences and errors.
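As a brief illustration of the metric, dE*ab is the Euclidean distance between 2 colors expressed as (L*, a*, b*) triplets. The sketch below uses the standard CIE 1976 definition; the function name and the 2 sample colors are illustrative assumptions and are not taken from the study.

```python
import math


def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference: Euclidean distance in L*a*b* space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


# Two hypothetical skin-like colors (illustrative values only).
color_a = (65.0, 18.0, 18.0)
color_b = (62.0, 22.0, 18.0)
print(round(delta_e_ab(color_a, color_b), 2))  # 5.0
```

A difference of 0 means identical colors; larger values correspond to more clearly visible differences.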
However, the device-independent nature of CIE color spaces also means that they bear no relation to most display or imaging devices and thus cannot be used directly in such devices. For that purpose, it is much better to use the standardized sRGB color space, which is a kind of compromise between device-dependent and device-independent color spaces. Indeed, on one hand, the sRGB color space has a known relationship with the CIE colorimetric color spaces and on the other hand can be displayed directly and realistically on a modern computer monitor (look for the “sRGB” or “6500K” setting on a monitor). The sRGB color space is also used in many printers and is the standard color space of the Web.6-10
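The known relationship between sRGB and the CIE color spaces can be sketched as follows, using the published sRGB transfer function and the commonly quoted 4-decimal sRGB-to-XYZ matrix. The D65 reference white used for the L*a*b* step and the function name are assumptions of this sketch; the article does not state which white point its software uses.

```python
def srgb_to_lab(rgb8):
    """Convert an 8-bit sRGB triplet to CIE L*a*b* (D65 white, assumed)."""

    def linearize(c8):
        # Undo the sRGB transfer curve ("gamma") to get linear light.
        c = c8 / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb8)

    # Linear sRGB to CIE XYZ (4-decimal matrix from the sRGB definition).
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    def f(t):
        # CIE L*a*b* compression function with its linear toe.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


# White should give L* close to 100 with a* and b* close to 0.
print(tuple(round(v, 1) for v in srgb_to_lab((255, 255, 255))))
```

Once 2 sRGB colors have been converted in this way, their dE*ab is the Euclidean distance between the resulting triplets.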
The calibration of acquired images involves transforming the pixel values coming from the camera, which are defined in an unknown input RGB color space, to pixel values defined in the standard sRGB color space, thereby removing most of the variability introduced by changes in lighting, exposure, and white balance. This is achieved by including a small card (the MacBeth ColorChecker Chart [MBCCC]; GretagMacBeth, Regensdorf, Switzerland), which contains a number of color squares with known colorimetric properties in the image close to the region of interest. By requiring that the colors in those squares are transformed correctly to their known sRGB values, a suitable mathematical transformation can be determined that is valid for all the pixels in the image.
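The principle can be sketched with a deliberately simple model. The article does not specify the mathematical form of the transformation, so the affine (3 x 4) least-squares fit below is an assumption chosen for brevity, and all names, reference values, and the simulated camera response are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_affine(measured, reference):
    """Least-squares affine map (one row of 4 coefficients per output channel)."""
    rows = [list(p) + [1.0] for p in measured]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    coeffs = []
    for ch in range(3):
        aty = [sum(r[i] * ref[ch] for r, ref in zip(rows, reference))
               for i in range(4)]
        coeffs.append(solve(ata, aty))
    return coeffs


def apply_affine(coeffs, pixel):
    """Apply the fitted map to one RGB pixel."""
    r, g, b = pixel
    return tuple(c[0] * r + c[1] * g + c[2] * b + c[3] for c in coeffs)


# Hypothetical known values for 6 chart squares (not real MBCCC data).
reference = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.8, 0.2, 0.2),
             (0.2, 0.7, 0.3), (0.2, 0.3, 0.8), (0.9, 0.8, 0.1)]


def simulated_camera(p):
    """Hypothetical camera: underexposed, with a color cast (assumed affine)."""
    r, g, b = p
    return (0.6 * r + 0.05 * g + 0.05, 0.5 * g + 0.02, 0.1 * r + 0.7 * b + 0.08)


measured = [simulated_camera(p) for p in reference]
coeffs = fit_affine(measured, reference)

# A color that is not on the chart is corrected by the same transformation.
skin = (0.75, 0.55, 0.45)
print([round(v, 3) for v in apply_affine(coeffs, simulated_camera(skin))])
```

Because the simulated distortion is itself affine, the fit recovers the original color almost exactly. A real camera response is nonlinear and noisy, so residual errors remain, as the results of this study show.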
Before the actual calibration, an image containing the region of interest and the MBCCC is acquired. Although calibration can correct some imaging defects, such as underexposure or overexposure and improper color balance, a single transformation is applied to all pixels, so illumination must be homogeneous over the field of view.
After acquisition, the calibration procedure involves loading the images into the calibration application and clicking the center of the first and last of the calibration squares. Then, the software detects and measures the average color of the squares, computes the RGB to sRGB transformation, and applies it to the whole image. This free software is available at http://uzdermis.ugent.be/yvdh (the home Web page of Y.V.H.) under the software menu.
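How 2 clicks can be enough to locate all squares may be illustrated as follows. This is a sketch under stated assumptions, not the algorithm of the software described above: it assumes the standard 6 x 4 layout of the 24-square chart, that the first and last squares are diagonally opposite corners, and that the chart is seen without perspective distortion or mirroring. The function names and the patch size are hypothetical.

```python
def patch_centers(first, last, cols=6, rows=4):
    """Estimate all square centers from the clicked first and last centers.

    Points are treated as complex numbers so that one complex factor
    captures both the scale and the rotation of the chart in the image.
    """
    p0, p1 = complex(*first), complex(*last)
    step = (p1 - p0) / complex(cols - 1, rows - 1)
    centers = []
    for row in range(rows):
        for col in range(cols):
            z = p0 + step * complex(col, row)
            centers.append((z.real, z.imag))
    return centers


def mean_color(image, center, half=2):
    """Average RGB in a small window; image is a list of rows of RGB tuples."""
    cx, cy = round(center[0]), round(center[1])
    pixels = [image[y][x]
              for y in range(cy - half, cy + half + 1)
              for x in range(cx - half, cx + half + 1)]
    return tuple(sum(p[ch] for p in pixels) / len(pixels) for ch in range(3))


# Clicks on the first and last squares of an upright chart, 100 pixels apart.
centers = patch_centers((100, 100), (600, 400))
print(len(centers), centers[1], centers[-1])
```

Averaging over a window well inside each square reduces the influence of noise and of small errors in the clicked positions.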
There are 2 important parameters that can be used to describe the performance of the calibration procedure: precision and accuracy. Precision, or reproducibility, is a measure of how close consecutive measurements of the same subjects are to each other. Accuracy refers to how measurements made with the device or procedure under investigation relate to measurements of the same subjects with a standard measurement device—in the case of color, a spectrophotometer or a chromameter.
To determine this precision and accuracy, images of the MBCCC were taken with 2 different digital cameras (Olympus CL2500 [Olympus Corp, Tokyo, Japan] and Canon 10D [Canon Inc, Tokyo, Japan]) at 2 zoom settings and using either the internal low-quality flash or high-quality studio flashes (8 images in total). For precision, simple pair-wise comparison of the measurements for each color square resulted in a set of dE*ab color differences. The average and 99th percentiles of these color differences are a good estimate for the average and largest color differences that can be expected between 2 measurements of the same subject.
To assess the accuracy of this procedure, we compared each color square measurement with the spectrophotometric measurement of that same square. The average and 99th percentiles of the resulting color differences are used as an estimate for the average and largest color differences that can be expected between 1 measurement and its spectrophotometric value. This is actually a measure of the accuracy and reproducibility together, but it is more relevant to the user than the accuracy alone.
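The 2 sets of statistics described above can be sketched as follows. With 8 images of 24 squares, the pair-wise comparison gives 28 image pairs and thus 672 color differences, and the comparison with the reference gives 192. The nearest-rank percentile definition, the function names, and the synthetic data are assumptions of this sketch.

```python
import math
from itertools import combinations


def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference between 2 L*a*b* triplets."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


def percentile(values, q):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def pairwise_differences(images):
    """Precision: compare every pair of images, square by square."""
    diffs = []
    for img_a, img_b in combinations(images, 2):
        diffs.extend(delta_e_ab(a, b) for a, b in zip(img_a, img_b))
    return diffs


def differences_to_reference(images, reference):
    """Accuracy: compare every measured square with its reference value."""
    return [delta_e_ab(m, r) for img in images for m, r in zip(img, reference)]


def summarize(diffs):
    """Return the mean and the 99th percentile of a set of differences."""
    return sum(diffs) / len(diffs), percentile(diffs, 99)


# Synthetic data: 8 images of 24 squares, image i being i units lighter.
images = [[(50.0 + i, 0.0, 0.0)] * 24 for i in range(8)]
reference = [(50.0, 0.0, 0.0)] * 24

print(summarize(pairwise_differences(images)))
print(summarize(differences_to_reference(images, reference)))
```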
The calibration process requires some user interaction, which may introduce variation in results. Consequently, the robustness of the calibration process was determined by comparing the results for 8 calibration runs of the same image of the 24 MBCCC color squares in a similar fashion as for the precision. A number of in vivo images, before and after calibration, are also presented.
At first, the calibrated images of the MBCCC look quite alike (Figure 1). A closer look at dermatologically relevant colors in the 2 most divergent MBCCC images reveals clear differences (Figure 2). Measurements of the calibration procedure repeated on the same MBCCC image reveal an average color difference of 0.4 dE*ab, with the 99th percentile at 1.2 dE*ab, which means the procedure is quite robust with regard to the user interaction needed to locate the chart in the image. The reproducibility of the whole calibrated imaging process (ie, including acquisition) is poorer, with an average color difference of 3.7 dE*ab and the 99th percentile at 10.4 dE*ab. Although this is still a sizable error, it is a considerable improvement compared with the reproducibility without calibration: on average 8.9 dE*ab, with the 99th percentile at 26.6 dE*ab (Figure 3).
Color differences between measurements and the real spectrophotometric values were on average 11.2 dE*ab, with the 99th percentile at 28.2 dE*ab, for the uncalibrated images; and on average 5.1 dE*ab, with the 99th percentile at 15.4 dE*ab, for the calibrated images (Figure 4). Again, calibration substantially improves accuracy.
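To put the reported figures side by side, the short calculation below divides each uncalibrated value by its calibrated counterpart. The numbers are those reported above; the function name and labels are illustrative.

```python
def improvement_factor(uncalibrated, calibrated):
    """How many times smaller the color difference is after calibration."""
    return uncalibrated / calibrated


# (uncalibrated, calibrated) color differences in dE*ab, as reported above.
reported = {
    "reproducibility, average": (8.9, 3.7),
    "reproducibility, 99th percentile": (26.6, 10.4),
    "accuracy, average": (11.2, 5.1),
    "accuracy, 99th percentile": (28.2, 15.4),
}

for label, (uncal, cal) in reported.items():
    print(f"{label}: {improvement_factor(uncal, cal):.1f} times smaller")
```

The factors range from about 1.8 to about 2.6, with the larger gains in reproducibility.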
Although spectrophotometric measurements were not made on in vivo skin, we present some examples (Figure 5) and compare potential measurements with and without calibration (Figure 6). When comparing the average color inside the 2 yellow rectangles in the left image of Figure 6 with the color in the corresponding rectangles in the right image of Figure 6 taken a few seconds later, measurements of 16.1 and 15.2 dE*ab are obtained for the uncalibrated images in the top row and 1.5 and 2.9 dE*ab for the corresponding calibrated images in the bottom row. This is the type of measurement that would be performed when doing a follow-up of a lesion in which the color difference with the surrounding skin is less relevant than the color difference with the initial lesion color (eg, ulcers). Clearly, in our case, these values should be as close to zero as possible because the subject has not changed at all between the images.
Comparison of the average color between the 2 rectangles of the same image gives color differences 12.1 and 12.0 dE*ab for the uncalibrated images in the top row and 9.5 and 9.1 dE*ab for the calibrated images in the bottom row. This type of measurement would be performed when monitoring lesions in which the color difference with the surrounding normal skin is of interest (eg, vitiligo, psoriasis, or tattoo removal). In our case, the values should be as close to each other as possible because the color difference between the lesion and healthy skin of the subject has not changed between the images.
It is clear from the results for the MBCCC that even after calibration, considerable variability remains in the images. Basically, if one wants to compare absolute color measurements between images, the differences need to be larger than about 10 dE*ab (the 99th percentile of the calibrated reproducibility) to be significant. Note that this does not hold for the comparison of different regions within 1 image, in which much smaller color differences will be statistically significant because we do not have to deal with variations between images. This is also true when comparing these color differences over several images.
For single absolute color measurements, the effect of calibration is not as spectacular, and the deviations that are still present between the image and spectrophotometric measurements may be too large, even though they have been reduced by a factor of about 2.
Ideally, when one tries to compare the color of the same areas of skin from a patient in 2 images taken only seconds apart, this result should, of course, be 0 dE*ab. However, it is difficult to remeasure exactly the same area on different images, and it is clear that the results from the calibrated images are much closer to this theoretical result than the results from the uncalibrated images.
Measurements of the color difference between areas in the same image, and comparison of these color differences over several images, will probably be used much more often. These types of measurements are more reliable because variations in the color content of the individual images are partly nullified by this in-image comparison. This is clearly visible in our results, which show in both the uncalibrated and calibrated images a similar evolution of the color differences between the lesion and healthy skin: from 12.1 to 12.0 dE*ab for the uncalibrated images, and from 9.5 to 9.1 dE*ab for the calibrated images. Both show an almost constant color difference, as is expected for images in which the subject has not changed. Note that here the images that were compared have been taken with the same digital camera and the uncalibrated images are not very divergent from each other; this will not always be the case.
In conclusion, the procedure presented in this article allows clinical images taken with commercially available digital cameras to be calibrated, provided that illumination in the field of view is uniform. This calibration vastly increases reproducibility and colorimetric accuracy of acquired images and allows better qualitative and real quantitative comparison between them.11,12 The most reliable measurements are those in which color differences between areas inside the same calibrated image are compared over several images of the same subject (eg, taken at different times). The free software can be downloaded from our Web site, but the calibrated MBCCC color squares need to be acquired separately and replaced about every 2 years.
Correspondence: Yves Vander Haeghen, PhD, Department of Dermatology, University Hospital, De Pintelaan 185, 9000 Ghent, Belgium (Yves.VanderHaeghen@UGent.be).
Financial Disclosure: None.
Accepted for Publication: August 24, 2005.
Author Contributions: Study concept and design: Vander Haeghen. Acquisition of data: Vander Haeghen. Analysis and interpretation of data: Vander Haeghen. Drafting of the manuscript: Vander Haeghen. Critical revision of the manuscript for important intellectual content: Naeyaert. Statistical analysis: Vander Haeghen. Obtained funding: Naeyaert. Administrative, technical, and material support: Naeyaert. Study supervision: Vander Haeghen.
Funding/Support: This research was partially funded by Pierre Fabre Dermo-cosmetique, Boulogne, France.