Background
In 2000, 41 million people searched for medical information online. The quality of that information is unregulated, variable, and not well studied.
Objective
To quantify and compare the numbers and types of Internet sites matched for 10 diverse skin conditions through different search systems.
Design
Search strategies were performed at 6-month intervals via Netscape Navigator, using 3 search engines and 1 directory. Ten skin conditions were chosen to represent common (psoriasis and eczema), cosmetic (wrinkles and cellulite), difficult-to-treat (alopecia, mastocytosis, granuloma annulare, and xanthoma), and uncommon (dermatitis herpetiformis and epidermolysis bullosa) problems. Search strings were designed to generate lists of Web sites that provide educational or product-related information. Results were compared.
Setting
The Saint Louis University information technology server, July 9, 1999, December 16, 1999, and February 5, 2000.
Main Outcome Measures
Comparisons of the total number, top 10 ranking, and type (educational vs product-related) of sites that matched through different search systems at 6-month intervals.
Results
The total number of matched sites for different skin conditions varied up to 100-fold. This number increased by 30% to 316% between July and December 1999. The largest number of Web sites related to wrinkles, followed by Web sites related to common conditions. Product-related sites outnumbered educational sites, especially for common and cosmetic conditions. Although there were differences in the total number of sites found through different search engines, the ratios of product-related to educational sites were similar. Different search engines yielded different top 10 match lists for the same condition. The top 10 lists included higher proportions of educational sites than the total match lists for all conditions except cellulite. Within the top 10 lists, the rank order of well respected sites varied by search engine used and changed over time.
Conclusions
Patients are increasingly accessing the growing body of data available through the Internet. Most Web sites contain information related to products. Until standards are enacted to govern the distribution of online medical information, consumers are at risk for obtaining misinformation and buying ineffective products. To better guide patients, physicians must become familiar with this ever-changing information.
In recent years, the Internet has become a useful tool for gathering data. More people are turning to the Internet to find information about their illnesses and treatment options. In 1999, 25 million people searched for health and medical information online. This was expected to grow to 33 million in 2000,1 but was reported at a staggering 41 million.2 The number of Internet sites containing medical information has correspondingly skyrocketed. In 1997, more than 10 000 health-related English-language sites existed on the Web,3 while in 1998, more than 25 000 Web sites were health-related.4 In 2000, 4.9 million people shopped for health products online.2 Consumers must be cautious about receiving medical products and advice over the Internet, because many sites are market-driven and not regulated for scientific accuracy. Although some standards have been developed in recent years, including those of the American Medical Association, Chicago, Ill, and the code of e-Health ethics,5 consumers will be at risk for obtaining misinformation and ineffective products until a "reliability seal" is used. The aims of this study were to quantify and to compare the numbers and types of Internet sites matched for 10 diverse skin conditions through different search systems.
We used a Netscape Navigator (version 4.0; Mountain View, Calif) browser, selected 3 search engines, and performed similar searches July 9, 1999, and December 16, 1999. Excite (http://www.excite.com) was used as the primary search engine. AltaVista (http://www.altavista.com) and Northern Light (http://www.northernlight.com) were used for comparison. These search engines are characterized by the size of their library (index), mode of matching, and system of ranking matched sites. Excite is one of the most popular search services on the Web. It browses a medium-sized index and is used by America Online's NetFind (http://www.aol.com) and by Netscape as their default search engines. AltaVista has one of the largest indexes of any search engine.6 Its advanced search syntax is powerful and flexible and allows for detailed queries, making it popular among experienced Web searchers. Excite and AltaVista were established in 1995, and Northern Light opened to general use in August 1997. Northern Light is well organized and powerful, with a large index.7 We also used Yahoo (http://www.yahoo.com) as an example of a search directory. A directory's index is much smaller than that of a search engine, composed of a group of personally submitted Web sites.
Using these 3 major search engines and 1 directory, we compared the results of the following Internet searches. The names of 10 skin conditions were used as the main search strings: psoriasis, eczema, wrinkles, cellulite, alopecia, mastocytosis, granuloma annulare, xanthoma, dermatitis herpetiformis, and epidermolysis bullosa. The use of specific syntax in an advanced search may generate more sites than a simple search. For example, in AltaVista's simple search, a disease name alone may find different results from those of a disease name surrounded by quotation marks and preceded by a plus sign, which forces an exact match to be present in the results. However, in our search, a change of syntax did not yield different results, ie, the same number of sites was generated by using dermatitis herpetiformis or +"dermatitis herpetiformis." For the searches conducted through Excite or Northern Light, we used the following as the search string to find the sites that were related to products: the name of the disease AND (amazing OR breakthrough OR sale OR products). For the educational sites, we used the following as the search string: the name of the disease AND education. For the sites devoted strictly to education, we used the following as the search string: the name of the disease AND education NOT (amazing OR breakthrough OR sale OR products OR clients) (Table 1). AltaVista's search syntax necessitated a slightly different but equivalent string through its advanced search option.8 In addition, for a subset of 6 conditions, we searched the top 10 matched Web sites and classified them into either an educational or a product-related category. For 3 major conditions, we specifically looked for well respected sites, such as the National Alopecia Areata Foundation (http://www.alopeciaareata.com), among their top 10 lists. This subset of searches was conducted July 9, 1999, and February 5, 2000.
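The Boolean search strings described above for Excite and Northern Light can be assembled mechanically. The following is a minimal sketch in Python, not the authors' procedure: the function names and dictionary keys are hypothetical, and the condition name is assumed to be inserted unquoted, which the text does not specify.

```python
# Promotional terms taken from the search strategy described in the text.
PRODUCT_TERMS = ["amazing", "breakthrough", "sale", "products"]


def or_group(terms):
    """Join terms into a parenthesized OR group, e.g. (a OR b)."""
    return "(" + " OR ".join(terms) + ")"


def build_queries(condition):
    """Return the four search strings for one condition (hypothetical helper)."""
    product_group = or_group(PRODUCT_TERMS)
    # The strictly educational search also excludes the term "clients".
    exclusions = or_group(PRODUCT_TERMS + ["clients"])
    return {
        "total": condition,
        "product": f"{condition} AND {product_group}",
        "educational": f"{condition} AND education",
        "strictly_educational": f"{condition} AND education NOT {exclusions}",
    }


queries = build_queries("psoriasis")
for label, query in queries.items():
    print(f"{label}: {query}")
```

AltaVista required a differently phrased but equivalent string, so a real implementation would likely need one builder per search system.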
The significant increase in Web sites
The total numbers of matched sites for different skin conditions found through Excite in July and December 1999 are shown in Figure 1A. Among the diseases that we investigated, wrinkles was associated with the largest number of sites: 13 545 in July and 19 159 in December (a 41% increase during this period). Xanthoma was associated with the fewest sites: 103 in July and 274 in December (a 166% increase). The other diseases in the study were ranked according to their number of sites as follows: psoriasis, 7071 in July and 12 668 in December (a 79% increase); eczema, 6004 in July and 8272 in December (a 38% increase); cellulite, 3081 in July and 4004 in December (a 30% increase); alopecia, 2316 in July and 3039 in December (a 31% increase); epidermolysis bullosa, 427 in July and 1776 in December (a 316% increase); dermatitis herpetiformis, 375 in July and 909 in December (a 142% increase); mastocytosis, 185 in July and 664 in December (a 259% increase); and granuloma annulare, 135 in July and 297 in December (a 120% increase). The total numbers of matched sites were different for searches through AltaVista and Northern Light (see the third subsection in this section and Figure 2A), but the ratios of different conditions were comparable (data not shown).
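The percentage increases quoted above follow from the July and December counts. As a check, the short Python sketch below recomputes them, assuming the figures were rounded to the nearest whole percent; the variable and function names are illustrative only.

```python
# Excite match counts reported in the text: (July 1999, December 1999).
EXCITE_COUNTS = {
    "wrinkles": (13545, 19159),
    "psoriasis": (7071, 12668),
    "eczema": (6004, 8272),
    "cellulite": (3081, 4004),
    "alopecia": (2316, 3039),
    "epidermolysis bullosa": (427, 1776),
    "dermatitis herpetiformis": (375, 909),
    "mastocytosis": (185, 664),
    "granuloma annulare": (135, 297),
    "xanthoma": (103, 274),
}


def percent_increase(before, after):
    """Percentage growth from `before` to `after`, rounded to a whole percent."""
    return round((after - before) / before * 100)


increases = {
    condition: percent_increase(july, december)
    for condition, (july, december) in EXCITE_COUNTS.items()
}

for condition, pct in sorted(increases.items(), key=lambda item: item[1]):
    print(f"{condition}: {pct}%")
```

Sorting by growth makes the pattern visible: the conditions with the smallest July counts cluster at the high end of the range.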
The total number of matched sites varied up to 100-fold for different skin conditions and increased for all conditions during the second half of 1999. Diseases with fewer sites initially had larger percentage increases over time.
The product-related sites vs the educational sites
The percentages of the product-related sites were higher than those of the educational or strictly educational sites for most conditions that we investigated (Figure 1B). For example, 66% of psoriasis-related sites were devoted to products, and only 7% were designated solely for educational purposes. The difference between the number of product-related sites and educational sites was dramatic for the common conditions (psoriasis and eczema) and cosmetic concerns (wrinkles and cellulite). The difference was less significant for epidermolysis bullosa, granuloma annulare, and xanthoma. An exception was mastocytosis, for which there were slightly more educational sites than product-related sites.
The percentage of sites that included products increased with the total number of sites (Figure 1). For example, psoriasis, with the second highest total number of sites (n = 7071), had 4700 sites (66%) containing information about products. Conversely, xanthoma, with the fewest sites (n = 103), had only 10 sites (10%) related to products. The percentages of educational sites (which may involve some products) and strictly educational sites were similar among different skin conditions regardless of their total number of sites (Figure 1B). For example, there were 7071 sites for psoriasis, and only 7% were devoted purely to educational purposes. Similarly, there were 103 sites for xanthoma, and only 4% were related solely to education.
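The product-related shares quoted above are simple proportions of the total match counts. A brief Python sketch, using only the two conditions for which the text gives raw counts and assuming rounding to the nearest whole percent, reproduces them; the names are illustrative.

```python
# Counts reported in the text for Excite, July 1999:
# (total matched sites, product-related sites).
REPORTED = {
    "psoriasis": (7071, 4700),
    "xanthoma": (103, 10),
}


def share_percent(part, total):
    """Share of `part` in `total`, rounded to a whole percent."""
    return round(part / total * 100)


product_share = {
    condition: share_percent(product, total)
    for condition, (total, product) in REPORTED.items()
}

for condition, pct in product_share.items():
    print(f"{condition}: {pct}% product-related")
```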
The difference between search engines
There is a difference in the total number of sites found through different search engines.
Excite yielded the fewest sites and Northern Light yielded the most for all conditions except cellulite (Figure 2A). In general, Excite found fewer than one third as many sites as Northern Light. The Yahoo search directory yielded a small fraction of the Web sites found by the search engines. For example, in July 1999, Yahoo matched 57 sites for wrinkles, compared with 13 545 matches found through Excite. Specialized search strings attempted on Yahoo yielded few and often no matches.
Although the total numbers of sites found by different search engines were different, the percentages of the product-related sites and strictly educational sites found by these engines were similar. For example, in July 1999, Northern Light matched 26 790 sites for eczema, AltaVista matched 24 060 sites, and Excite matched only 6004 sites. However, the percentages of product-related sites found by these engines were comparable: 58% for Excite, 55% for AltaVista, and 50% for Northern Light (Figure 2B). Similarly, the percentages for eczema educational sites in Excite, AltaVista, and Northern Light were 7%, 9%, and 7%, respectively (Figure 2C).
Internet users usually focus on the top 10 matches. Therefore, we investigated the nature of the top 10 sites for a subset of 6 skin conditions and found little overlap among the match lists generated by the 3 search engines. The largest overlap occurred for psoriasis and dermatitis herpetiformis, for which 3 of Excite's top 10 list were also found by Northern Light and AltaVista. There was no overlap among the 3 top 10 sites for alopecia and xanthoma.
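Overlap among top 10 lists of this kind can be measured as a set intersection. The Python sketch below illustrates the idea with made-up placeholder URLs, not the study's data; the normalization rule (lowercasing and dropping a trailing slash) is an assumption about how two listings might be judged to be the same site.

```python
def normalize(url):
    """Treat URLs that differ only in case or a trailing slash as the same site."""
    return url.lower().rstrip("/")


def shared_sites(*top_lists):
    """Return the sites that appear in every one of the given match lists."""
    site_sets = [{normalize(url) for url in match_list} for match_list in top_lists]
    return set.intersection(*site_sets)


# Hypothetical, shortened match lists (placeholders, not real results).
excite = ["http://a.example/", "http://b.example", "http://c.example"]
altavista = ["http://A.example", "http://d.example", "http://c.example/"]
northern_light = ["http://c.example", "http://a.example", "http://e.example"]

common = shared_sites(excite, altavista, northern_light)
print(sorted(common))
```

An empty result corresponds to the situation reported for alopecia and xanthoma, where no site appeared on all three lists.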
Compared with the total search results, a higher percentage of the top 10 sites was educational (Figure 3). For example, 50% to 60% of the top 10 psoriasis sites were educational, compared with fewer than 10% among the total matched psoriasis sites. The exceptions were the top 10 lists for cellulite, which included 8 to 10 product-related sites.
The percentages of product-related and educational sites within the top 10 lists depended more on the nature of the skin condition than the search engine. For common diseases, such as psoriasis and eczema, at least half of the top 10 sites generated by the 3 search engines introduced products (Figure 3). For cellulite, a cosmetic concern, all but 2 sites among the 3 top 10 lists introduced products. In contrast, for the uncommon and difficult-to-treat diseases, dermatitis herpetiformis and xanthoma, all but 2 sites among the 6 top 10 lists offered educational information. We also surveyed the top 10 lists for some well respected sites (Table 2). The rank order of these sites varied by search engine and changed over time. For example, the National Alopecia Areata Foundation was not among the top 10 lists generated by any search engine in July 1999, but in February 2000, it was the number 1 match on Excite and AltaVista and was number 2 on Northern Light.
Patients seeking medical information are increasingly using the Internet. In 1997, 22 million users searched health topics online. In 1999, the number climbed to 25 million and was expected to reach 33 million in 2000,9 but estimates exceeded these expectations by 24%.4 Worldwide, there are now more than 100 000 biomedical Internet sites.10 Medical products and medical care are available online.9 The Internet has provided consumers with benefits such as quick and easy access, but the line between objective information and promotional content is often blurred. The risks are misinformation and money wasted. Although many attempts have been made to rate Internet medical information sites, these Web-based instruments have not been validated11 and add little to the morass of online information. In this study, we surveyed Internet information for 10 skin conditions covered on English-language Web sites. We were particularly interested in the potential to access different information through different search systems and to compare the number of sites devoted to sales and products with those related to education.
Our data show that the number and the nature of matched sites are first determined by the prevalence of the condition. Common diseases are the subjects of many more Web sites than unusual conditions. In addition, for common diseases and cosmetic concerns, Internet information is biased toward products. These findings are not surprising, because the number of consumers seeking information is proportional to the prevalence of the condition. Common diseases are obvious targets for marketing. Uncommon conditions had the fewest Web sites but showed a disproportionately large increase in the number of new Web sites during the second half of 1999. Epidermolysis bullosa and mastocytosis showed the greatest increases. This may reflect an increased prevalence of Web site testimonials of personal trauma experienced by individuals with these disorders.
The unique designs of search systems affect search results. Search systems are a Web browser's primary guides. As of February 1999, there were 800 million estimated Web pages stored on the World Wide Web. At most, only 16% of Web sites are reached by the most comprehensive search systems.6,12 There are 2 types of search systems: directories and search engines. Yahoo, a directory, searches a personally screened index composed of submitted Web site summaries and looks for matches only in those summaries. Search engines, such as Excite, AltaVista, and Northern Light, are characterized by their larger, unique computer-generated indexes. A search engine's index is screened to match and rank a subject by the unique search engine software. Excite's index library contains 1.5 million fully indexed Web pages, while AltaVista and Northern Light have indexes of more than 16 million.13 The results of our study were consistent with the general experience of search engines' yielding more extensive match lists than directories. In addition, different search engines yielded varying numbers of potentially informative sites. Excite, with its small index, located only a fraction of the sites found by the other 2 search engines.
Despite the different total numbers of matched search sites, the proportions of sites devoted to specific skin conditions, as well as those related to products and education, were similar among the different search systems. All the search engines used in our study yielded a larger number of product sites than educational sites. This result suggests that a browser's choice of search engine is unlikely to affect exposure to sales gimmicks.
Although the relative yields of educational and product-related sites were similar for different search engines, they generated different top 10 lists. Each search engine follows a unique set of rules to rank its matches. For all search engines, the location and frequency of the keyword are the most important factors in ranking Web sites. Keywords in the title or near the beginning of the text carry more weight than those appearing later in the Web page. Web sites with frequently appearing keywords are usually considered more relevant. In addition to these rules, each search engine has its own well guarded scheme in ranking sites. This variation yielded different top 10 lists that matched through different search engines. Compared with the total match lists, the top 10 lists included higher proportions of sites devoted to education vs those promoting products, except for cellulite. However, 50% to 100% of the top 10 lists for common and cosmetic conditions were product-related sites. This finding suggests that consumers should be especially cautious when seeking information about common and cosmetic skin conditions.
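The general ranking principle described above, in which keyword location and frequency drive relevance, can be illustrated with a toy scoring function. This Python sketch is not any search engine's actual algorithm, which the text notes is proprietary; the weights, the 50-word cutoff for "near the beginning," and the sample pages are all arbitrary assumptions, and only single-word keywords are handled.

```python
import re


def tokens(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_score(keyword, title, body,
                  title_weight=5.0, early_weight=2.0, early_words=50):
    """Toy relevance score: title hits count most, early body hits next."""
    word = keyword.lower()
    body_words = tokens(body)
    title_hits = tokens(title).count(word)
    early_hits = body_words[:early_words].count(word)
    late_hits = body_words[early_words:].count(word)
    return title_weight * title_hits + early_weight * early_hits + late_hits


# Hypothetical pages: one names the keyword in its title and first sentence,
# the other mentions it once after 90 words of unrelated text.
score_a = keyword_score("psoriasis", "Psoriasis Foundation",
                        "Psoriasis is a chronic skin condition.")
score_b = keyword_score("psoriasis", "Skin care shop",
                        "Creams and lotions. " * 30 + "psoriasis")

print(score_a, score_b)
```

Because a score like this rewards keyword placement rather than accuracy, a page can rank highly without being reliable, which is consistent with the caution the authors draw from their top 10 lists.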
This study was designed to distinguish between educational and product-related Internet sites, but many Web pages are mixed. Medscape (http://www.medscape.com) and WebMD (http://www.webmd.com), for example, have features of educational value but are inherently commercial, as they are supported by industry. To more specifically distinguish between sites aimed at patient support rather than marketing, we examined the top 10 lists generated by different search engines for psoriasis, eczema, and alopecia areata, to locate the home pages for well established and well respected patient support organizations. The chances of finding those sites among the top 10 were better in February 2000 than in July 1999, although in one third of the cases, the well respected sites were not number 1. Because many patients misinterpret a search engine's ranking as an expression of quality,14 they can be misled by choosing the number 1 Web site. For example, in February 2000, patients who used Excite to search eczema would consider the number 2 hit, Eczema Mailing List (http://website.lineone.net/~eczema), more reliable than the number 5 hit, National Eczema Association for Science and Education (http://www.eczema-assn.org/index.html), while in fact, the opposite is true. Therefore, in Web browsing, the rank of a site is not necessarily related to its quality.
Health information online is dramatically increasing. During the second half of 1999, the number of Web sites we found for 10 skin diseases increased by 30% to 316%. As patients increasingly use the Internet to learn about their medical conditions and treatment options, physicians can help guide them by becoming and staying familiar with this ever-changing information. The Hardin Meta Directory of Internet services (http://www.lib.uiowa.edu/hardin) provides user-friendly lists of links on medical topics. Medical societies or patient advocacy groups assist dermatologists and their patients in gathering reliable information from other Web sites, such as those listed in Table 3. Informative Web sites sponsored by the federal government include healthfinder (http://www.healthfinder.gov), the Food and Drug Administration (http://www.fda.gov), and PubMed, the world's largest medical database (http://www.ncbi.nlm.nih.gov/PubMed). Additional informative sites that could be valuable to the reader in evaluating the reliability of a World Wide Web site include the Committee for the Scientific Investigation of Claims of the Paranormal (http://www.csicop.org), The Scientific Review of Alternative Medicine (http://www.hcrc.org/sram), Quackwatch (http://www.quackwatch.com), and Healthcare Reality Check on-line (http://www.hcrc.org). Finally, on the site of the Clinical Digital Libraries Project Web Quality Bibliography (http://www.bama.ua.edu/~smaccall/qualitybib.html), there is an extensive list of articles discussing the quality of medical information.
Accepted for publication June 18, 2001.
Corresponding author and reprints: Elaine Siegfried, MD, Central Dermatology, 1034 S Brendwood Blvd, Suite 600, St Louis, MO 63117 (e-mail: siegfried@centralderm.com).
References
2. Gorman C. Doctors without borders. Time. March 2001;(special issue):36.
3. Mayo Health O@sis for free medical info. Link-up. 1997;14:1-8.
4. Izenberg N, Lieberman D. The Web, communication trends, and children's health: the Web and the practice of pediatricians. Clin Pediatr (Phila). 1998;37(pt 2):215-221.
9. Stolberg S. From M.D. to I.P.O.: chasing virtual fortunes. New York Times. July 4, 1999;sec 4:3.
10. Eysenbach G, Sa ER, Diepgen TL. Shopping the Internet today and tomorrow: towards the millennium of cybermedicine. BMJ. 1999;319:1294.
11. Jadad AR, Gagliardi A. Rating health information on the Internet: navigating to knowledge or to Babel? JAMA. 1998;279:611-614.
14. Eysenbach G, Diepgen TL. Patients looking for information on the Internet and seeking teleadvice: motivation, expectations, and misconceptions as expressed in e-mails sent to physicians. Arch Dermatol. 1999;135:151-156.