Introduction [1-5]

Many people use search engines to find the information they use to help make personal health decisions. Search engines and the Internet have vastly improved access to health information for many consumers. However, search processes and results vary considerably among search engines, and are not transparent to consumers. The criteria used to identify and rank health-related Web sites vary among search engines, and are often not apparent to consumers. Search results may be affected by the structure of content on health Web sites, consumer search terminology, and the use of paid placements by the search engine. In short, research on health searches suggests that the process by which consumers locate health information on the Internet and the evaluations they make regarding which Web sites to review are important variables in the quality of information they ultimately view and use. Improved understanding of factors influencing online searches will facilitate technical and educational approaches for maximizing quality and benefit of health searches.

Methods

In 2003, URAC and Consumer WebWatch (CWW), a project of Consumers Union, carried out a project funded by the Robert Wood Johnson Foundation to examine factors influencing the results of online health searches and to develop an agenda for future research and development that would improve the results of health searches. We reviewed published literature and industry reports, and convened two stakeholder groups consisting of consumers, quality experts, search engine experts, researchers, health-care providers, informatics specialists and others.

Literature Review Method

Our literature review was not exhaustive: its purpose was to provide a baseline understanding of consumer, Web site, quality measurement, and search engine factors that influence the results of searches for health information.
We conducted a search of key terms in the Cumulative Index to Nursing and Allied Health Literature (CINAHL), Medline, PubMed, Expanded Academic ASAP, Lexis-Nexis, ProQuest, Ingenta, and related databases in health care, information science, and computer science. The initial searches took place in early 2003, but citations were added as they were identified. Where initial searches revealed poor topic coverage, we included associated reference lists, books, and other media considered to inform the topic. The following search terms were included: Web-based, Web site, information quality, Web search, consumer health, eHealth, health information, search engine, information retrieval, information seeking. We also examined bibliographies of articles retrieved by electronic searches and solicited recommendations from members of the project advisory committee. We discontinued searching in specific topic areas when project staff believed we had adequately described current understanding of key issue areas.

Methods for Convening Stakeholder Meetings

An open announcement about the project and recommendations from industry leaders helped identify interested stakeholders, and participants were selected by URAC and CWW with guidance from a project advisory committee. Not everyone invited was available to attend. We attempted to achieve a balance of different stakeholders at each meeting. Meetings were held in California and Washington, DC to facilitate participation. The purpose of each stakeholder meeting was to review existing knowledge about results of consumer searches for health information, and to develop recommendations for additional research, technical improvements, and educational approaches needed to improve the results of online consumer searches for health information.
Participants reviewed the summary recommendations presented in this article after the meeting and had the opportunity to comment, but were not asked to vote on or endorse the recommendations. [2]

Results

Results of Literature Review

How Consumers Use the Internet to Locate Health Information [6, 7]

Consumer Search Strategies [8-10]

Comprehension, Literacy, and Access Issues [4, 11, 12]

Physician Responses to Internet Information [13, 14]

Consumer Evaluation of Web Site Credibility [9, 15]

How Web Sites Influence Availability of Quality Health Information

Techniques for Conveying Information about Web Site Content

The structure of a Web site influences how information can be retrieved from the site by a search engine, as well as the usability of the site for consumers. Coding and structure of Web sites can facilitate retrieval by search engines or can pose a barrier to information retrieval. Coded information on a Web site is processed through the search engine algorithm, and determines whether and how the site is ranked in search returns. The same tags and codes that can be used to highlight information on a legitimate Web site may also be used by "spoofers" who try to lure traffic onto the site. [16]

Quality Indicators for Web Site Content [16-19]

These findings suggest that additional research is needed to identify indicators of content quality, and to correlate consumer preferences to quality indicators. Sites that include content correlated with popularity may best meet the public's desire for health information. Current search algorithms may not be in agreement with quality clinical indicators and performance measures currently used throughout the health-care industry.

Codes of Conduct

A wide range of tools has been developed to help site developers produce good-quality sites and to help consumers assess the quality of sites. Adherence to accepted codes could theoretically be used as a factor in searches.
Ratings instruments include codes of conduct, quality labels, user guides, filters, and third party certification. The value of these tools is unclear: studies have demonstrated that consumers do not routinely seek out information on certifications or adherence to voluntary codes. However, it is assumed by many that such codes benefit consumers indirectly by influencing Web site behaviors and practices. For example, most standards require sites to implement privacy protections and disclosure of site information as consumer protections. No research has been done on the effect of compliance with a code of conduct on Web sites. [20-23]

Web Site Rating Instruments [24-29]

Discussion

Search Engines and Mediators of Health Information

Electronic and Human Mediation [30]

How Search Engines Work

Search engines and Web directories play a central role in facilitating access to health information. Web directories are organized Web site listings put together by human reviewers. Search engine listings are put together by automated systems and lack a navigable structure. Directories usually concentrate on indexing Web sites, while search engines typically index individual Web pages. Consumer searches for keywords will result in a valid match only if the keyword appears in the Web site's description. Hybrid models of search engines and directories are common.

Search Engine Indexing and Retrieval Methods [31]

Current metrics for evaluating search engines include initial page retrieval capacity and the ability to revisit Web sites to update information. Currency of information, as demonstrated by elimination of non-working links to Web sites, is also a performance metric. These criteria are features of business performance, not necessarily the content relevance or quality of the sites returned by a search. Content and format of Web sites determine how they are indexed by search engines.
Some search engines use keyword location, frequency, phrasing, and density as indexing and ranking factors. Type and number of links associated with a Web site are common indexing factors. Web sites also use tags to identify certain types of information. Search engine databases include only Web sites that have been registered with or indexed by the search engine; hence the importance of Web site developers making their sites accessible to automated agents, or becoming known to directory developers.

Ranking and Ratings

Ranking of sites in the final display of search results is of great importance to Web sites, users, and search engines. Ranking effectively drives the likelihood of particular sites being recognized and visited because, as noted, consumers rarely look at more than three pages of results. A poorly designed or executed search may produce an unwieldy list of Web site results that is difficult to navigate. Alternatively, searches that are too narrowly drawn may omit important sites. [32, 33]

Mediated Searches

Mediated searches may be as simple as having a librarian assist with a search, or they may be based on much more complex algorithms. Participants in the URAC/CWW stakeholder group noted that medical and general librarians play an important role in helping large segments of the population retrieve online information and learn effective search strategies. More complex mediated search strategies employ both human mediation and electronic queries to interface with users and focus a search. [34-36]

Stakeholder Discussion of Literature Review

Research Needs to Address Consumer Evaluation of Web Quality

There is great variation in how consumers seek information via the Internet, and in how successful they are in searching for health information. Because consumers vary so much in how they search for health information, search algorithms that support variation and still return expected results will meet consumer needs most effectively.
Additional research is needed on information needs of different consumer segments and how to effectively educate differing consumer segments to improve the results of their health searches. Research is needed on how to efficiently validate the quality of Web sites and communicate this information to consumers.

Research Needs for Web Site Quality Indicators [29]

As noted, gateways filter information to increase its relevance to consumers and provide expert assessment of the validity of sources. It may also be useful to develop more sophisticated search models for providing useful and relevant information to consumers via customization approaches. Such approaches could potentially be embedded in search algorithms. In addition, more research is needed on the impact of Internet-based health information on outcomes. The benefits and risks of health information, both from a health outcome and a system outcome (quality, cost), are poorly understood and should be examined further.

Research Needs for Search Factors Influencing Search Results

Search engines are increasingly important as a tool for locating and organizing information from the vast Internet resource. The volume of information on the Web is so significant that consumers may need different types of mediators, such as search engines or librarians, to help manage the volume of information. Human assistance is also helpful to counteract electronic spoofing and to help consumers overcome limitations in their search strategies. To effectively improve health searches, more information is needed about search algorithms and how quality factors are identified in the algorithms. Search engines are also developing technology to search for synonyms, which may enhance health searches conducted by laypersons. It may also be helpful for search engines to develop methods to distinguish health-related searches from other types of searches, rather than using a simple word match.
Search technology that intuits consumer needs more effectively and learns from repeated searches could help search engines steer consumers to quality results. New technologies may ultimately be more effective than electronic filtering, requiring consumers to apply filters, or having consumers modify their search strategies. With technology advances, search engines may be able to identify quality proxies that could improve page rankings of high-quality Web sites. Search engines could, for example, give higher ranking to "official sites" for diseases. They could also piggyback onto credibility assessments provided by groups such as healthfinder.gov, or give higher ranking to sites listed in directories from trusted independent sources. Ultimately, adoption of technological solutions depends on the ability of researchers to understand the relationship between electronic proxies for quality and actual quality of content.

Discussion of Stakeholder Recommendations for Next Steps

Textbox 1. Recommendations of the Group

Leadership for Health Search Improvement
Organizations concerned about the quality and accessibility of health information online should continue to collaborate to promote "health search literacy."
Collaborators should convene a leadership summit on health search literacy to discuss feasibility and implementation of many of the recommendations herein.
Collaborating organizations should work with funding organizations to develop a comprehensive long-term research agenda to improve health searches and increase access to quality health information, and to develop enhanced research methodologies to evaluate the quality, impact, and effectiveness of online health information.

Consumer-directed Tools
Create tools to support consumer health-information needs, including preset, prescreened health bookmarks and more guidance on how to reach health gateways and portals containing trusted health content.
Develop and circulate a public domain brochure on health search strategies that could be branded and distributed by physicians, employers, health plans, and others to educate consumers.
Develop public domain interactive, validated search strategy content pages that could be branded and used by health Web sites.

Research Needs
Identify the search needs and capabilities of diverse populations of searchers, including culturally diverse users and searchers with health needs of differing intensity and severity.
Develop more understanding about how consumers interpret online health information, assess its credibility, and make health-related decisions.
Research the relationship between consumer search strategies and consumer expectations for results to determine effective approaches for conveying information on the Internet.
Research factors affecting physician assessments of Web-based information and how quality content affects physician recommendations to patients about online health-information resources.
Assess the relationship between expert accreditation, quality seals, ratings and content quality, as well as the impact of such endorsements on both consumer behavior and Web site behavior.
Research the correlation between Web site traffic volume and consumer satisfaction, particularly for health Web sites where there is variation in dimensions of quality such as accuracy, comprehensiveness, ease of navigation, and reading level.
Evaluate content quality of Web sites in different domains (eg, .gov, .edu, .com, and .org) to identify similarities and differences related to quality within and across categories of Internet domain names.
Evaluate the impact of Internet-based health information on health outcomes: utilization, behavior change, knowledge, burden of illness and disease, or other measures.
Research the relative effect of each component of a search algorithm (word frequency and placement, links, etc) for finding health information.
Validate elements of some search algorithms, such as link frequency, as indicators of value/quality. [5]

Education Agenda
Develop models for offering health search education at teachable moments and in diverse consumer settings.
Promote dissemination of existing educational tools and resources to assist consumers in evaluating health information on the Web more effectively.
Develop user-appropriate tools and approaches to assist Internet users with special needs. High-priority user groups may include people with disabilities, people with low literacy, and non-English speakers.
Urge provider organizations to educate provider members on the value of offering Internet information and interactive learning recommendations as part of the therapeutic intervention.
Educate health Web site developers on how to make information easy to find and how to meet the content level of their intended users.
Urge education organizations, in collaboration with health organizations, to develop a school-based or publicly available health search curriculum.

Technology Improvement Agenda
Continue to develop interactive features on search engines and sites to customize and personalize health searches.
Develop more functionality for search engines to enhance selected health queries by offering additional relevant information.
Develop technological markers or indicators that could be uniformly applied by Web site developers to indicate accuracy and comprehensiveness of health Web sites.
Develop codes to indicate when information on a Web site supersedes previous information.
Develop collaborations between health quality and search engine experts to develop codes for validated quality proxies.
Develop search technology similar to that used in the commercial sector to direct consumers to related, relevant information based on both searching and viewing behaviors.
Enhance personalized searches by building search engine capability to "learn" from repeated searches and user behavior.

Expanding the Market for Quality
Develop a health equivalent of "BizRate" or "eBay" surveys that can be used by consumers to evaluate Web sites after viewing. Existing models for such a survey could be adapted and disseminated.
Sponsor a competition for individuals or organizations to design a search algorithm that returns the most credible health results as evaluated by experts.
Design a separate contest for the most effective business plan to make the business case for building quality factors into health searches.

Conclusion

The Internet has opened a vast library of information to consumers of health information and made that information more accessible than ever before. This represents a significant step forward for consumers. However, the volume of information and its variable quality have created new interpretive challenges. Now, one great challenge is helping consumers find the information they want that is also accurate, reliable, and presented in an accessible format. Searches for health information rely on a complex interplay of search algorithms, Web site content and coding, and consumer behaviors. The recommendations presented here address each of those factors with ideas for further research as well as more immediate recommendations for action. This agenda is a start at maximizing the potential of the Internet to deliver high-quality health information for diverse users.