Introduction

Methods

Sample

Twelve students from 1 middle school (N = 4) and 2 high schools (N = 4 and N = 4) in southeast Michigan were recruited for this study. Staff at each school were asked to select 4 students who were (a) comfortable using computers, (b) comfortable searching for information on the Internet, and (c) strong students who could afford to miss one class period. Students received a University of Michigan T-shirt, valued at roughly $8, in return for their participation.

The parent or guardian of every student signed an informed consent document that described the purpose and procedure of the study. Students also signed separate assent forms with similar information. The University of Michigan Behavioral Science Institutional Review Board approved this study and the consent and assent documents.

Data Collection

All observations of adolescents were conducted during January 2002. Each school provided a room in which to conduct the observations. Students were brought to the observation room one at a time. Two researchers were present at every observation. For each student, one of the researchers first reviewed the assent form to introduce the project and obtain the student's permission to participate. The students were then asked 14 questions about demographics (age, race/ethnicity, and gender) and their prior computer use (eg, how often they use computers or the Internet, what health topics they have searched, which search engines they used, and whether they have a computer and access to the Internet at home).

Table 1

Topics for the health-related questions were chosen based upon responses to a survey of adolescents conducted by the Kaiser Family Foundation (written communication, 2001 Dec; Generation RX.com Survey printouts; V. Rideout, Henry J. Kaiser Foundation, Menlo Park, CA).
Certain topics, including homosexuality, teen pregnancy, and abortion, were purposely avoided so as not to expose participants to overly controversial information.

Data Analysis

After all the observations were completed, 3 researchers (a physician, a health educator, and a human-computer interface specialist) met as a group to review the real-time coding results and to clarify or augment the coding scheme before the final coding of the tracking-software records. The final coding scheme was designed to record data on the person searching; the question being asked; the time it took to find an answer; the search strategy used (eg, using a search engine or directly typing in a URL); the search strings used; the number of search engine results pages reviewed; the number of pages viewed within a particular site; and the use of menus, advertisements, and directories. One of the 3 coders was assigned as the primary reviewer for each observation session. The primary reviewer was responsible for a detailed coding of the session, and any coding problems were resolved in a second group discussion.

Each answer was coded as correct or incorrect, complete or incomplete, and useful or not useful.

Results

Overall Search Strategy

Table 2

Search Engine Tactics

Table 3

A total of 132 search phrases were entered into the various search engines. Only 104 of those search phrases were unique. The 2 most frequent phrases were "diabetes" and "Paxil," each of which occurred 5 times. Search phrases averaged 3.6 words, and 80% of them contained 4 or fewer words. Of the 132 search phrases, 30 contained at least 1 misspelled word (eg, "tatoo," "Alchoholics," or "smokeing"), despite the fact that students could read the correctly spelled word on the index card containing the question.
Some search engines (eg, Google) offer a feature that recommends an alternate search string with the correct spelling of a word. For example, if a student typed "alchoholics anonymous," the first page of results began with, "Do you mean 'alcoholics anonymous?'" Students were offered a new search string with correct spelling on 15 separate occasions, but noticed and used it only 6 times. On the remaining occasions, they used the results returned for the incorrect spelling. Of the 7 students who were offered corrected spelling suggestions, only 2 ever used them.

Table 4

Successful Searching Characteristics

Of the 68 questions that students attempted to answer, 7 searches were abandoned, either because the student gave up or, in 2 cases, because the class period ended. Of the remaining 61 searches, 47 were successful in finding a complete, correct, and useful answer to the health question and the remaining 14 were unsuccessful. Six of the unsuccessful answers were completely incorrect and not useful, 4 were useful but only partially correct, and 4 were fully correct but not useful.

Several factors contributed to the success of finding a correct, complete, and useful answer. One important factor was the individual who was performing the search. Although every student answered at least 1 question correctly, there was wide variation in the number of correct answers. Two students successfully answered 6 out of 6 questions, 3 students successfully answered 5 questions, 4 students successfully answered 4 questions, and the remaining 3 students successfully answered only 1 or 2 questions. While our sample of students was too small to draw conclusions from, no distinct patterns were observed that would indicate that race, gender, Internet experience, or health searching experience were significant determinants of success. However, the older adolescents (16-17 year olds) were successful 87% of the time (26 of 30) as compared to 68% (21 of 31) for the younger adolescents.
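The outcome counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the study's analysis; the variable names and dictionary labels are illustrative assumptions, and only the counts come from the text.

```python
# Counts as reported in the text; names and labels are illustrative only.
ATTEMPTED = 68
ABANDONED = 7
SUCCESSFUL = 47
UNSUCCESSFUL_BREAKDOWN = {
    "incorrect and not useful": 6,
    "useful but only partially correct": 4,
    "correct but not useful": 4,
}
# (successful searches, completed searches) for each age group.
AGE_GROUPS = {
    "older (16-17 year olds)": (26, 30),
    "younger": (21, 31),
}


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to a whole number."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


completed = ATTEMPTED - ABANDONED
unsuccessful = sum(UNSUCCESSFUL_BREAKDOWN.values())

if __name__ == "__main__":
    print(f"completed searches: {completed}")
    print(f"unsuccessful searches: {unsuccessful}")
    for group, (ok, total) in AGE_GROUPS.items():
        print(f"{group}: {ok} of {total} ({percent(ok, total)}%)")
```

Under these counts, the completed searches split exactly into successful and unsuccessful ones, and the two age groups together account for all 61 completed searches and all 47 successes.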
Table 5

Certain search actions led to sites that contained the answer more often than others. Overall, students found answers on 22% of the sites they accessed (47 of 215). They accessed sites in 5 ways. Although not often taken, the action with the highest probability of success (47%; 7 of 15) was following a link from 1 non-search-engine site (eg, www.aa-intergroup.org) to another site (eg, www.alcoholics-anonymous.org). In most of these cases, the student accessed the first site directly from a search engine. Clicking on search engine results led to a site where students found an answer 21% of the time (35 of 166). Success rates were similar for following a recommended link from a list or menu provided by the search engine (18%; 4 of 22). Directly typing in a URL, bypassing search engines entirely, was successful only 9% of the time (1 of 11). A sponsored link from a search engine was followed only once, and the student found an incorrect answer on that site.

Another factor related to success was misspelling of search terms. Of the 14 completed but unsuccessful searches, 29% (4 searches) had at least 1 misspelling compared to only 15% (7 searches) of the 47 successful searches. Perhaps even more telling, searches with misspellings, whether successful or unsuccessful, took students 1.5 minutes longer on average than searches without misspellings. Observations confirmed that some students were unable to find an answer until they discovered and corrected their misspelling, which then produced higher-quality and more relevant results.

Other search characteristics did not have statistically significant impacts on whether searches were successful, although this may have been due to small sample sizes. For example, the search engines were not significantly different in their percentages of successful searches. Similarly, the average number of words per search string was not significantly related to search success rate. (Data not shown.)
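The per-action success rates reported above for the 5 ways of accessing a site can be recomputed from the stated counts. The sketch below is illustrative only: the dictionary labels and function names are assumptions, and the sponsored-link entry is recorded as 0 successes of 1 because the single answer found there was incorrect, which is also what makes the counts sum to the reported 47 of 215.

```python
# (sites where a successful answer was found, sites accessed), as reported in the text.
# Labels are illustrative shorthand, not the authors' coding terms.
ACCESS_COUNTS = {
    "link from a non-search-engine site": (7, 15),
    "search engine result": (35, 166),
    "search engine list or menu link": (4, 22),
    "directly typed URL": (1, 11),
    # Assumption: the one incorrect answer counts as 0 successes.
    "sponsored link": (0, 1),
}


def success_rate(found: int, accessed: int) -> float:
    """Return the percentage of accessed sites on which an answer was found."""
    if accessed <= 0:
        raise ValueError("accessed must be positive")
    return 100.0 * found / accessed


def summarize(counts: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Map each access action to its success rate, rounded to a whole percent."""
    return {
        action: round(success_rate(found, accessed))
        for action, (found, accessed) in counts.items()
    }


total_found = sum(found for found, _ in ACCESS_COUNTS.values())
total_accessed = sum(accessed for _, accessed in ACCESS_COUNTS.values())

if __name__ == "__main__":
    for action, pct in summarize(ACCESS_COUNTS).items():
        print(f"{action}: {pct}%")
    overall = round(success_rate(total_found, total_accessed))
    print(f"overall: {total_found} of {total_accessed} ({overall}%)")
```

Summing the per-action counts reproduces the overall figure, which is a quick way to confirm that the 5 access actions partition all 215 site visits.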
Qualitative Analysis

Certain common behaviors of the adolescent searchers were observed that were not apparent from the quantitative analysis.

First, the students were very comfortable and confident while searching online for health information. Most students knew where they wanted to start the search and navigated using quick mouse clicks and shortcut keys. However, this characteristic was likely over-represented in our population due to their strong academic performance and Internet proficiency.

Second, several searchers did not take much time in formulating a search strategy or (when applicable) choosing search terms. Instead, these searchers seemed to type in the first search string that came to mind. If the results were not what was anticipated, another search string was typed in, sometimes without even clicking on any results from the first search string. The overall approach was a trial-and-error method with frequent backtracking. The most common problem with search strings was that they were not specific enough. For example, 2 different students typed in the search string "hiv" when looking for a place that administers free and confidential HIV tests.

Fourth, students mentioned that they purposefully avoided sponsored links and advertisements, despite the fact that many of the search engines present these results first. The qualitative data confirmed this practice, as only 1 sponsored link was ever selected.

Finally, little to no attention was paid to the source of the answer. In the vast majority of cases, once an answer was located, it was simply assumed to be correct.

Discussion

Simulation of Searches

The results of this study suggest that such simulations can focus on the use of search engines, but that very broad search terms and, especially for adolescents, common spelling errors should be considered. Ads and other nonresult links can be ignored.
Since more than 80% of the links that were followed appeared in the top 10 results, and more than 95% were among the top 40, a search simulation need not consider result links beyond these.

Providers of Internet Health Content

Another area that Internet content providers should focus on is within-site navigation. Because students tend to skip around from place to place within a page and read little in sequence, it is important that sites with a significant adolescent audience are well organized, concise, and understandable. Long paragraphs, too many links, and difficult vocabulary all decrease the likelihood of adolescents finding the health information they are seeking, even if it is contained within a site. Internet content producers should attempt to understand the needs of the site visitors and build hierarchical structures that reflect those needs. For example, if one of the primary needs of individuals visiting the Alcoholics Anonymous site is to find a local meeting, the first page of the site should include an obvious link (eg, "Find an AA Meeting Near You") that leads to another page that returns the nearest meetings after the visitor enters a zip code or city name.

While ease of within-site navigation is important for all visitors to health information sites, some information providers may want to develop sites targeted specifically to adolescents. While adolescents might like the targeted information once they found it, we observed that they tend to rely on general-purpose search engines. Thus, developing special youth-targeted versions of information sites may be of somewhat limited utility, unless also accompanied by advertising or education campaigns that make adolescents more likely to find such sites.

Rather than changing Web sites or their presentation in search engines, it may also be useful to undertake education campaigns to improve the search strategies and tactics that adolescents use when seeking health information.
It may be helpful to guide them towards youth-oriented directories or search engines, rather than general-purpose search engines. For example, both Yahoo! and Google offer directories with subcategories of sites designed for teens that cover various health topics. This approach may be facilitated by including links to such resources on the Web browser's starting page in schools and libraries. Alternatively, adolescents might be taught techniques for formulating and refining search terms at general-purpose search engines, adding or dropping more specific words based on the kinds of results returned. They might also be taught to notice potential search term misspellings based on surprising search results. Finally, adolescents might also be taught techniques for systematically exploring within a Web site to find the kind of information they are looking for.

Limitations and Future Research

There are several important limitations to the interpretation of these results. First, this was not a representative or random sample of adolescents. It was a small convenience sample with a selection bias toward adolescents with strong Internet searching skills. While the results cannot be generalized to all adolescents and do not capture the full range of adolescent searching experience, we can assume that the average adolescent would have had even more trouble than our study participants in finding health information on the Internet.

Second, the health-related search questions were deliberately constructed to avoid controversial topics such as safe sex, abortion, and homosexuality. Given that adolescents are often faced with health problems related to sexuality, their actual search behavior and success at finding health information related to sexuality may not be reflected in our results.

Another concern is that participants may have changed their search behavior because of the presence of observers and because they were aware that their search behaviors were being recorded.
For example, students who had trouble finding an answer may have persisted in their search longer than they would have in a nonresearch setting. Alternatively, because students knew they had several search questions to answer during a single class period, they may not have been as persistent as they might have been with a more personally relevant question and less restricted search time. Thus, the data here reflect only a rough estimate of persistence for an adolescent looking for health-related information.

Also, searching was conducted individually, while in practice many searches both at home and at school are conducted with friends, teachers, or family close by. While it is difficult to know how this would affect searching behavior without future research, it is possible that students would act differently (eg, receive help with spelling).

Finally, while components of our classification scheme for successful versus unsuccessful searching have been previously validated, the overall scheme was modified to more accurately code the search results as correct, complete, and useful. A more systematic validation of coding schemes for health information search results is an important area for future research.

More research is needed to validate the results presented in this article, as well as to determine whether results vary for different populations (eg, age, race, and experience with health searching) and different health questions (eg, finding a practitioner versus finding the answer to a question). Additionally, instead of focusing on how adolescents currently search for health information, future studies may also want to explore interventions aimed at improving their searches. For example, should health portal sites designed for adolescents or online directories be used? Or would the current practice of using common search engines, but with adolescents learning improved search tactics, be more effective?
Also, which search strategies lead to sites that are the most likely to be accurate and influence adolescents to change their behavior?

Conclusions

This study provides a useful snapshot of current adolescent searching patterns. The results have implications for constructing realistic simulations of search behavior, and for both information providers and educators. Analyzing search behavior through actual observation should be a cornerstone in any effort to improve adolescents' access to health information.