Introduction

Haemophilus influenzae H. influenzae Black et al. 1992 H. influenzae Zeckel et al. 1992 H. influenzae Bolduc et al. 2000 Shurin et al. 1980 Bolduc et al. 2000 Gonzales et al. 1987 Munson and Grass 1988 Chong et al. 1995 Bolduc et al. 2000 Bolduc et al. 2000 Yang et al. 2003 Fitzpatrick and McInerney 2005 Fitzpatrick and McInerney 2005

Knowing the position of positively selected codons in a protein would constitute an excellent starting point for immunization studies and epitope mapping, not only because of the biological function of these sites, but also because it reduces the number of variable sites to consider in proteins that are candidates for inclusion in a vaccine. In the present study, we identified codons that evolved more rapidly through nonsynonymous than through synonymous substitutions in a sample of 36 OMP-P1 sequences. We compared the location of these codons with the location of peptides that were used in epitope mapping and with B- and T-cell OMP-P1-specific antigens, both to examine the congruence among these techniques and to identify regions that could be important for vaccine design. Finally, we localized stretches containing positively selected codons in secondary structures and three-dimensional (3D) models of OMP-P1. Our computational approach identified several novel domains with positively selected codons within the OMP-P1 protein that may be attractive targets in future vaccine design.

Materials and Methods

Evolutionary Analysis of the Selection Pressure on OMP-P1

H. influenzae Thompson et al. 1997 Rozas et al. 2003 Swofford 2003 Posada and Crandall 1998 Yang 1997 Goldman and Yang (1994) Derrick et al. 1999 Jiggins et al. 2002 Fitzpatrick and McInerney 2005 Yang 1997 Yang et al. 2003 Yang and Swanson 2002 Yang et al. 2000 2 2 Yang et al. 2005 Yang et al. 2005 Yang et al.
2005 Suzuki and Gojobori 1999 Kosakovsky Pond and Frost 2005 Massingham and Goldman (2005) Goldman and Yang (1994) Massingham and Goldman 2005

Secondary Structure of OMP-P1

Krogh et al. 2001 Cuff et al. 1998 Cheng et al. 2005 Cuff and Barton 1999

Tertiary Structure of OMP-P1

H. influenzae Combet et al. 2002 Peitsch 1996 Schwede et al. 2003 Schwede et al. 2003 Liu et al. 2003

Results

Alignment of OMP-P1 Sequences

Bolduc et al. 2000

Identification of Positively Selected Codons

Yang 1997 1 1 1 1 1 1 1

Fig. 1. Maximum likelihood tree based on the TVM model of nucleotide substitution with a proportion of invariant sites and rate heterogeneity (see text). The scale bar indicates the number of substitutions per nucleotide. Asterisks indicate branches with 100% bootstrap support.

Table 1. Haemophilus influenzae

| Model | Log-likelihood | Tree length | Kappa (κ) | Parameters | Codon | Amino acid | Probability positive selection (BEB) | dN/dS codon ± SE |
|---|---|---|---|---|---|---|---|---|
| M1 | −5260.86 | 1.91 | 1.90 | 0 1 0 1 | — | — | — | — |
| M2 | −5169.38 | 2.17 | 2.13 | 0 1 2 0 1 2 | 93 | A | 1.00 | 6.523 ± 0.833 |
| | | | | | 94 | S | 1.00 | 6.523 ± 0.833 |
| | | | | | 96 | K | 0.57 | 3.979 ± 2.632 |
| | | | | | 97 | I | 1.00 | 6.523 ± 0.833 |
| | | | | | 99 | R | 1.00 | 6.523 ± 0.834 |
| | | | | | 105 | Q | 1.00 | 6.523 ± 0.833 |
| | | | | | 198 | A | 1.00 | 6.522 ± 0.835 |
| | | | | | 222 | K | 1.00 | 6.523 ± 0.834 |
| | | | | | 225 | T | 0.79 | 5.293 ± 2.349 |
| | | | | | 284 | K | 1.00 | 6.519 ± 0.844 |
| | | | | | 329 | H | 1.00 | 6.493 ± 0.919 |
| M7 | −5270.13 | 2.07 | 1.91 | B(p = 2.14, q = 0.04) | — | — | — | — |
| M8A | −5260.96 | 1.92 | 1.91 | 0 1 2 | | | | |
| M8 | −5177.30 | 2.17 | 2.12 | 0 1 2 | 93 | A | 1.00 | 5.445 ± 0.585 |
| | | | | | 94 | S | 1.00 | 5.445 ± 0.585 |
| | | | | | 95 | V | 0.84 | 4.661 ± 1.726 |
| | | | | | 96 | K | 0.95 | 5.214 ± 1.099 |
| | | | | | 97 | I | 1.00 | 5.445 ± 0.585 |
| | | | | | 99 | R | 0.74 | 4.191 ± 2.015 |
| | | | | | 100 | N | 1.00 | 5.445 ± 0.585 |
| | | | | | 105 | Q | 1.00 | 5.445 ± 0.585 |
| | | | | | 151 | I | 0.90 | 4.929 ± 1.480 |
| | | | | | 198 | A | 1.00 | 5.445 ± 0.585 |
| | | | | | 222 | K | 1.00 | 5.445 ± 0.585 |
| | | | | | 225 | T | 0.96 | 5.238 ± 1.098 |
| | | | | | 284 | K | 1.00 | 5.445 ± 0.586 |
| | | | | | 329 | H | 1.00 | 5.442 ± 0.596 |
| | | | | | 371 | Y | 0.76 | 4.279 ± 2.030 |

Yang (1997) i i Yang et al. 2005 2 P P P P 2 2 P 2 1

Fig. 2. H. influenzae

Topographic Localization of the Amino Acids Encoded by Putatively Positively Selected Codons and Their Relation to Immunological Data of OMP-P1

3 Proulx et al. 1991 Chong et al.
1995 Panezutti et al. 1993 Proulx et al. 1992 Panezutti et al. 1993 Chong et al. 1995 H. influenzae Proulx et al. 1992 Chong et al. 1995 Chong et al. 1995 Proulx et al. 1992 H. influenzae Munson and Grass 1988 Chong et al. 1995

Fig. 3. Haemophilus influenzae H. influenzae Proulx et al. 1992 Chong et al. 1995 2 Chong et al. (1995) Munson et al. 1992 Krogh et al. 2001 Cuff et al. 1998 3 Chong et al. 1995

Localization of the Amino Acids Encoded by Putative Positively Selected Codons as Assessed by 3D Modeling

H. influenzae FadL E. coli Berg et al. 2004 FadL FadL TodX Pseudomonas putida TbuX Ralstonia pickettii 4 http://www.expasy.org/swissmod/SWISS-MODEL.html −43 5 FadL 4A B FadL 5 4 FadL 4A 5

Fig. 4. H. influenzae A FadL E. coli B C

Fig. 5. H. influenzae FadL E. coli FadL 4 FadL 4C

Discussion

Positive Selection on OMP-P1

Smith et al. 1995 Derrick et al. 1999 Anisimova et al. 2003 Anisimova and Yang 2004 Anisimova et al. 2003 Anisimova et al. 2003 1 2

Use and Value of 3D Reconstruction of OMP-P1

FadL FadL 5 OmpF PhoE E. coli Neisseria Derrick et al. 1999 Smith et al. 1995 Derrick et al. 1999

Are Positively Selected Sites Useful for Vaccine Formulations?

Proulx et al. 1992 Panezutti et al. 1993 Chong et al. 1995 Chong et al. 1995 H. influenzae H. influenzae Bolduc et al. (2000) H. influenzae Bolduc et al. 2000 H. influenzae H. influenzae H. influenzae Bolduc et al. (2000) H. influenzae Bolduc et al. (2000) H. influenzae H. influenzae Meats et al. 2003 H. influenzae Bolduc et al. (2000) Meats et al. (2003) H. influenzae Easton et al. 2005 Jonge et al. 2004 H. influenzae Riedmann et al. 2003 Fisseha et al. 2005

Impact of Computational Analyses on Vaccine Development

Urwin et al. 2002 Yang et al. Yang 2003 Fitzpatrick and McInerney 2005 H. influenzae

Electronic Supplementary Material

Supplementary material
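
Worked example: likelihood ratio tests from Table 1. The log-likelihoods reported in Table 1 for models M1, M2, M7, M8 and M8A can be compared with likelihood ratio tests. The sketch below is a minimal illustration rather than the authors' own procedure. It assumes the usual nested pairings (M1 vs. M2, M7 vs. M8, M8A vs. M8), a chi-square reference distribution with 2 degrees of freedom for the first two pairs, and a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom for M8A vs. M8. The function names are hypothetical and only the Python standard library is used.

```python
import math

# Log-likelihoods copied from Table 1.
LNL = {
    "M1": -5260.86,
    "M2": -5169.38,
    "M7": -5270.13,
    "M8": -5177.30,
    "M8A": -5260.96,
}


def lrt_statistic(lnl_null, lnl_alt):
    """Return 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def compare_models(lnl=LNL):
    """Return {comparison: (statistic, p_value)} for the three assumed pairings."""
    results = {}
    for null, alt in (("M1", "M2"), ("M7", "M8")):
        stat = lrt_statistic(lnl[null], lnl[alt])
        results[f"{null} vs {alt}"] = (stat, chi2_sf_df2(stat))
    # Assumption: M8A vs M8 uses a 50:50 mixture of 0 and chi-square(1),
    # so the chi-square(1) tail probability is halved.
    stat = lrt_statistic(lnl["M8A"], lnl["M8"])
    results["M8A vs M8"] = (stat, 0.5 * chi2_sf_df1(stat))
    return results


if __name__ == "__main__":
    for name, (stat, p) in compare_models().items():
        print(f"{name}: 2*dlnL = {stat:.2f}, p = {p:.3g}")
```

With the Table 1 values the three statistics are 182.96, 185.66 and 167.32. Under the assumptions stated above, each falls far beyond conventional critical values, so in every pairing the model that allows positive selection fits better than its null.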