Introduction
2002 Escherichia coli E. coli 1976 1986 1987 1988 1991 2002 E. coli E. coli E. coli Escherichia coli E. coli 1975 1991 2000 2000 2006 fim pap sfa foc 2006 2006a E. coli 1996 2003 2003 2005 2005 2003 kps iutA fyuA malX 2003 E. coli E. coli 2006 E. coli E. coli E. coli
Materials and methods
Bacterial strain
Escherichia coli 1975 E. coli E. coli E. coli 8 2006 2007
RNA isolation and microarray hybridisation
® E. coli
Data analysis
2004 2004 http://www.bioconductor.org http://www.r-project.org E. coli E. coli E. coli E. coli
Microarray data accession number
http://www.ebi.ac.uk/arrayexpress
Results
E. coli E. coli E. coli E. coli E. coli 1 asnT c2557–c2563 1 c1968–c1971 ydfI d ydfJ rspAB Fig. 1 outer blue circle red iroN fepA fecI iucBC fhuA exbD b3337 b1995 marA sodA ahpC b1452 c1220 c4210 lysA rrsG rrsH yrbL sitABCD fec E. coli 2006b 1 c0300 aspV c3686–3690 pheV yrbH 2007 chu ycdO ycdB 2006 2 in vivo Fig. 2 diagonal boxes dark blue colour boxes blue colour
Closeness to CFT073
Given the range of growth conditions analysed, it is reasonable to assume that most genes present in strain 83972 would be expressed, to some extent, under at least one of these seven conditions: growth in liquid and on solid media; during exponential phase, in biofilm and during colony formation; in different growth media (human urine and minimal laboratory medium); and in vivo in three different individuals.
E. coli 1 3 E. coli E. coli E. coli 3 fec fecABCDEIR 2007 Fig. 3 Venn diagrams E. coli E. coli E. coli Shigella 2008 E. coli E. coli E. coli E. coli E. coli 2008 E. coli 3 E. coli E. coli 2008 E. coli E. coli 2007 flgABCDEFGHIJKL flhABE fliACDEFGHIJKLNOPQRSTZ motAB csgABCEFG wcaABCDEFGHI wza cheBRWYZ tap E. coli hyaBCDEF hycACD tauABCD E. coli b1500–1505 ydeQRST fimEAIC 2006
UPEC-associated genes present in strain 83972
2006 2006 E. coli 2006 4 E.
coli Shigella pheV CFT073 hlyCABD iutA iucABCD c3655 sat papIBAHCDJKEFG pap 2006 pap papHCDJKF papIAEG Fig. 4 E. coli Shigella outer blue circle inner circles Shigella red fim c5391–5400 pap fim sfa foc pap fim fimEAIC 4 sfa foc sfaC 2007 yehABCD 1

Table 1 Analysis of fimbriae-encoding genes in strain 83972
Description | c | Genes | No of genes | No (%) of absent genes | Absent
Putative chaperone-usher fimbrial operon | c0166–0172 | yadN-ecpD-htrE-yadMLKC | 7 | 3 (43) | ecpD yadMK
a | c1237–1245 | sfaCB-focAICDFGH | 9 | 1 (11) | sfaC
F9 | c1931–1936 | c1936-34-ydeSRQ | 6 | 6 (100) | All
Putative chaperone-usher fimbrial operon | c2635–2638 | yehABCD | 4 | 4 (100) | All
Putative chaperone-usher fimbrial operon | c2878–2884 | yfcOPQRSUV | 7 | 5 (71) | yfcQRSUV
a | c3583–3593 | papIBAHCDJKEFG | 11 | 4 (36) | papIAEG b
Putative chaperone-usher fimbrial operon | c3791–3794 | ygiLGH-c3794 | 4 | 1 (25) | ygiL
Auf fimbriae | c4207–4214 | aufABCDEFG | 8 | 7 (88) | aufBCDEFG
a | c5179–5189 | papIBAHCDJKEFG | papAD | 1 (50) | papA_2
Type 1 fimbriae | c5391–5399 | fimBEAICDFGH | 9 | 4 (44) | fimEAIC
a pap papA papD b pap papIAEG pap

Presence of other pathogenicity islands in 83972
2 pheU CFT073 pap E. coli Shigella

Table 2 Analysis of presence of pathogenicity islands in strain 83972
Island name | Common name | c | a | Absent (%) | b
PAI-CFT073-aspV | PAI III CFT073 | c0253–c0368 | 96 | 40 (42) | cdiA (c0345), picU (c0350)
PAI-CFT073-serX |  | c1165–c1293 | 92 | 33 (36) | mchBCDEF (c1227, c1229–1232), sfa/foc (c1237–c1247), iroNEDCB (c1250–c1254), ag43 (c1273)
PAI-CFT073-icdA |  | c1518–c1601 | 42 | 5 (12) | sitDCBA (c1597–1600)
PAI-CFT073-asnT | HPI CFT073 | c2418–c2437 | 19 | 3 (16) | fyuA (1246)
PAI-CFT073-metV |  | c3385–c3410 | 26 | 17 (65) | hcp (c3391), clpB (c3392)
PAI-CFT073-pheV | PAI I CFT073 | c3556–c3698 | 119 | 51 (43) | hlyA (c3570), pap (c3582–c3593), iha (c3610), sat (c3619), iutA, iucDCBA (c3623–3628), ag43 (c3655), kpsTM (c3697–c3698)
PAI-CFT073-pheU | PAI II CFT073 | c5143–c5216 | 46 | 43 (93) | pap2 (c5179–c5189)
a E. coli b Yersinia pestis E. coli 2002 2007 2008 1 2006 pks 2006 2006 4 1 pks E. coli

Presence of positively selected UPEC genes
E.
coli 2006 agaI yjiL recC yegO amiA cutE fepE ompC ompF yfaL entD entF yojI 2006 E. coli 1997 E. coli 2003 2003 3 E. coli . 2003 E. coli 3 E. coli

Table 3 Distribution of absent genes in functional categories
Functional category | Absent, No. | Absent, % | Total | Z, P
Amino acid transport and metabolism | 98 | 33.1 | 296 | 0.636
Carbohydrate transport and metabolism | 126 | 42.3 | 298 | 0.025
Cell cycle control, cell division and chromosome partitioning | 5 | 16.1 | 31 | 0.000
Cell motility | 77 | 84.6 | 91 | 0.000
Cell wall/membrane/envelope biogenesis | 81 | 40.5 | 200 | 0.087
Coenzyme transport and metabolism | 25 | 23.1 | 108 | 0.001
Defense mechanisms | 17 | 48.6 | 35 | 0.000
Energy production and conversion | 88 | 36.7 | 240 | 0.563
Function unknown | 61 | 24.7 | 247 | 0.003
General function prediction only | 83 | 30.9 | 269 | 0.255
Inorganic ion transport and metabolism | 55 | 34.4 | 160 | 0.921
Intracellular trafficking, secretion and vesicular transport | 8 | 22.2 | 36 | 0.000
Lipid transport and metabolism | 21 | 29.6 | 71 | 0.129
Nucleotide transport and metabolism | 19 | 24.4 | 78 | 0.002
Posttranslational modification, protein turnover, chaperones | 24 | 20.2 | 119 | 0.000
Replication, recombination and repair | 70 | 42.4 | 165 | 0.023
Secondary metabolites biosynthesis, transport and catabolism | 22 | 40.7 | 54 | 0.075
Signal transduction mechanisms | 39 | 33.6 | 116 | 0.748
Transcription | 78 | 33.2 | 235 | 0.654
Translation, ribosomal structure and biogenesis | 21 | 13.5 | 156 | 0.000
Not in COGs | 577 | 54.2 | 1065 | 0.000
Total | 1,595 | 39.2 | 4,070 |

CFT073 genes absent in strain 83972
4 ireA tsx tsx 2006 2007

Table 4 Characteristics of ABU isolate 83972 compared with UPEC isolates CFT073, UTI89 and 536
a | CFT073 | UTI89 | 536 | 83972 | b
Serotype | O6 | O18 | O6 | c |
Capsule | K2 | K1 | K15 | c |
Chu | + | + | + | + | U, BF, Pat
Ent | + | + | + | + | U, BF, Pat
Fep | + | + | + | + | Pl, U, BF, Pat
Feo | + | + | + | + | BF, Pat
Fhu | + | + | + | + | Pl, U, BF, Pat
Iro | + | + | + | + | U, Pat
Iuc | + | − | − | + | Pl, U, BF, Pat
IutA | + | − | − | + | U, BF, Pat
Sit | + | + | + | + | U, BF, Pat
FyuA | + | + | + | + | U, BF, Pat
Iha | + | − | − | + | U, Pat
IreA | + | − | − | − |
Pks island | + | + | + | + | U, Pat
RfaH | + | + | + | + | BF
d | + | + | + | + | Pat
Pap | + | + | + | − |
Fim | + | + | + | − |
Foc/sfa | + | + | + | − |
Vat | + | + | + | + | BF, Pat
Sat | + | − | − | + | U
Tsx | + | + | + | − |
Biofilm formation 1.0 1.3 14.4
a b U BF Pl Pat c ppdD hofBC b0106–0108

CFT073 genes present in strain 83972 but not found in other UPEC strains
4 c1194–c1204 serX c1522–c1528 icdA c3394–c3396 metV c3681–c3682 pheV c5372–c5382 c3394–c3396 c5372–c5382 E. coli Shigella 4

Discussion
E. coli 2004 E. coli 2006 2006b 2007 pks cdiA mchBCDEF flu hcp rfaH sat picU vat pap fim foc sfa clpB ireA tsx

Thus, from the analyses performed here, we can make predictions about several gene categories, such as potential virulence genes, fitness genes and “household-class” genes. The information reported here also complements a prospective genome sequence of strain 83972. Whole-genome sequencing can identify which genes are present but cannot reveal whether they are transcribed. Genes can be silenced not only by lesions in the gene itself or its promoter but also by mutations in genes encoding regulatory factors. The methodology used in the present work reveals the active genome of strain 83972.

ABU strain 83972 is closely related to fully virulent uropathogenic strains. All the evidence suggests that the strain is a deconstructed pathogen. This study dispels the commonly held idea that ABU strains are commensals that have picked up niche-adaptation genes by horizontal gene transfer. Rather, strain 83972 was originally a truly pathogenic strain that has lost all or parts of operons that contribute to virulence.