The majority of the strains have been characterised by one or more methods including MLST, MLEE, 16S rRNA sequencing, biotyping, and capsular typing. Data on the association of strains with different diseases, dates and geographical sites of isolation were also available for many strains. Forty-six H. influenzae strains were selected for study that
represented the diversity within a tree created from the concatenated sequence data from the entire MLST database (http://haemophilus.mlst.net). A further 15 strains were selected based on existing MLEE and biotype data. Finally, clinical, geographical and temporal data were used to select further strains on criteria other than MLST or MLEE, together with a number of strains from closely related species and sub-species of H. influenzae, including H. haemolyticus, Haemophilus parahaemolyticus, Haemophilus parainfluenzae, Haemophilus paraphrophilus, H. influenzae biotype IV strains, and putative ‘hybrid’ H. influenzae-H. parainfluenzae strains (Table 1). The latter ‘hybrid’ strains are H. influenzae isolates that do not contain a fucK MLST allele,
a characteristic of H. parainfluenzae, and therefore their classification is uncertain (Abdel Elamin, University of Oxford, personal communication). Most of the serotype b strains were recovered from patients with invasive disease, but a number were associated with non-symptomatic carriage. Bacterial isolates were cultured from frozen stocks on solid brain heart infusion (BHI) medium supplemented with 10% Levinthal's reagent and 1% agar, and incubated at 37°C. For DNA preparation, bacteria
were cultured in BHI liquid medium supplemented with haemin (10 μg/ml) and NAD (2 μg/ml).

Genome sequencing, assembly, and comparison of genome sequence data

Strains were grown in BHI broth and chromosomal DNA was isolated from bacteria using Qiagen columns as described by the supplier. The genomic DNA from 96 strains was sequenced using multiplex (12 separately indexed DNAs per lane) Illumina sequencing as described previously [21]. The sequencing was conducted using seven lanes (84 DNAs) on one flow cell and one lane (12 DNAs) on a second flow cell. The 55 bp reads from each of the 96 strains were separated using the index tags, and then assembled using the Velvet assembly programme [14]. Genome sequences for eleven strains were rejected due to poor assembly, the result of insufficient coverage or large numbers of small contigs (lower part of Table 1). For 85 Haemophilus strains, genome sequences of between 1.27 Mbp and 1.91 Mbp in length were assembled by Velvet (Table 1). The sequence reads were mapped to a reference using MAQ [15] with default parameters; the mappings were then examined to determine the depth of reads covering the lower %G+C regions of DNA, as an indication of when coverage was insufficient for assembly.
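
The coverage check described above can be illustrated with a minimal sketch. It is not the procedure used in this study. It assumes that a per-base read depth has already been extracted from the mapping (how that is done is not shown), and the function names, the 100 bp window and the 30% G+C cutoff are hypothetical choices made for illustration only.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def low_gc_depth(reference, depth, window=100, gc_cutoff=0.30):
    """Mean read depth over non-overlapping windows with G+C below gc_cutoff.

    reference : reference sequence as a string
    depth     : per-base read depth, same length as reference
    window    : window size in bp (hypothetical default)
    gc_cutoff : G+C fraction below which a window counts as low %G+C
                (hypothetical default)

    Returns None when no window falls below the cutoff.
    """
    if len(reference) != len(depth):
        raise ValueError("reference and depth must have the same length")
    total, n_bases = 0, 0
    # Walk the reference in full, non-overlapping windows; a trailing
    # partial window is ignored.
    for start in range(0, len(reference) - window + 1, window):
        end = start + window
        if gc_fraction(reference[start:end]) < gc_cutoff:
            total += sum(depth[start:end])
            n_bases += window
    return total / n_bases if n_bases else None
```

A strain whose mean depth in low %G+C windows falls well below its genome-wide mean depth would be a candidate for insufficient coverage under this sketch; any threshold for that comparison would need to be chosen empirically.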