Identifying lineage effects when controlling for population structure improves power in bacterial association studies

Earle, S. G., Wu, C.-H., Charlesworth, J., Stoesser, N., Gordon, N. C., Walker, T. M., Spencer, C. C. A., Iqbal, Z., Clifton, D. A., Hopkins, K. L., Woodford, N., Smith, E. G., Ismail, N., Llewelyn, M. J., Peto, T. E., Crook, D. W., McVean, G., Walker, A. S. and D. J. Wilson (2016)
Nature Microbiology 16041

Bacteria pose unique challenges for genome-wide association studies (GWAS) because of strong structuring into distinct strains and substantial linkage disequilibrium across the genome. While methods developed for human studies can correct for strain structure, this risks considerable loss of power because genetic differences between strains often contribute substantial phenotypic variability. Here we propose a new method that captures lineage-level associations even when locus-specific associations cannot be fine-mapped. We demonstrate its ability to detect genes and genetic variants underlying resistance to 17 antimicrobials in 3144 isolates from four taxonomically diverse clonal and recombining bacteria: Mycobacterium tuberculosis, Staphylococcus aureus, Escherichia coli and Klebsiella pneumoniae. Strong selection, recombination and penetrance confer high power to recover known antimicrobial resistance mechanisms, and reveal a candidate association between the outer membrane porin nmpC and cefazolin resistance in E. coli. Hence our method pinpoints locus-specific effects where possible, and boosts power by detecting lineage-level differences when fine-mapping is intractable.