DeepAMR for predicting co-occurrent resistance of Mycobacterium tuberculosis

Yang, Y., Walker, T. M., Walker, A. S., Wilson, D. J., Peto, T. E. A., Crook, D. W., Shamout, F., Zhu, T., Clifton, D. A. and CRyPTIC Consortium (2019)
Bioinformatics 35: 3240-3249

Motivation: Co-occurring resistance to first-line anti-tuberculosis (TB) drugs is common. Existing methods based on genetic data analysis of Mycobacterium tuberculosis (MTB) can predict resistance of MTB to individual drugs, but they do not consider resistance co-occurrence and cannot capture the latent structure of poly-resistant TB.

Methods: We used a large cohort of TB patients from 16 countries across six continents. For each isolate, the whole-genome sequence was obtained, and the associated phenotype for anti-TB drugs was determined using drug susceptibility testing recommended by the World Health Organization. We then proposed an end-to-end multi-task model with a stacked denoising auto-encoder (DeepAMR) to learn a low-dimensional latent structure and perform multi-label classification.
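As a rough illustration of the kind of model described above, the following Python sketch wires a stacked denoising auto-encoder to one sigmoid output per drug and combines the reconstruction and classification losses into a single objective. It is a minimal sketch under stated assumptions, not the published DeepAMR implementation: the layer sizes, the masking-noise rate, the loss weight `alpha`, the class name `MultiTaskDAE`, and the random input are all hypothetical, and the weights are untrained.

```python
import math
import random

# The four first-line drugs named in the abstract, one output head each.
DRUGS = ["isoniazid", "rifampicin", "ethambutol", "pyrazinamide"]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def corrupt(x, drop_prob, rng):
    """Masking noise: zero each input feature with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]

def dense(x, weights, biases, activation):
    """Fully connected layer: activation(W x + b)."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one probability p and one 0/1 target y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

class MultiTaskDAE:
    """Hypothetical stacked encoder/decoder with one sigmoid head per label."""

    def __init__(self, n_features, hidden_sizes, n_labels, seed=0):
        rng = random.Random(seed)
        sizes = [n_features] + list(hidden_sizes)
        pairs = list(zip(sizes, sizes[1:]))
        self.encoder = [init_layer(a, b, rng) for a, b in pairs]
        self.decoder = [init_layer(b, a, rng) for a, b in reversed(pairs)]
        self.heads = [init_layer(sizes[-1], 1, rng) for _ in range(n_labels)]

    def encode(self, x):
        for w, b in self.encoder:
            x = dense(x, w, b, relu)
        return x

    def reconstruct(self, z):
        for w, b in self.decoder:
            z = dense(z, w, b, sigmoid)
        return z

    def classify(self, z):
        return [dense(z, w, b, sigmoid)[0] for w, b in self.heads]

def joint_loss(model, x, y, drop_prob, alpha, rng):
    """Reconstruction loss on the clean input plus alpha times the label loss."""
    z = model.encode(corrupt(x, drop_prob, rng))
    recon = model.reconstruct(z)
    probs = model.classify(z)
    rec = sum(bce(r, v) for r, v in zip(recon, x)) / len(x)
    cls = sum(bce(p, t) for p, t in zip(probs, y)) / len(y)
    return rec + alpha * cls

# Hypothetical isolate: 12 binary variant-presence features, 4 resistance labels.
rng = random.Random(1)
model = MultiTaskDAE(n_features=12, hidden_sizes=[8, 4], n_labels=len(DRUGS))
x = [float(rng.random() < 0.3) for _ in range(12)]
y = [1.0, 1.0, 0.0, 0.0]
latent = model.encode(x)
probs = dict(zip(DRUGS, model.classify(latent)))
loss = joint_loss(model, x, y, drop_prob=0.2, alpha=1.0, rng=rng)
```

Training would minimize `joint_loss` over all isolates with a gradient-based optimizer, so that the shared latent layer is shaped by both the reconstruction and the per-drug predictions; that step is omitted here.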

Results: DeepAMR outperformed the examined multi-label learning and conventional single-label learning models, with mean sensitivity of 96%, 87%, 73%, and 96% for predicting resistance to rifampicin, ethambutol, and pyrazinamide and for predicting multi-drug resistant TB (MDR-TB), respectively, and performed better than single-label learning models for isoniazid, with mean sensitivity of 95.6%. Compared with a method that predicts an isolate to be resistant when a single previously identified mutation is present, DeepAMR improved mean sensitivity for classifying resistance to four first-line drugs and MDR-TB by up to 14%, except for ethambutol. The latent structure obtained by DeepAMR, visualized in t-distributed stochastic neighbor embedding space, showed several distinct subtypes of cross-resistance cases.
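To make the comparison above concrete, the following Python sketch shows how sensitivity is computed for one drug and how a rule of the kind described (predict resistance when any previously identified mutation is present) can be expressed. It is a minimal sketch, not the evaluation code used in the study: the mutation names, the catalogue, and the phenotypes are placeholders invented for illustration.

```python
def sensitivity(y_true, y_pred):
    """TP / (TP + FN) over resistant isolates (label 1); None if there are none."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else None

def direct_association(isolate_mutations, known_mutations):
    """Predict resistant (1) if any previously identified mutation is present."""
    return 1 if isolate_mutations & known_mutations else 0

# Hypothetical catalogue and isolates; "mut_A" etc. are placeholder names.
known = {"mut_A", "mut_B"}
isolates = [{"mut_A"}, {"mut_C"}, {"mut_B", "mut_C"}, set(), {"mut_D"}]
y_true = [1, 1, 1, 0, 1]  # hypothetical phenotypes for one drug

y_rule = [direct_association(m, known) for m in isolates]
rule_sensitivity = sensitivity(y_true, y_rule)
print(rule_sensitivity)
```

A rule of this kind misses resistant isolates whose mutations are absent from the catalogue, which lowers its sensitivity; a learned model can in principle pick up such cases from other variants.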

Availability: The details of source code are provided at ~davidc/code.php