Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network

Zhengqiao Zhao; Stephen Woloszynek; Felix Agbavor; Joshua Chang Mell; Bahrad A Sokhansanj; Gail L Rosen

doi:10.1371/journal.pcbi.1009345

Journal article

Open access

PLoS computational biology, v 17(9), pp e1009345-e1009345

Sep 2021

DOI: https://doi.org/10.1371/journal.pcbi.1009345

PMID: 34550967

Files and links (1)

https://doi.org/10.1371/journal.pcbi.1009345

Published, Version of Record (VoR); CC BY 4.0; Open

Keywords

Algorithms
Computational Biology
Databases, Genetic
Deep Learning
Gastrointestinal Microbiome - genetics
Host Microbial Interactions - genetics
Humans
Inflammatory Bowel Diseases - microbiology
Microbiota - genetics
Natural Language Processing
Neural Networks, Computer
Phenotype
Prevotella - classification
Prevotella - genetics
Prevotella - isolation & purification
Proof of Concept Study
RNA, Ribosomal, 16S - classification
RNA, Ribosomal, 16S - genetics

Abstract

Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short- and long-term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data that combines convolutional neural networks, recurrent neural networks, and attention mechanisms to predict taxonomic classifications and sample-associated attributes, such as the relationship between the microbiome and host phenotype, at the read/sequence level. In this paper, we develop this deep learning approach and evaluate its application to amplicon sequences. We apply it to short DNA reads and full sequences of 16S ribosomal RNA (rRNA) marker genes, which identify the heterogeneity of a microbial community sample. We demonstrate that our implementation of a novel attention-based deep network architecture, Read2Pheno, achieves read-level phenotypic prediction. Trained Read2Pheno models encode sequences (reads) into dense, meaningful representations: learned embedding vectors output by the intermediate layer of the network, which can provide biological insight when visualized. The attention layer of Read2Pheno models can also automatically identify nucleotide regions in reads/sequences that are particularly informative for classification. As such, this approach can avoid the pre- and post-processing and manual interpretation required by conventional approaches to microbiome sequence classification. We further show, as a proof of concept, that aggregating read-level information can robustly predict microbial community properties, host phenotype, and taxonomic classification, with performance at least comparable to conventional approaches. An implementation of the attention-based deep learning network is available at https://github.com/EESI/sequence_attention (a Python package) and https://github.com/EESI/seq2att (a command-line tool).
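
As a rough illustration of two ideas in the abstract, the minimal sketch below shows attention pooling over the per-position features of a read, followed by a simple aggregation of read-level predictions into a sample-level call. It is not the Read2Pheno implementation. The function names, the one-hot features (standing in for the outputs of the convolutional and recurrent layers), the fixed context vector (which would be learned in a real model), and the majority-vote aggregation are all assumptions made for illustration.

```python
import math
from collections import Counter


def one_hot(read):
    """Encode a DNA read as a list of 4-dimensional one-hot vectors (A, C, G, T).

    Assumption: a real model would use learned convolutional/recurrent
    features per position; one-hot vectors stand in for them here.
    """
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    vectors = []
    for base in read.upper():
        vec = [0.0, 0.0, 0.0, 0.0]
        if base in index:  # ambiguous bases such as N stay all-zero
            vec[index[base]] = 1.0
        vectors.append(vec)
    return vectors


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, context):
    """Return (embedding, weights) for one read.

    Each position is scored by its dot product with the context vector,
    the scores are normalized with softmax, and the embedding is the
    weighted sum of the position features.
    """
    scores = [sum(f * c for f, c in zip(feat, context)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    embedding = [
        sum(w * feat[d] for w, feat in zip(weights, features))
        for d in range(dim)
    ]
    return embedding, weights


def aggregate_reads(read_labels):
    """Hypothetical sample-level call: majority vote over read-level labels."""
    return Counter(read_labels).most_common(1)[0][0]
```

For example, with a context vector that favors G, attention_pool(one_hot("ACGT"), [0.0, 0.0, 2.0, 0.0]) assigns the largest weight to the third position. Per-position weights of this kind are what an attention layer exposes when highlighting informative nucleotide regions.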

Details

Title: Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network
Creators: Zhengqiao Zhao (Drexel University); Stephen Woloszynek (Beth Israel Deaconess Medical Center); Felix Agbavor (Drexel University); Joshua Chang Mell (Drexel University); Bahrad A Sokhansanj (Drexel University); Gail L Rosen (Drexel University)
Publication Details: PLoS computational biology, v 17(9), pp e1009345-e1009345
Publisher: Public Library of Science (PLOS)
Resource Type: Journal article
Language: English
Academic Unit: Microbiology and Immunology; Electrical and Computer Engineering; School of Biomedical Engineering, Science, and Health Systems
Scopus ID: 2-s2.0-85116030586
Other Identifier: 991019173668804721
