Conference proceeding
Semi-Supervised and Incremental Sequence Analysis for Taxonomic Classification
2023 IEEE Symposium Series on Computational Intelligence (SSCI), pp 1132-1138
05 Dec 2023
Abstract
Metagenomic analysis is vital in determining which organisms are present in a microbial sample and why they are present. In this study, we explore the utility of MMseqs2, a bioinformatics pipeline, for taxonomic classification in metagenomics, focusing on 16S rRNA gene sequences. We evaluate the algorithm's performance on the full dataset as well as in batch-by-batch incremental processing and, more importantly, we add semi-supervised classification capability to this otherwise clustering-only algorithm. Incremental updating is important because it allows seamless integration and processing of new data, whereas semi-supervised classification allows taxonomic identification of previously unknown organisms. We also evaluate the different clustering modes offered by MMseqs2 and compare MMseqs2 to our previously developed semi-supervised incremental algorithm, SSI-VSEARCH. We show that MMseqs2's built-in clusterupdate function works well and that our semi-supervised classification capability adds new functionality to this bioinformatics processing pipeline.
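The abstract does not say how semi-supervised labels are layered on top of the clustering output, so the following is only an illustrative sketch of one common approach, majority-vote label propagation within clusters, and should not be read as the authors' method. It assumes the cluster assignments are already available as (representative, member) ID pairs; the function name `propagate_labels` and the `min_agreement` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def propagate_labels(cluster_pairs, known_labels, min_agreement=0.5):
    """Give unlabeled sequences the majority label of their cluster.

    cluster_pairs: iterable of (representative_id, member_id) tuples
        (assumed input format, one pair per cluster member).
    known_labels: dict mapping sequence ID -> taxonomic label for the
        labeled subset of sequences.
    min_agreement: hypothetical threshold; the winning label must cover
        at least this fraction of the labeled members of a cluster.
    """
    # Group member IDs by their cluster representative.
    clusters = defaultdict(list)
    for rep, member in cluster_pairs:
        clusters[rep].append(member)

    predicted = {}
    for members in clusters.values():
        # Count the labels of the members that already have one.
        votes = Counter(known_labels[m] for m in members if m in known_labels)
        if not votes:
            continue  # cluster has no labeled member: leave it unclassified
        label, count = votes.most_common(1)[0]
        if count / sum(votes.values()) < min_agreement:
            continue  # labeled members disagree too much
        # Propagate the winning label to the unlabeled members only.
        for m in members:
            if m not in known_labels:
                predicted[m] = label
    return predicted
```

In an incremental setting, a sketch like this could simply be rerun on the updated cluster assignments after each new batch, so that sequences joining an already labeled cluster inherit its label while clusters with no labeled member stay unclassified.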
Details
- Title
- Semi-Supervised and Incremental Sequence Analysis for Taxonomic Classification
- Creators
- Adriana Fasino - Rowan University; Emrecan Ozdogan - Rowan University; Bahrad A. Sokhansanj - Drexel University; Gail Rosen - Drexel University; Robi Polikar - Rowan University
- Publication Details
- 2023 IEEE Symposium Series on Computational Intelligence (SSCI), pp 1132-1138
- Publisher
- IEEE
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Electrical and Computer Engineering
- Scopus ID
- 2-s2.0-85182947610
- Other Identifier
- 991021818477904721