Benchmarking of gene prediction programs for metagenomic data
Journal article

Non Yok and Gail Rosen
Conference proceedings (IEEE Engineering in Medicine and Biology Society. Conf.), v 2010, pp 6190-6193
2010
PMID: 21097156

Abstract

Databases, Genetic; Benchmarking; ROC Curve; Molecular Sequence Annotation - methods; Algorithms; Metagenomics - methods
This manuscript presents the most rigorous benchmarking of gene annotation algorithms for metagenomic datasets to date. We compare three programs: GeneMark, MetaGeneAnnotator (MGA), and Orphelia. The comparison is based on their performance on simulated fragments from one hundred species of diverse lineages. We define four types of fragments: two come from inter- and intra-coding regions, and the other two come from gene edges. Hoff et al. used only 12 species in their comparison, a sample too small to represent an environmental sample. In addition, no previous study has examined fragments that contain gene edges separately from intra-coding fragments. In general, the performance of all three programs improves as fragment length increases. Intra-coding fragments show lower annotation error than gene-edge fragments in all of the programs. Overall, we obtain an upper bound on performance by combining all the methods.
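
The four fragment types described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it rests on stated assumptions: genes are given as half-open, 0-based (start, end) intervals, strand is ignored, and the two gene-edge types are taken to be "contains a gene start" and "contains a gene end", which the abstract does not spell out. The function name classify_fragment and all labels are hypothetical.

```python
from typing import List, Tuple

# Assumption: a gene is a half-open, 0-based interval (start, end); strand is ignored.
Gene = Tuple[int, int]


def classify_fragment(frag_start: int, frag_end: int, genes: List[Gene]) -> str:
    """Label the fragment [frag_start, frag_end) relative to annotated genes.

    Hypothetical labels: "intra-coding", "inter-coding",
    "gene-start-edge", "gene-end-edge". A fragment spanning both a start
    and an end boundary is reported as "multiple-edges", which falls
    outside the four types named in the abstract.
    """
    contains_start = False
    contains_end = False
    for g_start, g_end in genes:
        # The fragment lies entirely inside one gene.
        if g_start <= frag_start and frag_end <= g_end:
            return "intra-coding"
        # A gene boundary falls strictly inside the fragment.
        if frag_start < g_start < frag_end:
            contains_start = True
        if frag_start < g_end < frag_end:
            contains_end = True
    if contains_start and contains_end:
        return "multiple-edges"
    if contains_start:
        return "gene-start-edge"
    if contains_end:
        return "gene-end-edge"
    # No overlap with any gene boundary and not inside a gene.
    return "inter-coding"


if __name__ == "__main__":
    genes = [(100, 400)]  # one made-up gene, for illustration only
    for frag in [(150, 300), (0, 50), (50, 200), (300, 500)]:
        print(frag, classify_fragment(frag[0], frag[1], genes))
```

Labeling simulated fragments this way before scoring would allow annotation error to be reported per fragment type, which is the kind of separation between gene-edge and intra-coding fragments that the abstract emphasizes.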

Metrics

10 Record Views
6 citations in Scopus

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas
Engineering, Biomedical
Engineering, Electrical & Electronic