Predicting gene function using few positive examples and unlabeled ones

Yiming Chen; Zhoujun Li; Xiaofeng Wang; Jiali Feng; Xiaohua Hu

doi:10.1186/1471-2164-11-S2-S11

Journal article

Open access

Peer reviewed

BMC genomics, v 11 Suppl 2(2), pp S11-S11

02 Nov 2010

DOI: https://doi.org/10.1186/1471-2164-11-S2-S11

PMID: 21047378

Featured in Collection : UN Sustainable Development Goals @ Drexel

Files and links (1)

https://doi.org/10.1186/1471-2164-11-S2-S11

Published, Version of Record (VoR) Open

Abstract

Computational Biology - methods

Algorithms

Saccharomyces cerevisiae - genetics

Artificial Intelligence

Databases, Genetic

Gene Expression Profiling

Genomics - methods

Protein Interaction Mapping

The large volume of functional genomic data now available makes it possible to predict gene function computationally: known functional annotations, together with the relationships between unannotated and annotated genes, are used to map unannotated genes to GO functional terms. The prediction task is usually formulated as a binary classification problem. Training a binary classifier requires positive and negative example sets of roughly equal size. However, annotation databases provide only a few positively annotated genes for most functional terms, so there are too few positive examples to train a classifier directly. We propose a novel approach, SPE_RNE, that trains a classifier for each functional term. First, the positive example set is enlarged by creating synthetic positive examples. Second, representative negative examples are selected by iteratively training an SVM (support vector machine) so that the classification hyperplane moves to an appropriate position. Finally, an optimal SVM classifier is trained using grid search. Using a combined kernel built from yeast protein sequence, microarray expression, protein-protein interaction, and GO functional annotation data, we compare SPE_RNE with three other typical methods on three classical performance measures: recall R, precision P, and their combination F. The three methods are twoclass, which treats all unlabeled genes as negative examples; twoclassbal, which randomly selects an equal number of negative examples from the unlabeled genes; and PSoL, which selects a set of negative examples that are far from the positive examples and far from each other. On the test data and on the unknown-gene data, we compute the mean and variance of F. The experiments show that our approach has better generalization performance and practical prediction capacity. In addition, our method can also be applied to other organisms such as human.
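
The evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of F, so the code below assumes the common balanced F-measure (the harmonic mean of P and R). The function names, the 0/1 label encoding, and the use of population variance are illustrative assumptions, not taken from the paper.

```python
from statistics import mean, pvariance


def precision_recall_f(y_true, y_pred):
    """Return (P, R, F) for binary labels, where 1 means the gene has the term.

    F is assumed here to be the harmonic mean of P and R (balanced F-measure).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f


def summarize_f(per_term_results):
    """Mean and (population) variance of F over several functional terms.

    per_term_results is a list of (y_true, y_pred) pairs, one per term.
    """
    fs = [precision_recall_f(t, p)[2] for t, p in per_term_results]
    return mean(fs), pvariance(fs)


if __name__ == "__main__":
    # Two hypothetical functional terms with made-up labels and predictions.
    term_a = ([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
    term_b = ([1, 0, 0, 0], [1, 0, 0, 0])
    print(precision_recall_f(*term_a))
    print(summarize_f([term_a, term_b]))
```

Reporting the variance alongside the mean indicates how stable a method is across functional terms, which matters when each term has only a handful of positive examples.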

Metrics

16 Record Views

5 citations in Web of Science

7 citations in Scopus

Details

Title: Predicting gene function using few positive examples and unlabeled ones
Creators: Yiming Chen - Computer School of National University of Defense Technology, Changsha, Hunan, China. nudtchenym@gmail.com
Zhoujun Li
Xiaofeng Wang
Jiali Feng
Xiaohua Hu
Publication Details: BMC genomics, v 11 Suppl 2(2), pp S11-S11
Publisher: Springer BMC; England
Resource Type: Journal article
Language: English
Academic Unit: Information Science
Web of Science ID: WOS:000289200100011
Scopus ID: 2-s2.0-78149317292
Other Identifier: 991014877895104721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration; International collaboration
Web of Science research areas: Biotechnology & Applied Microbiology; Genetics & Heredity
