Logo image
Biclustering of microarray data with MOSPO based on crowding distance
Journal article   Open access   Peer reviewed

Biclustering of microarray data with MOSPO based on crowding distance

Junwan Liu, Zhoujun Li, Xiaohua Hu and Yiming Chen
BMC bioinformatics, v 10(S4), pp S9-S9
29 Apr 2009
PMID: 19426457
url
https://doi.org/10.1186/1471-2105-10-s4-s9View
Published, Version of Record (VoR)CC BY V4.0 Open
url
https://doi.org/10.1186/1471-2105-10-S4-S9View
Published, Version of Record (VoR) Open

Abstract

Biochemical Research Methods Biochemistry & Molecular Biology Biotechnology & Applied Microbiology Life Sciences & Biomedicine Mathematical & Computational Biology Science & Technology
Background: High-throughput microarray technologies have generated and accumulated massive amounts of gene expression datasets that contain expression levels of thousands of genes under hundreds of different experimental conditions. The microarray datasets are usually presented in 2D matrices, where rows represent genes and columns represent experimental conditions. The analysis of such datasets can discover local structures composed by sets of genes that show coherent expression patterns under subsets of experimental conditions. It leads to the development of sophisticated algorithms capable of extracting novel and useful knowledge from a biomedical point of view. In the medical domain, these patterns are useful for understanding various diseases, and aid in more accurate diagnosis, prognosis, treatment planning, as well as drug discovery. Results: In this work we present the CMOPSOB (Crowding distance based Multi-objective Particle Swarm Optimization Biclustering), a novel clustering approach for microarray datasets to cluster genes and conditions highly related in sub-portions of the microarray data. The objective of biclustering is to find sub-matrices, i.e. maximal subgroups of genes and subgroups of conditions where the genes exhibit highly correlated activities over a subset of conditions. Since these objectives are mutually conflicting, they become suitable candidates for multi-objective modelling. Our approach CMOPSOB is based on a heuristic search technique, multi-objective particle swarm optimization, which simulates the movements of a flock of birds which aim to find food. In the meantime, the nearest neighbour search strategies based on crowding distance and -dominance can rapidly converge to the Pareto front and guarantee diversity of solutions. We compare the potential of this methodology with other biclustering algorithms by analyzing two common and public datasets of gene expression profiles. In all cases our method can find localized structures related to sets of genes that show consistent expression patterns across subsets of experimental conditions. The mined patterns present a significant biological relevance in terms of related biological processes, components and molecular functions in a species-independent manner. Conclusion: The proposed CMOPSOB algorithm is successfully applied to biclustering of microarray dataset. It achieves a good diversity in the obtained Pareto front, and rapid convergence. Therefore, it is a useful tool to analyze large microarray datasets.

Metrics

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
International collaboration
Web of Science research areas
Biochemical Research Methods
Biotechnology & Applied Microbiology
Mathematical & Computational Biology
Logo image