Content-based search of gene expression databases using binary fingerprints of differential expression profiles
Journal article   Peer reviewed

Francis Bell and Ahmet Sacan
Network modeling and analysis in health informatics and bioinformatics (Wien), v 4(1)
01 Dec 2015

Abstract

Life Sciences & Biomedicine; Mathematical & Computational Biology; Science & Technology
The availability and rapid growth of microarray databases have made integrated analysis of these databases computationally challenging. We present a novel approach to content-based searching in microarray databases that uses binary vector representations and is inspired by the chemoinformatics field. A benchmark compendium of microarray datasets is established for evaluating content-based searching. Differential expression profiles from microarray experiments are represented either as floating point vectors or as concise binary vectors. The benchmark compendium is searched using several distance measures for determining similarity. We demonstrate that binary vector representations achieve accuracies equivalent to or better than those of floating point measures while significantly reducing the time required to search a microarray database, owing to fast bitwise operations and reduced memory requirements. Experiments on a large database of binary vector representations demonstrate that a modified Tanimoto distance measure is best suited for content-based search of differential microarray profiles.
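
To make the idea of binary fingerprints and bitwise distance computation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a differential expression profile is a list of log fold changes, that a gene is marked up- or down-regulated when it crosses a hypothetical threshold of 1.0, and that similarity is scored with the standard Tanimoto distance. The abstract does not specify the modified Tanimoto measure, so it is not reproduced here. All function names are illustrative.

```python
def to_fingerprint(log_fold_changes, threshold=1.0):
    """Pack a profile into two integer bitmasks: (up-regulated, down-regulated).

    The threshold is an assumption made for this sketch only.
    """
    up, down = 0, 0
    for i, value in enumerate(log_fold_changes):
        if value >= threshold:
            up |= 1 << i
        elif value <= -threshold:
            down |= 1 << i
    return up, down


def popcount(bits):
    """Count the set bits in a non-negative integer."""
    return bin(bits).count("1")


def tanimoto_distance(fp_a, fp_b):
    """Standard Tanimoto distance over the combined up and down bitmasks."""
    shared = popcount(fp_a[0] & fp_b[0]) + popcount(fp_a[1] & fp_b[1])
    total = popcount(fp_a[0] | fp_b[0]) + popcount(fp_a[1] | fp_b[1])
    if total == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - shared / total


def search(query_fp, database):
    """Return database keys ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: tanimoto_distance(query_fp, database[name]))


# Toy profiles over four genes (made-up values for illustration).
query = to_fingerprint([2.0, -1.5, 0.1, 1.2])
database = {
    "similar": to_fingerprint([1.8, -2.0, 0.0, 0.2]),
    "opposite": to_fingerprint([-2.0, 1.5, 0.0, 0.0]),
}
ranking = search(query, database)
```

Because each fingerprint is a pair of integers, the intersection and union reduce to bitwise AND and OR followed by a bit count, which is the kind of operation the abstract credits for the speed and memory savings.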

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas
Mathematical & Computational Biology