Identifying Cycling Genes by Combining Sequence Homology and Expression Data
journal contributionposted on 01.01.1989 by Yong Lu, Roni Rosenfeld, Ziv Bar-Joseph
Any type of content formally published in an academic journal, usually following a peer-review process.
Motivation: The expression of genes during the cell division process has now been studied in many different species. An important goal of these studies is to identify the set of cycling genes. To date, this was done independently for each of the species studied. Due to noise and other data analysis problems, accurately deriving a set of cycling genes from expression data is a hard problem. This is especially true for some of the multicellular organisms, including humans. Results: Here we present the first algorithm that combines microarray expression data from multiple species for identifying cycling genes. Our algorithm represents genes from multiple species as nodes in a graph. Edges between genes represent sequence similarity. Starting with the measured expression values for each species we use Belief Propagation to determine a posterior score for genes. This posterior isusedtodetermineanewset ofcyclinggenes foreachspecies. We applied our algorithm to improve the identification of the set of cell cycle genes in budding yeast and humans. As we show, by incorporating sequence similarity information we were able to obtain a more accurate set of genes compared to methods that rely on expression data alone. Our method was especially successful for the human dataset indicating that it can use a high quality dataset from one species to overcome noise problems in another. Availability: C implementation is available from the supporting