Two plus two does not equal three: statistical tests for multiple genome comparison.
Gene clusters that span three or more chromosomal regions are of increasing importance, yet statistical tests to validate such clusters are in their infancy. Current approaches either conduct several pairwise comparisons or consider only the number of genes that occur in all of the regions. In this paper, we provide statistical tests for clusters spanning exactly three regions based on genome models of typical comparative genomics problems, including analysis of conserved linkage within multiple species and identification of large-scale duplications. Our tests are the first to combine evidence from genes shared among all three regions and genes shared between pairs of regions. We show that our tests of clusters spanning three regions are more sensitive than existing approaches, and can thus be used to identify more diverged homologous regions.