TY - JOUR
T1 - Accurate identification of orthologous segments among multiple genomes
AU - Hachiya, Tsuyoshi
AU - Osana, Yasunori
AU - Popendorf, Kris
AU - Sakakibara, Yasubumi
N1 - Funding Information: Ministry of Education, Culture, Sports, Science and Technology of Japan Grant-in-Aid for Scientific Research on Priority Area ‘Comparative Genomics’ (No. 17018029).
PY - 2009
Y1 - 2009
N2 - Motivation: The accurate detection of orthologous segments (also referred to as syntenic segments) plays a key role in comparative genomics, as it is useful for inferring genome rearrangement scenarios and computing whole-genome alignments. Although a number of algorithms for detecting orthologous segments have been proposed, none of them contain a framework for optimizing their parameter values. Methods: In the present study, we propose an algorithm, named OSfinder (Orthologous Segment finder), which uses a novel scoring scheme based on stochastic models. OSfinder takes as input the positions of short homologous regions (also referred to as anchors) and explicitly discriminates orthologous anchors from non-orthologous anchors by using Markov chain models which represent respective geometric distributions of lengths of orthologous and non-orthologous anchors. Such stochastic modeling makes it possible to optimize parameter values by maximizing the likelihood of the input dataset, and to automate the setting of the optimal parameter values. Results: We validated the accuracies of orthology-mapping algorithms on the basis of their consistency with the orthology annotation of genes. Our evaluation tests using mammalian and bacterial genomes demonstrated that OSfinder shows higher accuracy than previous algorithms.
UR - http://www.scopus.com/inward/record.url?scp=63549088678&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=63549088678&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btp070
DO - 10.1093/bioinformatics/btp070
M3 - Article
C2 - 19188192
AN - SCOPUS:63549088678
SN - 1367-4803
VL - 25
SP - 853
EP - 860
JO - Bioinformatics
JF - Bioinformatics
IS - 7
ER -