TY - JOUR
T1 - Genome-wide searching with base-pairing kernel functions for noncoding RNAs
T2 - Computational and expression analysis of snoRNA families in Caenorhabditis elegans
AU - Morita, Kensuke
AU - Saito, Yutaka
AU - Sato, Kengo
AU - Oka, Kotaro
AU - Hotta, Kohji
AU - Sakakibara, Yasubumi
N1 - Funding Information:
New Energy and Industrial Technology Development Organization (NEDO) of Japan (Functional RNA Project); Ministry of Education, Culture, Sports, Science and Technology of Japan (Grant-in-Aid for Scientific Research on Priority Area “Comparative Genomics” No. 17018029). Funding for open access charge: Ministry of Education, Culture, Sports, Science and Technology of Japan (Grant-in-Aid for Scientific Research on Priority Area “Comparative Genomics” No. 17018029).
PY - 2009
Y1 - 2009
N2 - Despite the accumulating research on noncoding RNAs (ncRNAs), it is likely that we are seeing only the tip of the iceberg regarding our understanding of the functions and the regulatory roles served by ncRNAs in cellular metabolism, pathogenesis and host-pathogen interactions. Therefore, more powerful computational and experimental tools for analyzing ncRNAs need to be developed. To this end, we propose novel kernel functions, called base-pairing profile local alignment (BPLA) kernels, for analyzing functional ncRNA sequences using support vector machines (SVMs). We extend the local alignment kernels for amino acid sequences in order to handle RNA sequences by using STRAL's scoring function, which takes into account sequence similarities as well as upstream and downstream base-pairing probabilities, thus enabling us to model secondary structures of RNA sequences. As a test of the performance of BPLA kernels, we applied our kernels to the problem of discriminating members of an RNA family from nonmembers using SVMs. The results indicated that the discrimination ability of our kernels is stronger than that of other existing methods. Furthermore, we demonstrated the applicability of our kernels to the problem of genome-wide search of snoRNA families in the Caenorhabditis elegans genome, and confirmed that the expression is valid in 14 out of 48 of our predicted candidates by using qRT-PCR. Finally, six highly expressed candidates were identified as the original target regions by DNA sequencing.
AB - Despite the accumulating research on noncoding RNAs (ncRNAs), it is likely that we are seeing only the tip of the iceberg regarding our understanding of the functions and the regulatory roles served by ncRNAs in cellular metabolism, pathogenesis and host-pathogen interactions. Therefore, more powerful computational and experimental tools for analyzing ncRNAs need to be developed. To this end, we propose novel kernel functions, called base-pairing profile local alignment (BPLA) kernels, for analyzing functional ncRNA sequences using support vector machines (SVMs). We extend the local alignment kernels for amino acid sequences in order to handle RNA sequences by using STRAL's scoring function, which takes into account sequence similarities as well as upstream and downstream base-pairing probabilities, thus enabling us to model secondary structures of RNA sequences. As a test of the performance of BPLA kernels, we applied our kernels to the problem of discriminating members of an RNA family from nonmembers using SVMs. The results indicated that the discrimination ability of our kernels is stronger than that of other existing methods. Furthermore, we demonstrated the applicability of our kernels to the problem of genome-wide search of snoRNA families in the Caenorhabditis elegans genome, and confirmed that the expression is valid in 14 out of 48 of our predicted candidates by using qRT-PCR. Finally, six highly expressed candidates were identified as the original target regions by DNA sequencing.
UR - http://www.scopus.com/inward/record.url?scp=63349094333&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=63349094333&partnerID=8YFLogxK
U2 - 10.1093/nar/gkn1054
DO - 10.1093/nar/gkn1054
M3 - Article
C2 - 19129214
AN - SCOPUS:63349094333
SN - 0305-1048
VL - 37
SP - 999
EP - 1009
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 3
ER -