TY - JOUR
T1 - A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model
AU - Akiyama, Manato
AU - Sato, Kengo
AU - Sakakibara, Yasubumi
N1 - Funding Information:
This work was supported in part by a Grant-in-Aid for Scientific Research (C) (KAKENHI) (No. 16K00404) from the Japan Society for the Promotion of Science (JSPS) to K.S. This work was also supported in part by a MEXT-supported Program for the Strategic Research Foundation at Private Universities. The supercomputer system was provided by the National Institute of Genetics (NIG), Research Organization of Information and Systems (ROIS).
Publisher Copyright:
© 2018 World Scientific Publishing Europe Ltd.
PY - 2018/12/1
Y1 - 2018/12/1
N2 - A popular approach for predicting RNA secondary structure is the thermodynamic nearest-neighbor model that finds a thermodynamically most stable secondary structure with minimum free energy (MFE). For further improvement, an alternative approach that is based on machine learning techniques has been developed. The machine learning-based approach can employ a fine-grained model that includes much richer feature representations with the ability to fit the training data. Although a machine learning-based fine-grained model achieved extremely high performance in prediction accuracy, a possibility of the risk of overfitting for such a model has been reported. In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach and the machine learning-based weighted approach. Our fine-grained model combines the experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed contexts of features that are trained by the structured support vector machine (SSVM) with the ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy compared with existing methods, and heavy overfitting cannot be observed. The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.
AB - A popular approach for predicting RNA secondary structure is the thermodynamic nearest-neighbor model that finds a thermodynamically most stable secondary structure with minimum free energy (MFE). For further improvement, an alternative approach that is based on machine learning techniques has been developed. The machine learning-based approach can employ a fine-grained model that includes much richer feature representations with the ability to fit the training data. Although a machine learning-based fine-grained model achieved extremely high performance in prediction accuracy, a possibility of the risk of overfitting for such a model has been reported. In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach and the machine learning-based weighted approach. Our fine-grained model combines the experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed contexts of features that are trained by the structured support vector machine (SSVM) with the ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy compared with existing methods, and heavy overfitting cannot be observed. The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.
KW - RNA secondary structure prediction
KW - structured support vector machine
KW - thermodynamic model
UR - http://www.scopus.com/inward/record.url?scp=85059771961&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059771961&partnerID=8YFLogxK
U2 - 10.1142/S0219720018400255
DO - 10.1142/S0219720018400255
M3 - Article
C2 - 30616476
AN - SCOPUS:85059771961
SN - 0219-7200
VL - 16
JO - Journal of Bioinformatics and Computational Biology
JF - Journal of Bioinformatics and Computational Biology
IS - 6
M1 - 18400255
ER -