TY - JOUR
T1 - Splicing QTL analysis focusing on coding sequences reveals mechanisms for disease susceptibility loci
AU - Yamaguchi, Kensuke
AU - Ishigaki, Kazuyoshi
AU - Suzuki, Akari
AU - Tsuchida, Yumi
AU - Tsuchiya, Haruka
AU - Sumitomo, Shuji
AU - Nagafuchi, Yasuo
AU - Miya, Fuyuki
AU - Tsunoda, Tatsuhiko
AU - Shoda, Hirofumi
AU - Fujio, Keishi
AU - Yamamoto, Kazuhiko
AU - Kochi, Yuta
N1 - Publisher Copyright: © 2022, The Author(s).
PY - 2022/12
Y1 - 2022/12
AB - Splicing quantitative trait loci (sQTLs) are one of the major causal mechanisms in genome-wide association study (GWAS) loci, but their role in disease pathogenesis is poorly understood. One reason is the complexity of alternative splicing events, which produce many unknown isoforms. Here, we propose two approaches, integration and selection, to address this complexity by focusing on the protein structure of isoforms. First, we integrate isoforms with the same coding sequence (CDS) and identify 369-601 integrated-isoform ratio QTLs (i2-rQTLs), which alter protein structure, in six immune subsets. Second, we select CDS-incomplete isoforms annotated in GENCODE and identify 175-337 isoform-ratio QTLs (i-rQTLs). By comprehensive long-read capture RNA-sequencing of these incomplete isoforms, we reveal 29 full-length isoforms with unannotated CDSs associated with GWAS traits. Furthermore, we show that disease-causal sQTL genes can be identified by evaluating their trans-eQTL effects. Our approaches highlight the understudied role of protein-altering sQTLs and are broadly applicable to other tissues and diseases.
UR - http://www.scopus.com/inward/record.url?scp=85136487866&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85136487866&partnerID=8YFLogxK
U2 - 10.1038/s41467-022-32358-1
DO - 10.1038/s41467-022-32358-1
M3 - Article
C2 - 36002455
AN - SCOPUS:85136487866
SN - 2041-1723
VL - 13
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 4659
ER -