TY - GEN
T1 - Performing external join operator on PostgreSQL with data transfer approach
AU - Takizawa, Ryota
AU - Kawashima, Hideyuki
AU - Mitsuhashi, Ryuya
AU - Tatebe, Osamu
N1 - Publisher Copyright: © 2018 ACM.
PY - 2018/1/28
Y1 - 2018/1/28
AB - With the development of sensing devices, the volume of data that people manage has been increasing rapidly. Relational database management systems (RDBMSs) play a key role in managing such large data sets. An RDBMS models real-world data as n-ary relational tables. The join operator is one of the most important relational operators, and its acceleration has been studied widely and deeply. How can an RDBMS provide such an efficient join operator? Performance improvements for the join operator have been studied in depth for a decade, and many techniques have already been proposed. The problem we face is how to actually use these techniques in real RDBMSs. We propose implementing an efficient join technique through a data transfer approach. The approach places a hook point inside the RDBMS internals, pulls data streams from the operator pipeline in the RDBMS, applies our original join operator to the data, and finally returns the result to the operator pipeline. In our experiment, the proposed method achieved a 1.42x speedup compared with PostgreSQL. Our code is available on GitHub.
KW - Parallel Hash Join
KW - PostgreSQL
KW - Relational database
UR - http://www.scopus.com/inward/record.url?scp=85044384770&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85044384770&partnerID=8YFLogxK
U2 - 10.1145/3149457.3149480
DO - 10.1145/3149457.3149480
M3 - Conference contribution
AN - SCOPUS:85044384770
T3 - ACM International Conference Proceeding Series
SP - 271
EP - 277
BT - Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2018
PB - Association for Computing Machinery
T2 - 2018 International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2018
Y2 - 28 January 2018 through 31 January 2018
ER -