Software Defect Prediction based on JavaBERT and CNN-BiLSTM

Kun Cheng, Shingo Takada

Research output: Contribution to journal › Conference article › peer-review


Software defects can lead to severe issues in software systems, such as errors, security vulnerabilities, and decreased performance. Early prediction of software defects can prevent these problems, reduce development costs, and enhance system reliability. However, existing methods often focus on manually crafted code features and overlook the rich semantic and contextual information in program code. In this paper, we propose a novel approach that integrates JavaBERT-based embeddings with a CNN-BiLSTM model for software defect prediction. Our model considers code context and captures code patterns and dependencies throughout the code, thereby improving prediction performance. We use Optuna to search for optimal hyperparameters. Experiments on the PROMISE dataset show that our approach outperforms baseline models, particularly in leveraging code semantics to enhance defect prediction performance.

Original language: English
Pages (from-to): 51-59
Number of pages: 9
Journal: CEUR Workshop Proceedings
Publication status: Published - 2023
Event: Joint of the 5th International Workshop on Experience with SQuaRE Series and its Future Direction and the 11th International Workshop on Quantitative Approaches to Software Quality, IWESQ-QuASoQ 2023 - Hybrid, Seoul, Korea, Republic of
Duration: 2023 Dec 4 → …


  • BiLSTM
  • CNN
  • JavaBERT
  • Optuna
  • Software defect prediction

ASJC Scopus subject areas

  • General Computer Science

