Evaluating Large Language Models with NeuBAROCO: Syllogistic Reasoning Ability and Human-like Biases

Risako Ando, Takanobu Morishita, Hirohiko Abe, Koji Mineshima, Mitsuhiro Okada

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

This paper investigates whether current large language models exhibit biases in logical reasoning, similar to humans. Specifically, we focus on syllogistic reasoning, a well-studied form of inference in the cognitive science of human deduction. To facilitate our analysis, we introduce a dataset called NeuBAROCO, originally designed for psychological experiments that assess human logical abilities in syllogistic reasoning. The dataset consists of syllogistic inferences in both English and Japanese. We examine three types of biases observed in human syllogistic reasoning: belief biases, conversion errors, and atmosphere effects. Our findings demonstrate that current large language models struggle more with problems involving these three types of biases.

Original language: English
Title of host publication: IWCS2023 - Proceedings of the 4th Natural Logic Meets Machine Learning Workshop, NALOMA 2023
Editors: Stergios Chatzikyriakidis, Valeria de Paiva
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-11
Number of pages: 11
ISBN (Electronic): 9781959429951
Publication status: Published - 2023
Event: 4th Natural Logic Meets Machine Learning Workshop, NALOMA 2023, held at the 15th International Conference on Computational Semantics, IWCS 2023 - Nancy, France
Duration: 2023 Jun 23 → …

Publication series

Name: IWCS2023 - Proceedings of the 4th Natural Logic Meets Machine Learning Workshop, NALOMA 2023

Conference

Conference: 4th Natural Logic Meets Machine Learning Workshop, NALOMA 2023, held at the 15th International Conference on Computational Semantics, IWCS 2023
Country/Territory: France
City: Nancy
Period: 23/6/23 → …

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software
