SCITAB – A Benchmark for Scientific Table-Based Fact-Checking

Figure: Example from SCITAB showing a claim and its reasoning process.

SCITAB was developed to fill the gap in scientific fact-checking benchmarks by incorporating claims derived from actual scientific papers rather than crowd-sourced ones. These claims require multi-step reasoning to verify against evidence in tables. The dataset was created using a human–model collaboration approach, where real-world claims were filtered, counterclaims were generated using InstructGPT, and all claims were manually verified by expert annotators.
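The three-stage pipeline described above can be sketched in Python. This is an illustrative sketch, not the authors' actual code: the `negate` and `verified` callables are hypothetical stand-ins for the InstructGPT counterclaim generator and the expert annotation step, respectively.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    label: str  # e.g. "supports" or "refutes"

def generate_counterclaim(claim: str, negate) -> Claim:
    """Stage 2: produce a counterclaim from a real claim.

    SCITAB used InstructGPT here; `negate` is a stand-in callable
    so this sketch stays self-contained and runnable."""
    return Claim(text=negate(claim), label="refutes")

def build_dataset(raw_claims, negate, verified):
    """Mirror the human-model collaboration pipeline (hypothetical names):
    filter real claims, generate counterclaims, and keep only pairs
    that pass expert verification."""
    dataset = []
    for text in raw_claims:
        original = Claim(text=text, label="supports")
        counter = generate_counterclaim(text, negate)
        # Stage 3: both claims must pass human verification to be included.
        if verified(original) and verified(counter):
            dataset.extend([original, counter])
    return dataset

# Toy run: a trivial negation rule and a verifier that accepts everything.
claims = ["Model A outperforms Model B on task X."]
data = build_dataset(
    claims,
    negate=lambda s: "It is not the case that " + s[0].lower() + s[1:],
    verified=lambda c: True,
)
```

In the real dataset construction, the verification step is where expert annotators correct or discard model-generated counterclaims, which is what distinguishes this collaboration from fully automatic generation.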

The dataset has been extensively evaluated with state-of-the-art models, including TAPAS, TAPEX, Vicuna, and GPT-4, revealing that existing table-based models struggle with SCITAB. Only GPT-4 shows notable improvements, highlighting the dataset’s challenge and potential for advancing research in table-based reasoning.

SCITAB was presented at EMNLP 2023, and the dataset is publicly available for further research. Researchers interested in compositional reasoning, table fact-checking, and model robustness are encouraged to explore SCITAB and contribute to its future developments.

Xinyuan Lu
ISEP Doctoral Alumnus (Jan ‘20). Thesis: Fact-Checking Complex Claims with Large Language Models.

Liangming Pan
Doctoral Alumnus (Apr ‘22). Thesis: Towards Generating Deep Questions from Text.

Min-Yen Kan
Associate Professor

WING lead; interests include Digital Libraries, Information Retrieval, and Natural Language Processing.