SCITAB – A Benchmark for Scientific Table-Based Fact-Checking

Example from SCITAB showing a claim and its reasoning process.

SCITAB was developed to fill the gap in scientific fact-checking benchmarks by incorporating claims derived from actual scientific papers rather than crowd-sourced ones. These claims require multi-step reasoning to verify against evidence in tables. The dataset was created using a human–model collaboration approach, where real-world claims were filtered, counterclaims were generated using InstructGPT, and all claims were manually verified by expert annotators.

The dataset has been extensively evaluated with state-of-the-art models, including TAPAS, TAPEX, Vicuna, and GPT-4, revealing that existing table-based models struggle with SCITAB. Only GPT-4 shows notable improvements, highlighting the dataset’s challenge and potential for advancing research in table-based reasoning.

SCITAB was presented at EMNLP 2023, and the dataset is publicly available for further research. Researchers interested in compositional reasoning, table fact-checking, and model robustness are encouraged to explore SCITAB and contribute to its future developments.

Xinyuan Lu
Xinyuan Lu
ISEP Doctoral Student (Jan ‘20)

PhD Candidate January 2020 Intake

Liangming Pan
Doctoral Alumnus (Apr ‘22). Thesis: Towards Generating Deep Questions from Text.

Doctoral Alumnus (Apr ‘22).

Min-Yen Kan
Min-Yen Kan
Associate Professor

WING lead; interests include Digital Libraries, Information Retrieval and Natural Language Processing.