Investigating Zero- and Few-shot Generalization in Fact Verification

Fact verification benchmark overview.

Abstract

We explore zero- and few-shot generalization for fact verification (FV), which aims to generalize the FV model trained on well-resourced domains, such as Wikipedia, to low-resourced domains that lack human annotations. To this end, we first construct a benchmark dataset collection which contains 11 FV datasets representing 6 domains. We conduct an empirical analysis of generalization across these FV datasets, finding that current models generalize poorly. Our analysis reveals that several factors affect generalization, including dataset size, length of evidence, and the type of claims. Finally, we show that two directions of work improve generalization: incorporating domain knowledge via pretraining on specialized domains, and automatically generating training data via claim generation.

Publication
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Liangming Pan
Doctoral Alumnus (Apr ’22). Thesis: Towards Generating Deep Questions from Text.

Yunxiang Zhang
Remote Undergraduate Intern (Aug ’21). Project Topic: Robustness in NLP

Min-Yen Kan
Associate Professor

WING lead; interests include Digital Libraries, Information Retrieval and Natural Language Processing.