Discourse and Language Models

 

Abstract

Recent advancements demonstrate the proficiency of (large) language models in sentence or paragraph-level tasks, yet it remains unclear whether these models can discern latent structures in a way humans find meaningful. This project concentrates on the comprehension of discourse properties by language models. We present three research questions: (1) Can we construct behavioral tests to assess models’ fidelity in understanding discourse? (2) Can we make discourse analysis verifiable through the implementation of exact memory? (3) Can we utilize distant supervision to enhance discourse modeling, surpassing traditional language models’ training objectives like next-token prediction?

Project Members

Publications

We have an unpublished manuscript. Let us know if you want to read it!

 

Picture is generated by Midjourney