Large Language Models

Reasoning and Planning for Large Language Models

ICLR 2025 Workshop on Reasoning and Planning for Large Language Models

28 Apr 2025, Singapore Expo

Jiaying Wu, Min-Yen Kan

Beyond Memorization: The Challenge of Random Memory Access in Language Models

We investigate whether a generative LM (e.g., GPT-2) is able to access its memory sequentially or randomly. Through carefully designed synthetic tasks, we reveal that LMs can access their memory sequentially but struggle to randomly access memorized content.

Tongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin

Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations

We propose Discursive Socratic Questioning (DiSQ), a new evaluation measure for discourse semantics. Inspired by the Socratic method, DiSQ involves asking models about key event relations, testing their robustness to counterfactuals, and ensuring consistency with equivalent questions. Experiments show that GPT-4 achieves only 41% of the DiSQ scores. We recommend using context and discourse connectives as essential linguistic features to enhance discourse comprehension.

Yisong Miao, Hongfu Liu, Wenqiang Lei, Nancy F. Chen, Min-Yen Kan
