Michalis Vlachos Presents at the CSSH Seminar: Searching Digital Libraries with Language Models


Date
28 Jan, 2026
Event
CSSH Seminar
Location
Room AS7-0102, The Shaw Foundation Building
5 Arts Link, Level 3, Singapore 117570

Our group is delighted to host Michalis Vlachos as a Sabbatical Visiting Professor. On January 28th, 2026, Michalis presented his work on “Searching Digital Libraries with Language Models” at a seminar hosted by the Centre for Computational Social Science and Humanities (CSSH) at NUS.

Abstract:

Digital libraries now hold tens of millions of pages, yet most are still accessed through keyword-based search over potentially noisy OCR text. As collections expand, traditional search interfaces struggle to support meaningful discovery, context, and evidence-based answers.

This talk explores how modern language models can enhance the entire digital library pipeline. We examine how LLMs can refine OCR output, clean historical text, and enable natural-language search through retrieval-augmented generation. Using real-world archival data, we show improvements in character and word error rates, as well as downstream gains in retrieval quality, evaluated through both standard metrics and LLM-based judging for faithfulness, correctness, and relevancy.

We also look beyond ranking results, discussing evidence-driven answers and immersive, augmented-reality interfaces that open new ways to explore large historical collections. We conclude by reflecting on how these advances can improve transparency, reduce misinformation, and reshape the future of search in digital libraries.

Below is a gallery from the seminar. Photo credit to Min and Yisong!

Michalis Vlachos at CSSH Seminar
Michalis Vlachos presenting at CSSH Seminar
Michalis Vlachos
Sabbatical Visiting Professor

Professor of Information Systems at the University of Lausanne, leading the “Big Data and Machine Learning” lab.