[ChimeText] 4 May (tomorrow!) Chia Tee Kiah / Lattice-Based Statistical Spoken Document Retrieval
Min-Yen Kan
knmnyn at gmail.com
Sun May 3 14:02:34 SGT 2009
Dear all:
A gentle reminder of Tee Kiah's Ph.D. defense tomorrow. Please come to
show your support!
--Min
SPEAKER: Mr Chia Tee Kiah
TITLE: Lattice-Based Statistical Spoken Document Retrieval
VENUE: MR6 (AS6 05-10)
TIME: 2-3:30 pm
DATE: 4 May 2009
ABSTRACT:
Recent research efforts on spoken document retrieval (SDR) have tried
to overcome the low quality of 1-best automatic speech recognition
transcripts -- especially for conversational speech -- by using
statistics derived from speech lattices containing multiple
transcription hypotheses as output by a speech recognizer. However,
these efforts have invariably used the classical vector space
retrieval model.
In this thesis, I present a lattice-based SDR method based on a
statistical approach to information retrieval. I formulate a way to
estimate statistical models for documents from expected word counts
derived from lattices; query-document relevance is computed as a log
probability under such models. Experiments show that my method
outperforms statistical retrieval using 1-best transcripts, a recent
lattice-based vector space method, and BM25 using lattice statistics.
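As a rough illustration of the idea above (not the thesis's actual formulation), the sketch below scores a query against each document by its log probability under a unigram model estimated from expected word counts. It assumes the expected counts have already been derived from the lattices; the Dirichlet-style smoothing, the values of mu and floor, and all function and variable names are assumptions made for this example only.

```python
import math
from collections import Counter

def collection_model(doc_counts):
    """Pool expected counts over all documents into a background unigram model."""
    total = Counter()
    for counts in doc_counts.values():
        total.update(counts)  # adds the (possibly fractional) expected counts
    mass = sum(total.values())
    return {w: c / mass for w, c in total.items()}

def query_log_prob(query, counts, background, mu=10.0, floor=1e-9):
    """Log probability of the query words under a smoothed document model.

    The smoothing scheme (Dirichlet prior with weight mu) is an assumption
    of this sketch; the abstract does not say which estimator is used.
    """
    doc_len = sum(counts.values())
    score = 0.0
    for w in query:
        p_bg = background.get(w, floor)  # floor for words unseen in the corpus
        p = (counts.get(w, 0.0) + mu * p_bg) / (doc_len + mu)
        score += math.log(p)
    return score

def rank(query, doc_counts, mu=10.0):
    """Return document ids sorted by decreasing query log probability."""
    background = collection_model(doc_counts)
    scores = {d: query_log_prob(query, c, background, mu)
              for d, c in doc_counts.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical expected word counts, as if summed over lattice hypotheses.
doc_counts = {
    "d1": {"lattice": 1.6, "speech": 0.9, "retrieval": 0.2},
    "d2": {"weather": 1.4, "speech": 0.3, "report": 0.8},
}
ranking = rank(["lattice", "retrieval"], doc_counts)
```

Because the counts are expectations rather than integers, a word that appears only in low-probability lattice paths still contributes a small amount of evidence, which a 1-best transcript would discard.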
I also extend my proposed SDR method to the task of query-by-example
SDR -- retrieving documents from a speech corpus, where the queries
are themselves full-fledged spoken documents (query exemplars).
BIODATA:
Chia Tee Kiah is a PhD candidate (2003-present) in the School of
Computing, National University of Singapore. He is supervised by A/P
Ng Hwee Tou from the School of Computing and Dr. Li Haizhou from the
Institute for Infocomm Research. He received his B.Comp (Hons) from the
National University of Singapore, Singapore in 2003. He is currently
working on methods for improving spoken document retrieval.
Upcoming Talks:
4 May - Chia Tee Kiah / Lattice-Based Statistical Spoken Document
Retrieval
5 May - Gina-Anne Levow / Context and Learning in Multilingual Tone
and Pitch Accent Recognition
30 Jun - Wang Kai and Ye Shiren / "A Syntactic Tree Matching Approach
to Finding Similar Questions in Community-based QA Services" and
"Summarizing Definition from Wikipedia"