[ChimeText] 30 Nov ([15:00 - 16:30] @ Video Conference Room (COM1 02-13)) Kernel Engineering on Parse Trees
Min-Yen Kan
knmnyn at gmail.com
Tue Nov 29 03:01:16 SGT 2011
A reminder about tomorrow's seminar. See you there! -M
SPEAKER: Jun Sun
TITLE: Kernel Engineering on Parse Trees
VENUE: Video Conference Room (COM1 02-13)
DATE AND TIME: 30 November 2011 (Wednesday), [15:00 - 16:30]
CHAIRED BY: Prof Chew Lim Tan, Dept of Computer Science
Dr Min Zhang, I2R
ABSTRACT: Natural Language Processing (NLP) has recently benefited greatly from progress in machine learning methods for large, data-driven applications. Some NLP tasks require complex data representations in order to analyze syntactic and semantic features in depth. In many cases the input data is represented as sequences, trees or even graphs. Traditional feature-based methods transform such structured input into vector representations through sophisticated feature engineering, an approach that has been argued to be incapable of fully exploring the structural features. Alternatively, kernel methods can explore a very high-dimensional feature space for these complex input structures without explicitly representing the input data as a feature vector. For tree structures, tree kernels can explore the subtree features in parse trees without explicitly enumerating each type of subtree.
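To make the idea of counting subtree features without enumerating them concrete, here is a minimal sketch of one well-known convolution tree kernel, the subset-tree kernel of Collins and Duffy. It is offered only as an illustration of tree kernels in general and is not the tree sequence kernel proposed in the talk. The nested-tuple tree encoding, the function names and the decay parameter are all assumptions made for this example.

```python
from functools import lru_cache

# Hypothetical encoding: an internal node is a tuple (label, child, ...),
# and a leaf (word) is a plain string.

def label(node):
    """Return the label of an internal node, or the word itself for a leaf."""
    return node[0] if isinstance(node, tuple) else node

def production(node):
    """Grammar production at an internal node: its label plus, for each
    child, the child's label and whether that child is an internal node."""
    return (node[0], tuple((label(c), isinstance(c, tuple)) for c in node[1:]))

def internal_nodes(tree):
    """Yield every internal (non-leaf) node of the tree."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from internal_nodes(child)

def tree_kernel(t1, t2, decay=1.0):
    """Weighted count of tree fragments shared by t1 and t2, computed by
    dynamic programming over node pairs rather than by listing fragments."""

    @lru_cache(maxsize=None)
    def common(n1, n2):
        # Fragments rooted at n1 and n2 can only match if the productions match.
        if production(n1) != production(n2):
            return 0.0
        score = decay
        for c1, c2 in zip(n1[1:], n2[1:]):
            # Matching productions guarantee c1 and c2 are the same kind of node.
            if isinstance(c1, tuple):
                # Either stop at this child (1) or extend with a shared fragment.
                score *= 1.0 + common(c1, c2)
        return score

    return sum(common(a, b)
               for a in internal_nodes(t1)
               for b in internal_nodes(t2))

# Two toy parse trees that differ only in the noun.
dog = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "barks")))
cat = ("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "barks")))
```

With decay set to 1.0, tree_kernel(dog, dog) returns 24.0, the number of fragments in the first tree, and tree_kernel(dog, cat) returns 15.0, the number of fragments the two trees share. A decay below 1.0 down-weights larger fragments.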
However, previous tree kernels explore structural features only through single subtrees. A large single subtree may occur sparsely in the data set, which prevents large structures from being used effectively. Sometimes only certain parts of a large subtree are beneficial rather than the entire subtree, in which case using the entire structure may introduce noise. To address these deficiencies, this dissertation systematically investigates the phrase parse tree and attempts to design more sophisticated kernels that explore the structural features embedded in phrase parse trees beyond the single subtree representation.
Specifically, this dissertation proposes tree sequence based kernels, which adopt the subtree sequence structure as the basic feature type for exploring structural features in phrase parse trees. A variety of kernels are built on the subtree sequence structure, and its advantages are demonstrated on various NLP applications. Using tree (sequence) kernels over multiple parse trees, a kernel-based alignment model is proposed for the task of bilingual subtree alignment, with which translation performance can be effectively improved. From a more general perspective, this dissertation systematically explores the disconnected structure features in parse trees by means of kernels. In this respect, it may provide novel views of structure features for NLP applications.
BIODATA: Mr Jun Sun has been a Ph.D. student in the School of Computing (SoC), National University of Singapore (NUS), since 2007, under the supervision of Prof. Chew Lim Tan and Dr. Min Zhang. He received his B.Sc. degree in Computer Science from Harbin Institute of Technology in 2006. His research interests include Machine Translation, Statistical Parsing and Statistical Machine Learning.
Upcoming seminars:
30 Nov Kernel Engineering on Parse Trees by Jun Sun