Chinese Informal Word Normalization: an Experimental Study
Research Area: Natural Language Processing
Year: 2013
Type of Publication: In Proceedings
Keywords: Informal Word Translation
  • Aobo Wang
  • Min-Yen Kan
  • Daniel Andrade
  • Takashi Onishi
  • Kai Ishikawa
We study the linguistic phenomenon of informal words in the domain of Chinese microtext and present a novel method for normalizing Chinese informal words to their formal equivalents. We formalize the task as a classification problem and propose rule-based and statistical features to model three plausible channels that explain the connection between formal and informal pairs. Our two-stage selection-classification model is evaluated on a crowdsourced corpus and achieves a normalization precision of 89.5% across the different channels, significantly improving the state-of-the-art.