A PDTB-Styled End-to-End Discourse Parser

This is the homepage of an end-to-end discourse parser that parses free text in the style of the Penn Discourse Treebank (PDTB) using a fully data-driven approach. The parser consists of several components joined in a sequential pipeline: a connective classifier, an argument labeler, an explicit classifier, a non-explicit classifier, and an attribution span labeler. The trained parser first identifies all discourse relations, locates and labels their arguments, and then classifies the sense of the relation between each pair of arguments. For the identified relations, the parser also determines any attribution spans associated with them.
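To illustrate the sequential pipeline described above, here is a minimal Python sketch in which each component is a function that takes a document and passes it on to the next stage. It is not the parser's actual code: the `Relation` and `Document` classes, the `TOY_SENSES` lexicon, the `parse` function, and the rule-based logic inside each stage are all hypothetical stand-ins for the trained classifiers, chosen only to show how the five stages might be chained.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Toy lexicon standing in for trained models (assumption, not the real parser).
TOY_SENSES = {"because": "Contingency", "but": "Comparison"}

@dataclass
class Relation:
    kind: str                          # "Explicit" or "Non-explicit"
    sentence: str = ""                 # sentence that holds the connective, if any
    connective: Optional[str] = None
    arg1: str = ""
    arg2: str = ""
    sense: Optional[str] = None
    attribution: Optional[str] = None

@dataclass
class Document:
    sentences: List[str]
    relations: List[Relation] = field(default_factory=list)

def connective_classifier(doc: Document) -> Document:
    # Toy rule: the first lexicon word found in a sentence is a discourse connective.
    for sent in doc.sentences:
        for word in re.findall(r"[a-z]+", sent.lower()):
            if word in TOY_SENSES:
                doc.relations.append(Relation("Explicit", sentence=sent, connective=word))
                break
    return doc

def argument_labeler(doc: Document) -> Document:
    # Toy rule: text before the connective is Arg1, text after it is Arg2.
    for rel in doc.relations:
        if rel.kind == "Explicit":
            pattern = r"\b" + re.escape(rel.connective) + r"\b"
            parts = re.split(pattern, rel.sentence, maxsplit=1, flags=re.IGNORECASE)
            rel.arg1, rel.arg2 = parts[0].strip(" ."), parts[1].strip(" .")
    return doc

def explicit_classifier(doc: Document) -> Document:
    # Toy rule: look the sense up directly from the connective.
    for rel in doc.relations:
        if rel.kind == "Explicit":
            rel.sense = TOY_SENSES[rel.connective]
    return doc

def non_explicit_classifier(doc: Document) -> Document:
    # Simplification: pair adjacent sentences whose second sentence has no explicit
    # connective, and assign a placeholder label instead of a predicted sense.
    explicit = {r.sentence for r in doc.relations if r.kind == "Explicit"}
    for first, second in zip(doc.sentences, doc.sentences[1:]):
        if second not in explicit:
            doc.relations.append(
                Relation("Non-explicit", arg1=first, arg2=second, sense="Unclassified"))
    return doc

def attribution_span_labeler(doc: Document) -> Document:
    # Toy rule: "<word> said" inside either argument counts as an attribution span.
    for rel in doc.relations:
        match = re.search(r"\b\w+ said\b", rel.arg1 + " " + rel.arg2)
        rel.attribution = match.group(0) if match else None
    return doc

# The stages run strictly in order; each one consumes the previous stage's output.
PIPELINE: List[Callable[[Document], Document]] = [
    connective_classifier,
    argument_labeler,
    explicit_classifier,
    non_explicit_classifier,
    attribution_span_labeler,
]

def parse(sentences: List[str]) -> List[Relation]:
    doc = Document(list(sentences))
    for stage in PIPELINE:
        doc = stage(doc)
    return doc.relations

if __name__ == "__main__":
    for rel in parse(["Prices fell because demand dropped.",
                      "Analysts said the trend may continue."]):
        print(rel)
```

Because every stage shares the same signature, a stage can be replaced or tested in isolation. The trade-off of a strictly sequential design is that a mistake in an early stage carries through to every later one.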

License

This software is licensed under the GNU General Public License (GPL).

Download

The Java reimplementation of the parser can be downloaded from here. (Developed and maintained by Ilija)

The old source code in Ruby can be downloaded from here.

Web-based Demonstration

The demo accepts either a PTB file id (e.g., 0296, 2309) or free text, with a blank line separating paragraphs.



Publications

Ziheng Lin, Hwee Tou Ng and Min-Yen Kan (2014). A PDTB-Styled End-to-End Discourse Parser. Natural Language Engineering, 20, pp. 151-184. Cambridge University Press. (pdf)

Ziheng Lin, Chang Liu, Hwee Tou Ng and Min-Yen Kan (2012). Combining Coherence Models and Machine Translation Evaluation Metrics for Summarization Evaluation. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012), Jeju, Korea, July. (pdf)

Ziheng Lin, Hwee Tou Ng and Min-Yen Kan (2011). Automatically Evaluating Text Coherence Using Discourse Relations. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT 2011), Portland, Oregon, USA, June. (pdf)

Ziheng Lin, Hwee Tou Ng and Min-Yen Kan (2010). A PDTB-Styled End-to-End Discourse Parser. Technical Report TRB8/10, School of Computing, National University of Singapore, August. (pdf)

Ziheng Lin, Min-Yen Kan and Hwee Tou Ng (2009). Recognizing Implicit Discourse Relations in the Penn Discourse Treebank. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), Singapore, August. (pdf)

Project Members

Prof. Min-Yen Kan
Prof. Hwee Tou Ng
Ziheng Lin


© 2010-2017 Ziheng Lin