JavaRAP

Introduction

JavaRAP is an implementation of the classic Resolution of Anaphora Procedure (RAP) given by Lappin and Leass (1994). It resolves third-person pronouns and lexical anaphors, and identifies pleonastic pronouns. The implementation was originally built to provide anaphora resolution results to our TREC 2003 Q&A system. Since RAP is so widely known, we soon decided to make this implementation freely available to the research community.
We named the implementation JavaRAP because it is written in Java. This choice of language makes it easily portable across hardware platforms, although moving it between operating systems is less straightforward.

Features

News

Feb 12, 2006:
JavaRAP now works on Windows® after a change to the way it calls the Charniak parser (it previously depended on a pipe);
The MANIFEST file is corrected, so JavaRAP can now be started more simply with "java -jar AnaphoraResolution.jar";
Source code released.
July 25, 2005:
A bug in SentenceSplitter fixed.
Source of SentenceSplitter released.
May 25, 2005:
Thinking about releasing the source code under the GPL. An alpha version is ready for testing now. Please drop me an email if you want to check it out. Thanks ...
Started the FAQ.
Sep 27, 2004 (Chinese Mid-Autumn Festival):
Package name changed from edu.nus.comp.NLP.tool.anaphoraresolution to edu.nus.comp.nlp.tool.anaphoraresolution.

Demo

Try it out online!

Download

JavaRAP_1.11(classes) (code)
JavaRAP_1.1(classes) (code)
JavaRAP_1.02(classes only)
JavaRAP_1.01(classes only)
Sentence Splitter

Installation

  1. Decompress the tar file. 
  2. Make sure you have the following files and directories under "JavaRAP_x.y":
  3. Change into the "JavaRAP_x.y" directory, where "x.y" is the version number, and you are ready to go!

Usage

Each line in a resultFile contains a single record, which shows one anaphor and its antecedent, in the form:

    (sentenceIndex,tokenOffset) antecedent <-- (sentenceIndex,tokenOffset) anaphor,

where sentenceIndex is the index of the sentence in the complete article, starting from 0, and tokenOffset is the index of the first token (punctuation or word) of the phrase in the sentence, also starting from 0.
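
For downstream processing, a record in the form described above can be read back with a small parser. The following is only a minimal sketch, not part of the JavaRAP distribution: the class name ResultRecord and its fields are hypothetical, and the regular expression assumes that the two positions and the "<--" arrow are separated by whitespace. It also tolerates an optional trailing comma, since the description does not make clear whether one appears in real output.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of JavaRAP): one parsed line of a resultFile. */
final class ResultRecord {
    // Assumed layout: "(s,t) antecedent <-- (s,t) anaphor", optionally ending in a comma.
    private static final Pattern LINE = Pattern.compile(
        "^\\s*\\((\\d+),(\\d+)\\)\\s+(.+?)\\s+<--\\s+\\((\\d+),(\\d+)\\)\\s+(.+?),?\\s*$");

    final int antecedentSentence; // 0-based sentence index of the antecedent
    final int antecedentOffset;   // 0-based offset of the antecedent's first token
    final String antecedent;      // antecedent phrase, possibly several tokens
    final int anaphorSentence;    // 0-based sentence index of the anaphor
    final int anaphorOffset;      // 0-based offset of the anaphor's first token
    final String anaphor;         // anaphor text

    private ResultRecord(Matcher m) {
        antecedentSentence = Integer.parseInt(m.group(1));
        antecedentOffset = Integer.parseInt(m.group(2));
        antecedent = m.group(3);
        anaphorSentence = Integer.parseInt(m.group(4));
        anaphorOffset = Integer.parseInt(m.group(5));
        anaphor = m.group(6);
    }

    /** Returns the parsed record, or an empty Optional if the line does not match the assumed layout. */
    static Optional<ResultRecord> parse(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? Optional.of(new ResultRecord(m)) : Optional.empty();
    }
}
```

Calling ResultRecord.parse(line) on each line of a resultFile yields one record per resolved anaphor; lines that do not fit the assumed layout come back as an empty Optional, so they can be skipped or logged rather than causing an exception.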

FAQ & Special Issues

References

Long Qiu, Min-Yen Kan and Tat-Seng Chua. (2004). A Public Reference Implementation of the RAP Anaphora Resolution Algorithm. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004). Vol. I, pp. 291-294.

Links

The poster we presented at LREC 2004, Lisbon, Portugal.
Check out our Natural Language Processing / Information Retrieval research framework webpage to find other NLP resources we have.

Contact

QIUL at MYSOC dot NET