Evaluation

Evaluation is a key bottleneck in much NLP and IR research. To decide whether a summary is a good summary or a search result is relevant, there is often little substitute for having human participants examine the results of a computer algorithm. Our group practices joint evaluation exercises, in which members of the group are asked to volunteer as subjects for one another's evaluations.

Evaluation tools

The Computer Centre has acquired the Eazy-Survey software package for those of you who need to run a survey as part of your research. See the following URL for more information: http://www.nus.edu.sg/comcen/apps/ezsurvey.htm.

Aside from that, if you're willing to learn Perl, you can use some of Min's code to customize a survey and link it to a simple Berkeley DB database to store and retrieve your results.
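Min's Perl code is not reproduced on this page. As a rough illustration of the store-and-retrieve pattern only, here is a minimal Python sketch that uses the standard-library dbm module as a stand-in for Berkeley DB. The function names, the subject IDs, and the answer format are all hypothetical and are not taken from the actual survey code.

```python
import dbm
import json
import os
import tempfile


def save_response(db_path, subject_id, answers):
    """Store one subject's answers under their ID (hypothetical schema)."""
    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(db_path, "c") as db:
        # DBM-style stores hold bytes, so encode the key and the JSON value.
        db[subject_id.encode("utf-8")] = json.dumps(answers).encode("utf-8")


def load_response(db_path, subject_id):
    """Return the stored answers for a subject, or None if there are none."""
    key = subject_id.encode("utf-8")
    with dbm.open(db_path, "c") as db:
        if key not in db:
            return None
        return json.loads(db[key].decode("utf-8"))


if __name__ == "__main__":
    # Keep the demo database in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "survey_results")
        save_response(path, "subject-01", {"q1": "relevant", "q2": 4})
        print(load_response(path, "subject-01"))
```

Keying each record by subject ID keeps lookups simple, and serializing the answers as JSON avoids fixing a schema in advance. A real survey would also need to handle repeat submissions and concurrent writes, which this sketch ignores.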

WING-eval group

Visitors, if you have time, feel free to join our evaluation pool. Periodically, we send out notices to the WING-eval mailing list announcing new evaluations that require human subjects. Subjects are paid between 8 and 15 dollars an hour for their participation in evaluations. See below for details about current and open evaluations.

All current evaluations / experiments

N.B. If a project below is older than three months and no project duration is specified, you may want to check with the maintainer to see whether the evaluation is still active.

Title | Duration | Maintainer | Notes
Semantic Relation: Similarity & Significance* | 14-18 Apr 2006 | Long | This annotation task collects data indicating whether two semantic relations are similar when they are paired by the system, and whether they are ignorable with respect to paraphrase judgment when unpaired.
JavaScript Functionality Annotation* | 6-14 Jan 2006 | Wei | Please help annotate the functionality of several JavaScript fragments (i.e., what a fragment of JavaScript does).
Query-Feature Association Survey.doc* | Dec 2005 - Jan 2006 (10 entries max) | Zhao Jin | Help answer a survey about which kinds of detectors would be helpful for finding desired video clips.
Javascript usefulness* | February 2005 | Wei | We are investigating which kinds of functionality achieved by JavaScript are more helpful to end users and which are not.
PARCELS evaluation* | January - February 2005 | Aik Miang | Two evaluations need your assistance: Evaluation of Similarity Labels in news Web pages, and News Annotation.
Support Verb Categorization* | September 2004 - February 2005 | Yee Fan | Help categorize verb/object pairs as support verb or not. A tutorial explaining what a support verb is is also included.
Web page search result summaries* | August 2004 | Thiam Chye | Write a sample summary of multiple web pages for analysis. Email the maintainer if you are willing to participate; he has to set up a set of web pages for each evaluator.
SMS corpus collection* | July 2004 - March 2005 | Mingfeng | Enter your SMS messages so that we can learn how to speed up SMS input using optimized keypads and word completion.
Known Item Queries* | July 2004 | Min | Help evaluate whether a library query is a known item query and whether returned items match the known query.
Citation Annotation* | September 2004 | Yong Kiat | Help annotate the internal structure of citations in a web interface.
PARCELS web page annotation* | July 2004 | Chee How, Sandra | Help us annotate web page blocks so that we can learn how to automatically label parts of a web page.
Lead sentence relevance judgment* | December 2003 | Long | Help us annotate whether the first couple of sentences of a summary constitute a good summary of newswire articles.
Extended Definition Sentence Judgment* | August - November 2003 | Hang | Help us identify whether a sentence could be used as part of a definition of a search target.
Website Metadata Relevancy* | August - September 2003 | Alex, Edwin | Help us fill out a survey to identify how important some sources of metadata are to website relevancy.
* - denotes that the evaluation is currently inactive or closed.
Last updated: Friday, 29 December 2006