Web Services
Wednesday, 14 July 2010 12:51

Our group makes web services for natural language processing, information retrieval, and digital libraries available to the public.  These services make one-off processing tasks in these areas easy.  Note that we run public requests only when bandwidth allows, so do not depend on our services for day-to-day use.

All of our public web services are funneled through a web services broker that prioritizes and queues external requests on a single port.  You can download our Web Services Description Language (WSDL) file, which programmatically describes all of the services that our group exposes to the public.  See below to learn more about each service.
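
As a rough illustration of how the WSDL file can be inspected programmatically, the sketch below lists the operation names declared in a WSDL 1.1 document using only the Python standard library. The sample document embedded in the code is a hypothetical, heavily trimmed stand-in and not our actual WSDL file; the real file will contain messages, bindings and many more operations.

```python
import xml.etree.ElementTree as ET

# Standard WSDL 1.1 namespace.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"


def list_operations(wsdl_text):
    """Return the sorted, de-duplicated operation names found in portType elements."""
    root = ET.fromstring(wsdl_text)
    names = set()
    for port_type in root.iter(f"{{{WSDL_NS}}}portType"):
        for operation in port_type.findall(f"{{{WSDL_NS}}}operation"):
            name = operation.get("name")
            if name:
                names.add(name)
    return sorted(names)


# Hypothetical stand-in for the downloaded WSDL file (assumption, for illustration only).
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="ExampleBroker">
  <portType name="ExamplePortType">
    <operation name="extract_citations"/>
    <operation name="extract_text"/>
  </portType>
</definitions>"""

if __name__ == "__main__":
    for name in list_operations(SAMPLE_WSDL):
        print(name)
```

To use this on the real file, read the downloaded WSDL into a string and pass it to `list_operations` in place of `SAMPLE_WSDL`.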

Check whether our web service server is up and running.
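
If you would rather check from a script, a plain TCP connection attempt is one possible approach. The sketch below is only an assumption about how such a check could be done: the host and port are parameters you must supply yourself (this page does not state them), and a successful connection shows only that something is accepting connections on that port, not that every service behind the broker is working.

```python
import socket


def is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```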

A note on references to "URI or path": a URI is a web-accessible address, and a path is the local path of a file on our WING servers.  Remote execution by the public works only for URIs.  For the exact code to run each system below, see the web page for that service.  This page serves as a convenience reference only.
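
Since public requests work only with URIs, it can help to check an argument before sending it. The following is a minimal sketch using the standard library; the set of accepted schemes (`http` and `https`) is an assumption made for illustration, as this page does not say which schemes the broker accepts.

```python
from urllib.parse import urlparse

# Assumed set of acceptable schemes; adjust if the service accepts others.
REMOTE_SCHEMES = ("http", "https")


def is_remote_uri(arg):
    """Return True when arg looks like a web-accessible URI rather than a local path."""
    parsed = urlparse(arg)
    return parsed.scheme in REMOTE_SCHEMES and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("http://example.org/paper.txt", "/home/user/paper.txt"):
        kind = "URI" if is_remote_uri(candidate) else "local path (not usable remotely)"
        print(candidate, "->", kind)
```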

  • ParsCit - An Open Source CRF Reference String Parser (and section labeler)
    • extract_citations(uri_or_path): Extracts and parses citations from a text file representing a scholarly article
    • extract_header(uri_or_path): Extracts and parses the header metadata (first page) from a text file representing a scholarly article
    • extract_meta(uri_or_path): Extracts and parses all metadata (header, citations) from a text file representing a scholarly article
  • Robust Argumentative Zoning (RAZ) - Argumentative zoning a la Teufel (1999) of a scholarly article text
    • arg_zone_postagged_file(uri_or_path): Performs argumentative zoning on a part-of-speech tagged file
  • Extract from PDF
    • extract_text(uri_or_path): Extracts plain text from a PDF file
    • extract_html(uri_or_path): Extracts the text from a PDF file into HTML format
  • Maximum Entropy Part of Speech Tagging: Assigns part-of-speech (POS) tags to a plain text file or sentence.  Daisy-chain the latter two calls to have an entire file POS tagged.
    • tag_delimited_sentence(sentence): POS tags a single line of text (sentence) provided as input
    • tag_delimited_file(uri_or_path): POS tags a local file or URI provided as argument
    • delimit_file(uri_or_path): Delimits a local file or URI provided as argument
  • WordSim: Gives a score for the semantic similarity of two words
    • wordsim_ldk(w1, w2): Calculates the similarity of two words using Dekang Lin's contextual thesaurus
    • wordsim_ldk_file(uri_or_path): Calculates the similarity of pairs of words from a file using Dekang Lin's contextual thesaurus
    • wordsim_wn(w1, w2): Calculates the similarity of two words using the WordNet Similarity package
    • wordsim_wn_file(uri_or_path): Calculates the similarity of pairs of words from a file using the WordNet Similarity package
  • MeURLin - A URL based web page classifier
    • classify_websites(uri_or_path, lexicon): Classifies URLs of websites to a category hierarchy
    • segment_urls(uri_or_path, lexicon): Segments URL components into word chunks
  • Chinese Word Segmentation - CWS from Hwee Tou Ng's NLP group
    • cws_gb(uri_or_path): Word segments an input Chinese text file in GuoBiao encoding
    • cws_utf8(uri_or_path): Word segments an input Chinese text file in UTF-8 encoding
  • Scholarly Paper Keyphrase Extractor: our WINGNUS keyphrase extraction system from SemEval-2010 Task 5
    • extract_kp(uri_or_path): Lists/extracts keyphrases from a plain text scholarly document, provided as input
  • Low Wee Heng's Word Sense Disambiguator
    • lwheng_wsd(uri_or_path): Assigns sense tags from the WordNet sense inventory to words in the input
  • Tamisa Huangwongsri's Chinese Word Segmentor
    • tamisa_zh_segment(uri_or_path): Segments Chinese text in UTF-8
  • JavaRAP: Resolution of Pronominal Anaphora in Java
    • resolve_anaphora_javarap(uri_or_path): Resolves pronominal anaphora using the Java RAP program
  • DefMiner: Supervised automatic definition extraction system
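
To give a feel for what a call to one of the operations above might look like on the wire, here is a sketch that builds a SOAP 1.1 request envelope for `extract_citations` with the standard library. Several things here are assumptions and not taken from our WSDL file: the service namespace `urn:example-wing-services` is hypothetical, as is the choice to send the argument as an unqualified `uri_or_path` element. Read the real namespace, parameter names and endpoint address from the WSDL file before sending anything.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace; the real one must come from the WSDL file.
SERVICE_NS = "urn:example-wing-services"


def build_envelope(operation, params, service_ns=SERVICE_NS):
    """Build a SOAP 1.1 request envelope as a string for one operation call."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        # Parameter element names are assumed to match the names listed above.
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_envelope(
        "extract_citations",
        {"uri_or_path": "http://example.org/paper.txt"},
    ))
```

In practice a SOAP client library that reads the WSDL file directly is likely to be less error-prone than hand-built envelopes; the sketch is meant only to show the shape of a request.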

WING folks: the WSDL file is the authoritative source for which web services are currently available.  Check our Google Docs file for updated information.

Last Updated on Friday, 28 November 2014 15:36