MeURLin: URL-based classification of web pages

MeURLin, the web page classification wizard


This is the home page of the MeURLin project, which classifies web pages based only on their URLs. MeURLin relies on mnemonic URLs to achieve its classification accuracy: it segments a URL into chunks, expands these segments, and uses the resulting information as evidence for classification.
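
As a rough illustration of the segment-and-expand idea described above, the sketch below splits a URL into letter and digit chunks and then expands known abbreviations. It is not MeURLin's actual implementation: the function names `segment_url` and `expand`, the splitting rule, and the small `EXPANSIONS` table are all assumptions made for this example.

```python
import re
from urllib.parse import urlsplit

# Hypothetical abbreviation table, for illustration only.
EXPANSIONS = {
    "cs": "computer science",
    "dept": "department",
    "univ": "university",
}


def segment_url(url):
    """Split a URL into lowercase chunks of letters or digits."""
    parts = urlsplit(url)
    # Ignore the scheme; keep host, path and query as evidence.
    text = " ".join([parts.netloc, parts.path, parts.query])
    # Punctuation acts as a separator, and letter runs are
    # separated from digit runs (e.g. "cs664" -> "cs", "664").
    return [tok.lower() for tok in re.findall(r"[A-Za-z]+|[0-9]+", text)]


def expand(tokens):
    """Replace known abbreviations with their expansions."""
    return [EXPANSIONS.get(tok, tok) for tok in tokens]


if __name__ == "__main__":
    chunks = segment_url("http://www.cs.cornell.edu/courses/cs664/index.html")
    print(chunks)
    print(expand(chunks))
```

The expanded tokens could then be passed to any text classifier as features; which features and classifier MeURLin itself uses is described in the project's publications, not here.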

You can classify URLs against three classification schemes: the comprehensive Open Directory Project (ODP), four WebKB project classes, and the Bank Search classes. The latter two are specialized schemes and should be used only by those familiar with the projects. Send me email if you are interested in another scheme.


Web Service

MeURLin is now available as a web service. Requests coming from external sources are recorded for future training purposes. Note that if the load from external sources is too high, your web service request may not be run.

Web form-based demonstration

Make sure that you prefix your URLs with the "http://" scheme. N.B.: this online demo is an older version that predates the work done in 2004-2005; the current version has not been put online yet. Please see our publications for more up-to-date information. Also note that if the system load from external sources is too high, your demo request may not be run.

Internal key (if applicable):

Enter a list of URLs to classify:

Or enter the name of a file (containing one URL per line) to upload and classify:

Splitting model:
   Hierarchical IC

Classification scheme to use:
   Three-level ODP
   WebKB
   Bank Search Dataset

Publications

Refereed:

Others:

Group Members


Min-Yen Kan <kanmy@comp.nus.edu.sg>
Created on: Wed May 5 16:07:15 2004 | Version: 1.0 | Last modified: Sun Jul 19 14:39:26 2009