Purpose of our Research

Short Message Service (SMS) messages are now a ubiquitous form of communication, connecting friends, families and colleagues globally. However, very few SMS corpora are available in the public domain for researchers who want to study this phenomenon.

We are a group of researchers from the School of Computing of the National University of Singapore. We are working to expand an existing dataset of SMS messages in order to study its language, network and other characteristics. An important part of our work is to compile the messages into a publicly available corpus that scholars can use for comparative study.


Our Live Corpus Project

In October 2010, we resurrected our earlier SMS collection project from 2004, reviving it as a live corpus project. The goal is to continually enlarge the corpus, using an array of collection methodologies and leveraging current technology trends.

As of May 2012, we have collected 41,537 English SMS messages (excluding the old corpus) and 29,556 Chinese SMS messages. Statistics are updated each week, after a sanity check of the new SMS messages collected during the preceding period. Detailed monthly statistics are also accessible by clicking on Browse Corpus->Collection Period on the left sidebar.

We have integrated our previous SMS corpus as part of our collection efforts. It contains 10,117 English SMS messages. For more information about the 2004 corpus, please refer to its homepage.

We finished a working paper on the corpus in December 2011. Read it.


Call for Contributions

We have built this website to show the progress of this project and to help (potential) SMS contributors better understand our work. More importantly, we hope that you, as a visitor to this site, will get to know and support our work and help publicize it to others, so that we can attract more donations and make the corpus a primary source for SMS and informal text research. With more contributors and more messages archived, the corpus will grow in depth and utility to scholars everywhere.

If you are willing to donate SMS messages, please go to our contribution page.

If you are interested in our project, or are willing to help contribute SMS messages in ways that go beyond the scope of our collection methodologies, please contact us.

We sincerely appreciate your suggestions and contributions.