[ChimeText] Yahoo! visit to NUS -- addendum
Min-Yen Kan
knmnyn at gmail.com
Tue Jul 15 22:33:42 SGT 2008
Hi all:
The talks by the Yahoo! folks have been finalized. Please reserve your
time next Friday to keep up to date with the great IR research
coming out of Yahoo!
Please publicize to all folks who you think might be interested. We'd
like as many folks to come out for this as possible!
Cheers, Min
Speakers: Yahoo! Research Lab Staff
Venue: SR3 (COM1 02-12)
Date: 25 Jul (Friday)
Time: 9:30am-11:30am (questions till 12:00nn)
Talk Overviews (times are approximate):
9:30-10:00 - Evgeniy Gabrilovich / Overview of Computational Advertising
10:00-10:30 - Rosie Jones / Geography in Web Search
10:30-11:00 - Donald Metzler / Predicting when (not) to Advertise
11:00-11:30 - Vanessa Murdock / Diversifying Image Search with User
Generated Content
1) Evgeniy Gabrilovich
Title: Overview of Computational Advertising
Abstract: Web advertising is the primary driving force behind many Web
activities, including Internet search as well as publishing of online
content by third-party providers. A new discipline - Computational
Advertising - has recently emerged, which studies the process of
advertising on the Internet from a variety of angles. A successful
advertising campaign should be relevant to the user's immediate
information need and, more generally, to the user's background; be
economically worthwhile to the advertiser and the intermediaries (e.g.,
the search engine); and not be detrimental to the user experience. At
first approximation, the process of obtaining relevant ads can be
reduced to conventional information retrieval, where one constructs a
query that describes the user's context, and then executes this query
against a large inverted index of ads. We show how to augment the
standard IR approach using query expansion and text classification
techniques. We demonstrate how to employ a relevance feedback assumption
and use Web search results retrieved by the query. We will also survey
the numerous challenges and open research problems posed by
computational advertising, such as text summarization, natural language
generation, named entity extraction, handling geographic names, and
others.
Bio: Evgeniy Gabrilovich is a Senior Research Scientist and Manager of the
NLP & IR Group at Yahoo! Research. His research interests include
information retrieval, machine learning, and computational linguistics.
Recently, he co-organized a workshop on the synergy between Wikipedia
and research in AI at AAAI 2008, as well as co-presented a tutorial on
computational advertising at ACL 2008 and EC 2008. He served on the
program committees of ACL-08:HLT, AAAI 2008, WWW 2008, CIKM 2008, JCDL
2008, AAAI 2007, EMNLP-CoNLL 2007, and COLING-ACL 2006. Evgeniy earned
his MSc and PhD degrees in Computer Science from the Technion - Israel
Institute of Technology. In his Ph.D. thesis, Evgeniy developed a
methodology for using large scale repositories of world knowledge (e.g.,
all the knowledge available in Wikipedia) in order to enhance text
representation beyond the bag of words. URL:
http://research.yahoo.com/Evgeniy_Gabrilovich
2) Rosie Jones
Title: Geography in Web Search
Abstract: Web search results are typically based on the user's search query,
without taking other contextual information into account. However, we
can see from user search behavior that for some search topics the user
may prefer results which are geographically close to home. We will show
topics which have a geographical dependence, as well as others which
appear to be geographically independent. Based on these findings, we
propose a more flexible approach to web search, in which we prefer
a ranking with results close to the user location when this will best
satisfy the user's information need.
Bio: Rosie Jones is a Senior Research Scientist at Yahoo!. Her research
interests include web search, geographic information retrieval and
natural language processing. She received her PhD from the School of
Computer Science at Carnegie Mellon University. In 2005 she co-organized
the SIGIR workshop on lexical cohesion and information retrieval, and in
2003 she co-organized the ICML workshop on The Continuum from Labeled to
Unlabeled Data in Machine Learning and Data Mining. She served as a
Senior PC member for SIGIR in 2007 and 2008. URL:
http://research.yahoo.com/Rosie_Jones
3) Donald Metzler
Title: Predicting when (not) to Advertise
Abstract: In this talk we discuss the problem of whether or not to show online
advertisements. We propose two methods for addressing this problem: a
simple thresholding approach, and a machine learning approach that
collectively analyzes the set of candidate ads augmented with external
knowledge. Our experimental evaluation, based on over 28,000 editorial
judgments, shows that we are able to predict, with high accuracy, when
to show ads for both content match and sponsored search advertising
tasks.
Bio: Donald Metzler is a Research Scientist at Yahoo! Research in Santa
Clara, CA. He obtained his Ph.D. degree in Computer Science from the
University of Massachusetts Amherst in 2007. His research interests
include information retrieval, machine learning, and their intersection.
He is the co-author of Search Engines: Information Retrieval in
Practice, which will be published in the early part of 2009. URL:
http://research.yahoo.com/Don_Metzler
4) Vanessa Murdock
Title: Diversifying Image Search with User Generated Content
Abstract: Large-scale image retrieval on the Web relies on the availability of
short snippets of text associated with the image. This user-generated
content is a primary source of information about the content and context
of an image. While traditional information retrieval models focus on
finding the most relevant document without consideration for diversity,
image search requires results that are both diverse and relevant. This
is problematic for images because they are represented very sparsely by
text, and as with all user-generated content the text for a given image
can be extremely noisy.
The contribution of this paper is twofold. We show that it is possible
to minimize the trade-off between precision and diversity, and that
relevance models offer a unified framework to afford the greatest
diversity without harming precision. Furthermore, we show that estimating
the query model from the distribution of tags favors the dominant sense
of a query. Relevance models operating only on tags offer the highest
level of diversity with no significant decrease in precision.
Bio: Vanessa Murdock currently holds a Post Doc position at Yahoo! Research
Barcelona. Her current work focuses on retrieval of short texts, such as
for advertisements, and user-generated content for images and video. She
completed her PhD in 2006 at the University of Massachusetts, working
with W. Bruce Croft. Her thesis, focusing on sentence retrieval for
applications such as Question Answering, novelty detection, and
information provenance, was recently published as a book, "Exploring
Sentence Retrieval". URL: http://research.yahoo.com/Vanessa_Murdock
Upcoming Talks:
16 Jul: Xiong Deyi (I2R / Linguistically Annotated BTG for Statistical
Machine Translation)
17 Jul: Douglas Oard (University of Maryland / Fourth-Generation
Content Analysis: Supporting social science research using
computational linguistics)
18 Jul: (related seminars) 3 seminars on 1) Real-Time Document Image
Retrieval with LLAH 2) Large-Scale and Real-Time Specific Object 3)
Pattern recognition with supplementary information
25 Jul: Yahoo! Research Labs talks:
4 talks on 1) Evgeniy Gabrilovich / Overview of Computational
Advertising 2) Rosie Jones / Geography in Web Search 3) Donald Metzler
/ Predicting when (not) to Advertise 4) Vanessa Murdock / Diversifying
Image Search with User Generated Content