VELDA: Relating an Image Tweet's Text and Images

Abstract

Image tweets are becoming a prevalent form of social media, but little is known about their content - textual and visual - and the relationship between the two media. Our analysis of image tweets shows that while visual elements certainly play a large role in image-text relationships, other aspects, such as emotional elements, also contribute to the relationship. We develop Visual-Emotional LDA (VELDA), a novel topic model that captures the image-text correlation from multiple perspectives (namely, visual and emotional). Experiments on real-world image tweets in both English and Chinese, as well as on other user-generated content, show that VELDA significantly outperforms existing methods on cross-modality image retrieval. Even in other domains where emotion does not directly factor into image choice, VELDA demonstrates good generalization ability, achieving higher-fidelity modeling of such multimedia documents.
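
For readers unfamiliar with topic models, the following is a minimal, hypothetical sketch that fits a standard LDA model over a few toy tweet texts using gensim. It illustrates only the textual side of such modeling and is not the VELDA implementation, which additionally ties the text topics to visual and emotional facets of the attached images.

    # Illustrative sketch only: standard LDA on toy tweet texts with gensim.
    # VELDA itself extends this idea with visual and emotional topic facets.
    from gensim import corpora, models

    tweets = [
        "sunset over the beach so peaceful tonight",
        "beach waves and a golden sunset view",
        "so happy my team won the championship game",
        "championship game tonight feeling excited and happy",
    ]

    # Tokenize (whitespace splitting is enough for this toy example)
    texts = [t.lower().split() for t in tweets]

    # Build a dictionary and a bag-of-words corpus
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(text) for text in texts]

    # Fit a 2-topic LDA model
    lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                          passes=20, random_state=0)

    # Inspect the discovered topics and each tweet's topic mixture
    for topic_id, words in lda.show_topics(num_topics=2, num_words=4):
        print(f"topic {topic_id}: {words}")
    for i, bow in enumerate(corpus):
        print(f"tweet {i} topic mixture: {lda.get_document_topics(bow)}")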

Publication
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence
Min-Yen Kan
Associate Professor

WING lead; interests include Digital Libraries, Information Retrieval and Natural Language Processing.

Dongyuan Lu
Postdoctoral Alumnus

WING alumnus; former postdoctoral researcher.