Low-resource training addresses the lack of large annotated corpora that most SOTA models require for training. The data scarcity problem is especially acute in task-oriented dialogue, as every new task (e.g. flight booking) introduces new slots (e.g. flight-day, flight-destination, etc.). The ability to transfer learned experience from one domain to another is therefore crucial. Two main approaches address this problem: designing models that can learn from low-scale data, and data augmentation methods that enlarge the low-scale data so that standard data-hungry models can be trained. This project focuses on the latter approach.
Past attempts at augmenting task-oriented dialogues have mostly relied on standard NLP techniques that apply transformations at the word or sentence level. These do not exploit the conversational nature of the data, which opens the door to transformations at the dialogue level. In this line of work, we explore conversational augmentation through dialogue-level transformations.
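As a minimal sketch of what a dialogue-level transformation could look like (our own hypothetical illustration, not a method claimed by prior work), one might splice two same-domain dialogues at a turn boundary, producing a new dialogue whose individual turns are all human-written but whose overall trajectory is novel:

```python
def crossover(dialogue_a, dialogue_b, cut):
    """Dialogue-level augmentation sketch: keep the first `cut`
    (user, system) turn pairs of dialogue_a and continue with the
    remaining turn pairs of dialogue_b.

    Both dialogues are lists of (user_utterance, system_utterance)
    tuples and are assumed to come from the same domain so the
    spliced conversation stays plausible."""
    return dialogue_a[:cut] + dialogue_b[cut:]


# Two hypothetical flight-booking dialogues from the same domain.
dialogue_a = [
    ("Book a flight to Paris", "When would you like to fly?"),
    ("Next Monday", "Which airport do you depart from?"),
    ("Heathrow", "Your flight is booked."),
]
dialogue_b = [
    ("I need a flight to Rome", "When would you like to fly?"),
    ("This Friday", "Which airport do you depart from?"),
    ("Gatwick", "Your flight is booked."),
]

augmented = crossover(dialogue_a, dialogue_b, 1)
```

Unlike a word- or sentence-level paraphrase, this operation changes the structure of the conversation itself, which is the kind of transformation the conversational modality makes possible.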