Abstractive summarization is an ideal form of summarization
since it can synthesize information from
multiple documents to create concise, informative
summaries. In this work, we aim to develop
an abstractive summarizer. First, our proposed approach
identifies the most important document in
the multi-document set. The sentences in the most
important document are aligned to sentences in
other documents to generate clusters of similar sentences.
Second, we generate K-shortest paths from
the sentences in each cluster using a word-graph
structure. Finally, we select sentences from the set
of shortest paths generated from all the clusters using
a novel integer linear programming (ILP)
model with the objective of maximizing information
content and readability of the final summary.
Our ILP model represents the shortest paths as binary
variables and considers each path's length,
information score, and linguistic quality score in
the objective function. Experimental results on the
DUC 2004 and 2005 multi-document summarization
datasets show that our proposed approach outperforms
all the baselines and state-of-the-art extractive
summarizers as measured by ROUGE scores. Our method also
outperforms a recent abstractive summarization technique.
In manual evaluation, our approach also achieves promising results
on informativeness and readability.
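
To make the second step concrete, the following is a minimal sketch of generating K-shortest paths from one cluster of similar sentences using a word graph. It is an illustration under simplifying assumptions, not the implementation evaluated in this work: nodes are merged on lowercased word form only, each edge costs the inverse of its frequency in the cluster, and paths are enumerated by a plain best-first search. The names `build_word_graph`, `k_shortest_paths`, and `max_len` are hypothetical, and the toy cluster is invented for the example.

```python
import heapq
from collections import defaultdict

START, END = "<s>", "</s>"  # artificial sentence-boundary nodes


def build_word_graph(sentences):
    """Merge whitespace-tokenized sentences into one directed word graph.

    Assumption: identical lowercased words share a node, and an edge's
    cost is 1 / (number of times that word transition occurs).
    """
    counts = defaultdict(int)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for a, b in zip(tokens, tokens[1:]):
            counts[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), c in counts.items():
        graph[a][b] = 1.0 / c  # frequent transitions are cheaper
    return graph


def k_shortest_paths(graph, k, max_len=30):
    """Return up to k lowest-cost loop-free START->END paths.

    Partial paths are expanded in order of accumulated cost; because all
    edge costs are positive, complete paths are emitted cheapest first.
    """
    heap = [(0.0, [START])]
    results = []
    while heap and len(results) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == END:
            results.append((cost, " ".join(path[1:-1])))
            continue
        if len(path) > max_len:  # guard against very long candidates
            continue
        for nxt, weight in graph[node].items():
            if nxt not in path:  # keep paths loop-free
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return results


if __name__ == "__main__":
    cluster = [
        "storm hits northern coast on monday",
        "storm hits coast early monday",
        "heavy storm hits coast on monday",
    ]
    for cost, text in k_shortest_paths(build_word_graph(cluster), k=3):
        print(f"{cost:.3f}  {text}")
```

On this toy cluster the cheapest path is "storm hits coast on monday", a sentence that appears in none of the three inputs, which is how a word graph can produce fused rather than purely extracted sentences. In the approach described above, candidate paths like these from all clusters would then be passed to the ILP model for final selection.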