Extractive Multi-Document Text Summarization Using Multi-Objective Evolutionary Algorithm Based Model

Abstract

Automatic document summarization is an evolving technology that may offer a solution to the problem of information overload. Multi-document summarization is an optimization problem that requires optimizing more than one objective function concurrently. The proposed work balances two significant objectives, content coverage and diversity, when generating a summary from a collection of text documents. Despite the considerable effort that researchers have devoted to designing and evaluating text summarization techniques, existing formulations lack a model that explicitly represents coverage and diversity, the two conflicting properties of any summary. Here, generic text summarization based on sentence extraction is formulated as an optimization problem recast in terms of more semantic measures, with content coverage and content diversity each expressed as an explicit, separate optimization model. The two models are then coupled and defined as a multi-objective optimization (MOO) problem. To the best of our knowledge, this is the first attempt to address the text summarization problem as a MOO model. Moreover, heuristic perturbation and heuristic local repair operators are proposed and injected into the adopted evolutionary algorithm to harness its strength. The proposed model is assessed using document sets supplied by the Document Understanding Conference 2002 (DUC 2002) and compared with other state-of-the-art methods using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit. The results provide strong evidence for the effectiveness of the proposed MOO-based model over other state-of-the-art models.
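
To make the tension between the two objectives concrete, the following is a minimal Python sketch and not the model proposed in this work. It assumes that each sentence is already a term-weight vector, that coverage is the summed cosine similarity of the selected sentences to the centroid of the collection, and that diversity is the summed pairwise dissimilarity among the selected sentences. These definitions, and all function names, are illustrative assumptions only.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of all sentence vectors in the collection."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def coverage(selected, vectors):
    """Assumed coverage: similarity of each selected sentence to the centroid."""
    center = centroid(vectors)
    return sum(cosine(vectors[i], center) for i in selected)


def diversity(selected, vectors):
    """Assumed diversity: pairwise dissimilarity among the selected sentences."""
    return sum(
        1.0 - cosine(vectors[i], vectors[j])
        for i, j in combinations(selected, 2)
    )


def evaluate(selected, vectors):
    """Objective tuple (coverage, diversity); both are to be maximized."""
    return (coverage(selected, vectors), diversity(selected, vectors))


def dominates(a, b):
    """True if objective tuple a Pareto-dominates b under maximization."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


# Toy collection: sentences 0 and 1 are identical, sentence 2 is unrelated.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
redundant = evaluate([0, 1], vectors)  # higher coverage, no diversity
varied = evaluate([0, 2], vectors)     # lower coverage, higher diversity
print(dominates(redundant, varied), dominates(varied, redundant))
```

In this toy collection neither candidate summary dominates the other, which is the situation a multi-objective formulation is meant to handle: it keeps a set of non-dominated trade-offs rather than collapsing the two objectives into a single weighted score.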