TOPSIS with Multiple Linear Regression for Multi-Document Text Summarization


The huge amount of information on the internet creates a pressing need for text summarization. Text summarization is the process of selecting important sentences from documents while preserving the main idea of the original documents. This paper proposes a method based on the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). The first step in our model is to extract seven features for each sentence in the document set. Multiple Linear Regression (MLR) is then used to assign a weight to each of the selected features. The TOPSIS method is then applied to rank the sentences. The highest-scoring sentences are selected for inclusion in the generated summary. The proposed model is evaluated on the dataset supplied by the Text Analysis Conference (TAC-2011) for English documents, using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric. The obtained results support the effectiveness of the proposed model.
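
The ranking step can be illustrated with a minimal sketch of a standard TOPSIS computation over a sentence-by-feature matrix. This is not the paper's implementation: the function name `topsis_rank`, the three-feature toy matrix (the model itself uses seven features), and the weight values are hypothetical, and the sketch assumes that every feature is a benefit criterion (higher is better) and that the weights, which the model obtains from MLR, are non-negative.

```python
import math


def topsis_rank(matrix, weights):
    """Return a TOPSIS closeness score in [0, 1] for each row of `matrix`.

    matrix  -- one row per sentence, one column per feature
    weights -- one non-negative weight per feature (assumed here; in the
               described model they would come from a regression fit)
    All features are treated as benefit criteria, an assumption of this sketch.
    """
    n_features = len(weights)

    # Vector-normalise each column; fall back to 1.0 for an all-zero column.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
        for j in range(n_features)
    ]

    # Apply the feature weights to the normalised values.
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_features)]
        for row in matrix
    ]

    # Ideal solution: best value per feature; anti-ideal: worst value.
    ideal = [max(row[j] for row in weighted) for j in range(n_features)]
    anti = [min(row[j] for row in weighted) for j in range(n_features)]

    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # Euclidean distance to the ideal
        d_neg = math.dist(row, anti)   # Euclidean distance to the anti-ideal
        total = d_pos + d_neg
        # Relative closeness to the ideal; 0.0 if all rows are identical.
        scores.append(d_neg / total if total else 0.0)
    return scores


# Toy data: three sentences, three hypothetical feature values each.
features = [
    [0.9, 0.8, 0.7],
    [0.1, 0.2, 0.1],
    [0.5, 0.9, 0.3],
]
weights = [0.5, 0.3, 0.2]  # hypothetical weights, not values from the paper

scores = topsis_rank(features, weights)

# Indices of the two highest-scoring sentences, best first.
top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
```

Sentences whose weighted feature vectors lie closest to the ideal point and farthest from the anti-ideal point receive scores near 1, so taking the top-scoring indices corresponds to selecting sentences for the summary.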