Multi-Document Text Summarization Based on Multiple Linear Regression

Abstract

The huge amount of information available on the internet has driven rapid growth in text summarization research. Text summarization is the process of selecting important sentences from documents while preserving the main idea of the original documents. Features are the basis of text summarization. In this paper, a method for assigning weights to selected features is developed. It builds a mathematical model using Multiple Linear Regression, which estimates the weights that relate the dependent variable to the independent variables. The proposed model is evaluated on the dataset supplied by the Text Analysis Conference (TAC-2011) for English documents. The results are measured using Recall-Oriented Understudy for Gisting Evaluation (ROUGE). The obtained results support the effectiveness of the proposed model.
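
To make the idea of regression-based feature weighting concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes that each sentence is described by three numeric features (the independent variables) and that a target importance score per sentence (the dependent variable) is available for training. The feature values, the target scores, and the function names `fit_feature_weights` and `score_sentences` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_feature_weights(features, targets):
    """Fit targets = b0 + features @ w by ordinary least squares."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(targets, dtype=float)
    # Prepend a column of ones so the model can learn an intercept b0.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

def score_sentences(features, intercept, weights):
    """Score each sentence as the weighted sum of its features."""
    return intercept + np.asarray(features, dtype=float) @ weights

# Hypothetical training data: one row per sentence, one column per feature.
train_features = np.array([
    [1.0, 0.8, 0.6],
    [0.5, 0.2, 0.9],
    [0.0, 0.5, 0.3],
    [0.7, 0.1, 0.1],
    [0.2, 0.9, 0.4],
])

# Hypothetical targets, generated from known weights so the fit can be checked.
true_intercept = 0.1
true_weights = np.array([0.5, 0.3, 0.2])
train_targets = true_intercept + train_features @ true_weights

intercept, weights = fit_feature_weights(train_features, train_targets)
scores = score_sentences(train_features, intercept, weights)

# Indices of the two highest-scoring sentences, best first.
top_sentences = np.argsort(scores)[::-1][:2]
print("weights:", weights)
print("top sentences:", top_sentences)
```

Because the toy targets are built from known weights, the fit recovers those weights up to floating-point error. With real data the targets would come from an external measure of sentence importance, and the fitted weights would only approximate it.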