Finding the Relevance Degree between an English Text and its Title

Abstract

Keywords are useful tools because they give a short summary of a document. They serve a variety of purposes, including summarizing, indexing, labeling, categorization, clustering, and searching. In this paper we use keywords to find the relevance degree between an English text and its title. The proposed system solves this problem through a simple statistical approach (term frequency) combined with linguistic approaches: it extracts the keywords of the title and the keywords of the text (together with the frequency with which they appear in the text) and computes the average frequency of the title's keywords across the text, which represents the required relevance degree. The extraction depends on a lexicon for a particular field (in this work we chose the computer science field). This lexicon is represented using two different B+ trees, one for non-keywords and the other for candidate keywords. The keywords are stored in a manner that prevents redundancy of terms, and even of sub-terms, in order to use memory efficiently and to minimize search time. The proposed system was implemented using Visual Prolog 5.1, and after testing it proved to be valuable for finding the degree of relevance between a text and its title, in terms of both accuracy and search time.
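
The scoring idea described above can be sketched in a few lines. This is only a minimal Python illustration under stated assumptions, not the authors' Visual Prolog implementation: the function names and the sample lexicon entries are hypothetical, two small in-memory sets stand in for the two B+ trees, and only single-word terms are handled (the sub-term handling mentioned above is left out).

```python
import re
from collections import Counter

# Hypothetical stand-ins for the lexicon. The paper stores these in two
# B+ trees; plain sets are used here only to keep the sketch short.
NON_KEYWORDS = {"a", "an", "the", "of", "in", "for", "and", "is", "are", "to"}
CANDIDATE_KEYWORDS = {"sorting", "algorithm", "array", "tree", "database"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(text):
    """Keep words that are not non-keywords and that appear in the lexicon."""
    return [
        word
        for word in tokenize(text)
        if word not in NON_KEYWORDS and word in CANDIDATE_KEYWORDS
    ]


def relevance_degree(title, body):
    """Average frequency, within the body, of the distinct title keywords."""
    title_keywords = set(extract_keywords(title))
    if not title_keywords:
        # Assumption: a title with no lexicon keywords scores zero.
        return 0.0
    body_frequency = Counter(extract_keywords(body))
    total = sum(body_frequency[keyword] for keyword in title_keywords)
    return total / len(title_keywords)


if __name__ == "__main__":
    title = "A Sorting Algorithm for the Array"
    body = "The algorithm scans the array. Sorting the array is fast."
    print(relevance_degree(title, body))
```

In this sketch the score is a raw average count, so longer texts tend to score higher. Whether and how the original system normalizes the value is not stated in the abstract.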