Effective Web Page Crawler

Authors: Isra’a Tahseen Ali, Hilal Hadi Saleh
Journal: Engineering and Technology Journal مجلة الهندسة والتكنولوجيا ISSN: 1681-6900, 2412-0758 Year: 2011 Volume: 29 Issue: 3 Pages: 513-530
Publisher: University of Technology الجامعة التكنولوجية


The World Wide Web (WWW) has grown from a few thousand pages in 1993 to more than eight billion pages at present. Due to this explosion in size, web search engines are becoming increasingly important as the primary means of locating relevant information. This research aims to build a crawler that crawls the most important web pages. A crawling system has been built that consists of three main techniques. The first is the Best-First Technique, which is used to select the most important page. The second is the Distributed Crawling Technique, which is based on UbiCrawler and is used to distribute the URLs of the selected web pages to several machines. The third is the Duplicated Pages Detecting Technique, which uses a proposed document fingerprint algorithm.
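The three techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the page-importance score, the URL assignment rule, or the proposed fingerprint algorithm. The sketch therefore assumes a caller-supplied numeric score, replaces UbiCrawler-style assignment with a simple hash-modulo rule on the host name, and uses a SHA-256 digest of normalized text as a placeholder fingerprint. All class and function names are hypothetical.

```python
import hashlib
import heapq
from urllib.parse import urlsplit


class BestFirstFrontier:
    """Priority queue of URLs; the highest-scoring URL is returned first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker: equal scores pop in insertion order

    def add(self, url, score):
        # Assumption: 'score' is computed elsewhere; the paper's scoring
        # function is not described in the abstract.
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-score, self._counter, url))
        self._counter += 1

    def pop(self):
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)


def assign_agent(url, num_agents):
    """Map a URL's host to a machine index in [0, num_agents).

    Assumption: plain hash-modulo assignment, used here as a simplified
    stand-in for the assignment scheme of a UbiCrawler-based system.
    """
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_agents


def fingerprint(text):
    """Placeholder fingerprint: SHA-256 of lowercased, whitespace-collapsed text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_duplicate(text, seen_fingerprints):
    """Return True if an equivalent page was seen before; otherwise record it."""
    fp = fingerprint(text)
    if fp in seen_fingerprints:
        return True
    seen_fingerprints.add(fp)
    return False
```

In this sketch, all URLs on the same host go to the same machine, and an exact-match digest only catches pages that are identical after normalization. Detecting near-duplicates would need a different fingerprint, such as the one the paper proposes.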
