Conference proceeding
Evaluating topic-driven web crawlers
Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval, pp.241-249
SIGIR '01
09/01/2001
DOI: 10.1145/383952.383995
Abstract
Due to limited bandwidth, storage, and computational resources, and to the dynamic nature of the Web, search engines cannot index every Web page, and even the covered portion of the Web cannot be monitored continuously for changes. Therefore it is essential to develop effective crawling strategies to prioritize the pages to be indexed. The issue is even more important for topic-specific search engines, where crawlers must make additional decisions based on the relevance of visited pages. However, it is difficult to evaluate alternative crawling strategies because relevant sets are unknown and the search space is changing. We propose three different methods to evaluate crawling strategies. We apply the proposed metrics to compare three topic-driven crawling algorithms based on similarity ranking, link analysis, and adaptive agents.
Details
- Title
- Evaluating topic-driven web crawlers
- Creators
- Filippo Menczer; Gautam Pant; Padmini Srinivasan; Miguel Ruiz
- Resource Type
- Conference proceeding
- Publication Details
- Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval, pp.241-249
- Series
- SIGIR '01
- DOI
- 10.1145/383952.383995
- Publisher
- ACM
- Language
- English
- Date published
- 09/01/2001
- Academic Unit
- Nursing; Computer Science; Business Analytics
- Record Identifier
- 9984003017802771