UNSUPERVISED, AUTOMATED WEB HOST DYNAMICITY DETECTION, DEADLINK DETECTION AND PREREQUISITE PAGE DISCOVERY FOR SEARCH INDEXED WEB PAGES
Abstract
Automated crawling of page links associated with a previously crawled site domain involves computing the dynamicity of the site based on totals of consecutive dead links, live links, and/or prerequisite pages encountered while crawling page links corresponding to the site. The degree to which links are crawled is optimized based on the dynamicity of the site. Some pages require that another particular page (i.e., a prerequisite page) be retrieved from the host before a given page is retrieved, e.g., so that the prerequisite page can set a cookie. Prerequisite pages are determined based on stored information about pages that were retrieved, during a previous crawl, prior to retrieving a page. Prerequisite pages are identified to a search system so that when a user clicks on the URL for the page, the request is redirected to the prerequisite page to set the cookie appropriately.
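
The abstract does not state how dynamicity is scored or how the crawl depth is derived from it, so the following Python sketch is only one possible reading. The scoring formula (share of crawled links that were dead or needed a prerequisite page), the budget rule, and every name in it (`dynamicity`, `crawl_budget`, `find_prerequisite`, `click_target`) are assumptions made for illustration, not the method claimed in the patent.

```python
import math
from typing import Dict, List, Optional, Sequence

# Hypothetical labels for the outcome of crawling one previously indexed link.
DEAD, LIVE, PREREQ = "dead", "live", "prerequisite"


def dynamicity(outcomes: Sequence[str]) -> float:
    """Assumed score in [0, 1]: fraction of links that were dead or
    required a prerequisite page. The abstract gives no formula."""
    if not outcomes:
        return 0.0
    changed = sum(1 for o in outcomes if o in (DEAD, PREREQ))
    return changed / len(outcomes)


def crawl_budget(score: float, total_links: int, min_links: int = 1) -> int:
    """Assumed rule: recrawl a share of the site's links proportional
    to its dynamicity, bounded by min_links and total_links."""
    wanted = math.ceil(score * total_links)
    return min(total_links, max(min_links, wanted))


def find_prerequisite(url: str, previous_crawl_order: List[str]) -> Optional[str]:
    """Return the page fetched immediately before `url` in the stored
    order of a previous crawl, or None if there is no such page."""
    if url not in previous_crawl_order:
        return None
    i = previous_crawl_order.index(url)
    return previous_crawl_order[i - 1] if i > 0 else None


def click_target(url: str, prerequisites: Dict[str, str]) -> str:
    """URL to send a user to: the prerequisite page if one is known
    (so it can set its cookie), otherwise the page itself."""
    return prerequisites.get(url, url)
```

For example, a site whose sampled links came back as one live, two dead, and one needing a prerequisite page would score 0.75 under this assumed formula, and `crawl_budget(0.75, 100)` would schedule 75 of its 100 known links for recrawl. Taking the page fetched immediately before the target is the simplest interpretation of "stored information about pages that were retrieved prior to retrieving a page"; a real system would likely need to verify that fetching the candidate actually makes the target retrievable.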