Building of a web corpus with the help of a reference web crawl
Abstract
Computer-implemented method for building a web corpus (WCD), comprising the steps of: sending, by a web crawler (WC), a query to a reference web crawl agent (RWCA), this query containing at least one identifier of a resource; receiving, by the web crawler (WC), a response from the reference web crawl agent (RWCA); if this response does not contain the resource identified by the identifier, downloading, by the web crawler (WC), the resource from the website (WS) corresponding to the identifier and adding the resource to the web corpus (WCD); and if this response contains the resource identified by the identifier, adding the resource to the web corpus (WCD).
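The claimed steps can be illustrated with a minimal Python sketch. It is not the patent's implementation: the names `build_corpus`, `query_reference`, and `download` are hypothetical, the reference web crawl agent is modelled as a callable that returns the resource or `None`, and the web corpus is modelled as a plain dictionary keyed by identifier.

```python
from typing import Callable, Dict, Iterable, Optional


def build_corpus(
    identifiers: Iterable[str],
    query_reference: Callable[[str], Optional[bytes]],
    download: Callable[[str], bytes],
) -> Dict[str, bytes]:
    """Build a corpus, preferring the reference crawl over the live website.

    All three parameters are illustrative assumptions:
    - query_reference stands in for the query/response exchange with the
      reference web crawl agent (RWCA); it returns None when the response
      does not contain the resource.
    - download stands in for fetching the resource from the website (WS).
    """
    corpus: Dict[str, bytes] = {}
    for identifier in identifiers:
        # Send the query to the reference agent and read its response.
        resource = query_reference(identifier)
        if resource is None:
            # Response lacks the resource: download it from the website.
            resource = download(identifier)
        # In both cases the resource is added to the web corpus (WCD).
        corpus[identifier] = resource
    return corpus


# Usage with in-memory stand-ins for the agent and the website.
reference_crawl = {"http://example.com/a": b"cached A"}
downloaded = []


def fake_download(url: str) -> bytes:
    downloaded.append(url)  # record which URLs hit the "website"
    return b"live " + url.encode()


corpus = build_corpus(
    ["http://example.com/a", "http://example.com/b"],
    reference_crawl.get,
    fake_download,
)
```

In this sketch only `http://example.com/b` reaches the download function, because the reference crawl already holds `http://example.com/a`. Contacting the website only for resources the reference crawl lacks is the point of the method.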