Web Spam Filtering in Internet Archives

Abstract

While Web spam is targeted for the high commercial value of top-ranked search-engine results, Web archives observe quality deterioration and resource waste as a side effect. So far, Web spam filtering technologies are rarely used by Web archivists but are planned for the future, as indicated in a survey with responses from more than 20 institutions worldwide. These archives typically operate on a modest budget that prohibits the operation of standalone Web spam filtering, but collaborative efforts could lead to a high-quality solution for them. In this paper we illustrate spam filtering needs, opportunities and blockers for Internet archives by analyzing several crawl snapshots, and the difficulty of migrating filter models across different crawls, using the example of the 13 .uk snapshots performed by UbiCrawler that include WEBSPAM-UK2006 and WEBSPAM-UK2007.

Bibliographic details

Source
5th international workshop on adversarial information retrieval on the web 2009 | 2009 | pp. 17-20 | 4 pages
Authors
Miklos Erdelyi; Andras A. Benczur; Julien Masanes; David Siklosi
Original format
PDF
Chinese Library Classification
Computer networks
Keywords
web spam; document classification; time series analysis

Similar literature

1. Web Archives for the Analog Archivist: Using Webpages Archived by the Internet Archive to Improve Processing and Description [J]. Gelfand Aleksandr. Journal of Western Archives. 2018, No. 1
2. 'Bayes's theorem became essential for internet spam filters, translation software and much more' [J]. New scientist. 2011, No. 2819
3. Spam Filtering for Optimization in Internet Promotions using Bayesian Analysis [J]. Ion SMEUREANU, Madalina ZURINI. Journal of Applied Quantitative Methods. 2010, No. 2
4. Web spam filtering in internet archives [C]. Miklos Erdelyi, Andras A. Benczur, Julien Masanes. 5th international workshop on adversarial information retrieval on the web 2009. 2009
5. The Internet filtering dilemma: A qualitative analysis of the beliefs, themes, and patterns associated with Internet filtering in Kansas K--12 schools. [D]. Brown, Ken. 2004
6. Cable modem access to picture archiving and communication system images using a web browser over the internet [O]. William F. Bennett, Dimitri G. Spigos, Kuldeep V. Vaswani. 2000
7. Web Spam Filtering in Internet Archives [O]. Miklós Erdélyi. 2014
