An automatic approach to extracting review link from Chinese news pages

Abstract

Review links are widely used in certain kinds of web pages, especially news pages. They are valuable in many applications, such as hot topic discovery and public opinion monitoring. Unfortunately, extracting review links manually from news pages is time-consuming and error-prone. Although much work has been done on web data extraction, we argue that this problem remains nontrivial because of the diversity of both DOM tree structure and visual presentation. In this paper, a novel approach is proposed for automatically extracting review links from web pages. The approach consists of two steps: first, each news page is segmented into a set of blocks; then, the block(s) containing the review link are identified using a machine learning technique. Experimental results over a large number of Chinese news pages indicate that this approach is highly accurate.
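The two-step idea in the abstract (segment the page into blocks, then classify each block) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not specify the segmentation algorithm, the features, or the learner. The sketch assumes well-formed HTML, treats each `div` or `td` element as a block, uses a few hand-picked DOM and text features (the paper's keywords suggest it also relies on visual features, which are omitted here), and substitutes a plain perceptron for the unnamed machine learning technique. All names (`BlockSegmenter`, `features`, `train_perceptron`, `find_review_blocks`), the keyword `评论` ("comment"), and the toy pages are hypothetical.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"div", "td"}  # assumption: these elements delimit blocks


class BlockSegmenter(HTMLParser):
    """Step 1: split a page into blocks, one per div/td element."""

    def __init__(self):
        super().__init__()
        self.blocks = []  # finished blocks, in closing order
        self.stack = []   # blocks that are currently open

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.stack.append({"text": "", "links": []})
        elif tag == "a" and self.stack:
            self.stack[-1]["links"].append(dict(attrs).get("href") or "")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self.stack:
            self.blocks.append(self.stack.pop())

    def handle_data(self, data):
        if self.stack:  # text is credited to the innermost open block
            self.stack[-1]["text"] += data.strip()


def segment(html):
    parser = BlockSegmenter()
    parser.feed(html)
    return parser.blocks


def features(block):
    """Hand-picked illustrative features (assumed, not from the paper)."""
    text, links = block["text"], block["links"]
    return [
        1.0,  # bias term
        1.0 if ("评论" in text or "comment" in text.lower()) else 0.0,
        1.0 if any("comment" in href.lower() for href in links) else 0.0,
        min(len(text), 200) / 200.0,  # normalized text length
        min(len(links), 10) / 10.0,   # normalized link count
    ]


def score(weights, x):
    return sum(w * v for w, v in zip(weights, x))


def train_perceptron(samples, epochs=20):
    """Step 2 (training): a plain perceptron as a stand-in classifier."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, label in samples:
            pred = 1 if score(weights, x) > 0 else 0
            for i, v in enumerate(x):
                weights[i] += (label - pred) * v
    return weights


def find_review_blocks(html, weights):
    """Step 2 (prediction): keep blocks the classifier scores as positive."""
    return [b for b in segment(html) if score(weights, features(b)) > 0]


# Toy labelled page: only the second block holds a review link.
train_page = (
    "<div>国际新闻：某地发生地震，救援工作正在进行。</div>"
    '<div><a href="/comment/123">我要评论(45)</a></div>'
    '<div><a href="/sports">体育</a><a href="/finance">财经</a></div>'
)
labels = [0, 1, 0]
samples = [(features(b), y) for b, y in zip(segment(train_page), labels)]
weights = train_perceptron(samples)

test_page = (
    "<div>财经报道：股市今日上涨。</div>"
    '<div><a href="/comment/9">评论</a></div>'
)
found = find_review_blocks(test_page, weights)
```

In this sketch `found` holds the blocks predicted to contain a review link, and each block's `links` list gives the candidate URLs. A realistic system would need many labelled pages, rendering-based features such as block position and size, and tolerance for malformed markup.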

Bibliographic details

Source
IEEE Joint International Information Technology and Artificial Intelligence Conference, 2011, 5 pages
Author
Liu Wei
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
Machine learning; Review link; Visual feature; Web data extraction

