Mining key information of web pages: A method and its application

Chao Wang; Jie Lu; Guangquan Zhang

Abstract

Web content mining aims to discover useful information and generate desired knowledge from a large number of web pages. Key information embedded in web pages, such as distinctive menu items and navigation indicators, can help classify the main contents of web pages and reflects certain taxonomy knowledge. Mining key information is therefore significant for acquiring domain knowledge and building catalogue classifiers. Current web content mining methods cannot mine such key information effectively, and "noise information" (such as advertisements) degrades the performance of web mining tasks. This paper proposes a method to extract key information from web pages that contain noisy information. The method has two steps: extract a list of candidate key information, then apply an entropy measure to filter out noisy information and discover key information. Experimental results show that the method is effective in discovering key information. Using the discovered key information, which reflects taxonomy knowledge, an application is developed to support ontology generation.
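The two-step idea in the abstract (collect candidates, then score them with an entropy measure) can be illustrated with a minimal sketch. The abstract does not give the paper's actual entropy formula, candidate extraction rules, or threshold, so everything here is an assumption: candidates are taken to be plain strings already extracted per page, the score is the normalized Shannon entropy of a candidate's occurrence counts across pages, and the function names `candidate_entropy` and `split_by_entropy` are hypothetical.

```python
import math
from collections import Counter


def candidate_entropy(pages):
    """Score each candidate by normalized Shannon entropy across pages.

    `pages` is a list of lists of candidate strings (one inner list per page).
    A score of 0 means the candidate occurs on a single page only; a score
    near 1 means it is spread evenly over all pages. This scoring is an
    illustrative assumption, not the measure defined in the paper.
    """
    n_pages = len(pages)
    counts = {}
    for idx, candidates in enumerate(pages):
        for text, freq in Counter(candidates).items():
            counts.setdefault(text, [0] * n_pages)[idx] = freq

    scores = {}
    for text, per_page in counts.items():
        total = sum(per_page)
        probs = [c / total for c in per_page if c]
        entropy = sum(p * math.log2(1 / p) for p in probs)
        # Normalize by the maximum possible entropy, log2(n_pages).
        scores[text] = entropy / math.log2(n_pages) if n_pages > 1 else 0.0
    return scores


def split_by_entropy(scores, threshold=0.5):
    """Split candidates into (low-entropy, high-entropy) sorted lists.

    Which group counts as key information and which as noise depends on
    the paper's definition, which the abstract does not state.
    """
    low = sorted(t for t, s in scores.items() if s <= threshold)
    high = sorted(t for t, s in scores.items() if s > threshold)
    return low, high


if __name__ == "__main__":
    # Hypothetical candidate strings from three pages of one site.
    sample_pages = [
        ["Products", "Buy now!"],
        ["Products", "Support"],
        ["Products", "Contact"],
    ]
    low, high = split_by_entropy(candidate_entropy(sample_pages))
    print("low entropy:", low)
    print("high entropy:", high)
```

The threshold value of 0.5 is arbitrary and would need tuning against labelled pages; the sketch only shows how an entropy score can separate candidates that recur across a site from those that appear in isolation.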

Bibliographic details

Source
Expert Systems with Applications | 2007, Issue 2 | pp. 425-433 | 9 pages
Authors
Chao Wang; Jie Lu; Guangquan Zhang
Author affiliation

Faculty of Information Technology, University of Technology, Sydney (UTS), P.O. Box 123, Broadway, NSW 2007, Australia

Indexing information
Original format: PDF
Text language: English
Chinese Library Classification: Artificial intelligence theory
Keywords
web content mining; web page; key information; entropy; taxonomy; ontology generation
