Libwebex: A Non-discriminating Web Content Extraction Library

Abstract

Libwebex is designed to be a powerful and flexible tool for web page content extraction in the domain of Data Integration. Exploiting the semi-structured nature of web pages, the library gives users a skeletal view of a page's content: the data elements to be extracted are arranged into a hierarchical tree of related topics and content. On top of this representation, users can build custom applications using the object-oriented Visitor design pattern; all a programmer needs to provide is the customized behavior to be executed at each node in the tree. The library also provides several basic building blocks for further applications, such as archiving, searching, and printing mechanisms.
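The abstract's central idea, a hierarchical tree of topics and content traversed with the Visitor design pattern, can be illustrated with a minimal Python sketch. This is not Libwebex's actual API, which this record does not describe: the names `Topic`, `Content`, `Visitor`, `PrintVisitor`, `SearchVisitor`, and `build_sample_tree` are all hypothetical, and the sample tree is hand-built rather than extracted from a real page.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class Visitor:
    """Base visitor (hypothetical): override only the hooks you need."""

    def visit_topic(self, node: Topic, depth: int) -> None:
        pass

    def visit_content(self, node: Content, depth: int) -> None:
        pass


@dataclass
class Content:
    """Leaf node holding one piece of extracted text."""
    text: str

    def accept(self, visitor: Visitor, depth: int = 0) -> None:
        visitor.visit_content(self, depth)


@dataclass
class Topic:
    """Interior node: a topic with nested topics and content beneath it."""
    title: str
    children: list = field(default_factory=list)

    def accept(self, visitor: Visitor, depth: int = 0) -> None:
        # Visit this topic first, then its children in document order.
        visitor.visit_topic(self, depth)
        for child in self.children:
            child.accept(visitor, depth + 1)


class PrintVisitor(Visitor):
    """Collects an indented outline of the tree (a 'printing' mechanism)."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def visit_topic(self, node: Topic, depth: int) -> None:
        self.lines.append("  " * depth + "# " + node.title)

    def visit_content(self, node: Content, depth: int) -> None:
        self.lines.append("  " * depth + node.text)


class SearchVisitor(Visitor):
    """Collects content nodes containing a keyword (a 'searching' mechanism)."""

    def __init__(self, keyword: str) -> None:
        self.keyword = keyword.lower()
        self.matches: list[str] = []

    def visit_content(self, node: Content, depth: int) -> None:
        if self.keyword in node.text.lower():
            self.matches.append(node.text)


def build_sample_tree() -> Topic:
    """Hand-built stand-in for the tree an extractor would produce."""
    return Topic("Libwebex", [
        Content("A web content extraction library."),
        Topic("Applications", [
            Content("Archiving extracted pages."),
            Content("Searching extracted content."),
        ]),
    ])


if __name__ == "__main__":
    tree = build_sample_tree()
    printer = PrintVisitor()
    tree.accept(printer)
    print("\n".join(printer.lines))
```

In this sketch the traversal logic lives in the nodes' `accept` methods, so a new application (for instance, an archiver) only needs a new `Visitor` subclass defining the behavior at each node, which matches the division of labor the abstract describes.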

Bibliographic details

Source: International conference on information knowledge engineering, 2016, pp. 23-26 (4 pages)
Conference location: Las Vegas, NV (US)
Authors: Michael Gruesen; En Cheng
Author affiliation: Department of Computer Science, College of Arts and Sciences, University of Akron, Akron, OH 44325-4003
Original format: PDF
Keywords: Data Integration; Web Content Extraction; Web Page Interpreter

Similar literature

1. Business Library Web Sites Revisited: An Updated Review of the Organization and Content of Academic Business Library Web Sites [J]. Charles Lyons, Hal Kirkwood. Journal of business & finance librarianship. 2009, No. 4
2. Relation Extraction from Web Contents with Linguistic and Web Features（言語分析およびWeb上の情報を用いたコンテンツからの関係の抽出） [J]. 顔玉蘭. 人工知能学会志. 2011, No. 1
3. Use of Web Content Management Systems (WCMSs) in Library: A Study of University Libraries in Bangladesh [J]. Umme Habiba, Akhtar Rozifa. Research Journal of Library Sciences. 2015, No. 6
4. Libwebex: A Non-discriminating Web Content Extraction Library [C]. Michael Gruesen, En Cheng. International conference on information knowledge engineering. 2016
5. Content and quality of school library Web sites in Missouri [D]. Krause, Kelly R. 2006
6. Case study: the Health SmartLibrary experiences in web personalization and customization at the Galter Health Sciences Library Northwestern University [O]. James Shedlock, Michelle Frisque, Steve Hunt, 2010
7. Improving Webpage Content Extraction by extending a novel single page extraction approach: A case study with Thai websites [O]. Thanadechteemapat W., Fung C.C. 2012
