Visual-based web page analysis.

Abstract

This research investigates how to identify the different content areas on a webpage by comparing the visual features and relative characteristics of each area, called a visual block in this study. The process uses image segmentation to extract and parse a webpage's visual features, then analyzes them to identify the function of each content area based on its layout and position.

To accomplish this, the study reviews several techniques used in related fields and discusses their strengths and weaknesses. The main weakness of past techniques is that they rely heavily on HTML; in other words, they are language-dependent. This paper proposes a visual-based technique that relies on visual features rather than HTML and is therefore more language-independent. To determine the function of each visual block, the technique uses an algorithm that parses webpages into a tree structure and applies a rule modeled on how humans determine the relationship between two objects on a 2D monitor.

The goal of this research is to design an automated visual-based algorithm that examines each visual block shown on the webpage and applies human cognitive processes to decide the role of each block. For example, one might wish to identify the main content, the sub content, the navigation menu, and the advertisement.

Chapter 1 describes the motivation, the issue, and a possible solution to the problem. Chapter 2 reviews several technologies that can be used to solve the problem and outlines possible future research. Chapter 3 explains how the test environment was prepared and which techniques were used. Chapter 4 describes the results: what was accomplished, what was missing, and what further research is necessary. Chapter 5 concludes with the possibilities of this research and how future work might help accomplish its final goal.
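The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea: nest bounding boxes of visual blocks into a tree by containment, then guess each block's role from its size and position. The `Block` class, the `build_tree` and `guess_role` helpers, and every threshold below are hypothetical illustrations, not the thesis's method, and the blocks are assumed to come from some earlier image segmentation step.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Axis-aligned bounding box of a visual block, in pixels (hypothetical type)."""
    name: str
    x: int
    y: int
    w: int
    h: int
    children: list = field(default_factory=list)

    @property
    def right(self):
        return self.x + self.w

    @property
    def bottom(self):
        return self.y + self.h

    @property
    def area(self):
        return self.w * self.h


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return (outer.x <= inner.x and outer.y <= inner.y
            and outer.right >= inner.right and outer.bottom >= inner.bottom)


def build_tree(page, blocks):
    """Attach each block to the smallest already-placed block that contains it."""
    placed = [page]
    # Larger blocks are placed first so that potential parents exist before children.
    for block in sorted(blocks, key=lambda b: b.area, reverse=True):
        candidates = [p for p in placed if contains(p, block)]
        parent = min(candidates, key=lambda p: p.area, default=page)
        parent.children.append(block)
        placed.append(block)
    return page


def guess_role(block, page):
    """Guess a role from layout and position. All thresholds are illustrative assumptions."""
    width_share = block.w / page.w
    area_share = block.area / page.area
    if width_share > 0.8 and block.y < 0.2 * page.h and block.h < 0.15 * page.h:
        return "navigation menu"   # wide, short strip near the top
    if area_share >= 0.4:
        return "main content"      # dominates the page area
    if width_share < 0.3 and block.x >= 0.7 * page.w:
        return "advertisement"     # narrow column at the right edge
    return "sub content"


page = Block("page", 0, 0, 1000, 2000)
nav = Block("nav", 0, 0, 1000, 100)
main = Block("main", 0, 100, 700, 1500)
para = Block("para", 20, 120, 660, 300)
ad = Block("ad", 750, 100, 250, 600)

build_tree(page, [nav, main, para, ad])
for b in (nav, main, para, ad):
    print(b.name, "->", guess_role(b, page))
```

Sorting by area before nesting keeps the containment check simple; a real system would also need to handle overlapping or partially visible blocks, which this sketch ignores.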


Bibliographic record

Author: Lee, Kuang-Yao
Author affiliation: San Diego State University
Degree-granting institution: San Diego State University
Subject: Computer Science
Degree: M.S.
Year: 2014
Pages: 46 p.
Total pages: 46
Original format: PDF
Language of text: eng

