International Conference on Data and Software Engineering

Focused crawler for the acquisition of health articles


Abstract

Health interventions delivered through technology can serve as an alternative to seeing a doctor, especially for common health problems. To support such technology, a health knowledge base is needed as a foundation, and current advances in artificial intelligence and hardware make this feasible. The broader goal of our research is to build an application that uses a health knowledge base to deliver health interventions. As a first step, we collect health-related articles. To do so, we build a focused crawler that combines multithreaded programming, the Larger-Sites-First scheduling algorithm, and a Naïve Bayes classifier. We find that article acquisition saturates as the number of threads increases. Furthermore, the Larger-Sites-First algorithm does increase the number of crawled articles, but the gain is not significant. In addition, Naïve Bayes correctly recognizes ≥ 90 percent of articles in the ideal case for both the health and non-health categories; however, its performance drops when recognizing non-health articles that contain health keywords.
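The paper does not publish its implementation, but the two core pieces named in the abstract can be sketched together: a Larger-Sites-First frontier (pages from the site with the most queued pages are crawled first) feeding a multinomial Naïve Bayes filter that keeps only health articles. The sketch below is a minimal single-threaded illustration under our own assumptions; the `fetch` callable, the toy training data, and all names are hypothetical, and the multithreading described in the paper is omitted for brevity.

```python
import heapq
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal multinomial Naive Bayes for health vs. non-health text."""
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.class_counts = Counter()            # per-class document counts
        self.vocab = set()

    def train(self, docs):
        for text, label in docs:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.class_counts[label] += 1
            self.vocab.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.class_counts:
            # log prior + log likelihoods with Laplace (add-one) smoothing
            score = math.log(self.class_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def crawl(frontier, site_sizes, classifier, fetch):
    """Larger-Sites-First: always pop the URL whose site has the most
    queued pages. heapq is a min-heap, so sizes are negated."""
    heap = [(-site_sizes[site], site, url) for site, url in frontier]
    heapq.heapify(heap)
    health_articles = []
    while heap:
        _, site, url = heapq.heappop(heap)
        text = fetch(url)  # stand-in for an HTTP fetch + HTML-to-text step
        if classifier.predict(text) == "health":
            health_articles.append(url)
    return health_articles
```

Negating the site size turns Python's min-heap into the max-first ordering that Larger-Sites-First requires; in a real crawler the size would be updated as pages are discovered, and `fetch` would run on a thread pool.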
