Application of Data Mining Technology on Surveillance Report Data of HIV/AIDS High-Risk Group in Urumqi from 2009 to 2015

Tang Dandan; Zhang Man; Xu Jiabo; Zhang Xueliang; Yang Fang; Li Huling; Feng Li; Wang Kai; Zheng Yujian

Abstract

Objective. Urumqi is one of the key areas of HIV/AIDS infection in Xinjiang and in China. The AIDS epidemic is spreading from high-risk groups to the general population, and the situation remains very serious. This study used four data mining algorithms to build models for identifying HIV infection and compared their predictive performance.

Method. The data came from sentinel surveillance of three high-risk groups in Urumqi from 2009 to 2015: injecting drug users (IDU), men who have sex with men (MSM), and female sex workers (FSW). They included demographic characteristics, sexual behavior, and serological test results. Age, marital status, education level, and other variables were used as input variables, and HIV infection status as the output variable, to establish four prediction models for the three datasets. Classification performance was evaluated with the confusion matrix, accuracy, sensitivity, specificity, precision, recall, and the area under the receiver operating characteristic (ROC) curve (AUC), and the importance of the predictive variables was analyzed.

Results. The random forests algorithm obtained the best results, with a diagnostic accuracy of 94.4821% on the MSM dataset, 97.5136% on the FSW dataset, and 94.6375% on the IDU dataset. The k-nearest neighbors algorithm came second, with a diagnostic accuracy of 91.5258% on the MSM dataset, 96.3083% on the FSW dataset, and 90.8287% on the IDU dataset, followed by the support vector machine (94.0182%, 98.0369%, and 91.3571%). The decision tree algorithm was the poorest of the four, with a diagnostic accuracy of 79.1761% on the MSM dataset, 87.0283% on the FSW dataset, and 74.3879% on the IDU dataset.

Conclusions. Data mining technology, as a new method of assisting disease screening and diagnosis, can help medical personnel screen for and diagnose AIDS rapidly from a large volume of information.
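The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`confusion_counts`, `classification_metrics`, `auc`) are hypothetical, and the labels and scores are made-up toy values, not study data. It assumes binary labels with 1 for HIV-positive and 0 for HIV-negative, and computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count the four confusion-matrix cells for binary labels (1 = positive)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Derive accuracy, sensitivity (recall), specificity, and precision."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),  # same quantity as recall
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Toy example (illustrative values only, not from the surveillance data).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cut-off

cells = confusion_counts(toy_true, toy_pred)
print(cells)
print(classification_metrics(**cells))
print(auc(toy_true, toy_scores))
```

Sensitivity and recall are the same quantity, so the sketch reports it once. The pairwise AUC is quadratic in the number of cases, which is fine for a small illustration but not for large datasets.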

Bibliographic details

Source: Complexity, 2018, Issue 1
Original format: PDF
Chinese Library Classification: Large-scale systems theory
