Topic models and dynamic prediction models and their applications in document retrieval and healthcare.

Abstract

Statistical topic models have been widely studied in text mining as an effective approach to extracting latent topics from unstructured text documents. We present a robust and computationally efficient hierarchical Bayesian model for topic correlation modeling using the Generalized Dirichlet (GD) distribution. The resulting model, GD-LDA, avoids over-fitting as the number of topics increases. We report results using Empirical Likelihood (EL) on four public datasets. We then show applications of topic models in two domains: (1) information retrieval and (2) dynamic prediction models applied to health care.

In information retrieval, we propose leveraging statistical topic modeling techniques in relevance feedback to obtain a better estimate of context by including corpus-level information about the document. We show results on the OHSUMED dataset for three different variants and obtain higher performance, up to 12.5% in Mean Average Precision (MAP).

Patients often search the web for information about treatments and diseases after they are discharged from the hospital. However, searching for medical information on the web is challenging because the same disease or treatment can be described with many related terms and synonyms.

We present a method to retrieve healthcare-related documents using the patient discharge document. We show that the proposed framework outperformed the winner of the CLEF eHealth 2013 retrieval challenge by 68% in MAP and by 13% in NDCG.

We also present a method to dynamically estimate the probability of mortality inside the Intensive Care Unit (ICU) by combining heterogeneous data. The method is based on Generalized Linear Dynamic Models and treats the probability of mortality as a latent state that evolves over time. This framework allows us to combine different types of features (lab results, vital sign readings, doctor and nurse notes, etc.) into a single state, which we update each time new patient data is observed. We test the proposed approach using 15,000 Electronic Medical Records (EMRs) obtained from the MIMIC II public data set.

We extend this dynamic mortality estimation model in two ways. First, we estimate the probability that a patient is readmitted after being discharged from the ICU and transferred to a lower-level care unit. Second, we present a method to dynamically predict the failure of physiological subsystems in patients admitted to the ICU using heterogeneous data. We model the probability of failure in each subsystem as a latent state, and then estimate the probability of patient mortality as a combination of the estimated failure propensities of all subsystems. We also propose a method of imputing missing values that exploits the non-ignorable nature of the missing patient data. Experimental results show that our method outperforms other approaches in the literature in terms of AUC, sensitivity, and specificity. In addition, we show that combining different feature types (numerical and text) increases the prediction performance of the proposed approach.
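
The idea of a mortality risk that is held as a latent state and revised whenever new patient data arrives can be illustrated with a much simpler model than the one the thesis describes. The sketch below is not the author's method: it swaps the Generalized Linear Dynamic Model for a scalar linear-Gaussian (Kalman) filter on the log-odds of mortality. The class name `LatentRiskFilter`, its parameters, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class LatentRiskFilter:
    """Scalar random-walk state (log-odds of mortality) with Gaussian updates.

    Assumption: a linear-Gaussian Kalman filter stands in for the
    generalized linear dynamic model named in the abstract.
    """

    def __init__(self, mean: float = 0.0, var: float = 1.0, drift_var: float = 0.05):
        self.mean = mean            # current estimate of the latent log-odds
        self.var = var              # uncertainty of that estimate
        self.drift_var = drift_var  # how far the state may drift between observations

    def update(self, score: float, noise_var: float) -> float:
        """Fold in one noisy risk score (log-odds scale); return P(mortality)."""
        # Predict step: the state follows a random walk, so only uncertainty grows.
        prior_var = self.var + self.drift_var
        # Correct step: weight the new evidence by its reliability.
        gain = prior_var / (prior_var + noise_var)
        self.mean = self.mean + gain * (score - self.mean)
        self.var = (1.0 - gain) * prior_var
        return sigmoid(self.mean)


if __name__ == "__main__":
    risk = LatentRiskFilter(mean=-2.0)
    # (source, score, noise variance): made-up values, not MIMIC II data.
    observations = [("lab", -1.0, 0.5), ("vitals", 0.5, 1.0), ("note", 1.5, 2.0)]
    for source, score, noise_var in observations:
        p = risk.update(score, noise_var)
        print(f"{source:>6}: P(mortality) ~ {p:.3f}")
```

Each data source is given its own noise variance, which is one simple way to let heterogeneous features (lab results, vital signs, text-derived scores) feed a single state with different weights. How the thesis actually maps features to the latent state is not specified in the abstract.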

Bibliographic record

Author: Caballero Barajas, Karla L.
Author affiliation: University of California, Santa Cruz
Degree-granting institution: University of California, Santa Cruz
Subject: Information technology; Statistics; Information science
Degree: Ph.D.
Year: 2015
Pages: 175 p.
Original format: PDF
Language: English
