Conference: IEEE International Conference on Data Mining Workshops

Imputation of Missing Values in Time Series with Lagged Correlations

Abstract

Missing values are a common problem in real-world data and are particularly prevalent in biomedical time series, where a patient's medical record may be split across multiple institutions or a device may briefly fail. These data are not missing completely at random, so ignoring the missing values can lead to bias and error during data mining. However, current imputation methods have yet to account for the fact that variables are correlated and that those relationships exist across time. To address this, we propose an imputation method (FLk-NN) that incorporates time-lagged correlations both within and across variables by combining two imputation methods, one based on an extension of k-NN and one based on the Fourier transform. This enables imputation of missing values even when all data at a time point are missing and when there are different types of missingness both within and across variables. In comparison with other approaches on two biological datasets (simulated glucose in Type 1 diabetes and multi-modality neurological ICU monitoring), the proposed method has the highest imputation accuracy. This held with up to half of the data missing and when runs of consecutive missing values made up a significant fraction of the overall time series length.
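The abstract names two ingredients, a k-NN variant that uses time-lagged values of other variables and a Fourier-based estimate, without giving their details. The following is only a rough illustrative sketch of how two such estimates could be combined, not the paper's FLk-NN algorithm. The function names, the fixed user-supplied lags, the low-pass cutoff `n_keep`, and the plain average of the two estimates are all assumptions made for this example.

```python
import numpy as np

def fourier_impute(x, n_keep=5, n_iter=50):
    """Fill NaNs in a 1-D series by repeated low-pass reconstruction (assumed scheme)."""
    x = np.asarray(x, dtype=float)
    miss = np.isnan(x)
    filled = np.where(miss, np.nanmean(x), x)   # start from the observed mean
    for _ in range(n_iter):
        spec = np.fft.rfft(filled)
        spec[n_keep + 1:] = 0                   # keep DC plus the n_keep lowest frequencies
        smooth = np.fft.irfft(spec, n=len(x))
        filled[miss] = smooth[miss]             # only missing entries are updated
    return filled

def lagged_knn_impute(X, target, lags, k=3):
    """Fill NaNs in column `target` of X (T x V) from neighbours in a lagged feature space.

    `lags` maps a source column index to a non-negative lag p (hypothetical input):
    the feature for time t is X[t - p, source].
    """
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    F = np.full((T, len(lags)), np.nan)
    for j, (src, p) in enumerate(lags.items()):
        F[p:, j] = X[:T - p, src]               # shift each source series by its lag
    y = X[:, target].copy()
    usable = ~np.isnan(F).any(axis=1)           # rows whose lagged features are all observed
    donors = np.where(usable & ~np.isnan(y))[0]
    if len(donors) == 0:
        return y
    for t in np.where(usable & np.isnan(y))[0]:
        dist = np.linalg.norm(F[donors] - F[t], axis=1)
        nearest = donors[np.argsort(dist)[:k]]
        y[t] = X[nearest, target].mean()        # average the k most similar time points
    return y                                    # entries stay NaN where features are missing

def combined_impute(X, target, lags, k=3):
    """Average the two estimates where both exist; otherwise use the Fourier one."""
    X = np.asarray(X, dtype=float)
    y_knn = lagged_knn_impute(X, target, lags, k)
    y_fft = fourier_impute(X[:, target])
    out = y_fft.copy()
    both = ~np.isnan(y_knn)
    out[both] = 0.5 * (y_knn[both] + y_fft[both])
    observed = ~np.isnan(X[:, target])
    out[observed] = X[observed, target]         # never overwrite observed values
    return out

# Usage on synthetic data: column 1 follows column 0 with a 2-step delay.
t = np.arange(64)
a = np.sin(2 * np.pi * t / 16)
b = np.roll(a, 2)                               # b[t] = a[t - 2]; the signal is periodic
X = np.column_stack([a, b])
truth = X[20:24, 1].copy()
X[20:24, 1] = np.nan                            # simulate a short device dropout
imputed = combined_impute(X, target=1, lags={0: 2})
print(np.abs(imputed[20:24] - truth).max())
```

The fallback to the Fourier estimate reflects the abstract's point that imputation should remain possible when every variable is missing at a time point: in that case no lagged neighbour features may be available, but the target series' own frequency content still is.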