Online updating method with new variables for big datastreams

Abstract

For big data arriving in streams, online updating is an important statistical method that breaks the storage barrier and the computational barrier under certain circumstances. In the regression context, online updating algorithms assume that the set of predictor variables does not change, and consequently cannot incorporate new variables that may become available midway through the data stream. A naive approach would be to discard all previous information and start updating with new variables from scratch. We propose a method that utilizes the information from earlier data in the online updating algorithm with bias corrections to improve efficiency. The method is developed for linear models first, and then extended to estimating equations for generalized linear models. Closed-form expressions for the efficiency gain over the naive approach are derived in a particular linear model setting. We compare the performance of our proposed bias-correcting approach and the naive approach in simulation studies with data generated from a normal linear model and a logistic regression model. The method is applied to a study on airline delay, where reasons for delays were only available more recently, starting in 2003.
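As a rough illustration of the setting described in the abstract, the following minimal sketch shows standard online updating for a linear model (accumulating X'X and X'y block by block) and the naive restart when a new predictor appears midway through the stream. It does not implement the bias-corrected method the authors propose, whose details are not given here. The class name `OnlineOLS`, the block sizes, and the simulated coefficients are all assumptions made for the example.

```python
import numpy as np


class OnlineOLS:
    """Least squares from cumulative sufficient statistics (hypothetical helper).

    Only X'X and X'y are stored, so raw data blocks can be discarded
    after each update.
    """

    def __init__(self, p):
        self.xtx = np.zeros((p, p))
        self.xty = np.zeros(p)
        self.n = 0

    def update(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.n += X.shape[0]

    def coef(self):
        # Solve the normal equations (X'X) b = X'y.
        return np.linalg.solve(self.xtx, self.xty)


# Simulated stream: 10 blocks of 200 rows; the third predictor is
# assumed to become available only from block 5 onward.
rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0, 0.5])
blocks = []
for _ in range(10):
    X = rng.normal(size=(200, 3))
    y = X @ beta + rng.normal(size=200)
    blocks.append((X, y))

# Early blocks: only the first two predictors are observed.
early = OnlineOLS(p=2)
for X, y in blocks[:5]:
    early.update(X[:, :2], y)

# Naive approach: discard the early information and restart
# the updating with all three predictors.
naive = OnlineOLS(p=3)
for X, y in blocks[5:]:
    naive.update(X, y)

# The online estimate matches a batch fit on the same late blocks.
X_late = np.vstack([X for X, _ in blocks[5:]])
y_late = np.concatenate([y for _, y in blocks[5:]])
batch_coef = np.linalg.lstsq(X_late, y_late, rcond=None)[0]
```

In this sketch the naive estimator uses only half of the stream, which is the efficiency loss the abstract refers to. The early-block fit omits a predictor, so its coefficients cannot simply be merged with the later ones without some form of correction.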

Bibliographic details

Journal: other
Authors: Chun Wang; Ming-Hui Chen; Jing Wu; Jun Yan; Yuping Zhang; Elizabeth Schifano
Volume (issue): 46 (1); year not listed
Pages: 123–146
Total pages: 37
Format: PDF
Keywords: Added variable; estimating equation; data compression; regression

Similar literature

1. Coal Mine Safety Evaluation Method Based on Incomplete Labeled DataStream Classification [J]. Sun Gang, Zhou Huaping, Sun Kelei. The Open Cybernetics & Systemics Journal. 2017, No. 1
2. Comparison of online model updating methods in pseudo-dynamic hybrid simulations of TADAS frames [J]. Mohagheghian Kamal, Mohammadi Reza Karami. Bulletin of Earthquake Engineering. 2017, No. 10
3. A robust approach for online feedback path modeling in single-channel narrow-band active noise control systems using two distinct variable step size methods [J]. Haseeb Abdul, Tufail Muhammad, Ahmed Shakeel. Applied Acoustics. 2018, No. APRa
4. STARLORD: Sliding Window Temporal Accumulate-Retract Learning for Online Reasoning on Datastreams [C]. Cristian Axenie, Radu Tudoran, Stefano Bortoli. IEEE International Conference on Machine Learning and Applications. 2018
5. Sequential Quadratic Programming Methods Based on Approximating a Projected Hessian Matrix (Updating Method, Quasi-Newton, Nonlinear Constraints) [D]. Gurwitz, Chaya Bleich. 1986
6. Examining variable selection methods for the predictive performance of regression models and the proportion of selected variables and selected random variables [O]. Hiromasa Kaneko. 2021
7. Online Tensor Methods for Learning Latent Variable Models [O]. Huang, Furong; Niranjan, U. N.; Hakeem, Mohammad Umar. 2015
