Provenance-aware framework for lightweight capture and high quality data regeneration.

Abstract

The provenance, or derivation history, of a dataset is additional data needed to assess data quality and to repeat the science behind the results, which in turn enables data regeneration. Existing models and techniques for provenance capture identify and record provenance only during the execution of an experiment. These mechanisms are coarse-grained: they are frequently unable to capture the exact mapping between the inputs and outputs of an experiment, and they capture only the processes that generate the data and the inputs to those processes. But the underlying programs change, and the applications that create an experiment are frequently regenerated with different configurations or parameters. Additionally, provenance capture from different programming environments and execution platforms has so far required human intervention in the form of manual annotation of applications, which can result in incorrect or missing information. Currently, no generic model or framework exists that automatically identifies and captures provenance while addressing these issues, which must be resolved in order to regenerate data from a wide range of applications running on different environments.

This dissertation proposes a unified model of provenance capture for high quality data regeneration and builds a framework for automatically capturing provenance with extremely low overhead. The generic framework captures provenance for both the application and the data generated by the application. A combination of compilers, a runtime system, and a rule engine is used to collect provenance at different levels of application deployment and execution. This dissertation highlights the generality of this framework and studies the viability of capturing provenance through static (compile-time) and dynamic (runtime) components. We evaluate our methodology on different types of applications and benchmarks and show that capturing static and dynamic provenance contributes to high quality data analysis and regeneration. Finally, we study and recommend provenance capture approaches, taking into consideration application and platform limitations.

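Bibliographic record

Author: Ghoshal, Devarshi
Author affiliation: Indiana University
Degree-granting institution: Indiana University
Discipline: Computer Science
Degree: Ph.D.
Year: 2014
Pages: 242 p.
Total pages: 242
Original format: PDF
Language of text: eng
Chinese Library Classification:
Keywords: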

Similar literature

1. Capturing Enterprise Data Integration Challenges Using a Semiotic Data Quality Framework [J]. John Krogstie. Wirtschaftsinformatik. 2015, No. 1
2. Capturing Enterprise Data Integration Challenges Using a Semiotic Data Quality Framework [J]. Prof. John Krogstie. Business & Information Systems Engineering. 2015, No. 1
3. Evaluation of a mandatory quality assurance data capture in anesthesia: a secure electronic system to capture quality assurance information linked to an automated anesthesia record [J]. Peterfreund RA, Driscoll WD, Walsh JL, Anesthesia and Analgesia: Journal of the International Anesthesia Research Society. 2011, No. 5
4. Towards a provenance-aware spatial-temporal architectural framework for massive data integration and analysis [C]. Ivens Portugal, Paulo Alencar, Donald Cowan. IEEE International Congress on Big Data. 2016
5. Improving quality and quantity of data captured via wearable physiological sensors: A step towards precision medicine initiative [D]. Rahman, Md. Mahbubur. 2016
6. A validation of clinical data captured from a novel Cancer Care Quality Program directly integrated with administrative claims data [O]. David M Kern, John J Barron, Bingcao Wu, 2017
7. A Data Capture Framework for Improving the Quality of Subsurface Utility Information [O]. R. van Son, S. W. Jaw, A. Wieser. 2019
