
Effective entity resolution methodology for improving data quality and reliability of service-oriented applications.


Abstract

This dissertation proposes new paradigms for improving the testing and reliability of service-oriented applications, as well as the quality of their data. Testing service-oriented systems can be very challenging because it is difficult to track information as it flows through the multiple tiers of an application. We present a methodology for testing service-oriented applications that takes into account all of the components, including services, external services, and data components. The results of our experiments demonstrate that this approach greatly improves the effectiveness of testing service-oriented applications.

To examine the effects of invalid data on the reliability of services and service-oriented applications, we first developed an approach that quantifies the quality of database relations and computes the quality of data components based on the type of interactions they have with software components. We then developed a methodology that incorporates data quality into reliability modeling and can therefore better account for failures caused by invalid data. Our empirical results show that our model estimates the reliability of service-oriented applications more accurately than traditional approaches, detecting 16% more faults.

Recognizing the importance of data quality, we developed an Entity Resolution (ER) algorithm that provides not only a blocking scheme but also a two-stage comparison selection process. It efficiently removes oversized blocks and identifies the comparisons that are most likely to contain duplicates. Our empirical results demonstrate the usefulness of our algorithm with respect to both efficiency and effectiveness: it reduces the number of comparisons that need to be resolved, and it increases the number of detected duplicates.
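The abstract does not give the details of the ER algorithm, so the following is only a generic illustration of the ideas it names (blocking, removal of oversized blocks, and selection of promising comparisons), not the dissertation's method. It assumes simple token blocking and a shared-block count as the selection criterion; the function names, the `max_size` and `min_shared` parameters, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def token_blocks(records):
    """Token blocking: map each lowercase token to the ids of records containing it."""
    blocks = defaultdict(set)
    for rid, text in records.items():
        for token in text.lower().split():
            blocks[token].add(rid)
    return blocks


def purge_oversized(blocks, max_size):
    """Drop blocks that are too large to be informative, and singleton blocks."""
    return {key: ids for key, ids in blocks.items() if 2 <= len(ids) <= max_size}


def select_comparisons(blocks, min_shared):
    """Keep record pairs that co-occur in at least `min_shared` blocks."""
    shared = defaultdict(int)
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1
    return {pair for pair, count in shared.items() if count >= min_shared}


# Hypothetical sample data, for illustration only.
records = {
    1: "Anna Kowalska Albany",
    2: "A. Kowalska Albany NY",
    3: "John Smith Albany",
    4: "Jon Smith Boston",
}

blocks = purge_oversized(token_blocks(records), max_size=2)
candidates = select_comparisons(blocks, min_shared=1)
print(sorted(candidates))
```

In this toy run the block for the common token "albany" holds three records and is purged, so only the pairs sharing a rarer token remain as candidate comparisons. A real system would then pass each remaining pair to a similarity function to decide whether it is a duplicate.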

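Bibliographic record

  • Author: Musial, Ewa
  • Author affiliation: State University of New York at Albany
  • Degree-granting institution: State University of New York at Albany
  • Discipline: Computer Science; Computer Engineering
  • Degree: Ph.D.
  • Year: 2014
  • Pagination: 139 p.
  • Total pages: 139
  • Original format: PDF
  • Language of text: English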
