Pattern-Based Data Mining on Diverse Multimedia and Time Series Data.

Abstract

Patterns are ubiquitous in data mining and knowledge discovery data sets, and they are a unifying characteristic across a diverse, and possibly otherwise unrelated, range of image, audio, video, and time series data. Despite the distinctions within and between data sets, there is usually a notion of a pattern within each kind of data. These patterns may manifest as macro and micro textures in images, n-grams in text, motifs in time series, and so on. Despite this recurring trait, scientific studies of these data sets have an expansive history of varied methods, with new algorithms continuously presenting novel techniques and/or specialized parameters to adjust to their particular data. Because of this growing algorithmic complexity, work with new data requires an in-depth review of the voluminous research background in order to optimize the selection of an algorithm's sub-functions, feature spaces, and parameters.

Rather than providing data-dependent approaches that cater to the variances in the data, this work leverages the existence of patterns in many, if not all, data sets by using the data's pattern as its atomic form of representation. Because our algorithms operate on the data's patterns, all that is necessary to apply these pattern-based methods to new, unseen data is an understanding of the data's patterns, a notion well understood by human intuition and abundantly expressed in the literature of many fields.

We first present a framework that provides an extremely accurate, fast, and parameter-less method for measuring textural pattern similarity in images. We then demonstrate that this pattern-based method continues to perform well across a large variety of image data sets from very diverse fields, and show that it performs equally well in the realm of audio similarity.

To further demonstrate the reach of pattern-based approaches, we present a novel method for the discovery of motif rules in time series, a pattern discovery problem where previous research efforts have been shown to deliver meaningless results. We then demonstrate optimizations for time series similarity search, a core subroutine of time series rule discovery and many other time series data mining algorithms.

Bibliographic record

  • Author

    Campana, Bilson Jake

  • Author affiliation

    University of California, Riverside

  • Degree-granting institution University of California, Riverside
  • Subject Computer Science
  • Degree Ph.D.
  • Year 2012
  • Pages 106 p.
  • Total pages 106
  • Original format PDF
  • Language English