Journal: Physica D: Nonlinear Phenomena

Data compression and learning in time sequences analysis


Abstract

Motivated by the problem of defining a distance between two sequences of characters, we investigate the so-called learning process of typical sequential data compression schemes. We focus on how a compression algorithm optimizes its features at the interface between two different sequences A and B while zipping the sequence A + B obtained by simply appending B after A. We show the existence of a scaling function (the "learning function") which rules the way in which the compression algorithm learns a sequence B after having compressed a sequence A. In particular, it turns out that there exists a cross-over length for the sequence B, which depends on the relative entropy between A and B. Below this length the compression algorithm does not learn the sequence B (measuring in this way the cross-entropy between A and B); above it the algorithm starts learning B, i.e. optimizing the compression using the specific features of B. We check the scaling on three main classes of systems: Bernoulli schemes, Markovian sequences and the symbolic dynamics generated by a nontrivial chaotic system (the Lozi map). As a last application of the method we present the results of a recognition experiment, namely recognizing which dynamical system produced a given time sequence. We finally point out the potential of these results for segmentation purposes, i.e. the identification of homogeneous sub-sequences in heterogeneous sequences (with applications in various fields, from genetics to time-series analysis). (C) 2003 Elsevier Science B.V. All rights reserved. [References: 48]
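The "zip A + B" measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes Python's `zlib` (DEFLATE, an LZ77-type compressor) as a stand-in for the compression scheme, and the Bernoulli parameters, sequence lengths, seeds and the helper names `bernoulli`, `zipped_len` and `cost_per_char` are all hypothetical choices made for illustration. The total length is kept under zlib's 32 KB window so that A remains visible to the compressor while it processes B.

```python
import random
import zlib


def bernoulli(n, p, seed):
    """Return n ASCII characters, each '1' with probability p, else '0'."""
    rng = random.Random(seed)
    return bytes(ord("1") if rng.random() < p else ord("0") for _ in range(n))


def zipped_len(data):
    """Length in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def cost_per_char(a, b):
    """Extra compressed bits per character of b when b is appended after a."""
    return 8.0 * (zipped_len(a + b) - zipped_len(a)) / len(b)


# Assumed setup: A comes from a Bernoulli source with p = 0.1.
A = bernoulli(20000, 0.1, seed=1)
B_same = bernoulli(8000, 0.1, seed=2)   # same source as A
B_diff = bernoulli(8000, 0.5, seed=3)   # different source

# Per-character cost of the appended piece as its length grows.
for n in (250, 1000, 4000, 8000):
    print(n, cost_per_char(A, B_same[:n]), cost_per_char(A, B_diff[:n]))
```

Under these assumptions, a piece drawn from a different source costs more bits per character to append than a piece drawn from the same source as A. How that cost changes with the length of B is the dependence that the abstract's learning function describes; this sketch only prints the raw numbers and makes no claim about reproducing the paper's scaling.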
