Data compression and learning in time sequences analysis

Puglisi A.; Benedetto D.; Caglioti E.; Loreto V.; Vulpiani A.

Abstract

Motivated by the problem of defining a distance between two sequences of characters, we investigate the so-called learning process of typical sequential data compression schemes. We focus on how a compression algorithm optimizes its features at the interface between two different sequences A and B while zipping the sequence A + B obtained by simply appending B after A. We show the existence of a scaling function (the "learning function") which rules the way in which the compression algorithm learns a sequence B after having compressed a sequence A. In particular, it turns out that there exists a cross-over length for the sequence B, which depends on the relative entropy between A and B, below which the compression algorithm does not learn the sequence B (measuring in this way the cross-entropy between A and B) and above which it starts learning B, i.e. optimizing the compression using the specific features of B. We check the scaling on three main classes of systems: Bernoulli schemes, Markovian sequences and the symbolic dynamics generated by a nontrivial chaotic system (the Lozi map). As a last application of the method we present the results of a recognition experiment, namely recognizing which dynamical system produced a given time sequence. We finally point out the potential of these results for segmentation purposes, i.e. the identification of homogeneous sub-sequences in heterogeneous sequences (with applications in various fields from genetics to time-series analysis). (C) 2003 Elsevier Science B.V. All rights reserved. [References: 48]
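
The measurement described in the abstract (zip A, zip A + B, and attribute the size difference to B) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: Python's zlib (DEFLATE, an LZ77-style compressor with a 32 KiB window) stands in for the compression scheme, and the helper names compressed_size, bernoulli_sequence and extra_bits_per_char are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form of data (level 9)."""
    return len(zlib.compress(data, 9))


def bernoulli_sequence(p_one: float, length: int, seed: int) -> bytes:
    """Sequence of ASCII '0'/'1' characters with P('1') = p_one (hypothetical helper)."""
    rng = random.Random(seed)
    return bytes(ord("1") if rng.random() < p_one else ord("0") for _ in range(length))


def extra_bits_per_char(a: bytes, b: bytes) -> float:
    """Bits per character of b that the compressor spends on b after having seen a.

    Computed as the size difference between zip(a + b) and zip(a),
    converted from bytes to bits and divided by the length of b.
    """
    return 8.0 * (compressed_size(a + b) - compressed_size(a)) / len(b)


if __name__ == "__main__":
    # A is a long sample from one Bernoulli source; B varies in length and source.
    a = bernoulli_sequence(0.1, 20000, seed=1)
    for n in (200, 2000, 20000):
        b_same = bernoulli_sequence(0.1, n, seed=2)   # same statistics as A
        b_other = bernoulli_sequence(0.5, n, seed=3)  # different statistics
        print(n, extra_bits_per_char(a, b_same), extra_bits_per_char(a, b_other))
```

For a short B drawn from a different source than A, the cost per character is expected to be higher than for a B drawn from the same source. Because of zlib's finite window and per-stream overhead, the printed values are only rough estimates and are not meant to reproduce the scaling results of the paper.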

Bibliographic record

Source
Physica, D. Nonlinear phenomena | 2003, issue 2 | 16 pages
Authors
Puglisi A.; Benedetto D.; Caglioti E.; Loreto V.; Vulpiani A.
Original format: PDF
Text language: English
Chinese Library Classification: Physics
Keywords
Time sequence analysis; Compression algorithm; Characterization of complexity; Lempel-Ziv algorithm; Information; Complexity; Redundancy
