Engineering a Lightweight Suffix Array Construction Algorithm

机译：设计轻量级后缀数组构造算法

获取原文

获取原文并翻译 | 示例

页面导航

摘要
著录项
相似文献
相关主题

摘要

We consider the problem of computing the suffix array of a text T[l,n]. This problem consists in sorting the suffixes of T in lexicographic order. The suffix array (or PAT array) is a simple, easy to code, and elegant data structure used for several fundamental string matching problems involving both linguistic texts and biological data. Recently, the interest in this data structure has been revitalized by its use as a building block for three novel applications: (1) the Burrows-Wheeler compression algorithm, which is a provably and practically effective compression tool; (2) the construction of succinct and compressed indexes; the latter can store both the input text and its full-text index using roughly the same space used by traditional compressors for the text alone; and (3) algorithms for clustering and ranking the answers to user queries in web-search engines. In all these applications the construction of the suffix array is the computational bottleneck both in time and space. This motivated our interest in designing yet another suffix array construction algorithm which is fast and "lightweight" in the sense that it uses small space.

机译：我们考虑计算文本T [l，n]的后缀数组的问题。这个问题在于按字典顺序对T的后缀进行排序。后缀数组（或PAT数组）是一种简单，易于编码且优雅的数据结构，用于一些涉及语言文本和生物学数据的基本字符串匹配问题。最近，通过将其用作三个新颖应用程序的构建基块，使人们对该数据结构的兴趣得到了重新激发：（1）Burrows-Wheeler压缩算法，它是一种可证明且实用的压缩工具；（2）简洁和压缩索引的构造；后者可以使用传统压缩器仅用于文本的大致相同空间来存储输入文本及其全文索引；（3）用于在网络搜索引擎中对用户查询的答案进行聚类和排名的算法。在所有这些应用程序中，后缀数组的构造都是时间和空间上的计算瓶颈。这激发了我们对设计另一种后缀数组构造算法的兴趣，该算法在占用空间较小的意义上又快速又“轻巧”。

著录项

来源
《10th Annual European Symposium on Algorithms - ESA 2002, Sep 17-21, 2002, Rome, Italy》|2002年|p.698-710|共13页
会议地点 Rome(IT);Rome(IT);Rome(IT)
作者
Giovanni Manzini; Paolo Ferragina;
展开▼
作者单位

Dipartimento di Informatica, Universita del Piemonte Orientale 1-15100 Alessandria, Italy;

展开▼
会议组织
原文格式 PDF
正文语种 eng
中图分类自动化技术、计算机技术;
关键词

相似文献

外文文献
中文文献
专利

1. Engineering a Lightweight Suffix Array Construction Algorithm [J] . Giovanni Manzini, Paolo Ferragina Algorithmica . 2004,第1期

机译：设计轻量级后缀数组构造算法
2. A bioinformatician’s guide to the forefront of suffix array construction algorithms [J] . Anish Man Singh Shrestha, Martin C. Frith, Paul Horton Briefings in bioinformatics . 2014,第2期

机译：生物信息学家指南，介绍后缀数组构建算法的最前沿
3. Two Efficient Algorithms for Linear Time Suffix Array Construction [J] . Nong Ge, Zhang Sen, Chan Wai Hong Computers, IEEE Transactions on . 2011,第10期

机译：线性时间后缀数组构造的两种有效算法
4. Engineering a Lightweight Suffix Array Construction Algorithm [C] . Giovanni Manzini, Paolo Ferragina Annual European symposium on algorithms . 2002

机译：工程轻量级后缀阵列施工算法
5. Parallel external memory suffix array construction. [D] . Walia, Nancy. 2009

机译：并行外部存储器后缀数组构造。
6. A bioinformatician’s guide to the forefront of suffix array construction algorithms [O] . Anish Man Singh Shrestha, *, Martin C. Frith, -1

机译：生物信息学家指南介绍后缀数组构建算法的最前沿
7. Engineering a lightweight suffix array construction algorithm [O] . G. MANZINI, FERRAGINA P 2004

机译：设计轻量级后缀数组构造算法

Engineering a Lightweight Suffix Array Construction Algorithm

摘要

著录项

相似文献

相关主题

期刊订阅