Optimal Selection of Different Order N-Grams



Abstract

When N-grams of different orders are used separately, we subjectively introduce different assumptions about the mutual information among words, even though some of these assumptions cannot hold. We propose an approach for evaluating N-grams of different orders based on the concept of confidence in an assumption, or assumption probability. The assumption probabilities are estimated using a discriminative estimation criterion. To reduce the redundant information among N-grams of different orders, we propose a method that selects M-gram elements from the different-order N-grams with reference to the assumption probabilities. We conduct experiments on the task of converting Chinese pinyin to Chinese characters. First, we employ the Newton gradient method to estimate the assumption probabilities, and then we test the optimally selected language model. The experimental results show that the memory footprint of the language model can be markedly reduced with little loss of accuracy.
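The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: the assumption probabilities are treated as fixed per-order mixing weights for a unigram and a bigram model, and a bigram entry is kept only if its weighted estimate differs enough from the unigram estimate. The function names, the weights, the threshold, and the pruning rule are all hypothetical illustrations, not the authors' method, and the discriminative estimation of the weights is not shown.

```python
from collections import Counter


def train_counts(sentences):
    """Count unigrams, bigrams, and bigram histories over tokenized sentences."""
    unigrams, bigrams, contexts = Counter(), Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1  # times `prev` is followed by some word
    return unigrams, bigrams, contexts


def order_estimates(word, prev, unigrams, bigrams, contexts):
    """Return the maximum-likelihood unigram and bigram estimates for `word`."""
    p_uni = unigrams[word] / sum(unigrams.values())
    p_bi = bigrams[(prev, word)] / contexts[prev] if contexts[prev] else 0.0
    return p_uni, p_bi


def interpolated_prob(word, prev, counts, weights):
    """Mix the two orders; `weights` stand in for assumption probabilities."""
    p_uni, p_bi = order_estimates(word, prev, *counts)
    w_uni, w_bi = weights  # hypothetical values, assumed to sum to 1
    return w_uni * p_uni + w_bi * p_bi


def select_bigrams(counts, weights, threshold):
    """Keep bigrams whose weighted estimate departs from the unigram estimate.

    This pruning rule is an illustrative assumption, not the paper's criterion.
    """
    _, bigrams, _ = counts
    w_bi = weights[1]
    kept = []
    for prev, word in bigrams:
        p_uni, p_bi = order_estimates(word, prev, *counts)
        if w_bi * abs(p_bi - p_uni) >= threshold:
            kept.append((prev, word))
    return kept
```

As a usage example, `counts = train_counts([["a", "b"], ["a", "b"], ["a", "c"]])` followed by `select_bigrams(counts, (0.3, 0.7), 0.2)` retains only the bigram whose estimate departs most from its unigram estimate. Dropping the other entries shrinks the stored model, and `interpolated_prob` still scores the dropped pairs through the unigram term if the pruned table is passed in place of the full one.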


Bibliographic record

Source
International Conference on Artificial Intelligence IC-AI'2001 Vol.2, Jun 25-28, 2001, Las Vegas, Nevada, USA | 2001 | pp. 593-597 | 5 pages
Conference location: Las Vegas NV (US)
Authors
Gongjun Li; Na Dong
Author affiliation

Matsushita Research Development (China) Co. Ltd. No.4 Huixin East Street, Chaoyang District Beijing, 100029 P. R. China

Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology

