Journal: International Journal of Innovative Computing and Applications

An empirical study of statistical language models: n-gram language models vs. neural network language models



Abstract

Statistical language models are an important module in many successful applications such as speech recognition and machine translation, and n-gram models are essentially the state of the art. However, due to data sparsity, the modelled language cannot be completely represented by an n-gram language model: if new words appear at recognition or translation time, a smoothing method is needed to redistribute probability mass to unseen events. Recently, neural networks have been used to model language by projecting words onto a continuous space and performing the probability estimation in that space. In this experimental work, we compare the behaviour of the most popular smoothing methods for statistical n-gram language models with neural network language models in different situations and with different parameters. The language models are trained on two corpora of French and English texts. Good empirical results are obtained by the recurrent neural network language models.
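To illustrate the smoothing problem the abstract describes, the following is a minimal sketch of a bigram language model with add-one (Laplace) smoothing, one of the simplest of the smoothing methods compared in such studies. All function names and the toy corpus are illustrative, not from the paper.

```python
from collections import Counter

def train_bigram_counts(tokens):
    """Count unigrams and bigrams in a token sequence."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(w_prev, w, unigrams, bigrams, vocab_size):
    """P(w | w_prev) with add-one (Laplace) smoothing, so bigrams
    never seen in training still receive non-zero probability."""
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + vocab_size)

tokens = "the cat sat on the mat".split()
unigrams, bigrams = train_bigram_counts(tokens)
V = len(unigrams)  # vocabulary size of the toy corpus: 5

p_seen = bigram_prob("the", "cat", unigrams, bigrams, V)    # (1+1)/(2+5)
p_unseen = bigram_prob("the", "dog", unigrams, bigrams, V)  # (0+1)/(2+5)
```

Without the added pseudo-counts, `p_unseen` would be zero and any sentence containing an unseen bigram would get zero probability; smoothing trades a little probability mass from seen events to avoid this. Neural network language models sidestep the issue differently, by sharing statistical strength between similar words through their continuous representations.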

