Conference: International Conference on Electronics, Computer and Computation

Kazakh Language Open Vocabulary Language Model with Deep Neural Networks



Abstract

Natural language models are a crucial tool in computational linguistics. They are especially difficult to build for agglutinative languages, which require particular attention because words are formed by attaching sequences of morphemes, each of which can change the meaning of the word. For this type of language, a fixed and limited vocabulary can itself be restrictive. A character-based approach may help to overcome the problem; however, it raises the need to disambiguate a word according to its context. The present work aims to build a character-based language model for the Kazakh language using deep neural networks, namely a Long Short-Term Memory (LSTM) model. The language model in the present research is generative and aims to produce all possible correct words within the given context. A word can be treated as a unit generated character by character, so that any possible word type could be generated. To model the language correctly, it is necessary to use data that was originally written in Kazakh rather than translated from other sources. Therefore, the model will be trained on books written in Kazakh.
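The vocabulary restriction mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the word lists and the function names `word_oov_rate` and `char_oov_rate` are hypothetical, chosen only to show why a closed word vocabulary struggles with suffixed forms while a character inventory still covers them. It does not implement the LSTM model itself.

```python
# Illustrative sketch only: toy data, not the paper's corpus or method.

def word_oov_rate(train_words, test_words):
    """Fraction of test words that never appeared as whole words in training."""
    vocab = set(train_words)
    unseen = [w for w in test_words if w not in vocab]
    return len(unseen) / len(test_words)


def char_oov_rate(train_words, test_words):
    """Fraction of test characters that never appeared in the training text."""
    alphabet = {ch for w in train_words for ch in w}
    test_chars = [ch for w in test_words for ch in w]
    unseen = [ch for ch in test_chars if ch not in alphabet]
    return len(unseen) / len(test_chars)


# Hypothetical toy data: a few Kazakh word forms seen in training ...
TRAIN = ["кітап", "кітаптар", "қаламым", "үйде"]
# ... and new suffixed forms built from the same stems and suffixes.
TEST = ["кітаптарымда", "қаламдар"]

if __name__ == "__main__":
    print("word-level OOV rate:", word_oov_rate(TRAIN, TEST))
    print("char-level OOV rate:", char_oov_rate(TRAIN, TEST))
```

In this toy setting, every test word is unknown to a word-level vocabulary, while every character has already been seen, which is the property that lets a character-based generative model produce word forms absent from its training data.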
