Conference: International Conference on Electronics, Computer and Computation

Kazakh Language Open Vocabulary Language Model with Deep Neural Networks


Abstract

Natural language models are a crucial tool in computational linguistics. They are especially difficult to build for agglutinative languages, in which words are formed by attaching sequences of different morphemes, each of which can change the meaning of the word. For this type of language, a fixed and limited vocabulary can itself be a restriction. A character-based approach may help to overcome the problem, but it requires each word to be disambiguated according to its context. The present work aims to build a character-based language model for the Kazakh language using deep neural networks, namely a Long Short-Term Memory (LSTM) model. The language model in this research is generative and aims to produce all possible correct words within a given context. A word can be treated as a morpheme generated from characters, so that any possible word type can be generated. To model the language correctly, it is necessary to use data that was originally written in Kazakh and not translated from other sources. Therefore, the model will be trained on books written in Kazakh.
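The vocabulary restriction described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the sample text, the helper names, and the context length are assumptions chosen for illustration, and the LSTM itself is omitted. It shows only how an inflected form that is missing from a word-level vocabulary can still be encoded as characters and turned into next-character training pairs of the kind a character-based model would consume.

```python
# Illustrative sketch only: the sample text and all helper names are
# assumptions, not taken from the paper. No neural network is trained here.

def build_char_vocab(text):
    """Map each distinct character to an integer id (sorted for determinism)."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}

def encode(word, char_to_id):
    """Encode a word as a list of character ids (KeyError on unseen characters)."""
    return [char_to_id[ch] for ch in word]

def decode(ids, char_to_id):
    """Invert encode() by looking up each id in the reversed mapping."""
    id_to_char = {i: ch for ch, i in char_to_id.items()}
    return "".join(id_to_char[i] for i in ids)

def next_char_pairs(ids, context_len):
    """Build (context, target) pairs for next-character prediction."""
    return [(ids[i:i + context_len], ids[i + context_len])
            for i in range(len(ids) - context_len)]

# Tiny hypothetical training text: forms of "кітап" (book) plus "үйде" (at home).
train_text = "кітап кітаптар кітапта үйде"
word_vocab = set(train_text.split())
char_to_id = build_char_vocab(train_text)

# An inflected form that never occurs as a whole word in the training text.
unseen = "кітаптарда"

in_word_vocab = unseen in word_vocab      # word-level lookup fails
ids = encode(unseen, char_to_id)          # character-level encoding succeeds
pairs = next_char_pairs(ids, context_len=3)

print("in word vocabulary:", in_word_vocab)
print("character ids:", ids)
print("training pairs:", len(pairs))
```

A word-level model with a fixed vocabulary would have to map `unseen` to an unknown-word token, whereas the character-level encoding represents it exactly, because every one of its characters appears somewhere in the training text.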
