Conference: International Conference on Asian Language Processing

Automatic acquisition of morphological resources for Melanau language


Abstract

Computational morphological resources are a crucial component in supplying the morphological information needed to build a morphological analyser. Acquiring these resources manually involves two main components, preprocessing and morphology induction, which raise two issues from the perspective of under-resourced languages: i) the process is time-consuming and ii) managing the resources is prone to ambiguity. To overcome these issues, we propose a tool for the automatic acquisition of morphological resources that extends the manual approach. The proposed tool has three main modules: i) tokenisation, which tokenises raw text and generates a wordlist; ii) conversion, which converts a softcopy of morphological resources into the required formats; and iii) integration of segmentation tools, which integrates two established segmentation tools, Linguistica and Morfessor, to obtain morphological information from the generated wordlist. Two testing methods were used: component testing and integration testing. The results demonstrate the effectiveness of the proposed tool, which allows linguists to obtain their wordlists and segmented data easily. We believe the proposed tool will help other researchers construct computational morphological resources for under-resourced languages in an automated way.
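The tokenisation module described in the abstract (raw text in, wordlist out) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives neither the tokenisation rules nor the wordlist format, so the regular expression, the function names `build_wordlist` and `format_wordlist`, and the "count word" line layout are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed token rule (not from the paper): runs of letters, optionally
# joined by internal hyphens or apostrophes. Digits and punctuation are skipped.
TOKEN_PATTERN = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def build_wordlist(raw_text):
    """Lower-case the text, extract word tokens, and count each word type."""
    tokens = TOKEN_PATTERN.findall(raw_text.lower())
    return Counter(tokens)


def format_wordlist(counts):
    """Render one 'count word' line per word type, sorted alphabetically.

    The line layout is a hypothetical choice for this sketch.
    """
    return "\n".join(f"{n} {word}" for word, n in sorted(counts.items()))


if __name__ == "__main__":
    # Arbitrary placeholder text, not Melanau data.
    sample = "Sample text: a sample, re-sample 42 times!"
    print(format_wordlist(build_wordlist(sample)))
```

A wordlist produced this way would then be the input to the segmentation step. How the tool actually passes it to Linguistica and Morfessor is not described in the abstract.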
