Conference: International conference on computational linguistics

Morphological Analyzer for Affix Stacking Languages: A Case Study of Marathi

Abstract

In this paper we describe and evaluate a Finite State Machine (FSM) based Morphological Analyzer (MA) for Marathi, a highly inflectional language with agglutinative suffixes. Marathi belongs to the Indo-European family and is considerably influenced by Dravidian languages. Adroit handling of participial constructions and other derived forms (Krudantas and Taddhitas), in addition to inflected forms, is crucial to NLP and MT of Marathi. We first describe Marathi morphological phenomena, detailing the complexities of inflectional and derivational morphology, and then turn to the construction and working of the MA. The MA produces the root word and its features. A thorough evaluation against gold-standard data establishes the efficacy of this MA. To the best of our knowledge, this work is the first systematic and exhaustive study of the morphotactics of a suffix-stacking language, leading to a high-quality morphological analyzer. The system forms part of a Marathi-Hindi transfer-based machine translation system. The methodology delineated in the paper can be replicated for other languages that show suffix-stacking behaviour similar to that of Marathi.
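The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of finite-state analysis of stacked suffixes, not the authors' system. The romanized toy lexicon (`ghar`, `a`, `samor`, `ca`), the state names, and the feature labels are all illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Toy root lexicon (hypothetical, romanized; not from the paper).
ROOTS: Dict[str, str] = {"ghar": "noun"}

# Morphotactics as a small state machine:
# state -> list of (suffix, feature label, next state). All entries are assumptions.
TRANSITIONS: Dict[str, List[Tuple[str, str, str]]] = {
    "ROOT": [("a", "oblique", "OBL")],
    "OBL": [("samor", "postposition:in_front_of", "PP"),
            ("ca", "genitive", "PP")],
    "PP": [("ca", "genitive", "PP")],  # suffixes may keep stacking
}

# States in which a word is allowed to end in this toy grammar.
ACCEPTING = {"ROOT", "PP"}


def _walk(root: str, rest: str, state: str, feats: List[str],
          out: List[Tuple[str, List[str]]]) -> None:
    """Consume `rest` suffix by suffix, following the transitions."""
    if not rest:
        if state in ACCEPTING:
            out.append((root, feats))
        return
    for suffix, feat, nxt in TRANSITIONS.get(state, []):
        if rest.startswith(suffix):
            _walk(root, rest[len(suffix):], nxt, feats + [feat], out)


def analyze(word: str) -> List[Tuple[str, List[str]]]:
    """Return every (root, features) analysis the toy machine accepts."""
    out: List[Tuple[str, List[str]]] = []
    for root, pos in ROOTS.items():
        if word.startswith(root):
            _walk(root, word[len(root):], "ROOT", [pos], out)
    return out


if __name__ == "__main__":
    print(analyze("gharasamorca"))
```

Calling `analyze` on a stacked form returns the root together with one feature label per suffix consumed; a form that ends in a non-accepting state, or contains an unknown suffix, yields an empty list. A production analyzer would also need spelling-change rules and a far larger lexicon, which this sketch leaves out.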
