Conference: Southern African Universities Power Engineering Conference

Towards an unsupervised morphological segmenter for isiXhosa


Abstract

In this paper, branching entropy techniques and isiXhosa language heuristics are adapted to develop unsupervised morphological segmenters for isiXhosa. An overview of isiXhosa segmentation issues is given, followed by a discussion of previous work on automated segmentation, and on the segmentation of isiXhosa in particular. Two unsupervised isiXhosa segmenters are presented and compared with a random minimum baseline and with Morfessor-Baseline, a standard in unsupervised word segmentation. Morfessor-Baseline outperforms both isiXhosa segmenters, with 79.10% boundary identification accuracy. The performance of the IsiXhosa Branching Entropy Segmenter (XBES) varies with the segmentation mode used, reaching a maximum of 73.39%. The IsiXhosa Heuristic Maximum Likelihood Segmenter (XHMLS) achieves 72.42%. The study suggests that unsupervised morphological segmentation of isiXhosa is feasible, given further optimization of the current approaches.
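The abstract names branching entropy as the basis of XBES but does not describe the procedure. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below computes the entropy of the character distribution that follows each word prefix and places a boundary where that entropy exceeds a threshold. The function names, the threshold value, and the toy word list are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Toy word list, chosen only to illustrate the mechanics (an assumption,
# not data from the paper).
TOY_WORDS = ["ndiyahamba", "ndiyabona", "ndiyathanda", "ndiyafunda"]


def successor_counts(words):
    """Map each proper prefix to a Counter of the characters that follow it."""
    counts = defaultdict(Counter)
    for word in words:
        for i in range(1, len(word)):
            counts[word[:i]][word[i]] += 1
    return counts


def branching_entropy(counter):
    """Shannon entropy in bits of a successor-character distribution."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def segment(word, counts, threshold=1.0):
    """Split after every prefix whose successor entropy exceeds the threshold.

    The fixed threshold is a hypothetical choice; other boundary criteria
    (for example, a rise in entropy relative to the previous position)
    are equally plausible.
    """
    pieces, start = [], 0
    for i in range(1, len(word)):
        prefix_stats = counts.get(word[:i], Counter())
        if branching_entropy(prefix_stats) > threshold:
            pieces.append(word[start:i])
            start = i
    pieces.append(word[start:])
    return pieces


if __name__ == "__main__":
    counts = successor_counts(TOY_WORDS)
    print(segment("ndiyahamba", counts))
```

On this toy list the only high-entropy position is after the shared prefix, where four different characters follow, so the sketch finds a single split there. It says nothing about how XBES or its segmentation modes actually behave.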
