Conference: International Conference on Asian Language Processing

The analysis on mistaken segmentation of Tibetan words based on statistical method

Abstract

This paper uses the Tibetan word segmentation system IEA-TWordSeg to segment 1,271 sentences in a closed set and 1,000 sentences in an open set. The segmentation accuracy is 99.54% on the closed set and 92.41% on the open set. The authors classify the types of segmentation errors, explain their causes, and report the proportion of each error type. The aim is to provide guidance for those who intend to improve the accuracy of Tibetan word segmentation systems.
