Journal of Harbin Institute of Technology (English Edition) (《哈尔滨工业大学学报:英文版》)

Applying rough sets in word segmentation disambiguation based on maximum entropy model

         

Abstract

To address complicated feature extraction and long-distance dependencies in Word Segmentation Disambiguation (WSD), this paper proposes applying rough sets to WSD based on the Maximum Entropy model. First, rough set theory is used to extract complicated features and long-distance features, even from a noisy or inconsistent corpus. Second, these features are added to the Maximum Entropy model, so that feature weights are assigned according to the performance of the whole disambiguation model. Finally, a semantic lexicon is used to build class-based rough set features to overcome data sparseness. Experiments indicated that our method performed better than previous models, which had ranked first in WSD in the 863 Evaluation in 2003. The system ranked first and second, respectively, in the MSR and PKU open tests of the Second International Chinese Word Segmentation Bakeoff, held in 2005.
