A New Word Clustering Algorithm Based on Word Similarity

YUAN Lichi


Abstract

Category-based statistical language models are an important way to address data sparsity in statistical language modeling. This approach has two bottlenecks: 1) word clustering, where it is hard to find a method that performs well without a large amount of computation; 2) class-based methods always lose some predictive ability when adapting to text from different domains. To address these problems, a novel definition of word similarity based on mutual information is presented. Building on word similarity, a definition of word-set similarity is given and a bottom-up hierarchical clustering algorithm is proposed. Experimental results show that the word clustering algorithm based on word similarity outperforms the conventional greedy clustering method in both speed and performance; perplexity is reduced from 283 to 207.8.
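The abstract names three ingredients (a mutual-information-based word similarity, a word-set similarity, and bottom-up hierarchical clustering) but does not give their definitions. The following is only a minimal sketch of that general shape, under assumptions that are not from the paper: word similarity is taken to be the cosine of positive pointwise-mutual-information vectors over right-hand neighbours, word-set similarity is taken to be the average pairwise word similarity, and the names `pmi_vectors`, `word_similarity`, `set_similarity`, and `cluster` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_vectors(sentences):
    """Map each word to a sparse vector of positive PMI with its right neighbours."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    vectors = {w: {} for w in unigrams}
    for (w1, w2), count in bigrams.items():
        p_joint = count / n_bi
        p_indep = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        pmi = math.log(p_joint / p_indep)
        if pmi > 0:
            vectors[w1][w2] = pmi
    return vectors


def word_similarity(u, v):
    """Cosine of two sparse PMI vectors (an assumed stand-in, not the paper's measure)."""
    dot = sum(x * v[key] for key, x in u.items() if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def set_similarity(a, b, vectors):
    """Average pairwise word similarity between two word sets (assumed definition)."""
    total = sum(word_similarity(vectors[x], vectors[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def cluster(sentences, k):
    """Bottom-up clustering: start from singletons, merge the most similar pair until k remain."""
    vectors = pmi_vectors(sentences)
    clusters = [frozenset([w]) for w in sorted(vectors)]
    while len(clusters) > k:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: set_similarity(pair[0], pair[1], vectors),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Toy corpus (illustrative only, not data from the paper).
corpus = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["a", "cat", "eats"],
    ["a", "dog", "eats"],
]
clusters = cluster(corpus, k=4)
print(clusters)
```

On this toy corpus the determiners share identical right-hand contexts, as do the two nouns, so each pair is merged first. The exhaustive pair search in each round is simple but slow; it is used here for clarity and says nothing about the speed the paper reports.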

Bibliographic record

Source
《电子学报（英文版）》 (Chinese Journal of Electronics) | 2017, Issue 6 | pp. 1221-1226 | 6 pages
Author
YUAN Lichi
Author affiliation

School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China

Original format: PDF
Text language: English
