Enterprise knowledge bases are growing explosively, and most of their content is unstructured text, so systems often cannot answer enterprise users' queries quickly and accurately. To address this problem, a text mining technique based on the TFIDF algorithm with synonym substitution and adjacent merging is proposed. The technique reduces server load and lets service staff find relevant information in the knowledge base faster and more accurately. Finally, examples verify the effectiveness of the algorithm.
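
The abstract does not spell out how synonym substitution and adjacent merging are combined with TFIDF, so the following is only a minimal sketch under stated assumptions: synonyms are mapped to one canonical term through a hand-written table, adjacent token pairs from a given phrase set are merged into a single term, and weights use the common tf * log(N / df) form. The names `SYNONYMS`, `PHRASES`, `normalize`, `merge_adjacent`, and `tf_idf` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter

# Hypothetical synonym table (an assumption): variant -> canonical term.
SYNONYMS = {"pc": "computer", "laptop": "computer"}

# Hypothetical set of adjacent token pairs to merge into one term (an assumption).
PHRASES = {("network", "card")}


def normalize(tokens, synonyms):
    """Replace each token with its canonical synonym, if one is listed."""
    return [synonyms.get(t, t) for t in tokens]


def merge_adjacent(tokens, phrases):
    """Merge adjacent token pairs found in `phrases` into a single term."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in phrases:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


# Toy, made-up documents; real input would come from a tokenized knowledge base.
docs = [
    ["pc", "screen", "broken"],
    ["laptop", "battery"],
    ["network", "card", "error"],
]
prepared = [merge_adjacent(normalize(d, SYNONYMS), PHRASES) for d in docs]
weights = tf_idf(prepared)
```

In this sketch, "pc" and "laptop" collapse to the same term before weighting, so the two documents that mention them share a feature, and "network card" is scored as one term rather than two. Both steps shrink the vocabulary, which is one plausible reading of how the technique could reduce server load.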