Conference: IEEE International Conference on Bioinformatics and Biomedicine

Discriminative Application of String Similarity Methods to Chemical and Non-chemical Names for Biomedical Abbreviation Clustering

Abstract

Clustering terms by the string similarity between them is known to be an effective way to improve the quality of texts and dictionaries. However, in our observations, chemical names are difficult to cluster using string similarity measures such as the edit distance. To demonstrate this difficulty clearly, we compared the string similarities computed with the edit distance, the Monge-Elkan score, SoftTFIDF, and the bigram Dice coefficient for chemical names against those for other terms. The experimental results show that applying string similarity methods discriminatively to chemical and non-chemical names may be a simple but effective way to improve the performance of term clustering.
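As a rough illustration of two of the measures named above, the following minimal sketch computes a length-normalized edit-distance similarity and a character-bigram Dice coefficient. The function names, the normalization by the longer string, the use of bigram multisets, and the two example pairs are all assumptions made for this sketch; none of them is taken from the paper, whose exact formulations may differ. The Monge-Elkan score and SoftTFIDF are not shown.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic program (two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Edit distance mapped to [0, 1] (assumed normalization: longer length)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def bigrams(s: str) -> Counter:
    """Multiset of character bigrams of s."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def bigram_dice(a: str, b: str) -> float:
    """Dice coefficient over bigram multisets: 2 * |A & B| / (|A| + |B|)."""
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:  # both strings shorter than two characters
        return 1.0 if a == b else 0.0
    overlap = sum((ba & bb).values())
    return 2.0 * overlap / total


if __name__ == "__main__":
    # Hypothetical example pairs, chosen for this sketch only.
    pairs = [
        ("1-propanol", "2-propanol"),                       # two different compounds
        ("tumour necrosis factor", "tumor necrosis factor"),  # spelling variants
    ]
    for a, b in pairs:
        print(f"{a!r} vs {b!r}: "
              f"edit={edit_similarity(a, b):.3f}, dice={bigram_dice(a, b):.3f}")
```

Both pairs differ by a single character and so score as highly similar under either measure, although only the second pair names the same thing. This is one way such measures could mislead on chemical names; it is an illustration under the assumptions above, not a result reported in the paper.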