Method and apparatus for measuring the degree of polysemy in polysemous words
Abstract
A system and apparatus are disclosed for identifying polysemous terms and for measuring their degree of polysemy. A polysemy index provides a quantitative measure of how polysemous a word is. A list of words can be ranked by polysemy index, with the most polysemous words appearing at the top of the list. A polysemy evaluation process collects the set of terms that occur near a target term. The inter-term distances within that set are computed, and the resulting multi-dimensional distance space is reduced to two dimensions. The two-dimensional representation is converted into radial coordinates. Isotonic/antitonic regression techniques are then used to compute the degree to which the distribution deviates from unimodality. The amount of deviation is the polysemy index. A corpus can be preprocessed using the polysemy indices to identify words having clearly separated senses, allowing an information retrieval system to return a separate list of documents for each sense of a word. Self-organizing sense disambiguation techniques can use the polysemy indices to select canonical contexts for the various senses identified for a given word. Contexts are selected that contain terms falling in radial bins near each peak. Such contexts can then be used for subsequent training of a classifier.
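The final steps of the evaluation process, binning the radial coordinates and measuring deviation from unimodality, can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the context terms have already been reduced to two-dimensional points, uses only the angle of each point, treats the angular histogram as a linear rather than circular sequence, and takes the index to be the smallest squared error of any rise-then-fall fit. The function names (`pava`, `unimodal_deviation`, `angular_histogram`) and the bin count are hypothetical choices made for illustration.

```python
import math


def pava(values):
    """Least-squares non-decreasing (isotonic) fit, pool-adjacent-violators."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def sse(values, fit):
    """Sum of squared differences between the data and a fitted sequence."""
    return sum((v - f) ** 2 for v, f in zip(values, fit))


def unimodal_deviation(hist):
    """Smallest squared error of any rise-then-fall fit to the histogram.

    Assumption: this value stands in for the polysemy index. Zero means
    the histogram is already unimodal; larger values suggest extra peaks.
    """
    best = math.inf
    for peak in range(len(hist) + 1):
        left, right = hist[:peak], hist[peak:]
        rising = pava(left)                # isotonic fit before the peak
        falling = pava(right[::-1])[::-1]  # antitonic fit after the peak
        best = min(best, sse(left, rising) + sse(right, falling))
    return best


def angular_histogram(points, bins=12):
    """Count 2-D points per angular bin (bin count is an arbitrary choice)."""
    counts = [0] * bins
    for x, y in points:
        theta = math.atan2(y, x) % (2 * math.pi)
        index = min(int(theta / (2 * math.pi) * bins), bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    one_sense = [1, 3, 5, 3, 1]   # single peak
    two_senses = [5, 1, 5]        # two separated peaks
    print(unimodal_deviation(one_sense), unimodal_deviation(two_senses))
```

Trying every candidate peak position and fitting each side independently yields the exact least-squares unimodal fit for a linear sequence. A faithful treatment of radial data would also need to handle wrap-around at the ends of the histogram, which this sketch leaves out.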