One of the main interests in the analysis of large document collections is to discover domains of discourse that are still actively developing, growing in interest and relevance, at a given point in time, and to distinguish them from topics that are stagnating or declining. The present paper describes a terminologically inspired approach to this task. The inputs to the method are a corpus spanning several decades of research in computational linguistics and a set of single-word terms that occur frequently in that corpus. The diachronic development of these terms is modelled by means of term life cycle information, namely the two parameters relative frequency and productivity. In a second step, k-means clustering is used to identify groups of terms with similar development patterns. The paper describes a mathematical approach to modelling term productivity and discusses what kind of information can be obtained from this measure. The results of the clustering experiment are promising and motivate further research.
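The two-step pipeline outlined above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the term list, the function names and the seeding strategy are all assumptions made for illustration. Only relative frequency is computed, because the abstract does not give the productivity formula. A small hand-written Lloyd's k-means stands in for whatever clustering implementation the authors used.

```python
from collections import Counter


def relative_frequencies(slices, terms):
    """Return {term: [relative frequency of the term in each time slice]}."""
    series = {t: [] for t in terms}
    for tokens in slices:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty time slices
        for t in terms:
            series[t].append(counts[t] / total)
    return series


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means. The first k points seed the centroids,
    which is a toy choice and sensitive to the order of the input."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical toy corpus: one token list per time slice (e.g. per decade).
slices = [
    ["grammar"] * 4 + ["parsing"] * 3 + ["neural"] + ["the"] * 2,
    ["grammar"] * 2 + ["parsing"] * 2 + ["neural"] * 2 + ["embedding"] * 2 + ["the"] * 2,
    ["grammar"] + ["neural"] * 4 + ["embedding"] * 3 + ["the"] * 2,
]
terms = ["grammar", "neural", "parsing", "embedding"]  # hypothetical term set

series = relative_frequencies(slices, terms)
labels = kmeans([series[t] for t in terms], k=2)

for term, label in zip(terms, labels):
    print(term, series[term], "cluster", label)
```

On this toy data the declining terms and the rising terms fall into separate clusters. In a realistic setting each term's feature vector would also carry the productivity values, and the centroids would be seeded more robustly than by taking the first k terms.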