
LaNCoA: A Python toolkit for Language Networks Construction and Analysis

Abstract

In this paper we describe LaNCoA, a Language Networks Construction and Analysis toolkit implemented in Python. The toolkit provides procedures for constructing networks from text at the word level (co-occurrence networks, syntactic networks, shuffled networks) and at the subword level (syllable networks, grapheme networks). We also implement functions for analyzing language networks at the global and local levels. The toolkit is organized into several modules that support different aspects of language analysis: analysis of global network measures for different co-occurrence windows, comparison of networks built from original and shuffled texts, comparison of networks constructed at different language levels, and so on. Text manipulation methods such as corpus cleaning, lemmatization, and stopword removal are also implemented. For the basic network representation we use the functions and methods available in NetworkX. Language network analysis, however, has specific requirements that call for additional functions and methods, and this was the main motivation for this research.
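As an illustration of the word-level workflow described above, the following is a minimal sketch built directly on NetworkX. It is not LaNCoA's own API: the function names `cooccurrence_network`, `shuffled_tokens`, and `global_measures`, the `window` and `seed` parameters, the choice of a directed weighted graph, and the sample sentence are all assumptions made for this example. It builds co-occurrence networks for several window sizes from an original and a shuffled token sequence and reports a few global measures for each.

```python
import random

import networkx as nx


def cooccurrence_network(tokens, window=1):
    """Build a directed, weighted co-occurrence network (hypothetical helper).

    An edge u -> v is added whenever v occurs within `window` positions
    after u; the edge weight counts how often that happens.
    """
    graph = nx.DiGraph()
    graph.add_nodes_from(tokens)
    for i, source in enumerate(tokens):
        for target in tokens[i + 1 : i + 1 + window]:
            if source == target:
                continue  # skip self-loops from repeated words
            if graph.has_edge(source, target):
                graph[source][target]["weight"] += 1
            else:
                graph.add_edge(source, target, weight=1)
    return graph


def shuffled_tokens(tokens, seed=0):
    """Return a randomly permuted copy of the tokens (same vocabulary and counts)."""
    shuffled = list(tokens)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def global_measures(graph):
    """Compute a few global measures; clustering uses the undirected view."""
    return {
        "nodes": graph.number_of_nodes(),
        "edges": graph.number_of_edges(),
        "density": nx.density(graph),
        "average_clustering": nx.average_clustering(graph.to_undirected()),
    }


# Assumed toy input; a real corpus would first be cleaned and tokenized.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
for window in (1, 2, 3):
    original = cooccurrence_network(tokens, window)
    shuffled = cooccurrence_network(shuffled_tokens(tokens), window)
    print(window, global_measures(original), global_measures(shuffled))
```

Shuffling keeps the vocabulary and word frequencies unchanged while destroying word order, so differences between the two sets of measures can be attributed to the ordering of the text rather than to its word counts.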
