Journal: Procedia - Social and Behavioral Sciences

Analysis of Stylometric Variables in Long and Short Texts


Abstract

This paper presents experiments on authorship attribution. We approach the task through a stylometric analysis of several stylistic markers, tested on two Spanish corpora. The first corpus is composed of long texts written by professional authors, while the second consists of short texts written by students. Both corpora include different text genres. The objective of this study is therefore to analyze several stylometric variables and test their capacity as markers for authorship attribution when the corpora vary in size and text genre. We represent the texts as high-dimensional vectors and visualize the similarities between them using multidimensional scaling. We conclude that text length is a factor that affects the discriminatory capacity of the stylometric variables. We also find that certain variables are better than others at identifying specific authors and specific text genres.
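As a rough illustration of the vector representation mentioned in the abstract, the following minimal sketch turns each text into a vector of relative function-word frequencies and computes pairwise Euclidean distances. The marker list `FUNCTION_WORDS`, the helper names, and the sample sentences are all assumptions made for illustration; the abstract does not say which stylometric variables or which distance the authors used. A distance matrix like `D` is the kind of input a multidimensional scaling routine takes to produce a 2-D layout.

```python
import math
import re
from collections import Counter

# Hypothetical marker set: a few common Spanish function words.
# This is an assumption, not the variable set used in the paper.
FUNCTION_WORDS = ["de", "la", "que", "el", "en", "y", "a", "los", "se", "del"]


def stylometric_vector(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-záéíóúüñ]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(texts):
    """Pairwise distances between the stylometric vectors of the texts."""
    vectors = [stylometric_vector(t) for t in texts]
    return [[euclidean(u, v) for v in vectors] for u in vectors]


# Invented sample sentences, used only to exercise the functions.
texts = [
    "El autor de la novela dice que la vida en el campo es dura.",
    "La vida en el campo es dura, dice el autor de la novela.",
    "Se vende casa y se alquilan los pisos del centro a los estudiantes.",
]
D = distance_matrix(texts)
```

Relative frequencies are used here rather than raw counts so that long and short texts are at least on a comparable scale. Short texts still give noisier estimates, which is consistent with the abstract's conclusion that text length affects discriminatory capacity.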
