Author Identification on Noise Arabic Documents

Abstract

In this work, we address the problem of authorship attribution for Ancient Arabic Philosophers. To that end, we conducted several authorship attribution experiments on different noisy Arabic texts. A dedicated dataset, called “A4P” (Authorship Attribution for Ancient Arabic Philosophers), was constructed by extracting texts from the books of 5 Ancient Arabic Philosophers, where the genre and the topic are similar. Our approach employs two types of features, character N-grams and words, and several classifiers: Support Vector Machines, Multi Layer Perceptron, Linear Regression, Stamatatos distance and Manhattan distance. The results show that the failure limit and the classification performance depend on the features used, the classification technique and the level of noise. Overall, the performance of the proposed techniques is quite interesting, and the experiments show the effect of noise on authorship attribution.
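As an illustration of one feature and classifier pairing named in the abstract, character N-grams with Manhattan distance can be sketched as a nearest-profile attribution. This is a minimal sketch and not the authors' implementation: the abstract does not give the N-gram size, the normalization, or the decision rule, so the function names (`char_ngram_profile`, `manhattan`, `attribute`), the default `n=3`, the relative-frequency normalization and the toy strings below are all assumptions made for illustration.

```python
from collections import Counter


def char_ngram_profile(text, n=3):
    """Relative frequencies of overlapping character n-grams (assumed normalization)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    if total == 0:
        return {}  # text shorter than n: empty profile
    return {gram: count / total for gram, count in counts.items()}


def manhattan(p, q):
    """Manhattan (L1) distance between two sparse frequency profiles."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def attribute(unknown_text, known_texts, n=3):
    """Return the author whose reference profile is closest to the unknown text.

    known_texts maps a hypothetical author label to a reference text.
    """
    target = char_ngram_profile(unknown_text, n)
    return min(
        known_texts,
        key=lambda author: manhattan(target, char_ngram_profile(known_texts[author], n)),
    )


# Toy stand-ins for reference texts; real experiments would use Arabic book excerpts.
known = {"author_A": "abababababab", "author_B": "xyzxyzxyzxyz"}
predicted = attribute("ababab", known)
```

Under these assumptions, the effect of noise could be studied by corrupting the unknown text at increasing rates and recording when the predicted author stops matching the true one. The abstract does not describe the noise model, so that step is left out here.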
