IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Text mining for plagiarism detection: Multivariate pattern detection for recognition of text similarities



Abstract

In recent years, the problem of plagiarism has been intensified by the availability of information in digital form and the accessibility of electronic libraries through the Internet. As a result, plagiarism detection has become a big data analytics problem: the number of digital sources is enormous, and each new document must be compared with millions of existing documents. In this paper, a text mining methodology is proposed that can detect all common patterns between a document and the documents in a reference database. The technique is based on a pattern detection algorithm and a corresponding data structure that enables the algorithm to detect all common patterns. The methodology has been applied to a well-defined dataset, with very promising results in identifying difficult cases of plagiarism such as technical disguise.
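The abstract does not name the pattern detection algorithm or its data structure, so the following is only a minimal illustrative sketch of the general idea of finding patterns shared between one document and a reference collection. It swaps in a plain word n-gram (shingle) inverted index, which is an assumption and not the method proposed in the paper; the function names `tokenize`, `build_index`, and `common_patterns`, the n-gram length of 3, and the sample texts are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(references, n=3):
    """Map every word n-gram to the set of reference ids that contain it."""
    index = defaultdict(set)
    for ref_id, text in references.items():
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            index[tuple(tokens[i:i + n])].add(ref_id)
    return index


def common_patterns(document, index, n=3):
    """Return {ref_id: [shared n-grams, in document order]}."""
    tokens = tokenize(document)
    hits = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        # .get() avoids creating empty entries in the defaultdict index.
        for ref_id in index.get(gram, ()):
            hits[ref_id].append(" ".join(gram))
    return dict(hits)


# Hypothetical reference database and query document.
references = {
    "ref_a": "Plagiarism detection has been transformed into a big data problem.",
    "ref_b": "Suffix arrays support fast substring queries.",
}
index = build_index(references)
matches = common_patterns("Detection has been transformed by digital libraries.", index)
print(matches)
```

A fixed n-gram length only reports overlaps of at least `n` consecutive words, and exact token matching would miss character-level disguises; detecting all common patterns of any length, as the abstract describes, would need a richer structure than this sketch uses.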
