IEEE Transactions on Information Forensics and Security

Authorship Attribution for Social Media Forensics


Abstract

The veil of anonymity provided by smartphones with pre-paid SIM cards, public Wi-Fi hotspots, and distributed networks like Tor has drastically complicated the task of identifying users of social media during forensic investigations. In some cases, the text of a single posted message will be the only clue to an author's identity. How can we accurately predict who that author might be when the message may never exceed 140 characters on a service like Twitter? For the past 50 years, linguists, computer scientists, and scholars of the humanities have been jointly developing automated methods to identify authors based on the style of their writing. All authors possess peculiarities of habit that influence the form and content of their written works. These characteristics can often be quantified and measured using machine learning algorithms. In this paper, we provide a comprehensive review of the methods of authorship attribution that can be applied to the problem of social media forensics. Furthermore, we examine emerging supervised learning-based methods that are effective for small sample sizes, and provide step-by-step explanations for several scalable approaches as instructional case studies for newcomers to the field. We argue that there is a significant need in forensics for new authorship attribution algorithms that can exploit context, can process multi-modal data, and are tolerant to incomplete knowledge of the space of all possible authors at training time.
