Blog Fingerprinting: Identifying Anonymous Posts Written by an Author of Interest Using Word and Character Frequency Analysis

Abstract

Internet blogs are an easily accessible means of global communication. Monitoring blogs for criminal and terrorist activity is a serious challenge because of their anonymous nature and the sheer volume of data. The intelligence community is often faced with more information than it can process, so methods are needed for processing the massive amounts of data this medium presents without a significant increase in manpower. An automated tool capable of identifying posts written by an individual, given a sample of that individual's writing, would allow law enforcement and intelligence agencies to gather evidence that would otherwise be overlooked due to manpower and time constraints. This research focuses on identifying blog posts written by a particular author when we do not have a model of every potential author. Previous research either builds a distinct model for every possible author or limits itself to large documents. Neither approach is appropriate for blog posts: they tend to be short documents, and building a distinct model of each author is unreasonable when looking for one author among millions. We address this problem by combining sample posts by other authors to create a model of an 'average author.'
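
The abstract does not specify how the author model and the 'average author' model are compared, so the following is only a minimal sketch under stated assumptions: character trigram counts as the features, add-one smoothing, and a simple log-likelihood comparison between the two profiles. All function names (`char_ngrams`, `build_profile`, `log_likelihood`, `looks_like_target`) are hypothetical and are not taken from the report.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_profile(posts, n=3):
    """Pool the n-gram counts of several posts into one frequency profile."""
    counts = Counter()
    for post in posts:
        counts.update(char_ngrams(post, n))
    return counts


def log_likelihood(text, profile, vocab_size, n=3):
    """Add-one smoothed log-likelihood of a text under a profile (assumed scoring)."""
    total = sum(profile.values())
    score = 0.0
    for gram in char_ngrams(text, n):
        score += math.log((profile[gram] + 1) / (total + vocab_size))
    return score


def looks_like_target(post, target_profile, average_profile, n=3):
    """True if the post scores higher under the target author than the pooled 'average author'."""
    # Shared vocabulary size (+1 for unseen n-grams) keeps the two scores comparable.
    vocab_size = len(set(target_profile) | set(average_profile)) + 1
    target_score = log_likelihood(post, target_profile, vocab_size, n)
    average_score = log_likelihood(post, average_profile, vocab_size, n)
    return target_score > average_score
```

In this sketch, `target_profile` would be built from the known sample of the author of interest, and `average_profile` from posts pooled across many other authors, so no per-author model is needed for anyone except the target. The same structure would work with word counts in place of character trigrams.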
