Venue: Conference on Empirical Methods in Natural Language Processing

Geocoding Without Geotags: A Text-based Approach for reddit

Abstract

In this paper, we introduce the first geolocation inference approach for reddit, a social media platform where user pseudonymity has thus far made supervised demographic inference difficult to implement and validate. In particular, we design a text-based heuristic schema to generate ground truth location labels for reddit users in the absence of explicitly geotagged data. After evaluating the accuracy of our labeling procedure, we train and test several geolocation inference models across our reddit data set and three benchmark Twitter geolocation data sets. Ultimately, we show that geolocation models trained and applied on the same domain substantially outperform models attempting to transfer training data across domains, even more so on reddit where platform-specific interest-group metadata can be used to improve inferences.
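The abstract does not spell out the heuristic schema, so the following is only a minimal sketch of what text-based location labeling could look like, not the authors' procedure. It assumes a tiny hypothetical gazetteer (`GAZETTEER`), a single hypothetical self-report pattern ("I live in <place>"), and a hypothetical helper `label_user` that picks the place a user mentions most often.

```python
import re
from collections import Counter

# Hypothetical toy gazetteer; a realistic one would hold many more place names.
GAZETTEER = ["new york", "chicago", "boston", "seattle"]

# Build one alternation, longest names first so multi-word places match whole.
_places = "|".join(
    re.escape(p) for p in sorted(GAZETTEER, key=len, reverse=True)
)

# Hypothetical self-report pattern (an assumption, not the paper's schema).
SELF_REPORT = re.compile(
    rf"\bi(?: currently)? live in ({_places})\b", re.IGNORECASE
)


def label_user(comments, min_mentions=1):
    """Return the most frequently self-reported place, or None if unlabeled."""
    counts = Counter(
        match.group(1).lower()
        for text in comments
        for match in SELF_REPORT.finditer(text)
    )
    if not counts:
        return None
    place, n = counts.most_common(1)[0]
    return place if n >= min_mentions else None


if __name__ == "__main__":
    history = [
        "I live in Chicago and the winters are rough.",
        "My friend lives in Boston.",
        "Since I currently live in chicago, I take the train.",
    ]
    print(label_user(history))
```

Raising `min_mentions` trades coverage for precision: fewer users receive a label, but each label rests on repeated self-reports. Labels produced this way would still need a manual accuracy check, as the abstract notes for the authors' own procedure.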