Venue: Workshop on Natural Language Processing and Computational Social Sciences

Is Wikipedia succeeding in reducing gender bias? Assessing changes in gender bias in Wikipedia using word embeddings

Abstract

Large text corpora used for creating word embeddings (vectors which represent word meanings) often contain stereotypical gender biases. As a result, such unwanted biases will typically also be present in word embeddings derived from such corpora and in downstream applications in the field of natural language processing (NLP). To minimize the effect of gender bias in these settings, more insight is needed into where and how biases manifest themselves in the text corpora employed. This paper contributes by showing how gender bias in word embeddings from Wikipedia has developed over time. Quantifying the gender bias over time shows that art-related words have become more female biased. Family and science words have stereotypical biases towards female and male words, respectively. These biases seem to have decreased since 2006, but these changes are not more extreme than those seen in random sets of words. Career-related words are more strongly associated with male than with female words; this difference has only become smaller in recently written articles. These developments provide additional understanding of what can be done to make Wikipedia more gender neutral and of how important time of writing can be when considering biases in word embeddings trained from Wikipedia or from other text corpora.
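The abstract does not state which bias measure is used, so the following is only a minimal sketch of how gender bias in a category of words might be quantified and compared across two embedding snapshots. It assumes a cosine-based differential association (mean similarity to female attribute words minus mean similarity to male attribute words). The function names, the toy 2-d vectors, and the snapshot labels are all hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_association(word_vec, female_vecs, male_vecs):
    """Mean similarity to female attribute words minus mean similarity
    to male attribute words. Positive means female-leaning (assumed measure)."""
    f = np.mean([cosine(word_vec, a) for a in female_vecs])
    m = np.mean([cosine(word_vec, a) for a in male_vecs])
    return float(f - m)

def category_bias(target_vecs, female_vecs, male_vecs):
    """Average association over all target words in a category."""
    scores = [gender_association(t, female_vecs, male_vecs) for t in target_vecs]
    return float(np.mean(scores))

# Toy, hand-made vectors (hypothetical; real embeddings would be trained
# separately on Wikipedia text from each period).
female = [np.array([1.0, 0.0])]
male = [np.array([0.0, 1.0])]
art_early = [np.array([1.0, 1.0])]   # equidistant from both attribute words
art_late = [np.array([2.0, 1.0])]    # closer to the female attribute word

bias_early = category_bias(art_early, female, male)
bias_late = category_bias(art_late, female, male)
print(f"early: {bias_early:.3f}, late: {bias_late:.3f}")
```

In this toy setup the later snapshot scores higher, which would correspond to the category becoming more female biased. To judge whether such a shift is meaningful, the same computation could be repeated on randomly drawn word sets, in line with the abstract's comparison against random sets of words.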