
On Measuring Gender Bias in Translation of Gender-neutral Pronouns

Abstract

Ethical concerns about social bias have recently raised pressing issues in natural language processing. For gender-related topics in particular, the need for systems that reduce model bias has grown in areas such as image captioning, content recommendation, and automated employment. However, the detection and evaluation of gender bias in machine translation systems have not yet been thoroughly investigated, because the task is cross-lingual and challenging to define. In this paper, we propose a scheme for constructing a test set that evaluates gender bias in a machine translation system, using Korean, a language with gender-neutral pronouns. Three word/phrase sets are first constructed, each incorporating positive/negative expressions or occupations; all terms are gender-independent or at least not severely biased toward one side. Additional sentence lists are then constructed with respect to the formality of the pronouns and the politeness of the sentences. With the generated set of 4,236 sentences in total, we evaluate gender bias in conventional machine translation systems using the proposed measure, termed the translation gender bias index (TGBI). The corpus and the code for evaluation are available online.
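
The abstract names the TGBI but does not define it, so the following is only a minimal sketch of how such an index could be computed. It assumes, without confirmation from the text above, that each sentence set is scored as sqrt(p_w * p_m + p_n), where p_w, p_m, and p_n are the fractions of translations rendered with a female, male, or neutral subject, and that the index is the mean of those scores over all sentence sets. The pronoun lists and the names classify, set_score, and tgbi are hypothetical helpers introduced for illustration.

```python
import math
import re
from collections import Counter

# Assumed English pronoun lists used to label a translated sentence.
FEMALE = {"she", "her", "hers", "herself"}
MALE = {"he", "him", "his", "himself"}


def classify(translation: str) -> str:
    """Label a translated sentence as 'female', 'male', or 'neutral'."""
    tokens = set(re.findall(r"[a-z]+", translation.lower()))
    has_female = bool(tokens & FEMALE)
    has_male = bool(tokens & MALE)
    if has_female and not has_male:
        return "female"
    if has_male and not has_female:
        return "male"
    # No gendered pronoun, or both kinds present: treated as neutral here.
    return "neutral"


def set_score(translations: list[str]) -> float:
    """Score one sentence set as sqrt(p_w * p_m + p_n) (assumed formula)."""
    if not translations:
        raise ValueError("sentence set must not be empty")
    counts = Counter(classify(t) for t in translations)
    n = len(translations)
    p_w = counts["female"] / n
    p_m = counts["male"] / n
    p_n = counts["neutral"] / n
    return math.sqrt(p_w * p_m + p_n)


def tgbi(sentence_sets: list[list[str]]) -> float:
    """Average the per-set scores over all sentence sets (assumed aggregation)."""
    if not sentence_sets:
        raise ValueError("at least one sentence set is required")
    return sum(set_score(s) for s in sentence_sets) / len(sentence_sets)
```

Under this assumed formulation, a set translated entirely with one gender scores 0, a set translated entirely with neutral subjects scores 1, and an even female/male split scores 0.5, so higher values indicate less one-sided output.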
