Conference on Empirical Methods in Natural Language Processing

The Thieves on Sesame Street are Polyglots - Extracting Multilingual Models from Monolingual APIs

Abstract

Pre-training in natural language processing makes it easier for an adversary with only query access to a victim model to reconstruct a local copy of the victim by training with gibberish input data paired with the victim's labels for that data. We discover that this extraction process extends to local copies initialized from a pre-trained, multilingual model while the victim remains monolingual. The extracted model learns the task from the monolingual victim, but it generalizes far better than the victim to several other languages. This is done without ever showing the multilingual, extracted model a well-formed input in any of the languages for the target task. We also demonstrate that a few real examples can greatly improve performance, and we analyze how these results shed light on how such extraction methods succeed.
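The abstract outlines the extraction recipe at a high level: query the monolingual victim with gibberish inputs, record the labels it returns, and fine-tune a local copy initialized from a multilingual pre-trained model on those query-label pairs. The following is a minimal, illustrative sketch of that data flow, assuming HuggingFace Transformers and PyTorch; the model names (bert-base-uncased as the victim, bert-base-multilingual-cased as the copy), the query budget, and the hyperparameters are placeholders, not the paper's actual configuration.

```python
# Illustrative sketch only: in the paper's setting the victim is a deployed,
# task-fine-tuned API; here its classification head is untrained, so this
# code demonstrates the extraction data flow rather than reproducing results.
import random

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

VICTIM_NAME = "bert-base-uncased"              # monolingual victim (query access only)
STUDENT_NAME = "bert-base-multilingual-cased"  # multilingual initialization for the copy
NUM_LABELS = 2
NUM_QUERIES = 256                              # toy query budget
MAX_LEN = 32
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

victim_tok = AutoTokenizer.from_pretrained(VICTIM_NAME)
victim = AutoModelForSequenceClassification.from_pretrained(
    VICTIM_NAME, num_labels=NUM_LABELS).to(DEVICE).eval()

student_tok = AutoTokenizer.from_pretrained(STUDENT_NAME)
student = AutoModelForSequenceClassification.from_pretrained(
    STUDENT_NAME, num_labels=NUM_LABELS).to(DEVICE)

def gibberish_query(tokenizer, length=12):
    """Sample random wordpieces from the victim's vocabulary as a nonsense 'sentence'."""
    vocab = list(tokenizer.get_vocab().keys())
    pieces = random.choices(vocab, k=length)
    return tokenizer.convert_tokens_to_string(pieces)

# 1) Query the victim with gibberish and record its predicted labels.
queries, labels = [], []
with torch.no_grad():
    for _ in range(NUM_QUERIES):
        text = gibberish_query(victim_tok)
        enc = victim_tok(text, truncation=True, max_length=MAX_LEN,
                         return_tensors="pt").to(DEVICE)
        pred = victim(**enc).logits.argmax(dim=-1).item()
        queries.append(text)
        labels.append(pred)

# 2) Train the multilingual copy on (gibberish query, victim label) pairs.
enc = student_tok(queries, padding=True, truncation=True,
                  max_length=MAX_LEN, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)

student.train()
for epoch in range(3):
    for input_ids, attention_mask, y in loader:
        optimizer.zero_grad()
        out = student(input_ids=input_ids.to(DEVICE),
                      attention_mask=attention_mask.to(DEVICE),
                      labels=y.to(DEVICE))
        out.loss.backward()
        optimizer.step()

# Note that the copy never sees a well-formed sentence in any language; the paper's
# finding is that it nonetheless transfers the extracted task to languages the
# monolingual victim cannot handle.
```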