Conference on Empirical Methods in Natural Language Processing

The Thieves on Sesame Street are Polyglots - Extracting Multilingual Models from Monolingual APIs

Abstract

Pre-training in natural language processing makes it easier for an adversary with only query access to a victim model to reconstruct a local copy of the victim by training with gibberish input data paired with the victim's labels for that data. We discover that this extraction process extends to local copies initialized from a pre-trained, multilingual model while the victim remains monolingual. The extracted model learns the task from the monolingual victim, but it generalizes far better than the victim to several other languages. This is done without ever showing the multilingual, extracted model a well-formed input in any of the languages for the target task. We also demonstrate that a few real examples can greatly improve performance, and we analyze how these results shed light on how such extraction methods succeed.
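The abstract outlines the extraction recipe at a high level: query the monolingual victim with gibberish inputs, record the labels it returns, and fine-tune a local copy initialized from a multilingual pre-trained model on those query-label pairs. The following is a minimal, illustrative sketch of that data flow, assuming HuggingFace Transformers and PyTorch; the model names (bert-base-uncased as the victim, bert-base-multilingual-cased as the copy), the query budget, and the hyperparameters are placeholders, not the paper's actual configuration.

```python
# Illustrative sketch only: in the paper's setting the victim is a deployed,
# task-fine-tuned API; here its classification head is untrained, so this
# code demonstrates the extraction data flow rather than reproducing results.
import random

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

VICTIM_NAME = "bert-base-uncased"              # monolingual victim (query access only)
STUDENT_NAME = "bert-base-multilingual-cased"  # multilingual initialization for the copy
NUM_LABELS = 2
NUM_QUERIES = 256                              # toy query budget
MAX_LEN = 32
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

victim_tok = AutoTokenizer.from_pretrained(VICTIM_NAME)
victim = AutoModelForSequenceClassification.from_pretrained(
    VICTIM_NAME, num_labels=NUM_LABELS).to(DEVICE).eval()

student_tok = AutoTokenizer.from_pretrained(STUDENT_NAME)
student = AutoModelForSequenceClassification.from_pretrained(
    STUDENT_NAME, num_labels=NUM_LABELS).to(DEVICE)

def gibberish_query(tokenizer, length=12):
    """Sample random wordpieces from the victim's vocabulary as a nonsense 'sentence'."""
    vocab = list(tokenizer.get_vocab().keys())
    pieces = random.choices(vocab, k=length)
    return tokenizer.convert_tokens_to_string(pieces)

# 1) Query the victim with gibberish and record its predicted labels.
queries, labels = [], []
with torch.no_grad():
    for _ in range(NUM_QUERIES):
        text = gibberish_query(victim_tok)
        enc = victim_tok(text, truncation=True, max_length=MAX_LEN,
                         return_tensors="pt").to(DEVICE)
        pred = victim(**enc).logits.argmax(dim=-1).item()
        queries.append(text)
        labels.append(pred)

# 2) Train the multilingual copy on (gibberish query, victim label) pairs.
enc = student_tok(queries, padding=True, truncation=True,
                  max_length=MAX_LEN, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)

student.train()
for epoch in range(3):
    for input_ids, attention_mask, y in loader:
        optimizer.zero_grad()
        out = student(input_ids=input_ids.to(DEVICE),
                      attention_mask=attention_mask.to(DEVICE),
                      labels=y.to(DEVICE))
        out.loss.backward()
        optimizer.step()

# Note that the copy never sees a well-formed sentence in any language; the paper's
# finding is that it nonetheless transfers the extracted task to languages the
# monolingual victim cannot handle.
```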