Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

Transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors

Abstract

Transformer networks have revolutionized NLP representation learning since their introduction. Although great effort has been made to explain the representations inside transformers, it is widely recognized that our understanding is still insufficient. One important reason is the lack of visualization tools suited to detailed analysis. In this paper, we propose using dictionary learning to open up these 'black boxes', expressing contextualized embeddings as linear superpositions of transformer factors. Through visualization, we demonstrate the hierarchical semantic structures captured by the transformer factors, e.g., word-level polysemy disambiguation, sentence-level pattern formation, and long-range dependency. While some of these patterns confirm conventional prior linguistic knowledge, the rest are relatively unexpected and may provide new insights. We hope this visualization tool brings further knowledge and a better understanding of how transformer networks work.
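To make the pipeline the abstract describes concrete, here is a minimal sketch: collect contextualized token embeddings from one transformer layer, then learn a sparse dictionary whose atoms play the role of transformer factors, so that each embedding is approximated as a sparse linear superposition of those atoms. The model name (bert-base-uncased), layer index, dictionary size, and sparsity penalty below are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: contextualized embeddings as a sparse linear superposition of
# dictionary atoms ("transformer factors"). Hyperparameters are illustrative.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.decomposition import MiniBatchDictionaryLearning

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

sentences = [
    "The bank raised interest rates.",
    "They walked along the river bank.",  # same word, different sense
]

# Collect token embeddings from one middle layer (layer 8 is an arbitrary choice).
embeddings = []
with torch.no_grad():
    for s in sentences:
        inputs = tokenizer(s, return_tensors="pt")
        hidden = model(**inputs).hidden_states[8]   # (1, seq_len, 768)
        embeddings.append(hidden[0].numpy())        # (seq_len, 768)
X = np.concatenate(embeddings, axis=0)              # (total_tokens, 768)

# Learn a small dictionary for illustration (the paper's dictionary would be
# far larger/overcomplete). Non-negative sparse codes require the 'cd' solver.
dl = MiniBatchDictionaryLearning(
    n_components=32,
    alpha=1.0,                      # sparsity penalty
    fit_algorithm="cd",
    transform_algorithm="lasso_cd",
    positive_code=True,
    random_state=0,
)
codes = dl.fit_transform(X)   # (total_tokens, 32): factor activations per token
factors = dl.components_      # (32, 768): the learned "transformer factors"
print(codes.shape, factors.shape)
```

Inspecting which tokens activate a given factor most strongly (the largest entries in one column of `codes`) is then the visualization step: each factor can be read off as a word-level, sentence-level, or long-range pattern, as the abstract describes.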
