IEEE International Conference on Software Analysis, Evolution and Reengineering

MulCode: A Multi-task Learning Approach for Source Code Understanding



Abstract

Recent years have witnessed a significant rise in Deep Learning (DL) techniques applied to source code. Researchers exploit DL for a multitude of tasks and achieve impressive results. However, most tasks are explored separately, resulting in a lack of generalization of the solutions. In this work, we propose MulCode, a multi-task learning approach for source code understanding that learns a unified representation space across tasks, using the pre-trained BERT model for the token sequence and the Tree-LSTM model for abstract syntax trees. Furthermore, we integrate the two source code views into a hybrid representation via an attention mechanism and set learnable uncertainty parameters to adjust the relationships among tasks. We train and evaluate MulCode on three downstream tasks: comment classification, author attribution, and duplicate function detection. In all three tasks, MulCode outperforms the state-of-the-art techniques. Moreover, experiments on three unseen tasks demonstrate the generalization ability of MulCode compared with state-of-the-art embedding methods.
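The "learnable uncertainty parameters" mentioned in the abstract are commonly realized as homoscedastic-uncertainty task weighting, where each task's loss is scaled by a learned log-variance term. The sketch below illustrates that weighting scheme in plain Python; the function name and the numeric loss values are illustrative assumptions, not taken from the paper.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with learnable uncertainty weights.

    Each task i contributes exp(-s_i) * L_i + s_i, where s_i is a
    learnable log-variance parameter: a larger s_i down-weights a
    noisy or hard task while the +s_i term prevents s_i from growing
    without bound. (Illustrative sketch, not the paper's exact code.)
    """
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += math.exp(-s) * loss + s
    return total

# Hypothetical losses for the three downstream tasks (comment
# classification, author attribution, duplicate function detection);
# in training, log_vars would be optimized jointly with the model.
losses = [0.8, 1.5, 0.4]
log_vars = [0.0, 0.5, -0.3]
print(uncertainty_weighted_loss(losses, log_vars))
```

With all log-variances fixed at zero the combined loss reduces to a plain sum, so the scheme strictly generalizes uniform task weighting.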


