IEEE/ACM International Conference on Mining Software Repositories

Import2vec: Learning Embeddings for Software Libraries

Abstract

We consider the problem of developing suitable learned representations (embeddings) for library packages that capture semantic similarity among libraries. Such representations are known to improve the performance of downstream learning tasks (e.g., classification) and of applications such as contextual search and analogical reasoning. We apply word embedding techniques from natural language processing (NLP) to train embeddings for library packages ("library vectors"). Library vectors represent libraries by their context of use, as determined by the import statements present in source code, so that libraries used in similar contexts receive similar vectors. Experimental results obtained from training such embeddings on three large open source software corpora reveal that library vectors capture semantically meaningful relationships among software libraries, such as the relationship between frameworks and their plug-ins, and libraries commonly used together within ecosystems such as big data infrastructure projects (in Java), front-end and back-end web development frameworks (in JavaScript), and data science toolkits (in Python).
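To make the idea of "context of use as determined by import statements" concrete, the following is a minimal sketch of how co-import pairs could be collected from Python source files. It is not the authors' pipeline: the function names `imported_libraries` and `training_pairs`, the choice to truncate imports to their top-level package, and the decision to treat every other library imported in the same file as context are all assumptions made for illustration. The resulting (target, context) pairs are the kind of input a skip-gram style word embedding model would typically consume.

```python
import ast
from itertools import permutations


def imported_libraries(source: str) -> set:
    """Return the top-level package names imported by a Python source string.

    Assumption: a library is identified by its top-level package name, and
    relative imports (e.g. `from . import x`) are ignored.
    """
    libs = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                libs.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            libs.add(node.module.split(".")[0])
    return libs


def training_pairs(sources):
    """Build (target, context) pairs from libraries imported in the same file.

    Assumption: every other library imported in the same file counts as
    context for the target library.
    """
    pairs = []
    for src in sources:
        libs = sorted(imported_libraries(src))
        pairs.extend(permutations(libs, 2))
    return pairs


# Two hypothetical source files, given inline as strings.
files = [
    "import numpy as np\nimport pandas as pd\n",
    "import numpy\nfrom sklearn.linear_model import LogisticRegression\n",
]
pairs = training_pairs(files)
```

Under these assumptions, `numpy` appears in the context of both `pandas` and `sklearn`; a model trained on many such pairs would place libraries that share contexts near each other in the embedding space.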