Conference: IEEE International Conference on Semantic Computing

Automatic Title Generation for Text with Pre-trained Transformer Language Model



Abstract

In this paper, we propose a novel approach to Automatic Title Generation for a given text using the pre-trained Transformer language model GPT-2. The approach generates a pool of candidate titles and selects an appropriate title from among them, which is then refined, or de-noised, to obtain the final title. It consists of a pipeline of three modules, namely Generation, Selection and Refinement, followed by a Scoring function. The Generation and Refinement modules are based on GPT-2, while the Selection module takes a heuristic-based approach. The model is able to generate accurate titles despite having a smaller corpus of relevant training data, because its natural language generation capabilities come from pre-training and the model primarily has to learn task- and corpus-specific nuances. Additionally, the Selection and Refinement modules ensure that the titles are representative of the given text and are semantically and syntactically accurate. We train our model on research paper abstracts from arXiv and evaluate it on three different test sets. Our pipeline shows promising results when evaluated with the ROUGE and BLEU metrics against the test sets. In addition, we perform human evaluation to validate the results generated by our proposed approach. Note: The title of this paper was generated automatically by our proposed algorithm from the abstract.
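The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`tokenize`, `overlap_score`, `generate_title`) are hypothetical, the generation and refinement steps are passed in as plain callables standing in for the GPT-2 based modules, and the selection heuristic (share of title words that also occur in the text) is an assumption, since the abstract does not say which heuristic is used. The final Scoring function is omitted because the abstract gives no detail about it.

```python
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?()[]\"'") for w in text.lower().split())
    return [t for t in tokens if t]


def overlap_score(title: str, text: str) -> float:
    """Assumed selection heuristic: fraction of distinct title tokens found in the text."""
    title_tokens = set(tokenize(title))
    if not title_tokens:
        return 0.0
    text_tokens = set(tokenize(text))
    return len(title_tokens & text_tokens) / len(title_tokens)


def generate_title(
    text: str,
    generate: Callable[[str], List[str]],
    refine: Callable[[str, str], str],
) -> str:
    """Run Generation -> Selection -> Refinement on a single input text."""
    # Generation: produce a pool of candidate titles (a GPT-2 model in the paper).
    candidates = generate(text)
    if not candidates:
        raise ValueError("generator returned no candidates")
    # Selection: keep the candidate that best matches the text under the heuristic.
    selected = max(candidates, key=lambda c: overlap_score(c, text))
    # Refinement: clean up ("de-noise") the selected title.
    return refine(selected, text)


if __name__ == "__main__":
    # Stub modules for demonstration only; a real system would call a language model.
    def stub_generate(text: str) -> List[str]:
        return ["Cooking pasta at home", "Title generation with a language model"]

    def stub_refine(title: str, text: str) -> str:
        return title.strip().rstrip(".")

    abstract = "We study title generation for text with a pre-trained language model."
    print(generate_title(abstract, stub_generate, stub_refine))
```

Passing the modules in as callables keeps the sketch independent of any particular model library; a fine-tuned language model could be wrapped behind the same two signatures.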
