Journal: Cybersecurity

An end-to-end text spotter with text relation networks



Abstract

Automatically reading text in images has become an attractive research topic in computer vision. In particular, end-to-end spotting of scene text has attracted significant research attention, and relatively high accuracy has been achieved on several datasets. However, most existing works overlook the semantic connections between scene text instances and have limitations in situations such as occlusion, blurring, and unseen characters, which cause some semantic information in the text regions to be lost. Text instances within a scene image are generally related to one another; from the perspective of cognitive psychology, humans often combine nearby easy-to-recognize text to infer text that is hard to identify. In this paper, we propose a novel graph-based method for enhancing intermediate semantic features, called Text Relation Networks. Specifically, we model the co-occurrence relationships of scene texts as a graph. The nodes of the graph represent the text instances in a scene image, and the corresponding semantic features serve as the node representations. The relative positions between text instances determine the weights of the edges in the graph. A convolution operation is then performed on the graph to aggregate semantic information and enhance the intermediate features corresponding to the text instances. We evaluate the proposed method through comprehensive experiments on several mainstream benchmarks and obtain highly competitive results. For example, on SCUT-CTW1500, our method surpasses the previous best-performing works by 2.1% on the word spotting task.
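The aggregation step described above (nodes = text instances, edge weights from relative positions, one graph convolution to enhance node features) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Gaussian distance-based edge weighting, the row normalization, and the single randomly initialized projection matrix are all assumptions made for the sketch.

```python
import numpy as np

def relation_graph_conv(features, centers, sigma=1.0, seed=0):
    """One graph-convolution step over text-instance features.

    features: (N, D) array, one semantic feature vector per text instance (node).
    centers:  (N, 2) array of instance box centers, used for relative positions.
    Returns an enhanced (N, D) feature array.
    """
    n, d = features.shape
    # Pairwise relative positions between instance centers.
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Edge weights: closer instances influence each other more (assumed Gaussian).
    adj = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    # Row-normalize so each node aggregates a weighted mean over its neighbors.
    adj /= adj.sum(axis=1, keepdims=True)
    # Learnable projection; randomly initialized here purely for illustration.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((d, d)) / np.sqrt(d)
    # Aggregate neighbor semantics, project, and apply a ReLU nonlinearity.
    return np.maximum(adj @ features @ w, 0.0)
```

In a real spotter these enhanced features would replace the per-instance intermediate features before the recognition head, letting easy-to-read neighbors inform occluded or blurred instances.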


