Journal: Cybersecurity

An end-to-end text spotter with text relation networks



Abstract

Automatically reading text in images has become an attractive research topic in computer vision. In particular, end-to-end spotting of scene text has attracted significant research attention, and relatively high accuracy has been achieved on several datasets. However, most existing works overlook the semantic connections between scene text instances and have limitations in situations such as occlusion, blurring, and unseen characters, which cause some semantic information in the text regions to be lost. Text instances within a scene image are generally related to one another; from the perspective of cognitive psychology, humans often combine nearby easy-to-recognize text to infer text that is hard to identify. In this paper, we propose a novel graph-based method for enhancing intermediate semantic features, called Text Relation Networks. Specifically, we model the co-occurrence relationships of scene texts as a graph. The nodes of the graph represent the text instances in a scene image, and the corresponding semantic features serve as the node representations. The relative positions between text instances determine the weights of the edges in the graph. A convolution operation is then performed on the graph to aggregate semantic information and enhance the intermediate features corresponding to the text instances. We evaluate the proposed method through comprehensive experiments on several mainstream benchmarks and obtain highly competitive results. For example, on SCUT-CTW1500, our method surpasses the previous best-performing works by 2.1% on the word spotting task.
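The aggregation step described above (nodes = text instances, edge weights from relative positions, one graph convolution to enhance node features) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Gaussian distance-based edge weighting, the row normalization, and the single randomly initialized projection matrix are all assumptions made for the sketch.

```python
import numpy as np

def relation_graph_conv(features, centers, sigma=1.0, seed=0):
    """One graph-convolution step over text-instance features.

    features: (N, D) array, one semantic feature vector per text instance (node).
    centers:  (N, 2) array of instance box centers, used for relative positions.
    Returns an enhanced (N, D) feature array.
    """
    n, d = features.shape
    # Pairwise relative positions between instance centers.
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Edge weights: closer instances influence each other more (assumed Gaussian).
    adj = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    # Row-normalize so each node aggregates a weighted mean over its neighbors.
    adj /= adj.sum(axis=1, keepdims=True)
    # Learnable projection; randomly initialized here purely for illustration.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((d, d)) / np.sqrt(d)
    # Aggregate neighbor semantics, project, and apply a ReLU nonlinearity.
    return np.maximum(adj @ features @ w, 0.0)
```

In a real spotter these enhanced features would replace the per-instance intermediate features before the recognition head, letting easy-to-read neighbors inform occluded or blurred instances.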


