Multimedia Tools and Applications

Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks

Abstract

We present a method for separating handwritten and machine-printed components that are mixed and overlapping in documents. Many conventional methods address this problem by extracting connected components (CCs) and classifying each CC into one of the two classes. These methods assume that the two types of components do not overlap, whereas we focus on the more challenging and realistic case in which they often do. To this end, we propose a method that performs pixel-level classification with a convolutional neural network. Unlike conventional neural network methods, our method works end to end and requires no preprocessing steps (e.g., foreground extraction or handcrafted feature extraction). To train the network, we develop a cross-entropy-based loss function that alleviates the class imbalance problem. As for training data, although some datasets of mixed printed characters and handwritten scripts exist, most of them contain no overlapping cases and provide no pixel-level annotations. We therefore also propose a data synthesis method that generates realistic pixel-level training samples containing many overlaps between printed and handwritten components. Experimental results on synthetic and real images show the effectiveness of the proposed method. Although the network was trained only on synthetic images, it also improves the OCR rate on real documents: the OCR rate for machine-printed text increases from 0.8087 to 0.9442 when the overlapping handwritten scribbles are removed by our method.
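
The abstract names two ingredients without giving their details: synthesizing overlapped samples with pixel-level labels, and a cross-entropy-based loss that counters class imbalance. The sketch below illustrates both ideas in NumPy under stated assumptions and is not the authors' formulation. The three-class labeling, the darker-pixel compositing rule, the ink threshold, the choice to label overlapping pixels as handwritten, and the inverse-frequency weighting are all assumptions, and the function names `synthesize_sample` and `weighted_cross_entropy` are hypothetical.

```python
import numpy as np


def synthesize_sample(printed, handwritten, threshold=128):
    """Overlay two grayscale images (0 = ink, 255 = paper) and label each pixel.

    Labels (assumed scheme): 0 = background, 1 = machine-printed, 2 = handwritten.
    Pixels where both inks overlap are labeled handwritten (an assumption).
    """
    composite = np.minimum(printed, handwritten)  # the darker pixel wins
    labels = np.zeros(printed.shape, dtype=np.int64)
    labels[printed < threshold] = 1
    labels[handwritten < threshold] = 2  # overrides printed where they overlap
    return composite, labels


def weighted_cross_entropy(logits, labels, num_classes=3):
    """Pixel-wise cross-entropy with inverse-frequency class weights.

    logits: (H, W, C) float array of per-pixel class scores.
    labels: (H, W) integer array of ground-truth classes.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    # Rare classes (ink) get larger weights than the dominant background.
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    weights = np.where(
        counts > 0, labels.size / (num_classes * np.maximum(counts, 1.0)), 0.0
    )

    # Log-probability of the true class at each pixel, weighted by its class.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    pixel_weights = weights[labels]
    return float(-(pixel_weights * picked).sum() / pixel_weights.sum())
```

With this weighting, every class present in the image contributes equally to the loss regardless of how many pixels it covers, so the background cannot dominate training. Other reweighting schemes would serve the same purpose.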