
Segmentation for MRC compression


Abstract

Mixed Raster Content (MRC) is a standard for efficient document compression that can dramatically improve the compression/quality tradeoff compared to traditional lossy image compression algorithms. The key to MRC's performance is the separation of the document into foreground and background layers, with the separation represented as a binary mask. Typically, the foreground layer contains text colors, the background layer contains images and graphics, and the binary mask layer represents the fine detail of text fonts. The resulting quality and compression ratio of an MRC document encoder are highly dependent on the segmentation algorithm used to compute the binary mask. In this paper, we propose a novel segmentation method based on the MRC standard (ITU-T T.44). The algorithm consists of two components: Cost Optimized Segmentation (COS) and Connected Component Classification (CCC). The COS algorithm is a blockwise segmentation algorithm formulated in a global cost optimization framework, while CCC is based on feature vector classification of connected components. In the experimental results, we show that the new algorithm achieves the same text detection accuracy as state-of-the-art commercial MRC products, but with fewer false detections of non-text features. This results in high-quality MRC-encoded documents with fewer non-text artifacts and a lower bit rate.
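The three-layer model described in the abstract can be illustrated with a small sketch. It is only an illustration of the MRC imaging model (the mask selects, per pixel, between a foreground and a background layer) together with a naive per-block threshold used as a stand-in segmenter. It is not the paper's COS or CCC algorithm; the function names `blockwise_mask` and `mrc_compose`, the block size, and the `min_contrast` parameter are all assumptions made for this example.

```python
from typing import List

Gray = List[List[int]]  # grayscale image as rows of values 0..255 (0 = black)


def blockwise_mask(img: Gray, block: int = 4, min_contrast: int = 64) -> Gray:
    """Naive per-block threshold; 1 marks dark (text-like) pixels.

    Hypothetical stand-in segmenter, NOT the paper's COS algorithm.
    """
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            vals = [img[y][x] for y in ys for x in xs]
            lo, hi = min(vals), max(vals)
            if hi - lo < min_contrast:
                continue  # flat block: leave every pixel in the background
            thresh = (lo + hi) / 2
            for y in ys:
                for x in xs:
                    if img[y][x] < thresh:
                        mask[y][x] = 1
    return mask


def mrc_compose(mask: Gray, fg: Gray, bg: Gray) -> Gray:
    """MRC imaging model: take the foreground where mask is 1, else background."""
    return [
        [f if m else b for m, f, b in zip(mrow, frow, brow)]
        for mrow, frow, brow in zip(mask, fg, bg)
    ]


# Toy 4x8 page: a flat light block on the left, a dark stroke on the right.
img = [
    [200] * 8,
    [200, 200, 200, 200, 200, 10, 10, 200],
    [200, 200, 200, 200, 200, 10, 10, 200],
    [200] * 8,
]
mask = blockwise_mask(img, block=4)
fg = [[10] * 8 for _ in range(4)]   # foreground layer: text color only
bg = [[200] * 8 for _ in range(4)]  # background layer: smooth page content
reconstructed = mrc_compose(mask, fg, bg)
```

In this toy case the foreground and background layers are constant, so they would compress very well, while the mask alone carries the stroke shape. A mask that wrongly marks non-text pixels forces detail into the wrong layer, which is why the abstract ties encoder quality and bit rate to the segmentation step.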
