Conference on Digital Publishing

A hybrid intelligence approach to artifact recognition in digital publishing

Abstract

The system presented here integrates rule-based and case-based reasoning for artifact recognition in digital publishing. In Variable Data Printing (VDP), human proofing can be prohibitively expensive, since a single job may contain millions of distinct instances. These instances may contain two types of artifacts: 1) evident defects, such as text overflow or overlapping elements, and 2) style-dependent artifacts, subtle defects that appear as inconsistencies with the original job design. We designed a Knowledge-Based Artifact Recognition tool for document segmentation, layout understanding, artifact detection, and document design quality assessment. Document evaluation is anchored to one instance of the VDP job that a human expert has proofed; the remaining instances are evaluated against it. The rule-based component applies fundamental rules of document design to document segmentation and layout understanding. Ambiguities in the design principles that the rule-based system does not cover are analyzed by case-based reasoning using the Nearest Neighbor Algorithm, in which features from previous jobs are used to detect artifacts and inconsistencies within the document layout. We used a subset of XSL-FO and assembled a set of 44 document samples. The system detected all the job layout changes and obtained an overall average accuracy of 84.56%, with the highest accuracy, 92.82%, for overlapping and the lowest, 66.7%, for lack of white space.
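The hybrid idea described in the abstract can be illustrated with a minimal Python sketch: a hard rule catches an evident defect (overlapping boxes), and anything the rule does not decide is passed to a 1-nearest-neighbor lookup over previously labeled cases. This is not the authors' implementation. The `Box` type, the gap-based feature vector, the page size, the case base, and the label strings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Box:
    """Axis-aligned layout box (hypothetical stand-in for a segmented region)."""
    x: float
    y: float
    w: float
    h: float


def boxes_overlap(a: Box, b: Box) -> bool:
    """Rule-based check: True when the two boxes share interior area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def gap_features(a: Box, b: Box, page_w: float, page_h: float) -> tuple:
    """Horizontal and vertical gap between two boxes, normalized by page size."""
    gap_x = max(b.x - (a.x + a.w), a.x - (b.x + b.w), 0.0)
    gap_y = max(b.y - (a.y + a.h), a.y - (b.y + b.h), 0.0)
    return (gap_x / page_w, gap_y / page_h)


def nearest_label(features: tuple, cases: list) -> str:
    """Case-based step: label of the stored case closest in Euclidean distance."""
    def dist(u, v):
        return sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))
    return min(cases, key=lambda case: dist(features, case[0]))[1]


def classify_pair(a: Box, b: Box, cases: list,
                  page_w: float = 600.0, page_h: float = 800.0) -> str:
    """Apply the rule first; fall back to the nearest stored case otherwise."""
    if boxes_overlap(a, b):
        return "overlap"
    return nearest_label(gap_features(a, b, page_w, page_h), cases)


# Hypothetical case base: (feature vector, label) pairs from earlier jobs.
CASES = [
    ((0.0, 0.003), "lack-of-white-space"),
    ((0.0, 0.030), "ok"),
]

title = Box(0, 0, 100, 50)
print(classify_pair(title, Box(50, 25, 100, 50), CASES))
print(classify_pair(title, Box(0, 52, 100, 50), CASES))
print(classify_pair(title, Box(0, 80, 100, 50), CASES))
```

Running the rule before the case lookup keeps evident defects deterministic and reserves the nearest-neighbor step for the style-dependent judgments that a fixed rule cannot settle. A real system would use richer features and a larger case base than this two-entry example.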
