Journal: International Journal of Computational Intelligence Research

Rule Based Chunk Extraction from PDF Documents Using Regular Expressions and Natural Language Processing

Abstract

Natural Language Processing (NLP) is a stimulating and vital field of Artificial Intelligence (AI). NLP can be used to build the required intelligence into the system under consideration, so that the system behaves with the convenience and efficiency the user expects. The proposed system demonstrates an application of NLP combined with regular expressions to categorize and classify sentences in Word/PDF (Portable Document Format) documents according to rules provided by the user. Thousands of similar PDF documents can be processed easily by reading them page by page, and the system produces results according to user-defined rules that apply to all input PDF documents. A single rule is written by considering one input PDF document and is then applied to all other input PDF documents, creating individual data chunks from every document and displaying them in the user interface in table format.
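
The workflow described in the abstract (write one rule against a single document, then apply it page by page to every similar document and tabulate the results) can be illustrated with a minimal sketch. The abstract does not specify the rule syntax, the PDF reader, or any NLP step, so everything named here is an assumption for illustration: the rule format (a field name mapped to a regular expression with one capture group), the field names, the helper functions `extract_chunks` and `build_table`, and the sample data. Page text is assumed to have been extracted from each PDF already and is represented as plain strings.

```python
import re
from typing import Dict, List

# Hypothetical rule format (not from the paper): each field name maps to a
# regular expression with exactly one capture group holding the data chunk.
RULE: Dict[str, str] = {
    "invoice_no": r"Invoice\s+No\.?\s*:\s*(\S+)",
    "total": r"Total\s*:\s*([\d.,]+)",
}

# Stand-in for text already extracted from PDFs: document name -> list of pages.
SAMPLE_DOCS: Dict[str, List[str]] = {
    "a.pdf": ["Invoice No: INV-001\nCustomer: Acme", "Total: 1,250.00"],
    "b.pdf": ["Invoice No. : INV-002\nTotal: 99.50"],
}


def extract_chunks(pages: List[str], rule: Dict[str, str]) -> Dict[str, str]:
    """Read a document page by page and keep the first match for each field."""
    compiled = {name: re.compile(pat, re.IGNORECASE) for name, pat in rule.items()}
    row = {name: "" for name in rule}  # empty string means "no match found"
    for page_text in pages:
        for name, pattern in compiled.items():
            if row[name]:
                continue  # field already filled from an earlier page
            match = pattern.search(page_text)
            if match:
                row[name] = match.group(1)
    return row


def build_table(documents: Dict[str, List[str]], rule: Dict[str, str]) -> List[Dict[str, str]]:
    """Apply the same rule to every document and return one table row per document."""
    rows = []
    for doc_name, pages in documents.items():
        row = {"document": doc_name}
        row.update(extract_chunks(pages, rule))
        rows.append(row)
    return rows


if __name__ == "__main__":
    for table_row in build_table(SAMPLE_DOCS, RULE):
        print(table_row)
```

Because the rule is plain data, the same dictionary can be reused unchanged across all input documents, which mirrors the "write once, apply to all" idea in the abstract. A field that matches on no page is left empty instead of raising an error, so one malformed document does not stop the batch.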