
Leveraging text content for management of construction project documents.



Abstract

The construction industry is a knowledge-intensive industry. Documents, as information carriers, must be managed effectively to ensure successful project management. A single project can produce thousands of documents, many of them in textual, unstructured form, which greatly complicates information management. Conventionally, project documents are organized by classifying them according to fixed, predefined classes and document metadata, e.g. document type, originator, project attribute, specification division, or date. While such a classification method is easy to implement, it helps document search and retrieval only if the document seeker has prior knowledge of the content of the document corpus. In many cases, and for various project management activities, the seeker has no such knowledge, so the search is frustrated and its results are delayed or incomplete.

An alternative framework for organizing project documents based on document content is proposed. The framework takes into account important characteristics of construction project documents and leverages them to facilitate document search and retrieval. Its premise is that documents are not produced haphazardly but are generated as a result of certain events or circumstances occurring in the project. As such, documents can be linked to each other on the semantic level. Document management systems overlook this point: they generally manage documents in vacuo, disregarding or failing to utilize the semantic connections between them. Organizing project documents based on the semantic relations that exist between them (revealed from the document content and not just the document attributes) facilitates information retrieval and retains the knowledge of the actual project participants, thereby supporting knowledge reuse.

Another aspect of the thesis investigates the use of document content analysis to enable automated document management. If textual similarities between documents correlate with what human users recognize through their semantic abilities, then content analysis can be used to organize documents automatically according to the proposed framework. Text classifiers based on machine learning techniques were evaluated to determine how well they identify the group of semantically similar documents to which a test document belongs. In addition, an unsupervised learning method was adapted and evaluated for the task of clustering documents, based on textual similarity, into sets of semantically related documents. The purpose of these evaluations is to equip electronic document management systems with content analysis capabilities that facilitate document search and retrieval.
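The idea of grouping documents by textual similarity can be illustrated with a minimal sketch. The abstract does not name the features, similarity measure, or clustering method used in the thesis, so everything here is an assumption made for illustration: TF-IDF term weights with a smoothed IDF, cosine similarity, a single-link grouping rule with an arbitrary threshold of 0.2, and a handful of made-up project documents.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed weighting, not taken from the thesis).
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.2):
    """Single-link grouping: merge the clusters of any two documents whose
    similarity reaches the (hypothetical) threshold. Returns one label per
    document; documents with equal labels are in the same group."""
    vectors = tfidf_vectors(docs)
    labels = list(range(len(docs)))  # every document starts in its own cluster
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                old, new = labels[j], labels[i]
                labels = [new if label == old else label for label in labels]
    return labels


# Made-up documents: three about a slab rebar issue, two about a paint submittal.
docs = [
    "RFI: clarify rebar spacing for the level 2 slab",
    "Response to RFI on slab rebar spacing at level 2",
    "Change order for additional rebar in level 2 slab",
    "Submittal: paint color samples for lobby walls",
    "Approval of lobby wall paint color submittal",
]
labels = cluster(docs)
print(labels)
```

In this toy corpus the three rebar documents end up with one label and the two paint documents with another, even though they differ in document type (RFI, response, change order, submittal, approval). That mirrors the contrast the abstract draws between organizing by metadata and organizing by the event that produced the documents. On real project documents the threshold and tokenization would need tuning.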

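Bibliographic record

  • Author

    Alqady, Mohammed

  • Author affiliation

    Purdue University

  • Degree-granting institution: Purdue University
  • Subject: Information Technology; Engineering, Civil
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 220 p.
  • Total pages: 220
  • Original format: PDF
  • Language of text: English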
