Conference: Advances in Information Retrieval

Integrating Structure and Meaning: A New Method for Encoding Structure for Text Classification



Abstract

Current representation schemes for automatic text classification treat documents as syntactically unstructured collections of words or 'concepts'. Past attempts to encode syntactic structure have treated part-of-speech information as another word-like feature, but have been shown to be less effective than non-structural approaches. We propose a new representation scheme using Holographic Reduced Representations (HRRs) as a technique to encode both semantic and syntactic structure. This method improves on previous attempts in the literature by encoding the structure across all features of the document vector while preserving text semantics. Our method does not increase the dimensionality of the document vectors, allowing for efficient computation and storage. We present classification results of our HRR text representations versus Bag-of-Concepts representations and show that our method of including structure improves text classification results.
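
The abstract does not spell out the exact encoding, so the following is only a minimal sketch of the generic HRR operations it relies on, not the authors' implementation. It assumes the standard formulation in which binding is circular convolution and unbinding is circular correlation. The role and concept names (`subject_role`, `object_role`, `dog`, `cat`), the dimensionality, and the random vectors are all hypothetical, chosen for illustration. The point it shows is that binding structure into a vector leaves its dimensionality unchanged.

```python
import numpy as np


def random_vector(dim, rng):
    # HRR vectors are commonly drawn i.i.d. from N(0, 1/dim) (assumption).
    return rng.normal(0.0, 1.0 / np.sqrt(dim), size=dim)


def bind(a, b):
    # Circular convolution computed via FFT; output length equals input length.
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))


def unbind(trace, cue):
    # Circular correlation: an approximate inverse of bind for random vectors.
    return np.fft.irfft(np.fft.rfft(trace) * np.conj(np.fft.rfft(cue)), n=len(trace))


def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(0)
dim = 1024

# Hypothetical structural roles and concept vectors.
subject_role = random_vector(dim, rng)
object_role = random_vector(dim, rng)
dog = random_vector(dim, rng)
cat = random_vector(dim, rng)

# Superpose two role-filler bindings into one fixed-size vector.
doc = bind(subject_role, dog) + bind(object_role, cat)

# Probing with a role yields a noisy copy of the filler bound to it.
recovered = unbind(doc, subject_role)
print(doc.shape)
print(cosine(recovered, dog), cosine(recovered, cat))
```

In this sketch the recovered vector should be noticeably closer to `dog` than to `cat`, while `doc` keeps the same length as each input vector. How the paper chooses roles and combines the bound vectors into a document representation is not stated in the abstract.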
