Investigation into Text Classification With Kernel Based Schemes

Abstract

The development of the Internet has resulted in a rapid explosion of information available on the Web. In addition, the speed and anonymity of Internet media 'publishing' make this medium ideal for rapid dissemination of various content. As a result, there is a strong need for automated text analysis and mining tools that can identify the main topics of texts, chat room discussions, Web postings, etc. This thesis investigates whether a nonlinear kernel-based feature vector selection approach is beneficial for categorizing unstructured text documents. Results from nonlinear kernel-based classification are compared with results obtained using Latent Semantic Analysis (LSA), an approach commonly used in text categorization applications. The nonlinear kernel-based scheme considered in this work applies the feature vector selection (FVS) approach followed by Linear Discriminant Analysis (LDA). Titles and abstracts from IEEE journal articles published between 1990 and 1999 and containing specific key terms were used to construct the data set for classification. Overall, taking into account both classification performance and timing, results showed FVS-LDA with a polynomial kernel of degree 1 and an added constant of 1 to be the best classifier for the database considered.
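
To make the kernel setting mentioned in the abstract concrete, the following is a minimal sketch of a polynomial kernel in its common textbook form, k(x, y) = (x · y + c)^d, evaluated with degree d = 1 and constant c = 1. The abstract does not state the formula, so reading the "added constant" as the additive term c inside the power is an assumption, and the function names and document vectors below are hypothetical illustrations, not data or code from the thesis.

```python
def polynomial_kernel(x, y, degree=1, constant=1):
    """Common polynomial kernel form: (x . y + constant) ** degree.

    Assumption: the "added constant" in the abstract is the additive
    term inside the power; the thesis may define it differently.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + constant) ** degree


def gram_matrix(docs, degree=1, constant=1):
    """Pairwise kernel values between all document vectors."""
    return [
        [polynomial_kernel(u, v, degree, constant) for v in docs]
        for u in docs
    ]


# Hypothetical term-count vectors for three documents (illustrative only).
docs = [
    [1, 0, 2],
    [0, 1, 1],
    [1, 1, 0],
]

K = gram_matrix(docs, degree=1, constant=1)
for row in K:
    print(row)
```

With degree 1 and constant 1, each entry of the matrix is simply the dot product of two document vectors plus one, which is cheap to compute and may partly explain the favorable timing the abstract reports.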
