Effect of small sample size on text categorization with support vector machines

Abstract

Datasets that answer difficult clinical questions are expensive in part due to the need for medical expertise and patient informed consent. We investigate the effect of small sample size on the performance of a text categorization algorithm. We show how to determine whether the dataset is large enough to train support vector machines. Since it is not possible to cover all aspects of sample size calculation in one manuscript, we focus on how certain types of data relate to certain properties of support vector machines. We show that normal vectors of decision hyperplanes can be used for assessing reliability and internal cross-validation can be used for assessing stability of small sample data.
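
The abstract does not spell out the procedure, so the following is only a minimal sketch of one way the two ideas could be explored: agreement between normal vectors of hyperplanes trained on bootstrap resamples (a rough reliability signal) and the spread of per-fold accuracies under cross-validation (a rough stability signal). Everything here is an assumption made for illustration: the Pegasos-style trainer stands in for whatever SVM solver the authors used, the toy data is synthetic, and the function names (`train_linear_svm`, `normal_vector_agreement`, `cv_accuracies`, `make_toy_data`) are hypothetical.

```python
import math
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent for a linear SVM (no bias term).
    Returns the weight vector, i.e. the normal vector of the decision hyperplane."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink w (regularization), then step toward the example if it violated the margin.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def normal_vector_agreement(X, y, n_resamples=10, seed=0):
    """Mean pairwise cosine similarity of normal vectors trained on bootstrap resamples.
    Values near 1 suggest the hyperplane direction barely depends on the sample."""
    rng = random.Random(seed)
    n = len(X)
    ws = []
    for r in range(n_resamples):
        sample = [rng.randrange(n) for _ in range(n)]
        ws.append(train_linear_svm([X[i] for i in sample], [y[i] for i in sample], seed=r))
    pairs = [(a, b) for a in range(len(ws)) for b in range(a + 1, len(ws))]
    return sum(cosine(ws[a], ws[b]) for a, b in pairs) / len(pairs)

def cv_accuracies(X, y, k=5, seed=0):
    """Per-fold accuracy from k-fold cross-validation; a wide spread hints at instability."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        correct = sum(1 for i in fold
                      if y[i] * sum(wj * xj for wj, xj in zip(w, X[i])) > 0)
        accs.append(correct / len(fold))
    return accs

def make_toy_data(n, seed=0):
    """Synthetic two-class data (assumption): only feature 0 carries the class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        x = [rng.gauss(0.0, 1.0) for _ in range(5)]
        x[0] += 2.0 * label
        X.append(x)
        y.append(label)
    return X, y

X, y = make_toy_data(40)
agreement = normal_vector_agreement(X, y)
accs = cv_accuracies(X, y)
print(f"mean pairwise cosine of normal vectors: {agreement:.3f}")
print("cross-validation accuracies:", [round(a, 2) for a in accs])
```

Rerunning the sketch with a smaller `n` in `make_toy_data` is one way to see how both signals react to sample size; the thresholds for "large enough" are not given in the abstract and are left open here.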

Bibliographic details

Source: Workshop on biomedical natural language processing, 2012, 9 pages
Authors: Pawel Matykiewicz; John Pestian
Original format: PDF
Chinese Library Classification: Programming; software engineering

Similar literature

1. Text Categorization with Support Vector Machines. How to Represent Texts in Input Space? [J]. Edda Leopold, Jorg Kindermann. Machine Learning, 2002, issues 1-3.
2. Afaan Oromo News Text Categorization using Decision Tree Classifier and Support Vector Machine: A Machine Learning Approach [J]. Kamal Mohammed Jimalo, Ramesh Babu P, Yaregal Assabie. International Journal of Computer Trends and Technology, 2017, issue 1.
3. Early detection of gradual concept drifts by text categorization and Support Vector Machine techniques: The TRIO algorithm [J]. M. Marseguerra. Reliability Engineering & System Safety, 2014, September issue.
4. Effect of small sample size on text categorization with support vector machines [C]. Pawel Matykiewicz, John Pestian. 11th Workshop on biomedical natural language processing 2012, 2012.
5. A new feature selection method based on support vector machines for text categorization [D]. Xu, Yaquan. 2006.
6. Use of a support vector machine for categorizing free-text notes: assessment of accuracy across two institutions [O]. Adam Wright, Allison B McCoy, Stanislav Henkin. 2013.
7. Text categorization with support vector machines. How to represent texts in input space? [O]. Leopold, E., Kindermann, J. 2002.
