A Weighted κ-Nearest Neighborhood for BaseNP Detection under Covariate Shift

Abstract

Most machine learning methods assume that training data and test data are sampled from the same distribution. In practice, however, this assumption is often violated. The situation in which the training and test data are generated from different distributions is called covariate shift. In natural language processing, covariate shift is highly likely to occur because of the size of the sample space: natural language data are theoretically infinite, so the distribution of the training data cannot reflect that of the entire data. In this paper, we try to verify that the performance of natural language processing methods can be improved by reducing the error caused by covariate shift. For this purpose, we propose an importance-weighted k-NN for base noun phrase (BaseNP) detection. In the proposed method, the weights are set according to the difference between the training and test distributions. Theoretically, importance weighting can improve performance under covariate shift. In the experiments, the proposed method performs better than standard k-NN.
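The idea of an importance-weighted k-NN vote can be illustrated with a minimal sketch. The abstract does not say how the authors estimate their weights, so the code below assumes the standard importance weight, the density ratio p_test(x) / p_train(x), estimated with a simple Gaussian kernel density. The function names, the bandwidth, the toy data, and the "NP"/"O" labels are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kde(points, x, bandwidth):
    """Gaussian kernel density at x, up to a constant factor.

    The omitted factor depends only on bandwidth and dimension, so it cancels
    when two estimates with the same bandwidth are divided.
    """
    total = sum(math.exp(-sq_dist(p, x) / (2.0 * bandwidth ** 2)) for p in points)
    return total / len(points)


def importance_weights(train_x, test_x, bandwidth=0.5, eps=1e-12):
    """Assumed weight for each training point: p_test(x) / p_train(x)."""
    return [
        kde(test_x, x, bandwidth) / (kde(train_x, x, bandwidth) + eps)
        for x in train_x
    ]


def weighted_knn_predict(train_x, train_y, weights, query, k=3):
    """Label with the largest summed weight among the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))[:k]
    votes = defaultdict(float)
    for i in nearest:
        votes[train_y[i]] += weights[i]
    return max(votes, key=votes.get)


# Toy 1-D data (hypothetical): training inputs cluster near 0, test inputs near 1.5.
train_x = [(0.0,), (0.2,), (1.5,)]
train_y = ["O", "O", "NP"]
test_x = [(1.4,), (1.5,), (1.6,)]

weights = importance_weights(train_x, test_x)
plain = weighted_knn_predict(train_x, train_y, [1.0] * len(train_x), (0.9,))
weighted = weighted_knn_predict(train_x, train_y, weights, (0.9,))
print(plain, weighted)
```

With uniform weights the two "O" neighbours outvote the single "NP" neighbour. With the estimated weights, the training point that lies where the test inputs concentrate dominates the vote. Only unlabeled test inputs are needed to compute the weights.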

Bibliographic record

Source: International Conference on Advanced Language Processing and Web Information Technology, 2008, 6 pages
Authors: Jeong-Woo Son; Seong-Bae Park; Young-Jin Han; Se-Young Park
Original format: PDF
Chinese Library Classification: TP312-53

