DETERMINING CONFIDENT DATA SAMPLES FOR MACHINE LEARNING MODELS ON UNSEEN DATA

Abstract

Techniques are provided for determining confident data samples for machine learning (ML) models on unseen data. In one embodiment, a method is provided that comprises extracting, by a system comprising a processor, a feature vector for a data sample based on projection of the data sample onto a standard feature space. The method further comprises processing, by the system, the feature vector using an outlier detection model to determine whether the data sample is within a scope of a training dataset used to train a machine learning model, wherein the outlier detection model was trained using features extracted from the training dataset based on projection of data samples included in the training dataset onto the standard feature space.
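
The two steps in the abstract (project a sample onto a shared feature space, then ask an outlier detector fitted on the training features whether the sample is in scope) can be illustrated with a minimal sketch. The abstract does not specify how the standard feature space is built or which outlier detection model is used, so everything below is an assumption made for illustration: `BASIS` is a hypothetical fixed linear projection, `ZScoreOutlierDetector` is a toy stand-in detector, and the 0.95 quantile threshold is arbitrary.

```python
import math
import random

# Hypothetical "standard feature space": three fixed linear projections of an
# 8-dimensional raw sample. The abstract does not define this space.
BASIS = [
    [1 / 8] * 8,                       # overall mean
    [1 / 4] * 4 + [-1 / 4] * 4,        # first half minus second half
    [1 / 8, -1 / 8] * 4,               # alternating-sign contrast
]

def project(sample, basis=BASIS):
    """Extract a feature vector by projecting a raw sample onto the basis."""
    return [sum(w * x for w, x in zip(row, sample)) for row in basis]

class ZScoreOutlierDetector:
    """Toy detector (an assumption, not the patented model): a feature vector
    is in scope if its RMS z-score does not exceed a training-set quantile."""

    def fit(self, features, quantile=0.95):
        n, d = len(features), len(features[0])
        self.mean = [sum(f[j] for f in features) / n for j in range(d)]
        self.std = [
            math.sqrt(sum((f[j] - self.mean[j]) ** 2 for f in features) / n) or 1.0
            for j in range(d)
        ]
        scores = sorted(self.score(f) for f in features)
        # Roughly `quantile` of the training samples fall at or below this score.
        self.threshold = scores[min(n - 1, int(quantile * n))]
        return self

    def score(self, feature):
        z = [(v - m) / s for v, m, s in zip(feature, self.mean, self.std)]
        return math.sqrt(sum(v * v for v in z) / len(z))

    def in_scope(self, feature):
        return self.score(feature) <= self.threshold

# Synthetic stand-in for the training dataset of the ML model.
rng = random.Random(0)
train = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]

# The detector is fitted on training features extracted with the SAME projection.
detector = ZScoreOutlierDetector().fit([project(s) for s in train])

typical_sample = [0.0] * 8    # resembles the training data
shifted_sample = [25.0] * 8   # far outside the training distribution
typical_ok = detector.in_scope(project(typical_sample))
shifted_ok = detector.in_scope(project(shifted_sample))
```

In this sketch `typical_ok` is `True` and `shifted_ok` is `False`, so a downstream ML model's prediction on the shifted sample would be treated as not confident. The point it illustrates is that training samples and unseen samples pass through the same projection, so the detector compares like with like.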