DETERMINING CONFIDENT DATA SAMPLES FOR MACHINE LEARNING MODELS ON UNSEEN DATA

Abstract

Techniques are provided for determining confident data samples for machine learning (ML) models on unseen data. In one embodiment, a method is provided that comprises extracting, by a system comprising a processor, a feature vector for a data sample based on projection of the data sample onto a standard feature space. The method further comprises processing, by the system, the feature vector using an outlier detection model to determine whether the data sample is within a scope of a training dataset used to train a machine learning model, wherein the outlier detection model was trained using features extracted from the training dataset based on projection of data samples included in the training dataset onto the standard feature space.
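The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how the standard feature space is built or which outlier detection model is used, so the fixed linear `BASIS`, the `ZScoreOutlierDetector` class, the 99% threshold, and the synthetic training data are all hypothetical stand-ins.

```python
import math
import random
import statistics

# Assumption: a fixed linear map stands in for the "standard feature space".
# The abstract does not specify how this space is actually constructed.
BASIS = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]


def extract_features(sample):
    """Project a raw 4-value sample onto BASIS to get a 2-value feature vector."""
    return [sum(w * x for w, x in zip(row, sample)) for row in BASIS]


class ZScoreOutlierDetector:
    """Hypothetical outlier model: per-feature mean/std plus a distance threshold."""

    def fit(self, feature_vectors, quantile=0.99):
        columns = list(zip(*feature_vectors))
        self.means = [statistics.fmean(c) for c in columns]
        # Fall back to 1.0 if a feature is constant, to avoid dividing by zero.
        self.stds = [statistics.pstdev(c) or 1.0 for c in columns]
        # Threshold = score at the chosen quantile of the training features.
        scores = sorted(self.score(f) for f in feature_vectors)
        index = min(len(scores) - 1, int(quantile * len(scores)))
        self.threshold = scores[index]
        return self

    def score(self, features):
        """Standardized Euclidean distance from the training feature mean."""
        return math.sqrt(
            sum(((f - m) / s) ** 2
                for f, m, s in zip(features, self.means, self.stds))
        )

    def in_scope(self, features):
        return self.score(features) <= self.threshold


# Synthetic stand-in for the dataset used to train the ML model.
rng = random.Random(0)
training_samples = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(500)]

# The outlier model is trained on features extracted through the SAME projection.
detector = ZScoreOutlierDetector().fit(
    [extract_features(s) for s in training_samples]
)


def is_confident_sample(sample):
    """True if the sample's features fall within the scope of the training data."""
    return detector.in_scope(extract_features(sample))
```

The point the sketch tries to capture is that training samples and unseen samples pass through the same projection before the outlier model sees them, so a sample far from the training distribution in that shared space is flagged as out of scope for the ML model.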

Bibliographic details

Publication number: US2020349434A1
Publication date: 2020-11-05
Original format: PDF
Applicant/assignee: GE PRECISION HEALTHCARE LLC
Application number: US202016934650
Inventors: MIN ZHANG; GOPAL B. AVINASH; ZILI MA; KEVIN H. LEUNG; WEN JIN
Filing date: 2020-07-21
Classification: G06N3/08; G06F16/28; G06N3/04
Country: US
Date indexed: 2022-08-21 11:21:23
