
Data Preprocessing and Outlier Detection in Multivariate Data

Abstract

The n-dimensional input to an artificial neural network for object classification is fed by different signal sources, such as temperature sensors, position encoders, microphones, cameras, spectrometers, or other measurement instruments. The raw data delivered by these sources do not share a common structure: they differ in their number of dimensions and may arrive at the input in parameterized form as linked values, as data packets, or as discrete values. The data preprocessing procedures analyze the distribution of these values, identify their statistical parameters, eliminate non-applicable values, and save the characteristic values in a model of the training data set (TDS) for the underlying object. Different methods are used to match these values to the input structure of the artificial network. Sensor signals are quantized, filtered, and standardized. Signal trends given in parameterized form can be transformed into spectral components via the wavelet transform, then filtered and standardized. To create the input vectors, single measured values, packets of spectral measurement data, and other information sources can be combined to obtain reliable characteristic properties for the trained object. To achieve a predefined accuracy for the mapping to a low-dimensional space, the training vectors should have a bounded maximum distance to their nearest neighbors (NN). The preprocessing procedure ensures this bound, together with a continuous manifold without gaps, so that the data can be processed as a whole in the subsequent work steps.
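The abstract does not state which outlier criterion or distance metric the paper uses, so the following is only a minimal sketch of the described steps under stated assumptions: a per-feature z-score threshold stands in for "eliminating non-applicable values", standardization means zero mean and unit sample standard deviation, and the nearest-neighbor check uses Euclidean distance. The function names (`remove_outliers`, `standardize`, `max_nn_distance`), the threshold `z_max`, and the sample readings are all hypothetical.

```python
import math
import statistics


def remove_outliers(rows, z_max=3.0):
    """Drop rows where any feature lies more than z_max sample standard
    deviations from that feature's mean (assumed criterion, not the paper's)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    kept = []
    for row in rows:
        ok = all(s == 0 or abs(x - m) / s <= z_max
                 for x, m, s in zip(row, means, stds))
        if ok:
            kept.append(list(row))
    return kept


def standardize(rows):
    """Scale every feature to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s if s > 0 else 0.0
             for x, m, s in zip(row, means, stds)]
            for row in rows]


def max_nn_distance(rows):
    """Largest Euclidean distance from any vector to its nearest neighbor."""
    worst = 0.0
    for i, a in enumerate(rows):
        nearest = min(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        worst = max(worst, nearest)
    return worst


# Hypothetical readings: (temperature, position) pairs plus one faulty sample.
rows = [[20.0 + 0.1 * k, 1.0 + 0.01 * k] for k in range(10)]
rows.append([95.0, 1.05])  # implausible temperature spike

# With only 11 samples a z-score cannot exceed about 3.02, so use 2.5 here.
clean = remove_outliers(rows, z_max=2.5)
scaled = standardize(clean)

print(len(rows), len(clean))
print(round(max_nn_distance(scaled), 3))
```

Removing the spike before standardizing matters in this sketch: a single extreme value inflates the standard deviation and leaves a large gap between the outlier and the rest of the training vectors, which is the kind of gap the nearest-neighbor bound is meant to rule out.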

Bibliographic record

  • Source
    Fortschritt-Berichte VDI | 2017, No. 857 | pp. 105-140 | 36 pages
  • Author

    Gerhard Sartorius

  • Affiliation

    Faculty of Mathematics and Computer Science, FernUniversität in Hagen, Germany

  • Original format: PDF
  • Language: English
