
Fixed neighborhood sphere and pattern selection in SVDD



Abstract

When a dataset is large, we need to select a subset that represents the original data. Many scholars approach pattern selection through kNN (k-nearest neighbors), but the distribution of a pattern's neighbors is usually uneven. In this paper, we define a fixed neighborhood sphere. When a pattern lies near the boundary of the data distribution, its neighborhood sphere contains fewer neighbors; when it lies inside the data distribution, the sphere contains more. By counting the neighbors within a fixed neighborhood sphere, we can identify the patterns located near the boundary of the data distribution. In SVDD (Support Vector Data Description), the patterns near the boundary carry more information, because they are the ones likely to become support vectors. The FNSPS (fixed neighborhood sphere pattern selection) algorithm selects these boundary patterns. The experimental results show that the performance of SVDD does not degrade. Naively identifying the neighbors in the fixed neighborhood sphere has time complexity O(n^2), while SVDD has time complexity O(n^3). If we also set a lower threshold, the FNSPS algorithm can be used to remove noise in the target data.
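The neighbor-counting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `neighbor_counts` and `select_patterns`, the parameters `radius`, `upper`, and `lower`, and the use of Euclidean distance are all assumptions made for illustration, and the abstract does not say how the radius or the thresholds are chosen.

```python
from math import dist  # Euclidean distance, Python 3.8+


def neighbor_counts(patterns, radius):
    """Naive O(n^2) count of the other patterns inside each pattern's sphere.

    `radius` is the fixed sphere radius (a hypothetical parameter name).
    """
    n = len(patterns)
    counts = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is checked once and credited to both patterns.
            if dist(patterns[i], patterns[j]) <= radius:
                counts[i] += 1
                counts[j] += 1
    return counts


def select_patterns(patterns, radius, upper, lower=0):
    """Return indices of patterns whose neighbor count lies in [lower, upper].

    Few neighbors (count <= upper) is taken to mean "near the boundary".
    Raising `lower` above 0 drops near-isolated patterns, treated here as
    noise. Both thresholds are assumptions, not values from the paper.
    """
    counts = neighbor_counts(patterns, radius)
    return [i for i, c in enumerate(counts) if lower <= c <= upper]


# Toy 1-D data: five evenly spaced points plus one isolated point.
patterns = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (10.0,)]
boundary = select_patterns(patterns, radius=1.5, upper=1, lower=1)
```

In this toy example the two end points of the evenly spaced run each have a single neighbor and are selected, the interior points have two neighbors and are dropped, and the isolated point has none, so `lower=1` filters it out as noise. The selected subset would then be passed to an SVDD trainer in place of the full dataset.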
