Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning

Abstract

We consider the problem of retrieving the database points nearest to a given hyperplane query without exhaustively scanning the entire database. For this problem, we propose two hashing-based solutions. Our first approach maps the data to 2-bit binary keys that are locality sensitive for the angle between the hyperplane normal and a database point. Our second approach embeds the data into a vector space where the Euclidean norm reflects the desired distance between the original points and the hyperplane query. Both use hashing to retrieve near points in sublinear time. Our first method's preprocessing stage is more efficient, while the second has stronger accuracy guarantees. We apply both to pool-based active learning: taking the current hyperplane classifier as a query, our algorithm identifies those points (approximately) satisfying the well-known minimal distance-to-hyperplane selection criterion. We empirically demonstrate our methods' tradeoffs and show that they make it practical to perform active selection with millions of unlabeled points.
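
The abstract does not spell out the hash functions, so the following is only a minimal sketch of the first idea under stated assumptions: the hyperplane passes through the origin, each hash function is a pair of independent Gaussian vectors (u, v), a database point x gets the key (sign(u·x), sign(v·x)), and the query normal w gets (sign(u·w), sign(−v·w)). All function names are hypothetical. The exhaustive scan is included as the baseline that hashing is meant to avoid.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def distance_to_hyperplane(w, x):
    """Distance from x to the hyperplane through the origin with normal w (assumed: no bias term)."""
    return abs(dot(w, x)) / math.sqrt(dot(w, w))


def exhaustive_select(w, points, k):
    """Baseline: scan every point and return the indices of the k nearest to the hyperplane."""
    order = sorted(range(len(points)), key=lambda i: distance_to_hyperplane(w, points[i]))
    return order[:k]


def sign_bit(value):
    return 1 if value >= 0 else 0


def draw_projections(dim, rng):
    """Assumption: one hash function is a pair (u, v) of independent standard Gaussian vectors."""
    u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return u, v


def hash_point(uv, x):
    """Two-bit key for a database point."""
    u, v = uv
    return (sign_bit(dot(u, x)), sign_bit(dot(v, x)))


def hash_query(uv, w):
    """Two-bit key for the hyperplane normal; the v projection is negated (assumed construction)."""
    u, v = uv
    return (sign_bit(dot(u, w)), sign_bit(-dot(v, w)))


def collision_rate(w, x, trials, rng):
    """Fraction of freshly drawn hash functions under which x and the query share a key."""
    hits = 0
    for _ in range(trials):
        uv = draw_projections(len(w), rng)
        hits += hash_point(uv, x) == hash_query(uv, w)
    return hits / trials
```

Under this assumed construction, a point lying in the hyperplane (perpendicular to w) collides with the query with probability about 1/4, while a point parallel to w never collides, so buckets matching the query key are enriched with points near the hyperplane. In practice several such 2-bit keys would be concatenated into longer keys and stored in hash tables; that part is omitted here.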