Conference: IEEE International Conference on Teaching, Assessment, and Learning for Engineering

Nearest Centroid: A Bridge between Statistics and Machine Learning

Abstract

To guide our machine learning students in their statistical thinking, we need conceptually simple and mathematically defensible algorithms. In this paper, we present the Nearest Centroid (NC) algorithm as a pedagogical tool that combines the key concepts behind two foundational algorithms: K-Means clustering and k-Nearest Neighbors (k-NN). In NC, we compute the centroid (as defined in the K-Means algorithm) of the observations belonging to each class in the training data set, and we predict the class of a new observation from its distance to each centroid (similar to k-NN). Using this straightforward extension, we illustrate how the concepts of probability and statistics are applied in machine learning algorithms. Furthermore, we describe how the practical aspects of validation and performance measurement are carried out. As described toward the end of this paper, the algorithm and the work presented here can easily be converted into labs and reading assignments to cement the students’ understanding of applied statistics and its connection to machine learning algorithms.
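The prediction rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not name a distance metric, so Euclidean distance is an assumption here, and the function names (`fit_centroids`, `predict`, `accuracy`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Return {label: centroid}; each centroid is the per-feature mean of one class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (Euclidean distance, assumed)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


def accuracy(centroids, X, y):
    """Fraction of observations whose predicted class matches the true label."""
    hits = sum(predict(centroids, x) == label for x, label in zip(X, y))
    return hits / len(y)


# Toy training data (made up for illustration): two well-separated classes.
X_train = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [6.0, 6.0], [6.0, 7.0], [7.0, 6.0]]
y_train = ["a", "a", "a", "b", "b", "b"]
centroids = fit_centroids(X_train, y_train)

# Held-out observations stand in for the validation step the abstract mentions.
X_test = [[0.0, 0.0], [8.0, 8.0]]
y_test = ["a", "b"]
test_accuracy = accuracy(centroids, X_test, y_test)
```

Training reduces to one mean per class, and prediction compares a new observation against only as many points as there are classes, which is what makes the method easy to work through by hand in a lab setting.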
