Journal: Statistical modeling: applications in contemporary issues

Exploring and modelling team performances of the Kaggle European Soccer database

Abstract

This study explores a large, open database of soccer leagues in 10 European countries. Data on players, teams and matches covering seven seasons (from 2009/2010 to 2015/2016) were retrieved from Kaggle, an online platform on which big data are available for predictive modelling and analytics competitions among data scientists. Based on preliminary data analysis, experts’ evaluation and players’ positions on the football pitch, role-based indicators of team performance were built and used to estimate the win probability of the home team with a binomial logistic regression (BLR) model. The model was then extended to include an ELO rating predictor and two random effects that account for the hierarchical structure of the dataset. The predictive power of the BLR model and its extensions was compared with that of other statistical modelling approaches (Random Forest, Neural Network, k-NN, Naïve Bayes). Results showed that role-based indicators substantially improved the performance of all the models used both in this work and in previous works available on Kaggle. The base BLR model increased prediction accuracy by 10 percentage points and showed the importance of defence performances, especially in the last seasons. Including the ELO rating predictor and the random effects did not substantially improve prediction, as the simpler BLR model performed equally well. With respect to the other models, only Naïve Bayes showed more balanced results in predicting both win and no-win of the home team.
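The two building blocks named in the abstract, an ELO-style rating and a binomial logistic regression for the home win probability, can be sketched as follows. This is an illustration only, not the authors' implementation: the feature names (`defence_diff`, `midfield_diff`, `attack_diff`), the coefficient values and the `home_advantage` parameter are hypothetical, the Elo formula is the standard textbook one with a 400-point scale, and the random effects mentioned in the abstract are omitted.

```python
import math


def elo_expected_score(rating_home, rating_away, home_advantage=0.0):
    """Standard Elo expected score for the home side, a value in (0, 1).

    home_advantage is a hypothetical rating bonus; 0.0 disables it.
    """
    diff = rating_home + home_advantage - rating_away
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def home_win_probability(features, coefficients, intercept):
    """Binomial logistic regression: P(home win) = sigmoid(b0 + sum(b_i * x_i))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical role-based indicators (home minus away); values are made up.
feature_names = ["defence_diff", "midfield_diff", "attack_diff"]
features = [0.8, 0.1, -0.2]

# Hypothetical coefficients; a real model would estimate these from match data.
coefficients = [0.9, 0.3, 0.4]
intercept = -0.2

p_home_win = home_win_probability(features, coefficients, intercept)
p_elo = elo_expected_score(1600.0, 1500.0)
```

In an extended model of the kind the abstract describes, an Elo-based quantity would enter as one more predictor alongside the role-based indicators; how the authors actually encode it is not stated in the abstract.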
