Robust Neural Automated Essay Scoring Using Item Response Theory

Abstract

Automated essay scoring (AES) is the task of automatically assigning scores to essays as an alternative to human grading. Conventional AES methods typically rely on manually tuned features, which are laborious to develop effectively. To obviate the need for feature engineering, many deep neural network (DNN)-based AES models have been proposed and have achieved state-of-the-art accuracy. DNN-AES models require training on a large dataset of graded essays. However, the grades in such datasets are known to be strongly biased by rater effects when each essay is graded by only a few raters drawn from a larger rater set. The performance of DNN models drops rapidly when such biased data are used for model training. In the fields of educational and psychological measurement, item response theory (IRT) models that can estimate essay scores while accounting for rater characteristics have recently been proposed. This study therefore proposes a new DNN-AES framework that integrates IRT models to deal with rater bias within training data. To our knowledge, this is a first attempt at addressing rater bias effects in training data, which is a crucial but overlooked problem.

Bibliographic details

Source
International Conference on Artificial Intelligence in Education | 2020 | pp. 549-561 | 13 pages
Authors
Masaki Uto; Masashi Okano
Original format: PDF
Keywords
Deep neural networks; Item response theory; Automated essay scoring; Rater bias

Similar literature

1. On the Connections Between Item Response Theory and Classical Test Theory: A Note on True Score Evaluation for Polytomous Items via Item Response Modeling [J]. Raykov Tenko, Dimitrov Dimiter M., Marcoulides George A. Educational and Psychological Measurement. 2019, No. 6
2. Automated students Arabic essay scoring using trained neural network by e-Jaya optimization to support personalized system of instruction [J]. Marwa M. Gaheen, Rania M. ElEraky, Ahmed A. Ewees. Education and Information Technologies. 2021, No. 1
3. SEDNN: Shared and enhanced deep neural network model for cross-prompt automated essay scoring [J]. Li Xia, Chen Minping, Nie Jian-Yun. Knowledge-Based Systems. 2020, Issue Deca27
4. Integration of Automated Essay Scoring Models Using Item Response Theory [C]. Itsuki Aomi, Emiko Tsutsumi, Masaki Uto. International Conference on Artificial Intelligence in Education. 2021
5. A comparison of classical test theory and item response theory methods for equating number-right scored to formula scored assessments [D]. Wolkowitz, Amanda A. 2008
6. On the Connections Between Item Response Theory and Classical Test Theory: A Note on True Score Evaluation for Polytomous Items via Item Response Modeling [O]. Tenko Raykov, Dimiter M. Dimitrov, George A. Marcoulides. 2019
7. Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring [O]. Haoran Zhang, Diane Litman. 2020
