
Assessing fit of item response models for performance assessments using Bayesian analysis.


Abstract

Assessing IRT model-fit and comparing different IRT models from a Bayesian perspective is gaining attention. This research evaluated the performance of Bayesian model-fit and model-comparison techniques in assessing the fit of unidimensional Graded Response (GR) models and in comparing different GR models for performance assessment applications.

The study explored the general performance of the posterior predictive model checking (PPMC) method and a variety of discrepancy measures (test-level, item-level, and pair-wise measures) in evaluating different aspects of fit for unidimensional GR models. Previous findings that the PPMC method is conservative were confirmed. In addition, PPMC was found to have adequate power to detect different aspects of misfit when paired with appropriate discrepancy measures. Pair-wise measures were more powerful than test-level and item-level measures in detecting violations of the unidimensionality and local independence assumptions, with Yen's Q3 measure performing best. The power of PPMC also increased as the degree of multidimensionality or local dependence among item responses increased. Two classical item-fit statistics proved effective for detecting item misfit arising from discrepancies with the GR model boundary curves.

The study also compared the relative effectiveness of three Bayesian model-comparison indices (DIC, CPO, and PPMC) for model selection. The results showed that these indices performed equally well in selecting a preferred model for an overall test. The advantage of PPMC, however, is that it can not only compare the relative fit of different models but also evaluate the absolute fit of each individual model, whereas the DIC and CPO indices only compare relative fit.

This study further applied the Bayesian model-fit and model-comparison methods to three real datasets from the QCAI performance assessment. The results indicated that these datasets were essentially unidimensional and exhibited local independence among items. A 2P GR model fit better than a 1P GR model, and a two-dimensional model was not preferred. These findings were consistent with previous studies, although Stone's fit statistics in the PPMC context flagged fewer misfitting items than in earlier work. Limitations and directions for future research on Bayesian applications to IRT are discussed.
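The abstract does not reproduce the computations themselves. As a minimal sketch of how a pair-wise PPMC check based on Yen's Q3 might look, shown here for the simpler dichotomous 2PL case rather than the polytomous GR model, with simulated data, fixed item parameters, and all function names hypothetical:

```python
import numpy as np

def irt_prob(theta, a, b):
    """2PL probability of a correct response (hypothetical helper)."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def q3_matrix(responses, probs):
    """Yen's Q3: correlations of item residuals across examinees.
    Pairwise values far from 0 suggest local dependence."""
    resid = responses - probs
    return np.corrcoef(resid, rowvar=False)

def ppp_value(observed_stat, replicated_stats):
    """Posterior predictive p-value: share of replicated discrepancies
    at least as large as the realized one."""
    return np.mean(np.asarray(replicated_stats) >= observed_stat)

# Simulate a small dataset from the model itself.
rng = np.random.default_rng(0)
n, j = 2000, 5
theta = rng.standard_normal(n)
a = rng.uniform(0.8, 1.6, j)
b = rng.uniform(-1.0, 1.0, j)
p = irt_prob(theta, a, b)
x = (rng.random((n, j)) < p).astype(float)

# Realized discrepancy: largest absolute off-diagonal Q3.
q3 = q3_matrix(x, p)
obs = np.abs(q3[np.triu_indices(j, 1)]).max()

# Replicated data drawn from the fitted model give the PPMC
# reference distribution for the same discrepancy measure.
reps = []
for _ in range(200):
    x_rep = (rng.random((n, j)) < p).astype(float)
    q3_rep = q3_matrix(x_rep, p)
    reps.append(np.abs(q3_rep[np.triu_indices(j, 1)]).max())

print("PPP-value:", ppp_value(obs, reps))
```

PPP-values near 0 or 1 flag misfit; values near 0.5 indicate the model reproduces that feature of the data. In the dissertation's setting, the 2PL probabilities would be replaced by GR-model category probabilities evaluated at MCMC draws from the posterior, rather than the fixed parameters used here.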

Bibliographic Information

  • Author: Zhu, Xiaowen
  • Affiliation: University of Pittsburgh
  • Degree-granting institution: University of Pittsburgh
  • Subjects: Quantitative psychology; Educational tests & measurements; Educational psychology
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 304 p.
  • Format: PDF
  • Language: English
