Journal: Electronics and Communications in Japan

A Method of Readability Assessment for Web Documents Using Text Features and HTML Structures

Abstract

This paper describes a method of readability assessment for Web documents. Readability is the ease with which text can be read and understood. We hypothesize that readability is determined by whether a reader can easily grasp the structure of the text, and that the impression and the complexity of the text are significant factors. We extract impression and complexity features from the plain text and from additional data such as HTML tags. To compare the effect of the extracted features, we estimate a readability rank by machine learning. We conduct fivefold cross-validation for each domain and calculate the root mean squared error between the actual rank and the estimated rank. The cross-validation experiments indicate that the method performs well, showing that impression and complexity features are effective for readability assessment.
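
The pipeline outlined in the abstract (features from text and HTML tags, a learned rank estimate, fivefold cross-validation scored by root mean squared error) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not list the actual impression and complexity features or name the learner, so `extract_features` uses hypothetical stand-ins (average word length, average sentence length, counts of a few tags), and the learner is passed in as generic `fit` and `predict` callables. Averaging the per-fold errors is also an assumption; the abstract does not say whether errors are averaged or pooled.

```python
from html.parser import HTMLParser
import math
import re


class FeatureParser(HTMLParser):
    """Collects text nodes and counts a few structural tags."""

    STRUCTURAL = ("p", "li", "h1", "h2", "h3", "table")  # illustrative choice only

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.tag_counts = {t: 0 for t in self.STRUCTURAL}

    def handle_starttag(self, tag, attrs):
        if tag in self.tag_counts:
            self.tag_counts[tag] += 1

    def handle_data(self, data):
        # Note: script/style contents are not filtered out in this sketch.
        self.text_parts.append(data)


def extract_features(html):
    """Hypothetical feature vector: two text statistics plus tag counts."""
    parser = FeatureParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts)
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_word_len = sum(len(w) for w in words) / len(words) if words else 0.0
    avg_sent_len = len(words) / len(sentences) if sentences else 0.0
    return [avg_word_len, avg_sent_len] + [
        parser.tag_counts[t] for t in FeatureParser.STRUCTURAL
    ]


def rmse(actual, predicted):
    """Root mean squared error between actual and estimated ranks."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists; fold i tests items i, i+k, i+2k, ..."""
    if n < k:
        raise ValueError("need at least k samples")
    for i in range(k):
        test = list(range(i, n, k))
        train = [j for j in range(n) if j % k != i]
        yield train, test


def cross_validated_rmse(features, ranks, fit, predict, k=5):
    """Mean of per-fold RMSE for a learner given as fit/predict callables."""
    scores = []
    for train, test in k_fold_indices(len(ranks), k):
        model = fit([features[j] for j in train], [ranks[j] for j in train])
        preds = [predict(model, features[j]) for j in test]
        scores.append(rmse([ranks[j] for j in test], preds))
    return sum(scores) / k


# Trivial baseline learner for demonstration: always predict the mean rank.
def fit_mean(X, y):
    return sum(y) / len(y)


def predict_mean(model, x):
    return model
```

Calling `cross_validated_rmse(X, y, fit_mean, predict_mean)` on feature vectors `X` and ranks `y` gives a baseline error against which a real regressor, supplied through the same two callables, could be compared. Running the procedure separately on each domain's documents would mirror the per-domain evaluation mentioned in the abstract.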