Statistical Methods for Dating Collections of Historical Documents.

Abstract

The work in this thesis was originally motivated by problems arising from the Documents of Early England Data Set (DEEDS). The central difficulty with these medieval documents is the lack of methods for assigning accurate dates to the documents that bear no date.

With the DEEDS documents in mind, we present two methods for imputing missing features of texts. In the first method, we propose a new class of metrics for measuring distances between texts. We then show how to combine the distances between texts using statistical smoothing. This method can be adapted to settings where the features of the texts are ordered or unordered categorical variables (as in authorship assignment problems, for example).

In the second method, we estimate the probability of occurrence of words in texts using nonparametric regression, specifically local polynomial fitting with kernel weights applied to generalized linear models. We combine the estimated occurrence probabilities of the words in a text to estimate the probability of occurrence of the text as a function of its feature; here, the feature is the date on which the text was written. We present the application of both methods to the DEEDS documents, together with the results.
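
The first method can be illustrated with a minimal sketch. The abstract does not define the thesis's new class of metrics, so the code below substitutes a Jaccard distance on word shingles as a stand-in, and it combines the distances with a Nadaraya-Watson (kernel-weighted average) smoother using a Gaussian kernel. The function names, the shingle length `k`, and the `bandwidth` default are all hypothetical choices made for illustration, not the author's.

```python
import math


def shingles(text, k=2):
    """Return the set of k-word shingles (consecutive word tuples) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_distance(a, b, k=2):
    """Stand-in text distance (an assumption): 1 - |A & B| / |A | B| on shingle sets."""
    sa, sb = shingles(a, k), shingles(b, k)
    union = sa | sb
    if not union:
        return 0.0  # two texts with no shingles are treated as identical
    return 1.0 - len(sa & sb) / len(union)


def estimate_date(undated_text, dated_texts, bandwidth=0.5, k=2):
    """Kernel-smoothed date estimate for an undated text.

    dated_texts is a list of (text, year) pairs. Each dated text contributes
    its year with a Gaussian weight that decays with its distance from the
    undated text, so closer texts dominate the weighted average.
    """
    if not dated_texts:
        raise ValueError("at least one dated text is required")
    num = 0.0
    den = 0.0
    for text, year in dated_texts:
        d = jaccard_distance(undated_text, text, k)
        w = math.exp(-0.5 * (d / bandwidth) ** 2)
        num += w * year
        den += w
    # den > 0 because d lies in [0, 1], so every weight is strictly positive.
    return num / den
```

Because the estimate is a weighted average, it always lies between the earliest and latest dates in the dated collection. A smaller `bandwidth` makes the estimate track the nearest texts more closely, while a larger one pulls it toward the plain mean of all dates.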
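
The second method can be sketched in a simplified form. The code below uses the local-constant (degree-zero) special case of local polynomial fitting for a Bernoulli model, where the kernel-weighted maximum likelihood estimate of a word's occurrence probability at date `t` reduces to a kernel-weighted proportion. It then combines the word probabilities into a log-likelihood for the whole text under an independence assumption across words, and picks the candidate date that maximizes it. The independence assumption, the Gaussian kernel, the clamping constant `eps`, and all function names are assumptions made for this illustration; the thesis may use higher-degree fits and a different way of combining words.

```python
import math


def gaussian_kernel(u):
    """Gaussian kernel weight (unnormalized)."""
    return math.exp(-0.5 * u * u)


def word_probability(word, t, dated_docs, bandwidth, eps=1e-3):
    """Local-constant estimate of P(word occurs in a text written at date t).

    dated_docs is a list of (set_of_words, year) pairs. The estimate is the
    kernel-weighted share of documents containing the word, clamped to
    [eps, 1 - eps] so that its logarithm stays finite.
    """
    num = 0.0
    den = 0.0
    for words, year in dated_docs:
        w = gaussian_kernel((year - t) / bandwidth)
        num += w * (word in words)
        den += w
    p = num / den if den > 0.0 else 0.5  # no nearby data: fall back to 0.5
    return min(max(p, eps), 1.0 - eps)


def log_likelihood(text_words, t, dated_docs, vocabulary, bandwidth):
    """Log-probability of the text at date t, treating words as independent."""
    total = 0.0
    for word in vocabulary:
        p = word_probability(word, t, dated_docs, bandwidth)
        total += math.log(p) if word in text_words else math.log(1.0 - p)
    return total


def most_likely_date(text_words, dated_docs, candidate_years, bandwidth=20.0):
    """Return the candidate year that maximizes the text's log-likelihood."""
    vocabulary = set().union(*(words for words, _ in dated_docs))
    return max(
        candidate_years,
        key=lambda t: log_likelihood(text_words, t, dated_docs, vocabulary, bandwidth),
    )
```

A local-linear or higher-degree fit would replace the weighted proportion in `word_probability` with an iteratively fitted kernel-weighted logistic regression; the degree-zero version is used here only because it has a closed form.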

Bibliographic record

  • Author

    Tilahun, Gelila

  • Author affiliation

    University of Toronto (Canada)

  • Degree-granting institution: University of Toronto (Canada)
  • Subjects: Statistics; Artificial Intelligence; History, Medieval
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 179 p.
  • Total pages: 179
  • Original format: PDF
  • Language of text: English
