Comparing subsets from digital spatial archives: Point set similarity.

Abstract

This research focuses on the key question of measuring the spatial similarity between a small subset and a much larger superset. The new contribution of this work is a formal approach to measuring dataset similarity through a combination of the component similarities of the spatial qualities (density, dispersion and pattern) and the relative importance of spatial objects in the datasets.

Data consumers face the difficult task of fulfilling their data requirements from the growing number of digital spatial data collections available online. New database system architectures, such as digital libraries and data warehouses, are being introduced to manage these large collections. However, cognitive and technological limits make very large datasets difficult to use. Sampling is one traditional approach to simplifying datasets, the assumption being that the sample (or subset) is representative of the entire dataset. The spatial data consumer may be interested in a specific set of spatial qualities in the dataset, and traditional random samples do not preserve these qualities in very small subsets.

This work defines a formal approach to measuring similarity between very large spatial datasets and their much smaller subsets. The model defines methods for building similarity measures over nominal, ordinal, interval, and ratio measurement scales. Similarity measurements from different scales are combined through a set of measures that express similarity as a distance from zero (equal) to one (completely different). Values generated from this similarity measure are sorted into a similarity index.

The spatial measures were tested against a database of places and nineteen synthetically generated subsets. Examination of the metadata generated for each subset indicated that, by combining multiple measures of a single spatial quality, it is possible to isolate and identify the method that was used to generate a dataset.

The model was examined with regard to its application to digital spatial libraries and data warehouses. Since digital libraries are open domains, the measurement results usable for similarity assessment were restricted to the interval and ratio measurement scales. Data warehouses have greater potential for domain closure and can use all scales of measurement in similarity assessment.
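
The abstract does not give the formulas used, so the following Python sketch is only one plausible reading of the idea of expressing similarity on each measurement scale as a distance from zero (equal) to one (completely different) and then combining the components. All function names, the normalisations, the weighted mean used for combination, and the example values are assumptions made for illustration, not the author's method.

```python
# Illustrative sketch only: hypothetical helpers, not taken from the dissertation.

def nominal_distance(a, b):
    """Nominal scale: 0.0 if the categories match, 1.0 otherwise."""
    return 0.0 if a == b else 1.0


def ordinal_distance(rank_a, rank_b, n_ranks):
    """Ordinal scale: rank difference divided by the largest possible difference."""
    if n_ranks < 2:
        return 0.0
    return abs(rank_a - rank_b) / (n_ranks - 1)


def interval_distance(a, b, lo, hi):
    """Interval scale: absolute difference scaled by an assumed known range [lo, hi]."""
    if hi == lo:
        return 0.0
    return min(abs(a - b) / (hi - lo), 1.0)


def ratio_distance(a, b):
    """Ratio scale: relative difference; assumes both values are non-negative."""
    biggest = max(a, b)
    if biggest == 0:
        return 0.0
    return abs(a - b) / biggest


def combined_distance(distances, weights):
    """Weighted mean of component distances, which stays within [0, 1].

    The weights stand in for the relative importance of each component;
    using a weighted mean is an assumption of this sketch.
    """
    total_weight = sum(weights[name] for name in distances)
    weighted_sum = sum(distances[name] * weights[name] for name in distances)
    return weighted_sum / total_weight


def similarity_index(scored_subsets):
    """Sort (subset_name, distance) pairs from most to least similar."""
    return sorted(scored_subsets, key=lambda pair: pair[1])


# Hypothetical component distances for one subset compared with its superset.
components = {"density": 0.5, "dispersion": 0.0, "pattern": 1.0}
importance = {"density": 1.0, "dispersion": 1.0, "pattern": 2.0}
overall = combined_distance(components, importance)

# Hypothetical scores for three subsets, ordered into an index.
index = similarity_index([("subset_a", 0.7), ("subset_b", 0.1), ("subset_c", 0.4)])
```

Because every component is already bounded by zero and one, the weighted mean is bounded as well, which is what allows results from different measurement scales to be compared and ranked in a single index.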