
Expanding the role of synthetic data at the U.S. Census Bureau

Abstract

National statistical offices (NSOs) create official statistics from data collected from survey respondents, government administrative records, and other sources. The raw source data is usually considered confidential. In the case of the U.S. Census Bureau, confidentiality of survey and administrative records microdata is mandated by statute, and this mandate is often at odds with users' need to extract as much information from the data as possible. Traditional disclosure protection techniques result in official data products that do not fully utilize the information content of the underlying microdata. Typically, these products take the form of simple aggregate tabulations. In a few cases anonymized public-use microsamples are made available, but these face a growing risk of re-identification as increasing amounts of information about individuals and firms become available in the public domain. One approach to overcoming these risks is to release products based on synthetic data, in which values are simulated from statistical models designed to mimic the (joint) distributions of the underlying microdata. We discuss recent Census Bureau work to develop and deploy such products, along with the benefits and challenges of extending the scope of synthetic data products in official statistics.
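The idea of simulating values from a model fitted to the confidential microdata can be illustrated with a minimal sketch. It is not the Census Bureau's method: the bivariate normal model, the toy "age" and "income" variables, and the function names `fit_bivariate_normal` and `synthesize` are all assumptions made for illustration only.

```python
import math
import random


def fit_bivariate_normal(records):
    """Estimate means, standard deviations and correlation from (x, y) pairs."""
    n = len(records)
    mean_x = sum(x for x, _ in records) / n
    mean_y = sum(y for _, y in records) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in records) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in records) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in records) / (n - 1)
    rho = cov_xy / math.sqrt(var_x * var_y)
    return mean_x, mean_y, math.sqrt(var_x), math.sqrt(var_y), rho


def synthesize(params, n, rng):
    """Draw n simulated (x, y) records from the fitted bivariate normal."""
    mean_x, mean_y, sd_x, sd_y, rho = params
    residual_scale = math.sqrt(max(0.0, 1.0 - rho * rho))
    synthetic = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mean_x + sd_x * z1
        # Mixing z1 into y reproduces the fitted correlation between x and y.
        y = mean_y + sd_y * (rho * z1 + residual_scale * z2)
        synthetic.append((x, y))
    return synthetic


# Hypothetical stand-in for confidential microdata (made-up age/income pairs).
rng = random.Random(0)
confidential = []
for _ in range(2000):
    age = rng.gauss(45.0, 12.0)
    income = 1000.0 * age + rng.gauss(0.0, 15000.0)
    confidential.append((age, income))

params = fit_bivariate_normal(confidential)
synthetic = synthesize(params, len(confidential), rng)
synthetic_params = fit_bivariate_normal(synthetic)

print("confidential correlation:", round(params[4], 3))
print("synthetic correlation:   ", round(synthetic_params[4], 3))
```

The released file would contain only the simulated records, whose joint distribution approximates that of the originals. A production synthesizer would need far richer models and a disclosure-risk evaluation; the sketch shows only the fit-then-simulate pattern.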