Conference: UNESCO Chair in Data Privacy International Conference on Privacy in Statistical Databases

Synthetic Data via Quantile Regression for Heavy-Tailed and Heteroskedastic Data

Abstract

Privacy protection of confidential data is a fundamental problem faced by many government organizations and research centers. It is further complicated when data have complex structures or variables with highly skewed distributions. The statistical community addresses general privacy concerns by introducing different techniques that aim to decrease disclosure risk in released data while retaining their statistical properties. However, methods for complex data structures have received insufficient attention. We propose producing synthetic data via quantile regression to address privacy protection of heavy-tailed and heteroskedastic data. We address some shortcomings of the previously proposed use of quantile regression as a synthesis method and extend the work into cases where data have heavy tails or heteroskedastic errors. Using a simulation study and two applications, we show that there are settings where quantile regression performs as well as or better than other commonly used synthesis methods on the basis of maintaining good data utility while simultaneously decreasing disclosure risk.
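The abstract does not spell out the synthesis procedure, so the following is only a minimal sketch of one common way quantile regression can be used to generate synthetic values, not the authors' method. It assumes a single predictor, a linear conditional quantile, and a small fixed grid of quantile levels. For each record it draws a level at random and replaces the confidential response with the fitted conditional quantile at that level. The names `pinball_loss`, `fit_quantile_line`, and `synthesize` are hypothetical, and the brute-force fit is chosen only to keep the example dependency-free; it is practical for small samples only.

```python
import random


def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def fit_quantile_line(xs, ys, tau):
    """Fit y = a + b*x at quantile level tau by scanning pairs of points.

    A linear quantile fit with two coefficients has a minimizer that
    passes through two observations with distinct x values, so checking
    every such pair finds one. Cost is O(n^3): toy data only.
    """
    best = None
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue  # a vertical pair does not define a line
            b = (ys[j] - ys[i]) / (xs[j] - xs[i])
            a = ys[i] - b * xs[i]
            loss = sum(pinball_loss(y - (a + b * x), tau)
                       for x, y in zip(xs, ys))
            if best is None or loss < best[0]:
                best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def synthesize(xs, ys, levels, rng):
    """Replace each y with a fitted conditional quantile at a random level."""
    fits = {tau: fit_quantile_line(xs, ys, tau) for tau in levels}
    synthetic = []
    for x in xs:
        tau = rng.choice(levels)  # assumed: uniform draw over the grid
        a, b = fits[tau]
        synthetic.append(a + b * x)
    return synthetic


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy heteroskedastic data: the noise scale grows with x (assumed setup).
    xs = [float(x) for x in range(1, 31)]
    ys = [2.0 + 0.5 * x + 0.2 * x * rng.gauss(0.0, 1.0) for x in xs]
    levels = [k / 10 for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9
    for x, y, s in list(zip(xs, ys, synthesize(xs, ys, levels, rng)))[:5]:
        print(f"x={x:4.1f}  original={y:7.3f}  synthetic={s:7.3f}")
```

Because each quantile level gets its own slope, the spread of the synthetic values can widen with the predictor, which is the kind of heteroskedastic behavior the abstract is concerned with. A real application would use a dedicated quantile regression solver, a finer set of levels, and explicit utility and disclosure-risk checks.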
