Synthetic Data via Quantile Regression for Heavy-Tailed and Heteroskedastic Data



Abstract

Privacy protection of confidential data is a fundamental problem faced by many government organizations and research centers. It is further complicated when data have complex structures or variables with highly skewed distributions. The statistical community addresses general privacy concerns by introducing different techniques that aim to decrease disclosure risk in released data while retaining their statistical properties. However, methods for complex data structures have received insufficient attention. We propose producing synthetic data via quantile regression to address privacy protection of heavy-tailed and heteroskedastic data. We address some shortcomings of the previously proposed use of quantile regression as a synthesis method and extend the work into cases where data have heavy tails or heteroskedastic errors. Using a simulation study and two applications, we show that there are settings where quantile regression performs as well as or better than other commonly used synthesis methods on the basis of maintaining good data utility while simultaneously decreasing disclosure risk.
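The idea of synthesizing a confidential variable from fitted conditional quantiles can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not describe in detail. It assumes a single predictor, a fixed grid of quantile levels, simulated data with Student-t noise whose scale grows with the predictor, and a brute-force fitter written only for illustration. All function names and parameter values here are hypothetical.

```python
import math
import random


def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fit_quantile_line(x, y, tau):
    """Fit y ~ a + b*x at quantile level tau by exhaustive search.

    With one predictor, some optimal quantile-regression line passes
    through two observations, so trying every pair is exact. The cost is
    O(n^3), which is acceptable only for small illustrative samples.
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # a vertical line is not a valid fit
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = sum(pinball_loss(yk - (a + b * xk), tau)
                       for xk, yk in zip(x, y))
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2]


def synthesize(x, y, taus, rng):
    """Replace each y with a prediction from a randomly chosen quantile fit."""
    fits = {tau: fit_quantile_line(x, y, tau) for tau in taus}
    synthetic = []
    for xk in x:
        a, b = fits[rng.choice(taus)]  # random quantile level per record
        synthetic.append(a + b * xk)
    return synthetic


# Simulated confidential data (an assumption, not data from the paper):
# the noise is Student-t with 3 degrees of freedom, scaled by x.
rng = random.Random(0)
x = [rng.uniform(1.0, 10.0) for _ in range(40)]
y = []
for xk in x:
    t3 = rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(1.5, 2.0) / 3.0)
    y.append(2.0 + 1.5 * xk + 0.3 * xk * t3)

taus = [k / 10 for k in range(1, 10)]  # quantile levels 0.1, ..., 0.9
synthetic = synthesize(x, y, taus, rng)

for xk, yk, sk in list(zip(x, y, synthetic))[:3]:
    print(f"x={xk:.2f}  original={yk:.2f}  synthetic={sk:.2f}")
```

Because each quantile level gets its own slope, the spread of the synthetic values can widen with the predictor, which is the property that makes quantile-based synthesis attractive for heteroskedastic data. In practice a library quantile-regression routine would replace the exhaustive search.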


Bibliographic record

Source: UNESCO Chair in Data Privacy International Conference on Privacy in Statistical Databases, 2018, pp. 92-108 (17 pages)
Authors: Michelle Pistner; Aleksandra Slavkovic; Lars Vilhuber
Original format: PDF

Similar documents

1. Denoising Heavy-Tailed Data Using a Novel Robust Non-Parametric Method Based on Quantile Regression [J]. Gorji Ferdos, Aminghafari Mina. Fluctuation and Noise Letters: FNL: An Interdisciplinary Scientific Journal on Random Processes in Physical, Biological and Technological Systems. 2017, issue 3
2. Kernel-based testing with skewed and heavy-tailed data: Evidence from a nonparametric test for heteroskedasticity [J]. Henderson Daniel J., Sheehan Alice. Economics Letters. 2018, issue nova
3. On the quantile regression when the regressor is functional: Spatial data case [Sur la régression quantile pour variable explicative fonctionnelle: Cas des données spatiales] [J]. Dabo-Niang S., Kaid Z., Laksaci A. Comptes rendus. Mathematique. 2011, issue 23a24
4. Synthetic Data via Quantile Regression for Heavy-Tailed and Heteroskedastic Data [C]. Michelle Pistner, Aleksandra Slavkovic, Lars Vilhuber. International Conference on Privacy in Statistical Databases. 2018
5. Semiparametric Quantile Regression and Applications to Healthcare Data Analysis [D]. Maidman, Adam. 2018
6. Heterogeneity in the association between youth unemployment and mental health later in life: a quantile regression analysis of longitudinal data from English schoolchildren [O]. Liam Wright, Jenny Head, Stephen Jivraj. 2021
7. Synthetic data methods using quantile regression and hot deck with rank swapping [O]. Larsen Michael, Huckett Jennifer. 2009
