International Conference on Software Process Improvement

Measuring data privacy preserving and machine learning



Abstract

The increasing publication of large amounts of theoretically anonymous data can lead to a number of attacks on the privacy of individuals. Publishing sensitive data without exposing its owners is generally not among software developers' concerns. Data privacy-preserving regulations create an appropriate scenario for focusing on privacy from the perspective of the use and exploration of data within an organization. The growing number of sanctions for privacy violations motivates a systematic comparison of three well-known machine learning algorithms in order to measure the usefulness of privacy-preserved data. The scope of the evaluation is extended by comparing them against a known privacy preservation metric, under different parameter scenarios and privacy levels. The use of publicly available implementations, together with the presentation of the methodology and the explanation of the experiments and analysis, provides a working framework for the problem of privacy preservation. Problems are identified in measuring the usefulness of the data and in its relationship with privacy preservation. The findings motivate the need for metrics optimized for the privacy preferences of data owners, since the risk of predicting sensitive attributes by means of machine learning techniques is not usually eliminated. In addition, it is shown that complete privacy preservation may exist but cannot be measured, while the machine learning models of interest to the data-publishing organization must still maintain adequate performance.
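The abstract does not name the "known privacy preservation metric" it compares against; k-anonymity is a common choice in this line of work. As an illustrative sketch only (the function name, record layout, and the choice of k-anonymity are assumptions, not the paper's stated method), the metric can be computed as the size of the smallest equivalence class over the quasi-identifier attributes of the published records:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class over the quasi-identifiers.

    A dataset is k-anonymous if every combination of quasi-identifier
    values is shared by at least k records, so the returned value is
    the largest k for which the dataset is k-anonymous.
    """
    classes = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return min(classes.values())

# Hypothetical generalized records (age bracketed, ZIP code truncated).
published = [
    {"age": "30-39", "zip": "123**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "123**", "diagnosis": "cold"},
    {"age": "40-49", "zip": "124**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "124**", "diagnosis": "asthma"},
]

print(k_anonymity(published, ["age", "zip"]))  # → 2
```

Raising the generalization level increases k (more privacy) but typically lowers the utility that the compared classifiers can extract from the data, which is the privacy-utility trade-off the paper sets out to measure.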
