IEEE International Conference on Grey Systems and Intelligent Services

Grey maximum distance to average vector based on quasi identifier attribute


Abstract

We live in the big data era, surrounded by many kinds of data. Interactions between data sources are increasingly frequent, which raises personal privacy issues. In practice, attackers can infer the identity of the user behind sensitive information by aggregating data from other sources. Privacy leakage inconveniences personal life and can even lead to loss of property and threats to personal safety. How to protect private information effectively while it is exchanged and shared has therefore become a hot research topic. k-anonymity is an effective privacy-preserving method: the k-anonymity model prevents personal identity from being directly identified, which makes it difficult to determine the owner of sensitive information. However, k-anonymity places no constraint on the distribution of sensitive values within equivalence classes, which leaves it vulnerable to homogeneity attacks and background knowledge attacks and can leak some sensitive attribute values. MDAV (Maximum Distance to Average Vector) is one of the algorithms for the k-anonymity model. MDAV uses Euclidean distance to measure the homogeneity between records, which treats all attributes equally and masks the differing importance of each attribute. In practice, however, attributes differ in importance and need to be treated differently. The importance of attributes in MDAV affects the risk of privacy disclosure. The existing literature mainly relies on (I) subjective measurement of attribute importance, or (II) Euclidean distance as the homogeneity measure, which treats each attribute equally. The subjective approach is hard to realize in practice, and ignoring the differing importance of attributes degrades the effectiveness of MDAV.
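The MDAV grouping step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes purely numeric quasi-identifier attributes, and the function name `mdav` and the optional `weights` argument are hypothetical. With `weights=None` every attribute counts equally, which is the plain Euclidean behaviour the abstract criticises.

```python
import numpy as np

def mdav(records, k, weights=None):
    """Partition records into groups of at least k (MDAV-style heuristic).

    `weights` is a hypothetical per-attribute importance vector;
    None means plain (unweighted) Euclidean distance.
    """
    X = np.asarray(records, dtype=float)
    n, d = X.shape
    if n < k:
        raise ValueError("need at least k records")
    w = np.ones(d) if weights is None else np.asarray(weights, dtype=float)

    remaining = np.arange(n)  # indices not yet assigned to a group
    groups = []

    def dist(points, ref):
        # (Weighted) Euclidean distance from each row of `points` to `ref`.
        return np.sqrt((w * (points - ref) ** 2).sum(axis=1))

    def take_group(center_idx):
        # Group the record at center_idx with its k-1 nearest unassigned records.
        nonlocal remaining
        order = np.argsort(dist(X[remaining], X[center_idx]), kind="stable")
        nearest = remaining[order[:k]]
        groups.append(nearest.tolist())
        remaining = np.setdiff1d(remaining, nearest)

    def farthest_from(ref):
        return remaining[np.argmax(dist(X[remaining], ref))]

    while len(remaining) >= 3 * k:
        r = farthest_from(X[remaining].mean(axis=0))  # farthest from centroid
        take_group(r)
        # s is chosen after r's group is removed, so it is always unassigned;
        # apart from ties this matches picking s before grouping.
        s = farthest_from(X[r])
        take_group(s)

    if len(remaining) >= 2 * k:
        take_group(farthest_from(X[remaining].mean(axis=0)))

    # Whatever is left (between k and 2k-1 records) forms the last group.
    groups.append(remaining.tolist())
    return groups
```

For example, calling `mdav(data, k=3)` on a numeric array `data` returns a list of index groups, each containing at least three records. Passing a non-uniform `weights` vector changes which records count as near, which is the kind of attribute-importance adjustment the abstract argues for.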
In this paper, from the perspective of quasi-identifier attributes, grey relational analysis is introduced to improve the distance measure, and a novel GMDAV (Grey Maximum Distance to Average Vector) algorithm is proposed for k-anonymity. For the distance between tuples, considering both the importance of the quasi-identifier attributes and the similarity of the important attributes, a comprehensive measure is proposed: a weighted Euclidean distance in which grey relational analysis determines the importance of attributes. The information loss of GMDAV likewise needs to be evaluated according to attribute importance. MDAV is usually evaluated with the IL model, which treats the loss of all attributes equally and therefore cannot test the validity of GMDAV. Building on the IL evaluation model and taking attribute importance into account, an attribute information loss model based on grey weights (AIL) is put forward. Finally, experiments were conducted on three classical datasets, Tarragona, Census and EIA, with AIL and DLD (Distance Linked Disclosure) adopted for algorithm evaluation. On all three datasets, under AIL evaluation, GMDAV achieves better information loss than MDAV, with the increase of data amount in the datasets (Tarragona
