首页> 中文期刊> 《计算机学报》 >一种满足差分隐私的轨迹数据发布方法

一种满足差分隐私的轨迹数据发布方法

         

摘要

Trajectory data have rich spatiotemporal information,which may cause users' personal privacy leakage.In order to avoid this,privacy-preserving techniques are required before the publication of trajectory data.Most of the existing trajectory privacy-preserving techniques are based on the k-anonymity model,which has two main drawbacks:one is low privacy guarantee;the other is heavily dependent on the background knowledge that the attackers know.In recent years,researchers proposed differential privacy,which is known as the strongest privacy-preserving model,and it is quite good at publication of statistical information.While,publication of statistical information may also cause leakage of users' location privacy.At first,we propose two attack models in publication of count values of location samples,called sparse location attack and maximum moving speed attack.Then we propose two trajectory data publication methods under differential privacy:in free space,we propose a differentially private trajectory data publication method based on noisy quad-tree.The privacy budget is divided according to the level of the quad-tree;Laplace noise is added into each level's count value of moving object.In road network space,we propose a differentially private trajectory data publication method base on noisy R-tree.While R-tree is used to index road segment,privacy budget is divided and Laplace noise is added to the count values of each road segment or the minimum bounding rectangles of road segment.Both method spublish noisy count values of location data of each timestamp,which form a sequence of count values.Differential privacy algorithms have higher privacy guarantee than traditional k-anonymity methods.Actually,published trajectory data is a location count value sequence of multiple consequent timestamps.Since the main idea of differential privacy is to add Laplace noise into original data,and the added Laplace noise are independent at each timestamp,it may cause data inconsistency,which may reduce the utility of the published data.The data inconsistency happens when publication of location data of two or more consequent timestamps.We propose a heuristic algorithm to solve this problem.This algorithm is an extreme value problem under certain constraints.The constraints we conclude are the maximum and minimum numbers of each area or road segment,according to the moving speed,area size or the length of a road segment.The extreme value function is the L2 distance between the noisy count value sequence and the consistency count value sequence,we may find a consistent count value sequence which is most close to the noisy data.At last,we conduct a set of experiments on two data sets,one is generated location data under Gaussian distribution,and the other is generated data under Oldenburg city road network constraints.We measure data utility and runtime of each algorithm before and after consistency procedure,the results show that the consistency algorithm may improve the data utility about 200% in average,which shows the effectiveness of the consistency algorithm.We also test runtime of each algorithm,with the incensement of the data set size,run time increases linearly,which means the algorithms are easy to extend.%移动对象的轨迹数据包含丰富的时空信息,发布前需进行隐私保护处理以防止个人隐私信息的泄露.目前已有的隐私保护算法多以k-匿名模型为基础,这类方法提供的隐私保护度不够,且隐私保护度强弱与背景知识高度相关.近年来出现的差分隐私技术是一种与背景知识无关的强隐私保护模型,针对发布数据进行统计查询的误差率可控.然而,针对统计信息的查询仍可能造成移动对象隐私的泄露,针对此问题,该文首先提出了两种攻击模型:稀疏位置攻击和最大运行速度攻击.然后,提出两种满足差分隐私的轨迹数据发布方法:在自由空间中,采用基于噪音四分树的轨迹数据发布方法,分别发布每个时刻的噪音数据,按噪音四分树的层次分割隐私预算,对每个区域中的移动对象计数值添加噪音;路网空间中采用基于噪音R-树的轨迹数据发布方法,用R-树索引路网中的路段,按层次分割隐私预算,对路段中的移动对象计数值添加噪音.在空间范围计数查询上,上述两种方法比k-匿名模型的隐私保护度更高.差分隐私的基础是在原始数据中添加噪音,添加的独立噪音可能导致数据不一致问题.该文提出了一种基于移动对象最大运行速度的一致性处理算法.最后,该文在模拟数据集上对数据可用性和算法运行时间进行了实验,实验结果表明该文提出的算法具有良好的性能.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号