您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(工学版)》

山东大学学报(工学版) ›› 2015, Vol. 45 ›› Issue (1): 24-29.doi: 10.6040/j.issn.1672-3961.2.2014.120

• 机器学习与数据挖掘 • 上一篇    下一篇

基于kd-树的差分隐私二维空间数据划分发布方法

黄泗勇, 陈婷婷, 卢清, 吴英杰, 叶少珍   

  1. 福州大学数学与计算机科学学院, 福建 福州 350108
  • 收稿日期:2014-05-23 修回日期:2014-11-20 发布日期:2014-05-23
  • 通讯作者: 吴英杰(1979-),男,福建泉州人,副教授,博士,主要研究方向为数据挖掘与数据安全隐私保护等.E-mail:yjwu@fzu.edu.cn E-mail:yjwu@fzu.edu.cn
  • 作者简介:黄泗勇(1989-),男,福建漳州人,硕士研究生,主要研究方向为数据安全隐私保护等.E-mail:siyong.huang@gmail.com
  • 基金资助:
    国家自然科学基金资助项目(61300026);福州大学科技发展基金资助项目(2012-XQ-27)

Differentially privacy two-dimensional dataset partitioning publication algorithm based on kd-tree

HUANG Siyong, CHEN Tingting, LU Qing, WU Yingjie, YE Shaozhen   

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China
  • Received:2014-05-23 Revised:2014-11-20 Published:2014-05-23

摘要: 为解决现有基于网格结构的差分隐私二维空间数据划分发布方法可能引起局部划分过细导致查询精度低的问题,提出了基于kd-树的差分隐私二维空间数据划分发布方法—kd-PPDP算法(differentially privacy partitioning publication algorithm based on kd-tree)。算法采用了kd-树算法思想,通过启发式地识别网格化后数据分布情况并合并相邻近似网格单元来防止局部划分过细问题,从而减少所添加的噪声,提高查询精度。通过实验对比分析了kd-PPDP算法与现有基于网格结构的划分发布方法的查询误差以及时间效率,结果表明了该算法的有效性和可行性。

关键词: kd-树, 差分隐私, 隐私保护, 数据划分发布, 二维空间数据

Abstract: The existing differentially privacy two-dimensional dataset partitioning publication approaches might result in over-partitioning of certain local regions thus lower the query accuracy. To solve this problem, a new differentially privacy partitioning publication algorithm based on kd-tree for two-dimensional dataset (kd-PPDP) was proposed.To reduce the noise generated by the Laplace mechanism and improve the accuracy of query, the new approach which was inspired by the core thought of kd-tree took the gridding data distribution into account and merged adjacent grid units with similar information heuristically to prevent local over-partitioning. The new approach was compared with the existing differentially privacy partitioning publication approachs based on grid structure in terms of query error and time complexity. Experimental results showed that the approach was feasible and effective.

Key words: two-dimensional dataset, privacy preserving, data partitioning publication, differential privacy, kd-tree

中图分类号: 

  • TP311.13
[1] 周水庚,李丰,陶宇飞,等.面向数据库应用的隐私保护研究综述[J]. 计算机学报, 2009, 32(5):847-858. ZHOU Shuigeng, LI Feng, TAO Yufei, et al. Privacy preservation in database applications:A survey[J]. Chinese Journal of Computers, 2009, 32(5):847-858.
[2] FUNG BCM, WANG Ke, CHEN Rui, et al. Privacy-preserving data publishing:A survey on recent developments[J]. ACM Computing Surveys, 2010, 42(4):1-53.
[3] DWORK C. Differential privacy[C]//Proc of the 33rd Int Colloquium on Automata, Languages and Programming. Berlin:Springer, 2006:1-12.
[4] DWORK C, MCSHERRY F, NISSIM F K, et al. Calibrating noise to sensitivity in private data analysis[C]//Proceedings of the 3rd Theory of Cryptography Conference (TCC). New York, USA:Springer Berlin Heidelberg, 2006:363-385.
[5] SWEENEY L. k-anonymity:A model for protecting privacy[J]. International Journal on Uncertainty, Fuzziness and Knowledge Based Systems, 2002, 10(5):557-570.
[6] MACHANAVAJJHALA A, GEHRKE J, KIFER D, et al. l-diversity:Privacy beyond k-anonymity[C]//Proceedings of IEEE ICDE'06. Piscataway, NJ:IEEE, 2006:24-35.
[7] LI Ninghui, LI Tiancheng. t-closeness:Privacy beyond k-anonymity and l-diversity[C]//Proceedings of IEEE ICDE'07. Piscataway, NJ:IEEE, 2007:106-115.
[8] CHEN R, MOHAMMED N, FUNG B C M, et al. Publishing set-valued data via differential privacy[J]. Proceedings of the VLDB Endowment, 2011, 4(11):1087-1098.
[9] CHEN R, ACS G, CASTELLUCCIA C. Differentially private sequential data publication via variable-length n-grams[C]//Proceedings of the 2012 ACM conference on Computer and communications security.Raleigh, NC, USA:ACM, 2012:638-649.
[10] CHEN R, FUNG B, DESAI B C, et al. Differentially private transit data publication:A case study on the montreal transportation system[C]//Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. Beijing, China:ACM, 2012:213-221.
[11] CORMODE G, PROCOPIUC C, SRIVASTAVA D, et al. Differentially private spatial decompositions[C]//Proceedings of IEEE 28th International Conference on Data Engineering (ICDE), Washington D C. USA:IEEE, 2012:20-31.
[12] QARDAJI W, YANG W, LI N. Differentially private grids for geospatial data[C]//Proceedings of IEEE 29th International Conference on Data Engineering (ICDE). Brisbane, Australia:IEEE, 2013:757-768.
[13] HAY M, RASTOGI V, MIKLAU G, et al. Boosting the accuracy of differentially private histograms through consistency[J]. Proceedings of the VLDB Endowment, 2010, 3(1-2):1021-1032.
[14] McSherry F D. Privacy integrated queries:an extensible platform for privacy-preserving data analysis[C]//Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. Providence, Rhode Island, USA:ACM, 2009:19-30.
[1] 孙岚,罗钊,吴英杰,王一蕾. 面向路网限制的位置隐私保护算法[J]. 山东大学学报(工学版), 2012, 42(5): 96-101.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!