Sensitive Data Privacy Protection Method Based on Transfer Learning
  • English title: Sensitive Data Privacy Protection Method Based on Transfer Learning
  • Authors: Fu Yuxiang; Qin Yongbin; Shen Guowei
  • Affiliations: College of Computer Science and Technology, Guizhou University; Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University
  • Keywords: differential privacy; transfer learning; model attack; sensitive data; privacy protection
  • Chinese journal code: SJCJ
  • English journal title: Journal of Data Acquisition and Processing
  • Publication date: 2019-05-15
  • Publisher: 数据采集与处理 (Journal of Data Acquisition and Processing)
  • Year: 2019
  • Issue: v.34; No.155
  • Funding: Major Research Plan of the National Natural Science Foundation of China (91746116); Guizhou Province Major Applied Basic Research Program (Qiankehe JZ [2014] 2001); Guizhou Province Major Science and Technology Program (Qiankehe Major Project [2017] 3002)
  • Language: Chinese
  • Record No.: SJCJ201903006
  • Page count: 10
  • Pages: 54-63
  • CN: 32-1367/TN
Abstract
Machine learning models can implicitly memorize sensitive data, and under model attacks such as model queries or model inspection they may leak users' private information. To address this problem, this paper proposes PATE-T, a "master-disciple" (teacher-student) model for sensitive data privacy protection that provides a strong privacy guarantee for the training data of machine learning models. The method combines, in a black-box fashion, multiple "master" models trained on disjoint subsets of the sensitive data; these masters depend directly on the sensitive training data. The "disciple" is obtained from the master ensemble by transfer learning: it cannot directly access the masters or their underlying parameters, and its data domain is different from, but related to, the sensitive training data domain. In terms of differential privacy, an attacker may query the disciple and even inspect its internal workings, yet cannot recover private information about the training data. Experiments on the MNIST and SVHN datasets show that the proposed privacy protection model achieves a favorable privacy/utility trade-off with superior performance.
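The aggregation step the abstract describes (the masters vote on each query, and only a noise-protected result ever reaches the disciple) can be sketched as PATE-style noisy-max aggregation with Laplace noise. This is a minimal illustration under stated assumptions, not the authors' implementation; the function name and parameters are hypothetical.

```python
import numpy as np

def noisy_max_aggregate(teacher_votes, num_classes, epsilon, rng=None):
    """Return the noisy winning label for one query.

    teacher_votes: per-teacher predicted labels; each teacher ("master")
    was trained on a disjoint partition of the sensitive data. The
    student ("disciple") only ever sees the noisy argmax, never the
    teachers or their parameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Count votes per class, then perturb the counts with Laplace noise
    # of scale 1/epsilon before taking the argmax.
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += rng.laplace(loc=0.0, scale=1.0 / epsilon, size=num_classes)
    return int(np.argmax(counts))

# Usage: five teachers vote on one query; the disciple trains on the noisy label.
votes = [3, 3, 3, 7, 3]
label = noisy_max_aggregate(votes, num_classes=10, epsilon=0.5)
```

A smaller epsilon injects more noise, strengthening the differential-privacy guarantee at the cost of occasionally returning a wrong label, which is the privacy/utility trade-off the experiments measure.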
References
[1] Shokri R, Stronati M, Song C, et al. Membership inference attacks against machine learning models[C]//2017 IEEE Symposium on Security and Privacy (SP). [S.l.]: IEEE, 2017: 160-176.
[2] Fredrikson M, Jha S, Ristenpart T. Model inversion attacks that exploit confidence information and basic countermeasures[C]//Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. [S.l.]: ACM, 2015: 89-101.
[3] Zhang C, Bengio S, Hardt M, et al. Understanding deep learning requires rethinking generalization[C]//International Conference on Learning Representations (ICLR). Toulon, France: [s.n.], 2017: 262-277.
[4] Abadi M, Chu A, Goodfellow I, et al. Deep learning with differential privacy[C]//Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. [S.l.]: ACM, 2016: 308-318.
[5] Haeusser P, Frerix T, Mordvintsev A, et al. Associative domain adaptation[C]//International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017: 2765-2773.
[6] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[7] Netzer Y, Wang T, Coates A, et al. Reading digits in natural images with unsupervised feature learning[C]//NIPS Workshop on Deep Learning and Unsupervised Feature Learning. Whistler, BC, Canada: NIPS, 2011: 5.
[8] Sweeney L. K-anonymity: A model for protecting privacy[J]. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002, 10(5): 557-570.
[9] Dwork C, Roth A. The algorithmic foundations of differential privacy[J]. Foundations and Trends in Theoretical Computer Science, 2014, 9(3/4): 211-407.
[10] Dwork C, McSherry F, Nissim K, et al. Calibrating noise to sensitivity in private data analysis[C]//Theory of Cryptography Conference. Berlin, Heidelberg: Springer, 2006: 265-284.
[11] Shokri R, Shmatikov V. Privacy-preserving deep learning[C]//Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. [S.l.]: ACM, 2015: 1310-1321.
[12] Pathak M, Rane S, Raj B. Multiparty differential privacy via aggregation of locally trained classifiers[C]//Advances in Neural Information Processing Systems. Vancouver, British Columbia, Canada: NIPS, 2010: 1876-1884.
[13] Hamm J, Cao Y, Belkin M. Learning privately from multiparty data[C]//International Conference on Machine Learning. New York, USA: [s.n.], 2016: 555-563.
[14] Jagannathan G, Monteleoni C, Pillaipakkamnatt K. A semi-supervised learning approach to differential privacy[C]//IEEE 13th International Conference on Data Mining Workshops (ICDMW). [S.l.]: IEEE, 2013: 841-848.
[15] Papernot N, Abadi M, Erlingsson U, et al. Semi-supervised knowledge transfer for deep learning from private training data[C]//International Conference on Learning Representations. San Juan, Puerto Rico: [s.n.], 2016: 202-218.
[16] Dietterich T G. Ensemble methods in machine learning[C]//International Workshop on Multiple Classifier Systems. Berlin, Heidelberg: Springer, 2000: 1-15.
[17] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in Neural Information Processing Systems. Montreal, Quebec, Canada: NIPS, 2014: 2672-2680.
[18] Chapelle O, Scholkopf B, Zien A. Semi-supervised learning[J]. IEEE Transactions on Neural Networks, 2009, 20(3): 542.
[19] Pan S J, Tsang I W, Kwok J T, et al. Domain adaptation via transfer component analysis[J]. IEEE Transactions on Neural Networks, 2011, 22(2): 199-210.
[20] Gretton A, Borgwardt K M, Rasch M J, et al. A kernel two-sample test[J]. Journal of Machine Learning Research, 2012, 13: 723-773.
[21] Long M, Wang J, Ding G, et al. Transfer feature learning with joint distribution adaptation[C]//2013 IEEE International Conference on Computer Vision (ICCV). [S.l.]: IEEE, 2013: 2200-2207.
[22] Long M, Cao Y, Wang J, et al. Learning transferable features with deep adaptation networks[C]//International Conference on Machine Learning (ICML). Lille, France: [s.n.], 2015: 97-105.
[23] Long M, Zhu H, Wang J, et al. Deep transfer learning with joint adaptation networks[C]//International Conference on Machine Learning. Sydney, NSW, Australia: [s.n.], 2017: 2208-2217.
[24] Ganin Y, Ustinova E, Ajakan H, et al. Domain-adversarial training of neural networks[J]. The Journal of Machine Learning Research, 2016, 17(1): 2096-2030.
[25] Tan B, Zhang Y, Pan S J, et al. Distant domain transfer learning[C]//Thirty-First AAAI Conference on Artificial Intelligence. San Francisco, California, USA: AAAI, 2017: 2604-2610.
[26] Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 2223-2232.
[27] Ben-David S, Blitzer J, Crammer K, et al. A theory of learning from different domains[J]. Machine Learning, 2010, 79(1/2): 151-175.
[28] Haeusser P, Mordvintsev A, Cremers D. Learning by association: A versatile semi-supervised training method for neural networks[C]//Proc IEEE Conf on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE, 2017: 89-98.
[29] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems. Lake Tahoe, Nevada, USA: NIPS, 2012: 1097-1105.
[30] Ganin Y, Ustinova E, Ajakan H, et al. Domain-adversarial training of neural networks[J]. The Journal of Machine Learning Research, 2016, 17(1): 2096-2030.
[31] Arbelaez P, Maire M, Fowlkes C, et al. Contour detection and hierarchical image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(5): 898-916.
