加密云数据下基于Simhash的模糊排序搜索方案
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:Ranked Fuzzy Keyword Search Based on Simhash over Encrypted Cloud Data
  • 作者:杨旸 ; 杨书略 ; 柯闽
  • 英文作者:YANG Yang;YANG Shu-Lue;KE Min;College of Mathematics and Computer Science,Fuzhou University;Key Lab of Information Security of Network System in Fujian Province,Fuzhou University;College of Physics and Information Engineering,Fuzhou University;
  • 关键词:云计算 ; 加密云数据 ; 隐私保护 ; 可搜索加密 ; 模糊排序搜索 ; Simhash
  • 英文关键词:cloud computing;;encrypted cloud data;;privacy-preserving;;searchable encryption;;ranked fuzzy keyword search;;Simhash
  • 中文刊名:JSJX
  • 英文刊名:Chinese Journal of Computers
  • 机构:福州大学数学与计算机科学学院;网络系统信息安全福建省高校重点实验室;福州大学物理与信息工程学院;
  • 出版日期:2016-09-19 17:39
  • 出版单位:计算机学报
  • 年:2017
  • 期:v.40;No.410
  • 基金:国家自然科学基金(61402112,61472307,61472309,61303198);; 福建省教育厅科技项目(JA12028);; 福建省重大科技项目(2015H6013);; 福州大学科技发展基金项目(2012-XY-17)资助~~
  • 语种:中文;
  • 页:JSJX201702010
  • 页数:14
  • CN:02
  • ISSN:11-1826/TP
  • 分类号:161-174
摘要
为了保护数据隐私,数据拥有者会将敏感数据的密文外包到云服务器,这使得传统明文搜索技术难以使用.因此可搜索加密技术被用于对密文数据进行搜索,实现高效的数据利用.然而目前在加密云数据中,关键词模糊搜索方案主要是通过构造关键词模糊集合来实现,其需要大量的计算和存储开销.本文提出的搜索方案,无需构造关键词模糊集合,而是基于Simhash的降维思想,将文档关键词做n-gram处理并得到Simhash指纹来实现模糊搜索.该文结合汉明距离和关键词相关度分数,设计了双因子排序算法对查询结果进行排序.使用树索引结构和新型遍历方法进一步提高了搜索效率.通过新型遍历方法,即使树的节点值与期望值不相等,也能够对树进行遍历.理论分析和实验结果表明:该方案实现了加密云数据下的关键词模糊搜索,同时极大地节约了时间和空间成本.
        With the development of cloud computing,data owners are motivated to outsource their data and the corresponding complex management tasks to the public cloud for convenience and economic savings.In order to protect data privacy,data owners prefer to outsource their sensitive data in an encrypted form to the cloud,which makes the traditional search techniques useless.Searchable encryption is a technique to search on encrypted data without decryption to realize efficient data utilization.There have been some studies on secure searching over encrypted cloud data,which pay attention to both privacy and practicability of data.However,most of them are based on accurate keyword matching.The fuzzy keyword search problem remains unsolved.Up to date,the existing construction of fuzzy keyword search schemes has to build fuzzy keyword set.It will lead to tremendous computation and storage overheads.In this paper,we propose a new scheme without constructing fuzzy keyword set.Based on the idea of dimension reduction of Simhash,each keyword is transformed to a Simhash fingerprint by n-gram method to achieve fuzzy matching.Combining the hamming distance and keyword relevance score,we design a double factor ranking algorithm to sort the results accurately.In addition,tree structure and a novel traversal method are utilized to further improve the efficiency of our proposed scheme.Thetree can be traversed even if the value of the tree node is not equal to the expected value by the proposed traversal method.Theoretical analysis and experimental results show that the scheme realizes the ranked fuzzy keyword search over encrypted cloud data.Meanwhile,the computation and storage overheads are greatly reduced.
引文
[1]Mell P,Grance T.The NIST definition of cloud computing.Communications of the ACM,2010,53(6):50
    [2]Song D X,Wagner D,Perrig A.Practical techniques for searches on encrypted data//Proceedings of the IEEE Symposium on Security and Privacy.Oakland,USA,2000:44-55
    [3]Goh E J.Secure indexes.Cryptology ePrint Archive,2003:216-235
    [4]Chang Y C,Mitzenmacher M.Privacy preserving keyword searches on remote encrypted data//Proceedings of the Applied Cryptography and Network Security.New York,USA,2005:442-455
    [5]Curtmola R,Garay J,Kamara S,et al.Searchable symmetric encryption:Improved definitions and efficient constructions.Journal of Computer Security,2011,19(5):895-934
    [6]Wang C,Cao N,Ren K,et al.Enabling secure and efficient ranked keyword search over outsourced cloud data.IEEETransactions on Parallel and Distributed Systems,2012,23(8):1467-1479
    [7]Wang C,Cao N,Li J,et al.Secure ranked keyword search over encrypted cloud data//Proceedings of the IEEE International Conference on Distributed Computing Systems(ICDCS).Genova,Italy,2010:253-262
    [8]Cao N,Wang C,Li M,et al.Privacy-preserving multikeyword ranked search over encrypted cloud data.IEEETransactions on Parallel and Distributed Systems,2014,25(1):829-837
    [9]Boneh D,Crescenzo G D,et al.Public key encryption with keyword search//Proceedings of the Eurocrypt.Interlaken,Switzerland,2004:506-522
    [10]Li J,Wang Q,Wang C,et al.Fuzzy keyword search over encrypted data in cloud computing//Proceedings of the IEEEINFOCOM.San Diego,USA,2010:1-5
    [11]Li J,Wang Q,Wang C,et al.Enabling efficient fuzzy keyword search over encrypted data in cloud computing.Cryptology ePrint Archive,2009:593
    [12]Liu C,Zhu L,Li L,et al.Fuzzy keyword search on encrypted cloud storage data with small index//Proceedings of the 2011IEEE International Conference on Cloud Computing and Intelligence Systems(CCIS).Beijing,China,2011:269-273
    [13]Wang C,Ren K,Yu S,et al.Achieving usable and privacyassured similarity search over outsourced cloud data//Proceedings of the IEEE INFOCOM.Orlando,USA,2012:451-459
    [14]Wang J,Ma H,Tang Q,et al.Efficient verifiable fuzzy keyword search over encrypted data in cloud computing.Computer Science&Information Systems,2013,10(2):667-684
    [15]Jin L I,Chen X.Efficient multi-user keyword search over encrypted data in cloud computing.Computing&Informatics,2013,32(4):723-738
    [16]Yu C M,Chen C Y,Chao H C.Privacy-preserving multikeyword similarity search over outsourced cloud data.IEEESystems Journal,2015,(99):1-10
    [17]Charikar M S.Similarity estimation techniques from rounding algorithms//Proceedings of the 34th Annual ACM Symposium on Theory of Computing.New York,USA,2002:380-388
    [18]Manku G S,Jain A,Das Sarma A.Detecting near-duplicates for web crawling//Proceedings of the 16th International Conference on World Wide Web.New York,USA,2007:141-150
    [19]Buyrukbilen S,Bakiras S.Secure similar document detection with Simhash//Proceedings of the 10th Very Large Data Bases Workshop.Trento,Italy,2014:61-75
    [20]Fu Z,Shu J,Wang J,et al.Privacy-preserving smart similarity search based on Simhash over encrypted data in cloud computing.Journal of Internet Technology,2015,16(3):458
    [21]Ristad E S,Yianilos P N.Learning string-edit distance.IEEETransactions on Pattern Analysis and Machine Intelligence,1998,20(5):522-532
    [22]Manning C D,Raghavan P,Schütze H.Introduction to Information Retrieval.Cambridge:Cambridge University Press,2008
    [23]Agrawal R,Kiernan J,Srikant R,et al.Order-preserving encryption for numeric data//Proceedings of the 2004 ACMSIGMOD International Conference on Management of Data.New York,USA,2004:563-574
    [24]Liu D,Wang S.Nonlinear order preserving index for encrypted database query in service cloud environments.Concurrency and Computation:Practice and Experience,2013,25(13):1967-1984
    [25]Bellare M,Boldyreva A,O’Neill A.Deterministic and efficiently searchable encryption//Proceedings of the 27th Annual International Cryptology Conference on Advances in Cryptology.Santa Barbara,USA,2007:535-552

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700