用户名: 密码: 验证码:
面向多属性的不等值连接操作算法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:Non-Equi Join Operation Algorithm for Multi-attribute
  • 作者:孟庆强 ; 何浩奇 ; 毕倪飞 ; 赵斌 ; 吉根林
  • 英文作者:MENG Qingqiang;HE Haoqi;BI Nifei;ZHAO Bin;JI Genlin;NARI Group Corporation(State Grid Electric Power Research Institute);School of Computer Science and Technology,Nanjing Normal University;
  • 关键词:不等值查询 ; 不等值连接 ; 最小候选集 ; 缓存敏感算法 ; 查询处理
  • 英文关键词:non-equi query;;non-equi join;;minimal candidate set;;cache-sensitive algorithm;;query processing
  • 中文刊名:JSJC
  • 英文刊名:Computer Engineering
  • 机构:南瑞集团有限公司(国网电力科学研究院有限公司);南京师范大学计算机科学与技术学院;
  • 出版日期:2018-06-22 10:52
  • 出版单位:计算机工程
  • 年:2019
  • 期:v.45;No.501
  • 基金:国家自然科学基金(41471371)
  • 语种:中文;
  • 页:JSJC201906010
  • 页数:7
  • CN:06
  • ISSN:31-1289/TP
  • 分类号:66-72
摘要
为降低多属性不等值连接操作的计算代价,提出一种基于属性优选的不等值连接操作算法MIEJoin。按照连接属性对元组进行排序,计算各连接属性的候选集大小,在最小候选集中根据连接谓词进行筛选得到最终的结果集。在此基础上,为提升系统的缓存命中率,提出一种缓存敏感的多属性不等值连接算法CMIEJoin。基于MIEJoin算法建立元组的排列顺序数组,在内存中邻近存储连续访问的数据,以降低缓存的缺失次数并提升算法的运行效率。在TPC-H数据集上的实验结果表明,与BIEJoin算法和NLJoin算法相比,CMIEJoin算法具有较高的运行效率。
        To reduce the computational cost of multi-attribute non-equi join operation,an non-equi join operation algorithm MIEJoin based on attribute optimization is proposed.The tuples are sorted by join attribute,the candidate set size of each join attribute is calculated,and the final result set is filtered according to the join predicate in the minimal candidate set.On this basis,a cache-sensitive multi-attribute non-equi join algorithm CMIEJoin is proposed to improve the hit rate of the system cache.Based on the MIEJoin algorithm,the array of sorted tuples is established,and the continuousty accessed data is stored in memory in proximity to reduce the number of cache missings and improve the efficiency of the algorithm.Experimental results on the TPC-H dataset show that CMIEJoin algorithm has higher efficiency than BIEJoin algorithm and NLJoin algorithm.
引文
[1] 张磊,方祝和,周敏奇,等.面向内存计算的连接算法[J].华东师范大学学报(自然科学版),2014(5):180-191.
    [2] KIM C,KALDEWEY T,LEE V W,et al.Sort vs.Hash revisited:fast join implementation on modern multi-core CPUs[J].Proceedings of the VLDB Endowment,2009,2(2):1378-1389.
    [3] ALBUTIU M-C,KEMPER A,NEUMANN T.Massively parallel sort-merge joins in main memory multi-core database systems[J].Proceedings of the VLDB Endowment,2012,5(10):1064-1075.
    [4] BLANAS S,LI Yinan,PATEL J M.Design and evaluation of main memory Hash join algorithms for multi-core CPUs[C]//Proceedings of ACM SIGMOD International Conference on Management of Data.New York,USA:ACM Press,2011:37-48.
    [5] JHA S,HE Bingsheng,LU Mian,et al.Improving main memory hash joins on Intel Xeon Phi processors:an experimental approach[J].Proceedings of the VLDB Endowment,2015,8(6):642-653.
    [6] KALDEWEY T,LOHMAN G M,MüLLER R,et al.GPU join processing revisited[C]//Proceedings of the 8th International Workshop on Data Management.New York,USA:ACM Press,2012:55-62.
    [7] HE Jiong,ZHANG Shuhao,HE Bingsheng.In-cache query co-processing on coupled CPU-GPU architectures[J].Proceedings of the VLDB Endowment,2014,7(4):329-340.
    [8] 冒佳明,王鹏飞,赵然.MapReduce架构下Reduce任务的调度优化[J].无线互联科技,2018,15(22):5-6.
    [9] 宋杰,李甜甜,朱志良,等.MapReduce连接查询的I/O代价研究[J].软件学报,2015,26(6):1438-1456.
    [10] 卞昊穹,陈跃国,杜小勇,等.Spark上的等值连接优化[J].华东师范大学学报(自然科学版),2014(5):263-270.
    [11] SOLOVIEV V.A truncating Hash algorithm for processing band-join queries[C]//Proceedings of the 9th International Conference on Data Engineering.Washington D.C.,USA:IEEE Press,1993:419-427.
    [12] ENDERLE J,HAMPEL M,SEIDL T.Joining interval data in relational databases[C]//Proceedings of ACM SIGMOD International Conference on Management of Data.New York,USA:ACM Press,2004:683-694.
    [13] KHAYYAT Z,LUCIA W,SINGH M,et al.Lightning fast and space efficient inequality joins[J].Proceedings of the VLDB Endowment,2015,8(13):2074-2085.
    [14] STOCKINGER K.Bitmap indices for speeding up high-dimensional data analysis[C]//Proceedings of the 13th International Conference on Database and Expert Systems Applications.Berlin,Germany:Springer,2002:881-889.
    [15] KHAYYAT Z,LUCIA W,SINGH M,et al.Fast and scalable inequality joins[J].The VLDB Journal,2017,26(1):125-150.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700