CinHBa: A Secondary Index with Hotscore Caching Policy on Key-Value Data Store
  • Authors: Wei Ge (22) (24)
    Yihua Huang (22)
    Di Zhao (22)
    Shengmei Luo (23)
    Chunfeng Yuan (22)
    Wenhui Zhou (22)
    Yun Tang (22)
    Juan Zhou (22)
  • Keywords: HBase; secondary index; memory cache; caching policy; key-value store
  • Publication: Lecture Notes in Computer Science
  • Year: 2014
  • Volume: 8933
  • Issue: 1
  • Pages: 602-615
  • Full-text size: 364 KB
  • Author affiliations:
    22. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210046, China
    24. Guangxi Normal University, Guilin, 541000, China
    23. ZTE Corporation, Nanjing, 210012, China
  • ISSN: 1611-3349
Abstract
We are now entering the era of big data. HBase organizes data as key-value pairs and supports fast queries on rowkeys, but queries on non-rowkey columns are a blind spot of HBase. The main topic of this paper is to provide high-performance query capability on non-rowkey columns. An effective secondary index model is proposed, and the prototype system CinHBa is implemented. Furthermore, a novel caching policy, the Hotscore Algorithm, is introduced in CinHBa to cache the hottest index data in memory and improve query performance. Experimental evaluation shows that the query response time of CinHBa is far lower than that of native HBase without a secondary index on 10M records. In addition, CinHBa has good data scalability.
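The general idea in the abstract, a secondary index on a non-rowkey column with the hottest index entries held in memory, can be illustrated with a small sketch. This record does not describe CinHBa's index layout or the Hotscore Algorithm itself, so everything below is an assumption made for illustration: the class name `IndexedStore`, the in-memory dictionaries standing in for HBase tables, and the decayed hit-count score used to decide which index entries stay cached.

```python
from collections import defaultdict


class IndexedStore:
    """Toy key-value table with a secondary index on one non-rowkey column.

    Hypothetical illustration only: the scoring rule (decayed hit counts)
    is a stand-in, not the paper's Hotscore Algorithm.
    """

    def __init__(self, column, cache_size=2, decay=0.9):
        self.column = column
        self.rows = {}                    # rowkey -> row dict (the data table)
        self.index = defaultdict(set)     # column value -> rowkeys (the index table)
        self.cache = {}                   # column value -> rowkeys held in memory
        self.scores = defaultdict(float)  # column value -> hotness score
        self.cache_size = cache_size
        self.decay = decay

    def put(self, rowkey, row):
        """Insert or overwrite a row and keep the index consistent."""
        old = self.rows.get(rowkey)
        if old is not None:
            old_value = old[self.column]
            self.index[old_value].discard(rowkey)
            self.cache.pop(old_value, None)  # drop the stale cached entry
        self.rows[rowkey] = dict(row)
        value = row[self.column]
        self.index[value].add(rowkey)
        self.cache.pop(value, None)          # cached entry is now outdated

    def query(self, value):
        """Return the sorted rowkeys whose indexed column equals value."""
        for v in self.scores:                # age every score
            self.scores[v] *= self.decay
        self.scores[value] += 1.0            # reward the queried value
        if value in self.cache:
            keys = self.cache[value]
        else:
            keys = frozenset(self.index.get(value, ()))
            self._admit(value, keys)
        return sorted(keys)

    def _admit(self, value, keys):
        """Cache the entry only if it is hotter than the coldest cached one."""
        if len(self.cache) >= self.cache_size:
            coldest = min(self.cache, key=lambda v: self.scores[v])
            if self.scores[coldest] >= self.scores[value]:
                return
            del self.cache[coldest]
        self.cache[value] = keys


store = IndexedStore("city", cache_size=1)
store.put("r1", {"city": "Nanjing"})
store.put("r2", {"city": "Guilin"})
store.put("r3", {"city": "Nanjing"})
nanjing_rows = store.query("Nanjing")  # answered from the index, no full scan
```

In this sketch a lookup by column value avoids scanning every row, and a value queried repeatedly keeps its index entry in the cache while a value queried once does not displace it.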