Improved Address-Calculation Coding of Integer Arrays
详细信息    查看全文
  • 作者:Amr Elmasry (12)
    Jyrki Katajainen (1)
    Jukka Teuhola (3)
  • 刊名:Lecture Notes in Computer Science
  • 出版年:2012
  • 出版时间:2012
  • 年:2012
  • 卷:7608
  • 期:1
  • 页码:205-216
  • 全文大小:218.3 KB
  • 参考文献:1. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. SIGMOD Rec. 22(2), 207–216 (1993)
    2. Brisaboa, N.R., Ladra, S., Navarro, G.: Directly Addressable Variable-Length Codes. In: Karlgren, J., Tarhio, J., Hyyr枚, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 122–130. Springer, Heidelberg (2009)
    3. Brodnik, A.: Computation of the least significant set bit. In: Proc. Electrotechnical and Comput. Sci. Conf., vol. B, pp. 7–10 (1993)
    4. Culpepper, J.S., Moffat, A.: Compact Set Representation for Information Retrieval. In: Ziviani, N., Baeza-Yates, R. (eds.) SPIRE 2007. LNCS, vol. 4726, pp. 137–148. Springer, Heidelberg (2007)
    5. Delpratt, O., Rahman, N., Raman, R.: Compressed Prefix Sums. In: van Leeuwen, J., Italiano, G.F., van der Hoek, W., Meinel, C., Sack, H., Pl谩šil, F. (eds.) SOFSEM 2007. LNCS, vol. 4362, pp. 235–247. Springer, Heidelberg (2007)
    6. Fenwick, P.M.: A new data structure for cumulative frequency tables. Software Pract. Exper. 24(3), 327–336 (1994)
    7. Ferragina, P., Venturini, R.: A simple storage scheme for strings achieving entropy bounds. Theoret. Comput. Sci. 372(1), 115–121 (2007)
    8. Gonz谩lez, R., Navarro, G.: Statistical Encoding of Succinct Data Structures. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 294–305. Springer, Heidelberg (2006)
    9. Gupta, A., Hon, W.K., Shah, R., Vitter, J.S.: Compressed data structures: Dictionaries and data-aware measures. Theoret. Comput. Sci. 387(3), 313–331 (2007)
    10. Hagerup, T.: Sorting and Searching on the Word RAM. In: Meinel, C., Morvan, M. (eds.) STACS 1998. LNCS, vol. 1373, pp. 366–398. Springer, Heidelberg (1998)
    11. Jansson, J., Sadakane, K., Sung, W.K.: CRAM: Compressed random access memory. E-print arXiv:1011.1708v2, arXiv.org, Ithaca (2012)
    12. Katajainen, J., Rao, S.S.: A compact data structure for representing a dynamic multiset. Inform. Process. Lett. 110(23), 1061–1066 (2010)
    13. Moffat, A.: Compressing integer sequences and sets. In: Encyclopedia of Algorithms, pp. 178–182. Springer Science+Business Media, LLC, New York (2008)
    14. Moffat, A., Stuiver, L.: Binary interpolative coding for effective index compression. Inform. Retrieval 3(1), 25–47 (2000)
    15. Moffat, A., Zobel, J.: Self-indexing inverted files for fast text retrieval. ACM Trans. Inform. Syst. 14(4), 349–379 (1996)
    16. Raman, R., Raman, V., Rao, S.S.: Succinct Dynamic Data Structures. In: Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 2001. LNCS, vol. 2125, pp. 426–437. Springer, Heidelberg (2001)
    17. Raman, R., Raman, V., Satti, S.R.: Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans. Algorithms 3(4), 43:1–43:25 (2007)
    18. Sadakane, K., Grossi, R.: Squeezing succinct data structures into entropy bounds. In: Proc. 17th Annual ACM-SIAM Symp. on Discrete Algorithms, pp. 1230–1239. ACM/SIAM, New York/Philadelphia (2006)
    19. Teuhola, J.: Interpolative coding of integer sequences supporting log-time random access. Inform. Process. Manag. 47(5), 742–761 (2011)
    20. Transier, F., Sanders, P.: Engineering basic algorithms of an in-memory text search engine. ACM Trans. Inform. Syst. 29(1), 2:1–2:37 (2010)
  • 作者单位:1. Department of Computer Science, University of Copenhagen, Denmark2. Computer and Systems Engineering Department, Alexandria University, Egypt3. Department of Information Technology, University of Turku, Finland
  • ISSN:1611-3349
文摘
In this paper we deal with compressed integer arrays that are equipped with fast random access. Our treatment improves over an earlier approach that used address-calculation coding to locate the elements and supported access and search operations in O(lg(n+s))O(\lg (n+s)) time for a sequence of n non-negative integers summing up to s. The idea is to complement the address-calculation method with index structures that considerably decrease access times and also enable updates. For all our structures the memory usage is n lg(1 + s/n) + O(n)n \lg(1 + s/n) + O(n) bits. First a read-only version is introduced that supports rank-based accesses to elements and retrievals of prefix sums in O(lglg(n+s)O(\lg \lg (n+s)) time, as well as prefix-sum searches in O(lgn+ lglgs)O(\lg n+ \lg \lg s) time, using the word RAM as the model of computation. The second version of the data structure supports accesses in O(lglgU)O(\lg\lg U) time and changes of element values in O(lg2 U)O(\lg^2 U) time, where U is the universe size. Both versions performed quite well in practical experiments. A third extension to dynamic arrays is also described, supporting accesses and prefix-sum searches in O(lgn + lglgU)O(\lg n + \lg\lg U) time, and insertions and deletions in O(lg2 U)O(\lg^2 U) time.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700