Multi-core DSP-based Vector Set Bits Counters/Comparators
  • Authors: Valery Sklyarov; Iouliia Skliarova
  • Keywords: Hamming weight/population/vector set bits counter; Hamming weight comparator; Field-programmable gate array; Digital signal processing slice; Hardware accelerator; On-chip architecture
  • Journal: The Journal of VLSI Signal Processing
  • Publication year: 2015
  • Publication date: September 2015
  • Volume: 80
  • Issue: 3
  • Pages: 309-322
  • Full-text size: 3,041 KB
  • Author affiliations: Valery Sklyarov (1)
    Iouliia Skliarova (1)

    1. Department of Electronics, Telecommunications and Informatics, IEETA, University of Aveiro, 3810-193, Aveiro, Portugal
  • Journal category: Engineering
  • Journal subjects: Electrical Engineering
    Circuits and Systems
    Computer Imaging, Vision, Pattern Recognition and Graphics
    Computer Systems Organization and Communication Networks
    Signal, Image and Speech Processing
    Mathematics of Computing
  • Publisher: Springer New York
  • ISSN: 1939-8115
Abstract
The paper shows that fast counting of non-zero components (Hamming weights) and comparison of the results (Hamming distances) in large sets of data items are important for numerous practical applications, and that this problem has been broadly investigated by software and hardware designers. It is frequently referred to as population count or vector set bits count (or simply popcount). This paper is dedicated to multi-core FPGA-based accelerators that compute Hamming weights/distances and compare the results with fixed thresholds and variable bounds. It is shown that the digital signal processing slices widely available in contemporary FPGAs may be used efficiently, and that they provide the fastest and least resource-consuming solutions. A thorough analysis and comparison with the best known alternatives, both in hardware and in software, is presented and supported by numerous experiments on the recent Nexys-4, ZedBoard and ZyBo prototyping systems. Complete hardware description language (VHDL) specifications for the core components are given, ready to be synthesized, implemented, tested and evaluated. Experiments with the proposed designs clearly demonstrate significant speed-up compared to known hardware/software alternatives.
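To make the terminology concrete, the following is a minimal software sketch of the operations the abstract names: Hamming weight (popcount), Hamming distance, comparison against a fixed threshold, and a check against variable bounds. It is not the paper's DSP-slice design or its VHDL. The function names and the bit-clearing loop are illustrative assumptions chosen for readability.

```python
def hamming_weight(x: int) -> int:
    """Count the set bits of a non-negative integer (popcount).

    Illustrative software method: repeatedly clear the lowest set bit.
    This is an assumption for the sketch, not the paper's hardware circuit.
    """
    if x < 0:
        raise ValueError("expected a non-negative integer bit vector")
    count = 0
    while x:
        x &= x - 1  # clears the least significant set bit
        count += 1
    return count


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two vectors differ."""
    return hamming_weight(a ^ b)


def weight_meets_threshold(x: int, threshold: int) -> bool:
    """Fixed-threshold comparator: is the weight at least `threshold`?"""
    return hamming_weight(x) >= threshold


def weight_within_bounds(x: int, lower: int, upper: int) -> bool:
    """Variable-bounds comparator: is lower <= weight <= upper?"""
    return lower <= hamming_weight(x) <= upper


if __name__ == "__main__":
    v1, v2 = 0b1011, 0b1100
    print(hamming_weight(v1))            # weight of 1011
    print(hamming_distance(v1, v2))      # differing positions of 1011 and 1100
    print(weight_meets_threshold(v1, 3))
    print(weight_within_bounds(v1, 1, 2))
```

A hardware accelerator of the kind described would process wide vectors in parallel rather than bit by bit. The loop above only fixes the meaning of each operation.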
