摘要
Vector quantization(VQ) is a very effective way to save bandwidth and storage for speech coding and image coding. Traditional vector quantization methods can be divided into mainly seven types, tree-structured VQ,direct sum VQ, Cartesian product VQ, lattice VQ, classified VQ, feedback VQ, and fuzzy VQ, according to their codebook generation procedures. Over the past decade, quantization-based approximate nearest neighbor(ANN)search has been developing very fast and many methods have emerged for searching images with binary codes in the memory for large-scale datasets. Their most impressive characteristics are the use of multiple codebooks. This leads to the appearance of two kinds of codebook: the linear combination codebook and the joint codebook. This may be a trend for the future. However, these methods are just finding a balance among speed, accuracy, and memory consumption for ANN search, and sometimes one of these three suffers. So, finding a vector quantization method that can strike a balance between speed and accuracy and consume moderately sized memory, is still a problem requiring study.
Vector quantization(VQ) is a very effective way to save bandwidth and storage for speech coding and image coding. Traditional vector quantization methods can be divided into mainly seven types, tree-structured VQ,direct sum VQ, Cartesian product VQ, lattice VQ, classified VQ, feedback VQ, and fuzzy VQ, according to their codebook generation procedures. Over the past decade, quantization-based approximate nearest neighbor(ANN)search has been developing very fast and many methods have emerged for searching images with binary codes in the memory for large-scale datasets. Their most impressive characteristics are the use of multiple codebooks. This leads to the appearance of two kinds of codebook: the linear combination codebook and the joint codebook. This may be a trend for the future. However, these methods are just finding a balance among speed, accuracy, and memory consumption for ANN search, and sometimes one of these three suffers. So, finding a vector quantization method that can strike a balance between speed and accuracy and consume moderately sized memory, is still a problem requiring study.
引文
Ahalt SC,Krishnamurthy AK,Chen P,et al.,1990.Competitive learning algorithms for vector quantization.Neur Netw,3(3):277-290.https://doi.org/10.1016/0893-6080(90)90071-R
Ai LF,Yu JQ,He YF,et al.,2013.High-dimensional indexing technologies for large-scale content-based image retrieval:a review.J Zhejiang Univ-Sci C(Comput&Electron),14(7):505-520.https://doi.org/10.1631/jzus.CIDE1304
Ai LF,Yu JQ,Guan T,et al.,2014.Efficient approximate nearest neighbor search by optimized residual vector quantization.12thInt Workshop on Content-Based Multimedia Indexing,p.1-4.https://doi.org/10.1109/CBMI.2014.6849842
Babenko A,Lempitsky V,2012.The inverted multi-index.IEEE Conf on Computer Vision and Pattern Recognition,p.3069-3076.https://doi.org/10.1109/CVPR.2012.6248038
Babenko A,Lempitsky V,2014.Additive quantization for extreme vector compression.IEEE Conf on Computer Vision and Pattern Recognition,p.931-938.https://doi.org/10.1109/CVPR.2014.124
Babenko A,Lempitsky V,2015.Tree quantization for largescale similarity search and classification.IEEE Conf on Computer Vision and Pattern Recognition,p.4240-4248.https://doi.org/10.1109/CVPR.2015.7299052
Barnes CF,Rizvi S,Nasrabadi NM,1996.Advances in residual vector quantization:a review.IEEE Trans Image Process,5(2):226-262.https://doi.org/10.1109/83.480761
Beyer K,Goldstein J,Ramakrishnan R,et al.,1999.When is“nearest neighbor”meaningful?In:Beeri C,Buneman P(Eds.),Database Theory-ICDT’99.Springer Berlin Heidelberg,p.217-235.https://doi.org/10.1007/3-540-49257-7_15
Bezdek JC,Ehrlich R,Full W,1984.FCM:the fuzzy c-means clustering algorithm.Comput Geosci,10(2-3):191-203.https://doi.org/10.1016/0098-3004(84)90020-7
Bezdek JC,Tsao ECK,Pal NR,1992.Fuzzy kohonen clustering networks.IEEE Int Conf on Fuzzy Systems,p.1035-1043.https://doi.org/10.1109/FUZZY.1992.258797
Brandt J,2010.Transform coding for fast approximate nearest neighbor search in high dimensions.IEEE Conf on Computer Vision and Pattern Recognition,p.1815-1822.https://doi.org/10.1109/CVPR.2010.5539852
Buzo A,Gray A,Gray R,et al.,1980.Speech coding based upon vector quantization.IEEE Trans Acoust Speech Signal Process,28(5):562-574.https://doi.org/10.1109/TASSP.1980.1163445
Chan WY,Gersho A,1990.Enhanced multistage vector quantization with constrained storage.24thAsilomar Conf on Signals,Systems,and Computers,p.659-663.https://doi.org/10.1109/ACSSC.1990.523420
Chan WY,Gupta S,Gersho A,1992.Enhanced multistage vector quantization by joint codebook design.IEEETrans Commun,40(11):1693-1697.https://doi.org/10.1109/26.179932
Chen HH,Ding JJ,Sheu HT,2014.Image retrieval based on quadtree classified vector quantization.Multim Tools Appl,72(2):1961-1984.https://doi.org/10.1007/s11042-013-1492-y
Chuang JC,Hu YC,Lo CC,et al.,2013.Improved meanremoved vector quantization scheme for grayscale image coding.Int J Signal Process Image Process Patt Recogn,6(5):315-332.https://doi.org/10.14257/ijsip.2013.6.5.28
Convay JH,Sloane NJA,1982.Fast quantizing and decoding algorithms for lattice quantizers and codes.IEEE Trans Inform Theory,28(2):227-232.https://doi.org/10.1109/TIT.1982.1056484
Dasgupta S,Freund Y,2008.Random projection trees and low-dimensional manifolds.40thAnnual ACM Symp on Theory of Computing,p.537-546.https://doi.org/10.1145/1374376.1374452
Dasgupta S,Freund Y,2009.Random projection trees for vector quantization.IEEE Trans Inform Theory,55(7):3229-3242.https://doi.org/10.1109/TIT.2009.2021326
Eriksson T,Agrell E,1996.Lattice-Based Quantization:Part II.Technical Report No.18,Chalmers University of Technology,G?teborg,Sweden.
Fischer T,1986.A pyramid vector quantizer.IEEE Trans Inform Theory,32(4):568-583.https://doi.org/10.1109/TIT.1986.1057198
Foster J,Gray R,Dunham M,1985.Finite-state vector quantization for waveform coding.IEEE Trans Inform Theory,31(3):348-359.https://doi.org/10.1109/TIT.1985.1057035
Freund Y,Dasgupta S,Kabra M,et al.,2007.Learning the structure of manifolds using random projections.20th Int Conf on Neural Information Processing Systems,p.473-480.
Ge TZ,He KM,Ke QF,et al.,2013.Optimized product quantization for approximate nearest neighbor search.IEEE Conf on Computer Vision and Pattern Recognition,p.2946-2953.https://doi.org/10.1109/CVPR.2013.379
Ge TZ,He KM,Ke QF,et al.,2014.Optimized product quantization.IEEE Trans Patt Anal Mach Intell,36(4):744-755.https://doi.org/10.1109/TPAMI.2013.240
Gersho A,1979.Asymptotically optimal block quantization.IEEE Trans Inform Theory,25(4):373-380.https://doi.org/10.1109/TIT.1979.1056067
Gersho A,1982.On the structure of vector quantizers.IEEETrans Inform Theory,28(2):157-166.https://doi.org/10.1109/TIT.1982.1056457
Gersho A,Gray R,1991.Vector Quantization and Signal Compression.Springer,Berlin,Germany.
Gong YC,Lazebnik S,2011.Iterative quantization:a procrustean approach to learning binary codes.IEEE Conf on Computer Vision and Pattern Recognition,p.817-824.https://doi.org/10.1109/CVPR.2011.5995432
Gray R,1984.Vector quantization.IEEE ASSP Mag,1(2):4-29.https://doi.org/10.1109/MASSP.1984.1162229
Gray R,Neuhoff DL,1998.Quantization.IEEE Trans Inform Theory,44(6):2325-2383.https://doi.org/10.1109/18.720541
Guo D,Li CQ,Wu L,2016.Parametric and nonparametric residual vector quantization optimizations for ANNsearch.Neurocomputing,217:92-102.https://doi.org/10.1016/j.neucom.2016.04.061
Hang HM,Woods JW,1985.Predictive vector quantization of images.IEEE Trans Commun,33(11):1208-1219.https://doi.org/10.1109/TCOM.1985.1096238
Heo JP,Lin Z,Yoon SE,2014.Distance encoded product quantization.IEEE Conf on Computer Vision and Pattern Recognition,p.2139-2146.https://doi.org/10.1109/CVPR.2014.274
Indyk P,Motwani R,1998.Approximate nearest neighbors:towards removing the curse of dimensionality.30th Annual ACM Symp on Theory of Computing,p.604-613.https://doi.org/10.1145/276698.276876
Jégou H,Douze M,Schmid C,2008.Hamming embedding and weak geometric consistency for large-scale image search.10thEuropean Conf on Computer Vision,p.304-317.https://doi.org/10.1007/978-3-540-88682-2_24
Jégou H,Douze M,Schmid C,2010.Product quantization for nearest neighbor search.IEEE Trans Patt Anal Mach Intell,33(1):117-128.https://doi.org/10.1109/TPAMI.2010.57
Jégou H,Perronnin F,Douze M,et al.,2012.Aggregating local image descriptors into compact codes.IEEE Trans Patt Anal Mach Intell,34(9):1704-1716.https://doi.org/10.1109/TPAMI.2011.235
Juang BH,Gray A,1982.Multiple stage vector quantization for speech coding.IEEE Int Conf on Acoustics,Speech,and Signal Processing,p.597-600.https://doi.org/10.1109/ICASSP.1982.1171604
Kalantidi Y,Avrithis Y,2014.Locally optimized product quantization for approximate nearest neighbor search.IEEE Conf on Computer Vision and Pattern Recognition,p.2329-2336.https://doi.org/10.1109/CVPR.2014.298
Karayiannis NB,Pai PI,1995.Fuzzy vector quantization algorithms and their application in image compression.IEEE Trans Image Process,4(9):1193-1201.https://doi.org/10.1109/83.413164
Kieffer J,1982.Stochastic stability for feedback quantization schemes.IEEE Trans Inform Theory,28(2):248-254.https://doi.org/10.1109/TIT.1982.1056487
Kim T,1988.New finite state vector quantizers for images.Int Conf on Acoustics,Speech,and Signal Processing,p.1180-1183.https://doi.org/10.1109/ICASSP.1988.196809
Kohonen T,Huang T,Schroeder M,1984.Self-Organization and Associative Memory.Springer-Verlag,Berlin,p.3406-3409.https://doi.org/10.1007/978-3-662-00784-6
Liu SC,Shao JR,Lu HT,2017.Generalized residual vector quantization and aggregating tree for large-scale search.IEEE Trans Multim,19(8):1785-1797.https://doi.org/10.1109/TMM.2017.2692181
Lloyd S,1982.Least squares quantization in PCM.IEEETrans Inform Theory,28(2):129-137.https://doi.org/10.1109/TIT.1982.1056489
Lowe DG,1999.Object recognition from local scale-invariant feature.7thIEEE Int Conf on Computer Vision,p.1150-1157.https://doi.org/10.1109/ICCV.1999.790410
Lowe DG,2004.Distinctive image features from scaleinvariant keypoints.Int J Comput Vis,60(2):91-110.https://doi.org/10.1023/b:visi.0000029664.99615.94
Martinez J,Clement J,Hoos HH,et al.,2016.Revisiting additive quantization.14thEuropean Conf on Computer Vision,p.137-153.https://doi.org/10.1007/978-3-319-46475-6_9
Muja M,Lowe DG,2009.Fast approximate nearest neighbors with automatic algorithm configuration.4thInt Conf on Computer Vision Theory and Applications,p.331-340.https://doi.org/10.5220/0001787803310340
Nister D,Stewenius H,2006.Scalable recognition with a vocabulary tree.IEEE Conf on Computer Vision and Pattern Recognition,p.2161-2168.https://doi.org/10.1109/CVPR.2006.264
Norouzi M,Fleet DJ,2013.Cartesian k-means.IEEE Conf on Computer Vision and Pattern Recognition,p.3017-3024.https://doi.org/10.1109/CVPR.2013.388
Oliva A,Torralba A,2001.Modeling the shape of the scene:a holistic representation of the spatial envelope.Int JComput Vis,42(3):145-175.https://doi.org/10.1023/A:1011139631724
Ozan EC,Kiranyaz S,Gabbouj M,2016a.K-subspaces quantization for approximate nearest neighbor search.IEEE Trans Knowl Data Eng,28(7):1722-1733.https://doi.org/10.1109/TKDE.2016.2535287
Ozan EC,Kiranyaz S,Gabbouj M,2016b.Competitive quantization for approximate nearest neighbor search.IEEE Trans Knowl Data Eng,28(11):2884-2894.https://doi.org/10.1109/TKDE.2016.2597834
Park M,Gunda K,Gupta H,et al.,2014.Optimized transform coding for approximate KNN search.British Machine Vision Conf,p.1-12.https://doi.org/10.5244/c.28.34
Patchoo W,Fischer TR,Maddex C,2013.L1-norm-based coding for lattice vector quantization.Asia-Pacific Signal and Information Association Annual Summit and Conf,p.1-4.https://doi.org/10.1109/APSIPA.2013.6694164
PaulevéL,Jégou H,Amsaleg L,2010.Locality sensitive hashing:a comparison of hash function types and querying mechanisms.Patt Recogn Lett,31(11):1348-1358.https://doi.org/10.1016/j.patrec.2010.04.004
Perronnin F,Dance C,2007.Fisher kernels on visual vocabularies for image categorization.IEEE Conf on Computer Vision and Pattern Recognition,p.1-8.https://doi.org/10.1109/CVPR.2007.383266
Ramamurthi B,Gersho A,1986.Classified vector quantization of images.IEEE Trans Commun,34(11):1105-1115.https://doi.org/10.1109/TCOM.1986.1096468
Sabin M,Gray R,1982.Product code vector quantizers for speech waveform coding.Global Telecommunications Conf,p.1087-1091.https://doi.org/10.1109/CVPR.2007.383266
Sabin M,Gray R,1984.Product code vector quantizers for waveform and voice coding.IEEE Trans Acoust Speech Signal Process,32(3):474-488.https://doi.org/10.1109/TASSP.1984.1164346
Sch?nemann P,1966.A generalized solution of the orthogonal Procrustes problem.Psychometrika,31(1):1-10.https://doi.org/10.1007/bf02289451
Shannon CE,1948.A mathematical theory of communication.Bell Syst Tech J,27(3):379-423.https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Shannon CE,1959.Coding theorems for a discrete source with a fidelity criterion.IRE National Convention Record,Part 4,p.93-126.https://doi.org/10.1109/9780470544242.ch21
Silpa-Anan C,Hartley R,2008.Optimized KD-trees for fast image descriptor matching.IEEE Conf on Computer Vision and Pattern Recognition,p.1-8.https://doi.org/10.1109/CVPR.2008.4587638
Sivic J,Zisserman A,2003.Video Google:a text retrieval approach to object matching in videos.9thIEEE Int Conf on Computer Vision,p.1470-1477.https://doi.org/10.1109/ICCV.2003.1238663
Tsekouras GE,Tsolakis DM,2013.Fuzzy clustering-based vector quantization for image compression.In:Chatterjee A,Siarry P(Eds.),Computational Intelligence in Image Processing.Springer Berlin Heidelberg,p.93-105.https://doi.org/10.1007/978-3-642-30621-1_5
Tsekouras GE,Antonios M,Anagnostopoulos C,et al.,2008.Improved batch fuzzy learning vector quantization for image compression.Inform Sci,178(20):3895-3907.https://doi.org/10.1016/j.ins.2008.05.017
Tuytelaars T,Schmid C,2007.Vector quantizing feature space with a regular lattice.IEEE 11thInt Conf on Computer Vision,p.1-8.https://doi.org/10.1109/ICCV.2007.4408924
Wang JF,Wang JD,Song JK,et al.,2014.Optimized Cartesian k-means.IEEE Trans Knowl Data Eng,27(1):180-192.https://doi.org/10.1109/TKDE.2014.2324592
Wei BC,Guan T,Yu JQ,2014.Projected residual vector quantization for ANN search.IEEE Multim,21(3):41-51.https://doi.org/10.1109/MMUL.2013.65
Weiss Y,Torralba A,Fergus R,2008.Spectral hashing.21st Int Conf on Neural Information Processing Systems,p.1753-1760.
Xia Y,He KM,Wen F,et al.,2013.Joint inverted indexing.IEEE Int Conf on Computer Vision,p.3416-3423.https://doi.org/10.1109/ICCV.2013.424
Yuan JB,Liu XW,2015a.Product tree quantization for approximate nearest neighbor search.IEEE Int Conf on Image Processing,p.2035-2039.https://doi.org/10.1109/ICIP.2015.7351158
Yuan JB,Liu XW,2015b.Transformed residual quantization for approximate nearest neighbor search.ar Xiv:1512.06925
Zhang L,Zhang M,Pan Q,et al.,2014.An effective image coding method using lattice vector quantization in wavelet domain.Int J Signal Process Image Process Patt Recogn,7(2):305-316.https://doi.org/10.14257/IJSIP.2014.7.2.28
Zhang T,Du C,Wang JD,2014.Composite quantization for approximate nearest neighbor search.31stInt Conf on Machine Learning,p.838-846.
Zhang T,Qi GJ,Tang JH,et al.,2015.Sparse composite quantization.IEEE Conf on Computer Vision and Pattern Recognition,p.4548-4556.https://doi.org/10.1109/CVPR.2015.7299085