Research on Hardware Optimization Techniques for Heterogeneous Multi-Core Network Security Processors
Abstract
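Network processors that combine high line rates with data security guarantees play an increasingly important role in the future development of networks. Building on a network processor project, this dissertation studies several hardware components of a network security processor (arithmetic units, security encryption modules, and others). The work comprises: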
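     1. An XDNP heterogeneous multi-core network security processor was implemented, with every IP core developed in-house. The processor comprises one XD-MP Core, six packet-processing engines (PEs), one security coprocessor unit, memory control units (SRAM and SDRAM), and a network data switching bus unit. The on-chip buses fall into two classes: a control-plane bus and a data-plane bus. A split, parallel-switching on-chip bus structure is proposed: the data-plane bus is separated into a command bus and data buses shared among the cores, so that bus requests arbitrated on the command bus can receive their data responses concurrently over different data buses, which greatly raises the transfer rate of the on-chip bus.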
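     2. The group carry-generate and group carry-propagate signals of a fast adder were optimized at the logic level and implemented in differential cascode voltage switch with pass-gate (DCVSPG) logic; the group carry signal is then produced from these two signals. The method resolves the logic conflict of the conventional static Manchester carry-bypass circuit and avoids the delay and power overhead of the precharge phase in the dynamic Manchester carry-bypass circuit, and it is faster and lower in power than standard CMOS gates. The effect of the channel width of the NMOS transistors that form the tree structure in DCVSPG logic on overall circuit performance is discussed, and a simple model for estimating the delay of DCVSPG circuits is established. The model is then used to optimize the DCVSPG circuits that produce the group generate and group propagate signals so that they meet a specified delay target. A 32-bit adder was implemented in DCVSPG logic; its performance is substantially better than that of an adder of the same structure built from standard cells.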
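     3. The performance of an RSA cryptosystem is limited by the speed of long-integer modular exponentiation. To speed up the modular exponentiator, the Montgomery modular multiplication algorithm was modified to use a two-level carry-save adder (CSA) structure. Inserting registers shortens the critical path of the circuit and ensures that the CSA operands arrive simultaneously, which markedly raises the speed of the modular multiplier. In addition, reordering the modular multiplications in left-to-right binary modular exponentiation avoids the long-integer addition at the end of most modular multiplications and saves considerable time. The 1024-bit RSA modular exponentiator implemented with this method achieves a considerable performance improvement over the most representative left-to-right binary modular exponentiators of recent years.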
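     4. ECC cryptosystems require a large number of modular multiplications and modular squarings during encryption and decryption. Over prime fields, the conventional Montgomery modular multiplication was optimized and a 2-bit look-ahead Montgomery modular multiplier is proposed; in addition, the partial products were restructured by exploiting the inherent properties of squaring, halving their number. On this basis a half-partial-product modular squaring algorithm is proposed and a circuit dedicated to modular squaring was designed, so that a modular squaring takes only half the time of a modular multiplication. Over binary fields, a word-serial modular multiplication algorithm is proposed in which the multiplicand is shifted left by two words so that the hardware can be pipelined; it also simplifies the computation of the key variables, shortens the circuit delay, and improves performance, so that the modular product of two operands is computed quickly and efficiently. A two-word-serial modular squaring algorithm is also proposed: it obtains the square directly from the inherent properties of squaring in binary fields, reduces the result with the Montgomery method, and processes two words at a time, so its computation time is likewise half that of a modular multiplication. On the basis of these arithmetic circuits, a dual-field ECC cryptographic coprocessor with relatively high performance was implemented.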
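     5. A switchable TAM (test access mechanism) structure is proposed in which certain IP cores are attached to several TAM groups through switching circuits, so that several TAM groups can be used to test one IP core, reducing idle time and shortening test time. Following a specific ordering rule, 0-1 programming first assigns each IP core to one TAM group, and a heuristic search algorithm then selects suitable IP cores to be tested over multiple TAM groups. Experimental results on the ITC2002 benchmark circuits show that the method yields short test times and outperforms several other test scheduling methods.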
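For readers unfamiliar with the arithmetic behind contribution 3, the following is a minimal software sketch of textbook radix-2 Montgomery modular multiplication driving a left-to-right binary modular exponentiation. It is not the dissertation's two-level CSA circuit or its reordered multiplication sequence; the function names `montgomery_multiply` and `mod_exp` and the choice R = 2^k with k equal to the bit length of the modulus are assumptions made for illustration only.

```python
def montgomery_multiply(a: int, b: int, n: int, k: int) -> int:
    """Return a * b * 2**(-k) mod n for odd n < 2**k and 0 <= a, b < n.

    Bit-serial (radix-2) form: one bit of `a` is consumed per iteration,
    mirroring how a hardware multiplier would step through the operand.
    """
    s = 0
    for i in range(k):
        s += ((a >> i) & 1) * b   # add the partial product a_i * b
        if s & 1:                 # make s even by adding the odd modulus
            s += n
        s >>= 1                   # exact division by 2
    # s is below 2n here, so one conditional subtraction suffices
    return s - n if s >= n else s


def mod_exp(base: int, exponent: int, n: int) -> int:
    """Left-to-right binary exponentiation in the Montgomery domain."""
    if n % 2 == 0 or n < 3:
        raise ValueError("Montgomery arithmetic needs an odd modulus >= 3")
    k = n.bit_length()                               # R = 2**k > n (assumed choice)
    r2 = pow(2, 2 * k, n)                            # R^2 mod n, precomputed once
    x = montgomery_multiply(base % n, r2, n, k)      # base * R mod n
    acc = montgomery_multiply(1, r2, n, k)           # 1 * R mod n
    for bit in bin(exponent)[2:]:                    # scan exponent from the top bit
        acc = montgomery_multiply(acc, acc, n, k)    # square every step
        if bit == "1":
            acc = montgomery_multiply(acc, x, n, k)  # multiply on a 1 bit
    return montgomery_multiply(acc, 1, n, k)         # leave the Montgomery domain
```

In this sketch every intermediate result is fully reduced to a single integer. A carry-save datapath would instead keep each result as a sum/carry pair, which is where the long-integer additions mentioned in the abstract arise.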
High-speed, secure network processors play an increasingly important role in network development. Supported by a network processor project, this dissertation focuses on the hardware of a security network processor, including the ALU and security cryptography circuits, and presents five main contributions:
     1. The XDNP heterogeneous multi-core security network processor was implemented, with all IP cores independently developed. XDNP consists of one XD-MP Core, six Packet Engines (PEs), one security cryptoprocessor, one SRAM controller, one SDRAM controller, and a Media and Switch Fabric Interface (MSF). There are two kinds of on-chip buses: a control-plane bus and a data-plane bus. A new on-chip bus architecture based on split transactions is proposed, in which the data-plane bus is divided into two parts: a command bus shared by all cores, and several data buses corresponding to the cores. This architecture allows different data buses to transfer data at the same time, which yields high throughput and low bus latency.
     2. The logic expressions for the block generate and block propagate signals of a fast adder were optimized so that they can be implemented in differential cascode voltage switch with pass-gate (DCVSPG) logic. This method resolves the logic conflict in the static Manchester carry-bypass circuit and eliminates the delay and power cost of the precharge stage in the dynamic Manchester carry-bypass circuit. It is faster and consumes less power than a carry-generate circuit built from CMOS standard cells. The effect of the size of each NMOS transistor in DCVSPG logic on circuit performance is discussed, and a simple delay model of DCVSPG logic was built to estimate circuit delay. The model can be used to optimize the NMOS transistor sizes in adder circuits implemented in DCVSPG logic. A 32-bit adder was implemented in DCVSPG logic; it outperforms an adder implemented with CMOS standard cells.
     3. Modular multiplication and exponentiation severely limit RSA performance. The thesis presents a modified Montgomery modular multiplication algorithm based on a two-level carry-save adder (CSA) tree. By inserting registers, the design shortens the critical path and guarantees that operands arrive at the CSA input ports simultaneously, which significantly improves the speed of modular multiplication. The sequence of modular multiplications in modular exponentiation was also adjusted, which avoids most format conversions (the long-integer additions that follow a modular multiplication) and reduces conversion time. The proposed modular exponentiation circuit performs better than the most representative designs.
     4. Elliptic Curve Cryptography (ECC) involves a large number of modular multiplication and squaring operations over prime and binary finite fields. For prime fields, the traditional Montgomery algorithm was modified and a 2-bit look-ahead Montgomery modular multiplication circuit was designed. The partial products were then restructured based on the inherent characteristics of squaring, which halves their number. A half-partial-product modular squaring algorithm is proposed, and a modular squaring circuit was designed based on it; a modular squaring takes only half the time of a modular multiplication. For binary fields, a word-serial modular multiplication algorithm is proposed. Because the multiplicand is shifted left by two words, the multiplier can be pipelined. The algorithm also simplifies the calculation of some key variables, which reduces the path delay of the multiplier, so the word-serial modular multiplier computes the modular product quickly. A two-word-serial modular squaring algorithm is also proposed. It applies the Montgomery method to reduce the squaring result, which is obtained directly from the characteristics of squaring in binary fields. The algorithm handles two words at a time, so a modular squaring takes half the time of a modular multiplication. Finally, the thesis presents a highly efficient dual-field ECC processor built from these finite-field arithmetic units.
     5. A switchable TAM architecture is presented in which some IP cores are attached to multiple TAMs through switching circuits. These IP cores can therefore be tested over several TAMs, which effectively reduces idle time and test time. Subject to given conditions, 0-1 programming allocates each IP core to one TAM, and a heuristic search algorithm then picks out appropriate IP cores to be tested over multiple TAMs. Experimental results on the ITC2002 benchmark circuits show that the approach outperforms some other approaches.
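The "characteristic of binary finite fields square arithmetic" mentioned in contribution 4 can be illustrated in software: squaring a polynomial over GF(2) only spreads its coefficients apart, so no partial products are needed before reduction. The sketch below is an illustration under stated assumptions, not the dissertation's two-word-serial algorithm. It uses plain polynomial reduction rather than the Montgomery reduction the thesis describes, and the function names and the integer bit-vector encoding of polynomials are hypothetical.

```python
def gf2_square_unreduced(a: int) -> int:
    """Square a GF(2) polynomial: bit i of `a` moves to bit 2*i."""
    result = 0
    i = 0
    while a:
        if a & 1:
            result |= 1 << (2 * i)   # coefficient of x^i becomes that of x^(2i)
        a >>= 1
        i += 1
    return result


def gf2_reduce(value: int, modulus: int) -> int:
    """Reduce a GF(2) polynomial modulo `modulus` by repeated XOR."""
    degree = modulus.bit_length() - 1
    while value.bit_length() - 1 >= degree:
        shift = value.bit_length() - 1 - degree
        value ^= modulus << shift    # cancel the current leading term
    return value


def gf2_multiply(a: int, b: int, modulus: int) -> int:
    """Shift-and-XOR multiplication, used here only as a reference."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return gf2_reduce(product, modulus)


def gf2_square(a: int, modulus: int) -> int:
    """Modular squaring: spread the bits, then reduce."""
    return gf2_reduce(gf2_square_unreduced(a), modulus)
```

For example, with the degree-8 polynomial x^8 + x^4 + x^3 + x + 1 (encoded as `0x11B`), `gf2_square(a, 0x11B)` agrees with `gf2_multiply(a, a, 0x11B)` for every 8-bit `a`, while the squaring path performs no multiplications at all.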
