Implementation of an 8051 Microcontroller Core Based on a Pipelined Architecture
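Abstract
In the low-end microcontroller segment of embedded systems, the 8051 family has played a unique role throughout the nearly 30 years since the first 8-bit microcontroller appeared. Because the MCS-51 offers the best compatibility, even a reworked MCS-51 keeps the 8051 core alive through its unchanged instruction set and compatible basic units, and the core is expected to serve as the 8-bit microcontroller core in future system-on-chip (SoC) development [1].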
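     Against this industrial background, and to meet current engineering needs, this work redefines, reworks, and redesigns the core in order to raise the instruction execution efficiency of the 8051. According to the literature, efforts over the past decade to improve the instruction execution efficiency of the 8051 core have broadly taken two forms:
     1) Keep the original core structure and shorten the machine cycle from 12 clock cycles to 6 or 4. This yields only a limited gain in instruction execution efficiency.
     2) Change the compiler and redesign the instruction encoding as RISC, so that every instruction executes in a single cycle. In practice this approach is subject to certain restrictions, and its design complexity and cost are both relatively high.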
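     This work takes a third route to improving 8051 instruction execution efficiency: it keeps the CISC structure and the instruction set unchanged, pipelines instruction execution, and lets multiple instructions execute in parallel beat by beat, so that instructions complete faster while a program runs. The thesis presents five new design ideas proposed for implementing a multi-stage pipeline in the 8051 core, together with their implementation: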
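     1) The whole design is based on the system clock: the core designed here no longer has the traditional 8051 notion of a machine cycle. In other words, when the core runs, the external oscillator clock is the internal system clock. The intent is to take overall power consumption into account. The design is feasible because the access timing of the core matches well with that of the memories that may be designed later.
     2) Decoding of the current instruction determines the pipeline staging: the operation an instruction performs is determined by its opcode. Before the pipeline is staged, what matters most in decoding is the result of decoding the opcode, while the result of decoding the operands bears directly on the proper handling of data dependences.
     3) Parallel instruction processing with a current instruction and a prefetched instruction: the main addition that enables parallel instruction processing in this pipelined core is the instruction prefetch operation. This is the first core element of the pipeline design. The process is carried out entirely by the system control state machine. When an external interrupt arrives, the necessary protection is performed. Here the design borrows mature parallel pipeline-control techniques from 32-bit processors.
     4) Pipeline control with multi-level control and scheduling: this is the second core element of the pipeline design. For core design, and especially for a pipelined core, the other central problem addressed in this work is the scheduling of multiple instructions in a parallel pipeline, in particular while the 8051 compilation toolchain stays unchanged and the core remains compatible with its binary machine code.
     In the implementation, pipeline control uses multi-level control, management, and scheduling. This comprises management and scheduling of instruction cycles; scheduling and management of instruction destination storage (accumulator, registers, and memory); and management and scheduling of instruction function (bit operation, byte operation, or jump).
     5) Hazard handling: data hazards and structural hazards are all resolved in the same way, by inserting one additional system clock beat. This approach is adopted as a balance between system overhead and design complexity. Here the design borrows mature hazard-handling techniques from 32-bit processors, simplified relative to a 32-bit core in view of design complexity and the intended application.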
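     Experimental results show that, compared with a traditional 12-clock 8051 core at the same clock frequency, single-cycle instructions run 12 times as fast, and the average across the whole instruction set is 9.5 times that of the original 8051, which brings the 8051-compatible family into the ranks of high-speed 8-bit microcontrollers. The design also effectively resolves the data hazards and control hazards introduced by the pipelined structure.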
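     The main work in this research comprises: mastering the traditional 8051 architecture; proposing a new architectural model to meet practical engineering needs; applying pipelining appropriately; implementing the design for the 8051 family of microcontrollers in the Verilog hardware description language; and finally verifying the results by simulation in ModelSim and on engineering sample chips.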
Nearly 30 years have passed since the 8-bit microcontroller first appeared in the low-end microcontroller domain of embedded systems, and throughout that time the 8051 series has played a unique role. Because the MCS-51 provides the best compatibility, a reworked MCS-51 can still keep its instruction set and basic units compatible, which extends the life of the 8051 core and positions it to serve as the 8-bit microcontroller core in future system-on-chip (SoC) development.
     Over the past decade, several ways have been used to upgrade the performance of the 8051 core, including changes to the traditional bus speed. The two main types are:
     1) Reducing the machine cycle from 12 clock cycles to six or four. This improves the core's efficiency only to a limited extent.
     2) Altering the compiler and adopting a RISC instruction set, so that each instruction takes one cycle. This approach places certain restrictions on applications, and its design complexity and cost are also high.
     To enhance the execution efficiency of the 8051, this thesis adopts a new approach: it keeps the CISC structure and instruction set unchanged and pipelines instruction execution. The thesis describes a multi-stage pipeline structure for the 8051 core and puts forward the following five new design concepts and their realization:
     1) The whole system design is based on the system clock: the core has no traditional 8051 machine-cycle concept. That is, the external oscillator clock is the internal system clock while the core is operating. The intent of this design is to lower overall power consumption. It is feasible because the core and the memory are well matched in access timing.
     2) Decoding of the current instruction decides the pipeline staging: the opcode determines the operation the core performs. In the decode stage, before the pipeline is staged, what matters most is the decoded result of the opcode.
     3) Parallel processing, meaning that the current instruction and the prefetched instruction exist at the same time: the pipelined core is designed around parallel instruction processing, and the most important addition is the instruction prefetch function. This is the first key issue of the pipeline design. The process is controlled entirely by a state machine. When an external interrupt occurs, the necessary protection processing is performed.
     4) Pipeline control with multi-level control and scheduling: this is the second key issue of the pipeline architecture. For core design, and particularly for a pipelined core, the other central problem addressed in this work is parallel pipeline scheduling of multiple instructions, especially how to keep the core compatible with 8051 binary machine code while the 8051 compilation system stays unchanged. This is one of the points that distinguishes this core from other designs.
     The design uses a multi-stage pipeline architecture. In the concrete realization, pipeline control takes the form of multi-level control, management, and scheduling, which covers the following levels:
     Management and scheduling of the instruction cycle; scheduling and management of the instruction's target storage (accumulator, registers, and memory); and management and scheduling of the instruction's function (bit operation, byte operation, or jump operation).
     5) Hazard handling: when execution in the core runs into a data hazard or a control hazard, the design in this thesis resolves it by adding one system clock beat. This approach is a balance between system overhead and design complexity.
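The prefetch overlap in item 3 and the one-extra-beat hazard handling described in the abstract can be illustrated with a small clock-counting model. The Python sketch below is not the thesis's Verilog design. It assumes a simplified two-step fetch/execute view, a one-clock fetch, and made-up execute clock counts and hazard flags for a short hypothetical 8051 instruction sequence. The names `Instr`, `clocks_without_prefetch`, and `clocks_with_prefetch` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    exec_clocks: int = 1   # hypothetical execute time in system clocks
    hazard: bool = False   # True if it conflicts with the previous instruction


FETCH_CLOCKS = 1  # assumption: every fetch takes one system clock
STALL_CLOCKS = 1  # one extra beat per hazard, as the abstract describes


def clocks_without_prefetch(program):
    """Fetch and execute strictly in sequence, with no overlap."""
    return sum(FETCH_CLOCKS + i.exec_clocks for i in program)


def clocks_with_prefetch(program):
    """Fetch of the next instruction overlaps execution of the current one.

    Only the first fetch is exposed; each hazard costs one stall beat.
    """
    if not program:
        return 0
    total = FETCH_CLOCKS  # the first fetch cannot be hidden
    for i in program:
        total += i.exec_clocks
        if i.hazard:
            total += STALL_CLOCKS
    return total


# Hypothetical sequence; the clock counts are not taken from the thesis.
program = [
    Instr("MOV A,#5"),
    Instr("ADD A,R0", hazard=True),   # needs A written by the previous MOV
    Instr("MOV R1,A"),
    Instr("SJMP loop", exec_clocks=2),
]

print(clocks_without_prefetch(program))  # 9
print(clocks_with_prefetch(program))     # 7
```

In this toy sequence the overlap hides three of the four fetches, and the single hazard gives one clock back as a stall. That is the trade the abstract describes: a simple, uniform stall in exchange for lower control complexity.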
     The research aims to optimize the 8051 core relative to the traditional structure, in particular by using pipeline technology. The model abolishes the machine-cycle concept and runs in units of the clock cycle. On average, instructions approach single-cycle execution, which greatly increases instruction speed. At the same clock frequency, single-cycle instructions run 12 times as fast as in the original design, and the average speed over the entire instruction set is 9.5 times that of the original 8051. This performance brings 8051 microcontrollers into the ranks of high-speed 8-bit microcontrollers.
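The speedup figures above follow from simple clock arithmetic, which the short Python sketch below makes explicit. It relies on the classic 8051 machine cycle of 12 oscillator clocks. The function names and the instruction mix are hypothetical and are not the measurements behind the 9.5x figure, whose exact averaging method the abstract does not state.

```python
CLASSIC_CLOCKS_PER_MACHINE_CYCLE = 12  # classic 8051: 12 oscillator clocks


def speedup(machine_cycles, new_clocks):
    """Speedup of one instruction at the same clock frequency."""
    return machine_cycles * CLASSIC_CLOCKS_PER_MACHINE_CYCLE / new_clocks


def average_speedup(mix):
    """Total classic clocks divided by total new clocks.

    `mix` is a list of (classic_machine_cycles, new_clocks) pairs.
    """
    classic = sum(mc * CLASSIC_CLOCKS_PER_MACHINE_CYCLE for mc, _ in mix)
    new = sum(nc for _, nc in mix)
    return classic / new


# A one-machine-cycle instruction that now completes in one clock.
print(speedup(1, 1))  # 12.0

# Shortening the machine cycle to 6 or 4 clocks without pipelining.
print(CLASSIC_CLOCKS_PER_MACHINE_CYCLE / 6)  # 2.0
print(CLASSIC_CLOCKS_PER_MACHINE_CYCLE / 4)  # 3.0

# Hypothetical mix: two single-cycle instructions and two 2-cycle
# instructions that need 3 and 4 clocks in the new core.
hypothetical_mix = [(1, 1), (1, 1), (2, 3), (2, 4)]
print(average_speedup(hypothetical_mix))  # 8.0
```

Any instruction that needs more than one clock per classic machine cycle, for example because of a stall or a multi-byte fetch, pulls the average below 12, which is consistent with a set-wide average lower than the single-cycle figure.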
     The main elements of the study are: mastering the traditional 8051 architecture; proposing a new structural model in light of actual engineering needs; applying pipeline technology appropriately to the 8051 microcontroller; implementing the design in the Verilog hardware description language; and verifying the results with the ModelSim simulation tool and on engineering samples.
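References
[1] He Limin. The development path of 8-bit microcontrollers as seen from the Cygnal C8051F. Beijing: Beijing University of Aeronautics and Astronautics, 2005. (in Chinese)
[2] Jon Titus, ECN Senior Technical Editor. The ever-evolving 8051 microcontroller. Shanghai: Embedded Systems, Electronic Products World, 2005, no. 3, second half of the month. (in Chinese)
[3] Ye Yunyan. A high-performance 8051 microcontroller that lowers power consumption and improves power efficiency. Shanghai: Electronic Technology Digest, November 2004. (in Chinese)
[4] Keith Coffey. Applications of 8-bit microcontrollers in SoCs. Beijing: Electronic Products World, June 2006. (in Chinese)
[5] CPU core selection for multi-CPU system-on-chip design. EE Times China website, http://www.eetchina.com/ART_8800300510_617681,617682.HTM.fd88308e, 2003. (in Chinese)
[6] Zhu Ziyu, Li Yamin. CPU chip logic design techniques. Beijing: Tsinghua University Press, 2004, pp. 50-59. (in Chinese)
[7] R&D Company. RD8051 product data book. Technology R&D Center, 2002. (in Chinese)
[8] Huang Minmin, Lin Yuan, Xu Zhongyou. A 51 core design using a 3-stage instruction pipeline. Modern Electronics Technique, 2005, no. 20, p. 211. (in Chinese)
[9] Maxim. High-speed 8051 microcontrollers. Leading the way to growth and innovation, Application Note 2035, 2003. (in Chinese)
[10] The TMS320C55x instruction pipeline and improving its efficiency. Beijing: Electronic Communication Papers Collection, 2006, pp. 23-53. (in Chinese)
[11] Verilog HDL fundamentals. Embedded Development Technology Center website, http://techcenter.dicder.com/2006/0107/content_99_2.html, 2002. (in Chinese)
[12] MODELSIM application notes. Shanghai: Shanghai Posts and Telecommunications Press, 2003, pp. 236-276. (in Chinese)
[13] William Stallings, trans. Zhang Kuncang. Computer Organization and Architecture: Designing for Performance, 6th ed. Beijing: Tsinghua University Press, 2004, pp. 345-348. (in Chinese)
[14] Fu Qilin, Xu Yong. A course in modern computer architecture. Beijing: Beijing Hope Electronic Press, 2004, pp. 137-176. (in Chinese)
[15] Shi Jiaoying. Advanced pipelining and instruction-level parallelism. Beijing: Publishing House of Electronics Industry, 2003, p. 128. (in Chinese)
[16] Wang Jinming (ed.). Verilog HDL programming tutorial. Beijing: Posts & Telecom Press, 2002. (in Chinese)
[17] John Paul Shen. Modern Processor Design: Fundamentals of Superscalar Processors. Beijing: Publishing House of Electronics Industry, 2004, pp. 798-811. (in Chinese)
[18] Wyouken. 8051 Core application development. Core Chip Development website, http://www.dkdiy.com/article.php?id=17, 2005. (in Chinese)
[19] OpenCore. Instruction Set of IT8051. OpenCore R&D Center, 2001.
[20] Li Yamin. Computer organization and system architecture. Beijing: Tsinghua University Press, 2000, pp. 211-279. (in Chinese)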
