  • stimescarmores

Modern Processor Design Fundamentals of Superscalar Processors: From Scalar to Dynamic Pipelines



of a design while a microprocessor or ISP is the implementation of a design. As with all forms of engineering design, microprocessor design is inherently a creative process that involves subtle tradeoffs and requires good intuition and clever insights. This book focuses on contemporary superscalar microprocessor design at the microarchitecture level. It presents existing and proposed microarchitecture techniques in a systematic way and imparts foundational principles and insights, with the hope of training new microarchitects who can contribute to the effective design of future-generation microprocessors.








developed and deployed in the past three decades are presented in a comprehensive way. This book attempts to codify a large body of knowledge into a systematic framework. Concepts and techniques that may appear quite complex and difficult to decipher are distilled into a format that is intuitive and insightful. A number of innovative techniques recently proposed by researchers are also highlighted. We hope this book will play a role in producing a new generation of microprocessor designers who will help write the history for the fourth decade of microprocessors.


1 Instruction Set Processor Design

The focus of this book is on designing instruction set processors. Critical to an instruction set processor is the instruction set architecture, which specifies the functionality that must be implemented by the instruction set processor. The ISA plays several crucial roles in instruction set processor design.


between input and output variables. The implementation is typically an optimized two-level AND-OR design or a multilevel network of logic gates. The optimization attempts to reduce the number of logic gates and the number of levels of logic used in the design. For sequential circuit design, the specification is in the form of state machine descriptions that include the specification of the state variables as well as the output and next state functions. Optimization objectives include the reduction of the number of states and the complexity of the associated combinational logic circuits. Logic minimization and state minimization software tools are essential. Logic and state machine simulation tools are used to assist the analysis task. These tools can verify the logic correctness of a design and determine the critical delay path and hence the maximum clocking rate of the state machine.

The design process for a microprocessor is more complex and less straightforward. The specification of a microprocessor design is the instruction set architecture, which specifies a set of instructions that the microprocessor must be able to execute. The implementation is the actual hardware design described using a hardware description language (HDL). The primitives of an HDL can range from logic gates and flip-flops, to more complex modules, such as decoders and multiplexers, to entire functional modules, such as adders and multipliers. A design is described as a schematic, or interconnected organization, of these primitives.

The process of designing a modern high-end microprocessor typically involves two major steps: microarchitecture design and logic design. Microarchitecture design involves developing and defining the key techniques for achieving the targeted performance. Usually a performance model is used as an analysis tool to assess the effectiveness of these techniques. The performance model accurately models the behavior of the machine at the clock cycle granularity and is able to quantify the number of machine cycles required to execute a benchmark program. The end result of microarchitecture design is a high-level description of the organization of the microprocessor. This description typically uses a register transfer language (RTL) to specify all the major modules in the machine organization and the interactions between these modules. During the logic design step, the RTL description is successively refined by the incorporation of implementation details to eventually yield the HDL description of the actual hardware design. Both the RTL and the HDL descriptions can potentially use the same description language. For example, Verilog is one such language. The primary focus of this book is on microarchitecture design.
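To make the idea of a cycle-level performance model concrete, here is a deliberately tiny sketch. It is not the kind of model used in real design work; it assumes a hypothetical five-stage in-order scalar pipeline that issues one instruction per cycle and loses one cycle whenever an instruction reads the result of the load immediately before it. The names `Instr`, `count_cycles`, `PIPELINE_DEPTH`, and `LOAD_USE_STALL` are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Assumed machine parameters (hypothetical, chosen only for illustration).
PIPELINE_DEPTH = 5   # cycles for the first instruction to flow through
LOAD_USE_STALL = 1   # bubble inserted when an instruction uses a just-loaded value


@dataclass(frozen=True)
class Instr:
    op: str                      # e.g. "alu", "load", "store"
    dst: Optional[str] = None    # destination register, if any
    srcs: Tuple[str, ...] = ()   # source registers


def count_cycles(trace: Sequence[Instr]) -> int:
    """Return the number of machine cycles needed to execute the trace."""
    if not trace:
        return 0
    cycles = PIPELINE_DEPTH      # the first instruction fills the pipeline
    prev = trace[0]
    for instr in trace[1:]:
        cycles += 1              # ideally one instruction completes per cycle
        if prev.op == "load" and prev.dst in instr.srcs:
            cycles += LOAD_USE_STALL   # load-use dependence costs a bubble
        prev = instr
    return cycles


# A four-instruction "benchmark" with one load-use dependence.
trace = [
    Instr("load", "r1", ("r0",)),
    Instr("alu", "r2", ("r1", "r3")),   # depends on the load just before it
    Instr("alu", "r4", ("r2", "r2")),
    Instr("store", None, ("r4", "r0")),
]
total = count_cycles(trace)
print(total, total / len(trace))        # total cycles and cycles per instruction
```

Even a model this small shows the workflow described above: change a microarchitecture parameter (for example, set `LOAD_USE_STALL` to 0 to approximate a hypothetical design with no load-use penalty), rerun the same trace, and compare cycle counts.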


multiple implementations that provide different levels of cost and performance can be simultaneously developed. A program only needs to be developed once for that ISA, and then it can run on all these implementations. Such program portability significantly reduces the cost of software development and increases the longevity of software. Unfortunately this same benefit also makes migration to a new ISA very difficult. Successful ISAs, or more specifically ISAs with a large software installed base, tend to stay around for quite a while. Two examples are the IBM 360/370 and the Intel IA32.

Besides serving as a reference targeted by software developers or compilers, the ISA serves as the specification for processor designers. Microprocessor design starts with the ISA and produces a microarchitecture that meets this specification. Every new microarchitecture must be validated against the ISA to ensure that it satisfies the functional requirements specified by the ISA. This is extremely important to ensure that existing software can run correctly on the new microarchitecture.

Since the advent of computers, a wide variety of ISAs have been developed and used. They differ in how operations and operands are specified. Typically an ISA defines a set of instructions called assembly instructions. Each instruction specifies an operation and one or more operands. Each ISA uniquely defines an assembly language. An assembly language program consists of a sequence of assembly instructions. ISAs have been differentiated according to the number of operands that can be explicitly specified in each instruction, for example two-address or three-address architectures. Some early ISAs use an accumulator as an implicit operand. In an accumulator-based architecture, the accumulator is used as an implicit source operand and the destination. Other early ISAs assume that operands are stored in a stack [last in, first out (LIFO)] structure and operations are performed on the top one or two entries of the stack.
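The operand-specification styles described above (three-address, accumulator-based, and stack-based) can be contrasted with a small sketch that computes d = (a + b) * c in each style. The mnemonics, tuple encodings, and interpreter functions below are illustrative assumptions and do not correspond to any real ISA.

```python
import operator

# Arithmetic operations shared by all three toy machines (hypothetical mnemonics).
OPS = {"add": operator.add, "mul": operator.mul}


def run_three_address(program, regs):
    """Every instruction names its destination and both sources explicitly."""
    for op, dst, src1, src2 in program:
        regs[dst] = OPS[op](regs[src1], regs[src2])
    return regs


def run_accumulator(program, memory):
    """The accumulator is an implicit source operand and the destination."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        else:
            acc = OPS[op](acc, memory[addr])
    return memory


def run_stack(program, memory):
    """Operations act on the top two entries of a LIFO stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(OPS[op](lhs, rhs))
    return memory


# Three-address: two instructions, operands fully explicit.
regs = run_three_address(
    [("add", "r4", "r1", "r2"), ("mul", "r5", "r4", "r3")],
    {"r1": 2, "r2": 3, "r3": 4},
)

# Accumulator: one explicit operand per instruction.
acc_mem = run_accumulator(
    [("load", "a"), ("add", "b"), ("mul", "c"), ("store", "d")],
    {"a": 2, "b": 3, "c": 4},
)

# Stack: arithmetic instructions name no operands at all.
stack_mem = run_stack(
    [("push", "a"), ("push", "b"), ("add",), ("push", "c"), ("mul",), ("pop", "d")],
    {"a": 2, "b": 3, "c": 4},
)

print(regs["r5"], acc_mem["d"], stack_mem["d"])
```

All three programs compute the same value; what differs is how many operands each instruction names explicitly and, as a result, how many instructions the same expression takes.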
Most modern ISAs assume that operands are stored in a multientry register file, and that all arithmetic and logical operations are performed on operands stored in the registers. Special instructions, such as load and store instructions, are devised to move operands between the register file and the main memory. Some traditional ISAs allow operands to come directly from both the register file and the main memory.

ISAs tend to evolve very slowly due to the inertia against recompiling or redeveloping software. Typically a twofold performance increase is needed before software developers will be willing to pay the overhead to recompile their existing applications. While new extensions to an existing ISA can occur from time to time to accommodate new emerging applications, the introduction of a brand new ISA is a tall order. The development of effective compilers and operating systems for a new ISA can take on the order of 10+ years. The longer an ISA has been in existence and the larger the installed base of software based on that ISA, the more difficult it is to replace that ISA. One possible exception might be in certain special application domains where a specialized new ISA might be able to provide a significant performance boost, such as on the order of 10-fold.

Unlike the glacial creep of ISA innovations, significantly new microarchitectures can be and have been developed every 3 to 5 years. During the 1980s, there was widespread interest in ISA design and passionate debates about what constituted the


1.2 Dynamic-Static Interface

So far we have discussed two critical roles played by the ISA. First, it provides a contract between the software and the hardware, which facilitates the independent development of programs and machines. Second, an ISA serves as the specification for microprocessor design. All implementations must meet the requirements and support the functionality specified in the ISA. In addition to these two critical roles, each ISA has a third role. Inherent in the definition of every ISA is an associated definition of an interface that separates what is done statically at compile time versus what is done dynamically at run time. This interface has been called the dynamic-static interface (DSI) by Yale Patt and is illustrated in Figure 1 [Melvin and Patt, 1987].

The DSI is a direct consequence of having the ISA serve as a contract between the software and the hardware. Traditionally, all the tasks and optimizations done in the static domain at compile time involve the software and the compiler, and are considered above the DSI. Conversely, all the tasks and optimizations done in the dynamic domain at run time involve the hardware and are considered below the DSI. All the architecture features are specified in the ISA and are therefore exposed to the software above the DSI in the static domain. On the other hand, all the implementation features of the microarchitecture are below the DSI and operate in the dynamic domain at run time; usually these are completely hidden from the software and the compiler in the static domain. As stated earlier, software development can take place above the DSI independent of the development of the microarchitecture features below the DSI.

A key issue in the design of an ISA is the placement of the DSI. In between the application program written in a high-level language at the top and the actual hardware of the machine at the bottom, there can be different levels of abstraction where the DSI can potentially be placed.
The placement of the DSI is correlated


features may become ineffective or even undesirable. If some of these older features were promoted to the ISA level, then they become part of the ISA and there will exist an installed software base or legacy code containing these features. Since all future implementations must support the entire ISA to ensure the portability of all existing code, the unfortunate consequence is that all future implementations must continue to support those ISA features that had been promoted earlier, even if they are now ineffective and even undesirable. Such mistakes have been made with real ISAs. The lesson learned from these mistakes is that a strict separation of architecture and microarchitecture must be maintained in a disciplined fashion. Ideally, the architecture or ISA should only contain features necessary to express the functionality or the semantics of the software algorithm, whereas all the features that are employed to facilitate better program performance should be relegated to the implementation or the microarchitecture domain.

The focus of this book is not on ISA design but on microarchitecture techniques, with almost exclusive emphasis on performance. ISA features can influence the design effort and the design complexity needed to achieve high levels of performance. However, our view is that in contemporary high-end microprocessor design, it is the microarchitecture, and not the ISA, that is the dominant determinant of microprocessor performance. Hence, the focus of this book is on microarchitecture techniques for achieving high performance. There are other important design objectives, such as power, cost, and reliability. However, historically performance has received the most attention, and there is a large body of knowledge on techniques for enhancing performance. It is this body of knowledge that this book is attempting to codify.

