PROCESSORS
• P-6 FAMILY PROCESSORS & THEIR MAIN FEATURES
• MAIN FEATURES OF P-II, P-III
12/11/2008 Er. Nikhil Marriwala 2
ROAD MAP
• 6th GENERATION PROCESSORS
• P-6 MAIN FEATURES
• PROCESSORS IN P-6 CATEGORY
• P-II, P-III MAIN FEATURES
• THREE ENGINES COMMUNICATING USING INSTRUCTION POOL
6th Gen. Processor: P6
P6 Processor Variations:
• Pentium Pro: Original P6 processor. L2 cache: 256kB, 512kB or 1MB (full-core speed)
• Pentium II: P6 with L2 cache: 512kB (half-core speed)
• Pentium II Xeon: P6 with L2 cache: 512kB/1MB/2MB (full-core speed)
• Celeron: P6 without L2 cache
• Celeron-A: P6 with L2 cache: 128kB on-die (full-core speed)
• Pentium III: P6 with SSE (MMX2), L2 cache: 512kB (half-core speed) or, from the Coppermine core on, 256kB on-die (full-core speed)
• Pentium III Xeon: P6 with SSE (MMX2), L2 cache: 512kB/1MB/2MB on-die (full-core speed)
P6: Main New Features...
Dual Independent Bus (DIB): two separate data buses
• One system bus (motherboard…)
• One cache bus, also called the “back-side bus” of the processor
Advantage: The speed of the cache scales with the processor.
In contrast: P5… cache speed = motherboard speed (that’s why there is no P5 processor > 266MHz).
P6 processors: >1GHz
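The scaling advantage of the back-side bus can be put into numbers. The sketch below is only illustrative: the Pentium II core clocks and the half-core cache ratio come from these slides, while the 66 MHz motherboard clock used for the P5 comparison and the function name are assumptions.

```python
def l2_cache_clock_mhz(core_mhz, ratio):
    """L2 cache clock on a back-side bus: a fixed fraction of the core clock."""
    return core_mhz * ratio


# Assumption for the P5 comparison: the cache sits on a 66 MHz motherboard bus,
# so its clock does not change when the core gets faster.
P5_MOTHERBOARD_MHZ = 66

# Pentium II: L2 cache runs at half the core clock (ratio 0.5).
for core_mhz in (233, 450):
    cache_mhz = l2_cache_clock_mhz(core_mhz, 0.5)
    print(f"Pentium II {core_mhz} MHz: L2 at {cache_mhz:.1f} MHz "
          f"(P5-style cache would stay at {P5_MOTHERBOARD_MHZ} MHz)")
```

The cache clock rises with every faster core, whereas a motherboard-clocked cache would not.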
P6: Main New Features...
Dynamic Execution
• Multiple Branch Prediction: Predict the flow of the program through several branches. Goal: Keep the instruction pipelines full.
• Dataflow analysis: Detect opportunities for out-of-order instruction execution. Goal: Optimize the use of the multiple superscalar execution units.
• Speculative execution: Execution of instructions in advance of the actual program counter.
  • Execute all available instructions in the instruction pool.
  • Store the results in temporary registers.
  • The retirement unit searches the instruction pool for completed instructions that are no longer data dependent on other instructions.
  • If such instructions are found, their results are committed in the order they were issued, and the instructions are then retired from the pool.
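The idea behind dataflow analysis can be sketched in a few lines. This is a simplified illustration, not the P6 hardware algorithm: the instruction format, the register names and the function `ready_instructions` are hypothetical, and only true (read-after-write) dependencies are checked, on the assumption that the temporary registers remove the other hazards.

```python
from collections import namedtuple

# Hypothetical mini instruction format: one destination, a tuple of sources.
Instr = namedtuple("Instr", "name dest srcs")


def ready_instructions(pool, completed):
    """Return the names of unfinished instructions that can start now.

    pool is in program order; completed is a set of finished names.
    An instruction must wait if one of its sources is written by an
    earlier instruction that has not finished yet.
    """
    ready = []
    pending_dests = set()  # registers still being produced by earlier instructions
    for ins in pool:
        if ins.name in completed:
            continue
        if not any(src in pending_dests for src in ins.srcs):
            ready.append(ins.name)
        pending_dests.add(ins.dest)
    return ready


pool = [
    Instr("i1", "r1", ()),            # load r1 (no register sources)
    Instr("i2", "r2", ("r1", "r3")),  # needs the result of i1
    Instr("i3", "r4", ("r5", "r6")),  # independent of i1 and i2
    Instr("i4", "r7", ("r4",)),       # needs the result of i3
]

print(ready_instructions(pool, set()))         # ['i1', 'i3']
print(ready_instructions(pool, {"i1", "i3"}))  # ['i2', 'i4']
```

Here i3 can execute ahead of i2 even though it comes later in the program, which is the kind of out-of-order opportunity the slide refers to.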
P6: Main New Features...
Three-way superscalar:
• P6 has at least six separate instruction units…
• Up to 3 instructions in one cycle
Other new features:
• A few new instructions
• Enhanced multi-processor support
• Only recent Windows versions (NT/2000/XP) take full advantage of the P6’s capabilities
• Use optimising compilers to make code as “predictable” as possible
P6: Pentium II
• Processor core speeds: 233-450MHz
• Bus speeds: 66-100MHz
• 7.5M transistors
• 0.25/0.35 µm BiCMOS
• MMX
• Power dissipation: >40W! Heatsinks and fans required!
• A-bus: 36b. Addressable: 64GB
• L2 cache: half core speed. Supports up to 512MB
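The 64GB figure follows directly from the 36-bit address bus. A quick check, with a 32-bit bus shown for comparison (the helper name is just for illustration):

```python
GIB = 2 ** 30  # bytes in one gigabyte (binary)


def addressable_gib(address_bits):
    """Size of the physical address space, in GB, for a given bus width."""
    return 2 ** address_bits // GIB


print(addressable_gib(32))  # 4
print(addressable_gib(36))  # 64
```

Each extra address line doubles the address space, so the four extra lines give 16 times the 4GB reachable with 32 bits.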
P6: Celeron
• Cheaper packaging (SEP): no fancy plastic cartridge
• Specifically designed for lower-cost PCs
• L2 cache supports up to 4GB of RAM
• MMX
• L1 cache: 2 × 16kB
• Integral thermal diode for temperature monitoring
• 0.25/0.18 µm technology
P6: Pentium III
• Introduced: 02/1999
• 28M transistors
• 0.18 µm “Coppermine” technology
• Interconnect: despite the code name, Coppermine still uses aluminum interconnects; copper interconnects arrived only with the later 0.13 µm process
• Major improvements:
  • SSE (Streaming SIMD Extensions)
  • Integrated on-die L2 cache
• Available up to 1GHz
7th Gen. Processor: Pentium 4
• Introduced: 11/2000. Also called “NetBurst” (the name of its microarchitecture)
Main technical details:
• Core speed range 1.3GHz..>3GHz…?
• 42M transistors
• 0.18 µm and 90nm…
• System (front-side) bus: up to 800MHz
• ALU runs at twice the processor core frequency
• Hyper-pipelined 20-stage technology
• Very deep out-of-order instruction execution
• 20kB L1 cache
• 256kB full-speed L2 cache, 8-way set associative. L2 supports up to 4GB RAM and ECC
• SSE2: 144 new SSE2 instructions
• Socket 423
• Up to 64W of power dissipation
P6 Family of processors: Pentium Pro, Pentium II & Pentium III
Major features:
• Dynamic program execution
• Three 12-stage pipelines
• On-chip FPU
• Separate L1 code and data caches with write-back strategy
• Out-of-order completion of instructions
• Data forwarding
• Dynamic Branch Prediction
Major Features (contd.)
• Speculative Program Execution
• Multiprocessing with up to four processors, with no additional logic
• Address bus widened to 36 bits: physical address space of 64Gbytes, enabled by a new bit in CR4
• New instructions: CMOVcc for conditional MOVs
• Integrated L2 cache, within the same package, connected over a dedicated bus running at the full (or half) CPU clock
Major Features (contd.)
• Pentium II & Pentium III: MMX instructions for packed-integer (SIMD) operations in graphics and multimedia code
• Pentium III: SIMD extensions (SSE) adding packed floating-point operations for high-speed graphics
• Dramatic increases in clock frequencies supported
A block diagram of the Pentium Pro is given ahead.
Major Features (contd.)
• Three-way superscalar, pipelined micro-architecture.
• Decoupled, multi-stage super-pipeline: the Pentium II has twelve stages, with a pipe stage time 33 percent less than the Pentium processor
  ==> less work per pipe stage, in exchange for more stages.
  ==> a higher clock rate on any given manufacturing process.
• The Pentium Pro, Pentium II and III processors use basically the same “dynamic execution” (i.e. out-of-order superscalar) microarchitecture principles.
Major Features (contd.)
• A wide instruction window using an “instruction pool”.
• The “execute” phase is replaced by decoupled “issue”, “execute”, and “retire” phases.
• Instruction execution may start in any order, but instructions are always retired in the original program order.
• Processors in the P6 family may be thought of as three independent engines coupled with an instruction pool.
P6 family of Processors
The P6 family of processors uses a dynamic execution micro-architecture. This three-way superscalar, pipelined micro-architecture features a decoupled, multi-stage super-pipeline, which trades less work per pipe stage for more stages. A P6 family processor, for example, has twelve stages with a pipe stage time 33 percent less than the Pentium processor, which helps achieve a higher clock rate on any given manufacturing process. The approach used in the P6 family micro-architecture removes the constraint of linear instruction sequencing between the traditional “fetch” and “execute” phases, and opens up a wide instruction window using an instruction pool.
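The effect of the shorter pipe stage on clock rate can be estimated with simple arithmetic. The sketch assumes, as a simplification, that the achievable clock rate is limited only by the pipe stage time; the function name is illustrative.

```python
def clock_gain(stage_time_reduction):
    """Upper bound on the clock-rate gain when each pipe stage gets shorter.

    Assumes clock rate = 1 / stage time and nothing else limits the clock.
    """
    return 1.0 / (1.0 - stage_time_reduction)


# 33 percent shorter pipe stage, as quoted for the P6 family vs. the Pentium.
print(f"{clock_gain(0.33):.2f}x")  # 1.49x
```

Under this assumption, a stage time that is one third shorter allows a clock roughly one and a half times as fast on the same manufacturing process.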
P6 family of Processors
This approach gives the “execute” phase of the processor much more visibility into the program instruction stream, so that better scheduling can take place. It requires the instruction “fetch/decode” phase to be much more efficient at predicting program flow. Optimized scheduling requires the fundamental “execute” phase to be replaced by decoupled “dispatch/execute” and “retire” phases. This allows instructions to be started in any order but always completed in the original program order. Processors in the P6 family may be thought of as three independent engines coupled with an instruction pool, as shown in the figure.
Three Engines Communicating Using an Instruction Pool
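The three engines can be mimicked with a toy cycle-by-cycle model. This is a rough sketch under stated assumptions, not a description of the real hardware: the instruction tuples, the latencies, the unlimited pool size and the function `simulate` are all hypothetical, and only true data dependencies are modelled.

```python
from collections import deque

FETCH_WIDTH = 3  # three-way superscalar: up to 3 instructions enter per cycle


def simulate(program, latency):
    """program: list of (name, dest, srcs) in program order.
    latency: dict mapping name -> execution time in cycles (each >= 1).
    Returns (start order, retirement order, total cycles)."""
    fetch_queue = deque(program)
    pool, started, retired = [], [], []
    cycle = 0
    while fetch_queue or pool:
        cycle += 1
        # Engine 1, fetch/decode: place instructions into the pool in program order.
        for _ in range(min(FETCH_WIDTH, len(fetch_queue))):
            name, dest, srcs = fetch_queue.popleft()
            pool.append({"name": name, "dest": dest, "srcs": srcs, "done_at": None})
        # Engine 2, dispatch/execute: start any instruction whose inputs are ready.
        for i, ins in enumerate(pool):
            if ins["done_at"] is not None:
                continue  # already started
            waiting = any(
                (e["done_at"] is None or e["done_at"] > cycle) and e["dest"] in ins["srcs"]
                for e in pool[:i]
            )
            if not waiting:
                ins["done_at"] = cycle + latency[ins["name"]]
                started.append(ins["name"])
        # Engine 3, retire: commit finished instructions from the head of the pool only.
        while pool and pool[0]["done_at"] is not None and pool[0]["done_at"] <= cycle:
            retired.append(pool.pop(0)["name"])
    return started, retired, cycle


program = [
    ("load", "r1", ()),           # slow instruction (assumed 4 cycles)
    ("add", "r2", ("r1",)),       # depends on load
    ("mul", "r3", ("r4", "r5")),  # independent
    ("sub", "r6", ("r3",)),       # depends on mul
]
latency = {"load": 4, "add": 1, "mul": 2, "sub": 1}

started, retired, cycles = simulate(program, latency)
print(started)  # ['load', 'mul', 'sub', 'add']
print(retired)  # ['load', 'add', 'mul', 'sub']
```

Execution starts out of order (mul and sub overtake add while the load is outstanding), yet retirement follows the original program order.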
CONCLUSION
Today's objectives have been achieved, which were LEARNING ABOUT:
• 6th GENERATION PROCESSORS
• P-6 FAMILY PROCESSORS & THEIR MAIN FEATURES
• MAIN FEATURES OF P-II, P-III
Queries
Any questions related to the topic can be mailed to me at nikhilmarriwala@hotmail.com