To visualize how this works, consider a hospital surgical unit that consists of areas for admittance,
surgery, and recovery. Patients can move in only one direction, from admittance to recovery, and it
takes the same amount of time to go through each of the areas. Assume the admitting area can
handle three patients at a time and there are three surgical teams, each of which can work on a
single patient. Also assume the recovery area has an indeterminate number of beds, but can
accommodate only one person per bed. When the unit is working correctly, the admitting area
processes three patients at a time, sends one to each of the teams, and immediately processes
another three patients. Even though the surgical teams can handle only one patient at a time,
because there are three of them, they will have passed their charges on by the time the new ones
arrive. The paths the three patients take are analogous to instructions flowing through three
pipelines in a CPU clock cycle. The admitting area is like a fetching mechanism, the surgery teams
are like execution units, and the recovery room is like the registers or cache to which the units
write their results.
To illustrate the kind of problems that can occur in superscalar architectures, consider what would
happen if the staff of the admitting area in the example were not very competent. For example, if
they passed a patient in need of a kidney transplant to a surgical team before the donor kidney
was available, the team wouldn't be able to go to work. Suddenly, there would be a bottleneck at
the admitting area because only two surgical teams would be available for new patients. Another
bottleneck could occur if a surgical team tried to assign a patient to an already occupied bed in the
recovery area. Again, a bottleneck would appear because the team would not be available until
the bed was emptied and the team could move the current patient into it. Stalls like this happen in
processors when an execution unit tries to perform a task that depends on the result of an
instruction that has not yet completed. This is why it is important that CPUs carefully manage the
order in which they process instructions.
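The stall described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how any real CPU is built: instructions issue strictly in order, every instruction takes one cycle, and a result becomes readable in the cycle after it issues. The names `Instr` and `issue_cycles` are hypothetical.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str     # register written by the instruction
    srcs: tuple   # registers read by the instruction


def issue_cycles(program, width=3):
    """Return the cycle in which each instruction issues.

    Assumptions (illustrative only): in-order issue, at most `width`
    instructions per cycle, single-cycle latency.
    """
    ready_at = {}          # register -> first cycle in which it can be read
    issued = []
    cycle, used = 0, 0     # current cycle and issue slots already taken in it
    for ins in program:
        # Earliest cycle in which every source operand is available.
        earliest = max((ready_at.get(r, 0) for r in ins.srcs), default=0)
        if used == width:              # all units busy: move to the next cycle
            cycle, used = cycle + 1, 0
        if earliest > cycle:           # operand not ready: stall until it is
            cycle, used = earliest, 0
        issued.append(cycle)
        used += 1
        ready_at[ins.dest] = cycle + 1
    return issued


# The second instruction needs r1, so it and everything behind it wait a cycle.
program = [Instr("r1", ("a",)), Instr("r2", ("r1",)), Instr("r3", ("b",))]
print(issue_cycles(program))   # [0, 1, 1]
```

In this sketch the third instruction is independent but still waits, because issue is in order; that is the "bottleneck at the admitting area" in the analogy.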
Simple superscalar pipeline. By fetching and dispatching two instructions at a time, a maximum of two instructions
per cycle can be completed. (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory
access, WB = Register write back, i = Instruction number, t = Clock cycle [i.e., time])
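The timing in the caption can be reproduced with a short sketch. It assumes an ideal pipeline with no stalls, in which instruction i is fetched in cycle i // width and then advances one stage per cycle; the function names are hypothetical and chosen only for this illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(n_instr, width=2):
    """Map each instruction i to {stage: cycle t}, assuming no stalls."""
    schedule = []
    for i in range(n_instr):
        fetch = i // width   # `width` instructions are fetched together
        schedule.append({stage: fetch + k for k, stage in enumerate(STAGES)})
    return schedule


def total_cycles(n_instr, width=2):
    """Cycles until the last instruction finishes WB (n_instr >= 1)."""
    return pipeline_schedule(n_instr, width)[-1]["WB"] + 1


def render(schedule):
    """Draw one row per instruction and one column per clock cycle."""
    n_cycles = schedule[-1]["WB"] + 1
    rows = []
    for i, stages in enumerate(schedule):
        cells = ["   "] * n_cycles
        for stage, t in stages.items():
            cells[t] = stage.ljust(3)
        rows.append(f"i{i}: " + " ".join(cells).rstrip())
    return "\n".join(rows)


print(render(pipeline_schedule(4, width=2)))
print(total_cycles(10, width=2))   # 9
print(total_cycles(10, width=1))   # 14
```

Under these assumptions, ten instructions need 9 cycles at width 2 versus 14 cycles at width 1, which is the benefit of completing up to two instructions per cycle.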
superscalar
Superscalar describes a microprocessor design that makes it possible for more than
one instruction to be executed during a single clock cycle. In a superscalar
design, the processor or the compiler determines whether an
instruction can be carried out independently of other sequential instructions, or whether
it has a dependency on another instruction and must be executed in sequence with it.
The processor then uses multiple execution units to carry out two or
more independent instructions simultaneously. Superscalar design is sometimes called
"second generation RISC."
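The independence check mentioned above can be sketched as a comparison of the registers two instructions read and write. The text does not name the dependency types; the sketch uses the standard classification into read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW) hazards. `Instr`, `hazards`, and `independent` are hypothetical names, and a real processor would also have to consider memory and control dependencies.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str     # register written
    srcs: tuple   # registers read


def hazards(first, second):
    """Return the register hazards between two instructions in program order."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")   # second reads what first writes
    if second.dest in first.srcs:
        found.add("WAR")   # second overwrites what first still reads
    if first.dest == second.dest:
        found.add("WAW")   # both write the same register
    return found


def independent(first, second):
    """True if the pair could, in this simplified model, execute together."""
    return not hazards(first, second)


add = Instr("r1", ("r2", "r3"))   # r1 = r2 + r3
mul = Instr("r4", ("r1", "r5"))   # r4 = r1 * r5, needs the result of add
sub = Instr("r6", ("r7", "r8"))   # r6 = r7 - r8, unrelated registers

print(hazards(add, mul))       # {'RAW'}
print(independent(add, sub))   # True
```

Here `add` and `sub` could be sent to separate execution units in the same cycle, while `mul` must wait for `add`.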