
Chapter 1:

1) Define the term computer architecture and computer organization.

Computer architecture – the attributes of a system that are visible to a programmer, i.e. the attributes that have a direct impact on the logical execution of a program. Examples: the instruction set, the number of bits used to represent the various data types, I/O mechanisms, and techniques for addressing memory.

Computer organization – the operational units and their interconnections that realize the architectural specification. Examples: hardware details transparent to the programmer, such as control signals, interfaces between the computer and peripherals, and the memory technology used.

For example, whether a system will have a multiply instruction is an architectural issue. Whether that instruction will be implemented with a dedicated multiply unit or by a mechanism that makes repeated use of the system's add unit is an organizational issue.

2) Define the terms computer structure and computer function.

Computer structure – the way in which all the components are interrelated.

Computer function – the operation of each individual component as part of the structure.

The basic functions that a computer can perform are:

-data processing

-data storage

-data movement

-control

3) List and briefly describe the structural components of a computer.

Central Processing Unit (CPU) – controls the operation of the computer and performs its data processing functions; also referred to as the processor.

Main Memory – stores data and instructions.

I/O – moves data between the computer and its external environment.

System Interconnection – the mechanism that provides communication among the CPU, main memory, and the I/O devices. A system bus is a common example of system interconnection.

4) List and briefly describe the structural components of a processor.

Control Unit – controls the operation of the CPU.

Arithmetic Logic Unit (ALU) – performs the computer's data processing functions.

Registers – provide storage internal to the CPU.

CPU Interconnection – the mechanism that provides communication among the control unit, the arithmetic logic unit (ALU) and the registers.

Chapter 2:

1) Briefly describe the key concepts of Von Neumann architecture.

The Von Neumann architecture is a design model based on the stored-program concept: a central processing unit (CPU) is kept separate from a single storage structure (memory) that holds both instructions and data. The computer obtains its instructions by reading them from memory, and a program can be set or altered by changing the values held in a portion of memory.

2) Discuss the processing techniques used to maximize the high raw frequency of modern
processors.

Branch prediction – the processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next. The processor predicts not just the next branch but multiple branches ahead, so branch prediction increases the amount of work available for the processor to execute.

Data flow analysis – the processor analyzes which instructions depend on each other's results or data, and uses this to build an optimized schedule of instructions. To prevent unnecessary delay, instructions are scheduled to be executed when ready, independent of the original program order.

Speculative execution – using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program, holding the results in temporary locations. This keeps the processor's execution engines as busy as possible by executing instructions that are likely to be needed.

3) Discuss the solutions that can be taken to balance the performance of logic circuits
(processors) and memory circuits (memory devices).

-Increase the number of bits retrieved at one time by making DRAMs "wider" rather than "deeper" and by using wide bus data paths.
-Change the DRAM interface to make it more efficient by including a cache on the DRAM chip.
-Reduce the frequency of memory accesses by incorporating complex and efficient cache structures between the processor and memory.
-Increase the interconnect bandwidth between the processor and memory by using higher-speed buses and a hierarchy of buses to buffer and structure data flow.

4) Discuss the issues involved in designing microprocessors with high operating frequencies.

 Increase the hardware speed of the processor – fundamentally achieved by shrinking the size of the logic gates on the processor chip, so that more gates can be packed together more tightly and propagation time is reduced. Increasing the clock rate means that individual operations are executed more rapidly, speeding up the processor.

 Increase the size and speed of the caches interposed between the processor and main memory. In particular, by placing the cache on the processor chip itself, cache access time drops significantly, which increases processor performance.

 Make changes to the processor organization and architecture that increase the effective speed of instruction execution. Typically this involves using parallelism in one form or another.

5) Define the terms clock cycle, instruction cycle, cycles per instruction (CPI), millions of instructions per second (MIPS) and millions of floating-point operations per second (MFLOPS).

 Clock cycle – a single digital electronic pulse, one 1–0 transition of the clock signal.
 Instruction cycle – the time required to fetch and execute one instruction.
 CPI – the number of clock cycles required for an instruction to be executed.
 MIPS – a measurement of the instruction execution rate.
 MFLOPS – a measurement of the floating-point instruction execution rate.
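The relationships among these quantities can be sketched numerically; the figures below are hypothetical, chosen only to illustrate the formulas:

```python
# Hypothetical processor figures, not taken from the text.
clock_rate = 400e6        # 400 MHz clock -> clock cycle time of 2.5 ns
instruction_count = 2e6   # instructions in the measured program
avg_cpi = 1.25            # average cycles per instruction

exec_time = instruction_count * avg_cpi / clock_rate  # seconds
mips = instruction_count / (exec_time * 1e6)          # = clock_rate / (avg_cpi * 1e6)
print(exec_time, mips)    # 0.00625 320.0
```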
6) Discuss how the performance of three modern processors (i.e. with cache memory,
pipelining, branch prediction, etc.) can be estimated.

7) What is benchmarking? Discuss averaging methods that can be used to estimate performance.

Benchmarking is the act of running the same set of programs or operations on different machines and comparing the execution times in order to assess relative performance.

Averaging methods obtain a reliable comparison of the performance of various computers by running a number of different benchmark programs on each machine and then averaging the results. Examples of averaging methods are the arithmetic mean and the harmonic mean. What ultimately matters is the execution time of a system, not its execution rate. When rates are averaged, the arithmetic mean is not clearly related to execution time: it is proportional to the sum of the inverses of the execution times, not inversely proportional to the sum of the execution times. The harmonic mean, in contrast, is the inverse of the average execution time, so it reflects total execution time directly.
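The contrast between the two means can be seen on a small hypothetical example: two benchmark scores, in MFLOPS, for the same machine:

```python
rates = [100.0, 400.0]   # hypothetical MFLOPS scores on two benchmarks

arithmetic_mean = sum(rates) / len(rates)                 # dominated by the fast benchmark
harmonic_mean = len(rates) / sum(1.0 / r for r in rates)  # tracks total execution time
print(arithmetic_mean, harmonic_mean)   # 250.0 160.0
```

The harmonic mean is lower because the 100-MFLOPS benchmark consumes most of the actual running time.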

8) Describe Amdahl's Law by giving some examples.

Amdahl's Law is used to find the maximum expected improvement to an overall system when only a part of the system is improved.

For example, suppose a task makes extensive use of floating-point operations, with 40% of the execution time consumed by floating-point operations. If a new hardware design speeds up the floating-point module by a factor of K, then the overall speedup is

Speedup = 1 / ((1 − f) + f/K), where f = fraction of time consumed by the affected operations = 0.4.

As K → ∞, Speedup → 1 / (1 − f) = 1 / 0.6 ≈ 1.67.

Thus, independent of K, the maximum speedup is 1.67.
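The law is a one-line function; the sketch below uses f = 0.4 from the example above:

```python
def amdahl_speedup(f, k):
    """Overall speedup when a fraction f of execution time is sped up by factor k."""
    return 1.0 / ((1.0 - f) + f / k)

print(amdahl_speedup(0.4, 2))      # 1.25
print(amdahl_speedup(0.4, 1000))   # approaches the 1/(1-f) = 1.67 ceiling
```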

Chapter 3:

1) Describe the terms hardwired program and software program.

Hardwired program – connecting the various basic logic components in a desired configuration is itself a form of programming. Such a system accepts the data provided and produces results, but to obtain a different kind of result the hardware must be rewired. Results are produced quickly, but the system is inflexible.

Software program – a general-purpose arithmetic and logic unit is used that can perform various functions on data depending on the control signals applied to the hardware. Instead of rewiring the hardware for each new program, the programmer only needs to supply a new set of control signals.

2) Describe a basic execution flow of an instruction.

The basic execution flow of an instruction can be divided into two steps: instruction fetch and instruction execute. At the beginning of each instruction cycle, the processor fetches an instruction from memory. A register called the program counter (PC) holds the address of the next instruction to be fetched, and each time an instruction is fetched the program counter is automatically incremented. The instruction fetched from memory is loaded into a register called the instruction register (IR), where it is decoded to specify the action the processor is required to execute.
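The two steps can be sketched as a toy fetch-execute loop; the two-field instruction format (opcode, operand) is invented here purely to make the PC/IR roles concrete:

```python
memory = [("LOAD", 7), ("ADD", 3), ("HALT", 0)]  # program held in memory
pc, acc = 0, 0                                   # program counter, accumulator

while True:
    opcode, operand = memory[pc]   # fetch: IR <- memory[PC]
    pc += 1                        # PC automatically incremented after each fetch
    if opcode == "LOAD":           # execute: decode IR and act on it
        acc = operand
    elif opcode == "ADD":
        acc += operand
    else:                          # HALT
        break

print(acc)   # 10
```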

3) List all commonly known system registers and briefly describe their purposes.

Program Counter (PC) – holds the address of the next instruction to be fetched.
Instruction Register (IR) – holds the fetched instruction while it is decoded, so the processor knows which operation to perform.
Memory Address Register (MAR) – specifies the address in memory for the next read or write operation.
Memory Buffer Register (MBR) – contains the data to be written into memory, or receives the data read from memory.
I/O Address Register (I/OAR) – specifies a particular I/O device.
I/O Buffer Register (I/OBR) – is used to exchange data between an I/O module and the CPU.

4) Describe how an interrupt affects the execution flow of program instructions.

Interrupts are provided primarily as a way to improve processing efficiency. Most external devices are much slower than the processor. If the processor were transferring data to a printer without interrupts, it would have to pause and remain idle until the printer caught up, stopping the original program until the printer had received all the data; this waiting is a very wasteful use of the processor. With interrupts, the processor can continue executing the current program while the printer is consuming the data. When the printer is ready to receive more data, the I/O module for the external device sends an interrupt request to the processor; the processor suspends the current program, branches off to a routine that services that particular I/O device, and resumes the original execution after the device has been serviced.

5) Discuss the elements involved in bus design.

Bus Type – divided into two types, dedicated and multiplexed. With a multiplexed bus, data and addresses share the same lines, with control signals distinguishing whether the bus currently carries data or an address; this saves cost because fewer lines are used, but the circuitry is more complex. With dedicated buses, data and addresses use separate lines, so the circuitry is larger and the cost higher, but operations are faster because the lines are not shared.

Method of Arbitration – divided into two types, centralized and distributed. In centralized arbitration, a single hardware device called the bus controller or arbiter decides which module may use the bus. In distributed arbitration, each module contains access control logic and all modules act together to share the bus. The purpose of both methods is to designate one module as master and another as slave, with the master initiating the data transfer (read or write) to or from the slave.

Timing – divided into two types, synchronous and asynchronous. With synchronous timing, the occurrence of events is determined by a clock, and all events start at the beginning of a clock cycle. With asynchronous timing, the occurrence of one event depends on the occurrence of a previous event.

Bus Width – the wider the data bus, the better the performance, because more bits can be transferred at one time. The wider the address bus, the greater the capacity, meaning a larger range of locations can be referenced.

Data Transfer Type – all buses support read and write operations. Some buses also support combined operations such as read-modify-write and read-after-write; the whole combined operation is typically indivisible, to prevent any other module from accessing the shared memory resource in the middle of it.

6) Describe the difference between a memory module and an IO module.

7) Briefly explain how the PCI can be configured as a 64-bit bus.

8) Give your own opinion on why PCI maintained a 32-bit bus as the default configuration.

9) Consider a generic bus controller that supports 4 masters and 32 clients. Discuss the mechanism that you would use to handle arbitration for multiple simultaneous bus requests from 3 bus masters.

10) Describe the RS-232 and USB2.0 interfaces. Specify the relative implementation
cost, the supported data transfers, the maximum cable length, and the interface pins.

Chapter 4

1) List out the key characteristics of computer memory systems and give examples of
each characteristic.

Location – internal memory (processor registers, cache, main memory) or external memory (optical disks, magnetic disks, tapes).

Capacity – the size of the memory, expressed in terms of words or bytes. Common word lengths are 8, 16 and 32 bits.

Unit of transfer – equal to the number of electrical lines into and out of the memory module. It is normally equal to the word length, but is often larger, such as 64, 128 or 256 bits; such a larger unit is also referred to as a block of data.

Method of accessing – sequential access (example: tapes), direct access (example: disk units), random access (example: main memory) and associative access (example: cache memories).

Performance – access time, memory cycle time and transfer rate. For non-random-access devices, access time is the time it takes to position the read–write mechanism at the desired location. Memory cycle time is the access time of a random-access memory plus any additional time required before a second access can start. The transfer rate is the rate at which data can be transferred into or out of the memory; for random-access memory it equals 1/(cycle time).

Physical type – semiconductor, magnetic, optical and magneto-optical.

Physical characteristics – volatile (semiconductor memories) or non-volatile (magnetic-surface memories), and erasable or non-erasable.

2) Discuss the access methods used for memory devices and relate them to the physical
design of the devices that use each method.

3) List and briefly describe the performance parameters of memory devices.

There are three performance parameters in use: access time, memory cycle time and transfer rate.

Access time – for random-access memory, the time taken to perform a read or write operation, i.e. the time from the instant an address is presented to the memory to the instant the data have been stored or made available for use. For non-random-access memory, access time is the time taken to position the read–write mechanism at the desired location.

Memory cycle time – this concept applies primarily to random-access memory and consists of the access time plus any additional time required before a second access can start. This additional time may be required for transients to die out on the signal lines, or to regenerate the data if they are read destructively. Note that memory cycle time is concerned with the system bus, not the processor.

Transfer rate – the rate at which data can be transferred into or out of a memory unit. For random-access memory it equals 1/(cycle time). For non-random-access memory the following relationship holds:

T_N = T_A + n/R

where T_N = average time to read or write n bits,
T_A = average access time,
n = number of bits,
R = transfer rate, in bits per second (bps).
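For a non-random-access device the formula can be evaluated directly; the device figures below are hypothetical:

```python
t_a = 0.005      # average access time T_A, in seconds (hypothetical disk)
r = 20e6         # transfer rate R, in bits per second
n = 4096 * 8     # bits in one 4 KiB block

t_n = t_a + n / r   # T_N = T_A + n/R
print(t_n)          # ~0.00664 seconds: access time dominates the transfer
```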

4) Describe the memory hierarchy in terms of access time, capacity, cost per bit.

At the top of the memory hierarchy is the inboard memory (registers, cache and main memory); next is outboard storage (magnetic disk, CD-ROM, CD-RW, DVD-RW and DVD-RAM); and last is off-line storage (magnetic tape). Going down the hierarchy, the access time increases, the cost per bit decreases, the capacity increases, and the frequency with which the processor accesses the memory decreases.

5) Describe the usage of direct mapping method in a cache implementation. Highlight the
advantages and the disadvantages of this method.

Direct mapping is the simplest technique: it maps each block of main memory into only one possible cache line. The main memory address is viewed as three fields (tag, line and word). The tag keeps track of which main memory block is held in the cache line, the line field identifies the cache line, and the word field identifies the byte inside the block. Using a cache with direct mapping increases processor performance because it reduces the frequency of accesses to main memory: data already in the cache can be obtained from the cache directly. The advantage of direct mapping is that it is simple and inexpensive to implement. The disadvantage is that if a program repeatedly references words from two different blocks that map into the same line, the blocks will be continually swapped in the cache, the hit ratio will be low, and performance will not be optimal compared to the other techniques.
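The three-field address split can be sketched with integer arithmetic; the cache geometry below (16K lines of 16 bytes, 24-bit addresses) is a hypothetical example, not from the text:

```python
BLOCK_SIZE = 16          # bytes per block -> 4-bit word field
LINES = 16384            # cache lines     -> 14-bit line field

def split_address(addr):
    """Split a main-memory address into direct-mapped (tag, line, word) fields."""
    word = addr % BLOCK_SIZE
    line = (addr // BLOCK_SIZE) % LINES
    tag = addr // (BLOCK_SIZE * LINES)
    return tag, line, word

print(split_address(0x123456))   # (4, 9029, 6)
```

Two addresses that differ only in the tag field collide on the same line, which is exactly the swapping problem described above.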

6) Describe the usage of the associative mapping method in a cache implementation. Highlight the advantages and the disadvantages of this method.

In an associative-mapped cache, any block of main memory can be mapped into any slot of the cache. The mapping from main memory block to cache is performed by partitioning an address into a tag field and a word field. The tag is an identifier used to keep track of which main memory block is in the cache, and the word identifies the byte within the block. When a reference to main memory is made, the cache intercepts the reference and searches the cache tag memory to see whether the requested block is in the cache. All the tags are searched in parallel using an associative memory; if a tag in the cache tag memory matches the tag field of the memory reference, the word is taken from the position in the slot specified by the word field. If the referenced word is not in the cache, the block containing the word is brought into the cache and the referenced word is then taken from the cache. The advantage is flexibility: any block can be replaced by a new block when one is read into the cache. The disadvantage is the complex circuitry required to examine the tags of all cache lines in parallel.
7) Describe how set associative mapping merges the two methods mentioned above.

8) Discuss all possible replacement algorithms that can be used in cache implementations. Give an opinion on which algorithm is the best.

Least recently used (LRU) – replace the block in the set that has been in the cache longest with no reference to it.
First-in-first-out (FIFO) – replace the block in the set that has been in the cache longest.
Least frequently used (LFU) – replace the block in the set that has experienced the fewest references.

LRU is generally considered the best choice in practice, because blocks referenced recently are the most likely to be referenced again.
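The LRU policy can be simulated over a string of block references; this is a sketch of the policy itself, not tied to any particular cache organization:

```python
from collections import OrderedDict

def simulate_lru(refs, capacity):
    """Count hits for a block-reference string under LRU replacement."""
    cache = OrderedDict()
    hits = 0
    for block in refs:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # block becomes most recently used
        else:
            if len(cache) == capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits

print(simulate_lru([1, 2, 3, 1, 4, 1], capacity=3))   # 2 hits (both on block 1)
```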

Chapter 5:
1) Describe the properties of a semiconductor memory cell.

A semiconductor memory cell is used to represent the binary values 1 and 0. The cell's content can be altered by writing into the cell to set its state, and the content can be read to sense the state. The cell normally has three functional terminals capable of carrying an electrical signal: a select terminal that selects the cell, and a control terminal that specifies whether the operation is a read or a write. For writing, the third terminal provides an electrical signal that sets the state of the cell to 1 or 0; for reading, that terminal is used for output of the cell's state.

2) Compare the features of SRAM and DRAM. Discuss their usage in a computer system.

DRAM – essentially an analog device: it stores each data bit as a charge on a capacitor. DRAM requires periodic charge refreshing to maintain data storage, because capacitors have a natural tendency to discharge.
SRAM – a digital device: the data are stored using traditional flip-flop logic-gate configurations. An SRAM holds its data as long as power is supplied to it.
Both DRAM and SRAM are volatile; power is required to preserve the bit values. A DRAM cell is simpler, smaller and less expensive than an SRAM cell, but DRAM requires refresh circuitry. For larger memories, the fixed cost of the refresh circuitry is more than compensated for by the smaller variable cost of the DRAM cells, so DRAM tends to be favored for large memory requirements. SRAM is generally faster than DRAM, so it is used for cache memory (both on and off chip), while DRAM is used for main memory.

Chapter 6:

1) Describe RAID technology. List out all levels and highlight how RAID 0 is different
from others.

RAID technology provides increased storage reliability through redundancy, combining multiple relatively low-cost, less reliable disk drives into a logical unit in which all the drives in the array operate together.

There are seven levels of RAID technology: RAID 0 through RAID 6. RAID 0 differs from the others in that it does not use redundancy; it stripes data across the disks purely to improve performance.

2) Describe how data can be organized on a circular physical storage device such as the
magnetic disc and the optical disc.

On an optical disc, the data are organized as a sequence of blocks. A block consists of the following fields:
-Sync – identifies the beginning of a block. It consists of a byte of all 0s, 10 bytes of all 1s, and a byte of all 0s.
-Header – contains the block address and the mode byte. Mode 0 means a blank data field; mode 1 means the use of an error-correcting code and 2048 bytes of data; mode 2 means 2336 bytes of data with no error-correcting code.
-Data – the user data.
-Auxiliary – additional user data in mode 2. In mode 1, this is an error-correcting code.

Chapter 7:

1) Discuss why computer peripherals are not connected directly to the system bus, thus
requiring separate I/O interface.

-There are a wide variety of peripheral devices with various methods of operation. It would be impractical to incorporate the necessary logic within the processor to control such a range of devices.
-The data transfer rate of many peripherals is much slower than that of the memory or the processor. It is impractical to use the high-speed system bus to communicate directly with such a peripheral.
-Other peripherals may have a higher transfer rate than the memory or processor. Again, the mismatch would lead to inefficiencies if not managed properly.
-Peripherals often use data formats and word lengths different from those of the computer to which they are attached.

2) Describe the main I/O techniques that can be used to implement I/O interfaces.
Programmed I/O – when the processor executes a program and encounters an I/O operation, it directly controls the operation by issuing an I/O command to the I/O module, and must then wait until the operation is complete. This wastes a lot of time when the processor is faster than the I/O module, because the I/O module does not alert the processor when the operation finishes; it is the processor's responsibility to poll the status of the I/O module until it finds the operation complete.

Interrupt-driven I/O – after the processor issues a command to the I/O module, it can continue to execute other instructions; the I/O module interrupts the processor when the I/O device is ready. This technique is more efficient than programmed I/O because it eliminates the processor's needless waiting on the I/O device. However, it still consumes a lot of processor time, because every word of data transferred must still pass through the processor.

Direct Memory Access (DMA) – transfers data between an I/O device and memory without passing through the processor; the processor is involved only at the start and at the end of the transfer. When a process requires a data transfer involving an I/O device, the processor issues a command to the DMA module, providing the request type (read or write), the address, and the size of the data. The DMA module then carries out the data transfer between the I/O device and memory in place of the processor, and when the transfer is finished, the DMA module sends an interrupt signal to the processor and is ready for another I/O transfer.

Chapter 9:

1) Discuss all the possible 8-bit representations of a signed integer. Is it possible to have a
signed integer representation that include infinity?

The typical range for an 8-bit representation of a signed integer is −(2^(n−1) − 1) to 2^(n−1) − 1, where n is the number of bits; for n = 8 the range is −127 to 127. The first bit (the MSB) represents the sign of the number, so the magnitude is determined by the remaining 7 bits. This is the range for the sign-magnitude and ones' complement representations; two's complement gains one extra negative value, giving −128 to 127.
It is not normally possible to represent infinity in a signed integer representation: every bit pattern is already assigned to a finite value, so infinity could only be encoded by reserving a pattern and reducing the numeric range accordingly.
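The three classical 8-bit signed representations can be compared by decoding the same bit pattern under each convention (a sketch):

```python
def sign_magnitude(b):
    """MSB is the sign; remaining 7 bits are the magnitude."""
    return (-1 if b & 0x80 else 1) * (b & 0x7F)

def ones_complement(b):
    """Negative values are the bitwise complement of the magnitude."""
    return b if b < 0x80 else -((~b) & 0xFF)

def twos_complement(b):
    """Negative values are offset by 256; range is -128..127."""
    return b if b < 0x80 else b - 256

b = 0b10000011
print(sign_magnitude(b), ones_complement(b), twos_complement(b))   # -3 -124 -125
```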

2) Describe two logic circuits that can be used to implement integer multiplication. Show
them in block diagrams.

Figure 9.8 (a), page 336:

An unsigned binary multiplication circuit performs the multiplication of two unsigned binary numbers. Multiplying an n-bit multiplicand by an n-bit multiplier produces a 2n-bit result. The components required are three n-bit registers, a 1-bit carry register C, an n-bit adder, and shift-and-add control logic. Of the three registers, one holds the multiplicand (register M), one holds the multiplier (register Q), and the last (register A) is used together with Q to hold the product. Register A and C are initially set to 0. The control logic reads the bits of the multiplier one at a time, starting with the least significant bit Q0. If Q0 is 1, the multiplicand is added to register A and the result is stored in A, with C holding the overflow bit; then the bits of C, A and Q are shifted right one position, so that the C bit goes into A(n−1), A0 goes into Q(n−1), and Q0 is lost. If Q0 is 0, no addition is performed; only the shift is performed. The process is repeated for each bit of the original multiplier.
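The C/A/Q register procedure above can be sketched in software, with register widths simulated by masking:

```python
def unsigned_multiply(m, q, n=8):
    """Shift-and-add multiply of two n-bit unsigned numbers, mirroring the C/A/Q registers."""
    a, c = 0, 0
    for _ in range(n):
        if q & 1:                        # Q0 is 1: A <- A + M, C holds the carry-out
            total = a + m
            c = total >> n
            a = total & ((1 << n) - 1)
        # shift C, A, Q right one bit: C -> A(n-1), A0 -> Q(n-1), Q0 is lost
        q = ((a & 1) << (n - 1)) | (q >> 1)
        a = (c << (n - 1)) | (a >> 1)
        c = 0
    return (a << n) | q                  # the 2n-bit product sits in A,Q

print(unsigned_multiply(13, 11))   # 143
```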

Booth's algorithm (multiplication of two two's complement numbers):

-It can be used to multiply any two integers (positive or negative).
-The multiplicand and multiplier are placed in the M and Q registers.
-A 1-bit register is placed logically to the right of the least significant bit of the Q register (Q0) and designated Q−1.
-Another register, A, is used together with register Q to store the result of the multiplication of the two two's complement numbers.
-A and Q−1 are initialized to 0, and the multiplication begins.
-The control logic examines the bit pair Q0, Q−1:
 If the two bits are the same (1-1 or 0-0), all bits of the A, Q and Q−1 registers are shifted right 1 bit.
 If the pair is 1-0, M is subtracted from A and the result is stored in A; then all bits of A, Q and Q−1 are shifted right 1 bit.
 If the pair is 0-1, M is added to A and the result is stored in A; then all bits of A, Q and Q−1 are shifted right 1 bit.
-The right shift is arithmetic: the MSB of A, A(n−1), not only shifts into A(n−2) but also remains in A(n−1). No overflow bit is needed for the addition, and no borrow bit for the subtraction.

3) What is Booth algorithm? Describe its distinct feature in digital arithmetic.

Booth's algorithm is a multiplication algorithm that multiplies two signed binary numbers in two's complement notation. It performs fewer additions and subtractions than a straightforward algorithm. It can multiply two positive numbers, two negative numbers, or one of each. The result is in two's complement form, positive or negative according to the signs of the multiplicand and the multiplier.

With a straightforward multiplication method,

M x (00011110) = M x (2^4 + 2^3 + 2^2 + 2^1)
= M x (16 + 8 + 4 + 2)
= M x 30

The number of such operations can be reduced to two, as Booth's algorithm observes:

M x (00011110) = M x (2^5 − 2^1)
= M x (32 − 2)
= M x 30

This reduction can be done whenever a block of 1s is surrounded by 0s, using the formula 2^n + 2^(n−1) + ... + 2^m = 2^(n+1) − 2^m. The formula also applies to a single 1 treated as a block. For example, for 0010 we have n = m = 1, so 2^(n+1) − 2^m = 2^2 − 2^1 = 4 − 2 = 2, which equals 2^1 = 2.

Booth's algorithm uses this observation to perform the multiplication: a subtraction is performed when the first 1 of a block is encountered (pair 1-0), and an addition is performed when the end of the block is encountered (pair 0-1). For example,

M x (01111010) = M x (2^6 + 2^5 + 2^4 + 2^3 + 2^1)
= M x (2^7 − 2^3 + 2^2 − 2^1)

Bit 7 and bit 6 are (0-1), meaning addition (+2^7); bit 3 and bit 2 are (1-0), meaning subtraction (−2^3); bit 2 and bit 1 are (0-1), meaning addition (+2^2); and bit 1 and bit 0 are (1-0), meaning subtraction (−2^1).
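The add/subtract/shift rules can be sketched directly in software; registers are simulated with masking, as in the unsigned case:

```python
def booth_multiply(m, q, n=8):
    """Booth's algorithm on n-bit two's complement operands; returns a signed 2n-bit product."""
    mask = (1 << n) - 1
    M, A, Q, q_1 = m & mask, 0, q & mask, 0
    for _ in range(n):
        pair = (Q & 1, q_1)
        if pair == (1, 0):            # first 1 of a block: A <- A - M
            A = (A - M) & mask
        elif pair == (0, 1):          # end of a block: A <- A + M
            A = (A + M) & mask
        # arithmetic right shift of A, Q, Q-1 (the MSB of A is preserved)
        q_1 = Q & 1
        Q = ((A & 1) << (n - 1)) | (Q >> 1)
        A = (A >> 1) | (A & (1 << (n - 1)))
    result = (A << n) | Q
    if result & (1 << (2 * n - 1)):   # interpret the 2n-bit result as signed
        result -= 1 << (2 * n)
    return result

print(booth_multiply(7, -3))   # -21
```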

4) Describe the IEEE standards for floating-point representation.

There are two IEEE standard floating-point formats: single precision (32 bits) and double precision (64 bits).
-In the 32-bit format, the most significant bit (MSB) is the sign bit. The next 8 bits represent the biased exponent, and the remaining 23 bits represent the significand of the floating-point number.
-In the 64-bit format, the MSB is the sign bit. The next 11 bits represent the biased exponent, and the remaining 52 bits represent the significand.
-The sign bit works the same way in both formats: 0 represents positive and 1 represents negative.
-The exponent is stored in biased form, so that a single unsigned field can represent both positive and negative exponents. The bias added to the true exponent is 2^(n−1) − 1, where n is the number of exponent bits. For n = 8 the bias is 127 and the biased field ranges from 0 to 255, so the representable exponents range from 0 − 127 to 255 − 127, that is from −127 to +128.
-For a normalized floating-point number there is an additional implied 1 bit that does not appear in the stored format but must be taken into account when operations are performed: the bit to the left of the radix point. Because a normalized value always has a 1 immediately before the radix point, that bit need not be included in the stored format.
-For a number that is not yet normalized, the bit before the radix point is not 1, so the radix point is shifted until the leading 1 sits immediately before it, and the number of positions shifted is balanced by adjusting the exponent of the floating-point number accordingly.
-Not all bit patterns in the IEEE format are interpreted in the usual way; some bit
patterns represent special values. For example:
 If the biased exponent and the significand are all 0s, the value represented is 0,
and the sign bit gives the sign of the zero.
 If the biased exponent is all 1s and the significand is all 0s, the value is infinity,
and the sign bit gives the sign of the infinity.
 If the biased exponent is all 1s and the significand is not all 0s, then whether the
sign bit is 1 or 0, the pattern represents not a number (NaN).
 If the biased exponent is all 0s and the significand is not all 0s, the number is a
de-normalized number and its sign depends on the sign bit.
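These field layouts and special values can be checked in Python (a sketch using the standard struct module; the helper name decode_float32 is my own):

```python
import struct

def decode_float32(x):
    """Unpack a Python float into IEEE 754 single-precision fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 sign bit
    biased_exp = (bits >> 23) & 0xFF  # 8-bit biased exponent
    frac = bits & 0x7FFFFF            # 23-bit significand (fraction) field
    return sign, biased_exp, frac

# 1.0 = (-1)^0 x 1.0 x 2^(127-127): sign 0, biased exponent 127, fraction 0
print(decode_float32(1.0))           # (0, 127, 0)
# negative zero: all fields 0 except the sign bit
print(decode_float32(-0.0))          # (1, 0, 0)
# infinity: exponent all 1s, fraction all 0s
print(decode_float32(float("inf")))  # (0, 255, 0)
```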

5) Discuss the difference between a fixed-point representation and a floating-point
representation of real numbers.

For example, using a 32-bit binary word for both a fixed-point number and a floating-point
number, the total count of different values that can be represented is the same, 2^32 in
each case, but the ranges of values they cover are different.

-For a fixed-point representation such as two's complement, the range is from −2^31 to
2^31 − 1.
-For a floating-point representation, the range of representable values depends on the
size of the exponent; for an 8-bit biased exponent, the range is as follows:
 Negative numbers between −(2 − 2^−23) × 2^128 and −2^−127
 Positive numbers between 2^−127 and (2 − 2^−23) × 2^128
The range can be made larger by increasing the number of bits used to represent the
exponent, but the precision of the numbers is reduced, because as the exponent bits
increase, the significand bits in the floating-point representation decrease.
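As a rough check of these ranges (a Python sketch; the values follow the 8-bit-biased-exponent formulation above, while the actual IEEE standard reserves the extreme exponent patterns for special values):

```python
# Both 32-bit representations encode exactly 2**32 distinct bit patterns,
# but over very different ranges of real values.
fixed_min, fixed_max = -2**31, 2**31 - 1      # two's complement range

float_smallest = 2.0**-127                    # smallest positive magnitude
float_largest = (2 - 2**-23) * 2.0**128       # largest magnitude

print(fixed_min, fixed_max)   # -2147483648 2147483647
print(float_smallest)         # ~5.9e-39
print(float_largest)          # ~6.8e38
```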

Chapter 10:

1) Describe a machine instruction. Specify the elements of a machine instruction.

A machine instruction is the set of information required by the processor to execute an
operation; the machine instruction is different for each operation.
Elements of a machine instruction:
-Operation code: specifies the operation to be performed (examples: ADD, MOVE). The
operation is specified by a binary code, known as the operation code or opcode.
-Source operand reference: the operation may involve one or more source operands; the
operands are the inputs for the operation.
-Result operand reference: the operation may produce a result.
-Next instruction reference: tells the processor where to fetch the next instruction
after the execution of this instruction is complete.
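To illustrate how these elements are packed into an instruction word, here is a sketch with a hypothetical 16-bit format (the 4/6/6 field split and the opcode table are assumptions for illustration, not any real machine's format):

```python
# Hypothetical 16-bit instruction word:
#   bits 15-12: opcode, bits 11-6: source operand, bits 5-0: result operand.
OPCODES = {0b0001: "ADD", 0b0010: "MOVE"}  # illustrative opcode table

def decode(word):
    opcode = (word >> 12) & 0xF
    src = (word >> 6) & 0x3F
    dst = word & 0x3F
    return OPCODES.get(opcode, "???"), src, dst

# ADD with source register 3 and result register 5
word = (0b0001 << 12) | (3 << 6) | 5
print(decode(word))  # ('ADD', 3, 5)
```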

2) List and briefly describe the basic types of instructions.

-Data processing: arithmetic and logic instructions.
-Data storage: movement of data into or out of register and memory locations.
-Data movement: I/O instructions.
-Control: test and branch instructions.
Arithmetic instructions provide computational capabilities for processing numeric data.
Logic instructions operate on the bits of a word rather than on a number, so they provide
capabilities for processing any type of data the user wishes to employ. These operations
are performed primarily on data in processor registers; therefore, there must be memory
instructions for moving data between memory and the processor registers. I/O
instructions are needed to transfer programs and data into memory and the results of
computations back out to the user. Test instructions are used to test the value of a data
word or the status of a computation. Branch instructions are then used to branch to a
different set of instructions depending on the decision made.

3) List and briefly explain the main issues of instruction set design.

-Operation repertoire: how many and which operations to provide, and how complex the
operations should be.
-Data types: the various types of data upon which operations are performed.
-Instruction format: instruction length (in bits), number of addresses, sizes of the
various fields, and so on.
-Registers: the number of processor registers that can be referenced by instructions, and
their use.
-Addressing: the mode or modes by which the address of an operand is specified.

4) List and briefly describe the categories of instruction operand.

-Addresses: a form of data; some calculation must be performed on the operand reference
in an instruction to determine the main or virtual memory address.
-Numbers: numeric data used by the computer to perform operations, for example
binary integer (binary fixed point), binary floating point, and decimal. The numbers
that can be stored in a computer are limited, so the programmer is faced with
understanding the consequences of rounding, overflow, and underflow.
-Characters: ASCII is one type of data used to represent characters; each character is
represented by a unique 7-bit pattern.
-Logical data: bit-oriented; it is easy to manipulate the individual bits of data in this form.

5) Describe the difference between logical shift operation and arithmetic shift operation.

A logical shift operation is an operation in which the bits of a word are shifted left or
right. On one end, the bit shifted out is lost; on the other end, a 0 is shifted in.
An arithmetic shift operation treats the data as a signed integer and does not shift the
sign bit. On a right arithmetic shift, the sign bit is replicated into the bit positions to its
right. On a left arithmetic shift, a logical left shift is performed on all bits but the sign
bit, which is retained.
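A sketch of both right shifts on an 8-bit word in Python (the width and the sample value are assumed for illustration; Python's own >> on negative ints is already arithmetic, so the sign handling is made explicit here):

```python
def logical_shift_right(x, n, width=8):
    """Shift right, filling with zeros (value treated as unsigned)."""
    return (x % (1 << width)) >> n

def arithmetic_shift_right(x, n, width=8):
    """Shift right, replicating the sign bit (value treated as signed)."""
    mask = (1 << width) - 1
    u = (x & mask) >> n
    if x & (1 << (width - 1)):           # sign bit was set:
        u |= (mask << (width - n)) & mask  # replicate 1s into vacated positions
    return u

# 0b10010110 (-106 as a signed 8-bit value), shifted right by 2:
print(bin(logical_shift_right(0b10010110, 2)))     # 0b100101  (zero fill)
print(bin(arithmetic_shift_right(0b10010110, 2)))  # 0b11100101 (sign replicated)
```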

6) List and briefly describe the types of operations that a processor has to handle.

The number of different opcodes varies widely from machine to machine; however, the
same general types of operations are found on every machine, as listed below:

-Data transfer operations

 MOVE – transfer a word or a block from source to destination
 STORE – transfer a word from processor to memory
 LOAD – transfer a word from memory to processor
-Arithmetic operation
 ADD – compute sum of two operands
 SUBTRACT – compute the difference between two operands
 MULTIPLY – compute product of two operands
 DIVIDE – compute quotient of two operands
-Logical operation
 AND – perform logical AND
 OR – perform logical OR
 TEST – test special condition: set flags based on outcome
 SHIFT – left- or right-shift operand, introducing constants at the end.
-Conversion operation
 Translate – translate values in a section of memory based on a table of
correspondences
 Convert – convert the contents of a word from one form to another (example:
packed decimal to binary)
-I/O operation
 Input (read) – transfer data from a specified I/O port or device to a destination
(main memory or processor register)
 Output (write) – transfer data from a specified source to an I/O port or device
 Start I/O – transfer an instruction to the I/O processor to initiate an I/O operation.
-Transfer of control operation
 Jump (branch) – unconditional transfer; load PC with specified address
 Skip – increment the PC to skip the next instruction
 Halt – stop the program execution

7) Describe stack operations and discuss the use of stacks in a microprocessor.

A stack is a set of locations in the computer used to store data temporarily. Stack
operation follows the last-in-first-out (LIFO) method, meaning that only one item can be
accessed at a time and that item is always at the top of the stack.
-The stack is used by the computer to store some data temporarily.
 For example, when the processor executes a push instruction, an item is stored
at the top of the stack, and when a pop instruction is executed, the item on the
top of the stack is removed. The operation of a computer depends on the flow of
the program, and the flow of a program does not always follow the written
sequence; it may sometimes jump to another location to execute and then return
to the location of the jump to execute the remaining instructions. For this reason,
when the flow of the program jumps to another memory location, the address
that it needs to return to must be stored, so it is stored on the stack temporarily
while the branch is executed.
-The stack can be used to perform operations (unary and binary) that involve zero
addressing.
 A unary operation uses the element at the top of the stack as its one operand;
for example, for the NOT operation, the processor pops that operand, performs
the NOT operation on it, and pushes the result back onto the stack.
 A binary operation uses the top 2 stack items as operands; for example, for an
add instruction, the processor pops the top 2 stack items, performs the addition,
and pushes the result back onto the stack.
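A minimal sketch of these zero-address stack operations in Python (the operation names and the 8-bit width for NOT are illustrative, not a real instruction set):

```python
# Zero-address (stack) machine sketch: operands are implicit, always at
# the top of a LIFO stack.
stack = []

def push(v):
    stack.append(v)

def pop():
    return stack.pop()

def unary_not():
    """Unary op: consumes one item from the top of the stack (8-bit NOT)."""
    push(~pop() & 0xFF)

def binary_add():
    """Binary op: consumes the top two stack items, pushes their sum."""
    b, a = pop(), pop()
    push(a + b)

push(2); push(3)
binary_add()     # 2 + 3 replaces the two operands with their sum
print(stack)     # [5]
```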

8) Describe byte ordering concepts of little-endian and big-endian. Discuss the difference
of the two concepts, and its effect in program storage.

The little-endian and big-endian methods are used when a value bigger than one byte
(multiple bytes) needs to be stored in memory; the two methods store the multiple bytes
in different ways. Let us take as an example storing the value 123456 in hexadecimal
form into the memory locations starting at address 100.

 In the little-endian concept, the least significant byte is stored at the lowest
numerical byte address. That is, 56h is stored at address 100, 34h is stored at
address 101, and the most significant byte is stored at address 102.
 In the big-endian concept, the most significant byte is stored at the lowest
numerical byte address. That is, 12h is stored at address 100, 34h is stored at
address 101, and the least significant byte is stored at address 102.

-With the little-endian method, it is easy to check whether the number is odd or even,
because the first byte read is always the least significant byte. Because this method
stores the number backwards, it allows us to extend the size of the number to the limits
of memory without actually changing its value; for example, "21 43", "21 43 00" and
"21 43 00 00" are all the same number, because little-endian reads the word least
significant byte first.

-With the big-endian method, it is easy to check the sign of the number, whether it is
positive or negative, because the first value read is the most significant byte. The
big-endian method stores a number the same way humans think about the data, which
makes low-level debugging easier.
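The two byte orders can be observed directly with Python's struct module (same value as in the example above, padded to 4 bytes):

```python
import struct

# Store 0x00123456 as a 4-byte value and inspect the in-memory byte order.
value = 0x00123456
little = struct.pack("<I", value)  # little-endian encoding
big = struct.pack(">I", value)     # big-endian encoding

print(little.hex())  # '56341200' -> least significant byte at the lowest address
print(big.hex())     # '00123456' -> most significant byte at the lowest address
```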
9) Give an opinion on implementing the above concepts for bit ordering.

Chapter 11:

1) Describe addressing techniques used in instruction sets.

Immediate addressing – the operand value is present in the instruction itself.
Direct addressing – the address field of the instruction contains the main memory
address of the operand.
Indirect addressing – the referenced memory content is a pointer to the address of the
operand.
Register addressing – the address field of the instruction contains the address of the
register holding the operand.
Register indirect addressing – the content of the referenced register is a pointer to the
address of the operand.
Displacement addressing – a combination of the direct and register indirect addressing
methods.
Stack addressing – a reserved block of locations that functions as last-in-first-out
storage.

2) Discuss how those addressing techniques are related to the size of an instruction.

3) Discuss the difference between fixed-length and variable-length instructions.

For a fixed-length instruction there is clearly a trade-off between the number of opcodes
and the power of the addressing capability. More opcodes mean more bits in the opcode
field, and this reduces the number of bits available for addressing.

For variable-length instructions, it is easy to provide a large repertoire of opcodes with
different opcode lengths, and addressing can be more flexible, with various combinations
of register and memory references plus addressing modes.

Chapter 12:

1) Describe the difference between user-visible registers and control/status registers.

-User-visible registers are the set of registers that can be referenced by the user or by
machine language, used to minimize main memory references and optimize register use.
-Control/status registers are the set of registers that are not visible to the user and are
used by the control unit to control the operation of the processor.

2) Describe an instruction cycle that involves interrupt.


The stages of an instruction cycle that involves interrupts are the fetch stage, the execute
stage, the interrupt stage, and the indirect stage. The execution of an instruction may
involve one or more operands in memory, each of which requires a memory access.
Further, if indirect addressing is used, additional memory accesses are required, so we
can think of indirect addressing as one more instruction stage. The main line of activity
alternates between the fetch stage and the execute stage. After an instruction is fetched,
it is examined to determine whether any indirect addressing is involved. If so, the
required operands are fetched using indirect addressing. Following execution, an
interrupt may be processed before the next instruction fetch.

3) Explain how pipelining can improve overall instruction execution time.

With the pipelining strategy, instructions can be executed in parallel instead of in
sequence. Instruction execution can normally be divided into two stages: fetch
instruction and execute instruction. There are times during the execution of an
instruction when main memory is not being accessed. This time can be used to fetch the
next instruction in parallel with the execution of the current one, saving the waiting time
for the next instruction fetch. This two-stage pipelining does not maximize the
performance improvement, because the time to fetch the next instruction is less than the
time to execute it; the fetch hardware still waits while the current execution finishes. To
maximize the performance of the processor, the pipeline must be decomposed into more
stages, and the various stages should be of more nearly equal duration. For example,
using a six-stage pipeline, the execution time for 9 instructions can be reduced from 54
time units to 14 time units, a saving of 40 time units.
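The six-stage figures can be checked with the usual pipeline timing formula, k + (n − 1) time units for n instructions on a k-stage pipeline (a sketch that assumes equal-length stages and no stalls):

```python
def sequential_time(k, n):
    """Time units for n instructions with no overlap: k stages each."""
    return k * n

def pipeline_time(k, n):
    """Time units for n instructions on a k-stage pipeline with no stalls:
    the first instruction takes k units, then one finishes every unit."""
    return k + (n - 1)

# The six-stage, nine-instruction case from the text:
print(sequential_time(6, 9))  # 54
print(pipeline_time(6, 9))    # 14
```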

4) Describe all types of pipeline hazards.

A pipeline hazard occurs when the pipeline, or some portion of the pipeline, must stall
because conditions do not permit continued execution. There are 3 types of hazards:
resource, data, and control.

Resource hazards – occur when 2 instructions that are already in the pipeline need the
same resource. The instructions must then be executed in series rather than in parallel
for a portion of the pipeline. This is sometimes also referred to as a structural hazard.
For example, suppose main memory has a single port, so that all instruction fetches,
data reads, and data writes must be performed one at a time. If an instruction must fetch
its operand from main memory rather than from a register, then the next instruction in
the pipeline that also needs memory must insert an idle stage, waiting for the earlier
instruction to take its operand first, and then continue its stages as normal. One solution
for resource hazards is to increase the available resources, such as having multiple ports
into main memory.
Data hazards – a data hazard occurs when there is a conflict in the access of an operand
location. We can state the hazard in this form: two instructions in a program are to be
executed in sequence and both access a particular memory or register operand. If the
two instructions are executed in strict sequence, no problem occurs, but if they are
executed in a pipeline, it is possible for the operand value to be updated in such a way as
to produce a different result than would occur with strict sequential execution. In other
words, the program produces an incorrect result because of the use of the pipeline.

There are three types of data hazards:


Read after write (RAW), or true dependency – an instruction modifies a register or
memory location and a succeeding instruction reads the data in that memory or register
location. A hazard occurs if the read takes place before the write operation is complete.
Write after read (WAR), or anti-dependency – an instruction reads a register or memory
location and a succeeding instruction writes to the location. A hazard occurs if the write
operation takes place before the read operation.
Write after write (WAW), or output dependency – two instructions both write to the same
location. A hazard occurs if the write operations take place in the reverse of the
intended sequence.
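The RAW case can be made concrete with a tiny simulation (a Python sketch; the register names and values are assumed for illustration). Running the same two instructions in program order and in reversed order shows how a read that overtakes a write produces a wrong result:

```python
def run(order):
    """Execute I1 (write) and I2 (read) in the given order and return
    the value I2 observes in R3."""
    regs = {"R3": 0, "R5": 10}
    read_value = None
    for step in order:
        if step == "I1":                   # I1: R3 <- R5 + 1  (write)
            regs["R3"] = regs["R5"] + 1
        else:                              # I2: read R3
            read_value = regs["R3"]
    return read_value

print(run(["I1", "I2"]))  # 11 -> correct: strict program order
print(run(["I2", "I1"]))  # 0  -> the read overtook the write: RAW hazard
```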
-Control hazards – also known as branch hazards, occur when the pipeline makes the
wrong decision on a branch prediction and therefore brings instructions into the pipeline
that must subsequently be discarded.

5) Describe branch prediction. Illustrate the flow of branch prediction using taken/not
taken switch.

Branch prediction is a technique used to predict whether a branch will be taken. Branch
prediction using a taken/not-taken switch is dynamic branch prediction, because it
depends on the execution history. For example, one or more bits can be associated with
each conditional branch instruction to reflect the recent history of the instruction; these
bits are referred to as a taken/not-taken switch that directs the processor to make a
particular decision the next time the instruction is encountered.

Figure 12.18, page 475.

From the branch prediction flowchart: as long as each succeeding conditional branch
instruction that is encountered is taken, the decision process predicts that the next branch
will be taken. If a single prediction is wrong, the algorithm continues to predict that the
next branch is taken. Only if two successive branches are not taken does the algorithm
shift to the right-hand side of the flowchart. Subsequently, the algorithm will predict that
branches are not taken until two branches in a row are taken. Thus, the algorithm requires
two consecutive wrong predictions to change the prediction decision.
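A sketch of such a taken/not-taken switch, modeled here as a two-bit saturating counter (an approximation of the Figure 12.18 state machine, not an exact copy; the branch history is made up for illustration):

```python
class TwoBitPredictor:
    """Two history bits per branch: the prediction flips only after
    two consecutive wrong predictions."""

    def __init__(self):
        self.state = 3  # states 3,2 -> predict taken; 1,0 -> predict not taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # move toward "taken" on taken branches, toward "not taken" otherwise
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

p = TwoBitPredictor()
history = [True, True, False, True, False, False, False]  # actual outcomes
predictions = []
for actual in history:
    predictions.append(p.predict())
    p.update(actual)
print(predictions)  # [True, True, True, True, True, True, False]
```

Note that the single mispredicted branch (the third outcome) does not flip the prediction; only the run of two consecutive not-taken outcomes does.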

Chapter 13:

1) Define CISC architecture and explain the advantages of CISC design.


Complex instruction set computing (CISC) describes a computer in which single
instructions can execute several low-level operations (such as a load from memory, an
arithmetic operation, and a memory store) and/or are capable of multi-step operations or
addressing modes within single instructions. CISC designs use a larger number of
instructions, and the instructions are generally more complex; for this reason, the
compiler for the system is simplified. Another argument is that CISC yields smaller and
faster programs, improving the performance of the system. A smaller program improves
performance because there are fewer instruction bytes to be fetched, and, in a paging
environment, smaller programs occupy fewer pages, reducing page faults.
2) Define RISC architecture and describe the main characteristics of RISC processors.

Reduced instruction set computing (RISC) is a CPU design strategy based on simplified
instructions that can provide higher performance if the simplicity enables much faster
execution of each instruction.
The main characteristics of RISC processors are as follows:
One machine instruction per machine cycle – a machine cycle is defined to be the
time to fetch two operands from registers, perform an ALU operation, and store the
result in a register.
Register-to-register operations – all operations should be performed between registers,
and only instructions with LOAD and STORE operations access memory. This design
feature simplifies the instruction set and therefore the control unit, and allows the
optimization of register use, so that frequently accessed operands remain in high-speed
storage.
Simple addressing modes – almost all RISC instructions use simple register addressing.
Several additional modes, such as displacement and PC-relative, may be included; other,
more complex modes can be synthesized in software from the simple ones. Again, this
design feature simplifies the instruction set and the control unit.
Simple instruction formats – only a few formats are used, and the instruction length is
fixed. Field locations, especially the opcode, are fixed, so opcode decoding and register
operand accessing can occur simultaneously. Instruction fetching is also optimized
because word-length units are fetched.

Chapter 14:

1) Describe a superscalar processor, focusing on the essential characteristics.

A superscalar processor is a processor in which multiple independent instruction
pipelines are used. Each pipeline consists of multiple stages, so that each pipeline can
handle multiple instructions at a time. The essence of the superscalar approach is the
ability to execute instructions independently and concurrently in different pipelines. The
concept can be further exploited by allowing instructions to be executed in an order
different from the program order.
2) Describe all types of limitations that reduce the effectiveness of instruction-level
parallelism.

True data dependency – also called read-after-write (RAW) dependency.
For example, look at the 2 instructions below:

I1) ADD A, B
I2) MOVE C, A

The second instruction can be fetched and decoded but cannot execute until the first
instruction executes, because the second instruction requires the data produced by the
first instruction.

Procedural dependency – the instructions following a branch (taken or not taken) have a
procedural dependency on the branch and cannot be executed until the branch is
executed.

Resource conflict – a resource conflict is the competition of two or more instructions
for the same resource at the same time. Examples of resources include memories,
caches, buses, register-file ports, and functional units (example: an ALU adder). It can
be overcome by increasing the resources.

Output dependency – also called write-after-write (WAW) dependency.
For example, look at the 3 instructions below:

I1) ADD R3, R5
I2) ADD R4, R3
I3) ADD R3, R6

I1 and I2 form a true data dependency, because I2 needs the result from I1. Between I1
and I3 there is no data dependency, but if I3 executes to completion prior to I1, the
wrong value of R3 will be fetched for the execution of the instructions after I3.
Consequently, I3 must complete after I1 to produce the correct output values; this is the
output dependency between I1 and I3. A wrong result may be produced if the
instructions execute in reverse order. To ensure correctness, issuing the third instruction
must be stalled if its result might later be overwritten by an older instruction that takes
longer to complete.

Anti-dependency – also called write-after-read (WAR) dependency.
For example, look at the 3 instructions below:

I1) ADD R3, R5
I2) ADD R4, R3
I3) ADD R3, R6

I3 cannot complete execution before I2 begins executing and has fetched its operands,
because I3 updates register R3, which is a source operand for I2. Anti-dependency is the
reverse of true data dependency: instead of the first instruction producing a value that
the second instruction uses, the second instruction destroys a value that the first
instruction uses.

3) Describe the difference between superscalar and superpipelining.

A superscalar machine can issue several instructions per cycle, depending on the degree
n of the superscalar machine; a superscalar machine of degree 2 can execute two
instructions per cycle. A superpipelined machine still issues only 1 instruction per cycle,
but the cycle time depends on the degree m of the superpipelined machine; a
superpipelined machine of degree 2 executes an instruction with a cycle time half the
cycle time of the base machine.
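A simplified timing model can illustrate the difference (a sketch; the 4-stage base pipeline, the no-stall assumption, and both formulas are my own simplifications, measured in base-machine cycles):

```python
import math

def superscalar_time(n, degree, stages=4):
    """Base cycles for n instructions, issuing `degree` per cycle, no stalls:
    the last group issues at cycle ceil(n/degree) and takes stages-1 more."""
    return stages + math.ceil(n / degree) - 1

def superpipelined_time(n, degree, stages=4):
    """Each stage split into `degree` substages of 1/degree base cycle;
    total time = (stages*degree + n - 1) substages of 1/degree cycle each."""
    return stages + (n - 1) / degree

print(superscalar_time(8, 1))     # 11  (base machine, degree 1)
print(superscalar_time(8, 2))     # 7   (issue two per cycle)
print(superpipelined_time(8, 2))  # 7.5 (half-cycle substages)
```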

4) Describe in-order issue with in-order completion policy.

The simplest instruction issue policy is to issue instructions in the exact order that would
be achieved by sequential execution (in-order issue) and to write results in that same
order (in-order completion). To guarantee in-order completion, when there is a conflict
for a functional unit or when a functional unit requires more than 1 cycle to generate its
result, the issuing of instructions temporarily stalls.

5) Describe in-order issue with out-of-order completion policy.

In-order issue with out-of-order completion means that the execution stage of
instructions must follow the sequence of the original program, but the write stage, that
is, the completion of an instruction, can differ from the original program order.
Out-of-order completion is used in scalar RISC processors to improve the performance
of instructions that require multiple cycles. With out-of-order completion, any number
of instructions may be in the execution stage at any one time, up to the maximum degree
of machine parallelism across all the functional units. Out-of-order completion requires
more complex instruction-issue logic than in-order completion. In addition, it is more
difficult to deal with instruction interrupts and exceptions.

6) Describe out-of-order issue with out-of-order completion policy.

With in-order issue, the processor will only decode instructions up to the point of a
dependency or conflict; no additional instructions are decoded until the conflict is
resolved. As a result, the processor cannot look ahead of the point of conflict to
subsequent instructions that may be independent of those already in the pipeline and
that might usefully be introduced into the pipeline. To allow out-of-order issue, it is
necessary to decouple the decode and execute stages of the pipeline. This is done with a
buffer referred to as an instruction window. With this organization, after the processor
has finished decoding an instruction, it is placed in the instruction window. As long as
the buffer is not full, the processor can continue to fetch and decode new instructions;
when a functional unit becomes available in the execution stage, an instruction may be
issued to the execution stage. Any instruction may be issued, subject to two conditions:
first, that the particular functional unit it needs is available, and second, that no conflict
or dependency blocks the instruction. The result of this organization is that the
processor has a lookahead capability, allowing it to identify independent instructions
that can be brought into the execution stage.

Chapter 15:

1) Describe the term micro-operations.

Execution of a program consists of the sequential execution of instructions. Each
instruction is executed during an instruction cycle made up of shorter sub-cycles
(examples: fetch, indirect, execute, interrupt). The execution of each sub-cycle involves
one or more shorter operations, that is, micro-operations.

2) List the micro-operations that need to be done on a generic processor to perform an
addition between an internal register and an external memory.

The micro-operations that need to be done to perform an addition between an internal
register and an external memory are as follows:

t1: MAR ← (IR(address))
t2: MBR ← Memory
t3: Y ← (MBR)
t4: Z ← (AC) + (Y)
t5: AC ← (Z)

where Y and Z are additional registers needed for the proper operation of the ALU. AC
is the accumulator, which always holds one operand for the ALU operation; the other
operand is temporarily held in register Y, and register Z holds the temporary result
produced by the ALU. MBR is the memory buffer register, used to hold data to be
written to memory or data read from memory. MAR is the memory address register; it
holds the address for read and write operations. All of these components are connected
to the same internal bus, and the control unit asserts the control signals that cause the
micro-operations of an instruction to be performed.
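The micro-operation sequence can be sketched as plain register transfers (a Python sketch; the memory contents, the operand address, and the initial AC value are assumed for illustration):

```python
# Assumed machine state: memory word 7 at address 0x100, AC holds 5,
# and the address field of IR points at 0x100.
memory = {0x100: 7}
regs = {"IR_addr": 0x100, "AC": 5, "MAR": 0, "MBR": 0, "Y": 0, "Z": 0}

regs["MAR"] = regs["IR_addr"]        # t1: MAR <- (IR(address))
regs["MBR"] = memory[regs["MAR"]]    # t2: MBR <- Memory
regs["Y"] = regs["MBR"]              # t3: Y <- (MBR)
regs["Z"] = regs["AC"] + regs["Y"]   # t4: Z <- (AC) + (Y)
regs["AC"] = regs["Z"]               # t5: AC <- (Z)

print(regs["AC"])  # 12
```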

3) Discuss the difference between a hardwired control unit and a micro-programmed
control unit.

A hardwired control unit is a control unit built from basic logic elements. The design
must include logic for sequencing through the micro-operations, for executing
micro-operations, for interpreting opcodes, and for making decisions based on the ALU
flags.

A micro-programmed control unit is a control unit that generates the control signals
from a set of programs stored in a control memory; that set of programs describes the
behavior of the control unit.

The differences between the hardwired control unit and the micro-programmed control
unit:
-A micro-programmed control unit is easier to implement than a hardwired control unit.
-A micro-programmed control unit is easier to modify than a hardwired control unit; for
example, when an instruction is added to the processor, a simple change in the program
for the micro-programmed control unit can settle the problem, while the hardwired
control unit's design is difficult to change.
-A micro-programmed control unit is slower than a hardwired control unit of
comparable technology. Despite this, micro-programming is the dominant technique for
implementing control units in pure CISC architectures, due to its ease of
implementation, but for RISC processors, with their simpler instruction formats,
hardwired control units are used.
