
Vidyalankar

S.E. Sem. III [CMPN]


Computer Organization & Architecture
Prelim Question Paper Solutions

1. (a) Von Neumann Architecture


Programming computers with huge numbers of switches and cables was slow, tedious and
inflexible. Von Neumann came to realize that the program could be represented in digital form in
the computer’s memory, along with the data. Von Neumann also saw that the clumsy serial
decimal arithmetic used by the ENIAC, with each digit represented by 10 vacuum tubes (1 on and
9 off) could be replaced by using parallel binary arithmetic.
The basic design, which he first described, is now known as a Von Neumann machine. It was
used in the EDSAC, the first stored program computer and is still the basis for nearly all digital
computers.
A sketch of the architecture is given in figure.
[Figure: memory connected to the control unit and the arithmetic logic unit (which contains the accumulator), together with the input and output units]
Fig.: The original von Neumann machine
The von Neumann machine had five basic parts : the memory, the arithmetic logic unit, the
control unit, and the input and output equipment. The memory consisted of 4096 words, a word
holding 40 bits, each a 0 or a 1. Each word held either two 20-bit instructions or a 40-bit signed
integer. The instructions had 8 bits devoted to telling the instruction type and 12 bits for
specifying one of the 4096 memory words.
Inside the arithmetic logic unit was a special internal 40-bit register called the accumulator. A
typical instruction added a word of memory to the accumulator or stored the contents of the
accumulator in memory. The machine did not have floating-point arithmetic because von
Neumann felt that any competent mathematician ought to be able to keep track of the decimal
point (actually the binary point).

1. (b) IEEE 754 format :


The standards for representing floating point numbers in 32 bits and 64 bits have been developed
by the Institute of Electrical and Electronics Engineers (IEEE) and are referred to as the IEEE standards.
IEEE standard format is as shown below :
[Figure: 32-bit word; bit 31 = S (sign of number: 0 signifies +, 1 signifies −), bits 30–23 = E′ (8-bit exponent in excess-127 representation), bits 22–0 = M (23-bit mantissa fraction)]
Value represented = ±1.M × 2^(E′−127)
Fig. 1 : Single Precision

[Figure: 64-bit word; bit 63 = S (sign), bits 62–52 = E′ (11-bit exponent in excess-1023 representation), bits 51–0 = M (52-bit mantissa fraction)]
Value represented = ±1.M × 2^(E′−1023)
Fig. 2 : Double Precision
The 32−bit standard representation shown in Fig. 1 is called a single − precision representation
because it occupies a single 32−bit word.
The 32−bits are divided into three fields as shown below :
• (field 1) sign ← 1− bit
• (field 2) Exponent ← 8 − bit
• (field 3) Mantissa ← 23 − bits
The sign of the number is given in the first bit, followed by a representation of the exponent (to
the base 2) of the scale factor. Instead of the signed exponent E, the value actually stored in the
exponent field is E′ = E + bias. In the 32-bit floating point system (single precision) the bias is 127.
This representation of the exponent is also called the excess-127 format. The end values of E′,
namely 0 and 255, are used to indicate the floating point values of exact zero and infinity
respectively in single precision. Thus, the range of E′ for normal values in single precision is
0 < E′ < 255. This means that for the 32-bit representation the actual exponent E is in the range
−126 ≤ E ≤ 127.

The 64−bit standard representation shown in Fig. 2 is called a double − precision representation
because it occupies two 32−bit words. The 64−bits are divided into three fields as shown below :
• (field 1) Sign ← 1 − bit
• (field 2) Exponent ← 11 − bits
• (field 3) Mantissa ← 52 − bits
In the double precision format the value actually stored in the exponent field is given as,
E′ = E + 1023
Hence, the bias value is 1023 and hence it is also called the excess-1023 format. The end values of
E′, viz. 0 and 2047, are used to indicate the floating point values of exact zero and infinity,
respectively. Thus, the range of E′ for normal values in double precision is 0 < E′ < 2047. This
means that for the 64-bit representation the actual exponent E is in the range −1022 ≤ E ≤ 1023.
(i) 15.4 :
Binary = (1111.0110011…)2
Normalized number = 1.1110110011… × 2^3
For single precision:
S = 0, E = 3 and M = 1110110011…
E′ = E + 127 = 3 + 127 = 130 = (10000010)2
Number in 32-bit format is given as,
0 10000010 1110110011…0
S  E′       M

Double precision
S = 0, E = 3 and M = 1110110011…
E′ = E + 1023 = 3 + 1023 = 1026 = (10000000010)2
In double precision format:
0 10000000010 1110110011…0
S  E′          M

(ii) (BFC00000)H :
Given number = 1011 1111 1100 0000 …. 0
S = 1, so the number is negative.
E′ = (01111111)2 = 127, hence E = E′ − 127 = 0.
M = 100…0, so the significand is (1.1)2 = 1.5.
Value represented = −1.5 × 2^0 = −1.5
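These hand conversions can be checked mechanically. The sketch below (Python, using the standard `struct` module) packs a float into its 32-bit pattern and decodes a pattern back into a float; the field slicing follows the single-precision layout of Fig. 1. The helper names are ad hoc.

```python
import struct

def float_to_bits(x: float) -> str:
    """Return the 32-bit IEEE 754 single-precision pattern of x as a bit string."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{bits:032b}"

def bits_to_float(pattern: int) -> float:
    """Decode a 32-bit integer pattern back into the float it represents."""
    return struct.unpack(">f", struct.pack(">I", pattern))[0]

# 15.4 cannot be represented exactly: the repeating fraction 0110 0110...
# is rounded to fit in the 23 mantissa bits.
b = float_to_bits(15.4)
print(b[0], b[1:9], b[9:])        # sign, exponent field, mantissa field

# Decoding (BFC00000)H: S = 1, E' = 0111 1111 = 127, M = .100...,
# so the value is -1.1 (binary) x 2^0.
print(bits_to_float(0xBFC00000))  # -1.5
```

The exponent field printed for 15.4 is `10000010`, i.e. 130, agreeing with the hand calculation above.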
1. (c)
[Figure: control unit (CU) issuing control signals to the datapath (DP): a register file RF holding A and B with storage-control signals Read A, Read B and Write A; multiplexers MUX1 and MUX2 with data-routing select lines (select p–t, select u–w, select v–x); and functional units F1, F2, where F1 performs A + B under the Add function-select signal and returns an overflow signal to the CU]
Fig.: Processor configured to implement add operation A = A + B


(i) Add A, B:
Figure shows the control signals that implements an addition instruction ADD A, B
i.e. A = A + B.
Assume that this operation can be executed in a single clock cycle. The input variables A and
B are obtained from registers of the same name in the register file (RF) and the result is stored
back into register A.
Finally, the necessary logical connections for the data to flow through the DP must be established
by applying appropriate control signals to the multiplexers.
Thus, CU (Control Unit) must activate the following types of control signals during the clock
cycle in which the ADD A, B instruction is executed.

Function Select : ADD


Storage Control : Read A, Read B, Write A
Data routing : Select p−t, Select u−w, Select v−x
There is usually some feedback of control information from the DP to the CU to indicate exceptional
conditions encountered during instruction execution. The functional unit F1 performs the
addition and sends an overflow signal to the CU whenever the sum A + B exceeds the normal
word size (see figure).
(ii) SUB A, B
The CU must activate the following three types of control signals during the clock cycle in
which the SUB A, B instruction is executed.
Function select: SUB
Storage control: Read A, Read B, Write A
Data Routing: Select p−t, select u−w, select v−x.
Micro-operations using stack pointer
The push operation is implemented with the following sequence of micro-operations.
SP ← SP + 1    Increment stack pointer
M[SP] ← DR    Write item on top of the stack
If (SP = 0) then (FULL ← 1)    Check if stack is full
EMPTY ← 0    Mark the stack not empty
1. (d) (1) The read access time of the system considering only memory read cycles
= (hit ratio × cache access time) + (miss ratio × main memory access time)
= 0.9 × 100 + 0.1 × 1000 = 190 ns.
(2) The write access time of the system considering only memory write cycle.
= access time for main memory = 1000 ns
The average access time considering only memory read cycle is 190 ns.
The average access time for both read and write request is
= 0.75 (read access) + 0.25 (write access)
= 0.75 (190) + 0.25 (1000) = 392.5 ns.
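The arithmetic above can be reproduced with a small helper (a hypothetical Python function, not part of the question):

```python
def avg_access_time(hit_ratio, t_cache, t_main, read_frac):
    """Average read time, plus the read/write-weighted system average.
    Writes are assumed to always go to main memory, as in the solution."""
    t_read = hit_ratio * t_cache + (1 - hit_ratio) * t_main
    t_write = t_main
    t_avg = read_frac * t_read + (1 - read_frac) * t_write
    return t_read, t_avg

t_read, t_avg = avg_access_time(0.9, 100, 1000, 0.75)
print(round(t_read, 1), "ns")   # 190.0 ns
print(round(t_avg, 1), "ns")    # 392.5 ns
```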

2. (a) Addressing Modes


The processor can access operands in various ways. The addressing mode refers to the mechanism
for forming the effective address EA (i.e., the final address used to access an operand). Addressing modes
are either explicitly specified or implied by the instruction. The different major addressing modes are :
(1) Immediate addressing
The operand data is directly specified in the operand field. The instruction is a multiword
instruction, where the operand immediately follows the opcode. Both the opcode & the operand
are fetched from memory using program counter. The instruction format of immediate
addressing is as shown in fig.
Opcode Operand
The immediate addressing mode can be used for (1) loading internal registers with an initial value,
(2) performing an arithmetic or logical operation on immediate data.
(2) Direct addressing
The effective memory address where the operand is present is directly specified within the
instruction: the instruction contains the opcode followed by the direct memory address. Both
the opcode and the direct address are fetched from memory using the program counter. The
direct address is then used to access the operand. The instruction format of direct addressing
is as shown in fig.
[Figure: Opcode | Operand address → operand in memory]

(3) Extended addressing
The effective memory address where the operand is present is directly specified within the
instruction, in the same way as in direct addressing. This mode is used by some processors, and
the address specified is 16 bits long; the difference between the two is that direct addressing
uses an 8-bit address whereas extended addressing uses a 16-bit address. This addressing is a
slow way of accessing memory because the instruction is 3 bytes long and requires 3 memory
accesses using the PC to acquire the instruction. The instruction format of extended addressing
is as shown in fig.
[Figure: Opcode | address L0 | address H0 → operand in memory]
(4) Register addressing
In this addressing mode the instruction opcode specifies the CPU register where the operand is
stored. There are different ways of implementing this: when 2 registers are specified, one is
used as the source while the other is used as the destination. Using internal registers instead
of memory for operands makes instructions in this mode execute faster than instructions in
other modes. The instruction format of register addressing is as shown in fig.
[Figure: Opcode | REG → operand in CPU register Rn]
(5) Indirect addressing
In the indirect addressing mode, the instruction contains an address that points to the memory
location where the effective address to be used for the operand is stored. The instruction format
of indirect addressing is as shown in fig.
[Figure: Opcode | address → memory word holding the effective address → operand]
(6) Register indirect addressing
In the register indirect mode the instruction opcode specifies an internal register or register
pair which contains the effective address to be used for accessing the operand in memory.
This mode is used to save program space and improve the speed of program execution in
situations where data elements are to be accessed from memory. The instruction format of
register indirect addressing is as shown in fig.
[Figure: Opcode | Rn → CPU register Rn holds the effective address → operand in memory]

Some processors provide variations of this mode.


(a) Address register indirect with post increment. In this the effective address is in the specified
address register. The operand is accessed using this address. After the operand is accessed
the effective address in address register is incremented by the operand size.
(b) Address register indirect with predecrement. In this the effective address is in the specified
address register. Before this effective address is used, the address register is decremented by
the operand size; the decremented register contents are then used to access the memory
operand.

(7) Base addressing


In this base addressing mode the opcode specifies a register that contains an address. The instruction
also contains an offset field that contains a displacement. The indirect addressing to memory is
performed using an effective address to access the operand. But in this case the effective address is
formed by addition of the base address & the displacement value. The instruction format of base
addressing is as shown in fig.
[Figure: Opcode | Displacement; base register B holds the base address; EA = base address + displacement → operand in memory]

(8) Indexed addressing
In the indexed addressing mode the opcode specifies a register (the index register) that
contains the offset value or displacement. The instruction also contains an address field. The
indirect addressing to memory is performed using an effective address to access the operand,
but here the effective address is formed by adding the index value and the address. The
instruction format of indexed addressing is as shown in fig.
[Figure: Opcode | Address; index register holds the index value; EA = address + index value → operand in memory]
(9) Base index addressing
This addressing mode is a combination of two modes, base addressing and indexed addressing.
In this case the instruction opcode specifies two registers: a base register that contains the base
address and an index register that contains an index value. The operand's effective address is
formed by adding the contents of the base and index registers. Instructions in this mode can
optionally use an 8-bit or 16-bit displacement; if a displacement is used, it is also added to get
the effective address. The instruction format of base indexed addressing is shown in the figure below.
[Figure: Opcode | Displacement; EA = base register contents + index register contents (+ displacement) → operand in memory]

Fig. : Base index addressing structure



(10) Relative addressing


In the relative addressing mode, the operand comes from a location relative to the executed
instruction's position. The operand's effective address is generated by adding the contents of
the program counter and the signed value specified by the instruction in its address field. The
instruction format of relative addressing mode is as shown in fig.
[Figure: Opcode | Displacement; EA = program counter + displacement → operand in memory]

(11)Implied addressing
Implied addressing is also called implicit addressing or inherent addressing. It is available in
some processors. In this case the instruction opcode doesn't specify a register or memory
location; it automatically implies the operand position. As an example, when it is implied that
the operand is available in a specific register, the opcode doesn't give the register code but
assumes the operand is present there.
(12)Bit addressing
In the bit addressing mode the operand is a specific bit within a word stored in memory or a
register. Addressing these individual bits uses a combination of other addressing modes such
as register, register indirect, etc.
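Several of these modes reduce to different recipes for the effective address. The toy Python sketch below fetches an operand under a few of the modes described above; the register names, memory contents and instruction encoding are all hypothetical.

```python
# Toy machine state (all names and values are illustrative).
memory = {100: 42, 200: 100, 300: 7}
regs = {"R1": 100, "R2": 5, "PC": 50}

def operand(mode, instr):
    """Fetch an operand under one of the addressing modes above."""
    if mode == "immediate":            # operand is in the instruction itself
        return instr["value"]
    if mode == "direct":               # instruction holds the address
        return memory[instr["addr"]]
    if mode == "register":             # operand is in a CPU register
        return regs[instr["reg"]]
    if mode == "register_indirect":    # register holds the address
        return memory[regs[instr["reg"]]]
    if mode == "indirect":             # memory word holds the effective address
        return memory[memory[instr["addr"]]]
    if mode == "indexed":              # EA = address field + index register
        return memory[instr["addr"] + regs[instr["reg"]]]
    raise ValueError(mode)

print(operand("immediate", {"value": 9}))              # 9
print(operand("direct", {"addr": 100}))                # 42
print(operand("register_indirect", {"reg": "R1"}))     # 42
print(operand("indirect", {"addr": 200}))              # 42
print(operand("indexed", {"addr": 295, "reg": "R2"}))  # 7
```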

2. (b) (i) RISC versus CISC characteristics :


Architectural characteristics of CISC vs RISC :
• Instruction set size and instruction formats :
CISC : Large set of instructions with variable formats (16–64 bits per instruction).
RISC : Small set of instructions with fixed (32-bit) format; most instructions are register based.
• Addressing modes :
CISC : 12–24.
RISC : Limited to 3–5.
• GPRs and cache design :
CISC : 8–24 GPRs, mostly with a unified cache for instructions and data; recent designs also use split caches.
RISC : Large numbers (32–192) of GPRs, with mostly split data cache and instruction cache.
• Clock rate and CPI :
CISC : 33–50 MHz in 1992, with a CPI between 2 and 15.
RISC : 50–150 MHz in 1993, with one cycle for almost all instructions and an average CPI < 1.5.
• CPU control :
CISC : Mostly microcoded using control memory (ROM), but modern CISC also uses hardwired control.
RISC : Mostly hardwired, without control memory.
(ii)
Hardwired Control vs Microprogrammed Control :
(i) Hardwired : The control unit is essentially a combinational circuit. Microprogrammed : The control memory contains a program that describes the behavior of the control unit.
(ii) Hardwired : The control unit must contain complex logic for sequencing through the many micro-operations of the instruction cycle. Microprogrammed : The use of microprogramming to implement the control unit simplifies its design.
(iii) Hardwired : Comparatively more costly than microprogrammed control. Microprogrammed : It is cheaper.
(iv) Hardwired : More error-prone to implement. Microprogrammed : Less error-prone to implement.
(v) Hardwired : Faster than microprogrammed control. Microprogrammed : Slower than hardwired control.

(iii)
DRAM vs SRAM :
(i) DRAM : It is made with cells that store data as charge on capacitors (binary 1 and 0 represent the presence and absence of charge respectively). SRAM : Binary values are stored using traditional flip-flop logic-gate configurations.
(ii) DRAM : Dynamic RAMs require periodic charge refreshing to maintain data storage. SRAM : It will hold its data as long as power is supplied to it.
(iii) DRAM : It favours large memory requirements. SRAM : Not suitable for large memory requirements.
(iv) DRAM : These are generally slower than SRAMs. SRAM : These are generally faster than DRAMs.
(v) DRAM : DRAM is more dense and less expensive than a corresponding SRAM. SRAM : SRAM is less dense and more expensive than a corresponding DRAM.
(iv)
Synchronous Bus vs Asynchronous Bus :
(i) Synchronous : The occurrence of events on the bus is determined by a clock. Asynchronous : The occurrence of one event on the bus follows and depends on the occurrence of a previous event.
(ii) Synchronous : All devices on the bus are tied to a fixed clock rate, so the system cannot take advantage of advances in device performance. Asynchronous : A mixture of slow and fast devices, using older and newer technology, can share the bus.
(iii) Synchronous : It is less flexible than asynchronous timing. Asynchronous : It is more flexible than synchronous timing.
(iv) Synchronous : Simpler to implement and test. Asynchronous : Harder to implement and test.

2. (c)
[Figure: 4-bit ALU; inputs X (x3–x0) and Y (y3–y0) feed an AND section (enabled by EAND), an OR section (enabled by EOR) and a 4-bit adder–subtractor built from an array of EX-OR gates driven by the SUB line (SUB also drives CIN); a MUX with 2-bit output-select lines chooses the 4-bit result Z]
Truth table for the 4-bit adder–subtractor :
SUB   Output
 0    X + Y
 1    X − Y

The above 4-bit ALU includes function-select lines such as EAND, EOR and SUB.
ALU Design Steps :
1. First define the capacity of the ALU (e.g., here it is 4-bit).
2. Implement the necessary logical sections using standard logic gates and mark them with the
corresponding function-select lines, e.g., EAND.
3. Select an appropriate type of adder on the basis of the cost and speed expected.
4. Implement the adder–subtractor unit with the help of an array of EX-OR gates to provide the
function-select line SUB.
5. Combine the logical and arithmetic sections using an appropriately sized MUX.
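The design steps can be mirrored in a behavioural Python sketch: logical sections, a SUB line that complements Y and drives the carry-in (the EX-OR array of step 4), and an output selection standing in for the MUX of step 5. The operation names are illustrative.

```python
MASK = 0xF  # 4-bit datapath

def alu(x, y, op):
    """Behavioural 4-bit ALU: AND / OR / ADD / SUB, returning (result, carry)."""
    if op == "AND":
        return x & y & MASK, 0
    if op == "OR":
        return (x | y) & MASK, 0
    cin = 0
    if op == "SUB":
        # X - Y = X + (~Y) + 1: the EX-OR array complements Y and SUB drives CIN
        y, cin = (~y) & MASK, 1
    total = (x & MASK) + (y & MASK) + cin
    return total & MASK, total >> 4     # low 4 bits, carry-out

print(alu(0b1010, 0b0110, "ADD"))  # (0, 1): 10 + 6 = 16 overflows 4 bits
print(alu(0b1010, 0b0110, "SUB"))  # (4, 1): 10 - 6 = 4
```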
3. (a) SPARC Architecture
• SPARC (Scalable Processor Architecture) is RISC processor
• SPARC architecture was initially developed by Sun and is based on RISCII design
from university of California.
• 32 bit SPARC was introduced in 1987
• 64 bit SPARC was introduced in 1995
Registers in SPARC
• It has 32 general purpose 64-bit registers, r0 − r31.
• These 32 registers are divided into 4 groups of 8 registers each:
(i) global registers (g0 − g7)
(ii) out registers (O0 − O7)
(iii) local registers (L0 − L7)
(iv) in registers (i0 − i7)
• SPARC uses register windows, where each window consists of 24 registers: the in registers,
local registers and out registers.
• The 'out registers' of one window overlap with the 'in registers' of the adjacent window.
• The CWP (current window pointer) register gives the current window information.
• The CWP can be incremented or decremented by two instructions:
1. The restore instruction decrements the CWP register.
2. The save instruction increments the CWP register.
• The SPARC also maintains another set of 8 alternate global (AG) registers.
• Thus, in an implementation with 64 registers, SPARC can have the 8 global registers, the
8 alternate global registers and three window sets of 16 registers each.
[Figure: overlapping register windows: window CWP−1 (out O0−O7, local L0−L7), window CWP (in i0−i7 shared with the previous window's outs, local L0−L7, out O0−O7) and window CWP+1 (in i0−i7, local L0−L7, out O0−O7), together with the global registers g0−g7 and the alternate global registers; also the 64-bit PC and nPC registers, the CWP field and the CCR with its xcc and icc fields]
CCR (Condition code register) :- This register is similar to Flag register of Pentium processor. It
provides two 4 bit integer condition code fields: xcc and icc as shown below:

[Figure: CCR fields xcc and icc, each with bits n z v c : n = sign (negative), z = zero, v = overflow, c = carry]
The xcc records status information when the operands are 64-bit (extended); the icc records
similar information when the operands are 32-bit.
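The overlap between adjacent windows can be sketched as index arithmetic on a flat register file. The sizes below are illustrative, not the real SPARC configuration; the point is that the outs of window w are the same physical registers as the ins of window w + 1 (modulo the number of windows).

```python
# Hypothetical sizes: 4 windows, each advancing 16 registers (8 locals + 8 in/out).
NWINDOWS = 4
WINDOW_STEP = 16
NREGS = NWINDOWS * WINDOW_STEP

def window(cwp):
    """Return (ins, locals, outs) physical-register indices for window cwp."""
    base = (cwp * WINDOW_STEP) % NREGS
    ins = [(base + i) % NREGS for i in range(8)]
    locs = [(base + 8 + i) % NREGS for i in range(8)]
    outs = [(base + 16 + i) % NREGS for i in range(8)]
    return ins, locs, outs

# A caller writes its out registers, executes save (CWP + 1 in this text's
# convention), and the callee then reads the very same physical registers
# as its in registers: parameter passing without copying.
assert window(0)[2] == window(1)[0]   # outs of window 0 == ins of window 1
assert window(3)[2] == window(0)[0]   # the window set wraps around
print("windows overlap as described")
```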

3. (b) USB
• It was developed in 1995 by a group of companies including
Compaq, HP, Intel, Lucent, Microsoft and Philips.
• The major goal of USB was
“To define an external expansion bus that makes attaching peripherals to computer as easy as
hooking up telephone on walljack.”
• USB 1.1 can operate at
1.5 Mbps (low speed) for mouse and KBD
and
12 Mbps (full speed) to support LAN’s and disk drives
• The next version,
USB 2.0, has a bandwidth of up to 480 Mbps, so that it can be competitive with SCSI as well as
FireWire for devices such as video conferencing cameras, scanners and CD writers.
• Motivation (to use USB)
1. USB uses single connector type to connect any device.
2. USB supports upto 127 devices per USB connection e.g. to one USB port we can connect
KBD, mouse, speakers using single cable type.
3. USB devices do not require memory or address space. There is also no need for interrupt request
lines.
4. USB installation is very easy since it supports plug−and−play connectivity (doesn’t need
jumper or DIP switch setting)
5. Attaching a USB device does not require turning off the computer and restarting it after
installing the new device.
• Advantages of USB
1. Power distribution :
The USB cable can provide a +5V supply with 100 to 500 mA of current. To avoid a clutter of
power supplies, devices such as a keyboard, mouse, wireless LAN adapter or FDD can be
powered from the cable.
2. Control of peripherals :
The USB allows data transfer in either direction.
3. Expandable through hub.
4. Power conservation : USB devices enter a suspended state if there is no activity on bus
for 3 ms.
5. Error detection and recovery: The USB uses CRC for checking to detect transmission
errors. In case of error the transaction is retried.
• USB uses NRZI (Non-Return-to-Zero Inverted) encoding to improve the reliability of
transmission.
• Transfer types
USB support following 4 types of transfer
1. Interrupt: USB uses a polling technique for these transfers.
2. Isochronous: For real-time applications, e.g., transmission of data to speakers and reading
from a CD-ROM.
3. Control: Used to configure and set up USB device.
4. Bulk: Devices that do not have specific transfer rate requirement use the bulk transfer.
E.g. transferring file to a printer.
Prelim Question Paper Solutions (11)

• USB Architecture
USB hardware consist of
(i) USB host controller: It initiates transactions on USB.
(ii) Root hub: It provides the connection points.
The host controllers are of two type:
1. OHC (Open Host Controller) : It is defined by Microsoft, Compaq.
2. UHC (Universal Host Controller) : It is defined by Intel.
The main difference between OHC and UHC is in scheduling above four transfer types :

 The root hub is responsible for power distribution, enabling and disabling the port, device
recognition and status reporting when polled by the host software.
 Expansion of USB using HUB :
Host
Root HUB
USB Device USB Device
HUB HUB

USB Device USB Device


USB Device HUB

USB Device USB Device


USB Device
USB Transactions :
Transfers are done using one or more transactions. Each transaction consists of several packets.
A transaction may have between one and three phases :
(a) Token packet phase : All transaction begin with a token phase. It specifies the type of
transaction and the target device address.
(b) Data packet phase : If the transaction type requires sending data, a data phase is included. A
maximum of 1023 bytes of data are transferred during a single transaction.
(c) Handshake packet phase : Except for the isochronous data transfers, the other three types use error
detection to provide guaranteed delivery. This phase provides feed back to the sender as to
whether the data have been received without any errors. In case of errors retransmission of the
transaction is done.

The packet format is as below :
[Figure: packet = 8-bit sync sequence (00000001) | packet ID (4-bit type field + 4-bit check field) | packet-specific information | CRC (5 or 16 bits) | EOP (3 bits)]
 A synchronization sequence precedes each packet. The receiver uses this sequence to
synchronize the incoming packet data rate.
 The packet ID has a type field and a check field. The type field identifies whether the packet is a
token, data or handshake packet; the check field is used to protect the type field.
 A conventional CRC field protects the data part of a USB packet.
 The end of packet field indicates the end of each packet. The EOP is indicated by holding
both signal lines low for two bit periods and leaving them idle for the third bit.
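The CRC mentioned above is ordinary polynomial division over GF(2). The sketch below shows the core idea for the 5-bit token CRC (generator x^5 + x^2 + 1): the sender appends the remainder, and the receiver's re-division of the whole packet comes out to zero when no error occurred. Real USB adds an initial register value and bit-ordering details that are omitted here, so treat this as an illustration, not the exact wire computation.

```python
def remainder(bits, poly, n):
    """Bitwise long division over GF(2): remainder of bits / generator."""
    reg = 0
    for b in bits:
        reg = (reg << 1) | b
        if reg >> n:                    # the top bit fell out of the register
            reg ^= (1 << n) | poly      # subtract (XOR) the generator
    return reg

def crc(bits, poly, n):
    """CRC = remainder after appending n zero bits (no init/final XOR here)."""
    return remainder(bits + [0] * n, poly, n)

# A token packet's 11 address + endpoint bits (arbitrary example values).
addr_endp = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0]
check = crc(addr_endp, 0b00101, 5)      # generator x^5 + x^2 + 1

# The receiver re-divides the packet including the check bits: a clean
# division (remainder 0) means no transmission error was detected.
check_bits = [(check >> i) & 1 for i in reversed(range(5))]
print(remainder(addr_endp + check_bits, 0b00101, 5))   # 0
```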

3. (c) Systolic Arrays


Systolic array computers derive their name from an analogy to how blood rhythmically flows
through a biological heart. They are a network of processing elements that rhythmically compute data
by circulating it through the system. Systolic arrays are a variation of SIMD computers that
incorporates large arrays of simple processors using vector pipelines for data flow.
One well-known systolic array is CMU's iWarp processor, which was manufactured by Intel in
1990. This system consists of a linear array of processors connected by a bidirectional data bus.
− A systolic unit (cell) is an independent processor with some registers and an ALU.
− These cells share information with their neighbours.

[Figure: (a) a simple processing element (PE); (b) a systolic array processor, a chain of PEs connected between two memories]
Although figure above illustrates a one-dimensional systolic array, two dimensional arrays
are not uncommon. Three-dimensional arrays are becoming more prevalent with the advances
in VLSI technology.
Systolic arrays employ a high degree of parallelism (through pipelining) and can sustain a
very high throughput. Connections are typically short and the design is simple and thus
highly scalable. They tend to be robust, highly compact, efficient and cheap to produce. On
the down side, they are highly specialized and thus inflexible as to the types and sizes of
problems they can solve.
A good example of using systolic arrays can be found in polynomial evaluation. To evaluate
the polynomial
y = a0 + a1x + a2x^2 + … + akx^k
a linear systolic array, in which the processors are arranged in pairs, can be used.
Systolic arrays are typically used for repetitive tasks, including Fourier transformations,
image processing, data compression, shortest path problems, sorting, signal processing and
various matrix computations (such as inversion and multiplication). In short systolic array
computers are suitable for computational problems that lend themselves to a parallel solution
using a large number of simple processing elements.
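For the polynomial example, each cell in such a chain can perform one multiply-add per beat, which is Horner's rule. The minimal sketch below uses sequential Python to stand in for the pipelined cells (the function names are illustrative).

```python
def cell(acc, coeff, x):
    """One processing element: multiply the running value by x, add a coefficient."""
    return acc * x + coeff

def evaluate(coeffs, x):
    """Pass a value through the chain of cells, highest coefficient first.
    coeffs = [a0, a1, ..., ak] for y = a0 + a1*x + ... + ak*x^k."""
    acc = 0
    for a in reversed(coeffs):   # a_k, a_{k-1}, ..., a_0
        acc = cell(acc, a, x)
    return acc

# y = 1 + 2x + 3x^2 at x = 2: 1 + 4 + 12 = 17
print(evaluate([1, 2, 3], 2))    # 17
```

In a real systolic implementation, successive evaluations stream through the array so that all cells stay busy, one result emerging per beat once the pipeline fills.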
Characteristics of Systolic Arrays :
− Synchronizations
− Modularity
− Regularity
− Locality
− Finite connection
− Parallel/pipeline
− Extendibility

Systolic architectures provide massively parallel processing with limited I/O communication
with a host computer, and are suitable for many regular iterative operations, e.g., FIR filters
and matrix multiplications. They are extremely fast and easily scalable architectures. However,
systolic architecture is expensive and is not needed in most applications, as systolic arrays are
a highly specialized type of processor that is difficult to implement and build.

3. (d) SCSI (1979)


 It is parallel interface used for high speed transfer of data.
 It is 8 or 16 bit called 'narrow' or 'wide'.
 The 8 bit SCSI support transfer rate of 5 MB/s but today's 16 bit SCSI support transfer rate of
40 MB/s.
 An 8 bit SCSI can have 8 devices and the 16 bit SCSI can have 16 devices.
 Each SCSI device is assigned a unique number to identify it on the bus and to direct the data
traffic. For narrow SCSI the device ID ranges from 0 to 7 and for wide SCSI it is 0 to 15.
 SCSI support both internal and external device connection :
(a) Narrow SCSI uses 50 pin connector for both internal and external interfaces.
(b) Wide SCSI uses 68 pin connector to allow for the additional 8 data lines.
 Several devices on SCSI can be chained by connecting output of a device to input of other
(daisy).
 SCSI allows a maximum cable length of 25 meters but as number of devices increase this
cable length decreases.
Types of SCSI
Type of SCSI               Bus Width   Transfer rate (MB/s)
SCSI 1                     8           5
Fast SCSI                  8           10
Ultra SCSI                 8           20
Ultra2 SCSI                8           40
Wide Ultra SCSI            16          40
Wide Ultra2 SCSI           16          80
Ultra 3 (Ultra 160) SCSI   16          160
Ultra 4 (Ultra 320) SCSI   16          320
 To achieve better isolation Twisted Pairs are used for signal lines.
 SCSI used client-server model (initiator-target).
 The initiator device (typically adapters in computer) issues command to target (typically
SCSI device like disk drives) to perform task.
 The target receives command and performs requested task.
 The SCSI host adapter sends a command to selected target device on SCSI bus by asserting
number of control signals.
 The target device acknowledges the selection and begins to receive data from the initiator.
 SCSI transfer proceeds in phases like
 Command Phase  Message in
 Message out  Data in
 Data out  Status
 The direction of transfer 'In' or 'Out' is from the initiator point of view.
 The SCSI uses asynchronous mode for all bus negotiations.
 It uses handshaking using REQ and ACK signals for each data byte.
 But in synchronous SCSI these per-byte REQ and ACK handshakes are not used:
− In synchronous mode a number of data bytes (e.g., eight) can be sent without waiting
for ACK, using REQ pulses.
− This increases throughput and minimises the adverse impact of the cable propagation delay.

4. (a) Access efficiency (η) = 1 / (r + (1 − r)H), where r = tA2 / tA1 and H is the hit ratio.
r = tA2 / tA1 = 10^−2 / 10^−7 = 10^5 (a dimensionless ratio)
With η = 90% :
0.9 = 1 / (10^5 + (1 − 10^5)H)
10^5 − 99999 H = 1 / 0.9 = 1.111
99999 H = 10^5 − 1.111
H ≈ 0.99999
Thus a hit ratio extremely close to 1 is required for 90% access efficiency.
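The same calculation can be done numerically (the variable names below are ad hoc):

```python
# Access efficiency eta = 1 / (r + (1 - r) * H), with r = tA2 / tA1,
# solved for the hit ratio H.
t_a1, t_a2, eta = 1e-7, 1e-2, 0.90
r = t_a2 / t_a1                    # 1e5, a dimensionless ratio
H = (1 / eta - r) / (1 - r)
print(round(H, 7))                 # H must be very close to 1
```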

4. (b) The main and the secondary memory form a two-level hierarchy whose interaction is managed
by the operating system. It is therefore not transparent to system software, but it is largely
transparent to the user code. The term 'virtual memory' is applied when the main and secondary
memories appear to a user program as a single, large, directly addressable memory.
Three reasons for using virtual memory:
• To free the user from the need to carry out storage reallocation and to permit efficient sharing
of the available memory space by different users.
• To make programs independent of the configuration and capacity of the physical memory
available for execution.
• To achieve the very low cost per bit and low access time that are possible with a memory
hierarchy.
The program is divided into a number of blocks of virtual memory; the set of their virtual
addresses is known as the 'virtual address space'.
• A simple method for translating virtual addresses into physical addresses is to assume that all
programs and data are composed of fixed-length units called pages, each of which consists of
a block of words that occupy contiguous locations in the main memory.
• Pages commonly range from 2k to 16k bytes in length. They constitute the basic unit of
information that is moved between the main memory and the disk whenever the translation
mechanism determines that a move is required.
• A virtual memory address translation method based on the concept of fixed-length pages is
shown in the figure below. Each virtual address generated by the processor, whether it is for an
instruction fetch or an operand fetch/store operation, is interpreted as a virtual page number
(high-order bits) followed by an offset (low-order bits) that specifies the location of a
particular byte (or word) within a page. Information about the main memory location of each
page is kept in a page table. This information includes the main memory address where the
page is stored and the current status of the page. An area in the main memory that can hold
one page is called a page frame.
The starting address of the page table is kept in a page table base register. By adding the
virtual page number to the contents of this register, the address of the corresponding entry in
the page table is obtained. The contents of this location give the starting address of the page
if that page currently resides in the main memory.
• Each entry in the page table also includes some control bits that describes the status of the page
while it is in the main memory. One bit indicates the validity of the page, i.e., whether the page is
actually loaded in the main memory. This bit allows the operating system to invalidate the page
without actually removing it. Another bit indicates whether the page has been modified during its
residency in the memory. Other control bits indicate various restrictions that may be imposed on
accessing the page.
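As a concrete illustration of the translation just described, the following Python sketch splits a virtual address into a virtual page number and an offset and looks the page up in a small page table. The 4K page size and the table contents are assumptions invented for the example.

```python
# Minimal sketch of fixed-length-page address translation (illustrative values).
PAGE_SIZE = 4096          # a 4K-byte page, within the 2K-16K range mentioned above
OFFSET_BITS = 12          # log2(PAGE_SIZE)

# Page table: virtual page number -> (page frame number, valid bit)
page_table = {0: (5, True), 1: (9, True), 2: (None, False)}   # page 2 not in memory

def translate(virtual_address):
    vpn = virtual_address >> OFFSET_BITS          # high-order bits: virtual page number
    offset = virtual_address & (PAGE_SIZE - 1)    # low-order bits: byte within the page
    frame, valid = page_table[vpn]
    if not valid:                                 # validity bit clear: page not loaded
        raise MemoryError("page fault: page %d not in main memory" % vpn)
    return (frame << OFFSET_BITS) | offset        # physical address in main memory

print(hex(translate(0x1ABC)))   # page 1, offset 0xABC -> frame 9
```

In hardware the page table is located by adding the virtual page number to the page table base register; the dictionary lookup above stands in for that indexed access.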

[Fig.: Virtual-memory address translation — the virtual page number (high-order bits of the
virtual address) is added to the contents of the page table base register to locate the page
table entry; the entry holds control bits and the page frame number, and the frame number
concatenated with the offset forms the physical address in main memory.]

4. (c) Booth’s algorithm is depicted as in following figure and can be described as follows :
1) Multiplier and Multiplicand are placed in Q and M registers respectively.
2) There is also a 1 bit register placed logically to the right of the least significant bit (Q0) of the
Q register and designated Q-1.
3) The result of the multiplication will appear in the A and Q register. A and Q−1 are initialized
to zero.
4) Control logic scans the bits of the multiplier one at a time. Now, as each bit is examined, the
bit to its right is also examined.
5) If the two bits are the same (1 − 1 or 0 − 0), then all of the bits of the A, Q and Q−1 registers
are shifted to the right 1 bit. If two bits differ, then the multiplicand is added to or subtracted
from the A register depending on whether the two bits are 0−1 or 1−0.
6) Following the addition or subtraction, the right shift occurs. In either case, the right shift is
such that the leftmost bit of A, namely An−1, not only shifts into An−2, but also remains in An−1.
This is required to preserve the sign of the number of A and Q.

It is known as an arithmetic shift, since it preserves the sign bit.


e.g. M = 0111 Q = 0011
A Q Q-1
0000 0011 0 Initial
1001 0011 0 A ← A − M }
1100 1001 1 shift } First cycle
1110 0100 1 shift } Second cycle
0101 0100 1 A ← A + M}
0010 1010 0 shift } Third cycle

0001 0101 0 shift } Fourth cycle


(product in A, Q)
Result = (00010101)2 = (21)10

Start : A ← 0, Q−1 ← 0, M ← Multiplicand, Q ← Multiplier, Count ← n
Repeat :
    if Q0 Q−1 = 10 : A ← A − M
    if Q0 Q−1 = 01 : A ← A + M
    (if Q0 Q−1 = 00 or 11 : no add/subtract step)
    Arithmetic shift right A, Q, Q−1 ; Count ← Count − 1
until Count = 0, then END.

Fig : Booth's algorithm for two's complement multiplication

Algorithm
Case I : If Q0 = 0, Q−1 = 1,
then add M to the partial product.
Case II : If Q0 = 1, Q−1 = 0,
then subtract M from the partial product.
Case III : If Q0 = 0, Q−1 = 0 OR Q0 = 1, Q−1 = 1,
then right-shift only.
Declare : A(7:0), Q(7:0), M(7:0), Q−1, Count(3:0)
Bus : Inbus(7:0), Outbus(7:0)
Begin : A ← 0, Q−1 ← 0
Count ← n (number of bits of the number)
Input : M ← Inbus, Q ← Inbus
Loop : if (Q0, Q−1) = (00 ∨ 11)
then goto Rshift
if (Q0, Q−1) = 01
then A(7:0) = A(7:0) + M(7:0)
if (Q0, Q−1) = 10
then A(7:0) = A(7:0) − M(7:0)
Rshift : A(7:0).Q(7:0).Q−1 ← A(7).A(7:1).A(0).Q(7:1).Q(0)   (arithmetic right shift)
Test : Count = Count − 1
if Count = 0
then goto End
else goto Loop
End : stop
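The register-transfer steps above can be sketched directly in Python. This is an illustrative model of the algorithm, not production code; the register width `n` defaults to 8 bits.

```python
def booth_multiply(m, q, n=8):
    """Booth's two's-complement multiplication, following the steps above.
    m: multiplicand, q: multiplier, n: register width in bits."""
    mask = (1 << n) - 1
    A, Q, Q_1 = 0, q & mask, 0            # A and Q-1 are initialized to zero
    M = m & mask
    for _ in range(n):                    # Count <- n
        pair = (Q & 1, Q_1)
        if pair == (1, 0):                # 1-0 : A <- A - M
            A = (A - M) & mask
        elif pair == (0, 1):              # 0-1 : A <- A + M
            A = (A + M) & mask
        # arithmetic shift right of A, Q, Q-1 (sign bit of A is preserved)
        Q_1 = Q & 1
        Q = (Q >> 1) | ((A & 1) << (n - 1))
        A = (A >> 1) | (A & (1 << (n - 1)))
    result = (A << n) | Q                 # product in A, Q
    if result & (1 << (2 * n - 1)):       # interpret as a signed 2n-bit value
        result -= 1 << (2 * n)
    return result

print(booth_multiply(7, 3))    # 21, matching the worked example (M = 0111, Q = 0011)
print(booth_multiply(7, -3))   # -21
```

Note how the arithmetic shift replicates the leftmost bit of A, exactly as step 6 of the description requires.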

5. (a) (i) Nano programming


Nanoprogramming introduces a second level of control storage: each microinstruction, instead
of generating control signals directly, points to a word in a separate nano control memory, and
the nanoinstruction read from that memory drives the actual control signals.

Because the same combinations of µ operations occur repeatedly in the µ code, each distinct
combination needs to be stored only once in the nano memory and can be shared by many
microinstructions. This can reduce the total size of the control store considerably.

Two level control store for nano programming

− Instead of encoding individual µ operations in each microinstruction, it is possible to encode
all µ operations in a single field.
− This can be done when the same µ operations occur together repeatedly in the µ code.
− The microcode memory (µCM) outputs a value that points to a location in the nano memory
(ηCM), which outputs a nanoinstruction (the µ operations for that µ instruction).
− Nanoinstructions usually have highly parallel horizontal formats.
− Another advantage of nanoprogramming is the greater design flexibility that results from
loosening the bonds between instructions and hardware, with two intermediate levels of
control rather than one.
− The drawback is a loss of speed due to the extra memory access to ηCM and a more complex
control unit organization.

[Fig.: Two-level control store — IR → µPC → µCM → µIR, whose address field drives
ηPC → ηCM → ηIR → control signals.]
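The two-level lookup can be sketched as follows; the memory contents and control-signal patterns here are invented purely for illustration.

```python
# Toy two-level control store (illustrative sizes and bit patterns).
# Many microinstructions share a few distinct control-signal patterns,
# so the wide patterns are stored once in the nano memory.
nano_cm = [                      # nCM: wide, horizontal nanoinstructions
    0b10110001,                  # pattern 0: one combination of control signals
    0b00011100,                  # pattern 1: another combination
]
micro_cm = [0, 1, 0, 0, 1, 0]    # uCM: narrow words, each an index into nCM

def control_signals(micro_pc):
    """Fetch a microinstruction, then the nanoinstruction it points to."""
    nano_addr = micro_cm[micro_pc]       # first access: uCM
    return nano_cm[nano_addr]            # second (extra) access: nCM

print(bin(control_signals(2)))   # microinstruction 2 selects pattern 0
```

Six microinstructions here share only two wide patterns, which is exactly the saving nanoprogramming exploits; the price is the second memory access per control step.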

(ii) Fault Tolerant Computing


It is the art and science of building computing systems that continue to operate satisfactorily
in the presence of faults. A fault−tolerant system may be able to tolerate one or more fault
types, including
(i) transient, intermittent or permanent hardware faults
(ii) software and hardware design errors
(iii) operator errors
(iv) externally induced upsets or physical damage

Hardware Fault−Tolerance
The majority of fault−tolerant designs have been directed towards building computers that
automatically recover from random faults occurring in hardware components.
The techniques employed to do this generally involve partitioning a computing system into
modules that act as fault containment regions. Each module is backed up with protective
redundancy so that, if the module fails, others can assume its function. Special mechanisms are
added to detect errors and implement recovery.

General approaches to hardware fault recovery are :


Fault masking is a structural redundancy technique that completely masks faults within a set of
redundant modules. A number of identical modules execute the same functions, and their outputs
are voted on to remove errors created by a faulty module. Dynamic recovery is used when only
one copy of a computation is running at a time (or, in some cases, two unchecked copies) and it
involves automated self−repair. In dynamic recovery, special mechanisms are required to
detect faults in the modules, switch out a faulty module, switch in a spare, and instigate software
actions (rollback, initialization, retry, restart) necessary to restore and continue the computation.
Dynamic recovery is generally more hardware−efficient than voted systems.

Software fault tolerance : Efforts to attain software that can tolerate software design faults
(programming errors) have made use of static and dynamic redundancy approaches similar to
those used for hardware faults. One such approach, N-version programming, uses static
redundancy in the form of independently written programs (versions) that perform the same
functions and their outputs are voted at special checkpoints. Here the data being voted may
not be exactly the same.
An alternative approach is based on the concept of recovery blocks. Programs are partitioned into
blocks and acceptance tests are executed after each block. If test fails, a redundant code block is
executed.
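Both fault masking and N-version programming ultimately rely on a voter over redundant outputs. A minimal majority voter might look like this (an illustrative sketch only):

```python
# Sketch of fault masking by majority voting (triple modular redundancy).
from collections import Counter

def vote(outputs):
    """Return the majority value among redundant module outputs,
    masking a minority of faulty results."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:        # no strict majority among the modules
        raise RuntimeError("no majority: too many faulty modules")
    return value

# Three identical modules compute the same function; one produces a faulty result.
print(vote([42, 42, 17]))   # 42: the faulty module's output is masked
```

In N-version programming the voted values may not be bit-identical, so a real voter would compare within a tolerance rather than for exact equality.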

(iii) Dataflow Computing


In dataflow computing, the control of the program is directly tied to data itself. It is a simple
approach. An instruction is executed when the data necessary for execution becomes available.
Therefore, the actual order of instructions has no bearing on the order in which they are eventually
executed. Execution flow is completely determined by data dependencies. Data flows
continuously and is available to multiple instructions at the same time. Each instruction is
considered to be a separate process. Instructions do not reference memory, instead, they reference
other instructions. Data is passed from one instruction to the next.

We can understand the computation sequence of a dataflow computer by examining its


dataflow graph. In a dataflow graph, nodes represent instructions and arcs indicate data
dependencies between instructions. Data flows through this graph in the form of data tokens.
When an instruction has all of the data tokens it needs, the node fires. When a node fires, it
consumes the data tokens, performs the required operation and places the resulting data token
on an output arc. This idea is illustrated below:
[Fig.: Dataflow graph computing N = (A + B) * (B − 4) — a '+' node consumes tokens A and B,
a '−' node consumes tokens B and 4, and a '*' node consumes their results A+B and B−4 to
produce (A + B) * (B − 4).]

The data flow graph shown above is an example of a static dataflow architecture in which
the token flows through the graph in a staged pipelined fashion.
In dynamic dataflow architecture, tokens are tagged with context information and are stored in
a memory. During every clock cycle, memory is searched for the set of tokens necessary for a node
to fire. Nodes fire only when they find a complete set of input tokens within the same context. The
tokens propagate along the arcs.
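The firing rule for the graph above can be sketched with a toy evaluator; the graph encoding and node names are assumptions made for the example.

```python
# Toy static-dataflow evaluation of N = (A + B) * (B - 4) (illustrative only).
# A node fires as soon as all of its input tokens are available.
import operator

# graph: node -> (operation, inputs); an input is a constant or another node's name
graph = {
    "n1": (operator.add, ["A", "B"]),
    "n2": (operator.sub, ["B", 4]),
    "n3": (operator.mul, ["n1", "n2"]),
}

def run(graph, inputs):
    tokens = dict(inputs)                 # initial data tokens on the input arcs
    fired = set()
    while len(fired) < len(graph):
        for node, (op, srcs) in graph.items():
            ready = node not in fired and all(
                not isinstance(s, str) or s in tokens for s in srcs)
            if ready:                     # node fires: apply op, emit output token
                vals = [tokens[s] if isinstance(s, str) else s for s in srcs]
                tokens[node] = op(*vals)
                fired.add(node)
    return tokens

out = run(graph, {"A": 2, "B": 6})
print(out["n3"])    # (2 + 6) * (6 - 4) = 16
```

Notice that no instruction ordering is specified anywhere: the '+' and '−' nodes may fire in either order, and execution flow is determined entirely by data availability.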

5. (b) Types of Memories


The memories can be classified on various bases. The memory components of a computer system
can be divided into three main groups according to their use. These are as follows:
• Internal Processor Memory: This comprises a small set of high speed registers used as a
working memory for temporary storage of instructions and data.
• Main Memory: It is also called primary memory. This is a relatively large, fast memory used
for program and data storage during computer operation. It is characterized by the fact that
locations in main memory can be accessed directly and rapidly by the CPU instruction set. The
technology used for main memory is based on semiconductor integrated circuits.
• Secondary Memory: It is also known as auxiliary or backing memory. This is generally
much larger in capacity but also much slower than main memory. It is used for storing system
programs, large data files and the information or data which are not continually required by
the CPU. It also serves as an overflow memory when the capacity of the main memory is
exceeded. Information in secondary storage is accessed indirectly via input−output programs
that first transfer the required information to main memory. Representative technologies used
for secondary memory are magnetic disks and tapes.
• Internal Processor Memory : Data which are frequently used are kept in these registers.
This helps to reduce turnaround time.

5. (c) Synchronous Bus


In synchronous buses, a bus clock signal provides the timing information for all actions on the
bus. Change in other signals is relative to the falling or rising edge of the clock.
Basic Operation : Simple bus operations are single read and write.
Read Operation : The basic read operation takes three clocks. It starts with the rising edge of
clock T1. During the T1 clock cycle, the CPU places a valid address of the memory location to
be read. Since the address bus is not a single line, valid values are shown by two parallel
lines (some address lines can be 0 and others 1); values we don't care about are shown
shaded, e.g., the address values at the beginning of T1 are shown shaded.

Fig.: Memory read operation with no wait states.

After presenting a valid address on the address bus, the CPU asserts two control signals to
identify the operation type :
(i) The IO/ memory signal is made low to indicate a memory operation.
(ii) The read /write line is also turned low to indicate a read operation.

These two lines together indicate that the current bus transaction is a memory read operation.

The CPU samples the ready line at the end of T2. This line is used by slower memories to indicate
that they need more time. It asserts ready by making it low. The CPU then reads the data presented
by the memory on the data bus, removes the address and deactivates the IO/ memory and
read /write control signals.

Write Operation :
Write cycle is similar to read cycle. Since this is write operation, the read /write signal is held
high. The difference is that the CPU places data during the T2 clock cycle. As in read cycle, the
CPU removes the address and the IO/ memory and read /write control signals during the third
clock cycle.

Fig.: Memory write operation with no wait states.

Wait States :
The default timing allowed by the CPU is sometimes insufficient for a slow device to respond
e.g., in a memory read cycle, if we have a slow memory it may not be able to supply data during
the second clock cycle. Therefore, the CPU should not presume that whatever is present on the
data bus is actual data supplied by memory. Thus, the CPU always reads the value of the ready
line to see if the memory has actually placed the data on the bus. If this line is high, CPU waits
one more cycle and samples the ready line again. Once this line is low it reads the data and
terminates the read cycle.
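The ready-line sampling described above can be modelled as a simple cycle-counting loop. This is an illustrative sketch; the 3-cycle base transaction follows the read operation described earlier.

```python
# Sketch of the CPU's ready-line sampling during a read cycle (illustrative).
def read_cycle(memory_wait_states):
    """Return the total clock cycles a read transaction takes, assuming a
    3-cycle base (T1: address, T2: first ready sample, T3: read and terminate)."""
    cycles = 2                        # T1 places the address; ready sampled at end of T2
    waits = memory_wait_states
    ready = waits == 0
    while not ready:                  # ready still high: wait one cycle, sample again
        cycles += 1
        waits -= 1
        ready = waits == 0
    return cycles + 1                 # final cycle: read the data, terminate the cycle

print(read_cycle(0))   # 3 cycles: no wait states
print(read_cycle(1))   # 4 cycles: one wait state inserted by a slow memory
```

Each extra wait state stretches the transaction by exactly one bus clock, which is why slow memories on a synchronous bus directly reduce effective bandwidth.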

Fig.: Memory read operation with a single wait state.

Fig.: Memory write operation with a single wait state.

6. (a) Memory Characteristics : The properties to be considered when evaluating any memory technology
are :
i) Cost : The price should include the cost of the information storage cells as well as the cost of
the peripheral equipment or access circuitry essential to the operation of the memory.
cost = price of complete memory system / total bits of storage capacity.
ii) Access time : It is the time required to read or write a fixed amount of information, e.g. one word,
from the memory. Access time depends upon the physical characteristics of the storage medium
and also on the type of access mechanism used. It is usually measured from the time a read
request is received by the memory unit to the time the requested data is made available at the
memory output terminals. The access rate, measured in words per second, is another widely used
performance measure for storage devices. Thus, low cost and high access rates are desirable
memory characteristics.

iii) Access modes: It is the order or sequence in which information can be accessed. Memory
can be accessed randomly or sequentially. In random access memories each storage location
can be accessed independently of the other locations whereas in serial access memory storage
locations can be accessed only in a certain predetermined sequence.
iv) Alterability: Memories whose contents cannot be altered online are called Read Only
Memories (ROMs). Memories in which reading and writing can be done online are called
read/write memories. All memories used for temporary purposes are read/write memories.
v) Permanence of storage: The physical processes involved in storage are sometimes inherently
unstable, so that stored information may be lost over a period of time unless appropriate action is
taken. There are three important memory characteristics that can destroy information: destructive
readout, dynamic storage and volatility. In destructive readout, the memory contents are destroyed
as the memory is read. Memories which require periodic refreshing are called dynamic memories.
Static memories do not require refreshing. If the contents of memory are lost in case of power
failure, the memory is termed a volatile memory.
vi) Cycle time and data transfer rate: The minimum time that must elapse between the
initiation of two different accesses by the memory can be greater than the access time. This
loosely defined term is called the cycle time (tM) of the memory. It is generally convenient to
assume that the cycle time is the time needed to complete any read or write operation in memory.
The maximum rate at which data can be transferred is 1/tM and is called the data transfer rate.
The access time may be more important in measuring overall computer system performance,
since it determines the length of time for which the processor must wait before initiating a next
memory request.
vii) Physical characteristics: Many different physical properties of matter are used for
information storage. The most important properties used for this purpose are all classified as
electronic, magnetic, mechanical and optical. A factor determining the physical size of a
memory unit is the storage density measured in bits per unit area. In general, memories with
no moving parts have much higher reliability than memories such as magnetic disks which
involve considerable mechanical motion.
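The cost and data-transfer-rate formulas above can be applied to assumed figures as follows (the price, capacity and cycle time are invented for the example):

```python
# Illustrative calculations for the cost and transfer-rate measures above.
price = 120.0                          # assumed price of a complete memory system ($)
capacity_bits = 16 * 2**30 * 8         # assumed 16 GiB module, in bits
cost_per_bit = price / capacity_bits   # cost = price / total bits of storage capacity

cycle_time = 10e-9                     # assumed cycle time tM = 10 ns per word
transfer_rate = 1 / cycle_time         # maximum data transfer rate, words per second

print("cost/bit = %.2e $, rate = %.0f words/s" % (cost_per_bit, transfer_rate))
```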

6. (b)
Step I: Designing a 256 × 8 memory using two 256 × 4 chips.
[Fig.: Two 256 × 4 chips (1) and (2) share the address lines A0−A7 and the RD, WR and EN
(enable) control lines; chip (1) supplies data lines D0−D3 and chip (2) supplies D4−D7,
together forming an eight-bit-wide 256 × 8 module.]

Step II: Designing a 2K × 8 memory using eight 256 × 8 modules.
[Fig.: Eight 256 × 8 chips (1)−(8) share the address lines A0−A7 and the data lines D0−D7;
the high-order address lines A8−A10 drive a 3:8 decoder whose outputs connect to the EN
inputs, so exactly one chip responds to any 11-bit address.]

7. (a) (i) Paged Memory System :


• Paging system uses fixed length blocks called pages and assigns them to fixed regions of
physical memory called page-frames. The main advantage of paging is that memory
allocation is greatly simplified since an incoming page can be assigned to any available page
frame. Physical memory is broken into fixed−size blocks called frames.
• Logical memory is also broken into blocks of the same size called pages. When a program is
to be executed, its pages are loaded into any available frames and the page table is defined to
translate from user pages to memory frames.
• In figure (a) below, some of the frames in memory are in use and some are free. The list
of free frames is maintained by the operating system. Process A, stored on disk, consists of
four pages. When it is time to load this process, the O/S finds four free frames and loads the
four pages of process A into those frames.
[Fig. (a) Before: frames 13, 14, 15, 18 and 20 are on the free-frame list; the remaining frames
are in use. Fig. (b) After: pages 0−3 of process A occupy frames 13, 14, 15 and 18, frame 20
remains free, and process A's page table records the mapping 0→13, 1→14, 2→15, 3→18.]

• The O/S maintains a page table for each process. The page table shows the frame location for
each page of the process. Each logical address consists of a page number and a relative
address within the page. In paging, the logical to physical address translation is done by CPU
hardware.
• Now the CPU must know how to access the page table of the current process. Presented with
a logical address consisting of page number and relative address, the CPU uses the page table
to produce a physical address consisting of frame number and relative address as shown in
Fig. below.
[Fig.: Logical and Physical Address — a logical address (page number, offset) is translated
through process A's page table (0→13, 1→14, 2→15, 3→18) into a physical address (frame
number, offset); the offset is carried over unchanged.]

• Hence, paging solves a lot of problems. Main memory is divided into many small, equal-size
frames. Each process is divided into frame-size pages. A smaller process requires fewer pages,
a larger process requires more. When a process is brought in, its pages are loaded into
available frames and a page table is set up.

(ii) Demand Paging: A demand-paging system is similar to a paging system with swapping,
which is shown in the figure below.
• In the figure, we see the transfer of a paged memory to contiguous disk blocks, with the
process residing on secondary memory. When we want to execute a process, we swap it
into memory.
[Fig.: Program A is swapped out of physical memory to disk, and later swapped back in.]
• However, rather than swapping the entire process into memory, we use a lazy swapper. A
lazy swapper never swaps a page into memory unless that page will be needed. The pager
guesses which pages will be used before the process is swapped out again.
• Instead of swapping in a whole process, the pager brings only those necessary pages into
memory. Thus, it avoids reading into memory pages that will not be used anyway,
decreasing the swap time and the amount of physical memory needed. We need some form
of hardware support to distinguish between those pages that are in memory and those pages
that are on the disk. The valid−invalid bit can be used for this purpose.
• When this bit is set to ‘valid’, the associated page is both legal and in
memory. If the bit is set to ‘invalid’, the page is either not valid or is
valid but currently on disk.
• The page−table entry for a page that is brought into memory is set as usual, but the page−table
entry for a page that is not currently in memory is simply marked invalid, which is shown in
the figure below.

• If the process tries to use a page that was not brought into memory, the access to that page
(marked invalid) causes a page fault. The paging hardware will notice that the invalid bit is set,
causing a trap to the operating system. This trap is the result of the operating system’s failure
to bring the desired page into memory.
[Fig.: Page table of demand paging — logical memory holds pages A−H; the page table marks
pages resident in physical memory with a frame number and a ‘v’ (valid) bit, while pages that
exist only on disk are marked ‘i’ (invalid).]
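The valid-invalid bit mechanism can be sketched as follows; the table contents and the free-frame list are assumptions invented for the example.

```python
# Toy demand-paging access using a valid-invalid bit (illustrative values).
page_table = {0: {"frame": 4, "valid": True},
              1: {"frame": None, "valid": False}}   # page 1 exists only on disk
free_frames = [10, 11]                              # maintained by the O/S

def access(page):
    entry = page_table[page]
    if not entry["valid"]:                  # invalid bit set -> page-fault trap
        frame = free_frames.pop(0)          # the O/S finds a free frame,
        # ... reads the page from disk into that frame ...
        entry["frame"], entry["valid"] = frame, True   # and fixes the table entry
    return entry["frame"]

print(access(0))   # 4: page already resident, no fault
print(access(1))   # 10: page fault serviced, page loaded into a free frame
print(access(1))   # 10: now resident, no further fault
```

Only the first access to page 1 incurs the fault; afterwards the entry is valid and translation proceeds as in ordinary paging.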

7. (b) (i) Memory mapped I/O :


• It is nothing but programmed I/O with shared memory & I/O address space. Programmed I/O
requires all I/O operations to be executed under the direct control of the CPU i.e. every data −
transfer operation involving an I/O device requires the execution of an instruction by the
CPU.
• The I/O device does not have direct access to main memory ‘M’. A data transfer from an I/O device
to ‘M’ (memory) requires the CPU to execute several instructions. It also includes an input
instruction to transfer a word from the I/O device to the CPU and a store instruction to transfer the
word from the CPU to memory.
[Fig.: Programmed I/O with shared memory and I/O address space — the CPU, main memory
and I/O ports 1−3 (connecting I/O devices A and B) all share one set of data, address, READ
and WRITE lines.]

 I/O address space is always shared with memory address space.


 Same control signals are used for memory as well as I/O control. These are
(i) memory read (ii) memory write
 All memory related instructions can be used for I/O operations.
 I/O data transfer can be done with respect to any general purpose register.
 Memory area proportionately reduces as I/O space increases.
 In this, ALU operations may perform directly on port data.
 The address lines of the system bus that are used to select memory locations can also be used
to select I/O devices.
 An I/O device is connected to the bus via an I/O port, which from the CPU’s perspective is an
addressable data register.
 A technique used in many machines, such as the Motorola 680x0 series, is to assign a part of the
main memory address space to I/O ports. This technique is called memory mapped I/O.
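A memory-mapped I/O address space can be sketched as a single load path that routes high addresses to device handlers; the I/O base address and the status port here are invented for illustration.

```python
# Sketch of memory-mapped I/O: one address space, with a range routed to devices.
RAM_SIZE = 0x8000
IO_BASE = 0x8000                    # assumed: addresses >= 0x8000 are I/O ports
ram = bytearray(RAM_SIZE)
io_ports = {0x8000: lambda: 0x5A}   # hypothetical device status register

def load(address):
    """A single 'memory read' reaches either RAM or an I/O port."""
    if address < IO_BASE:
        return ram[address]         # ordinary memory access
    return io_ports[address]()      # same instruction, but a device responds

ram[0x10] = 7
print(load(0x10))     # 7: read from memory
print(load(0x8000))   # 90: read from the device register (0x5A)
```

Because the port occupies an ordinary address, any memory-referencing instruction can reach it; the trade-off, as noted above, is that the usable memory area shrinks as the I/O space grows.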
(ii) I/O Mapped I/O :
The organization is as follows :
• The memory and I/O address spaces are separate.
• This scheme is used in the Intel 80x86 microprocessor series.
• A memory referencing instruction activates the READ ‘M’ or WRITE ‘M’ control line which
does not affect the I/O devices.
• The CPU must execute separate I/O instructions to activate the READ I/O and WRITE I/O lines.
• It causes a word to be transferred between the addressed I/O port & the CPU.
• An I/O device and a memory location can have the same address bit pattern without conflict.
• I/O mapped I/O define separate I/O address space & memory address space.
• It uses separate control signals for memory and I/O devices. These are − memory read,
memory write, I/O − read and I/O − write.
• It uses dedicated instructions for I/O operations e.g., IN, OUT.
• I/O data transfer is always with respect to accumulator (a register) only.
• Address memory area is not reduced in this case.
• ALU operations cannot be directly performed on port data.
[Fig.: Programmed I/O with separate memory and I/O address spaces (I/O-mapped I/O) — the
CPU, main memory and I/O ports 1−3 (connecting I/O devices A and B) share the data and
address lines, but memory uses the READ M / WRITE M control lines while the ports use
READ I/O / WRITE I/O.]

