Sunteți pe pagina 1din 35

Department of Electrical and Computer Engineering

SRAM Architecture
Vishal Saxena, Boise State University
(vishalsaxena@boisestate.edu)

Vishal Saxena

-1-

Outline

Memory Arrays
SRAM Architecture

SRAM Cell
Decoders
Column Circuitry
Multiple Ports

Serial Access Memories

Vishal Saxena

-2-

Memory Arrays
Memory Arrays

Random Access Memory

Read/Write Memory
(RAM)
(Volatile)

Static RAM
(SRAM)

Dynamic RAM
(DRAM)

Mask ROM

Programmable
ROM
(PROM)

Content Addressable Memory


(CAM)

Serial Access Memory

Read Only Memory


(ROM)
(Nonvolatile)

Shift Registers

Serial In
Parallel Out
(SIPO)

Erasable
Programmable
ROM
(EPROM)

Parallel In
Serial Out
(PISO)

Electrically
Erasable
Programmable
ROM
(EEPROM)

Vishal Saxena

Queues

First In
First Out
(FIFO)

Last In
First Out
(LIFO)

Flash ROM

-3-

6T SRAM Cell

Cell size accounts for most of array size

6T SRAM Cell

Used in most commercial chips


Data stored in cross-coupled inverters

Read:

Reduce cell size at expense of complexity

Precharge bit, bit_b


Raise wordline
bit

Write:

Drive data onto bit, bit_b


Raise wordline

bit_b

word

Vishal Saxena

-4-

SRAM Read

Precharge both bitlines high


Then turn on wordline
One of the two bitlines will be pulled down by the cell
Ex: A = 0, A_b = 1
bit_b
bit

bit discharges, bit_b stays high


But A bumps up slightly

word

Read stability

P1 P2

N2

N4
A_b

A
N1 N3

A must not flip


N1 >> N2

A_b

bit_b

1.5

1.0

bit

word

0.5

A
0.0
0

100

200

300

400

500

600

time (ps)

Vishal Saxena

-5-

SRAM Write

Drive one bitline high, the other low


Then turn on wordline
Bitlines overpower cell with new value
Ex: A = 0, A_b = 1, bit = 1, bit_b = 0

bit_b

bit
word

Force A_b low, then A rises high

N4
A_b

Writability

P1 P2

N2

N1 N3

Must overpower feedback inverter


N2 >> P1
A_b
A

1.5

bit_b
1.0

0.5

word

0.0
0

100

200

300

400

500

600

700

time (ps)

Vishal Saxena

-6-

SRAM Sizing

High bitlines must not overpower inverters during reads


But low bitlines must write new value into cell

bit_b

bit
word
weak
med

med
A

A_b
strong

Vishal Saxena

-7-

SRAM Column Example


Read

Write
Bitline Conditioning
2

Bitline Conditioning

More
Cells

word_q1

More
Cells

out_v1r

SRAM Cell

bit_b_v1f

out_b_v1r

bit_v1f

bit_b_v1f

bit_v1f

SRAM Cell

word_q1

write_q1
1
2

data_s1

word_q1
bit_v1f
out_v1r

Vishal Saxena

-8-

SRAM Layout

Cell size is critical: 26 x 45 (even smaller in industry)


Tile cells sharing VDD, GND, bitline contacts

GND

BIT BIT_B GND

VDD

WORD

Cell boundary

Vishal Saxena

-9-

12T SRAM Cell

Basic building block: SRAM Cell

Holds one bit of information, like a latch


Must be read and written

12-transistor (12T) SRAM cell

Use a simple latch connected to bitline


46 x 75 unit cell

bit
write
write_b
read
read_b

Vishal Saxena

-10-

Thin Cell

In nanometer CMOS

Avoid bends in polysilicon and diffusion


Orient all transistors in one direction

Lithographically friendly or thin cell layout fixes this

Also reduces length and capacitance of bitlines

Vishal Saxena

-11-

Commercial SRAMs

Five generations of Intel SRAM cell micrographs

Transition to thin cell at 65 nm


Steady scaling of cell area

Vishal Saxena

-12-

Decoders

n:2n decoder consists of 2n n-input AND gates

One needed for each row of memory


Build AND from NAND or NOR gates

Static CMOS
A1

Pseudo-nMOS
A1

A0

word0
word1

A1

A0

A0

word0
word

word1

word2

word2

word3

word3

Vishal Saxena

A0

1/2

16

A1
1
1

word

-13-

Decoder Layout

Decoders must be pitch-matched to SRAM cell

Requires very skinny gates

A3

A3

A2

A2

A1

A1

A0

A0

VDD

word

GND
buffer inverter

NAND gate

Vishal Saxena

-14-

Large Decoders

For n > 4, NAND gates become slow

Break large gates into multiple smaller gates


A3

A2

A1

A0

word0

word1

word2

word3

word15

Vishal Saxena

-15-

Predecoding

Many of these gates are redundant

Factor out common


gates into predecoder
Saves area
Same path effort

A3

A2

A1

A0

predecoders
1 of 4 hot
predecoded lines
word0
word1

word2
word3

word15

Vishal Saxena

-16-

Column Circuitry

Some circuitry is required for each column

Bitline conditioning
Sense amplifiers
Column multiplexing

Vishal Saxena

-17-

Bitline Conditioning

Precharge bitlines high before reads

bit

bit_b

Equalize bitlines to minimize voltage difference when using


sense amplifiers

bit

bit_b

Vishal Saxena

-18-

Sense Amplifiers

Bitlines have many cells attached

Ex: 32-kbit SRAM has 128 rows x 256 cols


128 cells on each bitline

Even with shared diffusion contacts, 64C of diffusion capacitance


(big C)
Discharged slowly through small transistors (small I)

tpd (C/I) V

Sense amplifiers are triggered on small voltage swing


(reduce V)

Vishal Saxena

-19-

Differential Pair Amp

Differential pair requires no clock


But always dissipates static power

sense_b
bit

P1
N1

P2
N2

sense
bit_b

N3

Vishal Saxena

-20-

Clocked Sense Amp

Clocked sense amp saves power


Requires sense_clk after enough bitline swing
Isolation transistors cut off large bitline capacitance
bit

bit_b
isolation
transistors

sense_clk

regenerative
feedback

sense

sense_b

Vishal Saxena

-21-

Twisted Bitlines

Sense amplifiers also amplify noise

Coupling noise is severe in modern processes


Try to couple equally onto bit and bit_b
Done by twisting bitlines

b0 b0_b b1 b1_b b2 b2_b b3 b3_b

Vishal Saxena

-22-

Column Multiplexing

Ex: 2 kword x 16 folded into 256 rows x 128 columns

Must select 16 output bits from the 128 columns


Requires 16 8:1 column multiplexers

Vishal Saxena

-23-

Tree Decoder Mux

Column mux can use pass transistors

Use nMOS only, precharge outputs

One design is to use k series transistors for 2k:1 mux

No external decoder logic needed


B0 B1

B2 B3

B4 B5

B6 B7

B0 B1

B2 B3

B4 B5

B6 B7

A0
A0
A1
A1
A2
A2
Y

to sense amps and write circuits

Vishal Saxena

-24-

Single Pass-Gate Mux

Or eliminate series transistors with separate decoder


A1

A0

B0 B1

B2 B3

Vishal Saxena

-25-

Ex: 2-way Muxed SRAM


2
More
Cells

More
Cells

word_q1

A0
A0
write0_q1

write1_q1

data_v1

Vishal Saxena

-26-

Multiple Ports

We have considered single-ported SRAM

One read or one write on each cycle

Multiported SRAM are needed for register files


Examples:

Multicycle MIPS must read two sources or write a result on some


cycles
Pipelined MIPS must read two sources and write a third result each
cycle
Superscalar MIPS must read and write many sources and results
each cycle

Vishal Saxena

-27-

Dual-Ported SRAM

Simple dual-ported SRAM

Two independent single-ended reads


Or one differential write

bit

bit_b

wordA
wordB

Do two reads and one write by time multiplexing

Read during ph1, write during ph2

Vishal Saxena

-28-

Multi-Ported SRAM

Adding more access transistors hurts read stability


Multiported SRAM isolates reads from state node
Single-ended bitlines save area

Vishal Saxena

-29-

Large SRAMs

Large SRAMs are split into subarrays for speed


Ex: UltraSparc 512KB cache

4 128 KB subarrays
Each have 16 8KB banks
256 rows x 256 cols / bank
60% subarray area efficiency
Also space for tags & control

[Shin05]

Vishal Saxena

-30-

ROM Example

4-word x 6-bit ROM


Represented with dot diagram
Dots indicate 1s in ROM

Word 0: 010101
Word 1: 011001

weak
pseudo-nMOS
pullups

A1 A0

Word 2: 100101
Word 3: 101010

2:4
DEC

ROM Array

Y5

Y4

Y3

Y2

Y1

Y0

Looks like 6 4-input pseudo-nMOS NORs


Vishal Saxena

-31-

ROM Array Layout

Unit cell is 12 x 8 (about 1/10 size of SRAM)

Unit
Cell

Vishal Saxena

-32-

Row Decoders

ROM row decoders must pitch-match with ROM


Only a single track per word!

Vishal Saxena

-33-

Complete ROM Layout

Vishal Saxena

-34-

References
1.

Weste, Harris, CMOS VLSI Design, 2nd Ed., Addison Wesley.

Vishal Saxena

-35-

S-ar putea să vă placă și