STEPHEN BROWN
JONATHAN ROSE
University of Toronto

This tutorial surveys commercially available, high-capacity field-programmable devices. The authors describe the three main categories of FPDs: simple and complex programmable logic devices, and field-programmable gate arrays. They then give architectural details of the most important chips and example applications of each type of device.

Recently, the development of new types of sophisticated field-programmable devices (FPDs) has dramatically changed the process of designing digital hardware. Unlike previous generations of hardware technology in which board level designs included large numbers of SSI (small-scale integration) chips containing basic gates, virtually every digital design produced today consists mostly of high-density devices. This is true not only of custom devices such as processors and memory but also of logic circuits such as state machine controllers, counters, registers, and decoders. When such circuits are destined for high-volume systems, designers integrate them into high-density gate arrays. However, the high nonrecurring engineering costs and long manufacturing time of gate arrays make them unsuitable for prototyping or other low-volume scenarios. Therefore, most prototypes and many production designs now use FPDs. The most compelling advantages of FPDs are low startup cost, low financial risk, and, because the end user programs the device, quick manufacturing turnaround and easy design changes.

The FPD market has grown over the past decade to the point where there is now a wide assortment of devices to choose from. To choose a product, designers face the daunting task of researching the best uses of the various chips and learning the intricacies of vendor-specific software. Adding to the difficulty is the complexity of the more sophisticated devices. To help sort out the confusion, we provide an overview of the various FPD architectures and discuss the most important commercial products, emphasizing devices with relatively high logic capacity.

Evolution of FPDs
The first user-programmable chip that could implement logic circuits was the programmable read-only memory (PROM), in which address lines serve as logic circuit inputs and data lines as outputs. Logic functions, however, rarely require more than a few product terms, and a PROM contains a full decoder for its address inputs. PROMs are thus inefficient for realizing logic circuits, so designers rarely use them for that purpose.

The first device developed specifically for implementing logic circuits was the field-programmable logic array, or simply PLA for short. A PLA consists of two levels of logic gates: a programmable, wired-AND plane followed by a programmable, wired-OR plane. A PLA’s structure allows any of its inputs (or their complements) to be ANDed together in the AND plane; each AND plane output can thus correspond to any product term of the inputs. Similarly, users can configure each OR
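As a rough illustration of the two-level PLA structure described above, the following Python sketch models a tiny PLA in software. Each AND-plane row selects inputs or their complements to form one product term, and each OR-plane row selects which product terms feed one output. The function name `evaluate_pla`, the encoding of the two planes, and the XOR/AND example are assumptions made for illustration; they are not taken from any vendor's tools.

```python
from itertools import product


def evaluate_pla(and_plane, or_plane, inputs):
    """Evaluate a PLA described by two programmable planes.

    and_plane: list of product terms; each term maps an input index to the
               value it requires (1 = true input, 0 = complemented input).
               Inputs missing from a term are left unconnected.
    or_plane:  list of outputs; each output lists the product-term indices
               that are ORed together.
    inputs:    tuple of 0/1 input values.
    """
    # AND plane: a product term is 1 only if every connected literal matches.
    terms = [
        int(all(inputs[i] == v for i, v in term.items()))
        for term in and_plane
    ]
    # OR plane: an output is 1 if any of its selected product terms is 1.
    return tuple(int(any(terms[t] for t in out)) for out in or_plane)


# Hypothetical two-input example: output 0 = a XOR b, output 1 = a AND b.
AND_PLANE = [
    {0: 1, 1: 0},  # a AND (NOT b)
    {0: 0, 1: 1},  # (NOT a) AND b
    {0: 1, 1: 1},  # a AND b
]
OR_PLANE = [
    [0, 1],  # XOR uses the first two product terms
    [2],     # AND uses the third product term
]

if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, evaluate_pla(AND_PLANE, OR_PLANE, (a, b)))
```

The sketch needs only three AND-plane rows for two outputs, whereas a PROM would fully decode every input combination regardless of how few product terms the functions require.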
SUMMER 1996 43
F I E L D - P R O G R A M M A B L E D E V I C E S
Figure 3. FPD logic capacities. (Chart comparing SPLDs, CPLDs, and FPGAs on a logic-capacity axis with ticks at 200, 1,000, 2,000, 5,000 (*), and 12,000 (**). Legend: * Altera Max 7000, AMD Mach, Lattice (p)LSI, Cypress Flash370, Xilinx XC9500; ** Altera Max 9000; *** Altera Flex 10K, AT&T ORCA 2.)

characteristics are low cost and very high pin-to-pin speed performance.

Advances in technology have produced devices with higher capacities than SPLDs. The difficulty with increasing a strict SPLD architecture’s capacity

devices based on SPLD architectures is to programmably interconnect multiple SPLDs on a single chip. Many FPD products on the market today have this basic structure and are known as complex programmable-logic devices (CPLDs).

Altera pioneered CPLDs, first in their Classic EPLD chips, and then in the Max 5000, 7000, and 9000 series. Because of a rapidly growing market for large FPDs, other manufacturers developed CPLD devices, and many choices are now available. CPLDs provide logic capacity up to the equivalent of about 50 typical SPLD devices, but extending these architectures to higher densities is difficult. Building FPDs with very high logic capacity requires a different approach.

The highest capacity general-purpose logic chips available today are the traditional gate arrays, sometimes referred to as mask-programmable gate arrays (MPGAs). An MPGA consists of an array of pre-

applications than for others. There are also special-purpose devices optimized for specific applications (for example, state machines, analog gate arrays, large interconnection problems). Since such devices have limited use, we do not describe them here.

User-programmable switch technologies
User-programmable switches are the key to user customization of FPDs. The first user-programmable switch developed was the fuse used in PLAs. Although some smaller devices still use fuses, we will not discuss them here because newer technology is quickly replacing them. For higher density devices, CMOS dominates the IC industry, and different approaches to implementing programmable switches are necessary. For CPLDs, the main switch technologies (in commercial products)
PLICE uses polysilicon and n+ diffusion as conductors and a custom-developed compound, ONO (oxide-nitride-oxide),1 as an insulator. Other antifuses rely on metal for conductors, with amorphous silicon as the middle layer.2,3

CAD for FPDs
Computer-aided design programs are essential in designing circuits for implementation in FPDs. Such software tools are important not only for CPLDs and FPGAs, but also for SPLDs. A typical CAD system for SPLDs includes software for the following tasks: initial design entry, logic optimization, device fitting, simulation, and configuration. Figure 7 illustrates the SPLD design process. To enter a design, the designer creates a schematic diagram with a graphical CAD tool, describes the design in a simple hardware description language, or combines these methods. Since initial logic entry is not usually in an optimized form, the system applies algorithms to optimize the circuits. Then additional algorithms analyze the resulting logic equations and fit them into the SPLD. Simulation verifies correct operation, and the designer returns to the design entry step to fix errors. When a design simulates correctly, the designer loads it into a programming unit to configure an SPLD. In most CAD systems, the designer performs the original design entry step manually, and all other steps are automatic.

The steps involved in CPLD design are similar to those for SPLDs, but the CAD tools are more sophisticated. Because the devices are complex and can accommodate large designs, it is more common to use different design entry methods for different modules of a circuit. For instance, the designer

Commercially available FPDs
This overview provides examples of commercial FPD products and their applications. We encourage readers interested in more details to contact the manufacturers or distributors for the latest data sheets. Most FPD manufacturers provide data sheets on the World Wide Web at http://www.companyname.com.

SPLDs. As a staple of digital hardware designers for the past two decades, SPLDs are very important devices. They have the highest speed performance of all FPDs and are inexpensive. Because they are straightforward and well understood, we discuss them only briefly here.

Two of the most popular SPLDs are the AMD (Advanced Micro Devices) 16R8 and 22V10 PALs. Both of these devices are industry standards, widely sec-
Figure 12. Mach 4 34V16 PAL-like block. (Diagram labels: input switch matrix, AND plane, PT allocator, OR and EXOR gates, macrocells (flip-flops), output/buried switch matrix, I/O cells.)

gate, an EXOR gate, and a flip-flop), and 2) an output switch matrix between the OR gates and the I/O pins. These features make a Mach 4 chip easier to use because they decouple sections of the PAL-like block. More specifically, the product term allocator distributes and shares product terms from the AND plane to OR gates that require them, allowing much more flexibility than the fixed-size OR gates in regular PALs. The output switch matrix enables any macrocell output (OR gate or flip-flop) to drive any I/O pin connected to the PAL-like block, again providing greater flexibility than a PAL, in which each macrocell can drive only one specific I/O pin. Mach 4’s combination of in-system programmability and high flexibility allows easy hardware design changes.
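To make the contrast with fixed-size OR gates concrete, here is a minimal Python sketch of the two ideas in the paragraph above: sharing product terms from one pool, and letting any macrocell drive any pin. It is an idealized model, not the Mach 4 implementation; the function names, the pool size of 16, and the example demands are all assumptions, and a real allocator may impose constraints that the text does not describe.

```python
def fixed_pal_fit(demands, terms_per_gate):
    """Regular PAL: every OR gate has the same fixed number of product terms."""
    return all(d <= terms_per_gate for d in demands)


def allocate_product_terms(demands, total_terms):
    """Idealized allocator: give each OR gate exactly the product terms it
    needs from one shared pool, or return None if the pool is too small."""
    if sum(demands) > total_terms:
        return None
    allocation, next_term = [], 0
    for d in demands:
        allocation.append(list(range(next_term, next_term + d)))
        next_term += d
    return allocation


def route_outputs(pin_for_macrocell, num_pins):
    """Idealized output switch matrix: any macrocell may drive any pin,
    as long as no two macrocells are assigned to the same pin."""
    pins = list(pin_for_macrocell.values())
    return len(set(pins)) == len(pins) and all(0 <= p < num_pins for p in pins)


if __name__ == "__main__":
    demands = [1, 12, 3, 0]  # hypothetical product terms needed per OR gate
    print(fixed_pal_fit(demands, terms_per_gate=4))          # 12 > 4, so no fit
    print(allocate_product_terms(demands, total_terms=16))   # shared pool fits
    print(route_outputs({0: 3, 1: 0}, num_pins=8))           # legal pin choice
```

With the same 16 product terms in total, the fixed arrangement of four terms per gate cannot hold the 12-term function, while the shared pool can.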
Lattice pLSI and ispLSI. Lattice offers a complete range of CPLDs, with two main product lines: the pLSI and the ispLSI. Each consists of three families of EEPROM CPLDs with different logic capacities and speed performance. The ispLSI devices are in-system programmable.

Lattice’s earliest generation of CPLDs is the pLSI and ispLSI 1000 series. Each chip consists of a collection of SPLD-like blocks and a global routing pool to connect the blocks. Logic capacity ranges from about 1,200 to 4,000 gates, and pin-to-pin delays are 10 ns. Lattice also offers the 2000 series, relatively small CPLDs with between 600 and 2,000 gates. The 2000 series features a higher ratio of macrocells to I/O pins and higher speed performance than the 1000 series. At 5.5-ns pin-to-pin delays, the 2000 series provides state-of-the-art speed.

Lattice’s 3000 series consists of the company’s largest CPLDs, with up to 5,000 gates and 10- to 15-ns pin-to-pin delays. Compared with the chips discussed so far, the functionality of the 3000 series is most similar to that of the Mach 4. Unlike the other Lattice CPLDs, the 3000 series offers enhancements to support more recent design styles, such as IEEE Std 1149.1 boundary scan.

Figure 13 shows the general structure of a Lattice pLSI or ispLSI device. Around the chip’s outside edges are bidirectional I/Os, which connect to both the generic logic blocks and the global routing pool. As the magnified view on the right side of the figure shows, the generic logic blocks are small PAL-like blocks consisting of an AND plane, a product term allocator, and macrocells. The global routing pool is a set of wires that span the chip to connect generic logic block inputs and outputs. All interconnects pass through the global routing pool, so timing between logic levels is fully predictable, as it is for the AMD Mach devices.

Figure 13. Lattice pLSI and ispLSI architecture. (Diagram labels: I/O pads, input bus, global routing pool, output routing pools, generic logic blocks; magnified view: AND plane, product term allocator, macrocells.)

Cypress Flash370. Cypress has recently developed CPLD products similar to the AMD and Lattice devices in several ways. Cypress Flash370 CPLDs
use flash EEPROM technology and offer speed performance of 8.5 to 15 ns pin-to-pin delays. The Flash370s are not in-system programmable. To meet the needs of larger chips, the devices provide more I/O pins than competing products, with a linear relationship between the number of macrocells and the number of bidirectional I/O pins. The smallest parts have 32 macrocells and 32 I/O pins; the largest have 256 macrocells and 256 pins.

Figure 14 shows that Flash370s have a typical CPLD architecture with multiple PAL-like blocks connected by a programmable interconnect matrix. Each PAL-like block contains an AND plane that feeds a product term allocator that

programmable interconnect matrix contains 32 wires. This means that a macrocell can be buried (not drive an I/O pin), and yet the I/O pin that the macrocell would have driven can still serve as an input. This capability is another type of flexibility available in PAL-like blocks but not in normal PALs.

Figure 14. Cypress Flash370 architecture. (PIM: programmable interconnect matrix.) (Diagram labels: I/Os, PIM, AND plane, PT allocator, 0-16 inputs, OR gate, bypassable (D, T, latch) flip-flop, tristate buffer.)

Xilinx XC7000. Although primarily a manufacturer of FPGAs, Xilinx also offers the XC7000 series of CPLDs. The two main XC7000 families are the 7200 series (originally marketed by Plus Logic as Hiper EPLDs) and the 7300 series developed by Xilinx. The 7200s are moderately small devices with a capacity of about 600 to 1,500 gates, and they offer speed performance of about 25-ns pin-to-pin delays. Each chip consists of a collection of SPLD-like blocks containing nine macrocells each. Unlike those in other CPLDs, a macrocell includes two OR gates, each of which becomes an input for a 2-bit arithmetic logic unit. The ALU can produce any function of its two inputs, and its output feeds a configurable flip-flop. The 7300 series is an enhanced version of the 7200 with greater capacity (up to 3,000 gates) and higher speed performance. Xilinx also has announced a new CPLD family, the XC9500, which will offer in-circuit programmability with 5-ns pin-to-pin delays and up to 6,200 logic gates.

Altera Flashlogic. Previously known as Intel’s Flexlogic, these devices feature in-system programmability and on-chip SRAM blocks, a unique feature among CPLD products. Figure 15a illustrates the Flashlogic architecture, a collection of PAL-like blocks called configurable function blocks (CFBs), each of which represents an optimized 24V10 PAL.

Flashlogic’s basic structure is similar to other products already discussed. However, one feature sets it apart from

Figure 15. Altera Flashlogic CPLD: general architecture (a); CFB in PAL mode (b); CFB in SRAM mode (c). (Diagram labels: I/O, CFB, global interconnect matrix, clock; in SRAM mode, an SRAM of 128 words × 10 bits with address, data in, control, clock, and data out.)
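The statement that the XC7000 macrocell's ALU can produce any function of its two inputs can be illustrated with a small Python sketch. A Boolean function of two inputs is fully described by a 4-bit truth table, so 16 configurations cover every case. The names `two_input_function` and `macrocell`, the bit ordering of the truth table, and the omission of any arithmetic or carry behavior are assumptions for illustration only, not a description of the actual Xilinx circuit.

```python
from itertools import product


def two_input_function(truth_table_bits, a, b):
    """Return f(a, b) for the function selected by a 4-bit truth table.

    Bit (2*a + b) of truth_table_bits holds the output for inputs (a, b),
    so the 16 possible values cover every Boolean function of two inputs.
    """
    return (truth_table_bits >> (2 * a + b)) & 1


def macrocell(or1_terms, or2_terms, truth_table_bits):
    """Hypothetical macrocell: two OR gates sum their product terms, and the
    two sums feed the configurable two-input function block."""
    a = int(any(or1_terms))
    b = int(any(or2_terms))
    return two_input_function(truth_table_bits, a, b)


XOR = 0b0110  # output is 1 for inputs (0, 1) and (1, 0)
AND = 0b1000  # output is 1 only for inputs (1, 1)

if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, two_input_function(XOR, a, b), two_input_function(AND, a, b))
```

In this model the flip-flop that follows the ALU is left out; only the combinational path from the two OR gates to the function output is shown.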
the longer paths contain fewer programmable switches. Moreover, connections between horizontal and vertical lines pass through active buffers, further enhancing predictability.

The Flex 10K family offers all the Flex 8000 features with the addition of variable-size blocks of SRAM called embedded array blocks. As Figure 23 shows, each row of a Flex 10K chip has an embedded array block on one end. Users can configure each embedded array block to serve as an SRAM block with a variable aspect ratio: 256×8, 512×4, 1K×2, or 2K×1. Alternatively, CAD tools can configure an embedded array block to implement a complex logic circuit, such as a multiplier, by employing it as a large, multioutput lookup table. Altera CAD tools provide several macrofunctions that implement useful logic circuits in embedded array blocks. Counting the embedded array blocks as logic gates, Flex 10K offers the highest logic capacity of any FPGA, although obtaining an accurate number is difficult.

AT&T ORCA. AT&T’s SRAM-based FPGAs, called Optimized Reconfigurable Cell Arrays (ORCAs), feature an overall structure similar to that of Xilinx FPGAs. The ORCA logic block contains an array of programmable-function units (Figure 24) based on lookup tables. A programmable-function unit is unique among lookup-table-based logic blocks: It is configurable as four 4-input lookup tables, two 5-input lookup tables, or one 6-input lookup table. A key element of this architecture is that when the programmable-function unit serves as four 4-input lookup tables, several of the lookup tables’ inputs must come from the same programmable-function unit input. While this restraint reduces the programmable-function unit’s flexibility, it also significantly reduces the chip’s wiring cost. The programmable-function unit includes arithmetic circuitry, as do the Xilinx XC4000 and the Altera Flex 8000, and like the XC4000, is configurable as a RAM block. A recently announced version of the ORCA chip also allows dual-port and synchronous RAM.

ORCA’s interconnect structure is also different from other SRAM-based FPGAs. Each programmable-function unit connects to an interconnect configured in four-bit buses. This structure supports system level designs more efficiently, since buses are common in such applications.

The ORCA2 series extends the family, offering a capacity of up to 40,000 logic gates. ORCA2 features a two-level hierarchy of programmable-function

Figure 23. Altera Flex 10K architecture. (Diagram labels: I/O, embedded array, logic element, local interconnect, from and to FastTrack interconnect, control, data, cascade and carry.)

Figure 24. AT&T ORCA programmable-function unit. (Diagram labels: four lookup tables, each paired with a DQ element; switch matrix; PFU.)
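The idea of using a Flex 10K embedded array block as a large, multioutput lookup table can be sketched in Python. All four aspect ratios listed above hold the same 2,048 bits, and in the 256×8 shape the eight address bits can carry two 4-bit operands while the eight data bits return their product. The choice of a 4-bit by 4-bit multiplier, the address packing, and the function names are assumptions for illustration; the text does not say how Altera's macrofunctions are organized.

```python
# The four aspect ratios named in the text, as (words, bits per word).
ASPECT_RATIOS = [(256, 8), (512, 4), (1024, 2), (2048, 1)]


def build_multiplier_table():
    """Fill a 256-word x 8-bit table so that word (a << 4 | b) holds a * b."""
    return [(addr >> 4) * (addr & 0xF) for addr in range(256)]


def lookup_multiply(table, a, b):
    """Multiply two 4-bit operands with a single table read."""
    if not (0 <= a < 16 and 0 <= b < 16):
        raise ValueError("operands must fit in 4 bits")
    return table[(a << 4) | b]


if __name__ == "__main__":
    # Every aspect ratio stores the same number of bits.
    print([words * width for words, width in ASPECT_RATIOS])
    table = build_multiplier_table()
    print(lookup_multiply(table, 7, 9))
```

The largest product, 15 × 15 = 225, fits in eight bits, which is why the 256×8 shape is enough for this particular example.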
similar features, we focus on the most recent devices. Unlike the FPGAs described so far, Actel’s devices use antifuse technology and a structure similar to traditional gate arrays. Their design arranges logic blocks in rows with horizontal routing channels between adjacent rows (Figure 25). Actel logic blocks, based on multiplexers, are small compared to those based on lookup tables. Figure 26 illustrates the Act 3 logic block, which consists of an AND and an OR gate connected to a multiplexer-based circuit block. In combination with the two logic gates, the arrangement of the multiplexer circuit enables a single logic block to realize a wide range of functions. About half the logic blocks in an Act 3 device also contain a flip-flop.

Actel’s horizontal routing channels consist of various-length wire segments with antifuses to connect logic blocks to wire segments or one wire to another. Although not shown in Figure 25, vertical wires also overlie the logic blocks, forming signal paths that span multiple rows. The speed performance of Actel chips is not fully predictable because the number of antifuses traversed by a signal depends on how CAD tools allocate the wire segments during circuit implementation. However, a rich selection of wire segment lengths in each channel and algorithms that guarantee strict limits on the number of antifuses traversed by any two-point wire connection improve speed performance significantly.

to several other FPGAs: Like Xilinx FPGAs, it has an array-based structure; like Actel FPGAs, its logic blocks use multiplexers; and like Altera Flex 8000s, its interconnect consists only of long lines. The pASIC2 is a recently introduced enhanced version, which we will not discuss here. Cypress also offers devices using the pASIC architecture, but we discuss only Quicklogic’s version.

Quicklogic’s ViaLink antifuse structure (see Figure 27b) consists of a metal top layer, an amorphous-silicon insulat-

Figure 25. Actel FPGA structure. (Diagram labels: I/O blocks on the chip edges.)

Figure 26. Actel Act 3 logic module. (Diagram labels: inputs, multiplexer-based circuit block, output.)
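To show how a multiplexer-based block like the Act 3 logic module can realize many different functions, the Python sketch below models a 4-to-1 multiplexer whose two select lines are driven by an AND gate and an OR gate. The exact wiring of the real Act 3 module is not given in the text, so this arrangement, the function names, and the two example functions are assumptions for illustration only.

```python
from itertools import product


def mux_logic_block(d, a0, b0, a1, b1):
    """Hypothetical multiplexer-based logic block.

    d:       four data inputs d[0]..d[3] of a 4-to-1 multiplexer
    a0, b0:  inputs of the AND gate that drives select bit 0
    a1, b1:  inputs of the OR gate that drives select bit 1
    """
    s0 = a0 & b0
    s1 = a1 | b1
    return d[(s1 << 1) | s0]


def xor2(x, y):
    """XOR: tie the data inputs to constants and pass x and y to the selects."""
    return mux_logic_block([0, 1, 1, 0], x, 1, y, 0)


def majority3(x, y, z):
    """Majority of three: feed the third signal z into two data inputs."""
    return mux_logic_block([0, z, z, 1], x, 1, y, 0)


if __name__ == "__main__":
    for x, y, z in product((0, 1), repeat=3):
        print(x, y, z, xor2(x, y), majority3(x, y, z))
```

The same block yields different functions purely by changing which signals or constants are tied to its inputs, which is the sense in which a small multiplexer-based block covers a wide range of functions.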
D&T focuses on practical articles of near-term interest to the professional engineering community. D&T seeks articles of significant contribution that address the design, test, debugging, manufacturability, and yield improvement of microprocessors and microcontrollers. The areas of interest include but are not limited to

➧ Circuit design and design methodologies
➧ Logic design and design methodologies
➧ CAD tools and methodologies
➧ Design-for-test techniques and applications
➧ Debugging experiences, tools, and methodologies
➧ Yield improvement experiences, tools, and methodologies
➧ Project management

Interested authors should submit four copies of a double-spaced manuscript no longer than 35 pages, in English, by June 15, 1996. Each copy must contain contact information (name, postal and e-mail addresses, and phone/fax numbers). Final articles will be due October 15, 1996. For author guidelines, see D&T’s Spring 1996 issue or Web page at http://www.computer.org/pubs/d&t/d&t.htm.

Submit manuscripts to:
Marc E. Levitt
Special Issue Guest Editor
Sun Microelectronics, USUN02-301
2550 Garcia Avenue, Mountain View, CA 94043
phone (408) 774-8268; fax (408) 774-2099
marc.levitt@eng.sun.com