TMS320C64xx fixed-point DSP

• VLIW core based on the VelociTI architecture, fetching 256-bit
instruction packets, with a dual-bus Harvard architecture.
• 32-bit data word length
• Clock speed ranging from 300–1000 MHz
• Code compatible with the older C6200 fixed-point family.

15-10-2019 1
C64xx CPU has two sets of functional units
• Each set contains four units and a register file of 32-bit registers: 32 per
file on the C64x and C64x+ (16 per file on the older C62x). There is a cross
path between the sets of functional units.
• .M: all multiplication operations; also includes bit-count, rotate, Galois
field multiply, and bidirectional variable shift hardware.
• .L: logic and arithmetic operations: 32/40-bit arithmetic and compare
operations and 32-bit logical operations.
• .S: logic and arithmetic operations: 32-bit arithmetic operations, 32/40-bit
shifts, 32-bit bit-field operations, 32-bit logical operations, branches,
constant generation, and register transfers to/from the control register file
(.S2 only).
• .D: data addressing units: address calculations, loads and stores, constant
generation, and 32-bit logical operations.

C674x CPU features (VelociTI architecture)
• Can execute up to eight 32-bit instructions per cycle
• There are 64 general-purpose 32-bit registers
• Eight functional units: two multipliers and six ALUs

Features of C6000 devices
• Advanced VLIW CPU with eight functional units
• Executes up to eight instructions per cycle for high peak throughput
• RISC-like code shortens development time
• Efficient code execution on independent functional units
• Instruction packing reduces code size, program fetches, and power
consumption
• Conditional execution of most instructions reduces costly branching and
increases parallelism for higher sustained performance
• 8/16/32-bit data support, providing efficient memory support for a
variety of applications
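
Conditional execution is easiest to picture on a short if/else. The following is a minimal sketch in portable C; the helper name clip_to_range is hypothetical and not taken from the slides. Each condition guards a single assignment, which is the kind of code a compiler for a predicated machine may turn into conditionally executed instructions instead of branches. Whether that actually happens depends on the compiler and its options.

```c
#include <stdint.h>

/* Hypothetical helper (assumption, not from the slides): clamp x to [lo, hi].
 * Each test guards exactly one assignment, so a machine that can predicate
 * instructions does not need a branch for either of them. */
int32_t clip_to_range(int32_t x, int32_t lo, int32_t hi)
{
    int32_t y = x;
    if (y < lo) y = lo;   /* candidate for a conditionally executed move */
    if (y > hi) y = hi;   /* likewise */
    return y;
}
```

For example, clip_to_range(42, 0, 10) yields 10 and clip_to_range(-3, 0, 10) yields 0.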

Features of C6000 devices (continued)
• 40-bit arithmetic options add extra precision for vocoders and other
computationally intensive applications
• Saturation and normalization provide support for key arithmetic
operations
• Bit-field set and clear and bit counting support common operations
found in control and data manipulation applications.
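
What saturation, normalization, and bit counting compute can be sketched in plain C. These are software models for illustration only, not TI intrinsics; the function names are assumptions. Normalization is taken here to mean the number of redundant sign bits, that is, how far a value can be shifted left without changing its sign.

```c
#include <stdint.h>

/* Saturating 32-bit add: clamp to the int32 range instead of wrapping. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a + (int64_t)b;   /* cannot overflow in 64 bits */
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

/* Normalization count: number of redundant sign bits in x (0..31). */
int norm32(int32_t x)
{
    uint32_t u = (uint32_t)x;
    int n = 0;
    if (x < 0) u = ~u;                      /* make the sign bits zeros */
    for (int bit = 30; bit >= 0; --bit) {   /* bit 31 is the sign itself */
        if (u & (1u << bit)) break;
        ++n;
    }
    return n;
}

/* Bit count: number of 1 bits in a 32-bit word. */
int bitcount32(uint32_t x)
{
    int n = 0;
    while (x) {
        n += (int)(x & 1u);
        x >>= 1;
    }
    return n;
}
```

In this model, sat_add32(INT32_MAX, 1) stays at INT32_MAX, and norm32(1) is 30 because a 1 can be shifted left 30 places before it reaches the sign bit.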

Additional features of C674x devices:
• Each multiplier can perform two 16 × 16-bit or four 8 × 8 bit
multiplies every clock cycle.
• Quad 8-bit and dual 16-bit instruction set extensions with data flow
support
• Support for non-aligned 32-bit (word) and 64-bit (double word)
memory accesses
• Special communication-specific instructions have been added to
address common operations in error-correcting codes.
• Bit count and rotate hardware extends support for bit-level
algorithms.
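
The "two 16 × 16-bit multiplies per cycle" idea can be modelled in portable C as one operation on packed operands. This is a sketch under stated assumptions: the names dual_mul16 and product_pair are hypothetical, and the packing (one signed 16-bit value in each half of a 32-bit word) is chosen for illustration.

```c
#include <stdint.h>

/* Two 32-bit products, one per 16-bit lane (hypothetical type). */
typedef struct {
    int32_t lo;   /* product of the lower halves */
    int32_t hi;   /* product of the upper halves */
} product_pair;

/* Sign-extend the low 16 bits of v to 32 bits. */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Model of a dual 16 x 16 signed multiply on packed operands. */
product_pair dual_mul16(uint32_t a, uint32_t b)
{
    product_pair p;
    p.lo = sext16(a) * sext16(b);
    p.hi = sext16(a >> 16) * sext16(b >> 16);
    return p;
}
```

With a = 0xFFFE0003 (halves -2 and 3) and b = 0x00050007 (halves 5 and 7), the model gives hi = -10 and lo = 21.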

Additional features of C674x devices (continued)
• Compact instructions: Common instructions (AND, ADD, LD, MPY)
have 16-bit versions to reduce code size.
• Each multiplier can perform 32 × 32 bit multiplies
• Additional instructions to support complex multiplies allowing up to
eight 16-bit multiply/add/subtracts per clock cycle
• Hardware support for modulo loop operation to reduce code size
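
A complex multiply on 16-bit components needs four multiplies plus one add and one subtract, which is why dedicated instructions help. The sketch below is a plain C model, not a TI intrinsic. The packing (real part in the upper half, imaginary part in the lower half) and the saturation of the single overflow corner case are assumptions made for this example.

```c
#include <stdint.h>

typedef struct {
    int32_t re;
    int32_t im;
} complex32;   /* hypothetical result type */

/* Sign-extend the low 16 bits of v to 32 bits. */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Clamp a 64-bit intermediate to the int32 range. */
static int32_t clamp32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* (a_re + j*a_im) * (b_re + j*b_im) on packed 16-bit components. */
complex32 cmul16(uint32_t a, uint32_t b)
{
    int64_t a_re = sext16(a >> 16), a_im = sext16(a);
    int64_t b_re = sext16(b >> 16), b_im = sext16(b);
    complex32 r;
    r.re = clamp32(a_re * b_re - a_im * b_im);
    r.im = clamp32(a_re * b_im + a_im * b_re);
    return r;
}
```

For instance, (1 + 2j)(3 + 4j), packed as 0x00010002 and 0x00030004, gives re = -5 and im = 10.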

Additional features to support floating-point computation
• Hardware support for single-precision (32-bit) and double-precision
(64-bit) IEEE floating-point operations.
• Execute packets can span fetch packets.
• Register file size is increased to 64 registers (32 in each data path).
• Floating-point addition and subtraction capability in the .S unit.
• Mixed-precision multiply instructions.
• 32 × 32-bit integer multiply with 32-bit or 64-bit result.

TMS320C674x DSP (CPU) block diagram [figure not reproduced]

The C674x CPU contains:
• Program fetch unit
• 16/32-bit instruction dispatch unit with advanced instruction packing
• Instruction decode unit
• Two data paths, each with four functional units
• Sixty-four 32-bit registers
• Control registers
• Control logic
• Test, emulation, and interrupt logic
• Internal DMA (IDMA) for transfers between internal memories

Operation
• The program fetch, instruction dispatch, and instruction decode units
can deliver up to eight 32-bit instructions to the functional units every
CPU clock cycle.
• The data paths execute these instructions using the functional units
and register files.

Data path
• Two general-purpose register files (A and B)
• Eight functional units (.L1, .L2, .S1, .S2, .M1, .M2, .D1, and .D2)
• Two load-from-memory data paths (LD1 and LD2)
• Two store-to-memory data paths (ST1 and ST2)
• Two data address paths (DA1 and DA2)
• Two register file data cross paths (1X and 2X)

General-Purpose Register Files
• A and B are the two general-purpose register files; all registers are
32 bits wide.
• Each file contains 32 registers, identified as A31 – A0 and B31 – B0.
• Registers can be used as data registers, data address pointers, or
condition registers.
• Supported data ranges from packed 8-bit data through 64-bit fixed-point
data.
• Values wider than 32 bits (40-bit or 64-bit) are stored in register pairs.
• A colon denotes a register pair, written with the odd-numbered register
first, e.g. A1:A0 or B3:B2.
• The 32 LSBs are held in the even-numbered register; the remaining MSBs
are held in the odd-numbered register.
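
The register-pair convention can be sketched in C as a split of a 64-bit value into two 32-bit halves. The type and function names are hypothetical; the only point taken from the text is that the low 32 bits go to the even register and the high bits to the odd register of a pair such as A1:A0.

```c
#include <stdint.h>

/* Model of a register pair such as A1:A0 (hypothetical type). */
typedef struct {
    uint32_t even;   /* e.g. A0: the 32 least significant bits */
    uint32_t odd;    /* e.g. A1: the 32 most significant bits  */
} reg_pair;

/* Split a 64-bit value across an odd:even register pair. */
reg_pair split64(uint64_t value)
{
    reg_pair p;
    p.even = (uint32_t)(value & 0xFFFFFFFFu);
    p.odd  = (uint32_t)(value >> 32);
    return p;
}

/* Rebuild the 64-bit value from the pair. */
uint64_t join64(reg_pair p)
{
    return ((uint64_t)p.odd << 32) | p.even;
}
```

Splitting 0x1122334455667788 puts 0x55667788 in the even register and 0x11223344 in the odd one.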

Functional units
• The units are split into two groups of four; the units in one group are
almost identical to those in the other.
• Most data lines support 32-bit operands; some support 40-bit long and
64-bit doubleword operands.
• Each functional unit has its own 32-bit write port. All eight functional
units can access the register files in parallel in one cycle.
• Units whose names end in 1 (e.g. .L1) access register file A; those
ending in 2 access register file B.

Fixed-Point Operations of .L1 and .L2
• 32/40-bit arithmetic and compare operations
• 32-bit logical operations
• Leftmost 1 or 0 counting for 32 bits
• Normalization count for 32 and 40 bits
• Byte shifts
• Data packing/unpacking
• 5-bit constant generation
• Dual 16-bit arithmetic operations
• Quad 8-bit arithmetic operations
• Dual 16-bit minimum/maximum operations
• Quad 8-bit minimum/maximum operations
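
Quad 8-bit and dual 16-bit operations treat one 32-bit register as several independent lanes. The plain C models below illustrate the idea; the names add4 and max2 are used here only as labels for this sketch, and the lane behavior (wrap-around per byte for the add, signed comparison per halfword for the maximum) is an assumption made for the example.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of v to 32 bits. */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Quad 8-bit add: four independent byte lanes, no carry between lanes. */
uint32_t add4(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (unsigned shift = 0; shift < 32; shift += 8) {
        uint32_t lane = (((a >> shift) & 0xFFu) + ((b >> shift) & 0xFFu)) & 0xFFu;
        r |= lane << shift;
    }
    return r;
}

/* Dual 16-bit signed maximum: one result per halfword lane. */
uint32_t max2(uint32_t a, uint32_t b)
{
    int32_t hi_a = sext16(a >> 16), hi_b = sext16(b >> 16);
    int32_t lo_a = sext16(a),       lo_b = sext16(b);
    int32_t hi = hi_a > hi_b ? hi_a : hi_b;
    int32_t lo = lo_a > lo_b ? lo_a : lo_b;
    return (((uint32_t)hi & 0xFFFFu) << 16) | ((uint32_t)lo & 0xFFFFu);
}
```

Note how add4(0x01FF7F10, 0x01010101) gives 0x02008011: the 0xFF lane wraps to 0x00 without disturbing its neighbor.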

Floating-Point Operations of .L1 and .L2
• Arithmetic operations
• DP → SP conversion operations
• INT → DP conversion operations
• INT → SP conversion operations

Fixed-Point Operations of .S1 and .S2
• 32-bit arithmetic operations / 32-bit logical operations
• 32/40-bit shifts and 32-bit bit-field operations
• Branches
• Constant generation
• Register transfers to/from control register file (.S2 only)
• Byte shifts
• Data packing/unpacking
• Dual 16-bit and quad 8-bit compare operations
• Dual 16-bit shift operations
• Dual 16-bit and quad 8-bit saturated arithmetic operations
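
A bit-field operation pulls a run of bits out of a word, optionally sign-extending it. A minimal portable C sketch follows; the function names and the (lsb, width) parameterization are assumptions for illustration and do not mirror any particular instruction encoding.

```c
#include <stdint.h>

/* Extract `width` bits starting at bit `lsb`, zero-extended.
 * Preconditions: 1 <= width <= 32 and lsb + width <= 32. */
uint32_t extract_unsigned(uint32_t value, unsigned lsb, unsigned width)
{
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (value >> lsb) & mask;
}

/* Same field, but sign-extended from its top bit. */
int32_t extract_signed(uint32_t value, unsigned lsb, unsigned width)
{
    uint32_t field = extract_unsigned(value, lsb, width);
    uint32_t sign  = 1u << (width - 1);
    /* (field XOR sign) - sign maps the field onto its signed value. */
    return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}
```

For example, extract_unsigned(0xABCD1234, 8, 8) returns 0x12, and extract_signed(0x0000F000, 12, 4) returns -1 because the 4-bit field is all ones.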

Floating-Point Operations of .S1 and .S2
• Compare
• Reciprocal and reciprocal square-root operations
• Absolute value operations
• SP → DP conversion operations
• SP and DP adds and subtracts
• SP and DP reverse subtracts (src2 - src1)
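
Hardware reciprocal operations typically return an approximation that software then refines. A common refinement is Newton-Raphson iteration, x_next = x * (2 - a * x), sketched below in portable C. The starting estimate is passed in as a parameter because this sketch does not model the hardware operation itself; the function name and the iteration count are assumptions.

```c
/* Refine an approximate reciprocal of `a` by Newton-Raphson iteration.
 * `estimate` stands in for the value a hardware reciprocal operation
 * would supply (assumption: it is already reasonably close to 1/a).
 * Each iteration roughly squares the relative error. */
float refine_reciprocal(float a, float estimate, int iterations)
{
    float x = estimate;
    for (int i = 0; i < iterations; ++i) {
        x = x * (2.0f - a * x);
    }
    return x;
}
```

Starting from 0.251 as an estimate of 1/4, two iterations bring the result to within single-precision rounding of 0.25.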

Fixed-Point Operations of .M1 and .M2
• 32 × 32-bit multiply operations / 16 × 16-bit multiply operations
• 16 × 32-bit multiply operations
• Quad 8 × 8-bit multiply operations / Dual 16 × 16-bit multiply operations
• Dual 16 × 16-bit multiply with add/subtract operations
• Quad 8 × 8-bit multiply with add operation
• Bit expansion
• Bit interleaving/de-interleaving
• Variable shift operations
• Rotation
• Galois Field Multiply
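
Galois field multiplication, used in error-correcting codes, replaces ordinary carries with XOR and reduces the result by a field polynomial. The sketch below multiplies two elements of GF(2^8) in portable C using shift-and-XOR. It is a software model only: the function name is hypothetical, and the reduction polynomial is passed as a parameter because the choice of polynomial depends on the code being implemented.

```c
#include <stdint.h>

/* Multiply a and b in GF(2^8).
 * `poly` is the 9-bit reduction polynomial including the x^8 term,
 * e.g. 0x11D or 0x11B (which one to use is application-specific). */
uint8_t gf256_mul(uint8_t a, uint8_t b, uint16_t poly)
{
    uint16_t acc = 0;    /* running XOR of partial products */
    uint16_t aa  = a;    /* a * x^i, kept reduced below x^8 */
    for (int i = 0; i < 8; ++i) {
        if (b & 1u) acc ^= aa;          /* add (XOR) this partial product */
        b >>= 1;
        aa <<= 1;                       /* multiply by x */
        if (aa & 0x100u) aa ^= poly;    /* reduce when degree reaches 8 */
    }
    return (uint8_t)acc;
}
```

With poly = 0x11B, gf256_mul(0x57, 0x83, 0x11B) returns 0xC1, the worked example from the AES specification.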

Floating-Point Operations of .M1 and .M2
• Floating-point multiply operations
• Mixed-precision multiply operations

Fixed-Point Operations of .D1 and .D2
• 32-bit add, subtract, linear and circular address calculation
• Loads and stores with 5-bit constant offset
• Loads and stores with 15-bit constant offset (.D2 only)
• Load and store doublewords with 5-bit constant offset
• Load and store nonaligned words and doublewords
• 5-bit constant generation
• 32-bit logical operations

Floating-Point Operations of .D1 and .D2
• Load doubleword with 5-bit constant offset
