Bachelor of Engineering
in
By
NAVYA CHERUPALLY 100514735020
Internal Guide:
External Guides:
Certificate
(1005-14-735020) (1005-14-735024)
ACKNOWLEDGEMENT
The IEEE 802.3 (Ethernet) standard specifies a networking protocol that allows multiple
devices connected to the network to communicate with each other. The PHY Coding
Sublayer (PCS) is a part of the physical layer of the Ethernet stack which performs
encoding/decoding and error correction and communicates the same to the MAC. On the
egress, the PCS accepts data from the MAC and sends encoded data to the serializer while
on the ingress, the PCS accepts data from the deserializer and provides decoded data to the
MAC.
As speeds of the backplane (BASE-R) Ethernet increase, the BER (Bit Error Rate) also goes
up. In the absence of a FEC scheme, the time taken by the Ethernet subsystem to identify
errors is large. It would be advantageous to detect and correct a reasonable subset of possible
errors as early as possible to greatly reduce the occurrence of retransmission which in turn
increases the efficiency of the link. By adding a FEC scheme like Reed Solomon coding to
traditional Ethernet, effective BER is significantly reduced. To allow error correction as
early as possible in the stack, the PCS sublayer must implement the FEC.
The RS-FEC scheme implemented corrects up to 7 symbol errors (as many as 70 bit errors) in
every 5280-bit codeword (a BER of up to approximately 13 x 10^-3). A method has also been
proposed to predict the variable ingress and egress latency due to transcoding and gearbox
conversion, which will be useful in MAC timestamping.
CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1
INTRODUCTION
1.1 Introduction
1.2 Objective
1.3 Motivation
1.4 Information and Coding Theory
1.5 Hardware solution
1.6 Superiority of RSFEC
1.7 Overview of Report
CHAPTER 2
PHYSICAL CODING SUBLAYER (PCS)
2.1 INTERCONNECTION BETWEEN MODELS & LAYERS
2.2 CODES & BLOCK STRUCTURES
2.3 TRANSMISSION
2.4 RECEPTION
CHAPTER 3
GALOIS FIELD & CYCLIC CODES
3.1 Galois Field (GF)
3.1.1 Properties of Galois Field
3.1.2 Galois Field GF (2^m)
3.1.3 Representation of Galois Field Elements
3.1.4 Basis of Galois Field GF (2^m)
3.1.5 Implementation of GF (2^m) Arithmetic
3.1.6 Russian Peasant Multiplication algorithm
3.2 Cyclic Codes
3.2.1 Description
3.2.2 Code words in Polynomial Forms
3.2.3 Generator Polynomial of a Cyclic Code
3.2.4 Generation of Cyclic Codes in Systematic Form
3.3 Properties of Reed-Solomon Codes
3.4 Applications of Reed-Solomon Codes
CHAPTER 4
REED SOLOMON ENCODING
4.1 Rate compensation for codeword markers in the transmit direction
4.2 64B/66B to 256B/257B transcoder
4.3 Codeword marker insertion
4.4 Reed-Solomon Encoding
4.5 Systematic Encoding
4.6 Reed-Solomon encoder
CHAPTER 5
REED SOLOMON DECODING
5.1 Lock FSM
5.2 Error Detection “Syndrome Calculation”
5.3 The Decoding Algorithm
5.3.1 Decoding of RS Codes Using Berlekamp-Massey Algorithm
5.4 Chien Search Calculation
5.5 Forney Algorithm
5.6 256B/257B to 64B/66B transcoder
CHAPTER 6
IMPLEMENTATION & VERIFICATION
6.1 DESIGN OF RSFEC
6.2 REGISTER DESCRIPTION
6.3 VERIFICATION
6.3.1 TESTBENCH ARCHITECTURE
6.3.2 SIMULATION RESULTS
6.4 SYNTHESIS REPORT
CHAPTER 7
CONCLUSION & SCOPE
7.1 Conclusion
7.2 Future Scope
ANNEX - RS-FEC code word examples
A.1 Input to the 64B/66B to 256B/257B transcoder
A.2 Output of the RS (528,514) Encoder
REFERENCES
LIST OF TABLES
Table 5-1 : B–M algorithm table for determining the error-location polynomial
LIST OF FIGURES
Figure 4-4-1 : The information bit sequence divided into symbols.
Figure 4-5-1 : A codeword is formed from message and parity symbols.
CHAPTER 1
INTRODUCTION
1.1 Introduction
The IEEE 802.3 (Ethernet) standard specifies a networking protocol that allows
multiple devices connected to the network to communicate with each other. The Physical
Coding Sublayer (PCS) is a part of the physical layer of the Ethernet model. It resides at the
top of the physical layer (PHY) and provides an interface between the Physical Medium
Attachment (PMA) sublayer and the Media Independent Interface (MII). It is responsible for
data encoding/decoding, scrambling/descrambling, alignment marker insertion/removal, block
and symbol redistribution, and lane block synchronization and deskew. It transmits the data in
a protected form at the physical level, which matters in today's internet, where attackers try
to steal information. In addition to these functions, the data from the PCS is encoded using a
Forward Error Correction (FEC) technique called Reed Solomon FEC (RS-FEC).
This encoding and decoding is done in a separate sublayer called the FEC sublayer.
For 10G Ethernet (speeds up to 10 gigabits per second), FEC is generally implemented using
KR-FEC, which is based on Fire codes. As bit rates increase, however, the error probability
or Bit Error Rate (BER) on the lane also increases, so KR-FEC can no longer be used and
RS-FEC must be employed. RS-FEC is based on Reed Solomon codes, which have many
applications in areas other than Ethernet, such as CDs, DVDs, Blu-ray Discs, QR codes, data
transmission technologies such as DSL and WiMAX, broadcast systems such as DVB and
ATSC, and storage systems such as RAID 6. They are also used in satellite communication.
This report focuses on our implementation of the RS-FEC sublayer together with the
PCS sublayer, a leading-edge technology in the Ethernet industry.
1.2 Objective
The objectives are to improve the error correction capability of IEEE 802.3 Ethernet by
using Reed Solomon FEC coding; to use the inversionless Berlekamp Massey algorithm, which
removes the need for complex inverters and thereby reduces the complexity of the decoder
circuit; and to reduce the data rate variation that is a main concern for MAC timestamping,
thereby providing accurate timestamps.
1.3 Motivation
To design the RS-FEC sublayer we used the RS (528,514) code with m = 10 and 2t = 14
check symbols. The symbols are elements of the Galois field GF (2^10) and are 10 bits wide.
The 14 check symbols added to each codeword block provide an error correcting capability
of t = 7 symbols per codeword. This can be seen as a correcting capability of up to 70 bits in
every 5280 bits. This is a huge improvement in the tolerable bit error rate, as the code can
handle a BER of up to about 13 x 10^-3. The RS-FEC layer is very important in Ethernet above
25G as it corrects errors as early as possible, thereby reducing the retransmission requests
from the MAC, which in turn reduces the delay in transmission.
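The figures quoted for the RS (528,514) code can be rechecked with a few lines of arithmetic. The following is a minimal sketch; the variable names are chosen here for illustration, and the bit-level figure assumes the best case in which all bit errors fall inside the t correctable symbols.

```python
# Parameters of the RS(528,514) code as stated in the text.
n = 528   # codeword length in symbols
k = 514   # message length in symbols
m = 10    # bits per symbol (symbols are elements of GF(2^10))

parity_symbols = n - k             # check symbols per codeword (2t)
t = parity_symbols // 2            # correctable symbol errors per codeword
codeword_bits = n * m              # bits per codeword
max_correctable_bits = t * m       # best case: errors confined to t symbols
code_rate = k / n
ber_limit = max_correctable_bits / codeword_bits

print(parity_symbols, t, codeword_bits, max_correctable_bits)  # 14 7 5280 70
print(round(code_rate, 4), round(ber_limit, 4))                # 0.9735 0.0133
```

The ratio 70/5280 is about 0.0133, which is the 13 x 10^-3 figure quoted above.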
Table 1-1
MESSAGES   CODEWORDS
S0         00
S1         01
S2         10
S3         11

Table 1-2
MESSAGES   CODEWORDS   CODEWORDS WITH PARITY
S0         00          0000
S1         01          0101
S2         10          1010
S3         11          1111
So another coding block is needed called channel coding which adds parity check bits
to each message to make a distance between valid code words as shown in Table 1-2. When
we increase the parity check length the distance between each two code words is increased
and the probability of error is decreased but the effective rate is decreased, so it is a trade-
off between the rate and the probability of error. A block diagram of a communication
system as related to information theory is shown in Figure 1-1.
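The trade-off between distance and rate can be made concrete with the two small codebooks listed for the messages S0 to S3. The sketch below is illustrative only; the helper names are not from the report.

```python
from itertools import combinations

def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length code words differ."""
    return sum(x != y for x, y in zip(a, b))

def min_distance(codebook):
    """Smallest Hamming distance between any two distinct code words."""
    return min(hamming_distance(a, b) for a, b in combinations(codebook, 2))

source_code = ["00", "01", "10", "11"]            # no parity bits
channel_code = ["0000", "0101", "1010", "1111"]   # two parity bits per message

d_source = min_distance(source_code)     # 1: a single bit error yields another valid word
d_channel = min_distance(channel_code)   # 2: any single bit error is detectable
rate_source = 2 / 2                      # message bits / coded bits
rate_channel = 2 / 4

print(d_source, d_channel, rate_source, rate_channel)
```

Adding the two parity bits doubles the minimum distance but halves the effective rate, which is the trade-off described above.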
The block diagram in Figure 1-1 shows two types of encoders/decoders:
• Source encoder/decoder
• Channel encoder/decoder
In recent years there has been increasing interest in the reliability of data transmission
and storage media, because a single error can damage the whole system through an
unacceptable corruption of the data, e.g. in a bank account. The simplest way of
detecting a single error is a parity check sum. In some applications, however, this method
is not sufficient and different methods must be implemented.
If the transmission system transmits data in both directions, errors can be controlled
by detecting an error and, if one has occurred, retransmitting the corrupted data. These
systems are called Automatic Repeat Request (ARQ). If the system transfers data in only
one direction, e.g. information recorded on a compact disk, the only way to control errors
is with Forward Error Correction (FEC). In FEC systems some redundant data is
concatenated with the information data in order to allow for the detection and correction
of the corrupted data without having to retransmit it.
Error control coding consists of two inverse operations. The first adds redundancy
bits to the message to form a code word; this operation is called encoding. The second
removes the redundancy bits from the code word to recover the message; this operation is
called decoding.
These types of codes are called block codes and are denoted by C (n, k). The rate of
the code is R = k/n, where k is the number of message bits and n is the number of coded
bits. The 2^k possible messages are converted into code words of n bits. This encoding
procedure can be understood as a conversion from a message vector of k bits, located in a
space of size 2^k, to a coded vector of n bits in a space of size 2^n, in which only 2^k
vectors are selected to be valid code words.
Linear block codes are the most common codes used in channel coding. In this
technique, message words are arranged as blocks of k bits, constituting a set of 2^k
possible messages. The encoder takes each block of k bits and converts it into a longer
block of n > k bits, called the coded bits or the bits of the code word. In this procedure
the encoder adds (n − k) bits to the message word, which are usually called redundant bits
or parity check bits. The code words generated by the encoder are linearly related: the sum
of any two code words is itself a code word, which is why these codes are called linear
block codes.
In systematic form, the first k bits are the message bits unchanged and the remaining
(n − k) bits are the parity check or redundancy bits. The structure of a codeword in
systematic form is shown in Figure 1-3-1.
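A small binary example may help to fix the ideas of linearity and systematic form. The sketch below assumes a (4,2) code whose generator matrix G = [I | P] is chosen here, purely for illustration, so that it reproduces the code words 0000, 0101, 1010 and 1111; it is not the Reed Solomon code used in this work.

```python
from itertools import product

K, N = 2, 4
# Systematic generator matrix G = [I_k | P]. P is an assumption made for this
# example so that the code matches the codebook 0000, 0101, 1010, 1111.
G = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

def encode(message):
    """Multiply the message row vector by G over GF(2)."""
    return tuple(sum(message[i] * G[i][j] for i in range(K)) % 2 for j in range(N))

def xor(a, b):
    """Bitwise modulo-2 sum of two code words."""
    return tuple(x ^ y for x, y in zip(a, b))

messages = list(product((0, 1), repeat=K))
codebook = {encode(msg) for msg in messages}

# Systematic: the first k bits of every code word are the message itself.
is_systematic = all(encode(msg)[:K] == msg for msg in messages)
# Linear: the sum of any two code words is again a code word.
is_linear = all(xor(a, b) in codebook for a in codebook for b in codebook)

print(sorted(codebook), is_systematic, is_linear)
```

Both checks hold for this code: the message appears unchanged in the first two positions, and the set of code words is closed under modulo-2 addition.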
In this thesis, the message bits are grouped into symbols and encoded with a non-binary
code called the Reed Solomon (RS) code, a special type of cyclic code that is explained
briefly in Chapter 3. The message symbols are placed at the beginning of the codeword,
while the redundancy symbols are placed at the end of the codeword.
FPGAs are customizable logic devices that give a fast solution for specific problems.
They are considered a good step towards an ASIC design, which is the optimum in area,
power consumption, and price. The solution we provide can be implemented using an ASIC
or an FPGA.
• Our implementation also provides accurate timestamps for the MAC, as we used rate
compensators to provide a constant data rate over any period of time.
• It also solves MTTFPA concerns when sending un-encoded 64B/66B data with 5
lane bit interleaving on a 25G lane.
• No auto-negotiation is needed.
• ~4.9 dB of gain at 10^-15 output BER assuming burst errors due to DFE
• ~5.3 dB of gain at 10^-12 output BER assuming random errors
• It can be implemented for 100G on 4 lanes, achieving maximum bandwidth.
The organization of this thesis is as follows. In Chapter 2, the design of the PCS layer
(both TX and RX) is explained. In Chapter 3, Galois fields are discussed together with an
introduction to Reed Solomon (RS) codes, their properties, and their applications. The RS
encoder is then presented in Chapter 4. In Chapter 5, the general architecture of the RS
decoder is discussed, followed by the Berlekamp Massey and Euclidean decoding algorithms.
The decomposed inversionless Berlekamp Massey algorithm is discussed along with a new
architecture, called the decomposed inversionless Berlekamp Massey architecture. In
Chapter 6, our implementation of the RS encoder and decoder is discussed, together with
the simulation results, the synthesis reports, and a comparison between these proposals and
other architectures. Chapter 7 provides the conclusion and scope of the project. The Annex
provides example inputs and outputs of a transcoder.
CHAPTER 2
PHYSICAL CODING SUBLAYER (PCS)
Encoding:
The WIS provides a medium-independent means for the PCS to operate over WAN
links. It creates a 25GBASE-W encoding by encapsulating the encoded data stream from the
25GBASE-R PCS in frames compatible with SONET and SDH transmission formats[1].
The PMA provides a medium-independent means for the PCS to support the use of a
range of physical media[1]. The 25GBASE-R PMA performs the following functions:
a) Mapping of transmit and receive data streams between the PCS or WIS and PMA via the
PMA service interface.
d) Mapping of transmit and receive bits between the PMA and PMD via the PMD service
interface.
The MDI, logically subsumed within each PMD subclause, is the actual medium
attachment for the various supported media[1].
Inter-sublayer interfaces
There are a number of interfaces employed by 25GBASE-R. Some (such as the PMA
service interface) use an abstract service model to define the operation of the interface. The
XGMII has an optional physical instantiation.
Figure 2-1-2 depicts the relationship and mapping of the services provided by all of
the interfaces relevant to 25GBASE-R.
The upper interface of the PCS may connect to the Reconciliation Sublayer through
the XGMII or the PCS may connect to an XGXS sublayer. The XGXS and the
Reconciliation Sublayer provide the same service interface to the PCS. The lower interface
of the PCS may connect to the WIS to support a WAN PMD or to the PMA sublayer to
support a 25GBASE-R LAN PMD[1].
The PCS service interface allows the 25GBASE-R PCS to transfer information to and
from a PCS client. A PCS client is generally the Reconciliation Sublayer or an XGXS
sublayer[1].
The PCS comprises the PCS Transmit, Block Synchronization, PCS Receive, and BER
monitor processes for 25GBASE-R. The PCS shields the Reconciliation Sublayer (and
MAC) from the specific nature of the underlying channel. The PCS transmit channel and
receive channel can each operate in normal mode or, when not attached to a WIS, test-
pattern mode. When the PCS is attached to a WIS, the WIS provides the test-pattern
functionality [1].
When communicating with the XGMII, the PCS uses a four octet-wide, synchronous
data path, with packet delimiting being provided by transmit control signals (TXCn = 1) and
receive control signals (RXCn = 1). When communicating with the PMA or WIS, the PCS
uses a 32-bit wide, synchronous data path that conveys 32 encoded bits. Alignment to
64B/66B block is performed in the PCS. The WIS and PMA sublayers operate independent
of block and packet boundaries. The PCS provides the functions necessary to map packets
between the XGMII format and the PMA service interface format [1].
When the transmit channel is in normal mode, the PCS Transmit process continuously
generates blocks based upon the TXD and TXC signals on the XGMII. The Gearbox
function of the PCS Transmit process then packs the resulting bits into 40-bit transmit data-
units. Transmit data-units are sent to the PMA or WIS service interface via the
PMA_UNITDATA.request or WIS_UNITDATA.request primitive, respectively. When the
WIS is present, the PCS Transmit process also adapts between the XGMII and WIS data
rates by deleting idle characters[1].
Use of blocks
The PCS maps XGMII signals into 66-bit blocks, and vice versa, using a 64B/66B
coding scheme. The synchronization headers of the blocks allow establishment of block
boundaries by the PCS Synchronization process. Blocks are unobservable and have no
meaning outside the PCS. The PCS ENCODER and DECODER blocks generate,
manipulate, and interpret blocks[1].
Blocks consist of 66 bits. The first two bits of a block are the synchronization header
(sync header). Blocks are either data blocks or control blocks. The sync header is 01 for
data blocks and 10 for control blocks.
Thus, there is always a transition between the first two bits of a block. The remainder
of the block contains the payload. The payload is scrambled, and the sync header bypasses
the scrambler. Therefore, the sync header is the only position in the block that always
contains a transition. This feature of the code is used to obtain block synchronization.
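The role of the sync header can be sketched in a few lines. The functions below are illustrative helpers, not part of the implemented design: one classifies a block from its two header bits, and the other lists the bit offsets at which every observed block boundary shows a transition, which is the property that block synchronization relies on. The payload pattern is arbitrary and unscrambled.

```python
def classify_sync_header(first_bit: int, second_bit: int) -> str:
    """Classify a 66-bit block by its sync header bits, in transmission order."""
    header = (first_bit, second_bit)
    if header == (0, 1):
        return "data"
    if header == (1, 0):
        return "control"
    return "invalid"  # 00 and 11 contain no transition

def candidate_alignments(bits, block_len=66):
    """Offsets at which every complete block in `bits` starts with a transition."""
    offsets = []
    for offset in range(block_len):
        starts = range(offset, len(bits) - block_len + 1, block_len)
        if starts and all(bits[s] != bits[s + 1] for s in starts):
            offsets.append(offset)
    return offsets

# Three data blocks with an arbitrary fixed payload pattern (assumed for the demo).
payload = [(i * i + i // 3) % 2 for i in range(64)]
stream = ([0, 1] + payload) * 3

print(classify_sync_header(0, 1), classify_sync_header(1, 0))
print(0 in candidate_alignments(stream))  # the true boundary is always a candidate
```

With a scrambled payload, false candidates become increasingly unlikely as more blocks are observed, while the true boundary always remains a candidate.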
Data blocks contain eight data characters. Control blocks begin with an 8-bit block type
field that indicates the format of the remainder of the block. For control blocks containing
a Start or Terminate character, that character is implied by the block type field. Other control
characters are encoded in a 7-bit control code or a 4-bit O Code. Each control block contains
eight characters[1].
The format of the blocks is shown in Table 2-2-1. In the table, the column labelled
Input Data shows, in abbreviated form, the eight characters used to create the 66-bit block.
These characters are either data characters or control characters and, when transferred across
the XGMII interface, the corresponding TXC or RXC bit is set accordingly. Within the Input
Data column, D0 through D7 are data octets and are transferred with the corresponding TXC
or RXC bit set to zero. All other characters are control octets and are transferred with the
corresponding TXC or RXC bit set to one. The single bit fields (thin rectangles with no
label in Table 2-2-1) are sent as zero and ignored upon receipt [1].
Bits and field positions are shown with the least significant bit on the left. Hexadecimal
numbers are shown in normal hexadecimal. For example, the block type field 0x1e is sent
as 01111000 representing bits 2 through 9 of the 66-bit block. The least significant bit for
each field is placed in the lowest numbered position of the field.
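The bit ordering convention can be checked directly. The short sketch below (the function name is chosen here for illustration) lists the bits of a field with the least significant bit first and reproduces the 0x1e example.

```python
def lsb_first_bits(value: int, width: int = 8) -> str:
    """Return the bits of `value` in transmission order, least significant bit first."""
    return "".join(str((value >> i) & 1) for i in range(width))

block_type = lsb_first_bits(0x1E)
print(block_type)  # 01111000, occupying bits 2 through 9 of the 66-bit block
```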
Control codes
The same set of control characters are supported by the XGMII and the 25GBASE-R PCS.
The representations of the control characters are the control codes. XGMII encodes a control
character into an octet (an eight-bit value). The 25GBASE-R PCS encodes the start and
terminate control characters implicitly by the block type field. The 25GBASE-R PCS
encodes the ordered set control codes using a combination of the block type field and a 4-
bit O code for each ordered set. The 25GBASE-R PCS encodes each of the other control
characters into a 7-bit C code [1].
The control characters and their mappings to 25GBASE-R control codes and XGMII
control codes are specified in Table 2-2-2. All XGMII and 25GBASE-R control code values
that do not appear in the Table 2-2-2 shall not be transmitted and shall be treated as an error
if received.
Ordered Sets
Ordered sets are used to extend the ability to send control and status information over
the link such as remote fault and local fault status. Ordered sets consist of a control character
followed by three data characters. Ordered sets always begin on the first octet of the XGMII.
The sequence ordered set control character is denoted /Q/. An additional ordered set, the
signal ordered set, has been reserved and it begins with another control code. The 4-bit O
field encodes the control code. See Table 2-2-2 for the mappings.
e) The set of eight XGMII characters does not have a corresponding block format in Table
2-2-1.
Idle control characters (/I/) are transmitted when idle control characters are received
from the XGMII. Idle characters may be added or deleted by the PCS to adapt between clock
rates. /I/ insertion and deletion shall occur in groups of 4. /I/s may be added following idle or
ordered sets. They shall not be added while data is being received. When deleting /I/s, the first
four characters after a /T/ shall not be deleted.
To communicate LPI, LPI control character /LI/ is sent continuously in place of /I/. LPI
control characters are transmitted when LPI control characters are received from the XGMII.
LPI characters may be added or deleted by the PCS to adapt between clock rates in a similar
manner to idle control characters. /LI/ insertion and deletion shall occur in groups of four. /LI/s
may only be added following other LPI characters [1].
Start (/S/)
The start control character (/S/) indicates the start of a packet. This delimiter is only
valid on the first octet of the XGMII (TXD<0:7> and RXD<0:7>). Receipt of an /S/ on any
other octet of TXD indicates an error. Block type field values implicitly encode an /S/ as the
fifth or first character of the block. These are the only characters of a block on which a start
can occur.
Terminate (/T/)
The terminate control character (/T/) indicates the end of a packet. Since packets may
be any length, the /T/ can occur on any octet of the XGMII interface and within any character
of the block. The location of the /T/ in the block is implicitly encoded in the block type field.
A valid end of packet occurs when a block containing a /T/ is followed by a control block
that does not contain a /T/.
The ordered set control characters (/O/) indicate the start of an ordered set. There are
two kinds of ordered sets: the sequence ordered set and the signal ordered set (which is
reserved). When it is necessary to designate the control character for the sequence ordered
set specifically, /Q/ will be used. /O/ is only valid on the first octet of the XGMII. Receipt
of an /O/ on any other octet of TXD indicates an error. Block type field values implicitly
encode an /O/ as the first or fifth character of the block. The 4-bit O code encodes the
specific /O/ character for the ordered set.
Sequence ordered sets may be deleted by the PCS to adapt between clock rates. Such
deletion shall only occur when two consecutive sequence ordered sets have been received
and shall delete only one of the two. Only Idles may be inserted for clock compensation.
Signal ordered sets are not deleted for clock compensation.
Error (/E/)
The /E/ is sent whenever an /E/ is received. It is also sent when invalid blocks are
received. The /E/ allows physical sublayers such as the XGXS and PCS to propagate
received errors [1].
2.3 TRANSMISSION
64B/66B transmission code
The relationship of block bit positions to XGMII, PMA, and other PCS constructs is
illustrated in Figure 2–3-1 for transmit and Figure 2–4-1 for receive. These figures illustrate
the processing of a block containing 8 data octets. See Table 2-2-1 for information on how
blocks containing control characters are mapped. Note that the sync header is generated by
the encoder and bypasses the scrambler.
Notation conventions
For values shown as binary, the leftmost bit is the first transmitted bit.
64B/66B encodes 8 data octets or control characters into a block. Blocks containing
control characters also contain a block type field. Data octets are labelled D0 to D7. Control
characters other than /O/, /S/ and /T/ are labelled C0 to C7. The control character for ordered
set is labelled as O0 or O4 since it is only valid on the first octet of the XGMII. The control
character for start is labelled as S0 or S4 for the same reason. The control character for
terminate is labelled as T0 to T7.
Two consecutive XGMII transfers provide eight characters that are encoded into one
66-bit transmission block. The subscript in the above labels indicates the position of the
character in the eight characters from the XGMII transfers.
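As a sketch of how two XGMII transfers become one block, the function below assembles a pure data block from eight data octets. It is a simplified illustration: the function name is hypothetical, the TXC bits are assumed to be all zero (data only), and scrambling of the payload is left out.

```python
def encode_data_block(transfer0: bytes, transfer1: bytes) -> str:
    """Encode two 4-octet XGMII data transfers into a 66-bit data block (unscrambled)."""
    if len(transfer0) != 4 or len(transfer1) != 4:
        raise ValueError("each XGMII transfer carries four octets")
    octets = transfer0 + transfer1          # D0..D7 in order of arrival
    sync_header = "01"                      # sync header of a data block
    payload = "".join(
        "".join(str((octet >> i) & 1) for i in range(8))   # LSB of each octet first
        for octet in octets
    )
    return sync_header + payload

block = encode_data_block(bytes([0x1E, 0x00, 0x00, 0x00]), bytes([0x00, 0x00, 0x00, 0xFF]))
print(len(block), block[:2], block[2:10], block[-8:])
```

The resulting string lists the block bits in transmission order, so position 0 corresponds to TxB<0>.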
Contents of block type fields, data octets and control characters are shown as
hexadecimal values. The LSB of the hexadecimal value represents the first transmitted bit.
For instance, the block type field 0x1e is sent from left to right as 01111000. The bits of a
transmitted or received block are labelled TxB<65:0> and RxB<65:0> respectively where
TxB<0> and RxB<0> represent the first transmitted bit. The value of the sync header is
shown as a binary value. Binary values are shown with the first transmitted bit (the LSB)
on the left.
Transmission order
Block bit transmission order is illustrated in Figure 2-3-1 and Figure 2-4-1. Note that
these figures show the mapping from XGMII to 64B/66B block for a block containing eight
data characters [1].
TRANSMIT PROCESS
The transmit process generates blocks based upon the TXD<31:0> and TXC<3:0>
signals received from the XGMII. Two XGMII data transfers are encoded into each block.
The transmit process generates blocks as specified in the transmit process state diagram.
The contents of each block are contained in a vector tx_coded<65:0>, which is passed to
the scrambler. tx_coded<1:0>contains the sync header and the remainder of the bits contain
the block payload.
Scrambler
There is no requirement on the initial value for the scrambler. The scrambler is run
continuously on all payload bits. The sync header bits bypass the scrambler [1].
Gearbox
The gearbox adapts between the 66-bit width of the blocks and the N-bit width of the
PMA or WIS interface. It receives the 66-bit blocks. When the transmit channel is operating
in normal mode, the gearbox sends N bits of transmit data at a time via
WIS_UNITDATA.request or PMA_UNITDATA.request primitives. The
UNITDATA.request primitives are fully packed with bits. For example, if one block
happened to start with the sync header on bits 0 and 1 of a PMA_UNITDATA.request, then
the last two bits of that block would be on bits 0 and 1 of a PMA_UNITDATA.request and
the next block would begin with a sync header on bits 2 and 3 of that
PMA_UNITDATA.request.
The gearbox functionality is necessary when the optional PMA compatibility interface,
XSBI, is implemented since that interface passes data over an N-bit wide path. It is also
necessary when connecting to a WIS since the WIS processes the data stream with 8-bit
granularity. When neither the WIS nor the XSBI is implemented, the internal data-path
width between the PCS and PMA is an implementation choice. Depending on the path
width, the gearbox functionality may not be necessary.
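The packing behaviour of the transmit gearbox can be sketched as follows. This is an illustrative model only: the function name is hypothetical, N = 32 is assumed for the interface width, and the incomplete tail that real hardware would hold until the next block arrives is simply dropped.

```python
def gearbox_pack(blocks, width=32):
    """Pack 66-bit blocks (bit strings) back to back into `width`-bit data units."""
    stream = "".join(blocks)
    full = len(stream) - len(stream) % width   # drop the incomplete tail unit
    return [stream[i:i + width] for i in range(0, full, width)]

block_a = "01" + "0" * 64   # data block, sync header 01
block_b = "10" + "1" * 64   # control block, sync header 10
units = gearbox_pack([block_a, block_b])

print(len(units))       # number of complete 32-bit units
print(units[0][:2])     # sync header of the first block on bits 0 and 1
print(units[2][:4])     # last two bits of block A, then the sync header of block B
```

Because 66 = 2 x 32 + 2, the last two bits of the first block land on bits 0 and 1 of the third unit and the next sync header starts on bits 2 and 3, as described in the text.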
2.4 RECEPTION
Receive process
Gearbox
The gearbox adapts between the N-bit width of the PMA or WIS interface and the 66-bit
width of the blocks. It receives N-bit data units from the PMA or WIS service interface.
When the receive channel is operating in normal mode, the gearbox reassembles these units
into 66-bit blocks for the rest of the receive process.
Descrambler
The descrambler processes the payload to reverse the effect of the scrambler using the
same polynomial. It shall produce the same result as the implementation shown in Figure
2–4-2.
Decoder
The receive process decodes blocks to produce RXD<31:0> and RXC<3:0> for
transmission to the XGMII. Two XGMII data transfers are decoded from each block. Where
the XGMII and PMA sublayer data rates are not synchronized to a 16:33 ratio, the receive
process will insert idles, delete idles, or delete sequence ordered sets to adapt between rates.
The WIS data rate is always slower than the XGMII data rate and a PCS connected to a WIS
will insert idles to adapt between the rates [1].
Conclusion
In this chapter we showed how the PCS layer works in the Ethernet stack and described its
encoding and decoding functions, which make the link more robust at the physical level. In the
next chapter we introduce Galois fields, which form the basis of Reed Solomon codes, and the
theory behind Reed Solomon codes.
CHAPTER 3
Addition in GF (2):
+ | 0  1
0 | 0  1
1 | 1  0

Multiplication in GF (2):
X | 0  1
0 | 0  0
1 | 0  1
where the coefficients $f_0, \ldots, f_k$ are elements of GF (2), i.e. each can take only the value 0 or 1.
A binary number of (K + 1) bits can be represented as a polynomial of degree K by taking the
coefficients equal to the bits and the exponents of X equal to the bit locations. In the polynomial
representation, a multiplication by X represents a shift to the right [4].
For example, the binary number 10011 is equivalent to the following polynomial:
$10011 \leftrightarrow 1 + 0X + 0X^2 + X^3 + X^4$
The first bit ("position zero", the coefficient of $X^0$) is equal to 1, the second
bit ("position one", the coefficient of $X$) is equal to 0, the third bit ("position two", the
coefficient of $X^2$) is equal to 0, and so on.
For example, assume we take the elements {0, 1, 2, 3, 4, 5, 6, 7} with arithmetic modulo 8
as a candidate for GF (8). This set cannot be considered a Galois field for the following reasons:
• Not every non-zero element has a multiplicative inverse (e.g., 6 has no
inverse).
• Multiplication does not give unique results for some elements (e.g.,
4 ∗ 1 = 4 ∗ 3 = 4).
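Both objections can be checked by brute force. The short Python sketch below assumes that the arithmetic intended in the example is ordinary integer arithmetic modulo 8; the variable names are illustrative only.

```python
n = 8

# For every non-zero element, collect the elements that multiply with it to 1.
inverses = {a: [b for b in range(1, n) if (a * b) % n == 1] for a in range(1, n)}
no_inverse = [a for a, inv in inverses.items() if not inv]

print("elements without an inverse:", no_inverse)
print("4*1 mod 8 =", (4 * 1) % n, " 4*3 mod 8 =", (4 * 3) % n)
```

A true field with eight elements therefore has to be built differently, from polynomials over GF (2) reduced by a primitive polynomial.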
The element $\alpha^{2^m-1}$ will be equal to $\alpha^0$, and higher powers of α will repeat the lower
powers found in the finite field. The best way to understand how to add the powers of
alpha is to examine the case:
$\alpha^{2^m-1} = \alpha^0 = 1$
Since in GF ($2^m$) algebra, plus (+) and minus (-) are the same, the last one can be
represented as follows:
$\alpha^{2^m-1} + 1 = 0$
$X^7 + 1 = (X + 1)(X^3 + X + 1)(X^3 + X^2 + 1)$
Both the polynomials of degree 3 are primitive and can be chosen, so let's choose
the polynomial shown in equation 3.2
$p(X) = X^3 + X + 1$ (3.2)
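This polynomial has no solution in the binary field. The primitive element α is a solution of
the primitive polynomial, so equation 3.2 is converted to the equation
$\alpha^3 = 1 + \alpha$
$\alpha^4 = \alpha \cdot \alpha^3 = \alpha(1 + \alpha) = \alpha + \alpha^2$
$\alpha^5 = \alpha \cdot \alpha^4 = \alpha(\alpha + \alpha^2) = \alpha^2 + \alpha^3 = \alpha^2 + (1 + \alpha) = 1 + \alpha + \alpha^2$
$\alpha^6 = \alpha \cdot \alpha^5 = \alpha + \alpha^2 + \alpha^3 = \alpha + \alpha^2 + (1 + \alpha) = 1 + \alpha^2$
Note that $\alpha^7 = \alpha^0$, and therefore the eight finite field elements ($2^m = 2^3 = 8$) of GF
($2^3$), generated by equation (3.2), are $\{0, \alpha^0, \alpha^1, \alpha^2, \alpha^3, \alpha^4, \alpha^5, \alpha^6\}$. All elements
from $\alpha^3$ to $\alpha^6$ are expressed as functions of $\alpha^0$, $\alpha$ and $\alpha^2$, which are called the basis of the Galois
field; this is discussed in detail in the following sections.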
$P(X) = 1 + X^2 + X^3 + X^4 + X^8$ (3.4)
As we said in the previous section, there is a one-to-one mapping between
polynomials over GF (2) and binary numbers. Similarly, in GF ($2^m$) there is a one-to-one
mapping between polynomials over GF ($2^m$) and symbols of length m.
The following table shows three different ways to represent elements in GF ($2^3$):

Power form    Polynomial form    Binary form (α^0, α^1, α^2)
0             0                  000
α^0           1                  100
α^1           α                  010
α^2           α^2                001
α^3           1 + α              110
α^4           α + α^2            011
α^5           1 + α + α^2        111
α^6           1 + α^2            101
The first column of Table 3-3 represents the powers of α. The second column shows
the polynomial representation of the field elements. This polynomial representation is
obtained from equation 3.5. The last column of Table 3-3 is the binary
representation of the field elements [8], where the coefficients of $\alpha^0$, $\alpha^1$ and $\alpha^2$, taken from
the second column, are written as binary digits. These three powers form the basis of the field,
which is discussed in detail in the next section.
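The three representations can be generated mechanically by repeated multiplication by α. The Python sketch below assumes the polynomial of equation 3.2 and stores each element as a 3-bit integer whose bit i is the coefficient of $\alpha^i$; the function names and the letter `a` used for α in the printed output are illustrative choices.

```python
PRIM = 0b1011  # p(X) = X^3 + X + 1; bit i holds the coefficient of X^i


def gf8_powers():
    """Return [alpha^0, ..., alpha^6] as 3-bit integers."""
    powers, elem = [], 1
    for _ in range(7):
        powers.append(elem)
        elem <<= 1            # multiply by alpha
        if elem & 0b1000:     # an alpha^3 term appeared: substitute 1 + alpha
            elem ^= PRIM
    return powers


def binary_form(elem):
    """Bits listed in the order alpha^0, alpha^1, alpha^2."""
    return "".join(str((elem >> i) & 1) for i in range(3))


def polynomial_form(elem):
    terms = [name for i, name in enumerate(["1", "a", "a^2"]) if (elem >> i) & 1]
    return " + ".join(terms) if terms else "0"


for power, elem in enumerate(gf8_powers()):
    print(f"a^{power}: {polynomial_form(elem):12s} {binary_form(elem)}")
```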
Assume that $\beta = \{\beta_0, \beta_1, \ldots, \beta_{m-1}\}$ is a basis of GF ($2^m$). Let a be any
general element in the field, denoted by $(a_0, a_1, \ldots, a_{m-1})$; a can be represented as
shown in equation 3.6.
This classification is based on hardware optimization and on the needs of the
application.
• Polynomial Basis
In this type of basis, we choose the first m symbols excluding zero, i.e. we
For example, consider GF ($2^3$) with $p(x) = x^3 + x + 1$. Take a as a root of p(x); then
the polynomial basis of this field will be $\{1, a, a^2\}$ and all 8 elements can be represented
as:
where $a_i \in$ GF (2). These basis coefficients can be stored in a basis table of the
kind shown in Appendix B.
• Dual Basis
The dual basis is one of the most important types of basis. It is used to obtain
efficient hardware for RS encoders and decoders, as it is used in Galois field multipliers.
In the example below, the dual basis differs from the polynomial basis only in the ordering of the symbols.
For example, consider GF ($2^4$) with primitive polynomial $p(X) = X^4 + X + 1$:
• The standard (polynomial) basis for this field is $\{\alpha^0, \alpha^1, \alpha^2, \alpha^3\}$.
• The dual basis for this field is $\{\alpha^0, \alpha^3, \alpha^2, \alpha^1\}$.
• Normal Basis
A normal basis is useful when we need squaring in our
calculations. If $(a_0, a_1, \ldots, a_{m-1})$ is the normal basis representation of a ∈ GF
($2^m$), then $(a_{m-1}, a_0, a_1, \ldots, a_{m-2})$ is the normal basis representation of $a^2$. This property
makes the hardware more efficient.
The most common operations in Galois field arithmetic are addition and
multiplication.
For example, to multiply 238 by 13, the smaller of the numbers (to reduce the number of
steps), 13, is written on the left and the larger on the right. The left number is progressively
halved (discarding any remainder) and the right one doubled, until the left number is 1:
13    238
 6    476
 3    952
 1   1904
Lines with even numbers in the left column are struck out, and the remaining numbers on
the right are added, giving the answer as 3094:
13    238
 6    476   (struck out)
 3    952
 1 + 1904
     3094
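The same halve-and-double procedure carries over to Galois field multiplication when addition is replaced by XOR and each doubling is reduced by the field polynomial. The Python sketch below shows both versions. The function names are illustrative, and the choice of GF ($2^8$) with the polynomial of equation 3.4 (0x11D in hexadecimal) is an assumption made for this example.

```python
def peasant_mul(a, b):
    """Integer multiplication by repeated halving of a and doubling of b."""
    total = 0
    while a >= 1:
        if a & 1:        # odd row: keep the right-hand number
            total += b
        a >>= 1          # halve, discarding the remainder
        b <<= 1          # double
    return total


def gf256_mul(a, b, poly=0x11D):
    """Same scheme in GF(2^8): add with XOR, reduce each doubling by poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:    # degree 8 reached: subtract (XOR) the field polynomial
            a ^= poly
    return result


print(peasant_mul(13, 238))
print(hex(gf256_mul(0x80, 0x02)))
```

In hardware the loop unrolls into m conditional XOR stages, which is one reason this form is convenient for RS encoders and decoders.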
3.2.1 Description
Cyclic codes are a class of linear block codes, with the advantage of
being easily implemented using sequential logic or shift registers.
Let C be a codeword where $C = (c_0, c_1, \ldots, c_{n-1})$. The ith shifted version of this
codeword is:
$C^{(i)} = (c_{n-i}, c_{n-i+1}, \ldots, c_{n-1}, c_0, c_1, \ldots, c_{n-i-1})$ (3.9)
expression:
The i-position right-shift rotated polynomial is denoted as $C^{(i)}(X)$ and the original
code polynomial as C(X), with the relation shown in equations 3.12 and 3.13:
This polynomial is used in the encoding procedure for a linear cyclic code $C_{cyc}(n, k)$:
a code polynomial can be formed as the product of the message polynomial m(X) and the
generator polynomial g(X), as shown in equation 3.14, and this operation is sufficient to
generate any code polynomial of the code [4].
• Here p(X) is the remainder polynomial of the division in equation 3.16, which
has degree n−k−1 or less, since the degree of g(X) is r = n−k. By reordering
equation 3.17 we obtain equation 3.18, since, as discussed in the previous
section, there is no difference between (+) and (−).
$X^{n-k} m(X) + p(X) = q(X) g(X)$ (3.18)
It is seen that the polynomial $X^{n-k} m(X) + p(X)$ is a code polynomial because
it is a multiple of g(X). In this polynomial, the term $X^{n-k} m(X)$ represents the message
polynomial right shifted n−k positions, and p(X) is the remainder polynomial of this
division and acts as the redundancy polynomial. This procedure allows the code
polynomial to be in systematic form:
where k is the number of data symbols to be encoded, and n is the total number of
code symbols after encoding, called codeword. This means that the RS encoder takes k data
symbols and adds parity symbols (redundancy) of (n − k) symbols to make an n symbol
codeword in systematic form as discussed in the previous section.
where t is the number of symbols that can be corrected with this code, where t can
be expressed as
Equation 3.23 clarifies that for the case of RS codes, we need not more than 2t parity
symbols to correct t symbol errors. For each error, one redundant symbol is used to find
the location of the error in the codeword, and another redundant symbol is used to find
the value of the error.
Let the number of errors with unknown locations be $n_{errors}$ and the number of errors
with known locations (erasures) be $n_{erasures}$. The RS algorithm guarantees to correct a
codeword provided that the following is true:
$2 n_{errors} + n_{erasures} \le 2t$ (3.24)
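Condition 3.24 is easy to express as a check. The Python sketch below uses a hypothetical helper `is_correctable` and tries it on the RS (255, 223) code, for which n − k = 2t = 32.

```python
def is_correctable(n, k, n_errors, n_erasures):
    """True when 2*n_errors + n_erasures <= 2t, with 2t = n - k parity symbols."""
    two_t = n - k
    return 2 * n_errors + n_erasures <= two_t


print(is_correctable(255, 223, 16, 0))   # 16 errors, no erasures
print(is_correctable(255, 223, 17, 0))   # one error too many
print(is_correctable(255, 223, 10, 12))  # mixed errors and erasures
```

An erasure costs one parity symbol because only its value is unknown, whereas an error costs two because both its location and its value must be found.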
Keeping the same symbol size m, RS codes may be shortened by making a number
of data symbols zero at the encoder, not transmitting them, and then re-inserting them
at decoder[7]. For example, the RS (255, 239) code with (m = 8) can be shortened to RS
(200, 184) with the same m = 8. The encoder takes a block of 184 data bytes, then adds
55 zero bytes, creates a RS (255, 239) code word and transmits only the 184 data bytes
and 16 parity bytes.
The main advantage of RS codes is that they perform well against burst noise.
Consider a popular Reed-Solomon code RS (255, 223), where each symbol is made up
of m = 8 bits. Since (n − k) = 32, Equation 3.23 indicates that this code can correct any
16 symbol errors in a codeword of 255 bytes. Now assume that a noise burst lasting
128 bits affects one codeword during transmission, as shown in Figure
3-3-1.
In this example, a burst of noise that lasts for a duration of 128 contiguous bits
corrupts exactly 16 symbols. The RS decoder for the (255, 223) code will correct any
16 symbol errors regardless of the type of damage suffered by the symbol. When the
decoder corrects a byte, it replaces the incorrect byte by the correct one, whether the
error was caused by one bit being corrupted or all eight bits being corrupted [7]. Thus if
a symbol is wrong, it might as well be wrong in all its bit positions. That is why RS
codes are widely used: they have a high capacity to correct burst errors.
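The number of symbols touched by a burst depends on where the burst starts relative to the symbol boundaries. The Python sketch below counts the affected symbols; `symbols_hit` is a hypothetical helper, and the assumption that the burst in the example starts on a symbol boundary is ours.

```python
def symbols_hit(start_bit, burst_len, m=8):
    """Number of m-bit symbols touched by burst_len contiguous bits."""
    first = start_bit // m
    last = (start_bit + burst_len - 1) // m
    return last - first + 1


print(symbols_hit(0, 128))  # burst aligned to a symbol boundary
print(symbols_hit(3, 128))  # same burst shifted by three bits
```

Under these assumptions an aligned 128-bit burst stays within the 16-symbol capability of RS (255, 223), while a misaligned burst of the same length spills into a seventeenth symbol.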
e.g.:
Conclusion
In this chapter we showed how Galois fields form a basis for Reed Solomon codes and
how Reed Solomon codes are constructed using Galois field elements, followed
by the applications of Reed Solomon codes. In the next chapter we discuss RS
encoding.
CHAPTER 4
The RS-FEC transmit process periodically inserts codeword markers into the
transcoded block stream. In order to maintain the same bit rate after codeword marker
insertion [6], the RS-FEC transmit process shall perform the rate compensation function
described below, or its functional equivalent:
a) Decode the PCS blocks received by descrambling and applying the PCS receive
process to obtain the 25GMII character stream.
b) Delete Idle control characters (/I/), Low Power Idle control characters (/LI/), and
ordered sets, to create room as necessary for the periodically occurring codeword
markers.
c) Re-encode the data stream obtained, by applying the PCS transmit process and
scrambler to obtain 64B/66B blocks.
blocks.
d1) Let tx_payloads<(64j+63):64j> = tx_coded_j<65:2> for j=0 to 3
e1) Omit tx_coded_c<9:6>, which is the second nibble (based on transmission order) of
the block type field for tx_coded_c, from tx_xcoded per the following expressions.
tx_xcoded<(64c+8):5> = tx_payloads<(64c+3):0>
tx_xcoded<256:(64c+9)> = tx_payloads<255:(64c+8)>
where the coefficients $m_0, m_1, \ldots, m_{k-1}$ of the polynomial m(X) are the symbols of the
message block. These coefficients are elements of GF ($2^m$). So the information
sequence is mapped into a polynomial by setting the coefficients equal to the symbol
values.
For example, consider the Galois field GF ($2^8$), so the information sequence is
divided into symbols of eight consecutive bits as shown in Figure (3-8). The first
symbol in the sequence is 10000000. In the power representation, 10000000 becomes
$\alpha^0$ in GF ($2^8$). Thus, $\alpha^0$ becomes the coefficient of $X^0$. The second symbol is 00100000, so
the coefficient of $X^1$ is $\alpha^2$. The third symbol is 10111111, so the coefficient of $X^2$ is
$\alpha^{80}$, and so on.
Figure 4-5-1 A code word is formed from message and parity symbols.
Applying the polynomial notation, we can shift the information into the leftmost
positions by multiplying by $X^{2t}$, leaving a code word of the form
where C(X) is the code word polynomial, m(X) is the message polynomial and p(X) is
the redundant polynomial.
The parity symbols are obtained from the redundant polynomial p(X), which is the
remainder obtained by dividing $X^{2t} m(X)$ by the generator polynomial, which is
expressed as
So an RS code word is generated using a generator polynomial, which has the property
that all valid codewords are exactly divisible by it. The general
form of the generator polynomial is:
where α is a primitive element in GF ($2^m$), and $g_0, g_1, g_2, \ldots, g_{2t-1}$ are coefficients
from GF ($2^m$). The degree of the generator polynomial is equal to the number of parity
symbols (n − k). Since the generator polynomial is of degree 2t, there must be precisely
2t consecutive powers of α that are roots of this polynomial. We designate the roots of
g(X) as $\alpha, \alpha^2, \ldots, \alpha^{2t}$. It is not necessary to start with the root α, because starting with any
power of α is possible. The roots of a generator polynomial g(X) must also be roots of
the code word generated by g(X), because a valid code word is of the following form:
(4.7)
In the above equation 4.7, α is a primitive element of the finite field defined by the
polynomial $x^{10} + x^3 + 1$. The equation below defines the message polynomial m(x) whose
coefficients are the message symbols $m_{k-1}$ to $m_0$.
(4.8)
The first symbol input to the encoder is $m_{k-1}$. The equation below defines the parity
polynomial p(x) whose coefficients are the parity symbols $p_{2t-1}$ to $p_0$.
(4.9)
The parity polynomial is the remainder from the division of m(x) by g(x). This may
be computed using the shift register implementation illustrated in Figure 4-4-1. The
outputs of the delay elements are initialized to zero prior to the computation of the parity
for a given message. After the last message symbol, $m_0$, is processed by the encoder, the
outputs of the delay elements are the parity symbols for that message.
The code word polynomial c(x) is then the sum of m(x) and p(x), where the coefficient
of the highest power of x, $c_{n-1} = m_{k-1}$, is transmitted first and the coefficient of the lowest
power of x, $c_0 = p_0$, is transmitted last [3]. The first bit transmitted from each symbol is bit 0.
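The shift register division can be modelled in software as follows. This Python sketch assumes RS (528, 514) over GF ($2^{10}$) with the field polynomial $x^{10} + x^3 + 1$ named above, and it assumes that the roots of g(x) are $\alpha^0$ to $\alpha^{2t-1}$; the choice of the first root is our assumption, since equation 4.7 is not reproduced here. All function names are illustrative and this is not the RTL described later in this thesis.

```python
M, N, K = 10, 528, 514
PRIM = 0x409          # x^10 + x^3 + 1
TWO_T = N - K         # 14 parity symbols


def gf_mul(a, b):
    """Bit-serial multiplication in GF(2^10)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= PRIM
    return result


def generator_poly():
    """g(x) = product of (x + alpha^j), j = 0..2t-1 (assumed roots), high to low."""
    g, root = [1], 1
    for _ in range(TWO_T):
        nxt = g + [0]                        # g(x) * x
        for i, coeff in enumerate(g):
            nxt[i + 1] ^= gf_mul(coeff, root)  # + g(x) * root
        g, root = nxt, gf_mul(root, 2)       # alpha is the element x, i.e. 2
    return g


def rs_encode(message):
    """Systematic encoding: message symbols (m_{k-1} first) followed by parity."""
    g = generator_poly()
    parity = [0] * TWO_T                     # delay elements start at zero
    for symbol in message:
        feedback = symbol ^ parity[0]
        parity = parity[1:] + [0]
        for i in range(TWO_T):
            parity[i] ^= gf_mul(feedback, g[i + 1])
    return list(message) + parity


def poly_eval(coeffs_high_to_low, x):
    acc = 0
    for coeff in coeffs_high_to_low:
        acc = gf_mul(acc, x) ^ coeff
    return acc
```

A quick sanity check of such a model is that every codeword evaluates to zero at each assumed root of g(x), which is the divisibility property stated above.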
Conclusion:
In this chapter we showed how a Reed Solomon encoder can be implemented, along with the
CWM insertion and transcoding functions of an RSFEC transmitter. In the next chapter we
discuss the RSFEC decoder, where different decoding methods are shown along
with the inversion-less Berlekamp Massey algorithm.
CHAPTER 5
Figure 5-1 shows the main block diagram of Reed Solomon decoder which consists
of two main parts:
After getting the values and locations of the error, we can correct the received
codeword by xor-ing the received vector with the error vector.
The first step in the RS decoder is to check whether there is any error in the received
codeword. This is done using the syndrome computation block [5].
• Let the error polynomial e(X) added by the channel be formed as:
which is related to the received polynomial r(X) and the transmitted polynomial
c(X) as follows:
From equation 5.4, the transmitted polynomial c(X) must be a multiple of the
generator polynomial g(X), and the received polynomial r(X) is formed by
adding c(X) and e(X). So the roots of g(X) should give zero in the received
polynomial if the error polynomial is zero, i.e., no errors occurred.
(5.5)
Where i = 1, 2. . . 2t.
(5.6)
From equation 5.6, if there are no errors, all syndrome coefficients must be zero.
If any coefficient is non-zero, an error has occurred.
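Syndrome computation amounts to evaluating the received polynomial at the roots of g(X). The Python sketch below works in GF ($2^4$) with $p(X) = X^4 + X + 1$ and assumes roots $\alpha^1$ to $\alpha^{2t}$ with 2t = 6, as for a (15, 9) code. The received word is a hypothetical example (an all-zero codeword with three symbol errors), and the table and function names are illustrative.

```python
# Power table for GF(2^4) with p(X) = X^4 + X + 1: EXP[i] = alpha^i.
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    EXP.append(nxt ^ 0b10011 if nxt & 0b10000 else nxt)
LOG = {value: power for power, value in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def syndromes(received, two_t):
    """received[j] is the coefficient of X^j; returns [S_1, ..., S_2t]."""
    result = []
    for i in range(1, two_t + 1):
        acc = 0
        for coeff in reversed(received):     # Horner evaluation at alpha^i
            acc = gf_mul(acc, EXP[i % 15]) ^ coeff
        result.append(acc)
    return result


# Hypothetical received word: all-zero codeword plus errors at X^3, X^6, X^12.
received = [0] * 15
received[3], received[6], received[12] = EXP[7], EXP[3], EXP[4]

print(syndromes([0] * 15, 6))   # error-free word: all syndromes are zero
print(syndromes(received, 6))   # corrupted word: non-zero syndromes
```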
The main function of the decoding algorithm is to obtain the error location polynomial
σ(x) and the error evaluator polynomial W (x), which represent the locations and the
values of the errors respectively.
The first error correction procedure for Reed Solomon codes was found by
Gorenstein and Zierler, and improved by Chien and Forney. This procedure is known as
the key equation solver, as will be discussed later.
• Serial algorithms, in which the error locator polynomial σ(x) is calculated first
and then substituted in the key equation to calculate the error evaluator polynomial
W (x), e.g. the Berlekamp–Massey algorithm.
• Parallel algorithms, in which the error locator polynomial σ(x) and the error
evaluator polynomial W (x) are calculated in parallel, e.g. the Euclidean
algorithm.
Let the error polynomial e(X) contain τ errors placed at positions $X^{j_1}, X^{j_2}, \ldots, X^{j_\tau}$
with error values $e_{j_1}, e_{j_2}, \ldots, e_{j_\tau}$; then:
Now our target is to calculate the values $e_{j_i}$ and the powers of $X^{j_i}$.
From Equations [5.7 and 5.8] we can obtain a set of equations that relate
the error locations and values to the syndrome coefficients in the form of:
$S_{2t} = r(\alpha^{2t}) = e(\alpha^{2t}) = e_{j_1}\alpha^{2t j_1} + e_{j_2}\alpha^{2t j_2} + \cdots + e_{j_\tau}\alpha^{2t j_\tau}$ (5.9)
. . .
Writing $\beta_i = \alpha^{j_i}$ for the error location numbers, this becomes:
$S_{2t} = r(\alpha^{2t}) = e(\alpha^{2t}) = e_{j_1}\beta_1^{2t} + e_{j_2}\beta_2^{2t} + \cdots + e_{j_\tau}\beta_\tau^{2t}$ (5.10)
From equation 5.10 we have 2t equations in 2t unknowns in the worst case, but these
equations are not linear, so we define two polynomials:
• The error locator polynomial σ(x), which represents the locations of the errors.
• The error evaluator polynomial W (x), which represents the values of the errors.
Let's assume that we have binary errors, as the values of the errors will not affect
the locations of the errors [4], so for the B-M algorithm the error location polynomial can be
defined as:
• The roots of this polynomial are $\beta_1^{-1}, \beta_2^{-1}, \ldots, \beta_\tau^{-1}$, the inverses of the error
location numbers.
• Coefficients of this polynomial can be expressed as:
$\sigma_0 = 1$
$\sigma_1 = \beta_1 + \beta_2 + \cdots + \beta_\tau$
. . .
$\sigma_\tau = \beta_1 \beta_2 \cdots \beta_\tau$
It is possible to get a relation between the coefficients of σ(X) and the syndrome
coefficients $S_i$:
$S_1 + \sigma_1 = 0$
$S_2 + \sigma_1 S_1 = 0$
These equations are called Newton identities, and we can verify them as follows. For the
second identity, squaring a sum in GF ($2^m$) gives the sum of the squares, so:
$S_2 + \sigma_1 S_1 = (\beta_1^2 + \beta_2^2 + \cdots + \beta_\tau^2) + (\beta_1 + \beta_2 + \cdots + \beta_\tau)(\beta_1 + \beta_2 + \cdots + \beta_\tau) = 0$
The remaining Newton identities can be derived in the same way. The objective of
the algorithm is to find the minimum degree polynomial σ(X) whose coefficients satisfy
these Newton identities.
2. At the kth step, the polynomial of minimum degree will be: The second Newton
identity is tested. If the polynomial $\sigma_{BM}^{(1)}(X)$ satisfies the second Newton identity
in 5.13, then $\sigma_{BM}^{(2)}(X) = \sigma_{BM}^{(1)}(X)$. Otherwise the
(5.14)
(5.15)
3. In the next step the new polynomial with minimum degree will be:
(5.16)
(5.17)
4. Once the algorithm reaches step 2t, the polynomial $\sigma_{BM}^{(2t)}(X)$ is taken as the error-
location polynomial $\sigma_{BM}(X)$, i.e., $\sigma_{BM}(X) = \sigma_{BM}^{(2t)}(X)$.
Assume that we have just completed the kth iteration and obtained $\sigma^{(k)}(X)$. To find
(5.18)
If yes, then $\sigma^{(k+1)}(X) = \sigma^{(k)}(X)$ and there will not be any change in the
polynomial. If no, we add a correction based on $d_k$, called the kth discrepancy. This term can be
obtained by using the following expression:
(5.19)
• If $d_k = 0$, then the minimum-degree polynomial $\sigma_{BM}^{(k)}(X)$ satisfies the (k + 1)th Newton
identity, and it becomes $\sigma_{BM}^{(k+1)}(X)$:
(5.20)
• If $d_k \ne 0$, then the minimum-degree polynomial $\sigma_{BM}^{(k)}(X)$ does not satisfy the (k + 1)th
Newton identity, and a correction term is added to obtain $\sigma_{BM}^{(k+1)}(X)$:
(5.21)
where $\sigma_{BM}^{(\rho)}(X)$ is a previous polynomial such that the discrepancy $d_\rho \ne 0$ and
$\rho - l_\rho$ is a maximum, and the number $l_\rho$ is the degree of the polynomial $\sigma_{BM}^{(\rho)}(X)$. So the
closed form of the algorithm will be:
If $d_k = 0$ then $\sigma_{BM}^{(k+1)}(X) = \sigma_{BM}^{(k)}(X)$, $l_{k+1} = l_k$.
If $d_k \ne 0$, the algorithm takes the previous row ρ such that $d_\rho \ne 0$ and $\rho - l_\rho$ is
maximum. Then,
(5.22)
The B–M algorithm can be implemented in the form of a table with 2t rows to give
the final value of the minimum degree error locator polynomial $\sigma_{BM}^{(2t)}(X)$, as given in
Table 5-1.
Table 5-1: B–M algorithm table for determining the error-location polynomial
Note that if the degree of $\sigma_{BM}^{(2t)}(X)$ is larger than t, its roots do not
correspond to real error-location numbers. It also means that the number of errors is
more than t, which exceeds the error-correction capability of the code.
For example: consider the (15, 9) RS code over GF ($2^4$) with the following syndrome
coefficients:
From Table 5-2, the minimum degree error locator polynomial σ(X) using the
Berlekamp-Massey algorithm is: $\sigma(X) = 1 + \alpha^7 X + \alpha^4 X^2 + \alpha^6 X^3$.
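The iteration can be sketched in Python for GF ($2^4$) with $p(X) = X^4 + X + 1$. This is the standard Berlekamp-Massey recursion with an explicit field inversion, not the inversion-less variant. The syndrome values used below are an assumption: they come from a hypothetical error pattern (errors $\alpha^7$, $\alpha^3$, $\alpha^4$ at positions 3, 6 and 12) chosen because its locator matches the σ(X) quoted above, and they are not copied from the text. All names are illustrative.

```python
# Power table for GF(2^4) with p(X) = X^4 + X + 1: EXP[i] = alpha^i.
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    EXP.append(nxt ^ 0b10011 if nxt & 0b10000 else nxt)
LOG = {value: power for power, value in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def gf_inv(a):
    return EXP[(15 - LOG[a]) % 15]


def berlekamp_massey(synd):
    """synd = [S_1, ..., S_2t]; returns (sigma, degree), sigma low to high."""
    sigma, prev = [1], [1]   # current locator and the one before the last length change
    length, shift, prev_d = 0, 1, 1
    for k in range(len(synd)):
        # Discrepancy d_k: does sigma predict the next syndrome?
        d = synd[k]
        for i in range(1, min(len(sigma), k + 1)):
            d ^= gf_mul(sigma[i], synd[k - i])
        if d == 0:
            shift += 1
            continue
        # Correction term: d_k * d_rho^-1 * X^shift * prev(X).
        scale = gf_mul(d, gf_inv(prev_d))
        cand = sigma + [0] * (len(prev) + shift - len(sigma))
        for i, coeff in enumerate(prev):
            cand[i + shift] ^= gf_mul(scale, coeff)
        if 2 * length <= k:          # the locator degree has to grow
            prev, prev_d = sigma, d
            length, shift = k + 1 - length, 1
        else:
            shift += 1
        sigma = cand
    return sigma, length


# Assumed syndromes S_1..S_6 = alpha^12, 1, alpha^14, alpha^10, 0, alpha^12.
sigma, degree = berlekamp_massey([15, 1, 9, 7, 0, 15])
print([LOG[c] for c in sigma], degree)   # exponents of the sigma coefficients
```

The call to `gf_inv` in every update step is the cost that motivates an inversion-less formulation.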
After the determination of the error-location polynomial, the roots of this
polynomial are calculated by applying the Chien search, which will be explained in the
following sections. The variable X is replaced with all the non-zero elements of the Galois field
GF ($2^m$), $1, \alpha, \alpha^2, \ldots, \alpha^{2^m-2}$, in the expression of the obtained error-location polynomial,
looking for the condition $\sigma_{BM}(\alpha^i) = 0$, which gives the inverses of the error locations.
As mentioned before, the B-M algorithm is a serial algorithm, so once the B–M
algorithm determines the error-location polynomial σ(X), it is substituted in the following
equation:
This equation is called the key equation, where µ(X) is a polynomial such that the
polynomials σ(X), S(X) and W (X) fit the key equation. Equation 4.2.17 can be proved
as follows:
(5.24)
Then, from equation 3.24, the result of σ(X) * S(X) can be shown as:
By substituting in the key equation shown in equation 5.25, we can get the
coefficients of the error evaluator polynomial W (X). The B-M algorithm is called a serial
architecture because the error locator polynomial is calculated first, followed by the error evaluator
polynomial.
From the above explanation and equation 5.21, it is clear that the evaluation of $\sigma_{BM}^{(\mu+1)}$
needs the inverse of $d_\rho$ ($d_\rho^{-1}$) at each iteration, which requires a GF inverter. There are two
methods to implement the GF inverter. One is to design an actual GF inverter
that computes the inverse of GF elements. The other is to use an inverse ROM to
look up the inverse of each element [4]. But using a GF inverter at each iteration
adds delay to the calculation of equation 5.21 and also extra hardware, which
increases the complexity of the decoder whichever method is used. So to
overcome this drawback of the B-M algorithm, the Decomposed inversion-less
Berlekamp-Massey (DiB-M) algorithm is introduced.
$\sigma(\alpha^i) = 0$ (5.26)
. . .
$\sigma(\alpha^{2^m-1}) = 1 + \sigma_1(\alpha^{2^m-1}) + \sigma_2(\alpha^{2^m-1})^2 + \sigma_3(\alpha^{2^m-1})^3$
The Chien search block also evaluates W (x) at the field elements, i.e., $W(\alpha)$,
$W(\alpha^2)$, $W(\alpha^3)$, ..., $W(\alpha^{255})$. The only difference is the loaded coefficients: they are
$w_0 \sim w_7$ instead of $\sigma_0 \sim \sigma_8$, and they are used in calculating the error values.
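A software model of the Chien search simply tries every non-zero field element as a root. The Python sketch below works in GF ($2^4$) with $p(X) = X^4 + X + 1$ and uses the locator $\sigma(X) = 1 + \alpha^7 X + \alpha^4 X^2 + \alpha^6 X^3$ from the (15, 9) example; the function name and the convention that a root $\alpha^i$ corresponds to error position $(15 - i) \bmod 15$ are assumptions of this sketch.

```python
# Power table for GF(2^4) with p(X) = X^4 + X + 1: EXP[i] = alpha^i.
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    EXP.append(nxt ^ 0b10011 if nxt & 0b10000 else nxt)
LOG = {value: power for power, value in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def chien_search(sigma):
    """sigma is low to high; returns the error positions j with sigma(alpha^-j) = 0."""
    positions = []
    for i in range(15):
        acc = 0
        for coeff in reversed(sigma):        # Horner evaluation at alpha^i
            acc = gf_mul(acc, EXP[i]) ^ coeff
        if acc == 0:                         # alpha^i is a root: location is its inverse
            positions.append((15 - i) % 15)
    return sorted(positions)


sigma = [1, EXP[7], EXP[4], EXP[6]]
print(chien_search(sigma))
```

If the number of roots found is smaller than the degree of σ(X), the error pattern is beyond the correction capability of the code.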
(5.28)
where W (X) is the error evaluator polynomial, k is the number of errors, and σ'(X) is the
first derivative of the error locator polynomial σ(x) with respect to X [5].
After getting the error locations and error values, we can form the
error polynomial e(X) and correct the received polynomial r(X) just by adding (with
an XOR operation) these two polynomials together, as shown in Figure 5-1.
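The error value computation can be sketched in Python as follows. The sketch assumes syndromes $S_i = r(\alpha^i)$ for i = 1 to 2t, $S(X) = S_1 + S_2 X + \cdots$, and $W(X) = \sigma(X) S(X) \bmod X^{2t}$, which gives the error value at position j as $W(\alpha^{-j}) / \sigma'(\alpha^{-j})$. These conventions may differ from equation 5.28 by a power of the location number. The data is the hypothetical GF ($2^4$) example with errors at positions 3, 6 and 12, and all names are illustrative.

```python
# Power table for GF(2^4) with p(X) = X^4 + X + 1: EXP[i] = alpha^i.
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    EXP.append(nxt ^ 0b10011 if nxt & 0b10000 else nxt)
LOG = {value: power for power, value in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def gf_div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]


def poly_eval(coeffs_low_to_high, x):
    acc = 0
    for coeff in reversed(coeffs_low_to_high):
        acc = gf_mul(acc, x) ^ coeff
    return acc


def forney(synd, sigma, positions):
    """Return {position: error value} under the conventions assumed above."""
    two_t = len(synd)
    w_poly = [0] * two_t                     # W(X) = sigma(X) * S(X) mod X^2t
    for i, s in enumerate(synd):
        for j, coeff in enumerate(sigma):
            if i + j < two_t:
                w_poly[i + j] ^= gf_mul(s, coeff)
    # Formal derivative over GF(2^m): only odd-degree terms of sigma survive.
    deriv = [sigma[i] if i % 2 == 1 else 0 for i in range(1, len(sigma))]
    values = {}
    for pos in positions:
        x_inv = EXP[(15 - pos) % 15]         # alpha^-pos
        values[pos] = gf_div(poly_eval(w_poly, x_inv), poly_eval(deriv, x_inv))
    return values


errors = forney([15, 1, 9, 7, 0, 15], [1, 11, 3, 12], [3, 6, 12])
print({pos: LOG[val] for pos, val in errors.items()})  # error values as powers of alpha
```

Correcting the word then consists of XOR-ing each returned value into the received symbol at the corresponding position.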
The transcoder [3] extracts a group of four 66-bit blocks, rx_coded_j<65:0> where
j=0 to 3, from each 257-bit block rx_scrambled<256:0>.
Bit 0 of the 257-bit block is the first bit received.
Conclusion:
In this chapter, RSFEC decoding algorithms are discussed along with the down
transcoding functions used on the reception side to maintain the correct data
rate. In the next chapter, our implementation of the RSFEC encoder and decoder along with
the PCS TX and RX is shown.
CHAPTER 6
IMPLEMENTATION & VERIFICATION
Our implementation of the Reed Solomon decoder includes modules from all the above-
mentioned chapters. We implemented the FEC sublayer of the IEEE 802.3 Ethernet model. The
input to the layer comes from the PCS via the PMA.
1. A gearbox converts the N-bit PMA-width input into 66-bit blocks.
2. This 66-bit data is descrambled and decoded into a 64-bit output.
3. The CWM rate compensator removes idles to compensate for the insertion of CWMs.
4. This data is encoded and scrambled again before it goes for CWM insertion.
5. The up transcoder compresses the data to make enough space for the insertion of
CWMs.
6. CWM insertion is done once every 1024 codewords.
7. This output is encoded using the RS FEC encoder.
8. The 66-bit output from the encoder is converted into an N-bit PMA output, which is
passed on to further layers.
RECEPTION
1. The N-bit PMA input from the physical layer is converted to 66 bits using a gearbox.
2. A lock FSM is used to align with the CWM inserted in the transmission phase.
3. After lock is achieved, the output is sent to the RS FEC decoder, where error
detection and correction are done if the errors affect at most 7 symbols (up to
70 bits).
4. The CWMs in the corrected output are removed and the data is then decompressed by the down
transcoder.
5. The decompressed output is then descrambled and decoded.
6. Idles are added to compensate for the rate mismatch caused by the removal of
CWMs.
7. This output is again encoded and scrambled back into a 66-bit output.
8. A gearbox is used at the end to convert this 66-bit output into an N-bit PMA-width
output, which goes to the PCS and other layers for further processing.
TX PIPELINE
RX PIPELINE
[Figure: block diagram of the RSFEC sublayer between the PCS and the physical layer.
TX path: Gearbox, Descrambler, Decoder, CWM Rate Compensator, Encoder, Scrambler,
Compressor/CWM Insertion, RS FEC Encoder.
RX path: Lock FSM, RS FEC Decoder, Decompressor/CWM Removal, Descrambler, Decoder,
CWM Rate Compensator, Encoder, Scrambler, Gearbox.]
These registers are used to program the controls used in RSFEC module. We used APB
functionality by implementing a Dual Rank Synchronizer in order to read the values from the
RSFEC module.
6.3 VERIFICATION
6.3.1 TESTBENCH ARCHITECTURE
A data simulator is used in the testbench to generate data or idles, which are applied as
input to the DUT (Device Under Test). The RSFEC module (i.e., the DUT) processes the inputs to
produce the output. Initially idles are applied from the data simulator until lock is seen on the output
of the DUT. Once lock is achieved, the data simulator starts sending source data, and the output
of the DUT is compared with the source data using a comparator, which flags any data mismatch.
Data Format:
IDLES START DATA TERMINATE
RSFEC SUBLAYER:
PROCESSING THROUGH PMA:
RSFEC TX:
TX USING PMA
The inputs are applied through the PMA as the interface_sel signal is 0. The above waveform shows
the TX side of the RSFEC module.
TX USING XGMII
The inputs are applied through XGMII interface as interface_sel signal is 1. The above
waveform shows the TX side of the RSFEC module with XGMII inputs.
PCS ENCODER:
Signals:
PCS SCRAMBLER:
Signals:
tx_coded_i: data input from PCS encoder, each of 66-bit width.
tx_dav_i: data valid signal indicating presence of data input.
tx_init_i: initial seed (58-bit) to scrambler.
tx_scrambler_bypass_i: input signal; if made high, the input data bypasses the scrambler, else
scrambling is done.
tx_scrambled_o: scrambled data output of 66-bit width to RSFEC up-transcoder.
tx_dav_o: data valid signal indicating presence of data output.
RSFEC UP-TRANSCODER:
Signals:
tx_coded_i: data input from PCS scrambler, each of 66-bit width.
tx_dav_i: data valid signal indicating presence of data input.
tx_scrambled_o: transcoded data output of 264-bit width to width converter.
tx_dav_o: data valid signal indicating presence of data output.
WIDTH CONVERTER:
Signals:
data_i: data input from RSFEC up-transcoder, each of 264-bit width.
dav_i: data valid signal indicating presence of data input.
data_o: converted data output of variable width (60,60,70,50,90) to RSFEC encoder.
dav_o: data valid signal indicating presence of data output.
state_r: state indicating width of data output.
RSFEC ENCODER:
Signals:
data_i: data input from width converter, each of variable width.
dav_i: data valid signal indicating presence of data input.
data_o: encoded data output of 66-bit width to physical layer.
dav_o: data valid signal indicating presence of data output.
generator_polynomial: value of g(x) used in RSFEC encoder.
RX:
RX USING XGMII
The above waveform shows the RX side of the RSFEC module where the XGMII outputs are
taken from the PCS decoder.
RX USING PMA
The above waveform shows the RX side of the RSFEC module where the PMA outputs are
taken from the Output Gearbox.
RX RSFEC LOCK:
Signals:
data_i: data input from physical layer, each of PMA width.
dav_i: data valid signal indicating presence of data input.
signal_ok_i: lock signal which goes high on detecting CWM.
data_o: data output from the LOCK block to the RSFEC decoder.
dav_o: data valid signal indicating presence of data output.
RSFEC DECODER:
Signals:
data_i: data input from LOCK block of RSFEC sublayer, each of variable width.
dav_i: data valid signal indicating presence of data input.
data_o: decoded data output to RSFEC syndrome decoder.
dav_o: data valid signal indicating presence of data output.
Signals:
data_i: data input from RSFEC decoder, each of variable width.
dav_i: data valid signal indicating presence of data input.
synd_o: output syndrome values to RSFEC down transcoder.
synd_dav_o: data valid signal indicating presence of syndrome output.
data_o: data output from RSFEC syndrome decoder.
dav_o: data valid signal indicating presence of data output.
Signals:
rx_scrambled_i: data input from LOCK block of RSFEC sublayer each of variable width.
rx_dav_i: data valid signal indicating presence of data input.
rx_coded_o: decoded data output to PCS descrambler each of 66-bit width.
rx_dav_o: data valid signal indicating presence of data output.
PCS DESCRAMBLER:
Signals:
rx_scrambled_i: data input from RSFEC down transcoder, each of 66-bit width.
rx_dav_i: data valid signal indicating presence of data input.
rx_coded_o: descrambled data output to PCS decoder of 66-bit width.
rx_valid_o: data valid signal indicating presence of data output.
PCS DECODER:
Signals:
rx_coded_i: data input from descrambler each of 66-bit width.
rx_dav_i: data valid signal indicating presence of data input.
rxd_o: decoded data output to XGMII each of 64-bit width.
rxc_o: decoded control output to XGMII each of 8-bit width.
The synthesis of RSFEC module is done using Xilinx Vivado for xc7v585tffg1157-3.
===============================================
Summary of Synthesis Report :
-------------------------------------------------
Total flip-flops : 74.8K
Total ROM bits : 10.24K
The maximum achievable TX clock frequency is 80MHz.
The maximum achievable RX clock frequency is 50MHz.
=================================================
In this chapter we have shown our implementation, which includes the RTL design of both the
PCS and RSFEC modules. The verification approach and simulation results are also shown.
In the next chapter, we conclude this thesis by showing what we have accomplished and the
future scope of this project.
CHAPTER 7
CONCLUSION & SCOPE
7.1 Conclusion
This thesis presents architectures for both transmission and reception side of RSFEC
sublayer along with the PCS layer. The IEEE 802.3 (Ethernet) standard specifies a networking
protocol that allows multiple devices connected to the network to communicate with each
other. The PHY Coding Sublayer (PCS) is a part of the physical layer of the Ethernet stack
which performs encoding/decoding and error correction and communicates the same to the
MAC. On the egress, the PCS accepts data from the MAC and sends encoded data to the
serializer while on the ingress, the PCS accepts data from the deserializer and provides decoded
data to the MAC.
As speeds of the backplane (BASE-R) Ethernet increase, the BER (Bit Error Rate) also goes
up. In the absence of a FEC scheme, the time taken by the Ethernet subsystem to identify errors
is large. It would be advantageous to detect and correct a reasonable subset of possible errors
as early as possible to greatly reduce the occurrence of retransmission which in turn increases
the efficiency of the link. By adding a FEC scheme like Reed Solomon coding to traditional
Ethernet, effective BER is significantly reduced. To allow error correction as early as possible
in the stack, the PCS sublayer must implement the FEC.
The RS-FEC scheme we implemented adds 140 parity bits to every 5280-bit codeword and allows
correction of up to 7 symbol errors, i.e. at most 70 bit errors, per codeword
(BER of approximately up to 13 x $10^{-3}$). We proposed a method to predict variable ingress and
egress latency due to transcoding and gearbox conversion, which will also be useful in MAC
timestamping.
Table A–2 contains an RS (528,514) code word. Each row of Table A–1 is a set of four 66-bit blocks
that is converted to one 257-bit block using the procedure defined in 3.1. The resulting set of 20 257-bit
blocks constitutes the message portion of the codeword. The parity is computed using the encoder
defined in 3.2 and is appended to the message to complete the codeword.
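The sizes quoted for the RS (528,514) codeword can be cross-checked with a few lines of arithmetic. The Python sketch below assumes 10-bit symbols, consistent with the GF ($2^{10}$) field used by the encoder; the variable names are illustrative.

```python
n, k, m = 528, 514, 10            # code length, message length, bits per symbol

codeword_bits = n * m             # total bits per codeword
message_bits = k * m              # bits carried by the message portion
parity_symbols = n - k            # 2t
parity_bits = parity_symbols * m
t = parity_symbols // 2           # correctable symbol errors per codeword
blocks_257 = message_bits // 257  # number of 257-bit transcoded blocks

print(codeword_bits, message_bits, parity_bits, t, t * m, blocks_257)
```

The message portion divides exactly into twenty 257-bit blocks, and the 140 parity bits support correction of 7 symbols, which is at most 70 bits per codeword.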
REFERENCES
[12] IEEE P802.3by™/D2.0 Draft Standard for Ethernet Amendment: Media Access Control
Parameters, Physical Layers and Management Parameters for 25 Gb/s Operation, The
Institute of Electrical and Electronics Engineers, Inc. Three Park Avenue, New York.