
Layer 2 Tunnel

xConnect performance test

About test

The main purpose of this test was to measure router performance in Layer 2 tunnel mode when forwarding frames of
different lengths. Some background information is also included for better understanding.

The following equipment was used during the test:

• Cisco 892/K9, C890 Software (C890-UNIVERSALK9-M), Version 15.1(2)T2, RELEASE SOFTWARE (fc1)
• JDSU SmartClass Ethernet Testers.

The measured L1 Rate [Mbps] values in the tables are rounded, which is sufficient for this review. A more accurate
value can be calculated from the measured frame rate.

For example, for 64-byte frames the L1 Rate is 14.9 Mbps and the frame rate is 22172 frames per second. Here
14.9 Mbps is a rounded value. A more accurate value is 22172 [frames/s] x 84 [bytes on wire] x 8 [bits per byte] =
14.899584 Mbps.
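
The same calculation can be sketched in a few lines of Python. This is only an illustration: the function and constant names are made up for this example, and the 84 bytes on wire are taken as the 64-byte frame plus the 8-byte preamble and the 12-byte interframe gap described in the Background section.

```python
# Per-frame wire overhead that is not part of the Ethernet frame length.
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def l1_rate_mbps(frames_per_second: int, frame_length_bytes: int) -> float:
    """Return the Layer 1 rate in Mbps for a frame rate and an L2 frame length (with CRC)."""
    bytes_on_wire = frame_length_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES
    return frames_per_second * bytes_on_wire * 8 / 1_000_000


# 64-byte frames at 22172 frames/s, as in the example above.
print(l1_rate_mbps(22172, 64))  # 14.899584
```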

Background
Each layer has its own unit of data, the PDU (Protocol Data Unit):
• Layer 1 Physical Layer - Bit
• Layer 2 Data Link Layer - Frame
• Layer 3 Network Layer - Packet

Ethernet frame Ethernet II / DIX

Figure 1 shows fields and lengths of Ethernet II/DIX frame.

Field             Length            Included in the Ethernet frame length
Preamble          8 Bytes           no (Layer 1 part)
Destination MAC   6 Bytes           yes
Source MAC        6 Bytes           yes
Ethertype         2 Bytes           yes
Payload           46 - 1500 Bytes   yes
CRC-32            4 Bytes           yes
Interframe gap    12 Byte times     no (Layer 1 part)

Figure 1. Untagged Ethernet frame

Maximum throughput

The interframe gap is inserted between frames during transmission (Figure 2).

#4 #3 #2 #1 Frames

Interframe gap
12 Byte times

Figure 2 Ethernet interframe gap

Preamble 8 bytes + Destination MAC 6 bytes + Source MAC 6 bytes + Ethertype 2 bytes + Payload 1500 bytes + FCS 4
bytes + Interframe gap 12 byte times = 1538 bytes. Thus 1538 byte times on the wire are needed to transmit one
1518-byte untagged frame.

Layer 3 maximum throughput cannot reach 100% of wire speed and depends on the packet length.
1538 bytes are needed to transmit 1500 bytes of L3 data -> 1500/1538*100% = 97.53% @ untagged frame.
84 bytes are needed to transmit 46 bytes of L3 data -> 46/84*100% = 54.76% @ untagged frame (the 64-byte frame
itself takes 64/84*100% = 76.19% of the wire time).
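
The efficiency figures can be reproduced with the short Python sketch below. The helper names are hypothetical, and the 100 Mbps link speed used for the theoretical frame rate is an assumption based on the FastEthernet ports used in this test.

```python
WIRE_OVERHEAD_BYTES = 8 + 12             # preamble + interframe gap
L2_HEADER_AND_FCS_BYTES = 6 + 6 + 2 + 4  # destination MAC + source MAC + Ethertype + FCS


def l3_efficiency_percent(payload_bytes: int) -> float:
    """Share of the wire time that carries L3 data for an untagged frame."""
    wire_bytes = payload_bytes + L2_HEADER_AND_FCS_BYTES + WIRE_OVERHEAD_BYTES
    return payload_bytes / wire_bytes * 100


def l2_efficiency_percent(frame_bytes: int) -> float:
    """Share of the wire time that carries the L2 frame (headers and FCS included)."""
    return frame_bytes / (frame_bytes + WIRE_OVERHEAD_BYTES) * 100


def max_frames_per_second(frame_bytes: int, link_mbps: float = 100.0) -> float:
    """Theoretical maximum frame rate; link_mbps=100 is an assumed FastEthernet link."""
    return link_mbps * 1_000_000 / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


print(round(l3_efficiency_percent(1500), 2))  # 97.53
print(round(l3_efficiency_percent(46), 2))    # 54.76
print(round(l2_efficiency_percent(64), 2))    # 76.19
print(round(max_frames_per_second(64), 2))    # 148809.52
```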

L2TP encapsulation

L2TP overhead is 38 bytes (Figure 3).

Figure 3 L2TP Encapsulation

Performance test
L2 Tunnel topology

Topology: L2 Tunnel with redundant paths (Figure 4).


Routing protocol: BGP is used, but the choice of routing protocol is not important for this test.
Test type: Layer 2 RFC2544.

Two variants were tested:

• 1st variant: Tunnel via Gi0 port
• 2nd variant: Tunnel via Fa7 port (interface SVI VLAN 100)

Figure 4 Schematic diagram

The Cisco 892 router has 2 routed ports and an 8-port LAN switch. An SVI is configured to create a 3rd routed port.

The Fa7-Fa7 and Gi0-Gi0 connections carry the tunnel and must have a Layer 3 MTU of 1538 bytes (1500 + 38 bytes of
L2TP overhead).

By default the SVI MTU is 1514 bytes: a 1500-byte L3 packet plus the 14-byte L2 MAC header, without CRC.
Because the SVI is involved in tunneling, its MTU must be 1552 bytes (1538 + 14).
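
As a quick check of the MTU values used in the configurations, the arithmetic can be written out in Python. This is a sketch only; the constant and function names are invented for the example and are not part of any Cisco tool.

```python
L3_PACKET_BYTES = 1500     # original Layer 3 packet
L2TP_OVERHEAD_BYTES = 38   # L2TP overhead stated in this document
MAC_HEADER_BYTES = 14      # L2 MAC header counted in the SVI MTU, without CRC


def tunnel_l3_mtu(l3_packet_bytes: int = L3_PACKET_BYTES) -> int:
    """Layer 3 MTU needed on the routed tunnel path."""
    return l3_packet_bytes + L2TP_OVERHEAD_BYTES


def svi_mtu(l3_packet_bytes: int = L3_PACKET_BYTES) -> int:
    """SVI MTU, which also counts the 14-byte MAC header."""
    return tunnel_l3_mtu(l3_packet_bytes) + MAC_HEADER_BYTES


print(tunnel_l3_mtu())  # 1538
print(svi_mtu())        # 1552
```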

NB! A bug with the MTU was found during SVI testing. It is described below.

Initial router configurations:

Router R1
interface FastEthernet8
no ip address
xconnect 2.2.2.2 123 encapsulation l2tpv3 manual pw-class L2_TUNNEL <- define xConnect
l2tp id 10 20 <- define tunnel session ID

interface Loopback0
ip address 1.1.1.1 255.255.255.255

interface GigabitEthernet0
mtu 1538 <- tunnel MTU
ip address 192.168.2.1 255.255.255.0

interface FastEthernet7
switchport access vlan 100
mtu 1538 <- tunnel MTU

interface Vlan100
mtu 1552 <- tunnel MTU
ip address 192.168.1.1 255.255.255.0

pseudowire-class L2_TUNNEL <- define pseudo wire class
encapsulation l2tpv3 <- define encapsulation
protocol none <- manual mode
ip local interface Loopback0 <- use Loopback 0 interface

router bgp 10 <- BGP configuration
bgp router-id 1.1.1.1
bgp log-neighbor-changes
redistribute connected
neighbor 192.168.1.2 remote-as 20
neighbor 192.168.2.2 remote-as 20
neighbor 192.168.2.2 weight 10 <- path via Gi0 is preferred (primary)
no auto-summary

Router R2
interface FastEthernet8
no ip address
xconnect 1.1.1.1 123 encapsulation l2tpv3 manual pw-class L2_TUNNEL <- define xConnect
l2tp id 20 10 <- define tunnel session ID

interface Loopback0
ip address 2.2.2.2 255.255.255.255

interface GigabitEthernet0
mtu 1538 <- tunnel MTU
ip address 192.168.2.2 255.255.255.0

interface FastEthernet7
switchport access vlan 100
mtu 1538 <- tunnel MTU

interface Vlan100
mtu 1552 <- tunnel MTU
ip address 192.168.1.2 255.255.255.0

pseudowire-class L2_TUNNEL <- define pseudo wire class
encapsulation l2tpv3 <- define encapsulation
protocol none <- manual mode
ip local interface Loopback0 <- use Loopback 0 interface

router bgp 20 <- BGP configuration
bgp router-id 2.2.2.2
bgp log-neighbor-changes
redistribute connected
neighbor 192.168.1.1 remote-as 10
neighbor 192.168.2.1 remote-as 10
neighbor 192.168.2.1 weight 10 <- path via Gi0 is preferred (primary)
no auto-summary

The commands #sh processes cpu sorted 1min and #sh processes cpu history were used to gather CPU statistics.
See the example below:

R1#sh processes cpu sorted 1min | i CPU utilization


CPU utilization for five seconds: 57%/55%; one minute: 38%; five minutes: 17%

R1#sh proc cpu hist

R1 07:02:20 AM Wednesday Jan 28 2015 UTC

5555555555555555555555556666655555555555555511111 333332
777777777666666666666666333336666666666666669999911111333339
100
90
80
70
60 ********************************************
50 ********************************************
40 ********************************************
30 ******************************************** ******
20 ************************************************* ******
10 ************************************************* ******
0....5....1....1....2....2....3....3....4....4....5....5....6
0 5 0 5 0 5 0 5 0 5 0
CPU% per second (last 60 seconds)

The value "CPU utilization for five seconds: 57%/55%" should be read as "Total CPU usage"/"CPU usage caused by traffic"
(the second number is the CPU time spent at interrupt level).
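
When many measurements are collected, the summary line can be split into its fields automatically. The Python sketch below is one possible way to do it; the function name and the dictionary keys are chosen for this example only.

```python
import re

# Matches the summary line printed by "show processes cpu".
CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_line(line: str) -> dict:
    """Split the CPU summary line into total, traffic (interrupt), 1 min and 5 min values."""
    match = CPU_LINE.search(line)
    if match is None:
        raise ValueError("not a CPU utilization summary line")
    total, traffic, one_minute, five_minutes = (int(value) for value in match.groups())
    return {
        "total_5s": total,        # total CPU usage, last five seconds
        "traffic_5s": traffic,    # CPU usage caused by traffic, last five seconds
        "one_minute": one_minute,
        "five_minutes": five_minutes,
    }


sample = "CPU utilization for five seconds: 57%/55%; one minute: 38%; five minutes: 17%"
print(parse_cpu_line(sample))
```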

Test results

Original L2 Tunnel Test results. Tunnel via Gi0 port


L2 Frame   Tunnel Frame   Measured   CPU     CPU usage     Measured
Length     Length         L1 Rate    total   caused by     Rate
[Bytes]    [Bytes]        [Mbps]     [%]     traffic [%]   [frames/s]
64         102            14.9       99      96            22172
128        166            26.3       99      96            22212
256        294            49.9       99      96            22554
512        550            89.2       99      96            20958
1024       1062           96.5       98      95            11552
1280       1318           97.1       57      55            9336
1518       1556           97.6       52      50            7931
random     -              70         60      56            avg. 12443

Table 1 Throughput of L2 Tunnel via Gi0 and CPU utilization

Original L2 Tunnel Test results. Tunnel via SVI port


L2 Frame   Tunnel Frame   Measured   CPU     CPU usage     Measured
Length     Length         L1 Rate    total   caused by     Rate
[Bytes]    [Bytes]        [Mbps]     [%]     traffic [%]   [frames/s]
64         102            10.0       99      97            14880
128        166            17.0       99      97            14358
256        294            30.0       99      97            13587
512        550            59.5       99      97            13980
1024       1062           96.5       93      92            11552
1280       1318           97.1       84      83            9336
1518       1556           14.6       12      11            1186

Table 2 Throughput of L2 Tunnel via Fa7 and CPU utilization

The performance drops sharply for 1518-byte frames, which was unexpected. While looking for the cause, I found the
following information.

R2(config)#do sh buffer leak | i Fa8|Header


Header DataArea Pool Size Link Enc Flags Input Output User
85B44984 1EC01544 DMA-1 1542 7 1 20280 Fa8 None L2X Data
85B45BB4 1EC0A944 DMA-1 1542 7 1 20280 Fa8 None L2X Data
85B4892C 1EC21B44 DMA-1 1542 7 1 20280 Fa8 None L2X Data
85B496D0 1EC28A44 DMA-1 1542 7 1 20280 Fa8 None L2X Data
85B4B6A4 1EC38D44 DMA-1 1542 7 1 20280 Fa8 None L2X Data
85B4DB04 1EC4B544 DMA-1 1542 7 1 20280 Fa8 None L2X Data
85B4F1C0 1EC56E44 DMA-1 1542 7 1 20280 Fa8 None L2X Data
......

The L2 tunnel data (buffer size 1542 bytes) does not fit into the configured tunnel MTU (1538 bytes).

I then captured the traffic with Wireshark.

First, frames were captured on the path through the Gi0 interfaces.

1. Source frame (Figure 5). Length 1518 bytes.

Figure 5. Source frame 1518 bytes

The frame size is 1518 bytes. Wireshark shows 1514 bytes, i.e. without CRC. The payload pattern is 0xAA. Each frame
contains 4 bytes (red selection) at the end of DATA.

2. Frame from R1 to R2 through interface Gi0 (Figure 6). MTU 1552 bytes.

Figure 6. L2TP frame. Path via Gi0

The MTU is correct: new MAC header 14 bytes + new IP header 20 bytes + L2TP header 4 bytes + original frame
without CRC 1514 bytes = 1552 bytes (without CRC). Each frame contains 4 bytes (red selection) at the end of
DATA.

3. Frames from R2 to Tester 2, from Tester 2 to R2, from R2 to R1 and from R1 to Tester 1.

Frames are correct and have the right MTU. The overall change in frame size is shown in Figure 7.

Figure 7. Frame length change during transmission through tunnel (via Gi0)
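
The frame-size arithmetic from step 2 can be sketched in Python as follows. The header sizes are the ones quoted in this document; the function name and the include_fcs parameter are illustrative only.

```python
NEW_MAC_HEADER_BYTES = 14
NEW_IP_HEADER_BYTES = 20
L2TP_HEADER_BYTES = 4
FCS_BYTES = 4


def tunnel_frame_length(original_frame_with_fcs: int, include_fcs: bool = True) -> int:
    """Length of the L2TP-encapsulated frame for a given original frame length.

    The original FCS is dropped before encapsulation; the outer frame gets its own FCS.
    """
    original_without_fcs = original_frame_with_fcs - FCS_BYTES
    length = (NEW_MAC_HEADER_BYTES + NEW_IP_HEADER_BYTES
              + L2TP_HEADER_BYTES + original_without_fcs)
    return length + FCS_BYTES if include_fcs else length


print(tunnel_frame_length(1518, include_fcs=False))  # 1552, as seen in Wireshark
print(tunnel_frame_length(1518))                     # 1556
print(tunnel_frame_length(64))                       # 102
```

The last two values match the second column of the result tables (64 -> 102, 1518 -> 1556).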

Next, frames were captured on the path through the Fa7 interfaces.

4. Source frames with length 1518 bytes
Frames are the same as in Figure 5.
5. Frame from R1 to R2 through interface Fa7. MTU 1552 bytes.
Router R1 encapsulates frames into L2TP packets and sends them to router R2 via the SVI and Fa7 ports. This step is
correct. Frames have the correct MTU (Figure 8).

Figure 8. L2TP frame. Path through Fa7

NB! The further steps show an anomaly. The overall change in frame size is shown in Figure 9.

Figure 9. Frame length change during transmission through tunnel (via Fa7)

6. Frames from R2 to Tester 2
When router R2 de-encapsulates frames from the tunnel to the output port, it adds 4 bytes to the end of the
original frame DATA. See Figure 10.

NB! This anomaly occurs only for frames of 1493 bytes and longer with CRC (1489 without CRC). Frames up to
1492 bytes long are forwarded correctly.

Figure 10. Frame from R2 to Tester 2. 4 Bytes are added.

The added DATA is marked with a yellow frame in Figure 10. These 4 bytes are outside the Layer 3 packet length.
Please note that my NIC passes frames to Windows, and therefore to Wireshark, without the CRC. Wireshark shows the
“bytes on wire” value without the 4 bytes of CRC.
Wireshark decodes the frame and sees that the Layer 3 packet length is 1500 bytes, but additional bytes are
present after the end of the packet. Wireshark assumes they are the CRC and tries to verify it. The check fails
because these bytes are not a Layer 2 CRC, and the warning [Ethernet Frame Check Sequence Incorrect] is shown.

7. Frames from R2 to R1
Next step. Tester 2 receives frames with the 4 inserted bytes from R2 and sends them back toward Tester 1.
Because the incoming port Fa8 is a tunnel port, router R2 does not inspect the Layer 3 payload. The router
encapsulates all incoming bytes into an L2TP packet and forwards it to router R1 via the tunnel. This procedure
does not add any bytes (Figure 11).

Figure 11. L2TP frame. Direction from R2 to R1. Tunnel through Fa7.

8. Frames from R1 to Tester 1
Final step. Router R1 also adds 4 bytes to the end of the “original frame DATA” when it de-encapsulates frames
from the tunnel to the output port. The “original frame” from the tunnel already has 4 additional bytes, and
router R1 adds 4 more. Frames with a length of 1522 bytes (without CRC) are returned to Tester 1. See Figure 12.

Figure 12. Frame with additional 8 bytes.

The tester receives the long frames but ignores any data after the end of the packet, in the same way that padding
is ignored for short packets. The tester therefore shows no errors or lost packets. The only visible sign is that the
received frames are counted as >1518/1526, whereas they should be counted as 1024-1518/1526.
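
Bytes after the end of the Layer 3 packet can also be detected programmatically in a capture. The Python sketch below assumes an untagged Ethernet II frame that carries IPv4 and was captured without the CRC, as in the Wireshark captures described above; the function name is hypothetical.

```python
import struct

ETH_HEADER_BYTES = 14
MIN_IPV4_HEADER_BYTES = 20
ETHERTYPE_IPV4 = 0x0800


def trailing_bytes(frame: bytes) -> int:
    """Return the number of bytes that follow the IPv4 packet inside the frame."""
    if len(frame) < ETH_HEADER_BYTES + MIN_IPV4_HEADER_BYTES:
        raise ValueError("frame is too short for Ethernet + IPv4")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("only untagged IPv4 frames are handled in this sketch")
    # IPv4 Total Length is a 16-bit field at offset 2 of the IP header.
    (total_length,) = struct.unpack_from("!H", frame, ETH_HEADER_BYTES + 2)
    return len(frame) - ETH_HEADER_BYTES - total_length


# Synthetic frame: MAC header + 1500-byte IPv4 packet (0xAA pattern) + 4 extra bytes.
mac_header = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
ip_packet = bytes([0x45, 0x00]) + struct.pack("!H", 1500) + bytes(16) + b"\xaa" * 1480
print(trailing_bytes(mac_header + ip_packet))               # 0
print(trailing_bytes(mac_header + ip_packet + bytes(4)))    # 4
```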

MTU tuning

The MTU was increased on the Fa7 and VLAN 100 interfaces: to 1542 bytes for Fa7 and 1556 bytes for VLAN 100. The
throughput test was repeated after the MTU increase, and the performance is now much better (Table 3). However, the
anomaly with the additional bytes remains: the router still adds 4 bytes during de-encapsulation from the tunnel to
the output port for frames with a length of 1493-1518 bytes.
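
The observed behaviour can be summarised as a small model in Python. This is only a model of what was seen in the captures, not a description of the router internals. It assumes that every de-encapsulation adds 4 bytes once the frame is 1493 bytes or longer (with CRC), including on the return path where the frame is already longer than 1518 bytes.

```python
ANOMALY_THRESHOLD_BYTES = 1493  # frame length with CRC from which the anomaly was observed
ADDED_BYTES = 4
FCS_BYTES = 4


def after_decapsulation(frame_with_fcs: int) -> int:
    """Frame length after one de-encapsulation from the tunnel to the output port."""
    if frame_with_fcs >= ANOMALY_THRESHOLD_BYTES:
        return frame_with_fcs + ADDED_BYTES
    return frame_with_fcs


def after_round_trip(frame_with_fcs: int) -> int:
    """Tester 1 -> R1 -> R2 -> Tester 2 -> R2 -> R1 -> Tester 1: two de-encapsulations."""
    at_tester_2 = after_decapsulation(frame_with_fcs)  # de-encapsulated by R2
    return after_decapsulation(at_tester_2)            # de-encapsulated by R1


print(after_round_trip(1492))              # 1492, forwarded correctly
print(after_round_trip(1518))              # 1526
print(after_round_trip(1518) - FCS_BYTES)  # 1522, the length seen without CRC
```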

L2 Tunnel Test results after MTU tuning. Tunnel via SVI port

L2 Frame   Tunnel Frame   Measured   CPU     CPU usage     Measured
Length     Length         L1 Rate    total   caused by     Rate
[Bytes]    [Bytes]        [Mbps]     [%]     traffic [%]   [frames/s]
64         102            10.0       99      97            14880
128        166            17.0       99      97            14358
256        294            30.0       99      97            13587
512        550            59.5       99      97            13980
1024       1062           96.5       93      92            11552
1280       1318           97.1       84      83            9336
1518       1556           97.3       72      71            7908
random     -              60         64      60            avg. 11530

Table 3. L2 Tunnel throughput and CPU utilization

Summary
Graph 1 is a summary graph for the RFC2544 tests and shows the maximum throughput. Graph 2 shows the router
performance for random frame lengths, which is closer to real-world throughput.

[Graph 1: Throughput [Mbps] and CPU Utilization [%] versus Packet Length [Bytes]. Series: L1 throughput, tunnel via
Fa7; L1 throughput, tunnel via Gi0; CPU utilization, tunnel via Fa7; CPU total usage, tunnel via Gi0.]

Graph 1 Throughput and CPU utilization.

[Graph 2: L1 Rate [Mbps] and CPU utilization for random frames. Tunnel via Gi0: 70 Mbps, CPU utilization 60%.
Tunnel via Fa7: 60 Mbps, CPU utilization 64%.]

Graph 2. Throughput and CPU utilization for random frames

Juri Jestin
9.02.2014
