
Intel Advanced Networking Services With Ethernet Teaming

2/1/05

Version 1.0

Dell Inc. One Dell Way Round Rock, Texas 78682

Table of Contents

1 Executive Summary
1.1 Key Definitions and Acronyms
1.2 Teaming Concepts
1.2.1 Network Addressing
1.2.2 Teaming and Network Addresses
1.2.3 Description of teaming modes
1.2.3.1 Adapter Fault Tolerance (AFT)
1.2.3.2 Switch Fault Tolerance (SFT)
1.2.3.3 Adaptive Load Balancing (ALB)
1.2.3.4 Static Link Aggregation (SLA) - FEC, GEC, Static IEEE 802.3ad
1.2.3.5 Dynamic Link Aggregation (DLA) - LACP (IEEE 802.3ad)
1.3 Software Components
1.4 Hardware Requirements
1.4.1 Repeater Hub
1.4.2 Switching Hub
1.4.3 Router
1.5 Supported Teaming by OS
1.6 Utilities for Configuring Teaming by OS
1.7 Supported Features by Team Type
1.8 Selecting a team type
2 Teaming Mechanisms
2.1 Architecture
2.1.1 Adapter Fault Tolerance (AFT)
2.1.2 Switch Fault Tolerance (SFT)
2.1.3 Failover Decision Events
2.1.3.1 Probe Mechanism
2.1.3.2 Link Status
2.1.3.3 Hardware Status
2.1.3.4 Packet Receive Counters
2.1.4 Adaptive Load Balancing (Outbound Traffic Control)
2.1.5 Receive Side Load Balancing (Inbound Traffic Control)
2.1.6 Static Link Aggregation, FEC, GEC and Static IEEE 802.3ad (SLA)
2.1.7 Dynamic Link Aggregation (IEEE 802.3ad)
2.1.8 Switch Requirements
2.1.8.1 Teaming with a Single Switch
2.1.8.2 Teaming Across Switches
2.1.8.3 Routers
2.1.8.4 Teaming in Blade Servers with Switches
2.1.9 Protocol Support
2.1.9.1 TCP/IP
2.1.9.2 PAgP
2.1.9.3 LACP
2.1.9.4 STP
2.2 Driver Support by Operating System
2.3 Supported Teaming Speeds
2.4 Teaming with Hubs (for Troubleshooting Purposes Only)
2.4.1 Hub usage in teaming network configurations
2.4.2 AFT, ALB, and RLB Teams
2.4.3 AFT, ALB, and RLB Team Connected to a Single Hub
2.4.4 Static and Dynamic Link Aggregation (FEC/GEC/IEEE 802.3ad)
2.5 Teaming with Microsoft NLB/WLBS
2.5.1 Unicast Mode
2.5.2 Multicast Mode
3 Teaming and Other Advanced Networking Features
3.1 Hardware Offloading Features
3.1.1 Checksum Offload
3.1.2 Large Send Offload
3.1.3 Jumbo Frames
3.2 Wake on LAN
3.3 Preboot eXecution Environment (PXE)
3.4 IPMI
3.5 802.1q VLAN Tagging Support
3.6 802.1p QoS Tagging Support
4 Performance
5 Application Considerations
5.1 Teaming and Clustering
5.1.1 Microsoft Cluster Software
5.1.2 High Performance Computing Cluster
5.1.2.1 Advanced Features
5.1.3 Oracle
5.2 Teaming and Network Backup
5.2.1 Load Balancing and Failover
5.2.2 Fault Tolerance
6 Troubleshooting Teaming Problems
6.1 Teaming Configuration Tips
6.2 Troubleshooting guidelines
6.3 Teaming FAQ
Appendix A - Event Log Messages
7.1 Windows System Event Log messages
7.2 Base Driver (Physical Port / Miniport)
7.3 Intermediate Driver (Virtual Adapter/Team)

List of Figures

Figure 1. Intel PROSet for Windows
Figure 2. Process for Selecting a Team Type
Figure 3. Intermediate Driver
Figure 4. Teaming Across Switches without an Inter-switch Link
Figure 5. Teaming Across Switches with Interconnect
Figure 6. Failover Event
Figure 7. Teaming with Blades
Figure 8. Team Connected to a Single Hub
Figure 9. VLANs
Figure 10. Clustering With Teaming Across One Switch
Figure 11. Clustering With Teaming Across Two Switches
Figure 12. Network Backup without Teaming
Figure 13. Network Backup with ALB Teaming and Switch Fault Tolerance

List of Tables

Table 1. Glossary of Terms
Table 2. Teaming Mode Selections Offered by Intel
Table 3. Intel Teaming Software Components
Table 4. Teaming Support by Operating System
Table 5. Operating System Configuration Tools
Table 6. Comparison of Teaming Modes
Table 7. Teaming Modes Supported in Blade Servers
Table 8. Teaming Attributes by OS
Table 9. Link Speeds in Teaming
Table 10. Teaming Modes for NLB
Table 11. HW Offloading and Teaming
Table 12. ALB Teaming Performance
Table 13. Base Driver Event Log Messages
Table 14. Intermediate Driver Event Log Messages

List of Graphs

Graph 1. Backup Performance with no NIC Teaming
Graph 2. Backup Performance

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

1 Executive Summary
This white paper describes the technology and implementation considerations involved in working with the network teaming services offered by the Intel ANS software shipped with Dell's server and storage products. The goal of the Intel teaming services is to provide fault tolerance and link aggregation across a team of two or more ports. The information in this document will assist IT professionals during the deployment and troubleshooting of server applications that require network fault tolerance, load balancing, and/or VLANs.

1.1 Key Definitions and Acronyms


Term    Definition
AFT     Adapter Fault Tolerance
ALB     Adaptive Load Balance
ANS     Advanced Networking Services
ARP     Address Resolution Protocol
DNS     Domain Name Service
FEC     Fast EtherChannel
FTP     File Transfer Protocol
G-ARP   Gratuitous Address Resolution Protocol
GEC     Gigabit EtherChannel
ICMP    Internet Control Message Protocol
IGMP    Internet Group Management Protocol
IP      Internet Protocol
IPX     Internet Packet Exchange
LACP    IEEE 802.3ad Link Aggregation Control Protocol
LOM     LAN On Motherboard
MAC     Media Access Control
NDIS    Network Driver Interface Specification
NLB     Network Load Balancing (from Microsoft)
PXE     Pre-Execution Environment
RAID    Redundant Array of Inexpensive Disks
RLB     Receive Load Balance
SLA     Static Link Aggregation
SFT     Switch Fault Tolerance
STP     Spanning Tree Protocol
TCP     Transmission Control Protocol
UDP     User Datagram Protocol
WINS    Windows Name Service
WLBS    Windows Load Balancing Services

Table 1. Glossary of Terms

1.2 Teaming Concepts


The concept of grouping multiple physical devices to provide fault tolerance and load balancing is not new; it has been around for years. Storage devices use RAID technology to group individual hard drives. Switch ports can be grouped together using technologies such as Cisco Gigabit EtherChannel, IEEE 802.3ad Link Aggregation, Bay Networks Multilink Trunking, and Extreme Networks Load Sharing. Beginning in 1996, Intel introduced fault tolerance functionality for Intel Architecture 32-bit Local Area Network (LAN) server adapters. Features have been added since then to enhance reliability and accommodate the migration of applications used by and deployed to servers. Network interfaces on Dell servers can be grouped together into a team of physical ports called a virtual interface.

1.2.1 Network Addressing

To understand how teaming works, it is important to understand how node communications work in an Ethernet network. This paper assumes that the reader is familiar with the basics of IP and Ethernet network communications. The following information provides a high-level overview of the concepts of network addressing used in an Ethernet network.

Every Ethernet network interface in a host platform such as a server requires a globally unique Layer 2 address and at least one globally unique Layer 3 address. Layer 2 is the Data Link Layer and Layer 3 is the Network Layer, as defined in the OSI model. The Layer 2 address is assigned to the hardware and is often referred to as the MAC address or physical address. This address is pre-programmed at the factory and stored in NVRAM on a network interface card or on the system motherboard for an embedded LAN interface. The Layer 3 addresses are referred to as the protocol or logical addresses and are assigned to the software stack. IP and IPX are examples of Layer 3 protocols. In addition, Layer 4 (the Transport Layer) uses port numbers for each upper-level network protocol such as Telnet or FTP. These port numbers are used to differentiate traffic flows across applications. Layer 4 protocols such as TCP or UDP are most commonly used in today's networks. The combination of the IP address and the TCP port number is called a socket.

Ethernet devices communicate with other Ethernet devices using the MAC address, not the IP address. However, most applications work with a host name that is translated to an IP address by a naming service such as WINS or DNS. Therefore, a method of identifying the MAC address assigned to the IP address is required. The Address Resolution Protocol (ARP) provides this mechanism for an IPv4 network. For IPX, the MAC address is part of the network address and ARP is not required. ARP is implemented using an ARP Request and an ARP Reply frame. ARP Requests are typically sent to a broadcast address, while the ARP Reply is typically sent as unicast traffic. A unicast address corresponds to a single MAC address or a single IP address. A broadcast address is sent to all devices on a network.
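As a small, hedged illustration of these addressing concepts (standard Python, unrelated to the ANS software itself), the sketch below resolves a host name to an IP address, the job normally performed by a naming service such as DNS or WINS, and opens a TCP connection, forming the (IP address, port) socket pair described above. The host name and port are placeholders.

```python
import socket

# Resolve a host name to an IP address, the role played by a naming
# service such as DNS or WINS ("example.com" is only a placeholder).
ip_address = socket.gethostbyname("example.com")

# Layer 4 uses port numbers to separate traffic flows; the combination
# of the IP address and the TCP port number is the socket for this flow.
tcp_socket = socket.create_connection((ip_address, 80), timeout=5)
print("Connected to socket:", tcp_socket.getpeername())  # (ip, port) pair

# Delivery to a MAC address (resolved via ARP on an IPv4 network) happens
# below this API and is handled entirely by the OS network stack.
tcp_socket.close()
```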

1.2.2 Teaming and Network Addresses

A team of ports functions as a single virtual network interface and does not appear any different to other network devices than a non-teamed port. A virtual network interface advertises a single Layer 2 address and one or more Layer 3 addresses to the OS. When the teaming driver initializes, it selects a MAC address from one of the physical ports that make up the team to be the team MAC address. When the server hosting the team receives an ARP Request, it selects one MAC address from among the physical ports in the team to use as the source MAC address in the ARP Reply. In Windows operating systems, the IPCONFIG /all command shows the IP and MAC address of the virtual interface, not those of the individual physical ports. The protocol IP address is assigned to the virtual network interface and not to the individual physical ports.

For switch-independent teaming modes, all physical ports that make up a virtual interface must use the unique MAC address assigned to them when transmitting data. That is, the frames sent by each of the physical ports in the team must contain a unique originating MAC address to be IEEE compliant. It is important to note that ARP cache entries are not learned from received frames, but only from ARP Requests and ARP Replies. All ARP traffic is transferred through the primary port.

1.2.3 Description of teaming modes

There are four criteria for classifying teaming modes:

1. The mode is switch dependent and requires a properly configured switch that supports FEC, GEC, and/or IEEE 802.3ad.
2. The mode requires the switch to have the Link Aggregation Control Protocol enabled.
3. The mode supports only failover.
4. The mode supports both failover and load balancing (either Tx only or both Tx and Rx).

The following table shows a summary of the teaming modes and their classification. The following sections provide an overview of the teaming modes, while details about MAC addresses and protocols are contained in the section on teaming mechanisms.

Adapter Fault Tolerance (AFT): failover; no load balancing; not switch dependent; LACP not required.
Switch Fault Tolerance (SFT): failover; no load balancing; not switch dependent; LACP not required.
Adaptive Load Balancing (ALB): failover; load balancing (TX only); not switch dependent; LACP not required.
ALB with Receive Load Balancing: failover; load balancing (TX and RX); not switch dependent; LACP not required.
Static Link Aggregation (FEC/GEC)/802.3ad static: failover; load balancing (TX and RX); switch dependent (switch must support the specific teaming mode); LACP not required.
Dynamic Link Aggregation (802.3ad): failover; load balancing (TX and RX); switch dependent; LACP support required on the switch.


Table 2. Teaming Mode Selections Offered by Intel

1.2.3.1 Adapter Fault Tolerance (AFT)

Adapter Fault Tolerance provides automatic redundancy for your server's network connection. If the primary port fails, the secondary port takes over. AFT supports two to eight ports per team. This teaming mode works with any switch, and all team members must be connected to the same network. While AFT will also work with hubs, that is recommended only for troubleshooting purposes. AFT is inherent in all teaming modes and serves as the basis for each of the following teaming modes.

1.2.3.2 Switch Fault Tolerance (SFT)

Switch Fault Tolerance (SFT) provides a failover relationship between two ports when each port is connected to a separate switch. SFT supports two ports per team. Spanning Tree Protocol (STP) must be enabled on the switches, except that the switch ports connected to the teamed ports should have Port Fast or Edge Port enabled. This teaming mode works with any switch.

1.2.3.3 Adaptive Load Balancing (ALB)

Adaptive Load Balancing (ALB) provides load balancing of transmit (outbound) traffic. By default, ALB also includes Receive Load Balancing (RLB). Together these two teaming modes permit load balancing in both the transmit and receive directions. RLB can be independently disabled. This teaming mode works with any switch.

1.2.3.4 Static Link Aggregation (SLA) - FEC, GEC, Static IEEE 802.3ad

Static Link Aggregation covers the FEC, GEC, and static 802.3ad protocols. SLA is a switch-assisted teaming mode and requires configuring ports at both ends of the link: server interfaces and switch ports. This is often referred to as Cisco Fast EtherChannel or Gigabit EtherChannel. In addition, SLA supports similar implementations by other switch OEMs, such as Extreme Networks Load Sharing and Bay Networks Multilink Trunking, as well as IEEE 802.3ad Link Aggregation static mode.

1.2.3.5 Dynamic Link Aggregation (DLA) - LACP (IEEE 802.3ad)

Dynamic Link Aggregation (DLA) is similar to SLA except that it uses the Link Aggregation Control Protocol to negotiate the ports that will make up the team. LACP must be enabled at both ends of the link for the team to be operational. This teaming mode requires a switch that fully supports the IEEE 802.3ad standard. If LACP is not available at both ends of the link, 802.3ad provides a manual aggregation that only requires both ends of the link to be in a link-up state. Because manual aggregation activates a member link without performing the LACP message exchanges, it should not be considered as reliable and robust as an LACP-negotiated link. LACP automatically determines which member links can be aggregated and then aggregates them. It provides for the controlled addition and removal of physical links for the link aggregation so that no frames are lost or duplicated. The removal of aggregate link members is provided by the marker protocol, which can optionally be enabled for LACP-enabled aggregate links. Link aggregation combines the individual capacity of multiple links to form a high-performance virtual link.

The failure or replacement of a link in an LACP trunk will not cause loss of connectivity. When a link fails, its traffic simply fails over to the remaining links in the trunk and is balanced among them. When a link is replaced, the switch reallocates the load balancing to include the new link.

1.3 Software Components


Intel implements teaming in its Advanced Networking Services (ANS) intermediate driver. An intermediate driver resides between the protocol software and the base driver that communicates directly with the networking hardware. This software component works with the protocol stack, the OS interfaces, and the base driver to enable the teaming architecture (see Figure 3). The base driver controls the host LAN controller directly to perform functions such as sending data, receiving data, and interrupt processing. The intermediate driver fits between the base driver and the protocol layer, such as TCP/IP, multiplexing several base driver instances and creating a virtual port that looks like a single port to the protocol layer. The OS provides an interface to enable communication between either base drivers or intermediate drivers and the protocol stack. The protocol stack implements IP, IPX, and ARP. A protocol address such as an IP address is assigned to each miniport device instance, but when an intermediate driver is installed, the protocol address is assigned to the virtual team port and not to the individual networking devices that make up the team.

The Intel-supplied teaming support is provided by three individual software components that work together and are supported as a package. When one component is upgraded, all the other components must be upgraded to the supported versions. The following table describes the three software components and their associated files for supported operating systems.

Software Component: Base Driver
Intel Name: Intel Base Driver
Windows: e100b325.sys, e1000325.sys, ixbg325.sys
Linux: e100.o, e1000.o, ixgb.o
NetWare: ce100b.lan, ce1000.lan

Software Component: Intermediate Driver
Intel Name: Intel Advanced Networking Services (ANS)
Windows: iAnsWxp.sys, iAnsw2k.sys, iAnsw64.sys, iAnsw32e.sys
Linux: ians.o
NetWare: ians.lan

Software Component: Configuration GUI
Intel Name: Intel PROSet
Windows: PROSet.msi
Linux: xprocfg
NetWare: N/A

Table 3. Intel Teaming Software Components

1.4 Hardware Requirements


The various teaming modes described in this document place certain restrictions on the networking equipment used to connect clients to teamed servers. Each type of network interconnect technology has an effect on teaming as described below.

1.4.1 Repeater Hub

A repeater hub allows a network administrator to extend an Ethernet network beyond the limits of an individual segment. The repeater regenerates the input signal received on one port onto all other connected ports, forming a single collision domain. This means that when a station attached to a repeater sends an Ethernet frame to another station, every station within the same collision domain will also receive that message. If two stations begin transmitting at the same time, a collision occurs, and each transmitting station must retransmit its data after waiting a random amount of time. The use of a repeater hub requires that each station participating in the collision domain operate in half-duplex mode. Although half-duplex mode is supported for Gigabit Ethernet devices in the IEEE 802.3 specification, it is not supported by the majority of Gigabit Ethernet controller manufacturers and is not considered here. Teaming across repeater hubs is supported only for AFT teams, and only for troubleshooting purposes such as connecting a network analyzer.

1.4.2 Switching Hub

Unlike a repeater hub, a switching hub (or more simply a switch) allows an Ethernet network to be broken into multiple collision domains. The switch is responsible for forwarding Ethernet packets between hosts based solely on Ethernet MAC addresses. A physical network port that is attached to a switch may operate in half-duplex or full-duplex mode. To support FEC, GEC, 802.3ad Static Link Aggregation, or 802.3ad Dynamic Link Aggregation, a switch must specifically support such functionality. If the switch does not support these protocols, it may still be used for Adapter Fault Tolerance, Adaptive Load Balancing, Receive Load Balancing, and Switch Fault Tolerance.

1.4.3 Router

A router is designed to route network traffic based on Layer 3 or higher protocols, although it will often also work as a Layer 2 device with switching capabilities. Teaming ports connected directly to a router is not supported.

1.5 Supported Teaming by OS


All teaming modes are supported for the IA-32 server operating systems, as shown in Table 4.

Teaming Mode                   Windows  Linux  NetWare
AFT                            Yes      Yes    Yes
ALB                            Yes      Yes    Yes
ALB/RLB                        Yes      Yes    Yes
SFT                            Yes      Yes    Yes
SLA (FEC/GEC/802.3ad static)   Yes      Yes    Yes
DLA (802.3ad dynamic)          Yes      Yes    Yes
Table 4. Teaming Support by Operating System

1.6 Utilities for Configuring Teaming by OS


Table 5 lists the tools used to configure teaming in the supported operating system environments.

Operating System         Configuration Tool
Windows (all versions)   Intel PROSet
NetWare 5/6              Autoexec.ncf; iAns.lan; Inetcfg
Linux                    xprocfg

Table 5. Operating System Configuration Tools

Intel's PROSet (see Figure 1) is a graphical user interface for managing Intel's network products and ANS. PROSet is designed to run on Linux, Microsoft Windows 2000, and Windows Server 2003. PROSet is used to perform diagnostics, configure load balancing and fault tolerance teaming, and configure VLANs. In addition, it displays the MAC address, driver version, and status information. On Linux, the PROSet executable is known as xprocfg. Xprocfg can be used in custom initialization scripts; please read your distribution-specific documentation for more information on your distribution's startup procedures.

Figure 1. Intel PROSet for Windows


When a port configuration is saved in NetWare, the NetWare install program adds load and bind statements to the Autoexec.ncf file. By accessing this file, you can verify the parameters configured for each port, add or delete parameters, or modify parameters.

1.7 Supported Features by Team Type


Table 6 provides a feature comparison across the teaming modes supported by Dell. Use this table to determine the best teaming mode for your application. The teaming software supports up to 8 ports in a single team. Intel ANS does not put a limit on the number of teams that can be created; the practical number of teams is limited by the capabilities of the system. The teams can be any combination of the supported teaming modes but must be on separate networks or subnets. However, all members of a given team must be in the same network or broadcast domain.

Teaming modes compared (and their teaming function): AFT (fault tolerance), SFT (fault tolerance), ALB/RLB (load balancing), Static Link Aggregation - FEC/GEC/IEEE 802.3ad static (switch dependent), and Dynamic Link Aggregation - IEEE 802.3ad (switch dependent).

Number of ports per team (same broadcast domain): AFT 2-8; SFT 2; ALB/RLB 2-8; Static 2-8; Dynamic 2-8.
Number of teams: limited by system capabilities for all modes; Dell supports up to 4 teams per system.
NIC fault tolerance: AFT Yes; SFT Yes; ALB/RLB Yes; Static Yes; Dynamic Yes.
Switch link fault tolerance (same broadcast domain): AFT Yes; SFT Yes; ALB/RLB Yes; Static switch dependent; Dynamic switch dependent.
TX load balance: AFT No; SFT No; ALB/RLB Yes; Static Yes (performed by ANS); Dynamic Yes (performed by ANS).
RX load balance: AFT No; SFT No; ALB/RLB Yes; Static Yes (performed by switch); Dynamic Yes (performed by switch).
Requires compatible switch: AFT No; SFT No; ALB/RLB No; Static Yes; Dynamic Yes.
Heartbeats to check connectivity: AFT Yes (ANS probe packets); SFT No; ALB/RLB Yes (ANS probe packets); Static No; Dynamic Yes (LACPDU probe packets).
Mixed media (ports with different media): AFT Yes; SFT Yes; ALB/RLB Yes; Static switch dependent; Dynamic switch dependent.
Mixed speeds (ports that do not support a common speed but can operate at different speeds): AFT Yes; SFT Yes; ALB/RLB Yes; Static No; Dynamic Yes.
Mixed speeds (ports that support a common speed but can operate at different speeds): AFT Yes; SFT Yes; ALB/RLB Yes; Static No (must be the same speed); Dynamic Yes.
Load balances TCP/IP: AFT No; SFT No; ALB/RLB Yes; Static Yes; Dynamic Yes.
Mixed vendor teaming: AFT Yes*; SFT Yes*; ALB/RLB Yes*; Static Yes*; Dynamic Yes*.
Load balances non-IP protocols: AFT No; SFT No; ALB/RLB Yes (IPX outbound traffic only); Static Yes; Dynamic Yes.
Same MAC address for all team members: AFT No; SFT No; ALB/RLB No; Static Yes; Dynamic Yes.
Same IP address for all team members: AFT Yes; SFT Yes; ALB/RLB Yes; Static Yes; Dynamic Yes.
Load balancing by IP address: AFT No; SFT No; ALB/RLB Yes; Static Yes; Dynamic Yes.
Load balancing by MAC address: AFT No; SFT No; ALB/RLB No; Static Yes (for receive load balancing); Dynamic Yes (for receive load balancing).

* Requires at least one Intel port in the team.

Table 6. Comparison of Teaming Modes

1.8 Selecting a team type


Intel offers several different teaming modes because each mode offers different advantages depending on your networking infrastructure and the networking demands placed on your servers. The simplest mode is Adapter Fault Tolerance. It offers reliability across the network ports on a single machine, but it does nothing to increase networking bandwidth. If your reliability concern involves failure of the switch or of the connection between the server and the switch, Switch Fault Tolerance provides the redundancy you need with minimal impact on CPU utilization, but again no increase in networking bandwidth.

If you have the proper types of switches, you may want to try either the SLA or the IEEE 802.3ad Dynamic mode. In these modes the switch manages receive-side load balancing while ANS handles transmit-side load balancing. SLA and IEEE 802.3ad have the limitation of requiring a particular capability in the switch, along with the complexity of making sure the switches are properly configured. If your networking infrastructure uses switches of several different types from different vendors, you may not be able to use the same configuration across all of your servers. If this is a concern, increasing networking throughput via ALB with RLB is your best choice. It will provide you with increased throughput and fault tolerance among the ports within the system.

Because choosing a mode is based primarily on whether the emphasis is on throughput or CPU utilization, the process is pretty much the same regardless of the configuration of the servers. The following flowchart provides a general decision flow when planning for teaming. As indicated above, there is no single solution that is correct for all environments, and you need to consider your complete environment.


Do you need increased bandwidth or fault tolerance?

- Neither: no teaming is needed.
- Fault tolerance:
  - Do you need fault tolerance across switches? If yes, set up a Switch Fault Tolerance team.
  - If not, and you want fault tolerance across adapters, set up an Adapter Fault Tolerance team.
- Increased bandwidth:
  - Are you using a switch that supports IEEE 802.3ad LACP? If yes, set up a Dynamic Link Aggregation team.
  - If not, are you using a switch that supports static link aggregation? If yes, set up a Static Link Aggregation team.
  - If not, set up an Adaptive Load Balancing team (with Receive Load Balancing).

Figure 2. Process for Selecting a Team Type

2 Teaming Mechanisms
This section provides an overview of how the Intel ANS intermediate driver is implemented and how it performs load balancing and failover. The discussion is intended to be generic; it is not specific to any particular operating system and does not provide implementation details.

2.1 Architecture
The primary function of ANS is to provide fault tolerance and to load balance inbound and outbound traffic among the networking ports installed in the system and grouped for teaming. The inbound and outbound load balancing algorithms are independent of each other: the outbound traffic for a particular Layer 3 connection can be assigned to a given port while its corresponding inbound traffic is assigned to a different port. The ANS driver also provides VLAN tagging support through its virtual networking interface, which may be bound to a single stand-alone port or to a team of ports.

For the purposes of this discussion, the network stack in an OS can be considered to consist of the protocol stack (e.g., TCP/IP, IPX), a base driver that understands the underlying hardware, and the networking device or port. Generally an application talks to the protocol stack, which talks to the base driver, which talks to the networking device. Usually there is some glue provided by the OS to manage the three pieces.

The Intel Advanced Network Service is implemented as an intermediate driver. It operates below protocol stacks such as TCP/IP and IPX and exposes a virtual networking interface to the protocol layers. This virtual networking interface uses the MAC Address of a chosen port in the team. A Layer 3 address must also be configured for the virtual network interface.

Figure 3. Intermediate Driver (the application and protocol drivers, such as TCP/IP and IPX/SPX, sit above the iANS intermediate driver; its De/MUX layer fans out to several LAN base driver instances, each of which controls a NIC)

When the ANS team interface is initialized in the system, it binds itself to the TCP/IP protocol. It may therefore either obtain a DHCP IP address or be assigned a static IP address by the user. The teamed members' bindings to TCP/IP are removed, and they are instead bound to the ANS protocol edge that the ANS intermediate driver exposes. As a result, the ports are not seen by the TCP/IP layer in the OS and cannot obtain or be assigned an IP address. In addition, ANS associates the MAC address of the primary port in the team with the team's TCP/IP address. If the primary network connection is an adapter that is hot-plugged out of the server and then inserted somewhere else in the network, the MAC address will show up on the network in two places unless the server is rebooted. On reboot, the team will use the MAC address of the new primary port.

2.1.1 Adapter Fault Tolerance (AFT)

An AFT team provides basic fault tolerance functionality and can be configured with two or more physical ports.

An intrinsic primary port is chosen by ANS (based on the speed and capabilities of the ports that are part of the team) and is the only port in the team that is Active and transmits or receives client data. The option also exists for a user-designated preferred primary port, which performs the same functions as an intrinsic primary. Secondary ports are in Standby mode, ready to replace the primary port if it fails. If the primary port fails for any reason (e.g., port hardware failure, cable loss, switch/hub fault), a secondary port assumes the properties of the primary port and is made Active. The new primary port continues the communication with clients. The process is transparent to the clients and the user application.

Ports may be in a Disabled, Standby, or Active state. A Disabled state for a port indicates that the port is not functioning as part of the team and will not be used for failover. A Disabled state for a primary port indicates that failover to a standby port is imminent. AFT uses four indicators to detect whether a failover is needed: the primary's link status, the primary's hardware status, a probe mechanism between the members of the team, and the primary port's packet receive counters. These are described later in this section. When a failover occurs, the load is transferred from the primary to the secondary fast enough to prevent user protocol sessions (e.g., TCP connections, file transfers) from being disconnected. After failover to the secondary, if the user chose a preferred primary port in the team, then once the preferred primary's link is restored all connections are restored back to the preferred primary. To achieve this fail-back, the preferred primary is again made Active and the secondary is returned to Standby status.

In this team mode, all the ports in the team use the MAC address of the primary port. The port that is currently active and transmitting in the team uses the primary MAC as the source MAC address in the Ethernet header of the transmitted packet. This ensures that the corresponding switch port always learns the primary MAC address, and it also ensures transparency on failover to a secondary port.
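The simplified Python sketch below is not Intel's implementation; it only models the AFT behavior described above under stated assumptions: an intrinsic primary is chosen as the fastest healthy port, secondaries wait in standby, and a failure promotes a standby to active while the team keeps using the primary MAC address. The port names, speeds, and MAC address are placeholders.

```python
from dataclasses import dataclass

@dataclass
class Port:
    name: str
    speed_mbps: int
    link_up: bool = True

class AftTeam:
    """Toy model of AFT primary selection and failover (illustrative only)."""

    def __init__(self, ports, preferred_primary=None):
        self.ports = ports
        # Intrinsic primary: the fastest healthy port, unless the user
        # designated a preferred primary port.
        self.primary = preferred_primary or max(
            (p for p in ports if p.link_up), key=lambda p: p.speed_mbps)
        self.team_mac = "00:A0:C9:11:22:33"  # placeholder: MAC of the primary port,
                                             # used by whichever port is active

    def active_port(self):
        if self.primary.link_up:
            return self.primary
        # Failover: promote the first healthy standby; frames still carry the
        # team (primary) MAC, so the change is transparent to clients.
        for standby in self.ports:
            if standby is not self.primary and standby.link_up:
                return standby
        raise RuntimeError("no healthy ports left in the team")

team = AftTeam([Port("LOM1", 1000), Port("LOM2", 1000)])
print(team.active_port().name)   # LOM1 (primary)
team.primary.link_up = False     # simulate a cable pull on the primary
print(team.active_port().name)   # LOM2 (standby takes over)
```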

2.1.2 Switch Fault Tolerance (SFT)

A Switch Fault Tolerant team provides fault tolerance between ports connected to two different switches. This team can be configured with a maximum of two ports. In such a configuration, the port connected to one of the switches is the Active connection and the other port is in Standby mode. The Active port may be the intrinsic primary chosen by the intermediate driver or the preferred primary port selected by the user. When the Active port loses link, ANS uses the secondary port connected to the second switch as backup and activates it. Because one switch serves as the redundant path in this mode, the two switches must be cross-connected so that when the primary switch fails, all active connections can be forwarded through the redundant switch. Consequently, STP must be enabled on all switches in the network to resolve the resulting loops; however, the switch ports connected to the teamed ports should have Port Fast or Edge Port enabled.

If the user chooses a preferred primary port in the team and failover occurs to the secondary port, then when the switch port connected to the preferred primary is functional again, the intermediate driver handles fail-back of connections just as in AFT. However, a time delay of 60 seconds is introduced here to allow for STP convergence on the switch network. This safeguards against the packet loss that could occur by activating the preferred primary before its switch port becomes fully active.


2.1.3 Failover Decision Events

There are four methods used to determine when to perform a failover from a primary to a secondary: link status, hardware status, probe packets, and packets received.

2.1.3.1 Probe Mechanism

The ANS probe mechanism is used to detect port or network path problems between primary and secondary ports using probe packets. Probes are Layer 2 packets. Each port in the team sends a probe packet at a constant (user-defined) interval to the other ports in the team. This mechanism is used as a factor in failover decisions when there are three or more members in a team. Using probe packets, the secondary ports send out a packet that is then detected by the primary. This indicates to ANS that the secondary port is functional and ready to take over should a failure in the primary port occur.

Every port holds two state flags that indicate when the port has successfully sent or received a probe. When a port in the team receives a probe packet, it sets its own receive flag and then sets the sent flag of the port that sent the probe. If the sent and received flag states for a port are not updated successfully, the intermediate driver concludes that there are link or hardware problems with that port. Because the success of the probe packet is based on both the sent and received flags being updated successfully, this mechanism cannot contribute to failover decisions if there are only two members in a team. When there are only two members in a team and a link or hardware malfunction occurs on one of the ports, the sent and received flags of both ports remain in a pending state, and ANS cannot determine whether the sending port or the receiving port has failed. With two ports in a team, ANS uses other methods to determine when a port failure has occurred.

Probe packets can be broadcast or multicast. In the format of a probe packet, the packet type is the reserved EtherType 0x886D. When examining a probe packet in a sniffer application, this type indicates that probes from Intel PRO network connections are being transmitted. ANS looks for this type of packet in order to determine whether probe packets are being sent and received across the network connections in the team.

Probe Retries: Congestion on the network may delay probes or cause probe packet loss. A port in such a network could have its probe-received flag updated successfully while its probe-sent flag remains in a pending state. In such situations, when probes are not being received or sent correctly by one of the ports, a retry mechanism starts. The retry mechanism delays the probe check time until all retries have been sent, to account for the network congestion. The number of retries is configurable through PROSet.

Probe Burst: In addition to the retry mechanism, there is also a burst mechanism. If the retry mechanism times out without successfully sent and received probes, ANS sends a burst of probe packets. This mechanism decreases the probability that probes are lost due to stress on the switch.

Probe packets are not used in all teaming modes. The teaming modes that use probes by default are AFT and ALB/RLB. SFT does not use probes by default, since this team mode supports a maximum of two ports in the team. ANS probe packets are not used in FEC, GEC, or IEEE 802.3ad static and dynamic link aggregation, which have their own switch-controlled link state mechanisms for detecting network connection failures.
With these teaming modes, all members of the team have the same media access control (MAC) address, so ANS probe packets cannot be directed to each individual port in the team.
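As a hedged aid for verifying that probe frames are actually on the wire, the sketch below uses the third-party Scapy library (an assumption; it is not part of ANS) to capture frames whose EtherType matches the reserved 0x886D value mentioned above. It must run with packet-capture privileges, and the interface name and frame count are placeholders.

```python
from scapy.all import sniff  # third-party package: pip install scapy

PROBE_ETHERTYPE = 0x886D  # reserved EtherType carried by ANS probe packets

def report(frame):
    # Print the source and destination MAC of each captured probe frame.
    print(f"probe: {frame.src} -> {frame.dst}")

# Capture ten frames with the probe EtherType on interface "eth0"
# (interface name and count are placeholders; requires root privileges).
sniff(iface="eth0", filter="ether proto 0x886D", prn=report, count=10)
```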

2.1.3.2 Link Status

Link status allows ANS to determine whether a port is in an active state and able to communicate on the network. The base drivers for the ports that are part of the team detect link status for the hardware and notify the upper layers of the stack when link-up and link-down events occur. ANS receives these events via the NDIS wrapper, and this indication is a direct stimulus for the intermediate driver to perform a failover if the link-down event is associated with the primary/active port in the team. It is important to note that loss-of-link detection is only possible between the ports in the team and their immediate link partners. ANS has no way of reacting to other hardware failures in the switches and cannot detect loss of link on other ports.

2.1.3.3 Hardware Status

When the team is initialized by the intermediate driver, each of the base drivers that are part of the team is initialized. During this initialization sequence, the base driver objects are opened, and handles to these objects are obtained and stored. If ANS is unable to obtain the handle for a base driver, it is assumed that the particular port was not initialized properly, and its hardware status is marked as not-ready. Such a port is not involved in communication and is not used in the team.

2.1.3.4 Packet Receive Counters

Packet receive counters are updated in the base driver with every packet that is received by a particular port. When making failover decisions, if the primary port's link is functional but probes sent from the primary are not being received by the other ports in the team due to network conditions (the probe-sent flag of the primary is not updated successfully), the packet receive counters for the primary port are checked to ensure that it is functional and receiving network traffic, which includes broadcast packets sent by other nodes in the network. If the packet receive counters are incrementing as normal, failover is not induced.
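ANS itself learns about link changes through NDIS indications from the base drivers, as described above. As a rough, hedged illustration of observing the same kind of information from user space (not how ANS works internally), the sketch below polls interface link state with the third-party psutil package; interface names come from the running system and the polling interval is arbitrary.

```python
import time
import psutil  # third-party package: pip install psutil

def poll_link_status(interval_seconds=2):
    """Print a message whenever an interface's link state changes."""
    previous = {}
    while True:
        for name, stats in psutil.net_if_stats().items():
            if name in previous and previous[name] != stats.isup:
                state = "up" if stats.isup else "down"
                print(f"link on {name} changed to {state}")
            previous[name] = stats.isup
        time.sleep(interval_seconds)

# poll_link_status()  # uncomment to run; stop with Ctrl+C
```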

2.1.4 Adaptive Load Balancing (Outbound Traffic Control)

In Adaptive Load Balancing (ALB), transmit traffic is balanced across the connections in the team to increase transmit throughput. To perform transmit load balancing, a hash table is used to assign flows destined for a particular end client to one of the ports in the team. The hash algorithm uses a Layer 3 hash index derived from the last octet of the end client's IP address. The transmit traffic from the server is then transmitted through the team member corresponding to that index in the hash table. New data flows from the server are assigned to the least loaded member in the team, and this member is placed in the table at the client's hash index. The clients continue to receive traffic from that particular team member until a pre-configured load balancing interval timer expires. When the interval timer expires, data flows are rebalanced among the network connections in the team. All traffic is received by the server on the primary port in the team unless RLB is used to balance the received traffic.

In ALB mode (without Receive Load Balancing enabled), the MAC address of the primary team member is used in the ARP reply packet and is not load balanced. This ensures that clients learn the MAC address of the primary port, and all receive traffic is therefore directed to the primary port in the team. All other packets transmitted from the teamed ports are load balanced and carry, in the Ethernet header, the MAC address of the team member doing the transmitting.

Fault tolerance is built into this team mode and includes the same functionality as described under AFT.
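The sketch below is only a toy model of the transmit hashing just described, not Intel's code: it derives an index from the last octet of the destination IP address and assigns new flows to the least-loaded team member, with a rebalance when the interval timer expires. The member names and IP addresses are placeholders.

```python
class AlbTransmitBalancer:
    """Toy model of ALB outbound flow assignment (illustrative only)."""

    def __init__(self, members):
        self.members = members
        self.hash_table = {}                   # hash index -> team member
        self.tx_bytes = {m: 0 for m in members}

    def port_for(self, dest_ip):
        # Layer 3 hash index derived from the last octet of the client IP.
        index = int(dest_ip.split(".")[-1])
        if index not in self.hash_table:
            # New flow: assign it to the least-loaded member of the team.
            self.hash_table[index] = min(self.tx_bytes, key=self.tx_bytes.get)
        return self.hash_table[index]

    def record_tx(self, dest_ip, nbytes):
        self.tx_bytes[self.port_for(dest_ip)] += nbytes

    def rebalance(self):
        # Load balancing interval expired: flows are redistributed.
        self.hash_table.clear()

balancer = AlbTransmitBalancer(["LOM1", "LOM2"])
balancer.record_tx("192.168.1.10", 1500)
balancer.record_tx("192.168.1.11", 1500)
print(balancer.port_for("192.168.1.10"), balancer.port_for("192.168.1.11"))
```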

2.1.5 Receive Side Load Balancing (Inbound Traffic Control)

RLB is a sub-feature of ALB. RLB enables all ports in the team to receive traffic directed to the team, thereby improving receive bandwidth. This feature is turned on by default when an ALB team is created. It can be disabled via the Intel PROSet GUI using the team's Advanced Settings.

When a client sends an ARP request message before beginning communication with the team, ANS takes control of the server ARP reply message that comes from the TCP/IP stack in response and copies into it the MAC address of one of the ports in the team chosen to service that particular end client, according to the RLB algorithm. When the client gets this reply message, it records the match between the team IP address and the given MAC address in its local ARP table. Subsequently, all packets from this end client are received on the chosen port. In this mode, ANS allocates team members to service end-client connections in a round-robin fashion as the clients request connections to the server. In order to achieve a fair distribution of end clients among all enabled members of the team, the RLB client table is refreshed at even intervals (the default is 5 minutes). This is the Receive Balancing Interval, which is a preconfigured setting in the registry. The refresh involves selecting new team members for each client as required; ANS then initiates ARP Replies to the affected clients with the new MAC address.

The OS can send out ARP requests at any time, and these are not under the control of the ANS driver. They are broadcast packets sent out through the primary port. Since the request packet is transmitted with the team's MAC address (the MAC address of the primary port in the team), all end clients connected to the team update their ARP tables by associating the team's IP address with the MAC address of the primary port. When this happens, the receive load of those clients collapses onto the primary port. To restart receive load balancing, ANS retransmits ARP Replies to all clients in the receive hash table that were transmitting to non-primary ports, with the MAC addresses of the respective team members. In addition, the ARP request sent by the OS is saved in the RLB hash table, and when the ARP reply is received from the end client, the client's MAC address is updated in the hash table. This is the same mechanism used to enable RLB when the server initiates the connection.

Since RLB requires that each member in the team receive packets from different clients, in this team mode all the members of the team use their own permanent MAC addresses as the source MAC in the Ethernet header for transmitted packets. Hence, the corresponding switch ports learn the MAC address of each individual member, allowing them to forward packets from end clients to the appropriate member of the team.

In the Windows operating systems, on failover from the primary to a secondary, the intermediate driver puts the secondary port into promiscuous mode and applies a filter with the primary port's MAC address to it. This protects against receive packet loss from clients that were connected to the primary. When the subsequent receive balance interval timer expires, the appropriate secondary MAC address is sent to the team's clients via an ARP reply to ensure that the receive load is balanced again across all active ports in the team. The reverse process is followed when the secondary fails and all connections fail over to the primary.
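As a hedged illustration of the receive-side mechanism described above (a conceptual model, not Intel's implementation), the sketch below assigns clients to team members round-robin, as if each client's ARP reply were rewritten with a different member's MAC address, and refreshes the table when the receive balancing interval expires. The MAC and IP addresses are made up.

```python
from itertools import cycle

class RlbReceiveBalancer:
    """Toy model of RLB client-to-member assignment (illustrative only)."""

    def __init__(self, member_macs):
        self.member_macs = member_macs
        self._next_member = cycle(member_macs)
        self.client_table = {}   # client IP -> member MAC sent in the ARP reply

    def mac_for_arp_reply(self, client_ip):
        # Each new client is answered with the MAC of the next team member,
        # so inbound traffic is spread across all enabled members.
        if client_ip not in self.client_table:
            self.client_table[client_ip] = next(self._next_member)
        return self.client_table[client_ip]

    def refresh(self):
        # Receive Balancing Interval expired: reselect members for all clients;
        # the real driver would then send updated ARP replies to those clients.
        clients = list(self.client_table)
        self.client_table.clear()
        for ip in clients:
            self.mac_for_arp_reply(ip)

rlb = RlbReceiveBalancer(["00:A0:C9:00:00:01", "00:A0:C9:00:00:02"])
for ip in ("10.0.0.5", "10.0.0.6", "10.0.0.7"):
    print(ip, "->", rlb.mac_for_arp_reply(ip))
```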
In the Linux operating system, on failover from the primary to the secondary, the intermediate driver swaps the MAC addresses between the primary and the secondary. The secondary is also put into promiscuous mode to avoid loss of packets destined for its old MAC address from clients.

Directed ARP replies are sent to clients to collapse them from the secondary to the primary. The promiscuous mode on the secondary is then closed after 10 seconds.

2.1.6 Static Link Aggregation, FEC, GEC and Static IEEE 802.3ad (SLA)

Static link aggregation is a teaming mode in which all the ports in the team share the same MAC address and are perceived as a single link from the switch's perspective. This is a switch-controlled teaming mode. It requires configuring the teamed ports on the switch with the Port Aggregation Protocol (PAgP) set to mode ON. The mode is static in the sense that the ports on the switch are active by default, and no handshake or negotiation takes place between the switch and the intermediate driver. There is no designated primary port in the team (all ports are equal), and all ports send and receive simultaneously. ANS does not implement any failover mechanism in this mode, since there are no MAC address or packet filter changes; failover and receive-side load balancing are handled entirely by the switch. ANS implements a transmit load balancing mechanism similar to the ALB mode.

When the user configures an SLA team, care should be taken that all the links are of the same speed. Since receive-side load balancing is handled by the switch, mixed-speed teams may not be effective and may not be supported by some switches. When an existing team member has to be deactivated, or when a new member has to be added to the team and activated in this team mode, care should be taken that the corresponding switch ports are already configured to use the PAgP protocol and are set to mode ON. In addition, to avoid packet loss while the switch port activates, members should be added to or removed from such teams with their cables disconnected from the corresponding switch ports.

Dynamic Link Aggregation refers to the dynamic IEEE 802.3ad teaming mode. It is similar to SLA in that all members of the team share the same MAC address and appear as one large link to the switch. This is also a switch-controlled mode; however, in this mode the corresponding teamed ports on the switch are configured to use the LACP protocol and are set to mode ACTIVE. This configuration allows the switch ports to establish dynamic communication with the ANS intermediate driver, allowing the user to dynamically add or remove members from the team without concern for packet loss, unless the member is the primary port as described below. There is no designated primary in the team (neither user-configured nor chosen by ANS); however, the first teamed port on the switch is treated as the initiator by the switch, so removal of this member from the team could lead to packet loss. To avoid this scenario, care should be taken that the switch ports are pre-configured for added or removed members before making any changes to the team itself. Check the documentation for your switch on how to configure the ports.

When members of mixed speeds are configured in this team mode, the switch splits the team into separate channels, one channel for each aggregate of members of matching speed. The user can control which channel is active on the switch by choosing either the Maximum Adapters or the Maximum Bandwidth setting from the Advanced page of the team in PROSet. Failover to the inactive channel occurs only when all the links in the active channel malfunction. Note, however, that mixed-speed IEEE 802.3ad teams may not be supported by some switches.
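The following sketch only illustrates the channel-splitting behavior described above; the grouping is conceptual, and the member names and speeds are made up. Ports are grouped by speed, and the active channel is chosen either by the largest number of adapters or by the highest aggregate bandwidth, loosely mirroring the Maximum Adapters and Maximum Bandwidth settings.

```python
from collections import defaultdict

def split_into_channels(members):
    """Group team members (name, speed in Mbps) into per-speed channels."""
    channels = defaultdict(list)
    for name, speed in members:
        channels[speed].append(name)
    return dict(channels)

def select_active_channel(channels, policy="max_adapters"):
    # "max_adapters" favors the channel with the most ports; "max_bandwidth"
    # favors the channel with the highest aggregate speed.
    if policy == "max_adapters":
        return max(channels.items(), key=lambda kv: len(kv[1]))
    return max(channels.items(), key=lambda kv: kv[0] * len(kv[1]))

team = [("LOM1", 1000), ("LOM2", 1000), ("NIC1", 100), ("NIC2", 100), ("NIC3", 100)]
channels = split_into_channels(team)
print(select_active_channel(channels, "max_adapters"))   # (100, ['NIC1', 'NIC2', 'NIC3'])
print(select_active_channel(channels, "max_bandwidth"))  # (1000, ['LOM1', 'LOM2'])
```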


2.1.8 Switch Requirements

Teams configured in AFT, ALB/RLB must have all members in the same subnet (same Layer 3 broadcast domain) in both single and multiple switch environments.

2.1.8.1 Teaming with a Single Switch

SLA (FEC, GEC and static IEEE 802.3ad) and dynamic IEEE 802.3ad teams must have all members connected to the same Layer 2 switch. These protocols do not work across switches because each of these implementations requires that all physical ports in a team share the same Ethernet MAC address.

2.1.8.2 Teaming Across Switches

RLB teams can be configured across switches. However, because RLB relies on ARP packets, which will not pass through routers, the devices between the server and the clients must not be routers; in other words, the two switches should NOT be mutually isolated networks. The diagrams below describe the operation of an ALB/RLB team across switches. We show the mapping of the ping request and ping replies in a team with two members. All servers (Blue, Gray and Red) have a continuous ping to each other. Figure 4 is a setup without the interconnect cable in place between the two switches. Figure 5 has the interconnect cable in place, and Figure 6 is an example of a failover event with the interconnect cable in place. These scenarios describe the behavior of teaming across the two switches and the importance of the interconnect link.

The diagrams show the secondary team member sending the ICMP echo requests (yellow arrows) while the primary team member receives the respective ICMP echo replies (blue arrows). This illustrates a key characteristic of the teaming software: the load balancing algorithms do not synchronize how frames are load balanced when sent or received. In other words, frames for a given conversation can go out and be received on different interfaces in the team. Therefore, an interconnect link must be provided between the switches that connect to ports in the same team.

In the configuration without the interconnect, an ICMP Request from Blue to Gray goes out port 82:83 destined for Gray port 5E:CA, but the Top Switch has no way to send it there since it cannot reach the 5E:C9 port on Gray. A similar scenario occurs when Gray attempts to ping Blue. An ICMP Request goes out on 5E:C9 destined for Blue 82:82, but cannot get there. The Top Switch does not have an entry for 82:82 in its CAM table because there is no interconnect between the two switches. However, pings flow between Red and Blue and between Red and Gray. Furthermore, a failover event would cause additional loss of connectivity. Consider a cable disconnect on the Top Switch port 4. In this case, Gray would send the ICMP Request to Red 49:C9, but because the Bottom Switch has no entry for 49:C9 in its CAM table, the frame is flooded to all of its ports but cannot find a way to get to 49:C9.


Figure 4. Teaming Across Switches without an Inter-switch Link


The addition of a link between the switches allows traffic between Blue and Gray to reach each other without any problems. Note the additional entries in the CAM table for both switches. The interconnect link is critical for the proper operation of the team. As a result, it is highly advisable to use a link aggregation trunk to interconnect the two switches to ensure high availability for the connection.

Figure 5. Teaming Across Switches with Interconnect


Figure 6 represents a failover event where the cable is unplugged on the Top Switch port 4. This is a successful failover with all stations sending pings to each other without loss of connectivity.

Figure 6. Failover event
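The CAM-table behavior behind Figures 4 and 5 can be sketched with a toy model. The sketch below is a simplification (generic station names, no STP, a single flood per frame) and is not how any particular switch is implemented; it only shows that an unknown destination MAC can never be reached when it is attached solely to the other, unconnected switch.

```python
# Toy model of MAC learning and unknown-unicast flooding on two switches.
class Switch:
    def __init__(self, name):
        self.name = name
        self.cam = {}     # learned MAC -> local port
        self.ports = {}   # port -> ("station", mac) or ("switch", other_switch, other_port)

    def receive(self, in_port, src, dst, crossed=False):
        self.cam[src] = in_port                                # learn the source MAC
        out_ports = [self.cam[dst]] if dst in self.cam else \
                    [p for p in self.ports if p != in_port]    # flood if unknown
        delivered = []
        for p in out_ports:
            kind, *rest = self.ports[p]
            if kind == "station" and rest[0] == dst:
                delivered.append((self.name, p))
            elif kind == "switch" and not crossed:             # cross the interconnect once
                other, other_port = rest
                delivered += other.receive(other_port, src, dst, crossed=True)
        return delivered

top, bottom = Switch("Top"), Switch("Bottom")

def wire(with_interconnect):
    top.cam.clear(); bottom.cam.clear()
    top.ports = {1: ("station", "srvA-nic1"), 2: ("station", "srvB-nic1")}
    bottom.ports = {1: ("station", "srvA-nic2"), 2: ("station", "srvB-nic2")}
    if with_interconnect:
        top.ports[9] = ("switch", bottom, 9)
        bottom.ports[9] = ("switch", top, 9)

# Server A transmits from its port on the Top switch toward a team MAC that
# lives on Server B's port attached to the Bottom switch.
wire(with_interconnect=False)
print(top.receive(1, "srvA-nic1", "srvB-nic2"))   # [] -> the frame never arrives
wire(with_interconnect=True)
print(top.receive(1, "srvA-nic1", "srvB-nic2"))   # [('Bottom', 2)] -> delivered via the link
```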

2.1.8.3 Routers

The device that the teamed ports are connected to must NOT be a router. The ports in the team must be in the same network. RLB must be used without a router between the server and the clients; if a router is being used, care should be taken that the RLB setting is disabled. Receive Load Balancing is not possible with clients across a Layer 3 router, since the RLB mechanism is based on the Layer 2 ARP mechanism. In the case where the clients are connected across a router, all ARP requests and responses from the clients are masked by the router. The only ARP requests and responses that the team sees are those that are sent by the router itself. Hence, in this case, the router itself becomes the single end-client for the ANS driver.


2.1.8.4 Teaming in Blade Servers with Switches

Teaming Mode | Supported in Blade Servers (Yes/No)
AFT | Yes
SFT | Yes
ALB | Yes
RLB | Yes
SLA | No
DLA | No
Table 7. Teaming modes supported in blade servers

ANS can only detect link failures between the ports in a team and the switches that the ports are directly connected to. In a blade server environment, this direct link is a hard-wired connection, which has less possibility of link failure. Since the loss of connections beyond the on-board switches cannot be detected by ANS, when teaming is used on blade servers the user has to ensure that a fully meshed network configuration exists between the switches in the blade center and the next-hop switches. The connections L1, L2, L3 and L4 in Figure 7 form a fully meshed configuration between blade switches 1 and 2 and the S1 and S2 switches. STP should be running on the network, but the internal switch ports of the blade on-board switches should be configured in Port Fast mode. Table 7 summarizes the teaming modes supported in a blade server architecture with an integrated switch. SLA and DLA are not supported since the integrated switches do not support trunking across switches.


[Figure: a compute blade with LOM 1 and LOM 2 connects through the chassis mid-plane to Blade Switch #1 and Blade Switch #2 inside the blade enclosure; links L1 through L4 connect the blade switches to the external PowerConnect 5012 switches (S1, S2, S3).]

Figure 7. Teaming with Blades

With this network configuration, failure of a link external to the blade server would cause STP to reconverge and allow traffic intended for the NIC to be forwarded through the redundant link. Hence, failover in this case is really a function of the switch network, and not a function of the intermediate driver.

2.1.9 Protocol Support

2.1.9.1 TCP/IP

IP/TCP/UDP flows are all transmit and receive load balanced. ARP flows are not load balanced. It should be noted that transmit and receive load balancing is only effective when the team (server) is servicing multiple end-clients. If the connection is point-to-point (server to a single end-client), the benefits of transmit/receive load balancing are not seen. This is because the algorithm for load balancing uses the IP address of the end-client for load-balancing purposes.
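A minimal sketch of this idea, assuming a simple modulo hash on the client's IP address (the real ANS hashing algorithm is not documented here): every non-IP frame stays on the primary port, and a single point-to-point client always maps to the same member, which is why no balancing is observed in that case.

```python
# Illustrative (not the actual ANS algorithm): choose a transmit port by
# hashing the destination IP, so each client maps to one member of the team.
import ipaddress

def pick_tx_port(dst_ip, team_ports, is_ip_traffic=True, primary=0):
    if not is_ip_traffic:                      # non-IP/TCP/UDP traffic is not balanced
        return team_ports[primary]
    key = int(ipaddress.ip_address(dst_ip))    # hash on the client's IP address
    return team_ports[key % len(team_ports)]

ports = ["NIC-A", "NIC-B", "NIC-C", "NIC-D"]
for client in ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]:
    print(client, "->", pick_tx_port(client, ports))
```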

The actual assignment between physical ports may change over time, but any protocol that is not TCP/UDP based will be transferred over the same physical port because only the IP address is used to balance transmit traffic. If SLA (FEC/GEC) or DLA (IEEE 802.3ad Dynamic) is used, the switch determines the balancing algorithm for receive traffic, and this is usually based on the MAC address.

2.1.9.2 PAgP

Port Aggregation Protocol (PAgP) is a Cisco proprietary protocol that allows the user to combine several physical links into one logical link. It aids in the automatic creation of Fast EtherChannel and Gigabit EtherChannel links. PAgP packets are sent between FEC- and GEC-capable ports on a Cisco switch in order to negotiate the forming of a channel. This protocol runs on the switch and has no interaction with the ANS intermediate driver. If the user wishes to create an FEC/GEC channel, the PAgP protocol should be configured on the respective ports and set to the ON state. ANS does not actively participate in the automatic configuration of an FEC/GEC team since the switch is responsible for administering all the links that are part of such a team.

2.1.9.3 LACP

Link Aggregation Control Protocol (LACP) is part of the IEEE 802.3ad specification that allows you to bundle several physical ports together to form a single logical channel. IEEE 802.3ad, with LACP, is an industry standard protocol, as opposed to PAgP, which is the Cisco proprietary protocol. LACP allows a switch to negotiate an automatic bundle by sending LACP packets to the peer. In this case, the team is the peer, and active communication exists between the switch teamed ports and the ANS intermediate driver. The ANS intermediate driver sends and receives LACPDUs, which are control packets used to communicate between the team and the switch.

2.1.9.4 STP

In Ethernet networks, only one active path may exist between any two bridges or switches. Multiple active paths between switches can cause loops in the network. When loops occur, some switches recognize stations on both sides of the switch. This situation causes the forwarding algorithm to malfunction, allowing duplicate frames to be forwarded. Spanning tree algorithms provide path redundancy by defining a tree that spans all of the switches in an extended network and then forcing certain redundant data paths into a standby (blocked) state. At regular intervals, the switches in the network send and receive spanning tree packets that they use to identify the path. If one network segment becomes unreachable, or if spanning tree costs change, the spanning tree algorithm reconfigures the spanning tree topology and re-establishes the link by activating the standby path. Spanning tree operation is transparent to end stations, which do not detect whether they are connected to a single LAN segment or a switched LAN of multiple segments.

Spanning Tree Protocol (STP) is a Layer 2 protocol designed to run on bridges and switches. The specification for STP is defined in IEEE 802.1d. The main purpose of STP is to ensure that you do not run into a loop situation when you have redundant paths in your network. STP detects and disables network loops and provides backup links between switches or bridges. It allows the device to interact with other STP-compliant devices in your network to ensure that only one path exists between any two stations on the network.

Once a stable network topology has been established, all bridges listen for Hello BPDUs (Bridge Protocol Data Units) transmitted from the root bridge. If a bridge does not get a Hello BPDU after a predefined interval (Max Age), the bridge assumes that the link to the root bridge is down. This bridge then initiates negotiations with other bridges to reconfigure the network and re-establish a valid network topology. The process to create a new topology can take up to 50 seconds. During this time end-to-end communications will be interrupted. The use of Spanning Tree is not recommended for ports that are connected to end stations, because by definition an end station will not create a loop within an Ethernet segment. Additionally, when a teamed port is connected to a port with Spanning Tree enabled, users may experience unexpected connectivity problems. When SFT is used, Spanning Tree is still required on the network, but not on the ports connected to the end stations; STP should only be enabled on ports connected to other switches.

2.1.9.4.1 Topology Change Notice


A bridge/switch creates a forwarding table of MAC addresses and port numbers by learning the source MAC address that it received on a particular port. The table is used to forward frames to a specific port rather than flooding the frame to all ports. The typical maximum aging time of entries in the table is 5 minutes. Only when a host has been silent for 5 minutes would its entry be removed from the table. It is sometimes beneficial to reduce the aging time. One example is when a forwarding link goes to blocking and a different link goes from blocking to forwarding. This change could take up to 50 seconds. At the end of the STP re-calculation a new path would be available for communications between end stations. However, since the forwarding table would still have entries based on the old topology, communications may not be re-established until after 5 minutes, when the affected ports' entries are removed from the table. Traffic would then be flooded to all ports and re-learned. In this case it is beneficial to reduce the aging time. This is the purpose of a TCN BPDU. The TCN is sent from the affected bridge/switch to the root bridge/switch. As soon as a bridge/switch detects a topology change (a link going down or a port going to forwarding), it sends a TCN to the root bridge via its root port. The root bridge then advertises a BPDU with a TCN to the entire network. This causes every bridge to reduce the MAC table aging time to 15 seconds for a specified amount of time. This allows the switch to re-learn the MAC addresses as soon as STP re-converges. Topology Change Notice BPDUs are sent when a port that was forwarding changes to blocking or transitions to forwarding. A TCN BPDU does not initiate an STP re-calculation. It only affects the aging time of the forwarding table entries in the switch. It will not change the topology of the network or create loops. End nodes such as servers or clients will trigger a topology change when they power off and then power back on.
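The timer arithmetic can be made concrete with a small sketch. The numbers below come from the text (roughly a 5-minute default aging time, 15-second aging after a TCN, and up to 50 seconds for STP to re-converge); the function itself is only a simplified model of how long a stale forwarding-table entry could outlive the new topology.

```python
# Simplified model: how long a stale CAM entry can linger after a topology
# change, with and without a TCN that shrinks the aging time to 15 seconds.
DEFAULT_AGING  = 300   # seconds (typical maximum aging time, 5 minutes)
TCN_AGING      = 15    # seconds, applied network-wide after a TCN
STP_RECONVERGE = 50    # worst-case seconds to build the new topology

def stale_window(last_seen, topology_change_at, tcn_sent):
    """Seconds after STP re-convergence during which the old entry may still mislead forwarding."""
    if tcn_sent:
        expiry = topology_change_at + TCN_AGING     # reduced aging starts with the change
    else:
        expiry = last_seen + DEFAULT_AGING          # entry ages from when the host was last heard
    return max(0, expiry - (topology_change_at + STP_RECONVERGE))

# Host last heard at t=0 s, link fails at t=10 s.
print(stale_window(last_seen=0, topology_change_at=10, tcn_sent=False))  # 240 s of possible blackout
print(stale_window(last_seen=0, topology_change_at=10, tcn_sent=True))   # 0 s
```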

2.1.9.4.2 Port Fast / Edge Port


In order to reduce the effect of TCNs on the network (for example, increased flooding on switch ports), end nodes that are powered on/off often should use the Port Fast or Edge Port setting on the switch port they are attached to. Port Fast or Edge Port is a command that is applied to specific ports and has the following effects:
- Ports coming up from link down to link up are put directly into the forwarding STP state instead of going from listening to learning and then to forwarding. STP is still running on these ports.
- The switch does not generate a Topology Change Notice when the port goes up or down.

2.2 Driver Support by Operating System


As previously noted, ANS is supported in the Windows 2000 Server, Windows Server 2003, NetWare, and Linux operating system environments. In a NetWare environment, NESL support is required because ANS relies on the NIC drivers to generate NESL events during link changes and other failure events. The following tables summarize the teaming mode features for each operating system.

ALB/RLB/AFT/SFT

Features | Windows (W2K/Server 2003) | NetWare (5.1, 6.x) | Red Hat Linux (AS 2.1, EL 3.0)
User interfaces | PROSet | Command line | xprocfg
Number of teams | Limited by capabilities of the system; Dell supports up to 4 teams per system | Limited by capabilities of the system; Dell supports up to 4 teams per system | Limited by capabilities of the system; Dell supports up to 4 teams per system
Number of ports per team | 8 (1) | 8 (1) | 8 (1)
Hot replace | Yes | Yes | No
Hot add | Yes | Yes | No
Hot remove | Yes | Yes | No
Link speeds support | Different speeds | Different speeds | Different speeds
Frame protocol | IP | IP/IPX | IP
Incoming packet management | ANS | ANS | ANS
Outgoing packet management | ANS | ANS | ANS
Failover events | Loss of link, probe packets, HW failure, packet receive counters | Loss of link, probe packets, HW failure, packet receive counters | Loss of link, probe packets, HW failure, packet receive counters
Failover time | <500 ms | <500 ms | <500 ms
Failback time | 1.5 sec approx. (2) | 1.5 sec approx. (2) | 1.5 sec approx. (2)
MAC address | Different | Different | Different
Multi-vendor teaming | Yes | Yes | Yes

Static Link Aggregation (FEC/GEC/Static IEEE 802.3ad)

Features | Windows (W2K/Server 2003) | NetWare (5.1, 6.x) | Red Hat Linux (AS 2.1, EL 3.0)
User interfaces | PROSet | Command line | XPROcfg
Number of teams | Limited by capabilities of the system; Dell supports up to 4 teams per system | Limited by capabilities of the system; Dell supports up to 4 teams per system | Limited by capabilities of the system; Dell supports up to 4 teams per system
Number of ports per team | 8 (1) | 8 (1) | 8 (1)
Hot replace | Yes | Yes | No
Hot add | Yes | Yes | No
Hot remove | Yes | Yes | No
Link speeds support | Speeds determined by the switch | Speeds determined by the switch | Speeds determined by the switch
Frame protocols | All | All | All
Incoming packet management | Switch | Switch | Switch
Outgoing packet management | ANS | ANS | ANS
Failover event | Loss of link only | Loss of link only | Loss of link only
Failover time | <500 ms | <500 ms | <500 ms
Failback time | 1.5 sec approx. (2) | 1.5 sec approx. (2) | 1.5 sec approx. (2)
MAC address | Same for all ports | Same for all ports | Same for all ports
Multi-vendor teaming | Yes | Yes | Yes

Dynamic Link Aggregation (IEEE 802.3ad dynamic)

Features | Windows (W2K/Server 2003) | NetWare (5.1, 6.x) | Red Hat Linux (AS 2.1, EL 3.0)
User interfaces | PROSet | Command line | XPROcfg
Number of teams | Limited by capabilities of the system; Dell supports up to 4 teams per system | Limited by capabilities of the system; Dell supports up to 4 teams per system | Limited by capabilities of the system; Dell supports up to 4 teams per system
Number of ports per team | 8 | 8 | 8
Hot replace | Yes | Yes | No
Hot add | Yes | Yes | No
Hot remove | Yes | Yes | No
Link speed support | Different speeds | Different speeds | Different speeds
Frame protocols | All | All | All
Incoming packet management | Switch | Switch | Switch
Outgoing packet management | ANS | ANS | ANS
Failover event | Loss of link, LACPDU probes | Loss of link, LACPDU probes | Loss of link, LACPDU probes
Failover time | <500 ms | <500 ms | <500 ms
Failback time | 1.5 sec approx. (2) | 1.5 sec approx. (2) | 1.5 sec approx. (2)
MAC address | Same for all ports | Same for all ports | Same for all ports
Multi-vendor teaming | Yes | Yes | Yes

(1) Two ports for SFT.
(2) Make sure that Port Fast or Edge Port is enabled.
Table 8. Teaming Attributes by OS


2.3 Supported Teaming Speeds


Table 9 summarizes the various link speeds supported by each teaming mode. Mixed speed refers to the capability of teaming ports that are running at different link speeds.

Mode | Link Speeds Supported | Traffic Direction | Speed Support
AFT | 10/100/1000 | Incoming/outgoing | Mixed speed
SFT | 10/100/1000 | Incoming/outgoing | Mixed speed
ALB | 10/100/1000 | Outgoing | Mixed speed
ALB/RLB | 10/100/1000 | Incoming/outgoing | Mixed speed
SLA (IEEE 802.3ad static) | 100/1000 | Incoming/outgoing | Same speed
DLA (IEEE 802.3ad dynamic) | 10/100/1000 | Incoming/outgoing | Mixed speed

Table 9. Link Speeds in Teaming


2.4 Teaming with Hubs (for Troubleshooting Purposes Only)


AFT, ALB and RLB teaming can be used with 10/100 hubs, but it is only recommended for troubleshooting purposes, such as connecting a network analyzer in the event that switch port mirroring is not an option.

2.4.1 Hub usage in teaming network configurations

Although the use of hubs in network topologies is functional in some situations, it is important to consider the throughput ramifications when doing so. Network hubs have a maximum of 100 Mbps half-duplex link speed, which will severely degrade performance in either a Gigabit or 100 Mbps switched-network configuration. Hub bandwidth is shared among all connected devices; as a result, when more devices are connected to the hub, the bandwidth available to any single device is reduced in direct proportion to the number of devices connected to the hub. It is not recommended to connect team members to hubs; only switches should be used to connect to teamed ports. However, AFT, ALB, and RLB teams can be connected directly to a hub for troubleshooting purposes. Connecting other types of teams to a hub can result in a loss of connectivity.

2.4.2 AFT, ALB, and RLB Teams

AFT, ALB, and RLB teams are the only teaming types that do not depend on switch configuration. The server intermediate driver handles the load balancing and fault tolerance mechanisms with no assistance from the switch. These elements of ANS make them the only team types that maintain failover and failback characteristics when team ports are connected directly to a hub.

2.4.3 AFT, ALB, and RLB Team Connected to a Single Hub

AFT, ALB, and RLB teams configured as shown in Figure 8 will maintain their fault tolerance properties. Either of the server connections could fail and network functionality would still be maintained. Clients could be connected directly to the hub and fault tolerance would still be maintained; however, server performance would be degraded.


Figure 8. Team Connected to a Single Hub

2.4.4 Static and Dynamic Link Aggregation (FEC/GEC/IEEE 802.3ad)

SLA (FEC/GEC and IEEE 802.3ad static) and DLA (IEEE 802.3ad Dynamic) teams cannot be connected to any hub configuration. These team types must be connected to a switch that has also been configured for this team type.

2.5 Teaming with Microsoft NLB/WLBS


Microsoft's Network Load Balancing (NLB) is a technology for sharing network traffic between multiple servers. This is in contrast to ANS, which shares traffic among multiple connections on the same server. NLB supports two configuration modes: one where traffic from the client is sent to the servers using a unicast address, and one where a multicast connection is established between the client and the servers.


The following table shows the teaming modes supported with NLB.

Mode | Unicast Mode | Multicast Mode
AFT | No | Yes
SFT | No | Yes
ALB | No | Yes
ALB/RLB | No | No
SLA (IEEE 802.3ad static) | No | Yes
DLA (IEEE 802.3ad dynamic) | No | Yes

Table 10. Teaming Modes for NLB

2.5.1 Unicast Mode

In Unicast mode, NLB establishes a common MAC address among all of the servers, called the cluster MAC address. The cluster MAC address will not actually match the real address of any of the ports. In general, it is invalid to have two ports with the same MAC address; therefore NLB uses the real MAC address of each port when sending a frame over the network. When an ARP request is received, NLB modifies the ARP SRC address in the ARP reply, inserting the cluster MAC address. Therefore, the client associates the cluster MAC address with the server's IP address in its ARP table. However, the packets transmitted from the NLB server are sent with the adapter's real MAC address in the Ethernet header. This is the MAC address learned by the switch; hence, the switch is completely unaware of and does not recognize the NLB cluster MAC address. When a client sends a packet to the NLB servers, it uses the cluster MAC address for the destination. When the packet gets to the switch, the switch does not recognize this cluster MAC address as belonging to one of its ports, because the switch only knows about the real MAC address for the network port. Therefore, the switch will forward the packet to all of its ports and it will be received by all of the servers in the cluster.

In RLB, ANS uses the real MAC address of the port in the ARP SRC field in ARP replies and as the source address in packets sent to the client. After NLB has modified the ARP SRC address in the reply, RLB also modifies the ARP SRC address by inserting the real MAC address of the server port to be used to receive traffic. As a result, the client will never receive an ARP reply with the cluster MAC address. Then, when the client sends a packet to the cluster, it will use the port's real MAC address. This will result in all traffic going to the same server in the NLB cluster. Although the traffic will be load balanced across the ports in the ANS RLB team, it will not be balanced across all of the servers in the NLB cluster. We do not recommend using teaming with unicast NLB clusters.
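The interaction can be reduced to two competing rewrites of the ARP reply's source MAC. The sketch below is purely illustrative (the MAC addresses are invented and the functions are stand-ins, not NLB or ANS code), but it shows why the client ends up caching a single real port MAC instead of the cluster MAC.

```python
# Illustrative only: how NLB unicast and RLB each rewrite the ARP reply's
# source MAC, and why the RLB rewrite pins all client traffic to one server.
CLUSTER_MAC = "02:bf:0a:00:00:01"           # hypothetical NLB unicast cluster MAC

def nlb_unicast_arp_rewrite(real_mac):
    return CLUSTER_MAC                       # NLB substitutes the cluster MAC

def rlb_arp_rewrite(previous_src, team_member_mac):
    return team_member_mac                   # RLB overwrites with a real port MAC

server_port_macs = ["00:aa:00:00:00:01", "00:aa:00:00:00:02"]   # invented port MACs
arp_src = nlb_unicast_arp_rewrite(server_port_macs[0])
arp_src = rlb_arp_rewrite(arp_src, server_port_macs[0])
print(arp_src)   # the client caches a real port MAC, never the cluster MAC,
                 # so the switch forwards its traffic to this one server only.
```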

2.5.2 Multicast Mode

When NLB is configured to operate in multicast mode, a multicast connection is established between the clients and the NLB server cluster. In this case the switch will recognize the members of the multicast group and send the packet to the appropriate members. Each of the connections uses its own unique MAC address in multicast mode. When a client sends a packet, it sends it to all of the members of the multicast list. In the multicast environment RLB has no mechanism to make the client aware of the real MAC addresses, because the client will always use the multicast address for sending packets. Therefore, there is no advantage to using RLB in an NLB cluster. We recommend that you disable the RLB feature in an NLB cluster in multicast mode.

3 Teaming and Other Advanced Networking Features


3.1 Hardware Offloading Features
Intel networking products provide several HW task offload features. If these features are enabled, ANS will support the same features in teams. Support is provided by ANS informing the OS that it has these capabilities. The OS will forward packets for offloading to ANS, and ANS passes the packets to the base driver and networking hardware to actually perform the function. Some of the task offloads available are Checksum Offload, Large Send Offload and Jumbo Frames.

To take advantage of the HW offloading features, all of the members of the team must support the HW offload feature and the feature must be enabled for the base driver. For example, in a four-port team, if only three of the ports support Checksum Offload, or one of the ports supports it but it is disabled, then ANS will tell the OS that the team does not support Checksum Offload. If all of the ports support Large Send Offload and it is enabled on all of the ports, then ANS will report to the OS that it supports Large Send Offload. This will cause the OS to forward packets to ANS that can take advantage of Large Send Offload. Otherwise the OS will handle the feature in SW and you will not realize the increased performance HW offloading offers.

Before creating a team, adding or removing team members, or changing advanced settings of a team member, make sure each team member has been configured similarly. Settings to check include whether a given feature is supported at the hardware and software levels, e.g. VLANs, QoS packet tagging, Jumbo Frames, and offloads such as IPSec offload. These settings can be verified using the configuration tool available on the system. The settings must be made before the port is made a part of the team and cannot be modified through the virtual teaming port.

HW Offloads | Supported by Teaming Virtual Interface (Y/N)
TCP Tx/Rx Checksum Offload | Yes
IEEE 802.1p QoS Tagging | Yes
Large Send Offload | Yes
Jumbo Frames | Yes
IEEE 802.1Q VLANs | Yes

Table 11. HW Offloading and Teaming
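Conceptually, the team's advertised offload set is the intersection of what every member supports and has enabled. A minimal sketch of that rule, with invented member names and feature flags (not the actual ANS implementation):

```python
# Sketch (not the actual ANS code): the team advertises an offload capability
# to the OS only if every member supports it and has it enabled.
def team_offload_capabilities(members):
    caps = None
    for member in members:
        enabled = {name for name, on in member["offloads"].items() if on}
        caps = enabled if caps is None else caps & enabled
    return caps or set()

team = [
    {"name": "NIC1", "offloads": {"checksum": True,  "lso": True, "jumbo": True}},
    {"name": "NIC2", "offloads": {"checksum": False, "lso": True, "jumbo": True}},
    {"name": "NIC3", "offloads": {"checksum": True,  "lso": True, "jumbo": True}},
]
print(team_offload_capabilities(team))   # {'lso', 'jumbo'} -- checksum is dropped
```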


3.1.1 Checksum Offload

Checksum offload is a feature supported by Intel network connections that allows the TCP/IP/UDP checksums for send and receive traffic to be calculated by the networking hardware rather than by the host CPU. In high-traffic situations, this can allow a system to handle more connections more efficiently than if the host CPU is forced to calculate the checksums. An Intel network connection that supports checksum offload will advertise this capability to ANS. If, as described above, all of the connections in the team support this capability, ANS will report to the OS that the team supports the ability to offload checksum calculations to the underlying hardware. When the OS sends packets to ANS without the checksum, ANS forwards them to the network connection where the calculation is performed.
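For reference, the per-packet work being offloaded is the standard 16-bit ones'-complement Internet checksum. The sketch below computes it in Python over an example IPv4 header (checksum field zeroed); it illustrates the calculation itself, not the driver or hardware interface.

```python
# The standard 16-bit ones'-complement Internet checksum (RFC 1071 style);
# this is the kind of per-packet work that checksum offload moves to the NIC.
def internet_checksum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold any carry back in
    return (~total) & 0xFFFF

# Example IPv4 header with its checksum field set to zero.
header = (b"\x45\x00\x00\x3c\x1c\x46\x40\x00\x40\x06"
          b"\x00\x00\xac\x10\x0a\x63\xac\x10\x0a\x0c")
print(hex(internet_checksum(header)))              # 0xb1e6 for this example header
```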

3.1.2 Large Send Offload

Large Send Offload is a feature provided by Intel ANS-supported network ports that relieves an upper-level protocol such as TCP from breaking a large data message into a series of smaller packets with headers appended to them. The protocol stack need only generate a single header for a data packet as large as 64 KB, and the NIC hardware will segment the data buffer into appropriately sized Ethernet frames with the correctly sequenced header (based on the single header originally provided). Again, if this feature is supported and enabled on all team members, ANS will report the capability to the OS. The OS will then forward large packets to ANS, which will in turn forward them through the base driver to the HW for processing.
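A simplified sketch of the segmentation the hardware performs on behalf of the stack: one large buffer is carved into MSS-sized segments, each of which would reuse the single header supplied by the OS with its sequence number advanced. The MSS value and structure here are illustrative, not the actual NIC firmware logic.

```python
# Sketch of what Large Send Offload delegates to the NIC: carving one large
# TCP payload into MSS-sized segments, each reusing the single header the
# stack supplied (sequence numbers advanced per segment).
def segment_large_send(payload: bytes, mss: int = 1460, start_seq: int = 0):
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        segments.append({"seq": start_seq + offset, "len": len(chunk)})
    return segments

big_buffer = b"x" * 65536                      # up to 64 KB handed down at once
segs = segment_large_send(big_buffer)
print(len(segs), segs[0], segs[-1])            # 45 segments; the last one is short
```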

3.1.3 Jumbo Frames

Jumbo frames are Ethernet frames larger than 1518 bytes, originally proposed in 1998 by Alteon Networks, subject to a maximum of 9000 bytes. They are used to reduce server CPU utilization and increase throughput; however, additional latency may be introduced. Intel ANS supports Jumbo Frames as long as all of the members of the team support Jumbo Frames. See your user guide for details on configuring your system for Jumbo Frames.

3.2 Wake on LAN


Wake on LAN is a feature that allows a system to be awakened from a sleep state by the arrival of a specific packet over the Ethernet interface. Since an ANS virtual interface is implemented as a software only device, it lacks the hardware features to implement Wake on LAN and cannot be enabled to wake the system from a sleeping state via the virtual interface. However, the physical ports support this property, even when the port is part of a team.

3.3 Preboot eXecution Environment (PXE)


The Preboot eXecution Environment (PXE) allows a system to boot from an operating system image over the network. By definition, PXE is invoked before an operating system is loaded, so there is no opportunity for the ANS intermediate driver to load and enable a team. As a result, teaming is not supported as a PXE client, though a physical port that participates in a team when the operating system is loaded may be used as a PXE client.


While a teamed port cannot be used as a PXE client, it can be used as a PXE server, which provides operating system images to PXE clients using a combination of the Dynamic Host Configuration Protocol (DHCP) and the Trivial File Transfer Protocol (TFTP). Both of these protocols operate over IP and are supported by all teaming modes.

3.4 IPMI
On systems with IPMI manageability, the IPMI traffic can be impacted when teaming is enabled and the TCO port responsible for IPMI traffic is added to a team. A TCO port is a port responsible for conducting IPMI traffic into and out of the machine. A TCO port may be configured to use a dedicated MAC address for management traffic or to share a MAC address between the network traffic and the management traffic. One important aspect of using a LOM that is part of a team for IPMI is that the teaming software or switches cannot guarantee that incoming management traffic will be directed to the LOM that is connected to the baseboard management controller. In addition, teaming does not provide fault tolerance for the management traffic, since not all members of the team (LOMs and NICs) are connected to the baseboard management controller.

3.5 802.1q VLAN Tagging Support


The term VLAN (Virtual Local Area Network) refers to a collection of devices that communicate as if they were on the same physical LAN. Any set of ports (including all ports on the switch) can be considered a VLAN. LAN segments are not restricted by the hardware that physically connects them.

IEEE approved the 802.3ac standard in 1998, defining frame format extensions to support Virtual Bridged Local Area Network tagging on Ethernet networks as specified in the IEEE 802.1Q specification. The VLAN protocol permits insertion of a tag into an Ethernet frame to identify the VLAN to which a frame belongs. If present, the 4-byte VLAN tag is inserted into the Ethernet frame between the source MAC address and the length/type field. The first 2 bytes of the VLAN tag consist of the 802.1Q tag type, while the second 2 bytes include a user priority field and the VLAN identifier (VID).

Virtual LANs (VLANs) allow the user to split the physical LAN into logical subparts. Each defined VLAN behaves as its own separate network, with its traffic and broadcasts isolated from the others, thus increasing bandwidth efficiency within each logical group. VLANs also enable the administrator to enforce appropriate security and Quality of Service (QoS) policies.

VLANs offer the ability to group computers together into logical workgroups. This can simplify network administration when connecting clients to servers that are geographically dispersed across the building, campus, or enterprise network. Typically, VLANs consist of co-workers within the same department but in different locations, groups of users running the same network protocol, or a cross-functional team working on a joint project.

Figure 9. VLANs

VLANs provide the following benefits:
- Improved network performance
- Limited broadcast storms
- Improved LAN configuration updates (adds, moves, and changes)
- Minimized security problems

To set up IEEE VLAN membership (multiple VLANs), the port must be attached to a switch with IEEE 802.1Q VLAN capability. In most environments, a maximum of 64 VLANs per port can be set up. In ANS, VLAN tagging is done according to the IEEE 802.1Q protocol, and the process is the same whether it is a single port or a team. Multiple VLANs can be configured over a single port or a team of ports, up to a maximum of 64 VLANs. Each VLAN is represented by a virtual network interface that is bound to the protocols. When the underlying HW supports VLAN tagging, ANS forwards the tagging information to the HW and relies on the HW to insert the VLAN tag into the packet. In this case the tags on received packets are also removed by the HW, and the tagging information is forwarded to ANS, which ultimately forwards it to the protocol stack along with the packet.
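The tag insertion itself is straightforward to show. The sketch below builds an 802.1Q tag (TPID 0x8100 followed by the priority bits and VID) and splices it in after the two MAC addresses; this is the operation that is handed to the hardware when tag offload is available, shown here only as a plain-Python illustration.

```python
# Building an 802.1Q-tagged frame: the 4-byte tag (TPID 0x8100 plus the
# priority/VID field) is inserted between the source MAC and the EtherType.
import struct

def add_vlan_tag(frame: bytes, vid: int, priority: int = 0) -> bytes:
    tci = (priority << 13) | (vid & 0x0FFF)      # 3-bit priority, DEI=0, 12-bit VID
    tag = struct.pack("!HH", 0x8100, tci)
    return frame[:12] + tag + frame[12:]          # after dst MAC (6) + src MAC (6)

untagged = bytes(6) + bytes(6) + struct.pack("!H", 0x0800) + b"payload"
tagged = add_vlan_tag(untagged, vid=10, priority=5)
print(tagged[12:16].hex())    # '8100a00a' -> TPID 0x8100, priority 5, VID 10
```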

3.6 802.1p QoS Tagging Support


The IEEE 802.1p standard includes a three-bit field (supporting a maximum of 8 priority levels), which allows for traffic prioritization. ANS passes packets with 802.1p QoS tagging up the stack.

4 Performance
Modern network interface cards provide many hardware features that reduce CPU utilization by offloading certain CPU-intensive operations (see Teaming and Other Advanced Networking Features). ANS will query the base drivers and hardware to determine which of the offload features are available, and will then take advantage of those features to improve performance or to reduce CPU utilization. While the load balancing algorithms increase networking throughput, they do so at the expense of CPU performance. Because of this, applications that are already CPU-bound due to heavy computations may suffer if operated over a load-balanced teamed interface. Such an application may be better suited to take advantage of the failover capabilities of ANS rather than the load balancing features, or it may operate more efficiently over a single physical port that provides a particular hardware feature such as Large Send Offload. If an application is I/O-bound, then ALB will help increase the application's performance. The following table gives an indication of the impact on CPU utilization and throughput as the number of members of a team is increased. This table is based on sending, receiving, and simultaneously sending and receiving a 1 MB file and is only representative of what you may expect. However, it shows the benefits of increased aggregate bandwidth as a function of the number of teamed members when using ALB teaming.


Mode | # Of Ports | Receive Only CPU (%) / Mbps | Transmit Only CPU (%) / Mbps | Bi-directional CPU (%) / Mbps
No Team | 1 | 9 / 948 | 3 / 948 | 10 / 1846
ALB Team | 2 | 20 / 1896 | 7 / 1896 | 29 / 3664
ALB Team | 3 | 32 / 2685 | 12 / 2821 | 39 / 4609
ALB Team | 4 | 31 / 2707 | 17 / 3783 | 36 / 4452

Table 12. ALB Teaming Performance

Note: This is not a guarantee of performance. Performance will vary based on a number of configuration factors and the type of benchmark. It does indicate that ALB provides a positive performance improvement as the number of ports in a team increases. This test was run on Microsoft Windows Server 2003.

5 Application Considerations
5.1 Teaming and Clustering
5.1.1 Microsoft Cluster Software

Dell PowerEdge cluster solutions integrate Microsoft Cluster Services with PowerVault SCSI or Dell/EMC Fibre Channel-based storage, PowerEdge servers, storage ports, storage switches and NICs to provide high-availability solutions. HA clustering supports all NICs qualified on a supported PowerEdge server. MSCS clusters support up to two nodes if you are using Windows 2000 Advanced Server; if you are using Windows Server 2003, that support extends to eight nodes. In each cluster node it is strongly recommended that customers install at least two network ports (on-board NICs are acceptable). These interfaces serve two purposes. One NIC is used exclusively for intra-cluster heartbeat communications; this is referred to as the private NIC and usually resides on a separate private subnetwork. The other NIC is used for client communications and is referred to as the public NIC. Multiple NICs may be used for each of these purposes: private, intra-cluster communications and public, external client communications. All Intel teaming modes are supported with Microsoft Cluster Software for the public NIC only. Private network port teaming is not supported. Microsoft indicates that the use of teaming on the private interconnect of a server cluster is not supported because of delays that could possibly occur in the transmission and receipt of heartbeat packets between the nodes. For best results, when you want redundancy for the private interconnect, disable teaming and use the available ports to form a second private interconnect. This achieves the same end result and provides dual, robust communication paths for the nodes to communicate over. For teaming in a clustered environment, customers are recommended to use the same brand of NICs. Figure 10 shows a two-node Fibre Channel cluster with three network interfaces per cluster node: one private and two public. On each node, the two public NICs are teamed and the private NIC is not. Teaming is supported across the same switch or across two switches. Figure 11 shows the same two-node Fibre Channel cluster in this configuration.


Figure 10. Clustering With Teaming Across One Switch

Note: Microsoft Network Load Balancing is not supported with Microsoft Cluster Software.

5.1.2 High Performance Computing Cluster

Gigabit Ethernet is typically used for the following three purposes in HPCC applications:
1. Inter-Process Communications (IPC): For applications that do not require low-latency, high-bandwidth interconnects (such as Myrinet and InfiniBand), Gigabit Ethernet can be used for communication between the compute nodes.

2. I/O: Ethernet can be used for file sharing and serving data to the compute nodes. This can be done simply by using an NFS server or parallel file systems such as PVFS.
3. Management and Administration: Ethernet is used for out-of-band (Remote Access Card) and in-band (OMSA) management of the nodes in the cluster. It can also be used for job scheduling and monitoring.

In our current HPC offerings, only one of the on-board NICs is used. If Myrinet or InfiniBand is present, this NIC serves I/O and administration purposes; otherwise it is also responsible for IPC. In case of a NIC failure, the administrator can use the Felix package to easily configure NIC 2. NIC teaming on the host side is not tested or supported in HPCC.

5.1.2.1 Advanced Features

PXE is used extensively for the deployment of the cluster (installation and recovery of compute nodes). Teaming is typically not used on the host side and it is not a part of our standard offering. Link aggregation is commonly used between switches, especially for large configurations. Jumbo Frames, although not a part of our standard offering, may provide performance improvement for some applications due to reduced CPU overhead.

5.1.3 Oracle

Dell supports teaming in both the private network (the interconnect between RAC nodes) and the public network with clients in our Oracle Solution stacks.


Figure 11. Clustering With Teaming Across Two Switches

5.2 Teaming and Network Backup


In a non-teamed environment, the overall throughput on a backup server's NIC can be reduced due to excessive traffic and NIC overloading. Depending on the number of backup servers, data streams and tape drive speed, backup traffic can easily consume a high percentage of the network link bandwidth, thus impacting production data and tape backup performance. Network backups usually consist of a dedicated backup server running tape backup software such as NetBackup, Galaxy or Backup Exec. Attached to the backup server is either a direct SCSI tape backup unit or a tape library connected through a Fibre Channel storage area network (SAN). Systems that are backed up over the network are typically called clients or remote servers and usually have a tape backup software agent

installed. Figure 12 shows a typical 1Gbps non-teamed network environment with tape backup implementation.

Figure 12. Network Backup without teaming

Because there are four client servers, the backup server can simultaneously stream four backup jobs (one per client) to a multi-drive autoloader. However, because of the single link between the switch and the backup server, a four-stream backup will easily saturate the NIC and link. If the NIC on the backup server operates at 1 Gbps (125 MB/sec), and each client is able to stream data at 20 MB/sec during tape backup, then the throughput between the backup server and switch will be 80 MB/sec (20 MB/sec x 4), which is equivalent to 64% of the network bandwidth. Although this is well within the network bandwidth range, the 64% constitutes a high percentage, especially if other applications share the same link.

Using the non-teamed topology in Figure 12, four separate tests were run to calculate the remote backup performance. In test 1 (One Stream), the backup server streamed data from a single client (Client-Server Red). In test 2 (Two Streams), the backup server simultaneously streamed data from two separate clients (Red and Blue). In test 3 (Three Streams), the backup server simultaneously streamed data from three separate clients. In test 4 (Four Streams), the backup server simultaneously streamed data from four separate clients. Performance throughput for each backup data stream is shown in Graph 1.
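A quick back-of-the-envelope check of the figures quoted above (illustrative arithmetic only):

```python
# Back-of-the-envelope check of the backup bandwidth figures.
link_capacity_MBps = 125      # a 1 Gbps NIC is roughly 125 MB/sec
per_stream_MBps    = 20       # each client streams about 20 MB/sec during backup
streams            = 4

aggregate = per_stream_MBps * streams
print(aggregate, "MB/sec")                                            # 80 MB/sec
print(round(100 * aggregate / link_capacity_MBps), "% of the link")   # 64 %
```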

[Graph: Backup Performance - No Teaming. Throughput in MB/sec for one, two, three and four simultaneous backup streams.]

Note: Performance results will vary depending on tape drive technology as well as data set compression.

Graph 1. Backup Performance with no NIC Teaming

5.2.1 Load Balancing and Failover

The performance results show that as the number of backup streams increases, the overall throughput increases. However, each data stream may not be able to maintain the same performance as a single backup stream of 25 MB/sec. In other words, even though a backup server can stream data from a single client at 25 MB/sec, it is not expected that four simultaneously running backup jobs will stream at 100 MB/sec (25 MB/sec x 4 streams). Although overall throughput increases as the number of backup streams increases, each backup stream can be impacted by tape software or network stack limitations.

In order for a tape backup server to reliably utilize NIC performance and network bandwidth when backing up clients, a network infrastructure must implement teaming such as adaptive load balancing with its inherent fault tolerance. Data centers will incorporate redundant switches and link aggregation as part of their fault-tolerant solution. Although teaming device drivers will manipulate the way data flows through teamed interfaces and failover paths, this is transparent to tape backup applications and does not interrupt any tape backup process when backing up remote systems over the network. Figure 13 shows a network topology that demonstrates tape backup in an Intel teamed environment and how adaptive load balancing can load balance tape backup data across teamed NICs.
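The walk-through that follows describes how the backup server's RLB mechanism steers each client to a different team port via G-ARPs. As a toy preview of that assignment, the sketch below simply rotates invented client names across two invented port MACs; it is not the ANS algorithm, which bases its decision on the current load of each port.

```python
# Toy preview (not the ANS algorithm): rotate backup clients across the
# backup server's team ports, as the G-ARPs in the walk-through below do
# based on actual load.
from itertools import cycle

def assign_receive_ports(clients, team_port_macs):
    rotation = cycle(team_port_macs)
    return {client: next(rotation) for client in clients}

team_port_macs = ["00:aa:bb:00:00:0a", "00:aa:bb:00:00:0b"]        # invented MACs
clients = ["Client-Red", "Client-Blue", "Client-Green", "Client-Yellow"]
for client, mac in assign_receive_ports(clients, team_port_macs).items():
    print(f"G-ARP to {client}: receive on {mac}")
# Two clients land on each team port, so four 20 MB/sec streams are split
# evenly across the two NICs on the backup server.
```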


There are four paths that the client server can use to send data to the backup server, but only one of these paths will be designated during data transfer. One possible path that Client-Server Red can use to send data to the backup server is:

Example Path: Client-Server Red sends data via NIC A, Switch 1, Backup Server NIC A.

The designated path is determined by two factors:
1. The Client-Server ARP cache, which points to the backup server's MAC address. This is determined by the Intel intermediate driver's inbound load balancing algorithm.
2. The physical NIC interface on Client-Server Red that will be used to transmit the data. The Intel intermediate driver's outbound load balancing algorithm determines this (see section 2.1.4).

The RLB teamed interface on the backup server will transmit a gratuitous ARP (G-ARP) to Client-Server Red, which in turn causes the client server's ARP cache to be updated with the backup server's MAC address. The receive load balancing mechanism within the teamed interface determines the MAC address embedded in the G-ARP. The selected MAC address is essentially the destination for data transfer from the client server. On Client-Server Red, the ALB teaming algorithm will determine which of the two NIC interfaces will be used to transmit data. In this example, data from Client-Server Red is received on the backup server's NIC A interface.

To demonstrate the ALB mechanisms when additional load is placed on the teamed interface, consider the scenario when the backup server initiates a second backup operation: one to Client-Server Red, and one to Client-Server Blue. The route that Client-Server Blue uses to send data to the backup server depends on its ARP cache, which points to the backup server's MAC address. Since NIC A of the backup server is already under load from its backup operation with Client-Server Red, the backup server will invoke its RLB algorithm to inform Client-Server Blue (through a G-ARP) to update its ARP cache to reflect the backup server's NIC B MAC address. When Client-Server Blue needs to transmit data, it will use either one of its NIC interfaces, as determined by its own ALB algorithm. What is important is that data from Client-Server Blue is received by the backup server's NIC B interface, and not by its NIC A interface. This is important because, with both backup streams running simultaneously, the backup server must load balance data streams from different clients. With both backup streams running, each NIC interface on the backup server is processing an equal load, thus load balancing data across both NIC interfaces.

The same algorithm applies if a third and fourth backup operation is initiated from the backup server. The RLB teamed interface on the backup server will transmit a unicast G-ARP to backup clients to inform them to update their ARP cache. Each client will then transmit backup data along a route to the target MAC address on the backup server.

Based on the network topology diagram in Figure 13, backup performance was measured on the teamed backup server when performing one or more backup streams. Graph 2 shows the tape backup performance that can be expected on the backup server when conducting network backups:


[Graph: Backup Performance - Teaming. Throughput in MB/sec for one, two, three and four simultaneous backup streams.]

Graph 2. Backup Performance

The backup performance results are nearly the same as the performance measured in the non-teamed environment. Since the network was not the bottleneck in the non-teamed case, teaming was not expected to improve performance. However, in this case teaming is recommended to improve fault tolerance and availability. In the example where the backup server has one NIC, all data streams can only go through that one NIC, and as shown in the charts, performance was 80 MB/sec. In the teamed environment, although the same performance of 80 MB/sec was measured, data from the clients was received across both NIC interfaces on the backup server. With four backup streams, the teamed interface received the backup streams across both NICs in a load-balanced manner: two backup streams were received on NIC A and two streams were received on NIC B, for a performance total of 20 MB/sec x 4 backup streams = 80 MB/sec.

5.2.2 Fault Tolerance

In the event that a network link fails during tape backup operations, all traffic between the backup server and client will stop and backup jobs will fail. However, if the network topology is configured for both Intel ALB and switch fault tolerance, this allows tape backup operations to continue without interruption during the link failure. All failover processes within the network are transparent to tape backup software applications. To understand how backup data streams are directed during the network failover process, consider the topology in Figure 13. Client-Server Red is transmitting data to the backup server via Path 1, but a link failure occurs between the backup server and the switch. Since the data can no longer be sent from Switch #1 to the NIC A interface on the backup server, the data will be re-directed from Switch #1 through Switch #2 to the NIC B interface on the backup server. This occurs without the knowledge of the backup application because all fault-tolerant operations are handled by the NIC team interface and the aggregation settings on the switches. From the client server's perspective, it still thinks it is transmitting data through the original path.


Figure 13. Network Backup with ALB Teaming and Switch Fault Tolerance


6 Troubleshooting Teaming Problems


When running a protocol analyzer over a team's virtual interface, the MAC address shown in the transmitted frames may not be correct. The analyzer does not show the frames as constructed by ANS; it will show the MAC address of the team and not the MAC address of the interface transmitting the frame. It is suggested to use the following process to monitor a team:
- Mirror all uplink ports from the team at the switch.
- If the team spans two switches, mirror the interlink trunk as well.
- Sample all mirror ports independently.
- On the analyzer, use a NIC and driver that does not filter QoS and VLAN information.

6.1 Teaming Configuration Tips


When troubleshooting network connectivity or teaming functionality issues, ensure that the following information is true for your configuration.
1. While Dell supports mixed-speed teaming for AFT, ALB, RLB, and SFT, it is recommended that all ports in a team be the same speed (either all Gigabit or all Fast Ethernet).
2. Disable Spanning Tree Protocol, or enable an STP mode that bypasses the initial phases (for example, Port Fast or Edge Port) for the switch ports connected to a team.
3. All switches that the team is directly connected to must have the same hardware revision, firmware revision, and software revision to be supported.
4. To be teamed, ports should be members of the same VLAN.
5. In the event that multiple teams are configured, each team should be on a separate network.
6. Do not enter a multicast or broadcast address in the Locally Administered Address field. Do not use the Locally Administered Address on any physical port that is a member of a team.
7. Verify that power management is disabled on all physical members of any team.
8. Remove any static IP address from the individual physical team members before the team is built.
9. A team that requires maximum throughput should use IEEE 802.3ad Dynamic or SLA. In these cases, the intermediate driver is only responsible for the outbound load balancing while the switch performs the inbound load balancing.
10. Aggregated teams (802.3ad Dynamic and Static or SLA) must be connected to only a single switch that supports IEEE 802.3ad, LACP or GEC/FEC.
11. It is not recommended to connect any team to a hub, as a hub only supports half duplex. Hubs should be connected to a team for troubleshooting purposes only.
12. Verify that the base (miniport) and team (intermediate) drivers are from the same release package. Dell does not test or support mixing base and teaming drivers from different CD releases.
13. Test the connectivity to each physical port prior to teaming.
14. Test the failover and failback behavior of the team before placing it into a production environment.
15. Test the performance behavior of the team before placing it into a production environment.

6.2 Troubleshooting guidelines

Before you call Dell support, make sure you have completed the following steps for troubleshooting network connectivity problems when the server is using NIC teaming.
1. Make sure the link light is on for every port and all the cables are attached.
2. Check that the matching base and intermediate drivers belong to the same Dell release and are loaded correctly.
3. Check for a valid IP address using MS ipconfig, Linux ifconfig or NetWare CONFIG.
4. Check that STP is disabled or Edge Port/Port Fast is enabled on the switch ports connected to the team.
5. Check that the ports and the switch are configured identically for link speed and duplex.
6. If possible, break the team and check for connectivity to each port independently to confirm that the problem is directly associated with teaming.
7. Check that all switch ports connected to the team are on the same VLAN.
8. Check that the switch ports are configured properly for SLA aggregation and that it matches the NIC teaming mode. If configured for AFT, ALB, RLB or SFT, make sure the corresponding switch ports are NOT configured for aggregation (FEC/GEC/IEEE 802.3ad).


6.3 Teaming FAQ


Question: Under what circumstances is traffic not load balanced? Why is all traffic not load balanced evenly across the team members?
Answer: The bulk of the traffic does not use IP/TCP/UDP, or the bulk of the clients are in a different network. The receive load balancing is not a function of traffic load, but a function of the number of clients that are connected to the server.

Question: What network protocols are load balanced when in a team?
Answer: Intel's teaming software only supports IP/TCP/UDP traffic. All other traffic will be forwarded to the primary port.

Question: Which protocols are load balanced with ALB and which ones are not?
Answer: Only IP/TCP/UDP protocols are load balanced in both directions: send and receive.

Question: Can I team a port running at 100 Mbps with a port running at 1000 Mbps?
Answer: Mixing link speeds within a team is supported for all ANS teaming modes except SLA.

Question: Can I team a fiber port with a copper Gigabit port?
Answer: Yes.

Question: What is the difference between NIC load balancing and Microsoft's Network Load Balancing (NLB)?
Answer: NIC load balancing is done at a network session level, while NLB is done at the server application level.

Question: Can I connect the teamed ports to a hub?
Answer: Yes. Teamed ports can be connected to a hub for troubleshooting purposes only. However, this is not recommended for normal operation because the expected performance improvement would be degraded due to hub limitations. Connect the teamed ports to a switch instead.

Question: Can I connect the teamed ports to ports in a router?
Answer: No. All ports in a team must be on the same network; in a router, each port is a separate network by definition. All teaming modes require that the link partner be a Layer 2 switch.

Question: Can I use teaming with Microsoft Cluster Services?
Answer: Yes. Teaming is supported on the public network only, not on the private network used for the heartbeat link.

Question: Can PXE work over a team's virtual interface?
Answer: A PXE client operates in a pre-OS environment; as a result, the virtual interface has not been enabled yet. If the physical port supports PXE, then it can be used as a PXE client, whether or not it is part of a team when the OS loads. PXE servers may operate over a team's virtual interface.

Question: Can WOL work over a team's virtual interface?
Answer: Wake-on-LAN functionality is not supported on a team's virtual interface; it is only supported on physical ports, even if they are teamed.

Question: What is the maximum number of ports that can be teamed together?
Answer: Up to 8 ports.

Question: What is the maximum number of teams that can be configured on the same server?
Answer: Up to 4 teams.

Question: Can I connect a team across multiple switches?
Answer: Receive Load Balancing can be used with multiple switches because each physical port in the system uses a unique Ethernet MAC address. Static Link Aggregation and IEEE 802.3ad (dynamic) cannot operate across switches because they require all physical ports to share the same Ethernet MAC address.

Question: How do we upgrade the intermediate driver (iANS)?
Answer: The intermediate driver can be upgraded using the PROSet.msi installer.

Question: Should both the backup server and the client servers that are backed up be teamed?
Answer: Because the backup server is under the most data load, it should always be teamed for link aggregation and failover. However, a fully redundant network requires that both the switches and the backup clients be teamed for fault tolerance and link aggregation.

Question: During backup operations, does the NIC teaming algorithm load balance data at a byte level or a session level?
Answer: When using NIC teaming, data is only load balanced at a session level, not a byte level, to prevent out-of-order frames. NIC teaming load balancing does not work the same way as other storage load balancing mechanisms such as EMC PowerPath.

Question: Is there any special configuration required in the tape backup software or hardware in order to work with NIC teaming?
Answer: There is no configuration required in the tape software to work with teaming. Teaming is transparent to tape backup applications.

Question: How do I know what driver I am currently using?
Answer: In Windows, use Device Manager to check the driver properties for the network device.

Question: Can ANS detect a switch failure in a Switch Fault Tolerance configuration?
Answer: No. ANS can only detect the loss of link between the teamed port and its immediate link partner. ANS cannot detect link failures on other ports.

Question: Where can I get the latest supported drivers?
Answer: Please check www.support.dell.com for driver package updates or support documents.

Question: Why does my team lose connectivity for the first 30 to 50 seconds after the primary port is restored (failback after a failover)?
Answer: During a failback event, link is restored, causing Spanning Tree Protocol to configure the port for blocking until it determines that it can move to the forwarding state. You must enable Port Fast or Edge Port on the switch ports connected to the team to prevent the loss of communications caused by STP.

Question: Where do I monitor real-time statistics for a NIC team in a Windows server?

Question: Where do I monitor real-time statistics for a NIC team in a Windows server?
Answer: Use PROSet to monitor general, IEEE 802.3, and custom counters (a scripted alternative for raw throughput is sketched below).
Question: Can I configure NLB and teaming concurrently?
Answer: Yes, but only when running NLB in multicast mode. (Note: NLB is not supported with MS Cluster Services.)
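PROSet remains the supported tool for team-specific counters; if only raw per-adapter throughput is needed, Windows performance counters can also be polled. A minimal sketch follows, again assuming the third-party Python 'wmi' package; interface names, including the team's virtual adapter, appear as they are registered with Windows.

    # Minimal sketch, assuming the third-party 'wmi' package (requires pywin32).
    import time
    import wmi

    c = wmi.WMI()
    for _ in range(3):  # take a few 5-second samples
        for nic in c.Win32_PerfFormattedData_Tcpip_NetworkInterface():
            # Each teamed physical port and the team's virtual adapter
            # show up as separate interface instances.
            print(nic.Name,
                  "rx B/s:", nic.BytesReceivedPersec,
                  "tx B/s:", nic.BytesSentPersec)
        time.sleep(5)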


7 Appendix A - Event Log Messages


7.1 Windows System Event Log messages
This document lists all of the known base and intermediate driver Windows System event log status messages for the Intel products as of 12/2004. The following tables list the Windows Event Log messages a user may encounter. There may be up to two classes of entries for these event codes, depending on whether both drivers are loaded: one set for the base (miniport) driver and one set for the intermediate (teaming) driver.
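These entries can be collected programmatically as well as with Event Viewer. The following is a minimal sketch, assuming a Windows host with the pywin32 package installed; it filters on the driver source names described in the next two subsections (which base driver source name appears, E100 or E1000, depends on the installed adapter family).

    # Minimal sketch, assuming the pywin32 package on Windows.
    import win32evtlog

    # Source names as described in the following subsections.
    SOURCES = ("E100", "E1000", "iANS")

    hand = win32evtlog.OpenEventLog("localhost", "System")
    flags = (win32evtlog.EVENTLOG_BACKWARDS_READ |
             win32evtlog.EVENTLOG_SEQUENTIAL_READ)
    events = win32evtlog.ReadEventLog(hand, flags, 0)
    while events:
        for ev in events:
            if ev.SourceName in SOURCES:
                # Mask off severity/facility bits to get the message number
                # used in Tables 13 and 14.
                print(ev.TimeGenerated, ev.SourceName, ev.EventID & 0xFFFF)
        events = win32evtlog.ReadEventLog(hand, flags, 0)
    win32evtlog.CloseEventLog(hand)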

7.2 Base Driver (Physical Port / Miniport)


Events from the base driver will be identified by either E100 or E1000 in the source column of the event viewer. Table 13 lists the event log messages supported by the base driver, the cause for the message, and the recommended action.

Message Number | System Event Message | Cause | Corrective Action
1-3 | NA | NA | Not used
4 | Could not find a PRO/1000 adapter | Faulty driver or driver not seated properly | Reinstall the driver or reseat the adapter
5 | Driver could not determine which PRO/1000 adapter to load on | Faulty driver installation | Reinstall the driver
6 | Could not allocate the MAP REGISTERS necessary for operation | The driver cannot allocate map registers from the operating system | Reduce the number of transmit descriptors and restart the driver
7 | Could not assign an interrupt for the PRO/1000 | Interrupts being used by other cards | Try another PCI slot
8 | Could not allocate memory necessary for operation | Shortage of system memory | Reduce the number of receive descriptors and coalesce buffers, then restart
9 | Could not allocate shared memory necessary for operation | Could not get shared memory from the operating system | Reduce the number of receive descriptors and coalesce buffers, then restart
10 | Could not allocate memory for receive structures | Could not get memory from the operating system | Reduce the number of receive descriptors and restart
11 | Could not allocate memory for receive descriptors | Could not get memory from the operating system | Reduce the number of receive descriptors and restart
12 | Could not allocate memory for receive buffers | Could not get memory from the operating system | Reduce the number of receive descriptors and restart
13 | Could not establish link | Network connection not available | Check the network cable
14-17 | The PCI BIOS has NOT properly configured the PRO/1000 adapter | Incorrect BIOS or faulty PCI slot | Get the latest BIOS for your computer, or try another PCI slot
18 | The PRO/1000 adapter was not configured for bus mastering by the PCI BIOS | The adapter is unable to perform PCI bus mastering | Install the adapter in a bus mastering-capable slot; see your computer documentation for details. For more information, run PROSet diagnostics
19-20 | Could not allocate the NDIS receive packets necessary for operation | Could not get memory from the operating system | Reduce the number of receive descriptors and restart
21 | The OS was unable to assign PCI resources to the PRO/1000 adapter | Conflict with PCI resources | Move the adapter to another slot, or remove other hardware that may be causing a conflict
22 | The driver was unable to claim PCI resources of this PRO/1000 adapter | Conflict with PCI resources | Remove any unused driver instances from the network control panel applet
23 | The EEPROM on your PRO/1000 adapter may have errors | - | Visit your computer manufacturer's support web site for support
24 | Could not start the PRO/1000 adapter | Faulty driver | Install an updated driver
25 | MDIX setting conflicts with the AutoNeg settings; MDIX will not work | Faulty setting prohibits correcting for a crossed-over cable | Enable AutoNeg and restart
26 | Link has been established | The adapter has established a link | Informational only; no action required
27 | Link has been disconnected | The adapter has lost connection with its link partner | Check that the network cable is connected, verify that the network cable is the right type, and verify that the link partner (e.g. switch or hub) is working correctly
28 | Link has been established | The adapter has established a link | Informational only; no action required
29 | Could not start the gigabit network connection | The adapter was unable to establish link with its link partner | Connect the cable to the network device and restart, or disable Link Based Login and restart
30 | [Adapter name] is set up for auto-negotiation but the link partner is not configured for auto-negotiation. A duplex mismatch may occur | Link partner is unable to support auto-negotiation | Check the link partner's link configuration: set both sides to auto-negotiate, or select the speed on both sides
31 | Spanning Tree Protocol has been detected on the device your network connection is attached to | STP is enabled on the switch | Check the link partner's STP configuration on the ports connected to the server
32 | Link has been established: 1000 Mbps | Gigabit connection established with the link partner | Informational; no action required
33 | Link has been established: 100 Mbps full duplex | FEC connection established with the link partner | Informational; no action required
34 | Link has been established: 100 Mbps half duplex | FEC connection established with the link partner | Informational; no action required
35 | Link has been established: 10 Mbps full duplex | 10 Mbps connection established with the link partner | Informational; no action required
36 | Link has been established: 10 Mbps half duplex | 10 Mbps connection established with the link partner | Informational; no action required

Table 13. Base Driver Event Log Messages

7.3 Intermediate Driver (Virtual Adapter/Team)


Events from the intermediate driver will be identified by iANS in the source column of the event viewer. Table 14 lists the event log messages supported by the intermediate driver, the cause for the message and the recommended action.

Message Number | System Event Message | Cause | Corrective Action
1 | NA | NA | Not used
2 | Unable to allocate required resources | The driver cannot allocate memory from the operating system | Free some memory resources and restart
3 | Unable to read required registry parameters | The driver cannot read registry information | Remove the adapter team and then create a new team
4 | Unable to bind to physical adapter | Team binding to the physical adapter does not take place | Remove the adapter team and then create a new team
5 | Unable to initialize an adapter team | Adapter team does not get initialized | Remove the adapter team and then re-create a new team
6 | Primary Adapter is initialized: %2 | Informational message | No action required
7 | Adapter is initialized: %2 | Second adapter of the team is initialized | No action required
8 | Team #%2 is initialized | The team is initialized | No action required
9 | Team #%2: Virtual Adapter for %3 is initialized | Virtual adapter initialized | No action required
10 | Current Primary Adapter is switching from: %2 | Primary adapter is switching over | Check the primary adapter and connection
11 | Adapter Link Down: %2 | The adapter link has gone down | Check the link partner and cable
12 | Secondary Adapter took over: %2 | Secondary adapter has taken over | No action required
13 | The %2 has been deactivated from the team | One of the team members is deactivated | Check the teamed port
14 | Secondary Adapter has rejoined the Team: %2 | Secondary team member has joined the team | No action required
15 | Adapter link up: %2 | Adapter link up | No action required
16 | Team #%2: The last adapter has lost link. Team network connection has been lost | Team connectivity is lost | Check link
17 | Team #%2: An adapter has re-established link. Team network connection has been restored | Team is alive | Informational message; no action required
18 | Preferred primary adapter has been detected: %2 | Preferred primary adapter detected | No action required
19 | Preferred secondary adapter has been detected: %2 | Preferred secondary adapter detected | No action required
20 | Preferred primary adapter took over: %2 | Preferred primary adapter takes over | No action required
21 | Preferred secondary adapter took over: %2 | Preferred secondary adapter takes over | Informational message; no action required
22 | Primary Adapter does not sense any Probes: %2 | Possible reason could be a partitioned team | Check the primary adapter
23 | %2: A Virtual Adapter failed to initialize | Virtual adapter did not initialize | Check for other events; ensure you have current drivers; and re-create the team
24 | %2: Adapter failed to join the team because it lacked IPSec Task Offload capabilities | Adapter failed to join the team because of lack of IPSec capabilities | Ensure the connection supports IPSec offloading; enable IPSec Offload
25 | %2: Adapter failed to join the team because it lacked TCP Checksum Task Offload capabilities | Adapter failed to join the team because of lack of TCP Checksum Offload capabilities | Replace the adapter with one that supports checksum offload, or disable checksum offload on all other ports
26 | %2: Adapter failed to join the team because it lacked TCP Large Send Task Offload capabilities | Adapter failed to join the team because of lack of Large Send Offload capabilities | Ensure the connection supports TCP Large Send Offload and that it is enabled
27 | %2: Adapter failed to join the team because of insufficient PnP capabilities | Adapter failed to join the team because of lack of Plug and Play capabilities | Check the capabilities of the adapter; ensure it is PnP compliant
28 | %2: Adapter failed to join the team because MaxFrameSize too small | Adapter failed to join the team because the max frame size is insufficient | Use PROSet to adjust the max frame size on the faulty connection
29 | %2: Adapter failed to join the team because MulticastList size too small | Adapter failed to join the team because of the small size of the multicast list | Use PROSet to adjust the MulticastList size on the faulty connection
30 | %2 successfully added to the team | Team member added to the team | No action required
31 | %2 successfully removed from the team | Team member removed from the team | No action required
32 | An invalid loop back situation has occurred on the adapter in device %2 | Team has received loop back frames | Check the configuration to verify that all the adapters in the team are connected to 802.3ad-compliant switch ports
33 | No 802.3ad response from the link partner of any adapters in the team | Did not receive any 802.3ad response | Check the link partner (switch) and ensure LACP is enabled
34 | More than one Link Aggregation Group was found. Only one group will be functional within the team | Multiple candidate LAGs were found | Check the switch configuration
35 | Initializing Team #%2 with %3 missing adapters | Team getting initialized with some adapters missing | Check other event messages for the specific cause
36 | Team #%2 initialization failed. Not all base drivers have the correct MAC address | Team initialization failing because some base drivers do not have the right MAC address | Use PROSet to check the MAC assignment at the base driver level
37 | Virtual adapter for %3 [VID=%4] removed from team #%2 | Adapter removed from a team | Check the VLAN configuration
38 | %2 was removed from team | Team member removed | Check other messages for a specific cause
39 | The driver failed to initialize properly. You may not be able to change the virtual adapter settings.%r%r | Driver initialization issue, affecting the ability to change adapter settings | Check that the driver is loading correctly; verify the driver file is current and not overridden; reload the driver file
40 | Virtual adapter unload process may have not completed successfully.%rDriver may not be unloaded.%r%r | Virtual adapter not getting unloaded because the driver is not getting unloaded | Verify whether the team failure is due to a failure or an improper shutdown; rebuild the team
41 | %2 is improperly configured. %rThe adapter cannot process the remote management features and be a member of an EtherChannel or 802.3ad network team at the same time | Team configuration needs to be fixed per the error message | Remove the adapter from the team
42 | %2 is improperly configured. %rThe adapter cannot process the remote management features and be a member of a network team at the same time | Team configuration needs to be fixed per the error message | Remove the adapter from the team

Table 14. Intermediate Driver Event Log Messages

Printed in the USA. Dell and the DELL logo are trademarks of Dell Inc. Dell disclaims any proprietary interest in the marks and names of others. © 2004 Dell Inc. All rights reserved. Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.

