Quigley
Honours Final Project Report Project Title: An Investigation into Spanning Tree Protocol (802.1D) timers and convergence performance. BSc (Hons) Networking and System Support 2011
Abstract
Spanning Tree Protocol (STP (802.1D)) was introduced to Ethernet LANs to overcome the problem of loops forming in networks that use transparent bridging. Ethernet is still the most popular technology used for local area networks. However, STP (802.1D) typically has a convergence time of between 30 and 50 seconds and does not scale well, which makes it inadequate for the demands of most modern Ethernet networks. STP (802.1D) has therefore been superseded by newer technologies offering greater scalability and faster convergence, but businesses with legacy network equipment that does not support the newer technologies may benefit from tuning STP (802.1D) settings to gain faster convergence times. This project therefore aims to investigate whether the convergence time of STP (802.1D) can be improved by tuning the hello, max age and forward delay timers whilst still retaining network stability. The modification of timer parameters will be based on allowable and effective combinations identified in research. The investigation will be carried out using OPNET Modeler to build simulations of real LANs and then simulate link failure and recovery in each. Each LAN model will differ only in STP (802.1D) timer settings and diameter size. Recovery time and convergence state in each model will be measured, and the measurements will be analysed and compared to determine whether the network is stable and fully converged, whether convergence time improves, and whether any improvement is significant.
Submitted for the Degree of BSc (Hons) in Networking & Systems Support, 2010-2011
1.2 Research Question
1.3 Hypothesis
1.4 Justification
1.5 Project Type
1.6 Project Aim
1.7 Objectives
1.7.1 Examine the Features and Mechanisms of Ethernet
1.7.2 Investigate Transparent Bridging
1.7.3 Examine the Mechanisms and Operation of STP (802.1D)
1.7.4 Identify and Analyse STP (802.1D) Timers
1.7.5 Determine STP (802.1D) Convergence Criteria
2.4 Timer Parameters
2.4.1 Max Age Timer
2.4.2 Hello Timer
2.4.3 Forward Delay Timer
2.4.4 Bridge Diameter
2.4.5 Timer Calculation and Analysis
3.0 Methods
3.1 Primary Research Methods
3.1.1 Simulation Software Justification
3.1.2 Simulation Software Environment
3.1.3 Hypothesis Testing
3.2 Method Approach
3.2.1 Experiment Environment
3.2.2 Topology of Experiment
3.2.3 Experiment and Variables
3.2.4 Select and Analyse Statistics
4.0 Results
4.1 Group 1 Results
4.2 Group 2 Results
4.3 Group 3 Results
4.4 Group 4 Results
5.0 Conclusions
5.1 Group 1 Conclusions
5.2 Group 2 Conclusions
5.3 Group 3 Conclusions
5.4 Group 4 Conclusions
5.5 Overall Conclusions
5.6 Benefit of Research
5.7 Further Research
6.0 Appendices
6.1 References
6.2 Bibliography
6.3 Results (Charts)
6.3.1 Group 1 Results
6.3.2 Group 2 Results
6.3.3 Group 3 Results
6.3.4 Group 4 Results
6.3.5 CD Index
1.0 Introduction
Spanning Tree Protocol (STP (802.1D)) was introduced to Ethernet LANs to overcome the problem of loops forming in networks that use transparent bridging. Ethernet is still the most popular technology used for local area networks. However, STP (802.1D) typically has a convergence time of between 30 and 50 seconds and does not scale well, which makes it inadequate for the demands of most modern Ethernet networks. STP (802.1D) has therefore been superseded by newer technologies offering greater scalability and faster convergence, but businesses with legacy network equipment that does not support the newer technologies may benefit from tuning STP (802.1D) settings to gain faster convergence times. This project therefore aims to investigate whether the convergence time of STP (802.1D) can be improved by tuning the hello, max age and forward delay timers whilst still retaining network stability. The modification of timer parameters will be based on allowable and effective combinations identified in research. The investigation will be carried out using OPNET Modeler to build simulations of real LANs and then simulate link failure and recovery in each. Each LAN model will differ only in STP (802.1D) timer settings. Recovery time and convergence state in each model will be measured, and the measurements will be analysed and compared to determine whether the network is stable and fully converged, whether convergence time improves, and whether any improvement is significant.
1.1 Background
automatically selected by spanning tree based on the lowest MAC address of bridges if priority settings on bridges are set at default; these configurable priority settings allow an administrator to set any bridge in the network as root. A designated bridge, according to (Prytz 2008), is responsible for all traffic going to and coming away from the root bridge, and only one designated bridge is allowed per link. A non-root bridge can be the designated bridge for more than one link or for none, whereas the root bridge is the designated bridge for all directly connected links. A designated bridge has three port types: a single root port, and any number of designated and inactive ports. The root port moves traffic towards the root bridge, designated ports move traffic away from the root bridge, and inactive ports are either blocked or disabled. Link and path costs are also used in the calculation of the spanning tree, with high speed links being chosen over lower speed links (Faghani, Mirjalily 2008). Bridge Protocol Data Units (BPDUs) are periodically transmitted by bridges to compute and maintain the spanning tree. Each BPDU is encapsulated in an Ethernet frame (Anon, IEEE 802.1D 1990) and contains information about the sending bridge, such as port state, bridge priority and the path cost from the root to the bridge transmitting the BPDU. There are three types of BPDU: the configuration BPDU (CBPDU), the topology change notification (TCN) and the topology change acknowledgement (TCA). CBPDUs are used to calculate and maintain the tree, TCNs are used to notify of topology changes and TCAs are used to confirm a topology change (Prytz 2008). Bridge ports that are part of the spanning tree can be in one of five states: disabled, blocking, listening, learning and forwarding.
A port is disabled when it is administratively down and does not take part in the spanning tree calculations; in the disabled state no BPDUs are transmitted or received. A port that could cause a loop is placed in the blocking state and still listens for BPDUs in case of a topology change. A port in the listening state takes part in the spanning tree calculation by exchanging BPDUs with other bridges, but it does not forward frames or learn MAC addresses. In the learning state the bridge continues to listen for other bridges and also learns MAC addresses, and in the forwarding state traffic is allowed to pass through the port. (Medagliani, Ferrari et al. 2009) state that spanning tree must take three important steps to produce a converged network: selecting a root bridge, selecting the designated bridges and ports for each link, and maintaining the spanning tree.
delay of twice this default is 30 seconds and is the main reason for a convergence time of between 30 and 50 seconds (Sfeir, Pasqualini et al. 2005). The max age timer is the period of time for which a bridge will hold BPDU information, which must have a limited lifetime in order to detect topology changes; the default setting for max age is 20 seconds (Lammle, Quinn 2002).
bridges between any two end points on the network and not the number of bridges in the network. As detailed by (Anon, IEEE 802.1D 1998), it is possible to change the default value of 7 to a higher number to allow for a larger network diameter; however, changing the value has a direct effect on STP (802.1D) timer settings. Convergence time is longer if we increase the values of max age and forward delay to increase the network diameter. Theoretically, therefore, decreasing these timer values gives faster convergence, but the network diameter must also decrease or the network will not converge, and decreasing the diameter value also increases the chance of broadcast storms (Lammle, Quinn 2002). The hello time value can also be lowered to decrease convergence time; however, changing the hello time from 2 to 1 will double the amount of control traffic in the form of hello BPDUs, which could degrade network performance (Lammle, Quinn 2002). In conclusion, it may be possible to improve the convergence time of STP (802.1D) by modifying timers, but great care must be taken when doing so to ensure a loop-free, stable network.
1.2 Research Question
Can modification of Spanning Tree Protocol (802.1D) timers: hello, max age and forward delay decrease network convergence time whilst maintaining network stability when simulating link failure and recovery in OPNET?
1.3 Hypothesis
The hypotheses below are to be tested throughout the experiment and may be subject to change as the research into the experiment develops:
Convergence time of STP (802.1D) can be decreased by careful modification of the max age, forward delay and hello time timers without the network becoming unstable.
Using a network diameter of 7 with the minimum allowable values for the max age and forward delay timers, the network will still converge.
Using a network diameter of 9 with the minimum allowable values for the max age and forward delay timers, the network will not converge.
1.4 Justification
The results and conclusions of the experiment are the output of the project and will determine whether or not there is any STP (802.1D) performance benefit in tuning timers. Tuning STP (802.1D) timers to improve convergence time may seem counterintuitive, given that STP (802.1D) has been replaced by superior technologies; however, small businesses with legacy network equipment may find it very useful. These small businesses may be hesitant to upgrade equipment due to cost, so fine tuning of existing legacy equipment could bring them network performance benefits. Network administrators and engineers could also use the project as reference material to assist in fine tuning networks; they work across many sites and with varying types and sizes of LAN, and it is likely that they will at some point have to modify STP (802.1D) timers.
1.5 Project Type
An experimental research method will be used to test the project's hypotheses; the experiments will be conducted using network simulation software, which will also produce statistical results for analysis.
1.6 Project Aim
The project aims to investigate whether the convergence time of Spanning Tree Protocol (802.1D) can be decreased by tuning the hello, max age and forward delay timers whilst still retaining network stability. The modification of these timer parameters will be based on allowable and effective combinations identified in research. The investigation will be carried out by simulating link failure and recovery under different defined scenarios and measuring convergence time in each scenario. The simulation results will then be analysed to determine whether the network is stable and fully converged, whether convergence time improves, and whether any improvement is significant.
1.7 Objectives
The objectives below will have to be met to realise the project aim:
The secondary research methods use a literature review to meet the objectives identified in section 1.7. These objectives, identified in an initial literature review, are further defined below. The review will cover subjects such as Ethernet, bridging, and Spanning Tree and its timer parameters.
2.0 Literature Review
2.1 Ethernet Introduction
Local Area Networks (LANs), as stated by (Seifert 1988), have become increasingly important over the years, providing data communication and access to shared resources. LANs now pervade many aspects of society, including large and small industries, commerce, health, education, government and the home. Ethernet (IEEE 802.3) is one of the most widely used LAN technologies implemented today, offering a simple, popular and cost-effective solution to LAN media access (Abuguba, Moldovn 2006). Ethernet is almost ubiquitous in LANs; it has evolved from its early days as a technology that provided media access to a single LAN into a technology with diverse implementations, such as campus networks and the linking of multiple LANs in Metropolitan Area Networks (MANs) and Metropolitan Ethernet Networks (MENs) (Bonada 2007). The following sections investigate Ethernet, which is the foundation and basis of the spanning tree algorithm.
2.1.2 CSMA/CD
Radia Perlman describes requirements that must be met for end-stations to transmit onto the shared medium successfully: each end-station has a fair share of bandwidth, there is no marked delay in accessing the medium, and the overhead of any access control method is minimal (Perlman 1992). CSMA/CD is a contention scheme in which an end-station first checks whether the medium is being used; this is Carrier Sense. If the medium is in use the end-station waits until it can transmit; if the medium is free the end-station transmits. Two stations could transmit simultaneously, in which case a collision occurs; each end-station detects the collision and backs off for a random period of time before trying to re-transmit. However, as the length of the coaxial backbone increases, so too does the chance of collisions, because one end-station may not detect that another is already transmitting while the signal is still traversing the medium. Adding more end-stations to the LAN segment also increases collisions. This shared medium or LAN segment is known as a collision domain, and these media access limitations in basic LAN bus topologies prevent the network from scaling well (Sakandar, Barnes 2005); transparent bridging overcomes this scalability problem, allowing the LAN to be extended (see Section 2.2). It should also be pointed out that Ethernet's CSMA/CD is not the only media access method implemented in LANs today; however, other solutions such as Token Ring and Token Bus are comparatively rare.
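The random back-off step described above can be sketched in a few lines of Python. This is an illustration only and is not taken from the sources cited in this section: it assumes the truncated binary exponential back-off commonly described for 10 Mbps Ethernet, and the slot time, attempt limit, exponent cap and the function name backoff_slots are all assumptions made for the example.

```python
import random

SLOT_TIME_US = 51.2   # assumed slot time for 10 Mbps Ethernet (512 bit times)
MAX_ATTEMPTS = 16     # assumed: the frame is discarded after 16 failed attempts
BACKOFF_LIMIT = 10    # assumed: the exponent stops growing after 10 collisions


def backoff_slots(collisions, rng=random):
    """Return a random number of slot times to wait after the n-th collision."""
    if collisions < 1:
        raise ValueError("back-off only applies after at least one collision")
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions, frame is discarded")
    exponent = min(collisions, BACKOFF_LIMIT)
    # Pick uniformly from 0 .. 2^exponent - 1 slot times.
    return rng.randrange(2 ** exponent)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so repeated runs print the same delays
    for n in (1, 2, 3, 10, 15):
        slots = backoff_slots(n, rng)
        print(f"collision {n}: wait {slots} slots ({slots * SLOT_TIME_US:.1f} us)")
```

The range of possible delays doubles with each successive collision, which is why a heavily loaded collision domain spends an increasing share of its time waiting rather than transmitting.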
10BASE-T, which also offered speeds of 10 Mbps, changed the physical topology of the LAN. The coaxial bus topology had an inherent problem: if the backbone cable broke, the whole network became inoperative. In 10BASE-T the coaxial backbone was replaced by the hub, a device with many ports, each connected to an end-station using 10BASE-T wiring. This created a physical star topology which was more resilient, with breaks in wiring only affecting single end-stations; logically, however, the end-stations are still connected within the hub on a bus topology and the connections are still half-duplex. The speed gained from unshielded twisted pair (UTP) copper wiring has increased rapidly over the years, with 100BASE-T (Fast Ethernet) offering speeds of 100 Mbps and 1000BASE-T (Gigabit Ethernet) capable of 1000 Mbps; the speed and development of UTP has kept pace with that of single-mode and multi-mode fibre. A wireless variant of Ethernet has been standardised in IEEE 802.11 to provide wireless media access, and Ethernet can also provide high-speed Internet access in the same way ADSL and cable modems do. The Ethernet standard amendment IEEE Std 802.3ah-2004, also known as Ethernet in the First Mile (EFM), describes how the Ethernet protocol can be run over existing single pairs of copper telephone wires and single strands of single-mode fibre (SMF), providing Ethernet connectivity to Internet service providers (Frazier, Pesavento 2001).
2.2
As indicated in the previous sections, there are limitations on the number of end-stations that can be added to an Ethernet LAN segment: as more end-stations are added to the segment, the probability of collisions increases. Transparent bridging, as standardised in IEEE 802.1D, is a solution that allows two or more Ethernet segments to be interconnected by a device called a bridge, allowing traffic to pass between the segments while collisions are kept isolated to each segment. (Backes 1988) describes how bridges operate on top of the Medium Access Control (MAC) sub-layer, making the bridge transparent to higher layer protocols.
information in the CAM table on the destination, the frame will be flooded out of all ports except the port on which it was received, as described above in Flooding.
bridge B1 will again receive the frame from LAN segment A and again broadcast it out of every port other than the port on which it was received. This process is repeated endlessly and creates what is known as a broadcast storm. As described by (Kim, Caesar et al. 2008), a bridge forwards frames transparently, without altering them, so there is no Time-to-Live (TTL) field that could be updated at each hop; without a TTL the frame never expires and continues to loop. These broadcast storms cause the whole network to be overwhelmed and fail. It is not only broadcast frames that are detrimental to a looped network; unicast frames will also bring the network down. For example, if end-station 1 sends a unicast frame to end-station 2, both bridges B1 and B2 will receive the frame, add the source address of end-station 1 to the CAM table and associate it with port P1. However, assuming bridge B1 has no entry for end-station 2 in its CAM table, it will flood the frame onto LAN segment B. Bridge B2 will receive the frame from LAN segment B on port P2 and overwrite the original entry in its CAM table that identified port P1 as the originating segment for end-station 1. As a result the CAM table incorrectly identifies port P2 as connecting to the originating segment for end-station 1. This is called CAM table or bridge table corruption, and it causes feedback loops that will also disable the network (Sakandar, Barnes 2005).
Figure 1. Bridging Loops [diagram: LAN Segment A and LAN Segment B joined by bridges B1 and B2, each using ports P1 and P2]
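The CAM table corruption described for Figure 1 can be reproduced with a very small model. The sketch below is illustrative only: the Bridge class, its receive method and the station names are hypothetical, and the model assumes that port P1 of each bridge attaches to LAN segment A and port P2 to LAN segment B.

```python
class Bridge:
    """A minimal transparent bridge with two ports and a CAM (address) table."""

    def __init__(self, name, ports=("P1", "P2")):
        self.name = name
        self.ports = ports
        self.cam = {}  # MAC address -> port it was last seen on

    def receive(self, src, dst, in_port):
        """Learn the source, then return the ports the frame is sent out of."""
        self.cam[src] = in_port                         # learning (may overwrite)
        if dst in self.cam:
            out = self.cam[dst]
            return [] if out == in_port else [out]      # filter or forward
        return [p for p in self.ports if p != in_port]  # unknown: flood


b1, b2 = Bridge("B1"), Bridge("B2")

# End-station 1 on LAN segment A sends a unicast frame to unknown end-station 2.
# Both bridges hear it on P1 (segment A) and flood it onto segment B via P2.
out_b1 = b1.receive("station1", "station2", "P1")
out_b2 = b2.receive("station1", "station2", "P1")
print("after first hop:", b1.cam, b2.cam)

# B1's flooded copy now arrives at B2 from segment B, i.e. on port P2,
# and B2 overwrites its correct entry for end-station 1.
b2.receive("station1", "station2", "P2")
print("B2 CAM after loop:", b2.cam)
```

Nothing in the learning step distinguishes a genuine move of end-station 1 from a looped copy of its frame, which is why the loop has to be broken in the topology rather than in the bridge table.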
As the above examples show, bridging loops should be avoided. (Medagliani, Ferrari et al. 2009) suggest that one possible solution would be to avoid physical loops at the network design stage; however, redundancy is useful in the likely event of link or device failure. Spanning Tree is a loop avoidance algorithm designed for Ethernet networks that overcomes the inherent problems associated with transparent bridging identified above.
2.3
The Spanning Tree algorithm overlays a logical loop-free topology on a physical topology that may or may not contain loops. A root bridge is elected from which the logical tree spans, and the spanning tree algorithm then uses mechanisms to ensure there is only one path back to the root bridge from any branch in the tree (Prytz 2008). If there is more than one path from a branch bridge back to the root bridge, spanning tree will calculate the best path and block the other. Figure 2 below shows how the spanning tree algorithm is used to overcome the problems of the bridging loop example in Figure 1. Assuming bridge B1 is the root bridge, spanning tree will block port P2 on bridge B2 and logically break the loop. Any end-station on LAN segment A will be able to access bridges B1 and B2 directly; any end-station on LAN segment B can access bridge B1, but to access bridge B2 it will have to go via bridge B1 (the root bridge).
Figure 2. Spanning Tree [diagram: LAN Segment A and LAN Segment B joined by bridges B1 and B2, each using ports P1 and P2]
Spanning tree uses various mechanisms to calculate a stable loop-free topology, such as root bridge selection, calculation of the shortest path to the root bridge and the role of ports in the tree. The following sections aim to describe these mechanisms of the IEEE 802.1D standardised Spanning Tree Protocol (STP 802.1D).
Configuration BPDUs contain information about the transmitting bridge, including the cost of the path from the transmitting bridge to the root bridge, port states, priority values and the Root Bridge Identifier (RID). As also stated by (Perlman 1992), the information inside the configuration BPDU enables bridges to elect a single root bridge and calculate the least cost path to the root bridge; configuration BPDUs also allow root ports, designated ports and non-designated ports to be selected. Table 1 shows the fields of the configuration BPDU; how these fields relate to spanning tree operation is discussed further in the following sections.
Table 1. BPDU Fields
Field                  Field Size
Protocol Identifier    2 Bytes
Version                1 Byte
Message Type           1 Byte
Flags                  1 Byte
Root ID (RID)          8 Bytes
Root Path Cost         4 Bytes
Bridge ID (BID)        8 Bytes
Port ID                2 Bytes
Message Age            2 Bytes
Max Age                2 Bytes
Hello Time             2 Bytes
Forward Delay          2 Bytes
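The field sizes in Table 1 add up to 35 bytes, and the layout maps directly onto a fixed binary format. The following sketch uses Python's struct module to pack and unpack such a message. It is a hedged illustration rather than a complete decoder: the function and variable names are hypothetical, and two details are assumptions not stated in the text above, namely that the four timer fields are carried in units of 1/256 of a second and that an 8-byte identifier consists of a 2-byte priority followed by a 6-byte MAC address.

```python
import struct

# Field layout follows Table 1: 2+1+1+1+8+4+8+2+2+2+2+2 = 35 bytes, big-endian.
CONFIG_BPDU = struct.Struct(">HBBB8sI8sHHHHH")

FIELDS = ("protocol_id", "version", "message_type", "flags", "root_id",
          "root_path_cost", "bridge_id", "port_id", "message_age",
          "max_age", "hello_time", "forward_delay")
TIMER_FIELDS = ("message_age", "max_age", "hello_time", "forward_delay")


def parse_config_bpdu(payload):
    """Unpack a 35-byte configuration BPDU into a dict of named fields."""
    values = dict(zip(FIELDS, CONFIG_BPDU.unpack(payload[:CONFIG_BPDU.size])))
    for name in TIMER_FIELDS:
        values[name] /= 256  # assumption: timers are carried in 1/256 s units
    return values


def bridge_id(priority, mac):
    """Build an 8-byte identifier: 2-byte priority followed by 6-byte MAC."""
    return struct.pack(">H6s", priority, mac)


# Hypothetical BPDU sent by a root bridge with default priority and timers.
root = bridge_id(32768, bytes.fromhex("00aabbccddee"))
raw = CONFIG_BPDU.pack(0, 0, 0, 0, root, 0, root, 0x8001,
                       0, 20 * 256, 2 * 256, 15 * 256)
bpdu = parse_config_bpdu(raw)
print(CONFIG_BPDU.size, bpdu["max_age"], bpdu["hello_time"], bpdu["forward_delay"])
```

Because the root advertises max age, hello time and forward delay inside every configuration BPDU, the values that matter for the whole tree are the ones configured on the root bridge.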
Topology Change Notification (TCN) and Topology Change Acknowledgment (TCA) messages differ in structure and function from configuration BPDUs; they are 32 bits long and contain only the first three fields in Table 1, which are Protocol Identifier, Version and Message Type. TCNs are sent out of the root port towards the root, and TCAs are used to acknowledge receipt of a TCN. The TCN has the least significant bit set in the flag field and the TCA has the most significant bit set in the flag field; more details on the operation of TCN and TCA are given in section 2.3.5.
will be considered a better root bridge, the bridge now advertises the new BID as the root bridge within the RID field of its own BPDU frame. The root bridge election process is complete once all bridges hold the same RID information. In the event that the priority settings of the bridges are the same, which is likely given the default setting of 32768, the bridge with the lowest MAC address is selected as the root and every other bridge is known as a designated bridge. It is possible to lower the priority setting and manually choose which bridge will be the root. This is useful because older bridges tend to have lower MAC addresses, so the oldest bridge will become root if the priority settings of all bridges are equal; this is not a good idea, as the majority of traffic will travel through the root and a newer, more powerful bridge would be a far better choice. A new root bridge election is triggered in the event of a topology change, for example if the root bridge is removed from the spanning tree or a new bridge with a lower BID than the current root bridge is added.
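The election rule described above amounts to choosing the numerically lowest bridge identifier, with priority compared before MAC address. A minimal sketch, assuming hypothetical bridge names, priorities and MAC addresses chosen only for illustration:

```python
def bridge_id(priority, mac):
    """Return a comparable bridge ID: priority first, then MAC address."""
    return (priority, bytes.fromhex(mac.replace(":", "")))


def elect_root(bridges):
    """Pick the bridge with the numerically lowest bridge ID."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


# Hypothetical bridges: name -> (priority, MAC address)
bridges = {
    "B1": (32768, "00:1b:54:aa:00:01"),
    "B2": (32768, "00:0c:29:10:20:30"),   # lowest MAC address
    "B3": (32768, "00:25:9c:00:00:07"),
}
print(elect_root(bridges))   # B2 wins on MAC address alone

# Lowering the priority of B3 lets the administrator choose the root.
bridges["B3"] = (4096, "00:25:9c:00:00:07")
print(elect_root(bridges))   # B3 now has the lowest bridge ID
```

Representing the identifier as a (priority, MAC) tuple means the tie-break on MAC address only comes into play when the priorities are equal, which matches the default case discussed above.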
The root bridge sends out a BPDU frame with a root path cost value of 0. When the next bridge receives the BPDU frame, it adds the path cost of the port on which the BPDU frame arrived. This is done by each bridge in the tree, so each prospective root port on a designated bridge has a cumulative path cost to the root associated with it. Using this cumulative total, the designated bridge chooses the port with the least cost path as the root port. If the path costs of two ports to the root bridge are equal, the BPDU frame with the lowest sending BID is selected and the port on which that BPDU frame arrived becomes the root port. Where both prospective root ports are receiving BPDU frames with the same BID, the port with the lowest port ID is selected as root port; ports may see the same sending BID when there is more than one link between switches for redundancy. The port that is selected as root port will then transition from the blocking state through the listening and learning states before it begins to forward frames towards the root.
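The three tie-breaking steps above can be expressed as a single ordered comparison. The sketch below is a simplified illustration that follows the order given in the text (total path cost, then sending BID, then port ID); the Candidate record, the function name and the link cost of 19 are assumptions made for the example.

```python
from collections import namedtuple

# One candidate per port: what the best BPDU received on that port advertised.
Candidate = namedtuple("Candidate", "port_id port_cost root_path_cost sender_bid")


def select_root_port(candidates):
    """Choose the root port: lowest total cost, then sender BID, then port ID."""
    def rank(c):
        total_cost = c.root_path_cost + c.port_cost  # add this bridge's port cost
        return (total_cost, c.sender_bid, c.port_id)
    return min(candidates, key=rank).port_id


# Hypothetical designated bridge with three uplinks (cost 19 per link is assumed).
candidates = [
    Candidate(port_id=1, port_cost=19, root_path_cost=19, sender_bid=(32768, "00:0c")),
    Candidate(port_id=2, port_cost=19, root_path_cost=0, sender_bid=(32768, "00:0a")),
    Candidate(port_id=3, port_cost=19, root_path_cost=0, sender_bid=(32768, "00:0a")),
]
print(select_root_port(candidates))
```

In this example ports 2 and 3 are parallel links to the same neighbour with equal cost, so the choice falls through to the port ID.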
2.4 Timer Parameters
STP (802.1D) relies on several timers that control how frequently BPDU packets are sent and how long information can exist before it is dropped from the bridge table. These timers are essential to control the delay needed for all bridges to converge on the same stable topology. Section 2.3.1 discussed the fields in a BPDU, which include three timer fields: max age, hello time and forward delay. There are other timers, such as maximum bridge transit delay, maximum BPDU transmission delay, maximum message age increment overestimate and hold time, which are used in calculating the max age, hello time and forward delay. Cisco suggests that these should not be modified, recommending instead careful modification of only the max age, hello time and forward delay timers (Anon, CISCO 2006). These three timers and their values are discussed below.
Figure 3. Diameter of 7 [diagram: bridges B1 to B7]
Figure 4. Diameter of 3 [diagram: bridges B1 to B7]
2.5
It is possible to carefully modify spanning tree timer settings to decrease convergence time. With spanning tree timers at their defaults, convergence time is typically between 30 and 50 seconds. If there is a direct failure, the bridge immediately times out the max age timer; as the port goes through the listening and learning states it must wait 2 x forward delay, which is 30 seconds by default. However, if there is an indirect failure the bridge must wait until the max age timer expires before it can go through the listening and learning process, so the convergence time in this case is 2 x forward delay + max age = 50 seconds (Menga 2003). (Anon, IEEE 802.1D 1998) states that the equations in Table 3 should also be enforced when modifying spanning tree timers:
Table 3.
Furthermore, (Anon, CISCO 2006) states that to achieve better convergence time the two equations in Table 4 must be strictly followed. These equations are used to determine the correct values for the max age and forward delay timers and are based on the equations in Table 3 and on the IEEE 802.1D specifications for diameter size. In Table 5 the equations have been rearranged to use the values of the max age and forward delay timers to determine the diameter.
Table 4.
max age = (4 x hello time) + (2 x diameter) - 2
forward delay = ((4 x hello time) + (3 x diameter)) / 2
Table 5.
diameter = (max age + 2 - (4 x hello time)) / 2
diameter = ((2 x forward delay) - (4 x hello time)) / 3
If we take the default settings of 20 for max age and 15 for forward delay and apply them to the equations above, we get a diameter of seven. Table 6 shows the recommended max age and forward delay for each diameter size; this table shows there is scope for improving convergence time by fine tuning the timers whilst still staying within the IEEE 802.1D recommendations. Using the topology in Figure 4 as an example, it can be seen that the topology consists of seven bridges, which we will assume have their timers set at default values; as shown above, these correspond to a diameter of seven. However, it has already been determined that this topology has a maximum network diameter of three, so it is possible to set the max age to 12 and the forward delay to 9; these values correspond to the network diameter of 3 in Table 8. It is therefore theoretically possible that the network will now take between 18 and 30 seconds to converge instead of the default 30 to 50 seconds.
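The calculation above can be checked with a short script. This is a sketch under stated assumptions: it takes max age as (4 x hello time) + (2 x diameter) - 2, the form implied by the Table 5 equations, rounds forward delay to the nearest whole second with halves rounded up, and estimates convergence as 2 x forward delay for a direct failure and max age + 2 x forward delay for an indirect failure. The function names are hypothetical.

```python
import math


def round_half_up(x):
    """Round to the nearest second, with .5 rounding up (14.5 -> 15)."""
    return math.floor(x + 0.5)


def timers_for_diameter(diameter, hello=2):
    """Return (max_age, forward_delay) in seconds from the Table 4 equations."""
    max_age = 4 * hello + 2 * diameter - 2
    forward_delay = round_half_up((4 * hello + 3 * diameter) / 2)
    return max_age, forward_delay


def convergence_window(max_age, forward_delay):
    """Return (direct, indirect) failure recovery estimates in seconds."""
    direct = 2 * forward_delay               # listening + learning
    indirect = max_age + 2 * forward_delay   # max age must expire first
    return direct, indirect


for d in (7, 3):
    max_age, fwd = timers_for_diameter(d)
    print(d, max_age, fwd, convergence_window(max_age, fwd))
```

With the default hello time of 2, a diameter of 7 reproduces the default timers of 20 and 15 seconds, and a diameter of 3 gives the values of 12 and 9 used for the Figure 4 example.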
It is also theoretically possible to increase the diameter beyond the recommended maximum of seven. (Anon, IEEE 802.1D 1998) states that this should not be done, although this contradicts the standard's own allowable range for timer values. For example, if we take the maximum values from the allowable range, 40 for max age and 30 for forward delay, and apply the equations in Tables 4 and 5, we get a network diameter of 17; however, convergence time also increases to between 60 and 100 seconds. These values are also within the rules of the IEEE 802.1D equations in Table 3. So far we have discussed modifying the max age and forward delay timers; however, the hello timer value can also be lowered to decrease convergence time. Table 9 shows the values of max age and forward delay if we use a hello timer value of 1 in the equations in Tables 4 and 5; with these values the network will theoretically converge in between 26 and 43 seconds. If we take the settings for a diameter of 3 from Table 7 and apply them to the example in Figure 4, we get a convergence time of between 14 and 22 seconds, as opposed to between 18 and 30 seconds in the previous example where the hello timer was set to 2. However, changing the hello time from the default value of 2 to 1 will double the amount of control traffic in the form of hello BPDUs, which could degrade network performance (Lammle, Quinn 2002).
Table 7.
Diameter   Max Age   Forward Delay   Hello Timer
7          17        13              1
6          14        11              1
5          12        10              1
4          10        8               1
3          8         7               1
2          6         5               1
1          4         4               1
It has been shown that it is possible to improve convergence time whilst staying within the IEEE 802.1D specification, which requires the network diameter to decrease with any decrease in timers. It could be surmised that the IEEE 802.1D standard has been conservative in its calculations of the optimum and stable values for timers. What would happen if we took a network with a diameter of 7 and used the timer settings for a network with a diameter of 5 from Table 6? Theoretically we would get a convergence time of between 24-40 seconds rather than the default 30-50 seconds; however, will the network converge and remain stable, and if it does, how low can the timers go before the network will not converge? (Medagliani, Ferrari et al. 2009) offer evidence that it will: they show that a network with a diameter of 8, a max age of 6 and a forward delay of 4 will converge. However, they also show that a network with a diameter of 9 and the same max age of 6 and forward delay of 4 will not converge and instead forms two separate spanning trees. Table 8 shows timer values arrived at using the equations in Table 4 and 5. It contains the same values as Table 6, but in smaller increments, so there are more values. (Medagliani, Ferrari et al. 2009) used the values in the last row of Table 8, which are the minimum values allowed in the IEEE 802.1D standard. A network with a diameter of 7 using the lowest recommended settings of 6 for max age and 4 for forward delay, strictly speaking, still meets the IEEE requirements discussed earlier: the network diameter of 7 is within the diameter specification, and the timer values of 6 and 4 are within the allowable range and meet the IEEE equation rules in Table 3.
Table 8.
Max Age   Forward Delay   Hello Timer
20        15              2
19        14              2
18        13              2
16        12              2
15        11              2
14        10              2
12        9               2
11        8               2
10        7               2
8         6               2
6         4               2
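The check that a timer set stays within the allowable ranges and the Table 3 rules can be sketched in Python. This is a minimal sketch rather than part of the original investigation: Table 3 is assumed to contain the two relationships given in IEEE 802.1D, namely 2 x (forward delay - 1) >= max age and max age >= 2 x (hello time + 1). The hello range of 1-10 seconds is likewise taken from the standard rather than from this chapter, and the name check_timers is hypothetical.

```python
# Allowable ranges in seconds. The max age and forward delay limits are the
# ones quoted in this chapter; the hello range is an assumption from 802.1D.
RANGES = {"hello": (1, 10), "max_age": (6, 40), "forward_delay": (4, 30)}


def check_timers(hello, max_age, forward_delay):
    """Return a list of violated rules; an empty list means the set is allowed."""
    problems = []
    values = {"hello": hello, "max_age": max_age, "forward_delay": forward_delay}
    for name, value in values.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} outside {low}-{high}")
    # Assumed Table 3 relationships between the three timers.
    if not 2 * (forward_delay - 1) >= max_age:
        problems.append("2 x (forward delay - 1) >= max age is violated")
    if not max_age >= 2 * (hello + 1):
        problems.append("max age >= 2 x (hello + 1) is violated")
    return problems


if __name__ == "__main__":
    # Last row of Table 8, then the diameter 1 row of Table 7.
    print(check_timers(hello=2, max_age=6, forward_delay=4))
    print(check_timers(hello=1, max_age=4, forward_delay=4))
```

Under these assumptions the last row of Table 8 passes every check, whereas a max age of 4 is rejected only because it falls below the allowable range.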
In conclusion, calculations and evidence show that it is possible to improve the convergence time of STP (802.1D) by modifying timers; however, great care must be taken when doing so to ensure a loop-free, stable network. The next chapter discusses the primary research methods that will be used to answer the research question in section 1.2 and to test the hypotheses in section 1.3.
Chapter 3.0 discusses the primary research methods that will be used to answer the research question in section 1.2 and to test the hypotheses in section 1.3. The methods used in this chapter are based on the research conducted in chapter 2.
3.0 3.1
in Chapter 2. As a result of this research, the hypotheses have been slightly modified from those identified in the initial proposal. The first hypothesis states that the convergence time of STP (802.1D) can be decreased by careful modification of the max age, forward delay and hello time timers without the network becoming unstable. This hypothesis is an original one from the initial proposal and was arrived at through ongoing research and calculation of spanning tree timers. It will be tested at the end of the experiment, after the results have been compared. There are two other hypotheses: the first is that the network will still converge with the minimum timer values and a network diameter of 7; the second is that the network will not converge with the minimum timer values and a network diameter of 9. These hypotheses are new and were arrived at through the research detailed in section 2.5. A paper by (Medagliani, Ferrari et al. 2009) was instrumental in arriving at both of them. Both hypotheses will also be tested after the results of the final experiments have been compared.
3.2
Method Approach
Figure 5. (Topology diagram; labels recoverable from the original: Source, Destination 3, ROOT, switches A, B, F and G.)
The topology in Figure 6 consists of 8 switches. Switch A is the root, and the letters A-H correspond to the priority values identified earlier in this report. There is an equal path cost from switch H; as the port connecting to F has a lower priority, it will become the root port for switch H and the port connecting to G will be blocking. The link failure will be simulated between switches B and D. This topology will be used for the Group 2 experiments.
Figure 6. (Topology diagram; labels recoverable from the original: Source, Destination 3, ROOT, switches A, B, F, G and H.)
The topology in Figure 7 consists of 9 switches. Switch A is the root, and the letters A-I correspond to the priority values identified earlier in this report. There is an equal path cost from the segment between H and I; as H has a lower priority, it will provide the designated port for that segment and the port for G on the segment will be in the blocked state. The link failure will be simulated between switches B and D. This topology will be used for the Group 3 experiments.
Figure 7. (Topology diagram; labels recoverable from the original: Source, Destination 3, ROOT, switches A, B, H and I.)
Figure 7.
The values per statistic should also be changed from the default 160 to 500, as this will provide more accurate results. There will be 4 groups of experiments. Groups 1, 2 and 3 will use the same variables, shown in Table 9, and will differ only in network size (see section 3.2.2). Each of these groups will consist of 8 scenarios, each run for 500 seconds; each scenario will also be run 10 times. Group 4 will have the same topology as Group 1, but the variables in Table 10 will be used. These variables are based on the calculation and analysis of timers undertaken in earlier research (see section 2.5).
Table 9 shows the variables for each scenario in Group 1. The only variables that change between scenarios are the max age and forward delay timers; the hello timer does not vary from one scenario to the next. The same is true in Table 10. The hello time does, however, differ between Group 1 and Group 4. This is because the value of the hello timer has a direct bearing on both max age and forward delay, so each hello timer value produces different values for max age and forward delay when used with the equations in Table 4 and 5. (Prokkola 2008) suggests that each scenario should be run several times with different random generator seed values, as this makes the experiment more realistic. It should also be noted that, although scenarios D1 and D0 relate to diameters of 1 and 0, it is not physically possible to have a diameter of 1 or 0; these values are used theoretically to enable calculation of the lower allowable parameters.
Table 9.
Scenario   Max Age   Forward Delay   Hello Timer
D7         20        15              2
D6         18        13              2
D5         16        12              2
D4         14        10              2
D3         12        9               2
D2         10        7               2
D1         8         6               2
D0         6         4               2
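The scenario values in Table 9 can be regenerated by inverting the diameter equations from section 2.5. The Python sketch below is an illustration only: the function name is hypothetical, and rounding the forward delay up to a whole second is an assumption that happens to reproduce every row of Table 9. It has only been checked against the hello time of 2 used in Groups 1 to 3.

```python
import math


def timers_for_diameter(diameter, hello=2):
    """Invert the diameter equations to get (max age, forward delay).

    max age       = 4 x hello + 2 x diameter - 2
    forward delay = (4 x hello + 3 x diameter) / 2, rounded up (assumption)
    """
    max_age = 4 * hello + 2 * diameter - 2
    forward_delay = math.ceil((4 * hello + 3 * diameter) / 2)
    return max_age, forward_delay


if __name__ == "__main__":
    # Scenarios D7 down to D0, as listed in Table 9.
    for d in range(7, -1, -1):
        ma, fd = timers_for_diameter(d)
        print(f"D{d}: max age {ma}, forward delay {fd}, hello 2")
```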
An example in figure 8 shows the traffic received in bps for Group 1 Scenario D7. The x axis represents time in seconds and the y axis represents traffic received in bits. T1 is the point at which destination 3 initially starts receiving traffic, F is the point of link failure (100s), T2 is the point at which destination 3 starts receiving traffic again after link failure, R is the point of link recovery (200s) and T3 is the point at which destination 3 starts receiving traffic again after link recovery. To obtain the convergence times, the following equations should be applied to the chart:
T1-0=C1, the initial convergence time
T2-F=C2, the convergence time after link failure
T3-R=C3, the convergence time after link recovery
All charts for every scenario are included in the appendices and the corresponding raw data in Excel format is included on an attached CD-ROM.
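The three equations above can be applied to exported traffic samples rather than read off the chart by eye. The following Python sketch shows one way this might be done. It is not the procedure used in the project: the sample format of (time, bits received) pairs, the function names and the synthetic trace are all assumptions made for illustration, and the trace is made-up data, not OPNET output.

```python
def first_traffic_after(samples, start):
    """Time of the first non-zero sample that follows an outage at or after start.

    samples is a list of (time_s, bits_received) pairs sorted by time.
    Returns None if traffic never resumes, i.e. no convergence was observed.
    """
    outage_seen = False
    for time_s, bits in samples:
        if time_s < start:
            continue
        if bits == 0:
            outage_seen = True
        elif outage_seen:
            return time_s
    return None


def convergence_times(samples, failure=100, recovery=200):
    """Return (C1, C2, C3) = (T1 - 0, T2 - F, T3 - R); None where not converged."""
    results = []
    for origin in (0, failure, recovery):
        resumed_at = first_traffic_after(samples, origin)
        results.append(None if resumed_at is None else resumed_at - origin)
    return tuple(results)


def synthetic_trace():
    """Made-up one-second samples with outages of 30, 45 and 30 seconds."""
    outages = [(0, 30), (100, 145), (200, 230)]
    return [(t, 0 if any(a <= t < b for a, b in outages) else 200000)
            for t in range(300)]


if __name__ == "__main__":
    print(convergence_times(synthetic_trace()))
```

Requiring an outage to be seen before traffic counts as resumed avoids reporting a convergence time of zero when the sample taken at the moment of failure still contains traffic.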
Figure 8. (Chart: traffic received against Time (seconds), annotated with the points T1, T2 and T3.)
These results will then be analysed and compared, and conclusions will be drawn based on these results and on the research conducted earlier. It is possible that the results will not match the hypotheses, in which case the experiment may have to be adjusted and carried out again.
4.0
Results
The experiment was split into four groups and was conducted successfully. The results are presented in this chapter.
4.1
Group 1 Results
Each of the eight scenarios in the Group 1 experiments consisted of seven switches in a ring topology (see figure 5). As indicated earlier, this provides a network diameter of seven, the optimum diameter recommended by CISCO and the IEEE (see section 2.5). Only the Max Age and Forward Delay timer settings varied between scenarios. The settings for each scenario were calculated using the diameter and the equations in Table 4 & 5. Table 11 below shows the settings for each scenario and the corresponding results. For example, scenario D7 used a Forward Delay (FD) of 15, a Max Age (MA) of 20 and a Hello Timer (HT) of 2; these are all default values and correspond to a network diameter of 7 (D7). In scenario D7 the network had an initial convergence time (C1) of 30 seconds, a convergence time of 45 seconds after link failure (C2) and a convergence time of 30 seconds after link recovery (C3). As these results come from the default timer settings and the recommended diameter size, they will be used as the baseline against which the other results are compared.
Table 11.
           Group 1 Settings    Group 1 Results
Scenario   FD    MA    HT      C1    C2    C3
D7         15    20    2       30    45    30
D6         13    18    2       26    39    26
D5         12    16    2       24    35    24
D4         10    14    2       20    29    20
D3         9     12    2       18    25    18
D2         7     10    2       14    19    14
D1         6     8     2       12    15    12
D0         4     6     2       8     9     8
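The percentage improvements quoted for Group 1 follow directly from Table 11. The Python sketch below shows the arithmetic, taking scenario D7 as the baseline. The function and variable names are hypothetical; the figures are copied from Table 11.

```python
SCENARIOS = ["D7", "D6", "D5", "D4", "D3", "D2", "D1", "D0"]
C1 = [30, 26, 24, 20, 18, 14, 12, 8]   # initial convergence (s)
C2 = [45, 39, 35, 29, 25, 19, 15, 9]   # after link failure (s)
C3 = [30, 26, 24, 20, 18, 14, 12, 8]   # after link recovery (s)


def improvement(baseline, value):
    """Percentage reduction in convergence time relative to the baseline."""
    return 100 * (baseline - value) / baseline


def improvements_for(scenario):
    """Improvement in (C1, C2, C3) for a scenario against baseline D7."""
    i = SCENARIOS.index(scenario)
    return tuple(improvement(series[0], series[i]) for series in (C1, C2, C3))


if __name__ == "__main__":
    for name in SCENARIOS[1:]:
        c1, c2, c3 = improvements_for(name)
        mean = (c1 + c2 + c3) / 3
        print(f"{name}: C1 {c1:.2f}%  C2 {c2:.2f}%  C3 {c3:.2f}%  mean {mean:.2f}%")
```

Each percentage is measured against the D7 value of the same measure, so the mean for a scenario is the average of three separately normalised improvements.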
The results from Table 11 for each scenario have been plotted onto three charts: figure 9 represents the initial convergence time (C1), figure 10 the convergence time after link failure at 100 seconds (C2) and figure 11 the convergence time after link recovery at 200 seconds (C3). In figure 9 the x axis represents the scenarios D7-D0 and the y axis represents the initial convergence time in seconds. The chart shows that initial convergence time decreases steadily as the network diameter (and the corresponding timer values) is stepped down. Scenario D0 initially converges 22 seconds faster than scenario D7, an overall 73.33% improvement on the baseline (D7).
Figure 9.
In figure 10 the x axis represents the scenarios D7-D0 and the y axis represents the convergence time after link failure (C2) in seconds. The chart shows that C2 decreased steadily as the network diameter was stepped down. Scenario D0 converges in 9 seconds as opposed to the 45 seconds of the baseline scenario (D7), an overall 80% improvement in convergence time after indirect link failure. The chart also shows that in every scenario convergence after indirect link failure took longer than the initial convergence detailed in figure 9, with the gap narrowing from 15 seconds in scenario D7 (45 against 30 seconds) to 1 second in scenario D0 (9 against 8 seconds).
Figure 10.
In figure 11 the x axis represents the scenarios D7-D0 as in the previous charts; however, the y axis represents convergence time after link recovery (C3) in seconds. The chart shows that convergence time after link recovery decreased steadily as the network diameter was stepped down. Scenario D0 converges 22 seconds faster after link recovery than scenario D7, an overall 73.33% improvement on the baseline (D7). These convergence times mirror exactly the initial convergence times in figure 9.
Figure 11.
The chart in figure 12 represents the traffic received in bits per second by destination 3 for scenario D0. The x axis signifies time in seconds and the y axis signifies the amount of traffic received in bits. The chart shows that the network initially converges at 8 seconds (C1) and traffic is received at an average rate of 206866 bps until the link fails at 100 seconds. At this point no traffic is received for 9 seconds (C2), until 109 seconds, when traffic continues to be received at an average rate of 209705 bps. This continues until link recovery at 200 seconds, at which point no traffic is received for 8 seconds (C3), until 208 seconds, when traffic is then received at an average rate of 206255 bps until the end of the experiment at 500 seconds. Figure 12 shows that scenario D0 successfully converged and remained stable whilst using the lowest allowable parameters of 6 for Max Age and 4 for Forward Delay as defined by IEEE (Anon, IEEE 802.1D 1990). The charts detailing the traffic received in bps by destination 3 for all other Group 1 scenarios are included in the appendices in section 6.3.1.
Figure 12. (Chart: x axis Time (seconds).)
4.2
Group 2 Results
Group 2 experiments consisted of 8 switches (see figure 6) rather than the 7 used in Group 1. All other parameters, including timer parameters, are the same as in Group 1. All scenarios have a network diameter of eight, which is outside the diameter of 7 recommended by CISCO and the IEEE (see section 2.5). The settings and results for each scenario are detailed in Table 12.
Table 12.
           Group 2 Settings    Group 2 Results
Scenario   FD    MA    HT      C1    C2    C3
D7         15    20    2       30    44    30
D6         13    18    2       28    38    26
D5         12    16    2       27    34    24
D4         10    14    2       25    28    20
D3         9     12    2       24    24    18
D2         7     10    2       22    18    14
D1         6     8     2       21    14    21
D0         4     6     2       19    8     19
The results from Table 12 for each scenario have been plotted onto three charts: figure 13 represents the initial convergence time (C1), figure 14 the convergence time after link failure at 100 seconds (C2) and figure 15 the convergence time after link recovery at 200 seconds (C3). In figure 13 the x axis represents the scenarios D7-D0 and the y axis represents the initial convergence time in seconds. The chart shows that initial convergence time decreased steadily as the timer settings were stepped down from D7 to D0. Scenario D0 initially converges 11 seconds faster than scenario D7, an overall 36.66% improvement on the baseline (D7).
Figure 13.
In figure 14 the x axis represents the scenarios D7-D0 and the y axis represents the convergence time after link failure in seconds. The chart shows that convergence time decreased steadily as the timer settings were stepped down. Scenario D0 converges in 8 seconds as opposed to the 44 seconds of the baseline scenario (D7), an overall 81.81% improvement in convergence time after indirect link failure. The chart also shows that in scenario D7 convergence after indirect link failure took 14 seconds longer than the initial convergence of D7 detailed in figure 13 (44 against 30 seconds). The pattern is similar for scenarios D6, D5 and D4, where convergence after indirect link failure also took longer than the initial convergence. In scenario D3, however, convergence after indirect link failure took the same time as the initial convergence, and in scenarios D2, D1 and D0, in contrast to the others, convergence after indirect link failure was actually faster than the initial convergence.
Figure 14.
In figure 15 the x axis represents the scenarios D7-D0 and the y axis represents convergence time after link recovery (C3) in seconds. The chart shows that convergence times mirror Group 1's convergence times after link recovery in figure 11 until scenario D1, where the time instead increases from 14 seconds to 21 seconds, before decreasing again to 19 seconds in scenario D0. Figure 15 also shows that scenario D2 converges 16 seconds faster than scenario D7, an improvement of 53.33% on the baseline (D7); however, this improvement drops to 30% in scenario D1 and 36.66% in scenario D0.
Figure 15.
The chart in figure 16 represents the traffic received in bits per second by destination 3 for Group 2 scenario D0. The x axis signifies time in seconds and the y axis signifies the amount of traffic received in bits. The chart shows that the network initially converges at 19 seconds (C1) and traffic is received at an average rate of 207568 bps until the link fails at 100 seconds. At this point no traffic is received for 8 seconds (C2), until 108 seconds, when traffic continues to be received at an average rate of 209578 bps. This continues until link recovery at 200 seconds, at which point no traffic is received for 19 seconds (C3), until 219 seconds, when traffic is then received at an average rate of 203537 bps until the end of the experiment at 500 seconds. Figure 16 shows that scenario D0 successfully converged and remained stable despite using 8 switches and the lowest allowable parameters of 6 for Max Age and 4 for Forward Delay. The charts detailing the traffic received in bps by destination 3 for all other Group 2 scenarios are included in the appendices in section 6.3.2.
Figure 16. (Chart: x axis Time (seconds).)
4.3
Group 3 Results
Group 3 experiments consisted of nine switches (see figure 7) rather than the seven or eight used in Groups 1 and 2. All other parameters, including timer parameters, are the same as in Groups 1 and 2. All scenarios have a network diameter of nine, which is two greater than the diameter of seven recommended by CISCO and the IEEE (see section 2.5). The settings and results for each scenario are detailed in Table 13. The scenario D0 results, highlighted in red, have no values as the network failed to converge in this scenario.
In figure 17 the x axis represents the scenarios D7-D0 and the y axis represents the initial convergence time in seconds. The chart shows that initial convergence time decreased steadily as the timer settings were stepped down from D7 to D1. Scenario D1 initially converges 9 seconds faster than scenario D7, a 30% improvement on the baseline (D7). However, the network in scenario D0 failed to converge and has a value of 0.
Figure 17. Grp 3 - Initial Convergence (C1). (Chart: x axis Scenario, D7 to D0; y axis Initial Convergence time C1 (secs); data labels 30, 28, 27, 25, 24, 22, 21 and 0.)
In figure 18 the x axis represents the scenarios D7-D0 and the y axis represents the convergence time after link failure (C2) in seconds. The chart shows that C2 decreased steadily as the timer settings were stepped down from D7 to D1. Scenario D1 converges in 13 seconds as opposed to the 43 seconds of the baseline scenario (D7), an overall 69.76% improvement in convergence time after indirect link failure. The chart also shows a value of 0 for scenario D0, because scenario D0 failed to converge.
Figure 18.
In figure 19 the x axis represents the scenarios D7-D0 and the y axis represents convergence time after link recovery (C3) in seconds. This chart closely resembles the chart in figure 15. Convergence time decreases steadily until scenario D2, where it instead increases from 18 seconds to 22 seconds, before decreasing again to 21 seconds in scenario D1. Convergence times after link recovery in scenarios D2 and D1 mirror exactly the initial convergence times of the same scenarios. Scenario D0 failed to converge and has a value of 0. The greatest performance increase came with the settings of scenario D3, which provided a 40% decrease in convergence time compared to scenario D7.
Figure 19. (Chart: y axis Convergence time after link recovery C3 (secs).)
The chart in figure 20 represents the traffic received in bits per second by destination 3 for Group 3 scenario D0. The x axis signifies time in seconds and the y axis signifies the amount of traffic received in bits. The chart shows that the network failed to converge: no traffic is received by destination 3 in the period between 75 seconds and 500 seconds.
Figure 20.
4.4
Group 4 Results
Each of the seven scenarios in the Group 4 experiments consisted of seven switches in a ring topology (see figure 5). As indicated earlier, this provides a network diameter of seven, the optimum diameter recommended by CISCO and the IEEE (see section 2.5). In this group of experiments the Hello Timer was lowered from the default value of 2 to 1. The settings for each scenario were calculated using the diameter and the equations in Table 4 & 5. The settings and results for each scenario are detailed in Table 14.
In figure 21 the x axis represents the scenarios D7-D1 and the y axis represents the initial convergence time in seconds. The chart shows that initial convergence time decreased steadily as the timer settings were stepped down from D7 to D2. Scenario D2 initially converges 18 seconds faster than the baseline (Group 1 Scenario D7), a 60% improvement. However, the network in scenario D1 failed to converge and has a value of 0.
Figure 21.
In figure 22 the x axis represents the scenarios D7-D1 and the y axis represents the convergence time after link failure (C2) in seconds. The chart shows that C2 decreased steadily as the timer settings were stepped down from D7 to D2. Scenario D2 converges in 10 seconds as opposed to the 45 seconds of the baseline scenario (Group 1 Scenario D7), an overall 77.77% improvement in convergence time after indirect link failure. The chart also shows a value of 0 for scenario D1, because scenario D1 failed to converge.
Figure 22.
In figure 23 the x axis represents the scenarios D7-D1 and the y axis represents convergence time after link recovery (C3) in seconds. Convergence time decreases steadily until scenario D2. Scenario D2 converges in 12 seconds as opposed to the 30 seconds of the baseline scenario (Group 1 Scenario D7), a 60% improvement in convergence time after link recovery. Scenario D1 failed to converge and has a value of 0.
Figure 23. Grp 4 - Convergence after link recovery (C3). (Chart: x axis Scenario, D7 to D1; y axis Convergence time after link recovery C3 (secs); data labels 26, 22, 20, 16, 14, 12 and 0.)
The chart in figure 24 represents the traffic received in bits per second by destination 3 for Group 4 scenario D1. The x axis signifies time in seconds and the y axis signifies the amount of traffic received in bits. The chart shows that the network failed to converge: no traffic is received by destination 3 in the period between 55 seconds and 500 seconds.
Figure 24. Grp 4 - D1 - Dest 3 - Traffic received (bps). (Chart: x axis Time (seconds); y axis Traffic received (bits).)
5.0 5.1
The results of Group 1 showed that the initial convergence time in all Group 1 scenarios could be equated to 2xFD=C1, where FD is forward delay and C1 is initial convergence time. For convergence time after link recovery (C3) the equation 2xFD=C3 applies; therefore in Group 1 C1=C3. For convergence time after indirect link failure (C2), however, this equation does not apply and the Max Age (MA) and Message Age (Mage) values must be factored in. The Message Age (Mage) value is not fixed and is incremented every time a BPDU is forwarded by a switch, so the network diameter directly affects convergence time after indirect link failure. As a result the following equation applies: 2xFD+MA-Mage=C2. For example, applying this equation to the settings in Group 1 Scenario D7 gives 2x15+20-5=45. The Message Age value of 5 does not reflect the network diameter of 7; it is instead counted from the root to the point of link failure, which is 5 hops in the Group 1 experiment. Applying the equation to Scenario D0 gives 2x4+6-5=9.
The Group 1 experiment stayed within the CISCO and IEEE recommended diameter of seven and used the equations specified by the IEEE (see tables 4 & 5). The results show that convergence times can be safely lowered from between 30-45 seconds with the defaults in Scenario D7 to between 8-9 seconds in Scenario D0, an overall average improvement of 75.55% in convergence performance. Although the network converged successfully in scenario D0, there is not much time for the network to stabilise. IEEE states that MA>Mage must be true for a network to converge. In scenario D0 this is true, as 6>5, but it leaves only 1 second (6-5) before a BPDU received by the furthest switch from the root expires, causing a topology recalculation. Furthermore, if another switch were added to the topology, the network would not converge with the lowest settings. It would therefore not be advisable to lower the timers to their lowest allowable values of 4 for FD and 6 for MA. Setting MA=10 and FD=7, as in scenario D2, would still bring a 66% performance increase whilst ensuring the network remains stable (MA-Mage=5).
Group 1 experiments tested the following hypotheses, initially identified in section 1.3:
Convergence time of STP (802.1D) can be decreased by careful modification of the timers max age, forward delay and hello time without the network becoming unstable.
Using a network diameter of 7 with the minimum allowable values for timers max age and forward delay, the network will still converge.
These hypotheses have been proven true by the Group 1 experiments, which showed that STP (802.1D) convergence time can be decreased whilst the network remains stable; furthermore, the network remained stable whilst using the minimum allowable values for the max age and forward delay timers.
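The equation 2xFD+MA-Mage=C2 and the stability margin MA-Mage can be checked against every Group 1 scenario with a few lines of Python. This is a sketch for illustration: the function names are hypothetical, the settings and measured C2 values are copied from Table 11, and the Message Age of 5 is the hop count stated above for the Group 1 topology.

```python
def c2_after_indirect_failure(forward_delay, max_age, message_age):
    """C2 = 2 x FD + MA - Mage, in seconds."""
    return 2 * forward_delay + max_age - message_age


def stability_margin(max_age, message_age):
    """MA - Mage: seconds remaining before the BPDU is aged out."""
    return max_age - message_age


MESSAGE_AGE = 5  # hops from the root to the failed link in Group 1

# scenario: (forward delay, max age, measured C2), from Table 11
GROUP1 = {
    "D7": (15, 20, 45), "D6": (13, 18, 39), "D5": (12, 16, 35), "D4": (10, 14, 29),
    "D3": (9, 12, 25), "D2": (7, 10, 19), "D1": (6, 8, 15), "D0": (4, 6, 9),
}

if __name__ == "__main__":
    for name, (fd, ma, measured) in GROUP1.items():
        predicted = c2_after_indirect_failure(fd, ma, MESSAGE_AGE)
        margin = stability_margin(ma, MESSAGE_AGE)
        print(f"{name}: predicted C2 {predicted}s, measured {measured}s, "
              f"margin MA-Mage {margin}s")
```

Printing the margin next to each prediction shows how the headroom shrinks as the timers are lowered, which is the basis for preferring scenario D2 over D0.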
5.2
Group 2 Conclusions
Group 2 experiments differed only in that another switch was added to the topology, giving the network a diameter of eight. This value is outside the CISCO and IEEE recommended diameter of seven (see section 2.5). Applying a diameter of eight to the equations (see tables 4 & 5) gives FD=16 and MA=22; however, the Group 2 experiment shows that convergence times can still be decreased in a network with a diameter of eight. For initial convergence, the equation 2xFD=C1 identified in the Group 1 results still applies in scenario D7, but in all other scenarios the network took longer than 2xFD to converge; in scenario D0 the network converged in 2xFD+11. Despite this, there was still a decrease in initial convergence time of 36.66% between scenarios D7 and D0. For convergence after indirect link failure (C2), the equation 2xFD+MA-Mage=C2 is true in all scenarios, and there was an 81.81% decrease in convergence time between scenarios D7 and D0. The results for convergence time after link recovery (C3) showed that the equation 2xFD=C3 was true in all scenarios except D1 and D0, where convergence time increased. This increase at scenario D1 reflects the network's instability at lower timer settings, which causes longer convergence times. The same was true of initial convergence times (C1): the fact that the network takes longer than 2xFD shows instability in the calculation of the initial tree. In conclusion, with 8 switches the network becomes unstable when timers are lowered to MA=8 and FD=6. Despite this, the network did converge in all scenarios, including with the timers at their lowest settings of MA=6 and FD=4.
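The comparison of measured convergence times with 2xFD can be made explicit by computing the excess for each scenario. The Python sketch below uses the Group 2 values from Table 12. The function name is hypothetical, and treating any non-zero excess as a sign of instability follows the interpretation given above rather than an independent criterion.

```python
SCENARIOS = ["D7", "D6", "D5", "D4", "D3", "D2", "D1", "D0"]
FORWARD_DELAY = [15, 13, 12, 10, 9, 7, 6, 4]
GROUP2_C1 = [30, 28, 27, 25, 24, 22, 21, 19]   # initial convergence (s)
GROUP2_C3 = [30, 26, 24, 20, 18, 14, 21, 19]   # after link recovery (s)


def excess_over_2fd(measured, forward_delays=FORWARD_DELAY):
    """Seconds by which each measured time exceeds 2 x forward delay."""
    return [m - 2 * fd for m, fd in zip(measured, forward_delays)]


if __name__ == "__main__":
    c1_excess = excess_over_2fd(GROUP2_C1)
    c3_excess = excess_over_2fd(GROUP2_C3)
    for name, e1, e3 in zip(SCENARIOS, c1_excess, c3_excess):
        print(f"{name}: C1 exceeds 2xFD by {e1}s, C3 exceeds 2xFD by {e3}s")
```

On these figures the C3 excess stays at zero until scenario D1, whereas the C1 excess grows from scenario D6 onwards.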
5.3
Group 3 Conclusions
Group 3 experiments consisted of nine switches (see figure 7), giving the network a diameter of nine. This value is outside the CISCO and IEEE recommended diameter of seven (see section 2.5). Applying a diameter of nine to the equations (see tables 4 & 5) gives FD=18 and MA=24; however, the Group 3 experiment shows that convergence times can still be decreased in a network with a diameter of nine. As in Group 2, the formula 2xFD=C1 identified earlier for initial convergence applies only to Scenario D7 in Group 3; scenarios D6, D5, D4, D3, D2 and D1 all took longer than 2xFD to converge initially. The results for initial convergence in scenarios D7-D1 mirror exactly those in Group 2. Convergence time after indirect link failure for scenarios D7-D1 followed the equation 2xFD+MA-Mage=C2; for example, scenario D1 converged in 13 seconds, as 2x6+8-7=13. However, scenario D0 failed to converge. For convergence after link recovery, the formula 2xFD=C3 applied for scenarios D7-D3; as in Group 2, the convergence time then increases. The difference between Group 2 and Group 3 is that in Group 3 this increase happened a scenario earlier. In Group 2 the network started to become unstable in scenario D1, with the settings MA=8 and FD=6, whereas in Group 3 instability started in scenario D2, with the settings MA=10 and FD=7. This shows the direct effect of adding another switch to the network. From this it could be surmised that a network with a diameter of 10 would become unstable when the timer settings are lowered to MA=12 and FD=9.
Scenario D0, which had the lowest allowable timer settings of MA=6 and FD=4, failed to converge. The rule MA>Mage is not true in scenario D0: with a Max Age of 6 and a Message Age of 7, MA<Mage. This causes the BPDU from the furthest switch to expire before the network has stabilised; STP 802.1D therefore never reaches a stable state and the network does not converge.
Group 3 experiments tested the following hypothesis, initially identified in section 1.3:
Using a network diameter of 9 with the minimum allowable values for timers max age and forward delay, the network will not converge.
In conclusion, Group 3 results show that convergence performance can be improved in a network with a diameter of nine. However, setting timer values below MA=12 and FD=9 will cause the network to become unstable, and with values as low as MA=6 and FD=4 the network will not converge at all. The Group 3 experiments have therefore successfully tested the above hypothesis and found it to be true.
5.4
Group 4 Conclusions
Group 4 used the same network topology as Group 1, but with different timer settings. These settings are based on the equations in table 4 & 5 (see table 7) and differ because they factor in a reduced Hello Timer (HT) value of 1, which allows the FD and MA timer settings to be lowered for each scenario. For example, in Group 1 Scenario D7 the timer values are MA=20, FD=15 and HT=2 (see Table 6), whereas in Group 4, when we factor in the new Hello Timer of 1, we get MA=17, FD=13 and HT=1; both sets of parameters correspond to a diameter of 7. It is possible to apply a Hello Time of 1 to each of the values in table 6; this would not provide any improvement in initial convergence time (C1) or in convergence time after link recovery (C3), although there would be a 1-2 second improvement in convergence after link failure because the failure would be discovered more quickly with the decreased Hello Time.
The results show that Group 4 Scenarios D7 to D2 initially converged 2 seconds faster than the corresponding scenarios in Group 1, an improvement of 6.66% in each. Group 1 initial convergence was equated to 2xFD=C1, whereas in Group 4 initial convergence equates to 2xFD+2=C1. In convergence after indirect link failure there was a 9 second improvement by Group 4 Scenarios D7 to D2 over the corresponding scenarios in Group 1; both Group 1 and Group 4 equated to 2xFD+MA-Mage=C2. In convergence after link recovery (C3) the network converged in 2xFD in Group 4 Scenarios D7 to D3; scenario D2, however, converged in 2xFD+2. This shows that although scenario D2 did converge, it took proportionally slightly longer than the others, which suggests instability where Diameter=7, FD=5, MA=6 and HT=1 (the diameter 2 values in table 7). In scenario D1 the network failed to converge at all. This is due to the Max Age being set to 4, which again results in the Message Age being larger than the Max Age (MA<Mage); additionally, the value of 4 set for MA in scenario D1 is below the lowest allowable value of 6 as defined by IEEE in (Anon, IEEE 802.1D 1990).
Although there was an improvement in convergence time by Group 4 Scenarios D7 to D2 compared to the corresponding scenarios in Group 1, this is on a scenario by scenario basis. If we measure overall performance, we find that Group 4 scenario D2 was on average a 65% improvement on the baseline (Group 1 scenario D7), whereas Group 1 scenario D0 was on average a 75% improvement on the baseline, so there is no overall improvement by Group 4 over Group 1. Furthermore, in Group 4 there is double the amount of control traffic in the form of BPDUs, which could be detrimental in a network whose bandwidth is being fully utilised; this could negate any convergence performance increase achieved by lowering the Hello Timer.
5.5 Overall Conclusions
There were 31 individual experiments carried out, each with different variables. Of those 31 experiments only 2 failed to converge: Group 3 Scenario D7 and Group 4 Scenario D1. On both occasions the Message Age value was greater than the Max Age, resulting in an unstable network. The research literature identifies that initial convergence (C1) and convergence after link recovery (C3) should each take 2xFD. In the Group 1 experiments, with a network diameter of 7, this is indeed the case, showing that an average 75% performance increase can be safely achieved without losing any network stability. The same is true of convergence after indirect link failure (C2), which should converge in 2xFD+MA-Mage. However, when the diameter is increased to 8 or 9, the equations 2xFD=C1 and 2xFD=C3 do not apply. This suggests that the network takes longer to stabilise, and when the lower settings are applied to diameters of 8 and 9 the results show that the convergence time will increase or the network will not converge at all. Whilst there is indeed benefit in lowering the FD and MA timers in a network with a diameter of 7, it is suggested that networks with a diameter of 8 or 9 should be left at the default switch values of MA=20, FD=15 and HT=2. If the network diameter were to increase beyond 9, it would be recommended to increase MA and FD above the default values based upon the equations in Tables 4 and 5. The results of Group 4 have also shown that there is no overall stable performance benefit in lowering the Hello Timer.
The initial research question in section 1.2 asked: can modification of Spanning Tree Protocol (802.1D) timers (hello, max age and forward delay) decrease network convergence time whilst maintaining network stability when simulating link failure and recovery in OPNET? After carrying out the experiments it can now be confidently said that, by careful modification of the Hello, Max Age and Forward Delay timers, network convergence time can be significantly decreased without compromising network stability. However, great care should be taken when modifying timers, following closely the equations and recommendations from the IEEE and Cisco in section 2.5.
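The two failure conditions described in the experiments (a Message Age greater than Max Age, and a Max Age below the lowest allowable value of 6) could be checked before a timer set is applied. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the Message Age of 6 used in the example is an assumed value chosen for illustration, not a measurement from the experiments.

```python
MIN_MAX_AGE = 6  # lowest allowable Max Age quoted in the text (seconds)


def max_age_problems(ma, mage):
    """Return the reasons why a Max Age setting may leave the network unstable."""
    problems = []
    if ma < MIN_MAX_AGE:
        problems.append(f"MA={ma} is below the lowest allowable value of {MIN_MAX_AGE}")
    if mage > ma:
        problems.append(f"Message Age {mage} is greater than MA={ma}")
    return problems


ASSUMED_MESSAGE_AGE = 6  # hypothetical Message Age, for illustration only

# Default Max Age of 20 and the Max Age of 4 used in Group 4 Scenario D1.
for ma in (20, 4):
    found = max_age_problems(ma, ASSUMED_MESSAGE_AGE)
    status = "no problems found" if not found else "; ".join(found)
    print(f"MA={ma}: {status}")
```

An empty list only means that neither of these two conditions was triggered; it does not by itself show that the network will converge.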
5.6 Benefit of Research
As first identified in section 1.4, small businesses with legacy network equipment may find the results beneficial. They would be most beneficial to a small business whose network has a diameter of 7 or less, where fine tuning of existing legacy equipment could bring network performance benefits. Network administrators and engineers could find the report useful as reference material to assist in fine tuning networks. The report may also be of benefit to students studying the mechanisms of STP (802.1D).
5.7 Further Research
Whilst this experiment explored convergence performance within networks with a diameter of 7, 8 and 9, fully tested the initial hypotheses and answered the research question, the research project itself did not fully investigate STP (802.1D) convergence. To gain a comprehensive understanding of convergence performance, all diameters should be fully tested based on the equations in Tables 4 and 5. Using these equations and the maximum allowable range gives a network diameter of 17 with MA=40, FD=30 and HT=2; testing all diameters up to 17 is therefore essential in order to obtain a fuller representation of STP (802.1D) convergence performance.
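As an illustration of how timer values scale with diameter, the sketch below assumes that the equations in Tables 4 and 5 correspond to the simplified formulas in the Cisco timer tuning note listed in the references (CISCO, 2006): Max Age = 4 x Hello + 2 x Diameter - 2 and Forward Delay = (4 x Hello + 3 x Diameter) / 2, rounded up to a whole second. This is an assumption, since the tables are not restated here, and the sketch has only been checked against the HT=2 values quoted in the text; the function names are illustrative.

```python
import math


def max_age(diameter, hello=2):
    """Max Age in seconds, assuming the simplified formula 4*hello + 2*dia - 2."""
    return 4 * hello + 2 * diameter - 2


def forward_delay(diameter, hello=2):
    """Forward Delay in seconds, assuming (4*hello + 3*dia) / 2 rounded up."""
    return math.ceil((4 * hello + 3 * diameter) / 2)


# Diameters quoted in the text: 7 (default timers) and 17 (maximum allowable range).
for diameter in (7, 17):
    print(f"diameter={diameter}: MA={max_age(diameter)}, FD={forward_delay(diameter)}, HT=2")
```

With HT=2 this reproduces MA=20, FD=15 for a diameter of 7 and MA=40, FD=30 for a diameter of 17, matching the values quoted above. Values for other Hello Times should be taken from Table 7 rather than from this sketch.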
6.0 Appendices
6.1 References
Backes, F. (1988) "Transparent bridges for interconnection of IEEE 802 LANs", IEEE Network, vol. 2, no. 1, pp. 5-9.
Bonada, E. (2007) Characterization of the Spanning Tree Protocol.
Carmichael, L.S., Ghani, N., Rajan, P.K., O'Donoghue, K. & Hott, R. (2005) "Characterization and comparison of modern layer-2 Ethernet survivability protocols", Proceedings of the Thirty-Seventh Southeastern Symposium on System Theory, 2005. SSST '05. IEEE, pp. 124.
CISCO (2006) Understanding and Tuning Spanning Tree Protocol Timers [online], Available at: http://www.cisco.com/en/US/tech/tk389/tk621/technologies_tech_note09186a0080094954.shtml, Document ID: 19120, [accessed on 28/10/2010]
Faghani, F. & Mirjalily, G. (2008) "Selecting the best spanning tree in metro Ethernet networks using Genetic algorithm", IJCSNS, vol. 8, no. 6, pp. 106.
Frazier, H. & Pesavento, G. (2001) "Ethernet takes on the first mile", IT Professional, vol. 3, no. 4, pp. 17-22.
Guo, J., Xiang, W. & Wang, S. (2007) "Reinforce networking theory with opnet simulation", Journal of Information Technology Education, vol. 6, pp. 215-226.
Huynh, M. & Mohapatra, P. (2006) "Etherlay: An Overlay Enhancement for Metro Ethernet Networks", IEEE International Conference on Communications, ICC '06, pp. 2675.
Huynh, M. & Mohapatra, P. (2007) "A Scalable Hybrid Approach to Switching in Metro Ethernet Networks", 32nd IEEE Conference on Local Computer Networks, LCN 2007, pp. 436.
IEEE 802.1D-1990, IEEE Standards for Local and Metropolitan Area Networks: Media Access Control (MAC) Bridges.
IEEE 802.1D-1998, IEEE Standard for Information Technology - Telecommunications and Information Exchange Between Systems - Local and Metropolitan Area Networks - Common Specifications Part 3: Media Access Control (MAC) Bridges.
IEEE 802.1s-2002, IEEE Standards for Local and Metropolitan Area Networks - Virtual Bridged Local Area Networks - Amendment 3: Multiple Spanning Trees.
IEEE 802.1w-2001, IEEE Standard for Local and Metropolitan Area Networks - Common Specification. Part 3: Media Access Control (MAC) Bridges - Amendment 2: Rapid Reconfiguration.
Kim, C., Caesar, M. & Rexford, J. (2008) "Floodless in Seattle: A Scalable Ethernet Architecture for Large Enterprises", Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication, pp. 3.
Kwok, C.K. & Mukherjee, B. (1989) "On transparent bridging of CSMA/CD networks", Communications Technology for the 1990s and Beyond. GLOBECOM '89., pp. 185.
Lammle, T. & Quinn, E. (2002) CCNP: Switching Study Guide, 2nd edn, SYBEX Inc., Alameda.
Lucio, G.F., Paredes-Farrera, M., Jammeh, E., Fleury, M. & Reed, M.J. (2003) "Opnet modeler and ns-2: Comparing the accuracy of network simulators for packet-level analysis using a network testbed", WSEAS Transactions on Computers, vol. 2, no. 3, pp. 700-707.
Małowidzki, M. (2004) "Network simulators: A developers perspective", Proc. Int. Sym. Performance Evaluation of Computer and Telecommunication Systems (SPECTS04), pp. 19.
Medagliani, P., Ferrari, G., Germi, G. & Cappelletti, F. (2009) "Simulation-assisted analysis and design of STP-based networks", Simutools '09: Proceedings of the 2nd International Conference on Simulation Tools and Techniques, ICST, pp. 1.
Menga, J. (2003) CCNP Practical Studies: Switching (CCNP self-study), Cisco Press.
Metcalfe, R.M. & Boggs, D.R. (1976) "Ethernet: Distributed packet switching for local computer networks", Communications of the ACM, vol. 19, no. 7, pp. 395-404.
Mirjalily, G., Karimi, M.H., Adibnia, F. & Rajai, S. (2008) "An approach to select the best spanning tree in Metro Ethernet networks", 8th IEEE International Conference on Computer and Information Technology, 2008. CIT 2008., pp. 634.
Pallos, R., Farkas, J., Moldovan, I. & Lukovszki, C. (2007) "Performance of rapid spanning tree protocol in access and metro networks", 2nd International Conference on Access Networks & Workshops, 2007. AccessNets '07., pp. 1.
Perlman, R. (1985) "An algorithm for distributed computation of a spanning tree in an extended LAN", ACM SIGCOMM Computer Communication Review, vol. 15, no. 4, pp. 44-53.
Perlman, R. (1992) Interconnections: Bridges and Routers, Addison-Wesley, Wokingham.
Perlman, R. (2000) Interconnections: Bridges, Routers, Switches, and Internetworking Protocols, Addison-Wesley, Wokingham.
Prytz, G. (2006) "Redundancy in Industrial Ethernet Networks", IEEE International Workshop on Factory Communication Systems, pp. 380.
Prytz, G. (2007) "Network recovery time measurements of RSTP in an ethernet ring topology", IEEE Conference on Emerging Technologies and Factory Automation, 2007. ETFA., pp. 1247.
Sakandar, B. & Barnes, D. (2005) Cisco LAN Switching Fundamentals, Cisco Press, Indianapolis.
Seifert, W.M. (1988) "Bridges and routers", Network, IEEE, vol. 2, no. 1, pp. 57-64.
Sfeir, E., Pasqualini, S., Schwabe, T. & Iselt, A. (2005) "Performance evaluation of ethernet resilience mechanisms", Workshop on High Performance Switching and Routing. HPSR, pp. 356.
Shoch, J.F. & Hupp, J.A. (1980) "Measured performance of an Ethernet local network", Communications of the ACM, vol. 23, no. 12, pp. 711-721.
Steinke, S. (March 1998) Interconnections: Bridges and Routers, Network Magazine.
6.2 Bibliography
Abuguba, S., Moldovan, I. & Lukovszki, C. (2006) "Verification of RSTP convergence and scalability by measurements and simulations", Proceedings BroadBand Europe 2006.
CISCO (2005) Spanning Tree Protocol Problems and Related Design Considerations [online], Available at: http://www.cisco.com/warp/customer/473/16.html, Document ID: 10566, [accessed on 21/10/2010]
Chiruvolu, G., Ge, A., Elie-Dit-Cosaque, D., Ali, M. & Rouyer, J. (2004) "Issues and approaches on extending Ethernet beyond LANs", IEEE Communications Magazine, vol. 42, no. 3, pp. 80-86.
Forouzan, B.A. (2007) Data Communications and Networking, 4th edn, McGraw-Hill, Boston.
Wang, G., Liu, J., Wu, L. & Yao, H. (2009) "Three-Rings Redundancy Industrial Ethernet Based on RSTP", 2009 International Conference on Signal Processing Systems, pp. 228.
IEEE 802.1D-2004, IEEE Standard for Local and Metropolitan Area Networks Media Access Control (MAC) Bridges.
IEEE 802.1Q-2003, IEEE Standards for Local and Metropolitan Area Networks. Virtual Bridged Local Area Networks.
Mahajan, U., Mellacheruvu, R. & Jain, P. (2009) Value-added features for the spanning tree protocol.
Thacker, C. (1986) "Personal distributed computing: the Alto and Ethernet hardware", Proceedings of the ACM Conference on The history of personal workstations, ACM, pp. 87.
Thacker, C.P., McCreight, E.M., Lampson, B.W., Sproull, R.F. & Boggs, D.R. (1979) Alto: A personal computer, Xerox, Palo Alto Research Center.
Wojdak, W. (2003) "Rapid Spanning Tree Protocol: A new solution from an old technology", Performance Technologies, March 2003.
Zeng, A., Hu, Y. & Di, Z. (2009) "Optimal tree for both synchronizability and converging time", Europhysics Letters, EPL, vol. 87, pp. 48002.
Perlman, R. (1992) Interconnections: Bridges and Routers, Addison-Wesley, Wokingham.
6.3 Results (Charts)
6.3.5 CD Index
The Excel files containing the raw data for all charts in Sections 6.3.1 to 6.3.4 can be found in the folder named Results Raw Data in the root of the CD. These files are indexed below:
Group 1 - Scenario D7 Results    GRP1_D7.xls
Group 1 - Scenario D6 Results    GRP1_D6.xls
Group 1 - Scenario D5 Results    GRP1_D5.xls
Group 1 - Scenario D4 Results    GRP1_D4.xls
Group 1 - Scenario D3 Results    GRP1_D3.xls
Group 1 - Scenario D2 Results    GRP1_D2.xls
Group 1 - Scenario D1 Results    GRP1_D1.xls
Group 1 - Scenario D0 Results    GRP1_D0.xls
Group 2 - Scenario D7 Results    GRP2_D7.xls
Group 2 - Scenario D6 Results    GRP2_D6.xls
Group 2 - Scenario D5 Results    GRP2_D5.xls
Group 2 - Scenario D4 Results    GRP2_D4.xls
Group 2 - Scenario D3 Results    GRP2_D3.xls
Group 2 - Scenario D2 Results    GRP2_D2.xls
Group 2 - Scenario D1 Results    GRP2_D1.xls
Group 2 - Scenario D0 Results    GRP2_D0.xls
Group 3 - Scenario D7 Results    GRP3_D7.xls
Group 3 - Scenario D6 Results    GRP3_D6.xls
Group 3 - Scenario D5 Results    GRP3_D5.xls
Group 3 - Scenario D4 Results    GRP3_D4.xls
Group 3 - Scenario D3 Results    GRP3_D3.xls
Group 3 - Scenario D2 Results    GRP3_D2.xls
Group 3 - Scenario D1 Results    GRP3_D1.xls
Group 3 - Scenario D0 Results    GRP3_D0.xls
Group 4 - Scenario D7 Results    GRP4_D7.xls
Group 4 - Scenario D6 Results    GRP4_D6.xls
Group 4 - Scenario D5 Results    GRP4_D5.xls
Group 4 - Scenario D4 Results    GRP4_D4.xls
Group 4 - Scenario D3 Results    GRP4_D3.xls
Group 4 - Scenario D2 Results    GRP4_D2.xls
Group 4 - Scenario D1 Results    GRP4_D1.xls