Voice Over IP
Network Readiness Assessment

For Thomson Learning

Table of Contents
Section 1 – Overview and Scope
Section 2 – Executive Overview
Section 3 – Facilities Assessment
   Data Center
   TDC
Section 4 – Network Baseline and Documentation
Section 5 – Network Assessment
   Mason Data Center
      Switch ports vs. routed ports for WAN
      Routing issues
      Other Switch concerns
      Network device software images
   Thomson Distribution Center
Section 6 – PBX and Telecom Assessment
Section 7 – Voice over IP Simulation Testing
Section 8 – Action Item Summary
Appendix A – Network Loads

Prepared by Derek Small


CCIE # 5832, CCSP, NCSE


Section 1 – Overview and Scope


The purpose of this document is to assess the likelihood of a successful deployment of a
Voice over IP (VoIP) communications system between the Thomson Data Center in
Mason, OH, and the Thomson Distribution Center (TDC) in Florence, KY. This
assessment is limited in scope to the equipment and facilities that will transport voice
traffic between the Avaya IP-enabled PBX at the TDC and the Avaya Witness server in
the Data Center in Mason.

This assessment will consider issues that could affect the reliability and performance of
voice traffic on the LAN and will make recommendations for improvements where
applicable. The recommendations presented here are based on best-practices white papers from Avaya and Cisco and on recommended industry standards.

When voice traffic begins moving over your network, reliability and security require extra attention, so these items are also reviewed as part of this assessment.

Section 2 – Executive Overview


The network at the two sites inspected was found to be exceptionally well maintained and in an excellent state of health. Active monitoring of all primary network functions is performed from the NOC in Mason to ensure maximum availability and performance.

No significant network issues were found, and the simulation testing indicates the network is more than capable of handling the anticipated Voice over IP load. VoIP traffic should yield acceptable quality at the expected maximum call volume.

A number of less significant issues should be addressed as time permits, but these should not impact the success of the Witness server installation in the Mason data center.


Section 3 – Facilities Assessment


The facilities that support the network at Thomson were found to be exceptionally well
designed and maintained.

Data Center
All cabling used in the data center was observed to be Cat-5e or better and was well managed, with attention to proper cable routing and bend radii. All racks were neat and everything was clearly labeled.

Physical access to the data center is via the NOC which is staffed 7x24. Card keys are
required and users must sign in and out.

Fire suppression is accomplished via inert gas (not water). Redundant HVAC systems
were seen to be operating with plenty of additional reserve capacity.

Power systems were redundant with two primary UPS systems powering multiple power
distribution units (PDUs). There is also a backup generator system present for longer
duration power outages.

TDC
The network at the Thomson Learning Distribution Center is primarily housed in one large room, which contains the phone system, the core switching infrastructure, and a few servers. Currently only a single Cisco 6509 is installed at the core, but a second one is sitting in the racks and is expected to be brought on-line in the coming weeks. To avoid confusion with the data center in Mason, this room shall be referred to as the TDC MDF (Main Distribution Frame).

The cabling plant was well maintained and installed to meet or exceed Cat-5 specifications. There is also a combination of multimode and single-mode fiber for all vertical cabling. Environmental conditions in the MDF were adequate, and the room is equipped with a building UPS and generator. The room does use water-based fire suppression; however, it is reportedly on a separate system from the rest of the building, so a fire in the cubicle or office space should not trigger the sprinklers in the MDF. This isolation could not be verified.

Access to the MDF was via card key and security at the facility in general was very high.


Section 4 – Network Baseline and Documentation


The following diagram shows the connectivity of the Data Center in Mason, and the
TDC, as well as the Call Center users connected to closet 4 in the TDC.

Using MRTG, the utilization of all primary data paths was monitored for one week. The load was low on all links except the Gigabit connection between the core switches at the data center, where utilization reached as high as 70% at times. Adding a second link to this data path would be advisable from both a load and a redundancy standpoint.
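
For reference, the calculation behind these utilization figures can be sketched in a few lines of Python. This is a minimal illustration of the counter-delta math MRTG performs, not a tool used during the assessment; the interface counters, sample interval, and link speed below are hypothetical.

    # Minimal sketch: estimate one-way link utilization from two SNMP octet-counter
    # samples, the same delta calculation MRTG performs. All values below are
    # hypothetical, not measurements taken from the Thomson network.

    def utilization_pct(octets_t0, octets_t1, interval_s, link_bps, counter_bits=32):
        """Percent utilization of one direction of a link between two samples."""
        delta_bytes = (octets_t1 - octets_t0) % (2 ** counter_bits)  # handles counter wrap
        return 100.0 * (delta_bytes * 8 / interval_s) / link_bps

    # Example: two 5-minute samples on a Gigabit link (values are illustrative).
    print(f"{utilization_pct(3_100_000_000, 3_287_000_000, 300, 1_000_000_000):.1f}% utilized")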

The results of the MRTG load monitoring can be found in Appendix A – Network Loads.
The results show the network load during the weeks prior to the simulation testing, and
the week of the simulation testing. Simulation testing was done over four days beginning
on Wednesday, October 5th, and ending Saturday, October 8th.

Section 5 – Network Assessment


The overall health of the network was found to be very good, with little room for improvement. A few specific issues were noted at each facility.

Mason Data Center


Although most ports on the two core switches showed few or no errors, observed error rates were higher than expected on several ports. Error rates on production ports should never exceed 0.1%. The following ports showed error rates exceeding 0.1%; a number of other ports showed lower but non-trivial error rates. (A sketch of this check appears after the port lists below.)

Core1 (Mason)
GigE 6/13 (TLUSOHCINDC05_72_5)
GigE 6/34 (questcon1_cab72_6)
GigE 6/36 (cx500-spa-spb_cb10)
GigE 8/24 (cleo_c66-4)
GigE 9/24 (bigip3_248)

Core2 (Mason)
GigE 3/47 (ohcinscribe1_c88_1)
GigE 4/15 (OHCINMAIL05FB_C74_)
GigE 4/25 (delmar-7_cab63_36)
GigE 4/43 (ohcindevg24_c73-10)
GigE 7/8 (tlowd01_cab68_7)
GigE 7/40 (OWDEV05_CAB51)
GigE 8/2 (ohcinvpn_lucent)
GigE 8/4 (tdc100Mbit-CinBell)
GigE 9/5 (bigip6_248)
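
As a reference for ongoing monitoring, the 0.1% guideline above can be checked with a few lines of Python. This is a hypothetical sketch; the helper function and the counter values are illustrative, and in practice the counters would come from SNMP (IF-MIB) or the switches' own interface statistics.

    # Minimal sketch: flag ports whose error rate exceeds 0.1% of packets seen.
    # Counter values here are illustrative; real values would come from SNMP
    # (ifInErrors / ifInUcastPkts) or "show interface" counters on the switch.

    ERROR_RATE_LIMIT_PCT = 0.1  # guideline used in this report

    def error_rate_pct(errors: int, packets: int) -> float:
        """Errors as a percentage of total packets on the port."""
        return 100.0 * errors / packets if packets else 0.0

    sample_ports = {
        # port: (error count, packet count) -- hypothetical sample data
        "GigE 6/13": (9_800, 7_100_000),
        "GigE 9/5": (120, 52_000_000),
    }

    for port, (errors, packets) in sample_ports.items():
        rate = error_rate_pct(errors, packets)
        status = "EXCEEDS 0.1% LIMIT" if rate > ERROR_RATE_LIMIT_PCT else "ok"
        print(f"{port}: {rate:.4f}% errors ({status})")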

Switch ports vs. routed ports for WAN
Currently the high-speed WAN circuits from Cincinnati Bell and Time Warner are connected to ports on the core switches that are configured as switch ports and use VLAN interfaces for routing. These VLAN interfaces are duplicated on each switch, and HSRP is used to allow traffic to fail over from one interface to the other. This design will cause excessive convergence times in the event that one of the high-speed circuits fails.

Currently Core Switch 2 supports the CBT connection. Normally EIGRP retires routes immediately on an interface failure, but if the CBT connection were ever to fail, the associated VLAN interface on Core Switch 1 would not drop, since VLAN interfaces are virtual rather than physical. This would prevent EIGRP from immediately retiring routes learned over that interface, and routing would not fail over to the Time Warner circuit until the EIGRP hold timer expired for the CBT routes. Worse, if static routes are configured, the routes would never fail over to the Time Warner circuit at all.

If the CBT and Time Warner circuits are instead connected to ports configured as routed ports, routing would fail over from one service to the other in less than 2 seconds. As configured now, routing would not fail over to the alternate path for 30 to 45 seconds, if ever.
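
When a fail-over test is performed during a maintenance window, the actual convergence time can be measured simply by timing the gap in reachability across the WAN path. The sketch below assumes a Linux host running the standard ping utility; the target address is a placeholder, not an address from the Thomson network.

    # Minimal sketch: measure the outage window during a WAN fail-over test by
    # pinging a far-side address once per second and timing consecutive failures.
    # Assumes Linux "ping -c 1 -W 1"; the target address is a placeholder.
    import subprocess
    import time

    TARGET = "10.0.0.1"  # placeholder: an address reached only via the WAN path under test

    def reachable(host: str) -> bool:
        """Send one ping with a one-second timeout."""
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "1", host],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0

    outage_start = None
    longest_outage = 0.0
    try:
        while True:
            if reachable(TARGET):
                if outage_start is not None:
                    duration = time.time() - outage_start
                    longest_outage = max(longest_outage, duration)
                    print(f"recovered after {duration:.1f} seconds")
                    outage_start = None
            elif outage_start is None:
                outage_start = time.time()
            time.sleep(1)
    except KeyboardInterrupt:
        print(f"longest outage observed: {longest_outage:.1f} seconds")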

Routing issues
For testing, static routes were placed on the core switches to ensure all simulated VoIP traffic was sent over the CBTS WAN circuit in both directions. Network monitoring showed that something changed on the core switches about three weeks before the start of simulation testing, causing most, if not all, traffic from the TDC to the data center to be sent over the Time Warner circuit alone. The graphs do not have high enough resolution to tell whether this included the simulated voice traffic.

In the future, dynamic routing is recommended, with routing offset-weights applied to interfaces and subnets to influence routing decisions. This will reduce the possibility of a routing failure if a redundant device or path fails.

Care must be taken when routing VoIP traffic so that it flows over the Cincinnati Bell WAN circuit, since the RoadRunner circuit does not support QoS. QoS must also be enabled and configured on all switches and switch ports in the VoIP data path to ensure satisfactory performance.
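
As an aside on verifying that QoS handling survives the whole path, voice bearer traffic is normally marked with DSCP EF (46). The sketch below shows how a simple test sender could produce EF-marked packets so that marking and queuing can be checked along the path; the address and port are placeholders, and this is a generic socket example, not part of the Avaya or Witness software.

    # Minimal sketch: send a UDP test packet marked with DSCP EF (46), the class
    # typically used for voice bearer traffic, so that marking and queuing can be
    # checked with a packet capture or QoS counters along the path.
    # The destination address and port are placeholders; assumes a Linux host.
    import socket

    DSCP_EF = 46
    TOS_BYTE = DSCP_EF << 2  # DSCP occupies the upper six bits of the IP ToS byte

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_BYTE)
    sock.sendto(b"qos-marking-test", ("192.0.2.10", 20000))  # placeholder target
    sock.close()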

Other Switch concerns


Cisco switches have the ability to disable ports that encounter errors. This feature is enabled by default, and it is enabled on all your switches. Non-Cisco equipment and malfunctioning network devices can sometimes send traffic that the Cisco switch perceives as erroneous, which could cause the port to be disabled. Care should be taken to monitor “error disable” events and the errors that caused them. You might also find it advantageous to enable the auto-recovery feature for error-disabled ports. If you do so, you should make the recovery interval long enough that the user notices the problem and is likely to report it, but short enough that the port is re-enabled before an IT staff member would need to act on it. This allows the forensics of the issue to be reviewed and corrective action taken without impacting the user any more than necessary.
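
One straightforward way to keep an eye on these events is to scan the switches' syslog feed for error-disable messages. The sketch below is hypothetical: the log file path and the matched substrings are assumptions about how the messages are collected and worded, not details confirmed on Thomson's switches.

    # Minimal sketch: scan a collected switch syslog file for error-disable events.
    # The log path and the matched substrings are assumptions about message format
    # and collection, not details confirmed for these particular switches.
    import re

    LOG_FILE = "/var/log/switch-syslog.log"  # placeholder path on a syslog server

    errdisable = re.compile(r"ERR_DISABLE|err-disable", re.IGNORECASE)

    with open(LOG_FILE, encoding="utf-8", errors="replace") as log:
        for line in log:
            if errdisable.search(line):
                print("error-disable event:", line.strip())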

Although not related to VoIP, jumbo frames in Gigabit LAN environments can help improve performance. Jumbo frames are not enabled by default, nor are they currently enabled on any of your network switches. To realize the benefit of jumbo frames, they must be enabled on both the servers and the switches.
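
If jumbo frames are enabled at some point, end-to-end support can be spot-checked with a don't-fragment ping sized to the jumbo MTU. The sketch below assumes a Linux host, a 9000-byte MTU, and a placeholder target address; 8972 bytes of payload plus 8 bytes of ICMP and 20 bytes of IP header equals exactly 9000 bytes on the wire.

    # Minimal sketch: verify that a 9000-byte jumbo frame survives the path by
    # sending don't-fragment pings sized to exactly 9000 bytes (8972 + 8 + 20).
    # Assumes Linux ping options ("-M do", "-s"); the target is a placeholder.
    import subprocess

    TARGET = "192.0.2.20"  # placeholder: a server on the jumbo-enabled segment

    result = subprocess.run(
        ["ping", "-c", "3", "-M", "do", "-s", "8972", TARGET],
        capture_output=True, text=True,
    )
    print("jumbo frame path OK" if result.returncode == 0 else "jumbo frame path FAILED")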

The ports with high error rates, and a few other problems seen in the logs, should have been reported by your NOC. If the NOC is not reporting such errors to your IT staff, you should investigate why. If necessary, consider a secondary monitoring system, such as WhatsUp Gold, to monitor your network devices for high port error rates, high CPU utilization, low free memory, and similar conditions.

Since the Avaya gear was not installed at the time the assessment was done, the configuration of the ports that support that gear could not be confirmed. However, any ports that will be operating at 100Mbps should have both the device port and the switch port forced to 100Mbps, full-duplex. Allowing 100Mbps ports to auto-negotiate speed and duplex is a primary source of network errors and poor voice quality on many VoIP deployments. 10Mbps and Gigabit Ethernet ports are not subject to the same negotiation problems and are usually best left to auto-negotiate.
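
Once the Avaya and Witness equipment is in place, the server side of each 100Mbps link can be spot-checked as well. The sketch below assumes a Linux server and a placeholder interface name; a Windows host would require a different check.

    # Minimal sketch: confirm that a Linux server NIC is locked at 100 Mb/s
    # full-duplex by reading sysfs. The interface name is a placeholder; other
    # operating systems would need a different check.
    from pathlib import Path

    IFACE = "eth0"  # placeholder interface name

    speed = Path(f"/sys/class/net/{IFACE}/speed").read_text().strip()
    duplex = Path(f"/sys/class/net/{IFACE}/duplex").read_text().strip()

    if speed == "100" and duplex == "full":
        print(f"{IFACE}: 100 Mb/s full-duplex, as recommended")
    else:
        print(f"{IFACE}: {speed} Mb/s {duplex}-duplex -- check NIC and switch port settings")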

Network device software images


The TDC core switch is running fairly recent switch code, but the MSM is very out of date. The MSM in the TDC 6509 was also showing traceback errors in the system log. Tracebacks are software exceptions that can sometimes lead to crashes; this problem appears to be a software bug pertaining to EIGRP.

All other switches reviewed were running software images that were released within the last 12 months.

Cisco recommends making a routine of updating system software once every six to twelve months. Upgrading to the latest release of the current image, i.e. from 12.2(2) to 12.2(21), introduces no new features, only bug fixes.

The tracebacks in the system log are items that the NOC should be receiving alerts on, and should be acting on as well.

Thomson Distribution Center


The Cincinnati Bell and Time Warner high-speed connections are both on blade 9 of the TDC core switch. Port 9/1 supports the CBT circuit, while port 9/26 supports Time Warner. One of these connections should be moved to a different blade so that all WAN connectivity is not lost if blade 9 fails. If the second 6509 switch will be installed in the very near future, it may make sense to wait until the new switch is in place before addressing this, as one of the circuits would need to move to that new switch anyway.

In addition, the MSM (router engine) in the core switch at the TDC is running very old code, and there are traceback errors in the system log. As noted above, tracebacks are software exceptions that can sometimes lead to system crashes. The IOS image running on the MSM should be updated, and the switch should then be monitored to see whether the traceback errors go away after the upgrade. If they do not, a trouble ticket should be opened with Cisco to determine the nature of the tracebacks.

Section 6 – PBX and Telecom Assessment


To complete this project, the current Avaya PBX at the TDC will have IP-enabled CLAN and MedPro cards installed that will communicate with the Witness server in the Mason Data Center. The maximum number of calls anticipated between the PBX and the Witness server is 134, using the G.729 codec with 20ms of audio sampled into each packet.
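
To put that call volume in context, the worked sketch below estimates the bandwidth it represents. The payload size and header overheads are standard figures for G.729 at 20ms packetization over untagged Ethernet, not measurements from the Thomson network; the result, roughly 31 kbps per call, is the commonly cited planning number for this codec.

    # Worked sketch: approximate bandwidth for 134 G.729 calls at 20 ms packetization.
    # Standard protocol overheads are assumed; these are planning figures, not
    # measurements taken from this network.

    CALLS = 134
    PACKETS_PER_SEC = 1000 // 20     # one packet every 20 ms = 50 packets/second
    G729_PAYLOAD_BYTES = 20          # 8 kbps codec * 20 ms = 20 bytes of voice
    IP_UDP_RTP_BYTES = 20 + 8 + 12   # layer-3/4 headers on each packet
    ETHERNET_BYTES = 18              # Ethernet header + FCS (untagged frame)

    bytes_per_packet = G729_PAYLOAD_BYTES + IP_UDP_RTP_BYTES + ETHERNET_BYTES
    kbps_per_call = bytes_per_packet * 8 * PACKETS_PER_SEC / 1000
    total_mbps = kbps_per_call * CALLS / 1000

    print(f"per call: {kbps_per_call:.1f} kbps; {CALLS} calls: {total_mbps:.2f} Mbps in each direction")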

Avaya recommends a dedicated VLAN for voice, but discourages using routers to provide that separation. Plans for supporting the Witness server include a dedicated VLAN at the data center for that purpose, and there is already a VLAN at the TDC supporting VoIP connections to the VoIP blades in the Avaya PBX.

Avaya also conforms to the industry-standard recommendation of no more than 100ms of end-to-end delay for best quality, with 300ms representing the point beyond which VoIP should not be considered. End-to-end delay on the Thomson network is, not surprisingly, less than 10ms and should be more than acceptable.


Section 7 – Voice over IP Simulation Testing


Using NetAlly from Viola Networks, 134 voice calls were simulated between the Data Center and the TDC using G.729 codecs. The results of the simulation testing can be found in Error: Reference source not found.

Section 8 – Action Item Summary

The following section details the specific issues observed that should be addressed before a large-scale deployment of Voice over IP (VoIP) is considered, as well as less significant items that should be investigated as time permits.

Suggested Improvements
1. Reconfigure the switch ports supporting the WAN circuits as routed ports to decrease failover time in the event of a circuit failure. This item would take about an hour to correct and could be done during production, if done carefully.
2. Revisit what alerts the NOC receives and what action is taken as a result of those
alerts. Deficiencies could be corrected through training or enhancements to the NOC
monitoring tools.
3. Move one of the WAN connections on the core switch at the TDC to a different
blade. If the second 6509 is going to be installed soon, one of the circuits would need to
be moved to that switch instead.
4. Upgrade all your switches to current releases of software. This item could be
completed during a 1-2 hour service window.
5. Consider enabling “error disable recovery” on your Cisco switches to allow ports disabled due to perceived errors to recover automatically, since you will be deploying in a mixed environment.


Appendix A – Network Loads


Figure 1. Network Load on Core1, RoadRunner WAN port, week of testing.

Figure 2. Network Load on Core1, RoadRunner WAN port, month of testing.

Figure 3. Network Load on Core1, to Core2 port, week of testing.

Figure 4. Network Load on Core1, to Core2 port, month of testing.

Figure 5. Network Load on Core2, CBTS WAN port, week of testing.

Figure 6. Network Load on Core2, CBTS WAN port, month of testing.

Figure 7. Network Load on Core2, to Core1 port, week of testing.

Figure 8. Network Load on Core2, to Core1 port, month of testing.

Figure 9. Network Load on TDC-Core, CBTS WAN port, week of testing.

Figure 10. Network Load on TDC-Core, CBTS WAN port, month of testing.

Figure 11. Network Load on TDC-Core, RoadRunner WAN port, week of testing.

Figure 12. Network Load on TDC-Core, RoadRunner WAN port, month of testing.
