Case study: Olympic systems

IBM developed the web information systems for three consecutive Olympic Games. The system for Atlanta 1996 had major problems, whereas the systems for Nagano 1998 and Sydney 2000 were successes.

Table of Contents:
1. Overview of the Systems
   1.1 Atlanta 1996
   1.2 Nagano 1998
   1.3 Sydney 2000
2. Reasons for Failure in Atlanta and Success in Nagano and Sydney from the Systems Development Viewpoint
3. Conclusion (Lessons Learned)

1. Overview of the Systems

1.1 Atlanta 1996


The objectives for the global Web site were to deliver dynamic information across the Web to millions of people around the world and to achieve complete customer satisfaction. To meet these objectives, information had to be kept up to date and accessible at all times. IBM chose the DFS (Distributed File System) of its subsidiary Transarc, whose features made it well suited to these goals. Aggressive caching kept information on the Web servers themselves, dramatically reducing the number of times information had to be retrieved from a file server; this lowered both server and network loads and sustained peak performance even under very high demand. Replication copied information automatically across multiple servers within a single Web site, which ensured consistency of data and provided the scalability needed by large, distributed Web sites. Location-independent file names supported unique file names, independent of the physical location of the file, giving Web administrators a flexible plug-and-play environment in which resources could be added or reconfigured easily to meet information requests. Finally, a dedicated security mechanism offered the flexibility to assign different levels of access to administrators and users, protecting information as it was distributed and shared across the public Internet.

Concerning the system architecture, the Olympic Web site consisted of five different locations: two sites in the United States and one each in Japan, Germany and England.
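As a rough illustration of the aggressive caching described above, the following Python sketch shows a web server that answers requests from its local cache and only contacts a (replicated) file server on a miss. All class and function names are our own invention for this sketch; they do not reflect the actual DFS interfaces.

    # Illustrative sketch of "aggressive caching" on a Web server acting as a
    # DFS client: serve from the local cache when possible, otherwise fetch from
    # one of the consistent file-server replicas and cache the result.
    # All names are invented for illustration; this is not the DFS API.
    import random


    class FileServerReplica:
        """Stands in for one of the replicated DFS file servers."""

        def __init__(self, name, files):
            self.name = name
            self.files = dict(files)

        def read(self, path):
            return self.files[path]


    class CachingWebServer:
        """Web server with a local cache in front of the replicas."""

        def __init__(self, replicas):
            self.replicas = replicas
            self.cache = {}
            self.hits = 0
            self.misses = 0

        def get(self, path):
            if path in self.cache:          # local cache hit: no network traffic
                self.hits += 1
                return self.cache[path]
            self.misses += 1                # miss: any replica will do, all are consistent
            content = random.choice(self.replicas).read(path)
            self.cache[path] = content
            return content


    if __name__ == "__main__":
        replicas = [FileServerReplica(n, {"/results/100m.html": "<html>...</html>"})
                    for n in ("southbury", "tokyo", "portsmouth")]
        server = CachingWebServer(replicas)
        for _ in range(1000):               # repeated requests for a hot page
            server.get("/results/100m.html")
        print(f"hits={server.hits} misses={server.misses}")   # 999 hits, 1 miss

The point of the sketch is only the access pattern: once a page is cached, repeated requests never touch the file servers, which is what kept server and network loads low.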

Each site contained Web servers (configured as DFS clients) running on SP2s, clustered machines that combine multiple RISC processors, and DFS file servers running on RS/6000s. All of these made up a single DFS cell. DFS state-of-the-art replication ensured consistency of data across all sites. In addition to balancing the load across servers and enhancing server performance, DFS replicas provided closer access to data and failover capabilities. DFS was also used to store and replicate the system software required by all of the computers in the Web site, ensuring constant availability of these critical files. DFS security ensured that only authorised administrators could update Olympic information on the file servers and perform other maintenance operations, much of which took place across the Internet. While users were given free access to the public Olympic information, system software files and other sensitive data were safe from unauthorised access because only administrators had the proper credentials to update machines at remote locations.

DFS provided client-side caching, which allowed the Web servers, as DFS clients, to respond to information requests with files that were already in the local cache. As a result, only 5% of the 1215 million hits required information to be retrieved directly from a DFS server, which drastically reduced the number of machines and the network bandwidth required. DFS stored much of the most frequently accessed Olympic information in a variety of formats, including HTML, graphics, audio and video, to provide constantly updated footage of Olympic highlights. With IBM's Sneak Peek camera, for example, 48 different cameras in various locations at the Olympic site fed images to the master DFS site in Southbury, Conn. The images were then transferred to DFS servers at the other four sites, where over half a million high-quality photos were stored in DFS, organised by event and constantly updated.

DFS also stored a profile of each user accessing IBM's Olympic Web site. A unique identifier was assigned to each user visiting the site. Based on the user's previous visits, a profile identifying the user's areas of interest was created and attached to the identifier. The profile was then used to create a customised home page that displayed the latest information of interest to the user each time he or she returned to the Web site. All profiles were stored in DFS and replicated across file servers to keep them accessible and to permit the personalised pages to be displayed. http://www.transarc.ibm.com/Library/newsletters/DT/DTarchive/9701/dfs.html
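The profile mechanism can be pictured with a short sketch. The identifiers, field names and page-building function below are assumptions made purely for illustration; the source does not describe how the profiles were actually represented in DFS.

    # Illustrative sketch of per-user profiles driving a customised home page.
    # All identifiers and helpers are invented; they do not reflect how the
    # profiles were actually stored in DFS.
    import uuid
    from collections import defaultdict

    profiles = defaultdict(set)        # unique user identifier -> areas of interest
    latest_news = {                    # stands in for the constantly updated content
        "athletics": "New 100m Olympic record",
        "swimming": "Relay final tonight",
        "gymnastics": "All-around medals decided",
    }


    def new_visitor():
        """Assign a unique identifier to a first-time visitor."""
        return str(uuid.uuid4())


    def record_visit(user_id, section):
        """Remember which section the user looked at on this visit."""
        profiles[user_id].add(section)


    def personalised_home_page(user_id):
        """Build a home page from the latest items in the user's areas of interest."""
        interests = profiles[user_id] or latest_news.keys()   # new users see everything
        return "\n".join(f"* {latest_news[t]}" for t in sorted(interests) if t in latest_news)


    if __name__ == "__main__":
        uid = new_visitor()
        record_visit(uid, "athletics")
        record_visit(uid, "swimming")
        print(personalised_home_page(uid))   # only athletics and swimming items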

1.2 Nagano 1998


The Web site utilised thirteen SP2 systems scattered around the globe, containing a total of 143 processors, 78 GB of memory, and 2.4 terabytes of disk space. This amount of hardware was deployed both for performance and for high availability. One technique used to serve dynamic data to clients efficiently was to cache dynamic pages so that they only had to be generated once. To this end, IBM developed and implemented a new algorithm named Data Update Propagation (DUP), which identifies the cached pages that have become stale as a result of changes to the underlying data on which they depend, such as databases.
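As we understand it from the description above, the core idea of DUP is to record which cached pages depend on which pieces of underlying data, so that a database update identifies exactly the affected pages. The following sketch illustrates that idea with invented names; it is not IBM's implementation.

    # Minimal sketch of the dependency-tracking idea behind Data Update
    # Propagation (DUP): remember which cached pages depend on which underlying
    # data, so an update pinpoints exactly the pages to refresh in place.
    from collections import defaultdict

    cache = {}                      # page path -> rendered HTML
    depends_on = defaultdict(set)   # data key -> pages that depend on it


    def render(page, data):
        """Pretend page generator; in reality this queried the results database."""
        return f"<html>{page}: {data}</html>"


    def cache_page(page, data_keys, data):
        cache[page] = render(page, data)
        for key in data_keys:
            depends_on[key].add(page)


    def update_data(key, new_data):
        """On a database update, refresh exactly the dependent pages in place."""
        for page in depends_on[key]:
            cache[page] = render(page, new_data)   # updated in the cache, never invalidated


    if __name__ == "__main__":
        cache_page("/athletics/100m", {"100m_results"}, "heats")
        cache_page("/medals/overall", {"100m_results", "medal_table"}, "heats")
        update_data("100m_results", "final: new record")
        print(cache["/athletics/100m"])   # both dependent pages now show the new data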

For the Olympic Games Web site, it was possible to update stale pages directly in the cache, which obviated the need to invalidate them. This made it possible to achieve cache hit rates of close to 100%.

A private, high-speed IP network was built to support the movement of content from the source in Nagano to the Web server complexes in Tokyo, Schaumburg, Columbus, and Bethesda. Each of the server complexes had direct connectivity to OpenNet, which is part of the IBM Global Network, lies outside the corporate firewalls, and services Internet users around the globe. OpenNet and its network of routers became the foundation for the Nagano Olympics network. In order to provide surplus capacity for Internet users outside OpenNet, additional dedicated lines and line upgrades were established for public access to the Olympic Games Web servers. These dedicated lines provided high-bandwidth connectivity from the Olympic host sites to the major US Network Access Points (NAPs), which are sites where major networks (e.g. MCI, Sprint, UUNet) connect to each other and exchange data. They supplied the essential bandwidth for the Olympic Games network, decreasing access times for users on other ISPs and ensuring that the flood of data did not overload and bring down major Internet links.

Since multiple locations were used to host the Web site, one goal was to provide users with the fastest performance possible by routing their requests to the server complex closest to them. But a way was also needed to reroute traffic if a particular site or network link became overloaded. To meet these requirements, IBM used several common Internet protocols combined in a way that gave maximum performance, flexibility and reliability. The combination of these techniques was called Multiple Single IP Routing (MSIRP). What makes MSIRP work is the way routing functions on the Internet. IP routing is dynamic by design: routing can be redirected on the fly to ensure that an IP network will continue to function when failures or changes affect the interconnections between routers and data exchange points. As a result, on the huge interconnected network that makes up the Internet, there are always multiple paths to a single destination. Dynamic routing protocols such as Open Shortest Path First (OSPF) or the Border Gateway Protocol (BGP) determine which route to take by assigning a cost, based on time or some other metric, to each network "hop" traversed between routers until the destination is reached. After calculating the total cost of each route, data is sent along the least costly path; in network terms, this route is the shortest path to the destination.

IBM's Interactive Network Dispatcher was an essential component for implementing the complex network management techniques that were used. The purpose of the Network Dispatcher was to accept traffic requests from the Internet that were routed to a complex and to forward them to an available Web server for processing. At each complex in the US, four Net Dispatcher servers sat between the routers and the front-end Web servers. Each of these four Net Dispatcher boxes was the primary source of three of the twelve SIPR addresses and the secondary source for two other SIPR addresses. Secondary addresses for a particular Net Dispatcher were given a higher OSPF cost. The Net Dispatchers ran the gated routing daemon, which was configured to advertise IP addresses as routes to the routers via OSPF.
Differing costs were used depending on whether the Net Dispatcher was the primary or secondary server of an IP address. The routers then redistributed these routes into the OpenNet OSPF network.

Incoming requests received by any OpenNet router could then be routed, using knowledge of the routes advertised by each of the complexes, to the Net Dispatcher with the lowest OSPF cost. Typically, this would be the Net Dispatcher at the closest complex that was the primary source for the address assigned to the incoming request (the address itself having been given to the browser via round-robin DNS). A request would only be sent to the secondary Net Dispatcher box for a given address if the primary Net Dispatcher box was down or had failed for some reason. If the secondary Net Dispatcher also failed, traffic would be routed to the primary Net Dispatcher in a different complex; this is similar to the situation where a system deliberately does not advertise an address in order to move traffic from one complex to another. A key benefit of this design was that it gave control of the load balancing across the complexes to the operators of the Net Dispatchers. No changes were required on the routers to support this design, as the routers learned routes from the Net Dispatchers via a dynamic routing protocol.

Each Net Dispatcher box was connected to a pool of front-end Web servers dispersed among the SP/2 frames at each site. Traffic was distributed among these Web servers using the advisors included as part of the Interactive Session Support (ISS) Interactive Network Dispatcher (IND) product. If a Web node went down, the advisors immediately pulled it from the distribution list. This approach also ensured high availability by avoiding any single point of failure at a site. If a Web server went down, the Net Dispatcher automatically routed requests to the other servers in its pool. If an SP/2 frame went down, the Net Dispatcher routed requests to the other frames at the site. If a Net Dispatcher box went down, the router sent requests to its backup. If an entire complex failed, traffic was automatically routed to a backup site. In this way IBM achieved a so-called elegant degradation, in which failures within a complex were immediately accounted for and traffic was smoothly redistributed to the elements of the system that were still functioning. http://www.supercomp.org/sc98/TechPapers/sc98_FullAbstracts/Challenger602/index.htm#net
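The routing and failover cascade described above can be summarised in a toy sketch. The costs, names and data structure are assumptions for illustration only; the real system expressed these preferences as OSPF costs advertised by the Net Dispatchers through gated.

    # Toy sketch of the "lowest advertised cost wins" failover cascade:
    # among all live advertisers of an address, traffic goes to the cheapest one;
    # if a whole complex is down, the next-cheapest advertiser elsewhere takes over.
    from dataclasses import dataclass


    @dataclass
    class Advertiser:
        complex_name: str
        dispatcher: str
        cost: int          # OSPF-style cost; primaries advertise a lower cost
        alive: bool = True


    def route(advertisers):
        """Send the request to the live advertiser with the lowest cost."""
        live = [a for a in advertisers if a.alive]
        if not live:
            raise RuntimeError("no complex can serve this address")
        return min(live, key=lambda a: a.cost)


    if __name__ == "__main__":
        ads = [
            Advertiser("tokyo", "nd1", cost=10),       # primary for this address
            Advertiser("tokyo", "nd2", cost=20),       # secondary in the same complex
            Advertiser("schaumburg", "nd1", cost=30),  # primary in another complex
        ]
        print(route(ads).dispatcher)     # "nd1" at tokyo handles normal traffic
        ads[0].alive = False
        print(route(ads).dispatcher)     # falls over to "nd2" at tokyo
        ads[1].alive = False
        print(route(ads).complex_name)   # whole complex down: "schaumburg" takes over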

1.3 Sydney 2000


To handle the tidal wave of Internet attention and the Games' massive internal systems requirements, IBM had in place three S/390 mainframes with Parallel Sysplex capabilities, 50 RS/6000s, three RS/6000 SP servers, 540 Intel-based Netfinity servers, 7,300 PCs, 50 ThinkPad portable systems, 845 network switches, 7,000 monitors, and 1,655 printers. As part of its core IT solution, IBM had set up three basic systems: the Games Management System, the Games Results System, and INFO, an intranet-based information resource for the 260,000 members of the Olympic family. All three of these systems interacted with one another to share information across a multi-tiered infrastructure. The Games Results System received competition results from the venue results applications and distributed them to 15,000 members of the media and to 700 printers. At the venues, the printed results could be distributed to International Sport Federation officials, athletes, and coaches.

The Games Management System consisted of a set of applications that handled the logistical and administrative aspects of the Games. IBM had developed applications for tasks such as accreditation, medical operations, processing arrivals and departures, and incident tracking. The competition results applications collected competition data and distributed it to broadcasters and venue scoreboards. These applications were venue-specific, so if one venue happened to experience technical problems, the other venues could continue scoring and reporting.

IBM used DB2, WebSphere, Lotus Domino, and Net.Commerce to power the official Games site. The company's HotMedia software allowed visitors to the site's Olympic store to inspect selected items by rotating them. Playing a central role was DB2, which was used to collect and manage live results data and to help ensure data integrity and accuracy for the Games' scoring system. To help manage the systems, IBM subsidiary Tivoli provided a slew of products to coordinate applications and the Games' 10,000-device infrastructure: Tivoli NetView and NetView for OS to manage the network and systems, Tivoli Software Distribution for all servers and desktops, and Tivoli Enterprise Console to provide a single view of the enterprise with availability and failure rules. Overall, IBM had developed 12 million lines of new code to run the Games' 37 sports results systems, deploy TV feeds and support broadcasters with commentator information, provide central feeds to information intranets at the Sydney Olympic Village, and supply print distribution systems. Finalising the sports results systems proved particularly tricky, because each unique report generated during a contest had to match the configuration requirements of the IOC, the sports federations, and the Olympic committees of the 198 countries represented at the Games. http://www2.itworld.com/cma/ett_article_frame/0,2848,1_2535,00.html

Organisers were expecting over a billion hits. To cope with this, IBM had set up multiple RS/6000 SP machines running AIX and its own HTTP server. Separate "cache farms" put frequently accessed pages into high-speed memory for faster retrieval. The collection and management of live results data used IBM's DB2 database software, while Lotus Domino R5 handled news feeds, photos, and static content. Dynamic content such as results pages, historical information, headlines, and scoring console feeds was generated by WebSphere Application Server Advanced. Net.Commerce was used to sell Olympic merchandise online, and IBM HotMedia was used for interactive elements. IBM officials said traffic on the servers was constantly being analysed to refine content placement based on how people were using the site. http://www.newswire.com.au/apcweb/news.nsf/HTML/Category/24123DD1D7E86ADBCA25695F000D91C5
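As a rough illustration of the split between the live results data and the generated pages described above, the sketch below produces a results page on demand and keeps frequently requested pages in a small in-memory "cache farm". The data source, page template and eviction policy are assumptions for the sketch, not the Sydney implementation.

    # Rough illustration of a "cache farm" in front of a dynamic page generator.
    # The real site generated pages with WebSphere from DB2 data; everything
    # below is an invented stand-in.
    from functools import lru_cache

    RESULTS_DB = {                       # stands in for live results data
        ("athletics", "100m"): "1. A. Runner  9.87s",
        ("swimming", "400m freestyle"): "1. B. Swimmer  3:40.59",
    }


    @lru_cache(maxsize=1024)             # the "cache farm": hot pages stay in memory
    def results_page(sport, event):
        rows = RESULTS_DB.get((sport, event), "no results yet")
        return f"<html><h1>{sport} - {event}</h1><pre>{rows}</pre></html>"


    if __name__ == "__main__":
        print(results_page("athletics", "100m"))
        print(results_page("athletics", "100m"))   # second call served from the cache
        print(results_page.cache_info())           # hits=1, misses=1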

2. Reasons for Failure in Atlanta and Success in Nagano and Sydney from the Systems Development Viewpoint

One of the major reasons for the problems at the Atlanta Olympic Games is that IBM tried to showcase its latest technology instead of concentrating on the needs of the Games. The specific requirements were not given enough attention, which led to a system that was not powerful enough for the application's needs. It seemed logical to display the very latest technology at the Games, but unfortunately it was fallible and difficult to fix.

During the 1996 Olympic Games, a conservative caching approach was taken whereby a large number of pages were invalidated after database updates. While this preserved cache consistency, significantly more pages were invalidated than necessary, so cache miss rates after database updates were high. It was a big task to identify which pages were affected by changes to the databases, the team had a hard time keeping the results up to date, and hit rates reached only around 80%. For the Nagano Olympic Games, IBM developed and implemented the new Data Update Propagation (DUP) algorithm described in the Nagano section above. This new technology allowed IBM to achieve cache hit rates of close to 100%. Another key component in achieving near-100% hit rates was prefetching: when hot pages in the cache became obsolete as a result of updates to underlying data, new versions of the pages were written directly into the cache. Such pages were never invalidated from the cache, so there were no cache misses for them. Furthermore, IBM introduced a new network architecture, as described above: Multiple Single IP Routing (MSIRP) provided maximum performance, flexibility and reliability.

In the 1996 Web site hierarchy, at least three Web server requests were needed to navigate to a result page, and similar browsing patterns were required for the news, photos, and sports sections of the Web site. When a client reached a leaf page, there were no direct links to pertinent information in other sections. Because of this hierarchy, the intermediate pages required for navigation were among the most frequently accessed. By contrast, virtually every page of the 1998 Web site contained useful information; the pages were designed to let clients reach relevant information while examining fewer Web pages. The most significant change was the addition of another level whereby a different home page was created for each day of the Olympic Games; clients could view the home page for the current day as well as for any previous day. The 1996 Web site organised results, news, photos, etc. by sports and events; although country information and athlete biographies were provided, results corresponding to a particular country or athlete could not be collated, which many users of the 1996 Web site felt was a limitation. Consequently, the 1998 Web site organised results, news, photos, etc. by countries and athletes as well as by sports and events.

There was a further improvement in 1998. During the 1996 Olympics the Web servers had failed, there were errors in the athlete profiles, and results information was 12 hours out of date. In contrast, during the 1998 Nagano Games a private, high-speed IP network was built to support the movement of content from the source in Nagano to the Web server complexes in four different locations.

Each of the server complexes had direct connectivity to OpenNet, and additional dedicated lines to the major US Network Access Points provided the bandwidth needed to keep access times low for users on other ISPs and to ensure that the flood of data did not overload and bring down major Internet links, as described in more detail in the Nagano section above.

As a last aspect we want to mention the data warehouse for the Sydney Olympics (Olympics.com): DB2 made it possible for the Sydney Organising Committee for the Olympic Games (SOCOG) to do business on a global scale. In general, the success in Sydney was also partly due to the Sydney organisers signing a contract that put IBM in charge of all the technology. In 1996, IBM could not control the whole operation in Atlanta, as several other technology companies were involved and the Atlanta Organising Committee's technology staff was in charge of the project.
http://www.newswire.com.au/apcweb/news.nsf/HTML/Category/24123DD1D7E86ADBCA25695F000D91C5
http://www.supercomp.org/sc98/TechPapers/sc98_FullAbstracts/Challenger602/
http://www-4.ibm.com/software/data/db2today/na14.html
http://olympic.datops.com/page/mustread_04_10_09.html
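To make the contrast between the 1996 and 1998 caching policies discussed above concrete, the following sketch (entirely illustrative, with made-up page names and counts) compares conservative invalidation, which forces a miss on the next request for an updated page, with DUP-style update-in-place, which keeps hot pages permanently in the cache.

    # Illustrative comparison of the two caching policies discussed above:
    # 1996-style invalidation after an update (next request misses) versus
    # 1998-style update-in-place (hot pages never leave the cache).
    # The page name, update and request counts are invented for the sketch.
    def simulate(policy, updates=50, requests_per_update=20):
        cache, hits, misses = {}, 0, 0
        for version in range(updates):
            if policy == "invalidate":
                cache.pop("/results", None)                 # conservative: drop the page
            else:
                cache["/results"] = f"results v{version}"   # DUP-style: refresh in place
            for _ in range(requests_per_update):
                if "/results" in cache:
                    hits += 1
                else:
                    misses += 1
                    cache["/results"] = f"results v{version}"
        return hits / (hits + misses)


    if __name__ == "__main__":
        print(f"invalidate on update: {simulate('invalidate'):.0%} hit rate")  # 95% in this toy
        print(f"update in place:      {simulate('update'):.0%} hit rate")      # 100%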

3. Conclusion
IBM came under severe criticism for its performance in Atlanta: the results delivery system suffered many problems and failed to meet IBM's promises to news media such as agencies and newspapers that had paid for the data service, so the problems were widely reported. After Atlanta, IBM committed the same amount of staff and funds to the Winter Games in Nagano as it had to the Summer Games, to show the world and the Olympic movement that it could still do a good job, and, as we have shown in our case study, IBM really did improve its performance. It developed new technologies and used them in an efficient and creative way. But the Sydney Games in 2000 were IBM's last Games: the project had gradually become too expensive and too risky for a single company.
http://www.newswire.com.au/apcweb/news.nsf/HTML/Category/24123DD1D7E86ADBCA25695F000D91C5
http://www.usatoday.com/life/cyber/tech/ctd244.htm
http://olympic.datops.com/page/mustread_04_10_09.html
