Submitted by: Qaim Din Mahar 0300-6714555 qaim_mahar@yahoo.com facebook.com/qaim.mahar
Class: MBA II-(B) Subject: Research Methods & Report Writing Session: Spring 2011
Quaid-i-Azam School of Management Sciences. Quaid-i-Azam University Islamabad.
Table of contents

Acknowledgement
Abstract
Chapter 1
    Introduction
    Significance of Study
    Rationale
Chapter 2
    Literature Review
    Business continuity planning
    Components of business continuity planning
    Contingency planning
    Disaster recovery plan
    Recovery time and recovery point objectives in disaster recovery
    Insurance
    Disaster recovery plan for telecom systems
Chapter 3
    Methodology
    Research question
    Case study approach
    Selection of an organization
    Technique used
    Data collection source
    Interviewer selection
    Interview Schedule
Chapter 4
    Introduction of Organization
    Introduction of Mobilink
    Mobilink's Vision
    Mobilink's Organizational Culture
Case study
    Overview
    An insight to Mobilink fire incident
    Role of PTA in disaster recovery
    Equipment Sensitivity
    Restoration of System
    Second Update
    Service restoration
    Mobilink to compensate its Customers
    Pakistan Telecommunication Authority
    Users of Mobilink
    Who Suffered More Loss
Acknowledgement:
Apart from my own efforts, the success of any project depends largely on the encouragement and guidance of many others. I take this opportunity to express my gratitude to the people who have been instrumental in the successful completion of this project. I would like to express my greatest appreciation to Dr. XYZ. I felt motivated and encouraged every time I attended her meetings. Without her encouragement and guidance this project would not have materialized. I am very thankful to Mr. ABC (Mobilink), Mr. XYZ (Mobilink), DEF (Ufone), Miss ABC (Ufone), ABC (Mobilink), and ABCD (Mobilink), who were major sources of information and guidance for me. The guidance and support received from all the members who contributed, and who are still contributing, to this project was vital to its success. I am grateful for their constant support and help. Finally, I want to thank the whole staff and management of Mobilink for their cooperation and for providing me with very useful and relevant information.
Abstract:
Telecom infrastructure holds a special position in society because of its important role in everyday functioning. In addition to the direct effects of loss of service, telecom failures cripple coordination capabilities, hindering rescue and recovery operations. The importance of business continuity and disaster recovery planning has grown considerably in recent years, and it has become a well-established practice for achieving organizational resilience. In a world where continuous operations are essential for business survival, action must be taken to ensure business continuity and disaster recovery irrespective of the kind of disruptive event. Business continuity and disaster recovery are part of an organization's overall risk management. This study presents a case on business continuity and disaster recovery in the telecom sector and explains how organizations are affected by disasters and what steps and plans they should make to cope with such disasters in future.
Chapter 1

Introduction:
All organizations face a non-zero risk of experiencing out-of-course events, whether natural or man-made, that can lead to disasters with respect to business operations. Disasters are unpredictable and sudden, and they cost the business a great deal. Some of the costs of disasters include cash flow interruptions, loss of competitive edge, loss of market share and customer confidence, erosion of business image and legal or regulatory violations. Business continuity and disaster recovery are part of an organization's overall risk management. Business continuity relates to establishing the right processes, procedures and resources necessary to continue a business in an acceptable form when something interrupts it. Business recovery is therefore an activity performed to ensure that critical business functions remain available to customers, employees and regulators (Devargas, 1999). It further implies that business should continue irrespective of the kind of disruptive event. Disaster recovery, in turn, is a subset of the business continuity plan that sets out how an organization recovers from a disaster. Heikkinen and Sarkis state that it is the restoration of normal operating procedures after a failure.

Significance of Study:
This research, Disaster Recovery and Business Continuity, aims to examine the relationship between disaster recovery and business continuity, with a particular focus on an organization's recovery and continuity of business after a disaster. With disaster recovery plans (DRPs), firms are able to recover much more quickly than they otherwise would. Business continuity planning is also very important as a strategic management tool. This study will also help identify how an organization plans to recover from a disaster and continue its business. As disasters are striking organizations more frequently, businesses are realizing the importance of these two concepts. This study focuses on the preventive measures and strategic planning undertaken to ensure continuity. The study will further help in understanding an organization's work plan and actions in response to a disaster.

Rationale of study:
Disaster recovery and business continuity are parts of an organization's risk management. Studies on business continuity are widely available, but the impact of disaster recovery and business continuity planning in the telecom sector has not yet been studied. In particular, no study has yet been conducted on the recent disaster faced by a leading company. The reason behind this research is to find out the true cause of the disaster and its impact on the business. Another reason is to identify guidelines that help the organization protect itself against any kind of disruptive event.
Chapter 2

Literature Review:
Natural disasters such as floods and earthquakes, accidents resulting from human error, and utility disruptions are all potential threats to continuous business operations (Hassanain and Al-Mudhei, 2006). System breakdowns and malicious software such as Trojan horses can also act as disasters. In the light of recent natural disasters, terrorism, and machinery breakdowns, organizations have realized the need to be prepared to cope with all contingent events (Botha and Solms, 2004). Pitt and Goyal (2004) concluded that, because of the ever increasing rate of unforeseen crises within the business environment, organizations should have a business continuity plan (BCP).

Business continuity planning:
Herbane, Elliott, and Swartz (1997) identify business continuity planning as a new and evolving discipline. They believe that most organizations underestimate the trauma associated with major crises, and that any type of shutdown can therefore have a devastating effect on the continuity of business operations. The numerous disasters suffered by corporations, institutions and other organizations have shown that planning for disaster recovery is simply not enough; organizations must also plan for business resumption (Moore, 1995). Many organizations already had business continuity plans and planning processes, while others began planning after one or more disastrous events. Fry and Diamond (2001) state that business survival requires every organization to plan for any kind of disruption, including natural disasters, hardware and communications failures, and internal or external acts of terrorism. Since such disruptions cannot be predicted, they can have dangerous effects on the survival of the business, with results ranging from capital losses to customer dissatisfaction.
Organizations nowadays compete on a global scale, and they therefore allocate part of their resources and budgets to ensuring continuity in case of a disruptive event. According to Botha and Solms (2004), a complete definition of business continuity has to consider two aspects. First, it should be ensured that the organization continues its business. Second, IT should be restored to a state similar to that preceding the disaster. According to Herbane, Elliott, and Swartz (1997), a suitable definition of BCP is planning that identifies the organization's exposure to internal and external threats and synthesizes hard and soft assets to provide effective prevention and recovery, while maintaining competitive advantage and value system integrity. A business continuity plan designed to be followed throughout the organization focuses on mitigating risks that might cause business process failure, asset loss, regulatory liabilities, customer service failure, and damage to reputation. Risk mitigation falls under risk management. Risk management is the identification, assessment, and prioritization of risks followed by coordinated and economical application of resources to minimize, monitor, and control the impact of unfortunate events (http://en.wikipedia.org/wiki/Risk_management). A McKinsey survey, reported in Business Continuity Online, found that over 80% of investors would pay 18% more for shares in a well governed company, and a well governed company is one that has a proper business continuity plan (Stanton, 2005). In view of the importance of BCP, Botha and Solms (2004) have identified seven phases of a business continuity plan. According to them these phases are as follows. The project planning phase: this phase includes all the activities required to ensure that the BCP project is properly planned.
In the second phase, the business impact analysis (BIA) phase, critical business processes are identified and then analyzed. Once the analysis is complete, the impact that various disasters may have on the business should become clear. The business continuity strategies (BCS) phase entails the identification of various strategies that focus on ensuring business continuity and recovery. It requires a review of the identified disaster scenarios in order to develop methods for dealing with these situations. In the continuity strategies implementation (CSI) phase, detailed functional plans are developed that set out responses to the various scenarios. In the fifth phase, the continuity training (CTR) phase, it is ensured that business continuity training forms part of the organization's training framework. The training should be carried out as soon as the plan is complete. The CTR phase is followed by the continuity testing (CTE) phase, which determines whether individual contingency plans are adequately written to ensure the continuity of business processes and the recovery of the data centre. Once all of the above phases are properly planned, the continuity plan maintenance (CPM) phase follows, in which the business continuity plan is reviewed regularly and updated if required. This is done to ensure that the plan stays effective and up to date. Paton (1999) demonstrated that, irrespective of the quality of the planning process, the effectiveness of the plan is influenced by several other organizational characteristics (e.g. management style and attitudes, reporting and decision procedures) and by bureaucratic flexibility. To provide continuous service, management systems are required that include crisis management procedures, allocation of response tasks, roles and responsibilities, and identification and allocation of immediately needed resources. These systems are also required to deal with delegation of authority, crisis communication and decision making.
Stanton (2005) stated that it can take less than 60 seconds for a company's reputation to be ruined and its business to be crippled. In just one minute, the shutdown of vital applications can lead to a catastrophic series of events, damaging the company's reputation. Therefore, it is not sufficient for an organization only to have a BCP; it must also conduct BCP audits regularly. According to Brown, there are no generally accepted principles with which to analyze business continuity, and any change that affects critical business functions should trigger a review of the BCP. Foster and Dye (2005) stated that many companies now have recovery plans, but the key concern is that these plans are insufficient, out of date, or both. With all these changes, the concept of business continuity planning is being taken to a higher level. It is more focused on proactive measures and not only on recovery strategies and plans (Cerullo and Cerullo, 2004).

Components of business continuity planning:
According to Botha and Solms (2004), the two basic components of business continuity are contingency planning and disaster recovery planning (DRP).

Contingency planning:
Contingency planning aims at ensuring the continuity of all business processes while information technology is recovering. It therefore focuses mainly on identifying the procedures and steps that top management should take in order to continue business processes. In a paper that appeared in the February 2002 issue of Strategy Business magazine, authors Ralph W. Shrader and Mike McConnell presented a framework for chief executive officers (CEOs) to address the challenges of a disaster. In it, three primary goals of the CEO are defined: first, secure the people of the enterprise; second, secure the core business of the enterprise (systems, facilities, infrastructure, and processes); and third, secure the business networks (components of the business ecosystem, such as the supply chain). One part of contingency planning is the training of employees. Training includes informing them about business process continuity and other issues regarding the contingency planning cycle. Training programs are designed to promote an adaptable response capability. The most important element of contingency planning is threat analysis, which consists of looking at potential risks and the probability of each threat. Paton (1999) described how, from an organizational perspective, several groups are vulnerable, including employees, shareholders, customers, suppliers, distributors, family members and community members. Therefore, each organization planning for contingent events should also keep in mind its responsibility for the wellbeing of groups other than its employees. For example, businesses can connect with distributors or other businesses that can ensure inexpensive delivery of products and services to customers. In other words, plans are made on the basis of predetermined risks. These fall into the categories of fire, natural disasters, and some form of infrastructure breakdown, including acts of terrorism and cybercrime.

Disaster recovery plan:
Bryson, Millar, Joseph and Mobolurin (2002) define a disaster recovery plan (DRP) or disaster recovery strategy (DRS) as a system for internal control that focuses on the quick restoration of service for crucial organizational processes when service breaks down due to natural or man-made disasters. A disaster recovery plan aims to minimize any possible loss by identifying and protecting the key organizational assets, that is, those that are most valuable and need the most protection. Herbane, Elliott, and Swartz (1997) are of the view that recovering from a disaster typically depends on the integrity of a firm's physical systems, assets, and processes during and after a crisis or disruption.
It also includes planning that keeps business applications running continuously, so that a broken business loop can be repaired while maintaining user satisfaction and managing downtime. According to Paradine (1995), when contingency plans are co-ordinated into a single, integrated and workable plan of action, you have a disaster recovery plan.
Bryson, Millar, Joseph and Mobolurin (2002) demonstrated that several factors have increased interest in the development, testing and maintenance of DRPs: government regulations in certain industries such as banking, credit unions, and the telecom sector; natural disasters (e.g. floods); and social disasters. Customer and vendor interest in disaster management has therefore also increased.

Recovery time and recovery point objectives in disaster recovery:
Recovery time objectives (RTOs) and recovery point objectives (RPOs) are perhaps the most important factors in designing a disaster recovery solution. An RTO is the amount of time required to recover from a disastrous event, and an RPO is the amount of data, measured in time, that you can lose from that same event (http://searchdatacenter.techtarget.com). The RPO can also be defined as a measure of the maximum time period in which data might be lost (http://en.wikipedia.org/wiki/Recovery_point_objective). Once the RTO and RPO have been defined, business continuity planners can decide which disaster recovery technologies, including online backups and off-site vaulting of information, are best suited to the situation. For example, if the RTO for a given application is one hour, data backup on external hard drives may be the best solution. If the RTO is five days, then tape, compact disk, or offsite disk storage may be more cost-effective.

Insurance: a vital ingredient in disaster recovery
Many business organizations do not have adequate financial resources to meet post-disaster needs. Therefore, every business entity must plan in advance in order to make better decisions and analyze its critical financial needs in case of a disruptive event. This can be done by purchasing commercial insurance (Reiss). Insurance plays its part once a loss has occurred and helps to pay for some of the costs incurred in the loss incident. Insurance on its own does nothing to prevent losses; it only protects against the financial consequences of loss (Pradine, 1994).
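The recovery time and recovery point objectives described earlier in this chapter can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the names `RecoveryObjectives`, `suggest_backup_medium` and `backup_interval_meets_rpo`, the sample applications, and the one-day cut-off between the two backup tiers are all hypothetical and are not taken from any cited source. Only the two example cases (a one-hour RTO and a five-day RTO) come from the text.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Recovery targets for one application (illustrative values only)."""
    rto: timedelta  # recovery time objective: time allowed to restore service
    rpo: timedelta  # recovery point objective: window of data that may be lost


def suggest_backup_medium(objectives: RecoveryObjectives) -> str:
    """Map an RTO to a backup medium, following the two examples in the text.

    The one-day cut-off is an assumption made for this sketch; the text
    only gives the one-hour and five-day cases.
    """
    if objectives.rto <= timedelta(days=1):
        return "external hard drives"
    return "tape, compact disk, or offsite disk storage"


def backup_interval_meets_rpo(interval: timedelta,
                              objectives: RecoveryObjectives) -> bool:
    """A schedule can meet the RPO only if backups run at least once per RPO window."""
    return interval <= objectives.rpo


# Hypothetical applications, used only to exercise the functions.
billing = RecoveryObjectives(rto=timedelta(hours=1), rpo=timedelta(minutes=15))
archive = RecoveryObjectives(rto=timedelta(days=5), rpo=timedelta(days=1))

print(suggest_backup_medium(billing))
print(suggest_backup_medium(archive))
print(backup_interval_meets_rpo(timedelta(hours=1), billing))
```

In this sketch an hourly backup does not satisfy a fifteen-minute RPO, which shows why the RPO constrains how often backups are taken while the RTO constrains which storage medium is fast enough to restore from.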
Disaster recovery plan for telecom systems:
The breakdown of essential communication is one of the characteristics of a disaster (Townsend & Moss, 2005). The telephone system is considered the most complicated machine ever designed by human beings and is highly vulnerable to physical disruption such as fire or earthquake (Pierce and Noll, 1990). Physical disruption or the loss of supporting infrastructure causes communication failure. Another major reason for telecommunication failure is network congestion or data overload. In a disaster, the sharply increased need to communicate in order to co-ordinate activities, exchange information and ideas, and learn how people are responding generates heavy telecommunication traffic, which clogs even well managed networks (Townsend & Moss, 2005). Since this communication channel is not immune to disruption, a regulatory body governs the recovery process whenever a disaster occurs. In Pakistan this authority is the PTA.
Chapter 3

Methodology:
The methodology used for information gathering and research work is described below.

Research Question:
How can the telecom sector survive a disaster and ensure smooth business continuity?

Case study approach:
Qualitative research is ideal for the earlier phases of research projects. A telecom sector organization recently faced a disaster, but no study has been made to find out the reasons for that disaster. Finding the true reasons requires a probing approach based on in-depth information and analysis, for which a case study is best suited. It should be mentioned here that a case study does not provide the solution; rather, it explains, diagnoses or proposes the possible outcomes of the topic under consideration. A case study also aims to provide a complete, detailed description of the research topic.

Selection of an organization:
Communication is very important these days, and the breakdown of essential communication is one of the characteristics of a disaster. The most complicated machine ever constructed by human beings for communication is the telephone system. It has been highly vulnerable to physical destruction during disasters. Studies on business continuity are widely available, but the impact of disaster recovery and business continuity planning in the telecom sector has not yet been studied.
In particular, no study has yet been conducted on the recent disaster faced by a leading company. That is why an organization from the telecom sector was selected.

Technique used:
Qualitative research generates non-numerical data, and the case study is one of the methods of qualitative research. Qualitative research is not sampling research, a fact asserted by all the major researchers in the field, including Yin, Stake, Feagin and others. It focuses on gathering mainly verbal data rather than measurements. The gathered information is then analyzed in an interpretative manner, which may be subjective, impressionistic or even diagnostic. Since the objective is to obtain the greatest possible amount of information on a given problem, information-oriented selection is used in such cases.

Data collection source:
Information was gathered through interviews, in which open-ended questions were asked.

Interviewer selection:
With the help of our coordinator Dr. XYZ, Mr. XYZ and our personal contacts, we were able to meet the concerned persons, who were interviewed about the recent disaster at Mobilink. With the help of these interviews we were able to understand the true cause of the Mobilink disaster.

Interview Schedule:
Interviews were conducted according to the feasibility and availability of the interviewees. An appointment was made for each meeting, and the time of the interview was decided by the interviewees themselves.
Chapter 4

Introduction of Organization:
Orascom Telecom: Orascom Telecom Holding S.A.E. ("Orascom Telecom" or "OTH") was established in 1998 and has grown to become a major player in the telecommunication market. Orascom Telecom Holding is considered among the largest and most diversified network operators in the Middle East, Africa, and South Asia, and in early 2008 it acquired a license to operate mobile services in North Korea. Orascom Telecom is a leading mobile telecommunications company operating in six emerging markets, with a population under license of 430 million and an average mobile telephony penetration across all markets of approximately 40%. In Pakistan, Pakistan Mobile Communications Ltd (Mobilink) started operations in 1994. In April 2001, OTH took over management control of the company. As the market leader, Mobilink serves more than 31.6 million subscribers, representing a market share of 32.5% (as of March 2010). (Exhibit 1)

Introduction of Mobilink:
Mobilink, a subsidiary of Orascom Telecom Holding, is Pakistan's leading cellular and Blackberry service provider. With more than 31.6 million subscribers, Mobilink maintains market leadership through cutting-edge, integrated technology, the strongest brands and the largest portfolio of value added services in the industry, a broadband carrier division providing next generation internet technology, and the country's largest voice and data network with over 8,000 cell sites. Housing Pakistan's largest distribution and contact center networks and an unparalleled 6,500 kilometer fiber optic backbone, Mobilink (http://www.mobilinkgsm.com) has invested over US$ 3.3 billion in the country to date and provides uninterrupted countrywide connectivity, unmatched customer services and international roaming in over 140 countries. The company is also the official cellular partner of the Pakistan Cricket Board. As a responsible corporate citizen, Mobilink also offers a range of socially inclusive products and services dedicated to enhancing access to information. Through the Mobilink Foundation, the company supports education, health and environmental initiatives and promotes sustainable business practices. During the last 15 years, Mobilink has set up one of the largest cellular networks in the country. Currently, Mobilink covers more than 10,000 cities and towns. This has involved an investment in the company of more than US$ 3.3 billion. Mobilink has 66 switches and more than 8,000 cell sites, and the number keeps growing at a rapid pace. Mobilink has also deployed around 6,500 km of optical cable (http://www.mobilinkgsm.com).

Mobilink's Vision:
"To be the leading Telecommunication Services Provider in Pakistan by offering innovative Communication solutions for our Customers while exceeding Shareholder value & Employee Expectations".

Mobilink's Organizational Culture:
Mobilink's corporate values serve as the foundation for its culture, behavioral norms, and decision making. These values have always been at the heart of our business principles and success. Each of our values guides the way we work as an organization and ensures that we not only meet the needs of our employees and stakeholders today, but work towards sustaining and enhancing human and financial capital for the future. These values sum up the culture at Mobilink, and aim to ensure a workplace that necessitates open and respectful communication and exceptional quality of service to internal and external stakeholders.
Case study:

Overview:
Unforeseen events are an unfortunate by-product of the corporate environment, especially in the technology sector. In areas that are furnished with wires, electric gadgets, batteries and many other flammable goods, the smallest of issues can end up causing significant headaches for the entire company as well as its consumers. For Mobilink, this year has seen these events become a bit too regular. The fire at Beverly Centre in Islamabad was followed by a fire at a technical facility in Karachi, and then by the event of 6th March 2011 at I-10 Islamabad, which destroyed integral parts of the MSC.

An insight to Mobilink fire incident:
Vendors' indoor services at Mobilink are: installation of radio base stations (RBS), installation of TX racks, installation and commissioning of TX equipment, fixing of cable trays, grounding of equipment, and installation of rectifiers and battery banks. During the installation of a battery bank in the company's battery room, located at its MSC (Mobile Switching Center) in Sector I-10/3, Islamabad, one of the batteries fell down and exploded. For readers unfamiliar with the term, MSCs are exchanges that connect mobile users in a network. An MSC passes mobile calls to BSCs (base station controllers), which transfer them to BTSs (base transceiver stations), which finally deliver the calls to cell phones. Any damage to an MSC's infrastructure or systems therefore affects the entire subscriber base it supports. This is explained in Exhibit 2. The battery explosion started a fire in the power room and damaged a major portion of it. Despite early containment, telecom equipment, power supply cables (ceiling and underground), the cooling system and the building were damaged. Some parts of the machinery were also burned. The staff on duty immediately switched off the Mobile Switching Center (MSC) to avoid further damage. As a result the North region went down, leaving areas from Peshawar to Islamabad and the surroundings with no signal.
Mobilink also owns LinkDotNet, an ISP, and shares its infrastructure between the ISP and its cell phone operation wherever possible. Along with the cellular service, DSL services by LinkDotNet were also affected, as the company relies on the Mobilink backbone for its products. Moreover, as the MSC was a vital component for cross-country connectivity, the loss of service extended to other regions, with subscribers affected as far away as Karachi. Mobilink and LinkDotNet services went down around noon on Sunday, and in some areas services were restored around Monday afternoon. Since the leading company was facing a burning situation, rumors spread like wildfire. Customers were facing problems, and these problems increased the general level of dissatisfaction. Therefore, some hours after the fire broke out, Mobilink released a press update: "Dear customers, we have encountered an unexpected situation in the past few hours. Our team is trying to recover as soon as possible. We apologize for the inconvenience caused, primarily in Islamabad and Rawalpindi," said Omar Manzur, Director Public Relations, Mobilink. At that time most Mobilink customers were not receiving signals, and when Mobilink numbers were dialed, the system reported that the dialed cell number was switched off. Customers were seeing "Invalid SIM", "No SIM Card", or similar messages on their cell phones. Mobilink technical staff cordoned off the area where the fire broke out and, meanwhile, called the fire brigade. Mobilink staff tried to put out the fire with fire extinguishers but were unable to do so. Finally, the fire brigade arrived and overcame the fire in 3 to 4 hours. When the MSC was shut down, Mobilink shifted the voice calls and SMS traffic of that MSC to other MSCs. Because of this shift, other regions also experienced no connectivity or low signal bars on customers' handsets.
As the fire broke out on a Sunday, there was no loss of life because most employees were not on the premises. An important point to note is that communication among top-level management, technical staff and lower-level management was zero, because all employees had Mobilink connections, which were not working due to the shutdown of the service. To cope with this problem, Mobilink immediately bought some PTCL V wireless phones and PTCL Evo devices for internet connectivity. Mobilink also requested Ufone to provide MNP (Mobile Number Portability, which enables mobile telephone users to retain their numbers when changing from one mobile network operator to another) for some top-level management.

Role of PTA in disaster recovery:
Mobilink's public relations officers informed PTA (Pakistan Telecommunication Authority) of the incident as the disaster occurred. PTA took prompt action on the matter. PTA teams reached the scene immediately after the incident and monitored the whole restoration activity. A committee headed by Member (Technical) PTA, Dr. Khawar Siddique Khokhar, with members DG (Enforcement) PTA, DG (Licensing) and DG (Law), visited Mobilink's premises in I-10/3 and examined the situation at the scene.

Equipment Sensitivity:
Like any other computer or sensitive electronic equipment, telecommunications electronics are susceptible to damage from heat, electrical disturbances, moisture and direct physical damage. The equipment installed where the fire broke out consists of complex assemblies of semiconducting, conducting and non-conducting materials. These materials perform their functions within the assemblies only so long as they do not suffer physical or chemical degradation. When the fire broke out, the temperature rose to an immense level.
The rule of thumb is that for every 10 degrees Celsius rise in temperature, the speed of all chemical reactions doubles (Robert Weir, "Heat and Power Problems are the Greatest Risks To Telecom Equipment"). That rule also applies to the chemical reactions involved in the degradation of electronics and of every other substance. Realizing the importance and sensitivity of the equipment, the first thing done after the fire was brought under control was cooling, because prolonged high temperature is an enemy of the equipment.

Restoration of System:
First, Mobilink formed teams and assigned them different tasks. Once the fire was under control, the restoration process started. The technical team came to the site, thoroughly analyzed the electrical equipment and machinery, and separated out the burned equipment and tools. Cleaning started in the battery room where the incident had happened. The electrical team was assigned to restore the power supply cables, which were partially destroyed and unable to work properly. The main challenge for the electrical team was to maintain the temperature required by the machinery, so the team worked hard to bring the temperature to the required level. The technical team, which included specialist engineers, sorted out the burned racks of electrical circuits, cards and equipment. Mobilink has its own warehouse where the company holds spare parts and extra machinery. The team started replacing the damaged machinery and parts with new machinery and equipment, and engineers began connecting the spare parts to the machinery. It must be mentioned here that all MSCs, offices and other telecom buildings are always insured.

Second Update:
Update from Mobilink: "Partial connectivity was restored in the early hours of the morning and our technical teams continue to work around the clock to ensure that all equipment is fully restored at the earliest," said Omar Manzur, Director Public Relations, Mobilink. The whole network was fully restored in almost 32 hours.
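The 10-degree rule of thumb quoted under Equipment Sensitivity can be turned into a small calculation. The sketch below is only an illustration of that rule as stated (a doubling of reaction speed for every 10 degrees Celsius of rise); the function name and the example temperature rises are assumptions made here, not figures measured at the Mobilink site.

```python
def degradation_speedup(temp_rise_c: float, doubling_step_c: float = 10.0) -> float:
    """Return how many times faster reactions run after a temperature rise.

    Applies the rule of thumb from the text: the reaction speed doubles
    for every `doubling_step_c` degrees Celsius of rise.
    """
    return 2.0 ** (temp_rise_c / doubling_step_c)


if __name__ == "__main__":
    # Hypothetical temperature rises, chosen only to illustrate the rule.
    for rise in (10, 20, 30):
        factor = degradation_speedup(rise)
        print(f"A rise of {rise} C gives about {factor:.0f}x faster degradation")
```

Under this rule a rise of 30 degrees already gives an eightfold speed-up, which is consistent with cooling being the first task once the fire was under control.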
The good news was that Mobilink and PTA both worked hard through Sunday to put the network back online, and Mobilink succeeded in bringing the network back within a day. The network was partially back, including the mobile service and the internet service (of LinkDotNet). This outage was Pakistan's worst ever telecom outage; it affected the whole nation and left users confused about what to do. Businessmen who also work on Sundays were affected the most.

Service restoration:
Following is a screenshot of the note on the Mobilink website.
Mobilink to compensate its Customers:
Mobilink decided to offer free minutes and SMSs to reward customers who remained loyal while service was down due to the fire incident at one of its facilities in Islamabad a few months earlier. To thank its customers for their patience and loyalty, Mobilink announced a phased plan for customers who experienced disruption in service. This is the first time a reward of this scale has been announced and awarded by any telecom operator in the country for a disruption caused by a force majeure event. Mobilink President & CEO Rashid Khan said that while it was difficult to make up for the missed experience, the company cherishes customers who continue to prefer Mobilink, and that this is a humble gesture of goodwill by the company for their patience and loyalty. Customers were updated about the incident and the restoration through press announcements by the company at the time. To facilitate customers who were directly affected by the incident, the company announced free minutes and SMS for prepaid customers, whereas postpaid customers will receive an adjustment against their monthly bill.

Pakistan Telecommunication Authority (PTA):
If a telecom company faces a disaster, or its network or services go down, PTA directs the company to recover the service within 72 hours. If a company fails to restore its services, PTA can impose a fine or may cancel the license. The company must inform PTA if any kind of disaster occurs, and PTA will guide and monitor the company during recovery from that disaster. In case of a disaster, a company should stay in touch with customers through the internet and inform them of the reason behind the network outage.

PTA announced an investigation of the issue from all technical and legal aspects in order to prevent such incidents in the future and to compensate the losses of the operator and subscribers. According to a statement issued by the authority, PTA formed a committee headed by Member (Technical) PTA, Dr. Khawar Siddique Khokhar, to analyze the situation that arose after Mobilink's outage. DG (Enforcement), DG (Licensing) and DG (Law) at PTA are also part of this committee, which is likely to prepare a report based on facts gathered during the committee's visit to Mobilink's office and to formulate a way to manage crises better if any such event happens in future. Mobilink's in-house investigation of the incident will also be made part of this report.
"The Committee is deliberating further action on the issue from all technical and legal aspects and how such incidents can be prevented in future," said the statement. PTA's top priority was to monitor the situation and keep an eye on Mobilink's network restoration efforts, an official said, adding that the authority wants to ensure service accessibility for mobile phone users after this incident and the other related fires in the past.

Users of Mobilink:
Mobilink has a family of more than 32.203 million customers. More than 10 million people are believed to have been affected by dropped or missing signals. Since Prime Minister Gilani uses the market leader's connection on his personal cell phone, the service collapse forced him to instruct the Ministry of Information Technology and Telecom to get the service restored at the earliest. Following the Prime Minister's instructions, the Ministry of Information Technology took notice and ensured that the Mobilink network was restored as early as possible, stated a press release of the PM Secretariat.

Who Suffered More Loss?
Although the disaster was not a deliberate incident, it should still be investigated properly and thoroughly so that incidents of this kind can be avoided in future. Moreover, there should be a backup plan so that customers do not suffer much if any untoward incident occurs. The company also suffered a loss of millions of rupees in infrastructure and revenue, but Mobilink can surely overcome it within a few days, as all of its equipment is insured. The customers lost more, as throughout Sunday they were unable to make calls, use the internet or send SMSes. Sunday is the day for recreation and talking to loved ones, but Mobilink subscribers were unable to use this facility for almost all of Sunday and half of Monday. Thousands of calls are made daily and millions of SMSes are exchanged, but the company deprived customers of this right on Sunday.
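The figures reported in this case study can be put side by side with a short calculation. The sketch below uses only numbers stated in the text (full restoration in almost 32 hours, the 72-hour window PTA allows, 32.203 million customers, and more than 10 million affected); the function names are illustrative, and the affected share is a lower bound because the text says "more than" 10 million.

```python
PTA_RESTORE_LIMIT_HOURS = 72   # restoration window stated in the text
OUTAGE_HOURS = 32              # "almost 32 hours" until full restoration
SUBSCRIBERS_MILLION = 32.203   # Mobilink customer base stated in the text
AFFECTED_MILLION = 10          # "more than 10 million", used as a lower bound


def within_limit(outage_hours: float, limit_hours: float = PTA_RESTORE_LIMIT_HOURS) -> bool:
    """Return True if the outage ended inside the allowed restoration window."""
    return outage_hours <= limit_hours


def share_affected(affected: float, total: float) -> float:
    """Return the affected customers as a percentage of the total base."""
    return 100.0 * affected / total


if __name__ == "__main__":
    print("Restored within PTA window:", within_limit(OUTAGE_HOURS))
    print("Hours to spare:", PTA_RESTORE_LIMIT_HOURS - OUTAGE_HOURS)
    share = share_affected(AFFECTED_MILLION, SUBSCRIBERS_MILLION)
    print(f"Share of customers affected: at least {share:.1f}%")
```

On these figures the restoration finished well inside the 72-hour window, yet roughly a third of the customer base was affected, which is why the regulatory limit alone says little about the loss felt by customers.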
Chapter 5

Analysis/ Results:
It becomes clear from the incident that one of the leading cell phone operators, which claims to have over 30 million subscribers, does not have an effective disaster management plan, and that a simple fire can take services down for long hours for several hundred thousand customers. Sunday is the day for recreation and talking to loved ones, but Mobilink subscribers were unable to use this facility for almost all of Sunday and half of Monday. Thousands of calls are made daily and millions of SMSes are exchanged, but the company deprived customers of this right on Sunday. Special care should therefore be taken that services are not interrupted on Sundays.

The company should set up separate offices for meeting vendors. Had the company met the vendor in a separate room for the purchase of batteries, this incident would not have happened. Accidents do happen, but the extent to which an organization is prepared to handle a potential catastrophe is important for businesses in this rapidly changing market. It is also notable that, since Prime Minister Gilani uses the market leader's connection on his personal cell phone, the service collapse forced him to instruct the Ministry of Information Technology and Telecom to get the service restored at the earliest. The interruption in services lasted about 30 hours. It is surprising that a leading cell phone operator which claims to have over 30 million subscribers does not have an effective disaster management plan. Hopefully Mobilink has learned much from these experiences and will have a better disaster management plan in the future. Mobilink must also consider making service provision fail-safe.
Accidents do happen, but how well an organization is prepared to handle a potential catastrophe is really important for businesses in this rapidly changing market, especially for an organization that charges customers more because of its good quality of service and whose customers stay with it only because of good quality and uninterrupted service. The problem affected and upset people across the country, but the good news is that the problems have been solved countrywide. Mobilink has also posted a note about the recent interruption of services on its website, which states that all problems have been resolved.

This incident contributed to a major change in PTA's regulations, i.e. making it necessary for all operators to maintain backups of all core infrastructure equipment, also termed a disaster recovery setup. Moreover, there should be a backup plan so that customers do not suffer much if any unwanted incident occurs. The incident also makes clear how a disaster can turn the bright side of a company's slogan into a dark side (as in this case "jaz apna ha" changed into "jaz sapna ha"). Mobilink set a good example of customer care by providing free minutes and SMS to its customers for the inconvenience they faced.

During visits it was found that data backups are maintained on a daily basis. Anti-spam firewalls/filters have also been installed to block junk SMSes. Fuel power backups are also maintained for MSCs, BSCs and power generators. A cloud-based management system is also used to recover from any kind of disruption; in this process, data is accessed virtually to speed up recovery. During special events or festivals there is a chance of network congestion, which can interrupt service; this is a clear case of disaster for a telecom company and is managed effectively through proper recovery plans. If plan A fails for any reason, the company also has a plan B, which indicates the internal strength of the company. At Mobilink, both equipment and infrastructure elements are usually installed in substantial buildings with extensive physical security and fire-resistant construction. Such a structure forms the base of the key operations taking place in the industry.
Buildings are often taken for granted, but they deserve special respect in this case because of the protection and support they provided to the equipment when the fire broke out.
Chapter 6

Discussion:
Disasters are common in the telecom sector, and every telecom organization has experienced disasters of different kinds. The fire at Mobilink has raised questions about the firm's contingency planning and disaster recovery planning. A telecom organization needs disaster recovery plans and should follow defined steps to recover from a disaster. If a telecom company's service goes offline, PTA directs the company to recover the service within 72 hours. The company must inform PTA if any kind of disaster occurs, and PTA will guide and monitor the company during recovery from that disaster. The company should also stay in touch with customers through the internet in case of a disaster. Mobilink did not tell its customers early that a problem had occurred and that service would be restored as early as possible. Mobilink communicated this news through the internet 7 hours after the network went offline. As a result, customers did not know what had happened to the network or when it would be restored.

In December 2010 Ufone services went down and there was no announcement on its Facebook or Twitter pages. Fans became angry and left annoyed comments asking why nobody was telling customers when the services would be restored. Clearly, the issue was not handled properly on social media, and that left many Ufone fans dissatisfied and angry. It ended badly when Ufone started restricting fans from commenting on the page. What is done is done, but let us consider what a telecom company could have done in this situation. The first thing a company like Ufone or Mobilink should do in case of a disaster is to update its Facebook and Twitter status right away, announcing:
"Dear customers, we are experiencing a service outage due to some technical problems which are being resolved, and the service will be restored in 3 hours (let's suppose). Sorry for the inconvenience, but we'll let you know as soon as we're back on!"

With this act, the company would have won customers' trust and even sympathy, and everyone would have known that there is a problem which will soon be resolved. The second thing to do is to keep monitoring what people are saying. If someone is really agitated, try to calm that person down by replying to comments in an apologetic and empathetic tone. Such replies make that person feel that the company has not left them in the middle of a desert with no help. This behavior makes customers feel that the company cares and is doing whatever it can to restore the services. Last, when the services are restored, be the first to let the fans know that services are back with a bang! Be cheerful, be happy and be caring. Let customers know how much their patience and cooperation mean to the company. That is how a company can successfully make use of social networks in difficult times. Information like this cannot remain hidden in the world of the internet for long. The truth comes out one day or another, so instead of hiding flaws, be big enough to step forward, admit the problem, give the solution and win the trust of the customers, because if a company does not say it, it will definitely hear it from its customers.

For any telecom organization, formulating disaster recovery plans or business continuity plans for this kind of disaster is very difficult. Mobilink, like any other telecom company, has recovery plans for virtual disasters such as overload on MSCs, low signal bars, difficulty in making calls or data loss.
Telecom companies follow certain steps in these virtual disasters or problems, and they are quite confident in following these steps. Planning for a physical disaster such as a fire, however, can be done only to some extent, not completely. The reason is that a company cannot predict what kind of disaster will occur or how intense it will be. A physical disaster occurs in no time and cannot be predicted, so a company should keep itself prepared at all times. For example, if a fire breaks out, the company will shut down the machinery and try to put out the fire, then estimate the loss, install new machinery and equipment, and restore the system. This is the process companies normally follow in a fire disaster. Spare machinery should be set up at alternate locations in case machinery is lost in a disaster.

The Lahore-based Ufone is expected to provide the foundation of a virtualized disaster recovery infrastructure. The company is evaluating VMware vCenter Site Recovery Manager to automate virtual machine recovery at the secondary site should the primary datacenter experience a major outage. The company has targeted a range of business-critical applications and systems, including customer relationship management, billing, inventory, retail management, email and other line-of-business applications, for full disaster recovery. These will run in a fleet of more than 100 virtual machines at the disaster recovery site. In Pakistan, Ufone faces strict regulations governing disaster recovery of all critical line-of-business applications; VMware vCenter Site Recovery Manager will help Ufone accelerate compliance with the disaster recovery policies applied by regulatory authorities. The recovery plan includes setting up machines at alternate locations; however, backing up data will be the most crucial part. Freeing up resources, such as those used by less-critical tasks or internal monitoring, is also being rolled out. Mobilink should also make recovery plans of this kind to ensure business continuity after a disaster.
Chapter 7

Conclusion:
From this case study we can conclude that every organization in the telecom sector needs a disaster recovery plan. With such a plan a company can recover from a disaster more quickly and avoid a big loss. Mobilink, like any other telecom company, has recovery plans for virtual disasters, but a physical disaster occurs in no time and cannot be predicted. A definite disaster recovery plan is therefore always lacking in a telecom sector organization. Even if a company knows the type of disaster, it cannot predict its intensity, and the location may differ from the previous one, so the company's whole plan or setup will be different because the entire situation is different. Everything changes from situation to situation. A company can therefore follow certain steps for one type of disaster, but formulating a complete plan for a physical disaster is very difficult.

Suggestions:
The mission of a telecommunications company, whether it is a traditional regional operating company, a long-distance provider, a wireless service, or a broadband company offering bundled TV, internet and phone, is to provide customers with round-the-clock, uninterrupted service. A fire in any critical telecom facility, whether an MSC, a central office, a satellite transmission station or a cable head-end, can knock out vital communication links for thousands of customers at once. Providing customers with service they can depend on, whether it is TV sports programming, a business's phone service or service for 911, is critical to the company's business mission and vital to its continued productivity and profitability.
All ICT rooms must be equipped with smoke detectors connected to automatic addressable fire alarm systems to enable the registration of smoke generation/fire and alarm activation. The early detection principle is based on the earliest possible notification of the outbreak of a fire, so that personnel are given sufficient time to intervene before the outbreak develops into a major fire loss. Early detection imposes certain requirements on the selection of fire detector systems. Moreover, it is essential that the detectors used in such systems are more sensitive than standard optical/ionization point detectors. Systems and procedures for fire detection, alarms and extinguishing must comply with all requirements and directives stipulated by local and/or national fire authorities and, if relevant, insurance companies.

All institutions must prepare written fire protection regulations which, in addition to including personnel safety and loss reduction measures, shall also incorporate measures aimed at mitigating damage to computer equipment and data loss. Measures such as backup systems, mirror servers, etc. are effective in mitigating consequential damage and reducing vulnerability in the event of fire. Good fire hygiene procedures such as cleaning, tidiness, the replacement of filters, the removal of packaging, and the disconnection and removal of equipment that is not in use all represent important fire prevention measures.

It is essential that operating electronics be maintained at an ambient temperature consistent with reliable operation and satisfactory equipment life. As long as power is supplied to the electronics, sufficient cooling must also be available. Prolonged high temperature is an enemy of the equipment. Therefore, the heating, ventilation and air conditioning that are vital to overall function must be properly managed. High Sensitivity Smoke Detection Systems can provide the earliest possible warning in a potential fire situation.
Mobilink and equipment suppliers should perform a risk assessment of all systems and classify suppliers by their value to the company and the impact on the company if they are compromised or lost. Based on the risk assessment, they should develop a security policy that recommends and assigns the appropriate controls to protect the system. Network operators, service providers and equipment suppliers should conduct organization-wide security awareness training as well as training on technical products and controls. When a Business Recovery Plan (BRP) does not exist, network operators and service providers should bring together an ad hoc team to address the current incident. The team should have technical, operations, legal and public relations representation. It should be sponsored by senior management and have a direct communication path back to the management sponsor. If the situation exceeds internal capabilities, consider contracting response/recovery options to a third-party security provider. Redundant security measures should be implemented by both operators and vendors to ensure maximum safety at telecom sites. Enough operations and maintenance teams should be available to perform preventive maintenance in addition to corrective measures.
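The supplier classification suggested above (by value to the company and by impact if the supplier is compromised or lost) can be sketched as a simple scoring scheme. Everything in this sketch is an assumption made for illustration: the 1 to 5 scales, the multiplication of value by impact, the tier thresholds and the example suppliers are hypothetical and do not come from Mobilink or PTA.

```python
from dataclasses import dataclass


@dataclass
class Supplier:
    name: str
    value: int   # value to the company, 1 (low) to 5 (high); hypothetical scale
    impact: int  # impact if compromised or lost, 1 (low) to 5 (high); hypothetical scale


def risk_score(supplier: Supplier) -> int:
    """Combine value and impact into one score (assumed here to be their product)."""
    return supplier.value * supplier.impact


def risk_tier(supplier: Supplier) -> str:
    """Map the score to a tier; the thresholds are illustrative only."""
    score = risk_score(supplier)
    if score >= 15:
        return "critical"
    if score >= 6:
        return "important"
    return "routine"


if __name__ == "__main__":
    # Hypothetical suppliers, named after the vendor services listed in the case study.
    suppliers = [
        Supplier("Battery bank installer", value=4, impact=5),
        Supplier("Cable tray contractor", value=2, impact=3),
        Supplier("Office stationery vendor", value=1, impact=1),
    ]
    for s in sorted(suppliers, key=risk_score, reverse=True):
        print(f"{s.name}: score {risk_score(s)}, tier {risk_tier(s)}")
```

A tiering of this kind gives the security policy a basis for assigning stricter controls, such as supervised installation in equipment rooms, to the suppliers in the highest tier.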
References:
Pitt, M., and Goyal, S., (2004). Business continuity planning as a facilities management tool, Electronic Journal, 22 (), 87-89. Emerald Group Publishing Limited.
Foster, P.S., and Dye, K., (2005). Building continuity into strategy, Journal of Corporate Real Estate, 7 (2), 105-119.
Botha, J., and Solms, R.V., (2004). A cyclic approach to business continuity planning, journal of Information Management & Computer Security, 12 (4), 328-337.
Paton, D., (1999). Disaster business continuity: promoting staff capability, Disaster Prevention and Management, 8, (2), 127-133.
Herbane, B., Elliott, D., and Swat, E., (1997). Contingency and Continua: Achieving Excellence Through Business Continuity Planning, Business Horizon / November-December, 19-25.
Fry, K.N., and Diamond, T., (2001). Business Continuity and Disaster Recovery Planning and Management: Perspective, Technology Overview, 8 October, 2001.
Bryson, K.M., Millar, H., Joseph, A., and Mobolurin, A., (2001). O.R. Applications Using formal MS/OR modeling to support disaster recovery planning, European Journal of Operational Research, 141, (ed.2002), 679-688.
Herbane, B., Elliott, D., and Swat, E., (2004). Business Continuity Management: time for a strategic role? Long Range Planning, 37, (ed.2004), 435-457.
Devargas, M., (1996). Survival is Not Compulsory: An Introduction to Business Continuity Planning, Computers & Security, 18, 35-46.
Hassanain, A,M., and Al-Mudhei, A., (2006). Business continuity during facility renovations, Journal of Corporate Real Estate, 8, (2), 62 - 72
Moore, P., (1995). Critical elements of a disaster recovery and business/service continuity plan, Strohl Systems, 13, 22-27.
Heikkinen, D., and Sarkis, J., (1996). Disaster recovery issues for EDI systems, Industrial Management & Data Systems, 96, (2), (p. 32).
Cerullo, V., and Cerullo, J.M., (2004), Business continuity Planning: A comprehensive Approach, information systems managements, Web site: http://www.auerbach-publications.com/dynamic_data/3138_1893_bcp.pdf
Piece, J.R., and Noll, A.M., (1990). Signals: The Science of Telecommunications, Scientific American library series, 32, (p. 4).
Townsend, M.A., and Moss, L.A., (2005). Telecommunications Infrastructure In disasters: Preparing cities for crisis communications, Center for Catastrophe Preparedness and Response. Retrieved, April, from New York University. Web site: http://hurricane.wagner.nyu.edu.
Stanton, R., (2005). Beyond disaster recovery: the benefits of business continuity, Computer Fraud & Security, 18-19.
Brown, T., (2006). Boardroom Briefing, A publication of Directors & Boards magazine, 3, 1, (10-12).
Recovery time and recovery point objectives in disaster recovery, (2005). Retrieved Nov 15, 2011, from http://searchdatacenter.techtarget.com/tip/Recovery-time-and-recovery-point-objectives-in-disaster-recovery.html
Recovery point objective, (2011). In Wikipedia, the free encyclopedia. Retrieved 04:27, November 18, 2011, from http://en.wikipedia.org/wiki/Recovery_point_objective.html
Reiss, L.R., (2011). The Role of Insurance in Business Disaster Planning & Recovery, Retrieved Nov 15, 2011, from https://www.riskinstitute.org/peri/component/option,com_bookmarks/Itemid,44/catid,24/navstart,0/task,detail/mode,0/id,588/search,*/
Acronyms and Abbreviations:
Business Continuity Plan (BCP)
Business Continuity Strategies (BCS)
Business Impact Analysis (BIA)
Base Station Controller (BSC)
Chief Executive Officer (CEO)
Chief Executive Officers (CEOs)
Continuity Plan Maintenance (CPM)
Continuity Strategies Implementation (CSI)
Continuity Testing (CTE)
Continuity Training (CTR)
Disaster Recovery Plans (DRP)
Digital Subscriber Line (DSL)
Internet Service Provider (ISP)
Mobile Number Portability (MNP)
Mobile Switching Center (MSC)
Orascom Telecom Holding (OTH)
Pakistan Telecommunication Authority (PTA)
Pakistan Telecommunication Company Limited (PTCL)
Radio Base Station (RBS)
Recovery Point Objectives (RPOs)
Recovery Time Objectives (RTOs)
Subscriber Identity Module (SIM)
Short Message Service (SMS)
Television (TV)
Transcoder (TC)
Annexures
Exhibit 1 Exhibit 2
Source: PTA Annual Report
Financial data of Mobilink: Source (Orascom Annual Report 2010)