
European Journal of Operational Research 180 (2007) 1358–1380
www.elsevier.com/locate/ejor

O.R. Applications

A traffic shaping model for optimizing network operations


Suresh K. Nair a,*, David C. Novak b,1

a Department of Operations and Information Management, School of Business Administration, 2100 Hillside Road, Unit 1041 OPIM, University of Connecticut, Storrs, CT 06269-1041, United States
b School of Business Administration, 310 Kalkin Hall, 55 Colchester Ave., University of Vermont, Burlington, VT 05405-0157, United States

Received 22 December 2004; accepted 27 April 2006
Available online 7 July 2006

Abstract

The management of technology in multi-service computer networks, such as university networks, has become a challenge with the explosive growth of entertainment-oriented peer-to-peer (P2P) traffic. Traffic shaping is one of the tools used to manage bandwidth to improve system performance by allocating bandwidth between P2P and non-peer-to-peer (NP2P) traffic. We present a model for traffic shaping and bandwidth management that considers the trade-offs from allocating different amounts of bandwidth to different application categories, and use data from a university network. The current policy allocates varying bandwidths over the day to P2P and NP2P traffic to reflect the importance of not letting entertainment-based traffic choke the network during the daytime at the expense of more important traffic, such as Web traffic. We highlight the difficulties in obtaining data in the form required for analysis, and the need to estimate demand for allocations not covered by current policy. We present a goal programming model for this estimation task. We also model the traffic shaping problem as a Markov decision process and develop an algorithm for determining the optimal bandwidth allocation to maximize the utility of all users. Finally, we use a numerical example to illustrate our approach.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Markov decision process; Traffic shaping; Bandwidth; Goal programming

1. Introduction

Multi-service networks, where a single network infrastructure supports a wide variety of applications such as video, voice, and data, are increasingly common in today's networking environment [1]. For these networks to operate successfully, managers must allocate finite resources like bandwidth in an appropriate manner. Demand for bandwidth is growing explosively due to the astonishing success of Internet applications and the continuing rollout of faster network access technologies. Shifts in user behavior, the deployment of new applications, and the publishing of new Web content have resulted in drastic fluctuations in the volume and content
* Corresponding author. Tel.: +1 860 486 3641.
E-mail addresses: suresh.nair@business.uconn.edu (S.K. Nair), dnovak@bsad.uvm.edu (D.C. Novak).
1 Tel.: +1 802 656 4043.

0377-2217/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2006.04.036


of Internet traffic [2]. Bandwidth management plays a critical role with respect to network traffic engineering and capacity planning, and relates directly to end-to-end transport performance, overlay network routing, and peer-to-peer (P2P) file distribution. Managing network bandwidth is critical for many Internet applications and protocols, particularly those involving large file transfers or the transmission of content subject to real-time service quality constraints, such as streaming media [3]. For the purposes of this paper, we refer to all traffic other than entertainment-focused P2P file sharing traffic as non-peer-to-peer (NP2P) traffic. Over the past several years, P2P file sharing systems have emerged as a significant social and technical phenomenon, and are gaining in popularity as network connectivity increases and low cost storage and individual computing resources become more widely available [4]. Within a very short time period, P2P traffic has become a dominant component of Internet traffic, accounting for about 51% of the traffic on the Abilene backbone (a high performance Internet2 network) in 2002 [5]. In research networks with unrestricted Internet access, it is estimated that P2P traffic accounts for between 60% and 70% of the bandwidth utilized [6,7]. The growth in demand coupled with the bandwidth intensive nature of these applications presents serious issues with respect to network technology management. A general solution to network congestion is to increase network capacity, but simply adding capacity with no corresponding management solution is expensive, and the additional capacity is likely to very quickly become swamped by additional P2P traffic. KaZaA, Ares, Gnutella, eDonkey and the entire genre of entertainment-based P2P applications that facilitate file sharing can impose massive resource demands on both campus and backbone service networks [8-10]. Most P2P file sharing applications have no centralized database and require a communication-intensive search mechanism to communicate with other peers on the network [11]. From a network management perspective, these applications demand huge amounts of bandwidth, as search requests are broadcast over the network and computers that receive the search requests scan their local databases for possible hits. In general, most Internet Access Providers (IAPs) have no desire to impose draconian policies on unsanctioned network uses or severely restrict the use of P2P applications, as these policies are viewed as oppressive and may result in unhappy users [9,10,12]. At the same time, IAPs need to maintain control over their network and offer an acceptable level of performance for all traffic. The potential operational and managerial problems caused by unrestricted P2P traffic can be very serious for network providers, both for backbone providers that transfer communications in bulk among network exchange points and for IAPs that receive communications from individuals or organizations and transfer them to a backbone network and vice versa [13]. While allowing liberal use of P2P applications presents providers with opportunities to increase the profitability of their IP networks and to meet the high consumer demand associated with various P2P services, these applications tend to aggressively consume network resources and are difficult to manage because they often bypass and/or alter TCP's congestion and flow control mechanisms [6,7]. Extensive use of P2P applications causes congestion and performance deterioration, which may lead to customer dissatisfaction and turnover.
P2P applications have presented particularly difficult problems for university network providers, as university IT managers have discovered that unrestricted P2P traffic can monopolize the capacity of the network for long periods of time, causing performance problems for the entire network [10,13,14]. A serious consideration faced by many network administrators is how to manage the trade-off between bandwidth resources allocated to NP2P traffic and bandwidth resources allocated to P2P traffic. Off-the-shelf solutions are available to address resource allocation issues associated with uncontrolled P2P traffic. A variety of products such as Packeteer's PacketShaper™, Allot Communications' Net Enforcer™, and Cisco Systems' Service Control technologies are used to manage bandwidth by controlling recreational and potentially harmful traffic through traffic shaping. Traffic shaping is a generic term given to a wide range of techniques designed to enforce prioritization policies and to control bandwidth utilization and application performance [15]. It generally involves classifying, queueing, and prioritizing traffic streams to achieve a high level of service with respect to guaranteeing low latency and/or bandwidth availability for high-priority or mission critical traffic [16]. Traffic shaping products provide application-layer classification and control of network traffic, allowing managers to identify, filter, and independently allocate resources to specific traffic types like FTP, SMTP, and HTTP, as well as P2P traffic generated by file sharing applications such as Gnutella and KaZaA. Using these traffic shaping tools, network managers can improve network performance and reduce congestion by employing different bandwidth management policies that restrict P2P file sharing traffic so that other applications are given a larger portion of the total bandwidth.
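Although commercial shapers differ in implementation detail, the rate-limiting mechanism at the core of many of them can be illustrated with a simple token bucket. The sketch below is a generic illustration of that mechanism under our own assumptions, not any vendor's actual algorithm; all names and parameters are ours.

```python
# A generic token-bucket rate limiter, a common building block of traffic
# shapers; a simplified illustration, not any vendor's implementation.
import time

class TokenBucket:
    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate = rate_bps          # sustained rate allowed (bits/second)
        self.capacity = burst_bits    # maximum burst size (bits)
        self.tokens = burst_bits
        self.last = time.monotonic()

    def allow(self, packet_bits: int) -> bool:
        """Admit the packet if enough tokens have accrued; else it must wait."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False                  # queue (shaping) or drop (policing)

# e.g., cap a single P2P stream at 1 kbps with a small burst allowance
p2p_stream = TokenBucket(rate_bps=1_000, burst_bits=8_000)
```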


While a simple management solution would be to eliminate or severely curtail P2P traffic, this could make a number of users unhappy [9,10,12]. As the bandwidth allocated to entertainment-based applications (including file sharing, streaming media, and gaming traffic) is curtailed, the utility these users gain from the network will also decrease. In fact, many network users derive a high level of utility or satisfaction from being able to use high-speed Internet access specifically for entertainment purposes. In practice, there are few universities that completely restrict P2P communications [9,10]. The trade-off centers on how network managers can both implement acceptable service quality policies for mission critical applications and ensure acceptable usage policies for largely entertainment-based applications. Relationships between network user satisfaction and bandwidth are discussed in [1,17]. The problem of Internet Quality-of-Service (QoS) is an active area of research. As the term QoS often means different things to different people, we clarify our meaning by defining QoS as the ability to guarantee and limit bandwidth for certain users and services [18]. Traffic shaping is not the only QoS management solution that is used to manage network traffic. In practice, many different QoS solutions may be used together. For example, traffic shaping solutions may be used in conjunction with different QoS architectures such as DiffServ and IntServ, with protocols that provide some support for QoS such as TCP, and with different QoS control mechanisms like buffering, packet scheduling, and resource reservation [19]. Unlike many other QoS solutions, traffic shaping is not application-based and does not allow the applications themselves to provide the network with service quality information. Since most traffic shapers allow upper layer (layers 5-7 of the OSI model) functionality, traffic can be identified, separated, and managed at the application layer. This provides a much more robust and accurate way of categorizing traffic compared to traditional layer 3 and 4 filtering policies. Traffic shaping devices can be placed locally on networks and do not require other devices along the path of incoming/outgoing packets to coordinate in any way, as traffic shaping does not focus on end-to-end QoS routing and buffering issues. Traffic shaping also works well in heterogeneous networks where different standards and technologies are employed that may limit the effectiveness of other QoS solutions [18]. The widespread use of traffic shaping solutions and products to manage problems caused by P2P traffic is well documented, particularly in the university network management domain. While we employ data and demonstrate the solution methodology from the perspective of a single university, the University of Connecticut (UConn), we would like to emphasize that this problem is faced by many universities around the world. An ad hoc list of about 90 universities that currently employ traffic shaping, specifically using PacketShaper™, is provided in Appendix A. This paper describes a methodology used to develop an optimal policy for bandwidth allocation between P2P and NP2P applications by maximizing the utility for all network users. There are a number of unique contributions associated with this research. First, although traffic shaping is a bandwidth management technique commonly employed around the country, we are unaware of published work that considers the trade-off between resources allocated to P2P and NP2P traffic on networks.
Second, this research is based on a real problem facing many IAPs (particularly university providers) and backbone/overlay network providers today: bandwidth allocation policies, traffic shaping, and how to treat P2P applications that tend to monopolize available bandwidth and conflict with mission critical traffic. Much of the existing network bandwidth allocation literature focuses on modeling end-to-end QoS [1,20-23]. This literature is theoretical in nature and tends to rely on simulated, hypothetical networks. We consider bandwidth allocations on a local network (in this case, an actual campus network), where managers have full control over management policies and set QoS requirements for the network. In this case, various QoS control mechanisms such as packet scheduling, queue management, and buffering may already be in place, but do not solve localized problems caused by unrestricted P2P traffic. Consequently, our model is different from models that focus on end-to-end QoS in that it is designed specifically for an IAP and directly addresses problems caused by P2P traffic that are not addressed from an end-to-end perspective. Another contribution of this paper is that we estimate the P2P demand for time periods and bandwidth allocations not observed in the data set by using a goal programming approach. We also present a bandwidth allocation model using Markov decision process (MDP) methodology. While MDPs have been successfully applied to admission control problems in networks, their application to network resource allocation problems is relatively new [1]. The model is constructed in a manner that supports decision making by IAPs and uses


variables that network managers can readily collect and understand. A detailed numerical example is presented using actual data to demonstrate the methodology. Empirical data are used to characterize P2P and NP2P traffic flows and to build the model. Since the data collected are biased by the traffic shaping policies currently in place at the test bed provider, they do not allow for analysis of the system under other bandwidth allocation policies. In order to estimate traffic if other policies were used, we develop a goal programming model. The collected data, along with the results of the goal program and utility information, are then used to solve an MDP to maximize the utility of all network users while considering the cost of using and monitoring bandwidth allocation changes during the day.

2. Literature review

Although the research literature addressing bandwidth management and network resource allocation is extensive, relatively little work has been done on developing solution methodologies directly related to managing P2P traffic on multi-service networks or pertaining to traffic shaping strategies in general. We briefly summarize a sampling of research focusing on network resource allocation as well as research addressing a variety of P2P-related issues, and provide references to point interested readers to appropriate sources of additional information. Most P2P-related literature focuses on traffic characterization and architectural or protocol-related topics associated with P2P traffic, and not directly on issues associated with managing P2P traffic. The literature discusses the impacts P2P file sharing has had on the overall volume of Internet traffic as well as the numerous QoS problems caused by P2P file sharing applications; however, the focus is on technical rather than managerial solutions. Saroiu et al. [24] perform a measurement study of two popular P2P file sharing systems, Napster and Gnutella, at a university to accurately characterize the population of end user hosts. The authors find heterogeneity in peers' availability, bandwidth, and data transfer rates. Gummadi et al. [13] analyze P2P trace data from a large university and develop a model of multimedia workloads. The results reveal dramatic differences between Web traffic and P2P file sharing. Saroiu et al. [14] examine Internet content delivery systems flowing into and out of a large university network, comparing Web-based traffic to P2P file sharing traffic. Schollmeier and Dumanois [6] investigate the properties of P2P traffic and compare those properties to traditional client-server (CS) traffic. The authors discuss how the use of P2P applications may impact performance and load on CS networks. Sen and Wang [8] analyze P2P traffic flows for a large ISP by investigating three popular recreational P2P applications: DirectConnect, Gnutella, and FastTrack. P2P traffic characteristics are compared to Web-based traffic. Kim et al. [25] discuss how modern Internet traffic is difficult to categorize because of complex traffic patterns and the fact that many applications do not consistently use well known Layer 4 port values. The authors propose an algorithm for identifying P2P traffic flows on a network. Ripeanu et al. [5] examine the problems P2P-based systems cause on CS architectures such as the Internet by studying the topology and protocols used by Gnutella. The authors discuss the costs and benefits of the P2P approach and investigate possible improvements that would allow increased reliability in P2P networks. Aberer et al.
[11] discuss the complex searching, node organization, and security issues associated with P2P systems. The authors introduce a Peer-Grid approach for improving the performance of P2P systems. Ge et al. [26] develop mathematical models to explore and illustrate network performance issues associated with the use of P2P file sharing applications. The authors use the models to evaluate scalability and the impact of freeloaders. Junginger and Lee [27] design a P2P middleware solution for improving the operational performance of a wide range of P2P applications. Menasce [28] examines how P2P systems can improve network resource usage through the design and use of scalable P2P resource location protocols. Mischke and Stiller [29] provide a comprehensive survey of P2P search systems and propose a middleware framework for improving the search, lookup, and routing functions of P2P file sharing applications. Ng et al. [30] consider measurement-based optimization as a strategy for improving the performance of bandwidth-hungry P2P systems. The authors evaluate the properties of traditional measurement-based techniques such as RTT probing, TCP probing, and bottleneck bandwidth probing in an effort to determine how effective these techniques might be in optimizing P2P performance.


Bandwidth management and network resource allocation are widely researched topics. We summarize a number of publications that examine various issues related to bandwidth management and network admission control. These papers offer a sample of research pertaining to bandwidth allocation, network admission, and/or the use of MDPs in modeling network performance problems. Prasad et al. [4] provide an overview of bandwidth estimation metrics, measurement techniques, and tools. The paper focuses on end-to-end measurement techniques performed by end hosts of a path and offers a comprehensive overview of terminology and bandwidth estimation tools. Firoiu et al. [19] survey recent advances in models and theories for Internet QoS. The authors address the theory of network calculus, network architecture support for guaranteed services, and statistical performance guarantees, and finally review recent proposals and results supporting best effort performance guarantees. Savagaonkar et al. [31] consider pricing strategies for bandwidth resource allocation in multi-class networks using a partially observable MDP. The authors introduce two novel pricing schemes, reactive pricing and spot pricing, and compare their performance with flat-rate pricing. Kalyanasundaram et al. [1] consider resource allocation in a multi-service network where users specify the values they attach to various resources via a utility function. The authors develop an optimal resource allocation scheme for multi-service networks that is modeled as a continuous-time MDP. Habib and Saadawi [32] suggest a dynamic bandwidth allocation and control scheme for multi-service traffic on an asynchronous transfer mode (ATM) network. Different types of traffic with similar characteristics and service requirements are grouped together. The allocation scheme then supports similar types of traffic over the same virtual path. Park et al. [20] study a quality of service (QoS) provision problem in non-cooperative networks. Users are given the freedom to choose the service classes and traffic volume allocated, and heterogeneous preferences are captured by individual utility functions. Shiomoto et al. [22] discuss a connection admission control method based on virtual path utilization on ATM networks. The admission control policy achieves 65% or 80% of the optimum statistical multiplexing gain for two categories of fixed bandwidth virtual paths. Hung and Kesidis [33] offer suggestions for scheduling algorithms based on a minimum bandwidth property for a class of bandwidth scheduling policies for wide-area ATM networks. Chan and Geraniotis [34] analyze the performance of multi-service virtual circuit-switched networks. Near-optimal approximations for bandwidth requirements for specific traffic types are developed. Yuang and Haung [23] analytically determine the optimal bandwidth allocated to voice and data using a queueing model with heterogeneous arrivals in a multi-channel system.

3. Background

We examine a network allocation problem focusing on bandwidth management and the trade-off between mission critical and entertainment-based traffic from the perspective of a university IAP. Specifically, we present a model for traffic shaping that considers the trade-offs from allocating different amounts of bandwidth to different application categories. To clearly illustrate the problems associated with unrestricted P2P traffic and to quantify our results, we use empirical data collected from an actual university provider to demonstrate the proposed methodology.
The University of Connecticut's (UConn) University Information Technology Services (UITS) manages the entire campus network, including the gateway links from the campus network to the Qwest backbone network. At the beginning of the fall 2000 semester, UITS had experienced heavy but sustainable Internet usage patterns, with slightly more than 60% of the total network utilization attributed to P2P file sharing applications [36]. By October 2000, average network utilization (in terms of bandwidth demanded) had increased by around 17%, with slightly more than 66% of the demand attributed to P2P traffic. During this time, the ratio of incoming to outgoing traffic remained fairly stable, but congestion increased considerably, with mean delay rising by a factor of three. The utilization percentages attributed to P2P activity on the UConn campus reported by UITS are consistent with the percentages described in [5-7]. In an effort to reduce congestion, UITS experimented with a policy for excessive usage that involved conducting an IP audit, where the IP addresses of heavy users were identified and the guilty parties were asked to voluntarily reduce their usage; abusive or persistent users were then disconnected. This policy worked well for a portion of the fall 2000 semester, as P2P applications were still fairly new. As P2P demand escalated, however, the policy could not be applied to the student


body as a whole. The use of P2P applications became commonplace and, realistically, there was no way all heavy users could be contacted, nor could UITS enforce the denial of service policy for a large percentage of the on-campus student body. Beginning in spring 2001, UITS began experimenting with traffic shaping, or managing and prioritizing Internet traffic, using Packeteer's PacketShaper™. After a few months of trial and error, an ad hoc management policy was implemented. At the time this research began, UITS was leasing a 105 Mbps Internet 1 (I1) gateway pipe. As the actual UITS policy implemented at the Transmission Control Protocol (TCP) stream level (Layer 4) is fairly complex, a simplified version of the policy is summarized as follows. Between 6:00 a.m. and 6:00 p.m., all P2P traffic on campus is limited to 1 Kbps per stream out (upload) and a maximum total bandwidth of about 10 Mbps in (or about 10% of download capacity). This policy effectively allows only enough bandwidth for individual P2P connections to get established and stay open, but is small enough to deter large-scale demand for file downloads because the wait time per user is so high. Beginning at 6:00 p.m., the total P2P capacity is increased to about 6 Mbps out and about 26 Mbps in (about 25%). At 10:00 p.m., the total P2P capacity is increased to 11 Mbps for outgoing traffic and about 42 Mbps in (approximately 40%). Finally, between 1:00 a.m. and 6:00 a.m., the policy is relaxed to allow fairly free-flowing P2P traffic: 16 Mbps outgoing and a maximum of around 63 Mbps in (about 60%). The policy has been adjusted slightly over the course of the past few years, but has basically remained the same since fall 2001. Although the policy appears to be functional, it is not necessarily optimal, and the university has not, to this point, addressed the issue of optimal bandwidth allocation. UITS engineers have acknowledged that there could very well be times during the day when the network is either over-utilized or under-utilized based on the current bandwidth management policies, and they have expressed interest in optimizing the bandwidth allocations [35]. Traffic shaping solutions are often discussed from a universal perspective. The authors are unaware of any literature that addresses optimal bandwidth allocations or that provides recommendations for actual allocation policies. For the purpose of this research, we examine only traffic coming into the campus network (downloads), although the methodology presented here could potentially be used for both incoming and outgoing traffic streams. The reason behind this is that UITS currently imposes very stringent restrictions on the uploading of files from university-based systems. The P2P upload policy is fixed at 1 Kbps per stream, and there is no plan to alter the policy. Unrestricted uploads tend to present a much more serious bandwidth management problem than do downloads. This is because at any given point in time, a very large number of remote users from across the world could potentially be attempting to upload files from various P2P hosts on a local network, while the number of local P2P hosts attempting to download is much smaller and is constrained by the number of subscribers or hosts on the local network. For example, many tens or even hundreds of thousands of remote users could attempt to upload files from P2P hosts on a campus network, while the number of local P2P hosts that attempt to download files is limited to hundreds or thousands of users depending on the time of day.
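For concreteness, the simplified download-side schedule described above can be encoded as a time-of-day lookup table. The sketch below uses the hours and caps quoted in the text; the structure and names are our own illustration, not UITS's actual PacketShaper™ configuration.

```python
# The simplified UITS incoming P2P policy as a time-of-day table; values are
# from the text, the structure and names are illustrative assumptions.
P2P_DOWNLOAD_POLICY = [
    # (start_hour, end_hour, max_p2p_in_mbps, approx_share_of_105_mbps_pipe)
    (6, 18, 10, 0.10),    # 6:00 a.m. - 6:00 p.m.: ~10 Mbps in
    (18, 22, 26, 0.25),   # 6:00 p.m. - 10:00 p.m.: ~26 Mbps in
    (22, 1, 42, 0.40),    # 10:00 p.m. - 1:00 a.m.: ~42 Mbps in
    (1, 6, 63, 0.60),     # 1:00 a.m. - 6:00 a.m.: ~63 Mbps in
]

def p2p_cap_mbps(hour: int) -> int:
    """Return the incoming P2P cap (Mbps) for an hour in 0-23."""
    for start, end, cap, _ in P2P_DOWNLOAD_POLICY:
        # windows may wrap past midnight (e.g., 22:00-01:00)
        in_window = start <= hour < end if start < end else (hour >= start or hour < end)
        if in_window:
            return cap
    raise ValueError("hour not covered by policy")
```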
Since IAPs are primarily concerned with QoS issues relating to their customers and not remote users, setting P2P upload policies may not be a priority. The test bed network supports full-duplex service; however, we concentrate only on the download policy in this paper.

4. Data collection and use

There are two basic problems associated with monitoring and analyzing Internet traffic today. The first is how to capture and handle the huge amount of traffic generated from high-speed network links. The second is how to analyze and categorize the vast and complex types of traffic generated by different network-based applications such as streaming media, P2P, and online gaming [25]. In the past, Internet traffic was much easier to analyze, as well-known protocols such as HTTP, FTP, Telnet, and SMTP accounted for the great majority of traffic, and these use specific Internet protocol port numbers. Data routinely captured for network management give port numbers of origin and destination, and this can be used to distinguish the various types of traffic. Today, the proportion of traffic associated with well known ports is decreasing, while the proportion of traffic associated with P2P applications, streaming media, and gaming (which may not be restricted to specific ports or may hop from one port to another) is increasing [25]. The data used in this study are empirical TCP stream data from March 2003 collected in the normal course of daily operations; they represent typical TCP sessions between remote (off-campus) and local hosts and are

[Fig. 1. Daily incoming bytes, empirical data example (Source: UITS, Networking Engineering). The figure plots incoming bytes (10^6) against hour of day, showing the observed total alongside the estimated NP2P and estimated P2P components.]

not biased by any unusual events. Internet traffic is highly dynamic and can change drastically over relatively small time periods for a variety of reasons [2]. Fig. 1 illustrates the dynamic nature of incoming Internet traffic on a daily basis. The values are estimated for NP2P and P2P because it is not possible to identify all application types and categorize them as either P2P or NP2P downloads for the population as a whole based on port numbers [8,25,35]. UITS collects and stores data in 30-min time intervals. For the purpose of this study, the data are aggregated into hourly time intervals for analysis. The raw TCP stream data contain 13 fields: (1) Local IP address, (2) Remote IP address, (3) Protocol, (4) Local port number, (5) Remote port number, (6) Incoming bytes, (7) Outgoing bytes, (8) Incoming packets, (9) Outgoing packets, (10) First packet time, (11) Last packet time, (12) First packet source, and (13) Last packet source. Table 1 presents a sample of empirical TCP stream data. TCP stream data are used in this study because these are the types of data network managers have readily available and typically use to analyze network performance. However, raw TCP stream data are not inherently useful in analyzing user activity or performance by traffic type without extra manipulation. For example, a user engaged in a single activity, such as downloading a series of Web pages, may generate many separate TCP sessions during the course of the download; some of the streams represent control information and some represent an actual transfer of data. In some cases, a single session may remain open for a fairly long time (tens of seconds) without exchanging any significantly large data streams. In other cases, multiple sessions may be set up and torn down to accomplish a single task, resulting in many tens or hundreds of individual TCP sessions that actually characterize a single task. This is what makes modeling of packet-switched networks more challenging than circuit-switched (e.g., phone) networks. IAPs are commonly concerned with standard engineering-based link performance metrics such as latency, throughput, packet loss, and jitter [1,2,36]. Historical data related to average file sizes and stream durations for different traffic types are not collected and stored. There is no comparison of wait times for P2P downloads and NP2P downloads. Engineering-based data are useful in evaluating overall network performance, but do not provide detailed wait time information for specific applications or classes of users. Nor do they provide any insight into how users might change their behavior based on variations in network wait times or throughput. While UITS possesses a number of network performance diagnostic tools, many of the data elements required or desired for this study were not easily available. The data and performance statistics typically collected and analyzed do not directly correspond to all of the management issues or types of analyses service providers could consider.
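As an illustration of the kind of pre-processing this implies, the sketch below aggregates raw stream records into incoming-byte totals per local IP for one capture interval. The CSV layout, field names, and comma-formatted byte counts are assumptions based on the 13 fields and the sample in Table 1, not the actual UITS file format.

```python
# A minimal sketch of aggregating raw TCP stream records; the CSV layout and
# field names are assumptions based on the 13 fields listed above.
import csv
from collections import defaultdict

FIELDS = ["local_ip", "remote_ip", "protocol", "local_port", "remote_port",
          "in_bytes", "out_bytes", "in_pkts", "out_pkts",
          "first_pkt_time", "last_pkt_time", "first_pkt_src", "last_pkt_src"]

def in_bytes_by_local_ip(path: str) -> dict:
    """Total incoming bytes per local IP for one 30-min capture file;
    two consecutive files would be merged to form an hourly interval."""
    totals = defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f, fieldnames=FIELDS):
            # byte counts in the sample are comma-formatted, e.g. "2,711,984"
            totals[row["local_ip"]] += int(row["in_bytes"].replace(",", ""))
    return totals
```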


Table 1
Sample of TCP stream data

Local IP        | Remote IP       | Protocol | Lcl port | Rmt port | In bytes  | Out bytes | In pkt | Out pkt | 1st pkt time | Last pkt time | 1st src | Last src
137.099.164.204 | 066.028.100.170 | 6        | 2668     | 3021     | 2,711,984 | 40,062    | 1797   | 741     | 00:00.6      | 02:32.3       | 2       | 2
137.099.141.221 | 080.060.053.139 | 6        | 6699     | 1077     | 9,413,390 | 432,992   | 11,215 | 7358    | 00:00.6      | 30:01.4       | 2       | 2
137.099.139.235 | 024.229.117.237 | 6        | 1210     | 2686     | 2484      | 29,788    | 46     | 47      | 00:00.6      | 03:37.0       | 2       | 1
137.099.173.226 | 210.016.037.029 | 6        | 1214     | 65,010   | 11,880    | 53,411    | 220    | 220     | 00:00.6      | 06:21.0       | 1       | 1
137.099.169.000 | 172.176.127.210 | 6        | 3401     | 1883     | 3,272,460 | 126,252   | 2359   | 2336    | 00:00.6      | 29:55.3       | 2       | 2
137.099.138.080 | 148.221.163.061 | 6        | 1220     | 1226     | 1368      | 16,282    | 24     | 30      | 00:00.6      | 02:19.4       | 1       | 2
137.099.128.113 | 209.012.072.214 | 6        | 3992     | 10,153   | 2,894,639 | 54,648    | 1920   | 1012    | 00:00.6      | 02:21.0       | 1       | 2
137.099.165.234 | 200.165.238.124 | 6        | 2679     | 10,117   | 324       | 4542      | 6      | 3       | 00:00.6      | 00:01.2       | 1       | 2
137.099.166.157 | 141.157.221.084 | 6        | 3534     | 1214     | 4,100,062 | 121,920   | 2949   | 2148    | 00:00.6      | 08:22.5       | 1       | 2
137.099.137.198 | 068.116.163.164 | 6        | 4597     | 1420     | 98,462    | 2856      | 69     | 50      | 00:00.6      | 01:11.3       | 2       | 2
137.099.170.202 | 209.078.017.022 | 6        | 3587     | 4334     | 1962      | 38,724    | 35     | 33      | 00:00.6      | 01:25.2       | 1       | 2
137.099.171.028 | 210.083.127.039 | 6        | 4342     | 3996     | 8,522,224 | 221,998   | 6003   | 3901    | 00:00.6      | 10:40.3       | 1       | 2
207.075.164.245 | 233.045.017.245 | 17       | 1027     | 4444     | 0         | 5.85E+08  | 0      | 408,102 | 00:00.6      | 25:17.5       | 1       | 1
137.099.130.073 | 168.143.179.115 | 6        | 1698     | 80       | 13,955    | 432       | 11     | 8       | 00:00.6      | 01:00.8       | 2       | 1
137.099.167.125 | 064.061.198.167 | 6        | 2177     | 1214     | 1998      | 16,308    | 37     | 33      | 00:00.6      | 02:54.3       | 1       | 2
137.099.153.127 | 024.126.251.134 | 6        | 1214     | 2357     | 0         | 54        | 0      | 1       | 00:00.6      | 00:00.6       | 1       | 1
137.099.154.039 | 131.118.093.170 | 6        | 1214     | 10,347   | 1624      | 25,954    | 15     | 21      | 00:00.6      | 05:16.1       | 2       | 2
137.099.178.136 | 004.034.213.021 | 6        | 3766     | 6112     | 14,263    | 14,450    | 231    | 232     | 00:00.6      | 12:26.5       | 2       | 1


4.1. Estimating the number of active users

IAPs can readily produce local IP address counts using TCP session log data (see Table 1). However, data on active users are challenging to obtain because of the decentralized domain/user account management architectures prevalent on multiple-domain networks. For example, UITS does not manage the domain structure or user account information within specific colleges or departments and has no quick and accurate way to reconcile user accounts with IP addresses. In this paper, we estimate the number of users from empirical IP address data. We do this because IP address counts don't convey true user demand: the counts do not distinguish between active users who are logged on to more than one physical machine, various types of servers, networked peripheral devices, or idle computers. A simple dump of unique local IP address counts on campus might yield tens of thousands of unique IP hits over a day; however, the active users may number only in the hundreds or thousands depending on the time of day. Problems associated with separating live or active users from IP address counts are discussed in [8]. To filter active users out of aggregate IP address counts, we used a minimum value for the bytes downloaded per IP address over hourly time intervals, together with the unique IP itself. This effectively removes control/signaling traffic streams from the analysis. IPs associated with routers and DNS servers, as well as IPs used by UConn affiliates located off campus, are filtered out using a list of IP ranges provided by UITS. A minimum download threshold value of 100 KB over the course of an hour is used to identify active network users. Information provided in [6,8] is used to help set the minimum download threshold. According to [8], the low range of the average P2P download is around 2 MB per day (about 85 KB per hour). A 100 KB per hour minimum download threshold effectively filters out idle IPs and signaling traffic (both TCP and application based). At the same time, the threshold is not set so high as to exclude IPs downloading basic Web content or to exclude P2P downloads during times when the P2P allocation is very restrictive. While the method of determining the number of active users is not perfect, our estimates were validated by the UITS engineers [35] and are consistent with the literature. In practice, the minimum download threshold can be changed if IAPs have a reason to do so. The potential problem associated with automated IP counts is clearly illustrated in Table 2. The number of unique IP addresses ranges from about 97,600 to nearly 120,000. Obviously, these counts cannot represent individual users, as UConn has fewer than 20,000 faculty and students on campus. The NP2P and P2P samples represent unique IP addresses using the specific port numbers discussed previously.

Table 2
Daily user estimates by unique IP address

Hour | Observed IP addresses: Full pop. | Filtered pop. | Filtered pop. ≥ 100 KB | Estimated # active users: Est. NP2P users | Est. P2P users | Est. using both
H1   | 117,943 | 45,372 | 2324 | 1861 | 380 | 83
H2   | 110,057 | 44,201 | 1586 | 1215 | 328 | 42
H3   | 104,106 | 49,039 | 1109 | 776  | 319 | 14
H4   | 106,204 | 48,365 | 764  | 519  | 232 | 14
H5   | 112,971 | 45,475 | 611  | 381  | 223 | 7
H6   | 118,542 | 41,704 | 521  | 325  | 193 | 3
H7   | 119,556 | 44,600 | 435  | 341  | 92  | 2
H8   | 122,712 | 46,843 | 889  | 814  | 70  | 5
H9   | 114,716 | 42,652 | 1685 | 1589 | 87  | 9
H10  | 119,632 | 42,821 | 2247 | 2126 | 108 | 13
H11  | 118,956 | 42,953 | 2595 | 2445 | 133 | 17
H12  | 117,915 | 46,370 | 2904 | 2741 | 141 | 22
H13  | 102,256 | 42,050 | 3126 | 2948 | 155 | 23
H14  | 110,577 | 49,126 | 3147 | 2962 | 157 | 27
H15  | 117,256 | 47,743 | 3275 | 3084 | 163 | 28
H16  | 121,571 | 46,791 | 3517 | 3305 | 179 | 32
H17  | 114,987 | 42,924 | 3458 | 3168 | 246 | 44
H18  | 110,125 | 45,912 | 2978 | 2500 | 408 | 70
H19  | 97,643  | 42,394 | 2915 | 2495 | 312 | 107
H20  | 111,304 | 44,134 | 2939 | 2503 | 339 | 97
H21  | 115,464 | 44,731 | 2859 | 2457 | 324 | 78
H22  | 110,257 | 50,623 | 2912 | 2431 | 396 | 85
H23  | 99,779  | 47,347 | 2754 | 2342 | 316 | 96
H24  | 108,861 | 45,550 | 2494 | 2054 | 341 | 99


The filtered population column excludes known DNS, router, network administration, and off-campus IP address ranges, and represents unique IP addresses actively engaged in downloading at least 100 KB over an hour's time. The last three columns represent an estimated breakdown of the population of users based on a sample of traffic that we were able to clearly categorize by Layer 4 port numbers. In addition to TCP stream data, UITS provided aggregated counts of P2P and NP2P sessions for all Internet links (I1 and I2) and aggregated counts of incoming and outgoing bytes over hourly time intervals. The raw session log data for each hour comprise multiple gigabytes (GB) of data. To better characterize P2P versus NP2P traffic, four Layer 4 port numbers are used to identify certain well known P2P traffic. Any TCP stream data with either remote or local port 1214 (KaZaA download), 6699 (WinMX), 6346 (Gnutella download), or 4662 (eDonkey, Overnet, eMule download) are used to represent P2P traffic. These ports are used by popular P2P applications and generate the majority of P2P traffic on the campus network [35]. It is common practice in the literature that explores P2P traffic characterization and performance issues to classify P2P traffic in terms of well-known, entertainment-based, file-sharing applications such as Gnutella and KaZaA [8,13,14,25]. Many of the Layer 4 port numbers used in this study to represent generic, recreational P2P traffic are also specifically mentioned in [8,25]. Other IAPs may choose to add port numbers to this list or use different combinations of port numbers as needed in characterizing P2P traffic. Based on empirical data, HTTP traffic accounts for over 50% of all NP2P traffic. Other well known services such as FTP, Telnet, and SSH account for approximately 3%, 1.5%, and 0.2% of NP2P traffic, respectively. To simplify the study, and because Web-based traffic is the single largest component of NP2P traffic as a whole, we felt that HTTP traffic was suitable for general NP2P traffic characterization. The prevalence of HTTP traffic and its potential use in categorizing non-P2P Internet traffic behavior is also mentioned in [6,8,13,25]. Separating traffic based on port numbers is not an exact procedure because not all Internet traffic uses recommended or well known ports. For example, some applications choose to use somewhat obscure port numbers. Other applications are simply not associated with any well known or commonly used ports. It is also the case that many P2P applications are designed to port hop, or jump from port to port at the TCP stream level, which makes it very difficult to identify and manage the traffic. The sheer volume of data, the fact that we could not accurately match many sessions to an identified traffic type, and the inherent difficulty of translating Layer 4 stream-level data into well defined user activities [6,8,25] made the data analysis portion of the project very challenging. The data utilized in this study are not necessarily representative of all universities or IAPs as a whole, as providers employ different network architectures, have different capacity limitations, and service different customer bases with potentially different usage patterns. However, the use of empirical data from the test provider allows us to construct a model and effectively test and demonstrate our methodology. The objective of the paper is to explain a unique solution methodology and discuss the management implications associated with disparate bandwidth policy allocations. The sample data employed are adequate for our stated goal.
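A sketch of the filtering and classification rules described above is shown below; the port list and the 100 KB threshold are from the text, while the function names and data structures are our own illustrative assumptions.

```python
# A sketch of the active-user filter and port-based P2P classification
# described above; only the ports and threshold come from the paper.
P2P_PORTS = {1214, 6699, 6346, 4662}   # KaZaA, WinMX, Gnutella, eDonkey/eMule
ACTIVE_THRESHOLD = 100 * 1024          # at least 100 KB downloaded in an hour

def classify_stream(local_port: int, remote_port: int) -> str:
    """Label a TCP stream P2P if either endpoint uses a known P2P port."""
    return "P2P" if {local_port, remote_port} & P2P_PORTS else "NP2P"

def active_users(hourly_bytes_by_ip: dict, excluded_ips: set) -> set:
    """IPs downloading at least the threshold in the hour, excluding known
    DNS servers, routers, and off-campus ranges (the UITS-provided list)."""
    return {ip for ip, b in hourly_bytes_by_ip.items()
            if b >= ACTIVE_THRESHOLD and ip not in excluded_ips}
```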
We acknowledge that IAPs will want to utilize individualized data and possibly even perform separate analyses on a semi-annual or annual basis when applying this methodology to their particular situation.

5. Methodology

In the packet-switched domain, the term bandwidth typically refers to the maximum amount of data (expressed in bits per second) that can be transferred over a communications path during a fixed time interval [4]. There are many metrics one may use for optimizing allocations of bandwidth. Traditionally, network problems have looked at the costs of delay. Other metrics could be the costs of not serving users, or the cost of not serving the total demand of a particular kind. Since users may renege when the delay is large, the observed delay may not be an accurate measure of the potential delay. The management policies employed by the test provider directly impact how the problem is modeled and the assumptions underlying the model. For example, P2P and NP2P traffic is treated separately via a virtual pipe for each traffic category, as defined by the particular bandwidth allocation policy being used. Separating different traffic types and modeling them as individual virtual streams is an approach that is also used in [1,22,23]. While NP2P traffic is prioritized on the NP2P virtual pipe by application type (HTTP traffic receives priority), P2P traffic is not prioritized.

1368

S.K. Nair, D.C. Novak / European Journal of Operational Research 180 (2007) 13581380

[Fig. 2. Problem solving methodology overview. The flowchart proceeds: collect empirical session data; identify traffic types (P2P versus NP2P); for NP2P traffic, note total Mbps, Mb/user, and coefficient of variation, and for P2P traffic, note total Mb/user and coefficient of variation; compute arrival rates (users/sec); use the goal program to estimate Mb/user at all allocations and times; compute utility to users; use the MDP to maximize total utility for a given number of bandwidth allocation changes; compute total utility to all users; and select the best number of bandwidth allocation changes.]

That is, UITS does not prioritize between different P2P applications (KaZaA downloads are not given priority over Gnutella downloads, for example). Within the P2P virtual pipe, the bandwidth allotted to specific applications may be capped, but P2P and NP2P traffic are treated separately. P2P traffic is not allowed to overflow into the NP2P virtual pipe at any time [35]. This is because many P2P applications tend to monopolize available bandwidth by maintaining virtual connections even when not actively transmitting data, prohibiting unused bandwidth from being released and reallocated as needed, which is a key performance function of TCP's connection establishment and teardown process. Because we were not explicitly concerned with NP2P wait time estimates by application type (HTTP versus FTP or Telnet, for example), we do not model the problem using priority queues. Fig. 2 provides a visual summary of our proposed methodology for solving the bandwidth allocation problem. Data are first collected on the observed total NP2P and P2P traffic at different times during the day under the existing bandwidth allocation policy. As the empirical data collected are biased by the current bandwidth allocation policies employed by UITS, unconstrained P2P demand estimates are required. One approach could have involved turning off the existing allocation policies for a few days, publicizing this to the target audience (students and faculty in our case), and then collecting data on observed usage patterns. This alternative is not viable, since changing the allocation policies would create serious network performance issues for certain allocations (for example, 5% NP2P and 95% P2P). Also, monitoring the results and collecting data for each policy change (the percentage of bandwidth allocated to P2P versus NP2P over different percentage increments) for each hour of the day would require too many different scenarios and was unrealistic with respect to the staffing resources currently available to UITS. For these reasons, we estimate the demand during all times and all policy allocations by interpolation using a goal programming model. We model P2P demand as elastic, where demand may change noticeably


as bandwidth allocations change. Assuming a policy that would allow P2P traffic the whole pipe, it is possible for P2P demand to grow to consume all the available capacity. This assumption is consistent with reported P2P traffic behavior in the literature [6,8,25]. NP2P demand is modeled as inelastic, where demand does not change noticeably as the bandwidth allocation changes. The general category of NP2P traffic is considered mission critical traffic: users will attempt to complete an NP2P transfer because it represents work that needs to be done, as opposed to entertainment-based transfers. Unconstrained NP2P demand is reflected in the empirical data we collected at the current bandwidth allocation policies, as the policies are explicitly set to make sure NP2P demand is met regardless of P2P demand [35]. Therefore, for the purposes of this paper we used the goal programming model only for P2P traffic. The next step is to compute the utility to the users for the NP2P and P2P download speeds achieved during each time period and bandwidth allocation combination. We do this using the standard 5-point method of utility assessment. Finally, we use a Markov decision process (MDP) model to determine the optimal bandwidth allocation between P2P and NP2P traffic on an hourly basis. The MDP model is designed to maximize the total utility for all users given a particular number of bandwidth allocation changes during the day. By solving this model for various numbers of changes, we show that as the number of changes increases, the total utility increases but at a decreasing rate. Since increasing the number of allocation changes comes with additional costs of monitoring, our approach gives managers a way to locate themselves at the number of changes they are comfortable with, knowing how much additional utility they may be forsaking by limiting the number of allocation changes.

5.1. Using goal programming to estimate P2P demand per user for all times and allocations

The first step in the optimization process is to use the observed P2P traffic data to estimate traffic at all times and bandwidth allocations. Suppose there are 19 possible bandwidth allocations (from a 5% NP2P to a 95% NP2P allocation in steps of 5%), and 24 time periods in a day. Then the observed traffic data has 24 data points, one for each hour (e.g., 55% at hour 1, 40% at hour 2, ..., 60% at hour 24) based on the current policy. Since we want to evaluate all possible bandwidth allocations to find the optimal one, we need to interpolate from these 24 data points to all possible (19 × 24 = 456) data points. Let $D^{pa}_t$ be the total P2P demand, in megabits per second, in each period $t$ and allocation $a$, and $D^{p\bar a}_t$ be the observed per-user P2P traffic in period $t$ at the current allocation $\bar a_t$. Let $D^n_t$ be the corresponding total NP2P demand, irrespective of bandwidth allocation, since we assume (after confirming with UITS that this is close to reality) that the current observed demand is the unconstrained demand. Let $S$ be the size of the pipe (105 Mbps for UITS). All bandwidth allocations, $a$, in this paper are NP2P allocations; the corresponding P2P allocation is $1-a$. We estimate the demand at the lowest NP2P allocation (5%; we do not consider 0% for computational convenience), which is also the highest P2P allocation (95%), to be 80% of the total pipe size (105 Mbps), adjusted to mimic the variation in NP2P demand by hour.
The adjustment by hour reflects the fact that at certain hours the unconstrained demand is likely to be very low (early mornings) or high (around 4 p.m.). In this case, the network is provisioned to operate at a mean maximum utilization of 80% of the total pipe capacity in order to effectively handle spikes in traffic. We incorporate this feature into our model. We denote the maximum P2P traffic size in megabits as $\hat D^p_t$, and the corresponding maximum over all time periods of NP2P demand as $\hat D^n$. Thus, $D^{p,0.05}_t = \hat D^p_t$ at the lowest bandwidth allocation, $a = 0.05$. For purposes of convenience, we henceforth use $\bar a$ to denote the set of currently observed allocations (including the estimated demand at allocation $a = 0.05$). To ensure that the unconstrained demand is in fact higher than all observed demand, we compute $\hat D^p_t$ as

$$\hat D^p_t = \max_{\bar a}\left(0.8S - D^n_t,\; D^{p\bar a}_t\right)$$

We use goal programming to interpolate P2P demand in hours and at allocations not observed in the data. To do this, we first evaluate the various ways P2P demand can decrease as the P2P allocation decreases in any hour. These include linear, convex, and concave decreasing functions. Some of the functions depend on allocation alone, hour alone, or hour and allocation, and some include hour and allocation interaction terms. We evaluate these functions for robustness by using them in 50 random demand scenarios, where both demand and


Table 3
Robustness summary for different demand equations (total absolute deviation)

Model form | Mean | Std. dev. | Min. | Max.
$\tilde D^{pa}_t = y_{1t} - y_2 a$ | 642.26 | 76.68 | 407.77 | 820.68
$\tilde D^{pa}_t = y_{1t} - y_2 a^2$ | 740.09 | 90.6 | 455.6 | 977.53
$\tilde D^{pa}_t = y_{1t} - y_2 a^{0.5}$ | 575.57 | 67.88 | 389.37 | 726.06
$\tilde D^{pa}_t = y_{1t} - y_2 a - y_3 a \hat D^p_t$ | 433.4 | 79.0 | 233.39 | 647.43
$\tilde D^{pa}_t = y_{1t} - y_2 a^2 - y_3 a^2 \hat D^p_t$ | 561.96 | 96.87 | 287.86 | 837.85
$\tilde D^{pa}_t = y_{1t} - y_2 a^{0.5} - y_3 a^{0.5} \hat D^p_t$ | 355.16 | 68.0 | 204.08 | 509.75

the allocation are randomized. The NP2P and P2P bandwidth demanded for each hour in each of the 50 randomly generated scenarios is independently drawn from different distributions, where each of the distributions used is based on observed data. NP2P demand scenarios are independently drawn from a truncated normal distribution with a mean of 45.2 Mbps, a standard deviation of 18.7 Mbps, a maximum of 71 Mbps, and a minimum of 10 Mbps. P2P demand scenarios are independently drawn from a truncated normal distribution with a mean of 13.8 Mbps, a standard deviation of 12 Mbps, a maximum of 33 Mbps, and a minimum of 1 Mbps. The NP2P pipe allocations for each hour are independently drawn from a discrete distribution where each of the NP2P allocations used in the study (10%, 15%, 20%, and so on) has an equal probability of occurrence. We investigate a subset of generalized P2P demand functions consisting of the NP2P allocation (which determines the available P2P bandwidth), the time period, and any potential interactions between allocation and time. The objective of the simulation experiments is to validate the form of the P2P demand equation (convex, concave, or linear) we employ, as well as to validate the usefulness of the goal program in estimating unconstrained P2P demand. If there were no evidence of interaction between allocation and time (that is, if the interaction coefficient $y_3$ were equal to zero), there would be no need to employ a goal program; linear extrapolation would be a perfectly reasonable approach for estimating demand. We do not explicitly investigate variations in the demand functions using different exponent values in an attempt to minimize the total absolute deviation, as this would add little value to the experimental validation. Such an exercise would capture slight changes in the slope of the demand function, but would not add any insight into the underlying form of the demand equation. Table 3 shows a summary of the results, where $\tilde D^{pa}_t$ is the estimate of the actual demand $D^{pa}_t$. Notice that the function in the last row gave the best fit, and we use this function in our goal program. The demand function is

$$\tilde D^{pa}_t = y_{1t} - y_2 a^{0.5} - y_3 a^{0.5} \hat D^p_t \qquad (1)$$
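The random scenarios used in the robustness experiment above could be generated along the following lines; the distribution parameters are those reported in the text, while the library choice, seeding, and names are an illustrative sketch of ours.

```python
# A sketch of drawing the 50 random demand/allocation scenarios described
# above; only the distribution parameters come from the paper.
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

def trunc_normal(mean, sd, lo, hi, size):
    a, b = (lo - mean) / sd, (hi - mean) / sd   # standardized truncation bounds
    return truncnorm.rvs(a, b, loc=mean, scale=sd, size=size, random_state=rng)

def one_scenario(hours=24):
    np2p = trunc_normal(45.2, 18.7, 10, 71, hours)          # NP2P demand, Mbps
    p2p = trunc_normal(13.8, 12.0, 1, 33, hours)            # P2P demand, Mbps
    alloc = rng.choice(np.arange(0.10, 0.96, 0.05), hours)  # NP2P allocations
    return np2p, p2p, alloc

scenarios = [one_scenario() for _ in range(50)]
```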

Using the goal program we determine the three sets of coefficients: $y_{1t}$ for each $t$, and $y_2$ and $y_3$. Notice that in the demand function (1), the first term depends on time, the second depends on allocation and is convex (that is, the demand drops off rapidly as the NP2P allocation increases and then levels off), and the last term is an interaction term. The use of a convex function for modeling general categories of file-transfer applications, which are very elastic, is discussed in [1,17]. Since time is not monotonic with demand in our model, we use the allocation and the maximum demand in each period in the interaction term. The goal program is detailed below. For purposes of convenience, we henceforth use $\bar a$ to denote the set of currently observed allocations and the allocation $a = 0.05$. The input data are:
$D^{p\bar a}_t$: the P2P user demand in megabits for all the observed allocations $\bar a$, where for $\bar a = 0.05$ we have $D^{p,0.05}_t = \hat D^p_t$.

The variables are:

$y_{1t}$, $y_2$, $y_3$: the coefficients for Eq. (1);
$S^+_{\bar a t}$, $S^-_{\bar a t}$: the positive and negative deviations from the actual observed demand at allocations $\bar a$.

[Fig. 3. The drop-off in P2P demand as NP2P bandwidth allocation increases, as estimated by the goal program, using UITS data. P2P demand is plotted against the NP2P allocation (0-100%) for representative hours: 1 AM, 4 AM, 8 AM, 12 Noon, 4 PM, 8 PM, and 12 Midnight.]

The problem then is to find the values for the coefficients that solve the following goal program:

$$\text{Minimize } \sum_{\bar a}\sum_{t}\left(S^+_{\bar a t} + S^-_{\bar a t}\right)$$

subject to

$$\tilde D^{p\bar a}_t + S^+_{\bar a t} - S^-_{\bar a t} = D^{p\bar a}_t \quad \text{for each } \bar a, t \qquad (2)$$

$$\hat D^p_t \ge \tilde D^{pa}_t \ge \underline D^p_t \quad \text{for each } a, t$$

$$\tilde D^{p,0.05}_t \ge \tilde D^{p,0.95}_t \quad \text{for each } t$$

$$S^+_{\bar a t},\, S^-_{\bar a t} \ge 0$$

In the above formulation, $\underline D^p_t$ is the minimum P2P demand in each $t$, and may be set to zero; $\tilde D^{pa}_t$ is from (1). The first constraint ensures that the estimated demand is equal to the observed demand, and if not, the deviation is taken up by the positive or negative deviation variables. Since these deviations are being minimized in the objective function, at least one of them will be zero in the solution for each $\bar a$ and $t$. The second constraint ensures that the estimated demand is between the minimum and maximum for each allocation $a$ and time $t$. The third constraint ensures that the fitted demand is non-increasing in allocation for each time. The last constraint ensures that the positive and negative deviations are non-negative. The solution to the optimization gives the values of the three coefficients in (1), which can then be used to predict the P2P demand for any unobserved time and allocation. Basically, we are using the observed demands in 48 cells (24 at the current allocations and 24 at the 5% allocation level that is set to the maximum demand) to estimate demand in the remaining 408 cells (of the total 19 × 24 = 456 cells), much as we would using a regression model. Fig. 3 illustrates the result of the estimation of P2P demands using UITS data.
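As a sketch of how this goal program might be set up in practice, the following uses the open-source PuLP modeling library; the inputs shown (current allocations, demands, the unconstrained maximum) are placeholders of ours, not the UITS data.

```python
# A minimal sketch of the goal program (2) onward, assuming PuLP; all input
# data below are illustrative placeholders, not the paper's inputs.
import pulp

T = range(24)                                     # hours of the day
A = [round(0.05 * k, 2) for k in range(1, 20)]    # NP2P allocations 0.05..0.95
a_cur = {t: 0.40 for t in T}                      # current policy (example)
D_hat = {t: 80.0 for t in T}                      # max P2P demand estimate (Mb)
D_obs = {(t, a_cur[t]): 30.0 for t in T}          # observed demand (example)
D_obs.update({(t, 0.05): D_hat[t] for t in T})    # 5% anchor set to the maximum

prob = pulp.LpProblem("p2p_demand_fit", pulp.LpMinimize)
y1 = pulp.LpVariable.dicts("y1", list(T))         # hour effects y_1t
y2 = pulp.LpVariable("y2")
y3 = pulp.LpVariable("y3")
keys = list(D_obs)
Sp = pulp.LpVariable.dicts("Splus", keys, lowBound=0)
Sm = pulp.LpVariable.dicts("Sminus", keys, lowBound=0)

def D_est(t, a):
    # demand function (1); linear in the coefficients, so this is an LP
    return y1[t] - y2 * a ** 0.5 - y3 * a ** 0.5 * D_hat[t]

prob += pulp.lpSum(Sp[k] + Sm[k] for k in keys)           # objective
for (t, a), d in D_obs.items():                           # constraint (2)
    prob += D_est(t, a) + Sp[(t, a)] - Sm[(t, a)] == d
for t in T:
    for a in A:                                           # min/max bounds
        prob += D_est(t, a) <= D_hat[t]
        prob += D_est(t, a) >= 0.0
    prob += D_est(t, 0.05) >= D_est(t, 0.95)              # non-increasing fit
prob.solve()
```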

5.2. Utility to users

We next discuss the computation of the utility to users of P2P and NP2P applications from different bandwidth allocations. Using the solution of the goal program to estimate the total P2P demand for all times and allocations, and using the NP2P demand for all times and allocations as discussed before, it is possible to estimate the download transfer rates that users experience using a G/D/1 queueing model. The use of the G/D/1 model is supported by our initial traffic characterization. It should be noted that other assumptions, such as G/G/1, could be employed if warranted. We measure transfer rates with respect to bandwidth. In networking, it is common to refer to the bandwidth requirements associated with specific applications. This is the number of bits per second an application needs to transmit over the network to perform at an acceptable level. This acceptable performance level is difficult to quantify because some applications require a fixed

Suppose m^p_t is the number of users involved in only P2P activity in period t, m^n_t is the number of users involved in only NP2P activity in period t, and m^b_t is the number of users involved in both P2P and NP2P activity in period t. We have already estimated the P2P demand for all allocations and times as \tilde{D}^{pa}_t megabits. As discussed earlier, we assume that the observed NP2P demand is the unconstrained demand, since the current policy allocations are designed to meet unconstrained NP2P demand; therefore, we use the observed NP2P demand of D^n_t megabits as the demand at all bandwidth allocations. If this assumption were not true, one could always use our goal programming approach to estimate demands at different allocations.

Let v^p_t and v^n_t be the coefficients of variation of the time between arrivals of P2P and NP2P demand, respectively, in period t. If S is the capacity of the Internet link in Mbps (105 Mbps at UConn) and a_t is the allocation for NP2P traffic in period t, then the rate of transfer for NP2P traffic per user, r^{na}_t, can be computed for a G/D/1 model using the approximation in Hopp and Spearman [38, p. 270], where W^{na}_t is the waiting time in the system, d^n_t is the NP2P demand per user in megabits, and q^{na}_t = D^n_t / (S a_t) is the utilization of NP2P traffic. When q^{na}_t < 1,

W^{na}_t = \left[ \frac{(v^n_t)^2}{2} \, \frac{q^{na}_t}{1 - q^{na}_t} + 1 \right] \frac{d^n_t}{S a_t}          (3)

and since r^{na}_t = d^n_t / (W^{na}_t m^n_t), when q^{na}_t < 1,

r^{na}_t = \frac{2 S a_t (1 - q^{na}_t)}{\left[ (v^n_t)^2 q^{na}_t + 2 (1 - q^{na}_t) \right] m^n_t}          (4)

Similarly, for P2P users the rate of transfer per user, r^{pa}_t, can be computed when q^{pa}_t = \tilde{D}^{pa}_t / (S (1 - a_t)) < 1 as

r^{pa}_t = \frac{2 S (1 - a_t) (1 - q^{pa}_t)}{\left[ (v^p_t)^2 q^{pa}_t + 2 (1 - q^{pa}_t) \right] m^p_t}          (5)
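The following sketch, a direct transcription of (4) and (5) rather than production code, evaluates the per-user transfer rate for either traffic class. The two formulas share one structure, differing only in the slice of link capacity and the user population, and we return zero when utilization reaches 1, consistent with the zeroed cells in the example tables of Section 6. The inputs in the usage lines are illustrative placeholders.

def transfer_rate(slice_mbps, demand_mb, cv_arrivals, n_users):
    """Per-user download rate (Mbps) on a G/D/1 link slice, per (4)-(5)."""
    q = demand_mb / slice_mbps                  # utilization of the slice
    if q >= 1.0 or n_users == 0:
        return 0.0
    return (2.0 * slice_mbps * (1.0 - q)
            / ((cv_arrivals ** 2 * q + 2.0 * (1.0 - q)) * n_users))

S, a = 105.0, 0.75                              # link capacity (Mbps), NP2P share
r_n = transfer_rate(S * a, demand_mb=45.0, cv_arrivals=1.18, n_users=1587)
r_p = transfer_rate(S * (1 - a), demand_mb=40.0, cv_arrivals=1.16, n_users=402)
print(round(r_n * 1000), round(r_p * 1000))     # rates in kbps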
We next compute the utility to the users from the transfer rates at different bandwidth allocations and times. Suppose u^{pa}_t is the utility for users involved in only P2P activity if the bandwidth allocation is a and the time period is t, and suppose the corresponding utilities for users involved in only NP2P and in both P2P and NP2P activity are u^{na}_t and u^{ba}_t, respectively. Using the standard 5-point method of utility assessment [39], we can evaluate u^{pa}_t and u^{na}_t. For the purposes of our analysis, we use the constant-risk-aversion exponential utility functions shown below. u^{ba}_t can be assessed using multi-attribute utility theory; we assume mutual utility independence (MUI) because we feel preferences for an uncertain choice involving different NP2P transfer rates are independent of the P2P transfer rates. The resulting utility function can be represented by a function such as the one shown below for u^{ba}_t:

u^{pa}_t = 1 - e^{-r^{pa}_t / d^p},
u^{na}_t = 1 - e^{-r^{na}_t / d^n},
u^{ba}_t = b_1 u^{na}_t + b_2 u^{pa}_t + b_3 u^{na}_t u^{pa}_t.          (6)

Here d^n and d^p are delay tolerances (much like risk tolerances) for NP2P and P2P delays; the lower the value, the lower the tolerance. The coefficients in the third equation should add to 1. The utilities for a particular hour are shown in Fig. 4, where the parameters used for the illustration are d^n = 150, d^p = 50, b_1 = 0.8, b_2 = 0.3 and b_3 = -0.1. Once the individual user utilities for different rates of download are computed from (6), the total utility across all users, U^a_t, can be computed as

U^a_t = m^p_t u^{pa}_t + m^n_t u^{na}_t + m^b_t u^{ba}_t.          (7)
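A small sketch of (6) and (7) under the Fig. 4 parameter values; note that b_3 = 1 - b_1 - b_2 = -0.1, which is what makes the MUI weights add to 1. The transfer rates and user counts passed in at the bottom are illustrative.

from math import exp

D_N, D_P = 150.0, 50.0                  # delay tolerances (Fig. 4 values)
B1, B2, B3 = 0.8, 0.3, -0.1             # MUI weights, summing to 1

def per_user_utilities(r_n_kbps, r_p_kbps):
    """Eq. (6): utilities for NP2P-only, P2P-only and dual users."""
    u_n = 1.0 - exp(-r_n_kbps / D_N)    # constant-risk-aversion form
    u_p = 1.0 - exp(-r_p_kbps / D_P)
    u_b = B1 * u_n + B2 * u_p + B3 * u_n * u_p
    return u_n, u_p, u_b

def total_utility(m_n, m_p, m_b, u_n, u_p, u_b):
    """Eq. (7): utility aggregated over the three user populations."""
    return m_p * u_p + m_n * u_n + m_b * u_b

u_n, u_p, u_b = per_user_utilities(25.0, 100.0)
print(total_utility(m_n=1587, m_p=402, m_b=336, u_n=u_n, u_p=u_p, u_b=u_b))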

[Figure: utility per user (0.00-0.70) plotted against the NP2P bandwidth allocation (0-100%) at 10 p.m., with curves for NP2P, P2P and both.]
Fig. 4. The utility per user at different allocations for NP2P, P2P and both users at 10 p.m. for UITS data.
Fig. 4 describes the utility per user at 10 p.m. for NP2P users, P2P users, and simultaneous users of both NP2P and P2P applications, given different NP2P bandwidth allocations. During this time period the rate of transfer is fairly low for NP2P users at all bandwidth allocations, because there are many NP2P users relative to the total bandwidth demanded. This results in approximately zero utility for NP2P users at bandwidth allocations below 50%: NP2P users can only achieve very low or negligible transfer rates at these allocations. At the same time, the number of P2P users is small relative to the total bandwidth demanded. The transfer rate for P2P users is high when the NP2P pipe allocation is below 50%, and the P2P utility is high: P2P users have a lot of available bandwidth and obtain high transfer rates, while the NP2P users have little bandwidth and obtain very low transfer rates. We observe that as the NP2P allocation increases (say from 20% to 50%), the utility of P2P users decreases in an exponential manner. This is because both the number of P2P users and the total bandwidth available to them decrease: fewer users and a lower transfer rate per user. When the NP2P bandwidth allocation reaches 50%, we see a substantial decrease in P2P utility and begin to notice a linear increase in NP2P utility; the bandwidth available to P2P users becomes so constrained that users begin to balk in large numbers. The utility of NP2P users begins to increase as the NP2P allocation increases and more bandwidth becomes available to NP2P users, increasing the transfer rate per NP2P user. We observe a linear increase in NP2P utility because NP2P user demand is treated as inelastic with respect to the allocation, while P2P demand is elastic: NP2P users do not balk when the transfer rate is low, they simply have very small or zero utility. At an NP2P allocation of 80% the transfer rate for P2P users is essentially zero, and correspondingly the utility for P2P users is zero. Although the total bandwidth available to NP2P users is fairly high when the allocation is 80%, the NP2P transfer rates are still low because of the large number of NP2P users; as expected, NP2P utility remains relatively low but increases linearly with the bandwidth allocation. Based on the delay tolerance values (150 for NP2P users and 50 for P2P users), P2P users are much more patient in waiting for downloads and are more flexible with respect to the bandwidth available than NP2P users. These characteristics are captured in the utility functions illustrated in Fig. 4. In the next section we present an MDP model to evaluate the effect of restricting the number of allocation changes made during a day.

5.3. The MDP model

At present UITS makes 4 bandwidth allocation changes per day: from hours 2 to 6 the NP2P bandwidth allocation is set to 40%, for hours 7 through 18 it is set to 90%, for hours 19 through 22 it is set to 75%, and from hour 23 through hour 1 it is set to 60%. The trade-off involved in the choice of the number of allocation changes that should be made during a day is the following. If numerous bandwidth allocation changes are made during a day, total utility for the user community is likely to be very high; however, this comes at the cost of increased monitoring expense for the network administrators. On the other hand, if very few bandwidth allocation changes are made, monitoring expenses are minimal, but the utility for the user community is also reduced. Instead of trying to calculate the disutility to network administrators of changing bandwidth allocations, we determine the optimal allocations given that a fixed number of allocation changes is permitted. Then, by varying the number of changes allowed, it becomes possible to see exactly how much utility is lost by reducing the number of changes permitted.

This approach allows the network administrators to choose the number of allocation changes they are comfortable with.

We model the problem of determining the optimal bandwidth allocation as a Markov decision process. Suppose f^a_t(m, \bar{a}) is the optimal utility from time t onward when the NP2P bandwidth allocation in period t is a, m allocation changes remain, and \bar{a} was the allocation in hour 1. We can solve for f^a_t(m, \bar{a}) using the following dynamic programming recursion:

f^a_t(m, \bar{a}) = \max \begin{cases} \text{Change to } \tilde{a} \ne a: & U^{\tilde{a}}_t + f^{\tilde{a}}_{t+1}(m - 1, \bar{a}) \\ \text{No change:} & U^{a}_t + f^{a}_{t+1}(m, \bar{a}) \end{cases}          (8)

with the terminal condition, at t = T,

f^a_T(m, \bar{a}) = \begin{cases} U^a_T & \text{if } a \ne \bar{a}, m \ge 1, \text{ or } a = \bar{a}, m = 0 \\ -9999 & \text{otherwise} \end{cases}

The terminal condition assesses a large penalty if the state at T (hour 24) has more (or fewer) changes remaining than it should at that point in time. If the state has no remaining changes permitted, the allocations in hour 24 and hour 1 must be exactly aligned; if the state has one or more remaining changes permitted, the allocation in hour 24 should be different from the allocation in hour 1, since the wraparound from hour 24 back to hour 1 consumes a change. By recursively following the sequence of optimal decisions taken, the optimal bandwidth allocations a^*_t can be obtained using the following algorithm:

Suppose # changes permitted = M;
Suppose # allocation levels = N (19 in our work);
Suppose # periods = T (24 in our work);
s := -999999;
FOR z := 1 TO N DO
  FOR x := 1 TO N DO
    IF f^z_1(M, x) > s THEN BEGIN
      s := f^z_1(M, x); a^*_1 := z; \bar{a} := x;
    END;
\tilde{m} := M; {the number of changes remaining}
FOR y := 2 TO T DO BEGIN
  a^*_y := the allocation selected by the maximizing branch of f^{a^*_{y-1}}_{y-1}(\tilde{m}, \bar{a}) in (8);
  IF a^*_y <> a^*_{y-1} THEN \tilde{m} := \tilde{m} - 1;
END;
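For readers who prefer running code, here is a compact sketch of recursion (8) and its terminal condition under one consistent reading: the allocation in force during a period earns that period's utility, and the wraparound change back to the hour-1 allocation consumes one remaining change. The utility table is the rounded Section 6 example, so the optima it reports (for instance 1566 for two changes) match the values given there only up to the rounding of the printed tables.

from functools import lru_cache

ALLOCS = [0.05, 0.25, 0.50, 0.75, 0.95]   # allocation levels (N = 19 at UITS)
T, M = 6, 2                               # periods and permitted changes
PENALTY = -9999.0

# U[t][i]: total utility (7) in period t under allocation ALLOCS[i]
U = [[405, 340, 204, 171, 259],
     [360, 350, 420, 307, 322],
     [232, 108,   0,  95, 196],
     [179,   0,   0,  54, 151],
     [143, 141,  23, 151, 216],
     [138, 124,   0, 103, 174]]

@lru_cache(maxsize=None)
def f(t, i, m, i_bar):
    """Best utility from period t on, with allocation i in period t,
    m changes left, and a wraparound back to allocation i_bar required."""
    if t == T - 1:                        # terminal condition
        ok = (i != i_bar and m >= 1) or (i == i_bar and m == 0)
        return U[t][i] if ok else PENALTY
    best = U[t][i] + f(t + 1, i, m, i_bar)              # no change
    if m > 0:
        for j in range(len(ALLOCS)):                    # change to level j
            if j != i:
                best = max(best, U[t][i] + f(t + 1, j, m - 1, i_bar))
    return best

value, start = max((f(0, i, M, i), i) for i in range(len(ALLOCS)))
print(value, "starting at NP2P allocation", ALLOCS[start])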

[Figure: total optimal utility (9000-9500) plotted against the number of allocation changes (2-8), with the maximum achievable level marked.]
Fig. 5. Total optimal utility for different # allocation changes for UITS data. Notice the additional gain in utility levels off. The UITS policy of 4 changes has a utility of 8132.

[Figure: optimal NP2P allocation (0-100%) by hour (1-23) for 2, 4 and 8 permitted changes, with the UITS 4-change policy shown for comparison.]
Fig. 6. Optimal NP2P allocations for different permitted number of allocation changes.
Fig. 5 shows the optimal total utility as the number of permitted bandwidth allocation changes varies. We note that the current UITS policy of 4 allocation changes (utility of 8132) is 13.7-15.25% worse than the policies obtained from the MDP model. Further, the improvement in utility levels off as the number of permitted allocation changes increases. The figure also shows the maximum possible total utility if the number of bandwidth allocation changes permitted were unlimited. Also notice that even with the 4 allocation changes used by UITS, the MDP model, with a utility of 9323, is 14.6% better. Fig. 6 shows the optimal NP2P allocations for different permitted numbers of allocation changes. Comparing the UITS policy with those obtained from the MDP model, we note that the UITS policy is conservative from 7 a.m. through midnight, but too aggressive in favor of P2P during the early morning hours (1 a.m. through 6 a.m.). The discontinuity in the 18th hour in the case of 8 changes, for example, takes advantage of the small improvement in utility between the 10% and 95% allocations.

6. An illustrative numerical example

In this section we illustrate our methodology using a simple numerical example. For simplicity of exposition, we use only 6 time periods and 5 NP2P bandwidth allocations (0.05, 0.25, 0.50, 0.75 and 0.95). Suppose the data collected from NP2P and P2P users is as shown in the table below (the current bandwidth allocation level for each time period, shaded in the original table, is given in the second column). Let the size of the Internet pipe, S, be 105 Mbps.
Time   Current NP2P    NP2P traffic                         P2P traffic          Both
       allocation      D^n_t (Mbps)   m^n_t    v^n_t        m^p_t    v^p_t       m^b_t
1      0.25            45.00          1,587    1.18         402      1.16        336
2      0.50            25.00          1,074    1.03         321      1.13        192
3      0.95            65.00          765      1.03         250      1.10        94
4      0.95            70.00          522      1.14         191      1.07        51
5      0.75            50.00          448      1.04         134      1.06        29
6      0.50            60.00          364      1.11         132      1.09        25

The input to the goal program used to estimate the P2P demand for all time periods and bandwidth allocations is shown in the table below. Note that the user demand at the 5% allocation has been estimated as the higher of (i) the observed demand at the current allocation for that hour and (ii) 80% of the 105 Mbps pipe times the ratio of the observed demand in that hour to the maximum observed demand, that is,

\hat{D}^p_t = \max\left( 0.8 S \, \frac{D^{p\hat{a}}_t}{\max_t D^{p\hat{a}}_t}, \; D^{p\hat{a}}_t \right).

P2P user demand (Mb); a 0 denotes an unobserved cell:

Time   0.05   0.25   0.50   0.75   0.95
1      54     45     0      0      0
2      30     0      25     0      0
3      78     0      0      0      65
4      84     0      0      0      70
5      60     0      0      50     0
6      72     0      60     0      0
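As a quick check of the 5% column, the rule above can be evaluated directly; with S = 105 the ceiling is 0.8 x 105 = 84 Mb, scaled by each hour's share of the peak observed demand.

S = 105.0
observed = [45.0, 25.0, 65.0, 70.0, 50.0, 60.0]   # demand at the current allocation
peak = max(observed)
d_hat = [max(0.8 * S * d / peak, d) for d in observed]
print(d_hat)   # [54.0, 30.0, 78.0, 84.0, 60.0, 72.0]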
On solving the goal program in (2), the objective function value is 10. The coefficients y_{1t}, y_2 and y_3 and the positive and negative deviations are shown below; y_2 = 5.73 and y_3 = 0.15, and only four deviations are nonzero:

Time   y_{1t}   S^+ at (0.05, 0.25, 0.50, 0.75, 0.95)   S^- at (0.05, 0.25, 0.50, 0.75, 0.95)
1      52.01    0.0, 0.0, 0.0, 0.0, 0.0                 5.1, 0.0, 0.0, 0.0, 0.0
2      32.31    0.0, 0.0, 0.0, 0.0, 0.0                 0.0, 0.0, 0.0, 0.0, 0.0
3      82.27    0.3, 0.0, 0.0, 0.0, 0.0                 0.0, 0.0, 0.0, 0.0, 0.0
4      88.17    0.0, 0.0, 0.0, 0.0, 0.0                 0.0, 0.0, 0.0, 0.0, 0.0
5      62.95    0.0, 0.0, 0.0, 0.0, 0.0                 0.4, 0.0, 0.0, 0.0, 0.0
6      75.76    0.0, 0.0, 3.9, 0.0, 0.0                 0.0, 0.0, 0.0, 0.0, 0.0

The estimated demands, \tilde{D}^{pa}_t, for all time periods and allocations can then be derived from (1), as shown in the table below (e.g., for cell (2, 0.25) the calculation is 32.31 - 5.73(0.25)^{0.5} - 0.15(30)(0.25)^{0.5} = 27.2).

Estimated P2P demand (Mbps):

Time   0.05   0.25   0.50   0.75   0.95
1      49     45     42     40     38
2      30     27     25     23     22
3      78     73     70     67     65
4      84     79     75     72     70
5      60     55     52     50     48
6      72     67     64     61     59

For NP2P and P2P users, we can now compute the transfer rates for downloads, in kbps, from (4) and (5):

NP2P download rate r^{na}_t (kbps):

Time   0.05   0.25   0.50   0.75   0.95
1      0      0      5      21     33
2      0      2      28     50     67
3      0      0      0      26     59
4      0      0      0      22     69
5      0      0      9      85     135
6      0      0      0      68     133

P2P download rate r^{pa}_t (kbps):

Time   0.05   0.25   0.50   0.75   0.95
1      82     56     19     0      0
2      153    115    65     8      0
3      91     25     0      0      0
4      102    0      0      0      0
5      332    205    1      0      0
6      251    112    0      0      0

The utility per user can then be computed from (6). For our illustrative example we use d^n = 220, d^p = 50, b_1 = 0.8, b_2 = 0.3 and b_3 = -0.1 to obtain the following:

NP2P utility per user u^{na}_t:

Time   0.05   0.25   0.50   0.75   0.95
1      0.00   0.00   0.02   0.09   0.14
2      0.00   0.01   0.12   0.20   0.26
3      0.00   0.00   0.00   0.11   0.23
4      0.00   0.00   0.00   0.10   0.27
5      0.00   0.00   0.04   0.32   0.46
6      0.00   0.00   0.00   0.27   0.45

P2P utility per user u^{pa}_t:

Time   0.05   0.25   0.50   0.75   0.95
1      0.81   0.68   0.32   0.00   0.00
2      0.95   0.90   0.73   0.15   0.00
3      0.84   0.39   0.00   0.00   0.00
4      0.87   0.00   0.00   0.00   0.00
5      1.00   0.98   0.03   0.00   0.00
6      0.99   0.89   0.00   0.00   0.00

Utility per user for users of both, u^{ba}_t:

Time   0.05   0.25   0.50   0.75   0.95
1      0.24   0.20   0.11   0.07   0.11
2      0.29   0.28   0.30   0.21   0.21
3      0.25   0.12   0.00   0.09   0.19
4      0.26   0.00   0.00   0.08   0.21
5      0.30   0.30   0.04   0.26   0.37
6      0.30   0.27   0.00   0.21   0.36

The total utility for all users can then be computed from (7):

Total utility of users U^a_t:

Time   0.05   0.25   0.50   0.75   0.95
1      405    340    204    171    259
2      360    350    420    307    322
3      232    108    0      95     196
4      179    0      0      54     151
5      143    141    23     151    216
6      138    124    0      103    174
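As a spot check of (7) against this table, the (t = 1, a = 0.05) entry can be recomputed from the rounded per-user utilities above and the user counts in the data table; the small gap to the printed 405 reflects only the rounding of the inputs.

m_p, m_n, m_b = 402, 1587, 336
u_p, u_n, u_b = 0.81, 0.00, 0.24
print(m_p * u_p + m_n * u_n + m_b * u_b)   # 406.3, vs. 405 in the table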

On running the MDP for different numbers of allowed allocation changes, we get the following results:

Allocation changes   Utility   Optimal bandwidth allocations (t = 1, ..., 6)
2                    1568      0.05, 0.05, 0.05, 0.05, 0.95, 0.95
3                    1568      0.05, 0.05, 0.05, 0.05, 0.95, 0.95
4                    1628      0.05, 0.50, 0.05, 0.05, 0.95, 0.95
5                    1628      0.05, 0.50, 0.05, 0.05, 0.95, 0.95

The maximum achievable utility for this data is 1628. Note that the number of allocation changes shown includes the allocation change for the wraparound from time period 6 to time period 1. Also notice that the total utility levels off to the maximum achievable. If the manager feels that the extra utility gained in moving from 2 to 4 changes is not commensurate with the extra oversight required, then she may choose to use only 2 bandwidth allocation changes during the day.

7. Managerial takeaways and conclusions

Traffic shaping is an important tool for network management. While equipment exists to implement traffic shaping policies, there is very little research on how those policies should be set or how often they should be changed during the course of a day.

In this paper we have presented the issues that network managers have to contend with and the data issues that make setting up a cogent policy so difficult. The lack of data on network traffic at bandwidth allocations not in use is a common problem; it is almost impossible to try out all the different allocations at different times to see what the implications of allocation changes may be. We hope that our goal programming approach will begin to address this issue. Our approach can be extended to the case of more than the two types of traffic, P2P and NP2P, that we have considered in this paper. Even when data on traffic at different bandwidth allocations is available, simple heuristics may not do a good job of determining a good allocation. We present an easy-to-implement MDP model for determining the optimal bandwidth allocation. For the sizes of problems we ran (24 time periods, 19 allocation levels) on a 2.2 GHz Intel Pentium processor, the model ran in a fraction of a second. The cost inputs for this model are difficult to estimate in practice, but a sensitivity analysis approach of trying out the implications of various cost scenarios can be an effective tool to address this concern. There is certainly room for further research on the cost inputs needed for any approach to traffic shaping. Our work has been welcomed by UITS at the University of Connecticut [37].

Appendix A. List of universities using packet shaping

Ad hoc list of universities/colleges involved in packet shaping with Packeteer(TM), obtained from online forum archives between March 2004 and February 2005. Representatives from the listed universities actively participated in newsgroup discussions concerning Packeteer(TM) configuration and traffic management issues. This listing is not exhaustive. http://www.stanford.edu/group/networking/netlists/

1. Alvernia College
2. Auburn University
3. Beloit College
4. Boston College
5. Boston University
6. Brescia University
7. Bucknell University
8. Buffalo State University
9. Cal State Chico
10. Cal State San Bernardino
11. California Institute of the Arts
12. Campbellsville University
13. Carthage College
14. Colorado State University
15. Creighton University
16. Dalhousie University
17. Denison University
18. Eastern Michigan University
19. Ferris State University
20. Florida State University
21. Gainesville College
22. Gannon University
23. Grand Valley State University
24. Guilford College
25. Hardin Simons University
26. Hiram College
27. Indiana University, Pennsylvania
28. Iowa State University
29. Kent State University
30. Kings College
31. Knox College
32. McMaster University
33. Millersville University
34. Montana State University
35. Niagara University
36. North Park University, Chicago
37. Northwest Missouri State University
38. Northwestern University
39. Oberlin College
40. Pepperdine University
41. Plymouth State University
42. Princeton University
43. Principia College
44. Providence College
45. Quinnipiac University
46. Radford University
47. Randolph-Macon College
48. Ripon College
49. Ryerson University
50. Saint Vincent College
51. San Francisco State University
52. Skidmore College
53. Southeastern Louisiana University
54. Southwest Wisconsin Technical College
55. SUNY Cortland
56. Texas A&M University
57. The Catholic University of America
58. The Citadel
59. The San Diego Community College District
60. Union College
61. University at Albany
62. University of Arizona
63. University of California, Davis
64. University of California, Irvine
65. University of Connecticut
66. University of Georgia
67. University of Hartford
68. University of Idaho
69. University of Maryland
70. University of Massachusetts, Lowell
71. University of Minnesota
72. University of Montana
73. University of Montevallo
74. University of New Brunswick
75. University of New Haven
76. University of Northern Alabama
77. University of Notre Dame
78. University of Pittsburgh
79. University of Western Ontario
80. University of Winnipeg
81. University of Wisconsin, LaCrosse
82. University of Wisconsin, Madison
83. University of Wisconsin, Whitewater
84. Washington and Jefferson College
85. Wentworth Institute of Technology
86. Western Washington University
87. Wheaton College
88. William and Mary
89. Wofford College
90. Xavier University

References
[1] S. Kalyanasundaram, E.K.P. Chong, N.B. Shroff, Optimal resource allocation in multi-class networks with user-specified utility functions, Computer Networks 38 (2002) 613-630.
[2] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, F. True, Deriving traffic demands for operational IP networks: Methodology and experience, IEEE/ACM Transactions on Networking 9 (3) (2001) 265-279.
[3] K. Harfoush, A. Bestavros, J. Byers, Measuring bottleneck bandwidth of targeted path segments, IEEE INFOCOM, 2003, pp. 2079-2089.
[4] R. Prasad, C. Dovrolis, M. Murray, K.C. Claffy, Bandwidth estimation: Metrics, measurement techniques, and tools, IEEE Network (November/December) (2003) 1-12.
[5] M. Ripeanu, A. Iamnitchi, I. Foster, Mapping the Gnutella network, IEEE Internet Computing (January-February) (2002) 50-57.
[6] R. Schollmeier, A. Dumanois, Peer-to-peer traffic characteristics, in: 9th Open European Summer School and IFIP Workshop on Next Generation Networks, Budapest, Hungary, September 2003, pp. 8-10.
[7] Cisco Systems, Inc., Managing peer-to-peer traffic with Cisco service control technology, White Paper. Available from: <http://www.cisco.com/application/pdf/en/us/guest/products/ps6150/c1244/cdccont_0900aecd8023500d.pdf>. Downloaded 6/30/05.
[8] S. Sen, J. Wang, Analyzing peer-to-peer traffic across large networks, IEEE/ACM Transactions on Networking 12 (2) (2004) 219-232.
[9] D. Plonka, University of Wisconsin, Madison Napster traffic measurement, 2000. Available from: <http://net.doit.wisc.edu/data/Napster/>. Downloaded 9/2/05.
[10] D. Joachim, University gets tough on P2P, Internetweek.com, February 18, 2004. Available from: <http://www.internetweek.com/security02/showArticle.jhtml?articleID=17701191>. Downloaded 2/20/05.
[11] K. Aberer, M. Punceva, M. Hauswirth, R. Schmidt, Improving data access in P2P systems, IEEE Internet Computing (January-February) (2002) 58-67.
[12] Cache Logic, Understanding peer-to-peer: The options available to service providers. Available from: <http://www.cachelogic.com/p2p/p2pchoices.php>. Downloaded 6/30/05.
[13] K.P. Gummadi, R.J. Dunn, S. Saroiu, S.D. Gribble, H.M. Levy, J. Zahorjan, Measurement, modeling, and analysis of a peer-to-peer file sharing workload, in: Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP), 2003.
[14] S. Saroiu, P.K. Gummadi, S.D. Gribble, H.M. Levy, An analysis of Internet content delivery systems, in: Proceedings of the 5th Symposium on Operating System Design and Implementation (OSDI), 2002.
[15] T. Dawson, Traffic shaping, O'Reilly LinuxDevCenter.com, August 24, 2000. Available from: <http://www.linuxdevcenter.com/pub/a/linux/2000/08/24/LinuxAdmin.html>. Downloaded 11/5/04.
[16] Natural Micro Systems, Definition of packet shaping, 2004. Available from: <http://www.google.com/search?hl=en&lr=&oi=defmore&q=define:Traffic+Shaping>. Downloaded 10/29/04.
[17] Z. Cao, E.W. Zegura, Utility max-min: An application-oriented bandwidth allocation scheme, Proceedings of IEEE INFOCOM (2) (1999) 793-801.
[18] Clavister, Traffic shaping introduction, 2005. Available from: <http://www.clavister.com/manuals/ver8.5x/manual/traffic_shaping/traffic_shaping_overview.htm>. Downloaded 7/02/05.
[19] V. Firoiu, J.L. Boudec, D. Towsley, Z. Zhang, Theories and models for Internet quality of service, Proceedings of the IEEE 90 (9) (2002) 1565-1591.
[20] K. Park, M. Sitharam, S. Chen, Quality of service provision in noncooperative networks with diverse user requirements, Decision Support Systems, Special Issue on Information and Computational Economics 28 (2000) 101-122.
[21] X. Chang, K.R. Subramanian, An approximation algorithm for optimal resource allocation in multi-service broadband networks, Proceedings of IEEE ICC (3) (2000) 1315-1319.
[22] K. Shiomoto, S. Chaki, N. Yamanaka, A simple bandwidth management strategy based on measurements of instantaneous virtual path utilization in ATM networks, IEEE Transactions on Networking 6 (5) (1998) 625-634.
[23] M.C. Yuang, Y.R. Haung, Bandwidth assignment paradigms for broadband integrated voice/data networks, Computer Communications 21 (1998) 243-253.
[24] S. Saroiu, P.K. Gummadi, S.D. Gribble, A measurement study of peer-to-peer file sharing systems, in: Proceedings of Multimedia Computing and Networking (MMCN), 2002.
[25] M.S. Kim, H.J. Kang, J.W. Hong, Towards peer-to-peer traffic analysis using flows, Working paper, Distributed Processing and Network Management Laboratory, Department of Computer Science and Engineering, Pohang University of Science and Technology, Republic of Korea, 2003. Available from: <http://dpnm.postech.ac.kr/papers/DSOM/03/P2P/camera-ready/L45.pdf>. Downloaded 6/25/04.
[26] Z. Ge, D.R. Figueiredo, S. Jaiswal, J. Kurose, D. Towsley, Modeling peer-to-peer file sharing systems, IEEE INFOCOM (3) (2003) 2188-2198.
[27] M.O. Junginger, Y. Lee, A self-organizing publish/subscribe middleware for dynamic peer-to-peer networks, IEEE Network (January/February) (2004) 38-43.
[28] D.A. Menasce, Scalable P2P search, IEEE Internet Computing (March-April) (2003) 83-87.
[29] J. Mischke, B. Stiller, A methodology for the design of distributed search in P2P middleware, IEEE Network (January/February) (2004) 30-37.
[30] T.S.E. Ng, Y.H. Chu, S.G. Rao, K. Sripanidkulchai, H. Zhang, Measurement-based optimization techniques for bandwidth-demanding peer-to-peer systems, IEEE INFOCOM (3) (2003) 2199-2209.
[31] U. Savagaonkar, E.K.P. Chong, R.L. Givan, Online bandwidth provisioning in multi-class networks, Computer Networks 44 (2004) 835-853.
[32] I. Habib, T. Saadawi, Dynamic bandwidth control in ATM networks, Computer Communications 22 (1999) 317-339.
[33] A. Hung, G. Kesidis, Bandwidth scheduling for wide-area ATM networks using virtual finishing times, IEEE/ACM Transactions on Networking 4 (1) (1996) 49-54.
[34] W.C. Chan, E. Geraniotis, Near-optimal bandwidth allocation for multi-media virtual circuit switched networks, Proceedings of IEEE INFOCOM (2) (1996) 749-757.
[35] UITS, Personal correspondence or conversations with the University of Connecticut's University Information Technology Services (UITS), Phil Rodrigues and Mike Lang, 2002.
[36] L.L. Peterson, B.S. Davie, Computer Networks: A Systems Approach, 1996. ISBN 1-55860-368-9.
[37] UITS, Personal correspondence or conversations with the University of Connecticut's University Information Technology Services (UITS), Mike Lang; specifically, comments pertaining to the completion of the research project and this paper, 2004.
[38] W.J. Hopp, M.L. Spearman, Factory Physics, 2nd ed., 2001. ISBN 0256247951.
[39] W.L. Winston, Operations Research: Applications and Algorithms, 4th ed., Duxbury Press, 2004.
