
TeXCP:

Protecting Providers' Networks from Unexpected Failures & Traffic Spikes

Dina Katabi
High Performance
Switching and Routing
Telecom Center Workshop: Sept 4, 1997.

MIT - CSAIL
dk@mit.edu
nms.csail.mit.edu/~dina
TeXCP is TE with an XCP-Like
Protocol

Crash Course in XCP


Problem Addressed by XCP:

TCP has trouble providing a few Gb/s per-flow throughput
Gigabit Links: Need to increase faster
Kb/s Link: How much faster?
Need Explicit Feedback
Need to coordinate connections!

Scalability Constraint:
Routers should not maintain per-connection state
XCP
How does XCP Work?

Congestion Header:
Round Trip Time
Throughput
Feedback = + 1 packet/sec
How does XCP Work?

Round Trip Time

Throughput

Feedback
Feedback =
- 31 packet/sec
+ packet/sec
How does XCP Work?

Rate = Rate + Feedback

Explicit Feedback
Make senders react according to the
amount of spare capacity
How Does an XCP Router Compute the
Feedback?

1. Efficiency Controller
2. Fairness Controller

Router makes decisions every control interval D


How Does an XCP Router Compute the
Feedback?
Efficiency Controller
Goal: Matches input traffic to link capacity & drains the queue
MIMD
Algorithm:
Aggregate traffic changes by $\phi$
$\phi$ ~ Spare Capacity
$\phi$ ~ - Queue Size
So, $\phi = \alpha \cdot \text{Spare} - \beta \cdot \text{Queue} / D$

Fairness Controller
Goal: Divides $\phi$ between connections to converge to fairness
AIMD
Algorithm:
If $\phi$ > 0: Divide $\phi$ equally between connections
If $\phi$ < 0: Divide $\phi$ between connections proportionally to their current rates
(shown to achieve Fairness [Jain])
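The two controllers can be sketched in a few lines of Python. This is only an illustration: the gain values 0.4 and 0.226 are the ones commonly quoted for XCP and are treated as assumptions here, the function names are hypothetical, and the explicit list of per-connection rates is a simplification, since a real XCP router keeps no per-connection state.

```python
ALPHA = 0.4    # gain on spare capacity (assumed value)
BETA = 0.226   # gain on the standing queue (assumed value)


def aggregate_feedback(capacity, input_rate, queue, interval):
    """Efficiency controller: desired change in aggregate rate.

    Follows the slide's form: gain * Spare - gain * Queue / D.
    """
    spare = capacity - input_rate
    return ALPHA * spare - BETA * queue / interval


def divide_feedback(feedback, rates):
    """Fairness controller: split the aggregate feedback between connections.

    Positive feedback is divided equally (additive increase); negative
    feedback is divided proportionally to current rates (multiplicative
    decrease).
    """
    if not rates:
        return []
    if feedback >= 0:
        share = feedback / len(rates)
        return [share] * len(rates)
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [feedback * r / total for r in rates]
```

With an empty queue and 40 units of spare capacity, the aggregate feedback is positive and every connection receives the same increase; with negative feedback, a connection sending three times faster gives up three times as much.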
Getting the devil out of the details

Efficiency Controller
$\phi = \alpha \, d_{avg} \, \text{Spare} - \beta \, \text{Queue}$
Theorem: System converges to optimal utilization (i.e., stable) for any link capacity, delay, number of sources if:
$0 < \alpha < \frac{\pi}{4\sqrt{2}}$ and $\beta = \alpha^2 \sqrt{2}$
(Proof based on Nyquist Criterion)
No Parameter Tuning

Fairness Controller
Algorithm:
If $\phi$ > 0: Divide $\phi$ equally between connections
If $\phi$ < 0: Divide $\phi$ between connections proportionally to their current rates
Need to estimate number of connections N:
$N = \sum_{\text{pkts in } D} \frac{1}{D \cdot \text{Throughput}_{pkt}}$
D: Control/Counting Interval
No Per-Flow State
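As a rough check of the two formulas on this slide, the sketch below tests a pair of gains against the stability condition and estimates N from the throughput values carried in packet headers. The function names are hypothetical, throughput is assumed to be in packets per second, and the 1% tolerance on the second gain is an arbitrary choice for illustration.

```python
import math


def gains_are_stable(alpha, beta):
    """Check 0 < alpha < pi / (4 * sqrt(2)) and beta ~= alpha**2 * sqrt(2)."""
    bound = math.pi / (4 * math.sqrt(2))
    expected_beta = alpha ** 2 * math.sqrt(2)
    # The 1% tolerance is an assumption, so rounded gains still pass.
    return 0 < alpha < bound and math.isclose(beta, expected_beta, rel_tol=1e-2)


def estimate_connections(packet_throughputs, interval):
    """Estimate N without per-flow state.

    Each packet seen during the interval contributes
    1 / (interval * throughput), where throughput is the value in that
    packet's header (assumed here to be in packets per second).
    """
    return sum(1.0 / (interval * t) for t in packet_throughputs)
```

For example, a connection sending 3 packets/sec contributes three packets of weight 1/3 in a one-second interval, so each connection adds exactly 1 to the estimate regardless of its rate.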
XCP Remains Efficient as Bandwidth or Delay Increases
[Figure, two panels: Utilization vs. Bottleneck Capacity (Mb/s); Utilization vs. Round Trip Delay (sec)]
XCP increases proportionally to spare capacity
$\alpha$ and $\beta$ chosen to make XCP robust to delay


TeXCP: Intra-Domain Traffic
Engineering with XCP

with Srikanth Kandula, Asfandyar Qureshi, Shan Sinha


What is the problem?

Provider Network
(AT&T, Sprint, BBN, ...)
What is the problem?

Boston
L.A.

Egress

Ingress

Traffic Engineering routes the traffic demands of ingress-egress (IE) pairs to achieve good network performance
Good performance means:

Network is robust to unexpected events:
Link failures
Traffic Spikes

Support as many demands as possible given link capacities

Minimize the maximum link utilization (i.e., load balancing)
To minimize Max U, use multiple paths

Boston
L.A.

Two issues:
- How much traffic to put on each path?
- How does TCP interact with multi-path routing?
How to divide traffic between available
paths to minimize max U?

Simplistic solution: give all information to a centralized computer and solve it as a linear programming problem

But that wouldn't be realtime and wouldn't react to failures and attack traffic

Need a distributed, in-network, realtime solution


Challenge 1:

Feedback Delays
Utilization
? Boston
L.A.

Utilization feedback might be obsolete


Challenge 2:

Coordination Without Global Knowledge


Boston
L.A.

NYC

SF
Actions of uncoordinated ingress nodes might cause undesirable effects
Solution Idea:
Use experience from congestion control

Congestion control & TE are close


CC: single path; want 100% utilization
TE: multi-paths; want balanced utilization
XCP tells us how fast each path can change its
utilization so that system is stable despite delay
XCP tells us how to coordinate multiple senders
who share the same path

TeXCP: traffic engineering with an XCP-like protocol
TeXCP
A TeXCP agent per IE, at ingress node

Boston
L.A.

TeXCP
Agent

TeXCP agent knows the paths between the IE pair, which are computed offline
Paths are pinned
TeXCP Agent

TeXCP divides the IE traffic between available paths to minimize the Max U
A TeXCP agent has 3 components
Probe path utilization
Load balancer

Per-path XCP controller


Component 1:

Probe Path Utilization


Periodically, send probes on each path to
learn Max utilization along the path

[Figure: path from Ingress 1 to Egress 1 with link utilizations U1 = 0.4 and U2 = 0.7]

Probes are sent along the slow path, like ICMP messages, so implementation is easy
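A small sketch of what a probe learns on its way along a path, using the two utilizations from the slide's example. The function name and the list-of-links representation are hypothetical; in a deployment, each router would update a field in the probe packet rather than the ingress holding a list.

```python
def probe_path(link_utilizations):
    """Return the maximum utilization a probe would report for one path."""
    max_u = 0.0
    for u in link_utilizations:
        # Each router along the path keeps the larger of the value already
        # carried by the probe and its own link utilization.
        max_u = max(max_u, u)
    return max_u


# Utilizations of the two links in the slide's example (U1 and U2).
path_max = probe_path([0.4, 0.7])
```

Here `path_max` is 0.7, the utilization of the most loaded link on the path.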
Component 2:
Load Balancer
Objective: Balance utilization across IE paths
How? Move traffic from over-utilized paths to under-utilized paths
Continuously estimate demands L(t)
Assume $x_p$ is the fraction of traffic that should be sent on path p
Periodically, update $x_p$:

$\Delta x_p \sim \frac{r_p(t)}{\sum_i r_i(t)} \left( \bar{u}(t) - u_p(t) \right)$

Where $r_p$ is the sending rate on path p, $u_p$ is its utilization, and $\bar{u}$ is the average utilization over the IE pair's paths
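One periodic update of the split can be sketched as follows. This is an illustration under stated assumptions, not the TeXCP implementation: the average utilization is taken to be rate-weighted, `step` is a hypothetical proportionality constant, and the clipping and renormalization at the end are added only to keep the fractions valid.

```python
def rebalance(fractions, rates, utilizations, step=1.0):
    """Shift traffic fractions from over-utilized to under-utilized paths.

    Each path p changes by step * (r_p / sum(r)) * (avg_u - u_p), where
    avg_u is assumed to be the rate-weighted average utilization.
    """
    total_rate = sum(rates)
    if total_rate == 0:
        return list(fractions)
    avg_u = sum(r * u for r, u in zip(rates, utilizations)) / total_rate
    moved = [
        x + step * (r / total_rate) * (avg_u - u)
        for x, r, u in zip(fractions, rates, utilizations)
    ]
    # With a rate-weighted average the changes sum to zero, so the
    # fractions still sum to one; clip and renormalize for safety.
    clipped = [max(0.0, x) for x in moved]
    norm = sum(clipped)
    return [x / norm for x in clipped]


new_fractions = rebalance([0.5, 0.5], [50.0, 50.0], [0.8, 0.4])
```

In this example the first path is above the average utilization of 0.6 and the second is below it, so a tenth of the traffic moves from the first path to the second.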
Component 3:
Per-Path XCP Controller
Load Balancer tells us how much to change utilization, but:
Can't increase/decrease immediately because of delays
Need to coordinate the increase/decrease from all ingress nodes

Run a lightweight XCP on each path
Replace congestion header with probes along the slow path
Change the slow path in core routers to answer probes
TeXCP Performance
Simple Topology

Ingress 1 Egress 1
L1

L2 Egress 2
Ingress 2
L3

Ingress 3 Egress 3

Traffic demands per IE pair are modeled using 100 Pareto on-off sources, or Poisson arrivals
Also, 50% of the traffic is uncontrollable by TeXCP
TeXCP balances the load across available paths, and reacts to demand changes in realtime
Started at t = 0 with Load1=1.3, Load2=0.8, Load3=0.8
[Figure: Utilization vs. Time (sec); annotation: Decrease in Uncontrollable Traffic]
TeXCP is More Accurate than Previous Approaches
[Figure, two panels: TeXCP; MATE. Utilization vs. Time (sec)]


Real Traffic and Provider Topology
TeXCP Improves Robustness Against Link Failures
[Figure, two panels: OSPF Optimizer; With TeXCP. Link Down and Link Up events are marked in each panel.]
The utilization of all Abilene links as a function of time


TeXCP Improves Robustness Against Link Failures
[Figure, two panels: Without TeXCP; With TeXCP. Max Utilization over time, with Link Down and Link Up events marked.]
TeXCP prevents congestion caused by link failures

When multiple solutions result in the same Max U, TeXCP prefers solutions with shorter paths
TeXCP Prefers Shorter Paths
[Figure: topology with Ingress1 and Ingress2]
TeXCP prefers low delay paths while balancing utilization
How about TCP?

Multi-paths must NOT reorder TCP packets

Problem:
Can you take traffic at any backbone
router and accurately split it between
multiple paths without reordering TCP
packets?
Simplistic Solution

Assign TCP flows to each path proportionally to the desired split
1. Flows are not all equal: Elephants & Mice
2. So, estimate the rate of each TCP flow
3. But rates change over time
4. Too complex
Our Approach

If a pkt enters the network after the previous pkt from the flow has left, we can reassign the flow to a different path
Use UDP traffic to correct for errors
Our Approach
Each TeXCP agent keeps a Hash Table
If (now - last_seen) > Max delay, we can change the assigned path
Assign a TCP flow to a new path with a probability equal to the fraction of traffic you want on this path

Example hash table entry:
Last Seen    Path
9920.2659    3
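The reassignment rule above can be sketched as a small Python class. This is a simplified illustration: the class and method names are hypothetical, a Python dict keyed by flow id stands in for the fixed-size hash table, and the optional `rng` argument exists only to make the example reproducible.

```python
import random


class FlowTable:
    """Maps each flow to (last_seen, path), as in the table on the slide."""

    def __init__(self, max_delay, rng=None):
        self.max_delay = max_delay
        self.rng = rng or random.Random()
        self.entries = {}  # flow_id -> (last_seen, path)

    def route(self, flow_id, now, fractions):
        """Pick the path for a packet of flow_id arriving at time now."""
        entry = self.entries.get(flow_id)
        if entry is not None and now - entry[0] <= self.max_delay:
            # The previous packet may still be in the network: keep the
            # same path so packets are not reordered.
            path = entry[1]
        else:
            # New flow, or idle longer than max_delay: choose path p with
            # probability equal to the desired fraction fractions[p].
            path = self.rng.choices(range(len(fractions)), weights=fractions)[0]
        self.entries[flow_id] = (now, path)
        return path
```

A flow that keeps sending packets closer together than `max_delay` stays pinned to its path even when the desired split changes; it only becomes eligible to move after a sufficiently long gap.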
OC12 traffic
Split it between 2 paths, with the desired split changing with time as a sinusoidal wave
[Figure: Path1 Fraction and Path2 Fraction over time; a second build adds Path1 UDP and Path2 UDP]
Very, Very Cheap. Edge routers maintain a hash table of 2^10 entries (10KB).
Conclusion

TeXCP
Adaptive multipath routing protocol
Makes the network more robust against
link failures and traffic spikes
Note TeXCP does not assume XCP

Multipath routing can be easily and effectively implemented without causing TCP packet reordering
Questions?

More Information at:


http://nms.lcs.mit.edu/~dina/texcp.html
TeXCP Reacts in Real-Time to Traffic Spikes
[Figure: Max Utilization vs. Time (sec); curves: No TeXCP, Optimal, TeXCP; annotation: Spike begins]
Comparison with MATE
[Figure: topology with Ingress 1-3 and Egress 1-3]
MATE:
Minimize delay
Conservative increase
Comparison with MATE
[Figure, two panels: TeXCP; MATE. Link load / capacity vs. Time (sec); annotation: decrease in cross traffic]
TeXCP's utilization is more balanced
Avg. drop rate in MATE is 2%, while in TeXCP it is 0%
XCP Deals Well with Short Web-Like Flows
[Figure: x-axis: Arrivals of Short Flows/sec]

Efficiency Controller
$\phi = \alpha \, d_{avg} \, \text{Spare} - \beta \, \text{Queue}$
Traffic rate changes by $\phi / d_{avg}$ every $d_{avg}$
Rate r(t) changes per time unit by
$\dot{r}(t) = \frac{\alpha}{d_{avg}} S(t) - \frac{\beta}{d_{avg}^2} Q(t)$
Theorem: System converges to optimal utilization (i.e., stable) for any link bandwidth, delay, number of sources if:
$0 < \alpha < \frac{\pi}{4\sqrt{2}}$ and $\beta = \alpha^2 \sqrt{2}$
(Proof based on Nyquist Criterion)
Stability Properties

Fairness Controller
Algorithm:
If $\phi$ > 0: Divide $\phi$ equally between flows
If $\phi$ < 0: Divide $\phi$ between flows proportionally to their current rates
Need to estimate number of connections N:
$N = \sum_{\text{pkts in } D} \frac{1}{D \cdot \text{Throughput}_{pkt}}$
D: Counting Interval
No Per-Connection State
