
Enabling Flow-level Latency Measurements across Routers in Data Centers

Parmjeet Singh, Myungjin Lee, Sagar Kumar, Ramana Rao Kompella
Latency-critical applications in data centers
 Guaranteeing low end-to-end latency is important
 Web search (e.g., Google’s instant search service)
 Retail advertising
 Recommendation systems
 High-frequency trading in financial data centers

 Operators want to troubleshoot latency anomalies


 End-host latencies can be monitored locally
 Detection, diagnosis, and localization across the network: routers/switches have no native support for latency measurements
Prior solutions
 Lossy Difference Aggregator (LDA)
 Kompella et al. [SIGCOMM ’09]
 Aggregate latency statistics

 Reference Latency Interpolation (RLI)


 Lee et al. [SIGCOMM ’10]
 Per-flow latency measurements
 RLI is more suitable because it provides more fine-grained measurements
Deployment scenario of RLI
 Upgrading all switches/routers in a data center network
 Pros
 Provides the finest granularity of latency anomaly localization
 Cons
 Significant deployment cost
 Possible downtime of entire production data centers

 In this work, we consider partial deployment of RLI


 Our approach: RLI across Routers (RLIR)
Overview of RLI architecture

[Figure: a router with ingress interface I and egress interface E]

 Goal
 Per-flow latency statistics between a pair of interfaces

 Problem setting
 Storing a timestamp for each packet at ingress and egress is infeasible due to high storage and communication cost
 Regular packets do not carry timestamps
Overview of RLI architecture

[Figure: delay vs. time at the egress. The Reference Packet Injector at ingress I sends reference packets R1 and R2; the Latency Estimator at egress E interpolates the delays of regular packets L1 and L2 along the line connecting the reference packets' delays]

 Premise of RLI: delay locality
 Approach
1) The injector sends reference packets at regular intervals
2) Each reference packet carries its ingress timestamp
3) Linear interpolation: the latency estimator computes per-packet latency estimates (see the sketch below)
4) Per-flow estimates are obtained by aggregating per-packet estimates
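A minimal Python sketch of the interpolation step, assuming the estimator already has the timestamps and delays of the two reference packets that bracket a regular packet in time; function and variable names are illustrative, not from the RLI implementation:

from collections import defaultdict

def interpolate_delay(t_pkt, t_ref1, d_ref1, t_ref2, d_ref2):
    # Estimate a regular packet's delay from the delays of the two
    # reference packets that bracket it in time (delay-locality premise).
    if t_ref2 == t_ref1:
        return d_ref1
    slope = (d_ref2 - d_ref1) / (t_ref2 - t_ref1)
    return d_ref1 + slope * (t_pkt - t_ref1)

def per_flow_estimates(packet_estimates):
    # packet_estimates: iterable of (flow_id, estimated_delay) pairs.
    # Aggregate per-packet estimates into a per-flow mean delay.
    total, count = defaultdict(float), defaultdict(int)
    for flow_id, delay in packet_estimates:
        total[flow_id] += delay
        count[flow_id] += 1
    return {f: total[f] / count[f] for f in total}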
Full vs. Partial deployment

[Figure: six switches in two rows (Switch 1-6), with RLI senders (Reference Packet Injectors) and RLI receivers (Latency Estimators) placed on switch interfaces]

 Full deployment: 16 RLI sender-receiver pairs

 Partial deployment: 4 RLI senders + 2 RLI receivers
 81.25% deployment cost reduction


Case 1: Presence of cross traffic

[Figure: six switches (Switch 1-6); cross traffic joins the regular traffic's path at an intermediate switch and shares a bottleneck link with it, while link utilization is estimated locally on Switch 1]

 Issue: Inaccurate link utilization estimation at the sender leads to a high reference packet injection rate
 Approach
 Do not actively address the issue
 Evaluation shows little impact on the packet loss rate
 Details in the paper
Case 2: RLI Sender side

[Figure: six switches (Switch 1-6) with RLI senders and receivers; traffic from a sender can reach more than one receiver]

 Issue: Traffic may take different routes at an intermediate switch
 Approach: The sender sends reference packets to all receivers
Case 3: RLI Receiver side

[Figure: six switches (Switch 1-6) with RLI senders and receivers; the receiver observes packets that may have arrived over different intermediate routes]

 Issue: Hard to associate reference packets with the regular packets that traversed the same path
 Approaches
 Packet marking: requires native support from routers
 Reverse ECMP computation: 'reverse' engineer intermediate routes using the ECMP hash function (see the sketch below)
 IP prefix matching: applicable only in limited situations
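A minimal Python sketch of the reverse-ECMP idea, assuming the receiver knows each switch's equal-cost next hops and can reproduce its hash function; the hash and the table layout here are hypothetical, not the paper's:

import hashlib

def ecmp_next_hop(five_tuple, next_hops):
    # Pick a next hop the way an ECMP switch might: hash the 5-tuple and
    # index into the equal-cost next-hop list. Illustrative hash only;
    # real switches use vendor-specific hash functions.
    key = ",".join(map(str, five_tuple)).encode()
    h = int(hashlib.md5(key).hexdigest(), 16)
    return next_hops[h % len(next_hops)]

def reverse_ecmp_path(five_tuple, ecmp_table, src, dst):
    # Replay each switch's ECMP decision to reconstruct the route a
    # packet took between sender and receiver.
    # ecmp_table: {(switch, dst): [equal-cost next hops]} (hypothetical).
    path = [src]
    switch = src
    while switch != dst:
        switch = ecmp_next_hop(five_tuple, ecmp_table[(switch, dst)])
        path.append(switch)
    return path

# A reference packet and a regular packet can be associated whenever
# reverse_ecmp_path() yields the same intermediate route for both.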
Deployment example in fat-tree topology

[Figure: fat-tree topology with RLI senders (Reference Packet Injectors) and RLI receivers (Latency Estimators); IP prefix matching applies in some parts of the tree, and reverse ECMP computation / IP prefix matching in others]
Evaluation

 Simulation setup
 Trace: regular traffic (22.4M pkts) + cross traffic (70M pkts)
 Simulator

[Figure: simulation pipeline. A trace divider splits the trace into regular and cross traffic; the RLI sender at Switch 1 injects reference packets at a 10% or 1% injection rate; a cross traffic injector adds the cross traffic; the RLI receiver at Switch 2 runs the latency estimator]

 Results
 Accuracy of per-flow latency estimates
Accuracy of per-flow latency estimates

[Figure: CDFs of the relative error of per-flow latency estimates at bottleneck link utilizations of 67% and 93%, each with 1% and 10% reference packet injection rates; the marked relative errors are 1.2%, 4.5%, 18%, and 31%]
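Relative error here presumably follows the standard definition (the slide does not spell it out): relative error = |estimated per-flow delay - true per-flow delay| / true per-flow delay.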
Summary

 Low-latency applications in data centers
 Localization of latency anomalies is important

 RLI provides flow-level latency statistics, but full deployment (i.e., at all routers/switches) is expensive

 Proposed a solution enabling partial deployment of RLI
 Little loss in localization granularity (i.e., localization to every other router)
Thank you! Questions?
