5.1 Introduction
In distributed data center architectures, global load balancing (GLB) virtualizes a corporate
service at the global level by giving users a single URL for accessing content from any data
center. The benefits are performance, through traffic distribution, and business continuity,
through multiple centers capable of providing the same content.
Global load balancing is a DNS based technology that processes user DNS requests for a specific
domain, by globally selecting the best data center for a specific host name, using pre-defined
criteria. Typical selection criteria are:
Once the best data center has been selected, the global load balancer answers the user's DNS
request with the best VIP address (Virtual IP address) that corresponds to the requested service.
Then, a local load balancer in the chosen data center will select the best real server for the user
request in the real server farm.
Although traditional DNS provides an efficient and scalable system for matching users with
the address of servers that contain the data they seek, the end user may not always be directed to
the best site. For example, traditional DNS has no way of knowing whether the host whose
address it returns is on-line, so the user may receive an error message stating that the
server is down. In addition, in a distributed content architecture, the selected site is not always
the best one in terms of network delay or server load.
In the GLOBE infrastructure, global load balancing technology is used for extranet access as well
as in the intranet:
In the next paragraphs, we will first explain the details of the Cisco GLB technology and then, we
will address the specific implementation for GLOBE extranet and intranet.
In the following paragraphs, we will describe the exact role of each element in the global load
balancing solution. The following figure should help you to understand the role of each element :
[Figure: global load balancing elements and message flows. Elements shown: the client, and a
CSS/CSM pair with DRP agents at each of Site 1, Site 2, Site 3 and Site n. Flows shown: (1) DNS
request, (3) DRP messages, (4) RTT probing, (5) RTT values, and VIP keep-alives.]
To introduce these principles, here is a summary of what happens in a global load balancing
network when a user wants to establish a connection:
• The requesting client sends a DNS A-record query to its local DNS proxy (1)
• The local DNS proxy determines that the GSS is authoritative for the domain and forwards
the DNS query to the GSS (2)
• If the DNS proxy server is requesting DNS resolution for the first time, the GSS proximity
database does not contain any proximity information for it. Therefore, the GSS sends the
configured default VIP answer to the D-proxy (6), which forwards the answer to the client
(7)
• At the same time, the GSS sends a DRP message to each DRP agent, which initiates an
RTT probing process from the agents towards the D-proxy (4).
• After the RTT has been determined, the measured values are sent to the GSS (5), which
stores the results in its proximity database. These entries will be used for subsequent DNS
requests from the same DNS proxy.
• In parallel, the GSS uses keep-alive messages to query the local load balancers (CSS or
CSM) for portal status and load information
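The sequence above can be sketched as a small Python model. This is only an illustration under simplifying assumptions, not the GSS implementation: the class name `ProximityGss`, the `probe` callback (standing in for the DRP agents) and the VIP labels are hypothetical, and probing is done synchronously here whereas the GSS probes in parallel with answering.

```python
from dataclasses import dataclass, field


@dataclass
class ProximityGss:
    """Toy model of the proximity database lookup (hypothetical names)."""
    default_vip: str
    # D-proxy address -> {VIP: measured RTT in milliseconds}
    proximity_db: dict = field(default_factory=dict)

    def resolve(self, dproxy: str, probe) -> str:
        """Answer a query from `dproxy`; `probe` stands in for DRP probing."""
        rtts = self.proximity_db.get(dproxy)
        if rtts:
            # Known D-proxy: return the VIP with the lowest stored RTT.
            return min(rtts, key=rtts.get)
        # First request: store the RTT values reported for this D-proxy
        # and answer with the configured default VIP.
        self.proximity_db[dproxy] = probe(dproxy)
        return self.default_vip


def fake_probe(dproxy: str) -> dict:
    # Invented RTT values, for illustration only.
    return {"vip-eur": 120.0, "vip-ams": 35.0, "vip-aoa": 210.0}


gss = ProximityGss(default_vip="vip-eur")
first = gss.resolve("192.0.2.53", fake_probe)   # default answer
second = gss.resolve("192.0.2.53", fake_probe)  # proximity-based answer
```

The first lookup from a D-proxy receives the default VIP, and later lookups from the same D-proxy receive the VIP with the lowest recorded RTT.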
• GSS Manager Primary: The primary GSSM is a GSS performing content routing as well
as centralized management functions for the GSS network. The primary GSSM serves as
the organizing point of the GSS network, hosting the embedded GSS database that
contains configuration information for all your GSS resources, such as individual GSS and
DNS rules. Other GSS devices report their status to the primary GSSM. Configuration
changes initiated on the primary GSSM using the graphical user interface (GUI) are
automatically communicated to each device that the primary GSSM manages. Any GSS
device can serve as a GSSM.
• GSS Manager Standby: The standby GSSM is a GSS performing DNS functions for the
GSS network even while operating in standby mode. In addition, the standby GSSM can be
configured to act as the GSSM should the primary GSSM go offline or become unavailable
to communicate with other GSS devices. As with the primary GSSM, the standby GSSM is
configured to run the GSSM GUI and contains a duplicate copy of the embedded GSS
database that is currently installed on the primary GSSM. Any configuration or network
changes affecting the GSS network are synchronized between the primary and the standby
GSSM so that the two devices are never out of step.
• GSS Slave: A slave GSS performs routing of DNS queries based on DNS rules and
conditions configured using the GSSM. Each GSS is known to and synchronized with the
primary GSSM, but individual GSS devices do not report their presence or status to one another.
Each GSS on the network must delegate authority to the parent domain GSS DNS server
that serves the DNS requests. Each GSS is managed separately using the Cisco CLI. GUI
support is not available on a GSS device in slave mode. However, a GSS device may also
serve as the primary GSSM if configured to do so.
As an example, here is a DNS dig lookup on one of the upstream DNS servers for the domain
connect.nestle.biz, for which two GSS devices are active (other GSS devices will be deployed, as
described in chapter 5.3):
; <<>> DiG 8.3 <<>> connect.nestle.biz NS @141.122.67.67 (the upstream DNS server)
;; QUERY SECTION:
;; connect.nestle.biz, type = NS, class = IN
;; ANSWER SECTION:
connect.nestle.biz. 1D IN NS ctrgss1.nestle.com. (The 1st GSS name)
connect.nestle.biz. 1D IN NS ctrgss2.nestle.com. (The 2nd GSS name)
;; ADDITIONAL SECTION:
ctrgss1.nestle.com. 1H IN A 141.122.188.18 (The 1st GSS IP address)
ctrgss2.nestle.com. 1H IN A 141.122.188.19 (The 2nd GSS IP address)
The GSS can be configured in multiple ways to handle received DNS lookups. However, in our
implementation, we will always consider the GSS returning an IP address corresponding to the
best VIP to use for a specific client.
The global load balancing strategy is configured in a GSS by specifying “DNS rules”. Put
simply, a DNS rule is a set of up to three balance clauses that are evaluated in order. Each clause
specifies a collection of possible VIP addresses (answers) and how to select the best one to be
returned to the client. If all sites fail to comply with the clause conditions (i.e. all sites exceed a
configured load threshold or RTT cannot be measured), the next clause is evaluated. The last
clause is often used to return a default answer to the client if the other clauses have failed.
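The ordered evaluation of balance clauses can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the GSS configuration language: the helper names (`evaluate_dns_rule`, `least_loaded`, `fixed_answer`), the VIP labels, the load values and the threshold are all hypothetical.

```python
from typing import Callable, Optional, Sequence

# A clause receives the D-proxy address and returns a VIP, or None when
# no answer in its group satisfies the clause conditions.
Clause = Callable[[str], Optional[str]]


def evaluate_dns_rule(clauses: Sequence[Clause], dproxy: str) -> Optional[str]:
    """Evaluate at most three clauses in order; the first answer wins."""
    for clause in clauses[:3]:
        answer = clause(dproxy)
        if answer is not None:
            return answer
    return None


def least_loaded(loads: dict, threshold: int) -> Clause:
    """Clause returning the least loaded VIP at or below `threshold`."""
    def clause(_dproxy: str) -> Optional[str]:
        eligible = {vip: load for vip, load in loads.items() if load <= threshold}
        return min(eligible, key=eligible.get) if eligible else None
    return clause


def fixed_answer(vip: str) -> Clause:
    """Clause that always returns the same (default) VIP."""
    return lambda _dproxy: vip


# Both sites exceed the assumed threshold, so the default clause answers.
overloaded = {"vip-eur": 255, "vip-ams": 240}
rule = [least_loaded(overloaded, threshold=200), fixed_answer("vip-default")]
fallback_answer = evaluate_dns_rule(rule, "192.0.2.53")

# With healthy sites, the first clause answers.
healthy = {"vip-eur": 90, "vip-ams": 40}
rule = [least_loaded(healthy, threshold=200), fixed_answer("vip-default")]
normal_answer = evaluate_dns_rule(rule, "192.0.2.53")
```

Modelling each clause as a function that may decline to answer keeps the fall-through behaviour explicit: a clause that cannot produce an answer simply hands over to the next one.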
The following figure shows all relevant configuration elements and their relationship:
The DNS rule is the connection point of all configuration objects that make up the global load-
balancing configuration. A DNS rule accepts an address list that specifies source IP addresses
valid for this DNS rule, a domain list that specifies for which domain name the DNS rule is destined
and up to three load-balancing clauses, together with whether or not RTT probing should be
enabled and which load balancing method should be used. Each answer group specifies a list of
possible VIP answers to be used by the load balancing clause.
[Figure: DNS rule evaluation flowchart. Boxes shown: “DNS lookup received”; “Is the lookup
coming from a configured src address list?” (Y/N); “Can Clause #1 be used?” (Y/N); further Y/N
decisions.]
TCP Keep-Alive: This keep-alive is a periodic TCP session establishment attempt to the VIP
address, using the conventional three-way handshake exchange (SYN - SYN/ACK - ACK). The
TCP port used by the keep-alive is the port used by the real service. Once a session has been
successfully established with the VIP, the GSS closes the session either with a graceful sequence
(FIN - FIN/ACK - FIN/ACK - ACK) or with a RST packet; this is a configurable option. If the
keep-alive fails after three consecutive retries, the corresponding VIP is marked inactive by the
GSS, which removes the site from its DNS routing decisions. After three successful keep-alives
(configurable), the site is declared back on-line.
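The offline/online decision described above can be sketched in Python as a small counter-based state tracker. This is an illustrative model only: the names `tcp_probe` and `VipHealth` are hypothetical, and the sketch assumes that "three consecutive retries" means three consecutive failed attempts and that the three successes needed to restore a site must also be consecutive.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Attempt a TCP three-way handshake with the VIP, then close."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class VipHealth:
    """Tracks one VIP's state from a stream of keep-alive results."""

    def __init__(self, retries: int = 3, successes_to_restore: int = 3):
        self.retries = retries
        self.successes_to_restore = successes_to_restore
        self.online = True
        self._fails = 0
        self._oks = 0

    def record(self, probe_ok: bool) -> bool:
        """Record one keep-alive result and return the resulting state."""
        if probe_ok:
            self._fails = 0
            if not self.online:
                self._oks += 1
                if self._oks >= self.successes_to_restore:
                    self.online = True
                    self._oks = 0
        else:
            self._oks = 0
            self._fails += 1
            if self.online and self._fails >= self.retries:
                self.online = False
        return self.online


# Three failures take the VIP offline; three successes bring it back.
health = VipHealth()
states = [health.record(ok) for ok in (False, False, False, True, True, True)]
```

In a real poller, `tcp_probe(vip, port)` would feed `record()` at each keep-alive interval; the interval itself is not specified here.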
KAL-AP Keep-Alive: KAL-AP (Keep-Alive Access Protocol) is a more advanced method to test a
VIP address. KAL-AP is a UDP based protocol (UDP port 5002) used between the GSS and
distributed CSS/CSM. KAL-AP is not only able to retrieve availability of a VIP, but also the
averaged load information for real portal servers. Load of each individual real server is measured
by the local load balancer (see chapter 5.4 for a detailed explanation on how server load is
measured) and an averaged value for the entire server farm is sent to the GSS via KAL-AP. Load
is a number between 2 and 255 where a load of 255 represents an offline site.
5.2.3 Proximity
Proximity is an important aspect of a global load balancing solution, especially if access has
to be established through a network where latency is highly variable, like the Internet. In such a
case, one would prefer to connect to the most proximate data centre, if it can provide the same
content, to minimize response time. Proximity is determined by the DRP protocol (Director
Response Protocol), which is used by the GSS and by DRP agents running on Cisco routers.
In each data centre, an agent running in a Cisco router is responsible for measuring the round trip
time (RTT) between itself and the requesting DNS proxy.
Currently, a DRP agent running in an IOS router uses two methods to probe the DNS proxy:
• ICMP Echo
• TCP SYN/ACK
The first probe method is a simple PING packet. If the first method fails (most probably due to
packet filtering), the DRP agent tries the second method, which is a TCP SYN/ACK packet sent to
the DNS proxy on port 53. As defined in the TCP specification (RFC 793), if a host receives a
packet with both SYN and ACK bits set, and the host has not previously sent a SYN packet, the
host should respond with a TCP reset (RST) packet, here sent back to the DRP agent. The delay
between the SYN/ACK and the RST packets is measured in milliseconds by the DRP agent and sent
to the GSS.
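The fallback between probe methods can be sketched as follows. This Python fragment only models the ordering and the timing logic: the probes are passed in as plain callables (real ICMP or TCP SYN/ACK probes need raw sockets and are outside the scope of this sketch), and the names `measure_rtt_ms`, `icmp_blocked` and `syn_ack_answered` are hypothetical.

```python
import time
from typing import Callable, Optional, Sequence

# A probe returns True when the D-proxy answered (echo reply or RST).
Probe = Callable[[str], bool]


def measure_rtt_ms(
    dproxy: str,
    probes: Sequence[Probe],
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[float]:
    """Try each probe method in order and time the first one that succeeds."""
    for probe in probes:
        start = clock()
        if probe(dproxy):
            return (clock() - start) * 1000.0
    return None  # the D-proxy could not be probed by any method


def icmp_blocked(_dproxy: str) -> bool:
    return False  # simulates ICMP echo being filtered


def syn_ack_answered(_dproxy: str) -> bool:
    return True  # simulates a RST coming back for the SYN/ACK


# A fake clock makes the example deterministic: the second probe
# starts at t = 1.0 s and is answered at t = 1.25 s.
ticks = iter([0.0, 1.0, 1.25])
rtt = measure_rtt_ms("192.0.2.53", [icmp_blocked, syn_ack_answered],
                     clock=lambda: next(ticks))
```

When every method fails, the function returns `None`, which corresponds to a D-proxy for which no RTT is available.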
QUESTIONS:
www.cisco.com, type = A, class = IN
ANSWERS:
-> www.cisco.com
internet address = 198.133.219.25
ttl = 55650 (15 hours 27 mins 30 secs)
In a global load-balancing environment, the GSS should indicate a much lower TTL value, so that
the user can periodically re-assess the best site and adapt to variations in Internet connection
performance. The GSS can therefore be configured to set the TTL to a lower value; 60 seconds is
a reasonable choice.
Here is an example applied to the domain connect.nestle.biz , where the TTL value is set to 20
seconds :
QUESTIONS:
connect.nestle.biz, type = A, class = IN
ANSWERS:
-> connect.nestle.biz
internet address = 160.213.122.146
ttl = 20 (20 secs)
Microsoft Internet Explorer ignores the DNS TTL value returned in an A-record. This means that
even if the A-record returned by the GSS contains a short TTL value, the client browser caches the
DNS answer for some fixed amount of time, typically 30 minutes. In other words, the Windows
operating system uses the returned TTL value, but the browser uses an internal DNS cache of
30 minutes.
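As a rough worked example of this effect, the following Python fragment estimates how long a browser may keep using a DNS answer. It is a simplification under stated assumptions: the function name is hypothetical, the 30-minute browser cache is the typical value quoted above, and the browser is assumed to fall back to the operating system resolver, which honours the TTL, once its own cache expires.

```python
def effective_cache_seconds(dns_ttl: int, browser_cache: int = 30 * 60) -> int:
    """Upper bound on how long a browser may reuse a DNS answer.

    The browser holds the answer for `browser_cache` seconds regardless
    of the TTL; the OS resolver holds it for `dns_ttl` seconds.
    """
    return max(dns_ttl, browser_cache)


# With the 20-second TTL used for connect.nestle.biz, the browser cache
# dominates and the answer may be reused for 30 minutes.
short_ttl = effective_cache_seconds(20)
```

In practice this means that lowering the TTL below the browser's internal cache time does not, on its own, speed up site re-selection for an already running browser.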
TCP Keep-Alive: If there is at least one active real server in the server farm, the CSS/CSM
completes the session establishment sequence. If all servers are down, the keep-alive fails.
KAL-AP Keep-Alive : If there is at least one active real server in the server farm, the CSS/CSM
returns the averaged server load for each active VIP in the CSS/CSM. The load is locally
measured on each individual real server.
This section is a detailed explanation of Global Load Balancing applied to SSL VPN access.
Note that the SSL VPN GLB implementation applies to access to the service and not to the data
the user can access. Internet users are connected to the data centre presenting the lowest network
latency, but access their local applications through the GWAN. As the response time in the GWAN
is managed (which is not the case in the Internet), an optimal response time should be observed.
[Figure: the three regional data centres, each connected to the Internet]
EUR: 2 x GSS on VLANK0, 1 x VIP on VLANK0, 2 x DRP agents on VLANO0, 2 x SSL access devices
AMS: 1 x GSS on VLANK0, 1 x VIP on VLANK0, 2 x DRP agents on VLANO0, 2 x SSL access devices
AOA: 1 x GSS on VLANK0, 1 x VIP on VLANK0, 2 x DRP agents on VLANO0, 2 x SSL access devices
The three regional data centres participate in the global load balancing architecture. A user
connected to the Internet requests access to the infrastructure using the unique URL
https://connect.nestle.biz. In each of the three data centres, a full set of GLB elements is
installed. This includes Cisco GSS, DRP agents, Cisco CSS and SSL appliances.
• Cisco GSS (G1 & G2) : Provide DNS based global load balancing functions
• DRP Agents : Perform RTT probing on request from the GSS devices
• Cisco CSS (C3 & C4) : Perform local load balancing functions
• Juniper SSL IVE : Perform SSL tunnel terminations and application relay
Important notice:
At the time of this writing, CTR (Vevey) is still providing an SSL VPN access with two GSS devices
and two SSL Appliances, as any other regional data centre. However, this description corresponds
to the strategic deployment of GLOBE, where RDC and CDC are being moved to DC-EUR.
[Figure: device layout. Labels shown: R5, R6, A, B, F7, R7, C3, L3, G1, IP55, V1, V2, IP56, G2,
L4, C4, R8, F8, IP1, IP2, and two DRP agents.]
This layout corresponds to the EUR implementation as two GSS devices are connected. In AOA
and AMS data centres, only one GSS is installed. Please refer to the key IP address assignment
table on page 109 for GSS and SSL appliances IP addresses.
5.3.2 Proximity
Proximity is measured between the three regional data centres (EUR, AOA and AMS) and any
requesting DNS proxy on the Internet.
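!
! Enable the DRP agent on this router
ip drp server
!
! Accept DRP queries only from the listed source addresses
access-list 2 permit <IP49 in EUR>
access-list 2 permit <IP50 in EUR>
access-list 2 permit <IP49 in AOA>
access-list 2 permit <IP49 in AMS>
ip drp access-group 2
!
! Authenticate DRP messages with a shared key chain
ip drp authentication key-chain gss
key chain gss
 key 1
  key-string cdcgss
!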
Note:
At the time of this writing, Nestlé is in discussion with Cisco Systems, requesting that more
probing methods be implemented in the GSS to improve the ratio of successful RTT probing.
Therefore, static entries can be entered in the GSS proximity database to direct users for whom no
RTT is available to the most proximate data centre. Different IP-to-location databases are
available on the market, and on-line mapping services exist as well (www.ip2location.com/ or
www.maxmind.com/app/ip_locate). This method is, however, somewhat cumbersome, as the static
entries have to be configured via the GSS CLI and must be manually copied from one
GSS to another.
The designed solution is more straightforward. We configure a specific DNS rule for each region
(AMS, AOA, EUR), to which we assign an address list with IP prefixes assigned by Regional
Internet Registries (RIR). The GSS will then statically return the regional VIP address to the client,
provided the VIP is available and active. Hence, we can consider the static VIP definition as a
backup answer, in case of non-responding DNS proxies.
ISPs obtain allocations of IP addresses from a Local Internet Registry (LIR) or National Internet
Registry (NIR), or from their appropriate Regional Internet Registry (RIR). IANA (Internet Assigned
Numbers Authority) is the organization that assigns addresses to these registries.
In a first step, we will configure static entries for addresses assigned to the following regional
registries (RIR):
You can see a list of all IP prefixes assigned to these registries at the following web site:
(http://www.iana.org/assignments/ipv4-address-space)
While address ranges assigned to these registries can be easily mapped to GLOBE data centers,
other address ranges are flagged by IANA as assigned to “various registries”, or to commercial
companies and other organizations. For instance, addresses between 128.0.0.0/8 and 172.0.0.0/8
are assigned to different registries worldwide, and for those addresses, a clear geographical
location cannot be determined. Therefore, for those addresses, a “wildcard” DNS rule is configured
with no address list assignment.
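The mapping from a D-proxy address to a regional DNS rule, with the wildcard rule as a catch-all, can be sketched with Python's standard `ipaddress` module. The prefixes below are placeholder documentation ranges, not real RIR allocations, and the list names `PREFIXES-AMS` and `PREFIXES-AOA` as well as the function `select_dns_rule` are assumptions made for this illustration.

```python
import ipaddress

# Placeholder prefixes (documentation ranges), NOT real RIR allocations.
ADDRESS_LISTS = {
    "PREFIXES-EUR": ["192.0.2.0/24"],
    "PREFIXES-AMS": ["198.51.100.0/24"],
    "PREFIXES-AOA": ["203.0.113.0/24"],
}


def select_dns_rule(dproxy_ip: str, address_lists: dict = ADDRESS_LISTS) -> str:
    """Return the address list matching the D-proxy, or "WILDCARD"."""
    addr = ipaddress.ip_address(dproxy_ip)
    for name, prefixes in address_lists.items():
        if any(addr in ipaddress.ip_network(prefix) for prefix in prefixes):
            return name
    # No configured prefix matches: the wildcard DNS rule applies.
    return "WILDCARD"


regional = select_dns_rule("198.51.100.7")
unmatched = select_dns_rule("150.1.1.1")
```

A D-proxy inside a configured prefix selects the corresponding regional rule, and any other address falls through to the wildcard rule.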
[Figure: DNS rule selection flowchart. A DNS lookup is received and the address of the D-proxy
determines which DNS rule is evaluated: the AMS, EUR, AOA or wildcard DNS rule. In each rule, the
"yes" branch returns the most proximate active VIP to the client. In the regional rules, the "no"
branch checks whether the regional VIP (AMS, EUR or AOA) is active. The final fallback is to
return another active VIP (round-robin).]
First, the GSS checks whether the lookup request comes from a configured address list. If so,
the appropriate DNS rule is selected. Otherwise, a wildcard DNS rule is used, which does not
specify any address list. Once the appropriate DNS rule is selected, the GSS evaluates the balance
clauses in the order in which they are defined.
All DNS rules use the same principle: The GSS tries to use proximity information (RTT) to select
the best available site. If no proximity information is available because the DNS proxy cannot be
probed, then the GSS returns the data centre VIP address mapped to the address list, if an
address list is specified in the DNS rule. If not (wildcard DNS rule), or if the regional data centre
VIP is not available, the GSS selects one other active data centre VIP in a round robin fashion.
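The three-step principle shared by all DNS rules can be sketched in Python as follows. This is a simplified model, not the GSS behaviour in every detail: the function `make_dns_rule`, the VIP labels and the RTT values are hypothetical, and passing `regional_vip=None` is the assumed way of modelling the wildcard rule.

```python
from itertools import cycle
from typing import Callable, Dict, Optional, Sequence, Set


def make_dns_rule(regional_vip: Optional[str],
                  all_vips: Sequence[str]) -> Callable[[Dict[str, float], Set[str]], Optional[str]]:
    """Build a rule; use regional_vip=None for the wildcard rule."""
    round_robin = cycle(all_vips)

    def answer(rtt_ms: Dict[str, float], active: Set[str]) -> Optional[str]:
        # Step 1: proximity. Use measured RTTs of active VIPs, if any.
        measured = {vip: rtt for vip, rtt in rtt_ms.items() if vip in active}
        if measured:
            return min(measured, key=measured.get)
        # Step 2: the VIP mapped to the address list, if it is active.
        if regional_vip is not None and regional_vip in active:
            return regional_vip
        # Step 3: another active VIP, selected round-robin.
        for _ in range(len(all_vips)):
            vip = next(round_robin)
            if vip in active:
                return vip
        return None  # no active VIP at all

    return answer


vips = ["vip-eur", "vip-ams", "vip-aoa"]
eur_rule = make_dns_rule("vip-eur", vips)

by_proximity = eur_rule({"vip-eur": 80.0, "vip-ams": 30.0}, set(vips))
by_region = eur_rule({}, set(vips))
# EUR VIP down and no RTT available: rotate over the remaining VIPs.
first_fallback = eur_rule({}, {"vip-ams", "vip-aoa"})
second_fallback = eur_rule({}, {"vip-ams", "vip-aoa"})
```

Keeping the round-robin iterator inside the rule means that successive fallback answers rotate across the remaining active data centres instead of always returning the same one.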
The Source Address List “PREFIXES-EUR” points to an address list that contains the IP prefixes
assigned to RIPE and AFRINIC.
• In the Balance Clause #1, the answer group “CONNECT-RTT-ANSGRP” contains VIP
addresses for EUR, AOA and AMS. Because “Proximity enable” is checked on this clause,
RTT probing results will be evaluated to select the closest active VIP in the network.
• If RTT probing was unsuccessful, Balance Clause #2 is executed. In this
clause, the GSS returns the EUR VIP address. This is the best decision, as the source IP
address of the received DNS lookup came from a RIPE-assigned prefix.
• If the EUR VIP is not available (i.e. both SSL appliances are down), the GSS selects
another available data centre VIP, in a round-robin fashion.
The wildcard DNS rule is used to process DNS lookups coming from IP addresses outside any
static prefixes definition:
As you can see, the Source Address List field indicates “Anywhere”.
• In the Balance Clause #1, the answer group “CONNECT-RTT-ANSGRP” contains VIP
addresses for EUR, AOA and AMS. Because “Proximity enable” is checked on this clause,
RTT probing is used and results will be evaluated to select the closest VIP in the network.
The associated balance method (Round Robin) has no significance, as only one VIP will be
returned to the client.
• If RTT probing was unsuccessful, Balance Clause #2 is executed: the
GSS selects another available data centre VIP, in a round-robin fashion.
[Figure: device layout for the DRP packet flow. Labels shown: R5, R6, HSRP2, A, B, FO1 to FO4,
VLANJ0 (502), SCZ2, F7, R7, C3, L3, G1, IP55, V1, V2, IP56, G2, L4, C4, R8, F8, and two DRP
agents.]
DRP packets sourced by the GSS and destined to DRP agents on VLANO0 in each data centre
are routed through the GSS’s default gateway HSRP6 on R7/R8. Then, the packets are routed
through firewalls F5/F6 towards the Internet using the R7/R8 default gateway FO2. DRP reply
packets are delivered directly to the GSS without passing through R7/R8.
Similarly, DNS lookup packets from Internet users are directly delivered to the GSS on VLANK0.
DNS reply packets however, are routed through R7/R8 via the GSS’s default gateway HSRP6.
[Figure: device layout for the DNS lookup and reply flow. Labels shown: R5, R6, HSRP2, A, B,
FO1 to FO4, VLANJ0 (502), SCZ2, F7, R7, C3, L3, G1, IP55, V1, V2, IP56, G2, L4, C4, R8, F8, and
two DRP agents.]
This VLANK0-to-VLANK0 communication must pass through the Internet so that the GSS
devices can keep track of the real VIP availability, as seen by an Internet user. By default,
VLANK0-to-VLANK0 communication is routed through F7/F8 to the internal network. Therefore,
static routes have to be configured in R7/R8, C3/C4 and F5/F6 to force inbound and outbound GSS
keep-alive packets to use the Internet connection.
The GSS uses its default gateway HSRP6 to send its keep-alive probes. R7/R8 then uses the
configured static routes (see paragraph 5.3.12) to remote VIP7 addresses in other data centres, to
send the probes via F5/F6. On the return path, the static routes configured in C3/C4 force the
keep-alive return packets to pass through F5/F6. On F5/F6, a static route to remote VLANK0
subnets forces the keep-alive packets to flow through the Internet.
[Figure: device layout for the GSS keep-alive flow. Labels shown: R5, R6, HSRP2, A, B, FO1 to
FO4, VLANJ0 (502), SCZ2, F7, R7, C3, L3, G1, IP55, V1, V2, IP56, G2, L4, C4, R8, F8, and two
DRP agents.]
[Figure: device layout for the SSL session flow. Labels shown: R5, R6, HSRP2, A, B, FO1 to FO4,
VLANJ0 (502), SCZ2, F7, R7, C3, L3, G1, IP55, V1, V2, IP56, G2, L4, C4, R8, F8, and two DRP
agents.]
SSL session establishment requests coming from Internet users are directly received by the CSS
on VIP7, which represents the SSL VPN connect.nestle.biz service. Then, the CSS uses a Layer-4
content rule to load balance these sessions on both SSL appliances on IP53 and IP54. No source
address NAT is performed by the CSS. Therefore, a virtual redundant interface (FO10) is
configured on both CSS and used as default gateway by the SSL appliances for the return
packets. This ensures that the return traffic flows through the same CSS as the incoming flow. For
the return flow towards the client, the CSS uses its default gateway HSRP6, which then routes the
packets through F5/F6 by using its own default gateway (FO2). Remember that R7/R8 are acting
as route servers on VLANK0, and therefore decide if a packet has to be routed in the internal
network or through the Internet.
SSL appliances appear as proxy devices for user traffic. New HTTP sessions are established
between the SSL appliances and the target server (Outlook Web Access, Documentum, etc.).
for this traffic; recall that R7/R8 receive Production Plane and Management Plane routes from
R5/R6 via BGP (see section 4.6.4)
5.3.10.3 TACACS+
GSS devices send authentication requests to the TACACS+ server, connected to
VLANX6/VLANY6 on the Management Plane. In order to guarantee that packets from the GSS are
routed through FM1/FM2, the static NAT address defined in section 6.5.6.3 on VLANP1/VLANP2 is
used as target address.
Bidirectional Rule