Types of Failure Detection: Applies To

9/9/12
Document
Solaris 10 IP Multipathing (IPMP) Link-based Only Failure Detection [ID 1008064.1]

Modified: Dec 1, 2011 Type: HOWTO Migrated ID: 211105 Status: PUBLISHED Priority: 3
Applies to:
Solaris SPARC Operating System - Version: 10 3/05 to 10 8/11 [Release: 10.0 to 10.0] Solaris x64/x86 Operating System - Version: 10 3/05 to 10 8/11 [Release: 10.0 to 10.0] All Platforms
Goal
Explain the different types of failure detection modes used by IP Multipathing (IPMP) and in particular how to set up IPMP in link-based only failure detection mode. Provide simple configuration examples.
Solution
Description
Most of the information is already documented in the IP Services/IPMP part of the Solaris[TM] System Administration Guide (816-4554). This document is a short summary of the failure detection types, with typical and recommended configuration examples that use link-based failure detection only. Link-based failure detection was already supported before Solaris[TM] 10 (provided that the network driver in use supports DLPI link up/down notifications), but it is now possible to use this failure detection type on its own, without any probing (probe-based failure detection).

Steps to Follow
IPMP Link-based Only Failure Detection with Solaris[TM] 10 Operating System (OS)

Contents:
1. Types of Failure Detection
1.1. Link-based Failure Detection
1.2. Probe-based Failure Detection
2. Configuration Examples using Link-based Failure Detection only
2.1. Single Interface
2.2. Multiple Interfaces
2.2.1. Active-Active
2.2.1.1. Two Interfaces
2.2.1.2. Two Interfaces + logical
2.2.1.3. Three Interfaces
2.2.2. Active-Standby
2.2.2.1. Two Interfaces
2.2.2.2. Two Interfaces + logical
3. References

1. Types of Failure Detection

1.1. Link-based Failure Detection
Link-based failure detection is always enabled, provided the interface supports it, whether or not the optional probe-based failure detection is used. As specified in PSARC/1999/225, network drivers send the asynchronous DLPI notifications DL_NOTE_LINK_DOWN (link/NIC is down) and DL_NOTE_LINK_UP (link/NIC is up). IP uses these UP and DOWN notifications to set and clear the IFF_RUNNING flag, which, in the absence of such notifications, is always set for an interface that is up. Failure detection software immediately detects changes to IFF_RUNNING. Support for these DLPI notifications was added to network drivers gradually, and almost all drivers have supported them since Solaris 10.
https://support.oracle.com/epmos/faces/ui/km/DocumentDisplay.jspx?_adf.ctrl-state=hfngveqxz_266 1/5
With link-based failure detection, only the link between the local interface and its link partner is checked, at the hardware layer. Neither the IP layer nor any further network path is monitored! No test addresses are required for link-based failure detection.
For more information, please refer to Solaris 10 System Administration Guide: IP Services >> IPMP >> 30. Introducing IPMP (Overview) >> Link-Based Failure Detection

1.2. Probe-based Failure Detection
Probe-based failure detection is performed on each interface in the IPMP group that has a test address. Using this test address, ICMP probe messages go out over this interface to one or more target systems on the same IP link. The in.mpathd daemon determines dynamically which target systems to probe:
- All default routes on the same IP link are used as probe targets.
- All host routes on the same IP link are used as probe targets (see Configuring Target Systems).
- If neither default nor host routes are available, in.mpathd sends an all-hosts multicast to 224.0.0.1 in IPv4 and ff02::1 in IPv6 to find neighbor hosts on the link.
Note: Available probe targets are determined dynamically, so the in.mpathd daemon does not have to be restarted.

The in.mpathd daemon probes all the targets separately through all the interfaces in the IPMP group. The probing rate depends on the failure detection time (FDT) specified in /etc/default/mpathd (default 10 seconds), with 5 probes sent in each such time frame. If 5 consecutive probes fail, in.mpathd considers the interface to have failed. The minimum repair detection time is twice the failure detection time, 20 seconds by default, because replies to 10 consecutive probes must be received.

Without any configured host routes, the default route is used as the single probe target in most cases. In this case the whole network path up to the gateway (router) is monitored at the IP layer. If all interfaces in the IPMP group are connected via redundant network paths (switches etc.), you get full redundancy. On the other hand, the default router can become a single point of failure, resulting in 'All Interfaces in group have failed'. Even with the default gateway down, it can make sense not to fail the whole IPMP group and to keep allowing traffic within the local network. In this case specific probe targets (hosts or active network components) can be configured via host routes. Which network path you want to monitor is therefore a question of network design.

A test address is required on each interface in the IPMP group, but the test addresses can be in a different IP subnet than the data address(es). Private network addresses as specified by RFC 1918 (e.g. 10/8, 172.16/12, or 192.168/16) can therefore be used as well.
For more information, please refer to Solaris 10 System Administration Guide: IP Services >> IPMP >> 30. Introducing IPMP (Overview) >> Probe-Based Failure Detection
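The timing figures quoted for probe-based failure detection follow from simple arithmetic. The sketch below only restates that arithmetic; the function name and the choice of milliseconds as the unit are assumptions made for illustration, and the real in.mpathd may adapt its probe rate in ways that are not modeled here.

```python
def probe_timing(failure_detection_time_ms=10000, probes_per_fdt=5):
    """Derive probe interval and minimum repair detection time from the FDT.

    Hypothetical helper: it mirrors the numbers given in the text
    (5 probes per failure detection time, repair time of twice the FDT).
    """
    return {
        # Average spacing between probes if 5 probes are spread over one FDT.
        "probe_interval_ms": failure_detection_time_ms / probes_per_fdt,
        # An interface is declared failed after this many consecutive losses.
        "probes_to_fail": probes_per_fdt,
        # Repair needs replies to twice as many consecutive probes.
        "probes_to_repair": 2 * probes_per_fdt,
        "min_repair_detection_time_ms": 2 * failure_detection_time_ms,
    }


if __name__ == "__main__":
    for key, value in probe_timing().items():
        print(f"{key}: {value}")
```

With the default of 10 seconds, this gives a probe roughly every 2 seconds and a minimum repair detection time of 20 seconds, matching the values stated above.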
2. Configuration Examples using Link-based Failure Detection

An IPMP configuration typically consists of two or more physical interfaces on the same system that are attached to the same IP link. These physical interfaces might or might not be on the same NIC. The interfaces are configured as members of the same IPMP group.

A single interface can also be configured in its own IPMP group. A single-interface IPMP group behaves the same way as an IPMP group with multiple interfaces, except that failover and failback cannot occur.

The following message indicates a link-based only failure detection configuration. It is reported for each interface in the group.
/var/adm/messages
in.mpathd[144]: [ID 975029 daemon.error] No test address configured on interface ce0; disabling probe-based failure detection on it
So in this configuration the message is not an error, but rather a confirmation that probe-based failure detection has been disabled correctly.

2.1. Single Interface
/etc/hostname.ce0
192.168.10.10 netmask + broadcast + group ipmp0 up
# ifconfig -a
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.10 netmask ffffff00 broadcast 192.168.10.255
        groupname ipmp0
        ether 0:3:ba:93:90:fc
2.2. Multiple Interfaces
2.2.1. Active-Active
2.2.1.1. Two Interfaces
/etc/hostname.ce0
192.168.10.10 netmask + broadcast + group ipmp0 up

/etc/hostname.ce1
group ipmp0 up
# ifconfig -a
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.10 netmask ffffff00 broadcast 192.168.10.255
        groupname ipmp0
        ether 0:3:ba:93:90:fc
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp0
        ether 0:3:ba:93:91:35
2.2.1.2. Two Interfaces + logical
/etc/hostname.ce0
192.168.10.10 netmask + broadcast + group ipmp0 up \
addif 192.168.10.11 netmask + broadcast + up

/etc/hostname.ce1
group ipmp0 up
# ifconfig -a
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.10 netmask ffffff00 broadcast 192.168.10.255
        groupname ipmp0
        ether 0:3:ba:93:90:fc
ce0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.11 netmask ffffff00 broadcast 192.168.10.255
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp0
        ether 0:3:ba:93:91:35
2.2.1.3. Three Interfaces
/etc/hostname.ce0
192.168.10.10 netmask + broadcast + group ipmp0 up

/etc/hostname.ce1
group ipmp0 up

/etc/hostname.bge1
group ipmp0 up
# ifconfig -a
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp0
        ether 0:9:3d:11:91:1b
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.10 netmask ffffff00 broadcast 192.168.10.255
        groupname ipmp0
        ether 0:3:ba:93:90:fc
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname ipmp0
        ether 0:3:ba:93:91:35
2.2.2. Active-Standby
2.2.2.1. Two Interfaces
/etc/hostname.ce0
192.168.10.10 netmask + broadcast + group ipmp0 up

/etc/hostname.ce1
group ipmp0 standby up
# ifconfig -a
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.10 netmask ffffff00 broadcast 192.168.10.255
        groupname ipmp0
        ether 0:3:ba:93:90:fc
ce0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
ce1: flags=69000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 0 index 5
        inet 0.0.0.0 netmask 0
        groupname ipmp0
        ether 0:3:ba:93:91:35
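The flags field that ifconfig prints is a hexadecimal bit mask, and the names in angle brackets are its decoded form. The following sketch decodes values such as 1000843 and 69000842 and checks for RUNNING, the flag that link-based failure detection relies on. The bit values are assumptions taken from the Solaris interface flag definitions and cover only the flags that appear in this note; the function names are hypothetical.

```python
# Assumed Solaris interface flag bits (only those appearing in this note).
IFF_FLAGS = {
    0x00000001: "UP",
    0x00000002: "BROADCAST",
    0x00000040: "RUNNING",
    0x00000800: "MULTICAST",
    0x01000000: "IPv4",
    0x08000000: "NOFAILOVER",
    0x20000000: "STANDBY",
    0x40000000: "INACTIVE",
}


def decode_flags(hex_value):
    """Return the flag names set in a hex string as printed by ifconfig."""
    value = int(hex_value, 16)
    names = [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]
    known = sum(bit for bit in IFF_FLAGS if value & bit)
    if value != known:
        # Report bits this small table does not know about instead of hiding them.
        names.append(f"UNKNOWN(0x{value - known:x})")
    return names


def link_is_up(hex_value):
    """True if IFF_RUNNING is set, i.e. the driver reports the link as up."""
    return "RUNNING" in decode_flags(hex_value)


if __name__ == "__main__":
    for flags in ("1000843", "69000842"):
        print(flags, ",".join(decode_flags(flags)), link_is_up(flags))
```

An interface whose link has gone down would be expected to lose the RUNNING bit, which is what in.mpathd reacts to in a link-based only configuration.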
2.2.2.2. Two Interfaces + logical
/etc/hostname.ce0
192.168.10.10 netmask + broadcast + group ipmp0 up \
addif 192.168.10.11 netmask + broadcast + up

/etc/hostname.ce1
group ipmp0 standby up
# ifconfig -a
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.10 netmask ffffff00 broadcast 192.168.10.255
        groupname ipmp0
        ether 0:3:ba:93:90:fc
ce0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.10.11 netmask ffffff00 broadcast 192.168.10.255
ce0:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
ce1: flags=69000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 0 index 5
        inet 0.0.0.0 netmask 0
        groupname ipmp0
        ether 0:3:ba:93:91:35
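The /etc/hostname.* files in the examples of section 2 follow one pattern: a single interface carries the data address(es), every other member only joins the group, and standby members add the standby keyword. As a rough illustration of that pattern, the sketch below generates the file contents. The function and its parameters are hypothetical and not part of Solaris; it covers only the layouts shown in this note and writes nothing to disk.

```python
def hostname_files(primary, data_addresses, others=(), standby=(), group="ipmp0"):
    """Return {path: contents} for a link-based only IPMP group.

    primary        interface that carries the data address(es)
    data_addresses first entry is the main address, the rest become addif lines
    others         additional active members without an address
    standby        members configured as standby
    """
    if not data_addresses:
        raise ValueError("at least one data address is required")
    first, *extra = data_addresses
    lines = [f"{first} netmask + broadcast + group {group} up"]
    lines += [f"addif {addr} netmask + broadcast + up" for addr in extra]
    # Logical addresses continue the same line with a trailing backslash.
    files = {f"/etc/hostname.{primary}": " \\\n".join(lines) + "\n"}
    for ifname in others:
        files[f"/etc/hostname.{ifname}"] = f"group {group} up\n"
    for ifname in standby:
        files[f"/etc/hostname.{ifname}"] = f"group {group} standby up\n"
    return files


if __name__ == "__main__":
    # Reproduces the "Active-Standby, Two Interfaces + logical" layout.
    example = hostname_files(
        "ce0", ["192.168.10.10", "192.168.10.11"], standby=["ce1"]
    )
    for path, contents in example.items():
        print(path)
        print(contents)
```

Note that no test addresses appear anywhere in the generated files, which is what keeps the group in link-based only mode.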
3. References
in.mpathd(1M)
Solaris 10 System Administration Guide: Introducing IPMP (Overview)
Solaris 10 System Administration Guide: Administering IPMP (Tasks)

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community, Oracle Solaris Networking Community.
References
NOTE:1010640.1 - Summary of Typical Solaris IP Multipathing (IPMP) Configurations
NOTE:1382335.1 - Transitioning From Solaris 10 IP Multipathing (IPMP) to Oracle Solaris 11 IPMP