Atrg Clusterxl r6x r7x

ClusterXL
Advanced Technical
Reference Guide
9 February 2016
Classification: [Protected]
2015 Check Point Software Technologies Ltd. All rights reserved.

Classification: [Protected] All rights reserved | P. 1
Latest Documentation
The latest version of this document is at:
http://supportcontent.checkpoint.com/solutions?id=sk93306
http://supportcontent.checkpoint.com/documentation_download?id=25321
For additional technical information, visit the Check Point Support Center.
Revision History
Date
03 Feb 2016
14 Jan 2016
27 Dec 2015
24 Nov 2015
15 Nov 2015
14 Nov 2015
07 Sep 2015
17 Aug 2015
06 Aug 2015
28 July 2015
23 July 2015
12 June 2015
30 Mar 2015
02 Mar 2015
10 Feb 2015
05 Feb 2015
02 Feb 2015
21 Dec 2014
28 Oct 2014
26 Oct 2014
01 Oct 2014
03 Sep 2014
26 Aug 2014
05 Aug 2014
Description
Improved calculation of Destination MAC address for CCP packets

when there is no VIP configured on the involved interface
Updated the definition of 3rd party cluster to include VRRP on Gaia
Added additional related solution
Added a note about Cluster Under Load (CUL) mechanism
Added CCP versions for R76SP.10_VSLS and R76SP.30
Clarified the explanation about pingable host
Added the explanation that CCP packets are not encrypted
Improved description of Destination MAC Address of CCP packets
Added a note that throughput of Sync interface does not depend on
throughput of traffic interfaces
Added additional related solutions
Clarified the explanation about probing mechanism
Clarified the explanation about cpha_bond_ls_config.conf file
Added location of new cluster log files (starting in R77.20):
$FWDIR/log/cphaconf.elg
$FWDIR/log/cphastart.elg
$FWDIR/log/cphamcset.elg
Added description of Critical Device in VSX cluster - 'Instances'
Removed links to ClusterXL Guides R55, R60, R61, R62, R65
Added a note that starting from Gaia R75.47, R77.20, the file
$FWDIR/conf/discntd.if is not needed anymore
Updated the information when the Full Sync is performed
Corrected explanation of "Flush and ACK"
Corrected the description of CCP FWHAP_IF_PROBE_RPLY packet
Added a note that Legacy HA mode is not supported on Gaia OS
Corrected the information about Source MAC address of CCP
Added CCP versions for R77.30 and R76SP.10
Corrected the information about Source MAC address of CCP
Corrected the information about Legacy HA mode
Corrected the note: more than one Sync Network is not supported
Added new debug flag for 'cluster' module
Added documents with kernel debug flags

30 July 2014
29 July 2014
01 July 2014
30 June 2014
17 Mar 2014
09 Feb 2014
21 Jan 2014
19 Dec 2013
15 Dec 2013
08 Oct 2013
29 Sep 2013
22 Sep 2013
09 Sep 2013
15 Aug 2013
29 July 2013
18 July 2013
16 July 2013
10 July 2013
03 July 2013

Corrected the syntax for 'cphaconf -t ... add' command
Added CCP versions for R77.20 and R76SP
Added links to R77 Administration Guides
Minor spelling corrections
Added description of new Critical Device in R77 ClusterXL - 'ted'
Added a note about Crossbeam DBHA to ClusterXL requirements
Added additional information about MILS
Added additional flags for 'cphaconf ... start' command
Added some minor clarifications
First release of this document

Table of Contents:
Introduction to ClusterXL ...................................................................................................... 6
The need for gateway clusters ........................................................................................................................6
Check Point cluster solution ............................................................................................................................6
ClusterXL definitions and terms .......................................................................................................................7
ClusterXL requirements for hardware and software ......................................................................................18
State Synchronization in ClusterXL .................................................................................... 22

Introduction ....................................................................................................................................................22
State Synchronization modes ........................................................................................................................23
Restrictions ....................................................................................................................................................24
Synchronization network................................................................................................................................25
Configuring synchronization network .............................................................................................................25
Configuring a service to be non-synchronized ..............................................................................................26
ClusterXL Modes ................................................................................................................ 27

Mode comparison table .................................................................................................................................28
Example cluster topology...............................................................................................................................29
High Availability New mode ...........................................................................................................................30
Full High Availability mode.............................................................................................................................31
High Availability Legacy (Traditional) mode ..................................................................................................32
Load Sharing Multicast mode ........................................................................................................................33
Load Sharing Unicast mode ..........................................................................................................................34
VRRP .............................................................................................................................................................35
Bridge.............................................................................................................................................................36
Sticky Decision Function................................................................................................................................36
Forwarding .....................................................................................................................................................38
ClusterXL Configuration ..................................................................................................... 45

Clock synchronization ....................................................................................................................................45
Preparing cluster members............................................................................................................................45
Configuring cluster object in SmartDashboard ..............................................................................................47
Configuring routing on networks around the cluster ......................................................................................48
CCP mode .....................................................................................................................................................48
ClusterXL High Availability for IPv6 ...............................................................................................................48
Defining 'Disconnected' interfaces .................................................................................................................49
SecureXL .......................................................................................................................................................51
CoreXL ...........................................................................................................................................................51
VPN................................................................................................................................................................51
NAT ................................................................................................................................................................52
VLAN..............................................................................................................................................................52
Link Aggregation (Bonding) ...........................................................................................................................53
Monitoring the Interface Link State (MILS) ....................................................................................................60
Configuring cluster addresses on different subnets ......................................................................................60
Moving from a single gateway to a cluster ....................................................................................................62
Adding another member or interface to an existing cluster ...........................................................................62
Proxy ARP .....................................................................................................................................................63
ISP Redundancy ............................................................................................................................................64
Dynamic Routing ...........................................................................................................................................65
SNMP.............................................................................................................................................................67
Enhanced enforcement of the TCP 3-way handshake ..................................................................................67
Cluster state transitions ...................................................................................................... 69

Special notes for state transitions ..................................................................................................................69
Policy installation ...........................................................................................................................................69
State transitions of the cluster member .........................................................................................................74
State transitions due to 'FWHA_MY_STATE' packet ....................................................................................75
State transitions due to a Critical Device (Pnote)..........................................................................................76
State transitions due to the 'Interface Active Check' Critical Device (Pnote)................................................76
Actions performed by a cluster member following a state transition .............................................................77
Cluster Control Protocol (CCP) .......................................................................................... 78

Introduction ....................................................................................................................................................78
CCP and security policy rule base .................................................................................................................80
CCP internal timers........................................................................................................................................81

CCP modes ...................................................................................................................................................82

CCP and VLAN interfaces .............................................................................................................................83
CCP packet header .......................................................................................................................................85
External Header .......................................................................................................... 85

CCP Header ................................................................................................................ 90
ClusterXL Monitoring and Troubleshooting ...................................................................... 104
SmartView Tracker ......................................................................................................................................104
SmartView Monitor ......................................................................................................................................105
Clock synchronization ..................................................................................................................................106
CCP mode ...................................................................................................................................................106
SecureXL .....................................................................................................................................................106
CoreXL .........................................................................................................................................................107
VPN..............................................................................................................................................................107
NAT ..............................................................................................................................................................108
VLAN............................................................................................................................................................109
Link Aggregation (Bonding) .........................................................................................................................109
Adding another member or interface to an existing cluster .........................................................................111
ISP Redundancy ..........................................................................................................................................112
Dynamic Routing .........................................................................................................................................112
SNMP...........................................................................................................................................................113
Policy Installation .........................................................................................................................................113
Full Sync ......................................................................................................................................................117
Delta Sync ...................................................................................................................................................118
Traffic ...........................................................................................................................................................120
Flapping .......................................................................................................................................................121
'fw ctl pstat' command..................................................................................................................................122
'cphaprob' command....................................................................................................................................125
'cphastart' and 'cphastop' commands ..........................................................................................................140
'cphaconf' command ....................................................................................................................................141
'cpstat' command .........................................................................................................................................149
$FWDIR/bin/clusterXL_admin script............................................................................................................152
$FWDIR/bin/clusterXL_monitor_ipsscript...................................................................................................152
$FWDIR/bin/clusterXL_monitor_processscript...........................................................................................152
ClusterXL Debugging ....................................................................................................... 153

Debugging Check Point Security Gateway..................................................................................................153
Debugging modules and flags .....................................................................................................................156
Working with kernel parameters ....................................................................................... 160

ClusterXL Error Messages ............................................................................................... 161
Additional related solutions .............................................................................................. 161

Introduction to ClusterXL
The need for gateway clusters
Gateways and VPN connections are business critical devices. The failure of a Security
Gateway or VPN connection can result in the loss of active connections and access to
critical data. The gateway between the organization and the world must remain open under
all circumstances.
Check Point cluster solution

A ClusterXL cluster is a group of identical Check Point Security Gateways connected in
such a way that if one fails, another immediately takes its place.
ClusterXL is a software-based High Availability and Load Sharing solution that
distributes network traffic between clusters of redundant Security Gateways and provides
transparent failover between machines in a cluster:
A High Availability cluster ensures Security Gateway and VPN connection
redundancy by providing transparent failover to a backup Security Gateway in the
event of failure.
A Load Sharing cluster provides reliability and increases performance, as all cluster
members are active.
ClusterXL uses unique physical IP addresses and MAC addresses for the cluster
members, and Virtual IP addresses to represent the cluster itself on the attached networks.
Virtual IP addresses do not belong to an actual machine interface (except in High
Availability Legacy (Traditional) mode).
ClusterXL provides an infrastructure that ensures that data is not lost due to a failure, by
ensuring that each cluster member is aware of connections passing through the other
members. Passing information about connections and other Security Gateway states
between the cluster members is known as State Synchronization.
Security Gateway Clusters can also be built using OPSEC certified High Availability and
Load Sharing products. OPSEC certified clustering products use the same State
Synchronization infrastructure as ClusterXL. Refer to http://www.checkpoint.com/opsec/.
Note: This applies to ClusterXL in Security Gateway mode only. For more on VSX
mode, see the VSX Administration Guide (VSX NGX, VSX NGX Scalability Pack, VSX
NGX R65, VSX NGX R67, VSX NGX R68, R75.40VS, R76, R77).

ClusterXL definitions and terms

Different vendors give different meanings to terms that relate to clusters, High
Availability and Load Sharing (Load Balancing).
Check Point uses the following definitions and terms (the order of definitions and terms
listed below is dictated by education reasons):
Cluster - A group of machines that work together in a redundant configuration.
Cluster member - Machine that is a part of Cluster.
ClusterXL - Cluster of Check Point Security Gateways that work together in a redundant
configuration. These Check Point Security Gateways are installed on Gaia /
SecurePlatform / Crossbeam / Solaris / Windows OS.
Up to 8 cluster members are supported in ClusterXL (in Load Sharing mode, configuring
more than 4 members significantly decreases cluster performance due to amount of Delta
Sync).
Up to 5 cluster members are supported 3rd party cluster (in Crossbeam chassis,
configuring more than 4 members (APMs) significantly decreases cluster performance due
to amount of Delta Sync).
Notes:
VRRP Cluster on Gaia OS is considered a 3rd party cluster.
In Crossbeam DBHA configuration, the above requirement applies to a single
chassis (Check Point code is not aware of DBHA).
3rd party cluster - Cluster of Check Point Security Gateways that work together in a
redundant configuration. These Check Point Security Gateways are installed on
Crossbeam XOS, or IPSO OS.
Notes:
VRRP Cluster on Gaia OS is also considered a 3rd party cluster.
Cluster Mode - configuration of cluster members to work in either High Availability / VRRP
(one cluster member processes all the traffic), or Load Sharing (all traffic is processed in
parallel by all cluster members).
Failure - A hardware or software problem that causes a machine to be unable to serve as a
cluster member (for example, one of cluster interface has failed; one of the monitored
daemon has crashed).
Cluster member that suffered from a failure is declared as failed, and its state is
changed to 'Down' (A physical interface is considered 'Down' only if all configured VLANs
on that physical interface are 'Down').
Failover / Fail-over - Transferring of a control over traffic (packet filtering) from a cluster
member that suffered a failure to another cluster member (based on internal cluster
algorithms).
Failback / Fallback - Recovery of a cluster member that suffered from a failure. The state
of a recovered cluster member is changed from 'Down' to either 'Active', or 'Standby'
(depending on Cluster Mode).

State Synchronization (a.k.a. Sync) - Technology that synchronizes the relevant

information about the current connections (stored in various Check Point kernel tabled)
among all cluster members over Synchronization Network.
Due to State Synchronization, the current connections are not cut off during cluster
failover.
Synchronization (a.k.a. Sync, Secured, Trusted) Network - A set of interfaces on cluster
members that were configured as interfaces, over which State Synchronization information
will be passed (as Delta Sync packets ).
All Synchronization Networks work in parallel - i.e., the same information is passed in
parallel over all configured Synchronization Networks.
Up to three Synchronization Networks can be configured per cluster (SmartDashboard cluster object - 'Topology' pane - 'Network Objective').
Cluster topology - set of interfaces on all members of a cluster and their settings (Network
Objective, IP address/Net Mask, Topology, Anti-Spoofing, etc.).
Network Objective - defines how the cluster will configure and monitor an interface Cluster, Sync, Cluster+Sync, Monitored Private, Non-Monitored Private.
Configured in SmartDashboard - cluster object - 'Topology' pane - 'Network Objective'.
Cluster interface - An interface that was configured as a part of cluster topology:
SmartDashboard - cluster object - 'Topology' pane - 'Network Objective' column - set to
'Cluster'.
Sync (a.k.a. Secured, Trusted) interface - An interface that was configured as a part of
cluster topology for State Synchronization: SmartDashboard - cluster object - 'Topology'
pane - 'Network Objective' column - set to 'Sync' or 'Cluster+Sync'.
Up to three Sync interfaces can be configured per cluster.
Monitored Private interface - An interface that was configured as not to be a part of
cluster topology: SmartDashboard - cluster object - 'Topology' pane - 'Network Objective'
column - set to 'Monitored Private'.
However, this interface will be monitored by cluster, and failure on this interface will
cause the member to change its state to 'Down'.
Non-Monitored Private interface - An interface that was configured as not to be a part of
cluster topology: SmartDashboard - cluster object - 'Topology' pane - 'Network Objective'
column - set to 'Non-Monitored Private'.
This interface will not be monitored by cluster, and failure on this interface will not cause
any changes in member's state.
Private interface - An interface that was configured as not to be a part of cluster topology:
SmartDashboard - cluster object - 'Topology' pane - 'Network Objective' column - set to
'Private'.
This interface will not be monitored by cluster, and failure on this interface will not cause
any changes in member's state. Applies only to 3rd party clusters.
Disconnected interface - An interface that was set as 'Non-Monitored Private'
interface in SmartDashboard, or was explicitly added into $FWDIR/conf/discntd.if file

(e.g., pre-requisite for configuring Bond interface). This interface state appears in the output
of 'cphaprob -a if' command.
Full Sync - Complete synchronization of relevant kernel tables by a cluster member that
tries to join the cluster against the working cluster member(s). This process is meant to
fetchasnapshotoftherelevant kernel tables of already Active cluster member(s).
Full Sync is performed during initialization of Check Point software (during boot process,
the first time the member runs policy installation, during 'cpstart'). Until the Full Sync
process is complete successfully, this member remains in 'Down' state because until it is
fully synchronized with other cluster members, it can not function as a cluster member.
Meanwhile the Delta Sync packets continue to arrive and are stored in kernel memory
until Full Sync completes.
The whole Full Sync process is performed by FWD daemons on TCP port 256 and is
always done over.
The information is sent by FWD daemons in chunks while making sure they confirm
getting the information before sending the next chunk.
Delta Sync - Synchronization of kernel tables between all working cluster members exchange of CCP packets that carry pieces of information about different connections and
operations that should be performed on these connections in relevant kernel tables.
This Delta Sync process is performed directly by Check Point kernel.
While performing Full Sync, the Delta Sync updates are not processed and saved in
kernel memory. After Full Sync is complete, the Delta Sync packets stored during the Full
Sync phase are applied by order of arrival.
Delta Sync retransmission - It is possible that Delta Sync packets will be lost or corrupted
during the Delta Sync operations. In such cases, it is required to make sure the Delta Sync
packet is re-sent.The cluster member request the sending member to retransmit the
lost/corrupted Delta Sync packet.
Each Delta Sync packet has a sequence number.
The sending member has a queue of sent Delta Sync packets.
Each cluster member has a queue of packets sent from each of the peer cluster
members.
If, for any reason, a Delta Sync packet was not received by a cluster member, it can ask
for a retransmission of this packet from the sending member.
The Delta Sync retransmission mechanism is somewhat similar to a TCP Window and
TCP retransmission mechanism.
When a member requests retransmission of Delta Sync packet, which no longer exists
on the sending member, the member prints a console messages that the sync is not
complete.
Cluster Control Protocol (CCP) - Proprietary Check Point protocol that runs between
cluster members on UDP port 8116, and has the following roles (refer to 'Cluster Control
Protocol (CCP) section):
State Synchronization (Delta Sync)
Health checks (state of cluster members and of cluster interfaces):
o Health-status Reports
o Cluster-member Probing
o State-change Commands
o Querying for Cluster Membership

Note: CCP is located between the Check Point kernel and the network interface
(therefore, only TCPdump should be used for capturing this traffic).
Preconfigured mode - Cluster Mode, where cluster membership is enabled on all
members to be, however no policy had been yet installed on any of the members - none of
them is actually configured to be primary, secondary, etc. The cluster cannot function if one
machinefails.Inthisscenario,thepreconfigured mode takes place.
The preconfigured mode also comes into effect when no policy is yet installed, right
after the machines came up after boot, or when running 'cphaconf init' command.
Blocking mode - Cluster Mode, where cluster member does not forward any traffic (e.g.,
caused by a failure).
Non-blocking mode - Cluster Mode, where cluster member keeps forwarding all traffic.
High Availability (a.k.a. Active/Standby) mode - Cluster Mode, where only one cluster
member ('Active' member) processes all the traffic, while other cluster members ('Standby'
members) are ready to be promoted to 'Active' state if 'Active' member fails.
In High Availability New Mode, the cluster Virtual IP address (that represents the cluster
on that network) is associated:
with physical MAC Address of 'Active' member
with virtual MAC Address (refer to sk50840 (How to enable ClusterXL Virtual MAC
(VMAC) mode))
In High Availability Legacy (Traditional) Mode, there are no Virtual IP addresses - the
cluster members share identical IP and MAC addresses, so that the Active cluster member
receives from a hub or switch all the packets that were sent to the cluster IP address.
Load Sharing (a.k.a. Active/Active, Load Balancing) mode - Cluster Mode, where all
traffic is processed by all cluster members in parallel.
Load Sharing Multicast mode - Load Sharing Cluster Mode, where all traffic is processed
by all cluster members in parallel - each member is assigned the equal load of [ 100% /
number_of_members ].
The cluster Virtual IP address (that represents the cluster on that network) is associated
with Multicast MAC Address 01:00:5E:X:Y:Z (which is generated based on last 3 bytes
of cluster Virtual IP address on that network).
A ClusterXL decision algorithm (Decision Function) on all cluster members decides
which cluster member should process the given packet.
Load Sharing Unicast mode - Load Sharing Cluster Mode, where all traffic is accepted by
one member (called Pivot), and then the traffic is either processed by this member (Pivot),
or forwarded to one of the peer members (called non-Pivot).
The traffic load is assigned to cluster members based on the hard-coded formula per
the value of 'Pivot_overhead' attribute (refer to sk34668 (How to modify the assigned
load between the members of ClusterXL in Load Sharing Unicast mode)).
The cluster Virtual IP address (that represents the cluster on that network) is associated
with:
Physical MAC Address of 'Pivot' member
Virtual MAC Address (refer to sk50840 (How to enable ClusterXL Virtual MAC
(VMAC) mode))

Full High Availability (a.k.a. Full HA) mode - Special Cluster Mode (supported only on
Check Point appliances running Gaia OS or SecurePlatform OS) where each cluster
member also runs as a Security Management Server. This provides redundancy both
between Security Gateways (only High Availability is supported) and between Security
Management Servers (only High Availability is supported). Refer to sk101539 (ClusterXL
Load Sharing mode limitations and important notes) and sk39345 (Management High
Availability restrictions).
Decision Function - Special cluster algorithm applied by each cluster member on the
incoming traffic in order to decide, which member should process the given packet - each
cluster members maintains a table of hash values generated based on connections tuple
(source and destination IP addresses/Ports, and Protocol number).
In order to see the decision process, run kernel debug of 'cluster' module with flag
'df' (also recommended to enable the flag 'select').
Sticky Decision Function (SDF) - Special cluster algorithm in Load Sharing mode that
allows the user to control based on which parameters should the Decision Function be
applied to the incoming connections:
IPs, Ports, SPIs
IPs, Ports
IPs
Selection - The packet selection mechanism is one of the central and most important
components in the ClusterXL product and State Synchronization infrastructure for 3rd party
clustering solutions. Its main purpose is to correctly decide (select) what has to be done to
the incoming and outgoing traffic on the cluster machine.
In order to see the selection process, run kernel debug of 'cluster' module with flag
'select' (also recommended to enable the flag 'df').
In ClusterXL - the packet is selected by cluster member(s) depending on the cluster
mode:
o In HA modes - by Active member
o In LS Unicast mode - by Pivot member
o In LS Multicast mode - by all members.
Then the member applies the Decision Function (and SDF).
In 3rd party / OPSec cluster - the 3rd party software selects the packet, and Check
Point code just inspects it (and performs State Synchronization).
HA not started - Output of 'cphaprob flag' command on the given cluster member means that Check Point clustering software is not started on this Security Gateway (e.g.,
this machine is not a part of a cluster, or 'cphastop' command was run, or some failure
occurred that prevented the ClusterXL product from starting correctly).
Initializing - State of a cluster member during initialization of Check Point software (this
state can be seen only in cluster debug). An initial and transient state of the cluster member
- the ClusterXL product is already running, but not all ClusterXL Critical Devices are
initialized yet and FireWall product is not ready yet.
Ready - State of a cluster member during after initialization and before promotion to the
next required state - Active/Standby/Master/Backup (depending on Cluster Mode).

A member in this state does not process any traffic passing through cluster. A member
might be stuck in this state due to several reasons - refer to sk42096 (Cluster member is
stuck in 'Ready' state).
Active - State of a cluster member that is fully operational:
In ClusterXL - state of the Security Gateway component
In 3rd party / OPSec cluster - state of the State Synchronization mechanism
Active attention - In ClusterXL - state of the 'Active' cluster member that suffers from a
failure (and failover is not possible because there are no other available members, e.g.,
while Standby member of an HA cluster reboots).
Standby - State of a cluster member that is ready to be promoted to 'Active' state (if Active
member fails) in ClusterXL configured in High Availability mode.
Master - State of a cluster member that processes all traffic in ClusterXL configured in
VRRP mode.
Backup - State of a cluster member that is ready to be promoted to 'Master' state (if Master
member fails) in ClusterXL configured in VRRP mode.
Active Up - ClusterXL in High Availability mode that was configured as 'Maintain
current active Cluster Member'.
This means the following:
If the current Active member fails for some reason, or is rebooted (e.g., Member_A),
then failover occurs between cluster members - another Standby member will be
promoted to be Active (e.g., Member_B).
When former Active member (Member_A) recovers from a failure, or boots, the
former Standby member (Member_B) will remain to be in Active state (and
Member_A will assume the Standby state).
Primary Up - ClusterXL in High Availability mode that was configured as 'Switch to
higher priority Cluster Member'.
This means the following:
Each cluster member is given a priority (SmartDashboard - cluster object - 'Cluster
Members' pane) - member with highest priority appears at the top of the table, and
member with lowest priority appears at the bottom of the table.
The member with highest priority will assume the Active state.
If the current Active member with highest priority (e.g., Member_A), fails for some
reason, or is rebooted, then failover occurs between cluster members - the member
next highest priority will be promoted to be Active (e.g., Member_B).
When the member with highest priority (Member_A) recovers from a failure, or
boots, then additional failover occurs between cluster members - the member with
highest priority (Member_A) will be promoted to Active state (and Member_B will
return to Standby state).
Down - State of a cluster member during a failure:
In ClusterXL - state of the Security Gateway component
In 3rd party / OPSec cluster - state of the State Synchronization mechanism

Dead - State reported by a cluster member when it goes out of the cluster (due to
'cphastop' command (which is a part of 'cpstop'), or reboot).
Dying - State of a cluster member as assumed by peer members if it did not report its state
for 0.7 sec.
ClusterXL is inactive, or the machine is down - Such state is reported by the given
member regarding the peer member after the peer member notifies (via CCP) that it goes
out of the cluster (due to 'cphastop' command (which is a part of 'cpstop'), or reboot).
Critical Device (a.k.a. Problem Notification, Pnote) - Special software device on each
cluster member through which the critical aspects for cluster operation are monitored.
When the critical monitored component on a cluster member fails to report its state on
time, or when its state is reported as problematic, the state of that member is immediately
changed to 'Down'
The complete list of the configured critical devices (pnotes) is printed by the 'cphaprob
-ia list' command.
Restrictions:
Total number of critical devices (pnotes) on cluster member is limited to 16.
Name of any critical device (pnote) on cluster member is limited to 16 characters.
There are several predefined built-in critical devices (pnotes):
Device Name: Problem Notification

o Current state: OK - none of the Critical Devices reports its state as 'problem'
o Current state: problem - at least one of the Critical Devices reports its
state as 'problem'
Device Name: Interface Active Check

o Current state: OK - all cluster interfaces are up (CCP packets are sent and
received on all cluster interfaces)
o Current state: problem - at least one of the cluster interface is down (CCP
packets are not sent / received on time)
Note:
o The transmit state of an interface (as monitored by this pnote) is refreshed once
a FWHAP_IF_PROBE_RPLY packet is received in acknowledge to
FWHA_IF_PROBE_REQ packet.
o The receive state of an interface (as monitored by this pnote) is refreshed once
any CCP packet (UDP on port 8116) is received.
Device Name: HA Initialization

o Current state: OK - "HA module" was initialized successfully (sk36372)
Device Name: Load Balancing Configuration

o Current state: OK - Pnote is currently not used (sk36373)

Device Name: Recovery Delay

o Current state: OK - state of a Virtual System can be changed (sk92353)
o Current state: problem - state of a Virtual System can not be changed yet
(sk92353)
Note: Recovery Delay mechanism is disabled by default on 3rd party clusters.
Device Name: IPSO member status

o Current state: OK - IPSO machine joined the cluster, all interfaces are up
o Current state: problem - IPSO machine left the cluster, less interfaces
than expected in UP state
There are several predefined registered critical devices (pnotes):
Device Name: Synchronization

o Current state: OK - Full Sync has completed successfully
o Current state: problem - Full Sync has failed
Device Name: Filter

o Current state: OK - Security Policy was installed successfully
o Current state: problem - Security Policy is not currently installed
Device Name: VSX

o Current state: OK - On VS0 means that states of all VSs are not 'Down';
On other VSs means that VS0 is alive
o Current state: problem - minimum of blocking states of all VSs is not
"active" (the VSIDs will be printed on the line 'Problematic VSIDs:')
Note: This pnote appears on Gaia OS since R75.40VS.
Device Name: fwd

o Current state: OK - FWD daemon reported its state on time (i.e., FWD is
up)
o Current state: problem - FWD daemon did not report its state on time
(e.g., FWD is not running)
Device Name: cphad

o Current state: OK - CPHAMCSET daemon reported its state on time (i.e.,
CPHAMCSET is up)
o Current state: problem - CPHAMCSET daemon did not report its state on
time (e.g., CPHAMCSET is not running)
Note: Does not exist on VSX cluster R6x.
Note: Starting in R77.20, refer to $FWDIR/log/cphamcset.elg
Device Name: cvpnd

o Current state: OK - CVPND daemon reported its state on time (i.e., CVPND
is up)
o Current state: problem - CVPND daemon did not report its state on time
(e.g., CVPND is not running)
Note: This pnote appears if Mobile Access Blade is enabled.

Device Name: FIB

o Current state: OK - FIBMGRD daemon reported its state on time (i.e.,
FIBMGRD is up) and it is able to send/receive its packets (on TCP port 2010)
o Current state: problem - FIBMGRD daemon did not report its state on
time (e.g., RouteD is not running) or it is not able to exchange its packets with
peer members
Note: This pnote appears only on SecurePlatform Pro OS when Advanced Dynamic
Routing is enabled.
Device Name: routed

o Current state: OK - RouteD daemon reported its state on time (i.e., RouteD
is up)
o Current state: problem - RouteD daemon did not report its state on time
(e.g., RouteD is not running)
Note: This pnote appears on Gaia OS since R76.
Device Name: ted

o Current state: OK - Threat Emulation Daemon reported its state on time
(i.e., TED is up)
o Current state: problem - Threat Emulation Daemon did not report its state
on time (e.g., TED is not running)
Note: This pnote appears since R77.
Device Name: Instances

o Current state: problem - mismatch between the number of CoreXL FW
instances in the received CCP packet and the number of loaded CoreXL FW
instances on this VSX member / this Virtual System (refer to sk106912).
Note: This pnote appears since R75.40VS in VSX HA mode (not VSLS) cluster.
Additional critical devices (pnotes) can be registered by using Check Point shell scripts:
'$FWDIR/bin/clusterXL_admin' shell script registers the admin_down device
(sk55081)
'$FWDIR/bin/clusterXL_monitor_ips' shell script registers the
host_monitor device (sk35780)
'$FWDIR/bin/clusterXL_monitor_process' shell script registers devices with
the names of processes that are specified in the
$FWDIR/conf/cpha_proc_list file (sk92904)
Additional critical devices (pnotes) can be registered by using the following syntax:
cphaprob -d Device_Name -t TimeOut_in_Sec -s State [-p]
register
Important Note: For R76 and above, refer to sk92878 (User Space process monitoring
mechanism in R76 ClusterXL).
Note: On Security Gateway in VSX mode, global pnotes can be registered only from the
context of VS0.
Any critical device (pnote) can be unregistered by using the following syntax:
cphaprob -d Device_Name [-p] unregister
Note: On Security Gateway in VSX mode, global pnotes can be unregistered only
from the context of VS0.

Subscribers - User Space processes that are made aware of the current state of the
ClusterXL state machine and other clustering configuration parameters. List of such
subscribers can be obtained by running the cphaconf debug_data command.
Sticky connection - A connection is called 'sticky' if all packets are handled by a single
cluster member (in High Availability mode, all packets reach the 'Active' machine, so all
connections are sticky).
Non-sticky connection - A connection is called 'non-sticky' if the reply packet returns via a
different cluster member than the original packet (e.g., if network administrator has
configured asymmetric routing; in Load Sharing mode, all cluster members are 'Active', and
in Static NAT and encrypted connections, the Source and Destination IP addresses
change, therefore, Static NAT and encrypted connections through a Load Sharing cluster
may be non-sticky).
Flush and ACK (a.k.a. FnA, F&A) - Cluster member forces the Delta Sync packet about
the incoming packet and waiting for acknowledgements from all other Active members and
only then allows the incoming packet to pass through.
In some scenarios, it is required that some information, written into the kernel tables, will
be Sync-ed promptly, or else a race condition can occur. The race condition may occur if a
packet that caused a certain change in kernel tables left cluster Member_A toward its
destination and then the return packet tries to go through cluster Member_B.
In general, this kind of situation is called asymmetric routing. What may happen in this
scenario is that the return packet arrives at cluster Member_B before the changes induced
by this packet were Sync-ed to this Member_B.
Example of such a case is when a SYN packet goes through cluster Member_A,
causing multiple changes in the kernel tables and then leaves to a server. The SYN-ACK
packet from a server arrives at cluster Member_B, but the connection itself was not Synced yet. In this condition, the cluster Member_B will drop the packet as an Out-of-State
packet ("First packet isn't SYN"). In order to prevent such conditions, it is possible
tousetheFlushandAck(F&A)mechanism.
This mechanism can send the Delta Sync packets with all the changes accumulated so
far in the Sync buffer to the other cluster members, hold the original packet that induced
these changes and wait for acknowledgement from all other (Active) cluster members that
they received the information in the Delta Sync packet. When all acknowledgements
arrived, the mechanism will release the held original packet.
This ensures that by the time the return packet arrived from a server at the cluster, all
the cluster members are aware of the connection.
F&A is being operated at the end of the Inbound chain and at the end of the Outbound
chain (it is more common at the Outbound).
Forwarding - Process of transferring of an incoming traffic from one cluster member to
another cluster member for processing.
There are two types of forwarding the incoming traffic between cluster members:
Packet forwarding
Chain forwarding
Refer to Forwarding section.

Packet Selection - Distinguishing between different kinds of packets coming from the
network, and selecting, which member should handle a specific packet (Decision Function
mechanism):
CCP packet from another member of this cluster
CCP packet from another cluster or from a cluster member with another version
(usually older version of CCP)
Packet is destined directly to this member
Packet is destined to another member of this cluster
Packet is intended to pass through this cluster member
ARP packets
CPHA - General term that stands for Check Point High Availability (historic fact: the first
release of ClusterXL supported only High Availability) that is used only for internal
references (e.g., inside kernel debug) to designate ClusterXL infrastructure.
Probing - If a cluster member fails to receive status for another member (does not receive
CCP packets from that member) on a given segment, cluster member will probe that
segment in an attempt to illicit a response.
The purpose of such probes is to detect the nature of possible interface failures, and to
determine which module has the problem.
The outcome of this probe will determine what action is taken next (change the state of
an interface, or of a cluster member).
Refer to Cluster Control Protocol (CCP) section.
IP tracking - Collecting and saving of Source IP addresses and Source MAC addresses
from incoming IP packets during the probing.
This information is saved in IP tracking tables according to IP tracking policy:
host_ip_addrs_all, id 8125
host_ip_addrs, id 8177
IP tracking is a useful for members within a cluster to determine whether the network
connectivity of the member is acceptable.
IP tracking policy - Setting that controls, which IP addresses should be tracked during IP
tracking:
Only IP addresses from the subnet of cluster VIP, or from subnet of physical cluster
interface (fwha_track_ip_policy=1; default value)
All IP addresses, also outside the cluster subnet (fwha_track_ip_policy=0)
Pingable host - Some host (i.e., some IP address) that cluster members can ping during
probing mechanism. Pinging hosts in an interface's subnet is one of the health checks that
ClusterXL mechanism performs. This pingable host will allow the cluster members to
determine with more precision what has failed (which interface on which member).
On Sync network, usually, there are no hosts. In such case, if switch supports this, an
IP address should be assigned on the switch (e.g., in the relevant VLAN).
The IP address of such pingable host should be assigned per this formula:
IP_of_pingable_host = IP_of_physical_interface_on_member + ~10
Assigning the IP address to pingable host that is higher than the IP addresses of
physical interfaces on the cluster members will give some time to cluster members to
perform the default health checks.

Example:
IP address of physical interface on a given subnet on Member_A is 10.20.30.41
IP address of physical interface on a given subnet on Member_B is 10.20.30.42
IP address of pingable host should be at least 10.20.30.50
Flapping - Consequent changes in the state of either cluster interfaces (cluster interface
flapping), or cluster members (cluster member flapping). Such consequent changes in the
state are seen in SmartView Tracker (if in SmartDashboard in cluster object, the cluster
administrator set 'Track changes in the status of cluster members' to 'Log').
VMAC - Virtual MAC address (available since R71). When this feature is enabled on cluster
members, all cluster members in High Availability New mode / Load Sharing Unicast mode
(Note: any VSX cluster works in High Availability mode) associate the same Virtual MAC
address with Virtual IP address.
This allows avoiding issues when Gratuitous ARP packets sent by cluster during failover
are not integrated into ARP cache table on switches surrounding the cluster.
Refer to sk50840 (How to enable ClusterXL Virtual MAC (VMAC) mode).
HTU - Stands for "HA Time Unit". All internal time in ClusterXL is measured in HTUs (the
times in cluster debug also appear in HTUs).
Formula in the code:
1 HTU = 10 x fwha_timer_base_res = 10 x 10 milliseconds = 100 ms
ClusterXL requirements for hardware and software

Note: ClusterXL product is part of the standard Security Gateway installation.
Open Servers vs. Check Point appliances
If ClusterXL is installed on Open Servers, then it must be installed in a distributed

configuration, in which the cluster members and the Security Management Server
are installed on different machines.
If ClusterXL is installed on Check Point appliances, then it can be installed in

either a distributed configuration (in which the cluster members and the Security
Management Server and installed on different machines), or in a Full High
Availability configuration (in which the cluster members and the Security
Management Server are installed on the same machines).
Requirements for number of cluster members

Up to 8 cluster members are supported in ClusterXL (in Load Sharing mode, configuring
more than 4 members significantly decreases cluster performance due to amount of Delta
Sync).
configuring more than 4 members (APMs) significantly decreases cluster performance due
to amount of Delta Sync).

Note: In Crossbeam DBHA configuration, the above requirement applies to a single

Requirements for hardware
ClusterXL operation completely relies on internal timers and calculation of internal
timeouts, which are based on hardware clock ticks.
Therefore, in order to avoid unexpected behaviour, ClusterXL is supported only
between machines with identical CPU characteristics.
In addition, in order to avoid unexpected fail-overs due to issues with CCP packets on
cluster interfaces, it is strongly recommended to pair only identical physical interfaces as
cluster interfaces - even when connecting the cluster members via a switch:
Intel 82598EB on Member_A with Intel 82598EB on Member_B
Broadcom NeXtreme on Member_A with Broadcom NeXtreme on Member_B
Note: There is no requirement for throughput of Sync interface to be identical to / larger
than throughput of traffic interfaces (although, to prevent a possible bottle neck, a good
practice for throughput of Sync interface is to be at least identical to throughput of traffic
interfaces).
Requirements for software
ClusterXL is supported only between identical operating systems (all cluster members
must be installed on the same operating system) and between identical Check Point
software versions (all cluster members must be installed with identical Check Point
software, including OS build and hotfixes).
All ClusterXL modes are supported on all operating systems.
All Check Point software components must be identical on all cluster members.
Meaning that identical Software Blades and features must be enabled on all cluster
members:
SecureXL status - SecureXL on all members has to be either enabled, or disabled
Number of CoreXL FW instances - number of instances on all members must be
identical
Advanced Dynamic Routing - on all members has to be either enabled, or disabled
Otherwise, traffic might not be processed as expected and/or state of cluster members
might change expectedly. In addition, Full Sync will fail.
Refer to this solution:
sk41023 ('fwsync: there is a different installation of Check Point's products on each
member of this cluster' error in /var/log/messages)

Requirements for switches and routers

Cluster interfaces can be connected only via Layer 2 networking devices - hubs and
switches. Connecting cluster interfaces via Layer 3 networking devices (routers) is not
supported.
Cluster networks must meet the requirements for latency (less than ~30 milliseconds)
and packet loss (less than ~2-3%).
Note: Latency cannot be measured correctly by a simple tool as Ping or Traceroute more sophisticated tools are required that measure electrical signals on the wire.
High Availability New mode and Load Sharing Unicast Mode
When running the CCP in Multicast mode (default), the Layer 2 Destination MAC
address of CCP packets is a Multicast MAC address 01:00:5E:X:X:X.
Configure the following settings:
Switch Setting
IGMP and Static CAMs
Disabling multicast limits
Router Setting
Unicast MAC
Explanation
By default, ClusterXL does not support IGMP registration (also
known as IGMP Snooping).
Either disable IGMP registration in switches that rely on IGMP
packets to configure their ports, or enable IGMP registration on
ClusterXL members per sk33221.
In situations, where disabling IGMP registration in switches is
not acceptable, it is necessary to configure static CAMs in order
to allow multicast traffic on specific ports.
Certain switches have an upper limit on the number of
broadcasts and multicasts that they can pass, in order to
prevent broadcast storms. This limit is usually a percentage of
the total interface bandwidth.
It is possible either to turn off broadcast storm control, or to
allow a higher level of broadcasts or multicasts through the
switch.
If the connecting switch is incapable of having any of these
settings configured, it is possible, though less efficient, for the
switch to use broadcast to forward traffic, and to configure the
cluster members to run CCP in broadcast mode per sk20576.
Explanation
When working in High Availability New mode (without VMAC) /
Load Sharing Unicast mode, the Cluster Virtual IP address is
mapped to a physical MAC address of the 'Active' / 'Pivot'
member.
In case of fail-over, another member will be promoted to 'Active'
/ 'Pivot'. As a result, the Cluster Virtual IP address will be
mapped to new physical MAC address.
In order to update the surrounding networking devices, 'Active' /
'Pivot' member sends Gratuitous ARP packets.
The router needs to be able to learn this MAC through these
ARP packets (otherwise, it will route the traffic to "old" MAC
address, which will cause traffic outage on the network).

Load Sharing Multicast Mode

Load Sharing Multicast Mode, the cluster Virtual IP address (that represents the cluster
on that network) is associated with Multicast MAC Address 01:00:5E:X:Y:Z, which is
generated based on last 3 bytes of cluster Virtual IP address. Refer to Mode comparison
table.
When working in Load Sharing Multicast mode, the router must support sending unicast
IP packets with multicast MAC addresses. This is required so that all cluster members will
receive the data packets.
Configure the following settings:
Switch Setting
CCP in Multicast mode
Port Mirroring
Router Setting
Static MAC
IGMP and static CAMs
Disabling multicast limits
Disabling forwarding
multicast traffic to the
router
Explanation
Multicast mode is the default Cluster Control Protocol mode in
Load Sharing Multicast.
ClusterXL does not support the use of unicast MAC addresses
with Port Mirroring for Multicast Load Sharing solutions.
Explanation
Most routers can map the following ARP entries automatically
using the ARP mechanism:
unicast Layer 3 IP address
multicast Layer 2 MAC address
If you have a router that is not able to learn this type of
mapping dynamically, you will have to configure these
mappings as static MAC entries.
Some routers require disabling of IGMP snooping or
configuration of static CAMs in order to support sending
packets with unicast Layer 3 IP address and multicast Layer 2
MAC address.
Certain routers have an upper limit on the number of
broadcasts and multicasts that they can pass, in order to
prevent broadcast storms. This limit is usually a percentage of
the total interface bandwidth.
It is possible either to turn off broadcast storm control, or to
allow a higher level of broadcasts or multicasts through the
router.
Some routers will send multicast traffic to the router itself. This
may cause a packet storm through the network, and should be
disabled.
Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77) - Example Configuration of a Cisco Catalyst Routing Switch.
Refer to these solutions related to IGMP snooping:
sk31934 (ClusterXL IGMP Membership)
sk33221 (Using ClusterXL with IGMP Snooping-enabled switches)
sk22495 (Interface flapping (down/up) in a ClusterXL environment)
sk93327 (IGMP groups are not learned on cluster member)

ClusterXL licenses
To use ClusterXL, each Security Gateway in the cluster configuration must have a
regular Security Gateway license and the Security Management Server must have a
license for each cluster defined. There are separate licenses for cluster High Availability
mode and for cluster Load Sharing mode.
It does not matter how many Security Gateways are included in the cluster. If the proper
licenses are not installed, the policy installation operation will fail.
Refer to these solutions:
sk11054 (Check Point License Guide)
sk10200 ('too many internal hosts' error in /var/log/messages on Security Gateway)
For assistance with licenses, contact Check Point Customer Account Services
(http://www.checkpoint.com/form/contact_account.html, AccountServices@checkpoint.com,
+1-972-444-6600 ext 5).
State Synchronization in ClusterXL

Introduction
A failure of a firewall results in an immediate loss of active connections in and out of the
organization. A failure of a firewall results in an immediate loss of active connections in and
out of the organization. Many of these connections, such as financial transactions, may be
mission critical, and losing them will result in the loss of critical data.
ClusterXL supplies an infrastructure that ensures that no data is lost in case of a failure,
by making sure each cluster member is aware of the connections going through the other
members. Passing information about connections (stored in various Check Point kernel
tabled) and other Security Gateway states between the cluster members is called State
Synchronization.
Every IP-based service (including ICMP, TCP and UDP) recognized by the Security
Gateway is synchronized (unless configured otherwise in SmartDashboard).
State Synchronization is used both by ClusterXL and by 3rd party OPSEC-certified
clustering products.
ClusterXL modes and state synchronization:
ClusterXL High Availability configuration does not require state synchronization,
though if it is not enabled, connections will be lost upon failover.
ClusterXL Load Sharing configuration requires state synchronization (it is enabled
automatically and can not be disabled).

State Synchronization modes

State Synchronization uses the following two synchronization modes (since NG with
Application Intelligence):
Full Sync - Complete synchronization of relevant kernel tables by a cluster member

that tries to join the cluster against the working cluster member(s). This process is
meanttofetchasnapshotoftherelevant kernel tables of already Active cluster
member(s).
Full Sync is performed during initialization of Check Point software (during boot
process, the first time the member runs policy installation, during 'cpstart'). Until the
Full Sync process is complete successfully, this member remains in 'Down' state
because until it is fully synchronized with other cluster members, it can not function as a
cluster member.
Meanwhile the Delta Sync packets continue to arrive and are stored in kernel
memory until Full Sync completes.
The whole Full Sync process is performed by FWD daemons on TCP port 256 and is
always done over SIC (the information is written into relevant kernel tables via IOCTL):
o The member that tries to join the cluster starts to serve as Full Sync Client.
$FWDIR/log/fwd.elg log file shows:
fwsync: Connected to Sync Server
Decimal_IP_Address_of_Peer_Member. Starting full sync
fwsync: Full sync connection finished successfully
fwsync: End Sync Connection successfully
o A member chosen for Full Sync starts to serve as Full Sync Server.
$FWDIR/log/fwd.elg log file shows:
fwd_syncn_handler: got new full sync connection request from peer
Hex_IP_Address_of_Peer_Member
The information is sent by FWD daemons in chunks while making sure they confirm
getting the information before sending the next chunk.
Delta Sync - Synchronization of kernel tables between all working cluster members
- exchange of CCP packets that carry pieces of information about different
connections and operations that should be performed on these connections in
relevant kernel tables.
This Delta Sync process is performed directly by Check Point kernel.
While performing Full Sync, the Delta Sync updates are not processed and saved in
kernel memory. After Full Sync is complete, the Delta Sync packets stored during the
Full Sync phase are applied by order of arrival.
Whenever an operation is performed on a kernel table, which is marked as "sync"-ed
(in $FWDIR/conf/table.def file on Security Management Server), the Delta Sync
mechanism duplicates this action into a buffer of its own.

Once this Delta Sync buffer is full, and every Sync timer interval, the Delta Sync
buffer is sent to all cluster members over the Synchronization Network. The receiving
member will duplicate those actions into its kernel tables.
Restrictions
1. Refer to ClusterXL Requirements for Hardware and Software section above.
2. State synchronization is supported only between cluster members that meet the
following requirements:
identical operating systems
identical Check Point software components
latency on synchronization network is less than ~30 milliseconds and packet loss
is less than ~2-3%
Note: There is no requirement for throughput of Sync interface to be identical to
/ larger than throughput of traffic interfaces.
In addition, some connections can not be synchronized by design:
Connections that use User Authentication can not be synchronized (because
user authentication state is maintained on Security Servers, which are User
Space processes, and thus cannot be synchronized on different machines in the
way that kernel data can be synchronized).
Connections that use Resources can not be synchronized (because state of
such connections is maintained on Security Servers, which are User Space
processes, and thus cannot be synchronized on different machines in the way
that kernel data can be synchronized).
Accounting information can not be synchronized (because it is accumulated in
each cluster member and reported separately to the Security Management
Server, where the information is aggregated).
Broadcasts and multicasts can not be synchronized by design.
When DHCP Server is enabled on cluster members, the DHCP Server lease
database is not synchronized by design.
In R6x versions, Web Intelligence features on a ClusterXL cluster do not survive
failover. This means that if ClusterXL is providing Web Intelligence protections
and a cluster member fails, HTTP connections passing through the failed
member are lost.
Refer to sk92909 (How to debug ClusterXL to understand why a connection is not
synchronized).

Synchronization network
A set of interfaces on cluster members that were configured as interfaces, over which
State Synchronization information will be passed (as Delta Sync packets) comprise the
Synchronization Network.
All Synchronization Networks work in parallel - i.e., the same information is passed in
parallel over all configured Synchronization Networks.
Up to three Synchronization Networks can be configured per cluster (SmartDashboard cluster object - 'Topology' pane - 'Network Objective').
Important Notes:
1. The use of more than one Synchronization Network for redundancy is not supported
because the CPU load will increase significantly due to duplicate tasks performed by
all configured Synchronization Networks.
If a redundancy of Synchronization Networks is required, Check Point recommends
using Link Aggregation - configure several physical interfaces as a Bond interface,
and then configure single dedicated Synchronization Network over this single Bond
interface.
Refer to Link Aggregation (Bonding) section.
Refer to sk92804 (Sync Redundancy in ClusterXL).
2. State Synchronization information (payload of Delta Sync packets) is not encrypted.
It is up to cluster administrator to make sure that the Sync network is secured and
isolated.
Configuring synchronization network

It is strongly recommended to configure each Synchronization Network on dedicated
interfaces - meaning, configuring 'Network Objective' for those interface only as '1st sync',
'2nd sync' and '3rd sync' and not mixing as 'Cluster + 1st sync', 'Cluster + 2nd sync' and
'Cluster + 3rd sync'.
Adding the 'Sync' objective to the interfaces that pass the production traffic will create
additional load on that interface, on the CPU and will make the troubleshooting much more
difficult.
In ClusterXL (including VSX), the Synchronization Network is supported only on
the lowest VLAN tag of a VLAN interface. For example, if three VLANs with tags 10, 20
and 30 were configured on interface eth1, then only interface eth1.10 may be used for
State Synchronization.
If 'Sync' Network Objective was configured on any VLAN tags other than the lowest tag,
then cluster members will reject such configuration, and the output of 'cphaprob -a if'
command will explicitly show that no synchronization interfaces were configured.

Configuring a service to be non-synchronized

R75.40VS, R76, R77) - Chapter 'Synchronizing Connection Information Across the Cluster'
- Configuring State Synchronization - Configuring a Service Not to Synchronize.
By default, all connections are synchronized (i.e., for each processed connection, a
Delta Sync packet is created, sent by the member that processed this connection, received
and processed by peer members).
If the amount of traffic is high, the amount of Delta Sync packets will cause noticeable
load on the CPU. This load will increase significantly, if more than one Synchronization
Network is configured.
In order to increase the performance of cluster members, synchronization of some
connections can be set in the following way:
disabled completely (e.g., for DNS UDP, ICMP)
start the synchronization after a pre-defined delay (e.g., for HTTP downloads)
In order to change the relevant settings for a specific service:
1. Locate the service (either in the rulebase, or on the 'Services' tab)
2. Right-click on the service - select 'Edit...' - click on 'Advanced...' button
3. Change the relevant settings:
A. To disable the synchronization completely, uncheck the box 'Synchronize
connections on Cluster':
B. To start the synchronization after a pre-defined delay, check the box

'Synchronize connections on Cluster', check the box 'Start
synchronizing [ ] seconds after connection initiation' and
enter the desired number of seconds:
4. Click 'OK' to apply the changes.

5. Save the settings: 'File' menu - 'Save'.
6. Install policy onto cluster object.

It is possible to have both a synchronized and a non-synchronized definition of the

service, and to use them selectively in the Security Rule Base:
1. Create a new service (TCP, UDP and Other type) and give it a name that
distinguishes it from the existing service.
2. Copy all the definitions from the existing service into the Service Properties window
of the new service.
3. In the new service, click on 'Advanced...' button.
4. Copy all the advanced definitions from the existing service into the Advanced
Service Properties window of the new service.
5. Set the 'Synchronize connections on Cluster' in the new service, so that it
is different from the setting in the existing service.
6. Save the settings: 'File' menu - 'Save'.
Disable the Synchronization for a service if ALL of the following conditions are true:
1. A significant portion of the traffic crossing the cluster uses a particular service. If you
do not synchronize this service, then the amount of synchronization traffic is reduced
and cluster performance is enhanced.
2. The service usually opens short connections, whose loss may not be noticed. DNS
(over UDP) and HTTP are typically responsible for most connections, and generally
have very short life and inherent recoverability at the application level. However,
services, which typically open long connections, such as FTP, should always be
synchronized.
3. Configurations that ensure bi-directional stickiness for all connections do not require
synchronization to operate (only to maintain High Availability). Such configurations
include:
Any cluster in High Availability mode (for example, ClusterXL New HA or VRRP
ClusterXL in a Load Sharing mode with clear connections (no VPN or static NAT)
OPSec clusters that guarantee full stickiness (refer to the OPSec cluster's
documentation)
ClusterXL Modes
Note:
Up to 8 cluster members are supported in ClusterXL (in Load Sharing mode,
configuring more than 4 members significantly decreases cluster performance due to
amount of Delta Sync).
configuring more than 4 members (APMs) significantly decreases cluster
performance due to amount of Delta Sync).
Note: In Crossbeam DBHA configuration, the above requirement applies to a single
This limitation exists due to the high load on CPU that is caused by the amount of Delta
Sync packets, which increases significantly with the number of cluster members (the whole
cluster might suffocate, depending on the production traffic, of course).

Mode comparison table

R75.40VS, R76, R77) - Chapter 'High Availability and Load Sharing in ClusterXL' ClusterXL Modes - Mode Comparison Table:
Mode
Feature
High Availability
Load Sharing
Assigned Traffic
Load per Member
Performance
HW Support
SecureXL
Support
State
Synchronization is
Mandatory
VLAN Tagging
Support
Number of
members that
deal with network
traffic
Number of
members that
receive packets
from router
How cluster
answers ARP
requests for a VIP
address 4
CCP mode (also
refer to sk36644)
High
High
Availability Availability
New
Legacy
Mode
Mode 1
Yes
Yes
No
No
VRRP
Yes
No
Load
Sharing
Multicast
Mode 2
No
Yes
Load
Sharing
Unicast
Mode 2
No
Yes
100%
100%
100%
100% / N
sk34668
Good
Good
Good
Very Good
All
All
All
Excellent
Not all
routers 3
Yes
Yes
Yes
Yes 2
Yes 2
No
No
No
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Unicast
MAC
address of
Active
Multicast /
Broadcast
Unicast
shared
MAC
address
Broadcast
only
Virtual
VRRP MAC
address
Multicast
MAC
address 5
Multicast /
Broadcast
Multicast /
Broadcast
All
Unicast
MAC
address of
Pivot
Multicast /
Broadcast
Notes:
1.
2.
3.
4.
High Availability Legacy Mode is not supported on Gaia OS (sk103083).

Refer to sk101539 (ClusterXL Load Sharing mode limitations and important notes).
Refer to Requirements for switches and routers section.
Refer to sk31782 (ClusterXL association between cluster Virtual IP addresses and
MAC Addresses).
5. In Load Sharing Multicast mode, the Multicast MAC address that is associated with
Virtual IP address is calculated automatically (and can be changed manually).

In order to see / change this Multicast MAC address:

A.
B.
C.
D.
E.
F.
Open SmartDashboard
Open cluster object
Go to 'Topology' pane - click on 'Edit...'
Select the relevant VIP interface - click on 'Edit...'
On 'General' tab, click on 'Advanced...'
If you need to change this address, then
select 'User defined:' and enter new Multicast MAC address
G. Click 'OK' in all windows to apply the changes
H. Save the changes: go to 'File' menu - click on 'Save'
I. Install policy
Automatic algorithm for generating a Multicast MAC address to be associated with
cluster Virtual IP address of a format "A"."B"."C"."D":
o If 2nd octet "B" < 127, then
Final MAC = 01:00:5E:("B"hex):("C"hex):("D"hex)
Example:
VIP = 192.50.204.20
Final MAC = 01:00:5E:("50"hex):("204"hex):("20"hex) =
= 01:00:5E:32:CC:14
o If 2nd octet "B" > 127, then
Final MAC = 01:00:5E:("B-128"hex):("C"hex):("D"hex)
Example:
VIP = 192.168.204.20
Final MAC = 01:00:5E:("168-128"hex):("204"hex):("20"hex) =
= 01:00:5E:28:CC:14
Refer to sk25977 (Connecting multiple clusters to the same network segment (same
VLAN, same switch) - section about the "Destination MAC address" of the Cluster
Control Protocol.
Example cluster topology

This diagram can be used as example of cluster topology for these modes:
o High Availability New mode
o Full High Availability mode
o Load Sharing Multicast mode
o Load Sharing Unicast mode
Note: The example diagram for High Availability Legacy (Traditional) mode appears in
the corresponding section.

High Availability New mode

R75.40VS, R76, R77) - Chapter 'High Availability and Load Sharing in ClusterXL' ClusterXL Modes - High Availability Mode.
The High Availability New mode provides basic High Availability capabilities in a cluster
environment. The cluster provides Firewall services even when it encounters a problem,
which on a StandAlone module would have resulted in a complete loss of connectivity.
When combined with Check Point's State Synchronization, ClusterXL in High Availability
New mode can maintain connections through failover events, in a user-transparent manner,
allowing a flawless connectivity experience.
ClusterXL High Availability New mode designates one of the cluster members as the
Active machine, while the rest of the members are kept in a Standby mode.
High Availability New mode uses unique, real IP addresses for the cluster members
interfaces. The cluster Virtual IP addresses are associated with the physical network
interfaces of the Active machine (by matching the Virtual IP address with the unique MAC
address of the appropriate physical interface).
The cluster members physical IP addresses do not have to be routable on the Internet.
Only the cluster Virtual IP addresses must be routable.
All traffic directed at the cluster is actually routed (and filtered) by the Active member assigned traffic load is 100%. The role of each cluster member is chosen according to its
priority, with the Active member being the one with the highest priority (and lowest Member
ID).

Whenever the cluster detects a problem in the Active member that is severe enough to
cause a failover event, it passes the role of the Active member to one of the Standby
machines (the member with the next highest priority).
If State Synchronization is applied, any open connections are recognized by the new
Active machine, and are handled according to their last known state.
Upon the recovery of a member with a higher priority, the role of the Active machine
may or may not be switched back to that member, depending on the cluster configuration.
Full High Availability mode

Refer to ClusterXL Administration Guide (R70, R70.1, R71, R75, R75.20) - Chapter
'UTM-1 Clustering'.
Full High Availability (a.k.a. Full HA) mode is a special Cluster Mode that is supported
only on Check Point appliances running Gaia OS or SecurePlatform OS, where each
cluster member also runs as a Security Management Server.
This mode provides redundancy both between Security Gateways (only High Availability
is supported - assigned traffic load is 100%) and between Security Management Servers
(only High Availability is supported and there is no failover between the Security
Management server components).
sk39345 (Management High Availability restrictions)
sk61580 (How to rebuild a Full HA cluster after primary member fails and backup is
not available)
sk98831 ("Execution finished with errors" message on migrate import / export
command failure)
sk67681 (How to migrate from distributed environment to a UTM-1 Full HA cluster)
sk44201 (How to migrate Full HA environment to Distributed)
sk69627 (How to migrate from StandAlone configuration on Open Server to Full HA
cluster on UTM-1 appliances)
sk60443 (How To Install UTM-1 Appliances in Full HA cluster)
sk79200 (SmartView Monitor shows 'Log Server is not responding' error for a Full
HA cluster on Gaia OS)
sk36863 (Limitations of UTM-1 cluster that consists of different UTM-1 appliance
models)
Refer to 'cp_conf fullha' command.
Notes:
Unlike between the Security Gateway components, there is no failover between the
Security Management server components. If the Primary Security Management
Server goes down, the Secondary Security Management Server does not take over.
However, the database on the Secondary Security Management Server is fully
synchronized with the database on the Primary Security Management Server, so no
data is lost.

The members of Full HA cluster can be configured either together (both Check Point
appliances are linked before the First Time Configuration Wizard is opened), or
separately (the user chooses to configure a cluster consisting of a single, Primary
member, and configure the Secondary member at a later time).
Even if you decide not to install a Secondary cluster member during the initial
configuration, it is still worth your while to configure a cluster composed of a single
Primary member. A Full HA cluster is visible to the external network through its
Virtual IP addresses, not the actual physical addresses of its members. If at some
point you do decide to add a Secondary member, you will not have to alter the Layer
3 topology of your networks.
High Availability Legacy (Traditional) mode

R75.40VS, R76, R77) - Appendix A 'High Availability Legacy Mode'.
Note: High Availability Legacy Mode is not supported on Gaia OS (sk103083).
In High Availability Legacy mode, the cluster members share identical IP and MAC
addresses, so that the Active cluster member receives from a hub or switch all the packets
that were sent to the cluster IP address - assigned traffic load is 100%. A shared interface
is an interface with MAC addresses and IP addresses that are identical to those of another
interface on the peer members.
The Security Management Server must not be connected to these shared interfaces - in
other words, the synchronization network of the cluster, or to a dedicated management
network.
Configuring this mode is complicated, and must be performed in a precise sequence in
order to be successful.
Important Note: Although it is still supported, High Availability Legacy mode is
considered obsolete. There is no technical reason to use this mode anymore. If the cluster
administrator needs High Availability mode and the association of Cluster Virtual IP
addresses with constant unicast MAC addresses, then the cluster should be configured in
High Availability New mode with enabled VMAC (refer to sk50840).
Configuration instructions:
1. On Cluster Members
A. Disconnect the members from any switches / hubs
B. Install the same version of Check Point Security Gateway
C. Enable cluster membership
D. Configure identical IP addresses for shared interfaces
E. Do NOT reboot the machines
F. Connect the members through switch / hub - connect each network (internal,
external, Synchronization, DMZ, etc.) to a separate VLAN, switch or hub
G. Reboot the machines - for MAC Address configuration to take place

2. In SmartDashboard
A. Create a cluster object and select High Availability Legacy mode
B. Add members' objects - assign IP address from dedicated synchronization
network of the cluster, or from a dedicated management network
C. Initialize SIC
D. Define Topology:
No Virtual IP addresses for shared interfaces
Network Objective for shared interfaces is 'Monitored Private'
Sync interfaces, Network Objective for dedicated management interfaces
is 'Monitored Private' or 'Non-Monitored Private'
E. Install policy
F. Reboot the machines - MAC Address configuration will take place
Example:
Load Sharing Multicast mode

R75.40VS, R76, R77) - Chapter 'High Availability and Load Sharing in ClusterXL' ClusterXL Modes - Load Sharing Multicast Mode.
ClusterXL in Load Sharing Multicast mode distributes traffic within a cluster of Security
Gateways, so that the total throughput of multiple machines is increased.
In Load Sharing configurations, all functioning machines in the cluster are Active, and
handle network traffic (Active/Active operation) - assigned traffic load is 100% equally
divided by the number of active members.
If there is a failure in one of the machines, its connections are redistributed amongst the
remaining operational machines in the cluster.

If any individual Check Point Security Gateway in the cluster becomes unreachable,
transparent failover occurs to the other machines, thus providing High Availability. All
connections are shared between the remaining Security Gateways without interruption.
Load Sharing Multicast mode uses unique, real IP addresses for the cluster members
interfaces. The cluster Virtual IP addresses are associated with the Multicast MAC address
(created based on the Virtual IP addresses).
ClusterXL uses the Ethernet Multicast mechanism to associate the cluster Virtual IP
addresses with all cluster members. By binding these Virtual IP addresses to Multicast
MAC addresses, it ensures that all packets sent to the cluster, acting as a gateway, will
reach all members in the cluster.
Distribution of the traffic between cluster members is performed by applying a Decision
Function to each packet - each member decides whether it should or should not process
the packets.
This decision is the core of the Load Sharing mechanism: it has to assure that at least
one member will process each packet (so that traffic is not blocked), and that no two
members will handle the same packets (so that traffic is not duplicated).
If it is required that specific connections are always processed by particular member,
then additional decision algorithm can be enabled - Sticky Decision Function.
Load Sharing Unicast mode

R75.40VS, R76, R77) - Chapter 'High Availability and Load Sharing in ClusterXL' ClusterXL Modes - Load Sharing Unicast Mode.
The Load Sharing Unicast mode was developed in order to meet the needs of
customers, who use legacy routers that do not support the use of a multicast MAC address
for a unicast IP address (refer to Requirements for switches and routers section).
Customers, who work in such environment require a Load Sharing solution, but do not wish
or cannot afford to replace their existing hardware.
Historical fact: This mode was introduced in NG FP4.
One of the cluster members - machine with highest priority, called the Pivot, is the only
machine that communicates with the router. In this scheme, the router has to know and
deal with a single unicast MAC address only - thePivots MAC address. The Pivot
communicates with the router "on behalf" of the cluster, thus, enabling the usage of a
unicast traffic.
The Pivot is responsible for forwarding and distributing the traffic throughout the cluster,
while implementing both load sharing and redundancy solutions.
Load Sharing Unicast mode uses unique, real IP addresses for the cluster members
interfaces. The cluster Virtual IP addresses are associated with the physical network
interfaces of the Pivot machine (by matching the Virtual IP address with the unique MAC
address of the appropriate physical interface).

Distribution of the traffic by Pivot member is performed by applying a Decision Function
to each packet, the same way it is done in Load Sharing Multicast mode. The difference is
that only one member (Pivot) performs this selection: any non-Pivot member that receives
a forwarded packet will handle it, without applying the Decision Function.
If it is required that specific connections are always processed by particular member,
then additional decision algorithm can be enabled - Sticky Decision Function.
Note that non-PivotmembersarestillconsideredasActive,sincetheyperformrouting
and Firewall tasks on their share of the traffic (although they do not perform decisions).
Default traffic load assignment:
Cluster size
% of traffic handled
(including Pivot)
by the Pivot
1
100
2
33
3
20
4
10
5
0
6 and more
0
% of traffic handled by
each of the other members
0
67
40
30
20
100 / cluster size

o sk101539 (ClusterXL Load Sharing mode limitations and important notes)
o sk34668 (How to modify the assigned load between the members of ClusterXL in
Load Sharing Unicast mode)
o sk61331 (ClusterXL Load Sharing in Unicast (Pivot) mode - after second reboot of
Pivot member, output of 'cphaprob stat' on non-Pivot member shows wrong
assigned load as 0% for Pivot and 100% for non-Pivot)
VRRP
Refer to Gaia Administration Guide (R75.40, R75.40VS, R76, R77) - Chapter 'High
Availability'.
Virtual Router Redundancy Protocol (VRRP, RFC 3768) provides dynamic failover of IP
addresses from one router (Master) to another router (one of the Backup routers) in the
event of failure. VRRP allows you to provide alternate router paths for end hosts.
The Check Point VRRP implementation on Gaia OS includes functionality called
Monitored Circuit VRRP. Monitored Circuit VRRP prevents connection issues caused by
asymmetric routes created when only one interface on Master router fails (as opposed to
the Master itself).
Each VRRP cluster, known as a Virtual Router, has a unique identifier, known as the
VRID (Virtual Router Identifier). A Virtual Router can have one or more virtual IP addresses
(VIP) to which other network nodes connect as a final destination or the next hop in a route.

By assigning a Virtual IP address (VIP), you can define alternate paths for nodes
configured with static default routes. Only the Master router is assigned a VIP. The Backup
router is assigned a VIP upon failover when it becomes the new Master. Nodes can have
alternate paths with static default routes in the event of a failure.
Static default routes minimize configuration and processing overhead on host
computers.
Important Note: You cannot have a standalone deployment (Security Gateway and
Security Management server on the same computer) in a Gaia VRRP cluster.
sk70380 (Gaia FAQ - Frequently Asked Questions)
sk69684 (Using VRRP with Check Point 2012 Security Appliances)
sk92061 (How to configure VRRP on Gaia)
sk66569 (IPSO-to-Gaia Upgrade Scripts and VRRP Cluster Upgrade Instructions)
sk86881 (Changing the High Availability configuration from ClusterXL and VRRP (or
from VRRP to ClusterXL) requires reboot)
sk40278 (VRRP configuration is not updated when the logical interface information
(IP address) is changed)
sk92880 (It is not possible to configure preempt in Simplified VRRP on IPSO and
Gaia)
sk89980 (Sub-interfaces / Alias IP address / Secondary IP address on Gaia)
Bridge
ClusterXL in Bridge Mode is supported only in R75.40VS / R76 / R77 and above.
Refer to sk101371 (Bridge Mode on Gaia OS and SecurePlatform OS).
Sticky Decision Function

Cluster administrator should learn about Sticky Decision Function, which enables
certain services to operate in a Load Sharing deployment. For example, it is required for
L2TP traffic, or when the cluster is a participant in a Site-to-Site VPN tunnel with a 3rd party
peer.
The Sticky Decision Function has the following limitations:
Sticky Decision Function is not supported when employing either Performance Pack
or a hardware based accelerator card. Enabling the Sticky Decision Function
disables these acceleration products.
When the Sticky Decision Function is used in conjunction with VPN, cluster
members are prevented from opening more than one connection to a specific peer.
Opening another connection would cause another SA to be generated, which a thirdparty peer, in many cases, would not be able to process.
Sticky Decision Function does not maintain stickiness in VPN Routing (back-to-back
VPN) gateways where both sides of the connection are encrypted.

Sticky Decision Function supports the following sharing methods (configured in

SmartDashboard - cluster object - go to 'ClusterXL' pane - select 'Load Sharing' - click on
'Advanced...' - check the box 'Use Sticky Decision Function'):
IPs, Ports, SPIs (default) - provides the best sharing distribution, and is
recommended for use.
It is the least "sticky" sharing configuration.
Clarification:
A connection will stick to a cluster member based on IP addresses and based on
Ports.
Example:
Connection from IP_1:Port_1 to IP_2:Port_2 will stick to Member_A. Connection
from IP_1:Port_2 to IP_2:Port_2 might stick to Member_B.
IPs, Ports - should be used only if problems arise when distributing IPSec packets
to a few machines although they have the same source and destination IP
addresses.
IPs - should be used only if problems arise when distributing IPSec packets or
different port packets to a few machines although they have the same source and
destination IP addresses.
It is the most "sticky" sharing configuration.
In other words, it increases the probability that a certain connection will pass through
a single cluster member on both inbound and outbound directions.
Clarification:
A connection will "stick" to a cluster member based only on IP addresses.
Example:
All connections from IP_1 (from any port) to IP_2 (to any port) will stick to the same
Member_A.
Warning:
Since all connections between the given IP addresses will stick to the same
member, the CPU load on that member might increase significantly, which in turn
will negate the whole purpose of Load Sharing cluster mode.
Note:
Sticky Decision Function is enabled automatically, if Mobile Access Software Blade is
enabled on the cluster.
For more details, refer to ClusterXL Administration Guide (R70, R70.1, R71, R75,
R75.20, R75.40, R75.40VS, R76, R77) - Chapter 'Sticky Connections'.

Forwarding
Forwarding is a process of transferring of an incoming traffic from one cluster member
to another cluster member for processing
There are two methods of forwarding the incoming traffic:
Packet forwarding
o the packet is forwarded to the target member
o the packet will skip the inbound chain and get directly into the IP stack of
target machine
Chain forwarding
o the chain is forwarded to the target member
o the chain will start the chain process from the chain module that has asked
the chain forwarding (or the one after it)
Packet forwarding
Example:
A connection was initiated on the 'Standby' member in High Availability cluster. The
reply packets to such connection will be accepted by 'Active' member, and must be
forwarded to 'Standby' member.
Description:
The sending cluster member forwards the packet at the end of the Inbound processing.
On the target cluster member, the processing of the forwarded packet will continue from
the chain at which it has stopped on the source cluster member, or the packet will be
entered directly into the TCP/IP stack (if the packet has already passed through all
Inbound chains).
Debugging:
In order to see how a packet is forwarded between cluster members, debug the
'cluster' module with 'forward' flag (in addition, these flags are recommended:
'select', 'if', 'mac'):
[Expert@GW_HostName]# fw ctl debug -m cluster + forward select if mac
Technical details:
Packet Forwarding is performed in the following way (so that the target cluster member
can understand that this packet is intended to him):
In High Availability mode, the connection is forwarded over Synchronization

Network.
Description:
Since the processed packet may be already decrypted, it must be sent over
the secured interfaces.
On the receiving side, the machine will not pass this packet to the FireWall (the
packet will not perform the inspection again), but instead the packet is passed
directly to the IP stack of the operating system.

o Layer 2 Source MAC address of the packet is changed to:

In ClusterXL:
1st
2nd
3rd
4th
5th
6th
fwha_mac_forward_magic
ID_of_Target_Member
00
00
00
00
Notes:
fwha_mac_forward_magic - name of the kernel parameter that controls
the value of 5th byte in forwarded packets (default value is 0xFD hex / 253 dec)
Refer to sk25977 (Connecting multiple clusters to the same network segment
(same VLAN, same switch)
Refer to sk95150 (When the Synchronization interfaces of three and more
ClusterXL members are connected to the same switch, port flapping occurs
on the switch)
In VSX cluster (any VSX cluster works in High Availability mode):
1st
2nd
3rd
4th
5th
6th
XXXXXXXX
fwha_mac_forward_magic
YYYZZZZZ
00
00
00
Notes:
XXXXXXXX - 8 most significant bits of VSID
fwha_mac_forward_magic - name of the kernel parameter that controls
the value of 5th byte in forwarded packets (default value is 0xF5 hex / 245 dec)
YYY - 3 least significant (right-most) bits of VSID
ZZZZZ - ID of target cluster member
o Layer 2 Destination MAC address of the packet is changed to the MAC address
of the Sync interface on peer member.
o Layer 3 Source IP address is the IP address of the host that sent the original
packet.
o Layer 3 Destination IP address is the physical IP address of the cluster member
on that subnet.
o The packet is dropped on the member that forwarded the packet (log is
generated only if forwarding fails).
Debug:
o In order to see the forwarding process, run the debug of 'cluster' module with
'forward' flag and of 'fw' module with 'drop' flag on Active member:
fwha_forward_msg_wrapper(if_number direction position
Target_Member_ID): forwarding
FW-1: fwha_forward_send_msg: Forwarding packet to id Target_Member_ID
fwha_forw_flush_callback: Forwarded successfully. Dropping chain
fw_log_drop: Packet proto= ... dropped by fwhaforw.c:LINE Reason:
unknown;

o In order to see the arrival of forwarded packet, run the debug of 'cluster'
module with 'select' flag on Standby member:
FW-1: FORWARDED Packet : fwha_select_ip_packet: (IF if_name (if_number)
at N sec) using magic ether header (0xZZZZZZZZ)
Example:
Layer 2 and Layer 3 Addresses:

Machine
MAC address
00:50:56:c0:00:01
Host
00:0C:29:DB:26:47
Active - eth0 (ext)
00:0C:29:DB:26:51
Active - eth1 (sync)
00:0C:29:72:56:47
Standby - eth0 (ext)
00:0C:29:72:56:51
Standby - eth1 (sync)
IP address
192.168.204.1
192.168.204.10
10.10.10.10
192.168.204.12
10.10.10.12
Traffic flow: Standby member initiates a TCP connection to the Host.

Packet flow:
1. The TCP SYN packet is sent by Standby over the External (eth0) with:
o Source MAC address 00:0C:29:72:56:47 (Standby ext eth0)
o Destination MAC address 00:50:56:c0:00:01 (Host)
o Source IP address 192.168.204.12 (Standby ext eth0)
o Destination IP address 192.168.204.1 (Host)
2. The SYN ACK packet is sent by Host over the External (eth0) with:
o Source MAC address 00:50:56:c0:00:01 (Host)
o Destination MAC address 00:0C:29:DB:26:47 (Active)
o Source IP address 192.168.204.1 (Host)
o Destination IP address 192.168.204.20 (cluster Virtual IP)
3. The SYN ACK is forwarded over the Sync (eth1) with:
o Source MAC address 00:00:00:00:FD:01 (ClusterXL Forwarding layer)
o Destination MAC address 00:0C:29:72:56:51 (Standby sync eth1)
o Destination IP address 192.168.204.12 (Standby ext eth0)
Refer to sk95150 (When the Synchronization interfaces of three and more ClusterXL
members are connected to the same switch, port flapping occurs on the switch).

In Load Sharing Multicast mode, the connection arrives to all cluster members,
and each member decides whether it should process the packet or not.
When the Sticky Decision Function (SDF) is used, refer to sk95150 (When the
Synchronization interfaces of three and more ClusterXL members are connected to
the same switch, port flapping occurs on the switch).
In order to see the arrival of forwarded packet, run the debug of 'cluster' module
with 'select' flag on receiving member:
o If the local member should process this packet, the following is printed:
FW-1: fwha_select_ip_packet: Packet IN SourceIP_in_Hex->DestIP_in_Hex
FW-1: fwha_local_member_should_procces_mc: local member should process
packet
FW-1: fwha_select_ip_packet: Packet was filtered by member Member_ID
o If the local member should not process this packet, the following is printed:
FW-1: fwha_select_ip_packet: Packet IN SourceIP_in_Hex->DestIP_in_Hex
FW-1: fwha_local_member_should_procces_mc: local member should not process
packet
FW-1: fwha_select_ip_packet: Packet was dropped by member Member_ID
In Load Sharing Unicast mode, the connection is forwarded over the same
interface, on which it was received - not over Synchronization Network.
o Layer 2 Source MAC address of the packet is inverted and combined in a special
way with values of these kernel parameters: fwha_mac_magic and
fwha_mac_forward_magic.
Notes:
fwha_mac_magic - controls the value of 5th byte in Source MAC address of
CCP packets (default values is 0xFE hex / 254 dec)
fwha_mac_forward_magic - controls the value of 5th byte in Source MAC
address of forwarded packets (default values is 0xFD hex / 253 dec)
o Layer 2 Destination MAC address of the packet is changed to the MAC address
of the non-Pivot cluster member on the same subnet.
o Layer 3 Source IP address is the IP address of the host that sent the original
packet.
o Layer 3 Destination IP address is the physical IP address of the cluster member
on that subnet.
o The packet is dropped on the member that forwarded the packet (log is
generated only if forwarding fails).
Refer to sk41898 (Connecting multiple clusters running in Load Sharing Unicast
mode results in MAC Address flapping on switches).

Debug:
o In order to see the forwarding process, run the debug of 'cluster' module with
flags 'pivot' and flag 'select' on Pivot member:
FW-1: fwha_pivot_selection_from_packet: packet forwarded ok to machine
Target_Member_ID
fwhamultik_handle_ip_packet: Dropping packet since it is not my packet,
packet was forwarded (LS pivot)
o In order to see the arrival of forwarded packet, run the debug of 'cluster'
module with 'select' flag on non-Pivot member:
fwha_select_ip_packet: The inverted back source MAC address will be XXXX-XX-XX-XX-XX
Example:
Layer 2 and Layer 3 Addresses:

Machine
MAC address
00:50:56:c0:00:01
Host
00:0C:29:DB:26:47
Pivot - eth0 (ext)
00:0C:29:72:56:47
non-Pivot - eth0 (ext)
IP address
192.168.204.1
192.168.204.10
192.168.204.12
Traffic flow: Pivot cluster member receives a TCP connection from Host and
forwards it to the non-Pivot cluster member.
Packet flow:
1. Pivot cluster member performs bit-wise 'NOT' on the 4 last octets (from the left)
of the Source MAC address of the packet.
Hence, in our example:
00:50:56:c0:00:01 becomes 00:50:A9:3F:FF:FE.
2. Pivot cluster member performs bit-wise 'AND' between:
o the value of fwha_mac_magic kernel parameter
o the value of fwha_mac_forward_magic kernel parameter
Let us take the default values:
o fwha_mac_magic=0xFE
o fwha_mac_forward_magic=0xFD
[fwha_mac_magic AND fwha_mac_forward_magic] =
[(0xFE) AND (0xFD)] = 0xFC.

3. Pivot cluster member performs bit-wise 'XOR' between:

o the 3rd octet (from the left) of the 'NOT'-ed Source MAC address from Step 1
o the result of 'AND' operation from Step 2
[(3rd octet of NOT-ed Host MAC) XOR
(fwha_mac_magic AND fwha_mac_forward_magic)] =
[(
3rd octet of 00:50:A9:3F:FF:FE) XOR (
0xFC)] =
[(A9) XOR (0xFC)] = 55
4. Pivot cluster member replaces the 3rd octet (from the left) of the 'NOT'-ed
Source MAC address from Step 1 with the result from Step 3.
00:50:A9:3F:FF:FE becomes 00:50:55:3F:FF:FE.
5. Therefore, in our example, the final inverted Source MAC address during the
packet forward will be:
00:50:55:3F:FF:FE.
6. Pivot cluster member forwards the packet through the original interface (eth0)
towards the non-Pivot cluster member with:
o Source MAC address 00:50:55:3F:FF:FE(final inverted MAC of Host)
o Destination MAC address 00:0C:29:72:56:47 (non-Pivot eth0)
o Destination IP address 192.168.204.12 (non-Pivot eth0)
7. The non-Pivot cluster member performs the reversed inversion in order to extract
the original Source MAC address of the Host (to use it later as the Destination
MAC address in the packets).
Note: The only information that can be seen in kernel debug is the original Source
MAC address of the packet after the non-Pivot cluster member performs the
reversed inversion.
In order to see the original Source MAC address, debug the 'cluster' module with
'select' flag on the non-Pivot cluster member:
[Expert@GW_HostName]# fw ctl debug -m cluster + select
The following will be printed (based on our example):

;fwha_select_ip_packet: The inverted back source MAC address will
be 00:50:56:c0:00:01;

Chain forwarding
Example:
A connection was initiated that requires inspection by Check Point Active Streaming
(CPAS) - e.g., SMTP Security Server.
Description:
Chain forwarding enables one cluster member to pass a chain (a packet filtered by a
FireWall module, along with data attached to the packet by the different
handling routines) to another cluster member.
Thus, the second member can resume the handling process at the same point the first
member has ceased.
Starting in NGX R60, chain forwarding is also used for Dynamic Routing.
Debugging:
In order to see how a chain is forwarded between cluster members, debug the 'fw'
module with 'chainfwd' flag (in addition, these flags are recommended: chain',
'conn', 'packet'):
[Expert@GW_HostName]# fw ctl debug -m fw + chainfwd chain conn packet
Technical details:
In CPAS case, packet forwarding cannot be used because in order to use
packet forwarding, the chain must finish passing through all the chain modules. But
since all the information that CPAS holds on this connection is located only on the other
member, the chain cannot be processed by CPAS, and therefore should be forwarded
to the member that handled this connection originally.
CPAS information is not forwarded between members because of the size of
information that will need to be synchronized and will cause performance issues.
The Forwarding Layer will receive a packed chain on the source cluster member, and
will transmit it to the target cluster member. Any table updates, which are the result of a
transmitted chain, will be applied to the target member before the chain is delivered for
processing on that machine.
Packet Forwarding is performed in the following way (so that the target cluster member
can understand that this packet is intended to him):
In case, the target member is down, but its Sync interface is still up, the chain will be
forwarded to it and handled by it.
In case, the Sync interface is down, the chain will be dropped by the source
member.
Note:
Why not forwarding the packet and starting it at the beginning of the chain? Because in
that case, the original packet needs to be kept (before the changes that made to it by
the chain modules) and the entire table changes that were made need to be undone,
because they will be on the target member again. It appears that such implementation is
more complicated.

ClusterXL Configuration
sk66527 (Recommended configuration for ClusterXL)
sk42096 (Cluster member is stuck in 'Ready' state)
Clock synchronization
In order to improve cluster stability, the clocks on all cluster members must be
synchronized. Although cluster members are able to deal with difference within 1 hour
(VPN has much stricter limit of several minutes), it is strongly recommended to use NTP on
cluster members.
sk25894 (Configuring NTP on SecurePlatform OS)
sk76600 (How to confirm NTP settings on SecurePlatform OS)
sk83820 (How to configure Advanced NTP features on Gaia OS)
sk92379 (How to configure NTP authentication on Gaia OS)
sk38957 (NTP FAQ for IP appliances)
sk41502 (How to adjust the polling interval in NTP on IP appliances)
sk62845 (How to enable or disable NTP on IP appliances)
sk62861 (How to verify that NTP is working on IP Appliances)
Preparing cluster members

Refer to ClusterXL Requirements for Hardware and Software section.
Important Note: Configuring cluster for the first time with a single member is not
supported - it will not be possible to install policy for the first time onto cluster object that
contains only a single member.
If an administrator plans to install the cluster of several members, but currently has only
one machine, then there are only these two options:
1. To avoid/minimize traffic outage in the future when adding 2nd member:
A. Prepare the switches
B. Connect the existing machine
C. Connect some temporary machine (in this case, only the operating system and
the number of interfaces matter)
D. Configure the cluster object of 2 members
E. Install policy for the first time
F. Shut down this temporary machine with 'cpstop' and 'shutdown' commands
G. The existing machine will function as a cluster member with VIP, etc. (this
member's state will be 'Active Attention')

H. When the 2nd desired machine is available to be installed as 2nd member:

a) Install the 2nd member on desired machine
b) In SmartDashboard, reset the SIC in the object of temporary member
c) Initialize the SIC with new member
d) Get topology from new member
e) Verify the configuration of cluster interfaces
f) Save the changes
I. Install policy onto cluster object
2. Configure the existing machine as a single gateway and suffer from traffic outage in
the future when adding 2nd member:
A. Prepare the switches
B. Connect only the existing machine
C. Configure object of singe Security Gateway
D. Initialize the SIC with existing machine
E. Install the policy
F. When the 2nd desired machine is available to be installed as 2nd member:
a) Install the 2nd machine
b) Configure object of singe Security Gateway for 2nd machine
c) Initialize the SIC with new machine
G. Create a cluster object
H. Add existing security gateway objects as members
I. Get topology from all members
J. Configure cluster interfaces
K. Install policy onto cluster object
To prepare the cluster member machines:
1. For IPSO clusters, configure VRRP or IP Clustering before installing Check Point
Security Gateway.
Notes:
Before the initial policy installation, make sure that the 'Enable VPN-1/FW-1
monitoring' is set to 'Disable' in the IPSO Network Voyager.
After the installation has finished, make sure that the 'Enable VPN-1/FW-1
monitoring' is set to 'Enable' in the IPSO Network Voyager. This assures that
IPSO will monitor changes in the status of the Check Point cluster member (the
state of the firewall is reported to the IPSO cluster for failover purposes since
IPSO 3.8.2).
Refer to sk39008 (What does 'Monitor Firewall State' actually monitor (VRRP and
IP Clustering) and how does it influence availability).
2. For OPSec certified clusters, follow the vendor recommendations.
3. Install Check Point Security Gateway on all cluster member machines.
During the first time configuration phase (or later, using the 'cpconfig' command):
Install a license for Check Point Security Gateway on each cluster member.
No special license is required to allow the OPSec certified product to work with the
Security Gateway.
During the configuration phase, enable State Synchronization by selecting
'Enable cluster membership for this gateway' on Unix machines,
or 'This Gateway is part of a cluster' on Windows machines.
4. Define IP addresses on all relevant interfaces on all the cluster members.

5.
6.
7.
8.
Notes:
Unused interfaces must be configured as 'Disconnected' (refer to Defining
Disconnected Interfaces section).
Alias IP addresses are not supported by ClusterXL. Refer to sk31821 (Traffic that
is sent to Secondary IP addresses / Alias IP addresses that were defined on
interfaces of ClusterXL members is not processed).
Configure identical number of CoreXL FW instances on cluster member machines
(using the 'cpconfig' command).
Configure SecureXL in identical way on cluster member machines (using the
'cpconfig' command and the 'sim affinity -s' command).
Connect the cluster member machines via the switches.
For the Synchronization interface(s), due to security reasons, a crossover cable or a
dedicated switch is recommended.
Proceed to the next section - configuration in SmartDashboard.
Configuring cluster object in SmartDashboard

R75.40VS, R76, R77) - Chapter 'Configuring ClusterXL' - Configuring Cluster Objects &
Members.
Important Note: Configuring cluster for the first time with a single member is not
supported. Refer to Preparing Cluster Members section.
To define a new Gateway Cluster object:
1. In the Network Objects tree, right-click on Check Point and then select Security
Cluster.
2. If the Security Gateway Cluster Creation window appears, select one of the
following methods to create your new cluster object:
o Simple Mode (Wizard), which guides you step by step through the configuration
process.
o Classic Mode, which allows to configure all relevant settings at once.
Note: In order to make cluster troubleshooting easier, configured the following in
SmartDashboard - cluster object - ClusterXL - under Tracking, set 'Track changes in
the status of cluster members' to 'Log' , or 'SNMP Trap Alert' (refer to SNMP
section).
Use the cluster object Topology page to configure the topology for the cluster object
and its members.
Pay attention to the names of the members' interfaces - they must match the names of
the interfaces are assigned by the operating system, subject to the guidelines provided in
sk30154 ($FWDIR/log/fwd.elg shows repeatedly - 'fwarp_initialize_myself: unable to find
mac address of interface IF_NAME').

Configuring routing on networks around the cluster

Hosts and networking devices should forward all traffic to cluster Virtual IP address on
their subnet and not to physical IP addresses of cluster members.
Refer to Requirements for switches and routers section and to Configuring Cluster
Addresses on Different Subnets section.
Example:
In the topology depicted below, on the Host, the Default Gateway has to be configured
with IP address 192.168.204.20:
CCP mode
The ClusterXL Control Protocol (CCP) uses multicast by default, because it is more
efficient than broadcast.
If the connecting switch cannot forward multicast traffic, it is possible, though less
efficient, for the switch to use broadcast to forward traffic.
Refer to CCP modes section and to Requirements for switches and routers section.
ClusterXL High Availability for IPv6

Starting in R76, ClusterXL supports IPv6. All IPv6 status information is synchronized
and the IPv6 clustering mechanism is activated during failover.
ClusterXL performs both state synchronization and clustering for IPv6 as with IPv4. For
this to work, in SmartDashboard, you must define IPv6 addresses for all cluster interfaces.
In case of IPv4, during cluster failover, cluster sends Gratuitous ARP Request packets
to update an ARP cache of hosts/routers connected to the cluster interfaces, by advertising
the new MAC address for the cluster Virtual IPv4 addresses.
In case of IPv6, during cluster failover, cluster uses Neighbor Discovery Protocol (NDP)
and sends Neighbor Advertisement messages to update the neighbor cache of
hosts/routers connected to the cluster interfaces, by advertising the new MAC address for
the cluster Virtual IPv6 addresses. In addition, ClusterXL will reply to any Neighbor
Solicitation with a target address equal to the Cluster Virtual IPv6 address.

Note: ClusterXL failover event detection is based on IPv4 probing (refer to the definition
of 'probing' and of 'pingable host' in Clustering Definitions and Terms section).
During state transition, the IPv4 driver instructs the IPv6 driver to reestablish IPv6
network connectivity to the cluster.
To enable IPv6 functionality for an interface, define an IPv6 address for the applicable
interface on the cluster and on each member. All interfaces configured with an IPv6
address must also have a corresponding IPv4 address. If an interface does not require
IPv6, only the IPv4 definition address is necessary.
Note: You must configure synchronization interfaces with an IPv4 address only. This is
because the synchronization mechanism works using IPv4 only. All IPv6 information
and states are synchronized using this interface.
In an IPv6 environment, the 'cphaprob -a if' command shows both the cluster
Virtual IPv4 addresses and cluster Virtual IPv6 addresses.
sk35178 (How to set up IPv6 in ClusterXL)
sk34552 (How to set up IPv6 on SecurePlatform)
sk39374 (IPv6 Support FAQ)
sk78220 ("fw ctl pstat" command shows "Sync: off" on cluster members when IPv6 is
enabled in R75.40 and above)
sk91905 (Configuring Proxy NDP for IPv6 Manual NAT)
sk92368 (ATRG: IPv6)
Defining 'Disconnected' interfaces

For more information, refer to ClusterXL Administration Guide (R70, R70.1, R71, R75,
R75.20, R75.40, R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' Defining Disconnected Interfaces.
Disconnected interfaces are cluster member interfaces that are not monitored by the
ClusterXL mechanism.
Important Notes:
Unused interfaces must be defined as 'Disconnected' in order to avoid cluster
flapping.
Never define sync interface as 'Disconnected'.
This configuration applies only to physical interfaces.
Starting from Gaia R75.47, R77.20, the $FWDIR/conf/discntd.if file is not needed
anymore. Any interface, which is not part of cluster topology, will be counted as
disconnected.

Procedure:
Interfaces that have IP addresses configured, can be defined as 'Disconnected' via
special configuration file / registry key (see below), or in SmartDashboard:
In SmartDashboard - open cluster object - go to 'Topology' pane - click on
'Edit...':
o If the unused interfaces are present, then set their Network Objective to 'NonMonitored Private' and install policy.
o If the unused interfaces do not appear in the topology yet, then click on 'Get...'
- select 'All Members' Interfaces...', then set their Network Objective to 'NonMonitored Private' and install policy.
Interfaces that do not have IP addresses configured, can be defined as

'Disconnected' only in this way (procedure also applies to interfaces with configured
IP address):
o UNIX OS:
A. Create the $FWDIR/conf/discntd.if file (if does not exist yet)
B. Add the name of each relevant physical interface on a separate line
C. Save the changes in the file
D. Restart ClusterXL with 'cphastop;cphastart' commands
o Windows OS:
A. Open Windows Registry editor (Start - Run... - regedit)
B. Go to
HKEY_LOCAL_MACHINES\System\CurrentControlSet\Services\CPH
A\
C. Add a new key:
Value Name: DisconnectedInterfaces
Data Type: REG_MULTI_SZ
D. Check the names of the interfaces as assigned by the system (Start - run... cmd):
fw getfs
E. Add the name of each relevant physical interface using the following format:
\device\System_Interface_Name
F. Restart ClusterXL with 'cphastop;cphastart' commands

sk30060 (SmartView Tracker repeatedly shows messages "cluster_info: (ClusterXL)
interface is down / up")
sk65826 (Cluster member is 'Down' because normal operational interface was
configured as 'Disconnected')
sk93037 (Output of 'cphaprob state' command on Crossbeam chassis cluster shows
only local member)
sk52020 (The $FWDIR/conf/discntd.if configuration file on ClusterXL member
contains the 'pimreg' entry, which was not added by the user)

SecureXL
Refer to Requirements for software section.
Refer to Performance Pack Administration Guide (R70, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77).
Refer to R70 Performance Optimization Guide.
sk25972 (About SecureXL Performance Pack)
sk32578 (SecureXL Mechanism)
sk98348 (Best Practices - Security Gateway Performance)
sk71200 (SecureXL NAT Templates)
sk67861 (Accelerated Drop Rules Feature in R75.40 and above)
sk66402 (SecureXL Drop Templates are not supported in versions lower than R76)
CoreXL
Refer to Firewall Administration Guide (R70, R71, R75, R75.20, R75.40, R75.40VS Chapter 'CoreXL Administration'; R76, R77 - Chapter 'Maximizing Network Performance' CoreXL).
Refer to Performance Tuning Administration Guide (R76, R77) - Chapter 2 'CoreXL
Administration'
sk61701 (CoreXL Known Limitations)
sk35990 (How Connections Table limit capacity behaves in CoreXL)
sk36151 (Maximum Concurrent Connections in CoreXL)
sk62620 (What is the fw_worker_X process?)
VPN
R75.20, R75.40, R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' Working with VPNs and Clusters.
For 3rd party VPN products, refer to vendor's documentation.


sk92332 (Customizing the VPN configuration for Check Point Security Gateway 'vpn_table.def' file)
sk108600 (VPN Site-to-Site with 3rd party)
sk35383 (How to configure VPN between Check Point cluster and a VPN-1 UTM
Edge with WAN High Availability (HA))
NAT
Network Address Translation (NAT) is a fundamental aspect of the way ClusterXL
works.
When a packet leaves cluster member, the source IP address in the outgoing packet, is
the physical IP address of the cluster member interface.
The source IP address is changed using NAT to that of the Virtual IP address of the
cluster on that subnet.
This address translation is called "Cluster Hide".
The packet sent to the cluster Virtual IP address is accepted by one of the cluster
members. The destination IP address in the incoming packet is changed using NAT to that
of the physical IP address of the cluster member interface on that subnet.
This address translation is called "Cluster Fold".
For OPSec certified clustering products, this corresponds to the default setting (in
SmartDashboard) in the 3rd Party Configuration page of the cluster object, of Forward
Cluster's incoming traffic to Cluster Members' IP addresses being checked.
R75.20, R75.40, R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' Working with NAT and Clusters.
For OPSec certified clustering products, refer to vendor's documentation.
sk31832 (How to prevent ClusterXL / VRRP / IPSO IP Clustering from hiding its own
traffic behind Virtual IP address)
sk32224 (NAT Table 'fwx_alloc')
sk30197 (Configuring Proxy ARP for Manual NAT)
VLAN
When defining VLAN tags on an interface, cluster IP addresses can be defined only on
the VLAN interfaces (the tagged interfaces).
Defining a cluster IP address on a physical interface that has VLANs is not supported.
This physical interface has to be defined with the Network Objective Monitored Private.
Note: Refer to CCP and VLAN interfaces section.

R75.20, R75.40, R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' Working with VLANs and Clusters.
In addition, refer to the Release Notes of the given version.
sk92826 (ClusterXL VLAN monitoring)
sk61323 (Monitoring of VLAN interfaces in ClusterXL)
sk92784 (Configuring VLAN Monitoring on ClusterXL for specific VLAN interface)
Link Aggregation (Bonding)

Refer to ClusterXL Administration Guide (R65, R70, R70.1, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' - Link Aggregation and
Clusters.
Overview
Link Aggregation, NIC Teaming, Bonding of interfaces, Bond interface - all refer to the
same redundancy technology of physical Network Interface Cards (NICs) - a virtual
interface, defined on the OS, similar to a physical interface - where the physical bonded
interfaces are set to act as a single interface, using the same MAC address and the same
IP address.
Each physical interface in a Bond is called a slave of that bond. Enslaved interfaces do
not function independently of the bond.
The interface bonding supplies High Availability in case of interface failure and, in Load
Sharing mode, can significantly increase total throughput.
Figure below depicts Bonded interfaces:
In this scenario:
GW-1 is a single gateway, or a cluster
member
S-1 and S-2 are switches
eth0 and eth1 are bonded slave
interfaces
eth0 is the Active slave interface
eth1 is the Standby slave interface
bond0 is the name of the bond
If GW-1 should lose connectivity with the
currently active switch, it is able to detect the
failure and initiate an internal failover to eth1.
Note: Link Aggregation is supported on SecurePlatform, Gaia, and IPSO OS.

Introduction
When dealing with mission-critical applications, an enterprise requires its network to be
highly available.
Clustering provides redundancy at the gateway level.
However, without Link Aggregation - redundancy of Network Interface Cards (NICs), or
redundancy of the switches on either side of the gateway are only possible in a cluster, and
only by failover of the gateway to another cluster member.
Configuration
Refer to these User Guides:
How to Configure ClusterXL for L2 Link Aggregation on SecurePlatform and Gaia
OS
How to Configure Link Aggregation Groups on IPSO OS
Start with these ClusterXL Administration Guides (because the Link Aggregation
support was added for the first time in these versions):
R65 ClusterXL Administration Guide - Chapter 'ClusterXL Advanced Configuration' Working with Link Aggregation and Clusters - Configuring Interface Bonds
R70 ClusterXL Administration Guide - Chapter 'ClusterXL Advanced Configuration' Working with Link Aggregation and Clusters - Configuring Interface Bonds
R70.1 ClusterXL Administration Guide - Chapter 'ClusterXL Advanced Configuration'
- Link Aggregation and Clusters
In addition, refer to ClusterXL Administration Guide (R71, R75, R75.20, R75.40,
Clusters.
Important Note: It is mandatory to define the physical slave interfaces (that will
comprise the bond interface) as 'Disconnected'. Refer to Defining Disconnected Interfaces
section.
Link Aggregation can be configured in one of these two modes:
High Availability (Active/Backup) mode (supported since R65) - only one interface
at a time is active.
Upon interface failure, the bond fails over to another interface.
Different slave interfaces of the bond can be connected to different switches, to
benefit from high availability of switches in addition to high availability of interfaces
(refer to Fully Meshed Redundancy via Interface Bonding section above).
Load Sharing (Active/Active) mode (supported since R70.1 / VSX R67) - all
interfaces are active, for different connections.
Connections are balanced between interfaces according to Layer 3 and Layer 4, and
follow either the IEEE 802.3ad standard, or XOR.
Load Sharing mode has the advantage of increasing throughput, but requires
connecting all the interfaces of the bond to one switch (which must support LACP).

For both Link Aggregation High Availability mode and for Link Aggregation Load
Sharing mode:
The number of bond interfaces that can be defined is limited by the maximal number
of interfaces supported by each platform (refer to Release Notes of each given
version).
Up to 8 slave NICs can be configured in a single High Availability bond or Load
Sharing bond.
Link Aggregation - High Availability mode

Link Aggregation provides high availability of NICs. If one fails, the other can function in
its place. This functionality is provided by Link Aggregation in both High Availability mode
and Load Sharing mode.
High Availability mode of Link Aggregation, when deployed together with ClusterXL,
enables a higher level of reliability through granular redundancy in the network topology.
This granular redundancy is achieved with a Fully Meshed Topology, which effectively
provides independent backups for both NICs and switches.
Redundant (High Availability) Topologies:
Simple Redundant Topology without Interface Bonding
Fully Meshed Redundancy via Interface Bonding
Simple Redundant Topology without Interface Bonding
In the case of switch or Security Gateway failure, a High Availability cluster solution
provides system redundancy.
Figure below depicts a redundant system without Link Aggregation (two synchronized
Security Gateways - cluster members) deployed in a simple redundant topology:
In this scenario:
GW-1 and GW-2 are cluster members
C-1 and C-2 are interconnecting
networks
Cluster members GW-1 and GW-2 each have
one external NIC connected to an external
switch (S-1 and S-2, respectively).
In the event of a failure of either Active cluster
member GW-1, its NIC (on C-X), or switch S1, cluster member GW-2 becomes the only
Active gateway, connecting to switch S-2 over
C-2.

In any of the 3 cases (gateway failure, NIC failure or switch failure), the result of the
failover is that no further redundancy exists, and a further failure of any active component
will completely stop network traffic.
Link Aggregation provides high availability of NICs. If one fails, the other can function in
its place. This functionality is in Bond High Availability mode and in Bond Load Sharing
mode.
Fully Meshed Redundancy via Interface Bonding
The Link Aggregation High Availability mode, when deployed with ClusterXL, enables a
higher level of reliability by providing granular redundancy in the network. This granular
redundancy is achieved by using a fully meshed topology, which provides for independent
backups for both NICs and switches.
A fully meshed topology further enhances the redundancy in the system by providing a
backup to both the interface and the switch, essentially backing up the cable. Each cluster
member has two external interfaces, one connected to each switch.
Figure below depicts this implementation, where both cluster members are connected to
both external switches:
In this scenario:
GW-1 and GW-2 are Security Gateway
cluster members in New High
Availability
mode
C-1, C-2, C-3 and C-4 are networks
After a switch failure, switch functionality and
gateway high availability are maintained.
Similarly, after a NIC failure, switch and
gateway high availability are maintained.
Bond Internal Failover
Note: The bond failover operation requires network interface cards that support the
Media-Independent Interface (MII) standard.
Failover can occur because of a failure in the physical link state, or a failure in the
receiving/sending of CCP packets. Either of these failures will trigger a failover: either
within the bond interface, or between cluster members (depending on the circumstances)

Link state initiated failover

1. The active slave interface detects a link state of down, and notifies the bond
interface.
2. The bond initiates an internal bond failover to the standby slave interface.
Note: Since this is a failover within the bond, the status of the other cluster
member is not considered.
3. If this slave interface should detect a link failure, and the initial slave interface is
still down, ClusterXL initiates a failover to the other cluster member, as long as
the state of the other cluster member is not Down.
CCP initiated failover

1. ClusterXL detects a problem in the receiving/sending of CCP packets.
2. ClusterXL initiates an internal bond failover.
3. ClusterXL monitors CCP packet transmission/arrival. If a problem is detected, the
system initiates a failover to the other cluster member, as long as the state of the
other cluster member is not Down.
Configuring Bond High Availability Failover Mode
There are a number of configurable settings regarding bond failover:
'fwha_manual_bond_failover' kernel parameter

Description:
Sets the failover mode.
Values:
o 0 = (default) automatically perform internal bond failover to the other slave
interface.
o 1 = perform ClusterXL failover to the other cluster member (as long as the state
of the other cluster member is not Down), unless the command 'cphaconf
enable_bond_failover' was run, in which case, the next failover will be
internal bond failover to the other slave interface (refer to command's description
below).
Notes:
o With both values, the next bond failover occurs in 2 minutes.
o The current value of this kernel parameter can be checked with 'fw ctl get
int fwha_manual_bond_failover' command.
o The value of this kernel parameter can be set:
either on-the-fly with 'fw ctl set int fwha_manual_bond_failover
VALUE' command (this change does not survive reboot)
or by adding this parameter with desired value into the
$FWDIR/boot/modules/fwkern.conf file (per sk26202)

'cphaconf enable_bond_failover BondName' command

Description:
Sets what happens during a ClusterXL failover after a bond has already failed
over internally. This command works only if the value of the
'fwha_manual_bond_failover' kernel parameter is currently set to 1 (one).
After a failover occurs within a bond, the next time a failure is detected on a slave
interface, ClusterXL automatically fails over to the other cluster member.
An administrator can prevent this from occurring by first correcting the error on
slave interface that caused the failover, and then resetting the system to failover
internally.
The 'cphaconf enable_bond_failover BondName' command directs the
system to failover within the bond the next time a failure is detected on a slave
interface.
Notes:
o When successful, there is no immediate output from this command; however the
words 'can failover' appear in the output of the 'cphaprob -a if'
command (in the corresponding line for this bond interface).
o This command should be run each time the system is reconfigured - after
verifying that all slave interfaces are active.
o Refer to 'cphaconf' command section.

sk43730 (Failover in Bond interface can cause failover in ClusterXL)
Link Aggregation - Load Sharing mode

Note: Refer to sk22345 (Security Gateway support for EtherChannel technology /
802.3ad Link Aggregation).
Link Aggregation in Load Sharing Mode is supported in:
SecurePlatform OS version R70.1 and above
Gaia OS version R75.40 and above
VSX SecurePlatform OS version R67 and above
IPSO OS version 3.8.1 and above
In Bond Load Sharing mode, Link Aggregation supplies load sharing, in addition to High
Bond Availability. All slave interfaces are active, and connections are balanced between the
bond's slave interfaces, similar to the way ClusterXL balances connections between cluster
members.
In Bond Load Sharing mode, each connection is assigned to a specific slave interface.
For the individual connection, only one slave interface is active. On failure of that interface,
the bond does failover of the connection to one of the other interfaces, which adds the
failed interface's connection to the connections it is already handling.

Important Note: Bond in Load Sharing requires Performance Pack (SecureXL) to be

enabled (SIM Affinity should be configured to run in Static mode via 'sim affinity -s'
command).
Connections are balanced between slave interfaces according to Layer 3 and Layer 4,
and follow one of these standards:
802.3ad - includes LACP and is the recommended mode, but some switches may
not support this mode.
XOR.
In Bond Load Sharing mode, all the interfaces of a bond must be connected to the same
switch. The switch itself must support and be configured for Link Aggregation, by the same
standard (802.3ad or XOR) as the bond interface on Check Point Security Gateway.
A bond in Load Sharing mode is considered to be Down when less than a critical
minimum number of slave interfaces remain up.
When not explicitly defined, the critical minimum number of interfaces in a bond of n
slave interfaces is n-1.
Note: Failure of a second interface will cause the entire bond to be considered down,
even if the bond contains more than two interfaces.
If a smaller number of interfaces will be able to handle the expected traffic, you can
increase redundancy by explicitly defining the number of critical interfaces.
To explicitly define the number of critical interfaces, create and edit the
cpha_bond_ls_config.conf file:
[Expert@HostName]# cd $FWDIR/conf/
[Expert@HostName]# touch cpha_bond_ls_config.conf
[Expert@HostName]# chown admin:bin cpha_bond_ls_config.conf
[Expert@HostName]# chmod -v u=rwx,g=rwx cpha_bond_ls_config.conf
[Expert@HostName]# vi cpha_bond_ls_config.conf
Location of the cpha_bond_ls_config.conf file:
On ClusterXL in Gateway Mode:
$FWDIR/conf/cpha_bond_ls_config.conf
On ClusterXL in VSX Mode R75.40VS, R76, R77 and above:
$FWDIR/CTX/CTX0000<VSID>/conf/cpha_bond_ls_config.conf
Each line of the file should contain the Bond Name and the number of critical interfaces
(separated by a space of horizontal tab):
BondName
number_of_critical_interfaces
Example:
If bond0 has 7 interfaces, and bond1 has 6 interfaces, then file contents could be:
bond0 5
bond1 3
Explanation:
bond0 would be considered Down when 3 of its interfaces have failed
bond1 would be considered Down when 5 of its interfaces have failed


sk95087 ('cphaconf show_bond -a' command shows incorrect number of Slave
interfaces that does not match configuration in the
$FWDIR/conf/cpha_bond_ls_config.conf file)
sk97779 (Critical minimal number of interfaces in a Bond in VRRP cluster running on
Gaia OS)
sk94545 Configuration in the $FWDIR/conf/cpha_bond_ls_config.conf file on VSX
cluster member does not apply to other Virtual Systems)
Monitoring the Interface Link State (MILS)

Notes:
MILS is enabled by default since R75.47 (fwha_monitor_if_link_state=1).
MILS is supported only on SecurePlatform / Gaia OS.
MILS is supported only for physical interfaces.
MILS feature considers the link state of a Bond interface as up if it has at least one
slave with link up.
sk31336 (Using Monitor Interface Link State feature to improve ClusterXL interfacefailure-detection ability).
Enabling Interface Link State Monitoring significantly shortens the time it takes
ClusterXL to detect an interface failure (from milliseconds to microseconds).
By monitoring the link state (i.e., the electrical state) of an interface, ClusterXL is
immediately alerted to connectivity issues concerning a certain network interface, such as a
disconnected cable, or an electrical failure (real or simulated) on a switch.
Interface Link State Monitoring requires a NIC driver that supports link state detection.
The device driver reports the link state as either connected or disconnected.
Monitoring the interface link state is particularly useful during cluster probing when none
of the hosts answer on the connected subnet (refer to the definition of 'probing' and of
'pingable host' in Clustering Definitions and Terms section).
When MILS is enabled, ClusterXL immediately detects when an interface goes down.
When MILS is disabled, ClusterXL determines whether an interface is malfunctioning
based on expiration of internal timeouts.
Configuring cluster addresses on different subnets

Cluster IPs are Virtual IP addresses given to ClusterXL objects, which differ from the
unique physical IP addresses of the individual cluster member. These addresses enable
the cluster to be seen as a single gateway, thus allowing it to serve as a router in a network
that is unaware of the cluster's internal structure and status.

Note: ISP Redundancy is not supported in ClusterXL where physical interfaces of

cluster members and cluster VIP are defined on different subnets. Refer to sk66521 (ISP
Redundancy in ClusterXL when interfaces of cluster members and cluster VIP are defined
on different subnets per sk32073).
Refer to Configuring ISP Redundancy on a Cluster section.
Cluster IP addresses can reside on subnets other than those of the members. The
advantage of this is that it:
Enables a multi-machine cluster to replace a single-machine gateway in a preconfigured network, without the need to allocate new addresses to the cluster
members.
Makes it possible to use only one routable address for the ClusterXL Gateway
Cluster.
sk32073 (Configuring Cluster Addresses on Different Subnets).
There are two major steps required in order for ClusterXL to function correctly with
cluster IPs on different subnets:
1. The first step is to create static routes on each cluster member, which determine
the interface connected to the cluster's network (the subnet, to which the cluster IP
belongs). Unless these entries are created, the OS cannot route packets to the
cluster's network. No additional configuration is required for the cluster members. It
is, however, important to note that the unique IP addresses given to the members
must share common subnets on each "side" of the cluster (meaning, each interface
on each machine must have an interface on every other machine using the same
subnet).
Note:
Configuring the static route is not needed in these cases:
On SecurePlatform OS Security Gateway with enabled Advanced Dynamic

Routing (GateD daemon will add the route to cluster VIP network when the
member's interface comes up).
On Gaia OS Security Gateway in VSX mode (this is done automatically when
configuring routes in SmartDashboard).
2. The second step relates to the configuration of the cluster topology. Here, the
cluster IP addresses are determined, and associated with the interfaces of the
cluster members (each member must have an interface responding to each cluster
IP address). Normally, cluster IP addresses are associated with an interface based
on a common subnet. In this case, these subnets are not the same. It must be
explicitly specified, which member subnet is associated with the cluster IP address.

Example:
Moving from a single gateway to a cluster

R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' - Moving from a Single
Gateway to a ClusterXL Cluster.
Adding another member or interface to an existing cluster

R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' - Adding Another
Member to an Existing Cluster.

sk57100 (Adding or removing an interface in ClusterXL High Availability topology
might cause fail-over)
sk69180 (Adding a new Slave interface to existing Bond interface on
SecurePlatform)

Proxy ARP
sk30197 (Configuring Proxy ARP for Manual NAT).
Let us consider the following scenario:
1. Two networks (Network_A and Network_B) are separated by a Security Gateway
(single Security Gateway or ClusterXL).
2. On each network, there is a host (Host_A on Network_A,
and Host_B on Network_B).
3. Let us assume, that Network_A represents the Internal network,
and Network_B represents the External network.
4. According to the existing standards, when Host_B needs to send data to Host_A,
an ARP Request for the MAC address of Host_A will be sent
by Host_B to Network_B.
Since Host_A is located on another network, and the Security Gateway acts as a
router, this ARP Request (sent to Broadcast address on Layer2) will not be
forwarded by the Security Gateway from Network_B to Network_A.
As a result, Host_B will not discover the MAC address of Host_A, and will not be
able to send the data to Host_A.
A standard solution, in such cases, is to configure the Security Gateway to act
as Proxy ARP.
The Security Gateway will pretend to be the Host in question. The Security Gateway
will accept the ARP Requests and the Security Gateway will send its own MAC
Address in ARP Reply. Then, when the data is received from the External
network, the Security Gateway will forward the data to the relevant host on the
Internal network.
Configuration on the Security Gateway is two-fold:
1. Layer2-to-Layer3 matching - matching IP addresses of the relevant hosts on the
Internal network to the MAC Address of the Security Gateway on the External
network (performed via special configuration file
$FWDIR/conf/local.arp
).
2. NAT rules

ISP Redundancy
If you have a ClusterXL Gateway cluster, connect each cluster member to each ISP
using two physical interfaces.
R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' - Configuring ISP
Redundancy on a Cluster.
Note: ISP Redundancy is not supported in ClusterXL where physical interfaces of
cluster members and cluster VIP are defined on different subnets. Refer to sk66521 (ISP
Redundancy in ClusterXL when interfaces of cluster members and cluster VIP are defined
on different subnets per sk32073).
Refer to FireWall Administration Guide (R70, R71, R75, R75.20, R75.40, R75.40VS) Chapter 'ISP Redundancy'.
Refer to Security Gateway Technical Administration Guide (R76, R77) - Chapter 3 'ISP
Redundancy'.
How To Configure ISP Redundancy
How To Configure ISP Redundancy in SecurePlatform
sk25129 (Supported platforms for ISP Redundancy)
sk42636 (Controlling connections configured with ISP Redundancy in Load Sharing
mode)
sk66521 (ISP Redundancy in ClusterXL when interfaces of cluster members and
cluster VIP are defined on different subnets per sk32073)
sk23630 (Advanced configuration options for ISP Redundancy)
sk32225 (Configuring ISP Redundancy so that certain traffic uses specific ISP)
sk40958 (How to verify the status of ISP Redundancy links on command line)

An example of cluster-specific configuration is depicted below:
Dynamic Routing
ClusterXL supports Dynamic Routing (Unicast and Multicast) protocols as an integral
part of Check Point operating systems. As the network infrastructure views the clustered
gateways as a single logical entity, failure of a cluster member will be transparent to the
network infrastructure and will not result in a ripple effect.
When configuring the routing protocols on each cluster member, each member is
defined identically, and uses the cluster VIP addresses (not the members' physical IP
addresses). Meaning, that Router ID should be set to cluster Virtual IP on each member.
Note: When configuring OSPF restart, you must define the restart type as signaled or
graceful. For Cisco devices, use type signaled.
Note: If cluster running on SecurePlatform OS does not participate in Dynamic Routing
protocols, then disable Advanced Dynamic Routing on each cluster member in order to
prevent unexpected cluster failovers due to FIB pnote.

Refer to these documents:

Gaia Advanced Routing Administration Guide (R75.40, R75.40VS, R76, R77)
SecurePlatform Advanced Routing Suite CLI Reference Guide (R60, R61, R62, R65,
R70, R71, R75.20, R75.40, R75.40VS, R76, R77)
Release Notes of the given version
SecurePlatform Pro - Advanced Routing Suite - Configuring ClusterXL
SecurePlatform Pro - Advanced Routing Suite - Configuring OSPF
SecurePlatform Pro - Advanced Routing Suite - Configuring BGP
SecurePlatform Pro - Advanced Routing Suite - Configuring BGP and OSPF
SecurePlatform Pro - Advanced Routing Suite - Configuring PIM
Guidelines for configuring PIM with VRRP
PIM-DM and PIM-SM Failover Behaviour in a High Availability Configuration
How To Configure Policy Based Routing
sk39960 (How to allow Dynamic Routing protocols traffic (OSPF, BGP, PIM, RIP,
IGRP) through Check Point Security Gateway)
sk95968 (OSPF on Gaia)
sk95967 (BGP on Gaia)
sk100499 (BGP on Gaia OS - configuring Graceful Restart)
sk100239 (How to configure PIM on Gaia OS)
sk100501 (How to configure Routemaps in Gaia Clish)
sk98936 (How to configure route redistribution and inbound route filters in Gaia
Portal)
sk32614 (Configuring SecurePlatform Pro for OSPF)
sk36969 (How to configure OSPF on Security Gateway & UTM-1 Edge VTI
environment)
sk36646 (OSPF graceful restart types)
sk42974 (How to manually add route on SecurePlatform to override OSPF route)
sk32615 (Configuring SecurePlatform Pro for BGP)
sk86985 (How to configure iBGP to propagate routes)
sk92836 (PIM HA Mode explained)
sk32702 (Configuring PIM and IGMP Multicast Protocols)

SNMP
sk90860 (How to configure SNMP on Gaia OS)
sk79280 (How to add SNMP user defined settings in Gaia)
sk92999 (How to create custom SNMP traps in Gaia)
sk34511 (How to enable SNMP on SecurePlatform OS)
sk68560 (How to configure SNMP on SecurePlatform OS)
sk65923 (How to configure the cluster to send SNMP Trap upon fail-over)
sk93455 (Send SNMP Trap in the event of a ClusterXL failover to multiple Trap
Servers)
sk40266 (SNMP, MIBs, and how SNMP traps work)
sk71980 (Output of a 'snmpwalk' command with 'exec' extension or 'extend'
extension is limited)
sk78360 (How to Extend SNMP)
sk65173 (Check Point SNMP sysObjectID .1.3.6.1.2.1.1.2)
sk40622 (SNMPv3 USM (User-based Security Model) User)
sk42426 (Hardware Monitoring with SNMP on Power-1 / UTM-1 / Smart-1 / 2012
appliances)
Enhanced enforcement of the TCP 3-way handshake

The standard enforcement for a 3-way handshake that initiates a TCP connection
provides adequate security by guaranteeing one-directional stickiness. This means that it
ensures that the SYN-ACK will always arrive after the SYN. However, it does not guarantee
that the ACK will always arrive after the SYNACK, or that the first data packet will arrive
after the ACK. If you wish to have stricter policy that denies all out-of-state packets, you
can configure the synchronization mechanism so that all the TCP connection initiation
packets arrive in the right sequence (SYN, SYN-ACK, ACK, followed by the data). The
price for this extra security is a considerable delay in connection establishment.
To enable enhanced enforcement, use the GuiDBedit Tool to change the value of global
attribute sync_tcp_handshake_mode from minimal_sync (default value) to
complete_sync:
1. Close all SmartConsole windows (SmartDashboard, SmartView Tracker, etc.).
2. Connect to Security Management Server with GuiDBedit Tool.
3. In the left upper pane, go to 'Table' - 'Network Objects' - 'network_objects'.
4. In the right upper pane, select the relevant Cluster object (Class Name gateway_cluster).
5. Press CTRL+F (or go to 'Search' menu - 'Find') - paste
sync_tcp_handshake_mode - click on 'Find Next'.
6. In the lower pane, right-click on the sync_tcp_handshake_mode - 'Edit...' choose "complete_sync" - click on 'OK'.
7. Save the changes: go to 'File' menu - 'Save All'.
8. Close the GuiDBedit Tool.
9. Connect to Security Management Server with SmartDashboard.
10. Install the policy onto the Cluster object.

Synchronization modes for TCP 3-way handshake:

1. Minimal sync
3-way handshake is not enforced. This mode offers the best connectivity for users
who are willing to compromise on security is this case.
2. Complete sync
All 3-way handshake packets are Synced-and-ACKed, and 3-way handshake is
enforced. This mode slows down connection establishment considerably. It may be
used when there is no way to know where the next packet will go, e.g. in 3d party
clusters.
3. Smart sync
In most cases, we can assume that if SYN and SYN-ACK were encountered by the
same cluster member, thentheconnectionissticky.
ClusterXL uses one additional flag in Connections Table record that says,Ifthis
member encounters a 3-way handshake packet, it should sync all other cluster
members.
When a SYN packet arrives, the member that encountered it, records the connection
and turns off its flag. All other members are synched, and by using a post-synchandler, their flag is turned on (in their Connections Tables).
If the same member encounters the SYN-ACK packet, the connection is sticky, thus
other cluster members are not informed.
Otherwise, the relevant member will inform all other member (since its flag is turned
on).
The original member (that encountered the SYN) will now turn on its flag, thus
all members will have their flag on.
In this case, the third packet of the 3-way handshake will also be synced.
If for some reason, our previous assumption is not true (i.e., one cluster
member encountered both SYN and SYN-ACK packets, and other members
encountered the third ACK), then thethirdACK will be dropped by the other cluster
members, and we rely on the periodic sync and TCP retransmission scheme to
complete the 3-way handshake.
This mode is a good solution for Load Sharing users that want to enforce 3-way
handshake verification with the minimal performance cost. It is also recommended
for High Availability New mode.

Cluster state transitions

Refer to ClusterXL definitions and terms section.
The state machine mechanism is triggered by 3 events:
Starting or stopping of the ClusterXL product / State Synchronization
Incoming CCP packet with the state information from other cluster members (refer to
FWHA_MY_STATE Data)
Timer event - every predefined timeout (refer to CPHA timer)
Special notes for state transitions
When all Critical Devices (Pnotes) report their states as 'ok', the machine will try to
change its state to 'Active', depending on the cluster configuration (HA mode / LS
mode) and states of the peer members.
Among several properly functioning cluster members working in HA mode, the
machine will become an 'Active' depending on the configuration:
o In 'Active Up' configuration ('Maintain current active Cluster Member') - a first
cluster member (on a time base), which reaches the 'Ready' state, will become
'Active'.
o In 'Primary Up' configuration ('Switch to higher priority Cluster Member') machine with highest priority will become 'Active'.
When on all cluster members, some Critical Device report their state as 'problem',
one of the member will become 'Active' and will get into derived state 'Active
attention', symbolizing that it has a failure. The choice regarding what machine will
become an 'Active' is a random and does not depend on the machines priorities /
numbers and type of Critical Devices that report their state as 'problem'.
Policy installation
When the policy is installed onto a cluster member, the fwd daemon calls the
"cphastart" command in order to start the clustering mechanism.
The "cphastart" command is responsible to read the $FWDIR/conf/objects.C file
in order to get all required information from the cluster object, and cluster members'
objects.
Once done, the "cphastart" command calls the "cphaconf" command with all the
relevant parameters.
The "cphaconf" command performs 2 main actions:
Moves the configuration parameters to the Check Point kernel (in the kernel, the
parameters are not enforced right away - instead, the new configuration parameters
are buffered, and a process called "policy negotiation" starts)
Notifies the cphastart daemon about the new loaded policy
The "cphaconf" command sends a signal to the cphamcset daemon to reload the
information from the objects. If the cphamcset daemon is not yet started, it will be started.

The cphamcset daemon is responsible for opening sockets on the NICs in order to
allow them to pass multicast traffic (CCP) to the machine (run 'ip maddr show'
command).
Check Point kernel has a mechanism, which ensures that all cluster members enforce
the same security policy and the same ClusterXL parameters at any given time.
Since the policy installation does not take place simultaneously on all cluster members
(actually the policy commit is sequential), there may be some time difference between the
installations on all the members.
In order to overcome this problem, the policy negotiation is divided into two phases:
All members must acknowledge the new policy arrival.
Then, all members must acknowledge moving into the new policy.
During Phase I, a machine that got new policy sends CCP packets declaring that it got a
new policy with a certain Policy ID.
The following line appears in cluster debug with 'conf' flag:
CPHA: Phase I: Looking for machines in policy update mode...
The other machines also send this CCP packet as soon as they get the new policy. All
the machines wait to receive the confirmation packet from all the other machines, signalling
that the new policy arrived to all cluster members.
Now, Phase II takes place, when the CPHA timer is stopped completely in order to
avoid sending packets with the old parameters, and the new policy parameters are
enforced.
The following line appears in cluster debug with 'conf' flag:
CPHA: Phase II: Looking for machines ready to update policy...
After having done that, each machine sends another packet indicating it completed the
policy change phase.
When all the machines completed the policy change p
hase, the HA timer is started and
all the machines are updated with the new configuration.
Each one of these steps is backed up with a HA timer, which reverts the process, if not
all the Active machines confirmed the new stage after a certain time (refer to
'fwha_policy_update_timeout_factor' kernel parameter).
In this case, the old parameters are restored.
Debugging:
In order to see the policy installation, debug the 'cluster' module with 'conf' flag (in
addition, these flags are recommended: 'stat', 'pnote', 'if', 'mac'):
[Expert@GW_HostName]# fw ctl debug -m cluster + conf stat pnote if mac

Example from R76 High Availability (Active Up) cluster - from Active member:
; 2Jul2013 13:51:53.832490;[cpu_0];[fw4_0];FW-1: SIM (SecureXL Implementation Module) SecureXL device detected.;
; 2Jul2013 13:51:53.958580;[cpu_2];[fw4_0];FW-1: SecureXL: Connection templates are not possible for the installed policy. Please
refer to the Performance Pack documentation for further details.;
; 2Jul2013 13:51:56.060111;[cpu_2];[fw4_1];FW-1: fwha_set_conf: entered with State=ACTIVE, Blocking State=ACTIVE;
; 2Jul2013 13:51:56.060115;[cpu_2];[fw4_1];FW-1: fwha_set_conf: need_to_set_trusted_ifs=0, need_to_delete_trusted_ifs=0;
; 2Jul2013 13:51:56.060116;[cpu_2];[fw4_1];FW-1: fwha_set_conf: confinfo->op: FWHAC_DEL_TRUSTED_IFS;
; 2Jul2013 13:51:56.060260;[cpu_2];[fw4_1];FW-1: fwha_set_conf: setting HA configuration (op = 0x80):;
; 2Jul2013 13:51:56.060273;[cpu_2];[fw4_1];FW-1: fwha_set_conf: SWITCH SUPPORT ;
; 2Jul2013 13:51:56.060275;[cpu_2];[fw4_1];FW-1: fwha_set_conf: Deleting all Trusted IFs;
; 2Jul2013 13:51:56.060277;[cpu_2];[fw4_1];FW-1: fwha_set_conf: buffering deletion of trusted interfaces;
; 2Jul2013 13:51:56.060279;[cpu_2];[fw4_1];FW-1: fwha_set_conf: setting need_to_delete_trusted_ifs=1 and returning;
; 2Jul2013 13:51:56.175754;[cpu_2];[fw4_1];FW-1: fwha_set_conf: confinfo->op: FWHAC_ADD_TRUSTED_IF;
; 2Jul2013 13:51:56.176228;[cpu_2];[fw4_1];FW-1: fwha_set_conf: setting HA configuration (op = 0x100):;
; 2Jul2013 13:51:56.176230;[cpu_2];[fw4_1];FW-1: fwha_set_conf: Trusted IF name = eth1;
; 2Jul2013 13:51:56.176234;[cpu_2];[fw4_1];FW-1: fwha_set_conf: Adding Trusted IF;
; 2Jul2013 13:51:56.176237;[cpu_2];[fw4_1];FW-1: fwha_set_conf: buffering trusted interface info (setting need_to_set_trusted_ifs=1);
; 2Jul2013 13:51:56.176240;[cpu_2];[fw4_1];FW-1: fwha_set_conf: copying confinfo->if_name=eth1 to fwha_trusted_ifs_buffered[0] and
returning;
.........................................
; 2Jul2013 13:51:56.255176;[cpu_3];[fw4_1];CPHA: the list of cluster IPs according to the interface:;
; 2Jul2013 13:51:56.255179;[cpu_3];[fw4_1];Interface: 1) eth0, cluster ip: 172.30.41.79;
; 2Jul2013 13:51:56.255180;[cpu_3];[fw4_1];Interface: 3) eth2, cluster ip: 20.20.20.79;
.........................................
; 2Jul2013 13:52:00.110480;[cpu_2];[fw4_1];CPHA: policy update packet local=NO, random=47647, status=2, policy=1862759333, first=YES,
entry=0;
; 2Jul2013 13:52:00.110489;[cpu_2];[fw4_1];Entry: 0
random_id: 47647
policy_id: 1862759333
update status: 2
time: 2013864;
; 2Jul2013 13:52:01.373822;[cpu_3];[fw4_1];FW-1: fwha_set_conf: confinfo->op: FWHAC_START;
; 2Jul2013 13:52:01.373826;[cpu_3];[fw4_1];FW-1: fwha_state_freeze: turning freeze type 0 ON (time=2013876, caller=fwha_set_conf);
; 2Jul2013 13:52:01.373828;[cpu_3];[fw4_1];FW-1: fwha_state_freeze: FREEZING state machine at ACTIVE (time=2013876,
caller=fwha_set_conf, freeze_type=0);
; 2Jul2013 13:52:01.373830;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting HA configuration (op = 0x30407e):;
; 2Jul2013 13:52:01.373831;[cpu_3];[fw4_1];FW-1: fwha_set_conf: mode = 4 (active up);
; 2Jul2013 13:52:01.373833;[cpu_3];[fw4_1];FW-1: fwha_set_conf: cluster ID = 4916;
; 2Jul2013 13:52:01.373834;[cpu_3];[fw4_1];FW-1: fwha_set_conf: cluster size = 2;

Classification: [Protected] All rights reserved
P. 71
; 2Jul2013 13:52:01.373835;[cpu_3];[fw4_1];FW-1: fwha_set_conf: machine_id = 0;

; 2Jul2013 13:52:01.373837;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Log mode = log(1);
; 2Jul2013 13:52:01.373838;[cpu_3];[fw4_1];FW-1: fwha_set_conf: policy_id = 2147483647;
; 2Jul2013 13:52:01.373841;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Starting HA;
; 2Jul2013 13:52:01.373842;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Stopping HA;
; 2Jul2013 13:52:01.373843;[cpu_3];[fw4_1];FW-1: fwha_set_conf: Vmac Mode: 0;
; 2Jul2013 13:52:01.374296;[cpu_3];[fw4_1];fwha_load_bond_configuration: Succeded getting bond ls required slaves num Data structure
from user space: ret=0;
; 2Jul2013 13:52:01.374336;[cpu_3];[fw4_1];FW-1: fwha_set_conf: bond required slaves num conf have changed since last install policy;
; 2Jul2013 13:52:01.374340;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting buffered trusted interface info;
; 2Jul2013 13:52:01.374341;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting need_to_set_trusted_ifs=0, confinfo->op |=
FWHAC_ADD_TRUSTED_IF;
; 2Jul2013 13:52:01.374342;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting buffered deletion of trusted interfaces;
; 2Jul2013 13:52:01.374344;[cpu_3];[fw4_1];FW-1: fwha_set_conf: setting need_to_delete_trusted_ifs=0, confinfo->op |=
FWHAC_DEL_TRUSTED_IFS;
; 2Jul2013 13:52:01.374346;[cpu_3];[fw4_1];FW-1: fwha_set_conf: calling check_upd_trusted_if();
; 2Jul2013 13:52:01.374348;[cpu_3];[fw4_1];FW-1: check_upd_trusted_if: interface eth1 has been found among buffered ifs;
; 2Jul2013 13:52:01.374350;[cpu_3];[fw4_1];FW-1: check_upd_trusted_if: interface eth1 has been found among trusted ifs;
; 2Jul2013 13:52:01.374352;[cpu_3];[fw4_1];FW-1: policy ID old=3778374695 new=1862759333;
; 2Jul2013 13:52:01.374354;[cpu_3];[fw4_1];FW-1: fwha_set_conf: policy ID will be changed from 3778374695 to 1862759333FW-1: Current
policy update status - 2;
; 2Jul2013 13:52:01.374356;[cpu_3];[fw4_1];CPHA: Sending Policy ID change request. Status: 2;
; 2Jul2013 13:52:01.374377;[cpu_3];[fw4_1];CPHA: policy update packet local=YES, random=56134, status=2, policy=1862759333,
first=YES, entry=1;
; 2Jul2013 13:52:01.374380;[cpu_3];[fw4_1];Entry: 0
random_id: 47647
policy_id: 1862759333
update status: 2
time: 2013864;
; 2Jul2013 13:52:01.374382;[cpu_3];[fw4_1];Entry: 1
random_id: 56134
policy_id: 1862759333
update status: 2
time: 2013876;
; 2Jul2013 13:52:01.374384;[cpu_3];[fw4_1];CPHA: Phase I: Looking for machines in policy update mode...found 2 machines.;
; 2Jul2013 13:52:01.374385;[cpu_3];[fw4_1];CPHA: Sending Policy ID change request. Status: 3;
; 2Jul2013 13:52:01.374391;[cpu_3];[fw4_1];CPHA: Phase II: Looking for machines ready to update policy...found 1 machines.;
; 2Jul2013 13:52:01.374392;[cpu_3];[fw4_1];CPHA: waiting for more machines.;
; 2Jul2013 13:52:01.375847;[cpu_2];[fw4_1];CPHA: policy update packet local=NO, random=47647, status=3, policy=1862759333, first=NO,
entry=0;
; 2Jul2013 13:52:01.375852;[cpu_2];[fw4_1];Entry: 0
random_id: 47647
policy_id: 1862759333
update status: 3
time: 2013876;

P. 72
; 2Jul2013 13:52:01.375854;[cpu_2];[fw4_1];Entry: 1
random_id: 56134
policy_id: 1862759333
update status: 3
time: 2013876;
; 2Jul2013 13:52:01.375856;[cpu_2];[fw4_1];CPHA: Phase II: Looking for machines ready to update policy...found 2 machines.;
; 2Jul2013 13:52:01.375858;[cpu_2];[fw4_1];CPHA: All machines are ready to change their configuration.;
; 2Jul2013 13:52:01.375891;[cpu_2];[fw4_1];FW-1: Stopping ClusterXL.;
; 2Jul2013 13:52:01.375924;[cpu_2];[fw4_0];FW-1: stopping HA timer;
; 2Jul2013 13:52:01.376515;[cpu_2];[fw4_1];FW-1: fwha_bond_set_configuration: entering ...;
; 2Jul2013 13:52:01.376565;[cpu_2];[fw4_1];FW-1: fwha_conf_mode: fwha_installed=1, fwha_mode=4, mode=4, pivot_mode=0;
; 2Jul2013 13:52:01.376571;[cpu_2];[fw4_1];FW-1: Changing the machine ID to 0;
; 2Jul2013 13:52:01.376575;[cpu_2];[fw4_1];FW-1: set_use_sdf: Setting sdf mode to 0;
; 2Jul2013 13:52:01.376587;[cpu_2];[fw4_1];FW-1: fwha_reset_trusted_ifs: resetting required if number;
; 2Jul2013 13:52:01.376591;[cpu_2];[fw4_1];FW-1: add_trusted_if: added interface eth1 in position 0 in list;
; 2Jul2013 13:52:01.376596;[cpu_2];[fw4_1];fwha_set_vmac_state: fwha_vmac_global_param_enabled=0, ha_new_config.cluster_vmac_mode =
0, fwha_pivot_mode = 0, FWHA_USE_BACKUP_MODE() = 1, enable_vmac=0;
; 2Jul2013 13:52:01.376598;[cpu_2];[fw4_1];fwha_set_vmac_state: vmac mode should be disabled;
; 2Jul2013 13:52:01.376600;[cpu_2];[fw4_1];fwha_set_vmac_state: vmac state was not changed=0;
; 2Jul2013 13:52:01.385388;[cpu_2];[fw4_0];fwha_set_sync_tcp_handshake_mode: mode=MINIMAL. Disabling TCP handshake enforcement;
; 2Jul2013 13:52:01.385452;[cpu_2];[fw4_0];FW-1: starting HA timer;
; 2Jul2013 13:52:01.385464;[cpu_2];[fw4_1];fwha_df_set_force_df_ips_only_mode: is_nac_enabled = 0;
; 2Jul2013 13:52:01.385469;[cpu_2];[fw4_1];fwha_df_set_force_df_ips_only_mode: multi_portal_enabled = 0;
; 2Jul2013 13:52:01.385470;[cpu_2];[fw4_1];fwha_df_set_force_df_ips_only_mode: old force df ips only mode: 0, new force df ips only
mode: 0;
; 2Jul2013 13:52:01.385484;[cpu_2];[fw4_1];FW-1: Starting ClusterXL.;
; 2Jul2013 13:52:01.385500;[cpu_2];[fw4_1];FW-1: fwha_state_freeze: turning freeze type 0 OFF (time=2013876, caller=policy change finished changes (fwha_start));
; 2Jul2013 13:52:01.385503;[cpu_2];[fw4_1];FW-1: fwha_state_freeze: ENABLING state machine at ACTIVE (time=2013876,caller=policy
change - finished changes (fwha_start));

P. 73
State transitions of the cluster member
In ClusterXL:
Initialization built-in Devices
report OK
Initializing
Ready
Periodic check
of Devices OK
HA mode, and
other machine
is Active
Interface
Active Check
reports OK
Down
LS mode, or
no other active
machines
heard
Standby
Active
All nonproblematic
machines
confirmed the
Active state
There are no
members that
send lower
version of CCP
Interface
Active Check
reports problem
Other
Critical Device
reports problem
No other Active
machines
in the cluster

In 3rd party cluster:

Note: Only 'Down' and 'Active' states are available, since they reflect the status of
the State Synchronization (which is the only active mechanism).
Full Sync
successful
Down
Active
State Sync
failure
State transitions due to 'FWHA_MY_STATE' packet

Each time the cluster member receives a CCP packet with OpCode 1
(FWHA_MY_STATE), the decision mechanism is invoked and is required to re-evaluate the
state of the current machine:
Any cluster
configuration
High Availability
configuration
Down
Active
The previously Active

machine sent "Down"
state packet
Down
Active
All active machines

stopped sending
state packets

State transitions due to a Critical Device (Pnote)

If any Critical Device (Pnote) reports its state as 'problem', the state of the current
machine is changed:
Critical Device
failure
Active/Standby
Down
Critical Devices
OK
State transitions due to the 'Interface Active Check' Critical Device

(Pnote)
State of the interface is changed based on the arrival/transmission of CCP packets:
o The transmit state of an interface (as monitored by this pnote) is refreshed once a
FWHAP_IF_PROBE_RPLY packet is received in acknowledge to
o The receive state of an interface (as monitored by this pnote) is refreshed once any
CCP packet (UDP on port 8116) is received.
State of an interface as displayed in the output of 'cphaprob -a if' command:
UP - CCP packets are received and sent within the predefined timeouts
DOWN - CCP packets are not received and not sent (beyond the predefined
timeouts), and probing mechanism was able to determine the problematic interface
Inbound: UP , Outbound: DOWN - CCP packets are not sent (beyond the
predefined timeouts)
Inbound: DOWN , Outbound: UP - CCP packets are not received (beyond the
predefined timeouts)
Inbound: DOWN , Outbound: DOWN - CCP packets are not received and not
sent (beyond the predefined timeouts), and probing mechanism was not yet able to
determine the problematic interface
The state of the interface in Check Point kernel (as seen in the debug of 'cluster'
module with flag 'if') is changed in the following way (assuming default values of
relevant kernel parameters):
From UP to ASSUMED UP - after total of 0.6 sec and 1.2 sec of not receiving/sending
CCP packets
From ASSUMED UP to UNKNOWN - after total of 2.2 sec of not receiving/sending CCP
packets


sk62863 (Cluster debug shows interface flapping due to the missing CCP packets)
sk43984 (Interface flapping when cluster interfaces are connected through several
switches)
Note about 3rd party cluster:
The FWHAP_IF_PROBE_REQ and FWHAP_IF_PROBE_RPLY packets are not sent.
The timeout of the interface is based on the FWHAP_IFCONF_RPLY packets that are
sent every 1 second. The total timeout for determining that an interface is not
responding in a certain direction is 3 seconds.
Actions performed by a cluster member following a state transition

As a result of any state transition, the state machine performs correspondent actions
depending on the old and a new state. All actions can be divided into several groups
according to the ClusterXL components and other FW-1 features integrated with ClusterXL:
1. Re-establishing network connectivity to the cluster - during the transition to the
Active state usually sends Gratuitous ARP request packets to update ARP cache of
the hosts/routers connected to the cluster interfaces by advertising the new MAC
address for the cluster Virtual IP addresses.
2. Pivot selection refresh - when working in the Load Sharing Unicast mode, the
transition to/from the Active state will invoke the Pivot selection mechanism and, if
needed, the recalculation of the Pivot packet selection table, which will redistribute
the traffic among the cluster members according to the new state.
3. Automatic proxy ARP refresh - when automatic Proxy ARP feature enabled, the
new Active cluster member in High Availability New mode and Load Sharing Unicast
mode, will issue Gratuitous ARP Requests according to the contents of FW-1 kernel
"arp_table". For each entry in it that table, an ARP Request will be sent containing a
Proxied IP address and the local MAC address.
4. SecureXL selection refresh - when SecureXL is configured, the transitions from/to
the Active state will cause the state machine to issue updates to the SecureXL
device with a new cluster member state.
5. Synchronization buffer flush - when the machine stops processing the FW-1
gateway traffic (which usually happens when it is moving from the Active state to any
other state) the synchronization buffer, which might hold any Delta Sync data
regarding the connections processed so far, is flushed. This is done in order to
update the rest of cluster members that obviously, will handle the connections
belong to the machine changing its state.
6. Accounting information flush - in the case of Synchronization buffer flush,the
machine will flush Accounting information to the Security Management Server.

Cluster Control Protocol (CCP)

Introduction
The Cluster Control Protocol (CCP) is a proprietary Check Point protocol that runs on
UDP port 8116 (packets are not encrypted).
Refer to Cluster Control Protocol Reference document (this document applies to all
cluster versions since NG FP3).
CCP is located between the Check Point kernel and the network interface (therefore,
only TCPdump/Snoop utilities should be used for capturing this traffic).
Historical fact: The first release of Check Point ClusterXL supported only High
Availability Mode. The CCP at that time was called High Availability Protocol - HAP.
The ordinal numbers of letters in the alphabet are: H(8) + A(1) + P(16) = "8116".
CCP runs only on cluster and sync interfaces (refer to ClusterXL definitions and terms
section).
CCP has the following roles:
State Synchronization - cluster members exchange Delta Sync packets about the
processed connections to keep the relevant kernel tables synchronized on all cluster
members.
Note: Each Delta Packet contains many pieces of information about different
connections. The payload of these Delta Sync packets is not encrypted, but it is not
human-readable (i.e., sniffing this traffic will not allow anyone to understand the
contents of these packets). The only way to understand what was transferred in
these packets is to run the relevant cluster debug on all cluster members (fw ctl
debug -m fw + sync).
It is up to the cluster administrator to make sure the Sync network is secured and
isolated.
Health checks - cluster members exchange reports and query each other about
their own states and the states of their cluster interfaces:
o Health-status Reports
o Cluster-member Probing
o State-change Commands
o Querying for Cluster Membership
Notes:
o These CCP packets are not encrypted.
o This applies only to ClusterXL - Check Point cluster running on Gaia OS /
SecurePlatform OS / Crossbeam COS / Windows OS / Solaris OS.
o In 3rd Party clusters (e.g., Check Point cluster running on Crossbeam XOS /
IPSO OS), the 3rd Party software is responsible for health checks.

Explanations:
o Health-status Reports - These reports contain the state of the transmitting cluster
member, as well as the presumed state of the other cluster members.
o Cluster-member Probing - If a cluster member fails to receive status for another
member (does not receive CCP packets from that member) on a given segment,
cluster member will probe that segment in a best-effort attempt to illicit a
response.
The purpose of such probes is to detect (best-effort) the nature of possible
interface failures, and to determine which module has the problem.
The outcome of this probe will determine what action is taken next (change
the state of an interface, or of a cluster member).
Cluster member sends a CCP packet 'FWHA_IF_PROB_REQ'.
Cluster member sends series of ARP Requests in the loop for all IP
addresses on this subnet.
If hosts on this subnet send ARP Replies to cluster member, then cluster
member sends series of ICMP Requests (one such host is enough).
If hosts on this subnet send ICMP Replies to cluster member (one such host
is enough), then the local interface on this member is considered to work
correctly, and the missing CCP packets from peer member are considered as a
failure on peer member.
As a result, the peer member might be declared as failed ('Down'), which in
turn might cause a fail-over in the cluster.
Example:
Cluster member FW1 is not able to send/receive CCP packets to/from the
other member FW2 on the interface eth1, this member FW1 will need to
determine where the problem occurs - on the local interface eth1 or on the
other member - and perform a fail-over (if needed)
There are 2 possible reasons why this member FW1 will not able to
send/received CCP packets from the other member FW2:
o Cluster mechanism on the other member FW2 does not work anymore
- nobody can send CCP packets to this member FW1 and receive CCP
packets from this member FW1.
o Local interface eth1 on this member FW1 does not work anymore there is not traffic at all.
Computer administrator (human) can always determine where the problem
is - check cables, send pings, etc.
Cluster member is not that smart and has to rely on some simple tests that
are called "Probing".
When a member starts probing, cluster member starts sending ARP
Requests for the IP addresses in the subnet.

For example, on FW1, the IP address on eth1 is 192.168.196.201 / 23.

It means, that the following ARP Requests will be sent:
Who has IP 192.168.196.203
Who has IP 192.168.196.204
Who has IP 192.168.196.205
Who has IP 192.168.196.206
....................................
Who has IP 192.168.196.255
Who has IP 192.168.197.0
Who has IP 192.168.197.1
....................................
Who has IP 192.168.197.255
Who has IP 192.168.196.1
Who has IP 192.168.196.2
Who has IP 192.168.196.3
....................................
If there are hosts with such IP addresses on the subnet, they will send an
ARP Reply to the cluster member (one such host is enough).
Cluster member starts sending ICMP Requests to the IP addresses that
answered the ARP Requests.
If the hosts send an ICMP Reply to the cluster member (one such host is
enough), then the cluster member FW1 will know that it can send usual traffic
through this interface eth1 and the problem with CCP packets must be
happening on the other member FW2.
If this cluster member FW1 is not able to determine where the problem is,
this interface eth1 will be declared as Failed (and by design, a fail-over will
occur).
o State-change Commands - If a cluster member needs to change its state, the
command to do so takes place on the defined secured (sync) interface.
o Querying for Cluster Membership - When a cluster member comes online, it will
send a series of CCP query/response messages, to gain knowledge of cluster
membership (which members are located on these subnets).
CCP and security policy rule base

CCP is located between the Check Point kernel and the network interface. Therefore,
there is no need to add a rule to the Security Policy Rule Base that accepts CCP packets.
It is not possible to drop CCP packets based on Security Policy Rule Base (even if such
rule is created, the CCP packets will still be processed by Check Point kernel).
Refer to sk44177 (SmartView Tracker repeatedly shows drops with "Source and
destination addresses are equal").
It is not possible to perform NAT on CCP packets.
To capture the CCP packets, use only TCPdump/Snoop utilities (do not use Check
Point FW Monitor).

CCP internal timers

In order for the ClusterXL to be as robust as possible, it is designed to send CCP
packets and expects to receive CCP packets on time - based on internal CCP timers.
If CCP packets are not sent/received on time (as expected based on internal CCP
timers), the internal ClusterXL algorithms will suspect that there is a problem with the state
of the involved interface(s) and/or with the state of the cluster member(s). Eventually, a
problematic interface, or the whole member might be declared as failed.
For example, if Member_B does not receive CCP packets from Member_A on interface
eth3, then Member_B might declare its interface eth3 as 'Down', or even declare itself as
'Down'. This, in turn, might lead to a fail-over between cluster members.
The operation of CCP is based on the following two separate internal timers:
Sync timer
Purpose:
Performs sync-related actions every fixed interval. By default, the sync timer interval is
100ms. The base time unit is 100ms (or 1 tick), which is therefore the minimal value.
This time interval is controlled via global kernel parameter.
Global kernel parameter:
fwha_timer_sync_res
Sync timer interval =
= 10 x fwha_timer_base_res x fwha_timer_sync_res =
= 10 x 10 ms x fwha_timer_sync_res
Parameter values:
Integers from 1 (default) to 232-1
Notes:
o Increasing this value increases the time interval between Delta Sync actions. For
example, if the timer is doubled to 200 ms (fwha_timer_sync_res=2), then the
time interval between Delta Sync actions also doubles to 200 ms.
o Refer to sk41471 (ClusterXL - State Synchronization time interval and
'fwha_timer_sync_res' kernel parameter).
CPHA timer
Purpose:
Performs cluster-related actions every fixed interval. By default, the CPHA timer interval
is 100ms. The base time unit is 100 ms (or 1 tick), which is also the minimum value.
This time interval is controlled via global kernel parameter.
Global kernel parameter:
fwha_timer_cpha_res


CPHA timer interval =
= 10 x fwha_timer_base_res x fwha_timer_cpha_res =
= 10 x 10 ms x fwha_timer_cpha_res
Parameter values:
Integers from 1 (default) to 232-1
Notes:
If the cluster members are geographically separated from each other (e.g., located in
different cities), set the CPHA timer to be around 10 times the round-trip time (RTT)
of the synchronization network.
Increasing this value increases the time it takes to detect a failover. For example, if
detecting interface failure takes 0.3 seconds, and the timer is doubled to 200 ms
(fwha_timer_cpha_res=2), then the time needed to detect an interface failure
also doubles - to 0.6 seconds.
Refer to sk43872 (ClusterXL - CCP packets and fwha_timer_cpha_res parameter).
CCP modes
CCP can run in these modes:
Multicast (default since NG FP3 HF2) - the Layer 2 Destination MAC address of
CCP packets is 01:00:5E:X:X:X
Broadcast - the Layer 2 Destination MAC address of CCP packets is
FF:FF:FF:FF:FF:FF
Unicast - the Layer 2 Destination MAC address of CCP packets is the physical MAC
address of specific cluster member(s). This mode is used:
o On VSX cluster in VSLS configuration - when number of configured Virtual
Systems is less than the number of cluster members
o On 41000/61000 appliance (starting in R75.40VS for 61000) - refer to
'asg_sync_manager' utility (61000 Security System Administration Guide)
In VSX cluster:
VSX NGX / VSX NGX R65 / VSX NGX R67 / VSX NGX R68:
o The only possible mode of CCP is Broadcast.
R75.40VS / R76 and above:
o CCP mode over Sync Network is Broadcast for all Virtual Systems.
o CCP mode over non-Sync Networks is Multicast.
In VSLS configuration, when instances of Virtual Systems are not running on all
cluster members (e.g., only 2 VSs were configured on a VSX cluster that has 4
cluster members), the Delta Sync packets generated by a Virtual System, are sent in
Unicast only to those members that run the instance of same the Virtual System.
Refer to sk36644 (The Mode of Cluster Control Protocol (CCP) in VSX cluster).
Note: The CCP mode is not set on Virtual Switches because they do not send CCP
packets.
It is possible to change the CCP mode on-the-fly. Refer to sk20576 (How to set
ClusterXL Control Protocol (CCP) in Broadcast / Multicast mode in ClusterXL):

Notes:
This change must be done on all members of the cluster.
This change is applied immediately.
This change survives reboot:
o Unix OS: refer to $FW_BOOT_DIR/ha_boot.conf file
o Windows OS: refer to Windows Registry key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\CPHA\C
CP_mode
Procedure:
To check the current mode, run:

[Expert@HostName]# cphaprob -a if
Notes:
o The CCP mode will appear at the end of the line.
o In VSX R68 and lower, the mode is not displayed (only Broadcast is supported).
Example from ClusterXL:
Required interfaces: 4
Required secured interfaces: 1
eth0
eth1
eth2
eth3
UP
UP
UP
UP
non sync(non secured), multicast

sync(secured), multicast
To change the CCP mode to broadcast mode, run:

[Expert@HostName]# cphaconf set_ccp broadcast
To change the CCP mode to multicast mode, run:

[Expert@HostName]# cphaconf set_ccp multicast
CCP and VLAN interfaces

CCP has the following roles:
Health checks (state of cluster members and of cluster interfaces)
State Synchronization (Delta Sync)
In ClusterXL (including VSX), the Synchronization Network (CCP packets that carry
Delta Sync information) is supported only on the lowest VLAN tag of a VLAN interface. For
example, if three VLANs with tags 10, 20 and 30 were configured on interface eth1, then
only interface eth1.10 may be used for State Synchronization.

Default health checks of cluster interfaces (monitoring of interfaces by ClusterXL) are

performed in the following way:
Interface
Physical interfaces
VLAN interfaces
ClusterXL
all cluster interfaces are
monitored
only lowest VLAN tag is
monitored
only lowest and highest
VLAN tags are monitored
(since R75.47/R77)
VSX cluster
all cluster interfaces
are monitored
HA: only lowest and highest
VLAN tags are monitored
VSLS: all VLAN tags
are monitored
It is possible to customize the default monitoring of VLAN tags in the following way:
Monitor VLAN tag
Only lowest VLAN tag
Only lowest and highest
VLAN tag
All VLAN tags
Only specific VLAN tag
(since R71)
ClusterXL
default
VSX cluster
need to disable
the default behaviour *
default
default (HA)
(since R75.47/R77)
not supported
default (VSLS)
Refer to sk92784 (Configuring VLAN Monitoring on
ClusterXL for specific VLAN interface)
* Note: In VSX cluster, in order to disable the default monitoring behaviour, set the value of
the relevant kernel parameter to 0 (zero):
Pre-R75.40VS versions: fwha_monitor_all_vlans
R75.40VS / R76 and above: fwha_monitor_all_vlan
Refer to sk35462 (Abnormal behavior of cluster members during failover when 'Monitor all
VLAN' feature is enabled).

CCP packet header

The protocol is sub-divided into several types. Packets of different types are used to
send machine status reports, query interfaces of other machines, and perform safe update
of their internal state (policy). A special type of message is used to perform State
Synchronization between cluster members, i.e., notify cluster members of connections
handled by each other.
External Header
0
Layer 2 Destination MAC address
16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
48
64
10
11
Layer 2 Source MAC address
TTL
IP
Pro
to
(11)
Total
Length
12
13
Eth Type
(0x0800)
14
IP
Ver
(4)
IP
header
checksum
Layer 3
Source
IP address
UDP
checksum
CCP Header and Data
15
IP
Hdr
Len
Layer 3
Destin.
IP addr
CCP Header and Data (cont.)

CCP Header and Data (cont.)
Important Note: It is not possible to control the CCP packets by security policy rule
base (neither by security rules, nor by NAT rules) because CCP is located between the
Check Point kernel and the network interface.
Length of external headers:
o Ethernet Header = 14 bytes
o IP Header = 20 bytes
o UDP Header = 8 bytes
o CCP offset = 42 bytes from frame start

Let us review the header fields:

o Layer 2
Refer to sk25977 (Connecting multiple clusters to the same network segment (same
VLAN, same switch).
o Destination MAC address (Bytes 0 - 5)
In ClusterXL:
Value
Notes
01:00:5e:YY:ZZ:WW when VIP address is configured on these interfaces
when CCP is set to run in multicast mode
when sent over non-secured (non-sync) interfaces
YY:ZZ:WW = concatenation of 3 last octets of VIP
Algorithm for VIP address = "A"."B"."C"."D":
o If 2nd octet "B" < 127, then Final MAC =
01:00:5E:("B"hex):("C"hex):("D"hex)
Example:
If VIP = 192.50.204.20, then Final MAC =
01:00:5E:("50"hex):("204"hex):("20"hex)
01:00:5E:32:CC:14
o If 2nd octet "B" > 127, then Final MAC =

01:00:5E:("B-128"hex):("C"hex):("D"hex)
Example:
If VIP = 192.168.204.20, then Final MAC =
01:00:5E:("168-128"hex):("204"hex):("20"hex) =
01:00:5E:28:CC:14
01:00:5e:YY:ZZ:WW
when there is no VIP configured on this interface

when CCP is set to run in multicast mode
when sent over non-secured (non-sync) interfaces
YY = the 2nd octet (from the left) of the final
calculated IP address after adding 250 to the
interface's network address
ZZ = the 3rd octet (from the left) of the final calculated
IP address after adding 250 to the interface's network
address
WW = the 4th octet (from the left) of the final
calculated IP address after adding 250 to the
interface's network address

Algorithm:
1. Calculate the interface's network address - perform
logical AND between the interface's IP address
and subnet mask
2. Add 250 to the calculated interface's network
address
3. Convert the 2nd (YY), 3rd (ZZ) and 4th (WW)
octets of the final calculated IP address from Dec
to Hex format
Example #1
A. The interface's IP address and subnet mask are:
192.168.40.100 / 24
B. The interface's network address is:

192.168.40.100 AND 255.255.255.0 =
192.168.40.0
C. The final calculated IP address is:

192.168.40.0 + 250 =
11000000.10101000.00101000.01100000 +
00000000.00000000.00000000.11111010 =
11000000.10101000.00101000.11111010 =
192.168.40.250
D. The converted octets are:

"168" dec = "A8" hex
"40" dec = "28" hex
"250" dec = "FA" hex
E. Hence, the Final MAC:

01:00:5E:("168"hex):("40"hex):("250"hex) =
01:00:5E:A8:28:FA
Example #2
A. The interface's IP address and subnet mask are:
192.168.40.100 / 29
B. The interface's network address is:

192.168.40.100 AND 255.255.255.248 =
192.168.40.96
C. The final calculated IP address is:

192.168.40.96 + 250 =
11000000.10101000.00101000.01100000 +
00000000.00000000.00000000.11111010 =
00000000.00000000.00101001.01011010 =
192.168.41.90
D. The converted octets are:

"168" dec = "A8" hex
"41" dec = "29" hex
"90" dec = "5A" hex
E. Hence, the Final MAC:

01:00:5E:("168"hex):("41"hex):("90"hex) =
01:00:5E:A8:29:5A
FF:FF:FF:FF:FF:FF
when CCP is set to run in broadcast mode

when sent over secured (sync) interfaces

In VSX cluster:
Value
Notes
FF:FF:FF:FF:FF:FF Refer to sk36644.
o In VSX NGX / VSX R65 / VSX R67 / VSX R68:
The only possible mode of CCP is Broadcast.
o In R75.40VS / R76 and above in VSX mode:
o CCP mode over Sync Network is Broadcast for all
Virtual Systems
o CCP mode over non-Sync Networks is Multicast
o In VSLS configuration:
When instances of VSs are not running on all cluster
members (e.g., only 2 VSs were configured on a VSX
cluster that has 4 cluster members), the Delta Sync
packets generated by a VS, are sent in Unicast only to
those members that run the instance of same the VS.
o Source MAC address (Bytes 6 - 11)
Note: The same Source MAC address is used for all the VSs on the same member.
In ClusterXL (on Gaia R77.30 and above) and in VSX mode (R77.30 and above)
before installing the policy for the first time:
1st
00
2nd
00
3rd
00
4th
00
5th
Value derived from
Cluster_Global_ID
6th
21
Notes:
Cluster_Global_ID - controls the value of 5th byte in Source MAC address of
CCP packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on Gaia OS R77.30 and above
o 0xFE hex / 254 dec - ClusterXL VSX mode on Gaia OS R77.30 and above
In ClusterXL (R77.20 and lower) and in VSX mode (R75.40VS / R76 / R77 / R77.10 /
R77.20) before installing the policy for the first time:
1st
00
2nd
00
3rd
00
4th
00
5th
fwha_mac_magic
6th
21
Notes:
fwha_mac_magic - name of the kernel parameter that controls the value of 5th
byte in Source MAC address of CCP packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on R77.20 and lower
o 0xFE hex / 254 dec - ClusterXL VSX mode on R75.40VS / R76 and above
o 0xF6 hex / 246 dec - VSX Cluster from VSX NGX up to VSX R68
sk62432 (Source MAC Address of Cluster Control Protocol (CCP) frames in
ClusterXL before installing the policy for the first time)
sk25977 (Connecting multiple clusters to the same network segment (same
VLAN, same switch)

In ClusterXL running Gaia R77.30 and above:

1st
2nd
3rd
4th
00
00
00
00
5th
Value derived from
Cluster_Global_ID
6th
ID_of_Source_Member
In ClusterXL running Gaia R75.40-R77.20 / SecurePlatform / IPSO:

1st
2nd
3rd
4th
00
00
00
00
5th
Value of
fwha_mac_magic
6th
ID_of_Source_Member
In VSX cluster running Gaia R77.30 and above:

1st
2nd
3rd
4th
00
00
XXXXXXXX
00
5th
Value derived from
Cluster_Global_ID
6th
ID_of_Source_Member
In VSX cluster running Gaia R75.40VS, R76, R77, R77.10, R77.20 /

SecurePlatform VSX NGX, VSX R65, VSX R67, VSX R68 / IPSO VSX R65:
1st
2nd
3rd
4th
00
00
00
00
5th
Value of
fwha_mac_magic
6th
ID_of_Source_Member
Notes:
Cluster_Global_ID - controls the value of 5th byte in Source MAC address of
CCP packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on Gaia OS R77.30 and above
o 0xFE hex / 254 dec - ClusterXL VSX mode on Gaia OS R77.30 and above=
fwha_mac_magic - controls the value of 5th byte in Source MAC address of
CCP packets.
Default values are:
o 0xFE hex / 254 dec - ClusterXL Gateway mode on R77.20 and lower
o 0xFE hex / 254 dec - ClusterXL VSX mode on R75.40VS / R76 and above
o 0xF6 hex / 246 dec - VSX Cluster from VSX NGX up to VSX R68
XXXXXXXX - is either 00000000, or 8 least significant (right-most) bits of VSID
sk25977 (Connecting multiple clusters to the same network segment (same
VLAN, same switch)
Layer 3
Address
Value
0.0.0.0
Source IP address
Destination IP address broadcast
address for
this subnet
Notes
The IP address of the CCP packet on
the receiver side is ignored and is not
being checked.

sk104567 (Traffic passing through the VSX cluster is lost during a cluster failure
on Standby member)

Layer 4 (UDP)
Port
Source port
Destination port
Value
8116
8116
Notes
It is strongly recommended not to pass
any other traffic on UDP port 8116
through ClusterXL
CCP Header
0
16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
Total
Length
UDP
checksum
48
CCP
OpCode
Source IF
Number
Random
ID
Source
Machine
ID
Destin.
Machine
ID
64
CoreXL
instance
ID
VSX
VSID
10
11
TTL
IP
Pro
to
(11)
IP
header
checksum
12
13
Eth Type
(0x0800)
14
IP
Ver
(4)
Layer 3
Source
IP address
Magic
Number
(0x1A90)
Policy ID
15
IP
Hdr
Len
Layer 3
Destin.
IP addr
CCP
Version
Cluster
Number
Filler
Total
num. of
CoreXL
FW inst.
CCP Data
CCP offset = 42 bytes from frame start

Magic Number (Bytes 42 - 43) - Identifies the CCP protocol with the constant number
1A90 hex / 6800 dec.
CCP Version (Bytes 44 - 45) - An integer number that is assigned in each Check Point
version. All member of the same cluster must have identical CCP version (i.e., identical
Check Point software).
When a cluster member receives CCP packets with CCP Version lower than his, it goes
into 'Ready' state (by design). Refer to sk42096 (Cluster member is stuck in 'Ready' state).
Notes:
The CCP version can be obtained by checking the value of kernel parameter
'fwha_version':
[Expert@GW_HostName]# fw ctl get int fwha_version
The CCP version on 32-bit system in Gateway mode equals the value of kernel
parameter fwha_version:
CCP 32-bit GW = fwha_version

The CCP version on 64-bit system in Gateway mode is greater by 1 than CCP
version on 32-bit system in Gateway mode:
CCP 64-bit GW = CCP 32-bit GW + 1 = fwha_version + 1
The CCP version on system in VSX mode is greater by 2 than CCP version on 32-bit
system in Gateway mode:
CCP VSX = CCP 32-bit GW + 2 = fwha_version + 2
CCP Version vs. Check Point software:

Version
Code (Dec)
1
2
3
6
530
534
537
538
540
541
550
551
552
553
591
593
601
Version
Code (Hex)
0x0001
0x0002
0x0003
0x0006
0x0212
0x0216
0x0219
0x0219
0x021C
0x021D
0x0226
0x0227
0x0228
0x0229
0x024F
0x0251
0x0259
602
646
650
665
667
690
691
700
705
710
800
801
802
803
804
0x025A
0x0286
0x028A
0x0299
0x029B
0x02B2
0x02B3
0x02BC
0x02C1
0x02C6
0x0320
0x0321
0x0322
0x0323
0x0324
805
810
811
813
0x0325
0x032A
0x032B
0x032D
Check Point
software version
4.1
NG (FP0)
NG FP1
NG FP2
NG FP3
VSX NG AI R2
VSX NGX EA
VSX NGX GA
NG AI R54 EA
NG AI R54 GA
NG AI R55 (up to HFA_16)
NG AI R55 HFA_17
NG AI R55W
NG AI R55 HFA_18
NG AI R55 LSV
NGX R60 EA
NGX R60 GA
NGX R60 HFA_01
NGX R60 HFA_02
NGX R60 Multicast acceleration
NGX R60 with Anti-Virus
NGX R61 EA2
NGX R61 GA
NGX R62 EA
NGX R62 GA
Connectra NGX R61 EA
Connectra NGX R61 GA
Connectra NGX R66 GA
NGX R65 EA
NGX R65 GA
NGX R65 HFA_01
NGX R65 HFA_02
NGX R65 HFA_02
Connectra NGX R66.1
NGX R65 HFA_03
NGX R65 HFA_03 GA
NGX R65 HFA_40
NGX R65 HFA_50

814
815
816
850
900
901
902
1000
1010
1001
1100
1500
1501
1502
0x032E
0x032F
0x0330
0x0352
0x0384
0x0385
0x0386
0x03E8
0x03F2
0x03E9
0x044C
0x05DC
0x05DD
0x05DE
1505
1506
1508
1516
1518
1520
0x05E1
0x05E2
0x05E4
0x05EC
0x05EE
0x05F0
1523
1505
1555
0x05F3
0x05E1
0x0613
1557
1559
1561
1562
1563
2000
2000
0x0615
0x0617
0x0619
0x061A
0x061B
0x07D0
0x07D0
2005
2010
2020
2210
2211
2500
2501
2502
2500
2501
2502
2220
2221
2225
2226
0x07D5
0x07DA
0x07E4
0x08A2
0x08A3
0x09C4
0x09C5
0x09C6
0x09C4
0x09C5
0x09C6
0x08AC
0x08AD
0x08B1
0x08B2
NGX R65 HFA_50

NGX R65 HFA_60
NGX R65 HFA_70
VSX NGX Scalability Pack
VSX NGX R65 GA
VSX NGX R65 HFA_10
VSX NGX R65 HFA_20
NGX R65 with CoreXL LE
VSX NGX R67 EA
VSX NGX R67 GA
VSX NGX R68 GA
R70 EA
R70 GA
R70.1 EA
R70.1 IPv6Pack HCC
R70.1 GA
R70.1 IPv6Pack
R70.12
R70.20
R70.30
R70.40
GX 5.0 HCC
R70.50
R71 GA (refer to sk50260)
R71.10
R71 VE
R71.20
R71.30
R71.40
R71.45
R71.50
R75 GA
R75.050 for 61000
R75.051 for 61000
R75.052 for 61000
R75.10
R75.20
R75.30
R75.40 32-bit
R75.40 64-bit
R75.40VS for 61000 32-bit
R75.40VS for 61000 64-bit
R75.40VS for 61000 in VSX mode
R75.40VS 32-bit
R75.40VS 64-bit
R75.40VS in VSX mode
R75.45 32-bit
R75.45 64-bit
R75.46 32-bit
R75.46 64-bit

2230
2231
2235
2236
2700
2701
2702
2720
2721
2722
2900
2901
2902
2905
2906
2907
2910
2911
2912
2920
2921
2922
62700
62701
62702
62710
62711
62712
62700
62701
62702
62700
62701
62702
62700
62701
62702
0x08B6
0x08B7
0x08BB
0x08BC
0x0A8C
0x0A8D
0x0A8E
0x0AF0
0x0AA1
0x0AA2
0x0B54
0x0B55
0x0B56
0x0B59
0x0B5A
0x0B5B
0x0B5E
0x0B5F
0x0B60
0x0B68
0x0B69
0x0B6A
0xF4EC
0xF4ED
0xF4EE
0xF4F6
0xF4F7
0xF4F8
0xF4EC
0xF4ED
0xF4EE
0xF4EC
0xF4ED
0xF4EE
0xF4EC
0xF4ED
0xF4EE
R75.47 32-bit
R75.47 64-bit
R75.48 32-bit
R75.48 64-bit
R76 32-bit
R76 64-bit
R76 in VSX mode
R76.10 32-bit
R76.10 64-bit
R76.10 in VSX mode
R77 32-bit
R77 64-bit
R77 in VSX mode
R77.10 32-bit
R77.10 64-bit
R77.10 in VSX mode
R77.20 32-bit
R77.20 64-bit
R77.20 in VSX mode
R77.30 32-bit
R77.30 64-bit
R77.30 in VSX mode
R76SP for 41000/61000 32-bit
R76SP for 41000/61000 64-bit
R76SP for 41000/61000 in VSX mode
R76SP.10 for 41000/61000 32-bit
R76SP.10 for 41000/61000 64-bit
R76SP.10 for 41000/61000 in VSX mode
R76SP.10_VSLS for 41000/61000 32-bit
R76SP.10_VSLS for 41000/61000 64-bit
R76SP.10_VSLS for 41000/61000 in VSX mode
R76SP.20 for 41000/61000 32-bit
R76SP.20 for 41000/61000 64-bit
R76SP.30 for 41000/61000 32-bit
R76SP.30 for 41000/61000 64-bit

Cluster Number (Bytes 46 - 47) - This number identifies the cluster, on which this
datagram is communicated. The cluster number is set by Security Management Server.
CCP OpCode (Bytes 48 - 49) - This code identifies the type of CCP packet. Each CCP
OpCode implies a different structure of thepacketsData section (see below).
Refer to this document (the structure of CCP Data has not changed):
o NGX R60 Advanced Technical Reference Guide (ATRG) - Chapter 11 ClusterXL Debugging CPHA Issues - General Analysis Matrix for CPHA Packets
OpCode
1
2
3
4
5
6
Type
FWHA_MY_STATE
FWHA_QUERY_STATE
FWHA_IF_PROBE_REQ
FWHA_IF_PROBE_REPLY
FWHAP_IFCONF_REQ
FWHAP_IFCONF_RPLY
FWHAP_LB_CONF
FWHAP_LB_CONF_CONFIRM
9
10
FWHAP_POLICY_CHANGE
FWHAP_SYNC
11
FWHAP_CHASSIS_STATE
12
FWHAP_CHASSIS_FREEZE
13
FWHAP_SECURITY_GROUP
14
FWHAP_CHASSIS_SYNC_LOST
15
FWHAP_CHASSIS_LINK_STATE
16
FWHAP_CHASSIS_GENERAL_INFO
Description
Report source machine's state
Query other machine's state
Interface active check (probe) request
Interface active check (probe) reply
Interface configuration request
Interface configuration reply
Load Balancing (Load Sharing)
configuration report
Load Balancing (Load Sharing)
configuration report and a request for
its confirmation (a reply to
FWHAP_LB_CONF)
Policy ID change request/notification
Delta Sync packets ("New" version)
Only on 41000/61000 appliance:
Chassis protocol
Chassis freeze mechanism
(freeze after failover)
Security group advertising
Chassis sync lost mechanism
(freeze when sync is lost)
Chassis link state mechanism
(freeze when sync is lost)
Additional Chassis info
Source IF Number (Bytes 50 - 51) - The ID of the network interface that originated this
CCP packet.
These IDs are assigned by Check Point kernel during attachment to the interfaces.
Refer to the output of the 'fw ctl iflist' command on each cluster member (Note:
these outputs show the local configuration on the cluster member, and therefore do not
have to be identical on all cluster members).
Random ID (Bytes 52 - 53) - Each cluster member is assigned a random ID upon boot.
This field states the random ID of the machine that originated this CCP packet.

Source Machine ID (Bytes 54 - 55) - The ID of the machine that originated the packet
based on the internal cluster numbering (starts from zero). Each cluster member is given a
number, which identifies it within the cluster - refer to the output of 'cphaprob state'
command.
These numbers are assigned based on the priority of cluster members as configured in
SmartDashboard - cluster object - 'ClusterXL Members' pane (the higher the member is
located in this list, the higher its priority and the lower its ID).
Destination Machine ID (Bytes 56 - 57) - The ID of the machine, for which this CCP
packet is intended based on the internal cluster numbering (starts from zero). Each cluster
member is given a number, which identifies it within the cluster - refer to the output of
'cphaprob state' command.
These numbers are assigned based on the priority of cluster members as configured in
SmartDashboard - cluster object - 'ClusterXL Members' pane (the higher the member is
located in this list, the higher its priority and the lower its ID).
Policy ID (Bytes 58 - 59) - Each policy installed on cluster member is identified by a unique
ID. This enables different cluster members to verify they are working under the same
policy. Policy ID can be seen only during cluster debug (fw ctl debug -m cluster +
conf).
Note: To handle a situation, where one member has already enforced the new policy ID,
and sends Delta Sync packets to member, who has not yet done so, we regard packets
that contain the previous policy ID as legal, for a short period after the end of the policy
negotiations.
Filler (Bytes 60 - 61) - Originally, this field was used to align the CCP header, and it was
always set to 0.
As of NG FP3, this field is also used to indicate the status of the source machine in
Service Mode only. Possible values for this field are 1 for 'Active' and 0 for 'Down'.
Starting in NG FP4, the Filler has 2 fields in Service Mode:
The first byte (nibble) contains the member status (as in NG FP3):
o If it contains 1, then in 'Sync only' mode, the member is ready to accept a Full
Sync from other cluster members. Otherwise, it can not act as a Full Sync server.
This can happen if Full Sync has failed, or if there is no policy yet.
The second byte (nibble) contains the pnote status.
o If it contains 0, then all pnotes report their status as 'OK'.
o Otherwise, it will contain 1.
Note: The 'Filler' field is relevant only in cluster running on IPSO OS, in which a member
state is updated also by the statuses of pnotes. In other 3rd party solutions, the pnote
status is passed on the network, but is being disregarded by Check Point code.
Total num. of CoreXL FW inst. (Bytes 62 - 63) - Total number of loaded CoreXL FW
instances. This field exists since R70.
CoreXL instance ID (Bytes 64 - 65) - The ID of the CoreXL FW instance, to which this
CCP packet belongs (sent from/to). This field exists since R70.
VSX VSID (Bytes 66 - 67) - ID of the Virtual System, to which this CCP packet belongs
(sent from/to). In non-VSX, always contains 0.
This field exists in R75.40VS, R76, R77 and above.
CCP Data (Bytes 68 - above) - Each CCP OpCode implies a different structure of the
packetsData section.

Let us review several types of CCP packets in detail:

FWHA_MY_STATE Data
FWHA_QUERY_STATE Data
FWHAP_IF_PROBE_REQ Data
FWHAP_IF_PROBE_RPLY Data
FWHAP_IFCONF_REQ Data
FWHAP_IFCONF_RPLY Data
FWHAP_POLICY_CHANGE Data
FWHAP_SYNC Data
FWHA_MY_STATE Data
OpCode
1
Type
FWHA_MY_STATE
Description
The FWHAP_MY_STATE OpCode designates a packet containing a report on the state of

the machines in the cluster, as known to the source machine, as well as a report on the
state of the source machine. Since this packet may induce state changes on the receiving
cluster member, the receiving member will accept state changes only if the packet was
received on a secured interface.
Note: These packets are not sent in 3rd party clusters.
0
16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
Total
Length
UDP
checksum
48
CCP
OpCode
Source IF
Number
Random
ID
Source
Machine
ID
Destin.
Machine
ID
64
CoreXL
instance
ID
VSX
VSID
Number
of
Reported
IDs
Report
Code
IFR
In
IF
As.
In
IF
Out
IF
As.
Out
IF
LPT
of
ID 0
LPT
of
ID 1
10
11

IP
Pro
to
(11)
TTL
...
IP
header
checksum
HA Mode
12
13
Eth Type
(0x0800)
Layer 3
Source
IP address
Magic
Number
(0x1A90)
15
IP
Hdr
Len
Layer 3
Destin.
IP addr
CCP
Version
Cluster
Number
Filler
Total
num. of
CoreXL
FW inst.
Policy ID
Problem
14
IP
Ver
(4)
State
State
of
of
ID 0
ID 1
...
...
...
The byte IFR (Interface Report) is calculated using the following formula:
IFR = 70 + <Number of Reported IDs>
Number of Reported IDs (Bytes 68 - 69) - Specifies the number of machines, for which
the state is reported.

Report Code (Bytes 70 - 71) - Flags indicating whether this packet contains a machine
state report, an interface state report, or both. The flags specified below can be combined
together using bitwise OR to form the field value:
Value
0x1
0x2
Flag Name
Description
FWHAP_RP_MACHINE_STATE Report source machine's state
FWHAP_RP_IF_STATE
Queryanothermachinesstate
HA Mode (Bytes 72 - 73) - Contains the mode of the machine that sent this datagram.
Value
0
1
2
3
4
Mode Name
FWHA_UNDEF_MODE
FWHA_NOT_ACTIVE_MODE
FWHA_BALANCE_MODE
FWHA_PRIMARY_UP_MODE
FWHA_ONE_UP_MODE
Description
HA is not active
More than one machine is active
Backup mode: active machine is the one
with the lowest ID alive
Backup mode: active machine remains
active until it dies
Problem (Bytes 74 - 75) - This field contains a Boolean value:

1 - the machine that originated this packet has a problem
0 - otherwise
Note: On NG FP2, this field held a time stamp, which was set to the number of tenths of
secondselapsedsincethemachineslastboot.
State of ID x (Bytes 76+x) - ReportsthestateofthemachinewhoseinternalIDisx.
Possible states are:
Value State Name
0
FWHA_FW_DEAD
1
FWHA_FW_INIT
2
FWHA_FW_STANDBY
FWHA_FW_READY
4
10
FWHA_FW_ACTIVE
FWHA_FW_TOTAL_DEAD
Description
Machine reports itself as dead
Machine is up and running, but is not
ready to receive packets yet
Machine is able to process packets, but
is currently set as a backup machine
Machine is ready to process packets,
but is currently waiting for other
machines to confirm their states
Machine is filtering packets
Timeout occurred waiting for this
machine to report (more than 1 sec)
In IF (Byte IFR) - Number of interfaces currently up, in the inbound direction on the source
machine.
As. In IF (Byte IFR+1) - Number of interfaces currently assumed to be up, in the inbound
direction on the source Machine. (Bytes 70+2x - 70+2x+1)
Out IF (Byte IFR+2) - Number of interfaces currently up, in the outbound direction on the
source machine.

As. Out IF (Byte IFR+3) - Number of interfaces currently assumed to be up, in the
outbound direction on the source machine.
LPT of ID x (Byte IFR+4+x) - Reports the time, in HA time units (10 HA time units ~ 1
second), elapsed since the last CCP packet was received from machine with ID x.
Note: HA time units are mostly used by Check Point RnD.
FWHA_QUERY_STATE Data
OpCode
2
Type
FWHA_QUERY_STATE
Description
Query other machine's state
These packets are used by a cluster member to ask another member for its status. This
is used when source member stopped receiving CCP packets from another member for
some time (0.2 seconds) and may want to inquire the other member to see if it is "alive".
This CCP packet does not have any CCP Data.
FWHAP_IF_PROBE_REQ Data
OpCode
3
Type
FWHA_IF_PROBE_REQ
Description
Interface active check request
An interface probing is a mechanism, which allows a machine to verify that its interfaces
are up and are able to receive and transmit data.
These packets are used to verify the status of each interface.
This is done to detect connectivity problems of the interfaces.
Refer to Clustering Definitions and Terms section.
0
16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
Total
Length
UDP
checksum
48
CCP
OpCode
Source IF
Number
Random
ID
Source
Machine
ID
Destin.
Machine
ID
64
CoreXL
instance
ID
VSX
VSID
Interface
Number
10
11
TTL
IP
Pro
to
(11)
IP
header
checksum
12
13
Eth Type
(0x0800)
Layer 3
Source
IP address
Magic
Number
(0x1A90)
Policy ID
14
IP
Ver
(4)
15
IP
Hdr
Len
Layer 3
Destin.
IP addr
CCP
Version
Cluster
Number
Filler
Total
num. of
CoreXL
FW inst.

Interface Number (Bytes 62 - 63) - FireWall-1 serial interface number of the queried
interface. Refer to the output of 'fw ctl iflist' command.
FWHAP_IF_PROBE_RPLY Data
OpCode
4
Type
FWHA_IF_PROBE_REPLY
Description
Interface active check reply
An interface probing is a mechanism, which allows a machine to verify that its interfaces
are up and are able to receive and transmit data.
This packet is a reply to FWHAP_IF_PROBE_REQ packet.
These packets are used to verify the status of each interface.
This is done to detect connectivity problems of the interfaces.
Note: The transmit state of an interface (as monitored by 'Interface Active Check' pnote)
is refreshed once a FWHAP_IF_PROBE_RPLY packet is received in acknowledge to
Note: The 'FWHA_IF_PROBE_REPLY' packet is always sent with Layer 2 Destination

MAC address of subnet Broadcast FF:FF:FF:FF:FF:FF. Refer to sk44410 (CCP packets
are sent in Broadcast although CCP mode is set to Multicast).
0
16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
Total
Length
UDP
checksum
48
CCP
OpCode
Source IF
Number
Random
ID
Source
Machine
ID
Destin.
Machine
ID
64
CoreXL
instance
ID
VSX
VSID
Interface
Number
10
11
TTL
IP
Pro
to
(11)
IP
header
checksum
12
13
Eth Type
(0x0800)
Layer 3
Source
IP address
Magic
Number
(0x1A90)
Policy ID
14
IP
Ver
(4)
15
IP
Hdr
Len
Layer 3
Destin.
IP addr
CCP
Version
Cluster
Number
Filler
Total
num. of
CoreXL
FW inst.
Interface Number (Bytes 62 - 63) - FireWall-1 serial interface number of the queried
interface. Refer to the output of 'fw ctl iflist' command.

FWHAP_IFCONF_REQ Data
OpCode
5
Type
FWHAP_IFCONF_REQ
Explanation
Interface configuration request
These packets are used in order to learn the following information about peer
cluster members:
o Interfaces
o IP addresses
o MAC addresses
These packets are sent occasionally to verify the IP addresses are still the same.
ClusterXL uses these packets in order to discover cluster misconfiguration as follows:
o whether one machine considers an interface secured, while the other does not
o whether the IP addresses reported by the sending machine belong to a different
interface on the receiving machine (which may indicate a cable connectivity
problems).
This CCP packet does not have any CCP Data.
Note: The 'FWHA_IFCONF_REQ' packet is always sent with Layer 2 Destination MAC
address of subnet Broadcast FF:FF:FF:FF:FF:FF. Refer to sk44410 (CCP packets are
sent in Broadcast although CCP mode is set to Multicast).
Note: These packets are sent in 3rd party clusters.
FWHAP_IFCONF_RPLY Data
OpCode
6
Type
FWHAP_IFCONF_RPLY
Explanation
Interface configuration reply
Packets of this type, sent at fixed intervals, or as a reply to FWHAP_IFCONF_REQ

packets, contain a report on the interface originating the message.
These packets are also sent when the machine boots to "tell" all other cluster members
about the IP addresses and MAC addresses of the new machine.
These packets are sent on each interface, and contain indication whether the sending
machine considers this interface secured.
ClusterXL uses these packets in order to discover cluster misconfiguration as follows:
o whether one machine considers an interface secured, while the other does not
o whether the IP addresses reported by the sending machine belong to a different
interface on the receiving machine (which may indicate a cable connectivity
problems).


0
10
16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
Total
Length
UDP
checksum
48
CCP
OpCode
Source IF
Number
Random
ID
Source
Machine
ID
Destin.
Machine
ID
CoreXL
VSX
instance
VSID
ID
IP addr 1
64
80
11
TTL
IP
Pro
to
(11)
Number of
Reported
IPs
IP addr 2
IP
header
checksum
12
13
14
IP
Ver
(4)
Eth Type
(0x0800)
Layer 3
Source
IP address
Magic
Number
(0x1A90)
Policy ID
Layer 3
Destin.
IP addr
CCP
Version
Cluster
Number
Total
num. of
CoreXL
FW inst.
Trusted
interface
?
Filler
Ethernet Address
IP addr 3
15
IP
Hdr
Len
...
Number of Reported IPs (Bytes 68 - 69) - Number of IP addresses associated with this
interface.
Ethernet Address (Bytes 70 - 75) - The real Ethernet address of the interface (as opposed
tothephonyaddress,seeExternal Header).
Trusted interface? (Bytes 76 - 77) - Boolean value: 1 if this interface is trusted (secured),
0 otherwise.
IP addr X (Bytes 78+4x - 78+4x+3) - IP address number X associated with the reporting
interface (ClusterXL uses only the first configured IP address).
FWHAP_POLICY_CHANGE Data
OpCode
9
Type
FWHAP_POLICY_CHANGE
Explanation
Policy ID change request/notification
This message type is used to synchronize cluster members when configuration

parameters change. It ensures that all cluster members will activate the changes
simultaneously, so as to avoid configuration conflicts, and in order to verify the entire
cluster enforces the same policy at any given time.

16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
Total
Length
UDP
checksum
48
CCP
OpCode
Source IF
Number
Random
ID
Source
Machine
ID
Destin.
Machine
ID
64
CoreXL
instance
ID
VSX
VSID
10
11
12
TTL
Policy
Update
State
IP
Pro
to
(11)
IP
header
checksum
13
Eth Type
(0x0800)
Layer 3
Source
IP address
Magic
Number
(0x1A90)
Policy ID
14
IP
Ver
(4)
15
IP
Hdr
Len
Layer 3
Destin.
IP addr
CCP
Version
Cluster
Number
Filler
Total
num. of
CoreXL
FW inst.
New Policy ID
Policy Update State (Bytes 68 - 71) - The members that originated this packet, notifies the
other members whether or not it needs to change its own configuration, due to the new
policy. The message is also used to notify all cluster members that the originator is ready to
apply the changes.
Possible values are:
Value
Name
Description
FWHA_POLICY_UPD_INIT
1
This member does not need
to update its configuration
FWHA_POLICY_UPD_NEED
2
This member needs to update
its configuration to conform
with the new policy
FWHA_POLICY_UPD_READY
3
This member is ready to apply
the configuration changes
FWHA_POLICY_UPD_NEW
4
This member has just joined the cluster,
and has already applied the new policy
New Policy ID (Bytes 72 - 75) - Specifies the ID of the new policy, which the source
member is trying to enforce. All cluster members should agree on this value before the
policy can be updated.
This field contains last two bytes of MD4 hash of Policy ID (Policy ID is generated by the
Security Management Server based on the contents of compiled policy files <PolicyName>.ft, <PolicyName>.fc, <PolicyName>.set).
FWHAP_SYNC Data
OpCode
10
Type
FWHAP_SYNC
Explanation
New Delta Sync packets
This packet type defines a sub-protocol of CCP, used to maintain the State
Synchronization between cluster members. This is done by sending updates about the
FireWall kernel tables wrapped in the CCP packet data.
Refer to State Synchronization in ClusterXL section.


0
16
Total
Length
IP
datagram
ID
IP Flags +
Fragment
Offset
32
Layer 3
Destin.
IP addr
(cont.)
Layer 4
Source
Port
Layer 4
Destin.
Port
Total
Length
UDP
checksum
48
CCP
OpCode
Source IF
Number
Random
ID
Source
Machine
ID
Destin.
Machine
ID
64
CoreXL
instance
ID
VSX
VSID
10
11
Sequence
Number
Sync OP
Flags
TTL
IP
Pro
to
(11)
IP
header
checksum
12
13
Eth Type
(0x0800)
14
IP
Ver
(4)
Layer 3
Source
IP address
Magic
Number
(0x1A90)
Policy ID
15
IP
Hdr
Len
Layer 3
Destin.
IP addr
CCP
Version
Cluster
Number
Filler
Total
num. of
CoreXL
FW inst.
Sync OP Specific Data
Sequence Number (Bytes 68 - 70) - Uniquely identifies the packet, in case a

retransmission is needed.
Sync OP (Byte 71, Lower Nibble) - Defines a Sync protocol OpCode:
OpCode value
0
OpCode name
BC_MSG
BC_RETRANS_REQ
BC_RETRANS_REQ
BC_RETRANS_REJECT
Explanation
Holds FireWall table data
(may be fragmented)
Request to retransmit
missing data fragments
Request an ACK message
from peer members
Rejects a retransmission request
Flags (Byte 71, Upper Nibble) - A bit-wise combination of the following values:
Flag Value
0x80
0x10
0x20
Flag Name
BC_ACK_FLAG
Explanation
Indicates an acknowledge is
required for flushed data
BC_FRAGM_FLAG
Indicates this packet is a single
fragment of a larger message
BC_LAST_FRAGM_FLAG Indicates this is the last
fragment in the message

ClusterXL Monitoring and Troubleshooting

Refer to ClusterXL Debugging section.
sk56202 (How to troubleshoot failovers in ClusterXL)
sk62570 (How to troubleshoot failovers in ClusterXL - Advanced Guide)
sk81740 (How to configure a cluster to send Mail Alert upon fail-over)
sk67560 (How to export History Report from SmartView Monitor)
sk52421 (Ports used by Check Point software)
SmartView Tracker
The best and simplest way to start cluster troubleshooting, is to check the cluster logs
(pre-requisite for such logs is to set 'Track changes in the status of cluster
members' to 'Log' in SmartDashboard - cluster object - ClusterXL - Tracking).
Refer to Configuring cluster object in SmartDashboard section.
In SmartView Tracker:
1. Open the FireWall log that contains the data from the time of the cluster problem.
2. Go to the 'Date' column
A. Right-click on the 'Date' column header
B. Click on 'Edit Filter...'
C. Select the relevant date
D. Click on OK button
3. Go to the 'Time' column
A. Right-click on the 'Time' column header
C. Select the relevant time
D. Click on OK button
4. Go to the 'Information' column
A. Right-click on the 'Information' column header
C. Select 'Specific'
D. In 'Field' - select 'Contains'
E. In 'Text' - type cluster_info
F. Click on OK button
5. Analyze the cluster logs
6. Go to 'File' menu - click on 'Export...'
R75.40VS, R76, R77) - Chapter 'Monitoring and Troubleshooting Gateway Clusters' Monitoring Cluster Status Using SmartConsole Clients - SmartView Tracker.
Refer to SmartView Tracker Administration Guide (R75.40, R75.40VS, R76, R77).

SmartView Monitor
SmartView Monitor displays a snapshot of all ClusterXL cluster members, enabling real
time monitoring and alerting. For each cluster member, state change and critical device
problem notifications are displayed.
The SmartView Monitor GUI client communicates with the cluster member via the
Check Point Application Monitoring (AMON) Infrastructure.
The AMON client (SmartView Monitor GUI) sends a request for some specific OID
(SNMP Get) to the AMON server on the cluster member. The AMON server queries the
Check Point kernel (in the same way as the "cphaprob" commands) in order to retrieve the
requested information.
The information is then formatted into MIB (SNMP Response) and sent back to the
AMON client for display.
It is also possible to stop and start ClusterXL on the member:
1. On the left, go to Gateways Status view.
2. Select the relevant cluster member of a given cluster.
3. Right-click on the selected member.
4. Go to Cluster Member menu
5. Select the relevant operation - 'Stop Member' or 'Start Member'.
Notes:
SmartView Monitor uses a separate Check Point infrastructure to control ClusterXL
(special internal command is sent from SmartView Monitor to Security Management
Server that manages this cluster, which sends another internal command to perform
the requested operation on ClusterXL).
Complicated debug is required in order to see this communication (FWM and CPD
daemons on Management Server, and CPD daemon on cluster member).
Cluster administrator should use command line on each cluster member to control
ClusterXL (cpstart/cpstop ; cphastart/cphastop).
R75.40VS, R76, R77) - Chapter 'Monitoring and Troubleshooting Gateway Clusters' Monitoring Cluster Status Using SmartConsole Clients - SmartView Monitor.
Refer to SmartView Monitor Administration Guide (R70, R71, R75, R75.20, R75.40,
R75.40VS, R76, R77) - Chapter 'Monitoring Gateway Status' - Configuring Gateway Views
- Start/Stop Cluster Member.
sk67560 (How to export History Report from SmartView Monitor)
sk31961 (When viewing a ClusterXL Member via SmartView Monitor, VLAN
Interfaces not visible)
sk88360 ('Error: 'ClusterXL' is not responding. Verify that 'ClusterXL' is installed on
the gateway' message in SmartView Monitor)
sk53701 (ClusterXL works correctly in HA mode, but in LS mode a member is shown
as 'Disconnected' in SmartView Monitor, and policy installation intermittently fails on
that member with SIC error no. 148)

Clock synchronization
Refer to Clock Synchronization section.
If clocks on cluster members are out of sync, then the SIC communication between the
members and the VPN will fail.
sk92602 (How to troubleshoot NTP on Gaia OS)
sk90365 (Enabling NTP causes OSPF adjacencies to disconnect)
sk92984 (NTP client on Gaia fails to synchronize with Windows 2003)
sk40322 (Is it recommended to use NTP with VRRP or IP Clustering?)
sk39783 (NTP process fails after there is a VRRP state change)
sk67740 (How to stop 'ntpdate[PID]: adjust time server' logs in /var/log/messages)
sk32647 (Entries in /var/log/messages files have different timestamps when using
NTP Server - some entries are shown with local time, and some entries are shown
with correct UTC/GMT time)
CCP mode
Refer to Cluster Control Protocol (CCP) section and to ClusterXL Requirements for
Hardware and Software section.
SecureXL
Refer to Requirements for software section and to SecureXL section.
sk32578 (SecureXL Mechanism)
sk98722 (ATRG: SecureXL)
sk71200 (SecureXL NAT Templates)
sk67861 (Accelerated Drop Rules Feature in R75.40 and above)
sk66402 (SecureXL Drop Templates are not supported in versions lower than R76)
sk79620 (SecureXL 'sim affinity -s' settings do not survive reboot)
sk61962 (SMP IRQ Affinity on Check Point Security Gateway)
sk62441 (Problems with VPN and NAT when SecureXL is enabled)
sk93308 (Security Gateway randomly reboots when IPS or SecureXL is enabled)
sk82280 (Security Gateway with Route Based VPN configuration crashes when
SecureXL is enabled)
sk90301 (SecureXL does not start on the Backup member of VRRP cluster after
reboot)
sk79880 (Traffic is dropped 'by cphwd_offload_conn Reason: VPN and/or NAT
traffic between accelerated and non-accelerated interfaces or between nonaccelerated interfaces is not allowed')
sk93348 (On R75.40VS in VSX mode, traffic does not pass from Virtual Router to
Virtual System when SecureXL is enabled)

sk31404 (How to Debug SecureXL)

sk35175 (Security Gateway does not apply 'keep_DF_flag' parameter when
sk90740 (Latency after upgrade from NGX R65 to R75.x when SecureXL is enabled)
sk82280 (Security Gateway with Route Based VPN configuration crashes when
sk80940 (Multi-Queue hotfix for Security Gateway)
CoreXL
Refer to Requirements for software section and to CoreXL section.
sk61701 (CoreXL Known Limitations)
sk98737 (ATRG: CoreXL)
sk42096 (Cluster member is stuck in 'Ready' state)
sk44488 (CoreXL is enabled, however not all available CPU cores are used)
sk36750 ("License violation: The current machine has M CPU cores and the
installed license is valid for up to N CPU cores" error when installing license)
sk61284 (CoreXL Affinity settings of daemons do not survive reboot)
sk64301 (CoreXL interface affinity is not enforced, even if SecureXL is disabled)
sk76800 (IP Pool NAT support in CoreXL)
sk53060 (URI Resource and CoreXL)
sk86401 (Connections with Hide NAT are dropped during policy installation due to
NAT port allocation failure when CoreXL is enabled)
sk65463 ('Peak' number of connections - discrepancy between the output of 'fw tab t connections -s' command and the output of 'fw ctl pstat' command when CoreXL is
enabled)
sk83300 (Packets are dropped on Trusted Interface MPLS when CoreXL is enabled)
sk43443 (How to debug CoreXL)
sk80940 (Multi-Queue hotfix for Security Gateway)
VPN
Refer to VPN section.
sk92332 (Customizing the VPN configuration for Check Point Security Gateway 'vpn_table.def' file)
sk108600 (VPN Site-to-Site with 3rd party)
sk93204 (Troubleshooting "Clear text packet should be encrypted" error in
ClusterXL)
sk61902 (How to start VPND daemon under debug)
skI4326 (Enabling IKE and VPN debugging)
sk33327 (How to generate a valid ike debug, vpn debug and fw monitor)

sk63560 (How to run complete VPN debug on Security Gateway to troubleshoot

VPN issues)
sk40114 (What files and information are needed to troubleshoot VPN related issues)
sk92465 (Slow Site-to-Site VPN affected by Virtual Defragmentation)
sk34467 (Debugging Site to Site VPN)
sk31114 (How to collect debug on Security Gateway during boot)
NAT
Refer to NAT section.
The following command allows to work with the NAT table (fwx_alloc, ID 8187):
[Expert@Member]# fw tab -t fwx_alloc [flags]
For more information on the 'fw tab' command, refer to Command Line Interface
Reference Guide - Chapter 'Security Management Server and Firewall Commands' - fw - fw
tab.

sk34180 (Outgoing connections from cluster members are sent with cluster Virtual
IP address instead of member's Physical IP address)
sk32224 (NAT Table 'fwx_alloc')
sk36708 (NAT table reaches its maximum capacity on ClusterXL, which causes
traffic issues)
sk35733 (NAT Table (fwx_alloc) is larger than Connections Table (connections) on
ClusterXL members with enabled State Synchronization)
sk60343 (How To Troubleshoot NAT-related Issues)
sk30197 (Configuring Proxy ARP for Manual NAT)
sk78340 (NATed connections are dropped during ISP Link failover if CoreXL is
enabled)
sk86401 (Connections with Hide NAT are dropped during policy installation due to
NAT port allocation failure when CoreXL is enabled)
sk69480 ('NAT Hide failure - there are currently no available ports for hide operation'
log appears repeatedly in SmartView Tracker)

VLAN
Refer to VLAN section and to CCP and VLAN interfaces section.
sk92826 (ClusterXL VLAN monitoring)
sk61323 (Monitoring of VLAN interfaces in ClusterXL)
sk92784 (Configuring VLAN Monitoring on ClusterXL for specific VLAN interface)
sk35462 (Abnormal behavior of cluster members during failover when 'Monitor all
VLAN' feature is enabled)
sk95218 (Disconnected monitored VLAN can cause ClusterXL upgrade failure)
Link Aggregation (Bonding)

Refer to Link Aggregation (Bonding) section.
Refer to ClusterXL Administration Guide (R65, R70, R70.1, R71, R75, R75.20, R75.40,
Clusters - Troubleshooting Bonded Interfaces.
sk64009 (Bond interface configured on 10 Gb Intel interfaces - enslaved interfaces
do not have link after reboot)
sk93341 (Output of 'cphaprob -a if' shows Bond interface as 'Down' in the context of
any Virtual System)
sk71880 (After several reboots of VSX Gateway, state of Bond interface is 'Down')
sk97779 (Critical minimal number of interfaces in a Bond in VRRP cluster running on
Gaia OS)
Checking Bond status

1. Run the command 'cphaconf show_bond -a', and note the state of the bond
interfaces.
Note: On Linux-based OS, this information is taken from
/proc/net/bonding/BondName.
2. Run the command 'cphaconf show_bond BondName', and note which interfaces
are active inside the bond interface.
Note: On Linux-based OS, this information is taken from
/proc/net/bonding/BondName.

Possible statuses of slave interfaces:

Down - (only Bond in Load Sharing mode) the physical link is down.
Active - currently is handling traffic.
Standby - (only Bond in High Availability mode) the interface is ready and can
support internal bond failover.
Not Available - (only Bond in High Availability mode) the physical link is broken,
or the Cluster member is in status Down. The bond cannot failover internally in
this state.
3. Check the state of slave interfaces directly with 'cat
/proc/net/bonding/BondName' command.
Pay attention to the link status on physical slave interfaces and to the bond
parameters, compare these to the configuration on the switch(es).
Bond in High Availability mode

To see whether the bond can failover internally:
[Expert@Member]# cphaprob -a if
On the corresponding line for the bond interface, the words 'can failover' must
appear.
To test the internal bond failover:
[Expert@Member]# cphaconf failover_bond BondName.
The standby and active interfaces should switch.
Bond in Load Sharing mode
Important Note: Bond in Load Sharing requires Performance Pack (SecureXL) to be
enabled.
Check the configuration of critical minimum number of slave interfaces :
[Expert@Member]# cat $FWDIR/conf/cpha_bond_ls_config.conf
In order to improve the performance, SIM Affinity should be configured to run in Static
mode via 'sim affinity -s' command.

Adding another member or interface to an existing cluster

Any change in the physical of software configuration of an existing cluster might cause a
failover.
Therefore, in order to avoid traffic outage and to have the ability to troubleshoot, all such
changes must be carried out during a maintenance window.
R75.40VS, R76, R77) - Chapter 'ClusterXL Advanced Configuration' - Adding Another
Member to an Existing Cluster.
sk57100 (Adding or removing an interface in ClusterXL High Availability topology
might cause fail-over)
sk39047 (Output of 'cphaprob -a if' command shows 'Sync will not function since
there aren't any sync(secured) interfaces' after replacing one VRRP/Clustered pair
with new or upgraded hardware)
If some changes must be performed in "real time" (e.g., installation of a hotfix), then
follow these suggestions:
In High Availability mode cluster:

o Perform the necessary changes on Standby member
o Make sure that the machine is up and running
o Make sure that the ClusterXL is running correctly on Standby member (refer to
'cphaprob' command section)
o Perform a manual failover from currently Active member to Standby member per
sk55081 (Best practice for manual fail-over in ClusterXL)
o Make sure that the new Active member processes the traffic correctly
o Perform the necessary changes on former Active member
In Load Sharing Multicast mode cluster:
o Select one of the members
o Perform the necessary changes on the selected member
o Make sure that the ClusterXL is running correctly on selected member (refer to
o Perform a manual failover from the "unchanged" member to the "changed"
member per sk55081 (Best practice for manual fail-over in ClusterXL)
o Make sure that the traffic passes correctly
o Perform the necessary changes on "unchanged" member
In Load Sharing Unicast mode cluster:
o Perform the necessary changes on Non-Pivot member
o Make sure that the ClusterXL is running correctly on Non-Pivot member (refer to
o Perform a manual failover from current Pivot member to Non-Pivot member per
o Make sure that the new Pivot member processes the traffic correctly
o Perform the necessary changes on former Pivot member

ISP Redundancy
Refer to section ISP Redundancy section.
sk42636 (Controlling connections configured with ISP Redundancy in Load Sharing
mode)
sk66521 (ISP Redundancy in ClusterXL when interfaces of cluster members and
cluster VIP are defined on different subnets per sk32073)
sk25152 (Static (Hide) NAT fails for outgoing connections through gateway with ISP
Redundancy in Load Sharing mode)
sk60590 (ISP Redundancy is missing from the gateway or cluster object)
sk61692 (Troubleshooting ISP Redundancy)
sk65341 (ISP Redundancy probing is not working in ClusterXL)
sk83900 (ISP Redundancy failover is not working in Gaia OS)
sk31530 (ISP Redundancy Link Interface cannot be created)
sk40958 (How to verify the status of ISP Redundancy links on command line)
Dynamic Routing
Refer to Dynamic Routing section.
sk31243 (ClusterXL member is "Down" due to Critical device "FIB")
sk43281 (FIBMGR packets dropped by fw_cluster_ttl_anti_spoofing Reason: ttl
check drop)
sk43243 (How to debug FIBMGRD daemon)
sk41393 (How to Troubleshoot OSPF Problems)
sk40164 (What Information do I collect for OSPF issues?)
sk33201 (Regarding ClusterXL and OSPF)
sk36231 (OSPF equal multipath support in SecurePlatform Pro)
sk82600 (Graceful restart for OSPF and BGP in Gaia does not work)
sk32568 (How to increase OSPF adjacency membership on SecurePlatform Pro)
sk84520 (How to debug OSPF and RouteD daemon on Gaia)
sk60860 (How to debug OSPF and GateD daemon on SecurePlatform Pro)
sk60861 (How to debug BGP and GateD daemon on SecurePlatform Pro)
sk92598 (How to collect traces and debugs information for PIM and Multicast on
Gaia)
sk85280 (Advanced Routing (OSPF, BGP, etc) configuration is not saved by 'save
configuration <file name>' command in Gaia CLISH shell)

SNMP
Refer to SNMP section.
sk59023 (Disable verbose SNMP logging - "snmpd[PID]: Received SNMP packet(s)
from UDP:")
sk66648 (SecurePlatform does not send SNMP Traps)
sk66581 (SecurePlatform sends SNMP Traps only to one sink server, although
several sink servers were configured; SNMP Traps are always sent with with 'public'
community name)
sk93644 (How to bind SNMPD on SecurePlatform OS to specific interface)
sk80820 (LinkUp/LinkDown (linkUpLinkDown) Trap is not working on Gaia)
sk72760 ('snmpwalk' always reports speed of Bond and Bridge interfaces as 10
Mbps)
sk77260 ('snmpwalk' always reports speed of 10 Gb interfaces as 10 Mbps)
sk90362 (SNMPD daemon fails to start on Gaia OS)
sk89300 (SNMPD daemon crashes after interface IP address change on Gaia OS)
sk61425 (Machine with Check Point software responds with 'No Such Object
available on this agent at this OID' to Check Point SNMP OID, but responds
correctly to generic SNMP OID)
sk69625 (Gaia does not provide SNMP RAID Trap)
sk66585 (/var/log/messages shows - snmpd[PID]: /etc/snmp/snmpd.conf: line N:
Warning: Unknown token)
sk92937 (SNMPv3 with USM 'authentication' configuration does not survive reboot
on Gaia OS)
ClusterXL)
sk38936 (How to debug dropped SNMP V1 & V2 packets)
sk56783 (How to debug SNMPD daemon on SecurePlatform and Gaia)
sk66586 (How to debug SNMPMONITOR on SecurePlatform and Gaia)
sk66383 (How to debug CPSNMPAGENTX on SecurePlatform and Gaia)
sk66384 (How to debug CPSNMPD on SecurePlatform and Gaia)
Policy Installation
Policy installation on cluster triggers re-configuration of each cluster member. Part of
this re-configuration is negotiation of the state of each member.
The policy installation process is transparent for the traffic. Policy installation, in certain
cases, may cause a cluster member to initiate a failover.
Cluster administrator can control the installation of policy on cluster with the help of
several kernel parameters (each parameter is described below):
fwha_freeze_state_machine_timeout
fwha_policy_update_timeout_factor
fwha_conf_immediate
fwha_cul_policy_freeze_timeout_millisec
fwha_cul_policy_freeze_event_timeout_millisec

Refer to sk92723 (Cluster flapping prevention).

How to work with kernel parameters:
The current value of a kernel parameter can be checked with 'fw ctl get int
fwha_conf_immediate
' command.
The value of a kernel parameter can be set:
o either on-the-fly with 'fw ctl set int fwha_conf_immediateVALUE'
command (this change does not survive reboot)
o or by adding a parameter with desired value into the
$FWDIR/boot/modules/fwkern.conf file (per sk26202)
Let us review each kernel parameter:
'fwha_freeze_state_machine_timeout' kernel parameter

Explanation:
o This kernel parameter is related to what Check Point calls the "state machine",
which is responsible for determining the state of each machine in the cluster, i.e.,
whether the machine is Active/Standby/Down. When the state of the machine is
changed, failover takes place. During policy installation, there are cases, in
which, the state is changed, and consequently an unwanted failover may occur.
o This parameter sets the number of seconds, during which the state of each
cluster member will be "frozen" starting from the moment the policy installation
starts on the member, and until the count-down reaches zero.
Values:
o This parameter sets a timeout value, and its units are seconds.
o In versions prior to R75.40VS, the default value is 0 seconds ("freeze"
mechanism is disabled).
o Starting in R75.40VS, the default value is set to 30 seconds.
o Upper limit is 232-1.
Notes:
o This kernel parameter is known as "freeze" mechanism.
o When the value of this kernel parameter is set to some value, and Cluster
Member priorities were changed, then during policy installation, the cluster
configuration on members will not be updated correctly even though output of
'cphaprob state' command shows that the Member IDs and their state have
changed. Refer to sk66064 (Change of Cluster Member priority when the kernel
parameter 'fwha_freeze_state_machine_timeout' is enabled may cause network
outage).
o On VSX cluster members, the "freeze" mechanism applies only to cluster
member itself (Virtual System 0). It does not apply to any other Virtual Systems.
o In R75.40VS / R76 and above in VSX mode, Virtual System 0 will
monitor/perform state change lock even when other Virtual Systems get the
policy.
o When the value of this kernel parameter is set to some value, the following
messages will appear in /var/log/messages file during policy installation:
;FW-1: fwha_state_freeze: FREEZING state machine at CURRENT_STATE
(time=HTU,caller=fwha_set_conf);
;FW-1: fwha_state_freeze: ENABLING state machine at CURRENT_STATE
(time=HTU,caller=policy change - finished changes (fwha_start));

The following messages in /var/log/messages file are normal during the boot of
the machine:
;FW-1: fwha_state_freeze: FREEZING state machine at FAILURE
(time=HTU,caller=fwha_set_conf);
;FW-1: fwha_state_freeze: ENABLING state machine at FAILURE
(time=HTU,caller=policy change - finished changes (fwha_start));

o sk32488 (When to use 'fwha_freeze_state_machine_timeout' parameter)
o sk25971 (Failover occurs in the cluster during Security Policy installation)
o sk66064 (Change of Cluster Member priority when the kernel parameter
'fwha_freeze_state_machine_timeout' is enabled may cause network outage)
o sk66881 (On OPSec cluster, output of 'cphaprob state' does not show the local
member, only peer members)
'fwha_policy_update_timeout_factor' kernel parameter

Explanation:
o When policy is installed on a cluster, the cluster members undertake a
negotiation process to make sure all of them have received the same policy
before they actually apply it. This negotiation process has a timeout mechanism,
which makes sure a cluster member does not wait indefinitely for responses from
other cluster members, which is useful in cases when another cluster member
goes down when policy is being installed (for example).
o In configurations, in which policy installation takes a long time (usually caused by
a policy with a large number of rules), a cluster with more than two machines,
and slow machines, this timeout mechanism may expire prematurely.
o It is possible to tune the timeout by setting the desired value for this kernel
parameter.
Formula:
Policy change timeout for members to synchronize policy installation state before
proceeding:
[20 x (number of members in cluster) x
fwha_policy_update_timeout_factor]
Values:
o This kernel parameter is a multiplier, therefore it has no units.
o The default value is 1.
o Do not set this parameter to a value larger than 3.
Notes:
o The default value of 1 should be sufficient for most configurations.
o For configurations, where the situation described above occurs, setting this
parameter to 2 should be sufficient.
o Do not set this parameter to a value larger than 3.
o sk63163 (Failover does not occur in ClusterXL HA Primary Up mode after
changing cluster member priorities and installing the policy)
o sk92723 (Cluster flapping prevention)

'fwha_conf_immediate' kernel parameter

Policy can be updated without negotiation in the following cases:
o ClusterXL is not running on this machine
o ClusterXL should be stopped on this machine
o The only action to take is to add or remove unused interfaces
o The policy is or was missing (Policy ID = 0)
o Policy is installed on a single cluster member
Explanation:
o When the value is set to 0, the cluster member will not change its state to the
next required state until it negotiates with other cluster members.
o When the value is set to 1, the cluster members skip policy installation
negotiation and install new cluster configuration immediately.
Values:
o This kernel parameter is an on-off switch, therefore it has no units.
o Accepted values are 0 and 1.
o sk63163 (Failover does not occur in ClusterXL HA Primary Up mode after
changing cluster member priorities and installing the policy)
Refer to Installation and Upgrade Guide (R60, R61, R62, R65, R70, R71, R75,
R75.20, R75.40).
'fwha_cul_policy_freeze_timeout_millisec' kernel parameter

Explanation:
This parameter controls the time, during which a member should wait for an event
(e.g., pnote problem, CCP of Active member are not received) to occur during policy
installation (starting from the local policy installation).
Values:
o This parameter sets a timeout value, and its units are milliseconds.
o Recommended value is 30000.

'fwha_cul_policy_freeze_event_timeout_millisec' kernel parameter

Explanation:
This parameter controls the time, during which a member should freeze its state
upon event (e.g., pnote problem, CCP of Active member are not received) during
policy installation.
Values:
o This parameter sets a timeout value, and its units are milliseconds.
o Recommended value is 15000.
Notes:
o If the Active member fails during policy installation, a network outage might occur
of maximal duration depending on the value assigned to this kernel parameter.
Full Sync
sk37029 (Forcing Full Synchronization in ClusterXL)
sk37030 (Debugging Full Synchronization in ClusterXL)
sk65103 (After reboot, state of cluster member is 'Down', and state of
'Synchronization' device is 'problem')
sk101695 (Cluster member is Down after reboot / policy installation / running
'cpstart')
Cluster member may fail to start correctly while the cluster is under severe load.
If a reboot (or 'cpstop' followed by 'cpstart') is performed on a cluster member while
the cluster is under severe load, the member may fail to start correctly.
The starting member will attempt to perform a Full Sync with the existing active
member(s) and may in the process use up all its resources and available memory.
This can lead to unexpected behaviour.
Procedure:
To overcome this problem, define the maximum amount of memory that the member
may use when starting up for synchronizing its connections with the active member. By
default, this amount is not limited. Estimate the amount of memory required as follows:

Memory required (MB) for Full Sync:

New connections per second
1000
5000
10 000
Number of open
100
Connections
1000
1.1 MB
6.9 MB
10 000
11 MB
69 MB
329 MB
20 000
21 MB
138 MB
657 MB
1305 MB
50 000
53 MB
345 MB
1642 MB
3264 MB
Note: These figures were derived for cluster members using the Windows platform,
with Pentium 4 processors running at 2.4 GHz.
Example:
If the cluster holds 10 000 connections, and the connection rate is 1000 connections per
second, then cluster administrator will need 69 MB for Full Sync.
Instructions:
Define the maximal limit for memory allocation to Full Sync by setting the value of the
global kernel parameter fw_sync_max_saved_buf_mem to the required number of
megabytes. Refer to sk26202 (Changing the kernel global parameters for Check Point
Security Gateway).
Impact:
If memory allocation reaches this limit during Full Sync, then further allocations are
forbidden, and relevant messages are printed into /var/log/messages file:
FW-1: fwlddist_save: WARNING: this member will not be fully synchronized !
FW-1: fwlddist_save: current delta sync memory during full sync has reached the
maximim of N MB
FW-1: fwlddist_save: it is possible to set a different limit by changing
fw_sync_max_saved_buf_mem value
Delta Sync
sk92909 (How to debug ClusterXL to understand why a connection is not
synchronized)
sk41827 (Synchronization network in the cluster is flooded with Sync Retransmit
packets)
Processing of Delta Sync packets during Full Sync
While performing Full Sync, the Delta Sync updates are not processed and saved.
Cluster member may fail to complete Full Sync while the cluster is under severe load.
It is possible that the rate of Delta Sync updates during the Full Sync process exceeds
the rate of the Full Sync packets. The FWD daemon on the Full Sync client member will not

be able to handle this number of Delta Sync packets because of the starvation of the user
space daemon, and Full Sync will never end.
Meanwhile, the Delta Sync packets are stored and occupy an ever-increasing amount of
memory in the kernel until memory allocation fails.
Procedure:
To overcome this problem, define the maximal limit for memory allocated to save the
Delta Sync packets during Full Sync. By default, this amount is not limited.
Instructions:
Define the maximal limit for memory allocated to save the Delta Sync packets during
Full Sync by setting the value of the global kernel parameter
fw_sync_max_saved_buf_mem to the required per cent of the memory allocated by
Check Point kernel (controlled by be kernel parameter fw_salloc_total_alloc) from
the overall allowed memory (controlled by be kernel parameter
fw_salloc_total_alloc_limit).
Impact:
After a certain amount of Delta Sync packets is received, no more Delta Sync packets
are accepted, so additional sync updates received during Full Sync are discarded, and
relevant messages are printed into /var/log/messages file:
FW-1: fwlddist_save: WARNING: this member will not be fully synchronized !
FW-1: fwlddist_save: reached the memory threshold.
FW-1: fwlddist_save: Current = X MB, allowed = Y MB, threshold = N%
A consequence of this is that connections that were not transferred during full sync will
not survive failover.
After Full Sync is complete, the Delta Sync packets stored during the Full Sync phase
are applied by order of arrival.
Delta Sync Buffer Threshold

Whenever an operation is performed on a kernel table, which is marked as "sync"-ed (in
$FWDIR/conf/table.def file on Security Management Server), the Delta Sync
mechanism duplicates this action into a buffer of its own.
Once this Delta Sync buffer is full, and every Sync timer interval, the Delta Sync buffer
is sent to all cluster members over the Synchronization Network. The receiving member will
duplicate those actions into its kernel tables.
State Synchronization mechanism creates Delta Sync packets for incoming
connections. These Delta Sync packets are placed in the Sync Sending Queue.
Obviously, this queue has a limited size, which might create a bottleneck - member
would not be able to place the new Delta Sync packets in the Sync Sending Queue - there
might not be enough space in this queue. Either because the number of incoming
connection is too high, or because the Delta Sync packets are not sent in timely manner
(due to CPU load, some problems on the Sync interfaces / Sync network).

In order to deal with such potential bottleneck, ClusterXL monitors the Sync Sending
Queue - if the number of Delta Sync packets in this queue reaches the threshold, then:
1. The 'FW-1: State synchronization is in risk. Please examine
your synchronization network to avoid further problems' warning is
printed into /var/log/messages file
2. Member starts blocking new incoming connections
This threshold is controlled via kernel parameter fw_sync_buffer_threshold,
whose value is the maximal percentage of the buffer that may be filled before new
connections are blocked:
By default, this value it is set to 80, with a buffer size of 512 sync words.
By default, if more than 410 consecutive packets are sent without getting an
Acknowledgement on any one of them, new connections are blocked.
sk43896 (Blocking New Connections Under Load in ClusterXL)
sk82080 (/var/log/messages are filled with 'kernel: FW-1:
fwldbcast_update_block_new_conns: sync in risk: did not receive ack for the last
410 packets')
sk23695 ('FW-1: State synchronization is in risk. Please examine your
synchronization network to avoid further problems!' appears in /var/log/messages
file)
Traffic
sk80520 (ClusterXL drops traffic with 'dropped by fwha_forw_run Reason: Failed to
send to another cluster member')
sk106425 (Connections through cluster to physical IP address of ClusterXL Standby
member / VRRP Backup member are dropped by Anti-Spoofing)
sk34668 (How to modify the assigned load between the members of ClusterXL in
Load Sharing Unicast mode)
ClusterXL)
sk34180 (Outgoing connections from cluster members are sent with cluster Virtual
IP address instead of member's Physical IP address)
sk42384 (Outgoing connections from cluster members are sent with member's
Physical IP address instead of cluster Virtual IP address)
sk37411 (Forwarding mechanism does not work properly on a machine with more
than 60 interfaces in a Nokia IP cluster)
sk31821 (Traffic that is sent to Secondary IP addresses / Alias IP addresses that
were defined on interfaces of ClusterXL members is not processed)
sk44084 (Kernel debug on ClusterXL Pivot member shows - FW-1:
fwha_pivot_forward_packet: can not forward since fwha_ether_addrs[dst=X][ifn=Y]
is NULL)

sk30154 ($FWDIR/log/fwd.elg shows repeatedly - 'fwarp_initialize_myself: unable to

find mac address of interface IF_NAME')
sk35175 (Security Gateway does not apply 'keep_DF_flag' parameter when
sk90740 (Latency after upgrade from NGX R65 to R75.x when SecureXL is enabled)
sk44177 (SmartView Tracker repeatedly shows drops with "Source and destination
addresses are equal")
sk25977 (Connecting multiple clusters to the same network segment (same VLAN,
same switch)
sk26874 (Cannot simultaneously ping Virtual IP address of the cluster and IP
addresses of physical interfaces on cluster members from a remote host)
sk43321 (Cluster Forwarding is enabled in 3rd party cluster after each policy
installation)
Flapping
Refer to ClusterXL definitions and terms section and to Cluster Control Protocol (CCP)
section.
If CCP packets are not received/sent within the expected timeouts, then eventually
either the problematic interface(s), or the whole member will be declared as failed. This in
turn (by design) will lead to the change in state of either the problematic interface(s), or the
whole member to 'Down'.
Depending on the configuration and the nature of the issue, the state might randomly
change between 'Up'/'Active' and 'Down'. Such random change in state is called "flapping"
(of either an interface, or a member).
Flapping, in its turn might cause an interruption in the production traffic that passes
through the cluster.
Cluster Under Load (CUL) mechanism (R75.40VS, R76, R77 and above) involves a
number of kernel parameters that allow cluster members to automatically monitor the CPU
utilization and prevent flapping according to the values of these kernel parameters - as
described in sk92723 (Cluster flapping prevention):
fwha_cul_mechanism_enable
fwha_cul_member_cpu_load_limit
fwha_cul_member_long_timeout
fwha_cul_cluster_short_timeout
fwha_cul_cluster_log_delay_millisec
fwha_cul_policy_freeze_timeout_millisec
fwha_cul_policy_freeze_event_timeout_millisec
switches)
sk93454 (Increasing ClusterXL dead timeout)
sk97827 (How to change ClusterXL Interface Monitoring Timeouts)


sk92787 (How to debug ClusterXL failovers caused by RouteD daemon on Gaia OS)
sk90283 (ClusterXL fails over for no apparent reason when IPS, DLP, Application
Control or Anti-Malware Blade is enabled)
sk65103 (After reboot, state of cluster member is 'Down', and state of
'Synchronization' device is 'problem')
sk101695 (Cluster member is Down after reboot / policy installation / running
'cpstart')
sk25971 (Failover occurs in the cluster during Security Policy installation)
sk41089 (How to troubleshoot and isolate the cause of VRRP transitions)
sk65502 (Crossbeam cluster - after each reboot, the member is Down and Sync is
Off)
sk44101 (HA process does not start when configuring more than 63 disconnected
interfaces)
'fw ctl pstat' command

Description:
Prints internal Security Gateway Statistics.
Output is divided into several sections.
Cluster-related section is called "Sync".
It is always located at the bottom of the output.
Syntax:
[Expert@Member]# fw ctl pstat
R75.40VS, R76, R77) - Chapter 'Monitoring and Troubleshooting Gateway Clusters' Monitoring Synchronization (fw ctl pstat).
sk34476 (ClusterXL Sync Statistics - output of 'fw ctl pstat' command).
Example:
Sync:
Version: new
Status: Able to Send/Receive sync packets
Sync packets sent:
total : 466729198, retransmitted : 1305, retrans reqs : 89, acks : 809
Sync packets received:
total : 77283541, were queued : 6715, dropped by net : 6079
retrans reqs : 37462, received 175 acks
retrans reqs for illegal seq : 0
dropped updates as a result of sync overload: 0
Delta Sync memory usage: currently using XX KB mem
Callback statistics: handled 138 cb, average delay : 2, max delay : 34
Number of Pending packets currently held: 1
Packets released due to timeout: 18

Explanations:
Output section
Sync: off
Sync:
Live connections update: on
Sync:
Version: old
Sync:
Version: new
Sync:
Version: new
Status: Able to Send/Receive sync
packets
Sync:
Version: new
Status:
Able to send sync packets
Unable to receive sync packets
Sync:
Version: new
Status:
Unable to send sync packets
Unable to receive sync packets
Sync:
Version: new
Status:
Saving incoming sync packets
Sync:
Version: new
Status:
Saving incoming sync packets
Sync:
Version: new
Status:
Able to receive sync packets
Sync:
Version: new
Status:
Able to receive sync packets
Sync packets sent:
total : 466729198, retransmitted :
1305, retrans reqs : 89, acks : 809
Explanation
Delta Sync is disabled: either Full Sync
failed, or Delta Sync was disabled by cluster
administrator.
'Active Mode' tab is opened in SmartView
Tracker. Refer to sk30908.
Check Point FW-1 v4.1 and lower.
Check Point FW-1 NG and above.
Delta Sync works correctly.
The problem is described in the output itself

(requires cluster debugging).


TOTAL number of sync packets is nonzero and increasing.

RETRANS REQS may increase under
load.

Sync packets received:

total : 77283541, were queued
dropped by net : 6079
retrans reqs : 37462, received
acks
retrans reqs for illegal seq :
dropped updates as a result of
overload: 0
: 6715,
175
0
sync
Delta Sync memory usage: currently

using XX KB mem
Callback statistics: handled 138 cb,

average delay : 2, max delay : 34
Number of Pending packets currently

held: 1
Packets released due to timeout: 18
TOTAL
number of sync packets is nonzero and increasing.

QUEUED value never decreases - a nonzero value does not indicate a problem.
DROPPED BY NET number may indicate
network congestion - this counter is
incremented when the cluster member
receives a sync packet with a sequence
number, which is higher, than the expected
sequence number; meaning, packets with
lower sequence numbers where lost
somewhere along the way, and we should
find out where.
RETRANS REQS growing very fast may
indicate that the load is becoming too high.
RETRANS REQS FOR ILLEGAL SEQ may
indicate a sync problem.
DROPPED UPDATES AS A RESULT OF
SYNC OVERLOAD - in a heavily loaded
system, the cluster member may drop
synchronization updates sent from peer
cluster members.
This statistic only appears for a non-zero

value. It requires memory only while Full
Sync is occurring. At other times, Delta Sync
requires no memory.
value.
AVERAGE DELAY should be ~1-5 packets,
otherwise indicates an overload of sync
traffic.
value.
value. If the number is large (more than 100
pending packets), and the "Number of
Pending packets currently held" is small in
the output of 'cphaprob syncstat'
command, then you should take action to
reduce the number of pending packets.
To tackle this problem, see "Reducing the
Number of Pending Packets" in ClusterXL
Administration Guide.

'cphaprob' command
Description:
Use the 'cphaprob' command to verify that the cluster and the cluster members are
working properly, and to define critical devices.
Syntax:
[Expert@Member]# cphaprob [flags]
Note: The commands below are listed in the order to their importance / relevance.
cphaprob state
Description:
Prints the summary with the following information:
o Cluster Mode
o Member ID of each known member
o Assigned traffic load for each known member
o State of each known member
Syntax:
[Expert@Member]# cphaprob state
Example:
[Expert@FW2-Member:0]# cphaprob state
Cluster Mode:
High Availability (Active Up) with IGMP Membership
Number
Unique Address
Assigned Load
State
1
2 (local)
10.10.10.31
10.10.10.32
0%
100%
Standby
Active
[Expert@FW2-Member:0]#
Output in 3rd party and OPSec clusters:

o The machine state is only Check Point status and is not really a machine status.
o The command only monitors Full Sync success, and if a policy was successfully
installed. For IPSO IP Clustering, the state is accurate and also includes the
status of the IPSO Cluster.
o For IPSO VRRP, the status is accurate for a Firewall, but it does not correctly
reflect the status of the IPSO machine (e.g., it does not detect interface failure).
o For Gaia VRRP, the status is accurate for a Firewall, and it does not reflect the
status of the Gaia machine (e.g., it does not detect interface failure).


o sk61546 (The IP addresses of ClusterXL members in the output of 'cphaprob
state' command differ from the IP addresses of Sync interfaces)
o sk66881 (On OPSec cluster, output of 'cphaprob state' does not show the local
member, only peer members)
o sk93037 (Output of 'cphaprob state' command on Crossbeam chassis cluster
shows only local member)
o sk36247 (Cluster member detects only itself in 'cphaprob state' output, or detects
other member as "ClusterXL inactive or machine is down")
o sk30154 ($FWDIR/log/fwd.elg shows repeatedly - 'fwarp_initialize_myself: unable
to find mac address of interface IF_NAME')
o sk61331 (ClusterXL Load Sharing in Unicast (Pivot) mode - after second reboot
of Pivot member, output of 'cphaprob stat' on non-Pivot member shows wrong
assigned load as 0% for Pivot and 100% for non-Pivot)
o sk42096 (Cluster member is stuck in 'Ready' state)
cphaprob [-l][-ia][-e] list

Description:
Prints the summary of Critical Devices (Pnotes) with the following information:
o Device name
o Device timeout (how frequently the periodic reports are expected)
o Device state
o Time since last periodic report
Syntax:
[Expert@Member]# cphaprob [-l][-ia][-e] list
Commands:
cphaprob list
In R77.30 and above
When there are no issues on the cluster
member:
In R77.20 and lower
There are no pnotes in problem

state
* Issue 'cphaprob -l list' to
show full list of pnotes
When a critical device reports a problem prints only the critical device that reports
its state as "problem"
Example:
Registered Devices:
Prints the list of some of the "Built-in

Devices" and the "Registered Devices"
Device Name: Interface Active
Check
Device
Device
Device
Device
Device
Device
Name:
Name:
Name:
Name:
Name:
Name:
Synchronization
Filter
fwd
cphad
cvpnd
routed
Device Name: routed

Registration number: 2
Timeout: none
Current state: problem
Time since last report: 2.8 sec

cphaprob -l list
In R77.30 and above
Prints the list of all the "Built-in Devices"
and the "Registered Devices" - exactly as
"cphaprob -ia list" does in R77.20 and
lower
Device Name: Problem
Notification
Check
Device Name: Load Balancing
Configuration
Device
Device
Device
Device
Device
Device
Name:
Name:
Name:
Name:
Name:
Name:
In R77.20 and lower
Command does not exist in these versions
Synchronization
Filter
routed
cphad
fwd
cvpnd
cphaprob -i list
In R77.30 and above
member:
state
Example:
Registered Devices:
Device Name: routed
Timeout: none
In R77.20 and lower
Prints the list of some of the "Built-in

Devices" and the "Registered Devices"
Check
Device
Device
Device
Device
Device
Device
Name:
Name:
Name:
Name:
Name:
Name:
Synchronization
Filter
fwd
cphad
cvpnd
routed

cphaprob -ia list

In R77.30 and above
member:
In R77.20 and lower

state
When a critical device reports a problem prints the device "Problem
Notification" and the critical device
that reports its state as "problem"
Example:
Built-in Devices:
Prints the list of all the "Built-in Devices"

and the "Registered Devices"
Notification
Check
Device Name: Load Balancing
Configuration
Device
Device
Device
Device
Device
Device

Notification
Registered Devices:
Name:
Name:
Name:
Name:
Name:
Name:
Synchronization
Filter
routed
cphad
fwd
cvpnd
Device Name: routed

Timeout: none
cphaprob -e list
In R77.30 and above
member:
In R77.20 and lower

state
Example:
Prints the list of "Registered Devices" only

Device
Device
Device
Device
Device
Device
Name:
Name:
Name:
Name:
Name:
Name:
Synchronization
Filter
fwd
cphad
cvpnd
routed
Registered Devices:
Device Name: routed
Timeout: none

Example from R77.20:

[Expert@FW2-Member:0]# cphaprob -ia list
Built-in Devices:
Device Name: Problem Notification
Current state: OK
Device Name: Interface Active Check
Current state: OK
Current state: OK
Device Name: Load Balancing Configuration
Current state: OK
Current state: OK
Registered Devices:
Device Name: Synchronization
Timeout: none
Current state: OK
Device Name: Filter
Timeout: none
Current state: OK
Time since last report: 2302 sec
Device Name: fwd
Timeout: none
Current state: OK
Device Name: cphad
Timeout: none
Current state: OK
Device Name: routed
Timeout: none
Current state: OK
Device Name: ted
Timeout: 600 sec
Current state: OK


o sk36372 (Output of 'cphaprob -ia list' on ClusterXL shows a Critical Device called
'HA Initialization')
'Load Balancing Configuration')
'Recovery Delay')
cphaprob [-a] if
Description:
Prints the summary of cluster interfaces with the following information:
o Number of required cluster interfaces - including the Sync interfaces (the
maximal number of good cluster interfaces seen since the last reboot)
o Number of required secured (trusted) interfaces (the maximal number of good
sync interfaces seen since the last reboot)
o Names of monitored cluster interfaces (refer to CCP and VLAN interfaces)
o State of cluster interfaces (based on arrival/transmission of CCP packets)
o CCP mode on cluster interfaces
o Number of cluster Virtual IP addresses
o Virtual IP addresses
o Virtual MAC addresses (if VMAC mode is enabled per sk50840)
Syntax:
[Expert@Member]# cphaprob [-a] if
Flag
-a
Description
Prints Virtual IP addresses and
their corresponding interfaces.
Example:
[Expert@FW2-Member:0]# cphaprob -a if
eth0
eth1
eth2
UP
UP
UP

Virtual cluster interfaces: 2

eth0
eth2
192.168.204.33
20.20.20.33
VMAC address: 00:1C:7F:00:00:1F


Output in 3rd party / OPSec clusters:

o Shows only the relevant information - interface name, if it is a trusted (secured)
interface or not.
o "Multicast"/"Broadcast" refers to the CCP and is relevant only for the trusted
(secured) interface.
o Status of the interface is not printed since it is not monitored.
cphaprob [-reset] syncstat

Description:
Prints internal cluster statistics about the operation of the State Synchronization.
Can be used on ClusterXL and 3rd party / OPSec clusters.
Syntax:
[Expert@Member]# cphaprob [-reset] syncstat
Flag
-reset
Description
Resets the statistics in kernel that was
collected since boot, or last reset.
R75.40VS, R76, R77) - Chapter 'Monitoring and Troubleshooting Gateway Clusters'
- Troubleshooting Synchronization.
o sk34475 (ClusterXL Sync Statistics - output of 'cphaprob syncstat' command)
o sk82080 (/var/log/messages are filled with 'kernel: FW-1:
410 packets')
Example:
Sync Statistics (IDs of F&A Peers - 1 2 3 4 5 6 7 ):
Other Member Updates:
Sent retransmission requests...................
Avg missing updates per request................
Old or too-new arriving updates................
Unsynced missing updates.......................
Lost sync connection (num of events)...........
Timed out sync connection .....................
165
1
5661
0
4354
1
Local Updates:
Total generated updates .......................
Recv Retransmission requests...................
Recv Duplicate Retrans request.................
9180670
1073
2564
Blocking Events................................
Blocked packets................................
Max length of sending queue....................
Avg length of sending queue....................
Hold Pkts events...............................
Unhold Pkt events..............................
Not held due to no members.....................
Max held duration (sync ticks).................
0
0
4598
0
1
1
16
0

Avg held duration (sync ticks).................
11
Timers:
Sync tick (ms).................................
CPHA tick (ms).................................
100
100
Queues:
Sending queue size.............................
Receiving queue size...........................
512
256
Output section
IDs of
F&A
Peers
Other
Member
Updates:
Sent
retransmission
requests
Avg missing
updates per
request
Old or too-
new
arriving
updates
Unsynced
missing
updates
Lost sync
connection
(num of
events)
Explanation
The F&A (Flush and Ack) peers are
the cluster members that this
member recognizes as being part
of the cluster.
The IDs correspond to IDs and
IP addresses shown by the
'cphaprob state' command.
The statistics in this section relate
to Delta Sync updates generated
by other cluster members, or to
Delta Sync updates that were not
received from the other members.
Updates inform about changes in
the connections handled by the
cluster member, and are sent from
and to members. Updates are
identified by sequence numbers.
The number of
retransmission requests, which
were sent by this member.
Retransmission requests are sent
when certain packets (with
a specified sequence number)
are missing, while the sending
member already received updates
with advanced sequences.
Each retransmission request can
contain up to 32 missing
consecutive sequences.
The value of this field is the
average number of requested
sequences per
retransmission request.
The number of arriving Delta Sync
updates where the sequence
number is too low, which implies it
belongs to an old transmission, or
too high, to the extent that it cannot
belong to a new transmission.
The number of missing Delta Sync
updates, for which the receiving
member stopped waiting. It stops
waiting when the difference in
sequence numbers between the
newly arriving updates and the
missing updates is larger than the
length of the "Receiving
Queue".
The number of events, in
which synchronization with another
member was lost and regained due
to either Security Policy installation
Limits
Has to be less than 30% of "Total

generated updates" ON
OTHER MEMBERS.
More than 20 can

imply connectivity problems.
Has to be less than 10% of "Total

generated updates" ON THIS
MEMBER.
Should be 0 - less than 1% of

"Total generated
updates" is
acceptable.
Lost sync connection (number

of events)

Timed out
sync
connection
Local
Updates:
Total
generated
updates
Recv
Retransmission
requests
Recv
Duplicate
Retrans
request
Blocking Events
Blocked
packets
Max length
of sending
queue
on the other member, or a large

difference between the expected
and received sequence number.
The number of events, in which
the member declares another
member as not connected. The
member is considered
as disconnected because no CCP
packets with ACK were received
from that member for a period of
time (1 second), even though there
are Flush and Ack packets being
held for that member.
The statistics in this section relate
to Delta Sync updates generated
by the local cluster member.
Updates inform about changes in
the connections handled by the
cluster member, and are sent from
and to members. Updates are
identified by sequence numbers.
The number of Delta Sync
updates generated by the Sync
mechanism since the statistics
were last reset (with 'cphaprob reset syncstat' command). Its
value is the same as the difference

between the sequence number
when running the 'cphaprob reset syncstat' command,
and the current sequence number.
The number of received
retransmission requests. A member
requests retransmissions when it is
missing specified packets with
lower sequence numbers than the
ones already received.
The number of duplicated
retransmission requests received
by the member. Duplicate requests
were already handled, and so are
dropped.
Under extremely heavy load
conditions, the cluster member may
block new connections (refer
to sk43896). This counter shows
the number of times that the cluster
member started blocking new
connections due to Sync overload.
The number of packets that were
blocked because the cluster
member was blocking all new
connections (see 'Blocking
Events' above). The number of
blocked packets is usually one
packet per new connection attempt.
The size of the Sending Queue is
fixed and by default, it is 512 sync
words. This size is controlled via
kernel parameter
fw_sync_sending_queue_size.
Should be 0 - positive value

indicates connectivity problems.
Can have any value.
Should be less than 30%

of "Total generated
updates" ON THIS MEMBER.
Should be less than 30%

of "Total generated
updates" ON THIS MEMBER.
If "Block New Connections"
mechanism is enabled (per
sk43896), then positive value
indicates heavy load.
Higher than 5% of "Avg length

of sending queue" can
imply connectivity problems.
If "Block New Connections"

mechanism is enabled (per
sk43896), then should be less than
"Sending queue size".

Avg length of
sending
queue
Hold
Pkts
events
Unhold
Pkt
events
Not held due

to no
members
Max held
duration
(sync
ticks)
Avg held
duration
(sync
ticks)
Timers:
The average value of the 'Max

length of sending queue',
since last reboot or since the Sync
statistics were reset.
The number of event, where the
Delta Sync update required Flush
and Ack, and so was kept within
the system until an ACK
arrived from all the other
functioning members
The number of events, when the
member received all the required
ACKs from the other functioning
members.
The number of packets, which
should have been held within the
system, but were released because
there were no other operating
members.
The maximal time in cluster ticks
(1 tick equals 100ms), for which a
held packet was delayed in the
system for Flush and Ack
purposes.
The average duration in cluster
ticks (1 tick equals 100ms), for
which the held packets were
delayed within the system for Flush
and Ack purposes.
The values in this section relate to
internal timers that control Sync
and cluster related actions.
Sync tick
(ms)
Timer interval for Delta Sync

operations.
CPHA tick
(ms)
Timer interval for cluster operations

(excluding Delta Sync).
The values in this section relate to
the sizes of Delta Sync Queues.
The Sending Queue on the cluster
member stores locally generated
Delta Sync updates. Updates in the
Sending Queue are replaced by
more recent updates. In a highly
loaded cluster, updates are
therefore kept for less time. If a
member is asked to retransmit an
update, it can only do so if the
update is still in its Sending Queue.
Queues:
Sending
queue size
Each member has one

sending queue.
If "Block New Connections" is

enabled (per sk43896), then
should be less than 80% of
"Sending Queue size".
Should be the same as "Unhold
Pkt events".
Should be the same as "Hold

Pkt events".
Should be 0 - positive
value indicates connectivity
problem between the members.
Should be less than 50 positive value indicates

connectivity problem between the
members.
Should be about the RoundTrip Time (RTT) of the Sync
network. A larger value
indicates connectivity problem.
The value is controlled via kernel
parameter
fwha_timer_sync_res
per sk41471.
Default value is 100 ms (minimal
possible value).
parameter
fwha_timer_cpha_res
per sk43872.
Default value is 100 ms (minimal
possible value).

parameter
fw_sync_sending_queue_size
per sk82080.
Default value is 512 sync words
(minimal possible value).

parameter
fw_sync_recv_queue_size per
sk82080.
Default value is 256 sync words
(minimal possible value).

cphaprob -d <device> -t <timeout> -s <ok|init|problem> [-p] [-g] register

Description:
Registers a Critical Device (Pnote) with specified parameters.
Syntax:
[Expert@Member]# cphaprob -d <device> -t <timeout_in_sec> -s
<ok|init|problem> [-p] [-g] register
Flags
-d device
-t timeout_in_sec
-s <ok|init|problem>
-p
-g
Description
Specifies the name of the Pnote (refer to
ClusterXL definitions and terms section).
Specifies how frequently the periodic reports are
expected.
If no periodic reports should be expected, then
enter 0 (zero).
Specifies the initial state with which
the Pnote will be registered.
(Optional) Specifies that this Pnote
must be registered permanently (this
configuration will be saved in the
$FWDIR/conf/cphaprob.conf file).
must be registered globally (applies to
R75.40VS and above in VSX mode).

o sk43172 (Cluster performs fail-overs - detected a problem (cphad) / (fwd))
o sk32712 (FWD response time increase (cphaprob -d fwd -t seconds register -p)
does not survive reboot)
o sk92878 (User Space process monitoring mechanism in R76 ClusterXL)
cphaprob -d <device> [-p] [-g] unregister

Description:
Registers a Critical Device (Pnote) with specified parameters.
Syntax:
[Expert@Member]# cphaprob -d <device> [-p] [-g] unregister
Flags
-d device
-p
-g
Description
must be unregistered permanently (this
configuration will be removed from the
$FWDIR/conf/cphaprob.conf file).
must be unregistered globally (applies to


cphaprob -d <device> -s <ok|init|problem> [-g] report

Description:
Reports a specified state for Critical Device (Pnote).
Syntax:
[Expert@Member]# cphaprob -d <device> -s <ok|init|problem> [-g] report
Flags
-d device
-s <ok|init|problem>
-g
Description
Specifies the state, which
will be reported for the Pnote .
state must be reported globally
(applies to R75.40VS and above in VSX mode).

o sk92868 (Cannot change the state of the pnote 'cphad' to 'problem' with
'cphaprob -d cphad -s problem report' command)
cphaprob -f <file> [-g] register

Description:
Registers Critical Devices (Pnotes) with specified parameters from a file.
Syntax:
[Expert@Member]# cphaprob -f <file> [-g] register
Flags
-f file
-g
Description
Specifies the file that contains the list of Pnotes
and their parameters.
For file syntax, refer to the
$FWDIR/conf/cphaprob.conf file.
must be registered globally (applies to


cphaprob -a [-g] unregister

Description:
Unregisters all Critical Devices (Pnotes).
Syntax:
[Expert@Member]# cphaprob -a [-g] unregister
Flags
-a
-g
Description
Specifies that all Pnotes must be unregistered.
(Optional) Specifies that all Pnotes must be
unregistered globally (applies to

cphaprob igmp
Description:
Prints IGMP membership status.
Syntax:
[Expert@Member]# cphaprob igmp
Example:
[Expert@FW2-Member:0]# cphaprob igmp
IGMP Membership: Enabled
Supported Version: 2
Report Interval [sec]: 60
IGMP queries are replied only by Operating System
Interface
Host Group
Multicast Address Last ver. Last Query[sec]
--------------------------------------------------------------------------eth0
224.168.204.33 01:00:5e:28:cc:21
N/A
N/A
eth1
224.10.10.250
01:00:5e:0a:0a:fa
N/A
N/A
eth2
224.20.20.33
01:00:5e:14:14:21
N/A
N/A


sk22495 (Interface flapping (down/up) in a ClusterXL environment)
cphaprob [-reset] ldstat

Description:
Prints the serialization statistics about the operations performed in kernel tables
based on Delta Sync - creating a new connection, updating an existing connection,
deleting an existing connection, etc.
Can be used on ClusterXL and 3rd party / OPSec clusters.
Syntax:
[Expert@Member]# cphaprob [-reset] ldstat
Flag
-reset
Description
Resets the statistics in kernel that was
collected since boot, or last reset.
Example:
[Expert@FW2-Member:0]# cphaprob ldstat
Operand
Calls
Bytes
Average Ratio %
------------------------------------------------------ERROR
0 0
0
0
SET
5287
1359896 257
27
RENAME
0 0
0
0
REFRESH
41105
2137460 52
42
DELETE
5276
189792 35
3
SLINK
10496
671744 64
13
UNLINK
0 0
0
0
MODIFYFIELDS
8032
610432 76
12
RECORD DATA CONN
0 0
0
0
COMPLETE DATA CONN
0 52026
0
1
Total bytes sent: 4893244 (4 MB) in 52026 packets. Average 94

o sk92909 (How to debug ClusterXL to understand why a connection is not
synchronize)
cphaprob ucfstat
Description:
Prints the Full Connectivity Upgrade (FCU) statistics on the member that is being
upgraded in Full Connectivity mode.
Note: FCU is not supported since R75 (refer to sk107042).

Syntax:
[Expert@Member]# cphaprob fcustat
Example:
[Expert@FW2-Member:0]# cphaprob fcustat
During FCU....................... yes
Number of connection modules..... 23
Connection module map (remote -->local)
0 --> 0 (Accounting)
1 --> 1 (Authentication)
2 --> 3 (NAT)
3 --> 4 (SeqVerifier)
4 --> 5 (SynDefender)
5 --> 6 (Tcpstreaming)
6 --> 7 (VPN)
Table id map (remote->local)..... (none or a specific list,
depending on configuration)
Table handlers ..................
78 --> 0xF98EFFD0 (sip_state)
8158 --> 0xF9872070 (connections)
Global handlers ................. none
Output section
During FCU
Number of connection modules

Connection module map
Table id map
Table handlers
Global handlers
Explanation
This should be "yes" only after
running the 'fw fcu' command and
before running 'cphastop' on the
final old member.
In all other cases it should be "no".
Safe to ignore.
The output reveals a translation
map from the old member to the
new member.
For additional information, refer to
'Full Connectivity Upgrade
Limitations' in the Installation and
Upgrade Guide.
This shows the mapping between
the gateway's kernel table indices
on the old member and on the NM.
Having a translation is not
mandatory.
This should include a sip_state
and connection table handlers.
Security Gateway configuration (in
VSX, applies to R75.40VS and
above ), a VPN handler should also
be included.
Reserved for future use.

cphaprob tablestat
Description:
Prints the Cluster tables.
Syntax:
[Expert@Member]# cphaprob tablestat
Example:
[Expert@FW2-Member:0]# cphaprob tablestat
----
Unique IP's Table
----
Member
Interface
IP-Address
-----------------------------------------0
1
192.168.204.31
0
2
10.10.10.31
0
3
20.20.20.31
(Local)
1
1
1
1
2
3
192.168.204.32
10.10.10.32
20.20.20.32
------------------------------------------
'cphastart' and 'cphastop' commands

R75.40VS, R76, R77) - Chapter 'Monitoring and Troubleshooting Gateway Clusters' ClusterXL Configuration Commands - The cphastart and cphastop Commands.
cphastart
o
o
o
o
Running cphastart on a cluster member activates ClusterXL on the member.

It does not initiate Full Sync.
The cpstart command is the recommended way to start a cluster member.
To collect more information, run this command under debug (by default, output is
printed on the screen; output can be redirected into a file):
[Expert@Member]# cphastart -d > /var/log/cphastart.txt
Refer to the following lines in the output file:

prepare_command_args: -D ... start
/opt/CPsuite-RXX/fw1/bin/cphaconf clear-secured
/opt/CPsuite-RXX/fw1/bin/cphaconf -D ... start
Refer to cphaconf <relevant flags> start section.

Note: Starting in R77.20, refer to $FWDIR/log/cphastart.elg

cphastop
o Running cphastop on a cluster member stops the cluster member from passing
traffic.
o State Synchronization also stops.
o It is still possible to open connections directly to the cluster member.
o In High Availability Legacy mode, running cphastop may cause the entire
cluster to stop functioning.
'cphaconf' command
Important Note: This command should NOT normally be used, since configuration is
controlled by the Management Server. Use it only if specifically instructed to by Check
Point Support. Exception: when working with Bond interfaces.
R75.40VS, R76, R77) - Chapter 'Monitoring and Troubleshooting Gateway Clusters' ClusterXL Configuration Commands - The cphaconf command.
Note: Starting in R77.20, refer to $FWDIR/log/cphaconf.elg
Note: The commands below are listed in the order to their importance / relevance.
cphaconf <relevant flags> start

Important Note:
Use the following table only to analyze the output of 'cphastart -d' command.
Refer to the following lines in the output of 'cphastart -d' command:
prepare_command_args: -D ... start
/opt/CPsuite-RXX/fw1/bin/cphaconf clear-secured
/opt/CPsuite-RXX/fw1/bin/cphaconf -D ... start
Description:
Loads cluster configuration with relevant options into kernel.
Flags:
Flags
-D
-c <size>
-i <ID>
-n <ID>
Description
Prints debug information about
the execution of 'cphaconf' command
Sets cluster size
(number of members in the cluster)
Sets member ID of the local machine
(count is starts from 1
Sets cluster ID

-p
-m
-m
-m
-m
<policy_id>
<1|service>
<2|balance>
<3|primary-up>
<4|active-up>
Sets Policy ID explicitly

Sets cluster mode (use either the ordinal
number, or the explicit name):
balance = Load Sharing (refer to '-M' flag)
primary-up = High Availability Primary Up
active-up = High Availability Active Up
service = Service Mode (HA is not active -
-R a
-R <required_IF_num>
Sets the number of required interfaces:

a = the number should be detected
automatically
required_IF_num = sets the number
explicitly (the number of required trusted
(secured, a.k.a. Sync) interfaces will be
set to zero, because in manual mode we
do not check trusted interfaces)
Sets/adds the trusted (secured, a.k.a. Sync)
interfaces explicitly
though sync, for example may be on)
-t <secured_IF_1>
<secured_IF_2>
...
-d <disconnected_IF_1>
<disconnected_IF_2>
...
-A
-M <0|multicast>
-M <1|pivot>
-l
-l
-l
-l
-l
-l
-l
-l
0
1
2
3
4
5
6
7
-S 0
-S 1
-f 0
-f 1
-f 2
-o
-x
-z 0
Sets/adds the disconnected interfaces

explicitly
Enables auto mode in order to add all unused
interfaces (automatically)
Sets Load Sharing mode (use either the
ordinal number, or the explicit name)
Note: <0|multicast> is the default - it can
be omitted from the command
Sets tracking mode for changes in the status
of cluster members (failover):
0 = None
1 = Log
2 = Popup Alert
3 = Mail Alert
4 = SNMP Trap Alert
5 = User Defined Alert no. 1
Enables/disables Sticky Decision Function:
0 = do not use Sticky Decision Function
1 = use Sticky Decision Function
(must be set with '-f' flag)
Sets Sticky Decision Function method:
0 = IPs, Ports, and SPIs (default)
1 = IPs and Ports
2 = IPs only
Sets High Availability in Legacy mode
Sets High Availability in New mode (multicast)
0 = Disables VMAC mode

-z 1
-v
-V
-T 0
-T 1
-T 2
-r
-s
1 = Enables VMAC mode (per sk50840)

Enables VSLS mode
(refer to VSX Administration Guide)
Prior to R75.40VS:
MD5 checksum for VSLS parameters
Prior to R75.40VS:
3rd party mode
0 = 3rd party cluster mode is undefined
1 = more than 1 machine is active in 3rd party
2 = 1 machine is active in 3rd party
Read registry (only on Windows OS)
Disables Switch Support - the sending of
Ethernet broadcast to overcome switches.
Related to probing of local network via ARP.
Notes:
If it is guaranteed that the cluster will not
be connected to Layer 2 switches, then
Switch Support can be disabled (no
impact).
The following lines appear in cluster
debug with 'conf' flag:
FW-1: fwha_set_conf: SWITCH SUPPORT
FW-1: fwha_set_conf: NO SWITCH SUPPORT
cphaconf stop
Description:
Removes the cluster configuration from kernel.
Background:
The 'cphastop' command is actually a shell script wrapper that runs this command.
cphaconf set_ccp <broadcast/multicast>

Description:
Sets the CCP mode - broadcast / multicast (default mode).
Notes:
o Refer to CCP modes section.
o Explicit configuration will be added into:
Unix OS: $FWDIR/boot/ha_boot.conf file
Windows OS: Windows Registry HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\CPHA
\CCP_mode
o sk20576 (How to set ClusterXL Control Protocol (CCP) in Broadcast / Multicast
mode in ClusterXL)
o sk36644 (The Mode of Cluster Control Protocol (CCP) in VSX cluster).

cphaconf show_bond <-a | bond_name>

Description:
Shows the current bond configuration
o -a = displays a summary table for all bond interfaces
o bond_name = displays a summary table for specific bond and its slaves
cphaconf failover_bond <bond_name>

Description:
Starts internal failover between slave interfaces of given bond interface (Bond in
High Availability mode only).
cphaconf enable_bond_failover <bond_name>

Refer to Configuring Bond Failover Mode section.
Description:
Sets what happens during a failover after a bond has already failed over internally.
Note:
It works only if the value of kernel parameter 'fwha_manual_bond_failover' is
currently set to 1 (one).
The fwha_manual_bond_failover kernel parameter is used to set the failover
mode - either within the bond, or to the next cluster member:
o 0 - (default) sets the system to fail over to another bonded slave interface when a
failure is detected on a slave interface. This is the default setting.
o 1 - sets the system to fail over to another cluster member when failure is
detected on a slave interface.
o In both modes, the next bond failover occurs in three minutes.
cphaconf debug_data
Description:
Prints the current cluster configuration as loaded in the kernel on this machine.
Note:
Works only during the following cluster debug:
In 1st shell:
[Expert@Member_HostName]#
fw
fw
fw
fw
ctl
ctl
ctl
ctl
debug 0
debug -buf 32000
debug -m cluster + conf
kdebug -T -f > /var/log/debug.txt
In 2nd shell:
[Expert@Member_HostName]# cphaconf debug_data

In 1st shell:
[Expert@Member_HostName]# fw ctl debug 0
Review /var/log/debug.txt
Example:
Configuration:
[Expert@FW1-Member:0]# cphaprob state

Cluster Mode:
High Availability (Active Up) with IGMP Membership
Number
Unique Address
Assigned Load
State
1 (local)
2
10.10.10.31
10.10.10.32
100%
0%
Active
Standby
[Expert@FW1-Member:0]# cphaprob -a if
eth0
eth1
eth2
UP
UP
UP

Virtual cluster interfaces: 2

eth0
eth2
192.168.204.33
20.20.20.33

Debug output:
;[cpu_1];[fw4_0];================================================;
;[cpu_1];[fw4_0];=====
ClusterXL
debug
information
===;
;[cpu_1];[fw4_0];================================================
;
;[cpu_1];[fw4_0];---- Sync ---;
;[cpu_1];[fw4_0];fwlddist_state is (1a): Receiving, Not Saving, Sending;
;[cpu_1];[fw4_0];fwlddist_dobcast is: 1;
;[cpu_1];[fw4_0];fw_has_nondefault_filter is: 1;
;[cpu_1];[fw4_0];fw_syncn_is_configured is: 1;
;[cpu_1];[fw4_0];fwlddist_policy_in_ready_state is: 1;
;[cpu_1];[fw4_0];---VMAC mode:
---;
;[cpu_1];[fw4_0];VMAC: vmac mode is enabled;
;[cpu_1];[fw4_0];VMAC: the vmac of each interface:;
;[cpu_1];[fw4_0];Interface: 1) eth0, vmac: 00:1C:7F:00:00:FE;
;[cpu_1];[fw4_0];VMAC: priomisc mode interfaces (by the VMAC mechanism) are:;
;[cpu_1];[fw4_0];Interface: 1) eth0, vmac_index=0x0;
;[cpu_1];[fw4_0];------------------------

;
;[cpu_1];[fw4_0];---Interfaces info:
---;
;[cpu_1];[fw4_0];0) if: lo, flags: 0x800;
;[cpu_1];[fw4_0];1) if: eth0, flags: 0x10000800;
;[cpu_1];[fw4_0];-----------------------;
;[cpu_1];[fw4_0];================================================;
;[cpu_1];[fw4_0];=====
ClusterXL
debug
end
===;
;[cpu_1];[fw4_0];================================================
;
;[cpu_1];[fw4_1];================================================;
;[cpu_1];[fw4_1];=====
ClusterXL
debug
information
===;
;[cpu_1];[fw4_1];================================================
;
;[cpu_1];[fw4_1];-----------------------;[cpu_1];[fw4_1];=====
Cluster instance information
===;
;[cpu_1];[fw4_1];-----------------------;
;[cpu_1];[fw4_1];---Selection table ---;
;[cpu_1];[fw4_1];Effective selection table size: 2
;
;[cpu_1];[fw4_1];0: 0;
;[cpu_1];[fw4_1];1: 0;
;[cpu_1];[fw4_1];-----------------------;
;[cpu_1];[fw4_1];---- Multicast table ---;
;[cpu_1];[fw4_1];lo: Address:
1.0.0.127;
;[cpu_1];[fw4_1];Cluster/default multicast IP: 0.0.0.0, MAC address:
00:00:00:00:00:00;
;[cpu_1];[fw4_1];eth0: Address:
31.204.168.192;
01:00:5E:28:CC:21;
31.10.10.10;
01:00:5E:0A:0A:FA;
31.20.20.20;
01:00:5E:14:14:21;
;[cpu_1];[fw4_1];-----------------------;
;[cpu_1];[fw4_1];---- Status subscribers ---;
;[cpu_1];[fw4_1];Subscriber: 0 pid 23079 sig 12 desc pepd;
;[cpu_1];[fw4_1];Subscriber: 1 pid 23078 sig 12 desc pdpd;
;[cpu_1];[fw4_1];Subscriber: 2 pid 25236 sig 3 desc routed instance 0;
;[cpu_1];[fw4_1];Subscriber: 3 pid 25270 sig 12 desc ted;
;[cpu_1];[fw4_1];Subscriber: 4 pid 4533 sig 12 desc cvpnd;
;[cpu_1];[fw4_1];-----------------------;
;[cpu_1];[fw4_1];=====
Cluster instance information end
;[cpu_1];[fw4_1];------------------------
===;

;[cpu_1];[fw4_1];---- Sync ---;

;[cpu_1];[fw4_1];fwlddist_state is (1a): Receiving, Not Saving, Sending;
;[cpu_1];[fw4_1];fwlddist_dobcast is: 1;
;[cpu_1];[fw4_1];fw_has_nondefault_filter is: 1;
;[cpu_1];[fw4_1];fw_syncn_is_configured is: 1;
;[cpu_1];[fw4_1];fwlddist_policy_in_ready_state is: 1;
;[cpu_1];[fw4_1];---VMAC mode:
---;
;[cpu_1];[fw4_1];VMAC: vmac mode is enabled;
;[cpu_1];[fw4_1];VMAC: the vmac of each interface:;
;[cpu_1];[fw4_1];VMAC: priomisc mode interfaces (by the VMAC mechanism) are:;
;[cpu_1];[fw4_1];-----------------------;
;[cpu_1];[fw4_1];---Interfaces info:
---;
;[cpu_1];[fw4_1];0) if: lo, flags: 0x800;
;[cpu_1];[fw4_1];-----------------------;
;[cpu_1];[fw4_1];================================================;
;[cpu_1];[fw4_1];=====
ClusterXL
debug
end
===;
;[cpu_1];[fw4_1];================================================
cphaconf -t <secured_IF_1><secured_IF_2>...
add
Description:
Adds the specified trusted (secured) interfaces explicitly into the current cluster
configuration in kernel.
cphaconf sync
Description:
Sets sync configuration in kernel (in HA New mode).
cphaconf stop_all_vs
Description:
Stops clustering on each Virtual System (relevant only for VSX systems).

cphaconf forward <on|off>

Description:
Enables (on; default setting) / Disables (off) the Forwarding Layer (controls the
forwarding of traffic between cluster members).
Refer to Forwarding section.
cphaconf clear-secured
Description:
Clears the list of secured (trusted) interfaces in kernel.
cphaconf clear-disconnected
Description:
Clears the list of disconnected interfaces in kernel.
Refer to Defining 'Disconnected' interfaces section.
cphaconf clear_subs
Description:
Clears the list of subscribers.
Note:
List of such subscribers can be obtained by running the cphaconf debug_data
command.
cphaconf mc_reload
Description:
Updates the multicast configuration by reloading the 'cphamcset' daemon (if this is
HA New mode and CCP is set to run in Multicast mode). The current configuration is
kept.
cphaconf uninstall_macs
Description:
Calls the $FWDIR/bin/cpha_restore_macs script to remove the cluster MAC
address configuration (and restore a previous MAC configuration if it was saved on
Linux-based OS to the ifcfg-ethX file).

cphaconf macs
Description:
Only on IPSO OS: Sets Multicast MAC addresses on relevant interfaces.
cphaconf init
Description:
Initializes cluster configuration.
cphaconf fini
Description:
Finalizes cluster configuration.
cphaconf debug <on|off>

Description:
Enables (on) / Disables (off) ClusterXL kernel debug module (fw ctl debug -m
cluster).
'cpstat' command
Description:
Produces relevant information for the installed products.
Syntax:
[Expert@HostName]# cpstat [-d] [-s SIC_Name] [-p port] [-o
polling_interval [-c count] [-e period]] [-f flavour]
application_flag
Flags:
'cpstat' flags
-d
s <SIC_Name>
-p <port>
-o <polling_interval
>
-c <count>
-e <period>
Description
Prints some debug information about the
execution of 'cpstat' command
Sets the SIC name of the AMON server
Sets the port number of the AMON server
(default port is 18192)
Sets polling interval (in seconds)- how
frequently to produce the output (default is 0,
i.e., the results are shown only once)
Sets how many times in total to produce the
output(default is 0, i.e., the results are shown
repeatedly)
Sets the interval, over which "statistical" OIDs
are computed (ignored for regular OIDSs)

-f <flavour
>
application_flag
Specifies "flavor" of the output - which

information to display for desired product
Note: to see the supported flavours, just type
'cpstat' and press Enter
Specifies the desired product
Note: to see the supported products, just type
'cpstat' and press Enter
In our case, we are interested in the information only about the ClusterXL product:
[Expert@HostName]# cpstat -f default ha
[Expert@HostName]# cpstat -f all ha
Refer to sk93201 (Output of 'cpstat -f all ha' command on Gaia OS does not populate
the 'Cluster IPs table' and the 'Sync table').
The 'cpstat -f all ha' command on Gaia OS and on 3rd party / OPSec clusters
works in the following way:
1. The 'cpstat -f all ha' command calls the
$FWDIR/bin/cxl_create_partner_topology_file shell script.
2. The $FWDIR/bin/cxl_create_partner_topology_file shell script collects
the relevant information and saves in the
$FWDIR/tmp/cxl_partner_topology_config.txt file.
3. 'cpstat -f all ha' uses the information in
$FWDIR/tmp/cxl_partner_topology_config.txt file and populates the
'Cluster IPs table' and the 'Sync table'.
Examples:
[Expert@Member]# cpstat -f default ha
Product name:
Version:
Status:
HA installed:
Working mode:
HA started:
High Availability
N/A
OK
1
High Availability (Active Up)
yes
[Expert@Member]# cpstat -f all ha

Product name:
High Availability
Major version:
6
Minor version:
0
Service pack:
3
Version string:
N/A
Status code:
0
Status short:
OK
Status long:
Refer to the Notification and Interfaces tables for
information about the problem
HA installed:
1
Working mode:
High Availability (Active Up)
HA protocol version: 2
HA started:
yes
HA state:
standby
HA identifier:
1

Interface table
---------------------------------------------------------------|Name|IP
|Status
|Verified|Trusted|Shared|Netmask|
---------------------------------------------------------------|eth0|172.30.41.78|Up
|
0|
0|
2|0.0.0.0|
|eth1| 10.10.10.78|Up
|
300|
1|
2|0.0.0.0|
|eth2| 20.20.20.78|Up
|
300|
0|
2|0.0.0.0|
|eth3| 30.30.30.78|Disconnected|21318100|
0|
2|0.0.0.0|
|eth4| 40.40.40.78|Disconnected|21318100|
0|
2|0.0.0.0|
---------------------------------------------------------------Problem Notification table
-----------------------------------------------|Name
|Status|Priority|Verified|Descr|
-----------------------------------------------|Synchronization|OK
|
0| 168880|
|
|Filter
|OK
|
0|
21318|
|
|cphad
|OK
|
0|
21318|
|
|fwd
|OK
|
0| 168949|
|
|routed
|OK
|
0|
21307|
|
|cvpnd
|OK
|
0|
1|
|
|ted
|OK
|
0|
1|
|
-----------------------------------------------Cluster IPs table
--------------------------------------------------------------|Name|IP
|Netmask
|Member Network|Member Netmask|
--------------------------------------------------------------|eth0|172.30.41.79| 255.255.0.0|
172.30.0.0|
255.255.0.0|
|eth2| 20.20.20.79|255.255.255.0|
20.20.20.0| 255.255.255.0|
--------------------------------------------------------------Sync table
-------------------------------|Name|IP
|Netmask
|
-------------------------------|eth1|10.10.10.78|255.255.255.0|
--------------------------------

$FWDIR/bin/clusterXL_admin script
This shell script registers a pnote (called 'admin_down') and gracefully changes the
state of the given cluster member to 'Down' (by reporting the state of that pnote as
'problem'), or gracefully reverts the state of the given cluster member to 'Up' (by reporting
the state of that pnote as 'ok').
Refer to sk55081 (Best practice for manual fail-over in ClusterXL).
$FWDIR/bin/clusterXL_monitor_ipsscript
This shell script pings a list of predefined IP addresses and changes the state of the
given cluster member to 'Down' or 'Up' based on the replies to these pings.
Note: Cluster member will go down even if one ping is not answered.
Refer to sk35780 (How to configure $FWDIR/bin/clusterXL_monitor_ips script to run
automatically on Gaia / SecurePlatform OS).
$FWDIR/bin/clusterXL_monitor_processscript
This shell script monitors a list of predefined processes and changes the state of the
given cluster member to 'Down' or 'Up' based on whether these processes are running or
not.
Refer to sk92904 (How to configure $FWDIR/bin/clusterXL_monitor_process script to
run automatically on Gaia / SecurePlatform OS).

ClusterXL Debugging
Debugging Check Point Security Gateway
In order to see how the Security Gateway processes the traffic, and how the internal
components are working, a debug of Check Point kernel should be run on this Security
Gateway (depending on the issue, it might also be required to run a debug of the relevant
user space daemon - e.g., in case of VPN - vpnd, in case of Full Sync - fwd).
Some debugs print so much information, that the load on CPU might increase to 100%
and render the Security Gateway unresponsive.
Note: It is always recommended to run the kernel debug during a scheduled
maintenance window in order to minimize the impact on production traffic and on users.
Syntax
[Expert@GW_HostName]# fw ctl debug -h
fw ctl debug [-d <strings>] [-s "<string>"] [-v ("<VSIDs>"|all)] [-k] [-x] [-m
<module>] [-e expr |-i <filter-file|-> | -u] [+|-] <options | all | 0>
Or: fw ctl debug [-t (NONE|ERR|WRN|NOTICE|INFO)] [-f (RARE|COMMON)]
Or: fw ctl debug -buf [buffer size][-v ("<VSIDs>"|all)][-k]
-h - for help
-e - Set debug filter to expr (inspect script)
-i - Set debug filter from filter-file (- is the standard input)
-u - Unset debug filtering
To display all kernel debugging modules and all their flags that this machine supports:
[Expert@GW_HostName]# fw ctl debug -m
To display all kernel debugging modules and their flags that were turned on:
[Expert@GW_HostName]# fw ctl debug
To display all debugging flags that were turned on for this kernel debugging module:
[Expert@GW_HostName]# fw ctl debug -m MODULE
To set default kernel debug options:

[Expert@GW_HostName]# fw ctl debug 0
Notes:
Some debug flags are enabled by default (error, warning) in various kernel
debugging modules, so that some generic messages are printed into Operating
System log (Linux-based OS: /var/log/messages; Windows OS: Event
Viewer).
This command should be issued before starting any kernel debug.
This command must be issued to stop the kernel debug.

To unset all kernel debug options:

[Expert@GW_HostName]# fw ctl debug -x
Note:
This unsets all debug flags, which means that none of the relevant messages will be
printed. Default debug flags should be enabled.
To set kernel debugging buffer:
[Expert@GW_HostName]# fw ctl debug -buf 32000
Notes:
Default size of the debugging buffer is 50 KB
Maximal size of the debugging buffer is 32768 KB
Unless the size of the debugging buffer is increased from default 50 KB, the
debug will not be redirected to a file (debug messages will be printed into
Operating System log)
Debug messages are collected in this buffer, and a user space process
($FWDIR/bin/fw) collects them and prints into the output file.
To print debug messages into the output file (start the kernel debug):
[Expert@GW_HostName]# fw ctl kdebug -T -f > /var/log/debug.txt
Note:
If you need to use this command in shell scripts, then add an ampersand at the end
to run the command in the background (fw ctl kdebug -T -f >
/var/log/debug.txt &).
To stop the kernel debug:
Press CTRL+C and set the default kernel debug options
Note:
If you started the kernel debug via shell script, then you should just set the default
kernel debug options.
Important Notes about 'cpstop' and 'cpstart':

When running the 'cpstop' command, all Check Point services are stopped - and
the kernel debug will stop printing debug messages.
When running the 'cpstart' command (after the 'cpstop'), the kernel debug will
continue printing debug messages.
Important Notes about Security Gateway in VSX mode:

In VSX NGX / VSX R6x, the kernel debug commands can be run from context of any
Virtual Device.
In VSX R6x, if you wish to filter the debug for messages only from specific Virtual
Devices, then use specify the relevant VSID in the syntax when setting flags:
[Expert@VSX_HostName:0]# fw ctl debug -v VSID1, VSID2 -m MODULE + flags
Note: Refer to VSX NGX R65 Administration Guide - 'Per Virtual System
Debugging'.

In R75.40VS and above in VSX mode, you have to switch to the context of the
specific Virtual Device, and then run the usual debugging commands:
[Expert@VSX_HostName:0]# vsenv VSID
[Expert@VSX_HostName:VSID]# fw ctl debug ...
Debug action plan

1. Prepare the kernel debug options:
A. Set default kernel debug options:
Should get this message:

Defaulting all kernel debugging options
B. Set kernel debug buffer:

[Expert@GW_HostName]# fw ctl debug -buf 32000

Initialized kernel debugging buffer to size 32000K
Note: Any other message means that there was a problem allocating the buffer,
and you should not continue until that issue is resolved (e.g., "Failed to
allocate kernel debugging buffer").
C. Set relevant kernel debug flags in relevant kernel debugging modules:
[Expert@GW_HostName]# fw ctl debug -m MODULE + FLAG1 FLAG2 ... FLAGn
or
[Expert@GW_HostName]# fw ctl debug -m MODULE all

Updated kernel's debug variable for module MODULE
Note: Pay close attention to the name of the kernel debug module.
2. Verify the kernel debug options:
[Expert@GW_HostName]# fw ctl debug -m MODULE
Should get this output:

Kernel debugging buffer size: 32000KB
Module: MODULE
Enabled Kernel debugging options: LIST OF FLAGS
Notes:
Pay close attention to the size of the kernel debugging buffer.
Pay close attention to the name of the kernel debugging module.
The order of the flags in this output does not matter - just all the flags you set
have to be here.

3. Start the kernel debug:

[Expert@GW_HostName]# fw ctl kdebug -T -f > /var/log/debug.txt
Should see the blinking cursor - the debug has started.

You can open a new shell and verify that the information is written into the output
file:
[Expert@GW_HostName]# tail -f
/var/log/debug.txt
4. If needed, start capturing the relevant traffic:

A. Start Check Point FW Monitor (refer to sk30583)
B. Start TCPdump on relevant interfaces
Note: It is strongly recommended to filter only the relevant traffic.
5. Replicate the issue:
A. Initiate the problematic traffic (write down exact times, IP addresses, ports, etc)
B. Repeat the steps that lead to unwanted behaviour
C. Make sure the issue was replicated
6. Stop the kernel debug and set default kernel debug options:
Press CTRL+C
7. Stop the traffic captures:

Press CTRL+C
8. Collect the debug output files (from kernel debug and traffic captures) and all other
related files (OS logs, CPinfo files, daemons' logs, SmartView Tracker logs, etc).
Debugging modules and flags

To debug Check Point ClusterXL software, the following kernel debugging settings are
used:
Global Kernel parameters
Before starting the kernel debug itself, pay attention to the following global kernel
parameters relevant to relevant to cluster issues (after debug, set the default values):
Disable this kernel parameter to disable the limit on the debug messages time
window (default - 60 ; zero - disables the limit):
[Expert@Member_HostName]# fw ctl set int fw_kdprintf_limit_time 0
Disable this kernel parameter to disable the limit on the amount of debug messages
(default - 30 ; zero - disables the limit) that are printed within specified time
(fw_kdprintf_limit_time):
[Expert@Member_HostName]# fw ctl set int fw_kdprintf_limit 0

Set this kernel parameter to print additional IO information and the contents of the
packets in HEX format when 'select' flag is enabled in 'cluster' module:
[Expert@Member_HostName]# fw ctl set int fwha_dprint_io 1
Set this kernel parameter to print additional information about cluster interfaces
when 'if' flag is enabled in 'cluster' module (very helpful for Check Point RnD):
[Expert@Member_HostName]# fw ctl set int fwha_dprint_all_net_check 1
Set this kernel parameter to print the dump of each packet when 'packet' flag is
enabled in 'fw' module (very helpful for Check Point RnD):
[Expert@Member_HostName]# fw ctl set int fw_debug_dump_packet 1
Notes:
o This parameter is available in R75.40VS, R76 and above
o Enabling the debug with flag 'packet' creates high load on CPU
o Enabling the parameter 'fw_debug_dump_packet' creates high load on CPU
Kernel debugging modules and debug flags

Refer to Kernel Debug flags (R77, R77.10, R77.20, R77.30).
Kernel debugging modules and debug flags relevant to cluster issues are listed below.
Firewall module: fw ctl debug -m fw + flag1 flag2 ... flagN
Flag
Explanation
chainfwd
chain forwarding - processing

of packet by various layers
highavail
cluster configuration
ioctl
mrtsync
nat
sync
xlate
IOCTL control messages sending configuration from

user space to kernel
synchronization (in kernel)
between cluster members of
Multicast Routes that are
added when working with
Dynamic Routing Multicast
protocols
When should be used

* complicated traffic issues and
forwarding between members
* recommended: 'df' , 'forward' flags
from 'cluster' module
* changes in the configuration
* information about interfaces during
traffic processing
* policy installation (in specific cases only)
* manual changes in the configuration
* PIM Routing is enabled on cluster

* Multicast traffic passes through
* Refer to sk95156
NAT issues - basic information

(Hiding and Folding behind
VIP)
synchronization operations in
ClusterXL
NAT issues - basic information
(Hiding and Folding behind
VIP)
* traffic is not hidden/folded, or incorrectly

hidden/folded behind cluster Virtual IP
address
* NAT is not working as expected
see how and which connections are
synchronized
address

xltrc

address
NAT issues - additional

information - going through
NAT rulebase
Cluster module: fw ctl debug -m cluster + flag1 flag2 ... flagN

Flag
Explanation
accel
related to status and support of

SecureXL
ccp
arrival/transmission of Cluster
Control Protocol (CCP) packets
conf
configuration and policy

installation
cu
Connectivity Upgrade
(only since R77.20)
df
drop
forward
Decision Function - decides,

which member will handle each
packet in a Load Sharing mode
connections dropped by the
CXL Decision Function (DF)
module (only in R60 and above)
- excluding CCP packets
Forwarding Layer messages sending and receiving a
forwarded packet
if
interface tracking and validation

- all the operations and checks
on interfaces
log
creating and sending of logs by

cluster (should be used in
parallel with 'log' flag in 'fw'
module)
When should be used

* see how cluster works with SecureXL
* use with: conf
* recommended: when debugging
SecureXL
* issues related to CCP (e.g., member is
stuck in 'Ready' state)
* recommended: during policy installation
(use with: conf)
* use with: pnote, stat
* recommended: if, mac
* anything related to cluster configuration
and policy installation
* use with: pnote, stat, subs
* recommended: 'ioctl' flag from 'fw'
module
* translation of kernel tables
* connection synchronization
* connection rematch
* etc.
* use with: conf, pnote, stat, subs
* traffic issues in LS cluster
* recommended: select
* traffic issues
* recommended: df, select, forward
* traffic issues
* use with: df, if, mac
* recommended: select
* changes in member's state
* configuration and policy installation
* use with: conf, pnote, stat, mac,
subs
* recommended: 'highavail' flag from
'fw' module
* logging issues
* use with 'log' flag from 'fw' module

mac
related to current configuration

of and detection of cluster
interfaces (should be used in
parallel with 'conf' flag and 'if'
flag)
nokia
related to cluster running on

IPSO OS
pivot
related to ClusterXL Load

Sharing Unicast (Pivot) mode
related to registering and
monitoring of critical devices
(pnotes)
pnote
select
packet selection - including

Decision Function (DF)
stat
related to state of cluster

members (state machine)
Subscriber module - set of APIs,
which enable user space
processes (by using a DLL) to
be aware of the current state of
the ClusterXL state machine
and other clustering
configuration parameters.
subs
timer
reports of cluster internal timers

* configuration and policy installation
* use with: conf, pnote, stat, if, subs
* recommended: 'highavail' flag from
'fw' module
* function calls between Check Point and
IPSO OS and reports about state of
cluster
* use only on IPSO OS
all decisions made in LS Unicast (Pivot
mode)
* state of critical devices (pnotes)
* use with: conf, stat, subs
* traffic issues
* use with: df, forward
* use with: pnote, subs, mac
* cluster configuration and policy
installation
* use with: conf, pnote, stat, mac
* issues with cluster operations that are
based on internal cluster timers
* use with: ccp
* recommended: pnote, stat, if, mac
Additional flags in 'cluster' module only on 41000/61000 appliance:

Flag
correction
Explanation
Correction Layer
bstat
Blade State
ch_ccp
Chassis CCP
ch_conf
Chassis configuration
ch_stat
Chassis State
iterator
Iterator
osp
Open Security Platform
smo
Single Management Object
unisync
vpn
Unicast Sync
VPN traffic
When should be used

traffic issues
anything related to policy installation and
configuration
issues related to CCP
anything related to configuration and
interfaces
anything related to state
* traffic issues
* anything related to policy installation and
configuration
traffic balancing issues
* anything related to policy installation and
configuration
* anything related to state
anything related to Unicast Sync
VPN traffic issues


sk83220 (How to collect ClusterXL debug during boot)
sk31114 (How to collect debug on Security Gateway during boot)
sk92987 (How to debug Check Point 'cpstart' and 'cpstop' commands)
Working with kernel parameters

Kernel parameters allow the administrator to control various aspects of Check Point
software operation.
The desired behaviour is achieved by setting the relevant values for the corresponding
kernel parameters.
It is strongly recommended to consult Check Point Support before changing any

values of kernel parameters.
Some kernel parameters can be set on-the-fly with 'fw ctl set int PARAMETER
VALUE' command (e.g., fwha_mac_magic).
Note: This change does not survive reboot.
Some kernel parameters can be set only during boot of the machine (any parameter
that controls memory allocation, sizes of memory buffers).
To check the current value of any integer parameter:

[Expert@Member_HostName]# fw ctl get int PARAMETER
To set the desired value on-the-fly (if parameter supports it):

[Expert@Member_HostName]# fw ctl set int PARAMETER VALUE
Note: This change does not survive reboot.
To set the desired value permanently (all parameters support it):

Refer to sk26202 (Changing the kernel global parameters for Check Point Security
Gateway).
Refer to the solutions that contain most relevant cluster-related kernel parameters:
sk92723 (Cluster flapping prevention)
sk25977 (Connecting multiple clusters to the same network segment (same VLAN,
same switch)
sk23695 ('FW-1: State synchronization is in risk. Please examine your
synchronization network to avoid further problems!' appears in /var/log/messages
file)
switches)
sk31655 (State of Standby cluster member in High Availability cluster is constantly
changing between 'Standby' and 'Down')
sk31336 (Using Monitor Interface Link State feature to improve ClusterXL interfacefailure-detection ability)
sk62863 (ClusterXL - cluster debug shows interface flapping due to the missing CCP
packets)

sk63163 (Failover does not occur in ClusterXL HA Primary Up mode after changing
cluster member priorities and installing the policy)
sk41827 (Synchronization network in the cluster is flooded with Sync Retransmit
packets)
sk82080 (/var/log/messages are filled with 'kernel: FW-1:
410 packets')
sk43872 (ClusterXL - CCP packets and fwha_timer_cpha_res parameter)
sk41471 (ClusterXL - State Synchronization time interval and 'fwha_timer_sync_res'
kernel parameter)
sk95156 (How to control the synchronization of multicast routes in Check Point
cluster)
sk104567 (Traffic passing through the VSX cluster is lost during a cluster failure on
Standby member)
ClusterXL Error Messages

R75.40VS, R76, R77) - Chapter 'Monitoring and Troubleshooting Gateway Clusters':
Monitoring Cluster Status Using SmartConsole Clients - SmartView Tracker ClusterXL Log Messages
ClusterXL Error Messages
Additional related solutions

sk57120 (Various warnings in SmartDashboard about synchronization networks in
cluster)
sk98722 (ATRG: SecureXL)
sk98737 (ATRG: CoreXL)


Atrg Clusterxl r6x r7x

Încărcat de

Informații document

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Atrg Clusterxl r6x r7x

Încărcat de

Drepturi de autor:

Formate disponibile

ClusterXL

2015 Check Point Software Technologies Ltd. All rights reserved.

Improved calculation of Destination MAC address for CCP packets

2015 Check Point Software Technologies Ltd. All rights reserved.

Added additional related solutions

2015 Check Point Software Technologies Ltd. All rights reserved.

State Synchronization in ClusterXL .................................................................................... 22

ClusterXL Modes ................................................................................................................ 27

ClusterXL Configuration ..................................................................................................... 45

Cluster state transitions ...................................................................................................... 69

Cluster Control Protocol (CCP) .......................................................................................... 78

2015 Check Point Software Technologies Ltd. All rights reserved.

CCP modes ...................................................................................................................................................82

External Header .......................................................................................................... 85

ClusterXL Debugging ....................................................................................................... 153

Working with kernel parameters ....................................................................................... 160

2015 Check Point Software Technologies Ltd. All rights reserved.

Check Point cluster solution

2015 Check Point Software Technologies Ltd. All rights reserved.

ClusterXL definitions and terms

2015 Check Point Software Technologies Ltd. All rights reserved.

State Synchronization (a.k.a. Sync) - Technology that synchronizes the relevant

2015 Check Point Software Technologies Ltd. All rights reserved.

2015 Check Point Software Technologies Ltd. All rights reserved.

2015 Check Point Software Technologies Ltd. All rights reserved.

2015 Check Point Software Technologies Ltd. All rights reserved.

2015 Check Point Software Technologies Ltd. All rights reserved.

Device Name: Problem Notification

Device Name: Interface Active Check

Device Name: HA Initialization

Device Name: Load Balancing Configuration

2015 Check Point Software Technologies Ltd. All rights reserved.

Device Name: Recovery Delay

Device Name: IPSO member status

There are several predefined registered critical devices (pnotes):

Device Name: Synchronization

Device Name: Filter

Device Name: VSX

Device Name: fwd

Device Name: cphad

Device Name: cvpnd

2015 Check Point Software Technologies Ltd. All rights reserved.

Device Name: FIB

Device Name: routed

Device Name: ted

Device Name: Instances

2015 Check Point Software Technologies Ltd. All rights reserved.

2015 Check Point Software Technologies Ltd. All rights reserved.

2015 Check Point Software Technologies Ltd. All rights reserved.

ClusterXL requirements for hardware and software

If ClusterXL is installed on Open Servers, then it must be installed in a distributed

If ClusterXL is installed on Check Point appliances, then it can be installed in

Requirements for number of cluster members

2015 Check Point Software Technologies Ltd. All rights reserved.

Note: In Crossbeam DBHA configuration, the above requirement applies to a single

2015 Check Point Software Technologies Ltd. All rights reserved.

Requirements for switches and routers

Disabling multicast limits

2015 Check Point Software Technologies Ltd. All rights reserved.

Load Sharing Multicast Mode

IGMP and static CAMs

Disabling multicast limits

2015 Check Point Software Technologies Ltd. All rights reserved.

State Synchronization in ClusterXL