http://systemadminstrators.blogspot.in/2015/01/how-to-troubleshootingwindows-server.html
How to Troubleshoot Windows Server 2008 R2 & 2012 Failover Clusters
I was configuring a Windows 2012 cluster on Xen Cloud Platform and found this troubleshooting guide
on a Microsoft blog. It is really useful and makes it easy to track down an error.
How to get to the root of the problem
In "Troubleshooting Windows Server 2008 R2 Failover Clusters," I covered the locations and tips for where
you can go to get the data you need to troubleshoot a problem. Now I'll discuss some of the
improvements made to the troubleshooting tools for Windows Server 2012 failover clusters and
show you how to take advantage of those tools.
Introducing the New Event Channels
There are some new event channels for failover clustering to help with troubleshooting. Figure 1
shows all the available channels.
Figure 1: Event Channels for Failover Clustering in Server 2012
Note that the events are specific to the node you're on.
Knowing the purpose of each event channel can help you find the errors more quickly, which in turn
will help you troubleshoot the problem more quickly. Here's an explanation of each channel:
FailoverClustering
o Diagnostic. This is the main log that's circular in nature and runs anytime the cluster service starts. Events
can be read in the Event Viewer if logging is disabled. They can also be converted to text file format.
o Operational. Any informational cluster events are registered in this log, such as groups moving, going
online, or going offline.
o Performance-CSV. This channel is used to collect information pertaining to the functionality of Cluster
Shared Volumes (CSVs).
FailoverClustering-Client
o Diagnostic. This channel collects Cluster API trace logging. This log can be useful in troubleshooting
the Create Cluster and Add Node Cluster actions.
FailoverClustering-CsvFlt (new in Server 2012)
o Diagnostic. This channel collects trace logging for the CSV Filter Driver (CsvFlt.sys) that is mounted only
on the coordinator node for a CSV. This channel provides information regarding metadata operations and
redirected I/O operations.
FailoverClustering-CsvFs (new in Server 2012)
o Diagnostic. This channel collects trace logging for the CSV File System Driver (CsvFs.sys), which is
mounted on all nodes in the cluster. This channel provides information regarding direct I/O operations.
FailoverClustering-Manager
o Admin. This channel collects errors associated with dialog boxes and pop-up warnings that are displayed in
Failover Cluster Manager.
FailoverClustering-WMIProvider
o Admin. This channel collects events associated with the Failover Cluster WMI provider.
o Diagnostic. This channel collects trace logging associated with the Failover Cluster WMI provider. It can
be useful when troubleshooting Windows Management Instrumentation (WMI) scripts or Microsoft System
Center applications.
Using the FailoverClustering-Client/Diagnostic Channel
Because administrators often encounter problems when creating clusters and joining nodes, I want to
show you how to use the FailoverClustering-Client/Diagnostic channel. This channel is disabled by
default, so it won't be collecting any data. To enable it, you need to right-click the channel and
choose Enable Log. The Diagnostic channel will then start collecting data relevant to a join or create
operation.
For example, suppose you previously enabled the Diagnostic channel and you're having a problem
creating a cluster. To view the data collected, you need to right-click the channel and choose Disable
Log. In the FailoverClustering-Client/Diagnostic event log, you see the following events:
Event ID: 2
Level: Error
Description: CreateCluster (1883): Create cluster failed
with exception. Error = 8202, msg: Failed to create
cluster name CLUSTER on DC \\DC.CONTOSO.COM. Error 8202.
Event ID: 2
Level: Error
Description: CreateClusterNameCOIfNotExists (6879): Failed
to create computer object CLUSTER on DC \\DC.CONTOSO.COM
with OU ou=Clusters,dc=contoso,dc=com. Error 8202.
Because you have errors, you can use the Net.exe command to see what their status code (8202)
means:
NET HELPMSG 8202
The command returns the message: The specified directory service attribute or value does not
exist. With the new features of Server 2012 Failover Clustering, the cluster will be created in the
same organizational unit (OU) as the nodes. For the cluster name to be created, the logged-on user
must have at least Read and Create Computer Objects permissions. If the user doesn't have those
rights, the name won't be created and you'll receive this type of error.
Now suppose you're trying to add a node to the existing cluster and the operation fails. You review
the events in the FailoverClustering-Client/Diagnostic log, and see the following:
Event ID: 56
Level: Warning
Description: AsyncNotificationCallback (1463): ApiGetNotify
on hNotify=0x0000000021EBCDC0 returns 1 with rpc_error 0
Event ID: 2
Level: Error
Description: SCMStateNotify (837): Repost of
NotifyServiceStatusChange failed for node
'NodeX': status = 1168
Although their wording is a bit on the cryptic side, the descriptions give you the information that you
need. The description for the first event tells you that a remote procedure call (RPC) error occurred.
The description for the second event gives you a status code of 1168. Once again, you can use the
Net.exe command to see what that status code means:
NET HELPMSG 1168
This time, the command returns the message: Element not found. When a node tries to join a cluster,
the running cluster node needs to make an RPC connection to the node being added. In this case, it
couldn't find the node.
So, from the information returned by the two events, you can deduce that the running cluster node
can't make an RPC connection to the node being added because it can't find that node. After further
investigation, you discover that the DNS server has an incorrect IP address for the node being added.
After you correct the IP address, the node successfully joins the cluster.
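When a log holds many events, the status-code lookup described above can be scripted. The following is a minimal Python sketch and not part of the original guide: it pulls the numeric code out of an event description with a regular expression and maps it to its message. The names KNOWN_CODES and explain are hypothetical, and the table holds only the two codes quoted in this article, so NET HELPMSG remains the authoritative lookup on a real system.

```python
import re

# Hypothetical lookup table: only the two codes quoted in this article.
KNOWN_CODES = {
    8202: "The specified directory service attribute or value does not exist.",
    1168: "Element not found.",
}

# Matches "Error 8202", "Error = 8202", "status = 1168" and "status is 1168".
# The leading \b keeps "rpc_error 0" from being treated as a status code.
CODE_PATTERN = re.compile(r"\b(?:error|status)(?:\s+is)?\s*=?\s*(\d+)", re.IGNORECASE)

def explain(description):
    """Return (code, message) pairs for each distinct code in a description."""
    seen = []
    for match in CODE_PATTERN.finditer(description):
        code = int(match.group(1))
        if code not in seen:
            seen.append(code)
    return [(code, KNOWN_CODES.get(code, "unknown; run NET HELPMSG")) for code in seen]

if __name__ == "__main__":
    sample = "Repost of NotifyServiceStatusChange failed for node 'NodeX': status = 1168"
    for code, message in explain(sample):
        print(code, message)
```

Codes that are not in the table come back marked as unknown, which is the cue to fall back to NET HELPMSG.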
Introducing the New Tests in the Validate a Configuration Wizard
Another helpful troubleshooting tool that you can use is the Validate a Configuration Wizard in
Failover Cluster Manager. Several new clustering tests have been added in Server 2012. All the new
tests for Server 2012 clustering are in bold:
Hyper-V (available only if the Hyper-V Role is installed)
o List Hyper-V Virtual Machine Information
o List Information About Servers Running Hyper-V
o Validate Compatibility of Virtual Fibre Channel SANs for Hyper-V
o Validate Firewall Rules for Hyper-V Replica Are Enabled
o Validate Hyper-V Integration Services Version
o Validate Hyper-V Memory Resource Pool Compatibility
o Validate Hyper-V Network Resource Pool and Virtual Switch Compatibility
o Validate Hyper-V Processor Pool Compatibility
o Validate Hyper-V Role Installed
o Validate Hyper-V Storage Resource Pool Compatibility
o Validate Hyper-V Virtual Machine Network Configuration
o Validate Hyper-V Virtual Machine Storage Configuration
o Validate Matching Processor Manufacturers
o Validate Network Listeners Are Running
o Validate Replica Server Settings
Cluster Configuration (available only if a cluster is running)
o List Cluster Core Groups
o List Cluster Network Information
o List Cluster Resources
o List Cluster Volumes
o List Clustered Roles
o Validate Quorum Configuration
o Validate Resource Status
o Validate Service Principal Name
o Validate Volume Consistency
Inventory
o Storage
    List Fibre Channel Host Bus Adapters
    List iSCSI Host Bus Adapters
    List SAS Host Bus Adapters
o System
    List BIOS Information
    List Environment Variables
    List Memory Information
    List Operating System Information
    List Plug and Play Devices
    List Running Processes
    List Services Information
    List Software Updates
    List System Drivers
    List System Information
    List Unsigned Drivers
Network
o List Network Binding Order
o Validate Cluster Network Configuration
o Validate IP Configuration
o Validate Network Communications
o Validate Windows Firewall Configuration
Storage
o List Disks
o List Potential Cluster Disks
o Validate CSV Network Bindings
o Validate CSV Settings
o Validate Disk Access Latency
o Validate Disk Arbitration
o Validate Disk Failover
o Validate File System
o Validate Microsoft MPIO-Based Disks
o Validate Multiple Arbitration
o Validate SCSI device Vital Product Data (VPD)
o Validate SCSI-3 Persistent Reservation
o Validate Simultaneous Failover
o Validate Storage Spaces Persistent Reservation
System Configuration
o Validate Active Directory Configuration
o Validate All Drivers Signed
o Validate Memory Dump Settings
o Validate Operating System Edition
o Validate Operating System Installation Option
o Validate Operating System Version
o Validate Required Services
o Validate Same Processor Architecture
o Validate Service Pack Levels
o Validate Software Update Levels
Except for the Storage tests, all the tests can be run at any time because they aren't disruptive to the
cluster.
Using the Validate a Configuration Wizard
Let's explore how to take advantage of the Validate a Configuration Wizard. Using the previous
example of the problem related to adding a node, let's say that the DNS server had the proper IP
address and you can connect between the nodes outside the cluster. In this case, you can run the
Validate a Configuration Wizard.
When you run the wizard, the Network/Validate Windows Firewall Configuration test fails. This test
specifically looks at the Windows Firewall settings to ensure that port 3343, which is used by the
cluster, hasn't been disabled. When this port is disabled, all communications on that port are blocked
and you get errors in the Diagnostic channel.
Introducing the New Get-ClusterLog Command Switch
The Windows PowerShell command Get-ClusterLog lets you convert the events in a channel (e.g.,
FailoverClustering/Diagnostic) to a text file format. PowerShell will name the text file Cluster.log
and place it in the C:\Windows\Cluster\Reports folder. If you run the command by itself, each node
will have its own Cluster.log file. You can use switches to change this default behavior. Here are the
switches, including the new -UseLocalTime switch:
-Cluster <string>, where <string> is the name of the cluster you want to run the command against.
This allows you to specify a remote cluster. If you omit the switch, it will run against the cluster you're
currently on.
-Node <string>, where <string> is the name of the node you want to run the command against. You
use this switch when you want to generate the Cluster.log file for a certain node only.
-Destination <string>, where <string> is the folder to which you want to copy the Cluster.log files. If
you include this switch, PowerShell will not only create a Cluster.log in each node's
C:\Windows\Cluster\Reports folder but also copy all of the log files to the specified destination folder. This
switch will add the node's name as part of the filename (e.g., Node1_Cluster.log, Node2_Cluster.log) for the
log files copied to the destination folder. This way, each node's log files are easily identifiable.
-TimeSpan <uint32>. You use this switch if you just want to get a log file that spans the last specified
number of minutes, where <uint32> is that number (e.g., 5). This will give you a much smaller log file to
review. You can use this switch if you're trying to reproduce an error. For example, you can reproduce the
error you think might be occurring, then generate the log for the last 5 minutes to see if that's the case.
-UseLocalTime. As mentioned previously, this switch is new in Server 2012. Clusters write all their
information in GMT. For example, if you have a cluster that's in the GMT-5 time zone and your local time is
13:00 (1:00 p.m.), Cluster.log will show a time of 18:00 (6:00 p.m.) by default. With this switch, the local time
is used, so the log will show a time of 13:00. When you use the -UseLocalTime switch, the times returned by
the Get-ClusterLog command can easily be matched with the Event Log times.
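If you already have a log that was generated in GMT, the arithmetic in the UseLocalTime example can be checked with a few lines of Python. This is only an illustrative sketch: the function name gmt_to_local is hypothetical, the timestamp is assumed to have the date and time joined by a hyphen, and a fixed offset ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone

def gmt_to_local(stamp, offset_hours):
    """Convert a Cluster.log style GMT timestamp to a fixed-offset local time."""
    # Assumed timestamp shape: 2013/05/15-18:00:00.000
    utc_time = datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S.%f").replace(tzinfo=timezone.utc)
    return utc_time.astimezone(timezone(timedelta(hours=offset_hours)))

if __name__ == "__main__":
    # 18:00 GMT seen from a GMT-5 time zone.
    local = gmt_to_local("2013/05/15-18:00:00.000", -5)
    print(local.strftime("%Y/%m/%d %H:%M"))  # 2013/05/15 13:00
```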
Now that you know how to get Cluster.log files, it's time to learn how to read and search through
them.
Reading Cluster.log Files
Reading Cluster.log files takes a long time to master, because they contain a lot of information that
can be confusing. However, I'll give you some tips that can help you get started.
The first thing you need to understand is the anatomy of a Cluster.log file. Every entry has the same
basic structure. Here's an entry for an IP address resource coming online:
00000bb8.000001d4::2013/05/15-01:13:24.852
INFO [RES] IP Address <IP Address 1.1.1.1>:
Online: Opened object handle for netinterface
353c85ee-7ea7-4b2a-927d-1538dffcdecd
Let's break this entry down into smaller pieces to make better sense of it:
00000bb8. This is the process ID in hexadecimal notation. Typically, the process is the Resource
Hosting Subsystem (RHS). You can see what resources the process is using by sorting or searching for the
lines that include this process ID. This is useful when debugging an RHS dump if you have multiple
files present. Each of these dumps is identified by a process ID, so knowing what the process ID is
will ensure that you're working with the correct process dump. If you have a complete memory
dump, there will be multiple RHS processes. Each is identified by the ID, so you can get to the
correct one.
000001d4. This is the thread ID in hex notation. You can see what the thread is doing by sorting or
searching for lines that include this thread ID. When you're debugging an RHS process that has 100
threads, you can jump right to the correct one using this ID.
2013/05/15-01:13:24.852. This is the date and time in GMT (unless the -UseLocalTime switch was
used to generate the log). So if you're using GMT-5, the local time in this case is May 14, 2013, at
8:13 p.m. The time goes down to milliseconds.
INFO. This is the level of the entry. The level can be INFO (informational), WARN (warning), ERR
(error), or DBG (debug). There are a few others, but these levels are what you'll see the majority of
the time. Generally, a line with ERR in it indicates a problem with a resource. When you open a
Cluster.log file after a failure, you can search for a specific level to try to get to the problem area
quicker.
[RES] IP Address. This is the resource type. A resource will always identify its type in the log. With
this information, you can more quickly follow the resource going online when there are multiple
types of resources all coming online at the same time.
<IP Address 1.1.1.1>. This is the actual resource, as shown in Failover Cluster Manager.
Online: Opened object handle for netinterface 353c85ee-7ea7-4b2a-927d-1538dffcdecd. This is a
description of what's going on with the resource. What's going on here is that the resource is opening
a handle to the network card driver in order to bind the IP address to it. If it fails here, it's most likely
a problem with the network card driver not accepting anything, which means it's bad. Alternatively,
the network card might have died. Your next step would be to review the System event log entries to
check for any network-type events, such as the network going down or a card failing. With many of
the descriptions, the more you see them, the more you'll understand what they mean. A description
can be particularly helpful if it's describing the last action that occurred before a failure.
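The anatomy described above can be expressed as a small parser, which makes it easy to sort or filter a large Cluster.log by process ID, thread ID, or level. The following Python sketch is an illustration only: the regular expression assumes each entry sits on one line with the date and time joined by a hyphen, and the names ENTRY_PATTERN and parse_entry are hypothetical.

```python
import re

# Assumed layout: <pid>.<tid>::<date>-<time> <LEVEL> <text>
ENTRY_PATTERN = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<date>\d{4}/\d{2}/\d{2})-(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<text>.*)$"
)

def parse_entry(line):
    """Split one Cluster.log line into its fields, or return None if it does not match."""
    match = ENTRY_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Process and thread IDs are hexadecimal; also expose them as integers.
    fields["pid_int"] = int(fields["pid"], 16)
    fields["tid_int"] = int(fields["tid"], 16)
    return fields

if __name__ == "__main__":
    sample = ("00000bb8.000001d4::2013/05/15-01:13:24.852 INFO  "
              "[RES] IP Address <IP Address 1.1.1.1>: Online: "
              "Opened object handle for netinterface")
    entry = parse_entry(sample)
    print(entry["pid_int"], entry["tid_int"], entry["level"])  # 3000 468 INFO
```

The decimal process ID is handy when matching a log entry against a process list or a dump file name, which usually show IDs in decimal.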
Searching Cluster.log Files
When reviewing Cluster.log files, it helps to search for keywords. Table 1 provides a list of
keywords that I use when searching for resources.
Table 1: Keywords to Use When Searching for Resources
Keyword | Description
-->OnlinePending | This keyword appears in the log the second that Failover Cluster Manager displays Online Pending for a resource. This is where your search should start if you want to follow a resource coming online.
-->OfflinePending | This keyword appears in the log the second that Failover Cluster Manager displays Offline Pending for a resource. This is where your search should start if you want to follow a resource going offline.
-->Offline | This keyword appears in the log when Failover Cluster Manager displays Offline for a resource. So if you were following the resource, there's no need to look further. If this resource depends on another resource, that other resource could start its offline process first.
-->Online | This keyword appears in the log when Failover Cluster Manager displays Online for a resource. So if you were following the resource, there's no need to look further. If another resource depends on this resource, that other resource would not start its online process until this one completes.
-->ProcessingFailure | This keyword appears in the log when a resource just failed. If you find this line, you would want to start looking at previous entries to see what triggered the failure. Looking at entries after this event isn't necessary. Anytime a resource fails, it will still try to go through the normal offline process, even though you'll most likely get errors because the resource is unavailable.
Note that you should type these keywords exactly as you see them. In other words, include the
hyphen-hyphen-greater-than symbol (-->) and don't include any spaces.
You can also use these keywords to quickly determine how long a resource took to go offline or
come online. For example, suppose that a group is taking longer than normal to come online. You
can use -->OfflinePending as a starting point, then use -->Offline for all resources in the group.
Alternatively, you can use -->OnlinePending followed by -->Online. For each resource, add up all
the times to see how long it took to come online. After you've done that for all the resources, you can
compare the resources' total times to see which resource took the longest amount of time. You can
then review the entries in Cluster.log to determine why. For example, if a group took 30 seconds total
to come online on one node and 3 minutes total to come online on another node, you should generate
Cluster.log files for both nodes and compare them.
You can use the same keywords for groups, except that there must be a space after the greater-than
symbol. For example, if a group goes offline, you would first use --> OfflinePending, followed by
--> Offline. The only other difference between the resource entry and the group entry is that the
group failure is --> Failed, whereas the resource failure is -->ProcessingFailure.
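Adding up the times by hand gets tedious in a busy log, so the resource timing comparison described above can be sketched in Python. This is a minimal illustration under stated assumptions: each entry is on one line, the date and time are joined by a hyphen, and state changes appear as TransitionToState(<resource>) entries with the keyword written as two hyphens and a greater-than symbol. The function name online_durations and the sample lines are hypothetical.

```python
import re
from datetime import datetime

# Assumed line shape: "<pid>.<tid>::<date>-<time> <LEVEL> <text>"
STAMP = re.compile(r"::(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})")
# Resource transitions have no space after "-->"; capture the name and new state.
TRANSITION = re.compile(r"TransitionToState\((?P<name>[^)]*)\)\s*\w+-->(?P<state>\w+)")

def online_durations(lines):
    """Return seconds between -->OnlinePending and -->Online for each resource."""
    started, durations = {}, {}
    for line in lines:
        stamp, move = STAMP.search(line), TRANSITION.search(line)
        if not stamp or not move:
            continue
        when = datetime.strptime(stamp.group(1), "%Y/%m/%d-%H:%M:%S.%f")
        name, state = move.group("name"), move.group("state")
        if state == "OnlinePending":
            started[name] = when
        elif state == "Online" and name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations

if __name__ == "__main__":
    # Hypothetical sample lines, invented for illustration only.
    sample = [
        "00000a10.00000b20::2013/05/15-01:13:24.852 INFO  [RCM] TransitionToState(IP Address 1.1.1.1) Offline-->OnlinePending.",
        "00000a10.00000b20::2013/05/15-01:13:25.352 INFO  [RCM] TransitionToState(IP Address 1.1.1.1) OnlinePending-->Online.",
        "00000a10.00000b24::2013/05/15-01:13:25.400 INFO  [RCM] TransitionToState(FileServer2) Offline-->OnlinePending.",
        "00000a10.00000b24::2013/05/15-01:13:55.400 INFO  [RCM] TransitionToState(FileServer2) OnlinePending-->Online.",
    ]
    durations = online_durations(sample)
    slowest = max(durations, key=durations.get)
    print(durations, slowest)
```

Running the same function over the Cluster.log from each node gives two sets of per-resource times that can be compared side by side.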
Putting It All Together
To see how all the information presented fits together, let's walk through solving a problem. Suppose
that you have a two-node cluster configured with multiple file servers using different networks and a
Fibre Channel-connected SAN. Here's the setup for the networks:
Cluster Network 1 = IP scheme 192.168.0.0/24
Cluster Network 2 = IP scheme 1.0.0.0/8
Cluster Network 3 = IP scheme 172.168.0.0/16
In the nodes' network connections, the network adapters are identified as:
CORPNET = IP scheme 192.168.0.0/24
MGMT = IP scheme 1.0.0.0/8
BACKUP = IP scheme 172.168.0.0/16
FILESERVER1 is using Cluster Network 1, which is running on NODE1. FILESERVER2 is using
Cluster Network 2, which is running on NODE2.
Let's say that there was a failure overnight and a file server group named FILESERVER2 was moved
from NODE2 to NODE1. You need to find out why the failure occurred.
The first thing you do is open Failover Cluster Manager, right-click the FILESERVER2 group, and
select Show Critical Events. This brings up the following events:
Event ID: 1069
Description: Cluster Resource 'IP Address 1.1.1.1' of
type 'IP Address' in Clustered Role 'FILESERVER' failed.
Event ID: 1205
Description: The Cluster service failed to bring clustered
service or application 'FILESERVER2' completely online or
offline. One or more resources may be in a failed state.
The first event tells you that IP Address 1.1.1.1 had a failure. So, you right-click this resource in
Failover Cluster Manager and choose Show Critical Events. You see these events:
Event ID: 1077
Description: Health check for IP Interface
'IP Address 1.1.1.1' (address 1.1.1.1) failed (status is
1168). Run the Validate a Configuration wizard to ensure
that the network adapter is functioning properly.
Event ID: 1069
Description: Cluster Resource 'IP Address 1.1.1.1' of
type 'IP Address' in Clustered Role 'FILESERVER' failed.
Based on the description in the first event (event 1077), you decide to use the Validate a Configuration
Wizard. You want to run only the Network/Validate Network Communication test because that test
will check the adapters and all network paths between the nodes.
After you run the Network/Validate Network Communication test, you check the test report. You
don't see any errors or warnings, so you put it aside.
There are event channels you can review, so you go into the FailoverClustering/Operational channel,
where you see this event:
Event ID: 1153
Description: The Cluster service is attempting to fail over
the clustered service or application 'FILESERVER2' from
node 'NODE2' to node 'NODE1'
Because of this description, you go into the FailoverClustering/Diagnostic channel, where you see
these events:
Event ID: 2051
Description: [RCM] rcm::RcmResource::HandleFailure:
(IP Address 1.1.1.1)
Event ID: 2051
Description: [RES] IP Address <IP Address 1.1.1.1>:
Failed to query properties of adapter id
F3EDD1C8698482BC498806B841CA, status 87.
Based on this information, you generate a Cluster.log file for this node. In the log, you search for
-->ProcessingFailure and find these entries:
[RES] IP Address <IP Address 1.1.1.1>: IP Interface
3600A8C0 failed LooksAlive check, status 1168.
[RES] IP Address <IP Address 1.1.1.1>: IP Interface
3600A8C0 failed IsAlive check, status 1168.
[RHS] Resource IP Address 1.1.1.1 has indicated failure.
[RCM] Res IP Address 1.1.1.1: Online -> ProcessingFailure
( StateUnknown )
[RCM] TransitionToState(IP Address 1.1.1.1)
Online-->ProcessingFailure.
A bit later in Cluster.log, you see the entries documenting that the group was being moved. This is a
good indication that the entries found with the -->ProcessingFailure search are related to the problem
that caused the group to be moved. Because of the errors seen in those entries, you know for sure that
the IP address resource failed. To find out what the errors' status code means, you use the Net.exe
command:
NET HELPMSG 1168
The command returns the message: Element not found. After looking more closely at the entries, it
appears as though the actual problem might be with the network adapter. So, you run some hardware
tests against the adapters and find that one adapter is faulty and not even showing up in Windows
anymore. Replacing the faulty adapter is the course of action to fix the problem.
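The search step used in this walkthrough (find the failure keyword, then read the entries leading up to it) can be sketched in Python for logs that are too large to scan by eye. This is a minimal illustration, not part of the original guide: the function name entries_before is hypothetical, the default of three preceding lines is an arbitrary choice, and the sample entries are the ones quoted in this walkthrough, joined onto single lines without their timestamps.

```python
def entries_before(lines, keyword, count=3):
    """Return the first line containing keyword plus up to `count` lines before it."""
    for index, line in enumerate(lines):
        if keyword in line:
            return lines[max(0, index - count):index + 1]
    return []

if __name__ == "__main__":
    log = [
        "[RES] IP Address <IP Address 1.1.1.1>: IP Interface 3600A8C0 failed LooksAlive check, status 1168.",
        "[RES] IP Address <IP Address 1.1.1.1>: IP Interface 3600A8C0 failed IsAlive check, status 1168.",
        "[RHS] Resource IP Address 1.1.1.1 has indicated failure.",
        "[RCM] Res IP Address 1.1.1.1: Online -> ProcessingFailure ( StateUnknown )",
        "[RCM] TransitionToState(IP Address 1.1.1.1) Online-->ProcessingFailure.",
    ]
    # The keyword is typed exactly: two hyphens, a greater-than symbol, no spaces.
    for entry in entries_before(log, "-->ProcessingFailure"):
        print(entry)
```

Because the keyword is matched exactly, the entry that shows the transition with a single hyphen and spaces is not treated as the match; it only appears as context before the matching line.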
But there's still the question of why the Network/Validate Network Communication test results didn't
show any errors when everything else did. This test checks all network adapters, going from one
node to another, whether they're on the same network or not. It does this so that it knows all the
routes it can take to get to the other nodes. So, there are some expected failures just because of the
way the networks between the nodes are cabled or segmented.
You decide to look more closely at the test report. That's when you spot the output shown in Figure
2.
Figure 2: Network/Validate Network Communication Test Results
You notice that NODE1 doesn't have a network adapter defined as MGMT. This is basically saying
the same thing as the events, which is that NODE1 has two networks and NODE2 has three
networks. So, the lesson here is that you need to do more than just look at the errors or warnings at
the top of the report. You also need to look at the test results.
Get to the Root of the Problem
Troubleshooting a cluster is like troubleshooting just about anything. There are different ways to
troubleshoot and multiple things to look at in order to get to a problem's root cause. I presented one
way to get to the root cause, and I hope you're able to use it when troubleshooting problems in your
clusters. For more information pertaining to failover clustering, check out the Ask the Core
Team blog site and the Clustering and High Availability
blog site.