
Step-by-Step Guide to Installing, Configuring, and Tuning a High-Performance Compute Cluster

White Paper
Published: June 2007. For the latest information, please see http://www.microsoft.com/windowsserver2003/ccs

Contents
Introduction
Before You Begin
  Plan Your Cluster
  Install Your Cluster Hardware
  Configure Your Cluster Hardware
  Obtain Required Software
Installation, Configuration, and Tuning Steps
  Step 1: Install and Configure the Service Node
  Step 2: Install and Configure ADS on the Service Node
  Step 3: Install and Configure the Head Node
  Step 4: Install the Compute Cluster Pack
  Step 5: Define the Cluster Topology
  Step 6: Create the Compute Node Image
  Step 7: Capture and Deploy Image to Compute Nodes
  Step 8: Configure and Manage the Cluster
  Step 9: Deploy the Client Utilities to Cluster Users
Appendix A: Tuning Your Cluster
Appendix B: Troubleshooting Your Cluster
Appendix C: Cluster Configuration and Deployment Scripts
Related Links

Introduction
High-performance computing is now within reach for many businesses by clustering industry-standard servers. These clusters can range from a few nodes to hundreds of nodes. In the past, wiring, provisioning, configuring, monitoring, and managing these nodes and providing appropriate, secure user access was a complex undertaking, often requiring dedicated support and administration resources. However, Microsoft® Windows® Compute Cluster Server 2003 simplifies installation, configuration, and management, reducing the cost of compute clusters and making them accessible to a broader audience.

Windows Compute Cluster Server 2003 is a high-performance computing solution that uses clustered commodity x64 servers built with a combination of the Microsoft Windows Server® 2003 Compute Cluster Edition operating system and the Microsoft Compute Cluster Pack. The base operating system incorporates traditional Windows system management features for remote deployment and cluster management. The Compute Cluster Pack contains the services, interfaces, and supporting software needed to create and configure the cluster nodes, as well as the utilities and management infrastructure. Individuals tasked with Windows Compute Cluster Server 2003 administration and management have the advantage of working within a familiar Windows environment, which helps users quickly and easily adapt to the management interface. Windows Compute Cluster Server 2003 is a significant step forward in reducing the barriers to deployment for organizations and individuals who want to take advantage of the power of a compute clustering solution.

- Integrated software stack. Windows Compute Cluster Server 2003 provides an integrated software stack that includes the operating system, job scheduler, message passing interface (MPI) layer, and the leading applications for each target vertical.
- Better integration with IT infrastructure. Windows Compute Cluster Server 2003 integrates seamlessly with your current network infrastructure (for example, Active Directory®), enabling you to leverage existing organizational skills and technology.
- Familiar development environment. Developers can leverage existing Windows-based skills and experience to develop applications for Windows Compute Cluster Server 2003. Microsoft Visual Studio® is the most widely used integrated development environment (IDE) in the industry, and Visual Studio 2005 includes support for developing HPC applications, such as parallel compiling and debugging. Third-party hardware and software vendors provide additional compiler and math library options for developers seeking an optimized solution for existing hardware. Windows Compute Cluster Server 2003 supports the use of MPI with Microsoft's MPI stack, or the use of stacks from other vendors.

This step-by-step guide is based on the highly successful cluster deployment at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. The cluster was built as a joint effort between NCSA and Microsoft, using commonly available hardware and Microsoft software. The cluster was composed of 450 x64 servers, achieving 4.1 teraflops (TFLOPs) on 896 processors using the widely accepted LINPACK benchmark. Figure 1 shows the cluster topology used for the NCSA deployment, including the public, private, and MPI networks.

Figure 1: Supported cluster topology similar to the NCSA deployment topology

Although every IT environment is different, this guide can serve as a basis for setting up your large-scale compute cluster. If you need additional guidance, see the Related Links section at the end of this guide for more resources.

Note: The intended audience for this document is network administrators who have at least two years' experience with network infrastructure, management, and configuration. The example deployment outlined in this document is targeted at clusters in excess of 100 nodes. Although the steps discussed here will work for smaller clusters, they represent steps modeled on large deployments for enterprise-scale and research-scale clusters.


Note: The skill level that is required to complete the steps in this document assumes knowledge of how to install, configure, and manage Microsoft Windows Server 2003 in an Active Directory environment, and experience in adding and managing computers and users within a domain.

Note: This is Version 1 of this document. To download the latest updated version, visit the Microsoft Web site (http://www.microsoft.com/hpc/). The update may contain critical information that was not available when this document was published.


Before You Begin


Setting up a compute cluster with Windows Server 2003 Compute Cluster Edition begins with the following tasks:
1. Plan your cluster.
2. Install your cluster hardware.
3. Configure your cluster hardware.
4. Obtain required software.
When you have completed these tasks, use the steps in the Installation, Configuration, and Tuning Steps section to help you install, configure, and tune your cluster.

Plan Your Cluster


This step-by-step guide provides basic instructions on how to deploy a Windows compute cluster. Your cluster planning should cover the types of nodes that are required for a cluster, and the networks that you will use to connect the nodes. Although the instructions in this guide are based on one specific deployment, you should also consider your environment and the number and types of hardware you have available. Your cluster requires three types of nodes:
- Head node. A head node mediates all access to the cluster resources and acts as a single point for cluster deployment, management, and job scheduling. There is only one head node per cluster.
- Service node. A service node provides standard network services, such as directory, DNS, and DHCP services, and also maintains and deploys compute node images to new hardware in the cluster. Only one service node is needed for the cluster, although you can have more than one service node for different roles in the cluster (for example, moving the image deployment service to a separate node).
- Compute node. A compute node provides computational resources for the cluster. Compute nodes are provided jobs and are managed by the head node.

Additional node types that can be used but are not required are remote administration nodes and application development nodes. For an overview of device roles in the cluster, see the Windows Compute Cluster Server 2003 Reviewers Guide (http://www.microsoft.com/windowsserver2003/ccs/reviewersguide.mspx). Your cluster also depends on the number and types of networks used to connect the nodes. The Reviewer's Guide discusses the topologies that you can use to connect your nodes, by using combinations of private and public adapters for message passing between the nodes and system traffic among all of the nodes. For the cluster detailed in this guide, the head node and service node have public and private adapters for system traffic, and the compute nodes have private and message passing interface (MPI) adapters. (Note: This is not a supported topology but is very similar to one that is.) Consult the Reviewer's Guide for the advantages of each network topology.

Lastly, you should consider the level of cluster expertise, networking knowledge, and amount of management time available on your staff to dedicate to your cluster. Although deployment and management are simplified with Windows Compute Cluster Server 2003, keep in mind that no matter what the circumstances, a large-scale compute cluster deployment should not be taken lightly. It is important to understand how management and deployment work when planning for the appropriate resources. Compute Cluster Server uses robust, enterprise-grade technologies for all aspects of network and device management. Its management tools and programs allow granular, role-based management of security for cluster administration and cluster users, and its network and system management tools can easily and quickly deploy applications and jobs using familiar, wizard-based interfaces. Additional compute nodes can be added automatically to the compute cluster by simply plugging the nodes in and connecting them to the cluster. Extensive (and expensive) daily hands-on tweaking, configuration, and management are not needed when using commodity hardware and a standards-based infrastructure.

Install Your Cluster Hardware


For ease of management and configuration, all nodes in the deployment in this guide will use the same basic hardware platform. Hardware requirements for computers running Windows Compute Cluster Server 2003 are similar to those for Windows Server 2003, Standard x64 Edition. You can find the system requirements for your cluster at http://www.microsoft.com/windowsserver2003/ccs/sysreqs.mspx. Table 1 shows a list of hardware for all nodes. This list is based on the hardware used in the NCSA deployment.

Table 1: Hardware for All Nodes

CPU: Blade servers. Each blade has two single-core 3.2 GHz processors with 2 MB cache and an 800 MHz front-side bus. The motherboard includes 2x PCI Express slots.

RAM: 2 x 1 GB 400 MHz DIMMs. For compute nodes, you should plan on having 2 GB RAM per core.

Storage: SCSI adapter, 73 GB 10K RPM Ultra 320 SCSI disk. RAID may be used on any node, but was not used in this deployment. For the head node, you should plan on having three disks: one for the OS, one for the database, and one for the transaction logs. This will provide improved performance and throughput.

Network Interface Cards: 1000 Mb Gigabit Ethernet adapter; 1x InfiniBand 4x PCI Express adapter.

Gigabit Network Hardware: 48-port Gigabit switch per rack (40 ports for blades, 4 for uplink to the ring); 48-port Layer 2 Gigabit switches in a ring configuration.

InfiniBand Network Hardware: 5x 24-port InfiniBand switches per rack; 2x 96-port InfiniBand switches for cross-rack connectivity.

Note: The head node and the network services node each use two Gigabit Ethernet network adapters; both the compute nodes and the head node use the private MPI network, though the head node's MPI interface was disabled for this specific deployment. Also, the service node requires a 32-bit operating system, since ADS will only work with 32-bit, but you can run the operating system on 32-bit or 64-bit hardware. (This is a custom configuration used on the cluster deployment at NCSA and is not supported for general use. However, it is very similar to a supported cluster topology. For more information on supported cluster topologies, please refer to the Windows Compute Cluster Server 2003 Reviewers Guide.)

Configure Your Cluster Hardware


When you have added your switches and blades to the rack, you must configure the network connections and network hardware prior to installing the network software. To configure your hardware, follow the checklist in Table 2.

Table 2: Hardware Configuration Checklist

Check off each configuration item when completed:
- Connect all high-speed interconnect connections from the pass-through module on the chassis to the rack's high-speed interconnect switches.
- Connect all Gigabit Ethernet connections from the pass-through module on the chassis to the rack's 48-port Gigabit Ethernet switch.
- Connect all InfiniBand switches to the Layer 2 switches.
- Connect all Gigabit Ethernet switches to the Gigabit Ethernet Layer 2 switches.
- Disable the built-in subnet manager on all switches. The built-in subnet manager doesn't support OpenIB clients, and conflicts with the subnet manager that does support such clients.
- Change the BIOS boot sequence on all nodes to Network Pre-boot Execution Environment (PXE) first, CD-ROM second, and Hard Drive third. For platforms that dynamically remove missing devices at power-up, an efficient way to set the hard drives last in the boot order is to pull the hard drives, power up the devices once, power off the devices, put the drives back in, and then power up again. The boot order will be set correctly thereafter.
- Disable hyperthreading on all nodes and set each node's system clock to the correct time zone, if required.
- Obtain a list of all private Gigabit Ethernet adapter MAC addresses for the compute nodes. These addresses are used as input with a configuration script to identify your nodes and configure them with the proper image. In some cases you can use the blade chassis telnet interface to collect the MAC addresses. See Appendix C for a description of the input file and the file format.
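The MAC addresses collected in the last checklist item feed the deployment scripts, which expect consistently formatted values. As a minimal sketch (assuming the chassis reports addresses in mixed notations; this helper is an illustration, not part of the Microsoft scripts), normalizing the collected list might look like:

```python
# Hypothetical normalizer for MAC addresses gathered from the blade chassis.
# Chassis firmware may report "00:1a:2b:3c:4d:5e" or "001a.2b3c.4d5e"; the
# input file wants one consistent format. The chosen hyphenated output
# format is an assumption for illustration.
import re

def normalize_mac(raw: str) -> str:
    """Return a MAC address as 12 uppercase hex digits separated by hyphens."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", raw)
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {raw!r}")
    digits = digits.upper()
    return "-".join(digits[i:i + 2] for i in range(0, 12, 2))

for raw in ("00:1a:2b:3c:4d:5e", "001a.2b3c.4d5e", "00-1A-2B-3C-4D-5E"):
    print(normalize_mac(raw))  # 00-1A-2B-3C-4D-5E for each input
```

Running the collected list through a check like this before building the input file catches truncated or mistyped addresses early, when they are cheap to fix.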

Obtain Required Software


In addition to Windows Compute Cluster Server 2003, you will need to obtain operating systems, administration utilities, drivers, and Quick Fix files to bring your systems up to date. Table 3 lists the software required for each node type, and the notes following the chart show you where to obtain the necessary software. The following list is based on the software used in the NCSA deployment.


Table 3: Software Required by Node Type

Software used across the Head Node, Service Node, and Compute Node roles (the notes below describe where each item is used):
- Windows Server 2003 R2 Standard Edition x64
- Windows Server 2003 R2 Enterprise Edition x86
- Windows Server 2003 Compute Cluster Edition x64
- Microsoft Compute Cluster Pack
- SQL Server 2005 Standard Edition x64
- Automated Deployment Services (ADS) version 1.1
- Microsoft Management Console (MMC) 3.0
- .NET Framework 2.0
- Windows Preinstallation Environment (WinPE)
- QFE KB910481
- QFE KB914784
- Microsoft System Preparation tool (sysprep.exe)
- Cluster configuration and deployment scripts
- Latest network adapter drivers
Notes on the software required for the deployment described in this paper:

Microsoft SQL Server® 2005 Standard Edition x64: By default, the Compute Cluster Pack will install MSDE on the head node for data and node tracking purposes. Because MSDE is limited to eight concurrent connections, SQL Server 2005 Standard Edition is recommended for clusters with more than 64 compute nodes.

ADS version 1.1: ADS requires 32-bit versions of Windows Server 2003 Enterprise Edition for image management and deployment. Future Microsoft imaging technology (Windows Deployment Services, available in the next release of Windows Server, code name "Longhorn") will support 64-bit software. You can download the latest version of ADS from the Microsoft Web site (http://www.microsoft.com/windowsserver2003/technologies/management/ads/default.mspx). Because this paper is based on a previous large-scale compute cluster deployment at NCSA, it details using ADS to deploy compute node images as opposed to using Microsoft Windows Deployment Services (WDS). However, future updates to this paper will explain how to use WDS to deploy compute node images to your cluster.

MMC 3.0: MMC 3.0 is required for the administration node, which may or may not be the head node. It is automatically installed by the Compute Cluster Pack on the computer that is used to administer the cluster. You can also download and install the latest versions for Windows Server 2003 and Windows XP (x86 and x64 versions) at the Microsoft Web site (http://support.microsoft.com/?kbid=907265).

.NET Framework 2.0: The .NET Framework is automatically installed by the Compute Cluster Pack. You can also download the latest version at the Microsoft Web site (http://msdn2.microsoft.com/en-us/netframework/).
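The database sizing rule in the SQL Server note above reduces to a single threshold. As a sketch (the function name is an illustrative assumption, not a Microsoft tool):

```python
# Sketch of the sizing rule from the notes: the Compute Cluster Pack
# installs MSDE on the head node by default, but MSDE allows only eight
# concurrent connections, so SQL Server 2005 Standard Edition is
# recommended for clusters with more than 64 compute nodes.
def recommended_database(compute_nodes: int) -> str:
    if compute_nodes > 64:
        return "SQL Server 2005 Standard Edition x64"
    return "MSDE"

print(recommended_database(64))   # MSDE
print(recommended_database(450))  # SQL Server 2005 Standard Edition x64
```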


WinPE: You will need a copy of Windows Preinstallation Environment for Windows Server 2003 SP1. If you need to add your Gigabit Ethernet drivers to the WinPE image, you will need to obtain a copy of the Windows Server 2003 SP1 OEM Preinstallation Kit (OPK), which contains the programs needed to update the WinPE image for your hardware. WinPE and the OPK are available only to customers with enterprise or volume license agreements; contact your Microsoft representative for more information.

QFE KB910481: This Quick Fix addresses potential problems when deploying Winsock Direct in a fast Storage Area Network (SAN) environment. You can download the Quick Fix at the Microsoft Web site (http://support.microsoft.com/?kbid=910481).

QFE KB914784: This Quick Fix is in response to a Security Advisory and provides additional kernel protection in some environments. You can download the Quick Fix at the Microsoft Web site (http://support.microsoft.com/?kbid=914784).

Sysprep.exe: Sysprep.exe is used to help prepare the compute node image prior to deployment. Sysprep is included as part of Windows Server 2003 Compute Cluster Edition. Note: You must use the x64 version of Sysprep in order to capture and deploy your images.

Cluster configuration and deployment scripts: These scripts are available to download at http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/default.mspx. They include hard-coded paths and require you to follow the installation and usage instructions exactly as described in this guide. If you must modify the scripts for your deployment, verify that the scripts work in your environment before using them to deploy your cluster. For the scripts to run properly, you will also need specific information about your cluster and its hardware. Appendix C contains a sample input file (AddComputeNodes.csv) that is used to automatically configure the compute cluster nodes and populate Active Directory with node information. Table 4 lists the specific items needed, with room for you to write down the values for your deployment. You can then use this information when building your cluster and when creating your compute node images. Follow the instructions in Appendix C for creating your own sample input file. Note: Every item in Table 4 must have an entry or the input file will not work properly. If you do not have a value for a field, use a hyphen "-" for the field instead.

Latest network adapter drivers: Contact the manufacturer of your network adapters for the most recent drivers. You will need to install these drivers on your cluster nodes.
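The every-field-must-have-an-entry rule above is easy to enforce mechanically before handing the file to the deployment scripts. A minimal sketch (the helper is an illustrative assumption, not part of the downloadable scripts):

```python
# Sketch of the input-file rule: every column in AddComputeNodes.csv must
# carry a value, with a hyphen "-" standing in for unused fields.
def fill_missing(row):
    """Replace blank fields with '-' so every column has an entry."""
    return [field if field.strip() else "-" for field in row]

# Example row fragment: server name, description left blank, MPI address.
print(",".join(fill_missing(["NODE001", "", "11.0.0.1"])))  # NODE001,-,11.0.0.1
```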

Table 4: Cluster Information Needed for Script Input File

For each input value, record the value for your deployment alongside the description below.

FullName: Populates the cluster node registry with the Registered Owner name.
Organisation name: Populates the cluster node registry with the Registered Organization name.
ProductKey: 25-digit alphanumeric product key used for all compute cluster nodes. Contact your Microsoft representative for your volume license key.
Server Name: Populates Active Directory with a Compute Cluster node name.
Srv Description: Populates the ADS Management console with a text description of the node. Can be used to list rack placement or other helpful information.
Server MAC: Gigabit Ethernet MAC address for each compute cluster node.
Machine Name: Used to configure the cluster node with a machine name. Must match the value in the Server Name field.
Admin Password: Local administrator password.
Domain: The cluster domain name (for example, "HPCCluster.local").
Domain Username: Account name with permission to add computers to a domain.
Domain Password: Password for the account with permission to add computers to a domain.
ImageName: The image name to be installed on the cluster node (for example, CCSImage).
HPC Cluster Name: The head node name must be used for the cluster name.
NetworkTopology: Must be "Single".
PartitionSize: Not used.
PublicIP: Not used.
PublicSubnet: Not used.
PublicGateway: Not used.
PublicDNS: Not used.
PublicNICName: Not used.
PublicMAC: Not used.
PrivateIP: Not used.
PrivateSubnet: Not used.
PrivateGateway: Not used.
PrivateDNS: Not used.
PrivateNICName: Not used.
PrivateMAC: Not used.
MPIIP: Assigns a static address to the MPI adapter (for example, 11.0.0.1).
MPISubnet: Assigns a subnet mask to the MPI adapter (for example, 255.255.0.0).
MPIGateway: Not used.
MPIDNS: Not used.
MPINICName: Not used.
MPIMAC: Not used.
MachineOU: Populates Active Directory with Machine OU information (for example, OU=Cluster Servers,DC=HPCCluster,DC=local).
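For a large cluster, the per-node values (Server Name, MPIIP, MPISubnet) follow a regular pattern and can be generated rather than typed. A sketch, assuming node names of the form NODE001, NODE002, ... and MPI addresses allocated sequentially from 11.0.0.0/16 to match the examples in Table 4 (the naming scheme and helper are assumptions, not part of the Microsoft deployment scripts):

```python
# Illustrative generator for per-node Table 4 values: sequential node names
# and MPI addresses drawn from 11.0.0.0/16 (first host 11.0.0.1, mask
# 255.255.0.0, matching the table's examples).
import ipaddress

def node_values(count):
    """Yield (Server Name, MPIIP, MPISubnet) triples for count nodes."""
    net = ipaddress.ip_network("11.0.0.0/16")
    hosts = net.hosts()  # yields 11.0.0.1, 11.0.0.2, ...
    for i in range(1, count + 1):
        yield (f"NODE{i:03d}", str(next(hosts)), str(net.netmask))

for name, ip, mask in node_values(3):
    print(name, ip, mask)
```

The generated triples can then be merged with the fixed fields (domain, product key, image name) to produce one AddComputeNodes.csv row per node.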

Installation, Configuration, and Tuning Steps


To install, configure, and tune a high-performance compute cluster, complete the following steps:
1. Install and configure the service node.
2. Install and configure ADS on the service node.
3. Install and configure the head node.
4. Install the Compute Cluster Pack.
5. Define the cluster topology.
6. Create the compute node image.
7. Capture and deploy the image to compute nodes.
8. Configure and manage the cluster.
9. Deploy the client utilities to cluster users.

Step 1: Install and Configure the Service Node


The service node provides all the back-end network services for the cluster, including authentication, name services, and image deployment. It uses standard Windows technology and services to manage your network infrastructure. The service node has two Gigabit Ethernet network adapters and no MPI adapters. One adapter connects to the public network; the other connects to the private network dedicated to the cluster. There are five tasks that are required for installation and configuration:
1. Install and configure the base operating system.
2. Install Active Directory, Domain Name Services (DNS), and DHCP.
3. Configure DNS.
4. Configure DHCP.
5. Enable Remote Desktop for the cluster.

Install and configure the base operating system. Follow the normal setup procedure for Windows Server 2003 R2 Enterprise Edition, with the exceptions noted in the following procedure.

To install and configure the base operating system
Boot the computer to the Windows Server 2003 R2 Enterprise Edition CD.
1. Accept the license agreement.
2. On the Partition List screen, create two partitions: one partition of 30 GB, and a second using the remainder of the space on the hard drive. Select the 30 GB partition as the install partition, and then press ENTER.
3. On the Format Partition screen, accept the default of NTFS, and then press ENTER. Proceed with the remainder of the text-mode setup. The computer then reboots into graphical setup mode.


4. On the Licensing Modes page, select the option for which you are licensed, and then configure the number of concurrent connections if needed. Click Next.
5. On the Computer Name and Administrator Password page, type a name for the service node (for example, SERVICENODE). Type your local administrator password twice, and then press ENTER.
6. On the Networking Settings page, select Custom settings, and then click Next.
7. On the Networking Components page for your private adapter, select Internet Protocol (TCP/IP), and then click Properties. On the Internet Protocol (TCP/IP) Properties page, select Use the following IP address. Configure the adapter with a static nonroutable address, such as 10.0.0.1, and an 8-bit subnet mask (255.0.0.0). Select Use the following DNS server addresses, and then configure the adapter to use 127.0.0.1. Click OK, and then click Next.
Note: If this computer has a 1394 Net Adapter, setup will ask you to set the IP for that adapter first (before setting the TCP/IP properties). Click Next to skip this page (unnecessary for the cluster deployment) and move on to setting the TCP/IP properties.
8. Repeat the previous step for the public adapter. Configure the adapter to acquire its address by using DHCP from the public network. If you prefer, you can assign it a static address if you have one already reserved. Configure the public adapter to use 127.0.0.1 for DNS queries. Click OK, and then click Next.
9. On the Workgroup or Computer Domain page, accept the default of No and the default of WORKGROUP, and then click Next. The computer will copy files, and then reboot.
10. Log on to the server as Administrator. Click Start, click Run, type diskmgmt.msc, and then click OK. The Disk Management console starts.
11. Right-click the second partition on your drive, and then click Format. In the Format dialog box, select Quick Format, and then click OK. When the format process is finished, close the Disk Management console.
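The private-adapter addressing chosen in step 7 can be sanity-checked before committing it across the cluster. A minimal sketch, assuming the plan above (10.0.0.1 with mask 255.0.0.0, DNS at loopback); this is a verification aid, not part of the Windows setup procedure:

```python
# Verify the service-node addressing plan: 10.0.0.1/255.0.0.0 must be a
# private (nonroutable) address with an 8-bit mask, and the DNS server
# 127.0.0.1 must be the local loopback.
import ipaddress

iface = ipaddress.ip_interface("10.0.0.1/255.0.0.0")
assert iface.network.is_private        # 10.0.0.0/8 is RFC 1918 private space
assert iface.network.prefixlen == 8    # 255.0.0.0 is an 8-bit mask
assert ipaddress.ip_address("127.0.0.1").is_loopback
print("addressing plan OK")
```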
Install Active Directory, DNS, and DHCP. Windows Server 2003 provides a wizard to configure your server as a typical first server in a domain. The wizard configures your server as a root domain controller, installs and configures DNS, and then installs and configures DHCP.

To install Active Directory, DNS, and DHCP
1. Log on to your service node as Administrator. If the Manage Your Server page is not visible, click Start, and then click Manage Your Server.
2. Click Add or remove a role. The Configure Your Server Wizard starts. Click Next.
3. On the Configuration Options page, select Typical configuration for a first server, and then click Next.
4. On the Active Directory Domain Name page, type the domain name that will be used for your cluster and append the ".local" suffix (for example, HPCCluster.local). Click Next.



5. On the NetBIOS Domain Name page, accept the default NetBIOS name (for example, HPCCLUSTER) and click Next. At the Summary of Selections page, click Next. If the Configure Your Server Wizard prompts you to close any open programs, click OK.
6. On the NAT Internet Connection page, make sure the public adapter is selected. Deselect Enable security on the selected interface, and then click Next. If you have more than two network adapters in your computer, the Network Selection page appears. Select the private LAN adapter and then click Next. Click Finish. After the files are copied, the server reboots.
7. After the server reboots, log on as Administrator. Review the actions listed in the Configure Your Server Wizard, and then click Next. Click Finish.

Configure DNS. DNS is required for the cluster and will be used by people who want to use the cluster. It is linked to Active Directory and manages the node names that are in use. DNS must be configured so that name resolution functions properly on your cluster. The following task helps you configure the DNS settings for your private and public networks.

To configure DNS
1. Click Start, and then click Manage Your Server. In the DNS Server section, click Manage this DNS server. You can also start the DNS Management console by clicking Start, Administrative Tools, and then DNS.
2. Right-click your server, and then click Properties.
3. Click the Interfaces tab. Select Only the following IP addresses. Select the public interface, and then click Remove. Only the private interface should be listed. If it is not, type the IP address of the private interface, and then click Add. This ensures that your service node will provide DNS services only to the private network and not to addresses on the rest of your network. Click Apply.
4. Click the Forwarders tab. If the public interface is using DHCP, confirm that the forwarder IP list has the IP address for a DNS server in your domain. If not, or if you are using a static IP address, type the IP address for a DNS server on your public network, and then click Add. This ensures that if the service node cannot resolve name queries, the request will be forwarded to another name server on your network. Click OK.
5. In the DNS Management console, select Reverse Lookup Zones. Right-click Reverse Lookup Zones, and then click New Zone. The New Zone Wizard starts. Click Next.
6. On the Zone Type page, select Primary zone, and then select Store the zone in Active Directory. Click Next.
7. On the Active Directory Zone Replication Scope page, select To all domain controllers in the Active Directory domain. Click Next.



(. )n the ?e"erse )ookup Bone $ame pa$e, select $etwork I2, and then t5pe the first three octets of 5our private networ3Cs !P address @for e9ample, .0.0.0A. 0 reverse name loo3up is automaticall5 created for 5ou. &lic3 $e/t. 7. )n the 2ynamic =pdate pa$e, select %llow only secure dynamic updates. &lic3 $e/t. .0. )n the Completing the $ew Bone 5iAard pa$e, clic3 !inish. -he new reverse loo3up >one is added to the 1/, 6ana$ement console. &lose the 1/, 6ana$ement console. Configure 2HCP3 #our cluster re+uires automated !P addressin$ services to 3eep node traffic to a minimum. 0ctive 1irector5 and 1'&P wor3 to$ether so that networ3 addressin$ and resource allocation will function smoothl5 on 5our cluster. 1'&P has alread5 been confi$ured for 5our cluster networ3. 'owever, if 5ou want finer control over the number of !P addresses available and the information provided to 1'&P clients, 5ou must delete the current 1'&P scope and create a new one, usin$ settin$s that reflect 5our cluster deplo5ment. To configure 2HCP .. &lic3 Start, and then clic3 'anage @our Ser"er. !n the 1'&P ,erver section, clic3 'anage this 2HCP ser"er. #ou can also start the 1'&P 6ana$ement console b5 clic3in$ Start, clic3in$ %dministrati"e Tools, and then clic3in$ 2HCP. 2. *i$ht;clic3 the scope name @for e9ample, Scope C#,3,3,3,D Scope#A, and then clic3 2eacti"ate. When prompted, clic3 @es. *i$ht;clic3 the scope a$ain, and then clic3 2elete. When prompted, clic3 @es. -he old scope is deleted. . *i$ht;clic3 5our server name and then clic3 $ew Scope. -he $ew Scope 5iAard starts. &lic3 $e/t. 2. )n the Scope $ame pa$e, t5pe a name for 5our scope @for e9ample, J'P& &lusterKA and a description for 5our scope. &lic3 $e/t. 4. )n the IP %ddress ?ange pa$e, t5pe the start and end ran$es for 5our cluster. For e9ample, the start address would be the same address used for the private adapter: .0.0.0... -he end address depends on how man5 nodes 5ou plan to have in 5our cluster. 
For up to 250 nodes, the end address would be 10.0.0.254. For 250 to 500 nodes, the end address would be 10.0.1.254. For the subnet mask, you can either increase the length to 16, or type in a subnet mask of 255.255.0.0. Click Next.
6. On the Add Exclusions page, you define a range of addresses that will not be handed to computers at boot time. The exclusion range should be large enough to include all devices that use static IP addresses. For this example, type a start address of 10.0.0.1 and an end address of 10.0.0.9. Click Add, and then click Next.
7. On the Lease Duration page, accept the defaults, and then click Next.
8. On the Configure DHCP Options page, select Yes, I want to configure these options now, and then click Next.
9. On the Router (Default Gateway) page, type the private network adapter address (for example, 10.0.0.1), and then click Add. Click Next.
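The scope sizing above follows directly from the subnet mask. As a quick check (a sketch for illustration, not part of the guide), Python's ipaddress module shows why a /24 mask tops out below the 250-node mark while a /16 leaves room for thousands of addresses:

```python
import ipaddress

# Private cluster network from the example (10.0.0.x).
small = ipaddress.ip_network("10.0.0.0/24")   # mask 255.255.255.0
large = ipaddress.ip_network("10.0.0.0/16")   # mask 255.255.0.0

# Subtract the network and broadcast addresses to get usable host addresses.
print(small.num_addresses - 2)   # 254
print(large.num_addresses - 2)   # 65534

# The exclusion range 10.0.0.1-10.0.0.9 keeps nine addresses free for
# devices with static IPs, such as the service node and head node.
first = ipaddress.ip_address("10.0.0.1")
last = ipaddress.ip_address("10.0.0.9")
print(int(last) - int(first) + 1)   # 9
```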


10. On the Domain Name and DNS Servers page, in the Parent domain text box, type your domain name (for example, HPCCluster.local). In the Server name text box, type the server name (for example, SERVICENODE). In the IP address fields, type the private network adapter address (for example, 10.0.0.1). Click Add, and then click Next.
11. On the WINS Servers page, click Next.
12. On the Activate Scope page, select Yes, I want to activate this scope now, and then click Next.
13. On the Completing the New Scope Wizard page, click Finish. Close the DHCP Management console.

Enable Remote Desktop for the cluster: You can enable Remote Desktop for nodes on your cluster so that you can log on remotely and manage services by using the node's desktop.

To enable Remote Desktop for the domain
1. Click Start, click Administrative Tools, and then click Active Directory Users and Computers.
2. Right-click your domain (for example, hpccluster.local), click New, and then click Organizational Unit.
3. Type the name of your new OU (for example, Cluster Servers) and then click OK. A new OU is created in your domain.
4. Right-click your OU and then click Properties. The OU Properties dialog appears. Click the Group Policy tab. Click New. Type the name for your new Group Policy (for example, Enable Remote Desktop) and then press ENTER.
5. Click Edit. The Group Policy Object Editor opens. Browse to Computer Configuration\Administrative Templates\Windows Components\Terminal Services.
6. Double-click Allow users to connect remotely using Terminal Services. Click Enabled and then click OK. Close the Group Policy Object Editor.
7. On the OU Properties page, on the Group Policy tab, select your new Group Policy and then click Options. Select No Override, and then click OK. You have created a new Group Policy for your OU that enables Remote Desktop. Click OK.

Step 2: Install and Configure ADS on the Service Node


ADS is used to install compute node images on new hardware with little or no input from the cluster administrator. This automated procedure makes it easy to set up and install new nodes on the cluster, or to replace failed nodes with new ones. To install and configure ADS, perform the following procedures:
1. Copy and update the WinPE binaries.
2. Copy and edit the script files.
3. Install and configure ADS.
4. Share the ADS certificate.


5. Import ADS templates.
6. Add devices to ADS.


Copy and update the WinPE binaries: The WinPE binaries provide a simple operating system

for the ADS Deployment Agent and the scripting engine to run scripts against the node. Because the WinPE binaries are based on the installation files that are found on the Windows Server 2003 CD, the driver cabinet files may not include the drivers for your Gigabit Ethernet adapters. If your adapter is not recognized during installation and configuration of your compute node image, you will need to update the WinPE binaries with the necessary adapter drivers and information files.

Note: You can also wait to create the WinPE binaries until after you have installed and configured ADS on the service node.

Copy and update the WinPE binaries
1. Create a C:\WinPE folder on your service node. Copy the WinPE binaries to C:\WinPE.
2. To update your WinPE binaries with the drivers and information files for your adapter, create a C:\Drivers folder on your service node. Copy the .sys, .inf, and .cat files for your driver to C:\Drivers.
3. Click Start, click Run, type cmd, and then click OK. A command prompt window opens.
4. Change directories to C:\WinPE\.
5. Type drvinst.exe /inf:c:\drivers\<filename>.inf c:\WinPE, where <filename> is the file name for your driver's .inf file, and then press ENTER. Your WinPE binaries are now updated with the drivers for your Gigabit Ethernet adapter.

Copy and edit the script files: Follow the normal setup procedure for Windows Server 2003 R2 Enterprise Edition, with the exceptions noted later.

Copy and edit the script files
1. Create the folder C:\HPC-CCS. Create three new folders within the HPC-CCS folder: C:\HPC-CCS\Scripts, C:\HPC-CCS\Sequences, and C:\HPC-CCS\Sysprep. Create the folder C:\HPC-CCS\Sysprep\I386.
2. Copy the files AddADSDevices.vbs, ChangeIPforIB.vbs, and AddComputeNodes.csv (or the name of your input file) into C:\HPC-CCS\Scripts. Copy Capture-CCS-image-with-winpe.xml and Deploy-CCS-image-with-winpe.xml into C:\HPC-CCS\Sequences. Copy sysprep.inf into C:\HPC-CCS\Sysprep.
3. Insert the Windows Server 2003 Compute Cluster Edition CD into the CD drive. Browse to the CD folder \Support\Tools. Double-click Deploy.cab. Copy the files sysprep.exe and setupcl.exe to the C:\HPC-CCS\Sysprep\I386 folder. You must use the 64-bit versions of these files or the image capture script will not work.
4. Use the chart in Table 2 to edit the file AddComputeNodes.csv (or the name of your input file) and use the values for your company, your administrator password information, your product key, MAC addresses, and MachineOU values. The easiest way to work with this file, especially for entering the MAC addresses, is to import it into Excel as a comma-delimited file, add the necessary values, and then export the data as a comma-separated value file.
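If you would rather prepare the input file with a script than in Excel, the sketch below shows one way to normalize MAC addresses and emit comma-separated rows. The three-column layout here is only an assumption for illustration; the authoritative column set is the one shown in Table 2 of the guide.

```python
import csv
import io
import re

# ADS device records typically use a MAC with the separators stripped.
MAC_RE = re.compile(r"^[0-9A-F]{12}$")

def normalize_mac(mac: str) -> str:
    """Strip separators and upper-case a MAC address, validating its length."""
    cleaned = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if not MAC_RE.match(cleaned):
        raise ValueError(f"bad MAC address: {mac!r}")
    return cleaned

def write_rows(rows, out):
    """Write (node name, MAC, machine OU) rows as comma-separated values."""
    writer = csv.writer(out)
    for name, mac, ou in rows:
        writer.writerow([name, normalize_mac(mac), ou])

buf = io.StringIO()
write_rows([("NODE001", "00-0d-56-ab-cd-01", "OU=ComputeNodes")], buf)
print(buf.getvalue().strip())   # NODE001,000D56ABCD01,OU=ComputeNodes
```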


Install and configure ADS: You can download the ADS binaries from Microsoft, and then either

copy them to your service node or burn them onto a CD.

To install and configure ADS
1. Browse to the CD or the folder containing the ADS binaries and then run ADSSetup.exe.
2. A Welcome page appears. Click Install Microsoft SQL Server Desktop Engine SP4 (Windows). The setup program automatically installs the MSDE software.
3. On the Welcome page, click Install Automated Deployment Services. The Automated Deployment Services Setup Wizard starts. Click Next.
4. On the License Agreement page, select I accept the terms of the license agreement, and then click Next.
5. On the Setup Type page, select Full installation, and then click Next.
6. The Installing PXE warning dialog appears. Click OK, and then click Next.
7. On the Configure the ADS Controller page, make sure that Use Microsoft SQL Server Desktop Engine (Windows) is selected, and that Create a new ADS database is selected. Click Next.
8. On the Network Boot Service Settings page, make sure that Use this path is selected. Insert the Windows Server 2003 R2 Enterprise Edition x86 CD into the drive. Browse to the CD drive, or type the drive containing the CD, and then click Next.
9. On the Windows PE Repository page, select Location of Windows PE. Browse to the folder containing the WinPE binaries (for example, C:\WinPE). In the Repository name text box, type a name for your repository (for example, NodeImages). Click Next.
10. On the Image Location page, type the path to the folder where the images will be stored. This must be on the second partition that you created on your server (for example, E:\Images). The folder will be created and shared automatically. Click Next.
11. If the ADS Setup Wizard detects more than one network adapter in your computer, the Network Settings for ADS Services page is displayed. In the Bind to this IP address drop-down list, select the IP address that the ADS services will use to distribute images on the private network, and then click Next.
12. On the Installation Confirmation page, click Install.
13. On the Completing the Automated Deployment Services Setup Wizard page, click Finish. Close the Automated Deployment Services Welcome dialog box.
14. To open the ADS Management console, click Start, click All Programs, click Microsoft ADS, and then click ADS Management.
15. Expand the Automated Deployment Services node, and then select Services. In the center pane, right-click Controller Services, and then click Properties. On the Controller Service Properties page, select the Service tab, and then change Global job template to boot-to-winpe. For the Device Identifier, select MAC Address. For the WinPE Repository Name, type NodeImages or the repository name that you created earlier. Click Apply, and then click OK.


16. In the ADS Management console, right-click Image Distribution Service, and then click Properties. Select the Service tab, and ensure that Multicast image deployment is selected. Click OK.

Share the ADS certificate: ADS creates a computer certificate when it is installed. This certificate is used to identify all computers in the cluster. The certificate must be shared so that the compute node image can import the certificate and then use it during the configuration process.

To share the ADS certificate
1. Click Start, click Administrative Tools, and then click Server Management. The Server Management console opens.
2. Click Shared Folders, and then click New File Share. The Share a Folder Wizard starts. Click Next.
3. On the Folder Path page, click Browse, and then browse to C:\Program Files\Microsoft ADS\Certificate. Click Next.

4. On the Name, Description, and Settings page, accept the defaults, and then click Next.
5. On the Permissions page, accept the defaults, and then click Finish. Click Close, and then close the Server Management console. The ADS certificate is shared on your network.

Import ADS templates: ADS includes several templates that are useful when managing your nodes, including reboot-to-winpe and reboot-to-hd. The templates are not installed by default; you must add them to ADS using a batch file. You also need to add the compute cluster templates to ADS so that you can capture and deploy the compute node image on your network.

To import ADS templates
1. Open Windows Explorer and browse to C:\Program Files\Microsoft ADS\Samples\Sequences.
2. Double-click create-templates.bat. The script file automatically installs the templates in ADS. Close Windows Explorer.
3. Click Start, click All Programs, click Microsoft ADS, and then click ADS Management. The ADS Management console opens.
4. Browse to Job Templates. Right-click Job Templates, and then click New Job Template. The New Job Template Wizard starts. Click Next.
5. On the Template Type page, select An entirely new template, and then click Next.
6. On the Name and Description page, type a name for the compute node capture template (for example, Capture Compute Node). Type a description (for example, Run within Windows Server CCE), and then click Next.
7. On the Command Type page, select Task sequence, and then click Next.
8. On the Script or Executable Program page, browse to C:\hpc-ccs\sequences. Select All files from the Files of type drop-down list. Select Capture-CCS-image-with-winpe.xml, and then click Open. Click Next.


9. On the Device Destination page, select None, and then click Next. Click Finish. Your capture template is added to ADS.
10. Repeat steps 4 through 9. In step 6, use Deploy Compute Node and Run from WinPE as the name and description. In step 8, select the file Deploy-CCS-image-with-winpe.xml. When finished, you have added the deployment template to ADS.

Add devices to ADS: Follow the normal setup procedure for Windows Server 2003 R2 Enterprise Edition, with the exceptions noted later.

To add devices to ADS
1. Populate the ADS server with ADS devices. Click Start, click Run, type cmd.exe, and then click OK. Change the directory to C:\HPC-CCS\Scripts.
2. Type AddADSDevices.vbs AddComputeNodes-Sample.csv (use the name of your input file instead of the sample file name). The script will echo the nodes as they are added to the ADS server. When the script is finished, close the command window.

If your company uses a proxy server to connect to the Internet, you should configure your server so that it can receive system and application updates from Microsoft.
1. To configure your proxy server settings, open Internet Explorer. Click Tools, and then click Internet Options.
2. Click the Connections tab, and then click LAN Settings.
3. On the Local Area Network (LAN) Settings page, select Use a proxy server for your LAN. Enter the URL or IP address for your proxy server.
4. If you need to configure secure HTTP settings, click Advanced, and then enter the URL and port information as needed.
5. Click OK three times, and then close Internet Explorer.

When you have finished configuring your server, click Start, click All Programs, and then click Windows Update. This will ensure that your server is up to date with service packs and software updates that may be needed to improve performance and security.

Step 3: Install and Configure the Head Node


The head node is responsible for managing the compute cluster nodes, performing job control, and acting as the gateway for submitted and completed jobs. It requires SQL Server 2005 Standard Edition as part of the underlying service and support structure. You should consider using three hard drives for your head node: one for the operating system, one for the SQL Server database, and one for the SQL Server transaction logs. This will provide reduced drive contention, better overall throughput, and some transactional redundancy should the database drive fail. In some cases, enabling hyperthreading on the head node will also result in improved performance for heavily loaded SQL Server applications. There are two tasks that are required for installing and configuring your head node:
1. Install and configure the base operating system.
2. Install and configure SQL Server 2005 Standard Edition.


To install and configure the base operating system
1. On the head node computer, boot to the Windows Server 2003 R2 Standard Edition x64 CD.
2. Accept the license agreement.
3. On the Partition List screen, create two partitions: one partition of 30 GB, and a second that uses the remainder of the space on the hard drive. Select the 30 GB partition as the install partition, and then press ENTER.
4. On the Format Partition screen, accept the default of NTFS, and then press ENTER. Proceed with the remainder of the text-mode setup. The computer then reboots into graphical setup mode.
5. On the Licensing Modes page, select the option for which you are licensed, and then configure the number of concurrent connections, if needed. Click Next.
6. On the Computer Name and Administrator Password page, type a name for the head node (for example, HEADNODE). Type the account with permission to join a computer to the domain (for example, hpccluster\administrator), type the password twice, and then press ENTER.
7. On the Networking Settings page, select Typical settings, and then click Next. This will automatically assign addresses to your public and private adapters. If you want to use static IP addresses for either interface, select Custom settings, and then click Next. Follow the steps that you used to configure your service node adapter settings.
8. On the Workgroup or Computer Domain page, select Yes, make this computer a member of a domain. Type the name of your cluster domain (for example, HPCCluster.local), and then click Next. When prompted, type the name and the password for an account that has permission to add computers to the domain (typically, the Administrator account), and then click OK.

Note: If your network adapter drivers are not included on the Windows Server 2003 CD, then you will not be able to join a domain at this time.
Instead, make the computer a member of a workgroup, complete the rest of setup, install your network adapters, and then join your head node to the domain.

When you have configured the base operating system, you can install SQL Server 2005 Standard Edition on your head node.

To install and configure SQL Server 2005 Standard Edition
1. Log on to your server as Administrator. Insert the SQL Server 2005 Standard Edition x64 CD into the head node. If setup does not start automatically, browse to the CD drive and then run setup.exe.
2. On the End User License Agreement page, select I accept the licensing terms and conditions, and then click Next.
3. On the Installing Prerequisites page, click Install. When the installations are complete, click Next. The Welcome to the Microsoft SQL Server Installation Wizard page appears. Click Next.


4. On the System Configuration Check page, the installation program displays a report with potential installation problems. You do not need to install IIS or address any IIS-related warnings because IIS is not used in this deployment. Click Next.
5. On the Registration Information page, complete the Name and Company fields with the appropriate information, and then click Next.
6. On the Components to Install page, select all check boxes, and then click Next.
7. On the Instance Name page, select Named instance, and then type COMPUTECLUSTER in the text box. Your cluster must have this name, or Windows Compute Cluster will not work. Click Next.
8. On the Service Account page, select Use the built-in System account, and then select Local system in the drop-down list. In the Start services at the end of setup section, select all options except SQL Server Agent, and then click Next.
9. On the Authentication Mode page, select Windows Authentication Mode. Click Next.
10. On the Collation Settings page, select SQL collations, and then select Dictionary order, case-insensitive, for use with 1252 Character Set from the drop-down list. Click Next.
11. On the Error and Usage Report Settings page, click Next.
12. On the Ready to Install page, click Install. When the Setup Progress page appears, click Next.
13. On the Completing Microsoft SQL Server 2005 Setup page, click Finish.
14. Open the Disk Management console. Click Start, click Run, type diskmgmt.msc, and then click OK.
15. Right-click the second partition on your drive, and then click Format. In the Format dialog box, select Quick Format, and then click OK. When the format process finishes, close the Disk Management console.

If your company uses a proxy server to connect to the Internet, you should configure your head node so that it can receive system and application updates from Microsoft.
1. To configure your proxy server settings, open Internet Explorer. Click Tools, and then click Internet Options.
2. Click the Connections tab, and then click LAN Settings.
3. On the Local Area Network (LAN) Settings page, select Use a proxy server for your LAN. Enter the URL or IP address for your proxy server.
4. If you need to configure secure HTTP settings, click Advanced, and then enter the URL and port information as needed.
5. Click OK three times, and then close Internet Explorer.

When you have finished configuring your server, click Start, click All Programs, and then click Windows Update. This will ensure that your server is up to date with service packs and software updates that may be needed to improve performance and security. You should elect to install Microsoft Update from the Windows Update page. This service provides service packs and updates for all Microsoft applications, including SQL Server. Follow the instructions on the Windows Update page to install the Microsoft Update service.


Step 4: Install the Compute Cluster Pack


When the head node has been configured, you can install the Compute Cluster Pack that contains services, interfaces, and supporting software that is needed to create and configure cluster nodes. It also includes utilities and management infrastructure for your cluster.

To install the Compute Cluster Pack
1. Insert the Compute Cluster Pack CD into the head node. The Microsoft Compute Cluster Pack Installation Wizard appears. Click Next.
2. On the Microsoft Software License Terms page, select I accept the terms in the license agreement, and then click Next.
3. On the Select Installation Type page, select Create a new compute cluster with this server as the head node. Do not use the head node as a compute node. Click Next.
4. On the Select Installation Location page, accept the default. Click Next.
5. On the Install Required Components page, a list of required components for the installation appears. Each component that has been installed will appear with a check next to it. Select a component without a check, and then click Install.
6. Repeat the previous step for all uninstalled components. When all of the required components have been installed, click Next. The Microsoft Compute Cluster Pack Installation Wizard completes. Click Finish.

Step 5: Define the Cluster Topology


After the Compute Cluster Pack installation for the head node is complete, a Cluster Deployment Tasks window appears with a To Do List. In this procedure, you will configure the cluster to use a network topology that consists of a single private network for the compute nodes and a public interface from the head node to the rest of the network.

To define the cluster topology
1. On the To Do List page, in the Networking section, click Configure Cluster Network Topology. The Configure Cluster Network Topology Wizard starts. Click Next.
2. On the Select Setup Type page, select Compute nodes isolated on private network from the drop-down list. A graphic appears that shows you a representation of your network. You can learn more about the different network topologies by clicking the Learn more about this setup link. When you have reviewed the information, click Next.
3. On the Configure Public Network page, select the correct public (external) network adapter from the drop-down list. This network will be used for communicating between the cluster and the rest of your network. Click Next.
4. On the Configure Private Network page, select the correct private (internal) adapter from the drop-down list. This network will be used for cluster management and node deployment. Click Next.
5. On the Enable NAT Using ICS page, select Disable Internet Connection Sharing for this cluster. Click Next.
6. Review the summary page to ensure that you have chosen an appropriate network configuration, and then click Finish. Click Close.


Step 6: Create the Compute Node Image


You can now create a compute node image. This is the compute node image that will be captured and deployed to each of the compute nodes. There are three tasks that are required to create the compute node image:
1. Install and configure the base operating system.
2. Install and configure the ADS agent and Compute Cluster Pack.
3. Update the image and prepare it for deployment.

To install and configure the base operating system
4. Start the node that you want to use to create your compute node image. Insert the Microsoft Windows Server 2003 Compute Cluster Edition CD into the CD drive. Text-mode setup launches automatically.
5. Accept the license agreement.
6. On the Partition List screen, create one partition of 16 GB. Select the 16 GB partition as the install partition, and then press ENTER.
7. On the Format Partition screen, accept the default of NTFS, and then press ENTER. Proceed with the remainder of the text-mode setup. The computer then reboots into graphical setup mode.
8. On the Licensing Modes page, select the option for which you are licensed, and then configure the number of concurrent connections, if needed. Click Next.
9. On the Computer Name and Administrator Password page, type a name for the compute node that has not been added to ADS (for example, NODE000). Type your local administrator password twice, and then press ENTER.
10. On the Networking Settings page, select Typical settings, and then click Next. This will automatically assign addresses to your public and private adapters. The adapter information for the deployed nodes will be automatically created when the image is deployed to a node.
11. On the Workgroup or Computer Domain page, select Yes, make this computer a member of a domain. Type the name of your cluster domain (for example, HPCCluster), and then click Next. When prompted, type the name and the password for an account that has permission to add computers to the domain (for example, hpccluster\administrator), and then click OK.
The computer will copy files, and then reboot.

Note: If your network adapter drivers are not included on the Windows Server 2003 Compute Cluster Edition CD, then you will not be able to join a domain at this time. Instead, make the computer a member of a workgroup, complete the rest of setup, install your network adapters, and then join your compute node to the domain.

12. Log on to the node as administrator.
13. Copy the QFE files to your compute node. Run each executable and follow the instructions for installing the quick fix files on your server.
14. Open Regedit. Click Start, click Run, type regedit, and then click OK.


15. Browse to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters. Right-click in the right pane. Click New, and then click DWORD Value. Type SynAttackProtect (case sensitive), and then press ENTER.
16. Double-click the new key that you just created. Confirm that the value data is zero, and then click OK.
17. Right-click in the right pane. Click New, and then click DWORD Value. Type TcpMaxDataRetransmissions (case sensitive), and then press ENTER.
18. Double-click the new key that you just created. In the Value data text box, type 20. Ensure that Base is set to Hexadecimal, and then click OK.
19. Close Regedit.
20. Disable any network interfaces that will not be used by the cluster, or that do not have physical network connectivity.

When you have configured the base operating system, you can then install and configure the ADS Agent and the Compute Cluster Pack on your image.

To install and configure the ADS Agent and Compute Cluster Pack
1. Copy the ADS binaries to a folder on the compute node. Browse to the folder, and then run ADSSetup.exe.
2. A Welcome page appears. Click Install ADS Administration Agent. The Administration Agent Setup Wizard starts. Click Next.
3. On the License Agreement page, select I accept the terms of the license agreement, and then click Next.
4. On the Configure Certificates page, select Now. Type the fully qualified path to the certificate share on the service node (for example, \\servicenode\Certificate\adsroot.cer). Click Next.
5. On the Configure the Agent Logon Settings page, select None, and then click Next.
6. On the Installation Confirmation page, click Install.
7. On the Completing the Administration Agent Setup Wizard page, click Finish. Close the Automated Deployment Services Welcome page.
8. Insert the Compute Cluster Pack CD into the compute node. The Microsoft Compute Cluster Pack Installation Wizard appears. Click Next.
9. On the Microsoft Software License Terms page, select I accept the terms in the license agreement, and then click Next.
10. On the Select Installation Type page, select Join this server to an existing compute cluster as a compute node. Type the name of the head node in the text box (for example, HEADNODE). Click Next.
11. On the Select Installation Location page, accept the default. Click Next.
12. On the Install Required Components page, a list of required components for the installation appears. Each component that has been installed will appear with a check next to it. Select a component without a check, and then click Install.


13. Repeat the previous step for all uninstalled components. When all of the required components have been installed, click Next. When the Microsoft Compute Cluster Pack installation completes, click Finish.

When you have installed and configured the ADS Agent and Compute Cluster Pack, you can update your image with the latest service packs, and then prepare your image for deployment.

To update the image and prepare it for deployment
1. Run the Windows Update service on your compute node. If your cluster lies behind a proxy server, configure Internet Explorer with your proxy server settings. For information on how to do this, see Step 1: Install and Configure the Service Node, earlier in this guide.
2. Run the Disk Cleanup utility. Click Start, click All Programs, click Accessories, click System Tools, and then click Disk Cleanup. Select the C: drive, and then click OK. Select all of the check boxes, and then click OK. When the cleanup utility is finished, close the utility.
3. Run the Disk Defragmenter utility. Click Start, click All Programs, click Accessories, click System Tools, and then click Disk Defragmenter. Select the C: drive, and then click Defragment. When the defragmentation utility is finished, close the utility.
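Earlier in this step, the base operating system procedure added two TCP/IP registry values (SynAttackProtect and TcpMaxDataRetransmissions) by hand in Regedit. If you script your image preparation, the same values can be expressed as a .reg file and imported silently with regedit /s. The sketch below (not part of the guide's tooling) just generates that file's text; the values mirror the ones in the procedure.

```python
# Emit the two TCP/IP tuning values as a .reg file that could be imported
# on the image with: regedit /s tcpip-tuning.reg
REG_HEADER = "Windows Registry Editor Version 5.00"
KEY = r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]"
VALUES = {
    "SynAttackProtect": 0x00,           # value data of zero, as in the guide
    "TcpMaxDataRetransmissions": 0x20,  # 20 with Base set to Hexadecimal
}

def reg_file() -> str:
    """Render the header, key path, and dword values in .reg syntax."""
    lines = [REG_HEADER, "", KEY]
    lines += ['"{}"=dword:{:08x}'.format(name, value)
              for name, value in VALUES.items()]
    return "\n".join(lines) + "\n"

print(reg_file())
```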

Step 7: Capture and Deploy Image to Compute Nodes


You can now capture the compute node image that you just created. You can then deploy the image to compute nodes on your cluster.

To capture the compute node image
1. If the compute node is not running, turn on the computer and wait for the node to boot into Windows Server 2003 Compute Cluster Edition.
2. Log on to the service node as administrator. Click Start, and then click ADS Management. Right-click Devices, and then click Add Device.
3. In the Add Device dialog box, type a name in the Name text box (for example, Node000) and a description for your node (for example, Compute Node Image), and then type the MAC address for the node that is running the compute node image. Click OK. The status pane will indicate that the node was created successfully. Click Cancel to close the dialog box.
4. Right-click your compute node name. Click Properties, and then click the User Variables tab.
5. Click Add. In the Variables dialog box, in the Name text box, type Imagename. In the Value text box, type a name for your image (for example, CCSImage). Click OK twice.
6. Right-click the compute node device again, and then click Properties. In the WinPE repository name text box, type the name for your repository that you defined when you installed ADS (for example, NodeImages). Click Apply, and then click OK.
7. Right-click the compute node that you just added, and then click Take Control.
8. Right-click the compute node device again, and then click Run Job. The Run Job Wizard starts. Click Next.


9. On the Job Type page, select Use an existing job template, and then click Next.
10. On the Template Selection page, select Capture Compute Node. Click Next.
11. On the Completing the Run Job Wizard page, click Finish. A Created Jobs dialog box appears. Click OK. The ADS Agent on your compute node runs the job, using Sysprep to prepare and configure the node image, and then using the ADS image capture functions to create and copy the image to ADS. When the image capture is complete, the node boots into WinPE.

Deploy the image to nodes on the cluster: When you have captured the compute node image to the service node, you can deploy the image to compute nodes on the cluster.

To deploy the image to nodes on the cluster
1. Log on to the service node as administrator. Click Start, click All Programs, click Microsoft ADS, and then click ADS Management.
2. Expand the Automated Deployment Services node, and then select Devices.
3. Select all devices that appear in the right pane, right-click the selected devices, and then select Take Control. The Control Status changes to Yes.
4. Right-click the devices, and then click Run Job.
5. The Run Job Wizard appears. Click Next.
6. On the Job Type page, select Use an existing job template. Click Next.
7. On the Template Selection page, select boot-to-winpe. Click Next.
8. On the Completing the Run Job Wizard page, click Finish.
9. Boot the compute nodes. The network adapters should already be configured to use PXE and obtain the WinPE image from the service node. To avoid overwhelming the ADS server during unicast deployment of the WinPE image, it is recommended that you boot only four nodes at a time. Subsequent sets of four nodes should be booted up only after all of the previous sets of four nodes are showing Connected to WinPE status in the ADS Management window on the service node.
10. After all the nodes are connected to WinPE, you can deploy the compute node image to those nodes.
*i$ht;clic3 the devices, and then clic3 ?un Fob. ... -he ?un Gob 5iAard appears. &lic3 $e/t. .2. )n the Gob Type pa$e, select =se an e/isting Fob template. &lic3 $e/t. . . )n the Template Selection pa$e, select 2eploy CCS Image. &lic3 $e/t. .2. )n the Completing the ?un Gob 5iAard pa$e, clic3 !inish. -he nodes automaticall5 download and run the ima$e. -his tas3 will ta3e a si$nificant amount of time, especiall5 when 5ou are installin$ hundreds of nodes. 1ependin$ on 5our available staff, 5ou ma5 want to run this as an overni$ht tas3. When finished, 5our nodes are ?oined to the domain and read5 to be mana$ed b5 the head node.
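The four-nodes-at-a-time boot rule in step 9 can be sketched as a small batching helper. This is an illustrative sketch only; the node names are hypothetical, and the actual "Connected to WinPE" check is done by watching the ADS Management window, not by this code.

```python
def boot_batches(nodes, batch_size=4):
    """Partition the node list into the staggered boot groups
    recommended above (four nodes at a time)."""
    return [nodes[i:i + batch_size] for i in range(0, len(nodes), batch_size)]

# Hypothetical node names for illustration only.
nodes = [f"NODE{n:03d}" for n in range(1, 11)]
for batch in boot_batches(nodes):
    # In practice: power on this batch, then wait until every node in it
    # shows "Connected to WinPE" in ADS Management before continuing.
    print(batch)
```

The point of the batching is simply to cap the number of simultaneous unicast WinPE downloads hitting the ADS server.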

Step 8: Configure and Manage the Cluster


The head node is used to manage and maintain your cluster once the node images have been deployed. The Compute Cluster Pack includes a Compute Cluster Administrator console that simplifies management tasks, including approving nodes on the cluster and adding users

and administrators to the cluster. The console includes a To Do List that shows you which tasks have been completed. Follow these steps to configure and manage your cluster:
1. Disable Windows Firewall on all nodes in the cluster.
2. Approve nodes that have joined the cluster.
3. Add users and administrators to the cluster.

Disable Windows Firewall on all nodes on the cluster. The Compute Cluster Administrator console enables you to define how the firewall is configured on all cluster node network adapters. For best performance on large-scale deployments, it is recommended that you disable Windows Firewall on all interfaces.

To disable Windows Firewall on all nodes on the cluster
1. Click Start, click All Programs, click Microsoft Compute Cluster Pack, and then click Compute Cluster Administrator.
2. Click the To Do List. In the Networking section in the results pane, click Manage Windows Firewall Settings. The Manage Windows Firewall Wizard starts. Click Next.
3. On the Configure Firewall page, select Disable Windows Firewall, and then click Next.
4. On the View Summary page, click Finish. On the Result page, click Close. When compute nodes are approved to join the cluster, the firewall will be disabled.

Approve nodes that have joined the cluster. When you deploy Compute Cluster Edition nodes, they have joined the cluster but have not been approved to participate or process any jobs. You must approve them before they can receive and process jobs from your users.

To approve nodes that have joined the cluster
1. Open the Compute Cluster Administrator console. Click Node Management.
2. In the results pane, select one or more nodes that display a status of Pending Approval.
3. In the task pane, click Approve. You can also right-click the selected nodes and then click Approve.
4. When the nodes are approved, the status changes to Paused. You can leave the nodes in Paused status, or in the task pane you can click Resume to enable the node to receive jobs from your users.
Add users and administrators to your cluster. In order to use and maintain the cluster, you must add cluster users and administrators to your cluster domain. This makes it possible for others to submit jobs to the cluster, and to perform routine administration and maintenance on the cluster. If your organization uses Active Directory, you will need to create a trust relationship between your cluster domain and other domains in your organization. You will also need to create organizational units (OUs) in your cluster domain that will act as containers for other OUs or users from your organization. You may need to work with other groups in your company to create the necessary security groups so that you can add users from other domains to your compute cluster domain. Because each organization is unique, it is not possible to provide step-by-step instructions on how to add users and administrators to the cluster domain. For help and information on how best to add users and administrators to your cluster, see Windows Server Help.

To add users and administrators to your cluster
1. In the Compute Cluster Administrator, click the To Do List. In the results pane, click Manage Cluster Users and Administrators. The Manage Cluster Users Wizard starts. Click Next.
2. On the Cluster Users page, the default group of HPCCLUSTER\Domain Users has been added for you. Type a user or group by using the format domain\user or domain\group, and then click Add. You can add or remove users and groups by using the Add and Remove buttons. When you have finished adding or removing users and groups, click Next.
3. On the Cluster Administrators page, the default group of HPCCLUSTER\Domain Admins has been added for you. Type a user or group by using the format domain\user or domain\group, and then click Add. You can add or remove users and groups by using the Add and Remove buttons. When you have finished adding or removing users and groups, click Next.
4. On the View Summary page, click Next.
5. On the Result page, click Close.

Step 9: Deploy the Client Utilities to Cluster Users


The Compute Cluster Administrator and the Compute Cluster Job Manager are installed on the head node by default. If you install the client utilities on a remote workstation, an administrator can manage clusters from that workstation. If you install the Compute Cluster Administrator or Job Manager on a remote computer, the computer must have one of the following operating systems installed:

- Windows Server 2003, Compute Cluster Edition
- Windows Server 2003, Standard x64 Edition
- Windows Server 2003, Enterprise x64 Edition
- Windows XP Professional x64 Edition
- Windows Server 2003 R2 Standard x64 Edition
- Windows Server 2003 R2 Enterprise x64 Edition

In addition, Windows Compute Cluster Server 2003 requires the following:

- Microsoft .NET Framework 2.0
- Microsoft Management Console (MMC) 3.0 to run the Compute Cluster Administrator snap-in
- SQL Server 2000 Desktop Engine (MSDE) to store all job information

The last step in the Windows Compute Cluster Server 2003 deployment process is to create an administrator or operator console.

To deploy the client utilities

1. On the workstation that is running the appropriate operating system, insert the Compute Cluster Pack CD. The Microsoft Compute Cluster Pack Installation Wizard is automatically launched. Click Next.
2. On the Microsoft Software License Terms page, select I accept the terms in the license agreement, and click Next.


3. On the Select Installation Type page, select Install only the Microsoft Compute Cluster Pack Client Utilities for the cluster users and administrators, and then click Next.
4. On the Select Installation Location page, accept the default location, and then click Next.
5. On the Install Required Components page, highlight any components that are not installed, and then click Install.
6. When the installation is finished, a window appears that says Microsoft Compute Cluster Pack has been successfully installed. Click Finish.

Please note that for an administration console, you should install only the client utilities. For a development workstation, you should install both the software development kit (SDK) utilities and the client utilities.


Appendix A: Tuning Your Cluster


Each cluster is created with a different goal in mind; therefore, there is a different way to tune each cluster for optimal performance. However, some basic guidelines can be established. To achieve performance improvements, you can do some planning, but testing will also be crucial. For testing, it is important to use applications and data that are as close as possible to the ones that the cluster will ultimately use. In addition to the specific use of the cluster, its projected size will be another basis for making decisions. After you deploy the applications, you can work on tuning the cluster appropriately.

The best networking solution will depend on the nature of your application. Although there are many different types of applications, they can be broadly categorized as message-intensive and embarrassingly parallel. In message-intensive applications, each node's job is dependent on other nodes. In some situations, data is passed between nodes in many small messages, meaning that latency is the limiting factor. With latency-sensitive applications, high-performance networking interfaces, such as Winsock Direct, are crucial. In addition, the use of high-quality routers and switches can improve performance with these applications. In some messaging situations, large messages are passed infrequently, meaning that bandwidth is the limiting factor. A specialty network, such as InfiniBand or Myrinet, will meet these high-bandwidth requirements. If network latency is not an issue, a gigabit Ethernet network adapter might be the best choice.

In embarrassingly parallel applications, each node processes data independently with little message passing. In this case, the total number of nodes and the efficiency of each node is the limiting factor. It is important to be able to fit the entire dataset into RAM. This will result in much faster performance, as the data will not have to be paged in and out from the disk during processing.
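A rough way to reason about the latency-versus-bandwidth distinction above is to model transfer time as a fixed per-message cost plus a size-dependent cost. The figures below (5 microseconds of latency, 1 GB/s of bandwidth) are illustrative assumptions, not measurements from this guide.

```python
def transfer_time(n_messages, bytes_per_message, latency_s=5e-6, bandwidth_bps=1e9):
    """Estimated cost split: one latency charge per message,
    plus total bytes divided by link bandwidth."""
    latency_cost = n_messages * latency_s
    bandwidth_cost = n_messages * bytes_per_message / bandwidth_bps
    return latency_cost, bandwidth_cost

# Many small messages: the latency term dominates (latency-bound workload).
lat, bw = transfer_time(n_messages=100_000, bytes_per_message=64)
print(lat > bw)

# Few large messages: the bandwidth term dominates (bandwidth-bound workload).
lat, bw = transfer_time(n_messages=10, bytes_per_message=100_000_000)
print(bw > lat)
```

Whichever term dominates tells you which interconnect property to buy for: low-latency fabrics (Winsock Direct over InfiniBand) for the first case, high-bandwidth links for the second.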
The speed of the processors and the type and number of nodes are prime concerns. If the processors are dual-core or quad-core, this may not be as efficient as having separate processors, each with its own memory bus. In addition, if hyper-threading is available, it may be advantageous to turn this feature off. Hyper-threading helps when applications are not using all CPU cycles, so that multiple threads can share a single processor. That works for regular scenarios, but in high-performance computing all CPU resources are used, so placing multiple processes on a single processor has the opposite effect: they have to wait to get resources. Hyper-threading is therefore generally bad for high-performance computing applications, but not necessarily all of them. As long as the operating system kernel is hyperthread-aware, floating-point-intensive processes will be balanced across physical cores. For multi-threaded applications that have both I/O-intensive threads and floating-point-intensive threads, hyper-threading could be a benefit. Hyper-threading was disabled at NCSA because none of the applications were floating-point intensive, and no specific thread-balancing or kernel tuning was performed.

If an application were perfectly parallel, each extra node would increase performance linearly. For each application, there is a maximum number of processors that will increase performance. Above that number, each processor adds no value, and could even decrease performance. This is referred to as application scaling. Depending on the system architecture, all cores sometimes divide the available bandwidth to memory, and they certainly always divide the network bandwidth. One of these three (CPU, network, or memory bandwidth) is the performance bottleneck with any application. If the nature of the application(s) is known, you can determine in advance the optimal cluster specifications that will match the application.
You should work with your application vendor to ensure that you have the optimal number of processors. In some applications, many jobs are run, each of short duration. In this scenario, the performance of the job scheduler is crucial. The CCS job scheduler was designed to handle this situation.
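Application scaling, as described above, can be approximated with Amdahl's law. The 95% parallel fraction and the 2% improvement cutoff below are illustrative assumptions; real scaling curves come from benchmarking your own application.

```python
def speedup(n_procs, parallel_fraction):
    """Amdahl's law: ideal speedup on n processors for a program
    whose parallelizable portion is parallel_fraction of the work."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_procs)

def useful_proc_limit(parallel_fraction, min_gain=0.02):
    """Smallest n at which doubling the processor count no longer
    improves speedup by at least min_gain (relative) -- a crude
    stand-in for the point where extra nodes stop paying off."""
    n = 1
    while speedup(2 * n, parallel_fraction) / speedup(n, parallel_fraction) - 1 > min_gain:
        n *= 2
    return n

# Even a 95%-parallel code is far from a 16x speedup on 16 processors.
print(round(speedup(16, 0.95), 2))
print(useful_proc_limit(0.95))
```

This is the quantitative reason the text warns that extra processors can add no value: the serial fraction puts a hard ceiling on achievable speedup.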


When evaluating cluster performance, it is important to be aware that benchmarks don't always tell the whole story. You must evaluate the performance based on your own needs and expectations. Evaluation should take place by using the application along with the data that will be running on the cluster. This will help to ensure a more accurate evaluation that will result in a system that better meets your needs. For more information on cluster tuning, you can download the "Performance Tuning a Compute Cluster" white paper from the Microsoft Web site at http://go.microsoft.com/fwlink/?LinkId=89848. You can also find additional tips and new information on performance tuning at the HPC Team blog: http://windowshpc.net/blogs.

Table 5 deals with scalability and will help you make decisions based on the intended size of your cluster. The first part focuses on management scenarios, while the second part focuses on networking scenarios. For each scenario, there is an estimated number of nodes above which the scenario will manifest itself. If your cluster exceeds the specified number of nodes, you may need to use the Note column to plan accordingly, or to troubleshoot.

Table 5: Scalability Considerations


Management scenarios:

1. MSDE on the head node supports 8 or fewer concurrent connections.
   Nodes: 64+. Note: Use SQL Server 2005 on the head node (hard coded). 5-9 tables for the scheduler; use 8 tables for the SDM.
2. RIS on the service node supports only 80 machines simultaneously.
   Nodes: 64+. Note: Use ADS for CCS 2003 (ADS requires 32-bit).
3. ICS/NAT has an address range limit of 192.168.0.x.
   Nodes: 250+. Note: Use DHCP Server instead.
4. The file server on the head node supports only a limited number of simultaneous connections to SMB/NTFS.
   Nodes: 44+. Note: Put executables on the compute nodes. Increase the number of connections that the file server on the head node can support (see KB 19229).
5. The DC/DNS server on the head node is not optimal; it does not handle several NICs well.
   Nodes: 64+. Note: It is best to leverage the corporate IT DC. Put DC/DNS on a separate management node.
6. ADS loses contact with compute nodes after Winsock Direct has been enabled.
   Nodes: N/A. Note: Use clusrun or jobs to control the machine. If IPMI is available, use IPMI to reboot the machine into WinPE. WDS, for the next version of CCS, works with Winsock Direct.
7. The Cisco IB switch subnet manager is incompatible with the openIB drivers.
   Nodes: N/A. Note: Use openSM: disable the Cisco IB switch subnet manager and enable openSM.
8. An SDM update bottleneck exists.
   Nodes: 64+. Note: CCS v1 SP1.
9. Job scheduling bottlenecks exist.
   Nodes: 64+. Note: CCS v1 SP1.
10. Winsock Direct (large scale only).
    Nodes: 64+. Note: CCS v1 SP1 and the Winsock Direct hotfixes 910281, 929620, and 922286.
11. InfiniBand drivers (large scale only).
    Nodes: 64+. Note: This is fixed in OpenFabrics build 259, found at: http://windows.openib.org/downloads/binaries/


Networking scenarios:

1. There is a bottleneck in the number of possible simultaneous connections with the code path used when SYN attack protection is on.
   Nodes: 64+. Note: Disable the SYN attack protection registry value: HKLM\System\CurrentControlSet\Services\Tcpip\Parameters\SynAttackProtect = 0.
2. There are TCP timeouts on calling nodes when the network is jammed (delay at the switch); for example, mpi all reduce.
   Nodes: 64+. Note: Set the TCP retransmission count to 0x20. Please note that this is hard to diagnose, as one-to-all makes different nodes fail.
3. Latency is too high.
   Nodes: N/A. Note: Use mpiexec -env IBWSD_POLL 500 linpack.
4. Bandwidth is too low.
   Nodes: N/A. Note: Use mpiexec -env MPICH_SOCKET_SBUFFER_SIZE 0 to avoid a copy on send and improve bandwidth. Use this only when Winsock Direct is enabled; it can cause lockup with GigE and IPoIB.
5. A Winsock Direct connection timeout exists.
   Nodes: N/A. Note: Use mpiexec -env IBWSD_SA_TIMEOUT 1000 to set the subnet manager timeout to a higher value during Winsock Direct connection establishment.


Appendix B: Troubleshooting Your Cluster


In addition, Table 6 can help you troubleshoot problems with your cluster.

Table 6: Troubleshooting
Issue: Application hangs.
Mitigation: For ease of debugging, switch off shared-memory communication by using the environment variable MPICH_DISABLE_SHM=1. Note: This is done from the command line with the command: mpiexec -env VARIABLE SETTING -env OTHERVARIABLE OTHERSETTING. Note: You can also set up WinDbg for just-in-time debugging with the command: windbg -I.
Details: MPI environment variable MPICH_DISABLE_SHM. If you disable shared memory, MSMPI stops using it for communication between processors on the same node and focuses on the network queues (node-to-node communication).

Issue: Application fails.
Mitigation: Network connectivity issue: the last line of the MPI output file gives you information on network errors. Note: The stdout output is located where you route it in your job; that is, it is specified by the /stdout: switch to "job submit". Turn SYN protection off completely on all compute nodes (leave SYN protection active on the head node to avoid denial of service).
Details: Output file. Registry setting for SYN attack protection:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters\SynAttackProtect=0

To deploy this setting to all nodes:

clusrun /all reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v SynAttackProtect /t REG_DWORD /d 0 /f
clusrun /all shutdown -t 10 -r -f


Issue: Network connectivity failure.
Mitigation: Identify the node with the defect. Use Pallas ping-pong, one-to-all, and all-to-all.
Details: A good set of tools for this is the Linux-based Intel MPI Benchmarks (based on the Pallas test suite). These are available for download from http://windowshpc.net/files/2/porting_unix_code/entry393.aspx. Note: Because these tests are Linux-based, you will have to port them to CCS by using the Subsystem for UNIX-based Applications (SUA). Instructions on how to do this are included with the download.

Issue: Winsock Direct issues.
Mitigation: Disable Winsock Direct (WSD) and use the IPoIB path instead of RDMA:

clusrun /all \\HEADNODE\IBDriverInstallPath\net\amd64\installsp -r

If it works when disabled, then try to 'Repair' the IB connections:

clusrun netsh interface set interface name=MPI admin=DISABLE/ENABLE

Validate that your cluster has the latest Winsock Direct patches.
Details: IB driver and Winsock Direct installation utility.

Issue: Application performance not optimal.
Mitigation:
- Application not optimized for memory or CPU utilization: check whether nodes are paging instead of using RAM; check CPU utilization.
- Application doesn't scale to a large number of nodes: decrease the number of nodes used by the application until application performance comes back to the expected level.
- MSMPI does not balance well between same-node processor communication and node-to-node communication: experiment with disabling the shared-memory setting and see whether application performance improves. Especially relevant for message-intensive applications.
- Messages are not coming in fast enough: GigE: experiment with turning off interrupt moderation to free up CPU usage. IB: experiment with increasing the polling of messages. Polling causes high CPU usage, so if usage is too high, it will be detrimental to the application's own CPU needs.
Details: Use perfmon counters: http://go.microsoft.com/fwlink/?LinkId=86619. MPICH_DISABLE_SHM; openIB driver IBWSD_POLL.


Issue: Connectivity to one or more nodes on the cluster is lost.
Mitigation: Divide the cluster into subsets of nodes. Run Pallas ping-pong, one-to-all, or all-to-all on those subsets.
Details: Intel MPI Benchmarks. This strategy breaks the cluster into subclusters to try to find where the issue is. In each subcluster, run sanity tests like the Pallas series in order to discover which subcluster contains the "bad" node.

Issue: Switch oversubscription not optimal.
Mitigation: Try a higher number of uplinks.
Details: This strategy involves checking the number of uplinks/downlinks per switch to see whether this is the cause of poor application performance.

Issue: Send operation.
Mitigation: Experiment with having no extra copy on the Send operation.
Details: MSMPI setting: set MPICH_SOCKET_SBUFFER_SIZE to 0. Note: This is done on the command line with the command: mpiexec -env VARIABLE SETTING -env OTHERVARIABLE OTHERSETTING. Note: This will lead to higher bandwidth but also to higher CPU utilization. Note: Use this only when compute nodes are fitted with a WSD-enabled driver. Using a setting of 0 will cause compute nodes on non-WSD networks to stop responding.

Issue: Memory bus bottleneck.
Mitigation: Experiment with setting the processor affinity (assign an MPI process to a specific CPU or CPU core).
Details: An example of doing this from the command line:

job submit /numprocessors:16 mpiexec cmd /c setAffinity.bat myapp.exe

where setAffinity.bat consists of:

@echo off
set /a AFFINITY=(1 << (%PMI_RANK% %% %NUMBER_OF_PROCESSORS%))
echo affinity is %AFFINITY%
start /wait /b /affinity %AFFINITY% %*
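The batch file above pins each MPI rank to one core by computing a one-bit affinity mask from the rank (PMI_RANK is the rank variable MS-MPI sets in the process environment). A quick sketch of the same arithmetic:

```python
def affinity_mask(rank, n_processors):
    """Reproduce the batch file's arithmetic: one bit set per rank,
    wrapping around the number of processors on the node."""
    return 1 << (rank % n_processors)

# Ranks 0..3 on a 4-core node each get their own core (distinct bits).
masks = [affinity_mask(r, 4) for r in range(4)]
print(masks)
# Rank 5 wraps around onto core 1 (mask bit 1).
print(affinity_mask(5, 4))
```

The modulo wrap matters when more ranks are scheduled per node than there are cores; without it, the shift would produce a mask with no valid core bit.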


Appendix C: Cluster Configuration and Deployment Scripts


These scripts are used to automatically add nodes to the cluster and to deploy images to nodes automatically without administrator intervention.

AddADSDevices.vbs. Parses an input file and uses the data to automatically populate ADS with the correct compute node information, including node names and MAC address values. These values are later used by Sysprep.exe to configure the node images during deployment. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/ccdevb01.mspx

AddComputeNodes.csv. Sample input file that shows the configuration information needed for adding nodes to the cluster. The easiest way to work with this file is to import it into Excel as a comma-delimited file, add the necessary values, including compute node MAC addresses, and then export the data as a comma-separated value file. Every item must have an entry or the input file will not work properly. If you do not have a value for a field, use a hyphen '-' for the field instead. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/node/ccnovb11.mspx

Capture-CCS-image-with-winpe.xml. ADS job template that captures a compute node image for later deployment to nodes on the cluster. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/ccdevb02.mspx

Deploy-CCS-image-with-winpe.xml. ADS job template that deploys a compute node image to compute nodes on the cluster. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/ccdevb04.mspx

Sysprep.inf. Generic configuration file for use with Sysprep.exe. Variable values are retrieved from ADS during the image deployment process. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/ccdevb05.mspx

The original high-performance compute cluster used additional scripts specific to its environment, including configuring InfiniBand networking. If you have similar needs, you can use these examples as a foundation for creating your own scripts and job templates.

ChangeIPforIB.vbs. Original script to configure IP over InfiniBand networking. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/ccdevb03.mspx

Capture-image-with-winpe.xml. Original job template to capture a compute node image. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/ccdevb06.mspx

Deploy-image-on-YAJ3-with-winpe.xml. Original job template to deploy a compute node image to the compute nodes. http://www.microsoft.com/technet/scriptcenter/scripts/ccs/deploy/ccdevb07.mspx
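As a rough illustration of the AddComputeNodes.csv convention described above (every field must have an entry, with '-' standing in for empty values), a parser might look like the following. The column names here are hypothetical; the real layout is defined by the script published on TechNet.

```python
import csv
import io

# Hypothetical columns for illustration; the actual input file defines its own.
FIELDS = ["NodeName", "MACAddress", "IPAddress", "Description"]

def parse_node_file(text):
    """Parse a comma-delimited node file, treating '-' as an empty field
    and rejecting rows that do not fill every column."""
    nodes = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != len(FIELDS):
            raise ValueError(f"every item must have an entry: {row}")
        nodes.append({k: (None if v == "-" else v) for k, v in zip(FIELDS, row)})
    return nodes

sample = "NODE001,00-0D-56-AA-BB-01,10.0.0.11,-\n"
print(parse_node_file(sample))
```

Validating the row length up front mirrors the guide's warning that a file with missing entries will not work properly.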


Related Links
For more information about Windows Compute Cluster Server 2003 and high-performance computing, visit the Windows high-performance computing Web site at http://www.microsoft.com/hpc

For more information on scripts, visit Scripting for Compute Cluster Server at http://www.microsoft.com/technet/scriptcenter/hubs/ccs.mspx

For more information on the Windows Server 2003 family, visit the Windows Server 2003 Web site at http://www.microsoft.com/windowsserver2003

For information on obtaining professional assistance with planning and deploying a cluster, visit the Microsoft Partner Solutions Center at http://www.microsoft.com/serviceproviders/busresources/mpsc.mspx

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication. This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. © 2007 Microsoft Corporation. All rights reserved. Microsoft, Active Directory, Internet Explorer, Visual Studio, Windows, the Windows logo, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.

