
Guide to the PBS Queuing System on Thunder

Table of Contents
1. Introduction
2. Anatomy of a Batch Script
2.1. Specify Your Shell
2.2. Required PBS Directives
2.2.1. Number of Nodes and Processes Per Node
2.2.2. How Long to Run
2.2.3. Which Queue to Run In
2.2.4. Your Project ID
2.3. The Execution Block
3. Submitting Your Job
4. Simple Batch Script Example
5. Job Management Commands
6. Optional PBS Directives
6.1. Job Identification Directives
6.1.1. Application Name
6.1.2. Job Name
6.2. Job Environment Directives
6.2.1. Interactive Batch Shell
6.2.2. Export All Variables
6.2.3. Export Specific Variables
6.3. Reporting Directives
6.3.1. Redirecting Stdout and Stderr
6.3.2. Setting up Email Alerts
6.4. Job Dependency Directives
7. Environment Variables
7.1. PBS Environment Variables
7.2. Other Important Environment Variables
8. Example Scripts
8.1. MPI Script
8.2. MPI Script (accessing more memory per process)
8.3. OpenMP Script
8.4. SHMEM Script
8.5. Hybrid MPI/OpenMP Script

1. Introduction

On large-scale computers, many users must share available resources. Because of this, you cannot just log on to
one of these systems, upload your programs, and start running them. Essentially, your programs (called batch jobs)
have to "get in line" and wait their turn. And there is more than one of these lines (called queues) from which to
choose. Some queues have a higher priority than others (like the express checkout at the grocery store). The
queues available to you are determined by the projects that you are involved with.

The jobs in the queues are managed and controlled by a batch queuing system, without which users could overload
systems, resulting in tremendous performance degradation. The queuing system will run your job as soon as it can
while still honoring the following:

Meeting your resource requests
Not overloading systems
Running higher-priority jobs first
Maximizing overall throughput

At AFRL, we use the PBS Professional queuing system. The PBS module should be loaded automatically for you at
login, allowing you access to the PBS commands.

2. Anatomy of a Batch Script

A batch script is simply a small text file that can be created with a text editor such as vi or notepad. You may create
your own from scratch, or start with one of the sample batch scripts available in $SAMPLES_HOME. Although the
specifics of a batch script will differ slightly from system to system, a basic set of components is always required,
and a few components are just always good ideas. The basic components of a batch script must appear in the
following order:

Specify Your Shell
Required PBS Directives
The Execution Block

IMPORTANT: Not all applications on Linux systems can read DOS-formatted text files. PBS does not handle ^M
characters well, nor do some compilers. To avoid complications, please remember to convert all DOS-formatted
ASCII text files with the dos2unix utility before use on any HPC system. Users are also cautioned against relying on
ASCII transfer mode to strip these characters, as some file transfer tools do not perform this function.
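
As a quick check before submitting, a script can be tested for ^M (carriage-return) characters. The following is a minimal sketch, not part of the original guide: the function name has_cr and the temporary demo file are hypothetical, and only standard utilities (printf, grep, mktemp) are assumed.

```shell
#!/bin/bash
# has_cr FILE: exit status 0 if FILE contains a carriage return (^M), 1 if not.
# (has_cr is a hypothetical helper name used only for this illustration.)
has_cr() {
    local cr
    cr=$(printf '\r')          # a literal carriage-return character
    grep -q "$cr" "$1"
}

# Demonstration on a throwaway file written with DOS line endings.
demo=$(mktemp)
printf '#!/bin/bash\r\necho hello\r\n' > "$demo"
if has_cr "$demo"; then
    echo "DOS line endings found; convert the file with dos2unix before submitting"
fi
rm -f "$demo"
```

If the check succeeds on your batch script, run dos2unix on it as described above.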

2.1. Specify Your Shell

First of all, remember that your batch script is a script. It's a good idea to specify which shell your script is written in.
Unless you specify otherwise, PBS will use your default login shell to run your script. To tell PBS which shell to use,
start your script with a line similar to the following, where shell is either bash, sh, ksh, csh, tcsh, or zsh:

#!/bin/shell

2.2. Required PBS Directives

The next block of your script will tell PBS about the resources that your job needs by including PBS directives.
These directives are actually a special form of comment, beginning with "#PBS". As you might suspect, the #
character tells the shell to ignore the line, but PBS reads these directives and uses them to set various
values. IMPORTANT!! All PBS directives MUST come before the first line of executable code in your script,
otherwise they will be ignored.

Every script must include directives for the following:

The number of nodes and processes per node you are requesting
The maximum amount of time your job should run
Which queue you want your job to run in
Your Project ID

PBS also provides additional optional directives. These are discussed in Optional PBS Directives, below.

2.2.1. Number of Nodes and Processes Per Node

Before PBS can schedule your job, it needs to know how many nodes you want. Before your job can be run, it will
also need to know how many processes you want to run on each of those nodes. In general, you would specify one
process per core, but you might want more or fewer processes depending on the programming model you are using.
See Example Scripts (below) for alternate use cases.

Both the number of nodes and processes per node are specified using the same directive as follows, where N1 is the
number of nodes you are requesting and N2 is the number of processes per node (must be 1, 2, 4, 6, 9, 18, or 36):

#PBS -l select=N1:ncpus=36:mpiprocs=N2

The value of ncpus refers to the number of physical cores available on each node, and must always be set to 36 for
standard compute nodes.

GPU nodes will require ncpus=28, plus the extra argument of ngpus=1:

#PBS -l select=N1:ncpus=28:mpiprocs=N2:ngpus=1

Large-memory nodes will require ncpus=36, plus the extra argument of bigmem=1:

#PBS -l select=N1:ncpus=36:mpiprocs=N2:bigmem=1

PHI nodes will require ncpus=28, plus the extra argument of nmics=2:

#PBS -l select=N1:ncpus=28:mpiprocs=N2:nmics=2

An exception to this rule is the transfer queue, which uses the directive below:

#PBS -l select=1:ncpus=1
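
The arithmetic behind N1 is a round-up division: the total number of processes divided by the processes per node. The sketch below only illustrates that calculation for standard compute nodes; the helper names nodes_needed and select_directive are hypothetical and are not PBS commands.

```shell
#!/bin/bash
# nodes_needed TOTAL_PROCS PROCS_PER_NODE
# Round up to whole nodes. (Hypothetical helper, not a PBS command.)
nodes_needed() {
    local total=$1 per_node=$2
    echo $(( (total + per_node - 1) / per_node ))
}

# select_directive TOTAL_PROCS PROCS_PER_NODE [CORES_PER_NODE]
# Print a select directive; ncpus defaults to 36, the value this guide
# gives for standard compute nodes.
select_directive() {
    local total=$1 per_node=$2 ncpus=${3:-36}
    echo "#PBS -l select=$(nodes_needed "$total" "$per_node"):ncpus=${ncpus}:mpiprocs=${per_node}"
}

select_directive 288 36   # 288 processes, 36 per node
select_directive 8 1      # 8 processes, 1 per node
```

The first call prints a request for 8 nodes with 36 processes each, and the second prints a request for 8 nodes with one process each.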

2.2.2. How Long to Run

Next, PBS needs to know how long your job will run. For this, you will have to make an estimate. There are three
things to keep in mind.

1. Your estimate is a limit. If your job hasn't completed within your estimate, it will be terminated.
2. Your estimate will affect how long your job waits in the queue. In general, shorter jobs will run before longer
jobs.
3. Each queue has a maximum time limit. You cannot request more time than the queue allows.

To specify how long your job will run, include the following directive:

#PBS -l walltime=HHH:MM:SS
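
If you estimate run time in minutes, converting to the HHH:MM:SS form is simple integer arithmetic. This is a small sketch under that assumption; walltime_from_minutes is a hypothetical helper name, not a PBS command.

```shell
#!/bin/bash
# walltime_from_minutes MINUTES
# Print MINUTES in the HHH:MM:SS form used by the walltime directive.
# (Hypothetical helper, not a PBS command.)
walltime_from_minutes() {
    local minutes=$1
    printf '%d:%02d:00\n' $(( minutes / 60 )) $(( minutes % 60 ))
}

echo "#PBS -l walltime=$(walltime_from_minutes 90)"     # 1 hour 30 minutes
echo "#PBS -l walltime=$(walltime_from_minutes 1200)"   # 20 hours
```

Remember that the value is a hard limit, so it is usually worth rounding the estimate up rather than down.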

2.2.3. Which Queue to Run In

Now, PBS needs to know which queue you want your job to run in. Your options here are determined by your project.
Most users only have access to the debug, standard, and background queues. Other queues exist, but access to
these queues is restricted to projects that have been granted special privileges due to urgency or importance, and
they will not be discussed here. As their names suggest, the standard and debug queues should be used for normal
day-to-day and debugging jobs. The background queue, however, is a bit special because although it has the lowest
priority, jobs that run in this queue are not charged against your project allocation. Users may choose to run in the
background queue for several reasons:

1. You don't care how long it takes for your job to begin running.
2. You are trying to conserve your allocation.
3. You have used up your allocation.

To see the list of queues available on the system, use the show_queues command. To specify the queue you want
your job to run in, include the following directive:

#PBS -q queue_name

2.2.4. Your Project ID

PBS now needs to know which project ID to charge for your job. You can use the show_usage command to find the
projects that are available to you and their associated project IDs. In the show_usage output, project IDs appear in
the column labeled "Subproject." Note: Users with access to multiple projects should remember that the project they
specify may limit their choice of queues.

To specify the Project ID for your job, include the following directive:

#PBS -A Project_ID

2.3. The Execution Block

Once the PBS directives have been supplied, the execution block may begin. This is the section of your script that
contains the actual work to be done. A well-written execution block will generally contain the following stages:

Environment Setup - This might include setting environment variables, loading modules, creating
directories, copying files, initializing data, etc. As the last step in this stage, you will generally cd to the
directory that you want your script to execute in. Otherwise, your script would execute by default in your
home directory. Most users use "cd $PBS_O_WORKDIR" to run the batch script from the directory where they
typed qsub to submit the job.
Compilation - You may need to compile your application if you don't already have a precompiled
executable available.
Launching - If your application uses Intel MPI, launch it with the mpirun command. If it uses SGI MPT,
launch it with the mpiexec_mpt command.
Cleanup - This usually includes archiving your results and removing temporary files and directories.

3. Submitting Your Job

Once your batch script is complete, you will need to submit it to PBS for execution using the qsub command. For
example, if you have saved your script into a text file named run.pbs, you would type "qsub run.pbs".

Occasionally you may want to supply one or more directives directly on the qsub command line. Directives supplied
in this way override the same directives if they are already included in your script. The syntax to supply directives on
the command line is the same as within a script except that #PBS is not used. For example:

qsub -l walltime=HHH:MM:SS run.pbs

4. Simple Batch Script Example

The batch script below contains all of the required directives and common script components discussed above. This
example starts 72 processes. Each Thunder node has 36 cores, so 72 processes require 2 nodes. The job is
submitted to the standard queue to run for at most 12 hours.

#!/bin/bash
## Required PBS Directives
#PBS -A Project_ID
#PBS -q standard
#PBS -l select=2:ncpus=36:mpiprocs=36
#PBS -l walltime=12:00:00
#PBS -j oe

## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}

# Launching
# copy executable from $HOME and submit it
cp ${HOME}/my_prog.exe .
mpiexec_mpt -n 72 ./my_prog.exe > my_prog.out

# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}

# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}

# Create a directory in $ARCHIVE_HOME named after this
# PBS job ID and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}

# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END

# Submit the archive job script.
qsub archive_job

5. Job Management Commands

The table below contains commands for managing your jobs in PBS.

Job Management Commands
Command       Description
qsub          Submit a job.
qstat         Check the status of a job.
qview         A more user-friendly version of qstat.
qstat -q      Display the status of all PBS queues.
show_queues   A more user-friendly version of "qstat -q".
qdel          Delete a job.
qhold         Place a job on hold.
qrls          Release a job from hold.
tracejob      Display job accounting data from a completed job.
pbsnodes      Display host status of all PBS batch nodes.
qpeek         Lets you peek at the stdout and stderr of your running job.
qhist         Display a detailed history of a specific job.

6. Optional PBS Directives

In addition to the required directives mentioned above, PBS has many other directives, but most users will only use
a few of them. Some of the more useful directives are listed below.

6.1. Job Identification Directives

Job identification directives allow you to identify characteristics of your jobs. These directives are voluntary, but
strongly encouraged. The following table contains some useful job identification directives.

Job Identification Directives
Directive        Options            Description
-l application   application_name   Identify the application being used.
-N               job_name           Name your job.

6.1.1. Application Name

The "-l application" directive allows you to identify the application being used by your job. This helps the program to
accurately assess application usage and to ensure that adequate software licenses and appropriate software are
purchased. To use this directive, add a line in the following form to your batch script:

#PBS -l application=application_name

Or to your qsub command:

qsub -l application=application_name

where application_name is chosen from a list of acceptable names that is maintained
in $SAMPLES_HOME/Application_Name/application_names on each system.

6.1.2. Job Name

The "-N" directive allows you to designate a name for your job. In addition to being easier to remember than a
numeric job ID, the PBS environment variable, $PBS_JOBNAME, inherits this value and can be used instead of the job
ID to create job-specific output directories. To use this directive, add a line in the following form to your batch script:

#PBS -N job_20

Or to your qsub command:

qsub -N job_20 ...
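
As an illustration of the job-specific output directory idea, the sketch below builds a directory path from $PBS_JOBNAME. The helper name job_output_dir and the fallback name unnamed_job are hypothetical; inside a real job PBS sets $PBS_JOBNAME for you, and it is set by hand here only so the example can run outside PBS.

```shell
#!/bin/bash
# job_output_dir BASE
# Print a job-specific directory path under BASE, named after $PBS_JOBNAME.
# Falls back to "unnamed_job" (a hypothetical placeholder) outside of PBS.
job_output_dir() {
    local base=$1
    echo "${base}/${PBS_JOBNAME:-unnamed_job}"
}

PBS_JOBNAME=job_20     # normally set by PBS; assigned here only for the demo
outdir=$(job_output_dir "${WORKDIR:-/tmp}")
echo "results would be written to: ${outdir}"
# In a real batch script you would then create and enter it:
#   mkdir -p "${outdir}" && cd "${outdir}"
```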

6.2. Job Environment Directives

Job environment directives allow you to control the environment in which your script will operate. The following table
contains a few useful job environment directives.

Job Environment Directives
Directive   Options         Description
-I                          Request an interactive batch shell.
-V                          Export all environment variables to the job.
-v          variable_list   Export specific environment variables to the job.

6.2.1. Interactive Batch Shell

The "-I" directive allows you to request an interactive batch shell. Within that shell, you can perform normal Unix
commands, including launching parallel jobs. To use "-I", append it to the end of your qsub request. For example, the
qsub command below requests 2 nodes (a total of 72 cores) for 1 hour.

qsub -A Project_ID -q debug -l select=2:ncpus=36:mpiprocs=36 -l walltime=1:00:00 -I

6.2.2. Export All Variables

The "-V" directive tells PBS to export all of the environment variables from your login environment into your batch
environment. To use this directive, add a line in the following form to your batch script:

#PBS -V

Or to your qsub command:

qsub -V ...

6.2.3. Export Specific Variables

The "-v" directive tells PBS to export specific environment variables from your login environment into your batch
environment. To use this directive, add a line in the following form to your batch script:

#PBS -v my_variable

Or to your qsub command:

qsub -v my_variable

Using either of these methods, multiple comma-separated variables can be included. It is also possible to set values
for variables exported in this way, as follows:

qsub -v my_variable=my_value,...

6.3. Reporting Directives

Reporting directives allow you to control what happens to standard output and standard error messages generated by
your script. They also allow you to specify email options to be executed at the beginning and end of your job.

6.3.1. Redirecting Stdout and Stderr

By default, messages written to stdout and stderr are captured for you in files named x.ojob_id and x.ejob_id,
respectively, where x is either the name of the script or the name specified with the "-N" directive, and job_id is the
ID of the job. If you want to change this behavior, the "-o" and "-e" directives allow you to redirect stdout and stderr
messages to different named files. The "-j" directive allows you to combine stdout and stderr into the same file.

Redirection Directives
Directive   Options    Description
-e          filename   Define standard error file.
-o          filename   Define standard output file.
-j          oe         Merge stderr and stdout into stdout.
-j          eo         Merge stderr and stdout into stderr.

6.3.2. Setting up Email Alerts

Many users want to be notified when their jobs begin and end. The "-m" directive makes this possible. If you use this
directive, you will also need to supply the "-M" directive with one or more email addresses to be used.

Email Directives
Directive   Options             Description
-m          b                   Send email when the job begins.
-m          e                   Send email when the job ends.
-M          Email address(es)   Send mail to address(es).

For example:

#PBS -m be
#PBS -M joesmith@gmail.com,joe.smith@us.army.mil

6.4. Job Dependency Directives

Job dependency directives allow you to specify dependencies that your job may have on other jobs. This allows
users to control the order jobs run in. These directives will generally take the following form:

#PBS -W depend=dependency_expression

where dependency_expression is a comma-delimited list of one or more dependencies, and each dependency is of
the form:

type:jobids

where type is one of the directives listed below, and jobids is a colon-delimited list of one or more job IDs that your
job is dependent upon.

Job Dependency Directives
Directive     Description
after         Execute this job after listed jobs have begun.
afterok       Execute this job after listed jobs have terminated without error.
afternotok    Execute this job after listed jobs have terminated with an error.
afterany      Execute this job after listed jobs have terminated for any reason.
before        Listed jobs may be run after this job begins execution.
beforeok      Listed jobs may be run after this job terminates without error.
beforenotok   Listed jobs may be run after this job terminates with an error.
beforeany     Listed jobs may be run after this job terminates for any reason.

For example, run a job after completion (success or failure) of job ID 1234:

#PBS -W depend=afterany:1234

Or, run a job after successful completion of job ID 1234:

#PBS -W depend=afterok:1234

For more information about job dependencies, see the qsub man page.
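
When a job depends on several other jobs, the job IDs are joined with colons as described above. The following sketch only assembles the directive text; depend_directive is a hypothetical helper name, and the job IDs 1234 and 1235 are made up for illustration.

```shell
#!/bin/bash
# depend_directive TYPE JOBID [JOBID...]
# Print a dependency directive with the job IDs joined by colons.
# (Hypothetical helper, not a PBS command.)
depend_directive() {
    local type=$1
    shift
    local ids
    ids=$(IFS=:; echo "$*")    # "$*" joins arguments with the first character of IFS
    echo "#PBS -W depend=${type}:${ids}"
}

depend_directive afterok 1234 1235   # run only if both jobs finish without error
depend_directive afterany 1234       # run once job 1234 ends for any reason
```

Following the command-line rule from Submitting Your Job, the same expression can be given to qsub without the #PBS prefix, for example "qsub -W depend=afterok:1234 run.pbs".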

7. Environment Variables

7.1. PBS Environment Variables

While there are many PBS environment variables, you only need to know a few important ones to get started using
PBS. The table below lists the most important PBS environment variables and how you might generally use them.

Frequently Used PBS Environment Variables
PBS Variable       Description
$PBS_JOBID         Job identifier assigned to job or job array by the batch system.
$PBS_O_WORKDIR     The absolute path of directory where qsub was executed.
$PBS_JOBNAME       The job name supplied by the user.

The following additional PBS variables may be useful to some users.

Other PBS Environment Variables
PBS Variable       Description
$PBS_ARRAY_INDEX   Index number of subjob in job array.
$PBS_ENVIRONMENT   Indicates job type: PBS_BATCH or PBS_INTERACTIVE.
$PBS_NODEFILE      Filename containing a list of vnodes assigned to the job.
$PBS_O_HOST        Host name on which the qsub command was executed.
$PBS_O_PATH        Value of PATH from submission environment.
$PBS_O_SHELL       Value of SHELL from submission environment.
$PBS_QUEUE         The name of the queue from which the job is executed.

7.2. Other Important Environment Variables

In addition to the PBS environment variables, the table below lists a few other variables which are not generally
required, but may be important depending on your job.

Other Important Environment Variables
Variable               Description
$OMP_NUM_THREADS       The number of OpenMP threads per node.
$MPI_DSM_DISTRIBUTE    Ensures that memory is assigned closest to the physical core where each MPI
                       process is running.
$BC_CORES_PER_NODE     The number of cores per node for the compute node on which a job is running.
$BC_MEM_PER_NODE       The approximate maximum user-accessible memory per node (in integer MBytes)
                       for the compute node on which a job is running.
$BC_MPI_TASKS_ALLOC    The number of MPI tasks allocated for a job.
$BC_NODE_ALLOC         The number of nodes allocated for a job.

8. Example Scripts

All of the script examples shown below contain a "Cleanup" section which demonstrates how to automatically
archive your data using the transfer queue and clean up your $WORKDIR after your job completes. Using this method
helps to avoid data loss, and ensures that your allocation is not charged for idle cores while performing file transfer
operations.

8.1. MPI Script

The following script is for a 288-core MPI job running for 20 hours in the standard queue. To run a 288-core job, we
need 8 nodes with 36 cores each.

Note the use of the $BC_MPI_TASKS_ALLOC variable to define the number of MPI processes to start.

#!/bin/ksh
## Required Directives
#PBS -l select=8:ncpus=36:mpiprocs=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID

## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M my_email@email.com
#PBS -m be

## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}

# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"

# copy the executable from $HOME
cp ${HOME}/my_prog.exe .

# Launching
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} ./my_prog.exe > my_prog.out

# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
cd ${WORKDIR}

# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}

# Create a directory in $ARCHIVE_HOME named after this
# PBS job ID and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}

# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END

# Submit the archive job script.
qsub archive_job

8.2. MPI Script (accessing more memory per process)

By default, an MPI job runs one process per core, with all processes sharing the available memory on the node. If
you need more memory per process, then your job needs to run fewer MPI processes per node.
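
The trade-off can be estimated by dividing the memory on a node by the number of processes placed on it. The sketch below is an illustration only: it assumes roughly 126 GBytes of usable memory per node (the figure this section quotes for Thunder, written here as 126 * 1024 MBytes), and mem_per_process_mb is a hypothetical helper name. Inside a job, $BC_MEM_PER_NODE would be the natural source for the node memory value.

```shell
#!/bin/bash
# mem_per_process_mb NODE_MEM_MB PROCS_PER_NODE
# Approximate memory available to each process, in whole MBytes.
# (Hypothetical helper, not a PBS command.)
mem_per_process_mb() {
    echo $(( $1 / $2 ))
}

node_mem_mb=$(( 126 * 1024 ))   # assumption: about 126 GBytes usable per node

for procs in 36 18 9 1; do
    echo "mpiprocs=${procs}: about $(mem_per_process_mb "$node_mem_mb" "$procs") MBytes per process"
done
```

Fewer processes per node means more memory for each one, at the cost of leaving cores idle.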

The following script requests 8 nodes (288 cores) but uses only one core per node. This starts 8 MPI processes,
each with access to about 126 GBytes of memory. The job runs for 20 hours in the standard queue.

Note the use of the $BC_MPI_TASKS_ALLOC environment variable to define the number of MPI processes to start.

#!/bin/ksh
## Required Directives
#PBS -l select=8:ncpus=36:mpiprocs=1
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID

## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M my_email@email.com
#PBS -m be

## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}

# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"

# copy the executable from $HOME
cp ${HOME}/my_prog.exe .

# Launching
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} ./my_prog.exe > my_prog.out

# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}

# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}

# Create a directory in $ARCHIVE_HOME named after this
# PBS job ID and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}

# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END

# Submit the archive job script.
qsub archive_job

8.3. OpenMP Script

The following script is for an OpenMP job using one thread per core on a single node and running for 20 hours in the
standard queue. The number of OpenMP threads is set using the $OMP_NUM_THREADS environment variable, which is
set by PBS using the ompthreads option in your select statement:

#PBS -l select=1:ncpus=36:mpiprocs=36:ompthreads=36

To start your application with fewer than 36 threads, set ompthreads to a number less than 36:

#PBS -l select=1:ncpus=36:mpiprocs=36:ompthreads=N

N is the number of threads (up to 36) that you wish to run on.

#!/bin/ksh
## Required Directives
#PBS -l select=1:ncpus=36:mpiprocs=36:ompthreads=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID

## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M my_email@email.com
#PBS -m be

## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}

# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"

# copy the executable from $HOME
cp ${HOME}/my_prog.exe .

# Launching
# The executable name is assumed to be ./my_prog.exe, as in the other examples.
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} ./my_prog.exe > my_prog.out

# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}

# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}

# Create a directory in $ARCHIVE_HOME named after this
# PBS job ID and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}

# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END

# Submit the archive job script.
qsub archive_job

8.4. SHMEM Script

The following script is for a 36-core SHMEM job running for 20 hours in the standard queue.

The script requests 1 node, with 36 cores. Since each SHMEM thread requires access to the same pool of memory
as each other thread, these jobs are limited to a single node of Thunder, which is 36 cores.

Note the use of the $BC_CORES_PER_NODE environment variable to set the number of processes to start. To start
your application with fewer than 36 threads, replace $BC_CORES_PER_NODE with a lower value, like so:

export BC_CORES_PER_NODE=N

N is the number of threads (fewer than 36) that you wish to run on.

#!/bin/ksh
## Required Directives
#PBS -l select=1:ncpus=36:mpiprocs=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID

## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M my_email@email.com
#PBS -m be

## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}

# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"

# copy the executable from $HOME
cp ${HOME}/my_prog.exe .

# Launching
mpiexec_mpt -n ${BC_CORES_PER_NODE} ./my_prog.exe > my_prog.out

# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}

# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}

# Create a directory in $ARCHIVE_HOME named after this
# PBS job ID and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}

# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END

# Submit the archive job script.
qsub archive_job

8.5. Hybrid MPI/OpenMP Script

The following script uses 8 nodes (288 cores) with one MPI task per node and one thread per core. The number of
threads per MPI process will equal the number of cores per node.

Note the use of the select statement below to set both $BC_MPI_TASKS_ALLOC (mpiprocs=1)
and $OMP_NUM_THREADS (ompthreads=36).

To start your application with fewer than 36 threads, set ompthreads to a number less than 36:

#PBS -l select=1:ncpus=36:mpiprocs=1:ompthreads=N

N is the number of threads (up to 36) that you wish to run on.

#!/bin/ksh
## Required Directives
#PBS -l select=8:ncpus=36:mpiprocs=1:ompthreads=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID

## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M my_email@email.com
#PBS -m be

## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}

# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"

# copy the executable from $HOME
cp ${HOME}/my_prog.exe .

# Launching
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} omplace ./my_prog.exe > my_prog.out
# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}

# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}

# Create a directory in $ARCHIVE_HOME named after this
# PBS job ID and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}

# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END

# Submit the archive job script.
qsub archive_job
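
The hybrid script above places one MPI task on each node and gives it all 36 cores as threads. If you place more than one MPI task per node, a common choice is to divide the cores evenly among them. The sketch below illustrates that arithmetic only; it goes slightly beyond the example in this guide, ompthreads_for is a hypothetical helper name, and the 36-core default is the standard compute node value given earlier.

```shell
#!/bin/bash
# ompthreads_for PROCS_PER_NODE [CORES_PER_NODE]
# Threads per MPI task when the cores on a node are divided evenly.
# (Hypothetical helper, not a PBS command; cores default to 36.)
ompthreads_for() {
    local procs_per_node=$1 cores=${2:-36}
    echo $(( cores / procs_per_node ))
}

# Print candidate select directives for 8 nodes with 1, 2, or 4 MPI tasks per node.
for procs in 1 2 4; do
    echo "#PBS -l select=8:ncpus=36:mpiprocs=${procs}:ompthreads=$(ompthreads_for "$procs")"
done
```

In each case mpiprocs multiplied by ompthreads equals the 36 cores on the node, so no core is oversubscribed.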
