Development of GPUs as a co-processing accelerator for x86 CPUs
HPC Evolution of GPUs
2004: Began strategic investments in GPU as HPC co-processor
2006: G80, first GPU with built-in compute features, 128 cores; CUDA SDK Beta
2007: Tesla 8-series based on G80, 128 cores; CUDA 1.0, 1.1
2008: Tesla 10-series based on GT200, 240 cores; CUDA 2.0, 2.3
2009: Tesla 20-series, code-named Fermi, up to 512 cores; CUDA SDK 3.0, 3.2
ANSYS 2011 Regional Conferences | 25 Aug 2011 | Palm Beach Gardens, FL
3 Years with 3 Generations
University of Cambridge DARWIN Cluster
CUDA Center of Excellence since 2008
GPU subcluster:
Dell T5500 servers, 32 dual-socket CPUs
Tesla S1070 GPUs, 4 GPUs per socket, for total 128 GPUs
Turbostream simulation speedup: ~19x
www.hpc.cam.ac.uk/services/darwin.html
Disciplines: Structural Mechanics | Fluid Dynamics | Electromagnetics

Available Today:
  ANSYS Mechanical 13: SMP, single GPU
  ANSYS Nexxim (signal integrity)
Updates for 2011:
  ANSYS Mechanical 14: DMP, improved PCG
  ANSYS CFD 14: radiation HT (beta)
Product Evaluation:
  ANSYS Mechanical 15: multi-GPU, multi-node
  ANSYS CFD 15: solver, other models
  ANSYS HFSS
Research Evaluation:
  ANSYS Maxwell
NVIDIA Provides Business and Engineering Investments in ANSYS Technology Developments
Most ANSYS software employs a domain-parallel method
GPU computing fits this method, preserves DANSYS investments
ANSYS 13 focus was SMP solvers; ANSYS 14 focus is DANSYS solvers
ANSYS software is parallel and scales well for multi-core CPUs
Direct solvers use a scheme of computations on both GPU and CPU
Iterative solvers have computations on GPU, matrix assembly on CPU
Investigations include GPU performance against multi-core CPU only
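As a rough illustration of the iterative-solver split described above, here is a minimal pure-Python sketch of a Jacobi-preconditioned conjugate gradient (PCG) loop. This is not ANSYS code: the function names, the choice of a Jacobi preconditioner, and the dense list-of-lists matrix are all assumptions made for illustration. The comments mark which steps stand in for CPU-side matrix assembly and which are the vector and matrix-vector operations that a GPU implementation would typically take over.

```python
def matvec(A, x):
    # Matrix-vector product: the dominant cost per iteration and the
    # natural candidate for GPU offload in a real solver.
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    # Inner product: also a GPU-friendly reduction.
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A (illustrative only)."""
    n = len(b)
    # "Assembly" stage (CPU side in the scheme above): here A is already
    # given, so we only extract the Jacobi preconditioner M^-1 = 1/diag(A).
    inv_diag = [1.0 / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = list(b)                      # residual for the zero initial guess
    if dot(r, r) ** 0.5 < tol:
        return x, 0
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)

    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            return x, it
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small hypothetical test system (not taken from the slides).
A_demo = [[4.0, 1.0], [1.0, 3.0]]
b_demo = [1.0, 2.0]
x_demo, iterations = pcg(A_demo, b_demo)
print(x_demo, iterations)
```

Every step inside the loop is a matrix-vector product, a dot product, or a vector update, which is why this class of solver maps well onto a GPU once the matrix has been assembled on the host.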
Accelerating System-Level Signal Integrity Simulation with GPU
Dr. Ekanathan Palamadai, ANSYS
ANSYS CFD preliminary results of radiation heat transfer view factor computation on GPUs vs. CPUs
Radiation HT Applications:
Underhood cooling
Cabin comfort HVAC
Furnace simulations
Solar loads on buildings
Combustor in turbine
Electronics passive cooling
Other ANSYS CFD Evaluations:
Models (e.g. disperse phase)
Implicit equation solvers
"This initial development for GPU computing demonstrates our focus on evolving ANSYS software to take advantage
of important technology trends in high-performance computing," said Dipankar Choudhury, vice president of
corporate product strategy and planning at ANSYS. "We work to achieve optimized software performance, across the
full spectrum of HPC technologies, so that our customers get maximum value from their investment in HPC. Here, our
technical collaboration with NVIDIA has resulted in a significant benefit for our mutual customers."
ANSYS Mechanical 14: Collaboration on DMP solvers, Q4 2011
GPU Solver Kernel Speedups
From NAFEMS World Congress, May 2011, Boston, MA, USA:
"Accelerate FEA Simulations with a GPU" by Jeff Beisheim, ANSYS
GPU Overall Simulation Speedups
System Configuration:
Xeon 5560, 2.8 GHz
2 sockets, 8 cores
32 GB memory
Win XP SP2 64-bit
Tesla C2050 GPU
Chart: Xeon 5670 2.93 GHz Westmere (dual socket) vs. Xeon 5670 2.93 GHz Westmere + Tesla C2075
(lower is better; vertical axis 0 to 3000, units not given in the extracted text; core counts grouped as 1 Socket and 2 Socket)

Cores      CPU only   CPU + Tesla C2075   GPU speedup
1 Core     1848       444                 4.2x
2 Core     1192       342                 3.5x
4 Core     846        314                 2.7x
6 Core     564        273                 2.1x
8 Core     516        270                 1.9x
12 Core    399

V13sp5 Model: turbine geometry; 2,100K DOF; SOLID187 FEs; static, nonlinear; one load step; direct sparse
Results from HP Z800 Workstation, 2x Xeon X5670 2.93 GHz, 48 GB memory, CentOS 5.4 x64; Tesla C2075, CUDA 4.0.17
Available Q4 2011
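The speedup labels on this chart can be checked with a few lines of arithmetic. The sketch below assumes a particular pairing of the extracted bar values with core counts (CPU-only bars 1848, 1192, 846, 564, 516, 399; CPU + Tesla C2075 bars 444, 342, 314, 273, 270); that pairing is inferred from the flattened slide text, not stated explicitly, and the variable names are illustrative.

```python
# Bar values as read from the slide; pairing with core counts is an assumption.
cpu_times = {1: 1848, 2: 1192, 4: 846, 6: 564, 8: 516, 12: 399}
gpu_times = {1: 444, 2: 342, 4: 314, 6: 273, 8: 270}  # no 12-core GPU bar


def speedup(baseline, accelerated):
    """Ratio of baseline time to accelerated time (higher is better)."""
    return baseline / accelerated


# GPU speedup at equal core count, rounded to one decimal like the slide labels.
gpu_speedups = {
    cores: round(speedup(cpu_times[cores], t), 1)
    for cores, t in gpu_times.items()
}

for cores, s in gpu_speedups.items():
    print(f"{cores} core(s): {cpu_times[cores]} -> {gpu_times[cores]}, {s}x")
```

Under this pairing the computed ratios match the five labels on the slide (4.2x, 3.5x, 2.7x, 2.1x, 1.9x), which supports the reading that each label compares CPU-only and CPU + GPU runs at the same core count.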
Chart: CPU Speedup, GPU Speedup, and Solution Cost (relative to baseline)

Configuration            Speedup   Solution cost
Base License, 2 Core     1.0       1.0

Solution Cost Basis: ANSYS base license, ANSYS HPC Pack, workstation, Tesla C2075
Performance Basis: V13sp5 Model: 2,100K DOF; SOLID187 FEs; static nonlinear; one load step; direct sparse
Chart: CPU Speedup, GPU Speedup, and Solution Cost (relative to baseline)

Configuration              Speedup     Solution cost
Base License, 2 Core       1.0         1.0
ANSYS HPC Pack, 6 Cores    2.1 (CPU)   1.23
ANSYS HPC Pack, 8 Cores    2.3 (CPU)   1.23

Solution Cost Basis: ANSYS base license, ANSYS HPC Pack, workstation, Tesla C2075
Performance Basis: V13sp5 Model: 2,100K DOF; SOLID187 FEs; static nonlinear; one load step; direct sparse
Chart: CPU Speedup, GPU Speedup, and Solution Cost (relative to baseline)

Configuration                    Speedup     Solution cost
Base License, 2 Core             1.0         1.0
ANSYS HPC Pack, 6 Cores          2.1 (CPU)   1.23
ANSYS HPC Pack, 8 Cores          2.3 (CPU)   1.23
ANSYS HPC Pack, 4 Cores + GPU    3.8 (GPU)   1.28
ANSYS HPC Pack, 6 Cores + GPU    4.4 (GPU)   1.28

Solution Cost Basis: ANSYS base license, ANSYS HPC Pack, workstation, Tesla C2075
Performance Basis: V13sp5 Model: 2,100K DOF; SOLID187 FEs; static nonlinear; one load step; direct sparse
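One way to read this chart is to divide each configuration's speedup by its relative solution cost. The sketch below does that under stated assumptions: the assignment of the values 2.1, 2.3, 3.8, 4.4 and costs 1.23, 1.28 to configurations is inferred from the flattened slide text, and "speedup per unit cost" is an illustrative metric introduced here, not one the slides define.

```python
# (label, speedup vs. 2-core baseline, relative solution cost)
# Pairing of values to configurations is an assumption inferred from the slide.
configs = [
    ("Base License, 2 Core", 1.0, 1.00),
    ("HPC Pack, 6 Cores", 2.1, 1.23),
    ("HPC Pack, 8 Cores", 2.3, 1.23),
    ("HPC Pack, 4 Cores + GPU", 3.8, 1.28),
    ("HPC Pack, 6 Cores + GPU", 4.4, 1.28),
]


def speedup_per_cost(speedup, cost):
    """Illustrative metric: performance gained per unit of relative cost."""
    return speedup / cost


ranked = sorted(
    ((label, speedup_per_cost(s, c)) for label, s, c in configs),
    key=lambda item: item[1],
    reverse=True,
)

for label, value in ranked:
    print(f"{label:26s} {value:.2f}")

best_label, best_value = ranked[0]
```

With these inputs the GPU configurations rank highest, because the relative cost rises only from 1.23 to 1.28 while the speedup rises from roughly 2x to roughly 4x.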
ANSYS Mechanical: large-deflection bending of PCBs
ANSYS Mechanical: comfort and fit of 3D emitter glasses
Workstations
Servers
Existing System
Existing System
Joint collaboration on ANSYS 13.0 is only the beginning
Collaboration ongoing in all disciplines of CSM, CFD and CEM
Learn more about ANSYS and NVIDIA GPU solution
More at: www.nvidia.com/object/teslaansysaccelerations.html
Want to try ANSYS on NVIDIA GPUs? Contact cae@nvidia.com