
1 / 64

Improving Java Performance

Sun Learning Service


October 2008, Consultant Eun-su Jeon
2 / 64

Agenda
1. Changes in computing paradigms and the evolution of Java technology
2. Where performance problems can arise
3. Structure of the JVM
4. Choosing the Execution Engine
5. J2SE 5, Java SE 6 Performance features
6. Tuning the JVM
7. Q&A
3 / 64

1. Changes in computing paradigms and the evolution of Java technology
4 / 64

-Mainframe era
5 / 64

-Client/Server era

server
client
6 / 64

-Web era

HTTP HTML

Web Server
7 / 64

-Web era (CGI)

HTML CGI
HTTP

Web Server
8 / 64

-Web era (Servlet)

Servlet
HTML
DBMS
HTTP
Java Bean

Web Container
Web Server
9 / 64

-Web era (Model 2 Architecture)

HTML
Servlet
DBMS
HTTP
Java Bean
JSP

Web Container
Web Server
10 / 64

-Web era (EJB, Web Service…)

Application Server
Servlet
HTML
DBMS
HTTP
Java Bean
JSP
Web Container
Web Server
B2B
11 / 64

-JavaSE
12 / 64

-JavaEE
13 / 64

-Multi Tier Architecture


[Figure: multi-tier architecture. Client-tier: Applet, HTML, or standalone application on Java SE, any Web browser, any O/S, any H/W. Presentation-tier: Servlet and JSP in a Java EE Servlet & JSP container on Java SE, any Web server, any O/S, any H/W. Business-tier: Java Beans and EJB business application in a Java EE EJB container on Java SE, any O/S, any H/W. Integration-tier: unspecified (?). Resource/EIS-tier: DBMS.]
14 / 64

2. Where performance problems can arise


[Figure: the multi-tier architecture diagram (Client-tier, Presentation-tier, Business-tier, Integration-tier, Resource/EIS-tier) with "latency" marked at every layer and between every pair of tiers]
15 / 64

-only this point!


[Figure: the multi-tier architecture diagram (Client-tier, Presentation-tier, Business-tier, Integration-tier, Resource/EIS-tier) with a single "latency" label remaining, next to the Java SE layer beneath the Java EE containers]
16 / 64

3. Structure of the JVM
17 / 64

-execution code & interfaces


18 / 64

-Runtime Data Area


19 / 64

4. Choosing the Execution Engine

HotSpot Client Compiler
HotSpot Server Compiler
20 / 64

-Characteristics of the Execution Engines
• The Java HotSpot™ Client Compiler
• Two-phase compiler
• Phase 1 – Constructs intermediate representation (IR) of bytecode
• Phase 2 – Generates machine code from IR
• The Java HotSpot Server Compiler
• High-end optimizing compiler
• Uses advanced static single assignment-based IR optimizations
• Performs all classic optimizations
• Performs full inlining and full deoptimizations
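To check which of the two compilers a running JVM actually selected, the VM's own system properties can be read. This is a minimal sketch that relies only on the standard `java.vm.name` and `java.vm.info` system properties. The class name `VmInfo` is hypothetical, and the exact strings vary by JVM vendor and version.

```java
final class VmInfo {
    /** Builds a one-line description of the running VM from standard system properties. */
    static String describe() {
        String name = System.getProperty("java.vm.name"); // on HotSpot, usually mentions Client VM or Server VM
        String info = System.getProperty("java.vm.info"); // for example, the execution mode
        return name + " (" + info + ")";
    }

    public static void main(String[] args) {
        // Run once with -client and once with -server (where both are available) to compare.
        System.out.println(describe());
    }
}
```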
21 / 64

- Sun is always improving Java™ Performance

HotSpot Server Compiler

Any O/S (except the Microsoft Windows i586 platform)

Any H/W: 2 GB RAM and 2 CPUs or more
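The rule on this slide can be written as a small predicate. This is only a sketch of the slide's wording, not the JVM's actual detection code. The class and method names are hypothetical, and the RAM figure in `main` is an assumed value, because the standard `Runtime` API does not report physical memory.

```java
final class ServerClassCheck {
    static final long TWO_GB = 2L * 1024 * 1024 * 1024;

    /** Slide rule: not Windows i586, at least 2 CPUs, at least 2 GB of RAM. */
    static boolean prefersServerCompiler(boolean windowsI586, int cpus, long ramBytes) {
        return !windowsI586 && cpus >= 2 && ramBytes >= TWO_GB;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        long assumedRam = 4L * 1024 * 1024 * 1024; // hypothetical: pretend the machine has 4 GB
        boolean windowsI586 = System.getProperty("os.name").startsWith("Windows")
                && System.getProperty("os.arch").equals("x86");
        System.out.println("Server compiler expected: "
                + prefersServerCompiler(windowsI586, cpus, assumedRam));
    }
}
```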
22 / 64

5. J2SE 5, Java SE 6 Performance features
J2SE 5 | Java SE 6
VM performance improved by at least 20% | Optimized memory access
Application startup time shortened by Class Data Sharing | Parallel Compacting Collector
Improved Garbage Collector performance | Large Page support through -XX:+UseLargePages
23 / 64

- between Java SE 5 and Java SE 6 Update 2

This test was conducted on a Sun Fire V890 with 24 x 1.5 GHz UltraSPARC CPUs and 64 GB RAM running Solaris 10.
24 / 64

-Runtime Data Area


25 / 64

- JVM memory areas
26 / 64

- Heap Space With Generations
27 / 64

- Non-generational GC
• The most straightforward GC just iterates over every object in the heap and determines whether any other objects reference it
> Non-generational GC
> This gets really slow as the number of objects in the heap increases
> This does not take advantage of the characteristics of typical objects
• Hence the reason for Generational GC
28 / 64

- Empirical Statistics
• Most objects are very short lived
> 80-98% of all newly allocated objects die within a few million instructions
> 80-98% of all newly allocated objects die before another megabyte has been allocated
29 / 64

- Empirical Statistics
Keep young and old objects separate
• Use different GC algorithms for each generation
> Different requirements on each of these object groups
30 / 64

-minor collection

Objects still alive in Eden are moved to Survivor 1, and the Eden area is cleared.
31 / 64

-minor collection

Objects still alive in Eden and objects still alive in Survivor 1 are copied to Survivor 2.
The Eden and Survivor 1 areas are then cleared.
32 / 64

-minor collection

Once enough time has passed since an object was created, the old objects in the Eden and Survivor areas are moved to the Old area.
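The three minor-collection steps can be mimicked with a toy model. This sketch is not how HotSpot is implemented. Liveness is set by hand rather than traced from roots, "age" is counted in survived collections, and all names (`MinorGcSketch`, `Obj`, `tenuringThreshold`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MinorGcSketch {
    /** A toy heap object whose liveness is toggled manually. */
    static final class Obj {
        final String name;
        boolean alive = true;
        int age = 0; // number of minor collections survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> survivorFrom = new ArrayList<>(); // survivor space currently holding objects
    List<Obj> survivorTo = new ArrayList<>();   // empty survivor space that receives copies
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    MinorGcSketch(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o); // new objects always start in Eden
        return o;
    }

    /** Copies live objects out of Eden and the occupied survivor space, then clears both. */
    void minorCollect() {
        evacuate(eden);
        evacuate(survivorFrom);
        eden.clear();
        survivorFrom.clear();
        List<Obj> tmp = survivorFrom; // swap roles of the two survivor spaces
        survivorFrom = survivorTo;
        survivorTo = tmp;
    }

    private void evacuate(List<Obj> space) {
        for (Obj o : space) {
            if (!o.alive) continue;   // dead objects are simply not copied
            o.age++;
            if (o.age >= tenuringThreshold) old.add(o); // old enough: promote
            else survivorTo.add(o);
        }
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch(2);
        heap.allocate("a");
        heap.allocate("b").alive = false;
        heap.minorCollect();
        System.out.println("survivor=" + heap.survivorFrom.size() + " old=" + heap.old.size());
        heap.minorCollect();
        System.out.println("survivor=" + heap.survivorFrom.size() + " old=" + heap.old.size());
    }
}
```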
33 / 64

- Major collection
= Full GC
Mark Sweep Compact
- "stop the world"
34 / 64

-OLTP
[Figure: the multi-tier architecture diagram (Client-tier, Presentation-tier, Business-tier, Integration-tier, Resource/EIS-tier), with HTTP between the Client-tier and the Presentation-tier]
35 / 64

6. Tuning the JVM

Primary tuning tasks


1. Select the appropriate HotSpot compiler
2. Select the garbage collector
3. Configure the garbage collector
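Before selecting or configuring a collector, it helps to confirm which collectors the JVM is currently running. The following sketch uses the standard `java.lang.management` API. The class name `ActiveCollectors` is hypothetical, and the collector names it reports depend on the JVM and its options.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    /** Returns the names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically one entry for the young generation and one for the old generation.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```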
36 / 64

- Various GC algorithms


1. Default Collector (serial collector)
2. Parallel GC for young generation
3. Mostly Concurrent GC for old generation
4. Incremental GC (Train GC)
5. Parallel Compaction GC
37 / 64

- Default GC(Serial GC)


38 / 64

- Parallel GC
39 / 64

- Mostly Concurrent GC
40 / 64

- JVM Options

Standard options
Non-standard options: start with -X
Hidden options: start with -XX

http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp
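The three option families can be told apart by prefix alone. The sketch below classifies the options that the current JVM was started with, using the standard `RuntimeMXBean.getInputArguments()` call. The class name `JvmOptionKinds` and the label strings are hypothetical.

```java
import java.lang.management.ManagementFactory;

final class JvmOptionKinds {
    /** Classifies an option by prefix: -XX is checked first because it also starts with -X. */
    static String classify(String option) {
        if (option.startsWith("-XX")) return "hidden";
        if (option.startsWith("-X")) return "non-standard";
        return "standard";
    }

    public static void main(String[] args) {
        // Lists the options passed to this JVM, for example -Xmx256m or -XX:+PrintGC.
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            System.out.println(classify(arg) + ": " + arg);
        }
    }
}
```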
41 / 64

- JVM Options-standard(1/2)
-client, -server, -hotspot
-agentlib:libname[=options]
-agentpath:pathname[=options]
-classpath|-cp classpath
-Dproperty=value
-d32, -d64
-enableassertions|-ea[:<packagename>...|:<classname>]
-disableassertions|-da[:<packagename>...|:<classname>]
-enablesystemassertions|-esa
42 / 64

- JVM Options-standard(2/2)
-disablesystemassertions|-dsa
-jar
-javaagent:jarpath[=options]
-verbose[:class|gc|jni]
-version, -showversion
43 / 64

-JVM Options-non standard(1/2)


-Xmixed, -Xcomp, -Xint
-Xbatch
-Xdebug
-Xbootclasspath:bootclasspath
-Xbootclasspath/a:path, -Xbootclasspath/p:path
-Xcheck:jni
-Xfuture
-Xnoclassgc
-Xincgc, -Xnoincgc
44 / 64

-JVM Options-non standard(2/2)


-Xloggc:file
-Xmsn, -Xmxn, -Xmnn
-Xprof
-Xrunhprof[:help][:<suboption>=<value>,...]
-Xrs
-Xshare:[on|off|auto]
-Xssn
-Xmaxjitcodesizen
-XX:+UseAltSigs
45 / 64

-JVM Options-hidden(1/2)
-XX:MaxHeapFreeRatio=n, -XX:MinHeapFreeRatio=n
-XX:ParallelGCThreads=n
-XX:+UseBoundThreads
-XX:+PrintGC, -XX:+PrintGCDetails
-XX:+DisableExplicitGC
-XX:+PrintHeapAtGC
-XX:+PrintCompilation
-XX:CompileOnly=ClassName.methodName
-XX:+ShowMessageBoxOnError
-XX:+UseParNewGC, -XX:+UseParallelGC
-XX:+UseConcMarkSweepGC
46 / 64

- JVM Options-hidden(2/2)
-XX:CMSInitiatingOccupancyFraction=x
-XX:+RecordMarkSweepCompaction
-XX:+AggressiveHeap
-XX:+PrintTenuringDistribution
-XX:+PrintGCTimeStamps
-XX:NewRatio=n, -XX:SurvivorRatio=n
-XX:NewSize=n, -XX:MaxNewSize=n
-XX:MaxPermSize=n
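The effect of -XX:NewRatio and -XX:SurvivorRatio on the heap layout can be worked out by hand. The sketch below assumes the usual HotSpot meanings: NewRatio is old size divided by young size, and SurvivorRatio is Eden size divided by the size of one survivor space. It ignores the alignment and rounding a real JVM applies, and the class and method names are hypothetical.

```java
final class HeapGeometry {
    /** Young generation size when old:young = newRatio:1. */
    static double youngMb(double heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Size of one survivor space when eden:survivor = survivorRatio:1 (two survivor spaces). */
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Eden size under the same assumption. */
    static double edenMb(double youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        double young = youngMb(256, 4); // a 256 MB heap with -XX:NewRatio=4
        System.out.println("young=" + young + " MB, eden=" + edenMb(young, 2)
                + " MB, each survivor=" + survivorMb(young, 2) + " MB");
    }
}
```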
47 / 64

- Example 1
48 / 64

- Example 1
1. Set the total heap size to about ¼ of physical memory
2. Client app → young:old = 1:4 (-XX:NewRatio=4)
Server app → young:old = 1:10 (-XX:NewRatio=10)
3. Tune NewRatio so that the overall GC ratio stays below 2%
(GC ratio (%) = Application stopped time / Application concurrent time * 100)
VS.
-XX:GCTimeRatio=n
GC time : application time = 1/(1+n)

-Xmx256m -Xms256m -XX:SurvivorRatio=2 -XX:NewRatio=4 -XX:+PrintGC -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime BezierAnim
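The two measures on this slide are simple arithmetic. The sketch below computes the GC ratio from the totals reported by -XX:+PrintGCApplicationStoppedTime and -XX:+PrintGCApplicationConcurrentTime, and the goal implied by -XX:GCTimeRatio=n. The class and method names are hypothetical, and the inputs in `main` are made-up figures, not measurements from the test.

```java
final class GcOverhead {
    /** GC ratio (%) = application stopped time / application concurrent time * 100. */
    static double gcPercent(double stoppedSeconds, double concurrentSeconds) {
        return stoppedSeconds / concurrentSeconds * 100.0;
    }

    /** Ratio of GC time to application time requested by -XX:GCTimeRatio=n, that is 1/(1+n). */
    static double gcTimeRatioGoal(int n) {
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        // Hypothetical totals: 1.5 s stopped against 100 s of application time.
        System.out.println("GC ratio: " + gcPercent(1.5, 100.0) + " %");
        // n = 49 corresponds to the 2% target mentioned in step 3.
        System.out.println("Goal for n=49: " + gcTimeRatioGoal(49) * 100.0 + " %");
    }
}
```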
49 / 64

- Example 1

Survivor ratio | New ratio | GC count per minute | GC ratio (%)
auto | auto | more than 100 | 0.204
2 | 4 | 73 | 0.356
2 | 10 | more than 100 | 0.273
8 | 4 | 45 | 0.336
50 / 64

- Example 1
51 / 64

- Example 2
52 / 64

- Example 2
[Figure: test setup across the Client-tier, Presentation-tier (Servlet, JSP, Java Bean), and Resource/EIS-tier (DBMS)]

Tomcat 6.0.18
JDK1.6.0_07 (SerialGC)
Windows 32bit
AMD Athlon 1.81GHz 1GB RAM

53 / 64

- Example 2
32bit windows platform

[Chart: TPS (0 to 300) versus active clients. Series: no options; -Xmx256m -Xms256m -XX:SurvivorRatio=8 -XX:NewRatio=4]
54 / 64

- Example 2
GC count

[Chart: GC count (0 to 1200) versus active clients. Series: auto; after heap tuning]
55 / 64

- Example 2
GC ratio

[Chart: GC ratio (0 to 20) versus active clients. Series: auto; after heap tuning]
56 / 64

- Example 3
[Figure: test setup across the Client-tier, Presentation-tier (Servlet, JSP, Java Bean), and Resource/EIS-tier (DBMS)]

Tomcat 6.0.18
JDK1.6.0_07 (SerialGC)
64bit Solaris OS 10
T2000: T1 CPU (8 cores x 4 threads = 32), 1.2GHz, 64GB

57 / 64

- Example 3

SerialGC

[Chart: TPS (0 to 600) versus active clients. Series: 1CPU: 32bit windows platform; 1CPU 8core * 4thread: 64bit sparc platform]
58 / 64

- Example 3
T2000 SerialGC before and after heap tuning

[Chart: TPS (0 to 600) versus active clients. Series: auto; after heap tuning]

serialGC -Xmx256m -Xms256m -XX:SurvivorRatio=2 -XX:NewRatio=4


59 / 64

-Amdahl's law
is concerned with the speedup achievable from an improvement to a computation that affects a fraction F of that computation, where the improvement has a speedup of S:

\[ \frac{1}{(1 - F) + \frac{F}{S}} \]

In the special case of parallelization, Amdahl's law states that if F is the fraction of a calculation that is sequential, and (1-F) is the fraction that can be parallelised, then the maximum speedup that can be achieved by using P processors is

\[ \frac{1}{F + \frac{1 - F}{P}} \]
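Both forms of the law are easy to evaluate numerically. The sketch below uses hypothetical class and method names. The 32 in `main` matches the T2000's 8 cores x 4 threads, while the sequential fractions are illustrative values, not measurements.

```java
final class Amdahl {
    /** General form: a fraction f of the work is sped up by a factor s. */
    static double speedup(double f, double s) {
        return 1.0 / ((1.0 - f) + f / s);
    }

    /** Parallel form: sequentialFraction cannot be parallelised, the rest runs on p processors. */
    static double parallelSpeedup(double sequentialFraction, int p) {
        return 1.0 / (sequentialFraction + (1.0 - sequentialFraction) / p);
    }

    public static void main(String[] args) {
        // Illustrative sequential fractions on 32 hardware threads.
        for (double f : new double[] {0.05, 0.25, 0.50}) {
            System.out.println("sequential=" + f + " -> speedup " + parallelSpeedup(f, 32));
        }
    }
}
```

Even a small sequential part caps the gain: with half the work sequential, the speedup stays below 2 however many processors are added.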
60 / 64

-Amdahl’s law
61 / 64

- Example 3
T2000 SerialGC VS ParallelGC

[Chart: TPS (0 to 700) versus active clients. Series: serialGC; parallelGC]
62 / 64

- Example 3

By: Peter Lin


63 / 64

- Example 3
Overall comparison

[Chart: TPS (0 to 700) versus active clients. Series: AMD; T2000 SerialGC, no options; T2000 SerialGC, after heap tuning; T2000 ParallelGC]
64 / 64

- Application Requirement

• Different applications have different requirements
> Higher throughput is more important for Web applications: pauses during garbage collection may be tolerable, or simply obscured by network latencies.
> Shorter pause time is more important to an interactive graphics application
65 / 64

7. Q&A
