Java Virtual Machine (JVM) Java HotSpot Virtual Machine (HotSpot JVM)
Who We Are
Tony Printezis
GC Group / HotSpot JVM development team
Been working on the HotSpot JVM since 2006
10+ years of GC experience
Charlie Hunt
Java Platform Performance Engineering Group
Works with many Sun product teams and customers
10+ years of Java technology performance work
GC Tuning is an Art
Unfortunately, we can't give you a flawless recipe or a flowchart that will apply to all your GC tuning scenarios
GC tuning involves a lot of common pattern recognition
This pattern recognition requires experience
Agenda
[Figure: HotSpot heap layout. Young Generation (Eden, where new Object() allocations go, and Survivor Spaces), Old Generation, Permanent Generation]
Your Dream GC
Supersize it!
For both young and old generation
Larger space: less frequent GCs, lower GC overhead, objects more likely to become garbage
Smaller space: faster GCs (not always! see later)
Sometimes max heap size is dictated by available memory and/or max space the JVM can address
You have to find a good balance between young and old generation size
Young Generation
Dictates frequency of minor GCs
Dictates how many objects will be reclaimed in the young generation
Old Generation
Should comfortably hold the application's steady-state live size
Decrease the major GC frequency as much as possible
You should try to maximize the number of objects reclaimed in the young generation
This is probably the most important piece of advice when sizing a heap and/or tuning the young generation
Your application's memory footprint should not exceed the available physical memory
This is probably the second most important piece of advice when sizing a heap
-Xms<size> : initial heap size
-Xmx<size> : max heap size
-Xmn<size> : young generation size
Applications with emphasis on performance tend to set -Xms and -Xmx to the same value
When -Xms != -Xmx, heap growth or shrinking requires a Full GC
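One way to confirm what a running JVM actually received for its initial and maximum heap sizes is to ask the standard java.lang.management API. The sketch below is only an illustration: the class name HeapSizeCheck and the helper growthHeadroomBytes are hypothetical, and the values the JVM reports can differ slightly from the -Xms / -Xmx flags depending on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSizeCheck {
    /**
     * Bytes the heap may still grow beyond its initial size,
     * or -1 if the JVM does not report an initial or maximum size.
     * A result of 0 suggests the heap was started at its maximum size.
     */
    static long growthHeadroomBytes(MemoryUsage heap) {
        if (heap.getInit() < 0 || heap.getMax() < 0) {
            return -1;
        }
        return heap.getMax() - heap.getInit();
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("init=%d MB, max=%d MB, headroom=%d bytes%n",
                heap.getInit() >> 20, heap.getMax() >> 20, growthHeadroomBytes(heap));
    }
}
```

Running it with, for example, java -Xms512m -Xmx512m HeapSizeCheck should report little or no headroom, whereas a larger -Xmx leaves room for the heap to grow.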
Set -Xms to what you think would be your desired heap size
If memory allows, set -Xmx to something larger than -Xms just in case
Maybe the application is hit with more load
Maybe the DB gets larger over time
In most cases, it's better to do a Full GC and grow the heap than to get an OOM and crash
-XX:PermSize=<size> : permanent generation initial size
-XX:MaxPermSize=<size> : permanent generation max size
Applications with emphasis on performance almost always set -XX:PermSize and -XX:MaxPermSize to the same value
Newly-allocated objects in Eden start from age 0
Their age is incremented at every minor GC
Increasing the size of the Eden will not always affect minor GC times
Remember: minor GC times are proportional to the amount of objects they copy (i.e., the live objects), not the young generation size
Survivor Ratio
[Figure: object ages, from 0 (youngest) to oldest]
-XX:NewSize=<size> : initial young generation size
-XX:MaxNewSize=<size> : max young generation size
-XX:NewRatio=<ratio> : ratio of old generation size to young generation size (e.g., 2 makes the old generation twice the size of the young generation)
Applications with emphasis on performance tend to use -Xmn to size the young generation since it combines the use of -XX:NewSize and -XX:MaxNewSize
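To make the ratio arithmetic concrete, here is a minimal sketch of how the generation sizes fall out of the ratios. It assumes the usual reading of -XX:NewRatio (old : young) and of -XX:SurvivorRatio (eden : one survivor space); the latter flag is named here on the basis of the "Survivor Ratio" slides and is not spelled out in the text. The 3 GB heap and the class name GenerationSizes are hypothetical, and a real JVM aligns these sizes, so treat the results as approximations.

```java
public class GenerationSizes {
    /** Young generation size when old : young = newRatio : 1. */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** Size of one survivor space when eden : survivor = survivorRatio : 1 and there are two survivor spaces. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heap = 3072L << 20;                 // hypothetical 3 GB heap
        long young = youngBytes(heap, 2);        // NewRatio=2
        long survivor = survivorBytes(young, 6); // SurvivorRatio=6
        long eden = young - 2 * survivor;
        long old = heap - young;
        System.out.printf("young=%d MB, eden=%d MB, survivor=%d MB each, old=%d MB%n",
                young >> 20, eden >> 20, survivor >> 20, old >> 20);
    }
}
```

With these inputs the young generation is one third of the heap, and each survivor space is one eighth of the young generation.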
Tenuring
-XX:TargetSurvivorRatio=<percent>, e.g., 50
-XX:MaxTenuringThreshold=<threshold>
-XX:+AlwaysTenure
Never keep any objects in the survivor spaces
Very bad idea!
-XX:+NeverTenure
Try to retain as many objects as possible in the survivor spaces so that they can be reclaimed in the young generation
Less promotion into the old generation
Less frequent old GCs
But also, try not to unnecessarily copy very long-lived objects between the survivors
Unnecessary overhead on minor GCs
Generally: better to copy more than to promote more
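The interplay between object ages, the target survivor occupancy, and the max tenuring threshold can be sketched as a small calculation. The code below is a simplified model, not the JVM's actual code: it assumes the threshold is the first age at which the cumulative size of surviving objects exceeds the target share of a survivor space, capped by the max tenuring threshold. The class name, method name, and sample numbers are all hypothetical.

```java
public class TenuringThresholdSketch {
    /**
     * bytesByAge[i] holds the bytes of surviving objects of age i (index 0 is unused).
     * Returns the age at which objects would start to be promoted under this model.
     */
    static int threshold(long[] bytesByAge, long survivorBytes,
                         int targetSurvivorRatio, int maxTenuringThreshold) {
        long desired = survivorBytes * targetSurvivorRatio / 100;
        long total = 0;
        for (int age = 1; age < bytesByAge.length; age++) {
            total += bytesByAge[age];
            if (total > desired) {
                // The survivor space would be fuller than the target: tenure from this age on.
                return Math.min(age, maxTenuringThreshold);
            }
        }
        // Everything fits within the target, so objects may age up to the maximum.
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long[] bytesByAge = {0, 30, 15, 10, 5}; // hypothetical age distribution, in MB
        System.out.println("threshold = " + threshold(bytesByAge, 100, 50, 15));
    }
}
```

In this model a large volume of young survivors pulls the threshold down, which is the premature promotion the trade-off above warns about; a larger survivor space or a higher target lets objects age longer before promotion.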
Tenuring Distribution
Either increase max tenuring threshold
Or even set max tenuring threshold to 2
The number of parallel GC threads is controlled by -XX:ParallelGCThreads=<num>
Default value assumes only one JVM per system
Set the parallel GC thread number according to:
Number of JVMs deployed on the system / processor set / zone
CPU chip architecture
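As a starting point for picking a value, the sketch below splits the CPUs visible to the JVM evenly among the JVMs sharing the machine. The even split is an assumption made for illustration only, not a rule from the talk; it ignores chip architecture, and the class and method names are hypothetical.

```java
public class GcThreadBudget {
    /** Evenly divides the available CPUs among the JVMs, never going below one thread. */
    static int threadsPerJvm(int availableCpus, int jvmCount) {
        if (jvmCount < 1) {
            throw new IllegalArgumentException("jvmCount must be >= 1");
        }
        return Math.max(1, availableCpus / jvmCount);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Number of JVMs sharing this system / processor set / zone; defaults to 1.
        int jvms = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        System.out.println("-XX:ParallelGCThreads=" + threadsPerJvm(cpus, jvms));
    }
}
```

Treat the printed value as a first guess to validate against GC logs rather than a final setting.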
Parallel GC Ergonomics
i.e., auto-tuning
Ergonomics help in improving out-of-the-box GC performance
To get maximum performance, most customers we know do manual tuning
Tune the young generation as described so far
Try to avoid / decrease the frequency of major GCs
We know of customers who use the Parallel GC in low-pause environments
NUMA
Allocates new objects into the partition that belongs to the allocating CPU
Big win for some applications
Tune the young generation as described so far
Need to be even more careful about avoiding premature promotion
Originally we were using an +AlwaysTenure policy
We have since changed our mind :-)
Promotion in CMS is expensive (free lists)
The more often promotion / reclamation happens, the more likely fragmentation will settle in
We know customers who tune their applications to do mostly minor GCs, even with CMS
CMS is used as a safety net, when application load exceeds what they have provisioned for
Schedule Full GCs at non-critical times (say, late at night) to tidy up the heap and minimize fragmentation
Fragmentation
Two types
External fragmentation
No free chunk is large enough to satisfy an allocation
Internal fragmentation
Allocator rounds up allocation requests
Free space wasted due to this rounding up
Fragmentation (ii)
It has been proven
Decrease promotion into the CMS old generation
Be careful when coding
Available in post 6 JVMs
Trade-Off
CMS cycle duration vs. concurrent overhead during a CMS cycle
To date, classes will not be unloaded by default from the permanent generation when using CMS
Both -XX:+CMSClassUnloadingEnabled and -XX:+CMSPermGenSweepingEnabled need to be set to enable class unloading in CMS
The 2nd switch is not needed in post 6u4 JVMs
Starting a CMS cycle too early
Frequent CMS cycles
High concurrent overhead
Starting a CMS cycle too late
Chance of an evacuation failure / Full GC
Initiating heap occupancy should be (much) higher than the application steady-state live size
Otherwise, CMS will constantly do CMS cycles
Applications that promote non-trivial amounts of objects to the old generation
Old generation grows at a non-trivial rate
Very frequent CMS cycles
CMS cycles need to start relatively early
Applications that promote very few or even no objects to the old generation
Old generation grows very slowly, if at all
Very infrequent CMS cycles
CMS cycles can start quite late
It first does a CMS cycle early to collect stats
Then, it tries to start cycles as late as possible, but early enough not to run out of heap before the cycle completes
It keeps collecting stats and adjusting when to start cycles
Sometimes, the second cycle starts too late
-XX:CMSInitiatingOccupancyFraction=<percent>
Occupancy percentage of CMS old generation that triggers a CMS cycle
-XX:+UseCMSInitiatingOccupancyOnly
Don't use the ergonomic initiating occupancy
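The relationship between the initiating occupancy and the steady-state live size can be checked with simple arithmetic. The sketch below is illustrative only: the class name, the method names, the safety margin parameter, and all sample figures are hypothetical, and the exact comparison the JVM applies when deciding to start a cycle may differ.

```java
public class CmsOccupancyCheck {
    /** True once old generation occupancy reaches the configured percentage. */
    static boolean cycleTriggered(long oldUsed, long oldCapacity, int fractionPercent) {
        return oldUsed * 100 >= oldCapacity * (long) fractionPercent;
    }

    /**
     * True when the initiating fraction sits at least marginPercent above the
     * steady-state live size, so that CMS does not run cycles back to back.
     */
    static boolean fractionLeavesHeadroom(long steadyStateLive, long oldCapacity,
                                          int fractionPercent, int marginPercent) {
        long livePercent = steadyStateLive * 100 / oldCapacity;
        return fractionPercent >= livePercent + marginPercent;
    }

    public static void main(String[] args) {
        long oldCapacityMb = 1000; // hypothetical old generation size
        long liveMb = 400;         // hypothetical steady-state live size
        int fraction = 75;         // value passed to -XX:CMSInitiatingOccupancyFraction
        System.out.println("headroom ok: " + fractionLeavesHeadroom(liveMb, oldCapacityMb, fraction, 20));
        System.out.println("cycle at 700 MB used: " + cycleTriggered(700, oldCapacityMb, fraction));
    }
}
```

If the headroom check fails for the measured live size, either the fraction is too low or the old generation is too small for the application.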
-XX:CMSInitiatingPermOccupancyFraction=<percent>
Occupancy percentage of permanent generation that triggers a CMS cycle
Class unloading must be enabled
This is better:
[ParNew 640710K->546360K(773376K), 0.1839508 secs]
[CMS-initial-mark 548460K(773376K), 0.0883685 secs]
[ParNew 651320K->556690K(773376K), 0.2052309 secs]
[CMS-concurrent-mark: 0.832/1.038 secs]
[CMS-concurrent-preclean: 0.146/0.151 secs]
[CMS-concurrent-abortable-preclean: 0.181/0.181 secs]
[CMS-remark 623877K(773376K), 0.0328863 secs]
[ParNew 655656K->561336K(773376K), 0.2088224 secs]
[ParNew 648882K->554390K(773376K), 0.2053158 secs]
...
[ParNew 489586K->395012K(773376K), 0.2050494 secs]
[ParNew 463096K->368901K(773376K), 0.2137257 secs]
[CMS-concurrent-sweep: 4.873/6.745 secs]
[CMS-concurrent-reset: 0.010/0.010 secs]
[ParNew 445124K->350518K(773376K), 0.1800791 secs]
[ParNew 455478K->361141K(773376K), 0.1849950 secs]
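Log entries like the ParNew lines above are regular enough to pick apart with a small parser. The sketch below handles only the simplified entry shape shown on this slide; real -XX:+PrintGCDetails output carries more fields, so the pattern would need adjusting. The class name ParNewLine and its fields are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParNewLine {
    // Matches entries shaped like: [ParNew 640710K->546360K(773376K), 0.1839508 secs]
    private static final Pattern ENTRY = Pattern.compile(
            "\\[ParNew (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final long beforeK;     // occupancy before the minor GC, in KB
    final long afterK;      // occupancy after the minor GC, in KB
    final long capacityK;   // capacity shown in parentheses, in KB
    final double pauseSecs; // duration of the minor GC

    private ParNewLine(long beforeK, long afterK, long capacityK, double pauseSecs) {
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.capacityK = capacityK;
        this.pauseSecs = pauseSecs;
    }

    /** Returns the first ParNew entry found in the line, or null if there is none. */
    static ParNewLine parse(String line) {
        Matcher m = ENTRY.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new ParNewLine(Long.parseLong(m.group(1)), Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)), Double.parseDouble(m.group(4)));
    }

    /** KB freed by this minor GC. */
    long reclaimedK() {
        return beforeK - afterK;
    }

    public static void main(String[] args) {
        ParNewLine e = parse("[ParNew 640710K->546360K(773376K), 0.1839508 secs]");
        System.out.println(e.reclaimedK() + "K reclaimed in " + e.pauseSecs + " s");
    }
}
```

Tracking the occupancy after each minor GC across a log shows how quickly the heap fills between CMS cycles.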
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExplicitGCInvokesConcurrentAndUnloadClasses
Monitoring the GC
Online
http://java.sun.com/performance/jvmstat/
VisualGC is also available as a VisualVM plug-in
Can monitor multiple JVMs within the same tool
Offline
GC Logging in Production
Very helpful when diagnosing production issues
Maybe some large files in your file system. :-)
We are surprised that customers are still afraid to enable it
If someone doesn't enable GC logging in production, I shoot them!
-XX:+PrintGCTimeStamps
Add -XX:+PrintGCDateStamps if you must
-XX:+PrintGCDetails
Preferred over -verbosegc as it's more detailed
Also useful:
PrintGCStats
Usage
Where <num> is the number of CPUs on the machine where the GC log was obtained
PrintGCStats Parallel GC
what          count          total        mean       max    stddev
gen0t(s)        193         11.470     0.05943     0.687    0.0633
gen1t(s)          1          7.350     7.34973     7.350    0.0000
GC(s)           194         18.819     0.09701     7.350    0.5272
alloc(MB)       193      11244.609    58.26222   100.875   18.8519
promo(MB)       193        807.236     4.18257    96.426    9.9291
used0(MB)       193      16018.930    82.99964   114.375   17.4899
used1(MB)         1        635.896   635.89648   635.896    0.0000
used(MB)        194      91802.213   473.20728   736.490   87.8376
commit0(MB)     193      17854.188    92.50874   114.500    9.8209
commit1(MB)     193     123520.000   640.00000   640.000    0.0000
commit(MB)      193     141374.188   732.50874   754.500    9.8209

11244.609 MB /   77.237 s = 145.586 MB/s
11244.609 MB / 1235.792 s =   9.099 MB/s
11244.609 MB /  934.682 s =  12.030 MB/s
  807.236 MB /   77.237 s =  10.451 MB/s
  807.236 MB /   11.470 s =  70.380 MB/s
  301.110 s  / 1235.792 s =  24.366%
    0.000 s  / 1235.792 s =   0.000%
  301.110 s  / 1235.792 s =  24.366%
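The summary rates in the Parallel GC output above are plain divisions of the totals, which the sketch below reproduces. The method names are hypothetical and the inputs are copied from the table. Note that 1235.792 s is exactly 16 × 77.237 s, which is consistent with a <num> of 16 CPUs, although the text does not state the CPU count.

```java
public class GcStatsArithmetic {
    /** Megabytes allocated or promoted per second over the given time base. */
    static double ratePerSec(double megabytes, double seconds) {
        return megabytes / seconds;
    }

    /** Share of total CPU time spent in GC, as a percentage. */
    static double loadPercent(double gcCpuSeconds, double totalCpuSeconds) {
        return 100.0 * gcCpuSeconds / totalCpuSeconds;
    }

    public static void main(String[] args) {
        // Totals copied from the Parallel GC table.
        System.out.printf("allocation rate: %.3f MB/s%n", ratePerSec(11244.609, 77.237));
        System.out.printf("promotion rate:  %.3f MB/s%n", ratePerSec(807.236, 77.237));
        System.out.printf("GC load:         %.3f%%%n", loadPercent(301.110, 1235.792));
    }
}
```

A GC load near 24% means roughly a quarter of the available CPU time went to garbage collection, which is the kind of figure the tuning advice in this talk aims to bring down.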
PrintGCStats CMS
what          count          total        mean       max    stddev
gen0(s)         110         24.381     0.22164     1.751    0.2038
gen0t(s)        110         24.397     0.22179     1.751    0.2038
cmsIM(s)          3          0.285     0.09494     0.108    0.0112
cmsRM(s)          3          0.092     0.03074     0.032    0.0015
GC(s)           113         24.774     0.21924     1.751    0.2013
cmsCM(s)          3          2.459     0.81967     0.835    0.0146
cmsCP(s)          6          0.971     0.16183     0.191    0.0272
cmsCS(s)          3         14.620     4.87333     4.916    0.0638
cmsCR(s)          3          0.036     0.01200     0.016    0.0035
alloc(MB)       110      11275.000   102.50000   102.500    0.0000
promo(MB)       110       1322.718    12.02471   104.608   11.8770
used0(MB)       110      12664.750   115.13409   115.250    1.2157
used(MB)        110      56546.542   514.05947   640.625   91.5858
commit0(MB)     110      12677.500   115.25000   115.250    0.0000
commit1(MB)     110      70400.000   640.00000   640.000    0.0000
commit(MB)      110      83077.500   755.25000   755.250    0.0000

11275.000 MB /   83.621 s = 134.835 MB/s
11275.000 MB / 1337.936 s =   8.427 MB/s
11275.000 MB /  923.472 s =  12.209 MB/s
 1322.718 MB /   83.621 s =  15.818 MB/s
 1322.718 MB /   24.397 s =  54.217 MB/s
  396.378 s  / 1337.936 s =  29.626%
   18.086 s  / 1337.936 s =   1.352%
  414.464 s  / 1337.936 s =  30.978%
GChisto
Open source at
Demo
GChisto Demo
Conclusions
Basic GC tuning concepts
How to monitor GCs
What to look out for
Examples of good tuning practices