12 December 2009
Posted by: cybergavin
The MaxTenuringThreshold for a Hotspot JVM

What is MaxTenuringThreshold?

In a Sun Hotspot JVM, objects that survive garbage collection in the Young Generation are copied multiple times between the Survivor Spaces before being moved into the Tenured (Old) Generation. The JVM flag that governs how many times objects are copied between the Survivor Spaces is MaxTenuringThreshold (MTT); it is passed to the JVM as -XX:MaxTenuringThreshold=n, where n is the maximum number of times an object is copied. The default value of n is (or actually was) 31.
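For instance, the flag might be passed on the command line like this (a minimal sketch; the heap sizes and app.jar are placeholders, not taken from the post):

```shell
# Cap the number of survivor-space copies at 15 before an object is promoted.
# app.jar and the heap sizes below are illustrative placeholders.
JVM_OPTS="-Xms512m -Xmx512m -XX:MaxTenuringThreshold=15"
# java $JVM_OPTS -jar app.jar
echo "$JVM_OPTS"
```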

Setting the MaxTenuringThreshold:

A few years ago, while my colleagues and I were tuning a 1.4.2_11 Hotspot JVM using flags like PrintTenuringDistribution and tools like visualgc, we found that setting MTT=10 along with other flags gave us the best results (JVM throughput, pause time, footprint). However, more recently, when tuning a 1.4.2_11 Hotspot JVM for another application that had mostly short-lived objects, I suggested testing a value of MTT=80 (I still have no idea how the value 80 came to mind), which is ridiculous, as you'll soon see. My objective was to retain the short-lived objects for as long as possible in the Young Generation so that they could be collected by Minor GCs rather than by Full GCs in the Tenured Generation. In any case, all our performance tests of the application on JVM 1.4.2_11 with MTT=80 and other JVM flags showed a significant improvement in JVM performance over the untuned baseline.
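The diagnostic flags mentioned above can be combined along these lines (a sketch; the actual java invocation is commented out and app.jar is a placeholder):

```shell
# Tenuring distribution plus timestamped GC details help pick an MTT value
# from real survivor-space behaviour rather than guesswork.
GC_DIAG="-XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
# java -Xms512m -Xmx512m $GC_DIAG -jar app.jar
echo "$GC_DIAG"
```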

Last week, I came across some interesting proposals discussed among Sun engineers last year regarding modifying the way MTT is handled by the JVM. I don't know whether the proposals have been implemented, but they give some good insight into how MTT works. To quote those discussions:

Each object has an "age" field in its header which is incremented every time an object is copied within the young generation. When the age field reaches the value of MTT, the object is promoted to the old generation (I’ve left out some detail here…). The parameter -XX:+NeverTenure tells the GC never to tenure objects willingly (they will be promoted only when the target survivor space is full). (out of curiosity: does anyone actually use -XX:+NeverTenure?) Originally, in the HotSpot JVM, we had 5 bits per object for the age field (for a max value of 31, so values of MTT would make sense if they were <= 31). A couple of years ago (since 5u6 IIRC), the age field "lost" one bit and it now only has 4 (for a max value of 15).
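The arithmetic behind those limits is simply the maximum value of an unsigned bit field, as this small sketch shows:

```shell
# A 5-bit age field caps the tenuring age at 31; the 4-bit field
# (since 5u6, per the quoted discussion) caps it at 15.
old_max=$(( (1 << 5) - 1 ))
new_max=$(( (1 << 4) - 1 ))
echo "5 bits -> $old_max, 4 bits -> $new_max"
```

So any MTT above 15 on a 4-bit-age JVM cannot be represented in the object header, which is why MTT=80 behaved the way it did.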

Refer to http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2008-May/000309.html for more details. So, basically, when I set MTT=80, the JVM would actually have "never tenured" the objects in the Young Generation until the Survivor Spaces were full. I hope Sun has fixed that problem as per the proposal, or at least provides proper documentation (similar to the referenced article) for their JVMs explaining how MTT works. Well, MTT=80 did not have an adverse impact on our application, but we eventually switched to MTT=8 (the value 8 was a guess and did not produce very different results). I suggest not setting MTT at first; use it only if your analysis of GC logs and your requirements indicate that you need to retain short-lived objects in the Young Generation for longer. As a matter of fact, when tuning a JVM, always start with basic flags for heap size and nothing else. Then, based on load tests, customer experience metrics (e.g. response time, response errors) and analysis of GC logs, set JVM flags and retest. Tuning is iterative and, apart from all the tools available, a must-have quality (especially for complex applications) is patience.
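The iterative approach above might start from something like this (a sketch; sizes and app.jar are placeholders, and the java line is commented out):

```shell
# Iteration 1: heap size only, plus GC logging so the next iteration
# can be driven by evidence from the logs.
BASE="-Xms1024m -Xmx1024m"
LOG="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
# java $BASE $LOG -jar app.jar
echo "$BASE $LOG"
```

Later iterations would add flags (collector choice, young-generation sizing, MTT) one change at a time, retesting after each.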


Comments

  • Deepak Devadathan

    July 6, 2010 at 4:03 pm

    What should be considered while tuning the JVM?
    1) Should heap utilization reach 80-90% of max heap,
    OR
    2) Should throughput be high?

    I have seen that in my application, scavenge collection (young collection) stops after 4-5 days of starting the server. After that, tenured collections keep occurring with high pause times.

    So neither is the heap being reclaimed nor are the pause times good. Using CMS improves throughput, but at the price of higher CPU utilization.

  • Hi Deepak

    1) and 2) are not mutually exclusive. In your case, if there are no longer any young generation collections but many tenured collections, it implies your young generation size may need to be increased. If you increase the YG size, opt for parallel scavenge in the YG to ensure low pause times. Before deciding how to tune your JVM, you must decide what's best for your application: high throughput or low pause time. For example, if you have an application that requires high throughput but can tolerate occasional moderate to high GC pause times, then the parallel collectors would help. On the other hand, if your application cannot tolerate even moderate pause times (e.g. > 3 secs for a gaming application), then CMS or iCMS will help. CMS will definitely consume more CPU cycles than the default GC because CMS uses another thread to provide that "mostly concurrent" mechanism. JRE 7 introduces the G1 collector as an eventual replacement for CMS. I haven't used G1 yet.
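    The trade-off described above maps to collector flags roughly as follows (a sketch; only one of these would normally be passed to a given JVM):

    ```shell
    # High throughput, tolerates pauses: parallel collector.
    THROUGHPUT="-XX:+UseParallelGC"
    # Low pause time: concurrent mark-sweep.
    LOW_PAUSE="-XX:+UseConcMarkSweepGC"
    # From JDK 7 onward: the G1 collector.
    G1="-XX:+UseG1GC"
    echo "$THROUGHPUT | $LOW_PAUSE | $G1"
    ```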

    Regards
    Gavin

  • Deepak Devadathan

    July 8, 2010 at 6:53 am

    I have tried -XX:+UseParallelGC in the past:
    MEM_ARGS=-Xms1536m -Xmx1536m -Xss924k -Xmn512m -XX:SurvivorRatio=4 -XX:PermSize=256m -XX:MaxPermSize=256m -Xmpas:on -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseParallelGC

    But what was seen is that heap utilization was reaching close to 90%.
    I tried the default and found that my perm gen wasn't much used, so I took some memory out of it and gave it to -Xmx and -Xmn.

    MEM_ARGS=”-Xms1408m -Xmx1408m -Xss924k -Xmn588m -XX:SurvivorRatio=4 -XX:PermSize=256m -XX:+UseGetTimeOfDay -XX:MaxPermSize=256m -Xmpas:on -XX:+PrintGCDetails -XX:+PrintGCTimeStamps”

    MEM_ARGS=”-Xms1280m -Xmx1280m -Xss924k -Xmn460m -XX:SurvivorRatio=4 -XX:PermSize=512m -XX:MaxPermSize=512m -Xmpas:on -XX:+PrintGCDetails -XX:+PrintGCTimeStamps”

    In both cases what's observed is that the heap eventually gets utilized close to 85-90% (in around 10 days).
    Also, the scavenge collection stops in 5 days, after which there's only tenured collection.

    What I feel is that after 5 days my young generation is full of live objects, maybe ones referenced from the tenured generation.

    After the 5th day there is only tenured collection with high pause times (15 sec avg), but nothing seems to be affected by the pause times.

    Contrary to your blog, I was thinking in a different manner, though I could be wrong, so correct me if I am. I was thinking: why not keep MTT=0? That would ensure that no live objects remain in the young generation after a young collection. The tenured regions would fill up quickly, but my scavenge collection should not be stopping after 5 days. I could select a very high survivor ratio (maybe 128) and a relatively small young generation, maybe 400 MB, and give the rest to the tenured regions. I haven't tried this, but would like to hear your views.
