WebLogic Server crashes due to 2 GB stdout file

Problem:

A WebLogic managed server crashed without leaving any relevant information in the logs. The server had been started by Node Manager.

Background & Analysis:

WebLogic Version: 8.1 SP6 (cluster with 2 managed servers)
JVM version: 32-bit JRockit R27.6 1.4.2_21
Operating System: 64-bit RHEL 4.0 AS update 7 (kernel 2.6.9)
When the server crashed, the server and stderr logs had no clues regarding the cause of the crash. However, the stdout log was 2147483647 bytes (2 GB) as it was not rotated. The last modified time of the stdout file was the same as the time when the server crashed. The very same scenario was observed when the other server in the cluster crashed. The filesystem is large-file aware.
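The size reported above, 2147483647 bytes, is exactly 2^31 - 1, the largest value a signed 32-bit file offset can hold. As a rough sketch (bash assumed; the function name `is_near_limit` and the 90% default threshold are illustrative choices, not part of the original setup), a housekeeping check could flag a log file before it gets that far:

```shell
# Largest value of a signed 32-bit file offset: 2^31 - 1 = 2147483647 bytes.
LIMIT=$(( (1 << 31) - 1 ))

# Hypothetical helper: succeeds (exit status 0) when FILE has reached
# PCT percent of LIMIT. PCT defaults to 90.
is_near_limit() {
    local file=$1
    local pct=${2:-90}
    local size
    # wc -c reports the size in bytes; $(( )) strips any padding.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -ge $(( LIMIT / 100 * pct )) ]
}
```

A cron job could call `is_near_limit /path/to/stdout.log && echo "rotate me"`, where the path is a placeholder for wherever your stdout log lives.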

Solution:

Rotate and archive the stdout file so that it never reaches 2 GB and the JVM running WebLogic does not crash.

NOTE: All logs (server, stderr, stdout, application) must be rotated and archived. I’ve seen several enterprise environments fall victim to a lack of log housekeeping. To rotate files like the JVM’s stdout and stderr, it’s best to use the copy-truncate method (copy the existing file, then truncate it in place), because the JVM keeps a file descriptor open on the file. You may lose a small amount of log output written between the copy and the truncate, but that is less harmful than your server crashing. Removing or renaming a file with an open file descriptor only hides the problem: the JVM keeps writing through the same descriptor, so the file keeps growing and consuming disk space, either under its new name or, once removed, with no directory entry at all (on Linux it is then reachable only through /proc/<pid>/fd).
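A minimal sketch of the copy-truncate method described in the note, assuming bash. The function name `rotate_copytruncate` and the timestamped archive naming are illustrative, not something WebLogic or Node Manager provides. Truncation only shrinks the file if stdout was opened in append mode; otherwise the JVM continues writing at its old offset.

```shell
# Hypothetical helper: copy LOG_FILE into ARCHIVE_DIR with a timestamp
# suffix, then truncate LOG_FILE in place. Prints the archive path.
rotate_copytruncate() {
    local log_file=$1
    local archive_dir=$2
    local stamp archive
    stamp=$(date +%Y%m%d%H%M%S)
    archive="$archive_dir/$(basename "$log_file").$stamp"
    # Copy first; only truncate if the copy succeeded.
    cp -p "$log_file" "$archive" || return 1
    # Truncate in place. The inode is unchanged, so the JVM's open
    # file descriptor stays valid. Lines written between the cp and
    # this truncate are lost.
    : > "$log_file"
    printf '%s\n' "$archive"
}
```

For example, `rotate_copytruncate /path/to/stdout.log /path/to/archive` (both paths are placeholders) could be run from cron, followed by compressing the archived copy.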

Root Cause:

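The JVM’s stdout file reached 2 GB in size. Its final size of 2147483647 bytes is exactly 2^31 - 1, which suggests that the 32-bit JVM hit the signed 32-bit file offset limit even though the filesystem itself supports large files.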

 

NOTE:
(1) The solution above describes a successful problem-solving experience and may not be applicable to other problems with similar symptoms.
(2) Your rating of this post will be much appreciated. Also, feel free to leave comments.

 


5 Comments

  1. Anantha Krishnan

    We are on a Solaris 10 environment, and the only way we have been able to get around this issue is to bounce the managed server when the stdout file reaches roughly 2 GB.
    We have tried the copy-truncate method, where we take a backup of the original file and truncate the actual file.
    However, the JVM seems to keep a handle on this file, and we never see the file size decrease after the truncate.

    Any help would be useful.

  2. mrkips

    Anantha

    Yes, the JVM keeps a file descriptor open for the stdout file. If you move the file, the JVM will still be writing to file descriptor 1 (even though you no longer see the file) and so disk usage will continue to grow. If you copy-truncate and the file is being logged to almost continuously by the JVM, then sometimes, you see null characters padding the file and bloating it. These null characters can be removed by a shell script.
    However, to resolve this problem, you must address the root cause: excessive logging to stdout. Well-developed applications do not write large volumes of output to stdout. Instead, they use an appropriate logging framework and write to application logs (info, error, debug, etc.). I suggest you get your developers to stop the application from writing to stdout.
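The null-character cleanup mentioned in the reply above could look something like this. It is a sketch assuming bash and the standard `tr` utility; the function name `strip_nulls` is hypothetical. It is meant for an archived copy, not for the live file the JVM is still writing to.

```shell
# Hypothetical helper: write a copy of SRC to DST with all NUL bytes
# removed. SRC is left untouched.
strip_nulls() {
    local src=$1
    local dst=$2
    # tr -d '\000' deletes every NUL byte and passes all others through.
    tr -d '\000' < "$src" > "$dst"
}
```

Usage would be along the lines of `strip_nulls stdout.log.archived stdout.log.clean`, with both file names standing in for your own.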

  3. Eric

    Are you redirecting the output of a start script, or is it the Node Manager’s xxx.out file?
    If you’re redirecting from a script, then truncation is possible. However, the redirection must use append (“>>”), not truncate (“>”), and the script should be bash or ksh, NOT sh.

    An example skeleton:
    ======================================================
    #!/bin/ksh

    java -D… …. weblogic.Server >> file.out 2>&1
    ======================================================

    When file.out becomes too big, truncate it:
    : > file.out

    DO NOT remove (rm) it or rename (mv) it. If you rm the file, it will stay open and huge but have no directory entry (so you will lose visibility). If you mv it, it will stay open and huge but have a different name.
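To see the "no directory entry" case for yourself on Linux, the sketch below lists files that a process still holds open after they were removed. It assumes bash and a Linux /proc filesystem, so it does not apply as-is to Solaris; the function name `list_deleted_open_files` is hypothetical.

```shell
# Hypothetical helper: list the open file descriptors of process PID
# whose underlying file has been removed. Linux only.
list_deleted_open_files() {
    local pid=$1
    # Each entry in /proc/<pid>/fd is a symlink to the open file; the
    # kernel appends " (deleted)" to the link target once the file's
    # directory entry is gone.
    ls -l "/proc/$pid/fd" 2>/dev/null | grep '(deleted)'
}
```

Run it with the JVM's process ID, for example `list_deleted_open_files 12345`, where 12345 is a placeholder. Any stdout file that was removed while the server was running should show up in the output.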

  4. mrkips

    PrajyotP: By “disable xxx.out file”, I believe you mean that you do not want an xxx.out file. In that case, simply redirect your output to /dev/null (> /dev/null). However, if Node Manager does not start for some reason and you don’t find any clues in the log file or stderr, you may want to temporarily redirect stdout to a log file to troubleshoot.
