A WebLogic managed Server crashed with no relevant information whatsoever in the logs. The server was started by a Node Manager.
Background & Analysis:
WebLogic Version: 8.1 SP6 (cluster with 2 managed servers)
JVM version: 32-bit JRockit R27.6 1.4.2_21
Operating System : 64-bit RHEL 4.0 AS update 7 (kernel 2.6.9)
When the server crashed, the server and stderr logs had no clues regarding the cause of the crash. However, the stdout log was 2147483647 bytes (2 GB) as it was not rotated. The last modified time of the stdout file was the same as the time when the server crashed. The very same scenario was observed when the other server in the cluster crashed. The filesystem is large-file aware.
Rotate and archive the stdout file, so that the JVM running WebLogic does not crash when stdout reached 2 GB in size.
NOTE: All logs (server, stderr, stdout, application) must be effectively rotated and archived. I’ve seen several enterprise environments fall victim to lack of log housekeeping. To rotate files like the JVM’s stdout and stderr, it’s best to use the copy-truncate method (make a copy of existing file and then truncate existing file) as the JVM will still have a file descriptor open for the file. You may lose a tiny amount of log information using this method, but it’s less harmful than your server crashing. Removing or renaming a file with an open file descriptor will only make the problem invisible to you as the JVM will still be writing to the old file descriptor and growing a file in a location other than your logs directory (/proc).
The JVM’s stdout file reached 2GB in size.
(1) The solution above describes a successful problem-solving experience and may not be applicable to other problems with similar symptoms.
(2) Your rating of this post will be much appreciated. Also, feel free to leave comments.