#############################################################################################
#                                                                                           #
#                            WLfdmon - Version 1.0 - READ ME !!                             #
#                                                                                           #
#############################################################################################
#                                                                                           #
#  Author : Gavin Satur (GS) - gavin.satur@mrkips.com - http://software.mrkips.com          #
#  Date   : 14th May 2009                                                                   #
#                                                                                           #
#############################################################################################

----------------
(I) BACKGROUND:
----------------

File descriptors are data structures which the OS uses for operations on files. There are
maximum limits imposed on the total number of open file descriptors (open fd) at the
system-wide level and also at the per-process level.

In WebLogic installations, a function called resetFd sets the maximum limit of open file
descriptors for WebLogic Server processes (1024 for Linux and Solaris). This function is
available in ${WLS_HOME}/common/bin/commEnv.sh.

At times, the WebLogic Server processes may require more than the default number of open fd.
At other times, a problem may lead to a growing number of open fd (for example, several
connections made to an unreachable system without closing earlier connections). In either of
these scenarios, if the WebLogic Servers run out of available open fd, the servers will hang
and become unresponsive or exhibit erratic behaviour. You will also see the "Too many open
files" error, and the WebLogic Servers may not be able to listen on their ports (listening on
ports requires open fd).

-------------------
(II) WHY WLfdmon?:
-------------------

WLfdmon enables you to proactively monitor the number of open file descriptors used by
WebLogic Servers and assists with troubleshooting problems related to excessive usage of open
file descriptors.

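For context, the kind of manual check that monitoring open fd involves can be sketched in a
few lines of shell. This is only an illustration and is NOT taken from WLfdmon.ksh: the
function names (fd_soft_limit, count_open_fd, fd_usage_pct) are hypothetical, and counting
via /proc/<pid>/fd assumes a Linux-style /proc filesystem.

```shell
#!/bin/sh
# Illustrative helpers only; these are NOT part of WLfdmon.

# Print the soft limit on open fd for the current shell (may print "unlimited").
fd_soft_limit() {
    ulimit -n
}

# Count the open fd of a process by listing /proc/<pid>/fd.
# Assumption: a Linux-style /proc filesystem. Prints 0 if the directory
# does not exist or cannot be read.
count_open_fd() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Integer percentage of the limit currently in use.
# Usage: fd_usage_pct <used> <limit>
fd_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

echo "soft limit for this shell: $(fd_soft_limit)"
echo "open fd for this shell:    $(count_open_fd $$)"
echo "424 of 1024 fd in use:     $(fd_usage_pct 424 1024)%"
```

With the figures used in this ReadMe (424 open fd against a limit of 1024), the last line
reports 41%. Repeating such a check by hand for every server is tedious, which is the gap a
scheduled monitor is meant to fill.
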
The salient features of WLfdmon are given below:

--> Can be run in interactive and non-interactive modes
--> Configuration-driven, thereby allowing you to run this script for different WebLogic
    domains without modifying the script
--> Generates statistics on open fd usage of WebLogic Servers. These statistics can be used
    for trend analysis.
--> Logs alarms when the open fd threshold is breached. Alarms can notify Support Staff of
    abnormal application/server behaviour.
--> Logs lsof output when the open fd threshold is breached. This output is useful for root
    cause analysis of excessive open fd usage.
--> Housekeeping for data and lsof output files

---------------------------
(III) SYSTEM REQUIREMENTS:
---------------------------

--> Solaris/Linux
--> Korn shell - /bin/ksh
--> lsof

----------------
(IV) TESTED ON:
----------------

WLfdmon has been successfully tested on the following environments:

--> WLS 10.0 MP1 + Ubuntu 9.04 + AMD x64
--> WLS 10.0 MP1 + RHEL 5.1 + Intel Xeon x64
--> WLS 8.1 SP4 + Solaris 9 + Sparc

-----------------------
(V) KNOWN LIMITATIONS:
-----------------------

--> /usr/ucb/ps on Solaris 10:
    On Solaris systems, WLfdmon uses /usr/ucb/ps to check for running WebLogic Servers. On
    Solaris 10, Sun have introduced restrictions on /usr/ucb/ps such that only owners of
    processes may see their running processes using /usr/ucb/ps. Hence, on Solaris 10,
    WLfdmon must be run as the same user which runs the WebLogic Server processes. This
    issue does not exist in earlier versions of Solaris.

-------------------------
(VI) HOW TO USE WLfdmon:
-------------------------

The WLfdmon software is available as a compressed tar archive, WLfdmon_1.0.tar.gz, which
consists of only 3 files - WLfdmon.ksh, WLfdmon.cfg and this ReadMe.txt.

Get started by following the steps below:

(1) Choose a directory for installation of WLfdmon and upload the WLfdmon_1.0.tar.gz file
    into that location.
(2) Uncompress and untar WLfdmon_1.0.tar.gz
    (Example commands: gunzip WLfdmon_1.0.tar.gz; tar xvf WLfdmon_1.0.tar)
(3) Edit the WLfdmon.cfg file and configure parameters, as required.
(4) Test the WLfdmon.ksh script interactively and non-interactively. Read the sections below
    for details.

INTERACTIVE EXECUTION:
======================

SYNTAX: ksh WLfdmon.ksh [-i]

--> Default mode of execution, so the -i input parameter is optional.
--> Real-time dashboard showing server pid, server name, listen ports, total number of open
    file descriptors for the WebLogic Server and total number of open file descriptors for
    outgoing TCP connections made by the WebLogic Server.
--> Displays the dashboard for all WebLogic Servers configured in WLfdmon.cfg

EXAMPLE OUTPUT:

===============================================================================================================
   PID   |   NAME           |   LISTEN PORT(S)   |   TOTAL OPEN FD   |   OPEN FD FOR TCP
===============================================================================================================
   8000  |   MedRecServer   |   7011 7012        |   424             |   4
===============================================================================================================

SUGGESTED USAGE: Use ad hoc for a real-time snapshot of open fd usage of WebLogic Servers.
Define an alias to quickly display the dashboard.

NON-INTERACTIVE EXECUTION:
==========================

SYNTAX: ksh WLfdmon.ksh -n

--> Runs in silent mode, i.e. does not display any output.
--> Logs statistics for server pid, server name, listen ports, total number of open file
    descriptors for the WebLogic Server and total number of open file descriptors for
    outgoing TCP connections made by the WebLogic Server, in csv files in the data directory.
--> When thresholds are breached, logs raw lsof output in text files in the data directory.
--> When thresholds are breached, logs warnings in log files in the logs directory.
--> All the data above is collected for all WebLogic Servers configured in WLfdmon.cfg

SUGGESTED USAGE: For proactive monitoring, schedule regular execution (ksh WLfdmon.ksh -n)
via a cron job or any other scheduling software. Use log-scanning programs to scan the logs
for WARNINGS and raise alarms.

Example crontab:

    */5 * * * * /software/mrkips/WLfdmon/WLfdmon.ksh -n > /software/mrkips/WLfdmon/WLfdmon.cron.stdout 2> /software/mrkips/WLfdmon/WLfdmon.cron.stderr

The above crontab entry schedules WLfdmon.ksh to execute in non-interactive mode every 5
minutes. The step notation used (*/5) to specify every 5 minutes is not supported by all
versions of cron. Check your crontab manual for the correct syntax for your version.

##################################### THE END ###########################################