xcat-core/xCAT-server/share/xcat/ib/scripts/annotatelog.README
2009-03-10 06:58:21 +00:00

# IBM(c) 2008 EPL license http://www.eclipse.org/legal/epl-v10.html
annotatelog.README
This README describes how to use the annotatelog script.
The syntax of the annotatelog command is:
annotatelog -f log_file [-s start_time] [-e end_time]
{ [-i -g guid_file -l link_file] [-S] [-c] [-u] | [-A -g guid_file -l link_file] }
{[-n node_list -g guid_file] [-E]}
[-h]
-A Output the combination of -i, -S, -c and -u. It should be used with -g and -l flags.
-f log_file
Specifies the full path name of the log file to analyze.
It must be an xCAT consolidated log collected from the QLogic HSM or ESM.
-s start_time
Specifies the start time for analysis, where the start_time
variable has the format ddmmyyhh:mm:ss (day, month, year,
hour, minute, and second), 00:00:00 is valid.
-e end_time
Specifies the end time for analysis, where the end_time
variable has the format ddmmyyhh:mm:ss (day, month, year,
hour, minute, and second), 00:00:00 is valid.
-l link_file
Specifies the full path name of a link file, which is the concatenation of all
'/var/opt/iba/analysis/baseline/fabric*links' files from all fabric management servers.
-g guid_file
Specifies the full path name of a GUID file, which contains the list of
GUIDs obtained from the "getGuids" script.
-E Annotate with node ERRLOG_ON and ERRLOG_OFF information. This
can help determine whether a disappearance was caused by a node
disappearing. It is for AIX nodes only and should be used with the -i or -n flag.
-S Sort the log entries by subnet manager only.
-i Sort the log entries by IB node only.
-c Sort the log entries by chassis only.
-u Sort the log entries by FRU only.
-n node_list
Specifies a comma-separated list of node host names or IP addresses to look up in the log entries.
-h Display usage information.
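The time format accepted by -s and -e can be illustrated with a short Python
sketch. It assumes the argument is always given in the full ddmmyyhh:mm:ss
form described above; the helper name parse_bound is hypothetical and is not
part of annotatelog.

```python
from datetime import datetime

# Assumed layout of a -s / -e argument: ddmmyyhh:mm:ss, e.g. 05050809:06:33
TIME_FORMAT = "%d%m%y%H:%M:%S"


def parse_bound(text):
    """Convert a ddmmyyhh:mm:ss string into a datetime (hypothetical helper)."""
    return datetime.strptime(text, TIME_FORMAT)


if __name__ == "__main__":
    start = parse_bound("05050808:00:00")
    end = parse_bound("05050810:00:00")
    print("analysis window:", start.isoformat(), "to", end.isoformat())
```

With this reading, "05050800:00:00" denotes midnight at the start of 5 May 2008.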
In an xCAT cluster with QLogic IB switches, the switch logs and the subnet
manager (ESM/HSM) logs are redirected to the xCAT Management Node through
the syslog protocol. The syslogd on the xCAT Management Node recognizes the
facility (local6) and priority (NOTICE and above) and puts the log
entries into a file/FIFO that is monitored by AIXSyslogSensor on
AIX or ErrorLogSensor on Linux. The condition-response setup on the
xCAT Management Node then moves the log entries to the file
/var/log/xcat/errorlog/[xCAT Management Node]. As a result, this log file
accumulates many entries and is difficult for the administrator to look through.
annotatelog is a sample script that parses the QLogic log entries in the file
/var/log/xcat/errorlog/[xCAT Management Node] on the xCAT Management Node
and groups them by subnet manager, IB node, chassis, FRU (Field-Replaceable
Unit) or a particular node. The script is supported on both AIX and Linux
Management Nodes. The log to analyze must be an xCAT consolidated log, that
is, a log file produced by the xCAT syslog/errorlog monitoring mechanism,
such as the /var/log/xcat/errorlog/[xCAT Management Node] file. Because log
formats vary, xCAT does not support other log files.
This script provides several flags to specify the category criteria:
-S, -i, -c, -u, -n and -A.
If the -S flag is set, the output is sorted by subnet manager. Because
an SM may have multiple ports, the output is classified by
<subnet_manager_name: port number>. See the details in the
example below:
############################################
Logs by Subnet Manager
############################################
----------------------------------------------
Report by subnet manager: 'c890f12ec07:port 2'
----------------------------------------------
May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5445]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 2:0x000255007002651f|DETAIL:
Node type: hca
May 5 08:23:55 c890f12ec07 local6:notice c890f12ec07 iview_sm[5128]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#3 Appearance in
fabric|NODE:IBM G1 Logical HCA :port 2:0x0002550070027f1f|DETAIL:Node
type: hca
----------------------------------------------
Report by subnet manager: 'c890f12ec07:port 1'
----------------------------------------------
May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 1:0x000255007002650f|DETAIL:
Node type: hca
May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:
Node type: hca
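The grouping shown above can be approximated with a few lines of Python. This
is only a sketch of the idea, not the annotatelog implementation: it assumes
each log entry has been joined onto a single line and that the subnet manager
is identified by the "SM:" field between "|" separators. The function name
group_by_sm is hypothetical.

```python
import re
from collections import defaultdict

# The SM field looks like "|SM:c890f12ec07:port 2|" in the entries above.
SM_FIELD = re.compile(r"\|SM:([^|]+)\|")


def group_by_sm(entries):
    """Group log entries by their '<subnet_manager_name>:port <n>' value."""
    groups = defaultdict(list)
    for entry in entries:
        match = SM_FIELD.search(entry)
        if match:
            groups[match.group(1)].append(entry)
    return dict(groups)


SAMPLE = [
    "May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5445]: "
    "c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#4 Disappearance "
    "from fabric|NODE:IBM G1 Logical HCA :port 2:0x000255007002651f|"
    "DETAIL:Node type: hca",
    "May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]: "
    "c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance "
    "from fabric|NODE:IBM G1 Logical HCA :port 1:0x000255007002650f|"
    "DETAIL:Node type: hca",
]

if __name__ == "__main__":
    for sm, lines in group_by_sm(SAMPLE).items():
        print("Report by subnet manager: '%s' (%d entries)" % (sm, len(lines)))
```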
If the -i flag is set, the output is sorted by IB node and classified
by <node_name: port number: GUID number>. If a log entry includes the
keyword "IBM *Logical", the script also displays the link relationships
between the IBM G1 Logical HCA and the IBM G1 Logical Switch, and between
the IBM G1 Logical Switch and the QLogic switch. The script uses the
result file of the getGuids script to get the node name corresponding
to the HCA port number and GUID, and uses the fabric link files from the
Fabric Management Server (HSM/ESM) to get the QLogic switch connection
information, so the -i flag must be used with the -g and -l flags.
If a log entry does not include the keyword "IBM *Logical", the
corresponding node name and connection relationships cannot be
determined, so the entry is displayed as is and the string after
"NODE:" is used as the IB node. See the details in the examples below:
############################################
Logs by IB node
############################################
----------------------------------------------
Reported by IB node: 'c890f11ec06.ppd.pok.ibm.com: port 1: GUID 0x0002550070027f0f'
- ehca0 <-> IBM G1 Logical Switch 1 port 17 (GUID 0x0002550070027f20)
- Connector is C65-T1 (HV=Cx-T1)
- IBM G1 Logical Switch 1 <-> SilverStorm 9024 DDR port 16 (GUID 0x00066a00d900042d)
----------------------------------------------
May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:
Node type: hca
----------------------------------------------
Reported by IB node: 'SilverStorm 9120 GUID=0x00066a0002000225 Leaf 8, Chip A:port 0:0x00066a0007001311'
----------------------------------------------
Mar 25 16:21:54 c890f12ec07 iview_sm[3725]: c890f12ec07; MSG:NOTICE|
SM:c890f12ec07:port 1|COND:#4 Disappearance from fabric|NODE:SilverStorm
9120 GUID=0x00066a0002000225 Leaf 8, Chip A:port 0:0x00066a0007001311|
DETAIL:Node type: switch
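The two cases above (a logical HCA that can be resolved to a node name, and
any other node that is reported as is) can be sketched in Python as follows.
This is an illustration under stated assumptions, not the annotatelog code:
the GUID-to-host table is a plain in-memory dictionary standing in for the
getGuids result file (whose format is not shown here), the "IBM *Logical"
keyword is read as the pattern "IBM" followed later by "Logical", and the
name ib_node_label is hypothetical.

```python
import re

NODE_FIELD = re.compile(r"\|NODE:([^|]+)\|")
# "<description> :port <n>:<GUID>", as seen in the NODE: fields above.
NODE_PARTS = re.compile(
    r"^(?P<desc>.*?)\s*:port (?P<port>\d+):(?P<guid>0x[0-9a-fA-F]+)$"
)
LOGICAL = re.compile(r"IBM .*Logical")


def ib_node_label(entry, guid_to_host):
    """Return the heading under which an entry would be reported."""
    field = NODE_FIELD.search(entry)
    if field is None:
        return None
    raw = field.group(1)
    parts = NODE_PARTS.match(raw)
    if parts and LOGICAL.search(parts.group("desc")):
        host = guid_to_host.get(parts.group("guid"))
        if host:
            return "%s: port %s: GUID %s" % (
                host, parts.group("port"), parts.group("guid"))
    # No node name can be resolved: use the string after "NODE:" unchanged.
    return raw


# Hypothetical lookup table; annotatelog builds this from the -g guid_file.
GUIDS = {"0x0002550070027f0f": "c890f11ec06.ppd.pok.ibm.com"}

HCA_ENTRY = (
    "May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]: "
    "c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance "
    "from fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|"
    "DETAIL:Node type: hca"
)
SWITCH_ENTRY = (
    "Mar 25 16:21:54 c890f12ec07 iview_sm[3725]: c890f12ec07; MSG:NOTICE|"
    "SM:c890f12ec07:port 1|COND:#4 Disappearance from fabric|NODE:SilverStorm "
    "9120 GUID=0x00066a0002000225 Leaf 8, Chip A:port 0:0x00066a0007001311|"
    "DETAIL:Node type: switch"
)

if __name__ == "__main__":
    print(ib_node_label(HCA_ENTRY, GUIDS))
    print(ib_node_label(SWITCH_ENTRY, GUIDS))
```

The link lines printed under the heading in the real output come from the
-l link_file and are not reproduced in this sketch.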
If the -c flag is set, the output is sorted by chassis and classified
by <chassis: model type: GUID number>. See the details in the example
below:
############################################################
Logs by CHASSIS
############################################################
----------------------------------------------
Reported by chassis: 'SilverStorm: model 9120: GUID 0x00066a0002000225'
----------------------------------------------
Apr 23 09:16:56 qswitch slot101:9.114.80.179;MSG:WARNING|CHASSIS:SilverStorm
9120 GUID=0x00066a0002000225|COND:#17 FRU state changed from online
to offline|FRU:Power Supply 1|PN:200805-101
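As an illustration of how the chassis heading relates to the log entry, the
following Python sketch extracts the "CHASSIS:" field and rebuilds the
<chassis: model type: GUID number> label. It assumes the field always has the
form "<name> <model> GUID=<guid>" as in the example above; the function name
chassis_label is hypothetical.

```python
import re

# "|CHASSIS:SilverStorm 9120 GUID=0x00066a0002000225|" as in the entry above.
CHASSIS_FIELD = re.compile(
    r"\|CHASSIS:(?P<name>\S+) (?P<model>\S+) GUID=(?P<guid>0x[0-9a-fA-F]+)\|"
)


def chassis_label(entry):
    """Return '<chassis>: model <model>: GUID <guid>', or None if absent."""
    match = CHASSIS_FIELD.search(entry)
    if match is None:
        return None
    return "%s: model %s: GUID %s" % (
        match.group("name"), match.group("model"), match.group("guid"))


ENTRY = (
    "Apr 23 09:16:56 qswitch slot101:9.114.80.179;MSG:WARNING|"
    "CHASSIS:SilverStorm 9120 GUID=0x00066a0002000225|COND:#17 FRU state "
    "changed from online to offline|FRU:Power Supply 1|PN:200805-101"
)

if __name__ == "__main__":
    print("Reported by chassis: '%s'" % chassis_label(ENTRY))
```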
If the -u flag is set, the output is sorted by FRU. See the details in
the example below:
############################################################
Logs by FRUS from CHASSIS
############################################################
------------------------------------------------------------
Associated with FRU: 'Power Supply 1'
------------------------------------------------------------
Apr 23 09:19:12 qswitch slot101:9.114.80.179;MSG:NOTICE|CHASSIS:SilverStorm
9120 GUID=0x00066a0002000225|COND:#18 FRU state changed from offline
to online|FRU:Power Supply 1|PN:200805-101
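The FRU grouping can be sketched with a generic field parser. The Python
below is an illustration only: it assumes that every entry carries its
structured part as "KEY:value" pairs separated by "|", starting at "MSG:",
as in the examples in this README. The names parse_fields and group_by_fru
are hypothetical.

```python
from collections import defaultdict


def parse_fields(entry):
    """Split the 'KEY:value|KEY:value' tail of an entry into a dict."""
    start = entry.find("MSG:")
    if start < 0:
        return {}
    fields = {}
    for part in entry[start:].split("|"):
        # Split on the first colon only; values may contain colons too.
        key, sep, value = part.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def group_by_fru(entries):
    """Group entries by the value of their FRU field, skipping others."""
    groups = defaultdict(list)
    for entry in entries:
        fru = parse_fields(entry).get("FRU")
        if fru:
            groups[fru].append(entry)
    return dict(groups)


SAMPLE = [
    "Apr 23 09:16:56 qswitch slot101:9.114.80.179;MSG:WARNING|"
    "CHASSIS:SilverStorm 9120 GUID=0x00066a0002000225|COND:#17 FRU state "
    "changed from online to offline|FRU:Power Supply 1|PN:200805-101",
    "Apr 23 09:19:12 qswitch slot101:9.114.80.179;MSG:NOTICE|"
    "CHASSIS:SilverStorm 9120 GUID=0x00066a0002000225|COND:#18 FRU state "
    "changed from offline to online|FRU:Power Supply 1|PN:200805-101",
]

if __name__ == "__main__":
    for fru, lines in group_by_fru(SAMPLE).items():
        print("Associated with FRU: '%s' (%d entries)" % (fru, len(lines)))
```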
If the -n flag is set, the output is sorted by the specified node names. See
the details in the example below:
############################################################
Logs for special nodes
############################################################
------------------------------------------------------------
Reported by node: 'c890f11ec06.ppd.pok.ibm.com'
------------------------------------------------------------
May 5 08:23:55 c890f12ec07 local6:notice c890f12ec07 iview_sm[5128]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#3 Appearance in fabric|
NODE:IBM G1 Logical HCA :port 2:0x0002550070027f1f|DETAIL:Node type: hca
May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance from
fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:Node
type: hca
------------------------------------------------------------
Reported by node: 'c890f11ec05.ppd.pok.ibm.com'
------------------------------------------------------------
May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5445]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#4 Disappearance from
fabric|NODE:IBM G1 Logical HCA :port 2:0x000255007002651f|DETAIL:Node
type: hca
If the -E flag is set together with the -n or -i flag, the annotatelog script
uses dsh to access the IB nodes, or the nodes specified by the -n flag, runs
the "errpt -J ERRLOG_ON ERRLOG_OFF" command to get the corresponding
timestamps, and adds these timestamps to the annotatelog output. See the
details below:
----------------------------------------------
Reported by IB node: 'c890f11ec06.ppd.pok.ibm.com: port 1: GUID 0x0002550070027f0f'
- ehca0 <-> IBM G1 Logical Switch 1 port 17 (GUID 0x0002550070027f20)
- Connector is C65-T1 (HV=Cx-T1)
- IBM G1 Logical Switch 1 <-> SilverStorm 9024 DDR port 16 (GUID 0x00066a00d900042d)
----------------------------------------------
# c890f11ec06.ppd.pok.ibm.com
# ERRLOG_ON: 01/05/08 09:23
# ERRLOG_OFF: 04/05/08 09:23
----------------------------------------------
May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance from
fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:Node
type: hca
If the -A flag is set, the output is the combination of -i, -S, -c and -u.
A log entry that cannot be parsed into any of the types above is displayed
under "Logs by others". These entries are displayed only when the -A flag is
used. See the details in the example below:
############################################################
Logs by others
############################################################
May 14 11:44:22 qswitch slot101:9.114.80.179 FEtask[86f38fb8]: ESM: Embedded SM
Error: rmsg_recv: output buffer[2016] too small for incoming data[2035] : 0
May 14 11:44:22 qswitch slot101:9.114.80.179 PM_task[86f08298]: ESM: Embedded SM
Error: DoSendFeAsync - message send failed : 128
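The fallback into "Logs by others" can be pictured with the following Python
sketch. It is a simplification under an explicit assumption: an entry is
taken to belong to a category when the matching field marker ("|SM:",
"|NODE:", "|CHASSIS:", "|FRU:") is present, and to "others" when none is.
The real script may apply different rules; the function name categories is
hypothetical.

```python
# Field marker -> category name, in the order used by this sketch.
MARKERS = [
    ("|SM:", "subnet manager"),
    ("|NODE:", "IB node"),
    ("|CHASSIS:", "chassis"),
    ("|FRU:", "FRU"),
]


def categories(entry):
    """Return the categories an entry falls into, or ['others'] if none."""
    found = [name for marker, name in MARKERS if marker in entry]
    return found or ["others"]


SM_ENTRY = (
    "May 5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]: "
    "c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance "
    "from fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|"
    "DETAIL:Node type: hca"
)
OTHER_ENTRY = (
    "May 14 11:44:22 qswitch slot101:9.114.80.179 PM_task[86f08298]: ESM: "
    "Embedded SM Error: DoSendFeAsync - message send failed : 128"
)

if __name__ == "__main__":
    print(categories(SM_ENTRY))
    print(categories(OTHER_ENTRY))
```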