# IBM(c) 2008 EPL license http://www.eclipse.org/legal/epl-v10.html

annotatelog.README

This README describes how to use the annotatelog script.

The syntax of the annotatelog command is:

annotatelog -f log_file [-s start_time] [-e end_time]
            { [-i -g guid_file -l link_file] [-S] [-c] [-u] | [-A -g guid_file -l link_file] }
            { [-n node_list -g guid_file] [-E] }
            [-h]

        -A       Output the combination of -i, -S, -c and -u. It must be used with the -g and -l flags.
        -f log_file
                 Specifies the full path name of the log file to analyze.
                 It must be an xCAT consolidated log obtained from the QLogic HSM or ESM.
        -s start_time
                 Specifies the start time for analysis, where the start_time
                 variable has the format ddmmyyhh:mm:ss (day, month, year,
                 hour, minute, and second). A time of 00:00:00 is valid.
        -e end_time
                 Specifies the end time for analysis, where the end_time
                 variable has the format ddmmyyhh:mm:ss (day, month, year,
                 hour, minute, and second). A time of 00:00:00 is valid.
        -l link_file
                 Specifies the full path name of a link file, which concatenates all
                 '/var/opt/iba/analysis/baseline/fabric*links' files from all fabric management servers.
        -g guid_file
                 Specifies the full path name of a GUID file, which contains a list of
                 GUIDs as obtained from the "getGuids" script.
        -E       Annotate with node ERRLOG_ON and ERRLOG_OFF information. This
                 can help determine whether a disappearance was caused by a node
                 disappearing. It is for AIX nodes only and must be used with the -i or -n flag.
        -S       Sort the log entries by subnet manager only.
        -i       Sort the log entries by IB node only.
        -c       Sort the log entries by chassis only.
        -u       Sort the log entries by FRU only.
        -n node_list
                 Specifies a comma-separated list of node host names or IP addresses to look up in the log entries.
        -h       Display usage information.
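
The ddmmyyhh:mm:ss format used by -s and -e can be checked before calling
the script. The following Python sketch is only an illustration and is not
part of annotatelog; the name parse_time_arg is hypothetical, and the
sketch assumes that a two-digit year such as 08 means 2008.

```python
from datetime import datetime

# Format taken from the -s/-e description: ddmmyyhh:mm:ss
TIME_FORMAT = "%d%m%y%H:%M:%S"


def parse_time_arg(value):
    """Parse a -s/-e style value such as '05050809:06:33'.

    Raises ValueError if the value does not match the format.
    """
    return datetime.strptime(value, TIME_FORMAT)


# 5 May 2008, from midnight (00:00:00 is valid) to 09:06:33.
start = parse_time_arg("05050800:00:00")
end = parse_time_arg("05050809:06:33")
print(start.isoformat(), end.isoformat())
```

A value that parses without error and for which start is earlier than end
is a reasonable candidate for the -s and -e flags.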

In an xCAT cluster with QLogic IB switches, the switch logs and the subnet
manager (ESM/HSM) logs are redirected to the xCAT Management Node using
the syslog protocol. The syslogd on the xCAT Management Node recognizes
the facility (local6) and priority (NOTICE and above) and puts the log
entries into a file/FIFO that is monitored by AIXSyslogSensor on an
AIX system or ErrorLogSensor on a Linux system. The condition-response
setup local to the xCAT Management Node then moves the log entries to the
file /var/log/xcat/errorlog/[xCAT Management Node]. As a result, this log
file contains a large number of entries and is difficult for the
administrator to look through.

annotatelog is a sample script that parses the QLogic log entries in the
file /var/log/xcat/errorlog/[xCAT Management Node] on the xCAT Management
Server and sorts them by subnet manager, IB node, chassis, FRU
(Field-Replaceable Unit) or a particular node. The script is supported on
both AIX and Linux Management Nodes. The log to analyze must be an xCAT
consolidated log, that is, a log file produced by the xCAT
syslog/errorlog monitoring mechanism, such as the
/var/log/xcat/errorlog/[xCAT Management Node] file. Because log formats
vary, xCAT does not support other log files.
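
To illustrate the entry layout that annotatelog works with, the following
Python sketch splits one entry of an xCAT consolidated log into its
KEY:value fields. It is not taken from the annotatelog script; the name
parse_entry is hypothetical, and the sketch assumes that the entry has
already been joined into a single line and that the fields after "MSG:"
are separated by "|", as in the sample entries in this README.

```python
def parse_entry(entry):
    """Split one QLogic log entry into a dict of KEY -> value.

    Returns an empty dict if the entry has no 'MSG:' part.
    """
    start = entry.find("MSG:")
    if start < 0:
        return {}
    fields = {}
    for part in entry[start:].split("|"):
        # Split at the first ':' only; values may contain ':' themselves.
        key, sep, value = part.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Sample entry from this README, joined into one line.
entry = (
    "May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5445]: "
    "c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#4 Disappearance "
    "from fabric|NODE:IBM G1 Logical HCA :port 2:0x000255007002651f|DETAIL:"
    "Node type: hca"
)
fields = parse_entry(entry)
for key in ("MSG", "SM", "COND", "NODE", "DETAIL"):
    print(key, "=", fields[key])
```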

This script provides several flags to specify the category criteria:
-S, -i, -c, -u, -n and -A.
If the -S flag is set, the output is sorted by subnet manager. Because
an SM may have multiple ports, the output is classified by
<subnet_manager_name: port number>, as shown in the example below:

############################################
Logs by Subnet Manager
############################################

----------------------------------------------
Report by subnet manager: 'c890f12ec07:port 2'
----------------------------------------------
May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5445]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 2:0x000255007002651f|DETAIL:
Node type: hca
May  5 08:23:55 c890f12ec07 local6:notice c890f12ec07 iview_sm[5128]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#3 Appearance in
fabric|NODE:IBM G1 Logical HCA :port 2:0x0002550070027f1f|DETAIL:Node
type: hca

----------------------------------------------
Report by subnet manager: 'c890f12ec07:port 1'
----------------------------------------------

May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 1:0x000255007002650f|DETAIL:
Node type: hca
May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:
Node type: hca
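
The grouping shown for -S can be sketched in a few lines of Python. This
is an illustration only, not the logic of the annotatelog script: the name
group_by_subnet_manager is hypothetical, and the sketch assumes one entry
per line with the subnet manager and port in an "|SM:...|" field.

```python
import re
from collections import defaultdict

# Captures '<subnet_manager_name>:port <n>' from the SM field.
SM_FIELD = re.compile(r"\|SM:([^|]+)\|")


def group_by_subnet_manager(entries):
    """Return a dict mapping 'name:port n' to the entries that mention it."""
    groups = defaultdict(list)
    for entry in entries:
        match = SM_FIELD.search(entry)
        if match:
            groups[match.group(1).strip()].append(entry)
    return dict(groups)


# Sample entries from this README, each joined into one line.
PREFIX = "c890f12ec07 local6:notice c890f12ec07 "
entries = [
    "May  5 09:06:33 " + PREFIX + "iview_sm[5445]: c890f12ec07; "
    "MSG:NOTICE|SM:c890f12ec07:port 2|COND:#4 Disappearance from fabric|"
    "NODE:IBM G1 Logical HCA :port 2:0x000255007002651f|DETAIL:Node type: hca",
    "May  5 09:06:33 " + PREFIX + "iview_sm[5442]: c890f12ec07; "
    "MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance from fabric|"
    "NODE:IBM G1 Logical HCA :port 1:0x000255007002650f|DETAIL:Node type: hca",
    "May  5 09:06:33 " + PREFIX + "iview_sm[5442]: c890f12ec07; "
    "MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance from fabric|"
    "NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:Node type: hca",
]

groups = group_by_subnet_manager(entries)
for name, members in groups.items():
    print("Report by subnet manager: '%s' (%d entries)" % (name, len(members)))
```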

If the -i flag is set, the output is sorted by IB node and classified by
<node_name: port number: GUID number>. If a log entry includes the
keyword "IBM *Logical", the script also displays the link relationships
between the IBM G1 Logical HCA and the IBM G1 Logical Switch, and between
the IBM G1 Logical Switch and the QLogic switch. The script uses the
result file of the getGuids script to find the node name that corresponds
to the HCA port number and GUID, and uses the fabric link files from the
Fabric Management Server (HSM/ESM) to get the QLogic switch connection
information, so the -i flag must be used with the -g and -l flags.
If a log entry does not include the keyword "IBM *Logical", the
corresponding node name and connection relationship cannot be determined,
so the entry is displayed directly and the string after "NODE:" is used
as the IB node. See the examples below:

############################################
Logs by IB node
############################################

----------------------------------------------
Reported by IB node: 'c890f11ec06.ppd.pok.ibm.com: port 1: GUID 0x0002550070027f0f'
- ehca0 <-> IBM G1 Logical Switch 1 port 17 (GUID 0x0002550070027f20)
- Connector is C65-T1 (HV=Cx-T1)
- IBM G1 Logical Switch 1 <-> SilverStorm 9024 DDR port 16 (GUID 0x00066a00d900042d)
----------------------------------------------

May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance
from fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:
Node type: hca

----------------------------------------------
Reported by IB node: 'SilverStorm 9120 GUID=0x00066a0002000225 Leaf 8, Chip A:port 0:0x00066a0007001311'
----------------------------------------------
Mar 25 16:21:54 c890f12ec07 iview_sm[3725]: c890f12ec07; MSG:NOTICE|
SM:c890f12ec07:port 1|COND:#4 Disappearance from fabric|NODE:SilverStorm
9120 GUID=0x00066a0002000225 Leaf 8, Chip A:port 0:0x00066a0007001311|
DETAIL:Node type: switch
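
The two cases above differ only in whether the NODE field matches
"IBM *Logical". The Python sketch below shows one way to split a NODE
value into name, port and GUID and to flag the entries that would need
the guid_file and link_file lookups. It is an assumption-laden
illustration: parse_node_field is a hypothetical name, the keyword is
interpreted here as the regular expression "IBM .*Logical", and the
lookups in the getGuids and fabric link files are not shown because their
formats are not described in this README.

```python
import re

# Interpretation of the "IBM *Logical" keyword (an assumption).
LOGICAL = re.compile(r"IBM .*Logical")


def parse_node_field(node):
    """Split '<name>:port <n>:<GUID>' into its three parts.

    The value is split from the right so that a ':' inside the name,
    if any, does not break the parsing.
    """
    name, port, guid = node.rsplit(":", 2)
    return {
        "name": name.strip(),
        "port": int(port.replace("port", "").strip()),
        "guid": guid.strip(),
        # True when the node name and links would have to be looked up.
        "needs_lookup": bool(LOGICAL.search(name)),
    }


hca = parse_node_field("IBM G1 Logical HCA :port 1:0x0002550070027f0f")
switch = parse_node_field(
    "SilverStorm 9120 GUID=0x00066a0002000225 Leaf 8, "
    "Chip A:port 0:0x00066a0007001311"
)
print(hca)
print(switch)
```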

If the -c flag is set, the output is sorted by chassis and classified
by <chassis: model type: GUID number>, as shown in the example below:

############################################################
Logs by CHASSIS
############################################################

----------------------------------------------
Reported by chassis: 'SilverStorm: model 9120: GUID 0x00066a0002000225'
----------------------------------------------

Apr 23 09:16:56 qswitch slot101:9.114.80.179;MSG:WARNING|CHASSIS:SilverStorm
9120 GUID=0x00066a0002000225|COND:#17 FRU state changed from online
to offline|FRU:Power Supply 1|PN:200805-101
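
The chassis heading in this example can be derived from the CHASSIS field
of the entry. The following Python sketch shows the idea; it is not the
script's own code, chassis_key is a hypothetical name, and the pattern
assumes that the field always has the form "<name> <model> GUID=<hex>"
seen in the sample entry.

```python
import re

CHASSIS_FIELD = re.compile(
    r"CHASSIS:(?P<name>\S+) (?P<model>\S+) GUID=(?P<guid>0x[0-9a-fA-F]+)"
)


def chassis_key(entry):
    """Return '<chassis>: model <type>: GUID <number>', or None."""
    match = CHASSIS_FIELD.search(entry)
    if match is None:
        return None
    return "{name}: model {model}: GUID {guid}".format(**match.groupdict())


# Sample entry from this README, joined into one line.
entry = (
    "Apr 23 09:16:56 qswitch slot101:9.114.80.179;MSG:WARNING|"
    "CHASSIS:SilverStorm 9120 GUID=0x00066a0002000225|COND:#17 FRU state "
    "changed from online to offline|FRU:Power Supply 1|PN:200805-101"
)
print("Reported by chassis: '%s'" % chassis_key(entry))
```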

If the -u flag is set, the output is sorted by FRU, as shown in the
example below:

############################################################
Logs by FRUS from CHASSIS
############################################################

------------------------------------------------------------
Associated with FRU: 'Power Supply 1'
------------------------------------------------------------

Apr 23 09:19:12 qswitch slot101:9.114.80.179;MSG:NOTICE|CHASSIS:SilverStorm
9120 GUID=0x00066a0002000225|COND:#18 FRU state changed from offline
to online|FRU:Power Supply 1|PN:200805-101
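
Once entries are grouped by FRU, the state changes of one FRU can be read
in sequence. The Python sketch below collects the transitions per FRU
from the two chassis entries shown in this README. It goes beyond what
annotatelog prints and is only an illustration; fru_history is a
hypothetical name, and the pattern assumes the "FRU state changed from X
to Y" wording of the sample entries.

```python
import re

FRU_EVENT = re.compile(
    r"COND:#(?P<code>\d+) FRU state changed from (?P<old>\w+) to (?P<new>\w+)"
    r"\|FRU:(?P<fru>[^|]+)\|PN:(?P<pn>\S+)"
)


def fru_history(entries):
    """Return a dict mapping FRU name to a list of (old, new) states."""
    history = {}
    for entry in entries:
        match = FRU_EVENT.search(entry)
        if match:
            fru = match.group("fru").strip()
            history.setdefault(fru, []).append(
                (match.group("old"), match.group("new"))
            )
    return history


# Sample entries from this README, each joined into one line.
CHASSIS = "CHASSIS:SilverStorm 9120 GUID=0x00066a0002000225|"
entries = [
    "Apr 23 09:16:56 qswitch slot101:9.114.80.179;MSG:WARNING|" + CHASSIS
    + "COND:#17 FRU state changed from online to offline|"
    "FRU:Power Supply 1|PN:200805-101",
    "Apr 23 09:19:12 qswitch slot101:9.114.80.179;MSG:NOTICE|" + CHASSIS
    + "COND:#18 FRU state changed from offline to online|"
    "FRU:Power Supply 1|PN:200805-101",
]
history = fru_history(entries)
print(history)
```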

If the -n flag is set, the output is sorted by the specified node names,
as shown in the example below:

############################################################
Logs for special nodes
############################################################
------------------------------------------------------------
Reported by node: 'c890f11ec06.ppd.pok.ibm.com'
------------------------------------------------------------

May  5 08:23:55 c890f12ec07 local6:notice c890f12ec07 iview_sm[5128]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#3 Appearance in fabric|
NODE:IBM G1 Logical HCA :port 2:0x0002550070027f1f|DETAIL:Node type: hca
May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance from
fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:Node
type: hca

------------------------------------------------------------
Reported by node: 'c890f11ec05.ppd.pok.ibm.com'
------------------------------------------------------------
May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5445]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 2|COND:#4 Disappearance from
fabric|NODE:IBM G1 Logical HCA :port 2:0x000255007002651f|DETAIL:Node
type: hca

If the -E flag is set together with the -n or -i flag, the annotatelog
script uses dsh to access the IB nodes, or the nodes in the list specified
by the -n flag, runs the "errpt -J ERRLOG_ON ERRLOG_OFF" command to get
the corresponding timestamps, and adds these timestamps to the annotatelog
output, as shown below:

----------------------------------------------
Reported by IB node: 'c890f11ec06.ppd.pok.ibm.com: port 1: GUID 0x0002550070027f0f'
- ehca0 <-> IBM G1 Logical Switch 1 port 17 (GUID 0x0002550070027f20)
- Connector is C65-T1 (HV=Cx-T1)
- IBM G1 Logical Switch 1 <-> SilverStorm 9024 DDR port 16 (GUID 0x00066a00d900042d)
----------------------------------------------
# c890f11ec06.ppd.pok.ibm.com
# ERRLOG_ON: 01/05/08 09:23
# ERRLOG_OFF: 04/05/08 09:23
----------------------------------------------

May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]:
c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance from
fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|DETAIL:Node
type: hca

If the -A flag is set, the output is the combination of -i, -S, -c and -u.
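
The flag dependencies described in this README can be summarized as a
small check. The Python sketch below is not part of annotatelog and does
not reproduce its argument handling; check_flags is a hypothetical helper
that encodes only the rules stated here: -f is required, -i and -A need
both -g and -l, -n needs -g, and -E needs -i or -n (following the
description of -E in the section above).

```python
def check_flags(flags):
    """Return a list of problems for a set of flag letters.

    'flags' is a set such as {'f', 'i', 'g', 'l'}; an empty list
    means that the combination is consistent with this README.
    """
    problems = []
    if "f" not in flags:
        problems.append("-f log_file is required")
    for flag in ("i", "A"):
        if flag in flags and not {"g", "l"} <= flags:
            problems.append("-%s requires -g and -l" % flag)
    if "n" in flags and "g" not in flags:
        problems.append("-n requires -g")
    if "E" in flags and not ({"i", "n"} & flags):
        problems.append("-E requires -i or -n")
    return problems


print(check_flags({"f", "A", "g", "l"}))
print(check_flags({"f", "i"}))
print(check_flags({"f", "E"}))
```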

If a log entry cannot be parsed into any of the types above, it is
displayed under "Logs by others". These entries are displayed only when
the -A flag is used, as shown in the example below:

############################################################
Logs by others
############################################################
May 14 11:44:22 qswitch slot101:9.114.80.179 FEtask[86f38fb8]: ESM: Embedded SM
Error: rmsg_recv: output buffer[2016] too small for incoming data[2035] : 0
May 14 11:44:22 qswitch slot101:9.114.80.179 PM_task[86f08298]: ESM: Embedded SM
Error: DoSendFeAsync  - message send failed  : 128
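
The fallback to "Logs by others" can be pictured as follows. This Python
sketch is an assumption about how such a classification could work, not a
description of the annotatelog implementation: classify is a hypothetical
name, and an entry is matched to a category simply by the presence of the
"|SM:", "|NODE:", "|CHASSIS:" or "|FRU:" field markers seen in the sample
entries.

```python
# (category, field marker) pairs; the leading '|' avoids matching
# free text such as 'ESM: Embedded SM Error'.
MARKERS = [
    ("subnet manager", "|SM:"),
    ("IB node", "|NODE:"),
    ("chassis", "|CHASSIS:"),
    ("FRU", "|FRU:"),
]


def classify(entry):
    """Return the categories an entry belongs to, or ['others']."""
    found = [name for name, marker in MARKERS if marker in entry]
    return found or ["others"]


# Sample entries from this README, each joined into one line.
sm_entry = (
    "May  5 09:06:33 c890f12ec07 local6:notice c890f12ec07 iview_sm[5442]: "
    "c890f12ec07; MSG:NOTICE|SM:c890f12ec07:port 1|COND:#4 Disappearance "
    "from fabric|NODE:IBM G1 Logical HCA :port 1:0x0002550070027f0f|"
    "DETAIL:Node type: hca"
)
chassis_entry = (
    "Apr 23 09:16:56 qswitch slot101:9.114.80.179;MSG:WARNING|"
    "CHASSIS:SilverStorm 9120 GUID=0x00066a0002000225|COND:#17 FRU state "
    "changed from online to offline|FRU:Power Supply 1|PN:200805-101"
)
other_entry = (
    "May 14 11:44:22 qswitch slot101:9.114.80.179 FEtask[86f38fb8]: ESM: "
    "Embedded SM Error: rmsg_recv: output buffer[2016] too small for "
    "incoming data[2035] : 0"
)
for sample in (sm_entry, chassis_entry, other_entry):
    print(classify(sample))
```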