From 6a1586d4cb1d3ce1391284779242f28bda47e096 Mon Sep 17 00:00:00 2001
From: GONG Jie
Date: Sun, 31 Dec 2017 23:59:59 +0000
Subject: [PATCH] Remove trailing spaces in file
 xCAT-server/share/xcat/ib/scripts/healthCheck.README

---
 .../share/xcat/ib/scripts/healthCheck.README  | 124 +++++++++---------
 1 file changed, 62 insertions(+), 62 deletions(-)

diff --git a/xCAT-server/share/xcat/ib/scripts/healthCheck.README b/xCAT-server/share/xcat/ib/scripts/healthCheck.README
index 2c59dbf57..798f15782 100644
--- a/xCAT-server/share/xcat/ib/scripts/healthCheck.README
+++ b/xCAT-server/share/xcat/ib/scripts/healthCheck.README
@@ -8,7 +8,7 @@ The syntax of the healthCheck command is:
 
 healthCheck { [-n node_list] [-M]}
             {[-p min_clock_speed] [-i method] [-m min_memory]
-             [-l min_freelp] [ -H [--speed speed --ignore interface_list --width width]]} 
+             [-l min_freelp] [ -H [--speed speed --ignore interface_list --width width]]}
             [ -h ]
     -M
         Check status for all the Managed Nodes that are defined on this MN.
@@ -17,7 +17,7 @@ healthCheck { [-n node_list] [-M]}
     -p min_clock_speed
         Specifies the minimal processor clock speed in MHz for processor monitor.
     -i method
-        Specifies the method to do Infiniband interface status check, the supported 
+        Specifies the method to do Infiniband interface status check, the supported
         check methods are LL and RSCT.
     -m min_memory
         Specifies the minimal total memory in MB.
@@ -27,68 +27,68 @@ healthCheck { [-n node_list] [-M]}
     --speed speed
         Specifies the physical port speed in G bps, it should be used with -H flag.
     --ignore interface_list
-        Specifies a comma-separated list of interface name to ignore from HCA status check, 
+        Specifies a comma-separated list of interface name to ignore from HCA status check,
         such as ib0,ib1. It should be used with -H flag.
     --width width
         Specifies the physical port width, such as 4X or 12X. It should be used with -H flag.
     -h
         Display usage information.
- 
-This script is used to check the system health for both AIX and Linux 
-Managed Nodes on Power6 platforms. It will use xdsh to access the target 
-nodes, and check the status for processor clock speed, IB interfaces, 
-memory and large page configuration. If xdsh is unreachable, an error 
+
+This script is used to check the system health for both AIX and Linux
+Managed Nodes on Power6 platforms. It will use xdsh to access the target
+nodes, and check the status for processor clock speed, IB interfaces,
+memory and large page configuration. If xdsh is unreachable, an error
 message will be given.
 1. Processor clock speed check
-This script will use xdsh command to access the target nodes, and run 
-"/usr/pmapi/tools/pmcycles -M" command on the AIX MNs or "cat 
-/proc/cpuinfo" command on Linux MNs to list the actual processor clock 
-speed in MHz. Compare this actual speed with the minimal value that user 
-specified in command line with -p flag, if it is smaller than the minimal 
-value, a warning message will be given out to indicate the unexpected low 
+This script will use xdsh command to access the target nodes, and run
+"/usr/pmapi/tools/pmcycles -M" command on the AIX MNs or "cat
+/proc/cpuinfo" command on Linux MNs to list the actual processor clock
+speed in MHz. Compare this actual speed with the minimal value that user
+specified in command line with -p flag, if it is smaller than the minimal
+value, a warning message will be given out to indicate the unexpected low
 frequency.
 
 2. IB interface status check by llstatus
-In LoadLeveler cluster environment, all the nodes are sharing the same 
-cluster information. So we only need to xdsh to one of these nodes, and 
-run LoadLeveler command "/usr/lpp/LoadL/full/bin/llstatus -a" on AIX or 
-"/opt/ibmll/LoadL/full/bin/llstatus -a" on Linux nodes to list the IB 
-interface status. If the status is not "READY", a warning message related 
-to its nodename and IB port will be given out. This check process needs 
-the "llstatus" command existed on the MNs, if it does not exist, an error 
+In LoadLeveler cluster environment, all the nodes are sharing the same
+cluster information. So we only need to xdsh to one of these nodes, and
+run LoadLeveler command "/usr/lpp/LoadL/full/bin/llstatus -a" on AIX or
+"/opt/ibmll/LoadL/full/bin/llstatus -a" on Linux nodes to list the IB
+interface status. If the status is not "READY", a warning message related
+to its nodename and IB port will be given out. This check process needs
+the "llstatus" command existed on the MNs, if it does not exist, an error
 message will be output.
 
 3. IB interface status check by lsrsrc
-This script will use xdsh command to access the target nodes, and run 
-"/usr/bin/lsrsrc IBM.NetworkInterface Name OpState" command on AIX or 
-Linux MNs to list the IB interface status for each node. If the "OpState" 
-value is not "1", a warning message related to its nodename and IB port 
+This script will use xdsh command to access the target nodes, and run
+"/usr/bin/lsrsrc IBM.NetworkInterface Name OpState" command on AIX or
+Linux MNs to list the IB interface status for each node. If the "OpState"
+value is not "1", a warning message related to its nodename and IB port
 will be given out.
 
 4. Memory check
-This script will use xdsh command to access the target nodes, and run 
-"/usr/bin/vmstat" command on AIX MNs or "cat /proc/meminfo" commands on 
-Linux MNs to list the total memory information. If the total memory is 
-smaller than the minimal value specified by the user in GB, a warning 
-message will be given out with the node name and its real total memory 
-account. 
+This script will use xdsh command to access the target nodes, and run
+"/usr/bin/vmstat" command on AIX MNs or "cat /proc/meminfo" commands on
+Linux MNs to list the total memory information. If the total memory is
+smaller than the minimal value specified by the user in GB, a warning
+message will be given out with the node name and its real total memory
+account.
 
 5. Free large page check
-This script will use xdsh command to access the target nodes, and run 
-"/usr/bin/vmstat -l" command on AIX MNs or "cat /proc/meminfo" commands 
-on Linux MNs to list the free large page information. If the free large 
-page number is smaller than the minimal value specified by the user, a 
-warning message will be given out with the node name and its real free 
-large page number. 
+This script will use xdsh command to access the target nodes, and run
+"/usr/bin/vmstat -l" command on AIX MNs or "cat /proc/meminfo" commands
+on Linux MNs to list the free large page information. If the free large
+page number is smaller than the minimal value specified by the user, a
+warning message will be given out with the node name and its real free
+large page number.
 
 6. Check HCA status
-This script will use xdsh command to access the target nodes. 
+This script will use xdsh command to access the target nodes.
 For AIX nodes, we use command ibstat -v | egrep "IB PORT.*INFO|Port State
-:|Physical Port" to get the HCA status of Logical Port State, Physical 
-Port State, Physical Port Physical State, Physical Port Speed and Physical 
-Port Width. The expected values are "Logical Port State: Active", "Physical 
-Port State: Active", "Physical Port Physical State: Link Up", "Physical 
-Port Width: 4X". If the actual value is not the same as expected one, a 
+:|Physical Port" to get the HCA status of Logical Port State, Physical
+Port State, Physical Port Physical State, Physical Port Speed and Physical
+Port Width. The expected values are "Logical Port State: Active", "Physical
+Port State: Active", "Physical Port Physical State: Link Up", "Physical
+Port Width: 4X". If the actual value is not the same as expected one, a
 warning message will be given out.
 This is an example of the output of ibstat command:
 c890f11ec01:/ # ibstat -v | egrep "IB PORT.*INFO|Port State:|Physical Port"
@@ -106,9 +106,9 @@ Physical Port Speed: 2.5G
 Physical Port Width: 4X
 
 For Linux nodes, we use command ibv_devinfo -v | egrep "ehca|port:|state:
-|width:|speed:" to get the HCA status of port state, active_width, active_speed 
-and phys_state. The expected values are "port state: PORT_ACTIVE", 
-"active_width: 4X", "phys_state: LINK_UP". If the actual value is not the 
+|width:|speed:" to get the HCA status of port state, active_width, active_speed
+and phys_state. The expected values are "port state: PORT_ACTIVE",
+"active_width: 4X", "phys_state: LINK_UP". If the actual value is not the
 same as expected one, a warning message will be given out.
 This is an example of the output of ibv_devinfo command:
 c890f11ec05:~ # ibv_devinfo -v | egrep "ehca|port:|state:|width:|speed:"
@@ -124,16 +124,16 @@ hca_id: ehca0
                 active_speed:           2.5 Gbps (1)
                 phys_state:             LINK_UP (5)
 
-But for "Physical Port Speed" on AIX nodes or "active_speed" on Linux nodes, 
-since SDR and DDR adapters will use the different speeds, SDR is 2.5G and DDR 
-is 5.0G, so the user needs to specify this "Speed" by flag "--speed", for 
+But for "Physical Port Speed" on AIX nodes or "active_speed" on Linux nodes,
+since SDR and DDR adapters will use the different speeds, SDR is 2.5G and DDR
+is 5.0G, so the user needs to specify this "Speed" by flag "--speed", for
 example:
 
 healthCheck -N AIXNodes -H --speed 2.5
 
-If "--speed" is not specified with "-H" flag, healthCheck script will list the 
-actual value of "Physical Port Speed" gotten from ibstat command for each HCAs, 
-so that it is easy for the user to use "grep" command to find the speed value 
+If "--speed" is not specified with "-H" flag, healthCheck script will list the
+actual value of "Physical Port Speed" gotten from ibstat command for each HCAs,
+so that it is easy for the user to use "grep" command to find the speed value
 he/she wants. The output format is ::< Physical Port Speed >: ,
 for example:
 
@@ -142,8 +142,8 @@ c890f11ec01.ppd.pok.ibm.com: ib0: Physical Port Speed: 2.5G
 c890f11ec01.ppd.pok.ibm.com: ib1: Physical Port Speed: 2.5G
 c890f11ec02.ppd.pok.ibm.com: ib0: Physical Port Speed: 5.0G
 c890f11ec02.ppd.pok.ibm.com: ib1: Physical Port Speed: 5.0G
-Since the output of ibstat or ibv_devinfo is identified by HCA name and port 
-number, so we will use the mapping table below to map the HCA name and port 
+Since the output of ibstat or ibv_devinfo is identified by HCA name and port
+number, so we will use the mapping table below to map the HCA name and port
 number to its interface name. Please see the table below:
 
 Interface Name      Adapter Name    Port Number
@@ -153,14 +153,14 @@ ib2 iba1/ehca1 1
 ib3                 iba1/ehca1      2
 ......
 
-For "Physical Port Width" on AIX nodes or "active_width" on Linux nodes, since 
-it could be 4X or 12X, so the user needs to specify this "width" by flag 
+For "Physical Port Width" on AIX nodes or "active_width" on Linux nodes, since
+it could be 4X or 12X, so the user needs to specify this "width" by flag
 "--width", for example:
 
 healthCheck -N LinuxNodes -H --width 4X
 
-If "--width" is not specified, healthCheck script will list the actual value 
-of "Physical Port Width" gotten from ibstat command for each HCAs, so that it 
+If "--width" is not specified, healthCheck script will list the actual value
+of "Physical Port Width" gotten from ibstat command for each HCAs, so that it
 is easy for the user to use "grep" command to find the speed value he/she wants.
 The output format is ::< Physical Port Width >: , for example:
 
@@ -169,8 +169,8 @@ c890f11ec01.ppd.pok.ibm.com: ib0: Physical Port Width: 4X
 c890f11ec01.ppd.pok.ibm.com: ib1: Physical Port Width: 4X
 c890f11ec02.ppd.pok.ibm.com: ib0: Physical Port Width: 4X
 
-For the ports that are not used by the target nodes, the user could use --ignore 
-flag to exclude them from HCA status check. If the user does not specify these 
-"unused port" with --ignore flag, healthCheck script will check all HCA check 
-items for all interfaces, and return the warning message to for the failed ones. 
+For the ports that are not used by the target nodes, the user could use --ignore
+flag to exclude them from HCA status check. If the user does not specify these
+"unused port" with --ignore flag, healthCheck script will check all HCA check
+items for all interfaces, and return the warning message to for the failed ones.
 The user could use grep piped into wc -l to get the total number of "unused port".
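
Note on the README text touched by this patch: the mapping table in it (ib0 and ib1 on iba0/ehca0 ports 1 and 2, ib2 and ib3 on iba1/ehca1 ports 1 and 2) follows a simple arithmetic pattern. The sketch below is only an illustration of that pattern, not code from the healthCheck script, which is not part of this patch. The function name interface_name and the ports_per_adapter parameter are hypothetical, and it assumes every adapter has two ports and that the numbering continues as the "......" row suggests.

```python
import re


def interface_name(adapter: str, port: int, ports_per_adapter: int = 2) -> str:
    """Map an HCA name and port number to an interface name.

    Hypothetical helper mirroring the README table; it assumes two
    ports per adapter and adapter names of the form ibaN or ehcaN.
    """
    match = re.fullmatch(r"(?:iba|ehca)(\d+)", adapter)
    if match is None:
        raise ValueError(f"unrecognized adapter name: {adapter!r}")
    if not 1 <= port <= ports_per_adapter:
        raise ValueError(f"port must be in 1..{ports_per_adapter}, got {port}")
    # Adapter N contributes interfaces N*ports_per_adapter onward.
    index = int(match.group(1)) * ports_per_adapter + (port - 1)
    return f"ib{index}"


if __name__ == "__main__":
    # Reproduce the four rows listed in the README table.
    for adapter, port in [("iba0", 1), ("ehca0", 2), ("iba1", 1), ("ehca1", 2)]:
        print(adapter, port, interface_name(adapter, port))
```

With this mapping, a name passed to --ignore (such as ib1) can be related back to the adapter and port that ibstat or ibv_devinfo reports.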