Overview
We want to add a new command, xcatprobe, that probes for possible issues in xCAT. It can statically probe the xCAT MN (management node) and the xCAT node definitions. It can also statically probe node discovery and node deployment. The goal is to help xCAT users predict and debug xCAT problems easily.
Interface
The syntax of the xcatprobe command:
xcatprobe <probe_type> [parameters]
xcatprobe
# same as xcatprobe help
xcatprobe help
xcatprobe nodedef <noderange>
xcatprobe osdef <osimage>
xcatprobe xcatmn
xcatprobe node <noderange>
xcatprobe switch
xcatprobe nodeready <noderange> [console] [deployment]
xcatprobe nodediscover [noderange]
xcatprobe nodedeploy
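The dispatch implied by the syntax above can be sketched as follows. This is a minimal illustration, not the xcatprobe implementation: the names `PROBE_TYPES` and `parse_command` are hypothetical, and only the probe types and usage strings come from this page.

```python
# Hypothetical sketch: map each probe type listed above to its usage line.
PROBE_TYPES = {
    "help": "xcatprobe help",
    "nodedef": "xcatprobe nodedef <noderange>",
    "osdef": "xcatprobe osdef <osimage>",
    "xcatmn": "xcatprobe xcatmn",
    "node": "xcatprobe node <noderange>",
    "switch": "xcatprobe switch",
    "nodeready": "xcatprobe nodeready <noderange> [console] [deployment]",
    "nodediscover": "xcatprobe nodediscover [noderange]",
    "nodedeploy": "xcatprobe nodedeploy",
}


def parse_command(argv):
    """Split an argument list into (probe_type, parameters).

    An empty argument list is treated the same as 'help'.
    Raises ValueError for an unknown probe type.
    """
    if not argv:
        return "help", []
    probe_type, params = argv[0], list(argv[1:])
    if probe_type not in PROBE_TYPES:
        raise ValueError("unknown probe type: " + probe_type)
    return probe_type, params
```

For example, `parse_command(["nodedef", "node01"])` yields the probe type `nodedef` with `node01` as its only parameter.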
xcatprobe help
Display the usage of xcatprobe.
- Display the basic usage
- Display all the probe types
xcatprobe nodedef
Probe node definition.
- Check the validity of the node name
- Check ip<=>node entry in /etc/hosts
- Check DNS resolution
- Check HWcontrol: check the definition and try rpower status to make sure hardware control is ready to use.
- Check attributes: mgt, netboot, mac
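The ip<=>node check against /etc/hosts could look roughly like the sketch below. It is an illustration under assumptions, not xcatprobe code: `parse_hosts` and `check_hosts_entry` are hypothetical names, the hosts file content is passed in as text so the logic can be exercised without touching the real file, and the expected IP is assumed to come from the node definition.

```python
def parse_hosts(text):
    """Return a dict mapping each host name or alias to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping


def check_hosts_entry(node, expected_ip, hosts_text):
    """Return None if node maps to expected_ip, else a problem description."""
    mapping = parse_hosts(hosts_text)
    if node not in mapping:
        return "%s has no entry in /etc/hosts" % node
    if mapping[node] != expected_ip:
        return "%s maps to %s, expected %s" % (node, mapping[node], expected_ip)
    return None
```

A caller would read /etc/hosts once and run `check_hosts_entry` for every node in the noderange, reporting each non-None result.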
xcatprobe osdef
Probe the definition of an osimage
- check the basic attributes: imagetype, osarch, osdistroname, osname, osvers
- check the existence of packages in pkgdir
- check the packages in the otherpkgdir
- check the entries in the pkglist and otherpkglist
- check the rootimage in rootimgdir for netboot image
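The basic attribute check can be sketched as below. This is an illustration only: the attribute names come from the list above, while `REQUIRED_ATTRS`, `missing_attributes`, and the idea of passing the osimage definition as a plain dict are assumptions made for the example.

```python
# Attribute names taken from the list above.
REQUIRED_ATTRS = ("imagetype", "osarch", "osdistroname", "osname", "osvers")


def missing_attributes(osimage):
    """Return the required attributes that are absent or empty.

    `osimage` is assumed to be a dict of attribute name -> value.
    """
    missing = []
    for attr in REQUIRED_ATTRS:
        value = osimage.get(attr)
        if value is None or not str(value).strip():
            missing.append(attr)
    return missing
```

An empty result means all basic attributes are set; the directory and package list checks would follow as separate steps.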
xcatprobe xcatmn
Probe the readiness of xCAT MN
- Check the hostname, long name
- Check xcatd has started successfully (six processes are running)
- Check xcatd is listening on its 2 important ports
- Check the basic configuration of xcat: site table, passwd table, network table
- Check mnip is configured on current server and is a static ip
- Check that SELinux has been disabled
- Check the firewall has been closed
- Check the free disk space of /tmp /var /install
- Check the size of the dhcpd.leases file is less than 100MB
- Check the network services are running and configured properly: dhcpd, named, tftpd, httpd
- Verify all of the above items for all the service nodes
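Two of the checks above, free disk space and the dhcpd.leases size limit, can be sketched with the Python standard library. The function names are hypothetical, the location of dhcpd.leases is left to the caller because it varies between distributions, and reading 100MB as 100 * 1024 * 1024 bytes is an assumption.

```python
import os
import shutil

# Assumption: "100MB" is interpreted as 100 * 2**20 bytes.
LEASES_LIMIT = 100 * 1024 * 1024


def free_space_report(paths=("/tmp", "/var", "/install")):
    """Return {path: free bytes}, with None for a path that does not exist."""
    report = {}
    for path in paths:
        if os.path.isdir(path):
            report[path] = shutil.disk_usage(path).free
        else:
            report[path] = None
    return report


def leases_size_ok(leases_path, limit=LEASES_LIMIT):
    """Return True if the leases file is smaller than `limit` bytes."""
    return os.path.getsize(leases_path) < limit
```

What counts as enough free space is not specified on this page, so the sketch only reports the numbers and leaves the threshold to the caller.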
xcatprobe node
Probe whether the node is ready for use
- ssh without password
- syslog has been configured
- verify the parallel commands such as xdsh and xdcp
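The "ssh without password" check can be approximated by running ssh in batch mode, which fails instead of prompting. The sketch below is an assumption about how such a check might be built, not xcatprobe code; the function names are hypothetical, while `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```python
import subprocess


def ssh_check_command(node, timeout=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt for a password
        "-o", "ConnectTimeout=%d" % timeout,  # give up on unreachable nodes
        node,
        "true",                               # trivial remote command
    ]


def passwordless_ssh_ok(node, timeout=5):
    """Return True if the node accepts an ssh login without a password."""
    try:
        result = subprocess.run(
            ssh_check_command(node, timeout),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:  # ssh client is not installed
        return False
    return result.returncode == 0
```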
xcatprobe switch
Probe the configuration of switches
- Check whether the IP, user, password, auth have been configured
- Check whether SNMP v1/v3 is enabled
- Check the system description/name from snmp
- Display the mac table for the switch
xcatprobe nodeready [console] [deployment]
Probe the readiness of node.
- Check the console configuration
  - check the node attributes: cons, serial*
  - check the cfg in /etc/conserver.cf
- Check the readiness for OS deployment:
  - provmethod is set, readiness of osimage
  - dhcp set in dhcpd.leases
  - readiness of bootloader and bootloader cfg file
  - readiness of installer kernel + initrd
  - readiness of installer cfg file
- It can handle all the nodes in the
xcatprobe nodediscover [noderange]
Probe the node discovery process
Start a process to check the following stages of the node discovery process:
- check the dhcp dynamic range for BMC and host
- if possible, display the free ips in the range
- check the readiness of genesis packages first
- check the genesis has been installed
- check the mknb has been run, the genesis kernel+initrd has been created
- check the cfg files have been created and their names match the ones configured in dhcpd.conf
- if [noderange] is specified
- check the nextbootorder is set to network
- node sends dhcp request and gets an ip (syslog)
- the ip should be one from the host dynamic ip range
- node downloads bootloader
- for x86_64: xnba (syslog/httplog)
- for ppc64le: none
- else: error
- node downloads cfg file for bootloader
- for x86_64: xnba cfg (net_cfg for discovery)
- for ppc64le: petitboot cfg
- node downloads genesis (kernel + initrd)
- node runs doxcat
- node runs discovery
- node finishes the info collection
- node sends findme request to xCAT MN
- xcatd handles the findme request
- xcat finds or cannot find a matched node for the discovered node:
- [specific to the findme code rather than xcatprobe]
- if matched: display the matched node. (Add prefix with the discovery method like: [MTMS], [Switch], [SEQ])
- if not matched: display the
- [MTMS]: my MTMS is xxxx, cannot find any pre-defined node;
- [Switch]: my mac is xxxx, my switch port is yyyy+zzzz, cannot find any pre-defined node; Display the mac address table for the switch;
- [SEQ]: cannot find free host or bmc
- log the findme info if xcatdebugmode is enabled
- update matched node
- node discovery finishes
- do the next task: bmcsetup ...
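Following a node through the ordered stages above amounts to scanning log lines and advancing one stage at a time. The sketch below illustrates that idea only: `track_stages` is a hypothetical name, and the marker strings in the usage example are invented placeholders, not actual syslog or xcatd messages.

```python
def track_stages(stages, log_lines):
    """Follow an ordered list of (stage name, marker substring) pairs.

    Returns (completed stage names, name of the first stage not yet seen),
    where the second element is None once every stage has been seen.
    A marker for a later stage is ignored until all earlier stages are done.
    """
    index = 0
    for line in log_lines:
        if index < len(stages) and stages[index][1] in line:
            index += 1
    completed = [name for name, _marker in stages[:index]]
    pending = stages[index][0] if index < len(stages) else None
    return completed, pending


# Usage with placeholder markers (assumptions, not real log text).
example_stages = [
    ("dhcp request", "MARK-dhcp"),
    ("bootloader download", "MARK-bootloader"),
    ("genesis download", "MARK-genesis"),
    ("findme request", "MARK-findme"),
]
example_log = ["t1 MARK-dhcp", "t2 MARK-bootloader", "t3 unrelated line"]
done, waiting_for = track_stages(example_stages, example_log)
```

Reporting the first pending stage tells the user where discovery stalled, which is the main debugging value of a staged probe like this.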
xcatprobe nodedeploy
Probe the process of node deployment
- node sends dhcp request (syslog)
- node downloads xnba (syslog/httplog)
- node downloads xnba cfg (node specific cfg file)
- node downloads installer (kernel + initrd)
- node downloads cfg file for installer (kickstart, autoyast)
- node starts package install
- node runs postscript (A, B, C)
- node reboots
- node runs postbootscript
- sshd is running on the node