{{:Design Warning}}
The xCAT nodelist table holds the node reachability status (status) and the application status (appstatus). To turn on status monitoring, run the following commands:
**monadd xcatmon -n -s [ping-interval=5]** (The default ping-interval is 3 minutes).
**monstart xcatmon**
To turn off the status monitoring, run:
**monstop xcatmon**
1. Node status
xCAT currently uses fping to get the node status. On Linux, we will switch to nmap (querying the ssh port) for performance reasons.
2. Application status
Format: app1=status1,app2=status2.... Example: ssh="up",ll="down",gpfs="not working at all"
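As an illustration of the format above, here is a minimal Python sketch that parses an appstatus string into a dictionary and builds one back. The helper names `parse_appstatus` and `format_appstatus` are hypothetical and not part of xCAT. The sketch assumes a status value may optionally be wrapped in double quotes, as in the example, and that a quoted value contains no embedded double quote.

```python
import re

# One name=value pair. The value is either double-quoted (and may then
# contain spaces or commas) or a bare token that runs to the next comma.
_PAIR = re.compile(r'\s*([^=,\s]+)\s*=\s*(?:"([^"]*)"|([^,]*))\s*(?:,|$)')


def parse_appstatus(text):
    """Return {app: status} for an appstatus string (hypothetical helper)."""
    result = {}
    pos = 0
    while pos < len(text):
        m = _PAIR.match(text, pos)
        if m is None:
            raise ValueError(f"malformed appstatus near position {pos}: {text!r}")
        app, quoted, bare = m.group(1), m.group(2), m.group(3)
        result[app] = quoted if quoted is not None else bare.strip()
        pos = m.end()
    return result


def format_appstatus(statuses):
    """Build an appstatus string, quoting every value (hypothetical helper)."""
    return ",".join(f'{app}="{status}"' for app, status in statuses.items())
```

For instance, `parse_appstatus('ssh="up",ll="down"')` yields a dictionary with the keys `ssh` and `ll`.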
The basic idea is to use nmap to query the ports of the application daemons. If the port is open, the application is considered healthy. However, some applications may need further checking even though the port is open. For such applications, the user can supply a command (script) that checks the status. The input to the command is a comma-separated list of node names; the output is the application status on each given node. The output format is:
node1:status
node2:status
...
It can be a local command, or a command that will be run remotely on the nodes.
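A user-supplied status command following this contract could look like the Python sketch below. It is only an illustration: the function names, the "up"/"down" status strings, the way the node list is passed (as a single command-line argument), and the use of a plain TCP connect instead of nmap are all assumptions. Port 5001 is borrowed from the `ll` example in the settings table.

```python
import socket
import sys


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_nodes(nodes_csv, probe):
    """Turn a comma-separated node list into 'node:status' lines.

    probe is any callable taking a node name and returning True or False.
    """
    lines = []
    for node in nodes_csv.split(","):
        node = node.strip()
        if not node:
            continue  # skip empty entries such as a trailing comma
        status = "up" if probe(node) else "down"
        lines.append(f"{node}:{status}")
    return lines


# Assumed invocation: mycmd.py node1,node2,...
if __name__ == "__main__" and len(sys.argv) == 2:
    for line in check_nodes(sys.argv[1], lambda n: port_is_open(n, 5001)):
        print(line)
```

Keeping the probe as a parameter lets the same formatting logic be reused for checks that go beyond an open port.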
Settings:
Table monsetting:
| name | key | value |
|---|---|---|
| xcatmon | apps | ssh,ll,gpfs,someapp |
| xcatmon | gpfs | cmd=/tmp/mycmd,group=compute,group=service |
| xcatmon | ll | port=5001,group=compute |
| xcatmon | someapp | rmccondname=xxxx,group=all |
Keywords:
**apps**: a comma-separated list of application names whose status will be queried. To find out how the status of each app is obtained, look for the app name in the key field of a different row.
**port**: the application port number. If not specified, the internal list is used, then /etc/services. If there is no key specified for an app, "port" and "group=all" are assumed.
**group**: the name of a node group whose application status is to be collected. If not specified, all the nodes in the nodelist table are assumed.
**cmd**: the command is run locally on the management node (mn) or service node (sn).
**dcmd**: the command is run distributed on the nodes (xdsh <nodes> ...).
**rmccondname**: the RMC condition name. xCAT first needs to associate the condition with the LogEventToxCATDatabase response. It then goes to the eventlog table and gets the events since the last observation. (This has not been implemented yet.)
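To make the keyword rules concrete, the following Python sketch turns monsetting rows into a per-application configuration. It is not xCAT code. `parse_app_settings` and the shape of the returned dictionaries are hypothetical, rows are assumed to arrive as (name, key, value) tuples, and a command value is assumed to contain no comma.

```python
def parse_app_settings(rows, monitor="xcatmon"):
    """Map each app listed under the 'apps' key to its check settings.

    rows: iterable of (name, key, value) tuples from the monsetting table.
    """
    by_key = {key: value for name, key, value in rows if name == monitor}
    apps = [a.strip() for a in by_key.get("apps", "").split(",") if a.strip()]

    settings = {}
    for app in apps:
        # Defaults from the keyword descriptions: "port" and "group=all".
        conf = {"method": "port", "target": None, "groups": []}
        for item in by_key.get(app, "").split(","):
            item = item.strip()
            if not item:
                continue
            keyword, _, val = item.partition("=")
            if keyword == "group":
                conf["groups"].append(val)  # "group" may be repeated
            elif keyword in ("port", "cmd", "dcmd", "rmccondname"):
                conf["method"] = keyword
                conf["target"] = val or None
            # Unknown keywords are ignored (an assumption of this sketch).
        if not conf["groups"]:
            conf["groups"] = ["all"]
        settings[app] = conf
    return settings


# The sample rows from the settings table.
SAMPLE_ROWS = [
    ("xcatmon", "apps", "ssh,ll,gpfs,someapp"),
    ("xcatmon", "gpfs", "cmd=/tmp/mycmd,group=compute,group=service"),
    ("xcatmon", "ll", "port=5001,group=compute"),
    ("xcatmon", "someapp", "rmccondname=xxxx,group=all"),
]
```

With the sample rows, `ssh` has no row of its own, so it falls back to a port check on group `all`. Its `target` of `None` stands for "look the port up in the internal list, then /etc/services".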
3. nodestat
New flags for the nodestat command:
nodestat <nodelist> -u|--updatedb -m|--usemon
It displays the node status and application status, and it also writes the status to the nodelist table.
By default, it works as before, that is:
1. gets the ssh,pbs,xend port status;
2. if none of them are open, it gets the fping status;
3. for pingable nodes that are in the middle of deployment, it gets the deployment status;
4. for non-pingable nodes, it shows 'no ping'.
But when -m is specified and there are settings in the monsetting table, it displays the status of the applications specified in the monsetting table. When -u is specified, it saves the status information into the xCAT database. The node's pingable status and deployment status are saved in the nodelist.status column. The node's application status is saved in the nodelist.appstatus column.
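The default (no -m) decision order listed above can be sketched in Python as follows. This is an illustration only, not the nodestat implementation. The function name, its parameters, and the strings returned for open ports and for a pingable node that is not being deployed are assumptions; only 'no ping' comes from the description.

```python
def default_node_status(open_ports, pingable, deploy_status=None):
    """Pick the status to display for one node, following steps 1 to 4.

    open_ports:    names of the open ports among ssh, pbs and xend
    pingable:      result of the fping check (consulted only if no port is open)
    deploy_status: deployment status string if the node is being deployed
    """
    # Step 1: an open application port decides the status.
    if open_ports:
        return ",".join(open_ports)
    # Steps 2 and 4: no port is open, so fall back to the ping result.
    if not pingable:
        return "no ping"
    # Step 3: a pingable node in the middle of deployment.
    if deploy_status:
        return deploy_status
    # Pingable but not deploying; the label "ping" is an assumption.
    return "ping"
```

For example, `default_node_status([], False)` returns `'no ping'`.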