The mini-design of post.xcat restructure
Background
The original post.xcat file is not easy to debug. We should distinguish critical errors from plain errors, and when an error happens we should record the detailed information in log files on both the MN and the node.
In addition, there is no big difference between the post.ubuntu and post.xcat scripts, so we will merge post.ubuntu into post.xcat to keep post.xcat consistent for redhat, sles, and ubuntu.
This mini-design supports redhat6.7, redhat7, sles11, sles12, ubuntu14, ubuntu15, and so on.
For redhat
On the node, with xcatdebugmode off, only critical errors are written to the log (/var/log/xcat/xcat.log) when they happen; with xcatdebugmode on, plain errors and critical errors are written when they happen, and the running process is logged too.
On the MN, the error message is recorded in the related file.
For sles
On the node, with xcatdebugmode off, critical errors are written to the log (/var/log/xcat/xcat.log) when they happen; with xcatdebugmode on, plain errors and critical errors are written when they happen; the running process is logged whether debug mode is on or off.
On the MN, the error message is recorded in the related file.
For ubuntu
On the node, with xcatdebugmode off, only critical errors are written to the log (/var/log/xcat/xcat.log) when they happen; with xcatdebugmode on, plain errors and critical errors are written when they happen, and the running process is logged too.
On the MN, the error message is recorded in the related file.
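The redhat and ubuntu rule above can be sketched as a small logging helper. This is only an illustration under stated assumptions: the names `XCATDEBUGMODE`, `XCAT_LOG`, and `log_msg`, the value `1` meaning "on", and the timestamp format are all hypothetical and not taken from post.xcat. For sles, the `info` level would be treated like `critical` (always recorded).

```shell
#!/bin/sh
# Hypothetical names: XCATDEBUGMODE, XCAT_LOG and log_msg are illustrative only.
XCATDEBUGMODE=${XCATDEBUGMODE:-0}            # assumed: 1 = on, anything else = off
XCAT_LOG=${XCAT_LOG:-/var/log/xcat/xcat.log}

# log_msg LEVEL MESSAGE...
# LEVEL is one of: critical, plain, info (info = running-process output).
log_msg() {
    level=$1
    shift
    case "$level" in
        critical)
            ;;                                # always recorded
        *)
            # plain errors and running-process output are dropped when debug mode is off
            [ "$XCATDEBUGMODE" = "1" ] || return 0
            ;;
    esac
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$level] $*" >> "$XCAT_LOG"
}
```

With debug mode off, `log_msg plain "..."` writes nothing, while `log_msg critical "..."` is always appended to the log file.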
critical error
Solution: write the error information to the node (/var/log/xcat/xcat.log) and the MN (/var/log/xcat/), halt the system, and update the status of this node on the MN to failed.
1. openssl is not installed on the system
2. Failed to download the postscripts
   We use the wget command to download the postscripts from http://$i$INSTALLDIR/postscripts/ on the MN. The download may fail for several reasons:
   1) The wget command is missing
   2) The network is unreachable
3. getpostscript.awk does not exist
   First we try to download the mypostscript.$NODE file from the MN and rename it to mypostscript if the MN has it. If the MN does not have this file, we try to create the mypostscript file using getpostscript.awk. If getpostscript.awk is not in the /xcatpost folder, this error occurs.
4. Failed to create mypostscript
   The mypostscript file is used to generate mypostscript.post and other files. If it cannot be generated by either of the two methods above, this error occurs.
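The first few critical conditions are simple existence checks. A minimal sketch, assuming a hypothetical helper named `check_prereqs` (not an xCAT function) that prints the message format used later in this design and leaves the actual halt to the caller:

```shell
#!/bin/sh
# check_prereqs PATH...
# Hypothetical helper: verify that each given path is an executable file.
# Prints "<path> does not exist, halt ..." for the first missing one and
# returns non-zero so the caller can decide how to halt.
check_prereqs() {
    for tool in "$@"; do
        if [ ! -x "$tool" ]; then
            echo "$tool does not exist, halt ..."
            return 1
        fi
    done
    return 0
}
```

A caller could run `check_prereqs /usr/bin/openssl /usr/bin/wget` before attempting any download.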
plain error
Solution: write the error information to the node (/var/log/xcat/xcat.log) and the MN (/var/log/xcat), but do not halt the system.
1. Failed to download the precreated mypostscript file
2. Failed to create the mypostscript.post file
3. Failed to create the xcatpostinit1 file
4. Failed to create the xcatinstallpost file
5. Failed to create the xcatdsklspost file
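The difference between the two error classes can be sketched as a pair of handlers. Everything named here is an assumption for illustration: `plain_error`, `critical_error`, `report_to_mn`, `XCAT_LOG`, and `HALT_CMD` are hypothetical, and how the message and the failed status actually reach the MN is not specified in this design.

```shell
#!/bin/sh
# Hypothetical helpers; report_to_mn and HALT_CMD are stand-ins, not xCAT functions.
XCAT_LOG=${XCAT_LOG:-/var/log/xcat/xcat.log}
HALT_CMD=${HALT_CMD:-halt}

# Placeholder: the design records the message on the MN (and, for critical
# errors, marks the node as failed); the transport is left open here.
report_to_mn() {
    :
}

# Plain error: record on node and MN, then let the script continue.
plain_error() {
    echo "$*" >> "$XCAT_LOG"
    report_to_mn "$*"
}

# Critical error: record on node and MN, then halt the system.
critical_error() {
    echo "$*, halt ..." >> "$XCAT_LOG"
    report_to_mn "$*"
    $HALT_CMD
}
```

Keeping the halt command in a variable makes it possible to exercise the critical path on a test machine, for example with `HALT_CMD=true`.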
Code Logic and Process
1. Export environment variables such as MASTER_IP, NODESTATUS, and TFTPDIR.
2. Include the xCAT library to use some of its functions.
3. Set the values of INSTALLDIR and TFTPDIR if they have not been set.
4. Sleep for a while, then download the postscripts from the management node and write the related information to the xcatinfo file.
5. Before downloading the postscripts from the management node, check whether openssl and wget are installed; if not, the system should **halt**.
6. Use the wget command to download the postscripts from the MN and check whether the download succeeded; if not, the system should **halt**.
7. Once the postscripts have been downloaded successfully, create the mypostscript file.
   1. First try to download the mypostscript.$NODE file, which is created when the precreatemypostscripts attribute is set to 1. If this file exists, rename it to mypostscript.
   2. If there is no mypostscript.$NODE file, generate the mypostscript file with getpostscript.awk. If getpostscript.awk does not exist, the system should **halt**.
   3. Use a while loop to generate mypostscript with getpostscript.awk, retrying in case of failure.
8. Use the sed command to add run_ps before the commands in the mypostscript file. Output the run_ps subroutine and append the mypostscript file content to recreate the mypostscript file. If this file can't be created, the system should **halt**.
9. Now that we have the mypostscript file, use it to create the mypostscript.post file, using the sed command to delete the items between postscripts-start-here and postscripts-end-here.
10. Create the post init file (xcatpostinit1).
11. Create the xcatinstallpost file.
12. Create the dskls post file (xcatdsklspost).
13. Finally, create the mypostscript file, using the sed command to delete the items between postbootscripts-start-here and postbootscripts-end-here.
14. Update the node status using updateflag.awk.
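Steps 9 and 13 both rely on sed deleting a marker-delimited range. The sketch below shows that technique on a throwaway file. The helper name `strip_section`, the sample file content, and the exact form of the marker lines (written here as comments) are assumptions; only the four marker names come from this design. The real script recreates mypostscript itself in step 13, whereas this sketch writes to a separate file so the input stays intact.

```shell
#!/bin/sh
# Hypothetical helper: delete every line from the start marker through the end marker.
# $1 = input file, $2 = start marker, $3 = end marker, $4 = output file
strip_section() {
    sed "/$2/,/$3/d" "$1" > "$4"
}

# Illustrative input; the real mypostscript layout may differ.
demo_dir=$(mktemp -d)
cat > "$demo_dir/mypostscript" <<'EOF'
export NODE=node1
# postscripts-start-here
syslog
remoteshell
# postscripts-end-here
# postbootscripts-start-here
otherpkgs
# postbootscripts-end-here
EOF

# Step 9: mypostscript.post drops the postscripts section and keeps the postbootscripts.
strip_section "$demo_dir/mypostscript" \
    "postscripts-start-here" "postscripts-end-here" \
    "$demo_dir/mypostscript.post"

# Step 13: the install-time copy drops the postbootscripts section instead.
strip_section "$demo_dir/mypostscript" \
    "postbootscripts-start-here" "postbootscripts-end-here" \
    "$demo_dir/mypostscript.install"
```

Because "postscripts" is not a substring of "postbootscripts", the two marker pairs do not match each other's lines.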
Planning Outputs
When xcatdebugmode is on, the log information will be saved.
1. The system sleeps for a while to get ready. The output looks like:
   sleep 16
2. Before downloading the postscripts from the management node, check whether openssl is installed. If not, the output looks like:
   /usr/bin/openssl does not exist, halt ...
3. Generate the xcatinfo file. Output:
   /opt/xcat/xcatinfo generated
4. Download the postscripts from the management node
   1. Show this message as a reminder that we are going to download the postscripts:
      trying to download postscripts from http://$MASTER_IP$INSTALLDIR/postscripts/
   2. If the system has no wget command, we can't download. Output:
      /usr/bin/wget does not exist, halt ...
   3. Download the postscripts from the management node.
      1. If the postscripts are downloaded successfully, the output looks like:
         postscripts downloaded successfully
      2. If we can't download the postscripts, the output looks like:
         failed to download postscripts from http://$MASTER_IP$INSTALLDIR/postscripts/, halt ...
5. Generate the mypostscript file
   1. From the precreated mypostscript file
      1. Show this message as a reminder that we are going to download the precreated mypostscript file:
         trying to download precreated mypostscript file http://$MASTER_IP$TFTPDIR/mypostscripts/mypostscript.$NODE
      2. If the precreated mypostscript file is downloaded successfully, the output looks like:
         precreated mypostscript downloaded successfully
   2. With getpostscript.awk
      1. If we can't download the precreated mypostscript, we try to generate the mypostscript file using getpostscript.awk. Show this message as a reminder that we are going to generate it:
         failed to download precreated mypostscript, trying to generate with getpostscript.awk
      2. If the getpostscript.awk file doesn't exist, the output looks like:
         /xcatpost/getpostscript.awk does not exist, halt ...
   3. If the file can't be generated by either of these two methods, the output looks like:
      generate mypostscript file failure, halt ...
   4. If the file is generated successfully, output:
      generate mypostscript file successfully
6. Generate mypostscript.post
   1. If successfully generated, output:
      /xcatpost/mypostscript.post generated
   2. If failed to generate, output:
      failed to generate /xcatpost/mypostscript.post
7. Generate xcatpostinit1
   1. If successfully generated, output:
      /etc/init.d/xcatpostinit1 generated
   2. If failed to generate, output:
      failed to generate /etc/init.d/xcatpostinit1
   3. Enable xcatpostinit1. Output (for redhat and sles):
      service xcatpostinit1 enabled
8. Generate xcatinstallpost
   1. If successfully generated, output:
      /opt/xcat/xcatinstallpost generated
   2. If failed to generate, output:
      failed to generate /opt/xcat/xcatinstallpost
9. Generate xcatdsklspost
   1. If successfully generated, output:
      /opt/xcat/xcatdsklspost generated
   2. If failed to generate, output:
      failed to generate /opt/xcat/xcatdsklspost
10. Run mypostscript
    1. Output this information before running mypostscript:
       running mypostscript
    2. Output this information after running mypostscript:
       mypostscript returned
11. Show this message as a reminder that grub has been updated (for redhat and sles):
    /boot/grub/grub.conf updated
12. Report the installation status:
    finished node installation, reporting status...
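The messages for item 5 can be tied to the control flow with a sketch like the following. It is not the post.xcat implementation: `get_mypostscript`, `fetch_precreated`, `run_getpostscript_awk`, `XCATPOST`, `MAX_TRIES`, and `RETRY_DELAY` are hypothetical names, the retry limit is an assumption (this design only says a while loop is used), and the two stand-in functions fail by default so that the flow can be exercised without a network or the real awk script.

```shell
#!/bin/sh
XCATPOST=${XCATPOST:-/xcatpost}
MAX_TRIES=${MAX_TRIES:-3}        # assumed retry limit; the design does not state one
RETRY_DELAY=${RETRY_DELAY:-1}    # assumed pause between attempts, in seconds

# Stand-ins for the real wget download and getpostscript.awk invocation.
# Replace them with the actual commands; here they simply report failure.
fetch_precreated() { return 1; }
run_getpostscript_awk() { return 1; }

# Returns 0 when mypostscript was obtained, 2 when a critical error occurred.
get_mypostscript() {
    echo "trying to download precreated mypostscript file http://${MASTER_IP:-}${TFTPDIR:-}/mypostscripts/mypostscript.${NODE:-}"
    if fetch_precreated; then
        echo "precreated mypostscript downloaded successfully"
        return 0
    fi
    echo "failed to download precreated mypostscript, trying to generate with getpostscript.awk"
    if [ ! -f "$XCATPOST/getpostscript.awk" ]; then
        echo "$XCATPOST/getpostscript.awk does not exist, halt ..."
        return 2
    fi
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        if run_getpostscript_awk; then
            echo "generate mypostscript file successfully"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$RETRY_DELAY"
    done
    echo "generate mypostscript file failure, halt ..."
    return 2
}
```

The caller would treat a return value of 2 as a critical error and halt, which keeps the halt itself out of the function and makes the flow easy to test.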