mirror of https://github.com/xcat2/xcat-core.git synced 2025-07-24 13:21:12 +00:00

Add probe docs, some docs' status is TODO

This commit is contained in:
XuWei
2016-09-19 22:31:35 -04:00
parent d104f531af
commit 82eba7e520
8 changed files with 203 additions and 0 deletions

@@ -16,6 +16,7 @@ Advanced Topics
mixed_cluster/index.rst
networks/index.rst
ports/xcat_ports.rst
probe/index.rst
raid/index.rst
restapi/index.rst
security/index.rst

@@ -0,0 +1,6 @@
detect_dhcpd
============
**detect_dhcpd** can be used to detect the DHCP server in a network for a specific MAC address.

**TODO**

@@ -0,0 +1,6 @@
discovery
=========
**discovery** can be used to probe the discovery process, including a pre-check of the required configuration and real-time monitoring of the discovery process.

**TODO**

@@ -0,0 +1,4 @@
image
=====
**TODO**

@@ -0,0 +1,25 @@
xCAT probe
==========
xCAT offers a **probe** tool to help customers use xCAT.

You can use ``xcatprobe -l`` to list all valid subcommands. The output looks like this ::

    osdeploy        Probe for OS provision process, realtime monitor of OS provision process.
    xcatmn          After xcat installation, use this command to check if xcat has been installed correctly and is
                    ready for use. Before using this command, install 'tftp', 'nslookup' and 'wget' commands.
    switch-macmap   To retrieve MAC address mapping for the specified switch, or all the switches defined in
                    'switches' table in xCAT db.
    ......

.. toctree::
   :maxdepth: 2

   xcatmn.rst
   detect_dhcpd.rst
   image.rst
   osdeploy.rst
   discovery.rst
   switch-macmap.rst

@@ -0,0 +1,100 @@
osdeploy
========
**osdeploy** can be used to probe the OS provisioning process, either by monitoring it in real time or by replaying its history.

For real-time monitoring, run this command before running ``rpower`` on the node. This includes commands that run ``rpower`` indirectly, e.g. ``rinstall`` and ``rnetboot``.

**Note**: Currently, hierarchical structure is not supported.

Usage
-----
::

    xcatprobe osdeploy -h
    xcatprobe osdeploy -n <node_range> [-V]
    xcatprobe osdeploy -n <node_range> -r <xxhxxm> [-V]

Options:

* **-n**: The range of nodes to monitor or whose log is to be replayed.
* **-r**: Replay the history log of OS provisioning. Specify how far back the probe should start. Supported time formats are ``xxhxxm``, ``xxh``, or ``xxm``. If no unit is specified, hours are used by default.
* **-t**: The maximum time in minutes to wait while monitoring. The default is 60.
* **-V**: Output more information for debugging.

``-r`` replays the history of OS provisioning. Without ``-r``, the command monitors in real time.

The command automatically runs a pre-check before real-time monitoring or history replay. If all node definitions are valid, monitoring or replay starts. Otherwise, the command exits and shows an error message.

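The time formats accepted by ``-r`` can be illustrated with a small parser. This is only a sketch of the rules stated above (``xxhxxm``, ``xxh``, ``xxm``, hours when no unit is given); it is not how ``xcatprobe`` itself parses the value, and the name ``parse_replay_window`` is hypothetical.

```python
import re
from datetime import timedelta

# Matches "xxhxxm", "xxh" or "xxm"; both parts are optional in the pattern,
# so an empty match is rejected explicitly below.
_REPLAY_RE = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?$")


def parse_replay_window(text):
    """Convert a ``-r`` style value into a timedelta (hypothetical helper)."""
    text = text.strip()
    if text.isdigit():
        # No unit given: the documentation says hours are the default.
        return timedelta(hours=int(text))
    match = _REPLAY_RE.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported replay time: {text!r}")
    hours, minutes = (int(part) if part else 0 for part in match.groups())
    return timedelta(hours=hours, minutes=minutes)


if __name__ == "__main__":
    for value in ("1h20m", "2h", "45m", "3"):
        print(value, "->", parse_replay_window(value))
```

A helper like this could be used to validate a value before passing it to ``xcatprobe osdeploy -n <node_range> -r <value>``.
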
Realtime monitor
----------------
To monitor OS provisioning in real time, open at least two terminal windows. In the first one, run the ``osdeploy`` command as below ::

    xcatprobe osdeploy -n cn1 [-V]

After the pre-check, the command waits for provisioning information and shows output as below ::

    # xcatprobe osdeploy -n c910f03c17k20
    The install NIC in current server is enp0s1 [INFO]
    All nodes which will be deployed are valid [ OK ]
    -------------------------------------------------------------
    Start capturing every message during OS provision process......
    -------------------------------------------------------------

Then start provisioning in another terminal window ::

    nodeset cn1 osimage=<osimage>
    rpower cn1 boot

When all nodes have completed provisioning, the command exits and prints a summary as below ::

    # xcatprobe osdeploy -n c910f03c17k20
    The install NIC in current server is enp0s1 [INFO]
    All nodes which will be deployed are valid [ OK ]
    -------------------------------------------------------------
    Start capturing every message during OS provision process......
    -------------------------------------------------------------
    [c910f03c17k20] Use command rinstall to reboot node c910f03c17k20
    [c910f03c17k20] Node status is changed to powering-on
    [c910f03c17k20] Receive DHCPDISCOVER via enp0s1
    [c910f03c17k20] Send DHCPOFFER on 10.3.17.20 back to 42:d0:0a:03:11:14 via enp0s1
    [c910f03c17k20] DHCPREQUEST for 10.3.17.20 (10.3.5.4) from 42:d0:0a:03:11:14 via enp0s1
    [c910f03c17k20] Send DHCPACK on 10.3.17.20 back to 42:d0:0a:03:11:14 via enp0s1
    [c910f03c17k20] Via TFTP download /boot/grub2/grub2-c910f03c17k20
    [c910f03c17k20] Via TFTP download /boot/grub2/powerpc-ieee1275/normal.mod
    ......
    [c910f03c17k20] Postscript: otherpkgs exited with code 0
    [c910f03c17k20] Node status is changed to booted
    [c910f03c17k20] done
    [c910f03c17k20] provision completed.(c910f03c17k20)
    [c910f03c17k20] provision completed [ OK ]
    All nodes specified to monitor, have finished OS provision process [ OK ]
    ==================conclusion_report=================
    All nodes provision successfully [ OK ]

If something goes wrong during provisioning, the command exits when the timeout expires or when the user presses ``Ctrl+C``. The maximum time can be set with ``-t`` as below ::

    xcatprobe osdeploy -n cn1 -t 30

Here the maximum time is set to 30 minutes.

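When several nodes are monitored at once, it can help to group the captured messages by node. The sketch below assumes the output was saved to text and that every per-node line starts with ``[nodename]`` as in the sample above. The helper names ``group_by_node`` and ``finished_nodes`` are hypothetical and not part of xCAT.

```python
import re
from collections import defaultdict

# A per-node line looks like "[c910f03c17k20] Node status is changed to booted".
NODE_LINE = re.compile(r"^\[(?P<node>[^\]\s]+)\]\s+(?P<message>.+)$")


def group_by_node(lines):
    """Return {node: [messages]} for every line that starts with "[node]"."""
    per_node = defaultdict(list)
    for line in lines:
        match = NODE_LINE.match(line.strip())
        if match:
            per_node[match.group("node")].append(match.group("message"))
    return dict(per_node)


def finished_nodes(per_node):
    """Nodes with a "provision completed" message (wording taken from the sample)."""
    return sorted(
        node
        for node, messages in per_node.items()
        if any(msg.startswith("provision completed") for msg in messages)
    )


if __name__ == "__main__":
    sample = [
        "All nodes which will be deployed are valid [ OK ]",
        "[c910f03c17k20] Node status is changed to powering-on",
        "[c910f03c17k20] Receive DHCPDISCOVER via enp0s1",
        "[c910f03c17k20] provision completed [ OK ]",
    ]
    grouped = group_by_node(sample)
    print(grouped)
    print(finished_nodes(grouped))
```

Lines that do not start with a bracketed node name, such as the pre-check results and the conclusion report, are ignored by this sketch.
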
Replay history
--------------
To replay the history of OS provisioning starting from 1 hour 20 minutes ago, use the command ::

    xcatprobe osdeploy -n cn1 -r 1h20m

The output will be as below ::

    # xcatprobe osdeploy -n c910f03c17k20 -r 1h20m
    The install NIC in current server is enp0s1 [INFO]
    All nodes which will be deployed are valid [ OK ]
    Start to scan logs which are later than *********, waiting for a while.............
    ==================conclusion_report=================
    All nodes provision successfully [ OK ]

@@ -0,0 +1,4 @@
switch-macmap
=============
**TODO**

@@ -0,0 +1,57 @@
xcatmn
======
**xcatmn** can be used to check whether xCAT has been installed correctly and is ready for use.

**Note**: Several check items (e.g. the TFTP, DNS, and HTTP services) need the ``tftp``, ``nslookup``, and ``wget`` commands. If one of these commands is not installed, the corresponding item is not checked and a warning message is given.

The command is as below ::

    xcatprobe xcatmn -i <install_nic> [-V]

* **-i**: [Required] Specify the network interface name of the provision network on the management node.
* **-V**: Output more information for debugging.

For example, run the command on the Management Node ::

    xcatprobe xcatmn -i eth0

**xcatmn** checks the xcatd processes, the xCAT configuration, and the xCAT services. If an item is ready for xCAT use, the result label is ``[ OK ]``. If an item is not ready and xCAT cannot be used, the result label is ``[FAIL]``. If an item is not ready but xCAT may still be usable, the result label is ``[WARN]``.

The output will be like this ::

    [MN]: Sub process 'xcatd: SSL listener' is running [ OK ]
    [MN]: Sub process 'xcatd: DB Access' is running [ OK ]
    [MN]: Sub process 'xcatd: UDP listener' is running [ OK ]
    [MN]: Sub process 'xcatd: install monitor' is running [ OK ]
    [MN]: Sub process 'xcatd: Discovery worker' is running [ OK ]
    [MN]: Sub process 'xcatd: Command log writer' is running [ OK ]
    [MN]: xcatd is listening on port 3001 [ OK ]
    [MN]: xcatd is listening on port 3002 [ OK ]
    [MN]: 'lsxcatd -a' works [ OK ]
    [MN]: The value of 'master' in 'site' table is an IP address [ OK ]
    [MN]: NIC eth0 exists on current server [ OK ]
    [MN]: Get IP address of NIC eth0 [ OK ]
    [MN]: The IP *.*.*.* of eth0 equals the value of 'master' in 'site' table [ OK ]
    [MN]: IP *.*.*.* of NIC eth0 is a static IP on current server [ OK ]
    [MN]: *.*.*.* belongs to one of networks defined in 'networks' table [ OK ]
    [MN]: There is domain definition in 'site' table [ OK ]
    [MN]: There is a configuration in 'passwd' table for 'system' for node provisioning [ OK ]
    [MN]: There is /install directory on current server [ OK ]
    [MN]: There is /tftpboot directory on current server [ OK ]
    [MN]: The free space of '/' is less than 12 G [ OK ]
    [MN]: SELinux is disabled on current server [ OK ]
    [MN]: Firewall is closed on current server [ OK ]
    [MN]: HTTP service is ready on *.*.*.* [ OK ]
    [MN]: TFTP service is ready on *.*.*.* [ OK ]
    [MN]: DNS server is ready on *.*.*.* [ OK ]
    [MN]: The size of /var/lib/dhcpd/dhcpd.leases is less than 100M [ OK ]
    [MN]: DHCP service is ready on *.*.*.* [ OK ]
    ======================do summary=====================
    [MN]: Check on MN PASS. [ OK ]

**[MN]** means the result comes from the check on the Management Node. When all items have been checked, a summary gives the conclusion ``PASS`` or ``FAILED``.

For hierarchical clusters, ``xcatmn`` checks the Service Nodes automatically. For Service Nodes, the output contains ``[SN:nodename]`` to distinguish the different Service Nodes.
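
The result labels make the output easy to post-process. The sketch below counts labels and collects the items that are not ``[ OK ]``. It assumes each result line has the form ``[MN]: <item> [label]`` or ``[SN:nodename]: <item> [label]`` as described above. The ``[FAIL]`` and ``[WARN]`` sample lines are invented for illustration, and the name ``summarize`` is hypothetical.

```python
import re
from collections import Counter

# "[MN]: xcatd is listening on port 3001 [ OK ]" or "[SN:sn01]: ... [FAIL]"
RESULT_LINE = re.compile(
    r"^\[(?P<source>MN|SN:[^\]]+)\]:\s*(?P<item>.*?)\s*"
    r"\[\s*(?P<label>OK|FAIL|WARN)\s*\]$"
)


def summarize(lines):
    """Return (label counts, list of (source, label, item) for non-OK items)."""
    counts = Counter()
    problems = []
    for line in lines:
        match = RESULT_LINE.match(line.strip())
        if not match:
            continue  # separators such as "=====do summary=====" are skipped
        label = match.group("label")
        counts[label] += 1
        if label != "OK":
            problems.append((match.group("source"), label, match.group("item")))
    return counts, problems


if __name__ == "__main__":
    sample = [
        "[MN]: xcatd is listening on port 3001 [ OK ]",
        "[MN]: Firewall is closed on current server [WARN]",      # invented line
        "[SN:sn01]: TFTP service is ready on 10.0.0.2 [FAIL]",    # invented line
        "======================do summary=====================",
    ]
    counts, problems = summarize(sample)
    print(dict(counts))
    for source, label, item in problems:
        print(source, label, item)
```

Note that the final summary line, e.g. ``[MN]: Check on MN PASS. [ OK ]``, has the same shape as a result line, so this sketch counts it as one more ``OK`` item.
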