mirror of https://github.com/xcat2/xcat-core.git synced 2025-08-01 00:57:37 +00:00

Merge pull request #1861 from xuweibj/probe_doc

Changes look good to me. Merging.
This commit is contained in:
Mark Gurevich
2016-10-17 09:11:28 -04:00
committed by GitHub
10 changed files with 203 additions and 0 deletions


@@ -16,6 +16,7 @@ Advanced Topics
mixed_cluster/index.rst
networks/index.rst
ports/xcat_ports.rst
probe/index.rst
raid/index.rst
restapi/index.rst
security/index.rst


@@ -0,0 +1,5 @@
detect_dhcpd
============
**detect_dhcpd** can be used to detect the DHCP server in a network for a specific MAC address.
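Conceptually, a DHCP server is detected by broadcasting a DHCPDISCOVER that carries the MAC address in question and listening for DHCPOFFER replies. As an illustration of the packet involved only (not xCAT's implementation), a minimal DHCPDISCOVER payload can be built with the Python standard library:

```python
import struct

def build_dhcp_discover(mac, xid=0x12345678):
    """Build a minimal BOOTP/DHCP DISCOVER payload for the given MAC.

    mac is a string such as '42:d0:0a:03:11:14'. Actually sending the
    packet (UDP broadcast to port 67) and listening for DHCPOFFER
    replies is left out, since it requires broadcast sockets and root
    privileges. Illustration only, not xCAT's implementation.
    """
    chaddr = bytes(int(b, 16) for b in mac.split(':'))
    pkt = struct.pack('!BBBBIHH4s4s4s4s',
                      1,            # op: BOOTREQUEST
                      1,            # htype: Ethernet
                      6,            # hlen: MAC address length
                      0,            # hops
                      xid,          # transaction id
                      0,            # secs
                      0x8000,       # flags: ask server to broadcast reply
                      b'\x00' * 4,  # ciaddr
                      b'\x00' * 4,  # yiaddr
                      b'\x00' * 4,  # siaddr
                      b'\x00' * 4)  # giaddr
    pkt += chaddr + b'\x00' * (16 - len(chaddr))  # chaddr (padded to 16)
    pkt += b'\x00' * 192                          # sname + file (unused)
    pkt += b'\x63\x82\x53\x63'                    # DHCP magic cookie
    pkt += b'\x35\x01\x01'                        # option 53: DHCPDISCOVER
    pkt += b'\xff'                                # end option
    return pkt
```

A probe would broadcast this payload and report any server that answers with a DHCPOFFER.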


@@ -0,0 +1,5 @@
discovery
=========
**discovery** can be used to probe the discovery process, including a pre-check of the required configuration and real-time monitoring of the discovery process.


@@ -0,0 +1,3 @@
image
=====


@@ -0,0 +1,27 @@
xCAT probe
==========
To help identify common issues with xCAT, a new tool suite, **xCAT probe**, is now available.

You can use ``xcatprobe -l`` to list all valid subcommands. The output will be similar to: ::

    # xcatprobe -l
    osdeploy         Probe operating system provision process. Supports two modes - 'Realtime monitor' and 'Replay history'.
    xcatmn           After xcat installation, use this command to check if xcat has been installed correctly and is
                     ready for use. Before using this command, install 'tftp', 'nslookup' and 'wget' commands.
    switch-macmap    To retrieve MAC address mapping for the specified switch, or all the switches defined in
                     'switches' table in xCAT db.
    ......
.. toctree::
   :maxdepth: 2

   xcatmn.rst
   detect_dhcpd.rst
   image.rst
   osdeploy.rst
   discovery.rst
   switch-macmap.rst
   nodecheck.rst
   osimagecheck.rst


@@ -0,0 +1,2 @@
nodecheck
=========


@@ -0,0 +1,99 @@
osdeploy
========
**osdeploy** probes the operating system provision process. It supports two modes - 'Realtime monitor' and 'Replay history'.

* Realtime monitor: The default mode. The probe monitors the provision state of the node; trigger it before rebooting the target node to do provisioning.
* Replay history: Used after provisioning has finished, to probe the previously completed provisioning.

**Note**: Currently, hierarchical structure is not supported.
Usage
-----

::

    xcatprobe osdeploy -h
    xcatprobe osdeploy -n <node_range> [-t <max_waiting_time>] [-V]
    xcatprobe osdeploy -n <node_range> -r <xxhxxm> [-V]
Options:

* **-n**: The range of nodes to be monitored or replayed.
* **-r**: Trigger 'Replay history' mode, followed by the duration to roll back. Units are 'h' (hour) or 'm' (minute); if no unit is specified, hours are used by default.
* **-t**: The maximum time to wait when monitoring, in minutes; the default is 60.
* **-V**: Output more information.

With ``-r``, the probe replays the history of OS provisioning; without ``-r``, it performs real-time monitoring.
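The ``<xxhxxm>`` rollback duration can combine hours and minutes (e.g. ``1h20m``), with a bare number treated as hours. A minimal Python sketch of parsing this format (an illustration of the documented syntax, not xCAT's implementation):

```python
import re

def parse_rollback(spec):
    """Parse a rollback duration such as '1h20m', '90m', or '2' into minutes.

    A bare number with no unit is treated as hours, matching the
    documented default. Raises ValueError on anything else.
    Illustration only, not xcatprobe's actual parser.
    """
    m = re.fullmatch(r'(?:(\d+)h)?(?:(\d+)m)?', spec)
    if m and (m.group(1) or m.group(2)):
        hours = int(m.group(1) or 0)
        minutes = int(m.group(2) or 0)
        return hours * 60 + minutes
    if spec.isdigit():  # no unit given: interpret as hours
        return int(spec) * 60
    raise ValueError("bad duration: %r" % spec)
```

For example, ``parse_rollback('1h20m')`` yields 80 minutes, matching the ``-r 1h20m`` example used below.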
Realtime monitor
----------------
To monitor OS provisioning in real time, open at least two terminal windows. In the first window, run the ``osdeploy`` probe: ::

    xcatprobe osdeploy -n cn1 [-V]

After some pre-checks, the probe will wait for provisioning information, similar to the output below: ::
    # xcatprobe osdeploy -n c910f03c17k20
    The install NIC in current server is enp0s1 [INFO]
    All nodes which will be deployed are valid [ OK ]
    -------------------------------------------------------------
    Start capturing every message during OS provision process......
    -------------------------------------------------------------
In a second terminal window, start the provisioning: ::

    nodeset cn1 osimage=<osimage>
    rpower cn1 boot
When all the nodes complete provisioning, the probe will exit and display output similar to: ::

    # xcatprobe osdeploy -n c910f03c17k20
    The install NIC in current server is enp0s1 [INFO]
    All nodes which will be deployed are valid [ OK ]
    -------------------------------------------------------------
    Start capturing every message during OS provision process......
    -------------------------------------------------------------
    [c910f03c17k20] Use command rinstall to reboot node c910f03c17k20
    [c910f03c17k20] Node status is changed to powering-on
    [c910f03c17k20] Receive DHCPDISCOVER via enp0s1
    [c910f03c17k20] Send DHCPOFFER on 10.3.17.20 back to 42:d0:0a:03:11:14 via enp0s1
    [c910f03c17k20] DHCPREQUEST for 10.3.17.20 (10.3.5.4) from 42:d0:0a:03:11:14 via enp0s1
    [c910f03c17k20] Send DHCPACK on 10.3.17.20 back to 42:d0:0a:03:11:14 via enp0s1
    [c910f03c17k20] Via TFTP download /boot/grub2/grub2-c910f03c17k20
    [c910f03c17k20] Via TFTP download /boot/grub2/powerpc-ieee1275/normal.mod
    ......
    [c910f03c17k20] Postscript: otherpkgs exited with code 0
    [c910f03c17k20] Node status is changed to booted
    [c910f03c17k20] done
    [c910f03c17k20] provision completed.(c910f03c17k20)
    [c910f03c17k20] provision completed [ OK ]
    All nodes specified to monitor, have finished OS provision process [ OK ]
    ==================osdeploy_probe_report=================
    All nodes provisioned successfully [ OK ]
If something goes wrong during provisioning, the probe will exit when the timeout is reached or when the user presses ``Ctrl+C``. The maximum wait time can be set with ``-t`` as below: ::

    xcatprobe osdeploy -n cn1 -t 30
Replay history
--------------
To replay the history of OS provisioning from 1 hour 20 minutes ago, use: ::

    xcatprobe osdeploy -n cn1 -r 1h20m
Output will be similar to: ::

    # xcatprobe osdeploy -n c910f03c17k20
    The install NIC in current server is enp0s1 [INFO]
    All nodes which will be deployed are valid [ OK ]
    Start to scan logs which are later than *********, waiting for a while.............
    ==================osdeploy_probe_report=================
    All nodes provisioned successfully [ OK ]


@@ -0,0 +1,2 @@
osimagecheck
============


@@ -0,0 +1,3 @@
switch-macmap
=============


@@ -0,0 +1,56 @@
xcatmn
======
**xcatmn** can be used to check if xCAT has been installed correctly and is ready for use.

**Note**: For several check items (e.g. the tftp, dns, and http services), the 'tftp', 'nslookup' and 'wget' commands are needed. If they are not installed, a warning message will be displayed.
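The prerequisite commands mentioned in the note can be verified up front with a short shell loop (an illustration, not part of ``xcatprobe`` itself):

```shell
# Illustration only: warn about missing prerequisite commands before
# running 'xcatprobe xcatmn'. Not part of xcatprobe itself.
check_prereqs() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "warning: '$cmd' not found; related xcatmn checks will be skipped"
        fi
    done
}

check_prereqs tftp nslookup wget
```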
The command usage is: ::

    xcatprobe xcatmn -i <install_nic> [-V]

* **-i**: [Required] Specify the network interface name of the provision network on the management node.
* **-V**: Output more information for debugging.
For example, run the command on the Management Node: ::

    xcatprobe xcatmn -i eth0
Output will be similar to: ::
    # xcatprobe xcatmn -i eth0
    [MN]: Sub process 'xcatd: SSL listener' is running [ OK ]
    [MN]: Sub process 'xcatd: DB Access' is running [ OK ]
    [MN]: Sub process 'xcatd: UDP listener' is running [ OK ]
    [MN]: Sub process 'xcatd: install monitor' is running [ OK ]
    [MN]: Sub process 'xcatd: Discovery worker' is running [ OK ]
    [MN]: Sub process 'xcatd: Command log writer' is running [ OK ]
    [MN]: xcatd is listening on port 3001 [ OK ]
    [MN]: xcatd is listening on port 3002 [ OK ]
    [MN]: 'lsxcatd -a' works [ OK ]
    [MN]: The value of 'master' in 'site' table is an IP address [ OK ]
    [MN]: NIC enp0s1 exists on current server [ OK ]
    [MN]: Get IP address of NIC eth0 [ OK ]
    [MN]: The IP *.*.*.* of eth0 equals the value of 'master' in 'site' table [ OK ]
    [MN]: IP *.*.*.* of NIC eth0 is a static IP on current server [ OK ]
    [MN]: *.*.*.* belongs to one of networks defined in 'networks' table [ OK ]
    [MN]: There is domain definition in 'site' table [ OK ]
    [MN]: There is a configuration in 'passwd' table for 'system' for node provisioning [ OK ]
    [MN]: There is /install directory on current server [ OK ]
    [MN]: There is /tftpboot directory on current server [ OK ]
    [MN]: The free space of '/' is less than 12 G [ OK ]
    [MN]: SELinux is disabled on current server [ OK ]
    [MN]: Firewall is closed on current server [ OK ]
    [MN]: HTTP service is ready on *.*.*.* [ OK ]
    [MN]: TFTP service is ready on *.*.*.* [ OK ]
    [MN]: DNS server is ready on *.*.*.* [ OK ]
    [MN]: The size of /var/lib/dhcpd/dhcpd.leases is less than 100M [ OK ]
    [MN]: DHCP service is ready on *.*.*.* [ OK ]
    ======================do summary=====================
    [MN]: Check on MN PASS. [ OK ]
**[MN]** means that the verification is performed on the Management Node. An overall status of ``PASS`` or ``FAILED`` will be displayed after all items are verified.
Service Nodes are checked automatically for hierarchical clusters.
For Service Nodes, the output will contain ``[SN:nodename]`` to distinguish different Service Nodes.
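Two of the checks shown above, xcatd listening on its ports and free space on ``/``, can be sketched with the Python standard library. This is an illustration of what such checks do, not the probe's actual code; the port numbers and the 12 G threshold are taken from the sample output above:

```python
import shutil
import socket

def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def root_free_gib():
    """Free space of '/' in GiB."""
    return shutil.disk_usage('/').free / (1024 ** 3)

def check_mn(host='127.0.0.1', ports=(3001, 3002), min_free_gib=12):
    """Print [ OK ]/[FAIL] lines in the probe's style.

    Illustration only: the real xcatmn probe performs many more checks.
    """
    for port in ports:
        state = '[ OK ]' if port_is_listening(host, port) else '[FAIL]'
        print("[MN]: xcatd is listening on port %d %s" % (port, state))
    state = '[ OK ]' if root_free_gib() >= min_free_gib else '[FAIL]'
    print("[MN]: Free space of '/' is at least %d G %s" % (min_free_gib, state))
```

Run on a management node, ``check_mn()`` would print one status line per check, mirroring the report format shown above.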