diff --git a/docs/source/advanced/index.rst b/docs/source/advanced/index.rst index dacf235d0..7193f3b71 100755 --- a/docs/source/advanced/index.rst +++ b/docs/source/advanced/index.rst @@ -16,6 +16,7 @@ Advanced Topics mixed_cluster/index.rst networks/index.rst ports/xcat_ports.rst + probe/index.rst raid/index.rst restapi/index.rst security/index.rst diff --git a/docs/source/advanced/probe/detect_dhcpd.rst b/docs/source/advanced/probe/detect_dhcpd.rst new file mode 100644 index 000000000..fe7449497 --- /dev/null +++ b/docs/source/advanced/probe/detect_dhcpd.rst @@ -0,0 +1,5 @@ +detect_dhcpd +============ + +**detect_dhcpd** can be used to detect the DHCP server in a network for a specific MAC address. + diff --git a/docs/source/advanced/probe/discovery.rst b/docs/source/advanced/probe/discovery.rst new file mode 100644 index 000000000..611dba27e --- /dev/null +++ b/docs/source/advanced/probe/discovery.rst @@ -0,0 +1,5 @@ +discovery +========= + +**discovery** can be used to probe the discovery process, including a pre-check of the required configuration and realtime monitoring of the discovery process. + diff --git a/docs/source/advanced/probe/image.rst b/docs/source/advanced/probe/image.rst new file mode 100644 index 000000000..31d20dbfa --- /dev/null +++ b/docs/source/advanced/probe/image.rst @@ -0,0 +1,3 @@ +image +===== + diff --git a/docs/source/advanced/probe/index.rst b/docs/source/advanced/probe/index.rst new file mode 100644 index 000000000..4e6932de7 --- /dev/null +++ b/docs/source/advanced/probe/index.rst @@ -0,0 +1,27 @@ +xCAT probe +========== + +To help identify some of the common issues with xCAT, a new tool suite is now available: **xCAT probe**. + +Use ``xcatprobe -l`` to list all valid subcommands. The output will be similar to the following :: + + # xcatprobe -l + osdeploy Probe operating system provision process. Supports two modes - 'Realtime monitor' and 'Replay history'. 
+ xcatmn After xcat installation, use this command to check if xcat has been installed correctly and is + ready for use. Before using this command, install 'tftp', 'nslookup' and 'wget' commands. + switch-macmap To retrieve MAC address mapping for the specified switch, or all the switches defined in + 'switches' table in xCAT db. + ...... + +.. toctree:: + :maxdepth: 2 + + xcatmn.rst + detect_dhcpd.rst + image.rst + osdeploy.rst + discovery.rst + switch-macmap.rst + nodecheck.rst + osimagecheck.rst + diff --git a/docs/source/advanced/probe/nodecheck.rst b/docs/source/advanced/probe/nodecheck.rst new file mode 100644 index 000000000..479846e75 --- /dev/null +++ b/docs/source/advanced/probe/nodecheck.rst @@ -0,0 +1,2 @@ +nodecheck +========= diff --git a/docs/source/advanced/probe/osdeploy.rst b/docs/source/advanced/probe/osdeploy.rst new file mode 100644 index 000000000..546d33198 --- /dev/null +++ b/docs/source/advanced/probe/osdeploy.rst @@ -0,0 +1,99 @@ +osdeploy +======== + +**osdeploy** probes the operating system provision process. It supports two modes: 'Realtime monitor' and 'Replay history'. + +Realtime monitor: This is the default mode. The tool will monitor the provision state of the node. Trigger 'Realtime monitor' before rebooting the target node to do provisioning. + +Replay history: Used after provisioning is finished to probe the previously completed provisioning. + +**Note**: Currently, hierarchical structure is not supported. + +Usage +----- + +:: + + xcatprobe osdeploy -h + xcatprobe osdeploy -n [-t ] [-V] + xcatprobe osdeploy -n -r [-V] + +Options: + +* **-n**: The range of nodes to be monitored or replayed. +* **-r**: Trigger 'Replay history' mode, followed by the duration to roll back. Units are 'h' (hour) or 'm' (minute). If no unit is specified, hours are used by default. +* **-t**: The maximum time to wait when monitoring, in minutes. The default is 60. +* **-V**: Output more information. 
+ +``-r`` means replay the history of OS provisioning; without ``-r``, the probe does realtime monitoring. + +Realtime monitor +---------------- + +To monitor OS provisioning in real time, open at least 2 terminal windows. Use one to run the ``osdeploy`` probe: :: + + xcatprobe osdeploy -n cn1 [-V] + +After some pre-checks, the probe will wait for provisioning information, similar to the output below: :: + + # xcatprobe osdeploy -n c910f03c17k20 + The install NIC in current server is enp0s1 [INFO] + All nodes which will be deployed are valid [ OK ] + ------------------------------------------------------------- + Start capturing every message during OS provision process...... + ------------------------------------------------------------- + +Open a second terminal window to run provisioning: :: + + nodeset cn1 osimage= + rpower cn1 boot + +When all the nodes complete provisioning, the probe will exit and display output similar to: :: + + # xcatprobe osdeploy -n c910f03c17k20 + The install NIC in current server is enp0s1 [INFO] + All nodes which will be deployed are valid [ OK ] + ------------------------------------------------------------- + Start capturing every message during OS provision process...... + ------------------------------------------------------------- + + [c910f03c17k20] Use command rinstall to reboot node c910f03c17k20 + [c910f03c17k20] Node status is changed to powering-on + [c910f03c17k20] Receive DHCPDISCOVER via enp0s1 + [c910f03c17k20] Send DHCPOFFER on 10.3.17.20 back to 42:d0:0a:03:11:14 via enp0s1 + [c910f03c17k20] DHCPREQUEST for 10.3.17.20 (10.3.5.4) from 42:d0:0a:03:11:14 via enp0s1 + [c910f03c17k20] Send DHCPACK on 10.3.17.20 back to 42:d0:0a:03:11:14 via enp0s1 + [c910f03c17k20] Via TFTP download /boot/grub2/grub2-c910f03c17k20 + [c910f03c17k20] Via TFTP download /boot/grub2/powerpc-ieee1275/normal.mod + ...... 
+ [c910f03c17k20] Postscript: otherpkgs exited with code 0 + [c910f03c17k20] Node status is changed to booted + [c910f03c17k20] done + [c910f03c17k20] provision completed.(c910f03c17k20) + [c910f03c17k20] provision completed [ OK ] + All nodes specified to monitor, have finished OS provision process [ OK ] + ==================osdeploy_probe_report================= + All nodes provisioned successfully [ OK ] + + +If something goes wrong during provisioning, the probe will exit when the timeout is reached or when the user presses ``Ctrl+C``. The maximum time can be set by using ``-t`` as below (default 30 minutes) :: + + + xcatprobe osdeploy -n cn1 -t 30 + +Replay history +-------------- + +To replay the history of OS provisioning from 1 hour 20 minutes ago, use the following command :: + + xcatprobe osdeploy -n cn1 -r 1h20m + +Output will be similar to: :: + + # xcatprobe osdeploy -n c910f03c17k20 + The install NIC in current server is enp0s1 [INFO] + All nodes which will be deployed are valid [ OK ] + Start to scan logs which are later than *********, waiting for a while............. 
+ ==================osdeploy_probe_report================= + All nodes provisioned successfully [ OK ] + diff --git a/docs/source/advanced/probe/osimagecheck.rst b/docs/source/advanced/probe/osimagecheck.rst new file mode 100644 index 000000000..9bf8d6c81 --- /dev/null +++ b/docs/source/advanced/probe/osimagecheck.rst @@ -0,0 +1,2 @@ +osimagecheck +============ diff --git a/docs/source/advanced/probe/switch-macmap.rst b/docs/source/advanced/probe/switch-macmap.rst new file mode 100644 index 000000000..bf600a239 --- /dev/null +++ b/docs/source/advanced/probe/switch-macmap.rst @@ -0,0 +1,3 @@ +switch-macmap +============= + diff --git a/docs/source/advanced/probe/xcatmn.rst b/docs/source/advanced/probe/xcatmn.rst new file mode 100644 index 000000000..c692e6a98 --- /dev/null +++ b/docs/source/advanced/probe/xcatmn.rst @@ -0,0 +1,56 @@ +xcatmn +====== + +**xcatmn** can be used to check if xCAT has been installed correctly and is ready for use. + +**Note**: For several check items (e.g. tftp service, dns service, http service), the 'tftp', 'nslookup' and 'wget' commands are needed. If they are not installed, a warning message will be displayed. + +The command is as below :: + + xcatprobe xcatmn -i [-V] + +* **-i**: [Required] Specify the network interface name of the provision network on the management node. +* **-V**: Output more information for debug. 
+ +For example, run the command on the Management Node :: + + xcatprobe xcatmn -i eth0 + +Output will be similar to: :: + + # xcatprobe xcatmn -i eth0 + [MN]: Sub process 'xcatd: SSL listener' is running [ OK ] + [MN]: Sub process 'xcatd: DB Access' is running [ OK ] + [MN]: Sub process 'xcatd: UDP listener' is running [ OK ] + [MN]: Sub process 'xcatd: install monitor' is running [ OK ] + [MN]: Sub process 'xcatd: Discovery worker' is running [ OK ] + [MN]: Sub process 'xcatd: Command log writer' is running [ OK ] + [MN]: xcatd is listening on port 3001 [ OK ] + [MN]: xcatd is listening on port 3002 [ OK ] + [MN]: 'lsxcatd -a' works [ OK ] + [MN]: The value of 'master' in 'site' table is an IP address [ OK ] + [MN]: NIC enp0s1 exists on current server [ OK ] + [MN]: Get IP address of NIC eth0 [ OK ] + [MN]: The IP *.*.*.* of eth0 equals the value of 'master' in 'site' table [ OK ] + [MN]: IP *.*.*.* of NIC eth0 is a static IP on current server [ OK ] + [MN]: *.*.*.* belongs to one of networks defined in 'networks' table [ OK ] + [MN]: There is domain definition in 'site' table [ OK ] + [MN]: There is a configuration in 'passwd' table for 'system' for node provisioning [ OK ] + [MN]: There is /install directory on current server [ OK ] + [MN]: There is /tftpboot directory on current server [ OK ] + [MN]: The free space of '/' is less than 12 G [ OK ] + [MN]: SELinux is disabled on current server [ OK ] + [MN]: Firewall is closed on current server [ OK ] + [MN]: HTTP service is ready on *.*.*.* [ OK ] + [MN]: TFTP service is ready on *.*.*.* [ OK ] + [MN]: DNS server is ready on *.*.*.* [ OK ] + [MN]: The size of /var/lib/dhcpd/dhcpd.leases is less than 100M [ OK ] + [MN]: DHCP service is ready on *.*.*.* [ OK ] + ======================do summary===================== + [MN]: Check on MN PASS. [ OK ] + +**[MN]** means that the verification is performed on the Management Node. An overall status of ``PASS`` or ``FAILED`` will be displayed after all items are verified. 
+ +Service Nodes are checked automatically for hierarchical clusters. + +For Service Nodes, the output will contain ``[SN:nodename]`` to distinguish different Service Nodes.
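
Reviewer note: the sample outputs in this patch end each checked item with a bracketed status tag (``[ OK ]``, ``[INFO]``). As a minimal sketch of how saved probe output could be summarized, the Python below groups lines by that trailing tag. It is not part of xCAT. The function name ``summarize`` and the regular expression are hypothetical, and only the two tags shown in the samples are known; any other tag value the tool might print is an assumption.

```python
import re

# Trailing status tag such as "[ OK ]" or "[INFO]", as seen in the sample
# output. Assumption: other tags, if any, follow the same bracketed form.
STATUS_TAG = re.compile(r"\[\s*([A-Za-z]+)\s*\]\s*$")


def summarize(lines):
    """Group probe output lines by their trailing status tag (hypothetical helper)."""
    summary = {}
    for line in lines:
        match = STATUS_TAG.search(line)
        if match is None:
            continue  # separators, banners, and untagged log lines
        tag = match.group(1).upper()
        message = line[:match.start()].strip()
        summary.setdefault(tag, []).append(message)
    return summary


# Lines copied from the sample outputs in this patch.
sample = [
    "[MN]: xcatd is listening on port 3001 [ OK ]",
    "The install NIC in current server is enp0s1 [INFO]",
    "======================do summary=====================",
    "[MN]: Check on MN PASS. [ OK ]",
]
result = summarize(sample)
```

Anchoring the pattern at the end of the line keeps prefixes such as ``[MN]:`` or ``[c910f03c17k20]`` from being mistaken for a status tag.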