From 57c7dca6daed291ef8f9f238a59d196ad8abca90 Mon Sep 17 00:00:00 2001 From: chenglch Date: Wed, 2 Sep 2015 02:47:30 -0400 Subject: [PATCH] Add hierarchy doc Add hierarchy documents including introduction, database management,defining node, service node setup, test service node and appendix. TODO: resolve the fake_todo link reference in the following patch --- .../appendix_a_setup_backup_service_nodes.rst | 167 +++++++++ .../hierarchy/appendix_b_diagnostics.rst | 66 ++++ .../appendix_c_migrating_mn_to_sn.rst | 14 + ...pendix_d_set_up_hierarchical_conserver.rst | 13 + .../advanced/hierarchy/configure_dhcp.rst | 11 + .../define_and_install_compute_node.rst | 39 ++ .../hierarchy/define_service_node.rst | 345 ++++++++++++++++++ .../advanced/hierarchy/hierarchy_cluster.rst | 2 - docs/source/advanced/hierarchy/index.rst | 11 + .../advanced/hierarchy/introduction.rst | 48 ++- .../hierarchy/service_node_for_diskfull.rst | 156 ++++++++ .../hierarchy/service_node_for_diskless.rst | 221 +++++++++++ .../setup_mn_hierachical_database.rst | 27 ++ .../setup_mn_hierarchical_database.rst | 27 ++ .../advanced/hierarchy/setup_service_node.rst | 6 + .../test_service_node_installation.rst | 11 + 16 files changed, 1158 insertions(+), 6 deletions(-) create mode 100644 docs/source/advanced/hierarchy/appendix_a_setup_backup_service_nodes.rst create mode 100644 docs/source/advanced/hierarchy/appendix_b_diagnostics.rst create mode 100644 docs/source/advanced/hierarchy/appendix_c_migrating_mn_to_sn.rst create mode 100644 docs/source/advanced/hierarchy/appendix_d_set_up_hierarchical_conserver.rst create mode 100644 docs/source/advanced/hierarchy/configure_dhcp.rst create mode 100644 docs/source/advanced/hierarchy/define_and_install_compute_node.rst create mode 100644 docs/source/advanced/hierarchy/define_service_node.rst delete mode 100644 docs/source/advanced/hierarchy/hierarchy_cluster.rst create mode 100644 docs/source/advanced/hierarchy/service_node_for_diskfull.rst create mode 100644 docs/source/advanced/hierarchy/service_node_for_diskless.rst create mode 100644 docs/source/advanced/hierarchy/setup_mn_hierachical_database.rst create mode 100644 docs/source/advanced/hierarchy/setup_mn_hierarchical_database.rst create mode 100644 docs/source/advanced/hierarchy/setup_service_node.rst create mode 100644 docs/source/advanced/hierarchy/test_service_node_installation.rst diff --git a/docs/source/advanced/hierarchy/appendix_a_setup_backup_service_nodes.rst b/docs/source/advanced/hierarchy/appendix_a_setup_backup_service_nodes.rst new file mode 100644 index 000000000..8798d1939 --- /dev/null +++ b/docs/source/advanced/hierarchy/appendix_a_setup_backup_service_nodes.rst @@ -0,0 +1,167 @@ +Appendix A: Setup backup Service Nodes +====================================== + +For reliability, availability, and serviceability purposes you may wish to +designate backup service nodes in your hierarchical cluster. The backup +service node will be another active service node that is set up to easily +take over from the original service node if a problem occurs. This is not an +automatic failover feature. You will have to initiate the switch from the +primary service node to the backup manually. The xCAT support will handle most +of the setup and transfer of the nodes to the new service node. This +procedure can also be used to simply switch some compute nodes to a new +service node, for example, for planned maintenance. + +Abbreviations used below: + +* MN - management node. +* SN - service node. +* CN - compute node. 
+ +Initial deployment +------------------ + +Integrate the following steps into the hierarchical deployment process +described above. + + +#. Make sure both the primary and backup service nodes are installed, + configured, and can access the MN database. +#. When defining the CNs add the necessary service node values to the + "servicenode" and "xcatmaster" attributes of the `node definitions + `_. +#. (Optional) Create an xCAT group for the nodes that are assigned to each SN. + This will be useful when setting node attributes as well as providing an + easy way to switch a set of nodes back to their original server. + +To specify a backup service node you must specify a comma-separated list of +two **service nodes** for the servicenode value of the compute node. The first +one is the primary and the second is the backup (or new SN) for that node. +Use the hostnames of the SNs as known by the MN. + +For the **xcatmaster** value you should only include the primary SN, as known +by the compute node. + +In most hierarchical clusters, the networking is such that the name of the +SN as known by the MN is different than the name as known by the CN. (If +they are on different networks.) + +The following example assume the SN interface to the MN is on the "a" +network and the interface to the CN is on the "b" network. To set the +attributes you would run a command similar to the following. :: + + chdef servicenode="xcatsn1a,xcatsn2a" xcatmaster="xcatsn1b" + +The process can be simplified by creating xCAT node groups to use as the + in the `chdef `_ command. To create an +xCAT node group containing all the nodes that have the service node "SN27" +you could run a command similar to the following. :: + + mkdef -t group sn1group members=node[01-20] + +**Note: Normally backup service nodes are the primary SNs for other compute +nodes. So, for example, if you have 2 SNs, configure half of the CNs to use +the 1st SN as their primary SN, and the other half of CNs to use the 2nd SN +as their primary SN. Then each SN would be configured to be the backup SN +for the other half of CNs.** + +When you run `makedhcp `_, it will configure dhcp +and tftp on both the primary and backup SNs, assuming they both have network +access to the CNs. This will make it possible to do a quick SN takeover +without having to wait for replication when you need to switch. + +xdcp Behaviour with backup servicenodes +--------------------------------------- + +The xdcp command in a hierarchical environment must first copy (scp) the +files to the service nodes for them to be available to scp to the node from +the service node that is it's master. The files are placed in +``/var/xcat/syncfiles`` directory by default, or what is set in site table +SNsyncfiledir attribute. If the node has multiple service nodes assigned, +then xdcp will copy the file to each of the service nodes assigned to the +node. For example, here the files will be copied (scp) to both service1 and +rhsn. lsdef cn4 | grep servicenode. :: + + servicenode=service1,rhsn + +f a service node is offline ( e.g. service1), then you will see errors on +your xdcp command, and yet if rhsn is online then the xdcp will actually +work. This may be a little confusing. For example, here service1 is offline, +but we are able to use rhsn to complete the xdcp. :: + + xdcp cn4 /tmp/lissa/file1 /tmp/file1 + + service1: Permission denied (publickey,password,keyboard-interactive). + service1: Permission denied (publickey,password,keyboard-interactive). 
+ service1: lost connection + The following servicenodes: service1, have errors and cannot be updated + Until the error is fixed, xdcp will not work to nodes serviced by these service nodes. + + xdsh cn4 ls /tmp/file1 + cn4: /tmp/file1 + +Switch to the backup SN +----------------------- + +When an SN fails, or you want to bring it down for maintenance, use this +procedure to move its CNs over to the backup SN. + +Move the nodes to the new service nodes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Use the xCAT `snmove `_ to make the database +updates necessary to move a set of nodes from one service node to another, and +to make configuration modifications to the nodes. + +For example, if you want to switch all the compute nodes that use service +node "sn1" to the backup SN (sn2), run: :: + + snmove -s sn1 + +Modified database attributes +"""""""""""""""""""""""""""" + +The **snmove** command will check and set several node attribute values. + +**servicenode**: : This will be set to either the second server name in the +servicenode attribute list or the value provided on the command line. +**xcatmaster**: : Set with either the value provided on the command line or it +will be automatically determined from the servicenode attribute. +**nfsserver**: : If the value is set with the source service node then it will +be set to the destination service node. +**tftpserver**: : If the value is set with the source service node then it will + be reset to the destination service node. +**monserver**: : If set to the source service node then reset it to the +destination servicenode and xcatmaster values. +**conserver**: : If set to the source service node then reset it to the +destination servicenode and run **makeconservercf** + +Run postscripts on the nodes +"""""""""""""""""""""""""""" + +If the CNs are up at the time the snmove command is run then snmove will run +postscripts on the CNs to reconfigure them for the new SN. The "syslog" +postscript is always run. The "mkresolvconf" and "setupntp" scripts will be +run IF they were included in the nodes postscript list. + +You can also specify an additional list of postscripts to run. + +Modify system configuration on the nodes +"""""""""""""""""""""""""""""""""""""""" + +If the CNs are up the snmove command will also perform some configuration on +the nodes such as setting the default gateway and modifying some +configuration files used by xCAT. + +Switching back +-------------- + +The process for switching nodes back will depend on what must be done to +recover the original service node. If the SN needed to be reinstalled, you +need to set it up as an SN again and make sure the CN images are replicated +to it. Once you've done this, or if the SN's configuration was not lost, +then follow these steps to move the CNs back to their original SN: + +* Use snmove: :: + + snmove sn1group -d sn1 + diff --git a/docs/source/advanced/hierarchy/appendix_b_diagnostics.rst b/docs/source/advanced/hierarchy/appendix_b_diagnostics.rst new file mode 100644 index 000000000..cde5d7202 --- /dev/null +++ b/docs/source/advanced/hierarchy/appendix_b_diagnostics.rst @@ -0,0 +1,66 @@ +Appendix B: Diagnostics +======================= + +* **root ssh keys not setup** -- If you are prompted for a password when ssh to + the service node, then check to see if /root/.ssh has authorized_keys. If + the directory does not exist or no keys, on the MN, run xdsh service -K, + to exchange the ssh keys for root. 
You will be prompted for the root + password, which should be the password you set for the key=system in the + passwd table. +* **XCAT rpms not on SN** --On the SN, run rpm -qa | grep xCAT and make sure + the appropriate xCAT rpms are installed on the servicenode. See the list of + xCAT rpms in :ref:`setup_service_node_stateful_label`. If rpms + missing check your install setup as outlined in Build the Service Node + Stateless Image for diskless or :ref:`setup_service_node_stateful_label` for + diskfull installs. +* **otherpkgs(including xCAT rpms) installation failed on the SN** --The OS + repository is not created on the SN. When the "yum" command is processing + the dependency, the rpm packages (including expect, nmap, and httpd, etc) + required by xCATsn can't be found. In this case, please check whether the + ``/install/postscripts/repos///`` directory exists on the MN. + If it is not on the MN, you need to re-run the "copycds" command, and there + will be some file created under the + ``/install/postscripts/repos//`` directory on the MN. Then, you + need to re-install the SN, and this issue should be gone. +* **Error finding the database/starting xcatd** -- If on the Service node when + you run tabdump site, you get "Connection failure: IO::Socket::SSL: + connect: Connection refused at ``/opt/xcat/lib/perl/xCAT/Client.pm``". Then + restart the xcatd daemon and see if it passes by running the command: + service xcatd restart. If it fails with the same error, then check to see + if ``/etc/xcat/cfgloc`` file exists. It should exist and be the same as + ``/etc/xcat/cfgloc`` on the MN. If it is not there, copy it from the MN to + the SN. The run service xcatd restart. This indicates the servicenode + postscripts did not complete successfully. Check to see your postscripts + table was setup correctly in :ref:`add_service_node_postscripts_label` to the + postscripts table. +* **Error accessing database/starting xcatd credential failure**-- If you run + tabdump site on the servicenode and you get "Connection failure: + IO::Socket::SSL: SSL connect attempt failed because of handshake + problemserror:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca + at ``/opt/xcat/lib/perl/xCAT/Client.pm``", check ``/etc/xcat/cert``. The + directory should contain the files ca.pem and server-cred.pem. These were + suppose to transfer from the MN ``/etc/xcat/cert`` directory during the + install. Also check the ``/etc/xcat/ca`` directory. This directory should + contain most files from the ``/etc/xcat/ca`` directory on the MN. You can + manually copy them from the MN to the SN, recursively. This indicates the + the servicenode postscripts did not complete successfully. Check to see + your postscripts table was setup correctly in + :ref:`add_service_node_postscripts_label` to the postscripts table. Again + service xcatd restart and try the tabdump site again. +* **Missing ssh hostkeys** -- Check to see if ``/etc/xcat/hostkeys`` on the SN, + has the same files as ``/etc/xcat/hostkeys`` on the MN. These are the ssh + keys that will be installed on the compute nodes, so root can ssh between + compute nodes without password prompting. If they are not there copy them + from the MN to the SN. Again, these should have been setup by the + servicenode postscripts. + +* **Errors running hierarchical commands such as xdsh** -- xCAT has a number of + commands that run hierarchically. 
That is, the commands are sent from xcatd + on the management node to the correct service node xcatd, which in turn + processes the command and sends the results back to xcatd on the management + node. If a hierarchical command such as xcatd fails with something like + "Error: Permission denied for request", check ``/var/log/messages`` on the + management node for errors. One error might be "Request matched no policy + rule". This may mean you will need to add policy table entries for your + xCAT management node and service node: + diff --git a/docs/source/advanced/hierarchy/appendix_c_migrating_mn_to_sn.rst b/docs/source/advanced/hierarchy/appendix_c_migrating_mn_to_sn.rst new file mode 100644 index 000000000..13df593b8 --- /dev/null +++ b/docs/source/advanced/hierarchy/appendix_c_migrating_mn_to_sn.rst @@ -0,0 +1,14 @@ +Appendix C: Migrating a Management Node to a Service Node +========================================================= + +If you find you want to convert an existing Management Node to a Service +Node you need to work with the xCAT team. It is recommended for now, to +backup your database, setup your new Management Server, and restore your +database into it. Take the old Management Node and remove xCAT and all xCAT +directories, and your database. See ``Uninstalling_xCAT +`_ and then follow the process for setting up a +SN as if it is a new node. + + + + \ No newline at end of file diff --git a/docs/source/advanced/hierarchy/appendix_d_set_up_hierarchical_conserver.rst b/docs/source/advanced/hierarchy/appendix_d_set_up_hierarchical_conserver.rst new file mode 100644 index 000000000..72797d469 --- /dev/null +++ b/docs/source/advanced/hierarchy/appendix_d_set_up_hierarchical_conserver.rst @@ -0,0 +1,13 @@ +Appendix D: Set up Hierarchical Conserver +========================================= + +To allow you to open the rcons from the Management Node using the +conserver daemon on the Service Nodes, do the following: + +* Set nodehm.conserver to be the service node (using the ip that faces the + management node) :: + + chdef -t conserver= + makeconservercf + service conserver stop + service conserver start diff --git a/docs/source/advanced/hierarchy/configure_dhcp.rst b/docs/source/advanced/hierarchy/configure_dhcp.rst new file mode 100644 index 000000000..23ef04179 --- /dev/null +++ b/docs/source/advanced/hierarchy/configure_dhcp.rst @@ -0,0 +1,11 @@ +Configure DHCP +============== + +Add the relevant networks into the DHCP configuration, refer to: +`XCAT_pLinux_Clusters/#setup-dhcp `_ + +Add the defined nodes into the DHCP configuration, refer to: +`XCAT_pLinux_Clusters/#configure-dhcp `_ + + + diff --git a/docs/source/advanced/hierarchy/define_and_install_compute_node.rst b/docs/source/advanced/hierarchy/define_and_install_compute_node.rst new file mode 100644 index 000000000..952dd3d6b --- /dev/null +++ b/docs/source/advanced/hierarchy/define_and_install_compute_node.rst @@ -0,0 +1,39 @@ +Define and install your Compute Nodes +===================================== + +Make /install available on the Service Nodes +-------------------------------------------- + +Note that all of the files and directories pointed to by your osimages should +be placed under the directory referred to in site.installdir (usually +/install), so they will be available to the service nodes. The installdir +directory is mounted or copied to the service nodes during the hierarchical +installation. 
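+
+To confirm what ``site.installdir`` currently points to, you can query it from
+the management node before deploying the service nodes (a quick check;
+``clustersite`` is the default name of the site object)::
+
+    lsdef -t site clustersite -i installdir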
+ +If you are not using the NFS-based statelite method of booting your compute +nodes and you are not using service node pools, set the installloc attribute +to "/install". This instructs the service node to mount /install from the +management node. (If you don't do this, you have to manually sync /install +between the management node and the service nodes.) + +:: + + chdef -t site clustersite installloc="/install" + +Make compute node syncfiles available on the servicenodes +--------------------------------------------------------- + +If you are not using the NFS-based statelite method of booting your compute +nodes, and you plan to use the syncfiles postscript to update files on the +nodes during install, you must ensure that those files are sync'd to the +servicenodes before the install of the compute nodes. To do this after your +nodes are defined, you will need to run the following whenever the files in +your synclist change on the Management Node: +:: + + updatenode -f + +At this point you can return to the documentation for your cluster environment +to define and deploy your compute nodes. + + diff --git a/docs/source/advanced/hierarchy/define_service_node.rst b/docs/source/advanced/hierarchy/define_service_node.rst new file mode 100644 index 000000000..47bee6e7b --- /dev/null +++ b/docs/source/advanced/hierarchy/define_service_node.rst @@ -0,0 +1,345 @@ +Define the service nodes in the database +======================================== + +This document assumes that you have previously **defined** your compute nodes +in the database. It is also possible at this point that you have generic +entries in your db for the nodes you will use as service nodes as a result of +the node discovery process. We are now going to show you how to add all the +relevant database data for the service nodes (SN) such that the SN can be +installed and managed from the Management Node (MN). In addition, you will +be adding the information to the database that will tell xCAT which service +nodes (SN) will service which compute nodes (CN). + +For this example, we have two service nodes: **sn1** and **sn2**. We will call +our Management Node: **mn1**. Note: service nodes are, by convention, in a +group called **service**. Some of the commands in this document will use the +group **service** to update all service nodes. + +Note: a Service Node's service node is the Management Node; so a service node +must have a direct connection to the management node. The compute nodes do not +have to be directly attached to the Management Node, only to their service +node. This will all have to be defined in your networks table. + +Add Service Nodes to the nodelist Table +--------------------------------------- + +Define your service nodes (if not defined already), and by convention we put +them in a **service** group. We usually have a group compute for our compute +nodes, to distinguish between the two types of nodes. (If you want to use your +own group name for service nodes, rather than service, you need to change some +defaults in the xCAT db that use the group name service. For example, in the +postscripts table there is by default a group entry for service, with the +appropriate postscripts to run when installing a service node. Also, the +default ``kickstart/autoyast`` template, pkglist, etc that will be used have +files names based on the profile name service.) 
:: + + mkdef sn1,sn2 groups=service,ipmi,all + +Add OS and Hardware Attributes to Service Nodes +----------------------------------------------- + +When you ran copycds, it creates several osimage definitions, including some +appropriate for SNs. Display the list of osimages and choose one with +"service" in the name: :: + + lsdef -t osimage + +For this example, let's assume you chose the stateful osimage definition for +rhels 6.3: rhels6.3-x86_64-install-service . If you want to modify any of the +osimage attributes (e.g. ``kickstart/autoyast`` template, pkglist, etc), +make a copy of the osimage definition and also copy to ``/install/custom`` +any files it points to that you are modifying. + +Now set some of the common attributes for the SNs at the group level: :: + + chdef -t group service arch=x86_64 \ + os=rhels6.3 \ + nodetype=osi + profile=service \ + netboot=xnba installnic=mac \ + primarynic=mac \ + provmethod=rhels6.3-x86_64-install-service + +Add Service Nodes to the servicenode Table +------------------------------------------ + +An entry must be created in the servicenode table for each service node or the +service group. This table describes all the services you would like xcat to +setup on the service nodes. (Even if you don't want xCAT to set up any +services - unlikely - you must define the service nodes in the servicenode +table with at least one attribute set (you can set it to 0), otherwise it will +not be recognized as a service node.) + +When the xcatd daemon is started or restarted on the service node, it will +make sure all of the requested services are configured and started. (To +temporarily avoid this when restarting xcatd, use "service xcatd reload" +instead.) + +To set up the minimum recommended services on the service nodes: :: + + chdef -t group -o service setupnfs=1 \ + setupdhcp=1 setuptftp=1 \ + setupnameserver=1 \ + setupconserver=1 + +.. TODO + +See the setup* attributes in the `node object definition man page +`_ for the services available. (The HTTP server +is also started when setupnfs is set.) + +If you are using the setupntp postscript on the compute nodes, you should also +set setupntp=1. For clusters with subnetted management networks (i.e. the +network between the SN and its compute nodes is separate from the network +between the MN and the SNs) you might want to also set setupipforward=1. + +.. _add_service_node_postscripts_label: + +Add Service Node Postscripts +---------------------------- + +By default, xCAT defines the service node group to have the "servicenode" +postscript run when the SNs are installed or diskless booted. This +postscript sets up the xcatd credentials and installs the xCAT software on +the service nodes. If you have your own postscript that you want run on the +SN during deployment of the SN, put it in ``/install/postscripts`` on the MN +and add it to the service node postscripts or postbootscripts. For example: :: + + chdef -t group -p service postscripts= + +Notes: + + * For Red Hat type distros, the postscripts will be run before the reboot + of a kickstart install, and the postbootscripts will be run after the + reboot. + * Make sure that the servicenode postscript is set to run before the + otherpkgs postscript or you will see errors during the service node + deployment. + * The -p flag automatically adds the specified postscript at the end of the + comma-separated list of postscripts (or postbootscripts). + +If you are running additional software on the service nodes that need **ODBC** +to access the database (e.g. 
LoadLeveler or TEAL), use this command to add +the xCAT supplied postbootscript called "odbcsetup". :: + + chdef -t group -p service postbootscripts=odbcsetup + +Assigning Nodes to their Service Nodes +-------------------------------------- + +The node attributes **servicenode** and **xcatmaster** define which SN +services this particular node. The servicenode attribute for a compute node +defines which SN the MN should send a command to (e.g. xdsh), and should be +set to the hostname or IP address of the service node that the management +node contacts it by. The xcatmaster attribute of the compute node defines +which SN the compute node should boot from, and should be set to the +hostname or IP address of the service node that the compute node contacts it +by. Unless you are using service node pools, you must set the xcatmaster +attribute for a node when using service nodes, even if it contains the same +value as the node's servicenode attribute. + +Host name resolution must have been setup in advance, with ``/etc/hosts``, DNS +or dhcp to ensure that the names put in this table can be resolved on the +Management Node, Service nodes, and the compute nodes. It is easiest to have a +node group of the compute nodes for each service node. For example, if all the +nodes in node group compute1 are serviced by sn1 and all the nodes in node +group compute2 are serviced by sn2: + +:: + + chdef -t group compute1 servicenode=sn1 xcatmaster=sn1-c + chdef -t group compute2 servicenode=sn2 xcatmaster=sn2-c + +Note: in this example, sn1 and sn2 are the node names of the service nodes +(and therefore the hostnames associated with the NICs that the MN talks to). +The hostnames sn1-c and sn2-c are associated with the SN NICs that communicate +with their compute nodes. + +Note: if not set, the attribute tftpserver's default value is xcatmaster, +but in some releases of xCAT it has not defaulted correctly, so it is safer +to set the tftpserver to the value of xcatmaster. + +These attributes will allow you to specify which service node should run the +conserver (console) and monserver (monitoring) daemon for the nodes in the +group specified in the command. In this example, we are having each node's +primary SN also act as its conserver and monserver (the most typical setup). +:: + + chdef -t group compute1 conserver=sn1 monserver=sn1,sn1-c + chdef -t group compute2 conserver=sn2 monserver=sn2,sn2-c + +Service Node Pools +^^^^^^^^^^^^^^^^^^ + +Service Node Pools are multiple service nodes that service the same set of +compute nodes. Having multiple service nodes allows backup service node(s) for +a compute node when the primary service node is unavailable, or can be used +for work-load balancing on the service nodes. But note that the selection of +which SN will service which compute node is made at compute node boot time. +After that, the selection of the SN for this compute node is fixed until the +compute node is rebooted or the compute node is explicitly moved to another SN +using the `snmove `_ command. + +To use Service Node pools, you need to architect your network such that all of +the compute nodes and service nodes in a partcular pool are on the same flat +network. If you don't want the management node to respond to manage some of +the compute nodes, it shouldn't be on that same flat network. The +site, dhcpinterfaces attribute should be set such that the SNs' DHCP daemon +only listens on the NIC that faces the compute nodes, not the NIC that faces +the MN. 
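+
+For example, if every service node faces its compute nodes on ``eth1``, a
+possible setting would be (the interface name here is only an illustration)::
+
+    chdef -t site clustersite dhcpinterfaces='service|eth1'
+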
This avoids some timing issues when the SNs are being deployed (so +that they don't respond to each other before they are completely ready). You +also need to make sure the `networks `_ table +accurately reflects the physical network structure. + +To define a list of service nodes that support a set of compute nodes, set the +servicenode attribute to a comma-delimited list of the service nodes. When +running an xCAT command like xdsh or updatenode for compute nodes, the list +will be processed left to right, picking the first service node on the list to +run the command. If that service node is not available, then the next service +node on the list will be chosen until the command is successful. Errors will +be logged. If no service node on the list can process the command, then the +error will be returned. You can provide some load-balancing by assigning your +service nodes as we do below. + +When using service node pools, the intent is to have the service node that +responds first to the compute node's DHCP request during boot also be the +xcatmaster, the tftpserver, and the NFS/http server for that node. Therefore, +the xcatmaster and nfsserver attributes for nodes should not be set. When +nodeset is run for the compute nodes, the service node interface on the +network to the compute nodes should be defined and active, so that nodeset +will default those attribute values to the "node ip facing" interface on that +service node. + +For example: :: + + chdef -t node compute1 servicenode=sn1,sn2 xcatmaster="" nfsserver="" + chdef -t node compute2 servicenode=sn2,sn1 xcatmaster="" nfsserver="" + +You need to set the sharedtftp site attribute to 0 so that the SNs will not +automatically mount the ``/tftpboot`` directory from the management node: +:: + + chdef -t site clustersite sharedtftp=0 + +For statefull (full-disk) node installs, you will need to use a local +``/install`` directory on each service node. The ``/install/autoinst/node`` +files generated by nodeset will contain values specific to that service node +for correctly installing the nodes. +:: + + chdef -t site clustersite installloc="" + +With this setting, you will need to remember to rsync your ``/install`` +directory from the xCAT management node to the service nodes anytime you +change your ``/install/postscripts``, custom osimage files, os repositories, +or other directories. It is best to exclude the ``/install/autoinst`` directory +from this rsync. + +:: + + rsync -auv --exclude 'autoinst' /install sn1:/ + +Note: If your service nodes are stateless and site.sharedtftp=0, if you reboot +any service node when using servicenode pools, any data written to the local +``/tftpboot`` directory of that SN is lost. You will need to run nodeset for +all of the compute nodes serviced by that SN again. + +For additional information about service node pool related settings in the +networks table, see ref: networks table, see :ref:`setup_networks_table_label`. + +Conserver and Monserver and Pools +""""""""""""""""""""""""""""""""" + +The support of conserver and monserver with Service Node Pools is still not +supported. You must explicitly assign these functions to a service node using +the nodehm.conserver and noderes.monserver attribute as above. + +Setup Site Table +---------------- + +If you are not using the NFS-based statelite method of booting your compute +nodes, set the installloc attribute to ``/install``. This instructs the +service node to mount ``/install`` from the management node. 
(If you don't do +this, you have to manually sync ``/install`` between the management node and +the service nodes.) :: + + chdef -t site clustersite installloc="/install" + +For IPMI controlled nodes, if you want the out-of-band IPMI operations to be +done directly from the management node (instead of being sent to the +appropriate service node), set site.ipmidispatch=n. + +If you want to throttle the rate at which nodes are booted up, you can set the +following site attributes: + + +* syspowerinterval +* syspowermaxnodes +* powerinterval (system p only) + +See the `site table man page `_ for details. + +.. _setup_networks_table_label: + +Setup networks Table +-------------------- + +All networks in the cluster must be defined in the networks table. When xCAT +is installed, it runs makenetworks, which creates an entry in the networks +table for each of the networks the management node is on. You need to add +entries for each network the service nodes use to communicate to the compute +nodes. + +For example: :: + + mkdef -t network net1 net=10.5.1.0 mask=255.255.255.224 gateway=10.5.1.1 + +If you want to set the nodes' xcatmaster as the default gateway for the nodes, +the gateway attribute can be set to keyword "". In this case, xCAT +code will automatically substitute the IP address of the node's xcatmaster for +the keyword. Here is an example: +:: + + mkdef -t network net1 net=10.5.1.0 mask=255.255.255.224 gateway= + +The ipforward attribute should be enabled on all the xcatmaster nodes that +will be acting as default gateways. You can set ipforward to 1 in the +servicenode table or add the line "net.ipv4.ip_forward = 1" in file +``/etc/sysctl``.conf and then run "sysctl -p /etc/sysctl.conf" manually to +enable the ipforwarding. + +Note:If using service node pools, the networks table dhcpserver attribute can +be set to any single service node in your pool. The networks tftpserver, and +nameserver attributes should be left blank. + +Verify the Tables +-------------------- + +To verify that the tables are set correctly, run lsdef on the service nodes, +compute1, compute2: :: + + lsdef service,compute1,compute2 + +Add additional adapters configuration script (optional) +------------------------------------------------------------ + +It is possible to have additional adapter interfaces automatically configured +when the nodes are booted. XCAT provides sample configuration scripts for +ethernet, IB, and HFI adapters. These scripts can be used as-is or they can be +modified to suit your particular environment. The ethernet sample is +``/install/postscript/configeth``. When you have the configuration script that +you want you can add it to the "postscripts" attribute as mentioned above. Make +sure your script is in the ``/install/postscripts`` directory and that it is +executable. + +Note: For system p servers, if you plan to have your service node perform the +hardware control functions for its compute nodes, it is necessary that the SN +ethernet network adapters connected to the HW service VLAN be configured. 
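+
+For instance, adding the sample Ethernet configuration script to the
+postscripts list of a compute group, as described above, might look like the
+following (``compute1`` is an example group name)::
+
+    chdef -t group -p compute1 postscripts=configeth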
+ +Configuring Secondary Adapters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +o configure secondary adapters, see `Configuring_Secondary_Adapters +`_ + + diff --git a/docs/source/advanced/hierarchy/hierarchy_cluster.rst b/docs/source/advanced/hierarchy/hierarchy_cluster.rst deleted file mode 100644 index fa3b38c3f..000000000 --- a/docs/source/advanced/hierarchy/hierarchy_cluster.rst +++ /dev/null @@ -1,2 +0,0 @@ -Setting Up a Linux Hierarchical Cluster -======================================= diff --git a/docs/source/advanced/hierarchy/index.rst b/docs/source/advanced/hierarchy/index.rst index 1f0a6b2e8..5ff6a59af 100644 --- a/docs/source/advanced/hierarchy/index.rst +++ b/docs/source/advanced/hierarchy/index.rst @@ -5,3 +5,14 @@ Hierarchical Clusters :maxdepth: 2 introduction.rst + setup_mn_hierachical_database.rst + define_service_node.rst + configure_dhcp.rst + setup_service_node.rst + service_node_for_diskfull.rst + service_node_for_diskless.rst + test_service_node_installation.rst + appendix_a_setup_backup_service_nodes.rst + appendix_b_diagnostics.rst + appendix_c_migrating_mn_to_sn.rst + appendix_d_set_up_hierarchical_conserver.rst diff --git a/docs/source/advanced/hierarchy/introduction.rst b/docs/source/advanced/hierarchy/introduction.rst index 00d01e5d0..5391a3606 100644 --- a/docs/source/advanced/hierarchy/introduction.rst +++ b/docs/source/advanced/hierarchy/introduction.rst @@ -1,10 +1,50 @@ Introduction ============ -In supporting large clusters, it is desirable to have more than a single management node handling the installation and management of compute nodes. - -In xCAT, these additional nodes are referred to as *Service Nodes (SN)*. The management node can delegate all management operations for a compute node to the Service node that is managing them. You can have one of more service nodes configured to install and manage a group of compute nodes. - +In large clusters, it is desirable to have more than one node (the Management +Node - MN) handle the installation and management of the compute nodes. We +call these additional nodes **service nodes (SN)**. The management node can +delegate all management operations needed by a compute node to the SN that is +managing that compute node. You can have one or more service nodes setting up +to install and manage groups of compute nodes. Service Nodes ------------- + +With xCAT, you have the choice of either having each service node +install/manage a specific set of compute nodes, or having a pool of service +nodes, any of which can respond to an installation request from a compute +node. (Service node pools must be aligned with the network broadcast domains, +because the way a compute node choose its SN for that boot is by whoever +responds to the DHCP request broadcast first.) You can also have a hybrid of +the 2 approaches, in which for each specific set of compute nodes you have 2 +or more SNs in a pool. + +Each SN runs an instance of xcatd, just like the MN does. The xcatd daemons +communicate with each other using the same XML/SSL protocol that the xCAT +client uses to communicate with xcatd on the MN. + +Daemon-based Databases +---------------------- + +The service nodes need to communicate with the xCAT database on the Management +Node. They do this by using the remote client capability of the database (i.e. +they don't go through xcatd for that). Therefore the Management Node must be +running one of the daemon-based databases supported by xCAT (PostgreSQL, +MySQL). 
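+
+To check which database engine the management node is currently using, you can
+ask the xCAT daemon (on a freshly installed management node this will normally
+report SQLite)::
+
+    lsxcatd -a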
+ +The default SQLite database does not support remote clients and cannot be used +in hierarchical clusters. This document includes instructions for migrating +your cluster from SQLite to one of the other databases. Since the initial +install of xCAT will always set up SQLite, you must migrate to a database that +supports remote clients before installing your service nodes. + +Setup +----- +xCAT will help you install your service nodes as well as install on the SNs +xCAT software and other required rpms such as perl, the database client, and +other pre-reqs. Service nodes require all the same software as the MN +(because it can do all of the same functions), except that there is a special +top level xCAT rpm for SNs called xCATsn vs. the xCAT rpm that is on the +Management Node. The xCATsn rpm tells the SN that the xcatd on it should +behave as an SN, not the MN. diff --git a/docs/source/advanced/hierarchy/service_node_for_diskfull.rst b/docs/source/advanced/hierarchy/service_node_for_diskfull.rst new file mode 100644 index 000000000..d8af3abdc --- /dev/null +++ b/docs/source/advanced/hierarchy/service_node_for_diskfull.rst @@ -0,0 +1,156 @@ +.. _setup_service_node_stateful_label: + +Set Up the Service Nodes for Stateful (Diskful) Installation (optional) +======================================================================= + +Any cluster using statelite compute nodes must use a stateful (diskful) service +nodes. + +Note: If you are using diskless service nodes, go to +:ref:`setup_service_node_stateless_label`. + +First, go to the `Download_xCAT `_ site and +download the level of the xCAT tarball you desire. Then go to +http://localhost/fake_todo and get the latest xCAT dependency tarball. +**Note: All xCAT service nodes must be at the exact same xCAT version as the +xCAT Management Node**. Copy the files to the Management Node (MN) and untar +them in the appropriate sub-directory of ``/install/post/otherpkgs`` + +**Note for the appropriate directory below, check the +``otherpkgdir=/install/post/otherpkgs/rhels6.4/ppc64`` attribute of the +osimage defined for the servicenode.** + +For example ubuntu14.04.1-ppc64el-install-service **** :: + + mkdir -p /install/post/otherpkgs/ubuntu14.04.1/ppc64el/ + cd /install/post/otherpkgs/ubuntu14.04.1/ppc64el/ + tar jxvf core-rpms-snap.tar.bz2 + tar jxvf xcat-dep-ubuntu*.tar.bz2 + +Next, add rpm names into your own version of +service...otherpkgs.pkglist file. In most cases, you can find an +initial copy of this file under /opt/xcat/share/xcat/install/ . If +not, copy one from a similar platform. 
+:: + + mkdir -p /install/custom/install/ubuntu/ + cp /opt/xcat/share/xcat/install/ubuntu/service.ubuntu.otherpkgs.pkglist/\ + install/custom/install/ubuntu/service.ubuntu.otherpkgs.pkglist + vi /install/custom/install/ubuntu/service.ubuntu.otherpkgs.pkglist + +Make sure the following entries are included in the +``/install/custom/install/ubuntu/service.ubuntu.otherpkgs.pkglist``: +:: + + mariadb-client + mariadb-common + xcatsn + conserver-xcat + +The "pkgdir" should include the online/local ubuntu official mirror with the +following command: +:: + + chdef -t osimage -o ubuntu14.04.1-ppc64el-install-service \ + -p pkgdir="http://ports.ubuntu.com/ubuntu-ports trusty main, \ + http://ports.ubuntu.com/ubuntu-ports trusty-updates main, \ + http://ports.ubuntu.com/ubuntu-ports trusty universe, \ + http://ports.ubuntu.com/ubuntu-ports trusty-updates universe" + +plus the "otherpkgdir" should include the mirror under otherpkgdir on MN, this +can be done with: :: + + chdef -t osimage -o ubuntu14.04.1-ppc64el-install-service -p \ + otherpkgdir="http:// < Name or ip of Management Node > \ + /install/post/otherpkgs/ubuntu14.04.1/ppc64el/xcat-core/ \ + trusty main, http://< Name or ip of Management Node > \ + /install/post/otherpkgs/ubuntu14.04.1/ppc64el/xcat-dep/ trusty main" + +**Note: you will be installing the xCAT Service Node rpm xCATsn meta-package +on the Service Node, not the xCAT Management Node meta-package. Do not install +both.** + +Update the rhels6 RPM repository (rhels6 only) +---------------------------------------------- + +* This section could be removed after the powerpc-utils-1.2.2-18.el6.ppc64.rpm + is built in the base rhels6 ISO. +* The direct rpm download link is: + ftp://linuxpatch.ncsa.uiuc.edu/PERCS/powerpc-utils-1.2.2-18.el6.ppc64.rpm +* The update steps are as following: :: + + put the new rpm in the base OS packages + cd /install/rhels6/ppc64/Server/Packages + mv powerpc-utils-1.2.2-17.el6.ppc64.rpm /tmp + cp /tmp/powerpc-utils-1.2.2-18.el6.ppc64.rpm + # make sure that the rpm is be readable by other users + chmod +r powerpc-utils-1.2.2-18.el6.ppc64.rpm + + + +* create the repodata + +:: + + cd /install/rhels6/ppc64/Server + ls -al repodata/ + total 14316 + dr-xr-xr-x 2 root root 4096 Jul 20 09:34 . + dr-xr-xr-x 3 root root 4096 Jul 20 09:34 .. 
+ -r--r--r-- 1 root root 1305862 Sep 22 2010 20dfb74c144014854d3b16313907ebcf30c9ef63346d632369a19a4add8388e7-other.sqlite.bz2 + -r--r--r-- 1 root root 1521372 Sep 22 2010 57b3c81512224bbb5cebbfcb6c7fd1f7eb99cca746c6c6a76fb64c64f47de102-primary.xml.gz + -r--r--r-- 1 root root 2823613 Sep 22 2010 5f664ea798d1714d67f66910a6c92777ecbbe0bf3068d3026e6e90cc646153e4-primary.sqlite.bz2 + -r--r--r-- 1 root root 1418180 Sep 22 2010 7cec82d8ed95b8b60b3e1254f14ee8e0a479df002f98bb557c6ccad5724ae2c8-other.xml.gz + -r--r--r-- 1 root root 194113 Sep 22 2010 90cbb67096e81821a2150d2b0a4f3776ab1a0161b54072a0bd33d5cadd1c234a-comps-rhel6-Server.xml.gz + **-r--r--r-- 1 root root 1054944 Sep 22 2010 98462d05248098ef1724eddb2c0a127954aade64d4bb7d4e693cff32ab1e463c-comps-rhel6-Server.xml** + -r--r--r-- 1 root root 3341671 Sep 22 2010 bb3456b3482596ec3aa34d517affc42543e2db3f4f2856c0827d88477073aa45-filelists.sqlite.bz2 + -r--r--r-- 1 root root 2965960 Sep 22 2010 eb991fd2bb9af16a24a066d840ce76365d396b364d3cdc81577e4cf6e03a15ae-filelists.xml.gz + -r--r--r-- 1 root root 3829 Sep 22 2010 repomd.xml + -r--r--r-- 1 root root 2581 Sep 22 2010 TRANS.TBL + createrepo -g repodata \ + /98462d05248098ef1724eddb2c0a127954aade64d4bb7d4e693cff32ab1e463c-comps-rhel6-Server.xml + + Note: you should use comps-rhel6-Server.xml with its key as the group file. + +Set the node status to ready for installation +--------------------------------------------- + +Run nodeset to the osimage name defined in the provmethod attribute on your +service node. :: + + nodeset service osimage="" + +For example :: + + nodeset service osimage="ubuntu14.04.1-ppc64el-install-service" + +Initialize network boot to install Service Nodes +------------------------------------------------ + +:: + + rnetboot service + +Monitor the Installation +------------------------ + +Watch the installation progress using either wcons or rcons: :: + + wcons service # make sure DISPLAY is set to your X server/VNC or + rcons + tail -f /var/log/messages + +Note: We have experienced one problem while trying to install RHEL6 diskful +service node working with SAS disks. The service node cannot reboots from SAS +disk after the RHEL6 operating system has been installed. We are waiting for +the build with fixes from RHEL6 team, once meet this problem, you need to +manually select the SAS disk to be the first boot device and boots from the +SAS disk. + +Update Service Node Diskfull Image +---------------------------------- + +If you need to update the service nodes later on with a new version of xCAT +and its dependencies, obtain the new xCAT and xCAT dependencies rpms. +(Follow the same steps that were followed in +:ref:`setup_service_node_stateful_label`. diff --git a/docs/source/advanced/hierarchy/service_node_for_diskless.rst b/docs/source/advanced/hierarchy/service_node_for_diskless.rst new file mode 100644 index 000000000..eef051e3b --- /dev/null +++ b/docs/source/advanced/hierarchy/service_node_for_diskless.rst @@ -0,0 +1,221 @@ +.. _setup_service_node_stateless_label: + +Setup the Service Node for Stateless Deployment (optional) +========================================================== + +**Note: The stateless service node is not supported in ubuntu hierarchy +cluster. For ubuntu, please skip this section.** + +If you want, your service nodes can be stateless (diskless). The service node +must contain not only the OS, but also the xCAT software and its dependencies. 
+In addition, a number of files are added to the service node to support the +PostgreSQL, or MySQL database access from the service node to the Management +node, and ssh access to the nodes that the service nodes services. +The following sections explain how to accomplish this. + + +Build the Service Node Diksless Image +-------------------------------------- + +This section assumes you can build the stateless image on the management node +because the service nodes are the same OS and architecture as the management +node. If this is not the case, you need to build the image on a machine that +matches the service node's OS architecture. + +* Create an osimage definition. When you run copycds, xCAT will create a + service node osimage definitions for that distribution. For a stateless + service node, use the *-netboot-service definition. + + :: + + lsdef -t osimage | grep -i service + rhels6.4-ppc64-install-service (osimage) + rhels6.4-ppc64-netboot-service (osimage) + rhels6.4-ppc64-statelite-service (osimage) + + lsdef -t osimage -l rhels6.3-ppc64-netboot-service + Object name: rhels6.3-ppc64-netboot-service + exlist=/opt/xcat/share/xcat/netboot/rh/service.exlist + imagetype=linux + osarch=ppc64 + osdistroname=rhels6.3-ppc64 + osname=Linux + osvers=rhels6.3 + otherpkgdir=/install/post/otherpkgs/rhels6.3/ppc64 + otherpkglist=/opt/xcat/share/xcat/netboot/rh/service.rhels6.ppc64.otherpkgs.pkglist + pkgdir=/install/rhels6.3/ppc64 + pkglist=/opt/xcat/share/xcat/netboot/rh/service.rhels6.ppc64.pkglist + postinstall=/opt/xcat/share/xcat/netboot/rh/service.rhels6.ppc64.postinstall + profile=service + provmethod=netboot + rootimgdir=/install/netboot/rhels6.3/ppc64/service + +* You can check the service node packaging to see if it has all the rpms you + require. We ship a basic requirements lists that will create a fully + functional service node. However, you may want to customize your service + node by adding additional operating system packages or modifying the files + excluded by the exclude list. View the files referenced by the osimage + pkglist, otherpkglist and exlist attributes: + + :: + + cd /opt/xcat/share/xcat/netboot/rh/ + view service.rhels6.ppc64.pkglist + view service.rhels6.ppc64.otherpkgs.pkglist + view service.exlist + + If you would like to change any of these files, copy them to a custom + directory. This can be any directory you choose, but we recommend that you + keep it /install somewhere. A good location is something like + ``/install/custom/netboot//service``. Make sure that your + ``otherpkgs.pkglist`` file as an entry for + + :: + + xcat/xcat-core/xCATsn + + This is required to install the xCAT service node function into your image. + + You may also choose to create an appropriate /etc/fstab file in your + service node image. Copy the script referenced by the postinstall + attribute to your directory and modify it as you would like: + + :: + + cp /opt/xcat/share/xcat/netboot/rh/service.rhels6.ppc64.postinstall + /install/custom/netboot/rh + vi /install/custom/netboot/rh + # uncomment the sample fstab lines and change as needed: + proc /proc proc rw 0 0 + sysfs /sys sysfs rw 0 0 + devpts /dev/pts devpts rw,gid=5,mode=620 0 0 + service_x86_64 / tmpfs rw 0 1 + none /tmp tmpfs defaults,size=10m 0 2 + none /var/tmp tmpfs defaults,size=10m 0 2 + + After modifying the files, you will need to update the osimage definition to + reference these files. 
We recommend creating a new osimage definition for + your custom image: :: + + lsdef -t osimage -l rhels6.3-ppc64-netboot-service -z > /tmp/myservice.def + vi /tmp/myservice.def + # change the name of the osimage definition + # change any attributes that now need to reference your custom files + # change the rootimgdir attribute replacing 'service' + with a name to match your new osimage definition + cat /tmp/msyservice.def | mkdef -z + + While you are here, if you'd like, you can do the same for your compute node + images, creating custom files and new custom osimage definitions as you need + to. + + For more information on the use and syntax of otherpkgs and pkglist files, + see `Update Service Node Stateless Image `_ + +* Make your xCAT software available for otherpkgs processing + +* If you downloaded xCAT to your management node for installation, place a + copy of your xcat-core and xcat-dep in your otherpkgdir directory :: + + lsdef -t osimage -o rhels6.3-ppc64-netboot-service -i otherpkgdir + Object name: rhels6.3-ppc64-netboot-service + otherpkgdir=/install/post/otherpkgs/rhels6.3/ppc64 + cd /install/post/otherpkgs/rhels6.3/ppc64 + mkdir xcat + cd xcat + cp -Rp /xcat-core + cp -Rp /xcat-dep + +* If you installed your management node directly from the Linux online + repository, you will need to download the xcat-core and xcat-dep tarballs + + - Go to the `Download xCAT page `_ and download + the level of xCAT tarball you desire. + - Go to the `Download xCAT Dependencies `_ page + and download the latest xCAT dependency tarball. Place these into your + otherpkdir directory: + + :: + + lsdef -t osimage -o rhels6.3-ppc64-netboot-service -i otherpkgdir + Object name: rhels6.3-ppc64-netboot-service + otherpkgdir=/install/post/otherpkgs/rhels6.3/ppc64 + cd /install/post/otherpkgs/rhels6.3/ppc64 + mkdir xcat + cd xcat + mv . + tar -jxvf + mv . + tar -jxvf + +* Run image generation for your osimage definition: + + :: + + genimage rhels6.3-ppc64-netboot-service + +* Prevent DHCP from starting up until xcatd has had a chance to configure it: + + :: + + chroot /install/netboot/rhels6.3/ppc64/service/rootimg chkconfig dhcpd off + chroot /install/netboot/rhels6.3/ppc64/service/rootimg chkconfig dhcrelay off + +* IF using NFS hybrid mode, export /install read-only in service node image: + + :: + + cd /install/netboot/rhels6.3/ppc64/service/rootimg/etc + echo '/install *(ro,no_root_squash,sync,fsid=13)' >exports + +* Pack the image for your osimage definition: + + :: + + packimage rhels6.3-ppc64-netboot-service + +* Set the node status to ready for netboot using your osimage definition and + your 'service' nodegroup: + + :: + + nodeset service osimage=rhels6.3-ppc64-netboot-service + +* To diskless boot the service nodes + + :: + + rnetboot service + +Update Service Node Stateless Image +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To update the xCAT software in the image at a later time: + + * Download the updated xcat-core and xcat-dep tarballs and place them in + your osimage's otherpkgdir xcat directory as you did above. + * Generate and repack the image and reboot your service node. + * Run image generation for your osimage definition. + + :: + + genimage rhels6.3-ppc64-netboot-service + packimage rhels6.3-ppc64-netboot-service + nodeset service osimage=rhels6.3-ppc64-netboot-service + rnetboot service + +Note: The service nodes are set up as NFS-root servers for the compute nodes. 
+Any time changes are made to any compute image on the mgmt node it will be +necessary to sync all changes to all service nodes. In our case the +``/install`` directory is mounted on the servicenodes, so the update to the +compute node image is automatically available. + +Monitor install and boot +------------------------ + +:: + + wcons service # make sure DISPLAY is set to your X server/VNC or + rcons # or do rcons for each node + tail -f /var/log/messages + diff --git a/docs/source/advanced/hierarchy/setup_mn_hierachical_database.rst b/docs/source/advanced/hierarchy/setup_mn_hierachical_database.rst new file mode 100644 index 000000000..d4c07c4e4 --- /dev/null +++ b/docs/source/advanced/hierarchy/setup_mn_hierachical_database.rst @@ -0,0 +1,27 @@ +Setup the MN Hierarchical Database +================================== + +Before setting up service nodes, you need to set up either MySQL, PostgreSQL, +as the xCAT Database on the Management Node. The database client on the +Service Nodes will be set up later when the SNs are installed. MySQL and +PostgreSQL are available with the Linux OS. + +Follow the instructions in one of these documents for setting up the +Management node to use the selected database: + +MySQL or MariaDB +---------------- + +* Follow this documentation and be sure to use the xCAT provided mysqlsetup + command to setup the database for xCAT: + .. TODO http link + + - `Setting_Up_MySQL_as_the_xCAT_DB `_ + +PostgreSQL: +----------- +* Follow this documentation and be sure and use the xCAT provided pgsqlsetup + command to setup the database for xCAT: + .. TODO http link + + - `Setting_Up_PostgreSQL_as_the_xCAT_DB `_ diff --git a/docs/source/advanced/hierarchy/setup_mn_hierarchical_database.rst b/docs/source/advanced/hierarchy/setup_mn_hierarchical_database.rst new file mode 100644 index 000000000..d4c07c4e4 --- /dev/null +++ b/docs/source/advanced/hierarchy/setup_mn_hierarchical_database.rst @@ -0,0 +1,27 @@ +Setup the MN Hierarchical Database +================================== + +Before setting up service nodes, you need to set up either MySQL, PostgreSQL, +as the xCAT Database on the Management Node. The database client on the +Service Nodes will be set up later when the SNs are installed. MySQL and +PostgreSQL are available with the Linux OS. + +Follow the instructions in one of these documents for setting up the +Management node to use the selected database: + +MySQL or MariaDB +---------------- + +* Follow this documentation and be sure to use the xCAT provided mysqlsetup + command to setup the database for xCAT: + .. TODO http link + + - `Setting_Up_MySQL_as_the_xCAT_DB `_ + +PostgreSQL: +----------- +* Follow this documentation and be sure and use the xCAT provided pgsqlsetup + command to setup the database for xCAT: + .. TODO http link + + - `Setting_Up_PostgreSQL_as_the_xCAT_DB `_ diff --git a/docs/source/advanced/hierarchy/setup_service_node.rst b/docs/source/advanced/hierarchy/setup_service_node.rst new file mode 100644 index 000000000..f90eb04ab --- /dev/null +++ b/docs/source/advanced/hierarchy/setup_service_node.rst @@ -0,0 +1,6 @@ +Setup Service Node +================== + +* Follow this documentation to :ref:`setup_service_node_stateful_label`. + +* Follow this documentation to :ref:`setup_service_node_stateless_label`. 
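+
+Whichever method you use, remember that the service nodes must run exactly the
+same xCAT version as the management node. One way to compare them after setup
+(``service`` is the conventional SN group name; the rpm query applies to
+rpm-based distributions)::
+
+    lsxcatd -v                     # xCAT version on the management node
+    xdsh service "rpm -q xCATsn"   # xCAT service node package level on the SNs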
diff --git a/docs/source/advanced/hierarchy/test_service_node_installation.rst b/docs/source/advanced/hierarchy/test_service_node_installation.rst
new file mode 100644
index 000000000..e786608aa
--- /dev/null
+++ b/docs/source/advanced/hierarchy/test_service_node_installation.rst
@@ -0,0 +1,11 @@
+Test Service Node installation
+==============================
+
+* ssh to the service nodes. You should not be prompted for a password.
+* Check that the xCAT daemon xcatd is running.
+* Run a database command on the service node, e.g. ``tabdump site`` or
+  ``nodels``, and verify the database can be accessed from the service node.
+* Check that ``/install`` and ``/tftpboot`` are mounted on the service node
+  from the Management Node, if appropriate.
+* Make sure that the service node has name resolution for all the nodes it
+  will service.
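+
+A minimal check sequence, after logging in to a service node named ``sn1``
+that serves a compute node ``cn1`` (both names are examples)::
+
+    ssh sn1                       # from the MN; should not prompt for a password
+    service xcatd status          # xcatd should be running
+    tabdump site                  # the database should be reachable from the SN
+    df -h /install /tftpboot      # mounted from the MN, if appropriate
+    ping -c 1 cn1                 # name resolution for a compute node it services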