Merge pull request #530 from whowutwut/hierarchy_master
Merge the pages for Large Cluster and Hierarchical Clusters (master)
commit 13beea7af1
@ -402,7 +402,7 @@ Configure DRBD
[>....................] sync'ed:  0.5% (101932/102400)M
finish: 2:29:06 speed: 11,644 (11,444) K/sec

If a direct, back-to-back Gigabit Ethernet connection is set up between the two management nodes and you are unhappy with the synchronization speed, it is possible to speed up the initial synchronization through some tunable parameters in DRBD. This setting is not permanent, and will not be retained after boot. For details, see http://www.drbd.org/users-guide-emb/s-configure-sync-rate.html. ::

drbdadm disk-options --resync-rate=110M xCAT
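After the initial synchronization has finished, the temporary rate can be dropped again by re-applying the settings from the DRBD configuration file (a sketch using the standard ``drbdadm`` subcommand): ::

    drbdadm adjust xCAT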
@ -1168,13 +1168,13 @@ Trouble shooting and debug tips
* **x3550m4n02** ::

drbdadm disconnect xCAT
drbdadm secondary xCAT
drbdadm connect --discard-my-data xCAT

* **x3550m4n01** ::

drbdadm connect xCAT

Disable HA MN
=============
@ -2,33 +2,25 @@ Appendix A: Setup backup Service Nodes
======================================

For reliability, availability, and serviceability purposes you may wish to
designate backup Service Nodes in your hierarchical cluster. The backup
Service Node will be another active Service Node that is set up to easily
take over from the original Service Node if a problem occurs. This is not an
automatic failover feature. You will have to initiate the switch from the
primary Service Node to the backup manually. The xCAT support will handle most
of the setup and transfer of the nodes to the new Service Node. This
procedure can also be used to simply switch some compute nodes to a new
Service Node, for example, for planned maintenance.

Abbreviations used below:

* MN - Management Node.
* SN - Service Node.
* CN - Compute Node.

Initial deployment
------------------

Integrate the following steps into the hierarchical deployment process described above.

#. Make sure both the primary and backup Service Nodes are installed,
   configured, and can access the MN database.
#. When defining the CNs, add the necessary Service Node values to the
   "servicenode" and "xcatmaster" attributes of the :doc:`node </guides/admin-guides/references/man7/node.7>` definitions.
#. (Optional) Create an xCAT group for the nodes that are assigned to each SN.
   This will be useful when setting node attributes as well as providing an
   easy way to switch a set of nodes back to their original server.
@ -51,10 +43,8 @@ attributes you would run a command similar to the following. ::
chdef <noderange> servicenode="xcatsn1a,xcatsn2a" xcatmaster="xcatsn1b"

The process can be simplified by creating xCAT node groups to use as the
<noderange> in the :doc:`chdef </guides/admin-guides/references/man1/chdef.1>` command. To create an
xCAT node group containing all the nodes that belong to Service Node "SN27",
run a command similar to the following: ::

mkdef -t group sn1group members=node[01-20]
@ -64,10 +54,7 @@ the 1st SN as their primary SN, and the other half of CNs to use the 2nd SN
as their primary SN. Then each SN would be configured to be the backup SN
for the other half of CNs.**

When you run the :doc:`makedhcp </guides/admin-guides/references/man8/makedhcp.8>` command, it will configure dhcp and tftp on both the primary and backup SNs, assuming they both have network access to the CNs. This will make it possible to do a quick SN takeover without having to wait for replication when you need to switch.

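For example, the DHCP configuration can be regenerated and then populated with entries for the defined nodes using the standard ``makedhcp`` options: ::

    makedhcp -n
    makedhcp -a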

xdcp Behaviour with backup servicenodes
---------------------------------------
@ -83,7 +70,7 @@ rhsn. lsdef cn4 | grep servicenode. ::

servicenode=service1,rhsn

If a service node is offline (e.g. service1), then you will see errors on
your xdcp command, and yet if rhsn is online then the xdcp will actually
work. This may be a little confusing. For example, here service1 is offline,
but we are able to use rhsn to complete the xdcp. ::
@ -108,49 +95,39 @@ procedure to move its CNs over to the backup SN.

Move the nodes to the new service nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use the :doc:`snmove </guides/admin-guides/references/man1/snmove.1>` command to make the database changes necessary to move a set of compute nodes from one Service Node to another.

To switch all the compute nodes from Service Node ``sn1`` to the backup Service Node ``sn2``, run: ::

snmove -s sn1

Modified database attributes
""""""""""""""""""""""""""""

The ``snmove`` command will check and set several node attribute values.

* **servicenode**: This will be set to either the second server name in the servicenode attribute list or the value provided on the command line.

* **xcatmaster**: Set with either the value provided on the command line or it will be automatically determined from the servicenode attribute.

* **nfsserver**: If the value is set with the source service node then it will be set to the destination service node.

* **tftpserver**: If the value is set with the source service node then it will be reset to the destination service node.

* **monserver**: If set to the source service node then reset it to the destination servicenode and xcatmaster values.

* **conserver**: If set to the source service node then reset it to the destination servicenode and run ``makeconservercf``.

Run postscripts on the nodes
""""""""""""""""""""""""""""

If the CNs are up at the time the ``snmove`` command is run then ``snmove`` will run postscripts on the CNs to reconfigure them for the new SN. The "syslog" postscript is always run. The ``mkresolvconf`` and ``setupntp`` scripts will be run if they were included in the node's postscript list.

You can also specify an additional list of postscripts to run.
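For example, assuming the ``snmove`` version in use supports the ``-P|--postscripts`` option (the script names below are placeholders): ::

    snmove -s sn1 -P "myscript1,myscript2"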
Modify system configuration on the nodes
""""""""""""""""""""""""""""""""""""""""

If the CNs are up, the ``snmove`` command will also perform some configuration on the nodes such as setting the default gateway and modifying some configuration files used by xCAT.

Switching back
--------------

@ -161,7 +138,7 @@ need to set it up as an SN again and make sure the CN images are replicated
to it. Once you've done this, or if the SN's configuration was not lost,
then follow these steps to move the CNs back to their original SN:

* Use ``snmove``: ::

snmove sn1group -d sn1
@ -0,0 +1,10 @@

Appendix C: Migrating a Management Node to a Service Node
=========================================================

Directly converting an existing Management Node to a Service Node may have some issues and is not recommended. Instead, do the following steps to convert the xCAT Management Node into a Service Node:

#. Back up your xCAT database on the Management Node
#. Install a new xCAT Management Node
#. Restore your xCAT database into the new Management Node
#. Re-provision the old xCAT Management Node as a new Service Node
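For steps 1 and 3, the xCAT database can be saved and restored with the ``dumpxCATdb`` and ``restorexCATdb`` commands (the backup path is an example): ::

    # on the old Management Node
    dumpxCATdb -p /install/dbbackup

    # on the new Management Node, once xCAT is installed
    restorexCATdb -p /install/dbbackup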
docs/source/advanced/hierarchy/appendix/index.rst (new file)
@ -0,0 +1,10 @@

Appendix
========

.. toctree::
   :maxdepth: 2

   appendix_a_setup_backup_service_nodes.rst
   appendix_b_diagnostics.rst
   appendix_c_migrating_mn_to_sn.rst
   appendix_d_set_up_hierarchical_conserver.rst

@ -1,14 +0,0 @@

Appendix C: Migrating a Management Node to a Service Node
=========================================================

If you find you want to convert an existing Management Node to a Service
Node you need to work with the xCAT team. It is recommended for now, to
back up your database, set up your new Management Server, and restore your
database into it. Take the old Management Node and remove xCAT and all xCAT
directories, and your database. See `Uninstalling xCAT <http://localhost/fake_todo>`_
and then follow the process for setting up a SN as if it is a new node.

@ -3,7 +3,7 @@ Define Service Nodes

This next part shows how to configure an xCAT Hierarchy and provision xCAT service nodes from an existing xCAT cluster.

*The document assumes that the compute nodes that are part of your cluster have already been defined into the xCAT database and you have successfully provisioned the compute nodes using xCAT.*

The following table illustrates the cluster being used in this example:
@ -22,7 +22,7 @@ The following table illustrates the cluster being used in this example:

|                      | r1n10                |
+----------------------+----------------------+
| Compute Nodes        | r2n01                |
| (group=rack2)        | r2n02                |
|                      | r2n03                |
|                      | ...                  |
|                      | r2n10                |

@ -30,23 +30,27 @@ The following table illustrates the cluster being used in this example:
#. Select the compute nodes that will become service nodes

   The first node in each rack, ``r1n01`` and ``r2n01``, is selected to become the xCAT service nodes and manage the compute nodes in that rack.

#. Change the attributes for the compute nodes to make them part of the **service** group: ::

      chdef -t node -o r1n01,r2n01 -p groups=service

#. When ``copycds`` was run against the ISO image, several osimages are created in the ``osimage`` table. The ones named ``*-service`` are provided to help easily provision xCAT service nodes. ::

      # lsdef -t osimage | grep rhels7.1
      rhels7.1-ppc64le-install-compute (osimage)
      rhels7.1-ppc64le-install-service (osimage)  <======
      rhels7.1-ppc64le-netboot-compute (osimage)

#. Add some common service node attributes to the ``service`` nodegroup: ::

      chdef -t group -o service setupnfs=1 \
            setupdhcp=1 \
            setuptftp=1 \
            setupnameserver=1 \
            setupconserver=1

**Tips/Hint**

* Even if you do not want xCAT to configure any services, you must define the service nodes in the ``servicenode`` table with at least one attribute set to 0, otherwise xCAT will not recognize the node as a service node.
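For example, to register the Service Nodes without having xCAT configure any services, a single attribute can be set to 0 (a minimal sketch): ::

    chdef -t group -o service setupnfs=0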
@ -1,18 +1,24 @@

Hierarchical Clusters / Large Cluster Support
=============================================

xCAT supports management of very large clusters by creating a **Hierarchical Cluster** using the concept of **xCAT Service Nodes**.

When dealing with large clusters, to balance the load, it is recommended to have more than one node (Management Node, "MN") handling the installation and management of the Compute Nodes ("CN"). These additional *helper* nodes are referred to as **Service Nodes** ("SN"). The Management Node can delegate all management operational needs to the Service Node responsible for a set of compute nodes.

The following configurations are supported:

* Each service node installs/manages a specific set of compute nodes
* Having a pool of service nodes, any of which can respond to an installation request from a compute node (*requires service nodes to be aligned with network broadcast domains; a compute node chooses its service node based on who responds to its DHCP request first*)
* A hybrid of the above, where each specific set of compute nodes has 2 or more service nodes in a pool

The following documentation assumes an xCAT cluster has already been configured and covers the additional steps needed to support xCAT Hierarchy via Service Nodes.

.. toctree::
   :maxdepth: 2

   databases/index.rst
   define_service_nodes.rst
   configure_dhcp.rst
   provision/index.rst
   appendix/index.rst

@ -1,48 +0,0 @@

Introduction
============

When dealing with large clusters, it is desirable to have more than one node,
the xCAT Management Node (MN), handle the installation and management of the
Compute Nodes (CN). The concept of these additional "helper" nodes is called
**Service Nodes (SN)**. The Management Node can delegate all management
operations required by a compute node to the service node that is assigned to
manage that Compute Node. You can configure one or more Service Nodes to install
and manage a group of Compute Nodes.

Service Nodes
-------------

With xCAT, you have the choice of either having each Service Node
install/manage a specific set of compute nodes, or having a pool of Service
Nodes, any of which can respond to an installation request from a compute
node. (Service Node pools must be aligned with the network broadcast domains,
because a compute node chooses its Service Node for that boot by whoever
responds to the DHCP request broadcast first.) You can also have a hybrid of
the 2 approaches, in which for each specific set of compute nodes you have 2
or more Service Nodes in a pool.

Each Service Node runs an instance of ``xcatd``, just like the Management Node does.
The ``xcatd`` daemons communicate with each other using the same XML/SSH protocol
that the xCAT clients use to communicate with ``xcatd`` on the Management Node.

Daemon-based Databases
----------------------

The Service Nodes will need to communicate with the xCAT database on the Management
Node and do this by using the remote client capabilities of the database. Therefore,
the Management Node must be running one of the daemon-based databases supported by
xCAT (PostgreSQL, MySQL, MariaDB, etc.).

The default SQLite database does not support remote clients and cannot be used
in hierarchical clusters. This document includes instructions for migrating
your cluster from SQLite to one of the other databases. Since the initial
install of xCAT will always set up SQLite, you must migrate to a database that
supports remote clients before installing your Service Nodes.

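As a quick check before installing your Service Nodes, the active database configuration on the MN can be displayed with ``lsxcatd`` (a sketch; the exact output varies by version): ::

    lsxcatd -d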
Setup
-----

xCAT will help you install your Service Nodes as well as install on them the xCAT
software and other required rpms and pre-reqs. Service Nodes require the same
software as installed on the Management Node with the exception of the top level
xCAT rpm. The Management Node installs the ``xCAT`` package while the Service Nodes
install the ``xCATsn`` package.
@ -1,11 +1,9 @@

.. _setup_service_node_stateful_label:

Diskful (Stateful) Installation
===============================

Any cluster using statelite compute nodes must use stateful (diskful) Service Nodes.

**Note: All xCAT Service Nodes must be at the exact same xCAT version as the xCAT Management Node.** Copy the files to the Management Node (MN) and untar them in the appropriate sub-directory of ``/install/post/otherpkgs``.

**Note: for the appropriate directory below, check the** ``otherpkgdir=/install/post/otherpkgs/rhels7/x86_64`` **attribute of the osimage defined for the servicenode.**

@ -73,8 +71,7 @@ Update the rhels6 RPM repository (rhels6 only)

Set the node status to ready for installation
---------------------------------------------

Run ``nodeset`` with the osimage name defined in the provmethod attribute on your Service Node. ::

nodeset service osimage="<osimagename>"
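The Service Nodes can then be booted to start the installation. The exact commands depend on your hardware; on IPMI-managed machines, for example, a network boot can be forced like this (a sketch): ::

    rsetboot service net
    rpower service boot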
@ -95,12 +92,12 @@ Monitor the Installation
Watch the installation progress using either wcons or rcons: ::

wcons service      # make sure DISPLAY is set to your X server/VNC or
rcons <node_name>
tail -f /var/log/messages

Note: We have experienced one problem while trying to install a RHEL6 diskful
Service Node working with SAS disks. The Service Node cannot reboot from the SAS
disk after the RHEL6 operating system has been installed. We are waiting for
a build with fixes from the RHEL6 team; if you meet this problem, you need to
manually select the SAS disk to be the first boot device and boot from the
@ -109,7 +106,7 @@ SAS disk.

Update Service Node Diskful Image
---------------------------------

To update the xCAT software on the Service Node:

#. Obtain the new xcat-core and xcat-dep RPMS
#.
@ -1,39 +1,30 @@

.. _setup_service_node_stateless_label:

Diskless (Stateless) Installation
=================================

**Note: The stateless Service Node is not supported in ubuntu hierarchy cluster. For ubuntu, please skip this section.**

If you want, your Service Nodes can be stateless (diskless). The Service Node
must contain not only the OS, but also the xCAT software and its dependencies.
In addition, a number of files are added to the Service Node to support the
PostgreSQL, or MySQL database access from the Service Node to the Management
Node, and ssh access to the nodes that the Service Node services.
The following sections explain how to accomplish this.

Build the Service Node Diskless Image
-------------------------------------

This section assumes you can build the stateless image on the management node because the Service Nodes are the same OS and architecture as the management node. If this is not the case, you need to build the image on a machine that matches the Service Node's OS architecture.

* Create an osimage definition. When you run ``copycds``, xCAT will create Service Node osimage definitions for that distribution. For a stateless
  Service Node, use the ``*-netboot-service`` definition. ::

# lsdef -t osimage | grep -i service
rhels6.4-ppc64-install-service (osimage)
rhels6.4-ppc64-netboot-service (osimage)   <================
rhels6.4-ppc64-statelite-service (osimage)

# lsdef -t osimage -l rhels6.3-ppc64-netboot-service
Object name: rhels6.3-ppc64-netboot-service
    exlist=/opt/xcat/share/xcat/netboot/rh/service.exlist
    imagetype=linux
@ -50,14 +41,7 @@ matches the service node's OS architecture.
    provmethod=netboot
    rootimgdir=/install/netboot/rhels6.3/ppc64/service

* You can check the Service Node packaging to see if it has all the rpms you require. We ship a basic requirements list that will create a fully functional Service Node. However, you may want to customize your Service Node by adding additional operating system packages or modifying the files excluded by the exclude list. View the files referenced by the osimage pkglist, otherpkglist and exlist attributes: ::

cd /opt/xcat/share/xcat/netboot/rh/
view service.rhels6.ppc64.pkglist
@ -74,10 +58,10 @@ matches the service node's OS architecture.
xcat/xcat-core/xCATsn

This is required to install the xCAT Service Node function into your image.

You may also choose to create an appropriate /etc/fstab file in your
Service Node image. Copy the script referenced by the postinstall
attribute to your directory and modify it as you would like: ::

@ -109,9 +93,6 @@ matches the service node's OS architecture.

images, creating custom files and new custom osimage definitions as you need
to.

* Make your xCAT software available for otherpkgs processing

  * If you downloaded xCAT to your management node for installation, place a
@ -161,7 +142,7 @@ matches the service node's OS architecture.

chroot /install/netboot/rhels6.3/ppc64/service/rootimg chkconfig dhcpd off
chroot /install/netboot/rhels6.3/ppc64/service/rootimg chkconfig dhcrelay off

* If using NFS hybrid mode, export /install read-only in Service Node image: ::

@ -181,7 +162,7 @@ matches the service node's OS architecture.
nodeset service osimage=rhels6.3-ppc64-netboot-service

* To diskless boot the Service Nodes ::

@ -194,7 +175,7 @@ To update the xCAT software in the image at a later time:
* Download the updated xcat-core and xcat-dep tarballs and place them in
your osimage's otherpkgdir xcat directory as you did above.

* Generate and repack the image and reboot your Service Node.
* Run image generation for your osimage definition. ::

@ -204,9 +185,9 @@ To update the xCAT software in the image at a later time:

nodeset service osimage=rhels6.3-ppc64-netboot-service
rnetboot service

Note: The Service Nodes are set up as NFS-root servers for the compute nodes.
Any time changes are made to any compute image on the mgmt node it will be
necessary to sync all changes to all Service Nodes. In our case the
``/install`` directory is mounted on the servicenodes, so the update to the
compute node image is automatically available.

docs/source/advanced/hierarchy/provision/index.rst (new file)
@ -0,0 +1,10 @@

Provision Service Nodes
=======================

.. toctree::
   :maxdepth: 2

   diskful_sn.rst
   diskless_sn.rst
   verify_sn.rst

@ -1,5 +1,5 @@

Verify Service Node Installation
================================

* ssh to the service nodes. You should not be prompted for a password.
* Check to see that the xcat daemon ``xcatd`` is running.
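For example, the daemon check can be run from the Management Node (a sketch; ``service`` is the node group used for the Service Nodes in this document): ::

    xdsh service "ps -ef | grep xcatd"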
@ -1,25 +0,0 @@

Setup the MN Hierarchical Database
==================================

Before setting up service nodes, you need to set up either MySQL or PostgreSQL
as the xCAT Database on the Management Node. The database client on the
Service Nodes will be set up later when the SNs are installed. MySQL and
PostgreSQL are available with the Linux OS.

Follow the instructions in one of these documents for setting up the
Management Node to use the selected database:

MySQL or MariaDB
----------------

* Follow this documentation and be sure to use the xCAT provided ``mysqlsetup``
  command to set up the database for xCAT:

  - :doc:`/guides/admin-guides/large_clusters/databases/mysql_install`

PostgreSQL
----------

* Follow this documentation and be sure to use the xCAT provided ``pgsqlsetup``
  command to set up the database for xCAT:

  - :doc:`/guides/admin-guides/large_clusters/databases/postgres_install`
@ -1,6 +0,0 @@

Setup Service Node
==================

* Follow this documentation to :ref:`setup_service_node_stateful_label`.
* Follow this documentation to :ref:`setup_service_node_stateless_label`.
@ -20,7 +20,7 @@ Traditional cluster with OS on each node's local disk.

Stateless (diskless)
--------------------

Nodes boot from a RAMdisk OS image downloaded from the xCAT mgmt node or service node at boot time.

@ -1,31 +1,20 @@

Admin Guide
===========

The admin guide is intended to help with learning how to manage a cluster using xCAT with the following major sections:

* **Basic Concepts** Introduces some of the basic concepts in xCAT.

* **Manage Cluster** Describes managing clusters under xCAT. The management procedures are organized based on the hardware type since management may vary depending on the hardware architecture.

* **Reference** xCAT reference sections.

.. toctree::
   :maxdepth: 2

   basic_concepts/index.rst
   manage_clusters/index.rst
   references/index.rst

@ -1,22 +0,0 @@

Large Cluster Support
=====================

xCAT supports management of very large clusters through the use of **xCAT Hierarchy** or **xCAT Service Nodes**.

When dealing with large clusters, to balance the load, it is recommended to have more than one node (Management Node, "MN") handling the installation and management of the compute nodes. These additional *helper* nodes are referred to as **xCAT Service Nodes** ("SN"). The Management Node can delegate all management operational needs to the Service Node responsible for a set of compute nodes.

The following configurations are supported:

* Each service node installs/manages a specific set of compute nodes
* Having a pool of service nodes, any of which can respond to an installation request from a compute node (*requires service nodes to be aligned with network broadcast domains; a compute node chooses its service node based on who responds to its DHCP request first*)
* A hybrid of the above, where each specific set of compute nodes has 2 or more service nodes in a pool

The following documentation assumes an xCAT cluster has already been configured and covers the additional steps needed to support xCAT Hierarchy via Service Nodes.

.. toctree::
   :maxdepth: 2

   service_nodes/service_nodes.rst
   databases/index.rst
   service_nodes/define_service_nodes.rst
   service_nodes/provision_service_nodes.rst
   tips.rst
@ -1,11 +0,0 @@

Provision Service Nodes
=======================

Diskful
-------

Diskless
--------

Verification
------------
@ -1,5 +0,0 @@

Tips/Tuning/Suggestions
=======================

TODO: Content from: https://sourceforge.net/p/xcat/wiki/Hints_and_Tips_for_Large_Scale_Clusters/
@ -45,26 +45,26 @@ Remote Console

Most enterprise level servers do not have video adapters installed with the machine, meaning the end user can not connect a monitor to the machine and get display output. In most cases, the console can be viewed using the serial port or LAN port, through Serial-over-LAN. A serial cable or network cable is used to get a command line interface of the machine. From there, the end user can get the basic machine booting information, firmware settings interface, local command line console, etc.

In order to get the command line console remotely, xCAT provides the ``rcons`` command.

#. Make sure the ``conserver`` is configured by running ``makeconservercf``.

#. Check if the ``conserver`` is up and running ::

      ps ax | grep conserver

#. If ``conserver`` is not running, start ::

      [sysvinit] service conserver start
      [systemd] systemctl start conserver.service

   or restart, if changes to the configuration were made ::

      [sysvinit] service conserver restart
      [systemd] systemctl restart conserver.service

#. After that, you can get the command line console for a specific machine with the ``rcons`` command ::

      rcons cn1

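To end an ``rcons`` session, use the conserver escape sequence (by default ``Ctrl+e c .``).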
@ -1,18 +1,10 @@

References
==========

xCAT Man Pages
--------------

These man pages are auto generated from .pod files to .rst files using the ``create_man_pages.py`` script under `xcat-core <https://github.com/xcat2/xcat-core>`_.

.. toctree::
   :maxdepth: 1

@ -28,11 +20,10 @@ xCAT Man Pages

xCAT Tools
----------

*Disclaimer:* **Use at your own risk**

The following tools are shipped with xCAT and have been contributed by various xCAT community users. The tools are located under ``/opt/xcat/share/xcat/tools/``.

If you encounter any problems with the tools, post a message to the xCAT mailing list for help.

.. toctree::
   :maxdepth: 1
@ -47,3 +38,4 @@ If you encounter any problems with the tools, post a message to the xCAT mailing

   tools/test_hca_state.rst