
Merge pull request #530 from whowutwut/hierarchy_master

Merge the pages for Large Cluster and Hierarchical Clusters (master)
This commit is contained in:
cxhong 2015-12-11 22:16:44 -05:00
commit 13beea7af1
34 changed files with 152 additions and 307 deletions

View File

@ -402,7 +402,7 @@ Configure DRBD
[>....................] sync'ed: 0.5% (101932/102400)M
finish: 2:29:06 speed: 11,644 (11,444) K/sec
If a direct, back-to-back Gigabyte Ethernet connection is setup between the two management nodes and you are unhappy with the syncronization speed, it is possible to speed up the initial synchronization through some tunable parameters in DRBD. This setting is not permanent, and will not be retained after boot. For details, see `http://www.drbd.org/users-guide-emb/s-configure-sync-rate.html`_::
If a direct, back-to-back Gigabit Ethernet connection is set up between the two management nodes and you are unhappy with the synchronization speed, it is possible to speed up the initial synchronization through some tunable parameters in DRBD. This setting is not permanent, and will not be retained after boot. For details, see http://www.drbd.org/users-guide-emb/s-configure-sync-rate.html. ::
drbdadm disk-options --resync-rate=110M xCAT
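Once the initial synchronization has finished, the temporary rate can be dropped and the values from the configuration file re-applied. A minimal sketch, assuming the resource is named ``xCAT`` as above: ::
drbdadm adjust xCAT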
@ -1168,13 +1168,13 @@ Trouble shooting and debug tips
* **x3550m4n02** ::
drbdadm disconnect xCAT
drbdadm secondary xCAT
drbdadm connect --discard-my-data xCAT
drbdadm disconnect xCAT
drbdadm secondary xCAT
drbdadm connect --discard-my-data xCAT
* **x3550m4n01** ::
drbdadm connect xCAT
drbdadm connect xCAT
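To confirm that both nodes have reconnected and resynchronization is progressing, the connection state can be checked on either management node. A sketch (output varies with the DRBD version): ::
drbdadm cstate xCAT
cat /proc/drbd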
Disable HA MN
=============

View File

@ -2,33 +2,25 @@ Appendix A: Setup backup Service Nodes
======================================
For reliability, availability, and serviceability purposes you may wish to
designate backup service nodes in your hierarchical cluster. The backup
service node will be another active service node that is set up to easily
take over from the original service node if a problem occurs. This is not an
designate backup Service Nodes in your hierarchical cluster. The backup
Service Node will be another active Service Node that is set up to easily
take over from the original Service Node if a problem occurs. This is not an
automatic failover feature. You will have to initiate the switch from the
primary service node to the backup manually. The xCAT support will handle most
of the setup and transfer of the nodes to the new service node. This
primary Service Node to the backup manually. The xCAT support will handle most
of the setup and transfer of the nodes to the new Service Node. This
procedure can also be used to simply switch some compute nodes to a new
service node, for example, for planned maintenance.
Abbreviations used below:
* MN - management node.
* SN - service node.
* CN - compute node.
Service Node, for example, for planned maintenance.
Initial deployment
------------------
Integrate the following steps into the hierarchical deployment process
described above.
Integrate the following steps into the hierarchical deployment process described above.
#. Make sure both the primary and backup service nodes are installed,
configured, and can access the MN database.
#. When defining the CNs add the necessary service node values to the
"servicenode" and "xcatmaster" attributes of the `node definitions
<http://localhost/fake_todo>`_.
"servicenode" and "xcatmaster" attributes of the :doc:`node </guides/admin-guides/references/man7/node.7>` definitions.
#. (Optional) Create an xCAT group for the nodes that are assigned to each SN.
This will be useful when setting node attributes as well as providing an
easy way to switch a set of nodes back to their original server.
@ -51,10 +43,8 @@ attributes you would run a command similar to the following. ::
chdef <noderange> servicenode="xcatsn1a,xcatsn2a" xcatmaster="xcatsn1b"
The process can be simplified by creating xCAT node groups to use as the
<noderange> in the `chdef <http://localhost/fake_todo>`_ command. To create an
xCAT node group containing all the nodes that have the service node "SN27"
you could run a command similar to the following. ::
The process can be simplified by creating an xCAT node group to use as the <noderange> in the :doc:`chdef </guides/admin-guides/references/man1/chdef.1>` command. For example, to create an
xCAT node group containing all the nodes that belong to service node "SN27": ::
mkdef -t group sn1group members=node[01-20]
@ -64,10 +54,7 @@ the 1st SN as their primary SN, and the other half of CNs to use the 2nd SN
as their primary SN. Then each SN would be configured to be the backup SN
for the other half of CNs.**
When you run `makedhcp <http://localhost/fake_todo>`_, it will configure dhcp
and tftp on both the primary and backup SNs, assuming they both have network
access to the CNs. This will make it possible to do a quick SN takeover
without having to wait for replication when you need to switch.
When you run :doc:`makedhcp </guides/admin-guides/references/man8/makedhcp.8>` command, it will configure dhcp and tftp on both the primary and backup SNs, assuming they both have network access to the CNs. This will make it possible to do a quick SN takeover without having to wait for replication when you need to switch.
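For example, after the servicenode and xcatmaster attributes are set, the DHCP configuration is typically regenerated with something like the following (a sketch; ``-n`` rewrites the DHCP configuration and ``-a`` adds the defined nodes — check the makedhcp man page for the options appropriate to your networks): ::
makedhcp -n
makedhcp -a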
xdcp Behaviour with backup servicenodes
---------------------------------------
@ -83,7 +70,7 @@ rhsn. lsdef cn4 | grep servicenode. ::
servicenode=service1,rhsn
If a service node is offline ( e.g. service1), then you will see errors on
If a service node is offline (e.g. service1), then you will see errors on
your xdcp command, and yet if rhsn is online then the xdcp will actually
work. This may be a little confusing. For example, here service1 is offline,
but we are able to use rhsn to complete the xdcp. ::
@ -108,49 +95,39 @@ procedure to move its CNs over to the backup SN.
Move the nodes to the new service nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Use the xCAT `snmove <http://localhost/fake_todo>`_ to make the database
updates necessary to move a set of nodes from one service node to another, and
to make configuration modifications to the nodes.
Use the :doc:`snmove </guides/admin-guides/references/man1/snmove.1>` command to make the database changes necessary to move a set of compute nodes from one Service Node to another.
For example, if you want to switch all the compute nodes that use service
node "sn1" to the backup SN (sn2), run: ::
To switch all the compute nodes from Service Node ``sn1`` to the backup Service Node ``sn2``, run: ::
snmove -s sn1
snmove -s sn1
Modified database attributes
""""""""""""""""""""""""""""
The **snmove** command will check and set several node attribute values.
The ``snmove`` command will check and set several node attribute values.
**servicenode**: : This will be set to either the second server name in the
servicenode attribute list or the value provided on the command line.
**xcatmaster**: : Set with either the value provided on the command line or it
will be automatically determined from the servicenode attribute.
**nfsserver**: : If the value is set with the source service node then it will
be set to the destination service node.
**tftpserver**: : If the value is set with the source service node then it will
be reset to the destination service node.
**monserver**: : If set to the source service node then reset it to the
destination servicenode and xcatmaster values.
**conserver**: : If set to the source service node then reset it to the
destination servicenode and run **makeconservercf**
* **servicenode**: This will be set to either the second server name in the servicenode attribute list or the value provided on the command line.
* **xcatmaster**: Set with either the value provided on the command line or it will be automatically determined from the servicenode attribute.
* **nfsserver**: If the value is set to the source service node then it will be set to the destination service node.
* **tftpserver**: If the value is set to the source service node then it will be reset to the destination service node.
* **monserver**: If set to the source service node then reset it to the destination servicenode and xcatmaster values.
* **conserver**: If set to the source service node then reset it to the destination servicenode and run ``makeconservercf``
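After running ``snmove`` you can verify the updated values by listing these attributes on one of the moved nodes. A sketch, using a hypothetical node name ``cn1``: ::
lsdef cn1 -i servicenode,xcatmaster,nfsserver,tftpserver,monserver,conserver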
Run postscripts on the nodes
""""""""""""""""""""""""""""
If the CNs are up at the time the snmove command is run then snmove will run
postscripts on the CNs to reconfigure them for the new SN. The "syslog"
postscript is always run. The "mkresolvconf" and "setupntp" scripts will be
run IF they were included in the nodes postscript list.
If the CNs are up at the time the ``snmove`` command is run then ``snmove`` will run postscripts on the CNs to reconfigure them for the new SN. The ``syslog`` postscript is always run. The ``mkresolvconf`` and ``setupntp`` scripts will be run if they were included in the node's postscript list.
You can also specify an additional list of postscripts to run.
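For example, a sketch that also runs two extra postscripts during the move (``script1`` and ``script2`` are placeholders; check the snmove man page for the exact ``-P|--postscripts`` syntax in your xCAT version): ::
snmove -s sn1 -P script1,script2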
Modify system configuration on the nodes
""""""""""""""""""""""""""""""""""""""""
If the CNs are up the snmove command will also perform some configuration on
the nodes such as setting the default gateway and modifying some
configuration files used by xCAT.
If the CNs are up, the ``snmove`` command will also perform some configuration on the nodes, such as setting the default gateway and modifying some configuration files used by xCAT.
Switching back
--------------
@ -161,7 +138,7 @@ need to set it up as an SN again and make sure the CN images are replicated
to it. Once you've done this, or if the SN's configuration was not lost,
then follow these steps to move the CNs back to their original SN:
* Use snmove: ::
* Use ``snmove``: ::
snmove sn1group -d sn1
snmove sn1group -d sn1

View File

@ -0,0 +1,10 @@
Appendix C: Migrating a Management Node to a Service Node
=========================================================
Directly converting an existing Management Node to a Service Node may have some issues and is not recommended. Do the following steps to convert an xCAT Management Node into a Service Node (a backup/restore sketch follows the list):
#. Back up your xCAT database on the Management Node
#. Install a new xCAT Management Node
#. Restore your xCAT database onto the new Management Node
#. Re-provision the old xCAT Management Node as a new Service Node
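A minimal backup/restore sketch, assuming ``/backup/xcatdb`` is used as the backup directory: ::
# on the old Management Node
dumpxCATdb -p /backup/xcatdb
# on the newly installed Management Node, after copying /backup/xcatdb over
restorexCATdb -p /backup/xcatdb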

View File

@ -0,0 +1,10 @@
Appendix
========
.. toctree::
:maxdepth: 2
appendix_a_setup_backup_service_nodes.rst
appendix_b_diagnostics.rst
appendix_c_migrating_mn_to_sn.rst
appendix_d_set_up_hierarchical_conserver.rst

View File

@ -1,14 +0,0 @@
Appendix C: Migrating a Management Node to a Service Node
=========================================================
If you find you want to convert an existing Management Node to a Service
Node you need to work with the xCAT team. It is recommended for now, to
backup your database, setup your new Management Server, and restore your
database into it. Take the old Management Node and remove xCAT and all xCAT
directories, and your database. See ``Uninstalling_xCAT
<http://localhost/fake_todo>`_ and then follow the process for setting up a
SN as if it is a new node.

View File

@ -3,7 +3,7 @@ Define Service Nodes
This next part shows how to configure a xCAT Hierarchy and provision xCAT service nodes from an existing xCAT cluster.
*The document assumes that the compute nodes part of your cluster have already been defined into the xCAT database and you have successfully provisioned the compute nodes using xCAT*
*The document assumes that the compute nodes that are part of your cluster have already been defined into the xCAT database and you have successfully provisioned the compute nodes using xCAT*
The following table illustrates the cluster being used in this example:
@ -22,7 +22,7 @@ The following table illustrates the cluster being used in this example:
| | r1n10 |
+----------------------+----------------------+
| Compute Nodes | r2n01 |
| (group=rack1) | r2n02 |
| (group=rack2) | r2n02 |
| | r2n03 |
| | ... |
| | r2n10 |
@ -30,23 +30,27 @@ The following table illustrates the cluster being used in this example:
#. Select the compute nodes that will become service nodes
The first node in each rack, ``r1n01 and r2n01``, is selected to become the xCAT service nodes and manage the compute nodes in that rack
The first node in each rack, ``r1n01`` and ``r2n01``, is selected to become an xCAT service node and manage the compute nodes in its rack.
#. Change the attributes for the compute node to make them part of the **service** group: ::
chdef -t node -o r1n01,r2n01 groups=service,all
chdef -t node -o r1n01,r2n01 -p groups=service
#. When ``copycds`` was run against the ISO image, several osimages are created into the ``osimage`` table. The ones containing "service" are provided to help easily provision xCAT service nodes. ::
#. When ``copycds`` was run against the ISO image, several osimages are created in the ``osimage`` table. The ones named ``*-service`` are provided to help easily provision xCAT service nodes. ::
# lsdef -t osimage | grep rhels7.1
rhels7.1-ppc64le-install-compute (osimage)
rhels7.1-ppc64le-install-service (osimage) <======
rhels7.1-ppc64le-netboot-compute (osimage)
#. Add the service nodes to the ``servicenode`` table: ::
#. Add some common service node attributes to the ``service`` nodegroup: ::
chdef -t group -o service setupnfs=1 setupdhcp=1 setuptftp=1 setupnameserver=1 setupconserver=1
chdef -t group -o service setupnfs=1 \
setupdhcp=1 \
setuptftp=1 \
setupnameserver=1 \
setupconserver=1
**Tips/Hint**
* Even if you do not want xCAT to configure any services, you must define the service nodes in the ``servicenode`` table with at least one attribute set to 0, otherwise xCAT will not recognize the node as a service node.
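For example, a minimal sketch that registers the Service Nodes in the ``servicenode`` table without enabling any services: ::
chdef -t group -o service setupnfs=0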

View File

@ -1,18 +1,24 @@
Hierarchical Clusters
=====================
Hierarchical Clusters / Large Cluster Support
=============================================
xCAT supports management of very large clusters by creating a **Hierarchical Cluster** using **xCAT Service Nodes**.
When dealing with large clusters, to balance the load, it is recommended to have more than one node (in addition to the Management Node, "MN") handling the installation and management of the Compute Nodes ("CN"). These additional *helper* nodes are referred to as **Service Nodes** ("SN"). The Management Node can delegate all management operational needs to the Service Node responsible for a set of compute nodes.
The following configurations are supported:
* Each service node installs/manages a specific set of compute nodes
* Having a pool of service nodes, any of which can respond to an installation request from a compute node (*requires the service nodes to be aligned with network broadcast domains; a compute node chooses its service node based on which one responds to its DHCP request first*)
* A hybrid of the above, where each specific set of compute nodes has 2 or more service nodes in a pool, as illustrated below
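For example, the hybrid configuration is usually expressed by listing more than one Service Node per compute node. A sketch with hypothetical names ``cn1``, ``sn1`` and ``sn2``: ::
chdef cn1 servicenode=sn1,sn2 xcatmaster=sn1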
The following documentation assumes an xCAT cluster has already been configured and covers the additional steps needed to support xCAT Hierarchy via Service Nodes.
.. toctree::
:maxdepth: 2
introduction.rst
setup_mn_hierarchical_database.rst
define_service_node.rst
service_nodes.rst
databases/index.rst
define_service_nodes.rst
configure_dhcp.rst
setup_service_node.rst
service_node_for_diskful.rst
service_node_for_diskless.rst
test_service_node_installation.rst
appendix_a_setup_backup_service_nodes.rst
appendix_b_diagnostics.rst
appendix_c_migrating_mn_to_sn.rst
appendix_d_set_up_hierarchical_conserver.rst
provision/index.rst
appendix/index.rst

View File

@ -1,48 +0,0 @@
Introduction
============
When dealing with large clusters, it is desirable to have more than one node,
the xCAT Management Node (MN), handle the installation and management of the
Compute Nodes (CN). The concept of these additional "helper" nodes is called
**Service Nodes (SN)**. The Management Node can delegate all management
operations required by a compute node to the service node that is assigned to
manage that Compute Node. You can configure one or more Service Nodes to install
and manage a group of Compute Nodes.
Service Nodes
-------------
With xCAT, you have the choice of either having each Service Node
install/manage a specific set of compute nodes, or having a pool of Service
Nodes, any of which can respond to an installation request from a compute
node. (Service Node pools must be aligned with the network broadcast domains,
because the way a compute node choose its Service Node for that boot is by whoever
responds to the DHCP request broadcast first.) You can also have a hybrid of
the 2 approaches, in which for each specific set of compute nodes you have 2
or more Service Nodes in a pool.
Each Service Node runs an instance of xcatd, just like the Management Node does.
The ``xcatd`` daemons communicate with each other using the same XML/SSH protocol
that the xCAT clients use to communicate with ``xcatd`` on the Management Node.
Daemon-based Databases
----------------------
The Service Nodes will need to communicate with the xCAT database on the Management
Node and do this by using the remote client capabilities of the database. Therefore,
the Management Node must be running one of the daemon-based databases supported by
xCAT (PostgreSQL, MySQL, MariaDB, etc).
The default SQLite database does not support remote clients and cannot be used
in hierarchical clusters. This document includes instructions for migrating
your cluster from SQLite to one of the other databases. Since the initial
install of xCAT will always set up SQLite, you must migrate to a database that
supports remote clients before installing your Service Nodes.
Setup
-----
xCAT will help you install your Service Nodes as well as install on the xCAT-SN
software and other required rpms and pre-reqs. Service Nodes require the same
software as installed on the Management Node with the exception of the top level
xCAT rpm. The Management Node installs the ``xCAT`` package while the Service Nodes
install the ``xCATsn`` package.

View File

@ -1,11 +1,9 @@
.. _setup_service_node_stateful_label:
Diskful (Stateful) Installation
===============================
Set Up the Service Nodes for Stateful (Diskful) Installation
============================================================
Any cluster using statelite compute nodes must use stateful (diskful) Service Nodes.
Any cluster using statelite compute nodes must use stateful (diskful) service nodes.
**Note: All xCAT service nodes must be at the exact same xCAT version as the xCAT Management Node**. Copy the files to the Management Node (MN) and untar them in the appropriate sub-directory of ``/install/post/otherpkgs``
**Note: All xCAT Service Nodes must be at the exact same xCAT version as the xCAT Management Node**. Copy the files to the Management Node (MN) and untar them in the appropriate sub-directory of ``/install/post/otherpkgs``
**Note: For the appropriate directory below, check the ``otherpkgdir=/install/post/otherpkgs/rhels7/x86_64`` attribute of the osimage defined for the service node.**
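A sketch of placing the xCAT tarballs into that directory (adjust the path to match your ``otherpkgdir`` and the tarball names to match the files you downloaded): ::
mkdir -p /install/post/otherpkgs/rhels7/x86_64/xcat
cd /install/post/otherpkgs/rhels7/x86_64/xcat
tar jxvf xcat-core-*.tar.bz2
tar jxvf xcat-dep-*.tar.bz2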
@ -73,8 +71,7 @@ Update the rhels6 RPM repository (rhels6 only)
Set the node status to ready for installation
---------------------------------------------
Run nodeset to the osimage name defined in the provmethod attribute on your
service node. ::
Run ``nodeset`` using the osimage name defined in the ``provmethod`` attribute of your Service Node. ::
nodeset service osimage="<osimagename>"
@ -95,12 +92,12 @@ Monitor the Installation
Watch the installation progress using either wcons or rcons: ::
wcons service # make sure DISPLAY is set to your X server/VNC or
rcons <one-node-at-a-time>
tail -f /var/log/messages
wcons service # make sure DISPLAY is set to your X server/VNC or
rcons <node_name>
tail -f /var/log/messages
Note: We have experienced one problem while trying to install RHEL6 diskful
service node working with SAS disks. The service node cannot reboots from SAS
Service Node working with SAS disks. The Service Node cannot reboot from the SAS
disk after the RHEL6 operating system has been installed. We are waiting for
the build with fixes from the RHEL6 team. If you meet this problem, you need to
manually select the SAS disk to be the first boot device and boot from the
@ -109,7 +106,7 @@ SAS disk.
Update Service Node Diskfull Image
----------------------------------
If you need to update the service nodes later on with a new version of xCAT
and its dependencies, obtain the new xCAT and xCAT dependencies rpms.
(Follow the same steps that were followed in
:ref:`setup_service_node_stateful_label`.
To update the xCAT software on the Service Node:
#. Obtain the new xcat-core and xcat-dep RPMS
#.

View File

@ -1,39 +1,30 @@
.. _setup_service_node_stateless_label:
Diskless (Stateless) Installation
=================================
Setup the Service Node for Stateless Deployment (optional)
==========================================================
**Note: The stateless Service Node is not supported in Ubuntu hierarchical clusters. For Ubuntu, please skip this section.**
**Note: The stateless service node is not supported in ubuntu hierarchy
cluster. For ubuntu, please skip this section.**
If you want, your service nodes can be stateless (diskless). The service node
If you want, your Service Nodes can be stateless (diskless). The Service Node
must contain not only the OS, but also the xCAT software and its dependencies.
In addition, a number of files are added to the service node to support the
PostgreSQL, or MySQL database access from the service node to the Management
node, and ssh access to the nodes that the service nodes services.
In addition, a number of files are added to the Service Node to support
PostgreSQL or MySQL database access from the Service Node to the Management
Node, and ssh access to the nodes that the Service Node services.
The following sections explain how to accomplish this.
Build the Service Node Diskless Image
--------------------------------------
-------------------------------------
This section assumes you can build the stateless image on the management node
because the service nodes are the same OS and architecture as the management
node. If this is not the case, you need to build the image on a machine that
matches the service node's OS architecture.
This section assumes you can build the stateless image on the management node because the Service Nodes are the same OS and architecture as the management node. If this is not the case, you need to build the image on a machine that matches the Service Node's OS architecture.
* Create an osimage definition. When you run copycds, xCAT will create a
service node osimage definitions for that distribution. For a stateless
service node, use the *-netboot-service definition.
* Create an osimage definition. When you run ``copycds``, xCAT will create Service Node osimage definitions for that distribution. For a stateless
Service Node, use the ``*-netboot-service`` definition. ::
::
lsdef -t osimage | grep -i service
# lsdef -t osimage | grep -i service
rhels6.4-ppc64-install-service (osimage)
rhels6.4-ppc64-netboot-service (osimage)
rhels6.4-ppc64-netboot-service (osimage) <================
rhels6.4-ppc64-statelite-service (osimage)
lsdef -t osimage -l rhels6.3-ppc64-netboot-service
# lsdef -t osimage -l rhels6.3-ppc64-netboot-service
Object name: rhels6.3-ppc64-netboot-service
exlist=/opt/xcat/share/xcat/netboot/rh/service.exlist
imagetype=linux
@ -50,14 +41,7 @@ matches the service node's OS architecture.
provmethod=netboot
rootimgdir=/install/netboot/rhels6.3/ppc64/service
* You can check the service node packaging to see if it has all the rpms you
require. We ship a basic requirements lists that will create a fully
functional service node. However, you may want to customize your service
node by adding additional operating system packages or modifying the files
excluded by the exclude list. View the files referenced by the osimage
pkglist, otherpkglist and exlist attributes:
::
* You can check the Service Node packaging to see if it has all the rpms you require. We ship a basic requirements list that will create a fully functional Service Node. However, you may want to customize your Service Node by adding additional operating system packages or modifying the files excluded by the exclude list. View the files referenced by the osimage pkglist, otherpkglist and exlist attributes: ::
cd /opt/xcat/share/xcat/netboot/rh/
view service.rhels6.ppc64.pkglist
@ -74,10 +58,10 @@ matches the service node's OS architecture.
xcat/xcat-core/xCATsn
This is required to install the xCAT service node function into your image.
This is required to install the xCAT Service Node function into your image.
You may also choose to create an appropriate /etc/fstab file in your
service node image. Copy the script referenced by the postinstall
Service Node image. Copy the script referenced by the postinstall
attribute to your directory and modify it as you would like:
::
@ -109,9 +93,6 @@ matches the service node's OS architecture.
images, creating custom files and new custom osimage definitions as you need
to.
For more information on the use and syntax of otherpkgs and pkglist files,
see `Update Service Node Stateless Image <http://localhost/fake_todo>`_
* Make your xCAT software available for otherpkgs processing
* If you downloaded xCAT to your management node for installation, place a
@ -161,7 +142,7 @@ matches the service node's OS architecture.
chroot /install/netboot/rhels6.3/ppc64/service/rootimg chkconfig dhcpd off
chroot /install/netboot/rhels6.3/ppc64/service/rootimg chkconfig dhcrelay off
* IF using NFS hybrid mode, export /install read-only in service node image:
* If using NFS hybrid mode, export /install read-only in the Service Node image:
::
@ -181,7 +162,7 @@ matches the service node's OS architecture.
nodeset service osimage=rhels6.3-ppc64-netboot-service
* To diskless boot the service nodes
* To diskless boot the Service Nodes
::
@ -194,7 +175,7 @@ To update the xCAT software in the image at a later time:
* Download the updated xcat-core and xcat-dep tarballs and place them in
your osimage's otherpkgdir xcat directory as you did above.
* Generate and repack the image and reboot your service node.
* Generate and repack the image and reboot your Service Node.
* Run image generation for your osimage definition.
::
@ -204,9 +185,9 @@ To update the xCAT software in the image at a later time:
nodeset service osimage=rhels6.3-ppc64-netboot-service
rnetboot service
Note: The service nodes are set up as NFS-root servers for the compute nodes.
Note: The Service Nodes are set up as NFS-root servers for the compute nodes.
Any time changes are made to any compute image on the mgmt node it will be
necessary to sync all changes to all service nodes. In our case the
necessary to sync all changes to all Service Nodes. In our case the
``/install`` directory is mounted on the servicenodes, so the update to the
compute node image is automatically available.
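If ``/install`` is not mounted on the Service Nodes, the changes can instead be pushed from the Management Node to the ``service`` node group. A sketch (check the prsync man page for the exact syntax and options): ::
prsync /install service:/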

View File

@ -0,0 +1,10 @@
Provision Service Nodes
=======================
.. toctree::
:maxdepth: 2
diskful_sn.rst
diskless_sn.rst
verify_sn.rst

View File

@ -1,5 +1,5 @@
Test Service Node installation
==============================
Verify Service Node Installation
================================
* ``ssh`` to the Service Nodes. You should not be prompted for a password.
* Check that the xCAT daemon ``xcatd`` is running.
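For example, a quick check from the Management Node (a sketch; on systemd-based distributions something like ``systemctl status xcatd`` would be used instead): ::
xdsh service 'service xcatd status'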

View File

@ -1,25 +0,0 @@
Setup the MN Hierarchical Database
==================================
Before setting up service nodes, you need to set up either MySQL, PostgreSQL,
as the xCAT Database on the Management Node. The database client on the
Service Nodes will be set up later when the SNs are installed. MySQL and
PostgreSQL are available with the Linux OS.
Follow the instructions in one of these documents for setting up the
Management node to use the selected database:
MySQL or MariaDB
----------------
* Follow this documentation and be sure to use the xCAT provided mysqlsetup
command to setup the database for xCAT:
- :doc:`/guides/admin-guides/large_clusters/databases/mysql_install`
PostgreSQL:
-----------
* Follow this documentation and be sure and use the xCAT provided pgsqlsetup
command to setup the database for xCAT:
- :doc:`/guides/admin-guides/large_clusters/databases/postgres_install`

View File

@ -1,6 +0,0 @@
Setup Service Node
==================
* Follow this documentation to :ref:`setup_service_node_stateful_label`.
* Follow this documentation to :ref:`setup_service_node_stateless_label`.

View File

@ -20,7 +20,7 @@ Traditional cluster with OS on each node's local disk.
Stateless (diskless)
-------------------
--------------------
Nodes boot from a RAMdisk OS image downloaded from the xCAT management node or service node at boot time.

View File

@ -1,31 +1,20 @@
Admin Guide
===========
When reading this chapter, assume you have read the :doc:`Overview of xCAT <../../overview/index>` to understand the architecture and features of xCAT, and have read the :doc:`xCAT Install Guide <../install-guides/index>` to have an xCAT Management Node installed.
The admin guide is intended to help with learning how to manage a cluster using xCAT with the following major sections:
Now you can start to learn how to manage a cluster by xCAT. This chapter includes following major sections:
* **Basic Concepts**
* **Basic Concepts** Introduces some of the basic concepts in xCAT.
This section will give you the introduction of some basic concepts in xCAT like the **Object Concept**, **Database**, **Global Configuration**, **Network** and **Node Type**.
* **Manage Cluster** Describes managing clusters under xCAT. The management procedures are organized based on the hardware type since management may vary depending on the hardware architecture.
* **Manage Cluster**
* **Reference** xCAT reference sections.
This is the a major part of xCAT doc. It describes the procedures of how to manage a real cluster. Since the management procedures are different among the hardware type, this section is organized base on the hardware type.
* **Reference**
This section includes the brief introduction of xCAT commands, the man page of each command and the definition of each xCAT Database table.
* **Large Cluster**
This section gives some advanced topics of how to manage a large cluster. **Large Cluster** means a cluster which has more than 500 compute nodes.
.. toctree::
:maxdepth: 2
basic_concepts/index.rst
large_clusters/index.rst
manage_clusters/index.rst
references/index.rst

View File

@ -1,22 +0,0 @@
Large Cluster Support
=====================
xCAT supports management of very large sized cluster through the use of **xCAT Hierarchy** or **xCAT Service Nodes**.
When dealing with large clusters, to balance the load, it is recommended to have more than one node (Management Node, "MN") handling the installation and management of the compute nodes. These additional *helper* nodes are referred to as **xCAT Service Nodes** ("SN"). The Management Node can delegate all management operational needs to the Service Node responsible for a set of compute node.
The following configurations are supported:
* Each service node installs/manages a specific set of compute nodes
* Having a pool of service nodes, any of which can respond to an installation request from a compute node (*Requires service nodes to be aligned with networks broadcast domains, compute node chooses service nodes based on who responds to DHCP request first.*)
* A hybrid of the above, where each specific set of compute nodes have 2 or more service nodes in a pool
The following documentation assumes an xCAT cluster has already been configured and covers the additional steps needed to support xCAT Hierarchy via Service Nodes.
.. toctree::
:maxdepth: 2
service_nodes/service_nodes.rst
databases/index.rst
service_nodes/define_service_nodes.rst
service_nodes/provision_service_nodes.rst
tips.rst

View File

@ -1,11 +0,0 @@
Provision Service Nodes
=======================
Diskful
-------
Diskless
--------
Verfication
-----------

View File

@ -1,5 +0,0 @@
Tips/Tuning/Suggestions
=======================
TODO: Content from: https://sourceforge.net/p/xcat/wiki/Hints_and_Tips_for_Large_Scale_Clusters/

View File

@ -45,26 +45,26 @@ Remote Console
Most enterprise-level servers do not have video adapters installed, meaning the end user cannot connect a monitor to the machine to get display output. In most cases, the console can be viewed through the serial port or the LAN port using Serial-over-LAN: a serial cable or network cable is used to get a command-line interface to the machine. From there, the end user can see basic machine boot information, the firmware settings interface, the local command-line console, etc.
In order to get the command line console remotely. xCAT provides the ``rcons`` command. ::
To get the command line console remotely, xCAT provides the ``rcons`` command.
#. Make sure the ``conserver`` is configured by running ``makeconservercf``.
First of all, make sure the ``conserver`` is configured, if not, configue it with ::
makeconservercf
Then check if the ``conserver`` is up and running ::
#. Check if the ``conserver`` is up and running ::
ps ax | grep conserver
If the conserver is not running, or you just updated its configuration file, restart the conserver with ::
#. If ``conserver`` is not running, start it ::
service conserver restart
[sysvinit] service conserver start
[systemd] systemctl start conserver.service
In case you have ``systemd`` instead of ``sysvinit``, use the command below instead ::
or restart it if changes to the configuration were made ::
systemctl restart conserver.service
[sysvinit] service conserver restart
[systemd] systemctl restart conserver.service
After that, you can get the command line console for a specific machine with the ``rcons`` command ::
#. After that, you can get the command line console for a specific machine with the ``rcons`` command ::
rcons cn1

View File

@ -1,18 +1,10 @@
References
==========
xCAT Commands
-------------
xCAT Database
-------------
xCAT Man Pages
--------------
*These man pages are auto generated from pod files to rst. *
*DO NOT modify directly from GitHub*
These man pages are auto-generated from .pod files to .rst files using the ``create_man_pages.py`` script under `xcat-core <https://github.com/xcat2/xcat-core>`_.
.. toctree::
:maxdepth: 1
@ -28,11 +20,10 @@ xCAT Man Pages
xCAT Tools
----------
**Use at your own risk**
*Disclaimer:* **Use at your own risk**
This is a list of additional tools that are shipped with xCAT. The tools are located in the ``/opt/xcat/share/xcat/tools/`` directory and it's recommended to add to your PATH. Many of these tools have been contributed by xCAT users that are not part of the core xCAT development team.
The following tools are shipped with xCAT and have been contributed by various xCAT community users. The tools are located under ``/opt/xcat/share/xcat/tools/``.
If you encounter any problems with the tools, post a message to the xCAT mailing list for help.
.. toctree::
:maxdepth: 1
@ -47,3 +38,4 @@ If you encounter any problems with the tools, post a message to the xCAT mailing
tools/test_hca_state.rst
If you encounter any problems with the tools, post a message to the xCAT mailing list for help.