mirror of
https://github.com/xcat2/xcat-core.git
synced 2025-06-14 18:30:23 +00:00
Clean up some formatting and add links for the missing "todo"
@ -402,7 +402,7 @@ Configure DRBD

    [>....................] sync'ed: 0.5% (101932/102400)M
    finish: 2:29:06 speed: 11,644 (11,444) K/sec

If a direct, back-to-back Gigabit Ethernet connection is set up between the two management nodes and you are unhappy with the synchronization speed, it is possible to speed up the initial synchronization through some tunable parameters in DRBD. This setting is not permanent, and will not be retained after boot. For details, see http://www.drbd.org/users-guide-emb/s-configure-sync-rate.html. ::

    drbdadm disk-options --resync-rate=110M xCAT
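While the initial synchronization runs, its progress can be followed in ``/proc/drbd``. The helper below is only an illustrative sketch of our own (not an xCAT or DRBD tool); it assumes status text of the ``sync'ed: 0.5%`` form shown above.

```shell
# Sketch: extract the resync percentage from DRBD status text, e.g.
#   [>....................] sync'ed: 0.5% (101932/102400)M
drbd_sync_percent() {
    # reads status text on stdin and prints the percentage, e.g. "0.5"
    awk -F"sync'ed: " '/sync.ed:/ { split($2, a, "%"); gsub(/ /, "", a[1]); print a[1]; exit }'
}

# Example: check progress every 10 seconds until the resync finishes:
#   watch -n 10 cat /proc/drbd
```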
@ -1168,13 +1168,13 @@ Trouble shooting and debug tips

* **x3550m4n02** ::

    drbdadm disconnect xCAT
    drbdadm secondary xCAT
    drbdadm connect --discard-my-data xCAT

* **x3550m4n01** ::

    drbdadm connect xCAT

Disable HA MN
=============
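Because running the victim-node commands out of order can make the split-brain recovery worse, the sequence above can be wrapped in a small script. This is only a sketch of our own (the function name and the dry-run convention are not part of xCAT or DRBD):

```shell
# Sketch: run the split-brain victim's recovery commands in order.
# Pass "echo" as the first argument to dry-run; no argument runs drbdadm.
recover_split_brain_victim() {
    runner=${1:-}          # e.g. "echo" prints the commands instead of running them
    $runner drbdadm disconnect xCAT &&
    $runner drbdadm secondary xCAT &&
    $runner drbdadm connect --discard-my-data xCAT
}
```

On the surviving node you would still run ``drbdadm connect xCAT`` afterwards, as shown above.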
@ -2,33 +2,25 @@ Appendix A: Setup backup Service Nodes
======================================

For reliability, availability, and serviceability purposes you may wish to
designate backup Service Nodes in your hierarchical cluster. The backup
Service Node will be another active Service Node that is set up to easily
take over from the original Service Node if a problem occurs. This is not an
automatic failover feature. You will have to initiate the switch from the
primary Service Node to the backup manually. The xCAT support will handle most
of the setup and transfer of the nodes to the new Service Node. This
procedure can also be used to simply switch some compute nodes to a new
Service Node, for example, for planned maintenance.
Initial deployment
------------------

Integrate the following steps into the hierarchical deployment process described above.

#. Make sure both the primary and backup service nodes are installed,
   configured, and can access the MN database.
#. When defining the CNs add the necessary service node values to the
   "servicenode" and "xcatmaster" attributes of the :doc:`node </guides/admin-guides/references/man7/node.7>` definitions.
#. (Optional) Create an xCAT group for the nodes that are assigned to each SN.
   This will be useful when setting node attributes as well as providing an
   easy way to switch a set of nodes back to their original server.
@ -51,10 +43,8 @@ attributes you would run a command similar to the following. ::

    chdef <noderange> servicenode="xcatsn1a,xcatsn2a" xcatmaster="xcatsn1b"

The process can be simplified by creating xCAT node groups to use as the <noderange> in the :doc:`chdef </guides/admin-guides/references/man1/chdef.1>` command. To create an xCAT node group containing all the nodes that belong to service node "SN27", run a command similar to the following: ::

    mkdef -t group sn1group members=node[01-20]
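For reference, a bracketed noderange such as ``node[01-20]`` simply names the hosts ``node01`` through ``node20``. The quick shell sketch below (a helper of our own, not an xCAT command) shows what that range refers to:

```shell
# Sketch: expand a prefix plus a zero-padded numeric range,
# mimicking what the noderange node[01-20] refers to.
expand_noderange() {
    prefix=$1; start=$2; end=$3
    # seq -w zero-pads all numbers to the same width
    seq -w "$start" "$end" | while read -r n; do
        printf '%s%s\n' "$prefix" "$n"
    done
}
```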
@ -64,10 +54,7 @@ the 1st SN as their primary SN, and the other half of CNs to use the 2nd SN
as their primary SN. Then each SN would be configured to be the backup SN
for the other half of CNs.**

When you run the :doc:`makedhcp </guides/admin-guides/references/man8/makedhcp.8>` command, it will configure dhcp and tftp on both the primary and backup SNs, assuming they both have network access to the CNs. This will make it possible to do a quick SN takeover without having to wait for replication when you need to switch.
xdcp Behaviour with backup servicenodes
---------------------------------------

@ -83,7 +70,7 @@ rhsn. lsdef cn4 | grep servicenode. ::

    servicenode=service1,rhsn

If a service node is offline (e.g. service1), then you will see errors on
your xdcp command, and yet if rhsn is online then the xdcp will actually
work. This may be a little confusing. For example, here service1 is offline,
but we are able to use rhsn to complete the xdcp. ::
@ -108,49 +95,39 @@ procedure to move its CNs over to the backup SN.

Move the nodes to the new service nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use the :doc:`snmove </guides/admin-guides/references/man1/snmove.1>` command to make the database changes necessary to move a set of compute nodes from one Service Node to another.

To switch all the compute nodes from Service Node ``sn1`` to the backup Service Node ``sn2``, run: ::

    snmove -s sn1
Modified database attributes
""""""""""""""""""""""""""""

The ``snmove`` command will check and set several node attribute values.

* **servicenode**: This will be set to either the second server name in the servicenode attribute list or the value provided on the command line.

* **xcatmaster**: Set with either the value provided on the command line or it will be automatically determined from the servicenode attribute.

* **nfsserver**: If the value is set with the source service node then it will be set to the destination service node.

* **tftpserver**: If the value is set with the source service node then it will be reset to the destination service node.

* **monserver**: If set to the source service node then reset it to the destination servicenode and xcatmaster values.

* **conserver**: If set to the source service node then reset it to the destination servicenode and run ``makeconservercf``.
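After a move you can confirm the updated values with ``lsdef``. The small filter below is a sketch of our own; it assumes ``lsdef``'s usual ``attr=value`` output lines:

```shell
# Sketch: pull one attribute value out of lsdef-style "attr=value" output.
node_attr() {
    # usage: lsdef cn1 -i servicenode,xcatmaster | node_attr servicenode
    awk -F= -v attr="$1" '$1 ~ attr { print $2; exit }'
}
```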
Run postscripts on the nodes
""""""""""""""""""""""""""""

If the CNs are up at the time the ``snmove`` command is run then ``snmove`` will run postscripts on the CNs to reconfigure them for the new SN. The ``syslog`` postscript is always run. The ``mkresolvconf`` and ``setupntp`` scripts will be run if they were included in the node's postscript list.

You can also specify an additional list of postscripts to run.
Modify system configuration on the nodes
""""""""""""""""""""""""""""""""""""""""

If the CNs are up the ``snmove`` command will also perform some configuration on the nodes such as setting the default gateway and modifying some configuration files used by xCAT.
Switching back
--------------

@ -161,7 +138,7 @@ need to set it up as an SN again and make sure the CN images are replicated
to it. Once you've done this, or if the SN's configuration was not lost,
then follow these steps to move the CNs back to their original SN:

* Use ``snmove``: ::

    snmove sn1group -d sn1
@ -1,14 +1,10 @@
Appendix C: Migrating a Management Node to a Service Node
=========================================================

Directly converting an existing Management Node to a Service Node may have some issues and is not recommended. Follow these steps to convert an xCAT Management Node into a Service Node:

#. Backup your xCAT database on the Management Node
#. Install a new xCAT Management Node
#. Restore your xCAT database into the new Management Node
#. Re-provision the old xCAT Management Node as a new Service Node
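Steps 1 and 3 can be carried out with xCAT's ``dumpxCATdb`` and ``restorexCATdb`` tools. The wrapper functions and the backup path below are our own illustrative choices, not part of xCAT:

```shell
# Sketch: dump the xCAT database to a directory on the old MN (step 1),
# then restore it on the new MN (step 3).
backup_xcatdb() {
    dir=${1:?usage: backup_xcatdb <backup-dir>}
    mkdir -p "$dir"
    dumpxCATdb -p "$dir"     # dumps the tables into <backup-dir>
}

restore_xcatdb() {
    dir=${1:?usage: restore_xcatdb <backup-dir>}
    restorexCATdb -p "$dir"  # reloads the tables on the new Management Node
}

# e.g. run backup_xcatdb /install/dbbackup on the old MN, copy the
# directory to the new MN, then run restore_xcatdb /install/dbbackup there.
```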
@ -1,9 +1,9 @@
Hierarchical Clusters / Large Cluster Support
=============================================

xCAT supports management of very large clusters by creating a **Hierarchical Cluster** using the concept of **xCAT Service Nodes**.

When dealing with large clusters, to balance the load, it is recommended to have more than one node (Management Node, "MN") handling the installation and management of the Compute Nodes ("CN"). These additional *helper* nodes are referred to as **Service Nodes** ("SN"). The Management Node can delegate all management operational needs to the Service Node responsible for a set of compute nodes.

The following configurations are supported:

* Each service node installs/manages a specific set of compute nodes
@ -20,7 +20,7 @@ Traditional cluster with OS on each node's local disk.

Stateless (diskless)
--------------------

Nodes boot from a RAMdisk OS image downloaded from the xCAT mgmt node or service node at boot time.
@ -45,26 +45,26 @@ Remote Console

Most enterprise level servers do not have video adapters installed with the machine, meaning the end user cannot connect a monitor to the machine and get display output. In most cases, the console can be viewed using the serial port or LAN port, through Serial-over-LAN. A serial cable or network cable is used to get a command line interface of the machine. From there, the end user can get the basic machine booting information, firmware settings interface, local command line console, etc.

In order to get the command line console remotely, xCAT provides the ``rcons`` command.

#. Make sure the ``conserver`` is configured by running ``makeconservercf``.

#. Check if the ``conserver`` is up and running ::

      ps ax | grep conserver

#. If ``conserver`` is not running, start it ::

      [sysvinit] service conserver start
      [systemd] systemctl start conserver.service

   or restart, if changes to the configuration were made ::

      [sysvinit] service conserver restart
      [systemd] systemctl restart conserver.service

#. After that, you can get the command line console for a specific machine with the ``rcons`` command ::

      rcons cn1