mirror of https://github.com/xcat2/xcat-core.git synced 2025-10-26 17:05:33 +00:00

Merge pull request #226 from daniceexi/docrefine_discover

Doc refine: Hardware Discovery
zet809
2015-09-29 10:54:08 +08:00
5 changed files with 107 additions and 11 deletions


@@ -1,6 +1,26 @@
Admin Guide
===========
This chapter assumes that you have read the :doc:`Overview of xCAT <../../overview/index>` to understand the architecture and features of xCAT, and have followed the :doc:`xCAT Install Guide <../install-guides/index>` to install an xCAT Management Node.
Now you can start to learn how to manage a cluster with xCAT. This chapter includes the following major sections:
* **Basic Concepts**
This section introduces some basic xCAT concepts, such as the **Object Concept**, **Database**, **Global Configuration**, **Network** and **Node Type**.
* **Manage Cluster**
This is the major part of the xCAT documentation. It describes the procedures for managing a real cluster. Since the management procedures differ among hardware types, this section is organized by hardware type.
* **Reference**
This section includes a brief introduction to the xCAT commands, the man page of each command and the definition of each xCAT database table.
* **Large Cluster**
This section covers advanced topics on managing a large cluster. **Large Cluster** means a cluster that has more than 500 compute nodes.
.. toctree::
:maxdepth: 2


@@ -1,6 +1,15 @@
Manage Clusters
===============
This chapter introduces the procedures for managing a real cluster. It covers the following parts:
* Discover and Define Nodes
* Deploy/Configure OS for the Nodes
* Install/Configure Applications for the Nodes
* General System Management Work for the Nodes
Select the sub-chapter that matches the hardware type of your cluster. If you have a mixed cluster with multiple hardware types, refer to each of the relevant sub-chapters.
.. toctree::
:maxdepth: 2


@@ -1,7 +1,76 @@
Hardware Discovery & Define Node
================================
Hardware discovery is used to configure the FSP/BMC to get the hardware configuration information for the physical servers. The physical servers can be defined in the xCAT database manually, or through the hardware discovery process. The available options for hardware discovery are MTMS-based, switch-based and sequential-based.
Defining the servers as **Node Objects** in xCAT is the first step in managing a cluster.
The chapter :doc:`xCAT Object <../../../basic_concepts/xcat_object/index>` describes how to create a **Node Object** with the ``mkdef`` command. You can collect all the necessary information about the target servers and define them as **xCAT Node Objects** by running ``mkdef`` manually. This is feasible for a small cluster with fewer than 10 servers, but manually configuring the SP (such as the BMC) and collecting information for a large number of servers is error-prone and inefficient.
xCAT offers several **Automatic Hardware Discovery** methods to simplify SP configuration and server information collection. If your cluster has more than 10 servers, automatic discovery is worth trying. If it has more than 50 servers, automatic discovery is recommended.
The characteristics and applicability of each method are summarized below. Select one according to your cluster size and other considerations.
* **Manually Define Nodes**
Manually collect information for the target servers and define them as xCAT **Node Objects** with the ``mkdef`` command.
This method is recommended for a small cluster with fewer than 10 nodes.
* pros
No specific configuration or procedure is required, and it is very easy to use.
* cons
It takes additional time to configure the SP (management modules such as BMC and FSP) and to collect server information such as the MTMS (Machine Type and Machine Serial) and the host MAC address for OS deployment.
This method is inefficient and error-prone for a large number of servers.
* **MTMS-based Discovery**
**Step1**: **Automatically** search for all the servers and collect the server MTMS information.
**Step2**: Define each discovered server as a **Node Object** automatically. In this case, the node name is generated from the **MTMS** string. The admin can also rename the **Node Object** to a meaningful name such as **r1u1** (meaning the physical location is Rack1, Unit1) based on the **MTMS**.
**Step3**: Power on the nodes. The xCAT discovery engine updates additional information, such as the **MAC for deployment**, for the nodes.
This method is recommended for a medium-scale cluster with fewer than 100 nodes.
* pros
The benefits of automatic discovery are obtained with limited effort.
* cons
Compared to **Switch-based Discovery**, the admin needs to rename each automatically discovered node to give it a meaningful name. Renaming nodes to location-aware names is hard for a large number of servers.
* **Switch-based Discovery**
**Step1**: **Pre-define** the **Node Object** for all the nodes in the cluster. The **Pre-defined** node must have the **switch** and **switchport** attributes defined to specify which **Switch and Port** the server is connected to. xCAT uses this **Switch and Port** information to map a discovered node to the corresponding **Pre-defined** node.
**Step2**: Power on the nodes. The xCAT discovery engine discovers the node attributes and updates the corresponding **Pre-defined** node.
* pros
The whole discovery process is automatic.
Since the node is physically identified by the **Switch and Port** that the server is connected to, if a node fails and is replaced with a new one, xCAT automatically discovers the new one and assigns it the original node name, because the **Switch and Port** do not change.
* cons
You need to plan the **Switch and Port** mapping for each server and switch in the cluster. All the switches need to be configured so that they are accessible via SNMPv3 from the xCAT management node.
* **Sequential-based Discovery**
**Step1**: **Pre-define** the **Node Object** for all the nodes in the cluster.
**Step2**: Manually power on the nodes one by one. Each booted node is discovered and assigned to the next **Pre-defined** node in **Sequential** order.
* pros
No special configuration is required, unlike **Switch-based Discovery**. No manual node renaming step is required, unlike **MTMS-based Discovery**.
* cons
You have to boot the nodes strictly in order if you want each node to get the expected name. Generally, you have to wait for the discovery process to finish before powering on the next node.
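
The difference between **Switch-based Discovery** and **Sequential-based Discovery** comes down to how a newly booted server is matched to a **Pre-defined** node. The following Python sketch is a toy model of the two matching rules only; it is not xCAT code, and the ``Node`` class, the function names, the switch name and the MAC addresses are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Toy stand-in for a pre-defined node object (hypothetical, not an xCAT type)."""
    name: str
    switch: Optional[str] = None      # only needed for switch-based matching
    switchport: Optional[str] = None
    mac: Optional[str] = None         # filled in once the node is discovered


def match_by_switch_port(nodes: List[Node], switch: str, port: str) -> Optional[Node]:
    """Switch-based rule: the (switch, port) pair identifies the pre-defined node."""
    for node in nodes:
        if (node.switch, node.switchport) == (switch, port):
            return node
    return None


def match_sequentially(nodes: List[Node]) -> Optional[Node]:
    """Sequential rule: take the first pre-defined node that has no MAC yet."""
    for node in nodes:
        if node.mac is None:
            return node
    return None


# Hypothetical pre-defined nodes; every value here is illustrative.
nodes = [
    Node("r1u1", switch="switch1", switchport="1"),
    Node("r1u2", switch="switch1", switchport="2"),
]

# Switch-based: a server seen on switch1 port 2 maps to r1u2, whatever the boot order.
found = match_by_switch_port(nodes, "switch1", "2")
found.mac = "6c:ae:8b:6a:d4:e5"

# Sequential: the next server to boot takes the first free name, which is r1u1 here.
nxt = match_sequentially(nodes)
nxt.mac = "6c:ae:8b:6a:d4:e4"
```

In this model the switch-based rule is independent of boot order, which is why a replaced server keeps its node name, while the sequential rule depends entirely on the order in which servers are powered on.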
.. toctree::
:maxdepth: 2


@@ -1,22 +1,16 @@
Manually Define Nodes
=====================
Manually define node means the admin know detailed information of the physical server and defines it into xCAT database with commands.
**Manually Define Node** means the admin knows the detailed information of the physical server and manually defines it in the xCAT database with the ``mkdef`` command.
.. include:: schedule_environment.rst
Manually define node
Manually Define Node
--------------------
To add a node object::
Execute the ``mkdef`` command to define the node: ::
nodeadd cn1 groups=powerLE,all
Use the ``chdef`` command to add and change node attributes: ::
chdef cn1 mgt=ipmi cons=ipmi ip=10.0.101.1 netboot=petitboot
chdef cn1 bmc=50.0.101.1 bmcusername=ADMIN bmcpassword=admin
chdef cn1 installnic=mac primarynic=mac mac=6c:ae:8b:6a:d4:e4
mkdef -t node cn1 groups=powerLE,all mgt=ipmi cons=ipmi ip=10.0.101.1 netboot=petitboot bmc=50.0.101.1 bmcusername=ADMIN bmcpassword=admin installnic=mac primarynic=mac mac=6c:ae:8b:6a:d4:e4
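
When several servers have to be defined by hand, typing each attribute list is where mistakes tend to creep in. One possible approach, sketched below in Python, is to keep the attributes in a table and generate the ``mkdef`` command line from it. The ``build_mkdef`` helper is hypothetical and not part of xCAT; the attribute values are the ones from the ``cn1`` example above, and the sketch only prints the command rather than running it.

```python
import shlex
from typing import Dict, List


def build_mkdef(name: str, attrs: Dict[str, str]) -> List[str]:
    """Return the argument list for defining one node object with mkdef.

    Hypothetical helper: it only assembles "mkdef -t node <name> key=value ...".
    """
    return ["mkdef", "-t", "node", name] + [f"{k}={v}" for k, v in attrs.items()]


# Attributes copied from the cn1 example; adjust them for a real server.
attrs = {
    "groups": "powerLE,all",
    "mgt": "ipmi",
    "cons": "ipmi",
    "ip": "10.0.101.1",
    "netboot": "petitboot",
    "bmc": "50.0.101.1",
    "bmcusername": "ADMIN",
    "bmcpassword": "admin",
    "installnic": "mac",
    "primarynic": "mac",
    "mac": "6c:ae:8b:6a:d4:e4",
}

cmd = build_mkdef("cn1", attrs)

# Print the command for review instead of executing it.
print(shlex.join(cmd))
```

Reviewing the printed command before running it on the management node keeps the manual step, but removes the retyping.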
The manually defined node will look like this::


@@ -1,6 +1,10 @@
IBM Power LE / OpenPOWER
=========================
This chapter introduces the procedure for managing an IBM Power LE/OpenPOWER cluster. Generally speaking, the processor of the **Compute Node** is based on the **IBM Power Chip** and the management module is a **BMC**.
New users are recommended to read this chapter in order, since later sections depend on the results of earlier sections.
.. toctree::
:maxdepth: 2