mirror of https://github.com/xcat2/xcat-core.git synced 2025-05-30 01:26:38 +00:00

Remove tailing spaces

GONG Jie 2018-03-22 14:00:30 +08:00
parent dcb3ea4270
commit 7e6290fb7d
5 changed files with 26 additions and 37 deletions


@ -1,15 +1,14 @@
Deploy CUDA nodes
=================
Diskful
-------
* To provision diskful nodes using osimage ``rhels7.5-ppc64le-install-cudafull``: ::

    nodeset <noderange> osimage=rhels7.5-ppc64le-install-cudafull
    rsetboot <noderange> net
    rpower <noderange> boot

Diskless
--------
@ -18,5 +17,4 @@ Diskless
    nodeset <noderange> osimage=rhels7.5-ppc64le-netboot-cudafull
    rsetboot <noderange> net
    rpower <noderange> boot
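
The diskful and diskless cases use the same three commands and differ only in the osimage name. The following is a minimal sketch of a wrapper for that sequence, not part of xCAT itself. The function name ``provision_cuda_nodes``, the ``RUN`` variable, and the noderange ``gpu01-gpu04`` are hypothetical. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: print (or run) the provisioning sequence for a noderange.
# RUN defaults to "echo", so nothing is executed unless RUN is set to
# an empty string (RUN= ./script.sh) on a real management node.
provision_cuda_nodes() {
    noderange=$1   # hypothetical example below: gpu01-gpu04
    osimage=$2     # e.g. rhels7.5-ppc64le-netboot-cudafull
    run=${RUN-echo}
    # Stop at the first step that fails.
    $run nodeset "$noderange" "osimage=$osimage" &&
    $run rsetboot "$noderange" net &&
    $run rpower "$noderange" boot
}

provision_cuda_nodes gpu01-gpu04 rhels7.5-ppc64le-netboot-cudafull
```

Chaining the steps with ``&&`` means a failed ``nodeset`` does not lead to a reboot of nodes that have no osimage assigned.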


@ -7,7 +7,7 @@ For more information, see NVIDIAs website: https://developer.nvidia.com/cuda-zon
xCAT supports CUDA installation for Ubuntu 14.04.3 and RHEL 7.5 on PowerNV (Non-Virtualized) for both diskful and diskless nodes.
Within the NVIDIA CUDA Toolkit, installing the ``cuda`` package installs both ``cuda-runtime`` and ``cuda-toolkit``. The ``cuda-toolkit`` is intended for developing CUDA programs and monitoring CUDA jobs. If your installation only needs to run GPU jobs, it is recommended to install only the ``cuda-runtime`` package.

.. toctree::
   :maxdepth: 2


@ -1,7 +1,7 @@
RHEL 7.5
========
xCAT provides sample package list (pkglist) files for CUDA. You can find them at:

* Diskful: ``/opt/xcat/share/xcat/install/rh/cuda*``
* Diskless: ``/opt/xcat/share/xcat/netboot/rh/cuda*``
@ -9,7 +9,7 @@ xCAT provides a sample package list (pkglist) files for CUDA. You can find them:
Diskful images
--------------
The following examples will create diskful images for ``cudafull`` and ``cudaruntime``. The osimage definitions will be created from the base ``rhels7.5-ppc64le-install-compute`` osimage.

**[Note]**: There is a requirement to reboot the machine after the CUDA drivers are installed. To satisfy this requirement, the CUDA software is installed in the ``pkglist`` attribute of the osimage definition where a reboot will happen after the Operating System is installed.
@ -20,7 +20,7 @@ cudafull
    lsdef -t osimage -z rhels7.5-ppc64le-install-compute \
      | sed 's/install-compute:/install-cudafull:/' \
      | mkdef -z

#. Add the CUDA repo created in the previous step to the ``pkgdir`` attribute: ::
@ -39,7 +39,7 @@ cudaruntime
    lsdef -t osimage -z rhels7.5-ppc64le-install-compute \
      | sed 's/install-compute:/install-cudaruntime:/' \
      | mkdef -z

#. Add the CUDA repo created in the previous step to the ``pkgdir`` attribute: ::
@ -54,9 +54,9 @@ cudaruntime
Diskless images
---------------
The following examples will create diskless images for ``cudafull`` and ``cudaruntime``. The osimage definitions will be created from the base ``rhels7.5-ppc64le-netboot-compute`` osimage.

**[Note]**: For diskless, the CUDA packages MUST be installed through the ``otherpkglist`` and **NOT** the ``pkglist`` as with diskful. The requirement to reboot the machine does not apply to diskless nodes because the image is loaded on each reboot.

cudafull
^^^^^^^^
@ -65,16 +65,16 @@ cudafull
    lsdef -t osimage -z rhels7.5-ppc64le-netboot-compute \
      | sed 's/netboot-compute:/netboot-cudafull:/' \
      | mkdef -z

#. Verify that the CUDA repo created in the previous step is available in the directory specified by the ``otherpkgdir`` attribute.

   The ``otherpkgdir`` directory can be obtained by running ``lsdef`` on the osimage: ::

       # lsdef -t osimage rhels7.5-ppc64le-netboot-cudafull -i otherpkgdir
       Object name: rhels7.5-ppc64le-netboot-cudafull
           otherpkgdir=/install/post/otherpkgs/rhels7.5/ppc64le

   Create a symbolic link to the CUDA repository in the directory specified by ``otherpkgdir``: ::

       ln -s /install/cuda-9.2 /install/post/otherpkgs/rhels7.5/ppc64le/cuda-9.2

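
A plain ``ln -s`` fails if the link already exists, which is easy to hit when the steps are repeated. The following is a minimal sketch of a repeatable version that also checks that the link resolves to a directory. The function name ``link_repo`` is hypothetical, and the ``-n`` option assumes GNU ``ln`` as shipped with RHEL.

```shell
#!/bin/sh
# Sketch: link a repo directory into otherpkgdir and confirm the link resolves.
link_repo() {
    repo=$1          # e.g. /install/cuda-9.2
    otherpkgdir=$2   # e.g. /install/post/otherpkgs/rhels7.5/ppc64le
    link="$otherpkgdir/$(basename "$repo")"

    [ -d "$repo" ] || { echo "missing repo: $repo" >&2; return 1; }
    mkdir -p "$otherpkgdir" || return 1

    # -f replaces an existing link; -n keeps ln from following the old
    # link and creating a nested link inside the repo directory.
    ln -sfn "$repo" "$link" || return 1

    # Succeed only if the link now points at a real directory.
    [ -d "$link" ] && echo "$link -> $(readlink "$link")"
}

# Example call on a management node (paths taken from the text above):
# link_repo /install/cuda-9.2 /install/post/otherpkgs/rhels7.5/ppc64le
```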
@ -84,7 +84,7 @@ cudafull
    chdef -t osimage -o rhels7.5-ppc64le-netboot-cudafull \
        rootimgdir=/install/netboot/rhels7.5/ppc64le/cudafull

#. Create a custom pkglist file to install additional operating system packages for your CUDA node.

#. Copy the default compute pkglist file as a starting point: ::
@ -111,7 +111,7 @@ cudafull
#. Create the ``otherpkgs.pkglist`` file for cudafull: ::

    vi /install/custom/netboot/rh/cudafull.rhels7.ppc64le.otherpkgs.pkglist
    # add the following packages
    cuda-9.2/ppc64le/cuda-deps/dkms
    cuda-9.2/ppc64le/cuda-core/cuda

@ -137,7 +137,7 @@ cudaruntime
      | sed 's/netboot-compute:/netboot-cudaruntime:/' \
      | mkdef -z

#. Verify that the CUDA repo created previously is available in the directory specified by the ``otherpkgdir`` attribute.

#. Obtain the ``otherpkgdir`` directory using the ``lsdef`` command: ::
@ -177,7 +177,6 @@ cudaruntime
    packimage rhels7.5-ppc64le-netboot-cudaruntime

POWER9 Setup
------------


@ -1,11 +1,10 @@
RHEL 7.5
========
#. Create a repository on the management node (MN) for installing the CUDA Toolkit: ::

    # For cuda toolkit name: /path/to/cuda-repo-rhel7-9-2-local-9.2.64-1.ppc64le.rpm
    # extract the contents from the rpm
    mkdir -p /tmp/cuda
    cd /tmp/cuda
    rpm2cpio /path/to/cuda-repo-rhel7-9-2-local-9.2.64-1.ppc64le.rpm | cpio -i -d
@ -14,15 +13,15 @@ RHEL 7.5
    mkdir -p /install/cuda-9.2/ppc64le/cuda-core
    cp /tmp/cuda/var/cuda-repo-9-2-local/*.rpm /install/cuda-9.2/ppc64le/cuda-core

    # Create the yum repo files
    createrepo /install/cuda-9.2/ppc64le/cuda-core

#. The NVIDIA CUDA Toolkit contains rpms that have dependencies on other external packages (such as ``DKMS``). These are provided by EPEL. It's up to the system administrator to obtain the dependency packages and add those to the ``cuda-deps`` directory: ::

    mkdir -p /install/cuda-9.2/ppc64le/cuda-deps

    # Copy the DKMS rpm to this directory
    cp /path/to/dkms-2.4.0-1.20170926git959bd74.el7.noarch.rpm /install/cuda-9.2/ppc64le/cuda-deps

    # Execute createrepo in this directory
    createrepo /install/cuda-9.2/ppc64le/cuda-deps
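
Before pointing an osimage at these directories, it can help to confirm that each one holds rpm files and the metadata that ``createrepo`` writes to ``repodata/repomd.xml``. The following is a minimal sketch of such a check; the function name ``check_repo`` and its messages are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether a directory contains rpms and createrepo metadata.
check_repo() {
    dir=$1
    # Expand the glob into this function's own positional parameters.
    set -- "$dir"/*.rpm
    # If nothing matched, $1 is still the literal pattern and does not exist.
    [ -e "$1" ] || { echo "$dir: no rpm files"; return 1; }
    [ -f "$dir/repodata/repomd.xml" ] || { echo "$dir: run createrepo"; return 1; }
    echo "$dir: ok ($# rpms)"
}

# Example calls on a management node (paths taken from the steps above):
# check_repo /install/cuda-9.2/ppc64le/cuda-core
# check_repo /install/cuda-9.2/ppc64le/cuda-deps
```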


@ -3,7 +3,7 @@ Update NVIDIA Driver
To update the system to a newer NVIDIA driver, follow the :doc:`Create CUDA software repository </advanced/gpu/nvidia/repo/index>` document to create another repository for the new driver.

The following example assumes the new driver is in ``/install/cuda-9.2/ppc64le/nvidia_new``.

Diskful
-------
@ -13,33 +13,26 @@ Diskful
    chdef -t osimage -o rhels7.5-ppc64le-install-cudafull \
        pkgdir=/install/cuda-9.2/ppc64le/nvidia_new,/install/cuda-9.2/ppc64le/cuda-deps

#. Use the ``xdsh`` command to remove all the NVIDIA rpms: ::

    xdsh <noderange> "yum remove *nvidia* -y"

#. Run the ``updatenode`` command to update the NVIDIA driver on the compute node: ::

    updatenode <noderange> -S

#. Reboot the compute node: ::

    rpower <noderange> off
    rpower <noderange> on

#. Verify the newer driver level: ::

    nvidia-smi | grep Driver

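
To compare the reported driver level against an expected value in a script, the version number can be pulled out of the ``nvidia-smi`` banner. The following is a minimal sketch that assumes the banner contains the text ``Driver Version: <number>``. The function name ``driver_version`` is hypothetical and the sample banner line is illustrative, not captured output.

```shell
#!/bin/sh
# Sketch: read nvidia-smi banner text on stdin and print the first number
# that follows "Driver Version:". Prints nothing if the text is absent.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Illustrative banner line; on a compute node use: nvidia-smi | driver_version
echo '| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |' \
    | driver_version
```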
Diskless
--------
To update to a new NVIDIA driver on diskless compute nodes, regenerate the osimage so that it points to the new NVIDIA driver repository, then reboot the node to load the diskless image.

Refer to :doc:`Create osimage definitions </advanced/gpu/nvidia/osimage/index>` for specific instructions.