|
|
|
@@ -1,5 +1,5 @@
|
|
|
|
|
RHEL 7.2 LE
|
|
|
|
|
===========
|
|
|
|
|
RHEL 7.5
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
xCAT provides sample package list (pkglist) files for CUDA. You can find them:
|
|
|
|
|
|
|
|
|
@@ -7,9 +7,9 @@ xCAT provides a sample package list (pkglist) files for CUDA. You can find them:
|
|
|
|
|
* Diskless: ``/opt/xcat/share/xcat/netboot/rh/cuda*``
|
|
|
|
|
|
|
|
|
|
Diskful images
|
|
|
|
|
---------------
|
|
|
|
|
--------------
|
|
|
|
|
|
|
|
|
|
The following examples will create diskful images for ``cudafull`` and ``cudaruntime``. The osimage definitions will be created from the base ``rhels7.2-ppc64le-install-compute`` osimage.
|
|
|
|
|
The following examples will create diskful images for ``cudafull`` and ``cudaruntime``. The osimage definitions will be created from the base ``rhels7.5-ppc64le-install-compute`` osimage.
|
|
|
|
|
|
|
|
|
|
**[Note]**: There is a requirement to reboot the machine after the CUDA drivers are installed. To satisfy this requirement, the CUDA software is installed in the ``pkglist`` attribute of the osimage definition where a reboot will happen after the Operating System is installed.
|
|
|
|
|
|
|
|
|
@@ -18,18 +18,18 @@ cudafull
|
|
|
|
|
|
|
|
|
|
#. Create a copy of the ``install-compute`` image and label it ``cudafull``: ::
|
|
|
|
|
|
|
|
|
|
lsdef -t osimage -z rhels7.2-ppc64le-install-compute \
|
|
|
|
|
lsdef -t osimage -z rhels7.5-ppc64le-install-compute \
|
|
|
|
|
| sed 's/install-compute:/install-cudafull:/' \
|
|
|
|
|
| mkdef -z
|
|
|
|
|
|
|
|
|
|
#. Add the CUDA repo created in the previous step to the ``pkgdir`` attribute: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-install-cudafull -p \
|
|
|
|
|
pkgdir=/install/cuda-7.5/ppc64le/cuda-core,/install/cuda-7.5/ppc64le/cuda-deps
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-install-cudafull -p \
|
|
|
|
|
pkgdir=/install/cuda-9.2/ppc64le/cuda-core,/install/cuda-9.2/ppc64le/cuda-deps
|
|
|
|
|
|
|
|
|
|
#. Use the provided ``cudafull`` pkglist to install the CUDA packages: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-install-cudafull \
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-install-cudafull \
|
|
|
|
|
pkglist=/opt/xcat/share/xcat/install/rh/cudafull.rhels7.ppc64le.pkglist
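
The ``lsdef -z | sed | mkdef -z`` pipeline in the first step works because ``sed`` rewrites only the object-name line of the stanza. The sketch below illustrates that rename without needing an xCAT management node. The stanza text is a hypothetical, shortened stand-in for real ``lsdef -t osimage -z`` output, and ``sample_stanza`` and ``rename_stanza`` are illustrative helper names, not xCAT commands.

```shell
#!/bin/sh
# Hypothetical, shortened stanza standing in for `lsdef -t osimage -z` output.
# A real osimage stanza carries many more attributes.
sample_stanza() {
    printf '%s\n' \
        '# <xCAT data object stanza file>' \
        '' \
        'rhels7.5-ppc64le-install-compute:' \
        '    objtype=osimage' \
        '    profile=compute' \
        '    provmethod=install'
}

# Rename only the object-name line. Attribute lines do not contain
# "install-compute:" (with the trailing colon), so they pass through unchanged.
rename_stanza() {
    sed 's/install-compute:/install-cudafull:/'
}

# Print the renamed stanza, i.e. what would be piped into `mkdef -z`.
sample_stanza | rename_stanza
```

Note that attributes such as ``profile`` keep their original values after the rename; only the osimage name changes.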
|
|
|
|
|
|
|
|
|
|
cudaruntime
|
|
|
|
@@ -37,24 +37,24 @@ cudaruntime
|
|
|
|
|
|
|
|
|
|
#. Create a copy of the ``install-compute`` image and label it ``cudaruntime``: ::
|
|
|
|
|
|
|
|
|
|
lsdef -t osimage -z rhels7.2-ppc64le-install-compute \
|
|
|
|
|
lsdef -t osimage -z rhels7.5-ppc64le-install-compute \
|
|
|
|
|
| sed 's/install-compute:/install-cudaruntime:/' \
|
|
|
|
|
| mkdef -z
|
|
|
|
|
|
|
|
|
|
#. Add the CUDA repo created in the previous step to the ``pkgdir`` attribute: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-install-cudaruntime -p \
|
|
|
|
|
pkgdir=/install/cuda-7.5/ppc64le/cuda-core,/install/cuda-7.5/ppc64le/cuda-deps
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-install-cudaruntime -p \
|
|
|
|
|
pkgdir=/install/cuda-9.2/ppc64le/cuda-core,/install/cuda-9.2/ppc64le/cuda-deps
|
|
|
|
|
|
|
|
|
|
#. Use the provided ``cudaruntime`` pkglist to install the CUDA packages: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-install-cudaruntime \
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-install-cudaruntime \
|
|
|
|
|
pkglist=/opt/xcat/share/xcat/install/rh/cudaruntime.rhels7.ppc64le.pkglist
|
|
|
|
|
|
|
|
|
|
Diskless images
|
|
|
|
|
---------------
|
|
|
|
|
|
|
|
|
|
The following examples will create diskless images for ``cudafull`` and ``cudaruntime``. The osimage definitions will be created from the base ``rhels7.2-ppc64le-netboot-compute`` osimage.
|
|
|
|
|
The following examples will create diskless images for ``cudafull`` and ``cudaruntime``. The osimage definitions will be created from the base ``rhels7.5-ppc64le-netboot-compute`` osimage.
|
|
|
|
|
|
|
|
|
|
**[Note]**: For diskless, the install of the CUDA packages MUST be done in the ``otherpkglist`` and **NOT** the ``pkglist`` as with diskful. The requirement for rebooting the machine is not applicable in diskless nodes because the image is loaded on each reboot.
|
|
|
|
|
|
|
|
|
@@ -63,7 +63,7 @@ cudafull
|
|
|
|
|
|
|
|
|
|
#. Create a copy of the ``netboot-compute`` image and label it ``cudafull``: ::
|
|
|
|
|
|
|
|
|
|
lsdef -t osimage -z rhels7.2-ppc64le-netboot-compute \
|
|
|
|
|
lsdef -t osimage -z rhels7.5-ppc64le-netboot-compute \
|
|
|
|
|
| sed 's/netboot-compute:/netboot-cudafull:/' \
|
|
|
|
|
| mkdef -z
|
|
|
|
|
|
|
|
|
@@ -71,18 +71,18 @@ cudafull
|
|
|
|
|
|
|
|
|
|
The ``otherpkgdir`` directory can be obtained by running lsdef on the osimage: ::
|
|
|
|
|
|
|
|
|
|
# lsdef -t osimage rhels7.2-ppc64le-netboot-cudafull -i otherpkgdir
|
|
|
|
|
Object name: rhels7.2-ppc64le-netboot-cudafull
|
|
|
|
|
otherpkgdir=/install/post/otherpkgs/rhels7.2/ppc64le
|
|
|
|
|
# lsdef -t osimage rhels7.5-ppc64le-netboot-cudafull -i otherpkgdir
|
|
|
|
|
Object name: rhels7.5-ppc64le-netboot-cudafull
|
|
|
|
|
otherpkgdir=/install/post/otherpkgs/rhels7.5/ppc64le
|
|
|
|
|
|
|
|
|
|
Create a symbolic link of the CUDA repository in the directory specified by ``otherpkgdir`` ::
|
|
|
|
|
|
|
|
|
|
ln -s /install/cuda-7.5 /install/post/otherpkgs/rhels7.2/ppc64le/cuda-7.5
|
|
|
|
|
ln -s /install/cuda-9.2 /install/post/otherpkgs/rhels7.5/ppc64le/cuda-9.2
|
|
|
|
|
|
|
|
|
|
#. Change the ``rootimgdir`` for the cudafull osimage: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-netboot-cudafull \
|
|
|
|
|
rootimgdir=/install/netboot/rhels7.2/ppc64le/cudafull
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-netboot-cudafull \
|
|
|
|
|
rootimgdir=/install/netboot/rhels7.5/ppc64le/cudafull
|
|
|
|
|
|
|
|
|
|
#. Create a custom pkglist file to install additional operating system packages for your CUDA node.
|
|
|
|
|
|
|
|
|
@@ -102,7 +102,7 @@ cudafull
|
|
|
|
|
|
|
|
|
|
#. Set the new file as the ``pkglist`` attribute for the cudafull osimage: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-netboot-cudafull \
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-netboot-cudafull \
|
|
|
|
|
pkglist=/install/custom/netboot/rh/cudafull.rhels7.ppc64le.pkglist
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@@ -112,28 +112,28 @@ cudafull
|
|
|
|
|
|
|
|
|
|
vi /install/custom/netboot/rh/cudafull.rhels7.ppc64le.otherpkgs.pkglist
|
|
|
|
|
# add the following packages
|
|
|
|
|
cuda-7.5/ppc64le/cuda-deps/dkms
|
|
|
|
|
cuda-7.5/ppc64le/cuda-core/cuda
|
|
|
|
|
cuda-9.2/ppc64le/cuda-deps/dkms
|
|
|
|
|
cuda-9.2/ppc64le/cuda-core/cuda
|
|
|
|
|
|
|
|
|
|
#. Set the ``otherpkglist`` attribute for the cudafull osimage: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-netboot-cudafull \
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-netboot-cudafull \
|
|
|
|
|
otherpkglist=/install/custom/netboot/rh/cudafull.rhels7.ppc64le.otherpkgs.pkglist
|
|
|
|
|
|
|
|
|
|
#. Generate the image: ::
|
|
|
|
|
|
|
|
|
|
genimage rhels7.2-ppc64le-netboot-cudafull
|
|
|
|
|
genimage rhels7.5-ppc64le-netboot-cudafull
|
|
|
|
|
|
|
|
|
|
#. Package the image: ::
|
|
|
|
|
|
|
|
|
|
packimage rhels7.2-ppc64le-netboot-cudafull
|
|
|
|
|
packimage rhels7.5-ppc64le-netboot-cudafull
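
The symbolic-link step above can be made safe to repeat. The following is a minimal sketch, assuming a hypothetical helper named ``link_cuda_repo``; it is not part of xCAT. The demo runs against a scratch directory, so the real ``/install`` tree is not touched.

```shell
#!/bin/sh
# link_cuda_repo REPO_DIR OTHERPKGDIR   (hypothetical helper, not an xCAT command)
# Creates OTHERPKGDIR/<basename of REPO_DIR> as a symlink to REPO_DIR.
link_cuda_repo() {
    repo=$1
    otherpkgdir=$2
    mkdir -p "$otherpkgdir" || return 1
    # -f replaces an existing link; -n stops ln from following an existing
    # link and creating a nested link inside the repository directory.
    ln -sfn "$repo" "$otherpkgdir/$(basename "$repo")"
}

# Demo in a scratch directory that mimics the layout used in this section.
scratch=$(mktemp -d)
mkdir -p "$scratch/install/cuda-9.2"
link_cuda_repo "$scratch/install/cuda-9.2" \
    "$scratch/install/post/otherpkgs/rhels7.5/ppc64le"
ls -l "$scratch/install/post/otherpkgs/rhels7.5/ppc64le"
```

With a plain ``ln -s``, running the command a second time would create a stray ``cuda-9.2`` link inside the repository itself; the ``-n`` flag avoids that.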
|
|
|
|
|
|
|
|
|
|
cudaruntime
|
|
|
|
|
^^^^^^^^^^^
|
|
|
|
|
|
|
|
|
|
#. Create a copy of the ``netboot-compute`` image and label it ``cudaruntime``: ::
|
|
|
|
|
|
|
|
|
|
lsdef -t osimage -z rhels7.2-ppc64le-netboot-compute \
|
|
|
|
|
lsdef -t osimage -z rhels7.5-ppc64le-netboot-compute \
|
|
|
|
|
| sed 's/netboot-compute:/netboot-cudaruntime:/' \
|
|
|
|
|
| mkdef -z
|
|
|
|
|
|
|
|
|
@@ -141,18 +141,18 @@ cudaruntime
|
|
|
|
|
|
|
|
|
|
#. Obtain the ``otherpkgdir`` directory using the ``lsdef`` command: ::
|
|
|
|
|
|
|
|
|
|
# lsdef -t osimage rhels7.2-ppc64le-netboot-cudaruntime -i otherpkgdir
|
|
|
|
|
Object name: rhels7.2-ppc64le-netboot-cudaruntime
|
|
|
|
|
otherpkgdir=/install/post/otherpkgs/rhels7.2/ppc64le
|
|
|
|
|
# lsdef -t osimage rhels7.5-ppc64le-netboot-cudaruntime -i otherpkgdir
|
|
|
|
|
Object name: rhels7.5-ppc64le-netboot-cudaruntime
|
|
|
|
|
otherpkgdir=/install/post/otherpkgs/rhels7.5/ppc64le
|
|
|
|
|
|
|
|
|
|
#. Create a symbolic link to the CUDA repository in the directory specified by ``otherpkgdir`` ::
|
|
|
|
|
|
|
|
|
|
ln -s /install/cuda-7.5 /install/post/otherpkgs/rhels7.2/ppc64le/cuda-7.5
|
|
|
|
|
ln -s /install/cuda-9.2 /install/post/otherpkgs/rhels7.5/ppc64le/cuda-9.2
|
|
|
|
|
|
|
|
|
|
#. Change the ``rootimgdir`` for the cudaruntime osimage: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-netboot-cudaruntime \
|
|
|
|
|
rootimgdir=/install/netboot/rhels7.2/ppc64le/cudaruntime
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-netboot-cudaruntime \
|
|
|
|
|
rootimgdir=/install/netboot/rhels7.5/ppc64le/cudaruntime
|
|
|
|
|
|
|
|
|
|
#. Create the ``otherpkg.pkglist`` file to do the install of the CUDA runtime packages:
|
|
|
|
|
|
|
|
|
@@ -161,19 +161,53 @@ cudaruntime
|
|
|
|
|
vi /install/custom/netboot/rh/cudaruntime.rhels7.ppc64le.otherpkgs.pkglist
|
|
|
|
|
|
|
|
|
|
# Add the following packages:
|
|
|
|
|
cuda-7.5/ppc64le/cuda-deps/dkms
|
|
|
|
|
cuda-7.5/ppc64le/cuda-core/cuda-runtime-7-5
|
|
|
|
|
cuda-9.2/ppc64le/cuda-deps/dkms
|
|
|
|
|
cuda-9.2/ppc64le/cuda-core/cuda-runtime-9-2
|
|
|
|
|
|
|
|
|
|
#. Set the ``otherpkglist`` attribute for the cudaruntime osimage: ::
|
|
|
|
|
|
|
|
|
|
chdef -t osimage -o rhels7.2-ppc64le-netboot-cudaruntime \
|
|
|
|
|
chdef -t osimage -o rhels7.5-ppc64le-netboot-cudaruntime \
|
|
|
|
|
otherpkglist=/install/custom/netboot/rh/cudaruntime.rhels7.ppc64le.otherpkgs.pkglist
|
|
|
|
|
|
|
|
|
|
#. Generate the image: ::
|
|
|
|
|
|
|
|
|
|
genimage rhels7.2-ppc64le-netboot-cudaruntime
|
|
|
|
|
genimage rhels7.5-ppc64le-netboot-cudaruntime
|
|
|
|
|
|
|
|
|
|
#. Package the image: ::
|
|
|
|
|
|
|
|
|
|
packimage rhels7.2-ppc64le-netboot-cudaruntime
|
|
|
|
|
packimage rhels7.5-ppc64le-netboot-cudaruntime
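
Instead of editing the ``otherpkgs.pkglist`` file by hand in ``vi``, the two entries can be generated. The sketch below assumes a hypothetical helper named ``write_otherpkgs`` and derives the runtime package name from the CUDA version following the pattern shown above (``cuda-runtime-9-2`` for CUDA 9.2). The demo writes to a scratch directory rather than ``/install/custom``.

```shell
#!/bin/sh
# write_otherpkgs PROFILE CUDA_VERSION OUTFILE   (hypothetical helper)
# PROFILE is "cudafull" or "cudaruntime"; CUDA_VERSION is e.g. "9.2".
write_otherpkgs() {
    profile=$1
    version=$2
    outfile=$3
    case "$profile" in
        cudafull)
            core_pkg="cuda" ;;
        cudaruntime)
            # Runtime package names use dashes, e.g. cuda-runtime-9-2.
            core_pkg="cuda-runtime-$(printf '%s' "$version" | tr . -)" ;;
        *)
            echo "unknown profile: $profile" >&2
            return 1 ;;
    esac
    mkdir -p "$(dirname "$outfile")" || return 1
    # Paths are relative to the osimage's otherpkgdir.
    printf '%s\n' \
        "cuda-$version/ppc64le/cuda-deps/dkms" \
        "cuda-$version/ppc64le/cuda-core/$core_pkg" > "$outfile"
}

# Demo against a scratch directory.
demo_dir=$(mktemp -d)
write_otherpkgs cudaruntime 9.2 \
    "$demo_dir/cudaruntime.rhels7.ppc64le.otherpkgs.pkglist"
cat "$demo_dir/cudaruntime.rhels7.ppc64le.otherpkgs.pkglist"
```

The generated file can then be passed to ``chdef`` through the ``otherpkglist`` attribute exactly as in the step above.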
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
POWER9 Setup
|
|
|
|
|
------------
|
|
|
|
|
|
|
|
|
|
The NVIDIA POWER9 CUDA driver needs some additional setup. Refer to the URL below for details.
|
|
|
|
|
|
|
|
|
|
http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#power9-setup
|
|
|
|
|
|
|
|
|
|
xCAT includes an example script, ``cuda_power9_setup``, to help users handle this setup.
|
|
|
|
|
|
|
|
|
|
Diskful osimage
|
|
|
|
|
^^^^^^^^^^^^^^^
|
|
|
|
|
|
|
|
|
|
For diskful deployment, there is no need to change the osimage definition. Instead, add this postscript to your compute node ``postbootscripts`` list.
|
|
|
|
|
|
|
|
|
|
chdef p9compute -p postbootscripts=cuda_power9_setup
|
|
|
|
|
|
|
|
|
|
Diskless osimage
^^^^^^^^^^^^^^^^
|
|
|
|
|
|
|
|
|
|
For diskless deployment, the script needs to be added to the postinstall script of the osimage, and it should be run in the chroot environment. Refer to the following commands as an example.
|
|
|
|
|
|
|
|
|
|
mkdir -p /install/custom/netboot/rh
|
|
|
|
|
cp /opt/xcat/share/xcat/netboot/rh/compute.rhels7.ppc64le.postinstall /install/custom/netboot/rh/cudafull.rhels7.ppc64le.postinstall
|
|
|
|
|
|
|
|
|
|
cat >>/install/custom/netboot/rh/cudafull.rhels7.ppc64le.postinstall <<-EOF
|
|
|
|
|
|
|
|
|
|
cp /install/postscripts/cuda_power9_setup /install/netboot/rhels7.5/ppc64le/compute/rootimg/tmp/cuda_power9_setup
|
|
|
|
|
chroot /install/netboot/rhels7.5/ppc64le/compute/rootimg /tmp/cuda_power9_setup
|
|
|
|
|
|
|
|
|
|
rm -f /install/netboot/rhels7.5/ppc64le/compute/rootimg/tmp/cuda_power9_setup
|
|
|
|
|
EOF
|
|
|
|
|
|
|
|
|
|
chdef -t osimage rhels7.5-ppc64le-netboot-cudafull postinstall=/install/custom/netboot/rh/cudafull.rhels7.ppc64le.postinstall
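
The append step above relies on a shell here-document. The following is a minimal sketch of the same idea as a reusable function, assuming a hypothetical helper named ``append_power9_hook`` that takes the postinstall file and the ``rootimg`` directory as arguments. The demo appends to a scratch file, so nothing under ``/install`` is modified.

```shell
#!/bin/sh
# append_power9_hook POSTINSTALL_FILE ROOTIMG_DIR   (hypothetical helper)
# Appends the copy / chroot / cleanup steps to a postinstall script.
append_power9_hook() {
    postinstall=$1
    rootimg=$2
    # Unquoted EOF delimiter: $rootimg is expanded now, while appending,
    # so the generated lines contain the literal rootimg path.
    cat >> "$postinstall" <<EOF

cp /install/postscripts/cuda_power9_setup $rootimg/tmp/cuda_power9_setup
chroot $rootimg /tmp/cuda_power9_setup
rm -f $rootimg/tmp/cuda_power9_setup
EOF
}

# Demo on a scratch file, using the rootimg path from the example above.
demo_file=$(mktemp)
append_power9_hook "$demo_file" \
    /install/netboot/rhels7.5/ppc64le/compute/rootimg
cat "$demo_file"
```

Passing the ``rootimg`` directory as a parameter keeps the appended lines in step with whatever ``rootimgdir`` the target osimage actually uses.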
|
|
|
|
|