diff --git a/docs/source/advanced/gpu/nvidia/index.rst b/docs/source/advanced/gpu/nvidia/index.rst
index db621ed66..ea9459018 100644
--- a/docs/source/advanced/gpu/nvidia/index.rst
+++ b/docs/source/advanced/gpu/nvidia/index.rst
@@ -17,3 +17,4 @@ Within the NVIDIA CUDA Toolkit, installing the ``cuda`` package will install bot
    deploy_cuda_node.rst
    verify_cuda_install.rst
    management.rst
+   update_nvidia_driver.rst
diff --git a/docs/source/advanced/gpu/nvidia/update_nvidia_driver.rst b/docs/source/advanced/gpu/nvidia/update_nvidia_driver.rst
new file mode 100644
index 000000000..e7a651cf2
--- /dev/null
+++ b/docs/source/advanced/gpu/nvidia/update_nvidia_driver.rst
@@ -0,0 +1,41 @@
+Upgrade NVIDIA Driver
+=====================
+
+To update the NVIDIA driver on the system to a newer version, first :doc:`create a new CUDA software repository ` . The following steps assume the newer driver is in ``/install/cuda-7.5/ppc64le/nvidia_new``.
+
+Diskful
+-------
+
+#. Change ``pkgdir`` for the CUDA osimage: ::
+
+    chdef -t osimage -o rhels7.2-ppc64le-install-cudafull \
+      pkgdir=/install/cuda-7.5/ppc64le/nvidia_new,/install/cuda-7.5/ppc64le/cuda-deps
+
+
+#. Use the ``xdsh`` command to remove all NVIDIA RPMs: ::
+
+    xdsh "yum remove *nvidia* -y"
+
+
+#. Run the ``updatenode`` command to upgrade the NVIDIA driver on the compute node: ::
+
+    updatenode -S
+
+
+#. Reboot the compute node: ::
+
+    rpower off
+    rpower on
+
+
+#. Verify the new driver level on the compute node: ::
+
+    nvidia-smi | grep Driver
+
+
+
+
+Diskless
+--------
+
+To update the NVIDIA driver on a diskless compute node, the simplest approach is to regenerate the osimage with the new NVIDIA driver repository and re-provision the node with that osimage, because the node must be rebooted for the new NVIDIA driver to load. Follow :doc:`this doc ` to create osimage definitions and deploy CUDA nodes.
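
The diskful steps in the patch could be chained into one script, roughly as sketched below. This is only an illustration under stated assumptions. The patch does not name a noderange, so `NODES` (default `gpu01`) is a hypothetical placeholder. The `run` helper, the `DRY_RUN` switch, and running the final `nvidia-smi` check through `xdsh` are also additions that are not part of the patch. With the default `DRY_RUN=1` the script only prints the commands it would issue.

```shell
#!/bin/sh
# Sketch of the diskful NVIDIA driver upgrade sequence.
# NODES is a hypothetical noderange; the patch does not specify one.
# OSIMAGE and REPO default to the values used in the patch.
NODES="${NODES:-gpu01}"
OSIMAGE="${OSIMAGE:-rhels7.2-ppc64le-install-cudafull}"
REPO="${REPO:-/install/cuda-7.5/ppc64le}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

upgrade_diskful() {
    # 1. Point the osimage at the new driver repository.
    run chdef -t osimage -o "$OSIMAGE" \
        "pkgdir=$REPO/nvidia_new,$REPO/cuda-deps" || return 1
    # 2. Remove the old NVIDIA RPMs from the nodes.
    run xdsh "$NODES" "yum remove *nvidia* -y" || return 1
    # 3. Install the packages from the updated pkgdir.
    run updatenode "$NODES" -S || return 1
    # 4. Power-cycle the nodes so the new driver is loaded.
    run rpower "$NODES" off || return 1
    run rpower "$NODES" on || return 1
    # 5. Report the driver level. In a real run the nodes need time to
    #    boot before this check can succeed; no wait is implemented here.
    run xdsh "$NODES" "nvidia-smi | grep Driver"
}

upgrade_diskful
```

To review the commands first, run the script as is. Setting `DRY_RUN=0` and a real noderange in `NODES` executes them on the management node.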