Revision as of 11:34, 25 October 2023

Create a JupyterHub instance with GPU support enabled.

Setup

Link lustre path to home directory

When working from JupyterHub, the default working directory is your home folder. However, it is recommended to keep your data and code on the Lustre file system. To make this easier, create a symbolic link to your Lustre folder from your home directory:

ln -s /lustre/[path to your lustre folder] [reference name, for example lustre_folders]

To remove a link:

rm [reference name, for example lustre_folders]
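The same link can also be created from a notebook cell. The following is a minimal Python sketch of the two commands above; the helper name link_into_home and the folder names are hypothetical placeholders, and the demonstration runs in a temporary directory rather than on a real Lustre path.

```python
import tempfile
from pathlib import Path


def link_into_home(target: Path, link: Path) -> Path:
    """Create `link` pointing at `target`, like `ln -s target link`."""
    if link.is_symlink():
        link.unlink()  # like `rm link`: removes the link only, never the target
    link.symlink_to(target, target_is_directory=True)
    return link


# Demonstration in a throwaway directory; both folder names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "lustre_project"
    target.mkdir()
    (target / "data.txt").write_text("hello")

    link = link_into_home(target, Path(tmp) / "lustre_folders")
    link_works = (link / "data.txt").read_text() == "hello"

    link.unlink()  # remove the link again
    target_survives = (target / "data.txt").exists()

print(link_works, target_survives)
```

Removing the link leaves the folder it pointed to untouched, which is why the second check still finds the file.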

Create a conda environment to use as a Jupyter kernel

conda create -y -n kernel_test python=3.10 ipykernel 
conda activate kernel_test
python -m ipykernel install --user --name kernel_test

NOTE: You can specify the Python version for your conda environment with the python= argument, for example python=3.10. Make sure the Python version you choose is compatible with your required packages.
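Once a notebook is running with the new kernel, it can be useful to confirm that the kernel really uses the conda environment and the Python version you asked for. A small sketch using only the standard library (the function name describe_interpreter is made up for this example):

```python
import sys
from pathlib import Path


def describe_interpreter():
    """Summarise which Python installation the current kernel is running."""
    return {
        "executable": sys.executable,
        # For a conda environment, sys.prefix ends in the environment name.
        "environment": Path(sys.prefix).name,
        "python": "{}.{}".format(*sys.version_info[:2]),
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

With the environment created above, the "environment" entry would be expected to read kernel_test.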

Install required packages

For PyTorch and TensorFlow, follow the installation instructions on their respective official websites.

As an example, the following command installs PyTorch built for CUDA 11.8:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
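To check what actually got installed, the package versions can be read back without importing the packages themselves. This is a sketch using importlib.metadata from the standard library; installed_versions is a hypothetical helper name, and missing packages are reported as None rather than raising an error.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["torch", "torchvision", "torchaudio"])
for name, version in report.items():
    print(name, version if version is not None else "not installed")
```

Run this inside the kernel_test kernel; a None entry usually means the package was installed into a different environment.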

Start jupyter notebook with GPU

Open JupyterHub and select the following server options:

Select a location for your server: on the cluster (default option)
Partition to use: gpu
Memory (in MB): desired memory
Number of CPUs: desired CPU count
Maximum execution time (hours:minutes:seconds): maximum amount of time the notebook is available
Extra options: --gres=gpu:1 (the default when the gpu partition is selected; use gpu:x to request x GPUs)
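The form expects memory in MB and the time limit as hours:minutes:seconds, so a little arithmetic is needed when you think in GB and hours. The sketch below shows that conversion; form_values is a hypothetical helper, the dictionary keys are shortened versions of the form labels, and it assumes 1 GB is counted as 1024 MB.

```python
from datetime import timedelta


def form_values(memory_gb, cpus, max_runtime, gpus=1):
    """Translate a resource request into the values typed into the form."""
    total_seconds = int(max_runtime.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return {
        "Partition": "gpu",
        "Memory (in MB)": memory_gb * 1024,  # assumption: 1 GB = 1024 MB
        "Number of CPUs": cpus,
        "Maximum execution time": f"{hours}:{minutes:02d}:{seconds:02d}",
        "Extra options": f"--gres=gpu:{gpus}",
    }


values = form_values(memory_gb=16, cpus=4, max_runtime=timedelta(hours=8))
for label, value in values.items():
    print(f"{label}: {value}")
```

For 16 GB, 4 CPUs and 8 hours this gives 16384 MB, a time limit of 8:00:00 and --gres=gpu:1.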

Using multiple GPUs

Request multiple GPUs when starting JupyterHub by setting the extra options field to --gres=gpu:x, where x is the number of GPUs requested.
The notebook should then have multiple GPUs available. Verify this with the GPU tests in the following section.
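A quick first check, before loading any deep learning framework, is to look at which GPU ids the job is allowed to see. This sketch assumes the scheduler exports the CUDA_VISIBLE_DEVICES environment variable for the job, which is common but not guaranteed on every cluster; visible_gpu_ids is a hypothetical helper name.

```python
import os


def visible_gpu_ids(environ=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if unset."""
    if environ is None:
        environ = os.environ
    raw = environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # variable not set: nothing can be concluded from it
    return [part.strip() for part in raw.split(",") if part.strip()]


ids = visible_gpu_ids()
if ids is None:
    print("CUDA_VISIBLE_DEVICES is not set")
else:
    print(f"{len(ids)} GPU(s) visible: {ids}")
```

If you requested --gres=gpu:2, two ids would be expected in the list. An empty list means the variable is set but hides all GPUs.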

Test GPU availability

PyTorch

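import torch


def get_version():
    # Not defined in the original snippet; assumed here to report the
    # installed PyTorch version and the CUDA version it was built against.
    print('>>>> torch.__version__')
    print(torch.__version__, '\n')

    print('>>>> torch.version.cuda')
    print(torch.version.cuda, '\n')


def check_all_cuda_devices():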
    device_count = torch.cuda.device_count()
    for i in range(device_count):
        print('>>>> torch.cuda.device({})'.format(i))
        result = torch.cuda.device(i)
        print(result, '\n')

        print('>>>> torch.cuda.get_device_name({})'.format(i))
        result = torch.cuda.get_device_name(i)
        print(result, '\n')


def check_cuda():
    print('>>>> torch.cuda.is_available()')
    result = torch.cuda.is_available()
    print(result, '\n')

    print('>>>> torch.cuda.device_count()')
    result = torch.cuda.device_count()
    print(result, '\n')

    print('>>>> torch.cuda.current_device()')
    result = torch.cuda.current_device()
    print(result, '\n')

    print('>>>> torch.cuda.device(0)')
    result = torch.cuda.device(0)
    print(result, '\n')

    print('>>>> torch.cuda.get_device_name(0)')
    result = torch.cuda.get_device_name(0)
    print(result, '\n')

    check_all_cuda_devices()


def check_cuda_ops():
    print('>>>> torch.zeros(2, 3)')
    zeros = torch.zeros(2, 3)
    print(zeros, '\n')

    print('>>>> torch.zeros(2, 3).cuda()')
    cuda_zero = torch.zeros(2, 3).cuda()
    print(cuda_zero, '\n')

    print('>>>> torch.tensor([[1, 2, 3], [4, 5, 6]]).cuda()')
    tensor_a = torch.tensor([[1, 2, 3], [4, 5, 6]]).cuda()
    print(tensor_a, '\n')

    print('>>>> tensor_a + cuda_zero')
    total = tensor_a + cuda_zero  # named `total` to avoid shadowing the built-in sum()
    print(total, '\n')

    print('>>>> tensor_a * cuda_twos')
    tensor_a = tensor_a.to(torch.float)
    cuda_zero = cuda_zero.to(torch.float)
    cuda_twos = (cuda_zero + 1.0) * 2.0
    product = tensor_a * cuda_twos
    print(product, '\n')

    print('>>>> torch.matmul(tensor_a, cuda_twos.T)')
    mat_mul = torch.matmul(tensor_a, cuda_twos.T)
    print(mat_mul, '\n')

try:
    get_version()
except Exception as e:
    print('get_version() failed, exception message below:')
    print(e)

try:
    check_cuda()
except Exception as e:
    print('check_cuda() failed, exception message below:')
    print(e)

try:
    check_cuda_ops()
except Exception as e:
    print('check_cuda_ops() failed, exception message below:')
    print(e)
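Once the checks pass, a common pattern is to pick the device once and fall back to the CPU when no GPU is available, so the same notebook also runs on a non-GPU partition. A minimal sketch (pick_device is a hypothetical helper; it also returns "cpu" when PyTorch is not installed at all):

```python
import importlib.util


def pick_device():
    """Return 'cuda' if PyTorch is installed and sees a GPU, otherwise 'cpu'."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Using device:", device)
```

The resulting string can be passed to PyTorch wherever a device is accepted, for example torch.zeros(2, 3, device=device).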

TensorFlow

import tensorflow as tf

hasGPUSupport = tf.test.is_built_with_cuda()
gpuList = tf.config.list_physical_devices('GPU')

print("TensorFlow compiled with CUDA/GPU support:", hasGPUSupport)
print("TensorFlow can access", len(gpuList), "GPU(s)")
print("Accessible GPUs are:")
print(gpuList)

tf.debugging.set_log_device_placement(True)
# Place tensors on the GPU
with tf.device('device:GPU:0'):
  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Run on the GPU
c = tf.matmul(a, b)
print(c)
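If neither framework sees a GPU, it helps to check whether the node itself reports one, independent of PyTorch or TensorFlow. The sketch below calls nvidia-smi when it is on the PATH and parses its CSV output; the helper names are made up for this example, and it returns an empty list when the tool is missing or fails.

```python
import shutil
import subprocess


def parse_gpu_lines(text):
    """Parse 'name, memory' lines as printed by nvidia-smi in CSV mode."""
    gpus = []
    for line in text.strip().splitlines():
        if "," not in line:
            continue  # skip anything that is not a 'name, memory' row
        name, memory = [field.strip() for field in line.split(",", 1)]
        gpus.append({"name": name, "memory": memory})
    return gpus


def query_gpus():
    """Ask nvidia-smi for the GPUs on this node; empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


for gpu in query_gpus():
    print(gpu["name"], gpu["memory"])
```

An empty result on a GPU partition suggests the job was started without a GPU allocation, in which case the extra options field is the first thing to check.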

JupyterHub with GPU: Difference between revisions
