GPU Resource Usage

NVIDIA GPU Resource Usage

nvidia-smi will report GPU utilization and memory usage for NVIDIA GPUs.

  • GPU Utilization is the percentage of time, over the most recent sample period, during which at least one kernel was running on the GPU. It does not indicate how many of the GPU's cores were busy.

watch -n 1 nvidia-smi reruns nvidia-smi and refreshes the display every second.

Source: https://medium.com/analytics-vidhya/explained-output-of-nvidia-smi-utility-fc4fbee3b124 and https://developer.download.nvidia.com/compute/DCGM/docs/nvidia-smi-367.38.pdf
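
To log these numbers from a script instead of reading the table by eye, nvidia-smi can print selected fields as CSV with its --query-gpu and --format options. The following is a minimal sketch, not an official recipe: the helper names parse_gpu_csv and query_gpus are made up for this example, and the field list is just one reasonable choice.

```python
import csv
import io
import shutil
import subprocess

# Fields requested from nvidia-smi; memory values are reported in MiB.
QUERY_FIELDS = ["index", "name", "utilization.gpu", "memory.used", "memory.total"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [
        dict(zip(QUERY_FIELDS, (cell.strip() for cell in row)))
        for row in rows
        if row  # skip blank lines
    ]


def query_gpus():
    """Run nvidia-smi once; return [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(
            f"GPU {gpu['index']} ({gpu['name']}): "
            f"{gpu['utilization.gpu']}% utilization, "
            f"{gpu['memory.used']} / {gpu['memory.total']} MiB"
        )
```

All values come back as strings; convert them with int() if you want to compute with them.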

gpustat reports GPU utilization and memory usage in a compact format, with one line per GPU.

> module load gpustat
> gpustat
Source: https://github.com/wookayin/gpustat
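
gpustat can also emit its report as JSON (gpustat --json), which is easier to process in a script than the colored terminal output. The sketch below assumes gpustat is on your PATH (for example after module load gpustat); the helper names summarize_gpustat and query_gpustat are hypothetical and exist only for this illustration.

```python
import json
import shutil
import subprocess


def summarize_gpustat(json_text):
    """Turn `gpustat --json` output into (index, name, utilization %, used, total) tuples."""
    report = json.loads(json_text)
    return [
        (g["index"], g["name"], g["utilization.gpu"], g["memory.used"], g["memory.total"])
        for g in report["gpus"]
    ]


def query_gpustat():
    """Run gpustat once; return [] if it is missing or fails."""
    if shutil.which("gpustat") is None:
        return []
    result = subprocess.run(["gpustat", "--json"], capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return summarize_gpustat(result.stdout)


if __name__ == "__main__":
    for index, name, util, used, total in query_gpustat():
        print(f"[{index}] {name}: {util}% utilization, memory {used} / {total}")
```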

PyTorch

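  • PyTorch allocates GPU memory on demand, so the previous tools report the memory that the process actually holds. This figure includes memory cached by PyTorch for reuse, so it can be larger than the memory occupied by live tensors.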
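
To see the difference between memory occupied by tensors and memory held by the process, PyTorch exposes torch.cuda.memory_allocated() and torch.cuda.memory_reserved(). The following is a minimal sketch; the helper names to_mib and report_torch_memory are made up for this example, and the function returns None when PyTorch or a CUDA device is not available.

```python
def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / (1024 ** 2)


def report_torch_memory(device_index=0):
    """Return tensor-allocated and allocator-reserved memory in MiB, or None without a GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return {
        # Memory currently occupied by tensors.
        "allocated_mib": to_mib(torch.cuda.memory_allocated(device_index)),
        # Memory held by PyTorch's caching allocator (always >= allocated).
        "reserved_mib": to_mib(torch.cuda.memory_reserved(device_index)),
    }


if __name__ == "__main__":
    print(report_torch_memory())
```

The value shown by nvidia-smi is typically somewhat higher than reserved_mib, because it also counts the CUDA context created for the process.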

TensorFlow/Keras

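  • By default, TensorFlow reserves nearly all of the memory on every GPU visible to the process, so the previous tools will show that all (or almost all) of the GPU memory is in use regardless of what your model needs.
  • To make TensorFlow allocate GPU memory only as it is needed, so that the tools show how much is actually used, add these lines at the top of your Python script, before TensorFlow is imported: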
import os
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

Visit the TensorFlow website for additional information.
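
TensorFlow can also report its own memory use from inside the script, which is useful as a cross-check against nvidia-smi or gpustat. The sketch below assumes TensorFlow 2.5 or newer, where tf.config.experimental.get_memory_info is available; the helper name report_tf_memory is hypothetical, and the function returns None when TensorFlow or a GPU is not available.

```python
import os

# Set before importing TensorFlow so the setting is in place when the GPU is initialized.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def report_tf_memory(device="GPU:0"):
    """Return current and peak TensorFlow GPU memory use in MiB, or None without a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    if not tf.config.list_physical_devices("GPU"):
        return None
    # Assumes TensorFlow >= 2.5; values are returned in bytes.
    info = tf.config.experimental.get_memory_info(device)
    return {
        "current_mib": info["current"] / 1024 ** 2,
        "peak_mib": info["peak"] / 1024 ** 2,
    }


if __name__ == "__main__":
    print(report_tf_memory())
```

With memory growth enabled, TensorFlow does not return memory to the GPU once it has been allocated, so the external tools show the high-water mark rather than the instantaneous use.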

Check Your Knowledge

  • Find the name of the GPU that you have access to.