Code Profiling
A code profiler reports how much time each line of a program takes to run.
- This information helps you find the slowest lines so you can focus optimization effort where it matters.
- Several profilers are available for Python. We recommend line_profiler (https://github.com/pyutils/line_profiler).
Install line_profiler
Use the PyTorch container:
module load apptainer pytorch
apptainer exec $CONTAINERDIR/pytorch-2.0.1.sif pip install line_profiler
Use the TensorFlow container:
module load apptainer tensorflow
apptainer exec $CONTAINERDIR/tensorflow-2.13.0.sif pip install line_profiler
Outside of a container:
pip install --user line_profiler
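To confirm that the install is visible to the Python interpreter you will actually use, a quick check like the one below may help. This is a minimal sketch: the helper name is_installed and the file name check_install.py are hypothetical and are not part of line_profiler.

```python
# check_install.py (hypothetical file name)
import importlib.util


def is_installed(module_name):
    """Return True if the module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "line_profiler"
    print(f"{name} installed: {is_installed(name)}")
```

Run it with the same interpreter that will run your code, for example through apptainer exec with the container you installed into.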
Run line_profiler
- In the code file, include the import statement below and use @profile to decorate the functions you would like to profile:
from line_profiler import profile
...
@profile
def fcn_to_profile(arg1, arg2, ...):
    ...
- Run line_profiler by setting LINE_PROFILE=1 when launching the script. Use the --nv flag for GPU use:
- PyTorch:
LINE_PROFILE=1 apptainer run --nv $CONTAINERDIR/pytorch-2.0.1.sif file_name.py
- TensorFlow:
LINE_PROFILE=1 apptainer run --nv $CONTAINERDIR/tensorflow-2.13.0.sif file_name.py
- Profiler results are written to two text files with identical contents: profile_output.txt and profile_output_[TIMESTAMP].txt.
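The decorate-then-run steps above can be sketched as one complete script. This is only an illustration: the file name example_profile.py and the function sum_of_squares are hypothetical stand-ins for your own code. The try/except fallback is an added assumption so that the script still runs where line_profiler is not installed.

```python
# example_profile.py (hypothetical file name)
try:
    from line_profiler import profile
except ImportError:
    # Assumption for portability: fall back to a do-nothing decorator
    # so the script still runs without line_profiler installed.
    def profile(func):
        return func


@profile
def sum_of_squares(n):
    """Toy workload standing in for a real function such as a training loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print(sum_of_squares(1_000_000))
```

Launching this script with LINE_PROFILE=1 set, as in the commands above, should produce per-line timings for sum_of_squares. Without that variable, the decorator leaves the function unchanged.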
Visit the Line Profiler documentation for more information.
Notes:
- Running the command without LINE_PROFILE=1 will just run file_name.py but not profile it.
- line_profiler has a very slight overhead for code run on GPU. Some users notice more of a slowdown on strictly CPU code (~40 more seconds for code that should run in ~160 seconds).
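To put the CPU figure above in relative terms, the overhead can be expressed as a fraction of the unprofiled run time. The sketch below uses the approximate numbers quoted in the note. The helper name relative_overhead is hypothetical.

```python
def relative_overhead(baseline_s, profiled_s):
    """Return profiling overhead as a fraction of the unprofiled run time."""
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (profiled_s - baseline_s) / baseline_s


if __name__ == "__main__":
    baseline = 160.0            # approximate unprofiled run time in seconds
    profiled = baseline + 40.0  # approximate run time with profiling enabled
    print(f"Overhead: {relative_overhead(baseline, profiled):.0%}")  # prints "Overhead: 25%"
```

An overhead of roughly 25% is usually acceptable for a one-off profiling run. Remember to drop LINE_PROFILE=1 when you run the code for real.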
Check Your Knowledge
- Install line_profiler.
- Use line_profiler to profile the function “train” in example1.py.
- While the code is running, open a terminal and watch the nvidia-smi output.
- How was the GPU utilization?
- Which line takes the longest to run? Does this surprise you?
- What would you suggest we do to increase the efficiency of the code?