HPC Advanced/Preparation Exercise

From HPCwiki

Latest revision as of 06:24, 17 June 2026

HPC Basics Refresher — Weather Data Analyser

 ℹ️ Note: Who is this for? This is a self-paced preparation exercise for people who are attending the HPC Advanced course but did not attend HPC Basics. Working through it end to end means you arrive at the advanced course already comfortable with environment variables, virtual environments, launcher scripts, interactive jobs, and SLURM batch submission on Anunna.

The application: a small Python script that loads weather data from a Dutch weather station and draws some figures.

Source: /lustre/shared/hpcCourses/Basics/weer_vanaf_2000.py

By the end you will have:

  • Set up handy Lustre path variables ($myScratch, $myBkp, $myNoBkp) and aliases
  • Created a Python virtual environment on Lustre scratch
  • Written an executable launcher script (weather.sh)
  • Measured the job's memory requirement interactively
  • Written and submitted a SLURM batch script (weather.slurm) and collected the figures

Conventions in this document: lines starting with $ are commands you type at the shell (do not type the $). <placeholder> means substitute your own value.

Part 1 — Environment Variables & Aliases

The Anunna getLustreDir helper writes your personal Lustre directories ($myScratch, $myBkp, $myNoBkp) into your environment so you never have to type long paths again.

1.1 — Set up the variables

# Create an empty ~/.bash_aliases if you do not already have one
# (touch leaves an existing file unchanged)
$ touch ~/.bash_aliases

# Load the anunna module 
$ module load anunna

# Inspect what getLustreDir does
$ getLustreDir -h

# Append the export lines to your ~/.bash_aliases
$ getLustreDir -e >> ~/.bash_aliases

# Check the result (cat, less, more or nano all work)
$ cat ~/.bash_aliases

# Reload your environment so the new variables take effect
$ source ~/.bashrc

1.2 — Test the variables

Test 1 — does this print your scratch folder location?

$ echo $myScratch

Test 2 — does this take you to your scratch folder?

$ cd $myScratch

If both work, your variables are live.

1.3 — Bonus: convenience aliases

Add these lines to your ~/.bash_aliases and reload your environment again:

alias cds="cd $myScratch"
alias cdb="cd $myBkp"
alias cdn="cd $myNoBkp"
 ℹ️ Note: Important: the alias lines must come after the variable definitions in ~/.bash_aliases, because they reference those variables.
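
The ordering requirement follows from how the shell treats double quotes: the variable is expanded when the alias is defined, not when it is used. A minimal sketch of this behaviour, using a hypothetical variable demoDir and hypothetical alias names (none of them are part of the course setup):

```shell
# demoDir and the cds_* aliases are illustrative names only.
demoDir=""
alias cds_early="cd $demoDir"    # defined too early: body is frozen as "cd "

demoDir="/tmp"
alias cds_late="cd $demoDir"     # defined after the variable: body is "cd /tmp"
alias cds_lazy='cd "$demoDir"'   # single quotes: variable is expanded on each use

# Print the stored definitions to compare them
alias cds_early cds_late cds_lazy
```

With single quotes the position of the alias line in the file no longer matters, at the cost of the alias following any later change to the variable.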

If your aliases still do not work after source ~/.bashrc, your ~/.bashrc may not be sourcing ~/.bash_aliases at all. Add this block to the end of your ~/.bashrc:

if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi

Then source ~/.bashrc once more.
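
If you prefer not to edit ~/.bashrc by hand, the block can be appended by a small function that first checks whether the file already mentions ~/.bash_aliases. This is only a sketch: ensure_aliases_sourced is a hypothetical helper name, and the example runs against a temporary file rather than your real ~/.bashrc.

```shell
# ensure_aliases_sourced is a hypothetical helper, not an Anunna tool.
# It appends the sourcing block to the given rc file only if the file
# does not already mention .bash_aliases.
ensure_aliases_sourced() {
    rcfile=$1
    if grep -q 'bash_aliases' "$rcfile" 2>/dev/null; then
        echo "already present in $rcfile"
        return 0
    fi
    cat >> "$rcfile" <<'EOF'

if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi
EOF
    echo "block appended to $rcfile"
}

# Try it on a scratch file first rather than on the real ~/.bashrc
demo_rc=$(mktemp)
ensure_aliases_sourced "$demo_rc"   # appends the block
ensure_aliases_sourced "$demo_rc"   # second call changes nothing
```

Running the function twice leaves a single copy of the block in the file, so it is safe to repeat.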

Part 2 — Python Virtual Environment

We isolate the analyser's dependencies in a virtual environment living on Lustre scratch.

2.1 — Load the software stack and Python

$ module purge
$ module load 2024
$ module load Python/3.12.3

2.2 — Create the virtual environment (in Lustre scratch)

$ mkdir -p $myScratch/hpcCourse && cd $_
$ python -m venv $myScratch/hpcCourse/venv
$ source $myScratch/hpcCourse/venv/bin/activate
 ℹ️ Note: cd $_ reuses the last argument of the previous command — here, the directory you just created.

2.3 — Check you are using the venv's Python

$ which python      # should point inside .../hpcCourse/venv/bin
$ python -V          # should report Python 3.12.3
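
The same check can be scripted by comparing the interpreter path with the venv directory. A minimal sketch, assuming the path reported for python is not rewritten by symbolic links; check_venv_python is a hypothetical helper name introduced here for illustration:

```shell
# check_venv_python is a hypothetical helper for illustration.
# $1 = venv directory, $2 = path of the python found on PATH
check_venv_python() {
    venv_dir=$1
    py_path=$2
    case "$py_path" in
        "$venv_dir"/bin/*)
            echo "OK: $py_path is inside the venv"
            return 0 ;;
        *)
            echo "WARNING: $py_path is outside $venv_dir" >&2
            return 1 ;;
    esac
}

# Typical call once the venv is activated:
#   check_venv_python "$myScratch/hpcCourse/venv" "$(command -v python)"
```

The function returns a non-zero status when the interpreter lies outside the venv, so it can also be used as a guard in a script.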

2.4 — Install the required libraries

$ pip install matplotlib pandas datetime
$ pip freeze          # for checking what is installed

Part 3 — The Launcher Script weather.sh

Tasks

  1. Copy the Python script into your course folder:
    cp /lustre/shared/hpcCourses/Basics/weer_vanaf_2000.py $myScratch/hpcCourse/
  2. Write a bash script called weather.sh that loads the modules and activates the virtual environment you created, and make it executable.
  3. The script should end with the run command below. Hint: use variables for paths.
    python $myScratch/hpcCourse/weer_vanaf_2000.py 240
    (240 is the argument passed to the analyser: the ID of the weather station whose data is analysed.)

Example solution — weather.sh

#!/bin/bash
# weather.sh — load the environment and run the Weather Data Analyser

# Load the software stack and Python
module purge
module load 2024
module load Python/3.12.3

# Activate the virtual environment
source "$myScratch/hpcCourse/venv/bin/activate"

# Run the analyser (240 = ID of this specific weather station)
python "$myScratch/hpcCourse/weer_vanaf_2000.py" 240

Make it executable:

$ chmod +x $myScratch/hpcCourse/weather.sh
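
Because weather.sh builds every path from $myScratch, an unset variable would silently turn the paths into /hpcCourse/... and produce confusing errors. A guard along the following lines could catch that early; require_scratch is a hypothetical function name and is not part of the course material.

```shell
# require_scratch is a hypothetical guard; it is not part of the course material.
require_scratch() {
    if [ -z "${myScratch:-}" ]; then
        echo "error: myScratch is not set (was ~/.bash_aliases sourced?)" >&2
        return 1
    fi
    if [ ! -d "$myScratch" ]; then
        echo "error: $myScratch is not a directory" >&2
        return 1
    fi
}

# In weather.sh this could sit before the 'source .../activate' line:
#   require_scratch || exit 1
```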

Part 4 — Find the Job Requirements

Before submitting a batch job you need to know how much memory it needs. We start with a deliberately small allocation and increase it until the job stops crashing.

Task

Find the job requirements (e.g. memory) of the Python script using sinteractive. Increase the amount of RAM until the job no longer crashes.

$ sinteractive --mem 100M --time=0-0:10 -c 1
$ cd $myScratch/hpcCourse
$ ./weather.sh

If the job is killed (out-of-memory), exit the interactive session, re-launch sinteractive with a larger --mem value (e.g. 200M, 500M, 1G, …) and run again. Repeat until weather.sh completes cleanly. Note the smallest --mem value that works; you will use it in Part 5.
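
The try-and-increase procedure can also be written as a loop. The sketch below is an illustration, not the course's method: find_min_mem and run_with_mem are hypothetical helpers, and run_with_mem assumes that srun may be used from the login node to run one short job step with the given memory limit.

```shell
# find_min_mem and run_with_mem are hypothetical helpers for illustration.
# find_min_mem RUNNER MEM...  runs RUNNER with each memory value in turn and
# prints the first value for which RUNNER succeeds.
find_min_mem() {
    runner=$1
    shift
    for mem in "$@"; do
        echo "trying --mem=$mem" >&2
        if "$runner" "$mem" >&2; then
            echo "$mem"
            return 0
        fi
    done
    echo "no listed value was sufficient" >&2
    return 1
}

# Assumption: srun may be used from the login node to run one short job step.
run_with_mem() {
    srun --mem="$1" --time=0-0:10 -c 1 ./weather.sh
}

# Example call from $myScratch/hpcCourse:
#   find_min_mem run_with_mem 100M 200M 500M 1G
```

The loop treats any failure as "not enough memory", so check the error messages before trusting the result: a missing module or a typo in weather.sh would also make every attempt fail.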

Part 5 — Write and Submit a SLURM Script

Tasks

  1. Write a SLURM script for the job using the resource requirements you determined in Part 4. Name it weather.slurm.
  2. Submit the job with sbatch.
  3. Check the status of your job and cancel it if needed (the job is very short, so this will rarely be necessary).
  4. Check the figures generated by the Python script.
 ℹ️ Note: Tip: a SLURM template is available at /lustre/shared/hpcCourses/Basics/script_slurm.sh

Example solution — weather.slurm

#!/bin/bash
#SBATCH --job-name=weather
#SBATCH --mem=<RAM-from-Part-4>     # e.g. 500M; use the value you found in Part 4
#SBATCH --time=0-0:10
#SBATCH --cpus-per-task=1
#SBATCH --output=weather-%j.out
#SBATCH --error=weather-%j.err

cd "$myScratch/hpcCourse"
./weather.sh

Submit, monitor, and collect results

# Submit the job
$ sbatch weather.slurm

# Check the status of your jobs
$ squeue --me

# Cancel a job if needed (replace <jobid> with the ID from squeue / sbatch)
$ scancel <jobid>

# Once finished, inspect the output and the generated figures
$ ls $myScratch/hpcCourse
$ cat weather-<jobid>.out

The analyser writes its figures into your course folder — open them to confirm the run succeeded.
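
To avoid copying the job ID by hand, it can be taken from the line that sbatch prints on submission ("Submitted batch job <jobid>"). A minimal sketch with a made-up job ID; jobid_from_sbatch is a hypothetical helper name:

```shell
# jobid_from_sbatch is a hypothetical helper: it takes the line printed by
# sbatch ("Submitted batch job <jobid>") and prints only the job ID.
jobid_from_sbatch() {
    line=$1
    echo "${line##* }"    # strip everything up to and including the last space
}

# Example with a made-up job ID
jobid=$(jobid_from_sbatch "Submitted batch job 12345")
echo "$jobid"
echo "weather-$jobid.out"

# On the cluster this could be combined as:
#   jobid=$(jobid_from_sbatch "$(sbatch weather.slurm)")
#   squeue --me
#   cat "weather-$jobid.out"
```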

Checklist

  • echo $myScratch prints your scratch path and cd $myScratch works
  • cds / cdb / cdn aliases jump to the right folders
  • Virtual environment activates and which python points inside it
  • weather.sh is executable and runs the analyser interactively
  • You know the minimum --mem the job needs (from Part 4)
  • weather.slurm submits with sbatch and the figures are generated

You are now ready for the HPC Advanced course.