Python: Difference between revisions

From HPCwiki
Created page with "Python is a high-level, interpreted programming language that has gained widespread popularity for its readability, versatility, and user-friendly syntax. Created by Guido van Rossum and first released in 1991, Python was designed to emphasize code clarity and reduce the complexity often associated with other languages. Its straightforward, English-like syntax makes it a natural choice for beginners, while its power and flexibility continue to attract experienced develop..."
 
Phase 1 § 5 P1.5.5: merge Setting up Python virtualenv + Virtual environment Python 3.4+ + Using conda... into Python. Filled empty Mamba heading with Miniforge, aligned variable to $myNobackup, fixed /bin/bin typo, dropped stale venv content. (via update-page on MediaWiki MCP Server)
 
(16 intermediate revisions by one other user not shown)
Latest revision as of 14:01, 16 June 2026

Python is a high-level, interpreted programming language popular in scientific computing for its readability and its enormous ecosystem of third-party libraries — NumPy, Pandas, scikit-learn, PyTorch, TensorFlow, and many more. This page describes how to use Python on Anunna: the provided modules, how to manage your own packages with virtual environments or Miniforge, and how to expose an environment as a Jupyter kernel.

Modules

Anunna provides one Python version per module bucket, plus bundle modules that carry a curated set of common extensions. Load a bucket, then the Python module:

  • 2023: Python/3.11.3
  • 2024: Python/3.12.3

module load 2024
module load Python/3.12.3

The bundle module Python-bundle-PyPI adds many frequently-used packages on top of the base interpreter. Use module key <package> to find which bundle contains a package you need (see searching modules by keyword).
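
Once a Python or bundle module is loaded, a quick check from Python shows whether a package is already provided before you build your own environment. This is a minimal sketch using only the standard library; the helper name is_available is illustrative and not part of any Anunna tooling.

```python
import importlib.util


def is_available(package: str) -> bool:
    """Return True if the top-level `package` is importable by this interpreter."""
    # find_spec searches sys.path for the name without importing the package.
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    for name in ("numpy", "pandas", "matplotlib"):
        status = "available" if is_available(name) else "missing"
        print(f"{name}: {status}")
```

Run it with the interpreter from the module you loaded; a package reported as missing is a candidate for a virtual environment or a Miniforge environment.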

For packages not in a module, the two recommended routes are a virtual environment built on a Python module (below), or Miniforge for a self-contained conda/mamba setup. The use of Anaconda is discouraged on Anunna — its default channels carry licensing restrictions and the full distribution is heavy; Miniforge is the lighter, unrestricted alternative.

Virtual environments

A virtual environment is a self-contained directory holding a specific Python and its packages, so one project's dependencies cannot clash with another's. Python's built-in venv module is the simplest way to make one on top of a Python module.

First load the Python version you want:

module load 2024
module load Python/3.12.3

Then create the environment in a location of your choosing. The example uses $myNobackup/PythonEnv, where $myNobackup is your Lustre nobackup location, set in your ~/.bash_aliases (see Aliases and local variables). Keeping environments on Lustre rather than in your home directory avoids filling your home quota. The nobackup tier is the right choice because an environment can always be recreated from scratch and so does not need backing up.

python -m venv $myNobackup/PythonEnv/my_env
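
The same step can be scripted with the standard-library venv module, which is what python -m venv calls. The sketch below assumes you pass the target directory yourself; the function name create_env is hypothetical.

```python
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir and return its pyvenv.cfg path."""
    # with_pip=False keeps this sketch fast and offline. The command-line
    # `python -m venv` installs pip by default; use with_pip=True to match it.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return Path(env_dir) / "pyvenv.cfg"
```

The returned pyvenv.cfg file marks the directory as a virtual environment and records which Python it was created from.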

Activate it whenever you want to use it:

source $myNobackup/PythonEnv/my_env/bin/activate

Once active, the environment name appears as a prefix in your prompt:

(my_env) user001@login200:~$
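
The prompt prefix is only a visual cue. From inside Python, a script can confirm that it is running in a virtual environment; the following sketch uses the standard sys attributes, and the helper name in_virtualenv is illustrative.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```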

Install packages with pip while the environment is active; they go into the environment, not your home directory:

pip install -U numpy pandas matplotlib
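
To confirm what pip actually installed into the active environment, the standard-library importlib.metadata module can report versions. This is a small sketch; installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed for this interpreter.
        return None


if __name__ == "__main__":
    for name in ("numpy", "pandas", "matplotlib"):
        print(name, installed_version(name) or "not installed")
```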

Leave the environment with deactivate.

Miniforge (conda / mamba)

Miniforge is a minimal installer that gives you the conda and mamba package managers preconfigured to use the community conda-forge channel. mamba is a fast drop-in replacement for conda. This is the recommended way to use conda-style environments on Anunna; it avoids Anaconda's licensing restrictions.

Download and run the installer, pointing it at a location with room (your Lustre nobackup space, not your home directory):

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh -b -p $myNobackup/miniforge3

Do not run conda init on Anunna — it writes startup code into your ~/.bashrc that runs on every login and can interfere with the module system. Instead, activate Miniforge only when you need it:

source $myNobackup/miniforge3/bin/activate
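
If you are unsure whether conda init was run in the past, you can look for the block it writes. The sketch below checks a file's text for the marker comments that conda init places around its startup code; the function name is illustrative, and reading your own ~/.bashrc is left to the caller.

```python
CONDA_INIT_START = "# >>> conda initialize >>>"
CONDA_INIT_END = "# <<< conda initialize <<<"


def has_conda_init_block(bashrc_text: str) -> bool:
    """Return True if the text contains a block written by `conda init`."""
    # conda init wraps its startup code between these two marker comments.
    return CONDA_INIT_START in bashrc_text and CONDA_INIT_END in bashrc_text
```

For example, has_conda_init_block(open(os.path.expanduser("~/.bashrc")).read()) returns True when the block is present, in which case removing the lines between the two markers undoes the change.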

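Then create and use environments with mamba. If mamba activate reports that your shell has not been initialised, conda activate myenv does the same job once the activate script above has been sourced: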

mamba create -n myenv python=3.12 numpy pandas
mamba activate myenv
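
Activation exports the environment variables CONDA_DEFAULT_ENV and CONDA_PREFIX, so a script can report which conda or mamba environment it is running in. A minimal sketch, with the hypothetical helper name active_conda_env:

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the active conda/mamba environment, if any."""
    # CONDA_DEFAULT_ENV holds the environment name after activation;
    # it is unset (or empty) when no environment is active.
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    print("active environment:", active_conda_env())
    print("prefix:", os.environ.get("CONDA_PREFIX"))
```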

Jupyter kernels

To use one of your environments inside Jupyter, register it as a kernel.

From a virtual environment

A venv kernel needs a wrapper script, because the kernel is launched without your normal shell environment and so cannot load modules by itself. First, with the environment active, install ipykernel and generate the kernel:

module load 2024
module load Python/3.12.3
source $myNobackup/PythonEnv/my_env/bin/activate
pip install ipykernel
python -m ipykernel install --user --name=my_env_kernel
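
To see which kernels are registered for your user, you can read the kernel.json files directly. The sketch below assumes the per-user kernel directory used by --user installs and takes it as a parameter; list_user_kernels is an illustrative name, not a Jupyter API.

```python
import json
from pathlib import Path
from typing import Dict


def list_user_kernels(kernels_dir: Path) -> Dict[str, str]:
    """Map each kernel directory name to the display name in its kernel.json."""
    kernels: Dict[str, str] = {}
    if not kernels_dir.is_dir():
        return kernels
    for spec in sorted(kernels_dir.glob("*/kernel.json")):
        with spec.open() as handle:
            data = json.load(handle)
        # Fall back to the directory name if display_name is missing.
        kernels[spec.parent.name] = data.get("display_name", spec.parent.name)
    return kernels
```

Calling list_user_kernels(Path.home() / ".local/share/jupyter/kernels") after the install step should include my_env_kernel.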

The kernel is written to ~/.local/share/jupyter/kernels/, which Jupyter searches for kernels. On its own it will not work, because a kernel started from it cannot find the modules, so write a wrapper script that loads them. Save this as, for example, $HOME/wrap.sh:

#!/bin/bash -l

module reset
module load 2024
module load Python/3.12.3

exec $myNobackup/PythonEnv/my_env/bin/python -m ipykernel_launcher "$@"

The #!/bin/bash -l line starts a login shell, which loads Lmod and sources your ~/.bash_aliases (so $myNobackup is defined). Make the wrapper executable with chmod +x $HOME/wrap.sh.
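
If you generate wrappers from a script, the executable bit can be set from Python as well. This sketch is roughly equivalent to chmod a+x; the helper name make_executable is hypothetical.

```python
import stat
from pathlib import Path


def make_executable(path: Path) -> int:
    """Add execute permission for user, group and others; return the new mode."""
    mode = path.stat().st_mode
    # Keep the existing permission bits and add the three execute bits.
    path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path.stat().st_mode
```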

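Finally, point the kernel at the wrapper by editing ~/.local/share/jupyter/kernels/my_env_kernel/kernel.json. Give the full path to the wrapper, because Jupyter does not expand shell variables such as $HOME in argv: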

{
 "argv": [
  "/home/WUR/user001/wrap.sh",
  "-f",
  "{connection_file}"
 ],
 "display_name": "Python my_env",
 "language": "python",
 "metadata": {
  "debugger": true
 }
}

The only difference from a plain kernel file is that argv points at the wrapper script instead of the Python executable directly.
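
The edit can also be made programmatically, which avoids JSON syntax slips. The following is a sketch under the assumption that the kernel file was generated by the ipykernel install step; point_kernel_at_wrapper is an illustrative name, and both paths are supplied by the caller.

```python
import json
from pathlib import Path


def point_kernel_at_wrapper(kernel_json: Path, wrapper: str) -> dict:
    """Rewrite argv in a kernel.json so the kernel starts through `wrapper`."""
    spec = json.loads(kernel_json.read_text())
    # The wrapper forwards its arguments to ipykernel_launcher, so only the
    # connection-file arguments need to remain in argv.
    spec["argv"] = [wrapper, "-f", "{connection_file}"]
    kernel_json.write_text(json.dumps(spec, indent=1) + "\n")
    return spec
```

All other keys, such as display_name and language, are written back unchanged.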

From a conda / mamba environment

A conda or mamba environment is simpler, because the environment is self-contained and needs no wrapper script. With Miniforge active, create an environment that includes ipykernel, activate it, and register the kernel:

mamba create -y -n kernel_test python=3 ipykernel
mamba activate kernel_test
python -m ipykernel install --user --name kernel_test

To remove the kernel and environment again:

jupyter kernelspec uninstall kernel_test
mamba deactivate
mamba remove -y -n kernel_test --all

See also