Python: Difference between revisions

From HPCwiki

Latest revision as of 16:05, 27 February 2025

Python is a high-level, interpreted programming language that has gained widespread popularity for its readability, versatility, and user-friendly syntax. Created by Guido van Rossum and first released in 1991, Python was designed to emphasize code clarity and reduce the complexity often associated with other languages. Its straightforward, English-like syntax makes it a natural choice for beginners, while its power and flexibility continue to attract experienced developers in numerous industries.

One of Python’s greatest strengths lies in its extensive standard library, which provides built-in modules and functions for tasks ranging from file manipulation to internet protocols. Additionally, a thriving open-source community has developed countless third-party libraries and frameworks, making Python suitable for everything from data analysis and machine learning to web development and automation. Popular libraries like NumPy, Pandas, and TensorFlow enable developers to handle massive datasets, train artificial intelligence models, and build sophisticated applications with relative ease.

Anunna offers environment modules for a single version of Python in each bucket. On top of the standard environment modules, bundle modules are available for more specific use cases.

Users can also install their own Python distributions, such as Mamba. The use of Anaconda, however, is discouraged.

Modules

Each yearly module bucket comes with one version of Python installed:

  • 2023: Python/3.11.3
  • 2024: Python/3.12.3

Additionally, bundle modules containing Python extensions are installed:

  • Python-bundle-PyPI

Mamba

Creating Custom Virtual Environments from Modules

The HPC team is aware that users may need to build custom Python environments for their jobs. These environments can be based on the Python environment modules provided by Anunna.

Python virtual environments are self-contained directories that house a specific Python installation and its associated packages, ensuring that one project’s dependencies don’t clash with another’s. They isolate your software requirements, letting you manage different versions of libraries or modules in separate environments. This prevents version conflicts and keeps your system’s base Python environment clean. Tools like ‘venv’ and ‘virtualenv’ simplify creating and activating these environments, making it straightforward to switch between multiple projects.


  • Load the module of the desired Python version
  • Create a virtual environment folder
  • Activate virtual environment
  • Install desired packages with Pip


First, load the module for the desired Python version. The example below uses Python/3.12.3, available in the 2024 bucket.

module load 2024
module load Python/3.12.3

Once the Python module is loaded, the environment can be created. The environment is stored in a folder at a location of your choosing. The environment folder itself does not need to exist beforehand, but this example assumes that the parent location $MYBKP/PythonEnv already exists. Note that $MYBKP refers to the Lustre backup location defined in your ~/.bash_aliases file (see our entry on Aliases and local variables).

python -m venv $MYBKP/PythonEnv/my_env

Once created, the virtual environment needs to be activated in order to be used or modified.

source $MYBKP/PythonEnv/my_env/bin/activate

After the environment has been activated, you should see the name of the environment as a prefix of your prompt.

(my_env) user001@login200:$ 

While the virtual environment is active, you can use pip to install packages directly into your environment:

pip install -U numpy pandas matplotlib

The installed packages are stored at $MYBKP/PythonEnv/my_env/lib/python3.12/site-packages/.
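
To confirm which environment a shell is currently using, one option is to inspect the VIRTUAL_ENV variable, which the activate script exports. The sketch below is only an illustration; the helper name venv_status is hypothetical and not part of any Anunna tooling.

```shell
#!/bin/sh
# Hypothetical helper: report whether a virtual environment is active.
# Relies on VIRTUAL_ENV, which bin/activate exports and deactivate unsets.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: ${VIRTUAL_ENV}"
    else
        echo "no virtual environment active"
    fi
}

venv_status
```

Running deactivate leaves the environment; calling the helper afterwards should then report that no environment is active.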


Note

Creating Jupyter kernels from virtual environments

Users often need a custom environment in Jupyter. Virtual environments make this possible. Assuming the virtual environment from the previous section is used, the steps are:

  • Load the required modules
  • Activate the environment
  • Install the ipykernel package
  • Generate the kernel
  • Write a wrapper script
  • Make the wrapper script executable
  • Edit the kernel file

First, load the required modules.


module load 2024
module load Python/3.12.3


Then activate the environment.

source $MYBKP/PythonEnv/my_env/bin/activate

Install the ipykernel package

pip install ipykernel

Run the command below to generate a kernel. Enter the desired name for the kernel with the flag --name

python -m ipykernel install --user --name=my_env_kernel_name

The kernel is written to your home folder, under ~/.local/share/jupyter/kernels/. Jupyter searches this location, so your custom kernel should appear among the available kernel options.
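
A quick way to confirm that the kernel spec was written is to look for its kernel.json file. The sketch below assumes the default per-user location mentioned above; the function name kernel_spec_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the path of a user-level kernel spec,
# or fail with a message if the kernel.json file is missing.
kernel_spec_path() (
    name="$1"
    spec="${HOME}/.local/share/jupyter/kernels/${name}/kernel.json"
    if [ -f "$spec" ]; then
        echo "$spec"
    else
        echo "no kernel spec at ${spec}" >&2
        exit 1
    fi
)

# Example call (kernel name taken from the command above):
# kernel_spec_path my_env_kernel_name
```

If the jupyter command is available in the current shell, jupyter kernelspec list gives a similar overview of every kernel Jupyter can see.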

As it stands, the kernel will not work in Jupyter, because the environment modules it depends on are not loaded when Jupyter starts it. A workaround is to write a wrapper script that loads the modules before launching the kernel.


#!/bin/bash
source /etc/bash.bashrc
#modules to load
module load 2024
module load Python/3.12.3
# wrapper line
# exec <pathToVirtualEnvPythonExecutable> -m ipykernel_launcher "$@"
exec /lustre/scratch/<AFFILIATION>/<GROUP>/user001/PythonEnv/my_env/bin/python -m ipykernel_launcher "$@"

The script above initializes Lmod by sourcing /etc/bash.bashrc, then loads the modules required by the environment. Finally, it executes ipykernel_launcher with the Python interpreter of the virtual environment. In this example, the script is saved as /home/WUR/user001/wrap.sh. Make sure the wrapper script is executable.
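
Making the wrapper executable can be done with chmod. The sketch below wraps that step in a small function that also checks the result; the function name make_wrapper_executable is hypothetical, and the path in the example call is the one assumed in the text above.

```shell
#!/bin/sh
# Hypothetical helper: give the owner execute permission on the wrapper
# and confirm that the file is now executable.
make_wrapper_executable() (
    wrapper="$1"
    if [ ! -f "$wrapper" ]; then
        echo "wrapper not found: ${wrapper}" >&2
        exit 1
    fi
    chmod u+x "$wrapper" && [ -x "$wrapper" ]
)

# Example call (path taken from the text above):
# make_wrapper_executable /home/WUR/user001/wrap.sh
```

Only the owner's execute bit is set here, which should be enough when Jupyter runs under your own account.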

The last step is to edit the kernel.json file of the Jupyter kernel created above, located at ~/.local/share/jupyter/kernels/my_env_kernel_name/kernel.json, so that it looks like this:

{
 "argv": [
  "/home/WUR/user001/wrap.sh",
  "-f",
  "{connection_file}"
 ],
 "display_name": "Python modTest",
 "language": "python",
 "metadata": {
  "debugger": true
 }
}

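This kernel file differs from a vanilla one in its argv entry: the path to the wrapper script takes the place of the Python interpreter path and the -m ipykernel_launcher arguments that a generated kernel file typically contains. Because the wrapper loads the modules before starting the kernel, the kernel should now be able to run in Jupyter.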
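
Before opening Jupyter, it can help to check that the edited kernel file points at an executable launcher. The sketch below is a rough text match rather than a real JSON parser: it assumes one argv element per line, as in the example above, and the function name check_kernel_launcher is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first absolute path listed in "argv"
# and fail unless that file is executable.
# Assumes one argv element per line; this is a text match, not a JSON parser.
check_kernel_launcher() (
    spec="$1"
    # Keep only the argv array, then take the first quoted absolute path.
    launcher=$(sed -n '/"argv"/,/]/p' "$spec" \
        | sed -n 's|^[[:space:]]*"\(/[^"]*\)".*|\1|p' \
        | head -n 1)
    if [ -z "$launcher" ]; then
        echo "no absolute path found in argv of ${spec}" >&2
        exit 1
    fi
    echo "$launcher"
    [ -x "$launcher" ]
)

# Example call (path taken from the text above):
# check_kernel_launcher ~/.local/share/jupyter/kernels/my_env_kernel_name/kernel.json
```

A non-zero exit status suggests that either the path in argv is wrong or the wrapper script is not yet executable.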