Setting up Python virtualenv

From HPCwiki
Latest revision as of 10:02, 16 June 2023

With many Python packages available, which often conflict or require different versions depending on the application, installing and controlling packages and versions is not always easy. In addition, many packages are used only occasionally, so it is questionable whether a system administrator of a centralized server system or a High Performance Compute (HPC) infrastructure can be expected to resolve all issues posed by users of the infrastructure. Even on a local system with full administrative rights, managing versions, dependencies, and package collisions is often very difficult. The solution is to use a virtual environment, in which a specific set of packages can be installed. As many virtual environments as necessary can be created and used side by side.

NOTE: as of Python 3.3, virtual environment support is built in. See the page "virtual_environment_Python_3.4_or_higher" for an alternative set-up of your virtual environment if using Python 3.4 or higher.

Creating a new virtual environment

It is assumed that the appropriate virtualenv executable for the Python version of choice is installed. A new virtual environment, in this case called newenv, is created like so:

module load python/my-favourite-version   # e.g. 2.7.12
virtualenv newenv

or, for Python 3.4 and higher, with the built-in venv module (the older pyvenv script was deprecated in Python 3.6 and removed in Python 3.8):

python3 -m venv newenv

When the new environment is created, one will see a message similar to this:

  New python executable in newenv/bin/python3
  Also creating executable in newenv/bin/python
  Installing Setuptools.........................................................................done.
  Installing Pip................................................................................done.
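
The same step can also be scripted from Python itself. The following is a minimal sketch, assuming Python 3.3 or higher with the standard-library venv module available; the function name create_environment is hypothetical and chosen only for this illustration.

```python
import venv
from pathlib import Path


def create_environment(path, with_pip=True):
    """Create a virtual environment at `path` and return its directory.

    Hypothetical helper for illustration; it wraps the standard-library
    class that also backs `python3 -m venv`.
    """
    env_dir = Path(path).expanduser().resolve()
    # with_pip=True asks the builder to bootstrap pip into the new
    # environment (this relies on the ensurepip module being present).
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(str(env_dir))
    return env_dir


# Example call, creating ./newenv relative to the working directory:
#   create_environment("newenv")
```

The resulting directory contains a pyvenv.cfg file and a bin folder with the activate script, so it can be activated in the same way as an environment made on the command line.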

Activating a virtual environment

Once the environment is created, activate it each time you want to use it by issuing the following command:

source newenv/bin/activate

This assumes that the folder containing the virtual environment files (in this case called newenv) is in the present working directory. While the virtual environment is active, its name is shown in parentheses in front of the user-host-prompt string.

  (newenv)user@host:~$
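
Besides the prompt, the interpreter itself can report whether it runs from a virtual environment. The sketch below is one common way to check this, not an official recipe: it assumes that environments made by the built-in venv module and by recent virtualenv releases set sys.base_prefix, and that older virtualenv releases set sys.real_prefix instead. The function name is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases mark the environment with sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv keep the original installation in
    # sys.base_prefix; outside an environment it equals sys.prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


# Example: print(in_virtual_environment(), sys.prefix)
```

When the environment is active, sys.prefix points at the environment folder (here newenv).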

Installing modules on the virtual environment

Installing modules works the same as usual. The difference is that the modules end up in /path/to/virtenv/lib, which may be located somewhere in your home directory. When working from the virtual environment, the default pip belongs to the Python version that is currently active. This is because the executables in /path/to/virtenv/bin come first in the $PATH.

pip install numpy
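
To confirm that a module is really picked up from the virtual environment, its location can be looked up without importing it. This is a small sketch using the standard-library importlib; the helper name module_location is an assumption made for this example.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be loaded from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For regular packages this is the path of the package's __init__.py.
    return spec.origin


# Example: print(module_location("numpy"))
```

Inside an active environment with numpy installed, the returned path is expected to start with /path/to/virtenv/lib.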

Similarly, installing packages from source works exactly the same as usual.

python setup.py install

Deactivating a virtual environment

A virtual environment can be left with the command deactivate, which was defined when the activate script was sourced.

deactivate

Virtualenv kernels in Jupyter

Want your own virtualenv kernel in a notebook? This can be done by making your own kernel specifications:

(An alternative to this manual approach, using conda, is described on the page "Using conda to install a new kernel into your notebook".)

  • Make sure you have the ipykernel module in your venv. Activate it and pip install it:
source ~/path/to/my/virtualenv/bin/activate && pip install ipykernel
  • Create the following directory path in your homedir if it doesn't already exist:
mkdir -p ~/.local/share/jupyter/kernels/
  • Think of a nice descriptive name that doesn't clash with one of the already present kernels. I'll use 'testing'. Create this folder:
mkdir ~/.local/share/jupyter/kernels/testing/
  • Create the file kernel.json in this folder (here opened with vi) with the following content:
vi ~/.local/share/jupyter/kernels/testing/kernel.json
{
 "language": "python",
 "argv": [
  "/home/myhome/path/to/my/virtualenv/bin/python",
  "-m",
  "ipykernel",
  "-f",
  "{connection_file}"
 ],
 "display_name": "testing"
}
  • Reload the JupyterHub page. The 'testing' kernel should now appear in your kernels list.
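
The steps above can also be carried out with a short Python script instead of editing the file by hand. This is a sketch under stated assumptions: the function name write_kernel_spec and its parameters are hypothetical, and the default kernels directory is the per-user path used in the steps above.

```python
import json
from pathlib import Path


def write_kernel_spec(name, python_path, display_name=None, env=None, kernels_dir=None):
    """Write a kernel.json for the given interpreter and return the file path."""
    if kernels_dir is None:
        # Per-user kernel directory, as used in the manual steps.
        kernels_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    kernel_dir = Path(kernels_dir) / name
    kernel_dir.mkdir(parents=True, exist_ok=True)

    spec = {
        "language": "python",
        # {connection_file} is a literal placeholder that Jupyter fills in.
        "argv": [str(python_path), "-m", "ipykernel", "-f", "{connection_file}"],
        "display_name": display_name or name,
    }
    if env:
        # Optional environment variables for the kernel process.
        spec["env"] = dict(env)

    spec_file = kernel_dir / "kernel.json"
    spec_file.write_text(json.dumps(spec, indent=1) + "\n")
    return spec_file


# Example call with the placeholder path from the steps above:
#   write_kernel_spec("testing", "/home/myhome/path/to/my/virtualenv/bin/python")
```

The ipykernel package also provides its own installer, python -m ipykernel install --user --name testing, which writes a comparable specification when run from the activated environment.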

You can do more complex things with this, such as constructing your own Spark environment. This relies on having the findspark module installed. The kernel file then looks like this:

vi ~/.local/share/jupyter/kernels/mysparkkernel/kernel.json
{
 "language": "python",
 "env": {
   "SPARK_HOME":
     "/shared/apps/spark/my-spark-version"
 },
 "argv": [
  "/home/myhome/my/spark/venv/bin/python",
  "-m",
  "ipykernel",
  "-c", "import findspark; findspark.init()",
  "-f",
  "{connection_file}"
 ],
 "display_name": "My Spark kernel"
}

(You'll want to make sure your Spark cluster has the same environment: start it after activating this venv inside your sbatch script.)

Make IPython work under virtualenv

IPython may not work initially under a virtual environment. It may produce an error message like the one below:

    File "/usr/bin/ipython", line 11
    print "Could not start qtconsole. Please install ipython-qtconsole"
                                                                      ^

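The message shows that the system-wide /usr/bin/ipython script, which uses the Python 2 print statement, is being picked up instead of an executable from the virtual environment. This can be resolved by adding a soft link with the name ipython to the bin directory in the virtual environment folder.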

ln -s /path/to/virtenv/bin/ipython3 /path/to/virtenv/bin/ipython
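
Whether the link has the intended effect can be checked by asking which ipython is found first on the search path. The following is a hedged sketch: the helper name resolves_inside_env and its parameters are hypothetical, and it only compares paths without starting IPython.

```python
import os
import shutil
import sys


def resolves_inside_env(command, env_prefix=None, search_path=None):
    """Return True if `command` is found on the search path inside the environment folder."""
    if env_prefix is None:
        # Folder of the currently running interpreter's environment.
        env_prefix = sys.prefix
    # With search_path=None, shutil.which uses the PATH environment variable.
    found = shutil.which(command, path=search_path)
    if found is None:
        return False
    env_root = os.path.abspath(env_prefix) + os.sep
    # Symbolic links are deliberately not resolved here, so the location of
    # the link itself is what gets compared with the environment folder.
    return os.path.abspath(found).startswith(env_root)


# Example: print(resolves_inside_env("ipython"))
```

Run from the activated environment, a result of False suggests that a system-wide ipython is still the first one found.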

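External links

  • Python3 documentation for virtualenv: https://pypi.python.org/pypi/virtualenv
  • Solving the IPython hiccup under virtual environment: http://cemcfarland.wordpress.com/2013/03/09/getting-ipython3-working-inside-your-virtualenv/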