Python Virtual Environments
When working on a Python project, you may have to install some Python packages. You can do that with a command called pip (package installer for Python). Here's a link to the documentation on pip.
Pip is a command that looks online at the Python Package Index, finds the package you want to install, and then installs it on your local disk. But where? By default, it will install it in a "system" directory.
On the CS server, and in general whenever you are working on a system, you might not have administrator privilege to install Python packages in system directories.
Furthermore, you might need a different set of packages for different projects. Even worse, the set of packages in one project might be incompatible with the set of packages in another. For example, they both might need the orange package (used for data mining), but one might need version 3.12 and the other version 3.9. What to do?
Virtual Environments to the rescue!
Virtual environments are a really smart feature of the Python infrastructure that allows:
- Non-root users to install python packages/modules
- Different projects to use different Python packages/modules
- A project to bundle itself with its various modules in a way that makes it easy to copy to another computer.
That is, each virtualenv is a place where we can put the Python modules and packages that belong to a particular project. Different projects can have different sets of modules, and each project is completely independent.
I found this Virtual Environment Primer that looks quite good, if you'd like something more thorough.
Virtualenv Concepts
To understand virtualenv, it helps to know a couple of concepts first:
- When Python imports a module, it searches for it on a list of directories called the PYTHONPATH.
- The list of directories is usually stored in an environment variable of that name.
- That environment variable is modifiable, often by source-ing some shell commands.
- The PYTHONPATH is read by programs like python and pip.
- The pip program installs Python modules from various internet sites. It installs the module into a directory in your PYTHONPATH, from which Python will import it. Which brings us full-circle.
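If you'd like to see the list of directories for yourself, you can ask Python. Here is a minimal sketch using only the standard library. Inside a running interpreter, the search list is available as sys.path, and the PYTHONPATH environment variable (if it is set) contributes extra entries to that list. The function names are made up for this illustration.

```python
import os
import sys


def show_search_path():
    """Return the directories Python searches on import, in order."""
    # sys.path is the list the running interpreter actually consults.
    return list(sys.path)


def pythonpath_entries(environ=os.environ):
    """Return the directories named in PYTHONPATH, or [] if it is unset."""
    raw = environ.get("PYTHONPATH", "")
    # Entries are separated by ':' on Unix and ';' on Windows (os.pathsep).
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    for directory in show_search_path():
        print(directory)
    print("PYTHONPATH adds:", pythonpath_entries())
```

Try running it before and after activating a virtual environment and compare the two lists.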
Don't get confused: a virtual environment is just a folder (directory) that has some pre-installed stuff in it, including some pre-installed Python packages. Creating a virtual environment just means creating a folder and installing some stuff in it. Since you own the folder, you can install additional stuff (Python modules) into it.
Creating a Virtual Environment
Historically, there was (and still is) a command called virtualenv which creates a virtual environment. But nowadays, the recommended practice is to use the Python command, along with a venv module specified on the command line, to create the virtual environment. For example:
$ python3.12 -m venv foo_env
will create a folder called foo_env and install some stuff into it. The nice thing about this way of creating a virtual environment is that you can be really clear about what version of Python is installed into that environment. The command above creates a virtual environment that uses Python version 3.12, probably for a project called foo. Another project, called bar, that uses the older 3.9 version of Python, might be created like this:
$ python3.9 -m venv bar_env
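The same venv module can also be called from Python code, which shows that creating an environment really is just creating a folder. This is a sketch, not the recommended workflow: the helper name make_env is hypothetical, and it passes with_pip=False so that it runs quickly and offline, whereas the command-line form also installs pip into the new environment.

```python
import os
import tempfile
import venv
from pathlib import Path


def make_env(env_dir):
    """Create a virtual environment in env_dir and return its path."""
    # with_pip=False keeps this sketch fast and offline; the command-line
    # form `python -m venv` also installs pip into the new environment.
    # Using symlinks mirrors the command-line default on Unix-like systems.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return Path(env_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        env = make_env(Path(scratch) / "foo_env")
        # pyvenv.cfg records which Python the environment was built from.
        print((env / "pyvenv.cfg").read_text())
```

The environment uses whichever Python runs the script, which is the same idea as choosing python3.12 or python3.9 on the command line.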
Activating a Virtual Environment
Remember, you might have several virtual environments, each one for a different project, and so you activate a virtual environment when you're ready to use that set of packages. The commands to activate a virtual environment are stored in a file inside the environment folder, so, when you are ready to work on the foo project, you would do:
$ source foo_env/bin/activate
If the activation is successful, it modifies your prompt by putting the name of the venv folder in the prompt, so that you can be reminded that this shell has an active virtual environment and which one it is. Like this:
(foo_env) $ python
The command above will run Python 3.12 and will load Python packages from the foo_env folder.
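Besides looking at the prompt, you can ask Python itself whether it is running from a virtual environment. A small sketch, with a made-up function name: inside a venv, sys.prefix points at the environment folder, while sys.base_prefix still points at the Python installation the environment was created from.

```python
import sys


def in_virtual_env():
    """Return True if this interpreter is running from a venv."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("executable:", sys.executable)
    print("in a virtual environment:", in_virtual_env())
```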
Installing Python Modules using pip
There's also a command called pip, which installs Python packages from the Internet, typically from the Python Package Index at pypi.org.
The command downloads the software and any dependencies (other packages that the package depends on) and installs them into the active virtual environment. So, use it after you do the activate command, above. The command might look like this:
(foo_env) $ python -m pip install some_package
There's also a shorthand, which I always use:
(foo_env) $ pip install some_package
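The longer form has one advantage: it makes explicit which Python the package is installed for. The sketch below builds the same command from inside Python, using sys.executable (the interpreter that is currently running). The function names are hypothetical, and some_package is the same placeholder as above.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the argument list for installing one package."""
    # sys.executable is the Python currently running, so the package
    # lands in that interpreter's environment.
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(package), check=True)


if __name__ == "__main__":
    # Printed rather than run, so the sketch has no side effects.
    print(" ".join(pip_install_command("some_package")))
```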
Deactivating
When you are done with a virtual environment, you can deactivate it:
(foo_env) $ deactivate
$
Example
Here's an example of creating a virtual environment and some of the subfolders it creates.
$ mkdir foo
$ cd foo
$ python3.12 -m venv foo_venv
$ ls
foo_venv/
$ ls foo_venv
bin/ include/ lib/ local/
$ ls foo_venv/bin
activate easy_install pip pip3.12 python python3.12
$ ls foo_venv/lib/
python3.12/
$ ls foo_venv/lib/python3.12
... site-packages/
$ ls foo_venv/lib/python3.12/site-packages/
We can depict a subset of the directory tree like this:
foo/
    foo_venv/
        bin/
            activate
            pip
            python
        lib/
            python3.12/
                site-packages/
It's that last place (site-packages) where pip installs packages, and where python reads them, once you activate the virtual environment. (Remember that activating the virtual environment modifies the shell's prompt, to remind you that you are in one.)
$ source foo_venv/bin/activate
(foo_venv) $ pip install pymysql
(foo_venv) $ ls foo_venv/lib/python3.12/site-packages/
... pymysql ...
Now, Python can import the PyMySQL package. (Usually, the name you give to pip is the same as the name you import, but PyMySQL decided to be different in capitalization: the project is named PyMySQL, while the module you import is pymysql.)
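You can also confirm from inside Python where packages are installed and where a particular module would be loaded from. This is a sketch using the standard library's sysconfig and importlib modules; the function names are made up. In an active virtual environment, the first function should report the environment's site-packages folder.

```python
import importlib.util
import sysconfig


def site_packages_dir():
    """Return this interpreter's default directory for pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


def module_location(name):
    """Return where a top-level module would be loaded from, or None."""
    # find_spec searches the import path without actually importing.
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print("site-packages:", site_packages_dir())
    # After `pip install pymysql`, this should point inside site-packages.
    print("pymysql:", module_location("pymysql"))
```

If the second line prints None, the module is not installed in the environment you are currently using.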
Pros and Cons of Virtualenv
- Pro: you don't need to be root (be able to use sudo) to install Python packages. Instead, you can install them to directories that you own and control.
- Pro: each virtualenv is independent, so different virtualenvs can have different, even conflicting, sets of Python packages.
- Con: because each virtualenv is independent, you have to (re-)install packages in each one. (It's possible to copy a virtualenv to another location, but such operations are considered fragile and therefore are discouraged.) Fortunately, there are tools that make this easier.
- Con: the pathnames embedded in venv/bin/activate are absolute pathnames, so the virtualenv is not (generally) portable and relocatable, even on the same machine. You can't just mv it to another place. (There are tricks, but we'll learn a different way to copy a project.)
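To see the last point for yourself, you can create a throwaway environment and look for its own absolute pathname inside the activate script. This is only a sketch: the function names are hypothetical, and the exact contents of the activate script vary between Python versions, so the check simply searches for the pathname anywhere in the file.

```python
import os
import tempfile
import venv
from pathlib import Path


def activate_script(env_dir):
    """Return the path of the shell activate script inside env_dir."""
    # Windows puts scripts under Scripts/, Unix-like systems under bin/.
    subdir = "Scripts" if os.name == "nt" else "bin"
    return Path(env_dir) / subdir / "activate"


def embeds_absolute_path(env_dir):
    """Report whether the activate script mentions env_dir's absolute path."""
    text = activate_script(env_dir).read_text(encoding="utf-8")
    return os.path.abspath(env_dir) in text


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        env = Path(scratch) / "foo_env"
        venv.create(env, with_pip=False)
        print(embeds_absolute_path(env))
```

Because that pathname is written into the file when the environment is created, moving the folder leaves the script pointing at the old location.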
That concludes the basic idea and usage of virtual environments. The rest of this reading has some practical tips and related information.
Appendix
Most of the time, you can infer from the prompt exactly what virtual environment you are in. But if it ever gets confusing, the following command will give you the complete, absolute pathname:
(foo_env) $ printenv VIRTUAL_ENV
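The same environment variable can be read from a Python program. A minimal sketch, with a made-up function name; it takes the environment as a parameter so that it is easy to try with sample values.

```python
import os


def active_env(environ=os.environ):
    """Return the active virtual environment's folder, or None."""
    # The activate script exports VIRTUAL_ENV; deactivate unsets it.
    return environ.get("VIRTUAL_ENV")


if __name__ == "__main__":
    print(active_env())
```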
The source Command
You'll notice that we activate a virtual environment by using a source command. What's that and how is it related to the MySQL source command?
The Unix source command is the older one. Indeed it is likely the ancestor of all source commands (the ur-source command). The source command means:
there are some commands in the named file. Read that file and execute them.
You can see that the MySQL version of source means almost exactly the same thing:
there is some code in the named file. Read that file and execute it.
The only difference is the kind of code in the file. If you encounter that command in other languages and situations, it's a decent bet that it means the same thing.
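Why do we source the activate file rather than just running it as a script? A script normally runs as a separate process, and a separate process gets its own copy of the environment variables, so any changes it makes disappear when it exits. The sketch below illustrates that general fact using two Python processes instead of shells; the function and variable names are made up.

```python
import os
import subprocess
import sys


def child_can_change_parent_env(name="DEMO_SOURCE_VAR"):
    """Set a variable in a child process; report whether the parent sees it."""
    os.environ.pop(name, None)  # start from a clean slate
    child_code = f"import os; os.environ[{name!r}] = 'set-by-child'"
    # The child is a separate process with its own copy of the environment.
    subprocess.run([sys.executable, "-c", child_code], check=True)
    return name in os.environ


if __name__ == "__main__":
    print(child_can_change_parent_env())
```

The parent never sees the child's change. Using source avoids the problem, because the commands in the file are executed by your current shell itself.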
Source Pathnames
When we refer to a file in the source command, we can either give a relative pathname or an absolute pathname. We learned about both kinds of pathnames when we learned about Unix.
If you are in your ~/cs304/ folder, you can do:
source venv/bin/activate
That's a relative pathname and is the shortest pathname I can suggest. With tab completion at the end, it's not hard to type.
If you are in a folder that is a sibling of your venv folder, you can use a relative pathname like this:
source ../venv/bin/activate
A little more to type, but beats having to move around using cd, like this:
cd ..
source venv/bin/activate
cd folder_you_were_in
Finally, you could instead use an absolute pathname by starting with your home directory in the pathname. The following command will work from anywhere:
source ~/cs304/venv/bin/activate
Again, a bit more to type than the minimal version, but only a couple of characters, and you avoid having to cd to your home directory and cd back to where you want to work.
I will try to remember to always use this absolute pathname in directions and examples, but you should keep it in mind for your own work. I suggest that you use the command with the tilde.
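To check that the three pathnames above really name the same file, here is a small sketch that resolves each one the way the shell would. The function name is made up, and the home directory /home/you is a hypothetical stand-in for your own.

```python
import posixpath


def resolve(current_dir, pathname, home="/home/you"):
    """Turn a relative, absolute, or ~ pathname into an absolute one."""
    if pathname == "~" or pathname.startswith("~/"):
        # The shell expands a leading tilde to the home directory.
        pathname = home + pathname[1:]
    # join() ignores current_dir when pathname is already absolute.
    return posixpath.normpath(posixpath.join(current_dir, pathname))


if __name__ == "__main__":
    print(resolve("/home/you/cs304", "venv/bin/activate"))
    print(resolve("/home/you/cs304/project", "../venv/bin/activate"))
    print(resolve("/tmp", "~/cs304/venv/bin/activate"))
```

All three calls give the same absolute pathname, and only the last one works no matter which folder you start in.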