python: bare "pip install" no longer works in container images by default
When creating container images where some components use Python, it is not unusual to use pip install
to add packages that are missing from the distro package repositories, or present but outdated.
With recent(-ish) alpine:edge containers this has ceased to work:
$ podman run -it alpine:edge
/ # apk add py3-pip
(1/17) Installing libbz2 (1.0.8-r4)
(2/17) Installing libexpat (2.5.0-r0)
(3/17) Installing libffi (3.4.4-r0)
(4/17) Installing gdbm (1.23-r0)
(5/17) Installing xz-libs (5.4.1-r0)
(6/17) Installing libgcc (12.2.1_git20220924-r9)
(7/17) Installing libstdc++ (12.2.1_git20220924-r9)
(8/17) Installing mpdecimal (2.5.1-r1)
(9/17) Installing ncurses-terminfo-base (6.4_p20230128-r0)
(10/17) Installing ncurses-libs (6.4_p20230128-r0)
(11/17) Installing readline (8.2.0-r0)
(12/17) Installing sqlite-libs (3.40.1-r0)
(13/17) Installing python3 (3.11.1-r3)
(14/17) Installing py3-parsing (3.0.9-r1)
(15/17) Installing py3-packaging (23.0-r0)
(16/17) Installing py3-setuptools (67.2.0-r0)
(17/17) Installing py3-pip (23.0-r1)
Executing busybox-1.35.0-r27.trigger
OK: 99 MiB in 32 packages
/ # pip install flake8
error: externally-managed-environment
× This environment is externally managed
╰─>
The system-wide Python installation in Alpine is managed
by using the system package manager (apk).
It appears you are installing to this system-wide location using
a different package manager. Please use a virtualenv instead:
python3 -m venv /path/to/venv
. /path/to/venv/bin/activate
pip install mypackage
note: If you believe this is a mistake, please contact your Python installation or OS distribution provider.
hint: See PEP 668 for the detailed specification.
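For reference, my reading of PEP 668 is that the check behind this error is roughly: refuse to install if the interpreter is not in a virtualenv and a marker file named EXTERNALLY-MANAGED sits in the stdlib directory. A minimal Python sketch of that logic follows. The function names are mine, invented for illustration, and this is an approximation of the rule as I understand it, not pip's actual code.

```python
import os
import sys
import sysconfig


def marker_path():
    # PEP 668 places the marker file in the interpreter's stdlib directory,
    # e.g. /usr/lib/python3.11/EXTERNALLY-MANAGED on a typical distro layout.
    return os.path.join(sysconfig.get_path("stdlib"), "EXTERNALLY-MANAGED")


def in_virtualenv():
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the base installation.
    return sys.prefix != sys.base_prefix


def is_externally_managed():
    # The restriction only applies outside a virtualenv, and only when the
    # distro has shipped the marker file.
    return not in_virtualenv() and os.path.isfile(marker_path())


if __name__ == "__main__":
    print("marker file:", marker_path())
    print("in virtualenv:", in_virtualenv())
    print("externally managed:", is_externally_managed())
```

I would expect this to report the environment as externally managed when run with the system python3 in the container above, and not when run with the python3 from a venv, but I have not verified that on every image.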
I understand the rationale for using a virtualenv in a persistent & precious OS install on bare metal or a virtual machine, as it limits possible bad interactions between pip and the native package manager when doing updates.
When installing python stuff in a container though this is far less compelling IMHO. It is typical to simply re-create the container image from scratch each time, rather than live updating it. IOW, the container images are typically stateless and disposable, avoiding the problems that the use of virtualenvs aims to solve.
The use of virtualenvs is not without cost. Dockerfiles now need modifying to activate the virtualenv for any RUN
command that might use the Python code (directly or indirectly) from the venv, and need ENV
statements added to activate the virtualenv when the container is later launched. Apps run in the container need to be careful to preserve the env variables whenever exec'ing any external program, in case it happens to use a Python module from the venv.
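To illustrate the last point, here is a small Python sketch of an app that launches a child process. It assumes the venv is "activated" purely through the VIRTUAL_ENV and PATH variables, uses the same /path/to/venv placeholder as the rest of this report, and the helper names are hypothetical. A child started with the inherited environment still sees the venv; one started with a scrubbed environment does not.

```python
import os
import subprocess
import sys

VENV = "/path/to/venv"  # placeholder path, not a real location


def venv_env(base=None):
    # Copy the given environment (default: our own) and "activate" the venv
    # in it by setting VIRTUAL_ENV and prepending the venv's bin dir to PATH.
    env = dict(os.environ if base is None else base)
    env["VIRTUAL_ENV"] = VENV
    env["PATH"] = os.path.join(VENV, "bin") + os.pathsep + env.get("PATH", "")
    return env


def scrubbed_env():
    # What a careless app might do: reset PATH to the platform default and
    # drop VIRTUAL_ENV before exec'ing something else.
    env = dict(os.environ)
    env.pop("VIRTUAL_ENV", None)
    env["PATH"] = os.defpath
    return env


def child_sees_venv(env):
    # Ask a child interpreter what VIRTUAL_ENV it inherited.
    code = "import os; print(os.environ.get('VIRTUAL_ENV', ''))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == VENV


if __name__ == "__main__":
    print("inherited env:", child_sees_venv(venv_env()))
    print("scrubbed env:", child_sees_venv(scrubbed_env()))
```

Every program in the image that spawns helpers has to take the first path rather than the second, which is the kind of extra care a plain system-wide pip install never needed.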
IOW, it has gone from a Dockerfile that can do
FROM alpine:edge
RUN apk add py3-pip
RUN pip install mypackage
to one that AFAICT either has to do
FROM alpine:edge
RUN apk add py3-pip
RUN python3 -m venv /path/to/venv
RUN . /path/to/venv/bin/activate && \
pip install mypackage
# Prepend the venv so its python3 and pip shadow the system ones
ENV VIRTUAL_ENV=/path/to/venv
ENV PATH=/path/to/venv/bin:$PATH
or has to disable the undesired restriction
FROM alpine:edge
RUN apk add py3-pip
RUN rm -f /usr/lib/python3.*/EXTERNALLY-MANAGED
RUN pip install mypackage
I don't know if this container inconvenience was already evaluated and discussed somewhere. I presume that Alpine is simply following the PEP 668 recommendations for containers (https://peps.python.org/pep-0668/#keep-the-marker-file-in-container-images). If this container scenario was not anticipated, however, then I'd suggest this change be re-evaluated, with a view to potentially limiting it to traditional OS installs, not containers.