Compiling Python extensions for old glibc versions

I’m a big fan of the Anaconda Python distribution. It makes managing multiple Python environments on different operating systems easy (at least in theory).

I recently came across an issue trying to import a Cython extension I’d built for Linux on a different machine. We’d been testing the module on Travis-CI for months without any issues, so this came as a surprise. When I tried to import the module, the following exception was raised:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/snorf/miniconda3/envs/python34/lib/python3.4/site-packages/example/__init__.py", line 6, in <module>
    import ._example
ImportError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/snorf/miniconda3/envs/python34/lib/python3.4/site-packages/example/_example.cpython-34m.so)

You can check the version of glibc installed by running the shared library as an executable. On my development machine (running Ubuntu 14) it reports version 2.19.

$ /lib64/libc.so.6

GNU C Library (Ubuntu EGLIBC 2.19-0ubuntu6.9) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.4.
Compiled on a Linux 3.13.11 system on 2016-05-26.
Available extensions:
    crypt add-on version 2.1 by Michael Glad and others
    GNU Libidn by Simon Josefsson
    Native POSIX Threads Library by Ulrich Drepper et al
    BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/eglibc/+bugs>.

On the target machine (running Scientific Linux 6) the installed version is only 2.12.

$ /lib64/libc.so.6

GNU C Library stable release version 2.12, by Roland McGrath et al.
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.4.7 20120313 (Red Hat 4.4.7-16).
Compiled on a Linux 2.6.32 system on 2016-05-10.
Available extensions:
    The C stubs add-on version 2.1.2.
    crypt add-on version 2.1 by Michael Glad and others
    GNU Libidn by Simon Josefsson
    Native POSIX Threads Library by Ulrich Drepper et al
    BIND-8.2.3-T5B
    RT using linux kernel aio
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
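The same check can be made from inside Python, which is handy when the interpreter in a conda environment is the thing you care about. The following is a minimal sketch using the standard library’s platform.libc_ver(); the helper names parse_version and glibc_version are made up for this example, and the 2.14 threshold is simply the version named in the ImportError above.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.19' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_version():
    """Return the glibc version the interpreter runs against, or None.

    platform.libc_ver() reports the C library used by the Python
    executable itself.
    """
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None  # not a glibc system, or detection failed
    return parse_version(version)


if __name__ == "__main__":
    found = glibc_version()
    required = parse_version("2.14")  # version named in the ImportError
    if found is None:
        print("glibc not detected")
    else:
        print("glibc", found, "ok" if found >= required else "too old")
```

Comparing tuples of integers rather than strings matters here: as strings, "2.2.5" would sort after "2.14".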

One option would be to upgrade glibc on the target machine. This wasn’t an option for me though, as I’m not a sysadmin on that machine - it’s actually a node on a high performance cluster with many other users.

The alternative is to find a work-around. I created a minimal example program that triggered the issue. It looks like using Cython’s memory views was the cause.

# a trivial example which uses a cython memoryview
cpdef test():
    cdef int carr[3]
    cdef int [:] carr_view = carr
    carr_view[:] = 42

Glibc uses symbol versioning to support backwards compatibility of binaries (i.e. programs built against older versions should behave the same way on newer versions). This is discussed in the GCC wiki and also the GNU ld manual.

We can use the objdump command with the -T argument to work out which symbols require version 2.14:

$ objdump -T _example.cpython-34m.so | grep GLIBC

0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 free
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strlen
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.4   __stack_chk_fail
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memset
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memcmp
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.14  memcpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 malloc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3.4 __vsnprintf_chk
0000000000000000  w   DF *UND*  0000000000000000  GLIBC_2.2.5 __cxa_finalize

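It looks like memcpy is the only function with a dependency on glibc 2.14. A search uncovered a bug report from 2011 which discusses the change. Calling memcpy with overlapping source and destination buffers is undefined behavior; the older implementation happened to tolerate this in some cases, so the new implementation was given a new symbol version and existing binaries keep the old one.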
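Reading the objdump listing by eye works for a small extension, but it is easy to automate. Below is a rough sketch that parses `objdump -T` output and lists the symbols needing a newer glibc than the target provides. The function names and the regular expression are my own, and the sample text is a subset of the listing above.

```python
import re

# Matches the version tag and symbol name at the end of an `objdump -T` line,
# e.g. "... GLIBC_2.14  memcpy". The optional ")" allows for tags that
# objdump prints in parentheses.
_SYMBOL = re.compile(r"GLIBC_(\d+(?:\.\d+)*)\)?\s+(\S+)\s*$")


def glibc_requirements(objdump_output):
    """Map each symbol to the glibc version it requires, as a tuple of ints."""
    required = {}
    for line in objdump_output.splitlines():
        match = _SYMBOL.search(line)
        if match:
            version = tuple(int(p) for p in match.group(1).split("."))
            required[match.group(2)] = version
    return required


def newer_than(required, limit):
    """Return the symbols that need a glibc newer than `limit`, sorted by name."""
    return sorted(name for name, version in required.items() if version > limit)


SAMPLE = """\
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 free
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.4   __stack_chk_fail
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.14  memcpy
"""

if __name__ == "__main__":
    # 2.12 is the glibc version on the target machine.
    print(newer_than(glibc_requirements(SAMPLE), (2, 12)))  # prints ['memcpy']
```

In practice the text would come from running objdump on the built .so file, for example via subprocess.check_output.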

It’s possible to force the linker to use the older version of the function with just one line of C. This is explained in the GCC and GNU ld references above.

__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");

It wasn’t immediately clear to me how to get this line of C into a Cython extension. The easiest solution I found was to put it in a C header (e.g., glibc_version_fix.h), then force the compiler to include that header using an argument passed in the CFLAGS environment variable.

$ export CFLAGS="-I. -include glibc_version_fix.h"
$ python setup.py build_ext
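If you would rather not rely on remembering the export, the same two steps can be scripted. This is only a sketch under a few assumptions: the helper names write_header and build_env are invented for the example, the header is referenced by absolute path (so the -I. flag is not needed), and any CFLAGS already set are kept rather than overwritten.

```python
import os
import shlex

HEADER_NAME = "glibc_version_fix.h"
HEADER_BODY = '__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");\n'


def write_header(directory):
    """Write the one-line header into `directory` and return its path."""
    path = os.path.join(directory, HEADER_NAME)
    with open(path, "w") as handle:
        handle.write(HEADER_BODY)
    return path


def build_env(header_path, base_env=None):
    """Return a copy of the environment with the header force-included."""
    env = dict(os.environ if base_env is None else base_env)
    extra = "-include " + shlex.quote(os.path.abspath(header_path))
    # Keep whatever flags the caller already had in CFLAGS.
    env["CFLAGS"] = (env.get("CFLAGS", "") + " " + extra).strip()
    return env


if __name__ == "__main__":
    import subprocess
    import sys

    env = build_env(write_header(os.getcwd()))
    subprocess.check_call([sys.executable, "setup.py", "build_ext"], env=env)
```

Another place the flags could go is the extra_compile_args argument of the Extension object in setup.py, which keeps the fix with the project instead of the shell session.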

Once the extension has been compiled we can use objdump again to check it worked.

$ objdump -T _example.cpython-34m.so | grep GLIBC

0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memcpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 free
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strlen
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.4   __stack_chk_fail
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memset
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memcmp
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 malloc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3.4 __vsnprintf_chk
0000000000000000  w   DF *UND*  0000000000000000  GLIBC_2.2.5 __cxa_finalize

Notice that memcpy is now using the GLIBC_2.2.5 version instead of GLIBC_2.14.

Don’t forget to run all of your program’s tests to check the change hasn’t broken anything!