Working with Compilers

Broadly speaking, interpreted languages tend to be slow but are relatively easy to learn and use. Compiled languages generally deliver the maximum speed but are more complex to learn and use effectively. Python can use appropriately prepared libraries of compiled code, or it can invoke a compiler (standard or “just in time”) to compile snippets of code and incorporate them directly into the execution.

Wrapping Compiled Code in Python

Function libraries can be written in C/C++/Fortran and converted into Python-callable modules.

Fortran

  • If you have Fortran source code you can use f2py
    • Part of NumPy
    • Can work for C as well
    • Extremely easy to use
    • Can wrap legacy F77 and some newer F90+ (modules are supported)
    • Must be used from the command line

http://docs.scipy.org/doc/numpy-dev/f2py/

We will illustrate with a simple (and very incomplete) example of code to work with fractions.

Example

Download the example Fortran code fractions.f90 to try this yourself.

First create a Python signature file.

f2py fractions.f90 -m Fractions -h Fractions.pyf

We then use the signature file to generate the Python module.

f2py -c Fractions.pyf fractions.f90

The original Fortran source consists of a module Fractions. Examining the signature file, we see that f2py has lower-cased the Fortran module name to fractions and placed it inside the Python module Fractions. Under Linux the module file is called something like Fractions.cpython-39-x86_64-linux-gnu.so, but we can drop everything from the first period onward when we import it.

>>> from Fractions import fractions
>>> fractions.adder(1,2,3,4)
array([10,  8], dtype=int32)
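
The result above is the unreduced sum 1/2 + 3/4 = 10/8. As a sanity check on the wrapped routine, the following pure-Python sketch reproduces that arithmetic. The name adder_reference is hypothetical, and its behavior is inferred from the sample output rather than from fractions.f90 itself. The standard-library Fraction class is imported under an alias to avoid confusion with the wrapped Fractions module.

```python
from fractions import Fraction as ExactFraction


def adder_reference(n1, d1, n2, d2):
    """Return the unreduced sum n1/d1 + n2/d2 as (numerator, denominator)."""
    return (n1 * d2 + n2 * d1, d1 * d2)


num, den = adder_reference(1, 2, 3, 4)
print(num, den)                  # 10 8, matching the f2py result
print(ExactFraction(num, den))   # 5/4 after reduction to lowest terms
```

Comparing the wrapped function against a reference like this is a cheap way to catch argument-order or type mistakes in the signature file.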

One significant weakness of f2py is limited support of the Fortran90+ standards, particularly derived types. One option is the f90wrap package.

It is also possible to give the Fortran code a C interface by various means, such as the Fortran 2003 ISO C binding features, and then to call it through the Python-C interface packages, such as ctypes and CFFI. More details are available at fortran90.org for interfacing with C and Python.

C

Python provides the ctypes standard package, which wraps C libraries for use from Python code. To use it, prepare a shared (dynamic) library of functions. This requires a C compiler, and the exact steps vary depending on your operating system. Windows compilers produce a file called a DLL, whereas Linux shared libraries end in .so and macOS shared libraries conventionally end in .dylib.

Much as for f2py, the user must prepare some form of signature for the C functions. Ctypes types include

Python                                            C
c_double                                          double
c_int                                             int
c_longlong                                        long long
numpy.ctypeslib.ndpointer(dtype=numpy.float64)    double *

See the documentation for a complete list.
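
To make the idea of a signature concrete, here is a minimal sketch that declares the argument and return types of pow from the C math library. It assumes a Unix-like system on which ctypes.util.find_library can locate that library (with a fallback to symbols already loaded into the Python process); it does not use the arith example.

```python
import ctypes
import ctypes.util

# Locate the C math library; fall back to the symbols already loaded into
# the running process if the lookup fails (assumes a Unix-like system).
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path) if libm_path else ctypes.CDLL(None)

# Signature for the C prototype: double pow(double x, double y);
libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]
libm.pow.restype = ctypes.c_double

result = libm.pow(2.0, 3.0)
print(result)  # 8.0
```

Without the restype line ctypes would assume the function returns a C int, so declaring the signature is not optional for functions that return double.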

Example

Download the arith.c file, which implements some trivial arithmetic functions. It must be compiled to a shared library. Under Unix we execute the commands

gcc -fPIC -c arith.c 
gcc -shared arith.o -o arith.so

We now use it from the interpreter as follows:

>>> import ctypes
>>> arith = ctypes.CDLL('./arith.so')
>>> arith.sum.restype = ctypes.c_double
>>> arith.sum.argtypes = [ctypes.c_double,ctypes.c_double]
>>> arith.sum(3.,4.)
7.0
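
In a script it can be convenient to collect the loading and signature declarations in one place. The following is a sketch only: load_arith is a hypothetical helper, and it assumes that arith.c defines sum, difference, product, and division, each taking two doubles and returning a double.

```python
import ctypes


def load_arith(path="./arith.so"):
    """Load the shared library and declare each function's signature."""
    lib = ctypes.CDLL(path)  # raises OSError if the file cannot be loaded
    for name in ("sum", "difference", "product", "division"):
        func = getattr(lib, name)
        func.argtypes = [ctypes.c_double, ctypes.c_double]
        func.restype = ctypes.c_double
    return lib


if __name__ == "__main__":
    try:
        arith = load_arith()
        print(arith.sum(3.0, 4.0))
    except OSError as err:
        print(f"Could not load the library: {err}")
```

Passing an explicit path such as ./arith.so avoids depending on the system's library search path.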

A newer tool for C is CFFI. CFFI is a third-party Python package and must be installed through pip or a similar package manager. CFFI has four modes, but we will show only the “API, out-of-line” mode, which is the easiest and most reliable but does require a C compiler to be available. In this mode, CFFI uses the C source and any header files to generate its own extension module.

Example

We will wrap arith.c with CFFI. First we must create and run a build file, which we will call build_arith.py.

from cffi import FFI
ffibuilder = FFI()

#First list all the function prototypes we wish to include in our module.
ffibuilder.cdef("""
  double sum(double x, double y);
  double difference(double x, double y);
  double product(double x, double y);
  double division(double x, double y);
 """)

# set_source() specifies the name of the Python extension module to
# create, as well as the original C source file.  If we had arith.h we
# would put #include "arith.h" in the triple quotes.  Conventionally, we
# name the extension module with a leading underscore.
ffibuilder.set_source("_arith",
"""
  double sum(double x, double y);
  double difference(double x, double y);
  double product(double x, double y);
  double division(double x, double y);
""",
 sources = ['arith.c'],
 library_dirs = [],
)

ffibuilder.compile()


We can now import our module. The functions will be available with the namespace lib.

>>> from _arith import lib
>>> lib.sum(3.,4.)
7.0

CFFI supports more advanced features. For example, structs can be wrapped into Python classes; the CFFI documentation gives examples. The CFFI mode we have illustrated also creates static bindings, unlike ctypes (and other modes of CFFI), for which the bindings are generated on the fly.

However, neither ctypes nor CFFI supports C++ directly. Any C++ code must be “C-like” and its functions must be declared extern "C".

C++

One of the most popular packages that deals directly with C++ is Pybind11. Setting up the bindings is more complex than for ctypes or CFFI, however, and the bindings are written in C++, not Python. Pybind11 must be installed through conda or pip.

One option for building the module is CMake, since it can be configured to generate the fairly complex Makefile required, and it also works on Windows. A somewhat simpler method is to use the Python package invoke, which can be installed through pip or through your operating system's package manager on Linux. The Python header file Python.h must also be accessible.

Example

We will wrap the fractions.cxx file. It also requires the fractions.h header. These files implement a very incomplete Fractions class, similar to the Fortran example above.

  1. Write the bindings wrap_fractions.cxx.
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

#include "fractions.h"

namespace py = pybind11;

PYBIND11_MODULE(py_fractions,m) {

    m.doc() = "Fractions module";
    py::class_<Fraction>(m, "Fraction")
        .def(py::init<int, int>())
        .def("addFracs", &Fraction::addFracs);

}

  2. Compile fractions.cxx into a shared library. Invoke can be used for this, but a simple command line is also sufficient here.
g++ -O3 -Wall -Werror -shared -std=c++11 -fPIC fractions.cxx -o libfractions.so

Run the command

python -m pybind11 --includes

in order to determine the include path. On a particular system it returned

-I/usr/include/python3.9 -I/usr/include/pybind11

Take note of the include file paths, which will vary from one system to another, and substitute them into the commands below. Then move into Python and run invoke:

import invoke
cpp_name="wrap_fractions.cxx"
extension_name="py_fractions"

invoke.run(
    "g++ -O3 -Wall -Werror -shared -std=c++11 -fPIC fractions.cxx "
    "-o libfractions.so "
)
invoke.run(
    "g++ -O3 -Wall -Werror -shared -std=c++11 -fPIC "
    "`python3 -m pybind11 --includes` "
    "-I /usr/include/python3.9 -I .  "
    "{0} "
    "-o {1}`python3.9-config --extension-suffix` "
    "-L. -lfractions -Wl,-rpath,.".format(cpp_name, extension_name)
)
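
The commands above hard-code the Python 3.9 include path and configuration script. As a hedged alternative, the standard sysconfig module can report the corresponding values for whichever interpreter is running. This is a sketch for discovering those values, not a replacement for the build itself.

```python
import sysconfig

extension_name = "py_fractions"

# Suffix the running interpreter expects for compiled extension modules,
# e.g. ".cpython-39-x86_64-linux-gnu.so" on the Linux system used above.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

# Directory that contains Python.h for this interpreter.
python_include = sysconfig.get_paths()["include"]

module_file = f"{extension_name}{ext_suffix}"
print(module_file)
print(f"-I{python_include}")
```

The two printed strings can be substituted for the output file name and the Python include flag in the g++ command.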

This will create a module whose name begins with py_fractions (the rest of the name is specific to the platform on which it was created, and is ignored when importing). Test that it works:

>>> from py_fractions import Fraction
>>> f1=Fraction(5,8)
>>> f2=Fraction(11,13)
>>> f1.addFracs(f2)
[153, 104]

Pybind11 requires C++ code that adheres to the C++11 standard or higher. Another option is Boost.Python, the Python bindings in the Boost library. Pybind11 was modeled on these bindings; the Boost version is more general and can handle many older C++ codes, but it is more complex to use.

Cython

Cython is a package that allows Python code to be compiled into C code. To gain the most speed some rewriting is required, because C requires statically typed variables. Cython defines two additional keywords to declare functions, cdef and cpdef. A cdef function is basically C and can produce the fastest code, but cdef-declared functions are not visible to Python code that imports the module. A cpdef function is a mix of C with dynamic bindings for passing Python objects, which makes it callable from Python but slower than cdef.

Exercise: integrate.py

Suppose we start with

"""
Created on Mon Feb 3 14:05:03 2020

@author: Katherine Holcomb
"""

cpdef double f(double x):
    return x**2-x

cpdef double integrate_f(double a, double b, int N):
    cdef int i
    cdef double s, dx
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f(a+i*dx)
    return s*dx

Save the above code as integrate_cyf.pyx. Now create a setup.py file:

from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("integrate_cyf.pyx")
)

On the command line run python setup.py build_ext --inplace to build the extension.

This will create a file integrate_cyf.c in your local directory. In addition you will find a compiled module whose name begins with integrate_cyf and ends in .so on Unix or .pyd on Windows (the middle of the name is platform-specific). Now you can import your module in your Python scripts.

import integrate_cyf as icyf
print(icyf.integrate_f(1.,51.,1000))
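
For comparison, here is a pure-Python sketch of the same calculation with the Cython type declarations removed. It can be used to check the result of the compiled module and to time the uncompiled version; the timing code is an addition for illustration.

```python
import time


def f(x):
    return x**2 - x


def integrate_f(a, b, N):
    """Left Riemann sum of f over [a, b] using N equal subintervals."""
    s = 0.0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx


start = time.perf_counter()
result = integrate_f(1.0, 51.0, 1000)
elapsed = time.perf_counter() - start
print(f"{result:.4f} computed in {elapsed:.6f} s")
```

For these arguments the sum is about 42852.94, slightly below the exact integral of roughly 42916.67, because a left Riemann sum underestimates a function that is increasing over the interval.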

More detailed information describing the use of Cython can be found in the Cython documentation.

Numba

Numba is available with the Anaconda Python distribution. It compiles selected functions using the LLVM compiler. Numba is accessed through a decorator. Decorators in Python are wrappers that modify the behavior of functions without the need to change their code.
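
To illustrate what a decorator does independently of Numba, here is a minimal sketch of a hypothetical timed decorator that records how long each call takes. The names timed, last_elapsed, and sum_of_squares are invented for this example.

```python
import functools
import time


def timed(func):
    """Hypothetical decorator: record how long each call to func takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        value = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return value
    wrapper.last_elapsed = None
    return wrapper


@timed
def sum_of_squares(n):
    return sum(i * i for i in range(n))


total = sum_of_squares(1000)
print(total, sum_of_squares.last_elapsed)
```

The body of sum_of_squares is unchanged; only the line above its definition adds the new behavior, which is how Numba's decorator is applied as well.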

Exercise: A well-known but slow way to compute pi is by a Monte Carlo method. Given a circle of unit radius inside a square with side length 2, we can estimate the area inside and outside the circle by throwing “darts” (random locations). Since the area of the circle is pi and the area of the square is 4, the ratio of hits inside the circle to the total thrown approaches pi/4. The script below samples only the upper-right quadrant (x and y between 0 and 1), where the ratio of the quarter-circle area to the unit-square area is also pi/4.

Open the MonteCarloPi.py script.

"""
Created on Mon Feb 3 14:41:44 2020

 This program estimates the value of PI by running a Monte Carlo simulation.

 NOTE:  This is not how one would normally want to calculate PI, but serves
 to illustrate the principle.

@author: Katherine Holcomb
"""

import sys
import random

def pi(numPoints):
    """Throw a series of imaginary darts at an imaginary dartboard of unit
        radius and count how many land inside the circle."""

    numInside=0
 
    for i in range(numPoints):
        x=random.random()
        y=random.random()
        if (x**2+y**2<1):
            numInside+=1

    pi=4.0*numInside/numPoints
    return pi

def main():
    # Parse the number of points from the command line. Try 10^8.
    if (len(sys.argv)>1):
        try:
            numPoints=int(float((sys.argv[1])))
        except (ValueError, OverflowError):
            print("Argument must be an integer.")
            return
        print('Pi (approximated): {}'.format(pi(numPoints)))
    else:
        print("USAGE:python MonteCarloPi.py numPoints")


if __name__=="__main__":
    main()

Running with $10^9$ points takes 6 minutes and 21 seconds on one particular system.

Now add another import statement.

from numba import jit

Add the decorator above the pi function:

@jit
def pi(numPoints):

No other changes are required. The time is reduced to only 14.7 seconds!
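
The following self-contained sketch is one way to compare timings on your own system. It makes two assumptions beyond the original script: it falls back to an undecorated function when Numba is not installed, and it requests compiled mode explicitly with nopython=True. The names num_points and num_inside are renamed for this example.

```python
import random
import time

try:
    from numba import jit
except ImportError:
    # Fallback so the script still runs (uncompiled) without Numba.
    def jit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator


@jit(nopython=True)
def pi(num_points):
    num_inside = 0
    for _ in range(num_points):
        x = random.random()
        y = random.random()
        if x * x + y * y < 1.0:
            num_inside += 1
    return 4.0 * num_inside / num_points


pi(10)  # first call triggers compilation when Numba is present

start = time.perf_counter()
estimate = pi(200_000)
elapsed = time.perf_counter() - start
print(f"pi is approximately {estimate:.4f} ({elapsed:.3f} s)")
```

The warm-up call matters because Numba compiles a function the first time it is called, so timing that first call would include the compilation cost.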
