Multi-stage Docker builds for Python apps

Smaller Docker images are quicker to transfer and deploy. What’s more, by including only what is absolutely required you can avoid security vulnerabilities in packages that aren’t even needed. There are many examples online for applications written in Go, which deploys as a single statically-linked binary. It’s not so obvious how to translate these examples to an application written in Python. Images based on Alpine Linux are the smallest, but are not compatible with manylinux1 wheels because Alpine uses musl libc instead of glibc.
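As a rough illustration of the glibc/musl point, the sketch below asks the running interpreter which C library it reports. It only uses `platform.libc_ver()` from the standard library; the helper names are made up for this example, and the check is a heuristic rather than the full manylinux1 compatibility test (which also depends on glibc version and CPU architecture).

```python
import platform


def libc_flavour() -> str:
    """Return 'glibc' when the interpreter reports glibc, otherwise 'other'.

    platform.libc_ver() inspects the Python executable; on systems that
    do not use glibc (for example musl-based images) it does not report
    'glibc', so everything else is lumped together as 'other'.
    """
    name, _version = platform.libc_ver()
    return "glibc" if name == "glibc" else "other"


def looks_like_glibc_linux() -> bool:
    """Heuristic only: True on Linux when the reported libc is glibc."""
    return platform.system() == "Linux" and libc_flavour() == "glibc"


if __name__ == "__main__":
    print("libc:", libc_flavour())
    print("glibc-based Linux:", looks_like_glibc_linux())
```

Running this inside a candidate base image is a quick way to see which side of the glibc/musl divide it falls on before choosing wheels.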

The equals operation in Shapely

Shapely provides two ways of testing the equivalence of geometries:

1. Using the == operator, e.g. a == b
2. Using the .equals method, e.g. a.equals(b)

The results of the two methods are not identical, although it may appear that way at first. For example, take two points:

    >>> A = Point([1, 2])
    >>> B = Point([1, 2])
    >>> A == B
    True
    >>> A.equals(B)
    True

So far so good, but what about a more complex example?
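To see how two equality checks can disagree, here is a Shapely-free sketch. It is a simplified stand-in, not Shapely's algorithm: one function compares coordinate lists exactly as written, the other treats two rings as equal if they visit the same vertices in the same cyclic order, whatever the starting vertex or direction. The function names and the two squares are assumptions made up for this illustration.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float]


def structurally_equal(a: Sequence[Coord], b: Sequence[Coord]) -> bool:
    """True only if both rings list the same coordinates in the same order."""
    return list(a) == list(b)


def same_ring(a: Sequence[Coord], b: Sequence[Coord]) -> bool:
    """True if b is a rotation of a, in either direction.

    Rings are given without a repeated closing vertex. This ignores
    cases such as redundant collinear vertices, which a real topological
    comparison would also treat as equal.
    """
    ring_a: List[Coord] = list(a)
    ring_b: List[Coord] = list(b)
    if len(ring_a) != len(ring_b):
        return False
    if not ring_a:
        return True
    for candidate in (ring_b, ring_b[::-1]):
        for shift in range(len(candidate)):
            if ring_a == candidate[shift:] + candidate[:shift]:
                return True
    return False


# The same unit square, described from two different starting corners.
square_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_b = [(1, 0), (1, 1), (0, 1), (0, 0)]

print(structurally_equal(square_a, square_b))
print(same_ring(square_a, square_b))
```

The two squares describe the same shape, yet only the second check says so, which is the kind of gap that can open up between a coordinate-by-coordinate comparison and a shape-based one.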

cfncluster + anaconda

CfnCluster is a tool for deploying and managing high-performance clusters on AWS. It uses AWS CloudFormation to allow you to quickly configure and deploy a cluster using EC2 instances. The cluster automatically scales the number of workers based on the number of jobs in the task queue, starting new instances as required (up to a preconfigured maximum) and shutting down idle nodes. The entire cluster can be shut down and restarted easily, which is great for heavy but intermittent workloads.

Adafruit GPS with a Pro Mini 3v3 8MHz

The default 9600 baud rate is a little too fast for the 8MHz clock speed of the Pro Mini 3v3 when using a software serial connection. This can result in invalid data being received from the GPS. NMEA messages include a basic checksum to ensure the message was received correctly. In the example below the checksum for the message is 47 in hexadecimal, preceded by a * (the last 3 characters).
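Separately from the message referenced above, the checksum rule itself is easy to sketch. An NMEA checksum is the XOR of every character between the leading $ and the *, written as two hexadecimal digits. The snippet below is a Python illustration of that rule (the post targets an Arduino, so this is not the author's code), and the function names and sample payload are hypothetical.

```python
from functools import reduce
from operator import xor


def nmea_checksum(payload: str) -> int:
    """XOR of all characters between '$' and '*' (both excluded)."""
    return reduce(xor, (ord(ch) for ch in payload), 0)


def is_valid_nmea(sentence: str) -> bool:
    """Check that the two hex digits after '*' match the payload."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, received = sentence[1:].rpartition("*")
    if len(received) != 2:
        return False
    try:
        expected = int(received, 16)
    except ValueError:
        return False
    return nmea_checksum(payload) == expected


if __name__ == "__main__":
    # Hypothetical payload, used only to build a self-consistent sentence.
    payload = "GPGLL,4916.45,N,12311.12,W,225444,A"
    sentence = f"${payload}*{nmea_checksum(payload):02X}"
    print(sentence, is_valid_nmea(sentence))
```

A sentence corrupted by a too-fast software serial link would fail `is_valid_nmea` and could simply be discarded.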

XKCD + Google Trends, 5 Years Later

Back in 2012, XKCD #1043 presented data from Google Trends, predicting that searches for “tumblr” would overtake “blog” on 12th October 2012. Fast-forward to the present (2017) and we can see the prediction was a good one, with “tumblr” overtaking “blog” sometime between October and November: https://trends.google.co.uk/trends/explore?date=all&q=blog,tumblr,wordpress,livejournal The data is available in CSV format and can be displayed easily using Pandas and Matplotlib. The XKCD extension for Matplotlib even gives it that XKCD feel.

Sorted Open Data (Part 2) with D3

This post is Part 2 of a series. See Part 1: Sorted Open Data with Shapely and SVG. D3 is a JavaScript library for manipulating documents based on data. Support for SVG in all modern browsers allows you to create beautiful, interactive maps, all through manipulation of the DOM using D3. D3 has quite a steep learning curve and every time I use it I feel like I’m fumbling around in the dark.

Sorted Open Data with Shapely and SVG

I recently came across the Sorted Cities project by Hans Hack. He uses building footprint data extracted from OpenStreetMap to create beautiful posters, showing all of the buildings in a city sorted by their area. As soon as I saw this, the hacker in me thought “how would I go about creating this myself?” Shapely has a fantastic feature that converts geometries into an SVG representation, which is used to display the geometry in Jupyter.
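The core idea (sort footprints by area, then draw them as SVG) can be sketched without any dependencies. The code below is not Shapely's SVG feature and not the Sorted Cities method; it uses the shoelace formula for area and writes plain `<polygon>` elements. The function names, the sample footprints, and the choice to put the largest shape first are all assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(ring: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    closed = list(ring[1:]) + [ring[0]]
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, closed):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def to_svg_polygon(ring: Sequence[Point]) -> str:
    """Render one ring as an SVG <polygon> element."""
    points = " ".join(f"{x:g},{y:g}" for x, y in ring)
    return f'<polygon points="{points}" />'


def sorted_footprints_svg(footprints: Sequence[Sequence[Point]]) -> str:
    """Sort footprints by area (largest first) and wrap them in <svg>."""
    ordered: List[Sequence[Point]] = sorted(
        footprints, key=polygon_area, reverse=True
    )
    body = "\n".join("  " + to_svg_polygon(ring) for ring in ordered)
    return f'<svg xmlns="http://www.w3.org/2000/svg">\n{body}\n</svg>'


if __name__ == "__main__":
    # Hypothetical footprints: a unit square and a 3 x 2 rectangle.
    small = [(0, 0), (1, 0), (1, 1), (0, 1)]
    large = [(0, 0), (3, 0), (3, 2), (0, 2)]
    print(sorted_footprints_svg([small, large]))
```

A real poster would also need to translate each shape into a grid position so the polygons do not overlap; that layout step is left out here.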

Fun with emoji

Despite being somewhat late to the party 🎉, this week I’ve been having fun with emoji 😂. Emoji are just Unicode characters, which means that as well as being easy to send in text messages they can also turn up in places you might not expect. For example, emoji are correctly displayed in the macOS Terminal app. They’re also valid in filenames provided that the file system supports Unicode (which all modern file systems do).
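Both points are easy to check from Python. The sketch below looks up the code point and official Unicode name of an emoji, then writes and reads back a file whose name contains one. The helper names and the filename are made up for this example, and the file part assumes a file system and locale that accept UTF-8 names.

```python
import tempfile
import unicodedata
from pathlib import Path


def describe(char: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


def roundtrip_emoji_filename() -> str:
    """Create a file with an emoji in its name and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "party-🎉.txt"
        path.write_text("hello 😂", encoding="utf-8")
        return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    print(describe("🎉"))
    print("🎉".encode("utf-8"))
    print(roundtrip_emoji_filename())
```

Each emoji here is a single code point that takes four bytes in UTF-8, which is why tools that count bytes rather than characters can report surprising lengths.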

Compiling Python extensions for old glibc versions

I’m a big fan of the Anaconda Python distribution. It makes managing multiple Python environments on different operating systems easy (at least in theory). I recently came across an issue trying to import a Cython extension I’d built for Linux on a different machine. We’d been testing the module on Travis CI for months without any issues, so this came as a surprise. When I tried to import the module the following exception was raised:

Pickling Cython classes

Automatic pickle support in Cython is still a pending feature. In order to support pickling of cdef classes you must implement the pickle protocol. This is done by implementing the __getstate__ and __setstate__ methods. Although the official documentation is quite clear, it lacks a simple example and instructions for handling objects that can’t be directly pickled. A minimal example is given below for the Person class which stores a name (string) and age (integer).
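As a plain-Python sketch of the same protocol (not the Cython cdef class from the post), the class below implements __getstate__ and __setstate__ for a Person with a name and an age. The extra `_lock` attribute is a hypothetical addition, included only to show one way of handling a member that can't be pickled directly: leave it out of the state and recreate it on load.

```python
import pickle
import threading


class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
        # Hypothetical member that pickle cannot serialise.
        self._lock = threading.Lock()

    def __getstate__(self):
        # Return only the picklable parts of the object.
        return {"name": self.name, "age": self.age}

    def __setstate__(self, state):
        # Restore the saved fields and rebuild what was left out.
        self.name = state["name"]
        self.age = state["age"]
        self._lock = threading.Lock()


original = Person("Alice", 30)
restored = pickle.loads(pickle.dumps(original))
print(restored.name, restored.age)
```

Without the two methods, pickling this class would fail on the lock; with them, the restored object gets a fresh lock of its own.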