Minimal crying while migrating to Python 3

Keep tears at bay with this walkthrough of my experience of migrating Python from version 2 to 3.

Procrastination must be addicting. I’ve been “writing” this article for months. In a similar fashion, I’ve been putting off updating our 50-thousand-line code base to Python 3.

As some of you know, Python 2 became legacy at the start of 2020. There are a few downsides to staying on version 2. Libraries you rely on may stop supporting Python 2, and security vulnerabilities may go unpatched. Worst of all, there’s this message any time you pip install anything:

My own personal Y2K

It’s not the most urgent thing on my mind, and I assume a lot of applications will stay on version 2 for years to come. Our code base did not spontaneously combust as the world had its birthday, and I got to spend my New Year’s Eve like everyone else did that day: drinking so much that my New Year’s resolution was never to drink again.

Jokes aside, my team and I still see the merit in upgrading from 2 to 3. We are very much technical perfectionists and enjoy having the latest features of libraries and applications we use. Most importantly, we don’t want to find ourselves in a pickle half a year from now. It’s best not to continue kicking the can, so we are buckling up and getting this done. Oh, and we want to do this without any server downtime (fingers crossed).

I’ve been leading the charge of migrating our applications to Python 3, and I’d like to share what we've learned so far.

Upgrade dependencies

My first order of business was to update our packages to versions that support both Python 2 and 3. I was very naive about how much work would be required to update all packages. All of the cross-dependencies between packages made it really hard to keep track of what I was doing. Unless you rely on very few packages, I don’t recommend trying to do this manually.

Until recently, we managed our packages in a “requirements.txt” file which we edited by hand. Enter Pipenv. It’s not a perfect tool, but it’s been so useful that I’ve since added it to all our Python repositories. It resolves cross-dependencies between packages and lets you know when there’s a conflict. Pipenv significantly reduced the amount of work I had to do. The package management will feel very familiar if you’ve ever used npm or yarn.

A few other useful tools to use are:

pip list --outdated # Will find packages that are behind current.  
pip-check           # Will list outdated packages by Major/Minor.
pipenv check        # Check for security vulnerabilities.

With these tools, it was fairly straightforward to update all our packages to their 2/3 compatible versions. You should always do this in iterations, and be diligent with your unit tests and QA. Not all package updates are backwards compatible, so be ready to read a few change logs if you update any major versions.
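To decide which change logs deserve a careful read, it can help to sort the outdated packages by how big the version jump is. Here is a minimal sketch that takes the JSON printed by `pip list --outdated --format=json` and groups package names into major, minor, and patch bumps. The function names and the sample data are made up for illustration, and the version parsing is deliberately naive (it only looks at leading digits), so treat it as a starting point rather than a finished tool.

```python
import json


def parse_version(version):
    """Return the leading numeric parts of a version string as a tuple of ints."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def bump_kind(current, latest):
    """Classify an upgrade as 'major', 'minor' or 'patch'."""
    # Pad with zeros so "1.0" and "1.0.1" can be compared position by position.
    cur = (parse_version(current) + (0, 0, 0))[:3]
    new = (parse_version(latest) + (0, 0, 0))[:3]
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


def group_outdated(pip_json):
    """Group package names from `pip list --outdated --format=json` output."""
    groups = {"major": [], "minor": [], "patch": []}
    for pkg in json.loads(pip_json):
        kind = bump_kind(pkg["version"], pkg["latest_version"])
        groups[kind].append(pkg["name"])
    return groups


# Made-up sample data in the shape pip emits (hypothetical package names).
sample = json.dumps([
    {"name": "alpha", "version": "1.11.29", "latest_version": "3.0.2"},
    {"name": "beta", "version": "2.22.0", "latest_version": "2.23.0"},
    {"name": "gamma", "version": "1.12.0", "latest_version": "1.12.3"},
])
print(group_outdated(sample))
```

Anything that lands in the "major" bucket is where I would start reading change logs.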

Support both Python 2 AND 3

So now that our external resources are Python 2/3 compatible, it’s time to update the code base itself. This is the step that we are currently on, and the futurize tool (part of the future library) has been essential. This is perhaps the most important step in making sure the upgrade from 2 to 3 has no server downtime.

Futurize gives us the ability to transform our code into code that can run on both Python 2 and 3. Since the code base remains compatible with our server, we are able to do the upgrade piecemeal, one directory at a time, instead of having to do it all at once.
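To give a feel for what "runs on both" means, here is a small hand-written sketch of a few dual-version idioms. It is not actual futurize output: futurize typically adds `from __future__ import ...` lines and, in its second stage, imports from the future package, so its result will look different. The helper names below are my own, chosen only for illustration.

```python
import sys

PY2 = sys.version_info[0] == 2

# Python 2 has a separate `unicode` type; on Python 3, `str` is already text.
if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
else:
    text_type = str


def to_text(value, encoding="utf-8"):
    """Return `value` as text on either interpreter."""
    if isinstance(value, bytes) and not isinstance(value, text_type):
        return value.decode(encoding)
    return text_type(value)


def average(numbers):
    """Use float division explicitly; Python 2's `/` floors for ints."""
    numbers = list(numbers)
    return sum(numbers) / float(len(numbers))


def sorted_keys(mapping):
    """keys() is a list on Python 2 and a view on Python 3; sorted() takes both."""
    return sorted(mapping.keys())


# A single-argument print(...) parses the same way on both versions.
print(to_text(b"caf\xc3\xa9") == u"caf\xe9")
```

The pattern is always the same: pick the spelling that means the same thing on both interpreters, and hide the unavoidable differences behind one small shim.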

The generated code is not always perfect so it’s important to have a solid unit test and QA strategy. We really felt the importance of code standards, unit tests, and keeping our codebase DRY.

Unfortunately, there are parts of the code base that we wrote before we implemented our rigorous standards. Before we had the number of customers we do today, there were much bigger concerns than having rigorous code standards, and, dare I say, unit tests. So the sins of our past have made this step somewhat difficult. If I had to leave you with one piece of advice today it’s this: Write good code, and don’t write bad code.

After all is said and done, our production server will be running Python 2 on a code base that supports either version of Python.

Update servers to Python 3

Since I am still on the previous step, everything written after this sentence is unproven. I’ve put in a lot of research and thought on the next steps that we will take, but as we all know “The best-laid plans of mice and men often go awry”. High school English FTW.

Once our code base is compatible with both versions of Python, updating our server to run Python 3 should presumably be safe. This step is key in ensuring there is no server downtime, and it should be like most other server updates we’ve ever done. If all goes as planned, all we will have to do is change a single line in our Dockerfile.

One by one, each of the EC2 machines will take its turn shutting down and spinning back up with a fresh Docker container running Python 3. I will be tightly holding my breath while all three of my screens are occupied with our infrastructure health charts. This is how I get my thrills.
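While the machines cycle, it would be nice to know which interpreter each one is actually running. Here is a rough sketch of a helper that an application could expose through a health check or log at startup. The function names and the dictionary keys are hypothetical, and nothing like this exists in our code yet.

```python
import platform
import sys


def runtime_info():
    """Describe the interpreter this process is running on."""
    return {
        "python_version": platform.python_version(),
        "python_major": sys.version_info[0],
        "implementation": platform.python_implementation(),
    }


def is_migrated(info, expected_major=3):
    """True when the reporting instance runs the expected major version."""
    return info["python_major"] == expected_major


if __name__ == "__main__":
    info = runtime_info()
    print(info)
    print("migrated: {}".format(is_migrated(info)))
```

Because the code base supports both versions at this stage, the same helper can run before and after the switch, which makes it easy to see a mixed fleet mid-rollout.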

Make your codebase beautiful again

Have you ever seen what a Python codebase looks like when it is compatible with versions 2 and 3? It’s not pretty. I take great pride in the code I work on and I really care about the readability of our code base. There is another wonderful tool named ‘2to3’ which we will use as our last step to turn our code base from 2/3 compatible to just 3. Built-in Unicode, here we come.
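Here is the kind of cleanup I am looking forward to, as a hand-written illustration of the end state rather than real 2to3 output (I have not run this step yet). The `greet` function is invented for the example. The 2/3 compatible version is shown in comments, followed by what it can collapse to once Python 2 is gone.

```python
# Before (2/3 compatible), shown as comments for comparison:
#
#   import sys
#   if sys.version_info[0] == 2:
#       text_type = unicode
#   else:
#       text_type = str
#
#   def greet(name):
#       if not isinstance(name, text_type):
#           name = name.decode("utf-8")
#       return u"Hello, {}!".format(name)


# After (Python 3 only): str is already Unicode, so the shim goes away.
def greet(name: str) -> str:
    """Build a greeting; `name` is always text on Python 3."""
    return f"Hello, {name}!"


print(greet("Ada"))
```

Less code, no version checks, and type hints and f-strings become available as a bonus.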

I’m meticulous to a fault, and this project has definitely been a lot of fun and pushed me to think through every detail. Aside from planning, sometimes you just have to try and spin up your servers after changing your environment to Python 3 and see what comes out. It’s as much a planning activity as it is a learning activity. After reading countless articles and guides, there is still much that I won’t be able to predict. And while I lay out my plans for you today, reality will probably have something else to say. I’ve learned a lot already, and I look forward to finishing this without any downtime.

Wish us luck.

TLDR

  1. Update your external dependencies using a package manager such as Pipenv. All packages should support both Python 2 and 3.
  2. Update your code base to be 2/3 compatible with futurize.
  3. Update your server to run Python 3.
  4. Optional: Update your code base to Python 3 using 2to3.