And Python 3 only in development.
A huge thank you to everyone on the Mathspace Team who supported, reviewed and helped out with this endeavour.
One day the Mathspace Team will put up a dedicated engineering blog, but until then this can live here.
For those who haven’t heard me talk about bringing step-by-step working out, handwriting recognition, teachers never needing to mark homework again, adaptive mastery-based learning, geometry, interactive graphing, calculus, The Martian content (currently under development) and so on to High School and College education:
Love it? Want to help make it happen faster for parents, students and teachers? We’re hiring.
8 or so developers, a similar number of content team members, lots of sales and support staff, and moving to a bigger office soon.
86 Python Dependencies
pip freeze | wc -l
147 Django models across two databases
~160k Python LOC
find . -name "*.py" | xargs wc -l
Above all, do it incrementally, following lean startup ideas such as reducing batch size. Thanks Don Reinertsen, and thanks for coming to YOW! Sydney.
for i in $(git log --grep="py3k" --format=format:"%H" --all);
do git diff $i $i~1 --stat | grep "changed";
done
Some quick git stats:
- 42+ specific merges to master between 30 August 2014 and 11 January 2016. It just wasn’t quite ready for Christmas/New Year 2014 so as things go we built out more math features.
- 1629 files changed
- 18027 insertions(+)
- 16885 deletions(-)
- So around 11% of our code base changed, including the final 2to3 conversion but excluding the final few bug fixes for legacy code without unit tests.
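As a rough sanity check on that 11% figure, the counts above can be divided by the approximate line count quoted earlier. This is only a sketch: the post does not say how the percentage was derived, so treating insertions (or deletions) over ~160k LOC as the measure of "changed" is an assumption, and `changed_share` is a hypothetical helper name.

```python
# Figures quoted in the post. How they combine into "11%" is an assumption.
INSERTIONS = 18027
DELETIONS = 16885
TOTAL_LOC = 160000  # "~160k Python LOC"


def changed_share(lines, total):
    """Return changed lines as a fraction of the total line count."""
    return lines / total


if __name__ == "__main__":
    print("inserted: {:.1%}".format(changed_share(INSERTIONS, TOTAL_LOC)))
    print("deleted:  {:.1%}".format(changed_share(DELETIONS, TOTAL_LOC)))
```

Both ratios land close to 11%, which is at least consistent with the figure in the list.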
Most of the development battle felt like it was about dependencies. Frankly, GitHub is awesome, but there’s always higher latency than tapping a colleague on the shoulder or pinging @colleague in Slack or Trello.
Switched out dependencies
- Fabric => Fabric3 (at least until @bitprophet decides to give us Fabric2’s alpha)
- python-cloudfiles => apache-libcloud
- suds (SOAP/WSDL) => Stripe billing gateway (might also use PySimpleSoap, which claims Python 3 support, but it failed our unit tests before the Stripe switch, and its master was broken with a print statement for months, which is unfortunately not very confidence-inspiring)
One pleasant point was upgrading Django because we have pretty good unit test coverage:
# -Wonce works too, just
# -Wall is funner to read out loud
python -Wall manage.py test
Tells you what’s RemovedInDjango18, RemovedInDjango19, RemovedInDjango110, which you’ll need to fix sooner or later.
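To see the mechanism without a Django project, here is a minimal sketch using only the standard library. `RemovedInNextVersionWarning` and `old_helper` are hypothetical stand-ins, not Django names; the idea is that turning on every warning filter surfaces deprecation warnings that Python hides by default.

```python
import warnings


class RemovedInNextVersionWarning(DeprecationWarning):
    """Hypothetical stand-in for Django's RemovedInDjangoXXWarning classes."""


def old_helper():
    """Hypothetical deprecated function: still works, but warns."""
    warnings.warn(
        "old_helper() is deprecated; use new_helper() instead.",
        RemovedInNextVersionWarning,
        stacklevel=2,
    )
    return 42


def collect_deprecations(func):
    """Call func with all warnings enabled and return the deprecation messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # roughly what -Wall switches on
        func()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


if __name__ == "__main__":
    for message in collect_deprecations(old_helper):
        print(message)
```

Running the test suite this way turns each warning into a to-do item well before the deprecated API actually disappears.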
Making the switch
Basically the things that broke were things with no unit tests. We fixed all of them over a weekend with a few quick deploys, thanks to David Cramer and the Sentry team.
How did we build the final branch?
0. Move anything that works under Python 2+3 off into a separate branch, reviewed and merged regularly.
1. pyenv (or homebrew) to install Python 3
# Mavericks and El Capitan come with versions of SQLite3 that
# segfault when running "./manage.py test", so update it
brew install sqlite3
brew link --force sqlite3
brew install pyenv
# Tell pyenv to use updated SQLite3
# Takes around 2 minutes
time PYTHON_CONFIGURE_OPTS="LD_RUN_PATH=/usr/local/opt/sqlite/lib LDFLAGS=-L/usr/local/opt/sqlite/lib CPPFLAGS=-I/usr/local/include" pyenv install 3.4.4
# Test it
>>> import sqlite3
# Remember to use it when making virtualenvs:
mkvirtualenv mathspace_py3 --python=$HOME/.pyenv/versions/3.4.4/bin/python
2. Look for outdated dependencies, update them, split off upgrades of dependencies with 2+3 versions when possible.
3. Update / migrate our code base internally. As an end project, 2to3 was a better choice than six.
2to3 . --output-dir=. -n -w
- Fix imports that 2to3 did not get right so the unit tests run. DiscoverRoadRunner was invaluable, saving me 4 minutes per test-driven-development cycle. I know I should spend more time on it, though Django 1.9’s built-in test runner does "--parallel" 🍻
- Fix other things the 2to3 tool did not get right, e.g. Django’s smart_unicode => smart_text
- Update or fix unit tests under Python 3 accordingly, e.g. tests that failed with different PYTHONHASHSEED values. These were bugs under Python 2 but were only detected under Python 3.
- Check things like the following don’t throw errors, add unit tests if they did:
./manage.py celery worker
4. Add new Python 3 servers – thanks Rackspace devops for getting poise-python working so supervisor can stay under Python 2.
5. Test on staging environments for any issues.
6. Deploy to production.
7. Fixed a few things fast. Thanks to Sentry we could pounce quickly. Maybe 20-30 users affected by Sentry events total, so 99.94% or more of our >50k users in that particular low-traffic week would never have noticed, anecdotally better than our last Django upgrade.
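Step 3 mentions tests that only failed under certain PYTHONHASHSEED values. Here is a minimal sketch of that class of bug, using hypothetical function names rather than Mathspace code: anything relying on the iteration order of a set of strings can change between runs under Python 3, where hash randomization is on by default, and sorting before use makes the result stable.

```python
def topic_labels_fragile(topics):
    """Deduplicate via a set; the order depends on string hashing."""
    return list(set(topics))


def topic_labels_stable(topics):
    """Deduplicate, then sort so the order never depends on the hash seed."""
    return sorted(set(topics))


if __name__ == "__main__":
    topics = ["geometry", "calculus", "graphing", "calculus"]
    print(topic_labels_fragile(topics))  # order may differ between runs
    print(topic_labels_stable(topics))   # order is always alphabetical
```

Running the script a few times with different seeds (for example `PYTHONHASHSEED=1`, then `PYTHONHASHSEED=2`) may show the first line changing order while the second stays fixed. A test that compares the fragile output against a hard-coded list is the kind that passes or fails depending on the seed.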
Was it overengineered? Perhaps, but to channel Django, Mathspacers can be perfectionists with deadlines, and we do think it is very important to provide the best experience we possibly can for parents, students and teachers.
- Better unit testing
- Cleaner superclass calls:
# Instead of monstrosities like
- Cleaner types and unicode support:
long => int, str => bytes, unicode => str
- Can’t easily go the wrong way when using:
.decode() or .encode()
- No re.UNICODE flag required in regular expressions.
- Don’t need to remember
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
- No cryptic backtick syntax:
repr(a) == `a`
- Don’t lose the original error if another error is triggered in an except block
- Hundreds of other bug fixes and feature improvements: https://docs.python.org/3/whatsnew/changelog.html
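A few of the points in this list can be illustrated in one short Python 3 sketch. The classes and function names are hypothetical, chosen only for illustration: it shows the argument-free `super()` call, the `str`/`bytes` round trip, and how the original error stays attached when a second one is raised inside an except block.

```python
class Shape:
    def __init__(self, name):
        self.name = name


class Square(Shape):
    def __init__(self, side):
        # Python 2 needed: super(Square, self).__init__("square")
        super().__init__("square")
        self.side = side


def round_trip(text):
    """Encode str to bytes and decode back to str."""
    data = text.encode("utf-8")   # str -> bytes
    return data.decode("utf-8")   # bytes -> str


def chained_error():
    """Trigger a second error inside an except block and return both."""
    try:
        try:
            {}["missing"]                 # first error: KeyError
        except KeyError:
            int("not a number")           # second error while handling it
    except ValueError as error:
        # Python 3 keeps the original KeyError on __context__
        return error, error.__context__


if __name__ == "__main__":
    print(Square(2).name)
    print(round_trip("π ≈ 3.14"))
    error, original = chained_error()
    print(type(error).__name__, "raised while handling", type(original).__name__)
```

In Python 3, `bytes` has no `.encode()` and `str` has no `.decode()`, which is why it is hard to go the wrong way by accident.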
Nick Coghlan can tell the full story far better than I: