Saturday, November 28, 2009

Django and Python 3

Today I'm starting on some of the posts people want to see, and the number one item on that list is Django and Python 3. Python 3 has been out for about a year at this point, and so far Django hasn't really started to move towards it (at least at first glance). However, Django has already begun the long process of moving to Python 3, and this post recaps exactly what Django's migration strategy is (most of it is a recap of a message James Bennett sent to the django-developers mailing list after the 1.0 release, available here).

One of the most important things to recognize here is that though there are many developers using Django for smaller projects, or starting new projects that they would like to run on Python 3, there are a great many more with legacy deployments (if we can call recent deployments on Python 2.6 and Django 1.1 legacy) that they want to maintain and update. Further, Django's latest release, 1.1, supports Python releases as old as 2.3, and a migration from 2.3 to Python 3 is nontrivial. It is significantly easier to make this migration from Python 2.6. This is the crux of James's plan: people want to move to Python 3, and moving towards Python 2.6 makes this easier for them and for us. Therefore, starting with the 1.1 release, Django is removing support for one minor version of Python per Django release. So Django 1.1 will be the last release to support Python 2.3, 1.2 will be the last to support 2.4, etc. This plan isn't guaranteed; if there's a compelling reason to maintain support for a version for longer, it will likely override the plan (for example, if a particularly common deployment platform only offered Python 2.5, removing support for it might be delayed an additional release).

At the end of this process Django will end up supporting only Python 2.6. At that point (or maybe even before), a strategy will need to be devised for how to actually handle the switch. Some possibilities are: 1) having an official breakpoint, so that only one version is supported at a given time; 2) starting Python 3 support in a branch that tracks trunk and eventually becomes trunk once Python 3 is the more common deployment; 3) supporting Python 2.6 and 3 from a single codebase. I'm not sure which of these is easiest. Other projects, such as PLY, have chosen option 3, but my inclination is that option 2 will be best for Django, since issues like bytes vs. strings are particularly prominent in Django (it talks to so many external data sources).
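To make the bytes vs. strings issue concrete, here is a minimal sketch, written for Python 3, of converting at the boundary with an external data source. The helper names `to_text` and `to_bytes` are hypothetical and chosen only for illustration; they are not Django APIs, and the UTF-8 default is an assumption.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding only if it arrived as bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding only if it is text."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


# Bytes as they might arrive from a socket or a database driver.
raw = b"caf\xc3\xa9"

# Decode once on the way in, work with text internally,
# and encode once on the way out.
name = to_text(raw)
payload = to_bytes(name)
```

On Python 2 the same byte string could be mixed freely with text, which is why this kind of conversion has to be made explicit at every point where Django touches outside data.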

For people who are interested, Martin von Löwis actually put together a patch that, at the time, gave Django Python 3 support (at least enough to run the tutorial under SQLite). If you're very interested in Django on Python 3, the best path would probably be to bring that patch up to date (unless it's wildly out of date; I haven't checked) and start fixing things that have been introduced since the patch was written. This work isn't likely to get any official support, since maintaining both Python 2.4 and Python 3 support would be far too difficult, but there's no reason you can't maintain the patch externally on something like GitHub or Bitbucket.

5 comments:

  1. Well, if it's going to be that "slow" (relatively speaking, of course), don't you think you'd better be targeting Python 2.7?

  2. Looking at the changes (http://docs.python.org/dev/3.0/whatsnew/3.0.html) I don't see any advantage of 3.0 over 2.6. In fact I see plenty of disadvantages:
    * It's slower
    * Some favourite functions now return 'views' instead of simple lists, eg x.keys(), others like map() and filter now return iterators instead of lists
    * Comparisons are now less flexible, eg 3 > None throws up an exception instead of returning True
    * You can't unpack tuples in function parameter lists
    * Even the simple print function is now more complicated

    I could go on. It's not all bad, there are some useful additions, but overall the 2.x series appears aimed at those who just want to Get Things Done, while 3.x appears to be heading towards Java.

    I'd be quite happy for Django to ignore 3.x for the immediate future. A port can be done in a few years' time, when distros are no longer installing the 2.x series as standard. I have no problem with slowly making everybody upgrade to 2.6 though.

    Phillip.

  3. Great article. I've been looking to start a project in Django, my first, next year and have been hoping to just start with Python 3. So, I've been wondering about this.

    PS not to be annoying, but I puzzled over this bit for a minute "to become trunk ones Python 3 is the more common" until I saw you meant 'once'.

  4. Having been through this once (as part of the pywin32 project), I can say that the method the BDFL recommends works very well. You DO NOT fork your code for version 3. The 2to3.py utility is amazing and converts your code for you. So when I do maintenance on adodbapi.py, I pretty much ignore Python 3 until the end of my development work and testing. Then I run it through 2to3 and test on Python 3.0 and 3.1. If that works, I commit the 2.x version to the tree. As part of the distribution process, the Python 3 version will be automagically created for the next release.
    There are a few Python 3 adaptations in the code, such as:
    v v v v code starts here v v v v
    import sys
    import types

    # Pick the string and byte-buffer types for the running interpreter.
    if sys.version_info[0] >= 3:
        StringTypes = [str]
        makeByteBuffer = bytes
    else:
        makeByteBuffer = buffer
        bytes = str
        StringTypes = types.StringTypes

    def Binary(aString):
        """This function constructs an object capable of holding a binary (long) string value."""
        return makeByteBuffer(aString)
    ^ ^ ^ ^ code ends here ^ ^ ^ ^

    You keep maintaining the 2.x version until you no longer need it, and the 3.x version just follows automatically along until you are ready to commit to only it.

  5. Seconding VernonDCole's suggestion. I'm maintaining a bit of code that does some nasty things (both from the standpoint of what it does and of 2to3 conversion); an equivalent of bzr demandload is in use, for example (which 2to3 obviously cannot translate).

    That said... 2to3 is a bit of work up front, but pretty painless once you've got it finished, as long as your unit tests are non sucky.

    I personally avoid doctests as unittests like the plague, so I'm not sure how that'll play out for Django.

    Either way, http://pkgcore.org/pkgcore/browser/snakeoil/snakeoil/caching_2to3.py might be of interest. It's a bit nasty because 2to3 isn't written in a way that makes it easily extensible.

    That little script basically checks the directory $PY2TO3_CACHEDIR for the converted form of the input; if found, it uses that rather than doing the conversion process all over. Pretty hefty speed up when you're doing conversion on quite a few files (20s to 1.8s in my own usage).

    Particularly useful if you delay your 2to3 conversion until build/install time. With that approach, and w/ a buildbot setup, you can continually do py3k runs of your tests w/out burning way more CPU than needed.

    Either way, you *don't* have to go to 2.6 to support 3k; I'm managing 2.4 and up w/out many issues (2.3 would be viable except I require proper sets and genexps).
