Wednesday, December 2, 2009

A few thoughts on education

Lately I've been thinking quite a lot about education, both my own and in general. I ended up writing quite a bit about it. You can find all my thoughts here (PDF warning). I stuck it in a PDF because I think it's a more canonical form. I'm interested in hearing any thoughts people have, both about what I wrote and about education.

Tuesday, December 1, 2009

A month in review

I've now been blogging for 30 days in a row, so tonight is just going to be a simple post to finish the month of. First of all, WOW, another month of blogging every day completed. This month was great fun, and I managaged to keep the bullshit filler posts to a minimum (3 or 4 by my count). I also finished the month with 42700 total hits (easily a record), including hitting reddit and Hacker News several times. It's been great fun writing every night, hopefully I'll be able to keep up the regular posts, but for now I plan on taking a nice long nap, then I'll get back to the requests.

Monday, November 30, 2009

You Built a Metaclass for *what*?

Recently I had a bit of an interesting problem, I needed to define a way to represent a C++ API in Python. So, I figured the best way to represent that was one class in Python for each class in C++, with a functions dictionary to track each of the methods on each class. Seems simple enough right, do something like this:

class String(object):
functions = {
"size": Function(Integer, []),
}


We've got a String class with a functions dictionary that maps method names to Function objects. The Function constructor takes a return type and a list of arguments. Unfortunately we run into a problem when we want to do something like this:

class String(object):
functions = {
"size": Function(Integer, []),
"append": Function(None, [String])
}


If we try to run this code we're going to get a NameError, String isn't defined yet. Django models have a similar issue, with recursive foreign keys. Django's solution is to use the placeholder string "self", and have a metaclass translate it into the right class. Also having a slightly more declarative API might be nice, so something like this:

class String(DeclarativeObject):
size = Function(Integer, [])
append = Function(None, ["self"])


So now that we have a nice pretty API we need our metaclass to make it happen:

RECURSIVE_TYPE_CONSTANT = "self"

class DeclarativeObjectMetaclass(type):
def __new__(cls, name, bases, attrs):
functions = dict([(n, attr) for n, attr in attrs.iteritems()
if isinstance(attr, Function)])
for attr in functions:
attrs.pop(attr)
new_cls = super(DeclarativeObjectMetaclass, cls).__new__(cls, name, bases, attrs)
new_cls.functions = {}
for name, function in functions.iteritems():
if function.return_type == RECURSIVE_TYPE_CONSTANT:
function.return_type = new_cls
for i, argument in enumerate(function.arguments):
if argument == RECURSIVE_TYPE_CONSTANT:
function.arguments[i] = new_cls
new_cls.functions[name] = function
return new_cls

class DeclarativeObject(object):
__metaclass__ = DeclarativeObjectMetaclass



And that's all their is to it. We take each of the functions on the class out of the attributes, create a normal class instance without the functions, and then we do the replacements on the function objects and stick them in a functions dictionary.

Simple patterns like this can be used to build beautiful APIs, as is seen in Django with the models and forms API.

Sunday, November 29, 2009

Getting Started with Testing in Django

Following yesterday's post another hotly requested topic was testing in Django. Today I wanted to give a simple overview on how to get started writing tests for your Django applications. Since Django 1.1, Django has automatically provided a tests.py file when you create a new application, that's where we'll start.

For me the first thing I want to test with my applications is, "Do the views work?". This makes sense, the views are what the user sees, they need to at least be in a working state (200 OK response) before anything else can happen (business logic). So the most basic thing you can do to start testing is something like this:

from django.tests import TestCase
class MyTests(TestCase):
def test_views(self):
response = self.client.get("/my/url/")
self.assertEqual(response.status_code, 200)


By just making sure you run this code before you commit something you've already eliminated a bunch of errors, syntax errors in your URLs or views, typos, forgotten imports, etc. The next thing I like to test is making sure that all the branches of my code are covered, the most common place my views have branches is in views that handle forms, one branch for GET and one for POST. So I'll write a test like this:

from django.tests import TestCase
class MyTests(TestCase):
def test_forms(self):
response = self.client.get("/my/form/")
self.assertEqual(response.status_code, 200)

response = self.client.post("/my/form/", {"data": "value"})
self.assertEqual(response.status_code, 302) # Redirect on form success

response = self.client.post("/my/form/", {})
self.assertEqual(response.status_code, 200) # we get our page back with an error


Now I've tested both the GET and POST conditions on this view, as well the form is valid and form is invalid cases. With this strategy you can have a good base set of tests for any application with not a lot of work. The next step is setting up tests for your business logic. These are a little more complicated, you need to make sure models are created and edited in the right cases, emails are sent in the right places, etc. Django's testing documentation is a great place to read more on writing tests for your applications.

Saturday, November 28, 2009

Django and Python 3

Today I'm starting off doing some of the posts people want to see, and the number one item on that list is Django and Python 3. Python 3 has been out for about a year at this point, and so far Django hasn't really started to move towards it (at least at a first glance). However, Django has already begun the long process towards moving to Python 3, this post is going to recap exactly what Django's migration strategy is (most of this post is a recap of a message James Bennett sent to the django-developers mailing list after the 1.0 release, available here).

One of the most important things to recognize in this that though there are many developers using Django for smaller projects, or new projects that want to start these on Python 3, there are also a great many more with legacy (as if we can call recent deployments on Python2.6 and Django 1.1 legacy) deployments that they want to maintain and update. Further, Django's latest release, 1.1, has support for Python releases as old as 2.3, and a migration to Python 3 from 2.3 is nontrivial. However, it is significantly easier to make this migration from Python 2.6. This is the crux of James's plan, people want to move to Python 3.0 and moving towards Python 2.6 makes this easier for them and us. Therefore, since the 1.1 release Django has been removing support for one point version of Python per Django release. So, Django 1.1 will be the last release to support Python 2.3, 1.2 will be the last to support 2.4, etc. This plan isn't guaranteed, if there's a compelling reason to maintain support for a version for longer it will likely override this plan (for example if a particularly common deployment platform only offered Python 2.5 removing support for it might be delayed an additional release).

At the end of this process Django is going to end up only supporting Python 2.6. At this point (or maybe even before), a strategy will need to be devised for how to actually handle the switch. Some possibilities are, 1) having an official breakpoint, only one version is supported at a given time, 2) Python 3 support begins in a branch that tracks trunk and eventually it switches to become trunk once Python 3 is the more common deployment, 3) Python 2.6 and 3 are supported from a single codebase. I'm not sure which one of these is easiest, other projects such as PLY have chosen to go with option 3, however my inclination is that option 2 will be best for Django since issues like bytes vs. string are particularly prominent in Django (since it talks to so many external data sources).

For people who are interested Martin von Löwis actually put together a patch that, at the time, gave Django Python 3 support (at least enough to run the tutorial under SQLite). If you're very interested in Django on Python 3 the best path would probably be to bring that patch up to date (unless it's wildly out of date, I haven't checked), and starting to fix new things that have been introduced since the patch was written. This work isn't likely to get any official support, since maintaining Python 2.4 support and Python 3 would be far too difficult, however there's no reason you can't maintain the patch externally on something like Github or Bitbucket.

Friday, November 27, 2009

Why Meta.using was removed

Recently Russell Keith-Magee and I decided that the Meta.using option needed to be removed from the multiple-db work on Django, and so we did. Yesterday someone tweeted that this change caught them off guard, so I wanted to provide a bit of explanation as to why we made that change.

The first thing to note is that Meta.using was very good for one specific use case, horizontal partitioning by model. Meta.using allowed you to tie a specific model to a specific database by default. This meant that if you wanted to do things like have users be in one db and votes in another this was basically trivial. Making this use case this simple was definitely a good thing.

The downside was that this solution was very poorly designed, particularly in light on Django's reusable application philosophy. Django emphasizes the reusability of application, and having the Meta.using option tied your partitioning logic to your models, it also meant that if you wanted to partition a reusable application onto another DB this easily the solution was to go in and edit the source for the reusable application. Because of this we had to go in search of a better solution.

The better solution we've come up with is having some sort of callback you can define that lets you decide what database each query should be executed on. This would let you do simple things like direct all queries on a given model to a specific database, as well as more complex sharding logic like sending queries to the right database depending on which primary key value the lookup is by. We haven't figured out the exact API for this, and as such this probably won't land in time for 1.2, however it's better to have the right solution that has to wait than to implement a bad API that would become deprecated in the very next release.

Thursday, November 26, 2009

Just a Small Update

Unfortunately, I don't have an interesting, contentful post today. Just a small update about this blog instead. I now have a small widget on the right hand side where you can enter topics you'd like to hear about. I don't always have a good idea of what readers are interested in, and far too often I reject blog post ideas because I think either, "no one cares about that" or "everyone always knows that" so hopefully this will be both a good way for me to write interesting content that people want to read about, as well as a good way for me to overcome any writers block. So please submit anything you'd like to hear about, Python, Django, the web, programming in general, compilers, or me ranting about politics, I'm willing to consider any topic.

To my American readers: Happy Thanksgiving!