Monday, November 10, 2008

How the Heck do Django Models Work

Anyone who has used Django for just about any length of time has probably used a Django model, and possibly wondered how it works. The key to the whole thing is what's known as a metaclass, a metaclass is essentially a class that defines how a class is created. All the code for this occurs is here. And without further ado, let's see what this does.

So the first thing to look at is the method, __new__, __new__ is sort of like __init__, except instead of returning an instance of the class, it returns a new class. You can sort of see this is the argument signature, it takes cls, name, bases, and attrs. Where __init__ takes self, __new__ takes cls. Name is a string which is the name of the class, bases is the class that this new class is a subclass of, and attrs is a dictionary mapping names to class attributes.

The first thing the __new__ method does is check if the new class is a subclass of ModelBase, and if it's not, it bails out, and returns a normal class. The next thing is it gets the module of the class, and sets the attribute on the new class(this is going to be a recurring theme, getting something from the original class, and putting it in the right place on the new class). Then it checks if it has a Meta class(where you define your Model level options), it has to look in two places for this, first in the attrs dictionary, this is where it will be if you stick your class Meta inside your class. However, because of inheritance, we also have to check if the class has an _meta attribute already(this is where Django ultimately stores a bunch of internal information), and handle that scenario as well.

Next we get the app_label attribute, for this we either use the app_label attribute in the Meta class, or we pull it out of sys.modules. Lastly(at least for Meta), we build an instance of the Options class(which lives at django.db.models.options.Option) and add it to the new class as _meta. Next, if this class isn't an abstract base class we add the DoesNotExist and MultipleObjectsReturned exceptions to the class, and also inherit the ordering and get_latest_by attributes if we are a subclass.

Now we start getting to adding fields and stuff. First we check if we have an _default_manager attribute, and if not, we set it to None. Next we check if we've already defined the class, and if we have, we just return the class we already created. Now we go through each item that's left in the attrs dictionary and class the add_to_class method with it on the new class. add_to_class is a piece of internals that you may recognize from my first two blog posts, and what exactly it does I'll explain exactly what it does in another most, but at it's most basic level it adds each item in the dictionary to the new class, and each item knows where exactly it needs to get put.

Now we do a bunch of stuff to deal with inherited models. We iterate through ever item in bases, that's also a subclass of models.Model, and do the following: if it doesn't have an _meta attribute, we ignore it. If the parent isn't an abstract base class, if we already have a OneToOne field to it we set it up as a primary key, otherwise we create a new OneToOne field and install it as a primary key for the model. And now, if it is an abstract class, we iterate through the fields, if any of these fields has a name that is already defined on our class, we raise an error, otherwise we add that field to our class. And now we move managers from the parents down to the new class. Essentially we juts copy them over, and we also copy over virtual fields(these are things like GenericForeignKeys, which doesn't actually have a database field, but we still need to pass down and setup appropriately).

And then we do a few final pieces of cleanup. We make sure our new class doesn't have abstract=True in it's _meta, even if it's inherited from an abstract class. We add a few methods(get_next_in_order, and other), we inherit the docstring, or set a new one, and we send the class prepared signal. Finally, we register the model with Django's model loading system, and return the instance in Django's model cache, this is to make sure we don't have duplicate copies of the class floating around.

And that's it! Obviously I've skirted over how exactly somethings occur, but you should have a basic idea of what occurs. As always with Django, the source is an excellent resource. Hopefully you have a better idea of what exactly happens when you subclass models.Model now.


  1. Thanks for the walk through. The only part that puzzles me is the description of how _default_manager is handled, which looks backwards from what the code is doing (and the code seems wrong?)

  2. Dave, the reason it seems confusing is that I didn't cover how add_to_class and contribute_to_class work, which I'll be doing in a a future blog post, essentially when the managers get added to class they need to know whether or not the model already has a default manager, and if not, the first manager declared becomes the default, I'll be going over this in greater detail sometime soon.

  3. That helps with what the code is doing. But for the description to match what's happening on line 70, I'm thinking that the "and if not," in paragraph five needs to be "and if so,".

  4. Just a small note, where you say:

    "So the first thing to look at is the method, __new__, __new__ is sort of like __init__, except instead of returning an instance of the class, it returns a new class."

    Note that __new__ *always* returns a new instance, the __new__ method of a metaclass (that subclasses type) returns a new instance of that particular type (a new type == a new class in this particular case) while the __new__ method of a class (that subclasses object) returns a new instance of that particular class.

    __init__ *never* returns somethings, it only initializes the instance returned by __new__.



Note: Only a member of this blog may post a comment.