Lazy Pythonista: Why Meta.using was removed

Friday, November 27, 2009

Recently Russell Keith-Magee and I decided that the Meta.using option needed to be removed from the multiple-db work on Django, and so we did. Yesterday someone tweeted that this change caught them off guard, so I wanted to provide a bit of explanation as to why we made that change.

The first thing to note is that Meta.using was very good for one specific use case: vertical partitioning by model. Meta.using allowed you to tie a specific model to a specific database by default. This meant that if you wanted to do things like keep users in one db and votes in another, it was basically trivial. Making this use case this simple was definitely a good thing.
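
To illustrate the idea of pinning a model to a database, here is a minimal plain-Python sketch. It is not the code from the multi-db branch: the classes are stand-ins rather than real Django models, and the aliases `users_db` and `votes_db` and the helper `default_database_for` are hypothetical names made up for this example.

```python
# Stand-in classes, not Django models; they only mimic an inner Meta class.
class User:
    class Meta:
        using = "users_db"  # hypothetical database alias


class Vote:
    class Meta:
        using = "votes_db"  # hypothetical database alias


class Comment:
    class Meta:
        pass  # no `using` option, so the fallback alias applies


def default_database_for(model, fallback="default"):
    """Return the alias a model is pinned to, or the fallback alias."""
    return getattr(model.Meta, "using", fallback)


print(default_database_for(User))
print(default_database_for(Vote))
print(default_database_for(Comment))
```

Note that the database choice lives inside the model definition itself, so changing it means editing the model's source.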

The downside was that this solution was very poorly designed, particularly in light of Django's reusable application philosophy. Django emphasizes the reusability of applications, and the Meta.using option tied your partitioning logic to your models. It also meant that if you wanted to partition a reusable application onto another DB, the only easy solution was to go in and edit the source of that application. Because of this we had to go in search of a better solution.

The better solution we've come up with is some sort of callback you can define that decides which database each query should be executed on. This would let you do simple things, like directing all queries on a given model to a specific database, as well as more complex sharding logic, like sending a query to the right database depending on the primary key value the lookup uses. We haven't figured out the exact API for this, so it probably won't land in time for 1.2. However, it's better to have the right solution that has to wait than to implement a bad API that would become deprecated in the very next release.
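
Since the exact API is undecided, the following is only a rough plain-Python sketch of what such a callback might look like, not a proposal for the final design. Everything in it is an assumption made for illustration: the function name `choose_database`, its arguments, the model names, and the database aliases are all hypothetical.

```python
# Hypothetical configuration that lives in the project, not in the models.
MODEL_DATABASES = {
    "User": "users_db",
    "Vote": "votes_db",
}

# Hypothetical sharded model: rows are spread across several databases.
SHARDED_MODELS = {
    "Event": ["events_shard_0", "events_shard_1", "events_shard_2"],
}


def choose_database(model_name, pk=None):
    """Pick a database alias for a query on `model_name`.

    `pk` is the primary key value the lookup is by, if there is one.
    """
    shards = SHARDED_MODELS.get(model_name)
    if shards is not None and pk is not None:
        # Simple modulo sharding on an integer primary key.
        return shards[pk % len(shards)]
    # Per-model routing, falling back to the default database.
    # A sharded lookup without a pk also lands here; a real system
    # would probably have to query every shard instead.
    return MODEL_DATABASES.get(model_name, "default")


print(choose_database("Vote"))
print(choose_database("Event", pk=7))
print(choose_database("Comment"))
```

Because the mapping sits outside the models, a reusable application could be moved to another database without touching its source.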

6 comments:

Waldemar Kornewald, November 27, 2009 at 2:56 AM
Will you also allow for rewriting/proxying the query with that API? For example, with sharding it might be necessary to transparently run a query on multiple DBs. With non-relational backends you might have to emulate some features like in-memory joins or select_related.
Doug Napoleone, November 27, 2009 at 4:38 AM
Waldemar beat me to it. I would recommend looking at what Storm and other sharding-based systems use for inspiration. Look at what people love, what they hate, and, most importantly and hardest to discover, what is just taken for granted. The aspects that are taken for granted are typically the most crucial parts, and are the optimal solution.
Anonymous, November 27, 2009 at 10:37 AM
Any idea if we can have multi db production ready before 15th of January?

I kinda need it by then for a big production website :D
Kennu, November 29, 2009 at 4:42 PM
I think this is a good decision. The callback API for database selection is much more interesting for real scalability anyway. Low level control is more essential than convenience.

For me the two biggest problems seemed to be Sites and Users, which many tables need foreign keys to point to. I wish there were some mechanism to support "virtual" foreign keys, so that Sites and Users could be located in a separate database from the content while all apps still worked normally. Django would then maintain the foreign key relationships without creating actual foreign key columns in the database.
Jon Loyens, November 30, 2009 at 1:11 PM
I'm the guy who tweeted about Meta.using being a bit of a surprise. I also think I might be partially responsible for its removal, since I complained about the non-reusability of it on the django-developers group... I guess you can't win for losing :)

Anyway, I love the callback API idea. That would be ideal for our use case (we're using django-multidb against a set of legacy enterprise sharding DBs). Alex, if you need any help or want to ping us for feedback at any time, I can make myself or any member of my team available to you or Russ. You know how to get a hold of me :)
Anonymous, December 4, 2009 at 8:42 AM
Would it make sense to allow INSTALLED_APPS entries to be tuples, and include a default database with an app name? I wouldn't want to do away with the more complex callbacks, though, for sharding and such.