Skip to main content

On Ruby on Rails with PostgreSQL, and Acts as Paranoid

Back a few years ago I was researching differences between PostgreSQL and MySQL databases, and chose PostgreSQL because at the time it supported foreign key constraints and many other fantastic SQL extensions that make developer's life a lot easier. Today I am sure MySQL is just as functional as PostgreSQL, and it does appear to be a more popular choice as the Rails DB than MySQL. I still prefer PostgreSQL, it just feels more natural to me coming out of Oracle background, so I am probably biased (who isn't?) Anyway, in the last year and a half or so, I've been writing Rails apps that use PG-based databases and found a few random tricks I'd love to share here. Opinions differ here, but just like accountants like the ancient double entry accounting system, I personally prefer a double validation system - where Rails validates my objects before/after updates, but the database double checks this using proper constraints. Rail's validation system is very robust and extensive, and it is its power. However nobody is protected from human error, and it's pretty easy to forget certain validations, some validations may be performed conditionally, and in general rails validations are at a higher level. I've certainly missed a validation or two in the past myself, or had a bug in my condition. So as a rule, I specifically like to use in all of my migrations:

  • Foreign key constraints, with "ON DELETE CASCADE" to allow tests to drop/remove fixture data
  • Check constraints to verify values satisfy certain conditions. Use these to validate finite value columns, such as state/status, gender, inheritance discriminator column, or polymorphic table's type discriminator. Use it to validate pattern based columns, eg if you decide to store SSN as a string of the form XXX-XX-XXXX, a check constraint can ensure it really is.

Foreign Key Constraints

Let's start with foreign key constraints. They are extremely useful in ensuring your relationships work and do not point to non-existing rows. It is even more important if you bypass ActiveRecord and use direct SQL in certain cases to insert data (as this may be the case with some batched or high-volume data loads). I use foreign keys on ALL of my tables, and haven't had much trouble with that. The only issue to watch out for, is the order in which your fixtures are loaded. Basically you'd want to load your independent tables first, such as countries/states - those that do not depend on anything, and then load fixtures that depend on them in the order of dependence.

Please see the following post for an opposite opinion, and note that I don't agree with author's suggestion not to use foreign keys - read first comment, I too was saved on multiple occasions by having them, especially in the development (since that's where you'll be finding most bugs in the first place).

Here's how to add a foreign key constraint in your migration:

create_table :profiles do |t|
   t.column :user_id,    :integer, :null => false
end
execute "ALTER TABLE profiles 
          ADD CONSTRAINT profiles_fk_user_id
          FOREIGN KEY (user_id) 
          REFERENCES users (id) 
          ON DELETE CASCADE"

Notice how all constraints are named. This is very important, because you want to be able to change/update constraints in the future (especially the check constraints - see below). Not giving a constraint a name explicitly forces DB to come up with an auto-generated name, which is not very useful. Any future migration that needs to change this constraint would be at loss as to how to reference it reliably. Using a proper naming convention is also a good idea: table_fk_field is common naming practice for foreign key constraints.

Check Constraints

Check constraints verify that values in a column satisfy provided conditions. They are extremely useful in ensuring you don't get "^%$$$" as your gender value for user Bob, or "Frak!" as a status for your order. Database can be manipulated in many ways, and in my experience unless you protect your columns some weird stuff always ends up in there, whatever the greatest validation framework sits in front of it. In this hypothetical example, we need to ensure that the gender column on our frogs table only allows "M", "F" or "T" as possible gender values (I live in San Francisco, yo!). So in the migration file, specify check constraint similar to how we did this for foreign key constraints:

create_table :frogs do |t|
   t.column :name, :string
   t.column :gender, :string,   :limit => 1, :null => false
end

execute "ALTER TABLE frogs 
          ADD CONSTRAINT frogs_check_gender 
          CHECK (gender IN ('M', 'F', 'T'))"

Please see the following link for more information on PostgreSQL powerful constraints syntax: PostgreSQL Constraints

Partial Indexes, and Acts as Paranoid

If you had to install acts_as_paranoid plugin, you'd notice that it adds deleted_at column which is going to be NULL for all values that are active in your database. Calling destroy will not physically delete the record, but would merely set deleted_at to the current time stamp. By declaring a model as "acts_as_paranoid" it will also add the following to all queries: and (deleted_at is NULL or deleted_at > ?) First of all, the "deleted_at > ?" clause is meant for allowing models to expire in the future, but I haven't met a single person who uses this feature. So the first thing I do when I install this plugin is I modify it's source to remove this comparison. I want my find statements generate the following: and (deleted_at is NULL) Ok, so things are great now, but suddenly I realize that my favorite index on users table and email field, does not really work as well! But of course... I am now querying on both "email = ? and deleted_at is null". So how do you add deleted_at to the index? Here's how:

execute "CREATE INDEX users_idx_email 
            ON users (email, deleted_at) 
            WHERE deleted_at IS NULL"

This uses a cool feature of PostgreSQL called partial index to create an index on subset of values.

Comments and thoughts/suggestions are always welcome! References:

  1. http://www.oreillynet.com/ruby/blog/2005/11/migrating_to_ruby_on_rails_and.html
  2. http://www.postgresql.org/docs/current/static/index.html

Comments

Chris said…
Thanks for taking the time to write up your post. I realized in reading it that acts_as_paranoid was keeping mysql from doing query caching on account of the perpetual deleted_at >= current_time check. I just pruned it out as you suggested. thanks!
seebs said…
Thanks, very helpful. In particular, helpful because I forgot to add constraints when I first defined a couple of migrations.

I love constraints so much. I spent a number of months working on an app which had no constraints, and it was... unpleasant.

Popular posts from this blog

Car or Auto Make-Model-Year Database : For Breakfast

Make Model What?If you like me were tasked with loading a database of recent car makes/models/years, you would start by looking on the web and seeing if someone else just has it out there, readily available, hopefully for free, but perhaps for a tiny nominal fee.?If only it was that simple... I looked and looked, and couldn't find anything that would fit the above requirements. So I thought, who would know about US car models better than Kelly Blue Book? So I went on their site, and sure enough they have a javascript file that lists all known to them makes and models of used cars. Since the file is public, I figured it's not really "evil" if I scrape and parse it for my own benefit. Disagree? Have a better source? Then leave a comment.Anyway, to cut the long story short, I'm hoping to save a day or so to someone else who may, like me, be looking for this information. The ruby module shown below retrieves and parses the javascript from KBB site into a Ruby da…

Getting RMagic and friends to work on OS-X Mountain Lion

Upgraded my ruby environment today to Mountain Lion.

Here is a quick checklist that I went through to get everything working.  The largest change was having to reinstall XCode and command line tools, and also download XQuarts in order to reinstall ImageMagick successfully. Without it, I was getting errors building RMagick of the following shape:
ld: file not found: /usr/lib/libltdl.7.dylib for architecture x86_64clang:
error: linker command failed with exit code 1
(use -v to see invocation)make: *** [RMagick2.bundle] Error 1

Quick checklist: Install Mountain Lion Install XCode 4.4 Install command line tools from XCode 4.4 Preferences dialog Install XQuartzIn terminal run brew update brew uninstall imagemagick brew install --fresh imagemagick wipe out your ~/.rvm folder reinstall RVM and install the latest ruby 1.9.3-p-194 run "bundle" in the project folder run "rake" and rejoice
References:

https://github.com/mroth/lolcommits/issues/65

Wanelo Tech Gems: we've been busy!

It's been quite some time that I've posted here, to my personal blog, so long in fact, that the publishing interface a'la Blogger now looks like MS Word. Or God forbid you remember, Word Perfect. (Yes, I am that old).

But that's not what I wanted to write about.

Wanelo, the team that I lead as a CTO, has been kicking so much ass lately, that I've just had no time documenting personal projects here, even though there have been plenty.

So to sort of catch up in one blog post at once, I wanted to put a few links to some of the great content on our technical blog, that's been recently migrated to Github Pages.

Without further ado:
Multi-process or multi-threaded design for Ruby daemons A pretty awesome blog post by our own Eric Saxby about thinking in terms of GIL (global interpreter lock in ruby), it's impact on production environments, especially in multi-threaded environments, and where long-running processes may have impact on your database transactions. Ver…