Porting SQLite code to Postgres

2012-07-04

Background

I have a model I work on at work. Model output is a SQLite database; one database is created each run of the model. Currently there are about 30 tables and about 80 views. This exercise took maybe two days

I wanted as few branches as possible in the implementing language: wherever possible, I want the same SQL that works for both. In the end, the only bit of SQL that isn't common to both is the primary key type in table declarations

The same code in my system can now write to SQLite or Postgres databases. I have two methods, one to open a SQLite database and set some parameters, and one to open a Postgres database and run a couple bits of SQL. There's only about 20 lines difference, all in setup

Finally, I liked the idea of being able to host the database on a server, and having folks write to that database. The current cycle goes:

  1. User develops inputs
  2. User runs model
  3. User calls me to figure out why it didn't do what they want

Making it so that last step doesn't start with "ok, send me the database" seemed like it would be helpful. [The model solves what I heard one guy describe as "the hardest easy problem he knows of"; it sounds easy right up until the moment you try it]

Why Postgres?

There's a bunch of reasons that I went with Postgres for this exercise:

  • Postgres Schemas. These were the big one; using schemas, I can segregate runs the same way I do with SQLite ["one SQLite .db for each run" maps neatly to "one schema for each run", and avoids the permission trainwreck in "one mysql database for each run"]
  • Postgres datatypes more closely match SQLite datatypes
  • MySQL angers me in new ways every time I use it, and I wanted to try something else
  • After fifteen minutes using it, I noticed postgres is full-blown anal about everything, which appeals to my inner OCD coder

List of bullet points

These are all the things that came up while I was doing this. These are the real reason I'm writing this up, for my own future reference

SQLite Datatypes

  • SQLite "INTEGER PRIMARY KEY" => Postgres "SERIAL"
  • SQLite has pretty weak typing. That's OK, and I enjoy the benefits it provides. But moving to Postgres, make sure you're putting the right types in the right columns.
    • I had a table of key-value pairs for "generic input parameters", of varying types. I made the values all "text", and do conversion in the model instead of expecting the database to do it
    • Use CAST; Postgres knows how to convert those text values to integers
  • Because SQLite's "boolean" is really an integer valued 0/1:
    • SUM(boolCol) works in sqlite, not in Postgres: SUM(CAST(boolCol AS INTEGER)) works in both.
    • I was using WHERE boolCol=1 and WHERE boolCol=0 in a bunch of places. Now it's just WHERE boolCol and WHERE NOT boolCol

SQLite-specific functions

  • I was using a couple functions in SQLite that aren't available in Postgres. Luckily, Postgres lets the user create functions. So in the same DB setup code where I create the schema, I also now add a couple functions.
  • This way I don't have to create different versions of the view SQL code for different databases
    • GROUP_CONCAT: Implementation here
    • DATETIME: I wrote a function that takes the same arguments as the SQLite one does, check that the last two are 'unixepoch' and 'localtime', and return Postgres' to_timestamp(epoch_seconds)

GROUP BY

  • General SQL thing: When using GROUP BY, *every* column you SELECT must be either an aggregate column or in the GROUP BY clause
    • SQLite lets you SELECT columns that aren't one of these things; if you know of some relation where it doesn't matter which row it pulls the value from after aggregation, it'll always be the same, then it's a convenient shortcut
    • Postgres doesn't like that
  • When using GROUP BY, postgres is more anal about ambiguity. Fully specify the column name [ie, with a "table." prefix] for columns in the GROUP BY clause

Other stuff

  • Postgres demands that every subselect be aliased to something, even if all you're doing is a single subselect, and then an aggregate function against it
  • SQLite lets you get away with using double-quotes for values. Not so Postgres
  • Postgres doesn't like "IF NOT EXISTS" on CREATE {TABLE,VIEW,INDEX}
  • SQLite lets you create tables with FK constraints to other tables that don't exist. FKs are enforced in SQLite at write-time. Postgres wants them all correct for DB correctness. Simple bit of table creation re-ordering
  • SQLite lets you use an ON clause in joins with a table further down the list of joins. Postgres doesn't
  • Triggers: Probably won't be portable. This incentivised me to look at my triggers in more detail, and in every case but one I found there was a better way to achieve the same goal. I'm calling this a plus [the one remaining ease actually only solves a problem that's specific to SQLite anyway]

Afterthoughts

It's still faster to have people run the SQLite version and mail me the db than it is to wait for the model to write to Postgres

Postgres is so much slower, I think I'm probably doing something wrong

This helped though

I found a bunch of places where my code should never have worked; this is why porting stuff is always good. [Same with Linux/OSX/Windows porting, but perhaps that's a topic for another day]

Overall, SQLite is just way more easygoing about everything. All kinds of stuff you can get away with in SQLite, Postgres won't stand for

This was fun

Addendum based on some feedback

I posted this page to reddit/r/PostgreSQL and got some feedback. This should answer most of that

  • Some of these things are better in newer versions of Postgres. I was specifically including 8.3 in my targets
  • Having the *exact same* SQL for different DBs was a primary goal. I didn't want to:
    • Litter my code with if(dbType==pg) do_something; else do_something_else;
    • Use the same PK type but then have to run an extra SQL query to create the sequence for each table
    • Switch between GROUP_CONCAT and STRING_AGG based on database
  • Overall, everything here now works in both databases. There are lots of things that "could be done a different way with Postgres 9.1", but that'd be taking away the 8.3 ability. I always find it faintly annoying when people are willing to sacrifice backward compatability without actually gaining anything


Gary "ChunkyKs" Briggs <chunky@icculus.org>