A complex database architecture requires a reliable change management tool.
We have a lot of relational databases. Databases for subscriber configuration. Databases for reporting. Databases for reputation. Databases for billing. A quick count of our production systems turns up 26 databases redundantly deployed to our two data centers. Then there are the 18 QA databases that duplicate production, the eight or so for customer integration testing, plus an uncountable number of development and testing databases.
Sure, a lot of those have the same schemas, but those schemas themselves are often constructed from multiple components. Our reporting databases, to take the most extreme example, keep a subset of subscriber configuration, manage six different categories of reports, and split data among a number of shards. All told, we currently have 40 separate database projects composed into all these databases.
So, one might reasonably wonder, how do we manage all of these projects and databases? The answer is Sqitch, the database change management system we’ve developed over the last three years.
Sqitch was designed with a number of goals in mind.
First, it mustn’t be tied to a specific framework. Many applications manage their database via migration systems bundled with their frameworks or ORMs. But we have a heterogeneous environment, with applications written in Java, Perl, Python, Groovy, and a number of third-party applications. Sqitch doesn’t care about languages, frameworks, ORMs, or any other applications. It aims to be as useful a database change management system as possible, and nothing more.
Neither must it care about what data storage system you use. It was designed with relational databases in mind, but ultimately it cares only about managing databases consistently, without regard to their capabilities or theoretical underpinnings.
To that end, it must be speak the native language of the databases with which it works. In practice, that means full support for the scripted programming environments for all supported database engines. Postgres changes are written for
psql, SQLite changes for
sqlite3, and Oracle changes for
sqlplus. This design allows developers to take full advantage of the features of their chosen database engines.
It must make database deployments as reliable and consistent as possible. In other words, we want the operations team to only feel secure, warm, and fuzzy when deploying changes to a production database. To accomplish this feat, Sqitch uses a hashing pattern pioneered by Git and popularized by the BitCoin block hashing algorithm to make it difficult to deploy changes if something unexpected has changed. If the the deployment record of a database doesn’t look exactly right, Sqitch refuses to apply any changes.
It must be able to manage multiple projects in a single database. The vision was to enable lots of smaller database components that we could compose into database solutions. In other words, enable greater reuse of database code, as is already endemic to application development. Sqitch’s approach allows the declaration of dependencies on changes in other Sqitch projects.
And finally, it must be developer-friendly, not only in the sense of its Git-inspired interface, but also to enable iterative, agile, or test-driven development. Unlike other migration systems, Sqitch doesn’t consider a change permanent once committed to the project VCS, but relies on tags (in the VCS sense) to identify releases. This approach frees developers to make more radical changes to a schema, or even to refactor the whole thing, with little fear of stepping on each other’s toes or disrupting production deployments.
When we started developing Sqitch, we only had two large databases, but plans for many smaller ones distributed between our data centers. Using Sqitch enabled both an iterative approach to the development of those databases – with greater componentization of the code into composable parts – and reliable deployments to our production systems. Even databases built from many Sqitch projects are reliably and consistently deployed across all of our servers and data centers.
The result is a multitude well-developed and tested, reliably-deployed relational databases. Furthermore, the ever-increasing number of projects and deployments causes no concern. Without Sqitch, it would be a nightmare. With Sqitch, database change management becomes just another application deployment task.