Bringing Deployment Capability to Drupal

by Greg Dunlap

At Drupalcon DC, Kent Bye from Lullabot interviewed me about Deploy module for the Drupal Voices podcast series. Deploy offers a potential solution to one of Drupal's most difficult problems - how to stage content and configuration between servers. Here's how it came to be:

Two years ago I began my first Drupal project, the conversion of NWSource.com from a variety of homegrown content management systems to Drupal. Lullabot came in to get the team up to speed. One of the first topics of discussion was "How do we manage the dev->qa->live process, both for our engineering team and our content team?" The answer was not encouraging. We learned about exporting things to code, but accepted that much of our admin work would have to be done by hand when deploying from server to server. The content team? They would have to do all their work on the live site. "That's just how everyone does it."

In response to my experiences working on NWSource I created Deploy, a module designed to push Drupal objects between servers. I really only had three goals in mind for the module: it had to be easy (GUI), it had to be database-agnostic, and it had to work without hacking core. It was a good proof-of-concept, but it stalled on a big problem: how do you synchronize changes between servers when their unique identifiers may not match? I won't go into the details here, but it's a thorny problem. This thread from the Drupal development list has a lot of discussion of this issue if you're interested / bored.

Having hit this wall I let the module languish. Fast forward a bit. I came to work for Palantir, and my first project was the new website for Foreign Affairs magazine. One of their central requirements was that their editorial team be able to work on a staging server and not only push content live, but also update it with changes as necessary. They also wanted their dev team to be able to deploy changes from their development environments to QA with a minimum of hand-tweaking of the admin. This was my first major task at Palantir.

With the help of the amazing engineering team at Palantir, I designed a system that maps Universally Unique Identifiers (UUIDs) to Drupal serial IDs. This gives every Drupal object (node, taxonomy term, user, etc.) a unique identifier that can be used no matter where the object resides, but leaves Drupal's primary key management unchanged. This mapping is done in separate tables which live alongside their Drupal counterparts. When an object is deployed, the remote server is queried to see whether an object with that UUID already exists and needs to be updated, or has to be created. Either way, it gets pushed. The Services module is used to manage this communication and transfer process.
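To make the mapping idea concrete, here is a minimal sketch in Python. It is not Deploy's actual code (Deploy is a Drupal module and talks to the remote server through Services); the Site class, its fields, and the deploy() function are all hypothetical names invented for illustration. It only shows the principle: each server keeps its own serial IDs, and a side table keyed by UUID decides whether an incoming object is created or updated.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for one Drupal server (not a real Drupal API)."""
    nodes: dict = field(default_factory=dict)     # local serial nid -> content
    uuid_map: dict = field(default_factory=dict)  # UUID -> local nid (the side table)
    next_nid: int = 1

    def create(self, content, node_uuid=None):
        """Store content under a fresh local nid and record its UUID."""
        nid = self.next_nid
        self.next_nid += 1
        self.nodes[nid] = content
        node_uuid = node_uuid or str(uuid.uuid4())
        self.uuid_map[node_uuid] = nid
        return node_uuid

    def receive(self, node_uuid, content):
        """Update the object if this UUID is already known, else create it."""
        nid = self.uuid_map.get(node_uuid)
        if nid is None:
            self.create(content, node_uuid)
            return "created"
        self.nodes[nid] = content
        return "updated"


def deploy(source, target, node_uuid):
    """Push one object from source to target, identified only by its UUID."""
    local_nid = source.uuid_map[node_uuid]
    return target.receive(node_uuid, source.nodes[local_nid])


# The two servers disagree about serial IDs, which is the whole problem.
staging, live = Site(), Site(next_nid=40)
article = staging.create("Draft text")
first = deploy(staging, live, article)    # no such UUID on live yet
staging.nodes[staging.uuid_map[article]] = "Final text"
second = deploy(staging, live, article)   # same UUID, so live is updated
```

In this sketch the article is node 1 on staging and node 40 on live, and neither server ever has to change its own primary keys.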

In addition to this, I designed a system which manages dependencies between Drupal objects. For instance, I can deploy a node, but that node might also have 4 taxonomy terms, an attached file, and nodereferences to three other nodes. You can't push this node and have it save properly without also pushing all the objects it depends on. Deploy manages this through a series of hooks that work recursively through the various dependencies and add them for deployment as necessary. This means that simply by choosing one node for deployment, you may end up deploying dozens of other objects. It also weights them so that the objects which need to be pushed first go before the ones that depend on them.
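The recursive walk and ordering can be sketched in a few lines of Python. Again, this is an illustration under my own assumptions rather than Deploy's implementation: plan_deployment(), the dependencies_of callback (standing in for the hooks), and the object names are all hypothetical, and a depth-first post-order traversal stands in for the module's weighting.

```python
def plan_deployment(root, dependencies_of):
    """Return root plus everything it depends on, dependencies first.

    dependencies_of is a callback that plays the role of the dependency
    hooks: given an object, it returns the objects that must exist first.
    """
    ordered = []
    seen = set()

    def visit(obj):
        if obj in seen:   # already planned; also guards against cycles
            return
        seen.add(obj)
        for dep in dependencies_of(obj):
            visit(dep)
        ordered.append(obj)  # appended only after all of its dependencies

    visit(root)
    return ordered


# Hypothetical content: an article with a term, a file, and a referenced node.
DEPENDENCIES = {
    "node:article": ["term:politics", "file:cover.jpg", "node:sidebar"],
    "node:sidebar": ["term:politics", "user:editor"],
}

plan = plan_deployment("node:article", lambda obj: DEPENDENCIES.get(obj, []))
```

Choosing the single article yields a plan of five objects, with the article itself last, so every object it needs already exists on the remote server by the time it is saved.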

Finally, Deploy contains a solid API developers can use to easily build custom scripts for deploying objects in ways that aren't handled out of the box. As an example, Foreign Affairs wanted to be able to deploy all the content for one issue of their print magazine with a single click. I was able to build this simple module in about a day once the deployment framework was in place. Third-party module developers can also hook into the dependency checking / deployment process in order to add their own functionality. For instance, if someone wanted to deploy Filefield files, they could hook into the node deployment process to update the node object as necessary.

Reading all that makes it sound incredibly complex, but on the front end it is dead simple. Don't believe it? Here's a screencast demonstrating the process in action.

Deploy is currently available with support for nodes, all standard CCK fields, Date fields, users, taxonomy and system settings implemented with system_settings_form(). Very soon I will be adding support for filefield / imagefield, views, content types, and comments. I will also be implementing the pluggable extension framework from Chaos Tools to make it easier for third party modules to integrate with Deploy. Development is extremely active. Come try it out, or even better, help!

Comments

You know, I got to see this module as it was developed and I've seen the demo multiple times... and it's still wicked cool every time I see it. :-) Seriously awesome stuff, Greg!

This is just great stuff you did and presented, but instantly draws a question:

So it's clear why you had to "invent" UUIDs, but doesn't it mean death to hard-linking to nodes via NID (e.g. in some body or cck text field) and also Pathauto settings using the [nid] token and all modules / filters that assist you in linking to nodes?

While I don't have a problem per se with this consequence, it would require quite some thoughtwork to build nid-independent ways. Or does Deploy also support hooks to search and replace / re-filter? Hm..

Yes, this is a good point. Text field hardlinking will definitely break. Pathauto settings using [nid] will probably break; I haven't tested it but that makes sense. As far as modules and filters that allow you to link to nodes, these may or may not work. For instance, Deploy's nodereference handling currently manages this behind the scenes. If you nodereference node 1 to node 2 and deploy them both, they will remain nodereferenced even if they become node 5 and node 6. It is likely that such custom solutions will need to be written for any module like this, but it's pretty easy. nodereference_deploy.module is only 123 lines of code, of which half are comments.

Once I have implemented the pluggable extensions from Chaos Tools, writing deploy plugins will be pretty simple, and I hope to be submitting patches to lots of major modules in the coming months.

Chaos Tools are certainly suited to write plugins, but is there any chance of not focusing on "deploy" plugins but on "generic" field plugins instead? It's devastating to see how each and every import/export/deployment solution reinvents CCK field support over and over again.

I started a plugin-based, generic approach with Field Tool (which is also using the Chaos Tools to do the plugin stuff), maybe you could have a look at that and try to pimp it up instead of creating yet another module-specific field plugin system. Or if you don't like it, come up with your own reusable plugin system by yourself - just please, don't constrain yourself to "deploy" plugins.

Greg, way to go!!!

I've been using Drupal for the last 2 years, mostly for big organizations' websites / applications. The deployment process was / is hell. It is so cumbersome and time wasting.
I think that your initiative is the single most needed function that was *badly* missing for Drupal. Obviously it is merely a start but now that you've set up the foundations I am sure that the required implementors and services will grow quite fast.

Well well done!
Udi.