Developing Drupal sites: Plan or Perish

by Larry Garfield

One of the key challenges Drupal developers face when building a complex Drupal site is that there are many moving parts. It's not unusual for a mid- to large-sized site to have as many as a dozen content types, 50+ fields (some of them shared), 20 different Views displays, a couple of flags, some nodequeues ... there's a lot that goes into a Drupal build. Keeping track of all of them can be a challenge when building out a site. What was that View called? Did we remember to name fields consistently? Singular or plural? Wait, did we use Flag or Nodequeue here? Okay, who forgot to include help text!

While it's tempting to say "Drupal is configurable, and we're Agile, we'll figure it out as we go!", that's rarely a workable answer for any but the most trivial sites. There have been attempts in the past to automate an entire build off of some scripted format, but none have really caught on. At Palantir, though, we've realized that it's the planning step that is most important: Planning out how the entire site's content model and build will look is itself a valuable exercise, as it reduces errors and improves overall consistency and quality. To that end, we've developed a tool that we now use for all sites to aid in discovery, strategy, and planning: the Build Spec.

Another document? Great Druplicon, why?

Some folks may be flipping out at this point. Big specification documents are an artifact of waterfall processes, that hated dinosaur of software development that exists only in legends of time gone by and in the expectations of less-hip clients everywhere. "Design documents get out of date as soon as they're written!" you cry. "The code is the spec; that's the Agile way!"

Perhaps, but the Build Spec is not documentation. It's a tool for discovery and content strategy. As Dwight Eisenhower once said, "Plans are worthless, but planning is everything." While Drupal is extremely configurable, that doesn't mean step one of a new site is to start pushing buttons; it's to figure out "so what do we actually want to do?"

That process is commonly called Content Strategy these days, and is an important step in the process. But how do you translate the outcome of that process into actionable Drupal tasks? That's what the Build Spec is for.

The Build Spec is a Drupal-specific bird's-eye view of how the site will be built. It contains every content type that will be created, all fields, and all field, formatter, and widget settings (that is, the data model). It contains all Views that are going to be needed, and all of their displays. It contains every image style, every flag or nodequeue that is expected, every taxonomy vocabulary ... anything that you'd get from just pushing buttons.

Why go to all that work? How many times have you gotten most of the way through a Drupal site and realized that your image style names don't make any sense; or that you have four fields that are identical but duplicated, and all named with different standards; or that one third of your fields have help text, but the grammar is inconsistent; or that you want to build a View but the fields you need to filter on don't exist; or that you're using a custom "weight" field in one place but nodequeue in another, not for any particular reason other than that the site evolved that way as different people built them?

We've seen all of that, which is exactly where having that big-picture plan before you start button-pushing is helpful. It's much easier to rename fields in a spreadsheet than after you've built three views based on them, and much easier to spot potential places to merge fields when they're all lined up in a spreadsheet. And, especially, it's easier to remember to write, in a consistent voice, all of the necessary help text for fields, views, and so on when there's a blank cell staring you right in the face. It's far easier to tweak, adjust, and evolve a content model via spreadsheet than it is in Drupal itself with all of its forms, and everything is still changeable (unlike in Drupal where many values can't be changed after they're set without deleting something entirely).
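Because the plan lives in a spreadsheet, consistency checks like the ones described above can also be scripted. What follows is only a minimal sketch, not part of the Palantir template: it assumes the Fields tab has been exported to CSV with hypothetical columns "Name", "Machine name", and "Help text", and it assumes a house rule that machine names look like `field_lower_snake_case`. The sample rows are invented.

```python
import csv
import io
import re

# Hypothetical CSV export of a "Fields" tab; the column names are assumptions.
FIELDS_CSV = """\
Name,Machine name,Help text
Subtitle,field_subtitle,A short line shown under the title.
Related images,field_relatedImages,
Byline,byline,Who wrote this piece.
"""

# Assumed naming rule: "field_" followed by lowercase letters, digits, underscores.
MACHINE_NAME = re.compile(r"^field_[a-z0-9_]+$")


def lint_fields(csv_text):
    """Return a list of human-readable problems found in the Fields tab."""
    problems = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["Name"].strip()
        machine = row["Machine name"].strip()
        if not MACHINE_NAME.match(machine):
            problems.append(
                f"{name}: machine name '{machine}' does not match field_lower_snake_case"
            )
        if not row["Help text"].strip():
            problems.append(f"{name}: help text is empty")
    return problems


problems = lint_fields(FIELDS_CSV)
for problem in problems:
    print(problem)
```

The blank help-text cell and the two inconsistent machine names are flagged here; swap in whatever naming rule your own team has agreed on.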

OK, you sold me. How does it work?

Glad you asked! There's no One True Way(tm) to capture this sort of information, but at Palantir we've developed a fairly robust template for our use as a Google Spreadsheet. (Follow along at home for the next part.)

During the discovery and definition phases of a project, the technical lead, product owner, and usually another engineer take the information unearthed during discovery and translate it into a content model. That's largely the first two tabs, which contain all content types with fields used and all fields on the site, respectively. We split it across two tabs for a number of reasons, but most notably because fields exist independently of the content type that uses them. There's a lot of configuration that goes into a field, and fields can be shared across content types. Of note, though, each column corresponds almost 1:1 with a field in the UI when actually configuring Drupal.
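The two-tab split can be pictured as two small lookup tables: one keyed by field, one listing which fields each content type uses. The sketch below is illustrative only; the field names, content types, and settings are all made up, and the real template has many more columns. It shows how the split makes it easy to spot a field that a content type uses but nobody has defined, and which fields are shared.

```python
from collections import defaultdict

# Hypothetical rows standing in for the two tabs; all names are made up.
FIELDS_TAB = {
    "field_subtitle": {"type": "Text", "cardinality": 1},
    "field_images": {"type": "Image", "cardinality": -1},  # -1 used here to mean "unlimited"
}

CONTENT_TYPES_TAB = {
    "article": ["field_subtitle", "field_images"],
    "event": ["field_images", "field_location"],
}


def cross_check(fields, content_types):
    """Return (undefined, shared).

    undefined: fields used by a content type but missing from the Fields tab.
    shared: fields used by two or more content types, mapped to those types.
    """
    used_by = defaultdict(list)
    for type_name, field_names in content_types.items():
        for field_name in field_names:
            used_by[field_name].append(type_name)
    undefined = sorted(name for name in used_by if name not in fields)
    shared = {name: types for name, types in used_by.items() if len(types) > 1}
    return undefined, shared


undefined, shared = cross_check(FIELDS_TAB, CONTENT_TYPES_TAB)
print("Used but not defined:", undefined)
print("Shared across types:", shared)
```

In this invented data, `field_location` is used by the event type without being defined, and `field_images` is shared by both types.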

Any other fieldable entities (taxonomy terms, fieldable panels panes, etc.) have their own tabs and work essentially the same as content types. There are also pages for Flag, Nodequeue, Menus, and User roles that are all fairly self-explanatory and as above map very closely to the actual fields to fill out in Drupal's UI forms.

Of particular note, though, are View Modes, Image Styles, and Views. The first two are really under-utilized features of Drupal's site building arsenal. Both View Modes and Image Styles are ways to take an underlying object (node, user, image, etc.) and define a special representation of it. A single object can be presented in a variety of different ways, not just in the default "full" and "teaser". That's why Entity View Mode is a standard part of Palantir's Drupal toolkit. Between that and core's image styles, we're able to single-source content and display far better than with just Views alone. That's because we can, in most cases, use the view mode/image style for theming and display purposes and let Views handle just the querying logic, then display nodes (or any entity) in some well-defined view mode.

Depending on the site, creating a Build Spec could take anywhere from 2-3 hours to 2-3 days. Often, and by design, filling out the Build Spec will surface architectural questions, and therefore business logic questions, that are best answered early when they're cheap to change. (It's just a spreadsheet, after all.) That is actually the biggest benefit of this approach; internal consistency and "crossing all the Ts" is a secondary, but still very useful, benefit.

Once the Build Spec has been completed, it should take a competent Drupal site builder only one to three days (again, depending on the site size) of clicking, copying, and pasting to build out the "80% solution" that Drupal is reputedly so good at. Then the rest of the project can be spent building necessary custom code, theming, testing, and so on.

We do generally try to keep the Build Spec up to date if the site changes, but we've found that if our discovery process was sufficiently complete, changes tend to be rare and minimal. (That makes it more likely that we will actually keep it up to date.)

Since it's "just" a spreadsheet, the Build Spec is also flexible. On sites that won't use Nodequeue, or Fieldable Panels Panes, just delete those tabs. Using Organic Groups? Add a column to the Node types tab for OG settings.

How else does this approach help?

Aside from the more robust discovery and more thought-through architecture, we've found a number of other ancillary benefits to using this model:

  • It allows for easier identification of possible custom modules that may be needed. (Field types, formatters, Views plugins, etc.)
  • Hidden custom code needs can be identified early and factored into the budget or into design changes, as appropriate.
  • It allows the entire team to review the content model at-a-glance to see how all of the moving parts fit together, as well as identify potential pitfalls before they're reached.
  • Developers joining a project late can get a much clearer picture of the site they're inheriting, how it's built, and more importantly why.
  • Because it includes the help text, field settings, and other little details in-your-face, it's harder to forget about those important UX details.
  • The entire build can be completed at once, greatly reducing "oh, we haven't built that part yet" problems during development and avoiding too many context switches from "build headspace" to "code headspace" by the development team.
  • The entire site build can be completed at once, showing the client early progress.
  • By having specific call-outs for information like image style settings, it helps to ensure that such information is figured out early and in a consistent manner.

Nifty. Can I use it?

But of course! Give it a whirl on your next Drupal site and see if having a big-picture view of a site makes your life easier. It's certainly made life easier for us. And if you have suggestions for improvements, let us know!

Wile E. Coyote blueprint poster by Dave Delisle.

PS: A number of people have requested sharing-access to the Google Document. You don't need it! Just follow the link above and then click File > Make a Copy and now you've got your very own editable copy of the document, notes and all.

Comments

How does this spreadsheet work when your product owner has no "technical" know-how? What if they can't use Excel or lack the imagination to visualize the final result from your Excel sheet?

I've done similar stuff using a Word doc, site hierarchy, and Word tables to denote what's on a page in terms of Drupal content/fields/module-usage/etc.

This works for a project and product owner who is very detailed and tech savvy -- this sucks for clients who say "I want a website -- that's pretty ...". It's a shame we can't pick and choose our clients.

The product owner doesn't necessarily need to get into the weeds of field labels and views relationships. It is critical that everyone on the implementation team is familiar with the build spec and how to read/use it, but the non-technical members of the team don't need to get into that level of detail. The project manager, for instance, probably doesn't need to be working in the build spec. The product owner should be influencing the build spec, as decisions they make impact what goes into it, but they don't necessarily need to be editing it directly themselves. The build spec is not always a client-facing document, depending on the client.

That said, if a client cannot provide a product owner who is technical enough to understand basic spreadsheets you probably need to consider providing your own surrogate product owner. That's a much larger discussion than just the build spec, but in some cases is an important one to have.

Thanks for sharing this great resource for planning out a Drupal build.

However, on the "fields"-tab, the name column suggests "The human-readable name of a Field. This should be a singular (for single-value) or plural (for mult-value) noun in "Sentence case"." And then there is a field "Images" (plural), for a single-value field.....

Congratulations, you're our first bug reporter. :-) I fixed the label. Thanks.

Great Spreadsheet. Thanks for sharing. We use a very similar one for collecting client requirements.

Wondering a bit more about your process - at what stage do you involve a designer? We have increasingly been building the spec sheet, building data wireframes that show content relationships, building the site skeleton, content types, etc. and THEN go through a design process. It sometimes seems a little backwards but it has been working well as of late. Curious how others approach this?

In practice for us, more often than not the design and wire-framing happens first, or at least starts first. However, wireframes and the build spec will ideally build on each other. They should be developed in tandem, iteratively, with one informing the other. That way the designer can say "so we have this display over here" and the implementation team can model that out in the build spec, then go back to the designer and say "you know, if you change this thing here we can build this View without any custom code", and the designer can adjust the wireframes quickly and cheaply.

It doesn't always work out that way, but IMO that's the ideal process.

Very Cool. I've never had a spreadsheet this detailed before. Will come in handy! I would say however it wouldn't be a far stretch for a developer to take a well structured spreadsheet and an import utility to batch create the different components. In D8 I suppose we would just need a spreadsheet to yaml converter :-)

Good post!

For many of the columns, particularly field and widget configuration, the format is more loosey-goosey. Capturing all widget settings into one column is quite a challenge when they can vary so widely between field and widget types. That would be very hard to script over.

That said, things like vocabularies or image styles or view modes are strict enough that automation is probably possible. A script that does maybe half of the build out automatically (in D8, yes, by just writing a bunch of YAML files) is probably feasible, and would open up a lot of interesting potential.
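As a rough illustration of what the "strict enough" half might look like, here is a minimal sketch that turns rows of a hypothetical Vocabularies tab into YAML text. The row keys are assumptions, and the YAML keys are only modeled on Drupal 8's `taxonomy.vocabulary.*.yml` config files; the set is simplified and has not been checked against a real export, so compare against a file exported from an actual site before relying on it.

```python
import json

# Hypothetical rows from a "Vocabularies" tab; the keys are assumptions.
VOCABULARIES = [
    {"name": "Tags", "machine_name": "tags", "description": "Free-form keywords."},
    {"name": "Event types", "machine_name": "event_types", "description": ""},
]


def vocabulary_yaml(row):
    """Render one row as a small YAML document.

    The keys are modeled on Drupal 8 vocabulary config files, but this set is
    a simplified assumption, not a verified schema.
    """
    lines = [
        "langcode: en",
        "status: true",
        # json.dumps yields a double-quoted string, which is also valid YAML.
        f"name: {json.dumps(row['name'])}",
        f"vid: {row['machine_name']}",
        f"description: {json.dumps(row['description'])}",
        "weight: 0",
    ]
    return "\n".join(lines) + "\n"


# Map each assumed config file name to its generated body.
files = {
    f"taxonomy.vocabulary.{row['machine_name']}.yml": vocabulary_yaml(row)
    for row in VOCABULARIES
}
for filename, body in files.items():
    print(f"# {filename}")
    print(body)
```

The free-form columns (widget settings and the like) are deliberately left out, since those are the part that would be hard to script over.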

Incidentally, one advantage of this spreadsheet is that the parts of Drupal it touches haven't really changed dramatically in Drupal 8. The underlying code is very different, but the concepts of nodes and fields, of views and view modes, haven't really changed. That means this template should work fine for Drupal 8 sites with only very minimal updates.

My team employ a similar approach on large projects. We have even exposed this kind of documentation to some clients which has helped with change control and auditing.

Having such a document has led to more collaborative and better architecture decisions since the structure is visible. During the longer life-cycles of large projects it is inevitable that sickness and annual leave will require staff rotation. Having this complexity clearly documented in 'Build Specs' pays dividends way over the time they take to create.

Using Google docs and exposing them to our clients (who are commonly remote) has reduced the number of (costly) in-person meetings we conduct and meant those we do have are more valuable, as the client has a better understanding of their platform.

Thanks for sharing this invaluable knowledge.

Thanks for a great tool!

We can add a column to each tab where we link the defined entity to the actual definition on the site. This will make the tool much more usable during the later project steps and as a result will increase its chance of being kept up to date.

Once we do that we can add an overview tab to show a bird's-eye status of the spec. This tab can count the number of views, content types, flags, etc... with links to the respective tabs. If we added the link column as suggested above, we can show progress (10/84 views, 5/12 rules, 2/5 flags have been defined).

Also I miss context, rules and panels tabs.

Amnon

Yeah, adding links to the actual build could be useful if you have a single canonical instance. (We generally don't, but some workflows do.)

Palantir rarely uses Context. Where appropriate we use Panels, or just vanilla blocks. That varies with the site.

Rules and Panels we don't have tabs for yet because we're not entirely sure how to capture them. Panels in particular is really hard to capture this way, since so much of it is visual, so we likely won't. Rules I could see fitting into a spreadsheet structure, but we haven't mapped it out yet. We also don't have Workbench tabs yet for the same reason.

Rules and Workbench sound like good additions to the spreadsheet down the line as we work out the best way to capture and represent that data. If you have suggestions for how to do so, please share. (I don't think Google spreadsheets take patches, but "patches welcome!".)

Larry,

First, thanks for the great Build Spec. We have started to use it on some client projects and it has been really helpful.

Second, I submit to you my "patch" for handling Contexts. I know Palantir does not use them much, but others may find it helpful.

I figured the best approach for submitting a Google spreadsheet "patch" would be to just share a spreadsheet with the example structure.

https://docs.google.com/spreadsheet/ccc?key=0AhGpmpvWFYepdDRqM0s1eFdzaHV...

I've been doing something similar with a pen and paper, with a different data structure. I would group everything under one content type (with features in mind). At least from a Drupal point of view this spreadsheet is better for keeping things in check, and having a bird's-eye view of the project.

Also, I couldn't agree more about Entity View Mode. Outstanding little module when used properly. It will help tremendously not only building the site, but later on extending it. It makes creating new views for sections a breeze, if the design is up for it of course ;)

Thanks for sharing your thoughts on the planning process, it's that or create a frankenstein site...

I am finding more and more need for Entity References (and Corresponding Entity Reference). While they are fields, they have far more importance as a way to navigate between related content.

Yes, Entity Reference is a critical module on many sites we build. While the sample data in the spreadsheet doesn't include it, we generally file it like any other field on the Fields tab. Its field type is just "Entity Reference", and its field settings include the node type or view that it's restricted to. (Restricting ER to a View is an extremely powerful technique that not enough people know about.)

I've always done something somewhat similar, but just using regular documents and listing everything out. The spreadsheet looks like a huge time saver and I can't wait for the chance to try it out. Thanks for sharing it. This is what makes the Drupal community one of the best, by far!

Great stuff!

We've been doing this with content types and fields for a while, but not for all of the other categories. In line with the Agile methodologies, I can't help but wonder if it would be useful to have a module that provides this 'spreadsheet' (maybe using AngularJS) to present all of this information within a particular Drupal project itself. This would cover all of the use cases you outlined above and provide a pretty nice and re-usable method for others to just include and use this on their projects. build_spec.module anyone? :wink:

Thank you so much! This is fantastic! Any plans to turn this into a module? (just kidding... but it would be pretty sweet if we could fill out the spreadsheet and then click "build". Ha ha).

Thanks Larry, that is one awesome shortcut to launch you guys have developed. I'm going to try it out on a project I have and am pretty sure it is going to help me avoid common pitfalls and also build a more maintainable site.

After Aaron Couch forwarded me a link to your blog post, I was inspired to update my Architecture module for Drupal so that it now provides CSV file exports of much of the information needed to fill in your template. I realize that the purpose of the template is planning, not documentation, but documenting this information on existing websites can also be valuable and can help with planning. Currently, for example, I'm working on upgrading a couple of Drupal 6 websites to Drupal 7, and documenting the structure of the D6 sites is helpful both in identifying what needs to be migrated and in making some decisions about what should change.

I've been thinking of writing a similar article for a while. Many recent blog posts have been espousing a doctrine of "We're agile, we just start building new stuff every two/three weeks. We start with the most important stuff and the unimportant stuff may or may not get done." While it's possible to do that with an ongoing project, I can't see it working well on a new project. There needs to be a structured plan about how concepts A, B, C, and D interrelate before you start building any of the individual parts. Otherwise by the time you finish D you will be regretting many of your implementation choices.

Hi Larry,

My colleague Mark West is holding a BoF at Prague on this subject. Would be great to have you along with any other interested parties.

"In this BOF we will be discussing methods of documenting a Drupal project in a collaborative way using Google Drive. We also hope to share and demonstrate processes, tools and Drupal specific templates which have helped improve our production."

https://prague2013.drupal.org/bof/documenting-drupal-projects-google-drive

While properly planning for complex Drupal sites is extremely critical, proper hosting is, too. Are there any Drupal specific hosting platforms that you recommend? Or even better, personally use for yourself or with clients?

Hi Casey - We've worked with both Acquia and Pantheon, as well as other hosting providers. The question of which hosting provider is best for which client depends both on their technical and business requirements as well as their in-house staffing and expertise. While a managed, cloud-based solution like Acquia or Pantheon may be the right choice for some clients, others may be better off managing their own stack or running on their own in-house infrastructure, assuming they have the staff and resources to support it.

Regardless, this is definitely an important conversation to have during the discovery phase of any project.

I'm in the process of finding someone to finish my Drupal site for me. I've gotten pretty far but found I need help to get the last 20% of the way to deployment. This spreadsheet will help immensely in terms of talking with developers/designers about the final sprint.

If you've designed a relational database from the ground up, chances are you're familiar with the term 'normalizing', or the Nth normal form. Content types and fields can fall into this model if you look at types as database tables and fields as columns. It's a bit of a broad subject for a comment, but worth looking into when it comes time to model content types.

Thanks for the amazing article. And template.

I just have a question about the way you structure the content type.
In the template sheet, there was [menu-trail]/[title] Pathauto token in the page.
Is [menu-trail] a custom token defined from a custom hook_token_info? Or is there a contrib module that defines it?

Larry, short note to thank you for posting this doc. I agree that Drupal architecture documentation will decrease any re-work and allow you to demo and familiarize clients with the admin sooner. Love how this document helps you visualize the relationships, or "moving parts", within Drupal and can ultimately help build a cleaner editor experience.