20 Feb 2009

Files and multi-site and migration, oh my!

garfield
Senior Architect and Consultant
One of the oft-annoying issues with Drupal is file handling. There are many problems with Drupal's file handling at present (fortunately, there is now a considerable and very promising effort to improve matters for Drupal 7), but today I just want to talk about a seemingly simple problem: where to put the files directory. It's not as simple a question as you would expect.

Drupal itself does not, in fact, care where the files directory is. When dealing with public files, the files directory can be placed pretty much anywhere under the Drupal root as long as it is web-accessible. You then simply need to go to Admin > Site configuration > File system and enter the appropriate path, and any properly written module code will then pick up that value and put its files inside that directory as appropriate. No muss, no fuss. (And I can't recall the last time I ran into a module that got that wrong; if you do, delete it from your system immediately until the author figures out how to use Drupal. Or even better, file a patch.)

Before Drupal 6, the default files directory was simply "files", that is, a directory called "files" in the Drupal web root. That was all well and good and worked fine, with one caveat. It meant that when backing up or restoring or upgrading your site there were two directories that were not part of the core files: files and sites. Couldn't we get that down to one directory?

It also runs into a problem with multi-site installations. Drupal will actually function just fine if you have 10 multi-site installs all using the same files directory; new files will get silently renamed if the file name is already in use (true for individual sites as well) and everything keeps on working. However, that is organizationally a disaster, especially if you want to back up or restore just one site.
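The silent renaming can be pictured with a minimal Python sketch. This is not Drupal's code: the function name `unique_destination` and the `_0`, `_1` suffix pattern are assumptions chosen for illustration, and a set of known paths stands in for the file system.

```python
import posixpath


def unique_destination(directory, filename, existing):
    """Return a path in `directory` that does not collide with `existing`.

    `existing` is a set of paths already taken (a stand-in for checking
    the file system). The numeric suffix scheme is an assumption.
    """
    candidate = posixpath.join(directory, filename)
    if candidate not in existing:
        return candidate

    name, ext = posixpath.splitext(filename)
    counter = 0
    while True:
        # Try name_0.ext, name_1.ext, ... until a free name is found.
        candidate = posixpath.join(directory, f"{name}_{counter}{ext}")
        if candidate not in existing:
            return candidate
        counter += 1
```

With ten sites sharing one directory, every site's `logo.png` ends up under a different suffixed name, which is why the shared layout works but is hard to untangle later.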

For those and other reasons, Drupal 6 changed the default location to sites/default/files. That has two advantages. One, everything non-core is in the sites directory for easier backup. Two, it provides a natural way for multi-site installs to split up their files directories: sites/siteone.com/files, sites/sitetwo.com/files, etc. No muss, no fuss.

Well, actually, there is fuss. Drupal keeps a record of all files it knows about in the files table in the database. Among other things, it tracks the path of the file... relative to Drupal root, not to the files directory. That means you can't move the files directory once you start populating the site with content, because the database will still point to the old location for any existing files.

Why would you want to do that? Well, you don't, but you have to with multi-site. Multi-site support in Drupal 6, as with all previous versions, has a direct, literal one-to-one mapping from the requested domain to the sites directory for that site. That is, sites/siteone.com gets used if and only if the request comes in to siteone.com (or a subdomain of it). That makes migrating the site from one domain to another, say from an in-house test server to a live server, quite difficult. The site directory changes, which means the paths to modules and themes stored in the system table change. And, more importantly, the paths in the files table break, too.
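As a rough illustration of that domain-to-directory mapping, here is a simplified Python sketch. It is an assumption-laden model rather than Drupal's actual lookup: it ignores ports beyond stripping them, ignores URL path components, and the function names are hypothetical.

```python
def candidate_site_dirs(host):
    """List directory names to try for a host, most specific first."""
    hostname = host.split(":")[0].rstrip(".").lower()
    labels = hostname.split(".")
    # www.siteone.com -> ["www.siteone.com", "siteone.com", "com"]
    return [".".join(labels[i:]) for i in range(len(labels))]


def resolve_site_dir(host, available):
    """Pick the sites/ subdirectory for a request, falling back to default.

    `available` is the set of directory names that exist under sites/.
    """
    for name in candidate_site_dirs(host):
        if name in available:
            return "sites/" + name
    return "sites/default"
```

Under this model, a request for www.siteone.com finds sites/siteone.com, but the same site served from siteone.testserver.palantir.net does not, so the site directory (and every path stored relative to it) changes with the domain.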

Now, that is not an unsolvable problem. The system table can be updated if you can just get to the Modules and Themes admin pages, and a simple PHP script to change all of the records in the files table is not exactly hard to write. We've done it before and the resulting sites worked fine. But really, who wants to have to muck with the files table directly just to migrate a site? You have a good chance of breaking something along the way if you're not careful. Really.
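For a sense of what such a script involves, here is a minimal sketch. It is written in Python against an in-memory SQLite database rather than in PHP against a live Drupal database, and the reduced `files` table (just `fid` and `filepath`), the function name, and the example domains are all assumptions for illustration. Anything like this should only be run against a backed-up database.

```python
import sqlite3


def move_site_files(conn, old_prefix, new_prefix):
    """Rewrite filepath values that start with old_prefix; return rows changed."""
    old = old_prefix.rstrip("/") + "/"
    new = new_prefix.rstrip("/") + "/"
    # Compare with substr() instead of LIKE so that "_" and "%" in a
    # path are treated literally rather than as wildcards.
    cur = conn.execute(
        "UPDATE files SET filepath = ? || substr(filepath, ?) "
        "WHERE substr(filepath, 1, ?) = ?",
        (new, len(old) + 1, len(old), old),
    )
    conn.commit()
    return cur.rowcount


# Demo on a throwaway database with a hypothetical, reduced schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (fid INTEGER PRIMARY KEY, filepath TEXT)")
conn.executemany(
    "INSERT INTO files (filepath) VALUES (?)",
    [
        ("sites/test.example.com/files/a.png",),
        ("sites/other.example.com/files/b.png",),
    ],
)
changed = move_site_files(
    conn, "sites/test.example.com/files", "sites/example.com/files"
)
```

Only rows under the old prefix are touched; files belonging to other sites in the same table are left alone.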

Instead, there are two things we've done to deal with multi-site weirdness. The first is a patch that we wrote for Drupal 7 to allow aliasing of multi-site directories. The second is to go back to a top-level files directory.

But wait, doesn't that mix all the files together? Not at all! Remember, Drupal doesn't care where the files directory is, as long as it doesn't change out from under it. Instead of putting all uploaded files directly into the files directory, we put them into a per-site subdirectory: files/siteone, files/sitetwo, files/sitethree, etc. Then it doesn't matter if the site lives at siteone.com or at siteone.testserver.palantir.net. The files directory doesn't move, it's still easy enough to back up (just like it was in Drupal 5), it can migrate from server to server without any database trickery, and all sites on the single install have their own easy-to-manage files directory. And all is right with the world.
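The difference between the two layouts comes down to whether the domain appears in the stored path. A small Python sketch, with hypothetical helper names and a made-up file name, makes the comparison concrete:

```python
def per_domain_files_dir(domain):
    """Drupal 6 default style for multi-site: the domain is in the path."""
    return f"sites/{domain}/files"


def per_site_files_dir(site_key):
    """Top-level files directory with one subdirectory per site."""
    return f"files/{site_key}"


def stored_path(files_dir, filename):
    """Path as it would be recorded relative to the Drupal root."""
    return files_dir.rstrip("/") + "/" + filename


# The per-domain path differs between test and live servers...
test_path = stored_path(
    per_domain_files_dir("siteone.testserver.palantir.net"), "report.pdf"
)
live_path = stored_path(per_domain_files_dir("siteone.com"), "report.pdf")

# ...while the per-site path is the same wherever the site is served from.
stable_path = stored_path(per_site_files_dir("siteone"), "report.pdf")
```

Because `stable_path` never mentions the domain, the records in the files table stay valid when the site moves.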

Comments

Relative file path?

Why not store the path of the file from the system files directory downwards? Then, to get a full path, you would just prepend the system files directory, and the URL would always be correct.

Doing a migration

Damien,

In my experience, one place where that would be less than ideal is during a migration - if you've got an existing directory/directories with lots of files in it, there would be more decision-making about where to put the files and which directory to store them in - more steps.

Then when we're pulling the URLs out (especially with a non-Drupal query) we've got to combine the two - one of which would probably be stored in the variables table (as serialized PHP), the other in the files table - that's not entirely obvious.

This is being worked on in

This is being worked on in http://drupal.org/node/366464

This is still one of my

This is still one of my favorite patches. I have used the backport to 6.x on my sites and it has saved tons of work, as many of my users are migrating to Drupal and need a test site for the time being.

Also, since the sites.php file is an associative array, I created a sites.ini file, and sites.php just reads the sites.ini file as an associative array and assigns it to the $sites variable. In another script that automatically deploys sites for me, all I have to do is write to the .ini file for the new site to be included. I'm down to about 10 seconds to deploy a new site or test site. This took me at least 10 minutes before. Plus I don't have to go back and update the files table when they want to go live.

Really I can't thank you enough for this patch.

Adam

nice to read about..

It's nice to read that others are facing the same challenges with files that I am. The problem gets worse with load balancing without shared storage. Right now, I use NFS to share out a common files directory and symlink to it from sites/example.com/files. It works fine, but I wish there were a more elegant way.

Upload path

I came to this post while researching the UploadPath module project. Reading through your blog post and through Drupal nodes, I'm wondering whether Drupal 7.x core will make it possible to organize files the way the UploadPath module does. I'm deciding whether or not to reorganize all of my old uploaded files, in both the files table and the files folder, in the same way as they are organized by the UploadPath module [= files_path/content_type/year/month]. It will take me some time to move the files, rewrite the files table with an SQL/PHP script, and correct the URL paths in the body and teaser of every node that links to an uploaded file. I consider it a good move, as it will organize my files that are now in one folder (a huge list), but I'm wondering: "Is it worth doing? Will Drupal 7.x continue the logic of organizing uploads into multiple sub-folders?"

I don't know

We developed UploadPath for another client a while back, but handed it off to Dave Thomas quite a while ago.

The problem with UploadPath-in-core is that UploadPath requires token module. Token module is not going into D7 core. Therefore UploadPath's functionality cannot either.

Similar functionality in D7

Will there be any functionality in the D7 core upload system similar to what Upload Path provides (setting the path for different uploads)? Having all files in one folder is really too much if you deal with a lot of files.
Or should I (in your opinion) rely on Dave T. to upgrade Upload Path to a D7 version?
Thanks for the advice.

Not so far

As of right now there's no UploadPath-esque functionality for the core upload module. Filefield in Drupal 6 already has integrated token support on its own. What will happen between now and code freeze depends on what people work on. We can't predict that. :-)

You'll have to ask Dave T about his plans for UploadPath at this point. We have no involvement with it anymore.