Let’s all talk about HTTP error code 410.

As far as I can tell, it’s the forgotten stepchild of error 404 (Not Found). Error 410 means Gone, as in, a resource used to exist at this location, but now it’s gone. Not only is it gone, but I don’t know (or I don’t want to tell you) where it went. If I knew where it went, and I wanted to tell you, I would use status code 301 (Moved Permanently) and any smart client would simply follow the redirect to the new address. But 410 means Gone, no forwarding address. Train gone sorry.
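To make the distinction concrete, here is a minimal sketch of the decision described above, written in Python. The function name `status_for` and its three parameters are hypothetical, made up for illustration; a real server has to get this knowledge from somewhere, which is what the rest of this post is about.

```python
from http import HTTPStatus
from typing import Optional


def status_for(exists: bool, once_existed: bool,
               new_location: Optional[str]) -> HTTPStatus:
    """Pick a status code for a request (hypothetical helper, not a server API)."""
    if exists:
        return HTTPStatus.OK
    if once_existed and new_location is not None:
        # The resource moved and we are willing to say where it went.
        return HTTPStatus.MOVED_PERMANENTLY  # 301
    if once_existed:
        # It was here, it is gone on purpose, no forwarding address.
        return HTTPStatus.GONE  # 410
    # We have no record of anything ever living at this address.
    return HTTPStatus.NOT_FOUND  # 404
```

For example, `status_for(False, True, None)` picks 410, while `status_for(False, False, None)` picks 404.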

Somewhere in my audience is an HTTP guru who can tell me if I’m getting this right.

Now, there is not a lot of information about error 410. Oh sure, you can search for http error 410 on Google and come up with lots of hits, but they’re all just pages that list all the error codes and give a brief description of each. No docs, no further explanation. I suppose because it addresses a condition that doesn’t come up very often. Also, we’ve all been brainwashed into believing that all resources should be permanent, which simply isn’t true.

Embracing HTTP error code 410 means embracing the impermanence of all things.

Now then, on to implementation. Scouring the Apache documentation, I’ve found several ways to specify that a resource is Gone. The first uses Redirect:

Redirect gone /path/to/resource

For example, if I put up a temporary page:

http://diveintomark.org/tmp/some-screenshot.png

…and I later wanted to delete it to save space (or whatever), I should put this in my .htaccess file:

Redirect gone /tmp/some-screenshot.png

The path is the virtual path of the resource on my server, not the full filename on disk, and not the full URL.

You can also use RedirectMatch to match multiple files, using regular expressions. For instance, this would match all files in my tmp/ directory named something-screenshot.png:

RedirectMatch gone /tmp/.*-screenshot\.png
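To see which paths a pattern like this covers, it can be tried out in Python before it goes live. This is only an approximation: it assumes Apache’s regular expressions behave like Python’s `re` module for a pattern this simple, and that RedirectMatch does not anchor the pattern for you (check the mod_alias documentation for your version). The helper name `would_be_gone` is made up for this sketch.

```python
import re

# The same pattern as in the RedirectMatch line above.
PATTERN = re.compile(r"/tmp/.*-screenshot\.png")


def would_be_gone(url_path: str) -> bool:
    """Return True if the pattern matches anywhere in the URL path."""
    # re.search() is unanchored: the pattern may match in the middle
    # of the path, and ".*" happily matches slashes as well.
    return PATTERN.search(url_path) is not None
```

Under those assumptions, `/tmp/some-screenshot.png` matches, but so do `/tmp/old/some-screenshot.png` and `/tmp/some-screenshot.png.bak`. If that is broader than intended, a stricter pattern such as `^/tmp/[^/]*-screenshot\.png$` narrows it to files directly inside tmp/.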

The third option is to use mod_rewrite, which allows you to use complex conditionals to decide when to serve up the 410 Gone error. For example, I have a mobile edition that contains an index page and stripped-down pages of the most recent 5 articles. These article pages are not meant to be permanent; the whole thing acts like an RSS feed, except that it’s split across several pages because that’s how mobile devices expect it. Each page has its own separate address, but it only lives for a short time. I also can’t reuse the same URLs over and over, like always putting the articles in /mobile/1 through /mobile/5, because that would confuse AvantGo’s caching proxies.

So after articles fall off the mobile index page, I delete them from my server, and I want to serve up the appropriate error code to proxies and robots that come looking for them later. From what I can tell, 410 is the perfect error code for this. I don’t want to manually maintain a list of Redirect rules, though, so I use mod_rewrite:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule mobile/[0-9]{6}\.html$ - [G,L]

In English, this says:

  1. if there is a request for a file that doesn’t exist (that’s the !-f flag),
  2. and the file is in the mobile/ directory and named any-six-digits + .html (that’s the regular expression),
  3. then use HTTP error code 410 Gone (that’s the G flag)
  4. and return immediately without further ado (that’s the L flag).

This is not a perfect solution; if you randomly type arbitrary URLs of pages that never existed but that match the pattern, you’ll get error 410 instead of the more proper 404. But it does cover the more likely case of a proxy or search engine coming back to a resource it previously spidered, and finding that it no longer exists. (Update: read the comments for some possible solutions to this problem.)
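The logic of the rule, including its blind spot, can be sketched in Python. This is an illustration of the decision only, not of how mod_rewrite works internally; the function name `status_for_request` and the `docroot` parameter are assumptions made for the example.

```python
import re
from pathlib import Path

# Same regular expression as in the RewriteRule above.
MOBILE_PAGE = re.compile(r"mobile/[0-9]{6}\.html$")


def status_for_request(docroot: Path, url_path: str) -> int:
    """Mimic the RewriteCond/RewriteRule pair for a single request."""
    target = docroot / url_path.lstrip("/")
    if target.is_file():
        return 200  # the file exists, so the !-f condition fails
    if MOBILE_PAGE.search(url_path):
        return 410  # missing file that looks like a mobile article
    return 404      # missing file, no rule applies


# Note the blind spot: the function cannot tell a deleted article from
# one that never existed, so both come back as 410.
```

With a document root containing only `mobile/000900.html`, a request for `/mobile/000900.html` yields 200, while `/mobile/000123.html` yields 410 whether or not that article was ever published.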

When a client requests a page that you have marked as 410 Gone, Apache generates a default error page that looks like this:

Gone

The requested resource
/path/to/requested/resource
is no longer available on this server and there is no forwarding address. Please remove all references to this resource.

…which is fine as far as it goes, but it’s about as aesthetically pleasing as the default 404 Not Found error. However, you can create a custom 410 Gone page, in much the same way you can create a custom 404 Not Found page, by using the ErrorDocument directive in your .htaccess file:

ErrorDocument 410 /path/to/custom/page

Again, this is the virtual path on your server, which is probably just the web address without your domain name. (It could also be a fully-qualified URL to a remote machine, but in that case, the client would not receive HTTP error code 410; they would receive a redirect status code instead. So it’s probably best to keep it local, so clients get both the custom page and the intended 410 error code.)
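The point about keeping the page local is that the status code and the custom page travel together in one response. Here is a rough sketch of that idea as a tiny Python WSGI application. It is not how Apache implements ErrorDocument; the names `GONE_PATHS`, `GONE_PAGE`, and `app` are invented for the example.

```python
# Hypothetical list of resources that were removed on purpose.
GONE_PATHS = {"/tmp/some-screenshot.png"}

# Hypothetical custom error page, served from the same machine.
GONE_PAGE = (b"<html><body><h1>Gone</h1>"
             b"<p>This page was removed on purpose.</p></body></html>")


def app(environ, start_response):
    """WSGI app: custom body and 410 status in the same response."""
    path = environ.get("PATH_INFO", "/")
    if path in GONE_PATHS:
        start_response("410 Gone", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(GONE_PAGE))),
        ])
        return [GONE_PAGE]
    body = b"Not Found"
    start_response("404 Not Found", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, the app can be handed to `wsgiref.simple_server.make_server` from the standard library. A client asking for `/tmp/some-screenshot.png` then sees the friendly page and the 410 status, with no redirect in between.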

Other possible uses for 410:

I don’t know. It’s not a very common situation, and it’s not a very common error code. Which is probably why no one has written a tutorial about it before. I’m not even totally convinced that I’m using it correctly, although I am convinced that there are people reading this who know more about it than I do, and who have an opinion about whether I’m using it correctly.

Discuss.

§

Thirty-eight comments

  1. Yup, you seem to be right. Check this page out, if you haven’t already: http://www.plinko.net/404/history.asp

    — Leonya #

  2. You are using it correctly, Mark – I can’t understand why fewer people do.

    It’s just that most people don’t bother to look through a list of HTTP error codes and find gems like this.

    The W3C protocol reference (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html) describes 410 exactly as you have, only with slightly more words and the note that this is expected to be permanent and cachable – if it’s temporarily gone, use 404.

    I also like the sound of 402 (Payment Required), although it says it’s reserved for future use.

    — Thomas Scott #

  3. By using a 410, aren’t you effectively saying “nothing exists at this url, and I am sure that nothing will at any point in the future”?

    This doesn’t seem to be the case for a non-existent file in your mobile directory, where you aren’t sure whether the file has been there and been deleted forever, or if you’ve just not created that particular file yet.

    I know I’m being picky, but isn’t 404 more appropriate for the situation where you aren’t sure?

    — Paul Hammond #

  4. The way to go on this would be to use some sort of database to mark what documents have ever existed on a webserver, and on a 404 page check whether the file has existed or not, and serve up a 410 or 404 depending on the answer.

    You’re probably gonna do this in Python one way or the other. ;)

    — Jesper #

  5. But given that, with the exception of people randomly typing in URLs, who deserve everything they get, the only time you’d get a 4XX error in that directory is when a mobile device is spidering a now-dead link.

    …unless Mark mistypes. Hmm.

    — Thomas Scott #

  6. No need for a database – instead of deleting old files, you could zero-length them then do something like:

    RewriteCond %{REQUEST_FILENAME} -f
    RewriteCond %{REQUEST_FILENAME} !-s
    RewriteRule mobile/[0-9]{6}\.html$ - [G,L]

    (which means if the file exists but has zero length, send a 410)

    This seems like a lot of effort for very little gain though…

    — paul hammond #

  7. IMHO a 410 error shouldn’t occur for a file that never has existed, so I’d probably prefer your first method explicitly using “Redirect gone” for every file that’s been removed.

    Hypothetically, a User-Agent encountering a 410 could try to locate the resource on its own, at alternative sources like Google’s cache, or even http://web.archive.org/. If you send a 410 for a file that has never existed, such a client could hypothetically make several pointless requests to other sites.

    Also, look at it the other way around: It could even be seen as an explicit ‘order’ for caches and search engines to remove the URL in question from their indexes, since it’s been deliberately removed.

    — Arve #

  8. Oooh, I like Paul’s zero-length solution. Database interaction, imo, would be overkill.

    But besides the fact that using the -s flag will require keeping tons of zero-length files hanging around, it sounds like it would result in correct usage of the 410, which Mark’s current solution isn’t quite yet.

    — Gina #

  9. Hey Mark… Where is my money? http://linuxintegrators.com/hl30/blog/technology/?permalink=Open+Letter+to+Mark+Pilgram+regarding+possible+patent+infringement.html

    — Andy #

  10. I know I’m missing something, but how does the webserver know what files existed and what files didn’t?

    More importantly, how long does it keep this cache of data?

    — pro heat #

  11. > how does the webserver know what files existed and what files didn’t?

    It doesn’t. A webserver simply produces a ‘404 Not found’ when you request a non-existing resource. If that resource has been there but is removed, a ‘410 Gone’ would be appropriate. But you have to explicitly tell the webserver to produce a 410 on a particular request.

    How to do that is what Mark explains in this post. Or am I missing something in your question?

    — Martijn #

  12. I was gonna suggest RewriteMap, but it looks like you have to have access to your httpd.conf to do that. If you have that kind of access, though, you could create a text map listing the files that you’ve erased and send 410 for those files.

    — Mark A. Hershberger #

  13. Oooh I got it.

    Apparently logical thought has escaped me this morning.

    Thanks Martijn.

    — pro heat #

  14. MAH, if you could write up an example of that (2-3 files in the map would be fine), that’d be great. I’ve never used RewriteMap, since I’ve never had httpd.conf access on a server where I cared about such things. But it would be good to have the example, for completeness.

    — Mark #

  15. Seems to me that you could add the path to the file that’s gone to a .htaccess file. Those are read on each request, right? This would, of course, result in a large .htaccess file, but I like that idea better than chewing up an inode per gone file (a la the zero-length file sol’n).

    Another idea might be to ensure that you’re using unique increasing values for your pages (/mobile/1, /mobile/2, etc.) and use a rewrite rule to mark all resources that have a value less than the current minimum value as gone. That seems pretty clean to me…

    — B #

  16. Ah, make that

    RewriteCond %{REQUEST_URI} >/mobile/000574.html

    Damn off-by-1 bugs.

    — Mark #

  17. Mark: Call it PyGones. :D

    — Jesper #

  18. Heya Mark

    In answer to your question the other day, yes, it’s the front row MJ. =-)

    You got your last word in before I could respond. Too busy getting my words in on other sites. *GRIN*

    — emjay #

  19. MJ, how the hell are you? I must say, it’s very odd to find you here. I prefer to labor under the illusion that the separate, unconnected parts of my life are actually separate and unconnected, despite overwhelming evidence to the contrary. But you’re most welcome here anyway. Whatdya think about HTTP error 410?

    — Mark #

  20. Heh, Dive into American Deaf Culture. ;)

    — Adam #

  21. Mark,
    Most people don’t serve 410 because 404 is the default. That is, most URLs map to files, and the web server mostly stupidly looks at the file system, sees no file, and doesn’t know it was ever there. 404 is appropriate, in this poor implementation.

    For that matter, for practical purposes, the bang isn’t worth the buck, as far as web authors can tell. It may save some proxy some fetching, but what does it do for the author? Nada. It should be cheap, or free, to properly do 410s.

    410s, like 404s, shouldn’t take special effort or hacks to have happen. The web server should just know.

    Raising the bar on worthwhile functionality means driving the usual and commonly useful stuff into the infrastructure.

    Of course, as long as there are clever people hacking workarounds rather than fixes, the infrastructure won’t be formed.

    One of these days, someone’s gonna tell me to put up or shut up.

    :D

    — Jeremy Dunck #

  22. Mark, you’ve violated my patent again with your statement:

    MJ, how the hell are you? I must say, it’s very odd to find you here. I prefer to labor under the illusion that the separate, unconnected parts of my life are actually separate and unconnected, despite overwhelming evidence to the contrary. But you’re most welcome here anyway.

    I must insist you send me my money!

    -Andy

    — Andy #

  23. Andy, a key element of satire is knowing when to quit. Also, spelling. Spelling is key.

    — Mark #

  24. Bah. Speling is for winers.

    — Andy #

  25. Well, since you asked…

    I’ve always used the 404 error and never really thought about 410. That’s just the way it’s set up by default in my apache installations. Being more of a sysadmin than a web designer, I guess it’s my realm. It seems to me 410 is the pinky toe of the “Huh? What File Was That?” genre.

    When the average AOL user (or below average web surfer/Cerfer) tries to get a page which is either gone or was never there to begin with, I don’t think they’re going to care if it’s 404 or 410. The end result is the same…they’re scratching their heads wondering what happened and trying to find a link to fire off an email to webmaster@domain.tld

    I completely understand the usage, but as Yogi says, I’m smarter than your average bear. I don’t think it’ll matter for anyone other than the seriously anal webmaster/sysadmin looking over the log or running webalizer against it.

    I can see how I might like to know how many people are still trying to find a blog post or particularly offensive picture I happen to have on one of my sites. But when it comes down to it, right now, I don’t care. I do however reserve the right to change my opinion on this matter without notice. =-)

    — emjay #

  26. Andy: Oh, like Dave of scripting.com?

    — Jesper #

  27. Mark,

    In my opinion, you’re wrong. As one of those wacky RESTafarians who waste far too much time thinking about this sorta thing, I’ve got to agree with Paul Hammond. (See above). 410 is a stronger assertion than simply saying ‘there’s no resource here’. 410 means ‘there once was a resource here but it’s been destroyed and it’s (very, very likely) never, ever coming back’. The real purpose (IMHO) of 410 is actually to indicate a successful DELETE operation (that’s the way I use it).

    So most people are correct in serving only 404s. Still, you might try asking this on the rest-discuss list.

    - Bo

    — Bo #

  28. Bo, we’ve already corrected the problem of not being sure about whether a resource once existed at this location.

    http://diveintomark.org/archives/2003/03/27/http_error_410_gone.html#c000912

    What else am I wrong about? I’m confused. We seem to be in agreement.

    — Mark #

  29. Twocentsworth (trackback)
  30. I’ve had 410 support on aagh.net since its first incarnation. I’ve not used it much though; last time was when I scrapped the lot and the entire document tree was 410’d.

    http://www.aagh.net/gone anyway; unfortunately this implementation requires you to place an entry in a metadata file for each 410, but it’s better than nothing. It’ll do until the next rewrite, anyway :)

    — Freaky #

  31. Re: less-than-min-value. This could work, since my actual addresses in the mobile edition are padded entry IDs, so they increase monotonically. This looks promising:

    RewriteCond %{REQUEST_URI} >/mobile/000575.html
    RewriteCond %{REQUEST_URI} </mobile/002225.html
    RewriteRule mobile/[0-9]{6}\.html$ - [G,L]

    (Due to a failed initial import, my first entry ID is 575.)

    This would require some maintenance — incrementing the second line as articles fell off the mobile index page — but it’s nothing that couldn’t be automated. Yes, Jesper, in Python. :D

    — Mark #

  32. Twocentsworth (trackback)
  33. Where is the paypal button! A lot of love is going out to you right now and I feel like giving you some of my hard earned bundles of energy. You’re a benefit to mankind and role model of community service.

    — Eric Rolph #

  34. Everybody keep their bundles of energy in their pants. It’s only an HTTP error code.

    I didn’t expect the Spanish Inquisition.

    — Mark #

  35. No one expects the Spanish Inquisition! Our weapon is surprise, surprise and fear, fear and surprise.

    — Jesper #

  36. Our two main weapons… er… oh, sod it, I’ll come in again.

    — Thomas Scott #

  37. Quick, close the comment thread!

    — Jeremy Dunck #

  38. Re: Comment 16

    I’d just like to point out that Mark used the word “monotonically”, a word I have not heard since Calculus II. Thank you.

    — Ken #
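Two of the refinements suggested in the comments above, zero-length placeholder files (comment 6) and the entry ID range check (comments 16 and 31), can be sketched in Python to compare how they decide. This is an illustration under stated assumptions, not the commenters’ actual setup: the function names are made up, and the bounds 575 and 2225 in the usage note are simply the numbers quoted in comment 31.

```python
import re
from pathlib import Path

# Same pattern as the RewriteRule, with the six digits captured.
MOBILE_PAGE = re.compile(r"mobile/([0-9]{6})\.html$")


def status_with_placeholders(docroot: Path, url_path: str) -> int:
    """Zero-length idea: an empty file marks a resource that is gone."""
    target = docroot / url_path.lstrip("/")
    if not target.is_file():
        return 404  # nothing was ever recorded at this address
    if target.stat().st_size == 0:
        return 410  # placeholder left behind on purpose (-f but not -s)
    return 200


def is_gone_by_id(url_path: str, first_id: int, oldest_live_id: int) -> bool:
    """ID range idea: gone if first_id <= entry ID < oldest_live_id."""
    match = MOBILE_PAGE.search(url_path)
    if match is None:
        return False
    # Zero-padded six-digit IDs sort the same way as text and as
    # numbers, which is why the lexicographic RewriteCond works.
    return first_id <= int(match.group(1)) < oldest_live_id
```

For example, `is_gone_by_id("/mobile/001000.html", 575, 2225)` is true, while an ID below 575 or at 2225 and above is not treated as gone. The inclusive lower bound sidesteps the off-by-one that the strict `>` comparison ran into.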


§

© 2001–9 Mark Pilgrim