I spent 18 hours this weekend recovering 600 GB of extremely valuable data after a catastrophic hard drive crash. I highly recommend DiskWarrior. It worked. Nothing else worked.
I spent another 6 hours backing up as much of the data as can reasonably be backed up onto 4GB DVD-R discs. Twice. Once to keep in a safe place somewhere in the house, and once to keep in a safe place in my parents’ house. This should cover not only hard drive failures, but other catastrophic events like my house burning down or being broken into. And make no mistake: if my house were burning down, I would get my wife and children to safety, and then I would try to go back and get this data.
But it’s not enough. I’m creating a lot of data, and I want to keep most of it for the rest of my life. This includes video of my children growing up, but also things like video footage of New Orleans after Hurricane Katrina. In 2004, I generated 35 GB of such data. In 2005, I generated just shy of 150 GB. This year I’m on track to generate about 100 GB. I foresee doing this for about 20 more years, and then maintaining the archive for another 30 years after that. After that I’ll be dead and it will be Somebody Else’s Problem.
I don’t know how to back up 100 GB of video.
I have the original Mini DV tapes, of course, which I currently keep in a drawer, which now strikes me as an incredibly poor strategy. I could at least keep them in a safety deposit box. I rent a safety deposit box; it has things like my marriage certificate and my children’s birth certificates. This does not scale to decades. In fact, it probably won’t even scale to a single decade, due to a combination of physical space limitations and media degradation. I suppose I could invest in a larger safety deposit box when I run out of space, but by then I’ll need to transfer it all to newer media, which (one would hope) will be more space-efficient.
But that doesn’t solve my problem either, because that’s not really the data I want to keep. Every year I take the footage from the previous year and edit it down to a few hours and create a DVD out of it, which I distribute to family members and a few trusted friends who graciously pretend to care. Editing takes about two weeks, so I’d like to save the edits along with the raw digital footage. I currently do the editing in iMovie, which now strikes me as an incredibly poor strategy as well, due to the combination of closed-source software and undocumented data formats.
It’s not quite as bad as that sounds, since iMovie stores the raw footage in individual .DV files, which I have successfully read and converted to various other formats with open source software. I have no idea how iMovie stores the edits. I can at least mitigate this problem by exporting the entire edited movie in .DV format. Assuming I get smart and start editing with a program that respects my long-term aspirations, I still have the problem of storing the raw footage I generate every year.
How do you back up 100 GB of data per year for 50 years? Or even 10 years?
Right now, I keep everything on a LaCie 1 TB hard drive, which is OK for short-term storage. But the drive is really a combination of two 500 GB drives in a RAID 0 configuration, which means that if one drive fails, all the data is lost. At a higher level, it is susceptible to OS-level failures which cause invalid volume header errors and make the drive unmountable. Disk Utility couldn’t fix it. fsck in single-user mode couldn’t fix it. Tech Tool Pro couldn’t fix it. DiskWarrior fixed it… this time. Who knows if it will save me again? It was certainly worth the $80 though, even if I never use it again.
I’ve done a little research into tape backup, specifically tape backup under Free Software operating systems (like Linux) with Free Software archiving tools (like tar). The high-end tape drives can record 80 GB per tape (“160 GB with compression”), but these drives cost about $4000. The “sweet spot” of tape backup today seems to be 40 GB per tape. Hardware interfaces may also be a problem; most tape drives only have SCSI interfaces, although there are a few with FireWire ports.
I’ve also explored the possibility of using hard drives instead of tape. External FireWire hard drives are less than $1 per GB. I could buy two 100 GB hard drives, put a year’s worth of data on each one, and store each of them in different attics. Of course, with lower prices come fewer guarantees. Today’s hard drives are warrantied for one year, two tops. Two years from now I could buy two 300 GB hard drives for the same price and copy everything over.
Just for kicks, I calculated what it would cost to pay someone else to store it. Amazon offers HTTP-based storage for $0.15 per GB per month. Transferring 600 GB at $0.20 per GB would cost $120, and storing it would cost $90 per month. That’s over $1200 for the first year. Assuming I transfer an additional 100 GB each year, my storage costs accumulate and increase by $200 every year. (This naively assumes that Amazon never raises — or lowers — their prices. And never mind the privacy and trust issues.) I could buy a lot of hard drives for $1200 this year, and even more for $1400 next year.
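A quick sketch of that arithmetic, for anyone who wants to plug in different numbers. It assumes the two quoted prices never change and that new data is uploaded at the start of the year it is added; the function and constant names are only illustrative. Amounts are kept in whole cents to sidestep floating-point rounding.

```python
STORAGE_CENTS_PER_GB_MONTH = 15   # $0.15 per GB per month, as quoted above
TRANSFER_CENTS_PER_GB = 20        # $0.20 per GB transferred, as quoted above


def yearly_cost_cents(stored_gb, uploaded_gb):
    """Cost of one year: upload `uploaded_gb` once, keep `stored_gb` for 12 months."""
    transfer = uploaded_gb * TRANSFER_CENTS_PER_GB
    storage = stored_gb * STORAGE_CENTS_PER_GB_MONTH * 12
    return transfer + storage


# Year one: upload the existing 600 GB and store it for twelve months.
first_year = yearly_cost_cents(stored_gb=600, uploaded_gb=600)

# Each additional 100 GB, in the year it is added (assumed uploaded up front).
extra_100_gb = yearly_cost_cents(stored_gb=100, uploaded_gb=100)

print(f"first year:      ${first_year // 100}")    # $1200
print(f"each new 100 GB: ${extra_100_gb // 100}")  # $200
```

Note that the $120 upload of the original 600 GB is a one-time charge, so later years depend on how much is already stored rather than on a fixed step.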
And let’s not forget, I could “just” buy a spindle of 4GB DVD-R discs and burn everything onto… 150 discs. Twice.
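The disc count is simple division, sketched here under the assumption of 4 GB usable per disc (the figure used in this post) and data that splits cleanly across discs, which real files rarely do. The helper name is made up for illustration.

```python
import math

DISC_CAPACITY_GB = 4  # usable capacity per DVD-R, as used in the post


def discs_needed(total_gb, copies=1, disc_gb=DISC_CAPACITY_GB):
    """Discs required to hold `total_gb`, rounded up, for each of `copies` sets."""
    return math.ceil(total_gb / disc_gb) * copies


print(discs_needed(600))            # one set of the current archive: 150
print(discs_needed(600, copies=2))  # two sets, one per location: 300
print(discs_needed(100, copies=2))  # one year of new data, two sets: 50
```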
So, to my illustrious audience, I throw out this question: how do you back up 100 GB of data per year for 50 years?


I recently faced some of the same issues. I documented my solution here:
http://www.eightypercent.net/Archive/2006/04/15.html#a267
I have enough space for ~800GB right now. I figure that enough will change over the next five years that I can upgrade.
The only other solution that I could come up with is to do some sort of p2p thing where you buy a disk/machine and give it to family and friends. That is cheaper, probably, but it requires more software to be written.
Joe
Comment by Joe Beda — Monday, May 8, 2006 @ 11:09 am
Here’s what I’d do (am in the process of moving to):
Buy a ReadyNAS device from Infrant - I’m looking at the 1U rack enclosure, because I have access to a rack, but one of their other boxes would be just as good. It’s pricey for a NAS, but it’s currently unbeatable, and supports hardware RAID-5. On top of that, it’s one of the few hardware RAID-5 implementations I’ve found that allows you to resize the array up by swapping in one hard drive at a time and rebuilding until all are replaced, at which point you have more space. Most seem to require you to destroy the array, and then recreate it.
This gives you a great upgrade strategy - buy four hard drives from Seagate (5 year warranty, and best drives out there) that will provide enough space for the foreseeable future (defined by you) - $200 / drive can buy a lot of space. When they fill up (and it will be before the warranty is up), replace with the newer generation drives in the same price range - drive space will have jumped remarkably, and suddenly your array is much bigger. Rinse and repeat every few years.
Not the cheapest option - especially for the initial setup - but you have some solid data protection. Either double for offsite storage (yes, pricey), or only worry about doing offsite storage for the most important information. Or find an offsite location for the rack enclosure, and trust they’ll never burn down…
Comment by Isaac — Monday, May 8, 2006 @ 11:16 am
Every time I back up to DVD-Rs I wonder how long they last in storage. Do you have to do media checks regularly?
Comment by Darryl — Monday, May 8, 2006 @ 11:20 am
Reading your post, Mark, I was very excited to tell you about one of Infrant’s ReadyNAS boxes, but I see Isaac has beaten me to it. He mentions that he is in the process of switching to one of these for his long-term storage solution; I am as well.
I’m going with the ReadyNAS X6, which is roughly $560 from Newegg. It holds 4 SATA hard drives, and as Isaac mentioned, these ReadyNAS systems have a wonderful feature allowing you to upgrade your RAID with 4 new hard drives so long as you insert them one at a time. I plan to start with 500GB Seagate drives, and eventually move up to the newly released 750GB Seagate hard drives once the price is nice. You can search Google and find some great reviews of the ReadyNAS boxes; they are all Linux based. As a matter of fact, it should be noted that from what I can tell all of Infrant’s products share a lot of the same features, so in most cases you are paying for the form factor, as they all only hold 4 hard drives. They share other useful features like print serving, and being able to plug USB storage into the ReadyNAS and have that USB storage available across the network.
I hate tape drives. Oh, and let’s all move away from trusting LaCie products. Good luck finding a solution you like!
Comment by Stephen WG Sullivan — Monday, May 8, 2006 @ 11:46 am
Not to pick nits too much, but large (250GB) Seagate drives can be found for well under $0.50/GB, and they have 5-year warranties. This, of course, doesn’t affect the rest of your equation, but it does make a difference when shopping for hard drives!
Comment by John I. Clark — Monday, May 8, 2006 @ 11:53 am
Stephen,
If you have specific reasons to distrust LaCie products, could you share them?
Comment by Jim — Monday, May 8, 2006 @ 12:05 pm
The Ultrium 3 drives can do 400 GB native (800GB compressed), and you can get an HP Ultrium 960 (external) for $4,299. The price is about that quoted above, but the capacity is much greater:
http://h71016.www7.hp.com/dstore/MiddleFrame.asp?page=config&ProductLineId=450&FamilyId=1249&BaseId=13277&oi=E9CED&BEID=19701&SBLID=&AirTime=False
For $2,099, one could get the Ultrium 448, which does 200 GB native:
http://h71016.www7.hp.com/dstore/MiddleFrame.asp?page=config&ProductLineId=450&FamilyId=1249&BaseId=14280&oi=E9CED&BEID=19701&SBLID=&AirTime=False
The lesser drive, plus media, ($50 per 200GB native) would be about $2600 (plus tax) for 20 years worth, based on the assumptions above. Because these technologies are the foundation of backup strategies of businesses, small and large, it is probably reasonable to assume that technology will continue to exist that allows the media to be read for a number of years into the future. A less optimistic person may want to assume that this expense could be repeated once or twice over the course of the next 20 years along with the time to make the transfer to the newer technology.
Also, the calculations above appear to take into account the fact that, as technology advances, you’ll probably need more than 100 GB/yr.
Comment by Cameron Watters — Monday, May 8, 2006 @ 12:10 pm
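The tape estimate in the comment above can be reproduced with a few lines. This is only a sketch under that comment's own assumptions: the quoted drive and media prices, native (uncompressed) capacity, and a flat 100 GB per year. The `copies` parameter is an added illustration for keeping a second, offsite set.

```python
DRIVE_PRICE = 2099     # Ultrium 448, in dollars, as quoted in the comment
TAPE_PRICE = 50        # dollars per tape, as quoted
TAPE_NATIVE_GB = 200   # native capacity per tape, ignoring compression


def tape_budget(years=20, gb_per_year=100, copies=1):
    """Return (total dollars, number of tapes) for the drive plus media."""
    total_gb = years * gb_per_year
    tapes_per_set = -(-total_gb // TAPE_NATIVE_GB)  # ceiling division
    tapes = tapes_per_set * copies
    return DRIVE_PRICE + tapes * TAPE_PRICE, tapes


print(tape_budget())          # (2599, 10): about $2600 before tax
print(tape_budget(copies=2))  # (3099, 20): with a second set stored offsite
```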
In the “not here yet” category, you may want to take a look at blu-ray DVDs.
http://www.blu-ray.com/
25 to 50 GB per disk. I’m sure it’s going to take much longer burning times, but at least your media is much more manageable.
I agree with the comment on LaCie products. If you’re looking for a rock solid HD enclosure, check out Wiebetech. Excellent products, stellar customer service.
http://www.wiebetech.com/
Comment by Andy MacBride — Monday, May 8, 2006 @ 12:13 pm
Correction for comment #7:
The calculations above don’t appear to take into account the fact that, as technology advances, you’ll probably need more than 100 GB/yr.
Comment by Cameron Watters — Monday, May 8, 2006 @ 12:18 pm
Wow. Very timely article, one I expounded on in my own brush with the data-eater god a while back. I lost some data and kicked myself, vowed never to let it happen again. Fast forward some months and an external 120 GB drive went on me. Fool me twice I guess. I’ve been saving files and working on Macs and PCs since the early 90’s (I just received a box of college stuff from my Dad’s house, so add 5 1/4″ actual “floppy” floppies circa 1984 to that as well) and backing all this up is getting to be a real concern. My only suggestions are not really definitive answers, as I don’t think any exist yet that are up to a total solution. First, 8 GB dual layer DVD9s seem a more practical size for archiving. A year or more ago (?) I read about rapid advances in CD laser technology, specifically some Russian breakthrough with single disk storage sizes at around 100 GB possible. I heard it so long ago my brain was finding it impossible to digest the vast space involved. But as we see now, this is vaporware in hardware form. My second suggestion is also a JBOD rig, a multiple-drive RAID with redundancy and safety as my goals, to keep data on. (I will just use OS X as the host instead of Linux.) I will also preemptively upgrade the drives every 3 years, to maintain their reliability. Hope all of your backup situations work out well.
Comment by dvsjr — Monday, May 8, 2006 @ 12:27 pm
I’m not trying to sound like a smartass here, but if the bulk of your data is raw DV footage, I’d say, get over it. It simply isn’t practical to keep every minute of footage you shoot. Either keep just the best stuff, or save it all at DVD quality, where 2 hours (25 GB) of iMovie footage turns into 4.5 GB that will fit onto a single DVD. Even if you use high-quality settings (and make each DVD one hour long but still 4.5 GB) that’s still around 3:1 compression. I know you’ll have some regrets of footage that’s lost, but contrast that to the anguish of becoming a full-time archivist.
Speaking of iMovie, as far as I know edits are turned into .DV files once they’re rendered. They might not be named as such, but that’s what they are. At least they were in iMovie 4. iMovie 5 changed how files and the trash are handled and they might have changed how edits are handled, too. As long as you’re keeping something around that operates with 3.5″ drives (see below), it shouldn’t be hard to keep a Mac or two around to handle the files.
I know what you shoot is important to you, and I’m sure you wish you had more super-8 movies from your parents, but you can see that it is just impractical to store every second you’ve shot. Even with 5-year-warranted Seagate drives (which I see someone else already mentioned) there is the problem of what happens when we move beyond 3.5″ ATA/SATA drives. If you had started keeping data 20 years ago, you would have already moved from 8″ to 5.25″ to 3.5″ floppies to CD-Rs. You’ll spend the rest of your life moving data from format to format if you want to keep it around, and of course backups will have to be verified periodically too. This is required reading: http://en.wikipedia.org/wiki/BBC_Domesday_Project
I’m not saying what should or shouldn’t be important in your life, but there is the old saying about letting your possessions possess you. Data maintenance is hard, and the higher your standards are, the harder it is. Also, your predictions on space requirements are low, if anything–they’ll probably hold true for a year or two, but what happens when you’ve gone through a couple more digital cameras and are at 10MPixels+ per image, and/or start shooting in RAW, and start shooting video in HD? There comes a time when you have to make the decision to put some of the pics into albums and toss the rest of the shoeboxes, lest they take over the whole house.
Comment by brian — Monday, May 8, 2006 @ 12:31 pm
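The DV-to-DVD figures in the comment above work out as follows. This is a sketch using only the numbers given there (2 hours of DV footage at roughly 25 GB, 4.5 GB per DVD); the yearly total of 100 GB comes from the original post, and the variable names are illustrative.

```python
RAW_GB_PER_HOUR = 25 / 2   # about 25 GB for 2 hours of DV footage
DVD_GB = 4.5               # one single-layer DVD


def compression_ratio(hours_per_disc):
    """Raw DV size divided by DVD size for a given running time per disc."""
    return hours_per_disc * RAW_GB_PER_HOUR / DVD_GB


print(f"two hours per disc: {compression_ratio(2):.1f}:1")  # 5.6:1
print(f"one hour per disc:  {compression_ratio(1):.1f}:1")  # 2.8:1

# A 100 GB year of raw footage is about 8 hours, so 8 discs at the
# high-quality (one hour per disc) setting.
hours_per_year = 100 / RAW_GB_PER_HOUR
print(hours_per_year)  # 8.0
```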
I forgot to mention–tape drives, as far as I know, cannot compress incompressible data. DV, MPEG, JPEG, MP3–all these formats are already compressed and don’t respond well to attempts to squish them further. Zip up an MP3 and see what happens. Maybe you’ll see a 5% savings. Can anyone with tape experience confirm or refute this?
Comment by brian — Monday, May 8, 2006 @ 12:37 pm
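The "zip up an MP3" experiment is easy to approximate without an MP3 at hand. The sketch below uses random bytes as a stand-in for already-compressed media (an assumption: real DV or MP3 data is not perfectly random, so it may shrink a little) and compares them with highly repetitive text. It says nothing about any particular tape drive's hardware compression.

```python
import os
import zlib


def ratio(data):
    """Compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly repetitive input: compresses extremely well.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 5000

# Random bytes of the same length: a stand-in for already-compressed media.
noise_like = os.urandom(len(text_like))

print(f"repetitive text: {ratio(text_like):.3f}")
print(f"random bytes:    {ratio(noise_like):.3f}")
```

The second figure should land at or slightly above 1.0, which is why the "with compression" capacity on a tape box is best ignored when the payload is video.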
My uncle works for NASA. Years ago, he was called in to consult for the government on how to store some important data. Tape drives? CDs? (This was before DVDs.) His response was this: “If you write it on papyrus and lock it in a pyramid, we know it will last a few thousand years. Everything else is just guesswork.”
Comment by David Ely — Monday, May 8, 2006 @ 12:46 pm
You see, the problem here isn’t that you don’t have enough storage, it’s that you have too much stuff. If you were to look at some guy who is whining about needing to store 5 more tons of junk/antiques/old electronics he had or has to have, hopefully you would tell him he DOESN’T NEED IT ALL. How much time are you willing to devote to watching “your media” a year? Figure this out and then only keep that much backed up. Throw out some each year that you don’t need and add the stuff you do. When you get real old you still only have that much time to watch videos, so supposing that videos are still being created that you will want to view, that will cut down on the time you have to view the stuff that is backed up. You only need a FINITE amount of video backed up, providing you are still going to be living life and not tucked away in a hole with nothing but videos to watch.
“But what about my descendants? How will they remember me?” some might ask. Well, if they are anything like you they will be far too busy making their own memories and videos to watch all that stuff you archived.
In a 37 signals way you need to simplify, figure out what you don’t need and trash it. Time to watch stuff is finite, live life, don’t “relive” it.
Comment by Peter — Monday, May 8, 2006 @ 12:51 pm
For video data, you really ought to dump iMovie. Moving to Final Cut will make a big difference (although it is not an easy program to use - you need to read the manual). The big difference is that iMovie stores all your clips and cuts and filters/titles etc. as (huge) video data files, making the projects large and hard to back up. Final Cut simply references the original media. So to restore your movie you simply recapture it, and all the sequences/titles/etc. are regenerated. This is much more efficient, and easier on your storage budget.
Comment by aricart — Monday, May 8, 2006 @ 1:05 pm
I am aware (but neglected to mention) that 100 GB is the low end, and does not take into account advances in consumer technology. For example, when this $10,000 HD video camera drops to $1000 and I put it on my Christmas wish list. And we already have our eye on a 7 megapixel camera to replace the 3.5 megapixel camera we bought a few years ago for the same price (not even counting inflation). Same price except for backups, of course…
I am also aware that I will not care about every minute of footage in 20 years. The problem is, I don’t have the foggiest idea which minutes I’ll care about, and I am not ready to let go of any of it yet. The oldest video footage I have is less than 3 years old. I need an archiving strategy that keeps it long enough for me to be able to look back rationally and prune it.
Finally, I am aware that I will need to transfer things to different media over time. I wouldn’t trust DVD-R discs for more than 2 years. Hard drives maybe 5 years, max. I still have 5.25″ floppy disks from my Apple ][e days (in Apple DOS 3.3 format) that I am finally getting around to transferring to .dsk (or .nib?) image files so I can put them in the Grand Pilgrim Archive and access them via emulators. My later work was saved on 3.5″ floppy disks which I successfully imaged on an early Mac — back when Macs came with floppy drives — so those are now included in my normal automated backup routine.
I should have mentioned that I *do* have an automated backup routine. I have a daily backup script that copies my Documents folder, mail archive, photo library, web server logs, and private SVN code repository to a separate local hard drive. And a weekly backup that compresses the daily backup and rsyncs it to a StrongSpace account. But that only holds 20 GB, hence my need for a larger storage solution.
Comment by Mark — Monday, May 8, 2006 @ 1:06 pm
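A routine like the one described in the comment above (a daily copy to a second local drive, a weekly upload with rsync) might be sketched as below. This is not the actual script: the folder names, the remote address, and both function names are hypothetical, and the rsync command is only built, not run.

```python
import shutil
from pathlib import Path


def daily_backup(sources, backup_root):
    """Copy each source folder into backup_root, overwriting older copies of files."""
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in map(Path, sources):
        target = backup_root / src.name
        # dirs_exist_ok needs Python 3.8+; files deleted from the source are
        # NOT removed from the backup by this call.
        shutil.copytree(src, target, dirs_exist_ok=True)
        copied.append(target)
    return copied


def weekly_rsync_command(archive, remote):
    """Build (but do not run) an rsync command that uploads the weekly archive."""
    return ["rsync", "-a", "--partial", str(archive), remote]


# Hypothetical usage, e.g. from a scheduled job:
#   daily_backup(["/Users/me/Documents", "/Users/me/Pictures"], "/Volumes/Backup/daily")
#   subprocess.run(weekly_rsync_command("weekly.tar.bz2", "me@host.example:backups/"), check=True)
```

Keeping the upload step as a plain argument list makes it easy to log or dry-run before handing it to `subprocess.run`.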
LaCie does not make good hard drives. My 150 GB works okay but I have had repeated failures with larger capacity drives: overheating, data loss, etc. (And selling RAID 0 is just kind of dirty. But I guess capacity sells. No one wants to buy 3 500 gig drives for just 1 TB of space.)
Western Digital now makes drives (the MyBook) that are just as good looking as LaCie’s, and I trust WD a lot more. I use my LaCies for stuff that doesn’t matter in the long term (movies and TV shows to get around to watching, etc.), and I mirror my iMac’s 500 GB internal drive on a 500 GB MyBook using Super Duper.
Comment by jhn — Monday, May 8, 2006 @ 1:07 pm
I’d say storage will evolve with the size of data. I remember back in 1992 my dad bought a computer with 127 MB of HD and 16 MB of RAM. I could not imagine what in the world he would do with all that storage. Blu-ray and HD DVD are around the corner, and will get cheaper in a couple of years. So I think that as you create more data per year, the cost of storing it will stay the same, proportionally, as technology evolves. It’s a matter of keeping up-to-date and maintaining your backups periodically (say once every two years, you move them all to the most updated storage type, and discard the old ones).
Floppies? I’d have transferred all the data to hard drives a while ago.
Comment by Andre — Monday, May 8, 2006 @ 1:45 pm
Not much to add except that I saw something last week about Kodak marketing gold plated CDs and DVDs for archival storage. If they actually have a long shelf life and they are not too expensive, they could be of some use to me. I have little faith in traditional DVD media over the long term.
Comment by Gary — Monday, May 8, 2006 @ 1:54 pm
The problem with tape drives is that when it is time to do a recovery, it is a time-consuming experience. Using external data storage is impractical for most people due to the cost of file transfer fees. I am a self-employed IT consultant here in Germany. My customers have an average daily actual data level of 500 GB to 1 terabyte. That needs to be not just incrementally backed up; it also needs to be quickly and easily retrieved. Take into account what is needed for insurance purposes, such as fireproof and theft-proof or perhaps offsite storage, and it can get complicated and expensive. Tape is used to meet the insurance requirements. For the real world, a combo of SAN and NAS solutions is used. I myself have at home a Mac mini which is connected to a LaCie Biggest F800 (2 TB). The mini is a central server where my partner and I store our music, videos, pictures, downloaded paid-for software and files, and where we make copies of media such as CDs and DVDs for archive purposes. It is also used as a living room media center. We both have notebooks for our day-to-day usage and use the mini as a hub. In our den I’ve installed a fireproof server cabinet with a Linux-based NAS system with a hardware RAID device. I can replace or swap hard drives as the case may be. It is kept cool with a dehumidifier. The mini is connected to the NAS with Ethernet and does a daily incremental backup. The NAS is where data that is not needed but must be kept anyway is stored. It has at the moment a capacity of 8 TB with 2.5 TB used. I have data going back 10 years. The server cabinet is kept locked and would be a bitch to break into, so hopefully it would be enough of a deterrent for a burglar. Thanks to Apple’s Bonjour technology it’s a snap to share our music and pictures and view our videos. I am not looking forward to when private individuals approach me to help them recover their “memories”.
Apartments and homes of the future are going to have to offer high-speed local networking along with high-speed Internet access. Basements and/or attics are going to need top-notch “server rooms” or at least have a “closet” that is fireproof and break-in proof. I hope that Apple and other vendors can create a SAN/NAS solution that is not just cheap and easy to use but also scalable. Media such as CDs, DVDs and other optical media will just not cut it. They are just too easily damaged, take up space and get outmoded fast. Online storage is for most people just not practical due to data transfer and storage fees, especially for people outside of the States. Such companies will have to reach the levels of trust of banks, which offer safe deposit boxes and vaults, along with longevity, let alone privacy and copyright policies. Think about it: in the not too distant future a small family of four where each member has his or her own computer will produce hundreds of gigabytes of data, which will include personal memories and memoirs. It not only needs to be kept safely, it also needs to be easily transferable from an old computer to its replacement, and shareable. I think private and family data storage is going to be a hot commodity of the future.
Ian
Comment by Ian — Monday, May 8, 2006 @ 2:04 pm
Avoid DVD media… even the best brands have a 5-10% failure rate unless you store them in controlled-temperature environments.
Comment by Duffy — Monday, May 8, 2006 @ 2:58 pm
You know — you KNOW — that whatever medium you use now will be obsolete in very few years. So don’t bother planning past this year and the next. 100GB/4=25. Twice that is one spindle. Two years from now you’ll be able to copy that onto whatever’s next in a matter of hours at most. What’s the problem?
Comment by Jim Hill — Monday, May 8, 2006 @ 3:14 pm
The short answer, I think, is that whatever you do, it’s not going to be enough. For one simple reason: everything here, including in the comments, is based around personal storage solutions. This won’t ever cut it. Things move too fast for one person to keep up with it.
What’s needed is a service to blow Strongspace (what I currently use) and Amazon’s S3 out of the water. Who knows, maybe one of the two I mention will be the one? This service will be a worldwide redundant grid-style backup solution, using a BitTorrent-like protocol for sharing data. It also relies on people’s internet connections being fast enough for this to be worthwhile. And of course it needs to do all the stuff that S3 is doing, with a file system that properly supports this sort of architecture. Charge a small fixed fee each month for this, and make it easy to use, and someone’s going to make an awful lot of money.
At least, that’s what I think.
PS: what you should do in the short term until something like the above comes about: hard drives of not too big size (say 3x your estimated yearly usage, to allow for increases). Store them in different locations, have a backup of each backup drive, and replace the drives every 2 or 3 years. Oh, and win the lottery.
Comment by Chris — Monday, May 8, 2006 @ 3:32 pm
Perhaps a cheap Solaris box with ZFS instead of the RAID solutions suggested? I’m personally expecting to be in a similar situation in the near future, and ZFS has so far looked like the best way to go without totally breaking the bank.
Comment by Bob Aman — Monday, May 8, 2006 @ 3:47 pm
1. I’ve got documents that are a mere 15 years old stored on 5 1/4 inch disks… but I have no way to read them. I think your only long term solution is to realize that there is no long term solution; find something that will work for 5 years and plan on copying all your data to something new after that.
2. Several years ago I got a VXA2 Firewire tape drive, it takes 80/160 gig tapes, and cost around $1500… so donno where you got that $4000 figure.
3. This live comment preview thing is nifty. :)
Comment by paul — Monday, May 8, 2006 @ 3:47 pm
As file sizes grow in the future I’d expect disk sizes to grow as well. Another thing to consider is the role of compression in the future.
As for how I back up, I’m pretty lucky. 15 years of data (granted I don’t really have much from when I was 3) equates to about 25GB, though I don’t shoot any video. But I usually back up to my iPod when I remember.
Comment by Pilky — Monday, May 8, 2006 @ 4:20 pm
Although I’m dealing with slightly less data and data in much less complex formats (.psd, .html, .ai for example), I too have pondered this for a while.
I decided to purchase an old PowerMac G4 500 MHz and put three 120 GB hard drives in. Currently it still sits with just one 80 GB, but the plan is to RAID up the three drives for redundancy. It is cheaper to add drives to a server, and because it has its own OS (unlike a FireWire drive) I can set up OS X on another smaller partition and keep it separate. This has added benefits - because I’m on a MacBook Pro as my main computer, the server takes care of long-term tasks too, such as downloading and uploading large files, and acting as an e-mail server. Add in the cost of a static IP (cheap and getting cheaper by the year) and I have a server I can access anywhere around the world at any time.
P.S. I originally tried this with a blue and white G3, but the limitations were too great - no boot from firewire, only smaller ATA drives recognisable, etc.
Comment by phobic — Monday, May 8, 2006 @ 4:34 pm
Me, I’d be looking into tape. I second the LTO recommendation-it’s really, really fast. My backups to LTO have been Ethernet limited, not tape limited. And that’s over gig-E networks. Yes, you still have to do searches, but damn–there’s a reason it’s called Mean Time Between Failure on hard drives. A 200/400 GB LTO tape will cost you ~$40-50 when bought in bulk (e.g., 10 or 20). Beat that with a hard drive–especially when you have to buy at least 2 at a time to be safe. Yeah, there’s the initial (not inconsequential) drive investment, but…
Comment by Ken Carlile — Monday, May 8, 2006 @ 4:42 pm
Said before, but well worth restating: when relying on DVD media for backups, you really need to duplicate your existing disk set every 1-2 years (and test it at least once a year — you do keep hashes of the disks?) because of degradation of the disks.
Comment by Marten Veldthuis — Monday, May 8, 2006 @ 4:56 pm
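Keeping hashes of each disc, as the comment above suggests, could look something like this. It is a minimal sketch: SHA-256 is one reasonable choice rather than anything the commenter specified, and the function names and the idea of holding the manifest in a dictionary are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large video files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the files from the manifest that are missing or no longer match."""
    current = build_manifest(root)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```

The manifest would be built when a disc is burned and stored somewhere other than the disc itself; at each yearly check, a non-empty result from `verify` means it is time to re-burn from the other copy.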
if you look at all the information surrounding us nowadays, you can realise that probably the 99,9% of this information will be lost in few years. the only way i see to keep all this data for 50 years is to make it public (at least for your family and friends) and interesting for ‘your audience’.
(sorry for my english).
Comment by xabi — Monday, May 8, 2006 @ 5:37 pm
I know CDs rot, but what’s the shelf life on an idle, unplugged hard drive? Warranties seem to generally range from 1 to 5 years, but I assume drive manufacturers set those dates based on typical, regular usage. Wear and tear of moving parts and all that. But if the parts aren’t moving, why would a healthy drive die on the shelf? Oxidation? Inertia? Entropy?
Comment by Jim Biancolo — Monday, May 8, 2006 @ 5:51 pm
Loads of useful comments here, but honestly, if you’re rating the value of the data just behind the lives of your wife and children, I wouldn’t trust it on magnetic media unless you’re also housing it in a cushioned Faraday cage.
On another tack, have a look at LOCKSS.
Comment by Jeremy Dunck — Monday, May 8, 2006 @ 6:37 pm
I’ve been thinking about this a lot lately, both for personal use and for work. The best solution I have gotten to so far is a pair (at least) of servers, one local to you, and one “somewhere else”. Both have RAID-5 arrays of similar size. All of your local data gets saved to the local server, which in turn is automatically synced to the remote server(s) at regular intervals. This keeps local data safe and accessible, with a reasonable contingency scenario in the remote server(s).
This scenario has proven both reliable and cost effective for me.
However, I’d echo the sentiments of other posters in that you should think of “long term storage” as a string of short-medium term solutions that are replaced every few years. Thanks to the malleability of digital storage needs, a true single “long-term” solution is a fantasy at best.
Comment by Quentin Hartman — Monday, May 8, 2006 @ 6:48 pm
> you should think of “long term storage” as a string of short-medium term solutions that are replaced every few years.
This is the most insightful thing I’ve read all day. Thank you. I tried and tried to come up with something that succinct, and failed.
So, the question becomes: how do I keep about a terabyte of data for about five years? In a way that it is possible (and preferably relatively painless) to transfer it to another storage solution after that?
Now that you’ve helped me rephrase the question, I am much better able to sort through the rest of the answers.
I’m strongly leaning towards buying one of those NAS boxes + a single plain hard drive to keep offsite (probably at my parents’ house, since that’s relatively close and yet far enough away from my kids to make me comfortable). I have the luxury that, although my data accumulates, it tends to accumulate in spurts. After each spurt, I can manually refresh the data on the offsite drive. This is not ideal (red flag: not automatic), but it has the dual advantages of being relatively cheap (compared to a networked offsite backup), and being in a place I absolutely trust.
Comment by Mark — Monday, May 8, 2006 @ 7:14 pm
By the way, I tried compressing Pilgrim2004.iMovieProject with “tar cvfj” (bzip2 compression). It compressed from 31 GB to 22 GB. Definitely worthwhile. I’m trying some other iMovie projects now.
Comment by Mark — Monday, May 8, 2006 @ 7:20 pm
when the price comes down, network attached storage with solid state drives (=
Comment by er0k — Monday, May 8, 2006 @ 8:00 pm
No one has yet given a specific reason why they dislike LaCie. If you have one, please share.
jhn - LaCie does not manufacture hard drives. They make cases, and use the same vendors as everyone else for their mechanisms. Are you saying the cases are faulty? That they buy shitty mechanisms? Please be specific - this is our data we are talking about.
Comment by Jim — Monday, May 8, 2006 @ 9:54 pm
I agree with the “string of medium term solutions” theory. I’ve done a fair amount of research for the archive that employs me and can say that it seems fairly agreed upon in the profession that the only “sure thing” is heavily RAIDed storage that is perused constantly by software checking the validity of each bit (I read something interesting about re-purposing virus scanning software to do this). Of course to guard against natural and other sorts of disaster, you’d want ideally to mirror the system in a remote location, but that does double the cost. I’d personally settle for a nice professionally managed RAID 5 box sitting somewhere safe inside the locked cage of a colocator in a boring suburb of a boring city where nothing like that has ever happened… Do stay away from optical disks (DVD etc.): they are “dark” storage and do have proven failure rates, plus if you calculate the time it takes to make all those multiple copies of disks and send them around, you’re better off buying some hard drives.
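The “software checking the validity of each bit” idea can be approximated at home with a checksum manifest. What follows is only a minimal Python sketch, not what any professional archive actually runs: it assumes the archive is an ordinary directory tree, and the function names (sha256_of, build_manifest, verify) are made up for illustration.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large video files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def verify(root, manifest):
    """Return the sorted relative paths that are missing or have changed."""
    damaged = []
    for relative, expected in manifest.items():
        full = os.path.join(root, relative)
        if not os.path.isfile(full) or sha256_of(full) != expected:
            damaged.append(relative)
    return sorted(damaged)
```

Build the manifest once when the data is known to be good, store a copy of it alongside each backup, and run verify on a schedule. Anything it reports is a file to restore from another copy while another copy still exists.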
Comment by derek — Monday, May 8, 2006 @ 10:28 pm
Hi Mark, I don’t know if what I do would help you (I’m on a Windows box), but nevertheless, here it is.
I capture my video from my MiniDV tapes via Windows Movie Maker, edit, add titles and save it as a wmv file. WMV size is impressive (at least to me): 1 hour’s tape is about 800-900MB. Then I use a DivX converter to convert. Now this is the gem: the DivX converter converts my 900MB to about half its size! That’s a super-saver.
Needless to say, all my videos are in DivX format. Right now, I have no idea about how it would look 10 or 20 years down the line in terms of DivX’s roadmap and its player for future use. But, having looked in the past for 5-6 years (good community following), I’m not sure DivX would turn for the worse. Hope this helps you in some way. Cheers.
Comment by ch3tan — Monday, May 8, 2006 @ 11:15 pm
Jim asks, “But if the parts aren’t moving, why would a healthy drive die on the shelf?” I’ve read that the bearings in drives will dry out if the drive is left powered off for a long time. When you turn it back on… the bearings have frozen up and the drive is dead. So keep those hard disks spinning and scrubbing; that way you can replace each drive immediately after it fails.
This has been covered by the Long Now people; if only I could find the reference.
Comment by Wes Felter — Monday, May 8, 2006 @ 11:40 pm
So, how do you backup your backup?
Comment by Mephistopheles — Tuesday, May 9, 2006 @ 12:31 am
Hard Drives seem expensive at first, but you wait six months and it changes, AGAIN. NewEgg is selling 750gig 7200RPM SATA drives for $499
Most of the back-up heartaches we have encountered in the past 10 years have involved SCSI RAIDs and Drive Savers (our heroes)…
Now, I buy 3 drives for every purpose. One lives as the original, I have a “live” external clone for redundancy and an offsite clone for “oh shit moments”… when we archive data… we have 3 copies all on hard drives, one in a fire safe, one offsite in a firesafe, and another in a second offsite location.
CDs and DVDs have not held up to the test of time, even the ones we have stored in the dark in a temperature controlled room.
In production environments IT People are becoming digital librarians and archivists… hopefully our hard work will smooth things out for the average consumer by the time they are gathering an average of 100gigs a year.
There have been great efforts in the past decade to start digitizing people’s entire life experience… Gordon Bell and Jim Gemmell working at Microsoft have made great strides in this area… saving the data to drives.
My personal opinion is that holographic storage will happen in the next 5 years or so… with optical media that will hold 300gigs each to start and eventually they will reach into the terabytes for $100 a pop
http://news.com.com/Maxell+focuses+on+holographic+storage/2100-1015_3-5973868.html
Until then, Hard Drives (in our case a mix of external Firewire Drives and Internal SATA) have to hold on.
Comment by Bob Dow — Tuesday, May 9, 2006 @ 12:55 am
Since you’re now leaning towards a RAID NAS solution, let me also recommend the Infrant ReadyNAS solutions. A couple of points of interest for you: the ReadyNAS supports AFP, a must as far as I’m concerned for Mac users (long file names), and it also supports rsync, which makes for simple, effective backup solutions without your data ending up in some weird, proprietary backup format.
Regarding off-site storage, where do you do your hosting? How much bandwidth and disk space can you buy? A nice solution (though I haven’t implemented it myself) is to run rsyncd on your host, and set up a cron job (is it launchd these days?) that uses rsync to back up to both the NAS box and your remote server every night. I use Rimu Hosting, they give full root access to your virtual Linux server, so you can set up rsync and ssh the way you like. They don’t have a plan with a terabyte of storage option, but they’re very customer focused and likely to offer custom plans.
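The nightly job described above could look roughly like the Python sketch below. It is a sketch under assumptions: it pushes over ssh rather than talking to an rsyncd daemon, it treats any destination containing a colon as remote, and every path and host name in the usage note is hypothetical. The only rsync options used are -a (archive mode) and -e (remote shell).

```python
import subprocess


def rsync_command(source, destination, use_ssh=False):
    """Build an rsync argument list that copies source into destination."""
    command = ["rsync", "-a"]
    if use_ssh:
        command += ["-e", "ssh"]
    # A trailing slash tells rsync to copy the contents of the directory
    # rather than the directory itself.
    command += [source.rstrip("/") + "/", destination]
    return command


def nightly_backup(source, destinations, dry_run=True):
    """Build one rsync command per destination; run them unless dry_run."""
    commands = [
        rsync_command(source, destination, use_ssh=":" in destination)
        for destination in destinations
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

Something like nightly_backup("/Volumes/Video", ["/Volumes/NAS/Video", "user@example.com:backup/Video"], dry_run=False) would then be the line that cron or launchd runs. The sketch deliberately leaves out --delete, so a file removed locally by accident is not also removed from the backups.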
Comment by Pete Lacey — Tuesday, May 9, 2006 @ 6:46 am
Mark, you mention using bzip2 to compress 31 GB of data. May I suggest the less CPU-hungry, faster, but not-so-spectacular gzip instead? My experience of compressing big amounts of binary stuff with bzip2 as opposed to gzip can be summed up in one word: SLOW.
So if you were to automate compression of data, it might be in your interest to deal with a small gain in filesize in exchange for a huge gain in speed of the process, and use gzip…
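Rather than take anyone’s word for the speed versus size trade-off, it is easy to measure on a sample of your own footage. This is a small Python sketch using the standard gzip and bz2 modules; the helper names and the 64 MB sample size are arbitrary choices for illustration, and the numbers will vary with the data and the machine.

```python
import bz2
import gzip
import time


def read_sample(path, limit=64 * 1024 * 1024):
    """Read at most `limit` bytes from the start of a file."""
    with open(path, "rb") as handle:
        return handle.read(limit)


def compare(data):
    """Return {name: (compressed_size_in_bytes, seconds)} for gzip and bzip2."""
    results = {}
    for name, compress in (("gzip", gzip.compress), ("bzip2", bz2.compress)):
        start = time.perf_counter()
        packed = compress(data)
        results[name] = (len(packed), time.perf_counter() - start)
    return results
```

Calling compare(read_sample("some-footage.dv")) (the file name is hypothetical) gives both compressed sizes and both timings side by side, which is enough to decide whether the extra space bzip2 saves is worth the wait.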
Comment by Michel Valdrighi — Tuesday, May 9, 2006 @ 6:59 am
I think the NAS solution is best for future proofing your backups.
A simple idea for the offsite backups: if you’ve got a friendly company just around the corner, ask them if you can put a NAS box in their office, then set up a wireless network. If need be, they can put a NAS box in your office for offsite backups. Just a thought.
Comment by Phil Balchin — Tuesday, May 9, 2006 @ 7:17 am
Hi guys, is off site online storage really being used in America to hold private personal data? I just cannot imagine trusting corporations to protect my data “long term”, let alone not viewing or using it. Remember the phrase “I’m an e-mail admin, I read your e-mail!” Would like to hear your thoughts.
Ian
Comment by Ian — Tuesday, May 9, 2006 @ 7:18 am
Lots of great nuggets in here to choose from! Thanks for posing this question (and its revised phrasing), Mark!
My current plan involves four prongs:
1) I have a colleague who keeps a svn repository in a different city. I use this to mirror my source code and documents. I use this technique for lower, more manageable forms of data, not huge chunks of video.
2) I have my videos on tape/DV media - eventually these will degrade, but if I have the gumption, I’ll make copies every so often…
3) I have transferred my videos to my hard drive and converted them to MPEG-2. I plan to just keep buying bigger and bigger hard drives, with the eventual plan to move to a dedicated file server with RAIDed hard drives. Every year or two I buy a new hard disk and ensure that the MPEG files are located on both hard drives. Eventually, due to Moore’s Law, I’ll “decommission” older hard drives… In my head, hard drives have become the one form of “permanent” storage solution - just keep moving the data along, migrating from hard disk to hard disk and updating the file system type when a new file system type becomes de-facto.
4) I also put all my MPEG files onto optical media (DVD-R). Currently that’s 30 disks and growing that I plan to refresh every 2 years. Eventually I’ll move to the next phase of DVD (HD/BluRay) when prices come down.
It has become a bit of a chore though! :)
Comment by Jeff Schiller — Tuesday, May 9, 2006 @ 7:23 am
I must agree at least some with the “let’s not get carried away” crowd. That’s a lot of backing up for not a lot of actual use.
That being said, I know it pains us to say it, but I’ve finally stopped archiving all my personal video in DV. I know that the idea of converting DV to MPEG-2 (that’s what I’d recommend since it’s got broad support and excellent quality) and then back to DV for editing ( I actually use MJPEG as an intermediary, which is huge but keeps the artifacts at bay) and then BACK to MPEG-2 for storage makes the preservationist in you cringe, but try it and give it an honest assessment. I think you’ll be pleasantly surprised. I wouldn’t recommend it for a video freelancer, but it’s more than acceptable for home video use.
I use MPEG Streamclip (plus the $30 QT MPEG-2 component) for the conversion, but ffmpegX can handle the task as well.
Besides, in five years we’ll all be complaining about how crappy DV was anyway (pros already do). :-)
Comment by grovberg — Tuesday, May 9, 2006 @ 7:24 am
Not a solution to your backup issues, but I use a catalog program for my DV stuff, CatDV http://www.squarebox.co.uk , you can capture whole tapes at a lower res, pick the clips you want then recapture at full res, you’ll have a full catalog of what you’ve shot, and you can just pull the tape if you want to get some footage you hadn’t already captured. Integrates well with FCP, which is not that hard to use.
Comment by NIck Hayday — Tuesday, May 9, 2006 @ 7:25 am
With this talk of DV versus MPEG2 and generational loss, why not use a lossless video compression standard? Lack of widespread support?
Comment by Jeremy Dunck — Tuesday, May 9, 2006 @ 8:08 am
Hmm, I wonder … … if you could print something and retrieve it by scanning? It would seem to pass the longevity and obsolescence test. I vaguely remember various products to do this long ago.
Comment by reinkefj — Tuesday, May 9, 2006 @ 8:48 am
I’d suggest moving from iMovie to Final Cut or some other “professional” editing suite. Yes, they use proprietary project formats but these applications will also let you export your edits as an Edit Decision List (EDL) file: that’s a standard, human-readable, plain text format supported by just about all editing software. In fact, it’s so standard you could give a hardcopy to a 1940s splice-and-glue film editor and he’d be able to reassemble your film.
Comment by Carrington — Tuesday, May 9, 2006 @ 9:37 am
This may have been mentioned already, if so, sorry. My personal approach to this is currently double-layer DVD-R disks. 100GB will fit onto 12 8GB double-layer DVD-Rs. To make two copies you’ll need 24 disks, which can be found for $1.80 each; that’s $43 for the media, and a couple of hours of work with a fast reliable burner. Sure, it’s manual, sure, it’s annoying; but it’s very cost effective and gets the job done, and it’s ok if only necessary once a year.
In a couple of years it will probably make sense to copy all the old archives to a massive hdd, then re-burn them onto HD-DVDs or Blu-Rays.
Comment by Avi Flax — Tuesday, May 9, 2006 @ 10:59 am
Just to support the remarks regarding growth in both media and storage: ten years ago, my dad was pretty excited to have 64MB RAM in his PC. The Apple G5 Quad Macintosh supports up to 64GB RAM.
Storing and preserving data will stay an issue for at least the foreseeable future.
Comment by Rob Mientjes — Tuesday, May 9, 2006 @ 11:12 am
In France, a major ISP recently launched a new service named Dedibox (essentially a dedicated Linux server with a 160 Gbytes hard drive) for 45€/month all taxes included (around $36 at current rates). For this price, you have a real server (2Ghz, 1 Gbytes of RAM) and backups (snapshots of the last 3 days). Bandwidth is not metered (there is an unknown limit to the overall bandwidth of the data center, but from experience of their ADSL service, they followed the growth reasonably). It reduces your estimated monthly charge to $72 compared to Amazon service and provides useful infrastructure for providing services with your data (although the RTT across the Atlantic will ensure poor QoS).
I am not affiliated to this service provider’s group (Free/Iliad) but I suspect that the facility (look at http://www.dedibox.fr/index.php?rub=datacenter) is designed for Video On Demand (the ISP is already a provider of VoIP and TVoIP for more than 1 million subscribers and recently tiptoed the VoD market).
France is now an advanced market for Internet services and this service provides a glimpse at the future.
Comment by Jean-Philippe Papillon — Tuesday, May 9, 2006 @ 11:15 am
This massive amount of data is mind blowing.
I can only guess how to back up this in an affordable way, if you do not make any profit with the content you wish to back up.
Good luck for the next back up !
Comment by Marina loves pictures — Tuesday, May 9, 2006 @ 11:44 am
@ Jean-Philippe Papillon:
You got your math screwed up :))
the dollar is actually weaker than the euro, i.e. €45 converts to about $55, not $36
Besides that, this service sounds cool - It´s just a pity that Spain is still in the Internet Stone Age (I just upgraded my DSL to 6MBit which sets me back some $70 per month :(
Thanks all for the discussion and input.
I have an unused, windowless Winecellar (Bodega) in my current mansion, making it kinda fireproof. I guess, for the time being, I´ll stash that Infrant Box in there, regularly adding/replacing drives…
Rgrds
Matt
Comment by Matthias — Tuesday, May 9, 2006 @ 12:18 pm
The thing is, the data you created in the last couple of years is your only real concern, space-wise. What I mean is, sure, you created 100GB of data this year and last, and 100GB of data seems like a lot to us now, so we scramble to find a solution to back it up. But 10 years ago you probably only created, what, 10MB? 100MB? I bet I could take all the data I created from 1980 to 1995 and put it on a $30 flash drive in a few seconds.
In ten years the 100GB of data you created last year is going to be just as insignificant, and storing it is going to be just as simple and cheap. The important thing, as everybody’s already said, is to make sure you keep everything in a format that’s still usable with whatever computer you have at the moment, so you can still read it as time goes on.
Comment by Feaverish — Tuesday, May 9, 2006 @ 1:27 pm
Very tricky. I have lost data before and I still think about it (even a single 5.25-inch floppy I accidentally erased). I have three backup sets (which I just now upped to four upon reading this posting), two on DVD and one (now two) on a hard drive. I want a second hard drive and an online storage plan. I still believe that will not be sufficient.
Comment by Joe Clark — Tuesday, May 9, 2006 @ 1:39 pm
I don’t think this much data is a real problem–the hard drive solution you mention seems to make the most sense to me, both in terms of economic efficiency and in ease of use. Buy 2 300GB Seagate drives this year. Let’s say you pay $1/GB (which is a high estimate). That’s $600, but it will last you three years. In three years you can spend another $600 and you will probably be able to get 2 600GB seagate drives (or 4 300GB drives). Rinse, repeat. It’s not cheap, but $200/year isn’t an insane amount of money to pay to back up all your data. It’s sure cheaper than using Amazon. And as another commenter said, the Seagate drives are warrantied for 5 years (link).
Comment by Bob — Tuesday, May 9, 2006 @ 1:42 pm
I faced the same problem with about 200GB of data. Like other people mentioned, hard drives are getting cheap. So I built a 900GB RAID 5 server for under $1000 CDN. The current sweet spot for $/GB was the 250GB hard drive when I checked the prices here in Canada in March of 2006.
A = Maxtor 250GB SATAII @ $ 119.00 x 3 = $357 for 500 GB in RAID 5.
B = Maxtor 300GB SATA @ $149.00 x 3 = $447 for 600 GB in RAID 5.
C = Seagate 400GB SATA @ $ 247.00 x 3 = $741 for 800 GB in RAID 5.
A is $0.714/GB
B is $0.745/GB
C is $0.926/GB
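The arithmetic behind those three figures: a RAID 5 array gives up one drive’s worth of space to parity, so three drives yield two drives of usable capacity. A short Python sketch that reproduces the numbers (the function name is just for illustration):

```python
def raid5_cost_per_gb(drive_gb, drive_price, drive_count=3):
    """Dollars per usable GB for a RAID 5 array of identical drives."""
    usable_gb = (drive_count - 1) * drive_gb  # one drive's worth goes to parity
    return drive_count * drive_price / usable_gb


options = {
    "A": (250, 119.00),  # Maxtor 250GB SATAII
    "B": (300, 149.00),  # Maxtor 300GB SATA
    "C": (400, 247.00),  # Seagate 400GB SATA
}

for label, (drive_gb, price) in options.items():
    print(f"{label} is ${raid5_cost_per_gb(drive_gb, price):.3f}/GB")
```

The same function shows how the overhead shrinks with more drives: with four drives only a quarter of the raw capacity goes to parity instead of a third.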
My entire journey of building a RAID 5 server with cost is posted here. It’s a bit lengthy, but it’s detailed. http://www.beernut.ca/roy/archives/004367.html
As well, I had an extra 200GB hard drive which I gave to a trusted friend, and I also have a few DVDs of small enough data
Comment by Roy — Tuesday, May 9, 2006 @ 2:17 pm
No one has mentioned this yet, but what about the Iomega REV drives and disks?
I was just reading about this, and they even offer a “pro” version that’s geared towards backing up video files and such.
I had looked into this similar backup strategy, but haven’t done a cost comparison with NAS and other external hard drives. I would think that it’s more cost effective, considering how iomega has been pushing this despite drops in hard drive prices.
You could backup 100 GB on to 3 REV disks, instead of a spindle of DVDs.
Just another thought …
Comment by Vu — Tuesday, May 9, 2006 @ 4:20 pm
I’d second the recommendation that you start editing with Final Cut Pro to future proof your edits. The reason being that you can not only export edit decision lists (EDL) from FCP, which are antiquated text files that store lists of edits for only a couple of tracks of video and audio, but you can also export XML files that contain every piece of information that is normally stored in the proprietary Final Cut Project files. An FCP XML file supports many more tracks and much more information than an EDL.
EDLs are great because everything supports them, but an FCP XML file can preserve much more of your data, and for a smart guy like yourself it shouldn’t be too tough to extract what you need from the FCP XML file 30 years from now even if FCP no longer exists.
Comment by Zach Fine — Tuesday, May 9, 2006 @ 8:59 pm
I’m facing very similar dilemmas, as I think we all will be very soon.
The issue of backward compatibility is quite haunting. I use Retrospect Backup at the moment, but the product has really gone downhill, especially since Dantz sold it. I have had quite a few headaches during version changes.
I wonder what support I will have in the future for my shelves of Retrospect CDs and DVDs, compressed and stored in their proprietary format. MacOS doesn’t even recognize the disks as valid volumes. They must be read from Retrospect.
Does anyone know of any industrial-strength backup alternatives for Mac? The best thing about Retrospect is its ability to make incremental backups, and restore volumes with all their privileges, and keep the folder hierarchy intact, even when I delete and move stuff.
I am going to add hard-drive backup, with redundant hard-drives to the optical backups, because I don’t trust DVD media. The old Kodak Gold CDs were amazing. I have decade-old Kodak Gold CDs which still work perfectly, and DVD media that is less than a year old that is failing.
Quite the conundrum. I thought that HDD was the solution, but reading here it is said that the bearings can fail if they are stored without the drive being spun up. That puts a fly in the ointment. I don’t have the funds to keep all my HDD backups “live,” I was going to store them in a fireproof safe with faraday cage, physically removing them from enclosures.
And I’m dealing with at least 300GB a year of video, RAW photos and other data.
Comment by Harvard Irving — Tuesday, May 9, 2006 @ 9:11 pm
My setup has:
1 NLSU2 = $70
2 Enclosures = $40
2 300 GB Hard Drives = 2 X 80 = $160
Total 600 GB for $270
Cost = 45 cents / GB (Lowest on this list.)
Until a week back I didn’t back up my backup. One drive crashed; most data was on DVDs, so I got away with it.
Now I backup my backup and have 300 GB for $270 = 90 cents per GB
Comment by Vishi — Tuesday, May 9, 2006 @ 10:50 pm
Vishi-
Where are you getting enclosures for $20 each and 300GB drives for $80 each??? And what the heck is “NLSU2″?
Your prices seem too good to be true!
Comment by John I. Clark — Wednesday, May 10, 2006 @ 12:16 am
Distribute! The only things I have never lost (code or media) are the ones I let free on the Internet. When everything else (my backups) fail I just Google them, and somewhere, someone has a copy of them!
Comment by Panayotis — Wednesday, May 10, 2006 @ 12:52 am
One possibility (although it’s in the Not Quite Here Yet category) is In-Phase Technologies’ holographic storage system. Might be worth keeping an eye on this stuff - http://www.inphase-technologies.com. I reckon this could become a good low-cost solution for storing DV or any other large data sets.
Comment by Chas — Wednesday, May 10, 2006 @ 4:55 am
Hangon, don’t we already have malware which will erase/encrypt your data for the purposes of extortion (or whatever)? If you’re using online storage like hard drives for backups aren’t you just making these data vulnerable to malicious attack as well as all your current stuff?
On a 20 year timespan, this scenario is not paranoid, it’s pretty much a given.
That’s why I prefer tape.
Comment by Alastair — Wednesday, May 10, 2006 @ 5:57 am
As a sound designer I have found that I have the same problem and for me the answer was to break projects down into smaller pieces and archive more frequently on to Dual Layer DVD’s (and check them for integrity after every burn!) It works for me as I never have files larger than 600 MB and when I am done with a project, I am done. I never go back to those disks unless I need some really specific thing and I keep all of the functional sounds “live” until they reach the capacity of a Dual Layer disk and then I dump them and date it.
Comment by Chris Bakos — Wednesday, May 10, 2006 @ 11:14 pm
John:
http://the.taoofmac.com/space/Linksys/NSLU2
The power of google.
Comment by Darryl — Thursday, May 11, 2006 @ 10:07 am
Interesting I should run across this now, as I’ve been thinking about doing something more serious about backup myself. I’ve been going back and forth, but I’m currently leaning towards setting up a used Mac Mini with a RAID 1 as an NAS (and print server, and whatever) to provide me with about 250 GB storage. My needs are not as great as Mark’s, and when I hit that ceiling, bigger drives will be available cheaply and I will buy those.
DVDs for medical archiving (for whatever reason, the medical field seems to have especially stringent standards for archival lifespan) are typically advertised as having 25+ year lifespans. That sounds pretty good.
Companies have been dealing with the problem of archiving vast amounts of data for a lot longer than individuals, and they can afford to hire archivists to figure it out. We don’t have that luxury. I don’t know all the practices that archivists use, but I do know they have a hierarchy for the accessibility and permanence of archived records, and they make decisions about archiving in advance and speculatively (”this looks like something we might need 5 years from now”). Adapted to Mark’s case, that might mean distilling the edited videos down to H.264 and sticking them on hard drives in three places, but leaving the raw footage on the tape someplace “safe” without making a backup of it—this is gambling, yes.
Comment by Adam Rice — Thursday, May 11, 2006 @ 12:11 pm
Apropos of nothing, but related to this conversation, I would like to point out that Infrant’s ReadyNAS is based on Debian and that they have apparently (after some prodding and foot-dragging) complied with the minimum requirements of the GNU/GPL by posting the source code of their modified components. Apparently there are other parts like the admin interface that are not GPL, which is suboptimal but not a deal-breaker.
Comment by Mark — Thursday, May 11, 2006 @ 7:35 pm
I have 1.91 TiB of storage space on eight 300GB drives in RAID-5 on a dedicated file server in another room. I’m running out of space on it though, I currently have 656GiB left and I’m using 1.26TiB, however I don’t see it as a massive problem. I can always setup another, larger array when this one gets full.
Also, since it’s RAID-5 then if any drive were to fail, the data would be safe. Although in the event of a fire, if everything were destroyed, I have a friend with 1TiB of JBOD storage space who has a large bulk of the data.
If it all goes pop on my end then it’s not the end of the world, same at his end. Effectively it’s an impractical RAID 1 array, but it works and provides redundancy. :p
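For anyone puzzled by how eight 300GB drives become 1.91 TiB: RAID 5 spends one drive’s worth of space on parity, and drive makers count decimal gigabytes while the operating system reports binary units. A quick Python check (the function name is only illustrative):

```python
def raid5_usable_tib(drive_count, drive_gb):
    """Usable RAID 5 capacity in TiB, given drive sizes in decimal GB."""
    usable_bytes = (drive_count - 1) * drive_gb * 10**9  # one drive lost to parity
    return usable_bytes / 2**40  # 1 TiB = 2**40 bytes


print(f"{raid5_usable_tib(8, 300):.2f} TiB")
```

Seven drives of data at 300 × 10^9 bytes each come to 2.1 × 10^12 bytes, which is about 1.91 TiB, matching the figure quoted above.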
Comment by David Harrison — Tuesday, May 16, 2006 @ 4:57 pm
I am the IT Director for a small business and I can tell you that the people posting about the poor quality of Lacie drives are right on the money. A little over a year ago I decided to switch our backups from tapes to external hard drives. Lacie had a 500 GB model when almost all IDE hard drives maxed out at 400GB. It turns out that is because it was 2 Maxtor drives in a RAID 0. I went ahead and bought a bunch to give them a try. 12 to be exact since we have a lot of offices that need separate backups. Of those 12, 5 have failed. Only one failed within the meager 1 year warranty as well.
While I am not a fan of Maxtor drives (I have seen a lot of them fail) I don’t think they were the problem here. Since the enclosures were out of warranty I took them apart hoping to at least use one of the drives elsewhere. In all but one case, both drives were dead, usually with the identical problem. For both to fail the same way at the same time I can only assume it had to have been due to an external cause. Oh and those enclosures get _extremely_ hot, too hot to hold comfortably if you don’t have a fan on them constantly. Even when they aren’t being accessed they get very warm.
I have switched to 500GB Western Digital MyBook drives now. They are a single drive in an enclosure that can actually allow air to get to the drive. They look nice and even have a circle on the front to show how full the drive is. They run much cooler and I have had much better luck with WD drives in the past.
Comment by Kelderek — Tuesday, May 16, 2006 @ 7:54 pm
Couple things.
KMP Media is now selling Kodak Certified Media.
http://kmpmedia.com/
Also, you might want to look into a tape-based setup. To this day I never trust burnt CDs; always go with tape as a backup. I like the Ultrium-based systems, and I have my own unit that backs up 400/800 with compression on a tape (around the same size as a DLT tape).
http://www.lto.org/newsite/index.html
I have archived some stuff to CD, and the only media I have used is Gold Media, and I have yet to have problems (I just double-checked a CD I burnt in 1997; it still retains the data with no CRC errors).
Hope this helps.
-Snarto
Comment by Snarto — Tuesday, May 16, 2006 @ 11:08 pm
I’ve been doing the short-term media shuffle for a long time now. Low density floppies to HD floppies to cd-r to dvd-r, and now, external HD’s.
I use two methods:
I use cheap IDE drives in cheap external enclosures for my incremental backups (I run Mike Lin’s RapidBackup) of my photos, video, music, etc. I do these infrequently, but they have the advantage of being on a different power supply, that will likely fail separately from my desktop’s. Anecdotal evidence from a friend who works IT for an educational institution suggests that dirty power inside or outside the box is the most common cause of disk failure in consumer-grade boxen. Since I can’t afford to spend large amounts of $$ at a time, I’m stuck with consumer-grade boxen :) I am still on the hope-and-pray method that price per Gig will fall at a faster rate than the # of Gigs/year I generate. Later this year I will be able to afford rotating offsite backups. (Yay safe deposit boxen!)
My other method isn’t a backup method at all: I keep all of “My Documents” on a cheap Raid-1 array, separate from the OS drive. And by cheap I mean “the motherboard came with a software raid chip” and “I bought the cheapest $perGig SATA drives I could find.” This has saved me from sorrow at least once already. My tip here is to make sure the SATA cables aren’t loose - they are a commodity product, and manufacturing tolerances aren’t what they should be. If you wanted additional convenience, cheap harddrive sleds are available for swapping out drives without messing with opening up the case. Allegedly, SATA can support hotswapping too.
Suggestions for you:
I have none better than the ones suggested here. Others, elsewhere have suggested RAID 1 configurations (or the like), where you pull a disk from the array when you wish to archive, but there are technical difficulties associated with that - custom firmware and the like. If you are serious about the burning building scenario, then a harddrive sled in a computer with a RAID-1 data array is the answer, though I think slow speed disasters like floods, impending wildfires and the like are a better fit for that. This has the advantage that you can take with you precisely what you were most recently working on.
For incremental offsite backups, an impressive amount of bandwidth can be attained by mailing an external drive to a friend. Incredibly poor latency, but it beats having to secure a good quality net connection to a remote computer. I can attest that the Bytecc enclosures often featured on newegg come with pretty nice formed styrofoam inserts and a carrying handle.
I cannot find the reference at this time, but there is a very inexpensive (80$) network addressable USB/firewire hub, that has firmware that can be flashed to run linux, and to which you can connect two USB enclosed Hd’s in a software RAID-1. (I’m still looking for it myself.) Potentially you could rsync to a pair or more of these, and retain the advantages of easily mailable harddrives for incremental backups. My understanding is that some (most? all?) of the open source-linux software raid stuff is device neutral: no special filesystem formatting required. Obviously reuse of old beige box computers for this purpose would be advocated by your typical slashdot-type, but I think the modularity and reduced power consumption of off the shelf stuff would be better.
So, back to your (new) question:
Keeping any of these solutions up for five years may be problematic: they all seem to require regular maintenance, and a fair amount of user intervention. But they do answer your second question: usb/firewire drives are going to be very easy to transfer over to new storage solutions in the future.
Avi suggested backing up to dual layer DVD-R, at 100 Gig=43$ of media (.43$/gig). By contrast, a 300 Gig hd is now 100$ (.33/gig). So there is (near) parity between the two solutions. Time spent recording vs. time spent installing, tax/shipping/etc. is probably a wash, and for the highly paid, favors the hd solution. (HD price from newegg) Note the assumed capital costs of a dual layer drive vs. an external enclosure can/may negate each other as well.
-r.
Comment by rhandir — Wednesday, May 17, 2006 @ 2:31 pm
I will share my way of backing up my 800 GB of data.
I have 7 years of data (photos, DV movies, and so on), and they are backed up (or replicated) on another 2 computers. I’ve tried MO, CD, and tapes, and they don’t work for me. CDs, tapes, and MO are time-consuming or expensive to maintain, or both. Also, MO and CD (or DVD) hold 9.1 GB max per disc, which is not enough for raw DV files larger than that limit.
I want to keep the files in their original format. If I convert them to MPEG-2, I can’t later convert them to MPEG-4 without losing video quality.
Each of the computers has 3 hard drives, and they are not configured in RAID, but files are organized by year and month, so finding photos or movies is easy.
I use Linux on my 3 computers. My 4th computer runs Windows, and it has one mapped drive that accesses one of the Linux machines. Nothing is stored on the Windows machine, except for things I don’t really care to back up. I have a UPS in case of power failures and power surges.
In the past 7 years, I have had 2 hard drive failures. The drives were easy to replace and were restored with rsync from the other 2 systems. No pain here. If my data grows, I can always add one more drive to each of the computers.
I don’t like external USB. I heard some horror stories about external USB where it damaged the drives when turned on or when plugged into the computer.
One of my Linux computers is located at a relative’s house.
Comment by Fitz — Thursday, May 18, 2006 @ 1:04 am
One of our offices went to LaCie drives for backup and as expanded storage. We spent quite a bit of time and money shipping them back and forth before we dumped the external storage and went back to tape backups. We were using these on file and database servers, so perhaps if you just used them lightly, they would be okay.
Comment by W^L+ — Thursday, May 18, 2006 @ 4:41 am
Hello,
I have an old server. It has RAID 5 drives and an external storage unit with more RAID 5 drives. This is our current storage for pictures and files.
We have a large hard drive in one of our PCs that I back files up to with a DOS script.
In the future, I am going to build a home server. Not much redundancy except for the hard drives. It is going to have two of the largest drives I can find, mirrored. Then on the side, I am going to have one more drive for a backup in a second computer.
While a fire would hurt us, I feel that one of the three drives should survive such a disaster. If not, I would rebuild these memories from family and friends. We have tornadoes; if a tornado struck and I found the servers, I think the drives might be intact enough to get one good copy off them.
I have always had problems with tape. Too much intervention needed to make it work. At work, our tape drive uses an LTO-2 library. Each night is on two tapes. Every day you have to add tapes to the machine. That is a lot of moving parts. I really don’t trust it as much as a hard drive in RAID 1 or RAID 5.
In the end, I am doing my best to back up my data, but I know that media changes, and I would rather plan for that than look for a solution that will last 10 years. In 10 years, I might create 2 TB of data a year; today I generate 20 GB a year. Things change. Plan on change, and make sure that you transfer your old media to your new media.
Comment by Dave — Friday, May 19, 2006 @ 11:58 am
You already have a tape backup device capable of storing gigabytes of data - it’s your DV camcorder.
This isn’t generally suitable for data - but it IS suitable for video (of course)!
Save your raw footage, edits and finished video projects out to DV. Make multiple copies. Create new copies every year. DV tape is one of the cheapest mediums going - it’s also reliable and high capacity.
Now, for the rest of your data get 3 big hard drives (500 GB Seagate in WiebeTech boxes). One is your main online storage area; the other two are for taking regular, rotating, complete backups of it. One hard drive remains offsite at all times (along with a copy of your tapes). Backups are cheap and speedy (which means you’re more likely to do them) and you almost always have 3 good copies of all your data. Anything especially precious also goes on burnable media in your safe deposit box.
This is exactly what I do (except my main storage is an Infrant ReadyNAS NV).
Comment by Andy B — Friday, May 19, 2006 @ 12:09 pm
The Internet Archive has been doing a lot of this same sort of stuff for their data, and they have published some information about their hardware and software choices, as well as providing links to other organizations that are working in these fields, etc….
See http://www.archive.org/about/about.php#storage and http://www.archive.org/iathreads/post-view.php?id=40868 for information on their operations, http://www.archive.org/about/about.php#2 for links to other organizations doing work in these areas, and browse through the “petabox” forums at http://www.archive.org/web/petabox.php#forum for additional detailed information on their hardware and software choices, etc….
Comment by Brad Knowles — Friday, May 19, 2006 @ 12:36 pm
Amazing how few people with real experience have reacted and how much hidden advertising I saw.
Only one person seems to know how to deal with your problem (Comment by Bob Dow — Tuesday, May 9, 2006 @ 12:55 am), but it could be cheaper if you stay away from the expensive external harddrives (with USB or IEEE-interfaces that won’t last for another 10 years), and make yourself removable or swappable drives (I think it’s called ‘enclosures’ in English).
Pity that even Bob Dow doesn’t realise that it’s not enough to have your data on 3 physically different disks: you HAVE to overwrite ALL data with EVERY backup. The reasons are very simple: 1) because the data is stored magnetically, you force every bit to be written freshly; 2) ANY error WILL be visible immediately (and not after a couple of years), and in my experience ALL hard drives, no matter which manufacturer, will mark bad spots as bad, without any data loss.
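A different way to make errors visible right away, whatever backup method is used, is to compare checksums of the source and the copy once each backup finishes. This is not the full-overwrite approach described above, only a complementary check; the function names below are hypothetical and the sketch assumes both drives are mounted as ordinary directories.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(source: Path, backup: Path) -> list[str]:
    """List relative paths that are missing from, or differ in, the backup."""
    problems = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        # A file is a problem if the copy is absent or its contents differ.
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list from `find_mismatches` means every file on the source has a byte-identical copy on the backup; it says nothing about extra files that exist only on the backup.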
Worst case scenario is that one hard disk actually crashes, but then you still have another two, so just buy another one and use the same enclosure to swap out the bad one.
The same goes when your amount of data rises: just replace your existing ones with something bigger.
And please, NEVER EVER use plastic junk like tape writers, CD, DVD, etc. Read ANY professional review and you will know the theory behind the hard experiences of many unlucky amateurs who think they discovered THE deal of the era.
And please, avoid RAID. It’s not really bad, but what too many people don’t dare to admit is that, in the long run, it is nothing but trouble, certainly for the non-professional in ICT.
So backup your WHOLE disks. Start before dinner, if you have a modern pc, it’s done after your dessert. Cheers.
Comment by P Praet — Friday, May 19, 2006 @ 1:13 pm
streamload.com?
Comment by Anonymous — Friday, May 19, 2006 @ 2:23 pm
Just want to thank Mark for this blog and all of you for the great comments. This has got to be the most intelligent piece of literature I’ve read on the web this entire week!
Comment by Pete — Friday, May 19, 2006 @ 2:32 pm
One recommendation: let it go.
It’s too easy to spend valuable time capturing, editing, backing up, and recovering your memories, when you could be creating them.
Yes, record special memories, for posterity, (or just tell stories to your kids) but don’t be obsessive.
Comment by Jason — Friday, May 19, 2006 @ 2:35 pm
It’s not a very standard solution, but it’s cheap and reliable: use Mini DV tape to store any type of file.
More info http://jakeludington.com/project_studio/20050828_backup_files_to_dv_camera.html
Comment by MG — Friday, May 19, 2006 @ 5:06 pm
Rhandir says:
This sounds a lot like the aforementioned NSLU2, running a custom Linux kernel with RAID 1 enabled.
My backup system consists of a pair of USB drives attached to an NSLU2 running this custom Linux. I’m using “poor man’s RAID”: in the middle of the night, one drive backs itself up to the other using rsync. However, I’m planning to replace this with a full-blown Linux PC with a RAID array: the NSLU2 is a bit too slow to serve music effectively, and its external drives are running very hot in their enclosures. I fear they are not long for this world.
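The nightly one-way copy described above could look something like the sketch below. The comment does not say which rsync options are used, so the `-a` and `--delete` flags, the mount points, and the function names are all assumptions for illustration.

```python
import subprocess


def build_mirror_command(source: str, target: str) -> list[str]:
    """Build an rsync command that makes target an exact mirror of source."""
    return [
        "rsync",
        "-a",        # archive mode: recurse and preserve times, permissions, links
        "--delete",  # assumption: also remove files that were deleted from the source
        source.rstrip("/") + "/",  # trailing slash: copy the contents, not the directory itself
        target.rstrip("/") + "/",
    ]


def mirror(source: str, target: str) -> None:
    """Run the mirror; raises CalledProcessError if rsync exits with an error."""
    subprocess.run(build_mirror_command(source, target), check=True)


# Hypothetical mount points for the two USB drives; this only prints the command.
print(" ".join(build_mirror_command("/mnt/disk1", "/mnt/disk2")))
```

Note that with `--delete`, a file removed by mistake disappears from the second drive on the next run, which is one way this differs from keeping real backups.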
Comment by MikeB — Friday, May 19, 2006 @ 7:43 pm
This is a bit tangential, but I was wondering if anyone had any idea what the effect of a global polarity reversal would be on magnetic storage. From what I have learned (from not very many sources, in all honesty), it sounds like a shift in the earth’s magnetic poles is long overdue, as the last one took place over 100,000 years ago. Prior to that, if I remember correctly, they came about every 30,000 years. Would such a reversal destabilize magnetic storage media?
Comment by mca — Saturday, May 20, 2006 @ 12:55 am
Everyone is so worried about backing up their data, yet nobody has mentioned how important it is to have an “In Case of Death” file located on their C: drive. How will anyone know where and how you stored your data if you don’t leave some instructions on what to do if you should die in a car accident? It would be a tragedy if all your family photos were saved but nobody could find or retrieve them because you are the only person who knows how and where they are stored.
Roger Secura
Comment by Roger D. Secura — Saturday, May 20, 2006 @ 10:25 am
http://www.inphase-technologies.com/
This company is bringing out Holographic storage very soon.
Comment by WJ — Saturday, May 20, 2006 @ 11:37 am
Yup, any day now…
Comment by Mark — Saturday, May 20, 2006 @ 12:00 pm