archiving data, was RE: Media Longitevity/Care
tomj at wps.com
Fri Mar 18 03:23:21 CST 2005
[a bunch of replies all in one]
On Wed, 16 Mar 2005, Vintage Computer Festival wrote:
> However, I find that media is much more sturdy than these
> discussions would indicate. We tend to gripe very loudly (and rightly so)
> when poor quality backup media takes our data with it. But what about all
> those times we go back to our old backups and thankfully find what we
That's true. It's not like 100% of all media rots to nothing. But
"finding" old media that's still readable is different from
> I consistently read lots of data with nary a problem: 20-30 year old
> floppy disks (5.25" and 8"), 20-30 year old mag tapes, even 20 year old
> VHS tapes. And of course the punch cards don't count ;)
No offense, but you're in the business of doing so :-) Only a
minority of even us can read and convert old formats, even when
On Thu, 17 Mar 2005, Jim Leonard wrote:
>> If you somehow think that tapes stored in a controlled vault is
>> more reliable, or less susceptible to bit rot than rotating
>> spindles, I believe you are wrong (and my every experience and
>> observation says otherwise). Only fiche and paper are statically
> You are omitting cost. Tapes in a vault cost significantly less over 10
> years than the cost of 4 live systems + replacement hard drives + network
> bandwidth + electricity for your method.
My assertion is that it's a false economy; besides bit-rot,
there's future (near-guaranteed) incompatibilities, and if enough
time passes in multiple media (papertape... 1/2" mag tape...
DC300... 3740... QIC... etc)
In fact, with Moore's Law and related continuous hard disk is *far
cheaper*. It's just that you're not deferring the cost to your
> Your method, unless I am misunderstanding you, has no revisioning. It
> protects against hardware failure, but how many revisions do you keep? What
> protects against you mistakenly deleting the wrong directory, etc.?
That's what filesystems are for.
I also extended a rotating backup scheme I found on the net, that
uses hard links and an intentional feature of rsync; it fully
saves N copies of the data with true incremental changes with only
about 10% more diskspace than the original files themselves, for
data that changes at a "reasonable" rate of change (per week,
say). (Example: I have a big RAID server that has 5 weeks of 100%
snapshots; the original data is about 360GB, 5 weeks worth is
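
For anyone who hasn't seen the hard-link trick, here is a minimal sketch of the idea in Python. It is not my actual script: it swaps rsync for plain standard-library calls so the mechanism is visible, and the names take_snapshot, snapshot.N and keep are made up for illustration. It assumes source and snapshots live on one filesystem that supports hard links, and it ignores symlinks, ownership and empty-file corner cases.

```python
import filecmp
import os
import shutil


def take_snapshot(source, snapshot_root, keep=5):
    """Rotate snapshot.0 .. snapshot.(keep-1), then build a new snapshot.0.

    Files identical to the previous snapshot are hard-linked to it, so
    they use no extra data blocks; new or changed files are copied.
    (Hypothetical helper, not the rsync-based script from the post.)
    """
    def snap(i):
        return os.path.join(snapshot_root, f"snapshot.{i}")

    os.makedirs(snapshot_root, exist_ok=True)

    # Drop the oldest snapshot, then shift the others up by one.
    if os.path.isdir(snap(keep - 1)):
        shutil.rmtree(snap(keep - 1))
    for i in range(keep - 2, -1, -1):
        if os.path.isdir(snap(i)):
            os.rename(snap(i), snap(i + 1))

    previous, newest = snap(1), snap(0)
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(newest, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            old = os.path.normpath(os.path.join(previous, rel, name))
            if os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)        # unchanged: share the same inode
            else:
                shutil.copy2(src, dst)   # new or changed: make a real copy
    return newest
```

Each snapshot.N directory looks like a full copy of the tree, but an unchanged file exists once on disk no matter how many snapshots point at it. Deleting an old snapshot only drops links; the data goes away when the last link does. With rsync itself, the --link-dest option gives a similar effect.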
> I agree completely that archival data should be monitored in some fashion
> instead of being locked away forever. And, cost permitting, transferred to
> new media.
On rotating spindles, this is continuous...
On Thu, 17 Mar 2005, Dwight K. Elvey wrote:
> One other thought is that one should have a way to
> transfer the raw data from whatever media you have.
> This allows one to have redundant information stored
> on the same media. If you depend on normal file systems,
> you run the risk of the file system not allowing you
> to access the data simply because a small part is
Hence multiple, ordinary servers, physically separate. The chance
of all of them developing a bad disk simultaneously is exceedingly
If one of my servers craps a disk, that machine is repaired; data
from the other server(s) is automatically (cron) copied to it.
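
A rough sketch of what such a cron-driven copy could look like, again in Python. This is an illustration under assumptions, not my actual job: the host name, the data path and both function names are hypothetical, and it assumes rsync is installed and that the repaired machine can reach the peer over its usual remote shell without a password prompt.

```python
import subprocess


def build_resync_command(peer_host, data_dir, delete_extraneous=False):
    """Build an rsync command that pulls data_dir from a healthy peer.

    peer_host and data_dir are placeholders chosen by the caller.
    """
    cmd = ["rsync", "-a"]           # -a: archive mode (recursive, keeps times and perms)
    if delete_extraneous:
        cmd.append("--delete")      # also remove local files the peer does not have
    path = data_dir.rstrip("/") + "/"   # trailing slash: copy the contents, not the dir itself
    return cmd + [f"{peer_host}:{path}", path]


def resync_from_peer(peer_host, data_dir):
    # check=True raises CalledProcessError on a non-zero rsync exit,
    # so a failed copy is not silently ignored.
    subprocess.run(build_resync_command(peer_host, data_dir), check=True)
```

A crontab entry would then run a small script that calls resync_from_peer("serverb", "/export/data") on whatever schedule suits; both arguments here are made-up values.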
On Thu, 17 Mar 2005, Doc Shipley wrote:
>>>> This reminds me that I read not too long ago that many of the super
>>>> computer labs ship PCs between sites because it's *faster* to ship a
>>>> working PC with 1TB of disk containing data than it is to transfer it
>> Yikes, brings new meaning to the term "Our System Crashed." Wonder
>> what the insurance is on something like that....
> The insurance in that case was basically irrelevant. Their insurance would
> cover the hardware replacement costs, period.
No point in insuring the data contained in the disk drives in the
computer in the back seat; it's just a copy.
On Thu, 17 Mar 2005, Doc Shipley wrote:
> Then the other issue was the archived files themselves. Some of it was in
> PageMaker, some in Wordstar, some in MS-DOS .prn files, and a lot of it in
> AutoCAD r12 and a freeware DOS GIS tool whose name I've mercifully forgotten.
Then there's that! Added on top of media recovery ("who's got
a 200bpi UNIVAC drive?") it generally means that the data is
simply "lost". (Few think to "SAVE-AS" plain ASCII for posterity;
I rarely do.)
> I spent several hundred hours sorting backup sets, restoring them to disk,
> migrating files into current formats, and then archiving both current and
> original copies to 4mm tape. It made for some very soothing afternoons,
> hidden in my office.
And they only did it because it was required by law!
I'm seriously planning on printing out my website on paper. Laugh,
but it might be one of the few copies that survives long term.
That is, if I can find a printer that produces real output.