Write-only backup media and big archives

Jerome H. Fine jhfinedp3k at compsys.to
Mon Jul 13 19:42:36 CDT 2009

>Tim Shoppa <mailman at trailing-edge.com> wrote:

>Not really a mechanical or procedural thing, but a little more
>philosophical as I think about data storage not just at home
>for classic computing, but at work:
>In the 90's the CD-R and soon after the DVD-R looked pretty ideal
>for making "write only" backups of what was then considered large
>amounts of data. While not archival in the centuries sense, it
>seemed a pretty safe bet that readers would be readily available for
>the next 10 or 20 years and I think this bet has turned out well.
>I was willing to spend an afternoon burning a dozen or two CD-R's
>because they felt "real".
Jerome Fine replies:

While I used the RX03 as my backup when I first acquired a DSD 880/8, when
I started to use RD53 drives, the TK25 tape media became my choice followed
a few years later by the TK70 when I switched to Hitachi DK-515 600 MB ESDI
drives along with Sony SMO S-501 Magneto Optical media of 295 MB per side.

I probably would have used the CD-R at that point, but I don't think that
there was ever any software for use on a PDP-11 which would burn a CD.  By
the time I had a Windows 98 SE system in 2002, DVD-R drives and media had
become sufficiently cost effective and offered enough capacity to be useful,
so I transferred all previous backups to the DVD-R media.

As a matter of confidence, when I back up the W98SE system each month
(using GHOST, the 2 GB of storage compresses to less than 1 GB in an image
file), I also produce a text file with the CRC values for each file on the
system (which takes little storage, being less than 1 MB).  Then I save the
MD5 values for that pair of files in a third file and add it to the DVD-R
so that the image file can be independently checked.  I also copy the files
from the DVD-R back to the hard drive to ensure that the files can be read,
and run the MD5 program on the copies on the hard drive (it actually saves
considerable time to run MD5 on the copy of the file on the hard drive
rather than producing the MD5 value from the file on the DVD-R).
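
The monthly procedure above can be sketched roughly as follows.  This is
only an illustration in Python, not the GHOST, CRC or MD5 programs actually
used; the function names, the two-space "digest  name" line format and the
file names in the usage note are all assumptions made for the example.

```python
import hashlib
import zlib
from pathlib import Path


def file_crc32(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file as 8 hex digits, reading in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 of a file as a hex string, reading in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            md5.update(chunk)
    return md5.hexdigest()


def write_crc_manifest(root, manifest_path):
    """Write one 'crc  relative/path' line for every file under root."""
    root = Path(root)
    lines = [
        f"{file_crc32(p)}  {p.relative_to(root).as_posix()}"
        for p in sorted(root.rglob("*"))
        if p.is_file()
    ]
    Path(manifest_path).write_text("\n".join(lines) + "\n")


def write_md5_file(paths, md5_path):
    """Write one 'md5  filename' line for each of the given files."""
    lines = [f"{file_md5(p)}  {Path(p).name}" for p in paths]
    Path(md5_path).write_text("\n".join(lines) + "\n")


def verify_md5_file(md5_path, directory):
    """Return the names of files in directory whose MD5 does not match."""
    bad = []
    for line in Path(md5_path).read_text().splitlines():
        expected, name = line.split("  ", 1)
        if file_md5(Path(directory) / name) != expected:
            bad.append(name)
    return bad
```

A run would call write_crc_manifest() on the system drive, then
write_md5_file() on the image file and the CRC text file (say image.gho and
crc.txt, both hypothetical names), burn all three files, copy them back to
the hard drive and call verify_md5_file() on that copy.  An empty list
means the copies read back from the DVD-R match what was written.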

>But today, a "large amount of data" is not a few gigabytes, but
>terabytes. Tape libraries with these sorts of capacities do
>exist but aren't available at the corner store
>and I have a nagging mistrust of tapes that causes me to refer
>to them as "write only memory". (I never really ever trusted anything
>denser than 1600BPI 9-track!). Burning 2000 CD-R's or even 500 DVD-R's
>doesn't seem like a reasonable or useful task to backup a terabyte
>hard drive (which today is a fraction the price of the 9 Gig drives of the
>mid-90's that was "big storage".)
In addition, anything on a tape was difficult to locate if it was not the
first file, aside from the speed of the tape in the first place.  As for
the TK50, it was so slow under RT-11 on a PDP-11 when comparing the tape
file with the original file that I never accepted the TK50 as being
appropriate.

Also, a fully loaded 1 TB hard drive would need only about 250 DVD-Rs in
uncompressed mode and likely only about 100 DVD-Rs for compressed backups.
Based on today's cost for a 1 TB hard drive, that is cost effective, but I
agree not at all convenient.  On the other hand, Blu-ray drives may become
less expensive in the next few years (in the same way that 4.7 GB DVD-R
drives have since 2002, when I first started to use a DVD-R drive and media
for backups).  I think that fewer than 25 Blu-ray media would be reasonable
to back up a 1 TB hard drive in uncompressed mode and perhaps 10 Blu-ray
media for compressed backups.  If 25 (or 10) media are still too
inconvenient (certainly at present they would cost more), then I would
guess that an actual 1 TB drive will remain the better choice.
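
For anyone who wants to redo the disc-count arithmetic, here is a small
Python sketch.  The capacities are nominal decimal-gigabyte figures and are
assumptions of the example (usable space is somewhat lower in practice);
the 2:1 compression ratio is simply the GHOST figure mentioned earlier and
will vary with the data.

```python
import math

# Nominal capacities in decimal GB (assumed values for illustration).
CAPACITY_GB = {
    "CD-R": 0.7,
    "DVD-R": 4.7,
    "BD-R": 25.0,      # single-layer Blu-ray
    "BD-R DL": 50.0,   # dual-layer Blu-ray
}


def discs_needed(data_gb, disc_gb, compression_ratio=1.0):
    """Number of discs needed to hold data_gb after compression."""
    return math.ceil(data_gb / compression_ratio / disc_gb)


if __name__ == "__main__":
    for name, size in CAPACITY_GB.items():
        plain = discs_needed(1000, size)
        packed = discs_needed(1000, size, compression_ratio=2.0)
        print(f"{name:8s} {plain:5d} uncompressed, {packed:5d} at 2:1")
```

With these nominal sizes a 1 TB drive comes to 213 DVD-Rs uncompressed and
107 at 2:1, so the round numbers above are on the safe side.  For Blu-ray
the count drops below 25 only with the 50 GB dual-layer media (20 discs);
single-layer 25 GB media would take 40.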

>At the same time, the ubiquitous and available-at-the-corner-store
>terabyte-sized USB drives don't feel awfully reliable either. They
>are way more convenient and cheap though, and that's my preferred
>backup media today.
I don't have that much in the way of files to back up, so I can't say what
I would do.

>I think I'm falling into the trap of confusing the price of storage
>with the value of the information recorded onto the storage. It's
>ironic that as disk space has become cheaper and cheaper, we regard
>the contents as less worthy of the effort of backup onto reliable
I never confused the cost of the storage with the hundreds of hours spent
writing a program or the enhancements to a program.  In addition, it would
never be possible to exactly reproduce a source program if it was ever lost.
So backups were always extremely important.

>OK, philosophy mode off. The NTSB is gonna be asking me for a few
>gigabytes of data next week and that seems easy compared to the
>terabytes at home.
As in National Transportation Safety Board?  I guess that a DVD-R
fits the bill these days.  If there are only a few files and they
need to be archived for more than a year, I suggest adding the
MD5 value as well as advising them to make a few copies and save
them at different offsite locations.

By the way, how many GBs do you estimate that all of your RT-11
files take at this point?  I doubt that my files for RT-11 take
more than 20 GB and probably less than 10 GB if I was careful.
Certainly, if I eliminated all of the duplicates, the total would
be less than 5 GB, maybe even only 2 GB.  Since storage has become
so inexpensive, it is not worth the effort to check.
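
If someone did want to check, the estimate could be made mechanically
rather than by hand.  The Python sketch below walks a directory tree and
counts each distinct file content only once, using MD5 as the identity
test; the function name and the decision to treat equal digests as
duplicates are assumptions of the example.

```python
import hashlib
import os


def total_and_unique_bytes(root, chunk_size=1 << 20):
    """Return (total bytes, bytes after counting duplicate contents once)."""
    seen = set()
    total = 0
    unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            md5 = hashlib.md5()
            size = 0
            with open(path, "rb") as f:
                while chunk := f.read(chunk_size):
                    md5.update(chunk)
                    size += len(chunk)
            total += size
            key = (size, md5.hexdigest())
            if key not in seen:
                # First time this content is seen, so it counts as unique.
                seen.add(key)
                unique += size
    return total, unique
```

The difference between the two numbers is the space taken by duplicates.
Reading every file once is the slow part, so on a large collection this
would take roughly as long as a full backup.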

Sincerely yours,

Jerome Fine

