Don Maslin/Archiving system software (was: ftp archives disappearing?)

Seth Morabito sethm at loomcom.com
Mon Mar 19 16:17:24 CDT 2007


On Mar 17, 2007, at 8:32 PM, Dave Dunfield wrote:
> [...]
> = How to store the archive
> I am a strong believer in preservation of the physical media as
> historic artifacts, however I believe it is also vital to preserve
> the data separately in modern formats, for several reasons:
>  - It allows easy replication and mirroring in multiple locations.
>    This will help ensure that the material is not lost in the
>    future through any single failure point (fire, flood, death,
>    loss-of-interest - all of these things and more can wipe out
>    a single physical archive).
>  - It removes dependence on specific (and usually obsolete)
>    physical media. No need to put wear and tear on the original
>    artifacts, and allows for contingencies in the event that the
>    original devices become inoperative.
>  - Allows for easy sharing and movement of the data.
>  - Allows everything to be tracked in a central repository
>    (appropriately mirrored of course).
>  - Allows anyone who wants to set up the required equipment
>    to have complete access to the repository content.
> [...]

Digital preservation is actually my day job.  I work for the LOCKSS  
project at Stanford University (http://www.lockss.org/).  LOCKSS is a  
distributed peer-to-peer preservation system for electronic journals  
used by libraries around the world.  Basically, each library runs one  
or more boxes that collect e-journals, and all the boxes participate  
in audits with each other to make sure that content is not lost or  
damaged.  They establish and maintain an "Authoritative Copy" of the  
content that all the boxes keep locally.  If the original e-journal's  
publisher has disappeared, the boxes will repair missing content from  
each other.  Best of all, it's entirely open source using the BSD  
license and some LGPL components.
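
As a rough illustration of the audit-and-repair idea described above,
here is a minimal Python sketch.  It is not the actual LOCKSS protocol,
which is considerably more involved; it only assumes a simple majority
vote over SHA-256 digests.  The function name audit_and_repair and the
peer names in the usage note are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Tuple


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one copy of the content."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(
    copies: Dict[str, Optional[bytes]]
) -> Tuple[Dict[str, bytes], List[str]]:
    """Compare every peer's copy and repair the ones that disagree.

    `copies` maps a peer name to its local copy, or None if the copy
    is missing.  Returns the repaired mapping plus the names of the
    peers that had to be repaired.  (Illustrative assumption: a copy
    is "authoritative" when a strict majority of peers hold it.)
    """
    votes = Counter(digest(c) for c in copies.values() if c is not None)
    if not votes:
        raise ValueError("no peer holds a copy")

    winner, count = votes.most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no strict majority agrees on the content")

    # Pick any copy whose digest matches the winning digest.
    good = next(
        c for c in copies.values() if c is not None and digest(c) == winner
    )

    # Peers with a missing or mismatching copy get the agreed copy.
    repaired = [
        name
        for name, c in copies.items()
        if c is None or digest(c) != winner
    ]
    return {name: good for name in copies}, repaired
```

For example, with five peers where one copy is damaged and one is
missing, audit_and_repair() would hand back the agreed copy for all
five peers and report the two that were repaired.  If no strict
majority agrees, the sketch refuses to guess and raises an error.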

I've thought about using LOCKSS for software and documentation  
preservation, but unfortunately it's really not ideal for the job.   
As I mentioned, it was designed with the fairly narrow goal of  
providing libraries and institutions an easy way to preserve  
electronic journals, so it would require a lot of hacking to make it  
useful for something like a software archive.  Still, I think the  
general principle should apply here.  LOCKSS stands for "Lots Of  
Copies Keep Stuff Safe", and that should certainly be the goal of any  
digital archiving system.

Anyway, I just wanted to throw that out there as part of the  
discussion.  I'd be glad to help out on this:  I have a server, disk  
space, and bandwidth, and I'd like to see it used for this kind of  
thing.

At a minimum, I should probably start mirroring http://bitsavers.org/ !

-Seth





