Digital archiving tools

Shoppa, Tim tshoppa at wmata.com
Wed Mar 9 15:50:29 CST 2011


> OTOH I keep all of my images compressed, precisely because I want to know if any copy has been corrupted.

Specifically, I prefer bzip2, even though a lot of my imaging work
from the 1990s, before I knew about bzip2, is squirreled away in zip
files that I have not necessarily moved to bzip2.

If for some reason the only copy of a .bz2 file became partially
corrupted, I could still tell which parts were good and which were bad.
From the bzip2 man page:

RECOVERING DATA FROM DAMAGED FILES
       bzip2 compresses files in blocks, usually 900kbytes long. Each
       block is handled independently. If a media or transmission error
       causes a multi-block .bz2 file to become damaged, it may be
       possible to recover data from the undamaged blocks in the file.

       The compressed representation of each block is delimited by a
       48-bit pattern, which makes it possible to find the block
       boundaries with reasonable certainty. Each block also carries
       its own 32-bit CRC, so damaged blocks can be distinguished from
       undamaged ones.

       bzip2recover is a simple program whose purpose is to search for
       blocks in .bz2 files, and write each block out into its own .bz2
       file. You can then use bzip2 -t to test the integrity of the
       resulting files, and decompress those which are undamaged.
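
As a rough sketch of that last step, here is one way to get the effect of
"bzip2 -t" from Python, for example over the per-block files that
bzip2recover writes out. This is an illustration, not the bzip2 tool itself:
it uses Python's standard bz2 module, and the function name bz2_is_intact
and the 1 MiB read size are my own choices. It assumes a file counts as
damaged if the decoder raises any error before the end of the stream.

```python
import bz2
import sys


def bz2_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole file decompresses cleanly, else False.

    Hypothetical helper: a rough stand-in for running `bzip2 -t` on a file.
    """
    try:
        with bz2.open(path, "rb") as stream:
            # Read and discard the output; only decoder errors matter here.
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError):
        # OSError: invalid data or a CRC mismatch (also a missing file).
        # EOFError: the stream ended before its end-of-stream marker.
        return False
    return True


if __name__ == "__main__":
    # Report the status of every file named on the command line.
    for name in sys.argv[1:]:
        print(name, "ok" if bz2_is_intact(name) else "DAMAGED")
```

Saved under some name of your choosing (say check_bz2.py, a hypothetical
name), it could be run as "python check_bz2.py *.bz2" to list which files
still decompress and which do not.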



More information about the cctalk mailing list