Digital archiving tools
tshoppa at wmata.com
Wed Mar 9 15:50:29 CST 2011
> OTOH I keep all of my images compressed, precisely because I want to know if any copy has been corrupted.
Specifically, I prefer bzip2 (even though a lot of my imaging
work from the 1990s, before I knew about bzip2, is squirreled
away in zip files that I have not necessarily moved to bzip2).
If for some reason the only copy of a bz2 file became partially
corrupted, I could still tell which parts were good and which were bad.
From the bzip2 man page:
RECOVERING DATA FROM DAMAGED FILES
bzip2 compresses files in blocks, usually 900kbytes long.
Each block is handled independently. If a media or transmission error
causes a multi-block .bz2 file to become damaged, it may be possible to
recover data from the undamaged blocks in the file.
The compressed representation of each block is delimited by a 48-bit
pattern, which makes it possible to find the block boundaries
with reasonable certainty. Each block also carries its own 32-bit CRC, so
damaged blocks can be distinguished from undamaged ones.
bzip2recover is a simple program whose purpose is to search for blocks
in .bz2 files, and write each block out into its own .bz2
file. You can then use bzip2 -t to test the integrity of the
resulting files, and decompress those which are undamaged.
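
For anyone who wants to see the block structure described above, here is a rough Python sketch. It is not how bzip2recover is implemented, just an illustration using Python's standard bz2 module. It scans a compressed stream bit by bit for the 48-bit block pattern (0x314159265359) and then shows that a damaged copy fails to decompress. The helper names (find_block_bit_offsets, is_intact, make_sample) and the sample data are made up for this example. Compression level 1 is used only so that a small sample spans several blocks.

```python
import bz2

BLOCK_MAGIC = 0x314159265359   # the 48-bit pattern that starts each block
MASK_48 = (1 << 48) - 1


def find_block_bit_offsets(data):
    """Return the bit offsets at which the 48-bit block pattern starts.

    Blocks are not byte-aligned, so the search slides one bit at a time.
    A chance match inside compressed data is possible but very unlikely.
    """
    offsets = []
    window = 0
    bits_seen = 0
    for byte in data:
        for shift in range(7, -1, -1):      # most significant bit first
            window = ((window << 1) | ((byte >> shift) & 1)) & MASK_48
            bits_seen += 1
            if bits_seen >= 48 and window == BLOCK_MAGIC:
                offsets.append(bits_seen - 48)
    return offsets


def is_intact(data):
    """True if the whole stream decompresses and passes its CRC checks."""
    try:
        bz2.decompress(data)
        return True
    except (OSError, ValueError, EOFError):
        return False


def make_sample():
    """Hypothetical stand-in for a disk image: about 280 KB of text."""
    text = b"".join(b"sector %06d\n" % i for i in range(20000))
    # Level 1 means 100k blocks, so this input spans more than one block.
    return bz2.compress(text, compresslevel=1)


sample = make_sample()
offsets = find_block_bit_offsets(sample)
print(len(offsets), "block headers found; first at bit", offsets[0])

# Simulate media damage by inverting one byte in the middle of the file.
damaged = bytearray(sample)
damaged[len(damaged) // 2] ^= 0xFF
print("original intact:", is_intact(sample))
print("damaged intact: ", is_intact(bytes(damaged)))
```

On a real damaged file the practical route is still bzip2recover followed by bzip2 -t on each piece, as the man page says. The sketch only shows why that approach can work.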