9 track tapes and block sizes

Dennis Boone drb at msu.edu
Thu Sep 17 00:53:08 CDT 2020

 > What I know is that tape is subdivided in files by means of marks,
 > and each file is subdivided in blocks of equal size.

Er, no.  The blocks aren't necessarily of equal size.  Unix people who
are used to tar often seem to have this mindset, but the general case is
that records can be of varying size.

 > Now suppose you find an unknown tape you want to preserve: using dd
 > you could easily 1:1 copy tape files to hard disk files using a SCSI
 > drive and Linux.

DON'T DO THAT.  If you use dd, you're throwing away information.
Specifically, you're throwing away knowledge of the block size.  Most of
the conventional unix utilities don't care.  Many other things do.  In
many cases, it's difficult or impossible to reconstruct the block sizes
from the content, and even where it is possible, discarding them is
terrible archival practice.

There are file formats for containing tape image data.  The most common
one is probably the simh .tap format.  These all preserve block lengths,
tape marks, indications of errors in reading the original, etc.  Many
fail to provide a means to embed metadata, but you can put that in
separate adjacent files.
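
To make "preserve block lengths and tape marks" concrete, here is a
minimal sketch of writing a simh-style .tap image.  It assumes the
commonly described layout: each record is a 4-byte little-endian length,
the data (padded to an even number of bytes), then the same length
again, and a tape mark is a length word of zero.  The names tap_record
and tap_image are made up for illustration, and the sketch ignores the
error flag and end-of-medium marker, so check the simh documentation
before relying on it.

```python
import struct

# Assumption: a tape mark is stored as a single zero length word.
TAPE_MARK = struct.pack("<I", 0)


def tap_record(data: bytes) -> bytes:
    """Encode one tape block as a simh-style .tap record (hypothetical helper)."""
    if not data:
        raise ValueError("empty block; write TAPE_MARK for a tape mark")
    length = struct.pack("<I", len(data))      # 32-bit little-endian byte count
    pad = b"\x00" if len(data) % 2 else b""    # odd-length data is padded to even
    return length + data + pad + length        # the length is repeated after the data


def tap_image(files) -> bytes:
    """Encode a list of tape files, each given as a list of blocks (bytes).

    Every block keeps its own length, and a tape mark follows each file.
    """
    out = bytearray()
    for blocks in files:
        for block in blocks:
            out += tap_record(block)
        out += TAPE_MARK
    return bytes(out)
```

Note that blocks of 80, 800 and 80 bytes stay three distinct records
here, where a dd copy would give you 960 undifferentiated bytes.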

 > But: how you know which block size is on the tape?

Generally speaking, do a read with a block size at least as large as the
largest block on the tape, and the system will hand you the full record
and the actual number of bytes read.  If you're writing C or scripting
code, the unix read() call does this.  From the command line, you can do
it with dd - specify a large block size and a count of 1, and it'll tell
you what it actually got as it exits.
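
The read() approach can be sketched in Python roughly as follows.  The
function read_tape_file and the constant MAX_BLOCK are hypothetical
names, and the sketch assumes the usual tape driver behaviour: each read
returns exactly one block, and a zero-length read means a tape mark was
hit.  How end of data and read errors show up varies by system, so this
does not handle them.

```python
MAX_BLOCK = 65536  # assumed to be at least as large as any block on the tape


def read_tape_file(read_block, max_block=MAX_BLOCK):
    """Return the blocks of one tape file, each with its original length.

    read_block(n) must behave like read() on a tape device: return one
    whole block of at most n bytes, or b"" when a tape mark is reached.
    """
    blocks = []
    while True:
        block = read_block(max_block)
        if not block:          # zero-length read: assumed to mean a tape mark
            return blocks
        blocks.append(block)   # len(block) is the block size on the tape


# Against a real drive this might be used as (device path is an assumption,
# /dev/nst0 being the usual Linux non-rewinding SCSI tape device):
#
#   import os
#   fd = os.open("/dev/nst0", os.O_RDONLY)
#   blocks = read_tape_file(lambda n: os.read(fd, n))
#   print([len(b) for b in blocks])
```

Passing the reader in as a callable keeps the loop testable without a
drive attached.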

For 9 track, few systems could write blocks larger than 32k or 64k, so
those are decent guesses for "large" there.  If it's DLT or something
more modern, then the largest possible block might be a lot larger.  The
system reading the tape may impose a limit based on available buffer
space.  You should be able to iteratively determine the largest size it
will accept.
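
One way to do that iteration, as a rough sketch: start with a very large
request and halve it until the system stops refusing.  Here try_read is
a hypothetical callable you'd supply yourself; it attempts a read of the
given size, repositions the tape afterwards, and returns True if the
system accepted the request.  The starting size and floor are arbitrary
assumptions, and halving only finds a usable size to within a factor of
two, not the exact limit.

```python
def largest_accepted_size(try_read, start=1 << 24, floor=512):
    """Halve the request size until try_read(size) succeeds.

    Returns the first (largest tried) size that was accepted, or None
    if nothing down to `floor` worked.
    """
    size = start
    while size >= floor:
        if try_read(size):
            return size
        size //= 2
    return None
```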

Many of the quarter-inch cartridge formats actually don't support block
sizes other than 512 bytes.  If they were used on systems that expected
to be able to write larger and/or variable records, the system hardware
or software may have implemented a logical blocking layer on top of the
512-byte hardware layer.  If you're reading one of these and don't have the
original hardware/software to decode it, you'll have to figure out how
to decipher it yourself.

