"File types"
Don
THX1138 at dakotacom.net
Tue Aug 29 16:00:53 CDT 2006
aliensrcooluk at yahoo.co.uk wrote:
> Why not use a combination of an extension
> name and internal references within the file's
> data, like the IFF format on the Amiga.
>
> eg. Filename: mypic.IFF
>
> and then within the first 16 bytes are
> contained the ASCII "ILBM" and something else
> (I forget... that's what not having a properly
> working Amiga does to me!). Not sure why
> they are spread apart and not the first few
> bytes.
Using the extension name *and* something else is just as bad
(worse?) IMO as using the extension name itself. You don't
call yourself andrew.human.male so why should a file's name
have to carry all of that information?
E.g., TIFFs can be stored in Moto or Intel byte order
(big/little Endian). Why aren't there .TIFFi and .TIFFm
extensions to differentiate between the two? (ans: because
TIFF decoders are smart enough to Do The Right Thing.)
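To make the TIFF case concrete: the byte order is declared in the first two bytes of the file ("II" for Intel, "MM" for Moto), followed by the number 42 stored in that order. A minimal sketch of the check in Python; the function name tiff_byte_order is just an illustration, not anything from a real decoder:

```python
import struct

def tiff_byte_order(header: bytes) -> str:
    """Return 'little' or 'big' from the first 4 bytes of a TIFF file.

    Hypothetical helper for illustration only.
    """
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    mark = header[:2]
    if mark == b"II":        # Intel order
        order, fmt = "little", "<H"
    elif mark == b"MM":      # Motorola order
        order, fmt = "big", ">H"
    else:
        raise ValueError("not a TIFF header")
    # The next 16-bit value must be 42 when read in the declared order.
    (magic,) = struct.unpack(fmt, header[2:4])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return order
```

The decoder learns everything it needs from data it has to read anyway, which is why no .TIFFi/.TIFFm split is needed.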
Adding *superfluous* IN-BAND data just to identify
the file type is also A Bad Idea (IMO) as the information
is redundant (if the file extension *clearly* is doing its
job, then why do we need to RESTATE all of this information
within the file?). And, it requires applications and
utilities (e.g., listing a directory's contents) to *open*
each file to determine what the file is -- instead of stat()-ing
it (somehow).
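A rough sketch of what that costs, in Python. It assumes the usual IFF layout (bytes 0-3 "FORM", bytes 4-7 a big-endian length, bytes 8-11 the form type such as "ILBM"), which would also explain why the two ASCII tags are not adjacent. The names iff_form_type and list_directory are made up for the example:

```python
import os

def iff_form_type(path):
    """Open the file and read its first 12 bytes.

    Returns the 4-character form type (e.g. 'ILBM') or None if the
    file does not start with b'FORM'. Hypothetical helper.
    """
    with open(path, "rb") as f:
        head = f.read(12)
    if len(head) < 12 or head[:4] != b"FORM":
        return None
    # Bytes 4-7 hold the chunk length; the form type follows it.
    return head[8:12].decode("ascii", errors="replace")

def list_directory(dirname):
    """List (name, size, form type) for each regular file.

    The size comes from stat() alone; the type needs an extra
    open() and read() per file.
    """
    rows = []
    for name in sorted(os.listdir(dirname)):
        path = os.path.join(dirname, name)
        if not os.path.isfile(path):
            continue
        size = os.stat(path).st_size
        rows.append((name, size, iff_form_type(path)))
    return rows
```

The name and size columns come for free from the directory and stat(); the type column is the only one that forces every file to be opened.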
> As far as I am concerned it is down to the
> software to *detect* whether or not the
> file is the right type, regardless of whether
> the extension name is correct (eg. IFF stands
> for Interchange File Format, or something
> similar, and can have sounds stored in an
> .IFF file instead of image data).
Sure! But, then you are conceding that the information
contained in the file extension is "worthless" :>
I have the philosophy that code should protect itself
from bad/malicious data. But, other than that, it should
be able to trust the data sources that it interacts with.
E.g., if someone wants to pass a *humongous* image file
to my image decoder, I should correctly decode/display it
(subject to the resource constraints imposed on me by the
OS at this instant) -- even if it is a 500MB image of
a large black rectangle! (or, "white noise")
> Personally I developed a (lame) .ABI image
> format for the Amiga last year, and apart
> from the .ABI extension it has a couple
> of other ways to detect it is the correct file
> type - eg. my initials ("ADB") are at the end
> of the file as well as something else hidden
> in the data (that was also included just so
> I could verify it's *my* file type - I don't
> want to invent something and have it stolen
> from me like so many great ideas (not
> referring to mine now) in the past).
This means you have to scan the file to verify that
it is "that type". Great if your application is the
only one looking at those files and you can afford
the time to scan them to verify this. But, not an
efficient way of handling "file typeS" in general.
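For illustration only, here is roughly what checking just the trailing initials would look like in Python. The real .ABI layout is not described in this thread, so the only thing assumed is that the last three bytes are "ADB"; has_trailing_marker is a hypothetical name:

```python
import os

def has_trailing_marker(path, marker=b"ADB"):
    """Return True if the file ends with the given marker bytes.

    Hypothetical check; the actual .ABI format is not known here.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size < len(marker):
            return False
        # Jump straight to the last len(marker) bytes.
        f.seek(size - len(marker))
        return f.read(len(marker)) == marker
```

The trailer by itself can be reached with a seek, but it is still an open per file, and anything "hidden in the data" at an unknown offset is what forces the full scan.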