I suspect that the reason for this may have been that the original
CP/M was done on computers with 0 based RAM and methods to load code
at 0 on boot.
I believe the history can be traced back to the MCS8i system. This,
AFAIK, never had disks, and never ran CP/M. But it's 8080-based and
amazingly it has a CP/M-like IOBYTE at location 3.
This machine has RAM at location 0 by default. The thing is that the 8080
starts executing at location 0, so by putting RAM there you could enter
a program (or at least a JMP instruction) at location 0, hit reset, and
run it. That's what you did on the MCS8i. Since a jump is 3 bytes long,
it occupied locations 0, 1 and 2, so location 3 was the first 'free' location,
hence used for IOBYTE. To run the ROM monitor on the MCS8i, you toggled
C3 00 38 (jump to 0x3800) into the first 3 locations, reset, and let it run.
The MDS800, for all it didn't have a front panel, kept the same sort of
memory map (for good reasons IMHO).
FWIW, I have both machines with manuals...
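
For anyone wondering where C3 00 38 comes from, here is a rough sketch in
Python (obviously nothing to do with the original machines). C3 is the 8080
JMP opcode, and the 16-bit target follows low byte first. The function name
encode_jmp is just made up for illustration.

```python
def encode_jmp(target):
    """Return the 3 bytes of an 8080 JMP to a 16-bit address."""
    if not 0 <= target <= 0xFFFF:
        raise ValueError("8080 addresses are 16 bits")
    # Opcode C3, then the address low byte first (little-endian).
    return bytes([0xC3, target & 0xFF, target >> 8])


# The bytes you would toggle into locations 0, 1 and 2.
boot = encode_jmp(0x3800)
print(" ".join(f"{b:02X}" for b in boot))  # C3 00 38
```

Those three bytes fill locations 0 to 2, which is why location 3 is the
first one left over.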
Adding hardware to do the ROM/RAM transfer on boot is relatively
easy with the addition of a 7474. Most machines decode an I/O 
I've seen it done in all sorts of ways. Including a little state machine
that picks up the first 3 read cycles after a reset, disables data
buffers, etc, so they never go outside the CPU board, and forces a jump
instruction onto the processor data lines. Or, as you say, having a
flip-flop that's set one way on reset to enable a ROM at location 0 and
the other way by some other method. One trick was to have the ROM always
accessible at the top of memory. On a reset, it was also enabled at
location 0 (or it filled the memory map). The first instruction in the
ROM was a jump to the 'real' ROM location (let's say it was 0xE000) and
the first read cycle with A15 set (or something) cleared the flip-flop,
thus making the ROM appear only at the top of the memory map and enabling
RAM elsewhere. And there are all sorts of other ways to do it.
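
The last trick is perhaps easier to follow as a toy model. This is only a
sketch of the address decoding in Python, not any particular machine's
circuit. The 8K ROM size, the BootMap name and the 'ROM fills the whole map'
variant are my assumptions. I've also made the jump go to 0xE003 rather than
0xE000, since in this model a jump to 0xE000 would just land on the jump
instruction again.

```python
ROM_BASE = 0xE000   # the example 'real' ROM address from the text
ROM_SIZE = 0x2000   # assumption: an 8K ROM filling the top of the 64K map


class BootMap:
    """Toy model of a ROM overlay controlled by one flip-flop."""

    def __init__(self, rom):
        self.rom = rom
        self.ram = bytearray(ROM_BASE)   # RAM below the ROM
        self.overlay = True              # flip-flop state after reset

    def reset(self):
        self.overlay = True

    def read(self, addr):
        addr &= 0xFFFF
        if addr & 0x8000:
            # First read cycle with A15 high clears the flip-flop.
            self.overlay = False
        if addr >= ROM_BASE or self.overlay:
            # While the overlay is on, the ROM is mirrored through the map.
            return self.rom[addr % ROM_SIZE]
        return self.ram[addr]

    def write(self, addr, value):
        addr &= 0xFFFF
        if addr < ROM_BASE:
            self.ram[addr] = value & 0xFF


rom = bytearray(ROM_SIZE)
rom[0:3] = bytes([0xC3, 0x03, 0xE0])     # JMP 0xE003, first thing in the ROM

m = BootMap(bytes(rom))
fetched = [m.read(a) for a in (0, 1, 2)]  # CPU fetches the jump from 'location 0'
m.read(0xE003)                            # next fetch has A15 set
m.write(0, 0x55)                          # location 0 is now ordinary RAM
print([hex(b) for b in fetched], m.overlay, hex(m.read(0)))
```

After the jump has been fetched, reads from location 0 come back from RAM,
and only another reset brings the ROM back down there.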
-tony