On Sat, Aug 10, 2024 at 12:28:23AM -0500, Steve Lewis via cctalk wrote:
[...]
 I don't ever recall seeing 86-DOS on shelves, or ever really hearing about
 it. But CP/M remained fairly popular to mid 1980s (I just mean I knew
 various friends who daily used CP/M then). A couple issues with CP/M: it
 never really "broke the 64K barrier", so I've wondered who "pioneered" the
 segment management needed to make the 1MB conventional RAM seem more
 contiguous than it really is? (I understand the 640K barrier was just
 arbitrarily picking 10 segments for end user, and 6 for essentially system
 reserve - and yes there is more details to that).

Your question (well, more of a musing but I'm taking it as something which
can be answered) seems a bit muddled, so I'll have a crack at several
possible interpretations.

CP/M was effectively limited to 64KiB because it had no traction outside of
the 8080/Z80 which had a 64KiB address space. To go beyond that limit on
those CPUs involves paging, and some platforms did indeed use paging for RAM
disks and to move some OS out of the way to leave more RAM for programs. But
as far as programs are concerned, 64KiB is the limit unless they happen to
be platform-specific and know how to handle the paging.
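
As a rough illustration of what "knowing how to handle the paging" means, here
is a toy Python model of a banked 64KiB address space. The layout (a switchable
lower 48KiB plus a fixed 16KiB common area at the top) and all names are my own
assumptions for the sketch, not a description of any particular machine.

```python
class BankedMemory:
    """Toy model of bank switching behind a 16-bit address bus."""

    COMMON_BASE = 0xC000  # hypothetical split: below is banked, above is common

    def __init__(self, banks: int) -> None:
        # Each bank supplies its own copy of the switchable region.
        self.banks = [bytearray(self.COMMON_BASE) for _ in range(banks)]
        # The common area is visible whichever bank is selected.
        self.common = bytearray(0x10000 - self.COMMON_BASE)
        self.current = 0

    def select(self, bank: int) -> None:
        """Stand-in for the platform-specific I/O write that switches banks."""
        self.current = bank

    def _locate(self, addr: int):
        addr &= 0xFFFF  # the CPU can only ever emit 16 address bits
        if addr >= self.COMMON_BASE:
            return self.common, addr - self.COMMON_BASE
        return self.banks[self.current], addr

    def read(self, addr: int) -> int:
        mem, index = self._locate(addr)
        return mem[index]

    def write(self, addr: int, value: int) -> None:
        mem, index = self._locate(addr)
        mem[index] = value & 0xFF


mem = BankedMemory(banks=2)
mem.write(0x1000, 0xAA)   # lands in bank 0
mem.select(1)
print(hex(mem.read(0x1000)))  # bank 1 has its own byte at the same address
```

The program still only ever sees 64KiB at a time; anything more has to go
through select(), which is exactly the part that differs between platforms.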

From what I can tell from a casual perusal of the documentation of CP/M-68K
and CP/M-86, they support the full address space of 4GiB and 1MiB respectively.
This is kind of obvious on the m68k since why would they artificially limit
it, but on x86 it's less obvious because they could have restricted changing
the segment registers. CP/M-86 *also* supports an "8080 model" with CS == DS
== ES, presumably to ease quick ports of 8080 code through source-to-source
translation.

86-DOS is the same 1MiB. Applications are free to change the segment
registers.

As to the 8086, it really does have a 1MiB linear address space. The gotcha
is that 20-bit linear addresses are too wide to fit in a 16-bit register, so
it uses a pair of registers. This much is no different to splitting a 16-bit
address across the 8-bit H and L registers on the Z80. The problem is that
there are just too few segment registers and so they need reloading all the
time, which is expensive, and pointer arithmetic is much more difficult due
to the shift-by-four (the segment value is shifted left four bits and added
to the 16-bit offset), so for performance one tries to write code such that
the segment registers don't get changed so often, and one way to do that is
to introduce an artificial 64KiB limit.
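
To make the shift-by-four concrete, here is a minimal Python sketch of how a
segment:offset pair maps to a 20-bit address, and why arithmetic on a full
segment:offset pointer is more work than arithmetic on the offset alone. The
function names are mine and purely illustrative; this models the 8086 address
calculation only, not any particular compiler's pointer handling.

```python
def linear(segment: int, offset: int) -> int:
    """8086 address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def near_add(offset: int, delta: int) -> int:
    """Offset-only arithmetic: cheap, but wraps inside the 64KiB segment."""
    return (offset + delta) & 0xFFFF


def far_add(segment: int, offset: int, delta: int) -> tuple[int, int]:
    """Full pointer arithmetic: go via the linear address, then split it
    again into a segment and the smallest possible offset (0..15)."""
    addr = (linear(segment, offset) + delta) & 0xFFFFF
    return addr >> 4, addr & 0xF


# Many different pairs alias the same byte.
print(hex(linear(0x1234, 0x0010)), hex(linear(0x1235, 0x0000)))

# Stepping one byte past offset 0xFFFF:
print(hex(near_add(0xFFFF, 1)))                       # wraps to the segment start
print([hex(part) for part in far_add(0x1234, 0xFFFF, 1)])  # needs a new segment value
```

Staying within one segment means only near_add() is ever needed, which is
where the self-imposed 64KiB limit comes from.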

It's also worth noting that the PC memory space is very much *not* divided
into fixed 64KiB segments (and ISTR it was originally a 512/512 split).
Segment registers have 16-byte granularity and a segment can straddle a
64KiB boundary just fine. This is used to some effect on the 286 to gain an
extra 65520 bytes beyond the 1MiB boundary in real mode.
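
The 65520 figure falls straight out of the address arithmetic. A small Python
sketch, with the number of address bits as a parameter (20 for the 8086, 24
for the 286); the helper name is mine, and this deliberately ignores
machine-level details such as whether address line 20 is actually enabled on
a given PC.

```python
def physical(segment: int, offset: int, address_bits: int = 20) -> int:
    """Real-mode address: segment * 16 + offset, truncated to the bus width."""
    return ((segment << 4) + offset) & ((1 << address_bits) - 1)


ONE_MIB = 1 << 20

# A segment can start at any 16-byte boundary, so it can straddle a 64KiB line.
first = physical(0x0FFF, 0x0000)
last = physical(0x0FFF, 0xFFFF)
print(hex(first), hex(last))  # spans both sides of 0x10000

# The highest segment, FFFF, starts 16 bytes below 1MiB.
top_8086 = physical(0xFFFF, 0xFFFF, address_bits=20)  # wraps back to low memory
top_286 = physical(0xFFFF, 0xFFFF, address_bits=24)   # carries on past 1MiB
extra = top_286 - ONE_MIB + 1
print(hex(top_8086), hex(top_286), extra)
```

With 24 address bits, offsets 0x0010 through 0xFFFF of segment FFFF all land
above 1MiB, which is 64KiB minus the 16 bytes that still sit below it.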