Reproduction micros

Mon Jul 18 23:30:42 CDT 2016

On Mon, Jul 18, 2016 at 9:59 AM, Liam Proven <lproven at gmail.com> wrote:
> In detailed
> technical ways I confess I do not fully understand, the ARM2 and its
> chipset's design was optimised to work with cheap DRAM with relatively
> slow cycle times.

I've seen this claim in the past. I've looked over the chipset design,
and I don't think it did any more wonderful a job of supporting cheap
commodity DRAM than the other common chipsets of the era. Perhaps
someone with greater familiarity with the MEMC chip can tell us if
there is some tricky DRAM support feature I've overlooked.

The only uncommon and particularly clever thing I saw a system do to
optimize for DRAM (as opposed to any other read/write memory type) was
to use the low-order address lines for the DRAM row address, and to
start the DRAM cycle (assert RAS strobe) before the MMU had done the
address translation. This avoids a portion of the DRAM cycle latency
by taking advantage of the DRAM only needing half of the address bits
at the start of the cycle.
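
As a rough sketch of why this works (not taken from any particular
machine), the following assumes 4K-byte pages, a 16-bit data bus, and
DRAMs with 9 row and 9 column address bits. All names and sizes here
are hypothetical. The row address comes entirely from bits below the
page boundary, so it is the same before and after translation, and RAS
can be asserted without waiting for the MMU:

```python
PAGE_BITS = 12      # assumption: 4K-byte pages
BUS_BYTE_BITS = 1   # assumption: 16-bit data bus, A0 selects the byte
ROW_BITS = 9        # assumption: DRAM with 9 row address bits
COL_BITS = 9        # assumption: DRAM with 9 column address bits

def row_address(addr):
    """Row address: the low-order word-address bits (untranslated)."""
    return (addr >> BUS_BYTE_BITS) & ((1 << ROW_BITS) - 1)

def column_address(addr):
    """Column address: the next bits up, which need the translated address."""
    return (addr >> (BUS_BYTE_BITS + ROW_BITS)) & ((1 << COL_BITS) - 1)

def translate(vaddr, page_table):
    """Toy MMU: replace the virtual page number, keep the page offset."""
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    return (page_table[vpn] << PAGE_BITS) | offset

page_table = {0x12: 0x03}          # hypothetical single mapping
vaddr = 0x12ABC
paddr = translate(vaddr, page_table)

# The row address is available from the virtual address alone.
early_row = row_address(vaddr)
# The column address has to wait for the translated address.
late_col = column_address(paddr)
```

The scheme only holds while BUS_BYTE_BITS + ROW_BITS does not exceed
PAGE_BITS; otherwise some row bits would depend on the translation.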

If a translation fault occurs, or the upper part of the translated
address turns out not to be intended for the DRAM array (e.g., ROM or
I/O access), the DRAM cycle has to be either:
* completed normally (OK for read cycles, discard the data)
* completed but forced to be a read (OK for read or write cycles, but
for write, don't assert WE signal, and somehow prevent bus contention)
* aborted (never assert CAS strobe, OK for read or write cycles).
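
The three options above can be written out as a small decision table.
This is only an illustrative sketch; the function name, the policy
names, and the returned tuple are my own invention, not any real
controller's interface. It assumes RAS has already been asserted by the
time the translation result is known:

```python
def finish_cycle(is_write, targets_dram, policy):
    """Decide how to finish a DRAM cycle whose RAS is already asserted.

    Returns (assert_cas, assert_we, data_is_used).
    policy is one of "complete", "force_read", "abort" (hypothetical names).
    """
    if targets_dram:
        # Translation succeeded and the access really is for the DRAM.
        return (True, is_write, True)
    if policy == "complete":
        # Finish normally and discard the data: only safe for reads.
        if is_write:
            raise ValueError("completing a stray write would corrupt DRAM")
        return (True, False, False)
    if policy == "force_read":
        # Finish as a read regardless; WE is never asserted.
        return (True, False, False)
    if policy == "abort":
        # Never assert CAS, so no location is read or written.
        return (False, False, False)
    raise ValueError("unknown policy: " + policy)

# A write that turns out to be an I/O access, under each safe policy:
forced = finish_cycle(is_write=True, targets_dram=False, policy="force_read")
aborted = finish_cycle(is_write=True, targets_dram=False, policy="abort")
```

The sketch does not model the bus contention problem mentioned for the
forced-read case; real hardware would also have to keep the DRAM data
outputs off the bus while the CPU is driving write data.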

I saw this done on at least one MC68000-based system with a discrete
logic MMU, but I don't recall which one. The technique became
generally inapplicable when MMUs were integrated into the CPU chips,
because such CPUs generally don't give you the untranslated low
address bits any sooner than the translated upper address bits.

The technique wasn't generally useful with the MC68451 segmented MMU
or MC68851 paged MMU, because those supported translation granularity
down to 256-byte boundaries, so only seven or six low-order address
lines were unmapped (for 16-bit or 32-bit memory systems,
respectively). DRAMs at 64K and larger capacities require at least
eight row address bits. Paged MMUs with a minimum page size of 4K
bytes would be somewhat better suited to this technique.
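
The line counts are easy to check with a little arithmetic. This is a
sketch under the assumption that the DRAM multiplexes its address as
equal numbers of row and column bits, and that the byte-select bits of
the bus are not usable as row address lines. The helper names are made
up for illustration:

```python
def log2_exact(n):
    """log2 of a power of two."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1

def unmapped_lines(granularity_bytes, bus_bytes):
    """Address lines below the translation boundary, above byte select."""
    return log2_exact(granularity_bytes) - log2_exact(bus_bytes)

def row_bits(dram_locations):
    """Row address bits, assuming an even row/column split."""
    return log2_exact(dram_locations) // 2

for granularity in (256, 4096):
    for bus_bytes in (2, 4):
        print(granularity, "byte granularity,", bus_bytes * 8, "bit bus:",
              unmapped_lines(granularity, bus_bytes), "unmapped lines")

for size in (64 * 1024, 256 * 1024, 1024 * 1024):
    print(size // 1024, "K DRAM needs", row_bits(size), "row address bits")
```

With 256-byte granularity neither bus width reaches the eight lines a
64K DRAM needs, while 4K-byte pages leave ten or eleven lines.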

The major drawback to this technique is that it isn't possible to use
DRAM page-mode access to read consecutive locations, because
consecutive locations are in different rows of the DRAM, rather than
in different columns. However, the technique was used in machines that
didn't have hardware support for the processor to request bursts of
sequential accesses, so it didn't matter. Bus support for bursts of
consecutive addresses mostly appeared when caches became integrated
into CPU chips. Note that bursts could still be supported
by interleaving multiple DRAM banks, but that would cut down on the
number of unmapped low address lines available for use for the DRAM
row address, which was barely adequate without interleaving.
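
To illustrate the page-mode drawback, here is a small comparison of the
two address mappings, assuming a DRAM with 9 row and 9 column bits and
working in word addresses. The function names are hypothetical. With
the row taken from the low-order bits, four consecutive words fall in
four different rows; with the conventional mapping they share a row and
could be fetched with page-mode column accesses:

```python
ROW_BITS = 9   # assumption: 9 row address bits
COL_BITS = 9   # assumption: 9 column address bits
ROW_MASK = (1 << ROW_BITS) - 1

def early_ras_row(word_addr):
    """Row from the low-order bits, as in the early-RAS technique."""
    return word_addr & ROW_MASK

def conventional_row(word_addr):
    """Row from the high-order bits, leaving low bits for the column."""
    return (word_addr >> COL_BITS) & ROW_MASK

words = range(0x100, 0x104)   # four consecutive word addresses

early_rows = {early_ras_row(w) for w in words}
conventional_rows = {conventional_row(w) for w in words}

print("early-RAS mapping touches", len(early_rows), "rows")
print("conventional mapping touches", len(conventional_rows), "row(s)")
```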