Microsoft "20HAL"

Last Updated 02/12/2024 19:44 -0500   

 

This topic isn't directly related to my work with the Seattle Gazelle, but it is an important, although minor, part of the overall development process of PC-DOS, and somewhat relevant after having hit the 40th anniversary of its introduction in 1981.

If you visit Dan, TheStarman's site at pcministry, there is a discussion about a June 1981 pre-release version of PC DOS 1.0 that is sometimes arbitrarily called "IBM PC DOS 0.90". This discussion is related to a disk image Dan received in 2006 from a former IBM employee which contained a near-final version of PC-DOS and some development-related tools that never made it to the final disk shipped by IBM. One interesting tool is "20HAL.COM" dated April 23, 1981 (or earlier), about four months before the introduction of the IBM-PC on August 12, 1981. The "20" might refer to the DECSYSTEM-2020 that Microsoft supposedly had at the time, although I haven't verified that. "HAL" is also an interesting choice, with each letter being one off from IBM -- and/or an interesting parallel to a large, maniacal, all-knowing spaceship computer of the same name.

20HAL is a relatively short program (1.8k) that I believe was used to transfer files between Microsoft in Bellevue, WA and IBM's Entry Systems division in Boca Raton, FL. To my knowledge, no one has examined it closely because, really, how important is it? It was a simple internal tool to get DOS development files to IBM. But, it's an interesting digital archeological project.

Dan, TheStarman's site also references a 2005 interview with Bob O'Rear (https://thestarman.pcministry.com/DOS/ibm100/Exam.htm#DATA) that confirms that IBM did dial into Microsoft's DECSYSTEM-20 (using a 300-baud voice modem) to communicate about development issues. It doesn't specifically mention file transfers, and I assume the reference to "voice modem" means an acoustic coupler. That page also discusses information found in the slack space (unused space on a diskette) on an original PC-DOS 1.10 disk. This space contains references to what might be later versions of this same program ("DEC-20 +++ FAST +++ [Hal version] 11-Oct-81" and, four months later, "Tops-20 Downlink to MS-DOS (Created: 24-Feb-82)  [IBM Version] [1200-bps]").

As an aside, I think it's interesting that there's anything in the slack space of a public distribution -- that means that someone made a duplication master out of a previously-used work diskette. A freshly-formatted disk is mostly filled with 0xF6 in the slack space. Admittedly, sector-level disk editing tools didn't exist at first, so there was no way for the casual user to view slack space. I guess you could have loaded it into memory using DEBUG, but that's not nearly as user-friendly as Norton Disk Edit. Anyway, I digress...

I have so far not found anyone who knows exactly how the inter-company communication was set up or how the utility was actually used. There is no embedded help, which is unsurprising since it was an internal tool. So, necessarily, this requires a lot of conjecture based on what would seem logical in that environment in 1980 and 1981. I'll summarize the results here, but you can scroll down for more of the day-to-day analysis. I had several questions:

A friend of mine, Ryan Ottignon, is working on a history project relating to SCP and the different versions of 86-DOS that were built. He told me of a passage in the book Gates (Stephen Manes and Paul Andrews) which, in Chapter 12 ("DOS Capital"), mentions exchanging development files with IBM:

 

"Software exchanges took place daily. Every afternoon around five o’clock, someone from Microsoft would round up the disks and drive down the freeway to Sea-Tac Airport and the Delta DASH package delivery service. Eventually modems were set up to speed transfers even further."

 

This passage seems to answer part of the second question above. The chapter goes on to say that the Chess prototype was unreliable (quite possibly due to overheating in the un-air-conditioned "secure" lab and its wire-wrapped construction), but that DOS and BASIC were running on it by late January 1981 (the 22nd to the 25th, specifically). Further, it mentions heavy troubleshooting sessions in March, and the need to have IBM actively engaged in the testing to keep the project on schedule. From a code development perspective, the book describes Bob O'Rear receiving updated code from Tim Paterson at SCP and doing the adaptation work for Chess in a "...wildly kludgy multistep process that involved a stroll from one end of the building to the other and the use of three different machines, not counting the IBM prototype." From an email exchange I had with Tim Paterson in 2014, I learned that Microsoft stored the source on a PDP-10 but that the code was actually assembled on a specially configured Seattle Computer Products ("SCP") S-100 machine (I believe one with an above-average amount of memory). If Bob was doing the adaptation work, he would be a likely source of the files that needed to be sent to IBM daily. It wouldn't be too much of a leap to think he developed the 20HAL tool, if for no other reason than to save himself a ton of time moving files around when he could spend that time writing and debugging code.

In Ryan's research, some of the versions of 86-DOS code he uncovered were stored on Northstar-format (89.6k, 35x10x256, hard-sectored) diskettes. It is not clear to me whether Northstar (Z80) machines were used at SCP for some purpose, or if it was a personal machine owned by that developer (Pat Opalka). This same image contained a clone of the MicroPro Word Master text editor. According to a post on the VCFED forum, it was a decompilation-conversion-recompilation of Word Master for CP/M that was used by Tim Paterson as his primary IDE for later programs.

 

2024 Updates

Recently a trove of old SCP disks (literally hundreds) was found by a friend of mine, f15sim. Among these disks was the earliest known version of 86-DOS, v0.1C - SN#11. This launched a massive effort to compare it to the later known version 0.34 and the released 1.0 version, which paralleled PC-DOS. While those efforts are too detailed to describe here, one document came to light within files uploaded by Bob O'Rear to the Internet Archive (here). This undated document (although likely not earlier than February 1981) details how the object code came together and was sent to IBM. It also confirms several aspects of the process which I surmised below. Here's what Bob said (with my editorials in braces):

  1. Seattle Computers delivers source of QDOS and utilities on 8" QDOS formatted diskette and an absolute assembler. {Depending on the date, this could be the SMALDISK or LARGDISK format, the difference being a 16-byte directory entry versus the 32-byte entry that was added in February 1981.}

  2. Incorporate changes to QDOS using EDLIN + assemble into absolute code. {"Absolute code" would likely be a COM file. Using SCP utilities and syntax, this probably means something like "EDLIN 86DOS.ASM", "ASM 86DOS", and "HEX2BIN 86DOS".}

  3. Run an encode program (written in BASIC) that converts absolute code to Intel ASCII hex format. {This is key because file transfers were not "8-bit clean" because of 7-bit terminal ASCII. This produces an ASCII-only file, much as UUENCODE does (UUENCODE itself dates from around 1980).}

  4. Upload the encoded ASCII hex QDOS to the DEC 2020. {This confirms the system used.}

  5. BIOS for MSDOS written on 2020 in XMACRO-86 a cross assembler. Assemble to absolute location. {I think this refers to DOSIO.ASM, the IO layer that 86DOS calls.}

  6. Download BIOS and ASCII-hex encoded QDOS to Intel ISIS system. {This is new information.}

  7. Encode the BIOS to Intel ASCII-hex on the Intel ISIS machine. {Needed as a bridge between the DEC2020 to get both into HEX??}

  8. Transfer the BIOS + QDOS over IBM RS232 to IBM PC prototype memory via ROM debugger that accepts Intel ASCII hex. {Was the Intel MDS the only one with a serial port?? Doubt it...so was there another machine involved here? The Gazelle definitely had serial ports.}

  9. Use IBM PC prototype debugger to write BIOS + QDOS to exact sector locations of PC prototype 5-1/4 diskette system. {Many system ROM monitors I've used on S100 systems and others have the ability to receive HEX files. Really need a copy of that prototype ROM!!}

  10. Repeat assembly, encoding, etc for QDOS utilities. {Rinse and repeat...}
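The encoding in step 3 can be sketched in a few lines. The original encoder was written in BASIC and isn't available here, so the Python below is only an illustration of the standard Intel HEX record layout (":" + byte count + 16-bit address + record type + data + checksum). The function names, the 0100h load address, and the 16-byte record length are assumptions chosen for the example, not details taken from Bob's document.

```python
def ihex_record(address, rectype, data):
    """Build one Intel HEX record: ':' LL AAAA TT <data> CC."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, rectype]) + bytes(data)
    # The checksum is the two's complement of the sum of all preceding bytes.
    checksum = (-sum(body)) & 0xFF
    return ":" + body.hex().upper() + f"{checksum:02X}"


def encode_ihex(binary, load_address=0x0100, bytes_per_record=16):
    """Turn a binary image into a list of 7-bit-safe Intel HEX text lines."""
    lines = []
    for offset in range(0, len(binary), bytes_per_record):
        chunk = binary[offset:offset + bytes_per_record]
        lines.append(ihex_record(load_address + offset, 0x00, chunk))  # data record
    lines.append(ihex_record(0, 0x01, b""))  # end-of-file record
    return lines


if __name__ == "__main__":
    # Four arbitrary bytes, as if loaded at 0100h like a COM file.
    for line in encode_ihex(bytes.fromhex("88AF0100")):
        print(line)
```

Every output character is a colon, a digit, or A-F, which is why this format survives a 7-bit terminal link that would mangle raw binary.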

Whew! Some interesting takeaways:

 

Breaking down the Code

To start, I ran the code through Sourcer to get a listing file, and I made extensive use of the Scroll Screen Tracer (SST) by Murray Sargent (itself a program with an interesting Microsoft-connected history). Looking at the embedded text strings revealed TOPS-10 commands and keyboard sequences to print login information, set control code filtering, execute a copy command, and run external programs. The original listing is here.

The code had an odd mix of CP/M-style calls (using "CALL 0005") for keyboard input and certain disk functions, calls to the PC BIOS for video and serial communications, and finally two isolated PC DOS INT 21h calls for two disk functions. The hallmarks of CP/M-era coding (jumping around in the code, and "stack jumps" made by pushing an address on the stack and executing a return without balancing the stack) just don't seem to fit with what I have read about IBM's strait-laced development practices. So, I feel that it was developed by either SCP or Microsoft and given to IBM to use.

The CP/M-style calls and the names of the two transfer programs on the PDP-10 (both start with "CPM") would hint at it being originally written on a CP/M 2.2 system and then possibly translated to 8086 using SCP's TRANS86. 86-DOS (and subsequently, MS-DOS) was designed to accept CP/M-compatible system calls as a way to enable users to migrate CP/M programs (Z80-based) to 8086-based platforms. CP/M-86 wasn't released until three months after the PC so I doubt it was developed using that. SCP did not have a Z80 CPU card, but Gazelle-specific versions of both 86-DOS and MS-DOS could use floppy controller cards that were common in Z80 S-100 systems of the era. If the recovered Northstar disks were from a development machine, then the pieces somewhat fit together.

The next question related to how the connection between Microsoft and IBM was made. When looking at the serial communications code, there are no phone numbers or "smart modem" command strings (the Hayes Smartmodem 300 didn't come out until later in 1981; there were S-100 modems, and although none were "smart", they could take a "dial" string). The least expensive method would probably have been acoustic couplers on either end of a POTS line, rather than a leased line...but...the core PC-DOS source files were about 300K which, at 300 baud (roughly 30 characters per second), would take nearly three hours to transfer. This could have been fine at the end of the day when toll rates would have been lower. Using a 56kbps leased line, that transfer would take only about 45 seconds. The use of modems would appear to be confirmed by the passage in the Gates book, although the type of modem was non-specific.
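The transfer-time estimates are simple arithmetic, sketched below. The assumptions are stated in the code: 300K is taken as 300 x 1024 bytes, an asynchronous link is assumed to cost 10 bits per character (start bit, 8 data bits, stop bit), and the leased line is treated as a raw 8-bits-per-byte channel. The 1200 bps figure is included only because that speed appears in the later slack-space string.

```python
def transfer_seconds(size_bytes, bits_per_second, bits_per_char=10):
    """Estimated transfer time, ignoring protocol overhead and retries."""
    return size_bytes * bits_per_char / bits_per_second


SIZE = 300 * 1024  # assumed size of the core source files, in bytes

t_300 = transfer_seconds(SIZE, 300)                        # acoustic coupler
t_1200 = transfer_seconds(SIZE, 1200)                      # later 1200 bps link
t_56k = transfer_seconds(SIZE, 56_000, bits_per_char=8)    # leased line, no framing

if __name__ == "__main__":
    print(f"300 bps : {t_300 / 3600:.2f} hours")
    print(f"1200 bps: {t_1200 / 60:.1f} minutes")
    print(f"56 kbps : {t_56k:.1f} seconds")
```

If binaries were sent as Intel HEX rather than as raw bytes, these times would more than double for those files, since each data byte becomes two ASCII characters plus per-record overhead.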

As you get into the code, there are some small clues as to the direction of data and thus, which company likely used it. The utility was provided by a former IBM employee, the code contains the word "downlink", it opens files locally with an "overwrite" attribute, and it acts as a terminal emulator to send remote system commands. To me, these all point to the utility being used by the "downloader", so the IBM development team, in order to grab files directly from Microsoft's minicomputer.

20HAL operates as follows:

What I have not yet been able to figure out is the file transfer protocol. I have been able to successfully transfer a single block of data, but it relies on either pauses or some handshaking -- something that can't be accomplished purely through HyperTerm. I did try to write a short program in QBASIC to act as the server, but that didn't work well.

I did set up a TOPS-10 system on SIMH, and I have several acoustic modems and a PBX, so at some point I can get a better demo working.

 

Daily Log

Below is more of a day-to-day log of what I discovered, but I did not go back and correct earlier entries as I learned more. I would really like to find someone who has used this program, but that's obviously a very limited number of people, most of whom are probably now retired. I would also like to get the PDP-10 file transfer code to see how that worked.

7/17/2021

After a few months of having this project on the shelf, I embarked on taking the source code and seeing if I could produce binary-identical output. I grabbed a copy of IBM Macro Assembler 1.0 (probably what was MASM 1.0), figuring that it would be the closest thing to what was used at the time. After a day or two of noodling around with the syntax, I still could only get the code about 90% there. Then, it dawned on me that, more than likely, they would have just used the SCP Assembler from 86-DOS since it produces a fairly clean non-segmented binary. Duh! Time to fire-up an MS-DOS VM and see if it works.

The SCP syntax is different from MASM (sort of like NASM versus MASM), so I needed to spend time converting the Sourcer-produced file to be compatible with ASM. After those changes, the first pass through ASM resulted in about 50 errors, mainly because of stuff I missed. After fixing those errors, I was able to get a clean compile with no errors. Yay! Most of the problems relate to forcing a word-wide op code rather than the byte-wide version, which were easily fixed, and a few other quirks like colons after labels and invalid characters (like "_"). The one remaining problem relates to operand size in a single line -- it compiles but produces the incorrect byte sequence. For example, code just after the loc44 label is:

    mov    [bx+1],ch

In the original program, this codes as 88/AF/01/00 (a 16-bit displacement) yet when recompiled, it comes out as 88/6F/01 (an 8-bit displacement). Even if I use the "W" (word) modifier, it doesn't change the output. Arrrgh. One of my friends from the VCFED board mentioned a feature of the SCP assembler in which you can force a 16-bit displacement by using a forward equate: presumably, since the value isn't yet known when the instruction is first encountered, the assembler can't tell that it fits in a signed byte (-128 to +127) and reserves a full word. So, the above would be...

    mov    [bx+ONE],ch

...and then at the bottom of the source file I added:

    ONE:    equ    1

That fixed it! Not sure if that's how it would have actually been coded, but at least it causes ASM to emit the right bytes. There is one additional similar mis-coding in the FlushBuf routine (cmp cl,0) which requires using the same forward equate trick (cmp cl,ZERO) to get the right bytes (80/F9/00 rather than 82/F9/00).
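To see why the two byte sequences differ, here is a minimal Python sketch of how the 8086 ModRM byte is built for this one instruction form. It is not a model of how SCP's ASM works internally; the function name and the force_disp16 flag are made up for illustration, and the special case of a zero displacement (which a real assembler would encode with no displacement at all) is deliberately ignored.

```python
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}
RM_BX = 0b111  # r/m field value for [bx + displacement]


def mov_bx_disp_reg8(disp, reg, force_disp16=False):
    """Encode MOV [bx+disp], reg8 (opcode 88h) with an 8- or 16-bit displacement."""
    if not force_disp16 and -128 <= disp <= 127:
        mod = 0b01  # displacement is one sign-extended byte
        disp_bytes = (disp & 0xFF).to_bytes(1, "little")
    else:
        mod = 0b10  # displacement is a full 16-bit word
        disp_bytes = (disp & 0xFFFF).to_bytes(2, "little")
    modrm = (mod << 6) | (REG8[reg] << 3) | RM_BX
    return bytes([0x88, modrm]) + disp_bytes


if __name__ == "__main__":
    print(mov_bx_disp_reg8(1, "ch").hex("/").upper())                     # short form
    print(mov_bx_disp_reg8(1, "ch", force_disp16=True).hex("/").upper())  # long form
```

The only difference between the two encodings is the top two bits of the second byte (6F versus AF) and the width of the displacement that follows, which matches what the forward equate changes in the assembled output.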

Here is the source and listing file for using SCP ASM.

 

4/2/2021

I decided to continue experimenting with this using acoustic coupler modems and several other pieces to simulate a dial-up setup that might have been used. As of now I only have a "remote console" working under SIMH, but it's a start. The basic setup is as follows:

 

/--------/                                                /--------/
|       |                                                 |       |        {  SIMH  }
|       | <--> [:::]  //~~~~~~~~~~~~~~~~~//  [:::] <----> |       | <----> { TOPS10 }
/-------/                                                 /-------/        {        }
.......                                                   .......
.......                                                   .......

SERIAL        MODEM 1    TELCO Panasonic    Modem 2        LSI ADM 31      HP LAPTOP
TERMINAL                 PBX and two DM500                 TERMINAL        SIMH
                         telephones

Modem 1 is an Anderson Jacobson A242-A 300-baud originate-only acoustic coupler modem. Modem 2 is a Nixdorf-branded Novation Cat 300-baud originate/answer acoustic coupler modem. The LSI terminal is a Lear-Siegler ADM31 terminal with a terminal pass-through connector. This is important because Modem 2 is connected to the "Extension" connector and the "Modem" connector is connected to the computer running SIMH.

I have not been able to get SIMH to properly run the DZ terminal multiplexer with real serial ports at 300 baud using "ATTACH DZ -am LINE=0,CONNECT=SER2;300-8N1" (to put a real serial port on the DZ). So, I resorted to configuring the OPR terminal to use a real port using the SET CONSOLE command in SIMH:

simh>SET CONSOLE SERIAL=SER2;300-8N1

With this, I can type commands on the serial terminal and interact with the simulated PDP-10 over the modem connection. So, that's a good start.

<END>

 

The Gory Details

I would note that for system testing, I used a simulated PDP-10 running TOPS-10 using a pre-built TOPS-10 image.

There are certain code sequences which look very much like CP/M 2.2 BDOS system calls. An example:

9663:0584     print_char:
9663:0584 53         push bx
9663:0585 52         push dx
9663:0586 51         push cx
9663:0587 50         push ax
9663:0588 9C         pushf
9663:0589 B1 06      mov cl,6
9663:058B 8A D0      mov dl,al
9663:058D E8 FA75    call cs:5     ;CALL 0005 $-588h
9663:0590 9D         popf
9663:0591 58         pop ax
9663:0592 59         pop cx
9663:0593 5A         pop dx
9663:0594 5B         pop bx
9663:0595 C3         retn

When looking at the disassembly, offset 058Dh came out as a negative offset to the PC. When doing the math, it results in "5". That was the clue. CP/M system calls used the C register as the system call number and then called BDOS (CP/M's equivalent of the "DOS" part of PC DOS) through a call to a fixed address in what Digital Research referred to as "Low Storage" (CP/M's equivalent of the Program Segment Prefix in PC DOS; "CALL 0005"). If you examine the layout of the PSP for PC DOS, you can see that this mechanism was carried over, presumably for backward compatibility for programs that may have been translated to x86 code using TRANS86.
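The "math" here is just the 8086 rule for a near CALL: the 16-bit displacement is added to the address of the next instruction, wrapping at 64K. A short Python sketch, using the values from the listing above (the function names are mine, for illustration only):

```python
def signed16(value):
    """Interpret a 16-bit value as a two's-complement signed number."""
    return value - 0x10000 if value & 0x8000 else value


def near_call_target(instr_offset, disp16):
    """Target of an E8 near CALL: next IP plus displacement, modulo 64K."""
    next_ip = (instr_offset + 3) & 0xFFFF  # E8 opcode + two displacement bytes
    return (next_ip + disp16) & 0xFFFF


if __name__ == "__main__":
    disp = 0xFA75  # displacement word from the listing at offset 058Dh
    print(f"displacement as signed: -{abs(signed16(disp)):04X}h")
    print(f"call target           : {near_call_target(0x058D, disp):04X}h")
```

The displacement works out to -058Bh, and 0590h minus 058Bh lands on offset 0005h, the CP/M-style entry point.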

Based on this, I believe that the program was likely developed on a CP/M system -- probably not CP/M-86, but regular Z80 CP/M -- and then translated for use on the PC.

The code also makes calls to the PC BIOS for video (INT10h) and serial (INT14h) needs, interleaved with the unchanged calls to DOS using CALL 0005. The CP/M-style calls that were left unchanged were primarily the file access and FCB calls (set DMA, open, close, read, write). The entry code to DOS from the CALL 0005 twiddles the registers and then jumps to the dispatcher, so there was no real need to translate these to equivalent INT21h calls.

Moving further through the code, I did not notice any data that would be either an AT modem string or a phone number (Boca Raton at the time was in area code 305; area code 407 wasn't split off until 1988). The Hayes Smartmodem 300 wasn't introduced until after the PC was introduced, but the Hayes Micromodem 100 was available on the S-100 platform, having been introduced in 1979. The lack of anything that looks "smart" leads me to believe that access was through a leased line from Bellevue to Boca, or maybe an acoustic coupler.

I have run 20HAL both stand-alone and under the Scroll Screen Tracer while connecting the test PC to a Hewlett-Packard 4952 protocol analyzer. The format of the command line isn't clear, and it sends a string "TTY FILL 3" and waits for a response. The code is also "protocol-less" (no embedded XMODEM or other file transfer protocol that I could find). Based on this, I'm guessing that file transfers were probably limited to text-only, possibly an Intel HEX file. In CP/M, the output of ASM was an Intel HEX file and a PRN listing file. Thus, a HEX file would make sense - it can be turned into a binary by using the HEX2BIN utility also included on the diskette.
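For reference, turning a received HEX file back into a binary is straightforward. The Python below is a minimal sketch of that conversion based on the standard Intel HEX record format; it is not a reconstruction of SCP's HEX2BIN, and choices such as filling unlisted addresses with zero bytes are assumptions made for the example.

```python
def decode_ihex(lines):
    """Convert Intel HEX text lines to (load_address, binary_image)."""
    image = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"not an Intel HEX record: {line!r}")
        raw = bytes.fromhex(line[1:])
        # All bytes of a valid record, checksum included, sum to 0 modulo 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"bad checksum: {line!r}")
        count, address, rectype = raw[0], (raw[1] << 8) | raw[2], raw[3]
        data = raw[4:4 + count]
        if rectype == 0x01:  # end-of-file record
            break
        if rectype == 0x00:  # data record
            for i, byte in enumerate(data):
                image[address + i] = byte
    if not image:
        return 0, b""
    low, high = min(image), max(image)
    # Assumption: any gap between records is filled with zero bytes.
    return low, bytes(image.get(a, 0) for a in range(low, high + 1))


if __name__ == "__main__":
    addr, binary = decode_ihex([":0401000088AF0100C3", ":00000001FF"])
    print(f"load address {addr:04X}h, {len(binary)} bytes: {binary.hex().upper()}")
```

The per-record checksum gives some protection against line noise even without a transfer protocol, since a corrupted record can at least be detected when the file is converted.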

There are several questions I have that would aid in further analysis:

 

9/27/2020

 

10/3/2020

 

10/4/2020

 

10/8/2020

 

10/9/2020

The Thinkpad on the left is the 20HAL machine. The ADM31 terminal in the middle is, well, a terminal. The HP laptop on the right is the PDP-10. The three machines are connected by RS-232 connections at 300 baud. The PDP-10 on SIMH uses telnet, so I'm using a serial-telnet bridge program (it's actually a modem emulator for Commodore BBSes, but it works). I know there's a lot of glare on the below picture, but you can see me "dialing" the PDP-10 using the standard Hayes modem command, except that it takes an IP address and port. After hitting ^C, I get the dot-prompt and I can log into TOPS-10.

 

The only problem with this setup is that I can't get the ADM31 to send characters properly so I can't "dial" the modem. So, I had to run Procomm to do it. But, if I exit Procomm to run 20HAL, the connection drops. If I shell to DOS from within Procomm, and then run 20HAL, I don't get the expected characters or response. Hmmm. I tested the ADM31 with HyperTerm so I know it works but maybe it's a quirk with how the pass-through port works. It's possible that the 20HAL machine was connected directly?

 

10/12/2020

 

10/13/2020

 

Copyright (c) 1998-2024 Richard A. Cini, Jr. (rcini at msn dot com) All Rights Reserved. All copyrights of any third parties referred to herein are hereby acknowledged. There is no warranty, either express or implied, relating to any of the content contained herein. The site maintainer shall in no event be liable to anyone for damages, including any loss of profits, lost savings, or other incidental or consequential damages arising out of the use or misuse of the information contained on this Web site. Batteries not included. You may use the information contained herein for NON-COMMERCIAL purposes only and AT YOUR OWN RISK.