I sat down during lunch today and doodled up a design for a replacement CPU for the Hack machine with the same API, but with a few more registers in it, and which has a more traditional machine language. Here's the design, hopefully my flat text document will come out. Comments on the design are most welcome!
Mack: Warren's Idea for a Slightly More Powerful CPU
Goal: to build a different CPU, using the existing builtin Tecs chips with some extra control logic. Keep the same ALU and the same branch logic, but have multiple registers, and some CPU state flags, as per more popular CPUs.
Basic architecture: ALU; four 16-bit registers: R0, R1, R2, R3; two flag bits (negative and zero) which are set by the last ALU operation. Each instruction is 16 bits. There is up to one RAM fetch and one RAM store in each instruction.
Instruction formats: there are 3 instruction formats: load R0 with a constant, perform an ALU operation, and branch or jump. Here they are:
Load R0 instruction
0 <15-bit unsigned constant> Load R0 <- constant
ALU operation instruction
10 idd iss oooooo ss meaning dst = src1 op src2
dst = src1 op src2 Same 6-bits to control existing ALU,
and 2 bits to identify the dst, src1
and src2 registers.
When i bit is set in iss, we load
from RAM[src1], not src1. When i bit
is set in idd, we save result to
RAM[dst], not the dst register.
Result of the ALU operation clears or
sets the two flag bits: zero, negative.
This instruction can encode a relative branch of up to 2048 instructions, or it can encode an absolute jump to the value store in R0.
11 jjj <11-bit signed constant> When any of the j bits are set,
they act like the Hack jump bits.
Examine the zero and negative flags.
If a branch, sign extend the constant
to 16 bits, and add it to the PC.
However, when all the j bits are off,
this is an unconditional jump. Load
the contents of R0 into the PC.
label: identifier followed by colon
R0 = constant Load R0 with constant instruction
[*]Rn = [[*]Rn] op Rn ALU instruction, * means indirect addressing,
[ ] means that this is optional
Bxx label Relative branch to label, +/- 2048 instructions
Bxx is BEQ, BNE, BLT, BLE, BGT, BGE, or BR
for an unconditional branch.
JMP label This is an absolute jump "macro" instruction.
The assembler translates this into: load R0
with the label, followed by a jump instruction.
Also another macro instruction:
Rn = constant If Rn != R0, do R0 = constant, Rn = R0.
Thus, R0 becomes a sort of "scratch" register.
Sum the Numbers from 1 to 100
R1 = 1 // i = 1
R2 = 0 // sum = 0
R2 = R2 + R1 // sum = sum + i
R1 = R1 + 1 // i++
R0 = 100
R0 = R1 - R0 // calculate i - 100
BLE loop // loop while i <= 100
JMP end // end with infinite loop
I found a bug. The general Hack computer architecture has only one address bus from CPU to memory, used to both identify the location of the data to fetch from RAM and write back to RAM. Therefore, the CPU cannot read from one address and write back to a different address in an instruction.
This means my CPU cannot do, for example, *R3 = *R2 + 1, i.e. load from the address which R2 points to, and write back to the address in R3.
So, the solution is, when the i bit is set in both idd and iss, the src1 value (i.e the ss in iss) is replaced with the destination value. This allow the CPU to do *R3 = *R3 + 1.
For a lunch time back-of-the-envelope doodle, this is quite a clever design.
Implementing this architecture in LogiSim has been on my to do list for quite a while and I finally got around to it.
Note: type in the keyboard while the program is running to change the message.
Along the way I discovered some interesting details.
Easily extended Jump instructionAs I was wiring up the PC load address for the Jump instruction (Branch with jjj = 0) I had a rather snaky wire running its way to R0. It went past the ALU on it way and I got to thinking...
The Jump instruction, 11 000 xxxxxxxxxxx, has all the correct bits available in the undefined region
to turn it into a PC= instruction with full ALU computation including indirect src1.
This may be quite useful for return IPs for assembly language subroutines, computed jumps and implementation of call and return stack functions. It appears useful if the PC= instruction does not set the status bits; there may be a way to use the zr flag as a subroutine pass/fail return value.
R0 = RIP_1 R3 = R0 BR Function RIP_1: BNE Error ... Function: .... R0 = 0 // good result PC = R3
Status bits and overloaded R0=n instructionThere is a problem with the above code snippet. Depending on how the assembler encodes the R0=0 instruction, it may be an A-instruction that does not set the zr flag. I changed the data routing for the A-instruction—instead of muxing the data directly into R0, I am muxing it into the ALU x input, setting the ALU function to out=x, and loading the status register.
More macrosIf R0 is considered a scratch register that the assembler my use at will, then additional useful macros are possible. For example:
Rd=-6 R0=6; Rd=-R0 *Rd=123, R0=123; *Rd=R0 Rd=Rs+16, s!=0 R0=16; Rd=Rs+R0
How much do the extra registers help?Here is an assembly version of C's strcpy. The parameters are pointers to the string data.
On Hack, the parameters and RIP are passed in memory varaibles; on Mack they are passed in registers.
|Free forum by Nabble||Edit this page|