With fairly simple modifications, the Hack Computer can be used to demonstrate how the Von Neumann single memory (Princeton) architecture works.
In this architecture there is only one memory address and memory data bus. This means that instructions that access data in memory can not be read and executed at the same time. The simplest implementation to solve this is a fetch-execute design. Every instruction requires two clock cycles to execute, the fetch phase and the execution phase. During the fetch phase, the instruction is read from memory and stored in the Instruction Register. During the execute phase, the instruction executes, reading and writing memory as required.
Hack Computer modificationsFirst, the ROM32K needs to be moved into Memory.hdl. It will be placed so that it maps into the high half of the address space. This implies that Memory's address bus expands to 16 bits.
0 RAM 16384 Screen 23478 Keyboard 32768 ROMWith the ROM moved into Memory, Computer consists of just CPU and Memory.
CPUfe (reset=reset, addressM=addressM, inM=inM, outM=outM, writeM=writeM); MemoryROM (address=addressM, in=outM, out=inM, load=writeM);
CPUfe is the fetch-execute version of the Hack CPU. When it is fetching instructions the most significant bit of addressM will be 1 so that the ROM is accessed. If a program needs to read data from the ROM, it needs to add 32768 to assembler generated ROM address so that the addressM MSB will be set. (It would be nice to extend the Hack language to make this easier.)
Hack CPU modificationsPhase controller
A DFF with inverted feedback will alternate between 0 and 1 every clock. There is one tricky bit; when the CPU is reset, the phase controller needs to restart with a fetch phase.
// Phase control: alternates between fetch and execute. // Force to 0 during reset so it restarts in fetch phase. And (a=fetch, b=notReset, out=phaseIn); DFF (in=phaseIn, out=execute); Not (in=execute, out=fetch);Fetch phase
During the fetch phase, the PC must be routed to addressM and bit 15 must be set.
ARegister (in=aIn, load=loadA, out=aReg, out[0..14]=addressM); ARegister (in=aIn, load=loadA, out=aReg); Mux16 (sel=fetch, a=aReg, b=true, b[0..14]=pc, out[0..15]=addressM);
The instruction must be read from memory and stored for the execution phase.
// Instruction register Register (load=fetch, in=inM, out=instruction, out=instruction15, out=instruction14, out=instruction13);
The PC must also be incremented. This must be done either during the fetch phase or the execute phase, but not both. Incrementing the PC during the fetch phase offers the possibility of pipelining the next instruction fetch if the current instruction does not access memory.
PC (in=aReg, reset=reset, inc=true, load=jmp, out[0..14]=pc); PC (in=aReg, reset=reset, inc=fetch, load=jmp, out[0..14]=pc);Execute phase
During the execute phase, the instruction that was stored during the fetch phase is executed identically to the original CPU. In my CPU, the only change required was to ensure that the master control signals for A-instructions and C-instructions would only be true during the execute phase.
And (a=notReset, b=aInstBit, out=aInst); And (a=notReset, b=cInstBit, out=cInst); DMux4Way (in=execute, sel=notReset, sel=aInstBit, d=aInst); // And3Way DMux4Way (in=execute, sel=notReset, sel=cInstBit, d=cInst); // And3Way
Does it work?Here is the modified CPU running the CPUfe.tst:
|time|reset|fet|aIn|cIn| pc |addressM| inM | instruction |DRegis|ARegis| outM |writeM| |0 | 0 | 1 | 0 | 0 | 0| -32768 | 12345|0000000000000000| 0| 0| 0| 0 | |1 | 0 | 0 | 1 | 0 | 1| 0 | 0|0011000000111001| 0| 0| 0| 0 | |2 | 0 | 1 | 0 | 0 | 1| -32767 | -5104|0011000000111001| 0| 12345| 0| 0 | |3 | 0 | 0 | 0 | 1 | 2| 12345 | 0|1110110000010000| 0| 12345| 12345| 0 | |4 | 0 | 1 | 0 | 0 | 2| -32766 | 23456|1110110000010000| 12345| 12345| 12345| 0 | ... |22 | 0 | 1 | 0 | 0 | 11| -32757 | -7420|0000000000001110| -1| 14| 14| 0 | |23 | 0 | 0 | 0 | 1 | 12| 14 | 11111|1110001100000100| -1| 14| -1| 0 | |24 | 0 | 1 | 0 | 0 | 14| -32754 | 999|1110001100000100| -1| 14| 14| 0 | |25 | 0 | 0 | 1 | 0 | 15| 14 | 11111|0000001111100111| -1| 14| 14| 0 | |26 | 0 | 1 | 0 | 0 | 15| -32753 | -544|0000001111100111| -1| 999| 999| 0 | |27 | 0 | 0 | 0 | 1 | 16| 999 | 11111|1111110111100000| -1| 999| -543| 0 | |28 | 0 | 1 | 0 | 0 | 16| -32752 | -7384|1111110111100000| -1| 11112| 11112| 0 |t0: instruction fetched from -32768 = 0x8000 = ROM address 0, and PC increments.
t1: @12345 executes, A register set.
t2: instruction fetched from -32767 = 0x8001 = ROM address 1, and PC increments.
t3: D=A executes, D register set.
t22: instruction fetched from -32757 = 0x800B = ROM address 11, and PC increments to 12.
t23: D;JLT executes and JUMPS, PC = A register set = 14.
t26: instruction fetched from -32753 = 0x800F = ROM address 15, and PC increments.
t27: A=M+1 executes. Memory[A=999] = 11111, 11111 + 1 -> A.
The modified CPU also correctly runs the tests:
[D:/TECS/projects/05/fetch-exe] % for x in C*.tst ; do echo -n "$x: " ; HardwareSimulator $x ; done CPUfe.tst: End of script - Comparison ended successfully ComputerAdd.tst: End of script - Comparison ended successfully ComputerMax.tst: End of script - Comparison ended successfully ComputerRect.tst: End of script - Comparison ended successfully [D:/TECS/projects/05/fetch-exe]
Email me if you want the test files.
There is a subtle incompatibility with the original Hack computer. Because all 16 bits of the A Register are used to access memory, any code that does something that uses the top bit of pointers to store some information, like an end-of-list indicator, will access ROM instead of RAM.
Here is the CPU diagram from chapter 5 with the additions required for the single memory architecture.
In reply to this post by cadet1620
Ive been doing some reading and found that the hack system fits the definition of a RISC computer. I have found the CISC using eprom much easier to understand and implement even with a little more overhead. I also found that most modern cpu's are a mixture of the Harvard an von Neumann machine. Im thinking of using the microprogramming paradigm for the physical implementation.
I'm not sure what is the precise definition of RISC, but in my understanding HACK doesn't fit there. As far as I know RISC processors have 2 main characteristics:
Also RISC processors tend to have lots of registers. It's not a hard requirement, but without them the CPU will constantly have to store and load intermediate data in RAM (much like what happens in HACK programs), which will kill the performance.
I think HACK is a quite primitive CISC architecture.
Quit correct but the other requirment of a risc machine is that the instructions for primitive operations are done leaning over to the software stack makes the programming a little more arduous. For instance in most cisc machines the add command in assembly language would be a command structure of add A B Well due to the fact that the machine is microprogrammed, you do not have to specify the exact steps to add A and B. However, in a risc machine you must specify to the computer how to add A and B. In Hack assembly you dont get this you must tell the system how to add the numbers and Fetch Execute turns to Fetch Decode Execute. The instruction from the program rom is decoded by the control rom which stores all the steps for each instruction in the instruction set each microinstruction outputs a control rom of arbitrary size to control all parts of the system. In the hack system this is done with combinatory logic which is a little harder to understand but rom based microprogramming makes it dead simple and much much easier to make changes or additions to an instruction set.
|Free forum by Nabble||Edit this page|