Adding Hexadecimal and Character Constants to Jack

How often have you needed to bring up an ASCII table to print characters or test keycodes? I've written this sort of code way too many times:
if ((key = 88) | (key = 120)) {
    do Output.printInt(x);
    do Output.printChar(44);
    do Output.printInt(y);
}
Jack needs character constants that can be used in place of these obscure numbers.
if ((key = 'X') | (key = 'x')) {
    do Output.printInt(x);
    do Output.printChar(',');
    do Output.printInt(y);
}
It would also be convenient to be able to use hexadecimal constants for bit masks.
let comp = c_instr & 0x1FC0;
is much more readable than
let comp = c_instr & 8128;

Syntax changes

The book's syntax for expressions is

expression:  term (op term)*
term: integerConstant | stringConstant | keywordConstant | ...
integerConstant: A decimal number in the range 0 .. 32767

To support character constants, "term" will need to include a new syntax element, "characterConstant".

To support hexadecimal constants, "integerConstant" will need to be expanded to include both decimal and hexadecimal syntax.

term:  integerConstant | characterConstant | stringConstant | ...
integerConstant: decimalConstant | hexadecimalConstant
decimalConstant: A decimal number in the range 0 .. 32767
hexadecimalConstant: ('0x' | '0X') (digit | hexDigit)*
                     Valid range 0x0000 .. 0xFFFF
hexDigit: 'A' through 'F', upper or lower case
characterConstant: ''' Unicode character '''


For simplicity, this syntax allows the degenerate hexadecimal constant "0x", which should be interpreted as 0.

The Unicode character in characterConstant can itself be a single quote: ''' is correct syntax for a single quote character, while '' is illegal syntax.
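As a rough illustration of the grammar above, here is a minimal sketch of how a tokenizer might recognize the three kinds of constant. It is written in Python on the assumption that the compiler is implemented in that language; the pattern names and the classify helper are hypothetical and not part of the book's API.

```python
import re

# Hypothetical patterns for the extended grammar.
# Hex: '0x' or '0X' followed by zero or more hex digits, so "0x" alone matches.
HEX_CONST = re.compile(r"0[xX][0-9A-Fa-f]*")
# Decimal: one or more decimal digits.
DEC_CONST = re.compile(r"[0-9]+")
# Character: a quote, exactly one character (which may be a quote), a quote.
CHAR_CONST = re.compile(r"'.'")


def classify(lexeme):
    """Return the grammar rule that the whole lexeme matches, or None."""
    if HEX_CONST.fullmatch(lexeme):
        return "hexadecimalConstant"
    if DEC_CONST.fullmatch(lexeme):
        return "decimalConstant"
    if CHAR_CONST.fullmatch(lexeme):
        return "characterConstant"
    return None
```

This sketch only checks the shape of a lexeme; range checking (0 .. 32767 for decimal, 0x0000 .. 0xFFFF for hexadecimal) would still have to happen when the value is computed.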

Code changes

Changes are required to both the JackTokenizer and CompilationEngine modules.

Hexadecimal and character constants are just new ways to specify integer constants. Rather than adding new tokenType values and a separate method to return their values, JackTokenizer.tokenType() will return INT_CONST for hexadecimal and character constants.

For character constants, JackTokenizer.intVal() will return the numeric value of the Unicode character.
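One way the value conversion could look, again as a Python sketch under the assumption that the tokenizer has already validated the lexeme against the grammar. The function name int_val and its string argument are stand-ins for JackTokenizer.intVal(), whose real signature depends on your implementation.

```python
def int_val(lexeme):
    """Numeric value of a decimal, hexadecimal, or character constant.

    Assumes the lexeme has already been matched against the grammar,
    so no syntax checking is repeated here.
    """
    if lexeme[:2] in ("0x", "0X"):
        digits = lexeme[2:]
        # The degenerate constant "0x" has no digits and means 0.
        value = int(digits, 16) if digits else 0
        limit = 0xFFFF
    elif len(lexeme) == 3 and lexeme[0] == "'" and lexeme[2] == "'":
        # Character constant: the code point of the middle character.
        return ord(lexeme[1])
    else:
        value = int(lexeme, 10)
        limit = 32767
    if value > limit:
        raise ValueError("integer constant out of range: " + lexeme)
    return value
```

With this in place, int_val("'X'") gives 88 and int_val("0x1FC0") gives 8128, matching the numbers in the examples at the top of the post.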

Because JackTokenizer.intVal() can now return values > 32767, which cannot be handled by the "push constant N" VM command, CompilationEngine.compileExpression() will need to write special VM code when an INT_CONST token with value > 32767 is encountered.

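For values 0x8000 ≤ N ≤ 0xFFFF, the 16-bit complement satisfies 0x7FFF ≥ ~N ≥ 0x0000, so ~N (computed at compile time) is a legal operand for "push constant". The required VM code pushes the complement and then inverts it:

push constant ~N
not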
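The large-constant case can be sketched as follows: push the 16-bit complement, which always fits in 0 .. 32767, and then apply the VM "not" command to recover the original bit pattern. This is a Python sketch; push_int_constant is a hypothetical helper, and it returns the VM commands as a list of strings rather than calling a real VMWriter.

```python
def push_int_constant(n):
    """Return the VM commands that leave the 16-bit constant n on the stack."""
    if not 0 <= n <= 0xFFFF:
        raise ValueError("constant does not fit in 16 bits: %d" % n)
    if n <= 32767:
        # Small constants can be pushed directly.
        return ["push constant %d" % n]
    # For 0x8000 .. 0xFFFF the 16-bit complement lies in 0 .. 0x7FFF.
    complement = ~n & 0xFFFF
    return ["push constant %d" % complement, "not"]
```

For example, 0xFF00 becomes "push constant 255" followed by "not", which is the same code the compiler already produces for the Jack expression ~255.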

Test Code

/** Test code for hexadecimal and character constants compiler modification.
 *  Should print:
 *      Character constants 'OK'
 *      0xAbC OK
 *      0X1DeF OK
 *      0xFF00 OK
 *      0x8000 OK
 */

class Main {
    function void main() {
        do Output.printString("Character constants ");
        do Output.printChar(''');
        do Output.printChar('O');
        do Output.printChar('K');
        do Output.printChar(''');
        do Output.println();

        if (0xAbC = 2748) {     // note mixed case in hex constant
            do Output.printString("0xAbC OK");
        } else {
            do Output.printString("0xAbC FAIL");
        }
        do Output.println();

        if (0X1DeF = 7663) {    // note mixed case in hex constant
            do Output.printString("0X1DeF OK");
        } else {
            do Output.printString("0X1DeF FAIL");
        }
        do Output.println();

        if (0xFF00 = ~255) {
            do Output.printString("0xFF00 OK");
        } else {
            do Output.printString("0xFF00 FAIL");
        }
        do Output.println();

        if (0x8000 = ~32767) {
            do Output.printString("0x8000 OK");
        } else {
            do Output.printString("0x8000 FAIL");
        }
        do Output.println();

        return;
    }
}