Doesn't the Jack platform adhere to UTF rather than ANSI? (More than just the ANSI range is allowed?)
The book says "unicode", but I think I'd call it:
An unspecified 16-bit character set, with only the code points 32 - 129 defined.
Code points 32-127 are defined as corresponding to ASCII characters in that range.
Code point 128 is defined as the "new line" control code.
Code point 129 is defined as the "backspace" control code.
It's not Unicode: assigning new line and backspace to code points 128 and 129 violates the Unicode specification, which places those controls at U+000A and U+0008.
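To make that definition concrete, here is a minimal Python sketch of the mapping as described in this post. The function name `hack_to_unicode` and the choice to raise an error on undefined codes are my own assumptions for illustration, not anything taken from the book or the Jack OS.

```python
# Sketch of the character set described above (assumed from this post, not an official API).
NEWLINE = 128    # "new line" control code
BACKSPACE = 129  # "backspace" control code


def hack_to_unicode(code: int) -> str:
    """Translate a code point of the described set into a Python (Unicode) character."""
    if 32 <= code <= 127:
        return chr(code)  # identical to ASCII in this range
    if code == NEWLINE:
        return "\n"  # Unicode puts new line at U+000A
    if code == BACKSPACE:
        return "\b"  # Unicode puts backspace at U+0008
    raise ValueError(f"code point {code} is not defined in this character set")
```

For example, `hack_to_unicode(65)` gives `'A'`, while `hack_to_unicode(128)` gives `'\n'`, whose Unicode code point is 10 rather than 128.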
@cadet1620 Why do you call the code points 32–127 a subset of ANSI? I think it's better to speak here of good old plain ASCII: first because this is how Unicode is defined, and second because the critical differences between the ANSI encodings (ISO 8859-1, Windows-1252, …) all lie in how they use the ›upper part of the byte‹ (128–255).
I think it makes more sense to speak of a custom 8-bit (or 16-bit, if you want) charset that mostly agrees with ASCII.
Besides, a single memory cell of the Hack computer isn't capable of representing an arbitrary Unicode code point: the Unicode Standard defines 1,114,112 code points (U+0000–U+10FFFF), so one needs at least ceil(log_2(0x10FFFF)) = ceil(20.087) = 21 bits, while a Hack memory cell holds only 16.
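The bit-count arithmetic is easy to check in Python. This is only a sketch: the names `HACK_WORD_BITS` and `fits_in_hack_word` are made up for illustration, and treating the 16-bit cell as an unsigned value is an assumption.

```python
import math

MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point
HACK_WORD_BITS = 16        # width of one Hack memory cell

code_point_count = MAX_CODE_POINT + 1                # U+0000 through U+10FFFF
bits_by_log = math.ceil(math.log2(MAX_CODE_POINT))   # the formula used above
bits_exact = MAX_CODE_POINT.bit_length()             # integer-only cross-check


def fits_in_hack_word(code_point: int) -> bool:
    """True if the code point fits in one cell (assuming it is treated as unsigned)."""
    return 0 <= code_point < 2 ** HACK_WORD_BITS


print(code_point_count, bits_by_log, bits_exact)
print(fits_in_hack_word(ord("A")), fits_in_hack_word(0x1F600))
```

Both ways of counting give 21 bits, so anything above U+FFFF (such as U+1F600) cannot be stored in a single cell.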
@cadet1620 Why do you call the code points 32–127 a subset of ANSI?
Insufficiently caffeinated fingers this morning... 8-(
FWIW, quoted from the Unicode spec:
"The first 256 codes follow precisely the arrangement of ISO/IEC 8859-1 (Latin 1), of which 7-bit ASCII (ISO/IEC 646 IRV) accounts for the first 128 code positions."
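That relationship can be observed with Python's built-in codecs. A small sketch, with variable names chosen only for illustration:

```python
# Decode every possible byte value as Latin-1 and compare with Unicode code points.
decoded = bytes(range(256)).decode("latin-1")
latin1_matches_unicode = all(ord(ch) == i for i, ch in enumerate(decoded))

# The first 128 positions are exactly 7-bit ASCII.
ascii_text = bytes(range(128)).decode("ascii")
ascii_is_prefix = decoded[:128] == ascii_text

print(latin1_matches_unicode, ascii_is_prefix)
```

Both checks come out true: byte value i decoded as Latin-1 is the Unicode code point i, and the lower half is plain ASCII.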
(Back when I started with computers, bit 7 was a character parity bit, and character sets varied a bit from vendor to vendor. Standards make the world a better place these days. Unless you need to deal with EBCDIC!)