Symbol table and identifiers other than variables

rgravina
The book makes it pretty clear that variables must be stored in the symbol table (as shown in Fig 11.6). For each variable you store its name, its type (int, boolean, class name etc.), its kind (static, field, arg or var) and the index assigned to it when it is declared. For methods you also define 'this' as the first argument. This all makes sense to me so far.
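
To make the description above concrete, here is a minimal sketch of such a table in Python. The names (Symbol, SymbolTable, define, lookup, start_subroutine) and the Point class are illustrative assumptions, not the book's API; the sketch only assumes that static/field variables live at class level and arg/var variables are reset for each subroutine.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    type: str   # "int", "boolean", "char", or a class name
    kind: str   # "static", "field", "arg", or "var"
    index: int  # running index within its kind


class SymbolTable:
    """Two scopes: class level (static, field) and subroutine level (arg, var)."""

    def __init__(self):
        self.class_scope = {}
        self.subroutine_scope = {}
        self.counts = {"static": 0, "field": 0, "arg": 0, "var": 0}

    def start_subroutine(self):
        # Subroutine-level symbols are discarded for each new subroutine.
        self.subroutine_scope = {}
        self.counts["arg"] = 0
        self.counts["var"] = 0

    def define(self, name, type_, kind):
        # Each kind has its own counter, which becomes the variable's index.
        class_level = kind in ("static", "field")
        scope = self.class_scope if class_level else self.subroutine_scope
        scope[name] = Symbol(type_, kind, self.counts[kind])
        self.counts[kind] += 1

    def lookup(self, name):
        # Subroutine scope shadows class scope; None means "not a variable".
        return self.subroutine_scope.get(name) or self.class_scope.get(name)


table = SymbolTable()
table.define("x", "int", "field")
table.start_subroutine()
table.define("this", "Point", "arg")  # methods only: 'this' is argument 0
table.define("dx", "int", "arg")
```

With this layout, a name that lookup() cannot find is simply not a variable, which is all a one-pass compiler needs to know about it.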

However, the text about the symbol table mentions all identifiers (so this adds class names and function/method/constructor names... i.e. anything which is printed as <identifier>foo</identifier> in the XML output) should be stored in the symbol table. The "Stage 1: Symbol Table" advice is to store/print the identifier category (static, field, arg, var, class or subroutine) so it suggests "class" and "subroutine" should also be stored. But these don't appear in the example in Fig 11.6.

My symbol table is a hashmap/dictionary between a string (the variable name) and an int (the index, which will become the offset from the static/this/arg/local segments). But for class and subroutine names, what useful information can be stored about them?
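
As a small sketch of how the kind and index turn into a segment offset, assuming the usual mapping of static/field/arg/var onto the VM's static/this/argument/local segments (the function name push_variable is hypothetical):

```python
# Assumed mapping from symbol kind to VM memory segment.
SEGMENT_OF_KIND = {
    "static": "static",
    "field": "this",
    "arg": "argument",
    "var": "local",
}


def push_variable(kind, index):
    """Build the VM command that pushes a variable's value onto the stack."""
    return f"push {SEGMENT_OF_KIND[kind]} {index}"


print(push_variable("field", 0))  # push this 0
print(push_variable("var", 2))    # push local 2
```

This is why the kind has to be stored next to the index: the index alone does not say which segment it is an offset into.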

Thanks!

Re: Symbol table and identifiers other than variables

rgravina
Actually... I think I've realised that I'm not storing some information I should be (though I'm still not sure what to store for classes/functions/methods).

Currently I'm just printing out for each variable...

* name (the identifier name e.g. a, x)
* type (int, boolean, char, class name)
* kind (static, field, arg, var)
* index (offset)

...but these should be stored in the symbol table also. So classes and function names could be stored as

* name (the identifier name, e.g. MyClass, someFunction, someMethod)
* type (not applicable? the return type? the class name (for methods)?)
* kind (class, method, function, constructor)
* index (not applicable?)

This could go in a structure similar to the following (shown here as JSON). Note: since there is only one symbol table per class, we know that every method belongs to that class.

{
  "x": {"type": "int", "kind": "field", "index": 0},
  "MyClass": {"kind": "class"},
  "someFunction": {"kind": "function"},
  "someMethod": {"kind": "method"}
}

I think maybe I'll keep trying to move forward/generate VM code and then add whatever is needed to the symbol table as I go.

Re: Symbol table and identifiers other than variables

cadet1620
Administrator
If you write the Jack compiler as suggested—using only one pass—you will not need to store class name and subroutine names in your symbol table.
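
One way this can work in a single pass, sketched here in Python under my own assumptions (the function name resolve_call, the tuple layout of the variables table, and the Point/Main/Math example names are illustrative): when compiling a call, look the qualifier up in the variable table. If it is there, it is an object and the call is a method call; if it is not, assume it is a class name. An unqualified call is treated as a method on the current object.

```python
SEGMENTS = {"static": "static", "field": "this", "arg": "argument", "var": "local"}


def resolve_call(variables, current_class, qualifier, subroutine, n_args):
    """Return (object_push, call_command) for a subroutine call.

    variables maps a variable name to (type, kind, index).
    object_push is emitted before the argument expressions, or is None.
    """
    if qualifier is None:
        # bar(...): treated as a method of the current class, called on 'this'
        return "push pointer 0", f"call {current_class}.{subroutine} {n_args + 1}"
    if qualifier in variables:
        # obj.bar(...): the qualifier is a variable, so this is a method call
        type_, kind, index = variables[qualifier]
        push = f"push {SEGMENTS[kind]} {index}"
        return push, f"call {type_}.{subroutine} {n_args + 1}"
    # Foo.bar(...): not a known variable, so assume Foo is a class name
    return None, f"call {qualifier}.{subroutine} {n_args}"


variables = {"p": ("Point", "var", 0)}  # hypothetical declaration: var Point p;
print(resolve_call(variables, "Main", "p", "getX", 0))
print(resolve_call(variables, "Main", "Math", "sqrt", 1))
```

Nothing about Point.getX or Math.sqrt is ever looked up, so a misspelled subroutine name goes unnoticed until the VM code is run.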

A multi-pass compiler could add information about the class(es) it encounters and their subroutines so that it could do argument type checking and issue meaningful error messages. The compiler could also determine if a method or function/constructor didn't exist or was called using the wrong syntax or number of arguments.

"type" for subroutines could be an array of types where type[0] would be the return type and type[n] would be the type of argument n. (Depending on the language you are using, it might be better to have 'type' be the return type and add an 'args' field to the symbol entries.)
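
A rough sketch of the second layout (a return type plus an 'args' field) and the kind of checks it would allow. SubroutineEntry, check_call and the error wording are hypothetical, and this assumes a first pass has already collected the entries:

```python
from dataclasses import dataclass, field


@dataclass
class SubroutineEntry:
    kind: str                                 # "method", "function", or "constructor"
    return_type: str
    args: list = field(default_factory=list)  # argument types, in order


def check_call(subroutines, name, n_args, called_on_object):
    """Return an error message, or None if the call looks valid."""
    entry = subroutines.get(name)
    if entry is None:
        return f"subroutine {name} does not exist"
    if (entry.kind == "method") != called_on_object:
        return f"{name} is a {entry.kind} and was called using the wrong syntax"
    if len(entry.args) != n_args:
        return f"{name} expects {len(entry.args)} argument(s), got {n_args}"
    return None


subroutines = {"MyClass.someMethod": SubroutineEntry("method", "void", ["int"])}
print(check_call(subroutines, "MyClass.someMethod", 2, True))
```

Checking the argument types themselves would additionally need the type of each argument expression, which this sketch leaves out.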

I suspect that the supplied Jack compiler must be a multi-pass compiler because of the types of errors it detects.

--Mark


Re: Symbol table and identifiers other than variables

rgravina
> If you write the Jack compiler as suggested—using only one pass—you will not need to store class name and subroutine names in your symbol table.

Thanks, this really helped clear things up :).

> A multi-pass compiler could add information about the class(es) it encounters and their subroutines so that it could do argument type checking and issue meaningful error messages. The compiler could also determine if a method or function/constructor didn't exist or was called using the wrong syntax or number of arguments.

It's good to know that the info is useful for doing type checking on method calls etc.

Onward to code generation!