Parsing the 'term' non-terminal -- breaks rule of 1 read-ahead?

sakumar
If x is a variable name, parsing the <term> non-terminal requires two tokens of look-ahead in the following example:

<term> could begin with:
x        // just the variable
x[2]    // Array reference
x.y     // subroutine call

By the time the parser realizes that it is case 3 (subroutine call) it has already consumed the token for 'x'. So if it calls compileSubroutineCall(), the first token could be '.' which could be problematic for this function to deal with.

Page 216 of the text alludes to this problem, but I am not clear what an elegant solution would be. One idea I was thinking of would be to create an artificial token "x.y" called SUBROUTINE_NAME.  Then compileSubroutineCall() always expects this to be the first token. Since a subroutine-call is always from <term> the compileTerm() function could set this up regardless of the format of the subroutine name. Comments?

Re: Parsing the 'term' non-terminal -- breaks rule of 1 read-ahead?

cadet1620
Administrator
sakumar wrote
If x is a variable name, parsing the <term> non-terminal requires 2 read ahead in the following example:

<term> could begin with:
x        // just the variable
x[2]    // Array reference
x.y     // subroutine call

By the time the parser realizes that it is case 3 (subroutine call) it has already consumed the token for 'x'. So if it calls compileSubroutineCall(), the first token could be '.' which could be problematic for this function to deal with.
It's been about a year and a half since I wrote my compiler, but if memory serves, I set an objName variable when parsing terms so that I could pass it into compileCall(objName) when I encountered the '.'. compileCall would then parse the function name and argument list.

A statement such as do x.y(); would similarly have parsed the 'x' into objName and consumed the '.' before calling compileCall(objName).
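
A minimal Python sketch of the approach described above, not the actual compiler code: the identifier is consumed, saved, and passed into the call parser once the '.' is seen. The Parser class, its trace list `out`, and the one-token array index and arguments are all simplifying assumptions for illustration; a real compiler would parse full expressions there.

```python
class Parser:
    """Hypothetical parser over a list of token strings."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0
        self.out = []  # trace of what was recognized

    def peek(self):
        # Current token, or None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, symbol):
        tok = self.advance()
        if tok != symbol:
            raise SyntaxError(f"expected {symbol!r}, got {tok!r}")

    def compile_term(self):
        obj_name = self.advance()  # identifier is consumed before the case is known
        nxt = self.peek()
        if nxt == "[":
            self.advance()
            index = self.advance()  # assumption: single-token index
            self.expect("]")
            self.out.append(("array", obj_name, index))
        elif nxt == ".":
            self.advance()  # consume '.', then hand over the saved name
            self.compile_call(obj_name)
        else:
            self.out.append(("var", obj_name))

    def compile_call(self, obj_name):
        func_name = self.advance()
        self.expect("(")
        args = []
        while self.peek() != ")":
            tok = self.advance()
            if tok != ",":
                args.append(tok)  # assumption: single-token arguments
        self.expect(")")
        self.out.append(("call", obj_name, func_name, args))


p = Parser(["x", ".", "y", "(", "1", ",", "2", ")"])
p.compile_term()
print(p.out)
```

Because the saved name travels as an argument, compile_call never needs to see the 'x' token again, so one token of look-ahead is enough.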

--Mark

Re: Parsing the 'term' non-terminal -- breaks rule of 1 read-ahead?

sakumar
I did the same thing -- my compileSubroutineCall() function expects the current token to be either a DOT or an LPAREN.

So if the subroutine call is in the x.y() format, "x" must be in previous_object_name and the current token must be '.'

If the subroutine call is in the z() format, "z" must be in previous_object_name and the current token must be '('.

On entering compileSubroutineCall() it first checks to see whether the current token is DOT and if so, it moves forward two tokens and forms the "x.y" subroutine name. If the current token is not DOT then it assumes the subroutine name is wholly contained in previous_object_name. The rest of the subroutine call processing is the same for both cases.
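
As a rough illustration of that entry check (a sketch only, with a plain token list and position index standing in for my tokenizer, and single-token arguments instead of full expressions), the function could look like this in Python:

```python
def compile_subroutine_call(tokens, pos, previous_object_name):
    """tokens[pos] must be '.' or '('.

    Returns (subroutine_name, args, position_after_call).
    Hypothetical signature; error handling for truncated input is omitted.
    """
    if tokens[pos] == ".":
        # x.y(...) form: move forward two tokens and build "x.y".
        name = previous_object_name + "." + tokens[pos + 1]
        pos += 2
    else:
        # z(...) form: the name is wholly contained in previous_object_name.
        name = previous_object_name

    # From here on, both forms are handled identically.
    if tokens[pos] != "(":
        raise SyntaxError(f"expected '(', got {tokens[pos]!r}")
    pos += 1
    args = []
    while tokens[pos] != ")":
        if tokens[pos] != ",":
            args.append(tokens[pos])  # assumption: single-token arguments
        pos += 1
    return name, args, pos + 1


print(compile_subroutine_call([".", "y", "(", ")"], 0, "x"))
print(compile_subroutine_call(["(", "a", ",", "b", ")"], 0, "z"))
```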

By the way, I really appreciate your following up with responses to these questions. Since this course is not a MOOC where a bunch of people are doing it at the same time, participation in this forum is quite sparse and not many people are available to answer questions.

Re: Parsing the 'term' non-terminal -- breaks rule of 1 read-ahead?

Rather Iffy
In reply to this post by sakumar
Another approach is to build a tokenizer that can step back in the token stream.
When the compile engine parses a term and, for instance, detects a '[', it asks the tokenizer to step back 2 tokens, thereby creating the right starting situation for an auxiliary procedure 'compile_array_item' to be called next from the compile_term procedure.

I think the tokenizer should take as much work as possible off the compilation engine, which is the workhorse.
The engine should not have to remember previous tokens; it should concentrate on the right ordering of tokens and calls.

Adding the step-back feature to the tokenizer is easy when you choose the right data structure, for instance a stack.
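
A small Python sketch of such a tokenizer, assuming the input has already been split into a list of tokens. The class name and the methods has_more, advance, and step_back are illustrative choices, not part of any prescribed API. Two stacks are used: one for tokens still to come and one for tokens already handed out.

```python
class BacktrackingTokenizer:
    """Hypothetical tokenizer wrapper that can step back in the token stream."""

    def __init__(self, tokens):
        # Reverse so that the top of the stack is the next token to read.
        self._pending = list(reversed(list(tokens)))
        self._consumed = []  # stack of tokens already handed out

    def has_more(self):
        return bool(self._pending)

    def advance(self):
        if not self._pending:
            raise SyntaxError("unexpected end of input")
        tok = self._pending.pop()
        self._consumed.append(tok)
        return tok

    def step_back(self, count=1):
        # Move the last `count` consumed tokens back onto the pending stack.
        if count > len(self._consumed):
            raise ValueError("cannot step back past the start of the stream")
        for _ in range(count):
            self._pending.append(self._consumed.pop())


t = BacktrackingTokenizer(["x", "[", "2", "]"])
first, second = t.advance(), t.advance()  # 'x' and '['
if second == "[":
    t.step_back(2)  # compile_array_item can now start again at 'x'
print(t.advance())
```

The trade-off is that every consumed token stays in memory, which is negligible for source files of this size.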
