<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.66">
<TITLE>MSTA (syntax description translator): Generated code</TITLE>
<LINK HREF="msta-4.html" REL=next>
<LINK HREF="msta-2.html" REL=previous>
<LINK HREF="msta.html#toc3" REL=contents>
</HEAD>
<BODY>
<A HREF="msta-4.html">Next</A>
<A HREF="msta-2.html">Previous</A>
<A HREF="msta.html#toc3">Contents</A>
<HR>
<H2><A NAME="s3">3.</A> <A HREF="msta.html#toc3">Generated code</A></H2>
<P>MSTA translates a specification, as described in the previous section,
into an implementation file and an optional interface file. Both have the
same base name as the specification file, with the suffix `.h' for the
interface file and `.c' (C code) or `.cpp' (C++ code) for the
implementation file. By default the interface file is not generated.</P>
<H2><A NAME="ss3.1">3.1</A> <A HREF="msta.html#toc3.1">C code</A>
</H2>
<P>The interface and implementation files consist of the following
definitions of generated macros, types, and functions. Unless something
specific to an MSTA scanner is mentioned, the scanner objects have
the same meaning and the same names with an additional `s' or `S' after
the prefix `yy' or `YY':
<DL>
<DT><B>YYSTYPE</B><DD>
<P>By default this is a macro whose value is the type used to
represent the parser attributes. By default the macro is
defined as `int'. You can redefine the macro by placing your
definition before the standard definition of the
macro.</P>
<P>If the construction `%union' is present in the specification
file, YYSTYPE is a type definition of a union containing the code
written inside the `%union' construction.</P>
<P>The definition of YYSTYPE is placed in the interface file if
option `-d' is given on the MSTA command line. Otherwise the
definition is placed in the implementation file. YYSTYPE is
part of the YACC POSIX standard.</P>
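<P>A minimal sketch of the redefinition mechanism, assuming the standard
definition is guarded with `#ifndef' (a common YACC convention that is
not confirmed here for MSTA). The helper `attribute_size' is
hypothetical and exists only to make the effect observable.</P>

```c
/* User definition: it must appear before the standard definition. */
#define YYSTYPE double

/* Assumed shape of the standard definition: it applies only when the
   user has not defined YYSTYPE already. */
#ifndef YYSTYPE
#define YYSTYPE int
#endif

/* The attribute variable is declared with whatever YYSTYPE resolves to. */
YYSTYPE yylval;

/* Hypothetical helper: reports the size of the attribute type in use. */
unsigned long attribute_size(void)
{
    return (unsigned long) sizeof(yylval);
}
```

<P>With the user definition in place, `yylval' has type `double' rather
than the default `int'.</P>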
<DT><B>yychar</B><DD>
<P>This variable contains the code of the current token. The
current token is not necessarily the latest token read, because
MSTA can look ahead by more than one token. The codes are returned
by the scanner function `yylex'. If option `-d' is present on the
command line (see MSTA usage), an external definition of the
variable is also placed in the interface file. The variable is
part of the YACC POSIX standard.</P>
<DT><B>yylval</B><DD>
<P>This variable is used to exchange information between the parser
and the scanner. The scanner must return the attribute of the
latest token read in this variable. After that the variable
contains the attribute of the current token, i.e. the token whose
code is in the variable `yychar'. The variable `yylval' is declared
with type YYSTYPE. If option `-d' is present on the command
line (see MSTA usage), an external definition of the variable is
also placed in the interface file. The variable is part of the
YACC POSIX standard.</P>
<DT><B>YYDEBUG</B><DD>
<P>The parser generated by MSTA contains code for diagnostics. The
compilation of this runtime debugging code is controlled by
YYDEBUG, a preprocessor symbol. If YYDEBUG has a
nonzero value, the debugging code is included. If its
value is zero, the code is not included. The macro is
part of the YACC POSIX standard.</P>
<DT><B>yydebug</B><DD>
<P>In a parser where the debugging code has been included (see
macro YYDEBUG), the variable `yydebug' can be used to turn
debugging on (with a nonzero value) and off (zero value) at
run time. The initial value of yydebug is zero. If option
`-d' is present on the command line (see MSTA usage), an
external definition of the variable is also placed in the
interface file. The variable is part of the YACC POSIX
standard.</P>
<DT><B>int yyparse ()</B><DD>
<P>This function is the main function of an MSTA parser. It
parses the token sequence whose codes are returned
by the user-defined function `yylex' and whose attributes, if any,
are placed in the variable `yylval'. The function returns 0 if
the parser finished its work successfully. A nonzero return status
means that the parser found unrecoverable errors (or the macro
YYABORT was executed explicitly). This function is part of the
YACC POSIX standard.</P>
<P>This function has the name `yylex' for an MSTA scanner. It
scans the sequence of characters (tokens in the
terminology of the MSTA specification file) whose codes
are returned by the function `yyslex' and whose attributes, if
any, are placed in the variable `yyslval'. The function returns 0
if the scanner finished its work successfully and reached the end of
the input stream. A negative return status means that the
scanner found unrecoverable errors (or the macro YYSABORT was
executed explicitly). This function can be called many times
to get the next token. The code of the next token should be
returned by `return' statements in the actions. The input
stream (look-ahead characters) is preserved from one call of
`yylex' to the next.</P>
<DT><B>int yylex ()</B><DD>
<P>This function is external to the MSTA parser, and the
user must provide it. Each call of the function should
return the code of the next input token. If the end of input is
reached, the function should return zero (-1 for `yyslex').
The attribute of the token whose code is returned should
be passed through the variable `yylval'. In
the case of an MSTA scanner, the function `yyparse' has the name
`yylex'.</P>
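<P>The contract above can be illustrated with a small hand-written
`yylex'. This is only a sketch: the token code NUMBER, the value 257,
the input buffer, and the helper `set_input' are hypothetical, and in a
real program the token codes come from the MSTA specification.</P>

```c
#include <ctype.h>

#define NUMBER 257          /* hypothetical token code */
#define YYSTYPE int         /* default attribute type described above */

YYSTYPE yylval;             /* attribute of the latest token read */

static const char *input = "";   /* hypothetical input buffer */

/* Hypothetical helper: points the scanner at a new string. */
void set_input(const char *text)
{
    input = text;
}

/* Returns the code of the next token, or 0 at end of input. */
int yylex(void)
{
    while (*input == ' ' || *input == '\t')
        input++;                         /* skip blanks */
    if (*input == '\0')
        return 0;                        /* end of input */
    if (isdigit((unsigned char) *input)) {
        int value = 0;

        while (isdigit((unsigned char) *input))
            value = value * 10 + (*input++ - '0');
        yylval = value;                  /* attribute goes through yylval */
        return NUMBER;
    }
    return (unsigned char) *input++;     /* a character is its own code */
}
```

<P>For the input "12 + 3" successive calls return NUMBER (with `yylval'
equal to 12), the code of `+', NUMBER (with `yylval' equal to 3), and
then zero.</P>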
<DT><B>void yylex_start (int *error_code)</B><DD>
<P>The function `yylex_start' is generated only for an MSTA
scanner. It should be used to initialize the
scanner. A nonzero value returned through the parameter means
that there was a memory allocation error for the scanner
(this is a fatal error). The function is not part of the YACC
POSIX standard.</P>
<DT><B>yyprev_char</B><DD>
<P>Its value is the code of the latest shifted token (character).
Usually the value is used to form the internal representation
of tokens (e.g. the internal representation of an identifier or a
number value). The variable is not part of the YACC POSIX standard.</P>
<DT><B>YYACCEPT</B><DD>
<P>The macro YYACCEPT causes the parser to return with the
value zero, which means the parser finished normally. The macro
is part of the YACC POSIX standard.</P>
<DT><B>YYABORT</B><DD>
<P>The macro YYABORT causes the parser to return with a
nonzero value (1 for an MSTA parser, and -1 for the macro YYSABORT
of an MSTA scanner), which means the parser finished abnormally. The
macro is part of the YACC POSIX standard.</P>
<DT><B>yyerror</B><DD>
<P>When the parser detects a syntax error in its normal state,
it calls the external function `yyerror' with a string
argument whose value is defined by the macro YYERROR_MESSAGE.
The user must provide the function `yyerror' in order to build the
parser program. After that the parser switches to recovery mode. The
parser is considered to be recovering from a previous error
until it has shifted over at least
YYERR_RECOVERY_MATCHES normal input tokens since the last
error was detected, or a semantic action has executed the
macro `yyerrok'. The function is part of the YACC POSIX
standard.</P>
<P>Recovery mode consists of one or more steps. Each recovery
step starts by searching for the topmost stack state in
which a shift on the special symbol `error' is possible. This
state becomes the top stack state, and the shift on `error' is
made. After that the parser discards all tokens which cannot
follow the symbol `error' in this state, until it meets one that can
(a so-called stop symbol). After that any recognized syntax error
results in a new error recovery step. This is the standard
YACC error recovery technique. It may result in infinite
looping of the parser, or in discarding all input tokens if the
stop symbols are never met.</P>
<P>By default MSTA generates the standard YACC error recovery.
There are two additional error recovery methods which MSTA can
generate.</P>
<P>The first one is a local error recovery which does not permit
infinite parser looping and uses the context after several errors
as stop symbols. With this method the set of stop symbols also
includes the look-ahead tokens after the token `error' in states
in which the `error' token is acceptable and which are
lower in the parser stack than the first state with an
acceptable token `error'. In this case feedback from the
parser to the scanner may not work correctly because,
although rule actions are executed, the parser
reads each token only once.</P>
<P>
The second one is a minimal cost error recovery, where the
cost is the overall number of tokens ignored. Feedback from
the parser to the scanner does not work correctly, so you
should not use this method when there is such feedback.
Calling `yyerrok' makes no sense with this method because the
parser in this recovery mode never executes the rule actions.
This method gives the best quality of error recovery, although it may
be expensive because in the worst case it might save
all input tokens.</P>
<DT><B>YYERROR_MESSAGE</B><DD>
<P>The macro value is used as the argument of the function `yyerror'
when a syntax error occurs. The default value of the macro is
"syntax error" ("lexical error" for a scanner). You can
redefine its value, but in any case the value should be a
string. The macro is not part of the YACC POSIX standard.</P>
<DT><B>YYERR_RECOVERY_MATCHES</B><DD>
<P>The parser is considered to be recovering from a previous
error until it has shifted over at least
YYERR_RECOVERY_MATCHES normal input tokens since the last
error was detected, or a semantic action has executed the
macro `yyerrok'. The default value of the macro is 3. You can
redefine its value, but in any case the value must be
positive. This macro is not part of the YACC POSIX standard.</P>
<DT><B>YYERR_MAX_LOOK_AHEAD_CHARS</B><DD>
<P>This macro is generated only when the local error recovery mode is
used. The default value is 7. The value cannot be less than 1.
See the description below.</P>
<DT><B>YYERR_LOOK_AHEAD_INCREMENT</B><DD>
<P>This macro is generated only when the local error recovery
mode is used. The default value is 3. The value cannot be
less than 0. See the description below.</P>
<DT><B>YYERR_POPPED_ERROR_STATES</B><DD>
<P>This macro is generated only when the local error recovery
mode is used. The default value is 2. The value cannot be
less than 0. See the description below.</P>
<DT><B>YYERR_DISCARDED_CHARS</B><DD>
<P>This macro is generated only when the local error recovery
mode is used. The default value is 3. The value cannot be
less than 0. See the description below.</P>
<DT><B>yydeeper_error_try</B><DD>
<P>This macro and the preceding ones (YYERR_MAX_LOOK_AHEAD_CHARS through
YYERR_DISCARDED_CHARS) are used only when the local error
recovery is generated. Before describing the
local error recovery, recall how YACC error recovery
works. When the parser recognizes a syntax error, it
switches into error recovery mode. Error recovery itself
consists of one or more steps. Each step consists of finding
the topmost state on the stack in which a shift on the
pseudo-token `error' is possible, popping all states above that
state, and shifting the `error' token. After
that all tokens are discarded until a token which can follow
the pseudo-token `error' (a so-called stop symbol) is read.
After that any recognized error results in a new error
recovery step. Finally, error recovery is switched
off only when YYERR_RECOVERY_MATCHES (by default 3) tokens
have been shifted without a syntax error occurring.</P>
<P>The local error recovery differs from classic YACC error
recovery in the following ways:
<UL>
<LI>The parser saves all tokens discarded in error
recovery mode and returns them to the input stream
on each local error recovery step.
</LI>
<LI>Only YYERR_LOOK_AHEAD_INCREMENT tokens can be
discarded on the first step, 2 * YYERR_LOOK_AHEAD_INCREMENT
on the second step, and so on (but no more than
YYERR_MAX_LOOK_AHEAD_CHARS tokens).
</LI>
<LI>If the parser needs to discard more tokens than is
possible on the current step, a new local error recovery step starts.
Moreover, if the action `yydeeper_error_try' was executed
on the previous step, the new step starts searching for the
error state on the stack from a state which is deeper
than the error state of the previous error recovery step.
Otherwise, as usual, searching for the error state starts
from the top of the stack.
</LI>
<LI>On every YYERR_POPPED_ERROR_STATES-th error recovery step
(and correspondingly on every YYERR_POPPED_ERROR_STATES-th
processing of the error state), the parser discards
YYERR_DISCARDED_CHARS tokens without saving them before
searching for the stop symbols.</LI>
</UL>
Unlike the standard YACC error recovery, which MSTA generates by
default, the local error recovery does not permit infinite parser
looping and uses the context after several errors as stop symbols.
The following fragment illustrates the use of the local error
recovery mode.
<BLOCKQUOTE><CODE>
<PRE>
#define YYERR_END_RECOVERY() yyerr_end_recovery()
...
program :
| program function
...
function : ...
| error END FUNCTION
{yyerror ("error in function");}
...
statement : ...
| error
{
yyerror ("error in statement");
...
}
...
expression : ...
| error
{
yyerror ("error in expression");
...
}
...
yyerror (char *s)
{
/* save string s */
}
yyerr_end_recovery ()
{
/* print last saved error message. */
}
</PRE>
</CODE></BLOCKQUOTE>
Note that the action of the error rule for `function' does not use
the macro `yydeeper_error_try'. This guarantees that the whole
program will be processed.</P>
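<P>The growth of the discard limit described in the list above can be
sketched as follows. The helper `discard_budget' is hypothetical and is
not part of the generated code; the macro values are the defaults quoted
earlier, and counting recovery steps from 1 is an assumption.</P>

```c
#define YYERR_LOOK_AHEAD_INCREMENT 3   /* default value from the text */
#define YYERR_MAX_LOOK_AHEAD_CHARS 7   /* default value from the text */

/* Hypothetical helper: how many tokens may be discarded on the given
   local error recovery step (steps are counted from 1). */
int discard_budget(int step)
{
    int budget = step * YYERR_LOOK_AHEAD_INCREMENT;

    /* The limit never grows beyond YYERR_MAX_LOOK_AHEAD_CHARS. */
    if (budget > YYERR_MAX_LOOK_AHEAD_CHARS)
        budget = YYERR_MAX_LOOK_AHEAD_CHARS;
    return budget;
}
```

<P>With the default values the limit is 3 tokens on the first step, 6 on
the second, and 7 on every later step.</P>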
<DT><B>YYRECOVERING()</B><DD>
<P>The macro YYRECOVERING serves to determine which state
the parser is in. The macro returns 1 if a syntax error
has been detected and the parser has not yet fully recovered
from it. Otherwise, zero is returned. The macro is
part of the YACC POSIX standard.</P>
<DT><B>YYERROR</B><DD>
<P>The parser detects a syntax error when it is in a state where
the action associated with the lookahead symbol(s) is error.
A semantic action can cause the parser to initiate error
handling by executing the macro YYERROR. When YYERROR is
executed, the semantic action passes control back to the
parser. YYERROR can be placed only in the semantic action
itself (not in a function called from the semantic action).
The only difference between an error detected in the parser
input and an error caused by the macro YYERROR is that the function
`yyerror' is not called in the second case. The macro is
part of the YACC POSIX standard.</P>
<DT><B>yynerrs</B><DD>
<P>This variable contains the number of times the
parser has switched from the normal state to error recovery. The switch
is caused by detecting an error in the input or by executing the
macro YYERROR. The variable is not part of the YACC POSIX
standard. In the case of an MSTA scanner, the variable
accumulates the number over all calls of `yylex'.</P>
<DT><B>yyerrok</B><DD>
<P>This macro can be used only in a semantic action itself. The
macro causes the parser to act as if it has fully recovered
from any previous errors. The macro is part of the YACC POSIX
standard. The macro makes no sense with the minimal cost error
recovery method because the parser in that recovery mode never
executes the rule actions.</P>
<DT><B>YYERR_END_RECOVERY()</B><DD>
<P>This macro is called when the parser switches from the
recovery state into the normal state. By default the macro does
nothing. You can redefine this macro, e.g. to output the
last error buffered by your `yyerror' function in order to
implement better error diagnostics in the local
recovery mode. The macro is not part of the YACC POSIX
standard, and it is not generated when YACC error
recovery is used.</P>
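<P>A minimal sketch of the buffering idea mentioned above: `yyerror'
only remembers the latest message, and the function behind
YYERR_END_RECOVERY reports it once recovery has ended. The buffer, its
size, the counters, and the `const char *' parameter type are
assumptions made for this illustration.</P>

```c
#include <stdio.h>
#include <string.h>

static char saved_message[256];   /* last message passed to yyerror */
static int message_pending = 0;   /* nonzero while a message is buffered */
static int messages_printed = 0;  /* number of messages actually reported */

/* Buffers the message instead of printing it; a later call overwrites
   an earlier one. */
void yyerror(const char *s)
{
    strncpy(saved_message, s, sizeof(saved_message) - 1);
    saved_message[sizeof(saved_message) - 1] = '\0';
    message_pending = 1;
}

/* Prints the last buffered message, if any, once recovery has ended. */
void yyerr_end_recovery(void)
{
    if (message_pending) {
        fprintf(stderr, "%s\n", saved_message);
        messages_printed++;
        message_pending = 0;
    }
}

#define YYERR_END_RECOVERY() yyerr_end_recovery()
```

<P>If several error rules fire during one recovery, only the message of
the last one is reported, so the user sees one diagnostic per recovered
error instead of one per recovery attempt.</P>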
<DT><B>YYERRCODE</B><DD>
<P>The token `error' is reserved for error handling. The name
`error' can be used in grammar rules. It indicates places
where the parser can recover from a syntax error. The
default value of `error' is 256. Its value can be
changed using a %token declaration. In any case the code of
the token `error' is the value of the macro YYERRCODE.</P>
<DT><B>yyclearin</B><DD>
<P>This macro causes the parser to discard the current lookahead
token. If the current lookahead token has not yet been read,
yyclearin has no effect. The macro is part of the YACC POSIX
standard.</P>
<DT><B>YYALLOC, YYREALLOC, YYFREE</B><DD>
<P>MSTA allocates memory for the state and attribute
stacks. Moreover, the stacks can be expandable (with the aid of
YYREALLOC). The macro values are used for allocating, reallocating,
and freeing the stack memory. The default values of the macros
are the standard C functions malloc, realloc, and free. You can
redefine these values. The macros are not part of the YACC POSIX
standard.</P>
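<P>As an illustration, the macros could be redefined to route stack
memory through counting wrappers. This is a sketch under the assumption
that the generated code calls the macros with the same arguments as
malloc, realloc, and free; the wrapper names and the counter are
hypothetical.</P>

```c
#include <stdlib.h>

static long live_blocks = 0;   /* hypothetical count of outstanding blocks */

/* Hypothetical wrappers with the signatures of malloc/realloc/free. */
void *counting_alloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        live_blocks++;
    return p;
}

void *counting_realloc(void *old, size_t size)
{
    void *p = realloc(old, size);

    if (p != NULL && old == NULL)
        live_blocks++;          /* realloc(NULL, n) acts like malloc(n) */
    return p;
}

void counting_free(void *p)
{
    if (p != NULL)
        live_blocks--;
    free(p);
}

#define YYALLOC   counting_alloc
#define YYREALLOC counting_realloc
#define YYFREE    counting_free
```

<P>A nonzero counter after the parser returns would then point to stack
memory that was never released.</P>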
<DT><B>YYSTACK_SIZE</B><DD>
<P>The macro value is the initial size of the state and attribute stacks
of the parser. If a stack overflows, the macro YYABORT is
executed when option -no-expand is used. Otherwise, the
stacks are expanded. It is better to use left recursion in
grammar rules in order to avoid stack overflow.
The default value of the macro is 500. You can redefine this
value. The macro is not part of the YACC POSIX standard.</P>
<DT><B>YYMAX_STACK_SIZE</B><DD>
<P>The macro value is the maximal size of the state and attribute stacks
of the parser. The macro is used only when the stacks are
expandable. If a stack would grow beyond this size (possibly after
several stack expansions), the macro YYABORT is executed.
Otherwise, the stacks are expanded. The default value of the
macro is 5000. You can redefine this value. The macro is
not part of the YACC POSIX standard.</P>
<DT><B>YYMAX_STACK_EXPAND_SIZE</B><DD>
<P>The macro value is the step by which the state and attribute stacks
are expanded. The macro is used only when the stacks are
expandable. The default value of the macro is 100. You can
redefine this value. The macro is not part of the YACC POSIX
standard.</P>
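<P>The interplay of YYSTACK_SIZE, YYMAX_STACK_SIZE, and
YYMAX_STACK_EXPAND_SIZE can be sketched as a simple growth policy. Both
helpers are hypothetical, and the exact boundary behaviour (giving up
when one more step would exceed the maximum) is an assumption rather
than the confirmed behaviour of the generated code.</P>

```c
#define YYSTACK_SIZE            500   /* default initial size */
#define YYMAX_STACK_SIZE        5000  /* default maximal size */
#define YYMAX_STACK_EXPAND_SIZE 100   /* default expansion step */

/* Hypothetical helper: returns the stack size after one expansion, or 0
   when the expansion would exceed YYMAX_STACK_SIZE (the situation in
   which the text says YYABORT is executed). */
int next_stack_size(int current_size)
{
    int grown = current_size + YYMAX_STACK_EXPAND_SIZE;

    if (grown > YYMAX_STACK_SIZE)
        return 0;
    return grown;
}

/* Hypothetical helper: number of expansions possible from the initial
   size before the limit is reached. */
int max_expansions(void)
{
    int size = YYSTACK_SIZE;
    int count = 0;

    while ((size = next_stack_size(size)) != 0)
        count++;
    return count;
}
```

<P>Under these assumptions and the default values, a stack can be
expanded 45 times, from 500 to 5000 elements, before the parser gives
up.</P>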
<DT><B>YYMSTA</B><DD>
<P>This macro, defined as 1, is generated in order to distinguish
parsers generated by YACC, BISON, and MSTA. Naturally the
macro is not part of the YACC POSIX standard.</P>
<DT><B>YYTOKEN_NAME(code)</B><DD>
<P>This macro returns a printable representation of the token with
the given code. The macro is not part of the YACC POSIX
standard.</P>
<DT><B>YYLAST_TOKEN_CODE</B><DD>
<P>The macro value is the maximal token code. The
macro is not part of the YACC POSIX standard.</P>
</DL>
</P>
<H2><A NAME="ss3.2">3.2</A> <A HREF="msta.html#toc3.2">C++ code</A>
</H2>
<P>The major advantage of C++ code is that it is quite easy to create
many parsers of one language (and consequently a reentrant parser).
This is useful for implementing modular languages and languages
with macro directives such as the C include directive.</P>
<P>Generated C++ code differs from C code in the following features:
<UL>
<LI>An abstract class `yyparser' is generated for a parser and
`yyscanner' for a scanner. The definition of the class is
also present in the interface file if the interface file is
generated (see MSTA usage).
</LI>
<LI>Variables `yylval' (`yyslval'), `yychar' (`yyschar'), and
`yydebug' (`yysdebug') are now public members of the class.
</LI>
<LI>Functions `yylex' (`yyslex') and `yyerror' (`yyserror') are
now abstract public virtual functions of the class.
</LI>
<LI>Functions `yyparse' for a parser and `yylex' for a
scanner are now public functions of the class.
</LI>
<LI>Function `yylex_start' for a scanner is not generated
because the constructor of the class
<BLOCKQUOTE><CODE>
<PRE>
yyscanner (int &)
</PRE>
</CODE></BLOCKQUOTE>
replaces the function.
</LI>
<LI>The class also contains a virtual destructor.</LI>
</UL>
</P>
<P>Usually the parser (scanner) itself is implemented as a subclass of
the class `yyparser' (`yyscanner'). This subclass contains the
definitions of the functions `yylex' (`yyslex') and `yyerror'
(`yyserror').</P>
<HR>
<A HREF="msta-4.html">Next</A>
<A HREF="msta-2.html">Previous</A>
<A HREF="msta.html#toc3">Contents</A>
</BODY>
</HTML>