ParseXS: build an AST for each XSUB #23225
Open
iabyn wants to merge 148 commits into blead from davem/xs_refactor10
Add file and line_no fields to the base class of all Node types to record where that node was defined within the XS src file. (There aren't many node types yet, so this commit doesn't really do anything useful at the moment.)
Add these Node subclasses: ExtUtils::ParseXS::Node::multiline ExtUtils::ParseXS::Node::code ExtUtils::ParseXS::Node::PREINIT The first is a very generic base class for the many planned node types which will represent multi-line XS keywords, such as FOO: aaa bbb ccc The lines are just read into an array ref. ExtUtils::ParseXS::Node::code is a generic subclass of Node::multiline which represents keywords that contain sections of C code, such as PREINIT, PPCODE etc. It does extra processing such as skipping leading blank lines and wrapping the output in '#line ...'. ExtUtils::ParseXS::Node::PREINIT is a concrete subclass of Node::code which represents the PREINIT keyword. This is kind of a proof-of-concept; other such code keywords (CODE, PPCODE etc) will be added later. The net effect of this commit is to move processing of the PREINIT: keyword into the parse() and as_code() methods of the Node::PREINIT class (and/or its parents) and away from the print_section() and PREINIT_handler() methods in ExtUtils::ParseXS. This is intended as a small incremental step towards having a real AST. A PREINIT object is currently created, parsed, printed and destroyed all at the same point in time. In some future refactoring, the intention is that the object will be stored in a tree, and the parsing and code-emitting steps will be done at different times.
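The three-level hierarchy described above can be sketched roughly as follows. This is a minimal illustration, not the code in Node.pm: the `Sketch::` package names, the constructor, the keyword-detection regex and the line-number bookkeeping are all assumptions made for the example.

```perl
use strict;
use warnings;

# Simplified stand-ins for the real classes; the names mirror the commit
# message but the bodies are illustrative only.
package Sketch::Node;
sub new { my ($class, %args) = @_; return bless { %args }, $class }

package Sketch::Node::multiline;
our @ISA = ('Sketch::Node');

# Read lines off the queue into an array ref until the next "KEYWORD:" line.
sub parse {
    my ($self, $queue) = @_;
    $self->{lines} = [];
    while (@$queue && $queue->[0] !~ /^\s*[A-Z_]+\s*:/) {
        push @{ $self->{lines} }, shift @$queue;
    }
    return 1;
}

package Sketch::Node::codeblock;
our @ISA = ('Sketch::Node::multiline');

sub parse {
    my ($self, $queue) = @_;
    $self->SUPER::parse($queue);
    # Skip leading blank lines, keeping the line number in step.
    while (@{ $self->{lines} } && $self->{lines}[0] !~ /\S/) {
        shift @{ $self->{lines} };
        $self->{line_no}++;
    }
    return 1;
}

# Wrap the stored C code in a '#line' directive.
sub as_code {
    my ($self) = @_;
    return join "\n",
        qq{#line $self->{line_no} "$self->{file}"},
        @{ $self->{lines} }, '';
}

# The concrete keyword class needs nothing beyond what it inherits.
package Sketch::Node::PREINIT;
our @ISA = ('Sketch::Node::codeblock');

package main;

my @queue = ('', '    int i;', '    char *p;', 'CODE:', '    foo();');
my $node  = Sketch::Node::PREINIT->new(file => 'Foo.xs', line_no => 10);
$node->parse(\@queue);
my $code = $node->as_code;
print $code;
```

Here parsing and emitting are already separate method calls, even though, as the commit message notes, they currently happen back to back.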
Add these Node subclasses: ExtUtils::ParseXS::Node::multiline_merged ExtUtils::ParseXS::Node::C_ARGS ExtUtils::ParseXS::Node::INTERFACE ExtUtils::ParseXS::Node::INTERFACE_MACRO multiline_merged is a subclass of multiline which merges all the lines associated with a keyword into a single string. It is then used as a base class for the three concrete classes C_ARGS etc which correspond to keywords which are multi-line but treat all lines as a single string. The effect of this commit is to move more keyword processing out of ExtUtils::ParseXS and into Node.pm, in preparation for building an AST.
Rename the just-added ExtUtils::ParseXS::Node::code class to ExtUtils::ParseXS::Node::codeblock, as that better describes its purpose. (I would have updated the original commit, but reordering squashing commits was too complex due to intervening commits.)
Before this commit, three keywords which take ENABLE/DISABLE as their argument were exceedingly lax about what they would accept. This commit makes them slightly less lax: they now have to match an exact word, not a part word; i.e. the regex has changed from: /^(ENABLE|DISABLE)/i; to: /^(ENABLE|DISABLE)\b/i; Note that it still quietly ignores trailing garbage. So before both these lines were legal; now only the second is: PROTOTYPES: ENABLEaaa bsbsbs stbsbsb PROTOTYPES: EnablE bsbsbs stbsbsb This commit makes VERSIONCHECK, PROTOTYPES, EXPORT_XSUB_SYMBOLS match SCOPE, which already had the \b. This is in preparation for an upcoming commit, which will use a common method to parse such keywords. This commit also changes the test infrastructure slightly: the test_many() function no longer bails out if the eval fails; instead the eval error message is added to any STDERR text, accessible to tests which can now test that ParseXS did indeed call death().
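The difference between the two regexes quoted above can be checked in isolation. The sketch below only exercises the patterns themselves; the sample values are the ones from the commit message plus a plain `DISABLE`, and the `%accepted` bookkeeping is purely for illustration.

```perl
use strict;
use warnings;

# The two patterns quoted in the commit message.
my $old = qr/^(ENABLE|DISABLE)/i;
my $new = qr/^(ENABLE|DISABLE)\b/i;

# Keyword values as they would appear after "PROTOTYPES:".
my @values = ('ENABLEaaa bsbsbs stbsbsb', 'EnablE bsbsbs stbsbsb', 'DISABLE');

my %accepted;
for my $v (@values) {
    $accepted{$v} = {
        old => ($v =~ $old ? 1 : 0),
        new => ($v =~ $new ? 1 : 0),
    };
    printf "%-28s old=%d new=%d\n", $v, $accepted{$v}{old}, $accepted{$v}{new};
}
```

Only the part-word `ENABLEaaa` changes status; trailing text after a complete word is still accepted by both patterns.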
The legacy KEYWORD_handler() methods expect, on entry, for $_ to hold the remainder of the current line (after s/^\s*KEYWORD:\s*//), and for @{$self->{line}} to contain any remaining unparsed lines from the current XSUB. On return, they set $_ to the next unparsed line, and @{$self->{line}} to subsequent lines. Recent commits have started adding Node (and its subclasses) parse() methods to replace the KEYWORD_handler() methods. Currently they use the same "rely on $_ to pass the current line to and from" scheme. This commit changes them so that they only get lines from @{$pxs->{line}}. This removes one source of weird action-at-a-distance.
The SCOPE: keyword, which enables wrapping an XSUB's main body with ENTER/LEAVE, has been partially broken since 5.12.0. This commit fixes that, adds tests, and updates the very vague documentation for SCOPE in perlxs.pod. AFAICT, neither the SCOPE keyword, nor its associated /* SCOPE */ magic comment in typemap files, are used anywhere in core or on CPAN, nor in any tests. ('SCOPE: DISABLE' appears in a single test file, but disabled is the default anyway.)

Background: The SCOPE keyword was added by perl-5.003_03-21-gdb3b941461 (with documentation added soon after by perl-5.003_03-34-g84287afe68). This made the SCOPE: keyword an XSUB-body-scoped keyword, e.g.

    void
    foo()
        SCOPE: ENABLE
        CODE:
            blah

where the emitted 'blah' code would now be wrapped with ENTER/LEAVE. 13 years later, with v5.11.0-30-g28892255e8, this was extended so that the keyword could appear just before the XSUB too:

    SCOPE: ENABLE
    void
    foo()
        CODE:
            blah

I don't know what the motivation was behind this; the commit was part of a larger upgrade, which just listed among other bug fixes:

    - Fix the SCOPE keyword [Goro Fuji]

but I can't find any trace of a corresponding problem description on p5p or RT. This change had the unfortunate side-effect of breaking the existing XSUB-scoped variant. This is indirectly due to the fact that XSUB-scoped KEYWORD_handler() methods are supposed to set $_ to the next line before returning, while file-scoped ones aren't supposed to. That change made SCOPE_handler() both file- and XSUB-scoped, and also made it no longer update $_. So the new file-scoped variant worked, while the old XSUB-scoped variant broke, because it now returned with $_ set to 'ENABLE' rather than to the next line. The temporary fix in this commit makes SCOPE_handler() check who its caller is and set $_ (or not) accordingly. A proper fix will occur shortly when a SCOPE Node subclass is added, since the Node::parse() methods don't pass values back and forth in $_.
This commit also updates the pod for SCOPE, which was very vague about what the SCOPE keyword did and where it should go, syntax-wise. I also changed it so that it suggests the magic comment token in a typemap entry should be /* SCOPE */. The actual regex is {/\*.*scope.*\*/}i, which matches a whole bunch of stuff. If we ever make it stricter, insisting on an upper-case SCOPE with just surrounding white space seems the way to go.
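To make "matches a whole bunch of stuff" concrete, the sketch below compares the quoted pattern against one possible stricter form. The `$strict` pattern is an assumption for illustration, following the "upper-case SCOPE with just surrounding white space" idea; it is not something the commit adds.

```perl
use strict;
use warnings;

# The typemap magic-comment pattern quoted above.
my $current = qr{/\*.*scope.*\*/}i;

# Hypothetical stricter form: upper-case SCOPE with only whitespace around it.
my $strict = qr{/\*\s*SCOPE\s*\*/};

my @comments = (
    '/* SCOPE */',
    '/* scope */',
    '/* telescope mount */',
);

my %match;
for my $c (@comments) {
    $match{$c} = [ ($c =~ $current ? 1 : 0), ($c =~ $strict ? 1 : 0) ];
    print "$c: current=$match{$c}[0] strict=$match{$c}[1]\n";
}
```

Any comment that merely contains the letters "scope" triggers the current pattern, which is why the documentation now recommends the exact `/* SCOPE */` form.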
Add the following classes: ExtUtils::ParseXS::Node::oneline ExtUtils::ParseXS::Node::enable ExtUtils::ParseXS::Node::EXPORT_XSUB_SYMBOLS ExtUtils::ParseXS::Node::PROTOTYPES ExtUtils::ParseXS::Node::SCOPE ExtUtils::ParseXS::Node::VERSIONCHECK The first two are base classes for XS keywords which consume only a single line of XS src, and which then expect the keyword to have a value of ENABLE/DISABLE. The rest are concrete Node subclasses representing all the purely ENABLE/DISABLE keywords.
Add this Node subclass: ExtUtils::ParseXS::Node::PROTOTYPE This commit moves the parsing code for the PROTOTYPE keyword from the old PROTOTYPE_handler() method in ExtUtils::ParseXS and into a new Node subclass parse() method. Also add a few more tests for PROTOTYPE - especially parsing edge cases.
In Node.pm, replace a bunch of declarations of the form

    package ExtUtils::ParseXS::Node::INTERFACE_MACRO;
    sub parse {
        my ExtUtils::ParseXS::Node::INTERFACE_MACRO $self = shift;
        ...
    }

with

    package ExtUtils::ParseXS::Node::INTERFACE_MACRO;
    sub parse {
        my __PACKAGE__ $self = shift;
        ...
    }
A recent commit expanded test_many() in t/001-basic.t to test for errors as well as warnings. This commit tweaks that change to work under 5.8.x: it was emitting "Use of uninitialized value in concatenation" warnings.
When I moved this sub from ParseXS.pm to Node.pm it retained its 2-char indent. Node.pm uses a 4-char indent, so reindent it. This is a whitespace-only change, apart from splitting a few long lines and re-wrapping some comment paragraphs.
Add these Node subclasses: ExtUtils::ParseXS::Node::keylines ExtUtils::ParseXS::Node::keyline ExtUtils::ParseXS::Node::ALIAS ExtUtils::ParseXS::Node::ALIAS_line An ALIAS node represents an ALIAS keyword, which can have multiple ALIAS_line kid nodes, each of which represents one processed line from an ALIAS section. keylines and keyline are base classes for ALIAS and ALIAS_line respectively, which handle the general processing of keywords which are multi-line but where each line needs treating individually. Other examples would be INPUT and OUTPUT keywords (not yet done). It's slightly overkill just for ALIAS (arguably all the data could have just been stored in a single ALIAS node), but doing it properly now will make converting INPUT and OUTPUT keywords into nodes easier in the near future. The base classes also handle shifting lines off the input queue in such a way that warnings and errors come from the right line. Note that this is the first commit which adds an *intermediate* AST tree node class: the previous commits have just been adding terminal nodes. In particular, this commit adds a 'kids' array ref field to the base Node class which allows nodes to have kids; and the parse method for ALIAS repeatedly creates ALIAS_line objects, calls their parse method, then adds them to the ALIAS's kids list. Thus it's an embryonic recursive-descent parser, in the sense that parser subs for 'big' things call parser subs for smaller things. Technically, while there will be nested calls to parser methods, there won't be actual recursion, since the XS syntax isn't recursive. The bulk of this commit consists of moving the get_aliases() sub from ParseXS.pm into Node.pm and renaming it to ExtUtils::ParseXS::Node::ALIAS_line::parse(). The code is basically unchanged except for tweaks required to make it a Node subclass. Similarly, ALIAS_handler() becomes ExtUtils::ParseXS::Node::ALIAS::parse().
This commit also adds some more tests for the ALIAS keyword: in particular, while there were already some tests for alias warnings, there didn't seem to be any for errors. The old, existing test code for ALIAS is modified slightly so that 'die' text isn't lost if something goes horribly wrong. That test code doesn't use the newer, more general test_many() function from t/001-basic.t which handles that sort of thing better.
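The parent/kid arrangement described for ALIAS and ALIAS_line might look roughly like this. It is a simplified sketch under stated assumptions: the `Sketch::` classes, the `name = integer` parsing and the line-number handling are invented for the example, and the real parser accepts a much richer alias syntax.

```perl
use strict;
use warnings;

package Sketch::ALIAS_line;

sub new { my ($class, %args) = @_; return bless { %args }, $class }

# Parse one line of "name = value" pairs. Only plain integer
# values are handled in this sketch.
sub parse {
    my ($self, $text) = @_;
    $self->{aliases} = {};
    while ($text =~ /\G\s*(\w+)\s*=\s*(\d+)/gc) {
        $self->{aliases}{$1} = $2;
    }
    return scalar keys %{ $self->{aliases} };
}

package Sketch::ALIAS;

sub new { my ($class) = @_; return bless { kids => [] }, $class }

# The 'big' parser: shift lines off the queue, hand each one to a
# kid node's parse(), and keep the kids that parsed successfully.
sub parse {
    my ($self, $queue, $line_no) = @_;
    while (@$queue && $queue->[0] !~ /^\s*[A-Z_]+\s*:/) {
        my $text = shift @$queue;
        $line_no++;
        next unless $text =~ /\S/;
        my $kid = Sketch::ALIAS_line->new(line_no => $line_no);
        $kid->parse($text) or next;
        push @{ $self->{kids} }, $kid;
    }
    return 1;
}

package main;

my @queue = ('    bar = 1', '    baz = 2  boz = 3', 'CODE:');
my $alias = Sketch::ALIAS->new;
$alias->parse(\@queue, 20);

for my $kid (@{ $alias->{kids} }) {
    my $a = $kid->{aliases};
    print "line $kid->{line_no}: ",
        join(', ', map { "$_=$a->{$_}" } sort keys %$a), "\n";
}
```

Because each kid records its own line number, a warning raised while parsing a particular line can be attributed to that line rather than to the start of the ALIAS section.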
I audited all the Warn(), death() etc calls in Node.pm and added tests for any which weren't yet covered (apart from hard-to-reproduce ones like internal errors).
Add ExtUtils::ParseXS::Node::ATTRS class, and add a basic test.
Add ExtUtils::ParseXS::Node::OVERLOAD class, and add a basic test. Note that currently this code doesn't warn about duplicate op names (it just silently skips duplicates), nor warn about unknown op names (it happily accepts them). This commit preserves the current behaviour for now.
This is #1 of a small series of commits to refactor the INPUT_handler() method and turn it into a Node subclass method. This commit changes the main loop from using $_ to hold the current line, to using the variable $line instead.
This is #2 of a small series of commits to refactor the INPUT_handler() method and turn it into a Node subclass method. This commit splits the method into two: a smaller outer one which has the 'foreach line' loop, and a new method, INPUT_handler_line() which contains the bulk of the old method and processes a single line from an INPUT section.
This is #3 of a small series of commits to refactor the INPUT_handler() method and turn it into a Node subclass method. This commit moves the ExtUtils::ParseXS methods INPUT_handler() INPUT_handler_line() from ParseXS.pm into ParseXS/Node.pm. For now they temporarily remain as ExtUtils::ParseXS methods; this is just a straight cut and paste, except for fully-qualifying the $BLOCK_regexp package variable name and adding a couple of temporary 'package ExtUtils::ParseXS' declarations.
This is #4 of a small series of commits to refactor the INPUT_handler() method and turn it into a Node subclass method. This commit reindents INPUT_handler() and INPUT_handler_line() from 2-indent to 4-indent to match the policy of the file they were moved to in the previous commit. Whitespace-only change
This is #5 of a small series of commits to refactor INPUT keyword handling. This commit adds these two classes: ExtUtils::ParseXS::Node::INPUT ExtUtils::ParseXS::Node::INPUT_line and converts the two ExtUtils::ParseXS methods INPUT_handler() INPUT_handler_line() into parse() methods of those two classes. In a very minor way, this commit also starts separating in time the parsing and the code emitting. Whereas before, each INPUT line was parsed and then C code for it immediately emitted, now *all* lines from an explicit or implicit INPUT section are parsed and stored as an INPUT node with multiple INPUT_line children, and *then* the as_code() method is called for each child. This should make no difference to the generated output code.
This is #6 of a small series of commits to refactor INPUT keyword handling. There's no need any more to save the original line in $orig_line, as $self->{line} now holds that value. Also, wrap ... or blurt(...), return; in a do block for clarity:
This is #7 of a small series of commits to refactor INPUT keyword handling. The main job of parsing an INPUT line is to extract any information on that line and use it to update the associated Param object (which was likely created earlier when the XSUB's signature was parsed). This commit makes that information also be stored in new fields in the INPUT_line object. These new fields aren't currently used for anything, but they could in principle become useful if options for deparsing or exporting were added to ParseXS.
Move the declaration of the 'defer' Node::Param field into the "values derived from the XSUB's INPUT line" part of the declaration. No functional change, just fixing an error in the documentation.
Rename the existing check() method to set_proto(). The only thing the method was doing was calculating the overridden prototype char for that parameter based on its type's typemap entry, if any. So give it a better name. Also, rationalise where and when the method is called. It was being called each time a parameter was created, or when its type changed. Instead, just call the method once on all parameters just after all INPUT processing is complete, so the types can't change, but before any inline TYPEMAP entries might change the proto char for that type. In theory this commit should make no functional change.
The NOT_IMPLEMENTED_YET keyword, used in place of CODE or PPCODE, emits a stub body that just calls croak(). It's undocumented, untested, and appears to be used in only one XS file in all of CPAN. This commit adds some very basic tests. The next commit will change the behaviour slightly: currently, K&R-style params get C declarations, but code emitting stops before ANSI-style declarations and deferred initialisations would normally be emitted. So a couple of tests are marked as expected to fail.
The undocumented and almost-entirely-unused NOT_IMPLEMENTED_YET keyword can be used at the same point in parsing where CODE: or PPCODE: could appear, and emits a croak() call whereas a call to a C library function would otherwise have been auto-generated. This keyword was checked for partway through the emitting of initialisation code; this meant that K&R-style declarations were emitted, but ANSI-style ones weren't. This commit moves the checking for the presence of this keyword to a bit later: after all initialisation code emitting is complete. This makes NOT_IMPLEMENTED_YET logically part of the body-processing section, which now looks roughly like if (/NOT_IMPLEMENTED_YET/) emit croak elsif (/PPCODE:/) ... elsif (/CODE:/) ... else emit autocall and so makes the parsing code cleaner. Conceptually it means that a NOT_IMPLEMENTED_YET can now appear after an INIT keyword; in practice, only *INPUT* section parsing is special-cased to recognise NOT_IMPLEMENTED_YET as another valid keyword which terminates the current section. So INIT, C_ARGS etc sections continue to see "NOT_IMPLEMENTED_YET" as just a bit of text to be consumed from the input stream and added to the init code or the C signature or whatever. Some tests have been added to confirm this.
The previous commit altered the structure of a big if/else. Reindent to match. Whitespace-only change
Move this field from Node::xbody to Node::output_part, as it's only used while generating the code for the output part of the xsub. No functional change.
Remove the following field from the ExtUtils::ParseXS class: xsub_stack_was_reset and replace it with this new field in the ExtUtils::ParseXS::Node::output_part class: stack_was_reset
Remove the following fields from the ExtUtils::ParseXS class: xsub_interface_macro xsub_interface_macro_set and replace them with these new fields in the ExtUtils::ParseXS::Node::xsub class: interface_macro interface_macro_set There is also a slight change in the way these two fields are used. Formerly they were initialised to the default values "XSINTERFACE_FUNC" and "XSINTERFACE_FUNC_SET", then potentially changed by the INTERFACE_MACRO keyword, then the current values were used to emit the interface function pointer getting and setting code. Now, the values are initially undef, and the emitting code checks for definedness and, if a value is undefined, uses the default instead. This means that the logic for using the default or overridden value is local to where that value is used rather than being hidden away elsewhere. No change in functionality though.
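The "default at the point of use" pattern described here can be sketched as below. The `interface_macros()` helper and the plain-hash xsub are hypothetical; only the two field names and the two default macro names come from the commit message.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the macro names to emit for an xsub node.
# Fields left undef by parsing fall back to the defaults right here,
# rather than being pre-initialised somewhere else.
sub interface_macros {
    my ($xsub) = @_;
    my $get = defined $xsub->{interface_macro}
                ? $xsub->{interface_macro}
                : 'XSINTERFACE_FUNC';
    my $set = defined $xsub->{interface_macro_set}
                ? $xsub->{interface_macro_set}
                : 'XSINTERFACE_FUNC_SET';
    return ($get, $set);
}

# No INTERFACE_MACRO keyword seen: both fields are still undef.
my @default = interface_macros({});

# INTERFACE_MACRO keyword seen: the parser stored override names
# (the names used here are made up).
my @override = interface_macros({
    interface_macro     => 'MY_FUNC',
    interface_macro_set => 'MY_FUNC_SET',
});

print "@default\n@override\n";
```

The ternary with `defined` is used rather than the `//` operator because other commits in this series mention keeping the code working on 5.8.x.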
Remove the following fields from the ExtUtils::ParseXS class: xsub_map_overload_name_to_seen xsub_prototype and replace them with these new fields in the ExtUtils::ParseXS::Node::xsub class: overload_name_seen prototype
This commit renames the ExtUtils::ParseXS class field xsub_SCOPE_enabled to file_SCOPE_enabled and adds a new field in the ExtUtils::ParseXS::Node::xsub class: SCOPE_enabled This is because SCOPE can be used either in file scope: SCOPE: ENABLE int foo(...) or in XSUB scope, int foo(...) SCOPE: ENABLE The file_SCOPE_enabled field records whether a SCOPE keyword has been encountered just before the XSUB, while the Node::xsub SCOPE_enabled field is initialised to the current value of file_SCOPE_enabled when XSUB parsing starts, and is updated if the SCOPE keyword is encountered within the XSUB.
During the course of the refactoring in this branch, perl code has gradually been split between doing parsing in Node::FOO::parse() methods and code emitting in Node::FOO::as_code() methods (before, both were completely interleaved). How the current xsub and xbody nodes are tracked varies between those two types of methods: the as_code() methods pass them as explicit parameters, while the parse() methods rely on two 'global' fields within the ExtUtils::ParseXS object, cur_xsub and cur_xbody. However, some as_code() methods were still relying on cur_xsub/xbody rather than the passed $xsub and $xbody params. This commit fixes that. At the moment it is mostly harmless, as each XSUB's top-level as_code() is called immediately after its top-level parse(), so cur_xsub still points to the right XSUB. But that will change in future, so get it right now. The next commit will in fact explicitly undef cur_xsub/xbody immediately after parsing is finished. This commit includes a test for one edge case where the cur_xbody being wrong did make a difference.
Currently, the fields cur_xsub and cur_xbody of ExtUtils::ParseXS track the current xsub and body nodes during parsing. This commit undefs them immediately after use so that they can't be inadvertently used elsewhere. The fixups in the previous commit were all discovered by this undeffing.
Currently all the Node::FOO::as_code() methods get passed two args, $xsub and $xbody, to indicate the current Node::xsub and Node::xbody objects. Conversely, all the Node::FOO::parse() methods access the current two objects via two 'global' fields in the ExtUtils::ParseXS object: cur_xsub cur_xbody This commit deletes these two fields and instead passes the objects as extra parameters to all the parse() methods. Less action-at-a-distance.
Add comments about keywords which can be both inside or outside an XSUB.
The Node::Params class has a 'params' field which holds a list of Node::Param objects. This class was one of the first Node classes to be created during my recent refactoring work, and at the time, Node subclasses didn't have a generic 'kids' field. They do now, so just store the list of Param objects as 'kids' of the Params object.
Add an ExtUtils::ParseXS::Node::IO_Param class as a subclass of the existing ExtUtils::ParseXS::Node::Param class. Then Param objects will be used solely to hold the details of a parameter which have been extracted from an XSUB's signature, while IO_Param objects contain a copy of that info, but augmented with any further info gleaned from INPUT or OUTPUT lines. For example with

    void
    foo(a)
        int a
        OUTPUT:
            a

Then the Param object for 'a' will look something like:

    {
        arg_num => 1,
        var     => 'a',
    }

while the corresponding IO_Param object will look something like:

    {
        arg_num   => 1,
        var       => 'a',
        type      => 'int',
        in_input  => 1,
        in_output => 1,
        ....
    }

All the code-emitting methods have been moved from Param to IO_Param, and the as_code() method has been renamed to as_input_code(), to better match the naming convention of the existing as_output_code() method: an IO_Param can generate code both to declare/initialise a var, and to update/return a var.
If the list of aliases for an XSUB doesn't include the XSUB's main name, an extra alias entry is added, mapping the main name to ix 0. Move this setting from the code generation phase to the end of the parsing phase, because the AST should really be complete by the end of parsing. Also add a test for this behaviour. Shouldn't affect what code is generated.
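A minimal sketch of the Param/IO_Param relationship follows. The `Sketch::` classes and the `new_from_param()` constructor are assumptions made for illustration; the commit message does not say how the copy is actually made.

```perl
use strict;
use warnings;

package Sketch::Param;
sub new { my ($class, %fields) = @_; return bless { %fields }, $class }

package Sketch::IO_Param;
our @ISA = ('Sketch::Param');

# Copy the signature-derived fields, then layer on INPUT/OUTPUT info.
sub new_from_param {
    my ($class, $param, %extra) = @_;
    return $class->new(%$param, %extra);
}

package main;

# What the signature "foo(a)" yields.
my $param = Sketch::Param->new(arg_num => 1, var => 'a');

# What the INPUT line "int a" and the OUTPUT line "a" add.
my $ioparam = Sketch::IO_Param->new_from_param(
    $param, type => 'int', in_input => 1, in_output => 1,
);

print join(', ', map { "$_ => $ioparam->{$_}" } sort keys %$ioparam), "\n";
```

Because the IO_Param holds a copy, the original Param stays a pure record of the signature and is not modified by later INPUT or OUTPUT processing.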
This method is no longer used anywhere
Currently the parsing of an XSUB's signature, and the parsing of the individual comma-separated items within that signature, are done in the same function, Params->parse(). This commit is the first of three which will extract out the latter into a separate Param->parse() method. For now, the per-param code is kept in-place (to make the diff easier to understand), but is wrapped within an immediately-called anon sub, in preparation to be moved. So before, the code was (very simplified):

    for (split /,/, $params_text) {
        ... parse type, name, init etc ...
        next if can't parse;
        my $param = Param->new(var => $var, type => $type, ...);
        push @{$params->{kids}}, $param;
    }

After this commit, it looks more like:

    for (split /,/, $params_text) {
        my $param = Param->new();
        sub {
            my $param = shift;
            ...
            ... parse type, name, init etc ...
            return if can't parse;
            $param->{var} = $var;
            ...
            return 1;
        }->($param, ...)
            or next;
        push @{$params->{kids}}, $param;
    }

Note that the inner sub leaves pushing the new param, updating the names hash and setting the arg_num to the caller. In theory there are no functional changes, except that when a synthetic RETVAL is being kept (but its position within kids moved), we now keep the Param hash and update its contents, rather than replace it with a new hash. This shouldn't make any difference.
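The intermediate "immediately-called anon sub" shape can be tried out in isolation. The sketch below is runnable but deliberately naive: the sample signature text, the plain-hash params and the one-regex "type name" parse are assumptions for illustration, and the real per-parameter parsing handles far more cases.

```perl
use strict;
use warnings;

my @kids;
for my $text (split /,/, 'int a, char *b, ???, long c') {
    my $param = {};

    # The per-param code lives in an anon sub that is called at once;
    # a false return means "couldn't parse", and the caller skips it.
    sub {
        my ($param, $text) = @_;

        # Very simplified: everything before the final identifier
        # is taken to be the type.
        my ($type, $var) = $text =~ /^\s*(.*?)\s*\b([A-Za-z_]\w*)\s*$/
            or return;

        $param->{type} = $type;
        $param->{var}  = $var;
        return 1;
    }->($param, $text)
        or next;

    # Pushing the param is left to the caller, as in the commit.
    push @kids, $param;
}

print "$_->{var}: '$_->{type}'\n" for @kids;
```

Once the body is in this form, turning it into a named method is a pure cut-and-paste, which is what the following two commits do.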
This commit just moves a block of code of the form sub {...}->() into its own named sub. There are no changes to the moved lines of code apart from indentation. This is the second of three commits to create the parse() method. The next commit will do any final tidying up.
This is the third of three commits to create the parse() method. Mainly do s/$param/$self/g, and add a call to set the file/line number for the object.
Move all the code out of ExtUtils::ParseXS::Node::IO_Param::as_input_code() which is responsible for looking up the template initialisation code in the typemap (or elsewhere) and put it in its own method, lookup_input_typemap(). As well as splitting a 300-line method into two approx 150-line methods, this will also allow us shortly to move the template lookup to earlier, at parse time rather than code-emitting time. Also add some more tests for the length(foo) pseudo-parameter, which I broke while working on this commit, and then noticed it was under-tested.
Move all the code out of ExtUtils::ParseXS::Node::IO_Param::as_output_code() which is responsible for looking up the template output code in the typemap (or elsewhere) and put it in its own method, lookup_output_typemap(). As well as splitting a 490-line method into two 200 and 340-line methods, this will also allow us shortly to move the template lookup to earlier, at parse time rather than code-emitting time. It may also be possible at some point to merge the two methods added by these last two commits, lookup_input_typemap and lookup_output_typemap, into a single method, since they share a lot of common code.
Previously these two values were set at the end of parsing an XSUB: XSRETURN_count_basic XSRETURN_count_extra They represent whether a RETVAL SV will be returned by the XSUB, and how many extra SVs are returned due to parameters declared as OUTLIST. This commit sets them earlier, as in particular, the next commit will need to access XSRETURN_count_basic earlier. XSRETURN_count_extra is now set right after parsing the XSUB's declaration, as its value can't change after then. XSRETURN_count_basic is now set after parsing the output part of each body of the XSUB (an XSUB can have a body per CASE). Its value *ought* to be consistent across all bodies, but it's possible for the CODE_sets_ST0 hack (which looks for code like 'ST(0) = ...' in any CODE: block) to vary across bodies; so this commit also adds a new warning and test for that.
The last few commits have moved the looking-up and processing of typemap entries (but not the evalling) for parameters from Param::as_input_code() and Param::as_output_code() into their own subs, lookup_input_typemap() and lookup_output_typemap(). This commit takes that one step further, and makes those new subs be called at parse time, rather than at code-generation time. This is needed because in principle, XSUB ASTs should be completely self-contained, and the code they emit shouldn't vary depending on when their top-level as_code() methods are called. But via the TYPEMAP: <<EOF mechanism, it's possible for the typemap to change between XSUBs. This commit does this in a very crude way. Formerly, at code-emitting time, as_input_code() etc would do: my ($foo, $bar, ...) = lookup_input_typemap(...); Now, the parsing code does $self->{input_typemap_vals} = [ lookup_input_typemap(...) ]; and as_input_code() etc does: my ($foo, $bar, ...) = @{$self->{input_typemap_vals}}; Note that there are both output_typemap_vals and output_typemap_vals_outlist fields, as it's possible for the same parameter to be used both for updating the original arg (OUTPUT) and for returning the current value as a new SV (OUTLIST). So potentially we save the results of *two* calls to lookup_output_typemap() for each parameter.
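Why the lookup has to happen at parse time can be shown with a toy typemap that changes between parsing and emitting. Everything here is a stand-in: the `%typemap` hash, the free functions and their signatures are assumptions for the example, and only the `input_typemap_vals` field name is taken from the commit message.

```perl
use strict;
use warnings;

# Hypothetical, mutable typemap: type name => C template.
my %typemap = (int => '$var = (int)SvIV($arg)');

# Stand-in for the real lookup method; returns a list of values.
sub lookup_input_typemap {
    my ($type) = @_;
    return ($type, $typemap{$type});
}

# Parse time: capture the lookup results in the node itself.
sub parse_param {
    my ($type, $var) = @_;
    return {
        var                => $var,
        input_typemap_vals => [ lookup_input_typemap($type) ],
    };
}

# Emit time: use only what was stored in the node.
sub as_input_code {
    my ($self) = @_;
    my ($type, $template) = @{ $self->{input_typemap_vals} };
    return "$type $self->{var}; /* $template */";
}

my $param = parse_param('int', 'a');

# A later "TYPEMAP: <<EOF" block changes the entry...
$typemap{int} = '$var = my_custom_iv($arg)';

# ...but the already-parsed node still emits the original template.
my $code = as_input_code($param);
print "$code\n";
```

With the lookup deferred to emit time instead, the same node would produce different C code depending on how much of the file had been read before `as_input_code()` ran.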
Rationalise warning and error messages which appear in Node.pm: - always prefix with Warning: / Error: / Internal error: - lower-case the first letter following Error: etc - fix grammar - ensure full test coverage (except 'Internal error', which shouldn't be reproducible).
Some node types have fields to point to particular children. Make these kids also be in the generic @{$self->{kids}} array. That way, hypothetical generic tree-walking code will be able to access the whole tree just by following @{$self->{kids}}, without needing to know for example that the xsub_decl Node type has a child pointed to by $self->{return_type}.
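The "hypothetical generic tree-walking code" mentioned above could be as small as the sketch below. The node hashes, their `type` labels and the `walk()` function are all invented for illustration; the only thing taken from the commit is that every child is reachable through `kids`.

```perl
use strict;
use warnings;

# Visit a node and then, recursively, everything in its kids array.
sub walk {
    my ($node, $cb, $depth) = @_;
    $depth ||= 0;
    $cb->($node, $depth);
    walk($_, $cb, $depth + 1) for @{ $node->{kids} || [] };
}

# A toy tree. The return type node is reachable both via a named
# field and via the generic kids array, as the commit describes.
my $return_type = { type => 'return_type' };
my $tree = {
    type => 'xsub',
    kids => [
        {
            type        => 'xsub_decl',
            return_type => $return_type,
            kids        => [
                $return_type,
                { type => 'Params', kids => [ { type => 'Param' } ] },
            ],
        },
        { type => 'xbody' },
    ],
};

my @seen;
walk($tree, sub {
    my ($node, $depth) = @_;
    push @seen, $node->{type};
    print '  ' x $depth, $node->{type}, "\n";
});
```

The walker never needs to know that `xsub_decl` has a `return_type` field; it finds that node simply because it is also listed in `kids`.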
Do a general tidy-up of this src file: white space, plus wrap long lines and strings.
For aesthetic reasons, give the $build_subclass sub an extra first arg which must be the string 'parent'. Then change invocations from:

    BEGIN { $build_subclass->('Foo', # parent
        'field1', # ...
        ...
    }

to

    BEGIN { $build_subclass->(parent => 'Foo',
        'field1', # ...
        ...
    }
Update the code comments in calls to $build_subclass->() to indicate more consistently the 'type' of each field being declared.
In the INPUT_line and OUTPUT_line subclasses, rename the 'param' field to 'ioparam', to better reflect that it holds an IO_Param object rather than a Param object.
The work in this branch broke the parser under 5.8.9. Fix it, by not trying to autovivify an undef object pointer (which under 5.8.9 is a pseudo-hash thingy and generally behaves weirdly). The attempt to autovivify an undef $xsub was always wrong, but harmless: the value wasn't needed and was soon discarded. But under 5.8.9, it became a runtime error.
This PR is part of my ParseXS refactoring work. I don't intend for it to be merged until 5.43.1.
This series of about 150 commits changes ExtUtils::ParseXS so that, instead of intermixing parsing and code-emitting for each XSUB, it now parses each XSUB into an Abstract Syntax Tree, and then walks the tree to emit all the C code for that XSUB.
This makes the code generally more understandable and maintainable.
For now it still just discards each tree after the XSUB is parsed; in future work, the AST will be extended so that it holds the whole file (including all the XSUBs) rather than just the current XSUB.
This branch contains six types of commit.
For terminal AST nodes for keywords such as FOO, the old ExtUtils::ParseXS::handle_FOO() method is removed and a new ExtUtils::ParseXS::Node::FOO class is added with parse() and as_code() methods which copy over the parsing and code-emitting parts of the handle_FOO() method. For a few keywords like INPUT which have values per line, both a Node::FOO and Node::FOO_line class are created, with several FOO_line nodes being children of the FOO node.
Note that doing the modifications for a single keyword often consists in fact of several commits in sequence.
For higher-level nodes, a Node::foo class is created with parse() and as_code() methods as before, but the contents of these methods are typically populated by moving the relevant bits of code over from the big ExtUtils::ParseXS::process_file() method.
Most of the state fields of the ExtUtils::ParseXS class (especially all the xsub_foo ones) are removed and similar fields added to the various Node subclasses instead.
Fixups to ensure that all parse-time code is in parse() methods or associated helper functions, and similarly for as_code().
Various bug fixes related to state that should be per-CASE rather than per-XSUB. Some of these bugs were pre-existing, some were introduced during this branch.
General tidying-up, fixing code comments, adding POD etc.