Move JuliaSyntax + JuliaLowering into the main tree #59870
This makes it easier to run subsets of tests by just including the appropriate file
Also fix a couple of test cases which weren't being run correctly including updating exceptions_ir.jl to avoid use of globals
Make all of `is_valid_ir_argument`, `is_valid_body_ir_argument`, `is_single_assign_var`, `is_const_read_arg` more accurate portings of the flisp equivalents. Some values in the IR must be written to temporaries for the resulting code to be correct. It's not clear which invariants we're upholding here because none of these seem to be documented, but it seems important to have these be as equivalent as possible for now. Some changes are still required to these after the variable analysis pass is more accurate.

Also avoid using `isdefined()` when looking up globals in Julia modules during lowering - this does import resolution and we can't allow this side effect when generating IR. Instead use the new functions `is_defined_and_owned_global` and `is_defined_nothrow_global` which dip a bit into Julia internals to look up bindings and determine binding owner without having any side effects.
This should prevent uncommitted tests from accidentally being deleted if the dev code crashes.
Expand general assignment syntax, including
* UnionAll definitions
* Chained assignments
* Setting of structure fields
* Destructuring
* Typed variable declarations

Still TODO
* Eliminating tuples in case sides match
* Assignments to array elements
Deal with cases like `(x,y) = (a,b)`. Still need to deal with slurps and splats.
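A minimal sketch of why eliminating the tuple in `(x,y) = (a,b)` needs care: when a right-hand side mentions a variable that is also assigned on the left, all right-hand values have to be read before any assignment happens. The function names below are illustrative only, and the bodies are hand-written equivalents, not the IR that lowering emits.

```julia
# Illustrative only: hand-written equivalents, not output of JuliaLowering.

# Destructuring through a tuple, as written in source.
function swap_with_tuple(x, y)
    (x, y) = (y, x)
    return (x, y)
end

# The same assignment with the tuple eliminated: every right-hand side is
# read into a temporary first, then the left-hand sides are assigned.
function swap_with_temps(x, y)
    tmp1 = y
    tmp2 = x
    x = tmp1
    y = tmp2
    return (x, y)
end

# Assigning directly without temporaries reads `x` after it was overwritten.
function swap_naive(x, y)
    x = y
    y = x
    return (x, y)
end
```

`swap_with_tuple` and `swap_with_temps` agree, while `swap_naive` loses the original value of `x`.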
…x.jl#501) This patch fixes serialization of `Kind`s to use `sizeof` (number of bytes) instead of `length` (number of characters) when computing number of bytes in the stringified `Kind`.
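A small sketch of the difference this fix relies on, assuming a hypothetical length-prefixed format (the actual serialization code is not shown here): for a non-ASCII name, `length` counts characters while `sizeof` counts the bytes that are actually written.

```julia
# Hypothetical stand-in for a stringified Kind with a non-ASCII name.
kind_name = "→"
nchars = length(kind_name)   # number of characters
nbytes = sizeof(kind_name)   # number of UTF-8 bytes

# Toy length-prefixed encoding (an assumption, not the JuliaSyntax format):
# the prefix must count bytes, otherwise the reader stops too early.
io = IOBuffer()
write(io, UInt8(nbytes))
write(io, kind_name)
seekstart(io)
n = Int(read(io, UInt8))
roundtrip = String(read(io, n))
```

Writing `UInt8(nchars)` as the prefix instead would make the reader consume only the first byte of the name.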
At some point we can hopefully replace `local_def` with `local`, combined with a future `struct BindingFlags`.
Named tuple destructuring is needed to implement `kw_call`. Also change expansion of `K"."` in `expand_forms_1()` to lower the second element to K"Symbol" early on. We should probably do this upstream in JuliaSyntax :)
* A vector of `Slot`s is now created and passed into the `CodeInfo` creation pass so that code doesn't need access to the `Bindings` anymore. This is a better separation of data structures between passes.
* Use K"Placeholder" for unused slots.
* Fix small bug which made argument slurping broken.
Also fix a bug in linearization of `K"isdefined"`
Co-authored-by: spaette <[email protected]>
…#511) Julia's ecosystem (including Base.Docs and flisp lowering) assumes that strings within `struct` definitions are per-field docstrings, but the flisp parser doesn't handle these - they are only recognized when the struct itself has a docstring and are processed by the `@doc` macro recursing into the struct's internals. For example, the following doesn't result in any docs attached to `A`.

```julia
struct A
    "x_docs"
    x
    "y_docs"
    y
end
```

This change adds `K"doc"` node parsing to the insides of a struct, making the semantics clearer in the parser tree and making it possible to address this problem in the future within JuliaLowering. Also ensure that the `Expr` form is unaffected by this change.
…ng/JuliaSyntax.jl#506)
* Don't assume that `SubString` has `pointer` and copy instead
* Still assume `SubString{String}` has `pointer`
* Test with `Test.GenericString`
…aLang/JuliaLowering.jl#75) Was causing several stdlib failures. MWE:

```
julia> ex = Meta.parse("begin
    x = 111
    x = 222
end")
JuliaLowering.core_lowering_hook(ex, Main, "foo.jl", 100)
```

If `core_lowering_hook` is given one filename (e.g. "none" from `@eval`), but some part of the expression contains LineNumberNodes with a different filename, we trigger the "inlined macro-expansion" logic in the debuginfo generator, which assumes new filenames are from new macro expansions atop the old filename. The violated invariant is that the list of files in this statement's flattened provenance shares some prefix with the last statement's list of files.

This fix assumes there is some base file that all statements share, and normalizes different base filenames to the first it sees.

Aside: Not sure if this stack logic is 100% correct given that two adjacent statements can share arbitrarily many file stack entries despite being from different macro expansions.
…/JuliaLowering.jl#80) --------- Co-authored-by: Claire Foster <[email protected]>
Adapt to JuliaSyntax changes in how macro names are represented in the tree. This is a bit messy but is important to keep in sync for now until we figure out how the green tree relates to (or differs from) the AST seen by macros and lowering.
This is a step toward an iteration interface for lowering which can return a sequence of CodeInfo to be evaluated for top level and module expressions. This also restricts lowering of module expressions to be syntactically at top level (ie, not inside a top level thunk), consistent with the existing way that they're handled in eval.
Julia's incrementally evaluated top level semantics make it rather tricky to design a lowering interface for top level and module level expressions. Currently these expressions are effectively *interpreted* by eval rather than ever being processed by lowering. However, I'd like a cleaner separation between "low level evaluation" and lowering, such that Core can contain only the low level eval "driver function".

I'd like to propose the split as follows:

* "Low level" evaluation is about executing a sequence of thunks represented as `CodeInfo` and creating modules for those to be executed inside.
* Lowering is about expression processing.

In principle, the runtime's view of `eval()` shouldn't know about `Expr` or `SyntaxTree` (or whatever AST we use) - that should be left to the compiler frontend. A useful way to think about the duties of the frontend is to consider the question "What if we wanted to host another language on top of the Julia runtime?". If we can eventually achieve that without ever generating Julia `Expr` then we will have succeeded in separating the frontend.

To implement all this I've recast lowering as an incremental iterative API in this change. Thus it's the job of `eval()` to simply evaluate thunks and create new modules as driven by lowering. (Perhaps we'd move this definition of `eval()` over to the Julia runtime before 1.13.)

The iteration API is currently oddly bespoke and arguably somewhat non-Julian for two reasons:

* Lowering knows when new modules are required, and may request them with `:begin_module`. However `eval()` generates those modules so they need to be passed back into lowering. So we can't just use `Base.iterate()`. (Put a different way, we have a situation which is suited to coroutines but we don't want to use full Julia `Task`s for this.)
* We might want to implement this `eval()` in Julia's C runtime code or early in bootstrap. Hence using SimpleVector and Symbol as the return values of `lower_step()`.

We might consider changing at least the second of these choices, depending on how we end up integrating this into Base.
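To make the shape of this protocol concrete, here is a toy sketch. Every name in it (`ToyLowering`, `toy_lower_step`, `toy_eval`, the `:thunk`, `:end_module` and `:done` tags) is hypothetical; only `:begin_module` and the idea of returning a SimpleVector tagged with a Symbol come from the description above. The toy passes `Expr`s through where real lowering would produce `CodeInfo`, ignores `baremodule`, and does not bind new modules in their parent.

```julia
# Toy sketch only; not the JuliaLowering API.
struct EndModule end              # marker pushed after a module body

struct ToyLowering
    todo::Vector{Any}             # stack of pending expressions
    mods::Vector{Module}          # stack of modules currently being filled
end

# One step of "lowering". `new_mod` is the module the driver created in
# response to the previous `:begin_module` request, or `nothing`.
function toy_lower_step(state::ToyLowering, new_mod::Union{Module,Nothing})
    new_mod === nothing || push!(state.mods, new_mod)
    while !isempty(state.todo)
        ex = pop!(state.todo)
        if ex isa LineNumberNode
            continue
        elseif ex isa EndModule
            return Core.svec(:end_module, pop!(state.mods))
        elseif ex isa Expr && ex.head === :toplevel
            append!(state.todo, reverse(ex.args))
        elseif ex isa Expr && ex.head === :module
            # Name and body are taken as the last two arguments.
            push!(state.todo, EndModule())
            append!(state.todo, reverse(ex.args[end].args))
            return Core.svec(:begin_module, ex.args[end-1])
        else
            # Real lowering would return a CodeInfo here.
            return Core.svec(:thunk, last(state.mods), ex)
        end
    end
    return Core.svec(:done)
end

# The driver only evaluates thunks and creates modules on request.
function toy_eval(mod::Module, ex)
    state = ToyLowering(Any[ex], Module[mod])
    result = nothing
    new_mod = nothing
    while true
        step = toy_lower_step(state, new_mod)
        new_mod = nothing
        kind = step[1]
        if kind === :done
            return result
        elseif kind === :begin_module
            new_mod = Module(step[2])   # handed back on the next step
        elseif kind === :end_module
            result = step[2]
        else
            result = Core.eval(step[2], step[3])
        end
    end
end
```

For example, `toy_eval(Main, Meta.parse("module ToyA\nx = 1\ny = x + 1\nend"))` returns the newly created module. The driver never inspects the module expression itself: it only reacts to the tags it is handed, which is why a plain `Base.iterate()` loop does not fit.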
…el-interpret-modules Incremental lowering API
…g/JuliaLowering.jl#88)
…aLowering.jl#86) Fix the lowering of `cglobal` to produce `GlobalRef(Core.Intrinsics, :cglobal)` instead of a bare symbol `:cglobal`. The inference validator requires cglobal to be a GlobalRef: https://github.com/JuliaLang/julia/blob/7a8cd6e202f1d1216a6c0c0b928fb43a123cada8/Compiler/src/validation.jl#L87 With this commit `_to_lowered_expr` resolves `cglobal` to `GlobalRef(Core.Intrinsics, :cglobal)`, matching Julia's builtin lowerer behavior and satisfying the inference validator's requirements.
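As a rough illustration of the distinction, the sketch below uses a hypothetical helper (not `_to_lowered_expr` itself) to turn the bare symbol into the module-qualified reference that the validator expects.

```julia
# Hypothetical helper for illustration; the real logic lives in
# `_to_lowered_expr` and is not reproduced here.
resolve_intrinsic_name(x) =
    x === :cglobal ? GlobalRef(Core.Intrinsics, :cglobal) : x

ref = resolve_intrinsic_name(:cglobal)
other = resolve_intrinsic_name(:foo)   # any other symbol is left untouched
```

A `GlobalRef` carries both the module and the name, so the reference stays unambiguous regardless of which module the lowered code is evaluated in.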
…owering.jl#90) Co-authored-by: Shuhei Kadowaki <[email protected]>
…liaLowering.jl#87)
* Support `const foo() = ...`
* Add support for destructuring `const`
* Generate Core.declare_const; make constdecl a lowering-only kind
* Generate Core.declare_global; remove globaldecl kind, desugar global Corresponds to JuliaLang/JuliaLowering.jl#58279 (also take unused_only)
* Refresh IR test cases
* Update README
* Fix toplevel_pure
* Random typo fix
* Use Core.declare_const instead of jl_set_const
* Don't test on 1.12 in CI
* Update test/decls.jl Co-authored-by: Em Chu <[email protected]>
* Expand global/local function def body properly
* Add a handful more IR tests for declarations
* Add tests for #59755

--------- Co-authored-by: Em Chu <[email protected]>
…jl#97) Also fix a small bug in `_eval` when file is `nothing`
…JuliaLowering.jl#100) The removed test attempted to pass a runtime-computed function name to `ccall` via `ccallable_sptest_name(T)`, but `ccall` now requires its function name argument to be a compile-time constant. This pattern only works with `@generated` functions from Julia 1.13 onwards, where the function name can be evaluated at code generation time. Currently JL cannot handle `@generated` functions, so this comments out the test case updated in the last commit. --------- Co-authored-by: Em Chu <[email protected]>
…uliaLang/JuliaLowering.jl#91) Adds support for nested splat expressions like `tuple((xs...)...)` by restructuring the splat expansion to match the native lowerer's recursive algorithm.

The native lowerer unwraps only one layer of `...` per pass and relies on recursive expansion to handle nested cases. This approach naturally builds the nested `_apply_iterate` structure through multiple expansion passes, avoiding the need for explicit depth tracking and normalization.

Changes:
- Refactor `_wrap_unsplatted_args` to unwrap only one layer of `...`
- Refactor `expand_splat` to construct unevaluated `_apply_iterate` call then recursively expand it
- Add test cases for nested splats including triple-nested and mixed-depth
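The nested structure can be written out by hand. The sketch below is an assumption about the shape the recursive expansion ends up with, based on how a single splat maps to `Core._apply_iterate`; it is not taken from the lowered IR in this change.

```julia
xs = ([1, 2], [3])

# One layer of splatting, equivalent to `tuple(xs...)`:
# each element of `xs` becomes one argument.
one_layer = Core._apply_iterate(Base.iterate, tuple, xs)

# Two layers, in the spirit of `tuple((xs...)...)`: the outer call splats
# `xs` into the arguments of an inner `_apply_iterate`, which in turn
# splats each of those elements into `tuple`.
two_layers = Core._apply_iterate(Base.iterate, Core._apply_iterate,
                                 (Base.iterate, tuple), xs)
```

Unwrapping one `...` per pass yields the outer call first. Expanding that call again, now with a splatted argument of its own, produces the inner one, so no explicit depth counter is needed.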
These are the simplest possible adaptions to create the vendored Base.JuliaSyntax from an in-tree version of JuliaSyntax (JuliaLowering to be hooked up later).
Thanks for doing this. Should this ideally be merged before or after #59818? I know JuliaLowering is not intended to be a 1.13 feature (and this PR doesn't hook it up, hence not containing the "feature") but this PR might make parser backports easier.
Assuming there's an easy way of setting this up, did we want to keep JuliaLowering CI separate (based on files touched) until it's fully integrated?
W.r.t the name, since the parser is also a public package (and will continue to be), I think the name JuliaParser is preferable. |
I've argued for JuliaParser and JuliaSyntax in the past, because it parses/lowers |
Isn't this the opposite direction we're trying to go in? That is, we've been trying to move more stuff out of the main Julia source tree, and into separate repos, right? At least, we've been moving a bunch of stdlibs into separate repos. |
I think in Base these are already called |
not sure it's worth such a minor change but what if the nominalizations were made to match? either both agent nouns or both gerunds (aka |
Not all code is equal, and making the repo smaller is not an end goal in itself. We've been trying to move out non-core stuff like LinearAlgebra etc, that are more or less independent of the language. The Julia parser and lowerer are core components and are heavily tied to internals, so having these be in other repos makes changes that need to touch e.g., both the lowerer, the parser and some other internal component, much more annoying than if it is in one repo and can all be done in a single PR. |
It's true it should make backports a lot easier and more natural (another reason why it's good for this to live in the main tree). I guess I'm cautiously optimistic about merging it beforehand but I could go either way.
Naming
Agreed, but I don't want the vendored versions to have different module names than the versions registered in General. That seems super confusing and is also inconvenient for utility code like the tests which will refer to the module name on occasion.
Yeah. Note that the fact that they're implemented in Julia was not considered when I named them. The point of having the "Julia" prefix is that
Ok, good to know that "Julia" prefix seems ok to you. Renaming JuliaLowering->JuliaSyntax ... hmm it makes some sense but feels like it may do more harm than good at this point with the parser having been called JuliaSyntax for quite some time now 😅 Haha! I see that my reservations about asking about naming have come to pass. It's never easy 😅 I'm getting a general vibe that keeping the I'd consider renaming JuliaSyntax->JuliaParser if there's a strong consensus which develops about that though I'm wary the churn might just not be worth it at this point. I'm fairly happy to rename
Alas I'm really not loving any of those. Possibly we're at a local maximum with the existing names. |
I'm OK with keeping |
There's been some interest in having the new Julia compiler frontend (JuliaSyntax + JuliaLowering) in the main Julia tree so that these are easier to work on together and so the new lowering code can co-evolve with changes to Core more easily.
Here's a simple sketch for moving both these libraries into the main tree as separate top level modules in the JuliaSyntax and JuliaLowering subdirectories. For git history, I've used `git-filter-repo` to rewrite the history of both repositories into their respective subdirectories. At the same time some light rewriting was performed to avoid confusion for commit messages referring to issue numbers. For example, if a commit in the JuliaSyntax history refers to #256, that will be rewritten to the string JuliaLang/JuliaSyntax.jl#256. (Note for completeness that the history of these projects also includes the git history of Tokenize.jl which is the origin of the lexer.)

There's a few questions / TODOs I'd like to consider before merging this:
How do we do CI of JuliaSyntax against old Julia versions?
JuliaSyntax currently supports Julia versions back to 1.0 (!!) Admittedly this may be excessive, but we should keep the JuliaSyntax registered in General working for at least some older Julia versions.
The problem is I know very little about how to set this up and I'd like advice or help :) @IanButterworth I can see you're active with both Buildkite and GitHub Actions infrastructure - I hoped you might have some thoughts or be able to point me in the right direction? Presumably we download pre-built versions from `julialang-s3.julialang.org` and test the JuliaSyntax module against those in addition to the current dev version of Julia.
Easing the archiving of JuliaLang/JuliaSyntax
There's enough open PRs on JuliaSyntax that it'd be nice to make migrating those to the main Julia repository easy. My rough plan is to filter all branches while running `git-filter-repo` and push those filtered branches to JuliaLang/JuliaSyntax. Then PR authors should be able to grab the filtered version of their branch and apply it to the main Julia repo without issues. I haven't figured out the details of this yet but it should be done in one `git-filter-repo` run to ensure consistency of version hashes.

When this is done I'll also move c42f/JuliaLowering.jl into JuliaLang/JuliaLowering.jl and archive it so there's a more permanent home for the associated github issue and PR discussions.
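For reference, the issue-number rewriting described earlier (a bare #256 becoming JuliaLang/JuliaSyntax.jl#256) could be sketched as below. This is a hypothetical helper written in Julia for illustration; the actual rewrite was done with `git-filter-repo`, and the exact pattern used there is not shown in this PR.

```julia
# Hypothetical helper; the regex is an assumption about what counts as a
# "bare" issue reference.
function qualify_issue_refs(message::AbstractString, repo::AbstractString)
    # Match `#123` only when it is not already part of `owner/repo#123`.
    pattern = r"(?<![\w/.])#(\d+)"
    return replace(message, pattern => SubstitutionString(string(repo, "#\\1")))
end

msg = qualify_issue_refs("Fix #256 (see JuliaLang/julia#100)",
                         "JuliaLang/JuliaSyntax.jl")
```

References that already name a repository are left alone, so running the rewrite a second time does not change the message again.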
What should these modules be called?
I hesitate to bring this up because it might become a distraction. But if we want to rename either of these modules it makes sense to do it now while we're moving git histories around.
Originally, `JuliaSyntax` was named that way because there was a very old and obsolete `JuliaParser` already taking the name, and the prefix "Julia" was used for clarity given that it was going into the General registry. (Also, the parser work was started as an experimental side project and taking a canonical name seemed rather too bold 😅) If we want to claim a more canonical name at this point we might consider renaming it to `Parser`. (Of course we could take `JuliaParser` as a name, but that seems marginal enough that we may as well stick with the existing name.)

`JuliaLowering` was named with the same convention but if we change the JuliaSyntax name to just Parser we might also consider renaming JuliaLowering to something like `Lowering` or `CodeLowering`. `CompilerFrontend` is also a tempting name but not including the parser in the "compiler frontend" would be a bit weird.