Member

@c42f c42f commented Oct 17, 2025

There's been some interest in having the new Julia compiler frontend (JuliaSyntax + JuliaLowering) in the main Julia tree so that these are easier to work on together and so the new lowering code can co-evolve with changes to Core more easily.

Here's a simple sketch for moving both these libraries into the main tree as separate top level modules in the JuliaSyntax and JuliaLowering subdirectories. For git history, I've used git-filter-repo to rewrite the history of both repositories into their respective subdirectories. At the same time some light rewriting was performed to avoid confusion for commit messages referring to issue numbers. For example, if a commit in the JuliaSyntax history refers to #256, that will be rewritten to the string JuliaLang/JuliaSyntax.jl#256. (Note for completeness that the history of these projects also includes the git history of Tokenize.jl which is the origin of the lexer.)
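
To make the issue-number rewriting concrete, here is a minimal Julia sketch of the transformation on a single commit message. It is not the script that was actually run (that went through git-filter-repo); the helper name `qualify_issue_refs` and the regex are assumptions made purely for illustration.

```julia
# Hypothetical helper, for illustration only: qualify bare `#123` references
# in a commit message with the repository they originally pointed at.
function qualify_issue_refs(msg::AbstractString, repo::AbstractString)
    # Match `#<digits>` only when it is not preceded by a repo-like character,
    # so an already-qualified reference such as `JuliaLang/julia#123` is left alone.
    bare_ref = r"(?<![\w/.-])#\d+"
    return replace(msg, bare_ref => m -> repo * m)
end

msg = "Fix lexing of dotted operators (#256), see also JuliaLang/julia#123"
println(qualify_issue_refs(msg, "JuliaLang/JuliaSyntax.jl"))
```

Only the bare `#256` is qualified here; the reference that already names a repository is left unchanged.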

There's a few questions / TODOs I'd like to consider before merging this:

How do we do CI of JuliaSyntax against old Julia versions?

JuliaSyntax currently supports Julia versions back to 1.0 (!!) Admittedly this may be excessive, but we should keep the JuliaSyntax registered in General working for at least some older Julia versions.

The problem is I know very little about how to set this up and I'd like advice or help :) @IanButterworth I can see you're active with both Buildkite and GitHub Actions infrastructure - I hoped you might have some thoughts or be able to point me in the right direction? Presumably we download pre-built versions from julialang-s3.julialang.org and test the JuliaSyntax module against those in addition to the current dev version of Julia.

Easing the archiving of JuliaLang/JuliaSyntax

There's enough open PRs on JuliaSyntax that it'd be nice to make migrating those to the main Julia repository easy. My rough plan is to filter all branches while running git-filter-repo and push those filtered branches to JuliaLang/JuliaSyntax. Then PR authors should be able to grab the filtered version of their branch and apply it to the main Julia repo without issues. I haven't figured out the details of this yet but it should be done in one git-filter-repo run to ensure consistency of version hashes.

When this is done I'll also move c42f/JuliaLowering.jl into JuliaLang/JuliaLowering.jl and archive it so there's a more permanent home for the associated github issue and PR discussions.

What should these modules be called?

I hesitate to bring this up because it might become a distraction. But if we want to rename either of these modules it makes sense to do it now while we're moving git histories around.

Originally, JuliaSyntax was named that way because there was a very old and obsolete JuliaParser already taking the name, and the prefix "Julia" was used for clarity given that it was going into the General registry. (Also, the parser work was started as an experimental side project and taking a canonical name seemed rather too bold 😅) If we want to claim a more canonical name at this point we might consider renaming it to Parser. (Of course we could take JuliaParser as a name, but that seems marginal enough that we may as well stick with the existing name.)

JuliaLowering was named with the same convention but if we change the JuliaSyntax name to just Parser we might also consider renaming JuliaLowering to something like Lowering or CodeLowering. CompilerFrontend is also a tempting name but not including the parser in the "compiler frontend" would be a bit weird.

c42f and others added 30 commits August 31, 2024 15:54
This makes it easier to run subsets of tests by just including the
appropriate file
Also fix a couple of test cases which weren't being run correctly
including updating exceptions_ir.jl to avoid use of globals
Make all of `is_valid_ir_argument`, `is_valid_body_ir_argument`,
`is_single_assign_var`, `is_const_read_arg` more accurate portings of
the flisp equivalents.

Some values in the IR must be written to temporaries for the resulting
code to be correct. It's not clear which invariants we're upholding here
because none of these seem to be documented, but it seems important to
have these be as equivalent as possible for now. Some changes are still
required to these after the variable analysis pass is more accurate.

Also avoid using `isdefined()` when looking up globals in Julia modules
during lowering - this does import resolution and we can't allow this
side effect when generating IR.

Instead use the new functions `is_defined_and_owned_global` and
`is_defined_nothrow_global`, which dip a bit into Julia internals to look
up bindings and determine the binding owner without having any side effects.
This should prevent uncommitted tests from accidentally being deleted
if the dev code crashes.
Expand general assignment syntax, including
  * UnionAll definitions
  * Chained assignments
  * Setting of structure fields
  * Destructuring
  * Typed variable declarations

Still TODO
  * Eliminating tuples in case sides match
  * Assignments to array elements
Deal with cases like `(x,y) = (a,b)`. Still need to deal with slurps and
splats.
…x.jl#501)

This patch fixes serialization of `Kind`s to use `sizeof` (number of
bytes) instead of `length` (number of characters) when computing number
of bytes in the stringified `Kind`.
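
The distinction matters as soon as a name contains non-ASCII characters. A minimal sketch of length-prefixed string serialization follows; the helpers `write_prefixed` and `read_prefixed` are hypothetical and are not the JuliaSyntax code, they only show why the prefix has to come from `sizeof`.

```julia
# Length-prefixed string serialization. The byte count must come from
# `sizeof`, because `length` counts characters, not bytes.
function write_prefixed(io::IO, s::String)
    write(io, UInt8(sizeof(s)))   # number of bytes that follow
    write(io, s)
end

function read_prefixed(io::IO)
    nbytes = Int(read(io, UInt8))
    return String(read(io, nbytes))
end

name = "÷"                        # one character, two bytes in UTF-8
@show length(name) sizeof(name)

io = IOBuffer()
write_prefixed(io, name)
seekstart(io)
roundtrip = read_prefixed(io)
@show roundtrip == name
```

Had the prefix been `length(name)`, the reader would consume only the first byte of `÷` and produce an invalid string.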
At some point we can hopefully replace `local_def` with `local`, combined with
a future `struct BindingFlags`.
Named tuple destructuring is needed to implement `kw_call`.

Also change expansion of `K"."` in `expand_forms_1()` to lower the
second element to K"Symbol" early on.  We should probably do this
upstream in JuliaSyntax :)
* A vector of `Slot`s is now created and passed into the `CodeInfo`
  creation pass so that code doesn't need access to the `Bindings`
  anymore. This is a better separation of data structures between
  passes.
* Use K"Placeholder" for unused slots.
* Fix small bug which made argument slurping broken.
Also fix a bug in linearization of `K"isdefined"`
…#511)

Julia's ecosystem (including Base.Docs and flisp lowering) assumes that
strings within `struct` definitions are per-field docstrings, but the
flisp parser doesn't handle these - they are only recognized when the
struct itself has a docstring and are processed by the `@doc` macro
recursing into the struct's internals. For example, the following
doesn't result in any docs attached to `A`.

```julia
struct A
    "x_docs"
    x

    "y_docs"
    y
end
```

This change adds `K"doc"` node parsing to the insides of a struct,
making the semantics clearer in the parser tree and making it possible
to address these problems in the future within JuliaLowering.

Also ensure that the `Expr` form is unaffected by this change.
…ng/JuliaSyntax.jl#506)

* Don't assume that `SubString` has `pointer` and copy instead

* Still assume `SubString{String}` has `pointer`

* Test with `Test.GenericString`
mlechu and others added 17 commits September 16, 2025 10:45
…aLang/JuliaLowering.jl#75)

Was causing several stdlib failures.  MWE:
```
julia> ex = Meta.parse("begin
       x = 111
       x = 222
       end")

JuliaLowering.core_lowering_hook(ex, Main, "foo.jl", 100)
```

If `core_lowering_hook` is given one filename (e.g. "none" from `@eval`), but
     some part of the expression contains LineNumberNodes with a different
     filename, we trigger the "inlined macro-expansion" logic in the debuginfo
     generator, which assumes new filenames are from new macro expansions atop
     the old filename.  The violated invariant is that the list of files in this
     statement's flattened provenance shares some prefix with the last
     statement's list of files.

This fix assumes there is some base file that all statements share, and
     normalizes different base filenames to the first it sees.

Aside: Not sure if this stack logic is 100% correct given that two adjacent
     statements can share arbitrarily many file stack entries despite being from
     different macro expansions.
…/JuliaLowering.jl#80)

---------

Co-authored-by: Claire Foster <[email protected]>
Adapt to JuliaSyntax changes in how macro names are represented in the tree. This is a bit messy
but is important to keep in sync for now until we figure out how the green tree relates
to (or differs from) the AST seen by macros and lowering.
This is a step toward an iteration interface for lowering which can
return a sequence of CodeInfo to be evaluated for top level and module
expressions.

This also restricts lowering of module expressions to be syntactically
at top level (ie, not inside a top level thunk), consistent with the
existing way that they're handled in eval.
Julia's incrementally evaluated top level semantics make it rather
tricky to design a lowering interface for top level and module level
expressions. Currently these expressions are effectively *interpreted*
by eval rather than ever being processed by lowering.

However, I'd like a cleaner separation between "low level evaluation"
and lowering, such that Core can contain only the low level eval "driver
function". I'd like to propose the split as follows:

* "Low level" evaluation is about executing a sequence of thunks
  represented as `CodeInfo` and creating modules for those to be
  executed inside.
* Lowering is about expression processing.

In principle, the runtime's view of `eval()` shouldn't know about `Expr`
or `SyntaxTree` (or whatever AST we use) - that should be left to the
compiler frontend. A useful way to think about the duties of the
frontend is to consider the question "What if we wanted to host another
language on top of the Julia runtime?". If we can eventually achieve
that without ever generating Julia `Expr` then we will have succeeded in
separating the frontend.

To implement all this I've recast lowering as an incremental iterative
API in this change. Thus it's the job of `eval()` to simply evaluate
thunks and create new modules as driven by lowering. (Perhaps we'd move
this definition of `eval()` over to the Julia runtime before 1.13.) The
iteration API is currently oddly bespoke and arguably somewhat
non-Julian for two reasons:

* Lowering knows when new modules are required, and may request them
  with `:begin_module`. However `eval()` generates those modules so they
  need to be passed back into lowering. So we can't just use
  `Base.iterate()`. (Put a different way, we have a situation which is
  suited to coroutines but we don't want to use full Julia `Task`s for
  this.)
* We might want to implement this `eval()` in Julia's C runtime code or
  early in bootstrap. Hence the use of `SimpleVector` and `Symbol` as the
  return values of `lower_step()`.

We might consider changing at least the second of these choices,
depending on how we end up integrating this into Base.
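
The driver/lowering split described above can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyLowering`, `toy_lower_step`, `toy_eval` and the `:thunk`, `:end_module` and `:done` tags are hypothetical names (only `:begin_module` and the `SimpleVector`/`Symbol` return convention come from the text), and thunks are plain `Expr`s rather than `CodeInfo`. It is not the JuliaLowering API.

```julia
# Toy stand-in for an incremental lowering iterator (hypothetical names).
mutable struct ToyLowering
    steps::Vector{Core.SimpleVector}  # pre-computed steps to hand out
    pos::Int
    modules::Vector{Module}           # modules handed back by the driver
end
ToyLowering(steps) = ToyLowering(steps, 1, Module[])

# Return the next step. The driver passes back any module it created for the
# previous `:begin_module` request, which is why plain `Base.iterate()`
# does not fit this protocol.
function toy_lower_step(it::ToyLowering, new_module=nothing)
    new_module === nothing || push!(it.modules, new_module)
    it.pos > length(it.steps) && return Core.svec(:done)
    step = it.steps[it.pos]
    it.pos += 1
    return step
end

# The "low level eval" side: run thunks and create modules on request.
function toy_eval(mod::Module, it::ToyLowering)
    modstack = Module[mod]
    result = nothing
    handback = nothing
    while true
        step = toy_lower_step(it, handback)
        handback = nothing
        tag = step[1]
        if tag === :thunk
            result = Core.eval(modstack[end], step[2])
        elseif tag === :begin_module
            handback = Module(step[2])   # standalone module, for simplicity
            push!(modstack, handback)
        elseif tag === :end_module
            pop!(modstack)
        else  # :done
            return result
        end
    end
end

it = ToyLowering([
    Core.svec(:begin_module, :Inner),
    Core.svec(:thunk, :(x = 42)),
    Core.svec(:end_module),
    Core.svec(:thunk, :(y = 2)),
])
last_value = toy_eval(Module(:Outer), it)
```

The point of the sketch is the direction of data flow: lowering decides when a module is needed, the driver creates it, and the module travels back into lowering on the next step.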
…el-interpret-modules

Incremental lowering API
…aLowering.jl#86)

Fix the lowering of `cglobal` to produce `GlobalRef(Core.Intrinsics, :cglobal)`
instead of a bare symbol `:cglobal`. The inference validator requires cglobal
to be a GlobalRef:
https://github.com/JuliaLang/julia/blob/7a8cd6e202f1d1216a6c0c0b928fb43a123cada8/Compiler/src/validation.jl#L87

With this commit `_to_lowered_expr` resolves `cglobal` to `GlobalRef(Core.Intrinsics, :cglobal)`,
matching Julia's builtin lowerer behavior and satisfying the inference
validator's requirements.
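
For readers unfamiliar with the distinction, a small sketch of the two representations. It shows only the data structures involved, not the `_to_lowered_expr` code itself.

```julia
# A bare symbol carries no module information; it is resolved relative to
# whatever module the code ends up running in.
bare = :cglobal

# A `GlobalRef` names the binding's module explicitly.
ref = GlobalRef(Core.Intrinsics, :cglobal)

@show typeof(bare) typeof(ref)
@show ref.mod ref.name
@show isdefined(ref.mod, ref.name)
```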
…liaLowering.jl#87)

* Support `const foo() = ...`

* Add support for destructuring `const`

* Generate Core.declare_const; make constdecl a lowering-only kind

* Generate Core.declare_global; remove globaldecl kind, desugar global

Corresponds to JuliaLang/JuliaLowering.jl#58279 (also take unused_only)

* Refresh IR test cases

* Update README

* Fix toplevel_pure

* Random typo fix

* Use Core.declare_const instead of jl_set_const

* Don't test on 1.12 in CI

* Update test/decls.jl

Co-authored-by: Em Chu <[email protected]>

* Expand global/local function def body properly

* Add a handful more IR tests for declarations

* Add tests for #59755

---------

Co-authored-by: Em Chu <[email protected]>
…jl#97)

Also fix a small bug in `_eval` when file is `nothing`
…JuliaLowering.jl#100)

The removed test attempted to pass a runtime-computed function name to
`ccall` via `ccallable_sptest_name(T)`, but `ccall` now requires its
function name argument to be a compile-time constant.

This pattern only works with `@generated` functions from Julia 1.13
onwards, where the function name can be evaluated at code generation
time.

Currently JL cannot handle `@generated` functions, so the test case
updated in the last commit is commented out.
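
For contrast, a minimal sketch of the form that remains valid, where the function name is a literal symbol known at compile time. It assumes the C library's `strlen` is resolvable on the current platform and is unrelated to the removed test.

```julia
# The function name is the literal symbol `:strlen`, fixed at compile time.
n = ccall(:strlen, Csize_t, (Cstring,), "hello")
@show n
```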

---------

Co-authored-by: Em Chu <[email protected]>
…uliaLang/JuliaLowering.jl#91)

Adds support for nested splat expressions like `tuple((xs...)...)` by
restructuring the splat expansion to match the native lowerer's
recursive algorithm.

The native lowerer unwraps only one layer of `...` per pass and relies
on recursive expansion to handle nested cases. This approach naturally
builds the nested `_apply_iterate` structure through multiple expansion
passes, avoiding the need for explicit depth tracking and normalization.

Changes:
- Refactor `_wrap_unsplatted_args` to unwrap only one layer of `...`
- Refactor `expand_splat` to construct unevaluated `_apply_iterate` call
  then recursively expand it
- Add test cases for nested splats including triple-nested and mixed-depth
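
A short sketch of what a nested splat evaluates to, together with hand-written `Core._apply_iterate` calls. The hand-written forms reflect my reading of the one-layer-per-pass description above and are an assumption, not a dump of the actual lowered IR.

```julia
xs = [(1, 2), (3, 4)]

# Nested splat: the inner `...` splats `xs`, the outer `...` splats each element.
nested = tuple((xs...)...)

# One layer unwrapped: a call that still contains a single splat.
one_layer = Core._apply_iterate(iterate, tuple, xs...)

# Both layers written out by hand: `_apply_iterate` applied to itself.
by_hand = Core._apply_iterate(iterate, Core._apply_iterate, (iterate, tuple), xs)

@show nested one_layer by_hand
```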
These are the simplest possible adaptions to create the vendored
Base.JuliaSyntax from an in-tree version of JuliaSyntax (JuliaLowering
to be hooked up later).
@topolarity topolarity requested review from Keno and mlechu October 17, 2025 13:24
@mlechu
Member

mlechu commented Oct 17, 2025

Thanks for doing this.

Should this ideally be merged before or after #59818? I know JuliaLowering is not intended to be a 1.13 feature (and this PR doesn't hook it up, hence not containing the "feature") but this PR might make parser backports easier.

CI

Assuming there's an easy way of setting this up, did we want to keep JuliaLowering CI separate (based on files touched) until it's fully integrated?

What should these modules be called?

Parser and Lowering sound good to me.

@KristofferC
Member

W.r.t the name, since the parser is also a public package (and will continue to be), I think the name JuliaParser is preferable.

@Keno
Member

Keno commented Oct 17, 2025

I've argued for JuliaParser and JuliaSyntax in the past, because it parses/lowers julia syntax (as opposed to Compiler which could in principle be more general).

@DilumAluthge
Member

Isn't this the opposite direction we're trying to go in? That is, we've been trying to move more stuff out of the main Julia source tree, and into separate repos, right? At least, we've been moving a bunch of stdlibs into separate repos.

@vtjnash
Member

vtjnash commented Oct 17, 2025

I think in Base these are already called Base.Meta (as short for metaprogramming), and JuliaParser/JuliaSyntax are internal implementation details (in the past). I'd think Base.Meta.Parser and Base.Meta.Lowering would be sufficient name spacing to avoid needing to call it Base.Meta.JuliaLowering there. The packages themselves might stick with the existing names to avoid ecosystem churn. Also, that they happen to be implemented in Julia doesn't preclude potentially wanting XMLParser or FloatParser packages too, so Parser alone is rather ambiguous.

@adienes
Member

adienes commented Oct 17, 2025

not sure it's worth such a minor change but what if the nominalizations were made to match? either both agent nouns or both gerunds (aka Parser & Lowerer or Parsing & Lowering, but not Parser & Lowering)

@KristofferC
Member

Isn't this the opposite direction we're trying to go in? That is, we've been trying to move more stuff out of the main Julia source tree, and into separate repos, right?

Not all code is equal, and making the repo smaller is not an end goal in itself. We've been trying to move out non-core stuff like LinearAlgebra etc, that are more or less independent of the language. The Julia parser and lowerer are core components and are heavily tied to internals, so having these be in other repos makes changes that need to touch e.g., both the lowerer, the parser and some other internal component, much more annoying than if it is in one repo and can all be done in a single PR.

@c42f
Member Author

c42f commented Oct 19, 2025

Should this ideally be merged before or after #59818? I know JuliaLowering is not intended to be a 1.13 feature (and this PR doesn't hook it up, hence not containing the "feature") but this PR might make parser backports easier.

It's true it should make backports a lot easier and more natural (another reason why it's good for this to live in the main tree). I guess I'm cautiously optimistic about merging it beforehand but I could go either way.

Naming

Base.Meta.Parser and Base.Meta.Lowering would be sufficient name spacing

Agreed, but I don't want the vendored versions to have different module names than the versions registered in General. That seems super confusing and is also inconvenient for utility code like the tests which will refer to the module name on occasion.

Also, that they happen to be implemented in Julia doesn't preclude potentially wanting XMLParser or FloatParser packages too

Yeah. Note that the fact that they're implemented in Julia was not considered when I named them. The point of having the "Julia" prefix is that JuliaSyntax is about the syntax of the julia language.

I've argued for JuliaParser and JuliaSyntax in the past, because it parses/lowers julia syntax

Ok, good to know that "Julia" prefix seems ok to you. Renaming JuliaLowering->JuliaSyntax ... hmm it makes some sense but feels like it may do more harm than good at this point with the parser having been called JuliaSyntax for quite some time now 😅


Haha! I see that my reservations about asking about naming have come to pass. It's never easy 😅

I'm getting a general vibe that keeping the Julia prefix is probably a good idea. And personally I very much want the vendored modules to have the same name internally in the compiler as outside.

I'd consider renaming JuliaSyntax->JuliaParser if there's a strong consensus which develops about that though I'm wary the churn might just not be worth it at this point.

I'm fairly happy to rename JuliaLowering if we can find a better name. It's not registered in General and I do feel the existing name is not the best. But if we keep the "Julia" prefix - what are we left with if JuliaSyntax is taken? Just throwing some ideas out there:

  • JuliaFrontend? - too general
  • JuliaSyntaxAnalysis - awkward and long but somewhat accurate
  • JuliaSemanticAnalysis - Wow I don't want to type that. But lowering does mostly align with what would usually be called the semantic analysis phase
  • JuliaSemantics - seems weirdly vague

Alas I'm really not loving any of those. Possibly we're at a local maximum with the existing names.

@mlechu
Member

mlechu commented Oct 21, 2025

I'm OK with keeping JuliaSyntax and JuliaLowering if the package names need to be the same as the module names in Base. JuliaParser accurately describes the way we currently use JuliaSyntax, but after JuliaLowering it's both the parser and a library of shared syntax utilities. (We could consider pulling it apart into parser and shared code, though I'm not sure it would be productive).
