[mlir-gen] Add mlir builders for llama3.1 and tests #13
Conversation
Should this be in |
The e2e should be, yup, but this is mostly tests and getters.
I moved the whole thing to examples and added attention to the list of tests.
rengolin left a comment
Thanks!
rolfmorel left a comment
Nice! Have left some comments inline.
.github/workflows/examples.yml (outdated)
```yaml
- name: Run pytest-enabled examples as tests
  run: |
    uv run pytest python/examples
```
Could we instead integrate with lit? That is, would it work if we just added the line # RUN: %PYTHON pytest %s at the top of test_llama3.py?
There's value in trying to preserve being able to just run lit $PATH_WITHIN_PROJECT for the respective tests (including PATH_WITHIN_PROJECT=. in the root).
On second thought: the run line should probably just be # RUN: pytest %s and for CI we would place a (conditional) substitution like config.substitutions.append(("pytest", "uv run pytest")) in lit.cfg.py.
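For illustration, a minimal sketch of how that could look, assuming test_llama3.py starts with the line # RUN: pytest %s and that lit.cfg.py only rewrites the pytest command when asked to (the environment variable name below is made up; config is the object lit provides to the config file):

```python
# Hypothetical fragment for lit.cfg.py; `config` is injected by lit itself.
import os

# Only route pytest through uv when explicitly requested (e.g. by CI).
# The TEST_WITH_UV name is purely illustrative.
if os.environ.get("TEST_WITH_UV"):
    config.substitutions.append(("pytest", "uv run pytest"))
```

Locally, the RUN line would then just pick up whatever pytest is on PATH.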
Sure we can, but why? I'd expect that only lit tests are run when I type lit.
If we want a single entry point for all tests, I'd suggest adding something like uv run test [all], with test being a script that launches all the necessary suites (or groups of them).
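One possible shape for such a script, sketched under the assumption that it would be wired up as a project script named test; the file name, group names, and commands below are all illustrative:

```python
#!/usr/bin/env python3
"""Hypothetical top-level test driver: `uv run test [group ...]`."""
import subprocess
import sys

# Illustrative grouping; the real layout would decide what each group runs.
GROUPS = {
    "lit": ["lit", "test"],
    "examples": ["pytest", "python/examples"],
}


def main(argv: list[str]) -> int:
    requested = argv or ["all"]
    selected = GROUPS if "all" in requested else {g: GROUPS[g] for g in requested}
    for name, cmd in selected.items():
        print(f"-- running {name}: {' '.join(cmd)}")
        if subprocess.call(cmd) != 0:
            return 1
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```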
Anyway, I routed it through lit for now.
It's mainly two things for me: 1) so we can have a single entry point, and 2) that we can deal with many different filetypes (e.g. in an upcoming tests dir we are likely to have .mlir files which have something run on them and are then FileCheck-ed).
As an LLVM project, lit seems the natural choice and less foreign than our own test script with its own interface (e.g. lit $DIR to identify the subtree to run on, --filter to filter tests -- exactly like in llvm-project).
The hardcoding of uv run in lit's current config is a bit unfortunate. Mea culpa. Preferably we wouldn't need people to use uv for anything, i.e. users could make everything work without it. (That is, the unconditional substitution %PYTHON -> uv run in lit.cfg.py should be changed to be, at least, configurable.)
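A sketch of what a configurable version might look like (parameter and variable names are illustrative; config and lit_config are the objects lit injects into lit.cfg.py):

```python
# Hypothetical lit.cfg.py fragment; `config` and `lit_config` come from lit.
import os
import sys

# Default to the interpreter running lit, but allow an override, e.g.
#   lit --param python_prefix="uv run python" test/
python_cmd = lit_config.params.get(
    "python_prefix", os.environ.get("PYTHON_PREFIX", sys.executable)
)
config.substitutions.append(("%PYTHON", python_cmd))
```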
I don't think we should worry too much about this right now. These are examples, likely to migrate to proper scripts later, when the APIs are usable and everything. For now, lit calling pytest on a RUN line is perfectly fine.
python/examples/llama/test_llama3.py (outdated)
```python
def create_pass_pipeline(ctx: ir.Context) -> PassManager:
    with ctx:
        pm = PassManager("builtin.module")
```
Suggested change:
```diff
-def create_pass_pipeline(ctx: ir.Context) -> PassManager:
-    with ctx:
-        pm = PassManager("builtin.module")
+def create_pass_pipeline() -> PassManager:
+    pm = PassManager("builtin.module")
```
See comment below for how to elide dealing with contexts in most places.
The above holds for many functions in this file.
For this particular function, it should just become the suffix of the schedule, so we just have end-to-end schedules for the entire MLIR lowering that is happening.
See https://github.com/libxsmm/tpp-mlir/blob/37a498bd1e320e00fa50e3323cbaac2867cd7a1e/python/mlir/tpp/sched/bundles.py#L41-L43 for an example of dealing with passes that expect to run on particular ops relative to the root module.
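The gist of the linked lines, as a hedged sketch (the pass names here are placeholders, not the real bundle): nest passes that expect a func.func anchor inside a func.func(...) group so the pipeline as a whole still runs on the root builtin.module.

```python
from mlir import ir
from mlir.passmanager import PassManager

with ir.Context(), ir.Location.unknown():
    # A trivial module just to have something to run the pipeline on.
    module = ir.Module.parse("func.func @f() { return }")
    # Passes anchored on func.func are nested under func.func(...), while the
    # PassManager itself stays anchored at the root builtin.module.
    pm = PassManager.parse("builtin.module(func.func(canonicalize,cse))")
    pm.run(module.operation)
```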
Here's one of the concrete elisions of ctx.
python/examples/llama/test_llama3.py (outdated)
```python
def create_schedule(ctx: ir.Context) -> ir.Module:
    """
    Create an MLIR module containing transformation schedule.
    The schedule provides partial lowering to scalar operations.

    Args:
        ctx: MLIR context.
    """
    with ctx, ir.Location.unknown(context=ctx):
        # Create transform module.
        schedule = ir.Module.create()
```
Suggested change:
```diff
-def create_schedule(ctx: ir.Context) -> ir.Module:
-    """
-    Create an MLIR module containing transformation schedule.
-    The schedule provides partial lowering to scalar operations.
-
-    Args:
-        ctx: MLIR context.
-    """
-    with ctx, ir.Location.unknown(context=ctx):
-        # Create transform module.
-        schedule = ir.Module.create()
+def create_schedule() -> ir.Module:
+    schedule = ir.Module.create()
```
And just de-indent the rest of the function.
python/examples/llama/test_llama3.py (outdated)
```python
def bufferize_module(ctx: ir.Context, kernel: ir.Module) -> None:
    with ctx:
        pm = PassManager("builtin.module")
```
Suggested change:
```diff
-def bufferize_module(ctx: ir.Context, kernel: ir.Module) -> None:
-    with ctx:
-        pm = PassManager("builtin.module")
+def bufferize_module(kernel: ir.Module) -> None:
+    pm = PassManager("builtin.module")
```
python/examples/llama/test_llama3.py (outdated)
```python
def to_ir_type(type_str, ctx):
    if type_str == "f32":
        return ir.F32Type.get(context=ctx)
    elif type_str == "f64":
        return ir.F64Type.get(context=ctx)
```
Suggested change:
```diff
-def to_ir_type(type_str, ctx):
-    if type_str == "f32":
-        return ir.F32Type.get(context=ctx)
-    elif type_str == "f64":
-        return ir.F64Type.get(context=ctx)
+def to_ir_type(type_str):
+    if type_str == "f32":
+        return ir.F32Type.get()
+    elif type_str == "f64":
+        return ir.F64Type.get()
```
In effect, these .get() methods do an ir.Context.current lookup under the hood when you don't pass a context explicitly (just like the Op builders).
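A minimal sketch of the resulting pattern, assuming the test driver enters the context (and a default location) once at the top level so every helper picks it up implicitly:

```python
from mlir import ir


def to_ir_type(type_str: str) -> ir.Type:
    # No explicit context argument: .get() falls back to ir.Context.current.
    if type_str == "f32":
        return ir.F32Type.get()
    if type_str == "f64":
        return ir.F64Type.get()
    raise ValueError(f"unsupported element type: {type_str}")


# Entered once by the caller; everything inside sees it as the current context.
with ir.Context(), ir.Location.unknown():
    ir_type = to_ir_type("f32")
    module = ir.Module.create()
```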
python/examples/llama/test_llama3.py (outdated)
```python
ir_type = to_ir_type(elem_type, ctx)
module = generate_module(ir_type)
schedule = create_schedule(ctx)
```
Suggested change:
```diff
-ir_type = to_ir_type(elem_type, ctx)
-module = generate_module(ir_type)
-schedule = create_schedule(ctx)
+ir_type = to_ir_type(elem_type)
+module = generate_module(ir_type)
+schedule = create_schedule()
```
rengolin left a comment
I think there are a lot of smaller comments that we can leave for post-merge review. This is an example, and a complicated one at that, and we don't want to over-engineer something that will soon move to a better program.
Putting up this dirty draft for early feedback/questions. I'm putting together some tests to run an e2e llama3.1 going through linalg on tensors. The goal is to generate some nice linalg that would be optimization friendly. At the moment, there are only functional blocks and pieces that are just smoke-tested. These include naive implementations of rotary embeddings, feed forward, RMS norm, and a bunch of other small snippets that are useful for implementing the model. These are already enough to put an attention block together. It'd be nice to test it against the original implementation, but that'd require fairscale as a dependency. For now I only added pytest and kept the pipeline as simple as possible. I also reused the example with the schedule, so now it is part of every test.