Move from an hardcoded extern _start to a dynamic proc_macro #36

morr0ne · 2023-09-05T17:51:28Z

This PR is fairly complex, as it attempts to do several things at once.

First and foremost, it adds an origin-macros crate which exports a single macro called main that generates the glue code to start the program
Tries to remove the existing _start and entry implementations, so that users either have to implement them themselves, use the new macro, or rely on a wrapper such as origin-stdio or mustang
Reworks the API to make some items public, which lets the macro work without making it a requirement
Updates all the examples and tests to the new implementation
Updates all the docs to reflect the new changes

There are still some questions that need to be answered, mainly:

Should this completely substitute the old implementation or should they coexist?
How to coordinate updates to projects that depend on origin?
How should we handle relocation? The current code is completely broken, since it relied on the hard-coded extern.
Maybe rethink the scope of origin. Should it really handle all this stuff? Perhaps it could just perform the startup using Linux system calls without worrying about libc compatibility, and let that be handled by another crate.

morr0ne · 2023-09-05T17:57:41Z

The macro itself is actually done: it successfully generates the code for all archs, and the main function can be named anything. On that note, I forgot to mention in the previous comment that this PR also removes all args from main and expects it to be called without any. I am not 100% sure this is what we want, but it was easier to get a working implementation that way so that we can start testing.

sunfishcode · 2023-09-05T20:03:26Z

Could you say more about the use case that motivates this?

Would it be possible to implement the same syntax by building the proc macro on top of origin's existing API (though with main renamed to origin_main and using extern "Rust")? It seems like that should be able to achieve the same syntax, and wouldn't require significant changes to origin's other users.

sunfishcode · 2023-09-06T14:41:53Z

I haven't tested it, but I think this might be a way to do what you're looking to do here, within the current API:

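use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, ItemFn};

// Attribute macro: re-emits the user's function unchanged and adds an
// `origin_main` wrapper that forwards to it.
#[proc_macro_attribute]
pub fn main(_attr: TokenStream, input: TokenStream) -> TokenStream {
    let main_fn: ItemFn = parse_macro_input!(input);
    let main_fn_ident = &main_fn.sig.ident;

    quote! {
        #main_fn

        #[no_mangle]
        unsafe fn origin_main(argc: usize, argv: *mut *mut u8, envp: *mut *mut u8) -> i32 {
            #main_fn_ident()
        }
    }
    .into()
}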
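To make the suggestion above concrete, here is a hand-written sketch of what such a macro would plausibly expand to for a user function that takes no arguments. The name `app_main` is hypothetical, and this has not been checked against origin itself; it only illustrates the shape of the wrapper.

```rust
/// Stand-in for the user's function annotated with `#[main]`.
/// The name `app_main` is hypothetical; the macro accepts any identifier.
fn app_main() -> i32 {
    0
}

/// Hand-written equivalent of the wrapper the macro sketch would emit.
/// The real expansion also carries `#[no_mangle]` so that origin can find
/// the symbol at link time; it is left out here so the sketch builds as
/// ordinary Rust without origin present.
pub unsafe fn origin_main(_argc: usize, _argv: *mut *mut u8, _envp: *mut *mut u8) -> i32 {
    // The arguments are ignored, matching a user function that takes none.
    app_main()
}
```

The user-facing function keeps its zero-argument signature, while the symbol origin expects still receives `argc`, `argv`, and `envp` and simply drops them.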

morr0ne · 2023-09-10T16:28:27Z

This could technically work, but my intent was to avoid the unnecessary extern altogether. I don't think it makes sense for origin to link to some arbitrary symbols. That should be the job of a C-compatibility layer. Ideally origin would just provide the means to do so. The macro is meant for those rare use cases where only origin is wanted, and instead of relying on linking to other symbols it uses the actual main function directly. For a library that is included directly instead of linked against, I am not too fond of having "random" linking requirements. After all, origin is still young; there is no reason to define some arbitrary conventions when that can be left to a layer like mustang.

sunfishcode · 2023-09-10T18:01:32Z

By coincidence, I just encountered a technical reason why this macro approach would need to be significantly more complex. It turns out that the PIE relocation code had a bug (I called it!); it wasn't running early enough. It was relying on the linker doing ELF relaxation optimizations to eliminate GOT calls. But recent Rust nightly versions happened to disable ELF relaxation optimizations, causing calls that cross crate boundaries to go through the GOT. GOT entries in PIEs need to be relocated, so if we were to expand _start in the user crate, we'd need to expand the relocate function in the user crate too, because that needs to run before any calls to other crates can happen.

We could put all of relocate in the proc macro too, but that's a lot of code, and it makes one of the major downsides of proc macros even bigger: proc macros are hard to debug. Debuggers and backtraces don't know where you are inside of a proc macro. Editors and rustfmt don't provide as much assistance inside a quote! block. It's less convenient to insert print statements, because it requires publicly exporting the dependencies needed for printing.
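For readers unfamiliar with what "relocating GOT entries" involves, the following is a simplified model, not origin's actual relocate code. It assumes x86-64 and handles only `R_X86_64_RELATIVE` entries; the `image` slice and the function name `apply_relative_relocs` are hypothetical stand-ins for the loaded program and the startup routine.

```rust
/// Relocation type for "load base + addend" fixups on x86-64 (ELF psABI).
const R_X86_64_RELATIVE: u64 = 8;

/// Mirrors the field layout of an `Elf64_Rela` entry.
#[derive(Clone, Copy)]
pub struct Rela {
    pub r_offset: u64,
    pub r_info: u64,
    pub r_addend: i64,
}

/// Applies the relative relocations in `relocs` to `image`.
///
/// `image` models the loaded program as 8-byte words (an assumption made
/// to keep the sketch safe Rust), and `base` is the address it was loaded
/// at. Returns how many entries were applied.
pub fn apply_relative_relocs(image: &mut [u64], base: u64, relocs: &[Rela]) -> usize {
    let mut applied = 0;
    for r in relocs {
        // The low 32 bits of `r_info` hold the relocation type.
        if r.r_info & 0xffff_ffff != R_X86_64_RELATIVE {
            continue;
        }
        // Each slot (for example a GOT entry) must hold an absolute
        // address, which is only known once the load base is known.
        let slot = (r.r_offset / 8) as usize;
        image[slot] = base.wrapping_add(r.r_addend as u64);
        applied += 1;
    }
    applied
}
```

Until a loop like this has run, any call that goes through one of those slots jumps to an unrelocated address, which is why the step has to happen before the first cross-crate call.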

morr0ne · 2023-09-11T21:17:04Z

> By coincidence, I just encountered a technical reason why this macro approach would need to be significantly more complex. It turns out that the PIE relocation code had a bug (I called it!); it wasn't running early enough. It was relying on the linker doing ELF relaxation optimizations to eliminate GOT calls. But recent Rust nightly versions happened to disable ELF relaxation optimizations, causing calls that cross crate boundaries to go through the GOT. GOT entries in PIEs need to be relocated, so if we were to expand _start in the user crate, we'd need to expand the relocate function in the user crate too, because that needs to run before any calls to other crates can happen.

Can't say I'm surprised. We definitely need to find a proper way to implement relocation that doesn't feel hacky; I am not sure how it would fit with the proc_macro approach.

> We could put all of relocate in the proc macro too, but that's a lot of code, and it makes one of the major downsides of proc macros even bigger: proc macros are hard to debug. Debuggers and backtraces don't know where you are inside of a proc macro. Editors and rustfmt don't provide as much assistance inside a quote! block. It's less convenient to insert print statements, because it requires publicly exporting the dependencies needed for printing.

That doesn't feel right. The relocation code is very big, and my goal was to keep as much code out of the macro as possible, precisely for this reason. As I said before, I just want the macro to glue together some existing code instead of writing it itself.

I am going to go ahead and explain my use case, so perhaps we can come to a better conclusion together.

Right now I have a bunch of no_std executables that target a generic Linux platform without libc. There is no dynamic linking, so everything gets compiled to a completely static executable. Most of these programs are very basic and don't even require allocations. Each one of them contains something that looks like this:

fn main() -> i32 {
    0
}

#[naked]
#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    use core::arch::asm;

    fn entry() -> ! {
        rustix::runtime::exit_group(main())
    }

    asm!(
        "mov rdi, rsp",
        "push rbp",
        "jmp {entry}",
        entry = sym entry,
        options(noreturn),
    );
}

That is a simplified version of the code that I adapted from origin itself. Of course, I don't want to write this for every file; I want to write and maintain just one version. Ideally I'd love to just write something that looks like this:

#[main]
fn main() -> i32 {
   0
}

There is obviously significant overlap with origin, and it wouldn't make sense to write another crate that does just a subset of what origin does.

I only need the startup code, which is very small compared to everything else origin does, but I admit I am not confident enough about this low-level stuff to maintain such code myself, especially when great libraries like origin already exist. My hope is to find what I'm trying to achieve in origin itself. Maybe the conclusion from all of this is that origin does too much, and stuff like signal handling and maybe even relocation should live in sister crates appropriately named something like origin-signal.

sunfishcode · 2023-09-11T23:31:35Z

> That is a simplified version of the code that I adapted from origin itself. Of course, I don't want to write this for every file; I want to write and maintain just one version. Ideally I'd love to just write something that looks like this:
> #[main]
> fn main() -> i32 {
>    0
> }

If you want to write code that looks like that, consider using my example above.

> I only need the startup code, which is very small compared to everything else origin does, but I admit I am not confident enough about this low-level stuff to maintain such code myself, especially when great libraries like origin already exist. My hope is to find what I'm trying to achieve in origin itself. Maybe the conclusion from all of this is that origin does too much, and stuff like signal handling and maybe even relocation should live in sister crates appropriately named something like origin-signal.

The signal code and relocation code are already behind cargo features; if you don't enable the origin-signal or experimental-relocate features, respectively, that code isn't compiled in.

I added the example-crates/tiny example to show how origin can be used to produce very small executables. A big part of how it achieves that is by disabling origin's various optional features, which reduces the amount of code in origin down to almost nothing.
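As a rough illustration of how feature gating keeps unused code out of the binary, here is a minimal sketch. The function `startup_steps` and the step names are hypothetical and do not reflect origin's internals; only the two feature names come from the comment above.

```rust
/// Lists the startup phases that would be compiled in.
/// Hypothetical helper; origin's real startup path is structured differently.
pub fn startup_steps() -> Vec<&'static str> {
    let mut steps = vec!["entry"];

    // Compiled only when the crate is built with this feature enabled.
    #[cfg(feature = "experimental-relocate")]
    steps.push("relocate");

    // Likewise for the signal-handling support.
    #[cfg(feature = "origin-signal")]
    steps.push("signal setup");

    steps.push("call origin_main");
    steps
}
```

With neither feature enabled, the two gated statements are removed before type checking, so they contribute nothing to the final executable.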

sunfishcode · 2023-09-21T20:20:13Z

I'm open to ideas here; however, the constraints of the relocate code mean we don't have much flexibility. I think the current system works pretty well in practice, and it can support the end-user syntax you're discussing. So if you have further thoughts, feel free to reopen this or file new issues.

morr0ne added 15 commits September 4, 2023 15:52

Move logic from entry to indipendent functions

b55be91

Move some logic to call_user_code fn

e926c29

Create origin-macros crate

2881889

Create empty macro stub

6f8c04e

Added main macro to origin crate

2c93544

Added required syn feature

90eb327

Initial macro implementation

69eeb9a

Fully implement macro

1dda4a7

Add readme file to macros

21ee118

Updated manifest informations

750573c

Remove provided _start impementation

c7c7697

Mark some functions as public

51847d8

Update examples to use macros

16f99c6

Add fix to make test pass

6eac445

Merge branch 'main' into proc_macro

0d94dab

sunfishcode mentioned this pull request Sep 10, 2023

Change main from extern "C" to extern "Rust" #34

Closed

sunfishcode closed this Sep 21, 2023
