
Move from a hardcoded extern _start to a dynamic proc_macro #36

Closed
wants to merge 15 commits into from

Conversation

morr0ne
Contributor

@morr0ne morr0ne commented Sep 5, 2023

This PR is very complex as it attempts to do a bunch of things at once.

  • First and foremost, it adds an origin-macros crate which exports a single macro called main that generates the glue code to start the program
  • Tries to remove the existing _start and entry implementations so that users either have to implement them themselves, use the new macro, or rely on a wrapper such as origin-stdio or mustang
  • Reworks the API to make some items public, so that the macro can work without being a requirement
  • Updates all the examples/tests to the new implementation
  • Updates all the docs to reflect the new changes

There are still some questions that need to be answered, mainly:

  • Should this completely replace the old implementation or should they coexist?
  • How to coordinate updates to projects that depend on origin?
  • How should we handle relocation? The current code is completely broken since it relied on the hard-coded extern.
  • Maybe rethink the scope of origin. Should it really handle all this stuff? Perhaps it could just perform the startup using Linux system calls without worrying about libc compatibility and let that be handled by another crate

@morr0ne
Contributor Author

morr0ne commented Sep 5, 2023

The macro itself is actually done: it successfully generates the code for all archs, and the main function can actually be named anything. On that note, I forgot to mention in the previous comment that this PR also removes all args from main and expects it to be called without any. I am not 100% sure this is what we want, but it was easier to get a working implementation that way so that we can start testing.

@sunfishcode
Owner

Could you say more about the use case that motivates this?

Would it be possible to implement the same syntax by building the proc macro on top of origin's existing API (though with main renamed to origin_main and using extern "Rust")? It seems like that should be able to achieve the same syntax, and wouldn't require significant changes to origin's other users.

@sunfishcode
Owner

sunfishcode commented Sep 6, 2023

I haven't tested it, but I think this might be a way to do what you're looking to do here, within the current API:

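use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, ItemFn};

#[proc_macro_attribute]
pub fn main(_attr: TokenStream, input: TokenStream) -> TokenStream {
    // Parse the annotated function and remember its name.
    let main_fn: ItemFn = parse_macro_input!(input);
    let main_fn_ident = &main_fn.sig.ident;

    quote! {
        // Re-emit the user's function unchanged.
        #main_fn

        // Define the symbol that origin's startup code calls, and forward
        // to the user's function, ignoring the arguments.
        #[no_mangle]
        unsafe fn origin_main(argc: usize, argv: *mut *mut u8, envp: *mut *mut u8) -> i32 {
            #main_fn_ident()
        }
    }
    .into()
}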
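
To make the effect of that macro concrete, here is a hand-written sketch of roughly what it would expand to for one user function. It is only an illustration of the suggestion above and has not been checked against origin itself. The name `my_main` is hypothetical, and `#[no_mangle]` is left as a comment so that the snippet compiles as ordinary hosted Rust.

```rust
// The user's function, as it would be written under the attribute macro.
// `my_main` is a hypothetical name; the macro forwards to whatever
// identifier the annotated function has.
fn my_main() -> i32 {
    0
}

// What the macro would emit next to it. In a real no_std crate this would
// carry `#[no_mangle]` so that origin's startup code can find the symbol
// by name at link time.
#[allow(unused_variables)]
unsafe fn origin_main(argc: usize, argv: *mut *mut u8, envp: *mut *mut u8) -> i32 {
    // The arguments are ignored because the user's function takes none.
    my_main()
}
```

The user-facing syntax stays a plain annotated function, while the link-time contract with origin (a symbol named `origin_main`) is hidden inside the generated wrapper.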

@morr0ne
Contributor Author

morr0ne commented Sep 10, 2023

This could technically work, but my intent was actually to avoid the unnecessary extern altogether. I don't think it makes sense for origin to link to some arbitrary symbols. That should be the job of a C-compatibility layer. Ideally origin would just provide the means to do so. The macro is meant for those rare use cases where only origin is wanted, and instead of relying on linking to other symbols it uses the actual main function directly. For a library that is included directly instead of linked against, I am not too fond of having "random" linking requirements. After all, origin is still young; there is no reason to define some arbitrary conventions when that can be left to a layer like mustang.

@sunfishcode
Owner

sunfishcode commented Sep 10, 2023

By coincidence, I just encountered a technical reason why this macro approach would need to be significantly more complex. It turns out that the PIE relocation code had a bug (I called it!); it wasn't running early enough. It was relying on the linker doing ELF relaxation optimizations to eliminate GOT calls. But recent Rust nightly versions happened to disable ELF relaxation optimizations, causing calls that cross crate boundaries to go through the GOT. GOT entries in PIEs need to be relocated, so if we were to expand _start in the user crate, we'd need to expand the relocate function in the user crate too, because that needs to run before any calls to other crates can happen.

We could put all of relocate in the proc macro too, but that's a lot of code, and it makes one of the major downsides of proc macros even bigger: proc macros are hard to debug. Debuggers and backtraces don't know where you are inside of a proc macro. Editors and rustfmt don't provide as much assistance inside a quote! block. It's less convenient to insert print statements, because it requires publicly exporting the dependencies needed for printing.
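
For readers unfamiliar with what "relocating" GOT entries involves, the following is a toy model of applying relative relocations, assuming x86-64 and the standard `Elf64_Rela` layout. It is not origin's relocate function: the `image` slice standing in for loaded memory and the function signature are assumptions made purely for illustration.

```rust
/// Simplified 64-bit ELF relocation entry (same three fields as `Elf64_Rela`).
#[derive(Clone, Copy)]
struct Rela {
    offset: u64, // byte offset of the slot to patch, relative to the load base
    info: u64,   // relocation type lives in the low 32 bits
    addend: i64, // link-time address that must be rebased
}

/// Relocation type meaning "store base + addend" on x86-64.
const R_X86_64_RELATIVE: u64 = 8;

/// Apply relative relocations to `image`, a stand-in for the loaded program
/// viewed as 8-byte words. `base` is the address the image was loaded at.
fn relocate(image: &mut [u64], base: u64, relocs: &[Rela]) {
    for r in relocs {
        if r.info & 0xffff_ffff != R_X86_64_RELATIVE {
            continue; // other relocation types are out of scope for this sketch
        }
        let slot = (r.offset / 8) as usize;
        image[slot] = base.wrapping_add(r.addend as u64);
    }
}
```

Until a loop like this has run, any slot it would patch (including GOT entries) still holds a link-time address, so a call that goes through such a slot jumps to the wrong place. That is why the real code has to run before the first cross-crate call.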

@morr0ne
Contributor Author

morr0ne commented Sep 11, 2023

> By coincidence, I just encountered a technical reason why this macro approach would need to be significantly more complex. It turns out that the PIE relocation code had a bug (I called it!); it wasn't running early enough. It was relying on the linker doing ELF relaxation optimizations to eliminate GOT calls. But recent Rust nightly versions happened to disable ELF relaxation optimizations, causing calls that cross crate boundaries to go through the GOT. GOT entries in PIEs need to be relocated, so if we were to expand _start in the user crate, we'd need to expand the relocate function in the user crate too, because that needs to run before any calls to other crates can happen.

Can't say I'm surprised. We definitely need to find a proper way to implement relocation that doesn't feel hacky, though I am not sure how it would fit with the proc_macro approach.

> We could put all of relocate in the proc macro too, but that's a lot of code, and it makes one of the major downsides of proc macros even bigger: proc macros are hard to debug. Debuggers and backtraces don't know where you are inside of a proc macro. Editors and rustfmt don't provide as much assistance inside a quote! block. It's less convenient to insert print statements, because it requires publicly exporting the dependencies needed for printing.

That doesn't feel right. The relocation code is very big, and my goal was to keep as much code out of the macro as possible, precisely for this reason. As I said before, I just want the macro to glue together some existing code instead of containing that code itself.

I am gonna go ahead and explain my use case, so perhaps we can come to a better conclusion together.

Right now I have a bunch of no_std executables that target a generic Linux platform without libc. There is no dynamic linking, so everything gets compiled to a completely static executable. Most of these programs are very basic and don't even require allocations. Each one of them contains something that looks like this:

fn main() -> i32 {
    0
}

#[naked]
#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    use core::arch::asm;

    fn entry() -> ! {
        rustix::runtime::exit_group(main())
    }


    asm!(
        "mov rdi, rsp",
        "push rbp",
        "jmp {entry}",
        entry = sym entry,
        options(noreturn),
    );
}

That is a simplified version of the code that I adapted from origin itself. Of course, I don't want to write this for every file; I'd rather write and maintain just one version. Ideally I'd love to just write something that looks like this:

#[main]
fn main() -> i32 {
   0
}

There is obviously significant overlap with origin, and it wouldn't make sense to write another crate that does just a subset of what origin does.

I only need the startup code, which is very small compared to everything else origin does, but I admit I am not confident enough about this low-level stuff to maintain such code myself, especially when great libraries like origin already exist. My hope is to find what I'm trying to achieve in origin itself. Maybe the conclusion from all of this is that origin does too much, and stuff like signal handling and maybe even relocation should live in sister crates appropriately named something like origin-signal.

@sunfishcode
Owner

> That is a simplified version of the code that I adapted from origin itself. Of course I don't want to write this for every file but just write and maintain one version. Ideally I'd love to just write something that looks like this:
>
> #[main]
> fn main() -> i32 {
>    0
> }

If you want to write code that looks like that, consider using my example above.

> I do only need the startup code which is very small compared to everything else origin does but I admit I am not confident enough about this low level stuff to maintain such code myself, especially when great libraries like origin already exist. My hope is to find what I'm trying to achieve in origin itself. Maybe the conclusion from all of this is that origin does too much and stuff like signal handling and maybe even relocation should live in sister crates appropriately named something like origin-signal

The signal code and relocation code are already behind cargo features; if you don't enable the origin-signal or experimental-relocate features, respectively, that code isn't compiled in.

I added the example-crates/tiny example to show how origin can be used to produce very small executables. A big part of how it achieves that is by disabling origin's various optional features, which reduces the amount of code in origin down to almost nothing.

@sunfishcode
Owner

I'm open to ideas here; however, the constraints of the relocate code mean we don't have much flexibility. I think the current system works pretty well in practice, and it can support the end-user syntax you're discussing. So if you have further thoughts, feel free to reopen this or file new issues.

2 participants