rust-lang/google-summer-of-code

Rust project ideas

This page contains a list of ideas for various projects that could help improve the Rust Project and potentially also the wider Rust community.

These project ideas can be used as inspiration for various OSS contribution programs, such as Google Summer of Code or OSPP.

This document contains ideas that are still relevant and have not yet been completed. Here you can also find an archive of older projects from past GSoC events:

  • Past Google Summer of Code projects

We invite contributors who would like to participate in programs such as GSoC, or who simply want to find a Rust project to work on, to examine the project list and use it as inspiration. Another source of inspiration can be the Rust Project Goals, particularly the orphaned goals.

If you would like to participate in GSoC, please read this. If you would like to discuss project ideas or anything related to them, you can do so on our Zulip.

We use the GSoC project size parameters to estimate the time required for each project idea. Each project size corresponds to the following expected number of hours:

  • Small: 90 hours
  • Medium: 175 hours
  • Large: 350 hours

Project ideas

The list of ideas is divided into several categories.

Rust Compiler

C codegen backend for rustc

Description

rustc currently has three in-tree codegen backends: LLVM (the default), Cranelift, and GCC. These live at https://github.com/rust-lang/rust/tree/master/compiler, as rustc_codegen_* crates.

The goal of this project is to add a new experimental rustc_codegen_c backend that could turn Rust's internal representations into C code (i.e. transpile) and optionally invoke a C compiler to build it. This will allow Rust to benefit from existing C compilers (better platform support, optimizations) in situations where the existing backends cannot be used.

Expected result

The minimum viable product is to turn rustc data structures that represent a Rust program into C code, and write the output to the location specified by --out-dir. This involves figuring out how to produce buildable C code from the inputs provided by rustc_codegen_ssa::traits::CodegenBackend.

A second step is to have rustc invoke a C compiler on these produced files. This should be designed in a pluggable way, such that any C compiler can be dropped in.
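One way to picture the pluggable design is a small description of a compiler driver that builds the command line without hard-coding a specific compiler. The following is only a sketch: the `CCompiler` type and its fields are hypothetical and not part of rustc, and it assumes a GCC/Clang-style driver that accepts `-c` and `-o` (other compilers, such as MSVC, would need a different mapping).

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical description of a C compiler driver; not part of rustc.
pub struct CCompiler {
    /// Executable name or path, e.g. "cc", "gcc", or "clang".
    pub program: String,
    /// Extra flags placed before the input and output arguments.
    pub extra_args: Vec<String>,
}

impl CCompiler {
    /// Builds (but does not run) the command that compiles `source` into an
    /// object file inside `out_dir`, and returns the expected object path.
    /// Assumes a GCC/Clang-style command line (`-c <input> -o <output>`).
    pub fn object_command(&self, source: &Path, out_dir: &Path) -> (Command, PathBuf) {
        // Derive "<stem>.o" from the source file name.
        let mut name = source.file_stem().unwrap_or_default().to_os_string();
        name.push(".o");
        let object = out_dir.join(name);

        let mut cmd = Command::new(&self.program);
        cmd.args(&self.extra_args)
            .arg("-c")
            .arg(source)
            .arg("-o")
            .arg(&object);
        (cmd, object)
    }
}
```

Swapping compilers then only means constructing a different `CCompiler` value; the caller decides when to actually run the returned `Command`.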

Desirable skills

Knowledge of Rust and C, basic familiarity with compiler functionality.

Project size

Large.

Difficulty

Hard.

Mentor

Zulip streams

Extend annotate-snippets with features required by rustc

Description

rustc currently has incomplete support for using annotate-snippets to emit errors, but it doesn't support all the features that rustc's built-in diagnostic rendering does. The goal of this project is to execute the rustc test suite using annotate-snippets, identify missing features or bugs, fix those, and repeat until feature parity is reached.

Expected result

More of the rustc test suite passes with annotate-snippets.

Desirable skills

Knowledge of Rust.

Project size

Medium.

Difficulty

Medium or hard.

Mentor

Zulip streams

Reproducible builds

Description

Recent OSS attacks such as the XZ backdoor have shown the importance of having reproducible builds.

Currently, the Rust toolchain distributed to Rust developers is not very reproducible. Our source code archives should be reproducible as of this pull request; however, making the actual binary artifacts reproducible is a much more difficult effort.

The goal of this project is to investigate what exactly makes Rust builds not reproducible, and try to resolve as many such issues as possible.

While the main motivation is to make the Rust toolchain (compiler, standard library, etc.) releases reproducible, any improvements on this front should benefit the reproducibility of all Rust programs.
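A first step when investigating a non-reproducible build is often to build the same input twice and locate where the two outputs diverge. The helper below is a minimal sketch of that comparison; the function names are made up for illustration, and real investigations typically need format-aware tooling on top of a raw byte comparison.

```rust
use std::{fs, io, path::Path};

/// Returns the offset of the first byte at which two artifacts differ,
/// or `None` if they are byte-for-byte identical.
pub fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b).position(|(x, y)| x != y) {
        Some(offset) => Some(offset),
        // One artifact is a strict prefix of the other.
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

/// Reads two build outputs from disk and compares them.
pub fn compare_artifacts(first: &Path, second: &Path) -> io::Result<Option<usize>> {
    Ok(first_difference(&fs::read(first)?, &fs::read(second)?))
}
```

A result of `None` for every artifact of two independent builds is what a reproducible build would look like under this simple check.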

Expected result

Rust builds are more reproducible. Ideally, the Rust toolchain can be compiled in a reproducible manner.

Desirable skills

Knowledge of Rust and ideally also build systems.

Project size

Medium.

Difficulty

Large.

Mentor

Related links

Bootstrap of rustc with rustc_codegen_gcc

Description

rustc_codegen_gcc used to be able to compile rustc and use the resulting compiler to successfully compile a Hello, World! program. While it can still compile a stage 2 rustc, the resulting compiler cannot compile the standard library anymore.

The goal of this project would be to fix any issue in rustc_codegen_gcc that prevents the resulting compiler from compiling a Hello, World! program and the standard library. Those issues are not known, so the participant would need to attempt a bootstrap and investigate the issues that arise.

If time allows, an optional additional goal could be to be able to do a full bootstrap of rustc with rustc_codegen_gcc, meaning fixing even more issues to achieve this result.

Expected result

A rustc_codegen_gcc that can compile a stage 2 rustc where the resulting compiler can compile a Hello, World! program using the standard library (also compiled by that resulting compiler).

An optional additional goal would be: a rustc_codegen_gcc that can do a full bootstrap of the Rust compiler. This means getting a stage 3 rustc that is identical to stage 2.

Desirable skills

Good debugging ability. Basic knowledge of:

Project size

Medium-Large depending on the chosen scope.

Difficulty

Hard.

Mentor

Zulip streams

Refactoring of rustc_codegen_ssa to make it more convenient for the GCC codegen

Description

rustc_codegen_gcc uses rustc_codegen_ssa and implements the traits in this crate in order to have a codegen that plugs into rustc seamlessly. Since rustc_codegen_ssa was created based on rustc_codegen_llvm, the two are somewhat similar, which sometimes makes things awkward for the GCC codegen. Indeed, some hacks were needed to be able to implement the GCC codegen with this API:

  • Usage of unsafe transmute: for instance, this or this. Fixing this might require separating Value into RValue and LValue or using Function in place of Value in some places to better fit the GCC API.
  • Usage of mappings to work around the API: for instance, this or this.

Some other improvement ideas include:

  • Separate the aggregate operations (structs, arrays): methods like extract_value are generic over structures and arrays because it's the same operation in LLVM, but they are different operations in GCC, so it might make sense to have multiple methods like extract_field and extract_array_element.
  • Remove duplications between rustc_codegen_gcc and rustc_codegen_llvm by moving more stuff into rustc_codegen_ssa. For instance:

The goal of this project is to improve rustc_codegen_gcc by removing hacks, unnecessary unsafe code and/or code duplication with rustc_codegen_llvm by refactoring rustc_codegen_ssa. It would be important that this refactoring does not result in a performance degradation for rustc_codegen_llvm.
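To make the idea of splitting extract_value more concrete, here is a toy sketch. The `AggregateBuilder` trait and `ToyCBuilder` type are hypothetical and only borrow the method names suggested above; they are not the current rustc_codegen_ssa API, and the "backend" simply renders C-like expressions as strings.

```rust
/// Hypothetical builder trait with separate operations for structs and arrays.
/// This is NOT the real rustc_codegen_ssa interface.
pub trait AggregateBuilder {
    type Value;

    /// Reads field number `field` out of a struct-like aggregate.
    fn extract_field(&mut self, aggregate: &Self::Value, field: usize) -> Self::Value;

    /// Reads element `index` out of an array.
    fn extract_array_element(&mut self, array: &Self::Value, index: usize) -> Self::Value;
}

/// Toy backend that renders values as C-like expression strings.
pub struct ToyCBuilder;

impl AggregateBuilder for ToyCBuilder {
    type Value = String;

    fn extract_field(&mut self, aggregate: &String, field: usize) -> String {
        // A backend that distinguishes the two cases can emit a field access here...
        format!("{aggregate}.field{field}")
    }

    fn extract_array_element(&mut self, array: &String, index: usize) -> String {
        // ...and an indexing expression here.
        format!("{array}[{index}]")
    }
}
```

A backend for which both operations are identical could implement the two methods by forwarding to one shared helper, so the split would not force duplication on it.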

Expected result

A rustc_codegen_gcc that contains fewer hacks and less unsafe code and/or code duplication with rustc_codegen_llvm.

Desirable skills

Knowledge of Rust and basic knowledge of rustc internals, especially the codegen part.

Project size

Small-Medium depending on the chosen scope.

Difficulty

Medium.

Mentor

Zulip streams

Infrastructure

Implement merge functionality in bors

Description

Various Rust repositories under the rust-lang organization use a merge queue bot (bors) for testing and merging pull requests. Currently, we use a legacy implementation called homu, which is quite buggy and very difficult to maintain, so we would like to get rid of it. We have started the implementation of a new bot called simply bors, which should eventually become the primary method for merging pull requests in the rust-lang/rust repository.

The bors bot is a GitHub app that responds to user commands and performs various operations on a GitHub repository. Primarily, it creates merge commits and reports test workflow results for them. It can currently perform so-called "try builds", which can be started manually by users on a given PR to check if a subset of CI passed on the PR. However, the most important functionality, actually merging pull requests into the main branch, has not been implemented yet.
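As a rough illustration of "responding to user commands", the sketch below scans a pull request comment for a bot command. It is a simplified assumption of how such parsing could look, not the actual bors implementation: only the `try` and `r+` commands are recognized, and the `BotCommand` enum and `parse_command` function are hypothetical names.

```rust
/// Hypothetical subset of commands a merge bot might understand.
#[derive(Debug, PartialEq, Eq)]
pub enum BotCommand {
    /// Start a try build for the pull request.
    Try,
    /// Approve the pull request so it can be queued for merging.
    Approve,
}

/// Returns the first recognized `@bors <command>` found at the start of a line.
pub fn parse_command(comment: &str) -> Option<BotCommand> {
    comment.lines().find_map(|line| {
        let mut words = line.split_whitespace();
        // The line must begin with the bot mention as its own word.
        if words.next()? != "@bors" {
            return None;
        }
        match words.next()? {
            "try" => Some(BotCommand::Try),
            "r+" => Some(BotCommand::Approve),
            _ => None,
        }
    })
}
```

The merge functionality described above would sit behind a command like the approval one: once a pull request is approved, the bot has to create the merge commit, wait for CI, and update the main branch.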

Expected result

bors can be used to perform pull request merges, including "rollups". In an ideal case, bors will be already usable on the rust-lang/rust repository.

Desirable skills

Intermediate knowledge of Rust. Familiarity with GitHub APIs is a bonus.

Project size

Medium.

Difficulty

Medium.

Mentors

Zulip streams

Improve bootstrap

Description

The Rust compiler is bootstrapped using a complex set of scripts and programs, generally called just bootstrap. This tooling is constantly changing, and it has accrued a lot of technical debt. It could be improved in many areas, for example:

  • Design a new testing infrastructure and write more tests.
  • Write documentation.
  • Remove unnecessary hacks.

Expected result

The bootstrap tooling will have less technical debt, more tests, and better documentation.

Desirable skills

Intermediate knowledge of Rust. Knowledge of the Rust compiler bootstrap process is welcome, but not required.

Project size

Medium or large.

Difficulty

Medium.

Mentor

Zulip streams

Cargo

Prototype an alternative architecture for cargo fix

Description

Some compiler errors know how to fix the problem, and cargo fix is the command for applying those fixes. Currently, cargo fix calls into the APIs that implement cargo check within cargo in a way that allows getting the JSON messages from rustc and applying them to workspace members. To avoid problems with conflicting or redundant fixes, cargo fix runs rustc for workspace members in serial. As one fix might lead to another, cargo fix runs rustc for each workspace member in a loop until a fixed point is reached. This can be very slow for large workspaces.

We want to explore an alternative architecture where cargo fix runs the cargo check command in a loop, processing the json messages, until a fixed point is reached.

Benefits

  • Always runs in parallel
  • May make it easier to extend the behavior, like with an interactive mode

Downsides

  • Might have issues with files owned by multiple packages or even multiple build targets

This can leverage existing CLI and crate APIs of Cargo and can be developed as a third-party command.

See cargo#13214 for more details.
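The "loop until a fixed point is reached" part of the proposed architecture can be sketched independently of Cargo. In the sketch below, `check_and_fix` is a hypothetical callback standing in for one pass that runs `cargo check --message-format=json`, applies the suggested fixes, and returns how many were applied; the pass limit is an added safety assumption, not something the proposal specifies.

```rust
/// Repeats `check_and_fix` until a pass applies no fixes or `max_passes`
/// passes have run. Returns the total number of fixes applied.
///
/// `check_and_fix` is a stand-in for "run the check, apply suggestions,
/// report how many were applied"; it is not a Cargo API.
pub fn fix_until_fixed_point<F>(mut check_and_fix: F, max_passes: usize) -> usize
where
    F: FnMut() -> usize,
{
    let mut total = 0;
    for _ in 0..max_passes {
        let applied = check_and_fix();
        if applied == 0 {
            // Nothing changed in this pass: the fixed point is reached.
            break;
        }
        total += applied;
    }
    total
}
```

Because each pass covers the whole workspace at once, parallelism within a pass is left to the check itself, which is where the first benefit listed above comes from.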

Expected result

  • A third-party command as described above
  • A comparison of performance across representative crates
  • An analysis of the behavior with the described corner cases

Desirable skills

Intermediate knowledge of Rust.

Project size

Medium

Difficulty

Medium.

Mentor

Zulip streams

Prototype Cargo plumbing commands

Description

Cargo is a high-level, opinionated command. Instead of trying to directly support every use case, we want to explore exposing the building blocks of the high-level commands as "plumbing" commands that people can use programmatically to compose together to create custom Cargo behavior.

This can be prototyped outside of the Cargo code base, using the Cargo API.

See the Project Goal for more details.

Expected result

Ideal: a performant cargo porcelain check command that calls out to individual cargo plumbing <name> commands to implement its functionality.

Depending on the project size the participant takes on and their experience, this may be out of reach. The priorities are:

  1. A shell of cargo porcelain check
  2. Individual commands until cargo porcelain check is functional
  3. Performance

Desirable skills

Intermediate knowledge of Rust.

Project size

Scalable.

Difficulty

Medium.

Mentor

Zulip streams

Move cargo shell completions to Rust

Description

Cargo maintains Bash and Zsh completions, but they are duplicated and limited in features.

A previous GSoC participant added unstable support for completions in Cargo itself, so we can have a single implementation with per-shell skins (rust-lang/cargo#6645).

There are many more arguments that need custom completers as well as polish in the completion system itself before this can be stabilized.

See

Expected result

Ideal:

  • A report to clap maintainers on the state of the unstable completions and why it's ready for stabilization
  • A report to cargo maintainers on the state of the unstable completions and why it's ready for stabilization

Desirable skills

Intermediate knowledge of Rust. Shell familiarity is a bonus.

Project size

Medium.

Difficulty

Medium.

Mentor

Build script delegation

Description

When developers need to extend how Cargo builds their package, they can write a build script. This gives users quite a bit of flexibility, but build scripts:

  • Allow running arbitrary code on the user's system, requiring extra auditing
  • Need to be compiled and run before the relevant package can be built
  • Are all-or-nothing, requiring users to do extra checks to avoid running expensive logic
  • Run counter to the principles of third-party build tools that try to mimic Cargo

A developer could make their build script a thin wrapper around a library (e.g. shadow-rs), but a build script still exists to be audited (even if it's small), and each individual wrapper build script must be compiled and linked. This is still opaque to third-party build tools.

Leveraging an unstable feature, artifact dependencies, we could allow a developer to say that one or more dependencies should be run as build scripts, passing parameters to them.

This project would add unstable support for build script delegation that can then be evaluated for proposing as an RFC for approval.

See the proposal for more details.

Expected result

Milestones

  1. An unstable feature for multiple build scripts
  2. An unstable feature for passing parameters to build scripts from Cargo.toml, built on the above
  3. An unstable feature for build script delegation, built on the above two

Bonus: preparation work to stabilize a subset of artifact dependencies.

Desirable skills

Intermediate knowledge of Rust, especially experience with writing build scripts.

Project size

Large.

Difficulty

Medium.

Mentor

Rustfmt

Improve rustfmt infrastructure and automation

Description

Rustfmt is the code formatter for Rust code. Currently, to ensure stability, rustfmt uses unit tests that ensure a source file does not get reformatted unexpectedly. Additionally, there is a tool (currently a shell script) called diffcheck that is run to check for potentially unexpected changes across different large codebases. We would like to improve our tooling around that, namely by extending the diffcheck job to include more crates, improving its reporting (with HTML output, like a mini crater, the tool that runs compiler changes against all Rust crates published to crates.io), improving its reliability, and potentially rewriting the job in Rust.

Rustfmt currently has a versioning system that gates unstable changes behind Version=Two, and the diffcheck job may be less reliable at reporting changes to Version=One when changes to unstable formatting are introduced. We'd like to see this story improved to make our test system more robust.

Expected result

A more robust and reliable infrastructure for testing the rustfmt codebase, potentially rewritten in Rust, with HTML output.

Desirable skills

Intermediate knowledge of Rust. Knowledge of CI and automation is welcome.

Project size

Small or medium, depending on the scale proposed.

Difficulty

Small or medium.

Mentor

Zulip streams

Related Links

Crate ecosystem

Modernize the libc crate

Description

The libc crate is one of the oldest crates of the Rust ecosystem, long predating Rust 1.0. Additionally, it is one of the most widely used crates in the ecosystem (#4 most downloaded on crates.io). This combination means that the current version of the libc crate (v0.2) is very conservative with breaking changes and remains backwards-compatible with all Rust compilers since Rust 1.13 (released in 2016).

The language has evolved a lot since Rust 1.13, and we would like to make use of these features in libc. The main one is support for union types to properly expose C unions.
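For readers unfamiliar with the feature, this is what exposing a C union with a Rust `union` looks like. The `IntOrFloat` type is a made-up example and does not correspond to any actual libc definition.

```rust
/// Rust counterpart of a hypothetical C declaration:
///
///     union int_or_float { int32_t i; float f; };
///
/// `#[repr(C)]` gives the union the same layout as the C definition.
#[repr(C)]
#[derive(Clone, Copy)]
pub union IntOrFloat {
    pub i: i32,
    pub f: f32,
}

/// Stores a float in the union and reads the same bytes back as an integer.
pub fn float_bits(value: f32) -> i32 {
    let u = IntOrFloat { f: value };
    // Reading a union field is unsafe: the compiler cannot know which
    // field was written last. Here both fields are plain 4-byte values.
    unsafe { u.i }
}
```

Without union support, a binding has to approximate such a type, for example with an opaque byte array of the right size, which hides the individual fields from users.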

At the same time, there is a backlog of desired breaking changes tracked in this issue. Some of these come from the evolution of the underlying platforms, some come from a desire to use newer language features, while others are simple mistakes that we cannot correct without breaking existing code.

The goal of this project is to prepare and release the next major version of the libc crate.

Expected result

The libc crate is cleaned up and modernized, and released as version 0.3.

Desirable skills

Intermediate knowledge of Rust.

Project size

Medium.

Difficulty

Medium.

Mentor

Zulip streams

Add more lints to cargo-semver-checks

Description

cargo-semver-checks is a linter for semantic versioning. It ensures that Rust crates adhere to semantic versioning by looking for breaking changes in APIs.

It can currently catch ~120 different kinds of breaking changes, meaning there are hundreds of kinds of breaking changes it still cannot catch! The goal of this project is to extend its abilities, so that it can catch and prevent more breaking changes, by:

  • adding more lints, which are expressed as queries over a database-like schema (playground)
  • extending the schema, so more Rust functionality is made available for linting

Expected result

cargo-semver-checks will contain new lints, together with test cases that both ensure the lint triggers when expected and does not trigger in situations where it shouldn't (AKA false-positives).

Desirable skills

Intermediate knowledge of Rust. Familiarity with databases, query engines, or query language design is welcome but not required.

Project size

Medium or large, depends on how many lints will be implemented. The more lints, the better!

Difficulty

Medium to high, depends on the choice of implemented lints or schema extensions.

Mentor

Zulip streams

Related Links

Enable witness generation in cargo-semver-checks

Description

When cargo-semver-checks reports a breaking change, it in principle has seen enough information for the breakage to be reproduced with an example program: a witness program. Witness programs are valuable as they confirm that the suspected breakage did indeed happen, and is not a false-positive.
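As a small illustration of what a witness could look like, the sketch below renders a program for one specific kind of breakage: a public function that was removed. The `witness_for_removed_function` helper is hypothetical and is not how cargo-semver-checks generates witnesses; it only shows the shape of the idea.

```rust
/// Renders the source of a tiny program that refers to a public function.
///
/// The generated program compiles against a crate version that still exports
/// `function_path` and fails to compile once the function has been removed,
/// which is what makes it a witness for that breaking change.
pub fn witness_for_removed_function(crate_name: &str, function_path: &str) -> String {
    format!("fn main() {{\n    let _ = {crate_name}::{function_path};\n}}\n")
}
```

Confirming the breakage then amounts to compiling the generated program twice, once against each version of the crate, and checking that only the build against the newer version fails.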

Expected result

Automatic witness generation is something we've explored, but we've only scratched the surface at implementing it so far. The goal of this project would be to take it the rest of the way: enable cargo-semver-checks to (with the user's opt-in) generate witness programs for each lint, verify that they indeed demonstrate the detected breakage, and inform the user appropriately of the breakage and the manner in which it was confirmed. If a witness program fails to reproduce breakage flagged by one of our lints, we've found a bug — the tool should then prepare a diagnostic info packet and offer to help the user open an auto-populated GitHub issue.

Stretch goal: having implemented witness generation, run another study of SemVer compliance in the Rust ecosystem, similar to the study we completed in 2023. The new study would cover many more kinds of breaking changes, since cargo-semver-checks today has 2.5x more lints than it did back then. It would also reveal any new false-positive issues, crashes, or other regressions that may have snuck into the tool in the intervening years.

Desirable skills

Intermediate knowledge of Rust. Interest in building dev tools, and empathy for user needs so we can design the best possible user experience. Familiarity with databases, query engines, or programming language design is welcome but not required.

Project size

Large

Difficulty

Medium

Mentor

Related Links

Implement a cryptographic algorithm in RustCrypto

Description

The RustCrypto Project maintains pure Rust implementations of hundreds of cryptographic algorithms, organized into repositories by algorithm type, e.g. block ciphers, stream ciphers, hash functions.

Each of these repositories contains a tracking issue identifying specific algorithms which currently lack an implementation, some of which are linked in the "Related Links" section below. Interested students can look through these issues and identify an algorithm which is currently unimplemented which sounds interesting to them, and then implement it as part of this project.

Alternatively, instead of implementing a new algorithm from scratch, a student could potentially choose to implement some significant unit of functionality in an existing algorithm implementation with an open associated issue on our GitHub trackers, an example of which might be implementing hardware acceleration support for our "bignum" library.

Expected result

One or more Rust crates/libraries containing a new implementation of a cryptographic algorithm implemented in pure Rust.

Desirable skills

Intermediate knowledge of Rust.

A background in mathematics, and some prior knowledge of cryptography, is helpful but not required, and we can provide guidance and review to ensure code is correct and securely implemented.

Project size

Will vary depending on the algorithm/project selected, but ideally small.

Note that while the code size of the deliverable may not be significant, due to the nature of cryptographic work it will typically still involve significant effort and iteration to deliver an implementation which is correct and secure.

Difficulty

Will also vary depending on the algorithm/project selected, but expected difficulty is medium/hard, as noted above.

Mentor

Zulip streams

Related Links

Wild linker with test suites from other linkers

Description

The Wild linker is a project to build a very fast linker in Rust that has incremental linking and hot reload capabilities.

It currently works well enough to link itself, the Rust compiler, clang (provided you use the right compiler flags) and a few other things. However, there are various features and combinations of flags that don’t yet work correctly. Furthermore, we have a pretty incomplete picture of what we don’t support.

The proposed project is to run the test suite of other linkers with Wild as the linker being tested, then for each failure, determine what the problem is. It’s expected that many failures will have the same root cause.

Expected result

Write a program, ideally in Rust, that runs the test suite of some other linker. Mold’s test suite is pretty easy to run with Wild, so that’s probably a good default choice. The Rust program should emit a CSV file with one row per test, whether the test passes or fails and if it fails, an attempt to identify the cause based on errors / warnings emitted by Wild.
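The reporting half of such a program could look roughly like the sketch below. The column names, the `TestRow` and `Outcome` types, and the use of "uncategorised" for failures without an identified cause are all assumptions made for illustration; running the other linker's test suite is left out entirely.

```rust
/// Result of running one test with Wild as the linker under test.
pub enum Outcome {
    Pass,
    /// `cause` is the best guess at the failure reason, if one was identified.
    Fail { cause: Option<String> },
}

/// One row of the report.
pub struct TestRow {
    pub name: String,
    pub outcome: Outcome,
}

/// Quotes a CSV field if it contains a comma, quote, or line break.
fn csv_field(value: &str) -> String {
    if value.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Renders the report with one header line and one line per test.
pub fn to_csv(rows: &[TestRow]) -> String {
    let mut out = String::from("test,result,cause\n");
    for row in rows {
        let (result, cause) = match &row.outcome {
            Outcome::Pass => ("pass", ""),
            Outcome::Fail { cause: Some(cause) } => ("fail", cause.as_str()),
            Outcome::Fail { cause: None } => ("fail", "uncategorised"),
        };
        out.push_str(&format!("{},{},{}\n", csv_field(&row.name), result, csv_field(cause)));
    }
    out
}
```

Keeping the cause as free text makes it easy to group rows afterwards and see which root causes account for the most failures.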

For tests where Wild doesn’t currently emit any error or warning that is related to the cause of the test failure, attempt to make it do so. Some of the tests might fail for reasons that are hard to identify. It’s OK to just leave these as uncategorised. Where tests fail due to bugs or differences in behaviour of Wild, automatic classification likely isn’t practical. A one-off classification of these would be beneficial.

If time permits, pick something achievable that seems like an important feature / bug to support / fix and implement / fix it.

Desirable skills

Knowledge of Rust. Any existing knowledge of low-level details like assembly or the ELF binary format is useful, but can potentially be learned as we go.

Project size

Small to large depending on chosen scope.

Difficulty

Some of the work is medium. Diagnosing and / or fixing failures is often pretty hard.

Mentor

Further resources
