This page contains a list of ideas for various projects that could help improve the Rust Project and potentially also the wider Rust community.
These project ideas can be used as inspiration for various OSS contribution programs, such as Google Summer of Code or OSPP.
This document contains ideas that are still relevant and have not yet been completed. Here you can also find an archive of older projects from past GSoC events:
- Past Google Summer of Code projects
We invite contributors who would like to participate in programs such as GSoC, or who simply want to find a Rust project to work on, to examine the project list and use it as inspiration. Another source of inspiration can be the Rust Project Goals, particularly the orphaned goals.
If you would like to participate in GSoC, please read this. If you would like to discuss project ideas or anything related to them, you can do so on our Zulip.
We use the GSoC project size parameters to estimate the expected time commitment of the project ideas. The individual project sizes correspond to the following expected numbers of hours:
- Small: 90 hours
- Medium: 175 hours
- Large: 350 hours
The list of ideas is divided into several categories:
- Rust Compiler
- Infrastructure
- Cargo
- Rustfmt
- Crate ecosystem
Description
rustc currently has three in-tree codegen backends: LLVM (the default), Cranelift, and GCC.
These live at https://github.com/rust-lang/rust/tree/master/compiler, as rustc_codegen_* crates.
The goal of this project is to add a new experimental rustc_codegen_c backend that could turn Rust's internal representations into C code (i.e. transpile) and optionally invoke a C compiler to build it. This will allow Rust to benefit from existing C compilers (better platform support, optimizations) in situations where the existing backends cannot be used.
Expected result
The minimum viable product is to turn rustc data structures that represent a Rust program into C code, and write the output to the location specified by --out-dir. This involves figuring out how to produce buildable C code from the inputs provided by rustc_codegen_ssa::traits::CodegenBackend.
A second step is to have rustc invoke a C compiler on these produced files. This should be designed in a pluggable way, such that any C compiler can be dropped in.
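As a rough illustration of what "turning an internal representation into C code" means, the sketch below lowers a tiny, made-up expression IR to C source text. The `Expr` type and the `emit_c_function` helper are hypothetical and far simpler than anything rustc_codegen_ssa actually hands to a backend; they only show the general shape of a transpiling step.

```rust
/// A toy expression IR. Purely illustrative; not a rustc data structure.
enum Expr {
    Const(i32),
    Param(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Render one expression as a fully parenthesized C expression.
fn emit_expr(expr: &Expr) -> String {
    match expr {
        Expr::Const(value) => value.to_string(),
        Expr::Param(index) => format!("p{index}"),
        Expr::Add(lhs, rhs) => format!("({} + {})", emit_expr(lhs), emit_expr(rhs)),
        Expr::Mul(lhs, rhs) => format!("({} * {})", emit_expr(lhs), emit_expr(rhs)),
    }
}

/// Emit a C function with `arity` `int32_t` parameters that returns `body`.
fn emit_c_function(name: &str, arity: usize, body: &Expr) -> String {
    let params = if arity == 0 {
        "void".to_string()
    } else {
        (0..arity)
            .map(|i| format!("int32_t p{i}"))
            .collect::<Vec<_>>()
            .join(", ")
    };
    format!(
        "#include <stdint.h>\n\nint32_t {name}({params}) {{\n    return {};\n}}\n",
        emit_expr(body)
    )
}
```

A real backend would additionally have to preserve Rust semantics that plain C does not give for free; for example, signed overflow on `int32_t` is undefined behavior in C.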
Desirable skills
Knowledge of Rust and C, basic familiarity with compiler functionality.
Project size
Large.
Difficulty
Hard.
Mentor
Zulip streams
Description
rustc currently has incomplete support for using annotate-snippets to emit errors, but it doesn't support all the features that rustc's built-in diagnostic rendering does. The goal of this project is to execute the rustc test suite using annotate-snippets, identify missing features or bugs, fix those, and repeat until feature parity is reached.
Expected result
More of the rustc test suite passes with annotate-snippets.
Desirable skills
Knowledge of Rust.
Project size
Medium.
Difficulty
Medium or hard.
Mentor
Zulip streams
Description
Recent OSS attacks such as the XZ backdoor have shown the importance of having reproducible builds.
Currently, the Rust toolchain distributed to Rust developers is not very reproducible. Our source code archives should be reproducible as of this pull request; however, making the actual binary artifacts reproducible is a much more difficult effort.
The goal of this project is to investigate what exactly makes Rust builds not reproducible, and try to resolve as many such issues as possible.
While the main motivation is to make the Rust toolchain (compiler, standard library, etc.) releases reproducible, any improvements on this front should benefit the reproducibility of all Rust programs.
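One possible starting point for such an investigation is to build the same artifact twice and locate where the two outputs diverge. The sketch below is only an assumption about how that comparison could look; `first_difference` and `compare_artifacts` are hypothetical helper names, not part of any existing Rust tooling.

```rust
use std::{fs, io, path::Path};

/// Return the offset of the first byte at which two artifacts differ,
/// or `None` if they are byte-for-byte identical.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b).position(|(x, y)| x != y) {
        Some(offset) => Some(offset),
        // One artifact is a strict prefix of the other.
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

/// Compare two build outputs on disk (for example, the same library
/// produced by two independent builds of the same source).
fn compare_artifacts(first: &Path, second: &Path) -> io::Result<Option<usize>> {
    Ok(first_difference(&fs::read(first)?, &fs::read(second)?))
}
```

The reported offset can then be mapped back to a section of the artifact to find out which kind of data (paths, timestamps, ordering) caused the difference.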
Expected result
Rust builds are more reproducible; ideally, the Rust toolchain can be compiled in a reproducible manner.
Desirable skills
Knowledge of Rust and ideally also build systems.
Project size
Medium.
Difficulty
Large.
Mentor
Related links
Description
rustc_codegen_gcc used to be able to compile rustc and use the resulting compiler to successfully compile a Hello, World! program.
While it can still compile a stage 2 rustc, the resulting compiler cannot compile the standard library anymore.
The goal of this project would be to fix any issue in rustc_codegen_gcc that prevents the resulting compiler from compiling a Hello, World! program and the standard library.
Those issues are not known, so the participant would need to attempt a bootstrap and investigate the issues that arise.
If time allows, an optional additional goal could be to do a full bootstrap of rustc with rustc_codegen_gcc, which means fixing even more issues to achieve this result.
Expected result
A rustc_codegen_gcc that can compile a stage 2 rustc where the resulting compiler can compile a Hello, World! program using the standard library (also compiled by that resulting compiler).
An optional additional goal would be: a rustc_codegen_gcc that can do a full bootstrap of the Rust compiler. This means getting a stage 3 rustc that is identical to stage 2.
Desirable skills
Good debugging ability. Basic knowledge of:
- Intel x86-64 assembly (for debugging purposes).
- rustc internals, especially the codegen part.
- libgccjit and GCC internals.
Project size
Medium-Large depending on the chosen scope.
Difficulty
Hard.
Mentor
Zulip streams
- Idea discussion
- rustc_codegen_gcc
Description
rustc_codegen_gcc uses rustc_codegen_ssa and implements the traits in this crate in order to have a codegen that plugs into rustc seamlessly.
Since rustc_codegen_ssa was created based on rustc_codegen_llvm, they are somewhat similar, which sometimes makes it awkward for the GCC codegen.
Indeed, some hacks were needed to be able to implement the GCC codegen with this API:
- Usage of unsafe transmute: for instance, this or this. Fixing this might require separating Value into RValue and LValue or using Function in place of Value in some places to better fit the GCC API.
- Usage of mappings to work around the API: for instance, this or this.
Some other improvement ideas include:
- Separate the aggregate operations (structs, arrays): methods like extract_value are generic over structures and arrays because it's the same operation in LLVM, but these are different operations in GCC, so it might make sense to have multiple methods like extract_field and extract_array_element.
- Remove duplications between rustc_codegen_gcc and rustc_codegen_llvm by moving more stuff into rustc_codegen_ssa. For instance:
  - some debuginfo code is exactly the same
  - ABI code
  - the allocator code
  - the dummy output type for inline assembly
  - perhaps we could add a set_alignment method in rustc_codegen_ssa that asks the backend to set the alignment and is called in rustc_codegen_ssa in strategic places so that we don't have to worry as much about alignment in the codegens (not sure if this is possible).
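To make the aggregate-operations idea more concrete, here is a hypothetical sketch of a builder trait with separate accessors for struct fields and array elements. The trait `AggregateBuilder`, the toy `TextBackend`, and all signatures are invented for illustration; they do not mirror the real rustc_codegen_ssa builder traits.

```rust
/// Hypothetical builder trait with distinct struct and array accessors,
/// instead of one `extract_value`-style method covering both.
trait AggregateBuilder {
    type Value;

    /// Read field number `field` out of a struct value.
    fn extract_field(&mut self, agg: &Self::Value, field: usize) -> Self::Value;

    /// Read element number `index` out of an array value.
    fn extract_array_element(&mut self, agg: &Self::Value, index: usize) -> Self::Value;
}

/// Toy backend that records operations as text instead of emitting real IR.
struct TextBackend {
    ops: Vec<String>,
}

impl AggregateBuilder for TextBackend {
    type Value = String;

    fn extract_field(&mut self, agg: &String, field: usize) -> String {
        let value = format!("{agg}.field{field}");
        self.ops.push(value.clone());
        value
    }

    fn extract_array_element(&mut self, agg: &String, index: usize) -> String {
        let value = format!("{agg}[{index}]");
        self.ops.push(value.clone());
        value
    }
}
```

With a split like this, a backend where both operations coincide could forward both methods to one shared helper, while a backend that distinguishes them would no longer need to inspect the aggregate's type at each call.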
The goal of this project is to improve rustc_codegen_gcc by removing hacks, unnecessary unsafe code and/or code duplication with rustc_codegen_llvm by refactoring rustc_codegen_ssa.
It would be important that this refactoring does not result in a performance degradation for rustc_codegen_llvm.
Expected result
A rustc_codegen_gcc that contains fewer hacks, less unsafe code and/or less code duplication with rustc_codegen_llvm.
Desirable skills
Knowledge of Rust and basic knowledge of rustc internals, especially the codegen part.
Project size
Small-Medium depending on the chosen scope.
Difficulty
Medium.
Mentor
Zulip streams
- Idea discussion
- rustc_codegen_gcc
Description
Various Rust repositories under the rust-lang organization use a merge queue bot (bors) for testing and merging pull requests. Currently, we use a legacy implementation called homu, which is quite buggy and very difficult to maintain, so we would like to get rid of it. We have started the implementation of a new bot called simply bors, which should eventually become the primary method for merging pull requests in the rust-lang/rust repository.
The bors bot is a GitHub app that responds to user commands and performs various operations on a GitHub repository. Primarily, it creates merge commits and reports test workflow results for them. It can currently perform so-called "try builds", which can be started manually by users on a given PR to check if a subset of CI passed on the PR. However, the most important functionality, actually merging pull requests into the main branch, has not been implemented yet.
Expected result
bors can be used to perform pull request merges, including "rollups". In an ideal case, bors will already be usable on the rust-lang/rust repository.
Desirable skills
Intermediate knowledge of Rust. Familiarity with GitHub APIs is a bonus.
Project size
Medium.
Difficulty
Medium.
Mentors
Zulip streams
Description
The Rust compiler is bootstrapped using a complex set of scripts and programs generally called just bootstrap.
This tooling is constantly changing, and it has accrued a lot of technical debt. It could be improved in many areas, for example:
- Design a new testing infrastructure and write more tests.
- Write documentation.
- Remove unnecessary hacks.
Expected result
The bootstrap tooling will have less technical debt, more tests, and better documentation.
Desirable skills
Intermediate knowledge of Rust. Knowledge of the Rust compiler bootstrap process is welcome, but not required.
Project size
Medium or large.
Difficulty
Medium.
Mentor
Zulip streams
Description
Some compiler errors know how to fix the problem, and cargo fix is the command for applying those fixes.
Currently, cargo fix calls into the APIs that implement cargo check with cargo in a way that allows getting the JSON messages from rustc and applying them to workspace members.
To avoid problems with conflicting or redundant fixes, cargo fix runs rustc for workspace members in serial.
As one fix might lead to another, cargo fix runs rustc for each workspace member in a loop until a fixed point is reached.
This can be very slow for large workspaces.
We want to explore an alternative architecture where cargo fix runs the cargo check command in a loop, processing the JSON messages, until a fixed point is reached.
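The fixed-point loop described above can be sketched in isolation. In the sketch below, the `check` closure stands in for one run of cargo check plus parsing of its JSON messages, and `Suggestion` is a deliberately simplified, hypothetical stand-in for a compiler suggestion; neither reflects Cargo's actual internal APIs.

```rust
/// One machine-applicable edit: replace the first occurrence of `from` with `to`.
/// Real suggestions carry file names and byte spans; this is a simplification.
struct Suggestion {
    from: String,
    to: String,
}

/// Run `check` repeatedly, applying its suggestions, until it reports nothing
/// (a fixed point) or `max_rounds` is hit. Returns the number of rounds in
/// which at least one suggestion was applied.
fn fix_until_fixed_point<C>(source: &mut String, mut check: C, max_rounds: usize) -> usize
where
    C: FnMut(&str) -> Vec<Suggestion>,
{
    let mut rounds = 0;
    while rounds < max_rounds {
        let suggestions = check(source.as_str());
        if suggestions.is_empty() {
            break; // Nothing left to fix: fixed point reached.
        }
        for s in &suggestions {
            *source = source.replacen(&s.from, &s.to, 1);
        }
        rounds += 1;
    }
    rounds
}
```

The `max_rounds` bound is a design choice of this sketch: it guards against two suggestions that keep undoing each other, which is one of the failure modes a real implementation would need to consider.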
Benefits
- Always runs in parallel
- May make it easier to extend the behavior, like with an interactive mode
Downsides
- Might have issues with files owned by multiple packages or even multiple build targets
This can leverage existing CLI and crate APIs of Cargo and can be developed as a third-party command.
See cargo#13214 for more details.
Expected result
- A third-party command as described above
- A comparison of performance across representative crates
- An analysis of the behavior with the described corner cases
Desirable skills
Intermediate knowledge of Rust.
Project size
Medium
Difficulty
Medium.
Mentor
Zulip streams
- Idea discussion
- Cargo team
Description
Cargo is a high-level, opinionated command. Instead of trying to directly support every use case, we want to explore exposing the building blocks of the high-level commands as "plumbing" commands that people can use programmatically to compose together to create custom Cargo behavior.
This can be prototyped outside of the Cargo code base, using the Cargo API.
See the Project Goal for more details.
Expected result
Ideal: a performant cargo porcelain check command that calls out to individual cargo plumbing <name> commands to implement its functionality.
Depending on the size the participant takes on and their experience, this may be out of reach. The priorities are:
- A shell of cargo porcelain check
- Individual commands until cargo porcelain check is functional
- Performance
Desirable skills
Intermediate knowledge of Rust.
Project size
Scalable
Difficulty
Medium.
Mentor
Zulip streams
- Idea discussion
- Cargo team
Description
Cargo maintains Bash and Zsh completions, but they are duplicated and limited in features.
A previous GSoC participant added unstable support for completions in Cargo itself, so we can have a single implementation with per-shell skins (rust-lang/cargo#6645).
There are many more arguments that need custom completers as well as polish in the completion system itself before this can be stabilized.
See
Expected result
Ideal:
- A report to clap maintainers on the state of the unstable completions and why they are ready for stabilization
- A report to cargo maintainers on the state of the unstable completions and why they are ready for stabilization
Desirable skills
Intermediate knowledge of Rust. Shell familiarity is a bonus.
Project size
Medium.
Difficulty
Medium.
Mentor
Description
When developers need to extend how Cargo builds their package, they can write a build script. This gives users quite a bit of flexibility, but build scripts:
- Allow running arbitrary code on the user's system, requiring extra auditing
- Need to be compiled and run before the relevant package can be built
- Are all-or-nothing, requiring users to do extra checks to avoid running expensive logic
- Run counter to the principles of third-party build tools that try to mimic Cargo
A developer could make their build script a thin wrapper around a library (e.g. shadow-rs), but a build script still exists to be audited (even if it's small) and each individual wrapper build script must be compiled and linked. This is still opaque to third-party build tools.
Leveraging an unstable feature, artifact dependencies, we could allow a developer to say that one or more dependencies should be run as build scripts, passing parameters to them.
This project would add unstable support for build script delegation that can then be evaluated for proposing as an RFC for approval.
See the proposal for more details.
Expected result
Milestones
- An unstable feature for multiple build scripts
- An unstable feature for passing parameters to build scripts from Cargo.toml, built on the above
- An unstable feature for build script delegation, built on the above two
Bonus: preparation work to stabilize a subset of artifact dependencies.
Desirable skills
Intermediate knowledge of Rust, especially experience with writing build scripts.
Project size
Large.
Difficulty
Medium.
Mentor
Description
Rustfmt is the code formatter for Rust code. Currently, to ensure stability, rustfmt uses unit tests that ensure source files do not get reformatted unexpectedly. Additionally, there is a tool (currently a shell script) called diffcheck that gets run to check for potentially unexpected changes across different large codebases. We would like to improve our tooling around that, namely extending the diffcheck job to include more crates, improving reporting (with HTML output, like a mini crater, the tool that runs compiler changes against all Rust crates published to crates.io), potentially rewriting the job in Rust, and improving reliability.
Rustfmt currently has a versioning system that gates unstable changes behind Version=Two, and the diffcheck job may be less reliable at reporting changes to Version=One when changes to unstable formatting are introduced. We'd like to see this story improved to make our test system more robust.
Expected result
A more robust and reliable infrastructure for testing the rustfmt codebase, potentially rewritten in Rust, with HTML output.
Desirable skills
Intermediate knowledge of Rust. Knowledge of CI and automation is welcome.
Project size
Small or medium, depending on the scale proposed.
Difficulty
Small or medium.
Mentor
Zulip streams
Related Links
Description
The libc crate is one of the oldest crates of the Rust ecosystem, long predating Rust 1.0. Additionally, it is one of the most widely used crates in the ecosystem (#4 most downloaded on crates.io).
This combination means that the current version of the libc crate (v0.2) is very conservative with breaking changes and remains backwards-compatible with all Rust compilers since Rust 1.13 (released in 2016).
The language has evolved a lot since Rust 1.13, and we would like to make use of these features in libc. The main one is support for union types to properly expose C unions.
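For readers unfamiliar with the feature, the sketch below shows how a Rust union with a C-compatible layout can mirror a C union. `IntOrFloat` and `float_bits` are made-up names for illustration and do not correspond to any item in the libc crate.

```rust
/// Hypothetical Rust mirror of the C declaration
/// `union int_or_float { int32_t as_int; float as_float; };`.
#[repr(C)]
union IntOrFloat {
    as_int: i32,
    as_float: f32,
}

/// Reinterpret the bits of a float as a 32-bit integer through the union.
fn float_bits(value: f32) -> i32 {
    let u = IntOrFloat { as_float: value };
    // Reading a union field is unsafe because the compiler cannot know
    // which field was last written. Here both fields are plain 32-bit
    // values, so every bit pattern is valid for either one.
    unsafe { u.as_int }
}
```

Because all fields share the same storage, the union is as large as its largest field (4 bytes here), which matches the layout a C compiler would use for the equivalent declaration.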
At the same time, there is a backlog of desired breaking changes tracked in this issue. Some of these come from the evolution of the underlying platforms, some come from a desire to use newer language features, while others are simple mistakes that we cannot correct without breaking existing code.
The goal of this project is to prepare and release the next major version of the libc crate.
Expected result
The libc crate is cleaned up and modernized, and released as version 0.3.
Desirable skills
Intermediate knowledge of Rust.
Project size
Medium.
Difficulty
Medium.
Mentor
Zulip streams
Description
cargo-semver-checks is a linter for semantic versioning. It ensures that Rust crates adhere to semantic versioning by looking for breaking changes in APIs.
It can currently catch ~120 different kinds of breaking changes, meaning there are hundreds of kinds of breaking changes it still cannot catch! The goal of this project is to extend its abilities, so that it can catch and prevent more breaking changes, by:
- adding more lints, which are expressed as queries over a database-like schema (playground)
- extending the schema, so more Rust functionality is made available for linting
Expected result
cargo-semver-checks will contain new lints, together with test cases that both ensure the lint triggers when expected and does not trigger in situations where it shouldn't (AKA false-positives).
Desirable skills
Intermediate knowledge of Rust. Familiarity with databases, query engines, or query language design is welcome but not required.
Project size
Medium or large, depends on how many lints will be implemented. The more lints, the better!
Difficulty
Medium to high, depends on the choice of implemented lints or schema extensions.
Mentor
Zulip streams
Related Links
- Playground where you can try querying Rust data
- GitHub issues describing not-yet-implemented lints
- Opportunities to add new schema, enabling new lints
- Query engine adapter
Description
When cargo-semver-checks reports a breaking change, it in principle has seen enough information for the breakage to be reproduced with an example program: a witness program.
Witness programs are valuable as they confirm that the suspected breakage did indeed happen, and is not a false-positive.
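As a small illustration of the concept, the sketch below uses two modules as stand-ins for the old and new release of a hypothetical crate in which a public function was removed. All names are invented for this example, and the witness shown is hand-written rather than output of cargo-semver-checks.

```rust
/// Stand-in for the old release of a hypothetical crate.
mod v1 {
    pub fn parse(input: &str) -> Option<u32> {
        input.trim().parse().ok()
    }

    pub fn parse_or_zero(input: &str) -> u32 {
        parse(input).unwrap_or(0)
    }
}

/// Stand-in for the new release: `parse_or_zero` has been removed.
mod v2 {
    pub fn parse(input: &str) -> Option<u32> {
        input.trim().parse().ok()
    }
}

/// Witness: compiles against `v1`. Changing the import below to `v2`
/// makes compilation fail because `parse_or_zero` no longer exists,
/// which confirms that the removal is a breaking change.
fn witness() -> u32 {
    use v1 as dep;
    dep::parse_or_zero(" 42 ")
}
```

The verification step then amounts to compiling the witness against both versions and checking that it succeeds on the old one and fails on the new one.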
Expected result
Automatic witness generation is something we've explored, but we've only scratched the surface at implementing it so far.
The goal of this project would be to take it the rest of the way: enable cargo-semver-checks to (with the user's opt-in) generate witness programs for each lint, verify that they indeed demonstrate the detected breakage, and inform the user appropriately of the breakage and the manner in which it was confirmed.
If a witness program fails to reproduce breakage flagged by one of our lints, we've found a bug — the tool should then prepare a diagnostic info packet and offer to help the user open an auto-populated GitHub issue.
Stretch goal: having implemented witness generation, run another study of SemVer compliance in the Rust ecosystem, similar to the study we completed in 2023. The new study would cover many more kinds of breaking changes, since cargo-semver-checks today has 2.5 times more lints than it did back then. It would also reveal any new false-positive issues, crashes, or other regressions that may have snuck into the tool in the intervening years.
Desirable skills
Intermediate knowledge of Rust. Interest in building dev tools, and empathy for user needs so we can design the best possible user experience. Familiarity with databases, query engines, or programming language design is welcome but not required.
Project size
Large
Difficulty
Medium
Mentor
Related Links
- Playground where you can try querying Rust data
- Use of witness programs to verify breaking change lints
Description
The RustCrypto Project maintains pure Rust implementations of hundreds of cryptographic algorithms, organized into repositories by algorithm type, e.g. block ciphers, stream ciphers, hash functions.
Each of these repositories contains a tracking issue identifying specific algorithms which currently lack an implementation, some of which are linked in the "Related Links" section below. Interested students can look through these issues and identify an algorithm which is currently unimplemented which sounds interesting to them, and then implement it as part of this project.
Alternatively, instead of implementing a new algorithm from scratch, a student could potentially choose to implement some significant unit of functionality in an existing algorithm implementation with an open associated issue on our GitHub trackers, an example of which might be implementing hardware acceleration support for our "bignum" library.
Expected result
One or more Rust crates/libraries containing a new implementation of a cryptographic algorithm implemented in pure Rust.
Desirable skills
Intermediate knowledge of Rust.
A background in mathematics, and some prior knowledge of cryptography, is helpful but not required, and we can provide guidance and review to ensure code is correct and securely implemented.
Project size
Will vary depending on the algorithm/project selected, but ideally small.
Note that while the code size of the deliverable may not be significant, due to the nature of cryptographic work it will typically still involve significant effort and iteration to deliver an implementation which is correct and secure.
Difficulty
Will also vary depending on the algorithm/project selected, but expected difficulty is medium/hard, as noted above.
Mentor
Zulip streams
Related Links
- Potential AEAD cipher projects
- Potential block cipher projects
- Potential elliptic curve projects
- Potential hash function projects
- Potential signature algorithm projects
- Potential stream cipher projects
- Potential SSH-related projects
Description
The Wild linker is a project to build a very fast linker in Rust that has incremental linking and hot reload capabilities.
It currently works well enough to link itself, the Rust compiler, clang (provided you use the right compiler flags) and a few other things. However, there are various features and combinations of flags that don’t yet work correctly. Furthermore, we have a pretty incomplete picture of what we don’t support.
The proposed project is to run the test suite of other linkers with Wild as the linker being tested, then for each failure, determine what the problem is. It’s expected that many failures will have the same root cause.
Expected result
Write a program, ideally in Rust, that runs the test suite of some other linker. Mold’s test suite is pretty easy to run with Wild, so that’s probably a good default choice. The Rust program should emit a CSV file with one row per test, whether the test passes or fails and if it fails, an attempt to identify the cause based on errors / warnings emitted by Wild.
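A minimal sketch of the CSV-emitting part could look like the following. The `TestRecord` fields, the column names, and the helper functions are assumptions made for illustration; the actual report format is up to the participant and mentor.

```rust
/// Outcome of one linker test. The fields are assumptions for illustration.
struct TestRecord {
    name: String,
    passed: bool,
    /// Best-effort cause, derived from the linker's errors or warnings.
    cause: Option<String>,
}

/// Quote a CSV field if it contains a comma, a quote, or a line break.
fn csv_field(s: &str) -> String {
    if s.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", s.replace('"', "\"\""))
    } else {
        s.to_string()
    }
}

/// Render all records as CSV text, one row per test, preceded by a header.
fn to_csv(records: &[TestRecord]) -> String {
    let mut out = String::from("test,result,cause\n");
    for r in records {
        let result = if r.passed { "pass" } else { "fail" };
        let cause = r.cause.as_deref().unwrap_or("");
        out.push_str(&format!(
            "{},{},{}\n",
            csv_field(&r.name),
            result,
            csv_field(cause)
        ));
    }
    out
}
```

Leaving the cause column empty for uncategorised failures keeps the file easy to filter later, for example to list all failing tests that still lack a diagnosis.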
For tests where Wild doesn’t currently emit any error or warning that is related to the cause of the test failure, attempt to make it do so. Some of the tests might fail for reasons that are hard to identify. It’s OK to just leave these as uncategorised. Where tests fail due to bugs or differences in behaviour of Wild, automatic classification likely isn’t practical. A one-off classification of these would be beneficial.
If time permits, pick something achievable that seems like an important feature / bug to support / fix and implement / fix it.
Desirable skills
Knowledge of Rust. Any existing knowledge of low-level details like assembly or the ELF binary format is useful, but can potentially be learned as we go.
Project size
Small to large depending on chosen scope.
Difficulty
Some of the work is medium. Diagnosing and / or fixing failures is often pretty hard.
Mentor
Further resources