-
Notifications
You must be signed in to change notification settings - Fork 6
radicle-surf: refactor design: part 4 #37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,29 +9,27 @@ can use it to create a GitHub-like UI for a git repo: | |
|
||
1. Code browsing: given a specific commit/ref, browse files and directories. | ||
2. Diff between two revisions that resolve into two commits. | ||
3. Retrieve the history of the commits. | ||
4. Retrieve a specific object: all its metadata. | ||
5. Retrieve the refs: Branches, Tags, Remotes, Notes and user-defined | ||
"categories", where a category is: refs/<category>/<...>. | ||
3. Retrieve the history of commits with a given head, and optionally a file. | ||
4. List refs and retrieve their metadata: Branches, Tags, Remotes, | ||
Notes and user-defined "categories", where a category is: refs/<category>/<...>. | ||
|
||
## Motivation | ||
|
||
The `radicle-surf` crate aims to provide a safe and easy-to-use API that | ||
supports the features listed in [Introduction]. Based on the existing API, | ||
the main goals of the refactoring are: | ||
|
||
- API review: make API simpler whenever possible. | ||
- Address open issues in the original `radicle-surf` repo as much as possible. | ||
- Not to shy away from being `git` specific. (i.e. not to consider supporting | ||
other VCS systems) | ||
- Hide away `git2` from the API. The use of `git2` should be an implementation | ||
detail. | ||
- API review: identify the issues with the current API. | ||
- New API: propose a new API that could reuse parts of the existing API. | ||
- Address open issues in the original `radicle-surf` repo. | ||
- Be `git` specific. (i.e. no need to support other VCS systems) | ||
- Remove `git2` from the public API. The use of `git2` should be an | ||
implementation detail. | ||
|
||
## API review | ||
|
||
The current API has quite a bit accidental complexity that is not inherent with | ||
the requirements, especially when we can be git specific and don't care about | ||
other VCS systems. | ||
In this section, we review some core types in the current API and propose | ||
changes to them. The main theme is to make the API simpler and easier to use. | ||
|
||
### Remove the `Browser` | ||
|
||
|
@@ -80,39 +78,121 @@ We no longer need another layer of indirection defined by `Vcs` trait. | |
With the changes proposed in the previous section, we describe what the new API | ||
would look like and how they meet the requirements. | ||
|
||
### Common principles | ||
### Basic types | ||
|
||
#### How to identify things that resolve into a commit | ||
#### Revision and Commit | ||
|
||
In our API, it will be good to have a single way to identify all refs and | ||
objects that resolve into commits. In other words, we try to avoid using | ||
different ways at different places. Currently there are multiple types in the | ||
API for this purpose: | ||
In Git, `Revision` commonly resolves into a `Commit` but could refers to other | ||
objects for example a `Blob`. Hence we need to keep both concepts in the API. | ||
Currently we have multiple types to identify a `Commit` or `Revision`. | ||
|
||
- Commit | ||
- Oid | ||
- Rev | ||
|
||
Because `Rev` is the most high level among those and supports refs already, | ||
I think we should use `Rev` in our API as much as possible. | ||
The relations between them are: all `Rev` and `Commit` can resolve into `Oid`, | ||
and most `Rev`s can resolve into `Commit`. | ||
|
||
#### How to identify History | ||
On one hand, `Oid` is the ultimate unique identifer but it is more machine- | ||
friendly than human-friendly. On the other hand, `Revision` is most human- | ||
friendly and better suited in the API interface. A conversion from `Revision` | ||
to `Oid` will be useful. | ||
|
||
TBD | ||
For the places where `Commit` is required, we should explicitly ask for | ||
`Commit` instead of `Revision`. | ||
|
||
In conclusion, we define two new traits to support the use of `Revision` and | ||
`Commit`: | ||
|
||
```Rust | ||
pub trait Revision { | ||
/// Resolves a revision into an object id in `repo`. | ||
fn object_id(&self, repo: &RepositoryRef) -> Result<Oid, Error>; | ||
} | ||
|
||
pub trait ToCommit { | ||
/// Converts to a commit in `repo`. | ||
fn to_commit(self, repo: &RepositoryRef) -> Result<Commit, Error>; | ||
} | ||
``` | ||
|
||
These two traits will be implemented for most common representations of | ||
`Revision` and `Commit`, for example `&str`, refs like `Branch`, `Tag`, etc. | ||
Our API will use these traits where we expect a `Revision` or a `Commit`. | ||
|
||
#### History | ||
|
||
The current `History` is generic over VCS types and also retrieves the full list | ||
of commits when the history is created. The VCS part can be removed and the | ||
history can lazy-load the list of commits by implmenting `Iterator` to support | ||
potentially very long histories. | ||
|
||
We can also store the head commit with the history so that it's easy to get | ||
the start point and it helps to identify the history. | ||
|
||
To support getting the history of a file, we provide methods to modify a | ||
`History` to filter by a file path. | ||
|
||
The new `History` type would look like this: | ||
|
||
```Rust | ||
pub struct History<'a> { | ||
repo: RepositoryRef<'a>, | ||
head: Commit, | ||
revwalk: git2::Revwalk<'a>, | ||
filter_by: Option<FilterBy>, | ||
} | ||
|
||
enum FilterBy { | ||
File { path: file_system::Path }, | ||
} | ||
``` | ||
|
||
For the methods provided by `History`, please see section [Retrieve the history] | ||
(#retrieve-the-history) below. | ||
|
||
#### Commit | ||
|
||
`Commit` is a central concept in Git. In `radicle-surf` we define `Commit` type | ||
to represent its metadata: | ||
|
||
```Rust | ||
pub struct Commit { | ||
/// Object Id | ||
pub id: Oid, | ||
/// The author of the commit. | ||
pub author: Author, | ||
/// The actor who committed this commit. | ||
pub committer: Author, | ||
/// The long form message of the commit. | ||
pub message: String, | ||
/// The summary message of the commit. | ||
pub summary: String, | ||
/// The parents of this commit. | ||
pub parents: Vec<Oid>, | ||
} | ||
``` | ||
|
||
To get the content (i.e. the tree object) of the commit, the user should use | ||
`snapshot` method described in [Code browsing](#code-browsing) section. | ||
|
||
To get the diff of the commit, the user should use `diff_from_parent` method | ||
described in [Diffs](#diffs) section. Note that we might move that method to | ||
`Commit` type itself. | ||
|
||
### Code browsing | ||
|
||
The user should be able to browse the files and directories for any given | ||
commits or references. The core API is: | ||
commit. The core API is: | ||
|
||
- Create a root Directory: | ||
```Rust | ||
imp RepositoryRef { | ||
pub fn snapshot(&self, rev: &Rev) -> Result<Directory, Error>; | ||
impl RepositoryRef { | ||
pub fn snapshot<C: ToCommit>(&self, commit: C) -> Result<Directory, Error>; | ||
} | ||
``` | ||
|
||
- Browse a Directory: | ||
- Browse a Directory's contents: | ||
```Rust | ||
impl Directory { | ||
pub fn contents(&self) -> impl Iterator<Item = &DirectoryContents>; | ||
|
@@ -135,26 +215,142 @@ pub enum DirectoryContents { | |
|
||
### Diffs | ||
|
||
The user would be able to create a diff between any two revisions that resolve | ||
into two commits. | ||
The user would be able to create a diff between any two revisions. In the first | ||
implementation, these revisions have to resolve into commits. But in future, | ||
the revisions could refer to other objects, e.g. files (blobs). | ||
|
||
The main change is to use `Rev` instead of `Oid` to identify `from` and `to`. | ||
The core API is: | ||
|
||
```Rust | ||
imp RepositoryRef { | ||
pub fn diff(&self, from: &Rev, to: &Rev) -> Result<Diff, Error>; | ||
impl RepositoryRef { | ||
/// Returns the diff between two revisions. | ||
pub fn diff<R: Revision>(&self, from: R, to: R) -> Result<Diff, Error>; | ||
} | ||
``` | ||
|
||
To help convert from `Oid` to `Rev`, we provide a helper method: | ||
We used to have the following method: | ||
```Rust | ||
/// Returns the diff between a revision and the initial state of the repo. | ||
pub fn initial_diff<R: Revision>(&self, rev: R) -> Result<Diff, Error>; | ||
``` | ||
|
||
However, it is not comparing with any other commit so the output is basically | ||
the snapshot of `R`. I am not sure if it is necessary. My take is that we can | ||
remove this method. | ||
|
||
We also have the following method: | ||
```Rust | ||
imp RepositoryRef { | ||
/// Returns the Oid of `rev`. | ||
pub fn rev_oid(&self, rev: &Rev) -> Result<Oid, Error>; | ||
/// Returns the diff of a specific commit. | ||
pub fn diff_from_parent<C: ToCommit>(&self, commit: C) -> Result<Diff, Error>; | ||
``` | ||
|
||
I think it is probably better to instead define the above method as | ||
`Commit::diff()` as its output is associated with a `Commit`. | ||
|
||
### Retrieve the history | ||
|
||
The user would be able to get the list of previous commits reachable from a | ||
particular commit. | ||
|
||
To create a `History` from a repo with a given head: | ||
```Rust | ||
impl RepositoryRef { | ||
pub fn history<C: ToCommit>(&self, head: C) -> Result<History, Error>; | ||
} | ||
``` | ||
|
||
`History` implements `Iterator` that produces `Result<Commit, Error>`, and | ||
also provides these methods: | ||
|
||
```Rust | ||
impl<'a> History<'a> { | ||
pub fn new<C: ToCommit>(repo: RepositoryRef<'a>, head: C) -> Result<Self, Error>; | ||
|
||
pub fn head(&self) -> &Commit; | ||
|
||
// Modifies a history with a filter by `path`. | ||
// This is to support getting the history of a file. | ||
pub fn by_path(mut self, path: file_system::Path) -> Self; | ||
``` | ||
|
||
- Alternative design: | ||
|
||
One potential downside of define `History` as an iterator is that: | ||
`history.next()` takes a mutable history object. A different design is to use | ||
`History` as immutable object that produces an iterator on-demand: | ||
|
||
```Rust | ||
pub struct History<'a> { | ||
repo: RepositoryRef<'a>, | ||
head: Commit, | ||
} | ||
|
||
impl<'a> History<'a> { | ||
/// This method creats a new `RevWalk` internally and return an | ||
/// iterator for all commits in a history. | ||
pub fn iter(&self) -> impl Iterator<Item = Commit>; | ||
} | ||
``` | ||
|
||
In this design, `History` does not keep `RevWalk` in its state. It will create | ||
a new one when `iter()` is called. I like the immutable interface of this design | ||
but did not implement it in the current code mainly because the libgit2 doc says | ||
[creating a new `RevWalk` is relatively expensive](https://libgit2.org/libgit2/#HEAD/group/revwalk/git_revwalk_new). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here is a clickable link for the libgit2 API doc . If people are interested in this alternative design, I can do some measurements to see how "expensive" git_revwalk_new actually is. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a super interesting find! Not only is it expensive it's also not thread-safe to share 😱 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to add, the "not thread-safe" part would make the alternative design a bit more attractive. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, fun fact! The repository can't be shared across threads either 🙃 In Rust this is indicated by it either having a Having the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't understand, can you elaborate :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Never mind, I was thinking that we could iterator the history to print out the commit list in |
||
|
||
### List refs and retrieve their metadata | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not quite sure about the below but we can follow up. I think it'd be better to think about these once you're more familiar with |
||
|
||
Git refs are simple names that point to objects using object IDs. `radicle-surf` | ||
support refs by its `Ref` type. | ||
|
||
```Rust | ||
pub enum Ref { | ||
/// A git tag, which can be found under `.git/refs/tags/`. | ||
Tag { | ||
/// The name of the tag, e.g. `v1.0.0`. | ||
name: TagName, | ||
}, | ||
/// A git branch, which can be found under `.git/refs/heads/`. | ||
LocalBranch { | ||
/// The name of the branch, e.g. `master`. | ||
name: BranchName, | ||
}, | ||
/// A git branch, which can be found under `.git/refs/remotes/`. | ||
RemoteBranch { | ||
/// The remote name, e.g. `origin`. | ||
remote: String, | ||
/// The name of the branch, e.g. `master`. | ||
name: BranchName, | ||
}, | ||
/// A git namespace, which can be found under `.git/refs/namespaces/`. | ||
/// | ||
/// Note that namespaces can be nested. | ||
Namespace { | ||
/// The name value of the namespace. | ||
namespace: String, | ||
/// The reference under that namespace, e.g. The | ||
/// `refs/remotes/origin/master/ portion of `refs/namespaces/ | ||
/// moi/refs/remotes/origin/master`. | ||
reference: Box<Ref>, | ||
}, | ||
/// A git notes, which can be found under `.git/refs/notes` | ||
Notes { | ||
/// The default name is "commits". | ||
name: String, | ||
} | ||
} | ||
``` | ||
|
||
### Git Objects | ||
|
||
Git has four kinds of objects: Blob, Tree, Commit and Tag. We have already | ||
discussed `Commit` (a struct) and `Tag` (`Ref::Tag`) types. For `blob`, we | ||
use `File` type to represent, and for `tree`, we use `Directory` type to | ||
represent. The motivation is to let the user "surf" a repo as a file system | ||
as much as possible, and to avoid Git internal concepts in our API if possible. | ||
|
||
Open question: there could be some cases where the names `blob` and `tree` | ||
shall be used. We need to define such cases clearly if they exist. | ||
|
||
## Error handling | ||
|
||
TBD |
Uh oh!
There was an error while loading. Please reload this page.