Skip to content

Commit 97c930c

Browse files
committed
radicle-surf: refactor design: part 4
- Update design doc. - Refactor History API. - Use FromRevision and IntoCommit traits. - Add more test cases. Signed-off-by: Han Xu <[email protected]>
1 parent 7fe6993 commit 97c930c

File tree

16 files changed

+654
-389
lines changed

16 files changed

+654
-389
lines changed

radicle-surf/docs/refactor-design.md

Lines changed: 232 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -9,29 +9,27 @@ can use it to create a GitHub-like UI for a git repo:
99

1010
1. Code browsing: given a specific commit/ref, browse files and directories.
1111
2. Diff between two revisions that resolve into two commits.
12-
3. Retrieve the history of the commits.
13-
4. Retrieve a specific object: all its metadata.
14-
5. Retrieve the refs: Branches, Tags, Remotes, Notes and user-defined
15-
"categories", where a category is: refs/<category>/<...>.
12+
3. Retrieve the history of commits with a given head, and optionally a file.
13+
4. List refs and retrieve their metadata: Branches, Tags, Remotes,
14+
Notes and user-defined "categories", where a category is: refs/<category>/<...>.
1615

1716
## Motivation
1817

1918
The `radicle-surf` crate aims to provide a safe and easy-to-use API that
2019
supports the features listed in [Introduction]. Based on the existing API,
2120
the main goals of the refactoring are:
2221

23-
- API review: make API simpler whenever possible.
24-
- Address open issues in the original `radicle-surf` repo as much as possible.
25-
- Not to shy away from being `git` specific. (i.e. not to consider supporting
26-
other VCS systems)
27-
- Hide away `git2` from the API. The use of `git2` should be an implementation
28-
detail.
22+
- API review: identify the issues with the current API.
23+
- New API: propose a new API that could reuse parts of the existing API.
24+
- Address open issues in the original `radicle-surf` repo.
25+
- Be `git` specific. (i.e. no need to support other VCS systems)
26+
- Remove `git2` from the public API. The use of `git2` should be an
27+
implementation detail.
2928

3029
## API review
3130

32-
The current API has quite a bit accidental complexity that is not inherent with
33-
the requirements, especially when we can be git specific and don't care about
34-
other VCS systems.
31+
In this section, we review some core types in the current API and propose
32+
changes to them. The main theme is to make the API simpler and easier to use.
3533

3634
### Remove the `Browser`
3735

@@ -80,39 +78,121 @@ We no longer need another layer of indirection defined by `Vcs` trait.
8078
With the changes proposed in the previous section, we describe what the new API
8179
would look like and how they meet the requirements.
8280

83-
### Common principles
81+
### Basic types
8482

85-
#### How to identify things that resolve into a commit
83+
#### Revision and Commit
8684

87-
In our API, it will be good to have a single way to identify all refs and
88-
objects that resolve into commits. In other words, we try to avoid using
89-
different ways at different places. Currently there are multiple types in the
90-
API for this purpose:
85+
In Git, `Revision` commonly resolves into a `Commit` but could refers to other
86+
objects for example a `Blob`. Hence we need to keep both concepts in the API.
87+
Currently we have multiple types to identify a `Commit` or `Revision`.
9188

9289
- Commit
9390
- Oid
9491
- Rev
9592

96-
Because `Rev` is the most high level among those and supports refs already,
97-
I think we should use `Rev` in our API as much as possible.
93+
The relations between them are: all `Rev` and `Commit` can resolve into `Oid`,
94+
and most `Rev`s can resolve into `Commit`.
9895

99-
#### How to identify History
96+
On one hand, `Oid` is the ultimate unique identifer but it is more machine-
97+
friendly than human-friendly. On the other hand, `Revision` is most human-
98+
friendly and better suited in the API interface. A conversion from `Revision`
99+
to `Oid` will be useful.
100100

101-
TBD
101+
For the places where `Commit` is required, we should explicitly ask for
102+
`Commit` instead of `Revision`.
103+
104+
In conclusion, we define two new traits to support the use of `Revision` and
105+
`Commit`:
106+
107+
```Rust
108+
pub trait Revision {
109+
/// Resolves a revision into an object id in `repo`.
110+
fn object_id(&self, repo: &RepositoryRef) -> Result<Oid, Error>;
111+
}
112+
113+
pub trait ToCommit {
114+
/// Converts to a commit in `repo`.
115+
fn to_commit(self, repo: &RepositoryRef) -> Result<Commit, Error>;
116+
}
117+
```
118+
119+
These two traits will be implemented for most common representations of
120+
`Revision` and `Commit`, for example `&str`, refs like `Branch`, `Tag`, etc.
121+
Our API will use these traits where we expect a `Revision` or a `Commit`.
122+
123+
#### History
124+
125+
The current `History` is generic over VCS types and also retrieves the full list
126+
of commits when the history is created. The VCS part can be removed and the
127+
history can lazy-load the list of commits by implmenting `Iterator` to support
128+
potentially very long histories.
129+
130+
We can also store the head commit with the history so that it's easy to get
131+
the start point and it helps to identify the history.
132+
133+
To support getting the history of a file, we provide methods to modify a
134+
`History` to filter by a file path.
135+
136+
The new `History` type would look like this:
137+
138+
```Rust
139+
pub struct History<'a> {
140+
repo: RepositoryRef<'a>,
141+
head: Commit,
142+
revwalk: git2::Revwalk<'a>,
143+
filter_by: Option<FilterBy>,
144+
}
145+
146+
enum FilterBy {
147+
File { path: file_system::Path },
148+
}
149+
```
150+
151+
For the methods provided by `History`, please see section [Retrieve the history]
152+
(#retrieve-the-history) below.
153+
154+
#### Commit
155+
156+
`Commit` is a central concept in Git. In `radicle-surf` we define `Commit` type
157+
to represent its metadata:
158+
159+
```Rust
160+
pub struct Commit {
161+
/// Object Id
162+
pub id: Oid,
163+
/// The author of the commit.
164+
pub author: Author,
165+
/// The actor who committed this commit.
166+
pub committer: Author,
167+
/// The long form message of the commit.
168+
pub message: String,
169+
/// The summary message of the commit.
170+
pub summary: String,
171+
/// The parents of this commit.
172+
pub parents: Vec<Oid>,
173+
}
174+
```
175+
176+
To get the content (i.e. the tree object) of the commit, the user should use
177+
`snapshot` method described in [Code browsing](#code-browsing) section.
178+
179+
To get the diff of the commit, the user should use `diff_from_parent` method
180+
described in [Diffs](#diffs) section. Note that we might move that method to
181+
`Commit` type itself.
102182

103183
### Code browsing
104184

105185
The user should be able to browse the files and directories for any given
106-
commits or references. The core API is:
186+
commit. The core API is:
107187

108188
- Create a root Directory:
109189
```Rust
110-
imp RepositoryRef {
111-
pub fn snapshot(&self, rev: &Rev) -> Result<Directory, Error>;
190+
impl RepositoryRef {
191+
pub fn snapshot<C: ToCommit>(&self, commit: C) -> Result<Directory, Error>;
112192
}
113193
```
114194

115-
- Browse a Directory:
195+
- Browse a Directory's contents:
116196
```Rust
117197
impl Directory {
118198
pub fn contents(&self) -> impl Iterator<Item = &DirectoryContents>;
@@ -135,26 +215,142 @@ pub enum DirectoryContents {
135215

136216
### Diffs
137217

138-
The user would be able to create a diff between any two revisions that resolve
139-
into two commits.
218+
The user would be able to create a diff between any two revisions. In the first
219+
implementation, these revisions have to resolve into commits. But in future,
220+
the revisions could refer to other objects, e.g. files (blobs).
140221

141-
The main change is to use `Rev` instead of `Oid` to identify `from` and `to`.
142222
The core API is:
143223

144224
```Rust
145-
imp RepositoryRef {
146-
pub fn diff(&self, from: &Rev, to: &Rev) -> Result<Diff, Error>;
225+
impl RepositoryRef {
226+
/// Returns the diff between two revisions.
227+
pub fn diff<R: Revision>(&self, from: R, to: R) -> Result<Diff, Error>;
147228
}
148229
```
149230

150-
To help convert from `Oid` to `Rev`, we provide a helper method:
231+
We used to have the following method:
232+
```Rust
233+
/// Returns the diff between a revision and the initial state of the repo.
234+
pub fn initial_diff<R: Revision>(&self, rev: R) -> Result<Diff, Error>;
235+
```
236+
237+
However, it is not comparing with any other commit so the output is basically
238+
the snapshot of `R`. I am not sure if it is necessary. My take is that we can
239+
remove this method.
240+
241+
We also have the following method:
151242
```Rust
152-
imp RepositoryRef {
153-
/// Returns the Oid of `rev`.
154-
pub fn rev_oid(&self, rev: &Rev) -> Result<Oid, Error>;
243+
/// Returns the diff of a specific commit.
244+
pub fn diff_from_parent<C: ToCommit>(&self, commit: C) -> Result<Diff, Error>;
245+
```
246+
247+
I think it is probably better to instead define the above method as
248+
`Commit::diff()` as its output is associated with a `Commit`.
249+
250+
### Retrieve the history
251+
252+
The user would be able to get the list of previous commits reachable from a
253+
particular commit.
254+
255+
To create a `History` from a repo with a given head:
256+
```Rust
257+
impl RepositoryRef {
258+
pub fn history<C: ToCommit>(&self, head: C) -> Result<History, Error>;
155259
}
156260
```
157261

262+
`History` implements `Iterator` that produces `Result<Commit, Error>`, and
263+
also provides these methods:
264+
265+
```Rust
266+
impl<'a> History<'a> {
267+
pub fn new<C: ToCommit>(repo: RepositoryRef<'a>, head: C) -> Result<Self, Error>;
268+
269+
pub fn head(&self) -> &Commit;
270+
271+
// Modifies a history with a filter by `path`.
272+
// This is to support getting the history of a file.
273+
pub fn by_path(mut self, path: file_system::Path) -> Self;
274+
```
275+
276+
- Alternative design:
277+
278+
One potential downside of define `History` as an iterator is that:
279+
`history.next()` takes a mutable history object. A different design is to use
280+
`History` as immutable object that produces an iterator on-demand:
281+
282+
```Rust
283+
pub struct History<'a> {
284+
repo: RepositoryRef<'a>,
285+
head: Commit,
286+
}
287+
288+
impl<'a> History<'a> {
289+
/// This method creats a new `RevWalk` internally and return an
290+
/// iterator for all commits in a history.
291+
pub fn iter(&self) -> impl Iterator<Item = Commit>;
292+
}
293+
```
294+
295+
In this design, `History` does not keep `RevWalk` in its state. It will create
296+
a new one when `iter()` is called. I like the immutable interface of this design
297+
but did not implement it in the current code mainly because the libgit2 doc says
298+
[creating a new `RevWalk` is relatively expensive](https://libgit2.org/libgit2/#HEAD/group/revwalk/git_revwalk_new).
299+
300+
### List refs and retrieve their metadata
301+
302+
Git refs are simple names that point to objects using object IDs. `radicle-surf`
303+
support refs by its `Ref` type.
304+
305+
```Rust
306+
pub enum Ref {
307+
/// A git tag, which can be found under `.git/refs/tags/`.
308+
Tag {
309+
/// The name of the tag, e.g. `v1.0.0`.
310+
name: TagName,
311+
},
312+
/// A git branch, which can be found under `.git/refs/heads/`.
313+
LocalBranch {
314+
/// The name of the branch, e.g. `master`.
315+
name: BranchName,
316+
},
317+
/// A git branch, which can be found under `.git/refs/remotes/`.
318+
RemoteBranch {
319+
/// The remote name, e.g. `origin`.
320+
remote: String,
321+
/// The name of the branch, e.g. `master`.
322+
name: BranchName,
323+
},
324+
/// A git namespace, which can be found under `.git/refs/namespaces/`.
325+
///
326+
/// Note that namespaces can be nested.
327+
Namespace {
328+
/// The name value of the namespace.
329+
namespace: String,
330+
/// The reference under that namespace, e.g. The
331+
/// `refs/remotes/origin/master/ portion of `refs/namespaces/
332+
/// moi/refs/remotes/origin/master`.
333+
reference: Box<Ref>,
334+
},
335+
/// A git notes, which can be found under `.git/refs/notes`
336+
Notes {
337+
/// The default name is "commits".
338+
name: String,
339+
}
340+
}
341+
```
342+
343+
### Git Objects
344+
345+
Git has four kinds of objects: Blob, Tree, Commit and Tag. We have already
346+
discussed `Commit` (a struct) and `Tag` (`Ref::Tag`) types. For `blob`, we
347+
use `File` type to represent, and for `tree`, we use `Directory` type to
348+
represent. The motivation is to let the user "surf" a repo as a file system
349+
as much as possible, and to avoid Git internal concepts in our API if possible.
350+
351+
Open question: there could be some cases where the names `blob` and `tree`
352+
shall be used. We need to define such cases clearly if they exist.
353+
158354
## Error handling
159355

160356
TBD

radicle-surf/examples/diff.rs

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -32,10 +32,7 @@ fn main() {
3232
let base_oid = Oid::from_str(&options.base_revision).unwrap();
3333
let now = Instant::now();
3434
let elapsed_nanos = now.elapsed().as_nanos();
35-
let diff = repo
36-
.as_ref()
37-
.diff(&base_oid.into(), &head_oid.into())
38-
.unwrap();
35+
let diff = repo.as_ref().diff(base_oid, head_oid).unwrap();
3936
print_diff_summary(&diff, elapsed_nanos);
4037
}
4138

0 commit comments

Comments
 (0)