doc: git-add: clarify DESCRIPTION section #1952

Open: wants to merge 4 commits into base: master

64 changes: 33 additions & 31 deletions Documentation/git-add.adoc
@@ -3,7 +3,7 @@ git-add(1)

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans via GitGitGadget" <[email protected]> writes:

> - Remove the snapshot-based explanation of the index and replace it with
>   a diff-based explanation because I don't feel that it's useful in this
>   context to emphasize that git uses a snapshot-based model: the main
>   way most git users interact with the index is through `git diff` or
>   `git status`, which is a completely diff-based view of the index.

But isn't it the source of the most end-user confusion that they
cannot wean themselves off of the diff/patch worldview?

How would you explain what the users would see in their "git diff",
"git diff --cached", and "git commit" after doing "edit && add &&
edit", if you explain "add" to be storing the "diff" made by the
first edit?  Does their "git diff" after the second "edit" take that
previously stored "diff" and another "diff" made by the second
"edit" and magically combine them together to present a single
"diff"?

> -git-add - Add file contents to the index
> +git-add - Add new or changed files to the index

In other words, I do think "new or changed" is a good thing to say,
but the word "contents" is fundamental here.  "Add contents of new
or changed files to the index" would be good.

> +Add new or changed files to the index (also known as "staging area") to
> +prepare for a commit.

OK, but saying "files" here adds another kind of confusion.  What is
"added" is not the fact that these paths are kept track of by Git.
Instead we add the snapshot of the contents at the time of 'git add'.

Wouldn't "add file X" confuse folks who still remember how other
SCMs before Git operated (i.e. "file X is now known, so if I make
further changes to X next 'commit' command will record it") into
thinking that Git would do the same?

> +By default, `git commit` only commits changes that you've added to the
> +index. For example, if you've edited `file.c` and want to commit your
> +changes, you can run:
> +
> +   git add file.c
> +   git commit

What happens when you did "edit && add && edit && add"?  It commits
the two changes you added to the index?  I do not think it is
productive to hide the fact that you are preparing a snapshot of the
"next commit" in the index (or "staging the contents for the next
commit in the staging area") with various forms "git add", including
"git add -p".

And to help form that mental model, it would help to avoid phrasing
"commit your changes" (as if you are somehow dealing with "diff/patch")
and instead saying "commit the result of your changes" (stressing
that the "state" matters), I would think.

De-stressing the fact that we are taking a snapshot should probably
be considered a documentation regression here.  Thanks to "git add"
taking a snapshot, users can freely make further experimental changes
in the working tree files and then get the exact contents back by
checking the path out of the index with "git checkout -- <path>".
Thanks to "git commit" taking a snapshot, users can even go back to
the last commit by checking the path out of HEAD with "git
checkout HEAD -- <path>".
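
A minimal sketch of the "edit && add && edit" sequence described above, run in a throwaway repository. The temporary directory, the placeholder identity, and the file contents are assumptions made only for illustration.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.name "Example"            # placeholder identity
git config user.email "you@example.com"

echo "base" > file.c
git add file.c
git commit -q -m "initial"

echo "first edit" > file.c
git add file.c                    # the index now holds a snapshot of "first edit"
echo "second edit" > file.c       # the working tree moves on; the index does not

git diff --cached --stat          # HEAD vs. index
git diff --stat                   # index vs. working tree

staged=$(git show :file.c)        # contents currently recorded in the index
git checkout -- file.c            # take the path back out of the index
restored=$(cat file.c)
git checkout HEAD -- file.c       # take the path back out of HEAD
from_head=$(cat file.c)
```

Both "git diff" invocations compare two complete states, so neither needs a stored patch: the index only ever holds the contents from the most recent "git add".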

I'll stop here and let others express their opinions without further
commenting for now.

Thanks for working on these updates.

On the Git mailing list, "Julia Evans" wrote (reply to this):

> But isn't it the source of the most end-user confusion that they
> cannot wean themselves off of the diff/patch worldview?

To me it feels very contextual! My impression is that what's important for Git
users is to be able to think about commits as diffs in some contexts, and as
snapshots in other contexts. For example with `git rebase` I'm usually thinking
of my commits as diffs, but it's very helpful to me to think of a merge commit
as a snapshot, because the merge commit does not have to be a "combination" of
the two sides of the merge, it can have arbitrary extra content.

> Wouldn't "add file X" confuse folks who still remember how other
> SCMs before Git operated (i.e. "file X is now known, so if I make
> further changes to X next 'commit' command will record it") into
> thinking that Git would do the same?

The point about Subversion is interesting: I would expect that most
people learning about Git's data model in 2025 have never used
Subversion.

So while I think it's extremely important to make accurate statements
while talking about Git (and I think it's very possible that this description
is not accurate enough!), I do not think it's so important to specifically
target misconceptions that users coming from Subversion/CVS
may have.

>> +By default, `git commit` only commits changes that you've added to the
>> +index. For example, if you've edited `file.c` and want to commit your
>> +changes, you can run:
>> +
>> +   git add file.c
>> +   git commit
>
> What happens when you did "edit && add && edit && add"?  It commits
> the two changes you added to the index?  I do not think it is
> productive to hide the fact that you are preparing a snapshot of the
> "next commit" in the index (or "staging the contents for the next
> commit in the staging area") with various forms "git add", including
> "git add -p".

It could! It's easy for me to imagine a world where the index
stores an ordered list of diffs, which are applied as patches in
series when I commit. I guess you'd need some sort of
patch + patch + patch + diff workflow to generate the final diff,
but to me that doesn't feel so different from what Git is actually doing in
practice.

In any case, I'll think more about whether I think this is really
an accurate description. I'm always especially interested in the practical
consequences of having misconceptions about Git: for example (and maybe I'm
convincing myself to change my position here!) with `git mv` I think it can
become relevant pretty quickly that commits are snapshots, because if
you move a file and edit it then Git can't always accurately guess that you
intended to "move" the file rather than delete the file and create a new one.

I'd like to be able to have a similarly practical example of why it's important
to think of commits as snapshots in the context of `git add` but I haven't quite
found the right one yet. I've noticed that people will often sort of "reject"
information that does not fit their mental models, and I think "commits are
snapshots, this is important in this context because of
<specific practical consequence>" is much more convincing than just
"commits are snapshots".

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans" <[email protected]> writes:

>> Wouldn't "add file X" confuse folks who still remember how other
>> SCMs before Git operated (i.e. "file X is now known, so if I make
>> further changes to X next 'commit' command will record it") into
>> thinking that Git would do the same?
>
> The point about Subversion is interesting: I would expect that most
> people learning about Git's data model in 2025 have never used
> Subversion.

Even though I promised that I won't comment on this thread further
for now, I'd have to respond to this one.  

Times change.  I didn't have Subversion in mind when I wrote the
above.  It was CVS ;-)

Yes, I have heard that for the recent crop of developers, especially
new grads, Git is the only SCM they've ever touched.  If we can
assume that the data and mental model of Git is natural for our
intended audiences, that is great (we can also forget about the
diff/patch based world view, which comes from how CVS/RCS stored
their revision data, and assume that the snapshot based world view
is natural to our readers).

On the Git mailing list, "Julia Evans" wrote (reply to this):

> Yes, I have heard that for the recent crop of developers, especially
> new grads, Git is the only SCM they've ever touched.  If we can
> assume that the data and mental model of Git is natural for our
> intended audiences, that is great (we can also forget about the
> diff/patch based world view, which comes from how CVS/RCS stored
> their revision data, and assume that the snapshot based world view
> is natural to our readers).

Git is certainly the only version control system I've ever used: I started using
it when I was a new grad 15 years ago. Everything I know about Subversion or CVS
(almost nothing) I know from reading explanations of Git aimed at their users
or from trying to understand the origin of some of Git's terminology choices :)

re whether the snapshot based world view is "natural" or not to Git users: 
I did some very unscientific polls about people's mental models of Git
a while back at https://jvns.ca/blog/2024/03/28/git-poll-results/#commits

That one says that 42% of folks who responded think of commits as "snapshots"
and 50% as "diffs", which feels encouraging to me: after all, the poll doesn't
ask how Git represents commits internally, and many people replied in the
comments to say that they think of commits in both ways depending on the
situation.

On the Git mailing list, "D. Ben Knoble" wrote (reply to this):

On Tue, Aug 12, 2025 at 5:40 PM Julia Evans <[email protected]> wrote:
>
> > But isn't it the source of the most end-user confusion that they
> > cannot wean themselves off of the diff/patch worldview?
>
> To me it feels very contextual! My impression is that what's important for Git
> users is to be able to think about commits as diffs in some contexts, and as
> snapshots in other contexts. For example with `git rebase` I'm usually thinking
> of my commits as diffs, but it's very helpful to me to think of a merge commit
> as a snapshot, because the merge commit does not have to be a "combination" of
> the two sides of the merge, it can have arbitrary extra content.
>
[snip]
>
> >> +By default, `git commit` only commits changes that you've added to the
> >> +index. For example, if you've edited `file.c` and want to commit your
> >> +changes, you can run:
> >> +
> >> +   git add file.c
> >> +   git commit
> >
> > What happens when you did "edit && add && edit && add"?  It commits
> > the two changes you added to the index?  I do not think it is
> > productive to hide the fact that you are preparing a snapshot of the
> > "next commit" in the index (or "staging the contents for the next
> > commit in the staging area") with various forms "git add", including
> > "git add -p".
>
> It could! It's easy for me to imagine a world where the index
> stores an ordered list of diffs, which are applied as patches in
> series when I commit. I guess you'd need some sort of
> patch + patch + patch + diff workflow to generate the final diff,
> but to me that doesn't feel so different from what Git is actually doing in
> practice.
>
> In any case, I'll think more about whether I think this is really
> an accurate description. I'm always especially interested in the practical
> consequences of having misconceptions about Git: for example (and maybe I'm
> convincing myself to change my position here!) with `git mv` I think it can
> become relevant pretty quickly that commits are snapshots, because if
> you move a file and edit it then Git can't always accurately guess that you
> intended to "move" the file rather than delete the file and create a new one.
>
> I'd like to be able to have a similarly practical example of why it's important
> to think of commits as snapshots in the context of `git add` but I haven't quite
> found the right one yet. I've noticed that people will often sort of "reject"
> information that does not fit their mental models, and I think "commits are
> snapshots, this is important in this context because of
> <specific practical consequence>" is much more convincing than just
> "commits are snapshots".

Less a comment on this patch or diff ;) and more a meta-note: I happen
to have several links saved on the idea of "Snapshot vs. Patch" aka
"commit duality", so I figured I'd share. They reinforce to me, at
least, that the contextual mode of thinking is useful in practice,
even if the snapshot model is the (semantic) storage model [*].
Knowing about snapshots does make it far easier to interact with
objects directly, which also frequently helps me better understand how
to use particular commands.

- https://www.thirtythreeforty.net/posts/2020/01/the-wave-particle-duality-of-git-commits/
- https://roadrunnertwice.dreamwidth.org/596185.html (which references
Julia's work)
- of course, https://jvns.ca/blog/2024/01/05/do-we-think-of-git-commits-as-diffs--snapshots--or-histories/
;)
- https://stackoverflow.com/q/40617288/4400820,
https://stackoverflow.com/q/73646342/4400820,
https://stackoverflow.com/a/27760319/4400820
- https://github.blog/open-source/git/commits-are-snapshots-not-diffs/
- https://lore.kernel.org/git/[email protected]/

What I find is that, while we keep trying to reinforce the snapshot
mentality, there are situations where thinking in diffs is a
reasonable approximation. In the particular case of git-add, most
interactions I observe with the index are diff-based (git diff, git
diff --cached, etc.), but I'm not sure how to usefully clarify the
relationship between those things and the underlying trees involved
(working tree, HEAD, index :0:) in a manual section targeted primarily
at newcomers.

[*]: "Semantic" because deltas in packfiles muddy the _actual_ storage
model somewhat :)

-- 
D. Ben Knoble

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans" <[email protected]> writes:

> an accurate description. I'm always especially interested in the practical
> consequences of having misconceptions about Git: for example (and maybe I'm
> convincing myself to change my position here!) with `git mv` I think it can
> become relevant pretty quickly that commits are snapshots, because if
> you move a file and edit it then Git can't always accurately guess that you
> intended to "move" the file rather than delete the file and create a new one.

There is an easier-to-understand example.  If you pretend that you
"add" series of "diff/patch" to the index as you repeat "edit &&
add" three times, in the mental model of the users, there would be
three sets of patches stored in the index somehow.  It would be a
fair wish for the users to want to be able to revert only the change
you added with your second "git add" while keeping the first one and
the third (latest) one.  You cannot explain why you fundamentally
cannot give them such a new "feature", until you admit that what is
recorded is the latest snapshot and earlier snapshots are discarded.

Another thing that the "collection of diff/patch" view probably
harms understanding of users is merge, which is not a set of diffs,
one for each parent and the merge result.  Of course, as a merge is
symmetric across the parents, it is not diff between the first
parent and the merge result, either.
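
The "edit && add" repeated three times scenario can be sketched as follows, assuming a throwaway repository and made-up file contents. It only inspects what the index records after the third "git add".

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q

for n in 1 2 3; do
    echo "edit $n" > file.c       # edit
    git add file.c                # add: replaces the index entry for file.c
done

# The index has exactly one stage-0 entry for the path, not three.
entries=$(git ls-files -s file.c | wc -l | tr -d ' ')
staged=$(git show :file.c)        # contents of that single entry
git ls-files -s file.c
```

The blobs written by the first two "git add" calls may linger in the object database as unreferenced objects, but the index no longer points at them, so there is nothing there to "revert the second add" from.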

On the Git mailing list, "Julia Evans" wrote (reply to this):

> There is an easier-to-understand example.  If you pretend that you
> "add" series of "diff/patch" to the index as you repeat "edit &&
> add" three times, in the mental model of the users, there would be
> three sets of patches stored in the index somehow.  It would be a
> fair wish for the users to want to be able to revert only the change
> you added with your second "git add" while keeping the first one and
> the third (latest) one.  You cannot explain why you fundamentally
> cannot give them such a new "feature", until you admit that what is
> recorded is the latest snapshot and earlier snapshots are discarded.

Thanks, I think this is the perfect example and it gets at something about git
add that I’ve never totally understood: why are the earlier snapshots discarded?

Naively, one might think that:

1. the git index is just a tree object
2. when you commit, Git takes that tree object, attaches a
   message, and makes a commit with it
3. git maintains some sort of history (like the reflog) for the past "index"
   tree objects

If Git worked that way, I imagine it would be possible to implement the feature
you describe, and I feel like there's some sort of obvious reason (something
to do with performance?) for why the index isn't implemented this way that
I've never learned.

This example makes me think that if we want people to understand the
limitations of the index, it's important to communicate that the past
index snapshots are *discarded* and not just that the index is a snapshot.
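
For reference, steps 1 and 2 of the naive model above are close to what the plumbing commands expose, although the index itself is a separate file rather than a tree object. A rough sketch, assuming a throwaway repository and a placeholder identity:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.name "Example"            # placeholder identity
git config user.email "you@example.com"

echo "hello" > file.c
git add file.c

tree=$(git write-tree)                            # turn the current index into a tree object
commit=$(git commit-tree "$tree" -m "from index") # attach a message to that tree
git cat-file -p "$commit"                         # shows the tree line and the message
```

Nothing in this sketch keeps a history of earlier index states: each "git write-tree" only sees whatever the index contains at that moment.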

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans" <[email protected]> writes:

> This example makes me think that if we want people to understand the
> limitations of the index, it's important to communicate that the past
> index snapshots are *discarded* and not just that the index is a snapshot.

I suspect that you need to look at the whole thing backwards.  I
realize that it is another way to say that you are looking at the
whole thing backwards, so take your pick ;-).

Nobody stops you from extending the system to store more than one
snapshot in the index and allow your users to roll back to one of
these snapshots kept in the index.  The reason why we haven't done
so is because there has not been any motivating use case for such a
feature (and coming up with a reasonable UI for it would also be
more work).  After all, if you want to keep a set of good points to
go back to [*], that is what commits are for in the world view of
Git, where creating commits and moving around in history are cheap.
If it were something worth going back to, you'd do so at the commit
level.  "git stash" and its index operations (like the "--keep-index"
option that allows you to test with only what is in the index) are
implemented as (temporary) commits internally exactly for this
reason.

Having said that, there is a focused support to record the previous
state before a snapshot records a resolution for a conflicted path
[**].  This was added because of a concrete motivating use case to
allow you to recover from a botched conflict resolution (aka "gee, I
thought this resolution was OK but I did 'git add' way too early,
before I actually tested the result!"), where "you can commit to
mark the place to later go back" principle does not cleanly apply,
since commits in Git do not record conflicted state.

Please don't keep asking "why" on this point (i.e. "why not record
conflicts in commit?") and other things---at some point, the answers
will become a series of "that is how it is, and it has been good
enough for us", and then it becomes a waste of time to further ask
"why".  Until "here is the change I made to do things differently;
please see how well it works" materializes, that is.


[Footnote]

 * This is another example of why the snapshot worldview gives a clear
   workflow.  After you pile on several drunken-walk experimental
   commits on top of a good commit and realize that this particular
   line of effort is leading nowhere, you "jump back" to that known
   good point (i.e. "git reset --hard HEAD~7").  You do not have to
   apply these changes in reverse direction (i.e. "git apply -R") in
   reverse order (i.e. "git rev-list --reverse HEAD~7..").

** Read about "Resolve undo" in the documentation.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans via GitGitGadget" <[email protected]> writes:

> -git-add - Add file contents to the index
> +git-add - Add new or changed files to the index

Does it add much value to say "new or changed" here?  The command can
also be used to "stage" a removal of a path, e.g.

    $ rm tracked-file
    $ git add -u

so if the updated text is an attempt to give more details on what
kind of modifications are captured, it would be better to say "add
new, removed, or modified files".
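
A short sketch of the removal case, assuming a throwaway repository and a placeholder identity. It checks that "git add -u" stages the deletion of a tracked path.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.name "Example"            # placeholder identity
git config user.email "you@example.com"

echo "content" > tracked-file
git add tracked-file
git commit -q -m "initial"

rm tracked-file
git add -u                        # update the index for tracked paths, including removals

status=$(git status --porcelain)  # "D" in the first column means the deletion is staged
echo "$status"
```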

> +Add new or changed files to the index to prepare for a commit. The
> +"index" (also known as "staging area") is where Git stores the changes
> +that will be in the next commit.

I won't repeat myself about change-snapshot duality, but I do not
think the new text is the best we can do.

    Update contents recorded in the index to prepare for the next
    commit.  The index (also known as "staging area") is where Git
    stores the contents that will be in the next commit.

> +By default, `git commit` only commits changes that you've added to the
> +index.
> For example, if you've edited `file.c` and want to commit your
> +changes, you can run:

Likewise.  "and want to record the resulting contents".

> ...
> -Please see linkgit:git-commit[1] for alternative ways to add content to a
> -commit.

In the original, this comment does look a bit out of place (as the
text around there does not talk about `git commit`), but as you said
that by default 'git commit' makes an as-is commit above, it may be
a good idea to move this sentence there.  `git commit <pathspec>` is
a handy thing to know even for beginners, and making your next commit
is what the user is working towards by using "git add".

On the Git mailing list, Jean-Noël AVILA wrote (reply to this):

On Friday, 15 August 2025 02:38:45 CEST Junio C Hamano wrote:
> "Julia Evans via GitGitGadget" <[email protected]> writes:
> > -git-add - Add file contents to the index
> > +git-add - Add new or changed files to the index
> 
> Does it add much value to say "new or changed" here?  The command can
> also be used to "stage" a removal of a path, e.g.
> 
>     $ rm tracked-file
>     $ git add -u
> 
> so if the updated text is an attempt to give more details on what
> kind of modifications are captured, it would be better to say "add
> new, removed, or modified files".
> 

The way I see it is that git add *captures* a part of the current state of the 
working tree (be it addition/removal of contents of files or subtrees of the 
working dir) for the next commit. A commit *is* a snapshot of the state of the 
project. The concept of snapshot is central to understanding the behavior of 
git and its internals.

> > +Add new or changed files to the index to prepare for a commit. The
> > +"index" (also known as "staging area") is where Git stores the changes
> > +that will be in the next commit.
> 
> I won't repeat myself about change-snapshot duality, but I do not
> think the new text is the best we can do.
> 
>     Update contents recorded in the index to prepare for the next
>     commit.  The index (also known as "staging area") is where Git
>     stores the contents that will be in the next commit.

Particularly, the "stores the changes that..." part is really not what the 
reader should remember.

> 
> > +By default, `git commit` only commits changes that you've added to the
> > +index.

I do not understand this addition. I may be missing knowledge, but this 
behavior is not only "by default", it's the only behavior of git: commits are 
made with the content of the index. Let's not make it more complicated than it 
is already.

> > For example, if you've edited `file.c` and want to commit your
> 
> > +changes, you can run:
> Likewise.  "and want to record the resulting contents".
> 
> > ...
> > -Please see linkgit:git-commit[1] for alternative ways to add content to a
> > -commit.
> 
> In the original, this comment does look a bit out of place (as the
> text around there does not talk about `git commit`), but as you said
> that by default 'git commit' makes an as-is commit above, it may be
> a good idea to move this sentence there.  `git commit <pathspec>` is
> a handy thing to know even for beginners, and making your next commit
> is what the user is working towards by using "git add".



On the Git mailing list, Junio C Hamano wrote (reply to this):

Jean-Noël AVILA <[email protected]> writes:

> On Friday, 15 August 2025 02:38:45 CEST Junio C Hamano wrote:
>> "Julia Evans via GitGitGadget" <[email protected]> writes:
>> ...
>> > +By default, `git commit` only commits changes that you've added to the
>> > +index.
>
> I do not understand this addition. I may be missing knowledge, but this 
> behavior is not only "by default", it's the only behavior of git: commits are 
> made with the content of the index. Let's not make it more complicated than it 
> is already.

I'll only react to "the only behaviour" part, without "more
complicated" part.

I think Julia is referring to the fact that you can record the state
that is different from what is in the index (or, what has been
accumulated in the index by the past use of "git add" command that
is being discussed here) with "git commit [-i] <pathspec>".  You can
do

    $ edit fileA fileB ;# assume both are tracked
    $ git add fileA
    $ git commit fileB

and the resulting commit will record the contents for fileA found in
its parent (i.e. the result of "git add fileA" is not reflected).
If the last step were

    $ git commit -i fileB

then the resulting commit will record the contents of both fileA,
as you added it with the last "git add", and fileB, as found in the
working tree at the time "git commit -i" was run
(i.e. "git add fileB" was not required).

By default, after the edit of fileA&B and the add of fileA, "git
commit" would not be aware of what is currently in fileB in the
working tree, and records the same contents as its parent for all
paths except for fileA, which would record what was last added with
"git add" to the index.
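
The two variants above can be tried in a throwaway repository. This is only a sketch: the file contents ("A0", "B1", and so on), the commit messages, and the placeholder identity are assumptions made for illustration.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.name "Example"            # placeholder identity
git config user.email "you@example.com"

echo "A0" > fileA; echo "B0" > fileB
git add fileA fileB
git commit -q -m "initial"

echo "A1" > fileA; echo "B1" > fileB      # edit both tracked files
git add fileA
git commit -q -m "only fileB" fileB       # "git commit <pathspec>"

only_a=$(git show HEAD:fileA)     # fileA as recorded in the new commit
only_b=$(git show HEAD:fileB)     # fileB as recorded in the new commit
staged_a=$(git show :fileA)       # fileA as it still sits in the index

echo "B2" > fileB
git commit -q -m "include fileB" -i fileB # "git commit -i <pathspec>"

incl_a=$(git show HEAD:fileA)
incl_b=$(git show HEAD:fileB)
```

After the first commit, fileA in HEAD is still the parent's version while the added version stays staged. The second commit records both the staged fileA and the working tree version of fileB.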

>> > For example, if you've edited `file.c` and want to commit your
>> 
>> > +changes, you can run:
>> Likewise.  "and want to record the resulting contents".
>> 
>> > ...
>> > -Please see linkgit:git-commit[1] for alternative ways to add content to a
>> > -commit.
>> 
>> In the original, this comment does look a bit out of place (as the
>> text around there does not talk about `git commit`), but as you said
>> that by default 'git commit' makes an as-is commit above, it may be
>> a good idea to move this sentence there.  `git commit <pathspec>` is
>> a handy thing to know even for beginners, and making your next commit
>> is what the user is working towards by using "git add".

And this relates to "more complicated" part of your comment.

I think keeping "by default" above and also keeping this comment
that hints about non-as-is commits made with "git commit <pathspec>"
is slightly preferable to dropping both of them altogether.
With only four additional lines, we cover basic "edit && add && commit"
cycle fairly completely.

I am also fine to drop the mention of 'git commit' altogether, but
it feels somewhat incomplete to not talk about commit when teaching
add.  After all, add is one of the primary ways to prepare for the
next commit---putting it the other way around, you want to learn add
primarily because you eventually would want to make a commit.

In any case, only having one (i.e. "by default") and dropping the
other ("see linkgit:git-commit"), like the patch did, did not make
much sense to me.

Thanks.

NAME
----
-git-add - Add file contents to the index
+git-add - Add new or changed files to the index

SYNOPSIS

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans via GitGitGadget" <[email protected]> writes:

> From: Julia Evans <[email protected]>
>
> Motivations for this change:
>
> 1. Listing a huge number of options is visually overwhelming when
>    opening a man page for an unfamiliar command. It makes it harder
>    to understand the command's core syntax, like `git add <filename>`

For "git add", which has only one mode of operation, this may be
good.

Note that in general this is not necessarily a good idea, when a
command works in different modes (like "git branch" that can
list/enumerate or create/delete/manipulate), as not all the options
can be used in all the modes the command supports.  The "usage" part
of the output from "git branch -h" strikes a good balance, and we may
want to use it as a model.

There is t0450 that aspires to ensure the short usage "git <cmd> -h"
matches the synopsis section of "git help <cmd>" for all <cmd>; right
now we have too many exceptions, and we should move towards making
these exceptions smaller.

> 2. For options which can be passed independently of any other options,
>    including them in the SYNOPSIS does not add any information which you
>    can't already get from reading the OPTIONS section.

Except that you have to scan a lot of text, which is quite
inefficient when you *know* the general idea behind the option you
want to use, and are only looking for the exact spelling of it (e.g.
"was it spelled --ignore-removed?")

> `git add` has
>    some mutually exclusive options, namely:
>    [--[no-]all | -A | --[no-]ignore-removal | [--update | -u]]
>    but personally I already find that line so hard to parse that
>    removing it doesn't remove a lot of information

It is a very good point why we may want to have these cues to
express "these go together" (my earlier example of "branch") and
"only one of these is used".  I tend to agree with you that these
are not necessarily very easy to read.

While it is important to make it easier for new readers to learn, we
should also keep in mind that nobody remains a newbie forever.

>  [synopsis]
> -git add [--verbose | -v] [--dry-run | -n] [--force | -f] [--interactive | -i] [--patch | -p]
> -	[--edit | -e] [--[no-]all | -A | --[no-]ignore-removal | [--update | -u]] [--sparse]
> -	[--intent-to-add | -N] [--refresh] [--ignore-errors] [--ignore-missing] [--renormalize]
> -	[--chmod=(+|-)x] [--pathspec-from-file=<file> [--pathspec-file-nul]]
> -	[--] [<pathspec>...]

This being a long single line and with redundant "--long|-s" may be
making it unnecessarily ugly.  Have you considered folding lines and
simplifying "[--long | -s]" into "[-s]" and see if it makes it easier
to follow?  Documentation/git-commit.adoc may serve as a better
model.

> +git add [<options>] [--] [<pathspec>...]
>  
>  DESCRIPTION
>  -----------

On the Git mailing list, "Julia Evans" wrote (reply to this):

Thanks for the comments. I think for now I'll just remove this patch
from the series since I don't see a clear way forward and I think it'll
make it easier to focus on the other changes.

> Note that in general this is not necessarily a good idea, when a
> command works in different modes (like "git branch" that can
> list/enumerate or create/delete/manipulate), as not all the options
> can be used in all the modes the command supports.  

I've been thinking about that as well: I have some ideas I've been working on
for how to clarify the usage of different "modes" of a command by giving the
modes names, will share those when I get to a command with modes.

> Except that you have to scan a lot of text, which is quite
> inefficient when you *know* the general idea behind the option you
> want to use, and are only looking for the exact spelling of it (e.g.
> "was it spelled --ignore-removed?")

That's fair. Something that I hadn't considered is that how easy the OPTIONS
section is to scan depends on how the man page is formatted: some man
page viewers will bold the options (which I think makes them easier to scan),
but some won't.

> While it is important to make it easier for new readers to learn, we
> should also keep in mind that nobody remains a newbie forever.
> Have you considered folding lines and
> simplifying "[--long | -s]" into "[-s]" to see if that makes it easier
> to follow?  Documentation/git-commit.adoc may serve as a better
> model.

Hmm, here's what it looks like with the long options removed.
To me it doesn't feel like a big enough improvement, and it's harder
to tell what some of the short options (like `-n`) mean.

git add [-p] [-v] [-n] [-f] [-i] [-e] [-A | --no-all | -u]
	[--sparse] [--intent-to-add | -N] [--refresh] [--ignore-errors]
	[--ignore-missing] [--renormalize] [--chmod=(+|-)x]
	[--pathspec-from-file=<file> [--pathspec-file-nul]]
	[--] [<pathspec>...]

--------
@@ -16,37 +16,38 @@ git add [--verbose | -v] [--dry-run | -n] [--force | -f] [--interactive | -i] [-

DESCRIPTION
-----------
This command updates the index using the current content found in
the working tree, to prepare the content staged for the next commit.
It typically adds the current content of existing paths as a whole,
but with some options it can also be used to add content with
only part of the changes made to the working tree files applied, or
remove paths that do not exist in the working tree anymore.

The "index" holds a snapshot of the content of the working tree, and it
is this snapshot that is taken as the contents of the next commit. Thus
after making any changes to the working tree, and before running
the commit command, you must use the `add` command to add any new or
modified files to the index.

This command can be performed multiple times before a commit. It only
adds the content of the specified file(s) at the time the add command is
run; if you want subsequent changes included in the next commit, then
you must run `git add` again to add the new content to the index.

The `git status` command can be used to obtain a summary of which
files have changes that are staged for the next commit.

The `git add` command will not add ignored files by default. If any
ignored files were explicitly specified on the command line, `git add`
will fail with a list of ignored files. Ignored files reached by
directory recursion or filename globbing performed by Git (quote your
globs before the shell) will be silently ignored. The `git add` command can
be used to add ignored files with the `-f` (force) option.

Please see linkgit:git-commit[1] for alternative ways to add content to a
commit.
Add new or changed files to the index to prepare for a commit. The
"index" (also known as "staging area") is where Git stores the changes
that will be in the next commit.

By default, `git commit` only commits changes that you've added to the
index. For example, if you've edited `file.c` and want to commit your
changes, you can run:

git add file.c
git commit

You can also add only part of your changes to a file with `git add -p`.
Please see linkgit:git-commit[1] for alternative ways to add content to

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans via GitGitGadget" <[email protected]> writes:

> -This command can be performed multiple times before a commit.  It only
> -adds the content of the specified file(s) at the time the add command is
> -run; if you want subsequent changes included in the next commit, then
> -you must run `git add` again to add the new content to the index.
> +The `git add` command only adds the changes at the time that you run it.
> +If you edit `file.c` after adding it, you need to run `git add file.c`
> +again before committing.

I somehow find the text before this change easier to understand
(except for one thing).  "If you edit `file.c` after adding it" in
the new text says the same thing as "if you want subsequent ... in
the next commit" in the original but in a much better way.

> -The `git status` command can be used to obtain a summary of which
> -files have changes that are staged for the next commit.
> +If you want to check which changes have been added, you can run
> +`git status` to print out a summary of the changes that will be committed
> +or run `git diff --staged` to see the full diff.

Rewrite "diff --staged" to "diff --cached", simply because that is
how "git diff -h" shows.  After all, "--staged" is explained as a
"synonym" (and by definition, a synonym is something that you do not
have to use, as you can use the real thing).

"status" gives paths in two groups, "changes to be committed" and
"changes not staged for commit".  Explaining the use of "diff
--cached" to inspect what the user will be committing is a great
addition here, as it is a sensible way to sanity-check the result of
your index manipulations.  In addition, we should also talk about
"diff" to inspect what the user will be leaving out---in other
words, what the user might have forgotten to add, which is an equally,
if not more, useful sanity check you can do before you commit.

Thanks.
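
A minimal sketch of the "edit && add && edit" sequence discussed in this thread may help here. It assumes a scratch repository created with `mktemp -d`; the file contents and the identity given to `git config` are placeholders, not anything taken from the patch. It compares the content recorded in the last commit, in the index, and in the working tree, then prints the two diffs mentioned above:

```shell
#!/bin/sh
# Sketch only: throwaway repository, placeholder identity and contents.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo "version 1" >file.c
git add file.c
git commit -q -m "initial"

echo "version 2" >file.c   # first edit
git add file.c             # the index now records "version 2"
echo "version 3" >file.c   # second edit, not added

in_head=$(git show HEAD:file.c)   # content in the last commit
in_index=$(git show :file.c)      # content that would be committed next
in_worktree=$(cat file.c)         # content on disk

echo "HEAD:         $in_head"
echo "index:        $in_index"
echo "working tree: $in_worktree"

# What has been added (index compared with HEAD).
git diff --cached
# What has been left out (working tree compared with the index).
git diff
```

The index holds the whole content of `file.c` as of the `git add`, so the second edit only shows up in plain `git diff` until `git add file.c` is run again.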

On the Git mailing list, "Julia Evans" wrote (reply to this):

Hi,

> I somehow find the text before this change easier to understand
> (except for one thing).  "If you edit `file.c` after adding it" in
> the new text says the same thing as "if you want subsequent ... in
> the next commit" in the original but in a much better way.

I really appreciate all of this feedback. It makes me wonder if there would
be a better way to approach this man page. Usually when I'm revising a technical
explanation, I find people who are currently users of the software but who have
trouble understanding how it works. Then I ask them to give feedback on what's
confusing to them about the explanation or what questions they have.

I do this because I find that often people who are extremely comfortable
with using the software (including me, which is why I usually spend so much
time collecting feedback like this!) can lose sight of what's confusing to an
"average user". And every time I'm part of a discussion about documentation for
an open source project it seems a bit strange to me for a group of people who
all already understand the concept to be discussing what would be clearest to an
"average user": surely the users themselves should be the judge of what's clear
to them!

I'm still pretty new to writing open source documentation so I don't know if
collecting user feedback like this is a normal part of the process, but I always
learn a lot from this type of feedback and it's pretty easy for me to collect
it.

> Rewrite "diff --staged" to "diff --cached"

Will use `diff --cached`.

> In addition, we should also talk about
> "diff" to inspect what the user will be leaving out---in other
> words, what the user might have forgotten to add, which is an equally,
> if not more, useful sanity check you can do before you commit.

That makes sense to me.

best,
Julia

On the Git mailing list, "D. Ben Knoble" wrote (reply to this):

On Fri, Aug 15, 2025 at 12:10 PM Julia Evans <[email protected]> wrote:
>
> Hi,
>
> > I somehow find the text before this change easier to understand
> > (except for one thing).  "If you edit `file.c` after adding it" in
> > the new text says the same thing as "if you want subsequent ... in
> > the next commit" in the original but in a much better way.
>
> I really appreciate all of this feedback. It makes me wonder if there would
> be a better way to approach this man page. Usually when I'm revising a technical
> explanation, I find people who are currently users of the software but who have
> trouble understanding how it works. Then I ask them to give feedback on what's
> confusing to them about the explanation or what questions they have.
>
> I do this because I find that often people who are extremely comfortable
> with using the software (including me, which is why I usually spend so much
> time collecting feedback like this!) can lose sight of what's confusing to an
> "average user".

The curse of knowledge ;)

> And every time I'm part of a discussion about documentation for
> an open source project it seems a bit strange to me for a group of people who
> all already understand the concept to be discussing what would be clearest to an
> "average user": surely the users themselves should be the judge of what's clear
> to them!
>
> I'm still pretty new to writing open source documentation so I don't know if
> collecting user feedback like this is a normal part of the process, but I always
> learn a lot from this type of feedback and it's pretty easy for me to collect
> it.

Whether it is or isn't normal, we could probably still benefit from
that perspective.

As Junio likes to say, a mistake being old is no good reason to carry
it forward into the future (or replicate it). I'll take that to mean
we also have an opportunity to improve the inputs to documentation (as
"leaving out such a perspective" would be the "mistake"—note I'm not
ascribing intent, malicious or otherwise!).

-- 
D. Ben Knoble

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans" <[email protected]> writes:

> ... Then I ask them to give feedback on what's
> confusing to them about the explanation or what questions they have.
>
> I do this because I find that often people who are extremely comfortable
> with using the software (including me, which is why I usually spend so much
> time collecting feedback like this!) can lose sight of what's confusing to an
> "average user".

Yes, you can lose your novice status and it is hard to take it back
;-)  I agree with you that the next best thing you can do is to see
how well folks who still have that status do.

> And every time I'm part of a discussion about documentation for
> an open source project it seems a bit strange to me for a group of people who
> all already understand the concept to be discussing what would be clearest to an
> "average user": surely the users themselves should be the judge of what's clear
> to them!

Yes, with one caveat, which is that you need to be careful to avoid
throwing them into local optima.  A simplified world view may make
it look easier to swallow, but depending on the kind of white lies
you throw at them, some of them they may have to unlearn to further
understand the system.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"D. Ben Knoble" <[email protected]> writes:

> As Junio likes to say, a mistake being old is no good reason to carry
> it forward into the future (or replicate it).

I say no such thing, though.  What I say about past mistakes is that
you shouldn't use them as an excuse to make similar ones in the
future.

I'd prefer to let a sleeping dog lie.

But in the context of this discussion, I think what we carefully and
honestly need to look at are not past mistakes.  It is important to
adjust to the new world we live in.

In early days of Git, people from older SCM systems did not grok the
index very well, so our explanation of the concept of index and
adding content to it may have focused on teaching the difference
between our system and the back-then-major SCM systems.  Unless you
had used Bitkeeper, the "you can commit and your doing so would not
bother anybody else" plus "you can rewrite your private history
until you can pretend to be a super developer who came to the best
solution with a single attempt" freedoms were something quite new,
and we needed to educate folks on how to think and work well in the
distributed world.  Earlier in one of my messages, I said "making a
commit and switching to another commit is cheap", and that comment
came out of habit, but that is only understood by folks who have
used older SCM systems we displaced.

But with so many new users who haven't even touched anything other
than Git, the above examples may well not be the best way to teach
these things to this new crop of users.

a commit.

The `git add` command only adds the changes at the time that you run it.
If you edit `file.c` after adding it, you need to run `git add file.c`
again before committing.

If you want to check which changes have been added, you can run
`git status` to print out a summary of the changes that will be committed
or run `git diff --staged` to see the full diff.

`git add` will not add ignored files by default. You can use the
`--force` option to add ignored files. If you explicitly specify the
exact filename of an ignored file (e.g. `git add ignored.txt`), `git
add` will fail with a list of ignored files. Otherwise it will silently

On the Git mailing list, Chris Torek wrote (reply to this):

On Tue, Aug 12, 2025 at 1:35 PM Julia Evans via GitGitGadget
<[email protected]> wrote:
> +TERMINOLOGY NOTE
> +----------------
> +
> +Git uses the terms "staging area", "index" and "cache" interchangeably
> +for historical reasons. Many commands have flags like `--staged`,
> +`--index`, or `--cached`, and they all refer to the index.
> +

I think this is also a good idea. Unfortunately, `git apply` has two
different meanings for `--index` vs `--cached` (I believe it's the
*only* exception to the "means the same thing" rule...).

Chris

On the Git mailing list, Junio C Hamano wrote (reply to this):

Chris Torek <[email protected]> writes:

> On Tue, Aug 12, 2025 at 1:35 PM Julia Evans via GitGitGadget
> <[email protected]> wrote:
>> +TERMINOLOGY NOTE
>> +----------------
>> +
>> +Git uses the terms "staging area", "index" and "cache" interchangeably
>> +for historical reasons. Many commands have flags like `--staged`,
>> +`--index`, or `--cached`, and they all refer to the index.
>> +
>
> I think this is also a good idea. Unfortunately, `git apply` has two
> different meanings for `--index` vs `--cached` (I believe it's the
> *only* exception to the "means the same thing" rule...).

Yes, I think the first sentence is an excellent addition, even
though I do not know if "git add" is the best place to teach it.

However, it will be a disservice to users to say "they all refer to
the index" here.  Yes, it is technically correct that they all refer
to the index, but that much any intelligent reader can infer after
reading the first sentence that historically these three words were
used to refer to the same "index".  And what I think is bad in that
second sentence is that it implies they may mean the same thing
without saying that.  It is perfectly fine to say that these three
words express some operation around the index (sometimes called the
staging area).  It also is fine to say that "--staged" is sometimes
used as a synonym for `--cached`.

But at least `--cached` and `--index` mean quite different things.

As "git help cli" explains, an operation that can affect only the
index would use "--cached" and one that affects both the index and
the working tree would use "--index".

It may be that "apply" is currently the only exception (I did not
check), but it certainly is not guaranteed to remain the only
exception.  If a command wants to work on both the contents in the
index and in the working tree, such a command is very much welcome
to use the option "--index" to trigger such a mode of operation.

Conclusion?  I would rather see the "Many commands have ..." sentence
struck out.  After all, that does not need to be taught to those who
came here to learn about "git add".

Thanks.
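
The difference between the two options can be sketched with `git apply`, the command mentioned earlier in this thread. This is only an illustration in a scratch repository; the file name `file.txt`, the patch name `change.patch`, and the identity given to `git config` are made up for the example:

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical file and patch names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'one\n' >file.txt
git add file.txt
git commit -q -m "initial"

# Record a patch that changes "one" to "two", then undo the edit.
printf 'two\n' >file.txt
git diff >change.patch
git checkout -q -- file.txt

# --cached: apply the patch to the index only.
git apply --cached change.patch
cached_index=$(git show :file.txt)
cached_worktree=$(cat file.txt)
echo "--cached: index=$cached_index working tree=$cached_worktree"

# Start over from the committed state (the untracked patch file stays).
git reset -q --hard

# --index: apply the patch to both the index and the working tree.
git apply --index change.patch
index_index=$(git show :file.txt)
index_worktree=$(cat file.txt)
echo "--index:  index=$index_index working tree=$index_worktree"
```

With `--cached` the working tree copy is left as it was, while `--index` updates both places, which is why the two options cannot be described as meaning the same thing.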

On the Git mailing list, "Julia Evans" wrote (reply to this):

That sounds good to me, I'll remove the second sentence.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans via GitGitGadget" <[email protected]> writes:

> From: Julia Evans <[email protected]>
>
> I think the fact that git uses these three terms interchangeably is
> extremely confusing and that it deserves to be noted.

We tend to avoid saying "I think" in our proposed log messages, as
we do not churn the code and documentation merely to match personal
preferences.

I do not necessarily think "git add --help" is an appropriate place
to leave this note, by the way.  We should start from teaching "git
help glossary", which does not mention "staging area" at all, which
is a sign that it is somewhat outdated.  It does not use the verb
'to stage' even once, either.

Here is my attempt to improve the situation by giving a definition
of "staging area" in the glossary.  Luckily, "cache" already has its
own entry, describing it as an old synonym to the 'index', so I
didn't have to do anything there.  Also the description of 'index'
has a bit too much implementation detail, which I toned down.

---
Subject: glossary: talk about "staging area"

Surprisingly, "git help glossary" does not mention the 'staging
area' synonym for the index, or the verb 'to stage'.  As "git
status" output uses the latter (i.e. "Changes not staged for
commit"), we should not leave it undefined what the verb means.

Rewrite the definition of the `index` somewhat to reduce the level
of implementation detail exposed, and focus more on the fact that it
is a mapping from pathnames to the contents at these paths.  And
mention the `staging area` there, as well as giving its own glossary
entry.

Signed-off-by: Junio C Hamano <[email protected]>
---
 Documentation/glossary-content.adoc | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git c/Documentation/glossary-content.adoc w/Documentation/glossary-content.adoc
index e423e4765b..10f0c21e88 100644
--- c/Documentation/glossary-content.adoc
+++ w/Documentation/glossary-content.adoc
@@ -247,11 +247,15 @@ for a more flexible and robust system to do the same thing.
 	of Git you had to make them executable.
 
 [[def_index]]index::
-	A collection of files with stat information, whose contents are stored
-	as objects. The index is a stored version of your
-	<<def_working_tree,working tree>>. Truth be told, it can also contain a second, and even
-	a third version of a working tree, which are used
-	when <<def_merge,merging>>.
+	The index stores the mapping from filenames to their contents
+	to prepare the contents of the next commit by updating the
+	object recorded for each path (for this reason, people often
+	say that the index is "like the staging area" when explaining
+	the concept), together with other information to detect which
+	working tree files are modified efficiently.
+	During a conflicted <<def_merge,merge>>, the index can have
+	multiple versions of contents at higher stages for the same
+	path.
 
 [[def_index_entry]]index entry::
 	The information regarding a particular file, stored in the
@@ -650,6 +654,12 @@ the `refs/tags/` hierarchy is used to represent local tags..
 	is created by giving the `--depth` option to linkgit:git-clone[1], and
 	its history can be later deepened with linkgit:git-fetch[1].
 
+[[def_stage]]staging area::
+	A synonym for <<def_index,index>>.  Adding contents to the
+	index to update the mapping from the filename to its contents
+	is often called "to stage" (verb), as people explain the index
+	is like a staging area to prepare for the next commit.
+
 [[def_stash]]stash entry::
 	An <<def_object,object>> used to temporarily store the contents of a
 	<<def_dirty,dirty>> working directory and the index for future reuse.

ignore the file.
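
The ignored-file behaviour described in this paragraph can be checked with a short sketch. It assumes a scratch repository and made-up file names (`debug.log`, `main.c`) with a `*.log` ignore rule; none of these come from the patch itself:

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

echo '*.log' >.gitignore
echo "data" >debug.log
echo "code" >main.c

# Naming an ignored file explicitly makes `git add` fail.
if git add debug.log 2>/dev/null; then
	explicit=added
else
	explicit=refused
fi
echo "explicit add: $explicit"

# Reaching it through directory recursion skips it silently.
git add .
after_dot=$(git ls-files)
echo "tracked after 'git add .':"
echo "$after_dot"

# --force adds it despite the ignore rule.
git add --force debug.log
after_force=$(git ls-files)
echo "tracked after 'git add --force debug.log':"
echo "$after_force"
```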

[NOTE]
Git uses the terms "staging area", "index" and "cache" interchangeably
for historical reasons.

OPTIONS
-------
@@ -451,6 +452,7 @@ linkgit:git-rm[1]
linkgit:git-reset[1]
linkgit:git-mv[1]
linkgit:git-commit[1]
linkgit:git-diff[1]
linkgit:git-update-index[1]

GIT