doc: git-add: clarify DESCRIPTION section #1952

gitgitgadget · 2025-08-12T21:04:23Z

On the Git mailing list, Junio C Hamano wrote:

"Julia Evans via GitGitGadget" writes:

> - Remove the snapshot-based explanation of the index and replace it with
>   a diff-based explanation because I don't feel that it's useful in this
>   context to emphasize that git uses a snapshot-based model: the main
>   way most git users interact with the index is through `git diff` or
>   `git status`, which is a completely diff-based view of the index.

But isn't it the source of the most end-user confusion that they cannot wean themselves off of the diff/patch worldview? How would you explain what the users would see in their "git diff", "git diff --cached", and "git commit" after doing "edit && add && edit", if you explain "add" to be storing the "diff" made by the first edit? Does their "git diff" after the second "edit" take that previously stored "diff" and another "diff" made by the second "edit" and magically combine them together to present a single "diff"?

> -git-add - Add file contents to the index
> +git-add - Add new or changed files to the index

In other words, I do think "new or changed" is a good thing to say, but the word "contents" is fundamental here. "Add contents of new or changed files to the index" would be good.

> +Add new or changed files to the index (also known as "staging area") to
> +prepare for a commit.

OK, but saying "files" here adds another kind of confusion. What is "added" is not the fact that these paths are kept track of by Git. Instead we add the snapshot of the contents at the time of 'git add'. Wouldn't "add file X" confuse folks who still remember how other SCMs before Git operated (i.e. "file X is now known, so if I make further changes to X next 'commit' command will record it") into thinking that Git would do the same?

> +By default, `git commit` only commits changes that you've added to the
> +index. For example, if you've edited `file.c` and want to commit your
> +changes, you can run:
> +
> +    git add file.c
> +    git commit

What happens when you did "edit && add && edit && add"? It commits the two changes you added to the index? I do not think it is productive to hide the fact that you are preparing a snapshot of the "next commit" in the index (or "staging the contents for the next commit in the staging area") with various forms of "git add", including "git add -p". And to help form that mental model, it would help to avoid phrasing "commit your changes" (as if you are somehow dealing with "diff/patch") and instead saying "commit the result of your changes" (stressing that the "state" matters), I would think.

De-stressing the fact that we are taking a snapshot should probably be considered a documentation regression here.

Thanks to "git add" taking a snapshot, users can further make experimental changes in the working tree files freely and then get the exact contents back by checking the path out of the index with "git checkout -- <path>". Thanks to "git commit" taking a snapshot, users can even go back to the last commit by taking the exact contents back by checking the path out of the HEAD with "git checkout HEAD -- <path>".

I'll stop here and let others express their opinions without further commenting for now. Thanks for working on these updates.
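
The "edit && add && edit" sequence asked about above can be tried in a scratch repository. What follows is only a minimal sketch: the temporary directory, the identity settings, and the contents written to `file.c` are assumptions made for illustration, not part of the proposed documentation.

```shell
set -e
# Throwaway repository; path, identity and file contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > file.c
git add file.c
git commit -q -m "initial"

echo "version 2" > file.c            # first edit
git add file.c                       # the index now holds "version 2"
echo "version 3" > file.c            # second edit, never added

staged=$(git show :file.c)           # contents currently recorded in the index
git --no-pager diff --cached         # HEAD vs. index: version 1 -> version 2
git --no-pager diff                  # index vs. working tree: version 2 -> version 3

git commit -q -m "second"
committed=$(git show HEAD:file.c)    # the commit records what was in the index

git checkout -- file.c               # take the contents back out of the index
restored=$(cat file.c)
```

Here both `git diff` invocations compare whole states (HEAD against the index, the index against the working tree), and the final checkout discards "version 3" in favor of the contents that were added.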

On the Git mailing list, "Julia Evans" wrote:

> But isn't it the source of the most end-user confusion that they
> cannot wean themselves off of the diff/patch worldview?

To me it feels very contextual! My impression is that what's important for Git users is to be able to think about commits as diffs in some contexts, and as snapshots in other contexts. For example with `git rebase` I'm usually thinking of my commits as diffs, but it's very helpful to me to think of a merge commit as a snapshot, because the merge commit does not have to be a "combination" of the two sides of the merge, it can have arbitrary extra content.

> Wouldn't "add file X" confuse folks who still remember how other
> SCMs before Git operated (i.e. "file X is now known, so if I make
> further changes to X next 'commit' command will record it") into
> thinking that Git would do the same?

The point about Subversion is interesting: I would expect that most people learning about Git's data model in 2025 have never used Subversion. So while I think it's extremely important to make accurate statements while talking about Git (and I think it's very possible that this description is not accurate enough!), I do not think it's so important to specifically target misconceptions that users coming from Subversion/CVS may have.

>> +By default, `git commit` only commits changes that you've added to the
>> +index. For example, if you've edited `file.c` and want to commit your
>> +changes, you can run:
>> +
>> +    git add file.c
>> +    git commit
>
> What happens when you did "edit && add && edit && add"? It commits
> the two changes you added to the index? I do not think it is
> productive to hide the fact that you are preparing a snapshot of the
> "next commit" in the index (or "staging the contents for the next
> commit in the staging area") with various forms "git add", including
> "git add -p".

It could! It's easy for me to imagine a world where the index stores an ordered list of diffs, which are applied as patches in series when I commit. I guess you'd need some sort of patch + patch + patch + diff workflow to generate the final diff, but to me that doesn't feel so different from what Git is actually doing in practice.

In any case, I'll think more about whether I think this is really an accurate description. I'm always especially interested in the practical consequences of having misconceptions about Git: for example (and maybe I'm convincing myself to change my position here!) with `git mv` I think it can become relevant pretty quickly that commits are snapshots, because if you move a file and edit it then Git can't always accurately guess that you intended to "move" the file rather than delete the file and create a new one.

I'd like to be able to have a similarly practical example of why it's important to think of commits as snapshots in the context of `git add` but I haven't quite found the right one yet. I've noticed that people will often sort of "reject" information that does not fit their mental models, and I think "commits are snapshots, this is important in this context because of <specific practical consequence>" is much more convincing than just "commits are snapshots".

On the Git mailing list, Junio C Hamano wrote:

"Julia Evans" writes:

>> Wouldn't "add file X" confuse folks who still remember how other
>> SCMs before Git operated (i.e. "file X is now known, so if I make
>> further changes to X next 'commit' command will record it") into
>> thinking that Git would do the same?
>
> The point about Subversion is interesting: I would expect that most
> people learning about Git's data model in 2025 have never used
> Subversion.

Even though I promised that I won't comment on this thread further for now, I'd have to respond to this one.

Times change. I didn't have Subversion in mind when I wrote the above. It was CVS ;-)

Yes, I have heard that for the recent crop of developers, especially new grads, Git is the only SCM they've ever touched. If we can assume that the data and mental model of Git is natural for our intended audiences, that is great (we can also forget about the diff/patch based world view, which comes from how CVS/RCS stored their revision data, and assume that the snapshot based world view is natural to our readers).

On the Git mailing list, "Julia Evans" wrote:

> Yes, I have heard that for the recent crop of developers, especially
> new grads, Git is the only SCM they've ever touched. If we can
> assume that the data and mental model of Git is natural for our
> intended audiences, that is great (we can also forget about the
> diff/patch based world view, which comes from how CVS/RCS stored
> their revision data, and assume that the snapshot based world view
> is natural to our readers).

Git is certainly the only version control system I've ever used: I started using it when I was a new grad 15 years ago. Everything I know about Subversion or CVS (almost nothing) I know from reading explanations of Git aimed at their users or from trying to understand the origin of some of Git's terminology choices :)

Re whether the snapshot based world view is "natural" or not to Git users: I did some very unscientific polls about people's mental models of Git a while back at https://jvns.ca/blog/2024/03/28/git-poll-results/#commits

That one says that 42% of folks who responded think of commits as "snapshots" and 50% as "diffs", which feels encouraging to me: after all, the poll doesn't ask how Git represents commits internally, and many people replied in the comments to say that they think of commits in both ways depending on the situation.

On the Git mailing list, "D. Ben Knoble" wrote:

On Tue, Aug 12, 2025 at 5:40 PM Julia Evans wrote:
>
> > But isn't it the source of the most end-user confusion that they
> > cannot wean themselves off of the diff/patch worldview?
>
> To me it feels very contextual! My impression is that what's important for Git
> users is to be able to think about commits as diffs in some contexts, and as
> snapshots in other contexts. For example with `git rebase` I'm usually thinking
> of my commits as diffs, but it's very helpful to me to think of a merge commit
> as a snapshot, because the merge commit does not have to be a "combination" of
> the two sides of the merge, it can have arbitrary extra content.
>
[snip]
>
> >> +By default, `git commit` only commits changes that you've added to the
> >> +index. For example, if you've edited `file.c` and want to commit your
> >> +changes, you can run:
> >> +
> >> +    git add file.c
> >> +    git commit
> >
> > What happens when you did "edit && add && edit && add"? It commits
> > the two changes you added to the index? I do not think it is
> > productive to hide the fact that you are preparing a snapshot of the
> > "next commit" in the index (or "staging the contents for the next
> > commit in the staging area") with various forms "git add", including
> > "git add -p".
>
> It could! It's easy for me to imagine a world where the index
> stores an ordered list of diffs, which are applied as patches in
> series when I commit. I guess you'd need some sort of
> patch + patch + patch + diff workflow to generate the final diff,
> but to me that doesn't feel so different from what Git is actually doing in
> practice.
>
> In any case, I'll think more about whether I think this is really
> an accurate description. I'm always especially interested in the practical
> consequences of having misconceptions about Git: for example (and maybe I'm
> convincing myself to change my position here!) with `git mv` I think it can
> become relevant pretty quickly that commits are snapshots, because if
> you move a file and edit it then Git can't always accurately guess that you
> intended to "move" the file rather than delete the file and create a new one.
>
> I'd like to be able to have a similarly practical example of why it's important
> to think of commits as snapshots in the context of `git add` but I haven't quite
> found the right one yet. I've noticed that people will often sort of "reject"
> information that does not fit their mental models, and I think "commits are
> snapshots, this is important in this context because of
> <specific practical consequence>" is much more convincing than just
> "commits are snapshots".

Less a comment on this patch or diff ;) and more a meta-note: I happen to have several links saved on the idea of "Snapshot vs. Patch" aka "commit duality", so I figured I'd share. They reinforce to me, at least, that the contextual mode of thinking is useful in practice, even if the snapshot model is the (semantic) storage model [*]. Knowing about snapshots does make it far easier to interact with objects directly, which also frequently helps me better understand how to use particular commands.

- https://www.thirtythreeforty.net/posts/2020/01/the-wave-particle-duality-of-git-commits/
- https://roadrunnertwice.dreamwidth.org/596185.html (which references Julia's work)
- of course, https://jvns.ca/blog/2024/01/05/do-we-think-of-git-commits-as-diffs--snapshots--or-histories/ ;)
- https://stackoverflow.com/q/40617288/4400820, https://stackoverflow.com/q/73646342/4400820, https://stackoverflow.com/a/27760319/4400820
- https://github.blog/open-source/git/commits-are-snapshots-not-diffs/
- https://lore.kernel.org/git/[email protected]/

What I find is that, while we keep trying to reinforce the snapshot mentality, there are situations where thinking in diffs is a reasonable approximation. In the particular case of git-add, most interactions I observe with the index are diff-based (git diff, git diff --cached, etc.), but I'm not sure how to usefully clarify the relationship between those things and the underlying trees involved (working tree, HEAD, index :0:) in a manual section targeted primarily at newcomers.

[*]: "Semantic" because deltas in packfiles muddy the _actual_ storage model somewhat :)

-- 
D. Ben Knoble

On the Git mailing list, Junio C Hamano wrote:

"Julia Evans" writes:

> an accurate description. I'm always especially interested in the practical
> consequences of having misconceptions about Git: for example (and maybe I'm
> convincing myself to change my position here!) with `git mv` I think it can
> become relevant pretty quickly that commits are snapshots, because if
> you move a file and edit it then Git can't always accurately guess that you
> intended to "move" the file rather than delete the file and create a new one.

There is an easier-to-understand example. If you pretend that you "add" series of "diff/patch" to the index as you repeat "edit && add" three times, in the mental model of the users, there would be three sets of patches stored in the index somehow. It would be a fair wish for the users to have, to be able to revert only the change you added with your second "git add" while keeping the first one and the third (latest) one. You cannot explain why you fundamentally cannot give them such a new "feature", until you admit that what is recorded is the latest snapshot and earlier snapshots are discarded.

Another thing that the "collection of diff/patch" view probably harms understanding of users is merge, which is not a set of diffs, one for each parent and the merge result. Of course, as a merge is symmetric across the parents, it is not diff between the first parent and the merge result, either.

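The point that only the latest snapshot is recorded can be observed directly. This is a minimal sketch in a scratch repository (the temporary directory and the file contents are assumptions for illustration): after three rounds of "edit && add", the index still has exactly one entry for the path, pointing at the blob from the last `git add`.

```shell
set -e
# Throwaway repository; path and contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q

ids=""
for n in 1 2 3; do
    echo "edit $n" > file.c               # edit
    git add file.c                        # add: replaces the index entry
    ids="$ids $(git rev-parse :file.c)"   # blob id the index points at right now
done
echo "blob ids the index pointed at over time:$ids"

entries=$(git ls-files --stage file.c | wc -l)   # number of index entries for the path
current=$(git show :file.c)                      # contents the index records now
```

The earlier blobs may linger in the object database as unreferenced objects for a while, but nothing in the index refers to them, so there is no "second add" left to revert.
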
On the Git mailing list, "Julia Evans" wrote:

> There is an easier-to-understand example. If you pretend that you
> "add" series of "diff/patch" to the index as you repeat "edit &&
> add" three times, in the mental model of the users, there would be
> three sets of patches stored in the index somehow. It would be a
> fair wish for the users to have, to be able to revert only the change
> you added with your second "git add" while keeping the first one and
> the third (latest) one. You cannot explain why you fundamentally
> cannot give them such a new "feature", until you admit that what is
> recorded is the latest snapshot and earlier snapshots are discarded.

Thanks, I think this is the perfect example and it gets at something about git add that I've never totally understood: why are the earlier snapshots discarded? Naively, one might think that:

1. the git index is just a tree object
2. when you commit, Git takes that tree object, attaches a message, and makes a commit with it
3. git maintains some sort of history (like the reflog) for the past "index" tree objects

If Git worked that way, I imagine it would be possible to implement the feature you describe, and I feel like there's some sort of obvious reason (something to do with performance?) for why the index isn't implemented this way that I've never learned.

This example makes me think that if we want people to understand the limitations of the index, it's important to communicate that the past index snapshots are *discarded* and not just that the index is a snapshot.

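The first two points of the naive model above can be compared with what Git does using plumbing commands. The index is not itself stored as a tree object, but `git write-tree` creates one from its current contents. The sketch below (scratch repository, identity and file contents are assumptions for illustration) checks that a plain `git commit` records that same tree.

```shell
set -e
# Throwaway repository; path, identity and contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > file.c
git add file.c

index_tree=$(git write-tree)                 # tree object built from the index
tree_type=$(git cat-file -t "$index_tree")   # object type of what was written

git commit -q -m "first commit"
commit_tree=$(git rev-parse 'HEAD^{tree}')   # tree recorded by the new commit
```

Nothing in this sketch keeps a history of earlier index states, which is the third point of the list.
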
On the Git mailing list, Junio C Hamano wrote:

"Julia Evans" writes:

> This example makes me think that if we want people to understand the
> limitations of the index, it's important to communicate that the past
> index snapshots are *discarded* and not just that the index is a snapshot.

I suspect that you need to look at the whole thing backwards. I realize that it is another way to say that you are looking at the whole thing backwards, so take your pick ;-).

Nobody stops you from extending the system to store more than one snapshot in the index and allow your users to roll back to one of these snapshots kept in the index. The reason why we haven't done so is because there has not been any motivating use case for such a feature (and coming up with a reasonable UI for it would also be more work).

After all, if you want to keep a set of good points to go back to [*], that is what commits are for in the world view of Git, where creating commits and moving around in history are cheap. If it were something worth going back to, you'd do so at the commit level. "git stash" and its index operations (like the "--keep-index" option that allows you to test with only what is in the index) are implemented as (temporary) commits internally exactly for this reason.

Having said that, there is a focused support to record the previous state before a snapshot records a resolution for a conflicted path [**]. This was added because of a concrete motivating use case to allow you to recover from a botched conflict resolution (aka "gee, I thought this resolution was OK but I did 'git add' way too early, before I actually tested the result!"), where "you can commit to mark the place to later go back" principle does not cleanly apply, since commits in Git do not record conflicted state.

Please don't keep asking "why" on this point (i.e. "why not record conflicts in commit?") and other things---at some point, the answers will become a series of "that is how it is, and it has been good enough for us", and then it becomes a waste of time to further ask "why". Until "here is the change I made to do things differently; please see how well it works" materializes, that is.

[Footnote]

* This is another example why the snapshot worldview gives a clear workflow. After you pile on several drunken-walk experimental commits on top of a good commit and realize that this particular line of effort is leading nowhere, you "jump back" to that known good point (i.e. "git reset --hard HEAD~7"). You do not have to apply these changes in reverse direction (i.e. "git apply -R") in reverse order (i.e. "git rev-list --reverse HEAD~7..").

** Read about "Resolve undo" in the documentation.

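The "jump back" described in the footnote can be sketched as follows, using three experimental commits instead of seven. The scratch repository, identity, file name and contents are assumptions for illustration.

```shell
set -e
# Throwaway repository; path, identity and contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "good" > notes.txt
git add notes.txt
git commit -q -m "known good state"
good=$(git rev-parse HEAD)

# Pile a few experimental commits on top of the good one.
for n in 1 2 3; do
    echo "experiment $n" >> notes.txt
    git commit -q -a -m "experiment $n"
done

# Jump straight back to the known good snapshot; no patches are reverse-applied.
git reset -q --hard HEAD~3
now=$(git rev-parse HEAD)
contents=$(cat notes.txt)
```

Because each commit is a full snapshot, going back is a matter of pointing the branch at the earlier commit and restoring its contents.
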
gitgitgadget · 2025-08-15T00:41:26Z

On the Git mailing list, Junio C Hamano wrote:

"Julia Evans via GitGitGadget" writes:

> -git-add - Add file contents to the index
> +git-add - Add new or changed files to the index

Does it add much value to say "new or changed" here? The command can also be used to "stage" a removal of a path, e.g.

    $ rm tracked-file
    $ git add -u

so if the updated text is an attempt to give more details on what kind of modifications are captured, it would be better to say "add new, removed, or modified files".

> +Add new or changed files to the index to prepare for a commit. The
> +"index" (also known as "staging area") is where Git stores the changes
> +that will be in the next commit.

I won't repeat myself about change-snapshot duality, but I do not think the new text is the best we can do.

    Update contents recorded in the index to prepare for the next
    commit. The index (also known as "staging area") is where Git
    stores the contents that will be in the next commit.

> +By default, `git commit` only commits changes that you've added to the
> +index.
> For example, if you've edited `file.c` and want to commit your
> +changes, you can run:

Likewise. "and want to record the resulting contents".

> ...
> -Please see linkgit:git-commit[1] for alternative ways to add content to a
> -commit.

In the original, this comment does look a bit out of place (as the text around there does not talk about `git commit`), but as you said that by default 'git commit' makes an as-is commit above, it may be a good idea to move this sentence there. `git commit <pathspec>` is a handy thing to know even for beginners, and making your next commit is what the user is working towards by using "git add".

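The removal case mentioned above ("rm tracked-file" followed by "git add -u") can be checked in a scratch repository. A minimal sketch; the temporary directory, identity and file contents are assumptions for illustration.

```shell
set -e
# Throwaway repository; path, identity and contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "data" > tracked-file
git add tracked-file
git commit -q -m "track a file"

rm tracked-file
git add -u                                    # update the index for tracked paths

staged=$(git diff --cached --name-status)     # what the next commit would change
remaining=$(git ls-files tracked-file)        # index entries left for the path
echo "$staged"
```

After `git add -u` the path is no longer in the index, so the next commit records a state without it.
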
On the Git mailing list, Jean-Noël AVILA wrote:

On Friday, 15 August 2025 02:38:45 CEST Junio C Hamano wrote: > "Julia Evans via GitGitGadget" <[email protected]> writes: > > -git-add - Add file contents to the index > > +git-add - Add new or changed files to the index > > Does it add much value to say "new or changed" here? The command can > also be used to "stage" a removal of a path, e.g. > > $ rm tracked-file > $ git add -u > > so if the updated text is an attempt to give more details on what > kind of modifications are captured, it would be better to say "add > new, removed, or modified files". > The way I see it is that git add *captures* a part of the current state of the working tree (be it addition/removal of contents of files or subtrees of the working dir) for the next commit. A commit *is* a snapshot of the state of the project. The concept of snapshot is central to understanding the behavior of git and its internals. > > +Add new or changed files to the index to prepare for a commit. The > > +"index" (also known as "staging area") is where Git stores the changes > > +that will be in the next commit. > > I won't repeat myself about change-snapshot duality, but I do not > think the new text is the best we can do. > > Update contents recorded in the index to prepare for the next > commit. The index (also known as "staging area") is where Git > stores the contents that will be in the next commit. Particularly, the "stores the changes that..." part is really not what the reader should remember. > > > +By default, `git commit` only commits changes that you've added to the > > +index. I do not understand this addition. I may not be missing knowledge, but this behavior is not only "by default", it's the only behavior of git: commits are made with the content of the index. Let's not make it more complicated than it is already. > > For example, if you've edited `file.c` and want to commit your > > > +changes, you can run: > Likewise. "and want to record the resulting contents". > > > ... 
> > -Please see linkgit:git-commit[1] for alternative ways to add content to a > > -commit. > > In the original, this comment does look a bit out of place (as the > text around there does not talk about `git commit`), but as you said > that by default 'git commit' makes an as-is commit above, it may be > a good idea to move this sentence there. `git commit <pathspec>` is > a handy thing to know even for beginners, and making your next commit > is what the user is working towards by using "git add".

On the Git mailing list, Junio C Hamano wrote:

Jean-Noël AVILA writes:

> On Friday, 15 August 2025 02:38:45 CEST Junio C Hamano wrote:
>> "Julia Evans via GitGitGadget" writes:
>> ...
>> > +By default, `git commit` only commits changes that you've added to the
>> > +index.
>
> I do not understand this addition. I may not be missing knowledge, but this
> behavior is not only "by default", it's the only behavior of git: commits are
> made with the content of the index. Let's not make it more complicated than it
> is already.

I'll only react to "the only behaviour" part, without "more complicated" part.

I think Julia is referring to the fact that you can record the state that is different from what is in the index (or, what has been accumulated in the index by the past use of "git add" command that is being discussed here) with "git commit [-i] <pathspec>". You can do

    $ edit fileA fileB ;# assume both are tracked
    $ git add fileA
    $ git commit fileB

and the resulting commit will record the contents for fileA found in its parent (i.e. the result of "git add fileA" is not reflected). If the last step were

    $ git commit -i fileB

then the resulting commit will record the contents for both fileA you added with the last "git add" on it, and contents for fileB found in the working tree at the time "git commit -i" was run (i.e. "git add fileB" was not required).

By default, after the edit of fileA&B and the add of fileA, "git commit" would not be aware of what is currently in fileB in the working tree, and records the same contents as its parent for all paths except for fileA, which would record what was last added with "git add" to the index.

>> > For example, if you've edited `file.c` and want to commit your
>>
>> > +changes, you can run:
>> Likewise. "and want to record the resulting contents".
>>
>> > ...
>> > -Please see linkgit:git-commit[1] for alternative ways to add content to a
>> > -commit.
>>
>> In the original, this comment does look a bit out of place (as the
>> text around there does not talk about `git commit`), but as you said
>> that by default 'git commit' makes an as-is commit above, it may be
>> a good idea to move this sentence there. `git commit <pathspec>` is
>> a handy thing to know even for beginners, and making your next commit
>> is what the user is working towards by using "git add".

And this relates to "more complicated" part of your comment.

I think keeping "by default" above and also keeping this comment that hints about non-as-is commits made with "git commit <pathspec>" is slightly more preferable than dropping both of them altogether. With only four additional lines, we cover basic "edit && add && commit" cycle fairly completely.

I am also fine to drop the mention of 'git commit' altogether, but it feels somewhat incomplete to not talk about commit when teaching add. After all, add is one of the primary ways to prepare for the next commit---putting it the other way around, you want to learn add primarily because you eventually would want to make a commit.

In any case, only having one (i.e. "by default") and dropping the other ("see linkgit:git-commit"), like the patch did, did not make much sense to me.

Thanks.

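The two variants described above, `git commit <pathspec>` and `git commit -i <pathspec>`, can be run back to back in a scratch repository. This is a minimal sketch under stated assumptions: the temporary directory, identity, and the one-line contents `a1`, `a2`, `b1`, `b2`, `b3` are made up for illustration.

```shell
set -e
# Throwaway repository; path, identity and contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "a1" > fileA
echo "b1" > fileB
git add fileA fileB
git commit -q -m "base"

# Edit both files, but add only fileA.
echo "a2" > fileA
echo "b2" > fileB
git add fileA

# "git commit <pathspec>": records fileB from the working tree and
# leaves out what was staged for fileA.
git commit -q -m "only fileB" fileB
only_a=$(git show HEAD:fileA)
only_b=$(git show HEAD:fileB)
still_staged=$(git show :fileA)     # the earlier "git add fileA" is still in the index

# "git commit -i <pathspec>": records the index plus fileB from the working tree.
echo "b3" > fileB
git commit -q -m "index plus fileB" -i fileB
incl_a=$(git show HEAD:fileA)
incl_b=$(git show HEAD:fileB)
```

In the first commit fileA keeps the contents of its parent; in the second, the staged fileA and the working-tree fileB are both recorded without a separate `git add fileB`.
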
gitgitgadget · 2025-08-12T20:45:57Z

On the Git mailing list, Junio C Hamano wrote:

"Julia Evans via GitGitGadget" writes:

> From: Julia Evans
>
> Motivations for this change:
>
> 1. Listing a huge number of options is visually overwhelming when
>    opening a man page for an unfamiliar command. It makes it harder
>    to understand the command's core syntax, like `git add <filename>`

For "git add", which has only one mode of operation, this may be good.

Note that in general this is not necessarily a good idea, when a command works in different modes (like "git branch" that can list/enumerate or create/delete/manipulate), as not all the options can be used in all the modes the command supports.

The "usage" part of the output from "git branch -h" hits a good balance, and we may want to use it as a model. There is t0450 that aspires to ensure the short usage "git <cmd> -h" matches the synopsis section of "git help <cmd>" for all <cmd>; right now we have too many exceptions, and we should move towards making these exceptions smaller.

> 2. For options which can be passed independently of any other options,
>    including them in the SYNOPSIS does not add any information which you
>    can't already get from reading the OPTIONS section.

Except that you have to scan a lot of text, which is quite inefficient when you *know* the general idea behind the option you want to use, and are only looking for the exact spelling of it (e.g. "was it spelled --ignore-removed?")

>    `git add` has
>    some mutually exclusive options, namely:
>    [--[no-]all | -A | --[no-]ignore-removal | [--update | -u]]
>    but personally I already find that line so hard to parse that
>    removing it doesn't remove a lot of information

It is a very good point why we may want to have these cues to express "these go together" (my earlier example of "branch") and "only one of these is used". I tend to agree with you that these are not necessarily very easy to read.

While it is important to make it easier for new readers to learn, we should also keep in mind that nobody remains a newbie forever.

>  [synopsis]
> -git add [--verbose | -v] [--dry-run | -n] [--force | -f] [--interactive | -i] [--patch | -p]
> -        [--edit | -e] [--[no-]all | -A | --[no-]ignore-removal | [--update | -u]] [--sparse]
> -        [--intent-to-add | -N] [--refresh] [--ignore-errors] [--ignore-missing] [--renormalize]
> -        [--chmod=(+|-)x] [--pathspec-from-file=<file> [--pathspec-file-nul]]
> -        [--] [<pathspec>...]

This being a long single line and with redundant "--long|-s" may be making it unnecessarily ugly. Have you considered folding lines and simplifying "[--long | -s]" into "[-s]" and see if it makes easier to follow? Documentation/git-commit.adoc may serve as a better model.

> +git add [<options>] [--] [<pathspec>...]
>
>  DESCRIPTION
>  -----------

On the Git mailing list, "Julia Evans" wrote:

Thanks for the comments. I think for now I'll just remove this patch from the series since I don't see a clear way forward and I think it'll make it easier to focus on the other changes.

> Note that in general this is not necessarily a good idea, when a
> command works in different modes (like "git branch" that can
> list/enumerate or create/delete/manipulate), as not all the options
> can be used in all the modes the command supports.

I've been thinking about that as well: I have some ideas I've been working on for how to clarify the usage of different "modes" of a command by giving the modes names, will share those when I get to a command with modes.

> Except that you have to scan a lot of text, which is quite
> inefficient when you *know* the general idea behind the option you
> want to use, and are only looking for the exact spelling of it (e.g.
> "was it spelled --ignore-removed?")

That's fair. Something that I hadn't considered is that how easy the OPTIONS section is to scan depends on how the man page is formatted: some man page viewers will bold the options (which I think makes them easier to scan), but some won't.

> While it is important to make it easier for new readers to learn, we
> should also keep in mind that nobody remains a newbie forever.
> Have you considered folding lines and
> simplifying "[--long | -s]" into "[-s]" and see if it makes easier
> to follow? Documentation/git-commit.adoc may serve as a better
> model.

Hmm, here's what it looks like with the long options removed. To me it doesn't feel like a big enough improvement, and it's harder to tell what some of the short options (like `-n`) mean.

    git add [-p] [-v] [-n] [-f] [-i] [-e] [-A | --no-all | -u] [--sparse]
            [--intent-to-add | -N] [--refresh] [--ignore-errors] [--ignore-missing] [--renormalize]
            [--chmod=(+|-)x] [--pathspec-from-file=<file> [--pathspec-file-nul]]
            [--] [<pathspec>...]

gitgitgadget · 2025-08-14T22:25:06Z

On the Git mailing list, Junio C Hamano wrote:

"Julia Evans via GitGitGadget" writes:

> -This command can be performed multiple times before a commit. It only
> -adds the content of the specified file(s) at the time the add command is
> -run; if you want subsequent changes included in the next commit, then
> -you must run `git add` again to add the new content to the index.
> +The `git add` command only adds the changes at the time that you run it.
> +If you edit `file.c` after adding it, you need to run `git add file.c`
> +again before committing.

I somehow find the text before this change easier to understand (except for one thing). "If you edit `file.c` after adding it" in the new text says the same thing as "if you want subsequent ... in the next commit" in the original but in a much better way.

> -The `git status` command can be used to obtain a summary of which
> -files have changes that are staged for the next commit.
> +If you want to check which changes have been added, you can run
> +`git status` to print out a summary of the changes that will be committed
> +or run `git diff --staged` to see the full diff.

Rewrite "diff --staged" to "diff --cached", simply because that is how "git diff -h" shows. After all, "--staged" is explained as a "synonym" (and by definition, a synonym is something that you do not have to use, as you can use the real thing).

"status" gives paths in two groups, "changes to be committed" and "changes not staged for commit". Explaining the use of "diff --cached" to inspect what the user will be committing is a great addition here, as it is a sensible way to sanity-check the result of your index manipulations. In addition, we also should talk about "diff" to inspect what the user will be leaving out---in other words, what the user might have forgotten to add, which is equally if not more useful sanity-check you can do before you commit.

Thanks.

On the Git mailing list, "Julia Evans" wrote (reply to this):

Hi, > I somehow find the text before this change easier to understand > (except for one thing). "If you edit `file.c` after adding it" in > the new text says the same thing as "if you want subsequent ... in > the next commit" in the original but in a much better way. I really appreciate all of this feedback. It makes me wonder if there would be a better way to approach this man page. Usually when I'm revising a technical explanation, I find people who are currently users of the software but who have trouble understanding how it works. Then I ask them to give feedback on what's confusing to them about the explanation or what questions they have. I do this because I find that often people who are extremely comfortable with using the software (including me, which is why I usually spend so much time collecting feedback like this!) can lose sight of what's confusing to an "average user". And every time I'm part of a discussion about documentation for an open source project it seems a bit strange to me for a group of people who all already understand the concept to be discussing what would be clearest to an "average user": surely the users themselves should be the judge of what's clear to them! I'm still pretty new to writing open source documentation so I don't know if collecting user feedback like this is a normal part of the process, but I always learn a lot from this type of feedback and it's pretty easy for me to collect it. > Rewrite "diff --staged" to "diff --cached" Will use `diff --cached`. > In addition, we also should talk about > "diff" to inspect what the user will be leaving out---in other > words, what the user might have forgotten to add, which is equally > if not more useful sanity-check you can do before you commit. That makes sense to me. best, Julia
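The two sanity checks discussed in this exchange can be tried out with an "edit, add, edit" sequence. The following is a minimal sketch in a throwaway repository; the directory, the identity settings, and the `v1`/`v2`/`v3` contents are assumptions made up for the example.

```shell
set -e
# Throwaway repository; names and contents are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'v1\n' > file.c
git add file.c
git commit -q -m "initial"

printf 'v2\n' > file.c    # first edit
git add file.c            # the index now records the v2 contents
printf 'v3\n' > file.c    # second edit, not added

git status --short        # one line per path: staged and unstaged state
git diff --cached         # HEAD vs. index: what the next commit would record
git diff                  # index vs. working tree: what would be left out

staged=$(git show :file.c)   # contents currently recorded in the index
```

Here `git diff --cached` shows the change from `v1` to `v2`, while plain `git diff` shows the change from `v2` to `v3`, so the second edit stays out of the next commit until `git add` is run again.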

On the Git mailing list, "D. Ben Knoble" wrote (reply to this):

On Fri, Aug 15, 2025 at 12:10 PM Julia Evans <[email protected]> wrote: > > Hi, > > > I somehow find the text before this change easier to understand > > (except for one thing). "If you edit `file.c` after adding it" in > > the new text says the same thing as "if you want subsequent ... in > > the next commit" in the original but in a much better way. > > I really appreciate all of this feedback. It makes me wonder if there would > be a better way to approach this man page. Usually when I'm revising a technical > explanation, I find people who are currently users of the software but who have > trouble understanding how it works. Then I ask them to give feedback on what's > confusing to them about the explanation or what questions they have. > > I do this because I find that often people who are extremely comfortable > with using the software (including me, which is why I usually spend so much > time collecting feedback like this!) can lose sight of what's confusing to an > "average user". The curse of knowledge ;) > And every time I'm part of a discussion about documentation for > an open source project it seems a bit strange to me for a group of people who > all already understand the concept to be discussing what would be clearest to an > "average user": surely the users themselves should be the judge of what's clear > to them! > > I'm still pretty new to writing open source documentation so I don't know if > collecting user feedback like this is a normal part of the process, but I always > learn a lot from this type of feedback and it's pretty easy for me to collect > it. Whether it is or isn't normal, we could probably still benefit from that perspective. As Junio likes to say, a mistake being old is no good reason to carry it forward into the future (or replicate it). 
I'll take that to mean we also have an opportunity to improve the inputs to documentation (as "leaving out such a perspective" would be the "mistake"—note I'm not ascribing intent, malicious or otherwise!). -- D. Ben Knoble

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans" <[email protected]> writes: > ... Then I ask them to give feedback on what's > confusing to them about the explanation or what questions they have. > > I do this because I find that often people who are extremely comfortable > with using the software (including me, which is why I usually spend so much > time collecting feedback like this!) can lose sight of what's confusing to an > "average user". Yes, you can lose your novice status and it is hard to take it back ;-) I agree with you that the next best thing you can do is to see how well folks who still have that status do. > And every time I'm part of a discussion about documentation for > an open source project it seems a bit strange to me for a group of people who > all already understand the concept to be discussing what would be clearest to an > "average user": surely the users themselves should be the judge of what's clear > to them! Yes, with one caveat, which is that you need to be careful to avoid throwing them into local optima. A simplified world view may make it look easier to swallow, but depending on the kind of white lies you throw at them, some of them they may have to unlearn to further understand the system.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"D. Ben Knoble" <[email protected]> writes: > As Junio likes to say, a mistake being old is no good reason to carry > it forward into the future (or replicate it). I say no such thing, though. What I say about past mistakes is that you shouldn't use them as an excuse to make similar ones in the future. I'd prefer to let a sleeping dog lie. But in the context of this discussion, I think what we carefully and honestly need to look at are not past mistakes. It is important to adjust to the new world we live in. In the early days of Git, people from older SCM systems did not grok the index very well, so our explanation of the concept of the index and adding content to it may have focused on teaching the difference between our system and the back-then-major SCM systems. Unless you had used BitKeeper, the "you can commit and your doing so would not bother anybody else" plus "you can rewrite your private history until you can pretend to be a super developer who came to the best solution with a single attempt" freedoms were something quite new, and we needed to educate folks on the way to think and work well in the distributed world. Earlier in one of my messages, I said "making a commit and switching to another commit is cheap", and that comment came out of habit, but that is only understood by folks who have used the older SCM systems we displaced. But with so many new users who haven't even touched anything other than Git, the above examples may well not be the best way to teach these things to this new crop of users.

gitgitgadget · 2025-08-12T20:54:50Z

On the Git mailing list, Chris Torek wrote (reply to this):

On Tue, Aug 12, 2025 at 1:35 PM Julia Evans via GitGitGadget <[email protected]> wrote: > +TERMINOLOGY NOTE > +---------------- > + > +Git uses the terms "staging area", "index" and "cache" interchangeably > +for historical reasons. Many commands have flags like `--staged`, > +`--index`, or `--cached`, and they all refer to the index. > + I think this is also a good idea. Unfortunately, `git apply` has two different meanings for `--index` vs `--cached` (I believe it's the *only* exception to the "means the same thing" rule...). Chris

On the Git mailing list, Junio C Hamano wrote (reply to this):

Chris Torek <[email protected]> writes: > On Tue, Aug 12, 2025 at 1:35 PM Julia Evans via GitGitGadget > <[email protected]> wrote: >> +TERMINOLOGY NOTE >> +---------------- >> + >> +Git uses the terms "staging area", "index" and "cache" interchangeably >> +for historical reasons. Many commands have flags like `--staged`, >> +`--index`, or `--cached`, and they all refer to the index. >> + > > I think this is also a good idea. Unfortunately, `git apply` has two > different meanings for `--index` vs `--cached` (I believe it's the > *only* exception to the "means the same thing" rule...). Yes, I think the first sentence is an excellent addition, even though I do not know if "git add" is the best place to teach it. However, it will be a disservice to users to say "they all refer to the index" here. Yes, it is technically correct that they all refer to the index, but that much any intelligent reader can infer after reading the first sentence, which says that historically these three words were used to refer to the same "index". And what I think is bad in that second sentence is that it implies they may mean the same thing without saying that. It is perfectly fine to say that these three words express some operation around the index (sometimes called the staging area). It also is fine to say that "--staged" is sometimes used as a synonym for `--cached`. But at least `--cached` and `--index` mean quite different things. As "git help cli" explains, an operation that affects only the index would use "--cached", and one that affects both the index and the working tree would use "--index". It may be that "apply" is currently the only exception (I did not check), but it certainly is not guaranteed to remain the only exception. If a command wants to work on both the contents in the index and in the working tree, such a command is very much welcome to use the option "--index" to trigger such a mode of operation. Conclusion? I would rather see the "Many commands have ..." sentence struck out. 
After all, that does not need to be taught to those who came here to learn about "git add". Thanks.
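The difference between `--cached` and `--index` described above can be observed with `git apply`. This is a minimal sketch in a throwaway repository; the patch file name `change.patch`, the identity settings, and the file contents are assumptions for illustration.

```shell
set -e
# Throwaway repository; names and contents are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'v1\n' > file.c
git add file.c
git commit -q -m "initial"

# Build a patch that turns v1 into v2, then restore the v1 contents.
printf 'v2\n' > file.c
git diff > change.patch
git checkout -- file.c

# --cached: the patch is applied to the index only.
git apply --cached change.patch
cached_index=$(git show :file.c)
cached_worktree=$(cat file.c)

# Reset both the index and the working tree to the committed state.
git reset -q --hard

# --index: the patch is applied to the index and the working tree.
git apply --index change.patch
index_index=$(git show :file.c)
index_worktree=$(cat file.c)
```

With `--cached` the working tree file still holds `v1` while the index holds `v2`; with `--index` both hold `v2`.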

On the Git mailing list, "Julia Evans" wrote (reply to this):

That sounds good to me, I'll remove the second sentence. On Tue, Aug 12, 2025, at 5:36 PM, Junio C Hamano wrote: > Chris Torek <[email protected]> writes: > >> On Tue, Aug 12, 2025 at 1:35 PM Julia Evans via GitGitGadget >> <[email protected]> wrote: >>> +TERMINOLOGY NOTE >>> +---------------- >>> + >>> +Git uses the terms "staging area", "index" and "cache" interchangeably >>> +for historical reasons. Many commands have flags like `--staged`, >>> +`--index`, or `--cached`, and they all refer to the index. >>> + >> >> I think this is also a good idea. Unfortunately, `git apply` has two >> different meanings for `--index` vs `--cached` (I believe it's the >> *only* exception to the "means the same thing" rule...). > > Yes, I think the first sentence is an excellent addition, even > though I do not know if "git add" is the best place to teach it. > > However, it will be disservice to users to say "they all refer to > the index" here. Yes, it is technically correct that they all refer > to the index, but that much any intelligent readers can infer after > reading the first sentance that historically these three words were > used to refer to the same "index". And what I think is bad in that > second sentence is that it implies they may mean the same thing > without saying that. It is perfectly fine to say that these three > words express some operation around the index (sometimes called the > staging area). It also is fine to say that "--staged" is sometimes > used as synonym for `--cached`. > > But at least `--cached` and `--index` mean quite different things. > > As "git help cli" explains, an operation that can affect only the > index would use "--cached" and both the index and the working tree > would use "--index". > > It may be that "apply" is currently the only exception (I did not > check), but it certainly is not guaranteed to stay to be the only > exception. 
If a command wants to work on both the contents in the > index and in the working tree, such a command is very much welcomed > to use the option "--index" to trigger such a mode of operation. > > Conclusion? I would rather see "Many commands have ..." sentence > struck out. After all, that does not need to be taught to those who > came here to learn about "git add". > > Thanks.

gitgitgadget · 2025-08-14T22:52:18Z

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Julia Evans via GitGitGadget" <[email protected]> writes: > From: Julia Evans <[email protected]> > > I think the fact that git uses these three terms interchangeably is > extremely confusing and that it deserves to be noted. We tend to avoid saying "I think" in our proposed log messages, as we do not churn the code and documentation merely to match personal preferences. I do not necessarily think "git add --help" is an appropriate place to leave this note, by the way. We should start from teaching "git help glossary", which does not mention "staging area" at all, which is a sign that it is somewhat outdated. It does not use the verb 'to stage' even once, either. Here is my attempt to improve the situation by giving a definition of "staging area" in the glossary. Luckily, "cache" already has its own entry, describing it as an old synonym to the 'index', so I didn't have to do anything there. Also the description of 'index' has a bit too much implementation detail, which I toned down. --- Subject: glossary: talk about "staging area" Surprisingly, "git help glossary" does not mention the 'staging area' synonym for the index, or the verb 'to stage'. As "git status" output uses the latter (i.e. "Changes not staged for commit"), we should not leave it undefined what the verb means. Rewrite the definition of the `index` somewhat to reduce the level of implementation detail exposed, and focus more on the fact that it is a mapping from pathnames to the contents at these paths. And mention the `staging area` there, as well as giving its own glossary entry. 
Signed-off-by: Junio C Hamano <[email protected]>
---
 Documentation/glossary-content.adoc | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git c/Documentation/glossary-content.adoc w/Documentation/glossary-content.adoc
index e423e4765b..10f0c21e88 100644
--- c/Documentation/glossary-content.adoc
+++ w/Documentation/glossary-content.adoc
@@ -247,11 +247,15 @@ for a more flexible and robust system to do the same thing.
     of Git you had to make them executable.
 
 [[def_index]]index::
-    A collection of files with stat information, whose contents are stored
-    as objects. The index is a stored version of your
-    <<def_working_tree,working tree>>. Truth be told, it can also contain a second, and even
-    a third version of a working tree, which are used
-    when <<def_merge,merging>>.
+    The index stores the mapping from filenames to their contents
+    to prepare the contents of the next commit by updating the
+    object recorded for each path (for this reason, people often
+    say that the index is "like the staging area" when explaining
+    the concept), together with other information to detect which
+    working tree files are modified efficiently.
+    During a conflicted <<def_merge,merge>>, the index can have
+    multiple versions of contents at higher stages for the same
+    path.
 
 [[def_index_entry]]index entry::
     The information regarding a particular file, stored in the
@@ -650,6 +654,12 @@ the `refs/tags/` hierarchy is used to represent local tags..
     is created by giving the `--depth` option to linkgit:git-clone[1], and
     its history can be later deepened with linkgit:git-fetch[1].
 
+[[def_stage]]staging area::
+    A synonym for <<def_index,index>>. Adding contents to the
+    index to update the mapping from the filename to its contents
+    is often called "to stage" (verb), as people explain the index
+    is like a staging area to prepare for the next commit.
+
 [[def_stash]]stash entry::
     An <<def_object,object>> used to temporarily store the contents of a
     <<def_dirty,dirty>> working directory and the index for future reuse.
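The "mapping from filenames to their contents" wording in the proposed glossary entry can be inspected directly. The sketch below is only an illustration in a throwaway repository: it uses `git ls-files -s` to print the index entry for a path and `git hash-object` to compute the object name of the working tree contents. The file name and contents are made up.

```shell
set -e
# Throwaway repository; names and contents are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q

printf 'v1\n' > file.c
git add file.c

# Each index entry: <mode> <object name> <stage> <path>
git ls-files -s

staged_id=$(git ls-files -s file.c | awk '{print $2}')
blob_id=$(git hash-object file.c)     # object name of the current contents

# Editing the file without running "git add" leaves the index entry alone.
printf 'v2\n' > file.c
edited_id=$(git hash-object file.c)
still_staged=$(git ls-files -s file.c | awk '{print $2}')

git cat-file -p "$staged_id"          # prints the staged contents
```

The object recorded for `file.c` matches the contents at the time of `git add` and does not follow later edits, which is the snapshot behaviour debated in this thread.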

@@ -3,7 +3,7 @@ git-add(1)
     NAME
     ----
-    git-add - Add file contents to the index
+    git-add - Add new or changed files to the index
     SYNOPSIS
     --------
@@ Expand All @@
     DESCRIPTION
     -----------
-    This command updates the index using the current content found in
-    the working tree, to prepare the content staged for the next commit.
-    It typically adds the current content of existing paths as a whole,
-    but with some options it can also be used to add content with
-    only part of the changes made to the working tree files applied, or
-    remove paths that do not exist in the working tree anymore.
-    The "index" holds a snapshot of the content of the working tree, and it
-    is this snapshot that is taken as the contents of the next commit.  Thus
-    after making any changes to the working tree, and before running
-    the commit command, you must use the `add` command to add any new or
-    modified files to the index.
-    This command can be performed multiple times before a commit.  It only
-    adds the content of the specified file(s) at the time the add command is
-    run; if you want subsequent changes included in the next commit, then
-    you must run `git add` again to add the new content to the index.
-    The `git status` command can be used to obtain a summary of which
-    files have changes that are staged for the next commit.
-    The `git add` command will not add ignored files by default.  If any
-    ignored files were explicitly specified on the command line, `git add`
-    will fail with a list of ignored files.  Ignored files reached by
-    directory recursion or filename globbing performed by Git (quote your
-    globs before the shell) will be silently ignored.  The `git add` command can
-    be used to add ignored files with the `-f` (force) option.
-    Please see linkgit:git-commit[1] for alternative ways to add content to a
-    commit.
+    Add new or changed files to the index to prepare for a commit. The
+    "index" (also known as "staging area") is where Git stores the changes
+    that will be in the next commit.
+    By default, `git commit` only commits changes that you've added to the
+    index. For example, if you've edited `file.c` and want to commit your
+    changes, you can run:
+       git add file.c
+       git commit
+    You can also add only part of your changes to a file with `git add -p`.
+    Please see linkgit:git-commit[1] for alternative ways to add content to
+    a commit.
+    The `git add` command only adds the changes at the time that you run it.
+    If you edit `file.c` after adding it, you need to run `git add file.c`
+    again before committing.
+    If you want to check which changes have been added, you can run
+    `git status` to print out a summary of the changes that will be committed
+    or run `git diff --staged` to see the full diff.
+    `git add` will not add ignored files by default. You can use the
+    `--force` option to add ignored files. If you explicitly specify the
+    exact filename of an ignored file (e.g. `git add ignored.txt`), `git
+    add` will fail with a list of ignored files. Otherwise it will silently
+    ignore the file.
+    [NOTE]
+    Git uses the terms "staging area", "index" and "cache" interchangeably
+    for historical reasons.
     OPTIONS
     -------
@@ -451,6 +452,7 @@ linkgit:git-rm[1]
     linkgit:git-reset[1]
     linkgit:git-mv[1]
     linkgit:git-commit[1]
+    linkgit:git-diff[1]
     linkgit:git-update-index[1]
     GIT
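The ignored-file behaviour described in the patch above (an explicit path fails, a path reached by recursion is skipped silently, and `-f` overrides both) can be checked with a short experiment. This is a sketch in a throwaway repository; the file names `ignored.txt` and `tracked.txt` are placeholders.

```shell
set -e
# Throwaway repository; names and contents are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
printf 'ignored.txt\n' > .gitignore
printf 'secret\n' > ignored.txt
printf 'hello\n' > tracked.txt

# Naming the ignored file explicitly makes "git add" fail.
if git add ignored.txt 2>/dev/null; then explicit=ok; else explicit=failed; fi

# Reached through directory recursion, the ignored file is skipped quietly.
git add .
after_dot=$(git ls-files)

# -f (--force) adds the ignored file anyway.
git add -f ignored.txt
after_force=$(git ls-files)

echo "$explicit"
echo "$after_force"
```

Dropping the `2>/dev/null` redirection shows the message listing the ignored paths.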