Oops, I’m behind on my articles. This time, let’s go back to the fundamentals of the commit workflow with git. Here we tackle my hobby horse, my fixation, my obsession … good commit practices! They often come across as a chore, something we like to dismiss just for fun or out of personal whim … And yet they are fundamental. They are a prerequisite for using git efficiently in general, and for improving your version management in particular.

But by the way, what is the use of a version manager?

You want to master git. Great! But you still have to know why. Why put a given code base under version management at all? One of the first reasons often mentioned is to back up the code. No! That is by no means the role of a version manager, just a side effect.

At the center of its objectives: history! Yes, a version manager’s main objective is to generate and manage a code history. And there are lots of good reasons for that: consulting this history when you start working on a project to understand its logic and the way it’s organized, being able to work with others without stepping on each other’s toes, going back or cancelling a modification when you realize that a colleague’s change has just blown everything up … And often the history, meaning the list of commit messages, is the only documentation available and usable. Hence its importance.

In short, there are plenty of good reasons to pamper this history. But what does that mean on a daily basis?

Source: xkcd

Your commit messages you will treat

Atomic the content of your commits will be

The precise content of your commits you will think about

Nice, isn’t it? Yes, but what does it mean in practice? When it comes to the content of commits, we often speak of atomic commits: the content of a commit must be deliberate and focused. Yet when you work on your code, you usually end up making completely unrelated modifications, which should not land in a single commit. Hence a history built from successive, atomic commits.

Unfortunately there is no magic formula to define the atomicity of a commit. You can, however, ask yourself the following question: “My commit X has just broken my code. The easiest fix is to revert it. Will I end up with something OK?” If that famous commit contains not only the faulty modification but also a bug fix, some new documentation … in short, it’s chaos: everything disappears with it.
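
To make that question concrete, here is a minimal sketch in a throwaway repository. The file names, contents and commit messages are made up for the illustration. The bug fix and the faulty change live in two separate commits, so reverting the faulty one leaves the fix untouched.

```shell
#!/bin/sh
# Hypothetical demo: everything happens in a temporary repository.
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'echo hello\n' > script.sh
git add script.sh
git commit -q -m "Add script"

# Commit 1: a bug fix, on its own.
printf 'echo "hello"\n' > script.sh
git add script.sh
git commit -q -m "Quote the greeting"

# Commit 2: the change that turns out to be faulty, on its own.
printf 'exit 1\n' > broken.sh
git add broken.sh
git commit -q -m "Add broken helper"

# Cancel only the faulty commit: the bug fix survives.
git revert --no-edit HEAD > /dev/null
cat script.sh
```

Had both changes been squeezed into one commit, the same git revert would have thrown away the bug fix as well.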

To avoid this kind of disappointment, there is one essential thing to know: the staging area. Its purpose is to help you build those commits by setting aside the content of the next commit. After a few years of giving training courses, I can say that more than 80% of people had absolutely no idea this staging area existed, or at least did not use it.

Concretely, what is the staging area?

The staging area is a buffer zone between the modifications you make in what is called the working area (roughly, the files and directories that you handle daily in your project) and the commit, i.e. the recording of your change in the git repository. Let’s use an analogy proposed by jub0bs, which I had not thought of but which illustrates very well the reason for using this area.

Sempé

You are in a schoolyard and you have to take a class photo with thirty kids running around and screaming. Somehow you get them in order and all the kids are ready. You take the photo, but then you realize that two kids are missing. You quickly fetch them and take the final photo, which is then added to the school album.

Now imagine the following. You gather everyone for the photo, take it and put it in the album. Only then do you realize that two children are missing from the photo. You will have to bring the class together again, line them up and pray that in the meantime there is no gastroenteritis epidemic … Ah, and I forgot: the wrong photo has been pasted in the album with super glue. In short, a real pain :).

Well, it’s exactly the same for your commit. You collect changes that you think belong together and add them to the staging area. If you forgot a file or want to remove one of those changes, you can do it very easily. Once you are satisfied with the result, you commit. If you do not use the staging area, the forgotten file and the modification that broke your code have to be dealt with by rewriting existing commits. The tools exist, but you have to know how to use them, and they can have quite a big impact on your history. This is even more true once you have pushed those commits to the remote server.

The staging area is therefore essential to be more efficient and generate a usable, readable and interpretable history.

How does the staging area work?

I’m assuming you’ve read the article about the basics of object-based git. Again, everything happens in your local repository, in the .git directory. The management of this area relies on an object, the blob, and on a file, index. To add content to the staging area, in other words to stage your changes, you use the git add command. This command does 2 things:

  • it creates the blobs corresponding to the new version of the files after modification
  • it updates the index file, which lists the tracked files and, for each of them, a pointer to its blob.
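
These two steps can be observed in a throwaway repository. This is only a sketch: the file name fic1 is borrowed from the example of this article and its content is made up. The SHA-1 stored in the index is compared with the one git computes from the file content, then the object is inspected.

```shell
#!/bin/sh
# Hypothetical demo: everything happens in a temporary repository.
set -eu
cd "$(mktemp -d)"
git init -q

printf 'hello\n' > fic1
git add fic1

# SHA-1 recorded in the index for fic1 (second field of the --stage output).
index_sha=$(git ls-files --stage fic1 | awk '{print $2}')
# SHA-1 that git computes from the current content of the file.
file_sha=$(git hash-object fic1)

echo "index: $index_sha"
echo "file:  $file_sha"
git cat-file -t "$index_sha"   # type of the object created by git add
git cat-file -p "$index_sha"   # content of that object
```

Both identifiers are the same: git add stored the content as a blob and the index simply points to it.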

Because an example is better than a long speech, I modified the file fic1 of my project and I created a new file fic2.

[ennael@zarafa lab (master)]$ ls
Changelog fic1 fic2 README
[ennael@zarafa lab (master)]$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   fic1

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        fic2

I then add both modifications to the staging area. As a reminder, a modification can be the addition of a new file, or the modification or deletion of an existing one.

[ennael@zarafa lab (master)]$ git add fic1 fic2

What really happened? Let’s check the contents of the .git/index file using the command below:

[ennael@zarafa lab (master)]$ git ls-files --stage
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       README
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       fic1

For each entry in the index you get the following information: permissions, SHA-1 of the blob containing the staged version of the file, a stage number (0 outside of a merge conflict) and the filename. You then just have to commit: git builds a new tree from the index, i.e. the tree of HEAD updated with what you put in the staging area.
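
Here is a small sketch of that last step, again in a throwaway repository with made-up contents. Only fic1 is staged, so only fic1 ends up in the tree of the commit, and the tree git would write from the index right after the commit is the very one HEAD points to.

```shell
#!/bin/sh
# Hypothetical demo: everything happens in a temporary repository.
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'one\n' > fic1
printf 'two\n' > fic2
git add fic1            # fic2 is deliberately left out of the staging area
git commit -q -m "Add fic1"

# The tree recorded by the commit lists exactly what was staged.
committed=$(git ls-tree --name-only HEAD)
echo "$committed"

# Tree built from the index now, compared with the tree of HEAD.
index_tree=$(git write-tree)
head_tree=$(git rev-parse 'HEAD^{tree}')
echo "index tree: $index_tree"
echo "HEAD tree:  $head_tree"
```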

I have modifications in the staging area and I want to check the content

At any time you can check the content of your staging area. We can first list the files contained in this area, using the git status command:

[ennael@zarafa lab (master)]$ git status
On branch master
Changes to be committed:
 (use "git restore --staged <file>..." to unstage)
       modified:   fic1
       new file:   fic2

Changes not staged for commit:
 (use "git add <file>..." to update what will be committed)
 (use "git restore <file>..." to discard changes in working directory)
       modified:   Makefile
 
So the staging area holds 2 files, fic1 and fic2. I’d like to see the content of these modifications. Nothing could be easier with the git diff command and its --cached option:
[ennael@zarafa lab (master)]$ git diff --cached 
diff --git a/fic1 b/fic1 
index 76233cb..ca194e3 100644 
--- a/fic1 
+++ b/fic1 
@@ -1,4 +1,4 @@ 
 #!/bin/bash 
 liste=`ls $1` 
-echo $liste 
+echo "la liste des fichiers de $1 est $liste"
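
The difference between git diff and git diff --cached is worth a quick sketch. In the throwaway repository below (file contents are made up), one file is modified and staged while another is modified and left in the working area: git diff --cached compares the staging area with HEAD, and a plain git diff compares the working area with the staging area.

```shell
#!/bin/sh
# Hypothetical demo: everything happens in a temporary repository.
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'a\n' > fic1
printf 'b\n' > Makefile
git add fic1 Makefile
git commit -q -m "Initial commit"

printf 'a2\n' > fic1       # will be staged
printf 'b2\n' > Makefile   # stays in the working area
git add fic1

staged=$(git diff --cached --name-only)   # staging area vs HEAD
unstaged=$(git diff --name-only)          # working area vs staging area
echo "staged:   $staged"
echo "unstaged: $unstaged"
```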

And what if I want to choose, within a file, which modifications to stage for the next commit?

By default, the git add command stages all the modifications in a given file. Yet the same file can contain modifications of very different natures: a new function, typo fixes, a code comment … If I want to be as efficient as possible, I need to be able to split these modifications across several commits.

Here is a Makefile containing a number of modifications:

[ennael@zarafa lab (master)]$ git status 
On branch master 
Changes not staged for commit: 
  (use "git add <file>..." to update what will be committed) 
  (use "git restore <file>..." to discard changes in working directory) 
        modified:   Makefile
Let’s do partial staging. The only thing to add to the staging command is the -p option. Let’s go:
[ennael@zarafa lab (master)]$ git add -p ./Makefile
diff --git a/Makefile b/Makefile
index 0ceee0c..2281e4b 100644
--- a/Makefile
+++ b/Makefile
@@ -1,10 +1,10 @@
# SPDX-License-Identifier: GPL-2.0+

-VERSION = 2018
-PATCHLEVEL = 11
-SUBLEVEL =
+VERSION = 2019
+PATCHLEVEL = 10
+SUBLEVEL = 1
EXTRAVERSION =
-NAME =
+NAME = test

# *DOCUMENTATION*
(1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,s,e,?]?

Everything is there! Here is the first block of modifications encountered in your file. As in many commands, the diff format is used here. Following this first block, a question is asked: “should this modification block be indexed?”. The rest is very simple: “y” to index it, “n” to leave it in the working area (it will therefore not be part of the next commit). Other possibilities are available, which I will let you discover in the man page.

The “s” (split) and “e” (edit) options are worth a look. They let you, respectively, split the block into smaller ones if it does not suit you, and edit the content of the block by hand.

Once you have answered the question, you are presented with the next block and so on until the end of the file. In the end, we get something like this:

[ennael@zarafa lab (master)]$  git status 
On branch master
Changes to be committed:
 (use "git restore --staged <file>..." to unstage)
       modified:   Makefile

Changes not staged for commit:
 (use "git add <file>..." to update what will be committed)
 (use "git restore <file>..." to discard changes in working directory)
       modified:   Makefile

With this partial staging tool, there is no excuse for botched commits, even when you have forgotten to commit for a while. Used systematically, it is also a very good way to review the content of your future commits before validating them.

Your staging area is flexible!

I added elements to the staging area and I realize that I made a mistake. Never mind. git add adds files to the staging area; git reset lets you remove them. The main goal of the git reset command is to rewrite your history. But if you do not give it a commit SHA-1 as an argument, you can use it to modify the content of the staging area. (As the git status output reminds you, git restore --staged <file> does the same job.) An example:

[ennael@zarafa lab (master)]$ git status  
On branch master 
Changes to be committed: 
  (use "git restore --staged <file>..." to unstage) 
        modified:   fic1 
[ennael@zarafa lab (master +)]$ git reset fic1 
Unstaged changes after reset: 
M       fic1 
[ennael@zarafa lab (master *)]$ git status 
On branch master 
Changes not staged for commit: 
  (use "git add <file>..." to update what will be committed) 
  (use "git restore <file>..." to discard changes in working directory) 
        modified:   fic1

The file fic1, which was previously added to the staging area, has been unstaged by the command and will therefore not be part of the next commit.

And since I could do partial staging, well, I can also do a partial reset to unstage only part of the modifications of a file already in the staging area:
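
One point deserves a sketch: unstaging does not touch your working files. In the throwaway repository below (contents are made up), fic1 is modified, staged, then unstaged with git reset. The staging area ends up empty while the file keeps its new content.

```shell
#!/bin/sh
# Hypothetical demo: everything happens in a temporary repository.
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'v1\n' > fic1
git add fic1
git commit -q -m "Add fic1"

printf 'v2\n' > fic1
git add fic1
git reset -q fic1          # unstage; the working file is left as it is

staged=$(git diff --cached --name-only)
echo "staged after reset: '$staged'"
cat fic1
```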

[ennael@zarafa lab (master +)]$ git reset -p fic1
diff --git a/fic1 b/fic1
index 76233cb..3f42501 100644
--- a/fic1
+++ b/fic1
@@ -1,3 +1,5 @@
#!/bin/bash

+echo $1
+
liste=`ls $1`

Be careful with the meaning of the question asked here: git reset -p asks whether to unstage the block (“Unstage this hunk?”), not whether to stage it. Answering “y” therefore removes the block from the staging area.

There you go: you now know all about the staging area and the good practices that go with it. Use it as often as possible! And by the way, do not forget to take care of your commit messages! (another fixation of mine)