The number of Git users keeps growing, both for personal use and in business. Having observed a few hundred trainees during Git training sessions, I would say that about 80% of them report having already used Git for a few months. And of those 80%, fewer than 1% know how Git works internally.
Fine for the numbers, but why get your hands into the plumbing? “I have my list of commands that I use every day, and that’s enough for me.” Except that when things go wrong, it is impossible to understand the why and the how.
Hence the idea of getting our hands into the internals of the tool.
Git: everything is an object … well, almost
The organization of Git’s metadata is largely inspired by a file system. Everything relies on objects to store data and metadata. Before going further, keep in mind that these objects share some characteristics:
- each object is identified by a checksum that guarantees its uniqueness and integrity, in this case a SHA-1
- every object is immutable: once created, an object is never modified or deleted (well, almost never)
- all of these objects are stored in the local repository, in the .git/objects directory
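To make the checksum less abstract, here is a minimal sketch of how an object's identifier can be recomputed by hand: it is the SHA-1 of a small header ("type size" followed by a NUL byte) plus the content, which is the same scheme `git hash-object` uses. The helper name `git_object_id` is invented for this example, and the sketch assumes `sha1sum` from GNU coreutils is available.

```shell
# git_object_id is a hypothetical helper, not a Git command.
# It hashes "<type> <size>\0<content>" with SHA-1; sha1sum is assumed to exist.
git_object_id() {
    kind=$1                                  # blob, tree or commit
    tmp=$(mktemp)
    cat > "$tmp"                             # the content arrives on stdin
    size=$(wc -c < "$tmp" | tr -d ' ')       # size in bytes, without padding
    { printf '%s %s\0' "$kind" "$size"; cat "$tmp"; } | sha1sum | cut -d' ' -f1
    rm -f "$tmp"
}

id=$(printf 'test content\n' | git_object_id blob)
echo "$id"    # d670460b4b4aece5915caf5c68d12f560a9fe3e4

# A loose object is stored under a directory named after the first two hex digits.
echo ".git/objects/$(echo "$id" | cut -c1-2)/$(echo "$id" | cut -c3-)"
```

In a real repository, `echo 'test content' | git hash-object --stdin` should print the same identifier. Note that the file on disk is compressed, so the checksum is computed on the uncompressed header and content, not on the stored file.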
The second object is the tree. It describes the project's directory structure and, to continue our comparison with a file system, it is the equivalent of a directory. A tree lists the files and subdirectories it contains: their names and the SHA-1 of the corresponding objects. Each entry also records a file mode (permissions); timestamps, on the other hand, are not stored in trees.
- the commit message, which you carefully polished, of course
- the identity of the committer and of the author: this distinction makes it possible to trace the exact origin of the code, even when the author does not have commit rights
- a pointer to a tree: the commit points to a specific tree representing the state of the project at a given time
- a parent: a pointer to the commit that immediately precedes it. This is what makes it possible to build the commit history.
Simple, right? Let's put it all together: a commit points to a tree, which itself points to other trees and to blobs.
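To see that a commit really is just a small piece of text pointing to a tree and a parent, here is a sketch that writes one by hand and computes its identifier. Everything in it is made up for illustration: the names, e-mail addresses, timestamps, message and parent ID are hypothetical, and the tree ID is the well-known identifier of the empty tree. It assumes `sha1sum` is available.

```shell
# All identities, timestamps and the parent ID below are invented for this sketch.
tree_id=4b825dc642cb6eb9a060e54bf8d69288fbee4904      # ID of the empty tree
parent_id=0123456789abcdef0123456789abcdef01234567    # placeholder, not a real commit

commit_file=$(mktemp)
cat > "$commit_file" <<EOF
tree $tree_id
parent $parent_id
author Alice Author <alice@example.com> 1700000000 +0100
committer Carol Committer <carol@example.com> 1700000600 +0100

Fix a typo in the training slides
EOF

# Same scheme as for any object: SHA-1 over "commit <size>\0<content>".
size=$(wc -c < "$commit_file" | tr -d ' ')
commit_id=$({ printf 'commit %s\0' "$size"; cat "$commit_file"; } | sha1sum | cut -d' ' -f1)

first_line=$(head -n 1 "$commit_file")
cat "$commit_file"
echo "commit id: $commit_id"
rm -f "$commit_file"
```

In a real repository, `git cat-file -p <commit>` prints a commit in this same textual form. Since the parent's identifier is part of the hashed content, changing anything in an ancestor changes the identifiers of every commit after it.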
Git is also references
OK, we have commits, but how do we organize them? Git is based on branches. But what is a branch, in the end? It’s very simple: a branch references a specific commit, namely the last one made on that branch. This commit is called the head of the branch. All of this lives in the .git/refs/heads directory: one file per branch, each containing the SHA-1 of that branch's head commit. Let’s take an example: my project has 2 branches.
# git branch
  dev
* master
# ls .git/refs/heads
dev  master
# cat .git/refs/heads/master
3d7bb81994a5eaaf9eca5f12f335a8c0491eca56
# git log --oneline master
3d7bb81 images utilisées pour les supports git
bf98d90 Utilisation du nouveau template pour la formation git avancé
10d16d7 Utilisation du nouveau template de slides pour git en pratique
aba2d6a Merge branch 'feature1' into 'master'
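The "one file per branch" idea can be reproduced without Git at all. The sketch below builds a throwaway directory that mimics the layout described above and reads the branch tips back. The helper `branch_tip` and the SHA-1 used for dev are invented for this example; only the master value comes from the listing above.

```shell
# Throwaway directory mimicking .git/refs/heads; nothing here touches a real repository.
repo=$(mktemp -d)
mkdir -p "$repo/.git/refs/heads"
echo 3d7bb81994a5eaaf9eca5f12f335a8c0491eca56 > "$repo/.git/refs/heads/master"
echo 1111111111111111111111111111111111111111 > "$repo/.git/refs/heads/dev"   # made-up ID

# branch_tip is a hypothetical helper: $1 = repository root, $2 = branch name.
branch_tip() {
    cat "$1/.git/refs/heads/$2"
}

# A branch is nothing more than a name attached to one commit ID.
for ref in "$repo"/.git/refs/heads/*; do
    printf '%s -> %s\n' "$(basename "$ref")" "$(cat "$ref")"
done

master_tip=$(branch_tip "$repo" master)
dev_tip=$(branch_tip "$repo" dev)
rm -rf "$repo"
```

Reading the files directly is fine for understanding, but in a real repository some references may live in the .git/packed-refs file instead of individual files, so `git rev-parse master` is the safer way to get the same answer.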
# cat .git/HEAD