Inside Git: Objects & Hashing

Git is fast and reliable because of how it stores data. To master Git, it helps to understand the .git directory. This hidden folder holds everything Git needs to track your project: the object database, references such as branches and tags, and the repository configuration.

Git uses four main object types to manage your work:

  • Blobs: These store file contents.
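  • Trees: These store directory listings: the names and modes of files and subfolders, each mapped to a blob or another tree.
  • Commits: These point to a top-level tree (the snapshot) and record metadata such as the author, timestamp, message, and parent commits.
  • Tags: Annotated tags store a named reference to another object, usually a commit, plus a tagger and a message. Lightweight tags are plain references and create no object.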

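Git identifies every object by its hash. By default it uses the SHA-1 algorithm to turn an object's type, size, and content into a 40-character hexadecimal ID. This ID acts as a fingerprint. If you change one character in a file, the hash changes completely, and identical content always produces the same ID.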
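
To make the fingerprint idea concrete, here is a minimal Python sketch of how a blob ID can be computed in a default SHA-1 repository. The function name git_blob_id is made up for this example; Git itself is written in C and does not work through this code.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 ID a default (SHA-1) Git repo gives a blob."""
    # Git hashes a small header ("blob <size>" + NUL byte) followed by
    # the raw file bytes, not the file bytes on their own.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = git_blob_id(b"Hello Git\n")
edited = git_blob_id(b"Hello git\n")  # one character changed

print(original)
print(edited)
print("same ID?", original == edited)
```

If your shell writes a plain trailing newline, the first ID printed should match what git hash-object reports for a file containing the line Hello Git.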

How Git builds a snapshot:

When you commit a file, Git creates a chain of objects.

  1. The Blob stores the actual text or data in your file.
  2. The Tree maps that blob to a filename and file mode. Folders are represented by nested trees.
  3. The Commit links the tree to an author, a timestamp, and a message.
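
The three steps above can be sketched in Python. This is an illustration under assumptions, not Git's own code: the helper object_id, the author identity, and the timestamp are all invented for the example, and it assumes a SHA-1 repository with a single file at the top level.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash an object the way a SHA-1 Git repo does: header + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# 1. The blob holds only the file's bytes. It has no filename.
blob_body = b"Hello Git\n"
blob_id = object_id("blob", blob_body)

# 2. The tree holds one entry per name: "<mode> <name>" + NUL + the
#    20 raw bytes of the entry's ID. 100644 is a regular file.
tree_body = b"100644 hello.txt\0" + bytes.fromhex(blob_id)
tree_id = object_id("tree", tree_body)

# 3. The commit is plain text that names the tree and adds metadata.
#    The identity and timestamp below are placeholder values.
commit_body = (
    f"tree {tree_id}\n"
    "author Example Author <author@example.com> 1700000000 +0000\n"
    "committer Example Author <author@example.com> 1700000000 +0000\n"
    "\n"
    "Initial commit\n"
).encode("utf-8")
commit_id = object_id("commit", commit_body)

print("blob  ", blob_id)
print("tree  ", tree_id)
print("commit", commit_id)
```

Each ID feeds into the next object. Change the file and the blob ID changes, which changes the tree ID, which changes the commit ID.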

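Conceptually, Git does not store changes or "deltas." It stores snapshots. Every time you commit, Git creates new objects only for content that changed and reuses the existing objects for everything else. (Packfiles do apply delta compression on disk, but that is a storage optimization and does not change the object model.) Because each object's ID is derived from its content, a stored object cannot change without its ID changing too. This makes your history tamper-evident and easy to recover.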
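
Here is a small Python sketch of why snapshots stay cheap. ObjectStore and snapshot are hypothetical names for a toy in-memory model. It assumes SHA-1 IDs and a flat folder of regular files, and it sorts names with Python's sorted(), which only approximates Git's tree ordering.

```python
import hashlib
from typing import Dict


class ObjectStore:
    """A toy in-memory stand-in for .git/objects (not Git's code)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, kind: str, body: bytes) -> str:
        data = f"{kind} {len(body)}".encode("ascii") + b"\0" + body
        oid = hashlib.sha1(data).hexdigest()
        # Same content means same ID, so nothing new is stored.
        self.objects.setdefault(oid, data)
        return oid


def snapshot(store: ObjectStore, files: Dict[str, bytes]) -> str:
    """Store every file as a blob and return the ID of the tree."""
    entries = b""
    for name in sorted(files):
        blob_id = store.put("blob", files[name])
        entries += b"100644 " + name.encode("utf-8") + b"\0" + bytes.fromhex(blob_id)
    return store.put("tree", entries)


store = ObjectStore()

first = snapshot(store, {"hello.txt": b"Hello Git\n", "notes.txt": b"draft\n"})
count_after_first = len(store.objects)   # 2 blobs + 1 tree

# Only notes.txt changes in the second snapshot.
second = snapshot(store, {"hello.txt": b"Hello Git\n", "notes.txt": b"final\n"})
count_after_second = len(store.objects)  # 1 new blob + 1 new tree

print(count_after_first, count_after_second)
```

The second snapshot adds two objects, not three. The unchanged hello.txt blob is reused by both trees.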

Understanding this structure helps you fix errors, recover lost data, and manage complex branches.

Try this mini project to see it in action:

  1. Initialize a repo:
       mkdir git-lab
       cd git-lab
       git init

  2. Create a file and find its hash:
       echo 'Hello Git' > hello.txt
       git hash-object hello.txt

  3. Store the file as a blob:
       git hash-object -w hello.txt

  4. Check the object type and content:
       git cat-file -t [YOUR_HASH]
       git cat-file -p [YOUR_HASH]

  5. Create a commit and see the tree:
       git add hello.txt
       git commit -m 'Initial commit'
       git ls-tree HEAD
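
If you want to see what steps 3 and 4 do on disk, this Python sketch writes and reads a "loose" object by hand. The function names are invented for the example. It assumes a SHA-1 repository and objects that have not yet been moved into a packfile, and it works in a temporary folder so it never touches a real repo.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Tuple


def write_loose_object(git_dir: str, kind: str, body: bytes) -> str:
    """Roughly what `git hash-object -w` does for a loose object."""
    data = f"{kind} {len(body)}".encode("ascii") + b"\0" + body
    oid = hashlib.sha1(data).hexdigest()
    # The first 2 hex characters name the folder, the other 38 the file.
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(data))
    return oid


def read_loose_object(git_dir: str, oid: str) -> Tuple[str, bytes]:
    """Roughly what `git cat-file -t` and `-p` show for a blob."""
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    with open(path, "rb") as f:
        data = zlib.decompress(f.read())
    header, _, body = data.partition(b"\0")
    kind, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return kind, body


with tempfile.TemporaryDirectory() as fake_git_dir:
    oid = write_loose_object(fake_git_dir, "blob", b"Hello Git\n")
    kind, body = read_loose_object(fake_git_dir, oid)
    print(oid)
    print(kind)
    print(body.decode("utf-8"), end="")
```

To try read_loose_object on the git-lab repo, pass the path to its .git folder and the hash from step 2.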

By looking under the hood, you see that Git is more than just a version control tool. It is a highly organized database of content-addressed objects.

Source: https://dev.to/lotanna_obianefo/inside-git-objects-hashing-44gc
