
Introduction to Git

What is Git?

Git is a free, open-source distributed version control system designed to track changes in computer files and coordinate work on those files among multiple people with speed and efficiency. It is primarily used for source code management in software development, but it can track changes in any set of files. Git is currently one of the most widely used VCSs and has become the de facto standard for versioning, adopted by open-source projects and corporations alike.

Git was originally designed and developed by Linus Torvalds for the development of the Linux kernel. With a tiny footprint and lightning-fast performance, Git is designed to fit any project, regardless of its size and nature. It outclasses many SCM tools with features such as cheap local branching, convenient staging areas and multiple workflows.

What is a Version Control System?

A version control system is a software tool that helps developers work in tandem without overwriting each other's changes, while keeping a complete history of their work. If an error occurs, developers can turn the clock back and compare earlier versions of the code to help resolve the problem while minimizing disruption to all team members.

Using a Version Control System allows you to revert selected files back to a previous state, revert the entire project back to a previous state, compare changes over time, see who last modified something that might be causing a problem, who introduced an issue and when, and more. Using a VCS also generally means that if you screw things up or lose files, you can easily recover. In addition, you get all this for very little overhead.

VCSs are also known as SCM (Source Code Management) tools or RCSs (Revision Control Systems).

What’s a distributed version control system?

A distributed version control system (DVCS) is a type of version control where the complete codebase, including its full version history, is mirrored on every developer's computer. Git is an example of a DVCS commonly used for open source and commercial software development.

DVCSs allow full access to every file, branch, and iteration of a project, and give every user a full and self-contained history of all changes. Unlike once-popular centralized version control systems, DVCSs like Git don’t need a constant connection to a central repository. Developers can work anywhere and collaborate asynchronously from any time zone.

Why is a Version Control System needed?

Large development teams are very common today, and in most cases they are distributed across different geographic locations. In the past, when development happened in a single room, it was easy to communicate and coordinate. When thousands of developers work together, it is common for two people to work on the same file. In mission-critical projects, it is also unacceptable for code with a security vulnerability or an error to be pushed to the main codebase. A version control system not only helps overcome these drawbacks of larger teams, it also helps keep track of development.

Without version control, team members are subject to redundant tasks, slower timelines, and multiple copies of a single project. To eliminate unnecessary work, Git and other VCSs give each contributor a unified and consistent view of a project, surfacing work that’s already in progress. Seeing a transparent history of changes, who made them, and how they contribute to the development of a project helps team members stay aligned while working independently.

Why Git?

Git is the most widely used version control system in the world. The best indication of Git’s market dominance is a survey of developers by Stack Overflow. The survey found that 88.4% of 74,298 respondents in 2018 used Git (up from 69.3% in 2015). In fact, Git has become so dominant that the data scientists at Stack Overflow didn’t bother to ask the question in their 2019 and later surveys.

There are many version control systems out there, but Git has some major advantages.

  • Open-Source: Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. It is released under the GNU General Public License (GPL) version 2 and is available for free.

  • Distributed System: One of Git's great features is that it is distributed. Distributed means that each machine working on the project holds a "clone" of the entire repository. Instead of having just one central repository that you send changes to, every user has their own repository that contains the entire commit history of the project. Committing does not require a connection to the remote repository; the change is stored in the local repository first. When necessary, we can push these changes to a remote repository.

  • Compatibility: Git runs on all major operating systems in use today. Git can also interoperate with repositories of other version control systems such as SVN and CVS. For example, the git svn command lets Git work directly against a remote Subversion repository. Users who did not start out with Git can therefore switch to it without manually copying their files out of the other VCS's repositories.

  • Speed: Because cloning stores all of a project's data in the local repository, most operations read from the local disk instead of fetching from a remote repository, which is much faster. Git is fast and scalable compared to many other version control systems, so it handles large projects efficiently.

  • Secure: Git is designed to protect the integrity of its data. It uses SHA-1 (Secure Hash Algorithm 1) to name and identify objects within its repository. Files and commits are checksummed and verified by that checksum when they are retrieved at checkout. Git stores its history in such a way that the ID of a particular commit depends on the complete development history leading up to that commit. Once history is published, old versions cannot be changed without the change being noticed.

  • Non-linear Development: Git supports rapid branching and merging and includes specific tools for visualizing and navigating a non-linear development history. A branch in Git is simply a reference to a single commit. The full branch structure can be reconstructed by following the parent commits from there.

  • Branching: Git allows its users to work on a line of development that runs parallel to the main project files. These lines are called branches. Branches in Git make it possible to change the project without affecting the original version. By convention, the master branch contains production-quality code. Any new feature can be developed and tested on a branch and later merged into the master branch.

  • Lightweight: Git uses lossless compression when storing data in the local repository, which saves a lot of disk space. Whenever the data is needed, Git decompresses it to recover the exact original content.

  • Reliable: Every clone of the central repository is a full copy, and each pull brings a collaborator's local copy up to date, so the central repository's data is effectively backed up on every collaborator's machine. If the central server crashes, the data is not lost: once the server is repaired, the repository can be restored from any collaborator's local copy. It is very unlikely that no developer has the data, because whoever worked on the project most recently will usually have the latest version locally. The same holds on the client side. If a developer loses their data because of a technical fault or any other unforeseen reason, they can pull from the central repository to get the latest version on their local machine again. Pushing work to the central repository regularly is therefore what makes Git reliable to work with.
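
The Secure and Lightweight points above can be illustrated with a small Python sketch. It is not Git's implementation, only an approximation of how Git names an object (SHA-1 over a short header plus the content) and how loose objects are stored with zlib compression. The helper names (`encode_object`, `object_id`, `simplified_commit`) and the sample content are made up for this example, and the commit body is deliberately simplified: real commits also carry author and committer lines, so these commit IDs will not match IDs produced by Git itself.

```python
import hashlib
import zlib
from typing import Optional


def encode_object(obj_type: str, body: bytes) -> bytes:
    # Git prefixes every object with a header: "<type> <size>" plus a NUL byte.
    return f"{obj_type} {len(body)}".encode() + b"\0" + body


def object_id(obj_type: str, body: bytes) -> str:
    # The object's name is the SHA-1 digest of header + body.
    return hashlib.sha1(encode_object(obj_type, body)).hexdigest()


def simplified_commit(tree_id: str, parent_id: Optional[str], message: str) -> bytes:
    # Simplification (assumption): author/committer lines are omitted here.
    lines = [f"tree {tree_id}"]
    if parent_id is not None:
        lines.append(f"parent {parent_id}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


# Hypothetical file content, repeated so the compression effect is visible.
content = b"hello world\n" * 200

# Secure: the ID is derived from the content, so any change yields a new ID.
print(object_id("blob", content))
print(object_id("blob", content + b"!"))

# Lightweight: zlib compression is lossless and shrinks repetitive data.
stored = zlib.compress(encode_object("blob", content))
assert zlib.decompress(stored) == encode_object("blob", content)
print(len(content), "bytes raw,", len(stored), "bytes stored")

# A commit's ID covers its parent's ID, so it depends on the whole history.
empty_tree = object_id("tree", b"")
root_a = object_id("commit", simplified_commit(empty_tree, None, "initial"))
root_b = object_id("commit", simplified_commit(empty_tree, None, "initial (edited)"))
child_a = object_id("commit", simplified_commit(empty_tree, root_a, "second"))
child_b = object_id("commit", simplified_commit(empty_tree, root_b, "second"))
print(child_a != child_b)  # same child content, different history
```

Because each commit records its parent's ID, rewriting an old commit changes the ID of every commit that follows it, which is why changes to published history do not go unnoticed.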