Distributed Version Control (Git)

I’ve wanted to post about Git for a very long time; it’s an essential tool in any software developer’s toolkit.

For the uninitiated, Git is a tool for managing changes to the things people write, most often software source code, whether they work by themselves or in teams distributed across the planet.

Git seeks to solve a problem that every writer has had to deal with: change. As I write this post, I change my mind about the words I use, the direction the post is going, and the spelling of the words.

In the very VERY old days I would write one version and save it (maybe as versionone.txt), then write a new version (maybe called versiontwo.txt), and so on. The result was a directory somewhere on the drive holding multiple versions of the document.

The diff tool lets people see the difference between one version and another, so sharing source code changes amongst a team involved emailing diff files (patches). Each developer would apply the patch to their own copy of the software. As you might imagine, this led to varying results, because the copies being patched were not always identical.
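To make the idea of a diff file concrete, here is a minimal sketch using Python's standard difflib module. The two versions of the document are made up for illustration, and the file names simply reuse the ones from my example above; the helper name make_patch is also my own invention.

```python
import difflib

# Two hypothetical versions of the same document, as lists of lines.
version_one = ["Git is a tool.\n", "It manages changs.\n"]
version_two = ["Git is a tool.\n", "It manages changes.\n", "Teams use it daily.\n"]


def make_patch(old, new):
    """Return a unified diff, the kind of text that used to be emailed around."""
    return "".join(
        difflib.unified_diff(
            old, new, fromfile="versionone.txt", tofile="versiontwo.txt"
        )
    )


patch = make_patch(version_one, version_two)
print(patch)
```

Lines starting with - were removed and lines starting with + were added. A patch like this only applies cleanly when the recipient's copy matches the "from" side.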

Skipping some steps along the way, Git is now the dominant tool for managing and sharing changes.

Developers share a tree that they can branch from, and record their changes as a series of commits. These commits are pushed to a central repository, where they can be reviewed and then merged into the main tree. Other developers pull from that repository to synchronise their copy of the tree with the central repository’s.

We have not yet reached the end of the source code version control journey, though.

There are still unsolved problems, and I think they will remain unsolved for the foreseeable future because they are subjective.

For example: how big should a commit be? What needs to be conveyed in the commit message? What sort of change does the commit make? What is the scope of the change? And so on.

A number of conventions are forming that address these concerns in part, but they still require the members of a team to agree on how to apply them.

For example, Stephen Parish’s gist on AngularJS Git Commit Message Conventions provides a series of rules that developers should adhere to when presenting commits for the Angular project. Unfortunately, Brian Clements’s gist provides a (slightly) different set of rules; for example, chore exists as a commit type in one but not the other. That difference demonstrates both the version control problem and the subjectivity of the rules.
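A rough sketch of how such rules might be checked, and of how the same commit can pass one team's rules and fail another's. The header shape type(scope): subject follows the Angular-style conventions; the two type lists, the team names and the function name are hypothetical and are not copied from either gist.

```python
import re

# Header shape used by Angular-style conventions: type(scope): subject
# Treating the scope as optional is an assumption of this sketch.
HEADER = re.compile(r"^(?P<type>\w+)(\((?P<scope>[^)]+)\))?: (?P<subject>.+)$")

# Hypothetical type lists for two teams; they differ only in "chore".
TEAM_A_TYPES = {"feat", "fix", "docs", "style", "refactor", "test", "chore"}
TEAM_B_TYPES = {"feat", "fix", "docs", "style", "refactor", "test"}


def is_valid_header(header, allowed_types):
    """Check the first line of a commit message against one team's rules."""
    match = HEADER.match(header)
    return bool(match) and match.group("type") in allowed_types


header = "chore(build): bump version"
print(is_valid_header(header, TEAM_A_TYPES))
print(is_valid_header(header, TEAM_B_TYPES))
```

The same header is accepted under the first list and rejected under the second, which is exactly the kind of thing a developer has to relearn when changing projects.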

The subjectivity means that a developer moving from one project to another, or from one company to another, will always spend quite a bit of time (re)learning the rules for the project, which are rarely documented or enforced in a uniform manner.

Finally, in my projects, I have found that rebasing pull requests, rather than merging them, means that the commit history is properly captured in the main branch. The GitHub model has encouraged merge commits, though, so a lot of developers are unfamiliar with this style and fear rebase because of the potential for loss of work.
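The difference between the two styles can be sketched with a toy model of a commit graph. This is not how Git is implemented; the Commit class, the function names and the commit messages are all invented for illustration. It only shows that a merge adds a commit with two parents, while a rebase replays the branch commits as new commits on top of main.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()


def merge(main_tip, branch_tip):
    """A merge commit joins both lines of history by having two parents."""
    return Commit("Merge branch", (main_tip, branch_tip))


def rebase(branch_commits, main_tip):
    """Replay each branch commit, oldest first, on top of main as a new commit."""
    tip = main_tip
    for commit in branch_commits:
        tip = Commit(commit.message, (tip,))
    return tip


def first_parent_log(tip):
    """Follow only the first parent from the tip back to the root, newest first."""
    log = []
    while tip is not None:
        log.append(tip.message)
        tip = tip.parents[0] if tip.parents else None
    return log


root = Commit("initial")
main_tip = Commit("main: update docs", (root,))
feature_1 = Commit("feature: add parser", (root,))
feature_2 = Commit("feature: add tests", (feature_1,))

merged = merge(main_tip, feature_2)
rebased = rebase([feature_1, feature_2], main_tip)

print(first_parent_log(merged))
print(first_parent_log(rebased))
```

In this model the merged history reaches the feature commits only through the merge commit's second parent, whereas the rebased history is a single line with every commit on it. The rebased commits are new objects that replace the originals, which I suspect is where much of the fear of losing work comes from.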
