Coders like to code.
It can be easy to get into the habit of simply opening up an editor and banging out as much code as possible.
This is particularly true if you’re working on a personal project or you’re the sole developer.
It can be even more tempting if you’re a fast coder or have a boss who wants fixes and solutions right away.
But if you are slinging new code into production without a proper version control system, you’re not really doing software development; you’re doing “Cowboy Coding.”
Version control, also called revision control, versioning, or source control, is a method for tracking the revisions made to documents, code, or other files.
Version control systems (VCS), or version control software, can be standalone applications or can be built into document editing applications (like Word or Google Docs).
Version control software allows developers, editors, and other team members to view previous versions of files, as well as restore earlier versions.
Version control maintains a master copy of the code base. Many version control systems allow for several parallel copies of the entire code base to exist simultaneously.
Each software developer has their own copy of the code base: they can make revisions without influencing the master source code.
These revisions are brought in at an appropriate time and merged into the master source code.
How this merging happens depends on the version control system (VCS) in use.
Not convinced yet that you need a version control system?
Here are the reasons why version control is worth using:
Do you ever use the UNDO button (CTRL-Z) while working? Of course you do. It’s one of the most important features of modern computers.
What the UNDO button gives you is the freedom to make mistakes. This is one of the advantages you get from version control — in fact, it might be the most important advantage.
With version control, you can try something out — a new solution, a new feature, a bug fix.
If it doesn’t work you can simply revert your code to an earlier point or discard the proposed revisions.
Those revisions will not have been merged into the master source code. (It is kind of like saving points in a video game.)
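The save-point idea can be sketched as a toy model in Python. This is only an illustration, not how any real version control system stores data: the `SavePoints` class and its method names are hypothetical, and each snapshot is simply a full copy of the files.

```python
from copy import deepcopy


class SavePoints:
    """Toy model of 'save points': each snapshot is a full copy of the files.

    Hypothetical illustration only; real systems store history far more
    efficiently.
    """

    def __init__(self, files):
        self.files = dict(files)                  # working copy: name -> content
        self.snapshots = [deepcopy(self.files)]   # snapshot 0 is the starting state

    def save(self):
        """Record the current working copy and return its snapshot number."""
        self.snapshots.append(deepcopy(self.files))
        return len(self.snapshots) - 1

    def revert(self, number):
        """Discard current edits and restore the working copy from a snapshot."""
        self.files = deepcopy(self.snapshots[number])


project = SavePoints({"app.py": "print('hello')"})
project.files["app.py"] = "print('hello, world')"   # a change that works
good = project.save()
project.files["app.py"] = "prnt('oops')"            # an experiment that fails
project.revert(good)                                # back to the last good state
```

After the revert, the failed experiment is gone and the working copy matches the last saved state.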
This is helpful for two reasons:
Once you know you have a way to reverse mistakes, it becomes much easier to venture into unknown territory and take risks with novel solutions or untested ideas.
Have you ever worked on a project over a long period of time and then someone who uses it says, “Didn’t the exit button use to trigger a save warning before closing the application?”
If a system exists for a long enough period of time, it is inevitable that some features will be changed and removed.
Usually, there was some reason for having the feature in the first place (even with features that are eventually removed).
However, there was also a reason why a given feature was removed (even if the reason was that someone did so accidentally).
Later on, when someone shows up and asks about some feature that used to be there, you can try really hard to remember what happened.
Or, if you have version control, you can go look up past revisions and come back with definitive answers about:
This is particularly helpful if you have to:
This is closely related to version history, but it is more about developers and less about features.
Your paper trail is not (usually) a literal paper trail, but version control allows you to see things like:
This is helpful when trying to piece together why things are the way they are. You can assign credit or blame or just figure out who to ask about some specific feature or implementation.
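To make the paper trail concrete, here is a minimal sketch of the kind of question a commit log can answer. The records, names, and helper functions below are all made up for illustration; a real VCS keeps this information for you and offers its own commands for querying it.

```python
from datetime import date

# Hypothetical commit records: (date, author, note, files touched)
history = [
    (date(2020, 1, 10), "alice", "Add exit button", ["exit.py"]),
    (date(2020, 3, 2), "bob", "Add save warning on exit", ["exit.py", "dialog.py"]),
    (date(2020, 6, 15), "carol", "Remove save warning", ["exit.py"]),
]


def changes_to(path, log):
    """Return every recorded change that touched `path`, oldest first."""
    return [(when, who, note) for when, who, note, files in log if path in files]


def last_author(path, log):
    """Return who most recently changed `path`, or None if nobody did."""
    touched = changes_to(path, log)
    return touched[-1][1] if touched else None


exit_history = changes_to("exit.py", history)
person_to_ask = last_author("exit.py", history)
```

With a log like this, the question about the missing save warning has a definitive answer: you can see who removed it, when, and what note they left.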
Usually, version controlled repositories are stored in multiple locations.
This saves your projects from having a single machine as a catastrophic single point of failure.
If only one person is working on a project, you might be able to get away without using any version control system (though this is still really a bad idea).
However, if multiple people are working on a project together, the risk of people writing over each other’s revisions or creating incompatible code (also known as merge conflicts) is very high.
As such, one indispensable feature of version control systems (VCS) is the ability to check for mutually incompatible revisions to the master code base to ensure that everything works together.
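One simple way to picture this check is a shared code base that refuses any change built on an outdated revision. The sketch below is a toy model under that assumption; `SharedCodeBase`, `StaleRevisionError`, and the revision counter are hypothetical names, and real systems use more sophisticated comparisons.

```python
class StaleRevisionError(Exception):
    """Raised when a change was based on an outdated version of a file."""


class SharedCodeBase:
    """Toy shared code base that refuses changes built on stale revisions."""

    def __init__(self):
        self.content = {}    # file name -> current content
        self.revision = {}   # file name -> number of accepted changes

    def read(self, name):
        """Return the current content and the revision it corresponds to."""
        return self.content.get(name, ""), self.revision.get(name, 0)

    def submit(self, name, new_content, based_on):
        """Accept a change only if it was made against the latest revision."""
        current = self.revision.get(name, 0)
        if based_on != current:
            raise StaleRevisionError(
                f"{name}: change based on revision {based_on}, "
                f"but the code base is at revision {current}"
            )
        self.content[name] = new_content
        self.revision[name] = current + 1


repo = SharedCodeBase()
repo.submit("menu.txt", "File\nEdit\n", based_on=0)

# Two developers read the same revision and edit independently.
_, alice_base = repo.read("menu.txt")
_, bob_base = repo.read("menu.txt")

repo.submit("menu.txt", "File\nEdit\nView\n", based_on=alice_base)  # accepted

try:
    repo.submit("menu.txt", "File\nEdit\nHelp\n", based_on=bob_base)
    bob_accepted = True
except StaleRevisionError:
    bob_accepted = False  # Bob must update and reconcile before resubmitting
```

Without the check, Bob's submission would have silently overwritten Alice's "View" entry.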
How do you move files from your local development machine to your test environment and then, finally, the production environments?
Some people just keep an FTP window open and drop files in as they change them.
This is unwise. It is too easy to leave a needed file out, and if there is an unexpected problem on the server, it becomes difficult to reverse your revisions.
If you are using certain types of version control (especially Git), you can simply push your revisions all at once to a remote server. It does not matter what environment — development, test, or production — the server handles.
If any of your revisions cause a problem at any point in the future, you can easily roll back the revisions so that things begin to function again.
There are basically two types of version control systems:
Let’s take an in-depth look below.
Centralized version control systems follow a client-server model.
In these systems, a single, master (“central”) set of source code sits on a server. Individual files that are being worked on are checked out by developers.
The working copy is then “locked.” Others are either warned that they should not make revisions to the file or even prevented from editing the files (or both).
Developers then push the revisions they made to these files back to the central source code, which is the version used for code/software deployments to production environments.
In a centralized version control system, there is a central server (or repository) that acts as the source of truth.
This is also the set of code that is typically kept in a production-ready state.
This means that, at any given time, the code could be shipped to a production environment without negative ramifications.
When you need to work on something, you find the files that you need to work on. You then “check out” these files, which means that:
When you are finished making changes, you can commit them, including a note on what you did.
Unlike decentralized systems where you merge in your changes (we will talk more on merges in a bit), you simply push your changes to the central server. This releases the locks you have on those files.
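The check-out, lock, and commit cycle described above can be sketched as a small Python model. This is a simplified illustration, not the behavior of any particular product; `CentralServer`, `FileLockedError`, and the method names are hypothetical.

```python
class FileLockedError(Exception):
    """Raised when someone else already holds the lock on a file."""


class CentralServer:
    """Toy central repository with per-file locks (hypothetical names)."""

    def __init__(self, files):
        self.files = dict(files)   # file name -> content (the source of truth)
        self.locks = {}            # file name -> developer holding the lock
        self.log = []              # (developer, file name, note) per commit

    def check_out(self, name, developer):
        """Lock a file for one developer and hand back its content."""
        holder = self.locks.get(name)
        if holder is not None and holder != developer:
            raise FileLockedError(f"{name} is checked out by {holder}")
        self.locks[name] = developer
        return self.files[name]

    def commit(self, name, developer, content, note):
        """Store the new content, record the note, and release the lock."""
        if self.locks.get(name) != developer:
            raise FileLockedError(f"{developer} has not checked out {name}")
        self.files[name] = content
        self.log.append((developer, name, note))
        del self.locks[name]


server = CentralServer({"exit.py": "close()"})
draft = server.check_out("exit.py", "alice")

try:
    server.check_out("exit.py", "bob")
    bob_blocked = False
except FileLockedError:
    bob_blocked = True      # Bob has to wait until Alice commits

server.commit("exit.py", "alice", "warn_unsaved(); close()", "Add save warning")
bob_copy = server.check_out("exit.py", "bob")   # succeeds now that the lock is gone
```

Because only one person can hold a file at a time, there is nothing to merge: the commit simply replaces the central copy and frees the file for the next developer.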
Decentralized/distributed version control systems are those where the software developers involved have:
There is no one user or node that is more important than any other, though there is usually one single repository that is designated as the origin. (Think of a repository as a file but with historical information.)
The origin is similar to the “central” source code in a centralized VCS.
Individual changes are, when ready, merged into the source of truth (typically labeled as the master branch).
Because of the asynchronous and independent method by which decentralized VCS work, merge conflicts must be resolved by the developers before merging occurs.
This is how irreconcilable differences between the work of two or more developers are prevented from breaking the master branch.
In this section, we will cover the process of using a decentralized version control system.
The branching and merging required make the use of such systems slightly more complicated than their centralized counterparts.
You can get started in one of two ways:
Regardless of which option you chose, you will end up with a full copy of the source code on your computer.
Different versions of the code are called branches. The branch that serves as the source of truth, and that is shipped to production, is called the master branch. When using a distributed VCS, it is good practice to keep the master branch in a state ready for production deployment at all times.
Every time you want to make a change to one or more files, you create a new branch. As its name implies, a branch is an offshoot of the main code.
The number of changes you include on a branch can vary.
You might make just a small change, or you might keep months of changes on a single branch.
Typically, you would (at the very least) ensure that all of the changes are related to a single feature.
The process of saving a change is called committing.
Each commit that you make requires you to add notes on what you did — your VCS should automatically note that you were the person who committed the change and when.
Over time, you will be able to see a log of all commits made, when they were made, and by whom.
Commits have the bonus feature of allowing you to roll back your changes just a little bit at a time.
(This assumes you have created multiple commits and not just one big commit at the end of your project.)
You can think of commits as divisions of branches.
While branches hold changes related to a given feature, commits are the smaller changes that, added together, become the full feature update.
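The relationship between branches and commits can be sketched in a few lines of Python. This is a toy model with hypothetical names (`Branch`, `Commit`, `roll_back`); it stores a full snapshot per commit for simplicity, which is an assumption of the sketch and not a description of how any specific VCS works internally.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Commit:
    author: str
    message: str
    files: dict                      # full snapshot: file name -> content
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class Branch:
    """Toy branch: an ordered list of commits starting from a base snapshot."""

    def __init__(self, name, base_files):
        self.name = name
        self.base = dict(base_files)
        self.commits = []

    def head(self):
        """Return the files as of the latest commit (or the base if none)."""
        return dict(self.commits[-1].files) if self.commits else dict(self.base)

    def commit(self, author, message, changes):
        """Apply changes on top of the current head and record who and why."""
        files = self.head()
        files.update(changes)
        self.commits.append(Commit(author, message, files))

    def roll_back(self, steps=1):
        """Drop the most recent commits, one small step at a time."""
        if steps > 0:
            del self.commits[-steps:]

    def log(self):
        """Return (author, message) for every commit, oldest first."""
        return [(c.author, c.message) for c in self.commits]


feature = Branch("save-warning", {"exit.py": "close()"})
feature.commit("alice", "Add warning dialog", {"dialog.py": "def warn(): ..."})
feature.commit("alice", "Call dialog on exit", {"exit.py": "warn(); close()"})
feature.commit("alice", "Experiment with auto-save", {"exit.py": "save(); close()"})
feature.roll_back(1)   # undo only the experiment; the first two commits remain
```

Because the work was split into three small commits, rolling back one step removes the experiment without losing the rest of the feature.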
Branches are also helpful for sharing your work.
For example, let’s say that you are working with several others, and you are all contributing to a single repository.
Well, if you wanted to share your work (perhaps you want to get the code you have written reviewed), you can just push the branch you have been working on instead of the entire repository.
When you are ready to ship your work, you can begin the process of merging, where someone (typically not yourself) merges your feature branch into the master branch.
The general process is as follows:
Note that version control systems will only allow the reviewer to merge if your proposed changes do not conflict with anything that has already been merged into the master branch.
If this is not the case, you will have to resolve the merge conflicts and update your request.
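The conflict check can be illustrated with a simplified three-way comparison: for each file, compare the master version and the branch version against the version they both started from. The function below is a sketch under strong assumptions (whole files are compared instead of individual lines, and deleted files are not modeled); `three_way_merge` is a hypothetical name, not a real tool's API.

```python
def three_way_merge(base, master, branch):
    """Merge two sets of files that both started from `base`.

    Each argument maps file name -> content. Returns (merged, conflicts),
    where conflicts lists files that both sides changed in different ways.
    A file missing from one side is treated as unchanged on that side.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(master) | set(branch)):
        original = base.get(name)
        ours = master.get(name, original)
        theirs = branch.get(name, original)
        if ours == theirs:            # same result on both sides
            merged[name] = ours
        elif ours == original:        # only the branch changed this file
            merged[name] = theirs
        elif theirs == original:      # only master changed this file
            merged[name] = ours
        else:                         # both changed it differently
            conflicts.append(name)
    return merged, conflicts


base = {"exit.py": "close()", "menu.py": "menu = ['File']"}
master = {"exit.py": "close()", "menu.py": "menu = ['File', 'Edit']"}
feature = {"exit.py": "warn(); close()", "menu.py": "menu = ['File']"}

# Each side changed a different file, so the merge goes through cleanly.
merged, conflicts = three_way_merge(base, master, feature)

# Later, someone else changes exit.py on master before the feature is merged.
master_later = {**master, "exit.py": "log(); close()"}
_, conflicts_later = three_way_merge(base, master_later, feature)
```

In the second case, `exit.py` is reported as a conflict, which is the situation where you would have to reconcile the two versions and update your request.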
What are the primary differences between a decentralized/distributed version control system and a centralized version control system?
The most obvious difference between centralized and decentralized VCS is in terms of access and convenience.
You can think of a centralized system as being akin to accessing a shared Dropbox folder through a web browser.
Conversely, accessing a distributed system is the equivalent of syncing a shared, community Dropbox folder to your own computer.
With a centralized system, before your users can begin editing, they need to:
With a distributed system, the files are already right where you need them.
This is because one of the first steps of getting a distributed system set up is to clone all the files, as well as the version history, to your local development workstation.
Cloning a repository is analogous to copying a file — remember, however, repositories possess additional historical information.
When you are ready to begin working, all you have to do is open up the files you’ve “pulled” to your computer.
Having all the files you need locally is a huge advantage in terms of speed and efficiency.
The only time you need to communicate with the server is to pull changes from it or push changes back to it.
This asynchronous method also allows users to make several revisions locally before deciding on the next step:
However, one big downside of a distributed VCS is the amount of space a local repository might require.
Depending on the size of your project, individual repositories that you have cloned to your computer can end up taking a lot of space.
This problem is amplified if you have to clone multiple repositories for one or more projects.
When you consider the sheer number of text files, image files, videos, and changelog sizes, this can be problematic, especially for those on budget workstations.
For users with such limitations, a centralized VCS might be a better option, since users only have to pull down the files they need, not the entire set of source code and accompanying revision history.
When choosing a version control system (VCS), what are the options available to you?
Which one should you choose?
In the following sections, we will cover several popular distributed version control systems, as well as several popular centralized version control systems.
Hopefully, this helps you choose an option that fits your needs. If not, this list should help you jump start your search for the option that works!
Let’s start with some of the most popular distributed options available.
Bazaar is a version control system written in Python and sponsored by Canonical.
For users familiar with Concurrent Versions System (CVS) or Subversion (SVN), Bazaar commands will appear similar.
Bazaar, unlike some of the other distributed VCS, allows you to use it with or without a central repository or server where the master source code set lives.
It also integrates well with other VCS — you can commit changes to SVN, and you can read files that are tracked by Git or Mercurial.
You can also export Bazaar history to many other systems.
Fossil is a cross-platform, distributed version control system that also includes features for:
Fossil ships with a built-in web interface that displays detailed change history and project status information.
The goal of this interface is to reduce the complexity inherently involved with project tracking and to improve a user’s situational awareness in the code base.
Like Bazaar, Fossil does not require you to use a central server, though if you do, the collaboration between your team members will be easier.
Fossil utilizes SQLite databases to store its content.
Git is a version control system created by the “father of Linux,” Linus Torvalds.
Though Git features prominently in the software development world, it can be used to track changes in any type of file set.
Over all else, Git prioritizes performance.
This is important when distributed version control systems require:
Though Git was originally developed for Linux, it is a cross-platform solution.
Typically, each project is managed in an individual repository. (Remember, a repository is essentially a folder but with a log of changes.)
Files for large projects are sometimes split into multiple repositories.
Git is typically used in conjunction with some type of web-based hosting service.
This is the method by which multiple collaborators can share their work, as well as pull down the original source code and the changes made by their peers.
In addition to supporting all of the version control and source code management features of Git, GitHub offers:
You can even generate and host simple web pages using GitHub.
Though GitHub offers both public and private repositories, utilizing a private repository incurs fees (whereas a public repository is free of charge).
This is in line with GitHub’s dedication to open source code.
Bitbucket is Atlassian’s contribution to the world of web-based hosting for Git (and Mercurial) users.
In addition to its free accounts, Bitbucket offers more feature-rich commercial plans.
For some users, Bitbucket is a better option than GitHub, since Bitbucket does not charge anything if you use a private repository.
Free accounts get an unlimited number of private repositories, though the number of contributors is capped.
Bitbucket is typically seen as the option for professional developers working with proprietary source code.
Its primary use is for code and code review, though Bitbucket does offer some extras like:
GitLab provides four different self-hosted plans:
If you are not interested in self-hosting, you can opt for the fully-hosted version of GitLab. For each self-hosted plan, there is a corresponding hosted plan:
GitLab ensures feature parity between its self-hosted and fully-hosted plans (that is, the features offered to those on the Starter plan are the same as those on the Bronze plan).
For those of you who need a private repository (or multiple private repositories), you might strongly consider GitLab.
For these situations, GitLab is cheaper than GitHub and faster than Bitbucket (though obviously, your mileage may vary depending on variables specific to your situation).
Mercurial is a cross-platform distributed version control system that is:
Despite the complexity that such features might introduce, the engineers still strive to ship a conceptually simple product with an easy-to-use, integrated web interface.
Though the command line is the primary method by which a user interacts with Mercurial, there are many graphical user interface (GUI) extensions available, and many integrated development environments (IDE) offer built-in Mercurial integration support.
The following version control systems are some of the most popular centralized options available.
Concurrent Versions System (CVS) is a free version control system.
CVS’ origins are with a series of shell scripts shipped in mid-1986.
CVS is no longer maintained (the last time the developers shipped a new release was 2008), but you will still find some people using CVS.
When using CVS, note that the terminology it uses is slightly different from that used by other version control systems.
For example, a set of related files is called a module, while the series of modules a CVS server manages is called the repository.
CVS calls the files that get checked out by developers the working copy, sandbox, or workspace.
Revisions to the working copy are sent to the repository via commits, while updating is the process of acquiring the changes now present in the repository.
Apache’s Subversion (SVN) is an open source versioning/revision control system.
We mentioned that Concurrent Versions System (CVS) still has some users, but CVS has not been updated since 2008.
As such, Subversion has been designed to act as, and is frequently used as, a (mostly) compatible alternative/successor to CVS.
While distributed systems like Git seem to get most of the attention in the world of version control systems, Subversion is commonly used, especially in the open-source community.
Subversion was originally developed in 2000 as an alternative to CVS, but with bug fixes and additional features not found in CVS.
One of the biggest perks of Subversion is its built-in, fine-grained permissions system.
You can limit access to files and directories on a per-user basis.
Furthermore, Subversion is a good option for those who want binary files and other assets stored in the same repositories as the source code (even more so if you have a large number of said binary files).
Also, do not discount the fact that there is a learning curve when it comes to version control systems.
Subversion can be easier for people (especially non-technical users) to learn and understand than other version control systems.
Finally, Subversion is a good option for businesses operating in heavily-regulated industries.
While you can certainly hack any version control system to maintain the audit trails you need to keep your company compliant with the appropriate regulations, SVN, as an enterprise-grade system, comes with the feature set necessary to make this process easier for you.
Team Foundation Server (TFS) is Microsoft’s contribution to the world of version control systems.
TFS also includes features for:
Essentially, TFS contains everything you need to manage all aspects of the software development lifecycle.
TFS can be used with many different integrated development environments (IDEs).
It is built especially for use with Visual Studio or Eclipse.
You can self-host TFS, or you can subscribe to the hosted version called Visual Studio Team Services.
Furthermore, TFS is one of the few products that boast built-in extensibility.
You can certainly hack other systems into behaving the way you want, even when that goes against the way the product is designed, but TFS makes this process much easier.
There are many different version control systems out there, and while they all implement version control slightly differently, the important thing is for you to adopt one.
The difference between Git, CVS, and SVN is not as large as the difference between having and not having a version control system.
Don’t risk catastrophic loss of your source code — adopt a version control system today!