Git submodules let you use Git itself to manage external dependencies by embedding one repository inside another. When you add a submodule, its code is not added to the main repository; Git records only a reference to it: the submodule's URL and path, plus the exact commit to check out. This is analogous to a soft link in your file system.
Advantages of using submodules
- Easy separation of code into different repositories.
- The same submodule can be added to multiple repositories, so shared code is maintained in one place.
I have created a new repo, GitAdvanced, and will add gitBasics to it as a submodule to explain the commands and concepts. This is not a great example from a dependency-management point of view; usually a submodule is a module or package that your code depends on.
Adding a submodule
As the first step, I cloned the GitAdvanced repository and changed my working directory to it.
As seen from the above snippet, currently the GitAdvanced folder only contains the file
Let us now add a submodule here.
When you do a git status you will see that a .gitmodules file has been added, along with the gitBasics directory.
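Adding the submodule and checking the status can be sketched as follows. This is a minimal sketch rather than the exact commands from the demo: it uses two throwaway local repositories in a temporary directory as stand-ins for GitAdvanced and gitBasics, and make_repo is a hypothetical helper. The -c protocol.file.allow=always setting is only there because the stand-in is added from a local path, which recent Git versions refuse for submodules by default; with a regular https or ssh URL you would just run git submodule add <url>.

```shell
set -eu
# Throwaway identity so the demo commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create a repository at $1 with a single commit.
make_repo() {
  git init -q "$1"
  echo "$2" > "$1/README.md"
  git -C "$1" add README.md
  git -C "$1" commit -q -m "Initial commit"
}

work=$(mktemp -d)
make_repo "$work/gitBasics" "basics"      # stand-in for the gitBasics repo
make_repo "$work/GitAdvanced" "advanced"  # stand-in for the GitAdvanced repo
cd "$work/GitAdvanced"

# Add gitBasics as a submodule in a directory of the same name.
git -c protocol.file.allow=always submodule add "$work/gitBasics" gitBasics

# Both .gitmodules and gitBasics should show up as newly staged entries.
git status --short
```

Note that git submodule add stages both new entries for you, so they are ready to commit.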
Now let us take a closer look at the contents of the .gitmodules file.
The file shows the submodule mapping: the submodule's path in this repository and the URL it is cloned from. If you have multiple submodules, you'll have multiple entries in this file.
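To give an idea of the format, here is a small sketch that writes a .gitmodules file by hand into a temporary directory and reads it back with git config. The URL is a made-up placeholder, not the real address of the gitBasics repository; in practice Git writes this file for you when you add a submodule.

```shell
set -eu
demo=$(mktemp -d)

# One [submodule "<name>"] section per submodule, each with a path and a url.
# The url below is hypothetical.
cat > "$demo/.gitmodules" <<'EOF'
[submodule "gitBasics"]
    path = gitBasics
    url = https://example.com/hypothetical/gitBasics.git
EOF

# .gitmodules uses the normal git config syntax, so git config can read it.
git config -f "$demo/.gitmodules" --get-regexp '^submodule\.'
git config -f "$demo/.gitmodules" --get submodule.gitBasics.url
```

Because the file is plain git config syntax, the same git config -f call is a handy way to list or script against the submodules of a project.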
Now let us add these two files, commit, and push to GitHub.
Now let us quickly go to GitHub and see what it looks like.
As you can see, the gitBasics entry appears as a link to the gitBasics repository pinned at a specific commit hash. It resembles a symlink, but Git stores it as a special tree entry (a gitlink) that records that commit.
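You can inspect the same thing locally without going to GitHub. The sketch below is an illustration under assumptions: it recreates the setup with throwaway local repositories (make_repo is a hypothetical helper), commits the submodule, and lists the tree of the resulting commit. The push is left out because the stand-in repository has no remote, and -c protocol.file.allow=always is only needed because the submodule is added from a local path.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create a repository at $1 with a single commit.
make_repo() {
  git init -q "$1"
  echo "$2" > "$1/README.md"
  git -C "$1" add README.md
  git -C "$1" commit -q -m "Initial commit"
}

work=$(mktemp -d)
make_repo "$work/gitBasics" "basics"
make_repo "$work/GitAdvanced" "advanced"
cd "$work/GitAdvanced"
git -c protocol.file.allow=always submodule add "$work/gitBasics" gitBasics

# Stage the two new entries (already staged by submodule add) and commit.
git add .gitmodules gitBasics
git commit -q -m "Add gitBasics as a submodule"

# The gitBasics entry has mode 160000 and type "commit": a gitlink that
# pins the submodule to one commit instead of storing its files.
git ls-tree HEAD

# The pinned hash is the commit currently checked out in the submodule.
git -C gitBasics rev-parse HEAD
```

The superproject stores only that one hash for gitBasics, which is why moving the submodule to a newer commit shows up as a one-line change in the parent repository.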
Cloning a project that contains a git submodule
When you clone a project that has submodules, by default you get just the directories that contain submodules, but none of the files within them yet. In order to get those files a few steps need to be done. In order to demonstrate this, I am going to clone the project at a different location on my system.
In the fresh clone, ls lists the gitBasics directory, but it has no contents. A combination of two commands is required: the first initialises the local configuration file and the second fetches the submodule's files and checks out the recorded commit.
- git submodule init
- git submodule update
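The two commands above can be tried end to end with the following sketch. It is an illustration under assumptions, not the demo itself: the repositories are throwaway local stand-ins, make_repo is a hypothetical helper, and -c protocol.file.allow=always is only needed because the submodule URL here is a local path.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create a repository at $1 with a single commit.
make_repo() {
  git init -q "$1"
  echo "$2" > "$1/README.md"
  git -C "$1" add README.md
  git -C "$1" commit -q -m "Initial commit"
}

# Build a stand-in GitAdvanced that already contains the submodule.
work=$(mktemp -d)
make_repo "$work/gitBasics" "basics"
make_repo "$work/GitAdvanced" "advanced"
git -C "$work/GitAdvanced" -c protocol.file.allow=always \
  submodule add "$work/gitBasics" gitBasics
git -C "$work/GitAdvanced" commit -q -m "Add gitBasics as a submodule"

# A plain clone creates the gitBasics directory but leaves it empty.
git clone -q "$work/GitAdvanced" "$work/clone"
cd "$work/clone"
ls -A gitBasics

# init copies the submodule URL from .gitmodules into .git/config;
# update clones the submodule and checks out the pinned commit.
git submodule init
git -c protocol.file.allow=always submodule update
ls -A gitBasics
```

Running git submodule update --init does both steps in one command.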
There is another way to do this which is a little simpler. You can pass --recurse-submodules to the git clone command. It will automatically initialise and update each submodule in the repository, including nested submodules if any of the submodules in the repository have submodules themselves.
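A minimal sketch of the one-step variant, again with throwaway local stand-ins for the two repositories (make_repo is a hypothetical helper, and -c protocol.file.allow=always is only there because the submodule URL is a local path in this sketch):

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create a repository at $1 with a single commit.
make_repo() {
  git init -q "$1"
  echo "$2" > "$1/README.md"
  git -C "$1" add README.md
  git -C "$1" commit -q -m "Initial commit"
}

# Build a stand-in GitAdvanced that already contains the submodule.
work=$(mktemp -d)
make_repo "$work/gitBasics" "basics"
make_repo "$work/GitAdvanced" "advanced"
git -C "$work/GitAdvanced" -c protocol.file.allow=always \
  submodule add "$work/gitBasics" gitBasics
git -C "$work/GitAdvanced" commit -q -m "Add gitBasics as a submodule"

# Clone the project and populate its submodules in a single command.
git -c protocol.file.allow=always clone -q --recurse-submodules \
  "$work/GitAdvanced" "$work/clone"

# The submodule directory already contains its files.
ls -A "$work/clone/gitBasics"
git -C "$work/clone" submodule status
```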
Deleting a submodule
Deleting a submodule involves a few steps.
- Deinitialising the submodule
- Removing the submodule directory from Git
- Removing the submodule directory from
- Committing and pushing the changes.
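One common way to carry out these steps is sketched below. Treat it as an assumption-laden illustration: the repositories are throwaway local stand-ins, make_repo is a hypothetical helper, the third step is my reading of a commonly used cleanup (deleting the cached clone under .git/modules), and the push is omitted because the stand-in has no remote.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create a repository at $1 with a single commit.
make_repo() {
  git init -q "$1"
  echo "$2" > "$1/README.md"
  git -C "$1" add README.md
  git -C "$1" commit -q -m "Initial commit"
}

# Build a stand-in GitAdvanced that already contains the submodule.
work=$(mktemp -d)
make_repo "$work/gitBasics" "basics"
make_repo "$work/GitAdvanced" "advanced"
cd "$work/GitAdvanced"
git -c protocol.file.allow=always submodule add "$work/gitBasics" gitBasics
git commit -q -m "Add gitBasics as a submodule"

# 1. Unregister the submodule from .git/config and empty its work tree.
git submodule deinit -f gitBasics

# 2. Remove the gitlink from the index; this also drops the matching
#    section from .gitmodules and stages that change.
git rm -f gitBasics

# 3. Assumption: delete the cached clone that Git keeps for the submodule.
rm -rf .git/modules/gitBasics

# 4. Commit the removal (push afterwards if the repository has a remote).
git commit -q -m "Remove gitBasics submodule"
```

If you skip the third step, the cached clone stays under .git/modules and is reused should a submodule with the same name be added again.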
Comparison with subtrees
Both submodules and subtrees allow you to embed an external repository within your current one. Which one to use depends on whether you own the external repository and how likely you are to push to it.
If you own the external repository and are likely to push code back to it, git submodule would serve you better.
If you have third-party code that you are not likely to push to, you should use git subtree, since it is easier to pull from.
Git submodules are a lot more intricate than this, and this post only scratches the surface. But they are an effective tool when you want to include a dependency that is, for example, a work in progress owned by another team.
You can see the two repositories used for the demo here.