Git submodules

Git submodules let you embed one Git repository inside another, which makes Git usable as a lightweight tool for managing external dependencies. When you add a submodule, its code is not added to the main repository. Git records only a reference to it: the submodule's URL and path, plus the specific commit it is pinned to. This is loosely analogous to a soft link in your file system.

Advantages of using submodules

  1. Easy separation of code into different repositories.
  2. The same submodule can be added to multiple repositories, so shared code is maintained in one place.

Common commands

I have created a new repo, GitAdvanced, and will add the gitBasics repository to it as a submodule to explain the commands and concepts. This is not a great example from the point of view of dependency management: usually a submodule is a module or package that your code depends on.

Adding a submodule

As the first step, I have cloned the GitAdvanced repository and changed my working directory to it.

users-MacBook-Air:GitAdvanced $ pwd
/*****/Github/GitAdvanced
users-MacBook-Air:GitAdvanced $ ls
README.md

As the snippet above shows, the GitAdvanced folder currently contains only the file README.md.

Let us now add a submodule here.

users-MacBook-Air:GitAdvanced $ git submodule add git@github.com:anjanashankar9/gitBasics.git
Cloning into '/*****/Github/GitAdvanced/gitBasics'
remote: Enumerating objects: 19, done.
remote: Total 19 (delta 0), reused 0 (delta 0), pack-reused 19
Receiving objects: 100% (19/19), done.
Resolving deltas: 100% (3/3), done.

When you run git status, you will see that a .gitmodules file has been added, along with the gitBasics directory.

users-MacBook-Air:GitAdvanced $ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: .gitmodules
new file: gitBasics

Now let us take a closer look at the contents of the .gitmodules file.

users-MacBook-Air:GitAdvanced $ cat .gitmodules
[submodule "gitBasics"]
path = gitBasics
url = git@github.com:anjanashankar9/gitBasics.git

The file shows the submodule mapping: the path of the submodule inside this repository and the URL it is cloned from. If you have multiple submodules, you'll have multiple entries in this file.
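
To experiment with this mapping without touching a real project, here is a minimal sketch that builds two throwaway repositories in a temporary directory and adds one to the other as a submodule. The names lib and app are hypothetical stand-ins for gitBasics and GitAdvanced, the demo identity is a placeholder, and the protocol.file.allow override is only there because recent Git versions refuse to clone submodules from a local path by default.

```shell
#!/bin/sh
set -eu

# Throwaway workspace so the sketch never touches a real repository.
work=$(mktemp -d)
cd "$work"

# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "lib" plays the role of gitBasics, "app" the role of GitAdvanced.
git init -q lib
echo "library code" > lib/README.md
git -C lib add README.md
git -C lib commit -q -m "lib: initial commit"

git init -q app
git -C app commit -q --allow-empty -m "app: initial commit"

# Allow the local file transport for this one command only.
git -C app -c protocol.file.allow=always submodule --quiet add "$work/lib" lib

# Read the mapping back out of .gitmodules.
sub_path=$(git -C app config -f .gitmodules submodule.lib.path)
sub_url=$(git -C app config -f .gitmodules submodule.lib.url)
echo "path=$sub_path"
echo "url=$sub_url"
```

Reading the values with git config -f .gitmodules avoids parsing the file by hand, which may be handy in scripts that need to list submodule paths.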

Now let us add these two files, commit, and push to GitHub.

users-MacBook-Air:GitAdvanced $ git add .gitmodules gitBasics/
users-MacBook-Air:GitAdvanced $ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: .gitmodules
new file: gitBasics
users-MacBook-Air:GitAdvanced $ git commit -m"Adding submodule"
[master 2575f72] Adding submodule
2 files changed, 4 insertions(+)
create mode 100644 .gitmodules
create mode 160000 gitBasics
users-MacBook-Air:GitAdvanced $ git push
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 4 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 395 bytes | 395.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To github.com:anjanashankar9/GitAdvanced.git
f1b35a7..2575f72 master -> master

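Now let us quickly go to GitHub and see what it looks like.

As you can see, GitHub shows gitBasics as a link to the gitBasics repository at a specific commit hash. It looks like a symlink, but what the GitAdvanced repository actually stores is a reference to that one commit.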
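
The same reference can be inspected locally. The sketch below, again using hypothetical throwaway repositories named lib and app rather than the real gitBasics and GitAdvanced, commits a submodule and then lists the parent's tree entry for it. The entry has mode 160000 (the mode that also appears in the commit output above) and type commit, followed by the hash the submodule is pinned to.

```shell
#!/bin/sh
set -eu

# Throwaway workspace with placeholder commit identity.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "lib" stands in for the submodule, "app" for the parent project.
git init -q lib
echo "library code" > lib/README.md
git -C lib add README.md
git -C lib commit -q -m "lib: initial commit"

git init -q app
git -C app commit -q --allow-empty -m "app: initial commit"

# Local-path submodule clones need the file transport allowed explicitly.
git -C app -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C app commit -q -m "Add lib as a submodule"

# The parent's tree holds a single entry for the submodule.
entry=$(git -C app ls-tree HEAD lib)

# The commit currently checked out inside the submodule.
pinned=$(git -C app/lib rev-parse HEAD)

echo "$entry"
echo "submodule HEAD: $pinned"
```

The hash in the tree entry matches the submodule's own HEAD, which is why the parent always points at one exact commit rather than at a branch.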

Cloning a project that contains a Git submodule

When you clone a project that has submodules, by default you get the directories that contain the submodules, but none of the files within them. Getting those files takes a few extra steps. To demonstrate this, I am going to clone the project at a different location on my system.

users-MacBook-Air: $ git clone git@github.com:anjanashankar9/GitAdvanced.git
Cloning into 'GitAdvanced'
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 10 (delta 0), reused 3 (delta 0), pack-reused 7
Receiving objects: 100% (10/10), done.
Resolving deltas: 100% (1/1), done.
users-MacBook-Air:Anjana vishal$ cd GitAdvanced/
users-MacBook-Air:GitAdvanced vishal$ ls -lR
total 8
-rw-r--r-- 1 vishal staff 59 May 8 00:06 README.md
drwxr-xr-x 2 vishal staff 64 May 8 00:06 gitBasics
./gitBasics:

As seen from the output of ls above, the gitBasics directory is listed, but it has no contents. Two commands are required: the first initialises the local configuration file, and the second fetches the submodule's data and checks out the commit recorded in the parent project.

  • git submodule init
  • git submodule update
users-MacBook-Air:GitAdvanced $ cd gitBasics/
users-MacBook-Air:gitBasics $ ls
users-MacBook-Air:gitBasics $ git submodule init
Submodule 'gitBasics' (git@github.com:anjanashankar9/gitBasics.git) registered for path './'
users-MacBook-Air:gitBasics $ git submodule update
Cloning into '~/GitAdvanced/gitBasics'
Submodule path './': checked out '1eb359e70b7463ab18a39df71fa753ff96376e3e'
users-MacBook-Air:gitBasics $ ls
README.md filename.txt filename3.txt

There is another, slightly simpler way to do this. You can pass --recurse-submodules to the git clone command. It will automatically initialise and update each submodule in the repository, including nested submodules if any of the submodules in the repository have submodules themselves.
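
Both routes can be compared side by side. This is a minimal sketch on hypothetical throwaway repositories (lib and app), not the real gitBasics and GitAdvanced. It also uses git submodule update --init --recursive, which is not shown in this post but combines the init and update steps for a clone that already exists. The protocol.file.allow override is an assumption needed only because the demo clones from local paths.

```shell
#!/bin/sh
set -eu

# Throwaway workspace with placeholder commit identity.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build a parent project "app" that contains the submodule "lib".
git init -q lib
echo "library code" > lib/README.md
git -C lib add README.md
git -C lib commit -q -m "lib: initial commit"
git init -q app
git -C app commit -q --allow-empty -m "app: initial commit"
git -C app -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C app commit -q -m "Add lib as a submodule"

# Route 1: a plain clone leaves the submodule directory empty...
git clone -q "$work/app" plain
before=$(ls -A plain/lib)

# ...until the submodule is initialised and updated.
git -C plain -c protocol.file.allow=always submodule --quiet update --init --recursive

# Route 2: do everything in one step at clone time.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/app" full

echo "files before update: '${before}'"
ls plain/lib full/lib
```

With a remote URL such as the GitHub one used in this post, the protocol.file.allow override should not be needed.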

Deleting a submodule

Deleting a submodule involves four steps.

  1. Deinitialising the submodule
  2. Removing the submodule directory from Git
  3. Removing the submodule directory from .git/modules/
  4. Committing and pushing the changes
users-MacBook-Air:GitAdvanced $ git submodule deinit gitBasics
Cleared directory 'gitBasics'
Submodule 'gitBasics' (git@github.com:anjanashankar9/gitBasics.git) unregistered for path 'gitBasics'
users-MacBook-Air:GitAdvanced $ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
users-MacBook-Air:GitAdvanced $ git rm gitBasics/
rm 'gitBasics'
users-MacBook-Air:GitAdvanced $ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: .gitmodules
deleted: gitBasics
users-MacBook-Air:GitAdvanced $ cat .gitmodules
users-MacBook-Air:GitAdvanced $ ls
README.md
users-MacBook-Air:GitAdvanced $ rm -rf .git/modules/gitBasics/
users-MacBook-Air:GitAdvanced $ git commit -m"Removed Submodule"
[master baa3526] Removed Submodule
2 files changed, 4 deletions(-)
delete mode 160000 gitBasics
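
The removal sequence can be rehearsed safely on a scratch copy first. The sketch below repeats the steps above on hypothetical throwaway repositories (lib and app). It has no remote, so the final push is left out, and the protocol.file.allow override is only needed because the submodule is added from a local path.

```shell
#!/bin/sh
set -eu

# Throwaway workspace with placeholder commit identity.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build a parent project "app" that contains the submodule "lib".
git init -q lib
echo "library code" > lib/README.md
git -C lib add README.md
git -C lib commit -q -m "lib: initial commit"
git init -q app
git -C app commit -q --allow-empty -m "app: initial commit"
git -C app -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C app commit -q -m "Add lib as a submodule"

# 1. Unregister the submodule and empty its working directory.
git -C app submodule --quiet deinit lib

# 2. Remove the submodule entry from the index and from .gitmodules.
git -C app rm -q lib

# 3. Delete the cached clone kept under .git/modules/.
rm -rf app/.git/modules/lib

# 4. Commit the removal (a real project would also push here).
git -C app commit -q -m "Remove lib submodule"

remaining=$(git -C app ls-tree HEAD lib)
echo "tree entry after removal: '${remaining}'"
```

Skipping step 3 leaves the old clone on disk, which may get in the way if a submodule with the same name is added again later.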

Comparison with subtrees

Both submodules and subtrees allow you to embed an external repository within your current one. Which one to use depends on whether you own the external repository and how likely you are to push to it.

If you own the external repository and are likely to push code back to it, Git submodules would serve you better.

If you have third-party code that you are not likely to push to, you should use Git subtree, since it is easier to pull from.

Conclusion

There is a lot more to Git submodules, and this post only scratches the surface. Still, they are an effective tool when you want to include a dependency that is, for example, a work in progress owned by another team.

You can see the two repositories used for the demo here.
