When contributing to open source projects, it’s pretty common these days to fork the project on GitHub, add your contribution, and then send your work to the project as a so-called “pull request” for inclusion. It’s nice, clean and fast. I did this last week to contribute to Apache CloudStack. When I wanted to contribute again today, I first had to figure out how to get my forked repo up to date before I could send a new contribution.
Remember, you can read from and write to your fork, but you can only read from the upstream repository.
Adding upstream as a remote
When you clone your forked repo to a local development machine, it is set up like this:
git remote -v
origin  git@github.com:remibergsma/cloudstack.git (fetch)
origin  git@github.com:remibergsma/cloudstack.git (push)
This remote only points at your fork, and a fork does not pick up new upstream commits by itself. To get those, we need to add the original repo as an extra “remote” that we’ll call “upstream”:
git remote add upstream https://github.com/apache/cloudstack
Now, run the same command again and you’ll see two remotes:
git remote -v
origin    git@github.com:remibergsma/cloudstack.git (fetch)
origin    git@github.com:remibergsma/cloudstack.git (push)
upstream  https://github.com/apache/cloudstack (fetch)
upstream  https://github.com/apache/cloudstack (push)
The local clone now has both the fork (origin) and the original repo (upstream) configured as remotes.
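If you repeat this setup on several clones, keep in mind that `git remote add` fails when a remote with that name already exists. Below is a minimal sketch of a guard around it. `ensure_upstream` is a made-up helper name, not a git command, and it assumes you run it from inside the clone.

```shell
#!/bin/bash
# ensure_upstream: hypothetical helper, not part of git.
# Adds the "upstream" remote only when the current repo does not have one yet.
ensure_upstream() {
    local url="$1"
    local current
    # git config --get exits non-zero when the key is not set
    if current="$(git config --get remote.upstream.url)"; then
        echo "upstream already points to $current"
    else
        git remote add upstream "$url" && echo "upstream added: $url"
    fi
}
```

For example, `ensure_upstream https://github.com/apache/cloudstack` adds the remote the first time and only reports the existing URL on later runs.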
Let’s fetch the updates from upstream:
git fetch upstream
Sample output:
remote: Counting objects: 151, done.
remote: Compressing objects: 100% (123/123), done.
remote: Total 151 (delta 39), reused 0 (delta 0)
Receiving objects: 100% (151/151), 153.30 KiB | 0 bytes/s, done.
Resolving deltas: 100% (39/39), done.
From https://github.com/apache/cloudstack
   2f2ff4b..49cf2ac  4.4 -> upstream/4.4
   aca0f79..66b7738  4.5 -> upstream/4.5
 * [new branch]      hotfix/4.4/CLOUDSTACK-8073 -> upstream/hotfix/4.4/CLOUDSTACK-8073
   85bb685..356793d  master -> upstream/master
   b963bb1..36c0c38  volume-upload -> upstream/volume-upload
We now have the new updates locally. Before you continue, make sure you are on the master branch:
git checkout master
Then we will rebase our own master branch onto the new upstream changes:
git rebase upstream/master
You can achieve the same by merging, but rebasing is usually cleaner and doesn’t add the extra merge commit.
Sample output:
Updating 4e1527e..356793d
Fast-forward
 SSHKeyPairResponse.java                       | 12 ++++++++++++
 SolidFireSharedPrimaryDataStoreLifeCycle.java | 33 +++++++++++++++++++++++++++++++++
 RulesManagerImpl.java                         |  2 +-
 ManagementServerImpl.java                     |  5 +----
 4 files changed, 47 insertions(+), 5 deletions(-)
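Before pushing, it can help to check how the local branch relates to upstream. The sketch below wraps `git rev-list --left-right --count`, which prints the number of commits found only on the left side and only on the right side of a `...` range. `sync_status` is a made-up helper name, and it assumes the remote is called upstream and has already been fetched.

```shell
#!/bin/bash
# sync_status: hypothetical helper, not part of git.
# Prints how many commits the local branch is ahead of / behind upstream.
# Assumes "git fetch upstream" has been run and the remote is named upstream.
sync_status() {
    local branch="${1:-master}"
    local counts
    counts="$(git rev-list --left-right --count "$branch...upstream/$branch")" || return 1
    # counts holds two numbers: commits only on the local branch,
    # then commits only on the upstream branch
    set -- $counts
    echo "$branch: $1 ahead, $2 behind upstream/$branch"
}
```

Right after rebasing a master that has no commits of its own, `sync_status master` should report 0 ahead and 0 behind. A non-zero "ahead" count means your branch carries commits that upstream does not have.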
Finally, update your fork on GitHub with the new commits:
git push origin master
Branches other than master
Suppose you want to track another branch and keep that in sync as well:
git checkout -b 4.5 origin/4.5
This will set up a local branch called ‘4.5’ that tracks ‘origin/4.5’.
If you want to get them in sync again later on, the workflow is similar to above:
git checkout 4.5
git fetch upstream
git rebase upstream/4.5
git push origin 4.5
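The four commands above differ per branch only in the branch name, so they fold naturally into a small function. This is just a sketch: `sync_branch` is a made-up name, and it assumes the remotes are called origin and upstream and that the local branch already exists.

```shell
#!/bin/bash
# sync_branch: hypothetical helper, not part of git.
# Repeats the checkout / fetch / rebase / push steps for the given branch
# and stops at the first command that fails.
sync_branch() {
    local branch="$1"
    git checkout "$branch" &&
    git fetch upstream &&
    git rebase "upstream/$branch" &&
    git push origin "$branch"
}
```

Call it from inside the clone, for example `sync_branch 4.5` or `sync_branch master`. Because the steps are chained with `&&`, a failed rebase never reaches the push.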
Automating this process
I wrote this script to synchronise my clones with upstream:
#!/bin/bash

# Sync upstream repo with fork repo
# Requires upstream repo to be defined

# Check if local repo is specified and exists
if [ -z "$1" ]; then
  echo "Please specify repo to sync: $0 <dir>"
  exit 1
fi
if [ ! -d "$1" ]; then
  echo "Dir $1 does not exist!"
  exit 1
fi

# Go into git repo and update
cd "$1" || exit 1

# Check that a remote named exactly "upstream" exists
if ! git remote | grep -qx upstream; then
  echo "Upstream repo not defined. Please add it: git remote add upstream http://github.com/..."
  exit 1
fi

# Update and push; stop at the first step that fails
git fetch upstream || exit 1
# Make sure we rebase master and not whatever branch happens to be checked out
git checkout master || exit 1
git rebase upstream/master || exit 1
git push origin master
Execute like this:
./update_origin_with_upstream.sh /path/to/git/repo
Happy contributing!