cdnjs git repositories visualization using gource

Gource is a well-known version control visualization tool that supports git, svn, hg, bzr and cvs2cl. I used it to visualize the CDNJS development history, and here are the videos I uploaded to YouTube.

CDNJS main repository:

CDNJS new-website repository:


How to sync/update forked git repository with upstream?


The original Chinese version of this post turned out to be one of the most popular posts here, so I’m rewriting it in English now :)

Once you fork a repository on GitHub, the commit history of your fork stays frozen at the moment you forked it. Sometimes we need to update it so that the parents of our commits aren’t too old: this keeps the history cleaner and easier to trace, and avoids some potential or known conflicts. You may have other reasons too. The same method can also sync a repository between different git servers.

If your fork is more than 1k commits behind the upstream repository and the repository is large, consider deleting your fork and re-forking the original one; that may be faster and more efficient.

Let’s get started.

First, if you don’t have the repository locally, clone your fork. You can set the clone depth to save bandwidth and disk space:
$ git clone --depth 1

(Use the URL of your forked repository.)

Once it’s cloned, check its remotes. Right after a clone there is usually only one, like this:
$ git remote -v
origin (fetch)
origin (push)

Now add another remote, the original upstream, so that we can pull updates from it. The read-only git protocol is faster and more efficient here, but some firewalls block it and some hosts no longer serve it; use https in that case:
$ git remote add upstream git://

PS: ‘upstream’ is just the name I gave it; you can use any name you like.

To verify the newly added remote, list the remotes again; you should have two now:
$ git remote -v
origin (fetch)
origin (push)
upstream git:// (fetch)
upstream git:// (push)

Now we can start the actual update. I assume the branch you’re updating is master; if it’s another branch, check that branch out first, and don’t forget to change the branch name in the examples below!

If your branch is only behind upstream, with no “ahead” commits (meaning you haven’t committed anything new on top of the branch that came from upstream), you can pull the updates from upstream directly:
$ git pull upstream master

If your branch also contains your own commits, it’s better to pull with the “--rebase” option:
$ git pull --rebase upstream master
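To tell which of the two cases applies, one option is to count the commits on each side after a fetch. The sketch below is only an illustration: it builds throwaway repositories in a temporary directory (all paths, file names and the demo identity are made up), and it relies on `git rev-list --left-right --count`, which prints the “ahead” and “behind” counts separated by a tab.

```shell
#!/bin/sh
# Sketch: decide between a plain pull and a rebase pull.
# Everything here (paths, file names, identity) is hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the upstream repository, with one commit on master
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "upstream 1"

git clone -q "$work/upstream" local
cd local
git remote add upstream "$work/upstream"

# One new commit on each side, so the branch is both ahead and behind
echo mine > mine.txt
git add mine.txt
git commit -q -m "my commit"
echo two >> "$work/upstream/file.txt"
git -C "$work/upstream" commit -q -am "upstream 2"

git fetch -q upstream
# Left count = commits only on master (ahead),
# right count = commits only on upstream/master (behind)
counts=$(git rev-list --left-right --count master...upstream/master)
ahead=$(echo "$counts" | cut -f1)
behind=$(echo "$counts" | cut -f2)
echo "ahead=$ahead behind=$behind"

if [ "$ahead" -eq 0 ]; then
    git pull -q upstream master            # only behind: fast-forward
else
    git pull -q --rebase upstream master   # own commits: replay them on top
fi

still_behind=$(git rev-list --count master..upstream/master)
echo "behind after pull: $still_behind"
```

After the rebase, your own commit sits on top of the latest upstream commit, so the history stays linear.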

Almost done. If there are no errors or conflicts (we don’t cover conflicts here), push your master to the origin remote, and your fork will be fresh again:
$ git push origin master
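The whole flow can be tried offline before touching a real fork. The following is a rough sketch under stated assumptions: two local bare repositories stand in for the upstream server and the fork (their names, `upstream.git` and `fork.git`, and the `seed` working copy are made up for this demo), and the fork has no commits of its own, so a plain pull is enough.

```shell
#!/bin/sh
# Sketch: sync a fork with its upstream, using local bare repositories
# as stand-ins for the two servers. All names and paths are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# "Upstream" server with one commit on master
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo one > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "first commit"
git -C seed push -q "$work/upstream.git" master

# "Fork" it: a bare copy taken at this point in time
git clone -q --bare upstream.git fork.git

# Upstream moves on after the fork was made
echo two >> seed/file.txt
git -C seed commit -q -am "second commit"
git -C seed push -q "$work/upstream.git" master

# The steps from the post: clone the fork, add upstream, pull, push
git clone -q "$work/fork.git" local
cd local
git remote add upstream "$work/upstream.git"
git remote -v
git pull -q upstream master    # fast-forward, no local commits here
git push -q origin master

fork_head=$(git --git-dir="$work/fork.git" rev-parse master)
upstream_head=$(git --git-dir="$work/upstream.git" rev-parse master)
echo "fork:     $fork_head"
echo "upstream: $upstream_head"
```

If both lines show the same commit hash, the fork has caught up with upstream.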

Using Git’s sparse checkout and shallow clone/pull to work more efficiently

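I first ran into these features while working on cdnjs. They don’t seem to come up much in everyday use, so many people don’t know they exist. This is partly a note to myself, and something I can point people to when they ask ...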

Let’s start with shallow clone/pull:

The description in man git-clone:

Create a shallow clone with a history truncated to the specified number of revisions.

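Simply put, history that is too old to be needed is thrown away: commits older than the given number are ignored, which saves bandwidth, disk space and time when cloning. The effect is very noticeable in repositories with thousands to tens of thousands of commits or more. For example, travis-ci uses a default clone depth of 50 for CI builds (long ago it was 100). The downsides: git log only shows a limited number of commits, and features that need to trace earlier history, such as git blame and git bisect, become unreliable or unusable.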
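The truncation is easy to observe on a small throwaway repository. This is a minimal sketch, assuming a local repository named `full` with five commits (both the name and the contents are invented for the demo). A `file://` URL is used because git ignores `--depth` when cloning from a plain local path.

```shell
#!/bin/sh
# Sketch: compare the visible history of a full and a shallow clone.
# Repository names and contents are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A repository with five commits
git init -q full
git -C full symbolic-ref HEAD refs/heads/master
for i in 1 2 3 4 5; do
    echo "$i" > full/file.txt
    git -C full add file.txt
    git -C full commit -q -m "commit $i"
done

# Keep only the two most recent commits
git clone -q --depth 2 "file://$work/full" shallow

full_count=$(git -C full rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
is_shallow=$(git -C shallow rev-parse --is-shallow-repository)
echo "full: $full_count commits, shallow: $shallow_count commits"
echo "shallow repository: $is_shallow"
```

Running `git log` inside `shallow` would stop at the oldest commit that was fetched, which is why blame and bisect cannot see further back.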

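The other one is sparse-checkout, which checks out only the files we want. Take cdnjs as an example: the .git directory is only a little over 600MB, yet the whole project directory is around 13GB. Most of the files are highly compressible source code (plain text), so .git takes little space while the checked-out project is huge. A project this large can also suffer from slow file system operations, especially during operations like rebase. When we know we only need certain directories or files from a project, there is no need to check out everything, and that is where sparse-checkout comes in. It is very handy when sending a pull request to a project you don’t regularly contribute to!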


  1. Create an empty git repository:
    $ git init new.cdnjs && cd new.cdnjs

  2. Enable sparse-checkout in the repository:
    $ git config core.sparseCheckout true

  3. Specify which files to check out (write the rules straight into .git/info/sparse-checkout, one rule per line); for example, I only want everything under ajax/libs/jquery/:
    $ echo '/ajax/libs/jquery/*' >> .git/info/sparse-checkout

  4. Set the remote (where to clone/pull from):
    $ git remote add origin git://

  5. Then start pulling (the shallow pull described earlier can be used here by adding --depth=n):
    $ git pull origin master
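The five steps can be rehearsed against a small local repository before trying them on something as large as cdnjs. The sketch below makes several assumptions: the `source` repository, its `ajax/libs/other` directory and the file contents are all invented stand-ins, and a `file://` URL replaces the real remote so that `--depth` is honored.

```shell
#!/bin/sh
# Sketch: sparse checkout combined with a shallow pull.
# The source repository and its contents are hypothetical stand-ins.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in source repository with two library directories
git init -q source
git -C source symbolic-ref HEAD refs/heads/master
mkdir -p source/ajax/libs/jquery source/ajax/libs/other
echo jq > source/ajax/libs/jquery/jquery.js
echo other > source/ajax/libs/other/other.js
git -C source add ajax
git -C source commit -q -m "add libraries"

# Step 1: empty repository
git init -q new.cdnjs && cd new.cdnjs
git symbolic-ref HEAD refs/heads/master
# Step 2: enable sparse checkout
git config core.sparseCheckout true
# Step 3: choose what to check out
mkdir -p .git/info
echo '/ajax/libs/jquery/*' >> .git/info/sparse-checkout
# Step 4: set the remote
git remote add origin "file://$work/source"
# Step 5: pull, here with a shallow depth of 1
git pull -q --depth=1 origin master

# List what actually landed in the working tree
checked_out=$(find ajax -type f)
echo "$checked_out"
```

Only the files matching the rule are written to disk; the other directory is still tracked by git but is not checked out.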

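That’s it. The project should now take up far less space. Here is the disk usage with jquery and a shallow clone depth of 10, first for the sparse checkout and then for a full checkout: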
$ du -d 1 -h

18M ./ajax
587M ./.git
605M .


$ du -d 1 -h

682M ./.git
43M ./scratch
16M ./node_modules
12G ./ajax
24K ./test
32K ./build
13G .

A whopping 13GB ...

Well ... with 12GB less to check out, things are a lot faster ...


To change the checked-out files afterwards, just edit the .git/info/sparse-checkout file in the project and run git reset --hard once when you’re done.
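Here is a small sketch of changing the rules afterwards, again on invented stand-in repositories (`source`, `sparse` and the `other` library are assumptions for the demo, not part of cdnjs). Note that `git reset --hard` also discards uncommitted changes, so it is best run on a clean working tree.

```shell
#!/bin/sh
# Sketch: add a sparse-checkout rule later and apply it with reset --hard.
# Repository names and contents are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in source repository with two library directories
git init -q source
git -C source symbolic-ref HEAD refs/heads/master
mkdir -p source/ajax/libs/jquery source/ajax/libs/other
echo jq > source/ajax/libs/jquery/jquery.js
echo other > source/ajax/libs/other/other.js
git -C source add ajax
git -C source commit -q -m "add libraries"

# Sparse checkout limited to jquery at first
git init -q sparse && cd sparse
git symbolic-ref HEAD refs/heads/master
git config core.sparseCheckout true
mkdir -p .git/info
echo '/ajax/libs/jquery/*' >> .git/info/sparse-checkout
git remote add origin "file://$work/source"
git pull -q origin master
before=$(find ajax -type f | sort)

# Add a second rule, then re-apply the rules to the working tree
echo '/ajax/libs/other/*' >> .git/info/sparse-checkout
git reset -q --hard
after=$(find ajax -type f | sort)

echo "before:"; echo "$before"
echo "after:";  echo "$after"
```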