每天推薦一個 GitHub 優質開源項目和一篇精選英文科技或編程文章原文,歡迎關注開源日報。交流QQ群:202790710;微博:https://weibo.com/openingsource;電報群 https://t.me/OpeningSourceOrg


今日推薦開源項目:《超級備忘錄——Awesome-Cheatsheets

推薦理由:你可以把Awesome-Cheatsheets看作是一個備忘錄,其實它的作用與備忘錄很相似,但它又不止是一個備忘錄,姑且叫「超級備忘錄」好了!

開源周報2018年第7期:為你寫詩,為你無所不知

這是一個存儲了現在流行的編程語言,框架和開發工具的使用技巧和知識的地方,記錄的許多人在使用一些語言,框架和工具的途中積累下來的知識於技巧,比如JavaScript,Bash, Node.js等。

Awesome-Cheatsheets就是這樣一個工具。我們在學習一門新技巧,新語言時,許多人都會選擇做一個Cheatsheets,隨著時間的積累,我們積累的東西也會越來越多,Awesome-Cheatshoots系統的存儲記錄了這些零碎的知識。使用它將會是一個不錯的體驗。

Awesome-Cheatsheets現在還在繼續完善中,不斷收集在學習各種語言,框架,工具的知識,我想它一定會滿足我們對知識的需求。


今日推薦英文原文:《Introducing Git protocol version 2》作者:Brandon Williams, Git Core Team

原文鏈接:https://opensource.googleblog.com/2018/05/introducing-git-protocol-version-2.html

推薦理由:Google 正式在其官方 Blog 宣布 Git protocal version 2,這個版本最主要的變化是客戶端與伺服器端一些 clone、fetch、push等一些操作的協議。

Introducing Git protocol version 2

Today we announce Git protocol version 2, a major update of Git's wire protocol (how clones, fetches and pushes are communicated between clients and servers). This update removes one of the most inefficient parts of the Git protocol and fixes an extensibility bottleneck, unblocking the path to more wire protocol improvements in the future.

The protocol version 2 spec can be found here. The main improvements are:

The main motivation for the new protocol was to enable server side filtering of references (branches and tags). Prior to protocol v2, servers responded to all fetch commands with an initial reference advertisement, listing all references in the repository. This complete listing is sent even when a client only cares about updating a single branch, e.g.: `git fetch origin master`. For repositories that contain 100s of thousands of references (the Chromium repository has over 500k branches and tags) the server could end up sending 10s of megabytes of data that get ignored. This typically dominates both time and bandwidth during a fetch, especially when you are updating a branch that's only a few commits behind the remote, or even when you are only checking if you are up-to-date, resulting in a no-op fetch.

We recently rolled out support for protocol version 2 at Google and have seen a performance improvement of 3x for no-op fetches of a single branch on repositories containing 500k references. Protocol v2 has also enabled a reduction of 8x of the overhead bytes (non-packfile) sent from googlesource.com servers. A majority of this improvement is due to filtering references advertised by the server to the refs the client has expressed interest in.

Getting over the hurdles

The Git project has tried on a number of occasions over the years to either limit the initial ref advertisement or move to a new protocol altogether but continued to run into two problems: (1) the initial request is rigid and does not include a field that could be used to request that new servers modify their response without breaking compatibility with existing servers and (2) error handling is not well enough defined to allow safely using a new protocol that existing servers do not understand with a quick fallback to the old protocol. To migrate to a new protocol version, we needed to find a side channel which existing servers would ignore but could be used to safely communicate with newer servers.

There are three main transports that are used to speak Git』s wire-protocol (git://, ssh://, and https://), and the side channel that we use to request v2 needs to communicate in such a way that an older server would ignore any additional data sent and not crash. The http transport was the easiest as we can simply include an additional http header in the request (「Git-Protocol: version=2」). The ssh transport is a bit more difficult as it requires sending an environment variable (「GIT_PROTOCOL=version=2」) to be set on the remote end. This is more challenging because it requires server administrators to configure sshd to accept the new environment variable on their server. The most difficult transport is the anonymous Git transport (git://).

Initial requests made to a server using the anonymous Git transport are made in the form of a single packet-line which includes the requested service (git-upload-pack for fetches and git-receive-pack for pushes), and the repository followed by a NUL byte. Later virtualization support was added and a hostname parameter could be tacked on and  terminated by a NUL byte: `0033git-upload-pack /project.git\0host=myserver.com\0`. Ideally we』d be able to add a new parameter to be used to request v2 by adding it in the same manner as the hostname was added: `003dgit-upload-pack /project.git\0host=myserver.com\0version=2\0`. Unfortunately due to a bug introduced in 2006 we aren't able to place any extra arguments (separated by NULs) other than the host because otherwise the parsing of those arguments would enter an infinite loop. When this bug was fixed in 2009, a check was put in place to disallow extra arguments so that new clients wouldn't trigger this bug in older servers.

Fortunately, that check doesn't notice if we send additional request arguments hidden behind a second NUL byte, which was pointed out back in 2009.  This allows requests structured like: `003egit-upload-pack /project.git\0host=myserver.com\0\0version=2\0`. By placing version information behind a second NUL byte we can skirt around both the infinite loop bug and the explicit disallowal of extra arguments besides hostname. Only newer servers will know to look for additional information hidden behind two NUL bytes and older servers won』t croak.

Now, in every case, a client can issue a request to use v2, using a transport-specific side channel, and v2 servers can respond using the new protocol while older servers will ignore the side channel and just respond with a ref advertisement.

Try it for yourself

To try out protocol version 2 for yourself you'll need an up to date version of Git (support for v2 was recently merged to Git's master branch and is expected to be part of Git 2.18) and a v2 enabled server (repositories on googlesource.com and Cloud Source Repositories are v2 enabled). If you enable tracing and run the `ls-remote` command querying for a single branch, you can see the server sends a much smaller set of references when using protocol version 2:

# Using the original wire protocol

GIT_TRACE_PACKET=1 git -c protocol.version=0 ls-remote https://chromium.googlesource.com/chromium/src.git master

# Using protocol version 2

GIT_TRACE_PACKET=1 git -c protocol.version=2 ls-remote https://chromium.googlesource.com/chromium/src.git master


每天推薦一個 GitHub 優質開源項目和一篇精選英文科技或編程文章原文,歡迎關注開源日報。交流QQ群:202790710;電報群 https://t.me/OpeningSourceOrg