Git Wire Protocol, Version 2
==============================

This document presents a specification for a version 2 of Git's wire
protocol.  Protocol v2 will improve upon v1 in the following ways:

  * Instead of multiple service names, multiple commands will be
    supported by a single service
  * Easily extendable as capabilities are moved into their own section
    of the protocol, no longer being hidden behind a NUL byte and
    limited by the size of a pkt-line (as there will be a single
    capability per pkt-line)
  * Separate out other information hidden behind NUL bytes (e.g. agent
    string as a capability and symrefs can be requested using 'ls-refs')
  * Reference advertisement will be omitted unless explicitly requested
  * ls-refs command to explicitly request some refs
  * Designed with http and stateless-rpc in mind.  With clear flush
    semantics the http remote helper can simply act as a proxy.

Detailed Design
=================

A client can request to speak protocol v2 by sending `version=2` in the
side-channel `GIT_PROTOCOL` in the initial request to the server.
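
How the side channel is carried depends on the transport. As a minimal
sketch, assuming a transport where the server is a locally spawned process
and `GIT_PROTOCOL` reaches it as an environment variable, a client could
prepare the environment as follows (`v2_environment` is a hypothetical
helper name, not part of Git):

```python
import os


def v2_environment(base_env=None):
    """Return a copy of the environment that asks the server for protocol v2."""
    # Copy so the caller's (or the process-wide) environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_PROTOCOL"] = "version=2"
    return env
```

The returned mapping can then be passed as the `env` argument when spawning
the server process.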

In protocol v2 communication is command oriented.  When first contacting a
server a list of capabilities will be advertised.  Some of these capabilities
will be commands which a client can request be executed.  Once a command
has completed, a client can reuse the connection and request that other
commands be executed.

Special Packets
-----------------

In protocol v2 these special packets will have the following semantics:

  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
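
The following minimal sketch shows how these two packets sit alongside
ordinary pkt-lines.  It assumes the pkt-line framing used by protocol v1
(four hex digits giving the total length, including the four length bytes
themselves); `pkt_line`, `read_pkt`, `FLUSH_PKT` and `DELIM_PKT` are
illustrative names, not part of Git.

```python
FLUSH_PKT = b"0000"  # indicates the end of a message
DELIM_PKT = b"0001"  # separates sections of a message


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4  # the length field counts its own 4 bytes
    if length > 65520:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % length + payload


def read_pkt(buf: bytes, pos: int = 0):
    """Return (kind, payload, next_pos); kind is 'flush', 'delim' or 'data'."""
    length = int(buf[pos:pos + 4], 16)
    if length == 0:
        return "flush", b"", pos + 4
    if length == 1:
        return "delim", b"", pos + 4
    if length < 4:
        raise ValueError("invalid pkt-line length %d" % length)
    return "data", buf[pos + 4:pos + length], pos + length
```

Because a data pkt-line always has a length of at least four, the values
`0000` and `0001` can never be confused with a line carrying a payload.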

Capability Advertisement
--------------------------

A server which decides to communicate (based on a request from a client)
using protocol version 2, notifies the client by sending a version string
in its initial response followed by an advertisement of its capabilities.
Each capability is a key with an optional value.  Clients must ignore all
unknown keys.  Semantics of unknown values are left to the definition of
each key.  Some capabilities will describe commands which the client can
request to be executed.

    capability-advertisement = protocol-version
			       capability-list
			       flush-pkt

    protocol-version = PKT-LINE("version 2" LF)
    capability-list = *capability
    capability = PKT-LINE(key[=value] LF)

    key = 1*CHAR
    value = 1*CHAR
    CHAR = ALPHA / DIGIT / "-" / "_"
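
A client-side reading of this grammar might look like the following minimal
sketch.  `read_pkt_lines` and `parse_advertisement` are hypothetical helper
names, and the sketch assumes the whole advertisement is already available
in memory as bytes.

```python
def read_pkt_lines(buf: bytes):
    """Yield each pkt-line payload in buf; None stands for a flush-pkt."""
    pos = 0
    while pos < len(buf):
        length = int(buf[pos:pos + 4], 16)
        if length == 0:
            yield None
            pos += 4
        elif length < 4:
            raise ValueError("unexpected special packet %04x" % length)
        else:
            yield buf[pos + 4:pos + length]
            pos += length


def parse_advertisement(buf: bytes) -> dict:
    """Return {key: value-or-None} for a v2 capability advertisement."""
    lines = read_pkt_lines(buf)
    if next(lines, None) != b"version 2\n":
        raise ValueError("server did not answer with protocol version 2")
    caps = {}
    for line in lines:
        if line is None:  # flush-pkt ends the advertisement
            return caps
        key, sep, value = line.rstrip(b"\n").decode().partition("=")
        caps[key] = value if sep else None
    raise ValueError("advertisement not terminated by a flush-pkt")
```

Unknown keys are simply kept in the dictionary; a client that does not
recognise a key can ignore it by never looking it up.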

A client then responds to select the command it wants with any particular
capabilities or arguments.  There is then an optional section where the
client can provide any command specific parameters or queries.

    command-request = command
		      capability-list
		      [command-args]
		      flush-pkt
    command = PKT-LINE("command=" key LF)
    command-args = delim-pkt
		   *arg
    arg = 1*CHAR

The server will then check that the client's request consists of a valid
command as well as valid capabilities which were advertised.  If the
request is valid the server will then execute the command.
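
As a minimal sketch of both sides of this exchange, the code below builds a
command request and performs the kind of check described above.  The helper
names are hypothetical, each argument is assumed to be sent in its own
pkt-line, and `advertised` is assumed to be a mapping of advertised keys to
their values.

```python
FLUSH_PKT = b"0000"
DELIM_PKT = b"0001"


def pkt_line(text: str) -> bytes:
    """Frame one LF-terminated line as a pkt-line."""
    data = text.encode() + b"\n"
    return b"%04x" % (len(data) + 4) + data


def build_request(command, capabilities=(), args=()) -> bytes:
    """Serialize: command, capability-list, optional args, flush-pkt."""
    out = [pkt_line("command=" + command)]
    out += [pkt_line(cap) for cap in capabilities]
    if args:  # the delim-pkt is only present when arguments follow
        out.append(DELIM_PKT)
        out += [pkt_line(arg) for arg in args]
    out.append(FLUSH_PKT)
    return b"".join(out)


def is_valid_request(command, capabilities, advertised) -> bool:
    """True if the command and every capability key were advertised."""
    if command not in advertised:
        return False
    return all(cap.partition("=")[0] in advertised for cap in capabilities)
```

A request without arguments therefore consists of the command line, any
capability lines, and the closing flush-pkt only.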

When a command has finished a client can either request that another
command be executed or can terminate the connection by sending an empty
request consisting of just a flush-pkt.

Capabilities
~~~~~~~~~~~~~~

There are two different types of capabilities: normal capabilities,
which can be used to convey information or alter the behavior of a
request, and command capabilities, which are the core actions that a
client wants to perform (fetch, push, etc.).

All commands must only last a single round and be stateless from the
perspective of the server side.  All state MUST be retained and managed
by the client process.  This permits simple round-robin load-balancing
on the server side, without needing to worry about state management.

Clients MUST NOT require state management on the server side in order to
function correctly.

agent
-------

The server can advertise the `agent` capability with a value `X` (in the
form `agent=X`) to notify the client that the server is running version
`X`.  The client may optionally send its own agent string by including
the `agent` capability with a value `Y` (in the form `agent=Y`) in its
request to the server (but it MUST NOT do so if the server did not
advertise the agent capability). The `X` and `Y` strings may contain any
printable ASCII characters except space (i.e., the byte range 32 < x <
127), and are typically of the form "package/version" (e.g.,
"git/1.8.3.1"). The agent strings are purely informative for statistics
and debugging purposes, and MUST NOT be used to programmatically assume
the presence or absence of particular features.
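
The two rules above (the permitted byte range, and sending `agent` only when
the server advertised it) can be sketched as follows.  The function names
are hypothetical, and `advertised` is assumed to be a mapping of the keys
the server advertised.

```python
def is_valid_agent(value: str) -> bool:
    """True if value is non-empty printable ASCII without spaces (32 < x < 127)."""
    return len(value) > 0 and all(32 < ord(ch) < 127 for ch in value)


def client_agent_line(advertised, own_agent: str):
    """Return the capability line to send, or None if agent must not be sent."""
    if "agent" not in advertised:
        return None  # MUST NOT send agent if the server did not advertise it
    if not is_valid_agent(own_agent):
        raise ValueError("agent string contains forbidden characters")
    return "agent=" + own_agent
```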

ls-refs
---------

`ls-refs` is the command used to request a reference advertisement in v2.
Unlike the current reference advertisement, ls-refs takes in parameters
which can be used to limit the refs sent from the server.

Additional features not supported in the base command will be advertised
as the value of the command in the capability advertisement in the form
of a space separated list of features, e.g.  "<command>=<feature 1>
<feature 2>".

ls-refs takes in the following parameters wrapped in packet-lines:

    symrefs
	When showing a symbolic ref, show the underlying ref it points to
	in addition to the object it points to.
    peel
	Show peeled tags.
    ref-pattern <pattern>
	When specified, only references matching one of the provided
	patterns are displayed.

The output of ls-refs is as follows:

    output = *ref
	     flush-pkt
    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
    ref-attribute = (symref | peeled)
    symref = "symref-target:" symref-target
    peeled = "peeled:" obj-id
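
A minimal sketch of parsing one `ref` line from this output is shown below.
`parse_ref` is a hypothetical helper, and the sketch assumes the pkt-line
framing has already been removed so that it receives the line as text.

```python
def parse_ref(line: str) -> dict:
    """Split 'obj-id SP refname *(SP ref-attribute) LF' into a dictionary."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) < 2:
        raise ValueError("ref line needs an obj-id and a refname")
    ref = {"oid": fields[0], "name": fields[1]}
    for attribute in fields[2:]:
        # Attributes look like 'symref-target:<ref>' or 'peeled:<obj-id>'.
        key, _, value = attribute.partition(":")
        ref[key] = value
    return ref
```

Splitting on the first colon only keeps any later colons inside the
attribute value intact.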

fetch
-------

`fetch` is the command used to fetch a packfile in v2.  It can be looked
at as a modified version of the v1 fetch where the ref-advertisement is
stripped out (since the `ls-refs` command fills that role) and the
message format is tweaked to eliminate redundancies and permit easy
addition of future extensions.

Additional features not supported in the base command will be advertised
as the value of the command in the capability advertisement in the form
of a space separated list of features, e.g.  "<command>=<feature 1>
<feature 2>".

A `fetch` request can take the following parameters wrapped in
packet-lines:

    want <oid>
	Indicates to the server an object which the client wants to
	retrieve.

    have <oid>
	Indicates to the server an object which the client has locally.
	This allows the server to make a packfile which only contains
	the objects that the client needs. Multiple 'have' lines can be
	supplied.

    done
	Indicates to the server that negotiation should terminate (or
	not even begin if performing a clone) and that the server should
	use the information supplied in the request to construct the
	packfile.

    thin-pack
	Request that a thin pack be sent, which is a pack with deltas
	which reference base objects not contained within the pack (but
	are known to exist at the receiving end). This can reduce the
	network traffic significantly, but it requires the receiving end
	to know how to "thicken" these packs by adding the missing bases
	to the pack.

    no-progress
	Request that progress information that would normally be sent on
	side-band channel 2, during the packfile transfer, should not be
	sent.  However, the side-band channel 3 is still used for error
	responses.

    include-tag
	Request that annotated tags should be sent if the objects they
	point to are being sent.

    ofs-delta
	Indicate that the client understands PACKv2 with delta referring
	to its base by position in pack rather than by an oid.  That is,
	they can read OBJ_OFS_DELTA (aka type 6) in a packfile.

    shallow <oid>
	A client must notify the server of all objects of which it only
	has shallow copies (meaning that it doesn't have the parents of
	a commit) by supplying a 'shallow <oid>' line for each such
	object so that the server is aware of the limitations of the
	client's history.

    deepen <depth>
	Request that the fetch/clone should be shallow having a commit depth of
	<depth> relative to the remote side.

    deepen-relative
	Requests that the semantics of the "deepen" command be changed
	to indicate that the depth requested is relative to the client's
	current shallow boundary, instead of relative to the remote
	refs.

    deepen-since <timestamp>
	Requests that the shallow clone/fetch should be cut at a
	specific time, instead of depth.  Internally it's equivalent to
	doing "rev-list --max-age=<timestamp>". Cannot be used with
	"deepen".

    deepen-not <rev>
	Requests that the shallow clone/fetch should be cut at a
	specific revision specified by '<rev>', instead of a depth.
	Internally it's equivalent to doing "rev-list --not <rev>".
	Cannot be used with "deepen", but can be used with
	"deepen-since".

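The parameters above can be assembled into argument lines as in the
following minimal sketch.  `fetch_args` is a hypothetical helper, the order
in which it emits the lines is an assumption (this document does not
prescribe one), and the only validation it performs is the stated rule that
"deepen" cannot be combined with "deepen-since" or "deepen-not".

```python
def fetch_args(wants, haves=(), shallow=(), deepen=None, deepen_relative=False,
               deepen_since=None, deepen_not=None, options=(), done=False):
    """Return the list of argument lines for a `fetch` request."""
    if deepen is not None and (deepen_since is not None or deepen_not is not None):
        raise ValueError("deepen cannot be used with deepen-since or deepen-not")
    args = ["want " + oid for oid in wants]
    args += ["have " + oid for oid in haves]
    args += ["shallow " + oid for oid in shallow]
    if deepen is not None:
        args.append("deepen %d" % deepen)
    if deepen_relative:
        args.append("deepen-relative")
    if deepen_since is not None:
        args.append("deepen-since %d" % deepen_since)
    if deepen_not is not None:
        args.append("deepen-not " + deepen_not)
    # Flag-style parameters: thin-pack, no-progress, include-tag, ofs-delta.
    args += list(options)
    if done:
        args.append("done")
    return args
```

Each returned string would then be wrapped in its own packet-line and sent
after the delimiter packet of the request.
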
The response of `fetch` is broken into a number of sections separated by
delimiter packets (0001), with each section beginning with its section
header.

    output = *section
    section = (acknowledgments | shallow-info | packfile)
	      (flush-pkt | delim-pkt)

    acknowledgments = PKT-LINE("acknowledgments" LF)
		      *(ready | nak | ack)
    ready = PKT-LINE("ready" LF)
    nak = PKT-LINE("NAK" LF)
    ack = PKT-LINE("ACK" SP obj-id LF)

    shallow-info = PKT-LINE("shallow-info" LF)
		   *PKT-LINE((shallow | unshallow) LF)
    shallow = "shallow" SP obj-id
    unshallow = "unshallow" SP obj-id

    packfile = PKT-LINE("packfile" LF)
	       [PACKFILE]
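
As a minimal sketch of how a client might read this grammar, the code below
splits a complete response into its sections and interprets the
acknowledgments section.  The function names are hypothetical, and the
sketch assumes the entire response has been buffered in memory, which a real
client would avoid for large packfiles.

```python
def read_pkts(buf: bytes):
    """Yield ('flush' | 'delim' | 'data', payload) for each packet in buf."""
    pos = 0
    while pos < len(buf):
        length = int(buf[pos:pos + 4], 16)
        if length == 0:
            yield "flush", b""
            pos += 4
        elif length == 1:
            yield "delim", b""
            pos += 4
        elif length < 4:
            raise ValueError("invalid pkt-line length %d" % length)
        else:
            yield "data", buf[pos + 4:pos + length]
            pos += length


def split_sections(buf: bytes) -> dict:
    """Map each section header to the list of payloads that follow it."""
    sections, current = {}, None
    for kind, payload in read_pkts(buf):
        if kind == "flush":  # end of the response
            break
        if kind == "delim":  # the next data packet is a section header
            current = None
        elif current is None:
            current = payload.rstrip(b"\n").decode()
            sections[current] = []
        else:
            sections[current].append(payload)
    return sections


def parse_acknowledgments(lines):
    """Return (acked_oids, nak, ready) for an acknowledgments section."""
    acks, nak, ready = [], False, False
    for line in lines:
        text = line.rstrip(b"\n").decode()
        if text == "NAK":
            nak = True
        elif text == "ready":
            ready = True
        elif text.startswith("ACK "):
            acks.append(text[4:])
        else:
            raise ValueError("unexpected acknowledgments line: %r" % text)
    if nak and acks:
        raise ValueError("a response cannot contain both ACK and NAK lines")
    return acks, nak, ready
```

Payloads in the packfile section are left as raw bytes, since they carry
multiplexed binary data rather than text lines.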

----
    acknowledgments section
	* Always begins with the section header "acknowledgments"

	* The server will respond with "NAK" if none of the object ids sent
	  as have lines were common.

	* The server will respond with "ACK obj-id" for all of the
	  object ids sent as have lines which are common.

	* A response cannot have both "ACK" lines as well as a "NAK"
	  line.

	* The server will respond with a "ready" line indicating that
	  the server has found an acceptable common base and is ready to
	  make and send a packfile (which will be found in the packfile
	  section of the same response)

	* If the client determines that it is finished with negotiations
	  by sending a "done" line, the acknowledgments section can be
	  omitted from the server's response as an optimization.

	* If the server has found a suitable cut point and has decided
	  to send a "ready" line, then the server can decide to (as an
	  optimization) omit any "ACK" lines it would have sent during
	  its response.  This is because the server will have already
	  determined the objects it plans to send to the client and no
	  further negotiation is needed.

----
    shallow-info section
	If the client has requested a shallow fetch/clone, a shallow
	client requests a fetch, or the server is shallow, then the
	server's response may include a shallow-info section.  The
	shallow-info section will be included if (due to one of the above
	conditions) the server needs to inform the client of any shallow
	boundaries or adjustments to the client's already existing
	shallow boundaries.

	* Always begins with the section header "shallow-info"

	* If a positive depth is requested, the server will compute the
	  set of commits which are no deeper than the desired depth.

	* The server sends a "shallow obj-id" line for each commit whose
	  parents will not be sent in the following packfile.

	* The server sends an "unshallow obj-id" line for each commit
	  which the client has indicated is shallow, but is no longer
	  shallow as a result of the fetch (due to its parents being
	  sent in the following packfile).

	* The server MUST NOT send any "unshallow" lines for anything
	  which the client has not indicated was shallow as a part of
	  its request.

	* This section is only included if a packfile section is also
	  included in the response.
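
The effect of this section on the client can be sketched as an update to
its set of shallow commits.  `apply_shallow_info` is a hypothetical helper,
and the sketch assumes the lines have already been extracted from their
pkt-lines as text.

```python
def apply_shallow_info(lines, client_shallow):
    """Return the client's new shallow set after applying a shallow-info section."""
    result = set(client_shallow)
    for line in lines:
        keyword, _, oid = line.rstrip("\n").partition(" ")
        if keyword == "shallow":
            result.add(oid)
        elif keyword == "unshallow":
            # The server MUST NOT unshallow anything the client did not
            # report as shallow in its request.
            if oid not in client_shallow:
                raise ValueError("%s was never reported as shallow" % oid)
            result.discard(oid)
        else:
            raise ValueError("unexpected shallow-info line: %r" % line)
    return result
```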

----
    packfile section
	* Always begins with the section header "packfile"

	* The transmission of the packfile begins immediately after the
	  section header

	* The data transfer of the packfile is always multiplexed, using
	  the same semantics as the 'side-band-64k' capability from
	  protocol version 1.  This means that each packet, during the
	  packfile data stream, is made up of a leading 4-byte pkt-line
	  length (typical of the pkt-line format), followed by a 1-byte
	  stream code, followed by the actual data.

	  The stream code can be one of:
		1 - pack data
		2 - progress messages
		3 - fatal error message just before stream aborts

	* This section is only included if the client has sent 'want'
	  lines in its request and either requested that no more
	  negotiation be done by sending 'done' or if the server has
	  decided it has found a sufficient cut point to produce a
	  packfile.
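
A minimal sketch of demultiplexing the packfile section according to the
stream codes listed above follows.  `demux_sideband` is a hypothetical
helper that takes the pkt-line payloads following the "packfile" header,
with the 4-byte length prefix already stripped.

```python
def demux_sideband(payloads):
    """Return (pack_bytes, progress_messages); raise on a fatal error packet."""
    pack, progress = bytearray(), []
    for payload in payloads:
        if not payload:
            raise ValueError("side-band packet without a stream code")
        code, data = payload[0], payload[1:]
        if code == 1:  # pack data
            pack += data
        elif code == 2:  # progress messages
            progress.append(data.decode(errors="replace"))
        elif code == 3:  # fatal error message just before the stream aborts
            raise RuntimeError("remote error: " + data.decode(errors="replace"))
        else:
            raise ValueError("unknown stream code %d" % code)
    return bytes(pack), progress
```

Pack data spread over several packets is concatenated in arrival order,
while progress messages are kept separate so they can be shown to the user.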