The smart protocol provides a way to send requests and corresponding
responses to communicate with a remote bzr process.

At the bottom level there is either a socket, pipes, or an HTTP
request/response. We call this layer the *medium*. It is responsible for
carrying bytes between a client and server. With sockets, multiple requests
can be sent over one connection, and a read error occurs when the other
side has shut down. With pipes, a zero-length read on the read pipe marks
end-of-file. In an HTTP server environment there is no end-of-stream,
because each request coming into the server is independent.

So we need a wrapper around pipes and sockets to separate out requests
from the substrate. This gives us a single model which is consistent
for HTTP, sockets and pipes.

On top of the medium is the *protocol*. This is the layer that
deserialises bytes into the structured data that requests and responses

Request/Response processing
---------------------------

On top of the protocol is the logic for processing requests (on the
server) or responses (on the client).

MEDIUM (factory for protocol, reads bytes & pushes to protocol,
uses protocol to detect end-of-request, sends written
bytes to client) e.g. socket, pipe, HTTP request handler.

PROTOCOL (serialization, deserialization) accepts bytes for one
request, decodes according to internal state, pushes
structured data to handler. Accepts structured data from
handler and encodes and writes to the medium. factory for

HANDLER (domain logic) accepts structured data, operates state
machine until the request can be satisfied,
sends structured data to the protocol.

Request handlers are registered in the `bzrlib.smart.request` module.

CLIENT domain logic, accepts domain requests, generates structured
data, reads structured data from responses and turns it into
domain data. Sends structured data to the protocol.
Operates state machines until the request can be delivered
(e.g. reading from a bundle generated in bzrlib to deliver a

This is RemoteBzrDir, RemoteRepository, etc.

PROTOCOL (serialization, deserialization) accepts structured data for one
request, encodes and writes to the medium. Reads bytes from the
medium, decodes and allows the client to read structured data.

MEDIUM accepts bytes from the protocol & delivers to the remote server.
Allows the protocol to read bytes e.g. socket, pipe, HTTP request.

The domain logic is in `bzrlib.remote`: `RemoteBzrDir`, `RemoteBranch`,

There is also a plain file-level transport that calls remote methods to
manipulate files on the server in `bzrlib.transport.remote`.

Version one of the protocol was introduced in Bazaar 0.11.

The protocol (for both requests and responses) is described by::

  REQUEST := MESSAGE_V1
  RESPONSE := MESSAGE_V1
  MESSAGE_V1 := ARGS [BODY]

  ARGS := ARG [MORE_ARGS] NEWLINE
  MORE_ARGS := SEP ARG [MORE_ARGS]

  BODY := LENGTH NEWLINE BODY_BYTES TRAILER
  LENGTH := decimal integer
  TRAILER := "done" NEWLINE

That is, a tuple of arguments separated by Ctrl-A and terminated with a
newline, followed by a length-prefixed body with a constant trailer. Note
that although arguments are not 8-bit safe (they cannot include 0x01 or
0x0a bytes without breaking the protocol encoding), the body is.

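As an illustration of the version one grammar, the following is a minimal
sketch of an encoder. The function name ``encode_v1_message`` and the
argument values in the usage note are hypothetical and are not part of
bzrlib; the sketch assumes Python 3 byte strings.

.. code-block:: python

    def encode_v1_message(args, body=None):
        """Encode ARGS [BODY] as described by the version one grammar.

        Hypothetical helper for illustration only; not bzrlib's encoder.
        """
        if not args:
            raise ValueError("at least one argument is required")
        for arg in args:
            # Arguments are not 8-bit safe: 0x01 separates them and 0x0a
            # terminates the argument line.
            if b"\x01" in arg or b"\n" in arg:
                raise ValueError("arguments cannot contain 0x01 or 0x0a")
        message = b"\x01".join(args) + b"\n"
        if body is not None:
            # LENGTH NEWLINE BODY_BYTES TRAILER; the body may hold any bytes.
            length = str(len(body)).encode("ascii")
            message += length + b"\n" + body + b"done\n"
        return message

For example, ``encode_v1_message([b"get", b"foo"], b"ab\ncd")`` produces the
argument line ``get\x01foo\n`` followed by ``5\n``, the five body bytes, and
the ``done\n`` trailer.
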
Version two was introduced in Bazaar 0.16.

The request protocol is::

  REQUEST_V2 := "bzr request 2" NEWLINE MESSAGE_V2

The response protocol is::

  RESPONSE_V2 := "bzr response 2" NEWLINE RESPONSE_STATUS NEWLINE MESSAGE_V2
  RESPONSE_STATUS := "success" | "failed"

Future versions should follow this structure, like version two does::

  FUTURE_MESSAGE := VERSION_STRING NEWLINE REST_OF_MESSAGE

This is so that clients and servers can read bytes up to the first newline
byte to determine what version a message is.

For compatibility with all versions (past and future) of bzr clients,
servers that receive a request in an unknown protocol version should
respond with a single-line error terminated with 0x0a (NEWLINE), rather
than a structured response prefixed with a version string.

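Reading up to the first newline to identify the version could look roughly
like the sketch below. The names ``VERSION_MARKERS`` and
``read_version_marker`` are hypothetical, and only the version two strings
given above are listed. Version one messages carry no version string, so this
sketch reports an unrecognised first line as ``None`` and leaves the decision
(version one arguments, or an unknown version) to the caller.

.. code-block:: python

    import io

    # Version strings taken from the grammar above; the table itself is a
    # hypothetical helper, not a bzrlib structure.
    VERSION_MARKERS = {
        b"bzr request 2": ("request", 2),
        b"bzr response 2": ("response", 2),
    }


    def read_version_marker(stream):
        """Read one line and return (marker_info, raw_line).

        marker_info is a (kind, version) tuple for a recognised version
        string, or None if the line matches no known marker.
        """
        line = stream.readline()
        if not line.endswith(b"\n"):
            raise EOFError("stream ended before the first newline")
        return VERSION_MARKERS.get(line[:-1]), line


    stream = io.BytesIO(b"bzr request 2\nhello\n")
    info, raw_line = read_version_marker(stream)
    # The rest of the stream is left untouched for the version-specific parser.
    remainder = stream.read()
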
Version two of the message protocol is::

  MESSAGE_V2 := ARGS [BODY_V2]
  BODY_V2 := BODY | STREAMED_BODY

That is, a version one length-prefixed body, or a version two streamed

Version two with streamed bodies
--------------------------------

An extension to version two allows streamed bodies. A streamed body looks
a lot like HTTP's chunked encoding::

  STREAMED_BODY := "chunked" NEWLINE CHUNKS TERMINATOR
  CHUNKS := CHUNK [CHUNKS]
  CHUNK := HEX_LENGTH CHUNK_CONTENT
  HEX_LENGTH := HEX_DIGITS NEWLINE
  CHUNK_CONTENT := bytes

  TERMINATOR := SUCCESS_TERMINATOR | ERROR_TERMINATOR
  SUCCESS_TERMINATOR := 'END' NEWLINE
  ERROR_TERMINATOR := 'ERR' NEWLINE CHUNKS SUCCESS_TERMINATOR

That is, the body consists of a series of chunks. Each chunk starts with
a length prefix in hexadecimal digits, followed by an ASCII newline byte.
The end of the body is signaled by '``END\n``', or by '``ERR\n``'
followed by error args, one per chunk. Note that these args are 8-bit
safe, unlike request args.

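The streamed body grammar can be sketched as a small encoder. The function
names and the error name in the usage note are hypothetical. Emitting
lowercase hexadecimal digits is an assumption, since the grammar does not
specify the case of ``HEX_DIGITS``.

.. code-block:: python

    def _encode_chunk(content):
        # CHUNK := HEX_LENGTH CHUNK_CONTENT, where HEX_LENGTH ends in a newline.
        return b"%x\n" % len(content) + content


    def encode_streamed_body(chunks, error_args=None):
        """Encode a streamed body; hypothetical helper, not bzrlib's encoder.

        The grammar as written requires at least one chunk; this sketch
        does not enforce that.
        """
        parts = [b"chunked\n"]
        parts.extend(_encode_chunk(chunk) for chunk in chunks)
        if error_args is not None:
            # ERROR_TERMINATOR: 'ERR', then one chunk per error arg, then 'END'.
            parts.append(b"ERR\n")
            parts.extend(_encode_chunk(arg) for arg in error_args)
        parts.append(b"END\n")
        return b"".join(parts)

For instance, ``encode_streamed_body([b"abc"])`` yields
``chunked\n3\nabcEND\n``, and passing
``error_args=[b"SomeError", b"detail"]`` places ``ERR\n`` and two further
chunks before the final ``END\n``.
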
A streamed body starts with the string "chunked" so that legacy clients
and servers will not mistake the first chunk as the start of a version one

The type of body (length-prefixed or chunked) in a response is always the
same for a given request method. Only new request methods introduced in
Bazaar 0.91 and later use streamed bodies.

For some discussion of the requirements that led to this new protocol
version, see `bug #83935`_.

.. _bug #83935: https://bugs.launchpad.net/bzr/+bug/83935

Version three has bencoding of most protocol structures, to make parsing
simpler. For extra parsing convenience, these structures are length

LENGTH_PREFIX := 32-bit unsigned integer in network byte order

Unlike earlier versions, clients and servers are no longer required to
know which request verbs and responses will have bodies attached. Because
of length-prefixing and other changes, it is always possible to know when
a complete request or response has been read, even if the server

The underlying message format is::

  MESSAGE := "bzr message 3 (bzr 1.6)" NEWLINE HEADERS MESSAGE_PARTS
  HEADERS := LENGTH_PREFIX bencoded_dict
  MESSAGE_PARTS := MESSAGE_PART [MORE_MESSAGE_PARTS]
  MORE_MESSAGE_PARTS := END_MESSAGE_PARTS | MESSAGE_PARTS
  END_MESSAGE_PARTS := "e"

  MESSAGE_PART := ONE_BYTE | STRUCTURE | BYTES

  STRUCTURE := "s" LENGTH_PREFIX bencoded_structure
  BYTES := "b" LENGTH_PREFIX bytes

This format allows an arbitrary sequence of message parts to be encoded

Each request and response will have “headers”, a dictionary of key-value pairs.
The keys must be strings, not any other type of value.

Currently, the only defined header is “Software version”. Both the client and
the server should include a “Software version” header, with a value of a
free-form string such as “bzrlib 1.5”, to aid debugging and logging. Clients
and servers **should not** vary behaviour based on this string.

Conventional requests and responses
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By convention, most requests and responses have a simple “arguments plus
optional body” structure, as in earlier protocol versions. This section
describes how such messages are encoded. All requests and responses
defined by earlier protocol versions must be encoded in this way.

Conventional requests will send a sequence of:

* Arguments (a STRUCTURE of a tuple)

* Single body (BYTES), or

* Streamed body (multiple BYTES parts), followed by a status (ONE_BYTE)

  * if status is "E", followed by an Error (STRUCTURE)

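The simplest conventional requests (arguments only, or arguments plus a
single body) can be sketched as follows. Everything named here is
hypothetical: the ``bencode`` function is a minimal stand-in written for this
example rather than bzrlib's implementation, it assumes tuples are bencoded
as lists, and the header value is a placeholder. Streamed bodies and status
bytes are left out.

.. code-block:: python

    import struct


    def bencode(value):
        """Minimal bencoder for bytes, ints, lists/tuples and dicts."""
        if isinstance(value, bytes):
            return b"%d:" % len(value) + value
        if isinstance(value, int):
            return b"i%de" % value
        if isinstance(value, (list, tuple)):
            return b"l" + b"".join(bencode(item) for item in value) + b"e"
        if isinstance(value, dict):
            items = sorted(value.items())
            return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
        raise TypeError("cannot bencode %r" % (value,))


    def length_prefixed(payload):
        # LENGTH_PREFIX: 32-bit unsigned integer in network byte order.
        return struct.pack("!L", len(payload)) + payload


    def encode_conventional_request(args, body=None,
                                    software_version=b"example client 0.1"):
        """Encode arguments plus an optional single body as a v3 message."""
        headers = {b"Software version": software_version}
        parts = [
            b"bzr message 3 (bzr 1.6)\n",
            length_prefixed(bencode(headers)),
            # STRUCTURE part holding the arguments tuple.
            b"s" + length_prefixed(bencode(tuple(args))),
        ]
        if body is not None:
            # BYTES part holding the single body.
            parts.append(b"b" + length_prefixed(body))
        parts.append(b"e")  # END_MESSAGE_PARTS
        return b"".join(parts)

Because every part carries its own length prefix, a reader can tell where the
message ends without knowing whether the request method takes a body.
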
Conventional responses will send a sequence of:

* Arguments (a STRUCTURE of a tuple)

* Single body (BYTES), or

* Streamed body (multiple BYTES parts), followed by a status (ONE_BYTE)

  * if status is "E", followed by an Error (STRUCTURE)

In all cases, the ONE_BYTE status is either "S" for Success or "E" for
Error. Note that the streamed body from version two is now just multiple

For new methods, these sequences are just a convention and may be varied
if appropriate for a particular request or response. However, each
request should at least start with a STRUCTURE encoding the arguments
tuple. The first element of that tuple must be a string that names the
request method. (Note that arguments in this protocol version are
bencoded. As a result, unlike previous protocol versions, arguments in
this version are 8-bit clean.)

For errors (where the Status byte of a response or a streamed body is
"E"), the situation is analogous to requests. The first item in the
encoded sequence must be a string of the error name. The other arguments
supply details about the error, and their number and types will depend on
the type of error (as identified by the error name).

Paths are passed across the network. The client needs to see a namespace
that includes any repository that might need to be referenced, and the
client needs to know about a root directory beyond which it cannot ascend.

Servers run over ssh will typically want to be able to access any path the
user can access. Public servers on the other hand (which might be over
http, ssh or tcp) will typically want to restrict access to only a
particular directory and its children, so will want to do a software
virtual root at that level. In other words they'll want to rewrite
incoming paths to be under that level (and prevent escaping using ``../``
tricks). The default implementation in bzrlib does this using the
`bzrlib.transport.chroot` module.

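The idea of a software virtual root can be sketched as below. This is not how
`bzrlib.transport.chroot` is implemented; ``resolve_under_root`` is a
hypothetical function that only shows the rewrite-and-check step. The check
is purely lexical, so it does not account for symlinks.

.. code-block:: python

    import posixpath


    def resolve_under_root(root, client_path):
        """Rewrite client_path to live under root, refusing ../ escapes.

        Hypothetical helper for illustration; lexical check only.
        """
        root = posixpath.normpath(root)
        # Treat the client's path as relative to the virtual root, even if
        # it begins with "/".
        joined = posixpath.join(root, client_path.lstrip("/"))
        candidate = posixpath.normpath(joined)
        inside = candidate == root or candidate.startswith(root.rstrip("/") + "/")
        if not inside:
            raise ValueError("path escapes the virtual root: %r" % (client_path,))
        return candidate

With a root of ``/srv/bzr``, the client path ``/project/trunk`` is rewritten
to ``/srv/bzr/project/trunk``, while ``../../etc/passwd`` is rejected.
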
URLs that include ~ should probably be passed across to the server
verbatim and the server can expand them. This will probably not be
meaningful when limited to a directory? See `bug 109143`_.

.. _bug 109143: https://bugs.launchpad.net/bzr/+bug/109143

The first argument of a request specifies the request method.

The available request methods are registered in `bzrlib.smart.request`.

**XXX**: ideally the request methods should be documented here.
Contributions welcome!

The first argument of an error response specifies the error type.

One possible error name is ``UnknownMethod``, which means the server does
not recognise the verb used by the client's request. This error was
introduced in version three.

**XXX**: ideally the error types should be documented here. Contributions