..
   2777.4.1 by Andrew Bennetts
   Move HPSS protocol description from bzrlib.smart docstring into doc/developers.

================
Network Protocol
================

:Date: 2007-09-03

.. contents::


Overview
========

The smart protocol provides a way to send requests and receive the
corresponding responses when communicating with a remote bzr process.


Layering
========

Medium
------

At the bottom level there is either a socket, pipes, or an HTTP
request/response.  We call this layer the *medium*.  It is responsible for
carrying bytes between a client and a server.  Each kind of medium signals
the end of the byte stream differently.  A socket can carry multiple
requests, and the stream ends with a read error when the other side shuts
down.  A pipe ends with a zero-length read, which marks end-of-file.  In an
HTTP server environment there is no end-of-stream at all, because each
request coming into the server is independent.

So we need a wrapper around pipes and sockets to separate requests from
the underlying substrate.  This gives us a single model that is consistent
for HTTP, sockets and pipes.


Protocol
--------

On top of the medium is the *protocol*.  This is the layer that serialises
structured data into bytes, and deserialises bytes into the structured data
that requests and responses consist of.


Request/Response processing
---------------------------

On top of the protocol is the logic for processing requests (on the
server) or responses (on the client).


Server-side
-----------

Sketch::

 MEDIUM  (factory for protocol, reads bytes & pushes to protocol,
          uses protocol to detect end-of-request, sends written
          bytes to client) e.g. socket, pipe, HTTP request handler.
  ^
  | bytes.
  v

 PROTOCOL(serialization, deserialization)  accepts bytes for one
          request, decodes according to internal state, pushes
          structured data to handler.  accepts structured data from
          handler and encodes and writes to the medium.  factory for
          handler.
  ^
  | structured data
  v

 HANDLER  (domain logic) accepts structured data, operates state
          machine until the request can be satisfied,
          sends structured data to the protocol.

Request handlers are registered in the `bzrlib.smart.request` module.

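
To make the server-side layering concrete, here is a toy sketch of the three
layers wired together in memory.  It is not bzrlib code: the class names
(``InMemoryMedium``, ``LineProtocol``, ``EchoHandler``) and the ``echo`` verb
are invented for illustration, and the framing is reduced to a single line of
Ctrl-A separated arguments with no body.

```python
class EchoHandler:
    """HANDLER: domain logic working purely on structured data."""

    def handle(self, args):
        # Hypothetical verb: reply "ok" followed by the request's arguments.
        if args[0] == b"echo":
            return (b"ok",) + args[1:]
        return (b"error", b"unknown verb")


class LineProtocol:
    """PROTOCOL: turns bytes into argument tuples and back again."""

    def __init__(self, handler, write):
        self._handler = handler
        self._write = write
        self._buffer = b""
        self.finished = False

    def accept_bytes(self, data):
        self._buffer += data
        if self.finished or b"\n" not in self._buffer:
            return  # end-of-request has not been seen yet
        line, _, self._buffer = self._buffer.partition(b"\n")
        response = self._handler.handle(tuple(line.split(b"\x01")))
        self._write(b"\x01".join(response) + b"\n")
        self.finished = True


class InMemoryMedium:
    """MEDIUM: creates the protocol, pushes bytes in, collects bytes out."""

    def __init__(self, handler):
        self._handler = handler
        self.sent = []

    def serve_one_request(self, incoming_pieces):
        protocol = LineProtocol(self._handler, self.sent.append)
        for piece in incoming_pieces:
            protocol.accept_bytes(piece)
            if protocol.finished:
                break


medium = InMemoryMedium(EchoHandler())
# The request arrives split across two reads, as it might on a socket.
medium.serve_one_request([b"echo\x01h", b"i\n"])
print(medium.sent)
```

The medium never inspects the bytes it carries; it relies on the protocol's
``finished`` flag to learn where the request ends, which mirrors the "uses
protocol to detect end-of-request" note in the sketch above.
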

Client-side
-----------

Sketch::

 CLIENT   domain logic, accepts domain requests, generates structured
          data, reads structured data from responses and turns into
          domain data.  Sends structured data to the protocol.
          Operates state machines until the request can be delivered
          (e.g. reading from a bundle generated in bzrlib to deliver a
          complete request).
          This is RemoteBzrDir, RemoteRepository, etc.
  ^
  | structured data
  v

 PROTOCOL  (serialization, deserialization)  accepts structured data for one
          request, encodes and writes to the medium.  Reads bytes from the
          medium, decodes and allows the client to read structured data.
  ^
  | bytes.
  v

 MEDIUM   accepts bytes from the protocol & delivers to the remote server.
          Allows the protocol to read bytes e.g. socket, pipe, HTTP request.

The domain logic is in `bzrlib.remote`: `RemoteBzrDir`, `RemoteBranch`,
and so on.

There is also a plain file-level transport that calls remote methods to
manipulate files on the server in `bzrlib.transport.remote`.


Protocol description
====================

Version one
-----------

Version one of the protocol was introduced in Bazaar 0.11.

The protocol (for both requests and responses) is described by::

  REQUEST := MESSAGE_V1
  RESPONSE := MESSAGE_V1
  MESSAGE_V1 := ARGS [BODY]

  ARGS := ARG [MORE_ARGS] NEWLINE
  MORE_ARGS := SEP ARG [MORE_ARGS]
  SEP := 0x01

  BODY := LENGTH NEWLINE BODY_BYTES TRAILER
  LENGTH := decimal integer
  TRAILER := "done" NEWLINE

That is, a tuple of arguments separated by Ctrl-A and terminated with a
newline, optionally followed by a length-prefixed body with a constant
trailer.  Note that although arguments are not 8-bit safe (they cannot
include 0x01 or 0x0a bytes without breaking the protocol encoding), the
body is.

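
As an illustration of the version one grammar, the following minimal sketch
encodes and decodes a complete ``MESSAGE_V1``.  The function names are
hypothetical and are not part of bzrlib; the decoder assumes the whole message
is already in memory.

```python
def encode_message_v1(args, body=None):
    """Encode ARGS and an optional BODY as a MESSAGE_V1 byte string."""
    if not args:
        raise ValueError("the grammar requires at least one argument")
    for arg in args:
        if b"\x01" in arg or b"\n" in arg:
            raise ValueError("arguments cannot contain 0x01 or 0x0a")
    message = b"\x01".join(args) + b"\n"
    if body is not None:
        # LENGTH NEWLINE BODY_BYTES TRAILER
        message += str(len(body)).encode("ascii") + b"\n" + body + b"done\n"
    return message


def decode_message_v1(data):
    """Decode a complete MESSAGE_V1 into (args, body); body may be None."""
    args_line, _, rest = data.partition(b"\n")
    args = tuple(args_line.split(b"\x01"))
    if not rest:
        return args, None
    length_line, _, rest = rest.partition(b"\n")
    length = int(length_line.decode("ascii"))
    body, trailer = rest[:length], rest[length:]
    if len(body) != length or trailer != b"done\n":
        raise ValueError("truncated body or missing trailer")
    return args, body


# The body may contain 0x01 and 0x0a bytes; the arguments may not.
wire = encode_message_v1([b"get", b"foo"], b"a\x01b\n")
print(wire)
print(decode_message_v1(wire))
```

Because the body is located by its length prefix rather than by scanning for
a delimiter, it can carry arbitrary bytes, which is why only the arguments are
restricted.
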

Version two
-----------

Version two was introduced in Bazaar 0.16.

The request protocol is::

  REQUEST_V2 := "bzr request 2" NEWLINE MESSAGE_V2

The response protocol is::

  RESPONSE_V2 := "bzr response 2" NEWLINE RESPONSE_STATUS NEWLINE MESSAGE_V2
  RESPONSE_STATUS := "success" | "failed"

Future versions should follow this structure, like version two does::

  FUTURE_MESSAGE := VERSION_STRING NEWLINE REST_OF_MESSAGE

This is so that clients and servers can read bytes up to the first newline
byte to determine what version a message is.

For compatibility with all versions (past and future) of bzr clients,
servers that receive a request in an unknown protocol version should
respond with a single-line error terminated with 0x0a (NEWLINE), rather
than a structured response prefixed with a version string.

Version two of the message protocol is::

  MESSAGE_V2 := ARGS [BODY_V2]
  BODY_V2 := BODY | STREAMED_BODY

That is, a version one length-prefixed body, or a version two streamed
body.

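
The version two framing and the "read up to the first newline" rule can be
sketched as follows.  The helper names are hypothetical.  Two assumptions are
made that the text above does not state: a first line that is not a recognised
version string is treated as a version one message, and any other first line
starting with ``bzr request`` is treated as an unknown future version.

```python
REQUEST_V2 = b"bzr request 2"
RESPONSE_V2 = b"bzr response 2"


def encode_request_v2(message_v2):
    """Prefix an already-encoded MESSAGE_V2 with the request version line."""
    return REQUEST_V2 + b"\n" + message_v2


def encode_response_v2(message_v2, success=True):
    """Prefix a MESSAGE_V2 with the response version and status lines."""
    status = b"success" if success else b"failed"
    return RESPONSE_V2 + b"\n" + status + b"\n" + message_v2


def classify_request(data):
    """Return (version, remaining_bytes) for the start of a request.

    version is 2, 1, or None for an unrecognised version string.
    """
    first_line, newline, rest = data.partition(b"\n")
    if not newline:
        raise ValueError("no complete first line yet")
    if first_line == REQUEST_V2:
        return 2, rest
    if first_line.startswith(b"bzr request "):
        # Assumption: looks like a version string we do not know about.
        return None, data
    # Assumption: no version string, so this is a version one message.
    return 1, data


print(classify_request(encode_request_v2(b"hello\n")))
print(classify_request(b"hello\n"))
print(encode_response_v2(b"ok\n"))
```

A server following the compatibility rule above would answer the ``None``
case with a single-line error rather than a versioned response.
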

Version two with streamed bodies
--------------------------------

An extension to version two allows streamed bodies.  A streamed body looks
a lot like HTTP's chunked encoding::

  STREAMED_BODY := "chunked" NEWLINE CHUNKS TERMINATOR
  CHUNKS := CHUNK [CHUNKS]
  CHUNK := HEX_LENGTH CHUNK_CONTENT
  HEX_LENGTH := HEX_DIGITS NEWLINE
  CHUNK_CONTENT := bytes

  TERMINATOR := SUCCESS_TERMINATOR | ERROR_TERMINATOR
  SUCCESS_TERMINATOR := "END" NEWLINE
  ERROR_TERMINATOR := "ERR" NEWLINE CHUNKS SUCCESS_TERMINATOR

That is, the body consists of a series of chunks.  Each chunk starts with
a length prefix in hexadecimal digits, followed by an ASCII newline byte.
The end of the body is signaled by ``END\n``, or by ``ERR\n`` followed by
error args, one per chunk, and then ``END\n``.  Note that these args are
8-bit safe, unlike request args.

A streamed body starts with the string "chunked" so that legacy clients
and servers will not mistake the first chunk for the start of a version one
body.

The type of body (length-prefixed or chunked) in a response is always the
same for a given request method.  Only new request methods introduced in
Bazaar 0.91 and later use streamed bodies.

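
The streamed body grammar can be exercised with a short sketch like the one
below.  The function names are hypothetical, not bzrlib's.  The grammar does
not say whether hex digits are upper or lower case; lower case is assumed
here.  The decoder assumes the whole body is already in memory.

```python
import io


def _encode_chunk(content):
    # HEX_LENGTH NEWLINE CHUNK_CONTENT (lower-case hex is an assumption).
    return b"%x\n" % len(content) + content


def encode_streamed_body(chunks, error_args=None):
    """Encode chunks, optionally ending with ERR plus one chunk per arg."""
    parts = [b"chunked\n"]
    parts.extend(_encode_chunk(chunk) for chunk in chunks)
    if error_args is not None:
        parts.append(b"ERR\n")
        parts.extend(_encode_chunk(arg) for arg in error_args)
    parts.append(b"END\n")
    return b"".join(parts)


def decode_streamed_body(data):
    """Return (chunks, error_args); error_args is None on success."""
    stream = io.BytesIO(data)
    if stream.readline() != b"chunked\n":
        raise ValueError("not a streamed body")
    chunks, error_args = [], None
    while True:
        line = stream.readline()
        if line == b"END\n":
            return chunks, error_args
        if line == b"ERR\n" and error_args is None:
            error_args = []  # following chunks are error arguments
            continue
        if not line.endswith(b"\n"):
            raise ValueError("truncated stream")
        length = int(line[:-1].decode("ascii"), 16)
        content = stream.read(length)
        if len(content) != length:
            raise ValueError("truncated chunk")
        (chunks if error_args is None else error_args).append(content)


wire = encode_streamed_body([b"abc", b"x" * 18])
print(wire)
print(decode_streamed_body(wire))
```

Neither ``END`` nor ``ERR`` is a valid string of hexadecimal digits, so a
decoder can tell a terminator from a chunk length by looking at the line
alone.
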

Version three
-------------

.. note::

  For some discussion of the requirements that led to this new protocol
  version, see bug #\ 83935_.

.. _83935: https://bugs.launchpad.net/bzr/+bug/83935

Version three has bencoding of most protocol structures, to make parsing
simpler.

The request and response protocol is::

  REQUEST_V3 := "bzr request 3" NEWLINE HEADERS REQUEST_ARGS BODY_V3
  RESPONSE_V3 := "bzr response 3" NEWLINE RESPONSE_STATUS_V3 HEADERS
                 RESPONSE_ARGS BODY_V3
  RESPONSE_STATUS_V3 := SUCCESS_STATUS | ERROR_STATUS
  SUCCESS_STATUS := "S"
  ERROR_STATUS := "E"
  HEADERS := bencoded_dictionary

Each request and response will have “headers”, a dictionary of key-value pairs.
The keys must be strings, not any other type of value.

Currently, the only defined header is “Software version”.  Both the client and
the server should include a “Software version” header, with a value of a
free-form string such as “bzrlib 1.2”, to aid debugging and logging.  Clients
and servers **should not** vary behaviour based on this string.

The argument encoding is::

  REQUEST_ARGS := bencoded_sequence
  RESPONSE_ARGS := bencoded_sequence

Arguments in this protocol version are bencoded, and the entire argument
structure is length-prefixed.  As a result, unlike previous protocol versions,
arguments in this version are 8-bit clean.

For requests, the first item in the encoded sequence must be a string of
the request's verb, e.g. ``Branch.last_revision_info``.  (And so requests must
always have at least one item in their REQUEST_ARGS sequence.)

For error responses (where the RESPONSE_STATUS_V3 is ERROR_STATUS), the
situation is analogous to requests.  The first item in the encoded sequence must
be a string of the error name.  The other arguments supply details about the
error, and their number and types will depend on the type of error (as
identified by the error name).

One possible error name is ``UnknownRequestVerb``, which means the server does
not recognise the verb used by the client's request.

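
To show what a bencoded header dictionary and argument sequence look like on
the wire, here is a minimal bencode encoder.  It is a sketch written for this
example rather than bzrlib's own implementation, and the ``branch/`` path
argument is invented.  The text above also says the argument structure is
length-prefixed; the grammar does not spell out that framing, so it is left
out here.

```python
def bencode(value):
    """Encode bytes, ints, lists/tuples and dicts using bencoding."""
    if isinstance(value, bytes):
        # Byte strings are <decimal length>:<bytes>, so any byte is allowed.
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        if not all(isinstance(key, bytes) for key in value):
            raise TypeError("dictionary keys must be byte strings")
        # Bencoded dictionaries list their keys in sorted order.
        items = sorted(value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


headers = bencode({b"Software version": b"bzrlib 1.2"})
# The first item is the request verb; the second is a made-up argument.
request_args = bencode([b"Branch.last_revision_info", b"branch/"])
print(headers)
print(request_args)
```

Every string carries its own length, so arguments may contain 0x01 and 0x0a
bytes without ambiguity, which is the property the text describes as 8-bit
clean.
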

The body encoding is::

  BODY_V3 := NO_BODY | LENGTH_PREFIXED_BODY | STREAMED_BODY_V2
  NO_BODY := "n"
  LENGTH_PREFIXED_BODY := "p" LENGTH_PREFIX BODY_BYTES
  LENGTH_PREFIX := 32-bit unsigned integer in network byte order
  STREAMED_BODY_V2 := "s" CHUNKS_V2 TERMINATOR_V2

  CHUNKS_V2 := "c" CHUNK_V2 [CHUNKS_V2]
  CHUNK_V2 := LENGTH_PREFIX CHUNK_CONTENT

  TERMINATOR_V2 := SUCCESS_TERMINATOR_V2 | ERROR_TERMINATOR_V2
  SUCCESS_TERMINATOR_V2 := SUCCESS_STATUS
  ERROR_TERMINATOR_V2 := ERROR_STATUS bencoded_sequence

Version 3 messages always explicitly declare if a body is included.  Clients
and servers are no longer expected to know which request verbs and responses
will have bodies attached.  If present, bodies may be a single length-prefixed
string (like in protocol 1) or a stream of chunks (like in protocol 2).

If a streamed body is finished with an error, that error will be encoded
identically to RESPONSE_ARGS.

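
The three ``BODY_V3`` alternatives can be produced with Python's ``struct``
module, as in the sketch below.  The function names are hypothetical.  Only
the success terminator is shown for streamed bodies; an error terminator would
additionally need a bencoded sequence after the ``E`` byte.

```python
import struct


def _length_prefix(content):
    # 32-bit unsigned integer in network (big-endian) byte order.
    return struct.pack(">L", len(content))


def encode_no_body():
    return b"n"


def encode_length_prefixed_body(body):
    return b"p" + _length_prefix(body) + body


def encode_streamed_body_v3(chunks):
    """Encode STREAMED_BODY_V2 ending with the success terminator."""
    if not chunks:
        raise ValueError("the grammar requires at least one chunk")
    encoded = b"".join(b"c" + _length_prefix(chunk) + chunk for chunk in chunks)
    return b"s" + encoded + b"S"


print(encode_no_body())
print(encode_length_prefixed_body(b"abc"))
print(encode_streamed_body_v3([b"ab", b"c"]))
```

The leading ``n``, ``p`` or ``s`` byte is what lets a receiver decide how to
read the body without knowing anything about the request verb.
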

Paths
=====

Paths are passed across the network.  The client needs to see a namespace
that includes any repository that might need to be referenced, and the
client needs to know about a root directory beyond which it cannot ascend.

Servers run over ssh will typically want to be able to access any path the
user can access.  Public servers on the other hand (which might be over
http, ssh or tcp) will typically want to restrict access to only a
particular directory and its children, so they will want to implement a
virtual root in software at that level.  In other words they'll want to
rewrite incoming paths to be under that level (and prevent escaping using
../ tricks).  The default implementation in bzrlib does this using the
`bzrlib.transport.chroot` module.

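
The idea of rewriting incoming paths under a virtual root can be sketched
with ``posixpath`` alone.  This is not how `bzrlib.transport.chroot` is
implemented; ``resolve_under_root`` and the ``/srv/bzr`` directory are
invented for illustration.

```python
import posixpath


def resolve_under_root(root, client_path):
    """Map a client-supplied path to a server path under root.

    Raises ValueError if the normalised result would escape root.
    """
    root = posixpath.normpath(root)
    # Treat the client's path as relative to the virtual root.
    candidate = posixpath.normpath(
        posixpath.join(root, client_path.lstrip("/")))
    inside = candidate == root or candidate.startswith(root.rstrip("/") + "/")
    if not inside:
        raise ValueError("path escapes the virtual root: %r" % client_path)
    return candidate


print(resolve_under_root("/srv/bzr", "/project/trunk"))
print(resolve_under_root("/srv/bzr", "project/../other"))
```

Normalising before checking the prefix is what defeats ``../`` tricks: a
request for ``../../etc/passwd`` normalises to a path outside ``/srv/bzr``
and is rejected.
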

URLs that include ~ should probably be passed across to the server
verbatim and the server can expand them.  This will probably not be
meaningful when access is limited to a directory.  See `bug 109143`_.

.. _bug 109143: https://bugs.launchpad.net/bzr/+bug/109143


Requests
========

The first argument of a request specifies the request method.

The available request methods are registered in `bzrlib.smart.request`.

**XXX**: ideally the request methods should be documented here.
Contributions welcome!


..
   vim: ft=rst tw=74 ai