================
Network Protocol
================
:Date: 2007-09-03
.. contents::
Overview
========
The smart protocol provides a way to send requests and receive the
corresponding responses when communicating with a remote bzr process.
Layering
========
Medium
------
At the bottom level there is either a socket, a pair of pipes, or an HTTP
request/response. We call this layer the *medium*. It is responsible for
carrying bytes between a client and a server. A socket can carry multiple
requests, and the end of the stream shows up as a read error when the
other side shuts down. A pipe signals end-of-file with a zero-length read.
In an HTTP server environment there is no end-of-stream, because each
request arriving at the server is independent.

So we need a wrapper around pipes and sockets that separates requests from
the underlying substrate. This gives us a single model that is consistent
across HTTP, sockets and pipes.
Protocol
--------
On top of the medium is the *protocol*. This is the layer that
deserialises bytes into the structured data that requests and responses
consist of.
Request/Response processing
---------------------------
On top of the protocol is the logic for processing requests (on the
server) or responses (on the client).
Server-side
-----------
Sketch::

 MEDIUM  (factory for protocol, reads bytes & pushes to protocol,
          uses protocol to detect end-of-request, sends written
          bytes to client) e.g. socket, pipe, HTTP request handler.
  ^
  | bytes.
  v

 PROTOCOL(serialization, deserialization) accepts bytes for one
          request, decodes according to internal state, pushes
          structured data to handler. accepts structured data from
          handler and encodes and writes to the medium. factory for
          handler.
  ^
  | structured data
  v

 HANDLER  (domain logic) accepts structured data, operates state
          machine until the request can be satisfied,
          sends structured data to the protocol.

Request handlers are registered in the `bzrlib.smart.request` module.
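As a rough illustration of the server-side sketch above, the following toy
classes show bytes flowing from a medium, through a protocol, to a handler
and back. Every name here (``Medium``, ``Protocol``, ``Handler``,
``serve_one_request``) is hypothetical and is not bzrlib's API. The wire
format is simplified to a single line of 0x01-separated arguments with no
body, purely to keep the example short.

```python
class Handler:
    """Domain logic: structured data in, structured data out."""

    def handle(self, args):
        # Hypothetical behaviour: reply "ok" followed by the arguments.
        return (b"ok",) + args[1:]


class Protocol:
    """Turns bytes into structured data and back again."""

    def __init__(self, write):
        self._write = write
        self._buffer = b""
        self._handler = Handler()  # the protocol creates its handler
        self.finished = False

    def accept_bytes(self, data):
        self._buffer += data
        if b"\n" not in self._buffer:
            return  # request not complete yet
        line, _, self._buffer = self._buffer.partition(b"\n")
        response = self._handler.handle(tuple(line.split(b"\x01")))
        self._write(b"\x01".join(response) + b"\n")
        self.finished = True  # lets the medium detect end-of-request


class Medium:
    """Carries bytes; here the 'connection' is just a list of byte strings."""

    def __init__(self, incoming):
        self._incoming = incoming
        self.sent = []

    def serve_one_request(self):
        protocol = Protocol(self.sent.append)  # the medium creates the protocol
        for data in self._incoming:
            protocol.accept_bytes(data)
            if protocol.finished:
                break


medium = Medium([b"example\x01a", b"\x01b\n"])
medium.serve_one_request()
```

Note that the request arrives in two pieces and the protocol, not the
medium, decides when it is complete.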
Client-side
-----------
Sketch::

 CLIENT   domain logic, accepts domain requests, generates structured
          data, reads structured data from responses and turns into
          domain data. Sends structured data to the protocol.
          Operates state machines until the request can be delivered
          (e.g. reading from a bundle generated in bzrlib to deliver a
          complete request).

          This is RemoteBzrDir, RemoteRepository, etc.
  ^
  | structured data
  v

 PROTOCOL (serialization, deserialization) accepts structured data for
          one request, encodes and writes to the medium. Reads bytes
          from the medium, decodes and allows the client to read
          structured data.
  ^
  | bytes.
  v

 MEDIUM   accepts bytes from the protocol & delivers to the remote
          server. Allows the protocol to read bytes e.g. socket, pipe,
          HTTP request.

The domain logic is in `bzrlib.remote`: `RemoteBzrDir`, `RemoteBranch`,
and so on.
There is also a plain file-level transport that calls remote methods to
manipulate files on the server in `bzrlib.transport.remote`.
Protocol description
====================
Version one
-----------
Version one of the protocol was introduced in Bazaar 0.11.
The protocol (for both requests and responses) is described by::

  REQUEST := MESSAGE_V1
  RESPONSE := MESSAGE_V1
  MESSAGE_V1 := ARGS BODY

  ARGS := ARG [MORE_ARGS] NEWLINE
  MORE_ARGS := SEP ARG [MORE_ARGS]
  SEP := 0x01

  BODY := LENGTH NEWLINE BODY_BYTES TRAILER
  LENGTH := decimal integer
  TRAILER := "done" NEWLINE

That is, a tuple of arguments separated by Ctrl-A and terminated with a
newline, followed by length prefixed body with a constant trailer. Note
that although arguments are not 8-bit safe (they cannot include 0x01 or
0x0a bytes without breaking the protocol encoding), the body is.
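The version one grammar can be exercised with a short Python sketch. The
function names and the ``example-method`` request name are made up for
illustration, and the decoder assumes the whole message is already
buffered, which a real implementation reading from a medium could not
assume.

```python
SEP = b"\x01"
NEWLINE = b"\n"
TRAILER = b"done\n"


def encode_message_v1(args, body):
    """Serialise MESSAGE_V1 := ARGS BODY."""
    if not args:
        raise ValueError("ARGS requires at least one ARG")
    for arg in args:
        # Arguments are not 8-bit safe: these two bytes are delimiters.
        if SEP in arg or NEWLINE in arg:
            raise ValueError("argument contains 0x01 or 0x0a: %r" % (arg,))
    args_line = SEP.join(args) + NEWLINE
    length_line = str(len(body)).encode("ascii") + NEWLINE
    # The body is length-prefixed, so it may contain any byte values.
    return args_line + length_line + body + TRAILER


def decode_message_v1(data):
    """Parse one complete, fully buffered message into (args, body)."""
    args_line, rest = data.split(NEWLINE, 1)
    args = tuple(args_line.split(SEP))
    length_line, rest = rest.split(NEWLINE, 1)
    length = int(length_line)
    body = rest[:length]
    if len(body) != length or rest[length:] != TRAILER:
        raise ValueError("truncated body or missing trailer")
    return args, body


message = encode_message_v1((b"example-method", b"arg"), b"a\x01b\nc")
```

The body in this example deliberately contains both 0x01 and 0x0a to show
that only the arguments are restricted.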
Version two
-----------
Version two was introduced in Bazaar 0.16.
The request protocol is::

  REQUEST_V2 := "bzr request 2" NEWLINE MESSAGE_V2

The response protocol is::

  RESPONSE_V2 := "bzr response 2" NEWLINE MESSAGE_V2

Future versions should follow this structure, like version two does::

  FUTURE_MESSAGE := VERSION_STRING NEWLINE REST_OF_MESSAGE

This is so that clients and servers can read bytes up to the first newline
byte to determine what version a message is.
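A minimal sketch of that first-line check might look like the following.
The function name is hypothetical. It also assumes that any first line
other than the expected version string is a version one message, which
carries no version line. How to tell a version one request apart from an
unknown future version is not covered here.

```python
REQUEST_V2 = b"bzr request 2"
RESPONSE_V2 = b"bzr response 2"


def split_version(data, v2_marker):
    """Return (version, remaining_bytes) for a buffered message."""
    first_line, newline, rest = data.partition(b"\n")
    if not newline:
        raise ValueError("no newline seen yet; more bytes are needed")
    if first_line == v2_marker:
        return 2, rest
    # Assumption: version one has no version line, so the first line is
    # already the ARGS line and must be handed on untouched.
    return 1, data
```

A server would call it with ``REQUEST_V2`` and a client with
``RESPONSE_V2``.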
For compatibility with all versions (past and future) of bzr clients,
servers that receive a request in an unknown protocol version should
respond with a single-line error terminated with 0x0a (NEWLINE), rather
than a structured response prefixed with a version string.
Version two of the message protocol is::

  MESSAGE_V2 := ARGS BODY_V2
  BODY_V2 := BODY | STREAMED_BODY

That is, a version one length-prefixed body, or a version two streamed
body.
Version two with streamed bodies
--------------------------------
An extension to version two allows streamed bodies. A streamed body looks
a lot like HTTP's chunked encoding::

  STREAMED_BODY := "chunked" NEWLINE CHUNKS TERMINATOR
  CHUNKS := CHUNK [CHUNKS]
  CHUNK := CHUNK_LENGTH CHUNK_CONTENT
  CHUNK_LENGTH := HEX_DIGITS NEWLINE
  CHUNK_CONTENT := bytes
  TERMINATOR := SUCCESS_TERMINATOR | ERROR_TERMINATOR
  SUCCESS_TERMINATOR := 'END' NEWLINE
  ERROR_TERMINATOR := 'ERR' NEWLINE CHUNKS SUCCESS_TERMINATOR

That is, the body consists of a series of chunks. Each chunk starts with
a length prefix in hexadecimal digits, followed by an ASCII newline byte.
The end of the body is signaled by ``END\n``, or by ``ERR\n``
followed by error args, one per chunk. Note that these args are 8-bit
safe, unlike request args.
A streamed body starts with the string "chunked" so that legacy clients
and servers will not mistake the first chunk as the start of a version one
body.
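The streamed body grammar can be sketched in Python as follows. The
function names are hypothetical, and the decoder assumes the entire body
is already in memory, which defeats the purpose of streaming but keeps the
example short. As a leniency not in the grammar, both functions tolerate
an empty list of chunks.

```python
def _encode_chunk(content):
    # CHUNK := HEX_DIGITS NEWLINE CHUNK_CONTENT
    return b"%x\n" % len(content) + content


def encode_streamed_body(chunks, error_args=None):
    """Serialise STREAMED_BODY; pass error_args to end with ERR."""
    out = [b"chunked\n"]
    out.extend(_encode_chunk(chunk) for chunk in chunks)
    if error_args is not None:
        out.append(b"ERR\n")
        # Error args are sent one per chunk, so they are 8-bit safe.
        out.extend(_encode_chunk(arg) for arg in error_args)
    out.append(b"END\n")
    return b"".join(out)


def decode_streamed_body(data):
    """Return (chunks, error_args); error_args is None on success."""
    prefix, _, rest = data.partition(b"\n")
    if prefix != b"chunked":
        raise ValueError("not a streamed body")
    chunks, error_args = [], None
    current = chunks
    while True:
        line, newline, rest = rest.partition(b"\n")
        if not newline:
            raise ValueError("truncated streamed body")
        if line == b"END":
            return chunks, error_args
        if line == b"ERR" and error_args is None:
            # Chunks after ERR are error arguments, not body content.
            error_args = current = []
            continue
        length = int(line, 16)
        if len(rest) < length:
            raise ValueError("truncated chunk")
        current.append(rest[:length])
        rest = rest[length:]


ok_body = encode_streamed_body([b"hello", b"\n\x01"])
err_body = encode_streamed_body([b"ab"], [b"SomeError", b"detail"])
```

Because ``END`` and ``ERR`` both contain non-hexadecimal letters, they
cannot be confused with a chunk length line.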
The type of body (length-prefixed or chunked) in a response is always the
same for a given request method. Only new request methods introduced in
Bazaar 0.91 and later use streamed bodies.
Paths
=====
Paths are passed across the network. The client needs to see a namespace
that includes any repository that might need to be referenced, and the
client needs to know about a root directory beyond which it cannot ascend.
Servers run over ssh will typically want to be able to access any path the
user can access. Public servers on the other hand (which might be over
http, ssh or tcp) will typically want to restrict access to only a
particular directory and its children, so will want to do a software
virtual root at that level. In other words they'll want to rewrite
incoming paths to be under that level (and prevent escaping using ../
tricks). The default implementation in bzrlib does this using the
`bzrlib.transport.chroot` module.
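The rewriting idea can be illustrated with the standard ``posixpath``
module. This is only a sketch of the general technique, not how
`bzrlib.transport.chroot` is implemented. The function name and the
``/srv/bzr`` root are invented for the example, and symlinks on the
server are not considered.

```python
import posixpath


def rewrite_path(virtual_root, client_path):
    """Map a client-supplied path to a real path under virtual_root."""
    # Treat the client path as absolute inside the virtual namespace.
    # normpath resolves ".." segments, and for an absolute path it
    # cannot climb above "/", so "../" tricks collapse to the root.
    normalised = posixpath.normpath("/" + client_path)
    relative = normalised.lstrip("/")
    return posixpath.normpath(posixpath.join(virtual_root, relative))


inside = rewrite_path("/srv/bzr", "project/trunk")
escaped = rewrite_path("/srv/bzr", "../../etc/passwd")
```

Here the second call stays under the virtual root instead of reaching the
real ``/etc``.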
URLs that include ~ should probably be passed across to the server
verbatim so that the server can expand them. This will probably not be
meaningful when limited to a directory? See `bug 109143`_.
.. _bug 109143: https://bugs.launchpad.net/bzr/+bug/109143
Requests
========
The first argument of a request specifies the request method.
The available request methods are registered in `bzrlib.smart.request`.
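Dispatching on the first argument could be sketched with a plain
dictionary, as below. This does not reproduce the registry in
`bzrlib.smart.request`. The ``register`` decorator, the ``example-echo``
method and the shape of the error tuple are all assumptions made for
illustration.

```python
_registry = {}


def register(method_name):
    """Hypothetical decorator that records a handler under a method name."""
    def decorator(func):
        _registry[method_name] = func
        return func
    return decorator


@register(b"example-echo")
def example_echo(*args):
    return (b"ok",) + args


def dispatch(request_args):
    # The first argument names the method; the rest are its parameters.
    method, params = request_args[0], request_args[1:]
    try:
        handler = _registry[method]
    except KeyError:
        # Assumed error shape, chosen only for this example.
        return (b"error", b"unknown method", method)
    return handler(*params)
```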
**XXX**: ideally the request methods should be documented here.
Contributions welcome!
..
vim: ft=rst tw=74 ai