# Copyright (C) 2006 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

"""Smart-server protocol, client and server.

This code is fairly complex, so it has been split up into a package of modules,
rather than being a single large module. Refer to the individual module
docstrings for details.

Requests are sent as a command and list of arguments, followed by optional
bulk body data. Responses are similarly a response and list of arguments,
followed by bulk body data. ::

  Fields are separated by Ctrl-A.
  BULK_DATA := CHUNK TRAILER
    Chunks can be repeated as many times as necessary.
  CHUNK := CHUNK_LEN CHUNK_BODY
  CHUNK_LEN := DIGIT+ NEWLINE
    Gives the number of bytes in the following chunk.
  CHUNK_BODY := BYTE[chunk_len]
  TRAILER := SUCCESS_TRAILER | ERROR_TRAILER
  SUCCESS_TRAILER := 'done' NEWLINE
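
As an illustration of the bulk data grammar above, here is a minimal sketch of
an encoder and decoder. It is not the package's implementation: the function
names are hypothetical, NEWLINE is assumed to be a single LF byte, and only the
success trailer is handled because the error trailer is not defined here.

```python
import io

NEWLINE = b"\n"  # assumption: NEWLINE in the grammar is a single LF byte


def encode_bulk_data(chunks):
    # Serialize an iterable of byte strings as repeated CHUNKs plus the
    # success trailer. (Hypothetical helper, not part of the package.)
    out = []
    for body in chunks:
        out.append(str(len(body)).encode("ascii") + NEWLINE)  # CHUNK_LEN
        out.append(body)                                      # CHUNK_BODY
    out.append(b"done" + NEWLINE)                             # SUCCESS_TRAILER
    return b"".join(out)


def decode_bulk_data(stream):
    # Read CHUNKs from a binary file-like object until the success trailer.
    chunks = []
    while True:
        line = stream.readline()
        if line == b"done" + NEWLINE:
            return chunks
        header = line[:-1]
        if not line.endswith(NEWLINE) or not header.isdigit():
            # Covers end-of-file, garbage, and the error trailer, whose
            # format this sketch does not know.
            raise ValueError("unexpected line in bulk data: %r" % (line,))
        length = int(header)
        body = stream.read(length)
        if len(body) != length:
            raise ValueError("truncated chunk")
        chunks.append(body)


encoded = encode_bulk_data([b"hello", b"done\n"])
decoded = decode_bulk_data(io.BytesIO(encoded))
```

Because each chunk is length-prefixed, a body that happens to contain the bytes
of the trailer (the second chunk above) is not mistaken for the end of the data.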

Paths are passed across the network. The client needs to see a namespace that
includes any repository that might need to be referenced, and the client needs
to know about a root directory beyond which it cannot ascend.

Servers run over ssh will typically want to be able to access any path the user
can access. Public servers on the other hand (which might be over http, ssh
or tcp) will typically want to restrict access to only a particular directory
and its children, so will want to apply a software virtual root at that level.
In other words they'll want to rewrite incoming paths to be under that level
(and prevent escaping using ../ tricks).
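
One way such a software virtual root could rewrite incoming paths is sketched
below. The function name and the example root are hypothetical, and the sketch
clamps '..' at the virtual root rather than raising an error, which is a design
choice and not necessarily what the server does. It also ignores symlinks that
point outside the root.

```python
import posixpath


def rebase_under_root(root, client_path):
    # Map a path received from the network to a path under `root`.
    # (Hypothetical helper; `root` is an absolute server-side directory.)
    #
    # Prefixing '/' makes the path absolute, so normpath() collapses any
    # '..' segments without ever climbing above '/'.
    normalized = posixpath.normpath("/" + client_path)
    # normpath() may keep two leading slashes, so strip them all.
    relative = normalized.lstrip("/")
    if not relative:
        return root
    return posixpath.join(root, relative)


inside = rebase_under_root("/srv/bzr", "project/trunk")
escaped = rebase_under_root("/srv/bzr", "../../etc/passwd")
absolute = rebase_under_root("/srv/bzr", "/etc/passwd")
```

Here `escaped` and `absolute` both end up as `/srv/bzr/etc/passwd`, so neither
'..' segments nor an absolute path can leave the virtual root.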

URLs that include ~ should probably be passed across to the server verbatim
and the server can expand them. This will probably not be meaningful when
limited to a directory?

At the bottom level are the socket, pipes and HTTP server. For sockets, we have
the idea that you have multiple requests and get a read error because the other
side has shut down. For pipes we have a read pipe, on which a zero-length read
marks end-of-file. For the HTTP server environment there is no end-of-stream
because each request coming into the server is independent.

So we need a wrapper around pipes and sockets to separate out requests from
substrate and this will give us a single model which is consistent for HTTP,

MEDIUM (factory for protocol, reads bytes & pushes to protocol,
        uses protocol to detect end-of-request, sends written
        bytes to client) e.g. socket, pipe, HTTP request handler.

PROTOCOL (serialization, deserialization) accepts bytes for one
         request, decodes according to internal state, pushes
         structured data to handler. accepts structured data from
         handler and encodes and writes to the medium. factory for

HANDLER (domain logic) accepts structured data, operates state
        machine until the request can be satisfied,
        sends structured data to the protocol.
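
The server-side layering described above could be sketched roughly as follows.
All class and method names are hypothetical, the 'hello' command is invented
for the example, and the framing (one request per line of Ctrl-A separated
fields, no bulk body) is an assumption made only to keep the sketch short.

```python
import io


class EchoHandler:
    # HANDLER: domain logic operating on structured data only.
    def handle(self, command, args):
        if command == "hello":
            return ("ok", "1")
        return ("error", "unknown command", command)


class LineProtocol:
    # PROTOCOL: accepts bytes for one request, decodes them, pushes the
    # structured request to the handler, encodes and writes the response.
    def __init__(self, handler, write):
        self._handler = handler
        self._write = write
        self._buffer = b""
        self.finished = False

    def accept_bytes(self, data):
        # Returns any bytes that belong to the next request.
        self._buffer += data
        if b"\n" not in self._buffer:
            return b""  # end-of-request not seen yet
        line, excess = self._buffer.split(b"\n", 1)
        fields = line.decode("utf-8").split("\x01")
        response = self._handler.handle(fields[0], fields[1:])
        self._write("\x01".join(response).encode("utf-8") + b"\n")
        self.finished = True
        return excess


class PipeMedium:
    # MEDIUM: reads bytes, creates a fresh protocol per request, and uses
    # the protocol to detect where each request ends.
    def __init__(self, infile, outfile, handler):
        self._in = infile
        self._out = outfile
        self._handler = handler

    def serve(self):
        pending = b""
        while True:
            protocol = LineProtocol(self._handler, self._out.write)
            while not protocol.finished:
                data = pending or self._in.read(4096)
                if not data:
                    return  # zero-length read: end-of-file on a pipe
                pending = protocol.accept_bytes(data)


# Usage: two requests arriving back to back on one pipe.
requests = io.BytesIO(b"hello\nbogus\x01x\n")
responses = io.BytesIO()
PipeMedium(requests, responses, EchoHandler()).serve()
```

The handler never sees bytes and the medium never interprets fields, so the
same handler could sit behind a socket or an HTTP request handler unchanged.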

CLIENT domain logic, accepts domain requests, generates structured
       data, reads structured data from responses and turns into
       domain data. Sends structured data to the protocol.
       Operates state machines until the request can be delivered
       (e.g. reading from a bundle generated in bzrlib to deliver a

       Possibly this should just be RemoteBzrDir, RemoteTransport,

PROTOCOL (serialization, deserialization) accepts structured data for one
         request, encodes and writes to the medium. Reads bytes from the
         medium, decodes and allows the client to read structured data.

MEDIUM (accepts bytes from the protocol & delivers to the remote server.
        Allows the protocol to read bytes) e.g. socket, pipe, HTTP request.
"""

# TODO: _translate_error should be on the client, not the transport because
# error coding is wire protocol specific.

# TODO: A plain integer from query_version is too simple; should give some

# TODO: Server should probably catch exceptions within itself and send them
# back across the network. (But shouldn't catch KeyboardInterrupt etc)
# Also needs to somehow report protocol errors like bad requests. Need to
# consider how we'll handle error reporting, e.g. if we get halfway through a
# bulk transfer and then something goes wrong.

# TODO: Standard marker at start of request/response lines?

# TODO: Make each request and response self-validatable, e.g. with checksums.

# TODO: get/put objects could be changed to gradually read back the data as it
# comes across the network

# TODO: What should the server do if it hits an error and has to terminate?

# TODO: is it useful to allow multiple chunks in the bulk data?

# TODO: If we get an exception during transmission of bulk data we can't just
# emit the exception because it won't be seen.
# John proposes: I think it would be worthwhile to have a header on each
# chunk, that indicates it is another chunk. Then you can send an 'error'
# chunk as long as you finish the previous chunk.

# TODO: Clone method on Transport; should work up towards parent directory;
# unclear how this should be stored or communicated to the server... maybe
# just pass it on all relevant requests?

# TODO: Better name than clone() for changing between directories. How about
# open_dir or change_dir or chdir?

# TODO: Is it really good to have the notion of current directory within the
# connection? Perhaps all Transports should factor out a common connection
# from the thing that has the directory context?

# TODO: Pull more things common to sftp and ssh to a higher level.

# TODO: The server that manages a connection should be quite small and retain
# minimum state because each of the requests is supposed to be stateless.
# Then we can write another implementation that maps to http.

# TODO: What to do when a client connection is garbage collected? Maybe just
# abruptly drop the connection?

# TODO: Server in some cases will need to restrict access to files outside of
# a particular root directory. LocalTransport doesn't do anything to stop you
# ascending above the base directory, so we need to prevent paths
# containing '..' in either the server or transport layers. (Also need to
# consider what happens if someone creates a symlink pointing outside the

# TODO: Server should rebase absolute paths coming across the network to put
# them under the virtual root, if one is in use. LocalTransport currently
# doesn't do that; if you give it an absolute path it just uses it.

# XXX: Arguments can't contain newlines or ascii; possibly we should e.g.
# urlescape them instead. Indeed possibly this should just literally be

# FIXME: This transport, with several others, has imperfect handling of paths
# within urls. It'd probably be better for ".." from a root to raise an error
# rather than return the same directory as we do at present.

# TODO: Rather than working at the Transport layer we want Branch,
# Repository or BzrDir objects that talk to a server.

# TODO: Probably want some way for server commands to gradually produce body
# data rather than passing it as a string; they could perhaps pass an
# iterator-like callback that will gradually yield data; it probably needs a
# close() method that will always be called to do any necessary cleanup.

# TODO: Split the actual smart server from the ssh encoding of it.

# TODO: Perhaps support file-level readwrite operations over the transport

# TODO: SmartBzrDir class, proxying all Branch etc methods across to another
# branch doing file-level operations.