~bzr-pqm/bzr/bzr.dev

# Copyright (C) 2006-2011 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

"""Implementation of urllib2 tailored to bzr needs

This file complements the urllib2 class hierarchy with custom classes.

For instance, we create a new HTTPConnection and HTTPSConnection that inherit
from the original httplib.HTTP(S)Connection objects, but also have a new base
which implements custom getresponse and cleanup_pipe handlers.

And then we implement custom HTTPHandler and HTTPSHandler classes, that use
the custom HTTPConnection classes.

We have a custom Response class, which lets us maintain a keep-alive
connection even for requests that urllib2 doesn't expect to contain body data.

And a custom Request class that lets us track redirections, and
handle authentication schemes.

For coherency with python libraries, we use capitalized header names throughout
the code, even if the header names will be title-cased just before sending the
request (see AbstractHTTPHandler.do_open).
"""

DEBUG = 0

# FIXME: Oversimplifying, two kinds of exceptions should be
# raised, once a request is issued: URLError before we have been
# able to process the response, HTTPError after that. Processing the
# response means we are able to leave the socket clean, so if we
# are not able to do that, we should close the connection. The
# actual code more or less does that, tests should be written to
# ensure that.

import errno
import httplib
try:
    import kerberos
except ImportError:
    have_kerberos = False
else:
    have_kerberos = True
import socket
import urllib
import urllib2
import urlparse
import re
import sys
import time

from bzrlib import __version__ as bzrlib_version
from bzrlib import (
    config,
    debug,
    errors,
    osutils,
    trace,
    transport,
    ui,
    urlutils,
    )


class addinfourl(urllib2.addinfourl):
    '''Replacement addinfourl class compatible with python-2.7's xmlrpclib

    In python-2.7, xmlrpclib expects that the response object that it receives
    has a getheader method.  httplib.HTTPResponse provides this but
    urllib2.addinfourl does not.  Add the necessary functions here, ported to
    use the internal data structures of addinfourl.
    '''

    def getheader(self, name, default=None):
        if self.headers is None:
            raise httplib.ResponseNotReady()
        return self.headers.getheader(name, default)

    def getheaders(self):
        if self.headers is None:
            raise httplib.ResponseNotReady()
        return self.headers.items()


class _ReportingFileSocket(object):

    def __init__(self, filesock, report_activity=None):
        self.filesock = filesock
        self._report_activity = report_activity

    def report_activity(self, size, direction):
        if self._report_activity:
            self._report_activity(size, direction)

    def read(self, size=1):
        s = self.filesock.read(size)
        self.report_activity(len(s), 'read')
        return s

    def readline(self, size=-1):
        s = self.filesock.readline(size)
        self.report_activity(len(s), 'read')
        return s

    def __getattr__(self, name):
        return getattr(self.filesock, name)


class _ReportingSocket(object):

    def __init__(self, sock, report_activity=None):
        self.sock = sock
        self._report_activity = report_activity

    def report_activity(self, size, direction):
        if self._report_activity:
            self._report_activity(size, direction)

    def sendall(self, s, *args):
        self.sock.sendall(s, *args)
        self.report_activity(len(s), 'write')

    def recv(self, *args):
        s = self.sock.recv(*args)
        self.report_activity(len(s), 'read')
        return s

    def makefile(self, mode='r', bufsize=-1):
        # httplib creates a fileobject that doesn't do buffering, which
        # makes fp.readline() very expensive because it only reads one byte
        # at a time.  So we wrap the socket in an object that forces
        # sock.makefile to make a buffered file.
        fsock = self.sock.makefile(mode, 65536)
        # And wrap that into a reporting kind of fileobject
        return _ReportingFileSocket(fsock, self._report_activity)

    def __getattr__(self, name):
        return getattr(self.sock, name)


# We define our own Response class to keep our httplib pipe clean
class Response(httplib.HTTPResponse):
    """Custom HTTPResponse, to avoid the need to decorate.

    httplib prefers to decorate the returned objects, rather
    than using a custom object.
    """

    # Some responses have bodies in which we have no interest
    _body_ignored_responses = [301, 302, 303, 307, 400, 401, 403, 404, 501]

    # In finish() below, we may have to discard several MB in the worst
    # case. To avoid buffering that much, we read and discard by chunks
    # instead. The underlying file is either a socket or a StringIO, so reading
    # 8k chunks should be fine.
    _discarded_buf_size = 8192

    def begin(self):
        """Begin to read the response from the server.

        httplib assumes that some responses get no content and does
        not even attempt to read the body in that case, leaving
        the body in the socket, blocking the next request. Let's
        try to work around that.
        """
        httplib.HTTPResponse.begin(self)
        if self.status in self._body_ignored_responses:
            if self.debuglevel >= 2:
                print "For status: [%s]," % self.status,
                print "will read body, length: %s" % self.length
            if not (self.length is None or self.will_close):
                # In some cases, we can't read the body, or even try to,
                # or we may encounter a 104, 'Connection reset by peer'
                # error if there is indeed no body and the server closed
                # the connection just after having issued the response
                # headers (even if the headers indicate a Content-Type...)
                body = self.read(self.length)
                if self.debuglevel >= 9:
                    # This one can be huge and is generally not interesting
                    print "Consumed body: [%s]" % body
            self.close()
        elif self.status == 200:
            # Whatever the request is, it went ok, so we surely don't want to
            # close the connection. Some cases are not correctly detected by
            # httplib.HTTPConnection.getresponse (which calls
            # httplib.HTTPResponse.begin). The CONNECT response for the https
            # through proxy case is one.  Note: the 'will_close' below refers
            # to the "true" socket between us and the server, whereas the
            # 'close()' above refers to the copy of that socket created by
            # httplib for the response itself. So, in the if above we close the
            # socket to indicate that we are done with the response whereas
            # below we keep the socket with the server opened.
            self.will_close = False

    def finish(self):
        """Finish reading the body.

        In some cases, the client may have left some bytes to read in the
        body. That will prevent the next request from succeeding if we use a
        persistent connection. If we don't use a persistent connection, well,
        nothing will block the next request since a new connection will be
        opened anyway.

        :return: the number of bytes left on the socket (may be None)
        """
        pending = None
        if not self.isclosed():
            # Make sure nothing was left to be read on the socket
            pending = 0
            data = True
            while data and self.length:
                # read() will update self.length
                data = self.read(min(self.length, self._discarded_buf_size))
                pending += len(data)
            if pending:
                trace.mutter("%s bytes left on the HTTP socket", pending)
            self.close()
        return pending


# Not inheriting from 'object' because httplib.HTTPConnection doesn't.
class AbstractHTTPConnection:
    """A custom HTTP(S) Connection, which can reset itself on a bad response"""

    response_class = Response

    # When we detect a server responding with the whole file to range requests,
    # we want to warn. But not below a given threshold.
    _range_warning_thresold = 1024 * 1024

    def __init__(self, report_activity=None):
        self._response = None
        self._report_activity = report_activity
        self._ranges_received_whole_file = None

    def _mutter_connect(self):
        netloc = '%s:%s' % (self.host, self.port)
        if self.proxied_host is not None:
            netloc += '(proxy for %s)' % self.proxied_host
        trace.mutter('* About to connect() to %s' % netloc)

    def getresponse(self):
        """Capture the response to be able to cleanup"""
        self._response = httplib.HTTPConnection.getresponse(self)
        return self._response

    def cleanup_pipe(self):
        """Read the remaining bytes of the last response if any."""
        if self._response is not None:
            try:
                pending = self._response.finish()
                # Warn the user (once)
                if (self._ranges_received_whole_file is None
                    and self._response.status == 200
                    and pending and pending > self._range_warning_thresold
                    ):
                    self._ranges_received_whole_file = True
                    trace.warning(
                        'Got a 200 response when asking for multiple ranges,'
                        ' does your server at %s:%s support range requests?',
                        self.host, self.port)
            except socket.error, e:
                # It's conceivable that the socket is in a bad state here
                # (including some test cases) and in this case, it doesn't need
                # cleaning anymore, so no need to fail, we just get rid of the
                # socket and let callers reconnect
                if (len(e.args) == 0
                    or e.args[0] not in (errno.ECONNRESET, errno.ECONNABORTED)):
                    raise
                self.close()
            self._response = None
        # Preserve our preciousss
        sock = self.sock
        self.sock = None
        # Let httplib.HTTPConnection do its housekeeping
        self.close()
        # Restore our preciousss
        self.sock = sock

    def _wrap_socket_for_reporting(self, sock):
        """Wrap the socket before anybody uses it."""
        self.sock = _ReportingSocket(sock, self._report_activity)


class HTTPConnection(AbstractHTTPConnection, httplib.HTTPConnection):

    # XXX: Needs refactoring at the caller level.
    def __init__(self, host, port=None, proxied_host=None,
                 report_activity=None):
        AbstractHTTPConnection.__init__(self, report_activity=report_activity)
        # Use strict=True since we don't support HTTP/0.9
        httplib.HTTPConnection.__init__(self, host, port, strict=True)
        self.proxied_host = proxied_host

    def connect(self):
        if 'http' in debug.debug_flags:
            self._mutter_connect()
        httplib.HTTPConnection.connect(self)
        self._wrap_socket_for_reporting(self.sock)


# Build the appropriate socket wrapper for ssl
try:
    # python 2.6 introduced a better ssl package
    import ssl
    _ssl_wrap_socket = ssl.wrap_socket
except ImportError:
    # python versions prior to 2.6 don't have ssl and ssl.wrap_socket;
    # they use httplib.FakeSocket instead
    def _ssl_wrap_socket(sock, key_file, cert_file):
        ssl_sock = socket.ssl(sock, key_file, cert_file)
        return httplib.FakeSocket(sock, ssl_sock)


class HTTPSConnection(AbstractHTTPConnection, httplib.HTTPSConnection):

    def __init__(self, host, port=None, key_file=None, cert_file=None,
                 proxied_host=None,
                 report_activity=None):
        AbstractHTTPConnection.__init__(self, report_activity=report_activity)
        # Use strict=True since we don't support HTTP/0.9
        httplib.HTTPSConnection.__init__(self, host, port,
                                         key_file, cert_file, strict=True)
        self.proxied_host = proxied_host

    def connect(self):
        if 'http' in debug.debug_flags:
            self._mutter_connect()
        httplib.HTTPConnection.connect(self)
        self._wrap_socket_for_reporting(self.sock)
        if self.proxied_host is None:
            self.connect_to_origin()

    def connect_to_origin(self):
        ssl_sock = _ssl_wrap_socket(self.sock, self.key_file, self.cert_file)
        # Wrap the ssl socket before anybody uses it
        self._wrap_socket_for_reporting(ssl_sock)


class Request(urllib2.Request):
    """A custom Request object.

    urllib2 determines the request method heuristically (based on
    the presence or absence of data). We set the method
    statically.

    The Request object tracks:
    - the connection the request will be made on.
    - the authentication parameters needed to preemptively set
      the authentication header once a first authentication has
      been made.
    """

    def __init__(self, method, url, data=None, headers={},
                 origin_req_host=None, unverifiable=False,
                 connection=None, parent=None,
                 accepted_errors=None):
        urllib2.Request.__init__(self, url, data, headers,
                                 origin_req_host, unverifiable)
        self.method = method
        self.connection = connection
        self.accepted_errors = accepted_errors
        # To handle redirections
        self.parent = parent
        self.redirected_to = None
        # Unless told otherwise, redirections are not followed
        self.follow_redirections = False
        # auth and proxy_auth are dicts containing, at least
        # (scheme, host, port, realm, user, password, protocol, path).
        # The dict entries are mostly handled by the AuthHandler.
        # Some authentication schemes may add more entries.
        self.auth = {}
        self.proxy_auth = {}
        self.proxied_host = None

    def get_method(self):
        return self.method

2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
393
    def set_proxy(self, proxy, type):
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
394
        """Set the proxy and remember the proxied host."""
4776.2.7 by Vincent Ladeuil
Fix proxy CONNECT for non-default ports.
395
        host, port = urllib.splitport(self.get_host())
396
        if port is None:
397
            # We need to set the default port ourselves way before it gets set
398
            # in the HTTP[S]Connection object at build time.
399
            if self.type == 'https':
400
                conn_class = HTTPSConnection
401
            else:
402
                conn_class = HTTPConnection
403
            port = conn_class.default_port
404
        self.proxied_host = '%s:%s' % (host, port)
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
405
        urllib2.Request.set_proxy(self, proxy, type)
4797.75.1 by Andrew Bennetts
Backport fix for bug 558343 from lp:bzr r5220.
406
        # When urllib2 makes a https request with our wrapper code and a proxy,
407
        # it sets Host to the https proxy, not the host we want to talk to.
408
        # I'm fairly sure this is our fault, but what is the cause is an open
409
        # question. -- Robert Collins May 8 2010.
410
        self.add_unredirected_header('Host', self.proxied_host)
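
A minimal sketch of the default-port step in set_proxy above, written for Python 3 rather than the Python 2 urllib2 this module targets. DEFAULT_PORTS and proxied_host_for are hypothetical names introduced only for illustration; the code above reads default_port from the connection class instead.

```python
# Hypothetical table standing in for HTTPConnection.default_port and
# HTTPSConnection.default_port.
DEFAULT_PORTS = {'http': 80, 'https': 443}


def proxied_host_for(netloc, scheme):
    """Return 'host:port', filling in the scheme's default port if absent."""
    host, sep, port = netloc.rpartition(':')
    if not sep or not port.isdigit():
        # No explicit numeric port (this also covers a bare IPv6 literal
        # such as '[::1]'), so fall back to the scheme default.
        host, port = netloc, DEFAULT_PORTS[scheme]
    return '%s:%s' % (host, port)


explicit = proxied_host_for('example.com:8443', 'https')
implicit = proxied_host_for('example.com', 'https')
```

The point of computing the port this early is that a CONNECT request line needs an explicit host:port pair, even when the original URL left the port out.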
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
411
412
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
413
class _ConnectRequest(Request):
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
414
415
    def __init__(self, request):
416
        """Constructor
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
417
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
418
        :param request: the first request sent to the proxied host, already
419
            processed by the opener (i.e. proxied_host is already set).
420
        """
421
        # We give a fake url and redefine get_selector or urllib2 will be
422
        # confused
423
        Request.__init__(self, 'CONNECT', request.get_full_url(),
424
                         connection=request.connection)
3376.2.4 by Martin Pool
Remove every assert statement from bzrlib!
425
        if request.proxied_host is None:
426
            raise AssertionError()
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
427
        self.proxied_host = request.proxied_host
428
429
    def get_selector(self):
430
        return self.proxied_host
431
432
    def set_proxy(self, proxy, type):
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
433
        """Set the proxy without remembering the proxied host.
434
435
        We already know the proxied host by definition: the CONNECT request
436
        occurs only when the connection goes through a proxy. The usual
437
        processing (masquerade the request so that the connection is done to
438
        the proxy while the request is targeted at another host) does not apply
439
        here. In fact, the connection is already established with proxy and we
440
        just want to enable the SSL tunneling.
441
        """
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
442
        urllib2.Request.set_proxy(self, proxy, type)
443
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
444
445
class ConnectionHandler(urllib2.BaseHandler):
446
    """Provides connection-sharing by pre-processing requests.
447
448
    urllib2 provides no way to access the HTTPConnection object
449
    internally used. But we need it in order to achieve
450
    connection sharing. So, we add it to the request just before
451
    it is processed, and then we override the do_open method for
2363.4.7 by Vincent Ladeuil
Deeper tests, prepare the auth setting that will avoid the
452
    http[s] requests in AbstractHTTPHandler.
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
453
    """
454
455
    handler_order = 1000 # after all pre-processings
456
3945.1.5 by Vincent Ladeuil
Start implementing http activity reporting at socket level.
457
    def __init__(self, report_activity=None):
458
        self._report_activity = report_activity
459
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
460
    def create_connection(self, request, http_connection_class):
461
        host = request.get_host()
462
        if not host:
2004.1.15 by v.ladeuil+lp at free
Better design for bogus servers. Both urllib and pycurl pass tests.
463
            # Just a bit of paranoia here, this should have been
464
            # handled in the higher levels
2004.1.27 by v.ladeuil+lp at free
Fix bug #57644 by issuing an explicit error message.
465
            raise errors.InvalidURL(request.get_full_url(), 'no host given.')
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
466
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
467
        # We create a connection (but it will not connect until the first
468
        # request is made)
2004.1.42 by v.ladeuil+lp at free
Fix #70803 by catching the httplib exception.
469
        try:
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
470
            connection = http_connection_class(
3945.1.5 by Vincent Ladeuil
Start implementing http activity reporting at socket level.
471
                host, proxied_host=request.proxied_host,
472
                report_activity=self._report_activity)
2004.1.42 by v.ladeuil+lp at free
Fix #70803 by catching the httplib exception.
473
        except httplib.InvalidURL, exception:
474
            # There is only one occurrence of InvalidURL in httplib
475
            raise errors.InvalidURL(request.get_full_url(),
476
                                    extra='nonnumeric port')
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
477
478
        return connection
479
480
    def capture_connection(self, request, http_connection_class):
481
        """Capture or inject the request connection.
482
483
        Two cases:
484
        - the request has no connection: create a new one,
485
486
        - the request has a connection: this one has been used
487
          already, let's capture it, so that we can give it to
488
          another transport to be reused. We don't do that
489
          ourselves: the Transport object gets the connection from
490
          a first request and then propagates it, from request to
491
          request or to cloned transports.
492
        """
493
        connection = request.connection
494
        if connection is None:
495
            # Create a new one
496
            connection = self.create_connection(request, http_connection_class)
497
            request.connection = connection
498
499
        # All connections will pass here, propagate debug level
500
        connection.set_debuglevel(DEBUG)
501
        return request
502
503
    def http_request(self, request):
504
        return self.capture_connection(request, HTTPConnection)
505
506
    def https_request(self, request):
507
        return self.capture_connection(request, HTTPSConnection)
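
The capture-or-create idea used by ConnectionHandler can be sketched in isolation as follows. This is an illustrative Python 3 reduction, not the module's code: FakeConnection and FakeRequest are hypothetical stand-ins for the real connection and Request classes, and no socket is opened.

```python
class FakeConnection(object):
    """Hypothetical stand-in for an HTTP(S) connection object."""

    def __init__(self, host):
        self.host = host


class FakeRequest(object):
    """Hypothetical stand-in for a request carrying a connection slot."""

    def __init__(self, host, connection=None):
        self.host = host
        self.connection = connection


def capture_connection(request, connection_class):
    """Create a connection only if the request does not carry one."""
    if request.connection is None:
        request.connection = connection_class(request.host)
    return request


# The first request gets a fresh connection; the caller then hands that
# connection to the next request, which reuses it instead of creating one.
first = capture_connection(FakeRequest('example.com'), FakeConnection)
second = capture_connection(
    FakeRequest('example.com', connection=first.connection), FakeConnection)
```

As in the docstring above, the handler itself never propagates the connection; whoever builds the next request is responsible for passing it along.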
508
509
510
class AbstractHTTPHandler(urllib2.AbstractHTTPHandler):
511
    """A custom handler for HTTP(S) requests.
512
513
    We override urllib2.AbstractHTTPHandler to get a better
514
    control of the connection, the ability to implement new
515
    request types and return a response able to cope with
516
    persistent connections.
517
    """
518
519
    # We change our order to be before urllib2 HTTP[S]Handlers
2004.3.1 by vila
Test ConnectionError exceptions.
520
    # and be chosen instead of them (the first http_open called
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
521
    # wins).
522
    handler_order = 400
523
524
    _default_headers = {'Pragma': 'no-cache',
525
                        'Cache-control': 'max-age=0',
526
                        'Connection': 'Keep-Alive',
2004.1.15 by v.ladeuil+lp at free
Better design for bogus servers. Both urllib and pycurl pass tests.
527
                        'User-agent': 'bzr/%s (urllib)' % bzrlib_version,
2004.3.3 by vila
Better (but still incomplete) design for bogus servers.
528
                        'Accept': '*/*',
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
529
                        }
530
531
    def __init__(self):
2004.1.16 by v.ladeuil+lp at free
Add tests against erroneous http status lines.
532
        urllib2.AbstractHTTPHandler.__init__(self, debuglevel=DEBUG)
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
533
2004.1.15 by v.ladeuil+lp at free
Better design for bogus servers. Both urllib and pycurl pass tests.
534
    def http_request(self, request):
535
        """Common headers setting"""
536
537
        request.headers.update(self._default_headers.copy())
538
        # FIXME: We may have to add the Content-Length header if
539
        # we have data to send.
540
        return request
541
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
542
    def retry_or_raise(self, http_class, request, first_try):
543
        """Retry the request (once) or raise the exception.
2004.3.1 by vila
Test ConnectionError exceptions.
544
545
        urllib2 raises application-level exceptions; we
546
        just have to translate them.
547
548
        httplib can raise transport-level exceptions (badly
549
        formatted dialog, loss of connection or socket-level
550
        problems). In that case we should issue the request again
551
        (httplib will close and reopen a new connection if
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
552
        needed).
553
        """
554
        # When an exception occurs, we give back the original
555
        # Traceback or the bugs are hard to diagnose.
556
        exc_type, exc_val, exc_tb = sys.exc_info()
557
        if exc_type == socket.gaierror:
558
            # No need to retry, that will not help
559
            raise errors.ConnectionError("Couldn't resolve host '%s'"
560
                                         % request.get_origin_req_host(),
561
                                         orig_error=exc_val)
2955.2.1 by Vincent Ladeuil
Fix #160012 by leaving the http pipeline related exceptions raise.
562
        elif isinstance(exc_val, httplib.ImproperConnectionState):
563
            # The httplib pipeline is in an incorrect state; it's a bug in our
564
            # implementation.
565
            raise exc_type, exc_val, exc_tb
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
566
        else:
567
            if first_try:
3111.1.20 by Vincent Ladeuil
Make all the test pass. Looks like we are HTTP/1.1 compliant.
568
                if self._debuglevel >= 2:
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
569
                    print 'Received exception: [%r]' % exc_val
570
                    print '  On connection: [%r]' % request.connection
571
                    method = request.get_method()
572
                    url = request.get_full_url()
573
                    print '  Will retry, %s %r' % (method, url)
574
                request.connection.close()
575
                response = self.do_open(http_class, request, False)
576
            else:
3111.1.20 by Vincent Ladeuil
Make all the test pass. Looks like we are HTTP/1.1 compliant.
577
                if self._debuglevel >= 2:
2004.1.39 by v.ladeuil+lp at free
Fix a race condition that make selftest fail once in a while.
578
                    print 'Received second exception: [%r]' % exc_val
579
                    print '  On connection: [%r]' % request.connection
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
580
                if exc_type in (httplib.BadStatusLine, httplib.UnknownProtocol):
581
                    # httplib.BadStatusLine and
582
                    # httplib.UnknownProtocol indicate that a
583
                    # bogus server was encountered or a bad
584
                    # connection (i.e. transient errors) is
585
                    # experienced; we have already retried once
586
                    # for that request so we raise the exception.
587
                    my_exception = errors.InvalidHttpResponse(
588
                        request.get_full_url(),
589
                        'Bad status line received',
590
                        orig_error=exc_val)
4628.1.2 by Vincent Ladeuil
More complete fix.
591
                elif (isinstance(exc_val, socket.error) and len(exc_val.args)
5599.3.3 by John Arbash Meinel
Treate WSAECONNABORTED the same as WSAECONNRESET in the http _urllib2 code. Bug #686587
592
                      and exc_val.args[0] in (errno.ECONNRESET, 10053, 10054)):
593
                      # 10053 == WSAECONNABORTED
594
                      # 10054 == WSAECONNRESET
4628.1.2 by Vincent Ladeuil
More complete fix.
595
                    raise errors.ConnectionReset(
596
                        "Connection lost while sending request.")
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
597
                else:
598
                    # All other exceptions are considered connection related.
599
600
                    # socket errors generally occur for reasons
601
                    # far outside our scope, so closing the
602
                    # connection and retrying is the best we can
603
                    # do.
604
605
                    my_exception = errors.ConnectionError(
606
                        msg= 'while sending %s %s:' % (request.get_method(),
607
                                                       request.get_selector()),
608
                        orig_error=exc_val)
609
3111.1.20 by Vincent Ladeuil
Make all the test pass. Looks like we are HTTP/1.1 compliant.
610
                if self._debuglevel >= 2:
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
611
                    print 'On connection: [%r]' % request.connection
612
                    method = request.get_method()
613
                    url = request.get_full_url()
614
                    print '  Failed again, %s %r' % (method, url)
615
                    print '  Will raise: [%r]' % my_exception
616
                raise my_exception, None, exc_tb
3059.2.5 by Vincent Ladeuil
DAMN^64, the http test server is 1.0 not 1.1 :( Better pipe cleaning and less readv caching (since that's the point of the whole fix).
617
        return response
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
618
619
    def do_open(self, http_class, request, first_try=True):
620
        """See urllib2.AbstractHTTPHandler.do_open for the general idea.
621
622
        The request will be retried once if it fails.
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
623
        """
624
        connection = request.connection
3376.2.4 by Martin Pool
Remove every assert statement from bzrlib!
625
        if connection is None:
626
            raise AssertionError(
627
                'Cannot process a request without a connection')
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
628
2004.1.19 by v.ladeuil+lp at free
Test protocol version in http responses.
629
        # Get all the headers
2004.1.15 by v.ladeuil+lp at free
Better design for bogus servers. Both urllib and pycurl pass tests.
630
        headers = {}
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
631
        headers.update(request.header_items())
632
        headers.update(request.unredirected_hdrs)
3430.1.1 by Vincent Ladeuil
Fix bug #229076 by fixing header names before sending the request.
633
        # Some servers or proxies will choke on headers not properly
634
        # cased. httplib/urllib/urllib2 all use capitalize to get canonical
635
        # header names, but only python2.5 urllib2 use title() to fix them just
636
        # before sending the request. And not all versions of python 2.5 do
637
        # that. Since we replace urllib2.AbstractHTTPHandler.do_open we do it
638
        # ourselves below.
3430.1.2 by Vincent Ladeuil
Fixed as per Matt Nordhoff review.
639
        headers = dict((name.title(), val) for name, val in headers.iteritems())
2004.3.1 by vila
Test ConnectionError exceptions.
640
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
641
        try:
3052.3.3 by Vincent Ladeuil
Add -Dhttp support.
642
            method = request.get_method()
643
            url = request.get_selector()
644
            connection._send_request(method, url,
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
645
                                     # FIXME: implement 100-continue
646
                                     #None, # We don't send the body yet
647
                                     request.get_data(),
648
                                     headers)
3052.3.3 by Vincent Ladeuil
Add -Dhttp support.
649
            if 'http' in debug.debug_flags:
650
                trace.mutter('> %s %s' % (method, url))
651
                hdrs = ['%s: %s' % (k, v) for k,v in headers.items()]
652
                trace.mutter('> ' + '\n> '.join(hdrs) + '\n')
3111.1.20 by Vincent Ladeuil
Make all the test pass. Looks like we are HTTP/1.1 compliant.
653
            if self._debuglevel >= 1:
654
                print 'Request sent: [%r] from (%s)' \
655
                    % (request, request.connection.sock.getsockname())
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
656
            response = connection.getresponse()
657
            convert_to_addinfourl = True
2004.1.37 by v.ladeuil+lp at free
Small refactoring.
658
        except (socket.gaierror, httplib.BadStatusLine, httplib.UnknownProtocol,
659
                socket.error, httplib.HTTPException):
3059.2.5 by Vincent Ladeuil
DAMN^64, the http test server is 1.0 not 1.1 :( Better pipe cleaning and less readv caching (since that's the point of the whole fix).
660
            response = self.retry_or_raise(http_class, request, first_try)
661
            convert_to_addinfourl = False
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
662
663
# FIXME: HTTPConnection does not fully support 100-continue (the
664
# server responses are just ignored)
665
666
#        if code == 100:
667
#            mutter('Will send the body')
668
#            # We can send the body now
669
#            body = request.get_data()
670
#            if body is None:
671
#                raise URLError("No data given")
672
#            connection.send(body)
673
#            response = connection.getresponse()
674
3111.1.20 by Vincent Ladeuil
Make all the test pass. Looks like we are HTTP/1.1 compliant.
675
        if self._debuglevel >= 2:
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
676
            print 'Receives response: %r' % response
677
            print '  For: %r(%r)' % (request.get_method(),
678
                                     request.get_full_url())
679
680
        if convert_to_addinfourl:
681
            # Shamelessly copied from urllib2
682
            req = request
683
            r = response
684
            r.recv = r.read
3287.3.2 by Andrew Bennetts
Buffer 64k, rather than just 8k.
685
            fp = socket._fileobject(r, bufsize=65536)
5393.6.1 by Toshio Kuratomi
Fix branching from lp: repositories under python-2.7
686
            resp = addinfourl(fp, r.msg, req.get_full_url())
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
687
            resp.code = r.status
688
            resp.msg = r.reason
3059.2.5 by Vincent Ladeuil
DAMN^64, the http test server is 1.0 not 1.1 :( Better pipe cleaning and less readv caching (since that's the point of the whole fix).
689
            resp.version = r.version
3111.1.20 by Vincent Ladeuil
Make all the test pass. Looks like we are HTTP/1.1 compliant.
690
            if self._debuglevel >= 2:
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
691
                print 'Create addinfourl: %r' % resp
692
                print '  For: %r(%r)' % (request.get_method(),
693
                                         request.get_full_url())
3059.2.5 by Vincent Ladeuil
DAMN^64, the http test server is 1.0 not 1.1 :( Better pipe cleaning and less readv caching (since that's the point of the whole fix).
694
            if 'http' in debug.debug_flags:
695
                version = 'HTTP/%d.%d'
696
                try:
697
                    version = version % (resp.version / 10,
698
                                         resp.version % 10)
699
                except TypeError:
700
                    version = 'HTTP/%r' % resp.version
701
                trace.mutter('< %s %s %s' % (version, resp.code,
702
                                             resp.msg))
703
                # Use the raw header lines instead of treating resp.info() as a
704
                # dict since we may miss duplicated headers otherwise.
705
                hdrs = [h.rstrip('\r\n') for h in resp.info().headers]
706
                trace.mutter('< ' + '\n< '.join(hdrs) + '\n')
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
707
        else:
708
            resp = response
709
        return resp
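
The retry-once behaviour of do_open and retry_or_raise can be sketched as below. This is a simplified Python 3 illustration under stated assumptions: send_with_one_retry, send and reset are hypothetical names, and a second failure is simply re-raised here, whereas the code above translates it into a bzrlib error.

```python
def send_with_one_retry(send, reset, first_try=True):
    """Call send(); on a socket-level error, reset and try exactly once more."""
    try:
        return send()
    except OSError:
        if not first_try:
            # Already retried once: give up and let the error propagate.
            raise
        # Drop the (possibly stale) connection before the second attempt.
        reset()
        return send_with_one_retry(send, reset, first_try=False)


attempts = []
resets = []


def flaky_send():
    """Fail on the first call only, like a keep-alive connection gone stale."""
    attempts.append(1)
    if len(attempts) == 1:
        raise ConnectionResetError('peer closed the connection')
    return 'ok'


result = send_with_one_retry(flaky_send, lambda: resets.append(1))
```

A single retry is enough to recover from a persistent connection that the server closed between two requests, without hiding a server that fails every time.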
710
711
712
class HTTPHandler(AbstractHTTPHandler):
2004.2.1 by John Arbash Meinel
Cleanup of urllib functions
713
    """A custom handler that just thunks into HTTPConnection"""
714
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
715
    def http_open(self, request):
716
        return self.do_open(HTTPConnection, request)
717
718
719
class HTTPSHandler(AbstractHTTPHandler):
2004.2.1 by John Arbash Meinel
Cleanup of urllib functions
720
    """A custom handler that just thunks into HTTPSConnection"""
721
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
722
    https_request = AbstractHTTPHandler.http_request
723
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
724
    def https_open(self, request):
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
725
        connection = request.connection
726
        if connection.sock is None and \
727
                connection.proxied_host is not None and \
728
                request.get_method() != 'CONNECT':  # Don't loop
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
729
            # FIXME: We need a gazillion connection tests here, but we still
730
            # miss a https server :-( :
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
731
            # - with and without proxy
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
732
            # - with and without certificate
733
            # - with self-signed certificate
734
            # - with and without authentication
2929.3.9 by Vincent Ladeuil
Don't pretend we support HTTP/0.9 since we don't and do that correctly.
735
            # - with good and bad credentials (especially the proxy auth around
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
736
            #   CONNECT)
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
737
            # - with basic and digest schemes
738
            # - reconnection on errors
739
            # - connection persistence behaviour (including reconnection)
740
741
            # We are about to connect for the first time via a proxy, we must
742
            # issue a CONNECT request first to establish the encrypted link
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
743
            connect = _ConnectRequest(request)
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
744
            response = self.parent.open(connect)
745
            if response.code != 200:
4797.75.1 by Andrew Bennetts
Backport fix for bug 558343 from lp:bzr r5220.
746
                raise errors.ConnectionError("Can't connect to %s via proxy %s" % (
2540.2.2 by Vincent Ladeuil
Fix #120678 by issuing a CONNECT request when https is used via a proxy.
747
                        connect.proxied_host, connection.host))
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
748
            # Housekeeping
3059.2.2 by Vincent Ladeuil
Read http responses on demand without buffering the whole body
749
            connection.cleanup_pipe()
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
750
            # Establish the connection encryption
2540.2.1 by Vincent Ladeuil
Rough, working, tested against squid+apache in basic auth fix for #120678
751
            connection.connect_to_origin()
752
            # Propagate the connection to the original request
753
            request.connection = connection
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
754
        return self.do_open(HTTPSConnection, request)
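
The condition guarding the CONNECT step in https_open can be read as a small predicate. The sketch below is illustrative only; needs_connect_tunnel is a hypothetical helper that does not exist in the module, and it takes plain values instead of connection and request objects.

```python
def needs_connect_tunnel(sock, proxied_host, method):
    """Tell whether a CONNECT request must be issued before the real one.

    A tunnel is needed only when the connection has not been opened yet,
    the request goes through a proxy, and the request is not itself the
    CONNECT (which would otherwise loop forever).
    """
    return (sock is None
            and proxied_host is not None
            and method != 'CONNECT')


first_request = needs_connect_tunnel(None, 'example.com:443', 'GET')
connect_itself = needs_connect_tunnel(None, 'example.com:443', 'CONNECT')
direct_request = needs_connect_tunnel(None, None, 'GET')
```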
755
756
class HTTPRedirectHandler(urllib2.HTTPRedirectHandler):
757
    """Handles redirect requests.
758
759
    We have to implement our own scheme because we use a specific
760
    Request object and because we want to implement a specific
761
    policy.
762
    """
763
    _debuglevel = DEBUG
764
    # RFC2616 says that only read requests should be redirected
765
    # without interacting with the user. But bzr uses some
766
    # shortcuts to optimize against roundtrips which can lead to
767
    # write requests being issued before read requests of
768
    # containing dirs can be redirected. So we redirect write
769
    # requests in the same way which seems to respect the spirit
770
    # of the RFC if not its letter.
771
772
    def redirect_request(self, req, fp, code, msg, headers, newurl):
773
        """See urllib2.HTTPRedirectHandler.redirect_request"""
774
        # We would have preferred to update the request instead
775
        # of creating a new one, but the urllib2.Request object
776
        # has a too complicated creation process to provide a
777
        # simple enough equivalent update process. Instead, when
2164.2.29 by Vincent Ladeuil
Test the http redirection at the request level even if it's not
778
        # redirecting, we only update the following request in
779
        # the redirect chain with a reference to the parent
780
        # request.
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
781
2164.2.1 by v.ladeuil+lp at free
First rough http branch redirection implementation.
782
        # Some codes make no sense in our context and are treated
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
783
        # as errors:
784
785
        # 300: Multiple choices for different representations of
786
        #      the URI. Using that mechanism with bzr will violate the
787
        #      protocol neutrality of Transport.
788
789
        # 304: Not modified (SHOULD only occur with conditional
790
        #      GETs which are not used by our implementation)
791
792
        # 305: Use proxy. I can't imagine this one occurring in
793
        #      our context-- vila/20060909
794
795
        # 306: Unused (if the RFC says so...)
796
2164.2.1 by v.ladeuil+lp at free
First rough http branch redirection implementation.
797
        # If the code is 302 and the request is HEAD, some may
798
        # think that it is a sufficient hint that the file exists
799
        # and that we MAY avoid following the redirections. But
800
        # if we want to be sure, we MUST follow them.
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
801
802
        if code in (301, 302, 303, 307):
803
            return Request(req.get_method(), newurl,
804
                           headers = req.headers,
805
                           origin_req_host = req.get_origin_req_host(),
806
                           unverifiable = True,
807
                           # TODO: It will be nice to be able to
808
                           # detect virtual hosts sharing the same
809
                           # IP address, that will allow us to
810
                           # share the same connection...
811
                           connection = None,
812
                           parent = req,
813
                           )
814
        else:
815
            raise urllib2.HTTPError(req.get_full_url(), code, msg, headers, fp)
816
2164.2.29 by Vincent Ladeuil
Test the http redirection at the request level even if it's not
817
    def http_error_302(self, req, fp, code, msg, headers):
2004.3.1 by vila
Test ConnectionError exceptions.
818
        """Requests the redirected to URI.
819
3059.2.2 by Vincent Ladeuil
Read http responses on demand without buffering the whole body
820
        Copied from urllib2 to be able to clean the pipe of the associated
821
        connection, *before* issuing the redirected request but *after* having
822
        possibly raised an error.
2004.3.1 by vila
Test ConnectionError exceptions.
823
        """
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
824
        # Some servers (incorrectly) return multiple Location headers
825
        # (so probably same goes for URI).  Use first header.
826
827
        # TODO: Once we get rid of addinfourl objects, the
828
        # following will need to be updated to use correct case
829
        # for headers.
830
        if 'location' in headers:
831
            newurl = headers.getheaders('location')[0]
832
        elif 'uri' in headers:
833
            newurl = headers.getheaders('uri')[0]
834
        else:
835
            return
3111.1.20 by Vincent Ladeuil
Make all the test pass. Looks like we are HTTP/1.1 compliant.
836
        if self._debuglevel >= 1:
2164.2.1 by v.ladeuil+lp at free
First rough http branch redirection implementation.
837
            print 'Redirected to: %s (followed: %r)' % (newurl,
838
                                                        req.follow_redirections)
839
        if req.follow_redirections is False:
840
            req.redirected_to = newurl
841
            return fp
842
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
843
        newurl = urlparse.urljoin(req.get_full_url(), newurl)
844
845
        # This call succeeds or raises an error. urllib2 returns
846
        # if redirect_request returns None, but our
847
        # redirect_request never returns None.
848
        redirected_req = self.redirect_request(req, fp, code, msg, headers,
849
                                               newurl)
850
851
        # loop detection
852
        # .redirect_dict has a key url if url was previously visited.
853
        if hasattr(req, 'redirect_dict'):
854
            visited = redirected_req.redirect_dict = req.redirect_dict
855
            if (visited.get(newurl, 0) >= self.max_repeats or
856
                len(visited) >= self.max_redirections):
857
                raise urllib2.HTTPError(req.get_full_url(), code,
858
                                        self.inf_msg + msg, headers, fp)
859
        else:
860
            visited = redirected_req.redirect_dict = req.redirect_dict = {}
861
        visited[newurl] = visited.get(newurl, 0) + 1
862
863
        # We can close the fp now that we are sure that we won't
864
        # use it with HTTPError.
865
        fp.close()
866
        # We have all we need already in the response
3059.2.2 by Vincent Ladeuil
Read http responses on demand without buffering the whole body
867
        req.connection.cleanup_pipe()
2004.1.1 by vila
Connection sharing, with redirection. without authentification.
868
869
        return self.parent.open(redirected_req)
870
2164.2.29 by Vincent Ladeuil
Test the http redirection at the request level even if it's not
871
    http_error_301 = http_error_303 = http_error_307 = http_error_302
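
# Illustration only (not part of bzrlib): a minimal, standalone sketch of the
# loop detection used in http_error_302. The names RedirectLoop and
# note_redirect, and the default limits, are assumptions made for this
# example; the handler itself relies on self.max_repeats and
# self.max_redirections and raises urllib2.HTTPError.

```python
class RedirectLoop(Exception):
    """Raised when a redirection chain repeats or grows too long."""


def note_redirect(visited, newurl, max_repeats=4, max_redirections=10):
    """Record one more visit to newurl in the shared visited dict.

    visited maps each url to the number of times it has been followed.
    The same dict is meant to be shared by all requests of one chain.
    """
    if (visited.get(newurl, 0) >= max_repeats
            or len(visited) >= max_redirections):
        # Either the same url keeps coming back or the chain is too long
        raise RedirectLoop(newurl)
    visited[newurl] = visited.get(newurl, 0) + 1
    return visited
```

# Sharing one dict across the whole chain is what makes the counters
# meaningful: each redirected request sees the visits recorded by its
# predecessors.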


class ProxyHandler(urllib2.ProxyHandler):
    """Handles proxy setting.

    Copied and modified from urllib2 to be able to modify the request during
    the request pre-processing instead of modifying it at _open time. As we
    capture (or create) the connection object during request processing, _open
    time was too late.

    The main task is to modify the request so that the connection is done to
    the proxy while the request still refers to the destination host.

    Note: the proxy handling *may* modify the protocol used; the request may be
    against an https server proxied through an http proxy. So, https_request
    will be called, but later it's really http_open that will be called. This
    explains why we don't have to call self.parent.open as urllib2 did.
    """

    # Proxies must be in front
    handler_order = 100
    _debuglevel = DEBUG

    def __init__(self, proxies=None):
        urllib2.ProxyHandler.__init__(self, proxies)
        # First, let's get rid of the urllib2 implementation
        for type, proxy in self.proxies.items():
            if self._debuglevel >= 3:
                print 'Will unbind %s_open for %r' % (type, proxy)
            delattr(self, '%s_open' % type)

        def bind_scheme_request(proxy, scheme):
            if proxy is None:
                return
            scheme_request = scheme + '_request'
            if self._debuglevel >= 3:
                print 'Will bind %s for %r' % (scheme_request, proxy)
            setattr(self, scheme_request,
                lambda request: self.set_proxy(request, scheme))
        # We are only interested in the http[s] proxies
        http_proxy = self.get_proxy_env_var('http')
        bind_scheme_request(http_proxy, 'http')
        https_proxy = self.get_proxy_env_var('https')
        bind_scheme_request(https_proxy, 'https')

    def get_proxy_env_var(self, name, default_to='all'):
        """Get a proxy env var.

        Note that we indirectly rely on
        urllib.getproxies_environment taking into account the
        uppercased values for proxy variables.
        """
        try:
            return self.proxies[name.lower()]
        except KeyError:
            if default_to is not None:
                # Try to get the alternate environment variable
                try:
                    return self.proxies[default_to]
                except KeyError:
                    pass
        return None

    def proxy_bypass(self, host):
        """Check if host should be proxied or not.

        :returns: True to skip the proxy, False otherwise.
        """
        no_proxy = self.get_proxy_env_var('no', default_to=None)
        bypass = self.evaluate_proxy_bypass(host, no_proxy)
        if bypass is None:
            # Nevertheless, there are platform-specific ways to
            # ignore proxies...
            return urllib.proxy_bypass(host)
        else:
            return bypass

    def evaluate_proxy_bypass(self, host, no_proxy):
        """Check the host against a comma-separated no_proxy list as a string.

        :param host: ``host:port`` being requested

        :param no_proxy: comma-separated list of hosts to access directly.

        :returns: True to skip the proxy, False not to, or None to
            leave it to urllib.
        """
        if no_proxy is None:
            # All hosts are proxied
            return False
        hhost, hport = urllib.splitport(host)
        # Does host match any of the domains mentioned in
        # no_proxy? The rules about what is authorized in no_proxy
        # are fuzzy (to say the least). We try to allow most
        # commonly seen values.
        for domain in no_proxy.split(','):
            domain = domain.strip()
            if domain == '':
                continue
            dhost, dport = urllib.splitport(domain)
            if hport == dport or dport is None:
                # Protect glob chars
                dhost = dhost.replace(".", r"\.")
                dhost = dhost.replace("*", r".*")
                dhost = dhost.replace("?", r".")
                if re.match(dhost, hhost, re.IGNORECASE):
                    return True
        # No entry explicitly excludes the host
        return None

    def set_proxy(self, request, type):
        if self.proxy_bypass(request.get_host()):
            return request

        proxy = self.get_proxy_env_var(type)
        if self._debuglevel >= 3:
            print 'set_proxy %s_request for %r' % (type, proxy)
        # FIXME: python 2.5 urlparse provides a better _parse_proxy which can
        # grok user:password@host:port as well as
        # http://user:password@host:port

        (scheme, user, password,
         host, port, path) = transport.ConnectedTransport._split_url(proxy)
        if not host:
            raise errors.InvalidURL(proxy, 'No host component')

        if request.proxy_auth == {}:
            # No proxy auth parameters are available, we are handling the first
            # proxied request, initialize.  scheme (the authentication scheme)
            # and realm will be set by the AuthHandler
            request.proxy_auth = {
                                  'host': host, 'port': port,
                                  'user': user, 'password': password,
                                  'protocol': scheme,
                                  # We ignore path since we connect to a proxy
                                  'path': None}
        if port is None:
            phost = host
        else:
            phost = host + ':%d' % port
        request.set_proxy(phost, type)
        if self._debuglevel >= 3:
            print 'set_proxy: proxy set to %s://%s' % (type, phost)
        return request
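
# Illustration only (not part of bzrlib): a standalone sketch of the no_proxy
# matching rules described in evaluate_proxy_bypass. split_port and
# matches_no_proxy are hypothetical helpers written for this example;
# split_port stands in for urllib.splitport. Unlike the method above, the
# sketch returns False (rather than None) when nothing matches, since it has
# no platform-specific fallback to defer to.

```python
import re


def split_port(hostport):
    """Split 'host:port' into (host, port); port is None when absent."""
    host, sep, port = hostport.rpartition(':')
    if sep and port.isdigit():
        return host, port
    return hostport, None


def matches_no_proxy(host, no_proxy):
    """Return True if host matches one entry of the no_proxy string."""
    hhost, hport = split_port(host)
    for domain in no_proxy.split(','):
        domain = domain.strip()
        if domain == '':
            # Empty entries match nothing
            continue
        dhost, dport = split_port(domain)
        if dport is None or dport == hport:
            # Escape dots first, then translate the glob characters
            pattern = dhost.replace('.', r'\.')
            pattern = pattern.replace('*', '.*').replace('?', '.')
            if re.match(pattern, hhost, re.IGNORECASE):
                return True
    return False
```

# Escaping the dots before translating '*' and '?' matters: done the other
# way round, the dots introduced by the translation would be escaped too.
# Note that re.match only anchors the pattern at the start of the host.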


class AbstractAuthHandler(urllib2.BaseHandler):
    """A custom abstract authentication handler for all http authentications.

    Provides the meat to handle authentication errors and
    preemptively set authentication headers after the first
    successful authentication.

    This can be used for http and proxy, as well as for basic, negotiate and
    digest authentications.

    This provides a unified interface for all authentication handlers
    (urllib2 provides far too many with different policies).

    The interaction between this handler and the urllib2
    framework is not obvious, it works as follows:

    opener.open(request) is called:

    - that may trigger http_request which will add an authentication header
      (self.build_auth_header) if enough info is available.

    - the request is sent to the server,

    - if an authentication error is received self.auth_required is called,
      we acquire the authentication info in the error headers and call
      self.auth_match to check that we are able to try the
      authentication and complete the authentication parameters,

    - we call parent.open(request), that may trigger http_request
      and will add a header (self.build_auth_header), but here we have
      all the required info (keep in mind that the request and
      authentication used in the recursive calls are really (and must be)
      the *same* objects).

    - if the call returns a response, the authentication has been
      successful and the request authentication parameters have been updated.
    """

    scheme = None
    """The scheme as it appears in the server header (lower cased)"""

    _max_retry = 3
    """We don't want to retry authenticating endlessly"""

    requires_username = True
    """Whether the auth mechanism requires a username."""

    # The following attributes should be defined by subclasses:
    # - auth_required_header:  the header received from the server
    # - auth_header: the header sent in the request

    def __init__(self):
        # We want to know when we enter into a try/fail cycle of
        # authentications so we initialize to None to indicate that we aren't
        # in such a cycle by default.
        self._retry_count = None

    def _parse_auth_header(self, server_header):
        """Parse the authentication header.

        :param server_header: The value of the header sent by the server
            describing the authentication request.

        :return: A tuple (scheme, remainder) scheme being the first word in the
            given header (lower cased), remainder may be None.
        """
        try:
            scheme, remainder = server_header.split(None, 1)
        except ValueError:
            scheme = server_header
            remainder = None
        return (scheme.lower(), remainder)

    def update_auth(self, auth, key, value):
        """Update a value in auth marking the auth as modified if needed"""
        old_value = auth.get(key, None)
        if old_value != value:
            auth[key] = value
            auth['modified'] = True

    def auth_required(self, request, headers):
        """Retry the request if the auth scheme is ours.

        :param request: The request needing authentication.
        :param headers: The headers for the authentication error response.
        :return: None or the response for the authenticated request.
        """
        # Don't try to authenticate endlessly
        if self._retry_count is None:
            # Retries are recursive calls, None identifies the first retry
            self._retry_count = 1
        else:
            self._retry_count += 1
            if self._retry_count > self._max_retry:
                # Let's be ready for next round
                self._retry_count = None
                return None
        server_headers = headers.getheaders(self.auth_required_header)
        if not server_headers:
            # The http error MUST have the associated
            # header. This must never happen in production code.
            raise KeyError('%s not found' % self.auth_required_header)

        auth = self.get_auth(request)
        auth['modified'] = False
        # Put some common info in auth if the caller didn't
        if auth.get('path', None) is None:
            (protocol, _, _,
             host, port, path) = urlutils.parse_url(request.get_full_url())
            self.update_auth(auth, 'protocol', protocol)
            self.update_auth(auth, 'host', host)
            self.update_auth(auth, 'port', port)
            self.update_auth(auth, 'path', path)
        # FIXME: the auth handler should be selected at a single place instead
        # of letting all handlers try to match all headers, but the current
        # design doesn't allow a simple implementation.
        for server_header in server_headers:
            # Several schemes can be proposed by the server, try to match each
            # one in turn
            matching_handler = self.auth_match(server_header, auth)
            if matching_handler:
                # auth_match may have modified auth (by adding the
                # password or changing the realm, for example)
                if (request.get_header(self.auth_header, None) is not None
                    and not auth['modified']):
                    # We already tried that, give up
                    return None

                # Only the most secure scheme proposed by the server should be
                # used, since the handlers use 'handler_order' to describe that
                # property, the first handler tried takes precedence, the
                # others should not attempt to authenticate if the best one
                # failed.
                best_scheme = auth.get('best_scheme', None)
                if best_scheme is None:
                    # At that point, if the current handler doesn't succeed
                    # the credentials are wrong (or incomplete), but we know
                    # that the associated scheme should be used.
                    best_scheme = auth['best_scheme'] = self.scheme
                if best_scheme != self.scheme:
                    continue

                if self.requires_username and auth.get('user', None) is None:
                    # Without a known user, we can't authenticate
                    return None

                # Housekeeping
                request.connection.cleanup_pipe()
                # Retry the request with an authentication header added
                response = self.parent.open(request)
                if response:
                    self.auth_successful(request, response)
                return response
        # We are not qualified to handle the authentication.
        # Note: the authentication error handling will try all
        # available handlers. If one of them authenticates
        # successfully, a response will be returned. If none of
        # them succeeds, None will be returned and the error
        # handler will raise the 401 'Unauthorized' or the 407
        # 'Proxy Authentication Required' error.
        return None

    def add_auth_header(self, request, header):
        """Add the authentication header to the request"""
        request.add_unredirected_header(self.auth_header, header)

    def auth_match(self, header, auth):
        """Check that we are able to handle that authentication scheme.

        The request authentication parameters may need to be
        updated with info from the server. Some of these
        parameters, when combined, are considered to be the
        authentication key: if one of them changes, the
        authentication result may change. 'user' and 'password'
        are examples, but some auth schemes may have others
        (digest's nonce is an example, digest's nonce_count is a
        *counter-example*). Such parameters must be updated by
        using the update_auth() method.

        :param header: The authentication header sent by the server.
        :param auth: The auth parameters already known. They may be
             updated.
        :returns: True if we can try to handle the authentication.
        """
        raise NotImplementedError(self.auth_match)

    def build_auth_header(self, auth, request):
        """Build the value of the header used to authenticate.

        :param auth: The auth parameters needed to build the header.
        :param request: The request needing authentication.

        :return: None or header.
        """
        raise NotImplementedError(self.build_auth_header)

    def auth_successful(self, request, response):
        """The authentication was successful for the request.

        Additional info may be available in the response.

        :param request: The successfully authenticated request.
        :param response: The server response (may contain auth info).
        """
        # It may happen that we need to reconnect later, let's be ready
        self._retry_count = None

    def get_user_password(self, auth):
        """Ask user for a password if none is already available.

        :param auth: authentication info gathered so far (from the initial url
            and then during dialog with the server).
        """
        auth_conf = config.AuthenticationConfig()
        user = auth.get('user', None)
        password = auth.get('password', None)
        realm = auth['realm']
        port = auth.get('port', None)

        if user is None:
            user = auth_conf.get_user(auth['protocol'], auth['host'],
                                      port=port, path=auth['path'],
                                      realm=realm, ask=True,
                                      prompt=self.build_username_prompt(auth))
        if user is not None and password is None:
            password = auth_conf.get_password(
                auth['protocol'], auth['host'], user,
                port=port,
                path=auth['path'], realm=realm,
                prompt=self.build_password_prompt(auth))

        return user, password
2420.1.5 by Vincent Ladeuil
Refactor http and proxy authentication. Tests passing. proxy password can be prompted too.
1251
2900.2.19 by Vincent Ladeuil
Mention proxy and https in the password prompts, with tests.
1252
    def _build_password_prompt(self, auth):
        """Build a prompt taking the protocol used into account.

        The AuthHandler is used by http and https, we want that information in
        the prompt, so we build the prompt from the authentication dict which
        contains all the needed parts.

        Also, http and proxy AuthHandlers present different prompts to the
        user. The daughter classes should implement a public
        build_password_prompt using this method.
        """
        prompt = '%s' % auth['protocol'].upper() + ' %(user)s@%(host)s'
        realm = auth['realm']
        if realm is not None:
            prompt += ", Realm: '%s'" % realm
        prompt += ' password'
        return prompt

    def _build_username_prompt(self, auth):
        """Build a prompt taking the protocol used into account.

        The AuthHandler is used by http and https, we want that information in
        the prompt, so we build the prompt from the authentication dict which
        contains all the needed parts.

        Also, http and proxy AuthHandlers present different prompts to the
        user. The daughter classes should implement a public
        build_username_prompt using this method.
        """
        prompt = '%s' % auth['protocol'].upper() + ' %(host)s'
        realm = auth['realm']
        if realm is not None:
            prompt += ", Realm: '%s'" % realm
        prompt += ' username'
        return prompt

    def http_request(self, request):
        """Insert an authentication header if information is available"""
        auth = self.get_auth(request)
        if self.auth_params_reusable(auth):
            self.add_auth_header(request, self.build_auth_header(auth, request))
        return request

    https_request = http_request # FIXME: Need test
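
# Illustration only, not part of bzrlib: a minimal Python 3 sketch of how the
# prompt builders above compose a template. The helper name
# build_password_prompt_template, its proxy flag and the sample auth dict are
# hypothetical. The final "template % auth" step assumes that the %(user)s and
# %(host)s placeholders are filled in later by whatever code displays the
# prompt.

```python
def build_password_prompt_template(auth, proxy=False):
    """Hypothetical stand-in for _build_password_prompt plus the proxy prefix."""
    # Only the protocol is substituted here; %(user)s and %(host)s are left
    # in place as named placeholders.
    prompt = '%s' % auth['protocol'].upper() + ' %(user)s@%(host)s'
    realm = auth.get('realm')
    if realm is not None:
        prompt += ", Realm: '%s'" % realm
    prompt += ' password'
    if proxy:
        # Mirrors what the proxy handler does with its 'Proxy ' prefix.
        prompt = 'Proxy ' + prompt
    return prompt


# Sample values, made up for the example.
sample_auth = {'protocol': 'https', 'host': 'example.com',
               'user': 'joe', 'realm': 'Private'}
template = build_password_prompt_template(sample_auth)
rendered = template % sample_auth
proxy_rendered = build_password_prompt_template(sample_auth, proxy=True) % sample_auth
```

# With the sample dict, the template still contains %(user)s and %(host)s, and
# only the second formatting pass yields a string fit for display.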


class NegotiateAuthHandler(AbstractAuthHandler):
    """An authentication handler that handles WWW-Authenticate: Negotiate.

    At the moment this handler supports just Kerberos. In the future,
    NTLM support may also be added.
    """

    scheme = 'negotiate'
    handler_order = 480
    requires_username = False

    def auth_match(self, header, auth):
        scheme, raw_auth = self._parse_auth_header(header)
        if scheme != self.scheme:
            return False
        self.update_auth(auth, 'scheme', scheme)
        resp = self._auth_match_kerberos(auth)
        if resp is None:
            return False
        # Optionally should try to authenticate using NTLM here
        self.update_auth(auth, 'negotiate_response', resp)
        return True

    def _auth_match_kerberos(self, auth):
        """Try to create a GSSAPI response for authenticating against a host."""
        if not have_kerberos:
            return None
        ret, vc = kerberos.authGSSClientInit("HTTP@%(host)s" % auth)
        if ret < 1:
            trace.warning('Unable to create GSSAPI context for %s: %d',
                auth['host'], ret)
            return None
        ret = kerberos.authGSSClientStep(vc, "")
        if ret < 0:
            trace.mutter('authGSSClientStep failed: %d', ret)
            return None
        return kerberos.authGSSClientResponse(vc)

    def build_auth_header(self, auth, request):
        return "Negotiate %s" % auth['negotiate_response']

    def auth_params_reusable(self, auth):
        # If the auth scheme is known, it means a previous
        # authentication was successful, all information is
        # available, no further checks are needed.
        return (auth.get('scheme', None) == 'negotiate' and
                auth.get('negotiate_response', None) is not None)


class BasicAuthHandler(AbstractAuthHandler):
    """A custom basic authentication handler."""

    scheme = 'basic'
    handler_order = 500
    auth_regexp = re.compile('realm="([^"]*)"', re.I)

    def build_auth_header(self, auth, request):
        raw = '%s:%s' % (auth['user'], auth['password'])
        auth_header = 'Basic ' + raw.encode('base64').strip()
        return auth_header

    def extract_realm(self, header_value):
        match = self.auth_regexp.search(header_value)
        realm = None
        if match:
            realm = match.group(1)
        return match, realm

    def auth_match(self, header, auth):
        scheme, raw_auth = self._parse_auth_header(header)
        if scheme != self.scheme:
            return False

        match, realm = self.extract_realm(raw_auth)
        if match:
            # Put useful info into auth
            self.update_auth(auth, 'scheme', scheme)
            self.update_auth(auth, 'realm', realm)
            if (auth.get('user', None) is None
                or auth.get('password', None) is None):
                user, password = self.get_user_password(auth)
                self.update_auth(auth, 'user', user)
                self.update_auth(auth, 'password', password)
        return match is not None

    def auth_params_reusable(self, auth):
        # If the auth scheme is known, it means a previous
        # authentication was successful, all information is
        # available, no further checks are needed.
        return auth.get('scheme', None) == 'basic'
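

# Illustration only, not part of bzrlib: a rough Python 3 equivalent of the
# two core steps of BasicAuthHandler, building the header value and pulling
# the realm out of the challenge. The function names build_basic_header and
# extract_realm_sketch are hypothetical, and encoding the credentials as UTF-8
# is an assumption (the Python 2 code above works on byte strings and uses the
# 'base64' codec, which no longer exists for str in Python 3).

```python
import base64
import re

# Same pattern as BasicAuthHandler.auth_regexp above.
AUTH_REGEXP = re.compile('realm="([^"]*)"', re.I)


def build_basic_header(user, password):
    """Return the value of a Basic authorization header."""
    raw = ('%s:%s' % (user, password)).encode('utf-8')  # assumed encoding
    return 'Basic ' + base64.b64encode(raw).decode('ascii')


def extract_realm_sketch(header_value):
    """Return the realm named in a challenge, or None if there is none."""
    match = AUTH_REGEXP.search(header_value)
    if match:
        return match.group(1)
    return None


# Sample credentials and challenge, made up for the example.
basic_header = build_basic_header('Aladdin', 'open sesame')
basic_realm = extract_realm_sketch('realm="WallyWorld", charset="UTF-8"')
missing_realm = extract_realm_sketch('charset="UTF-8"')
```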


def get_digest_algorithm_impls(algorithm):
    H = None
    KD = None
    if algorithm == 'MD5':
        H = lambda x: osutils.md5(x).hexdigest()
    elif algorithm == 'SHA':
        H = lambda x: osutils.sha(x).hexdigest()
    if H is not None:
        KD = lambda secret, data: H("%s:%s" % (secret, data))
    return H, KD
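

# Illustration only, not part of bzrlib: a Python 3 sketch of the same (H, KD)
# pair built on hashlib. It assumes that osutils.md5 and osutils.sha behave
# like hashlib.md5 and hashlib.sha1, and that inputs are text encoded as
# UTF-8; the name digest_algorithm_impls_sketch is hypothetical.

```python
import hashlib


def digest_algorithm_impls_sketch(algorithm):
    """Return (H, KD) for a digest algorithm name, or (None, None)."""
    hashers = {'MD5': hashlib.md5, 'SHA': hashlib.sha1}
    hasher = hashers.get(algorithm)
    if hasher is None:
        # Unknown algorithm: the caller is expected to give up on digest.
        return None, None

    def H(data):
        return hasher(data.encode('utf-8')).hexdigest()

    def KD(secret, data):
        # Keyed digest: hash of "secret:data".
        return H('%s:%s' % (secret, data))

    return H, KD


md5_H, md5_KD = digest_algorithm_impls_sketch('MD5')
sha_H, sha_KD = digest_algorithm_impls_sketch('SHA')
unsupported = digest_algorithm_impls_sketch('SHA-256')
```

# Returning (None, None) for an unsupported name mirrors the check that
# DigestAuthHandler.auth_match performs on H before accepting a challenge.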


def get_new_cnonce(nonce, nonce_count):
    raw = '%s:%d:%s:%s' % (nonce, nonce_count, time.ctime(),
                           urllib2.randombytes(8))
    return osutils.sha(raw).hexdigest()[:16]


class DigestAuthHandler(AbstractAuthHandler):
    """A custom digest authentication handler."""

    scheme = 'digest'
    # Before basic as digest is a bit more secure and should be preferred
    handler_order = 490

    def auth_params_reusable(self, auth):
        # If the auth scheme is known, it means a previous
        # authentication was successful, all information is
        # available, no further checks are needed.
        return auth.get('scheme', None) == 'digest'

    def auth_match(self, header, auth):
        scheme, raw_auth = self._parse_auth_header(header)
        if scheme != self.scheme:
            return False

        # Put the requested authentication info into a dict
        req_auth = urllib2.parse_keqv_list(urllib2.parse_http_list(raw_auth))

        # Check that we can handle that authentication
        qop = req_auth.get('qop', None)
        if qop != 'auth': # No auth-int so far
            return False

        H, KD = get_digest_algorithm_impls(req_auth.get('algorithm', 'MD5'))
        if H is None:
            return False

        realm = req_auth.get('realm', None)
        # Put useful info into auth
        self.update_auth(auth, 'scheme', scheme)
        self.update_auth(auth, 'realm', realm)
        if auth.get('user', None) is None or auth.get('password', None) is None:
            user, password = self.get_user_password(auth)
            self.update_auth(auth, 'user', user)
            self.update_auth(auth, 'password', password)

        try:
            if req_auth.get('algorithm', None) is not None:
                self.update_auth(auth, 'algorithm', req_auth.get('algorithm'))
            nonce = req_auth['nonce']
            if auth.get('nonce', None) != nonce:
                # A new nonce, never used
                self.update_auth(auth, 'nonce_count', 0)
            self.update_auth(auth, 'nonce', nonce)
            self.update_auth(auth, 'qop', qop)
            auth['opaque'] = req_auth.get('opaque', None)
        except KeyError:
            # Some required field is not there
            return False

        return True

    def build_auth_header(self, auth, request):
        url_scheme, url_selector = urllib.splittype(request.get_selector())
        sel_host, uri = urllib.splithost(url_selector)

        A1 = '%s:%s:%s' % (auth['user'], auth['realm'], auth['password'])
        A2 = '%s:%s' % (request.get_method(), uri)

        nonce = auth['nonce']
        qop = auth['qop']

        nonce_count = auth['nonce_count'] + 1
        ncvalue = '%08x' % nonce_count
        cnonce = get_new_cnonce(nonce, nonce_count)

        H, KD = get_digest_algorithm_impls(auth.get('algorithm', 'MD5'))
        nonce_data = '%s:%s:%s:%s:%s' % (nonce, ncvalue, cnonce, qop, H(A2))
        request_digest = KD(H(A1), nonce_data)

        header = 'Digest '
        header += 'username="%s", realm="%s", nonce="%s"' % (auth['user'],
                                                             auth['realm'],
                                                             nonce)
        header += ', uri="%s"' % uri
        header += ', cnonce="%s", nc=%s' % (cnonce, ncvalue)
        header += ', qop="%s"' % qop
        header += ', response="%s"' % request_digest
        # Append the optional fields
        opaque = auth.get('opaque', None)
        if opaque:
            header += ', opaque="%s"' % opaque
        if auth.get('algorithm', None):
            header += ', algorithm="%s"' % auth.get('algorithm')

        # We have used the nonce once more, update the count
        auth['nonce_count'] = nonce_count

        return header
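

# Illustration only, not part of bzrlib: a Python 3 sketch of the qop="auth"
# response computation performed by build_auth_header above, checked against
# the worked example from RFC 2617 (section 3.5). The names md5_hex and
# digest_response are hypothetical, MD5 is hard-wired for brevity, and the
# cnonce is passed in rather than generated so that the result is repeatable.

```python
import hashlib


def md5_hex(text):
    """H() for the MD5 algorithm, on text assumed to be UTF-8 encodable."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def digest_response(user, realm, password, method, uri,
                    nonce, nonce_count, cnonce, qop='auth'):
    """Return the 'response' field of a Digest authorization header."""
    a1 = '%s:%s:%s' % (user, realm, password)
    a2 = '%s:%s' % (method, uri)
    ncvalue = '%08x' % nonce_count
    nonce_data = '%s:%s:%s:%s:%s' % (nonce, ncvalue, cnonce, qop, md5_hex(a2))
    # KD(secret, data) is H("secret:data") with secret = H(A1).
    return md5_hex('%s:%s' % (md5_hex(a1), nonce_data))


# Values taken from the RFC 2617 example exchange.
rfc_response = digest_response(
    'Mufasa', 'testrealm@host.com', 'Circle Of Life',
    'GET', '/dir/index.html',
    'dcd98b7102dd2f0e8b11d0f600bfb0c093', 1, '0a4f113b')
```

# The nonce count is rendered as eight hex digits ('00000001' here), which is
# why the handler above increments and stores auth['nonce_count'] each time a
# header is built for the same nonce.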


class HTTPAuthHandler(AbstractAuthHandler):
    """Custom http authentication handler.

    Send the authentication preemptively to avoid the roundtrip
    associated with the 401 error and keep the relevant info in
    the auth request attribute.
    """

    auth_required_header = 'www-authenticate'
    auth_header = 'Authorization'

    def get_auth(self, request):
        """Get the auth params from the request"""
        return request.auth

    def set_auth(self, request, auth):
        """Set the auth params for the request"""
        request.auth = auth

    def build_password_prompt(self, auth):
        return self._build_password_prompt(auth)

    def build_username_prompt(self, auth):
        return self._build_username_prompt(auth)

    def http_error_401(self, req, fp, code, msg, headers):
        return self.auth_required(req, headers)


class ProxyAuthHandler(AbstractAuthHandler):
    """Custom proxy authentication handler.

    Send the authentication preemptively to avoid the roundtrip
    associated with the 407 error and keep the relevant info in
    the proxy_auth request attribute.
    """

    auth_required_header = 'proxy-authenticate'
    # FIXME: the correct capitalization is Proxy-Authorization,
    # but python-2.4 urllib2.Request insists on using capitalize()
    # instead of title().
    auth_header = 'Proxy-authorization'

    def get_auth(self, request):
        """Get the auth params from the request"""
        return request.proxy_auth

    def set_auth(self, request, auth):
        """Set the auth params for the request"""
        request.proxy_auth = auth

    def build_password_prompt(self, auth):
        prompt = self._build_password_prompt(auth)
        prompt = 'Proxy ' + prompt
        return prompt

    def build_username_prompt(self, auth):
        prompt = self._build_username_prompt(auth)
        prompt = 'Proxy ' + prompt
        return prompt

    def http_error_407(self, req, fp, code, msg, headers):
        return self.auth_required(req, headers)


class HTTPBasicAuthHandler(BasicAuthHandler, HTTPAuthHandler):
    """Custom http basic authentication handler"""


class ProxyBasicAuthHandler(BasicAuthHandler, ProxyAuthHandler):
    """Custom proxy basic authentication handler"""


class HTTPDigestAuthHandler(DigestAuthHandler, HTTPAuthHandler):
    """Custom http digest authentication handler"""


class ProxyDigestAuthHandler(DigestAuthHandler, ProxyAuthHandler):
    """Custom proxy digest authentication handler"""


class HTTPNegotiateAuthHandler(NegotiateAuthHandler, HTTPAuthHandler):
    """Custom http negotiate authentication handler"""


class ProxyNegotiateAuthHandler(NegotiateAuthHandler, ProxyAuthHandler):
    """Custom proxy negotiate authentication handler"""


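# Illustration only, not part of bzrlib: a small sketch of the composition
# used by the six classes above, where one base supplies the scheme and its
# handler_order and the other supplies the header to send. All *Sketch class
# names are hypothetical stand-ins; sorting by handler_order is meant to
# reflect how urllib2 orders its handlers (lower values first), which is an
# assumption about the library rather than something this file states.

```python
class AuthHandlerSketch(object):
    scheme = None
    auth_header = None
    handler_order = 500


class BasicSketch(AuthHandlerSketch):
    scheme = 'basic'
    handler_order = 500


class DigestSketch(AuthHandlerSketch):
    scheme = 'digest'
    handler_order = 490


class NegotiateSketch(AuthHandlerSketch):
    scheme = 'negotiate'
    handler_order = 480


class HTTPSketch(AuthHandlerSketch):
    auth_header = 'Authorization'


class ProxySketch(AuthHandlerSketch):
    auth_header = 'Proxy-authorization'


# Scheme behaviour first, transport behaviour second, as in the real classes.
class HTTPBasicSketch(BasicSketch, HTTPSketch):
    pass


class HTTPDigestSketch(DigestSketch, HTTPSketch):
    pass


class HTTPNegotiateSketch(NegotiateSketch, HTTPSketch):
    pass


class ProxyDigestSketch(DigestSketch, ProxySketch):
    pass


handlers = [HTTPBasicSketch(), HTTPDigestSketch(), HTTPNegotiateSketch()]
tried_in_order = [h.scheme
                  for h in sorted(handlers, key=lambda h: h.handler_order)]
proxy_digest = ProxyDigestSketch()
```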
class HTTPErrorProcessor(urllib2.HTTPErrorProcessor):
    """Process HTTP error responses.

    We don't really process the errors, quite the contrary:
    we let our Transport handle them.
    """

    accepted_errors = [200, # Ok
                       206, # Partial content
                       404, # Not found
                       ]
    """The error codes the caller will handle.

    This can be specialized in the request on a case-by-case basis, but the
    common cases are covered here.
    """

    def http_response(self, request, response):
        code, msg, hdrs = response.code, response.msg, response.info()

        accepted_errors = request.accepted_errors
        if accepted_errors is None:
            accepted_errors = self.accepted_errors

        if code not in accepted_errors:
            response = self.parent.error('http', request, response,
                                         code, msg, hdrs)
        return response

    https_response = http_response
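

# Illustration only, not part of bzrlib: a minimal sketch of the decision made
# in http_response above. A per-request list, when present, replaces the
# class-level default, and only codes outside the chosen list are routed to
# the error handlers. The names DEFAULT_ACCEPTED_ERRORS and is_passed_through
# are hypothetical.

```python
# Same codes as HTTPErrorProcessor.accepted_errors above.
DEFAULT_ACCEPTED_ERRORS = [200, 206, 404]


def is_passed_through(code, request_accepted_errors=None):
    """Return True if the response would reach the caller unchanged."""
    accepted = request_accepted_errors
    if accepted is None:
        # No per-request override: fall back to the shared default.
        accepted = DEFAULT_ACCEPTED_ERRORS
    return code in accepted


default_404 = is_passed_through(404)
default_403 = is_passed_through(403)
override_400 = is_passed_through(400, [200, 400])
override_404 = is_passed_through(404, [200, 400])
```

# Note that an override replaces the default list rather than extending it, so
# a request that accepts 400 no longer accepts 404 unless it lists it too.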


class HTTPDefaultErrorHandler(urllib2.HTTPDefaultErrorHandler):
    """Translate common errors into bzr Exceptions"""

    def http_error_default(self, req, fp, code, msg, hdrs):
        if code == 403:
            raise errors.TransportError(
                'Server refuses to fulfill the request (403 Forbidden)'
                ' for %s' % req.get_full_url())
        else:
            raise errors.InvalidHttpResponse(req.get_full_url(),
                                             'Unable to handle http code %d: %s'
                                             % (code, msg))


class Opener(object):
    """A wrapper around urllib2.build_opener

    Daughter classes can override to build their own specific opener
    """
    # TODO: Provide hooks for daughter classes.

    def __init__(self,
                 connection=ConnectionHandler,
                 redirect=HTTPRedirectHandler,
                 error=HTTPErrorProcessor,
                 report_activity=None):
        self._opener = urllib2.build_opener(
            connection(report_activity=report_activity),
            redirect, error,
            ProxyHandler(),
            HTTPBasicAuthHandler(),
            HTTPDigestAuthHandler(),
            HTTPNegotiateAuthHandler(),
            ProxyBasicAuthHandler(),
            ProxyDigestAuthHandler(),
            ProxyNegotiateAuthHandler(),
            HTTPHandler,
            HTTPSHandler,
            HTTPDefaultErrorHandler,
            )

        self.open = self._opener.open
        if DEBUG >= 9:
            # When dealing with handler order, it's easy to mess
            # things up, the following will help understand which
            # handler is used, when and for what.
            import pprint
            pprint.pprint(self._opener.__dict__)