6
by mbp at sourcefrog
import all docs from arch |
*****************************
Security aspects of Bazaar-NG
*****************************

* Good security is required.

* Usability is required for good security.

    Being too strict "because it's the secure way" just means that people will
    disable you altogether, or start doing things that they know is wrong,
    because the right way of doing this may be secure, but [..] also very
    inconvenient.

    -- Linus Torvalds

.. contents::

Requirements
============

David Wheeler gives some good requirements__:

    Problem is, the people who develop SCM tools often don't think
    about what kind of security requirements they need to support.
    This mini-paper describes briefly the kinds of security
    requirements an SCM tool should support.

__ http://www.dwheeler.com/essays/scm-security.html

confidentiality_
    Are only those who should be able to read information able to do so?

integrity
    Are only those who should be able to write/change information able
    to do so?  This includes not only limiting access rights for
    writing, but also protecting against repository corruption.

availability
    Is the system available to those who need it?  (I.E., is it
    resistant to denial-of-service attacks?)

identification/authentication
    Does the system safely authenticate its users?  If it uses tokens
    (like passwords), are they protected when stored and while being
    sent over a network, or are they exposed as cleartext?

audit
    Are actions recorded?

non-repudiation
    Can the system "prove" that a certain user/key did an action later?

self-protection
    Does the system protect itself, and can its own
    data (like timestamps) be trusted?

trusted paths
    Can the system make sure that its communication with users is
    protected?


Attacker categories
-------------------

* Unprivileged outsiders.

  (Almost always read-only, but people might want to allow them to
  write in some cases, e.g. for wikis.)

* Non-malicious developers with privilege.

* Malicious developers with privilege.

* Attackers who have stolen a privileged developer's identity.


Access control
--------------

Dan Nicolaescu gives these examples of access control:

  - security related code that is still embargoed, only select few
    are allowed to see it, it is not desirable to release this
    information to the public because a fix is still being worked
    on.  It would be nice to be able to have this kind of code under
    the same version control system used for normal development for
    ease of use and easy merging, yet it is crucial to restrict access
    to branches, files or directories to certain people.

  - feature freeze before a release.  It would be good if the release
    manager could disable writing to the release branch, so that the
    last tests are run, and not have someone commit stuff by mistake.

  - documentation/translation writers don't need write access to the
    whole source code, just to the documentation directories.

  - For proprietary companies restricting access is even more
    important, for example only some engineers should access the
    latest development version of some code in order to keep some
    trade secrets, etc, etc.

In Bazaar-NG, the basic unit of access control is the branch.  If
people are not supposed to read a branch, or know of its existence,
put it somewhere where they can't see it.  If people are allowed to
read from but not write to a branch then set those permissions.  The
code can later be merged into a public branch if desired with no loss
of function.

We largely rely on lower-level security measures controlling who can
get read or write access to a branch.  If you have a branch that
should be confidential, then put it on an appropriately-secured
machine, with only people in a particular group allowed to read it.

Not having separate repositories is probably a feature here -- unlike
Subversion, no features depend on having branches be in the same
repository.  Each repository can have different group ownership.
(The directories should usually be setgid.)  It also makes it easier
to see just what the access control is; there is only one object that
can meaningfully have an ACL.
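
As a rough illustration of the group-ownership approach described above,
the following sketch restricts a single branch directory to one group and
sets the setgid bit so that new files inherit that group.  The function
name ``restrict_branch_dir`` and its parameters are hypothetical, not part
of Bazaar-NG; it only touches the top-level directory, and a real setup
would presumably need to walk the whole tree.

```python
import os
import stat
import tempfile


def restrict_branch_dir(path, gid=None, group_writable=False):
    """Close a branch directory to "others" and mark it setgid.

    Hypothetical helper: `gid` is the numeric id of the group that may
    read the branch; if None, the directory's current group is kept.
    """
    if gid is not None:
        os.chown(path, -1, gid)  # -1 leaves the owning user unchanged
    # Owner: rwx.  Group: r-x (plus w if requested).  Others: nothing.
    mode = stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP | stat.S_ISGID
    if group_writable:
        mode |= stat.S_IWGRP
    os.chmod(path, mode)
    # Report the permission bits the filesystem actually stored.
    return stat.S_IMODE(os.stat(path).st_mode)


# Example on a throwaway directory standing in for a branch.
branch_dir = tempfile.mkdtemp()
mode = restrict_branch_dir(branch_dir)
```

Some platforms silently drop the setgid bit when the caller is not a
member of the directory's group, so it is worth checking the returned
mode rather than assuming the request succeeded.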

The existence of a secret branch can be fairly well hidden from the
world.  When its changes are merged in, all that is visible is the
name, date, and branch name of the commit, not anything about the
location of the source branch.

I would handle the documentation case by having a separate
documentation branch, which could perhaps be checked out into a
subdirectory when it is required.  I think this is fairly common for
larger projects even in CVS.


Confidentiality
---------------

As dwheeler points out, this can be important even for open source
projects, such as when preparing a security patch.
Mechanisms that send email should have an option to encrypt the mail.

I can't think of anywhere encrypted archives would be useful.  If you
want to store it on an encrypted filesystem you can.  If you want to
store encrypted files you can do that too, though that will leak some
information in the metadata and branch structure.


Security in distributed systems
-------------------------------

If I have a branch on my laptop, the software ultimately cannot
prevent me doing anything to that branch -- physical access trumps
software controls.  We can, at most, try to prevent non-malicious
mistakes.

The purpose of the software here is to protect other people, whose
machines I do not control.  In particular, it should be hard for me to
lie to them; the software should detect any false statements.

Specifically, these should be prevented:

* Claiming to be someone else.

* Attempting to rewrite history.


Revocation
----------

Suppose Alice's code-signing key is stolen by an attacker Charles.
Charles can sign changesets purporting to come from Alice.

Alice needs to revoke that key; hopefully she has saved a copy of the
key elsewhere and can use that to revoke it.  The revocation can then
propagate through the usual GPG mechanism, which is very nice.
Failing that, she can mail everyone and ask them to delete the key.

Alice also needs to make a new key and get it trusted.

This revocation does not distinguish between changesets genuinely
signed by Alice in the past, and changesets fraudulently signed by
Charles.

What can Alice do now?  First of all, she needs to work out what
changesets signed by her key can still be trusted.  One good way to do
this is to check against another branch signed by Bob.  If Bob's key
is safe, we know his copy of Alice's changesets is OK and the full
tree at various points is OK.

Then:

* Go through her old changesets, check that they're OK -- perhaps
  restore from a trusted backup.  Re-sign those changesets with a new
  key bound to the same email address.  Publish the new signatures
  instead.

  (This seems to indicate it is a good idea to bind signatures to
  changesets by author name/address rather than by key ID.)

* Roll up all previous development into a new tree, then sign that.
  This means there is no safe access to the previous individual
  changes, but in some cases it may be OK.

If a key is revoked at a particular time then perhaps we could still
trust commits made before that time.  I don't know if GPG revocations
can support that.


Old keys
--------

Keys can also expire, rather than being revoked.  What does this mean?

Ideally we would check that the date when a changeset claims to have
been signed is within the validity period of the key.  This requires
more GPG integration than may at the moment be possible, but in theory
we can do it.

We also need to make sure that commits are in order by date, or at least
reasonably close to being in order (to allow for some clock skew).
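
The two date checks above could look roughly like the following sketch.
It assumes the claimed signing time, the key's creation and expiry
times, and the commit times are already available as ``datetime``
objects; how to get them out of GPG is left open here.  The function
names and the five-minute default skew are hypothetical choices for
illustration only.

```python
from datetime import datetime, timedelta


def signed_while_key_valid(signed_at, key_created, key_expires=None):
    """True if the claimed signing time lies in the key's validity period.

    `key_expires` of None means the key has no expiry date.
    """
    if signed_at < key_created:
        return False
    return key_expires is None or signed_at <= key_expires


def commits_in_order(commit_times, max_skew=timedelta(minutes=5)):
    """True if no commit is dated more than `max_skew` before its predecessor."""
    return all(
        later >= earlier - max_skew
        for earlier, later in zip(commit_times, commit_times[1:])
    )


# Example: a key valid for one year from t0.
t0 = datetime(2005, 1, 1)
expiry = t0 + timedelta(days=365)
inside = signed_while_key_valid(t0 + timedelta(days=30), t0, expiry)
outside = signed_while_key_valid(t0 + timedelta(days=400), t0, expiry)
```

Note that the signing time is only what the changeset claims, so this
check guards against mistakes and stale keys rather than against a
signer who deliberately backdates.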

One interesting case is when a version has been committed for which
both the public and private keys have been lost.  This will always be
untrusted, but that should not prevent people continuing to use the
archive if they can accept that.

This suggests that perhaps we should allow for multiple signatures on
a single revision.


Encumbrance attacks
-------------------

This is a special case where we need to be able to destroy history to
avoid a legal problem.  It is allowed, as discussed elsewhere: either
destroy commits from the tail backwards, or equivalently branch from a
previous revision and replace with that.

People who saw the original branch can still prove it happened; people
who look in the future will not see any record.

Either way, this probably requires physical branch access.


Multiple signature keys
-----------------------

Should we allow for several signatures on a single changeset?  What
would that mean?  How do we know what signatures are meaningful or
worthwhile?


Forensics
---------

dwheeler:

    [O]nce you find out who did a malicious act, the SCM should make it
    easy to identify all of their actions.  In short, if you make it
    easy to catch someone, you increase the attackers' risk... and that
    means the attacker is less likely to do it.

dwheeler asks that the committer's IP address be recorded.  Putting
this in the changeset seems to cause too much of a
privacy/confidentiality problem.  However, an active server might
reasonably record the IPs of all clients.


Non-repudiation
---------------

If a changeset has propagated to Bob, signed by Alice's key, then Bob
can prove that someone possessing Alice's key signed it.  Alice's only
way out is to claim her key was stolen.


Trusted review
--------------

This can be handled by importing onto another branch.  There can be
various levels for "quickly checked", "deeply trusted", etc.

(Is it really necessary to import onto a new branch rather than add
annotations to existing branches?  Copying the whole text seems a bit
redundant.  This might be a nice place for arch-style taggings, where
we just add a reference to another branch.)


Hooks
-----

Automatically running hooks downloaded from someone else is
dangerous.  In particular, the user may not have the chance to check
the hooks are reasonable before they are run.

Conversely, users can subvert client-side hooks.  If we want to run a
check before accepting code onto a shared branch, that must run on the
server.

Enforcing server-side checks gives a good way to run build,
formatting, suspiciousness checks, etc.  This implies that write
access to a repository is through a mediating daemon rather than by
directly writing.
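
The mediating step described above might be sketched as follows.  This
is only an illustration: the changeset is modelled as a plain
dictionary with hypothetical ``message`` and ``files`` keys, and the
two checks are placeholders for whatever build, formatting, or
suspiciousness checks a project actually wants.  None of these names
come from Bazaar-NG.

```python
def accept_changeset(changeset, checks):
    """Run every server-side check; accept only if all of them pass.

    Returns (accepted, names_of_failed_checks).
    """
    failures = [name for name, check in checks if not check(changeset)]
    return (not failures, failures)


def no_tabs(changeset):
    """Placeholder formatting check: reject files containing tab characters."""
    return all("\t" not in text for text in changeset["files"].values())


def has_message(changeset):
    """Placeholder check: require a non-empty commit message."""
    return bool(changeset["message"].strip())


# The server, not the client, decides which checks run.
CHECKS = [("formatting", no_tabs), ("message", has_message)]

good = {"message": "Fix typo", "files": {"README": "hello\n"}}
bad = {"message": "  ", "files": {"README": "\thello\n"}}
```

Because the checks run inside the process that performs the write, a
client cannot skip them the way it could skip a client-side hook.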



Signing
-------

We use signing to prove that a particular person (or 'principal',
possibly a robot) committed a particular changeset.

It is the job of external signing software to help work out whether
this is true or not.  This has several parts:

* Mathematical verification that a signature on a particular
  changeset header document is correct.

* Determining that the signature corresponds to a particular public
  key.

* Determining that the public key corresponds to the person claimed
  to have authored the changeset (identified by email address).

The last two are really PKI functions, and somewhat harder than the
first.

The canonical implementation is to use GPG/OpenPGP, but anything will
do.  There are simpler RSA/DSA implementations which assume each user
manually builds a list of trusted keys.

This leaves open the question of which people should be trusted to
provide software on a particular branch or at all.  This is not a very
easy question for software to answer.  We assume that people will know
by other means.  For public code, it may be that all changesets are
re-signed by say samba-team@samba.org.

I think it is fair to distinguish people by an email address, or at
least by $ID@$DOMAIN.  There is no need to have this actually receive
email, so spam need not be a problem.

The signing design is inspired by the very usable security afforded by
OpenSSH: it automatically protects where it can, and allows higher
security to users who want to do some work (by offline verification of
signatures).

Using a signing mechanism other than GPG when key developers already
have GPG and there is a big infrastructure to support it seems
undesirable.  It is true that GPG is quite complex.

The purpose of signing is to protect against unauthorized modification
of archives.

Bazaar-NG can apply a GPG signature to both patches and manifests.  This
allows a later proof that the revision and the changeset were produced
by the author they claim to have been written by.

We cannot cryptographically prove that a particular patch was merged
into a branch, because the person doing the merge might have subverted
the patch in the process of merging it.  All we can prove
cryptographically is that the merge committer asserts they took the
patch.

GPGME and PyMe seem to give a reasonable interface for doing this:
there is a function to check a signature, and the return indicates the
signing name, with possible errors including a missing key, etc.
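
The three parts of verification listed earlier in this section can be
sketched as below, assuming the manually built list of trusted keys
mentioned above.  The mathematical check itself is left to the external
signing software; ``SignatureResult`` is a hypothetical stand-in for
whatever that software returns and is not the GPGME or PyMe API.

```python
from collections import namedtuple

# Hypothetical result of an external verifier: whether the signature is
# mathematically valid, and the fingerprint of the key that made it.
SignatureResult = namedtuple("SignatureResult", "valid fingerprint")


def check_author(result, claimed_email, trusted_keys):
    """Decide whether a changeset really comes from `claimed_email`.

    `trusted_keys` maps key fingerprints to the email address each key
    is trusted to sign for.  Returns (ok, reason).
    """
    # Part 1: the signature must verify at all.
    if not result.valid:
        return False, "bad signature"
    # Part 2: the signature must correspond to a key we know.
    if result.fingerprint not in trusted_keys:
        return False, "missing key"
    # Part 3: that key must belong to the claimed author.
    if trusted_keys[result.fingerprint] != claimed_email:
        return False, "key does not belong to claimed author"
    return True, "ok"


# Example trust list with a made-up fingerprint.
trusted = {"ABCD1234": "alice@example.org"}
verdict = check_author(
    SignatureResult(valid=True, fingerprint="ABCD1234"),
    "alice@example.org",
    trusted,
)
```

Keeping the trust list keyed by fingerprint but compared by email
address fits the earlier suggestion that a re-issued key bound to the
same address should be able to take over from a revoked one.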


Sign branches, not revisions
''''''''''''''''''''''''''''

Aaron Bentley suggested the interesting idea of signing the mapping of
revisions onto branches, rather than revisions themselves.  For
example a branch could contain just a signed pointer to the most
recent revision.

(It probably is useful to be able to check signatures on previous
revisions, for example when recovering from an intrusion.)


Protocol attacks
----------------

Both client and server should be resistant to malicious changesets,
network requests, etc.  There's no easy solution.

* Defense in depth.  Check reasonableness at various points.

* Disallow changesets that try to change files outside of the branch.
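
One way to implement the second point is to resolve every path named
in a changeset against the branch root and reject anything that ends
up elsewhere.  The sketch below assumes changesets name files by
relative path; ``is_inside_branch`` is a hypothetical helper, not an
existing Bazaar-NG function.

```python
import os


def is_inside_branch(branch_root, relative_path):
    """True if `relative_path`, resolved against `branch_root`, stays inside it.

    Both paths are normalised with realpath, so ".." components and
    existing symlinks are followed before the comparison.
    """
    root = os.path.realpath(branch_root)
    # os.path.join discards `root` when `relative_path` is absolute,
    # which is what we want: an absolute path then fails the check.
    target = os.path.realpath(os.path.join(root, relative_path))
    return os.path.commonpath([root, target]) == root


# Example with a made-up branch location.
ROOT = "/srv/project/branch"
ok = is_inside_branch(ROOT, "src/main.py")
escaped = is_inside_branch(ROOT, "../other/file")
```

A check like this only reflects the filesystem at the moment it runs,
so in keeping with defense in depth it would be repeated at the point
where files are actually written.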


Availability
------------

bzr can be configured so as to have no single point of failure to a
denial-of-service attack (or at least nearly none):

* A branch can have any number of mirrors.

* If a central server is taken out, developers can continue working
  with state they already have (unbind their branches), and can
  collaborate by email or other means until the server is repaired or
  replaced.

* The origin branch can be on a machine whose location is secret and
  which is not directly publicly accessible.

* Branches can be moved between machines or IP addresses without
  disrupting anything else.

* Branches can be moved around out-of-band, as tarballs over
  bittorrent, etc.

I think the only possible denial of service attacks are those that aim
to shut down the entire network, or block communication with
individual developers, for example by flooding their email address.
But if those people can get connected through some other means, they
can continue.