
Problem is, the people who develop SCM tools often don't think about what kind of security requirements they need to support. This mini-paper describes briefly the kinds of security requirements an SCM tool should support.

__ http://www.dwheeler.com/essays/scm-security.html

confidentiality_
    Are only those who should be able to read information able to do so?

integrity
    Are only those who should be able to write/change information able to do so? This includes not only limiting access rights for writing, but also protecting against repository corruption.

availability
    Is the system available to those who need it? (I.E., is it resistant to denial-of-service attacks?)

identification/authentication
    Does the system safely authenticate its users? If it uses tokens (like passwords), are they protected when stored and while being sent over a network, or are they exposed as cleartext?

audit
    Are actions recorded?

non-repudiation
    Can the system "prove" that a certain user/key did an action later?

self-protection
    Does the system protect itself, and can its own data (like timestamps) be trusted?

trusted paths
    Can the system make sure that its communication with users is protected?

Attacker categories
-------------------

* Unprivileged outsiders.
  (Almost always read-only, but people might want to allow them to
  write in some cases, e.g. for wikis.)

* Non-malicious developers with privilege.

* Malicious developers with privilege.

* Attackers who have stolen a privileged developer's identity.

Access control
--------------

Dan Nicolaescu gives these examples of access control:

- security related code that is still embargoed, only select few
  are allowed to see it, it is not desirable to release this
  information to the public because a fix is still being worked
  on. It would be nice to be able to have this kind of code under
  the same version control system used for normal development for
  ease of use and easy merging, yet it is crucial to restrict access
  to branches, files or directories to certain people.

- feature freeze before a release. It would be good if the release
  manager could disable writing to the release branch, so that the
  last tests are run, and not have someone commit stuff by mistake.

- documentation/translation writers don't need write access to the
  whole source code, just to the documentation directories.

- For proprietary companies restricting access is even more
  important, for example only some engineers should access the
  latest development version of some code in order to keep some
  trade secrets, etc, etc.

In Bazaar-NG, the basic unit of access control is the branch. If
people are not supposed to read a branch, or know of its existence,
put it somewhere where they can't see it. If people are allowed to
read from but not write to a branch then set those permissions. The
code can later be merged into a public branch if desired with no loss
of function.

We largely rely on lower-level security measures controlling who can
get read or write access to a branch. If you have a branch that
should be confidential, then put it on an appropriately-secured
machine, with only people in a particular group allowed to read it.

Not having separate repositories is probably a feature here -- unlike
Subversion, no features depend on having branches be in the same
repository. Each repository can have different group ownership.
(The directories should usually be setgid.) It also makes it easier
to see just what the access control is; there is only one object that
can meaningfully have an ACL.
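
As a minimal sketch of the group-ownership approach described above,
assuming a POSIX filesystem, the following Python computes and applies
the mode bits for a branch directory readable by a single group. The
names ``branch_dir_mode`` and ``secure_branch_dir`` are hypothetical
and are not part of Bazaar-NG; assigning the owning group is assumed to
be done separately by an administrator.

```python
import os
import stat


def branch_dir_mode(group_writable: bool = False) -> int:
    """Return permission bits for a branch directory shared within one group.

    The owner gets full access, the group gets read and traverse access
    (plus write if requested), and everyone else gets nothing. The setgid
    bit makes new files inherit the directory's group.
    """
    mode = stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP | stat.S_ISGID
    if group_writable:
        mode |= stat.S_IWGRP
    return mode


def secure_branch_dir(path: str, group_writable: bool = False) -> None:
    """Apply branch_dir_mode() to an existing directory.

    Hypothetical helper: it only sets mode bits and does not change
    the owning group.
    """
    os.chmod(path, branch_dir_mode(group_writable))
```

Keeping the mode calculation separate from the ``os.chmod`` call makes
the intended policy easy to inspect. Some systems silently drop the
setgid bit when the caller is not a member of the directory's group, so
the resulting mode is worth checking after the call.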

The existence of a secret branch can be fairly well hidden from the
world. When its changes are merged in, all that is visible is the
name, date, and branch name of the commit, not anything about the
location of the source branch.

The documentation case I would handle by having a separate
documentation branch, which could perhaps be checked out into a
subdirectory when it is required. I think this is fairly common for
larger projects even in CVS.

Confidentiality
---------------

As dwheeler points out, this can be important even for open source
projects, such as when preparing a security patch.
Mechanisms that send email should have an option to encrypt the mail.

I can't think of anywhere encrypted archives would be useful. If you
want to store it on an encrypted filesystem you can. If you want to
store encrypted files you can do that too, though that will leak some
information in the metadata and branch structure.

Security in distributed systems
-------------------------------

If I have a branch on my laptop, the software ultimately cannot
prevent me doing anything to that branch -- physical access trumps
software controls. We can, at most, try to prevent non-malicious
mistakes.

The purpose of the software here is to protect other people, whose
machines I do not control. In particular, it should be hard for me to
lie to them; the software should detect any false statements.

In particular, these should be prevented:

* Claiming to be someone else.

* Attempting to rewrite history.

Revocation
----------

Suppose Alice's code-signing key is stolen by an attacker Charles.
Charles can sign changesets purporting to come from Alice.

Alice needs to revoke that key; hopefully she has saved a copy of the
key elsewhere and can use that to revoke it. Failing that, she can
mail everyone and ask them to delete it. This can propagate through
the usual GPG mechanism, which is very nice.

Alice also needs to make a new key and get it trusted.

This revocation does not distinguish between changesets genuinely
signed by Alice in the past, and changesets fraudulently signed by
Charles.

What can Alice do now? First of all, she needs to work out what
changesets signed by her key can still be trusted. One good way to do
this is to check against another branch signed by Bob. If Bob's key
is safe, we know his copy of Alice's changesets is OK and the full
tree at various points is OK.

Then:

* Go through her old changesets, check that they're OK -- perhaps
  restore from a trusted backup. Re-sign those changesets with a new
  key bound to the same email address. Publish the new signatures
  instead.

  (This seems to indicate it is a good idea to bind signatures to
  changesets by author name/address rather than by key ID.)

* Roll-up all previous development into a new tree, then sign that.
  This means there is no safe access to the previous individual
  changes, but in some cases it may be OK.

If a key is revoked at a particular time then perhaps we could still
trust commits made before that time. I don't know if GPG revocations
can support that.

Old keys
--------

Keys can also expire, rather than being revoked. What does this mean?

Ideally we would check that the date when a changeset claims to have
been signed is within the validity period of the key. This requires
more GPG integration than may at the moment be possible, but in theory
we can do it.

We also need to make sure that commits are in order by date, or at least
reasonably close to being in order (to allow for some clock skew).
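
The two date checks described above might look something like the
following Python sketch. It assumes the key's creation and expiry
dates have already been obtained from the signing software; the
function names and the ten-minute default skew are illustrative
assumptions, not anything Bazaar-NG defines. Note that the claimed
signing date comes from the signer, so it cannot be trusted on its own.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def signed_within_validity(signed_at: datetime,
                           key_created: datetime,
                           key_expires: Optional[datetime]) -> bool:
    """True if the claimed signing date falls inside the key's validity period.

    A key with no expiry date (None) is treated as valid indefinitely.
    """
    if signed_at < key_created:
        return False
    return key_expires is None or signed_at <= key_expires


def dates_roughly_ordered(commit_dates: Sequence[datetime],
                          max_skew: timedelta = timedelta(minutes=10)) -> bool:
    """True if no commit is dated more than max_skew before an earlier one.

    commit_dates must be given in history order, oldest commit first.
    """
    latest = None
    for date in commit_dates:
        if latest is not None and date < latest - max_skew:
            return False
        if latest is None or date > latest:
            latest = date
    return True
```

Comparing each date against the latest date seen so far, rather than
only against its immediate predecessor, stops a run of small backward
steps from adding up to a large one.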

One interesting case is when a version is committed for which both the
public and private keys have been lost. This will always be
untrusted, but that should not prevent people continuing to use the
archive if they can accept that.

This suggests that perhaps we should allow for multiple signatures on
a single revision.

Encumbrance attacks
-------------------

This is a special case where we need to be able to destroy history to
avoid a legal problem. It is allowed as discussed elsewhere: either
destroy commits from the tail backwards, or equivalently branch from a
previous revision and replace with that.

People who saw the original branch can still prove it happened; people
who look in the future will not see any record.

Either way, this probably requires physical branch access.

Multiple signature keys
-----------------------

Should we allow for several signatures on a single changeset? What
would that mean? How do we know what signatures are meaningful or
worthwhile?

Forensics
---------

dwheeler:

    [O]nce you find out who did a malicious act, the SCM should make it
    easy to identify all of their actions. In short, if you make it
    easy to catch someone, you increase the attackers' risk... and that
    means the attacker is less likely to do it.

dwheeler asks that the committer's IP address be recorded. Putting
this in the changeset seems to cause too much of a
privacy/confidentiality problem. However, an active server might
reasonably record the IPs of all clients.

Non-repudiation
---------------

If a changeset has propagated to Bob, signed by Alice's key, then Bob
can prove that someone possessing Alice's key signed it. Alice's only
way out is to claim her key was stolen.

Trusted review
--------------

This can be handled by importing onto another branch. There can be
various levels for "quickly checked", "deeply trusted", etc.

(Is it really necessary to import onto a new branch rather than add
annotations to existing branches? Copying the whole text seems a bit
redundant. This might be a nice place for arch-style taggings, where
we just add a reference to another branch.)

Hooks
-----

Automatically running hooks downloaded from someone else is
dangerous. In particular, the user may not have the chance to check
the hooks are reasonable before they are run.

Conversely, users can subvert client-side hooks. If we want to run a
check before accepting code onto a shared branch, that must run on the
server.

Enforcing server-side checks gives a good way to run build,
formatting, suspiciousness checks, etc. This implies that write
access to a repository is through a mediating daemon rather than by
directly writing.
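
To illustrate the idea of a mediating daemon, here is a minimal
Python sketch of the gate it might apply before accepting a
changeset. Everything in it is hypothetical: the changeset is modelled
as a plain dictionary, and ``require_message``, ``forbid_tabs`` and
``run_server_checks`` are invented names standing in for whatever
build, formatting or suspiciousness checks a project chooses.

```python
from typing import Callable, Dict, List, Optional

# A check inspects an incoming changeset and returns an error message,
# or None if it has no objection.
Check = Callable[[Dict[str, str]], Optional[str]]


def require_message(changeset: Dict[str, str]) -> Optional[str]:
    """Reject changesets whose commit message is missing or blank."""
    if not changeset.get("message", "").strip():
        return "empty commit message"
    return None


def forbid_tabs(changeset: Dict[str, str]) -> Optional[str]:
    """Example formatting check: reject text containing tab characters."""
    if "\t" in changeset.get("text", ""):
        return "tab characters in text"
    return None


def run_server_checks(changeset: Dict[str, str],
                      checks: List[Check]) -> List[str]:
    """Run every check on the server; an empty list means 'accept'."""
    errors = []
    for check in checks:
        error = check(changeset)
        if error is not None:
            errors.append(error)
    return errors
```

Because the checks run inside the server process, a client cannot skip
them by editing its local configuration, which is the property the
text above asks for.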

Signing
-------

We use signing to prove that a particular person (or 'principal',
possibly a robot) committed a particular changeset.

It is the job of external signing software to help work out whether
this is true or not. This has several parts:

* Mathematical verification that a signature on a particular
  changeset header document is correct

* Determining that the signature corresponds to a particular public
  key

* Determining that the public key corresponds to the person claimed
  to have authored the changeset (identified by email address).

The last two are really PKI functions, and somewhat harder than the
first.

The canonical implementation is to use GPG/OpenPGP, but anything will
do. There are simpler RSA/DSA implementations which assume each user
manually builds a list of trusted keys.
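
A manually built list of trusted keys could be as simple as the
following Python sketch, which covers only the last step above:
deciding whether a key is accepted for a claimed author. The
mathematical verification is assumed to have been done already by
external signing software. The ``TrustList`` type, the function names,
and the idea of storing fingerprints as strings are all assumptions
made for illustration.

```python
from typing import Dict, Set

# Hypothetical, manually maintained trust list:
# author email address -> set of accepted key fingerprints.
TrustList = Dict[str, Set[str]]


def normalize_fingerprint(fingerprint: str) -> str:
    """Strip spaces and upper-case, so differently formatted copies compare equal."""
    return fingerprint.replace(" ", "").upper()


def key_trusted_for_author(trusted: TrustList,
                           author: str,
                           fingerprint: str) -> bool:
    """True if this key fingerprint is on the trust list for this author."""
    wanted = normalize_fingerprint(fingerprint)
    known = trusted.get(author.strip().lower(), set())
    return any(normalize_fingerprint(k) == wanted for k in known)
```

The lookup expects the keys of the trust list to be lower-case email
addresses. Allowing several fingerprints per address lets an author
move to a new key without changing identity.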

This leaves open the question of which people should be trusted to
provide software on a particular branch or at all. This is not a very
easy question for software to answer. We assume that people will know
by other means. For public code, it may be that all changesets are
re-signed by, say, samba-team@samba.org.

I think it is fair to distinguish people by an email address, or at
least by $ID@$DOMAIN. There is no need to have this actually receive
email, so spam need not be a problem.

The signing design is inspired by the very usable security afforded by
OpenSSH: it automatically protects where it can, and allows higher
security to users who want to do some work (by offline verification of
signatures).

Using a signing mechanism other than GPG when key developers already
have GPG and there is a big infrastructure to support it seems
undesirable. It is true that GPG is quite complex.

The purpose of signing is to protect against unauthorized modification
of archives.

Bazaar-NG can apply a GPG signature to both patches and manifests. This
allows a later proof that the revision and the changeset were produced
by the author they claim to have been written by.

We cannot cryptographically prove that a particular patch was merged
into a branch, because the person doing the merge might have subverted
the patch in the process of merging it. All we can prove
cryptographically is that the merge committer asserts they took the
patch.

GPGME and PyMe seem to give a reasonable interface for doing this:
there is a function to check a signature, and the return indicates the
signing name, with possible errors including a missing key, etc.

Sign branches, not revisions
''''''''''''''''''''''''''''

Aaron Bentley suggested the interesting idea of signing the mapping of
revisions onto branches, rather than revisions themselves. For
example a branch could contain just a signed pointer to the most
recent revision.

(It probably is useful to be able to check signatures on previous
revisions, for example when recovering from an intrusion.)
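
The signed-pointer idea can be sketched as follows. To keep the
example self-contained it uses an HMAC with a shared secret from the
Python standard library as a stand-in for a real public-key signature
such as GPG; that substitution is an assumption made purely for
illustration and would not give non-repudiation. The function names
and the example branch name and revision ids are hypothetical.

```python
import hashlib
import hmac


def sign_tip(branch_name: str, tip_revision_id: str, secret: bytes) -> str:
    """Return a tag binding a branch name to its most recent revision id.

    Stand-in only: a real system would produce a public-key signature here.
    """
    payload = f"{branch_name}\n{tip_revision_id}".encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_tip(branch_name: str, tip_revision_id: str,
               signature: str, secret: bytes) -> bool:
    """True if the tag matches this branch name and tip revision id."""
    expected = sign_tip(branch_name, tip_revision_id, secret)
    return hmac.compare_digest(expected, signature)


# Usage: the branch publishes the pointer together with its tag.
secret = b"demo-secret"
tag = sign_tip("bzr.dev", "rev-42", secret)
pointer_is_valid = verify_tip("bzr.dev", "rev-42", tag, secret)
```

Including the branch name in the signed payload means a valid pointer
for one branch cannot be replayed as the tip of another.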

Protocol attacks
----------------

Both client and server should be resistant to malicious changesets,
network requests, etc. There's no easy solution.

* Defense in depth. Check reasonableness at various points.

* Disallow changesets that try to change files outside of the branch.
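
One way to approach the second point is to validate every path in an
incoming changeset before touching the filesystem. The Python sketch
below assumes that changeset paths are relative to the branch root and
use forward slashes; the function name ``path_stays_in_branch`` is
hypothetical. It does not deal with symbolic links inside the branch,
which would need a separate check.

```python
import posixpath


def path_stays_in_branch(relpath: str) -> bool:
    """True if a changeset path, relative to the branch root, cannot escape it."""
    # Reject empty paths, embedded NULs and backslashes (assumed never legal).
    if not relpath or "\0" in relpath or "\\" in relpath:
        return False
    # Absolute paths ignore the branch root entirely.
    if relpath.startswith("/"):
        return False
    # Collapse "a/../b" style segments, then look for a leading "..".
    normalized = posixpath.normpath(relpath)
    if normalized in (".", "..") or normalized.startswith("../"):
        return False
    return True
```

Checking the normalized form catches paths such as ``a/../../b`` that
look harmless at the start but still climb out of the branch.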

Availability
------------

bzr can be configured so as to have no single point of failure to a
denial-of-service attack (or at least nearly none):

* A branch can have any number of mirrors.

* If a central server is taken out, developers can continue working
  with state they already have (unbind their branches), and can
  collaborate by email or other means until the server is repaired or
  replaced.

* The origin branch can be on a machine whose location is secret and
  which is not directly publicly accessible.

* Branches can be moved between machines or IP addresses without
  disrupting anything else.

* Branches can be moved around out-of-band, as tarballs over
  bittorrent, etc.

I think the only possible denial of service attacks are those that aim
to shut down the entire network, or block communication with
individual developers, for example by flooding their email address.
But if those people can get connected through some other means, they
can continue.
