3794.5.12
by Mark Hammond
First cut at developer docs. |

Case Insensitive File Systems
=============================

Bazaar must be portable across operating-systems and file-systems. While the
primary file-system for an operating-system might have some particular
characteristics, it is not guaranteed that *all* file-systems for that
operating-system will have the same characteristics.

For example, the FAT32 file-system is most commonly found on Windows operating
systems, and has the characteristics usually associated with a Windows
file-system. However, because of USB devices, FAT32 file-systems are also
often used with GNU/Linux systems, so the current operating system doesn't
necessarily reflect the capabilities of the file-system.

Bazaar supports three kinds of file-systems, each to a different degree.

* Case-sensitive file-systems: This is the kind of file-system generally used
  on GNU/Linux: 2 files can differ only by case, and the exact case must be
  used when opening a file.

* Case-insensitive, case-preserving (cicp) file-systems: This is the kind of
  file-system generally used on Windows; FAT32 is an example of such a
  file-system. Although existing files can be opened using any case, the
  exact case used to create the file is preserved and available for programs
  to query. Two files that differ only by case are not allowed.

* Case-insensitive: This is the file-system used by very old Windows versions
  and is rarely encountered "in the wild". Two files that differ only by
  case are not allowed and the case used to create a file is not preserved.

As can be inferred from the above descriptions, only the first two are
considered relevant to a modern Bazaar.

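Which of these kinds applies to a given directory can be found by probing the
directory itself rather than by looking at the operating system. The following
is a minimal sketch of such a probe, not Bazaar's own detection code; the
function name 'classify_case_handling' and the labels it returns are made up
for this illustration.

```python
import os
import tempfile


def classify_case_handling(directory):
    """Classify directory as 'case-sensitive', 'cicp' or 'case-insensitive'.

    Hypothetical helper for illustration only; it is not part of Bazaar.
    """
    # Create a probe file whose name mixes upper and lower case.
    fd, path = tempfile.mkstemp(prefix="CaseProbe", dir=directory)
    os.close(fd)
    try:
        name = os.path.basename(path)
        swapped = os.path.join(directory, name.swapcase())
        # If the name with every letter's case swapped cannot be found,
        # the exact case matters on this file-system.
        if not os.path.exists(swapped):
            return "case-sensitive"
        # Any case opens the file. If the directory listing still reports
        # the name exactly as it was created, the case was preserved.
        if name in os.listdir(directory):
            return "cicp"
        return "case-insensitive"
    finally:
        os.remove(path)
```

The probe needs write access to the directory and removes its temporary file
before returning. Because the answer belongs to the directory and not to the
platform, a FAT32 USB device mounted on GNU/Linux would be expected to report
'cicp'.
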
For more details, including use cases, please see
http://wiki.bazaar.canonical.com/CasePreservingWorkingTreeUseCases

Handling these file-systems
---------------------------

The fundamental problem in handling these file-systems is that the user may
specify a file name or inventory item with an "incorrect" case, where
"incorrect" simply means different from what is stored. From the user's
point of view, the filename is still correct, as it can be used to open, edit,
delete etc. the item.

The approach Bazaar takes is to "fixup" each of the command-line arguments
which refer to a filename or an inventory item, where "fixup" means to
adjust the case specified by the user so it exactly matches an existing item.

There are two places this match can be performed against: the file-system
and the Bazaar inventory. When looking at a case-insensitive file-system, it
is impossible to have 2 names that differ only by case, so there is no
ambiguity. The inventory doesn't have the same rules, but it is expected that
projects which wish to work with Windows would, by convention, avoid filenames
that differ only by case.

The rules for such fixups turn out to be quite simple:

* If an argument refers to an existing inventory item, we fixup the argument
  using the inventory. This covers essentially all commands that take a
  filename or directory argument *other* than 'add' and, in some cases, 'mv'.

* If an argument refers to an existing filename for the creation of an
  inventory item (eg, add), then the case of the existing file on the disk
  will be used. However, Bazaar must still check the inventory to prevent
  accidentally creating 2 inventory items that differ only by case.

* If an argument results in the creation of a *new* filename (eg, a move
  destination), the argument will be used as specified. Bazaar will create
  a file and inventory item that exactly matches the case specified (although,
  as above, care must be taken to avoid creating two inventory items that
  differ only by case).

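The three rules above can be sketched as a single function. This is only an
illustration under simplifying assumptions: names are plain strings within one
directory, the inventory and the disk are given as lists of names, and
'fixup_argument', 'match_ignoring_case' and the 'kind' values are hypothetical
and do not exist in Bazaar.

```python
def match_ignoring_case(name, candidates):
    """Return the candidate equal to name when case is ignored, or None."""
    if name in candidates:
        return name
    # str.lower() is only an approximation of how a real file-system
    # compares names.
    folded = name.lower()
    for candidate in candidates:
        if candidate.lower() == folded:
            return candidate
    return None


def fixup_argument(name, kind, inventory_names, disk_names):
    """Adjust the case of name according to the three fixup rules.

    kind is 'existing' (refers to an inventory item), 'add' (refers to a
    file on disk that becomes an inventory item) or 'new' (a name that
    is about to be created, such as a move destination).
    """
    if kind == "existing":
        # Rule 1: take the case from the inventory.
        match = match_ignoring_case(name, inventory_names)
        return name if match is None else match
    if kind == "add":
        # Rule 2: take the case from the file on disk.
        match = match_ignoring_case(name, disk_names)
        result = name if match is None else match
    elif kind == "new":
        # Rule 3: use the name exactly as specified.
        result = name
    else:
        raise ValueError("unknown kind: %r" % (kind,))
    # Rules 2 and 3 must not create a second inventory item that
    # differs from an existing one only by case.
    clash = match_ignoring_case(result, inventory_names)
    if clash is not None and clash != result:
        raise ValueError("%r differs from %r only by case" % (result, clash))
    return result
```

With an inventory of 'README' and 'src' and an unversioned 'NEWS.txt' on disk,
the argument 'readme' is fixed up to 'README' for most commands, 'news.TXT'
is added as 'NEWS.txt', and a move to 'Readme' is refused.
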
Implementation of support for these file-systems
------------------------------------------------

From the description above, it can be seen the implementation is fairly
simple and need not intrude on the internals of Bazaar too much; most of
the time it is simply converting a string specified by the user to the
"canonical" form as stored in either the inventory or filesystem. This
boils down to the following new API functions:

* osutils.canonical_relpath() - like osutils.relpath() but adjusts the case
  of the result to match any existing items.

* Tree.get_canonical_inventory_path() - somewhat like
  Tree.get_symlink_target(), Tree.get_file_by_path() etc; returns a name with
  the case adjusted to match existing inventory items.

* osutils.canonical_relpaths() and Tree.get_canonical_inventory_paths() - like
  the 'singular' versions above, but accept and return sequences and therefore
  offer more optimization opportunities when working with multiple names.

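To give a feel for what canonicalising against the file-system involves, here
is a minimal sketch that fixes the case of each path component by comparing it
with the directory listing. It is not the code behind
osutils.canonical_relpath(); the '_sketch' names, the '/' separator and the
shared 'listings' cache are assumptions made for this example.

```python
import os


def canonical_relpath_sketch(base, relpath, listings=None):
    """Return relpath with each component's case matched to the disk.

    base is an existing directory and relpath uses '/' as separator.
    listings optionally caches directory listings between calls.
    """
    if listings is None:
        listings = {}
    current = base
    fixed = []
    for part in [p for p in relpath.split("/") if p]:
        entries = listings.get(current)
        if entries is None:
            try:
                entries = os.listdir(current)
            except OSError:
                # Missing or not a directory: nothing to match against.
                entries = []
            listings[current] = entries
        if part in entries:
            match = part
        else:
            # First entry that is equal when case is ignored; if nothing
            # matches, the component is kept as the user wrote it.
            folded = part.lower()
            match = next((e for e in entries if e.lower() == folded), part)
        fixed.append(match)
        current = os.path.join(current, match)
    return "/".join(fixed)


def canonical_relpaths_sketch(base, relpaths):
    """Plural form: one listing cache is shared by all the paths."""
    listings = {}
    return [canonical_relpath_sketch(base, p, listings) for p in relpaths]
```

The plural form shows where the optimization opportunity comes from: each
directory is listed at most once, however many of the names live in it.
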
The only complication is the requirement that Bazaar not allow the creation
of items that differ only by case on such file-systems. For this requirement,
case-insensitive and cicp file-systems can be treated the same. The
'case_sensitive' attribute on a MutableTree is used to control this behaviour.
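
How a flag like this can drive the check is sketched below. 'SketchTree' is a
hypothetical stand-in and not MutableTree; only the attribute name
'case_sensitive' is taken from the text above, and the way it is consulted
here is an assumption.

```python
class SketchTree:
    """Toy tree that refuses names differing only by case when asked to."""

    def __init__(self, case_sensitive):
        self.case_sensitive = case_sensitive
        self._names = {}  # lookup key -> name exactly as stored

    def _key(self, path):
        # On a case-sensitive tree the name is its own key; otherwise
        # names that differ only by case share one key.
        return path if self.case_sensitive else path.lower()

    def add(self, path):
        key = self._key(path)
        existing = self._names.get(key)
        if existing is not None and existing != path:
            raise ValueError(
                "%r differs from %r only by case" % (path, existing))
        self._names[key] = path

    def names(self):
        return sorted(self._names.values())
```

With 'case_sensitive' set to False, adding 'readme' after 'README' is refused;
with it set to True, both names are stored side by side.
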