~bzr-pqm/bzr/bzr.dev

Viewing changes to bzrlib/utextwrap.py

  • Committer: INADA Naoki
  • Date: 2011-05-10 00:43:28 UTC
  • mto: This revision was merged to the branch mainline in revision 5874.
  • Revision ID: songofacandy@gmail.com-20110510004328-mj3uuldb61zsmsp3
Document some limitations in the docstring.

@@ -42,6 +42,17 @@
     :param ambiguous_width: (keyword argument) width for character when
                             unicodedata.east_asian_width(c) == 'A'
                             (default: 1)
+
+    Limitations:
+    * expand_tabs is not fixed. It uses len() to calculate the width
+      of the string to the left of a TAB.
+    * Handles one code unit as a single character having width 1 or 2.
+      This is not correct when there are surrogate pairs, combining
+      characters or zero-width characters.
+    * Treats all Asian characters as line breakable. This is not
+      true because line breaking is prohibited around some characters.
+      (For example, breaking before a punctuation mark is prohibited.)
+      See UAX #14 "Unicode Line Breaking Algorithm".
     """
 
     def __init__(self, width=None, **kwargs):
@@ -194,10 +205,6 @@
             assert chunk # TextWrapper._split removes empty chunk
             prev_pos = 0
             for pos, char in enumerate(chunk):
-                # Treats all asian character are line breakable.
-                # But it is not true because line breaking is
-                # prohibited around some characters.
-                # See UAX # 14 "UNICODE LINE BREAKING ALGORITHM"
                 if _eawidth(char) in 'FWA':
                     if prev_pos < pos:
                         cjk_split_chunks.append(chunk[prev_pos:pos])
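
The limitations named in the docstring can be reproduced with a short, self-contained sketch. This is not bzrlib code: naive_width and split_at_wide_chars are hypothetical helpers written for illustration, assuming Python 3 and only the standard unicodedata module. They mimic per-character width counting and the F/W/A test visible in the diff; the rest of the splitting loop is an assumption, since the diff does not show it.

```python
import unicodedata


def naive_width(text, ambiguous_width=1):
    """Sum per-character column widths, one code point at a time.

    Hypothetical helper: 'F' and 'W' count as 2 columns, 'A' counts as
    ambiguous_width, and everything else counts as 1.
    """
    total = 0
    for char in text:
        kind = unicodedata.east_asian_width(char)
        if kind in ("F", "W"):
            total += 2
        elif kind == "A":
            total += ambiguous_width
        else:
            total += 1
    return total


def split_at_wide_chars(chunk):
    """Split a chunk so that every F/W/A character becomes its own piece.

    Assumed behaviour: each piece boundary is treated as a possible
    line break, with no check against UAX #14 line breaking classes.
    """
    pieces = []
    prev_pos = 0
    for pos, char in enumerate(chunk):
        if unicodedata.east_asian_width(char) in "FWA":
            if prev_pos < pos:
                pieces.append(chunk[prev_pos:pos])
            pieces.append(char)
            prev_pos = pos + 1
    if prev_pos < len(chunk):
        pieces.append(chunk[prev_pos:])
    return pieces


if __name__ == "__main__":
    # Combining character: "e" + COMBINING ACUTE ACCENT renders in one
    # column, but per-character counting adds a width for the accent too.
    print(naive_width("e\u0301"))

    # Zero-width character: ZERO WIDTH SPACE still gets a width of 1.
    print(naive_width("\u200b"))

    # expandtabs() counts characters, not columns, so the padding is
    # computed as if each wide character occupied a single column.
    expanded = "\u65e5\u672c\tx".expandtabs(8)
    print(repr(expanded), naive_width(expanded))

    # The ideographic full stop becomes its own piece, so a wrapper that
    # breaks between pieces may start a line with it.
    print(split_at_wide_chars("\u65e5\u672c\u8a9e\u3002abc"))
```

The surrogate-pair case is not reproduced here because Python 3 strings iterate by code point; it arises where strings are sequences of UTF-16 code units. A wrapper that wanted to respect UAX #14 would need to consult line breaking classes before accepting a boundary, which the standard unicodedata module does not expose.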