~bzr-pqm/bzr/bzr.dev


Viewing changes to bzrlib/utextwrap.py

Committer: INADA Naoki
Date: 2011-05-10 00:43:28 UTC
mto: This revision was merged to the branch mainline in revision 5874.
Revision ID: songofacandy@gmail.com-20110510004328-mj3uuldb61zsmsp3

Document some limitations in the docstring.

files modified:
bzrlib/utextwrap.py


bzrlib/utextwrap.py

:param ambiguous_width: (keyword argument) width for character when
                        unicodedata.east_asian_width(c) == 'A'
                        (default: 1)

Limitations:
* expand_tabs is not fixed. It uses len() to calculate the width
  of the string to the left of a TAB.
* Handles each code unit as a single character of width 1 or 2.
  This is not correct when there are surrogate pairs, combining
  characters or zero-width characters.
* Treats all Asian characters as line breakable. This is not
  true because line breaking is prohibited around some characters.
  (For example, breaking before a punctuation mark is prohibited.)
  See UAX #14 "Unicode Line Breaking Algorithm".

"""

def __init__(self, width=None, **kwargs):

@@ -194,10 +205,6 @@
  assert chunk # TextWrapper._split removes empty chunk
  prev_pos = 0
  for pos, char in enumerate(chunk):
-     # Treats all asian character are line breakable.
-     # But it is not true because line breaking is
-     # prohibited around some characters.
-     # See UAX # 14 "UNICODE LINE BREAKING ALGORITHM"
      if _eawidth(char) in 'FWA':
          if prev_pos < pos:
              cjk_split_chunks.append(chunk[prev_pos:pos])
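
The limitations listed in the docstring can be reproduced with a small standalone sketch. This is not the bzrlib code: it is written for Python 3, the names char_width, text_width and split_cjk are hypothetical, and it assumes that _eawidth in the diff behaves like unicodedata.east_asian_width. The end of the splitting loop lies outside the hunk shown above, so that part is a guess at the obvious completion.

```python
import unicodedata


def char_width(char, ambiguous_width=1):
    """Return an assumed display width (1 or 2) for one code point."""
    category = unicodedata.east_asian_width(char)
    if category in ("F", "W"):
        return 2
    if category == "A":
        return ambiguous_width
    return 1


def text_width(text, ambiguous_width=1):
    """Sum per-code-point widths; never yields 0 for any character."""
    return sum(char_width(c, ambiguous_width) for c in text)


def split_cjk(chunk):
    """Split a chunk so that each F/W/A character becomes its own piece."""
    pieces = []
    prev_pos = 0
    for pos, char in enumerate(chunk):
        if unicodedata.east_asian_width(char) in "FWA":
            if prev_pos < pos:
                pieces.append(chunk[prev_pos:pos])
            # Assumed continuation: the hunk above ends before this point.
            pieces.append(char)
            prev_pos = pos + 1
    if prev_pos < len(chunk):
        pieces.append(chunk[prev_pos:])
    return pieces


# Limitation 1: str.expandtabs() pads by character count, not by columns.
tabbed = "日本\tx".expandtabs(8)

# Limitation 2: a combining accent or zero-width space still counts as 1.
combining = text_width("e\u0301")
zero_width = text_width("a\u200bb")

# Limitation 3: the full stop is split off, so a break before it is allowed.
pieces = split_cjk("日本語。")
```

Here text_width(tabbed) is 11, although a width-aware tab expansion would place "x" at column 8 for a total of 9 columns. Both combining and zero_width overcount, and pieces ends with "。" as a separate chunk, which is the line break that UAX #14 prohibits. The surrogate-pair case does not reproduce in Python 3, where str is indexed by code point rather than by code unit.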
