Module trac.util.text

Classes
  unicode_passwd
Conceal the actual content of the string when repr is called.
  UnicodeTextWrapper
Functions
 
to_unicode(text, charset=None)
Convert input to a unicode object.

exception_to_unicode(e, traceback=False)
 
path_to_unicode(path)
Convert a filesystem path to unicode, using the filesystem encoding.

javascript_quote(text)
Quote strings for inclusion in single- or double-quote-delimited JavaScript strings.

to_js_string(text)
Embed the given string in a double-quote-delimited JavaScript string (conforming to the JSON spec).
 
unicode_quote(value, safe='/')
A unicode-aware version of urllib.quote.

unicode_quote_plus(value, safe='')
A unicode-aware version of urllib.quote_plus.

unicode_unquote(value)
A unicode-aware version of urllib.unquote. Returns unicode.

unicode_urlencode(params, safe='')
A unicode-aware version of urllib.urlencode.
 
quote_query_string(text)
Quote strings for use in a query string.

to_utf8(text, charset='iso-8859-15')
Convert a string to UTF-8, assuming the encoding is either UTF-8, ISO Latin-1, or as specified by the optional charset parameter.

console_print(out, *args, **kwargs)

printout(*args, **kwargs)

printerr(*args, **kwargs)

raw_input(prompt)
 
text_width(text, ambiwidth=1)
Determine the column width of text in Unicode characters.

print_table(data, headers=None, sep=' ', out=None, ambiwidth=None)
Print data as a table in the terminal.

shorten_line(text, maxlen=75)

wrap(t, cols=75, initial_indent='', subsequent_indent='', linesep='\n', ambiwidth=1)
Wraps the single paragraph in t, which contains unicode characters.

obfuscate_email_address(address)
 
breakable_path(path)
Make a path breakable after path separators, and conversely, avoid breaking at spaces.

normalize_whitespace(text, to_space=u' ', remove=u'')
Normalize whitespace in a string, by replacing special spaces by normal spaces and removing zero-width spaces.

pretty_size(size, format='%.1f')

expandtabs(s, tabstop=8, ignoring=None)

fix_eol(text, eol)
Fix end-of-lines in a text.
 
unicode_to_base64(text, strip_newlines=True)
Safe conversion of text to base64 representation using utf-8 bytes.

unicode_from_base64(text)
Safe conversion of text to unicode based on utf-8 bytes.

levenshtein_distance(lhs, rhs)
Return the Levenshtein distance between two strings.
Variables
  CRLF = '\r\n'
  empty = u''
  __package__ = 'trac.util'
  c = '>'
  i = 62

Imports: __builtin__, locale, os, re, sys, textwrap, quote, quote_plus, unquote, east_asian_width, _, ctypes


Function Details

to_unicode(text, charset=None)

Convert input to a unicode object.

For a str object, the bytes are first decoded using the given charset encoding (or UTF-8 if none is specified). If that fails, the function falls back to the latin1 encoding, which may or may not be correct, but at least preserves the original byte sequence by mapping each byte to the corresponding unicode code point in the range U+0000 to U+00FF.

Otherwise, a simple unicode() conversion is attempted, with some special care taken for Exception objects.
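
The decoding fallback described above can be sketched as follows. This is a Python 3 illustration of the described behaviour for byte strings only, not the module's own code; the name decode_with_fallback is hypothetical.

```python
def decode_with_fallback(data, charset=None):
    """Hypothetical Python 3 analogue of the str branch described above."""
    try:
        # First attempt: the requested charset, or UTF-8 when none is given.
        return data.decode(charset or 'utf-8')
    except UnicodeDecodeError:
        # latin1 maps every byte to U+0000..U+00FF, so this step cannot fail
        # and the original byte values stay recoverable.
        return data.decode('latin1')
```

With this sketch, both b'caf\xc3\xa9' (valid UTF-8) and b'caf\xe9' (latin1 bytes that are not valid UTF-8) decode to the same four-character string.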

unicode_quote(value, safe='/')

A unicode-aware version of urllib.quote.
Parameters:
  • value - anything that converts to a str. If unicode input is given, it will be UTF-8 encoded.
  • safe - as in quote, the characters that would otherwise be quoted but shouldn't be here (defaults to '/')

unicode_quote_plus(value, safe='')

A unicode-aware version of urllib.quote_plus.
Parameters:
  • value - anything that converts to a str. If unicode input is given, it will be UTF-8 encoded.
  • safe - as in quote_plus, the characters that would otherwise be quoted but shouldn't be here (defaults to '')

unicode_unquote(value)

A unicode-aware version of urllib.unquote.
Parameters:
  • value - UTF-8 encoded str value (for example, as obtained by unicode_quote).
Returns: unicode
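
As a rough illustration of the three quoting helpers, the following Python 3 sketch reproduces the documented behaviour (UTF-8 encoding before percent-quoting, and UTF-8 decoding after unquoting) using urllib.parse. The function names are hypothetical stand-ins, and the sketch assumes the defaults shown in the signatures above.

```python
from urllib.parse import quote, quote_plus, unquote


def quote_utf8(value, safe='/'):
    # Convert the value to text, then percent-quote its UTF-8 bytes.
    return quote(str(value), safe=safe, encoding='utf-8')


def quote_plus_utf8(value, safe=''):
    # Same as above, but spaces become '+' and '/' is quoted by default.
    return quote_plus(str(value), safe=safe, encoding='utf-8')


def unquote_utf8(value):
    # Reverse the percent-quoting and decode the bytes as UTF-8.
    return unquote(value, encoding='utf-8')
```

For example, quote_utf8('/wiki/Caf\u00e9 menu') leaves the slashes alone and yields '/wiki/Caf%C3%A9%20menu', while quote_plus_utf8 on the same input also quotes the slashes and turns the space into '+'.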

unicode_urlencode(params, safe='')

A unicode aware version of urllib.urlencode.

Values set to empty are converted to the key alone, without the equal sign.
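
The key-only behaviour can be sketched as below. This is a Python 3 illustration under assumptions: EMPTY is a hypothetical sentinel standing in for the module's empty variable, the check is assumed to be an identity test, and urlencode_with_bare_keys is not a real Trac function.

```python
from urllib.parse import quote_plus

# Hypothetical sentinel standing in for the module-level `empty` value.
EMPTY = object()


def urlencode_with_bare_keys(params, safe=''):
    """Encode (key, value) pairs; EMPTY values produce the key alone."""
    if isinstance(params, dict):
        params = params.items()
    parts = []
    for key, value in params:
        encoded_key = quote_plus(str(key), safe=safe)
        if value is EMPTY:
            # No equal sign at all, as opposed to 'key=' for an empty string.
            parts.append(encoded_key)
        else:
            parts.append(encoded_key + '=' + quote_plus(str(value), safe=safe))
    return '&'.join(parts)
```

Under these assumptions, the pairs [('q', 'caf\u00e9 x'), ('debug', EMPTY)] encode to 'q=caf%C3%A9+x&debug'.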

to_utf8(text, charset='iso-8859-15')

Convert a string to UTF-8, assuming the encoding is either UTF-8, ISO Latin-1, or as specified by the optional charset parameter.

Deprecated in 0.10. You should use unicode strings only.

text_width(text, ambiwidth=1)

Determine the column width of text in Unicode characters.

Characters in the East Asian Fullwidth (F) or East Asian Wide (W) categories have a column width of 2. Characters in the East Asian Halfwidth (H) or East Asian Narrow (Na) categories have a column width of 1.

The ambiwidth parameter sets the column width of East Asian Ambiguous (A) characters. If 1, they have the same width as US-ASCII characters, which is what most users expect. If 2, they are twice the width of US-ASCII characters, which is what CJK users expect.

cf. http://www.unicode.org/reports/tr11/.
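
The width rules above can be sketched with the standard unicodedata module. This is an illustrative Python 3 version, not the module's own code; column_width is a hypothetical name, and the sketch makes no attempt to handle combining characters.

```python
from unicodedata import east_asian_width


def column_width(text, ambiwidth=1):
    """Sum per-character column widths using East Asian Width categories."""
    # F and W always take two columns; A does so only when ambiwidth is 2.
    wide = ('F', 'W', 'A') if ambiwidth == 2 else ('F', 'W')
    return sum(2 if east_asian_width(ch) in wide else 1 for ch in text)


ascii_width = column_width('abc')                         # three narrow characters
kanji_width = column_width('\u65e5\u672c\u8a9e')          # three Wide characters
alpha_narrow = column_width('\u03b1', ambiwidth=1)        # Greek alpha is Ambiguous
alpha_wide = column_width('\u03b1', ambiwidth=2)
```

Here ascii_width is 3, kanji_width is 6, and the Greek alpha counts as 1 or 2 columns depending on ambiwidth.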

print_table(data, headers=None, sep=' ', out=None, ambiwidth=None)

Print data as a table in the terminal.

The ambiwidth parameter sets the column width of East Asian Ambiguous (A) characters. If None, ambiwidth is detected from the locale settings. Otherwise, the value is passed to the ambiwidth parameter of text_width.

wrap(t, cols=75, initial_indent='', subsequent_indent='', linesep='\n', ambiwidth=1)

Wraps the single paragraph in t, which contains unicode characters. Every line is at most cols characters long.

The ambiwidth parameter sets the column width of East Asian Ambiguous (A) characters. If 1, they have the same width as US-ASCII characters, which is what most users expect. If 2, they are twice the width of US-ASCII characters, which is what CJK users expect.
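
To show how ambiwidth can change where lines break, here is a simplified Python 3 sketch of width-aware wrapping. It is an assumption-laden illustration rather than the module's algorithm: wrap_paragraph and column_width are hypothetical names, lines are measured in columns, breaks happen only at whitespace, and a single word wider than cols is left overlong.

```python
from unicodedata import east_asian_width


def column_width(text, ambiwidth=1):
    # F and W take two columns; A does so only when ambiwidth is 2.
    wide = ('F', 'W', 'A') if ambiwidth == 2 else ('F', 'W')
    return sum(2 if east_asian_width(ch) in wide else 1 for ch in text)


def wrap_paragraph(t, cols=75, initial_indent='', subsequent_indent='',
                   linesep='\n', ambiwidth=1):
    """Greedy word wrap that measures lines in columns, not characters."""
    lines = []
    current = initial_indent
    has_word = False
    for word in t.split():
        candidate = current + (' ' if has_word else '') + word
        if has_word and column_width(candidate, ambiwidth) > cols:
            # The word does not fit: close this line and start a new one.
            lines.append(current)
            current = subsequent_indent + word
        else:
            current = candidate
        has_word = True
    if has_word:
        lines.append(current)
    return linesep.join(lines)
```

With cols=5, the text '\u03b1\u03b1 \u03b2\u03b2 \u03b3\u03b3' (three pairs of Greek letters) wraps into two lines when ambiwidth is 1 and into three lines when ambiwidth is 2, because each Ambiguous character then occupies two columns.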

unicode_to_base64(text, strip_newlines=True)

Safe conversion of text to base64 representation using utf-8 bytes.

Strips newlines from output unless strip_newlines is False.
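
A minimal Python 3 sketch of this conversion and its inverse follows. The names text_to_base64 and text_from_base64 are hypothetical, and the sketch assumes the encoder emits MIME-style output with a newline after every 76 characters and at the end, which is what base64.encodebytes produces.

```python
import base64


def text_to_base64(text, strip_newlines=True):
    """Encode text as base64 over its UTF-8 bytes."""
    encoded = base64.encodebytes(text.encode('utf-8')).decode('ascii')
    if strip_newlines:
        # Drop the line breaks that the MIME-style encoder inserts.
        encoded = encoded.replace('\n', '')
    return encoded


def text_from_base64(encoded):
    """Decode base64 back to text, interpreting the bytes as UTF-8."""
    return base64.b64decode(encoded).decode('utf-8')
```

In this sketch, text_to_base64('h\u00e9llo') gives 'aMOpbGxv', and passing strip_newlines=False keeps the trailing newline added by the encoder.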