Functional tools for strings

s() is very convenient, but it’s only a thin wrapper on top of regular strings and the tools from this module.

So if you want to apply some of the goodies from it without having to turn your strings into StringWrapper objects, you can use the functions from this module directly.

These functions don’t accept bytes as input. If you pass bytes and it happens to work, that behavior is unsupported and may change in the future. Only pass:

  • unicode objects in Python 2;
  • str objects in Python 3.
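
If your data arrives as bytes, one way to follow the rule above is to decode it before calling anything from this module. The helper below is only a sketch: to_text and its utf-8 default are hypothetical and are not part of ww.

```python
import sys

# Text type of the running interpreter: unicode on Python 2, str on Python 3.
# Only the selected branch is evaluated, so this also runs on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_text(value, encoding='utf-8'):
    """Hypothetical helper (not part of ww): return text, decoding bytes first."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        # ASSUMPTION: the bytes are encoded with `encoding`.
        return value.decode(encoding)
    raise TypeError('expected text or bytes, got %s' % type(value).__name__)


print(to_text(b'a,b;c/d'))  # a,b;c/d
```

The decoded value can then be passed to the functions documented here.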

Example

>>> from ww.tools.strings import multisplit  # same as s().split()
>>> string = u'a,b;c/d=a,b;c/d'
>>> chunks = multisplit(string, u',', u';', u'[/=]', maxsplit=4)
>>> for chunk in chunks: print(chunk)
a
b
c
d
a,b;c/d

You’ll find below the detailed documentation for each function of this module. Go have a look, there is some great stuff here!

ww.tools.strings.multireplace(string, patterns, substitutions, maxreplace=0, flags=0)

Like unicode.replace(), but accepts several substitutions and regexes.

Parameters:
  • string – the string to perform the replacements on.
  • patterns – a string, or an iterable of strings, to be replaced.
  • substitutions – a string, or an iterable of strings, to use as replacements. You can pass either a single string or an iterable containing as many substitutions as you passed patterns. You can also pass a callable instead of a string; it should expect a match object as its parameter.
  • maxreplace – the maximum number of replacements to make. 0 means no limit, which is the default.
  • flags – flags you wish to pass if you use regexes. You should pass them as a string containing a combination of:

    • ‘m’ for re.MULTILINE
    • ‘x’ for re.VERBOSE
    • ‘v’ for re.VERBOSE
    • ‘s’ for re.DOTALL
    • ‘.’ for re.DOTALL
    • ‘d’ for re.DEBUG
    • ‘i’ for re.IGNORECASE
    • ‘u’ for re.UNICODE
    • ‘l’ for re.LOCALE
Returns:

The string with replaced bits.

Raises:

ValueError – if you pass the wrong number of substitutions.
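
The letter-to-constant mapping listed under flags can be pictured as a small lookup table. The sketch below shows one way such a string could be turned into a value for the re module. It is an assumption for illustration, not ww's own parser, and FLAG_LETTERS and flags_from_letters are hypothetical names.

```python
import re

# Hypothetical lookup table copied from the letters listed in this page.
FLAG_LETTERS = {
    'm': re.MULTILINE,
    'x': re.VERBOSE,
    'v': re.VERBOSE,
    's': re.DOTALL,
    '.': re.DOTALL,
    'd': re.DEBUG,
    'i': re.IGNORECASE,
    'u': re.UNICODE,
    'l': re.LOCALE,
}


def flags_from_letters(letters):
    """Combine one re flag per letter with bitwise OR (illustrative only)."""
    combined = 0
    for letter in letters:
        if letter not in FLAG_LETTERS:
            raise ValueError('unknown flag letter: %r' % letter)
        combined |= FLAG_LETTERS[letter]
    return combined


flags = flags_from_letters('im')
print(re.findall('^b', 'a\nB', flags=flags))  # ['B']
```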

Example

>>> print(multireplace(u'a,b;c/d', (u',', u';', u'/'), u','))
a,b,c,d
>>> print(multireplace(u'a1b33c-d', u'\d+', u','))
a,b,c-d
>>> print(multireplace(u'a-1,b-3,3c-d', u',|-', u'', maxreplace=3))
a1b3,3c-d
>>> def upper(match):
...     return match.group().upper()
...
>>> print(multireplace(u'a-1,b-3,3c-d', u'[ab]', upper))
A-1,B-3,3c-d
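
To make the behavior shown in these examples concrete, here is a rough standard-library approximation of multireplace() built on re.subn(). It is not ww's implementation. The name multireplace_sketch is hypothetical, the code targets Python 3 only, it expects flags as a plain re flag value rather than a string of letters, and it assumes maxreplace is a budget shared across all patterns, which the parameter description does not spell out.

```python
import re


def multireplace_sketch(string, patterns, substitutions, maxreplace=0, flags=0):
    """Approximate multireplace() with re.subn(); hypothetical, not ww's code."""
    # Accept a single pattern as well as an iterable of patterns.
    patterns = [patterns] if isinstance(patterns, str) else list(patterns)

    # A lone string or callable is reused for every pattern.
    if isinstance(substitutions, str) or callable(substitutions):
        substitutions = [substitutions] * len(patterns)
    else:
        substitutions = list(substitutions)

    if len(substitutions) != len(patterns):
        raise ValueError('pass one substitution, or one per pattern')

    remaining = maxreplace  # 0 means "no limit", as in re.subn()
    for pattern, substitution in zip(patterns, substitutions):
        string, count = re.subn(pattern, substitution, string,
                                count=remaining, flags=flags)
        if maxreplace:
            remaining -= count  # ASSUMPTION: the budget is shared
            if remaining == 0:
                break
    return string


print(multireplace_sketch('a,b;c/d', (',', ';', '/'), ','))  # a,b,c,d
print(multireplace_sketch('a-1,b-3,3c-d', ',|-', '', maxreplace=3))  # a1b3,3c-d
```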

ww.tools.strings.multisplit(string, *separators, **kwargs)

Like unicode.split(), but accepts several separators and regexes.

Parameters:
  • string – the string to split.
  • separators – strings you can split on. Each string can be a regex.
  • maxsplit – the maximum number of times you wish to split. The default is 0, which means no limit.
  • flags – flags you wish to pass if you use regexes. You should pass them as a string containing a combination of:

    • ‘m’ for re.MULTILINE
    • ‘x’ for re.VERBOSE
    • ‘v’ for re.VERBOSE
    • ‘s’ for re.DOTALL
    • ‘.’ for re.DOTALL
    • ‘d’ for re.DEBUG
    • ‘i’ for re.IGNORECASE
    • ‘u’ for re.UNICODE
    • ‘l’ for re.LOCALE
  • cast – what to cast the result to
Returns:

An iterable of substrings.

Raises:
  • ValueError – if you pass a flag without separators.
  • TypeError – if you pass anything other than unicode strings.

Example

>>> for word in multisplit(u'fat     black cat, big'): print(word)
fat
black
cat,
big
>>> string = u'a,b;c/d=a,b;c/d'
>>> chunks = multisplit(string, u',', u';', u'[/=]', maxsplit=4)
>>> for chunk in chunks: print(chunk)
a
b
c
d
a,b;c/d
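
The splitting behavior in these examples can be approximated with the standard library alone. The sketch below is not ww's implementation: multisplit_sketch is a hypothetical name, it targets Python 3 only, it takes flags as a plain re flag value instead of a string of letters, it leaves out the cast parameter, and it returns a list.

```python
import re


def multisplit_sketch(string, *separators, maxsplit=0, flags=0):
    """Approximate multisplit() with re.split(); hypothetical, not ww's code."""
    if not separators:
        if flags:
            raise ValueError('flags only make sense with separators')
        # No separator: split on runs of whitespace, like str.split().
        # str.split() uses -1, not 0, to mean "no limit".
        return string.split(None, maxsplit or -1)
    # Join the separators into one alternation; each is used as a regex as-is.
    pattern = '|'.join('(?:%s)' % separator for separator in separators)
    return re.split(pattern, string, maxsplit=maxsplit, flags=flags)


print(multisplit_sketch('fat     black cat, big'))
# ['fat', 'black', 'cat,', 'big']
print(multisplit_sketch('a,b;c/d=a,b;c/d', ',', ';', '[/=]', maxsplit=4))
# ['a', 'b', 'c', 'd', 'a,b;c/d']
```

Wrapping each separator in a non-capturing group keeps re.split() from adding the matched separators to the result.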