Regular Expressions

From Training Material
Revision as of 20:02, 5 February 2013 by Kristian Rother (talk | contribs)


More power for string operations

Regular expressions let you perform string search and replace operations using patterns.

import re

text = 'all ways lead to Rome'
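Checking whether a pattern occurs in the text (returns a match object, or None if nothing matches):
re.search(r'R...\s', text)   # None here: 'Rome' is not followed by whitespace
re.search(r'R...', text)     # matches 'Rome'
Finding all matches (if the pattern contains a group, only the group is returned):
re.findall(r'\s(.o)', text)  # ['to', 'Ro']
Replacing:
re.sub(r'R[meo]+', 'London', text)  # 'all ways lead to London'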
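
The match object returned by re.search can be inspected to see what was matched and where. The following is a minimal sketch that reuses the example text; the variable names matched_word and position are chosen here only for illustration.

```python
import re

text = 'all ways lead to Rome'

matched_word = None
position = None

# re.search returns a match object, or None when nothing matches
match = re.search(r'R[a-z]+', text)
if match:
    matched_word = match.group()   # the text that matched
    position = match.start()       # index where the match begins
    print(matched_word, position)
else:
    print('no match')
```

Testing the result with `if match:` before calling its methods avoids an AttributeError when the pattern is not found.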

How to find the right pattern for your problem

Finding the right regular expression usually takes a good deal of trial and error. You can test regular expressions online before including them in your program:

http://www.regexplanet.com/simple/
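
The same trial-and-error approach also works directly in Python. The sketch below runs one candidate pattern against a few sample strings; the pattern and the samples are hypothetical and serve only as an illustration.

```python
import re

# Hypothetical pattern and test strings, chosen only for illustration
pattern = r'\d+-\d+'
samples = ['pages 10-15', 'no numbers here', 'range 3-7 and 8-12']

results = {}
for sample in samples:
    # collect every non-overlapping match of the pattern in this sample
    results[sample] = re.findall(pattern, sample)
    print(repr(sample), '->', results[sample])
```

Including at least one string that should not match helps to spot patterns that are too permissive.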

Characters used in RegEx patterns:

Some of the most commonly used characters in Regular Expression patterns are:

\d - decimal digit [0-9]

\w - alphanumeric character or underscore [a-zA-Z0-9_]

\A - start of the text

\Z - end of the text

[ABC] - one of the characters A, B, or C

. - any character (except a newline, by default)

[^A] - any character except A

a+ - one or more of pattern a

a* - zero or more of pattern a

a|b - either pattern a or b matches

(a) - group; re.findall returns only the part matched by a

\s - whitespace character (space, tab, newline)
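
The characters listed above can be tried out on a short string. This is a minimal sketch; the example text and the variable names are made up for this illustration.

```python
import re

text = 'Room 42, Floor 7'   # example text, made up for this sketch

digits = re.findall(r'\d+', text)         # runs of one or more digits
words = re.findall(r'\w+', text)          # runs of letters, digits, underscores
first = re.findall(r'\A\w+', text)        # word at the very start of the text
last = re.findall(r'\d+\Z', text)         # digits at the very end of the text
vowels = re.findall(r'[aeiou]', text)     # any one of the listed characters
not_o = re.findall(r'[^o]', 'Room')       # every character except 'o'
either = re.findall(r'Room|Floor', text)  # either alternative matches
grouped = re.findall(r'(\w+)\s\d', text)  # only the group is returned

for name, value in [('digits', digits), ('words', words), ('first', first),
                    ('last', last), ('vowels', vowels), ('not_o', not_o),
                    ('either', either), ('grouped', grouped)]:
    print(name, value)
```

Writing patterns as raw strings (r'...') keeps Python from interpreting backslashes before the re module sees them.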

Ignoring case

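If the case of the text should be ignored during pattern matching, pass flags=re.IGNORECASE to the re function. Use the keyword form: in some functions, such as re.sub, the next positional parameter is not the flags (in re.sub it is count).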
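
A minimal sketch of case-insensitive matching, reusing the example text from the beginning of this page; the variable names are chosen here for illustration.

```python
import re

text = 'all ways lead to Rome'

# The lowercase pattern still finds 'Rome' when case is ignored
found = re.search(r'rome', text, flags=re.IGNORECASE)

# Both 'A' and 'a' are returned
all_hits = re.findall(r'a', 'All ways', flags=re.IGNORECASE)

# In re.sub the flag must be given by keyword, because the fourth
# positional parameter is count, not flags
replaced = re.sub(r'ROME', 'London', text, flags=re.IGNORECASE)

print(found.group(), all_hits, replaced)
```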