65. Regular Expressions with re

Regular Expressions with re: Using the re module for pattern matching and string manipulation

The re module in Python provides a powerful mechanism for pattern matching and string manipulation through regular expressions (regex). Regular expressions allow you to search, match, and manipulate strings in complex ways. Below are various examples demonstrating how to use the re module effectively.


1. Basic Matching with re.match() and re.search()

  • re.match(): Tries to match a pattern from the start of the string.

  • re.search(): Searches for the pattern anywhere in the string.

import re

# re.match() example
result = re.match(r'hello', 'hello world')
if result:
    print("Match found:", result.group())
else:
    print("No match")

# re.search() example
result = re.search(r'world', 'hello world')
if result:
    print("Search found:", result.group())
else:
    print("No search match")

2. Finding All Matches with re.findall()

re.findall() returns a list of all non-overlapping matches in the string.


3. Replacing Text with re.sub()

The re.sub() function allows you to replace occurrences of a pattern in a string with a specified replacement.


4. Compiling Regular Expressions with re.compile()

You can compile a regular expression pattern into a regex object, which can be used multiple times for efficiency.


5. Using Groups with re.search()

You can use parentheses () in your regular expressions to create capture groups, which allow you to extract parts of the match.


6. Matching at the Start or End of a String

  • ^: Asserts the start of a string.

  • $: Asserts the end of a string.


7. Using re.split() to Split a String

The re.split() function splits the string based on the given regular expression pattern.


8. Using Wildcards with .

The dot . wildcard matches any character except a newline.


9. Using Character Classes

Character classes allow you to match specific types of characters.

  • \d: Matches any digit.

  • \w: Matches any alphanumeric character or underscore.

  • \s: Matches any whitespace character.


10. Using Quantifiers

Quantifiers specify how many times a pattern should match. Common quantifiers include:

  • *: Matches zero or more times.

  • +: Matches one or more times.

  • {n}: Matches exactly n times.

  • {n,}: Matches n or more times.

  • {n,m}: Matches between n and m times.


11. Anchoring with Word Boundaries \b

The \b symbol matches word boundaries, useful for matching whole words.


12. Case-Insensitive Matching with re.IGNORECASE

You can perform case-insensitive matching using the re.IGNORECASE flag.


Summary of Key Features:

  • Basic Functions: re.match(), re.search(), re.findall(), re.sub()

  • Groups: Capture specific parts of the match using parentheses.

  • Anchors: Use ^ for start, $ for end, and \b for word boundaries.

  • Wildcards and Quantifiers: Use . for any character and quantifiers like *, +, {n} to control match repetitions.

  • Character Classes: Use \d, \w, \s for matching digits, words, and whitespace characters.

  • Flags: Use re.IGNORECASE for case-insensitive matching.

The re module is a versatile tool that, when used effectively, can significantly simplify string searching and manipulation in Python.

Last updated