# 👁‍🗨 Regular Expressions

regex, python, text, match, analysis, pattern

If you decide to use a regex to solve your problem, you now have two problems!

# 🥉 Vocabulary

  • Qualifiers, Meta-characters, Meta-classes
  • RE works on characters
    • [0-9] valid
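    • [0-255] is not a number range: it is the range 0-2 plus the literal 5, so it matches one character out of 0, 1, 2 or 5
  • (? is an extension prefix: the character that follows it decides the meaning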
| Pattern | Description |
|---|---|
| `(?:...)` | Non-capturing group: groups without returning the matched text |
| `(?=...)` | Positive lookahead assertion |
| `(?!...)` | Negative lookahead assertion |
| `(?<=...)` | Positive lookbehind assertion |
| `(?<!...)` | Negative lookbehind assertion |
| `(?P<name>...)` | Named group |
| `(?P=name)` | Backreference to a named group |
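
A few of the points above can be checked directly in Python's re module. This is a minimal sketch; the sample strings and variable names are made up for illustration.

```python
import re

# [0-255] is the range 0-2 plus the literal 5: one character, not a number range
class_hits = re.findall(r"[0-255]", "0123456789")
print(class_hits)   # ['0', '1', '2', '5']

# (?P<word>...) captures under a name; (?P=word) must repeat the captured text
doubled = re.compile(r"(?P<word>\w+) (?P=word)")
repeated = doubled.search("it was was fine").group("word")
print(repeated)     # was

# (?:...) groups without capturing, so only the second group is returned
captured = re.search(r"(?:ab)+(c)", "ababc").groups()
print(captured)     # ('c',)
```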

# 🥇 RE Repetition Qualifiers

. In the default mode, this matches any character except a newline.

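  • Non-greedy variants are formed by appending ? to a greedy qualifier

| Greedy qualifier | Non-greedy variant |
|---|---|
| `ab*` | `ab*?` |
| `ab+` | `ab+?` |
| `ab?` | `ab??` |
| `a{3,5}` | `a{3,5}?` |

An exact count such as `a{6}` has no non-greedy variant worth using, because there is only one length it can match.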
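
The difference between the two columns is easiest to see on a string where both variants can match. A small sketch, with an invented sample string:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

greedy = re.findall(r"<.*>", html)    # * takes as much as it can
lazy = re.findall(r"<.*?>", html)     # *? takes as little as it can

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']

# Bounded repetition behaves the same way
most = re.match(r"a{3,5}", "aaaaaa").group()
least = re.match(r"a{3,5}?", "aaaaaa").group()
print(most, least)   # aaaaa aaa
```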

Common regex patterns

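| Regex | Repetitions | Meaning |
|---|---|---|
| `(.*)` | 0..N | Any character (except a newline), matched zero or more times |
| `(.+)` | 1..N | Any character (except a newline), matched one or more times |
| `(.?)` | 0..1 | Any character (except a newline), matched zero or one time |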

# 👶 Qualifiers

The question mark character ? matches the preceding element either zero times or once; you can think of it as marking something as optional. For example, home-?brew matches either 'homebrew' or 'home-brew'.
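
The home-?brew example can be tried out as follows. This is only a sketch; the third test string is added here to show a case that fails.

```python
import re

pattern = re.compile(r"home-?brew")

# fullmatch requires the whole string to fit the pattern
results = {text: bool(pattern.fullmatch(text))
           for text in ["homebrew", "home-brew", "home--brew"]}
print(results)
# {'homebrew': True, 'home-brew': True, 'home--brew': False}
```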

# 👣 Meta-characters and Meta-character Classes

Remember these in pairs: most of them come with a counterpart, such as ^ and $, or \w and \W.

Meta-characters

  • ., ?, *, +
  • ^, $
  • [...], (...), {...}
  • (?:...), (?=...), (?!...), (?<=...), (?<!...), (?P<name>...), (?P=name)

Meta-character Classes

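  • \w, \W (word characters, non-word characters)
  • \d, \D (digits, non-digits)
  • \A, \Z (start of the string, end of the string; note that \a is the ASCII bell character, not the counterpart of \A)
  • \s, \S (whitespace, non-whitespace)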
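
The pairing of a lowercase class with its uppercase complement can be seen on a short string. A minimal sketch, with an invented sample:

```python
import re

sample = "Room 42, floor_3!"

digits = re.findall(r"\d", sample)
non_digits = re.findall(r"\D+", sample)
words = re.findall(r"\w+", sample)
spaces = re.findall(r"\s", sample)

print(digits)       # ['4', '2', '3']
print(non_digits)   # ['Room ', ', floor_', '!']
print(words)        # ['Room', '42', 'floor_3']
print(spaces)       # [' ', ' ']
```

Every character of the sample is matched by exactly one class of each pair, which is why \d and \D together cover the whole string.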

# 4️⃣ IP pattern

Let's start with what we already know about the rules for a valid IPv4 address:

  1. It has 4 octets separated by dots
  2. Each octet has a value between 0 and 255
  3. The boundary values are 0.0.0.0 and 255.255.255.255
"""
^                       # start of the string
(?:[0-9]{1,3}\.)        # non-capturing group: 1, 2 or 3 digits (0-9)
                        #     followed by a literal .
{3}                     # repeat the previous group exactly 3 times
[0-9]{1,3}              # fourth octet: 1 to 3 digits, no trailing .
$                       # end of the string
                        # note: this checks the shape only, so 999.999.999.999 also matches
"""
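# Using a named group: (?P=octet) would repeat the text the group captured, not its pattern
# (1.2.3.3 would match but 1.2.3.4 would not), so the fourth octet is written out again
^(?:(?P<octet>[0-9]{1,3})\.){3}[0-9]{1,3}$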
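
The patterns above only check for four dot-separated groups of digits, so rule 2 (each octet between 0 and 255) is not enforced. One possible way to enforce it is sketched below. The names OCTET, IPV4 and is_ipv4 are hypothetical, and rejecting leading zeros such as 01 is an assumption of this sketch, not a rule stated above.

```python
import re

# One octet restricted to 0-255, spelled out as digit alternatives
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"

# Three "octet." groups followed by a final octet
IPV4 = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")

def is_ipv4(text: str) -> bool:
    """Return True if text is a dotted quad with every octet in 0-255."""
    return IPV4.fullmatch(text) is not None

for candidate in ["0.0.0.0", "255.255.255.255", "192.168.1.10",
                  "256.1.1.1", "999.999.999.999", "1.2.3"]:
    print(candidate, is_ipv4(candidate))
```

The first three candidates are accepted and the last three are rejected.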

# 👗 Regex Assertions

Assertions decide whether a match is returned by looking at the text after the current position (lookahead) or before it (lookbehind). The text examined by the assertion is not part of the returned match.

# ⏩ Positive Look ahead Assertion

Consider the case where you want to match only Isaac Asimov and not Isaac Newton.

| Blob | Pattern match | Return |
|---|---|---|
| Isaac Asimov | ✔️ | Isaac |
| Isaac Newton | | |

Pattern: `Isaac (?=Asimov)`
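
A short sketch of the positive lookahead in Python. The space in the pattern sits outside the assertion, so the returned text ends with that space.

```python
import re

pattern = re.compile(r"Isaac (?=Asimov)")

lookahead_hits = {}
for blob in ["Isaac Asimov", "Isaac Newton"]:
    m = pattern.search(blob)
    lookahead_hits[blob] = m.group() if m else None

print(lookahead_hits)
# {'Isaac Asimov': 'Isaac ', 'Isaac Newton': None}
```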

# ⏪ Negative Look ahead Assertion

Now reverse the situation: we want every Isaac that is not followed by Asimov, so this time we want Isaac from Isaac Newton.

| Blob | Pattern match | Return |
|---|---|---|
| Isaac Asimov | | |
| Isaac Newton | ✔️ | Isaac |

Pattern: `Isaac (?!Asimov)`
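
The negative lookahead can be sketched on a single string that contains both names. The sentence is invented for illustration.

```python
import re

text = "Isaac Asimov wrote fiction; Isaac Newton wrote physics."
not_asimov = re.compile(r"Isaac (?!Asimov)")

# For each match, record the six characters that follow it
followers = [text[m.end():m.end() + 6] for m in not_asimov.finditer(text)]
print(followers)   # ['Newton']
```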

# ⏭ Positive Look behind Assertion

| Blob | Pattern match | Return |
|---|---|---|
| Avi Mehenwal | ☑️ | Mehenwal |
| Shubhranshu Mehenwal | | |

Suppose we want Mehenwal only from Avi Mehenwal and not from Shubhranshu Mehenwal.

(?<=Avi) Mehenwal
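
The positive lookbehind can be tried out as sketched below. The second half shows a restriction of Python's re module: a lookbehind must have a fixed width, so alternatives of different lengths are rejected when the pattern is compiled.

```python
import re

pattern = re.compile(r"(?<=Avi) Mehenwal")

lookbehind_hits = {blob: bool(pattern.search(blob))
                   for blob in ["Avi Mehenwal", "Shubhranshu Mehenwal"]}
print(lookbehind_hits)
# {'Avi Mehenwal': True, 'Shubhranshu Mehenwal': False}

# Variable-width lookbehind is not supported by the re module
try:
    re.compile(r"(?<=Avi|Shubhranshu) Mehenwal")
    fixed_width_required = False
except re.error:
    fixed_width_required = True
print(fixed_width_required)   # True
```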

# ⏮ Negative Look behind Assertion

| Blob | Pattern match | Return |
|---|---|---|
| Avi Mehenwal | | |
| Shubhranshu Mehenwal | ☑️ | Mehenwal |

Now let's reverse the situation: we want every Mehenwal that is not preceded by Avi, so we want Mehenwal from Shubhranshu Mehenwal.

(?<!Avi) Mehenwal
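
A sketch of the negative lookbehind on a two-line blob. The variable names are illustrative only.

```python
import re

blob = "Avi Mehenwal\nShubhranshu Mehenwal"
pattern = re.compile(r"(?<!Avi) Mehenwal")

# Collect the first name in front of every match
first_names = [blob[:m.start()].splitlines()[-1] for m in pattern.finditer(blob)]
print(first_names)   # ['Shubhranshu']
```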

# 🌹 Grep

grep - Global Regular Expression Print

egrep - Extended regular expressions (equivalent to grep -E)

Extended regular expressions include all of the basic meta-characters along with additional meta-characters to express more complex matches.

egrep -c '^begin|end$' myfile.txt
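
The command above counts the lines that start with begin or end with end, because the alternation splits the pattern into ^begin and end$. A rough Python equivalent is sketched below. The sample lines are invented, since the contents of myfile.txt are not given.

```python
import re

# Stand-in for the lines of myfile.txt
lines = ["begin here", "the end", "nothing to see", "begin and end"]

pattern = re.compile(r"^begin|end$")

# Like grep -c: count matching lines, not individual matches
matching_lines = sum(1 for line in lines if pattern.search(line))
print(matching_lines)   # 3
```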

Python-style lookahead and lookbehind also work on the shell: grep needs -P (Perl-compatible regular expressions) and rg needs --pcre2.

echo "Nate or nate" | grep -P '(?<=N)a'

# 🏵 Resources

*[RE]: Regular Expressions | regex
