PowerShell Searching With Regular Expressions

Regular expressions provide a powerful way to search text for content that follows a particular pattern. For example, we can define a regular expression that matches a phone number, an email address or a Uniform Resource Locator (URL). A pattern can match literal text, and it can also use special characters that describe classes of characters, repetition and position within the text.

Some of these special characters include:

  • \w: Matches a word character: a letter, digit or underscore.
  • \W: The opposite of \w; matches a non-word character such as a space or punctuation mark.
  • \d: Matches any digit from 0-9.
  • \D: Matches any character that is not a digit.
  • \s: Matches a whitespace character such as a space, tab or carriage return.
  • \S: The opposite of \s; matches any character that is not whitespace.
  • .: Matches any single character (except a newline, by default).
  • *: Matches zero or more of the preceding character or group.
  • +: Matches one or more of the preceding character or group.
  • ?: Matches zero or one of the preceding character or group.
  • [a-z]: Matches any single character in the range a-z.
  • [0-9]: Matches any single digit in the range 0-9.
  • {N}: Matches exactly N occurrences of the preceding pattern. For example, the pattern \d{2}- requires exactly two digits followed by a hyphen.
  • {N,M}: Matches between N and M occurrences of the preceding pattern.
  • ^: Matches at the beginning of a string.
  • $: Matches at the end of a string.
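As a rough illustration of how these special characters combine, the sketch below tests a few strings with the -match operator described in the next section. The sample values and variable names are hypothetical and chosen only for this example.

```powershell
# Hypothetical sample values, chosen only to illustrate the characters listed above.
$phone = '555-0142'
$word  = 'user_01'

# ^\d{3}-\d{4}$ : exactly three digits, a hyphen, then four digits, and nothing else.
$phoneOk = $phone -match '^\d{3}-\d{4}$'

# ^\w+$ : one or more letters, digits or underscores from start to end.
$wordOk = $word -match '^\w+$'

# ^\S+$ : only non-whitespace characters, so a string containing a space fails.
$noSpaces = 'two words' -match '^\S+$'

# ^[a-z]{2,4}$ : between two and four characters in the range a-z.
$rangeOk = 'abc' -match '^[a-z]{2,4}$'

$phoneOk, $wordOk, $noSpaces, $rangeOk
```

Anchoring each pattern with ^ and $ makes the test apply to the whole string. Without the anchors, a match anywhere inside the string would be enough.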

PowerShell: Match Comparison Operator

The PowerShell -match comparison operator tests whether a string contains a pattern. It returns a True or False value.

"He was born 05/01/1990" -match "\d{2}/\d{2}/\d{4}"

The above expression returns True because the string contains the date 05/01/1990.
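When -match succeeds against a single string, PowerShell also fills the automatic $Matches variable with the matched text. The sketch below adds capture groups to the date pattern to pull out the individual parts. The parentheses and variable names are additions for illustration and are not part of the original example.

```powershell
$text = "He was born 05/01/1990"

# Parentheses create numbered capture groups around each part of the date.
if ($text -match '(\d{2})/(\d{2})/(\d{4})') {
    $whole = $Matches[0]   # the full matched text
    $year  = $Matches[3]   # the third capture group
}

"Matched '$whole', year $year"
```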

"His name is John" -match "john"

The above returns True because the -match operator performs a case-insensitive search. If we wish to perform a case-sensitive search, we have to use a different operator.

PowerShell: CMatch Comparison Operator

The PowerShell -cmatch comparison operator is similar to the -match operator but performs a case-sensitive search.

"His name is John" -cmatch "john"

The above expression evaluates to False, while the expression below evaluates to True.

"His name is John" -cmatch "John"
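Both operators can also be applied to a collection, in which case they return the matching elements rather than True or False. The sketch below uses a hypothetical list of names to show how the two operators differ.

```powershell
# Hypothetical list of names, used only to contrast the two operators.
$names = 'John', 'john', 'JOHN', 'Julia'

# -match ignores case, so all three spellings of "john" are returned.
$insensitive = @($names -match '^john$')

# -cmatch is case sensitive, so only the exact spelling "John" is returned.
$sensitive = @($names -cmatch '^John$')

"-match found $($insensitive.Count) names, -cmatch found $($sensitive.Count)"
```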

PowerShell: Select-String Cmdlet

The Select-String cmdlet allows us to search one or more files for text that matches a pattern. For example, consider searching for repeated words in a text file that contains:

The the book is very heavy. Julia must carry it every day on her way to school.

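We will use the Select-String cmdlet to find the repeated words in the file. In the pattern below, (\w+) captures a word, and \s+\1\b matches whitespace followed by that same word again: \1 refers back to the first captured group and \b marks a word boundary.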

Select-String -Pattern "\b(\w+)(\s+\1\b)+" text.txt

We could extend this to find repeated words in all text files in the current directory and its subdirectories.

Get-ChildItem -Filter *.txt -Recurse | Select-String -Pattern "\b(\w+)(\s+\1\b)+" | Format-Table Filename,LineNumber,Line -Wrap

The above command outputs a table listing the file name, line number and text of each line that contains a repeated word.

Filename                                   LineNumber Line                     
--------                                   ---------- ----                     
text.txt                                            1 The    the book is very h
                                                     eavy. Julia must carry it
                                                      every day on her way to
                                                     school.
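Select-String returns objects that carry the regular expression match details as well as the line text, so the repeated word itself can be extracted. The following is a minimal sketch that writes the sample sentence to a temporary file first. The file name and variable names are hypothetical and used only for this illustration.

```powershell
# Write the sample sentence to a temporary file (hypothetical file name, for illustration only).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'repeated-words-demo.txt'
Set-Content -Path $path -Value 'The the book is very heavy. Julia must carry it every day on her way to school.'

# Each result describes one matching line and exposes the underlying regex matches.
$hits = Select-String -Path $path -Pattern '\b(\w+)(\s+\1\b)+'

foreach ($hit in $hits) {
    # Groups[1] is the first capture group, that is, the word that is repeated.
    $word = $hit.Matches[0].Groups[1].Value
    "{0}:{1} repeated word '{2}'" -f $hit.Filename, $hit.LineNumber, $word
}

# Clean up the temporary file.
Remove-Item -Path $path
```

By default the Matches property holds only the first match on each line. Adding the -AllMatches switch to Select-String collects every match on the line.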

Conclusion

In this short tutorial we looked at how we can use PowerShell to search text and files with regular expressions.