Search notes:

VBA: Regular expressions

In VBA, regular expressions need the reference Microsoft VBScript Regular Expressions 5.5. (whose GUID is {3F4DACA7-160D-11D2-A8E9-00104B365C9F}).
This GUID might be conveniently added to a VBA project using the VB Editor Object Model. The following snippet works in Excel:
thisWorkbook.VBProject.references.addFromGuid _
        GUID   :="{3F4DACA7-160D-11D2-A8E9-00104B365C9F}", _
        major  :=  5,                                      _
        minor  :=  5

Objects

RegExp

Methods and properties of the RegExp objects are:
Execute(src) Returns an IMatchCollection2 object (see MatchCollection below)
Global A boolean
IgnoreCase A boolean
MultiLine A boolean
Pattern The regular expression (string)
Replace(src, repl) Returns a string
Test(src) Returns a boolean

MatchCollection

MatchCollection is a collection of IMatch2 objects/interfaces (see Match below).
A MatchCollection object can be obtained by RegExp.Excecute().
Count The number of matches captured by this object. Can be greater then 1 if RegExp.global is set to true.
Item(n) The nth item in the collection (first item has index 0)

Match

FirstIndex The position of the first character that was matched by the corresponding part of the regex pattern.
Length The length of the text that was matched by the regex pattern (i. e. the length of Value)
SubMatches An ISubmatches object/interface (See SubMatches) below
Value string. Default value

SubMatches

SubMatches are used for the portion captured with paranthesis in the regular expression.
Count Corresponds to the number of capturing parantheses in the regular expression.
Item The string that was matched by the parantheses.

Split

' runVBAFilesInOffice.vbs -word %CD%\split -c main

sub main() ' {

' http://stackoverflow.com/a/35254567/180275

  dim text   as string
  dim nums() as string

  text = "twenty-two22fourty-nine49six6eleven11five hundred500"

  nums = regexpSplit(text, "\d+")

  selection.font.name = "Courier New"

  for each num in nums
      selection.typeText ">" & num & "<" & vbCrLf
  next num

  activeDocument.saved = true
 
end sub ' }

private function regexpSplit(text as string, pattern as string) as string() ' {

  dim text_0 as string
  dim re     as new regExp

  re.pattern   = pattern
  re.global    = true
  re.multiLine = true

  text_0 = re.replace(text, vbNullChar)

  regexpSplit = strings.split(text_0, vbNullChar)

end function ' }
Github repository about-VBA, path: /regular-expressions/split.bas

Replace SQL comments

' c:\lib\runVBAFilesInOffice\runVBAFilesInOffice.vbs -excel  %CD%\replace-sql-comments -c main
sub main()

  dim sqlText as     string
  dim re      as new regExp

  sqlText =           "select "                      & chr(10) & chr(13)
  sqlText = sqlText & "  *    -- This is a comment " & chr(10) 
  sqlText = sqlText & "from   -- another comment"    & chr(10) & chr(13)
  sqlText = sqlText & "  table-- last comment"

  re.global    = true
  re.multiLine = true
  re.pattern   = "--.*$"

  columns(2).columnWidth = 100
  columns(2).font.name = "Courier New"

  cells(2,2).value = sqlText

  sqlText = re.replace(sqlText, "")

  cells(4,2).value = sqlText


  activeWorkbook.saved = true

end sub
Github repository about-VBA, path: /regular-expressions/replace-sql-comments.bas
Compare with the removeSQLComments function in the SQL VBA module.

Excel

The following example tries to show how regular expressions might be used in Excel to fill cell values with the result of a regexp pattern that is applied to other cells.
option explicit

sub main() ' {

    fillTestData

    range(cells(1,6), cells(4,6)).formulaR1C1 = "=extractNumber(rc[-5])"

end sub ' }

sub fillTestData() ' {

    cells(1, 1) = "foo bar baz"
    cells(2, 1) = "bla 42 more bla"
    cells(3, 1) = "number 37 another number 99 xyz"
    cells(4, 1) = "line one" + vbCrLf + "line two" + vbCrLf + "line three (3)" + vbCrLf + "line four"

end sub ' }

public function extractNumber(cell as range) as string ' {

    dim re as new regExp
    re.pattern   ="\d+"
  ' re.global    = true
  ' re.multiLine = true

    dim mtcColl as matchCollection

    set mtcColl = re.execute(cell)

    dim mtc as match
    set mtc = mtcColl(0)

    extractNumber = mtc.value

end function ' }
Github repository about-VBA, path: /regular-expressions/excel.bas

Extract nth match

By setting the global flag to true, it's possible to access the nth match of pattern within a text:
option explicit

sub main() ' {

    dim txt as string
    txt = "ninety-nine: 99, fourty-two: 42, eleven: 11"

    debug.print extractNthNumber(txt, 1) ' 99
    debug.print extractNthNumber(txt, 2) ' 42
    debug.print extractNthNumber(txt, 3) ' 11

end sub ' }

private function extractNthNumber(txt as string, n as integer) as integer ' {

    dim re as new regExp
    re.pattern   ="\d+"
    re.global    = true
  ' re.multiLine = true

    dim mtcColl as matchCollection

    set mtcColl = re.execute(txt)

    dim mtc as match
    set mtc = mtcColl(n - 1)

    extractNthNumber = mtc.value

end function ' }
Github repository about-VBA, path: /regular-expressions/extract-nth-match.bas

See also

Using parenthesis to extract submatches.
VBA's like operator.
VBA language
The .NET System.Text.RegularExpressions namespace
Find Excel cells matching a given regular expression or contain whitespace characters only.
Regular expressions

Index