How to use Regular Expressions (Regex) in Swift and iOS

Regular expressions are a powerful tool. They let you find matches, extract text, or replace strings in a given input. Like many developers, I've had my fair share of fights with them, which eventually cost me both confidence and productivity.

Every time I need regular expressions, I cannot immediately come up with the code to accomplish the task. I am writing this post to share what I know about regular expressions and how to use them on iOS using Swift.

There are three basic use cases for regular expressions:

  1. Checking the existence of a given expression in the input text
  2. Finding all the matches of the given expression in the input text
  3. Replacing instances of matching expressions in the input text

Let's cover them one by one in the following sections.

Checking the existence of a given expression in the input text

This is the basic case for regex matching. Given a regex pattern, we want a checker that returns a Boolean indicating whether the input text contains at least one match for that pattern. We will use a couple of examples in each category to demonstrate how regex works.

For the first example, our pattern and the input string look like this:

Regex: (#\\/{0,1}\\d{1,}#\\*{0,2})

Input text: We are big now #1#**lot of sales#/1#* the money and cards #2#Robert Langdon and Ambra Vidal#/2#**.

Please note that every backslash in the pattern is doubled. In a regular Swift string literal the backslash is itself an escape character, so we write \\d to hand \d to the regex engine. A single backslash here would fail to compile with an invalid escape sequence error.
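If you find the doubled backslashes hard to read, raw string literals (available since Swift 5) are one way around them. The sketch below only shows that both spellings produce the same string; the constant names are mine and not part of the original code.

```swift
import Foundation

// The same pattern written two ways.
// 1. Regular literal: every backslash has to be doubled.
let escapedPattern = "(#\\/{0,1}\\d{1,}#\\*{0,2})"

// 2. Raw literal (#"..."#): backslashes are taken literally.
let rawPattern = #"(#\/{0,1}\d{1,}#\*{0,2})"#

print(escapedPattern == rawPattern) // true
print(rawPattern)                   // (#\/{0,1}\d{1,}#\*{0,2})
```

Either form can be passed to NSRegularExpression(pattern:). The rest of this post sticks to the doubled-backslash form.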

Before checking whether any matches exist, let's see what the regex says. Translated into plain language, it reads:

#[zero_or_one_forward_slash][one_or_more_digits]#[zero_one_or_two_stars]

Examples of matching strings are:

  • #/100#*
  • #1#
  • #2000#**
  • #/3#*
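To double-check the examples above, here is a small sketch that tests whether the pattern covers a candidate string from start to end. The helper name matchesEntirely and the two non-matching candidates are my own additions for illustration.

```swift
import Foundation

// Hypothetical helper: true only when the first match spans the whole candidate.
func matchesEntirely(_ pattern: String, _ candidate: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return false
    }
    let fullRange = NSRange(candidate.startIndex..., in: candidate)
    guard let match = regex.firstMatch(in: candidate, range: fullRange) else {
        return false
    }
    return NSEqualRanges(match.range, fullRange)
}

let pattern = "(#\\/{0,1}\\d{1,}#\\*{0,2})"

// The four examples from the list, plus two strings that should not match.
let candidates = ["#/100#*", "#1#", "#2000#**", "#/3#*", "#abc#", "1#"]
for candidate in candidates {
    print(candidate, matchesEntirely(pattern, candidate))
}
```

The first four candidates print true. The last two print false: one has letters where digits are required, and the other is missing the leading #.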

Finally, let's write a function that returns a Boolean indicating whether there is a match:


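import Foundation

func doesMatchExist(regularExpression: String, inputText: String) -> Bool {
    // An invalid pattern is treated the same as "no match"
    guard let regex = try? NSRegularExpression(pattern: regularExpression) else {
        return false
    }
    // Search the whole input; NSRange is expressed in UTF-16 code units
    let fullRange = NSRange(inputText.startIndex..., in: inputText)
    return regex.firstMatch(in: inputText, range: fullRange) != nil
}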

This function takes two inputs: the regular expression and the input text. It returns true if the input contains at least one match for the expression, and false otherwise. Note that it also returns false when the pattern itself is invalid.
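Because doesMatchExist reports an invalid pattern as false, a typo in the regex looks exactly like "no match". If you would rather tell the two apart, one option is a throwing variant. This is only a sketch, and the name doesMatchExistOrThrow is hypothetical.

```swift
import Foundation

// Hypothetical variant of doesMatchExist that propagates pattern errors
// instead of folding them into "false".
func doesMatchExistOrThrow(regularExpression: String, inputText: String) throws -> Bool {
    let regex = try NSRegularExpression(pattern: regularExpression)
    let fullRange = NSRange(inputText.startIndex..., in: inputText)
    return regex.firstMatch(in: inputText, range: fullRange) != nil
}

do {
    // "(" opens a group that is never closed, so the pattern fails to compile.
    _ = try doesMatchExistOrThrow(regularExpression: "(", inputText: "anything")
} catch {
    print("Invalid pattern: \(error.localizedDescription)")
}
```

Whether to throw or to return false is a design choice. The non-throwing version is more convenient when the patterns are hard-coded and known to be valid.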

Now that our function and inputs are ready, let's call this utility and check the output.


let regex = "(#\\/{0,1}\\d{1,}#\\*{0,2})"
let inputText = "We are big now #1#**lot of sales#/1#* the money and cards #2#Robert Langdon and Ambra Vidal#/2#**."

let result = doesMatchExist(regularExpression: regex, inputText: inputText)
print(result) // true

Let's run it on another example. In this example, we want to check if the input URL adheres to the given pattern.

Regex: \\W+offset=\\d+&limit=\\d+

Input text: https://pokeapi.co/api/v2/pokemon-species?offset=2&limit=2


let regex = "\\W+offset=\\d+&limit=\\d+"
let inputText = "https://pokeapi.co/api/v2/pokemon-species?offset=2&limit=2"

let result = doesMatchExist(regularExpression: regex, inputText: inputText)
print(result) // true

Finding all the matches of the given expression in the input text

In the next part, we are going to see how to use Regular expressions in Swift to find all the strings matching the pattern in the given input string.

To extract parts of a match, we enclose the sub-expressions we want in round brackets. Each bracketed sub-expression is called a capture group, and the regex engine reports a separate range for each one.
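Before looking at the full utility, here is a minimal sketch of how capture groups show up in a single match. Index 0 is always the whole match, and the groups follow in the order of their opening brackets. The shortened input and the pieces array are assumptions made for this illustration.

```swift
import Foundation

let text = "offset=100&limit=40"
let regex = try! NSRegularExpression(pattern: "offset=(\\d+)&limit=(\\d+)")

var pieces: [String] = []
if let match = regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)) {
    // numberOfRanges is 1 (whole match) + the number of capture groups
    for index in 0..<match.numberOfRanges {
        if let range = Range(match.range(at: index), in: text) {
            pieces.append(String(text[range]))
        }
    }
}

print(pieces) // ["offset=100&limit=40", "100", "40"]
```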


func getMatches(regex: String, inputText: String) -> [String] {

    guard let regex = try? NSRegularExpression(pattern: regex) else {
        return []
    }
    let results = regex.matches(in: inputText,
                            range: NSRange(inputText.startIndex..., in: inputText))

    let finalResult = results.map { match in

        return (0..<match.numberOfRanges).map { range -> String in

            let rangeBounds = match.range(at: range)
            guard let range = Range(rangeBounds, in: inputText) else {
                return ""
            }
            return String(inputText[range])
        }
    }.filter { !$0.isEmpty }

    var allMatches: [String] = []

    // Iterate over the final result which includes all the matches and groups
    // We will store all the matching strings
    for result in finalResult {
        for (index, resultText) in result.enumerated() {

            // Skip the match. Go to the next elements which represent matching groups
            if index == 0 {
                continue
            }
            allMatches.append(resultText)
        }
    }

    return allMatches
}

The function above accepts two parameters: the regex string and the input text. We call the matches API on the NSRegularExpression object to get the results array. Each element of this array is an NSTextCheckingResult, which holds the range of the whole match along with the range of each capture group.

We iterate over the results and, using the ranges associated with each result, fetch the corresponding substrings from the original text and store them in finalResult.

finalResult is an array of arrays. Each subarray stores the whole match as its first entry, followed by that match's capture groups. In the end, we are only interested in the capture groups.

So we iterate over finalResult, skip the first entry of every subarray, and return the remaining entries to the caller as a flat array of strings. Note that a pattern without round brackets therefore yields an empty array.

Now that the utility function is clear, let's try it on the regex and input text from the preceding examples.


// This regex describes a substring that
//   1. Starts with the character #
//   2. Has zero or one occurrence of the character /
//   3. Followed by at least one digit
//   4. Followed by a single character #
//   5. Followed by zero to two occurrences of the character *

var regex = "(#\\/{0,1}\\d{1,}#\\*{0,2})"

var inputText = "We are big now #1#**lot of sales#/1#* the money and cards #2#Robert Langdon and Ambra Vidal#/2#**."

// Outputs
/*
▿ 4 elements
  - 0 : "#1#**"
  - 1 : "#/1#*"
  - 2 : "#2#"
  - 3 : "#/2#**"
*/

let allMatches = getMatches(regex: regex, inputText: inputText)

// This regex describes a substring that
//   1. Starts with one or more non-word characters (\W), such as the ? before offset
//   2. Continues with the pattern offset=[digits]&limit=[digits]
//   3. Captures the digits of the offset and limit values in two groups so they can be extracted from the input URL

regex = "\\W+offset=(\\d+)&limit=(\\d+)"

inputText = "https://pokeapi.co/api/v2/pokemon-species?offset=100&limit=40"

// Outputs
/*
▿ 2 elements
  - 0 : "100" // Offset
  - 1 : "40"  // Limit
  Where the first element represents the offset and the second one represents the limit in the passed URL
*/

let allMatches1 = getMatches(regex: regex, inputText: inputText)
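Relying on the position of each group works, but it breaks quietly if someone reorders the pattern. As an alternative, the regex engine behind NSRegularExpression supports named capture groups, and NSTextCheckingResult exposes range(withName:) on iOS 11 and later. The sketch below is not part of the original utility; the group names and the values dictionary are my own choices.

```swift
import Foundation

let url = "https://pokeapi.co/api/v2/pokemon-species?offset=100&limit=40"

// (?<name>...) gives a capture group a name in addition to its index.
let namedRegex = try! NSRegularExpression(pattern: "offset=(?<offset>\\d+)&limit=(?<limit>\\d+)")

var values: [String: String] = [:]
if let match = namedRegex.firstMatch(in: url, range: NSRange(url.startIndex..., in: url)) {
    for name in ["offset", "limit"] {
        // For a name that is not in the pattern, the returned range has
        // location NSNotFound, so Range(_:in:) yields nil and we skip it.
        if let range = Range(match.range(withName: name), in: url) {
            values[name] = String(url[range])
        }
    }
}

print(values["offset"] ?? "-", values["limit"] ?? "-") // 100 40
```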

Replacing instances of matching expressions in the input text

In the final section of this post, we will see how to replace the instances of matching expressions in the input text using Regex APIs.

Replacing the matching instances is a bit more complicated than the first two tasks. Every replacement changes the string, so ranges that were computed on the original string may no longer line up with the modified one.

To perform the replacement of matching groups, we will write a function that takes three parameters as input:

  1. regex - Regular expression to match against
  2. inputText - Input text to operate on
  3. replacementStringClosure - A closure the caller can customize to return the replacement for a match. It receives the matching group from the input text as a string and returns the string that should replace it, or nil to leave the group unchanged

import Foundation

func stringAfterReplacingMatches(regex: String, inputText: String, replacementStringClosure: (String) -> String?) -> String {
    guard let regex = try? NSRegularExpression(pattern: regex) else {
        return inputText
    }

    let results = regex.matches(in: inputText, range: NSRange(inputText.startIndex..., in: inputText))

    var outputText = inputText

    // Walk matches and groups back to front so the ranges still to be processed stay valid
    results.reversed().forEach { match in

        (1..<match.numberOfRanges).reversed().forEach { rangeIndex in
            let rangeBounds = match.range(at: rangeIndex)

            // Skip groups that did not take part in the match (location is NSNotFound).
            // The range is resolved against both strings so that indices are never
            // shared between two different String values.
            guard let inputRange = Range(rangeBounds, in: inputText),
                  let outputRange = Range(rangeBounds, in: outputText) else {
                return
            }
            let matchingGroup = String(inputText[inputRange])
            let replacement = replacementStringClosure(matchingGroup) ?? matchingGroup

            outputText.replaceSubrange(outputRange, with: replacement)
        }
    }
    return outputText
}

In the above function, we first find all the matches of the given regular expression in the input text. Each value in the results array is of type NSTextCheckingResult. Its range at index 0 covers the whole match. We ignore it and work with the capture groups instead, which is why we start at index 1 rather than 0.

Using the range of each capture group, we extract the matching text and pass it to replacementStringClosure. The caller decides there what should replace that text. If the closure returns nil, the group is left unchanged.

We store the intermediate results into outputText and return it at the end once all the replacements are carried out.

💡
Please note that we always iterate in reverse order. Each replacement can change the length of the string. When we start from the last match, a replacement only affects text that comes after the ranges we have yet to process, so those ranges stay valid. Had we started from the beginning, each replacement would shift the text that follows it, invalidating the ranges of the subsequent matches.

Now that our replacement function is ready, let's run it on a sample regex and an input string:
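To see the effect of iteration order in isolation, here is a small experiment. It applies the same ranges, computed once on the original text, first back to front and then front to back. The helper replaceAll and the sample text are made up for this demonstration and are not part of the utility above.

```swift
import Foundation

let text = "a1b22c333"
let digits = try! NSRegularExpression(pattern: "\\d+")
let matches = digits.matches(in: text, range: NSRange(text.startIndex..., in: text))

// Replaces every match range with "<n>", visiting the matches in the given order.
// The ranges are always the ones computed on the original text.
func replaceAll(_ matches: [NSTextCheckingResult], in text: String) -> String {
    let buffer = NSMutableString(string: text)
    for match in matches {
        buffer.replaceCharacters(in: match.range, with: "<n>")
    }
    return String(buffer)
}

let backToFront = replaceAll(Array(matches.reversed()), in: text)
let frontToBack = replaceAll(matches, in: text)

print(backToFront) // a<n>b<n>c<n>
print(frontToBack) // a<n<n><n>333
```

Going front to back, the first replacement makes the string two characters longer, so the second and third ranges point at the wrong characters. With other inputs the stale ranges can even fall outside the string and crash.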


let regex = "(#\\/{0,1}\\d{1,}#\\*{0,2})"
let inputText = "We are big now #1#**lot of sales#/1#* the money and cards #2#Robert Langdon and Ambra Vidal#/2#**."

let updatedInputText = stringAfterReplacingMatches(regex: regex, inputText: inputText) { originalMatchingGroup in
    return ""
}

// prints
// "We are big now lot of sales the money and cards Robert Langdon and Ambra Vidal."

print(updatedInputText)

In the above example, we replace every match with an empty string, so none of the matching strings appear in the final output.


let regex1 = "\\W+offset=(\\d+)&limit=(\\d+)"
let inputText1 = "?offset=400&limit=20"

let anotherUpdatedInputText = stringAfterReplacingMatches(regex: regex1, inputText: inputText1) { originalMatchingGroup in
    return "Awesome" + originalMatchingGroup
}

// prints
// "?offset=Awesome400&limit=Awesome20"

print(anotherUpdatedInputText)

In this example, we prepend "Awesome" to each capture group, which produces the output shown above.
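When the replacement does not need custom logic, Foundation's own stringByReplacingMatches API may be enough. It replaces the whole match (not the individual groups) with a template, in which $1, $2 and so on refer to the capture groups. The sketch below reproduces both outputs from this section; the variable names are mine.

```swift
import Foundation

// Case 1: drop every match by using an empty template.
let tagRegex = try! NSRegularExpression(pattern: "(#\\/{0,1}\\d{1,}#\\*{0,2})")
let tagged = "We are big now #1#**lot of sales#/1#* the money and cards #2#Robert Langdon and Ambra Vidal#/2#**."
let stripped = tagRegex.stringByReplacingMatches(
    in: tagged,
    options: [],
    range: NSRange(tagged.startIndex..., in: tagged),
    withTemplate: "")
print(stripped)
// We are big now lot of sales the money and cards Robert Langdon and Ambra Vidal.

// Case 2: the template rebuilds the whole match and reuses the groups.
// The literal "?" is hard-coded here because the template replaces
// everything the pattern matched, including what \W+ consumed.
let queryRegex = try! NSRegularExpression(pattern: "\\W+offset=(\\d+)&limit=(\\d+)")
let query = "?offset=400&limit=20"
let decorated = queryRegex.stringByReplacingMatches(
    in: query,
    options: [],
    range: NSRange(query.startIndex..., in: query),
    withTemplate: "?offset=Awesome$1&limit=Awesome$2")
print(decorated)
// ?offset=Awesome400&limit=Awesome20
```

The closure-based function from this section remains the better fit when the replacement depends on the matched text in ways a template cannot express.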

💡
The full source code is available on GitHub in this Gist

Summary

So this was all about performing common operations with regular expressions on iOS using Swift. Regexes are powerful, but it's not always clear how to use them. This blog post is my effort to make it clear how to utilize them and how to perform basic regex-related tasks.

I have made the source code public using the above-mentioned Gist. Thanks for reading, and if you have any questions, comments, or feedback, please feel free to reach out on Twitter @jayeshkawli.

While working on this post, I came across an online interactive regular expression checker, which is great for incrementally trying out a regex and observing intermediate results. With this tool, you get feedback in real time without going through multiple iterations.