A string pattern is a combination of characters that can be used to find very specific pieces, often called substrings, that exist inside a longer string. String patterns are used with several string functions provided by Lua.

Direct Matches

Direct matches can be done for any non-magic characters by simply using them literally in a Lua function like string.match(). For example, these commands look for the word Roblox within a string:
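The two sample strings in this sketch are assumptions chosen for illustration; only the searched word, Roblox, comes from the text above:

```lua
-- Hypothetical sample strings; the pattern "Roblox" has no magic characters,
-- so it is matched literally.
local match1 = string.match("Welcome to Roblox!", "Roblox")
local match2 = string.match("Welcome to my awesome game!", "Roblox")

print(match1)  --> Roblox
print(match2)  --> nil
```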

Notice that a match is found in the first string, so Roblox is output to the console. However, a match is not found in the second string and the output is nil.

Character Classes

Character classes are essential for more advanced string searches. They’re a way to search for something that isn’t necessarily a specific character but fits within a known category (class). In Lua, you can search a string for letters, digits, spaces, punctuation, and more.

The following table shows the official character classes for Lua string patterns:

| Class | Represents | Example Match |
| --- | --- | --- |
| . | Any character | 32kasGJ1%fTlk?@94 |
| %a | An uppercase or lowercase letter | aBcDeFgHiJkLmNoPqRsTuVwXyZ |
| %l | A lowercase letter | abcdefghijklmnopqrstuvwxyz |
| %u | An uppercase letter | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
| %d | Any digit (number) | 0123456789 |
| %p | Any punctuation character | !@#;,. |
| %w | An alphanumeric character (either a letter or a number) | aBcDeFgHiJkLmNoPqRsTuVwXyZ0123456789 |
| %s | A space or other whitespace character | (space), \t, \n, and \r |
| %c | A special control character | |
| %x | A hexadecimal character | 0123456789ABCDEFabcdef |
| %z | The NULL character (\0) | |
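As a rough illustration of how a few of these classes behave, the sketch below runs them against a hypothetical sample string (the string is an assumption, not part of the original article). Each call returns only the first character that fits the class:

```lua
local sample = "Stage 7: go!"  -- hypothetical sample string

print(string.match(sample, "%u"))  --> S  (first uppercase letter)
print(string.match(sample, "%l"))  --> t  (first lowercase letter)
print(string.match(sample, "%d"))  --> 7  (first digit)
print(string.match(sample, "%p"))  --> :  (first punctuation character)

-- string.find returns the start and end index of the match instead of the text.
local spaceStart, spaceEnd = string.find(sample, "%s")
print(spaceStart, spaceEnd)        -- both are 6, the position of the first space
```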

Magic Characters

There are 12 “magic characters” which are reserved for special purposes in patterns:

$ % ^ * ( )
. [ ] + - ?

Instead of using their special meaning, you can precede them with a % symbol to search for them literally. This is called character escaping. For example, to search for roblox.com, you’ll need to escape the . (period) symbol by preceding it with a %.
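A minimal sketch of the difference escaping makes. The input strings are assumptions for illustration; the unescaped pattern is included only to show that a bare . accepts any character:

```lua
local text = "Visit roblox.com today"  -- hypothetical sample string

-- Unescaped: "." matches any character, so a string without a period still matches.
print(string.match("robloxZcom", "roblox.com"))   --> robloxZcom

-- Escaped: "%." matches only a literal period.
print(string.match("robloxZcom", "roblox%.com"))  --> nil
print(string.match(text, "roblox%.com"))          --> roblox.com
```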

Anchors

To ensure a pattern occurs at the beginning of a string, you can use the ^ symbol to represent the “head” of the string. Conversely, the $ symbol ensures a pattern occurs at the end of a string.

You can also use both ^ and $ together to ensure a pattern matches only the full string and not just some portion of it.
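The anchors can be sketched as follows, assuming a hypothetical greeting string that is not part of the original article:

```lua
local greeting = "Hello Roblox"  -- hypothetical sample string

-- "^" pins the pattern to the start of the string.
print(string.match(greeting, "^Hello"))   --> Hello
print(string.match(greeting, "^Roblox"))  --> nil

-- "$" pins the pattern to the end of the string.
print(string.match(greeting, "Roblox$"))  --> Roblox
print(string.match(greeting, "Hello$"))   --> nil

-- Both together: the pattern must account for the entire string.
print(string.match(greeting, "^Hello$"))  --> nil
print(string.match("Hello", "^Hello$"))   --> Hello
```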

Class Modifiers

By itself, a character class will only match one character in a string. For instance, the pattern below ("%d") starts reading the string from left to right, finds the first digit (2), and stops.
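A minimal sketch of that lookup, assuming a sample sentence whose first number is 25 (the sentence itself is hypothetical):

```lua
local text = "The Cloud Kingdom has 25 power moons"  -- assumed sample string

-- "%d" matches exactly one digit, so only the first digit is returned.
local match = string.match(text, "%d")
print(match)  --> 2
```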

Fortunately, you can use modifiers with any character class to control the result:

| Quantifier | Meaning |
| --- | --- |
| + | Match 1 or more of the preceding character class, as many as possible |
| - | Match 0 or more of the preceding character class, as few as possible |
| * | Match 0 or more of the preceding character class, as many as possible |
| ? | Match 0 or 1 of the preceding character class |
| %n | For n between 1 and 9, matches a substring equal to the n-th captured string |
| %bxy | Matches a balanced substring that starts with x and ends with y, including everything between (for example, %b() matches a pair of parentheses and everything between them) |

Adding a modifier to the same pattern above ("%d+" instead of "%d") outputs 25 instead of 2:
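A sketch of the change, using an assumed sample sentence that contains the number 25. The second half contrasts the greedy * with the lazy - on a hypothetical tag string that is not part of the original article:

```lua
local text = "The Cloud Kingdom has 25 power moons"  -- assumed sample string

-- "%d+" keeps consuming digits until a non-digit is reached.
local digits = string.match(text, "%d+")
print(digits)  --> 25

-- Greedy versus lazy matching on a hypothetical tag string.
local tagged = "<b>bold</b>"
local greedy = string.match(tagged, "<.*>")  -- as many characters as possible
local lazy = string.match(tagged, "<.->")    -- as few characters as possible
print(greedy)  --> <b>bold</b>
print(lazy)    --> <b>
```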

Class Sets

Sets should be used when a single character class can’t do the whole job. For instance, you might want to match both lowercase letters (%l) and punctuation characters (%p) using a single pattern.

Sets are defined by brackets [] around them. In the following example, notice the difference between using a set ("[%l%p]+") and not using a set ("%l%p+").
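A sketch of both commands. The sample string is an assumption, chosen so that the results line up with the outputs described next:

```lua
local text = "Hello!!! I am another string."  -- assumed sample string

local match1 = string.match(text, "[%l%p]+")  -- set: lowercase OR punctuation, repeated
local match2 = string.match(text, "%l%p+")    -- non-set: one lowercase, then punctuation

print(match1)  --> ello!!!
print(match2)  --> o!!!
```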

The first command (set) tells Lua to find both lowercase characters and punctuation. With the + quantifier added after the entire set, it finds all of those characters (ello!!!), stopping when it reaches the space.

In the second command (non-set), the + quantifier applies only to the %p class before it, so Lua matches a single lowercase character (o) followed by the series of punctuation (!!!).

String Captures

String captures are sub-patterns within a pattern. These are enclosed in parentheses () and are used to get (capture) matching substrings and save them to variables. For example, the pattern below contains two captures, (%a+) and (%d+), which return two substrings upon a successful match.
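A minimal sketch of such a pattern. Only the two captures come from the text; the %s?=%s? separator between them and the sample string are assumptions for illustration:

```lua
-- (%a+) captures a run of letters, (%d+) captures a run of digits.
-- The "=" with optional surrounding spaces is an assumed separator.
local pattern = "(%a+)%s?=%s?(%d+)"

local key, value = string.match("TwentyOne = 21", pattern)
print(key, value)  -- key is "TwentyOne", value is the string "21"
```

Note that both captures come back as strings; call tonumber(value) if a number is needed.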

String captures can also be nested as in the following example:
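A sketch of the nested search, using the pattern and the captured strings listed in the steps that follow. Joining the two sentences with a comma into one input string is an assumption:

```lua
-- Assumed input: the two captured descriptions, separated by a comma.
local places = "The Cloud Kingdom is heavenly, The Forest Kingdom is peaceful"

-- Outer parentheses capture the whole description;
-- inner parentheses capture just the kingdom name.
local pattern = "(The%s(%a+%sKingdom)[%w%s]+)"

for description, kingdom in string.gmatch(places, pattern) do
	print(description)
	print(kingdom)
end
--> The Cloud Kingdom is heavenly
--> Cloud Kingdom
--> The Forest Kingdom is peaceful
--> Forest Kingdom
```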

This pattern search works as follows:

  1. The string.gmatch() iterator looks for a match on the entire “description” pattern defined by the outer pair of parentheses. This stops at the first comma and captures the following:

     | # | Pattern | Capture |
     | --- | --- | --- |
     | 1 | (The%s(%a+%sKingdom)[%w%s]+) | The Cloud Kingdom is heavenly |

  2. Using its successful first capture, the iterator then looks for a match on the “kingdom” pattern defined by the inner pair of parentheses. This nested pattern simply captures the following:

     | # | Pattern | Capture |
     | --- | --- | --- |
     | 2 | (%a+%sKingdom) | Cloud Kingdom |

  3. The iterator then backs out and continues searching the full string, capturing the following:

     | # | Pattern | Capture |
     | --- | --- | --- |
     | 3 | (The%s(%a+%sKingdom)[%w%s]+) | The Forest Kingdom is peaceful |
     | 4 | (%a+%sKingdom) | Forest Kingdom |