# Lua String Patterns vs Regular Expressions

· 9 min read · Updated March 30, 2026 · intermediate
strings patterns regex lua

Lua does not have regular expressions. It has something simpler and more focused: string patterns. If you arrive from Python, JavaScript, or PHP, you will immediately notice similarities and start reaching for tools that do not exist. This guide covers exactly what Lua patterns can and cannot do, and how they differ from the regex engine you already know.

## Why Lua Has Its Own Pattern System

Lua’s standard library is deliberately small. Rather than importing a full regex engine, the string library ships with a lightweight pattern syntax that handles the vast majority of everyday text processing tasks. The tradeoff is real: Lua patterns cannot do everything PCRE can, but they are faster to parse and simpler to reason about.

The key distinction is that a Lua pattern is an ordinary string handed to the string library, not a separate literal syntax. Lua processes backslash escapes in the string literal before the pattern matcher ever sees it, so the pattern syntax uses `%` as its own escape character rather than the backslash you might expect from other languages.

## Magic Characters

Certain characters have special meaning inside a pattern. These are the magic characters:

```text
( ) . % + - * ? [ ^ $
```

To match any of these literally, you escape them with `%`:

```lua
%.    -- literal dot
%)    -- literal close parenthesis
%%    -- literal percent sign
%[    -- literal open bracket
```

When in doubt, escape it. `%` works as an escape for every non-alphanumeric character. This catches a lot of people coming from regex, where `\` is the escape character. In Lua, `\` has no special meaning inside patterns at all — you use `%` instead.
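A minimal sketch of the difference an escape makes. The sample strings are made up for illustration:

```lua
-- An unescaped "." is a wildcard, so it also matches the "x" in "1x5"
print(string.match("1x5", "1.5"))         --> 1x5
-- "%." matches only a literal dot
print(string.match("1x5", "1%.5"))        --> nil
print(string.match("1.5", "1%.5"))        --> 1.5
-- A backslash is an ordinary character to the pattern matcher;
-- "\\" in a Lua string literal is a single backslash
print(string.match("C:\\temp", "\\%a+"))  --> \temp
```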

## Character Classes

Single-character predefined classes let you match common character sets without building your own:

| Class | Matches |
|-------|---------|
| `.` | Any character |
| `%a` | Letters (a-z, A-Z) |
| `%c` | Control characters |
| `%d` | Digits (0-9) |
| `%g` | Printable characters except space (Lua 5.2+) |
| `%l` | Lowercase letters |
| `%p` | Punctuation |
| `%s` | Space characters |
| `%u` | Uppercase letters |
| `%w` | Alphanumeric (letters + digits) |
| `%x` | Hexadecimal digits |
| `%z` | The null character (`\0`); deprecated since Lua 5.2, where a pattern may contain `\0` directly |

The uppercase version of any class is its complement:

```lua
%A   -- any character that is NOT a letter
%S   -- any character that is NOT a space
```

One important difference from PCRE: Lua's `%w` does not include underscore. It matches `[a-zA-Z0-9]` only. PCRE's `\w` includes `_`. If you need to match identifiers, use `[_%w]` or the full character class `[_%a%d]` instead.
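As a rough illustration of the underscore difference (the sample line and variable name are invented for this sketch):

```lua
local line = "local max_retry_count = 3"

-- %w+ stops at the first underscore
print(string.match(line, "local (%w+)"))          --> max

-- Adding "_" to the set captures the whole identifier
print(string.match(line, "local ([_%a][_%w]*)"))  --> max_retry_count
```

The second pattern also requires the first character to be a letter or underscore, which is how identifiers are usually defined.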

You can also build your own character sets with square brackets:

```lua
[aeiou]          -- vowels
[0-9a-fA-F]      -- hex digits, same as %x
[^0-7]           -- any character that is NOT an octal digit
[%[%]\\]         -- literal brackets and a backslash (\\ is one backslash in a Lua string literal)
```

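One subtlety: magic characters behave differently inside a character set. Most of them lose their special meaning there, but `]`, `^` (in first position), `-` (between two characters), and `%` keep theirs. The `%` escape still works inside a set, so `%]`, `%^`, `%-`, and `%%` are the reliable way to include those characters literally. A `-` at the very start or end of the set is also taken literally.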
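A short sketch of literal brackets and hyphens inside a set, using made-up sample strings:

```lua
-- %[ and %] inside a set stand for the literal brackets
print(string.gsub("a[1]b", "[%[%]]", ""))     --> a1b  2

-- An escaped hyphen is always literal
print(string.gsub("1-800-555", "[%-]", " "))  --> 1 800 555  2

-- A hyphen at the end of the set is literal too
print(string.match("+42", "[+-]?%d+"))        --> +42
```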

## Repetition Modifiers

Lua patterns offer four modifiers that control how many times a character class can repeat:

| Modifier | Meaning |
|----------|---------|
| `+` | One or more (greedy) |
| `*` | Zero or more (greedy) |
| `-` | Zero or more (non-greedy / lazy) |
| `?` | Zero or one (optional) |

```lua
%d+       -- one or more digits (integer)
%a*       -- zero or more letters (word)
[+-]?%d+  -- optional sign followed by digits (e.g. -12, +100, 42)
```

The greedy versus non-greedy distinction matters more than you might expect:

```lua
-- Greedy: matches from the FIRST "/*" to the LAST "*/",
-- which here swallows the code between the two comments
string.gsub("/* x */ int y; /* z */", "/%*.*%*/", "<C>")
--> "<C>"

-- Non-greedy: matches from each "/*" to the FIRST "*/" after it
string.gsub("/* x */ int y; /* z */", "/%*.-%*/", "<C>")
--> "<C> int y; <C>"
```

This trips up a lot of people. In PCRE, you add `?` after a quantifier to make it lazy. In Lua, `-` is the lazy version of `*`. There is no lazy version of `+` or `?` — that is a genuine limitation.
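One common workaround for the missing lazy `+` is to write the item once and follow it with the lazy `-` form. A minimal sketch, with an invented sample string:

```lua
local s = "[a][b]"

-- Greedy one-or-more runs to the last "]"
print(string.match(s, "%[.+%]"))   --> [a][b]

-- One "." followed by lazy zero-or-more behaves like a lazy one-or-more
print(string.match(s, "%[..-%]"))  --> [a]
```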

## Balanced Pairs with `%b`

Lua has a built-in pattern for matching balanced delimiter pairs, something that requires recursion or complex backtracking in most regex engines:

```lua
%b()   -- matched parentheses
%b{}   -- matched braces
%b[]   -- matched brackets
%b""   -- matched double quotes
%b''   -- matched single quotes
```

```lua
local s = "function foo(a, (b + c)) end"
print(string.match(s, "%b()"))  --> (a, (b + c))
```

`%b` works with a simple depth counter: each opening delimiter adds one, each closing delimiter subtracts one, and the match ends when the count returns to zero. It knows nothing about the surrounding grammar, so a delimiter inside a string literal or a comment is counted like any other. For plainly delimited text, though, it is extremely useful.
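The following sketch shows both sides of that behavior, using invented input strings: nesting is handled, but context is ignored.

```lua
-- Nested pairs are handled: the match ends at the ")" that balances the first "("
print(string.match("f(a, g(b), c) + h(d)", "%b()"))  --> (a, g(b), c)

-- %b is blind to context: the ")" inside the quoted string ends the match early
print(string.match('f("a)", b)', "%b()"))            --> ("a)
```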

## The Frontier Pattern `%f`

The `%f[set]` pattern matches the boundary between a character not in `set` and a character that is in `set`. This is Lua's way of detecting transitions:

```lua
%f[%w]   -- transition from non-word to word (word start)
%f[^%w]  -- transition from word to non-word (word end)
```

```lua
local text = "hello world foo bar"
print(string.match(text, "%f[%w]%a+%f[^%w]"))  --> hello
```

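Frontier patterns are specific to Lua. PCRE has no single equivalent construct: `\b` covers the word-boundary case, and the general form needs a lookbehind combined with a lookahead. They are rarely needed, but invaluable when you must match whole words in Lua, which has no lookaround.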
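A typical use is whole-word replacement. The sketch below compares a plain pattern with a frontier-delimited one. The sample sentence is made up:

```lua
local s = "cat concat cat's catalog"

-- Without frontiers, every occurrence of "cat" is replaced
print(string.gsub(s, "cat", "dog"))
--> dog condog dog's dogalog  4

-- With frontiers, only "cat" standing as a whole word is replaced
print(string.gsub(s, "%f[%w]cat%f[%W]", "dog"))
--> dog concat dog's catalog  2
```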

## The Four Core Functions

Every pattern operation in Lua goes through one of four functions in the `string` library.

### `string.find`

Searches for the first match and returns start and end indices:

```lua
local s = "hello world"
print(string.find(s, "world"))  --> 7  11
```

Pass `true` as the fourth argument to do a plain text search with no pattern interpretation:

```lua
-- With plain = true, every character in the search string is literal,
-- so magic characters such as "$" and "." need no escaping
print(string.find("price: $50", "$50", 1, true))  --> 8  10
```

### `string.match`

Returns the matched substring or, when the pattern has captures, returns those captured values:

```lua
local date = "30/05/1999"
local d, m, y = string.match(date, "(%d+)/(%d+)/(%d+)")
print(d, m, y)  --> 30  05  1999
```

### `string.gmatch`

Returns an iterator for stepping through all matches in a loop:

```lua
local s = "10 20 30 40"
for num in string.gmatch(s, "%d+") do
    print(num)
end
```
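
When the pattern contains captures, the iterator yields the captures instead of the whole match. A minimal sketch, assuming a made-up `key=value` format:

```lua
local config = "host=example.org, port=8080, mode=debug"
local settings = {}

-- Each iteration yields both captures: the key and its value
for key, value in string.gmatch(config, "(%w+)=([^,%s]+)") do
    settings[key] = value
end

print(settings.host, settings.port, settings.mode)  --> example.org  8080  debug
```

Captured values are always strings, so `settings.port` needs `tonumber` before it can be used as a number.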

### `string.gsub`

Replaces every occurrence. The replacement can be a string, a table, or a function:

```lua
local s = "hello world"
print(string.gsub(s, "o", "0"))  --> hell0 w0rld  2

-- Table-based replacement
string.gsub("hello", "%l", {h="H", e="E", l="L", o="O"})  --> "HELLO"

-- Function-based replacement
string.gsub("10 20 30", "%d+", function(n) return tonumber(n) * 2 end)
--> "20 40 60"
```
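
Two further details are worth a quick sketch: a string replacement can refer back to captures, and an optional fourth argument limits the number of replacements.

```lua
-- %1, %2, ... in the replacement refer to captures; %0 is the whole match
print(string.gsub("30/05/1999", "(%d+)/(%d+)/(%d+)", "%3-%2-%1"))
--> 1999-05-30  1

-- The fourth argument caps how many replacements are made
print(string.gsub("a a a a", "a", "b", 2))
--> b b a a  2
```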

## What Lua Patterns Cannot Do

Coming from PCRE, there are three gaps you will hit immediately.

**No alternation.** There is no `|` operator. To match one word or another, you need separate calls:

```lua
local s = "a bar stool"  -- sample input
local match = string.match(s, "foo") or string.match(s, "bar")  --> "bar"
```

Or iterate over a table of alternatives. There is no elegant single-pattern workaround.
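Iterating over alternatives might look like the sketch below. `match_any` is a hypothetical helper written for this example, not part of the standard library:

```lua
-- Hypothetical helper: try each pattern in order and return the first
-- match together with the pattern that produced it
local function match_any(s, patterns)
    for _, p in ipairs(patterns) do
        local m = string.match(s, p)
        if m then
            return m, p
        end
    end
    return nil
end

print(match_any("a bar stool", {"foo", "bar"}))  --> bar  bar
print(match_any("nothing here", {"foo", "bar"})) --> nil
```

Note that this picks the first pattern in the list that matches anywhere, whereas regex alternation picks whichever alternative matches at the earliest position. If that difference matters, compare the start indices returned by `string.find` instead.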

**No lookaround.** No lookahead (`?=`, `?!`) or lookbehind (`?<=`, `?<!`). The `%f` frontier pattern covers some boundary cases, but it cannot match conditionally based on what follows.

**No quantifiers on groups.** You cannot write `(abc)+` in Lua. A modifier can only apply to a character class, not to a grouped subpattern. The workaround is to restructure your pattern or handle it in code.
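
Handling a repeated group in code can be as simple as a loop over anchored `string.find` calls. `match_repeated` below is a hypothetical helper for illustration. It relies on `^` anchoring the match at the `init` position passed to `string.find`:

```lua
-- Hypothetical helper: count consecutive repetitions of the pattern `unit`
-- starting at position `init`, roughly what (unit)* would do in a regex
local function match_repeated(s, unit, init)
    local pos = init or 1
    local count = 0
    while true do
        -- "^" anchors the match at the search start position
        local first, last = string.find(s, "^" .. unit, pos)
        -- Stop on no match, or on an empty match that would loop forever
        if not first or last < first then
            break
        end
        count = count + 1
        pos = last + 1
    end
    -- Number of repetitions and the index of the last matched character
    return count, pos - 1
end

print(match_repeated("abcabcabcx", "abc"))  --> 3  9
print(match_repeated("xabc", "abc"))        --> 0  0
```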

## Escaping User Input

When building patterns from user input, you must escape all magic characters:

```lua
local function escape_pattern(s)
    -- Prefix every magic character with "%"; the outer parentheses drop gsub's count
    return (s:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", "%%%0"))
end

local user_input = "hello (world)"
local pattern = escape_pattern(user_input)  --> "hello %(world%)"
```

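Without this, any parentheses in the user input will be interpreted as capture groups, a `.` will match any character, and an unbalanced `(` or `[` raises an error. Unescaped input is a common source of bugs in Lua string pattern code. When you only need to locate a literal substring, `string.find` with `plain` set to `true` avoids the problem entirely.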
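
The replacement string passed to `string.gsub` has its own special character, `%`, so a fully literal search-and-replace needs escaping on both sides. The sketch below repeats `escape_pattern` so that it runs on its own. `replace_literal` is a hypothetical helper name:

```lua
local function escape_pattern(s)
    return (s:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", "%%%0"))
end

-- Hypothetical helper: replace every literal occurrence of `old` with `new`
local function replace_literal(s, old, new)
    -- "%" is special in replacement strings too, so double it there
    local safe_new = new:gsub("%%", "%%%%")
    return (s:gsub(escape_pattern(old), safe_new))
end

print(replace_literal("total (approx): 50", "(approx)", "[100%]"))
--> total [100%]: 50
```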

## Performance Notes

Lua does not compile patterns ahead of time. Each call to `string.find`, `string.match`, `string.gmatch`, or `string.gsub` interprets the pattern string directly while matching, so there is no compiled pattern object to build, cache, or reuse. The per-call cost is small because the syntax is so much simpler than PCRE's.

For tight loops with heavy pattern matching, the difference can matter. If you are processing millions of short strings, precompiling is not an option, because Lua simply does not have that feature. In practice, for most tasks (parsing config files, extracting data from structured logs, validating input), the pattern-handling overhead is negligible compared to the actual string scanning.

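If you need maximum performance, LuaJIT can speed up the Lua code around your pattern calls considerably, but do not assume the gain carries over to the pattern functions themselves. They are implemented in C, and LuaJIT's list of not-yet-implemented features marks most pattern-based calls as running outside compiled traces. Benchmark your own workload before committing to either runtime.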

## Quick Reference

| Feature | Lua Patterns | PCRE/Regex |
|---------|-------------|------------|
| Escape character | `%` | `\` |
| Any character | `.` | `.` |
| One or more | `+` | `+` |
| Zero or more greedy | `*` | `*` |
| Zero or more lazy | `-` | `*?` |
| Optional | `?` | `?` |
| Alternation | None | `\|` |
| Lookahead/behind | None | Yes |
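| Unanchored search by default | Yes (anchor with `^`, `$`) | Yes (API-dependent, e.g. Python's `re.match` anchors at the start) |
| Word class includes underscore | No (`%w`) | Yes (`\w`) |
| Balanced pairs | `%bxy` | Recursive patterns |
| Frontier pattern | `%f[set]` | No single construct (`\b` covers word boundaries) |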

Lua string patterns are a deliberate tradeoff. They handle the common cases cleanly and leave the complex cases to external libraries or plain Lua code. Once you stop fighting that decision and work with the grain, you will find they cover a surprising amount of ground.

## See Also

- [Lua Metatables](/guides/lua-metatables/) — Metatables let you redefine how string operations work on your own objects.
- [Lua Weak Tables](/guides/lua-weak-tables/) — Weak tables help manage memory when tables are used as pattern matchers or callbacks.
- [Pattern Matching](/tutorials/pattern-matching/) — A broader look at pattern facilities in Lua, including table patterns and the `match` construct.