Strings and Pattern Matching Basics
Strings are everywhere in programming, and Lua gives you a compact set of tools to work with them. Whether you’re parsing data, building URLs, or processing user input, the string library and its pattern syntax cover most everyday text handling.
Creating strings
In Lua, strings can be created using single quotes, double quotes, or long brackets:
local single = 'Hello'
local double = "World"
local long = [[This is a
multi-line string]]
Note: Lua strings are immutable. Every operation that appears to modify a string actually creates a new one.
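To make the immutability note concrete, here is a minimal sketch. The variable names are illustrative only; the point is that string.upper returns a new string and leaves its argument untouched.

```lua
local original = "hello"
local shouted = string.upper(original) -- returns a new string

print(original) -- hello (unchanged)
print(shouted)  -- HELLO
```

To "modify" a string you therefore reassign the variable to the returned value.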
The string library
Lua’s string library provides functions for manipulating and analyzing strings. Most functions live in the string table:
local text = "Hello, Lua!"
-- Get length
print(#text) -- 11
print(string.len(text)) -- 11
-- Convert to uppercase/lowercase
print(string.upper("hi")) -- HI
print(string.lower("HI")) -- hi
-- Get substring
print(string.sub(text, 1, 5)) -- Hello
-- Find substring
local start, finish = string.find(text, "Lua")
print(start, finish) -- 8 10
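The same functions can also be called with method syntax, because string values share a metatable whose __index is the string table. The sketch below repeats a few of the calls above in that style and adds the optional "plain" flag of string.find, which turns pattern matching off for the search.

```lua
local text = "Hello, Lua!"

print(text:upper())              -- HELLO, LUA!
print(text:sub(8, 10))           -- Lua
print(text:sub(-4))              -- Lua! (negative indices count from the end)
print(("abc"):rep(2))            -- abcabc (literals need parentheses)
print(text:find("Lua", 1, true)) -- 8 10 (plain search, no pattern syntax)
```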
String concatenation
Combine strings using the .. operator:
local first = "Hello"
local second = "World"
local greeting = first .. " " .. second
print(greeting) -- Hello World
Tip: For building strings in loops, use table.concat instead of .. to avoid O(n²) performance:
local parts = {"a", "b", "c"}
local result = table.concat(parts, "-")
print(result) -- a-b-c
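The tip is about loops, so here is a small sketch of the usual idiom: collect the pieces in a table and join once at the end. The names buffer and joined are illustrative.

```lua
local buffer = {}

for i = 1, 5 do
  -- Appending to a table is cheap; no intermediate strings are rebuilt.
  buffer[#buffer + 1] = "item" .. i
end

local joined = table.concat(buffer, ", ")
print(joined) -- item1, item2, item3, item4, item5
```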
Pattern matching basics
Lua’s pattern matching is a lightweight alternative to full regex and is built into the string library. A pattern combines character classes such as %d with magic characters such as . ^ and $, which have special meanings:
| Character | Meaning |
|---|---|
| . | Any character |
| %a | Any letter |
| %d | Any digit |
| %s | Any whitespace |
| %w | Alphanumeric characters |
| %p | Punctuation |
| %b() | Balanced parentheses |
| ^ | Start of string (only as the first character of a pattern) |
| $ | End of string (only as the last character of a pattern) |
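The last three rows of the table are the least obvious, so here is a short sketch of %b() and the two anchors. The sample strings are made up for illustration.

```lua
local expr = "sum(a, max(b, c)) + 1"

-- %b() matches from the first "(" to its balancing ")"
print(string.match(expr, "%b()"))         -- (a, max(b, c))

-- ^ anchors the match at the start of the subject
print(string.match("lua rocks", "^lua"))  -- lua
print(string.match("I like lua", "^lua")) -- nil

-- $ anchors the match at the end of the subject
print(string.match("I like lua", "lua$")) -- lua
```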
Character classes
Character classes let you match types of characters:
local text = "abc123DEF"
-- Match the first run of letters
print(string.match(text, "%a+")) -- abc
-- Match the first run of digits
print(string.match(text, "%d+")) -- 123
-- Match the first run of uppercase letters
print(string.match(text, "%u+")) -- DEF
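Classes can also be negated or combined. The sketch below shows the uppercase complement form (%D is "not a digit") and bracketed sets, which accept classes, literal characters, and a leading ^ for negation.

```lua
local text = "abc123DEF"

print(string.match(text, "%D+"))       -- abc (first run of non-digits)
print(string.match(text, "[%d%u]+"))   -- 123DEF (digits or uppercase letters)
print(string.match(text, "[^%l]+"))    -- 123DEF (anything except lowercase)
print(string.match("id_42", "[%w_]+")) -- id_42 (alphanumerics plus underscore)
```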
Repetition operators
Control how many times a pattern matches:
| Operator | Meaning |
|---|---|
| * | 0 or more (greedy) |
| + | 1 or more (greedy) |
| - | 0 or more (non-greedy) |
| ? | 0 or 1 |
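local text = "123abc456"
print(string.match(text, "%d+")) -- 123 (greedy: longest run of digits)
print(string.match(text, "%d-")) -- (empty string: lazy, matches as little as possible)
print(string.match(text, "%a*")) -- (empty string: zero letters match at position 1)
print(string.match(text, "%d+%a*")) -- 123abc (digits, then 0 or more letters)
print(string.match(text, "%d+%a?")) -- 123a (digits, then 0 or 1 letter)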
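The difference between greedy and lazy repetition shows most clearly when something follows the repeated item. In this sketch the sample markup is made up; the capture shows how far each operator expands before the closing ">" is matched.

```lua
local html = "<b>bold</b> and <i>italic</i>"

-- Greedy: .* runs to the end, then backs up to the last ">"
print(string.match(html, "<(.*)>")) -- b>bold</b> and <i>italic</i

-- Lazy: .- stops at the first ">" that lets the pattern succeed
print(string.match(html, "<(.-)>")) -- b
```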
Captures
Extract specific parts of a match using parentheses:
local date = "2024-03-17"
-- Capture each component
local year, month, day = string.match(date, "(%d+)-(%d+)-(%d+)")
print(year, month, day) -- 2024 03 17
-- Extract domain from email
local email = "user@example.com"
local user, domain = string.match(email, "(.+)@(.+)")
print(user, domain) -- user example.com
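Captures also work with string.gmatch, which returns the captures of each successive match. As a sketch, the loop below collects key/value pairs from a made-up query string into a table; the names query and params are illustrative.

```lua
local query = "name=lua&version=5.4&type=script"
local params = {}

-- Each iteration yields the two captures of the next match.
for key, value in string.gmatch(query, "([^&=]+)=([^&=]+)") do
  params[key] = value
end

print(params.name, params.version, params.type) -- lua 5.4 script
```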
Common pattern operations
Finding and replacing
local text = "The quick brown fox"
-- Find a pattern
local pos = string.find(text, "quick")
print(pos) -- 5
-- Replace all occurrences (gsub)
local result = string.gsub(text, "o", "X")
print(result) -- The quick brXwn fXx
-- Replace with captures
local result = string.gsub("Hello, John!", "(%w+)", "%1!")
print(result) -- Hello!, John!!
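string.gsub accepts more than a replacement string: the third argument may be a table or a function, an optional fourth argument limits the number of replacements, and the call returns the replacement count as a second value. The template text and names below are invented for illustration.

```lua
local template = "Hello, $name! You have $count new messages."
local values = { name = "Ada", count = "3" }

-- Table replacement: the first capture is used as the lookup key.
local filled, n = string.gsub(template, "%$(%w+)", values)
print(filled) -- Hello, Ada! You have 3 new messages.
print(n)      -- 2

-- Function replacement: the return value replaces each match.
local doubled = string.gsub("1 2 3", "%d+", function(d)
  return tostring(tonumber(d) * 2)
end)
print(doubled) -- 2 4 6

-- Fourth argument: replace at most one occurrence.
local first_only = string.gsub("a-b-c", "%-", "+", 1)
print(first_only) -- a+b-c
```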
Splitting strings
Lua doesn’t have a built-in split function, but you can create one:
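function split(str, delimiter)
  local result = {}
  -- Build a set that matches runs of non-delimiter characters, e.g. "([^,]+)".
  -- Empty fields are skipped, and a "%" delimiter would need escaping.
  local pattern = string.format("([^%s]+)", delimiter)
  for match in string.gmatch(str, pattern) do
    table.insert(result, match)
  end
  return result
end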
local parts = split("apple,banana,cherry", ",")
print(table.concat(parts, ", ")) -- apple, banana, cherry
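The gmatch-based split treats the delimiter as a set of characters and drops empty fields. When those matter, one alternative is a loop over string.find with the plain flag. The function below, split_plain, is a hypothetical helper sketched for this tutorial, not a standard library function.

```lua
local function split_plain(str, delimiter)
  assert(delimiter ~= "", "delimiter must not be empty")
  local result = {}
  local from = 1
  while true do
    -- Fourth argument true = plain search, so magic characters are literal.
    local s, e = string.find(str, delimiter, from, true)
    if not s then
      result[#result + 1] = string.sub(str, from)
      break
    end
    result[#result + 1] = string.sub(str, from, s - 1)
    from = e + 1
  end
  return result
end

local fields = split_plain("a,,b", ",")
print(#fields) -- 3 (the empty middle field is kept)
print(table.concat(split_plain("one::two::three", "::"), " | ")) -- one | two | three
```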
Validating input
Pattern matching is excellent for validation:
function is_valid_email(email)
  -- Simple pattern: something@something.something, anchored to the whole string
  return string.match(email, "^[^@%s]+@[^@%s]+%.[^@%s]+$") ~= nil
end
function is_phone_number(phone)
  -- Accepts: 123-456-7890 or (123) 456-7890
  -- A literal hyphen is written %- because a bare - is a repetition operator
  return string.match(phone, "^%(?%d%d%d%)?[%s%-]?%d%d%d%-%d%d%d%d$") ~= nil
end
print(is_valid_email("test@example.com")) -- true
print(is_valid_email("invalid")) -- false
print(is_phone_number("123-456-7890")) -- true
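One detail matters for any validator: without ^ and $, a pattern succeeds if it matches anywhere in the input. The sketch below uses a hypothetical helper, is_all_digits, to show the difference.

```lua
local function is_all_digits(s)
  -- Anchored at both ends: the whole string must consist of digits.
  return string.match(s, "^%d+$") ~= nil
end

print(string.match("abc123", "%d+") ~= nil) -- true (digits found somewhere)
print(is_all_digits("abc123"))              -- false
print(is_all_digits("123"))                 -- true
```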
Escaping magic characters
When you need to match a literal magic character, escape it with %:
local text = "Price: $99.99"
-- Match a literal dollar sign ($ only acts as an anchor at the very end of a pattern, but escaping it is always safe)
local price = string.match(text, "%$%d+%.%d+")
print(price) -- $99.99
-- Match literal period
local version = string.match("lua5.4", "5%.%d")
print(version) -- 5.4
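When the text to match comes from user input, escaping by hand is not practical. A common approach is a small helper that prefixes every punctuation character with %. The name escape_pattern is an assumption made for this sketch; Lua does not ship such a function.

```lua
local function escape_pattern(s)
  -- "%%%0" is a literal % followed by the matched character.
  -- The parentheses drop gsub's second return value (the count).
  return (string.gsub(s, "%p", "%%%0"))
end

print(escape_pattern("1.5+2")) -- 1%.5%+2

local needle = escape_pattern("$9.99")
print(string.find("Total: $9.99", needle)) -- 8 12
```

For a purely literal search, string.find with its plain flag set to true avoids the need for escaping altogether.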
Practical example: parsing log lines
Let’s build a log parser that extracts useful information:
function parse_log_line(line)
  -- Pattern: [TIMESTAMP] LEVEL: Message
  local timestamp, level, message = string.match(
    line,
    "%[(%d+:%d+:%d+)%]%s(%w+):%s(.+)"
  )
  if not timestamp then
    return nil -- line does not follow the expected format
  end
  return {
    timestamp = timestamp,
    level = level,
    message = message
  }
end
local log = "[14:32:15] ERROR: Connection refused"
local parsed = parse_log_line(log)
print(parsed.level) -- ERROR
print(parsed.message) -- Connection refused
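Real logs contain many lines, some of which will not fit the format. As a sketch under the same assumed log layout, the loop below walks a multi-line string with string.gmatch, skips lines that do not match, and counts entries per level. The sample log text is invented.

```lua
local log_text = [[
[14:32:15] ERROR: Connection refused
[14:32:16] INFO: Retrying
not a log line
[14:32:18] ERROR: Timeout]]

local counts = {}
for line in string.gmatch(log_text, "[^\n]+") do
  -- Only the level is captured here; nil means the line did not match.
  local level = string.match(line, "^%[%d+:%d+:%d+%]%s(%w+):")
  if level then
    counts[level] = (counts[level] or 0) + 1
  end
end

print(counts.ERROR, counts.INFO) -- 2 1
```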
Summary
Lua’s string capabilities cover most common text-processing needs:
- # operator and string.len for length
- string.sub for extracting substrings
- string.find for searching
- string.gsub for replacements
- Patterns for flexible matching and extraction
- Captures to extract specific parts of matches
Pattern matching in Lua is simpler than regex but powerful enough for most tasks. Remember to escape magic characters (%, ., ^, $, etc.) when matching them literally.
Next in series: Modules and the require System
See also
- string.find reference — Find substrings with string.find()
- string.gsub reference — Replace patterns with string.gsub()
- Lua tables tutorial — The companion data-structure tutorial in this series
Why Lua patterns are not regex
Lua deliberately avoids the full POSIX or PCRE regular-expression grammar, and the reasons matter when you decide which tool to reach for. Patterns have no alternation operator, no lookaround, and no quantifiers on groups: a repetition operator applies only to a single character class, never to a longer sequence. Backreferences exist only in the limited form %1 through %9, which match a copy of an earlier capture. In exchange you get a small, predictable engine that ships in the standard library and is much less prone to runaway backtracking than a full regex engine. For most log lines, slug fragments, or field extractions in a config file, the trade is comfortably in your favor.
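The missing alternation operator is usually worked around by trying several patterns in turn. The helper below, match_any, is a hypothetical name used for this sketch; it returns the first successful match or nil.

```lua
local function match_any(s, patterns)
  for _, p in ipairs(patterns) do
    local m = string.match(s, p)
    if m then
      return m
    end
  end
  return nil
end

-- Roughly what a regex would express as \.(png|jpe?g)$
local image_exts = { "%.png$", "%.jpe?g$" }
print(match_any("photo.jpeg", image_exts)) -- .jpeg
print(match_any("notes.txt", image_exts))  -- nil
```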
When patterns are not enough, a common option is the lrexlib-pcre library, which exposes full PCRE through an interface modeled on the string library. Reach for it when you need lookaround, named captures, or alternation. For everything else, the built-in string.match, string.gmatch, and string.gsub keep dependencies small and behavior easy to reason about.
For a single-line refresher: %a is a letter, %d is a digit, %w is alphanumeric, %s is whitespace, and the uppercase versions (%A, %D, %W, %S) match the complement. Combine them with +, *, ?, or - to control repetition, and remember that outside a bracketed set - is the lazy quantifier; it marks a range only inside a set such as [a-z].