string.gmatch

Signature

string.gmatch(s, pattern [, init])

string.gmatch returns an iterator function. Each call yields the next set of captures from pattern over s. When no matches remain, the iterator returns no values, which a for ... in loop reads as nil and takes as its signal to stop. The usual shape is therefore a for ... in loop that walks through every match in source order.

Quick example

Iterating over words is the textbook use case. The pattern %a+ matches a run of one or more ASCII letters, and string.gmatch returns a fresh iterator that yields each match in turn. Nothing to free or close: the iterator is a closure that carries its own state, and it becomes garbage once the loop ends and nothing else references it.

for w in string.gmatch("hello world from Lua", "%a+") do
  print(w)
end

The loop body runs once per match, so the output is one word per line, in the order the words appear in the source string. The for loop handles the “no more matches” case for you, so there is no off-by-one risk at the end of the input.

hello
world
from
Lua

If you only want the first match, see string.match. If you want byte positions for a single match, see string.find.
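
To make the difference concrete, here is a small sketch that runs all three functions over the same subject. The variable names are only illustrative.

```lua
local s = "hello world from Lua"

-- string.match: the first match only, returned as a string
local first = string.match(s, "%a+")

-- string.find: the byte positions of the first match
local start_pos, end_pos = string.find(s, "%a+")

-- string.gmatch: every match, one per iteration
local count = 0
for _ in string.gmatch(s, "%a+") do
  count = count + 1
end

print(first, start_pos, end_pos, count)  --> hello   1   5   4
```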

Parameters

s (string, required): the subject string to iterate over. Numbers are accepted and converted to strings; passing nil, a table, or any other value raises a type error. The error wording shifts between Lua versions, but the type-error behaviour itself is stable across 5.1, 5.3, and 5.4:

string.gmatch(nil, "%a+")
-- error: bad argument #1 to 'gmatch' (string expected, got nil)

pattern (string, required): a Lua pattern. Captures inside parentheses determine what the iterator yields on each call. The pattern is not anchored: string.gmatch scans the bytes in source order and reports every non-overlapping match. Each search resumes where the previous match ended, and in Lua 5.4 an empty match is rejected if it ends exactly where the previous match ended, so every match ends at least one byte after the one before it.
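
As a rough illustration of "unanchored" and "non-overlapping", the sketch below counts matches in two made-up subjects. The strings and variable names are assumptions chosen for the example.

```lua
-- Unanchored: digits are found wherever they occur in the subject.
local digits = {}
for d in string.gmatch("x=1, y=22, z=333", "%d+") do
  digits[#digits + 1] = d
end
-- digits == { "1", "22", "333" }

-- Non-overlapping: "aa" matches bytes 1-2 and 3-4. The scan resumes after
-- each match, so the pair at bytes 2-3 is never reported.
local pairs_found = 0
for _ in string.gmatch("aaaaa", "aa") do
  pairs_found = pairs_found + 1
end
-- pairs_found == 2
```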

init (integer, optional, default 1): the 1-based byte position in s where the search starts. Negative values count from the end, as in string.find. Added in Lua 5.4, so it is not available in 5.1, 5.2, 5.3, or stock LuaJIT (5.1-compatible). The site targets Lua 5.4, where init is available. A negative value lets you start from the tail:

for w in string.gmatch("hello world", "%a+", -5) do
  print(w)  --> world
end

Return value

A function value: a closure that keeps the current byte position in s as private state. Calling the function advances that position and returns the next match’s captures. Two iterators over the same s are independent because each closure holds its own cursor, so call string.gmatch once per consumer:

local words = string.gmatch("a b c", "%a+")
local firsts = string.gmatch("a b c", "%a+")
print(words(), firsts(), words())  --> a   a   b

What the iterator yields

  • No captures in pattern: the iterator returns the whole match as a single string on each call.
  • One capture: the iterator returns the captured substring.
  • N captures: the iterator returns N values, so the for loop should declare N loop variables. If the count is not known in advance, call the iterator directly and gather the results with table.pack.
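
The three cases can be seen side by side by calling each iterator once by hand. This is only a sketch; the date string is an invented sample.

```lua
local s = "2024-05-17"

-- No captures: the whole match comes back.
local whole = string.gmatch(s, "%d+%-%d+")()

-- One capture: only the captured part comes back.
local year = string.gmatch(s, "(%d+)%-%d+")()

-- Three captures: three values per call.
local y, m, d = string.gmatch(s, "(%d+)%-(%d+)%-(%d+)")()

print(whole, year, y, m, d)  --> 2024-05   2024   2024   05   17
```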

Behaviour rules

No empty match right after a match. In Lua 5.4 the matcher refuses an empty match that ends exactly where the previous match ended, so the next match must end at least one byte past the previous one. string.gsub follows the same rule, which keeps a pattern like a* from reporting a spurious empty match after each real one. Termination is guaranteed separately: after a failed or rejected attempt the scan moves forward one byte, so a pattern that can match the empty string still finishes:

local n = 0
for _ in string.gmatch("aaa", "a*") do n = n + 1 end
-- n == 1

^ does not anchor in gmatch. From the manual: “a caret '^' at the start of a pattern does not work as an anchor, as this would prevent the iteration.” If you need the first match pinned to the start of the string, use string.match or string.find with ^. The init parameter only moves the position where scanning begins; it does not pin a match to that position.
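
A short sketch of the difference follows. It assumes the reference implementation's behaviour, where a leading caret in a gmatch pattern is matched as an ordinary character rather than as an anchor.

```lua
local s = "abc def"

-- In gmatch the leading ^ is not an anchor. The subject contains no
-- literal "^", so this loop finds nothing.
local n = 0
for _ in string.gmatch(s, "^%a+") do
  n = n + 1
end

-- string.match does honour ^, so it returns the word at the very start.
local first = string.match(s, "^%a+")

print(n, first)  --> 0   abc
```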

Bytes, not characters. Lua strings are byte sequences. %a, %w, and the other character classes test one byte at a time, and in the default C locale only ASCII characters qualify. A UTF-8 letter like é (bytes 0xC3 0xA9) is therefore not matched by %a in stock Lua 5.4, because neither byte is an ASCII letter on its own:

for _ in string.gmatch("é", "%a") do
  print("letter")
end
-- prints nothing

If you deal with non-ASCII text, reach for a library such as lua-utf8.
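
For the narrower job of walking a string one UTF-8 character at a time, the standard utf8 library (Lua 5.3 and later) provides utf8.charpattern, which can be passed straight to gmatch. The sketch below assumes the subject is valid UTF-8 and writes é with explicit byte escapes so it does not depend on the source file's encoding.

```lua
-- "h\xC3\xA9llo" is "héllo" written with explicit UTF-8 bytes.
local s = "h\xC3\xA9llo"

local chars = {}
for ch in string.gmatch(s, utf8.charpattern) do
  chars[#chars + 1] = ch
end

print(#s, #chars)  --> 6   5
```

This only splits the string into encoded characters; it does not make %a or %w aware of non-ASCII letters.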

No match is not an error. If the pattern never matches, the iterator returns nil on the first call and the loop body never runs. No exception is raised.

Examples

Iterate over words

A simple pattern, one capture variable, one print per match. This is the form you will see in most Lua code that processes a string. The pattern %a+ reads as “one or more ASCII letters”, and the loop runs once per contiguous run of letters.

local s = "hello world from Lua"
for w in string.gmatch(s, "%a+") do
  print(w)
end

Each iteration binds w to the next match, in order, and the loop exits as soon as the iterator returns nil. There is no off-by-one risk; the last word and the first “no more” call are handled by the for machinery, not by your code.

hello
world
from
Lua

Parse key=value pairs into a table

Two captures in the pattern mean two loop variables, which is the usual shape when you want to turn a flat string into a structured record. Each iteration assigns to t[k], so a key that appears twice keeps its last value. The resulting table does not remember the order of the keys in the source string, because Lua tables have no defined order for string keys.

local t = {}
local s = "from=world, to=Lua"
for k, v in string.gmatch(s, "(%w+)=(%w+)") do
  t[k] = v
end
-- t == { from = "world", to = "Lua" }

If the source has values that may contain non-word characters (spaces, punctuation), swap %w+ for a broader class like [^,]+ and trim whitespace by hand. For multiline input, run gmatch over each line separately rather than trying to encode newlines into a single pattern.
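
One possible shape for that variant is sketched below. The trim helper and the sample input are assumptions made for illustration, not part of the standard library.

```lua
-- Hypothetical helper: strips leading and trailing whitespace.
local function trim(v)
  return (string.match(v, "^%s*(.-)%s*$"))
end

local t = {}
local s = "name = John Smith , city=New York"

-- [^,]+ takes everything up to the next comma, spaces included.
for k, v in string.gmatch(s, "(%w+)%s*=%s*([^,]+)") do
  t[k] = trim(v)
end
-- t == { name = "John Smith", city = "New York" }
```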

Drive the iterator manually

The closure is just a function. You can store it in a variable and call it as many times as you like until it returns nil. This is occasionally useful when the consumer of matches is not a for loop, for example a coroutine resume, or a recursive descent parser that wants one token at a time.

local next_word = string.gmatch("a b c", "%a+")
print(next_word())  --> a
print(next_word())  --> b
print(next_word())  --> c
print(next_word())  --> (empty line: the exhausted iterator returns no values)

Note that the state lives inside the closure, not in the loop. There is no way to “rewind” without calling string.gmatch again. If you need parallel iteration over the same input, build two iterators up front and call them alternately, or just slice the string yourself.

Skip a header with init

init is the third argument. Pass a 1-based byte position and string.gmatch starts the search there. This is the cleanest way to skip a fixed prefix without anchoring the pattern, and it works with captures exactly the same way the no-init form does.

local s = "v=1.0.0 name=demo build=42"
for k, v in string.gmatch(s, "(%w+)=([%w%.]+)", 3) do
  print(k, v)
end

init = 3 starts the search at byte 3, past the "v=" prefix, so the v=1.0.0 pair is never reported: its key lies before the start position. [%w%.]+ accepts letters, digits, and dots, so a dotted value such as 1.0.0 would be captured whole. If the prefix changes length, compute the byte offset from a string.find call and pass that in instead of hard-coding it.
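
Computing the offset might look like the following sketch. It assumes Lua 5.4 (for init) and a header of the form "v=" followed by non-space characters; the header pattern and variable names are illustrative.

```lua
local s = "v=1.0.0 name=demo build=42"

-- Locate the end of the leading "v=..." field plus any spaces after it.
local _, header_end = string.find(s, "^v=%S+%s*")

local pairs_seen = {}
-- Fall back to the start of the string if no header was found.
for k, v in string.gmatch(s, "(%w+)=([%w%.]+)", (header_end or 0) + 1) do
  pairs_seen[#pairs_seen + 1] = k .. "=" .. v
end
-- On Lua 5.4: header_end == 8, pairs_seen == { "name=demo", "build=42" }
```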

name    demo
build   42

Collect matches into a table

The classic “list comprehension” pattern in Lua: build a fresh table, append with #out + 1, and return it. The function takes a pattern so the caller decides what counts as a match, and the result is a clean array indexed from 1.

local function collect(s, pattern)
  local out = {}
  for v in string.gmatch(s, pattern) do
    out[#out + 1] = v
  end
  return out
end

collect("1, 2, 3, 4", "%d+")  --> { "1", "2", "3", "4" }

If you need both the match and its byte position, add a position capture () to the pattern, or use string.find in a loop. For very large inputs, consider consuming the iterator directly instead of building a table: gmatch produces matches lazily, so the caller can stop early without paying for the rest.
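
As a sketch of the position-tracking idea, Lua patterns support the empty capture (), which yields the current byte position as a number. Placing one on each side of the word gives the start and the position just past the end. The field names below are made up for the example.

```lua
local s = "one two  three"
local found = {}

for start_pos, word, next_pos in string.gmatch(s, "()(%a+)()") do
  -- next_pos is the byte after the match, so the last byte is next_pos - 1
  found[#found + 1] = { word = word, first = start_pos, last = next_pos - 1 }
end

for _, f in ipairs(found) do
  print(f.word, f.first, f.last)
end
--> one     1    3
--> two     5    7
--> three   10   14
```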

Pattern that never matches

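A pattern that never fires is a no-op, not a bug. The iterator yields nothing on its first call and the loop body never runs. This is the behaviour you want for “is there at least one?” checks, written as if string.gmatch(s, pattern)() then ... end, although string.find(s, pattern) answers the same question without creating an iterator. Do not pass the iterator to next, which expects a table and raises an error when given a function.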

for _ in string.gmatch("hello", "%d+") do
  print("never runs")
end
-- nothing happens, no error
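
An "is there at least one match?" check can be wrapped in a small helper. The function name has_match is hypothetical; this is a sketch, not a library function.

```lua
-- Hypothetical helper: true if pattern matches anywhere in s.
local function has_match(s, pattern)
  -- Call the fresh iterator once; any first result other than nil is a match.
  return string.gmatch(s, pattern)() ~= nil
end

print(has_match("hello", "%d+"))    --> false
print(has_match("hello42", "%d+"))  --> true
```

For patterns that do not start with ^, string.find(s, pattern) ~= nil gives the same answer.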

Two iterators over the same string

Iterators are independent, so for parallel work call string.gmatch twice. The example below alternates a word iterator and a digit iterator over the same source, advancing each at its own pace:

local s = "a1 b2 c3"
local words = string.gmatch(s, "%a+")
local digits = string.gmatch(s, "%d+")
print(words(), digits(), words(), digits())  --> a   1   b   2

Common mistakes

Forgetting parentheses around captures. This is the canonical bug. Without parentheses the pattern has no captures, so the iterator returns the whole match as a single string, and the second loop variable is always nil. The fix is to wrap each piece you want to extract:

local s = "from=world, to=Lua"

-- wrong: no captures, so v is always nil
for k, v in string.gmatch(s, "%w+=%w+") do
  print(k, v)  -- k = "from=world", v = nil
end

-- right: two captures, two loop variables
for k, v in string.gmatch(s, "(%w+)=(%w+)") do
  print(k, v)  -- k = "from", v = "world"
end

Wrong number of loop variables. With two captures you want exactly two loop variables. An extra loop variable is set to nil; a capture with no variable to receive it is silently dropped. If the count varies at runtime, call the iterator directly and gather the results with table.pack(it()).
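
When the number of captures is not known in advance, one option is to drive the iterator by hand and pack each result. The sketch below assumes table.pack is available (Lua 5.2 and later); collect_captures is a hypothetical helper name.

```lua
-- Hypothetical helper: returns one packed table per match.
local function collect_captures(s, pattern)
  local rows = {}
  local it = string.gmatch(s, pattern)
  while true do
    local caps = table.pack(it())  -- gathers however many values came back
    if caps[1] == nil then break end  -- iterator exhausted
    rows[#rows + 1] = caps
  end
  return rows
end

local rows = collect_captures("a=1 b=2", "(%a)=(%d)")
print(#rows, rows[1].n, rows[2][1], rows[2][2])  --> 2   2   b   2
```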

Expecting ^ to anchor. It does not, and that is intentional. For an anchored first match, use string.match or string.find with ^; the init parameter only controls where the search starts.

Reusing one iterator across two loops. A closure is stateful, and the position cursor only goes forward. Call string.gmatch again to get a fresh iterator:

local it = string.gmatch("a b c", "%a+")
for w in it do print("first:", w) end  -- a, b, c
for w in it do print("second:", w) end  -- never runs

init is byte-based and 1-based. 1 is the first byte. -1 is the last byte. There is no end-position parameter; if you need a window, encode it in the pattern or take a substring before calling gmatch.

Lua 5.1 and LuaJIT compatibility. Drop init if your code might run on Lua 5.1, 5.2, 5.3, or stock LuaJIT. Those runtimes silently ignore the extra third argument, so the search starts at byte 1 with no error to warn you. The portable form is to slice the input yourself:

-- portable: skip the "v=" prefix with string.sub
local s = "v=1.0.0 name=demo build=42"
for k, v in string.gmatch(string.sub(s, 3), "(%w+)=([%w%.]+)") do
  print(k, v)
end
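
If the slicing idiom shows up in several places, it could be wrapped in a helper. The name gmatch_from is hypothetical, and this is only a sketch of the approach under the assumption that init follows string.sub's indexing rules.

```lua
-- Hypothetical helper: emulates the init argument on any Lua version
-- by slicing the subject before handing it to string.gmatch.
local function gmatch_from(s, pattern, init)
  return string.gmatch(string.sub(s, init or 1), pattern)
end

local out = {}
for k, v in gmatch_from("v=1.0.0 name=demo build=42", "(%w+)=([%w%.]+)", 3) do
  out[#out + 1] = k .. "=" .. v
end
-- out == { "name=demo", "build=42" }
```

One caveat with this design: position captures are reported relative to the slice, not to the original string.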

Reaching for gmatch when string.match would do. If you only want the first match (or the first set of captures), string.match is cheaper because it does not set up a closure. string.gmatch is for the “iterate everything” case.

See also