Implementing Sets in Lua
Lua ships with tables — versatile structures that act as arrays, maps, or dictionaries depending on how you use them. What it does not ship with is a built-in Set type. Fortunately, tables are flexible enough that you can build one yourself in a few lines of idiomatic Lua. In this tutorial, you’ll implement sets using tables, learn the standard operations, avoid the common pairs() ordering trap, and cap it off with a read-only proxy pattern for cases where mutation just shouldn’t happen.
What Is a Set?
A set is a collection of unique values where order doesn’t matter and membership is the primary operation. The classic question a set answers is: “Is X in this collection?” — answerable in constant time, not by scanning the whole thing.
In Lua, the convention is to use a table where keys are members and every value is true. This is sometimes called the “true sentinel” pattern.
local fruits = { apple = true, banana = true, cherry = true }
```lua
Why keys rather than values? If you stored members as values in an array-style table, a membership check would require scanning the whole table — O(n). By using keys, Lua's hash table lookup gives you O(1) membership checks. The `true` value is just a convention; any truthy value works, but `true` is the clearest choice.
You can also build a set incrementally:
```lua
local primes = {}
primes[2] = true
primes[3] = true
primes[5] = true
primes[7] = true
-- primes is now the set {2, 3, 5, 7}
```lua
This is the foundation everything else in this tutorial builds on.
## Basic Operations
Once you understand the sentinel pattern, the basic operations fall out naturally.
### Adding an element
```lua
set[key] = true
```lua
Because setting a key that already exists overwrites the value with `true`, adding an element to a set is idempotent — inserting the same value twice has no extra effect.
### Removing an element
```lua
set[key] = nil
```lua
Setting a key to `nil` deletes it from the table entirely. This is idiomatic Lua cleanup. There's no `remove` method needed, no special API call — just assign `nil`.
### Checking membership
```lua
if set[key] then
print("key is in the set")
end
```lua
Since `nil` is falsy and `true` is truthy, a direct lookup works as a membership test. No comparison against a special value needed.
Here are those three operations wrapped in small helper functions to make your intent explicit:
```lua
local function add(set, key)
set[key] = true
end
local function remove(set, key)
set[key] = nil
end
local function contains(set, key)
return set[key] ~= nil
end
local vowels = {}
add(vowels, "a")
add(vowels, "e")
add(vowels, "i")
add(vowels, "o")
add(vowels, "u")
print(contains(vowels, "a")) --> true
print(contains(vowels, "z")) --> false
remove(vowels, "e")
print(contains(vowels, "e")) --> false
```lua
## Set Operations
You'll often need to combine or compare sets. The standard operations — union, intersection, difference, and subset check — are all implementable with a few lines of Lua.
### Union — elements in either set (or both)
```lua
local function union(a, b)
local result = {}
for k in pairs(a) do result[k] = true end
for k in pairs(b) do result[k] = true end
return result
end
local set_a = { x = true, y = true }
local set_b = { y = true, z = true }
local set_u = union(set_a, set_b)
-- set_u = { x = true, y = true, z = true }
```lua
### Intersection — elements in both sets
```lua
local function intersection(a, b)
local result = {}
for k in pairs(a) do
if b[k] then result[k] = true end
end
return result
end
local set_a = { x = true, y = true }
local set_b = { y = true, z = true }
local set_i = intersection(set_a, set_b)
-- set_i = { y = true }
```lua
### Difference — elements in A but not in B
```lua
local function difference(a, b)
local result = {}
for k in pairs(a) do
if not b[k] then result[k] = true end
end
return result
end
local set_a = { x = true, y = true }
local set_b = { y = true, z = true }
local set_d = difference(set_a, set_b)
-- set_d = { x = true }
```lua
### Subset check
```lua
local function is_subset(sub, super)
for k in pairs(sub) do
if not super[k] then return false end
end
return true
end
local small = { a = true, b = true }
local large = { a = true, b = true, c = true, d = true }
print(is_subset(small, large)) --> true
print(is_subset(large, small)) --> false
```lua
All of these iterate with `pairs()`, which brings us to a critical gotcha.
## Iteration with `pairs()` — The Ordering Trap
`pairs()` iterates over all key-value pairs in a table. For sets, the values are always `true`, so `pairs()` gives you the keys — which is what you want. The problem is **iteration order**.
```lua
local vowels = { a = true, e = true, i = true, o = true, u = true }
for letter in pairs(vowels) do
print(letter)
end
```lua
This will print the vowels, but not in alphabetical order. The order is implementation-defined — it might be insertion order today, hash bucket order tomorrow, and different again in a different Lua version. You cannot rely on it.
If order matters for your use case, you need to sort explicitly:
```lua
local sorted = {}
for k in pairs(vowels) do table.insert(sorted, k) end
table.sort(sorted)
for _, v in ipairs(sorted) do print(v) end
-- a, e, i, o, u — guaranteed sorted
```lua
Note that `ipairs()` only works with sequential integer-keyed tables. It cannot iterate over a set's string keys. Once you've sorted the keys into an array, `ipairs()` handles the ordered traversal correctly.
This is one of the most common sources of surprising behaviour when people first use tables-as-sets in Lua. The rule of thumb: if you need ordered output, sort before iterating.
## Sets vs Arrays — When to Use Which
Sets and arrays (which in Lua are just tables with sequential integer keys) serve different purposes. Here's a quick decision guide:
| Scenario | Best choice |
|---|---|
| Checking if a value exists | **Set** — O(1) lookup |
| Traversing values in insertion order | **Array** — `ipairs()` |
| Deduplicating a list of values | **Set** — keys are naturally unique |
| Collecting unique items | **Set** — inserting the same key twice is a no-op |
| Storing an ordered sequence | **Array** — integer keys preserve order |
| Fast edge lookup in a graph | **Set** per node |
| Tracking tags or flags | **Set** — `tags["urgent"] = true` |
The telltale sign that you should switch from an array to a set: if you find yourself doing this:
```lua
-- Array with a manual linear scan for membership
local items = { "apple", "banana", "cherry" }
local function contains(arr, val)
for _, v in ipairs(arr) do
if v == val then return true end
end
return false
end
```lua
Replace it with a set and the lookup becomes a single hash access:
```lua
local items = { apple = true, banana = true, cherry = true }
if items["banana"] then
-- O(1) — done
end
```lua
## Read-Only Sets with Metatables
Sometimes you want a set that's fixed after creation — a constant set that nothing can modify. This is common when modelling things like operators, keyword sets in a lexer, or valid configuration values.
Lua's metatable mechanism lets you create a **proxy** around a table that intercepts writes and throws an error.
```lua
local function readonly_set(source)
local proxy = {}
local mt = {
__index = function(_, k) return source[k] end,
__newindex = function(_, _, _)
error("cannot modify a read-only set", 2)
end
}
setmetatable(proxy, mt)
return proxy
end
local operators = readonly_set({
["+"] = true, ["-"] = true,
["*"] = true, ["/"] = true
})
print(operators["+"]) --> true
print(operators["%"]) --> nil (not in the set)
operators["+"] = nil --> error: cannot modify a read-only set
operators["^"] = true --> error: cannot modify a read-only set
```lua
Here's how it works:
- `__index` forwards any read to the underlying source table, so the proxy behaves exactly like the original set.
- `__newindex` fires whenever code tries to assign to any key — including `proxy["+"] = nil`. It throws an error with a two-level stack trace (`2`) so the error points to the caller's line, not the metamethod itself.
- The source table lives in the closure and is never exposed directly, so there's no way for external code to bypass the proxy.
One thing to be aware of: if any other code holds a direct reference to the underlying `source` table, it can still modify it. The proxy only guards access through the proxy itself. Keep the source table private.
The practical payoff for this pattern is clean, safe constant sets:
```lua
local KEYWORDS = readonly_set({
["and"] = true, ["break"] = true, ["do"] = true,
["else"] = true, ["elseif"] = true, ["end"] = true,
["false"] = true, ["for"] = true, ["function"] = true,
["if"] = true, ["in"] = true, ["local"] = true,
["nil"] = true, ["not"] = true, ["or"] = true,
["repeat"] = true, ["return"] = true, ["then"] = true,
["true"] = true, ["until"] = true, ["while"] = true
})
local function is_keyword(word)
return KEYWORDS[word] ~= nil
end
```lua
This is the pattern you'll see in lexers and parsers across the Lua ecosystem.
## Practical Examples
Let's look at three scenarios where sets genuinely earn their keep.
### Deduplication
If you receive a list of values and need only the unique ones:
```lua
local input = { "apple", "banana", "apple", "cherry", "banana", "apple" }
local unique = {}
for _, v in ipairs(input) do
unique[v] = true
end
local result = {}
for k in pairs(unique) do
table.insert(result, k)
end
table.sort(result)
for _, v in ipairs(result) do
print(v)
end
-- apple, banana, cherry (sorted, deduplicated)
```lua
### Token classification in a lexer
When parsing text, you often need to quickly decide whether a character is a valid operator:
```lua
local operators = readonly_set({
["+"] = true, ["-"] = true, ["*"] = true, ["/"] = true,
["="] = true, ["<"] = true, [">"] = true,
["=="] = true, ["~="] = true, ["<="] = true, [">="] = true
})
local function tokenize_operator(chars, i)
local two = chars[i] .. (chars[i + 1] or "")
if operators[two] then
return two, i + 2
end
local one = chars[i]
if operators[one] then
return one, i + 1
end
return nil, i
end
-- Test it
print(tokenize_operator({"=", "="}, 1)) --> == 3
print(tokenize_operator({"+"}, 1)) --> + 2
print(tokenize_operator({"x"}, 1)) --> nil 1
```lua
### Graph node neighbours
In a graph represented as adjacency lists, storing each node's neighbours in a set gives fast edge existence checks:
```lua
local graph = {
A = { B = true, C = true },
B = { A = true, D = true },
C = { A = true, D = true },
D = { B = true, C = true }
}
local function has_edge(from, to)
return graph[from] and graph[from][to] == true
end
print(has_edge("A", "B")) --> true
print(has_edge("A", "D")) --> false
```lua
## Summary
Sets in Lua are tables used in a specific way: keys are members, values are `true`. This simple convention gives you O(1) membership testing, idiomatic `nil`-based deletion, and a clean foundation for implementing union, intersection, difference, and subset operations.
The key takeaways:
- Use `set[key] = true` to add, `set[key] = nil` to remove, and `if set[key]` to check membership.
- `pairs()` iteration order is not guaranteed — sort keys explicitly if order matters.
- Reach for a set whenever you're doing repeated membership checks on a collection.
- Use a metatable proxy when you need an immutable set.
If you're building data structures in Lua, you'll also find it useful to understand how [hash maps work in Lua](/tutorials/ds-hash-maps/) — since that's what your sets are built on — and how [arrays and lists](/tutorials/ds-arrays-and-lists/) handle ordered data that sets deliberately don't care about.
## See Also
- [Implementing Hash Maps in Lua](/tutorials/ds-hash-maps/)
- [Implementing Arrays and Lists in Lua](/tutorials/ds-arrays-and-lists/)