Rate Limiting with Lua and Shared Dicts

March 23, 2026 · 11 min read · Updated May 17, 2026 · intermediate

Tags: openresty, rate-limiting, shared-dict, nginx, lua

Introduction

Rate limiting is one of the first things you reach for when you need to protect an API from abuse, accidental overload, or malicious traffic. In OpenResty, ngx.shared.DICT gives you a shared memory zone where Lua code can store counters, tokens, and other state across all worker processes. This article walks through how to use it for rate limiting, starting with the basics of shared dict configuration, then building two hand-written patterns of increasing sophistication, and finishing with the production-standard library that handles the tricky parts for you.

By the end you will know how to configure shared dicts, implement a fixed window counter, implement a token bucket, and integrate lua-resty-limit-traffic into your access phase.

Configuring Shared Dicts in nginx.conf

Shared dicts are declared in the http block of your nginx.conf. They reserve a fixed amount of memory that all nginx workers can read and write to.

http {
    lua_shared_dict rate_limit 10m;
    lua_shared_dict token_bucket 10m;

    server {
        listen 80;
        location /api {
            content_by_lua_block {
                -- rate limiting logic goes here
            }
        }
    }
}

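The lua_shared_dict directive takes two arguments: a name and a size. The name becomes available as ngx.shared.<name> inside your Lua code. The size 10m means 10 megabytes. If you reference a dict that was never declared, nginx still starts normally: ngx.shared.<name> simply evaluates to nil, and the first method call on it raises a Lua error at request time. Check for nil when you look the dict up, so that a typo in the name shows up as a clear log message rather than a generic 500.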

Memory for the zone is reserved when nginx starts, before the worker processes are forked, not lazily. Setting an enormous size ties up memory you may never use, but a fixed zone also gives predictable performance, because the zone never has to grow under load.
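Looking up a name that was never declared yields nil at runtime, so a small guard can turn a typo into a readable error. A minimal sketch, assuming a hypothetical helper named get_dict that takes the ngx.shared table as a parameter so it has no hidden dependencies:

```lua
-- Hypothetical helper: look up a shared dict by name and fail loudly
-- if it was never declared with lua_shared_dict.
local function get_dict(shared, name)
    local dict = shared[name]
    if not dict then
        return nil, "lua_shared_dict '" .. name .. "' is not declared in nginx.conf"
    end
    return dict
end
```

In a handler you would call it as get_dict(ngx.shared, "rate_limit"), log the error, and return a 500 if the first result is nil.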

The ngx.shared.DICT API

Every shared dict exposes the same set of methods. The ones most relevant to rate limiting are get, set, incr, add, and replace. Each call is atomic at the C level, so a single call cannot be corrupted by another worker.

ngx.shared.mydict:get(key) returns the value or nil if the key does not exist.

ngx.shared.mydict:set(key, value, exptime) stores a value. The optional third argument is a TTL in seconds. Unlike add, this always overwrites.

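ngx.shared.mydict:incr(key, delta, init, init_ttl) atomically adds delta to the stored number and returns the new value. If the key does not exist and no init value is given, incr returns nil and the error "not found". With init, a missing key is created with the value init + delta in the same atomic step, and the optional init_ttl sets the expiry of a key created this way (init_ttl needs a reasonably recent OpenResty release). incr only works on numeric values.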

ngx.shared.mydict:add(key, value, exptime) sets the value only if the key is absent. This is useful for initializing a counter on the first request.

ngx.shared.mydict:replace(key, value, exptime) only updates a key that already exists.

ngx.shared.mydict:flush_all() marks every key as expired but does not actually free the memory they occupy. It is useful for resetting state during testing.

ngx.shared.mydict:expire(key, exptime) sets or resets the TTL on an existing key.
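To make the difference between add and set concrete, here is a small sketch. The helper name first_seen is hypothetical, and the dict is passed in as a parameter (in a handler it would be something like ngx.shared.rate_limit).

```lua
-- Hypothetical helper: remember when a client was first seen.
-- add() only writes if the key is absent, so the first writer wins even
-- when several workers race; later callers read the stored value back.
local function first_seen(dict, client_id, now, ttl)
    local ok, err = dict:add(client_id, now, ttl)
    if ok then
        return now              -- we created the entry
    end
    if err ~= "exists" then
        return nil, err         -- for example "no memory"
    end
    -- Someone else created it first. This can still return nil if the
    -- entry expired between the add and the get.
    return dict:get(client_id)
end
```

With set in place of add, every call would overwrite the timestamp and the "first seen" value would drift forward on each request.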

Pattern A: Fixed Window Counter

The fixed window algorithm divides time into consecutive windows of fixed length — say, one minute — and counts how many requests each client makes inside the current window. When the window expires, the counter resets.

This pattern is straightforward and works well for single-server deployments. The trade-off is a boundary burst issue: a client can make the full quota of requests at the end of one window and immediately make another full quota at the start of the next, effectively doubling the rate for a brief moment.
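The boundary burst is easy to see in a small simulation. The sketch below is plain Lua with no nginx involved; the function name simulate and the chosen timestamps are only for illustration.

```lua
-- Pure-Lua simulation of a fixed window counter (no nginx needed).
-- `limit` requests are allowed per `window` seconds.
local function simulate(timestamps, limit, window)
    local counts, allowed = {}, {}
    for _, t in ipairs(timestamps) do
        local w = math.floor(t / window)      -- index of the window t falls in
        counts[w] = (counts[w] or 0) + 1
        if counts[w] <= limit then
            allowed[#allowed + 1] = t
        end
    end
    return allowed
end

-- 20 requests at t = 59 s and 20 more at t = 61 s
local ts = {}
for _ = 1, 20 do ts[#ts + 1] = 59 end
for _ = 1, 20 do ts[#ts + 1] = 61 end

local allowed = simulate(ts, 20, 60)
print(#allowed)  -- 40: every request passes, although all arrive within 2 seconds
```

The limit is nominally 20 per minute, yet 40 requests get through in a two-second span because they straddle the boundary between two windows.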

Here is a complete, self-contained limiter module.

local _M = {}

local RATE_LIMIT = 20          -- max requests per window
local WINDOW_SIZE = 60         -- window length in seconds

-- Returns true if the request is allowed;
-- false, message, retry_after if the client is over the limit;
-- nil, err if the shared dict operation failed.
function _M.check(client_id)
    local dict = ngx.shared.rate_limit
    local now = ngx.now()
    local window_start = math.floor(now / WINDOW_SIZE) * WINDOW_SIZE
    local key = client_id .. ":" .. tostring(window_start)

    -- Count this request in one atomic step. The init argument (0) creates
    -- the key if it is missing, and init_ttl expires it with the window.
    -- init_ttl needs a reasonably recent OpenResty release.
    local count, err = dict:incr(key, 1, 0, WINDOW_SIZE)
    if not count then
        return nil, err
    end

    if count > RATE_LIMIT then
        -- Whole seconds until the next window, suitable for Retry-After
        local retry_after = math.ceil(window_start + WINDOW_SIZE - now)
        return false, "rate limit exceeded", retry_after
    end

    return true
end

return _M

The check function builds a key from the client identifier and the start of the current window. A single incr call both creates the counter, when this is the first request in the window, and increments it, so there is no gap between checking and counting in which another worker could slip in. Once the count exceeds the limit, check returns false along with a retry_after value, rounded up to whole seconds, that the caller can use in an HTTP header.

To use this module, drop it as rate_limiter.lua somewhere nginx can see it — typically alongside your other Lua files — and require it in your access phase.

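local rate_limiter = require "rate_limiter"

local client_id = ngx.var.remote_addr  -- or a token from an Authorization header
local ok, err, retry_after = rate_limiter.check(client_id)

if ok == nil then
    -- Shared dict failure: log it and let the request through (fail open)
    ngx.log(ngx.ERR, "rate limiter error: ", err)
elseif not ok then
    -- Retry-After must be a whole number of seconds
    ngx.header["Retry-After"] = math.ceil(retry_after)
    return ngx.exit(429)
end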

ngx.exit(429) stops processing and returns the status code to the client.

This pattern works correctly under single-server load because each incr call is atomic. The boundary burst remains the main weakness. For internal tools and low-traffic endpoints, it is usually fine.

Pattern B: Token Bucket

The token bucket algorithm refills a bucket with tokens at a constant rate. Each request consumes one token. If the bucket is empty, the request is rejected. This approach absorbs bursts gracefully and gives smoother traffic shaping than fixed windows.

Implementing token bucket in pure Lua requires storing two pieces of state per bucket: the current token count and the last refill timestamp.

local _M = {}
local cjson = require "cjson"  -- for serializing bucket state

local dict_name = "token_bucket"
local capacity = 10        -- max tokens in the bucket
local refill_rate = 1      -- tokens added per second

function _M.allow(client_id)
    local dict = ngx.shared[dict_name]
    local now = ngx.now()

    local key = client_id
    local bucket = dict:get(key)

    if bucket then
        bucket = cjson.decode(bucket)
    else
        -- New bucket: start full
        bucket = {tokens = capacity, last = now}
    end

    -- Refill based on elapsed time
    local elapsed = now - bucket.last
    local new_tokens = elapsed * refill_rate
    bucket.tokens = math.min(capacity, bucket.tokens + new_tokens)
    bucket.last = now

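    -- Idle buckets can expire: after capacity / refill_rate seconds without
    -- traffic the bucket would be full again, which is how a new one starts
    local ttl = math.ceil(capacity / refill_rate)

    local allowed = bucket.tokens >= 1
    if allowed then
        bucket.tokens = bucket.tokens - 1
    end

    -- Persist the updated bucket whether or not the request was allowed
    dict:set(key, cjson.encode(bucket), ttl)
    return allowed
end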

return _M

The bucket state is stored as JSON in the shared dict. On each allow call, we calculate how many tokens should have been added since the last check and add them (capped at capacity). If at least one token is available, we consume it and store the updated state.
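The refill arithmetic is easier to check in isolation. Here is the same calculation as a pure function, written as a sketch: the function name refill is hypothetical, and the guard against a negative elapsed time is an extra precaution that the module above does not have.

```lua
-- Refill step of a token bucket as a pure function.
local function refill(tokens, last, now, capacity, refill_rate)
    -- Extra guard (not in the module above): ignore a clock that moved backwards
    local elapsed = math.max(0, now - last)
    return math.min(capacity, tokens + elapsed * refill_rate)
end

print(refill(0, 100.0, 103.5, 10, 1))   -- 3.5: empty bucket, 3.5 s elapsed at 1 token/s
print(refill(8, 100.0, 160.0, 10, 1))   -- 10: capped at capacity
```

Token counts are fractional on purpose. A bucket holding 0.5 tokens rejects the request now but will reach 1 after another half second at this rate.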

There are two important caveats with this approach. First, the read-modify-write sequence — get, decode, modify, set — is not atomic as a whole. Between the get and the set, another worker can modify the same key, so the token count is approximate rather than exact. For rate limiting, the worst case is a few extra tokens being allowed, but it does mean you cannot rely on this for billing or strict enforcement. Second, the JSON encode and decode on every request adds a small CPU overhead. For high-throughput endpoints, the next pattern is more efficient.

Pattern C: lua-resty-limit-traffic

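The lua-resty-limit-traffic library ships with OpenResty and is the standard solution for production use. It provides three limiter types:

resty.limit.req: a leaky bucket that limits the average request rate and allows short bursts
resty.limit.count: a fixed window counter built on atomic shared dict operations
resty.limit.conn: limits the number of requests being processed concurrently

All three keep their state in a shared dict, so limits are enforced per nginx instance, across all of its workers. The library has no built-in Redis or other external backend; sharing limits across several servers means keeping the counters in an external store yourself.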

Setting up the library

http {
    lua_shared_dict limit_req_store 10m;
    lua_shared_dict limit_count_store 10m;

    server {
        listen 80;

        access_by_lua_block {
            local limit_req = require "resty.limit.req"

            -- 10 requests per second with a burst of 20
            local lim, err = limit_req.new(
                "limit_req_store", 10, 20
            )
            if not lim then
                ngx.log(ngx.ERR, "limit_req.new failed: ", err)
                ngx.exit(500)
            end

            local key = ngx.var.remote_addr
            local delay, err = lim:incoming(key, true)

            if not delay then
                if err == "rejected" then
                    ngx.exit(429)
                end
                ngx.log(ngx.ERR, "limit_req error: ", err)
                ngx.exit(500)
            end

            if delay >= 0.001 then
                ngx.sleep(delay)
            end
        }

        # content_by_lua_block is only valid inside a location
        location / {
            content_by_lua_block {
                ngx.say("OK")
            }
        }
    }
}

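lim:incoming(key, true) checks the request against the limiter. The second argument is the commit flag: true records the request in the shared dict, while false performs a dry run that leaves the stored state untouched. On success, the first return value, delay, is the number of seconds to wait before processing the request so that the client stays at the configured rate, and the second return value is the current excess above that rate (the example stores it in err). A delay of 0 means the request can proceed immediately. If the request would exceed the rate plus the burst allowance, delay is nil and err is "rejected".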
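To build intuition for where the delay comes from, here is a simplified pure-Lua model of a leaky bucket. It is a sketch of the general technique, not the library's code; the function name leaky_incoming and the plain-number bookkeeping are assumptions made for illustration.

```lua
-- Simplified leaky-bucket model (illustration only, not the library's code).
-- `excess` is how many requests the client is ahead of the allowed rate.
local function leaky_incoming(state, now, rate, burst)
    local excess = 0
    if state.last then
        -- Drain at `rate` per second since the last request, then add this one
        excess = math.max(state.excess - rate * (now - state.last) + 1, 0)
    end
    if excess > burst then
        return nil, "rejected"      -- state is left unchanged
    end
    state.excess, state.last = excess, now
    return excess / rate            -- seconds to wait to stay at `rate`
end

-- Five requests arriving at the same instant, rate = 10/s, burst = 20
local state = {}
for i = 1, 5 do
    print(i, leaky_incoming(state, 0, 10, 20))
end
```

In this model the first request gets a delay of 0 and each further simultaneous request waits about a tenth of a second longer than the one before it, which spreads the burst out to 10 requests per second. Once the excess would pass the burst allowance of 20, requests are rejected instead of delayed.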

ngx.sleep(delay) yields control to nginx for the specified duration, making this a non-blocking operation. The client experiences slower responses rather than hard errors for moderate overages.
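The fixed window limiter, resty.limit.count, follows the same calling pattern. The sketch below assumes a lua_shared_dict named limit_count_store, as declared in the configuration above, and a hypothetical wrapper function named enforce_count_limit; the X-RateLimit-Remaining header is a common convention rather than something the library sets for you.

```lua
-- Sketch: fixed-window limiting with resty.limit.count.
-- Meant to be called from access_by_lua_block.
local function enforce_count_limit(key)
    local limit_count = require "resty.limit.count"

    -- 100 requests per 60-second window
    local lim, err = limit_count.new("limit_count_store", 100, 60)
    if not lim then
        ngx.log(ngx.ERR, "limit_count.new failed: ", err)
        return ngx.exit(500)
    end

    local delay, err = lim:incoming(key, true)
    if not delay then
        if err == "rejected" then
            ngx.header["X-RateLimit-Remaining"] = 0
            return ngx.exit(429)
        end
        ngx.log(ngx.ERR, "limit_count error: ", err)
        return ngx.exit(500)
    end

    -- On success the second return value is the remaining quota
    ngx.header["X-RateLimit-Remaining"] = err
end
```

Unlike resty.limit.req, this limiter never asks you to sleep: a request is either inside the quota or rejected.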

Combining limiters

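The resty.limit.traffic module applies several limiters to one request in a single call.

access_by_lua_block {
    local limit_req = require "resty.limit.req"
    local limit_count = require "resty.limit.count"
    local limit_traffic = require "resty.limit.traffic"

    local lim1, err1 = limit_req.new("limit_req_store", 10, 20)
    local lim2, err2 = limit_count.new("limit_count_store", 100, 60)
    if not lim1 or not lim2 then
        ngx.log(ngx.ERR, "failed to create limiters: ", err1 or err2)
        return ngx.exit(500)
    end

    -- One key per limiter, in the same order as the limiters
    local key = ngx.var.remote_addr
    local limiters = {lim1, lim2}
    local keys = {key, key}
    local states = {}

    local delay, err = limit_traffic.combine(limiters, keys, states)
    if not delay then
        if err == "rejected" then
            return ngx.exit(429)
        end
        ngx.log(ngx.ERR, "failed to combine limiters: ", err)
        return ngx.exit(500)
    end

    if delay >= 0.001 then
        ngx.sleep(delay)
    end
}

The combine function takes an array of limiters and a matching array of keys, checks the request against each limiter, and returns the largest delay any of them asked for. If any limiter rejects the request, combine undoes whatever it had already committed on the others and returns nil with "rejected". The optional states table receives each limiter's second return value, such as the current excess for resty.limit.req.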

Distributed rate limiting

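For multi-server deployments, per-instance shared dicts are not enough, because each server only sees its own share of the traffic. lua-resty-limit-traffic has no Redis backend, so the usual approach is to keep the counters in Redis yourself with lua-resty-redis, which also ships with OpenResty. A minimal fixed window version for the access phase looks like this.

local redis = require "resty.redis"

local LIMIT, WINDOW = 100, 60   -- 100 requests per 60-second window

local red = redis:new()
red:set_timeout(100)            -- milliseconds
local ok, err = red:connect("127.0.0.1", 6379)
if not ok then
    ngx.log(ngx.ERR, "redis connect failed: ", err)
    return                      -- fail open: do not block traffic if Redis is down
end

local key = "rl:" .. ngx.var.remote_addr .. ":" .. math.floor(ngx.now() / WINDOW)
local count, incr_err = red:incr(key)
if count == 1 then
    red:expire(key, WINDOW)     -- first hit in this window: set the expiry
end
red:set_keepalive(10000, 100)   -- return the connection to the pool

if not count then
    ngx.log(ngx.ERR, "redis incr failed: ", incr_err)
elseif count > LIMIT then
    return ngx.exit(429)
end

Redis executes INCR atomically, so every OpenResty instance sees the same count. The trade-off is a network round trip on each request, and you need to decide what happens when Redis is unreachable; the code above fails open and lets the request through.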

Common Pitfalls

Even when using well-tested libraries, there are several issues that surface in production.

Memory exhaustion. Shared dicts have a fixed size. When the dict fills up, OpenResty evicts entries using an LRU policy, which means rate limits can silently stop working under memory pressure. Monitor usage with dict:capacity() and dict:free_space(), and size your dicts generously. For a simple per-IP counter, each entry uses roughly 50–100 bytes, so a 5 MB dict holds 50,000–100,000 unique IPs comfortably.
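A rough usage check can be built from the two methods mentioned above. This is a sketch with a hypothetical helper name, dict_usage, and the dict passed in as a parameter.

```lua
-- Hypothetical helper: report how full a shared dict is.
-- free_space() only counts completely free pages, so treat the result
-- as a rough estimate rather than an exact figure.
local function dict_usage(dict)
    local capacity = dict:capacity()
    local free = dict:free_space()
    return {
        capacity_bytes = capacity,
        free_bytes = free,
        used_ratio = (capacity - free) / capacity,
    }
end
```

Calling this from a periodic timer and logging a warning when used_ratio passes a threshold you choose, say 0.8, gives early notice before eviction starts. The set method also reports eviction directly: its third return value, forcible, is true when storing the new value pushed out other entries that had not yet expired.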

Atomicity across multiple calls. Each individual ngx.shared.DICT method is atomic, but sequences like get then set are not. If you need atomic read-modify-write, use incr wherever possible and accept its numbers-only constraint. If you need something more complex, consider the lua-resty-lock library (resty.lock), which ships with OpenResty, or move to Redis with its native atomic commands.
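When incr is not expressive enough, a lock around the read-modify-write is one option. The sketch below uses lua-resty-lock and assumes an extra declaration, lua_shared_dict locks 1m;, in nginx.conf. The dict name and the helper name with_lock are assumptions for illustration.

```lua
-- Sketch: serialize a read-modify-write on one key with lua-resty-lock.
local function with_lock(key, fn)
    local resty_lock = require "resty.lock"

    local lock, err = resty_lock:new("locks")
    if not lock then
        return nil, "failed to create lock: " .. err
    end

    local elapsed, lock_err = lock:lock("lock:" .. key)
    if not elapsed then
        return nil, "failed to acquire lock: " .. lock_err
    end

    -- pcall so the lock is released even if fn raises an error
    local ok, result = pcall(fn)

    local unlocked, unlock_err = lock:unlock()
    if not unlocked then
        return nil, "failed to unlock: " .. unlock_err
    end

    if not ok then
        return nil, result
    end
    return result
end
```

A caller would wrap the whole get, decode, modify, set sequence in the function passed as fn. Waiting for the lock yields, so this only works in phases that allow yielding (such as the access phase), and it adds latency under contention, which is why incr is the better tool whenever a plain counter will do.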

TTL and lazy expiration. When you set a TTL with exptime, the key is not removed the moment it expires. OpenResty reclaims expired keys lazily, when the key is next accessed or when a later write needs the space, so expired keys can keep occupying memory in the meantime. Call dict:flush_expired() periodically from a timer if this is a concern.

local dict = ngx.shared.rate_limit
local freed = dict:flush_expired(100)  -- flush up to 100 entries
ngx.log(ngx.INFO, "flushed ", freed, " expired entries")

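Create the timer in the init_worker_by_lua phase, because timers cannot be started from init_by_lua. The dict is shared between workers, so running the sweep from a single worker is enough.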
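A recurring timer for this could look like the sketch below. Timers need a worker context, so the function is meant to be called from init_worker_by_lua_block; the function name start_sweeper and its parameters are illustrative.

```lua
-- Sketch: sweep expired entries from a shared dict at a fixed interval.
local function start_sweeper(dict_name, interval, batch)
    -- The dict is shared, so one worker doing the sweep is enough
    if ngx.worker.id() ~= 0 then
        return true
    end

    local function sweep(premature)
        if premature then
            return  -- nginx is shutting down or reloading
        end
        local dict = ngx.shared[dict_name]
        local freed = dict:flush_expired(batch)
        if freed > 0 then
            ngx.log(ngx.INFO, "flushed ", freed, " expired entries from ", dict_name)
        end
    end

    return ngx.timer.every(interval, sweep)
end
```

Calling start_sweeper("rate_limit", 30, 100) from init_worker_by_lua_block would flush up to 100 expired entries every 30 seconds; both numbers are starting points to tune, not recommendations.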

Key design. Keep your rate limit keys short and consistent. A pattern like "rl:" .. client_id .. ":" .. window works, but "rl:" can be omitted if the dict is dedicated to rate limiting. Avoid embedding large values like full user agents or query strings into keys.
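One way to keep keys bounded is to hash identifiers that may be long. A small sketch, assuming a hypothetical make_key helper and an arbitrary 64-character threshold:

```lua
-- Hypothetical key builder: short, predictable keys.
-- Long identifiers (for example bearer tokens) are hashed with ngx.md5 so
-- the key length stays bounded; short ones are used as-is.
local function make_key(client_id, window_start)
    if #client_id > 64 then
        client_id = ngx.md5(client_id)   -- 32 hexadecimal characters
    end
    return client_id .. ":" .. window_start
end
```

Hashing also keeps raw tokens out of the shared dict, at the cost of keys that are harder to read when debugging.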

Missing client identifiers. When nginx sits behind a load balancer or another proxy, $remote_addr is the proxy's address rather than the client's, and the real client IP arrives in an HTTP header such as X-Forwarded-For. Only trust that header when the request comes from a proxy you control, and check that the value is present and looks like an IP address before using it as a key. Otherwise clients can forge the header to dodge their own limit or to burn through someone else's.
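A sketch of that validation in plain Lua follows. The helper name client_key and the trusted_proxies set are hypothetical, and the function assumes your proxy appends the address it saw to the end of X-Forwarded-For, which is the common behaviour but worth confirming for your setup.

```lua
-- Hypothetical helper: choose the rate limit key for a request.
-- remote_addr:     the peer address nginx saw (ngx.var.remote_addr)
-- xff:             the X-Forwarded-For header value, or nil
-- trusted_proxies: set of proxy addresses you control, e.g. {["10.0.0.1"] = true}
local function client_key(remote_addr, xff, trusted_proxies)
    -- Ignore the header unless the request came through a trusted proxy.
    -- A non-string value (missing, or a table of repeated headers) is ignored too.
    if not trusted_proxies[remote_addr] or type(xff) ~= "string" then
        return remote_addr
    end

    -- Take the last entry: the one appended by the trusted proxy itself
    local last = xff:match("([^,]+)$")
    if not last then
        return remote_addr
    end
    last = last:match("^%s*(.-)%s*$")   -- trim surrounding whitespace

    -- Loose shape check: dotted IPv4, or hex digits and colons for IPv6
    local is_v4 = last:match("^%d+%.%d+%.%d+%.%d+$") ~= nil
    local is_v6 = last:find(":", 1, true) ~= nil and last:match("^[%x:%.]+$") ~= nil
    if is_v4 or is_v6 then
        return last
    end
    return remote_addr
end
```

If nginx is built with the realip module, configuring set_real_ip_from and real_ip_header achieves the same thing at the nginx level, and $remote_addr then already holds the client address.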

Summary

You now have three patterns for rate limiting with OpenResty and ngx.shared.DICT.

The fixed window counter is the simplest. It works well for single-server APIs where occasional boundary bursts are acceptable. The token bucket handles bursty traffic more gracefully, but the pure-Lua implementation is not fully atomic across the read-modify-write cycle — keep that in mind for strict enforcement scenarios.

For production services, lua-resty-limit-traffic is usually the right choice. It provides well-tested limiters, supports bursting and request delaying, and lets you combine several limits in one call. Its state lives in shared dicts, so limits that must hold across several servers need an external store such as Redis. Declare the dicts in nginx.conf and enforce the limits in access_by_lua, and the per-request cost stays at a handful of shared memory operations.

Start with the fixed window counter for prototyping, move to the token bucket when you need smoother traffic shaping, and reach for lua-resty-limit-traffic when you are ready to ship.