Working with Binary Data in Lua 5.3+

5 min read · Updated April 1, 2026 · intermediate
binary-data string-pack stdlib serialization lua-5.3

Binary data shows up everywhere: network protocols, file formats like PNG and WAV, and custom binary schemas for game assets or IPC messages. Lua 5.3 introduced a small but powerful trio of functions that make this tractable without reaching for C libraries or writing manual bit-shifting code. This guide covers string.pack, string.unpack, and string.packsize in depth.

The Pack/Unpack API

string.pack serializes values into a binary string according to a format string. string.unpack reads them back. The format string describes the byte layout of the data.

local bin = string.pack("i4", 42)
local value, pos = string.unpack("i4", bin)
print(value)  -- 42

string.unpack returns the unpacked values followed by the index of the first unread byte. That index is useful when you need to read several values sequentially from the same binary string:

local bin = string.pack("i1i1i1", 10, 20, 30)
local a, p1 = string.unpack("i1", bin)
local b, p2 = string.unpack("i1", bin, p1)
local c, p3 = string.unpack("i1", bin, p2)
-- a=10, b=20, c=30, p3=4
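
When the layout is known up front, the three separate reads above can be collapsed into one call, since a single format string may describe several values. The snippet below is a minimal sketch of that alternative, plus a read that starts from an explicit position.

```lua
local bin = string.pack("i1i1i1", 10, 20, 30)

-- One call returns all three values, then the next unread position.
local a, b, c, next_pos = string.unpack("i1i1i1", bin)
print(a, b, c, next_pos)  -- 10  20  30  4

-- Passing a start position skips the bytes before it.
local second, third = string.unpack("i1i1", bin, 2)
print(second, third)  -- 20  30
```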

string.packsize tells you the fixed byte length a format will produce, which is handy for pre-allocating buffers or calculating offsets:

local size = string.packsize("i4")     -- 4
local size = string.packsize("fdd")     -- 4 + 8 + 8 = 20

It errors on variable-length formats like s and z, since their size depends on the data.
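
As an illustration of offset calculation, the sketch below indexes into a blob of fixed-size records. The record layout (a 2-byte id and a 4-byte score) and the names RECORD_FMT and read_record are hypothetical, chosen only for this example.

```lua
-- Hypothetical record layout: 2-byte unsigned id, 4-byte signed score,
-- both little-endian, no alignment padding.
local RECORD_FMT = "<I2i4"
local RECORD_SIZE = string.packsize(RECORD_FMT)  -- 2 + 4 = 6

-- Return the id and score of the n-th record (1-based) in blob.
local function read_record(blob, n)
    local offset = (n - 1) * RECORD_SIZE + 1
    local id, score = string.unpack(RECORD_FMT, blob, offset)
    return id, score
end

local blob = string.pack(RECORD_FMT, 1, 100)
          .. string.pack(RECORD_FMT, 2, -250)
          .. string.pack(RECORD_FMT, 3, 75)

local record_count = #blob // RECORD_SIZE
local id, score = read_record(blob, 2)
print(record_count, id, score)  -- 3  2  -250
```

Because every record has the same size, any record can be reached directly without unpacking the ones before it.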

Integral Types

Lua supports a range of integer widths, from 1 to 16 bytes. The lowercase codes are signed; uppercase are unsigned.

| Code | Meaning | Size |
| --- | --- | --- |
| b / B | signed/unsigned char | 1 byte |
| h / H | signed/unsigned short | native short, typically 2 bytes |
| l / L | signed/unsigned long | native long, typically 4 or 8 bytes |
| j / J | signed/unsigned Lua integer | 8 bytes in a default build |
| in / In | signed/unsigned integer with n bytes (1–16) | n bytes |

local bin = string.pack("i1", 127)       -- 1-byte signed int
local bin = string.pack("I2", 1000)      -- 2-byte unsigned int
local bin = string.pack("i8", 1 << 40)   -- 8-byte signed int

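The j and J codes pack a full Lua integer, which is 8 bytes in a default build (Lua integers are 64-bit even on 32-bit platforms unless Lua was compiled with LUA_32BITS). Use them when you want native integer width without thinking about it; for data that leaves the process, an explicit width such as i8 is safer.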
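
The signed/unsigned distinction only affects how the bytes are interpreted, not how many are written. The following sketch shows one byte read back both ways, and a round trip through j using math.maxinteger.

```lua
-- The same byte reads differently depending on the format code.
local byte = string.pack("i1", -1)             -- one byte, 0xFF
local as_signed = string.unpack("i1", byte)    -- -1
local as_unsigned = string.unpack("I1", byte)  -- 255
print(as_signed, as_unsigned)

-- j matches the width of a Lua integer on the running build,
-- so the largest integer survives a round trip.
local native_width = string.packsize("j")
local round_trip = string.unpack("j", string.pack("j", math.maxinteger))
print(native_width, round_trip == math.maxinteger)
```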

Floating-Point Types

The f and d codes use the platform's native C float and double, which are IEEE 754 on essentially all current hardware:

| Code | Meaning | Size |
| --- | --- | --- |
| f | float (single precision) | 4 bytes |
| d | double (double precision) | 8 bytes |

local single = string.pack("f", 3.14159)    -- 4 bytes
local double = string.pack("d", 3.14159)    -- 8 bytes
local value = string.unpack("d", double)    -- unpack with the same code used to pack

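For cross-language communication with Python, C, and similar, f and d produce standard IEEE 754 bit patterns on any platform whose C float and double are IEEE 754. Lua itself does not guarantee this, but it holds on essentially every platform Lua runs on today. Byte order still follows the rules in Endianness and Alignment, and packing with f rounds the value to single precision.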
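
The difference between the two widths is easiest to see in a round trip. The sketch below assumes a default Lua build, where a Lua float is a C double.

```lua
local original = 3.14159

-- Round-trip the same number through both float widths.
local as_double = string.unpack("<d", string.pack("<d", original))
local as_float  = string.unpack("<f", string.pack("<f", original))

print(as_double == original)                 -- true: nothing is lost
print(as_float == original)                  -- false: rounded to single precision
print(math.abs(as_float - original) < 1e-6)  -- true: still close
```

Use d when the value must survive unchanged, and f only when the format requires it or the lost precision is acceptable.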

String Types

Three codes handle strings, each with different tradeoffs:

| Code | Stores | Use when |
| --- | --- | --- |
| sn | length prefix (n bytes) + string bytes | structured records with known length |
| z | bytes + null terminator | C-style null-terminated strings |
| cn | exactly n bytes; shorter strings are padded with \0 | fixed-width fields |

The s2 format stores the string length in a 2-byte field before the actual bytes. The length prefix makes it easy to read the string back without searching for a terminator:

local bin = string.pack("s2", "hello")
print(#bin)  -- 7 (2 bytes length + 5 bytes content)

The z format is simpler but requires scanning to find the end:

local bin = string.pack("z", "hello")
print(#bin)  -- 6 (5 bytes + null terminator)

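Fixed-width cn fields pad shorter strings with zero bytes so the field is always exactly n bytes. Longer strings are not truncated; string.pack raises an error instead. Unpacking returns the padding along with the content:

local bin = string.pack("c10", "ab")
-- "ab\0\0\0\0\0\0\0\0" -- 10 bytes: "ab" plus 8 nulls
local result = string.unpack("c10", bin)
-- result is all 10 bytes, padding included; the nulls are usually invisible when printed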
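
Since the padding comes back with the value, fixed-width fields usually need trimming after unpacking. The helper below, trim_nulls, is a hypothetical name for a minimal sketch of that step; the last lines show that an over-long value is rejected in current Lua 5.3 and 5.4 releases.

```lua
-- Hypothetical helper: drop trailing zero padding from a fixed-width field.
local function trim_nulls(field)
    -- The extra parentheses discard gsub's second return value (the match count).
    return (field:gsub("\0+$", ""))
end

local bin = string.pack("c10", "ab")
local raw = string.unpack("c10", bin)
local name = trim_nulls(raw)
print(#raw, #name, name)  -- 10  2  ab

-- A value longer than the field raises an error instead of being truncated.
local ok = pcall(string.pack, "c4", "too long")
print(ok)  -- false
```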

Endianness and Alignment

By default, string.pack uses native byte order (your machine’s endianness). That is a portability hazard. A binary file written on an x64 laptop will read incorrectly on a big-endian system or network hardware.

Use explicit endianness prefixes to make your data portable:

| Prefix | Meaning |
| --- | --- |
| < | little endian |
| > | big endian |
| = | native endianness |
| ![n] | sets maximum alignment to n (native alignment if n is omitted); not a byte-order option |

local le = string.pack("<i4", 1)   -- little-endian 4-byte int
local be = string.pack(">i4", 1)   -- big-endian 4-byte int

Always pick an explicit endianness and use it consistently. Network protocols almost always use big endian (>). Note that ! does not mean network byte order in Lua, as it does in Python's struct module.
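
Dumping the bytes makes the difference between the two orders visible. The hex helper below is a hypothetical utility written for this sketch, not part of the standard library.

```lua
-- Hypothetical helper: render each byte of a string as two hex digits.
local function hex(s)
    local parts = {}
    for i = 1, #s do
        parts[i] = string.format("%02X", s:byte(i))
    end
    return table.concat(parts, " ")
end

print(hex(string.pack("<i4", 1)))  -- 01 00 00 00
print(hex(string.pack(">i4", 1)))  -- 00 00 00 01

-- Reading with the wrong order silently yields a different number.
local misread = string.unpack(">i4", string.pack("<i4", 1))
print(misread)  -- 16777216
```

No error is raised for the mismatched read, which is why a byte-order bug tends to show up as nonsensical values rather than a failure.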

Alignment prefixes control padding. Without a prefix, no padding is inserted. The !n prefix sets a maximum alignment:

string.packsize("bi4")     -- 1 + 4 = 5, no padding
string.packsize("!4bi4")   -- 1 + 3 padding + 4 = 8, 'i4' aligned to 4

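Explicit padding uses x, which inserts a single zero byte. Xop writes no data of its own; it pads to the alignment that option op would require, so it only has an effect when ! has enabled alignment. A literal space in a format string is ignored and can be used to separate options for readability:

local bin = string.pack("<I1 x I2", 1, 2)  -- bytes 01 00 02 00: the second byte is padding
string.packsize("!4 b Xi4")                -- 1 + 3 padding = 4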
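
To check that automatic alignment and explicit padding describe the same bytes, the sketch below packs one hypothetical layout (a 1-byte tag followed by a 4-byte value on a 4-byte boundary) both ways and compares the results.

```lua
local aligned_fmt = "<!4bi4"   -- i4 aligned to a 4-byte boundary: 8 bytes
local manual_fmt  = "<bxxxi4"  -- same layout with three explicit padding bytes

local aligned = string.pack(aligned_fmt, 7, 1000)
local manual  = string.pack(manual_fmt, 7, 1000)
print(#aligned, #manual, aligned == manual)  -- 8  8  true

-- Padding is skipped when unpacking, so only the real fields come back.
local tag, value, next_pos = string.unpack(aligned_fmt, aligned)
print(tag, value, next_pos)  -- 7  1000  9
```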

Reading a Binary Packet Header

A practical example: parsing a 6-byte header with a magic number, version, message type, and payload length.

local HEADER_FMT = ">H B B H"
-- 'H': unsigned short, 2 bytes (big-endian)
-- 'B': unsigned char, 1 byte
-- Second 'B': another unsigned char
-- Final 'H': unsigned short, 2 bytes (big-endian)

local function read_header(data)
    local magic, version, msg_type, length, pos = string.unpack(HEADER_FMT, data)
    return {
        magic = magic,
        version = version,
        type = msg_type,
        payload_length = length,
        payload_start = pos
    }
end

local header_bin = string.pack(HEADER_FMT, 0xDEAD, 1, 5, 256)
local h = read_header(header_bin)
print(h.magic, h.version, h.type, h.payload_length)
-- 57005  1  5  256   (print shows 0xDEAD in decimal)

If the packet had a variable-length string payload, you could read it immediately after using the position unpack returns:

-- data is assumed to hold the 6-byte header followed by an s2-encoded payload
local PAYLOAD_FMT = ">s2"  -- 2-byte length prefix
local magic, version, msg_type, length, pos = string.unpack(HEADER_FMT, data)
local payload, next_pos = string.unpack(PAYLOAD_FMT, data, pos)
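
Putting the pieces together, the sketch below builds and parses a whole packet. It is one possible design, not the only one: instead of an s2 payload it relies on the header's own length field and slices the payload with string.sub. The names MAGIC, build_packet, and parse_packet and the error messages are hypothetical.

```lua
local HEADER_FMT = ">H B B H"
local MAGIC = 0xDEAD  -- same magic value as in the header example

-- Build a packet: 6-byte header followed by the raw payload bytes.
local function build_packet(version, msg_type, payload)
    return string.pack(HEADER_FMT, MAGIC, version, msg_type, #payload) .. payload
end

-- Parse a packet; returns nil plus a message when the data is malformed.
local function parse_packet(data)
    if #data < string.packsize(HEADER_FMT) then
        return nil, "truncated header"
    end
    local magic, version, msg_type, length, pos = string.unpack(HEADER_FMT, data)
    if magic ~= MAGIC then
        return nil, "bad magic number"
    end
    if #data - pos + 1 < length then
        return nil, "truncated payload"
    end
    return {
        version = version,
        type = msg_type,
        payload = data:sub(pos, pos + length - 1),
    }
end

local packet = parse_packet(build_packet(1, 5, "ping"))
print(packet.version, packet.type, packet.payload)  -- 1  5  ping
```

Checking lengths before unpacking matters for untrusted input, because string.unpack raises an error when the data is shorter than the format requires.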

Common Pitfalls

Overflow checking

string.pack validates that values fit in the target size:

string.pack("i1", 128)    -- error: 128 overflows signed 1-byte int
string.pack("I1", 256)    -- error: 256 overflows unsigned 1-byte int

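Pick a wide enough integer format for your data. string.pack raises an error rather than wrapping silently, so an overflow surfaces at pack time. i4 holds values up to 2^31 - 1, and i8 covers the full range of a default Lua integer.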
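
When the values come from outside the program, it can be more convenient to get a failure value than an exception. The wrapper below, try_pack, is a hypothetical helper sketching that approach with pcall.

```lua
-- Hypothetical helper: pack values, returning nil and a message on failure
-- instead of raising an error.
local function try_pack(fmt, ...)
    local ok, result = pcall(string.pack, fmt, ...)
    if ok then
        return result
    end
    return nil, result
end

print(try_pack("<I1", 255) ~= nil)  -- true: 255 fits in one unsigned byte

local bin, err = try_pack("<I1", 256)
print(bin, err)  -- nil, followed by the overflow message from string.pack
```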

Ignoring the position from unpack

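string.unpack always returns one extra value: the position of the next unread byte. Dropping it is harmless for a one-off read, but it leaves you with no way to continue reading:

-- Loses track of where the next field starts
local value = string.unpack("i4", bin)

-- Keeps the position for the next read
local value, pos = string.unpack("i4", bin)

Capture pos as soon as you unpack more than one field in sequence. The extra return value also leaks into expressions such as print(string.unpack("i4", bin)) or { string.unpack("i4", bin) }, which receive the position as well as the value.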
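
A typical use of the returned position is a loop that walks through a buffer until it is exhausted. The function name read_all_u16 and the choice of big-endian 2-byte values are assumptions made for this sketch.

```lua
-- Read consecutive big-endian 2-byte unsigned integers until the data ends.
-- Assumes #data is a multiple of 2; string.unpack raises an error otherwise.
local function read_all_u16(data)
    local values = {}
    local pos = 1
    while pos <= #data do
        local value
        value, pos = string.unpack(">I2", data, pos)
        values[#values + 1] = value
    end
    return values
end

local data = string.pack(">I2I2I2", 10, 2000, 65535)
local values = read_all_u16(data)
print(#values, values[1], values[2], values[3])  -- 3  10  2000  65535
```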

Platform-dependent endianness

string.pack("i4", x) on your machine might produce different bytes than on a different architecture. Always specify < or > explicitly:

-- BAD: breaks across platforms
local bin = string.pack("i4", 42)

-- GOOD: consistent everywhere
local bin = string.pack(">i4", 42)

LuaJIT compatibility

LuaJIT is based on Lua 5.1 and does not include string.pack or string.unpack. If you need binary packing on LuaJIT, use an external library such as Roberto Ierusalimschy's struct, which uses similar format strings, or LuaJIT's FFI. Otherwise, target Lua 5.3 or 5.4.
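
For code that has to run on both kinds of runtime, one option is to detect string.pack at load time and fall back to plain arithmetic for the few formats you need. The function pack_u16_be below is a hypothetical example covering a single format, a sketch rather than a general replacement.

```lua
-- Encode an unsigned 16-bit integer as two big-endian bytes.
-- Uses string.pack where it exists, arithmetic on older runtimes.
local pack_u16_be
if string.pack then
    pack_u16_be = function(n)
        return string.pack(">I2", n)
    end
else
    pack_u16_be = function(n)
        assert(n >= 0 and n <= 0xFFFF, "value out of range")
        return string.char(math.floor(n / 256), n % 256)
    end
end

local bytes = pack_u16_be(0x1234)
print(bytes:byte(1), bytes:byte(2))  -- 18  52
```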

See Also