
fix(vm): restore caller's open_upvalues after nested execution#245

Merged
davydog187 merged 3 commits into main from fix/require-leaks-open-upvalues on May 27, 2026

@davydog187

require() leaks inner module's open_upvalues into outer caller

Plan: .agents/plans/A39-require-leaks-open-upvalues.md
Closes #244

Goal

Fix Lua.VM.Executor.execute/5 so that nested executions (most
importantly require) no longer leak the inner module's
state.open_upvalues map back to the outer caller. This unblocks
loading real-world Lua libraries that mix nested require chains with
many top-level local function definitions (luassert, busted, etc.).

Root cause

Lua.VM.Executor.execute/5 resets state.open_upvalues to %{} at
entry but never restores the caller's open_upvalues on return. Every
other call site that descends into a nested execution
(call_function/3 for :lua_closure, call_value/5, the dispatcher
entry, the dispatcher's frame returns, the interpreter's :call op for
Lua closures) carefully saves the caller's map, resets to %{}, runs
the callee, and restores on return. Executor.execute/5 is the one
outlier.

When require is called as a native_func from a Lua execution, the
inner module's Lua.VM.execute(proto, state) populates its own
open_upvalues as closures are created over the inner module's
top-level locals. When the inner returns, those entries leak back to
the outer caller. If the outer then creates a closure that captures a
top-level local at a register number that collides with one of the
inner's leftover entries, the outer's closure reuses the inner's
stale cell, aliasing the outer's local to whatever value the inner
had at that register.
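
The collision described above can be reproduced in a toy model. This is a minimal sketch, not the VM's actual data structures: it assumes open_upvalues maps a register index to a cell id and that capturing a local reuses any open cell already registered for that index. The module and function names (UpvalueLeakSketch, capture/3, execute_leaky/2, execute_restoring/2) are hypothetical.

```elixir
defmodule UpvalueLeakSketch do
  # Toy model only. `open_upvalues` maps register index -> cell id and
  # `cells` maps cell id -> captured value. All names are hypothetical.
  def new_state, do: %{open_upvalues: %{}, cells: %{}, next_cell: 0}

  # Capture the local in `reg`: reuse an open cell for that register if one
  # exists, otherwise allocate a fresh cell holding `value`.
  def capture(state, reg, value) do
    case state.open_upvalues do
      %{^reg => cell} ->
        {cell, state}

      _ ->
        cell = state.next_cell

        state = %{
          state
          | open_upvalues: Map.put(state.open_upvalues, reg, cell),
            cells: Map.put(state.cells, cell, value),
            next_cell: cell + 1
        }

        {cell, state}
    end
  end

  def read(state, cell), do: Map.fetch!(state.cells, cell)

  # Nested execution that resets the map on entry but never restores it.
  def execute_leaky(state, body) do
    body.(%{state | open_upvalues: %{}})
  end

  # Nested execution that snapshots the caller's map and restores it on exit.
  def execute_restoring(state, body) do
    saved = state.open_upvalues
    state = body.(%{state | open_upvalues: %{}})
    %{state | open_upvalues: saved}
  end

  # The outer chunk runs an inner module, then closes over its own register 0.
  def run(execute) do
    inner = fn state ->
      {_cell, state} = capture(state, 0, :inner_local)
      state
    end

    state = execute.(new_state(), inner)
    {cell, state} = capture(state, 0, :outer_local)
    read(state, cell)
  end
end

# Without restore, the outer closure reads the inner module's value.
IO.inspect(UpvalueLeakSketch.run(&UpvalueLeakSketch.execute_leaky/2))
# With restore, the outer closure gets its own cell.
IO.inspect(UpvalueLeakSketch.run(&UpvalueLeakSketch.execute_restoring/2))
```

In this model the leaky variant hands back the inner module's value for the outer's register 0, which is the same aliasing shape as the reported failure; the restoring variant allocates a fresh cell for the outer local.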

For luassert.assertions specifically, the outer's assert (reg 0)
ends up aliased to the inner luassert.assert module's s (reg 0,
the say module). At line 307, assert:register(...) reads assert
through the stale upvalue cell and sees say, not the obj table —
hence the bug report's "attempt to call a nil value (method 'register'
on local 'assert')".

Fix

In lib/lua/vm/executor.ex execute/5, snapshot state.open_upvalues
before resetting and restore it on the way out:

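def execute(instructions, registers, upvalues, proto, state) do
  prev = Process.get(@position_key, @unset)
  # Snapshot the caller's open upvalues before the nested run resets them.
  saved_open_upvalues = state.open_upvalues

  try do
    # The callee starts with no open upvalues of its own.
    state = %{state | open_upvalues: %{}}

    {results, regs, state} =
      do_execute(instructions, registers, upvalues, proto, state, [], [], 0)

    # Hand the caller's map back so the callee's entries cannot leak out.
    {results, regs, %{state | open_upvalues: saved_open_upvalues}}
  after
    restore_position(prev)
  end
end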

This mirrors the save/restore pattern already used by call_function/3
for :lua_closure, call_value/5, and the dispatcher entry
(do_execute_top). The change affects two call sites of execute/5:

  • Lua.VM.execute/2 — used by parse_and_execute_module (the bug
    path) and by top-level Lua.eval!. Save/restore is correct in both
    cases.
  • Lua.do_call_function/3 for :lua_closure — called from the public
    Lua.call_function/3. Save/restore makes the public API safer:
    callers don't lose open_upvalues across call_function invocations.
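
The save/reset/run/restore sequence shared by these call sites can be written as one wrapper. This is an illustrative sketch only: OpenUpvalueScope and with_fresh_open_upvalues/2 are hypothetical names that do not exist in the library, and the state here is a plain map standing in for the VM state.

```elixir
defmodule OpenUpvalueScope do
  # Hypothetical helper, not part of the library: runs `fun` with an empty
  # `open_upvalues` map and puts the caller's map back afterwards.
  # `fun` takes the state and returns `{result, new_state}`.
  def with_fresh_open_upvalues(%{open_upvalues: saved} = state, fun)
      when is_function(fun, 1) do
    {result, state} = fun.(%{state | open_upvalues: %{}})
    {result, %{state | open_upvalues: saved}}
  end
end

# Stand-in for the caller's state, with one open upvalue already registered.
caller = %{open_upvalues: %{0 => :caller_cell}, globals: %{}}

{result, restored} =
  OpenUpvalueScope.with_fresh_open_upvalues(caller, fn state ->
    # The callee sees an empty map and may add entries and change other state.
    state = %{state | open_upvalues: Map.put(state.open_upvalues, 0, :callee_cell)}
    state = %{state | globals: Map.put(state.globals, "loaded", true)}
    {:ok, state}
  end)

IO.inspect(result)
IO.inspect(restored.open_upvalues)
IO.inspect(restored.globals)
```

Only open_upvalues is rolled back; every other change the callee makes to the state (the globals entry here) is kept, which matches the fix restoring a single field on the returned state rather than discarding the callee's state.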

Success criteria

Changes

 .agents/plans/A39-require-leaks-open-upvalues.md  | 156 ++++++++++++++++++++
 CHANGELOG.md                                      |  10 ++
 lib/lua/vm/executor.ex                            |  15 +-
 test/integration/luassert/README.md               |  ~80
 test/integration/luassert/lua/luassert/*.lua      | (vendored, MIT)
 test/integration/luassert/lua/say/init.lua        | (vendored, MIT)
 test/integration/luassert_test.exs                | ~115
 test/lua/vm/require_open_upvalue_test.exs         |  ~95

Verification

mix format
mix compile --warnings-as-errors
mix test                                 # 1792 passing, 0 failing
mix test --only lua53                    # 29 tests, 0 failures
mix test test/lua/vm/require_open_upvalue_test.exs   # 2 passing
mix test test/integration/luassert_test.exs          # 18 passing

Negative verification — temporarily reverted the fix to confirm the
tests actually catch the bug:

Out of scope (intentional)

  • Refactoring the upvalue / closure model.
  • "Close all open upvalues at chunk end" sweep — not needed once
    save/restore is in place.
  • The full package.searchers mechanism.
  • Bytecode-encoder support for vararg chunks (chunks falling back to
    the interpreter is what surfaced this bug, but the interpreter path
    should be correct on its own).
  • Coordinating the tv-labs/platform/sidecar Lua bump.
  • Behavioural assertions against the vendored luassert (e.g.
    assert.are.equal(1, 1) passes). Loading the module graph is the
    bug surface; behavioural coverage is a separate follow-up.
  • Loading luassert.formatters and top-level luassert. Both depend
    on io.type(io.stdout) for TTY detection at module-load time, which
    this VM intentionally does not expose. Tracked separately from
    issue #244 ("require: cached module result lost after deep require chain (assert:register nil)").

`Lua.VM.Executor.execute/5` reset `state.open_upvalues` to `%{}` at
entry but never restored the caller's map on return. When a Lua chunk
called `require`, the inner module's body would populate
`open_upvalues` with cells keyed by the inner's register indices, and
those entries would leak back to the outer caller. The outer caller's
later closures would then reuse the stale cells by register index,
aliasing the outer's locals to unrelated inner values.

This broke real-world Lua libraries (luassert.assertions, luassert.array,
luassert.spy) that follow the pattern `local x = require(...)` →
many `local function` defs that close over `x` →
`x:method(...)`: by the time `x:method` ran, reads of `x` went
through a stale upvalue cell and saw an inner module's local instead.

As a side effect, `Lua.call_function/3` (public API) now preserves the
caller's `open_upvalues` across calls — previously the same leak
applied there too, but no caller of the public API depended on the
buggy behaviour.

Two layers of test coverage:
- Unit regression in `test/lua/vm/require_open_upvalue_test.exs` with
  a minimal two-file pure-Lua repro.
- Integration test that vendors luassert v1.9.0 + say v1.4.1 under
  `test/integration/luassert/` and asserts every interior luassert
  module loads via `require` without raising. luassert is the
  dominant testing framework in the Lua ecosystem and was the original
  reproducer in the bug report.

Plan: A39
Closes #244
@davydog187 davydog187 merged commit 56ee59e into main May 27, 2026
5 checks passed
@davydog187 davydog187 deleted the fix/require-leaks-open-upvalues branch May 27, 2026 01:49