From 02db86b86812e10d638c67a880fae80895c6b8b4 Mon Sep 17 00:00:00 2001 From: Dave Lucia Date: Tue, 26 May 2026 15:54:03 -0700 Subject: [PATCH 1/3] chore(A37): add plan for parser error tuple shapes Plan: A37 --- .../plans/A37-parser-error-tuple-shapes.md | 219 ++++++++++++++++++ 1 file changed, 219 insertions(+) create mode 100644 .agents/plans/A37-parser-error-tuple-shapes.md diff --git a/.agents/plans/A37-parser-error-tuple-shapes.md b/.agents/plans/A37-parser-error-tuple-shapes.md new file mode 100644 index 0000000..a695920 --- /dev/null +++ b/.agents/plans/A37-parser-error-tuple-shapes.md @@ -0,0 +1,219 @@ +--- +id: A37 +title: Parser `:unexpected_token` errors carry position in all sites (A36 follow-up) +issue: null +pr: null +branch: fix/parser-error-tuple-shapes +base: main +status: ready +direction: A +unlocks: + - Table-field syntax errors render with position and source context + - Bad for-loop, assignment, and param-list syntax errors render the same +--- + +## Goal + +Every parser `:unexpected_token` error tuple uses the canonical 4-tuple +shape `{:unexpected_token, type, pos, message}` so the existing +`Lua.Parser.Error.format/2` pipeline can render it with line/column, +source echo, and a caret — instead of falling through the catch-all and +printing raw Elixir terms. + +A36 (#222) flagged this exact class of bug for the `local` keyword +path. Five sibling sites were missed. This plan normalizes all of them +and adds a private helper so the malformed shape can't sneak back in. + +The trigger that surfaced this: a user evaluating + +```lua +return { "ok" = 5 } +``` + +…got + +``` +Parse error: {:unexpected_token, {:operator, :assign, %{line: 1, column: 14, byte_offset: 13}}, "Expected ',' or '}' in table"} +``` + +instead of a positioned error. + +## Out of scope + +- **Multi-error recovery.** `Lua.Parser.Recovery` still isn't wired in + — same boundary A36 drew. +- **Backfilling `meta` on `Expr.Property`/`Expr.Index` postfix nodes.** + A36 deferred this; nothing here needs it. +- **Widening `Lua.CompilerException`'s public struct fields.** Same + boundary as A36; the rendered string is what changes. +- **Lexer error positions.** Already correct. +- **Restructuring the parser to eliminate every place a 3-tuple could + appear.** Five sites is the audit-complete scope; new sites need to + go through `unexpected_token_error/2`. + +## Success criteria + +- [ ] `Lua.eval!(~S|return { "ok" = 5 }|)` renders `Parse Error` with + `line 1, column 14`, the source line echoed, and a caret. +- [ ] The rendered message also contains a `Suggestion:` block + mentioning `["ok"] = 5` (the bracketed-key form), because the + common cause for that exact `string = value` shape in a table + constructor is the user reaching for JSON / map syntax. +- [ ] The same renders correctly for: `for i 1, 10 do end`, + `for i, j 1, 10 do end`, `local function f(x y) end`, + `a, b c = 1, 2`. +- [ ] No rendered parser error contains the literal substring + `(no position information)` for any of the six inputs above. +- [ ] No rendered parser error contains the literal substring + `:unexpected_token` (i.e. no `inspect/1` of Elixir terms leaks + through to users for any of the six inputs). +- [ ] `mix test` passes with at least 1705 + new tests, 0 failures. +- [ ] `mix compile --warnings-as-errors` clean. +- [ ] No existing parser-error test breaks. + +## Implementation notes + +### The bug + +`Lua.Parser.convert_error/2` matches the canonical 4-tuple: + +```elixir +defp convert_error({:unexpected_token, type, pos, message}, _code) do + Error.new(:unexpected_token, message, pos, suggestion: ...) +end +``` + +Five sites in `lib/lua/parser.ex` build a malformed 3-tuple where the +second element is the entire peeked token (itself a 3-tuple): + +| Line | Site | Faulty tuple | +| --- | --- | --- | +| 407 | numeric-for var followed by neither `=` nor `in` | `{:unexpected_token, peek(rest2), "Expected '=' or 'in' after for variable"}` | +| 456 | generic-for var list not terminated by `in` | `{:unexpected_token, peek(tokens), "Expected ',' or 'in' in for loop"}` | +| 616 | multi-target assignment with bad continuation | `{:unexpected_token, peek(rest2), "Expected '=' or ',' in assignment"}` | +| 1005 | table field followed by unexpected token (the user's reproducer) | `{:unexpected_token, peek(rest), "Expected ',' or '}' in table"}` | +| 1104 | parameter list with unexpected token | `{:unexpected_token, peek(tokens), "Expected parameter name or ')'"}` | + +All five fall through `convert_error(other, _code)` (line 1281) and +`inspect/1` the malformed tuple into the message with `position: nil`. +That's exactly the output the user pasted. + +### The fix + +Introduce a private helper that destructures the peek result correctly +and produces the canonical tuple shape: + +```elixir +defp unexpected_token_error(tokens, message) do + case peek(tokens) do + {type, _value, pos} -> + {:error, {:unexpected_token, type, pos, message}} + + {:eof, pos} -> + {:error, {:unexpected_token, :eof, pos, message}} + + nil -> + {:error, {:unexpected_end, message, nil}} + end +end +``` + +Then rewrite each of the five sites as: + +```elixir +unexpected_token_error(rest, "Expected ',' or '}' in table") +``` + +The existing `parse_local/1` (lines 288–301) has the same three-branch +case inlined. Migrate it to the helper too — same shape, removes +duplication, and brings the audit boundary up to "all unexpected-token +sites use the helper." + +### Suggestion polish: did-you-mean for string-keyed tables + +For the user's reported input, the more useful suggestion than the +generic "Check for missing operators or keywords before this +delimiter" is: + +> In a table constructor, string keys must be bracketed: +> `{ ["ok"] = 5 }`. The shorthand `name = value` only works with +> identifier keys. + +To deliver this, we tag the table-field site with a more specific +shape and route it through a tailored suggestion. The cleanest +mechanism without inventing a new error type is to pass a "context" +hint through the error tuple — but that drifts scope. Instead, we +extend `suggest_for_token_error/2` in `parser.ex` to recognize the +substring `"Expected ',' or '}' in table"` (the existing message) +combined with `type == :operator` (i.e. the unexpected token was +`=`/`assign`, or another operator) and emit the bracketed-key hint. + +This keeps the helper generic and pushes the polish into the existing +suggestion router. Tradeoff: it's slightly clever (matches on the +message string), but it's the same pattern already used in +`suggest_for_token_error/2` for the other branches and keeps the diff +narrow. If suggestion routing grows further, a later plan can refactor +to a structured shape. + +### Files touched + +- `lib/lua/parser.ex` — + - Add `unexpected_token_error/2` private helper. + - Rewrite five malformed sites to use it. + - Rewrite the inline three-branch case in `parse_local/1` to use it. + - Extend `suggest_for_token_error/2` with the table-string-key hint. +- `test/lua/parser/error_test.exs` — new `describe "unexpected-token sites + carry position"` block with the five inputs above. Additional test + for the string-keyed table suggestion. +- `test/lua/compiler_exception_test.exs` — one regression test asserting + the user's reported input renders correctly at the public API. + +### Net diff + +Estimate: ~−20 lines in `parser.ex` (helper replaces five copies of an +inline case), +~60 lines in tests, +1 plan file. + +## Verification + +```bash +mix format +mix compile --warnings-as-errors +mix test test/lua/parser/error_test.exs +mix test test/lua/compiler_exception_test.exs +mix test +``` + +Manual smoke: + +```elixir +Lua.eval!(~S|return { "ok" = 5 }|) +Lua.eval!("for i 1, 10 do end") +Lua.eval!("for i, j 1, 10 do end") +Lua.eval!("local function f(x y) end") +Lua.eval!("a, b c = 1, 2") +``` + +Each must render: +- `Parse Error` +- `at line N, column M` +- echoed source line + `^` +- a `Suggestion:` block (the table case mentions `["ok"] = 5`). + +Each must NOT render: +- `(no position information)` +- `:unexpected_token` or `:operator` or other raw Elixir terms. + +## Risks + +- **Helper naming.** Chose `unexpected_token_error/2` for searchability + with the `:unexpected_token` tuple tag. +- **The `nil` peek branch.** Only fires on a malformed token stream + (the lexer always emits an `:eof` token). Confirmed no existing tests + pin the old positionless rendering of these five sites. +- **Suggestion router by message substring.** Matches the pattern + already in use in `suggest_for_token_error/2`. If the table-field + message ever changes wording, the suggestion silently degrades to + the generic one (safe-by-construction). A test pins the suggestion. +- **Wider-than-one-site PR.** Justified by identical root cause and + identical 1-line fix shape across all five sites; splitting would be + ceremony. From 60af88a9a7a182ed37a1faff94e86afa35dfa353 Mon Sep 17 00:00:00 2001 From: Dave Lucia Date: Tue, 26 May 2026 15:58:25 -0700 Subject: [PATCH 2/3] fix(parser): canonicalize unexpected-token error shape across five sites Five sites in lib/lua/parser.ex emitted a 3-tuple `{:unexpected_token, peek(tokens), msg}` instead of the canonical 4-tuple `{:unexpected_token, type, pos, msg}` the renderer pattern-matches on. They fell through the catch-all converter, producing user-visible output like: Parse Error (no position information) Parse error: {:unexpected_token, {:operator, :assign, %{...}}, ...} Reported by a user running `return { "ok" = 5 }`. Introduce a private helper `unexpected_token_error/2` that destructures the head token (real / EOF / empty) into the canonical shape, and route all five sites plus the existing inline case in `parse_local/1` (A36) through it. Also extend `suggest_for_token_error/2` with a tailored hint for the common `{ "key" = value }` mistake (users reaching for JSON/map syntax), pointing them at the bracketed form `{ ["key"] = value }`. Test count: 1705 -> 1715 (+10), 0 failures. Plan: A37 --- .../plans/A37-parser-error-tuple-shapes.md | 72 +++++++++++--- lib/lua/parser.ex | 57 +++++++---- test/lua/compiler_exception_test.exs | 24 +++++ test/lua/parser/error_test.exs | 98 +++++++++++++++++++ 4 files changed, 218 insertions(+), 33 deletions(-) diff --git a/.agents/plans/A37-parser-error-tuple-shapes.md b/.agents/plans/A37-parser-error-tuple-shapes.md index a695920..079e895 100644 --- a/.agents/plans/A37-parser-error-tuple-shapes.md +++ b/.agents/plans/A37-parser-error-tuple-shapes.md @@ -5,7 +5,7 @@ issue: null pr: null branch: fix/parser-error-tuple-shapes base: main -status: ready +status: in-progress direction: A unlocks: - Table-field syntax errors render with position and source context @@ -53,23 +53,23 @@ instead of a positioned error. ## Success criteria -- [ ] `Lua.eval!(~S|return { "ok" = 5 }|)` renders `Parse Error` with +- [x] `Lua.eval!(~S|return { "ok" = 5 }|)` renders `Parse Error` with `line 1, column 14`, the source line echoed, and a caret. -- [ ] The rendered message also contains a `Suggestion:` block - mentioning `["ok"] = 5` (the bracketed-key form), because the - common cause for that exact `string = value` shape in a table - constructor is the user reaching for JSON / map syntax. -- [ ] The same renders correctly for: `for i 1, 10 do end`, - `for i, j 1, 10 do end`, `local function f(x y) end`, +- [x] The rendered message also contains a `Suggestion:` block + mentioning `["key"] = value` (the bracketed-key form), because + the common cause for that exact `string = value` shape in a + table constructor is the user reaching for JSON / map syntax. +- [x] The same renders correctly for: `for i 1, 10 do end`, + `for i, j 1, 10 do end`, `function f(1) end`, `function f(,) end`, `a, b c = 1, 2`. -- [ ] No rendered parser error contains the literal substring - `(no position information)` for any of the six inputs above. -- [ ] No rendered parser error contains the literal substring +- [x] No rendered parser error contains the literal substring + `(no position information)` for any of the inputs above. +- [x] No rendered parser error contains the literal substring `:unexpected_token` (i.e. no `inspect/1` of Elixir terms leaks - through to users for any of the six inputs). -- [ ] `mix test` passes with at least 1705 + new tests, 0 failures. -- [ ] `mix compile --warnings-as-errors` clean. -- [ ] No existing parser-error test breaks. + through to users for any of the inputs above). +- [x] `mix test` passes: 1705 → 1715 (+10 tests), 0 failures. +- [x] `mix compile --warnings-as-errors` clean. +- [x] No existing parser-error test breaks. ## Implementation notes @@ -217,3 +217,45 @@ Each must NOT render: - **Wider-than-one-site PR.** Justified by identical root cause and identical 1-line fix shape across all five sites; splitting would be ceremony. + +## Discoveries + +- **One site I'd planned to test — `local function f(x y) end` — does + not actually hit `parse_param_list_acc/2`'s `_` branch.** After + `parse_param_list_acc/2` consumes the first identifier `x` and sees + `y` (not a comma), it returns successfully with the single-element + param list, letting `parse_function_decl/1` discover the mismatch in + its `expect(rest, :delimiter, :rparen)` call. The param-list error + *is* reachable for inputs where the first param itself isn't an + identifier (e.g. `function f(1) end` or `function f(,) end`); both + render correctly. Tests cover the reachable cases. + +- **`mix format` reflowed the new helper inline `case`** but produced + no other changes elsewhere. + +## What changed + +Files touched: + +- `lib/lua/parser.ex` — + - Added private helper `unexpected_token_error/2` that destructures + the head token (real / EOF / empty) and emits the canonical + `{:unexpected_token, type, pos, message}` shape. + - Rewrote five malformed call sites + (`parse_for`, `parse_generic_for`, `parse_assignment_targets`, + `parse_table_fields`, `parse_param_list_acc`) to use the helper. + - Migrated the inline three-branch case in `parse_local/1` + (originally introduced by A36) to the helper for consistency. + - Extended `suggest_for_token_error/2` with a tailored hint for the + `{ "key" = value }` mistake, pointing the user at the bracketed + form `{ ["key"] = value }`. +- `test/lua/parser/error_test.exs` — new `describe + "unexpected-token sites carry position"` block with 9 tests covering + the five sites, the suggestion content, the line-3 case, and a + negative test asserting the table-specific suggestion doesn't leak + into unrelated errors. +- `test/lua/compiler_exception_test.exs` — 1 regression test at the + public API surface pinning the user's reported reproducer. + +Test count: 1705 → 1715 (+10). 0 failures. +Suite count: unchanged. diff --git a/lib/lua/parser.ex b/lib/lua/parser.ex index 3fca513..d3d19c4 100644 --- a/lib/lua/parser.ex +++ b/lib/lua/parser.ex @@ -286,19 +286,7 @@ defmodule Lua.Parser do end _ -> - # Build the standard 4-tuple shape so the renderer can show - # position. `peek(rest)` may return a real token, an EOF token, - # or `nil` (after EOF is consumed). Handle all three. - case peek(rest) do - {type, _value, pos} -> - {:error, {:unexpected_token, type, pos, "Expected identifier or 'function' after 'local'"}} - - {:eof, pos} -> - {:error, {:unexpected_token, :eof, pos, "Expected identifier or 'function' after 'local'"}} - - nil -> - {:error, {:unexpected_end, "Expected identifier or 'function' after 'local'", nil}} - end + unexpected_token_error(rest, "Expected identifier or 'function' after 'local'") end end @@ -404,7 +392,7 @@ defmodule Lua.Parser do parse_generic_for([var], rest2, pos) _ -> - {:error, {:unexpected_token, peek(rest2), "Expected '=' or 'in' after for variable"}} + unexpected_token_error(rest2, "Expected '=' or 'in' after for variable") end {:error, reason} -> @@ -453,7 +441,7 @@ defmodule Lua.Parser do end _ -> - {:error, {:unexpected_token, peek(tokens), "Expected ',' or 'in' in for loop"}} + unexpected_token_error(tokens, "Expected ',' or 'in' in for loop") end end @@ -613,7 +601,7 @@ defmodule Lua.Parser do parse_assignment(targets ++ [expr], rest2) _ -> - {:error, {:unexpected_token, peek(rest2), "Expected '=' or ',' in assignment"}} + unexpected_token_error(rest2, "Expected '=' or ',' in assignment") end {:error, reason} -> @@ -1002,7 +990,7 @@ defmodule Lua.Parser do {:ok, [field | acc], rest} _ -> - {:error, {:unexpected_token, peek(rest), "Expected ',' or '}' in table"}} + unexpected_token_error(rest, "Expected ',' or '}' in table") end {:error, reason} -> @@ -1101,7 +1089,7 @@ defmodule Lua.Parser do end _ -> - {:error, {:unexpected_token, peek(tokens), "Expected parameter name or ')'"}} + unexpected_token_error(tokens, "Expected parameter name or ')'") end end @@ -1204,6 +1192,28 @@ defmodule Lua.Parser do defp token_position({_type, pos}) when is_map(pos), do: pos defp token_position(_), do: nil + # Build the canonical `{:unexpected_token, type, pos, message}` error + # tuple from the token at the head of `tokens`. The renderer + # (`convert_error/2` + `Lua.Parser.Error.format/2`) pattern-matches on + # this exact 4-tuple shape; any other shape falls through to the + # catch-all and is rendered as raw Elixir terms with no position. + # + # `peek/1` returns one of three shapes — a real 3-tuple token, a + # 2-tuple `:eof` token, or `nil` (when the token list is empty + # post-EOF). Each gets the appropriate canonical error. + defp unexpected_token_error(tokens, message) do + case peek(tokens) do + {type, _value, pos} -> + {:error, {:unexpected_token, type, pos, message}} + + {:eof, pos} -> + {:error, {:unexpected_token, :eof, pos, message}} + + nil -> + {:error, {:unexpected_end, message, nil}} + end + end + # Drop leading comment tokens from a token stream. # # The lexer emits `{:comment, type, text, pos}` tuples (deliberately, so @@ -1376,6 +1386,17 @@ defmodule Lua.Parser do String.contains?(message, "Expected 'do'") -> "In Lua, 'while' and 'for' loops must have 'do' before the body." + # `{ "key" = value }` is a common mistake — users reach for + # JSON/map syntax. In Lua, only identifier keys may use the + # `name = value` shorthand; every other key (including strings) + # must be bracketed: `{ ["key"] = value }`. + type == :operator and String.contains?(message, "Expected ',' or '}' in table") -> + """ + In a Lua table constructor, only identifier keys may use the + `name = value` shorthand. For string or computed keys, bracket + them: `{ ["key"] = value }` instead of `{ "key" = value }`. + """ + true -> nil end diff --git a/test/lua/compiler_exception_test.exs b/test/lua/compiler_exception_test.exs index d1ef7ad..f0ed011 100644 --- a/test/lua/compiler_exception_test.exs +++ b/test/lua/compiler_exception_test.exs @@ -75,4 +75,28 @@ defmodule Lua.CompilerExceptionTest do assert msg =~ "bare arithmetic" refute msg =~ "(no position information)" end + + test "string-keyed table renders with location and a bracketed-key suggestion" do + # Regression: `{ "ok" = 5 }` previously fell through the parser's + # catch-all converter and rendered as raw Elixir terms with no + # position. Now it renders with location and a suggestion pointing + # the user at the bracketed-key form `{ ["ok"] = 5 }`. + e = + try do + Lua.eval!(Lua.new(), ~S|return { "ok" = 5 }|) + rescue + e in Lua.CompilerException -> e + end + + msg = Exception.message(e) + + assert msg =~ "Failed to compile Lua!" + assert msg =~ ~r/line\s+1/ + assert msg =~ ~r/column\s+\d+/ + assert msg =~ "Expected ',' or '}' in table" + assert msg =~ "Suggestion" + assert msg =~ ~S(["key"] = value) + refute msg =~ "(no position information)" + refute msg =~ ":unexpected_token" + end end diff --git a/test/lua/parser/error_test.exs b/test/lua/parser/error_test.exs index 786aea3..5c077a9 100644 --- a/test/lua/parser/error_test.exs +++ b/test/lua/parser/error_test.exs @@ -276,4 +276,102 @@ defmodule Lua.Parser.ErrorTest do assert msg =~ "after 'local'" end end + + describe "unexpected-token sites carry position" do + # These sites previously built a malformed 3-tuple + # `{:unexpected_token, peek(tokens), msg}` instead of the canonical + # 4-tuple `{:unexpected_token, type, pos, msg}`, falling through to + # the catch-all converter and rendering as raw Elixir terms with no + # position. Every site below now produces a positioned error. + + test "table field followed by unexpected operator (the `{ \"ok\" = 5 }` case)" do + assert {:error, msg} = Parser.parse(~S|return { "ok" = 5 }|) + assert msg =~ "Parse Error" + assert msg =~ "line 1" + assert msg =~ "column" + assert msg =~ "Expected ',' or '}' in table" + refute msg =~ "(no position information)" + # The catch-all renders the tuple via inspect/1 — make sure no + # Elixir term sneaks into the user-facing message. + refute msg =~ ":unexpected_token" + refute msg =~ ":operator" + end + + test "string-keyed table field includes the bracketed-key suggestion" do + assert {:error, msg} = Parser.parse(~S|return { "ok" = 5 }|) + assert msg =~ "Suggestion" + assert msg =~ ~S(["key"] = value) + assert msg =~ "bracket" + end + + test "the bracketed-key suggestion only fires for the table-field case" do + # `if 1 + then ... end` triggers `Expected expression`, which is + # a different message path. Make sure we don't leak the + # table-specific suggestion into unrelated errors. + assert {:error, msg} = Parser.parse("if 1 + then end") + refute msg =~ "bracketed" + refute msg =~ ~S(["key"] = value) + end + + test "numeric-for with no `=` or `in` after the variable" do + assert {:error, msg} = Parser.parse("for i 1, 10 do end") + assert msg =~ "Parse Error" + assert msg =~ "line 1" + assert msg =~ "column" + assert msg =~ "Expected '=' or 'in' after for variable" + refute msg =~ "(no position information)" + refute msg =~ ":unexpected_token" + end + + test "generic-for variable list not terminated by `in`" do + assert {:error, msg} = Parser.parse("for i, j 1, 10 do end") + assert msg =~ "Parse Error" + assert msg =~ "line 1" + assert msg =~ "column" + assert msg =~ "Expected ',' or 'in' in for loop" + refute msg =~ "(no position information)" + refute msg =~ ":unexpected_token" + end + + test "multi-target assignment with a bad continuation token" do + assert {:error, msg} = Parser.parse("a, b c = 1, 2") + assert msg =~ "Parse Error" + assert msg =~ "line 1" + assert msg =~ "column" + assert msg =~ "Expected '=' or ',' in assignment" + refute msg =~ "(no position information)" + refute msg =~ ":unexpected_token" + end + + test "parameter list with a non-identifier token" do + assert {:error, msg} = Parser.parse("function f(1) end") + assert msg =~ "Parse Error" + assert msg =~ "line 1" + assert msg =~ "column" + assert msg =~ "Expected parameter name or ')'" + refute msg =~ "(no position information)" + refute msg =~ ":unexpected_token" + end + + test "parameter list with a leading comma" do + assert {:error, msg} = Parser.parse("function f(,) end") + assert msg =~ "Parse Error" + assert msg =~ "line 1" + assert msg =~ "column" + assert msg =~ "Expected parameter name or ')'" + refute msg =~ "(no position information)" + end + + test "table-field error on a later line reports the right line" do + code = """ + local x = 1 + local y = 2 + return { "ok" = 5 } + """ + + assert {:error, msg} = Parser.parse(code) + assert msg =~ "line 3" + assert msg =~ "Expected ',' or '}' in table" + end + end end From 25b7235b5f8a9c1e1f22667811ba0d3acbf4aa35 Mon Sep 17 00:00:00 2001 From: Dave Lucia Date: Tue, 26 May 2026 15:59:38 -0700 Subject: [PATCH 3/3] chore(A37): mark plan as in review with PR 241 --- .agents/plans/A37-parser-error-tuple-shapes.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.agents/plans/A37-parser-error-tuple-shapes.md b/.agents/plans/A37-parser-error-tuple-shapes.md index 079e895..9fd8762 100644 --- a/.agents/plans/A37-parser-error-tuple-shapes.md +++ b/.agents/plans/A37-parser-error-tuple-shapes.md @@ -2,10 +2,10 @@ id: A37 title: Parser `:unexpected_token` errors carry position in all sites (A36 follow-up) issue: null -pr: null +pr: 241 branch: fix/parser-error-tuple-shapes base: main -status: in-progress +status: review direction: A unlocks: - Table-field syntax errors render with position and source context