Skip to content

Add ConvertFrom-Toml and ConvertTo-Toml functions with full TOML v1.1.0 spec coverage #2

@MariusStorhaug

Description

PowerShell has built-in ConvertFrom-Json / ConvertTo-Json for JSON, and the PSModule/Yaml module provides the same shape for YAML. There is no equivalent for TOML, even though TOML is widely used by package managers (Cargo.toml, pyproject.toml), config files, and CI tooling. Scripts that need to read or generate TOML today must shell out to external tooling or write ad-hoc string parsing.

Request

Desired capability

A ConvertFrom-Toml function that parses a TOML string into a PowerShell object, and a ConvertTo-Toml function that serializes a PowerShell object back to a TOML string. The module aligns with TOML v1.1.0 — the latest spec revision — and aims for full coverage of the spec.

ConvertFrom-Toml:

  • Accepts a valid TOML string via pipeline or -InputObject parameter
  • Returns a [PSCustomObject] by default (matches ConvertFrom-Json shape)
  • Supports -AsHashtable to return an [ordered] hashtable
  • Supports -NoEnumerate to prevent unwrapping single-element top-level arrays
  • Supports -Depth (default 1024) as a recursion safety net
  • Throws clear errors on invalid TOML (duplicate keys, mixed-type-table conflicts, malformed values) per the spec's "must produce an error" rules

ConvertTo-Toml:

  • Accepts PowerShell objects, hashtables, ordered dictionaries, and arrays via pipeline or -InputObject
  • Returns a TOML-formatted string
  • Supports -Depth (default 1024)
  • Supports -EnumsAsStrings to render enum values as their string names
  • Produces clean, idiomatic output with sensible header style — top-level keys then [table] headers (not always inline tables)

Aliases: ConvertFrom-Tml / ConvertTo-Tml registered via [Alias()], mirroring the Yml alias in PSModule/Yaml.

Acceptance criteria

  • Round-trip fidelity: $obj | ConvertTo-Toml | ConvertFrom-Toml produces an equivalent object
  • Both functions accept pipeline input
  • All examples from the TOML v1.1.0 spec parse correctly via ConvertFrom-Toml (verified via a spec test harness — see Implementation plan)
  • All TOML constructs the spec marks "INVALID" or "must produce an error" are rejected with a meaningful error
  • Reference compliance against the BurntSushi/toml-test compliance suite — both valid and invalid corpora pass

Note

Comment preservation through round-trip is out of scope for this initial issue and will be tracked separately, mirroring the equivalent decision in PSModule/Yaml.


Technical decisions

Spec alignment: TOML v1.1.0 (the published 1.1.0 spec). Rationale: latest stable, includes hex/octal/binary integers, mixed-type arrays, \e escape, multiline-string improvements over 1.0.0. The ABNF grammar is the formal reference.

Approach: Pure PowerShell implementation with no external dependencies, matching the PSModule/Yaml architecture. Hand-written recursive-descent parser driven by the ABNF.

Function placement: Public functions in src/functions/public/ConvertFrom-Toml.ps1 and ConvertTo-Toml.ps1. Internal helpers in src/functions/private/ (line tokenizer, value parser, key-path resolver, emitter helpers).

Output types: ConvertFrom-Toml returns [PSCustomObject] by default, [System.Collections.Specialized.OrderedDictionary] with -AsHashtable. ConvertTo-Toml returns [string].

Type mapping (parse):

TOML type PowerShell type
String (all 4 forms) [string]
Integer (dec/hex/oct/bin) [long] (or [bigint] if overflow)
Float [double] (incl. inf, -inf, nan)
Boolean [bool]
Offset Date-Time [datetimeoffset]
Local Date-Time [datetime] with Kind=Unspecified
Local Date [datetime] with Kind=Unspecified (time = 00:00)
Local Time [timespan] (or a custom struct — see open question)
Array [object[]]
Table / Inline table [PSCustomObject] or [ordered]

Type mapping (emit): PowerShell types → TOML types, mirroring above. Strings choose between basic / multiline-basic / literal / multiline-literal automatically based on content (presence of newlines, double quotes, backslashes). Integers always emit as decimal by default; an -IntegerFormat option could be added later if needed (out of scope here).

Header style on emit: Prefer [section] headers for object-typed values where possible, fall back to inline tables only for nested objects inside arrays or where the path would otherwise be ambiguous. Top-level scalars/arrays come before any headers (TOML root-table rule).

Strictness: Strict by default. The TOML spec is explicit about what "must" produce errors (duplicate keys, redefining tables as arrays, etc.) — these are surfaced as terminating errors. No "lenient mode" for v1.

Test approach: Three layers:

  1. Behavior teststests/ConvertFrom-Toml.Tests.ps1 and tests/ConvertTo-Toml.Tests.ps1 covering each spec section directly with hand-written cases.
  2. Spec example harnesstests/Spec-1.1.0.Tests.ps1 containing every code example from the v1.1.0 spec page as a discrete It block, asserting against the JSON form the spec shows.
  3. External compliance suite — optional CI step that runs the BurntSushi/toml-test corpus (valid + invalid directories) via thin adapters around the cmdlets. This catches edge cases the spec text doesn't enumerate.

Spec coverage checklist (TOML v1.1.0)

This is the scoreboard. Every box must be checked before this issue can close.

Lexical / structural

  • UTF-8 input enforcement; reject ill-formed sequences
  • LF and CRLF newline handling
  • Whitespace-only lines and blank lines
  • # comments (full-line and end-of-line)
  • Reject control chars U+0000-U+0008, U+000A-U+001F, U+007F in comments
  • BOM tolerated at start of stream

Keys

  • Bare keys (A-Za-z0-9_-)
  • Bare keys composed of only digits (1234 = "x") treated as strings
  • Quoted keys (basic-string form)
  • Quoted keys (literal-string form)
  • Empty quoted keys (valid but discouraged)
  • Dotted keys → nested tables
  • Whitespace ignored around dots in dotted keys
  • Reject duplicate key definitions
  • Reject defining a key as both scalar and table

Strings

  • Basic strings ("...") with all listed escape sequences (\b \t \n \f \r \e \" \\ \xHH \uHHHH \UHHHHHHHH)
  • Reject reserved/unknown escape sequences
  • Reject control chars in basic strings (other than tab)
  • Multi-line basic strings ("""...""")
  • Multi-line basic: trim leading newline immediately after opening delimiter
  • Multi-line basic: line-ending backslash trims whitespace
  • Multi-line basic: 1, 2, and escaped 3+ quote handling
  • Literal strings ('...') — no escaping
  • Multi-line literal strings ('''...''')
  • Multi-line literal: trim leading newline; preserve all other whitespace
  • Reject control chars in literal strings (other than tab/CR/LF in multi-line)
  • Newline normalization on emit (configurable or platform default)

Integer

  • Decimal with optional +/- sign
  • Underscore digit grouping (with surround-digit validation)
  • Reject leading zeros on decimal
  • Hex 0xDEADBEEF (case-insensitive)
  • Octal 0o755
  • Binary 0b11010110
  • Reject leading + on hex/oct/bin
  • Underscores in hex/oct/bin (between digits, not after prefix)
  • At least 64-bit signed range supported losslessly; throw on overflow

Float

  • Fractional only (3.14)
  • Exponent only (5e+22, 1e06)
  • Both fractional and exponent (6.626e-34)
  • Reject .7, 7., 3.e+20
  • Underscore digit grouping
  • inf, +inf, -inf
  • nan, +nan, -nan
  • Round-trip [double]::PositiveInfinity etc. on emit
  • +0.0 / -0.0 distinguished per IEEE 754

Boolean

  • true / false (lowercase only)
  • Reject True / TRUE / Yes etc. as booleans

Date / Time

  • Offset date-time (Z and ±HH:MM)
  • T or space delimiter between date and time
  • Fractional seconds (millisecond minimum; truncate beyond supported precision)
  • Seconds-omitted form (07:32Z, 07:32-07:00) → assume :00
  • Local date-time (no offset)
  • Local date (1979-05-27)
  • Local time (07:32:00, 07:32)

Array

  • Homogeneous arrays
  • Mixed-type arrays
  • Nested arrays
  • Multi-line arrays with newlines and comments between elements
  • Trailing comma allowed
  • Inline tables as array elements

Table

  • [table] header
  • Dotted-key table headers ([a.b.c])
  • Whitespace tolerance inside header brackets
  • Implicit super-table creation
  • Empty tables
  • Reject redefining a table
  • Reject defining a table that conflicts with a previously-defined dotted key
  • Top-level/root table behavior
  • Dotted-key sub-tables under [table] headers
  • Reject extending a table defined via dotted keys with a [header] for the same path

Inline table

  • Single-line inline tables ({ a = 1, b = 2 })
  • Multi-line inline tables (1.1.0 feature)
  • Trailing comma allowed (1.1.0 feature)
  • Nested inline tables
  • Dotted keys inside inline tables
  • Reject adding keys to an inline table outside its braces
  • Reject using inline table to extend already-defined [table]

Array of tables

  • [[table]] header creates and appends elements
  • Empty array-of-tables element
  • Sub-tables under array-of-tables elements ([fruits.physical])
  • Nested arrays of tables ([[fruits.varieties]])
  • Reject reversed parent/child ordering
  • Reject appending to statically-defined array
  • Reject [table] conflicting with prior [[table]] on same path
  • Reject [[table]] conflicting with prior [table] on same path

Output / emission

  • Round-trip preserves type for every primitive in spec
  • Sensible header style (top-level scalars first, then [section]s)
  • Auto-select string style based on content
  • Emit dates/times in canonical RFC 3339 form
  • Quote keys when bare-key rules would be violated

Implementation plan

Core functions

  • Add ConvertFrom-Toml in src/functions/public/ConvertFrom-Toml.ps1 with ConvertFrom-Tml alias
  • Add ConvertTo-Toml in src/functions/public/ConvertTo-Toml.ps1 with ConvertTo-Tml alias
  • Add private helpers — line/token stream, value parser, key-path resolver, table-graph builder, emitter
  • Remove placeholder Get-PSModuleTest / Set-PSModuleTest / Test-PSModuleTest / New-PSModuleTest scaffolding

Tests

  • tests/ConvertFrom-Toml.Tests.ps1 — behavior tests by spec section (one Describe per section above)
  • tests/ConvertTo-Toml.Tests.ps1 — emit + round-trip tests
  • tests/Spec-1.1.0.Tests.ps1 — every example from the spec page as an It block
  • CI integration with BurntSushi/toml-test (decoder + encoder modes)
  • Cover every "INVALID" example from the spec with a Should -Throw assertion

Documentation

  • Update README — fill in NAME / DESCRIPTION, document TOML v1.1.0 alignment with link, add cmdlet usage examples mirroring the PSModule/Yaml README
  • Update examples/General.ps1 with TOML usage demos
  • Document the type-mapping table in README

Cleanup

  • Remove or repurpose the template scaffolding under src/functions/public/PSModule/ and src/functions/public/SomethingElse/ (placeholders from the module template)
  • Remove tests/PSModuleTest.Tests.ps1

Open questions

  • Local Time → PowerShell type: [timespan] represents duration, not time-of-day. Options: (a) use [timespan] and document the convention, (b) introduce a small [TomlLocalTime] struct, (c) use [datetime] with date pinned to 0001-01-01. Lean toward (a) for simplicity, but flag for review during implementation.
  • Big integers: should integers outside [long] range automatically promote to [bigint], or throw? Spec says "must throw an error" if a value cannot be represented losslessly. Lean toward throwing by default with an opt-in -AllowBigInteger switch (defer that switch to a follow-up).
  • -Compress / inline-table emission style: TOML doesn't have a JSON-style "compress" concept, but ConvertTo-Toml could expose -InlineTableThreshold or similar to force certain depths inline. Out of scope for v1.
  • Comment preservation: explicitly out of scope for this issue. A separate tracking issue will mirror 🚀 [Feature]: Preserve comments through ConvertFrom-YamlConvertTo-Yaml round-trip Yaml#28 once this lands.

References

Metadata

Metadata

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions