Skip to content

fix(hal): add tanh + integer clamp to SPIR-V interpreter#225

Merged
kolkov merged 1 commit into
gogpu:mainfrom
amery:pr-amery-glsl-ext-builtins
Jun 24, 2026
Merged

fix(hal): add tanh + integer clamp to SPIR-V interpreter#225
kolkov merged 1 commit into
gogpu:mainfrom
amery:pr-amery-glsl-ext-builtins

Conversation

@amery

@amery amery commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

The software SPIR-V interpreter's GLSL.std.450 dispatch was missing two
common math builtins. Unknown ext-inst opcodes fall through to
return ValUint(0), so shaders calling these got silently wrong
results (zeros) rather than an error
:

  • tanh()Tanh (21) absent (all hyperbolics, 19–24, were missing)
  • integer clamp()SClamp (45) and UClamp (44) absent; only the
    float FClamp (43) was handled

Changes

  • Implement Sinh/Cosh/Tanh/Asinh/Acosh/Atanh (19–24) and
    UClamp/SClamp (44–45) in hal/software/shader/glsl_ext.go.
  • Add glslTernaryUint / glslTernaryInt helpers, mirroring the
    existing glslBinary* ones.
  • Interpreter tests for each: hyperbolic rows in TestGLSLUnaryIntrinsics,
    plus TestGLSLSClamp / TestGLSLUClamp.

Testing

go test ./hal/software/shader/ and go vet ./hal/software/shader/ pass.

Discovery

Surfaced by the born ML framework's webgpu backend on the software path,
where its Tanh and Int32 Clamp ops returned all-zeros. With these
opcodes implemented, both produce correct results.

No existing issue tracked this; still reproducible on main.

The software interpreter's GLSL.std.450 dispatch only handled float
clamp (FClamp); the hyperbolics (Sinh/Cosh/Tanh and inverses) and the
integer clamp variants (SClamp/UClamp) were missing. Unknown ext-inst
opcodes fall through to `return ValUint(0)`, so any shader calling
`tanh()` or integer `clamp()` silently produced zeros instead of the
real result — wrong output, not a visible error.

Surfaced by born's webgpu backend on the software path, where its Tanh
and Int32 Clamp ops returned all-zeros.

Add the missing opcodes (19–24, 44–45) plus ternary int helpers, with
interpreter tests for each.

Signed-off-by: Alejandro Mery <amery@geeks.cl>
@amery amery requested a review from kolkov as a code owner June 24, 2026 18:59
@codecov

codecov Bot commented Jun 24, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@kolkov kolkov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Clean implementation, follows all existing patterns.

Verified:

  • All 8 opcode numbers correct per GLSL.std.450 spec (Sinh=19–Atanh=24, UClamp=44, SClamp=45)
  • Hyperbolic ops use glslUnaryFloat — consistent with Sin/Cos/Tan pattern
  • Integer clamp ops use new glslTernaryUint/glslTernaryInt helpers — mirrors existing glslBinary* style
  • Tests cover all branches: saturation boundaries (tanh ±10), domain boundaries (acosh(1)), negative range (SClamp)
  • float32↔float64 conversion chain matches every other float op in the file

One systemic note (not blocking this PR): unknown GLSL.std.450 opcodes silently return ValUint(0) — this is how Born ML got zeros in the first place. Filed internally as BUG-SW-009 to add a warning log for unimplemented opcodes.

Thanks @amery — shipping in v0.30.3.

@kolkov kolkov merged commit eba655c into gogpu:main Jun 24, 2026
11 checks passed
@amery amery deleted the pr-amery-glsl-ext-builtins branch June 24, 2026 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants