chore(apertus): close out rollout — remove deprecated runtimes #95
Merged
Closes the Apertus rollout (APERTUS_ROLLOUT.md) at 3-of-3 PRs (plan #91, routing #92, chat-template docs #93, tool calling #94). Drops the optional PR 4 (rebuild kapertus-cli): the unified skainet-cli already covers Apertus end-to-end after #92, and the workspace direction (per `81f3506` deleting kqwen / kvoxtral / kapertus CLIs) is consolidation rather than per-model binaries.

Also lands the deprecated-runtime cleanup that the rollout deferred. After PR 1 made `OptimizedLLMRuntime + apertusNetwork()` the canonical path, the hand-coded `ApertusRuntime` and its quantized variant served no production callers, and both were flagged @Deprecated in #92's wake. They are removed now rather than maintained as stale code through a separate deprecation cycle.

Deleted:
- llm-inference/apertus/.../ApertusRuntime.kt: hand-coded decoder runtime. Replaced by OptimizedLLMRuntime + apertusNetwork() per #92.
- llm-inference/apertus/.../ApertusQuantizedRuntime.kt: lazy-dequant variant. Same canonical replacement path; QuantPolicy.NATIVE_OPTIMIZED through the unified loader covers the same memory profile.
- llm-inference/apertus/.../ApertusAttentionBackend.kt: interface used only by the two deleted runtimes.
- llm-inference/apertus/.../ApertusCpuAttentionBackend.kt: implementation, used only by the two deleted runtimes.
- llm-inference/apertus/.../ApertusRuntimeSmokeTest.kt: exercised the deleted ApertusRuntime.
- llm-inference/apertus/.../ApertusQuantizedRuntimeSmokeTest.kt: exercised the deleted ApertusQuantizedRuntime.

Extracted (kept):
- llm-inference/apertus/.../ApertusXIELU.kt (new): the `xielu()` and `softplus()` reference activation helpers were public functions in ApertusRuntime.kt and are still useful as a numerical reference (ApertusXIELUTest validates the math, and future xIELU implementations can point at this file as the golden reference). Pulled out as a standalone activation module so the test keeps compiling after ApertusRuntime is gone.

Untouched (still on the production path):
- ApertusNetworkDef.kt: the apertusNetwork() DSL with xIELU op, QK-Norm, ungated FFN.
- ApertusNetworkLoader.kt: module-build entry point.
- ApertusWeightLoader.kt + ApertusSafeTensorsLoader.kt: GGUF + SafeTensors ingestion. Still used by both ApertusNetworkLoader and (in transition) ApertusIngestion's loadQuantized* methods.
- ApertusRuntimeWeights.kt: data classes (ApertusModelMetadata, ApertusLayerWeights, ApertusXIELUParams). Used by the network path.
- ApertusIngestion.kt (kapertus runtime): thin facade. Its loadQuantized* methods reference ApertusQuantizedRuntimeWeights, which lives in the (still-extant) ApertusWeightLoader codepath. Verified that it compiles cleanly after this PR.

Stale code-comment references to "ApertusRuntime" remain in the kdocs of OptimizedLLMRuntime.kt and llm-core's OutputEquivalenceTest.kt, where they describe the migration history. They are not load-bearing and are left for a future docs sweep.

APERTUS_ROLLOUT.md is rewritten as a closure document (status: complete, summary of the four merged PRs, what was dropped from PR 4, post-cleanup test footprint).

Verification:
- `:llm-inference:apertus:jvmTest`: 12/12 (ConfigParser 6, XIELU 6).
- `:llm-agent:jvmTest --tests '*Apertus*'`: 21/21 (ChatTemplate 10, ParserStrategy 11).
- `:llm-runtime:kapertus:compileKotlinJvm`, `:llm-apps:skainet-cli:compileKotlin`, `:llm-core:compileTestKotlinJvm`: all green after the deletes.

Total Apertus test footprint after this commit: 33 tests, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Closes the Apertus rollout at 3-of-3 PRs (plan #91, routing #92, chat-template docs #93, tool calling #94). Drops the optional PR 4 (rebuild kapertus-cli) per the workspace consolidation direction. Also lands the deprecated-runtime cleanup the rollout deferred.
What's deleted
After PR 1 made `OptimizedLLMRuntime + apertusNetwork()` the canonical Apertus inference path, the hand-coded runtimes served no production callers (and were flagged `@Deprecated` in #92's wake):
- `ApertusRuntime.kt`
- `ApertusQuantizedRuntime.kt`
- `ApertusAttentionBackend.kt` + `ApertusCpuAttentionBackend.kt` (only used by the two deleted runtimes)
- `ApertusRuntimeSmokeTest.kt` + `ApertusQuantizedRuntimeSmokeTest.kt` (exercised deleted code)
What's preserved
- `ApertusXIELU.kt` (new): extracted the public `xielu()` and `softplus()` reference activation helpers that were defined in `ApertusRuntime.kt`. Still useful as a numerical reference (validated by `ApertusXIELUTest`), and a single golden source for any future xIELU implementation to point at.
- `ApertusNetworkDef`, `ApertusNetworkLoader`, `ApertusWeightLoader`, `ApertusSafeTensorsLoader`, `ApertusRuntimeWeights`, `ApertusConfigParser`, `QuantizedTensor`, `ApertusIngestion`.
What changed in `APERTUS_ROLLOUT.md`
Rewritten as a closure document. Status: complete. Summary table of the four PRs. Note explaining why PR 4 was dropped (consolidation direction; `skainet-cli` already covers Apertus end-to-end). Post-cleanup test footprint.
Test plan
- `:llm-inference:apertus:jvmTest`: 12/12 (ConfigParser 6, XIELU 6)
- `:llm-agent:jvmTest --tests '*Apertus*'`: 21/21 (ChatTemplate 10, ParserStrategy 11)
- `:llm-runtime:kapertus:compileKotlinJvm`, `:llm-apps:skainet-cli:compileKotlin`, `:llm-core:compileTestKotlinJvm`: all green after the deletes
Total Apertus test footprint after this PR: 33 tests, all green.
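For readers who have not opened `ApertusXIELU.kt`, here is a minimal sketch of what a scalar xIELU reference with a stable `softplus()` could look like. It follows the published xIELU formulation as I understand it (quadratic branch for positive inputs, `expm1`-based branch otherwise, with both alphas constrained through softplus). The `XieluParamsSketch` type, its field names, and the defaults `beta = 0.5` and `eps = -1e-6` are assumptions for illustration; the real signatures and `ApertusXIELUParams` fields are not shown in this PR and may differ.

```kotlin
import kotlin.math.abs
import kotlin.math.exp
import kotlin.math.expm1
import kotlin.math.ln1p
import kotlin.math.max
import kotlin.math.min

// Hypothetical parameter holder (assumption): not the project's ApertusXIELUParams.
data class XieluParamsSketch(
    val alphaPRaw: Double,      // unconstrained parameter for the positive branch
    val alphaNRaw: Double,      // unconstrained parameter for the negative branch
    val beta: Double = 0.5,     // linear term shared by both branches (assumed default)
    val eps: Double = -1e-6,    // small negative clamp applied before expm1 (assumed default)
)

// Numerically stable softplus: ln(1 + e^x), written so exp() never overflows.
fun softplus(x: Double): Double = max(x, 0.0) + ln1p(exp(-abs(x)))

// Scalar xIELU reference under the assumptions stated above.
fun xielu(x: Double, p: XieluParamsSketch): Double {
    val alphaP = softplus(p.alphaPRaw)            // kept positive via softplus
    val alphaN = p.beta + softplus(p.alphaNRaw)   // kept above beta via softplus
    return if (x > 0.0) {
        alphaP * x * x + p.beta * x
    } else {
        alphaN * (expm1(min(x, p.eps)) - x) + p.beta * x
    }
}
```

A scalar `Double` version like this is slow but easy to audit, which is the property a golden reference needs when tensor-level implementations are compared against it.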
Stale references not addressed here
`OptimizedLLMRuntime.kt` and `OutputEquivalenceTest.kt` have kdoc comments referring to "ApertusRuntime" as part of the migration history. Code-comment only; not load-bearing. Worth a future docs sweep but not blocking.
🤖 Generated with Claude Code