
Extract transformer-core: NN primitives reusable on all targets (incl. androidNative) #185

Merged
michalharakal merged 2 commits into develop from feature/transformer-core on Jun 17, 2026
@michalharakal

Contributor

Closes #183.

Extracts llm-core's lang-core-only NN primitives (KV-cache family, MultiHeadAttention, Embedding,
RMSNormalization, RoPE, SwiGLU/GeGLU FFN, ResidualAdd, LinearProjection, TransformerDsl) into a
new transformer-core module that depends only on skainet-lang-core and declares the full target
matrix including androidNativeArm32/androidNativeArm64. llm-core api-depends on it (re-exports),
so existing consumers are unaffected; ARM-native consumers (e.g. skainet-whisper-kmp) can reuse the
primitives instead of reimplementing.
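
The module layout described above could look roughly like the following Gradle Kotlin DSL fragments. This is a sketch under assumptions, not the build files from this PR: the plugin setup, the project paths (`:skainet-lang-core`, `:transformer-core`), and the exact target list are hypothetical, and skainet-lang-core may well be consumed as a published artifact rather than a project dependency.

```kotlin
// --- transformer-core/build.gradle.kts (hypothetical sketch) ---
plugins {
    kotlin("multiplatform")
}

kotlin {
    jvm()
    androidNativeArm32()
    androidNativeArm64()
    // remaining targets of the full matrix would be declared here

    sourceSets {
        val commonMain by getting {
            dependencies {
                // The only dependency of the new module (path is an assumption).
                api(project(":skainet-lang-core"))
            }
        }
    }
}

// --- llm-core/build.gradle.kts (hypothetical sketch, separate file) ---
// kotlin {
//     sourceSets {
//         val commonMain by getting {
//             dependencies {
//                 // `api` rather than `implementation`: transformer-core's types
//                 // stay visible to llm-core's existing consumers.
//                 api(project(":transformer-core"))
//             }
//         }
//     }
// }
```

Using `api` on the llm-core side is what keeps the move transparent to downstream code: anything that previously imported the primitives through llm-core still resolves them transitively.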

Why

The primitives only need lang-core (which has androidNative), but were trapped in llm-core, whose other
deps (io-gguf/io-core/compile-*/backend-cpu) lack androidNative. They're dtype-agnostic (just call
ops.*), so this target generalization is orthogonal to the quant/dtype generalization (#178) and meets
it cleanly at these primitives. See transformer-core/README.md.
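
To illustrate what "dtype-agnostic" means here, the sketch below writes a simplified RMS normalization purely in terms of an ops interface. The `Ops<T>` interface, its method names, and `FloatArrayOps` are hypothetical stand-ins invented for this example; they are not skainet-lang-core's actual API, and the real RMSNormalization may differ (for instance by applying a learned weight).

```kotlin
import kotlin.math.sqrt

// Hypothetical ops surface; NOT the real skainet-lang-core interface.
interface Ops<T> {
    fun mul(a: T, b: T): T          // element-wise product
    fun mean(a: T): Float           // mean over all elements
    fun scale(a: T, s: Float): T    // multiply every element by s
}

// Simplified RMS normalization: x / sqrt(mean(x^2) + eps).
// The primitive never inspects T, so any backend or dtype that
// provides Ops<T> can reuse it unchanged.
fun <T> rmsNorm(ops: Ops<T>, x: T, eps: Float = 1e-6f): T {
    val meanSq = ops.mean(ops.mul(x, x))
    return ops.scale(x, 1f / sqrt(meanSq + eps))
}

// One possible backend for the sketch: plain FloatArray.
object FloatArrayOps : Ops<FloatArray> {
    override fun mul(a: FloatArray, b: FloatArray) = FloatArray(a.size) { a[it] * b[it] }
    override fun mean(a: FloatArray) = a.sum() / a.size
    override fun scale(a: FloatArray, s: Float) = FloatArray(a.size) { a[it] * s }
}

fun main() {
    val y = rmsNorm(FloatArrayOps, floatArrayOf(3f, 4f))
    println(y.joinToString())
}
```

Because the primitive depends only on the ops surface, swapping in a quantized or otherwise different tensor type is a backend concern, which is why the target split and the dtype work can proceed independently.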

What stayed / decoupled

  • dsl/decoder/* stays in llm-core (DecoderTransformerNetwork needs apps.llm.HybridTransformerBlock,
    which is compile-opt-coupled).
  • MultiHeadAttention's diagnostic dumpStats back-reference is replaced by a settable mhaStatSink
    (default no-op) that HybridTransformerBlock wires to llm-core's platform dumpStats, so no behaviour is lost.
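
The decoupling in the second bullet follows a common "settable sink" pattern, sketched below. Only the name `mhaStatSink` comes from this PR; its signature, the `StatSink` alias, and `attentionScores` are assumptions made up for illustration and do not reflect the actual code.

```kotlin
// Hypothetical sketch: only the name mhaStatSink is taken from the PR.

/** Diagnostic callback: receives a label and the values to summarise. */
typealias StatSink = (label: String, values: FloatArray) -> Unit

/** Settable hook. The default does nothing, so the core module needs no platform code. */
var mhaStatSink: StatSink = { _, _ -> }

/** Stand-in for a core primitive that reports diagnostics through the hook. */
fun attentionScores(q: FloatArray, k: FloatArray): FloatArray {
    val scores = FloatArray(q.size) { i -> q[i] * k[i] }
    mhaStatSink("scores", scores)
    return scores
}

fun main() {
    // With the default sink, the call has no diagnostic side effect.
    attentionScores(floatArrayOf(1f, 2f), floatArrayOf(3f, 4f))

    // The higher-level module installs its own sink (here: record into a list).
    val seen = mutableListOf<String>()
    mhaStatSink = { label, values -> seen += "$label:${values.size}" }
    attentionScores(floatArrayOf(1f, 2f), floatArrayOf(3f, 4f))
    println(seen) // [scores:2]
}
```

The dependency direction is the point of the design: the lower module exposes a hook, and the higher module (llm-core in this PR) assigns it, so the lower module never has to import the higher one.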

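Verified

  • transformer-core compiles for jvm, androidNativeArm32, and androidNativeArm64.
  • llm-core builds and jvmTest is green (5/5).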

Follow-up (noted in the README)

The pre-transpose marker (#178 "Solution C") will land in LinearProjection.kt, now here; and
RowDequantSource + packing (today in sk.ainet.models.gemma) are the next hoist candidates — tracked in #184.

michalharakal and others added 2 commits June 17, 2026 11:32
… androidNative

llm-core's transformer primitives (KV-cache family, MultiHeadAttention, Embedding,
RMSNormalization, RoPE, SwiGLU/GeGLU FFN, ResidualAdd, LinearProjection, …) only need
skainet-lang-core (which has androidNative), but were trapped in llm-core, whose other
deps (io-gguf/io-core/compile-*/backend-cpu) have no androidNative — so ARM-native
consumers (the Amlogic box) couldn't reuse them and had to reimplement.

Move the 15 lang-core-only NN files (transformer/, layers/, normalization/,
dsl/TransformerDsl.kt) into a new transformer-core module that depends ONLY on
skainet-lang-core and declares the full matrix INCLUDING androidNativeArm32/Arm64.
llm-core api-depends on transformer-core (re-exports), so existing consumers are
unaffected. dsl/decoder/* stays in llm-core (DecoderTransformerNetwork needs
apps.llm.HybridTransformerBlock, which is compile-opt-coupled).

Decoupled the one back-reference: MultiHeadAttention's diagnostic dumpStats call now
goes through a settable `mhaStatSink` (default no-op) that HybridTransformerBlock wires
to llm-core's platform dumpStats — no functionality lost.

Verified: transformer-core compiles for jvm + androidNativeArm32 + arm64; llm-core
builds + jvmTest green (5/5).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…onflict assessment)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@michalharakal michalharakal merged commit 6c548d7 into develop Jun 17, 2026
2 checks passed
@michalharakal michalharakal deleted the feature/transformer-core branch June 17, 2026 16:54