diff --git a/CHANGELOG.md b/CHANGELOG.md index e308e8a6..3fdc28d8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,33 @@ version line is kept in lock-step with the underlying SKaiNET engine The format roughly follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.31.1] — 2026-06-17 + +Adds **`transformer-core`** — the framework NN primitives (attention, the KV-cache family, embedding, +norms, RoPE, SwiGLU/GeGLU FFN, residual, linear projection) extracted from `llm-core` so they build on the +**full Kotlin target matrix including `androidNative`** (32-bit + 64-bit ARM). `llm-core` re-exports it, so +existing consumers are unaffected; ARM-native downstreams (e.g. on-device whisper) can now reuse the +primitives instead of reimplementing them. + +### Added + +- **`transformer-core` module** (`sk.ainet.transformers:skainet-transformers-transformer-core`) — the + lang-core-only NN primitives, reusable on every target incl. `androidNativeArm32`/`androidNativeArm64`. + Depends only on `skainet-lang-core`. Added to the BOM. (#183) + +### Changed + +- **`llm-core` now `api`-depends on `transformer-core` and re-exports it** (no behaviour change). The NN + primitive sources moved out of `llm-core` into `transformer-core`; `dsl/decoder/*` stayed (it needs the + compile-opt-coupled `HybridTransformerBlock`). `MultiHeadAttention`'s diagnostic `dumpStats` is decoupled + via a settable `mhaStatSink` that `HybridTransformerBlock` wires to llm-core's platform `dumpStats`. + +### Notes + +- **Engine pin unchanged (`skainet = 0.31.0`).** `transformer-core` needs nothing new from the engine (only + `skainet-lang-core`, already in 0.31.0), so this patch ships against engine **0.31.0** — the one case the + transformers-`X.Y.Z` ↔ engine-`X.Y.Z` alignment is intentionally relaxed (additive + engine-independent). + ## [0.31.0] — 2026-06-15 Version-aligned with **SKaiNET 0.31.0**. 
Completes the eager board-decode path diff --git a/README.md b/README.md index b74313e8..85057351 100644 --- a/README.md +++ b/README.md @@ -103,8 +103,13 @@ Honest status — see the project-status note at the top of this README. ## Current release -The current release is **0.31.0** — version-aligned with **SKaiNET 0.31.0**. -The headline is that the eager `NATIVE_OPTIMIZED` Gemma path now keeps the +The current release is **0.31.1** (against **SKaiNET 0.31.0**). It adds +**`transformer-core`** — the framework NN primitives (attention, KV-cache family, +embedding, norms, RoPE, FFNs, linear projection) extracted out of `llm-core` so they +build on the **full target matrix including `androidNative`** (32-bit + 64-bit ARM); +`llm-core` re-exports it, so nothing changes for existing consumers, and ARM-native +downstreams (e.g. on-device whisper) can reuse the primitives instead of reimplementing +them. The 0.31.0 highlights still apply: the eager `NATIVE_OPTIMIZED` Gemma path keeps the **tied Q8_0 lm_head packed** (paired with SKaiNET 0.31.0's `ops.transpose` fix for all packed dtypes), and `GemmaNetworkLoader.load()` takes an optional `maxInferenceLen` to cap the KV cache for constrained devices — together @@ -116,7 +121,7 @@ The recommended way to consume is via the BOM. It pins every published `skainet- ```kotlin dependencies { - implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.0")) + implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.1")) // Versions resolved from the BOM: implementation("sk.ainet.transformers:skainet-transformers-core") @@ -141,6 +146,7 @@ dependencies { | Module | Purpose | | -------------------- | ----------------------------------------------------------------------- | | `llm-api` | Framework-neutral interfaces (`ChatModel`, `EmbeddingModel`, `ToolDefinition`) — Spring AI-shaped. 
| +| `transformer-core` | Framework NN primitives (attention, KV-cache family, embedding, norms, RoPE, FFNs, linear projection). `lang-core`-only → **all targets incl. `androidNative`**; re-exported by `llm-core`. | | `llm-core` | `OptimizedLLMRuntime`, `ModelRegistry`, `UnifiedModelLoader`, shared abstractions. | | `llm-inference/` | Per-architecture network DSLs and weight loaders (`llama`, `gemma`, `qwen`, `apertus`, `bert`). | | `llm-runtime/` | Per-architecture runtime facades (`kllama`, `kgemma`, `kqwen`, `kapertus`). | @@ -193,6 +199,15 @@ try (KLlamaSession session = KLlamaJava.loadGGUF(modelPath, /* systemPrompt */ n See `llm-test/llm-test-java/src/test/java/.../KLlamaJavaToolCallingTest.java` for a runnable reference. +## What's new in 0.31.1 + +- **`transformer-core` module — NN primitives reusable on all targets incl. `androidNative`.** The + attention / KV-cache / embedding / norm / RoPE / FFN / linear-projection primitives were trapped in + `llm-core` (whose io/compile/backend deps lack `androidNative`); they only need `skainet-lang-core` + (which has it), so they're extracted into `transformer-core` and `llm-core` re-exports them. Existing + consumers are unaffected; ARM-native downstreams (on-device whisper, future models) reuse them instead of + reimplementing. Ships against engine **0.31.0** (additive, no engine change). 
(#183) + ## What's new in 0.31.0 - **Tied Q8_0 lm_head stays packed (eager `NATIVE_OPTIMIZED`).** FunctionGemma's diff --git a/buildSrc/src/main/kotlin/sk/ainet/transformers/gradle/BomCoveragePlugin.kt b/buildSrc/src/main/kotlin/sk/ainet/transformers/gradle/BomCoveragePlugin.kt index 1582c05c..84fae2ac 100644 --- a/buildSrc/src/main/kotlin/sk/ainet/transformers/gradle/BomCoveragePlugin.kt +++ b/buildSrc/src/main/kotlin/sk/ainet/transformers/gradle/BomCoveragePlugin.kt @@ -37,6 +37,28 @@ class BomCoveragePlugin : Plugin<Project> { ) } + // Fail fast (at configuration time, not at Maven Central deploy time) when a NEW published + // module forgot its gradle.properties. Without POM_ARTIFACT_ID the artifact silently defaults + // to the bare project name (wrong coordinates / not the skainet-transformers-* convention); + // without POM_NAME, Maven Central rejects the deploy. This recurs on every new module — catch it. + val pomProblems = publishedPaths.mapNotNull { path -> + val p = project.project(path) + val missing = buildList { + if (p.findProperty("POM_ARTIFACT_ID")?.toString().isNullOrBlank()) add("POM_ARTIFACT_ID") + if (p.findProperty("POM_NAME")?.toString().isNullOrBlank()) add("POM_NAME") + } + if (missing.isEmpty()) null else "$path — missing ${missing.joinToString(" + ")}" + } + if (pomProblems.isNotEmpty()) { + throw GradleException( + "[bom-coverage] Published module(s) are missing required POM properties — the Maven " + + "Central deploy would fail:\n" + + pomProblems.joinToString("\n") { " - $it" } + + "\nAdd a `gradle.properties` to each module with POM_ARTIFACT_ID + POM_NAME " + + "(see `llm-core/gradle.properties`)." 
+ ) + } + project.dependencies.constraints { publishedPaths.forEach { add("api", project.project(it)) } } diff --git a/docs/modules/ROOT/pages/reference/architecture.adoc b/docs/modules/ROOT/pages/reference/architecture.adoc index 21fb21a4..179e71cd 100644 --- a/docs/modules/ROOT/pages/reference/architecture.adoc +++ b/docs/modules/ROOT/pages/reference/architecture.adoc @@ -4,7 +4,8 @@ == Module Structure ---- -llm-core Core abstractions (Tokenizer, InferenceRuntime, ModelRegistry) +transformer-core NN primitives (attention, KV-cache, embedding, norms, RoPE, FFNs) — all targets incl. androidNative +llm-core Core abstractions (Tokenizer, InferenceRuntime, ModelRegistry); re-exports transformer-core llm-agent Chat templates, tool calling, AgentLoop, ChatSession llm-inference/ llama/ LLaMA/Qwen network definition and weight loading diff --git a/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc b/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc index 4723dd70..47da0d65 100644 --- a/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc +++ b/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc @@ -25,7 +25,7 @@ In your `build.gradle.kts`: [source,kotlin] ---- dependencies { - implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.0")) + implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.1")) implementation("sk.ainet.transformers:skainet-transformers-runtime-kllama") implementation("sk.ainet.transformers:skainet-transformers-agent") @@ -41,7 +41,7 @@ Or in Maven (Maven needs the `-jvm` classifier suffix on platform artifacts): <dependency> <groupId>sk.ainet.transformers</groupId> <artifactId>skainet-transformers-bom</artifactId> - <version>0.31.0</version> + <version>0.31.1</version> <type>pom</type> <scope>import</scope> </dependency> diff --git a/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc b/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc index 0be131c3..cb9ebf50 100644 --- a/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc +++ b/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc 
@@ -52,7 +52,7 @@ The pieces you need live in three modules: [source,kotlin] ---- dependencies { - implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.0")) + implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.1")) implementation("sk.ainet.transformers:skainet-transformers-runtime-kllama") implementation("sk.ainet.transformers:skainet-transformers-agent") diff --git a/gradle.properties b/gradle.properties index 942ff890..a1bd7efa 100644 --- a/gradle.properties +++ b/gradle.properties @@ -1,5 +1,5 @@ GROUP=sk.ainet.transformers -VERSION_NAME=0.31.0 +VERSION_NAME=0.31.1 POM_DESCRIPTION=SKaiNET-transformers diff --git a/transformer-core/gradle.properties b/transformer-core/gradle.properties new file mode 100644 index 00000000..7974dd47 --- /dev/null +++ b/transformer-core/gradle.properties @@ -0,0 +1,2 @@ +POM_ARTIFACT_ID=skainet-transformers-transformer-core +POM_NAME=skainet transformers transformer-core
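
Reviewer note: the fail-fast check added to `BomCoveragePlugin.kt` can be tried outside Gradle. The block below is a minimal sketch, not the plugin's code. `ModuleProps`, `requiredPomKeys`, and `findPomProblems` are hypothetical names, a plain map stands in for `Project.findProperty`, and the module paths and values are made up. It mirrors the `mapNotNull` logic in the patch: a key counts as missing when it is absent or blank, and the result holds one message per offending module.

```kotlin
// Hypothetical stand-in for a Gradle project: its path plus its properties.
data class ModuleProps(val path: String, val props: Map<String, String?>)

// The two properties the patch requires on every published module.
val requiredPomKeys = listOf("POM_ARTIFACT_ID", "POM_NAME")

// Returns one message per module that lacks at least one required key.
// A key counts as missing when it is absent, null, or blank.
fun findPomProblems(modules: List<ModuleProps>): List<String> =
    modules.mapNotNull { module ->
        val missing = requiredPomKeys.filter { module.props[it].isNullOrBlank() }
        if (missing.isEmpty()) null
        else "${module.path} - missing ${missing.joinToString(" + ")}"
    }

fun main() {
    val modules = listOf(
        // Complete: both keys present (values are illustrative only).
        ModuleProps(
            ":module-a",
            mapOf("POM_ARTIFACT_ID" to "example-module-a", "POM_NAME" to "example module a"),
        ),
        // Incomplete: POM_NAME was forgotten.
        ModuleProps(":module-b", mapOf("POM_ARTIFACT_ID" to "example-module-b")),
    )
    // Prints ":module-b - missing POM_NAME"
    findPomProblems(modules).forEach(::println)
}
```

In the plugin itself the equivalent list feeds a `GradleException` at configuration time, so a new module without its `gradle.properties` fails the build locally rather than at deploy time.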