diff --git a/README.md b/README.md
index 3a8d9c2..26d24ed 100644
--- a/README.md
+++ b/README.md
@@ -11,18 +11,52 @@
 #####################################################################
 ```
-# 🖥️ GPT-2 in BASIC: AI Meets Retrocomputing
+# GPT2-BASIC: Portable Machine Intelligence in BASIC
-*What if transformer models had been invented during the 486 era?*
+GPT2-BASIC is a fixed-point transformer and assistant runtime implemented in
+BASIC for DOS-class machines. It is not a web frontend, an API wrapper, or a
+mock terminal demo. The release build compiles under DOS FreeBASIC, loads local
+model artifacts from disk, performs GPT-style inference with integer arithmetic,
+switches hot-loadable assistant packs, and uses local indexed knowledge files for
+fast recall on constrained systems.
+
+The project is built around a practical claim: language-model inference and
+useful local assistant behavior are portable algorithms. With appropriate
+quantization, storage layout, tokenizer design, and retrieval indexes, the same
+core ideas can run far below the hardware floor normally associated with modern
+LLMs.
 ```
 ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
 ```
+## ► What It Does
+
+- Runs a DOS GPT-style transformer runtime in FreeBASIC with Q20.12 fixed-point
+  weights and DOS-loadable vocabulary/model files.
+- Provides an assistant shell with multiple local packs for chat, DOS help,
+  office tasks, development notes, and portable-system guidance.
+- Uses hot-swappable local model/knowledge assets rather than network access.
+- Combines tiny model generation, golden replies, session memory, pack retrieval,
+  binary KDB/KB2 records, and sharded `KB2T?.TXT` term indexes.
+- Ships DOSBox, QEMU, hardware-transfer, and launch-kit bundles for release and
+  validation workflows.
+- Includes host and QEMU gates that produce machine-readable evidence instead of
+  relying on screenshots or claims.
+
+This is still a deliberately small model, not a frontier LLM compressed into a
+486. The useful behavior comes from a complete constrained-system design:
+fixed-point inference, compact local weights, curated language packs, indexed
+recall, deterministic validation, and runtime fallbacks that keep answers useful
+when raw generation is weak.
+
 ## ► Project Status
 The current production path is the promoted `MODEL_LEXICON_GOLD_V4_S3000`
-checkpoint running inside the DOS `GPT2.EXE` program. The model is
-trained/exported on the host, copied into `C:\MODEL`, and executed by the
-FreeBASIC fixed-point transformer runtime.
+checkpoint running inside the DOS `GPT2.EXE` program. The model is trained and
+exported on the host, copied into `C:\MODEL`, and executed by the FreeBASIC
+fixed-point transformer runtime. The assistant release path also includes five
+pack directories with local model metadata, golden replies, HELP rows, KDB/KB2
+knowledge records, aggregate `KB2TERM.TXT` term indexes, and sharded `KB2T?.TXT`
+term indexes for faster recall.
 Verified production surface:
@@ -94,32 +128,27 @@ Release mode choice:
 ## ► About This Project
-This implementation demonstrates that **modern AI concepts like transformers are fundamentally just algorithms** - mathematical operations that can be implemented even on hardware from decades ago. It bridges two worlds typically considered separate: cutting-edge AI and vintage computing.
-
-Think of it as *digital archaeology in reverse* - building tomorrow's technology with yesterday's tools.
-
-### ■ Why This Matters
-
-```
-╔══════════════════════════════════════════════════════════════════╗
-║ "We were so busy asking if LLMs could run on a 486, we didn't ║
-║ stop to think if they should. The answer, by the way, is yes." ║
-║ ║
-║ — Anonymous DOS Enthusiast ║
-╚══════════════════════════════════════════════════════════════════╝
-```
-
-This project serves multiple purposes:
-
-1. **Demystifying Modern AI**: By stripping away the layers of optimization that make modern transformers inscrutable, we expose their fundamental mathematical operations.
-
-2. **Historical "What If?"**: Imagine an alternate timeline where transformers were invented in the early 1990s. How would they have been implemented with the constraints of the era?
+This project demonstrates that language-model inference is not intrinsically
+tied to a cloud service, a GPU, Python, or a modern operating system. The
+software here keeps the implementation close to the machine: BASIC source,
+integer math, explicit binary files, simple text indexes, and repeatable DOS
+validation runs.
-3. **Educational Tool**: Learn about both transformer architecture and optimization techniques for constrained environments in an accessible way.
+The repository has three audiences:
-4. **Bridge Between Communities**: Connects retro-computing enthusiasts with modern AI concepts, and helps AI practitioners appreciate the elegance of optimization under constraints.
+1. **Constrained-system developers** who want concrete techniques for local AI
+   under severe memory, storage, and CPU limits.
+2. **AI practitioners** who want a readable, inspectable transformer runtime
+   without the usual stack of frameworks and accelerators.
+3. **Retro and embedded-system builders** who want a working assistant runtime,
+   not just a conceptual port.
-5. **Practical Local AI**: Provides a working DOS runtime, assistant packs, evidence harnesses, and transfer tooling for constrained and retro systems.
+The important result is not that this tiny model competes with modern hosted
+LLMs. It does not. The important result is that the full loop is real: local
+weights, local tokenizer, local inference, local pack switching, local indexed
+recall, DOS execution, QEMU stress testing, and a physical-machine transfer
+workflow. Physical returned board logs are still pending, so hardware-specific
+speed claims remain QEMU evidence until real-system logs are captured.
 ## ► Comprehensive Documentation
@@ -133,12 +162,14 @@ This extensive documentation includes:
 - Complete technical explanations of all core innovations and optimization techniques
 - Platform-specific implementation considerations
 - Thorough performance analysis with benchmarking methodology
-- Counterfactual historical analysis of how this implementation might have altered computing history
+- Constrained-system design analysis for DOS-class and embedded targets
 - Educational value and insights for modern edge AI development
 - Future directions and applications
 - Comprehensive academic references
-The paper bridges technical implementation details with historical analysis to provide both practical insights and thought-provoking exploration of an alternate AI timeline.
+The paper bridges technical implementation details with historical context to
+show how the same algorithmic ideas can be lowered into much smaller runtime
+environments.
 Public release and media:
diff --git a/docs/marketing/promo-kit.md b/docs/marketing/promo-kit.md
index c005ed5..576dfba 100644
--- a/docs/marketing/promo-kit.md
+++ b/docs/marketing/promo-kit.md
@@ -5,24 +5,25 @@ GitHub release, video descriptions, social posts, and a small landing page.
 ## One-Line Description
-GPT2-BASIC runs small GPT-style transformer models inside DOS using a
-FreeBASIC fixed-point inference runtime.
+GPT2-BASIC runs local GPT-style inference and assistant recall inside DOS using
+a FreeBASIC fixed-point runtime.
 ## Short Description
-GPT2-BASIC asks a deliberately unreasonable question: what would a transformer
-language model look like if it had to run in a 486-era DOS software stack? The
-project includes host-side training/export tools, a FreeBASIC fixed-point
-runtime, DOS/QEMU evidence, assistant packs, and hardware-transfer tooling for
-physical machines.
+GPT2-BASIC is a portable machine-intelligence runtime for DOS-class systems. It
+includes host-side training/export tools, a FreeBASIC fixed-point transformer,
+hot-loadable local assistant packs, sharded recall indexes, DOS/QEMU evidence,
+and hardware-transfer tooling for physical machines.
 ## Longer Description
-GPT2-BASIC is an educational retrocomputing AI project. It trains compact
-GPT-style models on a modern host, exports fixed-point weights, and runs the
-inference path inside a DOS FreeBASIC program. The goal is not to compete with
-modern LLMs. The goal is to make the transformer loop visible, constrained,
-and reproducible on hardware and software that were never designed for it.
+GPT2-BASIC is a constrained-system AI project. It trains compact GPT-style
+models on a modern host, exports fixed-point weights, and runs the inference
+path inside a DOS FreeBASIC program. The assistant combines local model output,
+pack-specific golden replies, session memory, binary knowledge records, and
+sharded term indexes. The goal is not to compete with modern hosted LLMs. The
+goal is to make useful local machine intelligence visible, portable, and
+reproducible under severe CPU, memory, storage, and operating-system limits.
 The current preview includes a production DOS runtime, curated model artifacts,
 assistant packs, QEMU evidence, quality reports, and a hardware-transfer bundle
@@ -33,10 +34,12 @@ for physical DOS systems.
 - Real fixed-point transformer inference in DOS.
 - FreeBASIC source you can read, build, and inspect.
 - Host-trained, DOS-exported model artifacts.
-- Assistant packs for CHAT, DOSHELP, and OFFICE workflows.
+- Hot-loadable assistant packs for CHAT, DOSHELP, OFFICE, DEV, and PORTABLE
+  workflows.
+- Sharded `KB2T?.TXT` recall indexes for faster local knowledge lookup.
 - QEMU 486DX2/66 evidence plus a path to physical 486 validation.
 - Reproducible preview-release and hardware-transfer zips.
-- Educational focus: modern AI concepts explained through old constraints.
+- Educational focus: modern AI concepts explained through constrained systems.
 - Substrate-portability argument: the runtime is built from primitive
   operations that can be lowered to C or assembly.
@@ -51,17 +54,18 @@ for physical DOS systems.
 ## Suggested Taglines
-- Modern transformer ideas, DOS-era constraints.
-- A tiny GPT-style model running where it has no business running.
-- Real inference. Real mode vibes.
+- Portable local intelligence, built from BASIC and fixed-point math.
+- GPT-style inference and recall for DOS-class machines.
+- Local weights, local recall, local execution.
 - The transformer loop, stripped down to BASIC.
-- AI archaeology from an alternate 1993.
+- Useful assistant behavior under severe constraints.
 ## GitHub Repository Blurb
-Small GPT-style transformer models exported to a DOS FreeBASIC fixed-point
-runtime, with QEMU evidence, assistant packs, and hardware-transfer tooling for
-486-era machines.
+Portable machine intelligence in BASIC: a QEMU-verified DOS/486 transformer and
+assistant runtime with fixed-point GPT inference, hot-swappable local model
+packs, sharded recall indexes, and release bundles for retro and constrained
+systems.
 ## Release Announcement Draft
@@ -84,8 +88,9 @@ GPT2-BASIC includes a tiny transformer runtime for DOS in FreeBASIC. It loads
 exported fixed-point weights, runs GPT-style inference, and includes assistant
 packs with reproducible quality evidence.
-What if transformer models had landed in the 486 era? GPT2-BASIC is a working
-answer: small, slow, inspectable, and running inside DOS.
+GPT2-BASIC keeps the AI stack local and inspectable: fixed-point weights,
+FreeBASIC source, sharded recall indexes, QEMU evidence, and DOS release
+bundles.
 ## Video Description Template
diff --git a/tests/test_public_repo_hygiene.py b/tests/test_public_repo_hygiene.py
index d239d59..80feb8e 100644
--- a/tests/test_public_repo_hygiene.py
+++ b/tests/test_public_repo_hygiene.py
@@ -46,7 +46,10 @@ def test_public_copy_uses_product_not_personal_framing(self) -> None:
         self.assertNotIn("**Proof of Concept**", readme)
         self.assertNotIn("Contact: Tsotchke Corporation / project owner", promo)
-        self.assertIn("**Practical Local AI**", readme)
+        self.assertIn("Portable Machine Intelligence in BASIC", readme)
+        self.assertIn("not a frontier LLM", readme)
+        self.assertIn("Physical returned board logs are still pending", readme)
+        self.assertIn("Do not claim physical 486 speed", promo)
     def test_substrate_portability_claim_is_qualified(self) -> None:
         substrate = (ROOT / "docs" / "substrate-portability.md").read_text(encoding="utf-8")