Ternip is a parameterizable, open-source RTL implementation of a hardware accelerator targeting the MatmulFree LLM algorithm. MatmulFree LLMs replace traditional matrix multiplication with ternary weight operations, which significantly reduces compute and memory bandwidth and makes them well suited to hardware acceleration.
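To illustrate why ternary weights remove the need for multipliers, here is a minimal software sketch of a ternary matrix-vector product. It is not taken from the Ternip RTL: the function name `ternary_matvec` and the list-of-lists layout are assumptions chosen for illustration. With every weight restricted to -1, 0, or +1, each output element reduces to additions and subtractions of the inputs.

```python
def ternary_matvec(weights, x):
    """Compute y = W @ x for W with entries in {-1, 0, +1}.

    Hypothetical reference model: only additions and subtractions
    are used, never a multiplication.
    """
    y = []
    for row in weights:
        if len(row) != len(x):
            raise ValueError("row length must match input length")
        acc = 0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # +1 weight: add the input
            elif w == -1:
                acc -= xi      # -1 weight: subtract the input
            elif w != 0:
                raise ValueError("weights must be -1, 0, or +1")
            # 0 weight: the input is skipped entirely
        y.append(acc)
    return y


W = [[1, 0, -1],
     [-1, 1, 1]]
x = [3, 5, 2]
print(ternary_matvec(W, x))  # [1, 4]
```

In hardware, the same idea lets each weight be stored in about two bits and each multiply-accumulate be replaced by a conditional add or subtract.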
This project is licensed under the BSD 3-Clause License and is free to use, modify, and distribute.
| File | Description |
|---|---|
| rtl/ternip/ternip_core.sv | Top-level compute core |
| rtl/fus/ternip_tmatmul.sv | Ternary matrix multiplication unit |
| rtl/fus/ternip_rms.sv | RMS normalization unit |
| rtl/math/ | Math modules (sqrt, sigmoid, SiLU, and more) |
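For readers unfamiliar with the operations named in the table, the following floating-point sketch shows the standard definitions of RMS normalization and SiLU. It is a software reference only and makes no claim about how the Ternip modules implement them (number format, pipelining, and approximations are not described here). The names `rms_norm`, `silu`, `gain`, and `eps` are assumptions chosen for illustration.

```python
import math


def rms_norm(x, gain=None, eps=1e-6):
    """Standard RMSNorm: x / sqrt(mean(x^2) + eps), scaled per element by gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    if gain is None:
        gain = [1.0] * len(x)  # assumption: unit gain when none is given
    return [v * inv_rms * g for v, g in zip(x, gain)]


def silu(v):
    """SiLU activation: v * sigmoid(v). Not hardened against overflow for very negative v."""
    return v / (1.0 + math.exp(-v))


normalized = rms_norm([3.0, 4.0], eps=0.0)
print(normalized)   # elements now have a mean square of 1
print(silu(1.0))
```

A model like this can serve as a golden reference when comparing simulation results against expected values, provided the comparison tolerates the precision of whatever number format the hardware uses.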
Tests and build flow are not currently provided but will be made available shortly.
Ternip depends on the BaseJump STL hardware library.
Ternip is available on the FuseSoC Package Directory. To add it as a library using the GitHub repo directly:

    fusesoc library add ternip https://github.com/sifferman/ternip --sync-type=git

Then declare it as a dependency in your .core file:

    filesets:
      rtl:
        depend:
          - sifferman::ternip

If you use this work, please cite the original MatmulFree LLM paper:

    @article{zhu2024scalable,
      title   = {Scalable MatMul-free Language Modeling},
      author  = {Zhu, Rui-Jie and Zhang, Yu and Sifferman, Ethan and Sheaves, Tyler and Wang, Yiqiao and Richmond, Dustin and Zhou, Peng and Eshraghian, Jason K},
      journal = {arXiv preprint arXiv:2406.02528},
      year    = {2024}
    }

See docs/CONTRIBUTORS.