
FriendLLM

Chinese documentation

FriendLLM is a mini LLM inference project written in C++.

The goal of this repository is not to compete with llama.cpp, but to provide a smaller, easier-to-read codebase for beginners who want to understand how an LLM inference engine is built from scratch.

You can think of it as a learning-oriented, mini version of llama.cpp.

Project Goal

This project is mainly for learning:

  • how tensor metadata is represented
  • how operators are described
  • how a compute graph is built
  • how a simple executor runs the graph
  • how model weights and runtime memory are managed

The design aims to stay small, readable, and easy to modify.

Current Direction

FriendLLM is being built step by step from the bottom up:

  1. basic internal types
  2. tensor structure
  3. operator definitions
  4. compute graph
  5. executor
  6. model loading
  7. simple generation

Right now the repository is still at an early stage, with the focus on building the core data structures first.

What This Project Is

  • a mini inference engine for learning
  • a readable C++ codebase
  • a playground for understanding LLM internals

What This Project Is Not

  • not a production inference engine
  • not a performance-first implementation
  • not a full replacement for llama.cpp

Why This Exists

Projects like llama.cpp are excellent, but for beginners they can feel large and dense.

FriendLLM tries to keep the important ideas while reducing the amount of code you need to hold in your head at once.

The hope is that you can read the code, follow the data flow, and gradually build intuition for how LLM inference works.

Repository Layout

Current structure:

FriendLLM/
├── include/
│   ├── core/
│   │   └── tensor.hpp
│   └── utils/
│       └── utils.hpp
├── CMakeLists.txt
└── README.md

As the project grows, more core modules will be added, such as graph, executor, backend, and model loader.

Development Philosophy

  • start from the smallest useful abstraction
  • prefer clarity over cleverness
  • build one layer at a time
  • keep the code beginner-friendly

Planned Learning Path

If you want to follow the project in order, the recommended reading path is:

  1. include/utils/utils.hpp
  2. include/core/tensor.hpp
  3. graph-related structures
  4. operator factory
  5. executor
  6. model parser and weight loading

Status

Early work in progress.

The current focus is on the internal type system and tensor representation.
