An intelligent, modular system for understanding source code through structured analysis. This project transforms raw code into a semantic, language-neutral representation that can power analyzers, tooling, and AI systems.
The project is currently focused on building the core analysis engine, specifically:
Code → AST → IR → Clean JSON
This pipeline converts code into a structured format that is:
- human-readable
- machine-analyzable
- language-agnostic
Parses Python code using the standard-library ast module and extracts:
- functions, classes
- loops and conditionals
- assignments and variables
- calls and control flow
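As a rough illustration of this extraction step, the sketch below walks a Python AST and collects a few of the construct kinds listed above. The function name `extract_constructs` and the shape of the returned dictionary are assumptions made for this example, not the project's actual API.

```python
import ast

SOURCE = '''
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

class Counter:
    def bump(self):
        for i in range(3):
            self.total = i
'''

def extract_constructs(source: str) -> dict:
    """Collect a few construct kinds from Python source (hypothetical helper)."""
    found = {
        "functions": [], "classes": [], "loops": 0,
        "conditionals": 0, "assignments": [], "calls": [],
    }
    # ast.walk visits every node in the tree (in no guaranteed order).
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
        elif isinstance(node, (ast.For, ast.While)):
            found["loops"] += 1
        elif isinstance(node, ast.If):
            found["conditionals"] += 1
        elif isinstance(node, ast.Assign):
            # ast.unparse requires Python 3.9 or newer.
            found["assignments"].extend(ast.unparse(t) for t in node.targets)
        elif isinstance(node, ast.Call):
            found["calls"].append(ast.unparse(node.func))
    return found

summary = extract_constructs(SOURCE)
print(summary)
```

A flat summary like this loses nesting; it is only meant to show which AST node types correspond to each item in the list.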
A custom Semantic IR layer that:
- preserves code structure
- captures execution logic
- abstracts away Python-specific syntax
Example:
function → body → if → return
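One way such a layer could be shaped is sketched below. The `IRNode` fields (`kind`, `attrs`, `children`) and the `build_ir` and `first_path` helpers are hypothetical and only cover functions, conditionals, and returns; the real IRNode and IRBuilder in `ir/` may look quite different.

```python
import ast
from dataclasses import dataclass, field

@dataclass
class IRNode:
    """Hypothetical IR node: a kind, some attributes, and child nodes."""
    kind: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

def build_ir(node: ast.AST) -> IRNode:
    """Translate a small subset of Python AST nodes into IRNode objects."""
    if isinstance(node, ast.FunctionDef):
        body = IRNode("body", children=[build_ir(s) for s in node.body])
        return IRNode("function", {"name": node.name}, [body])
    if isinstance(node, ast.If):
        children = [build_ir(s) for s in node.body]
        if node.orelse:
            children.append(IRNode("else", children=[build_ir(s) for s in node.orelse]))
        # ast.unparse requires Python 3.9 or newer.
        return IRNode("if", {"condition": ast.unparse(node.test)}, children)
    if isinstance(node, ast.Return):
        value = ast.unparse(node.value) if node.value is not None else None
        return IRNode("return", {"value": value})
    # Fallback: keep unhandled statements as opaque source text.
    return IRNode("statement", {"source": ast.unparse(node)})

def first_path(node: IRNode) -> str:
    """Follow the first child at each level and join the node kinds."""
    kinds = [node.kind]
    while node.children:
        node = node.children[0]
        kinds.append(node.kind)
    return " → ".join(kinds)

SOURCE = '''
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)
'''

ir = build_ir(ast.parse(SOURCE).body[0])
print(first_path(ir))
```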
The IR is converted into a clean JSON format designed for:
- analyzers
- AI systems
- cross-language compatibility
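The conversion could look roughly like the following sketch, which flattens a hand-built IR tree into plain dictionaries and serializes it with the standard `json` module. The `IRNode` shape and the `to_clean` helper are assumptions for illustration; only the output keys (`type`, `name`, `condition`, `body`, `else`, `value`) are taken from this README.

```python
import json
from dataclasses import dataclass, field

@dataclass
class IRNode:
    """Hypothetical IR node: a kind, some attributes, and child nodes."""
    kind: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

def to_clean(node: IRNode) -> dict:
    """Convert an IRNode tree into plain dicts that json.dumps can serialize."""
    out = {"type": node.kind, **node.attrs}
    for child in node.children:
        if child.kind in ("body", "else"):
            # Wrapper nodes become a list under their own key.
            out[child.kind] = [to_clean(c) for c in child.children]
        else:
            # Direct children are treated as part of the body.
            out.setdefault("body", []).append(to_clean(child))
    return out

# Hand-built IR for a recursive factorial function.
ir = IRNode("function", {"name": "factorial"}, [
    IRNode("body", children=[
        IRNode("if", {"condition": "n <= 1"}, [
            IRNode("return", {"value": "1"}),
            IRNode("else", children=[
                IRNode("return", {"value": "n * factorial(n - 1)"}),
            ]),
        ]),
    ]),
])

clean = to_clean(ir)
text = json.dumps(clean, indent=2)
print(text)
```

Because the result contains only dictionaries, lists, and strings, any language with a JSON parser can consume it without knowing about Python's AST.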
Example:
{
"type": "function",
"name": "factorial",
"body": [
{
"type": "if",
"condition": "n <= 1",
"body": [
{ "type": "return", "value": "1" }
],
"else": [
{ "type": "return", "value": "n * factorial(n - 1)" }
]
}
]
}
CodeZap/
│
├── ir/ # IRNode + IRBuilder (AST → IR)
├── driver/ # File parsing entry
├── printer/ # Debug visualization
├── analyzers/ # (soon) analysis modules
└── main.py # entry point
Python Code
↓
AST (Python)
↓
IR (custom semantic structure)
↓
Clean JSON (language-neutral)
- AST parsing
- IR (Intermediate Representation)
- Clean JSON export
- Call graph (next step)
- Advanced analyzers (planned)
Run from the project root directory:
py -m CodeZap.main path/to/file.py
You can redirect the output to a file:
py -m CodeZap.main path/to/file.py > output.json
py -m CodeZap.main test.py > output.json
This will:
- parse the input file
- build IR
- output clean JSON
- AST → IR
- IR → Clean JSON
- Call Graph generation
- Recursion detection
- Basic code metrics
- AI-powered explanations
- Refactoring suggestions
- Complexity metrics
- Security and data-leakage flags
- Multi-language support
- Frontend/dashboard
The project is in active development and its structure may evolve. Contributions and ideas are welcome.
Apache License 2.0
This is a core engine-first project. Higher-level features (AI, dashboard, security) will be built on top of the current foundation.