Skip to content

Compiler IR

Levente Santha edited this page May 11, 2026 · 2 revisions

Compiler IR (Intermediate Representation)

Compiler's internal representation of code with SSA form for JNode's JIT compilation pipeline.

Overview

Compiler IR (Intermediate Representation) is the central data structure used by JNode's L2 JIT compiler to represent and optimize bytecode. Unlike the stack-based JVM bytecode, IR uses a three-address code representation that simplifies optimization and code generation.

The IR is built in Static Single Assignment (SSA) form, where each variable is assigned exactly once. This form makes data flow analysis straightforward and enables powerful compiler optimizations like constant propagation, dead code elimination, and register allocation.

Key Components

Class / File Role
IRControlFlowGraph Control flow graph containing basic blocks
IRBasicBlock Linear sequence of IR instructions
IRGenerator Converts bytecode to IR quads
quad/Quad Base class for all IR instructions
LinearScanAllocator Register allocation for SSA form
SSAStack SSA construction and phi-function placement

How It Works

IR Generation Pipeline

  1. Bytecode Parsing: The BytecodeParser converts JVM bytecode into an initial control flow graph.

  2. IR Construction: IRGenerator transforms each basic block into a sequence of IR Quads (three-address instructions).

  3. SSA Conversion: The IR is converted to SSA form using SSAStack, inserting Phi functions at merge points.

  4. Optimization: Various optimizations are applied:

    • Constant propagation
    • Common subexpression elimination
    • Dead code elimination
    • Loop invariant code motion
  5. Register Allocation: LinearScanAllocator maps virtual registers to physical registers.

  6. Code Generation: CodeGenerator emits native x86 instructions.

IR Quads

IR instructions (Quads) represent operations in three-address form:

  • BinaryQuad: Arithmetic operations (add, sub, mul, div, etc.)
  • CallQuad / CallAssignQuad: Method calls
  • BranchQuad: Conditional and unconditional branches
  • PhiAssignQuad: SSA phi functions for merge points

SSA Form

In SSA form:

  • Each variable is assigned exactly once
  • Phi functions at control flow merges select the correct value based on which predecessor block executed
  • This enables precise data flow analysis

Gotchas & Non-Obvious Behavior

  • Phi Functions: SSA phi functions are virtual instructions that select values based on control flow history—they don't map directly to machine instructions.
  • Linear Scan: The register allocator uses linear scan, which is fast but may not produce optimal code compared to graph coloring.
  • Block Ordering: IRBasicBlock ordering affects generated code locality.
  • Exception Handling: Exception edges in the CFG are modeled as implicit branches.

Related Pages

Clone this wiki locally