Cyber-physical-Systems-Lab/morph_v2

MORPH: Multi-agent Online Rewiring through Plasticity-guided Hierarchy

Python 3.9+ · License: MIT

MORPH is a training-free, self-organising coordination mechanism for multi-agent systems, inspired by neuroplasticity. It discovers task-semantic communication structure purely from task co-assignment statistics: no spatial sensors, no offline training, and no hand-tuned graph topology required.


Key idea

Most coordination methods either hard-wire communication topology (proximity-based heuristics) or learn it end-to-end with millions of RL training steps (CommFormer, TarMAC, DGN). MORPH occupies a different position: it self-organises online using four biologically grounded plasticity rules applied to a dynamic communication graph.

| Property | Proximity | Learned comm (CommFormer etc.) | MORPH |
|---|---|---|---|
| Requires spatial sensors | ✓ | ✗ | ✗ |
| Requires offline RL training | ✗ | ✓ | ✗ |
| Adapts to task distribution shifts | ✗ | partial | ✓ |
| Persistent coordination memory | ✗ | ✗ | ✓ |
| Interpretable links | ✗ | ✗ | ✓ |
| Works from step 0, any environment | ✓ | ✗ | ✓ |

Neuroplasticity mechanisms

| Mechanism | Biological basis | Role in MORPH |
|---|---|---|
| Hebbian + homeostatic plasticity (v1) | Hebb's rule + synaptic scaling | Strengthen links for co-assigned pairs; maintain target degree |
| BCM metaplasticity | BCM sliding threshold | Over-potentiated links decay faster; stale coordination is forgotten |
| Reward-modulated plasticity | Dopamine-gated LTP | Links that complete deliveries are strengthened, not just co-assignment |
| Neuromodulation | Dopamine/ACh arousal signal | Explore new links when the delivery rate drops; consolidate when performing well |
| Predictive formation | Anticipatory synaptogenesis | Links form before co-assignment via a spatial hint matrix |

v1 behaviour is recovered exactly by setting all v2 mechanism gains to zero.
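As a concrete reading of the v1 rule, here is a minimal NumPy sketch of one Hebbian + homeostatic update, using the default gains from the parameter tables below (`alpha`, `beta`, `decay`, `target_deg_frac`). The function name and exact functional form are illustrative assumptions, not the repository's implementation:

```python
import numpy as np

def v1_update(W, coassigned, alpha=0.18, beta=0.04, decay=0.98,
              target_deg_frac=0.35):
    """One synaptic step of the v1 rule (hypothetical sketch).

    W          : (N, N) symmetric weight matrix
    coassigned : (N, N) boolean matrix, True where agents i and j
                 were co-assigned on a task this step
    """
    N = W.shape[0]
    W = W * decay                      # base synaptic decay
    W = W + alpha * coassigned         # Hebbian potentiation for co-assigned pairs
    # Homeostatic scaling: nudge each agent's total link weight toward
    # a target degree of target_deg_frac * (N - 1)
    target = target_deg_frac * (N - 1)
    deg = W.sum(axis=1)
    W = W + beta * (target - deg)[:, None] * (W > 0)
    return np.clip((W + W.T) / 2, 0.0, 1.0)  # keep symmetric and bounded
```

With the v2 gains (`bcm_gain`, `reward_alpha`, `neuromod_gain`, `pred_boost`) set to zero, only this v1 step remains active.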


Results (T = 800 steps, 5 seeds, TA-RWARE)

Delivery performance

| Condition | Tiny (N=8) | Small (N=12) | Medium (N=18) | Large (N=24) |
|---|---|---|---|---|
| MORPH | 62.4 ± 2.9 | 70.4 ± 1.3 | 91.0 ± 5.1 | 107.4 ± 3.4 |
| Proximity | 62.4 ± 4.0 | 75.2 ± 1.6 | 102.2 ± 2.2 | 113.2 ± 3.1 |
| TSG | 55.2 ± 5.8 | 70.9 ± 2.1 | 89.9 ± 1.8 | 97.6 ± 13.5 |
| Full-Graph | 57.8 ± 18.0 | 64.1 ± 13.7 | 88.8 ± 5.9 | 98.0 ± 7.7 |

MORPH outperforms Full-Graph at every scale: at N=24 it delivers 107.4 vs. 98.0 (+9.6%) while using only 21% of the possible coordination links. MORPH ties Proximity at Tiny (62.4 each) and trails it by 4–10 deliveries at larger scales; note, however, that Proximity's radius was calibrated from MORPH's own converged link count, an advantage unavailable at real deployment time.

Scale comparison


Delivery curves (top), throughput retention vs N (bottom-left), communication overhead vs N (bottom-right), and per-scale bar charts (bottom row).

Communication efficiency (Pareto)

Pareto

MORPH sits in the upper-left ideal region: high throughput at low communication cost.

Delivery across scales

Scale bars


Adaptability under task distribution shift

A key advantage of MORPH over proximity: resilience to non-stationary task patterns.

Experiment: Medium scale (N=18), 20 seeds. Six consecutive 400-step phases alternating LEFT/RIGHT demand (L–R–L–R–L–R). MORPH carries its learned pairwise weights across all phases; MORPH-reset clears weights at every phase boundary; Proximity recomputes spatial links each step.

| Condition | P1 | P2 | P3 | P4 | P5 | P6 | Δ (P6−P3) |
|---|---|---|---|---|---|---|---|
| Proximity | 51.1 | 48.2 | 45.7 | 41.8 | 37.9 | 34.1 | −25.4% |
| TSG | 49.9 | 46.0 | 44.6 | 41.4 | 38.8 | 38.1 | −14.6% |
| MORPH (reset) | 50.0 | 48.9 | 48.2 | 46.5 | 43.9 | 40.8 | −15.6% |
| MORPH | 50.0 | 49.1 | 47.5 | 46.1 | 45.2 | 44.0 | −7.4% |

From Phase 3 to Phase 6, MORPH degrades 3.3× less than Proximity and 2.1× less than MORPH-reset (which clears its weights at every phase boundary). Recovery time shrinks from 36.9 steps at the first return to LEFT demand (r3) to 17.7 steps at r6, showing that retained preferences compound across repeated regime cycles; Proximity's recovery time grows to 96 steps by r5 because its spatial links cannot adapt to repeated shifts.
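The Δ column above is the percent change in deliveries from Phase 3 to Phase 6 and can be checked directly from the phase means (values taken from the table):

```python
def phase_degradation(p3, p6):
    """Percent change in deliveries from phase 3 to phase 6."""
    return 100.0 * (p6 - p3) / p3

# Proximity: phase_degradation(45.7, 34.1) -> about -25.4
# MORPH:     phase_degradation(47.5, 44.0) -> about -7.4
```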

Shift adaptability


Emergent behaviour analyses

Post-hoc analyses on a single fully-recorded episode (T=1500, medium scale, N=18, seed=42). Script: python scripts/emergent_analysis.py

Does MORPH rediscover spatial structure?

W vs distance

Spearman ρ = −0.111 (p = 0.174) — not significant. MORPH's learned weights are driven by task co-assignment statistics, not physical distance. The system discovers a genuinely task-semantic coordination structure without any spatial sensors.

AGV–Picker pairs develop the strongest weights (they collaborate on every delivery); AGV–AGV and Picker–Picker pairs stay near zero (same-type agents rarely co-assign).
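The distance analysis can be reproduced in a few lines. A sketch assuming `W` is the final N×N weight matrix and `positions` holds per-agent (row, col) grid coordinates; both names are hypothetical:

```python
import numpy as np
from scipy.stats import spearmanr

def w_distance_correlation(W, positions):
    """Spearman rank correlation between learned weights W[i, j] and the
    Manhattan distance between agents i and j (upper triangle only)."""
    N = W.shape[0]
    iu = np.triu_indices(N, k=1)
    # Pairwise Manhattan distances on the warehouse grid
    dist = np.abs(positions[:, None, :] - positions[None, :, :]).sum(-1)
    rho, p = spearmanr(W[iu], dist[iu])
    return rho, p
```

A ρ near zero with a large p-value, as reported above, indicates the learned weights carry no spatial signal.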

W matrix evolution

W evolution

The coordination graph evolves from all-zeros to a structured, sparse pattern. By t=100–200 the AGV–Picker off-diagonal block is already taking shape; same-type blocks (AGV–AGV, Picker–Picker) remain near-zero throughout, consistent with the task structure.

Neuromodulation dynamics

Neuromodulation

η is mildly negative for most of the episode (exploration mode), with short consolidation bursts (η > 0) following delivery clusters. The system dynamically balances exploring new coordination links vs locking in those that drove recent deliveries — a direct analogue of dopaminergic arousal modulation.
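A minimal sketch of how such a signal could be computed from the v2 defaults (`neuromod_ema`, `expected_delivery_rate`, `neuromod_gain`, `neuromod_explore` in the parameter table below); the exact update rule is an assumption, not the repository's code:

```python
def neuromod_step(ema_rate, delivered, alpha=0.18, theta_form=0.45,
                  neuromod_ema=0.03, expected_rate=0.12,
                  neuromod_gain=0.4, neuromod_explore=0.10):
    """One step of the neuromodulation signal (hypothetical sketch).

    delivered : deliveries completed this step
    Returns the updated EMA, eta, and the modulated learning rate and
    formation threshold for this step.
    """
    ema_rate = (1 - neuromod_ema) * ema_rate + neuromod_ema * delivered
    eta = ema_rate - expected_rate              # arousal signal
    if eta > 0:                                 # above expectation: consolidate
        alpha_eff = alpha * (1 + neuromod_gain * eta)
        theta_eff = theta_form
    else:                                       # below expectation: explore
        alpha_eff = alpha
        theta_eff = theta_form - neuromod_explore
    return ema_rate, eta, alpha_eff, theta_eff
```

With `neuromod_ema = 0.03`, the EMA averages over roughly a 33-step window, matching the parameter table.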

Link persistence: stable core vs dynamic periphery

Link persistence

772 link events observed over 800 steps. Only 0.8% survived ≥ 400 steps (stable core); 99.2% were short-lived exploratory probes with a median lifetime of 21 steps. MORPH maintains a tiny set of highly-trusted long-term partnerships while continuously testing alternatives — analogous to the brain's balance between long-term potentiated synapses and ongoing synaptic turnover. Progressive sparsification is further confirmed by the Fiedler value λ₂ declining monotonically from 0.57 to 0.23 over 1500 steps as sliding-threshold and structural pruning eliminate low-value preferences.
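The two summary statistics reported here are straightforward to compute. A sketch assuming per-link lifetimes (in steps) and a snapshot of the weight matrix `W` are available; names are hypothetical:

```python
import numpy as np

def persistence_stats(lifetimes, core_threshold=400):
    """Fraction of link events in the stable core (lifetime >= threshold)
    and the median lifetime of the short-lived periphery."""
    lifetimes = np.asarray(lifetimes)
    core = lifetimes >= core_threshold
    return core.mean(), float(np.median(lifetimes[~core]))

def fiedler_value(W):
    """Algebraic connectivity lambda_2: second-smallest eigenvalue of the
    weighted graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.sort(np.linalg.eigvalsh(L))[1])
```

A declining λ₂ over the episode, as reported above, is the standard signature of a graph becoming sparser and more loosely connected.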


Warehouse animations (T = 800 steps)

Each animation shows MORPH self-organising its communication graph in real time. Right-side panels display: active links, neuromodulation signal η, BCM threshold θ̄, and structural plasticity events (formations / prunings).

| Scale | Agents | Animation |
|---|---|---|
| Tiny | N=8 (5 AGV + 3 picker) | Tiny |
| Small | N=12 (8 AGV + 4 picker) | Small |
| Medium | N=18 (12 AGV + 6 picker) | Medium |
| Large | N=24 (16 AGV + 8 picker) | Large |

Positioning

MORPH is not competing with trained communication methods (CommFormer, TarMAC) in raw throughput — those methods train for millions of steps on the target distribution. MORPH's contribution is:

  1. No training required — works from the first episode in any environment
  2. No spatial prior — achieves Proximity-competitive throughput without position sensors; emergent analysis confirms learned W is not correlated with distance (ρ = −0.11, p = 0.17)
  3. Adaptive — self-rewires when task distribution shifts (proximity cannot)
  4. Interpretable — W matrix is a learned coordination history: W_ij reflects how much agents i and j have historically co-assigned on tasks
  5. Scalable — active-link fraction falls monotonically from 97% (N=8) to 21% (N=24) as coordination complexity grows, vs 100% for Full-Graph
  6. Biologically structured emergence — maintains a tiny stable coordination core (< 1% of link events) alongside a dynamic exploratory periphery, analogous to long-term potentiation vs ongoing synaptic turnover in biological neural circuits

Repository structure

morph_v2/
├── morph/
│   ├── morph.py                    # MORPH coordinator (all four v2 mechanisms)
│   └── morph_env.py                # Abstract environment interface
│
├── experiments/
│   ├── tarware_experiment.py       # Scale study: 4 conditions × 4 scales × 5 seeds
│   └── shift_experiment.py         # Task distribution shift adaptability experiment
│
├── scripts/
│   ├── generate_figures.py         # Figures 1–3 from experiment_results.pkl
│   ├── statistical_tests.py        # Welch's t-tests → results/statistical_tests.csv
│   ├── animate.py                  # Warehouse animations → figures/MORPH_v2_*.gif
│   └── emergent_analysis.py        # Emergent behaviour analyses → figures/emergent_*.png
│
├── figures/                        # Generated figures and GIFs
├── results/                        # PKL data and summary CSVs
├── requirements.txt
└── README.md

Quickstart

git clone <repo-url> morph_v2
cd morph_v2
pip install -r requirements.txt

Run the full scale study (~60 min)

python experiments/tarware_experiment.py
# → results/experiment_results.pkl
# → results/summary_table.csv
# → results/step_csv/   (per-step CSVs)

Run the task distribution shift experiment (~30 min)

python experiments/shift_experiment.py
# → results/shift_results.pkl
# → figures/shift_adaptability.png

Generate figures

python scripts/generate_figures.py
# → figures/scale_comparison.png
# → figures/scale_bars.png
# → figures/pareto.png

python scripts/statistical_tests.py
# → results/statistical_tests.csv

Render animations

python scripts/animate.py tiny          # one scale
python scripts/animate.py all           # all four scales (~30 min)
# → figures/MORPH_v2_{tiny,small,medium,large}.gif

Run emergent behaviour analyses (~5 min)

python scripts/emergent_analysis.py
# → figures/emergent_w_vs_distance.png   (W vs Manhattan distance, Spearman ρ)
# → figures/emergent_w_evolution.png     (W matrix snapshots at t=0,100,200,400,600,800)
# → figures/emergent_neuromod.png        (η time series + delivery events)
# → figures/emergent_link_persistence.png (link lifetime distribution)

MORPH v2 parameters

Inherited from v1

| Parameter | Default | Description |
|---|---|---|
| alpha | 0.18 | Synaptic learning rate |
| beta | 0.04 | Homeostatic correction rate |
| decay | 0.98 | Base synaptic weight decay per step |
| theta_form_start | 0.75 | Initial MI threshold for link formation |
| theta_form_end | 0.45 | Final MI threshold (after annealing) |
| theta_prune | 0.008 | Weight threshold for pruning |
| target_deg_frac | 0.35 | Target degree as a fraction of N−1 |
| grace_steps | 20 | Minimum link age before eligibility for pruning or BCM penalty |
| k_slow | 3 | Structural update frequency (steps) |

New in v2

| Parameter | Default | Description |
|---|---|---|
| bcm_tau | 0.95 | BCM threshold smoothing (higher = longer memory; ~20-step window) |
| bcm_gain | 0.5 | Extra decay multiplier for over-potentiated links |
| reward_alpha | 0.08 | Delivery-burst learning rate (only links with Jaccard > 0.25) |
| neuromod_gain | 0.4 | Scales effective α upward when η > 0 (performing above expectation) |
| neuromod_explore | 0.10 | Reduces θ_form when η < 0 (performing below expectation) |
| neuromod_ema | 0.03 | EMA smoothing for the delivery rate (~33-step window) |
| expected_delivery_rate | 0.12 | Baseline deliveries/step for the η signal (scale-dependent) |
| pred_boost | 0.4 | Anticipatory hint weight in the structural MI score |
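To make the BCM metaplasticity parameters concrete, here is one plausible per-link form of sliding-threshold decay using `bcm_tau` and `bcm_gain`; the exact update rule is an assumption, not the repository's code:

```python
def bcm_decay(w, theta_bar, activity, decay=0.98,
              bcm_tau=0.95, bcm_gain=0.5):
    """BCM-style metaplastic decay for one link (hypothetical sketch).

    theta_bar : sliding threshold, an EMA of the link's recent activity
    activity  : current co-assignment activity of the link
    Links whose weight exceeds the sliding threshold decay faster.
    """
    theta_bar = bcm_tau * theta_bar + (1 - bcm_tau) * activity
    if w > theta_bar:                              # over-potentiated link
        w = w * (decay - bcm_gain * (1 - decay))   # extra decay
    else:
        w = w * decay                              # base decay only
    return w, theta_bar
```

This captures the behaviour described in the mechanisms table: links potentiated well above their recent activity level are forgotten faster than active ones.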

Statistical tests

Welch's t-tests (5 seeds, two-sided), MORPH vs each condition per scale:

| Scale | vs Proximity | vs TSG | vs Full-Graph |
|---|---|---|---|
| Tiny | p = 1.00 (tied) | p = 0.07 | p = 0.64 |
| Small | p = 0.003 ★ | p = 0.71 | p = 0.42 |
| Medium | p = 0.008 ★ | p = 0.74 | p = 0.60 |
| Large | p = 0.038 ★ | p = 0.21 | p = 0.067 |

★ Proximity is significantly better at Small/Medium/Large; MORPH ties Proximity at Tiny (62.4 each). MORPH's advantage over Full-Graph is consistent at every scale but does not reach significance with 5 seeds: at N=24, MORPH delivers 107.4 vs. Full-Graph's 98.0 (+9.6%) while using only 21% of possible links.
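The per-scale comparisons are standard Welch's t-tests (unequal variances, two-sided); with SciPy this is a one-liner. The per-seed arrays here are hypothetical placeholders:

```python
from scipy.stats import ttest_ind

def welch_p(morph_runs, baseline_runs):
    """Two-sided Welch's t-test on per-seed delivery counts."""
    return ttest_ind(morph_runs, baseline_runs, equal_var=False).pvalue
```

`equal_var=False` selects Welch's variant, which does not assume the two conditions share a variance; with only 5 seeds per condition, this is the appropriate choice.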


Citation

@inproceedings{morph2026,
  title     = {MORPH: Multi-agent Online Rewiring through Plasticity-guided Hierarchy},
  author    = {Didem Gurdur Broo},
  booktitle = {[Venue]},
  year      = {2026}
}

License

MIT — see LICENSE.

About

Version 2 with meta-plasticity, reward modulation, neuromodulation and predictive formation.
