CLASS · 05 / 06 · Live · v0.4

Marcus

The impediment to action advances action. What stands in the way becomes the way.

Named after Marcus Aurelius (121–180 AD), the emperor who wrote the Meditations as a daily discipline of self-interrogation and correction.

+7.8 pts median eval uplift in 3 generations · zero post-promotion regressions · 38 agents reviewed nightly
01 — The science

Why we named it Marcus.

Marcus Aurelius ran the Roman Empire by day and interrogated himself by candlelight. The Meditations weren't philosophy — they were operational review. What pattern am I missing? What will this look like in ten years if I don't correct it now?

Seneca, a Roman Stoic of the previous century, had one obsession: time. 'Omnia aliena sunt, tempus tantum nostrum est': all else is borrowed; only time is ours. He wrote about it relentlessly because he watched people waste it on urgency while ignoring consequence.

The Marcus agent combines both disciplines. It is the only meta-agent in the catalog — it does not act on the world directly. It reviews the other agents the way a coaching staff reviews game film: methodically, without ego, looking for the drift patterns that turn into failures. Marcus is the agent you wake up to, not the one you ping.

  • 01 · Dichotomy of control (Epictetus, Enchiridion 1) → autonomy tiers by reversibility: high-blast-radius actions get the lowest autonomy tier and a mandatory human gate.
  • 02 · Discipline of assent → no promotion without held-out eval evidence: the meta-agent refuses to assent to a flattering in-sample result.
  • 03 · Obstacle is the way (Meditations 5.20) → failure signatures are the routing table: every failure maps to an improvement lever, not a discard pile.
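
The three mappings above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the catalog's implementation: the tier names, the 1-to-5 blast-radius scale, the thresholds, and the failure signatures and levers in `FAILURE_ROUTES` are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class AutonomyTier(IntEnum):
    """Hypothetical tiers; a lower value means less autonomy."""
    HUMAN_GATED = 0   # a human must approve before the action runs
    SUPERVISED = 1    # runs, but is reviewed afterwards
    AUTONOMOUS = 2    # runs without review


@dataclass(frozen=True)
class Action:
    name: str
    reversible: bool
    blast_radius: int  # assumed scale: 1 (local) to 5 (organisation-wide)


def autonomy_tier(action: Action) -> AutonomyTier:
    """Principle 01: irreversible or high-blast-radius actions get the lowest tier."""
    if not action.reversible or action.blast_radius >= 4:
        return AutonomyTier.HUMAN_GATED
    if action.blast_radius >= 2:
        return AutonomyTier.SUPERVISED
    return AutonomyTier.AUTONOMOUS


def assent_to_promotion(in_sample_gain: float, held_out_gain: Optional[float]) -> bool:
    """Principle 02: the in-sample gain is deliberately ignored.

    Promotion needs held-out evidence that exists and is positive.
    """
    return held_out_gain is not None and held_out_gain > 0.0


# Principle 03: each failure signature routes to an improvement lever.
# Signatures and levers here are invented for illustration.
FAILURE_ROUTES = {
    "stale_retrieval": "refresh exemplars",
    "overfit_to_eval": "expand held-out set",
    "tool_timeout": "adjust retry policy",
}


def route_failure(signature: str) -> str:
    # Unknown signatures are queued for triage rather than discarded.
    return FAILURE_ROUTES.get(signature, "queue for human triage")
```

Under these assumptions, an irreversible action is human-gated even when its blast radius is small, and a large in-sample gain with no held-out result is never promoted.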


Field Evidence
ACADEMIC BASIS
Reflexion: 91% vs 80% on HumanEval
Self-improvement loop without fine-tuning: verbal reinforcement from evaluation feedback drives a +11 pt accuracy gain. Extended by production eval-gated patterns (Notion AI, LangSmith, Braintrust).
Shinn et al. · arXiv:2303.11366, 2023; ZenML, Jul 2025
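
A Reflexion-style loop can be sketched roughly as follows: attempt, evaluate, turn the feedback into a verbal note, and retry with that note in memory, with no weight updates. This is a simplified illustration, not the paper's code. The function `reflexion_loop` and the three toy callables are hypothetical stand-ins for model calls.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    attempt: Callable[[str, List[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    reflect: Callable[[str, str], str],
    task: str,
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Retry a task, carrying verbal reflections forward instead of fine-tuning."""
    memory: List[str] = []  # reflections accumulated across failed trials
    answer = ""
    for _ in range(max_trials):
        answer = attempt(task, memory)
        passed, feedback = evaluate(answer)
        if passed:
            break
        # Convert raw evaluation feedback into a note the next attempt can read.
        memory.append(reflect(answer, feedback))
    return answer, memory


# Toy stand-ins (assumptions): a real system would call a model here.
def toy_attempt(task: str, memory: List[str]) -> str:
    return "a + b" if memory else "a - b"


def toy_evaluate(answer: str) -> Tuple[bool, str]:
    passed = answer == "a + b"
    feedback = "" if passed else "unit test expected the sum"
    return passed, feedback


def toy_reflect(answer: str, feedback: str) -> str:
    return f"'{answer}' failed: {feedback}"


final_answer, notes = reflexion_loop(toy_attempt, toy_evaluate, toy_reflect, "add two numbers")
```

In the toy run the first attempt fails, one reflection is stored, and the second attempt passes.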
IN PRODUCTION
Eval-driven development: Notion AI, LangSmith, Braintrust
Multi-layer eval stack: lightweight unit evals run frequently, trigger-gated offline regression suites, eval scores as promotion criteria, regressions auto-blocked.
ZenML LLMOps compendium, Jul 2025; Braintrust, Feb 2026
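
As a rough illustration of "eval scores as promotion criteria, regressions auto-blocked", the sketch below compares a candidate's per-suite scores with the current baseline and blocks promotion if any suite regresses. The suite names, scores, and `tolerance` parameter are assumptions for the example and do not reflect any particular vendor's API.

```python
from typing import Dict, List


def regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> List[str]:
    """Return the suites where the candidate falls below baseline by more than tolerance.

    A suite missing from the candidate's results counts as a regression.
    """
    return sorted(
        suite
        for suite, base_score in baseline.items()
        if candidate.get(suite, float("-inf")) < base_score - tolerance
    )


def may_promote(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> bool:
    """Promotion is allowed only when no suite regresses."""
    return not regressions(baseline, candidate, tolerance)


# Hypothetical scores: the candidate improves unit evals but slips offline.
baseline_scores = {"unit": 0.90, "offline_regression": 0.80}
candidate_scores = {"unit": 0.95, "offline_regression": 0.78}
blocked_suites = regressions(baseline_scores, candidate_scores)
```

Here the candidate is blocked despite its better unit score, because the offline regression suite dropped.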
BENCHMARK
Zillow Offers: $421M Q3-2021 loss
Concept drift unmonitored, human overrides suppressed, maximal irreversibility treated as reversible. The canonical Goodhart failure: an estimation model given maximum autonomy over irreversible capital decisions.
insideAI News, Dec 2021; GeekWire, Nov 2021
02 — Agents in this class

Prototype agents.

Every class ships with reference agents calibrated to operational use cases. Fork them, deploy them, or use them as a template.

Failure Signature Classifier

91% agreement with human triage

Exemplar Curator

retrieval relevance +22%

Generalisation Gap Auditor

blocked 14% of overfit promotions

Evening Review Conductor

64% of proposals accepted
12-WEEK BETA · 9 DESIGN PARTNERS · 47,000 SHADOWED RUNS
03 — Qualification gate

The ALOFT pipeline, applied.

Every agent in this class passes the same five-stage gate. Below: the criteria specific to Marcus agents at each stage.

ALOFT
01 · Curation
02 · Staging
03 · Deploy
04 · Operate
05 · Generalise
A→L→O→F→T
01 · Curation
Mesh reflection scope; cadence defined
  • Mesh reflection scope
  • Coaching signal taxonomy
  • Long-horizon cadence defined

02 · Staging
Eval suite passes; memo schema validated
  • Reflection eval suite passes
  • Coaching memo schema validated
  • Mesh-wide replay tested

03 · Deployment
Meta-agent signed; read-only enforced
  • Meta-agent registry signed
  • Coaching contract attached
  • Read-only scope enforced

04 · Operation
Reflection ≥ daily; memo coverage 100%
  • Reflection cadence ≥ daily
  • Trend coverage across quarters
  • Coaching memo coverage = 100%

05 · Generalisation
Coaching pattern reused; quarter memo out
  • Coaching pattern reused
  • New class onboarded
  • Quarter-end memo released
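
Two of the Operation-stage criteria above (reflection cadence at least daily, coaching memo coverage of 100%) are directly checkable. The sketch below shows one way such a check might look. The function names, the reading of "daily" as "no gap longer than 24 hours", and the sample data are assumptions, not the ALOFT pipeline's actual logic.

```python
from datetime import datetime, timedelta
from typing import List, Set


def cadence_at_least_daily(reflections: List[datetime]) -> bool:
    """True if consecutive reflections are never more than 24 hours apart."""
    ordered = sorted(reflections)
    return len(ordered) >= 2 and all(
        later - earlier <= timedelta(days=1)
        for earlier, later in zip(ordered, ordered[1:])
    )


def memo_coverage(reviewed_agents: Set[str], agents_with_memo: Set[str]) -> float:
    """Fraction of reviewed agents that received a coaching memo."""
    if not reviewed_agents:
        return 1.0  # nothing to cover
    return len(reviewed_agents & agents_with_memo) / len(reviewed_agents)


def passes_operation_stage(
    reflections: List[datetime],
    reviewed_agents: Set[str],
    agents_with_memo: Set[str],
) -> bool:
    return (
        cadence_at_least_daily(reflections)
        and memo_coverage(reviewed_agents, agents_with_memo) == 1.0
    )


# Hypothetical week of nightly reviews at 02:00.
nightly = [datetime(2025, 1, day, 2, 0) for day in range(1, 8)]
ok = passes_operation_stage(nightly, {"agent-a", "agent-b"}, {"agent-a", "agent-b"})
```

A skipped night or a reviewed agent without a memo would make the check fail.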
