Writing
Read the thesis as it is being built.
Writing that starts from activity: what AI transforms, what it hides, and what conditions let humans verify, decide, and recover.
The Enterprise Agent Problem Is Belief
An agent is not enterprise-ready because it can act. It is enterprise-ready when the belief it creates about its action is calibrated to evidence.
An AI Talk Should Talk About Real Work
Why a useful AI talk should start from what AI changes in work, decisions, responsibility, evidence, and human recovery, not from tools alone.
Raw Traces Are Not Evals
The missing layer between real agent failures and measurable model progress: reducing messy traces into replayable eval seeds without laundering away the signal.
Tool Use Is Not Task Completion
Why agent reliability depends on preserving the boundary between intention, action, observable state, and truthful final claims.
The Most Dangerous Agent Failure Is Not Hallucination
Why the critical risk is not hallucination alone, but an agent claiming work is done without observable evidence.
AI Agents Are Not Just Tools. They Are Work Systems
Why task completion is not enough: an agent redistributes verification, responsibility, coordination, and recovery work.
AI Is Not Integrated Into Work. It Reconfigures It.
Why AI must be analyzed from real activity: constraints, trade-offs, invisible cooperation, room to act, and responsibility.
The Benchmark Lie: Why Grok 4.20 Excels in Benchmarks but Fails in Production
A cognitive ergonomist dogfoods Grok 4.20 across 12+ production agent instances. What 40 years of human factors research says about why benchmarks miss what matters in agent loops.
Contact
Email Julien
For an intervention, a talk, or a conversation about AI and real work, send the situation, the decision to make, and the visible constraint.