Masaki Mu
In the middle of the journey, I came to myself.
Latest
-
HITL's H Is Shifting from Human to Harness
The 'human' in Human-in-the-loop is really just a role — the verifier that judges 'is this round done?' after each turn. The evolution of coding agents over the past two years is, in large part, the story of harnesses taking over this role, piece by piece, in the order of what can be expressed as a check. HITL → VITL.
-
Humans as Weak Supervisors: What AAR Reveals About Alignment
Anthropic's AAR project is ostensibly about autonomous AI research. Viewed differently, it's a meta-validation of weak-to-strong alignment: humans as weak supervisors, wielding evaluation environment design as their last leverage point over models that exceed their own capabilities.
-
Brain ≠ Hands: Dissecting Anthropic's Managed Agents Architecture
Anthropic published their Managed Agents architecture. Under the hood, the real insight isn't the three-layer split — it's two counterintuitive decouplings: Session ≠ Context, and Tool execution doesn't live next to the Agent.
-
The Agent Trapped in the Same River
A month of building memory for an AI Agent. What we learned wasn't how to build a pipeline — it was the fundamental difference between a database and memory: databases preserve facts, memory lets knowledge grow.
-
Agent Context Compaction: From Cursor's Self-Compaction to Claude Code and Codex CLI
Cursor uses RL to train models to self-compact on long-horizon tasks. We dissect the KV cache reuse mechanism and, through reverse engineering, compare it with Claude Code's 9-section structured compact and Codex CLI's auto-compact.
-
Bub Tape Architecture Deep Dive: From Traceable Memory to Controllable Context Windows
A deep breakdown of Bub's Tape subsystem: storage structure, context selection, handoff/reset semantics, search mechanism, and engineering trade-offs.