Masaki Mu

In the middle of the journey, I came to myself.

Latest

HITL's H Is Shifting from Human to Harness

2 Jun, 2026

The 'human' in Human-in-the-loop is really just a role: the verifier that judges 'is this round done?' after each turn. The evolution of coding agents over the past two years is, in large part, the story of harnesses taking over this role, piece by piece, in the order of what can be expressed as a check. HITL → VITL.

Humans as Weak Supervisors: What AAR Reveals About Alignment

15 Apr, 2026

Anthropic's AAR project is ostensibly about autonomous AI research. Viewed differently, it's a meta-validation of weak-to-strong alignment: humans as weak supervisors, wielding evaluation environment design as their last leverage point over models that exceed their own capabilities.

Brain ≠ Hands: Dissecting Anthropic's Managed Agents Architecture

9 Apr, 2026

Anthropic published their Managed Agents architecture. Under the hood, the real insight isn't the three-layer split but two counterintuitive decouplings: Session ≠ Context, and Tool execution doesn't live next to the Agent.

The Agent Trapped in the Same River

31 Mar, 2026

A month of building memory for an AI Agent. What we learned wasn't how to build a pipeline but the fundamental difference between a database and memory: databases preserve facts, memory lets knowledge grow.

Agent Context Compaction: From Cursor's Self-Compaction to Claude Code and Codex CLI

18 Mar, 2026

Cursor trains models to self-compact via RL for long-horizon tasks. We dissect the KV cache reuse mechanism and compare it with Claude Code's 9-section structured compact and Codex CLI's auto-compact through reverse engineering.

Bub Tape Architecture Deep Dive: From Traceable Memory to Controllable Context Windows

27 Feb, 2026

A deep breakdown of Bub's Tape subsystem: storage structure, context selection, handoff/reset semantics, search mechanism, and engineering trade-offs.

All posts →
