What’s really interesting here isn’t just the 38× speedup, but the choice of standardized line coverage as the objective function.
You’re effectively telling the agent what “progress” means in a way that aligns with real system state exploration, not just surface-level execution.
This feels like a broader pattern: once agents optimize against semantic coverage instead of raw metrics, they stop behaving like prompt generators and start behaving like search systems.
What’s really interesting here isn’t just the 38× speedup, but the choice of standardized line coverage as the objective function.
You’re effectively telling the agent what “progress” means in a way that aligns with real system state exploration, not just surface-level execution.
This feels like a broader pattern: once agents optimize against semantic coverage instead of raw metrics, they stop behaving like prompt generators and start behaving like search systems.