milkkarten 2 days ago
Author here. TL;DR:
Long-horizon embodied agency is a harness problem, not a model-scale problem. Coding agents like Claude Code work because of scaffolding (prompt, skills, memory, sub-agents) around the model. Embodied agents haven't had an equivalent.
Through iterative harness refinement, Gemini Plays Pokémon (GPP) became the first AI to complete Pokémon Blue, Yellow Legacy on hard mode, and Crystal without losing a single battle. Early on, a human edited the harness. By Crystal, the model was doing it itself: naming its own strategies, writing truth tables for puzzles, and wrapping loopholes into reusable primitives.
Continual Harness automates this fully. Starting from a raw interface with no curated knowledge, a Refiner reads the recent trajectory every F steps and applies edits to the prompt, sub-agents, skills, and memory -- no resets. From scratch, it closes most of the gap to a hand-engineered expert harness.
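For concreteness, here's a minimal Python sketch of that loop as I'd describe it. The names (Harness, Refiner, agent.act, refiner.refine) and the env interface are placeholders for illustration, not the actual API:

    # Minimal sketch of the Continual Harness loop. All names here
    # (Harness, Refiner, agent, env) are illustrative placeholders.
    from dataclasses import dataclass, field

    @dataclass
    class Harness:
        prompt: str = ""                                 # system prompt, edited in place
        skills: dict = field(default_factory=dict)       # reusable primitives
        memory: list = field(default_factory=list)       # persistent notes
        sub_agents: dict = field(default_factory=dict)   # specialized helpers

    def run(agent, refiner, env, harness, F, total_steps):
        """Agent acts continuously; every F steps the Refiner reads
        the recent trajectory and edits the harness. No resets."""
        trajectory = []
        obs = env.reset()
        for step in range(total_steps):
            action = agent.act(obs, harness)    # model + current harness
            obs = env.step(action)
            trajectory.append((obs, action))
            if (step + 1) % F == 0:
                # Refiner proposes edits to prompt/skills/memory/sub-agents
                # based on the last F steps; the harness persists across calls.
                harness = refiner.refine(harness, trajectory[-F:])
        return harness

The key design point is that the harness object is never reset: edits accumulate across the whole run, so skills discovered early stay available late.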
Our key findings:
(1) Iterative harness refinement closes most of the gap to a hand-engineered version.
(2) Long-horizon agency requires self-refinement, and self-refinement requires a useful model.
(3) The future of agents is model-harness co-learning.
Demos: https://sethkarten.ai/continual-harness