OpenAI published a great post on harness engineering — the idea that when agents do the execution, the engineer’s job becomes designing the environment, specs, and feedback loops that make reliable work possible.
Source: https://openai.com/index/harness-engineering/
This is how we translate those ideas into an OpenClaw setup that can run while you sleep, ship PRs, and not leak secrets.
If you’re building production AI workflows for growth + operations, start here: https://zeiko.io.
1) Humans steer, agents execute
Agent-first doesn’t mean “hands off.” It means:
- Humans provide intent, constraints, and acceptance criteria.
- Agents run the loops: implement → test → fix → document → PR.
The moment you stop writing clear acceptance criteria, you’ll get output that looks busy but doesn’t ship.
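One way to make that rule hard to skip is to refuse to dispatch a task that has no acceptance criteria. The sketch below is a minimal illustration, not part of OpenClaw: `TaskSpec`, its fields, and `problems()` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    """What a human hands to an agent before work starts (hypothetical shape)."""
    intent: str
    constraints: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return reasons this spec is not ready to dispatch; empty means ready."""
        issues = []
        if not self.intent.strip():
            issues.append("intent is empty")
        if not self.acceptance_criteria:
            issues.append("no acceptance criteria: the agent cannot tell when it is done")
        return issues


spec = TaskSpec(
    intent="Add retry logic to the webhook sender",
    constraints=["no new dependencies"],
    acceptance_criteria=["unit tests cover the retry path", "CI is green"],
)
print(spec.problems())  # an empty list means the task can be handed to an agent
```

A scheduler could call `problems()` before queuing work and send the spec back to its author when the list is not empty.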
2) The harness is the product
The “harness” is everything around the model that turns intent into correct changes:
- repo structure
- scripts and linters
- skills (procedures)
- CI signals and tests
- observability and logs
This is the multiplier. When the harness improves, every future task gets easier.
3) Make the system legible to the agent
If it isn’t in the repo (or accessible through tools), it doesn’t exist as far as the agent is concerned.
Practical steps:
- keep docs versioned in-repo
- keep checklists close to code
- prefer stable folder structure
- standardize how artifacts are written
At Zeiko we treat the repository as the system of record so agents can operate without “tribal knowledge.” Learn more: https://zeiko.io.
4) Skills as SOPs (and routing logic)
Skills are reusable procedures. Their descriptions should act like routing rules:
- Use when …
- Do not use when …
- Output should be …
Put templates and examples inside the skill so they load only when needed.
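To make the routing idea concrete, here is a rough sketch of a skill description as data. In a real setup the model reads the description and decides for itself; the keyword matching below is only a stand-in for that decision, and `Skill`, `route`, and the example skills are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    use_when: tuple[str, ...]         # lowercase phrases that should trigger the skill
    do_not_use_when: tuple[str, ...]  # lowercase phrases that rule it out
    output: str                       # what the skill is expected to produce


def route(task: str, skills: list[Skill]) -> list[str]:
    """Return names of skills whose 'use when' phrases match and whose exclusions do not."""
    text = task.lower()
    matched = []
    for skill in skills:
        wanted = any(phrase in text for phrase in skill.use_when)
        excluded = any(phrase in text for phrase in skill.do_not_use_when)
        if wanted and not excluded:
            matched.append(skill.name)
    return matched


skills = [
    Skill("release-notes", ("release notes", "changelog"), ("hotfix",), "a markdown file"),
    Skill("db-migration", ("migration",), (), "a SQL file plus a rollback script"),
]
print(route("Write the changelog for the next release", skills))
```

The point of the structure is that "do not use when" is checked as strictly as "use when", so a skill stays out of tasks it was never meant to handle.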
5) Feedback loops: tests, CI, and small PRs
A reliable agent pipeline looks like:
- small PRs
- fast checks
- clear failures
- tight iteration
Concrete practices:
- add a linter for “soft” invariants (naming, metadata correctness)
- gate merges on CI
- prefer incremental improvements over giant refactors
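A linter for “soft” invariants can be very small. The sketch below assumes each skill lives in a kebab-case folder and carries `name` and `description` metadata; those conventions and the `lint_skill` helper are assumptions made for illustration, not a documented OpenClaw format.

```python
import re

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
REQUIRED_KEYS = ("name", "description")  # assumed metadata fields


def lint_skill(folder_name: str, metadata: dict[str, str]) -> list[str]:
    """Return human-readable violations; an empty list means the skill passes."""
    errors = []
    if not KEBAB_CASE.match(folder_name):
        errors.append(f"{folder_name}: folder name should be kebab-case")
    for key in REQUIRED_KEYS:
        if not metadata.get(key, "").strip():
            errors.append(f"{folder_name}: missing metadata field '{key}'")
    if metadata.get("name") and metadata["name"] != folder_name:
        errors.append(f"{folder_name}: metadata name '{metadata['name']}' does not match folder")
    return errors


for message in lint_skill("PR_Review", {"name": "pr-review"}):
    print(message)
```

In CI, a wrapper would exit non-zero whenever the list is not empty, so the merge gate fails with a message the agent can act on directly.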
6) Safe GitHub automation (don’t leak tokens)
Two separate capabilities:
- Git transport (clone/push): prefer SSH deploy keys for a single repo
- GitHub API (PR list, comments): prefer gh auth with a fine-grained token
Avoid the #1 foot-gun:
- ❌ putting a token in the remote URL (it is stored in plain text in .git/config and tends to surface in logs and shell history)
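A harness can check for this foot-gun automatically. The following sketch uses only the Python standard library; `has_embedded_credentials` is a hypothetical helper, and it deliberately flags any username in an HTTPS remote because tokens are often placed in that position.

```python
from urllib.parse import urlsplit


def has_embedded_credentials(remote_url: str) -> bool:
    """True if an http(s) remote URL carries a username or a password/token."""
    parts = urlsplit(remote_url)
    if parts.scheme not in ("http", "https"):
        # SSH remotes such as git@github.com:org/repo.git authenticate with keys instead.
        return False
    return parts.username is not None or parts.password is not None


# EXAMPLE_TOKEN is a placeholder, not a real credential.
print(has_embedded_credentials("https://x-access-token:EXAMPLE_TOKEN@github.com/org/repo.git"))
print(has_embedded_credentials("git@github.com:org/repo.git"))
```

Feeding it the output of `git config --get remote.origin.url` in a scheduled check is one way to catch a leaked token before it spreads into logs.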
If you’re standardizing this for a team, we can help you build the playbook: https://zeiko.io.
7) Running while you sleep: scheduled checks
Once the harness is in place, you can safely run unattended loops:
- PR monitor (only ping on red)
- gateway stability monitor
- doc-gardening agent (fix stale docs)
- nightly “build + lint + typecheck”
The key is alerting only when action is needed.
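“Only ping on red” can be expressed as a filter over check results. This is a minimal sketch with hypothetical names (`CheckResult`, `alerts_for`); how the checks run and where the messages are delivered is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str         # e.g. "PR monitor" or "nightly build"
    ok: bool
    detail: str = ""  # short explanation, used only when the check fails


def alerts_for(results: list[CheckResult]) -> list[str]:
    """Return one message per failing check; passing checks produce no output."""
    return [f"[RED] {r.name}: {r.detail or 'failed'}" for r in results if not r.ok]


results = [
    CheckResult("nightly build", ok=True),
    CheckResult("PR monitor", ok=False, detail="CI failing on an open PR"),
]
for message in alerts_for(results):
    print(message)
```

When every check passes the function returns an empty list, so the scheduled job sends nothing and a quiet morning means everything stayed green.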
If you want an agent-first workflow that’s reliable enough for production (not just demos), start at https://zeiko.io.