Name: Zeiko - AI Business Manager
Availability: InStock
Rating: 4.9 (247 reviews)
Author: Zeiko

We did not migrate because the Vercel AI SDK stopped being useful.

It is still a strong way to build streaming AI interfaces, structured output utilities, and fast product surfaces. The problem was that Zeiko had outgrown a chat-first model integration pattern.

Our product is no longer just "send a message, get a response." Zeiko runs business agents that:

operate across chat, website widgets, Slack, Telegram, WhatsApp, GitHub, and scheduled ticks
call tools with account-scoped permissions
pause for approvals
resume long-running work
hand off to humans
publish to external systems
keep auditable memory, events, and runtime evidence

That is a different job. For that job, the OpenAI Agents SDK became the better runtime boundary.

The short version

We moved model-and-tool orchestration into the OpenAI Agents SDK because our agents needed a more explicit execution model.

The Vercel AI SDK helped us build fast UI and model utilities. The OpenAI Agents SDK gave us a clearer home for production agent behavior: tools, guardrails, handoffs, state, tracing, and rollout evidence.

We still keep UI transport and retained utility paths where they make sense. The migration was not "replace everything." It was "put durable agent work in the runtime designed for durable agent work."

Vercel AI SDK vs OpenAI Agents SDK

Decision area	Vercel AI SDK pattern	OpenAI Agents SDK pattern	Why it mattered for Zeiko
Primary job	AI UI, streaming, structured generation	Agent execution, tools, handoffs, guardrails	Our core product became deployed agents, not isolated AI calls
Tool orchestration	App-defined tool loops around model calls	Agent/Runner execution with SDK tools	Easier to reason about account-safe tool policy and runtime outcomes
Long-running work	Usually app-level orchestration	Clearer agent execution boundary	Better fit for approvals, retries, scheduled ticks, and resumable workflows
Observability	Mostly our own wrappers and logs	SDK-native tracing concepts plus our event store	Less custom glue around agent outcomes
Risk control	Possible, but mostly app-defined	Guardrails and approvals are part of the agent model	Safer for publishing, data access, browser automation, and external channels
Best retained use	UI streaming, one-shot utilities, schema helpers	Production agent runtime paths	We kept each tool where it had the strongest job fit

What changed in our product

The migration forced us to separate three categories that had started to blur.

1. Agent runtime work

This is work where the system is not just generating text. It is deciding, using tools, applying policy, recording state, and producing an operator-visible outcome.

Examples:

main agent chat
selected-agent execution
website widget runtime
external channel replies
autonomous ticks
delegated worker tasks
approval-backed workflow execution
YouTube Niche Finder tool execution

These paths now belong behind the OpenAI Agents SDK runtime boundary.

2. Retained utilities

Some AI calls are still just utilities. They do not need a full agent runner.

Examples:

simple classifiers
one-shot content generation
schema normalization
OCR-style helpers
internal formatting calls
UI streaming transport

Moving these for the sake of purity would add complexity without reducing risk. So we kept them as utilities and documented why.

3. Blocked high-risk lanes

Some surfaces needed evidence before they could become SDK-default:

social publishing and visual generation
workspace memory/data export
GitHub operations
Shopify inventory and finance tools
browser/code execution
broader YouTube growth workflows

For these, we built local certification gates before promotion. The point was not to "trust the migration." The point was to prove each risky lane had approval, quota, sandbox, credential, and rollback evidence.

The real reason: our old boundary was too implicit

When a product is young, implicit orchestration feels fast.

You call a model. You pass tools. You parse output. You add a few wrappers. You add more wrappers. Then approvals arrive. Then external channels arrive. Then scheduled jobs arrive. Then publishing arrives. Then one day, the "AI call" is actually a distributed business operation with credentials, audit logs, retries, and human review.

That is where our previous boundary started to hurt.

We needed a runtime that made the following things explicit:

What agent is running?
Which tools are allowed?
What state was read?
What approval did it require?
What channel did it reply to?
What happened when a tool failed?
Can the same behavior run from chat, a widget, or a scheduled tick?
Can we prove this lane is ready before making it default?

The OpenAI Agents SDK gave us a cleaner center of gravity for that.

Why not stay on the Vercel AI SDK forever?

The honest answer is switching has a cost.

This is where status quo bias can trick engineering teams. Staying with a familiar SDK feels safer because the failure modes are known. But known friction is still friction.

For us, the cost of staying looked like this:

more app-specific wrappers around every tool loop
more duplicated runtime logic across chat and channels
more custom approval/resume handling
more difficulty proving high-risk capability lanes were safe
more ambiguity about which AI paths were true agent execution and which were utilities

The migration was worth it once the cost of keeping the old shape became larger than the cost of changing it.

What we kept from Vercel AI SDK patterns

We did not throw away everything around the Vercel AI SDK.

The most important retained idea is still excellent: build great AI product interfaces. Users need streaming, responsive UI, stable message transport, and predictable rendering.

So our migration kept the parts that were still doing the right job:

UI message streaming where it is purely transport
retained one-shot utilities where a full agent runner is unnecessary
compatibility fallbacks during rollout
structured output wrappers where the call site is not operating as an autonomous agent

This matters because migrations fail when they become ideological. We were not trying to win a framework argument. We were trying to make the runtime match the product.

Why OpenAI Agents SDK fit the next stage

Zeiko's product vision is deployable AI agents for small businesses. That means the agent is not a sidebar feature; it is the operating layer.

The OpenAI Agents SDK lined up with five product needs.

1. Tools are first-class

Agents need tools, but not every tool should be available in every context.

A support agent, SEO agent, Shopify agent, and social publishing agent need different permissions. The runtime needs to make tool availability visible and enforceable.

2. Handoffs are real product events

An agent should be able to escalate to a human, delegate to a specialist, or resume after a worker finishes. That cannot be treated as just another generated paragraph.

3. Approvals are part of execution

For destructive actions, external publishing, data export, and account mutations, approval is not optional UX. It is part of the runtime contract.

4. State needs one source of truth

The same agent may run from the web app, a website widget, Slack, Telegram, WhatsApp, or a scheduled tick. If each surface carries its own state model, behavior drifts.

5. Rollout evidence matters

The migration had to be provable:

local smoke tests
hosted smoke tests
blocked-lane certification
default-runtime promotion
post-flip smoke
fallback-removal readiness

That evidence is what let us move from "it should work" to "we can merge this without guessing."

What improved after the migration

The most important improvement was not raw model quality. It was operational clarity.

We can now answer questions like:

Is this lane SDK-backed, retained, or blocked?
What evidence allowed it to become default?
Which account/agent cohort is promoted?
Did post-flip smoke resolve the SDK runtime?
Is the fallback still present for rollback only?
Which utilities are intentionally retained?

That clarity compounds. Every future agent feature has a better place to live.

What got harder

The migration was not free.

The hardest parts were not syntax changes. They were product-boundary decisions:

deciding which paths were true agent runtime versus utility calls
keeping compatibility fallbacks without letting them become permanent ambiguity
proving social publishing and external-channel behavior without breaking production
documenting retained paths so future work does not accidentally reopen the migration
keeping local evidence strong enough when hosted credits were constrained

This is the part most migration posts skip. The code change is only half the work. The other half is reducing regret risk for the team that has to operate the system after merge.

A practical migration checklist

If you are deciding whether to migrate from Vercel AI SDK patterns to OpenAI Agents SDK, use this checklist.

Stay with lighter AI SDK patterns when:

the feature is mostly UI streaming
the model call is one-shot and stateless
tools are simple and low-risk
no approval, handoff, or durable runtime state is needed
failure is easy to retry manually

Move to an agent runtime when:

tools mutate customer or business data
work can run from multiple channels
the agent must pause and resume
approval is required before execution
external publishing or integrations are involved
you need operator-visible evidence and replay
scheduled/autonomous execution matters

For Zeiko, the second list became the product.

FAQ

Is Vercel AI SDK bad for agents?

No. It is very useful for AI interfaces and app-level AI features. Our issue was not quality; it was fit. Zeiko needed a stronger runtime boundary for deployed agents, tools, approvals, and cross-channel state.

Did Zeiko remove every Vercel AI SDK usage?

No. We retained utility and transport paths where they still fit. The migration focused on production agent execution, not ideological removal.

Why migrate if the old system worked?

Because "works" is not the same as "scales safely." The old shape worked for narrower AI features. It became expensive once agents needed durable state, approvals, external publishing, and multi-channel behavior.

What was the biggest risk?

High-risk capability lanes: publishing, browser/code execution, data export, Shopify operations, GitHub operations, and broad YouTube workflows. We kept those behind certification gates until evidence existed.

What is the lesson for other teams?

Choose the runtime based on the job. Use lightweight AI SDK patterns for lightweight AI features. Use an agent runtime when the product is actually asking an agent to operate a business process.

The takeaway

The migration was not about chasing a newer SDK.

It was about admitting that Zeiko had crossed a product threshold. Once agents can use tools, publish, approve, hand off, remember, retry, and run across channels, the runtime becomes part of the product's trust model.

That is why we moved production agent execution to the OpenAI Agents SDK.

If your team is building AI features, keep the stack simple. If your team is building agents that operate real workflows, invest in the runtime boundary early.

That boundary is where reliability starts.

Why we migrated from Vercel AI SDK to OpenAI Agents SDK

What is the main takeaway from this engineering article?

Continue Reading

Turn operational know-how into agent-first execution.

Share

Related

Harness engineering for agent-first teams (what we borrowed from Codex)