Vercel AI SDK
Vercel AI SDK is the standard library for LLM-powered web apps: a unified provider interface (Anthropic, OpenAI, Google, Mistral), streaming primitives (streamText, useChat), and automatic tool-call cycles. The main value is streaming UX: responses appear token-by-token without you writing any SSE infrastructure.
Install
pnpm add ai @ai-sdk/anthropic @ai-sdk/openai
Core Primitives
streamText — Server-Side Streaming
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
// app/api/chat/route.ts
export async function POST(req: Request) {
const { messages } = await req.json()
const result = streamText({
model: anthropic('claude-sonnet-4-6'),
system: 'You are a helpful assistant.',
messages,
})
return result.toDataStreamResponse()
}
toDataStreamResponse() returns a streaming Response using the AI SDK's data stream protocol. The client-side hooks understand this format.
generateText — Non-Streaming
import { generateText } from 'ai'
const { text, usage } = await generateText({
model: anthropic('claude-sonnet-4-6'),
prompt: 'Summarise this document in 3 bullet points.',
})
console.log(`Tokens used: ${usage.totalTokens}`)
generateObject — Structured Output
import { generateObject } from 'ai'
import { z } from 'zod'
const { object } = await generateObject({
model: anthropic('claude-sonnet-4-6'),
schema: z.object({
sentiment: z.enum(['positive', 'negative', 'neutral']),
confidence: z.number().min(0).max(1),
summary: z.string(),
}),
prompt: `Analyse the sentiment of: "${userReview}"`,
})
// object is fully typed and validated
Client-Side: useChat
'use client'
import { useChat } from 'ai/react'
export function Chat() {
const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
api: '/api/chat',
onError: (error) => console.error(error),
})
return (
<div>
{messages.map(m => (
<div key={m.id} className={m.role === 'user' ? 'text-right' : 'text-left'}>
{m.content}
</div>
))}
<form onSubmit={handleSubmit}>
<input value={input} onChange={handleInputChange} disabled={isLoading} />
<button type="submit" disabled={isLoading}>Send</button>
</form>
</div>
)
}
useChat manages message history, sends it to your API route, and streams the response into messages as tokens arrive.
useCompletion — Single-Turn Completion
'use client'
import { useCompletion } from 'ai/react'
export function SummaryButton({ text }: { text: string }) {
const { complete, completion, isLoading } = useCompletion({ api: '/api/summarise' })
return (
<>
<button onClick={() => complete(text)} disabled={isLoading}>
Summarise
</button>
{completion && <p>{completion}</p>}
</>
)
}
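The matching server route isn't shown above; a minimal sketch, assuming an App Router handler at app/api/summarise/route.ts and relying on useCompletion posting its argument as prompt:
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

// app/api/summarise/route.ts (hypothetical counterpart to the hook above)
export async function POST(req: Request) {
  const { prompt } = await req.json() // useCompletion sends { prompt } by default
  const result = streamText({
    model: anthropic('claude-sonnet-4-6'),
    system: 'Summarise the given text in 3 bullet points.',
    prompt,
  })
  return result.toDataStreamResponse()
}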
Tool Calling
Tools are defined with Zod schemas; the SDK handles the tool-call → execute → result cycle automatically.
import { streamText, tool } from 'ai'
import { z } from 'zod'
const result = streamText({
model: anthropic('claude-sonnet-4-6'),
tools: {
getWeather: tool({
description: 'Get current weather for a location',
parameters: z.object({
location: z.string().describe('City name or coordinates'),
unit: z.enum(['celsius', 'fahrenheit']).default('celsius'),
}),
execute: async ({ location, unit }) => {
const weather = await fetchWeatherAPI(location, unit)
return { temperature: weather.temp, condition: weather.condition }
},
}),
},
messages,
})
Set maxSteps to allow multi-step tool use (model calls tool → sees result → calls another tool):
const result = streamText({
model: anthropic('claude-sonnet-4-6'),
tools: { getWeather, searchWeb, calculateRoute },
maxSteps: 5,
messages,
})
Multi-Provider Setup
import { anthropic } from '@ai-sdk/anthropic'
import { openai } from '@ai-sdk/openai'
import { google } from '@ai-sdk/google'
// Route by capability or cost
const model = useCase === 'reasoning'
? anthropic('claude-opus-4-7')
: useCase === 'fast'
? anthropic('claude-haiku-4-5-20251001')
  : openai('gpt-4o')
The streamText interface is identical regardless of provider. Swap models without changing application code.
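A sketch of centralising that routing behind a helper, so call sites stay provider-agnostic (pickModel and UseCase are hypothetical names, not SDK APIs):
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { openai } from '@ai-sdk/openai'

type UseCase = 'reasoning' | 'fast' | 'default'

// One place to change when models or providers rotate
function pickModel(useCase: UseCase) {
  switch (useCase) {
    case 'reasoning':
      return anthropic('claude-opus-4-7')
    case 'fast':
      return anthropic('claude-haiku-4-5-20251001')
    default:
      return openai('gpt-4o')
  }
}

// Call sites never name a provider
const result = streamText({ model: pickModel('fast'), messages })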
Streaming with Custom Data
Send structured data alongside the text stream:
import { streamText, createDataStreamResponse } from 'ai'
export async function POST(req: Request) {
const { messages } = await req.json()
return createDataStreamResponse({
execute: async (dataStream) => {
// Send metadata immediately
dataStream.writeData({ type: 'sources', sources: retrievedDocs })
const result = streamText({
model: anthropic('claude-sonnet-4-6'),
messages,
})
result.mergeIntoDataStream(dataStream)
},
})
}On the client, useChat exposes data alongside messages:
const { messages, data } = useChat({ api: '/api/chat' })
const sources = data?.filter(d => d.type === 'sources').at(-1)?.sources
Error Handling
import { streamText, APICallError } from 'ai'

try {
  const result = streamText({ model, messages })
  for await (const chunk of result.textStream) {
    process.stdout.write(chunk)
  }
} catch (error) {
  if (APICallError.isInstance(error)) {
    console.error(`API error ${error.statusCode}: ${error.message}`)
    // Retry logic, fallback model, etc.
  }
}
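The commented fallback path above, made concrete: a minimal sketch assuming a non-streaming generateText retry against a second provider is acceptable (generateWithFallback is a hypothetical helper):
import { generateText, APICallError } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { openai } from '@ai-sdk/openai'

// Hypothetical helper: try the primary model, switch providers on API errors
async function generateWithFallback(prompt: string) {
  try {
    return await generateText({ model: anthropic('claude-sonnet-4-6'), prompt })
  } catch (error) {
    if (APICallError.isInstance(error)) {
      // Illustrative policy: consider retries/backoff before switching providers
      return await generateText({ model: openai('gpt-4o'), prompt })
    }
    throw error
  }
}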
Middleware
Add cross-cutting concerns (logging, caching, rate limiting) without changing route handlers:
import { wrapLanguageModel, extractReasoningMiddleware } from 'ai'
const modelWithReasoning = wrapLanguageModel({
model: anthropic('claude-opus-4-7'),
middleware: extractReasoningMiddleware({ tagName: 'think' }),
})
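Custom middleware is a plain object with optional transformParams / wrapGenerate / wrapStream hooks. A minimal logging sketch, assuming the AI SDK 4 LanguageModelV1Middleware shape (the timing log is illustrative, not an SDK built-in):
import { wrapLanguageModel, type LanguageModelV1Middleware } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

// Log latency of each non-streaming generate call
const loggingMiddleware: LanguageModelV1Middleware = {
  wrapGenerate: async ({ doGenerate }) => {
    const start = Date.now()
    const result = await doGenerate()
    console.log(`doGenerate took ${Date.now() - start}ms`)
    return result
  },
}

const loggedModel = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-6'),
  middleware: loggingMiddleware,
})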
Key Facts
- Install: pnpm add ai @ai-sdk/anthropic @ai-sdk/openai
- Three server primitives: streamText (streaming), generateText (non-streaming), generateObject (structured output via Zod)
- toDataStreamResponse() returns the streaming response; useChat on the client understands this protocol
- maxSteps in streamText enables multi-step tool-call cycles — model calls tool, sees result, calls next
- createDataStreamResponse + dataStream.writeData() sends structured metadata alongside the token stream
- wrapLanguageModel adds middleware (logging, caching, reasoning extraction) without changing route handlers
- APICallError.isInstance(error) is the typed check for provider errors in catch blocks
Common Failure Cases
useChat messages show undefined for tool call results because the tool response is not returned to toDataStreamResponse
Why: when a tool is defined in streamText, the tool's execute function must return a serialisable value; if it returns undefined or throws an unhandled error, the data stream protocol cannot include the tool result and useChat receives an incomplete message.
Detect: messages in the useChat state show the tool call but no tool result; the UI stops updating mid-stream; adding console.log in execute shows the function threw an error.
Fix: wrap execute in try/catch and return a structured error object on failure; never return undefined — return { error: "..." } instead.
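A sketch of the guarded execute (fetchWeatherAPI is the same placeholder used earlier; the { error } shape is a convention, not an SDK requirement):
import { tool } from 'ai'
import { z } from 'zod'

const getWeather = tool({
  description: 'Get current weather for a location',
  parameters: z.object({ location: z.string().describe('City name') }),
  execute: async ({ location }) => {
    try {
      const weather = await fetchWeatherAPI(location)
      return { temperature: weather.temp, condition: weather.condition }
    } catch (err) {
      // Never return undefined: a serialisable error object keeps the stream intact
      return { error: err instanceof Error ? err.message : 'weather lookup failed' }
    }
  },
})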
generateObject fails with NoObjectGeneratedError because the Zod schema is too complex for the model to satisfy
Why: deeply nested Zod schemas with many optional fields and complex validation constraints require the model to produce very specific JSON; the model occasionally fails to satisfy all constraints in one generation, and without retries, generateObject throws.
Detect: NoObjectGeneratedError in production logs; the error is intermittent — most requests succeed but 1-5% fail on complex schemas.
Fix: set mode: 'json' on the model call if the provider supports JSON mode; simplify the schema by removing optional fields that are rarely needed; add retry logic with maxRetries on the generateObject call.
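Applied to the earlier sentiment example (the retry count is arbitrary; mode: 'json' support varies by provider):
import { generateObject } from 'ai'
import { z } from 'zod'
import { anthropic } from '@ai-sdk/anthropic'

const { object } = await generateObject({
  model: anthropic('claude-sonnet-4-6'),
  mode: 'json', // prefer provider-native JSON output where supported
  maxRetries: 3, // per the fix above: give the call a few attempts before surfacing the error
  schema: z.object({
    sentiment: z.enum(['positive', 'negative', 'neutral']),
    summary: z.string(), // trimmed schema: rarely-needed optional fields removed
  }),
  prompt: `Analyse the sentiment of: "${userReview}"`,
})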
maxSteps exceeded because a tool always returns data that triggers another tool call
Why: with maxSteps: 5, if each tool result causes the model to call another tool rather than generating a final text response, the cycle exhausts maxSteps and the generation ends without a complete response.
Detect: streaming ends abruptly after exactly maxSteps tool calls with no final assistant text; the model's reasoning shows it expected to make another tool call.
Fix: add a finalize tool that the model can call when it is ready to give the final answer; or instruct the model in the system prompt to provide a text summary after tool results rather than continuing to call tools.
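A sketch of the prompt-level fix (the system text is illustrative):
const result = streamText({
  model: anthropic('claude-sonnet-4-6'),
  system:
    'Use tools to gather information. Once you have what you need, ' +
    'answer in plain text instead of calling another tool.',
  tools: { getWeather, searchWeb, calculateRoute },
  maxSteps: 5,
  messages,
})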
useChat sends the full conversation history on every request, causing token costs to grow unboundedly in long conversations
Why: useChat maintains the full messages array and sends all messages on every submit; a 50-turn conversation sends 50 messages worth of tokens on turn 51, causing costs and latency to grow linearly.
Detect: LLM API costs per session grow with session length; the messages array passed to streamText on the server grows without bound.
Fix: implement a message window on the server-side API route: messages.slice(-20) to keep only the last 20 messages; or add summarisation to compress older context.
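A sketch of the server-side window (20 turns is an arbitrary cutoff; size it to your context budget):
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

export async function POST(req: Request) {
  const { messages } = await req.json()
  const result = streamText({
    model: anthropic('claude-sonnet-4-6'),
    // Keep only the most recent turns; summarise older context elsewhere if needed
    messages: messages.slice(-20),
  })
  return result.toDataStreamResponse()
}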
Connections
- web-frameworks/nextjs — App Router API routes and useChat hook integration
- apis/anthropic-api — @ai-sdk/anthropic wraps the Anthropic Messages API; prompt caching is supported
- agents/langgraph — when you need stateful agent loops, checkpointing, or HITL beyond what useChat provides
- protocols/tool-design — writing good tool descriptions; the same principles apply to AI SDK tool() definitions
- observability/platforms — logging token usage and latency from generateText usage via Langfuse
Open Questions
- How does the Vercel AI SDK's data stream protocol compare to raw SSE for browser compatibility and debugging?
- At what maxSteps count do tool-call cycles in streamText become impractical — and what is the failure mode?
- Does generateObject with complex nested Zod schemas reliably validate at the same rate across Anthropic vs OpenAI models?