Spring AI
Spring AI is the Spring project's official LLM integration for Spring Boot: an auto-configured ChatClient, Advisors for RAG and memory, a pgvector store, and function calling via Spring beans.
It follows Spring conventions (auto-configuration, dependency injection, application.properties or application.yml) and integrates with the broader Spring ecosystem (Spring Data, Spring Security, Spring Web), which makes it the natural choice for teams already on Spring Boot.
Spring AI vs LangChain4j: Spring AI is better for teams already on Spring Boot; LangChain4j is better for teams that want richer agent capabilities or a standalone library. See java/langchain4j for comparison.
Dependencies
<!-- pom.xml: Spring Boot 3.3+ required -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.springframework.ai</groupId>
      <artifactId>spring-ai-bom</artifactId>
      <version>1.0.0</version> <!-- check for latest -->
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
<dependencies>
  <!-- Anthropic (artifact was spring-ai-anthropic-spring-boot-starter before 1.0.0-M7) -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
  </dependency>
  <!-- OpenAI (artifact was spring-ai-openai-spring-boot-starter before 1.0.0-M7) -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
  </dependency>
  <!-- Vector store (pgvector; artifact was spring-ai-pgvector-store-spring-boot-starter before 1.0.0-M7) -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
  </dependency>
</dependencies>
Configuration
# application.yml
spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-sonnet-4-6
          max-tokens: 1024
    openai:
      api-key: ${OPENAI_API_KEY}
      embedding:
        options:
          model: text-embedding-3-large
          dimensions: 1536  # shorten the 3072-dimension default to fit a 1536-dimension pgvector column
Spring AI reads these at startup and auto-configures a ChatClient.Builder and an EmbeddingModel bean.
ChatClient: Basic Conversation
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

@Service
public class AssistantService {

    private final ChatClient chatClient;

    // Spring injects the auto-configured ChatClient.Builder
    public AssistantService(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultSystem("You are a helpful assistant for a Java developer.")
                .build();
    }

    // Blocking call: returns the complete response text
    public String ask(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }

    // Streaming response: emits text chunks as they arrive
    public Flux<String> stream(String question) {
        return chatClient.prompt()
                .user(question)
                .stream()
                .content();
    }
}
Structured Output
Map model responses to Java records/classes:
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

public record SentimentAnalysis(
        String sentiment,   // "positive", "negative", "neutral"
        double confidence,
        String reasoning
) {}

@Service
public class SentimentService {

    private final ChatClient chatClient;

    // The final field needs a constructor; build the client from the injected builder
    public SentimentService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public SentimentAnalysis analyse(String text) {
        return chatClient.prompt()
                .user(u -> u.text("Analyse the sentiment of: {text}")
                        .param("text", text))
                .call()
                .entity(SentimentAnalysis.class);
    }
}
Spring AI derives a JSON schema from the record's structure, adds it to the prompt as format instructions, and parses the model's JSON reply into the target type.
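To make the schema-from-class idea concrete, here is a minimal plain-Java sketch that walks a record's components and produces a flat JSON-shape hint. It is not Spring AI's implementation: RecordShapeSketch, jsonType, and describe are hypothetical names introduced for illustration, and the type mapping is deliberately coarse (no nesting, lists, or enums).

```java
import java.lang.reflect.RecordComponent;
import java.util.StringJoiner;

/** Illustrative only: derives a flat JSON-shape hint from a record's components. */
final class RecordShapeSketch {

    /** Same shape as the record used in the section above. */
    record SentimentAnalysis(String sentiment, double confidence, String reasoning) {}

    /** Maps a Java type to a coarse JSON type name (assumption: flat records only). */
    static String jsonType(Class<?> type) {
        if (type == String.class || type == char.class || type == Character.class) {
            return "string";
        }
        if (type == boolean.class || type == Boolean.class) {
            return "boolean";
        }
        if (type.isPrimitive() || Number.class.isAssignableFrom(type)) {
            return "number";
        }
        return "object";
    }

    /** Lists each component as "name": "type", in declaration order. */
    static String describe(Class<? extends Record> recordType) {
        StringJoiner fields = new StringJoiner(", ", "{", "}");
        for (RecordComponent component : recordType.getRecordComponents()) {
            fields.add("\"" + component.getName() + "\": \""
                    + jsonType(component.getType()) + "\"");
        }
        return fields.toString();
    }
}
```

Calling RecordShapeSketch.describe(RecordShapeSketch.SentimentAnalysis.class) yields a one-line description of the three fields and their JSON types, which is the kind of hint a framework can append to a prompt before parsing the reply.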
Function Calling
import org.springframework.ai.tool.ToolCallback;
import org.springframework.ai.tool.function.FunctionToolCallback;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Spring AI 1.0.0 API; pre-1.0 milestones used FunctionCallback and .functions("name") instead
@Configuration
public class WeatherToolConfig {

    // Define the function as a Spring bean; the first builder argument is the tool name
    @Bean
    public ToolCallback weatherFunction(WeatherService weatherService) {
        return FunctionToolCallback
                .builder("getCurrentWeather", (WeatherRequest req) ->
                        // Execute the actual weather lookup
                        weatherService.getWeather(req.location(), req.unit()))
                .description("Get the current weather for a given location. " +
                        "Use Celsius for European locations, Fahrenheit for US.")
                .inputType(WeatherRequest.class)
                .build();
    }
}

public record WeatherRequest(String location, String unit) {}

// Use in chat
public String chatWithWeather(String message) {
    return chatClient.prompt()
            .user(message)
            .toolNames("getCurrentWeather") // enable the function by its exact name
            .call()
            .content();
}
Spring AI handles the full tool-call cycle: it sends the tool definition to the model, receives the model's call, executes the bean, and returns the result to the model.
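The dispatch step of that cycle can be sketched in plain Java as a name-keyed registry. This is an illustration under simplifying assumptions, not Spring AI's internals: ToolCycleSketch and ToolCall are hypothetical names, and each tool takes and returns a single string rather than a typed request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: a name-keyed tool registry and one dispatch step. */
final class ToolCycleSketch {

    /** A tool call as a model might request it: a tool name plus one raw argument. */
    record ToolCall(String name, String argument) {}

    private final Map<String, Function<String, String>> tools = new LinkedHashMap<>();

    /** Registers a tool under the exact name that is advertised to the model. */
    void register(String name, Function<String, String> tool) {
        tools.put(name, tool);
    }

    /** Runs the requested tool and returns the text that would be sent back to the model. */
    String dispatch(ToolCall call) {
        Function<String, String> tool = tools.get(call.name());
        if (tool == null) {
            // Lookup is by exact string, so "get_current_weather" does not match "getCurrentWeather"
            throw new IllegalArgumentException("No tool registered as '" + call.name() + "'");
        }
        return tool.apply(call.argument());
    }
}
```

Because the lookup is an exact string match, registering "getCurrentWeather" and dispatching "get_current_weather" fails, which is the same class of mistake as a mismatch between the registered tool name and the name enabled on the chat request.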
RAG Pipeline
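import java.util.List;
import java.util.stream.Collectors;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader; // needs the spring-ai-pdf-document-reader dependency
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

// Spring AI 1.0.0 API: SearchRequest.builder() and Document.getText() replace the
// milestone-era SearchRequest.query(...).withTopK(...) and Document.getContent()
@Service
public class DocumentService {

    private final VectorStore vectorStore;

    public DocumentService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Ingest documents
    public void ingest(Resource pdfResource) {
        var reader = new PagePdfDocumentReader(pdfResource);
        // Arguments: chunk size (tokens), min chunk size (chars), min length to embed, max chunks, keep separators
        var splitter = new TokenTextSplitter(512, 50, 5, 10_000, true);
        List<Document> docs = splitter.apply(reader.get());
        vectorStore.add(docs); // embeds each chunk with the configured EmbeddingModel, then stores it
    }

    // Search
    public List<Document> search(String query, int topK) {
        return vectorStore.similaritySearch(
                SearchRequest.builder().query(query).topK(topK).build());
    }
}

// Full RAG chat
@Service
public class RagService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagService(ChatClient.Builder builder, VectorStore vectorStore) {
        this.chatClient = builder.build();
        this.vectorStore = vectorStore;
    }

    public String answer(String question) {
        List<Document> context = vectorStore.similaritySearch(
                SearchRequest.builder().query(question).topK(5).build());
        String contextText = context.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n\n"));
        return chatClient.prompt()
                // {context} is a template variable, so it is bound on the system spec
                .system(s -> s.text("""
                        Answer using only the provided context.
                        If the answer isn't in the context, say so.
                        Context: {context}
                        """)
                        .param("context", contextText))
                .user(question)
                .call()
                .content();
    }
}
pgvector Configuration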
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/mydb
    username: postgres
    password: password
  ai:
    vectorstore:
      pgvector:
        initialize-schema: true  # opt in to automatic table creation
        index-type: HNSW
        distance-type: COSINE_DISTANCE
        dimensions: 1536  # must equal the embedding size; text-embedding-3-large returns 3072 unless its dimensions option is set
-- Required extensions
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Spring AI creates the vector_store table on startup when initialize-schema is true
Advisors: Middleware for ChatClient
Advisors intercept the request/response pipeline. The QuestionAnswerAdvisor does RAG automatically:
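import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor; // spring-ai-advisors-vector-store
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;

// Spring AI 1.0.0 API; milestone releases used new MessageChatMemoryAdvisor(new InMemoryChatMemory())
ChatMemory chatMemory = MessageWindowChatMemory.builder()
        .maxMessages(20) // keep only the most recent messages in the prompt
        .build();        // held in memory unless a ChatMemoryRepository is supplied
ChatClient chatClient = ChatClient.builder(chatModel)
        .defaultAdvisors(
                MessageChatMemoryAdvisor.builder(chatMemory).build(), // conversation memory
                QuestionAnswerAdvisor.builder(vectorStore).build()    // auto-RAG
        )
        .build();
// Now every call automatically retrieves context and maintains history
String response = chatClient.prompt()
        .user("What does our SLA say about uptime?")
        .call()
        .content();
Spring AI vs LangChain4j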
| Feature | Spring AI | LangChain4j |
|---|---|---|
| Spring Boot integration | Native | Manual config |
| Auto-configuration | Yes | No |
| Agent / ReAct loop | Basic | Mature |
| Tool calling | Yes (Bean-based) | Yes (@Tool annotation) |
| RAG | Yes (Advisor) | Yes (pipeline) |
| Memory | In-memory by default; JDBC, Cassandra, and Neo4j repositories (v1.0) | Multiple backends |
| Streaming | Reactor Flux | Completable/Flux |
| Best for | Spring Boot teams | Standalone Java AI apps |
Key Facts
- Spring Boot 3.3+ required for Spring AI 1.0.0
- Auto-configures ChatClient.Builder and EmbeddingModel beans from application.yml
- QuestionAnswerAdvisor handles full RAG retrieval automatically per request
- MessageChatMemoryAdvisor maintains conversation history (in memory by default; 1.0 also ships JDBC, Cassandra, and Neo4j chat memory repositories)
- pgvector config: set dimensions to match your embedding model output (1536 for text-embedding-3-small; text-embedding-3-large returns 3072 by default)
- Function calling: define the tool as a Spring @Bean and enable it by name with .toolNames("name") (.functions("name") in pre-1.0 milestones)
- Spring AI streams via Reactor Flux<String>, which integrates naturally with Spring WebFlux
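As an illustration of the dimensions rule in the list above, the sketch below computes cosine distance (1 minus cosine similarity, the quantity a COSINE_DISTANCE search ranks by) and rejects vectors of different lengths. CosineDistanceSketch is a hypothetical helper written for this note, not part of Spring AI or pgvector.

```java
/** Illustrative only: cosine distance with an explicit dimension check. */
final class CosineDistanceSketch {

    /** Returns 1 - cos(a, b); 0 means same direction, 1 means orthogonal. */
    static double cosineDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            // The same mismatch a vector column sized for 1536 hits with a 3072-dimension embedding
            throw new IllegalArgumentException(
                    "Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine distance is undefined for a zero vector");
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Passing a 1536-element array and a 3072-element array throws immediately, which mirrors why the configured dimensions value has to agree with what the embedding model actually returns.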
Common Failure Cases
ChatClient.Builder is autowired but the bean is absent because no provider starter (for example spring-ai-starter-model-anthropic, named spring-ai-anthropic-spring-boot-starter before 1.0.0-M7) is on the classpath
Why: Spring AI auto-configuration only activates when a provider-specific starter is on the classpath; depending only on the core Spring AI modules creates no ChatModel bean, so injecting ChatClient.Builder fails with NoSuchBeanDefinitionException at startup.
Detect: the application fails at startup with No qualifying bean of type 'org.springframework.ai.chat.client.ChatClient$Builder' available; spring.ai.anthropic.api-key is set correctly but has no effect.
Fix: add the Anthropic starter to dependencies (not just the core modules); verify with mvn dependency:tree that the starter artifact is present.
vectorStore.add(docs) silently succeeds but no vectors are stored because the pgvector extension is not installed on the database
Why: the vector_store table's embedding column is declared with the vector type, which exists only once the vector PostgreSQL extension is installed. Spring AI creates the extension and the table itself only when spring.ai.vectorstore.pgvector.initialize-schema is true; otherwise a hand-written or migrated schema can leave the table present without a usable vector column.
Detect: vectorStore.similaritySearch("query") returns an empty list even after ingestion; inspecting the database shows the vector_store table exists but the embedding column is missing or has text type instead of vector.
Fix: run CREATE EXTENSION IF NOT EXISTS vector; on the database before starting the application; add this to your migration scripts or database setup documentation.
QuestionAnswerAdvisor injects retrieved context that exceeds the model's context window, causing a 400 error on long conversations
Why: the advisor retrieves topK documents per request and adds them to the prompt; combined with MessageChatMemoryAdvisor maintaining a growing conversation history, the total token count can exceed the model's input limit in longer sessions.
Detect: requests fail with a 400 response from the Anthropic API indicating that the prompt is too long; the failure starts appearing after several conversation turns, while the same prompt works in a fresh conversation.
Fix: set a lower topK value (3 instead of 5); lower maxMessages on MessageWindowChatMemory to limit history size; or implement a summarisation step that compresses old history when the window fills.
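The history-limiting part of that fix can be sketched as a bounded window that evicts the oldest message once a maximum is reached. This is a plain-Java illustration of the idea, not Spring AI's MessageWindowChatMemory: WindowedHistorySketch is a hypothetical name, and messages are plain strings rather than typed chat messages.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Illustrative only: keeps the most recent maxMessages entries of a conversation. */
final class WindowedHistorySketch {

    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowedHistorySketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and evicts the oldest ones beyond the window size. */
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    /** Returns the retained messages, oldest first, as an immutable copy. */
    List<String> snapshot() {
        return List.copyOf(messages);
    }
}
```

With a window of two, adding three messages leaves only the last two, so the prompt size contributed by history stays bounded no matter how long the session runs. A summarisation step would replace the evicted messages with a short digest instead of dropping them.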
Function calling fails with IllegalArgumentException because the registered tool name and the name passed to .toolNames("name") (.functions("name") in pre-1.0 milestones) do not match
Why: Spring AI looks tools up by name: the bean name for plain function beans, or the name given to the callback builder. If the name is inferred from the method name (camelCase) but the chat call uses a different string (e.g., "get_current_weather" vs "getCurrentWeather"), the lookup fails and the model either never sees the tool or the SDK throws.
Detect: the model does not call the function even on prompts that clearly require it; enabling Spring AI debug logging shows the function is not registered under the expected name.
Fix: explicitly name the bean with @Bean("getCurrentWeather"), or set the name on the callback builder, and make sure the string used when enabling the tool is exactly the same.
Connections
- java/langchain4j — alternative with richer agent patterns and MCP support; comparison in this file
- infra/vector-stores — pgvector, Weaviate, Pinecone all have Spring AI starters
- rag/pipeline — RAG concepts Spring AI implements via Advisors
- apis/anthropic-api — Anthropic API underlying the Spring AI Anthropic client
Open Questions
- Which persistent chat memory backends will Spring AI add beyond the JDBC, Cassandra, and Neo4j repositories in 1.0 (for example Redis)?
- How does Spring AI's Advisor model compare to LangChain4j's pipeline for complex multi-step RAG?
- What is the migration path from Spring AI 0.x to 1.0 for existing production applications?