    Integrating LLM APIs with Spring Boot: A Practical Guide | by ThamizhElango Natarajan | May, 2025

    By FinanceStarGate · May 15, 2025 · 11 Mins Read


    Large Language Models (LLMs) have revolutionized how we build intelligent applications. Whether you’re creating a customer support chatbot, generating content, or analyzing text data, LLMs offer powerful capabilities that can enhance your Spring Boot applications. This guide will walk you through integrating popular LLM APIs with Spring Boot, offering real-world examples to illustrate each scenario.

    1. Setting Up Your Spring Boot Project
    2. Using Spring AI
    3. Integrating OpenAI’s API
    4. Working with Anthropic’s Claude API
    5. Building a Smart Customer Support System
    6. Creating a Content Generation Service
    7. Implementing Semantic Search
    8. Best Practices for LLM API Integration
    9. Performance Considerations
    10. Conclusion

    Before integrating LLM APIs, let’s set up a basic Spring Boot project. You can use Spring Initializr (https://start.spring.io/) to create a new project with the following dependencies:

    • Spring Web
    • Spring Boot DevTools
    • Lombok (optional, but helpful)

    Once your project is set up, you’ll need to add dependencies for HTTP clients. I recommend using Spring’s WebClient for reactive applications or RestTemplate for synchronous operations.
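    If you want to avoid extra dependencies entirely, the JDK’s built-in HttpClient (Java 11+) can also call these APIs. This is a minimal sketch that only builds the request; the endpoint and API key here are placeholders, not real values:

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class LlmRequestBuilder {

    // Builds a chat-completion style POST request. The endpoint and key
    // are caller-supplied placeholders; no network call is made here.
    public static HttpRequest buildRequest(String endpoint, String apiKey, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(endpoint))
            .timeout(Duration.ofSeconds(30))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + apiKey)
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildRequest(
            "https://api.openai.com/v1/chat/completions",
            "sk-placeholder",
            "{\"model\":\"gpt-4\",\"messages\":[]}");
        // Prints: POST https://api.openai.com/v1/chat/completions
        System.out.println(req.method() + " " + req.uri());
    }
}
```

    The rest of this guide uses WebClient, which integrates more naturally with Spring’s reactive stack.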

    For Maven, add the following to your pom.xml:

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-webflux</artifactId>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
    </dependency>
    For Gradle, add to your build.gradle:

    implementation 'org.springframework.boot:spring-boot-starter-webflux'
    implementation 'com.fasterxml.jackson.core:jackson-databind'

    Spring AI is an official Spring project that provides a unified API for working with various LLM providers. It simplifies integration and makes your code provider-agnostic, allowing you to switch between different AI services without major code changes.

    First, add the Spring AI BOM (Bill of Materials) to your project:

    For Maven:

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>1.0.0</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
    For Gradle:

    dependencyManagement {
    imports {
    mavenBom "org.springframework.ai:spring-ai-bom:1.0.0"
    }
    }

    Based on the latest information available, Spring AI has evolved into a comprehensive framework for AI engineering, with portable APIs across AI providers for chat, text-to-image, audio transcription, text-to-speech, and embedding models, offering both synchronous and streaming options.

    The framework supports structured outputs (mapping AI model outputs to POJOs), a wide selection of vector database providers, and a portable API across those providers that includes a SQL-like metadata filter.

    Additional capabilities include:

    1. Tools/Function Calling — Allows models to request client-side tool execution to access real-time information
    2. Advisors API — Encapsulates common generative AI patterns and transforms data sent to and from LLMs
    3. Observability — Provides insights into AI-related operations
    4. Document Ingestion ETL — Framework for data engineering
    5. AI Model Evaluation — Tools to evaluate generated content and guard against hallucinated responses
    6. RAG Support — Support for Retrieval Augmented Generation, enabling use cases like “Chat with your documentation”

    Spring AI supports multiple LLM providers. Here’s how to set up each:

    OpenAI with Spring AI

    Add the OpenAI dependency:

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>

    Configure in application.properties:

    spring.ai.openai.api-key=your-api-key
    spring.ai.openai.model=gpt-4
    spring.ai.openai.temperature=0.7

    Usage example:

    @RestController
    @RequiredArgsConstructor
    public class OpenAIController {

        private final OpenAiChatClient openAiChatClient;

        @GetMapping("/ai/openai/generate")
        public String generate(@RequestParam String prompt) {
            return openAiChatClient.call(prompt);
        }

        @GetMapping("/ai/openai/chat")
        public String chat(@RequestParam String message) {
            Prompt prompt = new Prompt(new UserMessage(message));
            ChatResponse response = openAiChatClient.call(prompt);
            return response.getResult().getOutput().getContent();
        }
    }

    Anthropic Claude with Spring AI

    Add the Anthropic dependency:

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-anthropic-spring-boot-starter</artifactId>
    </dependency>

    Configure in application.properties:

    spring.ai.anthropic.api-key=your-api-key
    spring.ai.anthropic.model=claude-3-sonnet-20240229
    spring.ai.anthropic.temperature=0.7

    Usage example:

    @RestController
    @RequiredArgsConstructor
    public class AnthropicController {

        private final AnthropicChatClient anthropicChatClient;

        @GetMapping("/ai/anthropic/chat")
        public String chat(@RequestParam String message) {
            Prompt prompt = new Prompt(new UserMessage(message));
            ChatResponse response = anthropicChatClient.call(prompt);
            return response.getResult().getOutput().getContent();
        }
    }

    Vertex AI (Google) with Spring AI

    Add the Vertex AI dependency:

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-vertex-ai-spring-boot-starter</artifactId>
    </dependency>

    Configure in application.properties:

    spring.ai.vertex.ai.project-id=your-gcp-project-id
    spring.ai.vertex.ai.location=us-central1
    spring.ai.vertex.ai.model=gemini-1.0-pro

    Usage example:

    @RestController
    @RequiredArgsConstructor
    public class VertexAIController {

        private final VertexAiChatClient vertexAiChatClient;

        @GetMapping("/ai/vertex/chat")
        public String chat(@RequestParam String message) {
            Prompt prompt = new Prompt(new UserMessage(message));
            ChatResponse response = vertexAiChatClient.call(prompt);
            return response.getResult().getOutput().getContent();
        }
    }

    Azure OpenAI with Spring AI

    Add the Azure OpenAI dependency:

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-azure-openai-spring-boot-starter</artifactId>
    </dependency>

    Configure in application.properties:

    spring.ai.azure.openai.api-key=your-api-key
    spring.ai.azure.openai.endpoint=your-azure-endpoint
    spring.ai.azure.openai.deployment-name=your-deployment-name

    Usage example:

    @RestController
    @RequiredArgsConstructor
    public class AzureOpenAIController {

        private final AzureOpenAiChatClient azureOpenAiChatClient;

        @GetMapping("/ai/azure/chat")
        public String chat(@RequestParam String message) {
            Prompt prompt = new Prompt(new UserMessage(message));
            ChatResponse response = azureOpenAiChatClient.call(prompt);
            return response.getResult().getOutput().getContent();
        }
    }

    Ollama with Spring AI

    For local model deployment:

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>

    Configure in application.properties:

    spring.ai.ollama.base-url=http://localhost:11434
    spring.ai.ollama.model=llama2

    Usage example:

    @RestController
    @RequiredArgsConstructor
    public class OllamaController {

        private final OllamaChatClient ollamaChatClient;

        @GetMapping("/ai/ollama/chat")
        public String chat(@RequestParam String message) {
            Prompt prompt = new Prompt(new UserMessage(message));
            ChatResponse response = ollamaChatClient.call(prompt);
            return response.getResult().getOutput().getContent();
        }
    }

    Spring AI also provides a unified API for generating embeddings:

    @Service
    @RequiredArgsConstructor
    public class EmbeddingService {

        private final OpenAiEmbeddingClient embeddingClient;

        public List<Double> generateEmbedding(String text) {
            EmbeddingResponse response = embeddingClient.embed(text);
            return response.getResult().getOutput();
        }
    }

    Spring AI has excellent support for document processing and RAG (Retrieval-Augmented Generation):

    @Service
    @RequiredArgsConstructor
    public class DocumentProcessingService {

        private final OpenAiChatClient chatClient;

        public String processDocument(String documentContent, String query) {
            // Create a document
            Document document = new Document(documentContent);

            // Split into smaller chunks
            RecursiveCharacterTextSplitter splitter = new RecursiveCharacterTextSplitter();
            List<Document> chunks = splitter.apply(List.of(document));

            // Create a prompt template with document context
            PromptTemplate promptTemplate = new PromptTemplate("""
                Answer the following question based on the provided context:
                Context: {context}

                Question: {question}
                """);

            // Create the RAG prompt
            Prompt prompt = promptTemplate.create(Map.of(
                "context", chunks.stream().map(Document::getContent).collect(Collectors.joining("\n")),
                "question", query
            ));

            // Get the response
            ChatResponse response = chatClient.call(prompt);
            return response.getResult().getOutput().getContent();
        }
    }
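    The RecursiveCharacterTextSplitter above is Spring AI’s utility. To make the chunking idea concrete, here is a naive fixed-size splitter with overlap in plain Java (the sizes are arbitrary illustrations, not Spring AI’s defaults):

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleChunker {

    // Splits text into chunks of at most chunkSize characters; each chunk
    // repeats the last `overlap` characters of the previous one so that
    // context is not lost at chunk boundaries.
    public static List<String> split(String text, int chunkSize, int overlap) {
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break;
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(split("abcdefghij", 4, 2)); // [abcd, cdef, efgh, ghij]
    }
}
```

    Real splitters additionally try to break on sentence or paragraph boundaries rather than at a fixed character count, but the overlap idea is the same.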

    OpenAI provides one of the most popular LLM APIs, with models like GPT-4. Let’s create a service to interact with it.

    First, create a configuration class to manage the API key:

    @Configuration
    @ConfigurationProperties(prefix = "openai")
    @Data
    public class OpenAIConfig {
        private String apiKey;
        private String apiUrl = "https://api.openai.com/v1/chat/completions";
    }

    Next, add the API key to your application.properties or application.yml:

    openai.api-key=your-api-key-here

    Now, let’s create the service:

    @Service
    @RequiredArgsConstructor
    public class OpenAIService {

        private final OpenAIConfig config;
        private final WebClient.Builder webClientBuilder;

        private WebClient webClient;

        @PostConstruct
        public void init() {
            webClient = webClientBuilder
                .baseUrl(config.getApiUrl())
                .defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
                .defaultHeader(HttpHeaders.AUTHORIZATION, "Bearer " + config.getApiKey())
                .build();
        }

        public Mono<String> generateText(String prompt) {
            Map<String, Object> message = Map.of("role", "user", "content", prompt);

            Map<String, Object> requestBody = Map.of(
                "model", "gpt-4",
                "messages", List.of(message),
                "temperature", 0.7,
                "max_tokens", 500
            );

            return webClient.post()
                .bodyValue(requestBody)
                .retrieve()
                .bodyToMono(JsonNode.class)
                .map(response -> response.at("/choices/0/message/content").asText());
        }
    }

    Anthropic’s Claude is another powerful LLM with distinct advantages. Let’s set up a similar service for Claude:

    @Configuration
    @ConfigurationProperties(prefix = "anthropic")
    @Data
    public class AnthropicConfig {
        private String apiKey;
        private String apiUrl = "https://api.anthropic.com/v1/messages";
    }

    Add the API key to your properties:

    anthropic.api-key=your-anthropic-api-key-here

    And create the service:

    @Service
    @RequiredArgsConstructor
    public class AnthropicService {

        private final AnthropicConfig config;
        private final WebClient.Builder webClientBuilder;
        private WebClient webClient;

        @PostConstruct
        public void init() {
            webClient = webClientBuilder
                .baseUrl(config.getApiUrl())
                .defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
                .defaultHeader("x-api-key", config.getApiKey())
                .defaultHeader("anthropic-version", "2023-06-01")
                .build();
        }

        public Mono<String> generateText(String prompt) {
            Map<String, Object> requestBody = Map.of(
                "model", "claude-3-opus-20240229",
                "messages", List.of(Map.of("role", "user", "content", prompt)),
                "max_tokens", 500
            );

            return webClient.post()
                .bodyValue(requestBody)
                .retrieve()
                .bodyToMono(JsonNode.class)
                .map(response -> response.at("/content/0/text").asText());
        }
    }

    Now, let’s create a real-world application: a smart customer support system that can:

    1. Understand customer queries
    2. Retrieve relevant information from a knowledge base
    3. Generate helpful responses

    First, we’ll create a model for our knowledge base articles:

    @Entity
    @Data
    public class KnowledgeArticle {
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        private String title;

        @Column(length = 10000)
        private String content;

        private String category;
    }

    Next, let’s create a repository:

    public interface KnowledgeArticleRepository extends JpaRepository<KnowledgeArticle, Long> {
        List<KnowledgeArticle> findByCategory(String category);
    }

    Now, let’s create a service that will use our LLM to understand customer queries and provide appropriate responses:

    @Service
    @RequiredArgsConstructor
    public class CustomerSupportService {

        private final OpenAIService openAIService;
        private final KnowledgeArticleRepository knowledgeRepository;

        public Mono<String> handleCustomerQuery(String query) {
            // Step 1: Use the LLM to categorize the query
            return openAIService.generateText(
                "Categorize this customer query into one of these categories: " +
                "Billing, Technical, Product, Returns, Other. Query: " + query
            )
            .flatMap(category -> {
                // Step 2: Fetch relevant knowledge articles
                List<KnowledgeArticle> articles = knowledgeRepository.findByCategory(category.trim());
                String relevantInfo = articles.stream()
                    .map(article -> article.getTitle() + ": " + article.getContent())
                    .collect(Collectors.joining("\n\n"));

                // Step 3: Generate a response based on the relevant information
                String promptForResponse =
                    "You are a helpful customer support agent. " +
                    "Use the following information to answer the customer query. " +
                    "If you do not find relevant information, say you will escalate to a human agent.\n\n" +
                    "Information: " + relevantInfo + "\n\n" +
                    "Customer query: " + query;

                return openAIService.generateText(promptForResponse);
            });
        }
    }

    Finally, let’s create a controller to expose this functionality:

    @RestController
    @RequestMapping("/api/support")
    @RequiredArgsConstructor
    public class CustomerSupportController {

        private final CustomerSupportService supportService;

        @PostMapping("/query")
        public Mono<ResponseEntity<Map<String, String>>> handleQuery(
                @RequestBody Map<String, String> request) {

            String query = request.get("query");
            return supportService.handleCustomerQuery(query)
                .map(response -> ResponseEntity.ok(Map.of("response", response)))
                .defaultIfEmpty(ResponseEntity.badRequest().build());
        }
    }

    Another common use case is content generation. Let’s build a service that helps generate blog post outlines and drafts:

    @Service
    @RequiredArgsConstructor
    public class ContentGenerationService {

        private final AnthropicService claudeService;

        public Mono<ContentResponse> generateBlogOutline(String topic, String targetAudience, int sections) {
            String prompt = String.format(
                "Generate an outline for a blog post about '%s' for a %s audience. " +
                "Include %d main sections with brief descriptions for each section.",
                topic, targetAudience, sections
            );

            return claudeService.generateText(prompt)
                .map(outline -> new ContentResponse("outline", outline));
        }

        public Mono<ContentResponse> expandSection(String outline, String sectionTitle) {
            String prompt = String.format(
                "Based on this blog outline: \n\n%s\n\n" +
                "Write a detailed draft for the section titled '%s'. " +
                "Include examples, data points if applicable, and maintain a conversational tone.",
                outline, sectionTitle
            );

            return claudeService.generateText(prompt)
                .map(sectionContent -> new ContentResponse("section_content", sectionContent));
        }

        @Data
        @AllArgsConstructor
        public static class ContentResponse {
            private String type;
            private String content;
        }
    }

    And the corresponding controller:

    @RestController
    @RequestMapping("/api/content")
    @RequiredArgsConstructor
    public class ContentGenerationController {

        private final ContentGenerationService contentService;

        @PostMapping("/outline")
        public Mono<ResponseEntity<ContentGenerationService.ContentResponse>> generateOutline(
                @RequestBody Map<String, Object> request) {

            String topic = (String) request.get("topic");
            String audience = (String) request.get("audience");
            int sections = (int) request.get("sections");

            return contentService.generateBlogOutline(topic, audience, sections)
                .map(ResponseEntity::ok)
                .defaultIfEmpty(ResponseEntity.badRequest().build());
        }

        @PostMapping("/expand-section")
        public Mono<ResponseEntity<ContentGenerationService.ContentResponse>> expandSection(
                @RequestBody Map<String, String> request) {

            String outline = request.get("outline");
            String sectionTitle = request.get("sectionTitle");

            return contentService.expandSection(outline, sectionTitle)
                .map(ResponseEntity::ok)
                .defaultIfEmpty(ResponseEntity.badRequest().build());
        }
    }

    Let’s create a semantic search feature that uses LLM embeddings to find similar documents. We’ll use OpenAI’s embeddings API:

    @Service
    @RequiredArgsConstructor
    public class EmbeddingService {

        private final OpenAIConfig config;
        private final WebClient.Builder webClientBuilder;
        private WebClient webClient;

        @PostConstruct
        public void init() {
            webClient = webClientBuilder
                .baseUrl("https://api.openai.com/v1/embeddings")
                .defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
                .defaultHeader(HttpHeaders.AUTHORIZATION, "Bearer " + config.getApiKey())
                .build();
        }

        public Mono<float[]> getEmbedding(String text) {
            Map<String, Object> requestBody = Map.of(
                "model", "text-embedding-3-small",
                "input", text
            );

            return webClient.post()
                .bodyValue(requestBody)
                .retrieve()
                .bodyToMono(JsonNode.class)
                .map(response -> {
                    JsonNode data = response.get("data").get(0).get("embedding");
                    float[] embedding = new float[data.size()];
                    for (int i = 0; i < data.size(); i++) {
                        embedding[i] = (float) data.get(i).asDouble();
                    }
                    return embedding;
                });
        }
    }

    Now, let’s create a document model with embedded vectors:

    @Entity
    @Data
    public class Document {
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        private String title;

        @Column(length = 10000)
        private String content;

        @ElementCollection
        @CollectionTable(name = "document_embedding", joinColumns = @JoinColumn(name = "document_id"))
        @OrderColumn(name = "position")
        private List<Float> embedding;
    }

    And a service to handle semantic search:

    @Service
    @RequiredArgsConstructor
    public class SemanticSearchService {

        private final EmbeddingService embeddingService;
        private final DocumentRepository documentRepository;

        @Transactional
        public Mono<Document> addDocument(String title, String content) {
            return embeddingService.getEmbedding(content)
                .map(embedding -> {
                    Document document = new Document();
                    document.setTitle(title);
                    document.setContent(content);
                    document.setEmbedding(convertToList(embedding));
                    return documentRepository.save(document);
                });
        }

        public Mono<List<Document>> search(String query, int limit) {
            return embeddingService.getEmbedding(query)
                .map(queryEmbedding -> {
                    List<Document> allDocs = documentRepository.findAll();

                    // Calculate the cosine similarity between the query and each document
                    List<Pair<Document, Double>> docsWithSimilarity = allDocs.stream()
                        .map(doc -> Pair.of(doc, cosineSimilarity(queryEmbedding, doc.getEmbedding())))
                        .sorted(Comparator.comparing(Pair::getRight, Comparator.reverseOrder()))
                        .collect(Collectors.toList());

                    // Return the top N documents
                    return docsWithSimilarity.stream()
                        .limit(limit)
                        .map(Pair::getLeft)
                        .collect(Collectors.toList());
                });
        }

        private List<Float> convertToList(float[] array) {
            List<Float> list = new ArrayList<>(array.length);
            for (float f : array) {
                list.add(f);
            }
            return list;
        }

        private double cosineSimilarity(float[] v1, List<Float> v2List) {
            float[] v2 = new float[v2List.size()];
            for (int i = 0; i < v2.length; i++) {
                v2[i] = v2List.get(i);
            }

            double dot = 0.0;
            double norm1 = 0.0;
            double norm2 = 0.0;

            for (int i = 0; i < v1.length; i++) {
                dot += v1[i] * v2[i];
                norm1 += Math.pow(v1[i], 2);
                norm2 += Math.pow(v2[i], 2);
            }

            return dot / (Math.sqrt(norm1) * Math.sqrt(norm2));
        }
    }
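    The cosine similarity formula used above is easy to sanity-check in isolation: identical vectors score 1.0 and orthogonal vectors score 0.0.

```java
public class CosineSimilarity {

    // Same formula as in SemanticSearchService:
    // dot(v1, v2) / (|v1| * |v2|)
    public static double cosine(float[] v1, float[] v2) {
        double dot = 0.0, norm1 = 0.0, norm2 = 0.0;
        for (int i = 0; i < v1.length; i++) {
            dot += v1[i] * v2[i];
            norm1 += v1[i] * v1[i];
            norm2 += v2[i] * v2[i];
        }
        return dot / (Math.sqrt(norm1) * Math.sqrt(norm2));
    }

    public static void main(String[] args) {
        System.out.println(cosine(new float[]{1, 0}, new float[]{1, 0})); // 1.0
        System.out.println(cosine(new float[]{1, 0}, new float[]{0, 1})); // 0.0
    }
}
```

    Note that the service above loads every document and scores it in memory; for anything beyond small datasets, a vector database (as supported by Spring AI) performs this search far more efficiently.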

    1. Implement Caching: LLM API calls can be expensive and slow. Use Spring’s caching mechanisms to cache common queries:

    @EnableCaching
    @Configuration
    public class CacheConfig {
        @Bean
        public CacheManager cacheManager() {
            return new ConcurrentMapCacheManager("llmResponses");
        }
    }

    // In your service
    @Cacheable(value = "llmResponses", key = "#prompt.hashCode()")
    public Mono<String> generateText(String prompt) {
        // API call logic
    }

    2. Implement Rate Limiting: Prevent exceeding API rate limits with resilience4j:

    @Bean
    public RateLimiterRegistry rateLimiterRegistry() {
        RateLimiterConfig config = RateLimiterConfig.custom()
            .limitRefreshPeriod(Duration.ofMinutes(1))
            .limitForPeriod(60) // 60 requests per minute
            .timeoutDuration(Duration.ofSeconds(5))
            .build();

        return RateLimiterRegistry.of(config);
    }

    // In your service
    private RateLimiter rateLimiter;

    @PostConstruct
    public void init() {
        // Other initialization code
        this.rateLimiter = rateLimiterRegistry.rateLimiter("openai");
    }

    public Mono<String> generateText(String prompt) {
        return Mono.fromSupplier(() -> {
            try {
                return rateLimiter.executeCallable(() -> {
                    // Make the API call
                });
            } catch (Exception e) {
                throw new RuntimeException("Rate limit exceeded", e);
            }
        });
    }

    3. Implement Circuit Breakers: Protect your application from API outages:

    @Bean
    public CircuitBreakerRegistry circuitBreakerRegistry() {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
            .failureRateThreshold(50)
            .waitDurationInOpenState(Duration.ofMinutes(1))
            .permittedNumberOfCallsInHalfOpenState(5)
            .slidingWindowSize(10)
            .build();

        return CircuitBreakerRegistry.of(config);
    }

    // In your service
    private CircuitBreaker circuitBreaker;

    @PostConstruct
    public void init() {
        // Other initialization code
        this.circuitBreaker = circuitBreakerRegistry.circuitBreaker("openai");
    }

    public Mono<String> generateText(String prompt) {
        return Mono.fromSupplier(() -> {
            try {
                return circuitBreaker.executeCallable(() -> {
                    // Make the API call
                });
            } catch (Exception e) {
                throw new RuntimeException("Service unavailable", e);
            }
        });
    }

    1. Asynchronous Processing: Use Spring’s async capabilities for non-blocking operations:

    @Configuration
    @EnableAsync
    public class AsyncConfig {
        @Bean
        public Executor taskExecutor() {
            ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
            executor.setCorePoolSize(5);
            executor.setMaxPoolSize(10);
            executor.setQueueCapacity(25);
            executor.setThreadNamePrefix("LLMExecutor-");
            executor.initialize();
            return executor;
        }
    }

    // In your service
    @Async
    public CompletableFuture<String> generateTextAsync(String prompt) {
        return CompletableFuture.supplyAsync(() -> {
            // Make the API call
        });
    }

    2. Batch Processing: For processing large volumes of data, use batch processing:

    @Service
    @RequiredArgsConstructor
    public class BatchProcessingService {

        private final OpenAIService openAIService;

        public Flux<String> processBatch(List<String> inputs, int batchSize) {
            return Flux.fromIterable(inputs)
                .buffer(batchSize)
                .flatMap(batch -> {
                    List<Mono<String>> responses = batch.stream()
                        .map(openAIService::generateText)
                        .collect(Collectors.toList());

                    return Flux.merge(responses);
                });
        }
    }
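    The Flux.buffer call above groups the input stream into fixed-size lists before dispatching requests. The same partitioning in plain Java looks like this:

```java
import java.util.ArrayList;
import java.util.List;

public class Batcher {

    // Partitions the inputs into consecutive sublists of at most batchSize
    // elements, mirroring what Flux.buffer(batchSize) emits.
    public static <T> List<List<T>> partition(List<T> inputs, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i += batchSize) {
            batches.add(inputs.subList(i, Math.min(i + batchSize, inputs.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(partition(List.of("a", "b", "c", "d", "e"), 2));
        // [[a, b], [c, d], [e]]
    }
}
```

    Batching this way keeps the number of concurrent in-flight API calls bounded, which pairs well with the rate limiter shown earlier.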

    3. Streaming Responses: For long outputs, use streaming to improve the user experience:

    @RestController
    @RequestMapping("/api/stream")
    @RequiredArgsConstructor
    public class StreamingController {

        private final OpenAiChatClient openAiChatClient;

        @GetMapping(value = "/generate", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
        public Flux<ServerSentEvent<String>> streamGeneration(@RequestParam String prompt) {
            Prompt chatPrompt = new Prompt(new UserMessage(prompt));

            return openAiChatClient.stream(chatPrompt)
                .map(response -> ServerSentEvent.<String>builder()
                    .data(response.getResult().getOutput().getContent())
                    .build()
                );
        }
    }

    Function calling allows the LLM to request execution of client-side functions when needed:

    @Service
    @RequiredArgsConstructor
    public class WeatherFunctionService {

        // Define the function schema
        @Bean
        public FunctionCallback weatherFunction() {
            return FunctionCallback.builder()
                .name("get_weather")
                .description("Get the current weather for a location")
                .parameter("location", String.class, "The city and state, e.g., San Francisco, CA")
                .parameter("unit", String.class, "The temperature unit, e.g., celsius or fahrenheit")
                .returnType(Map.class)
                .function(params -> {
                    String location = (String) params.get("location");
                    String unit = (String) params.get("unit");

                    // Here would be a real API call to a weather service
                    Map<String, Object> weatherData = Map.of(
                        "location", location,
                        "temperature", unit.equals("celsius") ? 22 : 72,
                        "unit", unit,
                        "condition", "Sunny"
                    );

                    return weatherData;
                })
                .build();
        }

        private final ChatClient chatClient;

        public String askAboutWeather(String userQuery) {
            Prompt prompt = new Prompt(
                new UserMessage(userQuery)
            );

            ChatResponse response = chatClient.call(
                prompt,
                ChatOptions.builder()
                    .withFunction(weatherFunction())
                    .build()
            );

            return response.getResult().getOutput().getContent();
        }
    }

    Implementing RAG allows your application to provide more accurate and contextually relevant responses:

    @Service
    @RequiredArgsConstructor
    public class RAGService {

        private final ChatClient chatClient;
        private final VectorStore vectorStore;
        private final EmbeddingClient embeddingClient;

        // Method to ingest documents
        public void ingestDocument(String content, Map<String, Object> metadata) {
            // Split the document into chunks
            TextSplitter splitter = new RecursiveCharacterTextSplitter(500, 100);
            List<Document> documents = splitter.apply(List.of(new Document(content, metadata)));

            // Store the documents in the vector store
            vectorStore.add(documents);
        }

        // RAG query implementation
        public String query(String userQuery) {
            // Retrieve relevant documents
            List<Document> relevantDocs = vectorStore.similaritySearch(userQuery);

            // Format the context from the documents
            String context = relevantDocs.stream()
                .map(Document::getContent)
                .collect(Collectors.joining("\n\n"));

            // Create a prompt with the retrieved context
            PromptTemplate promptTemplate = new PromptTemplate("""
                Answer the question based solely on the following context:

                Context:
                {context}

                Question: {question}
                Answer:
                """);

            Map<String, Object> variables = new HashMap<>();
            variables.put("context", context);
            variables.put("question", userQuery);

            Prompt prompt = promptTemplate.create(variables);

            // Get the response from the LLM
            ChatResponse response = chatClient.call(prompt);
            return response.getResult().getOutput().getContent();
        }
    }

    Integrating LLM APIs with Spring Boot opens up a world of possibilities for building intelligent applications. We’ve covered how to:

    1. Set up Spring Boot for API integration
    2. Use Spring AI for a standardized approach across different LLM providers
    3. Work with various AI model providers (OpenAI, Anthropic, Azure OpenAI, Vertex AI, and Ollama)
    4. Implement real-world applications:
    • Smart customer support
    • Content generation
    • Semantic search
    • Multimodal content generation (text-to-image, audio processing)
    • Function calling for real-time data access
    • Retrieval Augmented Generation (RAG)

    5. Apply best practices for performance and reliability

    Spring AI offers a powerful abstraction layer that lets you build AI-enabled applications without getting locked into a single provider. This flexibility means you can:

    1. Choose the best model for each specific task
    2. Easily switch providers if pricing or capabilities change
    3. Use a consistent programming model across your entire application

    By following these patterns, you can harness the power of LLMs while leveraging Spring Boot’s robust framework for building production-ready applications.

    Remember to consider the ethical implications of using AI in your applications, including bias mitigation, transparency with users about AI-generated content, and appropriate content filtering.


