    From Code to Creativity: Building Multimodal AI Apps with Gemini and Imagen | by Hiralkotwani | May, 2025

    By FinanceStarGate | May 15, 2025 | 2 Mins Read


    Earning this badge taught me to bridge code and creativity, a milestone in my AI journey.

    The lab began with analyzing images using Gemini, Google’s multimodal model. Using a simple Python script, I sent an image of scones from Cloud Storage and asked, “What’s shown in this image?” Gemini accurately described the scene, showcasing its ability to process text and visuals together.

    Code Snippet:

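    from google.genai.types import Part  # assumes the google-genai SDK

    # `client` is assumed to be an already-configured genai.Client
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents=["What's in this image?",
                  Part.from_uri(file_uri="gs://…/scones.jpg", mime_type="image/jpeg")],
    )
    print(response.text)  # Output: "A plate of scones with jam and cream…"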

    Key Insight: Gemini’s strength lies in context-aware prompts. For example, adding “Describe this in 5 words” refined outputs for marketing use cases.

    Next, I explored Imagen, Google’s text-to-image model. With a single prompt, I generated hyper-realistic images, like a cricket stadium in Los Angeles. The lab taught me to balance creativity and specificity:

    Example Prompt:

    generate_image(
        prompt="A futuristic cricket ground in LA with palm trees",
        output_file="cricket_la.jpeg"
    )

    Pro Tip: Disabling watermarks (add_watermark=False) and using seed values made outputs consistent for branding projects.

    The lab also covered building chat applications. Using streaming, I created a chatbot that answers questions about rainbows in real time:
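The pairing in the tip above is worth guarding in code. As far as I understand the Vertex AI documentation, a seed is only honored when the watermark is turned off. The helper below is a minimal sketch of that check; `imagen_options` is a hypothetical name, not part of the lab or any SDK, and it only builds a plain dictionary of the two settings.

```python
from typing import Any, Dict, Optional


def imagen_options(seed: Optional[int] = None,
                   add_watermark: bool = True) -> Dict[str, Any]:
    """Collect image options, rejecting a seed combined with a watermark.

    Hypothetical helper: assumes seeds only apply when add_watermark is False.
    """
    if seed is not None and add_watermark:
        raise ValueError("seed requires add_watermark=False")
    options: Dict[str, Any] = {"add_watermark": add_watermark}
    if seed is not None:
        options["seed"] = seed
    return options


# Reproducible settings for a branding project
branding = imagen_options(seed=42, add_watermark=False)
print(branding)
```

Failing early with a `ValueError` keeps a forgotten `add_watermark=False` from silently producing a different image on every run.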

    for chunk in chat.send_message_stream("Why are rainbows colorful?"):
        print(chunk.text, end="")  # Prints each chunk as it arrives

    Why It Matters: Streaming reduces perceived latency because users see the first words before the full answer is ready, which makes AI interactions feel natural and suits customer service bots.

    The finale was a multimodal app for a floral design company:

    1. Image Generation: imagen-3.0-generate-002 created bouquets from prompts (“2 sunflowers + 3 roses”).
    2. Image Analysis: Gemini analyzed the bouquet and generated birthday wishes via streaming.
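A chatbot usually needs the complete reply as well as the live output, for example to log it or store it in the conversation history. The sketch below shows one way to do both. `stream_reply` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the chunks a real streaming call would yield; the only assumption about those chunks is that each has a `text` attribute that may be empty.

```python
from types import SimpleNamespace
from typing import Callable, Iterable


def _print_inline(text: str) -> None:
    """Default writer: print without a newline so chunks join up on screen."""
    print(text, end="", flush=True)


def stream_reply(chunks: Iterable,
                 write: Callable[[str], None] = _print_inline) -> str:
    """Write each chunk's text as it arrives and return the full reply."""
    parts = []
    for chunk in chunks:
        text = getattr(chunk, "text", None)
        if not text:  # skip chunks that carry no text
            continue
        write(text)
        parts.append(text)
    return "".join(parts)


# Stand-in for a real stream; one chunk deliberately has no text.
fake_stream = [
    SimpleNamespace(text="Sunlight "),
    SimpleNamespace(text=None),
    SimpleNamespace(text="splits into colors."),
]
reply = stream_reply(fake_stream)
```

With a real chat session, the iterable returned by the streaming call would take the place of `fake_stream`.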

    Code Workflow:

    # Generate bouquet
    generate_bouquet_image("2 sunflowers, 3 roses")

    # Analyze image & stream wishes
    analyze_bouquet_image("bouquet.jpeg", "Write a birthday message based on this bouquet")

    Lesson Learned: Combining Gemini and Imagen unlocks end-to-end solutions. Imagine apps that design products and write their descriptions automatically!
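The chaining itself can be sketched independently of either model. In the example below, `bouquet_card` is a hypothetical wrapper, and the two callables are stubs standing in for the lab's `generate_bouquet_image` and `analyze_bouquet_image`. It assumes the generator returns the path of the saved image and the analyzer yields text chunks, neither of which the lab code above confirms.

```python
from typing import Callable, Iterable, Tuple


def bouquet_card(
    prompt: str,
    generate_image: Callable[[str], str],
    analyze_image: Callable[[str, str], Iterable[str]],
    instruction: str = "Write a birthday message based on this bouquet",
) -> Tuple[str, str]:
    """Generate an image, then feed its path to the analysis step."""
    image_path = generate_image(prompt)          # step 1: text -> image
    chunks = analyze_image(image_path, instruction)  # step 2: image -> text
    return image_path, "".join(chunks)


# Stubs so the sketch runs without any API access.
def fake_generate(prompt: str) -> str:
    return "bouquet.jpeg"


def fake_analyze(path: str, instruction: str) -> Iterable[str]:
    yield "Happy birthday! "
    yield "Enjoy the sunflowers and roses."


path, message = bouquet_card("2 sunflowers, 3 roses", fake_generate, fake_analyze)
print(path, "->", message)
```

Passing the two steps in as arguments keeps the pipeline easy to test and lets either model be swapped out later.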

    • Real-World Focus: No toy examples; I built tools businesses actually need.
    • Error Handling: Learned to troubleshoot API issues (e.g., 429 rate-limit errors).
    • Scalability: Vertex AI’s managed infrastructure lets these apps scale to millions of users.

    Generative AI isn’t just for tech giants. With tools like Gemini and Imagen, developers can create AI apps that see, create, and converse. Ready to start your journey? Dive into Google Cloud Skills Boost and experiment with prompts. It’s easier than you think!
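For the 429 errors mentioned above, a common remedy is to retry with exponential backoff and a little random jitter. The sketch below is one way to do that, not the lab's own code. `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on HTTP 429, and `call_with_backoff` is likewise an illustrative name.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's HTTP 429 exception."""


def call_with_backoff(fn: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimitError with exponential backoff."""
    attempt = 0
    while True:
        try:
            return fn()
        except RateLimitError:
            attempt += 1
            if attempt >= max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Wait base, 2*base, 4*base, ... plus jitter to spread retries
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)


# Demo: a call that is rate-limited twice before succeeding.
state = {"calls": 0}


def flaky() -> str:
    state["calls"] += 1
    if state["calls"] <= 2:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


result = call_with_backoff(flaky, base_delay=0.01)
print(result, "after", state["calls"], "calls")
```

The `sleep` parameter is injectable so the retry logic can be tested without actually waiting.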

    🔗 Discover the Labs: https://www.cloudskillsboost.google/course_templates/1076
    🔗 Lab Completion Badge: https://www.cloudskillsboost.google/public_profiles/1eb74403-c67b-40ab-b441-464848d2eb53/badges/15279493

    Let’s build the future, one AI app at a time! 🌟


