
Gemini Live Agent Challenge: Announcing the winners and highlights

The Gemini Live Agent Challenge is officially in the books! We challenged developers worldwide to break out of the traditional 'text box' paradigm by building next-generation AI agents. Since our initial announcement, the challenge amassed 11,878 participants and 1,536 submitted projects from 151 countries, and the results were nothing short of spectacular.

The mission was to seamlessly integrate multimodal capabilities — building agents that can see, hear, speak, and create in real time — using the Gemini Live API, the Agent Development Kit (ADK), and the robust infrastructure of Google Cloud. Participants pushed the boundaries of interactive AI across three distinct categories: The Live Agent, The Creative Storyteller, and The UI Navigator.

Congratulations to the builders who took home the top prizes! These winning teams combined technical precision with bold imagination, completely redefining how users can interact with and experience agents. Two of these standout developers were even recognized in person at Google Cloud Next 2026. Here’s a look at their experience, alongside the complete list of winning agents.

Celebrating our category winners at Google Cloud Next ‘26

Category winners Jeremiah Somoine and Bryen Param were invited to attend Google Cloud Next 2026 in Las Vegas, where they shared their experiences and insights with the broader developer community. Both winners presented Lightning Talks at the Developer Theatre on the expo floor and sat down for exclusive interviews in the Creator Studio Pod at the GDE and Certified Lounge. 

During his time at the event, Bryen discussed the core inspiration behind drone-copilot. He explained that his project was driven by the question of “what if a model could interact with the real world?”, showcasing how multimodal capabilities can bridge the gap between AI and physical environments.

Jeremiah, currently a college student, reflected on the development process behind Sankofa, noting that “the best response to a technical limitation was a creative one.” When asked what advice he would give to other students looking to build the next generation of AI applications, he emphasized the importance of jumping at any opportunity to get hands-on with the technology. “The best way to learn is by doing,” he said, encouraging aspiring developers to simply dive in and start building.

Winners

Grand Prize winner: ORION – Operating Room Intelligent Orchestration Node
By: Aditya Shukla

ORION, or Operating Room Intelligent Orchestration Node, is a voice-directed surgical co-pilot for robotic surgery. Surgeons can speak naturally and instantly receive answers, live data on display, and real-time visual assistance — all without breaking scrub.


The Live Agent winner: drone-copilot
By: Bryen Param

Drone-copilot transforms how users interact with hardware by enabling natural, real-time conversations with a drone instead of using a joystick or complex menus. Simply by speaking, users can instruct the drone to navigate, perform autonomous visual inspections, or describe its surroundings, while the drone verbally responds and confirms its actions in real time.


Creative Storyteller winner: Sankofa
By: Jeremiah Somoine

Sankofa acts as a multimodal AI “griot”—a traditional West African storyteller—transforming fragmented family histories into deeply immersive narratives. Based on just a few user details, it weaves together rich voice narration, watercolor imagery, and ambient soundscapes into a historical story, allowing users to engage in a real-time voice conversation with the storyteller to explore their roots further.


UI Navigator winner: Moonwalk
By: Enaiho Uwas Paul and Aman Kumar Sah

Moonwalk is a conversational, hands-free desktop assistant that helps users intuitively navigate their computer and complete complex tasks using just their voice. By remembering personal preferences and past interactions, it acts as an intelligent co-pilot that can seamlessly control your mouse and keyboard to execute everyday workflows—like booking flights or managing spreadsheets—while you simply sit back and speak.


Best multimodal integration and user experience winner: Wand
By: David Li

Wand is a voice-first, pointer-aware browser assistant that helps you seamlessly navigate and interact with any website using a combination of natural speech and hand gestures. By simply pointing at your screen and speaking — like asking to “play this video” or “zoom in here”—this live agent helps you instantly execute clicks, searches, and commands without ever needing to touch a mouse or keyboard.


Best technical execution and agent architecture winner: JohnKeats.AI
By: Matthew Keats

JohnKeats.AI is a voice-first emotional companion designed to actively listen and hold space for users without rushing to offer solutions. By processing subtle vocal cues like pitch, pacing, and tone, it reacts naturally to a user’s emotional state in real time to provide a deeply reflective and empathetic conversational experience.


Best innovation and thought leadership winner: Rayan Memory
By: Yusuf Elnady

Rayan Memory tackles the universal problem of forgetting by turning your daily learnings into a fully explorable 3D “memory palace.” A background agent passively listens to your real-world audio to extract important ideas as physical artifacts, allowing you to walk through themed virtual rooms and converse with a dedicated AI companion to easily retrieve your exact memories.


Honorable mention: NagarDrishti
By: Nikita Dongre and Omkar Dongre

NagarDrishti tackles dangerous road conditions by allowing citizens to safely report potholes and waterlogging using a hands-free voice assistant while driving. These real-time reports instantly populate an interactive dashboard, where city officials can use natural language to easily identify hazard hotspots and manage critical repairs.


Honorable mention: Ekaette
By: Bassey John

Ekaette revolutionizes customer service by replacing frustrating hold queues with a conversational, multimodal AI assistant that operates across live phone calls and text messaging. Customers can speak naturally with the agent over a standard phone line while seamlessly sharing photos, reviewing product options, or completing payments via WhatsApp.


Honorable mention: VibeCat
By: Sejun Kim and Michael Chang

VibeCat is a proactive macOS desktop companion that continuously watches your screen, understands your context, and suggests helpful actions before you even ask. Instead of waiting for a command, it speaks up first — like offering to fix a missing line of code or execute a terminal command — and completes the task only after receiving your permission.


Honorable mention: Call My Parts
By: Sugam Palav, Nikhil Lohar, Siddhant Panday, and Vishal Parekh

Call My Parts automates the tedious, time-consuming process of sourcing used vehicle parts by doing the research and vendor outreach for you. Users simply speak their part request, and the AI agent autonomously searches vendor websites, calls suppliers to check pricing and inventory, and compiles the best options into a ranked, easy-to-read dashboard.


Honorable mention: Relay
By: Faith Ogundimu

Relay is an interactive AI lab partner that uses your webcam to watch and guide your physical electronics projects in real time. It provides step-by-step voice instructions to help you build circuits, catches wiring mistakes before they happen, and reinforces your skills with a built-in 3D simulation sandbox and adaptive quizzes.

Keep the momentum going

Inspired by these incredible projects? Start building and stay connected with the community through our latest programs and events:

  • Join Gemini Enterprise Agent Ready (GEAR), designed to help developers and decision-makers build and deploy production-ready AI agents.

  • Catch up on Google Cloud Next 2026: We just wrapped up an amazing Google Cloud Next! If you weren’t able to join us in person — or simply want to relive the energy — take a look at our social and livestream recaps to catch up on some of the exciting developer activations straight from the expo floor.

  • Tune in on Tuesdays: Want to be the first to hear about new tools, product updates, and upcoming hackathons? Join us for our weekly livestream every Tuesday at 9:00 A.M. PDT / 12:00 P.M. EDT for the latest in all things Google Cloud.

Congratulations again to all of our winners and participants. We can’t wait to see what you build next!
