Google I/O 2024: A Deep Dive into Top 8 Latest AI Innovation

“ AI, AI and AI,” In this year’s Google’s annual developer conference, Google I/O, Google couldn't stop talking about their new AI models, rebranded as Gemini. In fact, they said “AI” more than 120 times. So, Gemini AI is set to sneak in everything from Gmail to Google Docs.

google AI

Among the roughly 100 announcements, we've sorted out the top 8 launches to bring you only the highlights.

ASK PHOTOS

Google is launching "Ask Photos," a new feature in Google Photos powered by the Gemini AI models. Users will be able to search their photo libraries just by asking the question. For example, as demonstrated by Google CEO Sundar Pichai at the event, one can ask for the license plate number, and voila! It will show the find the photo. The feature, coming this summer, aims to make it easier to find specific memories or information in your photo collection. So no more scrolling through thousands of photos, to get that specific one. Saves time right?

ASK PHOTOS

Project Astra

Google's Project Astra is an advanced and Multimodal AI assistant, which Google hopes will become an universal assistant, that does everything from understanding what it sees through your device's camera to remembering where you left things and helping you with tasks. It provides real-time assistance and answers questions based on visual inputs.

PROJECT ASTRA

Here's a full recap of our news and updates from #GoogleIO — in under 10 minutes 🎉 pic.twitter.com/O2B8QPsNTg— Google (@Google) May 15, 2024

GEMS

Google GEMS, a feature for its Gemini AI, enabling users to craft personalised chatbots with distinct personalities and capabilities. Similar to OpenAI’s GPT Store, Gems empowers subscribers to create custom assistants tailored to their preferences and needs, with just one click. Whether you seek an enthusiastic gym companion, a culinary sous-chef, or a coding collaborator, Gems allows for easy generation of specialized chatbots by specifying desired traits and tasks. This feature is available to Gemini Advanced subscribers soon.

VEO

Google has launched Veo, a powerful AI model for generating 1080p videos from text prompts, aiming to compete with OpenAI's Sora. Veo offers various cinematic styles and editing capabilities, currently available to select creators through VideoFX. You can get like aerial shots or timelapses, and also you can tweaked with more prompts. Veo understands camera movements, VFX, and basic physics, making its videos more realistic. It can edit specific areas, create videos from images, and generate longer videos based on a story sequence. It's poised to integrate into platforms like YouTube Shorts in the future. Join the waitlist to try it out.

VEO

Imagen3

Google has introduced Imagen 3, its latest image generation model, which Google says is their best model yet. Imagen 3 boasts improved natural language understanding, allowing it to produce highly detailed and realistic images with minimal visual artefacts, no matter how long the prompt is. It also excels at rendering text, making it a valuable tool for various applications. While currently in private preview, it will soon be accessible through Vertex AI, with a waitlist available for the public.

Imagen3

Gemini 1.5 enters Workspace

Google is launching Gemini 1.5 Pro, its latest language model, into various Workspace apps such as Gmail, Docs, and Drive. This upgrade brings advanced capabilities like email summarization, contextual smart replies, and assistance with tasks like data analysis in Sheets. These features aim to enhance productivity and streamline workflows for paid subscribers, gradually rolling out to users over the coming months. Additionally, Gmail for mobile is receiving three new features: summarization, Gmail Q&A, and Contextual Smart Reply, providing users with more efficient communication tools. With these updates, Google Workspace users can expect a more integrated and intelligent experience across their daily tasks and communications.

AI in Workspace

GEMINI NANO

Google announced at Google I/O that it's integrating Gemini Nano, its smallest AI model, into Chrome desktop (starting with Chrome 126). This will enable on-device AI features like text generation for social media posts and product reviews directly within the browser. Gemini Nano with multimodality will enhance TalkBack on Android, providing detailed descriptions of images, like dresses, based on colours, sleeve length, etc.

GEMINI NANO

Visual Search Tool- Google Lens

Google's visual search tool Lens now has a new feature that lets users search by recording a video. Lens used to only support still photos, but it now allows you to ask questions using both audio and video. This update attempts to make chores like diagnosing automotive issues or finding out more details about a product easier. With this update, users can now record videos and ask questions to find web-based answers, taking Google Lens's capabilities beyond image-based searches.

google lens

And that’s a wrap on Google's latest AI launches. Google in technology and AI, now has taken a major leap forward giving tough competition to its rivals. Let’s wait for more ‘AI’ based announcements and updates from other tech giants.

_{Photos: Multiple Sources}