If 2023 was the year that catapulted AI into the mainstream, 2024 is going to be the year that puts (Google’s) AI in everyone’s hands, homes and heads.
Google CEO Sundar Pichai highlighted that, as of today, all of Google’s products with two billion users use Gemini. This is just the start, as Pichai said:
We’re still in the beginning of our Gemini era.
AI Overviews coming now
Google kicked off the I/O 2024 event with a major announcement: its Search Generative Experience (SGE), previously a Search Labs feature and now called AI Overviews, will roll out to US users within the week.
AI Overviews will automatically answer specific searches in the US, offering concise explanations at the top of search results pages before the traditional list of links. Over the next few days, hundreds of millions of users in the US will see AI Overviews, with plans to expand to over a billion users worldwide by the end of the year.
Soon, you’ll be able to adjust your AI Overview with options to simplify the language or break it down in more detail. This can be particularly useful if you’re new to a topic, or if you’re trying to simplify something to satisfy your kid’s curiosity.
For example, maybe you’re looking for a new yoga or pilates studio, and you want one that’s popular with locals, conveniently located for your commute, and also offers a discount for new members. Soon, with just one search, you’ll be able to ask something like “find the best yoga or pilates studios in Boston and show me details on their intro offers, and walking time from Beacon Hill.”
With planning capabilities directly in Search, you can get help creating plans for whatever you need, starting with meals and vacations. Search for something like “create a 3-day meal plan for a group that’s easy to prepare,” and you’ll get a starting point with a wide range of recipes from across the web.
Video search will soon be available for Search Labs users in the U.S. in English, with plans to expand to more regions over time.
Talk to your gallery with Ask Photos
In the upcoming months, Google Photos will introduce context-aware voice and text prompts to help users search for specific images or details within images. The Ask Photos feature goes beyond conventional image search by using Gemini to recognize image content. For example, it can read a car license plate in a photo, so users can ask for the plate number of a particular car and get an accurate answer.
The rollout of Ask Photos is tentatively set for this summer.
“Double the tokens, please!”
AI will scan your inbox with Gemini Pro in Workspace Labs
Gemini in Gmail is set to revolutionize email management by offering a comprehensive search feature that summarizes your entire email history in a convenient sidebar. This solution addresses the common issue of sifting through numerous emails to find relevant information. With Gemini, users can simply request a summary of emails from a specific contact, receiving a concise bullet-point list of key details and quick access to the original emails. In a one-minute demo, Google showcased how users can swiftly respond to emails directly from the Gemini sidebar, streamlining the communication process.
Audio Overviews
This NotebookLM upgrade is great for people who prefer learning by listening rather than reading. In a demo, NotebookLM was given some physics lessons to work with. When prompted by Google’s Josh Woodward, it generated a conversation between two speakers explaining how basketball relates to physics topics such as force.
Gemini 1.5 Flash
Gemini 1.5 Flash is “great at summarizing, chatting, captioning images and videos, extracting data from long documents and tables, and more,” wrote Demis Hassabis, CEO of Google DeepMind, in a blog post. Hassabis explained that Google made Gemini 1.5 Flash because developers wanted a model that was lighter and cheaper than the Pro version announced in February.
Gemini 1.5 Flash sits between Gemini 1.5 Pro and Gemini Nano, Google’s smallest model, which runs directly on devices. Even though it’s lighter than Gemini 1.5 Pro, it’s still powerful.
Imagen 3 is here to blow you away
Google says Veo, its new video generation model, understands natural language and visual concepts to generate the video you want. These AI-generated videos can be over a minute long and include advanced cinematic techniques like time lapses.
Imagen 3 is described as Google’s highest-quality text-to-image model, producing highly detailed and photorealistic images with fewer errors. Google claims Imagen 3 is better at understanding and managing detailed prompts and handles text more effectively than previous versions.
Enter Trillium
Next, Google introduced Trillium, the sixth generation of Google Cloud TPUs. These new AI-specific hardware units support Google’s latest AI models like Gemini 1.5 Flash, Imagen 3, and Gemma 2.