Google Gemini new functions and features (UPDATING)

Google’s I/O conference opened, after a not-so-brief Marc Rebillet techno fiesta, with a focus on AI and, in particular, Google’s own AI: Gemini.

If 2023 was the year that catapulted AI into the mainstream, 2024 looks set to be the year that puts (Google’s) AI in everyone’s hand, home and head.

Google CEO Sundar Pichai highlighted that all of Google’s products with 2 billion users now use Gemini. This is just the start, as Pichai said:

We’re still in the beginning of our Gemini era.

Okay, let’s check it out!

AI Overviews coming now


Google kicked off the I/O 2024 event with a major announcement: the Search Generative Experience (SGE) feature from Search Labs is rolling out to US users as AI Overviews, starting within the week.

AI Overviews will automatically answer specific searches in the US, offering concise explanations at the top of search results pages before the traditional list of links. Over the next few days, hundreds of millions of users in the US will see AI Overviews, with plans to expand to over a billion users worldwide by the end of the year.

Soon, you’ll be able to adjust your AI Overview with options to simplify the language or break it down in more detail. This can be particularly useful if you’re new to a topic, or if you’re trying to simplify something to satisfy your kid’s curiosity.

AI Overviews will help with increasingly complex questions. Rather than breaking your question into multiple searches, you can ask your most complex questions, with all the nuances and caveats you have in mind, all in one go.

For example, maybe you’re looking for a new yoga or pilates studio, and you want one that’s popular with locals, conveniently located for your commute, and also offers a discount for new members. Soon, with just one search, you’ll be able to ask something like “find the best yoga or pilates studios in Boston and show me details on their intro offers, and walking time from Beacon Hill.”

Beyond finding the right answer or information for a complex question, Search will also be able to plan with you.

With planning capabilities directly in Search, you can get help creating plans for whatever you need, starting with meals and vacations. Search for something like “create a 3-day meal plan for a group that’s easy to prepare,” and you’ll get a starting point with a wide range of recipes from across the web.

With advancements in video understanding, you will be able to search using videos. For instance, if you bought a record player at a thrift shop and the needle arm is drifting unexpectedly, you can simply search with a video of the issue. This saves you from trying to describe the problem in words and returns an AI Overview with troubleshooting steps and resources.

Video search will soon be available for Search Labs users in the U.S. in English, with plans to expand to more regions over time.

Talk to your gallery with Ask Photos

Gemini is making its way further into the Photos app, which soon will be able to complete tasks you tell it to do.

In the upcoming months, Google Photos will introduce context-aware voice and text prompts to help users search for specific images or details within images. The Ask Photos feature goes beyond conventional image searches by using Gemini to recognize image content. For example, it can recognize a car that appears across a user’s photos and answer a question such as “What’s my license plate number?” by returning the plate from the matching images.

Ask Photos is expected to begin rolling out in the coming months, tentatively this summer.

“Double the tokens, please!”

Pichai also revealed that Gemini 1.5 Pro, the newest iteration of Google’s AI model, is now available to all Gemini Advanced subscribers. The public version comes with a context window of 1 million tokens, meaning it can take in up to that much input at once. Additionally, Google has extended Gemini 1.5 Pro to handle 2 million tokens, but this capability is limited to developers in a private preview.

In AI, a token is like a building block or a piece of a puzzle. It’s a small unit of text that represents something meaningful, like a word or part of a word. Tokens help AI understand and process language by breaking it down into manageable pieces, making it easier for computers to analyze and generate text.
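To make the idea of tokens and context windows more concrete, here is a minimal Python sketch. It is not Gemini’s tokenizer: the `toy_tokenize` and `fits_in_window` helpers are hypothetical names invented for illustration, and the splitting rule (whole words and punctuation marks) is a deliberate simplification. Real models use learned subword tokenizers, so actual token counts will differ.

```python
import re

# Context window sizes mentioned in the article, measured in tokens.
CONTEXT_WINDOWS = {
    "public": 1_000_000,
    "private_preview": 2_000_000,
}


def toy_tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens.

    Illustration only: production models use learned subword
    tokenizers, so real token counts will not match this.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def fits_in_window(token_count: int, window: int) -> bool:
    """Return True if a prompt of `token_count` tokens fits in `window`."""
    return token_count <= window


tokens = toy_tokenize("Double the tokens, please!")
print(tokens)       # ['Double', 'the', 'tokens', ',', 'please', '!']
print(len(tokens))  # 6

# A hypothetical 1.5-million-token prompt fits only the larger window.
print(fits_in_window(1_500_000, CONTEXT_WINDOWS["public"]))           # False
print(fits_in_window(1_500_000, CONTEXT_WINDOWS["private_preview"]))  # True
```

The point of the sketch is only the bookkeeping: text becomes a list of tokens, and the context window is a cap on how many of them the model can take in at once.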

AI will scan your inbox with Gemini Pro in Workspace Labs

Gemini in Gmail is set to revolutionize email management by offering a comprehensive search feature that summarizes your entire email history in a convenient sidebar. This solution addresses the common issue of sifting through numerous emails to find relevant information. With Gemini, users can simply request a summary of emails from a specific contact, receiving a concise bullet-point list of key details and quick access to the original emails. In a one-minute demo, Google showcased how users can swiftly respond to emails directly from the Gemini sidebar, streamlining the communication process.

Audio Overviews

Google is improving NotebookLM, its AI tool for understanding documents, by adding “audio overviews” that create a podcast-style conversation between two speakers.

This upgrade is great for people who prefer learning by listening rather than reading. In a demo, Google’s Josh Woodward gave NotebookLM some physics lessons to work with. It generated a conversation between two speakers and, when Woodward asked, explained how a physics concept such as force relates to basketball.

Gemini 1.5 Flash

Google is introducing a new model called Gemini 1.5 Flash, designed to be fast and efficient.

Gemini 1.5 Flash is “great at summarizing, chatting, captioning images and videos, extracting data from long documents and tables, and more,” wrote Demis Hassabis, CEO of Google DeepMind, in a blog post. Hassabis explained that Google made Gemini 1.5 Flash because developers wanted a model that was lighter and cheaper than the Pro version announced in February.

Gemini 1.5 Flash sits between Gemini 1.5 Pro and Gemini Nano, Google’s smallest model, which runs directly on devices. Even though it’s lighter than Gemini 1.5 Pro, it’s still powerful.

Imagen 3 is here to blow you away

Google also announced two new AI tools for media creation: Veo, which can create high-quality 1080p videos, and Imagen 3, the latest version of its text-to-image model.

Google says Veo understands natural language and visual concepts to generate the video you want. These AI-generated videos can be over a minute long and include advanced cinematic techniques like timelapses.

Imagen 3 is described as Google’s highest-quality text-to-image model, producing highly detailed and photorealistic images with fewer errors. Google claims Imagen 3 is better at understanding and managing detailed prompts and handles text more effectively than previous versions.

Enter Trillium

Next, Google introduced the sixth generation of its Cloud TPUs, called Trillium. These new AI-specific hardware units support Google’s latest AI models like Gemini 1.5 Flash, Imagen 3, and Gemma 2.0.

Trillium offers a 4.7 times increase in performance per chip compared to the previous TPU v5e, with double the memory and bandwidth. It includes a third-generation SparseCore accelerator for processing large data sets in ranking and recommendation tasks.

Google claims Trillium can train AI models faster and serve them with lower latency and cost, and that it is the company’s most energy-efficient TPU yet, at over 67% more energy-efficient than TPU v5e.
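As a rough back-of-the-envelope sketch of what a 4.7 times per-chip speedup could mean, the Python snippet below divides a baseline job time by the speedup. The `ideal_time_hours` helper and the 100-hour baseline are hypothetical, chosen only for illustration, and the calculation assumes the speedup carries over perfectly to the whole job, which real workloads rarely achieve.

```python
def ideal_time_hours(baseline_hours: float, speedup: float) -> float:
    """Estimate job time after a speedup, assuming perfect scaling.

    This is an upper bound on the benefit: memory, networking and
    software overheads usually keep real gains below the peak figure.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


# Hypothetical job that takes 100 hours on the older chip.
baseline = 100.0
per_chip_speedup = 4.7  # figure quoted in the article

estimate = ideal_time_hours(baseline, per_chip_speedup)
print(round(estimate, 1))  # 21.3
```

Under these idealized assumptions, a job that took 100 hours would finish in roughly 21 hours on the same number of chips.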

(Updating live…)
