Google is reportedly rolling out new Gemini features that analyse a user’s surroundings through the device’s camera, or the contents of their screen, in real time and provide instant answers to questions. These features are part of Project Astra, which the company announced last year.
A user on Reddit claimed to have received the Gemini update and shared a video demonstrating how the feature works. It is part of Gemini Live, which lets users converse with the AI tool in real time using natural language.
Later, a Google spokesperson confirmed to The Verge that the company has indeed started rolling out the features. Recently, at Mobile World Congress in Barcelona, the company announced that ‘live video’ and ‘screen sharing’ capabilities would start rolling out this month to Gemini Advanced subscribers on the Google One AI Premium plan on Android devices.
In December last year, the company announced further improvements to Project Astra, including better dialogue, lower latency, improved memory and the ability to use external tools. Around the same time, it also unveiled the Multimodal Live API with Gemini 2.0, which could read information from the user’s screen and offer real-time advice.
OpenAI already offers a similar capability: ChatGPT Plus and Pro subscribers can use the camera to feed real-time visual information into the Advanced Voice Mode feature.
Recently, Google announced Audio Overviews for Gemini, which brings NotebookLM’s capabilities to the AI assistant. The company also announced native image generation and image editing features for the Gemini 2.0 Flash model.
Meanwhile, Google announced Gemma 3, the next iteration in the Gemma family of open-weight models and the successor to the Gemma 2 model released last year. In the Chatbot Arena, Gemma 3 27B outperformed DeepSeek-V3, OpenAI’s o3-mini and Meta’s Llama 3 405B model. Chatbot Arena ranks models through side-by-side comparisons judged by human evaluators.