Google launches Gemini 2.0 for multimodal content generation

Nguyen Tran Tien
Dec 17, 2024
2 min read

Google has launched Gemini 2.0, an advanced AI model that enhances performance over version 1.5 and supports multimodal content generation, including text, images, native audio, and multiple languages.

According to CEO Sundar Pichai, while Gemini 1.0 focused on organizing and understanding information, Gemini 2.0 aims to make information significantly more useful.

The initial release, Gemini 2.0 Flash, is now available for early access to users and developers. This version boasts double the response speed of 1.5 Pro and excels in various applications. Notably, its proficiency in coding languages like Python, Java, and C++ has increased to 92.9% from 79.8% in 1.5 Flash, and its mathematical problem-solving capability has risen to 89.7% from 77.9%. However, its ability to comprehend extended contexts has slightly declined from 71.9% to 69.2%.

A standout feature of Gemini 2.0 Flash is its ability to generate original multimodal content. Outputs can include text, voice, images, and text-to-speech conversions, with customizable voice options for users.

Demis Hassabis, CEO of Google DeepMind, highlighted that improvements in reasoning, long-context understanding, planning, and executing complex instructions will enable new AI agent experiences.

During the launch, Google demonstrated features such as integrating Gemini 2.0 into Astra—a future AI assistant capable of understanding real-world contexts by combining Google Search, Lens, and Maps to provide swift responses. Another AI agent utilizing Gemini 2.0 can analyze information on a strategy game screen and suggest winning strategies to players.

Gemini 2.0 Flash is currently available as a trial model for developers through the Gemini API in Google AI Studio and Vertex AI. Users can also experience it via the Gemini chatbot by selecting the 2.0 Flash version. Google plans to expand applications and introduce additional models of Gemini 2.0 by January 2025.

Pichai noted that since the launch of Gemini 1.0 in December 2023, competing with OpenAI's GPT, Gemini applications have garnered two billion users, with the AI Overviews feature integrated into Google Search attracting one billion users.