Last week, Google announced the rollout of its new AI model, Gemini 1.0 Ultra, marking a major advancement in Google's AI capabilities, particularly with the introduction of Gemini Advanced. The rollout now extends to developers and Cloud customers, who can use the Gemini API in AI Studio and Vertex AI.
Sundar Pichai, CEO of Google, highlighted the ongoing efforts of their teams in advancing AI technology, emphasizing safety as a core component. Despite the recent release of Gemini 1.0 Ultra, Google is already introducing its successor, Gemini 1.5. The new version improves on its predecessor in several respects and is particularly notable for its efficiency: it achieves quality comparable to Gemini 1.0 Ultra while using less compute.
One of the most groundbreaking features of Gemini 1.5 is its enhanced ability to understand long contexts. The model can consistently process up to 1 million tokens, the largest context window of any large-scale foundation model to date. This enhancement is expected to unlock new capabilities and help developers build more useful models and applications.
Demis Hassabis, CEO of Google DeepMind, provided further details about Gemini 1.5. This next-generation model represents a major leap in performance, built on Google's extensive research and engineering advances. Gemini 1.5 is designed to be more efficient in both training and serving thanks to its new Mixture-of-Experts (MoE) architecture, which activates only the most relevant expert pathways of the network for each input rather than running the entire model.
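The MoE idea can be sketched in a few lines: a learned router scores a set of expert sub-networks for each input, and only the top-scoring experts actually run, so compute scales with the number of selected experts rather than with the total. The dimensions, weights, and `moe_layer` helper below are all hypothetical, a minimal illustration of the technique rather than Gemini's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, for illustration only).
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is a small feed-forward layer; here just a weight matrix.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
router_w = rng.normal(size=(d_model, n_experts))  # learned router weights

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w              # router score for each expert
    top = np.argsort(logits)[-top_k:]  # indices of the k highest-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()               # softmax over the selected experts only
    # Only the chosen experts run, so compute grows with top_k, not n_experts.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

y = moe_layer(rng.normal(size=d_model))
```

The key design point is sparsity: with `top_k = 2` of 4 experts, each token pays for two expert forward passes no matter how many experts the model holds in total, which is what makes MoE models cheaper to train and serve at a given capacity.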
The first version released for early testing is Gemini 1.5 Pro, a mid-size multimodal model that delivers performance comparable to the larger Gemini 1.0 Ultra. It ships with a standard context window of 128,000 tokens, while a select group of developers and enterprise customers can test an experimental long-context capability of up to 1 million tokens.
Google has focused on optimizing Gemini 1.5 Pro to reduce latency and computational requirements while improving the overall user experience. The model's extended context window lets it process large amounts of data in a single prompt, such as hour-long videos, extensive audio files, and vast codebases.
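To put those window sizes in perspective, a rough back-of-envelope calculation helps. The figures below assume roughly 4 characters per token and 5 characters per word, which are common rules of thumb for English text, not official Gemini tokenizer numbers:

```python
# Rough estimate: how much English text fits in a given context window?
# Assumptions (rules of thumb, not Gemini tokenizer figures):
CHARS_PER_TOKEN = 4
CHARS_PER_WORD = 5  # average word length including the trailing space

def approx_words(context_tokens: int) -> int:
    """Approximate word capacity of a context window under the assumptions above."""
    return context_tokens * CHARS_PER_TOKEN // CHARS_PER_WORD

standard = approx_words(128_000)    # standard window
extended = approx_words(1_000_000)  # experimental long-context window

print(f"128k tokens ~ {standard:,} words")
print(f"1M tokens   ~ {extended:,} words")
```

Under these assumptions, the jump from 128,000 to 1 million tokens is the difference between fitting a long novel and fitting a small shelf of them in a single prompt.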
Gemini 1.5 Pro demonstrates its prowess in various applications, including the analysis, classification, and summarization of large content volumes. It can handle complex reasoning over extensive data sets, such as lengthy transcripts, multimodal content like silent films, and intricate coding tasks.
In terms of performance, Gemini 1.5 Pro outperforms Gemini 1.0 Pro on most benchmarks and achieves results comparable to Gemini 1.0 Ultra. Its extended context window doesn't compromise performance: the model maintains high accuracy when locating specific pieces of information embedded within very large inputs.
Gemini 1.5 Pro also exhibits excellent in-context learning abilities, quickly adapting to new skills from information in lengthy prompts. This capability was demonstrated in a benchmark test involving the translation of a rare language.
In line with Google’s AI Principles, Gemini 1.5 Pro underwent extensive ethics and safety testing, incorporating these findings into its development and evaluation processes.
Google is now offering a limited preview of Gemini 1.5 Pro to developers and enterprise customers. The model will initially be available with a standard 128,000-token context window, with scalable pricing tiers planned for larger context windows of up to 1 million tokens. Early testers can access the 1-million-token feature at no cost, though they should expect higher latency during this experimental phase.
In conclusion, Gemini 1.5 Pro represents a significant step in AI development, offering powerful capabilities in processing and understanding large-scale data, which opens up new possibilities for AI applications in various fields.