Google has introduced Gemini 3.1 Flash Lite, which is said to be the fastest and most cost-efficient Gemini 3 series model. The company says that 3.1 Flash Lite is designed for high-volume developer workloads at scale and offers high quality for its price and model tier. The new model is rolling out in preview to developers through the Gemini API in Google AI Studio and for enterprises via Vertex AI.
One of the biggest highlights of Gemini 3.1 Flash Lite is its cost-efficiency. It costs $0.25 per one million input tokens and $1.50 per one million output tokens. ‘3.1 Flash Lite delivers enhanced performance at a fraction of the cost of larger models,’ the tech giant explains. ‘It outperforms 2.5 Flash with a 2.5X faster Time to First Answer Token and 45 per cent increase in output speed, according to the Artificial Analysis benchmark while maintaining similar or better quality.’
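To put the quoted prices in perspective, here is a minimal Python sketch that estimates a bill from token counts. Only the two per-million prices come from the announcement; the function name and the sample workload of 10 million input tokens and 2 million output tokens are hypothetical, and actual billing may differ.

```python
# Preview prices quoted in the announcement, in USD per one million tokens.
INPUT_PRICE_PER_MILLION = 0.25
OUTPUT_PRICE_PER_MILLION = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10 million input tokens, 2 million output tokens.
    cost = estimate_cost(10_000_000, 2_000_000)
    print(f"Estimated cost: ${cost:.2f}")
```

For this sample workload, the input side comes to $2.50 and the output side to $3.00, so output tokens account for more of the bill despite being a fifth of the volume.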
According to Google, Gemini 3.1 Flash Lite also achieved an Elo score of 1432 on the Arena.ai Leaderboard and outperformed other models of a similar tier across reasoning and multimodal understanding benchmarks.
Another useful feature is support for thinking levels in AI Studio and Vertex AI, which let developers control how much reasoning the model applies to each task.
‘Early-access developers on AI Studio and Vertex AI, and companies like Latitude, Cartwheel and Whering are already using 3.1 Flash Lite to solve complex problems at scale. Early testers highlighted 3.1 Flash Lite’s efficiency and reasoning capabilities, saying it can handle complex inputs with the precision of a larger-tier model, plus follow instructions and maintain adherence,’ Google said.