Krutrim-2 – Can India’s language-first AI outpace global benchmarks?

Ola’s Krutrim-2 represents India’s boldest attempt to create culturally resonant AI infrastructure, blending strategic investments with open-source collaboration. While achieving notable progress in Indic language processing, the model confronts systemic challenges in balancing local relevance with global performance parity.

Architectural evolution & benchmark realities of Krutrim-2

The model’s transition to a 12-billion parameter architecture marks a strategic leap from Krutrim-1, prioritising India’s linguistic diversity through expanded capabilities:

Metric              | Krutrim-1 | Krutrim-2  | Mistral-NeMo | DeepSeek-R1
Parameters          | 7B        | 12B        | 12B          | 500B
Context Window      | 4K tokens | 128K tokens| 128K tokens  | 128K tokens
Supported Languages | 10        | 22         | 13           | 15
MMLU                | 0.58      | 0.63       | 0.68         | 0.75
BharatBench (Indic) | –         | 0.95       | –            | –

Rudransh Agnihotri, FuturixAI CEO, notes: “The 71% parameter expansion hasn’t closed the MMLU gap with Mistral-NeMo – tokenisation inefficiencies in scripts like Devanagari appear to offset gains from Indic optimisation”.

Linguistic innovation & systemic challenges

Krutrim-2’s 128K-token context window facilitates complex vernacular dialogues across 22 scheduled languages, yet three critical hurdles emerge. Tokenisation struggles with Brahmic script complexities – conjunct consonants in Devanagari and vowel diacritics in Dravidian scripts require algorithmic overhauls to improve translation accuracy. 

Simultaneously, reliance on synthetic datasets introduces grammatical inconsistencies, particularly in low-resource languages like Bhojpuri, where 38% of outputs showed tense agreement errors during testing.

The prioritisation of BharatBench over global benchmarks creates a 7.4% performance gap against Mistral-NeMo, sparking debates about calibration methodologies for culturally grounded AI. Agnihotri elaborates: “Grammar correction trails 7B models in our trials – this isn’t about scale but rethinking Indic training paradigms”.

Open-sourcing strategy & ecosystem development of Krutrim-2

Krutrim’s decision to open-source aligns with global trends, yielding tangible ecosystem impacts. The Chitrarth-1 vision-language model now processes Tamil shopfront texts and Odia manuscripts with 89% accuracy, while Dhwani-1 enables Haryanvi dialect speech-to-text conversions for rural telemedicine platforms. 

Over 150 startups leverage these tools – Vyakhyarth-1 embeddings power vernacular search in agritech apps, and Krutrim Translate handles 10 million daily conversions. However, Agnihotri cautions: “Community innovation addresses tokenisation flaws but demands NVIDIA-tier compute many lack.”

Cost efficiencies emerge through DeepSeek model hosting at $0.003/token, 60% cheaper than GPT-4, though adoption remains constrained to 25,000 developers – just 3% of India’s AI workforce potential.

Infrastructure scaling & future trajectory

With ₹10,000 crore committed through 2026, Krutrim’s roadmap focuses on three key frontiers. First, the operational NVIDIA GB200 supercomputer will process 2 trillion Indic tokens by Q3 2025, becoming India’s largest AI infrastructure. 

Second, the “Bodhi” AI chip series – optimised for Bharatiya language processing – aims for 2026 deployment alongside 1GW data centres. Third, the Shivaay training framework, which compressed 200B tokens onto 8 GPUs, seeks to democratise access for vernacular AI startups. 

Aggarwal asserts: “We’re redefining efficiency metrics for Indian AI”, though matching DeepSeek’s 275 tokens/second processing speed remains contingent on algorithmic breakthroughs. 

The path forward for Krutrim-2

Krutrim-2 embodies India’s aspiration to craft AI that resonates with its linguistic soul, yet its success hinges on resolving the triad of tokenisation complexity, data quality, and benchmark calibration. As Aggarwal concedes, “We’re learning to walk before we run.” The model’s evolution will test whether cultural specificity and technical universality can coexist in India’s AI future.
