NVIDIA’s Kari Briski believes open models will define the next era of AI
Open-weights models challenged by NVIDIA Nemotron’s full-stack transparency stance
Token efficiency becomes competitive advantage as throughput drives reasoning
Sovereign AI pitch: why outsource your country’s intelligence?
In the world of computing, open (be it software or hardware) is a curious and oft-abused word. Thrown around left, right and centre, but rarely examined. In AI circles, especially, it’s been pulled, stretched, and sanded down to fit a range of competing ideologies. Some say open means freedom from corporate control. Others use it as shorthand for transparency or accessibility. But for Kari Briski, Vice President of Generative AI Software for Enterprise at NVIDIA, who leads NVIDIA’s Nemotron model family and the ecosystem of software around it, “open” isn’t a slogan. It’s a principle and a stance that decides how technical progress is ultimately achieved.
“We believe that to further AI development you have to have open models,” Briski tells me, while referring to the Nemotron model family. Nemotron constitutes AI models and techniques developed by NVIDIA to build advanced reasoning-based AI agents. “We do it for ourselves, and are talking about it now so people understand this is like CUDA – like our math libraries – you can depend on it. We are committed to it. It is part of our strategy because first and foremost we build it for ourselves, and secondly because we believe in this AI development.”
In other words, Nemotron is part of NVIDIA’s core strategy. And it arrives right as enterprises move from sandbox GenAI experiments to shipping systems at scale.
“Right now, enterprises are finally realizing it takes more than one model to build a system. RAG didn’t solve everything. As they need to build agents that understand their environment and tools, they’ve got to customize. So we’re saying: here’s a model library you can depend on – that’s Nemotron. That’s why we’re talking about it now,” says Kari Briski.
Open-sourced confusion
Briski is clear-eyed about the industry’s semantic slop around open source software, especially in the uber-competitive GenAI landscape. “When you talk about open models, there’s been an explosion since Llama 2 in 2023, but everyone kind of changed the way they talk about them to say ‘open-weights models’ because no one releases the data they’ve trained their models on.”
Nemotron is built differently, highlights Kari Briski. “It isn’t just the models, it’s the datasets, both the pre-training and post-training datasets, as well as the algorithms, software, and research. You can pick up our datasets, you can go through them, you can know what’s inside of them. If you decide to pre-train your own or train your own data, you can pick and choose parts of the dataset that we’ve put out there – or not.” There’s no dearth of choice and flexibility in what Nemotron can do.
The point, for enterprises, is practical: “What that allows enterprises to do is take these open models and then blend them with their proprietary data or their IP or their personal information they have to train on, and so that model can run on-prem and in-house – none of that data ever comes back to us,” emphasizes Briski. “Having that sort of open data really allows the enterprise to have that balance between proprietary data and IP, and trust – through transparency – the model enough to put those two together.”

And for anyone thinking Nemotron might be some new experiment from NVIDIA, Kari Briski is quick to remind me that Nemotron didn’t materialize from nowhere. “Nemotron is not new – I think it’s NVIDIA’s best-kept secret. We’ve been flying under the radar. We put out our first model in 2020 called Megatron, and at the same time we had NeMo (neural modules). We brought those two teams together,” points out Briski.
“For us, it is our strategy to put out these models. And the reason it’s so great that we can do that is because we build our infrastructure and hardware to make sure they run great. Almost every time we have a new chip architecture or system – compute, networking, storage – we run a large training run and then we put that model out. That’s what sets us apart.”
When it comes to software or anything proprietary, most big companies are allergic to black boxes. Briski’s refrain speaks to that institutional allergy loud and clear. She’s forthright about the boring-but-crucial release artifacts that de-risk adoption. “We released the algorithms; we released the distillation methodology.” And she connects dots many vendors leave scattered. “If you look at other open models, they often don’t release the software that they’ve been trained with – by that I mean the training framework, or the reinforcement-learning framework, or anything else… but we release everything,” Briski says emphatically.
Rise of reasoning = AI agent wave of 2025
Ask Briski why agents finally feel useful and she points to the obvious that arrived late – reasoning models. That’s what really democratized AI agents this past year, according to Briski.
“First of all, Nemotron’s a fantastic reasoning model, and if you think about what really kicked off the season of agents in 2025, it was reasoning. People were building agents before, but they were very rules-based – if-then-else or hard-coded to tools. Reasoning models allowed agents to be more autonomous and more accurate – to really plan.”
She further highlights that a GenAI-based agent in 2025 isn’t just one model, but a system of models. “There’s no one model to rule them all. On top of that, we’re also saying that it takes specialized AI in different sizes of models. Nemotron comes in many sizes – Nano, Super, and Ultra – think small, medium, and large.”

And those sizes are mapped to boxes you can actually buy, if you’re an enterprise customer thinking of investing in NVIDIA’s platform, Kari Briski explains. “Super fits on a single H100 (Hopper). Ultra fits on a single node – eight GPUs in a box. A lot of state-of-the-art MoE reasoning models don’t fit on a single node; they are multi-node just to run inference… For Ultra we made sure it fits in a single box so you can serve it on eight GPUs.”
But if 2023 was the year of tokens-as-a-flex, 2025 is fast ending on a tokens-as-a-budget note. Efficiency is the name of the game when a hyperscaler is deploying GenAI applications at scale, where every penny and paisa counts. “The demand for tokens is going up faster than the cost per token is going down, which is why we’ve made these models extremely efficient,” explains Briski. “The more tokens you can generate in a short time, the larger the search space for the answer. More tokens can mean more accuracy – but we also want token efficiency.”
On the how, she’s comfortable getting under the hood. “If I look at token efficiency – throughput over time – our previous dense models, when we did neural-architecture search and reduced size, were about 30% smaller than a typical Llama-class model at the same accuracy. We will eventually have an MoE or Mixture of Experts – that’s on the roadmap – but our model is a hybrid: a Mamba state-space model as well as a transformer. That eliminates some of the compute cost even with a dense architecture,” Briski explains.
Enabling India’s sovereign AI efforts
For Briski, “sovereign AI” is not a slogan, it’s an operating requirement. “NVIDIA is a development platform – not just software. It’s NVIDIA Cloud Partners (NCP), ‘NEO clouds,’ and sovereign AI. Jensen has said before, ‘Why would you outsource your country’s intelligence?’”
The reason is cultural and kinetic: “When you have models, they need to understand your lexicon, culture, and political norms of the time. Things change – things need retraining and updates. If you want AI to be valuable, the token output has to be valuable… With our development platform, you put it all together and run it so your data stays there – your intelligence stays there – and you build a better data flywheel for your nation.”
This isn’t hypothetical in India. “Since Nemotron is not just the model – it’s the techniques and datasets – I’ve seen really good progress working with NCPs like Yotta. What Sarvam did was a great model. I’ve seen very positive early results. I work with a lot of sovereign nations, and India was one of the first. That was great to see,” Briski recollects, while talking about Nemotron’s India applications.

NVIDIA put out datasets not just to brag, but to seed ecosystems, Briski further tells me. “Recently we announced a model called UK-LLM. We helped them translate our reasoning datasets into Welsh to improve reasoning in that language.” And they’re shipping scaffolding India can use immediately.
“We took the Census of India and created a dataset called ‘Personas of India.’ You can use that dataset so models in India reflect the personas – age distribution, education, regional differences – and become better models.”
NVIDIA released Nemotron-Personas-India on October 13, establishing it as the first open synthetic dataset of Indian personas designed to reflect the country’s demographic, geographic, and linguistic diversity. The dataset is substantial, including 21 million synthetic personas and 7.7 billion tokens across three language/script combinations – English, Hindi (Devanagari), and Hindi (Latin) – and it was developed using the NeMo Data Designer (currently in early access). “We have a Data Designer as part of the NeMo framework. If you have a small amount of data, you can generate high-quality synthetic data; if you have private data, you can use differential privacy so the dataset is similar but untraceable back to PII,” she highlights.
According to Briski, this multilingual dataset provides crucial privacy-preserving synthetic training data aimed at supporting the development of regionally contextualized large language models (LLMs) and agentic AI systems for Indian use cases. Ultimately, this major release expands NVIDIA’s growing suite of open, localized datasets, thereby advancing sovereign AI development in India.
Advice for entrepreneurs on future of AI agents
Briski’s message to founders is refreshingly non-mystical. “If you’re starting right away, you don’t have to reinvent the wheel with the base model. Pick it up – the tools are available, the recipes are available, the data is ready, and the model is ready to build on. And just like a software library, we’ve got forums; you can open bugs. We’re here to be the development platform for you,” Briski reiterates.

For India especially, where power and connectivity gaps can be challenging for startups and GenAI companies to bridge, Briski has an important message. At the edge – where India lives as much as it works – distillation is the bridge. “It’s not one model to rule them all; it’s models of many sizes and efficiencies. If you have a model that’s great at a particular task, you can distill it to run at the edge – process and do inference where the data and events happen – before sending it back for retraining,” says Kari Briski.
And she underlines the operational pattern: the more efficient the model, the more efficient the edge. If AI is going to touch everything, enterprises will have to innovate for data centers and for smaller models that are still accurate across the edge and far edge – and bring it all back together.
On the topic of the future of AI agents, Briski gets delightfully analogue. “Agents have been very general until now. AI agents are like a college grad with access to the internet and a chatbot, not very specialised,” Briski points out. “To take AI to the last mile in the enterprise, an agent needs to be seasoned – trained on the same kind of information a chip designer of 20 years has. You wouldn’t unleash a fresh grad on proprietary EDA tools, just like you wouldn’t let a new doctor operate. The next thing for agents is specialization – that’s where it’s going,” sums up Briski, giving us a glimpse of what AI agents will look like in 2026 and beyond.
Jayesh Shinde
Executive Editor at Digit. Technology journalist since Jan 2008, with stints at Indiatimes.com and PCWorld.in. Enthusiastic dad, reluctant traveler, weekend gamer, LOTR nerd, pseudo bon vivant.