If you have been looking for an alternative to ChatGPT, DeepSeek has just unveiled one: DeepSeek-R1. According to the company, it is a large language model (LLM) that rivals OpenAI's o1, and it is freely available for personal and commercial use, giving developers and researchers access to its code and weights. Before producing an answer, DeepSeek-R1 works through complex reasoning tasks with a chain of thought, without requiring explicit prompting to do so. This positions it as a significant competitor to OpenAI's ChatGPT.
DeepSeek-R1 is a more advanced version of DeepSeek-V3-Base. It boasts 671 billion parameters in a mixture-of-experts design, with 37 billion active for any given token, and it processes up to 128,000 tokens of input context, making it well suited to extensive reasoning tasks. The base model is further fine-tuned with reinforcement learning and synthetic datasets, and the company attributes much of its accuracy to rewards that incentivise correct step-by-step problem solving.
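Because the weights are openly released, the model can in principle be loaded with standard open-source tooling. The full 671-billion-parameter checkpoint is impractical on most hardware, so the minimal sketch below uses one of the small distilled checkpoints instead; the Hugging Face repo id is an assumption based on DeepSeek's release naming and should be verified before use.

```python
# Minimal sketch: loading a distilled DeepSeek-R1 checkpoint with the
# Hugging Face transformers library. The repo id below is an assumption
# based on DeepSeek's release naming; verify it on huggingface.co first.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "What is 17 * 24? Think step by step."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```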
If you wish to access the model via its API, pricing is $0.55 per million input tokens and $2.19 per million output tokens, significantly cheaper than OpenAI's o1.
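As a rough illustration of what those rates mean in practice, a request consuming 10,000 input tokens and 2,000 output tokens would cost about (10,000 / 1,000,000) × $0.55 + (2,000 / 1,000,000) × $2.19 ≈ $0.0099, i.e. under a cent. A minimal sketch of calling the API follows; it assumes DeepSeek's OpenAI-compatible endpoint and the "deepseek-reasoner" model name from the company's public docs, so verify both before relying on them.

```python
# Minimal sketch: calling DeepSeek-R1 via its OpenAI-compatible API.
# Assumptions: the base URL and the "deepseek-reasoner" model name follow
# DeepSeek's public docs at the time of writing; check them before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",   # placeholder; use your own key
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)

print(response.choices[0].message.content)

# Token usage is reported back, so per-request cost can be estimated
# from the published per-million-token rates:
usage = response.usage
cost = usage.prompt_tokens / 1e6 * 0.55 + usage.completion_tokens / 1e6 * 2.19
print(f"Approximate cost: ${cost:.4f}")
```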
On benchmarks, DeepSeek-R1 outperformed OpenAI's o1 on 5 of 11 tests, including AIME 2024 and MATH-500. It also beat Anthropic's Claude 3.5 Sonnet and OpenAI's GPT-4o across various evaluations. Furthermore, related models, such as DeepSeek-R1-Zero and distilled versions based on Qwen and Llama, performed competitively with or better than OpenAI's o1-mini.
DeepSeek-R1's transparent reasoning process contrasts with o1, which hides its intermediate steps. This openness not only builds trust but also makes it possible to train smaller, accurate models through distillation, as the sketch below illustrates.
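To make the contrast concrete, here is a sketch of retrieving both the visible reasoning trace and the final answer from the API. DeepSeek's docs describe a separate `reasoning_content` field on the response message for this purpose; treat the field name as an assumption and confirm it against the current API reference.

```python
# Sketch: reading the exposed chain of thought from a deepseek-reasoner reply.
# Assumption: the reasoning trace is returned in a `reasoning_content` field
# alongside the usual `content`, as described in DeepSeek's API docs.
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "How many prime numbers are less than 30?"}],
)

message = response.choices[0].message
print("Reasoning trace:\n", message.reasoning_content)  # the step-by-step thinking
print("\nFinal answer:\n", message.content)
```

Traces captured this way are exactly the kind of supervision data used to distill the smaller Qwen- and Llama-based variants mentioned above.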