
Moscow, March 13 (IANS) Leading global financial institution Sber’s GigaChat 2 MAX model ranks first among AI models, according to MERA benchmark data for Russian, and the updated product line outperforms GPT-4o, DeepSeek-V3, LLaMA 70B, and Qwen2.5 across multiple metrics.
The entire GigaChat 2.0 product line has received a significant upgrade, enabling business customers to solve current tasks and implement large-scale projects faster and better.
GigaChat 2 MAX has become even more powerful and outperforms many similar foreign models in solving tasks set in the Russian language.
GigaChat 2 Pro shows the same quality as the previous MAX version and can handle tasks where creativity and accuracy are important. At the same time, the model itself is less demanding in terms of resources, according to the firm.
GigaChat 2 Lite, the basic model for lighter tasks, is now comparable in quality to the previous Pro version and can handle complex tasks faster and more cost-effectively.
Users retain access to the first-generation models and can try GigaChat 2.0 before upgrading to the new product line.
“You don’t need to know much about programming to find the best version of GigaChat and its prompts for your business. The entire suite is available to enterprises in the cloud via APIs and can also be deployed on-prem,” said Andrey Belevtsev, Senior Vice President and Head of Technology Development at Sberbank. He added that GigaChat 2.0 is not just about an increase in metrics and technical features, but a significant step in the development of Russian-language large language models (LLMs).
“We have created a model at the level of the world’s best solutions, and in Russian-language tasks, the model outperforms most of them. Strong Russian neural networks are strategically important for any company operating in Russia,” Belevtsev added.
Belevtsev further stated that 15,000 external clients already use GigaChat and that this upgrade to the company’s product line will allow even more customers to solve a wide range of tasks more efficiently.
“By improving processes with the help of artificial intelligence, companies will have a unique opportunity to stay ahead of the competition, increase profits, and improve customer loyalty,” Belevtsev added.
Based on GigaChat 2.0, companies will be able to create more productive autonomous assistants (AI agents) that can reason and solve complex, multi-component problems on their own. This is possible because the models have broadened their knowledge of math, science, and the humanities, and have become better at writing code.
According to the company, developers building agents in Python and JavaScript can use the popular LangChain SDK, with which GigaChat is fully compatible. Compatibility packages are available in the GigaChain repository.
The next-generation models retain the context of a conversation for much longer, answer long, complex questions, and analyse more text. Previously, a single query could contain about 48 A4 pages of text (14 pt font); the maximum query volume has now increased to nearly 200 pages, roughly a fourfold increase. This makes it easier to build chatbots with GigaChat 2.0.
The new models are twice as accurate at following the user’s instructions and 25 per cent better at answering questions. They follow the specified formats and conditions and formulate answers in a given style, which helps solve work tasks such as preparing supporting legal documentation and analysing customer requests more efficiently.
According to the independent MERA benchmark for Russian, GigaChat 2 MAX ranks first among AI models. And, according to the results of MMLU benchmarking in Russian and English, the new product line is on par with the world’s top performers or even surpasses them.
The most impressive results were achieved by the flagship model of the series. Compared to DeepSeek-V3, Qwen2.5 (Qwen2.5-72B), GPT-4o, and LLaMA 70B, GigaChat 2 MAX is better at answering factual questions in Russian and at following the given format. The model also outperforms its foreign counterparts on the HumanEval benchmark for code generation and is more proficient in the exact sciences.
–IANS