Alibaba Cloud has unveiled a large language model (LLM) named QwQ-32B (pronounced as Queue). The model, which thinks before responding, was first introduced as a preview at the end of 2024. It has now been officially released with significantly improved scores across multiple benchmarks, closely matching DeepSeek-R1, a model more than 20 times its size.
The Qwen team first trained the model on mathematical and programming problems, because answers in these categories are easy to verify, which allows reinforcement learning (RL) to reward correct responses. A subsequent training stage expanded its general capabilities, such as selecting tools, and this later stage improved those abilities with only a small amount of additional training. Core programming capabilities did not decline.
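To make the idea of a verifiable reward concrete, here is a minimal sketch of how a correct/incorrect signal could be computed for math and programming answers. It is an illustration only, not the Qwen team's actual pipeline: the function names `math_reward` and `code_reward`, the 0/1 reward values, and the test-case format are all assumptions made for this example.

```python
from fractions import Fraction


def math_reward(model_answer: str, reference: str) -> float:
    """Hypothetical reward: 1.0 if the final answer equals the reference, else 0.0."""
    try:
        # Fraction parses "3", "0.5", and "1/2", so equivalent forms compare equal.
        same = Fraction(model_answer.strip()) == Fraction(reference.strip())
        return 1.0 if same else 0.0
    except (ValueError, ZeroDivisionError):
        # Unparseable answers earn no reward.
        return 0.0


def code_reward(source: str, func_name: str,
                test_cases: list[tuple[tuple, object]]) -> float:
    """Hypothetical reward: 1.0 only if the generated function passes every test."""
    namespace: dict = {}
    try:
        # Illustration only: a real system would run untrusted code in a sandbox.
        exec(source, namespace)
        func = namespace[func_name]
        passed = all(func(*args) == expected for args, expected in test_cases)
        return 1.0 if passed else 0.0
    except Exception:
        # Syntax errors, missing functions, and runtime errors all score zero.
        return 0.0
```

Because each reward depends only on whether the final answer is right, no human grader is needed, which is why these two categories suit RL training well.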
DeepSeek had previously released a version of R1 distilled onto Qwen-32B, but its test results were notably lower than those of QwQ-32B. Users can now run a model that performs close to DeepSeek-R1 or o1-preview on relatively modest hardware, with the added benefit that the model is free to access.
Alibaba’s stock has surged by up to 8% since the launch of QwQ-32B.
Source – QwenLM
TLDR: Alibaba Cloud has released QwQ-32B, a freely available reasoning model with improved test results that rivals DeepSeek-R1. Alibaba's stock rose by up to 8% following the launch.