The AI model Qwen/QwQ-32B-Preview is now available via HuggingChat, as announced on X by Victor Mustar, Head of Product Design at Hugging Face. The release of the full-precision model on the HuggingChat platform opens up new possibilities for developers and users.
QwQ-32B-Preview, developed by the Qwen Team, is an experimental research model focused on improving AI reasoning abilities. As a preview version, it shows promising analytical capabilities, but also has some limitations.
Known challenges include language mixing, recursive reasoning loops, and safety. The model can switch languages unexpectedly, which reduces the clarity of its responses, and it can get stuck in circular reasoning patterns that produce long but inconclusive answers. Improved safety measures are also needed to ensure reliable performance, so users should exercise caution during deployment.
Despite these limitations, the model excels at mathematics and programming. There is still room for improvement in areas such as common-sense reasoning about everyday situations and nuanced language understanding.
QwQ-32B-Preview is based on the Transformer architecture with RoPE, SwiGLU, RMSNorm, and attention QKV bias. It has 32.5 billion parameters, of which 31 billion are non-embedding parameters. The model has 64 layers and uses grouped-query attention (GQA) with 40 query heads and 8 key-value heads, with a context length of 32,768 tokens.
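The practical payoff of GQA is a smaller KV cache during inference. A back-of-the-envelope sketch using the numbers above (the per-head dimension of 128 is an assumption, not stated here):

```python
# Rough KV-cache comparison for QwQ-32B-Preview's grouped-query attention.
# Layer and head counts come from the model description above; HEAD_DIM = 128
# is an assumed per-head dimension for illustration.

N_LAYERS = 64      # transformer layers
N_Q_HEADS = 40     # query heads
N_KV_HEADS = 8     # key/value heads (GQA)
HEAD_DIM = 128     # assumed per-head dimension
BYTES_FP16 = 2     # bytes per value in half precision

def kv_cache_bytes_per_token(n_kv_heads: int) -> int:
    """Bytes one token occupies in the KV cache: a K and a V tensor per layer."""
    return 2 * N_LAYERS * n_kv_heads * HEAD_DIM * BYTES_FP16

gqa = kv_cache_bytes_per_token(N_KV_HEADS)   # 262,144 bytes (256 KiB) per token
mha = kv_cache_bytes_per_token(N_Q_HEADS)    # hypothetical full multi-head baseline

print(f"GQA: {gqa} B/token, MHA baseline: {mha} B/token, saving: {mha // gqa}x")
```

With 8 KV heads instead of 40, the cache shrinks fivefold, which matters at the full 32,768-token context length.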
The Qwen2.5 code is integrated into the latest Hugging Face Transformers, so using the latest version of Transformers is recommended; versions older than 4.37.0 may result in errors.
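A quick guard against that pitfall can be sketched as follows. The 4.37.0 threshold comes from the text; the helper itself is a generic dotted-version comparison with no third-party dependencies (and deliberately ignores pre-release suffixes like `.dev0`):

```python
# Minimal version guard for the Transformers requirement mentioned above.
# Plain "X.Y.Z" strings only; pre-release suffixes are not handled.

def version_tuple(v: str) -> tuple:
    """Turn '4.37.0' into (4, 37, 0) for numeric comparison."""
    return tuple(int(part) for part in v.split("."))

def supports_qwen2(installed: str, minimum: str = "4.37.0") -> bool:
    """True if the installed Transformers version meets the Qwen2.5 minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

# In practice one would check the installed library:
#   import transformers
#   assert supports_qwen2(transformers.__version__), "please upgrade transformers"
print(supports_qwen2("4.46.3"))  # True
print(supports_qwen2("4.36.2"))  # False
```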
The availability of Qwen/QwQ-32B-Preview on HuggingChat is an important step for AI research and development. Open availability lets the community explore the model's capabilities and limitations and contribute to its improvement, while the HuggingChat integration makes it easy for a wider audience to access and use. Future development and work on the existing challenges will reveal the true potential of this model.