- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
We introduce DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance. The key technical breakthroughs of DeepSeek-V3.2 are as follows: (1) DeepSeek Sparse Attention (DSA): We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance in long-context scenarios.
- deepseek-ai/DeepSeek-V3.2-Speciale · Hugging Face
DeepSeek Sparse Attention (DSA): We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
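- DeepSeek-V3.2: When "Sparse Attention" Meets "Olympiad-Level Reasoning," Open-Source Models Finally Reach Their Peak Moment - Zhihu
DeepSeek-V3.2's success rests on three technical pillars: an efficient attention mechanism, a scalable reinforcement learning framework, and a large-scale agent data synthesis pipeline.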
- DeepSeek-V3.2 · Models
DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a "thinking with tools" capability.
- DeepSeek Debuts New AI Models to Rival Google and OpenAI
The new version, DeepSeek-V3.2, matches the performance of OpenAI Inc.'s flagship GPT-5 across multiple reasoning benchmarks and combines human-like reasoning with the capability to use tools like ...
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
This paper introduces DeepSeek-V3.2, a new AI model designed to be both efficient in its calculations and very good at complex reasoning and problem-solving, especially when acting as an 'agent' that can use tools.
- A Technical Tour of the DeepSeek Models from V3 to V3.2
Similar to DeepSeek V3, the team released their new flagship model over a major US holiday weekend. Given DeepSeek V3.2’s really good performance (on GPT-5 and Gemini 3.0 Pro level), and the fact that it’s also available as an open-weight model, it’s definitely worth a closer look.
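- AI-Guided Reading of AI Papers: DeepSeek-V3.2: Pushing the Frontier of Open Large Language ...
DeepSeek-V3.2 is an open-source large language model from DeepSeek-AI that significantly improves performance through three technical innovations: DSA sparse attention reduces long-context computational complexity from O(L²) to O(Lk) while retaining 128K context capability; a scalable RL framework invests compute exceeding 10% of pre-training compute, bringing the base version's reasoning performance on par with GPT-5; and the high-compute variant DeepSeek-V3.2-Speciale, on IMO, IOI, and other ...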
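
The O(L²) to O(Lk) reduction mentioned in the snippets can be illustrated with a minimal top-k sparse attention sketch in NumPy. This is not DeepSeek's implementation: the function names and the `index_scores` selector input are hypothetical, and how DSA actually ranks keys is not described in these snippets. The sketch only shows that each query attends to k selected keys instead of all L, so L queries cost on the order of L·k score computations rather than L².

```python
import numpy as np

def dense_attention(q, k, v):
    """Softmax attention for a single query vector over every key it is given."""
    scores = k @ q / np.sqrt(q.shape[-1])       # one score per key
    weights = np.exp(scores - scores.max())     # numerically stable softmax
    weights /= weights.sum()
    return weights @ v

def topk_sparse_attention(q, k, v, index_scores, top_k):
    """Attend only to the top_k keys ranked by index_scores.

    index_scores is an assumed, externally supplied relevance score per key;
    it stands in for whatever selection mechanism the real model uses.
    """
    top_k = min(top_k, k.shape[0])
    keep = np.argpartition(index_scores, -top_k)[-top_k:]   # indices of the top_k scores
    return dense_attention(q, k[keep], v[keep])
```

With `top_k` equal to the sequence length the sparse variant reduces to dense attention, which is a convenient sanity check for a sketch like this.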