- Gated Attention for Large Language Models: Non-linearity, Sparsity,. . .
The authors respond that they will add experiments on the Qwen architecture, provide the hyperparameters, and promise to open-source one of the models. Reviewer bMKL is the only reviewer to initially score the paper in the negative region (Borderline reject); they have some doubts about the experimental section.
- Shanghaoran Quan | OpenReview
Shanghaoran Quan Intern, Qwen Team, DAMO Academy, Alibaba Group Undergrad student, School of Computer Science and Engineering, Beihang University Joined July 2023
- LiveVQA: Assessing Models with Live Visual Knowledge
We introduce LiveVQA, an automatically collected dataset of the latest visual knowledge from the Internet with synthesized VQA problems. LiveVQA consists of 3,602 single- and multi-hop visual questions from 6 news websites across 14 news categories, featuring high-quality image-text coherence and authentic information. Our evaluation across 15 MLLMs (e.g., GPT-4o, Gemma-3, and Qwen2.5-VL family
- MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context . . .
(a) Summary of Scientific Claims and Findings: The paper presents MagicDec, a speculative decoding technique aimed at improving throughput and reducing latency for long-context Large Language Models (LLMs). It challenges the conventional understanding by demonstrating that speculative decoding can be effective even in high-throughput scenarios with large batch sizes and extended sequences.
- rStar-Math: Small LLMs Can Master Math Reasoning with Self . . . - OpenReview
For Qwen models with Best-of-N, we re-evaluate MATH-500 and AIME/AMC accuracy; other benchmark results are from their technical reports. For a fair comparison, rStar-Math runs MCTS to generate the same number of solutions as Qwen.
- Qidong Huang - OpenReview
Qidong Huang Researcher, Qwen Team, Alibaba Group PhD student, University of Science and Technology of China Joined September 2022
- pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
Few-step diffusion or flow-based generative models typically distill a velocity-predicting teacher into a student that predicts a shortcut towards denoised data. This format mismatch has led to
- Towards Federated RLHF with Aggregated Client Preference for LLMs
Reinforcement learning with human feedback (RLHF) fine-tunes a pretrained large language model (LLM) using user preference data, enabling it to generate content aligned with human preferences.