- arXiv.org e-Print archive
arXiv is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics.
- [2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs ...
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it ...
- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
General-purpose robots need a versatile body and an intelligent mind. Recent advancements in humanoid robots have shown great promise as a hardware platform for building generalist autonomy in the human world. A robot foundation model, trained on massive and diverse data sources, is essential for enabling the robots to reason about novel situations, robustly handle real-world variability, and ...
- [2505.13447] Mean Flows for One-step Generative Modeling - arXiv.org
We propose a principled and effective framework for one-step generative modeling. We introduce the notion of average velocity to characterize flow fields, in contrast to instantaneous velocity modeled by Flow Matching methods. A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training. Our method, termed the MeanFlow model, is self ...
- Fast Text-to-Audio Generation with Adversarial Post-Training
Text-to-audio systems, while increasingly performant, are slow at inference time, thus making their latency unpractical for many creative applications. We present Adversarial Relativistic-Contrastive (ARC) post-training, the first adversarial acceleration algorithm for diffusion flow models not based on distillation. While past adversarial post-training methods have struggled to compare ...
- [2103.14030] Swin Transformer: Hierarchical Vision Transformer using ...
This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. Challenges in adapting Transformer from language to vision arise from differences between the two domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text. To address these ...
- [2505.19124] Asymptotic Efficiency Analysis of the Recursive Least ...
Abstract: This paper investigates the optimality analysis of the recursive least-squares (RLS) algorithm for autoregressive systems with exogenous inputs (ARX systems). A key challenge in the analysis is managing the potential unboundedness of the parameter estimates, which may diverge to infinity.