- Biography - Homepage
I am Hongyu Wang (王鸿钰 in Chinese), a fourth-year Ph.D. candidate at the Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS), under the supervision of Professor Xilin Chen.
- CV - Homepage
Hongyu Wang, Jiayu Xu, Ruiping Wang, Yan Feng, Yitao Zhai, Peng Pei, Xunliang Cai, Xilin Chen. BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation.
- Publications - Homepage
MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models Published in arXiv, 2025 Abstract: Large multimodal Mixture-of-Experts (MoEs) effectively scale the model size to boost performance while maintaining fixed active parameters. However, previous works primarily utilized full-precision experts during sparse up-cycling. Although they show superior performance on end tasks, …
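To make the idea of ternary experts concrete, here is a minimal sketch of one common ternarization scheme (absmean scaling to {-1, 0, +1}); it is an illustration only, not necessarily the exact quantizer used in MoTE, and the function name is hypothetical.

```python
import torch

def ternarize_expert_weights(w: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Quantize an expert's weight matrix to {-1, 0, +1} times a per-tensor scale.

    Sketch only: a common absmean-style ternarization, not necessarily
    the scheme used by MoTE.
    """
    scale = w.abs().mean().clamp(min=eps)          # per-tensor scale
    w_ternary = (w / scale).round().clamp(-1, 1)   # values in {-1, 0, +1}
    return w_ternary * scale
```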
- BitNet b1.58 2B4T Technical Report - Homepage
Abstract: We introduce BitNet b1.58 2B4T, the first open-source, native 1-bit Large Language Model (LLM) at the 2-billion parameter scale. Trained on a corpus of 4 trillion tokens, the model has been rigorously evaluated across benchmarks covering language understanding, mathematical reasoning, coding proficiency, and conversational ability. Our results demonstrate that BitNet b1.58 2B4T …
- BitNet: Scaling 1-bit Transformers for Large Language Models
Abstract: The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption. In this work, we introduce BitNet, a scalable and stable 1-bit Transformer architecture designed for large language models. Specifically, we introduce BitLinear as a drop-in replacement of the nn.Linear layer in order to train …
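As an illustration of the drop-in idea, below is a minimal PyTorch sketch of a BitLinear-style layer with sign-binarized weights, a per-tensor absmean scale, and a straight-through estimator; it omits the activation quantization and normalization described in the paper, and `BitLinearSketch` is a hypothetical name, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinearSketch(nn.Module):
    """A minimal stand-in for nn.Linear with sign-binarized (1-bit) weights.

    Sketch only: weights are binarized with a per-tensor absmean scale,
    and the straight-through estimator keeps gradients flowing to the
    latent full-precision weights during training.
    """
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.normal_(self.weight, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        scale = w.abs().mean()            # per-tensor scale
        w_bin = torch.sign(w) * scale     # 1-bit weights in the forward pass
        w_bin = w + (w_bin - w).detach()  # straight-through estimator
        return F.linear(x, w_bin)
```

It can be dropped in wherever an `nn.Linear(in_features, out_features)` would otherwise be constructed, e.g. `BitLinearSketch(512, 512)(torch.randn(4, 512))`.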
- Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Abstract: We introduce Q-Sparse, a simple yet effective approach to training sparsely-activated large language models (LLMs). Q-Sparse enables full sparsity of activations in LLMs, which can bring significant efficiency gains in inference. This is achieved by applying top-K sparsification to the activations and the straight-through estimator to the training. We also introduce Block Q-Sparse …
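A minimal sketch of the two ingredients named above, top-K sparsification of activations combined with a straight-through estimator, is shown below; it is illustrative only, not the paper's code, and `topk_activation_ste` is a hypothetical helper name.

```python
import torch

def topk_activation_ste(x: torch.Tensor, k: int) -> torch.Tensor:
    """Keep only the k largest-magnitude activations along the last dimension.

    Sketch only: the forward pass zeroes all but the top-k entries,
    while the straight-through estimator lets the backward pass treat
    the operation as identity.
    """
    idx = x.abs().topk(k, dim=-1).indices             # indices of the k largest |x|
    mask = torch.zeros_like(x).scatter_(-1, idx, 1.0)
    x_sparse = x * mask                               # sparse forward value
    return x + (x_sparse - x).detach()                # identity gradient (STE)
```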
- BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
Abstract: Efficient deployment of 1-bit Large Language Models (LLMs) is hindered by activation outliers, which complicate quantization to low bit-widths. We introduce BitNet v2, a novel framework enabling native 4-bit activation quantization for 1-bit LLMs. To tackle outliers in attention and feed-forward network activations, we propose H-BitLinear, a module applying an online Hadamard transformation …
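To illustrate the role of the Hadamard transformation, here is a minimal normalized fast Walsh-Hadamard transform in PyTorch; it conveys how an online rotation can spread activation outliers across channels before low-bit quantization, but it is not the H-BitLinear implementation, and `walsh_hadamard` is a hypothetical name.

```python
import torch

def walsh_hadamard(x: torch.Tensor) -> torch.Tensor:
    """Normalized fast Walsh-Hadamard transform over the last dimension.

    Sketch only (assumes a power-of-two size): rotating activations this
    way spreads outliers across channels, which is the intuition behind
    transforming activations before quantizing them to low bit-widths.
    """
    n = x.shape[-1]
    assert n & (n - 1) == 0, "last dimension must be a power of two"
    y, h = x.clone(), 1
    while h < n:
        y = y.reshape(*x.shape[:-1], n // (2 * h), 2, h)
        a, b = y[..., 0, :], y[..., 1, :]
        y = torch.stack((a + b, a - b), dim=-2)   # butterfly step
        h *= 2
    return y.reshape(x.shape) / n ** 0.5          # orthonormal scaling
```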