- Introducing Triton: Open-source GPU programming for neural networks
We’re releasing Triton 1.0, an open-source Python-like programming language which enables researchers with no CUDA experience to write highly efficient GPU code—most of the time on par with what an expert would be able to produce.
- OpenAI
OpenAI and NORAD team up to bring new magic to “NORAD Tracks Santa” (Company, Dec 1, 2025)
- Software Engineer, Triton Compiler - OpenAI
The Compiler Kernels team is responsible for the performance-critical software stack that brings OpenAI’s custom silicon to life. We design and build the compilers, languages, and high-performance kernels that allow researchers to fully exploit our first-party accelerators.
- How to run gpt-oss with Transformers | OpenAI Cookbook
This guide will walk you through running OpenAI gpt-oss-20b or OpenAI gpt-oss-120b using Transformers, either with a high-level pipeline or via low-level generate calls with raw token IDs.
- How to run gpt-oss-20b on Google Colab | OpenAI Cookbook
Since support for mxfp4 in transformers is bleeding edge, we need a recent version of PyTorch and CUDA in order to be able to install the mxfp4 Triton kernels.
- 80% faster, 50% less memory, 0% loss of accuracy LLM finetuning
Hey OpenAI community! It’s been a while since I last posted, so hi again! I just launched Unsloth (GitHub - unslothai/unsloth: 80% faster, 50% less memory LLM finetuning), which allows you to finetune LLMs 5x faster and use 50% less memory, all on your local GPU.
- Project: Running your own Whisper-Large-v3 model and extract Audio . . .
This allows you to clone the openai/whisper-large-v3 repo into your own ZeroSpace and get going “quickly” (caveats in a minute). I will have a free way to do this as well for hobbyists, who don’t need the insane speedup that a hefty GPU can provide.
- Block-sparse GPU kernels - OpenAI
We’re releasing highly optimized GPU kernels for an underexplored class of neural network architectures: networks with block-sparse weights. Depending on the chosen sparsity, these kernels can run orders of magnitude faster than cuBLAS or cuSPARSE. We’ve used them to attain state-of-the-art results in text sentiment analysis and generative modeling of text and images.
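To make the idea concrete, here is a minimal CPU sketch (in NumPy, not OpenAI's actual GPU kernels) of what a block-sparse weight matrix buys you: the matrix is divided into fixed-size blocks, only the blocks marked active in a sparsity mask are stored, and the multiply skips the zero blocks entirely. The block size, function name, and storage layout below are illustrative assumptions.

```python
# Illustrative CPU sketch of block-sparse matmul; not OpenAI's GPU kernels.
import numpy as np

BLOCK = 4  # assumed block side length; real kernels use sizes like 8/16/32


def block_sparse_matmul(x, blocks, mask):
    """Multiply x of shape (m, kb*BLOCK) by a block-sparse (kb*BLOCK, nb*BLOCK) matrix.

    `mask` is a (kb, nb) boolean array marking active blocks; `blocks` holds
    the dense (BLOCK, BLOCK) contents of each active block, in row-major
    order over the mask. Inactive (all-zero) blocks are skipped entirely,
    which is where the speedup over a dense multiply comes from.
    """
    m = x.shape[0]
    kb, nb = mask.shape
    out = np.zeros((m, nb * BLOCK))
    idx = 0
    for i in range(kb):
        for j in range(nb):
            if mask[i, j]:
                out[:, j * BLOCK:(j + 1) * BLOCK] += (
                    x[:, i * BLOCK:(i + 1) * BLOCK] @ blocks[idx]
                )
                idx += 1
    return out


# Sanity check against a dense reference: materialize the full matrix
# (zeros in inactive blocks) and compare results.
rng = np.random.default_rng(0)
mask = rng.random((3, 3)) < 0.5            # roughly half the blocks active
blocks = rng.standard_normal((mask.sum(), BLOCK, BLOCK))
dense = np.zeros((3 * BLOCK, 3 * BLOCK))
idx = 0
for i in range(3):
    for j in range(3):
        if mask[i, j]:
            dense[i * BLOCK:(i + 1) * BLOCK, j * BLOCK:(j + 1) * BLOCK] = blocks[idx]
            idx += 1
x = rng.standard_normal((2, 3 * BLOCK))
assert np.allclose(block_sparse_matmul(x, blocks, mask), x @ dense)
```

On a GPU the same structure lets the kernel launch work only for active blocks, so at high sparsity the cost scales with the number of nonzero blocks rather than the full matrix size.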