OrionZheng/openmoe-34b-200B · Hugging Face OpenMoE-34B (200B Tokens). OpenMoE is a project aimed at igniting the open-source MoE community! We are releasing a family of open-sourced Mixture-of-Experts (MoE) Large Language Models. Our project began in the summer of 2023. On August 22, 2023, we released the first batch of intermediate checkpoints (OpenMoE-base 8B), along with the data and code [Twitter]. Subsequently, the OpenMoE-8B …
GitHub - XueFuzhao/OpenMoE: A family of open-sourced Mixture-of-Experts ... OpenMoE is a project aimed at igniting the open-source MoE community! We are releasing a family of open-sourced Mixture-of-Experts (MoE) Large Language Models. Our project began in the summer of 2023. On August 22, 2023, we released the first batch of intermediate checkpoints (OpenMoE-base 8B), along with the data and code [Twitter]. Subsequently, the OpenMoE-8B training was completed in …
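Going by the checkpoint name on the Hugging Face card above, a minimal loading sketch might look like the following. It assumes the OrionZheng/openmoe-34b-200B repository works with the standard transformers Auto classes and ships its own MoE modeling code (hence trust_remote_code=True); check the model card or the XueFuzhao/OpenMoE repo for the exact usage.

```python
# Minimal sketch, assuming the OrionZheng/openmoe-34b-200B checkpoint loads through
# the standard transformers Auto classes with custom remote code (unverified here).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OrionZheng/openmoe-34b-200B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # assumed: OpenMoE ships its own MoE modeling code
    device_map="auto",       # shard the 34B checkpoint across available devices
)

inputs = tokenizer("Mixture-of-Experts models are", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```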
[2409.02060] OLMoE: Open Mixture-of-Experts Language Models We introduce OLMoE, a fully open, state-of-the-art language model leveraging sparse Mixture-of-Experts (MoE). OLMoE-1B-7B has 7 billion (B) parameters but uses only 1B per input token. We pretrain it on 5 trillion tokens and further adapt it to create OLMoE-1B-7B-Instruct. Our models outperform all available models with similar active parameters, even surpassing larger ones like Llama2-13B.
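The "7B total parameters but only 1B used per input token" framing is the core sparse-MoE idea: a learned router sends each token to a small subset of expert feed-forward networks, so only those experts' weights are exercised per token. The sketch below is a generic top-k routed MoE layer with made-up sizes, not the actual OLMoE implementation.

```python
# Illustrative top-k routed MoE layer (hypothetical sizes and top_k; not OLMoE's code).
# Each token activates only top_k of num_experts expert MLPs, so the parameters
# touched per token are a small fraction of the layer's total parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # token -> expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                                  # x: (tokens, d_model)
        scores = self.router(x)                            # (tokens, num_experts)
        weights, picked = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)               # mixing weights over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = picked[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = TopKMoE()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512]); each token used 2 of the 8 expert MLPs
```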
Openmoe 34b 200B · Models · Dataloop The OpenMoE 34b 200B model is a great example. This model uses a Mixture-of-Experts architecture, which allows it to achieve comparable performance to larger models while keeping costs down. With 34 billion parameters trained on 200 billion tokens, it's designed to handle tasks like text generation and conversation with ease.
Introducing OLMoE - fully open source Mixture of Experts LLM Today, together with the Allen Institute for AI, we’re announcing OLMoE, a first-of-its-kind fully open-source mixture-of-experts (MoE) language model that scores best in class when considering the combination of performance and cost. OLMoE is pre-trained from scratch and released with open data, code, logs, and intermediate training checkpoints.
deepseek-coder-v2 - Ollama An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.
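As a usage sketch, a local Ollama server exposes pulled models over its REST API on port 11434. The snippet below assumes Ollama is running and that deepseek-coder-v2 has already been pulled (for example with `ollama pull deepseek-coder-v2`).

```python
# Hedged usage sketch: call a local Ollama server's /api/generate endpoint for the
# deepseek-coder-v2 model listed above. Assumes Ollama is running on its default port.
import json
import urllib.request

payload = {
    "model": "deepseek-coder-v2",
    "prompt": "Write a Python function that checks whether a string is a palindrome.",
    "stream": False,  # return the full completion in a single JSON response
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```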
OLMoE: Open Mixture-of-Experts Language Models - GitHub Fully open, state-of-the-art Mixture-of-Experts model with 1.3 billion active and 6.9 billion total parameters. All data, code, and logs released. This repository provides an overview of all resources for the paper "OLMoE: Open Mixture-of-Experts Language Models".
Openmoe 8b 200B · Models · Dataloop OpenMoE 8B 200B is a powerful AI model designed to promote open-source research in the field of Mixture-of-Experts (MoE) Large Language Models. Developed by a team of students, this model is unique in that it fully shares its training data, strategies, model architecture, and weights with the community.
[2402.01739] OpenMoE: An Early Effort on Open Mixture-of-Experts ... To help the open-source community have a better understanding of Mixture-of-Experts (MoE) based large language models (LLMs), we train and release OpenMoE, a series of fully open-sourced and reproducible decoder-only MoE LLMs, ranging from 650M to 34B parameters and trained on up to over 1T tokens. Our investigation confirms that MoE-based LLMs can offer a more favorable cost-effectiveness …
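The cost-effectiveness argument these MoE releases make comes down to active versus total parameters: per-token compute scales with the parameters actually applied to each token, not with everything stored in the checkpoint. The back-of-the-envelope sketch below uses the common forward-pass approximation of roughly 2 FLOPs per active parameter per token, with illustrative sizes (a dense 13B model versus an MoE with about 1.3B active parameters); these are not measurements from either paper.

```python
# Back-of-the-envelope illustration of the active-vs-total parameter argument.
# Uses the rough approximation: forward-pass FLOPs per token ~= 2 * active parameters.
def forward_flops_per_token(active_params: float) -> float:
    return 2 * active_params

dense_13b = forward_flops_per_token(13e9)   # illustrative dense 13B model
moe_1b = forward_flops_per_token(1.3e9)     # illustrative MoE with ~1.3B active params

print(f"dense 13B : {dense_13b:.1e} FLOPs/token")
print(f"sparse MoE: {moe_1b:.1e} FLOPs/token")
print(f"ratio     : {dense_13b / moe_1b:.1f}x fewer FLOPs per token for the MoE")
```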