Grok-1

Grok-1 is an open release from xai-org containing JAX example code for loading and running the Grok-1 open-weights model, a 314B-parameter Mixture-of-Experts LLM. The repository includes instructions for downloading the model checkpoint (via magnet link or the Hugging Face Hub), example run scripts, model specifications, and notes on hardware requirements and licensing.

Introduction

Overview

Grok-1 is an open-weight release hosted by xai-org that provides JAX example code to load and run the Grok-1 language model. The repository's primary goal is to let researchers and engineers validate the model and run inference using the provided checkpoints and example scripts.

Model specifications
  • Parameters: 314B
  • Architecture: Mixture of Experts (MoE) with 8 experts
  • Expert utilization: 2 experts active per token (a routing sketch follows this list)
  • Layers: 64
  • Attention heads: 48 (queries), 8 (keys/values)
  • Embedding size: 6,144
  • Tokenizer: SentencePiece with 131,072 tokens
  • Positional embeddings: Rotary embeddings (RoPE)
  • Supports: activation sharding and 8-bit quantization
  • Maximum sequence length: 8,192 tokens
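
The 8-expert, top-2 layout above means each token is scored against all experts by a gating network but processed by only two of them. Below is a minimal JAX sketch of that routing step; it is illustrative only and is not the repository's MoE layer (the function name route_tokens and the gating details are assumptions).

# Illustrative top-2-of-8 expert routing; NOT the repository's implementation.
import jax

num_experts = 8          # experts, per the spec list above
experts_per_token = 2    # top-2 routing
d_model = 6144           # embedding size, per the spec list above

def route_tokens(router_weights, tokens):
    # tokens: [num_tokens, d_model]; router_weights: [d_model, num_experts]
    logits = tokens @ router_weights                       # [num_tokens, 8]
    gate_probs = jax.nn.softmax(logits, axis=-1)
    top_probs, top_idx = jax.lax.top_k(gate_probs, experts_per_token)
    # Each token is dispatched to its 2 highest-scoring experts, with the
    # selected gate probabilities renormalized to sum to 1.
    return top_idx, top_probs / top_probs.sum(axis=-1, keepdims=True)

key = jax.random.PRNGKey(0)
router = 0.02 * jax.random.normal(key, (d_model, num_experts))
example_tokens = jax.random.normal(key, (4, d_model))      # 4 example tokens
expert_idx, gate_weights = route_tokens(router, example_tokens)  # shapes: [4, 2]
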
What the repository provides
  • JAX example code to load and run the Grok-1 checkpoint and sample outputs (run.py example); a tokenizer sanity check is sketched after this list.
  • Instructions and commands for downloading the checkpoint either via a provided magnet link or directly from the Hugging Face model repository (xai-org/grok-1).
  • Notes on the implementation: the MoE layer implementation prioritizes correctness and ease of validation over runtime efficiency, so it may not be optimal for production inference.
  • License: Apache 2.0 (applies to source files and the distributed Grok-1 weights in this release).
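
One lightweight validation step, given the SentencePiece tokenizer with 131,072 tokens listed in the specs above, is to load the tokenizer and check its vocabulary size. This is a hedged sketch: the tokenizer.model path is an assumption about where the tokenizer file sits after cloning, not something stated on this page.

# Sanity check for the SentencePiece tokenizer (131,072-token vocabulary).
# The "tokenizer.model" path is an assumed location; adjust as needed.
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("tokenizer.model")
assert sp.GetPieceSize() == 131072            # vocabulary size from the spec list
ids = sp.EncodeAsIds("Grok-1 is a Mixture-of-Experts model.")
print(len(ids), ids[:8])                      # token count and first few ids
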
How to run (example)
  1. Clone the repository and install requirements:
git clone https://github.com/xai-org/grok-1.git && cd grok-1
pip install -r requirements.txt
  2. Download the checkpoint and place it under checkpoints/ckpt-0 (two options are provided in the repo):
  • the torrent magnet link in the README, or
  • Hugging Face Hub download commands (requires huggingface_hub) to fetch ckpt-0/* into checkpoints/ (a download sketch follows these steps).
  3. Run the example script to load the checkpoint and sample from the model:
python run.py
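
A minimal sketch of the Hugging Face Hub option from step 2, assuming the huggingface_hub package is installed; the repo ID, file pattern, and target directory mirror the description above, but treat this as an illustrative call rather than the README's exact command.

# Fetch only ckpt-0/* from the xai-org/grok-1 model repo into checkpoints/.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="xai-org/grok-1",
    repo_type="model",
    allow_patterns=["ckpt-0/*"],
    local_dir="checkpoints",
)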

Note: because Grok-1 has 314B parameters, you need a machine with enough GPU memory (and an appropriate JAX setup) to run the model at reasonable performance; a rough weight-memory estimate follows below. The repository documents that the MoE implementation is not optimized for memory use or throughput.
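
As a rough back-of-the-envelope estimate (not a figure taken from the repository), the weights alone dominate memory at this scale:

# Rough weight-memory estimate for a 314B-parameter model. Excludes activations,
# KV cache, and framework overhead; ballpark figures only.
params = 314e9
for name, bytes_per_param in [("8-bit quantized", 1), ("bf16", 2)]:
    print(f"{name}: ~{params * bytes_per_param / 1e9:.0f} GB of weights")
# 8-bit quantized: ~314 GB of weights
# bf16: ~628 GB of weights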

Use cases and audience

This repository is primarily intended for researchers, reproducibility-focused engineers, and teams who want to inspect and validate the Grok-1 weights, reproduce inference behavior, or use the model for experimentation. It is not an optimized production inference stack; further engineering is needed for efficient MoE inference at scale.

Additional notes
  • The README includes both a magnet link for torrent-based download and explicit Hugging Face Hub commands to retrieve the model checkpoint.
  • The project is published by the GitHub organization xai-org; both the repository code and the included Grok-1 weights are released under the Apache 2.0 license, as described in the README.

Information

  • Website: github.com
  • Authors: xai-org
  • Published date: 2024/03/17
