
This model was released on 2024-02-01 and added to Hugging Face Transformers on 2024-11-25. It was contributed by shanearora.

# OLMo2

Supported attention implementations: FlashAttention, SDPA.

OLMo2 is the next-generation fully open language model series, featuring dense autoregressive architectures with improved training stability and per-token efficiency. It introduces a new pretraining data mixture, Dolmino Mix 1124, which enhances downstream task performance when applied in late-stage curriculum training. The OLMo 2-Instruct variant incorporates permissive instruction data and reinforcement learning with verifiable rewards (RLVR), following best practices from Tülu 3. Models at 7B and 13B scales are fully open, competitive with or surpassing comparable open-weight models like Llama 3.1 and Qwen 2.5, and all code, data, and checkpoints are publicly released.

```python
# Text generation with the high-level Pipeline API
import torch
from transformers import pipeline

pipeline = pipeline(task="text-generation", model="allenai/OLMo-2-0425-1B", dtype="auto")
pipeline("Plants create energy through a process known as photosynthesis.")
```

```python
# Text generation with AutoModelForCausalLM and AutoTokenizer
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0425-1B")
model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B", dtype="auto")

inputs = tokenizer("Plants create energy through a process known as photosynthesis.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```

## Usage tips

- OLMo2 uses RMSNorm instead of standard layer norm. The RMSNorm is applied to the attention queries and keys, and it is applied after the attention and feedforward layers rather than before them (see the sketch after this list).
- OLMo2 requires Transformers v4.48 or higher.
- Load specific intermediate checkpoints by adding the `revision` parameter to [`~AutoModel.from_pretrained`], as shown in the loading example below.
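
The reordered normalization can be pictured with a short PyTorch sketch. This is an illustrative toy, not the actual Olmo2 modeling code in Transformers: the class and argument names (`TinyQKNormAttention`, `PostNormBlock`, `hidden_size`) are invented for this example, it uses a single attention head, and `nn.RMSNorm` assumes PyTorch 2.4 or newer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyQKNormAttention(nn.Module):
    """Minimal single-head self-attention with RMSNorm on queries and keys (QK-norm)."""

    def __init__(self, hidden_size):
        super().__init__()
        self.q_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.k_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.v_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.o_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.q_norm = nn.RMSNorm(hidden_size)  # norm on the projected queries
        self.k_norm = nn.RMSNorm(hidden_size)  # norm on the projected keys

    def forward(self, x):
        q = self.q_norm(self.q_proj(x))
        k = self.k_norm(self.k_proj(x))
        v = self.v_proj(x)
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(attn)


class PostNormBlock(nn.Module):
    """Transformer block where RMSNorm is applied to the sublayer outputs (post-norm)."""

    def __init__(self, hidden_size):
        super().__init__()
        self.attn = TinyQKNormAttention(hidden_size)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 4 * hidden_size, bias=False),
            nn.SiLU(),
            nn.Linear(4 * hidden_size, hidden_size, bias=False),
        )
        self.post_attn_norm = nn.RMSNorm(hidden_size)
        self.post_mlp_norm = nn.RMSNorm(hidden_size)

    def forward(self, x):
        # The norm sits after each sublayer, inside the residual branch,
        # instead of before the sublayer as in pre-norm transformers.
        x = x + self.post_attn_norm(self.attn(x))
        x = x + self.post_mlp_norm(self.mlp(x))
        return x


block = PostNormBlock(64)
print(block(torch.randn(1, 8, 64)).shape)  # torch.Size([1, 8, 64])
```

The two points the sketch shows are that RMSNorm wraps the projected queries and keys, and that each residual branch is normalized after its sublayer rather than before it.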
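Below is a minimal sketch of loading an intermediate checkpoint with the `revision` parameter. The revision string is a hypothetical placeholder; replace it with a real branch or tag listed on the model's Hub page.

```python
from transformers import AutoModelForCausalLM

# "stage1-step140000-tokens294B" is a hypothetical revision name; replace it with
# an actual branch or tag from the model repo's revision list on the Hub.
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-2-0425-1B",
    revision="stage1-step140000-tokens294B",
    dtype="auto",
)
```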

## Olmo2Config

[[autodoc]] Olmo2Config

## Olmo2Model

[[autodoc]] Olmo2Model
    - forward

## Olmo2ForCausalLM

[[autodoc]] Olmo2ForCausalLM
    - forward