3.4 KiB
This model was released on 2025-08-05 and added to Hugging Face Transformers on 2025-08-05 and contributed by [](https://huggingface.co/).
GptOss
GptOss are open-weight reasoning models built on a mixture-of-experts transformer for improved accuracy and lower inference cost. Training combines large-scale distillation with reinforcement learning, and the models are optimized for agentic tasks like web research, Python execution, and integration with external developer tools. They use a rendered chat format to improve instruction following and role clarity, and demonstrate strong performance across benchmarks in math, coding, and safety. All components—model weights, inference systems, tool environments, and tokenizers are released under Apache 2.0 for broad accessibility.
import torch
from transformers import pipeline
pipeline = pipeline(task="text-generation", model="openai/gpt-oss-20b", dtype="auto",)
pipeline("Plants create energy through a process known as photosynthesis.")
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", dtype="auto",)
inputs = tokenizer("Plants create energy through a process known as photosynthesis.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
Usage tips
- Attention sinks with flex attention require special handling. Unlike standard attention implementations where sinks add directly to attention scores, flex attention
score_mod
function operates on individual score elements rather than the full attention matrix. - Apply attention sinks renormalization after flex attention computations. Renormalize the outputs using the log-sum-exp (LSE) values returned by flex attention.
GptOssConfig
autodoc GptOssConfig
GptOssModel
autodoc GptOssModel - forward
GptOssForCausalLM
autodoc GptOssForCausalLM - forward
GptOssForSequenceClassification
autodoc GptOssForSequenceClassification - forward
GptOssForTokenClassification
autodoc GptOssForTokenClassification - forward