
This model was released on 2022-04-14 and added to Hugging Face Transformers on 2022-05-24.


# GPT-NeoX

GPT-NeoX is a 20-billion-parameter dense autoregressive language model trained on The Pile dataset, released with open weights and code under a permissive license. At the time of publication, it was the largest publicly available model of its kind. The paper details its architecture, training process, and evaluation across language understanding, mathematics, and knowledge tasks. Results show it excels as a few-shot reasoner, achieving stronger five-shot performance gains than comparably sized GPT-3 and FairSeq models.

```py
import torch
from transformers import pipeline

pipeline = pipeline(task="text-generation", model="EleutherAI/gpt-neox-20b", dtype="auto")
pipeline("Plants create energy through a process known as photosynthesis.")
```
```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b", dtype="auto")

inputs = tokenizer("Plants create energy through a process known as photosynthesis.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```

## Usage tips

- GPT-NeoX-20B uses a different tokenizer than GPT-J-6B and GPT-Neo. The new tokenizer allocates additional tokens to whitespace characters, which makes the model better suited to code generation. A minimal comparison is sketched below.
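
The snippet below sketches that difference by tokenizing an indented Python line with both the GPT-NeoX-20B tokenizer and the GPT-2 tokenizer (used here as a stand-in for the tokenizer family shared by GPT-Neo and GPT-J); exact token splits and counts may vary with tokenizer versions.

```py
from transformers import AutoTokenizer

neox_tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
gpt2_tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

code = "def add(a, b):\n        return a + b"

# The GPT-NeoX tokenizer maps runs of spaces to dedicated whitespace tokens,
# so the indented line usually encodes to fewer tokens than with the GPT-2 tokenizer.
print(len(neox_tokenizer(code)["input_ids"]))
print(len(gpt2_tokenizer(code)["input_ids"]))
print(neox_tokenizer.tokenize(code))
```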

## GPTNeoXConfig

[[autodoc]] GPTNeoXConfig
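
For quick experiments, the configuration can also be used to build a small, randomly initialized GPT-NeoX model from scratch. The sketch below uses arbitrary example sizes (not the 20B settings) with standard `GPTNeoXConfig` arguments.

```py
from transformers import GPTNeoXConfig, GPTNeoXForCausalLM

# Example sizes only; the defaults correspond to a much larger model.
config = GPTNeoXConfig(
    vocab_size=50432,
    hidden_size=512,
    num_hidden_layers=6,
    num_attention_heads=8,
    intermediate_size=2048,
)

model = GPTNeoXForCausalLM(config)  # randomly initialized, not pretrained
print(f"{model.num_parameters():,} parameters")
```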

## GPTNeoXTokenizerFast

[[autodoc]] GPTNeoXTokenizerFast

## GPTNeoXModel

[[autodoc]] GPTNeoXModel
    - forward

## GPTNeoXForCausalLM

[[autodoc]] GPTNeoXForCausalLM
    - forward

## GPTNeoXForQuestionAnswering

[[autodoc]] GPTNeoXForQuestionAnswering
    - forward

## GPTNeoXForSequenceClassification

[[autodoc]] GPTNeoXForSequenceClassification
    - forward

## GPTNeoXForTokenClassification

[[autodoc]] GPTNeoXForTokenClassification
    - forward