This model was released on 2022-04-14 and added to Hugging Face Transformers on 2022-05-24.
GPT-NeoX
GPT-NeoX is a 20-billion-parameter dense autoregressive language model trained on The Pile dataset, released with open weights and code under a permissive license. At the time of publication, it was the largest publicly available model of its kind. The paper details its architecture, training process, and evaluation across language understanding, mathematics, and knowledge tasks. Results show it excels as a few-shot reasoner, achieving stronger five-shot performance gains than comparably sized GPT-3 and FairSeq models.
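Run text generation with the high-level Pipeline API: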
import torch
from transformers import pipeline

# load the 20B checkpoint; dtype="auto" uses the dtype stored in the checkpoint's config
pipeline = pipeline(task="text-generation", model="EleutherAI/gpt-neox-20b", dtype="auto")
pipeline("Plants create energy through a process known as photosynthesis.")
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b", dtype="auto")

# tokenize the prompt and generate a continuation of at most 50 tokens (prompt included)
inputs = tokenizer("Plants create energy through a process known as photosynthesis.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
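Note that both snippets load the full 20-billion-parameter checkpoint, which takes roughly 40 GB of memory in half precision. If that does not fit on a single device, passing device_map="auto" to from_pretrained (with Accelerate installed) is a common way to spread the weights across the available GPUs and CPU.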
Usage tips
- GPT-NeoX-20B uses a different tokenizer from the one used in GPT-J-6B and GPT-Neo. The new tokenizer allocates additional tokens to whitespace characters, making the model more suitable for certain tasks like code generation; the sketch below illustrates the difference.
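As a rough illustration, the sketch below tokenizes an indented code snippet with the GPT-NeoX tokenizer and with the GPT-2 style tokenizer that GPT-Neo and GPT-J inherit. Loading the "gpt2" checkpoint for the comparison is an assumption made for this example, and exact token counts depend on the tokenizer versions installed.

from transformers import AutoTokenizer

# GPT-NeoX-20B tokenizer vs. the GPT-2 style tokenizer used by GPT-Neo / GPT-J
neox_tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
gpt2_tok = AutoTokenizer.from_pretrained("gpt2")

# indented code is whitespace-heavy, which is where the extra whitespace tokens help
snippet = "def add(a, b):\n        return a + b"

print("GPT-NeoX token count:", len(neox_tok(snippet)["input_ids"]))
print("GPT-2 token count:   ", len(gpt2_tok(snippet)["input_ids"]))

The GPT-NeoX tokenizer is generally expected to produce fewer tokens for such inputs, since runs of spaces map to dedicated whitespace tokens rather than being encoded character by character.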
GPTNeoXConfig
autodoc GPTNeoXConfig
GPTNeoXTokenizerFast
autodoc GPTNeoXTokenizerFast
GPTNeoXModel
autodoc GPTNeoXModel - forward
GPTNeoXForCausalLM
autodoc GPTNeoXForCausalLM - forward
GPTNeoXForQuestionAnswering
autodoc GPTNeoXForQuestionAnswering - forward
GPTNeoXForSequenceClassification
autodoc GPTNeoXForSequenceClassification - forward
GPTNeoXForTokenClassification
autodoc GPTNeoXForTokenClassification - forward