This model was released on 2023-02-27, added to Hugging Face Transformers on 2023-03-16, and contributed by zphang and BlackSamorez.

Supported features: FlashAttention, SDPA, tensor parallelism

Llama

The paper LLaMA: Open and Efficient Foundation Language Models presents a series of foundation language models ranging from 7B to 65B parameters. The models are trained on trillions of tokens and show that state-of-the-art results can be achieved using only publicly available datasets. Specifically, LLaMA-13B surpasses GPT-3 (175B) on most benchmarks, while LLaMA-65B is competitive with top models such as Chinchilla-70B and PaLM-540B. All models are released to the research community.

# Option 1: text generation with the high-level pipeline API
from transformers import pipeline

# Name the instance `pipe` so it does not shadow the imported `pipeline` factory.
pipe = pipeline(task="text-generation", model="huggyllama/llama-7b", dtype="auto")
print(pipe("Plants create energy through a process known as photosynthesis."))

# Option 2: the same generation with AutoModelForCausalLM and AutoTokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b", dtype="auto")

# Tokenize the prompt into PyTorch tensors.
inputs = tokenizer("Plants create energy through a process known as photosynthesis.", return_tensors="pt")
# max_length counts the prompt tokens plus the newly generated tokens.
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))

Usage tips

  • The tokenizer is a byte-pair encoding model based on SentencePiece. During decoding, if the first token starts a word (like "Banana"), the tokenizer doesn't prepend the prefix space to the string.
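The decoding behavior in the tip above can be illustrated with a toy sketch in plain Python. This is not the `LlamaTokenizer` implementation. The helper `toy_decode` and the hand-written piece lists are hypothetical and exist only for illustration. The sketch assumes the SentencePiece convention of marking the start of a word with the `▁` character (U+2581).

```python
# Toy illustration only: NOT how LlamaTokenizer is implemented.
# Assumption: pieces follow the SentencePiece convention in which
# U+2581 marks the start of a word.
WORD_START = "\u2581"


def toy_decode(pieces: list[str]) -> str:
    """Join pieces into text (hypothetical helper).

    Word-start markers become spaces, except that no space is kept
    in front of the very first piece when it starts a word.
    """
    text = "".join(pieces).replace(WORD_START, " ")
    if pieces and pieces[0].startswith(WORD_START):
        # The first piece starts a word: drop the leading prefix space.
        text = text[1:]
    return text


# A single word-initial piece decodes without a leading space.
single = toy_decode([WORD_START + "Banana"])
print(repr(single))  # 'Banana'

# Later word-start markers still turn into spaces between words.
sentence = toy_decode([WORD_START + "I", WORD_START + "like", WORD_START + "Ban", "ana"])
print(repr(sentence))  # 'I like Banana'
```

In practice this means that decoding a fragment on its own and concatenating it onto earlier decoded text can lose the space that separated the words, so it is usually safer to decode the full token sequence at once.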

LlamaConfig

autodoc LlamaConfig

LlamaTokenizer

autodoc LlamaTokenizer - build_inputs_with_special_tokens - get_special_tokens_mask - create_token_type_ids_from_sequences - save_vocabulary

LlamaTokenizerFast

autodoc LlamaTokenizerFast - build_inputs_with_special_tokens - get_special_tokens_mask - create_token_type_ids_from_sequences - update_post_processor - save_vocabulary

LlamaModel

autodoc LlamaModel - forward

LlamaForCausalLM

autodoc LlamaForCausalLM - forward

LlamaForSequenceClassification

autodoc LlamaForSequenceClassification - forward

LlamaForQuestionAnswering

autodoc LlamaForQuestionAnswering - forward

LlamaForTokenClassification

autodoc LlamaForTokenClassification - forward