# Community Tutorials
Community tutorials are made by active members of the Hugging Face community who want to share their knowledge and expertise with others. They are a great way to learn about the library and its features, and to get started with core classes and modalities.
## Language Models

### Tutorials
| Task | Class | Description | Author | Tutorial | Colab |
| --- | --- | --- | --- | --- | --- |
| Reinforcement Learning | [`GRPOTrainer`] | Efficient Online Training with GRPO and vLLM in TRL | Sergio Paniego | Link | |
| Reinforcement Learning | [`GRPOTrainer`] | Post training an LLM for reasoning with GRPO in TRL | Sergio Paniego | Link | |
| Reinforcement Learning | [`GRPOTrainer`] | Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial | Philipp Schmid | Link | |
| Reinforcement Learning | [`GRPOTrainer`] | RL on LLaMA 3.1-8B with GRPO and Unsloth optimizations | Andrea Manzoni | Link | |
| Instruction tuning | [`SFTTrainer`] | Fine-tuning Google Gemma LLMs using ChatML format with QLoRA | Philipp Schmid | Link | |
| Structured Generation | [`SFTTrainer`] | Fine-tuning Llama-2-7B to generate Persian product catalogs in JSON using QLoRA and PEFT | Mohammadreza Esmaeilian | Link | |
| Preference Optimization | [`DPOTrainer`] | Align Mistral-7b using Direct Preference Optimization for human preference alignment | Maxime Labonne | Link | |
| Preference Optimization | [`ORPOTrainer`] | Fine-tuning Llama 3 with ORPO combining instruction tuning and preference alignment | Maxime Labonne | Link | |
| Instruction tuning | [`SFTTrainer`] | How to fine-tune open LLMs in 2025 with Hugging Face | Philipp Schmid | Link | |
### Videos
| Task | Title | Author | Video |
| --- | --- | --- | --- |
| Instruction tuning | Fine-tuning open AI models using Hugging Face TRL | Wietse Venema | |
| Instruction tuning | How to fine-tune a smol-LM with Hugging Face, TRL, and the smoltalk Dataset | Mayurji | |
⚠️ **Deprecated features notice for "How to fine-tune a smol-LM with Hugging Face, TRL, and the smoltalk Dataset"**

The tutorial uses two deprecated features:

- `SFTTrainer(..., tokenizer=tokenizer)`: use `SFTTrainer(..., processing_class=tokenizer)` instead, or simply omit it (it will be inferred from the model).
- `setup_chat_format(model, tokenizer)`: use `SFTConfig(..., chat_template_path="Qwen/Qwen3-0.6B")`, where `chat_template_path` specifies the model whose chat template you want to copy.
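If you follow that tutorial with a current TRL release, the migrated setup might look like the minimal sketch below. It only illustrates the two replacements named above; the model and dataset names are illustrative, not taken from the tutorial, and `chat_template_path` is only needed when the base model lacks a chat template.

```python
# Minimal sketch of the non-deprecated API (assumes a recent TRL release;
# model and dataset names are illustrative, not from the tutorial).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")

training_args = SFTConfig(
    output_dir="smollm2-sft",
    # Replaces setup_chat_format(model, tokenizer): copy the chat
    # template from the named model instead of patching the tokenizer.
    chat_template_path="Qwen/Qwen3-0.6B",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=load_dataset(
        "HuggingFaceTB/smoltalk", "everyday-conversations", split="train"
    ),
    processing_class=tokenizer,  # replaces the deprecated tokenizer=... argument
)
trainer.train()
```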
## Vision Language Models

### Tutorials
| Task | Class | Description | Author | Tutorial | Colab |
| --- | --- | --- | --- | --- | --- |
| Visual QA | [`SFTTrainer`] | Fine-tuning Qwen2-VL-7B for visual question answering on ChartQA dataset | Sergio Paniego | Link | |
| Visual QA | [`SFTTrainer`] | Fine-tuning SmolVLM with TRL on a consumer GPU | Sergio Paniego | Link | |
| SEO Description | [`SFTTrainer`] | Fine-tuning Qwen2-VL-7B for generating SEO-friendly descriptions from images | Philipp Schmid | Link | |
| Visual QA | [`DPOTrainer`] | PaliGemma 🤝 Direct Preference Optimization | Merve Noyan | Link | |
| Visual QA | [`DPOTrainer`] | Fine-tuning SmolVLM using direct preference optimization (DPO) with TRL on a consumer GPU | Sergio Paniego | Link | |
| Object Detection Grounding | [`SFTTrainer`] | Fine tuning a VLM for Object Detection Grounding using TRL | Sergio Paniego | Link | |
| Visual QA | [`DPOTrainer`] | Fine-Tuning a Vision Language Model with TRL using MPO | Sergio Paniego | Link | |
| Reinforcement Learning | [`GRPOTrainer`] | Post training a VLM for reasoning with GRPO using TRL | Sergio Paniego | Link | |
## Contributing
If you have a tutorial that you would like to add to this list, please open a PR to add it. We will review it and merge it if it is relevant to the community.