add docs

2025-10-20 15:33:48 +08:00 · 2023-03-24 13:16:26 +05:30
parent 098962fa65
commit 3d00af4799
14 changed files with 749 additions and 4 deletions
--- a/.github/workflows/build_documentation.yml
+++ b/.github/workflows/build_documentation.yml
@ -0,0 +1,17 @@
+name: Build documentation
+
+on:
+  push:
+    branches:
+      - main
+      - doc-builder*
+      - v*-release
+
+jobs:
+  build:
+    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
+    with:
+      commit_sha: ${{ github.sha }}
+      package: peft
+    secrets:
+      token: ${{ secrets.HUGGINGFACE_PUSH }}
--- a/.github/workflows/build_pr_documentation.yml
+++ b/.github/workflows/build_pr_documentation.yml
@ -0,0 +1,16 @@
+name: Build PR Documentation
+
+on:
+  pull_request:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+jobs:
+  build:
+    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
+    with:
+      commit_sha: ${{ github.event.pull_request.head.sha }}
+      pr_number: ${{ github.event.number }}
+      package: peft
--- a/.github/workflows/delete_doc_comment.yml
+++ b/.github/workflows/delete_doc_comment.yml
@ -0,0 +1,13 @@
+name: Delete dev documentation
+
+on:
+  pull_request:
+    types: [ closed ]
+
+
+jobs:
+  delete:
+    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
+    with:
+      pr_number: ${{ github.event.number }}
+      package: peft
--- a/6
+++ b/6
@ -1,6 +1,6 @@
 .PHONY: quality style test docs

-check_dirs := src tests examples
+check_dirs := src tests examples docs

 # Check that source code meets quality standards

@ -8,13 +8,13 @@ check_dirs := src tests examples
 quality:
 	black --check $(check_dirs)
 	ruff $(check_dirs)
-	doc-builder style src tests --max_len 119 --check_only
+	doc-builder style src tests docs --max_len 119 --check_only

 # Format source code automatically and check if there are any problems left that need manual fixing
 style:
 	black $(check_dirs)
 	ruff $(check_dirs) --fix
-	doc-builder style src tests --max_len 119
+	doc-builder style src tests docs --max_len 119

 test:
 	pytest tests/
--- a/docs/Makefile
+++ b/docs/Makefile
@ -0,0 +1,19 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = sphinx-build
+SOURCEDIR     = source
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
--- a/docs/README.md
+++ b/docs/README.md
@ -0,0 +1,267 @@
+<!---
+Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Generating the documentation
+
+To generate the documentation, you first have to build it. Several packages are necessary to build the docs;
+you can install them with the following command, run from the root of the code repository:
+
+```bash
+pip install -e ".[docs]"
+```
+
+Then you need to install our special tool that builds the documentation:
+
+```bash
+pip install git+https://github.com/huggingface/doc-builder
+```
+
+---
+**NOTE**
+
+You only need to generate the documentation to inspect it locally (if you're planning changes and want to
+check how they look before committing for instance). You don't have to commit the built documentation.
+
+---
+
+## Building the documentation
+
+Once you have set up `doc-builder` and the additional packages, you can generate the documentation by
+typing the following command:
+
+```bash
+doc-builder build accelerate docs/source/ --build_dir ~/tmp/test-build
+```
+
+You can adapt the `--build_dir` to set any temporary folder that you prefer. This command will create it and generate
+the MDX files that will be rendered as the documentation on the main website. You can inspect them in your favorite
+Markdown editor.
+
+## Previewing the documentation
+
+To preview the docs, first install the `watchdog` module with:
+
+```bash
+pip install watchdog
+```
+
+Then run the following command:
+
+```bash
+doc-builder preview {package_name} {path_to_docs}
+```
+
+For example:
+
+```bash
+doc-builder preview transformers docs/source/en/
+```
+
+The docs will be viewable at [http://localhost:3000](http://localhost:3000). You can also preview the docs once you have opened a PR: a bot will add a comment with a link to where the documentation with your changes lives.
+
+---
+**NOTE**
+
+The `preview` command only works with existing doc files. When you add a completely new file, you need to update `_toctree.yml` and restart the `preview` command (`ctrl-c` to stop it, then call `doc-builder preview ...` again).
+
+---
+
+## Adding a new element to the navigation bar
+
+Accepted files are Markdown (.md or .mdx).
+
+Create a file with its extension and put it in the source directory. You can then link it to the toc-tree by putting
+the filename without the extension in the [`_toctree.yml`](https://github.com/huggingface/accelerate/blob/main/docs/source/_toctree.yml) file.
+
+## Renaming section headers and moving sections
+
+It helps to keep the old links working when renaming a section header and/or moving sections from one document to another. Old links are likely to be used in Issues, Forums, and social media, and it makes for a much better user experience if users reading those months later can still easily navigate to the originally intended information.
+
+Therefore, we simply keep a little map of moved sections at the end of the document where the original section was. The key is to preserve the original anchor.
+
+So if you renamed a section from: "Section A" to "Section B", then you can add at the end of the file:
+
+```
+Sections that were moved:
+
+[ <a href="#section-b">Section A</a><a id="section-a"></a> ]
+```
+and of course, if you moved it to another file, then:
+
+```
+Sections that were moved:
+
+[ <a href="../new-file#section-b">Section A</a><a id="section-a"></a> ]
+```
+
+Use the relative style to link to the new file so that the versioned docs continue to work.
+
+
+## Writing Documentation - Specification
+
+The `huggingface/peft` documentation follows the
+[Google documentation](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) style for docstrings,
+although we can write them directly in Markdown.
+
+### Adding a new tutorial
+
+Adding a new tutorial or section is done in two steps:
+
+- Add a new file under `./source`. This file should be Markdown (.md or .mdx).
+- Link that file in `./source/_toctree.yml` on the correct toc-tree.
+
+Make sure to put your new file under the proper section. It's unlikely to go in the first section (*Get Started*), so
+depending on the intended targets (beginners, more advanced users, or researchers) it should go in sections two, three, or
+four.
+
+### Writing source documentation
+
+Values that should be put in `code` should be surrounded by backticks: \`like so\`. Note that argument names
+and objects like True, None, or any strings should usually be put in `code`.
+
+When mentioning a class, function, or method, it is recommended to use our syntax for internal links so that our tool
+adds a link to its documentation with this syntax: \[\`XXXClass\`\] or \[\`function\`\]. This requires the class or 
+function to be in the main package.
+
+If you want to create a link to some internal class or function, you need to
+provide its path. For instance: \[\`utils.gather\`\]. This will be converted into a link with
+`utils.gather` in the description. To get rid of the path and only keep the name of the object you are
+linking to in the description, add a ~: \[\`~utils.gather\`\] will generate a link with `gather` in the description.
+
+The same works for methods, so you can either use \[\`XXXClass.method\`\] or \[\`~XXXClass.method\`\].
+
+#### Defining arguments in a method
+
+Arguments should be defined with the `Args:` (or `Arguments:` or `Parameters:`) prefix, followed by a line return and
+an indentation. The argument should be followed by its type, with its shape if it is a tensor, a colon, and its
+description:
+
+```
+    Args:
+        n_layers (`int`): The number of layers of the model.
+```
+
+If the description is too long to fit in one line (more than 119 characters in total), another indentation is necessary 
+before writing the description after the argument.
+
+Finally, to maintain uniformity, if any *one* description is too long to fit on one line, the
+rest of the parameters should follow suit and have an indentation before their description.
+
+Here's an example showcasing everything so far:
+
+```
+    Args:
+        gradient_accumulation_steps (`int`, *optional*, defaults to 1):
+            The number of steps that should pass before gradients are accumulated. A number > 1 should be combined with `Accelerator.accumulate`.
+        cpu (`bool`, *optional*):
+            Whether or not to force the script to execute on CPU. Will ignore GPU available if set to `True` and force the execution on one process only.
+```
+
+For optional arguments or arguments with defaults we follow the following syntax: imagine we have a function with the
+following signature:
+
+```
+def my_function(x: str = None, a: float = 1):
+```
+
+then its documentation should look like this:
+
+```
+    Args:
+        x (`str`, *optional*):
+            This argument controls ... and has a description longer than 119 chars.
+        a (`float`, *optional*, defaults to 1):
+            This argument is used to ... and has a description longer than 119 chars.
+```
+
+Note that we always omit the "defaults to \`None\`" when None is the default for any argument. Also note that even
+if the first line describing your argument type and its default gets long, you can't break it on several lines. You can
+however write as many lines as you want in the indented description (see the example above).
+
+#### Writing a multi-line code block
+
+Multi-line code blocks can be useful for displaying examples. They are done between two lines of three backticks as usual in Markdown:
+
+
+````
+```python
+# first line of code
+# second line
+# etc
+```
+````
+
+#### Writing a return block
+
+The return block should be introduced with the `Returns:` prefix, followed by a line return and an indentation.
+The first line should be the type of the return, followed by a line return. No need to indent further for the elements
+building the return.
+
+Here's an example of a single value return:
+
+```
+    Returns:
+        `List[int]`: A list of integers in the range [0, 1] --- 1 for a special token, 0 for a sequence token.
+```
+
+Here's an example of a tuple return, comprising several objects:
+
+```
+    Returns:
+        `tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
+        - **loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --
+          Total loss is the sum of the masked language modeling loss and the next sequence prediction (classification) loss.
+        - **prediction_scores** (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --
+          Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
+```
+
+## Styling the docstring
+
+We have an automatic script running with the `make style` command that will make sure that:
+- the docstrings fully take advantage of the line width
+- all code examples are formatted using black, like the code of the Transformers library
+
+This script may have some weird failures if you made a syntax mistake or if you uncover a bug. Therefore, it's
+recommended to commit your changes before running `make style`, so you can revert the changes done by that script
+easily.
+
+## Writing documentation examples
+
+The syntax for Example docstrings can look as follows:
+
+```
+    Example:
+
+    ```python
+    >>> import time
+    >>> from accelerate import Accelerator
+    >>> accelerator = Accelerator()
+    >>> if accelerator.is_main_process:
+    ...     time.sleep(2)
+    ... else:
+    ...     print("I'm waiting for the main process to finish its sleep...")
+    >>> accelerator.wait_for_everyone()
+    >>> # Should print on every process at the same time
+    >>> print("Everyone is here")
+    ```
+```
+
+The docstring should give a minimal, clear example of how the respective function 
+is to be used in inference and also include the expected (ideally sensible)
+output.
+Often, readers will try out the example before even going through the function 
+or class definitions. Therefore, it is of utmost importance that the example 
+works as expected.
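
The docstring conventions described in `docs/README.md` above (an `Args:` block with typed, indented descriptions, `*optional*` markers with defaults, and a `Returns:` block) can be sketched as follows. The function `scale_values` and its parameters are hypothetical and exist only to illustrate the layout; the `Example:` section is left out here to keep the sketch short.

```python
def scale_values(values, factor=1.0, clip=None):
    """
    Multiply every value by a constant factor (hypothetical helper, for illustration only).

    Args:
        values (`List[float]`):
            The numbers to scale.
        factor (`float`, *optional*, defaults to 1.0):
            The multiplier applied to each value.
        clip (`float`, *optional*):
            If set, results larger than this bound are capped at it.

    Returns:
        `List[float]`: The scaled values, in the same order as the input.
    """
    # Scale first, then optionally cap each result at the clip bound.
    scaled = [v * factor for v in values]
    if clip is not None:
        scaled = [min(v, clip) for v in scaled]
    return scaled


print(scale_values([1.0, 3.0], factor=2.0, clip=5.0))
```

Note that `clip` defaults to `None`, so its entry omits the "defaults to" part, while every description is indented because at least one of them would not fit on the first line.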
--- a/docs/_toctree.yml
+++ b/docs/_toctree.yml
@ -0,0 +1,16 @@
+- title: Get Started
+  sections:
+  - local: index
+    title: 🤗 PEFT
+  - local: quicktour
+    title: Quicktour
+  - local: install
+    title: Installation
+- title: Reference
+  sections:
+  - local: package_reference/peft_model
+    title: PEFT model
+  - local: package_reference/config
+    title: Configuration
+  - local: package_reference/tuners
+    title: Tuners
--- a/docs/index.mdx
+++ b/docs/index.mdx
@ -0,0 +1,49 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# PEFT
+
+🤗 PEFT is a library that enables the use of state-of-the-art Parameter-Efficient Fine-Tuning (PEFT) methods.
+
+PEFT methods enable efficient adaptation of pre-trained language models (PLMs) to 
+various downstream applications without fine-tuning all the model's parameters. 
+Fine-tuning large-scale PLMs is often prohibitively costly. 
+In this regard, PEFT methods only fine-tune a small number of (extra) model parameters, 
+thereby greatly decreasing the computational and storage costs. 
+Recent state-of-the-art PEFT techniques achieve performance comparable to that of full fine-tuning.
+
+🤗 PEFT is seamlessly integrated with 🤗 Accelerate for large-scale models, leveraging DeepSpeed and Big Model Inference.
+
+Supported methods, with more coming soon:
+
+1. LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2106.09685.pdf)
+2. Prefix Tuning: [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://aclanthology.org/2021.acl-long.353/), [P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf)
+3. P-Tuning: [GPT Understands, Too](https://arxiv.org/pdf/2103.10385.pdf)
+4. Prompt Tuning: [The Power of Scale for Parameter-Efficient Prompt Tuning](https://arxiv.org/pdf/2104.08691.pdf) 
+
+## Getting started
+
+```python
+from transformers import AutoModelForSeq2SeqLM
+from peft import get_peft_config, get_peft_model, LoraConfig, TaskType
+
+model_name_or_path = "bigscience/mt0-large"
+tokenizer_name_or_path = "bigscience/mt0-large"
+
+peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
+
+model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
+model = get_peft_model(model, peft_config)
+model.print_trainable_parameters()
+# output: trainable params: 2359296 || all params: 1231940608 || trainable%: 0.19151053100118282
+```
+
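
The trainable-parameter figure printed in the `docs/index.mdx` snippet above can be reproduced with a little arithmetic. This is a rough sketch under stated assumptions, none of which appear in the text: that `bigscience/mt0-large` has 24 encoder and 24 decoder layers with 1024-wide attention projections, and that LoRA is attached only to the query and value projections of each attention block.

```python
# Assumed architecture figures for bigscience/mt0-large (not stated in the text above).
hidden_size = 1024      # assumed width of the attention projections
encoder_layers = 24     # assumed
decoder_layers = 24     # assumed
rank = 8                # r=8 from the LoraConfig above

# Assumption: LoRA is attached to the query and value projections only.
# Each encoder layer has one attention block; each decoder layer has two
# (self-attention and cross-attention).
attention_blocks = encoder_layers * 1 + decoder_layers * 2
adapted_modules = attention_blocks * 2  # q and v per block

# Each adapted module gains two low-rank matrices: (hidden x r) and (r x hidden).
params_per_module = rank * (hidden_size + hidden_size)
trainable_params = adapted_modules * params_per_module

all_params = 1231940608  # total reported by print_trainable_parameters() above
trainable_percent = 100 * trainable_params / all_params

print(f"trainable params: {trainable_params} || trainable%: {trainable_percent:.4f}")
```

Under these assumptions the count comes out to 2359296, matching the reported output, which suggests why so small a fraction of the model needs gradients.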
--- a/docs/install.mdx
+++ b/docs/install.mdx
@ -0,0 +1,46 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# Installation and Configuration
+
+Before you start, you will need to set up your environment, install the appropriate packages, and configure 🤗 PEFT. 🤗 PEFT is tested on **Python 3.7+**.
+
+## Installing 🤗 PEFT
+
+🤗 PEFT is available on PyPI, as well as on GitHub. Details on installing from each are below:
+
+### pip 
+
+To install 🤗 PEFT from PyPI, run:
+
+```bash
+pip install peft
+```
+
+### Source
+
+New features are added every day that haven't been released yet. To try them out yourself, install
+from the GitHub repository:
+
+```bash
+pip install git+https://github.com/huggingface/peft
+```
+
+If you're working on contributing to the library or wish to play with the source code and see live 
+results as you run the code, an editable version can be installed from a locally-cloned version of the 
+repository:
+
+```bash
+git clone https://github.com/huggingface/peft
+cd peft
+pip install -e .
+```
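
After either install route in `docs/install.mdx` above, one quick way to confirm which version ended up in the environment is to query the package metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints None when peft is not installed in the current environment.
print("peft version:", installed_version("peft"))
```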
--- a/docs/package_reference/config
+++ b/docs/package_reference/config
--- a/docs/package_reference/peft_model
+++ b/docs/package_reference/peft_model
--- a/docs/package_reference/tuners
+++ b/docs/package_reference/tuners
--- a/docs/quicktour.mdx
+++ b/docs/quicktour.mdx
@ -0,0 +1,300 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# Quick tour
+
+Let's have a look at the 🤗 PEFT main features and traps to avoid.
+
+## Main use
+
+To use 🤗 PEFT in your script, follow the steps below:
+
+1. Create a `PeftConfig` object corresponding to your PEFT method. 
+Please refer to the [Config Page](package_reference/config) for more details. 
+Below, we will use `LoraConfig` for demonstration.
+
+```python
+from peft import LoraConfig, TaskType
+
+peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
+```
+
+Here, `task_type` is the type of task you are training your model for. 
+For available task types, please refer to [TaskType](package_reference/config#peft.config.TaskType).
+
+2. Load the base model you want to fine-tune.
+
+```python
+from transformers import AutoModelForSeq2SeqLM
+
+model_name_or_path = "bigscience/mt0-large"
+tokenizer_name_or_path = "bigscience/mt0-large"
+model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
+```
+
+3. Preprocess your model if you use `bitsandbytes` for INT-8 quantized training; else skip this step. 
+
+```python
+from peft import prepare_model_for_int8_training
+
+model = prepare_model_for_int8_training(model)
+```
+
+4. Wrap your model in the `PeftModel` object using the `get_peft_model` function. Also, check the number of trainable parameters of your model.
+
+```python
+from peft import get_peft_model
+
+model = get_peft_model(model, peft_config)
+model.print_trainable_parameters()
+# output: trainable params: 2359296 || all params: 1231940608 || trainable%: 0.19151053100118282
+```
+
+5. Voila 🎉. Now, train the model using the 🤗 Transformers Trainer API, 🤗 Accelerate, or any custom PyTorch training loop.
+Please refer to [peft_lora_seq2seq.ipynb](https://github.com/huggingface/peft/blob/main/examples/conditional_generation/peft_lora_seq2seq.ipynb) for an end-to-end example.
+
+### Saving/loading a model
+
+1. Save your model using the `save_pretrained` function.
+
+```python
+model.save_pretrained("output_dir")
+# model.push_to_hub("my_awesome_peft_model") also works
+```
+
+This will only save the incremental PEFT weights that were trained. 
+For example, you can find the `bigscience/T0_3B` tuned using LoRA on the `twitter_complaints` raft dataset here: 
+[smangrul/twitter_complaints_bigscience_T0_3B_LORA_SEQ_2_SEQ_LM](https://huggingface.co/smangrul/twitter_complaints_bigscience_T0_3B_LORA_SEQ_2_SEQ_LM). 
+Notice that it only contains 2 files: `adapter_config.json` and `adapter_model.bin` with the latter being just 19MB.
+
+2. Load your model using the `from_pretrained` function.
+
+```diff
+  import torch
+  from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
++ from peft import PeftModel, PeftConfig
+
++ peft_model_id = "smangrul/twitter_complaints_bigscience_T0_3B_LORA_SEQ_2_SEQ_LM"
++ config = PeftConfig.from_pretrained(peft_model_id)
+  model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)
++ model = PeftModel.from_pretrained(model, peft_model_id)
+  tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
+
+  # Run on GPU when available, otherwise fall back to CPU
+  device = "cuda" if torch.cuda.is_available() else "cpu"
+  model = model.to(device)
+  model.eval()
+  inputs = tokenizer("Tweet text : @HondaCustSvc Your customer service has been horrible during the recall process. I will never purchase a Honda again. Label :", return_tensors="pt")
+
+  with torch.no_grad():
+      outputs = model.generate(input_ids=inputs["input_ids"].to(device), max_new_tokens=10)
+      print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])
+  # 'complaint'
+```
+
+## Launching your distributed script
+
+PEFT models work with 🤗 Accelerate out of the box. 
+Use 🤗 Accelerate for distributed training on various hardware such as GPUs and Apple Silicon devices.
+Use 🤗 Accelerate for inference on consumer hardware with limited resources.
+
+### Example of PEFT model training using 🤗 Accelerate's DeepSpeed integration
+
+This example requires DeepSpeed `v0.8.0`. It is provided in `~examples/conditional_generation/peft_lora_seq2seq_accelerate_ds_zero3_offload.py`.
+  a. First, run `accelerate config --config_file ds_zero3_cpu.yaml` and answer the questionnaire. 
+  Below are the contents of the config file.
+  ```yaml
+  compute_environment: LOCAL_MACHINE
+  deepspeed_config:
+    gradient_accumulation_steps: 1
+    gradient_clipping: 1.0
+    offload_optimizer_device: cpu
+    offload_param_device: cpu
+    zero3_init_flag: true
+    zero3_save_16bit_model: true
+    zero_stage: 3
+  distributed_type: DEEPSPEED
+  downcast_bf16: 'no'
+  dynamo_backend: 'NO'
+  fsdp_config: {}
+  machine_rank: 0
+  main_training_function: main
+  megatron_lm_config: {}
+  mixed_precision: 'no'
+  num_machines: 1
+  num_processes: 1
+  rdzv_backend: static
+  same_network: true
+  use_cpu: false
+  ```
+  b. Run the command below to launch the example script:
+  ```bash
+  accelerate launch --config_file ds_zero3_cpu.yaml examples/conditional_generation/peft_lora_seq2seq_accelerate_ds_zero3_offload.py
+  ```
+
+  c. output logs:
+  ```bash
+  GPU Memory before entering the train : 1916
+  GPU Memory consumed at the end of the train (end-begin): 66
+  GPU Peak Memory consumed during the train (max-begin): 7488
+  GPU Total Peak Memory consumed during the train (max): 9404
+  CPU Memory before entering the train : 19411
+  CPU Memory consumed at the end of the train (end-begin): 0
+  CPU Peak Memory consumed during the train (max-begin): 0
+  CPU Total Peak Memory consumed during the train (max): 19411
+  epoch=4: train_ppl=tensor(1.0705, device='cuda:0') train_epoch_loss=tensor(0.0681, device='cuda:0')
+  100%|████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:27<00:00,  3.92s/it]
+  GPU Memory before entering the eval : 1982
+  GPU Memory consumed at the end of the eval (end-begin): -66
+  GPU Peak Memory consumed during the eval (max-begin): 672
+  GPU Total Peak Memory consumed during the eval (max): 2654
+  CPU Memory before entering the eval : 19411
+  CPU Memory consumed at the end of the eval (end-begin): 0
+  CPU Peak Memory consumed during the eval (max-begin): 0
+  CPU Total Peak Memory consumed during the eval (max): 19411
+  accuracy=100.0
+  eval_preds[:10]=['no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint', 'no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint']
+  dataset['train'][label_column][:10]=['no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint', 'no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint']
+  ```
+
+### Example of PEFT model inference using 🤗 Accelerate's Big Model Inferencing capabilities
+An example is provided in `~examples/causal_language_modeling/peft_lora_clm_accelerate_big_model_inference.ipynb`. 
+
+## Model Support matrix
+
+### Causal Language Modeling
+| Model        | LoRA | Prefix Tuning  | P-Tuning | Prompt Tuning  |
+|--------------| ---- | ---- | ---- | ----  |
+| GPT-2        | ✅  | ✅  | ✅  | ✅  |
+| Bloom        | ✅  | ✅  | ✅  | ✅  |
+| OPT          | ✅  | ✅  | ✅  | ✅  |
+| GPT-Neo      | ✅  | ✅  | ✅  | ✅  |
+| GPT-J        | ✅  | ✅  | ✅  | ✅  |
+| GPT-NeoX-20B | ✅  | ✅  | ✅  | ✅  |
+| LLaMA        | ✅  | ✅  | ✅  | ✅  |
+| ChatGLM      | ✅  | ✅  | ✅  | ✅  |
+
+### Conditional Generation
+|   Model         | LoRA | Prefix Tuning  | P-Tuning | Prompt Tuning  | 
+| --------- | ---- | ---- | ---- | ---- |
+| T5        | ✅   | ✅   | ✅   | ✅   |
+| BART      | ✅   | ✅   | ✅   | ✅   |
+
+### Sequence Classification
+|   Model         | LoRA | Prefix Tuning  | P-Tuning | Prompt Tuning  | 
+| --------- | ---- | ---- | ---- | ----  |
+| BERT           | ✅  | ✅  | ✅  | ✅  |  
+| RoBERTa        | ✅  | ✅  | ✅  | ✅  |
+| GPT-2          | ✅  | ✅  | ✅  | ✅  | 
+| Bloom          | ✅  | ✅  | ✅  | ✅  |   
+| OPT            | ✅  | ✅  | ✅  | ✅  |
+| GPT-Neo        | ✅  | ✅  | ✅  | ✅  |
+| GPT-J          | ✅  | ✅  | ✅  | ✅  |
+| Deberta        | ✅  |     | ✅  | ✅  |     
+| Deberta-v2     | ✅  |     | ✅  | ✅  |    
+
+### Token Classification
+|   Model         | LoRA | Prefix Tuning  | P-Tuning | Prompt Tuning  | 
+| --------- | ---- | ---- | ---- | ----  |
+| BERT           | ✅  | ✅  |   |   |  
+| RoBERTa        | ✅  | ✅  |   |   |
+| GPT-2          | ✅  | ✅  |   |   | 
+| Bloom          | ✅  | ✅  |   |   |   
+| OPT            | ✅  | ✅  |   |   |
+| GPT-Neo        | ✅  | ✅  |   |   |
+| GPT-J          | ✅  | ✅  |   |   |
+| Deberta        | ✅  |     |   |   | 
+| Deberta-v2     | ✅  |     |   |   |
+
+### Text-to-Image Generation
+
+|   Model         | LoRA | Prefix Tuning  | P-Tuning | Prompt Tuning  | 
+| --------- | ---- | ---- | ---- | ----  |
+| Stable Diffusion           | ✅  |   |   |   |  
+
+
+### Image Classification
+
+|   Model         | LoRA | Prefix Tuning  | P-Tuning | Prompt Tuning  | 
+| --------- | ---- | ---- | ---- | ----  |
+| ViT           | ✅  |   |   |   | 
+| Swin           | ✅  |   |   |   | 
+
+___Note that we have tested LoRA for [ViT](https://huggingface.co/docs/transformers/model_doc/vit) and [Swin](https://huggingface.co/docs/transformers/model_doc/swin) for fine-tuning on image classification. However, it should be possible to use LoRA for any compatible model [provided](https://huggingface.co/models?pipeline_tag=image-classification&sort=downloads&search=vit) by 🤗 Transformers. Check out the respective
+examples to learn more. If you run into problems, please open an issue.___
+
+The same principle applies to our [segmentation models](https://huggingface.co/models?pipeline_tag=image-segmentation&sort=downloads) as well. 
+
+### Semantic Segmentation
+
+|   Model         | LoRA | Prefix Tuning  | P-Tuning | Prompt Tuning  | 
+| --------- | ---- | ---- | ---- | ----  |
+| SegFormer           | ✅  |   |   |   | 
+
+
+## Other caveats
+
+1. Below is an example of using PyTorch FSDP for training. However, it doesn't lead to 
+any GPU memory savings. Please refer to issue [[FSDP] FSDP with CPU offload consumes 1.65X more GPU memory when training models with most of the params frozen](https://github.com/pytorch/pytorch/issues/91165). 
+
+  ```python
+  from peft.utils.other import fsdp_auto_wrap_policy
+
+
+  if os.environ.get("ACCELERATE_USE_FSDP", None) is not None:
+      accelerator.state.fsdp_plugin.auto_wrap_policy = fsdp_auto_wrap_policy(model)
+
+  model = accelerator.prepare(model)
+  ```
+
+  Example of parameter efficient tuning with [`mt0-xxl`](https://huggingface.co/bigscience/mt0-xxl) base model using 🤗 Accelerate is provided in `~examples/conditional_generation/peft_lora_seq2seq_accelerate_fsdp.py`. 
+  a. First, run `accelerate config --config_file fsdp_config.yaml` and answer the questionnaire. 
+  Below are the contents of the config file.
+  ```yaml
+  command_file: null
+  commands: null
+  compute_environment: LOCAL_MACHINE
+  deepspeed_config: {}
+  distributed_type: FSDP
+  downcast_bf16: 'no'
+  dynamo_backend: 'NO'
+  fsdp_config:
+    fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
+    fsdp_backward_prefetch_policy: BACKWARD_PRE
+    fsdp_offload_params: true
+    fsdp_sharding_strategy: 1
+    fsdp_state_dict_type: FULL_STATE_DICT
+    fsdp_transformer_layer_cls_to_wrap: T5Block
+  gpu_ids: null
+  machine_rank: 0
+  main_process_ip: null
+  main_process_port: null
+  main_training_function: main
+  megatron_lm_config: {}
+  mixed_precision: 'no'
+  num_machines: 1
+  num_processes: 2
+  rdzv_backend: static
+  same_network: true
+  tpu_name: null
+  tpu_zone: null
+  use_cpu: false
+  ```
+  b. Run the command below to launch the example script:
+  ```bash
+  accelerate launch --config_file fsdp_config.yaml examples/conditional_generation/peft_lora_seq2seq_accelerate_fsdp.py
+  ```
+
+2. When using `P_TUNING` or `PROMPT_TUNING` with the `SEQ_2_SEQ` task, remember to remove the `num_virtual_tokens` virtual prompt predictions from the left side of the model outputs during evaluation.
+
+3. For encoder-decoder models, `P_TUNING` and `PROMPT_TUNING` don't support the `generate` functionality of transformers, because `generate` strictly requires `decoder_input_ids`, whereas
+`P_TUNING`/`PROMPT_TUNING` append soft prompt embeddings to `inputs_embeds` to create
+new `inputs_embeds` to be given to the model. Therefore, `generate` doesn't support this yet.
+
+4. When using ZeRO3 with `zero3_init_flag=True`, if you find that GPU memory increases with training steps, you may need to set `zero3_init_flag=false` in the accelerate `config.yaml`. The related issue is [[BUG] memory leak under zero.Init](https://github.com/microsoft/DeepSpeed/issues/2637).
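
Caveat 2 in `docs/quicktour.mdx` above (dropping the virtual prompt predictions from the left side of the outputs during evaluation) amounts to a slice along the sequence dimension. The sketch below uses plain Python lists as a stand-in for model outputs; the helper `strip_virtual_tokens` and the toy values are hypothetical.

```python
def strip_virtual_tokens(batch_predictions, num_virtual_tokens):
    """Drop the first `num_virtual_tokens` positions from every sequence in the batch."""
    return [sequence[num_virtual_tokens:] for sequence in batch_predictions]


# Toy batch: two sequences whose first 3 positions stand in for virtual prompt predictions.
predictions = [[0, 0, 0, 11, 12], [0, 0, 0, 21, 22]]
print(strip_virtual_tokens(predictions, num_virtual_tokens=3))
# prints [[11, 12], [21, 22]]
```

With tensor outputs of shape `(batch_size, sequence_length, ...)`, the equivalent would presumably be a slice such as `preds[:, num_virtual_tokens:]` before computing metrics.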
--- a/examples/lora_dreambooth/train_dreambooth.py
+++ b/examples/lora_dreambooth/train_dreambooth.py
@ -1063,7 +1063,9 @@ def main(args):
                )
                text_encoder_state_dict = {f"text_encoder_{k}": v for k, v in text_encoder_state_dict.items()}
                state_dict.update(text_encoder_state_dict)
-                lora_config["text_encoder_peft_config"] = unwarpped_text_encoder.get_peft_config_as_dict(inference=True)
+                lora_config["text_encoder_peft_config"] = unwarpped_text_encoder.get_peft_config_as_dict(
+                    inference=True
+                )

            accelerator.print(state_dict)
            accelerator.save(state_dict, os.path.join(args.output_dir, f"{args.instance_prompt}_lora.pt"))