mirror of
https://github.com/huggingface/peft.git
synced 2025-10-20 15:33:48 +08:00
add docs
.github/workflows/build_documentation.yml (vendored, new file, 17 lines)
@ -0,0 +1,17 @@
name: Build documentation

on:
  push:
    branches:
      - main
      - doc-builder*
      - v*-release

jobs:
  build:
    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
    with:
      commit_sha: ${{ github.sha }}
      package: accelerate
    secrets:
      token: ${{ secrets.HUGGINGFACE_PUSH }}

.github/workflows/build_pr_documentation.yml (vendored, new file, 16 lines)
@ -0,0 +1,16 @@
name: Build PR Documentation

on:
  pull_request:

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  build:
    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
    with:
      commit_sha: ${{ github.event.pull_request.head.sha }}
      pr_number: ${{ github.event.number }}
      package: peft

.github/workflows/delete_doc_comment.yml (vendored, new file, 13 lines)
@ -0,0 +1,13 @@
name: Delete dev documentation

on:
  pull_request:
    types: [ closed ]


jobs:
  delete:
    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
    with:
      pr_number: ${{ github.event.number }}
      package: peft

Makefile (6 lines changed)
@ -1,6 +1,6 @@
 .PHONY: quality style test docs

-check_dirs := src tests examples
+check_dirs := src tests examples docs

 # Check that source code meets quality standards

@ -8,13 +8,13 @@ check_dirs := src tests examples
 quality:
 	black --check $(check_dirs)
 	ruff $(check_dirs)
-	doc-builder style src tests --max_len 119 --check_only
+	doc-builder style src tests docs --max_len 119 --check_only

 # Format source code automatically and check if there are any problems left that need manual fixing
 style:
 	black $(check_dirs)
 	ruff $(check_dirs) --fix
-	doc-builder style src tests --max_len 119
+	doc-builder style src tests docs --max_len 119

 test:
 	pytest tests/

docs/Makefile (new file, 19 lines)
@ -0,0 +1,19 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS    =
SPHINXBUILD   = sphinx-build
SOURCEDIR     = source
BUILDDIR      = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

docs/README.md (new file, 267 lines)
@ -0,0 +1,267 @@
<!---
Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Generating the documentation

To generate the documentation, you first have to build it. Several packages are necessary to build the docs;
you can install them with the following command, run at the root of the code repository:

```bash
pip install -e ".[docs]"
```

Then you need to install our special tool that builds the documentation:

```bash
pip install git+https://github.com/huggingface/doc-builder
```

---
**NOTE**

You only need to generate the documentation to inspect it locally (if you're planning changes and want to
check how they look before committing, for instance). You don't have to commit the built documentation.

---

## Building the documentation

Once you have set up `doc-builder` and the additional packages, you can generate the documentation by
typing the following command:

```bash
doc-builder build accelerate docs/source/ --build_dir ~/tmp/test-build
```

You can adapt the `--build_dir` to set any temporary folder that you prefer. This command will create it and generate
the MDX files that will be rendered as the documentation on the main website. You can inspect them in your favorite
Markdown editor.

## Previewing the documentation

To preview the docs, first install the `watchdog` module with:

```bash
pip install watchdog
```

Then run the following command:

```bash
doc-builder preview {package_name} {path_to_docs}
```

For example:

```bash
doc-builder preview transformers docs/source/en/
```

The docs will be viewable at [http://localhost:3000](http://localhost:3000). You can also preview the docs once you have opened a PR: a bot will add a comment with a link to where the documentation with your changes lives.

---
**NOTE**

The `preview` command only works with existing doc files. When you add a completely new file, you need to update `_toctree.yml` and restart the `preview` command (`ctrl-c` to stop it and call `doc-builder preview ...` again).

---

## Adding a new element to the navigation bar

Accepted files are Markdown (.md or .mdx).

Create a file with its extension and put it in the source directory. You can then link it to the toc-tree by putting
the filename without the extension in the [`_toctree.yml`](https://github.com/huggingface/accelerate/blob/main/docs/source/_toctree.yml) file.

## Renaming section headers and moving sections

It helps to keep the old links working when renaming a section header and/or moving sections from one document to another. This is because the old links are likely to be used in issues, forums, and social media, and it makes for a much better user experience if readers who come across them months later can still easily navigate to the originally intended information.

Therefore, we simply keep a little map of moved sections at the end of the document where the original section was. The key is to preserve the original anchor.

So if you renamed a section from "Section A" to "Section B", then you can add at the end of the file:

```
Sections that were moved:

[ <a href="#section-b">Section A</a><a id="section-a"></a> ]
```

and of course, if you moved it to another file, then:

```
Sections that were moved:

[ <a href="../new-file#section-b">Section A</a><a id="section-a"></a> ]
```

Use the relative style to link to the new file so that the versioned docs continue to work.


## Writing Documentation - Specification

The `huggingface/accelerate` documentation follows the
[Google documentation](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) style for docstrings,
although we can write them directly in Markdown.

### Adding a new tutorial

Adding a new tutorial or section is done in two steps:

- Add a new file under `./source`. This file can either be ReStructuredText (.rst) or Markdown (.md).
- Link that file in `./source/_toctree.yml` on the correct toc-tree.

Make sure to put your new file under the proper section. It's unlikely to go in the first section (*Get Started*), so
depending on the intended targets (beginners, more advanced users, or researchers) it should go in sections two, three, or
four.

### Writing source documentation

Values that should be put in `code` should be surrounded by backticks: \`like so\`. Note that argument names
and objects like True, None, or any strings should usually be put in `code`.

When mentioning a class, function, or method, it is recommended to use our syntax for internal links so that our tool
adds a link to its documentation with this syntax: \[\`XXXClass\`\] or \[\`function\`\]. This requires the class or
function to be in the main package.

If you want to create a link to some internal class or function, you need to
provide its path. For instance: \[\`utils.gather\`\]. This will be converted into a link with
`utils.gather` in the description. To get rid of the path and only keep the name of the object you are
linking to in the description, add a ~: \[\`~utils.gather\`\] will generate a link with `gather` in the description.

The same works for methods, so you can use either \[\`XXXClass.method\`\] or \[\`~XXXClass.method\`\].
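For instance, a docstring using this link syntax might look like the following sketch (invented here for illustration, not an excerpt from the codebase):

```python
def get_peft_model(model, peft_config):
    """
    Wraps a base model into a [`PeftModel`] according to the given configuration.

    The wrapped model can be saved with [`~PeftModel.save_pretrained`] and reloaded later
    with [`~PeftModel.from_pretrained`].
    """
```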
#### Defining arguments in a method

Arguments should be defined with the `Args:` (or `Arguments:` or `Parameters:`) prefix, followed by a line return and
an indentation. The argument should be followed by its type, with its shape if it is a tensor, a colon, and its
description:

```
Args:
    n_layers (`int`): The number of layers of the model.
```

If the description is too long to fit in one line (more than 119 characters in total), another indentation is necessary
before writing the description after the argument.

Finally, to maintain uniformity, if any *one* description is too long to fit on one line, the
rest of the parameters should follow suit and have an indentation before their description.

Here's an example showcasing everything so far:

```
Args:
    gradient_accumulation_steps (`int`, *optional*, defaults to 1):
        The number of steps that should pass before gradients are accumulated. A number > 1 should be combined with `Accelerator.accumulate`.
    cpu (`bool`, *optional*):
        Whether or not to force the script to execute on CPU. Will ignore GPU available if set to `True` and force the execution on one process only.
```

For optional arguments or arguments with defaults we follow the following syntax: imagine we have a function with the
following signature:

```
def my_function(x: str = None, a: float = 1):
```

then its documentation should look like this:

```
Args:
    x (`str`, *optional*):
        This argument controls ... and has a description longer than 119 chars.
    a (`float`, *optional*, defaults to 1):
        This argument is used to ... and has a description longer than 119 chars.
```

Note that we always omit the "defaults to \`None\`" when None is the default for any argument. Also note that even
if the first line describing your argument type and its default gets long, you can't break it into several lines. You can
however write as many lines as you want in the indented description (see the examples above).
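Putting these conventions together, a complete `Args:` block for a hypothetical helper (the function and its parameters are made up purely for illustration) could look like this:

```python
def shift_tokens_right(input_ids, pad_token_id, decoder_start_token_id=0):
    """
    Shift input ids one token to the right.

    Args:
        input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):
            The token ids to shift. Padding positions are expected to already be filled with
            `pad_token_id`.
        pad_token_id (`int`):
            The id of the padding token.
        decoder_start_token_id (`int`, *optional*, defaults to 0):
            The id placed at the first position of every shifted sequence.
    """
```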
#### Writing a multi-line code block

Multi-line code blocks can be useful for displaying examples. They are done between two lines of three backticks, as usual in Markdown:


````
```python
# first line of code
# second line
# etc
```
````

#### Writing a return block

The return block should be introduced with the `Returns:` prefix, followed by a line return and an indentation.
The first line should be the type of the return, followed by a line return. No need to indent further for the elements
building the return.

Here's an example of a single value return:

```
Returns:
    `List[int]`: A list of integers in the range [0, 1] --- 1 for a special token, 0 for a sequence token.
```

Here's an example of a tuple return, comprising several objects:

```
Returns:
    `tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
    - **loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --
      Total loss is the sum of the masked language modeling loss and the next sequence prediction (classification) loss.
    - **prediction_scores** (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --
      Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
```

## Styling the docstring

We have an automatic script running with the `make style` command that will make sure that:
- the docstrings fully take advantage of the line width
- all code examples are formatted using black, like the code of the Transformers library

This script may have some weird failures if you made a syntax mistake or if you uncover a bug. Therefore, it's
recommended to commit your changes before running `make style`, so you can revert the changes done by that script
easily.

## Writing documentation examples

The syntax for Example docstrings can look as follows:

```
Example:

    ```python
    >>> import time
    >>> from accelerate import Accelerator
    >>> accelerator = Accelerator()
    >>> if accelerator.is_main_process:
    ...     time.sleep(2)
    ... else:
    ...     print("I'm waiting for the main process to finish its sleep...")
    >>> accelerator.wait_for_everyone()
    >>> # Should print on every process at the same time
    >>> print("Everyone is here")
    ```
```

The docstring should give a minimal, clear example of how the respective function
is to be used in inference and also include the expected (ideally sensible)
output.
Often, readers will try out the example before even going through the function
or class definitions. Therefore, it is of utmost importance that the example
works as expected.

docs/_toctree.yml (new file, 16 lines)
@ -0,0 +1,16 @@
- title: Get Started
  sections:
  - local: index
    title: 🤗 PEFT
  - local: quicktour
    title: Quicktour
  - local: installation
    title: Installation
- title: Reference
  sections:
  - local: package_reference/peft_model
    title: PEFT model
  - local: package_reference/configs
    title: Configuration
  - local: package_reference/tuners
    title: Tuners

docs/index.mdx (new file, 49 lines)
@ -0,0 +1,49 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# PEFT

🤗 PEFT is a library that enables the use of state-of-the-art Parameter-Efficient Fine-Tuning (PEFT) methods.

PEFT methods enable efficient adaptation of pre-trained language models (PLMs) to
various downstream applications without fine-tuning all of the model's parameters.
Fine-tuning large-scale PLMs is often prohibitively costly.
PEFT methods instead fine-tune only a small number of (extra) model parameters,
thereby greatly decreasing the computational and storage costs.
Recent state-of-the-art PEFT techniques achieve performance comparable to that of full fine-tuning.

PEFT is seamlessly integrated with 🤗 Accelerate for large-scale models, leveraging DeepSpeed and Big Model Inference.

Supported methods, with more coming soon:

1. LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2106.09685.pdf)
2. Prefix Tuning: [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://aclanthology.org/2021.acl-long.353/), [P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf)
3. P-Tuning: [GPT Understands, Too](https://arxiv.org/pdf/2103.10385.pdf)
4. Prompt Tuning: [The Power of Scale for Parameter-Efficient Prompt Tuning](https://arxiv.org/pdf/2104.08691.pdf)

## Getting started

```python
from transformers import AutoModelForSeq2SeqLM
from peft import get_peft_config, get_peft_model, LoraConfig, TaskType

model_name_or_path = "bigscience/mt0-large"
tokenizer_name_or_path = "bigscience/mt0-large"

peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)

model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# output: trainable params: 2359296 || all params: 1231940608 || trainable%: 0.19151053100118282
```

docs/install.mdx (new file, 46 lines)
@ -0,0 +1,46 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Installation and Configuration

Before you start, you will need to set up your environment, install the appropriate packages, and configure 🤗 PEFT. 🤗 PEFT is tested on **Python 3.7+**.

## Installing 🤗 PEFT

🤗 PEFT is available on PyPI, as well as on GitHub. Details for installing from each are below:

### pip

To install 🤗 PEFT from PyPI:

```bash
pip install peft
```

### Source

New features that haven't been released yet are added every day. To try them out, install
from the GitHub repository:

```bash
pip install git+https://github.com/huggingface/peft
```

If you're working on contributing to the library, or wish to play with the source code and see live
results as you run the code, an editable version can be installed from a locally cloned copy of the
repository:

```bash
git clone https://github.com/huggingface/peft
cd peft
pip install -e .
```

docs/package_reference/config (new empty file)
docs/package_reference/peft_model (new empty file)
docs/package_reference/tuners (new empty file)

docs/quicktour.mdx (new file, 300 lines)
@ -0,0 +1,300 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Quick tour

Let's have a look at 🤗 PEFT's main features and the traps to avoid.

## Main use

To use 🤗 PEFT in your script, follow the steps below:

1. Create a `PeftConfig` object corresponding to your PEFT method.
Please refer to the [Config page](package_reference/config) for more details.
Below, we use `LoraConfig` for demonstration.

```python
from peft import LoraConfig, TaskType

peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
```

Here, `task_type` is the type of task you are training your model for.
For available task types, please refer to [TaskType](package_reference/config#peft.config.TaskType).

2. Load the base model you want to fine-tune.

```python
from transformers import AutoModelForSeq2SeqLM

model_name_or_path = "bigscience/mt0-large"
tokenizer_name_or_path = "bigscience/mt0-large"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
```

3. Preprocess your model if you use `bitsandbytes` for INT-8 quantized training; otherwise skip this step.

```python
from peft import prepare_model_for_int8_training

model = prepare_model_for_int8_training(model)
```

4. Wrap your model in a `PeftModel` object using the `get_peft_model` function, and check the number of trainable parameters of your model.

```python
from peft import get_peft_model

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# output: trainable params: 2359296 || all params: 1231940608 || trainable%: 0.19151053100118282
```

5. Voilà 🎉. Now, train the model using the 🤗 Transformers Trainer API, 🤗 Accelerate, or any custom PyTorch training loop (a minimal sketch is shown below).
Please refer to [peft_lora_seq2seq.ipynb](https://github.com/huggingface/peft/blob/main/examples/conditional_generation/peft_lora_seq2seq.ipynb) for an end-to-end example.
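For reference, a bare-bones custom loop could look like the following. This is an illustrative sketch, not part of the PEFT API: it assumes you have already built a `train_dataloader` that yields tokenized batches containing a `labels` key.

```python
import torch
from torch.optim import AdamW

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)  # the PeftModel created with get_peft_model above
optimizer = AdamW(model.parameters(), lr=1e-3)  # only the trainable (LoRA) parameters receive gradients

model.train()
for epoch in range(3):
    for batch in train_dataloader:  # assumed: yields dicts of tensors with a "labels" key
        batch = {k: v.to(device) for k, v in batch.items()}
        outputs = model(**batch)  # forwards to the wrapped base model and computes the loss
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```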
### Saving/loading a model

1. Save your model using the `save_pretrained` function.

```python
model.save_pretrained("output_dir")
# model.push_to_hub("my_awesome_peft_model") also works
```

This will only save the incremental PEFT weights that were trained.
For example, you can find `bigscience/T0_3B` tuned using LoRA on the `twitter_complaints` RAFT dataset here:
[smangrul/twitter_complaints_bigscience_T0_3B_LORA_SEQ_2_SEQ_LM](https://huggingface.co/smangrul/twitter_complaints_bigscience_T0_3B_LORA_SEQ_2_SEQ_LM).
Notice that it only contains 2 files: `adapter_config.json` and `adapter_model.bin`, with the latter being just 19MB.

2. Load your model using the `from_pretrained` function.

```diff
  from transformers import AutoModelForSeq2SeqLM
+ from peft import PeftModel, PeftConfig

+ peft_model_id = "smangrul/twitter_complaints_bigscience_T0_3B_LORA_SEQ_2_SEQ_LM"
+ config = PeftConfig.from_pretrained(peft_model_id)
  model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)
+ model = PeftModel.from_pretrained(model, peft_model_id)
  tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

  model = model.to(device)
  model.eval()
  inputs = tokenizer("Tweet text : @HondaCustSvc Your customer service has been horrible during the recall process. I will never purchase a Honda again. Label :", return_tensors="pt")

  with torch.no_grad():
      outputs = model.generate(input_ids=inputs["input_ids"].to("cuda"), max_new_tokens=10)
      print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])
  # 'complaint'
```

## Launching your distributed script

PEFT models work with 🤗 Accelerate out of the box.
Use 🤗 Accelerate for distributed training on various hardware such as GPUs and Apple Silicon devices,
and for inference on consumer hardware with limited resources.

### Example of PEFT model training using 🤗 Accelerate's DeepSpeed integration

The required DeepSpeed version is `v0.8.0`. An example is provided in `~examples/conditional_generation/peft_lora_seq2seq_accelerate_ds_zero3_offload.py`.
a. First, run `accelerate config --config_file ds_zero3_cpu.yaml` and answer the questionnaire.
Below are the contents of the config file.
```yaml
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: cpu
  offload_param_device: cpu
  zero3_init_flag: true
  zero3_save_16bit_model: true
  zero_stage: 3
distributed_type: DEEPSPEED
downcast_bf16: 'no'
dynamo_backend: 'NO'
fsdp_config: {}
machine_rank: 0
main_training_function: main
megatron_lm_config: {}
mixed_precision: 'no'
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
use_cpu: false
```
b. Then run the command below to launch the example script:
```bash
accelerate launch --config_file ds_zero3_cpu.yaml examples/peft_lora_seq2seq_accelerate_ds_zero3_offload.py
```

c. Output logs:
```bash
GPU Memory before entering the train : 1916
GPU Memory consumed at the end of the train (end-begin): 66
GPU Peak Memory consumed during the train (max-begin): 7488
GPU Total Peak Memory consumed during the train (max): 9404
CPU Memory before entering the train : 19411
CPU Memory consumed at the end of the train (end-begin): 0
CPU Peak Memory consumed during the train (max-begin): 0
CPU Total Peak Memory consumed during the train (max): 19411
epoch=4: train_ppl=tensor(1.0705, device='cuda:0') train_epoch_loss=tensor(0.0681, device='cuda:0')
100%|████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:27<00:00, 3.92s/it]
GPU Memory before entering the eval : 1982
GPU Memory consumed at the end of the eval (end-begin): -66
GPU Peak Memory consumed during the eval (max-begin): 672
GPU Total Peak Memory consumed during the eval (max): 2654
CPU Memory before entering the eval : 19411
CPU Memory consumed at the end of the eval (end-begin): 0
CPU Peak Memory consumed during the eval (max-begin): 0
CPU Total Peak Memory consumed during the eval (max): 19411
accuracy=100.0
eval_preds[:10]=['no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint', 'no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint']
dataset['train'][label_column][:10]=['no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint', 'no complaint', 'no complaint', 'complaint', 'complaint', 'no complaint']
```

### Example of PEFT model inference using 🤗 Accelerate's Big Model Inference capabilities

An example is provided in `~examples/causal_language_modeling/peft_lora_clm_accelerate_big_model_inference.ipynb`.

## Model Support matrix

### Causal Language Modeling
| Model        | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning |
|--------------| ---- | ---- | ---- | ---- |
| GPT-2        | ✅ | ✅ | ✅ | ✅ |
| Bloom        | ✅ | ✅ | ✅ | ✅ |
| OPT          | ✅ | ✅ | ✅ | ✅ |
| GPT-Neo      | ✅ | ✅ | ✅ | ✅ |
| GPT-J        | ✅ | ✅ | ✅ | ✅ |
| GPT-NeoX-20B | ✅ | ✅ | ✅ | ✅ |
| LLaMA        | ✅ | ✅ | ✅ | ✅ |
| ChatGLM      | ✅ | ✅ | ✅ | ✅ |

### Conditional Generation
| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning |
| --------- | ---- | ---- | ---- | ---- |
| T5        | ✅ | ✅ | ✅ | ✅ |
| BART      | ✅ | ✅ | ✅ | ✅ |

### Sequence Classification
| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning |
| --------- | ---- | ---- | ---- | ---- |
| BERT       | ✅ | ✅ | ✅ | ✅ |
| RoBERTa    | ✅ | ✅ | ✅ | ✅ |
| GPT-2      | ✅ | ✅ | ✅ | ✅ |
| Bloom      | ✅ | ✅ | ✅ | ✅ |
| OPT        | ✅ | ✅ | ✅ | ✅ |
| GPT-Neo    | ✅ | ✅ | ✅ | ✅ |
| GPT-J      | ✅ | ✅ | ✅ | ✅ |
| Deberta    | ✅ |    | ✅ | ✅ |
| Deberta-v2 | ✅ |    | ✅ | ✅ |

### Token Classification
| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning |
| --------- | ---- | ---- | ---- | ---- |
| BERT       | ✅ | ✅ |   |   |
| RoBERTa    | ✅ | ✅ |   |   |
| GPT-2      | ✅ | ✅ |   |   |
| Bloom      | ✅ | ✅ |   |   |
| OPT        | ✅ | ✅ |   |   |
| GPT-Neo    | ✅ | ✅ |   |   |
| GPT-J      | ✅ | ✅ |   |   |
| Deberta    | ✅ |   |   |   |
| Deberta-v2 | ✅ |   |   |   |

### Text-to-Image Generation

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning |
| --------- | ---- | ---- | ---- | ---- |
| Stable Diffusion | ✅ |   |   |   |


### Image Classification

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning |
| --------- | ---- | ---- | ---- | ---- |
| ViT  | ✅ |   |   |   |
| Swin | ✅ |   |   |   |

___Note that we have tested LoRA for [ViT](https://huggingface.co/docs/transformers/model_doc/vit) and [Swin](https://huggingface.co/docs/transformers/model_doc/swin) for fine-tuning on image classification. However, it should be possible to use LoRA for any compatible model [provided](https://huggingface.co/models?pipeline_tag=image-classification&sort=downloads&search=vit) by 🤗 Transformers. Check out the respective
examples to learn more. If you run into problems, please open an issue.___

The same principle applies to our [segmentation models](https://huggingface.co/models?pipeline_tag=image-segmentation&sort=downloads) as well.

### Semantic Segmentation

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning |
| --------- | ---- | ---- | ---- | ---- |
| SegFormer | ✅ |   |   |   |


## Other caveats

1. Below is an example of using PyTorch FSDP for training. However, it doesn't lead to
any GPU memory savings. Please refer to the issue [[FSDP] FSDP with CPU offload consumes 1.65X more GPU memory when training models with most of the params frozen](https://github.com/pytorch/pytorch/issues/91165).

```python
from peft.utils.other import fsdp_auto_wrap_policy


if os.environ.get("ACCELERATE_USE_FSDP", None) is not None:
    accelerator.state.fsdp_plugin.auto_wrap_policy = fsdp_auto_wrap_policy(model)

model = accelerator.prepare(model)
```

An example of parameter-efficient tuning of the [`mt0-xxl`](https://huggingface.co/bigscience/mt0-xxl) base model using 🤗 Accelerate is provided in `~examples/conditional_generation/peft_lora_seq2seq_accelerate_fsdp.py`.
a. First, run `accelerate config --config_file fsdp_config.yaml` and answer the questionnaire.
Below are the contents of the config file.
```yaml
command_file: null
commands: null
compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: FSDP
downcast_bf16: 'no'
dynamo_backend: 'NO'
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch_policy: BACKWARD_PRE
  fsdp_offload_params: true
  fsdp_sharding_strategy: 1
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_transformer_layer_cls_to_wrap: T5Block
gpu_ids: null
machine_rank: 0
main_process_ip: null
main_process_port: null
main_training_function: main
megatron_lm_config: {}
mixed_precision: 'no'
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_name: null
tpu_zone: null
use_cpu: false
```
b. Then run the command below to launch the example script:
```bash
accelerate launch --config_file fsdp_config.yaml examples/peft_lora_seq2seq_accelerate_fsdp.py
```

2. When using `P_TUNING` or `PROMPT_TUNING` with a `SEQ_2_SEQ` task, remember to remove the `num_virtual_tokens` virtual prompt predictions from the left side of the model outputs during evaluation.
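As a rough illustration (not an excerpt from the library's examples), the stripping could look like the snippet below; `outputs`, `tokenizer`, and `peft_config` are assumed to come from your own evaluation loop.

```python
import torch

# Hypothetical evaluation snippet: the first `num_virtual_tokens` positions of the
# predictions correspond to the virtual prompt and should be discarded before decoding.
num_virtual_tokens = peft_config.num_virtual_tokens
preds = torch.argmax(outputs.logits, dim=-1)  # (batch_size, num_virtual_tokens + seq_len)
preds = preds[:, num_virtual_tokens:]  # drop the virtual-token predictions on the left
decoded = tokenizer.batch_decode(preds, skip_special_tokens=True)
```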
3. For encoder-decoder models, `P_TUNING` or `PROMPT_TUNING` doesn't support the `generate` functionality of 🤗 Transformers because `generate` strictly requires `decoder_input_ids`, whereas
`P_TUNING`/`PROMPT_TUNING` appends soft prompt embeddings to `input_embeds` to create
new `input_embeds` to be given to the model. Therefore, `generate` doesn't support this yet.

4. When using ZeRO-3 with `zero3_init_flag=True`, if you find that GPU memory increases with training steps, you might need to set `zero3_init_flag: false` in the accelerate config.yaml. The related issue is [[BUG] memory leak under zero.Init](https://github.com/microsoft/DeepSpeed/issues/2637).

@ -1063,7 +1063,9 @@ def main(args):
             )
             text_encoder_state_dict = {f"text_encoder_{k}": v for k, v in text_encoder_state_dict.items()}
             state_dict.update(text_encoder_state_dict)
-            lora_config["text_encoder_peft_config"] = unwarpped_text_encoder.get_peft_config_as_dict(inference=True)
+            lora_config["text_encoder_peft_config"] = unwarpped_text_encoder.get_peft_config_as_dict(
+                inference=True
+            )

             accelerator.print(state_dict)
             accelerator.save(state_dict, os.path.join(args.output_dir, f"{args.instance_prompt}_lora.pt"))