Mirror of https://github.com/huggingface/peft.git (synced 2025-10-20 23:43:47 +08:00)

Compare commits (22 commits)

32357c2dd2
79298c7c24
b25ce8a0cd
5d84484079
49ddefa834
3af469eeea
5e7e5ad836
9d8287f3e3
2efd02769b
669dd4edeb
b5641cc744
c5d94855cd
face67dfeb
d9094cebea
493ae58beb
ed4ce9fc94
4c48970cb0
46e03602ed
45343a4ccc
276c91b143
cfe35a7878
d47d23aa0e
@ -29,15 +29,6 @@ ENV PATH /opt/conda/envs/peft/bin:$PATH
|
||||
# Activate our bash shell
|
||||
RUN chsh -s /bin/bash
|
||||
SHELL ["/bin/bash", "-c"]
|
||||
# Activate the conda env and install transformers + accelerate from source
|
||||
RUN source activate peft && \
|
||||
python3 -m pip install --no-cache-dir \
|
||||
librosa \
|
||||
"soundfile>=0.12.1" \
|
||||
scipy \
|
||||
git+https://github.com/huggingface/transformers \
|
||||
git+https://github.com/huggingface/accelerate \
|
||||
peft[test]@git+https://github.com/huggingface/peft
|
||||
|
||||
# Stage 2
|
||||
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04 AS build-image
|
||||
@ -49,6 +40,18 @@ SHELL ["/bin/bash", "-c"]
|
||||
RUN source activate peft && \
|
||||
python3 -m pip install --no-cache-dir bitsandbytes optimum auto-gptq
|
||||
|
||||
# Activate the conda env and install transformers + accelerate from source
|
||||
RUN source activate peft && \
|
||||
python3 -m pip install -U --no-cache-dir \
|
||||
librosa \
|
||||
"soundfile>=0.12.1" \
|
||||
scipy \
|
||||
git+https://github.com/huggingface/transformers \
|
||||
git+https://github.com/huggingface/accelerate \
|
||||
peft[test]@git+https://github.com/huggingface/peft
|
||||
|
||||
RUN pip freeze | grep transformers
|
||||
|
||||
# Install apt libs
|
||||
RUN apt-get update && \
|
||||
apt-get install -y curl git wget && \
|
||||
|
@ -28,10 +28,13 @@ Being similar to LoRA, IA3 carries many of the same advantages:
* Performance of models fine-tuned using IA3 is comparable to the performance of fully fine-tuned models.
* IA3 does not add any inference latency because adapter weights can be merged with the base model.

In principle, IA3 can be applied to any subset of weight matrices in a neural network to reduce the number of trainable
parameters. Following the authors' implementation, IA3 weights are added to the key, value and feedforward layers
of a Transformer model. Given the target layers for injecting IA3 parameters, the number of trainable parameters
can be determined based on the size of the weight matrices.
In principle, IA3 can be applied to any subset of weight matrices in a neural network to reduce the number of trainable
parameters. Following the authors' implementation, IA3 weights are added to the key, value and feedforward layers
of a Transformer model. To be specific, for transformer models, IA3 weights are added to the outputs of key and value layers, and to the input of the second feedforward layer
in each transformer block.

Given the target layers for injecting IA3 parameters, the number of trainable parameters
can be determined based on the size of the weight matrices.
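As a rough, hedged illustration of that count (the dimensions and layer count below are assumptions in the style of a Llama-7B model, not values taken from this guide): each targeted attention projection adds a vector the size of its output dimension, and each targeted feedforward layer adds a vector the size of its input dimension.

```py
# Hedged back-of-the-envelope estimate; all sizes are assumed, Llama-7B-style values
hidden_size = 4096         # output dim of the key and value projections
intermediate_size = 11008  # input dim of the second (down) feedforward projection
num_layers = 32

per_block = hidden_size + hidden_size + intermediate_size  # one (IA)^3 vector per targeted layer
total_trainable = per_block * num_layers
print(total_trainable)  # 614400 trainable parameters, a tiny fraction of the base model
```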

## Common IA3 parameters in PEFT
@ -43,10 +46,19 @@ As with other methods supported by PEFT, to fine-tune a model using IA3, you nee
3. Wrap the base model with `get_peft_model()` to get a trainable `PeftModel`.
4. Train the `PeftModel` as you normally would train the base model.

`IA3Config` allows you to control how IA3 is applied to the base model through the following parameters:

- `target_modules`: The modules (for example, attention blocks) to apply the IA3 vectors.
- `feedforward_modules`: The list of modules to be treated as feedforward layers in `target_modules`. While learned vectors are multiplied with
the output activation for attention blocks, the vectors are multiplied with the input for classic feedforward layers.
- `feedforward_modules`: The list of modules to be treated as feedforward layers in `target_modules`. While learned vectors are multiplied with
the output activation for attention blocks, the vectors are multiplied with the input for classic feedforward layers. Note that `feedforward_modules` must be a subset of `target_modules`.
- `modules_to_save`: List of modules apart from IA3 layers to be set as trainable and saved in the final checkpoint. These typically include model's custom head that is randomly initialized for the fine-tuning task.

## Example Usage

For the task of sequence classification, one can initialize the IA3 config for a Llama model as follows:

```py
peft_config = IA3Config(
    task_type=TaskType.SEQ_CLS, target_modules=["k_proj", "v_proj", "down_proj"], feedforward_modules=["down_proj"]
)
```
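Continuing steps 3 and 4 above, the config can then be used to create a trainable `PeftModel`. The following is a hedged sketch; the checkpoint name is a placeholder and not part of this guide:

```py
from transformers import AutoModelForSequenceClassification
from peft import IA3Config, TaskType, get_peft_model

base_model = AutoModelForSequenceClassification.from_pretrained("path-or-id-of-a-llama-checkpoint", num_labels=2)
peft_config = IA3Config(
    task_type=TaskType.SEQ_CLS, target_modules=["k_proj", "v_proj", "down_proj"], feedforward_modules=["down_proj"]
)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()  # only the (IA)^3 vectors and the classification head are trainable
```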
|
@ -83,6 +83,7 @@ accelerate launch train_dreambooth.py \
--output_dir=$OUTPUT_DIR \
--train_text_encoder \
--with_prior_preservation --prior_loss_weight=1.0 \
--num_dataloader_workers=1 \
--instance_prompt="a photo of sks dog" \
--class_prompt="a photo of dog" \
--resolution=512 \
@ -101,6 +102,8 @@ accelerate launch train_dreambooth.py \
--max_train_steps=800
```

If you are running this script on Windows, you may need to set the `--num_dataloader_workers` to 0.

## Inference with a single adapter

To run inference with the fine-tuned model, first specify the base model with which the fine-tuned LoRA weights will be combined:
@ -171,7 +174,7 @@ image.save("DESTINATION_PATH_FOR_THE_IMAGE")
## Multi-adapter inference

With PEFT you can combine multiple adapters for inference. In the previous example you have fine-tuned Stable Diffusion on
some dog images. The pipeline created based on these weights got a name - `adapter_name="dog`. Now, suppose you also fine-tuned
some dog images. The pipeline created based on these weights got a name - `adapter_name="dog"`. Now, suppose you also fine-tuned
this base model on images of a crochet toy. Let's see how we can use both adapters.

First, you'll need to perform all the steps as in the single adapter inference example:
|
@ -213,6 +213,10 @@ def parse_args(input_args=None):
|
||||
help="Bias type for Lora. Can be 'none', 'all' or 'lora_only', only used if use_lora and `train_text_encoder` are True",
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--num_dataloader_workers", type=int, default=1, help="Num of workers for the training dataloader."
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--train_batch_size", type=int, default=4, help="Batch size (per device) for the training dataloader."
|
||||
)
|
||||
@ -799,7 +803,7 @@ def main(args):
|
||||
batch_size=args.train_batch_size,
|
||||
shuffle=True,
|
||||
collate_fn=lambda examples: collate_fn(examples, args.with_prior_preservation),
|
||||
num_workers=1,
|
||||
num_workers=args.num_dataloader_workers,
|
||||
)
|
||||
|
||||
# Scheduler and math around the number of training steps.
|
||||
|
setup.py
@ -22,7 +22,7 @@ extras["test"] = extras["dev"] + ["pytest", "pytest-cov", "pytest-xdist", "param
setup(
name="peft",
version="0.6.0",
version="0.6.2",
description="Parameter-Efficient Fine-Tuning (PEFT)",
license_files=["LICENSE"],
long_description=open("README.md", "r", encoding="utf-8").read(),
@ -63,19 +63,23 @@
)

# Release checklist
# 1. Change the version in __init__.py and setup.py.
# 2. Commit these changes with the message: "Release: VERSION"
# 3. Add a tag in git to mark the release: "git tag VERSION -m 'Adds tag VERSION for pypi' "
# Push the tag to git: git push --tags origin main
# 4. Run the following commands in the top-level directory:
# 1. Change the version in __init__.py and setup.py to the release version, e.g. from "0.6.0.dev0" to "0.6.0"
# 2. Check if there are any deprecations that need to be addressed for this release by searching for "# TODO" in the code
# 3. Commit these changes with the message: "Release: VERSION", create a PR and merge it.
# 4. Add a tag in git to mark the release: "git tag -a VERSION -m 'Adds tag VERSION for pypi' "
# Push the tag to git:
# git push --tags origin main
# It is necessary to work on the original repository, not on a fork.
# 5. Run the following commands in the top-level directory:
# python setup.py bdist_wheel
# python setup.py sdist
# 5. Upload the package to the pypi test server first:
# Ensure that you are on the clean and up-to-date main branch (git status --untracked-files=no should not list any
# files and show the main branch)
# 6. Upload the package to the pypi test server first:
# twine upload dist/* -r pypitest
# twine upload dist/* -r pypitest --repository-url=https://test.pypi.org/legacy/
# 6. Check that you can install it in a virtualenv by running:
# 7. Check that you can install it in a virtualenv by running:
# pip install -i https://testpypi.python.org/pypi peft
# 7. Upload the final version to actual pypi:
# 8. Upload the final version to actual pypi:
# twine upload dist/* -r pypi
# 8. Add release notes to the tag in github once everything is looking hunky-dory.
# 9. Update the version in __init__.py, setup.py to the new version "-dev" and push to master
# 9. Add release notes to the tag on https://github.com/huggingface/peft/releases once everything is looking hunky-dory.
# 10. Update the version in __init__.py, setup.py to the bumped minor version + ".dev0" (e.g. from "0.6.0" to "0.7.0.dev0")
|
||||
|
@ -17,7 +17,7 @@
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
__version__ = "0.6.0"
|
||||
__version__ = "0.6.2"
|
||||
|
||||
from .auto import (
|
||||
AutoPeftModel,
|
||||
|
@ -13,6 +13,10 @@
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
import importlib
|
||||
import importlib.metadata as importlib_metadata
|
||||
from functools import lru_cache
|
||||
|
||||
import packaging.version
|
||||
|
||||
|
||||
def is_bnb_available() -> bool:
|
||||
@ -28,9 +32,35 @@ def is_bnb_4bit_available() -> bool:
|
||||
return hasattr(bnb.nn, "Linear4bit")
|
||||
|
||||
|
||||
def is_auto_gptq_available() -> bool:
|
||||
return importlib.util.find_spec("auto_gptq") is not None
|
||||
def is_auto_gptq_available():
|
||||
if importlib.util.find_spec("auto_gptq") is not None:
|
||||
AUTOGPTQ_MINIMUM_VERSION = packaging.version.parse("0.5.0")
|
||||
version_autogptq = packaging.version.parse(importlib_metadata.version("auto_gptq"))
|
||||
if AUTOGPTQ_MINIMUM_VERSION <= version_autogptq:
|
||||
return True
|
||||
else:
|
||||
raise ImportError(
|
||||
f"Found an incompatible version of auto-gptq. Found version {version_autogptq}, "
|
||||
f"but only versions above {AUTOGPTQ_MINIMUM_VERSION} are supported"
|
||||
)
|
||||
|
||||
|
||||
def is_optimum_available() -> bool:
|
||||
return importlib.util.find_spec("optimum") is not None
|
||||
|
||||
|
||||
@lru_cache()
|
||||
def is_torch_tpu_available(check_device=True):
|
||||
"Checks if `torch_xla` is installed and potentially if a TPU is in the environment"
|
||||
if importlib.util.find_spec("torch_xla") is not None:
|
||||
if check_device:
|
||||
# We need to check if `xla_device` can be found, will raise a RuntimeError if not
|
||||
try:
|
||||
import torch_xla.core.xla_model as xm
|
||||
|
||||
_ = xm.xla_device()
|
||||
return True
|
||||
except RuntimeError:
|
||||
return False
|
||||
return True
|
||||
return False
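A hedged usage sketch for these availability helpers (the calling code is illustrative and not part of this diff): optional dependencies are imported only after the helper confirms they are installed, and `is_auto_gptq_available` now also enforces a minimum version.

```py
# Illustrative only: guard optional imports behind the availability helpers
from peft.import_utils import is_auto_gptq_available, is_bnb_available

if is_bnb_available():
    import bitsandbytes as bnb  # safe: the module spec was found

if is_auto_gptq_available():  # raises ImportError if auto-gptq is older than the minimum version
    from auto_gptq.utils.import_utils import dynamically_import_QuantLinear
```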
|
||||
|
@ -15,6 +15,7 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import collections
|
||||
import inspect
|
||||
import os
|
||||
import warnings
|
||||
@ -58,6 +59,7 @@ from .utils import (
|
||||
_set_adapter,
|
||||
_set_trainable,
|
||||
get_peft_model_state_dict,
|
||||
id_tensor_storage,
|
||||
infer_device,
|
||||
load_peft_weights,
|
||||
set_peft_model_state_dict,
|
||||
@ -168,6 +170,8 @@ class PeftModel(PushToHubMixin, torch.nn.Module):
|
||||
save_directory (`str`):
|
||||
Directory where the adapter model and configuration files will be saved (will be created if it does not
|
||||
exist).
|
||||
safe_serialization (`bool`, *optional*):
|
||||
Whether to save the adapter files in safetensors format.
|
||||
kwargs (additional keyword arguments, *optional*):
|
||||
Additional keyword arguments passed along to the `push_to_hub` method.
|
||||
"""
|
||||
@ -199,6 +203,28 @@ class PeftModel(PushToHubMixin, torch.nn.Module):
|
||||
os.makedirs(output_dir, exist_ok=True)
|
||||
|
||||
if safe_serialization:
|
||||
# Section copied from: https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L2111-L2134
|
||||
# Safetensors does not allow tensor aliasing.
|
||||
# We're going to remove aliases before saving
|
||||
ptrs = collections.defaultdict(list)
|
||||
for name, tensor in output_state_dict.items():
|
||||
# Sometimes in the state_dict we have non-tensor objects.
|
||||
# e.g. in bitsandbytes we have some `str` objects in the state_dict
|
||||
if isinstance(tensor, torch.Tensor):
|
||||
ptrs[id_tensor_storage(tensor)].append(name)
|
||||
else:
|
||||
# In the non-tensor case, fall back to the pointer of the object itself
|
||||
ptrs[id(tensor)].append(name)
|
||||
|
||||
# These are all the pointers of shared tensors.
|
||||
shared_ptrs = {ptr: names for ptr, names in ptrs.items() if len(names) > 1}
|
||||
|
||||
for _, names in shared_ptrs.items():
|
||||
# Here we just clone the shared tensors to avoid tensor aliasing which is
|
||||
# not supported in safetensors.
|
||||
for shared_tensor_name in names[1:]:
|
||||
output_state_dict[shared_tensor_name] = output_state_dict[shared_tensor_name].clone()
|
||||
|
||||
safe_save_file(
|
||||
output_state_dict,
|
||||
os.path.join(output_dir, SAFETENSORS_WEIGHTS_NAME),
|
||||
|
@ -26,7 +26,8 @@ from peft.utils import transpose
|
||||
class AdaLoraLayer(LoraLayer):
|
||||
# List all names of layers that may contain adapter weights
|
||||
# Note: ranknum doesn't need to be included as it is not an nn.Module
|
||||
adapter_layer_names = ["lora_A", "lora_B", "lora_E", "lora_embedding_A", "lora_embedding_B"]
|
||||
adapter_layer_names = ("lora_A", "lora_B", "lora_E", "lora_embedding_A", "lora_embedding_B")
|
||||
# other_param_names is defined in LoraLayer
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
|
@ -39,12 +39,20 @@ def llama_apply_rotary_pos_emb(q, cos, sin, position_ids):
|
||||
This function was adapted from:
|
||||
https://github.com/huggingface/transformers/blob/1de8ce9ee1191ba761a593ac15d9ccbf5851bfc5/src/transformers/models/llama/modeling_llama.py#L133
|
||||
|
||||
It was modified to remove unnecessary processing of key states.
|
||||
It was modified to remove unnecessary processing of key states. The method is compatible with transformers <=
|
||||
4.34.2 and also with the latest version (>=4.35).
|
||||
"""
|
||||
gather_indices = position_ids[:, None, :, None] # [bs, 1, seq_len, 1]
|
||||
gather_indices = gather_indices.repeat(1, cos.shape[1], 1, cos.shape[3])
|
||||
cos = torch.gather(cos.repeat(gather_indices.shape[0], 1, 1, 1), 2, gather_indices)
|
||||
sin = torch.gather(sin.repeat(gather_indices.shape[0], 1, 1, 1), 2, gather_indices)
|
||||
# In previous transformers version cos/sin cached had a shape of 4D
|
||||
if len(cos.shape) == 4:
|
||||
gather_indices = position_ids[:, None, :, None] # [bs, 1, seq_len, 1]
|
||||
gather_indices = gather_indices.repeat(1, cos.shape[1], 1, cos.shape[3])
|
||||
cos = torch.gather(cos.repeat(gather_indices.shape[0], 1, 1, 1), 2, gather_indices)
|
||||
sin = torch.gather(sin.repeat(gather_indices.shape[0], 1, 1, 1), 2, gather_indices)
|
||||
# In the new version, it is 2D so we fall back to the new implementation
|
||||
# https://github.com/huggingface/transformers/blame/eef7ea98c31a333bacdc7ae7a2372bde772be8e4/src/transformers/models/llama/modeling_llama.py#L222-L226
|
||||
else:
|
||||
cos = cos[position_ids].unsqueeze(1)
|
||||
sin = sin[position_ids].unsqueeze(1)
|
||||
q_embed = (q * cos) + (llama_rotate_half(q) * sin)
|
||||
return q_embed
|
||||
|
||||
|
@ -29,7 +29,9 @@ class IA3Config(PeftConfig):
|
||||
target_modules (`Union[List[str],str]`):
|
||||
The names of the modules to apply (IA)^3 to.
|
||||
feedforward_modules (`Union[List[str],str]`):
|
||||
The names of the modules to be treated as feedforward modules, as in the original paper.
|
||||
The names of the modules to be treated as feedforward modules, as in the original paper. These modules will
|
||||
have (IA)^3 vectors multiplied to the input, instead of the output. feedforward_modules must be a name or a
|
||||
subset of names present in target_modules.
|
||||
fan_in_fan_out (`bool`):
|
||||
Set this to True if the layer to replace stores weight like (fan_in, fan_out). For example, gpt-2 uses
|
||||
`Conv1D` which stores weights like (fan_in, fan_out) and hence this should be set to `True`.
|
||||
@ -78,3 +80,8 @@ class IA3Config(PeftConfig):
|
||||
self.feedforward_modules = (
|
||||
set(self.feedforward_modules) if isinstance(self.feedforward_modules, list) else self.feedforward_modules
|
||||
)
|
||||
|
||||
# check if feedforward_modules is a subset of target_modules. run the check only if both are sets
|
||||
if isinstance(self.feedforward_modules, set) and isinstance(self.target_modules, set):
|
||||
if not self.feedforward_modules.issubset(self.target_modules):
|
||||
raise ValueError("`feedforward_modules` should be a subset of `target_modules`")
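A hedged sketch of how this validation behaves when both arguments are lists (the module names are illustrative):

```py
from peft import IA3Config

# valid: every feedforward module is also a target module
IA3Config(target_modules=["k_proj", "v_proj", "down_proj"], feedforward_modules=["down_proj"])

# invalid: "down_proj" is not among the target modules, so this raises
# ValueError: `feedforward_modules` should be a subset of `target_modules`
IA3Config(target_modules=["k_proj", "v_proj"], feedforward_modules=["down_proj"])
```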
|
||||
|
@ -25,8 +25,10 @@ from peft.utils import transpose
|
||||
|
||||
|
||||
class IA3Layer(BaseTunerLayer):
|
||||
# List all names of layers that may contain adapter weights
|
||||
adapter_layer_names = ["ia3_l"]
|
||||
# All names of layers that may contain adapter weights
|
||||
adapter_layer_names = ("ia3_l",)
|
||||
# All names of other parameters that may contain adapter-related parameters
|
||||
other_layer_names = ("scaling",)
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
|
@ -206,7 +206,7 @@ class IA3Model(BaseTuner):
|
||||
"New adapter should have the same value for `is_feedforward` as previously added adapter."
|
||||
)
|
||||
if isinstance(target, torch.nn.Conv2d):
|
||||
target.update_layer_conv2d(
|
||||
target.update_layer(
|
||||
adapter_name,
|
||||
ia3_config.init_ia3_weights,
|
||||
)
|
||||
|
@ -24,8 +24,9 @@ from peft.tuners.lycoris_utils import LycorisLayer
|
||||
|
||||
|
||||
class LoHaLayer(LycorisLayer, nn.Module):
|
||||
# List all names of layers that may contain adapter weights
|
||||
adapter_layer_names = ["hada_w1_a", "hada_w1_b", "hada_w2_a", "hada_w2_b", "hada_t1", "hada_t2"]
|
||||
# All names of layers that may contain adapter weights
|
||||
adapter_layer_names = ("hada_w1_a", "hada_w1_b", "hada_w2_a", "hada_w2_b", "hada_t1", "hada_t2")
|
||||
# other_param_names is defined on parent class
|
||||
|
||||
def __init__(self):
|
||||
LycorisLayer.__init__(self)
|
||||
|
@ -24,8 +24,8 @@ from peft.tuners.lycoris_utils import LycorisLayer
|
||||
|
||||
|
||||
class LoKrLayer(LycorisLayer, nn.Module):
|
||||
# List all names of layers that may contain adapter weights
|
||||
adapter_layer_names = [
|
||||
# All names of layers that may contain adapter weights
|
||||
adapter_layer_names = (
|
||||
"lokr_w1",
|
||||
"lokr_w1_a",
|
||||
"lokr_w1_b",
|
||||
@ -33,7 +33,8 @@ class LoKrLayer(LycorisLayer, nn.Module):
|
||||
"lokr_w2_a",
|
||||
"lokr_w2_b",
|
||||
"lokr_t2",
|
||||
]
|
||||
)
|
||||
# other_param_names is defined on parent class
|
||||
|
||||
def __init__(self):
|
||||
LycorisLayer.__init__(self)
|
||||
|
@ -26,8 +26,10 @@ from peft.utils.other import transpose
|
||||
|
||||
|
||||
class LoraLayer(BaseTunerLayer):
|
||||
# List all names of layers that may contain adapter weights
|
||||
adapter_layer_names = ["lora_A", "lora_B", "lora_embedding_A", "lora_embedding_B"]
|
||||
# All names of layers that may contain (trainable) adapter weights
|
||||
adapter_layer_names = ("lora_A", "lora_B", "lora_embedding_A", "lora_embedding_B")
|
||||
# All names of other parameters that may contain adapter-related parameters
|
||||
other_param_names = ("r", "lora_alpha", "scaling", "lora_dropout")
|
||||
|
||||
def __init__(self, in_features: int, out_features: int, **kwargs):
|
||||
self.r = {}
|
||||
|
@ -661,29 +661,15 @@ class LoraModel(BaseTuner):
|
||||
del self.peft_config[adapter_name]
|
||||
|
||||
key_list = [key for key, _ in self.model.named_modules() if "lora" not in key]
|
||||
new_adapter = None
|
||||
for key in key_list:
|
||||
_, target, _ = _get_submodules(self.model, key)
|
||||
if isinstance(target, LoraLayer):
|
||||
for attr in [
|
||||
"r",
|
||||
"lora_alpha",
|
||||
"scaling",
|
||||
"lora_A",
|
||||
"lora_B",
|
||||
"lora_embedding_A",
|
||||
"lora_embedding_B",
|
||||
"lora_dropout",
|
||||
]:
|
||||
if adapter_name in getattr(target, attr):
|
||||
getattr(target, attr).pop(adapter_name)
|
||||
if adapter_name in target.active_adapters:
|
||||
resetting_active_adapter = (
|
||||
list(self.peft_config.keys())[0] if len(self.peft_config) > 0 else "default"
|
||||
)
|
||||
warnings.warn(
|
||||
f"Adapter {adapter_name} was active which is now deleted. Setting active adapter to {resetting_active_adapter}. "
|
||||
)
|
||||
target.set_adapter(resetting_active_adapter)
|
||||
target.delete_adapter(adapter_name)
|
||||
if new_adapter is None:
|
||||
new_adapter = target.active_adapters[:]
|
||||
|
||||
self.active_adapter = new_adapter or []
|
||||
|
||||
def merge_and_unload(self, progressbar: bool = False, safe_merge: bool = False):
|
||||
r"""
|
||||
|
@ -62,6 +62,8 @@ class LycorisLayer(BaseTunerLayer, nn.Module):
|
||||
r"""
|
||||
A base layer for LyCORIS like adapters
|
||||
"""
|
||||
# adapter_layer_names needs to be defined on the child class
|
||||
other_param_names = ("r", "alpha", "scaling", "rank_dropout", "module_dropout")
|
||||
|
||||
def __init__(self):
|
||||
self.r = {}
|
||||
@ -391,17 +393,12 @@ class LycorisTuner(BaseTuner):
|
||||
del self.peft_config[adapter_name]
|
||||
|
||||
key_list = [key for key, _ in self.model.named_modules() if self.prefix not in key]
|
||||
new_adapter = None
|
||||
for key in key_list:
|
||||
_, target, _ = _get_submodules(self.model, key)
|
||||
if isinstance(target, LycorisLayer):
|
||||
for attr in target.adapter_layer_names:
|
||||
if adapter_name in getattr(target, attr):
|
||||
getattr(target, attr).pop(adapter_name)
|
||||
if adapter_name in target.active_adapters:
|
||||
resetting_active_adapter = (
|
||||
list(self.peft_config.keys())[0] if len(self.peft_config) > 0 else "default"
|
||||
)
|
||||
warnings.warn(
|
||||
f"Adapter {adapter_name} was active which is now deleted. Setting active adapter to {resetting_active_adapter}. "
|
||||
)
|
||||
target.set_adapter(resetting_active_adapter)
|
||||
target.delete_adapter(adapter_name)
|
||||
if new_adapter is None:
|
||||
new_adapter = target.active_adapters[:]
|
||||
|
||||
self.active_adapter = new_adapter or []
|
||||
|
@ -16,6 +16,7 @@ from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import re
|
||||
import warnings
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Any, Union
|
||||
|
||||
@ -24,7 +25,7 @@ from torch import nn
|
||||
from peft.utils import COMMON_LAYERS_PATTERN
|
||||
|
||||
from ..config import PeftConfig
|
||||
from ..utils import _get_submodules
|
||||
from ..utils import ModulesToSaveWrapper, _get_submodules
|
||||
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@ -210,6 +211,9 @@ class BaseTuner(nn.Module, ABC):
|
||||
is_target_modules_in_base_model = False
|
||||
key_list = [key for key, _ in model.named_modules()]
|
||||
|
||||
_check_for_modules_to_save = getattr(peft_config, "modules_to_save", None) is not None
|
||||
_has_modules_to_save = False
|
||||
|
||||
model_config = getattr(model, "config", {"model_type": "custom"})
|
||||
if hasattr(model_config, "to_dict"):
|
||||
model_config = model_config.to_dict()
|
||||
@ -217,6 +221,22 @@ class BaseTuner(nn.Module, ABC):
|
||||
peft_config = self._prepare_adapter_config(peft_config, model_config)
|
||||
|
||||
for key in key_list:
|
||||
# Check for modules_to_save in case
|
||||
if _check_for_modules_to_save and any(
|
||||
key.endswith(f"{module_to_save}") for module_to_save in peft_config.modules_to_save
|
||||
):
|
||||
# Optionally set the modules to save
|
||||
parent, target, target_name = _get_submodules(model, key)
|
||||
|
||||
if not isinstance(target, ModulesToSaveWrapper):
|
||||
new_module = ModulesToSaveWrapper(target, adapter_name)
|
||||
setattr(parent, target_name, new_module)
|
||||
else:
|
||||
target.update(adapter_name)
|
||||
|
||||
_has_modules_to_save = True
|
||||
continue
|
||||
|
||||
if not self._check_target_module_exists(peft_config, key):
|
||||
continue
|
||||
|
||||
@ -243,6 +263,12 @@ class BaseTuner(nn.Module, ABC):
|
||||
if adapter_name in n:
|
||||
p.requires_grad = False
|
||||
|
||||
if _has_modules_to_save:
|
||||
if not hasattr(model, "modules_to_save"):
|
||||
model.modules_to_save = set(peft_config.modules_to_save)
|
||||
else:
|
||||
model.modules_to_save.update(set(peft_config.modules_to_save))
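A hedged sketch of what this wiring enables (the module names and base model are illustrative): any module listed in `modules_to_save` is wrapped in `ModulesToSaveWrapper`, kept trainable, and included in the saved adapter checkpoint, which the low-level test added further down in this diff also exercises.

```py
from peft import LoraConfig, get_peft_model

# assumes `base_model` defines submodules named "linear" and "classifier"
config = LoraConfig(target_modules=["linear"], modules_to_save=["classifier"])
peft_model = get_peft_model(base_model, config)
# base_model.classifier is now a ModulesToSaveWrapper; its weights stay trainable and are saved with the adapter
```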
|
||||
|
||||
def merge_adapter(self):
|
||||
"""
|
||||
This method merges the LoRa layers into the base model.
|
||||
@ -272,8 +298,10 @@ class BaseTunerLayer(ABC):
|
||||
"""
|
||||
active_adapter = None
|
||||
|
||||
# List all names of layers that may contain adapter weights
|
||||
adapter_layer_names: list[str] = []
|
||||
# All names of layers that may contain adapter (trainable) weights
|
||||
adapter_layer_names: tuple[str] = ()
|
||||
# All names of other parameters that may contain adapter-related parameters
|
||||
other_param_names: tuple[str] = ()
|
||||
|
||||
# indicates whether all adapters should be disabled
|
||||
_disable_adapters: bool = False
|
||||
@ -351,6 +379,54 @@ class BaseTunerLayer(ABC):
|
||||
|
||||
self._active_adapter = adapter_names
|
||||
|
||||
def _all_available_adapter_names(self) -> list[str]:
|
||||
"""Return a sorted list of all available adapter names"""
|
||||
adapter_names = set()
|
||||
for name in self.adapter_layer_names + self.other_param_names:
|
||||
# we check each possible attribute and if it's a dict or ModuleDict, we assume that the keys are the adapter
|
||||
# names
|
||||
attr = getattr(self, name)
|
||||
if hasattr(attr, "keys"):
|
||||
adapter_names.update(attr.keys())
|
||||
return sorted(adapter_names)
|
||||
|
||||
def delete_adapter(self, adapter_name: str) -> None:
|
||||
"""
|
||||
Delete an adapter from the layer
|
||||
|
||||
This should be called on all adapter layers, or else we will get an inconsistent state.
|
||||
|
||||
This method will also set a new active adapter if the deleted adapter was an active adapter. It is important
|
||||
that the new adapter is chosen in a deterministic way, so that the same adapter is chosen on all layers.
|
||||
|
||||
Args:
|
||||
adapter_name (`str`): The name of the adapter to delete
|
||||
|
||||
"""
|
||||
for attr in self.adapter_layer_names + self.other_param_names:
|
||||
if adapter_name in getattr(self, attr):
|
||||
del getattr(self, attr)[adapter_name]
|
||||
|
||||
if adapter_name in self.active_adapters:
|
||||
# choose a new active adapter
|
||||
active_adapters = self.active_adapters[:]
|
||||
active_adapters.remove(adapter_name)
|
||||
if active_adapters:
|
||||
self.set_adapter(active_adapters)
|
||||
else:
|
||||
# no active adapters left, set a new default adapter
|
||||
# here we get the list of all existing adapter names and choose the first one
|
||||
remaining_adapters = self._all_available_adapter_names()
|
||||
if not remaining_adapters:
|
||||
self.set_adapter([])
|
||||
else:
|
||||
new_active_adapter = remaining_adapters[0]
|
||||
warnings.warn(
|
||||
f"Adapter {adapter_name} was active which is now deleted. Setting active adapter to "
|
||||
f"{new_active_adapter}."
|
||||
)
|
||||
self.set_adapter(remaining_adapters[0])
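A hedged sketch of the resulting model-level behavior (adapter names and the config are illustrative), mirroring the tests added later in this diff:

```py
# assumes `base_model` and `lora_config` were created beforehand
model = get_peft_model(base_model, lora_config)   # registers the "default" adapter
model.add_adapter("delete_me", lora_config)
model.set_adapter("delete_me")

model.delete_adapter("delete_me")   # removed from every layer; a remaining adapter becomes active again
assert "delete_me" not in model.peft_config
assert model.active_adapters == ["default"]

model.delete_adapter("default")     # deleting the last remaining adapter is now allowed
assert model.active_adapters == []
```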
|
||||
|
||||
|
||||
def check_target_module_exists(config, key: str) -> bool | re.Match[str] | None:
|
||||
"""A helper method to check if the passed module's key name matches any of the target modules in the adapter_config.
|
||||
|
@ -45,6 +45,7 @@ from .other import (
|
||||
infer_device,
|
||||
get_auto_gptq_quant_linear,
|
||||
get_quantization_config,
|
||||
id_tensor_storage,
|
||||
)
|
||||
from .hub_utils import hub_file_exists
|
||||
from .save_and_load import get_peft_model_state_dict, set_peft_model_state_dict, load_peft_weights
|
||||
|
@ -15,14 +15,15 @@
|
||||
import copy
|
||||
import inspect
|
||||
import warnings
|
||||
from typing import Optional
|
||||
from typing import Optional, Tuple
|
||||
|
||||
import accelerate
|
||||
import torch
|
||||
from accelerate.hooks import add_hook_to_module, remove_hook_from_module
|
||||
from accelerate.utils import is_npu_available, is_xpu_available
|
||||
from safetensors.torch import storage_ptr, storage_size
|
||||
|
||||
from ..import_utils import is_auto_gptq_available
|
||||
from ..import_utils import is_auto_gptq_available, is_torch_tpu_available
|
||||
|
||||
|
||||
# Get current device name based on available devices
|
||||
@ -412,25 +413,57 @@ def get_auto_gptq_quant_linear(gptq_quantization_config):
|
||||
"""
|
||||
Get the right AutoGPTQQuantLinear class based on the quantization config file
|
||||
"""
|
||||
if is_auto_gptq_available():
|
||||
if gptq_quantization_config is not None and is_auto_gptq_available():
|
||||
from auto_gptq.utils.import_utils import dynamically_import_QuantLinear
|
||||
|
||||
if gptq_quantization_config is not None:
|
||||
desc_act = gptq_quantization_config.desc_act
|
||||
group_size = gptq_quantization_config.group_size
|
||||
bits = gptq_quantization_config.bits
|
||||
disable_exllama = gptq_quantization_config.disable_exllama
|
||||
AutoGPTQQuantLinear = dynamically_import_QuantLinear(
|
||||
use_triton=False,
|
||||
desc_act=desc_act,
|
||||
group_size=group_size,
|
||||
bits=bits,
|
||||
disable_exllama=disable_exllama,
|
||||
)
|
||||
return AutoGPTQQuantLinear
|
||||
desc_act = gptq_quantization_config.desc_act
|
||||
group_size = gptq_quantization_config.group_size
|
||||
bits = gptq_quantization_config.bits
|
||||
if hasattr(gptq_quantization_config, "use_exllama"):
|
||||
use_exllama = gptq_quantization_config.use_exllama
|
||||
else:
|
||||
use_exllama = not gptq_quantization_config.disable_exllama
|
||||
if hasattr(gptq_quantization_config, "exllama_config"):
|
||||
exllama_version = gptq_quantization_config.exllama_config["version"]
|
||||
else:
|
||||
exllama_version = 1
|
||||
AutoGPTQQuantLinear = dynamically_import_QuantLinear(
|
||||
use_triton=False,
|
||||
desc_act=desc_act,
|
||||
group_size=group_size,
|
||||
bits=bits,
|
||||
disable_exllama=not (use_exllama and exllama_version == 1),
|
||||
disable_exllamav2=not (use_exllama and exllama_version == 2),
|
||||
)
|
||||
return AutoGPTQQuantLinear
|
||||
return None
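A hedged sketch of how this helper is typically consumed (the surrounding calls are illustrative): the GPTQ quantization config is read from the loaded model and, when auto-gptq is available, the matching QuantLinear class is returned so quantized layers can be detected and wrapped.

```py
from peft.utils.other import get_auto_gptq_quant_linear, get_quantization_config

# `model` is assumed to be a transformers model loaded with a GPTQ quantization config
gptq_config = get_quantization_config(model, method="gptq")
QuantLinear = get_auto_gptq_quant_linear(gptq_config)
if QuantLinear is not None:
    # layers that are instances of QuantLinear get the GPTQ-aware adapter implementation
    pass
```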
|
||||
|
||||
|
||||
def id_tensor_storage(tensor: torch.Tensor) -> Tuple[torch.device, int, int]:
|
||||
"""
|
||||
Unique identifier to a tensor storage. Multiple different tensors can share the same underlying storage. For
|
||||
example, "meta" tensors all share the same storage, and thus their identifier will all be equal. This identifier is
|
||||
guaranteed to be unique and constant for this tensor's storage during its lifetime. Two tensor storages with
|
||||
non-overlapping lifetimes may have the same id.
|
||||
|
||||
This method is the exact same copy of
|
||||
https://github.com/huggingface/transformers/blob/main/src/transformers/pytorch_utils.py#L282C1-L300C58 but we added
|
||||
it here manually to avoid import issue with old versions of transformers.
|
||||
"""
|
||||
if tensor.device.type == "xla" and is_torch_tpu_available():
|
||||
# NOTE: xla tensors dont have storage
|
||||
# use some other unique id to distinguish.
|
||||
# this is a XLA tensor, it must be created using torch_xla's
|
||||
# device. So the following import is safe:
|
||||
import torch_xla
|
||||
|
||||
unique_id = torch_xla._XLAC._xla_get_tensor_id(tensor)
|
||||
else:
|
||||
unique_id = storage_ptr(tensor)
|
||||
|
||||
return tensor.device, unique_id, storage_size(tensor)
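A hedged sketch of how this identifier is used by the safe-serialization path shown earlier in this diff: state-dict entries whose storages collide are grouped, and all but one entry per group is cloned before the dict is handed to safetensors. The helper below is illustrative, not part of the library.

```py
import collections

import torch


def find_shared_tensors(state_dict):
    # group parameter names by underlying storage; groups with more than one name are aliases
    ptrs = collections.defaultdict(list)
    for name, tensor in state_dict.items():
        if isinstance(tensor, torch.Tensor):
            ptrs[id_tensor_storage(tensor)].append(name)
    return {ptr: names for ptr, names in ptrs.items() if len(names) > 1}
```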
|
||||
|
||||
|
||||
TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING = {
|
||||
"t5": ["q", "v"],
|
||||
"mt5": ["q", "v"],
|
||||
@ -478,9 +511,9 @@ TRANSFORMERS_MODELS_TO_IA3_TARGET_MODULES_MAPPING = {
|
||||
"bert": ["key", "value", "output.dense"],
|
||||
"deberta-v2": ["key_proj", "value_proj", "output.dense"],
|
||||
"deberta": ["in_proj", "output.dense"],
|
||||
"RefinedWebModel": ["query_key_value"],
|
||||
"RefinedWeb": ["query_key_value"],
|
||||
"falcon": ["query_key_value"],
|
||||
"RefinedWebModel": ["query_key_value", "dense_4h_to_h"],
|
||||
"RefinedWeb": ["query_key_value", "dense_4h_to_h"],
|
||||
"falcon": ["query_key_value", "dense_4h_to_h"],
|
||||
}
|
||||
|
||||
TRANSFORMERS_MODELS_TO_IA3_FEEDFORWARD_MODULES_MAPPING = {
|
||||
@ -499,9 +532,9 @@ TRANSFORMERS_MODELS_TO_IA3_FEEDFORWARD_MODULES_MAPPING = {
|
||||
"bert": ["output.dense"],
|
||||
"deberta-v2": ["output.dense"],
|
||||
"deberta": ["output.dense"],
|
||||
"RefinedWeb": ["query_key_value"],
|
||||
"RefinedWebModel": ["query_key_value"],
|
||||
"falcon": ["query_key_value"],
|
||||
"RefinedWeb": ["dense_4h_to_h"],
|
||||
"RefinedWebModel": ["dense_4h_to_h"],
|
||||
"falcon": ["dense_4h_to_h"],
|
||||
}
|
||||
|
||||
COMMON_LAYERS_PATTERN = ["layers", "h", "block", "blocks", "layer"]
|
||||
|
@ -53,7 +53,7 @@ class AdaptionPromptTester(TestCase, PeftCommonTester):
|
||||
"""
|
||||
|
||||
def setUp(self):
|
||||
"""Check that llama is available in transformers package before running each test."""
|
||||
# Check that llama is available in transformers package before running each test.
|
||||
if not is_llama_available():
|
||||
self.skipTest("Llama not available in transformers. Skipping test.")
|
||||
|
||||
|
@ -13,6 +13,7 @@
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
import gc
|
||||
import tempfile
|
||||
import unittest
|
||||
|
||||
import pytest
|
||||
@ -22,6 +23,7 @@ from transformers import (
|
||||
AutoModelForCausalLM,
|
||||
AutoModelForSeq2SeqLM,
|
||||
AutoModelForSequenceClassification,
|
||||
AutoModelForTokenClassification,
|
||||
AutoTokenizer,
|
||||
BitsAndBytesConfig,
|
||||
LlamaForCausalLM,
|
||||
@ -33,6 +35,7 @@ from peft import (
|
||||
IA3Config,
|
||||
LoraConfig,
|
||||
PeftModel,
|
||||
TaskType,
|
||||
get_peft_model,
|
||||
prepare_model_for_kbit_training,
|
||||
)
|
||||
@ -158,12 +161,12 @@ class PeftGPUCommonTests(unittest.TestCase):
|
||||
flan_ia3_config = IA3Config(target_modules=["q", "v"], task_type="SEQ_2_SEQ_LM")
|
||||
|
||||
opt_ia3_config = IA3Config(
|
||||
target_modules=["q_proj", "v_proj"],
|
||||
feedforward_modules=["down_proj"],
|
||||
target_modules=["q_proj", "v_proj", "fc2"],
|
||||
feedforward_modules=["fc2"],
|
||||
task_type="CAUSAL_LM",
|
||||
)
|
||||
|
||||
config = IA3Config(target_modules=["q_proj", "v_proj"], feedforward_modules=["down_proj"])
|
||||
config = IA3Config(target_modules=["q_proj", "v_proj", "fc2"], feedforward_modules=["fc2"])
|
||||
|
||||
flan_8bit = get_peft_model(flan_8bit, flan_ia3_config)
|
||||
self.assertTrue(
|
||||
@ -276,12 +279,12 @@ class PeftGPUCommonTests(unittest.TestCase):
|
||||
flan_ia3_config = IA3Config(target_modules=["q", "v"], task_type="SEQ_2_SEQ_LM")
|
||||
|
||||
opt_ia3_config = IA3Config(
|
||||
target_modules=["q_proj", "v_proj"],
|
||||
feedforward_modules=["down_proj"],
|
||||
target_modules=["q_proj", "v_proj", "fc2"],
|
||||
feedforward_modules=["fc2"],
|
||||
task_type="CAUSAL_LM",
|
||||
)
|
||||
|
||||
config = IA3Config(target_modules=["q_proj", "v_proj"], feedforward_modules=["down_proj"])
|
||||
config = IA3Config(target_modules=["q_proj", "v_proj", "fc2"], feedforward_modules=["fc2"])
|
||||
|
||||
flan_4bit = get_peft_model(flan_4bit, flan_ia3_config)
|
||||
self.assertTrue(
|
||||
@ -631,3 +634,16 @@ class PeftGPUCommonTests(unittest.TestCase):
|
||||
self.assertTrue(isinstance(model, PeftModel))
|
||||
self.assertTrue(isinstance(model.base_model.model.model.decoder.layers[0].self_attn.q_proj, LoraLinear4bit))
|
||||
self.assertTrue(isinstance(model.base_model.model.model.decoder.layers[0].self_attn.v_proj, LoraLinear4bit))
|
||||
|
||||
@require_torch_gpu
|
||||
@pytest.mark.single_gpu_tests
|
||||
def test_serialization_shared_tensors(self):
|
||||
model_checkpoint = "roberta-base"
|
||||
peft_config = LoraConfig(
|
||||
task_type=TaskType.TOKEN_CLS, inference_mode=False, r=16, lora_alpha=16, lora_dropout=0.1, bias="all"
|
||||
)
|
||||
model = AutoModelForTokenClassification.from_pretrained(model_checkpoint, num_labels=11).to("cuda")
|
||||
model = get_peft_model(model, peft_config)
|
||||
|
||||
with tempfile.TemporaryDirectory() as tmp_dir:
|
||||
model.save_pretrained(tmp_dir, safe_serialization=True)
|
||||
|
@ -24,6 +24,7 @@ from parameterized import parameterized
|
||||
|
||||
from peft import (
|
||||
AdaLoraConfig,
|
||||
# TODO: uncomment once PEFT works again with transformers
|
||||
AdaptionPromptConfig,
|
||||
IA3Config,
|
||||
LoHaConfig,
|
||||
@ -40,6 +41,7 @@ from peft import (
|
||||
PEFT_MODELS_TO_TEST = [("lewtun/tiny-random-OPTForCausalLM-delta", "v1")]
|
||||
|
||||
ALL_CONFIG_CLASSES = (
|
||||
# TODO: uncomment once PEFT works again with transformers
|
||||
AdaptionPromptConfig,
|
||||
AdaLoraConfig,
|
||||
IA3Config,
|
||||
@ -221,3 +223,31 @@ class PeftConfigTester(unittest.TestCase):
|
||||
|
||||
# should run without errors
|
||||
LoraConfig(**valid_config)
|
||||
|
||||
def test_ia3_is_feedforward_subset_invalid_config(self):
|
||||
# This test checks that the IA3 config raises a value error if the feedforward_modules argument
|
||||
# is not a subset of the target_modules argument
|
||||
|
||||
# an example invalid config
|
||||
invalid_config = {"target_modules": ["k", "v"], "feedforward_modules": ["q"]}
|
||||
|
||||
with self.assertRaisesRegex(
|
||||
ValueError, expected_regex="^`feedforward_modules` should be a subset of `target_modules`$"
|
||||
):
|
||||
IA3Config(**invalid_config)
|
||||
|
||||
def test_ia3_is_feedforward_subset_valid_config(self):
|
||||
# This test checks that the IA3 config is created without errors with valid arguments.
|
||||
# feedforward_modules should be a subset of target_modules if both are lists
|
||||
|
||||
# an example valid config with regex expressions.
|
||||
valid_config_regex_exp = {
|
||||
"target_modules": ".*.(SelfAttention|EncDecAttention|DenseReluDense).*(q|v|wo)$",
|
||||
"feedforward_modules": ".*.DenseReluDense.wo$",
|
||||
}
|
||||
# an example valid config with module lists.
|
||||
valid_config_list = {"target_modules": ["k", "v", "wo"], "feedforward_modules": ["wo"]}
|
||||
|
||||
# should run without errors
|
||||
IA3Config(**valid_config_regex_exp)
|
||||
IA3Config(**valid_config_list)
|
||||
|
@ -681,6 +681,14 @@ class PeftCustomModelTester(unittest.TestCase, PeftCommonTester):
|
||||
# This is bad, there was a warning about the bias when there should not have been any.
|
||||
self.fail("There should be no warning when bias is set to 'none'")
|
||||
|
||||
@parameterized.expand(TEST_CASES)
|
||||
def test_delete_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(TEST_CASES)
|
||||
def test_delete_inactive_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_inactive_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(TEST_CASES)
|
||||
def test_adding_multiple_adapters_with_bias_raises(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_adding_multiple_adapters_with_bias_raises(model_id, config_cls, config_kwargs)
|
||||
|
@ -154,6 +154,10 @@ class PeftDecoderModelTester(unittest.TestCase, PeftCommonTester):
|
||||
def test_delete_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(PeftTestConfigManager.get_grid_parameters(FULL_GRID))
|
||||
def test_delete_inactive_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_inactive_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(PeftTestConfigManager.get_grid_parameters(FULL_GRID))
|
||||
def test_adding_multiple_adapters_with_bias_raises(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_adding_multiple_adapters_with_bias_raises(model_id, config_cls, config_kwargs)
|
||||
|
@ -12,11 +12,14 @@
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
import tempfile
|
||||
import unittest
|
||||
|
||||
import torch
|
||||
from parameterized import parameterized
|
||||
from transformers import AutoModelForSeq2SeqLM
|
||||
from transformers import AutoModelForSeq2SeqLM, AutoModelForTokenClassification
|
||||
|
||||
from peft import LoraConfig, TaskType, get_peft_model
|
||||
|
||||
from .testing_common import PeftCommonTester, PeftTestConfigManager
|
||||
|
||||
@ -125,6 +128,10 @@ class PeftEncoderDecoderModelTester(unittest.TestCase, PeftCommonTester):
|
||||
def test_delete_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(PeftTestConfigManager.get_grid_parameters(FULL_GRID))
|
||||
def test_delete_inactive_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_inactive_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(PeftTestConfigManager.get_grid_parameters(FULL_GRID))
|
||||
def test_adding_multiple_adapters_with_bias_raises(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_adding_multiple_adapters_with_bias_raises(model_id, config_cls, config_kwargs)
|
||||
@ -172,3 +179,20 @@ class PeftEncoderDecoderModelTester(unittest.TestCase, PeftCommonTester):
|
||||
)
|
||||
def test_disable_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_disable_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
|
||||
class PeftEncoderDecoderCustomModelTester(unittest.TestCase):
|
||||
"""
|
||||
A custom class to write any custom test related with Enc-Dec models
|
||||
"""
|
||||
|
||||
def test_save_shared_tensors(self):
|
||||
model_id = "hf-internal-testing/tiny-random-RobertaModel"
|
||||
peft_config = LoraConfig(
|
||||
task_type=TaskType.TOKEN_CLS, inference_mode=False, r=16, lora_alpha=16, lora_dropout=0.1, bias="all"
|
||||
)
|
||||
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=11)
|
||||
model = get_peft_model(model, peft_config)
|
||||
with tempfile.TemporaryDirectory() as tmp_dir:
|
||||
# This should work fine
|
||||
model.save_pretrained(tmp_dir, safe_serialization=True)
|
||||
|
@ -146,6 +146,10 @@ class PeftFeatureExtractionModelTester(unittest.TestCase, PeftCommonTester):
|
||||
def test_delete_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(PeftTestConfigManager.get_grid_parameters(FULL_GRID))
|
||||
def test_delete_inactive_adapter(self, test_name, model_id, config_cls, config_kwargs):
|
||||
self._test_delete_inactive_adapter(model_id, config_cls, config_kwargs)
|
||||
|
||||
@parameterized.expand(
|
||||
PeftTestConfigManager.get_grid_parameters(
|
||||
{
|
||||
|
@ -658,7 +658,8 @@ class PeftGPTQGPUTests(unittest.TestCase):
|
||||
from transformers import GPTQConfig
|
||||
|
||||
self.causal_lm_model_id = "marcsun13/opt-350m-gptq-4bit"
|
||||
self.quantization_config = GPTQConfig(bits=4, disable_exllama=True)
|
||||
# TODO : check if it works for Exllamav2 kernels
|
||||
self.quantization_config = GPTQConfig(bits=4, use_exllama=False)
|
||||
self.tokenizer = AutoTokenizer.from_pretrained(self.causal_lm_model_id)
|
||||
|
||||
def tearDown(self):
|
||||
|
@ -19,6 +19,7 @@ import unittest
|
||||
import torch
|
||||
|
||||
from peft import LoraConfig, get_peft_model_state_dict, inject_adapter_in_model
|
||||
from peft.utils import ModulesToSaveWrapper
|
||||
|
||||
|
||||
class DummyModel(torch.nn.Module):
|
||||
@ -63,3 +64,28 @@ class TestPeft(unittest.TestCase):
|
||||
|
||||
for key in peft_state_dict.keys():
|
||||
self.assertTrue("lora" in key)
|
||||
|
||||
def test_modules_to_save(self):
|
||||
self.model = DummyModel()
|
||||
|
||||
lora_config = LoraConfig(
|
||||
lora_alpha=16,
|
||||
lora_dropout=0.1,
|
||||
r=64,
|
||||
bias="none",
|
||||
target_modules=["linear"],
|
||||
modules_to_save=["embedding"],
|
||||
)
|
||||
|
||||
self.model = inject_adapter_in_model(lora_config, self.model)
|
||||
|
||||
for name, module in self.model.named_modules():
|
||||
if name == "linear":
|
||||
self.assertTrue(hasattr(module, "lora_A"))
|
||||
self.assertTrue(hasattr(module, "lora_B"))
|
||||
elif name == "embedding":
|
||||
self.assertTrue(isinstance(module, ModulesToSaveWrapper))
|
||||
|
||||
state_dict = get_peft_model_state_dict(self.model)
|
||||
|
||||
self.assertTrue("embedding.weight" in state_dict.keys())
|
||||
|
@ -29,6 +29,7 @@ from peft import (
|
||||
IA3Config,
|
||||
LoraConfig,
|
||||
PeftModel,
|
||||
PeftType,
|
||||
PrefixTuningConfig,
|
||||
PromptEncoderConfig,
|
||||
PromptLearningConfig,
|
||||
@ -815,42 +816,87 @@ class PeftCommonTester:
|
||||
self.assertIsNotNone(param.grad)
|
||||
|
||||
def _test_delete_adapter(self, model_id, config_cls, config_kwargs):
|
||||
if issubclass(config_cls, AdaLoraConfig):
|
||||
# AdaLora does not support adding more than 1 adapter
|
||||
return
|
||||
|
||||
model = self.transformers_class.from_pretrained(model_id)
|
||||
supported_peft_types = [PeftType.LORA, PeftType.LOHA, PeftType.LOKR]
|
||||
# IA3 does not support deleting adapters yet, but it just needs to be added
|
||||
# AdaLora does not support multiple adapters
|
||||
config = config_cls(
|
||||
base_model_name_or_path=model_id,
|
||||
**config_kwargs,
|
||||
)
|
||||
if config.peft_type not in supported_peft_types:
|
||||
return
|
||||
|
||||
model = self.transformers_class.from_pretrained(model_id)
|
||||
if isinstance(config.target_modules, str):
|
||||
# TODO this should be doable
|
||||
self.skipTest("Multiple adapters cannot currently be added when target_modules is a string.")
|
||||
|
||||
adapter_to_delete = "delete_me"
|
||||
model = get_peft_model(model, config)
|
||||
model.add_adapter(adapter_to_delete, config)
|
||||
model.set_adapter(adapter_to_delete)
|
||||
model = model.to(self.torch_device)
|
||||
model.delete_adapter(adapter_to_delete)
|
||||
self.assertFalse(adapter_to_delete in model.peft_config)
|
||||
self.assertEqual(model.active_adapters, ["default"])
|
||||
|
||||
if config.peft_type not in ("LORA"):
|
||||
with self.assertRaises(AttributeError):
|
||||
model.delete_adapter(adapter_to_delete)
|
||||
else:
|
||||
model.delete_adapter(adapter_to_delete)
|
||||
self.assertFalse(adapter_to_delete in model.peft_config)
|
||||
key_list = [key for key, _ in model.named_modules() if "lora" not in key]
|
||||
for key in key_list:
|
||||
_, target, _ = _get_submodules(model, key)
|
||||
if isinstance(target, LoraLayer):
|
||||
for attr in [
|
||||
"r",
|
||||
"lora_alpha",
|
||||
"scaling",
|
||||
"lora_A",
|
||||
"lora_B",
|
||||
"lora_embedding_A",
|
||||
"lora_embedding_B",
|
||||
"lora_dropout",
|
||||
]:
|
||||
self.assertFalse(adapter_to_delete in getattr(target, attr))
|
||||
key_list = [key for key, _ in model.named_modules() if "lora" not in key]
|
||||
for key in key_list:
|
||||
_, target, _ = _get_submodules(model, key)
|
||||
attributes_to_check = getattr(target, "adapter_layer_names", []) + getattr(target, "other_param_names", [])
|
||||
for attr in attributes_to_check:
|
||||
self.assertFalse(adapter_to_delete in getattr(target, attr))
|
||||
|
||||
# check that we can also delete the last remaining adapter
|
||||
model.delete_adapter("default")
|
||||
self.assertFalse("default" in model.peft_config)
|
||||
self.assertEqual(model.active_adapters, [])
|
||||
|
||||
input = self.prepare_inputs_for_testing()
|
||||
# note: we cannot call model(**input) because PeftModel always expects there to be at least one adapter
|
||||
model.base_model(**input) # should not raise an error
|
||||
|
||||
def _test_delete_inactive_adapter(self, model_id, config_cls, config_kwargs):
|
||||
# same as test_delete_adapter, but this time an inactive adapter is deleted
|
||||
supported_peft_types = [PeftType.LORA, PeftType.LOHA, PeftType.LOKR]
|
||||
# IA3 does not support deleting adapters yet, but it just needs to be added
|
||||
# AdaLora does not support multiple adapters
|
||||
config = config_cls(
|
||||
base_model_name_or_path=model_id,
|
||||
**config_kwargs,
|
||||
)
|
||||
if config.peft_type not in supported_peft_types:
|
||||
return
|
||||
|
||||
model = self.transformers_class.from_pretrained(model_id)
|
||||
if isinstance(config.target_modules, str):
|
||||
# TODO this should be doable
|
||||
self.skipTest("Multiple adapters cannot currently be added when target_modules is a string.")
|
||||
|
||||
adapter_to_delete = "delete_me"
|
||||
model = get_peft_model(model, config)
|
||||
model.add_adapter(adapter_to_delete, config)
|
||||
# "delete_me" is added but not activated
|
||||
model = model.to(self.torch_device)
|
||||
model.delete_adapter(adapter_to_delete)
|
||||
self.assertFalse(adapter_to_delete in model.peft_config)
|
||||
self.assertEqual(model.active_adapters, ["default"])
|
||||
|
||||
key_list = [key for key, _ in model.named_modules() if "lora" not in key]
|
||||
for key in key_list:
|
||||
_, target, _ = _get_submodules(model, key)
|
||||
attributes_to_check = getattr(target, "adapter_layer_names", []) + getattr(target, "other_param_names", [])
|
||||
for attr in attributes_to_check:
|
||||
self.assertFalse(adapter_to_delete in getattr(target, attr))
|
||||
|
||||
# check that we can also delete the last remaining adapter
|
||||
model.delete_adapter("default")
|
||||
self.assertFalse("default" in model.peft_config)
|
||||
self.assertEqual(model.active_adapters, [])
|
||||
|
||||
input = self.prepare_inputs_for_testing()
|
||||
# note: we cannot call model(**input) because PeftModel always expects there to be at least one adapter
|
||||
model.base_model(**input) # should not raise an error
|
||||
|
||||
def _test_unload_adapter(self, model_id, config_cls, config_kwargs):
|
||||
model = self.transformers_class.from_pretrained(model_id)
|
||||
|