Implements VeRA: https://huggingface.co/papers/2310.11454
VeRA is similar to LoRA but even more parameter-efficient, while promising to
keep the same performance. In its current implementation, it has a few
limitations compared to LoRA:
- All targeted parameters must have the same shape.
- Only `nn.Linear` layers are supported.
- Quantized layers are not supported.
This PR is based on, and supersedes, #1039.
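For illustration, usage could look roughly like the sketch below; the exact `VeraConfig` fields should be checked against the final API, and the model and module names are placeholders.

```python
# Hedged sketch, assuming a LoRA-style VeraConfig / get_peft_model API as added by this PR.
from transformers import AutoModelForCausalLM
from peft import VeraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Current limitation: all targeted layers must be nn.Linear and share the same shape.
config = VeraConfig(r=256, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()  # far fewer trainable parameters than an equivalent LoRA
```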
---------
Co-authored-by: Alex McKinney <alex.f.mckinney@gmail.com>
Co-authored-by: Dawid <20214809+dkopi@users.noreply.github.com>
Several tests were using bnb_4bit_compute_type but the argument should
be called bnb_4bit_compute_dtype. Now the correct name is used.
This change should not affect the tests, because they were passing the
default value anyway. Therefore, the fact that this argument was passed
incorrectly (and thus, presumably, ignored) should not affect the
results.
Also, fix another incorrect argument to bnb config. These were caused by an
incorrect search and replace operation in #1552.
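For reference, the corrected spelling in the transformers `BitsAndBytesConfig` looks like this (minimal sketch):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float32,  # correct name; the misspelled bnb_4bit_compute_type was presumably ignored
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)
```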
Related to #1532
At the moment, using LoftQ is quite cumbersome, as shown in this
example:
7e84dec20b/examples/loftq_finetuning
Essentially, users have to:
1. Load the non-quantized model with LoftQ (which can be quite huge)
2. Modify the PEFT config
3. Save the adapter
4. Unwrap the base model
5. Save the base model with modified weights (i.e. a whole copy of the
base model)
6. Load the base model from step 5 with bnb quantization
7. Load the adapter from step 3
Yes, there is a helper script to do this, but it still has the
disadvantage that we need to load the non-quantized model and that we have
to create a completely new model checkpoint with the modified weights.
This PR aims to make this process more convenient by adding a single
function replace_lora_weights_loftq. This function takes the
bnb-quantized LoRA model as input. Then it goes through each module with
LoRA weights, lazily loads the corresponding non-quantized weights one
at a time using safetensors, computes the quantization error, and
replaces the LoRA weights with LoftQ-initialized LoRA weights.
This is much more convenient because, thanks to the lazy loading, only very
little extra memory is required, and we don't have to keep an extra copy
of the weights.
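Usage could then look roughly like the following sketch; the exact signature of replace_lora_weights_loftq should be taken from the docs, and the checkpoint name is only a placeholder.

```python
# Hedged sketch of the intended workflow: quantize first, then apply LoftQ on top of the LoRA weights.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, replace_lora_weights_loftq

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
base_model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
peft_model = get_peft_model(base_model, LoraConfig(task_type="CAUSAL_LM"))

# Lazily streams the non-quantized weights from the safetensors checkpoint one module at a time,
# computes the quantization error, and overwrites the LoRA weights with LoftQ-initialized ones.
replace_lora_weights_loftq(peft_model)
```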
While working on this, I found that LoftQ initialization still often did
not seem to help much, as mentioned in #1532. I measured this by
creating (1) logits with the base model, (2) with the quantized+LoRA
model, and (3) with the quantized+LoRA+LoftQ model. The expectation is
that (1) should be closer to (3) than to (2). This was often not the
case.
I therefore added the possibility to run a check each time we
replace a LoRA weight with the LoftQ weights. If this check returns
True, we keep the replacement and proceed to the next weight; otherwise we discard the change.
That way, we only make the replacement with LoftQ weights if we see a
real improvement. Of course, this is only a form of greedy optimization,
but it seems to work in practice. And since it's optional, users can
choose not to use it.
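Continuing the sketch above, the check can be expressed as a callback that receives the model and the name of the module currently being replaced and returns whether to keep the change; the exact callback contract may differ from this sketch, and the calibration input is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("A small calibration sample.", return_tensors="pt")

# Reference logits from the non-quantized base model (loaded once, only for the comparison).
with torch.no_grad():
    logits_base = AutoModelForCausalLM.from_pretrained(model_id)(**inputs).logits

current_mse = float("inf")

def loftq_callback(model, module_name):
    """Keep the LoftQ replacement for this module only if the logits get closer to the base model."""
    global current_mse
    with torch.no_grad():
        logits = model(**inputs).logits  # move `inputs` to model.device if the model is on GPU
    mse = torch.nn.functional.mse_loss(logits.float(), logits_base.to(logits.device).float())
    if mse < current_mse:
        current_mse = mse
        return True   # keep the LoftQ-initialized weights
    return False      # discard the change, keep the previous weights

replace_lora_weights_loftq(peft_model, callback=loftq_callback)
```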
This doesn't support 8-bit quantization or the num_iter argument of LoftQ.
However, the replace_lora_weights_loftq function can be called multiple
times in a row for slightly improved results.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Don't pass load_in_4bit or load_in_8bit to AutoModel*.from_pretrained,
as they are deprecated. Instead, pass the appropriate BitsAndBytesConfig to
the quantization_config argument of from_pretrained.
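In short, the examples now build the config explicitly, roughly like this (the model id is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Instead of the deprecated `load_in_4bit=True` keyword argument:
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m", quantization_config=bnb_config)
```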
This commit refines the levenshtein_distance algorithm implemented in peft_lora_seq2seq_accelerate_ds_zero3_offload.py to improve its space
complexity from O(n^2) to O(n). Additionally, thorough testing has been
conducted to ensure the correctness and reliability of the revised
implementation.
Also update peft_lora_clm_accelerate_ds_zero3_offload.py
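The underlying idea is the standard two-row dynamic programming formulation, keeping only the previous row instead of the full matrix; a sketch (not necessarily the exact code in the example script):

```python
def levenshtein_distance(str1: str, str2: str) -> int:
    """Edit distance with O(min(n, m)) extra space: keep only the previous DP row."""
    if len(str1) < len(str2):
        str1, str2 = str2, str1  # make str2 the shorter string
    previous = list(range(len(str2) + 1))
    for i, c1 in enumerate(str1, start=1):
        current = [i]
        for j, c2 in enumerate(str2, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            replace_cost = previous[j - 1] + (c1 != c2)
            current.append(min(insert_cost, delete_cost, replace_cost))
        previous = current
    return previous[-1]


assert levenshtein_distance("kitten", "sitting") == 3
```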
* Support OFT
* add test
* Update README
* fix code quality
* fix test
* Skip 1 test
* fix eps rule and add more test
* feat: added examples to new OFT method
* fix: removed wrong arguments from model example
* fix: changed name of inference file
* fix: changed prompt variable
* fix docs
* fix: dreambooth inference revision based on feedback
* fix: review from BenjaminBossan
* apply safe merge
* del partially
* refactor oft
* refactor oft
* del unused line
* del unused line
* fix skip in windows
* skip test
* Add comments about where the bias is added
* rename orig_weights to new_weights
* use inverse instead of linalg.inv
* delete alpha and scaling
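For context, applying OFT through PEFT looks roughly like the sketch below; the exact `OFTConfig` fields and the module names are assumptions to be checked against the merged API.

```python
from transformers import AutoModelForCausalLM
from peft import OFTConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder model

config = OFTConfig(
    r=8,                                  # assumed: number of orthogonal blocks per layer
    target_modules=["q_proj", "v_proj"],  # which linear layers get the orthogonal transform
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```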
---------
Co-authored-by: Lukas Kuhn <lukaskuhn.lku@gmail.com>
Co-authored-by: Lukas Kuhn <lukas.kuhn@deutschebahn.com>
* add support for saving base layer weights along with adapter weights
* Update save_and_load.py
* Add an example showing the usage of the added feature
* refactor the functionality
* fix
* refactoring code
1. Add `is_embedding_layer_resized` parameter to `save_pretrained`
2. Fix the deduplication in README when adding PEFT details.
3. `save_pretrained` should only save the model when `is_main_process=True` which is one of the parameters of `save_pretrained`.
* update example
* fix the model card
* fix model card
* 😅
* fix model card
* automate setting `is_embedding_layer_resized`
* nits
* Update peft_lora_clm_with_additional_tokens.ipynb
* add test
* fix tests
* maybe fixes the issue?
* address comments
Co-Authored-By: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
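Regarding the `is_main_process` point in the commit list above, the intended usage in a distributed run is roughly the following sketch (assuming accelerate; the model variable and output path are placeholders):

```python
from accelerate import Accelerator

accelerator = Accelerator()
# ... training loop elided ...
unwrapped_model = accelerator.unwrap_model(model)  # `model` is the trained PEFT model
unwrapped_model.save_pretrained(
    "output/adapter",
    is_main_process=accelerator.is_main_process,  # save_pretrained only writes when this is True
)
```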
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* mpt
* fix save
* fix save
* add jupyter notebook
* add jupyter notebook
* add jupyter notebook
* drop shuffling
* drop classify_dataset
* drop classify_dataset
* fix keys
* fix keys
* add comments
* use EXACT_SOURCE_TASK in the example
* formatting
* Fix dict index in embedding retrieval
* run style and quality
* run style and quality
* run style and quality
* style
* final fix
* style
* comment out failing tests
* fix generation tests
* fix style and save test
* all testcases
* fix import
* add license header
* reformat
* fix encoder-decoder models
* fix tests running multiple times
* fix paper name for IA3 and add MPT paper
* Trigger CI
* address the recommended changes
* reformat
* address suggestions
* address suggestions
* revert reformatting
* revert reformatting
---------
Co-authored-by: Alex-Brooks <Alex.Brooks@ibm.com>
Example 1: training a multilayer perceptron
Example 2: fine-tuning a timm image classifier
New section "Developer Guides" in docs.
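Example 1 boils down to something like this sketch (layer names are illustrative; the point of the guide is that LoRA can target plain `nn.Linear` layers of a custom model by name):

```python
# Hedged sketch of Example 1: wrapping a custom multilayer perceptron with LoRA.
import torch.nn as nn
from peft import LoraConfig, get_peft_model


class MLP(nn.Module):
    def __init__(self, num_features=20, hidden=2000, num_classes=2):
        super().__init__()
        self.seq = nn.Sequential(
            nn.Linear(num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.seq(x)


# Target specific nn.Linear layers by their (dotted) module names; train the head fully.
config = LoraConfig(target_modules=["seq.0", "seq.2"], modules_to_save=["seq.4"])
model = get_peft_model(MLP(), config)
model.print_trainable_parameters()
```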
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add support for embedding with peft
* add example and resolve code quality issues
* update notebook example post fixing the loss
* adding full example with inference notebook
* quality ✨
* add tests, docs, guide and rename task_type to be inline with Hub
* fixes
* fixes
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update peft_model.py
* fixes
* final fixes
* Update _toctree.yml
* fixes and make style and make quality
* deberta exception with checkpointing
* Update docs/source/task_guides/semantic-similarity-lora.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update docs/source/task_guides/semantic-similarity-lora.md
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* resolve comments
* testing prompt learning methods
* Update testing_common.py
* fix the tests
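For context, the embedding support added here can be used roughly as in the sketch below; the embedding model and the target module names are assumptions for a BERT-style encoder.

```python
# Hedged sketch: LoRA on a sentence embedding model using the new FEATURE_EXTRACTION task type.
from transformers import AutoModel
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModel.from_pretrained("BAAI/bge-small-en")  # placeholder embedding model

config = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    r=8,
    lora_alpha=16,
    target_modules=["query", "value"],  # assumption: BERT-style attention projection names
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```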
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Added initial ia3 code
* Implemented ia3 correctly for feedforward layers; Fixed regex matching
* Fixed module mapping for mt5
* Merged changes from huggingface:main
* Merged changes
* Fixed lora merge conflicts
* Different bloom config
* Added save option for ia3
* Added loading code for ia3
* Added feedforward implementation in utils and seq cls example
* Added feedforward implementation in utils and seq cls example
* Implemented merge, unmerge, enable/disable adapters functionality
* Fixed feedforward during merge
* Debugging Merge
* Removing debug messages
* Cleaned up repo
* Removed non-IA3 changes
* Refactor save and load
* Added support to all models in tests; Added IA3Config for common tests
* Added half-precision support and test for gradient checkpointing; Formatted jupyter notebooks
* Added target modules for new models GPTBigCode and Llama
* Cleaned up code
* Cleaned up code
* Cleaned up example notebook
* Cleaned up seq2seq notebook
* Corrected function docstrings; refactored find_and_replace
* Corrected function docstrings; refactored find_and_replace
* Added basic docs for IA3
* Added new conceptual guide in source tree for documentation
* Minor fix to documentation
* Minor fixes to docstrings; Added error handling for 4bit quantization; Cleaned unused merge/unmerge methods
* styling changes after merge from main
* Update src/peft/tuners/ia3.py
Remove unused attribute merge_weights
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
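For context, configuring IA3 looks roughly like the sketch below; the module names follow a BLOOM-style model and are assumptions.

```python
# Hedged sketch: IA3 takes both the modules to scale and, of those, which are feedforward
# (for feedforward modules the learned scaling is applied to the input rather than the output).
from transformers import AutoModelForCausalLM
from peft import IA3Config, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")  # placeholder model

config = IA3Config(
    task_type="CAUSAL_LM",
    target_modules=["query_key_value", "mlp.dense_4h_to_h"],
    feedforward_modules=["mlp.dense_4h_to_h"],  # must be a subset of target_modules
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```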
---------
Co-authored-by: Abhishek2304 <abhishekgupta2304@gmail.com>
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
* Update train_dreambooth.py
Accelerator init updated from logging_dir to project_dir. Newer versions of accelerate use project_dir; logging_dir is deprecated.
* Bugfix: insert the adapter name variable; previously, changing LORA_ADAPTER_NAME caused an error
* Adapter name added as kwarg
* Black code formatted
* Style & Quality check
* Wandb import added for logging and project initialization
* Wandb import added for logging and project initialization
* fix project_name
* print tqdm progress to wandb
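The Accelerator changes amount to roughly the following sketch (argument values are illustrative):

```python
from accelerate import Accelerator

accelerator = Accelerator(
    gradient_accumulation_steps=1,
    mixed_precision="fp16",
    log_with="wandb",    # wandb logging added in this PR
    project_dir="logs",  # replaces the deprecated logging_dir argument
)
accelerator.init_trackers("dreambooth", config={"learning_rate": 5e-6})
```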
- As discussed, loralib is no longer required, so the examples in the
docs have been updated to no longer require loralib as a dependency
- In one example, a missing torch import was added
- In another example, a missing line was added (its output was shown,
but the line itself was not)