4 Commits

Author SHA1 Message Date
61a11f9180 CI Testing transformers deprecations (#2817)
Check if PEFT triggers transformers FutureWarning or DeprecationWarning
by converting these warnings into failures.
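One way to wire this up in pytest is sketched below (illustrative only; the PR may instead rely on a `filterwarnings` entry in the pytest configuration):

```python
# conftest.py -- minimal sketch; the actual PR may instead set `filterwarnings`
# in pyproject.toml or pytest.ini.
import pytest


def pytest_collection_modifyitems(config, items):
    # Escalate deprecation signals to hard failures so that any PEFT code path
    # hitting a deprecated transformers API breaks the CI.
    for item in items:
        item.add_marker(pytest.mark.filterwarnings("error::FutureWarning"))
        item.add_marker(pytest.mark.filterwarnings("error::DeprecationWarning"))
```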
2025-10-13 16:53:35 +02:00
e3710e0602 CI: Handle errors with MacOS and transformers (#2561)

A change in transformers introduced an error in the MacOS CI, which is
handled in this PR.

Context

For context on why we use torch 2.2 for MacOS, check #2431.
Unfortunately, as of today, the available GH workers for MacOS still
haven't improved.

Description

The error was introduced by
https://github.com/huggingface/transformers/pull/37785, which results in
torch.load failing when using torch < 2.6.

The proposed solution is to plug into pytest, intercept the test report,
check for the specific error, and mark the test as skipped instead.
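In `conftest.py`, that roughly amounts to a hook like the following sketch (the matched error string and skip reason are placeholders, not taken from the PR):

```python
# conftest.py -- rough sketch of the idea; the matched error string and skip
# reason below are placeholders, not the exact ones from the PR.
import platform

import pytest


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    if platform.system() != "Darwin" or report.when != "call" or not report.failed:
        return

    # If the failure is the known torch.load incompatibility with torch < 2.6,
    # downgrade it to a skip instead of failing the MacOS CI.
    if "torch.load" in str(report.longrepr):  # placeholder match, see comment above
        report.outcome = "skipped"
        report.longrepr = (
            item.location[0],
            item.location[1],
            "skipped: known torch<2.6 / transformers issue on MacOS",
        )
```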

Alternative solutions

The proposed solution is obviously an ugly hack. However, these are
errors we cannot fix directly: they stem from a dependency in
combination with the old torch version we're forced to use (thus fixing
them in transformers is probably not an option).

Instead of altering the test report, the individual tests that fail
could get an explicit skip marker when MacOS is detected. However, since
several hundred tests are affected, this is very impractical and would
add a lot of noise to the test suite.

Alternatively, we could move forward with the proposal in #2431 and
remove MacOS completely from the CI. I do, however, still have the faint
hope that GH will provide arm64 workers with more RAM in the future,
allowing us to switch.
2025-06-02 15:07:10 +02:00
fc78a2491e MNT Move code quality fully to ruff (#1421) 2024-02-07 12:52:35 +01:00
9fd788bedb TST: Add regression tests 2 (#1115)
Description

In general, for regression tests, we need two steps:

1. Creating the regression artifacts, in this case the adapter
   checkpoint and the expected output of the model.
2. Running the regression tests, i.e. loading the adapter and checking
   that the output of the model is the same as the expected output.

My approach is to re-use as much code as possible between those two
steps. Therefore, the same test script can be used for both, with only
an environment variable to distinguish between the two. Step 1 is
invoked by calling:

`REGRESSION_CREATION_MODE=True pytest tests/regression/test_regression.py`

and to run the second step, we call:

`pytest tests/regression/test_regression.py`
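Schematically, the shared test logic looks roughly like this (a simplified sketch with illustrative names, not the actual test code):

```python
# Simplified sketch of the two-mode pattern; names and helpers here are
# illustrative, not the actual test code.
import os

import torch

CREATION_MODE = os.environ.get("REGRESSION_CREATION_MODE", "").lower() in ("1", "true")


def run_regression_test(peft_model, inputs, artifact_dir):
    output = peft_model(**inputs)[0]
    if CREATION_MODE:
        # step 1: store the adapter checkpoint and the expected output
        peft_model.save_pretrained(artifact_dir)
        torch.save(output, os.path.join(artifact_dir, "output.pt"))
    else:
        # step 2: load the stored output and compare it to the current output
        expected = torch.load(os.path.join(artifact_dir, "output.pt"))
        torch.testing.assert_close(output, expected)
```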

Creating regression artifacts

The first step will create an adapter checkpoint and an output for the
given PEFT version and test setting in a new directory. E.g. it will
create a directory `tests/regression/lora_opt-125m_bnb_4bit/0.5.0/` that
contains `adapter_model.bin` and `output.pt`.

Before this step runs, there is a check that the git repo is clean (no
dirty worktree) and that the commit is tagged (i.e. corresponds to a
release version of PEFT). Otherwise, we may accidentally create
regression artifacts that do not correspond to any PEFT release.
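Such a guard can be sketched with plain git commands (an illustration of the idea, not necessarily how the test suite implements it):

```python
# Rough sketch of such a guard, using plain git commands; the actual check in
# the test suite may be implemented differently.
import subprocess


def assert_clean_tagged_checkout():
    # a clean worktree means `git status --porcelain` prints nothing
    status = subprocess.run(
        ["git", "status", "--porcelain"], capture_output=True, text=True, check=True
    )
    if status.stdout.strip():
        raise RuntimeError("Worktree is dirty; check out a clean state first.")

    # HEAD must correspond to a release tag, e.g. v0.5.0
    tags = subprocess.run(
        ["git", "tag", "--points-at", "HEAD"], capture_output=True, text=True, check=True
    )
    if not tags.stdout.strip():
        raise RuntimeError("HEAD is not tagged; artifacts must match a PEFT release.")
```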

The easiest way to get such a clean state (say, for PEFT v0.5.0) is by
checking out a tagged commit, e.g:

`git checkout v0.5.0`

before running the first step.

The first step will also skip the creation of regression artifacts if
they already exist.

It is possible to circumvent all the aforementioned checks by setting
the environment variable `REGRESSION_FORCE_MODE` to True like so:

`REGRESSION_FORCE_MODE=True REGRESSION_CREATION_MODE=True pytest tests/regression/test_regression.py`

You should only do this if you know exactly what you're doing.

Running regression tests

The second step is much simpler. It will load the adapter and the
output created in the first step and compare that stored output to the
output of a new PEFT model using the loaded adapter. The two outputs
should be the same.

If more than one version is discovered for a given test setting, all of
them are tested.
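Discovering and checking all stored versions for a setting can be pictured like this (illustrative sketch, following the directory layout from the example above):

```python
# Illustrative sketch of checking every stored version for one test setting,
# following the directory layout from the example above.
import os

import torch


def check_all_versions(setting_dir, current_output):
    # e.g. setting_dir = "tests/regression/lora_opt-125m_bnb_4bit"
    for version in sorted(os.listdir(setting_dir)):
        expected = torch.load(os.path.join(setting_dir, version, "output.pt"))
        torch.testing.assert_close(current_output, expected)
```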

Notes

Regression artifacts are stored on HF Hub.
2023-12-06 15:07:05 +01:00