transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-10-20 09:03:53 +08:00

Author	SHA1	Message	Date
i3hz	4763b8c5b8	Correct numerical regression in vision embeddings (#41374 ) created modeling file	2025-10-07 13:43:24 +02:00
Donghua Yu	caa14e7dab	fix resample in asr pipeline (#41298 )	2025-10-06 17:31:10 +00:00
Yao Matrix	73f8c4b8ad	fix asr ut failures (#41332 ) Signed-off-by: Yao, Matrix <matrix.yao@intel.com>	2025-10-06 17:12:19 +00:00
Anton Vlasjuk	57e82745f9	[`v5`] Sync Bert and Bart eager attention (#41248 ) * remove from modeling files * remaining changes * style / copies * revert deprecated models and fixup some models * oops * sync attn impl * fix style/copies * fix distilbert * remove dim check	2025-10-06 18:49:01 +02:00
Arthur	505387c05b	Update from pretrained error when loading (#33380 ) * init commit * style * take comments into account * merge with main and simplify * nits * final * small fixes * fix * super small update! * add another test * up up * update * fixes * sort them by default	2025-10-06 16:10:19 +00:00
Anthonette Adanyin	e00f46f16e	serve: add non-streaming mode to /v1/responses; stream event parity; remove placeholder logprobs (#41353 )	2025-10-06 16:04:17 +00:00
Arthur	0395ed52ae	[`CB`] Refactors the way we access paged (#41370 ) * up * refactor the way we handle paged attention * affect serve as well * update * fix * cup	2025-10-06 17:55:31 +02:00
Yuanyuan Chen	39b0c9491b	Remove unused function parameters (#41358 ) Remove unused arguments Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-06 15:38:17 +00:00
Yao Matrix	11e4b5e5ee	make some ut cases pass on xpu w/ latest torch (#41337 ) * make some ut cases pass on xpu w/ latest torch Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update test_modeling_llava_onevision.py * Apply style fixes --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-10-06 15:38:00 +00:00
Yuanyuan Chen	fa36c973fc	Remove unnecessary list comprehension (#41305 ) Remove unnecessary comprehension Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-06 14:49:02 +00:00
Ilyas Moutawwakil	7a1aeec36e	Fixes in check_model_inputs, GPTBigCodeModel and ImageGPTModel (#40811 ) * misc fixes * fix * Update src/transformers/models/imagegpt/modeling_imagegpt.py * Apply suggestion from @IlyasMoutawwakil * pickup use_cache from args input as well * fix	2025-10-06 16:34:24 +02:00
Anuj Soni	297a41a6cf	Use canonical get_size_with_aspect_ratio (with max_size) from transformers.image_transforms to fix #37939 (#41284 ) * Use canonical get_size_with_aspect_ratio (with max_size) from transformers.image_transforms to fix #37939 * Fix import sorting/style * Fix import order * Refactor: use canonical get_size_with_aspect_ratio across image processors (except YOLOS) This commit updates image processing utilities in multiple model processors to use the shared transformers.image_transforms.get_size_with_aspect_ratio for consistent resizing logic and aspect ratio handling. YOLOS processors are intentionally left unchanged in this commit to preserve their current behavior and avoid breaking model-specific padding/resizing assumptions. YOLOS will be updated in a dedicated follow-up PR once compatibility is fully verified. * ruff fixes * Fix check_copies.py references for get_size_with_aspect_ratio to use canonical transformers.image_transforms version --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-10-06 10:15:56 -04:00
Yangshen⚡Deng	ae60c77689	Fix flash_attention.py: wrong argument passing for attn_implementation (#41347 ) * Fix flash_attention.py: wrong argument passing for attn_implementation The name of the attn type argument for `_flash_attention_forward()` should be `implementation`, instead of `attn_implementation`, which is currently used in the function call. This would result in the wrong type being specified. * modify the kwargs inside _flash_attention_forward * fix the doc * fix typo --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-10-06 15:36:40 +02:00
Yih-Dar	6bf6e36d3b	[testing] update `test_longcat_generation_cpu` (#41368 ) * fix * Update tests/models/longcat_flash/test_modeling_longcat_flash.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2025-10-06 13:21:29 +00:00
Cyril Vallez	4903cd4087	🚨 Remove BetterTransformer (#41367 ) remove	2025-10-06 15:18:12 +02:00
Roman Solomatin	a5700c497e	Better typehints for `apply_chat_template` (#41355 )	2025-10-06 13:14:03 +00:00
Omkaar	089d573aca	Fix typo in model proposal template (#41352 )	2025-10-06 13:06:50 +00:00
Anton Vlasjuk	c27b67f0cd	🚨 [`v5`] Remove relative position embeddings (for bert like models) (#41170 ) * remove from modeling files * remaining changes * style / copies * revert deprecated models and fixup some models * oops	2025-10-06 14:21:41 +02:00
Nicolas Patry	a89bdcf5f1	Fixing a typo for BLT model (#41325 )	2025-10-06 12:16:45 +00:00
Arthur	0452f28544	[`ModularChecker`] QOL for the modular checker (#41361 ) * update * fancy table fancy prints * download to cache folder, never need it ever again * style * update based on review	2025-10-06 12:52:10 +02:00
Raushan Turganbay	9db58abd6e	Check model inputs - hidden states (#40994 ) * update all models * fix copies * skip aria tests * update other models * skip should be in test, not tester * i think this is more descriptive as a name * find and replace for new models	2025-10-06 11:48:52 +02:00
Marc Sun	db711210d2	Fix trainer for py3.9 (#41359 ) fix	2025-10-06 11:36:05 +02:00
Cyril Vallez	163601c619	Standardize `PretrainedConfig` to `PreTrainedConfig` (#41300 ) * replace * add metaclass for full BC * doc * consistency * update deprecation message * revert	2025-10-06 11:34:02 +02:00
Cyril Vallez	55b172b8eb	🚨 Bump to Python 3.10 and rework how we check 3rd-party libraries existence (#41268 ) * cleanup * add check * fix * remove all global variables * fix * add lru caches everywhere * fix * fix * style * improve * reorder all functions * fix order * improve * fix * fix * fix	2025-10-06 11:04:19 +02:00
Raushan Turganbay	1ec0b54414	Rope for Qwen2--5-vl (#41173 ) qwen2--5-vl	2025-10-06 10:56:29 +02:00
Sai-Suraj-27	0947b9042c	Fixed tiny incorrect import in `gemma3` (#41354 ) Fixed tiny import issue in gemma3	2025-10-06 10:55:42 +02:00
Arthur	e11a00a16f	`JetMoe` Fix jetmoe after #40132 (#41324 ) * update * up	2025-10-04 11:02:13 +02:00
Marc Sun	1bc75db9bd	Fix lr_scheduler_parsing (#41322 ) * fix * fix	2025-10-03 17:51:17 +02:00
Cyril Vallez	c2b3cc3e64	Fix jamba (#41309 ) * reactivate tests * first pass * fix * fix bias * fix and simplify * finally fix this stupid bug * add skips * remove bad stuff * fix copies * simplify	2025-10-03 16:54:19 +02:00
Pablo Montalvo	5abfa43f02	Security/fuyu (#41320 ) remove reference to compromised repo	2025-10-03 14:13:41 +00:00
Mohamed Mekkouri	217ff1e4ef	AutoAWQ tests (#41295 ) * initial commit * fix * fix multi gpu * fix expected output * fix * latest * add comment * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-10-03 15:17:10 +02:00
Raushan Turganbay	5339f72b9b	🚨 [unbloating] unify `TypedDict` usage in processing (#40931 ) * just squash commits into one * fix style	2025-10-03 14:17:59 +02:00
Yih-Dar	42bcc81ba2	Minor security fix for `ssh-runner.yml` (#41317 ) security issue Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-03 14:14:34 +02:00
Pablo Montalvo	cd4422922e	Add modular detector (#41289 ) * doc * doc * no remote code * safe-ize the release + remove remote * fixes * add some documentation as well	2025-10-03 14:11:10 +02:00
Yih-Dar	59eba49237	download and use HF Hub Cache (#41181 ) use hub cache Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-03 11:11:37 +02:00
Yangshen⚡Deng	de3ee737cf	Fix README.md error when installing from source (#41303 )	2025-10-02 16:08:27 -07:00
Federico Moretti	b914445f77	Italian translation for README.md (#41269 ) chore: add Italian translation for README.md	2025-10-02 15:59:28 -07:00
Benjamin Bossan	41e5abac5c	FIX: Bug in PEFT integration delete_adapter method (#41252 ) The main content of this PR is to fix a bug in the delete_adapter method of the PeftAdapterMixin. Previously, it did not take into account auxiliary modules from PEFT, e.g. those added by modules_to_save. This PR fixes this oversight. Note that the PR uses a new functionality from PEFT that exposes integration functions like delete_adapter. Those will be contained in the next PEFT release, 0.18.0 (yet unreleased). Therefore, the bug is only fixed when users have a PEFT version fulfilling this requirement. I ensured that with old PEFT versions, the integration still works the same as previously. The newly added test for this is skipped if the PEFT version is too low. (Note: I tested locally that the test passes with PEFT 0.18.0) While working on this, I also cleaned up the following: - The active_adapter property has been deprecated for more than 2 years (#26407). It is safe to remove it now. - There were numerous small errors or outdated pieces of information in the docstrings, which have been addressed. When PEFT < 0.18.0 is used, although we cannot delete modules_to_save, we can still detect them and warn about it.	2025-10-02 18:36:57 +02:00
Anton Vlasjuk	da3c7d1d36	🚨 [`DistilBert`] Refactor Attention (#41163 ) * refactor * allow pos ids for flattened sequences	2025-10-02 17:50:48 +02:00
Anton Vlasjuk	e54defcfc2	[`Flex Attn`] Fix lse x attention sinks logic (#41249 ) fix	2025-10-02 17:49:39 +02:00
Cyril Vallez	b3bd815786	Fix mxfp4 dequantization (#41292 ) fix	2025-10-02 16:47:42 +02:00
Yuanyuan Chen	e4930d6bde	🚨 [V5] Remove deprecated `resume_download` (#41122 ) Remove deprecated `resume_download` Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-10-02 16:44:34 +02:00
Yih-Dar	7adb43e60a	Build doc in 2 jobs: `en` and `other languages` (#41290 ) * separate * separate --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-02 14:33:57 +00:00
Yih-Dar	e1f1d32af0	Remove some previous team members from allow list of triggering Github Actions (#41263 ) * delete * delete --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-02 16:32:28 +02:00
Manh Nguyen	1d7ebff398	Fix - remove deprecated args checking in deepspeed integrations (#41282 ) Remove deprecated args checking in deepspeed integrations Signed-off-by: nguyen599 <pnvmanh2123@gmail.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-10-02 13:59:50 +00:00
Cyril Vallez	9d02602f0f	Remove `test_initialization` (#41261 ) remove it	2025-10-02 15:23:43 +02:00
Joao Gante	248e7ef8bc	[docs] remove references to recently deleted classes in non-`en` docs (onnx, feature processors) (#41286 ) remove references to old classes	2025-10-02 12:59:28 +00:00
JJJYmmm	bc33fd3fc2	Add processor and integration test for qwen3vl (#41277 ) * support aux loss in qwen3vlmoe * update qwen3vl processor test! * add integration tests for qwen3vl-30a3 * remove duplicated decorator * code clean * fix consistency * do not inherit from nn.Linear for better quantization * pass check	2025-10-02 14:59:04 +02:00
Luc Georges	639ad8ccd9	feat: use `aws-highcpu-32-priv` for amd docker img build (#41285 ) * feat: use `aws-highcpu-32-priv` for amd docker img build * feat: add `workflow_dispatch` event to docker build CI	2025-10-02 12:53:14 +00:00
Yuanyuan Chen	894a2bdd8c	Fix pylint generator warnings (#41258 ) Fix pylint generator warnings Signed-off-by: cyy <cyyever@outlook.com>	2025-10-02 12:35:42 +00:00
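
Commit 55b172b8eb (#41268) in the list above mentions reworking how the existence of 3rd-party libraries is checked and adding lru caches to those checks. The general idea can be sketched with the standard library alone. This is only an illustration under stated assumptions: `is_package_available` is a hypothetical name chosen here, and the code is not taken from the PR.

```python
import importlib.util
from functools import lru_cache


@lru_cache(maxsize=None)
def is_package_available(name: str) -> bool:
    """Hypothetical helper: report whether `name` is importable.

    The module itself is not imported; only its spec is looked up.
    The result is cached, so repeated checks for the same name do
    not search the import path again.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when the parent package of a
        # dotted name is missing, and ValueError when an already-imported
        # module has no usable __spec__.
        return False


if __name__ == "__main__":
    print(is_package_available("json"))
    print(is_package_available("json"))  # second call is served from the cache
    print(is_package_available.cache_info())
```

Caching is reasonable here because the set of installed packages rarely changes within one process; if it does, `is_package_available.cache_clear()` resets the cache.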

20920 Commits