20920 Commits

Author SHA1 Message Date
b9bd8c45a1 [CI] Build translated docs (#41632)
fix
2025-10-16 14:01:33 +02:00
baecdb8a97 [Ernie 4.5 Moe] Fix Moe and offloading (#41385)
fix
2025-10-16 13:59:01 +02:00
44539827d5 [Executorch] Simplify for encoder models (#41627)
* Trigger Build

* revert extra treatment for executorch as we default to no vmapping now
2025-10-16 13:57:52 +02:00
143acfe2ce fix check inputs for text2text pipeline (#41556)
fix check inputs

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-10-16 11:42:41 +00:00
67fae90519 Fix FP-Quant quantization fallback CPU dispatch. (#41619)
* fp_quant fix

* Update quantizer_fp_quant.py
2025-10-16 11:41:01 +00:00
af2a66ced9 Migrate transformers cli to Typer (#41487)
* Add typer-slim as explicit dependency

* Migrate CLI to Typer

* code quality

* bump release candidate

* adapt test_cli.py

* Remove ./commands + adapt tests

* fix quality

* consistency

* doctested

* do not serve model in chat

* style

* will it fix them?

* fix test

* capitalize classes

* Rebase

* Rebase

* tests + fixup

* custom error message

* fix ?

* should be good

* fix caplog globally

* inner caplog

* last attempt

* Retry

* Let's try with capsys disabled

---------

Co-authored-by: Lysandre <hi@lysand.re>
2025-10-16 13:29:42 +02:00
a59124e27e Add missing dates to docs (#41576)
add dates
2025-10-16 09:32:28 +00:00
81f97b17d2 Remove randomly added script (#41650)
remove
2025-10-16 11:23:53 +02:00
c0a5cf19ad Fix tokenization test (#41649)
fix
2025-10-16 11:14:20 +02:00
3ef6f2c415 Allow passing tp_plan in from_pretrained directly (#41435)
* start

* allow passing it

* fix plans

* fix

* fix

* style

* style

* fix

* add_test

* oupsi indent

* fix

* fix

* fix for CI without accelerator

* fix import
2025-10-16 11:12:07 +02:00
59efd86da2 Add aux loss for GLM-4.5V (#41564)
* add aux

* update

* update config to text_config

* use qwen data class to avoid repeat again

* format

* update

* use 1e-4

* update

* update for remove init

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-10-16 09:04:21 +00:00
7b7d17f9bf 🚨 [v5] Toggle the serialization format in processors (#41474)
* toggle the serialization

* prob this fixes it

* fix tests

* typo

* delete legacy save entirely

* remove extra nesting in if

* revert test and serialize a public attr instead of private
2025-10-16 10:19:22 +02:00
e20df45bf6 Add Backbone API fine-tuning tutorial (#41590)
---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-10-15 18:42:32 +02:00
19df66dcba Update executorch.md (#41582)
* Update executorch.md

* Update executorch.md

* Update executorch.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-10-15 09:01:46 -07:00
9f71e3a604 [docs] Duplicate entry (#41591)
fix
2025-10-15 17:02:36 +02:00
bc9900562d Fix quantization base class (#41613)
* fix

* fix

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-15 16:58:17 +02:00
72fd67929b Remove deprecated code (#41616)
remove

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-15 16:57:52 +02:00
da382917aa Remove the head masking block in some vision models (#41620)
* old

* new

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-15 15:51:01 +02:00
313afcc468 [chat template] update when "push_to_hub" (#39815)
* update templates push to hub

* revert jinja suffix and move it to processor file
2025-10-15 13:49:59 +00:00
7bba4d1202 Fix video processing channel format (#41603)
fix
2025-10-15 15:48:01 +02:00
ab92534377 enable sdpa enable gqa logic for Ascend NPU (#41601)
* enable gqa logic for Ascend NPU

* remove redundant comments

* fix comments about Ascend NPU

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-10-15 13:45:28 +00:00
56a727dde5 Add fast path for bidirectional mask creation to fix regression (#41586)
* fixed performance regression

* also fixed the older_torch function

* Update src/transformers/masking_utils.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix

* more general

* fix slicing

* fix data dependent

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-10-15 15:30:39 +02:00
dc6fdeb705 Update a dataset repo link (#41618)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-15 14:41:38 +02:00
3953b65440 Reinstate early CUDA init fix (#41617)
* Reinstate early CUDA init fix

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Delay import further

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-15 14:41:10 +02:00
96d245a83d torch 2.9 don't ❤️ torchcodec 💔 (#41610)
pin

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-15 14:34:00 +02:00
bb0c3af995 More markdown file fixes (#41599)
* Format markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-15 12:29:27 +00:00
70e871959c Fix trainer simple tests (#41449)
* fix

* fix ray

* train to tune

* breaking changes wrt generation config

* Fix !

* fix

* fix

* fix deepspeed !

* fix

* fix

* fix

* improve logic

* revert and fix

* revert comment

* oups

* revert change

* fix

* style

* typo in comment

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-15 14:09:00 +02:00
c4210796e0 Import expand_device_map instead of redefining it (#41608)
remove it
2025-10-15 14:00:09 +02:00
fcd1ccdb78 [Docs] Fix changed references (#41614)
* fix

* fix

* other ln
2025-10-15 13:59:13 +02:00
2b2c20f315 Update issue template (#41573)
* update

* fix
2025-10-15 13:54:37 +02:00
e2122c4bcb remove ray_scope and check_quantized_param (#41587)
remove
2025-10-15 13:10:35 +02:00
e89cef6625 fix some case failures lead by "torch.compile recompiled part of th… (#41558)
* fix some case failures lead by "`torch.compile` recompiled part of the forward pass" in xpu

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update comment

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2025-10-15 10:45:29 +00:00
26b7f66850 Add logits_to_keep to many older CausalLM models (#41335)
* Add logits_to_keep to CausalLM models

* Skip failing test for git model

* Remove unused return_dict from kosmos2 signature

* Revert BlipForQuestionAnswering
2025-10-15 11:56:01 +02:00
5db730786d [device_map] Accelerate loading by computing device_map much faster (#41548)
* start

* add the important fix

* continue

* big cleanup

* type hints

* add method

* fix typehints

* typehints

* fix

* oupsi

* remove space

* improve function

* CI
2025-10-15 11:18:57 +02:00
13a35a5057 Enable non-streaming mode in transformers serve (#41446)
* Enable non-streaming in transformers serve

Remove typos

* Fix tests

* Arthur review
2025-10-15 09:37:26 +02:00
94df0e6560 Benchmark overhaul (#41408)
* Big refactor, still classes to move around and script to re-complexify

* Move to streamer, isolate benches, propagate num tokens

* Some refacto

* Added compile mode to name

* Re-order

* Move to dt_tokens

* Better format

* Fix and disable use_cache by default

* Fixed compile and SDPA backend default

* Refactor results format

* Added default compile mode

* Always use cache

* Fixed cache and added flex

* Plan for missing modules

* Experiments: no cg and shuffle

* Disable compile for FA

* Remove wall time, add sweep mode, get git commit

* Review compliance, start

* Apply suggestions from code review

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

* Update benchmark_v2/framework/benchmark_runner.py

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

* Disable workflow

* Pretty print

* Added some pretty names to have pretty logs

* Review n2 compliance (end?)

* Style and end of PR

---------

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
2025-10-14 21:41:43 +02:00
9e4199ede3 Gemma3 fixes (#41572)
* Multiple device error fix

* FA2 equivalence fix

* Move the train fwd in cfg test

* Style

* Added comment

* Made the comment more clear
2025-10-14 18:33:27 +02:00
4c8d293599 Fix typesetting and content of llm_tutorial_optimization.md (#41172)
* Fix typesetting of llm_tutorial_optimization

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix errors

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-14 08:40:26 -07:00
a99b1be3c7 Revert some breaking changes bnb (#41581)
fix
2025-10-14 16:28:16 +02:00
82cae9eb52 Add __iter__ to DynamicCache (#41569)
* Add __iter__ to DynamicCache

* Fix tests that use ddp init
2025-10-14 16:16:32 +02:00
4fad35ee4a [VisionEncoderDecoderModel] Update loss function (#40863)
Update loss function
2025-10-14 16:03:00 +02:00
ae6f6cc3e0 Revert "add rmsnorm kernels support for Intel XPU" (#41579)
Revert "add rmsnorm kernels support for Intel XPU (#41563)"

This reverts commit fd787c5f6d667d3e00def70f588972af4437f631.
2025-10-14 15:49:33 +02:00
fd787c5f6d add rmsnorm kernels support for Intel XPU (#41563)
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-10-14 13:26:09 +00:00
4e4f2af586 Add conditional checks to _check_and_adjust_attn_implementation() (#41542)
2025-10-14 13:00:07 +00:00
3648fde486 Add DINOv3Backbone for ConvNext variant (#40651)
---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-10-14 14:57:04 +02:00
abf5b57a68 delete some tokenizer tests using pickle (#41514)
* hate pickle

* hate pickle

* hate pickle

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-14 14:50:51 +02:00
8fe4db5399 [kernels] rm mra kernels (#41507)
* fix modeling

* remove kernel

* fix style
2025-10-14 13:34:04 +02:00
c620c38bb0 [Qwen3VLMoe] Fixed: Expected self.dtype to be equal to src.dtype - routing_weights casting (#41420)
* Fixed Expected self.dtype to be equal to src.dtype on eval

* Fixed Expected self.dtype to be equal to src.dtype on eval

* Fixed Expected self.dtype to be equal to src.dtype on eval

* generated modeling_qwen3_vl_moe.py file

* Fixed Ernie_4_5_MoE router casting

* Fixed routing_weights dtype casting (ernie4_5_moe, hunyuan_v1_moe, qwen2_moe, qwen3_moe, qwen3_next,qwen3_omni_moe)

* rollback hunyuan_v1_moe changes

---------

Co-authored-by: Daniel Oliveira <daniel-oliveira-11@hotmail.com>
Co-authored-by: Daniel Oliveira <36623265+daniel3303@users.noreply.github.com>
2025-10-14 13:14:49 +02:00
0798797ec9 Fix an import error with PreTrainModel (#41571)
2025-10-14 13:13:37 +02:00
0566b6f5bd Patch MistralCommonTokenizer (#41439)
* Fix token_to_id and add add_generation_prompt

* Fix spm download

* Refactor spm

* Try another possibly non-gated spm

* Improve get_vocab

* lint

* Improve get_vocab

* Add warn to piece_to_id

* Improve from_pretrained raise and revert model spm

* Revert fast
2025-10-14 11:13:19 +00:00