Compare commits

...

3986 Commits

Author SHA1 Message Date
ca38d640b7 fix 2025-10-17 22:21:10 +02:00
b8011a3dc5 fix 2025-10-17 21:30:29 +02:00
a118c8e1c4 fix 2025-10-17 20:59:00 +02:00
0109c42409 fix 2025-10-17 20:30:23 +02:00
3c7552f733 fix 2025-10-17 15:40:54 +02:00
4757bf062b fix 2025-10-17 15:12:12 +02:00
aceaa7ce97 fix 2025-10-17 15:05:09 +02:00
c9293376a0 fix 2025-10-17 12:04:50 +02:00
e69d3ca150 check 1 2025-10-17 10:44:24 +02:00
bffad7f4fb check 1 2025-10-17 09:21:09 +02:00
740f952218 check 1 2025-10-17 06:57:10 +02:00
950c4e5303 check 1 2025-10-17 06:28:55 +02:00
89970f4797 check 1 2025-10-17 03:03:25 +02:00
a4a46e62a5 check 1 2025-10-16 21:32:04 +02:00
9b36498d5f 1 2025-10-16 21:16:53 +02:00
eefbf4ac8b 🌐 [i18n-KO] Translated llama4.md to Korean (#40396)
* docs: ko: llama4.md

* feat: nmt draft

* fix: manual edits

* Update docs/source/ko/model_doc/llama4.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/model_doc/llama4.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/model_doc/llama4.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/model_doc/llama4.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

---------

Co-authored-by: TaskerJang <bymyself103@naver.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2025-10-16 11:28:27 -07:00
50ca781d78 🌐 [i18n-KO] Translated code_llama.md to Korean (#40558)
* docs: ko: code_llama.md

* feat: nmt draft

* fix: manual edits

* Apply suggestions from code review

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>
Co-authored-by: HyunZ118 <156191095+HyunZ118@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

---------

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>
Co-authored-by: HyunZ118 <156191095+HyunZ118@users.noreply.github.com>
2025-10-16 11:27:46 -07:00
8739fc05c4 [i18n-KO] Translated big_bird.md to Korean (#40445)
* docs: ko: BigBird.md

* feat: nmt draft

* fix: manual edits
2025-10-16 11:23:56 -07:00
77b5ad65ee 🌐 [i18n-KO] Translated sam_hq.md to Korean (#41340)
* fix: manual edits

* Apply suggestions from code review

Apply suggestions from code review

Co-authored-by: HyunSang Jang <tasker.dev103@gmail.com>

* Apply suggestions from code review

Apply suggestions from code review

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: HyunSang Jang <tasker.dev103@gmail.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2025-10-16 11:10:16 -07:00
a9731a725e 🌐 [i18n-KO] Translated chat_extras.md to Korean (#39863)
* docs: ko: chat_extras.md

* feat: nmt draft

* fix: manual edits

* Apply suggestions from code review

* Apply suggestions from code review

* Update docs/source/ko/chat_extras.md
2025-10-16 10:41:03 -07:00
bdbc2d037b [Trainer] [Breaking change] use_cache default to False (#41585)
* use_cache default to `False` when training

* style

* Fix comment

* add checks

* style

* set

* switch
2025-10-16 18:51:36 +02:00
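A minimal sketch of what this breaking change means downstream, assuming (per the title) that the `Trainer` now forces `use_cache=False` while training: re-enable caching explicitly when moving to generation.

```python
from transformers import AutoModelForCausalLM

# Hypothetical fine-tuned checkpoint, for illustration only.
model = AutoModelForCausalLM.from_pretrained("my-org/my-finetuned-model")

# With this change, training runs with use_cache=False by default;
# re-enable the KV cache explicitly before autoregressive generation.
model.config.use_cache = True
```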
fe11cbb808 Erroring when KernelConfig is passed without use_kernels = True (#41657)
* update

* update
2025-10-16 18:08:46 +02:00
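The expected calling pattern, sketched from the title alone (`KernelConfig` and `use_kernels` are named there; treat the exact import path and signature as assumptions).

```python
from transformers import AutoModelForCausalLM

# Assumption based on the commit title: a kernel config is only honored
# together with use_kernels=True, and passing one without it now raises
# instead of being silently ignored.
model = AutoModelForCausalLM.from_pretrained(
    "my-org/my-model",  # hypothetical checkpoint
    use_kernels=True,
    # kernel_config=KernelConfig(...),  # hypothetical config object from the title
)
```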
6344371a91 improve utils/check_bad_commit.py (#41658)
* robust

* robust

* robust

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-16 15:51:19 +00:00
a408384a88 Improve package version check (#41661)
fix
2025-10-16 17:31:58 +02:00
f7c33abab3 Small changes to benchmarking script (#41662) 2025-10-16 17:25:49 +02:00
9839d57a02 Fix serving continuous batching (#41624)
* update-serving-cb

* style

* style

* check none

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-10-16 17:24:21 +02:00
e85d5ab2bb Fix dtype casting with quantization (#41665)
fix dtype casting
2025-10-16 17:19:32 +02:00
1c36d407d5 Add in-out modalities as class attribute per model (#41366)
* update all models

* fix copies

* explanation comment

* better notation in omni model

* style

* fix copies

* output_modalities under generation mixin

* fix copies

* oh, glm4v also needs conversion
2025-10-16 17:11:06 +02:00
0215846d98 Switch to CB if cache_implementation == paged (#41655)
* Add a switch to CB in case of paged cache

* Added paged as a valid cache implem

* Added a fallback on inputs_ids as a name

* Rookie mistake

* Removed paged from cache implems

* Added warning about some beam search args

* Moved up CB warning
2025-10-16 17:00:18 +02:00
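Sketch of the behavior this adds (the `cache_implementation` generation kwarg is established; routing `"paged"` through continuous batching is what this commit introduces).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hf-internal-testing/tiny-random-gpt2"  # tiny test checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello", return_tensors="pt")
# Per this commit, requesting a paged cache now falls back to the
# continuous-batching (CB) path inside generate().
out = model.generate(**inputs, max_new_tokens=8, cache_implementation="paged")
```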
9e99198e5e Use | for Optional and Union typing (#41646)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-16 14:29:54 +00:00
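The style change in a nutshell: PEP 604 unions replace `typing.Optional`/`typing.Union` annotations.

```python
from typing import Optional, Union

# Before: typing-module constructs.
def resize_old(size: Optional[int] = None, scale: Union[int, float] = 1.0) -> None: ...

# After this commit: PEP 604 union syntax (Python 3.10+).
def resize_new(size: int | None = None, scale: int | float = 1.0) -> None: ...
```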
bf815e9b5e [Masks] Fix mask handling in eager for vision models (#41625)
add mask handling in case of models that do use it
2025-10-16 16:27:26 +02:00
4a43e3d57c purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet (#41656) 2025-10-16 16:17:09 +02:00
8725ce10ed [Fix] Deepseek V3 expert bias routing (#41647)
* [Fix] Deepseek V3 expert bias routing

* [Fix] fix-copies

* [Fix] Run make style
2025-10-16 14:04:48 +00:00
1fb3fc4db0 [kernels] refactor function kernel calling (#41577)
* refactor function kernel calling

* nit

* don't pass the mapping

* use _kernels_available

* rm import
2025-10-16 15:43:02 +02:00
9176af574a Double router compute? (#41653)
* weird double router compute?

* flip it
2025-10-16 15:17:21 +02:00
503c933f36 Fix confusing cls assignment (#41642)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-16 13:01:07 +00:00
2aff20aff6 Fix typos in documentation (#41641)
Fix typos

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-16 12:58:46 +00:00
981370c038 Format MarkDown documentation and tiny fixes (#41638)
* Fix MarkDown syntax

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-16 12:58:06 +00:00
eef9fb2af3 Fix EncoderDecoder cache (#41612)
* Fix EncoderDecoder cache

* Add the option for the ddp data tuples to have 2 elems

* Modifiy the order of the KV and sliding

* Adapted RAG and Whisper to new EncoderDecoderCache

* A single comma

* Remove kwargs in map

* Fixed order in manual injection cache test

* Slight changes to support legacy format

* Removed Nones
2025-10-16 14:55:41 +02:00
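For orientation, the object this touches: `EncoderDecoderCache` pairs a self-attention cache with a cross-attention cache (construction sketch; the ordering and legacy-format handling fixed above are internal details).

```python
from transformers.cache_utils import DynamicCache, EncoderDecoderCache

# An encoder-decoder cache wraps two caches; this PR fixes their ordering
# and the legacy (tuple) format handling around them.
cache = EncoderDecoderCache(
    self_attention_cache=DynamicCache(),
    cross_attention_cache=DynamicCache(),
)
```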
35dc8f0a2e Adjust device logging level and add minor fixes (#41636)
This commit addresses a noisy warning and improves the robustness of the base pipeline implementation.

- The device placement message in the pipeline base class has been changed from a `warning` to a `debug` log. This reduces log noise for users who are aware of their device setup, while still providing the information for debugging purposes.

- Additionally, potential `UnboundLocalError` exceptions in the `_pad` and `check_model_type` functions have been prevented by initializing variables before their conditional assignment.
2025-10-16 12:47:39 +00:00
2935a1be19 Fix fp32_ln for various models (#41605)
* Add is_causal to KosmosTextAttention

* Move get target_dtype to be imported elsewhere

* Fix fp32 flash attention bug in bark

* Fix is_causal in mllama

* Fix fp32 issue on StableLM

* Fix repo-consistency
2025-10-16 14:18:49 +02:00
b9bd8c45a1 [CI] Build translated docs (#41632)
fix
2025-10-16 14:01:33 +02:00
baecdb8a97 [Ernie 4.5 Moe] Fix Moe and offloading (#41385)
fix
2025-10-16 13:59:01 +02:00
44539827d5 [Executorch] Simplify for encoder models (#41627)
* Trigger Build

* revert extra treatment for executorch as we default to no vmapping now
2025-10-16 13:57:52 +02:00
143acfe2ce fix check inputs for text2text pipeline (#41556)
fix check inputs

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-10-16 11:42:41 +00:00
67fae90519 Fix FP-Quant quantization fallback CPU dispatch. (#41619)
* fp_quant fix

* Update quantizer_fp_quant.py
2025-10-16 11:41:01 +00:00
af2a66ced9 Migrate transformers cli to Typer (#41487)
* Add typer-slim as explicit dependency

* Migrate CLI to Typer

* code quality

* bump release candidate

* adapt test_cli.py

* Remove ./commands + adapt tests

* fix quality

* consistency

* doctested

* do not serve model in chat

* style

* will it fix them?

* fix test

* capitalize classes

* Rebase

* Rebase

* tests + fixup

tests + fixup

* custom error message

* fix ?

* should be good

* fix caplog globally

* inner caplog

* last attempt

* Retry

* Let's try with capsys disabled

---------

Co-authored-by: Lysandre <hi@lysand.re>
2025-10-16 13:29:42 +02:00
a59124e27e Add missing dates to docs (#41576)
add dates
2025-10-16 09:32:28 +00:00
81f97b17d2 Remove randomly added script (#41650)
remove
2025-10-16 11:23:53 +02:00
c0a5cf19ad Fix tokenization test (#41649)
fix
2025-10-16 11:14:20 +02:00
3ef6f2c415 Allow passing tp_plan in from_pretrained directly (#41435)
* start

* allow passing it

* fix plans

* fix

* fix

* style

* style

* fix

* add_test

* oupsi indent

* fix

* fix

* fix for CI without accelerator

* fix import
2025-10-16 11:12:07 +02:00
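What passing `tp_plan` directly might look like (`tp_plan="auto"` was already the documented entry point; handing in a plan yourself is what this PR adds, so the dict form below is illustrative, not the exact schema).

```python
from transformers import AutoModelForCausalLM

# Established: let transformers choose the tensor-parallel plan.
model = AutoModelForCausalLM.from_pretrained("my-org/my-model", tp_plan="auto")

# Per this PR, a plan can now be passed in directly as well; the key/value
# below is illustrative only.
# model = AutoModelForCausalLM.from_pretrained(
#     "my-org/my-model", tp_plan={"model.layers.*.mlp.up_proj": "colwise"}
# )
```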
59efd86da2 Add aux loss for GLM-4.5V (#41564)
* add aux

* update

* update config to text_config

* use qwen data class to avoid repetition

* format

* update

* use 1e-4

* update

* update for remove init

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-10-16 09:04:21 +00:00
7b7d17f9bf 🚨 [v5] Toggle the serialization format in processors (#41474)
* toggle the serialization

* prob this fixes it

* fix tests

* typo

* delete legacy save entirely

* remove extra nesting in if

* revert test and serialize a public attr instead of private
2025-10-16 10:19:22 +02:00
e20df45bf6 Add Backbone API fine-tuning tutorial (#41590)
---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-10-15 18:42:32 +02:00
19df66dcba Update executorch.md (#41582)
* Update executorch.md

* Update executorch.md

* Update executorch.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-10-15 09:01:46 -07:00
9f71e3a604 [docs] Duplicate entry (#41591)
fix
2025-10-15 17:02:36 +02:00
bc9900562d Fix quantization base class (#41613)
* fix

* fix

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-15 16:58:17 +02:00
72fd67929b Remove deprecated code (#41616)
remove

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-15 16:57:52 +02:00
da382917aa Remove the head masking block in some vision models (#41620)
* old

* new

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-15 15:51:01 +02:00
313afcc468 [chat template] update when "push_to_hub" (#39815)
* update templates push to hub

* revert jinja suffix and move it to processor file
2025-10-15 13:49:59 +00:00
7bba4d1202 Fix video processing channel format (#41603)
fix
2025-10-15 15:48:01 +02:00
ab92534377 enable sdpa enable gqa logic for Ascend NPU (#41601)
* enable gqa logic for Ascend NPU

* remove redundant comments

* fix comments about Ascend NPU

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-10-15 13:45:28 +00:00
56a727dde5 Add fast path for bidirectional mask creation to fix regression (#41586)
* fixed performance regression

* also fixed the older_torch function

* Update src/transformers/masking_utils.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix

* more general

* fix slicing

* fix data dependent

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-10-15 15:30:39 +02:00
dc6fdeb705 Update a dataset repo link (#41618)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-15 14:41:38 +02:00
3953b65440 Reinstate early CUDA init fix (#41617)
* Reinstate early CUDA init fix

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Delay import further

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-15 14:41:10 +02:00
96d245a83d torch 2.9 don't ❤️ torchcodec 💔 (#41610)
pin

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-15 14:34:00 +02:00
bb0c3af995 More markdown file fixes (#41599)
* Format markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-15 12:29:27 +00:00
70e871959c Fix trainer simple tests (#41449)
* fix

* fix ray

* train to tune

* breaking changes wrt generation config

* Fix !

* fix

* fix

* fix deepspeed !

* fix

* fix

* fix

* improve logic

* revert and fix

* revert comment

* oups

* revert change

* fix

* style

* typo in comment

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-15 14:09:00 +02:00
c4210796e0 Import expand_device_map instead of redefining it (#41608)
remove it
2025-10-15 14:00:09 +02:00
fcd1ccdb78 [Docs] Fix changed references (#41614)
* fix

* fix

* other ln
2025-10-15 13:59:13 +02:00
2b2c20f315 Update issue template (#41573)
* update

* fix
2025-10-15 13:54:37 +02:00
e2122c4bcb remove ray_scope and check_quantized_param (#41587)
remove
2025-10-15 13:10:35 +02:00
e89cef6625 fix some case failures caused by "torch.compile recompiled part of th… (#41558)
* fix some case failures caused by "`torch.compile` recompiled part of the forward pass" in xpu

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update comment

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2025-10-15 10:45:29 +00:00
26b7f66850 Add logits_to_keep to many older CausalLM models (#41335)
* Add logits_to_keep to CausalLM models

* Skip failing test for git model

* Remove unused return_dict from kosmos2 signature

* Revert BlipForQuestionAnswering
2025-10-15 11:56:01 +02:00
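A quick sketch of the kwarg this PR propagates to older models (behavior as in the newer CausalLM models: `logits_to_keep=1` materializes logits for the last position only).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hf-internal-testing/tiny-random-gpt2"  # tiny test checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    # Only materialize logits for the final position: shape (batch, 1, vocab).
    out = model(**inputs, logits_to_keep=1)
print(out.logits.shape)
```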
5db730786d [device_map] Accelerate loading by computing device_map much faster (#41548)
* start

* add the important fix

* continue

* big cleanup

* type hints

* add method

* fix typehints

* typehints

* fix

* oupsi

* remove space

* improve function

* CI
2025-10-15 11:18:57 +02:00
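The public surface is unchanged; only the `device_map` computation behind this call gets faster.

```python
from transformers import AutoModelForCausalLM

# Same API as before; this PR only speeds up how the "auto" device map
# is computed before weights are dispatched.
model = AutoModelForCausalLM.from_pretrained(
    "my-org/my-large-model",  # hypothetical checkpoint
    device_map="auto",
)
```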
13a35a5057 Enable non-streaming mode in transformers serve (#41446)
* Enable non-streaming in transformers serve

Remove typos

Remove typos

Remove typos

* Fix tests

* Arthur review
2025-10-15 09:37:26 +02:00
94df0e6560 Benchmark overhaul (#41408)
* Big refactor, still classes to move around and script to re-complexify

* Move to streamer, isolate benches, propagate num tokens

* Some refacto

* Added compile mode to name

* Re-order

* Move to dt_tokens

* Better format

* Fix and disable use_cache by default

* Fixed compile and SDPA backend default

* Refactor results format

* Added default compile mode

* Always use cache

* Fixed cache and added flex

* Plan for missing modules

* Experiments: no cg and shuffle

* Disable compile for FA

* Remove wall time, add sweep mode, get git commit

* Review compliance, start

* Apply suggestions from code review

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

* Update benchmark_v2/framework/benchmark_runner.py

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

* Disable workflow

* Pretty print

* Added some pretty names to have pretty logs

* Review n2 compliance (end?)

* Style and end of PR

---------

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
2025-10-14 21:41:43 +02:00
9e4199ede3 Gemma3 fixes (#41572)
* Multiple device error fix

* FA2 equivalence fix

* Move the train fwd in cfg test

* Style

* Added comment

* Made the comment more clear
2025-10-14 18:33:27 +02:00
4c8d293599 Fix typesetting and content of llm_tutorial_optimization.md (#41172)
* Fix typesetting of llm_tutorial_optimization

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix errors

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-14 08:40:26 -07:00
a99b1be3c7 Revert some breaking changes bnb (#41581)
fix
2025-10-14 16:28:16 +02:00
82cae9eb52 Add __iter__ to DynamicCache (#41569)
* Add __iter__ to DynamicCache

* Fix tests that use ddp init
2025-10-14 16:16:32 +02:00
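A sketch of the new surface (the PR pairs `__iter__` with DDP-init fixes, so the assumption here is that iteration yields one entry per cached layer, in the spirit of the legacy tuple format).

```python
from transformers.cache_utils import DynamicCache

cache = DynamicCache()
# Assumption: iterating a DynamicCache yields per-layer entries, which is
# what lets DDP's tuple-style handling of model outputs work again.
for layer in cache:
    print(layer)
```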
4fad35ee4a [VisionEncoderDecoderModel] Update loss function (#40863)
Update loss function
2025-10-14 16:03:00 +02:00
ae6f6cc3e0 Revert "add rmsnorm kernels support for Intel XPU" (#41579)
Revert "add rmsnorm kernels support for Intel XPU (#41563)"

This reverts commit fd787c5f6d667d3e00def70f588972af4437f631.
2025-10-14 15:49:33 +02:00
fd787c5f6d add rmsnorm kernels support for Intel XPU (#41563)
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-10-14 13:26:09 +00:00
4e4f2af586 Add conditional checks to _check_and_adjust_attn_implementation() (#41542) 2025-10-14 13:00:07 +00:00
3648fde486 Add DINOv3Backbone for ConvNext variant (#40651)
---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-10-14 14:57:04 +02:00
abf5b57a68 delete some tokenizer tests using pickle (#41514)
* hate pickle

* hate pickle

* hate pickle

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-14 14:50:51 +02:00
8fe4db5399 [kernels] rm mra kernels (#41507)
* fix modeling

* remove kernel

* fix style
2025-10-14 13:34:04 +02:00
c620c38bb0 [Qwen3VLMoe] Fixed: Expected self.dtype to be equal to src.dtype - routing_weights casting (#41420)
* Fixed Expected self.dtype to be equal to src.dtype on eval

* Fixed Expected self.dtype to be equal to src.dtype on eval

* Fixed Expected self.dtype to be equal to src.dtype on eval

* generated modeling_qwen3_vl_moe.py file

* Fixed Ernie_4_5_MoE router casting

* Fixed routing_weights dtype casting (ernie4_5_moe, hunyuan_v1_moe, qwen2_moe, qwen3_moe, qwen3_next,qwen3_omni_moe)

* rollback hunyuan_v1_moe changes

---------

Co-authored-by: Daniel Oliveira <daniel-oliveira-11@hotmail.com>
Co-authored-by: Daniel Oliveira <36623265+daniel3303@users.noreply.github.com>
2025-10-14 13:14:49 +02:00
0798797ec9 Fix an import error with PreTrainModel (#41571) 2025-10-14 13:13:37 +02:00
0566b6f5bd Patch MistralCommonTokenizer (#41439)
* Fix token_to_id and add add_generation_prompt

* Fix spm download

* Refactor spm

* Try another possibly non-gated spm

* Improve get_vocab

* lint

* Improve get_vocab

* Add warn to piece_to_id

* Improve from_pretrained raise and revert model spm

* Revert fast
2025-10-14 11:13:19 +00:00
b3e3c3dc93 [Qwen3VL] fix device mismatch error for FSDP2 training (#41536)
For FSDP2, parameters might be on a meta device, and the weight.device attribute may
not accurately reflect where the actual computation will happen during forward passes.

```log
  File "transformers/models/qwen3_vl_moe/modeling_qwen3_vl_moe.py", line 776, in forward
    pos_embeds = self.fast_pos_embed_interpolate(grid_thw)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "transformers/models/qwen3_vl_moe/modeling_qwen3_vl_moe.py", line 745, in fast_pos_embed_interpolate
    pos_embeds = self.pos_embed(idx_tensor) * weight_tensor[:, :, None]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/module.py", line 1879, in _call_impl
    return inner()
           ^^^^^^^
  File "torch/nn/modules/module.py", line 1827, in inner
    result = forward_call(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/nn/modules/sparse.py", line 192, in forward
    return F.embedding(
           ^^^^^^^^^^^^
  File "torch/nn/functional.py", line 2546, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__index_select)
```
https://github.com/volcengine/verl/pull/3686#issuecomment-3380981817

Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-10-14 10:28:25 +00:00
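The shape of the fix, as a hypothetical sketch (not the actual diff): derive the index tensor's device from a tensor flowing through the forward pass instead of from `weight.device`, which FSDP2 may still report as `meta`.

```python
import torch
from torch import nn

def build_pos_index(grid_thw: torch.Tensor, pos_embed: nn.Embedding) -> torch.Tensor:
    # Hypothetical helper: anchor the device on an activation tensor
    # (grid_thw) rather than pos_embed.weight.device, which FSDP2 can
    # leave on "meta" before parameters are materialized.
    return torch.arange(grid_thw.shape[0], device=grid_thw.device)
```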
b84c0b31c6 Remove references to AutoModelForVision2Seq (#41513)
* Since Vision2Seq is deprecated, remove it from pipelines and docstrings

* Catch some more references
2025-10-13 17:00:07 +01:00
1ee3b288a6 [from_pretrained] Small refactor from_pretrained: move around unrelated stuff (#41445)
* drafts

* up

* simplify modeling utils

* more simplifications

* type kwargs

* up

* move more accelerate related stuff

* safeguarding?

* nits

* remove func when func is NOPE

* more

* nits

* styling

* yups

* up

* ups

* revert

* protect trainer utils import

* fix doc

* Update src/transformers/integrations/peft.py

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* review

* update

* ?

* fixx

* update

* super small update

* ups

* style

* this is stupid

* 🤦 well this was the issue

* small nit

* fix

* nit

* damn the missing return

* one last stupid fix

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-10-13 16:33:32 +02:00
cad74496ca [model] Add VideoLLaMA3 implementation (#40499)
* Add VideoLLaMA3 implementation

* Run style fix

* Switch to modular

* Fix config and smart_resize

* Fix

* Fix

* Fix style

* Fix

* Ruff fix

* Rename

* Rename

* Fix

* Clean

* Fix consistency

* Add doc

* Fix

* Fix

* Fix doc

* Update generated code

* remove test_initialization

* fix tests

* simplify

* tests

* Add VideoLlama3IntegrationTest

* replace asserts

* fix tests

---------

Co-authored-by: steven-ccq <55176896+steven-ccq@users.noreply.github.com>
Co-authored-by: steven-ccq <1456320989@qq.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-13 15:54:34 +02:00
3813a8e3a1 Add VideoMAE video processor (#41534)
* Add video processor for VideoMAE

* Document VideoMAE video processor

* Add regression tests for VideoMAE video processor

* refactor: Use direct batch key access for pixel_values_videos

* test: add parity test for VideoMAEVideoProcessor vs VideoMAEImageProcessor

* docs(videomae): update model docstring example to demonstrate VideoMAEVideoProcessor (TorchCodec-based decoding and sampling)
2025-10-13 15:42:27 +02:00
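Usage sketch based on the bullets above (the class name and the `pixel_values_videos` key come from the commit; the checkpoint is the standard VideoMAE base).

```python
import numpy as np
from transformers import VideoMAEVideoProcessor  # class added by this PR

processor = VideoMAEVideoProcessor.from_pretrained("MCG-NJU/videomae-base")

# One 16-frame 224x224 RGB video as a (frames, height, width, channels) array.
video = np.random.randint(0, 256, (16, 224, 224, 3), dtype=np.uint8)
batch = processor(videos=video, return_tensors="pt")
print(batch["pixel_values_videos"].shape)  # output key per the bullets above
```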
66d8d7a077 Fixed typos and formatting (#34215)
#hacktoberfest
2025-10-13 13:38:06 +00:00
d621be8286 🚨 [v5] generate delegates default cache initialization to the model (#41505) 2025-10-13 13:20:48 +01:00
d7c9fbdb64 Enable modular files from other libraries (#41372)
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-13 13:48:32 +02:00
41e763decd Add AMD developer cloud support (#41126)
* Add AMD developer cloud support

* Add AMD remote svg link.

* Update notebooks/README.md

Co-authored-by: pagezyhf <165770107+pagezyhf@users.noreply.github.com>

---------

Co-authored-by: Rémi Ouazan <83456801+remi-or@users.noreply.github.com>
Co-authored-by: pagezyhf <165770107+pagezyhf@users.noreply.github.com>
2025-10-13 12:17:24 +02:00
cf1e9834ec Restore cuda graphs to continuous batching (#41421)
* Type hints and small fixes

* Remove unused params

* Made slice inputs the default

* ruffed

* Updated some var name and moved index slicing

* Logging arg in example

* Added some padding debug var and reformat out cg

* First working CG, fixe size

* Working flexible CG

* CG are compatible with all implementations

* Fixed CG API

* Update example

* Documentation

* Fix padding tokens in FA

* Review compliance

* Better doc around weird bug

* Style

* Fix for sliding with CG
2025-10-13 11:57:56 +02:00
6c901bdc0e [SAM] Fix typing hints (#41506)
fix
2025-10-13 11:52:00 +02:00
58f9e13313 Fixed Type-hints in function defintions (#41525)
* Explicitly annotate default None parameters as Optional

* make style.

* make style.

* Fixed check_copies.

* fix consistency.
2025-10-13 11:48:37 +02:00
eb28242251 Add MLlama fast image processor (#41391)
* Merge conflict

* add fast processor

* add fast processor

* make style

* add new convert rgb

* use nested group by shape in mllama fast, add support for multiple inputs in group by shape

* refactor after review

---------

Co-authored-by: Vincent <phamvinh257@gmail.com>
2025-10-13 09:16:05 +00:00
65cb8fac6d [Qwen3VL] fix: hidden_states in place modification error (#41535)
```
  File "transformers/models/qwen3_vl_moe/modeling_qwen3_vl_moe.py", line 941, in forward
    hidden_states = self._deepstack_process(
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "transformers/models/qwen3_vl_moe/modeling_qwen3_vl_moe.py", line 960, in _deepstack_process
    hidden_states[visual_pos_masks, :] = local_this
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Output 0 of SliceBackward0 is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.
```

Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-10-13 10:50:14 +02:00
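The generic remedy the error message itself points at, sketched with stand-in tensors: clone the view before the masked in-place write.

```python
import torch

# Stand-in tensors mirroring the failing shapes in the traceback above.
hidden_states = torch.randn(4, 8, requires_grad=True).tanh()
visual_pos_masks = torch.tensor([True, False, True, False])
local_this = torch.randn(2, 8)

# Cloning detaches the storage from the autograd view, so the masked
# in-place write no longer clobbers a custom Function's output.
hidden_states = hidden_states.clone()
hidden_states[visual_pos_masks, :] = local_this
```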
3927ffed31 [testing] reduce runtime of HunYuanMoEV1IntegrationTest:test_model_generation (#41373)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-10 22:27:01 +02:00
7164924a7e Fix LaTeX typesetting in documentation (#41177)
Fix LaTeX typesetting in documentation

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-10 08:54:27 -07:00
26a5368c44 Allow optuna's catch kwargs passthrough (#41496)
* allow optuna's catch kwargs passthrough

* apply ruff formatting

---------

Co-authored-by: nicha <nicha.api@nectec.or.th>
2025-10-10 13:58:07 +00:00
feca4f3de7 remove tpu_num_cores (#41383)
* remove-tpu-num-cores

* fix

* let's remove it

* style

* Update examples/legacy/seq2seq/finetune_tpu.sh

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-10 15:53:28 +02:00
c6042a4169 Remove outdated flags (#41512)
remove flags
2025-10-10 14:34:47 +02:00
dfd4121cd4 add Trainer import to .md in appropriate cell block for training.ipynb transformers_doc (#41484)
add Trainer import to .md in appropriate cell block for docs
2025-10-10 12:04:07 +00:00
60f6ec438a Fix detectron2 import (#41510)
* fix

* fix

* typo
2025-10-10 13:33:47 +02:00
f9f8bf5a10 Revert local_rank deletion and some cleaning (#41504)
* forgot those

* clean

* Fix

* merge

* fix

* fix
2025-10-10 12:23:04 +02:00
b4067472ae Bump to hfh 1.0.0.rc5 to fix test (#41508) 2025-10-10 12:12:08 +02:00
bc529a3368 More trainer cleaning (#41489)
clean
2025-10-10 11:55:43 +02:00
b92fc0c6e1 [QoL] modular conversion shows LoC saved (#41500)
smol qol conversion
2025-10-10 11:55:23 +02:00
2eae7c7452 Set truncation to False in Qwen3Omni to avoid default truncation (#41473)
* Set `truncation` to `False` in Qwen3Omni to avoid default truncation

* move `padding` and `truncation` to audio default args

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
2025-10-10 09:55:18 +00:00
c5094a4f97 [voxtral] language detection + skipping lang:xx (#41225)
* proc + doc update

* improve doc

* add lang:xx in decode

* update voxtral test

* nit

* nit

* update test value

* use regex
2025-10-10 09:18:30 +00:00
f4487ec521 fix gemma3n case failure (#41426)
* fix gemma3n case failure

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* Update dependency_versions_table.py

* change the case argument passing way to make the case PASS,
generation_config way need re-visit

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-10-10 09:15:27 +00:00
e8194fe84f Fix some tests (#41503)
* fix

* fix

* doc
2025-10-10 11:05:09 +02:00
9556b36b2f [causallm tester] automate pipeline mappings + bloom tests (#41318) 2025-10-10 10:02:00 +01:00
5aca530b34 [Parakeet] unnecessary warning & auto mapping (#41412)
* add parakeet to CONFIG_MAPPING_NAMES

* TOKENIZER_MAPPING_NAMES update

* fix auto tokenizer

* update

* fix
2025-10-10 11:00:15 +02:00
4f323369db Fixed tiny incorrect imports in glm4v (#41483)
Fixed tiny import issue in glm4v
2025-10-10 08:57:01 +00:00
f5f3457278 Try to remove pickle - BloomTokenizerFast (#41466)
* pickle 1

* pickle 1

* pickle 1

* pickle 1

* pickle 1

* pickle 1

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-10 10:52:51 +02:00
3585737746 [kernels] rm yoso kernel (#41495)
* disable kernel mapping

* rm kernel

* delete files

* style

* typo
2025-10-10 10:50:12 +02:00
b543679d0e [kernels] Remove RWKV kernel finally ! (#41493)
* rm kernel

* fix style
2025-10-10 10:32:05 +02:00
ac7777be16 fix bnb model loading (#41499) 2025-10-10 08:27:29 +00:00
17c31a98ac Streaming should be handled at the request-level rather than at the instance level (#41444)
* Streaming should be handled at the request-level rather than at the instance level

* Add tests

* Require torch GPU
2025-10-10 10:24:55 +02:00
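What "request-level streaming" means against an OpenAI-compatible `transformers serve` endpoint (the endpoint path and port below are assumptions based on the OpenAI convention).

```python
import requests

# Each request opts in or out of streaming via its own "stream" field,
# instead of the server instance fixing the behavior globally.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed serve endpoint
    json={
        "model": "my-org/my-model",  # hypothetical model id
        "messages": [{"role": "user", "content": "Hi"}],
        "stream": False,             # per-request toggle enabled by this PR
    },
)
print(resp.json())
```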
b28902c86b Remove DISABLE_KERNEL_MAPPING flag (#41475)
rm disable
2025-10-10 10:19:25 +02:00
d0271be18f Update philosophy (#41438)
* update philosophy

* Update docs/source/en/philosophy.md

Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/philosophy.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* emphasis

---------

Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-10-10 06:52:18 +00:00
0419ff881d Remove local_rank arg from TrainingArguments (#41382) 2025-10-09 18:54:12 +02:00
081391b20e deprecate jit_mode_eval (#41376) 2025-10-09 18:50:45 +02:00
1ddbbdef48 [Trainer] deprecate ray scope (#41403) 2025-10-09 18:50:00 +02:00
c20849bad1 [CI] Fix copies on main (#41486)
fix copies
2025-10-09 18:38:14 +02:00
776eea8612 deprecate overwrite_output_dir (#41323)
* dep

* style

* rm

* wut

* style
2025-10-09 18:36:19 +02:00
3839d51013 report_to default changed to "none" + cleaning deprecated env var (#41375)
* reporting

* fix

* fix
2025-10-09 18:28:48 +02:00
78f79ba5af Update GLM-4.6 doc (#41471)
Update glm4_moe.md
2025-10-09 09:18:05 -07:00
11c597b1b8 Remove deprecated args in Trainer for v5 (#41404)
remove deprecated code
2025-10-09 18:10:14 +02:00
b450d55a91 Remove past_index (#41384)
* remove-tpu-num-cores

* fix

* rm past index

* Revert "fix"

This reverts commit 7608a6c059210957d3a77812e66178c8b79a9313.

* Revert "remove-tpu-num-cores"

This reverts commit ef08a51d71389849851518d67d8ad6c9ea8f04fc.
2025-10-09 18:06:46 +02:00
1a3a5f5289 Remove SigOpt (#41479)
* remove sigopt

* style
2025-10-09 18:05:55 +02:00
823fab4860 Fix bnb fsdp loading for pre-quantized checkpoint (#41415)
* fix

* fix

* get_param_name

* fix device name
2025-10-09 18:05:35 +02:00
42d4e13a0b RT-Detr correct 2d positional embeddings for non-square images (#41380)
* Correct 2d positional embeddings for non-square images

* Simplify bug fix propagate changes to other models

---------

Co-authored-by: Konstantinos Pitas <kostasp210@gmail.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-10-09 17:58:22 +02:00
0eae41ad36 Add Code World Model (CWM) (#41199)
* [wip][cwm] Code World Model stubs and setup in HF Transformers

* [wip] Get other things working

* [wip] Working

* Tokenizer pad

* fix: cwm window attn

* temp remove test

* temp remove test

* Fixes

* Temporarily add auto config remapping option until VLLM 0.11 is out

* Fix model type and add layer validation

* Lint, remove CwmForSequenceClassification

* Lint, tests

* Remove CwmForSequenceClassification

* Lint

* Remove intermediary layer exports/doc errors, fix tests

* Lint

* run python utils/sort_auto_mappings.py --check_only

* Remove Cwm processor mapping, get check_repo passing

* Remove CwmTextConfig from test

* Add docstring for CwmConfig

* remove global_window and window_pattern params from config

* Fix docstrings

* Revert change to auto docstring util

* lint

* Fixes minus test improvements

* Alter tests to simply check logits

* lint

* Have slow tests use repo, make CwmPretrainedModel passthrough

* Remove decoder layer implementation, use Llama3Decoder + CwmAttention

* Use linear w/o bias for CwmAttention, add token-level integration test

* Don't ignore config attention bias

* Remove attention bias parameter entirely from config

---------

Co-authored-by: galco <galco@meta.com>
2025-10-09 17:57:45 +02:00
589fc29c9d enhance patched_tearDown to support python 3.11+ (#41429)
* enhance to support python 3.11+

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-10-09 21:19:29 +05:30
26b5b52676 [Fix] Fix test file error (#40973)
Fix test file error
2025-10-09 15:30:53 +00:00
34b861abd1 🚨 [Attention Masks] Bidirectional masks for encoder and encoder-decoder models (#41265)
* new masks

* fixes

* adjust comments

* fix unnecessary mask creation on sdpa

* simplify masks more

* propagate to other models

* style + repo consistency

* copies

* no comment

* fix attempt

* finally fix grounding dinos

* fix distilbert

* fix executorch

* move to own module

* address first few comments WIP

* revert device comments, simplify executorch further

* fix typo

* add a test for cuda graphs

* move cleanup...

* fix conflict with new main

* fix esm and evolla
2025-10-09 16:56:11 +02:00
b44d91570f [v5] remove load_in_4bit and load_in_8bit (#41287)
* [v5] remove load_in_4bit and load_in_8bit

* fix

* revert

* fix

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-09 16:34:04 +02:00
d99069195b Cleaning hub kernels (#41477)
* disable kernel mapping

* cleaning

* revert

* fix style
2025-10-09 16:32:18 +02:00
bf38b2d11d Change RT-Detr docs to reflect fixed 640x640 input size (#41364)
* Update rt_detr docs to mention 640x640 input size

The authors of RT-Detr mention that the model was trained on 640x640 images and was meant to be used for inference on 640x640 images.
Also, the current implementation has certain quirks that make training/inferring on images of different sizes problematic. For example,
the pixel masks used for batches of varying image sizes are discarded. I've added a few lines in the docs to notify the user about these issues.

* Batching not possible with variable image sizes

* Remove reference to batching

---------

Co-authored-by: Konstantinos Pitas <kostasp210@gmail.com>
2025-10-09 14:29:16 +00:00
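For reference, pinning the processor to the 640x640 size the model was trained on (standard image-processor `size` dict; the checkpoint name is the common RT-DETR release).

```python
from transformers import RTDetrImageProcessor

# RT-DETR was trained for 640x640 inference; fix the size explicitly so
# the variable-size batching quirks mentioned above don't come into play.
processor = RTDetrImageProcessor.from_pretrained(
    "PekingU/rtdetr_r50vd",
    size={"height": 640, "width": 640},
)
```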
72a3fc275c Remove infer_device (#41088)
* Remove infer_device

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix docs using accelerator

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix conflict

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-09 14:05:39 +00:00
9ef804472b Pickle - part 2 (#41476)
* pickle 2

* pickle 2

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-09 13:46:53 +00:00
2b5e4c0d13 Import Callable from collections.abc (#41130)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-09 12:12:43 +00:00
add4df62ba Fix tests fsdp (#41422)
* Fix tests

* fix !

* fix
2025-10-09 14:09:52 +02:00
3e87072666 Fix auto model configuration for encoder of perceptionlm (#41464)
* fix auto model configuration for encoder of perceptionlm

* delete perception_encoder auto registrations
2025-10-09 14:08:03 +02:00
f0544d7e7c Remove KERAS_NLP_IMPORT_ERROR (#41468)
Remove unused variables of error messages

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-09 11:58:30 +00:00
d1c6310d6a 🚨 [v5] Rendundant code in nested configs (#41314)
* batch update models

* delete even more

* fix modular super init location

* fix

* fix copies

* fix again, these have force-set values in configs

* fix copies
2025-10-09 13:47:44 +02:00
927aa8bef2 [kernels] Cleanup deta kernel (#41470)
* cleanup deta kernel

* fix modeling
2025-10-09 13:17:42 +02:00
1951f3be8e Update GLM-4.1V MMRope implementation (#41182)
* update for 4D mask

* update

* Update modular_glm4v.py

* 1

* Revert "1"

This reverts commit d13a763e876fa049c5fb70a8b3447b335dbb6098.

* update as glm4v logtic

* update

* 1

* update

* Create convert_glm4v_moe_mgt_weights_to_hf.py

* update

* update
2025-10-09 12:15:47 +02:00
f50fd7fb6b [v5] rm utils/tf_ops/ (#41402)
rm utils/tf_ops/
2025-10-09 10:27:47 +01:00
be3fa93b29 Subconfig is a class attribute (#41308)
* delete

* fix this test

* fix copies

* oke, more tests to fix

* fix last tests on DPT

* deleted accidentally
2025-10-09 10:46:44 +02:00
8137dbdbbd 🚨 [v5] Rename left traces of past_key_value in BERT-like models (#41448)
rename everything
2025-10-09 10:44:44 +02:00
7aa888b7fa Fix doc (#41457)
* dummy

* remove
2025-10-08 20:13:21 +02:00
bfe2b623ef Fix generate outputs and simplify cache tests (#41440)
* start refactoring

* simplify

* tests

* tests

* fix

* zamba

* final fix

* fix
2025-10-08 19:04:18 +02:00
b9be8a8775 enable some falcon-mamba uts on xpu (#41428)
* enable some falcon-mamba uts on xpu

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-10-08 18:48:04 +02:00
bef73bf8d7 Update hqq.md (#41452)
mistake in loading model
2025-10-08 07:44:56 -07:00
89a4115a6b Validate processing kwargs with @strict from huggingface_hub (#40793)
* initial design draft

* delete

* fix a few tests

* fix

* fix the rest of tests

* common-kwargs

* why the runner complains about typing with "|"?

* revert

* forgot to delete

* update

* fix last issues

* add more detalis in docs

* pin the latest hub release

* fix tests for new models

* also fast image processor

* fix copies

* image processing ast validated

* fix more tests

* typo.and fix copies

* bump

* style

* fix some tests

* fix copies

* pin rc4 and mark all TypedDict as non-total

* delete typed dict adaptor

* address comments

* delete optionals
2025-10-08 16:14:09 +02:00
82ffeb28ad Add Top-H decoding (entropy-bounded truncation) as a LogitsWarper for text generation (#40837)
* init

* added TopH

* Update TopH logits_process.py

* Update logits_process.py

* Update test_logits_process.py

* Update test_logits_process.py

* added test No. 4

* Resolving __init__.py issues

* Resolving configuration_utils.py Issues

* Resolving logits_process.py Issues

* Resolving utils.py Issues

* Resolving test_logits_process.py Issues

* Resolving __init__.py issues

* Resolving logits_process.py Issues

* Resolving __init__.py issues

* Updated Docs

* Updated Docstring

* style: autoformat with make fixup

* Fixing Docstring

* Update logits_process.py removed defaults

* Variable H name -> cumulative_entropy

* Using torch.distributions.Categorical
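
The bullets above describe the Top-H warper only in outline; here is a hedged sketch of the idea (the `cumulative_entropy` name and the `torch.distributions.Categorical` usage come from the bullets; the exact keep/drop bound in the merged code may differ).

```python
import torch

def top_h_filter(logits: torch.Tensor, alpha: float = 0.4) -> torch.Tensor:
    # Hedged sketch of entropy-bounded truncation, not the merged warper:
    # keep the smallest high-probability prefix whose running entropy stays
    # within alpha * H(p), masking the remaining tokens to -inf.
    probs = torch.softmax(logits, dim=-1)
    total_entropy = torch.distributions.Categorical(probs=probs).entropy()  # H(p)

    sorted_probs, sorted_idx = probs.sort(dim=-1, descending=True)
    token_entropy = -sorted_probs * torch.log(sorted_probs.clamp_min(1e-12))
    cumulative_entropy = token_entropy.cumsum(dim=-1)  # name per the bullets

    drop = cumulative_entropy > alpha * total_entropy.unsqueeze(-1)
    drop[..., 0] = False  # always keep the top-1 token
    mask = torch.zeros_like(drop).scatter(-1, sorted_idx, drop)
    return logits.masked_fill(mask, float("-inf"))
```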

* Improve torch_dtype checks (#40808)

* Improve torch_dtype checks

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Apply suggestions from code review

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Add VideoProcessors to auto-backend requirements (#40843)

* add it

* fix existing ones

* add perception to auto_mapping...

* Adds Causal Conv 1D kernel for mamba models (#40765)

* add kernel

* make style

* keep causal-conv1d

* small fix

* small fix

* fix modular converter

* modular fix + lazy loading

* revert changes modular

* nit

* hub kernels update

* update

* small nit

* Update no split modules in T5Gemma model (#40810)

* Update no split modules in T5Gemma model

* Update no_split_modules also for T5Gemma modular

* Remove model_split_percents from test cases

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Replace image classification loss functions to `self.loss_function` (#40764)

* Fix the misalignment between the l2norm in GDN of Qwen3-Next and the implementation in the FLA library. (#40842)

* align torch implementation of gdn with fla.

* fix fla import.

* fix

* remove unused attr

* fixes

* strictly align l2norm in Qwen3-Next with FLA implementation.

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Fixes for continuous batching (#40828)

* Fix for CB attn mask and refactor

* Tests for CB (not all passing)

* Passing tests and a logger fix

* Fixed the KV metrics that were broken when we moved to hybrid alloc

* Fix circular import and style

* Added tests for FA

* Unfolded test to have device expectations

* Fixes for H100

* more fixes for h100

* H100 are good

* Style

* Adding some comments from #40831

* Rename test

* Avoid 1 letter variables

* Dictionary is only removed during kwargs

* Test for supported sample

* Fix an involuntary slice

* Fixes for non-sliced inputs and small example improvements

* Slice inputs is more understandable

* Style

* [tests] re-enable aria fast tests (#40846)

* rise from the dead

* test

* [SAM2] Fix inconsistent results with original implementation with input boxes (#40800)

* Fix inconsistencies with box input inference with original repo

* remove print

* always pad

* fix modular

* [Sam2Video] Fix video inference with batched boxes and add test (#40797)

fix video inference with batched boxes and add test

* add: differential privacy research model (#40851)

* VaultGemma

* Removing Sequence and Token classification models. Removing integration tests for now

* Remove pass-only modular code. style fixes

* Update vaultgemma.md

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Add links to model doc

* Correct model doc usage examples

* Updating model doc to describe differences from Gemma 2

* Update model_doc links

* Adding integration tests

* style fixes

* repo consistency

* attribute exception

---------

Co-authored-by: Amer <amersinha@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* [test] Fix test_eager_matches_sdpa incorrectly skipped (#40852)

* ouput_attentions in typed kwargs

* correct typing in GenericForTokenClassification

* improve

* [tests] move generative tests away from `test_modeling_common.py` (#40854)

move tests

* [generate] Always use decoder config to init cache (#40772)

* mega derp

* fix

* always use the decoder

* Use checkpoint in auto_class_docstring (#40844)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix TrainingArguments.parallelism_config NameError with accelerate<1.10.1 (#40818)

Fix ParallelismConfig type for accelerate < 1.10.1

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Redirect MI355 CI results to dummy dataset (#40862)

* [Bug fix #40813] Fix base_model_tp_plan of Starcoder2 model. (#40814)

Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>

* [docstrings / type hints] Update outdated annotations for `past_key_values`  (#40803)

* some fixes

* nits

* indentation

* indentation

* a bunch of type hints

* bulk changes

* fix florence kwargs (#40826)

* fix: XIELU act parameters not being casted to correct dtype (#40812)

* Update model tags and integration references in bug report (#40881)

* [Qwen3 Next] Use numerically stable `rsqrt` (#40848)

use numerically stable inverse

* Adding Support for Qwen3-VL Series (#40795)

* add qwen3vl series

* make fixup

* fix import

* re-protect import

* fix it finally (need to merge main into the branch)

* skip processor test (need the checkpoint)

* oups typo

* simplify modular

* remove unnecessary attr

* fix layer

* remove unused rope_deltas args

* reuse image def

* remove unnecessary imports

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* [`VaultGemma`] Update expectations in integration tests (#40855)

* fix tests

* style

* Fix modular consistency (#40883)

* reapply modular

* add missing one

* 🔴 Move variable output controls to `_prepare_generation_config ` (#40715)

* move checks to validate steps where possible

* fix csm and other models that override _sample

* ops dia you again

* opsie

* joao review

* Move variable output controls to `prepare_inputs_for_generation`

* fix a bunch of models

* back to basics

* final touches

* Clarify passing is_causal in sdpa_attention_paged_forward (#40838)

* Correctly pass is_causal in sdpa_attention_paged_forward

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Improve typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add comment

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Improve comments

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Revert typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Use torch.expm1 and torch.log1p for better numerical results (#40860)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add Fast PromptDepthAnything Processor (#40602)

* Test & import setup

* First version passing tests

* Ruff

* Dummy post processing

* Add numerical test

* Adjust

* Doc

* Ruff

* remove unused arg

* Refine interpolation method and push test script

* update bench

* Comments

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Remove benchmark script

* Update docstrings

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* doc

* further process kwargs

* remove it

* remove

* Remove to dict

* remove crop middle

* Remove param specific handling

* Update testing logic

* remove ensure multiple of as kwargs

* fix formatting

* Remove none default and get image size

* Move stuff to _preprocess_image_like_inputs and refacto

* Clean

* ruff

* End of file & comments

* ruff again

* Padding fixed

* Remove comments to pass tests

* Remove prompt depth from kwargs

* Adjust output_size logic

* Docstring for preprocess

* auto_docstring for preprocess

* pass as an arg

* update test batched

* stack images

* remove prompt scale to meter

* return tensors back in preprocess

* remove copying of images

* Update behavior to match old processoer

* Fix batch size of tests

* fix test and fast

* Fix slow processor

* Put tests back to pytorch

* remove check and modify batched tests

* test do_pad + slow processor fix

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>

* Fix deta loading & dataclass (#40878)

* fix

* fix 2

* Remove dict branch of attention_mask in sdpa_attention_paged_forward (#40882)

Remove dict branch of attention_mask

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* 🌐 [i18n-KO] Translated smolvlm.md to Korean (#40414)

* fix: manual edits

* Apply suggestions from code review

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* 🌐 [i18n-KO] Translated `imageprocessor.md` to Korean (#39557)

* feat: manual translation

* docs: fix ko/_toctree.yml

* Apply suggestions from code review

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/image_processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* [generate] remove docs of a feature that no longer exists (#40895)

* Make debugging failing tests (check and update expect output values) easier 🔥  (#40727)

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fixing the call to kernelize (#40628)

* fix

* style

* overload train and eval

* add getter and setter

* Fix getter  regression (#40824)

* test things

* style

* move tests to a sane place

* Fix flaky `Gemma3nAudioFeatureExtractionTest::test_dither` (#40902)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* [cache] Merge static sliding and static chunked layer (#40893)

* merge

* get rid of tensors in get_mask_sizes!!

* remove branch

* add comment explanation

* re-add the class with deprecation cycle

* Harmonize CacheLayer names (#40892)

* unify naming

* style

* doc as well

* post rebase fix

* style

* style

* revert

* [cache] Only use scalars in `get_mask_sizes` (#40907)

* remove tensor ops

* style

* style

* Set seed for `Glm4vIntegrationTest` (#40905)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Add Olmo3 model (#40778)

* transformers add-new-model-like for Olmo3

* Implement modular Olmo3

* Update Olmo3 tests

* Copy Olmo2 weight converter to Olmo3

* Implement Olmo3 weight converter

* Fix code quality errors

* Remove unused import

* Address rope-related PR comments

* Update Olmo3 model doc with minimal details

* Fix Olmo3 rope test failure

* Fix 7B integration test

* remove dummy EncodingFast (#40864)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Improve module name handling for local custom code (#40809)

* Improve module name handling for local custom code

* Use `%lazy` in logging messages

* Revert "Use `%lazy` in logging messages"

This reverts commit 5848755d5805e67177c5218f351c0ac852df9340.

* Add notes for sanitization rule in docstring

* Remove too many underscores

* Update src/transformers/dynamic_module_utils.py

* Update src/transformers/dynamic_module_utils.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove `runner_map` (#40880)

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* disable `test_fast_is_faster_than_slow` (#40909)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* [gemma3] `Gemma3ForConditionalGeneration` compatible with assisted generation (#40791)

* gemma3vision compatible with assisted generation

* docstring

* BC

* docstring

* failing checks

* make fixup

* apply changes to modular

* misc fixes

* is_initialized

* fix poor rebase

* [generate] misc fixes (#40906)

misc fixes

* 🔴Make `center_crop` fast equivalent to slow (#40856)

make center_crop fast equivalent to slow

* Fix dtype in Paligemma (#40912)

* fix dtypes

* fix copies

* delete unused attr

* [Docs] Adding documentation of MXFP4 Quantization (#40885)

* adding mxfp4 quantization docs

* review suggestions

* Apply suggestions from code review

Co-authored-by: vb <vaibhavs10@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: vb <vaibhavs10@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Processor load with multi-processing (#40786)

push

* [Llama4] Remove `image_sizes` arg and deprecate `vision_feature_layer` (#40832)

* Remove unused arg

* deprecate

* revert one change

* get set go

* version correction

* fix

* make style

* comment

* Fix #40067: Add dedicated UMT5 support to GGUF loader (config, tokenizer, test) (#40218)

* Fix #40067 : add UMT5 support in GGUF loader (config, tokenizer, test)

* chore: fix code formatting and linting issues

* refactor: move UMT5 GGUF test to quantization directory and clean up comments

* chore: trigger CI pipeline

* refactor(tests): Move UMT5 Encoder GGUF test to GgufModelTests. This consolidates the new test into the main class for consistency.

* Add regression check to UMT5 encoder GGUF test

Verify encoder output against reference tensor values with appropriate tolerances for stability.

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update tests/quantization/ggml/test_ggml.py

remove comments

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* [torchao safetensors] renaming get_state_dict function (#40774)

renaming get_state_dict function

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Adding activation kernels (#40890)

* first commit

* add mode

* revert modeling

* add compile

* rm print

* Minor fix for #40727 (#40929)

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Add support for Florence-2 training (#40914)

* Support training florence2

* update doc and testing model to florence-community

* fix florence-2 test, use head dim 16 instead of 8 for fa2

* skip test_sdpa_can_dispatch_on_flash

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Add LongCat-Flash (#40730)

* working draft for LongCat

* BC changes to deepseek_v3 for modular

* format

* various modularities

* better tp plan

* better init

* minor changes

* make modular better

* clean up patterns

* Revert a couple of modular commits, because we won't convert in the end

* make things explicit.

* draft test

* toctree, tests and imports

* drop

* woops

* make better things

* update test

* update

* fixes

* style and CI

* convert stuff

* up

* ah, yes, that

* enable gen tests

* fix cache shape in test (sum of 2 things)

* fix tests

* comments

* re-Identitise

* minimize changes

* better defaults

* modular betterment

* fix configuration, add documentation

* fix init

* add integration tests

* add info

* simplify

* update slow tests

* fix

* style

* some additional long tests

* cpu-only long test

* fix last tests?

* urg

* cleaner tests why not

* fix

* improve slow tests, no skip

* style

* don't upcast

* one skip

* finally fix parallelism

* [DOC] Add missing dates in model cards (#40922)

add missing dates

* [models] remove unused `import torch.utils.checkpoint`  (#40934)

* Intel CPU dockerfile (#40806)

* upload intel cpu dockerfile

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update cpu dockerfile

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update label name

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* docs(i18n): Correct the descriptive text in the README_zh-hans.md (#40941)

* Fix trainer tests (#40823)

* fix liger

* fix

* more

* fix

* fix hp

* fix

---------

Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>

* Fix `Glm4vMoeIntegrationTest` (#40930)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Raise error instead of warning when using meta device in from_pretrained (#40942)

* raise instead of warning

* add timm

* remove

* Consistent naming for images kwargs (#40834)

* use consistent naming for padding

* no validation on pad size

* add warnings

* fix

* fix copies

* another fix

* fix some tests

* fix more tests

* fix last tests

* fix copies

* better docstring

* delete print

* Remove nested import logic for torchvision (#40940)

* remove nested import logic for torchvision

* remove unnecessary protected imports

* remove unnecessary protected import in modular (and modeling)

* fix wrongly removed protected imports

* Fix `Glm4vModelTest::test_eager_matches_fa2_generate` (#40947)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Update expected values for some `test_speculative_generation` (#40949)

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Standardize audio embedding function name for audio multimodal models (#40919)

* Standardize audio embedding function name for audio multimodal models

* PR review

* Add FlexOlmo model (#40921)

* transformers add-new-model-like

* Add FlexOlmo implementation

* Update FlexOlmo docs

* Set default tokenization for flex olmo

* Update FlexOlmo tests

* Update attention comment

* Remove unneeded use of `sliding_window`

* Don't list dropout in eager_paged_attention_forward (#40924)

Remove dropout argument

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update expected values for one more `test_speculative_generation` after #40949 (#40967)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* FIX(trainer): ensure final checkpoint is saved when resuming training (#40347)

* fix(trainer): ensure final checkpoint is saved when resuming training

* add test

* make style && slight fix of test

* make style again

* move test code to test_trainer

* remove outdated test file

* Apply style fixes

---------

Co-authored-by: rangehow <rangehow@foxmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Add new model LFM2-VL (#40624)

* Add LFM2-VL support

* add tests

* linting, formatting, misc review changes

* add siglip2 to auto config and instantiate it in lfm2-vl configuration

* decouple image processor from processor

* remove torch import from configuration

* replace | with Optional

* remove layer truncation from modeling file

* fix copies

* update everything

* fix test case to use tiny model

* update the test cases

* fix finally the image processor and add slow tests

* fixup

* typo in docs

* fix tests

* the doc name uses underscore

* address comments from Yoni

* delete tests and unsuffling

* relative import

* do we really handle imports better now?

* fix test

* slow tests

* found a bug in ordering + slow tests

* fix copies

* dont run compile test

---------

Co-authored-by: Anna <anna@liquid.ai>
Co-authored-by: Anna Banaszak <48625325+ankke@users.noreply.github.com>

* Fix outdated version checks of accelerator (#40969)

* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Use `skip_predictor=True` in vjepa2 `get_vision_features` (#40966)

use skip_predictor in vjepa2 `get_vision_features`

* [Trainer] Fix DP loss (#40799)

* fix

* style

* Fix fp16

* style

---------

Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>

* [timm_wrapper] better handling of "Unknown model" exception in timm (#40951)

* fix(timm): Add exception handling for unknown Gemma3n model

* nit: Let’s cater to this specific issue

* nit: Simplify error handling

* Fix Issue #39030: AutoTokenizer.from_pretrained does not propagate token (#40956)

* fix merge conflicts

* change token typing

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-27-253.ec2.internal>

* [tests] Really use small models in all fast tests (#40945)

* start

* xcodec

* chameleon

* start

* layoutlm2

* layoutlm

* remove skip

* oups

* timm_wrapper

* add default

* doc

* consistency

* Add captured actual outputs to CI artifacts (#40965)

* fix

* fix

* Remove `# TODO: ???` as it makes me `???`

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Revert change in `compile_friendly_resize` (#40645)

fix

* Track the CI (model) jobs that don't produce test output files (process being killed etc.) (#40981)

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Using torch.distributions.Categorical

* Remove `set_model_tester_for_less_flaky_tests` (#40982)

remove

* Benchmarking v2 GH workflows (#40716)

* WIP benchmark v2 workflow

* Container was missing

* Change to sandbox branch name

* Wrong place for image name

* Variable declarations

* Remove references to file logging

* Remove unnecessary step

* Fix deps install

* Syntax

* Add workdir

* Add upload feature

* typo

* No need for hf_transfer

* Pass in runner

* Runner config

* Runner config

* Runner config

* Runner config

* Runner config

* mi325 caller

* Name workflow runs properly

* Copy-paste error

* Add final repo IDs and schedule

* Review comments

* Remove wf params

* Remove parametrization from worfkflow files

* Fix callers

* Change push trigger to pull_request + label

* Add back schedule event

* Push to the same dataset

* Simplify parameter description

* 🔴[`Attention`] Bert-based Models Attention Refactor (#38301)

* clean start to bert refactor

* some test fixes

* style

* fix last tests

* be strict on positional embeddings, fix up the corresponding tests

* cache support

* more cache fixes, new causal API

* simplify masks, fix tests for gen

* flex attn, static cache support, round of fixes

* ?

* this time

* style

* fix flash attention tests, flex attention requires torch 2.7.x to work with multiple classes (as recompile strats force a size call which is wrongly interpreted before)

* roberta

* fixup sdpa remains

* attention split, simplify args and kwargs, better typing

* fix encoder decoder

* fix test

* modular roberta

* albert

* data2vectext, making it modular tomorrow

* modular data2vec text

* tmp disable

* xmod + cache position fixes

* whoops

* electra + markuplm, small fixes

* remove wrong copy

* xlm_roberta + some embedding fixes

* roberta prelayernorm

* RemBert: remove copy, maybe doing it later

* ernie

* fix roberta offloading

* camembert

* copy fixes

* bert generation + fixes on eager

* xlm roberta xl

* bridgetower (text) + seamlessv2 copy fixes

* rocbert + small fixes

* whoops

* small round of fixups

* NOTE: kernels didn't load with an earlier version, some fixup (needs another look because of cross deps)

* the end of the tunnel?

* fixup nllbmoe + style

* we don't need this anymore

* megatron bert is barely used, low prio skip for now

* Modernize bert (template for others)

NOTE: trying to push this through, might be overdue if not in time possible

* check inputs for all others (if checkmarked)

* fix bridgetower

* style

* fix encoder decoder (partially, but the cause was found and fixed; it just needs to be done for everything else)

* proper fix for bert to force intermediate dict outputs

* propagate to others

* style

* xlm roberta xl investigation, it's the layernorm...

* mobile bert

* revert this, might cause issues with composed models

* review

* style

* Remove [[autodoc]] refs to TF/Flax objects (#40996)

* remove refs

* more

* ENH: Enable readline support for transformers chat (#40911)

ENH Enable readline support for chat

This small change enables GNU readline support for the transformers chat
command. This includes, among others:

- advanced navigation and editing: ctrl + a ctrl + e alt + b alt + f
  ctrl + k alt + d etc.
- navigate and search history: arrow up/down ctrl + p ctrl + n  ctrl + r
- undo: ctrl + _
- clear screen: ctrl + l

Implementation

Although it may look strange, just importing readline is enough to
enable it in Python, see:

https://docs.python.org/3/library/functions.html#input

As readline is not available on some
platforms (https://docs.python.org/3/library/readline.html), the import
is guarded.

Readline should work on Linux, macOS, and WSL; I'm not sure about
Windows, though. Ideally, someone can give it a try. It's possible that
Windows users would have to install
pyreadline3 (https://pypi.org/project/pyreadline3/).
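
A minimal sketch of the guarded import described above (the toy loop is
illustrative, not the actual chat command):

```python
try:
    import readline  # noqa: F401 -- imported only for its side effect on input()
except ImportError:
    # readline is unavailable on some platforms (e.g. stock Windows)
    pass

while True:
    line = input("> ")  # now has line editing, history, and ctrl+r search
    if line in {"exit", "quit"}:
        break
    print(line)
```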

* [testing] test `num_hidden_layers` being small in model tester (#40992)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* blt wip (#38579)

* blt wip

* cpu version

* cpu friendly with full entropy model (real time patching)

* adding config file instead of args file

* enable MPS

* refactoring unused code

* single config class in config file

* inherit from PreTrainedModel

* refactor LMTransformer --> BLTPatcher

* add conversion script

* load from new checkpoint with from_pretrained

* fixed demo from_pretrained

* clean up

* clean a few comments

* cleanup folder

* clean up dir

* cleaned up modeling further

* rename classes

* adding transformers Attention class and RotaryEmbedding class

* exchanged blt modules for transformers modules: attention, rotary_emb, create_causal_mask, etc

* separate out patcher config, update modeling and conversion script

* rename vars to be more transformers-like

* rm unused functions

* adding cross attention from transformers

* pass arg

* rename weights

* updated conversion script

* overwritten commit! fixing PR

* apply feedback

* adding BLTRMSNorm like Llama

* add repeat_kv and eager_attention_forward copied from

* BLTMLP identical to MllamaTextMLP

* clean up some args

* more like mllama, but busier inits

* BLTTransformerLayer config

* decoder, encoder, global configs

* wip working on modular file

* cleaning up patch and configs

* clean up patcher helpers

* clean up patcher helpers further

* clean up

* some config renaming

* clean up unused configs

* clean up configs

* clean up configs

* update modular

* clean

* update demo

* config more like mllama, separated subconfigs from subdicts

* read from config instead of self args

* update demo file

* model weights to causal lm weights

* missed file

* added tied weights keys

* BLTForCausalLM

* adding files after add-new-model-like

* update demo

* working on tests

* first running integration tests

* added integration tests

* adding tokenization tests, integration tests, and cleaned up tokenization file, + ruff

* tokenizer clean up

* modular file

* fixing rebase

* ruff

* adding correct basemodel output and updating config with checkpoint vals (for testing)

* BLTModelTests git status

* enabling inputs_embeds, although it won't be equal to input_ids since ids are needed for the patching logic

* fix sdpa == causal tests

* fix small model test and some gradient checkpointing

* skip training GC tests

* fix test

* updated modular

* update modular

* ruff

* adding modular + modeling

* modular

* more modern is_causal check

* cleaning up modular

* more modular reduction

* ruff

* modular fix

* fix styling

* return 2

* return 2

* fix some tests

* fix bltcrossattention after modular break

* some fixes / feedback

* try cache generate fix

* try cache generate fix

* fix generate tests

* attn_impl workaround

* refactoring to use recent TransformersKwargs changes

* fix hidden_states shape test

* refactor to new outputs

* simplify outputs a bit

* rm unneeded decoderlayer overwriting

* rename blt

* forgot tokenizer test renamed

* Reorder

* Reorder

* working on modular

* updates from modular

* new modular

* ruff and such

* update pretrainedmodel modular

* using cohere2 apply_rotary_pos_emb

* small changes

* apply feedback r2

* fix cross_attention

* apply more feedback

* update modeling fix

* load submodules from pretrainedmodel

* set initializer_range to subconfigs

* rm cross_attention_states pass when not needed

* add 7b projection layer support

* check repo

* make copies

* lost cohere2 rotate_half

* ruff

* copies?

* don't tie weights for submodules

* tie weights setting

* check docstrings

* apply feedback

* rebase

* rebased modeling

* update docs

* applying feedback

* few more fixes

* fix can_record_outputs

* fast tokenizer

* no more modulelist

* tok auto

* rm tokenizersss

* fix docs

* ruff

* fix after rebase

* fix test, configs are not subscriptable

---------

Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-30.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-103.ec2.internal>
Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-36.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-45.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-173-121.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-103.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-178.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-79.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-169-239.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-111.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-100.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-153.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-15.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-131.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-138.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-215.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-142.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-147.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-0.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-58.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-202.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-244.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-186.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-192.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-14.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-171-249.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-75.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-78.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-134.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-180.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-175-241.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-225.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-9.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-34.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-68.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-175.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-170-160.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-95.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-73.ec2.internal>

* [docs] rm stray tf/flax autodocs references (#40999)

rm tf references

* [`RMSNorm`] Fix rms norm init for models that center around 1 (#40796)

* fix

* fixup inits

* oops

* fixup gemma

* fixup modular order

* how does this keep happening lol

* vaultgemma is new i forgot

* remove init check

* Make `EfficientLoFTRModelTest` faster (#41000)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix typoes in src and tests (#40845)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix more dates in model cards and wrong modalities in _toctree.yml (#40955)

* Fix model cards and modalities in toctree

* fix new models

* RUFF fix on CI scripts (#40805)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* fix dict like init for ModelOutput (#41002)

* fix dict like init

* style

* 🚨 [v5] remove generate output retrocompatibility aliases (#40998)

remove old type aliases

* [tests] update `test_left_padding_compatibility` (and minimize overwrites) (#40980)

* update test (and overwrites)

* better test comment

* 0 as a default for

* Patch more `unittest.case.TestCase.assertXXX` methods (#41008)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* 🚨 [v5] remove deprecated entry point (#40997)

* remove old entry point

* update references to transformers-cli

* 🚨 [lightglue] fix: matches order changed because of early stopped indices (#40859)

* fix: bug that made early stop change order of matches

* fix: applied code suggestion

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix: applied code suggestion to modular

* fix: integration tests

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Fix `PhimoeIntegrationTest` (#41007)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix Glm4v test (#41011)

fix

* Update after #41007 (#41014)

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix benchmark runner argument name (#41012)

* Adding support for Qwen3Omni (#41025)

* Add Qwen3Omni

* make fix-copies, import properly

* nit

* fix wrong setup. Why was audio_token_id renamed?

* upds

* more processing fixes

* yup

* fix more generation tests

* down to 1?

* fix import issue

* style, update check repo

* up

* fix quality as best I can

* final quality?

* fix doc building

* FINAL COMMIT: SKIP IMPORTANT BUT FAILING TESTS FOR MERGE

* SKIP THE TEMPLATE ONE

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>

* Making compute_loss_func always take priority in Trainer (#40632)

* logger warn, if-else logic improved

* redundant if condition fix
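
As a usage note, a sketch of what "priority" means here: when
`compute_loss_func` is passed, the Trainer uses it instead of the model's
built-in loss (signature per the Trainer docs; `model`, `args`, and `ds`
are assumed to be defined elsewhere):

```python
import torch.nn.functional as F
from transformers import Trainer

def my_loss(outputs, labels, num_items_in_batch=None):
    logits = outputs.logits
    loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), reduction="sum"
    )
    # normalize by the true token count across accumulated batches
    return loss / num_items_in_batch if num_items_in_batch else loss

trainer = Trainer(
    model=model, args=args, train_dataset=ds, compute_loss_func=my_loss
)
```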

* Modify Qwen3Omni parameter name since VL changed it (#41045)

Modify parameter name since VL changed it

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>

* Fix Qwen video tests (#41049)

fix test

* [testing] Fix `qwen2_audio` (#41018)

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix typing of tuples (#41028)

* Fix tuple typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove optax (#41030)

Remove optax dep

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typos in English/Chinese documentation (#41031)

* Fix typos and formatting in English docs

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typos and formatting in Chinese docs

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Use torch.autocast (#40975)

* Use torch.autocast

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* docs: improved RoPE function Docstrings (#41004)

* docs: improved RoPE function docstrings

* Update src/transformers/modeling_rope_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
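
For reference, the rotation those docstrings describe, in the standard
`rotate_half` formulation used across the library (shapes simplified; some
models' helpers also take an `unsqueeze_dim` argument):

```python
import torch

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # split the last dim into two halves and rotate: (x1, x2) -> (-x2, x1)
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_pos_emb(q, k, cos, sin):
    # q, k: (batch, heads, seq, head_dim); cos/sin broadcast over heads
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```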

* Fix condition for emitting warning when generation exceeds max model length (#40775)

correct warning when generation exceeds max model length

Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>

* Fix outdated torch version check (#40925)

Update torch minimum version check to 2.2

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove doc of tf and flax (#41029)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add Whole Word Masking and Padding Strategy to DataCollatorForLanguageModeling (#39485)

* Add whole word masking

* Vectorize whole word masking functions

* Unit test whole word masking

* Remove support for TF in whole word masking
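
A hedged usage sketch: the exact flag name is defined by this PR, so
`whole_word_mask=True` below is an assumption for illustration only:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
    whole_word_mask=True,  # assumed flag: mask all sub-tokens of a word together
)

batch = collator([tokenizer("Transformers handles tokenization")])
print(batch["labels"])  # -100 everywhere except the masked word positions
```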

* [testing] Fix `seed_oss` (#41052)

* fix

* fix

* fix

* fix

* fix

* fix

* Update tests/models/seed_oss/test_modeling_seed_oss.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Remove repeated import (#40937)

* Remove repeated import

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix conflict

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Simplify unnecessary Optional typing (#40839)

Remove Optional

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add write token for uploading benchmark results to the Hub (#41047)

* Separate write token for Hub upload

* Address review comments

* Address review comments

* Ci utils (#40978)

* Add CI reports dir to gitignore

* Add utils to run local CI

* Review compliance

* Style

* License

* Remove <frameworkcontent> and <pt> tags from documentation (#41055)

* Remove <frameworkcontent> and <pt> tags

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Revert changes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update docs/source/en/model_doc/madlad-400.md

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix CI jobs being all red 🔴 (false positive) (#41059)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Update quantization CI (#41068)

* fix

* new everything

* fix

* [i18n-bn] Add Bengali language README file (#40935)

* [i18n-bn] Add Bengali language README file and update links in existing language files

* Update Bengali README for clarity and consistency in model descriptions

* Improve documentation and errors in Mamba2-based models (#41063)

* fix bug in Mamba2 docs

* correct 'because on of' issue

* link to other Mamba2 model types

* github URL is not changed

* update error message in generated files

* Update team member list for some CI workflows (#41094)

* update list

* update list

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* fix crash when using chat to send 2+ request to gptoss (#40536)

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* Minor addition, no split modules for VideoMAEE (#41051)

* added no split modules

* fixed typo

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>

* Switch to `python:3.10-slim` for CircleCI docker images (#41067)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix argument name in benchmarking script (#41086)

* Fix argument name in benchmarking script

* Adjust vars

* Remove mention of TensorFlow/Flax/JAX from English documentation (#41058)

Remove mention of TensorFlow from English documentation

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typos in documentation (#41087)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typing (#40788)

* Fix optional typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix optional typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix schema typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typing

* Fix typing

* Fix typing

* Fix typing

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Improve typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix quote string of np.ndarray

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix code

* Format

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove unused arguments (#40916)

* Fix unused arguments

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove tf and flax from Chinese documentation (#41057)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* fix wrong height and width when reading video using torchvision (#41091)

* docs: Fix Tool Use links and remove dead RAG links (#41104)

docs: Fix tool use links. Remove dead RAG links. Fix style

* 🚨 [generate] update paligemma mask updates (and other assisted generation-related fixes) (#40917)

* tmp

* fix modular inheritance

* nit

* paligemma 1 doesn't have swa

* use same pattern as in models with hybrid layers

* PR comments

* helium also needs layer_typed (bc it relies on gemma)

* paligemma/gemma3: same mask creation fn in fwd and generate

* propagate changes to helium (gemma-based)

* tmp commit

* slow paligemma tests passing, let's see what breaks

* fix test_left_padding_compatibility

* tmp commit

* tmp commit

* rebase error

* docs

* reduce diff

* like this?

* t5gemma

* better comment

* shorter diff

* exception

* ffs type

* optional

* shorter modular_gemma.py

* helium model actually needs no changes -- the tester is the issue

* t5gemma modular config

* a few more modular; paligemma BC

* fix processor issues?

* rm config exception

* lift warning in gemma

* [tests] gpt2 + `CausalLMModelTester` (#41003)

* tmp commit

* tmp commit

* tmp commit

* rm old GPT2ModelTester

* nit bug

* add facilities for encoder-decoder tests; add comments on ALL overwrites/extra fns

* vision_encoder_decoder

* Fix `_get_test_info` for inherited tests (#41106)

* fix _get_test_info

* fix patched

* add comment

* ruff

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Remove bad test skips (#41109)

* remove bad skips

* remove more

* fix inits

* Format empty lines and white space in markdown files. (#41100)

* Remove additional white space and empty lines from markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add empty lines around code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update ruff to 0.13.1 + target Python 3.10 + apply fixes (#37809)

Update ruff to 0.13.1 target it to Python 3.10 and apply its fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* 🚨 [V5] Remove deprecated training arguments  (#41017)

* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix comments

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Support loading LFM2 GGUF (#41111)

* add gguf config mapping for lfm2

* add lfm2 tensor processing to unsqueeze conv weights

* adjust values from gguf config to HF config

* add test for lfm2 gguf

* ruff

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* [torchao safetensors] integrate torchao safetensors support with transformers  (#40735)

* enable torchao safetensors

* enable torchao safetensors support

* add more version checking

* [Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule and torch_recurrent_gated_delta_rule (#40963) (#41036)

* fix mismatched dims for qwen3 next

* propagate changes

* chore: renamed tot_heads to total_sequence_length

* Apply suggestion from @vasqu

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* minor fix to modular qwen3 next file

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Fix the error where a keyword argument appearing before *args (#41099)

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix broken `` expressions in markdown files (#41113)

Fix broken expressions in markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove self-assignment (#41062)

* Remove self-assignment

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update src/transformers/integrations/flash_paged.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Clear pass

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Clear pass

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Clear pass

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* 🚨Refactor: Update text2text generation pipelines to use max_new_tokens… (#40928)

* Refactor: Update text2text generation pipelines to use max_new_tokens and resolve max_length warning

* docs(text2text_generation): Update parameter comments to reflect modern generation practice

Updated the max_length parameter comments to max_new_tokens, in line with the modern standard of specifying the number of newly generated tokens

* refactor(text2text_generation): Remove outdated input validation logic

* docs(text2text_generation): Revert incorrectly modified comment

* docs(text2text_generation): Revert incorrectly modified comment
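
The practical upshot for pipeline users, as a short sketch (model id is just
an example):

```python
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

# Prefer max_new_tokens (count of generated tokens) over max_length
# (prompt + generated tokens), which is what used to trigger the warning.
result = generator("Translate to German: Hello, how are you?", max_new_tokens=32)
print(result[0]["generated_text"])
```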

* Fixed MXFP4 model storage issue (#41118)

* Fixed loading LongT5 from legacy checkpoints (#40724)

* Fixed loading LongT5 from legacy checkpoints

* Adapted the fix to work with missing lm_head

* dummy commit (#41133)

* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix loading logic flaw with regards to unexpected and missing keys (#40850)

* Unexpected keys should be ignored at load with device map

* remove them all

* fix logic flaw

* fix

* simplify

* style

* fix

* revert caching allocator change

* add other test

* add nice doc

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Using torch.distributions.Categorical

* Resolving logits_process.py Issues

* style: autoformat with make fixup

* Update logits_process.py removed defaults

* Variable H name -> cumulative_entropy

* Resolving format error

* Correction of the loop variables in logit processor

* Vectorized the loop in logits_process

* formatted  logits_process

* paper reference and stopping rule comment logits_process

* Trigger CI rerun

* Update logits_process.py

* added test_TopH_example_integration

* added test_TopH_example_integration

* Update README.md

* Restore CI config to match main (remove accidental changes)

* Restore CI config to match upstream main (no diffs)
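
For context, the building block named in these messages: a vectorized,
per-row entropy via `torch.distributions.Categorical` (the actual cumulative
stopping rule lives in `logits_process.py` and follows the referenced paper):

```python
import torch

logits = torch.randn(2, 50)  # (batch, vocab) dummy next-token scores
dist = torch.distributions.Categorical(logits=logits)
entropy = dist.entropy()     # shape (batch,), no Python loop needed
print(entropy)
```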

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: ArminAzizi98 <147081650+ArminAzizi98@users.noreply.github.com>
Co-authored-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Yuchao Zhang <418121364@qq.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Bo Zheng <368586905@qq.com>
Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Rémi Ouazan <83456801+remi-or@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Ryan Mullins <ryanmullins@google.com>
Co-authored-by: Amer <amersinha@gmail.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Ákos Hadnagy <akos@ahadnagy.com>
Co-authored-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com>
Co-authored-by: NanoCode012 <nano@axolotl.ai>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: 艾力可 <178652170+thalahors@users.noreply.github.com>
Co-authored-by: JJJYmmm <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com>
Co-authored-by: Samuel Barry <127697809+SamuelBarryCS@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: HyunZ118 <156191095+HyunZ118@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Shane A <shanea@allenai.org>
Co-authored-by: Xuehai Pan <XuehaiPan@pku.edu.cn>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: vb <vaibhavs10@gmail.com>
Co-authored-by: Yaswanth Gali <82788246+yaswanth19@users.noreply.github.com>
Co-authored-by: Akshay Babbar <priv.akshay@outlook.com>
Co-authored-by: liangel-02 <liangel@meta.com>
Co-authored-by: Duc-Viet Hoang <vietyb00@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: lilin-1 <256404019@qq.com>
Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
Co-authored-by: Jack <32371937+jackzhxng@users.noreply.github.com>
Co-authored-by: Rangehow <88258534+rangehow@users.noreply.github.com>
Co-authored-by: rangehow <rangehow@foxmail.com>
Co-authored-by: Anna <anna@liquid.ai>
Co-authored-by: Anna Banaszak <48625325+ankke@users.noreply.github.com>
Co-authored-by: Hamish Scott <41787553+hamishs@users.noreply.github.com>
Co-authored-by: Harshal Janjani <75426551+harshaljanjani@users.noreply.github.com>
Co-authored-by: Branden <brandenkmurray@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-27-253.ec2.internal>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Ita Zaporozhets <31893021+itazap@users.noreply.github.com>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-30.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-103.ec2.internal>
Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-36.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-45.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-173-121.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-103.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-178.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-79.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-169-239.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-111.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-100.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-153.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-15.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-131.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-138.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-215.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-142.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-147.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-0.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-58.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-202.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-244.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-186.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-192.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-14.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-171-249.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-75.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-78.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-134.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-180.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-175-241.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-225.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-9.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-34.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-68.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-175.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-170-160.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-95.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-73.ec2.internal>
Co-authored-by: StevenBucaille <steven.bucaille@gmail.com>
Co-authored-by: BakerBunker <17872844+BakerBunker@users.noreply.github.com>
Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
Co-authored-by: Ayush <ayushtanwar1729@gmail.com>
Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>
Co-authored-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Co-authored-by: Ralph Gleaton <70818603+rjgleaton@users.noreply.github.com>
Co-authored-by: Saidur Rahman Pulok <59414463+saidurpulok@users.noreply.github.com>
Co-authored-by: Nick Doiron <ndoiron@mapmeld.com>
Co-authored-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Duygu Altinok <duygu.altinok12@gmail.com>
Co-authored-by: Jinde.Song <juude.song@gmail.com>
Co-authored-by: hbenoit <60629420+HaroldBenoit@users.noreply.github.com>
Co-authored-by: nnul <107971634+notkisk@users.noreply.github.com>
Co-authored-by: YangKai0616 <kai.yang@intel.com>
Co-authored-by: Karol Szustakowski <61427290+Szustarol@users.noreply.github.com>
Co-authored-by: souvikku <107592858+souvikku@users.noreply.github.com>
2025-10-08 13:37:51 +00:00
e064dc05c2 [testing] Fix JetMoeIntegrationTest (#41377)
* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-08 13:11:53 +00:00
20282f13fa [JetMoe] Fix KV head repetition and padding free (#41423)
fix jetmoe
2025-10-08 14:27:22 +02:00
c528f50663 Remove Python 3.9 classifier (#41410)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-08 12:20:36 +00:00
8dfc8e8cfc 🤦 CB nit! (#41413)
* 🤦

* updates

* update cb simple

* merge

* up

* update

* fix

* up

* nit

* grumble, this is annoying

* update

* update

* up

* fix

* ....

* cleanup a bit

* nit

* typo

* typing and typo

* nit

* updates

* up

* final fix!

* update

* fix more import issues

* nuke is_paged

* up
2025-10-08 13:36:27 +02:00
2166e26cb1 [torchao] Add regex support for ModuleFqnToConfig (#41242)
* Add regex support for ModuleFqnToConfig

Summary:
Similar to https://github.com/pytorch/ao/pull/3084, we added regex support
in transformers so people can use regexes to quantize models.

See https://github.com/pytorch/ao/pull/3084 for docs and precedence of different
configurations

Uploaded model: https://huggingface.co/torchao-testing/opt-125m-ModuleFqnToConfig-v1-regex-0.14.0.dev

Test Plan:
pytest tests/quantization/torchao_integration/test_torchao.py -k test_module_fqn_to_config_regex

Reviewers:

Subscribers:

Tasks:

Tags:
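
A hedged sketch of what the regex keys enable; the `re:` prefix and the
`_default` key follow the linked torchao PR, and the FQN pattern/model id
below are illustrative only:

```python
from torchao.quantization import Int8WeightOnlyConfig, ModuleFqnToConfig
from transformers import AutoModelForCausalLM, TorchAoConfig

quant_config = ModuleFqnToConfig({
    # one regex instead of listing every decoder layer's fc2 by hand
    r"re:model\.decoder\.layers\..*\.fc2": Int8WeightOnlyConfig(),
    "_default": None,  # leave all other modules unquantized
})

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=TorchAoConfig(quant_type=quant_config),
)
```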

* Apply style fixes

* add assert for

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-10-08 11:05:15 +00:00
b13ee63b5a enable new model uts to xpu and fix some failures on xpu (#41386)
* enable new model uts to xpu and fix some failures on xpu

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* add more

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* Update test_modeling_internvl.py

* Update test_modeling_llava.py

* Update test_modeling_qwen2_5_omni.py

* Update test_modeling_llava_next_video.py

* Update test_modeling_qwen3.py

* Update test_modeling_whisper.py

* Update test_modeling_whisper.py

* Update test_modeling_llava.py

* Update test_modeling_llava.py

* Update test_modeling_qwen2_5_omni.py

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-10-08 10:14:50 +00:00
1c5ac899e8 Use accelerator API to free device memory (#41195)
* Use accelerator API to free device memory

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Use clear_device_cache

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Cleanup

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Cleanup

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-08 12:11:18 +02:00
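
A minimal sketch of the device-agnostic helper being adopted here (it covers
CUDA, XPU, MPS, and friends behind one call):

```python
from accelerate.utils.memory import clear_device_cache

# optionally run gc.collect() before emptying the accelerator cache
clear_device_cache(garbage_collection=True)
```
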
957b1f3696 Fixing comments in __init__ file (#41414)
nit
2025-10-08 12:07:26 +02:00
13791d8f48 [v5] Bump min version of bitsandbytes to 0.46.1 (#41283)
* bump bitsandbytes to 0.46.1

* huge cleanup

* style

* fix

* req

* fix

* importerror

* fix
2025-10-08 12:04:26 +02:00
7e475552be 🚨 [v5] Prune prune_heads (#41417)
* remove _prune_heads

* remove prune_heads

* finalize the purge

* remove other patterns
2025-10-08 10:25:13 +01:00
46db0edf3b 🚨🚨 Remove all traces of legacy cache format (#41378)
* remove

* more

* add back

* tests

* revert classes

* tests

* add exceptions

* reapply modular

* rename

* oupsi

* start with whisper

* fix tests

* fix

* fix

* fix

* typing
2025-10-08 11:14:44 +02:00
ee5488440b Tiny Cleanup - Removed duplicate class field definitions (#41293)
* Removed duplicate-class-field-definitions using RUFF PIE794

* Removed duplicate-class-field-definitions using RUFF PIE794

* Ruff format.

* Removed duplicate-class-field-definition

* Added New ruff rule to detect duplicate class field defs

* remove comment

* order

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-08 10:49:34 +02:00
34dcd73b57 v5 dev version (#41436) 2025-10-08 10:45:33 +02:00
3553f0bc23 Fix overriding common_kwargs defaults in processor calls (#41381)
* set common_kwargs defaults before updating with kwargs

* change order to override defaults common_kwargs
2025-10-07 23:13:56 -04:00
242eb9cbdc Remove deprecation warning (#41425)
* remove

* fix space
2025-10-07 19:21:14 +02:00
50090c3fc8 [v5] Delete left traces of feature extractor (#41321)
delete the left traces
2025-10-07 18:24:08 +02:00
ccbaa1670a Fix incorrect assignment in update_device_map for GPTQ quantizer (#41328)
Fix incorrect assignment in update_device_map for GPTQ quantizer

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-07 17:28:55 +02:00
c562c5d801 [v5] Bump accelerate to 1.1.0 (#41234)
* bump to 1.1.0 !

* bump accelerate

* fix

* None

* fixed !

* style
2025-10-07 17:18:32 +02:00
88e946e062 Fix early CUDA initialisation (#41409)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-07 14:37:17 +01:00
93464a0279 Prefer raising TypeError exception for invalid type (#41346)
* Fixed raising of TypeError exception for invalid type

* Fixed failing tests.
2025-10-07 13:11:42 +00:00
0c9a72e457 [Model] Lfm2Moe (#41401)
* [new-models] LFM2-MoE

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [docs] add in template lfm2_moe doc files

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [configuration] update configuration class

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [modular][lfm] minor: fix rotary_emb typo

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [modeling] modular/modeling files for Lfm2Moe

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [modeling][lfm2_moe] fix Lfm2Moe modular/modeling

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [configuration][lfm2_moe] update configuration keys with latest config changes

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [misc] make fixup

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [modular][lfm2_moe] address comments: dtype, mlp, buffers

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [configuration][lfm2_moe] add initializer_range

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [modular][lfm2_moe] include init_weights to pass test_initialization

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [tests][causal_lm] include pos_emb as possible rope attribute

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [modeling][lfm2_moe] remove load_balancing_loss_func due to lack of support for hooking expert biases

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [misc] make style

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [modeling][lfm2_moe] MoE refactor PR update in LFM2Moe

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [tests] lfm2_moe: unit tests

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [misc] update LFM2-8B-A1B repo id

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [tests] lfm2: update ModelTests for lfm2

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* Update LFM2 documentation

Updated the LFM2 documentation to reflect the addition of a new model size and clarified architectural details.

* Add Lfm2Moe documentation

Add Lfm2Moe model documentation with overview and example usage.

* [misc] fix ci

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [docs] remove trust_remote_code

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [misc] ci: fix modular

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* reapply modular

* simplify

* remove static address and inplace op

* simplify

* simplify a bit more the modular

* imports

---------

Signed-off-by: Paul Pak <paulpak58@gmail.com>
Co-authored-by: Maxime Labonne <81252890+mlabonne@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-07 15:09:58 +02:00
b4428d545f Fix test for model with dotted name and relative imports (#41343) 2025-10-07 13:55:54 +01:00
0464d9eb37 [Cache] lfm2 cache: allocate empty kv layers during init (#41396)
* [Cache] lfm2 cache: allocate empty kv layers during init

Signed-off-by: Paul Pak <paulpak58@gmail.com>

* [Cache] lfm2_cache: update modular file

Signed-off-by: Paul Pak <paulpak58@gmail.com>

---------

Signed-off-by: Paul Pak <paulpak58@gmail.com>
2025-10-07 14:01:31 +02:00
da7b8ce11f [kernels] Kernel Config (#41232)
* first config

* add kernel_config

* add import logic

* fixing style

* compare class name

* add comments

* rm import

* adding kernel md files

* add to toctree

* adding to main_classes

* simplify required config

* add to doc

* style

* store the mapping

* remove nested func

* add hub mixin

* fix

* imports

* fix
2025-10-07 13:58:20 +02:00
4763b8c5b8 Correct numerical regression in vision embeddings (#41374)
created modeling file
2025-10-07 13:43:24 +02:00
caa14e7dab fix resample in asr pipeline (#41298) 2025-10-06 17:31:10 +00:00
73f8c4b8ad fix asr ut failures (#41332)
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-10-06 17:12:19 +00:00
57e82745f9 [v5] Sync Bert and Bart eager attention (#41248)
* remove from modeling files

* remaining changes

* style / copies

* revert deprecated models and fixup some models

* oops

* sync attn impl

* fix style/copies

* fix distilbert

* remove dim check
2025-10-06 18:49:01 +02:00
505387c05b Update from pretrained error when loading (#33380)
* init commit

* style

* take comments into account

* merge with main and simplify

* nits

* final

* small fixes

* fix

* super small update!

* add another test

* up up

* update

* fixes

* sort them by default
2025-10-06 16:10:19 +00:00
e00f46f16e serve: add non-streaming mode to /v1/responses; stream event parity; remove placeholder logprobs (#41353) 2025-10-06 16:04:17 +00:00
0395ed52ae [CB] Refactors the way we access paged (#41370)
* up

* refactor the way we handle paged attention

* affect serve as well

* update

* fix

* cup
2025-10-06 17:55:31 +02:00
39b0c9491b Remove unused function parameters (#41358)
Remove unused arguments

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-06 15:38:17 +00:00
11e4b5e5ee make some ut cases pass on xpu w/ latest torch (#41337)
* make some ut cases pass on xpu w/ latest torch

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* Update test_modeling_llava_onevision.py

* Apply style fixes

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-10-06 15:38:00 +00:00
fa36c973fc Remove unnecessary list comprehension (#41305)
Remove unnecessary comprehension

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-06 14:49:02 +00:00
7a1aeec36e Fixes in check_model_inputs, GPTBigCodeModel and ImageGPTModel (#40811)
* misc fixes

* fix

* Update src/transformers/models/imagegpt/modeling_imagegpt.py

* Apply suggestion from @IlyasMoutawwakil

* pickup use_cache from args input as well

* fix
2025-10-06 16:34:24 +02:00
297a41a6cf Use canonical get_size_with_aspect_ratio (with max_size) from transformers.image_transforms to fix #37939 (#41284)
* Use canonical get_size_with_aspect_ratio (with max_size) from transformers.image_transforms to fix #37939

* Fix import sorting/style

* Fix import order

* Refactor: use canonical get_size_with_aspect_ratio across image processors (except YOLOS)

This commit updates image processing utilities in multiple model processors to use the shared
transformers.image_transforms.get_size_with_aspect_ratio for consistent resizing logic and
aspect ratio handling.

YOLOS processors are intentionally left unchanged in this commit to preserve their current
behavior and avoid breaking model-specific padding/resizing assumptions. YOLOS will be updated
in a dedicated follow-up PR once compatibility is fully verified.

* ruff fixes

* Fix check_copies.py references for get_size_with_aspect_ratio to use canonical transformers.image_transforms version

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-10-06 10:15:56 -04:00
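
A short sketch of the canonical helper (it scales so the short edge hits
`size` while capping the long edge at `max_size`, preserving aspect ratio):

```python
from transformers.image_transforms import get_size_with_aspect_ratio

# (height, width) of the input image; target short edge 800, long edge cap 1333
new_height, new_width = get_size_with_aspect_ratio((480, 640), 800, max_size=1333)
print(new_height, new_width)  # short side scaled to 800, long side ~1067
```
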
ae60c77689 Fix flash_attention.py: wrong argument passing for attn_implementation (#41347)
* Fix flash_attention.py: wrong argument passing for attn_implementation

The name of the attn type argument for `_flash_attention_forward()` should be `implementation`, instead of `attn_implementation`, which is currently used in the function call. This results in a wrong type specification.

* modify the kwargs inside _flash_attention_forward

* fix the doc

* fix typo

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-06 15:36:40 +02:00
6bf6e36d3b [testing] update test_longcat_generation_cpu (#41368)
* fix

* Update tests/models/longcat_flash/test_modeling_longcat_flash.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-10-06 13:21:29 +00:00
4903cd4087 🚨 Remove BetterTransformer (#41367)
remove
2025-10-06 15:18:12 +02:00
a5700c497e Better typehints for apply_chat_template (#41355) 2025-10-06 13:14:03 +00:00
089d573aca Fix typo in model proposal template (#41352) 2025-10-06 13:06:50 +00:00
c27b67f0cd 🚨 [v5] Remove relative position embeddings (for bert like models) (#41170)
* remove from modeling files

* remaining changes

* style / copies

* revert deprecated models and fixup some models

* oops
2025-10-06 14:21:41 +02:00
a89bdcf5f1 Fixing a typo for BLT model (#41325) 2025-10-06 12:16:45 +00:00
0452f28544 [ModularChecker] QOL for the modular checker (#41361)
* update

* fancy table fancy prints

* download to cache folder, never need it ever again

* style

* update based on review
2025-10-06 12:52:10 +02:00
9db58abd6e Check model inputs - hidden states (#40994)
* update all models

* fix copies

* skip aria tests

* update other models

* skip should be in test, not tester

* i think this is more descriptive as a name

* find and replace for new models
2025-10-06 11:48:52 +02:00
db711210d2 Fix trainer for py3.9 (#41359)
fix
2025-10-06 11:36:05 +02:00
163601c619 Standardize PretrainedConfig to PreTrainedConfig (#41300)
* replace

* add metaclass for full BC

* doc

* consistency

* update deprecation message

* revert
2025-10-06 11:34:02 +02:00
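
A generic sketch of the metaclass-based backward-compat pattern the bullets
describe; this is not the exact transformers implementation, just the shape
of the technique:

```python
import warnings

class PreTrainedConfig:
    """New canonical name."""

class _DeprecatedAliasMeta(type):
    def __instancecheck__(cls, obj):
        # old isinstance(x, PretrainedConfig) checks keep passing
        return isinstance(obj, PreTrainedConfig)

class PretrainedConfig(PreTrainedConfig, metaclass=_DeprecatedAliasMeta):
    def __init__(self, **kwargs):
        warnings.warn(
            "PretrainedConfig is deprecated, use PreTrainedConfig instead",
            FutureWarning,
        )
        super().__init__()

cfg = PreTrainedConfig()
assert isinstance(cfg, PretrainedConfig)  # full BC for type checks
```
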
55b172b8eb 🚨 Bump to Python 3.10 and rework how we check 3rd-party libraries existence (#41268)
* cleanup

* add check

* fix

* remove all global variables

* fix

* add lru caches everywhere

* fix

* fix

* style

* improve

* reorder all functions

* fix order

* improve

* fix

* fix

* fix
2025-10-06 11:04:19 +02:00
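
The pattern those bullets point at, as a sketch (helper name is illustrative):

```python
import importlib.metadata
import importlib.util
from functools import lru_cache

@lru_cache
def is_package_available(pkg_name: str) -> bool:
    # no module-level booleans: compute lazily, cache forever
    if importlib.util.find_spec(pkg_name) is None:
        return False
    try:
        importlib.metadata.version(pkg_name)  # confirm an installed distribution
        return True
    except importlib.metadata.PackageNotFoundError:
        return False

print(is_package_available("torch"))  # first call probes, later calls are cached
```
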
1ec0b54414 Rope for Qwen2--5-vl (#41173)
qwen2--5-vl
2025-10-06 10:56:29 +02:00
0947b9042c Fixed tiny incorrect import in gemma3 (#41354)
Fixed tiny import issue in gemma3
2025-10-06 10:55:42 +02:00
e11a00a16f JetMoe Fix jetmoe after #40132 (#41324)
* update

* up
2025-10-04 11:02:13 +02:00
1bc75db9bd Fix lr_scheduler_parsing (#41322)
* fix

* fix
2025-10-03 17:51:17 +02:00
c2b3cc3e64 Fix jamba (#41309)
* reactivate tests

* first pass

* fix

* fix bias

* fix and simplify

* finally fix this stupid bug

* add skips

* remove bad stuff

* fix copies

* simplify
2025-10-03 16:54:19 +02:00
5abfa43f02 Security/fuyu (#41320)
remove reference to compromised repo
2025-10-03 14:13:41 +00:00
217ff1e4ef AutoAWQ tests (#41295)
* initial commit

* fix

* fix multi gpu

* fix expected output

* fix

* latest

* add comment

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-10-03 15:17:10 +02:00
5339f72b9b 🚨 [unbloating] unify TypedDict usage in processing (#40931)
* just squash commits into one

* fix style
2025-10-03 14:17:59 +02:00
42bcc81ba2 Minor security fix for ssh-runner.yml (#41317)
security issue

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-03 14:14:34 +02:00
cd4422922e Add modular detector (#41289)
* doc

* doc

* no remote code

* safe-ize the release + remove remote

* fixes

* add some documentation as well
2025-10-03 14:11:10 +02:00
59eba49237 download and use HF Hub Cache (#41181)
use hub cache

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-03 11:11:37 +02:00
de3ee737cf Fix README.md error when installing from source (#41303) 2025-10-02 16:08:27 -07:00
b914445f77 Italian translation for README.md (#41269)
chore: add Italian translation for README.md
2025-10-02 15:59:28 -07:00
41e5abac5c FIX: Bug in PEFT integration delete_adapter method (#41252)
The main content of this PR is to fix a bug in the delete_adapter method
of the PeftAdapterMixin. Previously, it did not take into account
auxiliary modules from PEFT, e.g. those added by modules_to_save. This
PR fixes this oversight.

Note that the PR uses a new functionality from PEFT that exposes
integration functions like delete_adapter. Those will be contained in
the next PEFT release, 0.18.0 (yet unreleased). Therefore, the bug is
only fixed when users have a PEFT version fulfilling this requirement.
I ensured that with old PEFT versions, the integration still works the
same as previously. The newly added test for this is skipped if the PEFT
version is too low.

(Note: I tested locally that the test passes with PEFT 0.18.0)

While working on this, I also cleaned up the following:

- The active_adapter property has been deprecated for more than 2 years
  (#26407). It is safe to remove it now.
- There were numerous small errors or outdated pieces of information in
  the docstrings, which have been addressed.

When PEFT < 0.18.0 is used, although we cannot delete modules_to_save,
we can still detect them and warn about it.
2025-10-02 18:36:57 +02:00
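A sketch of the version gate this entry describes, assuming the new PEFT integration helpers ship in 0.18.0 as stated; the helper name below is illustrative, not PEFT's actual API.

```python
import importlib.metadata
from packaging import version

def peft_has_integration_helpers() -> bool:
    # The full fix only applies when PEFT exposes the new integration
    # functions (0.18.0+); older versions keep the previous behavior and
    # merely warn that modules_to_save cannot be deleted.
    try:
        peft_version = importlib.metadata.version("peft")
    except importlib.metadata.PackageNotFoundError:
        return False
    return version.parse(peft_version) >= version.parse("0.18.0")
```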
da3c7d1d36 🚨 [DistilBert] Refactor Attention (#41163)
* refactor

* allow pos ids for flattened sequences
2025-10-02 17:50:48 +02:00
e54defcfc2 [Flex Attn] Fix lse x attention sinks logic (#41249)
fix
2025-10-02 17:49:39 +02:00
b3bd815786 Fix mxfp4 dequantization (#41292)
fix
2025-10-02 16:47:42 +02:00
e4930d6bde 🚨 [V5] Remove deprecated resume_download (#41122)
Remove deprecated `resume_download`

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-10-02 16:44:34 +02:00
7adb43e60a Build doc in 2 jobs: en and other languages (#41290)
* separate

* separate

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-02 14:33:57 +00:00
e1f1d32af0 Remove some previous team members from allow list of triggering Github Actions (#41263)
* delete

* delete

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-10-02 16:32:28 +02:00
1d7ebff398 Fix - remove deprecated args checking in deepspeed integrations (#41282)
Remove deprecated args checking in deepspeed integrations

Signed-off-by: nguyen599 <pnvmanh2123@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-10-02 13:59:50 +00:00
9d02602f0f Remove test_initialization (#41261)
remove it
2025-10-02 15:23:43 +02:00
248e7ef8bc [docs] remove references to recently deleted classes in non-en docs (onnx, feature processors) (#41286)
remove references to old classes
2025-10-02 12:59:28 +00:00
bc33fd3fc2 Add processor and intergration test for qwen3vl (#41277)
* support aux loss in qwen3vlmoe

* update qwen3vl processor test!

* add integration tests for qwen3vl-30a3

* remove duplicated decorator

* code clean

* fix consistency

* do not inherit from nn.Linear for better quantization

* pass check
2025-10-02 14:59:04 +02:00
639ad8ccd9 feat: use aws-highcpu-32-priv for amd docker img build (#41285)
* feat: use `aws-highcpu-32-priv` for amd docker img build

* feat: add `workflow_dispatch` event to docker build CI
2025-10-02 12:53:14 +00:00
894a2bdd8c Fix pylint generator warnings (#41258)
Fix pylint generator warnings

Signed-off-by: cyy <cyyever@outlook.com>
2025-10-02 12:35:42 +00:00
1cc9069551 Fix unnecessary single-item container checks (#41279)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-02 12:35:11 +00:00
4f286fbbf8 Biogptlogits (#41270)
added logits slicing to BioGpt for seq classifier

Signed-off-by: Aviral <aviralkamaljain@gmail.com>
2025-10-02 12:33:48 +00:00
1d91a8a454 Use max/min (#41280)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-02 12:15:27 +00:00
f1b64c5b06 Unify is_torchvision_v2_available with is_torchvision_available (#41259)
Fix is_torchvision_v2_available

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-02 11:56:37 +00:00
2f3e266692 fix async client for transformers chat (#41255)
* fix-client

* fix
2025-10-02 13:23:37 +02:00
313504bcdd 🚨 [v5] remove deprecated generate classes (constraints and beam scorers) (#41223)
rm
2025-10-02 12:11:11 +01:00
8f14300663 Allow private Space id for Trackio (#40948)
* allow private Space id for trackio

* complete docstring
2025-10-02 12:38:25 +02:00
734732140a Deprecate Trackio environment variables and deploy to Spaces by default (#40950)
* allow private Space id for trackio

* complete docstring

* Deprecate environment variables for Trackio integration; use TrainingArguments instead and deploy by default

* style

* Enhance documentation for Trackio Space ID in TrainingArguments
2025-10-02 12:37:55 +02:00
7938e91faa MoE + vllm = 😻 (#40132)
* update modeling mixtral

* oups

* fix

* better naming?

* compute softmax and top_k inside the experts

* update minimax as well

* models that will need an update

* more models that need a fix

* stash

* fix mixtral

* update olmoe

* update

* update

* current changes

* nits

* molmoe is now fixed

* olmoe is good to go!

* refactor qwen2_moe

* fixes

* fixed moe

* fix qwen2 modular

* nit

* qwen2_moe test script works

* tricky rope !

* fix qwen3

* DeepSeek v3 MoE Standardization (#40538)

* DeepSeek-v3

Shared

Shared

* Dependents of DS3

* Standardize GLM4V MoE (#40539)

* up

* Standardize VitPose's MoE (#40549)

* VitPose

* outside

* outside

* outside

* fix

* update dbrx

* dbrx... the magix

* Refactor Ernie 4.5's MoE (#40547)

* Isolate Ernie fixes

* fix moe

---------

Co-authored-by: Vasqu <antonprogamer@gmail.com>

* fix style

* style

* fix copies

* style

* latest changes

* fixes

* had to stage

* current updaters

* up

* another modular

* modular graniteMoe

* some update

* draft another modular moe

* updaters

* up

* fix nit

* q3 nit

* fix phi moe

* we're going up up up up its our mooooment

* fix switch transformers this time around

* up

* gptsan japanese is deprecated forget about it

* fix mixtral to not be a linear (gives us more freedom)

* update

* fix copies gone wrong try catch nothing

* fix mixtral

* new refactor again

* update aria as well

* up dbrx and deepseekv3

* nit

* fix phimoe?

* fix deepseek v3

* nits

* don't bother with this one please

* up olmoe

* ??

* fix olmoe

* yups

* fixup

* ish

* hot patch

* new qwen3

* updates

* up

* nit

* fix copies

* fix

* nits

* we're going up up up

* nits

* switch_transformers edge case

* lol modular gptsan?

* fix deepseek

* finally all modeling match modular

* update

* up

* up

* dang

* up

* up aria

* fix dbrx

* nits here and there

* finish fixing dbrx

* fix deepseek

* upd

* up

* fix flex olmo

* updated

* update jamba

* JAMBA is stil a bit todo

* forward forward

* fix dots11

* update

* fix hunyuan

* fix some other

* update phimoe

* phimoe is finally submitted

* submit granitemoe as well

* try to fix some other models, reduces some of the failures

* fix olmoe and qwem2moe

* up

* up

* fix qwen2_moe

* update modular make it again, simpler

* nits

* up

* up

* fix

* some switch reductions

* up

* fix qwen3vl

* some fixes to jetmo

* these should be shipped to the modular to fix jetmoe

* fix most of the nllb failures

* more nllb fixes

* fix the modular

* remove nllb modular as it sucks for now

* ?

* fix granitemoe

* granitemoehybrid don't have rope

* use rope when rope, no rope when no rope

* updates

* finish fixing granitemoe

* fix most of minimax

* fix

* update modular

* ?

* up

* up jetmoe still broken

* up

* fix, now align the moe

* fix jetmoe

* fix styling and qwen3 repo consistency

* update

* up up

* update ruff?

* nits

* modeling is good now for switch

* fix

* more fixes to switch!

* fix some switch tests

* ?

* ?

* up

* fix switch modular!

* nit?

* up

* subtest

* can't believe I wasted so much time on this...

* fix

* updates

* nits

* nit: jamba is really annoying

* ?

* fix?

* oups

* good good

* styling

* up

* make sure qwen2 sliding works!

* fix dbrx small

* lol

* nits

* fix one test

* fix load balancing loss issue

* fix jamba

* fix nllbmoe

* fix jamba consistency and doc?

* up

* these are correct

* up

* up

* up

* some of the final cleanup

* update

* up

* fix some revert in granitemoe

* bring back attention multipliers for the granite family we'll see later on if they need removal

* small jamba fix docstring and typing

* fix phimoe

* yup

* fix unknown return_dict in granitemoes

* up

* fix qwen config

* fix phimoe check quality

* nits

* update based on caught non relative imports!

* fix dbrx

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* fix copies

* fixup

* fix dot1 regression!

* fix phimoe issue

* fix phi moe

* fix float() for some models

* fix jamba regression

* up

* more dtype issues

* fix deepseek2 and 3?

* proper update

* fix modular deepseek!

* jamba jambaaaaaa

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-10-02 12:12:44 +02:00
e6a8e7debe Fix binding of video frames to video placeholder in InternVL model (#41237)
* Fix binding video frames to video placeholder in prompt

Signed-off-by: Daniel Bershatsky <daniel.bershatsky@gmail.com>

* Add test on binding video frames to prompt

Signed-off-by: Daniel Bershatsky <daniel.bershatsky@gmail.com>

* Fix code style issues

Signed-off-by: Daniel Bershatsky <daniel.bershatsky@gmail.com>

* Fix broken tests on `InternVLProcessor`

Signed-off-by: Daniel Bershatsky <daniel.bershatsky@gmail.com>

* Add `return_tensors` to video processor defaults

Signed-off-by: Daniel Bershatsky <daniel.bershatsky@gmail.com>

---------

Signed-off-by: Daniel Bershatsky <daniel.bershatsky@gmail.com>
2025-10-02 09:43:35 +00:00
30b79effb5 Remove SageMakerTrainer (#41267)
* Remove SageMakerTrainer

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More removal

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-02 09:16:32 +00:00
aabf0a03cb Fix multi-video timestamp bug in Qwen-3-VL and GLM4V (#41229)
* fix multi-video timestamp bug in qwen3vl,glm4v

* run make fix-copies to sync modular files

* run make fix-copies to sync modular files

---------

Co-authored-by: UBT <daqin.luo@ubtrobot.com>
2025-10-02 11:15:57 +02:00
bcdd5532bf Use regex detailed flags (#41264)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-02 08:34:09 +00:00
55d63e86ea fix asr pipeline ut failures (#41275)
* fix asr pipeline ut failures

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* make style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-10-02 10:32:03 +02:00
522b79a346 add more activation kernels, follow up (#40944)
* add more activation kernels

* fixing style

* fix version
2025-10-02 08:45:05 +02:00
9f2d5666f8 docs: update bitsandbytes platform support (#41266) 2025-10-01 14:27:19 -04:00
9d8f693c7e add peft team members to issue/pr template (#41262)
* add

* Update .github/PULL_REQUEST_TEMPLATE.md

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-10-01 17:26:59 +00:00
94bbf8e199 Resolve remote custom module path warnings (#41243) 2025-10-01 15:55:42 +00:00
c4b505d0f7 Don't convert to safetensors on the fly if the call is from testing (#41194)
* don't convert

* disable

* Update src/transformers/modeling_utils.py

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* fix

* disable

* disable

* disable

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-10-01 17:46:21 +02:00
01c9e1ba68 [t5gemma] fix get_text_config and related fixes (#40939)
* tmp commit

* t5gemma fixes
2025-10-01 15:55:26 +01:00
025531981c [FA3] Fix masking and loading logic in same process (#41217)
fix loading and fa3 masking
2025-10-01 16:36:12 +02:00
3256773974 FP-Quant NVFP4 and Python 3.9 support (#39876)
* quartet

* quartet qat -> quartet

* format

* bf16 backward

* interfaces

* forward_method

* quartet -> fp_quant

* style

* List -> list

* list typing

* fixed format and annotations

* test_fp_quant

* docstrings and default dtypes

* better docstring and removed noop checks

* docs

* pseudoquantization support to test on non-blackwell

* pseudoquant

* Pseudoquant docs

* Update docs/source/en/quantization/fp_quant.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/quantization/fp_quant.md

* Update docs/source/en/quantization/fp_quant.md

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update tests/quantization/fp_quant_integration/test_fp_quant.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update tests/quantization/fp_quant_integration/test_fp_quant.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* small test fixes

* dockerfile update

* spec link

* removed `_process_model_after_weight_loading`

* toctree

* nvfp4

* nvfp4 tests

* FP-Quant version bumped

* nvfp4 default and docs update

* trainable

* cpu if pseudoquant

* proper group size selection

* gsr

* qutlass requirement version bumo

* Upstream docker copy

* docs update

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-10-01 13:58:22 +00:00
d848a3953a Remove all instances of is_safetensors_available (#41233)
* safetensors is a core dep

* fix

* ok

* simplify branching

* keep it for now

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-10-01 13:57:28 +00:00
e4913bdf50 🚨 [v5] Remove SinkCache (#41107)
Remove SinkCache

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-01 13:46:55 +00:00
1c8f206ecc Fix pylint warnings (#41222)
* Remove unused variables

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove reimported packages

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix pylint warnings

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Simplify

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-01 13:16:22 +00:00
3016717f0d Use removeprefix and removesuffix (#41240)
* Use removeprefix and removesuffix

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-01 13:13:04 +00:00
ca975f1cb8 [V5] Remove deprecated transformers.onnx (#41214)
* Remove deprecated transformers.onnx

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove onnx docs

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-10-01 12:17:04 +00:00
1d1ac07893 [repo utils] Update models_to_deprecate.py (#41231)
* update models_to_deprecate

* exclude this file

* handle typos and aliases

* don't commit files

* PR suggestions; make fixup
2025-10-01 12:01:52 +00:00
bcec3e2175 fix TrainerIntegrationDeepSpeed UT failures (#41236)
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-10-01 13:55:01 +02:00
ae879f67f8 🚨 [v5] Delete feature extractors used for vision (#41174)
* bye bye

* remove from docs

* do not use feature extractor here

* fix docs

* do not delete it

* forgot these
2025-10-01 13:20:58 +02:00
1c4d9982d3 Use math.log2 (#41241)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-01 09:52:31 +00:00
db1cc65c06 Video processor accepts single frames on cuda (#41218)
* fix

* why was it np if the input is in torch
2025-10-01 10:55:11 +02:00
f22cb1e868 fix qwen text config (#41158)
* fix qwen text config

* fix tests

* fix one more test

* address comments
2025-09-30 17:23:44 +00:00
374ded5ea4 Fix white space in documentation (#41157)
* Fix white space

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Revert changes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix autodoc

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-30 09:41:03 -07:00
16a141765c [docs] Fix tp_plan (#41205)
remove manual
2025-09-30 09:27:50 -07:00
5d1e853032 [Trainer] deprecate num_train_tokens (#41165)
* dep

* fix

* fix
2025-09-30 15:53:16 +00:00
cecd92849e [v5] Remove train kwargs (#41127)
* rm train kwargs

* fix
2025-09-30 17:43:25 +02:00
103fa6d235 [v5] Remove deprecated prediction loop (#41123)
* rem deprecated

* more

* rm all instances of legacy arg
2025-09-30 17:43:01 +02:00
aa3e8798ba [v5] Remove tokenizer from Trainer (#41128)
* tokenizer deprecated

* style

* forgot this

* style
2025-09-30 17:42:10 +02:00
e99dee6470 Remove old sagemaker api support (#41161)
* fix

* fix
2025-09-30 17:41:52 +02:00
dded9fd112 [v5] More Training Args cleaning (#41131)
clean
2025-09-30 17:38:07 +02:00
6fb6117abe Revert "Fix DeepSpeed mixed precision precedence over Accelerate defaults" (#41124)
* Revert "Fix DeepSpeed mixed precision precedence over Accelerate defaults (#3…"

This reverts commit df67cd35f0ca1a1cbf7147b2576db31b16200cf4.

* fix
2025-09-30 17:37:42 +02:00
5bdb70450d Fix sliding window attn mask (#41228)
* Fix sliding window attn mask

* Clearer test

* Apply style fixes

* If Picasso made ascii drawings he would have made this

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-30 17:22:53 +02:00
a61fc6a0b9 Fix typing of train_args (#41142)
* Fix typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix fsdp typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-30 14:28:02 +00:00
919a4845fb Unify is_torchvision_v2_available with is_torchvision_available (#41227)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-30 15:21:49 +01:00
8e7b0655f1 update code owners (#41221)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-30 16:21:19 +02:00
2dd175e6bb Adapt to the SDPA interface to enable the NPU to call FlashAttentionScore (#41143)
Adapt to the SDPA interface to enable the NPU to call FlashAttentionScore.

Co-authored-by: frozenleaves <frozen@Mac.local>
2025-09-30 14:19:57 +00:00
cf0887f62c Remove old Python code (#41226)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-30 14:15:59 +00:00
52f5eca7c9 🚨 [v5] Remove headmasking (#41076)
* first attempt at removing

* copies

* last bits in core

* quick fixes

* tests purge

* docs and examples

* some fixes

* more

* another round of cleanups

* fix

* fix a bunch of models

* fix dummy bert

* fix

* fix new model

* fix signature change

* fix

* fix style/copies

* new models

* fix copies didnt find that damn

* test

* this shouldnt have happened during model addition
2025-09-30 16:04:57 +02:00
a80f05dfcb [generate] cache missing custom generate file (#41216)
* cache missing custom generate file

* make fixup
2025-09-30 13:32:24 +00:00
1f1e93e095 Align pull request template to bug report template (#41220)
The only difference is that I don't direct users to https://discuss.huggingface.co/ for hub issues.
2025-09-30 14:25:41 +02:00
2a596f5b2f [ESM] add accepts_loss_kwargs=False to EsmPreTrainedModel (#41006)
add accepts_loss_kwargs=False to EsmPreTrainedModel

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-30 12:06:47 +00:00
3edd8048b0 Trainer: Pass num_items_in_batch to compute_loss in prediction_step (#41183)
* Add num_items_in_batch computation to predict_step.

* address comments.

* Fix test cases.

* fixup

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-30 09:45:17 +00:00
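A sketch of what passing `num_items_in_batch` through `prediction_step` amounts to, assuming the usual `-100` ignore-index convention for labels; names and shapes are illustrative.

```python
import torch

def count_items_in_batch(labels: torch.Tensor, ignore_index: int = -100) -> torch.Tensor:
    # Only label positions that contribute to the loss are counted, so the
    # eval loss is normalized the same way as the training loss.
    return (labels != ignore_index).sum()

labels = torch.tensor([[1, 2, -100], [3, -100, -100]])
num_items_in_batch = count_items_in_batch(labels)  # tensor(3)
# ...then inside prediction_step (illustrative):
# loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
```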
59035fd0e1 Avoid assumption that model has config attribute in deepspeed (#41207)
Avoid assumption that model has config in deepspeed
2025-09-30 11:42:50 +02:00
d97397787e Wait for main process in _save_checkpoint to ensure best checkpoint exists (#40923)
* Update trainer.py

* fix

* fix format

* move barrier, delete redundant
2025-09-30 11:41:03 +02:00
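A self-contained sketch of the barrier placement this entry describes: only the main process writes the checkpoint, and every rank waits before anything reads the best checkpoint back. The classes and callables here are simplified stand-ins for the real Trainer/accelerate objects.

```python
class _Args:
    should_save = True          # True only on the main process

class _Accelerator:
    def wait_for_everyone(self):
        # In the real accelerate API this is a cross-process barrier;
        # in this single-process sketch it is a no-op.
        pass

def save_checkpoint(args, accelerator, write_fn, load_best_fn):
    if args.should_save:
        write_fn()                       # main process writes the checkpoint
    accelerator.wait_for_everyone()      # all ranks wait for the write
    return load_best_fn()                # now safe on every rank

save_checkpoint(_Args(), _Accelerator(), lambda: print("write"), lambda: print("load best"))
```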
06c04e0851 Deprecate half_precision_backend (#41134)
* deprecate

* remove

* rm apex

* fix

* fix

* fix doc
2025-09-30 11:36:44 +02:00
0e5a975608 Fix Qwen3-Omni audio_token_id serialization issue (#41192)
Fix Qwen3-Omni audio_token_id serialization by overriding parent's attribute_map

- Override attribute_map in Qwen3OmniMoeThinkerConfig to prevent inheritance of incorrect mapping
- Parent class maps audio_token_id -> audio_token_index, but implementation uses audio_token_id directly
- Fixes issue where custom audio_token_id values were not preserved during save_pretrained/from_pretrained cycles

Fixes #41191
2025-09-30 11:15:56 +02:00
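A self-contained sketch of the mechanism behind this fix. In transformers configs, `attribute_map` redirects attribute access, so an inherited `{"audio_token_id": "audio_token_index"}` entry silently rewrites the attribute; overriding the map in the subclass (as the commit does for `Qwen3OmniMoeThinkerConfig`) keeps the value under its own name across save/load round-trips. The classes below are simplified stand-ins, not the real ones.

```python
class ParentConfig:
    # transformers-style attribute_map: attribute writes on the key are
    # redirected to the value attribute.
    attribute_map = {"audio_token_id": "audio_token_index"}

    def __setattr__(self, name, value):
        name = self.attribute_map.get(name, name)
        super().__setattr__(name, value)

class ThinkerConfig(ParentConfig):
    # The fix: override the inherited map so `audio_token_id` is stored
    # under its own name and survives serialization round-trips.
    attribute_map = {}

parent = ParentConfig()
parent.audio_token_id = 151646
print(vars(parent))   # {'audio_token_index': 151646} -- remapped by parent

thinker = ThinkerConfig()
thinker.audio_token_id = 151646
print(vars(thinker))  # {'audio_token_id': 151646} -- kept as-is
```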
42c682514b docs/examples(speech): pin CTC commands to Hub datasets; add Windows notes (#41027)
* examples(speech): load Common Voice from Hub; remove deprecated dataset-script references (Windows-friendly notes)

* docs/examples(speech): pin CTC streaming & other CTC commands to Hub datasets; add Windows notes

* make style

* examples(speech): align DataTrainingArguments help with datasets docs; minor wording fixes

* docs/examples(speech): address review: remove Hub subsection & Whisper tip; align dataset help text

* style: apply ruff/black/usort/codespell on examples/speech-recognition

* Apply style fixes

* Update examples/pytorch/speech-recognition/README.md

* update doc to match load_dataset

---------

Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-30 08:38:31 +00:00
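The Hub-pinned pattern this entry refers to looks roughly like the following; the dataset id, config, and split are the usual Common Voice ones and are shown for illustration (gated datasets may additionally require logging in and accepting the terms).

```python
from datasets import load_dataset

# Load Common Voice directly from the Hub instead of via a deprecated
# local dataset script.
common_voice = load_dataset(
    "mozilla-foundation/common_voice_11_0",
    "tr",                     # language configuration
    split="train+validation",
)
print(common_voice[0]["sentence"])
```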
aaf1269d83 Remove unnecessary Optional typing (#41198)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-30 08:38:05 +00:00
4a02bc7004 [docs] Fix links (#41110)
fix
2025-09-30 08:53:07 +02:00
def4a37e19 Embed interactive timeline in docs (#41015)
* embed timeline in docs (test web componentand Iframe)

* test scaling

* test multiple scales

* compensate scale in width

* set correct syle and scale

* remove bottom space created by scale

* add timeline as a separate page

* reformulate docs after review
2025-09-30 01:36:08 +00:00
3e975acc8b Fix docker quantization (#41201)
* launch docker

* remove gptq for now

* run tests

* Revert "run tests"

This reverts commit f85718ce3a21d5937bf7405b8925c125c67d1a3e.

* revert
2025-09-29 16:36:30 +00:00
8635d8e796 Fix 8bit bnb loading (#41200)
* Fix 8bit

* oups forgot the case where it is not prequantized
2025-09-29 18:34:46 +02:00
1f0e9a4778 Fix EXAONE-4.0 dummy id (#41089)
* Fix EXAONE-4.0 dummy id

* Fix exaone4 dummy (#1)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-29 16:30:55 +00:00
bd37c45354 Add EdgeTAM (#39800)
* initial comment

* test

* initial conversion for outline

* intermediate commit for configuration

* chore:init files for sam2

* adding arbitrary undefined config

* check

* add vision

* make style

* init sam2 base model

* Fix imports

* Linting

* chore:sam to sam2 classes

* Linting

* Add sam2 to models.__init__

* chore:match prompt encoder with sam2 code

* chore:prepare kwargs for mask decoder

* Add image/video predictors

* Add CUDA kernel

* Add output classes

* linting

* Add logging info

* tmp commit

* docs for sam2

* enable image processing

* check difference of original SAM2
- difference is the order of ToTensor()
- please see https://pytorch.org/vision/main/_modules/torchvision/transforms/functional.html#resize

* enable promptencoder of sam2

* fix promptencoder

* Confirmed that PromptEncoder is exactly same (Be aware of bfloat16 and float32 difference)

* Confirmed that ImageEncoder is exactly same (Be aware the linting of init)

* Confirmed that MaskDecoder is exactly same (TO DO: lint variable name)

* SamModel is now available (Need more chore for name)

* make fix-copies

* make style

* make CI happy

* Refactor VisionEncoder and PositionEmbedding

* TO DO : fix the image_embeddings and sparse_embeddings part

* pure image inference done

* reusable features fix and make style

* styling

* refactor memoryattention

* tmp

* tmp

* refactor memoryencoder
TO DO: convert and run inference on the video pipeline

* TO DO : fix the image_encoder shape

* conversion finish
TO DO: need to check video inference

* make style

* remove video model

* lint

* change

* python utils/check_docstrings.py --check_all

* python utils/check_config_attributes.py

* remove copies for sam2promptencoder due to configuration

* change __init__.py

* remove tensorflow version

* fix that to not use direct comparison

* make style

* add missing import

* fix image_embedding_size

* refactor Sam2 Attention

* add fully working video inference (refactoring todo)

* clarify _prepare_memory_conditioned_features

* simplify modeling code, remove unused paths

* use one model

* use auto_docstring

* refactor rope embeddings

* nit

* not using multimask when several points given

* add all sam2.1

* add video tmp

* add Sam2VideoSessionState + fast image proc + video proc

* remove init_states from model

* fix batch inference

* add image integration tests

* uniformize modeling code with other sam models and use modular

* pass vision tests an most model tests

* All tests passing

* add offloading inference state and video to cpu

* fix inference from image embedding and existing mask

* fix multi_boxes mask inference

* Fix batch images + batch boxes inference

* improve processing for image inference

* add support for mask generation pipeline

* add support for get_connected_components post processing in mask generation

* add fast image processor sam, image processor tests and use modular for sam2 image processor

* fix mistake in sam after #39120

* fix init weights

* refactor convert

* add integration tests for video + other improvements

* add needed missing docstrings

* Improve docstrings and

* improve inference speed by avoiding cuda sync

* add test

* skip test for vision_model

* minor fix for vision_model

* fix vision_model by adding sam2model and change the torch dependencies

* remove patch_size

* remove image_embedding_size

* fix patch_size

* fix test

* make style

* Separate hieradet and vision encoder in sam2

* fixup

* review changes part 1

* remove MemoryEncoderConfig and MemoryAttentionConfig

* pass q_stride instead of q_pool module

* add inference on streamed videos

* explicitely process streamed frames

* nit

* Improve docstrings in Sam2Model

* update sam2 modeling with better handling of inference state and cache, and separate Sam2Model and Sam2VideoModel

* improve video inference api

* change inference_state to inference_session

* use modular for Sam2Model

* fix convert sam2 hf

* modular

* Update src/transformers/models/sam2/video_processing_sam2.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix minor config

* fix attention loading error

* update modeling tests to use hub checkpoints

* Use CI A10 runner for integration tests values + higher tolerance for video integration tests

* PR review part 1

* fix doc

* nit improvements

* enforce one input format for points, labels and boxes

* nit

* last few nits from PR review

* fix style

* fix the input type

* fix docs

* add sam2 model as conversion script

* improve sam2 doc

* add rough necessary changes

* first working edgetam

* fix issue with object pointers

* Use modular as much as possible

* nit fixes + optimization

* refactor spatial perceiver

* cleanup after merge

* add working edgetam

* improve perceiver resampler code

* simplify/unify rope attention logic

* Improve comments in apply_rotary_pos_emb_2d

* add working tests

* fix test timmwrapper

* add docs

* make fixup

* nits

* fix modular

* fix modular

* PR review part 1

* split apply_rotary_pos_emb_2d

* add granularity to _prepare_memory_conditioned_features

* add dates to doc

* add separate mlp for memory attention

* Fix memory on wrong device

* store processed frames in dict

* update checkpoints in tests

* update dates

---------

Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: RUFFY-369 <prakarshkaushik369@gmail.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: Haitham Khedr <haithamkhedr@meta.com>
Co-authored-by: sangbum choi <sangbumchoi@sangbumui-MacBookAir.local>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-09-29 11:54:54 -04:00
c1db38686a [Kernels Attention] Change fallback logic to error out on explicit kernels request and include FA3 (#41010)
* fix

* be more strict

* change logic to include fa3

* fix the case where nothing is requested

* modify old tests + add kernels related tests

* style
2025-09-29 17:10:59 +02:00
5426edecab Make quantizers good citizens loading-wise (#41138)
* fix param_needs_quantization

* rewrite most hqq

* clean

* fix

* comment

* remove it from exception of safetensors

* start on bnb 4bits

* post-rebase fix

* make bnb4 bit a good citizen

* remove forgotten print

* make bnb 8bits a good citizen

* better hqq

* fix

* clean

* remove state dict from signature

* switch method

* make torchao a good citizen

* fixes

* fix torchao

* add check

* typo
2025-09-29 17:04:45 +02:00
399c589dfa Separate docker images for Nvidia and AMD in benchmarking (#41119)
Separate docker images for Nvidia and AMD
2025-09-29 17:03:27 +02:00
52cbc7c868 Fix attention sink implementation in flex attention (#41083)
* Fix attention sink implementation in flex attention

* fix dim

* fix

* Remove print

* raisae error when return_lse is False yet s_aux is providewd

* Clean test files for merge

* Update src/transformers/integrations/flex_attention.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* force return lse

* Add to doc

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-09-29 14:33:03 +00:00
de9a75f5b0 fix(trainer): Avoid moving model with device_map (#41032)
* fix(trainer): Avoid moving model with device_map

When a model is loaded with `device_map="auto"` and is too large to fit on a single GPU, `accelerate` will offload some layers to the CPU or disk. The `Trainer` would previously attempt to move the entire model to the specified device, causing a `RuntimeError` because a model dispatched with `accelerate` hooks cannot be moved.

This commit fixes the issue by adding a check in `_move_model_to_device` to see if the model has an `hf_device_map` attribute. If it does, the device placement is assumed to be handled by `accelerate`, and the `model.to(device)` call is skipped.

A regression test is added to ensure the `Trainer` can be initialized with a model that has a `hf_device_map` that simulates offloading without raising an error.

* Added the logger warning for the move model

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
2025-09-29 14:31:42 +00:00
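A minimal sketch of the guard described above; `hf_device_map` mirrors the attribute named in the commit message, while the surrounding function is simplified from the Trainer method.

```python
import logging

logger = logging.getLogger(__name__)

def move_model_to_device(model, device):
    # A model dispatched by accelerate (device_map="auto") carries an
    # hf_device_map; calling .to(device) on it raises, so skip the move
    # and let accelerate keep handling placement.
    if getattr(model, "hf_device_map", None) is not None:
        logger.warning("Model is dispatched via a device_map; skipping model.to(device).")
        return model
    return model.to(device)
```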
bcc0dae77c enable flex attention ut cases on XPU (#40989)
* enable flex attention ut cases on XPU

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-29 14:30:49 +00:00
fcd483f0ff Bump hfh prerelease version (#41175) 2025-09-29 16:28:36 +02:00
a3fa1d3993 Fix inaccurate train_tokens_per_second when resuming from checkpoint (#41156)
* fix(trainer): Fix the issue of inaccurate token count in training sessions

During the training process, the initial token count was not saved, leading to inaccurate speed calculation. Now, the initial token count is saved and the increment during the session is calculated, ensuring that the speed metric accurately reflects the performance of the current training session.

* fix error

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-29 16:22:35 +02:00
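A sketch of the accounting fix, assuming the token counter lives in `state.num_input_tokens_seen` as in `TrainerState`; the point is that only this session's increment enters the numerator.

```python
import time

class TrainerState:
    num_input_tokens_seen = 0

state = TrainerState()
state.num_input_tokens_seen = 1_000_000   # restored from a checkpoint

start_time = time.time()
initial_tokens = state.num_input_tokens_seen   # the fix: snapshot at resume

state.num_input_tokens_seen += 50_000     # tokens processed this session
elapsed = max(time.time() - start_time, 1e-8)

# Before the fix the numerator was all 1,050,000 tokens ever seen, which
# inflated the metric after resuming; now only the session's delta counts.
train_tokens_per_second = (state.num_input_tokens_seen - initial_tokens) / elapsed
```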
ad74fba085 [v5] Remove model_parallel deprecated feature (#41166)
* fix

* remove model parallel

* style

* removed a bit too much

* rm comments

* fix
2025-09-29 16:14:03 +02:00
38a08b6e8a More typing fixes (#41102)
* Fix noqa

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* fix typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* remove noqa

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix chars

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-29 13:11:53 +00:00
4fade1148f [tests] CausalLMTester automatically infers other test classes from base_model_class 🐛 🔫 (#41066)
* halfway through the models

* update test checks

* refactor all

* another one

* use tuples

* more deletions

* solve bad inheritance patterns

* type

* PR ready?

* automatic model class inference from the base class

* vaultgemma

* make fixup

* make fixup

* rebase with gpt2

* make fixup :'(

* gpt2 is special
2025-09-29 15:05:08 +02:00
cdba28c344 [XPU] Add MXFP4 support for XPU (#41117)
* XPU supports gpt-oss MXFP4

* Complete MXFP4 UT file and comment information

* Complete MXFP4 UT file and comment information

* Fix code style

* Fix code style

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-29 12:10:41 +02:00
2dcb20dcec CI Runners - move amd runners mi355 and 325 to runner group (#41193)
* Update CI workflows to use devmi355 branch

* Add workflow trigger for AMD scheduled CI caller

* Remove unnecessary blank line in workflow YAML

* Add trigger for workflow_run on main branch

* Update workflow references from devmi355 to main

* Change runner_scale_set to runner_group in CI config
2025-09-29 11:14:19 +02:00
d0d574b1e4 Modernbert fix (#41056)
* Add FA to docker

* Fixed padding for mdernbert

* Fixed logits and hidden states extraction in ModernBertForMultipleChoice

* Added a test for ModernBertForMultipleChoice

* fixes

* More fixes and GREEN CI

* consistency

* moar consistency
2025-09-29 10:52:44 +02:00
071eb5334f handle flash slow tests (#41072)
* handle flash slow tests

* update patch mask to 1/0 for flash

* don't skip flash

* flash

* raise tols

* rm flash support :(

* nits

---------

Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-173-7.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-171-230.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-95.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-214.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-147.ec2.internal>
2025-09-26 16:24:31 +00:00
50d2448a1a Enable fa in amd docker (#41069)
* Add FA to docker

* Use caching mechanism for qwen2_5

* Fix a typo in important models list

* Partial fixes for gemma3

* Added a commit ID for FA repo

* Detailed the expectation storage format

* Rebase fix

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-26 13:57:58 +02:00
10f6891fc5 Remove data from examples (#41168)
Remove telemetry
2025-09-26 13:52:45 +02:00
97ca0b4712 Fix flash-attn for paged_attention when no kernels (#41078)
* Fix non-kernels flash attention paged implementation

* Cover all cases

* Style

* Update src/transformers/integrations/flash_paged.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Apply style fixes

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-26 10:41:21 +02:00
53838edde7 Improve add_dates script (#41167)
* utils/add_dates.py

* put lfm2-vl in correct category
2025-09-25 16:00:05 -04:00
449533af73 Add language specifiers to code blocks of markdown files (#41114)
* Add language specifiers to code blocks of markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update docs/source/en/model_doc/qwen3_omni_moe.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating_writing.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating_writing.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating_writing.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update nemotron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update phimoe.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix syntax error

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-09-25 10:29:57 -07:00
e691f84412 Force new vision models addition to include a fast image processor (#40802)
* add test

* fix test and change cutoff date

* Add documentation to test
2025-09-25 15:58:18 +00:00
e54bb62a73 Simplify and improve model loading logic (#41103)
* remove unexpected keys from inputs (they have nothing to do there)

* remove input

* simplify a lot init

* fix

* fix check for non-persistent buffer

* revert because too many old and bad models...

* remove comment

* type hint

* make it a real test

* remove model_to_load -> always use the same model

* typo

* remove legacy offload_folder (we never waste that memory anymore)

* do not change prefix anymore

* change very bad function name

* create adjust method

* remove useless method

* restrict

* BC

* remove unused method

* CI

* remove unused args

* small fix

* fix

* CI

* CI

* avoid too many loops

* fix regex

* cleaner

* typo

* fix

* fix
2025-09-25 17:28:27 +02:00
6dc9ed87a0 Fix format of compressed_tensors.md (#41155)
* Fix table format

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix format

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-25 14:50:15 +00:00
a579de7f5e Add Parakeet (#39062)
* first commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update to handle masking for bs>1

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Add tests and docs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update model ids

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update docs and improve style

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update librosa location

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* import guard torch too

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* ruff code checks fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* ruff format check

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated to parakeet names

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update script

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Add tokenizer decoding

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Remove other model dependency

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* clean tests

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix tests

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* linting

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix ruff lint warnings

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* move to separate folders

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* add parakeet ctc model code

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* simplify encoder structure

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update documentation

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* add parakeet to toctree

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix tests

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* add parakeet doc

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Address comments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Update featurizer to compute lens directly

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix ruff tests

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix encoding format

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix minor ctc decoding

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* revert modular_model_converter.py changes

* revert check_config_attributes.py changes

* refactor: fastconformer & parakeet_ctc -> parakeet

* modeling update

* test update

* propagate feature extractor updates

* propagate doc changes

* propagate doc changes

* propagate tokenization changes

* propagate conversion changes

* remove fastconformer tests

* remove modular

* update processor

* update processor

* test update

* diverse fixes

* 100% matching greedy batched

* Update conversion script.

* Refactor docs.

* Refactor auto loading.

* Refactor and fix tokenization and processing.

* Update integration test.

* Modeling fixes:
- ensure correct attention mask shape
- ensure layer drop returns valid output
- correct blank token ID when computing CTC loss

* Format and repo consistency.

* Update model doc.

* Fix feature extraction tests.

* Fix (most) tokenizer tests.

* Add pipeline example.

* Fixes

* Use eager_attention_forward from Llama.

* Small tweaks.

* Replace Sequential with ModuleList

* Add check if not all layers copied

* Clean tokenizer.

* Standardize FastSpeech2ConformerConvolutionModule for Parakeet.

* Switch to modular for modeling and processing.

* Add processor tests.

* Fix modeling tests.

* Formating and docstrings.

* Add `return_attention_mask` like other feature extractors.

* clean up after merging main.

* nits on modeling

* configuration update

* nit

* simplification: use PreTrainedTokenizerFast, simplify processor

* add dtype arg to mel_filter_bank

* feature extraction: simplify!

* modeling update

* change to ParakeetTokenizerFast

* correct attention mask handling

* auto update

* proc update

* test update

* feature extraction fixes

* modeling update

* conversion script update

* update feature integration tests

* update tokenization and tests

* processor tests

* revert audio_utils

* config docstring update

* blank_token -> pad_token

* modeling update

* doc update

* fix tests

* fix test

* fix tests

* address review comments

* add comment

* add comment

* explicitly not support flash

* attention straightforward masking

* fix

* tokenizer update: skipping blank tokens by default

* doc update

* fix max_positions_embeddings handling

* nits

* change atol in feature extraction integration tests

* doc update + fix loss

* doc update

* nit

* update integration test for A10

* repo id name

* nit

---------

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Eric B <ebezzam@gmail.com>
2025-09-25 13:52:24 +00:00
1dd22a234c extend gemma3n integration ut cases on XPU (#41071)
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-09-25 13:46:37 +00:00
05fb90c969 Fix single quotes in markdown (#41154)
Fix typos

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-25 13:03:26 +00:00
44682e7131 Adapt and test huggingface_hub v1.0.0 (#40889)
* Adapt and test huggingface_hub v1.0.0.rc0

* forgot to bump hfh

* bump

* code quality

* code quality

* relax dependency table

* fix has_file

* install hfh 1.0.0.rc0 in circle ci jobs

* repository

* push to hub now returns a commit url

* catch HfHubHTTPError

* check commit on branch

* add it back

* fix ?

* remove deprecated test

* uncomment another test

* trigger

* no proxies

* many more small changes

* fix load PIL Image from httpx

* require 1.0.0.rc0

* fix mocked tests

* fix others

* unchange

* unchange

* args

* Update .circleci/config.yml

* Bump to 1.0.0.rc1

* bump kernels version

* fix deps
2025-09-25 11:13:50 +00:00
750dd2a401 Fix: align Qwen2.5-VL inference rope index with training by passing s… (#41153)
Fix: align Qwen2.5-VL inference rope index with training by passing second_per_grid_ts
2025-09-25 10:33:46 +00:00
7258ea44bc Fix loading logic flaw with regards to unexpected and missing keys (#40850)
* Unexpected keys should be ignored at load with device map

* remove them all

* fix logic flaw

* fix

* simplify

* style

* fix

* revert caching allocator change

* add other test

* add nice doc

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-09-24 16:44:42 +02:00
2c4caa19e7 dummy commit (#41133)
* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-24 16:31:46 +02:00
6d1875924c Fixed loading LongT5 from legacy checkpoints (#40724)
* Fixed loading LongT5 from legacy checkpoints

* Adapted the fix to work with missing lm_head
2025-09-24 13:13:18 +01:00
3ca43d34b1 Fixed MXFP4 model storage issue (#41118) 2025-09-24 12:11:51 +00:00
b33cb70097 🚨Refactor: Update text2text generation pipelines to use max_new_tokens… (#40928)
* Refactor: Update text2text generation pipelines to use max_new_tokens and resolve max_length warning

* docs(text2text_generation): update parameter comments to reflect modern generation practice

Update the max_length parameter comment to max_new_tokens, in line with the modern standard practice of specifying the number of newly generated tokens

* refactor(text2text_generation): Remove outdated input validation logic

* docs(text2text_generation): Revert incorrectly modified comment

* docs(text2text_generation): Revert incorrectly modified comment
2025-09-24 11:54:55 +00:00
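The modern call pattern this refactor standardizes on; the model id below is just an example.

```python
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

# max_new_tokens bounds only the newly generated tokens, independent of the
# prompt length, and avoids the legacy max_length warning.
out = generator("Translate to German: Hello, how are you?", max_new_tokens=32)
print(out[0]["generated_text"])
```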
b0c7034d58 Remove self-assignment (#41062)
* Remove self-assignment

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update src/transformers/integrations/flash_paged.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Clear pass

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Clear pass

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Clear pass

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-09-24 12:43:17 +01:00
04a0bb569c Fix broken `` expressions in markdown files (#41113)
Fix broken expressions in markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-24 11:34:12 +00:00
071c7b1423 Fix the error where a keyword argument appearing before *args (#41099)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-24 11:27:37 +00:00
80f20e0ff8 [Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule and torch_recurrent_gated_delta_rule (#40963) (#41036)
* fix mismatched dims for qwen3 next

* propagate changes

* chore: renamed tot_heads to total_sequence_length

* Apply suggestion from @vasqu

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* minor fix to modular qwen3 next file

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-09-24 11:18:27 +00:00
1d81247b0c [torchao safetensors] integrate torchao safetensors support with transformers (#40735)
* enable torchao safetensors

* enable torchao safetensors support

* add more version checking
2025-09-24 12:32:47 +02:00
b533cec74d Support loading LFM2 GGUF (#41111)
* add gguf config mapping for lfm2

* add lfm2 tensor process to unsqueeze conv weights

* adjust values from gguf config to HF config

* add test for lfm2 gguf

* ruff

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-24 10:17:41 +00:00
65dcd66cc8 🚨 [V5] Remove deprecated training arguments (#41017)
* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix comments

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-24 12:01:27 +02:00
43a613c8da Update ruff to 0.13.1 + target Python 3.10 + apply fixes (#37809)
Update ruff to 0.13.1, target Python 3.10, and apply its fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-09-24 06:37:21 +00:00
f64354e89a Format empty lines and white space in markdown files. (#41100)
* Remove additional white space and empty lines from markdown files

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add empty lines around code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-23 16:20:01 -07:00
99b0995138 Remove bad test skips (#41109)
* remove bad skips

* remove more

* fix inits
2025-09-23 20:39:28 +02:00
00f3d90720 Fix _get_test_info for inherited tests (#41106)
* fix _get_test_info

* fix patched

* add comment

* ruff

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-23 19:35:24 +02:00
cfa022e719 [tests] gpt2 + CausalLMModelTester (#41003)
* tmp commit

* tmp commit

* tmp commit

* rm old GPT2ModelTester

* nit bug

* add facilities for encoder-decoder tests; add comments on ALL overwrites/extra fns

* vision_encoder_decoder
2025-09-23 18:07:06 +01:00
869735d37d 🚨 [generate] update paligemma mask updates (and other assisted generation-related fixes) (#40917)
* tmp

* fix modular inheritance

* nit

* paligemma 1 doesn't have swa

* use same pattern as in models with hybrid layers

* PR comments

* helium also needs layer_types (bc it relies on gemma)

* paligemma/gemma3: same mask creation fn in fwd and generate

* propagate changes to helium (gemma-based)

* tmp commit

* slow paligemma tests passing, let's see what breaks

* fix test_left_padding_compatibility

* tmp commit

* tmp commit

* rebase error

* docs

* reduce diff

* like this?

* t5gemma

* better comment

* shorter diff

* exception

* ffs type

* optional

* shorter modular_gemma.py

* helium model actually needs no changes -- the tester is the issue

* t5gemma modular config

* a few more modular; paligemma BC

* fix processor issues?

* rm config exception

* lift warning in gemma
2025-09-23 16:20:00 +00:00
71717ce91c docs: Fix Tool Use links and remove dead RAG links (#41104)
docs: Fix tool use links. Remove dead RAG links. Fix style
2025-09-23 09:18:49 -07:00
946e5f95ea fix wrong height and width when reading video using torchvision (#41091) 2025-09-23 12:35:44 +00:00
870add3daf Remove tf and flax from Chinese documentation (#41057)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-23 11:43:17 +00:00
ae60692821 Remove unused arguments (#40916)
* Fix unused arguments

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-23 11:40:51 +00:00
f682797866 Fix typing (#40788)
* Fix optional typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix optional typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix schema typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typing

* Fix typing

* Fix typing

* Fix typing

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Improve typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix quote string of np.ndarray

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix code

* Format

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-23 11:36:02 +00:00
f4a6c65951 Fix typos in documentation (#41087)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-23 11:27:04 +00:00
89e0f472f4 Remove mention of TensorFlow/Flax/JAX from English documentation (#41058)
Remove mention of TensorFlow from English documentation

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-23 11:14:11 +00:00
62ce6fcb60 Fix argument name in benchmarking script (#41086)
* Fix argument name in benchmarking script

* Adjust vars
2025-09-23 13:05:27 +02:00
257fe5eea8 Switch to python:3.10-slim for CircleCI docker images (#41067)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-23 12:48:48 +02:00
0ec0325781 Minor addition, no split modules for VideoMAEE (#41051)
* added no split modules

* fixed typo

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-09-23 11:53:51 +02:00
577fa6f167 fix crash when using chat to send 2+ requests to gpt-oss (#40536)
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2025-09-23 09:50:23 +00:00
03c92884b5 Update team member list for some CI workflows (#41094)
* update list

* update list

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-23 09:48:40 +00:00
cbb290ec23 Improve documentation and errors in Mamba2-based models (#41063)
* fix bug in Mamba2 docs

* correct 'because on of' issue

* link to other Mamba2 model types

* github URL is not changed

* update error message in generated files
2025-09-22 10:36:20 -07:00
8048c614bf [i18n-bn] Add Bengali language README file (#40935)
* [i18n-bn] Add Bengali language README file and update links in existing language files

* Update Bengali README for clarity and consistency in model descriptions
2025-09-22 09:51:39 -07:00
aa30e0642e Update quantization CI (#41068)
* fix

* new everything

* fix
2025-09-22 18:10:16 +02:00
1bb69cce82 Fix CI jobs being all red 🔴 (false positive) (#41059)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-22 16:51:00 +02:00
f15258dec2 Remove <frameworkcontent> and <pt> tags from documentation (#41055)
* Remove <frameworkcontent> and <pt> tags

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Revert changes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Update docs/source/en/model_doc/madlad-400.md

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-09-22 14:29:50 +00:00
2ec37649e2 Ci utils (#40978)
* Add CI reports dir to gitignore

* Add utils to run local CI

* Review compliance

* Style

* License
2025-09-22 16:16:19 +02:00
b9d337b6f3 Add write token for uploading benchmark results to the Hub (#41047)
* Separate write token for Hub upload

* Address review comments

* Address review comments
2025-09-22 14:13:46 +00:00
646ff51d1a Simplify unnecessary Optional typing (#40839)
Remove Optional

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 12:57:50 +00:00
c9939b3ab6 Remove repeated import (#40937)
* Remove repeated import

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix conflict

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 12:57:13 +00:00
4f36011545 [testing] Fix seed_oss (#41052)
* fix

* fix

* fix

* fix

* fix

* fix

* Update tests/models/seed_oss/test_modeling_seed_oss.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-09-22 14:54:30 +02:00
2b8a7e82b5 Add Whole Word Masking and Padding Strategy to DataCollatorForLanguageModeling (#39485)
* Add whole word masking

* Vectorize whole word masking functions

* Unit test whole word masking

* Remove support for TF in whole word masking
2025-09-22 13:42:34 +01:00
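A minimal usage sketch for the collator change above, assuming the new option is exposed as a `whole_word_mask` flag (the exact parameter name should be checked against the merged signature):

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Assumption: `whole_word_mask=True` is the flag added by this PR; when
# enabled, all sub-word pieces of a selected word are masked together.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
    whole_word_mask=True,
)
batch = collator([tokenizer("Tokenization splits words into pieces.")])
```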
226667ec2f Remove doc of tf and flax (#41029)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 13:42:26 +01:00
6eff44bb8d Fix outdated torch version check (#40925)
Update torch minimum version check to 2.2

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 12:38:07 +00:00
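For reference, a standalone sketch of the kind of minimum-version gate this bumps to 2.2 (transformers uses its own helpers in `transformers.utils`; the `packaging` form below is an illustrative equivalent):

```python
import torch
from packaging import version

# Illustrative minimum-version gate, now requiring torch >= 2.2.
if version.parse(torch.__version__) < version.parse("2.2"):
    raise ImportError(f"torch >= 2.2 is required, found {torch.__version__}")
```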
9ff47a71e4 Fix condition for emitting warning when generation exceeds max model length (#40775)
correct warning when generation exceeds max model length

Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
2025-09-22 12:21:38 +00:00
ae9ef2e151 docs: improved RoPE function Docstrings (#41004)
* docs: improved RoPE function docstrings

* Update src/transformers/modeling_rope_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-09-22 13:21:15 +01:00
f3c481ed87 Use torch.autocast (#40975)
* Use torch.autocast

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Format code

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 12:18:24 +00:00
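The change above moves to the device-agnostic context manager; a minimal sketch of the pattern:

```python
import torch

model = torch.nn.Linear(16, 16)
x = torch.randn(2, 16)

# Old, CUDA-only spelling: with torch.cuda.amp.autocast(dtype=torch.float16)
# New, device-agnostic spelling ("cuda", "cpu", "mps", ...):
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)
```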
37152f8446 Fix typos in English/Chinese documentation (#41031)
* Fix typos and formatting in English docs

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix typos and formatting in Chinese docs

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 11:31:46 +00:00
8a52288dba Remove optax (#41030)
Remove optax dep

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 11:30:39 +00:00
5f891b36cd Fix typing of tuples (#41028)
* Fix tuple typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-22 11:29:07 +00:00
c05f9d2f0e [testing] Fix qwen2_audio (#41018)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-22 10:45:31 +00:00
55a1eaf6f0 Fix Qwen video tests (#41049)
fix test
2025-09-22 12:28:11 +02:00
db802aafa4 Modify Qwen3Omni parameter name since VL changed it (#41045)
Modify parameter name since VL changed it

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
2025-09-22 10:06:59 +00:00
8a2f24a321 Making compute_loss_func always take priority in Trainer (#40632)
* logger warn, if-else logic improved

* redundant if condition fix
2025-09-22 09:47:34 +00:00
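A hedged sketch of the new behavior: a user-supplied `compute_loss_func` now wins even when the model computes its own loss (the checkpoint name is only an example):

```python
import torch
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

def my_loss(outputs, labels, num_items_in_batch=None):
    # After this change, the custom function always takes priority,
    # even when the model already returned `outputs.loss` internally.
    return torch.nn.functional.cross_entropy(outputs.logits, labels)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    compute_loss_func=my_loss,
)
```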
ebbcf00ad1 Adding support for Qwen3Omni (#41025)
* Add Qwen3Omni

* make fix-copies, import properly

* nit

* fix wrong setup. Why was audio_token_id renamed?

* upds

* more processing fixes

* yup

* fix more generation tests

* down to 1?

* fix import issue

* style, update check repo

* up

* fix quality at my best

* final quality?

* fix doc building

* FINAL COMMIT: SKIP IMPORTANT BUT FAILING TESTS FOR MERGE

* SKIP THE TEMPLATE ONE

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
2025-09-21 23:46:27 +02:00
67097bf340 Fix benchmark runner argument name (#41012) 2025-09-20 10:53:56 +02:00
8076e755e5 Update after #41007 (#41014)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-19 21:55:46 +02:00
022c882e14 Fix Glm4v test (#41011)
fix
2025-09-19 18:54:26 +02:00
966b3dbcbe Fix PhimoeIntegrationTest (#41007)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-19 16:43:46 +00:00
04bf4112f2 🚨 [lightglue] fix: matches order changed because of early stopped indices (#40859)
* fix: bug that made early stop change order of matches

* fix: applied code suggestion

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix: applied code suggestion to modular

* fix: integration tests

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-09-19 16:41:22 +01:00
dfc230389c 🚨 [v5] remove deprecated entry point (#40997)
* remove old entry point

* update references to transformers-cli
2025-09-19 14:40:27 +00:00
8010f5d1d9 Patch more unittest.case.TestCase.assertXXX methods (#41008)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-19 16:38:12 +02:00
5bf633b32a [tests] update test_left_padding_compatibility (and minimize overwrites) (#40980)
* update test (and overwrites)

* better test comment

* 0 as a default for
2025-09-19 15:36:26 +01:00
df12617914 🚨 [v5] remove generate output retrocompatibility aliases (#40998)
remove old type aliases
2025-09-19 14:36:12 +00:00
2a538b2ed4 fix dict like init for ModelOutput (#41002)
* fix dict like init

* style
2025-09-19 16:14:44 +02:00
96a3e898cd RUFF fix on CI scripts (#40805)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-19 13:50:26 +00:00
98c8523434 Fix more dates in model cards and wrong modalities in _toctree.yml (#40955)
* Fix model cards and modalities in toctree

* fix new models
2025-09-19 09:47:28 -04:00
767f8a4c75 Fix typos in src and tests (#40845)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-19 13:18:38 +00:00
9d9c4d24c5 Make EfficientLoFTRModelTest faster (#41000)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-19 12:51:05 +00:00
b4ba4e1da0 [RMSNorm] Fix rms norm init for models that center around 1 (#40796)
* fix

* fixup inits

* oops

* fixup gemma

* fixup modular order

* how does this keep happening lol

* vaultgemma is new i forgot

* remove init check
2025-09-19 12:15:36 +00:00
fce746512b [docs] rm stray tf/flax autodocs references (#40999)
rm tf references
2025-09-19 12:04:12 +01:00
ddfa3d4402 blt wip (#38579)
* blt wip

* cpu version

* cpu friendly with full entropy model (real time patching)

* adding config file instead of args file

* enable MPS

* refactoring unused code

* single config class in config file

* inherit from PreTrainedModel

* refactor LMTransformer --> BLTPatcher

* add conversion script

* load from new checkpoint with from_pretrained

* fixed demo from_pretrained

* clean up

* clean a few comments

* cleanup folder

* clean up dir

* cleaned up modeling further

* rename classes

* adding transformers Attention class and RotaryEmbedding class

* exchanged blt modules for transformers modules: attention, rotary_emb, create_causal_mask, etc

* separate out patcher config, update modeling and conversion script

* rename vars to be more transformers-like

* rm unused functions

* adding cross attention from transformers

* pass arg

* rename weights

* updated conversion script

* overwritten commit! fixing PR

* apply feedback

* adding BLTRMSNorm like Llama

* add repeat_kv and eager_attention_forward copied from

* BLTMLP identical to MllamaTextMLP

* clean up some args

* more like mllama, but busier inits

* BLTTransformerLayer config

* decoder, encoder, global configs

* wip working on modular file

* cleaning up patch and configs

* clean up patcher helpers

* clean up patcher helpers further

* clean up

* some config renaming

* clean up unused configs

* clean up configs

* clean up configs

* update modular

* clean

* update demo

* config more like mllama, separated subconfigs from subdicts

* read from config instead of self args

* update demo file

* model weights to causal lm weights

* missed file

* added tied weights keys

* BLTForCausalLM

* adding files after add-new-model-like

* update demo

* working on tests

* first running integration tests

* added integration tests

* adding tokenization tests, integration tests, and cleaned up tokenization file, + ruff

* tokenizer clean up

* modular file

* fixing rebase

* ruff

* adding correct basemodel output and updating config with checkpoint vals (for testing)

* BLTModelTests git status

* enabling inputs_embeds, although it won't be equal to input_ids since ids are needed for the patching logic

* fix sdpa == causal tests

* fix small model test and some gradient checkpointing

* skip training GC tests

* fix test

* updated modular

* update modular

* ruff

* adding modular + modeling

* modular

* more modern is_causal check

* cleaning up modular

* more modular reduction

* ruff

* modular fix

* fix styling

* return 2

* return 2

* fix some tests

* fix bltcrossattention after modular break

* some fixes / feedback

* try cache generate fix

* try cache generate fix

* fix generate tests

* attn_impl workaround

* refactoring to use recent TransformersKwargs changes

* fix hidden_states shape test

* refactor to new outputs

* simplify outputs a bit

* rm unneeded decoderlayer overwriting

* rename blt

* forgot tokenizer test renamed

* Reorder

* Reorder

* working on modular

* updates from modular

* new modular

* ruff and such

* update pretrainedmodel modular

* using cohere2 apply_rotary_pos_emb

* small changes

* apply feedback r2

* fix cross_attention

* apply more feedback

* update modeling fix

* load submodules from pretrainedmodel

* set initializer_range to subconfigs

* rm cross_attention_states pass when not needed

* add 7b projection layer support

* check repo

* make copies

* lost cohere2 rotate_half

* ruff

* copies?

* don't tie weights for submodules

* tie weights setting

* check docstrings

* apply feedback

* rebase

* rebased modeling

* update docs

* applying feedback

* few more fixes

* fix can_record_outputs

* fast tokenizer

* no more modulelist

* tok auto

* rm tokenizersss

* fix docs

* ruff

* fix after rebase

* fix test, configs are not subscriptable

---------

Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-30.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-103.ec2.internal>
Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-36.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-45.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-173-121.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-103.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-178.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-79.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-169-239.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-111.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-100.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-153.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-15.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-131.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-138.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-215.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-142.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-147.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-0.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-58.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-202.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-244.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-186.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-192.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-14.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-171-249.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-75.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-78.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-134.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-180.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-175-241.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-225.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-9.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-34.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-68.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-175.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-170-160.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-95.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-73.ec2.internal>
2025-09-19 11:55:55 +02:00
46ea7e613d [testing] test num_hidden_layers being small in model tester (#40992)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-19 11:45:07 +02:00
ebdc17b8e5 ENH: Enable readline support for transformers chat (#40911)
ENH Enable readline support for chat

This small change enables GNU readline support for the transformers chat
command. This includes, among others:

- advanced navigation and editing: ctrl + a, ctrl + e, alt + b, alt + f,
  ctrl + k, alt + d, etc.
- navigate and search history: arrow up/down, ctrl + p, ctrl + n, ctrl + r
- undo: ctrl + _
- clear screen: ctrl + l

Implementation

Although it may look strange, just importing readline is enough to
enable it in Python, see:

https://docs.python.org/3/library/functions.html#input

As readline is not available on some
platforms (https://docs.python.org/3/library/readline.html), the import
is guarded.

Readline should work on Linux, macOS, and with WSL; I'm not sure about
Windows, though. Ideally, someone can give it a try. It's possible that
Windows users would have to install
pyreadline (https://pypi.org/project/pyreadline3/).
2025-09-19 10:39:21 +01:00
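A minimal sketch of the guarded import described above:

```python
# Importing readline is enough to give input() history and line editing;
# the import is guarded because readline is missing on some platforms.
try:
    import readline  # noqa: F401
except ImportError:
    pass  # fall back to plain input() without history or editing

while (line := input(">>> ")) != "exit":
    print(f"you said: {line}")
```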
e2dbde280f Remove [[autodoc]] refs to TF/Flax objects (#40996)
* remove refs

* more
2025-09-19 11:28:34 +02:00
155f7e2e62 🔴[Attention] Bert-based Models Attention Refactor (#38301)
* clean start to bert refactor

* some test fixes

* style

* fix last tests

* be strict on positional embeddings, fixup according tests

* cache support

* more cache fixes, new causal API

* simplify masks, fix tests for gen

* flex attn, static cache support, round of fixes

* ?

* this time

* style

* fix flash attention tests, flex attention requires torch 2.7.x to work with multiple classes (as recompile strats force a size call which is wrongly interpreted before)

* roberta

* fixup sdpa remains

* attention split, simplify args and kwargs, better typing

* fix encoder decoder

* fix test

* modular roberta

* albert

* data2vectext, making it modular tomorrow

* modular data2vec text

* tmp disable

* xmod + cache position fixes

* whoops

* electra + markuplm, small fixes

* remove wrong copy

* xlm_roberta + some embedding fixes

* roberta prelayernorm

* RemBert: remove copy, maybe doing it later

* ernie

* fix roberta offloading

* camembert

* copy fixes

* bert generation + fixes on eager

* xlm roberta xl

* bridgetower (text) + seamlessv2 copy fixes

* rocbert + small fixes

* whoops

* small round of fixups

* NOTE: kernels didn't load with an earlier version, some fixup (needs another look because of cross deps)

* the end of the tunnel?

* fixup nllbmoe + style

* we dont need this anymore

* megatron bert is barely used, low prio skip for now

* Modernize bert (template for others)

NOTE: trying to push this through; it might be overdue if it can't land in time

* check inputs for all others (if checkmarked)

* fix bridgetower

* style

* fix encoder decoder (partially but cause found and fix also, just needs to be done for everything else)

* proper fix for bert to force intermediate dict outputs

* propagate to others

* style

* xlm roberta xl investigation, it's the layernorm...

* mobile bert

* revert this, might cause issues with composed models

* review

* style
2025-09-19 11:23:58 +02:00
61eff450d3 Benchmarking v2 GH workflows (#40716)
* WIP benchmark v2 workflow

* Container was missing

* Change to sandbox branch name

* Wrong place for image name

* Variable declarations

* Remove references to file logging

* Remove unnecessary step

* Fix deps install

* Syntax

* Add workdir

* Add upload feature

* typo

* No need for hf_transfer

* Pass in runner

* Runner config

* Runner config

* Runner config

* Runner config

* Runner config

* mi325 caller

* Name workflow runs properly

* Copy-paste error

* Add final repo IDs and schedule

* Review comments

* Remove wf params

* Remove parametrization from workflow files

* Fix callers

* Change push trigger to pull_request + label

* Add back schedule event

* Push to the same dataset

* Simplify parameter description
2025-09-19 08:54:49 +00:00
5f6e278a51 Remove set_model_tester_for_less_flaky_tests (#40982)
remove
2025-09-18 18:56:10 +02:00
4df2529d79 🚨🚨🚨 Fully remove Tensorflow and Jax support library-wide (#40760)
* setup

* start the purge

* continue the purge

* more and more

* more

* continue the quest: remove loading tf/jax checkpoints

* style

* fix configs

* oups forgot conflict

* continue

* still grinding

* always more

* in the zone

* never stop

* should fix doc

* fix

* fix

* fix

* fix tests

* still tests

* fix non-deterministic

* style

* remove last rebase issues

* onnx configs

* still on the grind

* always more references

* nearly the end

* could it really be the end?

* small fix

* add converters back

* post rebase

* latest qwen

* add back all converters

* explicitly add functions in converters

* re-add
2025-09-18 18:27:39 +02:00
5ac3c5171a Track the CI (model) jobs that don't produce test output files (process being killed etc.) (#40981)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-18 18:27:27 +02:00
d9d7f6a6b9 Revert change in compile_friendly_resize (#40645)
fix
2025-09-18 16:25:45 +01:00
738b223f57 Add captured actual outputs to CI artifacts (#40965)
* fix

* fix

* Remove `# TODO: ???` as it makes me `???`

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-18 15:40:53 +02:00
dd7ac4cd59 [tests] Really use small models in all fast tests (#40945)
* start

* xcodec

* chameleon

* start

* layoutlm2

* layoutlm

* remove skip

* oups

* timm_wrapper

* add default

* doc

* consistency
2025-09-18 15:24:12 +02:00
2ce35a248f Fix Issue #39030: AutoTokenizer.from_pretrained does not propagate token (#40956)
* fix merge conflicts

* change token typing

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-27-253.ec2.internal>
2025-09-18 13:22:19 +00:00
6e51ac31ef [timm_wrapper] better handling of "Unknown model" exception in timm (#40951)
* fix(timm): Add exception handling for unknown Gemma3n model

* nit: Let’s cater to this specific issue

* nit: Simplify error handling
2025-09-18 14:09:08 +01:00
9378f874c1 [Trainer] Fix DP loss (#40799)
* fix

* style

* Fix fp16

* style

---------

Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
2025-09-18 13:07:20 +00:00
7cf1f5ced0 Use skip_predictor=True in vjepa2 get_vision_features (#40966)
use skip_predictor in vjepa2 `get_vision_features`
2025-09-18 11:51:45 +00:00
f6104189fd Fix outdated version checks of accelerator (#40969)
* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-18 11:49:14 +00:00
c532575795 Add new model LFM2-VL (#40624)
* Add LFM2-VL support

* add tests

* linting, formatting, misc review changes

* add siglip2 to auto config and instantiate it in lfm2-vl configuration

* decouple image processor from processor

* remove torch import from configuration

* replace | with Optional

* remove layer truncation from modeling file

* fix copies

* update everything

* fix test case to use tiny model

* update the test cases

* fix finally the image processor and add slow tests

* fixup

* typo in docs

* fix tests

* the doc name uses underscore

* address comments from Yoni

* delete tests and unshuffling

* relative import

* do we really handle imports better now?

* fix test

* slow tests

* found a bug in ordering + slow tests

* fix copies

* dont run compile test

---------

Co-authored-by: Anna <anna@liquid.ai>
Co-authored-by: Anna Banaszak <48625325+ankke@users.noreply.github.com>
2025-09-18 11:01:58 +00:00
564fde14f1 FIX(trainer): ensure final checkpoint is saved when resuming training (#40347)
* fix(trainer): ensure final checkpoint is saved when resuming training

* add test

* make style && slight fix of test

* make style again

* move test code to test_trainer

* remove outdated test file

* Apply style fixes

---------

Co-authored-by: rangehow <rangehow@foxmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-18 09:57:21 +00:00
5748352c27 Update expected values for one more test_speculative_generation after #40949 (#40967)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-18 11:47:14 +02:00
438343d93f Don't list dropout in eager_paged_attention_forward (#40924)
Remove dropout argument

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-18 09:05:50 +00:00
449da6bb30 Add FlexOlmo model (#40921)
* transformers add-new-model-like

* Add FlexOlmo implementation

* Update FlexOlmo docs

* Set default tokenization for flex olmo

* Update FlexOlmo tests

* Update attention comment

* Remove unneeded use of `sliding_window`
2025-09-18 09:04:06 +00:00
3bb1b4867c Standardize audio embedding function name for audio multimodal models (#40919)
* Standardize audio embedding function name for audio multimodal models

* PR review
2025-09-18 08:45:04 +00:00
58e13b9f12 Update expected values for some test_speculative_generation (#40949)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-17 20:50:38 +02:00
529d3a2b06 Fix Glm4vModelTest::test_eager_matches_fa2_generate (#40947)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-17 19:53:59 +02:00
a2ac4de8b0 Remove nested import logic for torchvision (#40940)
* remove nested import logic for torchvision

* remove unnecessary protected imports

* remove unnecessarry protected import in modular (and modeling)

* fix wrongly remove protected imports
2025-09-17 13:34:30 -04:00
8e837f6ae2 Consistent naming for images kwargs (#40834)
* use consistent naming for padding

* no validation on pad size

* add warnings

* fix

* fix copies

* another fix

* fix some tests

* fix more tests

* fix lasts tests

* fix copies

* better docstring

* delete print
2025-09-17 18:40:25 +02:00
eb04363a0d Raise error instead of warning when using meta device in from_pretrained (#40942)
* raise instead of warning

* add timm

* remove
2025-09-17 18:23:37 +02:00
ecc1d778ce Fix Glm4vMoeIntegrationTest (#40930)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-17 18:21:18 +02:00
c5553b4120 Fix trainer tests (#40823)
* fix liger

* fix

* more

* fix

* fix hp

* fix

---------

Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
2025-09-17 16:05:17 +00:00
14f01aee39 docs(i18n): Correct the descriptive text in the README_zh-hans.md (#40941) 2025-09-17 08:48:38 -07:00
26b65fb516 Intel CPU dockerfile (#40806)
* upload intel cpu dockerfile

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update cpu dockerfile

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update label name

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-09-17 15:42:30 +00:00
66f97d3f64 [models] remove unused import torch.utils.checkpoint (#40934) 2025-09-17 16:37:56 +01:00
3853bfe4d5 [DOC] Add missing dates in model cards (#40922)
add missing dates
2025-09-17 11:17:06 -04:00
6cade29278 Add LongCat-Flash (#40730)
* working draft for LongCat

* BC changes to deepseek_v3 for modular

* format

* various modularities

* better tp plan

* better init

* minor changes

* make modular better

* clean up patterns

* Revert a couple of modular commits, because we won't convert in the end

* make things explicit.

* draft test

* toctree, tests and imports

* drop

* woops

* make better things

* update test

* update

* fixes

* style and CI

* convert stuff

* up

* ah, yes, that

* enable gen tests

* fix cache shape in test (sum of 2 things)

* fix tests

* comments

* re-Identitise

* minimize changes

* better defaults

* modular betterment

* fix configuration, add documentation

* fix init

* add integration tests

* add info

* simplify

* update slow tests

* fix

* style

* some additional long tests

* cpu-only long test

* fix last tests?

* urg

* cleaner tests why not

* fix

* improve slow tests, no skip

* style

* don't upcast

* one skip

* finally fix parallelism
2025-09-17 14:48:10 +02:00
48a5565179 Add support for Florence-2 training (#40914)
* Support training florence2

* update doc and testing model to florence-community

* fix florence-2 test, use head dim 16 instead of 8 for fa2

* skip test_sdpa_can_dispatch_on_flash

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-17 11:49:56 +00:00
89949c5d2d Minor fix for #40727 (#40929)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-17 11:42:13 +02:00
c830fc1207 Adding activation kernels (#40890)
* first commit

* add mode

* revert modeling

* add compile

* rm print
2025-09-17 11:36:09 +02:00
f6999b00c3 [torchao safetensors] renaming get_state_dict function (#40774)
renaming get_state_dict function

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-09-17 11:20:50 +02:00
8428c7b9c8 Fix #40067: Add dedicated UMT5 support to GGUF loader (config, tokenizer, test) (#40218)
* Fix #40067 : add UMT5 support in GGUF loader (config, tokenizer, test)

* chore: fix code formatting and linting issues

* refactor: move UMT5 GGUF test to quantization directory and clean up comments

* chore: trigger CI pipeline

* refactor(tests): Move UMT5 Encoder GGUF test to GgufModelTests. This consolidates the new test into the main class for consistency.

* Add regression check to UMT5 encoder GGUF test

Verify encoder output against reference tensor values with appropriate tolerances for stability.

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update tests/quantization/ggml/test_ggml.py

remove comments

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-09-17 09:15:55 +00:00
ddd4caf066 [Llama4] Remove image_sizes arg and deprecate vision_feature_layer (#40832)
* Remove unused arg

* deprecate

* revrt one change

* get set go

* version correction

* fix

* make style

* comment
2025-09-17 09:14:13 +00:00
b82cd1c240 Processor load with multi-processing (#40786)
push
2025-09-17 09:46:49 +02:00
6e50a8afb2 [Docs] Adding documentation of MXFP4 Quantization (#40885)
* adding mxfp4 quantization docs

* review suggestions

* Apply suggestions from code review

Co-authored-by: vb <vaibhavs10@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: vb <vaibhavs10@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-09-16 11:31:28 -07:00
cccef4be91 Fix dtype in Paligemma (#40912)
* fix dtypes

* fix copies

* delete unused attr
2025-09-16 16:07:56 +00:00
beb09cbd5a 🔴Make center_crop fast equivalent to slow (#40856)
make center_crop fast equivalent to slow
2025-09-16 16:01:38 +00:00
d4af0d9f03 [generate] misc fixes (#40906)
misc fixes
2025-09-16 15:18:06 +01:00
3b3f6cd0c1 [gemma3] Gemma3ForConditionalGeneration compatible with assisted generation (#40791)
* gemma3vision compatible with assisted generation

* docstring

* BC

* docstring

* failing checks

* make fixup

* apply changes to modular

* misc fixes

* is_initialized

* fix poor rebase
2025-09-16 15:08:48 +01:00
88ba0f107e disable test_fast_is_faster_than_slow (#40909)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-16 15:34:04 +02:00
270da89708 Remove runner_map (#40880)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-16 15:18:07 +02:00
df03fc1f9c Improve module name handling for local custom code (#40809)
* Improve module name handling for local custom code

* Use `%lazy` in logging messages

* Revert "Use `%lazy` in logging messages"

This reverts commit 5848755d5805e67177c5218f351c0ac852df9340.

* Add notes for sanitization rule in docstring

* Remove too many underscores

* Update src/transformers/dynamic_module_utils.py

* Update src/transformers/dynamic_module_utils.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-09-16 13:11:48 +00:00
96bc19bcdf remove dummy EncodingFast (#40864)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-16 12:56:11 +00:00
d0af4269ec Add Olmo3 model (#40778)
* transformers add-new-model-like for Olmo3

* Implement modular Olmo3

* Update Olmo3 tests

* Copy Olmo2 weight converter to Olmo3

* Implement Olmo3 weight converter

* Fix code quality errors

* Remove unused import

* Address rope-related PR comments

* Update Olmo3 model doc with minimal details

* Fix Olmo3 rope test failure

* Fix 7B integration test
2025-09-16 13:28:23 +02:00
65f9ede359 Set seed for Glm4vIntegrationTest (#40905)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-16 13:01:51 +02:00
0c1839d609 [cache] Only use scalars in get_mask_sizes (#40907)
* remove tensor ops

* style

* style
2025-09-16 12:48:58 +02:00
3688a977d0 Harmonize CacheLayer names (#40892)
* unify naming

* style

* doc as well

* post rebase fix

* style

* style

* revert
2025-09-16 12:14:12 +02:00
087775d10e [cache] Merge static sliding and static chunked layer (#40893)
* merge

* get rid of tensors in get_mask_sizes!!

* remove branch

* add comment explanation

* re-add the class with deprecation cycle
2025-09-16 11:41:20 +02:00
1aff033ec9 Fix flaky Gemma3nAudioFeatureExtractionTest::test_dither (#40902)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-16 11:00:07 +02:00
65adc3aaa3 Fix getter regression (#40824)
* test things

* style

* move tests to a sane place
2025-09-16 10:57:13 +02:00
8e1a12bbee Fixing the call to kernelize (#40628)
* fix

* style

* overload train and eval

* add getter and setter
2025-09-16 10:50:54 +02:00
21c8379fb0 Make debugging failing tests (check and update expect output values) easier 🔥 (#40727)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-16 10:21:48 +02:00
5af248b3e3 [generate] remove docs of a feature that no longer exists (#40895) 2025-09-15 19:22:31 +01:00
20ee3a73f0 🌐 [i18n-KO] Translated imageprocessor.md to Korean (#39557)
* feat: manual translation

* docs: fix ko/_toctree.yml

* Apply suggestions from code review

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/image_processors.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-09-15 10:07:16 -07:00
2141a5b764 🌐 [i18n-KO] Translated smolvlm.md to Korean (#40414)
* fix: manual edits

* Apply suggestions from code review

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-09-15 10:06:57 -07:00
2a83792165 Remove dict branch of attention_mask in sdpa_attention_paged_forward (#40882)
Remove dict branch of attention_mask

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-15 17:38:13 +02:00
04d1c8f3d4 Fix deta loading & dataclass (#40878)
* fix

* fix 2
2025-09-15 17:23:13 +02:00
ff26fe8302 Add Fast PromptDepthAnything Processor (#40602)
* Test & import setup

* First version passing tests

* Ruff

* Dummy post processing

* Add numerical test

* Adjust

* Doc

* Ruff

* remove unused arg

* Refine interpolation method and push test script

* update bench

* Comments

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Remove benchmark script

* Update docstrings

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* doc

* further process kwargs

* remove it

* remove

* Remove to dict

* remove crop middle

* Remove param specific handling

* Update testing logic

* remove ensure multiple of as kwargs

* fix formatting

* Remove none default and get image size

* Move stuff to _preprocess_image_like_inputs and refacto

* Clean

* ruff

* End of file & comments

* ruff again

* Padding fixed

* Remove comments to pass tests

* Remove prompt depth from kwargs

* Adjust output_size logic

* Docstring for preprocess

* auto_docstring for preprocess

* pass as an arg

* update test batched

* stack images

* remove prompt scale to meter

* return tensors back in preprocess

* remove copying of images

* Update behavior to match old processor

* Fix batch size of tests

* fix test and fast

* Fix slow processor

* Put tests back to pytorch

* remove check and modify batched tests

* test do_pad + slow processor fix

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-09-15 15:03:43 +00:00
6254bb4a68 Use torch.expm1 and torch.log1p for better numerical results (#40860)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-15 11:54:14 +00:00
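A small illustration of why the fused ops matter near zero:

```python
import torch

x = torch.tensor([1e-8], dtype=torch.float32)

# Naive forms round 1 + x to exactly 1.0 in float32, losing the signal:
print(torch.exp(x) - 1)  # tensor([0.])
print(torch.log(1 + x))  # tensor([0.])

# The fused ops stay accurate for small inputs:
print(torch.expm1(x))    # tensor([1.0000e-08])
print(torch.log1p(x))    # tensor([1.0000e-08])
```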
e674e9dadb Clarify passing is_causal in sdpa_attention_paged_forward (#40838)
* Correctly pass is_causal in sdpa_attention_paged_forward

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Improve typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add comment

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Improve comments

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Revert typing

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-15 11:51:22 +00:00
0957999f7f 🔴 Move variable output controls to _prepare_generation_config (#40715)
* move checks to validate steps where possible

* fix csm and other models that override _sample

* ops dia you again

* opsie

* joao review

* Move variable output controls to `prepare_inputs_for_generation`

* fix a bunch of models

* back to basics

* final touches
2025-09-15 11:08:00 +00:00
5e9ec59d0c Fix modular consistency (#40883)
* reapply modular

* add missing one
2025-09-15 13:07:08 +02:00
3442b2f300 [VaultGemma] Update expectations in integration tests (#40855)
* fix tests

* style
2025-09-15 12:46:30 +02:00
c0dbe095b0 Adding Support for Qwen3-VL Series (#40795)
* add qwen3vl series

* make fixup

* fix import

* re-protect import

* fix it finally (need to merge main into the branch)

* skip processor test (need the checkpoint)

* oups typo

* simplify modular

* remove unnecessary attr

* fix layer

* remove unused rope_deltas args

* reuse image def

* remove unnecessary imports

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-09-15 12:46:18 +02:00
fc5f9105da [Qwen3 Next] Use numerically stable rsqrt (#40848)
use numerically stable inverse
2025-09-15 12:45:13 +02:00
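A sketch of the pattern the title suggests, assuming an RMS-style normalization (the actual Qwen3-Next code may differ in detail):

```python
import torch

def l2norm(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # torch.rsqrt is a single fused op; 1.0 / torch.sqrt(...) adds a
    # separate division on an already-rounded intermediate value.
    return x * torch.rsqrt(x.pow(2).sum(dim=-1, keepdim=True) + eps)
```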
96d3795cfc Update model tags and integration references in bug report (#40881) 2025-09-15 12:08:29 +02:00
f5e1641857 fix: XIELU act parameters not being casted to correct dtype (#40812) 2025-09-15 11:05:55 +02:00
ada64ce452 fix florence kwargs (#40826) 2025-09-15 11:05:47 +02:00
93f810e6fa [docstrings / type hints] Update outdated annotations for past_key_values (#40803)
* some fixes

* nits

* indentation

* indentation

* a bunch of type hints

* bulk changes
2025-09-15 10:52:32 +02:00
c65fea0b92 [Bug fix #40813] Fix base_model_tp_plan of Starcoder2 model. (#40814)
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
2025-09-15 10:46:32 +02:00
9c804f7ec4 Redirect MI355 CI results to dummy dataset (#40862) 2025-09-14 18:42:49 +02:00
02ea2b3433 Fix TrainingArguments.parallelism_config NameError with accelerate<1.10.1 (#40818)
Fix ParallelismConfig type for accelerate < 1.10.1

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-14 15:35:42 +00:00
d42e96a2a7 Use checkpoint in auto_class_docstring (#40844)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-13 00:49:19 +00:00
6eb3255842 [generate] Always use decoder config to init cache (#40772)
* mega derp

* fix

* always use the decoder
2025-09-12 18:24:22 +02:00
e682f90f60 [tests] move generative tests away from test_modeling_common.py (#40854)
move tests
2025-09-12 16:12:27 +00:00
8d8459132a [test] Fix test_eager_matches_sdpa incorrectly skipped (#40852)
* output_attentions in typed kwargs

* correct typing in GenericForTokenClassification

* improve
2025-09-12 18:07:48 +02:00
291772b6b5 add: differential privacy research model (#40851)
* VaultGemma

* Removing Sequence and Token classification models. Removing integration tests for now

* Remove pass-only modular code. style fixes

* Update vaultgemma.md

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Add links to model doc

* Correct model doc usage examples

* Updating model doc to describe differences from Gemma 2

* Update model_doc links

* Adding integration tests

* style fixes

* repo consistency

* attribute exception

---------

Co-authored-by: Amer <amersinha@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-09-12 17:36:03 +02:00
8502b41bf1 [Sam2Video] Fix video inference with batched boxes and add test (#40797)
fix video inference with batched boxes and add test
2025-09-12 14:33:28 +00:00
f384bb8ad5 [SAM2] Fix inconsistent results with original implementation with input boxes (#40800)
* Fix inconsistencies with box input inference with original repo

* remove print

* always pad

* fix modular
2025-09-12 14:21:22 +00:00
4cb41ad2a2 [tests] re-enable aria fast tests (#40846)
* rise from the dead

* test
2025-09-12 15:14:54 +01:00
ef053939ca Fixes for continuous batching (#40828)
* Fix for CB attn mask and refactor

* Tests for CB (not all passing)

* Passing tests and a logger fix

* Fixed the KV metrics that were broken when we moved to hybrid alloc

* Fix circular import and style

* Added tests for FA

* Unfolded test to have device expectations

* Fixes for H100

* more fixes for h100

* H100 are good

* Style

* Adding some comments from #40831

* Rename test

* Avoid 1 letter variables

* Dictionary is only removed during kwargs

* Test for supported sample

* Fix an involuntary slice

* Fixes for non-sliced inputs and small example improvements

* Slicing inputs is more understandable

* Style
2025-09-12 15:35:31 +02:00
98a8078127 Fix the misalignment between the l2norm in GDN of Qwen3-Next and the implementation in the FLA library. (#40842)
* align torch implementation of gdn with fla.

* fix fla import.

* fix

* remove unused attr

* fixes

* strictly align l2norm in Qwen3-Next with FLA implementation.

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-09-12 14:08:01 +02:00
77aa35ee9c Replace image classification loss functions to self.loss_function (#40764) 2025-09-12 12:59:37 +01:00
797859c9b8 Update no split modules in T5Gemma model (#40810)
* Update no split modules in T5Gemma model

* Update no_split_modules also for T5Gemma modular

* Remove model_split_percents from test cases

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-09-12 10:44:57 +00:00
6e69b60806 Adds Causal Conv 1D kernel for mamba models (#40765)
* add kernel

* make style

* keep causal-conv1d

* small fix

* small fix

* fix modular converter

* modular fix + lazy loading

* revert changes modular

* nit

* hub kernels update

* update

* small nit
2025-09-12 12:22:25 +02:00
827b65c42c Add VideoProcessors to auto-backend requirements (#40843)
* add it

* fix existing ones

* add perception to auto_mapping...
2025-09-12 12:21:12 +02:00
5e2e77fb45 Improve torch_dtype checks (#40808)
* Improve torch_dtype checks

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Apply suggestions from code review

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-09-12 09:57:59 +00:00
c81f426f9a 🌐 [i18n-KO] Translated clipseg.md to Korean (#39903)
* docs: ko: model_doc/clipseg.md

* fix: manual edits

* Apply suggestions from code review

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>

---------

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>
2025-09-11 17:07:24 -07:00
cf084f5b40 [Jetmoe] Fix RoPE (#40819)
* fix

* remove prints

* why was this there...
2025-09-11 18:41:11 +02:00
dfae7dd98d Push generation config along with checkpoints (#40804) 2025-09-11 17:33:16 +02:00
c264c0ee7e add general hub test for Fast Image Processors in test_image_processing_utils (#40086)
* build unittest for ViTImageProcessorFast

* remove redundant test case

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-09-11 14:31:37 +00:00
895b3ebe41 Fix typos in src (#40782)
Fix typos in src

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-11 13:15:15 +01:00
6d369124ad Align torch implementation of Gated DeltaNet in Qwen3-Next with fla library. (#40807)
* align torch implementation of gdn with fla.

* fix fla import.

* fix

* remove unused attr

* fixes

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-09-11 13:10:15 +02:00
0f1b128d33 ⚠️ 🔴 Add ministral model (#40247)
* add ministral model

* docs, tests

* nits

* fix tests

* run modular after merge

* opsie

* integration tests

* again

* fff

* dtype

* rerun modular

* arthur review

* ops

* review
2025-09-11 10:30:39 +02:00
02f1d7c091 Fix config dtype parsing for Emu3 edge case (#40766)
* fix emu3 config

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* address comment

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* add comments

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

---------

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-11 08:26:45 +00:00
de01a22aff Fix edge case for tokenize (#36277) (#36555)
* Fix edge case for tokenize (#36277)

* Fix tokenizing dtype for float input cases

* add test for empty input string

* handle empty list of lists like [[]]

* add tests for tokenizer for models with input that is not plain text
2025-09-11 09:57:30 +02:00
ec532f20fb feature: Add robust token counting with padding exclusion (#40416)
* Created robust token counting using the existing include_num_input_tokens_seen variable; kept bool for backward compatibility, added string support, and left the default unchanged. Also created robust test cases.

* some files mismatched between my local and remote; committing to resolve it, and also fixed a code quality issue

* ci: retrigger tests

* another attempt to trigger CI for checks
2025-09-11 09:16:06 +02:00
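A hedged sketch of how this might be enabled: `include_num_input_tokens_seen` is an existing TrainingArguments field, but the string value below is an assumption about the mode added here, so verify it against the docs:

```python
from transformers import TrainingArguments

# Assumption: besides True/False, the field now accepts a string that
# selects the counting mode; "non_padding" is a guessed value.
args = TrainingArguments(
    output_dir="out",
    include_num_input_tokens_seen="non_padding",  # count only real tokens
)
```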
df67cd35f0 Fix DeepSpeed mixed precision precedence over Accelerate defaults (#39856)
* Fix DeepSpeed mixed precision precedence over Accelerate defaults

Resolves issue where Accelerate would default to bf16 mixed precision
when a DeepSpeed config specifies fp16, causing a ValueError. The fix
ensures DeepSpeed config takes precedence over TrainingArguments defaults
while preserving explicit user settings.

Changes:
- Add override_training_args_from_deepspeed() method to handle config precedence
- Reorder mixed precision environment variable setting in TrainingArguments
- Ensure DeepSpeed fp16/bf16 settings override defaults but not explicit choices

Fixes #39849

* Add tests for DeepSpeed mixed precision precedence fix

- Add TestDeepSpeedMixedPrecisionPrecedence class with 3 focused tests
- Test DeepSpeed fp16/bf16 config overriding TrainingArguments defaults
- Test user explicit settings being preserved over DeepSpeed config
- Test precedence hierarchy: user settings > DeepSpeed config > defaults
- Replace massive 934-line test bloat with concise 50-line test suite
- Tests cover core functionality of PR #39856 mixed precision precedence fix
2025-09-11 09:12:15 +02:00
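A sketch of the scenario the fix targets: the DeepSpeed config requests fp16 while the user sets neither `fp16` nor `bf16` explicitly, so the DeepSpeed setting should win over Accelerate's bf16 default:

```python
from transformers import TrainingArguments

ds_config = {
    "fp16": {"enabled": True},          # DeepSpeed requests fp16...
    "zero_optimization": {"stage": 2},
}

# ...and with neither fp16=True nor bf16=True passed here, the fix lets
# the DeepSpeed setting take precedence instead of raising a ValueError.
args = TrainingArguments(output_dir="out", deepspeed=ds_config)
```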
549ba5b8b6 [Docs] Add missing class documentation for optimizer_schedules (#31870, #23010) (#40761)
* Add missing class documentation for optimizer_schedules (#31870, #23010)

* Add section level header to the optimizer schedules
2025-09-10 14:58:21 -07:00
dae1ccfb98 fix_image_processing_fast_for_glm4v (#40483)
* fix_image_processing_fast_for_glm4v

* fix(format): auto-ruff format

* add test image processing glm4v

* fix quality

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-09-10 21:05:27 +00:00
7d57b31e16 Remove use_ipex option from Trainer (#40784)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-10 17:00:15 +00:00
3378e7dabf Move num_items_in_batch to correct device before accelerator.gather (#40773)
add device
2025-09-10 18:49:42 +02:00
e5ecb03c92 Fix the issue that csm model cannot work with pipeline mode. (#39349)
* Fix the issue that csm model cannot work with pipeline mode.

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Remove batching inference

Signed-off-by: yuanwu <yuan.wu@intel.com>

* csm output is list of tensor

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/pipelines/text_to_audio.py

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>

* Use different waveform key for different model

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Fix make style errors

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add csm tests

Signed-off-by: yuanwu <yuanwu@habana.ai>

* Update src/transformers/models/auto/tokenization_auto.py

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Signed-off-by: yuanwu <yuanwu@habana.ai>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-09-10 16:17:35 +00:00
abbed7010b Fix dotted model names (#40745)
* Fix module loading for models with dots in names

* quality check

* added test

* wrong import

* Trigger CI rerun after making test model public

* Update src/transformers/dynamic_module_utils.py

* Update tests/utils/test_dynamic_module_utils.py

* Update tests/utils/test_dynamic_module_utils.py

* Move test

* make fixup

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
2025-09-10 14:34:56 +00:00
75202b0928 Read config pattern for Qwen3Next (#40792)
read it
2025-09-10 15:18:51 +02:00
7401cfa57c Use functools.cached_property (#40607)
* cached_property is available in functools

Signed-off-by: cyy <cyyever@outlook.com>

* Remove cached_property

Signed-off-by: cyy <cyyever@outlook.com>

* Fix docs

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-10 12:15:40 +00:00
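The replacement is the standard-library decorator; a minimal illustration (class and attribute names are hypothetical):

```python
from functools import cached_property

class ModelConfig:
    def __init__(self, hidden_size: int, num_attention_heads: int):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads

    @cached_property
    def head_dim(self) -> int:
        # Computed on first access, then stored on the instance, which is
        # what the removed hand-rolled cached_property used to do.
        return self.hidden_size // self.num_attention_heads

cfg = ModelConfig(4096, 32)
assert cfg.head_dim == 128  # second access hits the cache
```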
8ab2448707 Fix invalid PipelineParallel member (#40789)
Fix invalid enum member

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-10 12:06:36 +00:00
6c9f412105 Fix typos in tests and util (#40780)
Fix typos

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-10 11:45:40 +00:00
0997c2f2ab Fix doc for PerceptionLMForConditionalGeneration forward. (#40733)
* Fix doc for PerceptionLMForConditionalGeneration forward.

* fix last nit

---------

Co-authored-by: raushan <raushan@huggingface.co>
2025-09-10 11:57:19 +02:00
a72e5a4b9d 🚨 Fix Inconsistent input_feature length and attention_mask length in WhisperFeatureExtractor (#39221)
* Update feature_extraction_whisper.py

* Reformat

* Add feature extractor shape test

* reformat

* fix omni

* fix new failing whisper test

* Update src/transformers/models/whisper/feature_extraction_whisper.py

* make style

* revert omni test changes

* add comment

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-09-10 09:38:47 +00:00
a5ecd94a3f Enable ruff on benchmark and scripts (#40634)
* Enable ruff on benchmark and scripts

Signed-off-by: cyy <cyyever@outlook.com>

* Cover benchmark_v2

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* correct

* style

* style

---------

Signed-off-by: cyy <cyyever@outlook.com>
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-09-10 11:38:06 +02:00
08edec9f7d [processors] Unbloating simple processors (#40377)
* modularize processor - step 1

* typos

* why raise error, super call check it also

* tiny update

* fix copies

* fix style and test

* lost an import / fix copies

* fix tests

* oops deleted accidentally
2025-09-10 10:37:19 +02:00
c52889bd51 Remove reference of video_load_backend and video_fps for processor (#40719)
* Remove reference of video_load_backend and video_fps for processor

Signed-off-by: cyy <cyyever@outlook.com>

* Restore changes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-09-10 08:37:11 +00:00
3340ccbd40 Fix gpt-oss router_indices in EP (#40545)
* fix out shape

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix router indice

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix mod

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix masking

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix typo

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix typo

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add safety checking

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix checking

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable 1 expert per rank

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix skip

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add ep plan in config

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add update ep plan

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix typo

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm ep_plan and add comments

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-09-10 10:30:55 +02:00
b9282355be Adding Support for Qwen3-Next (#40771)
* Add Qwen3-Next.

* fix

* style

* doc

* simplify

* fix name

* lazy cache init to allow multi-gpu inference

* simplify

* fix config to support different hybrid ratio.

* remove last commit (redundant)

* tests

* fix test

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-09-09 23:46:57 +02:00
79fdbf2a4a [docs] CPU install (#40631)
* init

* feedback
2025-09-09 12:51:54 -07:00
37c14430c9 [pipeline] ASR pipeline kwargs are forwarded to generate (#40375)
* tmp commit

* add test

* PR suggestion
2025-09-09 17:29:25 +00:00
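A hedged usage sketch of the behavior described above (the model name and audio path are placeholders):

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")

# Generation options supplied at call time are forwarded to generate().
result = asr("sample.wav", generate_kwargs={"num_beams": 4, "max_new_tokens": 128})
```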
d09fdf5e52 Fix crash when executing MambaCache sample code (#40557)
* Fix the sample code of MambaCache

* Update automatically generated code

* Fix FalconMambaCache documents

* minor doc fixes

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
2025-09-09 16:44:49 +00:00
d33c189e5a [RoPE] run RoPE tests when the model uses RoPE (#40630)
* enable rope tests

* no manual rope test parameterization

* Apply suggestions from code review

* Update tests/models/hunyuan_v1_dense/test_modeling_hunyuan_v1_dense.py

* PR comment: use generalist torch code to find the rope layer
2025-09-09 17:11:02 +01:00
71ac7ea048 [tests] update test_past_key_values_format and delete overwrites (#40701)
* tmp

* rm some overwrites
2025-09-09 16:40:04 +01:00
7aaef98cbe rm src/transformers/convert_pytorch_checkpoint_to_tf2.py (#40718)
* rm src/transformers/convert_pytorch_checkpoint_to_tf2.py

* doctest skip
2025-09-09 16:34:54 +01:00
de5cbe8b79 [deprecations] Remove generate-related deprecations up to v4.56 (#40729)
remove generate-related deprecations up to v4.56
2025-09-09 16:32:41 +01:00
1cdbbb3e9d Support sliding window in CB (#40688)
* CB example: better compare feature

* Cache managers, still issue w/ effective length

* WIP -- fix for effective length

* Renames

* Working, need better parity checks; we might be missing 1 token

* Small fixes

* Fixed wrong attn mask and broke cache into pieces

* Warmup is slowing down things, disabling it

* Cache was too big, fixed

* Simplified index objects

* Added a profile option to the example

* Avoid calls to memory reporting tools

* Restore full attention read indices for better latency

* Addressed some TODOs and style

* Docstrings for cache managers

* Docstrings for Schedulers

* Refactor schedulers

* [Important] Cache fix for sliding window, check with small sw size

* Updated doc for cache memory compute and cache as a whole

* Moved a todo

* Nits and style

* Fix for when sliding window is smaller than max batch per token

* Paged interface update

* Support for FLash in new API

* Fix example CB

* Fix bug in CB for paged

* Revert example

* Style

* Review compliance

* Style

* Styleeeee

* Removed NO_SLIDING_WINDOW

* Review #2 compliance

* Better art

* Turn cum_seqlens_k in a dict

* Attn mask is now a dict

* Update examples/pytorch/continuous_batching.py

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

* Addressed McPatate's review

* Style and fix

---------

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
2025-09-09 15:51:11 +02:00
ed100211cb [generate] PromptLookupCandidateGenerator won't generate forbidden tokens (#40726)
* no longer flaky :)

* PR comments

* any token-blocking logits processor works

* ?

* default

* -_-

* create fake tensors once
2025-09-09 11:04:01 +00:00
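
A minimal sketch of the idea behind this fix, in a toy setting rather than the actual PromptLookupCandidateGenerator code: prompt lookup proposes continuations copied from earlier in the prompt, and candidates are truncated before any forbidden token so blocked tokens can never be proposed. The helper name and signature below are illustrative.

```python
# Illustrative sketch only -- not the transformers PromptLookupCandidateGenerator.
# Prompt lookup finds the last n-gram earlier in the prompt and proposes the
# tokens that followed it; here candidates are cut at the first forbidden id,
# mirroring "any token-blocking logits processor works".
from typing import Optional


def prompt_lookup_candidates(
    input_ids: list[int],
    ngram_size: int = 2,
    num_candidates: int = 3,
    forbidden_ids: Optional[set[int]] = None,
) -> list[int]:
    forbidden_ids = forbidden_ids or set()
    tail = input_ids[-ngram_size:]
    # Scan right-to-left for a previous occurrence of the tail n-gram.
    for start in range(len(input_ids) - ngram_size - 1, -1, -1):
        if input_ids[start : start + ngram_size] == tail:
            continuation = input_ids[start + ngram_size : start + ngram_size + num_candidates]
            out: list[int] = []
            for tok in continuation:
                if tok in forbidden_ids:  # never propose a blocked token
                    break
                out.append(tok)
            return out
    return []


print(prompt_lookup_candidates([5, 6, 7, 8, 5, 6], forbidden_ids={8}))  # -> [7]
```
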
82d66e5dd0 Fix: swanlab public.cloud.experiment_url api error (#40763)
fix
2025-09-09 09:28:13 +00:00
a871f6f58d Add EfficientLoFTRImageProcessorFast for GPU-accelerated image processing (#40215)
* Add EfficientLoFTRImageProcessorFast for GPU-accelerated image processing

* Fix fast processor output format and add comprehensive tests

* Fix trailing whitespace in test file

* Apply ruff formatting to test file

* simplify pair validation logic

* add superglue tests to fast image processor

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-09-08 21:08:02 +00:00
aee5000f16 Fix Bark failing tests (#39478)
* Fix vocab size for Bark generation.

* Fix Bark processor tests.

* Fix style.

* Address comments.

* Fix formatting.

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-09-08 20:24:51 +02:00
126264d015 🌐 [i18n-KO] Translated 'xclip.md' to Korean (#39594)
* feat: nmt draft

* fix: manual edits

* docs: ko: xclip.md

* feat: nmt draft

* fix: manual edits

* fix: Modify _toctree.yml file to reflect review

* fix: Modify _toctree.yml file to reflect review

* jungnerd_suggestion_modified_01 ko_xclip.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* jungnerd_suggestion_modified_02 ko_xclip.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2025-09-08 11:19:10 -07:00
5a468e56b7 Fix continue_final_message in apply_chat_template to prevent substring matching issues (#40732)
* Fix continue_final_message parameter in apply_chat_template

* after run fixup

* Handle trim in the template

* after fixup

* Update src/transformers/utils/chat_template_utils.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-09-08 17:25:12 +00:00
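
A toy illustration of the substring pitfall this PR addresses, not the actual chat_template_utils logic: if the final message's text also occurs earlier in the rendered conversation, a left-to-right substring search truncates in the wrong place.

```python
# Toy demonstration (not the real chat template code) of why plain substring
# search can truncate at the wrong spot when continuing the final message.
rendered = "<user> Say hi <assistant> hi <user> again <assistant> hi"
final_text = "hi"  # rendered form of the final assistant message

wrong_cut = rendered[: rendered.find(final_text) + len(final_text)]
right_cut = rendered[: rendered.rfind(final_text) + len(final_text)]

print(wrong_cut)  # '<user> Say hi' -- cut mid-conversation at the first match
print(right_cut)  # full string -- cut at the last occurrence, as intended
```
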
e8db153599 Fix inconsistency in SeamlessM4T and SeamlessM4Tv2 docs (#39364) 2025-09-08 10:01:44 -07:00
fd2a29d468 Fix more typos (#40627)
Fix typos

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-08 16:05:40 +00:00
bb8e9cd675 Remove unnecessary tildes from documentation (#40748) 2025-09-08 08:56:35 -07:00
a9b313a0c2 docs: add continuous batching to serving (#40758)
* docs: tmp

* docs: add continuous batching to serving

* docs: reword after @lysandrejik review
2025-09-08 15:50:28 +00:00
2077f17547 feat: err when unsupported attn impl is set w/ --continuous_batching (#40618)
* feat: err when unsupported attn impl is set w/ `--continuous_batching`

* refactor: move defaults and support list to CB code

* feat: add action item in error msg

* fix(serve): add default attn implementation

* feat(serve): add log when `attn_implementation` is `None`

* feat: raise Exception when attn_implementation is not supported by CB
2025-09-08 14:31:49 +00:00
dc262ee6f5 remove FSDP prefix when using save_pretrained with FSDP2 (#40207)
* remove FSDP prefix when using save_pretrained with FSDP2

* Fix: use removeprefix correctly

---------

Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
Co-authored-by: S1ro1 <matej.sirovatka@gmail.com>
2025-09-08 14:52:31 +02:00
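
A minimal sketch of the key-cleanup idea, with the prefix string assumed for illustration (the real FSDP wrapper prefix may differ): str.removeprefix removes one exact leading match, which is why it is the right tool here where lstrip would mangle keys.

```python
# Sketch with an assumed prefix string (the real FSDP wrapper prefix may differ).
# str.removeprefix drops one exact leading match; lstrip("...") would instead
# strip any of those characters and corrupt key names.
prefix = "_fsdp_wrapped_module."  # assumed for illustration
state_dict = {"_fsdp_wrapped_module.model.embed_tokens.weight": 0.0}

cleaned = {key.removeprefix(prefix): value for key, value in state_dict.items()}
print(list(cleaned))  # ['model.embed_tokens.weight']
```
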
9ab6078323 remove gemmas eager training warning (#40744)
* removed warning

* removed remaining warnings
2025-09-08 14:41:52 +02:00
2a1eb5b508 Add BF16 support check for MUSA backend (#40576)
add musa bf16 supported

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-08 12:39:14 +00:00
7b8d40ea7a Set accepts_loss_kwargs to False for ConvNext(|V2)ForImageClassification (#40746) 2025-09-08 14:25:43 +02:00
def7558f74 Fix np array typing (#40741)
Fix typing

Signed-off-by: cyy <cyyever@outlook.com>
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-08 11:30:40 +00:00
44b3888d2a Fix order of mask functions when using and/or_mask_function (#40753)
fix order
2025-09-08 12:31:42 +02:00
3f7bda4209 [Continuous Batching] fix do_sample=True in continuous batching (#40692)
* fix do_sample=True in continuous batching

* added test

* fix top_p

* test

* Update examples/pytorch/continuous_batching.py
2025-09-08 10:30:15 +02:00
bb45d3631e refactor(serve): move request_id to headers (#40722)
* refactor(serve): move `request_id` to headers

* fix(serve): typo in middleware fn name

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-09-05 17:50:04 +02:00
12b8e10dbf Skip VitMatteImageProcessingTest::test_fast_is_faster_than_slow (#40713)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-05 17:36:20 +02:00
6b232618b6 Keypoint matching docs (#40541)
---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: StevenBucaille <steven.bucaille@gmail.com>
2025-09-05 17:24:56 +02:00
948bc0fa34 [Gemma Embedding] Fix SWA (#40700)
* fix gemma embedding flash attention

* fix sdpa

* fix atttempt number 2

* alternative gemma fix

* fix modular
2025-09-05 17:12:00 +02:00
828044cadb Add Optional typing (#40686)
* Add Optional typing

Signed-off-by: cyy <cyyever@outlook.com>

* Fix typing

Signed-off-by: cyy <cyyever@outlook.com>

* Format

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-09-05 15:05:51 +00:00
e9d6a6907b [tests] remove overwrites of removed test (#40720)
rm tests from method moved to hub
2025-09-05 16:04:22 +01:00
96a5774f2e [serve] re-enable tests (#40717)
run tests
2025-09-05 15:15:34 +01:00
c76387e580 Fix arguments (#40605)
* Fix invalid arguments

Signed-off-by: cyy <cyyever@outlook.com>

* Fix typing

Signed-off-by: cyy <cyyever@outlook.com>

* Add missing self

Signed-off-by: cyy <cyyever@outlook.com>

* Add missing self and other fixes

Signed-off-by: cyy <cyyever@outlook.com>

*  More fixes

Signed-off-by: cyy <cyyever@outlook.com>

*  More fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-09-05 13:50:04 +00:00
21f09032db 🔴 Update Glm4V to use config values (#40712)
* update to use config

* just fix it

* fixup want this to be reformatted
2025-09-05 13:19:50 +00:00
b62e5b6051 Fix parent classes of AllKwargsForChatTemplate (#40685)
Fix parent classes of AllKwargsForChatTemplate because the *Kwargs are members

Signed-off-by: cyy <cyyever@outlook.com>
2025-09-05 11:08:51 +00:00
313effa7ad [onnx] use logical or for grounding dino mask (#40625)
* change |= operator to use torch logical or for friendly export to different backends

* change |= operator to use torch logical or for friendly export to different backends in grounding dino model

---------

Co-authored-by: Lewis Marshall <lewism@elderda.co.uk>
2025-09-05 10:55:20 +00:00
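
A minimal sketch of the change, with illustrative tensors: the in-place |= update is rewritten as an out-of-place torch.logical_or, which some ONNX and backend exporters handle more gracefully.

```python
# Illustrative tensors: replace the in-place bitwise update with an
# out-of-place torch.logical_or, which exports more cleanly to some backends.
import torch

mask_a = torch.tensor([True, False, False])
mask_b = torch.tensor([False, False, True])

# Before: mask_a |= mask_b        (in-place, export-unfriendly)
mask = torch.logical_or(mask_a, mask_b)  # after: out-of-place, same result
print(mask)  # tensor([ True, False,  True])
```
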
f3211b5db7 [modular] Add missing self in post-process methods (#40711) 2025-09-05 10:49:52 +00:00
a2a8a3ca1e [tests] fix blip2 edge case (#40699) 2025-09-05 11:35:29 +01:00
4e195f1949 🚨 Allow check_model_inputs in core VLMs (#40342)
* allow `check_model_inputs` in core VLMs

* address comments

* fix style

* why didn't this fail previously?

* check for Noneness instead

* batch update vlms

* fix some tests

* fix copies

* oops delete

* fix efficientloftr

* fix copies

* i am stupid, fix idefics

* fix GC

* return type and other comments

* we shouldn't manually change attention anymore

* fix style

* fix copies

* fix the test
2025-09-05 10:05:56 +00:00
93df343def Fix parent classes of ProcessingKwargs (#40676)
FIx parent classes of ProcessingKwargs

Signed-off-by: cyy <cyyever@outlook.com>
2025-09-05 10:01:16 +00:00
89e103c15e feat(serve): add healthcheck test (#40697) 2025-09-05 11:56:34 +02:00
a2fffa505d Fetch more test data with hf_hub_download (#40710)
[test-all] tests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-05 09:49:31 +00:00
4a88e81532 Add Fast Image Processor for ImageGPT (#39592)
* initial commit

* initial setup

* Overriding ImageGPT-specific functions

* imported is_torch_available and utilized it for importing torch in imageGPT fast

* Created init and ImageGPTFastImageProcessorKwargs

* added return_tensors, data_format, and input_data_format to ImageGPTFastImageProcessorKwargs

* set up arguments and process and _preprocess definitions

* Added arguments to _preprocess

* Added additional optional arguments

* Copied logic over from base imageGPT processor

* Implemented 2nd draft of fast imageGPT preprocess using batch processing

* Implemented 3rd draft of imageGPT fast _preprocessor. Pulled logic from BaseImageProcessorFast

* modified imageGPT test file to properly run fast processor tests

* converts images to torch.float32 from torch.uint8

* fixed a typo with self.image_processor_list in the imagegpt test file

* updated more instances of image_processing = self.image_processing_class in the test file to test fast processor

* standardized normalization to not use image mean or std

* Merged changes from solution2 branch

* Merged changes from solution2 test file

* fixed testing through baseImageGPT processor file

* Fixed check_code_quality test. Removed unnecessary list comprehension.

* reorganized imports in image_processing_imagegpt_fast

* formatted image_processing_imagegpt_fast.py

* Added arg documentation

* Added FastImageProcessorKwargs class + Docs for new kwargs

* Reformatted previous

* Added F to normalization

* fixed ruff linting and cleaned up fast processor file

* implemented requested changes

* fixed ruff checks

* fixed formatting issues

* fix(ruff after merging main)

* simplify logic and reuse standard equivalence tests

---------

Co-authored-by: Ethan Ayaay <ayaayethan@gmail.com>
Co-authored-by: chris <christine05789@gmail.com>
Co-authored-by: Ethan Ayaay <98191976+ayaayethan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-09-04 22:45:06 +00:00
9db11b728b Fetch one missing test data (#40703)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-04 23:05:23 +02:00
acd820561f Align assisted generate for unified signature in decoding methods (#40657)
* Squashed previous branch

* unify assisted generate to common decoding method signature

* move checks to validate steps where possible

* fix csm and other models that override _sample

* ops dia you again

* opsie

* joao review
2025-09-04 22:47:44 +02:00
16b821c542 Avoid T5GemmaModelTest::test_eager_matches_sdpa_inference being flaky (#40702)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-04 20:44:40 +00:00
519c2524af Fix broken Llama4 accuracy in MoE part (#40609)
* Fix broken Llama4 accuracy in MoE part

Llama4 accuracy was broken by a bug in
https://github.com/huggingface/transformers/pull/39501 , which forgot to
transpose the router_scores before applying them to routed_in, causing
Llama4 to generate garbage output (see the shape sketch after this entry).

This PR fixes that issue by adding back the transpose() and adding some
comments explaining why the transpose() is needed.

Signed-off-by: Po-Han Huang <pohanh@nvidia.com>

* remove comment

---------

Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-09-04 22:14:44 +02:00
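
A toy sketch of the shape bug described above, with illustrative shapes rather than the actual Llama4 MoE module:

```python
# Toy shapes only -- not the actual Llama4 MoE code. The router emits scores
# as (num_experts, num_tokens); they must be transposed to line up token-major
# with the per-token expert inputs before scaling.
import torch

num_tokens, num_experts, hidden = 4, 2, 8
router_scores = torch.rand(num_experts, num_tokens)       # router output layout
routed_in = torch.rand(num_tokens * num_experts, hidden)  # one row per (token, expert)

scaled = routed_in * router_scores.transpose(0, 1).reshape(-1, 1)
print(scaled.shape)  # torch.Size([8, 8])
```
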
586dc5d06e [Glm4.5V] fix vLLM support (#40696)
* fix

* add a test case
2025-09-04 22:09:20 +02:00
ad2da3ea83 Fix self.dropout_p is not defined for SamAttention/Sam2Attention (#40667)
Fix dropout_p is not defined for SamAttention/Sam2Attention
2025-09-04 19:32:39 +02:00
e39f222096 Fix backward compatibility with accelerate in Trainer (#40668) 2025-09-04 18:15:15 +02:00
d8f670583e Change docker image to preview for the MI355 CI (#40693)
* Change docker image to preview for the MI355 CI

* Use pushed image
2025-09-04 17:23:09 +02:00
4cbca0d1af Fixing bug in Voxtral when merging text and audio embeddings (#40671)
* Fixing bug when replacing text-audio token placeholders with audio embeddings

* apply changes

---------

Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-09-04 15:11:23 +00:00
9a6c6568db feat: support request cancellation (#40599)
* feat: support request cancellation

* test: add cancellation test

* refactor: use existing fn to check req cancellation

* feat(cb): make cancellation thread safe

* refactor(serve): update test to use `requests` instead of `httpx`
2025-09-04 17:01:29 +02:00
87f38dbfce add: embedding model (#40694)
* Gemma 3 for Embeddings

* Style fixes

* Rename conversion file for consistency

* Default padding side emb vs gen

* Corrected 270m config

* style fixes

* EmbeddingGemma config

* TODO for built-in prompts

* Resolving the sentence similarity bug and updating the architecture

* code style

* Add query prompt for SentenceTransformers

* Code quality

* Fixing or_mask_function return types

* Adding placeholder prompts for document and passage

* Finalizing prompt templates

* Adding Retrieval to preconfigured prompts

* Add Gemma 3 270M Config

* Correcting num_linear_layers flag default

* Export Sentence Transformer in correct dtype

---------

Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
2025-09-04 16:16:15 +02:00
5b0c01b5e2 Final test data cache - inside CI docker images (#40689)
* run

* build

* build

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-04 13:12:49 +00:00
1f3cc935cc Load a tiny video to make CI faster (#40684)
* load a tiny video to make CI faster

* add video in url_to_local_path
2025-09-04 14:49:00 +02:00
669230a86f fix broken offline mode when loading tokenizer from hub (#40669)
* fix broken offline mode when loading tokenizer from hub

* formatting

* make quality

* fix import order
2025-09-04 12:15:56 +00:00
91b34be9cf Add codebook_dim attribute to DacVectorQuantize for DacResidualVectorQuantize.from_latents() (#40665)
* Add instance attribute to DacVectorQuantize for use in DacResidualVectorQuantize.from_latents

* add from_latent tests

* style fix

* Fix style for test_modeling_dac.py
2025-09-04 11:29:53 +00:00
25b4a0d8ae Add sequence classification support for small Gemma 3 text models (#40562)
* add seq class for gemma3 text model

* add Gemma3TextForSequenceClassification to modeling file

* After run make fixup

* let's just check

* this is why it was crashing, tests were just failing...

* skip it, tested only for seq clf

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-09-04 09:44:59 +00:00
30a4b8707d CircleCI docker images cleanup / update / fix (#40681)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-04 10:42:18 +02:00
7f92e1f91a Mark Aimv2ModelTest::test_eager_matches_sdpa_inference_04_fp16_pad_right_sdpa_kernels as flaky (#40683)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-04 10:30:14 +02:00
ca9b36a9c1 Avoid night torch CI not run because of irrelevant docker image failing to build (#40677)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-04 09:06:37 +02:00
d40e7ea52d Skip more fast v.s slow image processor tests (#40675)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-04 06:35:44 +02:00
34595cf296 Even more test data cached (#40636)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 21:20:37 +00:00
f22ec7f174 Benchmarking V2: framework impl (#40486)
* Start revamping benchmarking

* Start refactoring benchmarking

* Use Pandas for CSV

* import fix

* Remove benchmark files

* Remove sample data

* Address review comments

* Benchmarking v2

* Fix llama bench parameters

* Working checkpoint

* Readme touchups

* Remove unnecessary test

* Massage the framework a bit

* Small cleanup

* Remove unnecessary flushes

* Remove references to mock benchmark

* Take commit ID from CLI

* Address review comments

* Use Events for thread comms

* Tiny renaming
2025-09-03 22:26:32 +02:00
459c1fa47a refactor: use tolist instead of list comprehension calling .item() (#40646) 2025-09-03 19:25:29 +02:00
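
A minimal before/after sketch of this refactor: tolist() converts the whole tensor in one call instead of looping in Python and calling .item() per element.

```python
# Before/after for the refactor: one tolist() call instead of per-element .item().
import torch

t = torch.tensor([1, 2, 3])

values_before = [x.item() for x in t]  # Python loop, one conversion per element
values_after = t.tolist()              # single bulk conversion
assert values_before == values_after == [1, 2, 3]
```
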
afd1393df1 Remove overwritten GitModelTest::test_beam_search_generate (#40666)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 18:55:45 +02:00
68b9cbb7f5 Skip test_prompt_lookup_decoding_matches_greedy_search for qwen2_audio (#40664)
* Skip `test_prompt_lookup_decoding_matches_greedy_search` for `qwen2_audio`

* Skip `test_prompt_lookup_decoding_matches_greedy_search` for `qwen2_audio`

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 18:43:35 +02:00
55676d7d4c Fix warning for output_attentions=True (#40597)
* Fix attn_implementation for output_attentions

* remove setting attention, just raise warning

* improve message

* Update src/transformers/utils/generic.py
2025-09-03 16:25:13 +00:00
b67608f587 Skip test_fast_is_faster_than_slow for Owlv2ImageProcessingTest (#40663)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 17:49:10 +02:00
30d66dc3bc Update check_determinism inside test_determinism (#40661)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 17:30:39 +02:00
3f40ebf620 Allow custom args in custom_generate Callables and unify generation args structure (#40586)
* Squashed commit of the following:

commit beb2b5f7a04ea9e12876696db66f3589fbae10c5
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 16:03:25 2025 +0200

    also standardize _get_stopping_criteria

commit 15c25663fa991e0a215a7f3cdcf13a9d3a989faa
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 15:48:38 2025 +0200

    watch super.generate() usages

commit 67dd845be2202d191a54b2872f1cb3f71b74b7d6
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 14:44:32 2025 +0200

    ops

commit 4655dfa28fd59d5dc083a41d8396de042d99858c
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 14:41:36 2025 +0200

    wrong merge

commit 46478143994e7b27d51c972a7881e0fea3cb6e3c
Merge: a72c2c4b2f 8564e210ca
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 14:36:15 2025 +0200

    Merge branch 'main' of github.com:huggingface/transformers into fix-custom-gen-from-function2

commit a72c2c4b2f9c0e09fe6ec7992d4d02bfa279da2a
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 14:04:59 2025 +0200

    ops5

commit e72f91411b961979bb3d271810f57905cee5b577
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 12:06:19 2025 +0200

    ops4

commit 12ca97b1078a42167143e0243036f6ef87d5fdac
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 11:58:59 2025 +0200

    ops3

commit 8cac6c60a318dd381793d4bf1ef3775823f3c95b
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 11:43:03 2025 +0200

    ops2

commit 4681a7d5dc6c8b96a515d9d79f06380c096b9a9f
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 11:40:51 2025 +0200

    ops

commit 0d72aa6cbd99a5933c5a95a39bea9088ee21e50f
Merge: e0d47e980e 5bb6186b8e
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 11:37:28 2025 +0200

    Merge branch 'remove-constrained-bs' into fix-custom-gen-from-function2

commit 5bb6186b8efbd5fdb8e3464a22f958343b9c450c
Merge: 44973dac7d b0db5a02f3
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 11:36:30 2025 +0200

    Merge branch 'main' into remove-constrained-bs

commit 44973dac7df4b4e2111c71f5fac918be21f3de52
Merge: 1ddab4bee1 893d89e5e6
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 11:29:48 2025 +0200

    Merge commit '893d89e5e6fac7279fe4292bfa3b027172287162' into remove-constrained-bs

commit e0d47e980e26d32b028c2b402ccb71262637a7a7
Merge: 88128e4563 1ddab4bee1
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 10:52:50 2025 +0200

    Merge branch 'remove-constrained-bs' into fix-custom-gen-from-function2

commit 88128e4563c0be583728e1d3c639bc93143c4029
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Mon Sep 1 10:44:38 2025 +0200

    fix custom generate args, refactor gen mode args

commit 1ddab4bee159f6c20722e7ff5cd41d5041fab0aa
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Sun Aug 31 21:03:53 2025 +0200

    fix

commit 6095fdda677ef7fbeb06c05f4f914a11b45257b4
Merge: 4a8b6d2ce1 04addbc9ec
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 17:49:16 2025 +0200

    Merge branch 'remove-constrained-bs' of github.com:manueldeprada/transformers into remove-constrained-bs

commit 4a8b6d2ce18b3a8b52c5261fea427e2416f65187
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 17:48:25 2025 +0200

    restore and deprecate beam objects

commit 04addbc9ec62dd4f59d15128e8cd9499e2cda3bb
Merge: e800c7841e becab2c601
Author: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com>
Date:   Thu Aug 28 14:38:29 2025 +0200

    Merge branch 'main' into remove-constrained-bs

commit e800c7841e5c46ce5698fc9be309d0808f85d23c
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 14:38:10 2025 +0200

    tests gone after green

commit 33971d21ac40aef76a7e1122f4a98ef28beadbe8
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 14:07:11 2025 +0200

    tests green, changed handling of deprecated methods

commit ab303835c184d0a87789da7aed7d8de5ba85d867
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 12:58:01 2025 +0200

    tests fix

commit ec74274ca52a6aa0b5f300374fda838609680506
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 12:32:05 2025 +0200

    ops

commit 0fb19004ccd285dcad485fce0865b355ce5493e0
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 11:45:16 2025 +0200

    whoops

commit c946bea5e45aea021c8878c57fcabc2a13f06fe5
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 11:35:36 2025 +0200

    testing...

commit 924c0dec6d9ea6b4890644fe7f711dc778f820bb
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 11:22:46 2025 +0200

    sweeep ready for tests

commit b05aa771d3994b07cd460cda74b274c9e4f315e6
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Thu Aug 28 11:13:01 2025 +0200

    restore and deprecate constraints

commit 9c7962d10efa7178b69d3c99e69663756e1cd979
Merge: fceeb383f9 c17bf304d5
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Wed Aug 27 20:44:21 2025 +0200

    Merge branch 'remove-group-bs' into remove-constrained-bs

commit c17bf304d5cf33af7f34f9f6057915d5f5821dae
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Wed Aug 27 17:00:50 2025 +0200

    fix test

commit d579aeec6706b77fcc24c1f6806cd7277d7db56e
Merge: 822efd8c3c ed5dd2999c
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Wed Aug 27 16:04:31 2025 +0200

    Merge branch 'main' of github.com:huggingface/transformers into remove-group-bs

commit 822efd8c3cf475d079e64293aa06e4ab59740fd7
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Wed Aug 27 15:59:51 2025 +0200

    aaand remove tests after all green!!

commit 62cb274a4acb9f24201902242f1b0dc4e46daac1
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Wed Aug 27 11:48:19 2025 +0200

    fix

commit c89c892e7b24a7d71831f2b35264456005030925
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Wed Aug 27 11:45:20 2025 +0200

    testing that hub works the same

commit fceeb383f99e4a836679d67b1d2a8520152eaf49
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Tue Aug 26 20:06:59 2025 +0200

    draft

commit 6a9b384078f3798587ba865ac7ddfefc9a79e41c
Merge: 8af3af13ab 58cebc848b
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Tue Aug 26 15:00:05 2025 +0200

    Merge branch 'main' of github.com:huggingface/transformers into remove-group-bs

commit 8af3af13abb85ca60e795d0390832f398a56c34f
Author: Manuel de Prada Corral <manueldeprada@gmail.com>
Date:   Tue Aug 26 11:55:45 2025 +0200

    Squashed commit remove-constrastive-search

* ops

* fix

* ops

* review

* fix

* fix dia

* review
2025-09-03 17:30:09 +02:00
a8f400367d Avoid attention_mask copy in qwen2.5 (#40658)
Signed-off-by: cyy <cyyever@outlook.com>
2025-09-03 15:17:22 +00:00
57f5668d0b Fix Metaclip modular conversion (#40660)
* Fix Metaclip modular conversion

* manually run check_copies
2025-09-03 16:13:50 +01:00
238a8274b4 feat(serving): add healthcheck (#40653) 2025-09-03 16:43:12 +02:00
f2416b4fd2 fix pipeline dtype (#40638)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-03 16:05:48 +02:00
5ea5c8179b Mark LongformerModelTest::test_attention_outputs as flaky (#40655)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 13:19:02 +00:00
fe1a9e0dba Remove TF/Flax examples (#40654)
* Remove TF/Flax examples

* Remove check_full_copies

* Trigger CI
2025-09-03 14:15:57 +01:00
5e2e496149 fix MetaCLIP 2 wrong link & wrong model names in the docstrings (#40565)
* fix MetaCLIP 2 wrong link & wrong model names in the documentation and docstrings

* ruff reformatted

* update files generated by modular

* update meta_clip2 to metaclip_2 to match the original

* _supports_flash_attn = False

---------

Co-authored-by: Yung-Sung Chuang <yungsung@meta.com>
2025-09-03 13:53:56 +01:00
03708ccf6f add DeepseekV3ForTokenClassification (#40641)
* add DeepseekV3ForTokenClassification

* fix typo

---------

Co-authored-by: json.bourne <json.bourne@kakaocorp.com>
2025-09-03 12:30:09 +00:00
c485c52db4 Skip test_prompt_lookup_decoding_matches_greedy_search for voxtral (#40643)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 11:45:29 +00:00
2bbf98a83d Fix: PIL image load in Processing utils apply_chat_template (#40622) 2025-09-03 13:06:05 +02:00
acc968c581 [CP] Add attention_mask to the buffer when the mask is causal (#40619)
Fix attention mask validation for context parallelism

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-09-03 10:19:35 +00:00
cb54ce4ec6 [auto-model] propagate kwargs (#40491)
propagate kwargs
2025-09-03 09:59:20 +00:00
0f5e45a6d1 fix: gas for gemma fixed (#40591)
* fix: gas for gemma fixed

* feat: run fix-copies

* feat: added issue label
2025-09-03 08:44:14 +00:00
e690fe61e8 Fix too many requests in TestMistralCommonTokenizer (#40623)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-03 05:05:03 +02:00
00a8364271 🌐 [i18n-KO] Translated deepseek_v3.md to Korean (#39649)
* docs: ko: deepseek_v3.md

* feat: nmt draft

* fix: manual edits

* fix: glossary edits

* docs: modified contents as recommended by 4N3MONE

* Update docs/source/ko/model_doc/deepseek_v3.md

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>

* Update docs/source/ko/model_doc/deepseek_v3.md

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>

* add_toctree.yml

---------

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>
2025-09-02 13:35:56 -07:00
ed49376a42 Remove random flag (#40629)
remove flag
2025-09-02 19:10:02 +02:00
d47ad91c3c Support TF32 flag for MUSA backend (#33187)
* Support MUSA (Moore Threads GPU) backend in transformers
Add accelerate version check, needs accelerate>=0.33.0

* Support TF32 flag for MUSA backend

* fix typo
2025-09-02 16:27:10 +00:00
a470f21396 Enable more ruff UP rules (#40579)
* Import Sequence from collections.abc

Signed-off-by: cyy <cyyever@outlook.com>

* Apply ruff UP rules

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-09-02 17:29:59 +02:00
37103d6f22 Fix invalid typing (#40612)
Signed-off-by: cyy <cyyever@outlook.com>
2025-09-02 13:10:22 +00:00
4f542052b9 Remove unnecessary pillow version check (#40604)
Signed-off-by: cyy <cyyever@outlook.com>
2025-09-02 12:59:22 +00:00
8c60a7c385 Add collated reports job to Nvidia CI (#40470)
* Add collated reports job to Nvidia CI

* machine_type

* Move collated reports job to model_jobs

* Propagate repo id variable

* assign runner_type when runner is self-scheduled-caller
2025-09-02 14:25:22 +02:00
97266dfd50 Fix flaky JambaModelTest.test_load_balancing_loss (#40617)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-02 13:58:16 +02:00
91be12bdc6 Avoid too many request caused by AutoModelTest::test_dynamic_saving_from_local_repo (#40614)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-02 12:08:52 +02:00
bbd8085b0b Fix processor chat template (#40613)
fix tests
2025-09-02 10:59:48 +02:00
b2b1c30b1b fix: continuous batching in transformers serve (#40479)
* fix: continuous batching in `transformers serve`

* fix: short circuit inner gen loop when prepare_next_batch prepared nothing

* docs: add comment explaining FastAPI lifespan

* test: add CB serving tests

* refactor: remove generation config max_new_tokens override because it's unnecessary

* docs: add docstring for `ServeCommand::run`

* feat: use new `DecodeStream` API
2025-09-02 10:45:05 +02:00
8a091cc07c Disable cache for TokenizerTesterMixin temporarily (#40611)
* try no cache

* try no cache

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-02 08:40:04 +02:00
514b3e81b7 Multiple fixes to FA tests in AMD (#40498)
* Expectations for gemma3

* Fixes for Qwen2_5_VL tests

* Added expectation but underlying pb is still there

* Better handling of mrope section for Qwen2_5_vl

* Fixes for FA2 tests and reformat batch test for Qwen2_5_Omni

* Fix multi-device error in qwen2_5_omni

* Style and repo-consistency

* Removed inherited test because fix in common

* slow tests fixes

* Style

* Fixes for qwen2_5_vl or omni for FA test
2025-09-01 20:49:50 +02:00
b3655507bb Pin torchcodec to 0.5 in AMD docker (#40598) 2025-09-01 20:39:55 +02:00
4da03d7f57 Reduce more test data fetch (#40595)
* example

* fix

* fix

* add to fetch script

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-01 18:07:18 +02:00
abf5900a76 [Tests] Fixup duplicated mrope logic (#40592)
cleanup duplicated logic
2025-09-01 17:22:34 +02:00
3beac9c659 Fix quite a lot of FA tests (#40548)
* fix_rope_change

* fix

* do it dynamically

* style

* simplify a lot

* better fix

* fix

* fix

* fix

* fix

* style

* fix
2025-09-01 16:42:50 +02:00
21e708c8fd Fix for missing default values in encoder decoder (#40517)
* Added default_value for is_updated and type check

* Forgot one

* Repo consistency
2025-09-01 16:11:23 +02:00
c99d43e6ec Fix siglip flaky test_eager_matches_sdpa_inference (#40584)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-01 15:17:25 +02:00
3c3dac3c12 Add Copilot instructions (#40432)
* Add copilot-instructions.md

* Fix typo

* Update .github/copilot-instructions.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-09-01 14:09:54 +01:00
2b71c5b7a6 Fix inexistent imports (#40580)
Signed-off-by: cyy <cyyever@outlook.com>
2025-09-01 13:05:00 +00:00
8e0b2c8baf Skip TvpImageProcessingTest::test_slow_fast_equivalence (#40593)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-01 15:03:34 +02:00
a543095c99 Fix typos (#40585)
Signed-off-by: cyy <cyyever@outlook.com>
2025-09-01 12:58:23 +00:00
8564e210ca 🚨 Remove Constrained Beam Search decoding strategy (#40518)
* Squashed remove-constrastive-search

* sweeep ready for tests

* testing...

* whoops

* ops

* tests fix

* tests green, changed handling of deprecated methods

* tests gone after green

* restore and deprecate beam objects

* restore and deprecate constraint objects

* fix ci

* review
2025-09-01 12:34:48 +00:00
564be6d895 Support batch size > 1 image-text inference (#36682)
* update make nested image list

* fix make flat list of images

* update type anno

* fix image_processing_smolvlm

* use first image

* add verbose comment

* fix images

* rollback

* fix ut

* Update image_processing_smolvlm.py

* Update image_processing_idefics3.py

* add tests and fix some processors

* fix copies

* fix after rebase

* make the test cover chat templates

* skip udop, no point in fixing it

* fix after rebase

* fix a few more tests

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: raushan <raushan@huggingface.co>
2025-09-01 12:26:07 +00:00
3bccb02616 🚨 Remove Group Beam Search decoding strategy (#40495)
* Squashed remove-constrastive-search

* testing that tests pass using hub

* fix

* aaand remove tests after all green!!
2025-09-01 13:42:48 +02:00
90953d5bc1 Fix custom generate relative imports (#40480) 2025-09-01 13:38:56 +02:00
2537ed4477 Update get_*_features methods + update doc snippets (#40555)
* siglip

* clip

* aimv2

* metaclip_2

* align

* align fixup

* altclip

* blip2 (make consistent)

* chineese clip

* clipseg

* flava

* groupvit

* owlv2

* owlvit

* vision_encoder

* clap

* x_clip

* fixup

* fix siglip2

* blip2

* fix blip2 tests (revert to original)

* fix docs
2025-09-01 12:37:43 +01:00
48ebae975e Fix llava image processor (#40588)
fix
2025-09-01 13:32:57 +02:00
db6821b79c Allow remi-or to run-slow (#40590)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-01 12:30:53 +02:00
6546f288a1 Fix CircleCI step passes in the case of pytest worker crash at test collection time (#40552)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-01 11:33:23 +02:00
cfed99d310 Fix test_eager_matches_sdpa_inference not run for CLIP (#40581)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-09-01 11:21:56 +02:00
1d742644c0 [qwen-vl] fix position ids (#40490)
* fix position ids

* fixup

* adjust tests since they are failing on main as well

* add a comment to make it clear
2025-09-01 09:10:41 +00:00
0b24507379 processor tests - use dummy videos (#40537)
* use dummy videos

* failing on main, new model merged had conflicts
2025-09-01 09:04:47 +00:00
b0db5a02f3 Set test_all_params_have_gradient=False for DeepseekV2ModelTest (#40566)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-30 22:46:31 +02:00
1363fceeec remove the redundant, unmaintained jieba and use rjieba instead (#40383)
* porting unmaintained jieba to rjieba

* Fix format

* replaced the line with rjieba instead of removing it

* cut_all is not included as a parameter; cut_all is a separate function in rjieba

* rev

* jieba remove installation

* Trigger tests

* Update tokenization_cpm.py

* Update tokenization_cpm_fast.py

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-08-30 13:28:52 +02:00
36fddebcee pin pytest-rerunfailures<16.0 (#40561)
pin pytest-rerunfailures<16.0

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-30 12:58:44 +02:00
2d3b8863e8 Fix collated reports upload filename (#40556) 2025-08-30 09:35:51 +02:00
ce48e9cac0 Dev version 2025-08-29 20:17:34 +02:00
155fd926d2 Fix GptOssModelTest::test_assisted_decoding_matches_greedy_search_1_same (#40551)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com>
2025-08-29 15:53:53 +00:00
1067577ad2 fix gpt-oss out shape (#40535)
* fix out shape

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* reset gpt-oss modeling

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix copies

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-29 15:20:33 +00:00
7efb4c87ca Flaky CI is annoying (#40543)
* mark flaky

* and the non batch one
2025-08-29 16:47:44 +02:00
828a27fd32 Fix gpt-oss rope warning (#40550)
* fix

* fix print

* rm

* real fix

* fix

* style
2025-08-29 14:40:33 +00:00
74a24217f5 Add bfloat16 support detection for MPS in is_torch_bf16_gpu_available() (#40458)
* Add bfloat16 support detection for MPS (Apple Silicon) in is_torch_bf16_gpu_available

bfloat16 seems to have been supported for a few years now in Metal and torch.mps.

Make sure to allow it and not throw on bf16 usage with "Your setup doesn't support bf16/gpu." from TrainingArguments.

* Check bf16 support for MPS using torch method

It seems the method actually exists: 5859edf113/torch/_dynamo/device_interface.py (L519)

It simply checks whether you are running macOS 14 or higher.

* Document Metal emulation for bf16 support

Add note about Metal emulation for bf16 support on M1/M2.

* Update bf16 support check for MPS backend

is_bf16_supported() is not exposed even though it is defined on MPSInterface; use the same approach as in the accelerate PR.

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-29 14:37:15 +00:00
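
A hedged sketch of such a check; the merged change follows accelerate's approach, while this version uses platform.mac_ver(), so treat it as an approximation rather than the actual code. Per the commit, bf16 on MPS requires macOS 14 or newer.

```python
# Approximate sketch (the merged check follows accelerate); per the commit,
# bf16 on MPS requires macOS 14 or newer.
import platform

import torch


def mps_bf16_available() -> bool:
    if not (torch.backends.mps.is_available() and torch.backends.mps.is_built()):
        return False
    major = platform.mac_ver()[0].split(".")[0]
    return major.isdigit() and int(major) >= 14


print(mps_bf16_available())
```
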
ffdd10fced Allow compression on meta device (#39039)
* disable gradient calculation for int weights

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

* Update src/transformers/quantizers/quantizer_compressed_tensors.py

Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>

* updated model procession before/after weight loading

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

* fix style

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

* reformat

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

* fix style

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

---------

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
2025-08-29 15:49:15 +02:00
f0e778112f Clean-up kernel loading and dispatch (#40542)
* clean

* clean imporrts

* fix imports

* oups

* more imports

* more imports

* more

* move it to integrations

* fix

* style

* fix doc
2025-08-29 14:14:38 +02:00
f68eb5f135 Redundant code removal (#40534)
redundant code
2025-08-29 11:30:23 +00:00
d888bd435d Fix typos (#40511)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-29 11:25:33 +00:00
11a6b95553 Oupsy (#40544)
fix bump!
2025-08-29 12:59:49 +02:00
b07144ac27 tokenizers bump tokenizers version (#40540)
* bump tokenizers version

* use rc0

* ?

* fml

* update
2025-08-29 12:34:41 +02:00
008c0ba8e2 Fix SeamlessM4Tv2ModelWithTextInputTest::test_retain_grad_hidden_states_attentions (#40532)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-28 23:30:59 +02:00
89ef1b6e0b Set test_all_params_have_gradient=False for HunYuanMoEV1ModelTest (#40530)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-28 22:32:51 +02:00
2e0f1d6a37 [Qwen Omni/VL] Fix fa tests (#40528)
* fix

* style

* flaky flaky

* flaky flaky

* oopsie, we need the out-of-place version for sure

* flaky flaky

* flaky flaky
2025-08-28 21:07:22 +02:00
68013c505a Improve Gemma3n model and tests (#39764) 2025-08-28 20:25:42 +02:00
ffcb344612 Lazy import torchcodec (#40526)
* lazy import

* parse version

* omg, we need to guard version parse as well
2025-08-28 18:57:14 +02:00
8c7f685079 Fix typo: 'casual' to 'causal' (#40374)
fix typo: 'casual' to 'causal'

Co-authored-by: demo <vamshika0210@gamil.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-08-28 09:17:37 -07:00
d61fab1549 skip some padding_matches_padding_free_with_position_ids for FA2 (#40521)
skip 1

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-28 17:20:07 +02:00
31336ab750 Fix mistral3 tests after "[Kosmos 2.5] Rename checkpoints" (#40523)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-28 16:29:54 +02:00
851b8f281d [kernels] If flash attention2 is not installed / fails to import (cc on our cluster) default to kernels (#40178)
* first step if flash not installed but you set to use it

* try importing

* now default to using it

* update our tests as well

* wow yesterday I was not awake

* fixup

* style

* lol the fix was very very simple

* `RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/kernels@main#egg=kernels
` for updated dockers

* push review comments

* fix

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-08-28 16:20:25 +02:00
de9e2d7a2e Skip some flex attn tests (#40519)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-28 15:43:38 +02:00
7e1aee4db6 [FA] Remaining Cleanup (#40424)
* fa cleanup

* flaky tests

* readd removed test and changeup comments to reflect the purpose

* flaky tests
2025-08-28 15:01:19 +02:00
893d89e5e6 [omni modality] support composite processor config (#38142)
* dump ugly option to check again tomorrow

* tiny update

* do not save as nested dict yet!

* fix and add tests

* fix dia audio tokenizers

* rename the flag and fix new model Evolla

* fix style

* address comments

* broken from a different PR

* fix saving layoutLM

* delete print

* delete!
2025-08-28 14:40:27 +02:00
becab2c601 Use the config for DynamicCache initialization in all modelings (#40420)
* update all

* remove the most horrible old code

* style
2025-08-28 14:32:30 +02:00
8acbbdcadf [serve] fix request_id unexpected (#40501)
* fix request-id in serving

* style

* fix
2025-08-28 14:16:28 +02:00
2300be3b41 sped up gguf tokenizer for nemotron test (#40509)
sped up tokenizer for nemotron test
2025-08-28 12:10:49 +00:00
b2b654afbf correct kes to keys. (#40489)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-08-28 12:00:22 +00:00
476cd7bab1 [vision] Improve keypoint-matching models docs (#40497)
fix options and add inference_mode
2025-08-28 12:31:21 +01:00
1499f9e356 [Kosmos 2.5] Rename checkpoints (#40338) 2025-08-28 13:30:41 +02:00
10ddfb0be5 Add more missing arguments (#40354)
Add missing arguments

Signed-off-by: cyy <cyyever@outlook.com>
2025-08-28 12:21:51 +02:00
d10603f701 Add Apertus (#39381)
* init swissai model

* AutoModelForCausalLM

* AutoModelForCausalLM mapping

* qk norm and post ln optional

* fix wrong shape of qk norm: megatron uses head_dim

* automodel fixes

* minor fix in forward

* fix rope validation to accept llama3 scaling

* `SwissAIForTokenClassification` support

* Align `SwissAI` to v4.52.4

* Align `SwissAI` to v4.53.1

* Init CUDA xIELU

* `SwissAI*`->`Apertus*`

* ci fix

* check_docstring ignore ApertusConfig

* Licensing and placeholder tests

* Placeholder doc

* XIELU syntax

* `_xielu_python` optimization

* Fix xIELU

* [tmp] `{beta,eps}` persistent=False
until {beta,eps} saved in checkpoint

* Modular `Apertus`

* CUDA xIELU logging

* ci fix

* ci fix

* ci fix

* Update license

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update tests/models/apertus/test_modeling_apertus.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* `.utils.import_utils.is_torchdynamo_compiling`

* `Apertus` class ordering

* `past_key_value{->s}`, `make fix-copies`

* ci fix

* Remove unused configuration parameters

* `{beta,eps}` saved in checkpoint

* `{beta,eps}` Temporarily on CPU

* Suggestions

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* ci fix

* remove fx_compatible (deprecated)

* remove `rotary_embedding_layer`

As the tests are written for a config without default scaling (which is not the case in Apertus) - besides, rope scaling is tested in other models so it's all safe.

* fully removing `Mask4DTestHard` class

Not needed (for now)

* switch to `dtype` instead of `torch_dtype`

Following this:
https://github.com/huggingface/transformers/pull/39782

* remove unused imports

* remove `cache_implementation="static"`

* +Apertus to `docs/source/en/_toctree.yml` for the doc builder

---------

Co-authored-by: Alexander Hagele <alexanderhagele@gmail.com>
Co-authored-by: dhia680 <garbayad@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Dhia Garbaya <84809366+dhia680@users.noreply.github.com>
2025-08-28 11:55:43 +02:00
f9b9a5e884 Update quantization overview for XPU (#40331)
* update xpu quantization overview

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix aqlm tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gguf support

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix gguf tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu gguf precision error

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* replace deprecated models

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix import org

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update xpu ggml tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert wrong change

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* xpu optimum-quanto goes green

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-08-28 09:52:59 +00:00
b824f4986f fix typo (#40484)
* fix typo

Signed-off-by: guochenxu <guochenxu@modelbest.cn>

* csm & qwen omni

Signed-off-by: guochenxu <guochenxu@modelbest.cn>

* format

Signed-off-by: guochenxu <guochenxu@modelbest.cn>

* Apply style fixes

* omni

Signed-off-by: guochenxu <guochenxu@modelbest.cn>

---------

Signed-off-by: guochenxu <guochenxu@modelbest.cn>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-08-28 08:31:25 +00:00
c9ff166718 Various AMD expectations (#40510)
* AMD expectations for qwen2

* Added more detailed expectations to smolvlm

* Added AMD expectations to TableTransformer

* Style
2025-08-28 10:15:21 +02:00
721d4aee81 Include machine type in collated reports filename (#40514) 2025-08-28 09:28:12 +02:00
98289c5546 [modular] Classes can now be defined and referenced in arbitrary order (without bringing unwanted dependencies) (#40507)
* remove future class from dependency graph

* convert all
2025-08-27 23:06:10 +02:00
e3d8fd730e docs(pixtral): Update Pixtral model card to new format (#40442)
* docs(pixtral): Update Pixtral model card to new format

* docs(pixtral): Change cuda into auto for device_map

* docs(pixtral): Apply suggestions from review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(pixtral): Apply suggestions from review, changing mistral-community into Mistral AI

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(pixtral): Apply suggestions from review [!TIP] part

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(pixtral): Finalize model card with tested code examples

This commit finalizes the update for the Pixtral model card.

* Fix the hfoption by the right one

* @BryanBradfo docs(pixtral): Changing the redirection of bitsandbytes

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(pixtral): Add of ` to highlight the tokens

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(pixtral): Move image block per final review

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-27 11:38:51 -07:00
821384d5d4 Fix the CI workflow of merge to main (#40503)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-27 18:35:12 +02:00
304225aa15 Collated reports: no need to upload artifact (#40502)
No need to upload collated reports as gh artifact
2025-08-27 18:31:55 +02:00
3c343c6601 [Whisper] Add rocm expected results to certain tests (#40482)
* Add rocm expected results to certain tests

* Specify rocm version in expectations so we know origin. Improved var names

* Update test var names
2025-08-27 16:19:11 +00:00
6350636964 Fix qwen2_moe tests (#40494)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-27 16:22:04 +02:00
52aaa3f500 [EfficientLoFTR] dynamic image size support (#40329)
* fix: reverted efficientloftr embeddings computation to inference time with lru cache

* fix: added dtype and device for torch ones and zeros creation

* fix: fixed embed height and width computation with aggregation

* fix: make style

* fix error message

* fix fa2 tests

---------

Co-authored-by: qubvel <qubvel@gmail.com>
2025-08-27 15:05:08 +01:00
ed5dd2999c [ESM] support attention API (#40370)
* ESM supports attention API

* supports flags

* fix tests

* fix copies

* another fixup needed after fixing tests

* fix tests and make sure Evolla copied everything

* fix

* order

* forgot about "is_causal" for fa2

* cross attention can't be causal
2025-08-27 15:39:04 +02:00
8b804311ba [modular] Remove ambiguity in all calls to parent class methods + fix dependency graph (#40456)
* fix in modular

* remove leftover print

* fix everything except when it's in assignment

* fix assignment as well

* more general

* better

* better

* better comment

* docstring

* cleaner

* remove base

* doc
2025-08-27 14:51:28 +02:00
a3afebbbbe [modular] Use multi-processing + fix model import issue (#40481)
* add mp and simplify a bit

* improve

* fix

* fix imports

* nit
2025-08-27 14:51:12 +02:00
75d6f17de6 Validate GptOssConfig rope config after it's fully initialized (#40474)
* Validate GptOssConfig rope config after it's fully initialized

Fixes #40461

* Remove whitespaces
2025-08-27 10:16:58 +01:00
80f4c0c6a0 CI when PR merged to main (#40451)
* up

* up

* up

* up

* up

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-27 10:56:18 +02:00
ff8b88a948 Fix nightly torch CI (#40469)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-26 22:02:15 +02:00
74ad608a2b Not to shock AMD team by the cancelled workflow run notification ❤️ 💖 (#40467) 2025-08-26 20:53:24 +02:00
c8c7623f20 Update SegFormer model card (#40417)
* Update SegFormer model card

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update the segformer model card

* Remove quantization example

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-26 08:27:25 -07:00
78f32c3917 [pipeline] Add Keypoint Matching pipeline (#39970)
* feat: keypoint-matcher pipeline

* docs: added keypoint-matcher pipeline in docs

* fix: added missing statements for repo consistency

* docs: updated SuperGlue, LightGlue and EfficientLoFTR docs

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* test: fixed run_pipeline_test

* update pipeline typing and docs

* update tests

* update docs snippets

* Fix import error

* fix: pipeline init

* pt framework

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-26 15:26:57 +01:00
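
A hedged usage sketch for the new pipeline; the task string, checkpoint, and input format below are assumptions based on the PR description rather than verified against the merged API.

```python
# Task string, checkpoint, and input format are assumptions from the PR text,
# not verified against the merged API.
from transformers import pipeline

matcher = pipeline("keypoint-matching", model="magic-leap-community/superglue_outdoor")
result = matcher(["view_a.jpg", "view_b.jpg"])  # one image pair (assumed format)
print(result)  # expected: matched keypoint coordinates with matching scores
```
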
6451294f6f [RoPE] explicit factor > implicit factor in YaRN (#40320)
explicit factor > implicit factor
2025-08-26 14:58:28 +01:00
5a8ba87ecf [fast_image_processor] fix image normalization for resize (#40436) 2025-08-26 13:49:51 +00:00
0ce6709e70 deci gguf support (#38669)
* deci gguf support

* make style

* tests for deci

* try except removed

* style

* try except removed
2025-08-26 13:43:17 +00:00
263d06fedc Fix extra template loading (#40455)
* Fix extra template loading

* Reformat

* Trigger tests
2025-08-26 14:01:01 +01:00
58cebc848b flash_paged: s_aux may not exist (#40434)
Some implementations (e.g.,
https://huggingface.co/kernels-community/vllm-flash-attn3) support an
`s_aux` arg for attention sinks, but others
(https://huggingface.co/kernels-community/flash-attn) do not. If s_aux
is present in the kwargs, we forward it; otherwise we don't.

The user will still get an error if they use a model like gpt-oss-20b
with an implementation that does not support `s_aux`, but models that
don't use it won't error out. For example, [this is currently
failing](399cd5c04b/examples/pytorch/continuous_batching.py (L16))
because we are sending `s_aux: None` in the dict.
2025-08-26 13:15:59 +02:00
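
A minimal sketch of the guard described above; the wrapper name and signature are illustrative, not the actual flash_paged code.

```python
# Illustrative wrapper (names assumed): forward s_aux only when it was actually
# provided, so kernels whose signature lacks the attention-sink arg never
# receive s_aux=None.
def call_paged_attention(attn_fn, q, k, v, **kwargs):
    extra = {}
    s_aux = kwargs.get("s_aux")
    if s_aux is not None:
        extra["s_aux"] = s_aux  # only kernels that accept it will ever see it
    return attn_fn(q, k, v, **extra)
```
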
34108a2230 Continuous batching refactor (#40426)
* Rework of the CB example

* Further rework of CB example

* Refactor PA cache, slice on tokens, add debug prints -- WIP

* Slice cache -- WIP

* Added a mechanism to check batched outputs in CB script

* Less logging, debug flag for slice, !better reset! -- WIP

* QOL and safety margins

* Refactor and style

* Better saving of cb example

* Fix

* Fixes and QOL

* More information about metrics

* Further logging

* Style

* Licenses

* Removed some comments

* Add a slice input flag

* Fix in example

* Added back some open-telemetry deps

* Removed some aux function

* Added FA2 option to example script

* Fixed math (all of it)

* Added a simple example

* Renamed core to classes

* Made allocation of attention mask optional

* Style
2025-08-26 13:01:42 +02:00
49e168ff08 🚨 Remove Contrastive Search decoding strategy (#40428)
* delete go brrr

* fix tests

* review
2025-08-26 12:31:46 +02:00
b8184b7ce9 Make cache_config not mandatory (#40316)
* Relaxed assumptions on cache_config

* Review compliance

* Style

* Styyyle

* Removed default and added args

* Rebase mishapfix

* Propagate args to TorchExportableModuleForDecoderOnlyLM

* Fix the test I wanted  fixed in this PR

* Added some AMD expectation related to cache tests
2025-08-26 12:06:17 +02:00
32fcc24667 rename get_cuda_warm_up_factor to get_accelerator_warm_up_factor (#40363)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-08-26 09:56:35 +00:00
f690a2a1e0 [video processors] decode only sampled videos -> less RAM and faster processing (#39600)
* draft update two models for now

* batch update all VLMs first

* update some more image processors

* update

* fix a few tests

* just make CI green for now

* fix copies

* update once more

* update

* unskip the test

* fix these two

* fix torchcodec audio loading

* maybe

* yay, i fixed torchcodec installation and now can actually test it

* fix copies deepseek

* make sure the metadata is returned when users request it

* add docs

* update

* fixup

* Update src/transformers/audio_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/glm4v/video_processing_glm4v.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* update

* what if we set some metadata attr to `None`

* fix CI

* fix one test

* fix 4 channel test

* fix glm timestamps

* rebase gone wrong

* raise warning once

* fixup

* typo

* fix copies

* fix smolvlm test

* this is why torch's official benchmark was faster, set threads to `0`

* Apply style fixes

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-08-26 11:38:02 +02:00
64ae6e6b1d fix qwen25-vl grad acc (#40333)
* fix qwen25-vl grad acc

* fix Qwen2_5_VLForConditionalGeneration for accepts_loss_kwargs

* fix ci

* fix ci

* fix typo

* fix CI
2025-08-26 09:30:06 +00:00
6d2bb1e04d [Trainer] accelerate contextparallel support in trainer (#40205)
* initial context_parallel_size support in trainer

* For context parallelism, use AVG instead of SUM to avoid over-accounting tokens

* use parallelism_config.cp_enabled

* add parallelism_config to trainer state

* warn when auto-enabling FSDP

* fix some reviews

* WIP: somewhat matching loss

* Feat: add back nested_gather

* Feat: cleanup

* Fix: raise on non-sdpa attn

* remove context_parallel_size from TrainingArguments

* if we have parallelism_config, we defer to get_state_dict from accelerate

* fix from review

* Feat: add parallelism config support

* Chore: revert some unwanted formatting changes

* Fix: check None

* Check none 2

* Fix: remove duplicate import

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Fin

* require accelerate 1.10.1 and higher

---------

Co-authored-by: S1ro1 <matej.sirovatka@gmail.com>
Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-26 09:28:48 +00:00
63caaea1fb Refactor ViT-like models (#39816)
* refactor vit

* fix

* fixup

* turn off FX tests

* AST

* deit

* dinov2

* dinov2_with_registers

* dpt

* depth anything (nit)

* depth pro (nit)

* ijepa

* ijepa (modular)

* prompt_depth_anything (nit)

* vilt (nit)

* zoedepth (nit)

* videomae

* vit_mae

* vit_msn

* vivit

* yolos

* eomt

* vitpose

* update auto backbone

* disable `fx` and export tests (dinov2, dpt, ijepa, vit, vitpose)

* fix kwargs for backbone

* fix

* convnext

* fixup

* update convnext layernorm

* fix-copies layer_norm

* convnextv2

* explicit output_hidden_states for models with backbones

* explicit hidden states collection for dinov2

* tests fixed

* fix DPT as well

* fix dinov2 with registers

* add comment
2025-08-26 11:14:06 +02:00
922e65b3fc Fix non FA2 tests after FA2 installed in CI docker image (#40430)
* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-26 10:36:50 +02:00
e68146fbe7 Fix collated reports model name entry (#40441) 2025-08-25 20:36:01 +00:00
8ce633cc75 InternVL MI325 test expectations (#40387)
* Adjust ROCm expectations

* MI355

---------

Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com>
2025-08-25 22:00:35 +02:00
7637d298b3 Fix collated reports uploading (#40440) 2025-08-25 21:49:59 +02:00
fa59cf9c9f Fix https://github.com/huggingface/transformers/issues/40292 (#40439)
* Fix https://github.com/huggingface/transformers/issues/40292

* Trigger tests

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2025-08-25 20:12:57 +01:00
f0e87b436d Fix collated reports model directory traversal (#40437)
Fix model dir traversal
2025-08-25 18:01:58 +00:00
ef406902bf Gemma3 text fixes: Add expectations for MI325 (#40384)
* Add expectations for MI325

* Ruff

* Adjust CUDA expectations as well

* Another attempt for CUDA expectations
2025-08-25 19:57:50 +02:00
c81723d31b 🌐 [i18n-KO] Translated models.md to Korean (#39518)
* docs: ko: models.md

* feat: nmt draft

* fix: manual edits

* Resolved _toctree.yaml conflict during merge from main

* Apply suggestions from code review

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Apply suggestions from code review

* fix: update toctree

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-25 09:17:08 -07:00
6b5eab70e4 Remove working-dir from collated reports job (#40435) 2025-08-25 18:14:35 +02:00
1763ef2951 [docs] remove last references to transformers TF classes/methods (#40429)
* halfway through tasks

* complete

* Update utils/check_docstrings.py
2025-08-25 16:30:59 +01:00
eac4f00bdf Fix typo and improve GPU kernel check error message in MXFP4 quantization (#40349) (#40408)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-08-25 15:21:55 +00:00
d8f2edcc46 Add tokenizer_kwargs argument to the text generation pipeline (#40364)
* Add `tokenizer_kwargs`  arg to text generation pipeline.

* chore: re-run CI

* Rename `tokenizer_kwargs` to `tokenizer_encode_kwargs` for text generation pipeline

* Fix `tokenizer_encode_kwargs` doc string.

* Fix note related to `tokenizer_kwargs` in text generation pipeline

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-08-25 15:21:19 +00:00
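A usage sketch, assuming the `tokenizer_encode_kwargs` name this PR settled on; the exact set of supported encode options depends on the tokenizer:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
# Encode-time options are forwarded to the tokenizer when the prompt is
# tokenized, separately from the generation kwargs.
out = generator(
    "Hello world",
    max_new_tokens=20,
    tokenizer_encode_kwargs={"add_special_tokens": False},
)
print(out[0]["generated_text"])
```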
1a35d07f56 Update collated reports working directory and --path (#40433) 2025-08-25 15:18:26 +00:00
399cd5c04b Fix modular for modernbert-decoder (#40431)
* fix the modular

* CI
2025-08-25 16:50:49 +02:00
ea8d9c8f06 🚨 Remove DoLa decoding strategy (#40082)
* remove dola generation strategy

* add fast test
2025-08-25 16:33:27 +02:00
6bf6f8490c [Mxfp4] Add a way to save with a quantization method (#40176)
* add a test

* tempdir

* fix import issue

* wow I am tired

* properly init

* i am not super familiar with quantizer api :|

* set to TRUE for now

* full support

* push current changes

* will clean this later but the imports are a shitshow here

* this correctly saves the block and scales but forward seems broken

* quantize was not correct

* fix storage

* why were biases even included

* finally!

* style

* fix style

* remove print

* lazy import

* up

* not sure what happens this works now?

* holy moly it was not so far

* okay this seems to work!

* workings!!!

* allow save_pretrained to create PR

* Apply suggestions from code review

* fixup

* add dequantize false as well

* working new

* fix

* rm swizzle and unswizzle during saving

* rm print

* Update src/transformers/modeling_utils.py

* fix

* style

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
2025-08-25 16:27:19 +02:00
04c2bae3a8 Fix label smoothing incompatibility with multi-label classification (#40296)
* Fix label smoothing incompatibility with multi-label classification (#40258)

* Improve label smoothing multi-label check based on reviewer feedback

- Move check from LabelSmoother to Trainer.__init__() for better architecture
- Use model.config.problem_type instead of tensor inference for robustness
- Warn and disable smoothing instead of raising error for better UX
- Update test to verify warning behavior
2025-08-25 14:23:31 +00:00
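An illustrative sketch of the check described above (not the exact Trainer code); per the commit body, `problem_type` is read from the model config rather than inferred from label tensors:

```python
import warnings

def maybe_disable_label_smoothing(model, args):
    # Multi-label heads train with BCE on multi-hot targets, where the
    # usual label-smoothing formulation does not apply.
    problem_type = getattr(model.config, "problem_type", None)
    if args.label_smoothing_factor != 0 and problem_type == "multi_label_classification":
        warnings.warn(
            "Label smoothing is incompatible with multi-label classification; disabling it."
        )
        args.label_smoothing_factor = 0.0
```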
3b5b9f6518 Fix processing tests (#40379)
* fix tests

* skip failing test in generation as well

* grounding dino was overwritten

* one more overwritten code

* clear comment
2025-08-25 14:50:54 +02:00
a0a37b3250 Gpt oss optim (#40304)
* enable fast index selecting

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update model

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix gpt-oss tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix check tensor

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-08-25 14:36:33 +02:00
d73181b3fc Fix UnboundLocalError in WER metric computation (#40402)
Renamed wer metric variable to wer_metric to avoid naming conflict
with local variable assignment in compute_metrics function.

Co-authored-by: pranam-gf <pranam@goodfin.com>
2025-08-25 12:02:22 +00:00
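A minimal reconstruction of the shadowing bug described here: with the metric object also named `wer`, the local assignment made Python treat `wer` as local for the whole function, so the `.compute` call raised `UnboundLocalError` before the assignment ran. The variable names below follow the commit message:

```python
import evaluate

wer_metric = evaluate.load("wer")  # previously named `wer`

def compute_metrics(pred_str, label_str):
    # Safe now: the metric object and the local result no longer share a name.
    wer = wer_metric.compute(predictions=pred_str, references=label_str)
    return {"wer": wer}
```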
11e12a715a Fix typo: 'seperator' to 'separator' in variable names (#40389)
Fixed 4 instances of the typo "seperator" → "separator" in variable names:
- 2 instances in src/transformers/models/shieldgemma2/convert_shieldgemma2_weights_orbax_to_hf.py
- 2 instances in src/transformers/models/gemma3/convert_gemma3_weights_orbax_to_hf.py

These typos were in variable names used for parsing path components in weight conversion scripts.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-25 11:56:30 +00:00
40299134a8 Fix CI (hunyuan moe does not support fullgraph) (#40423)
fix flag
2025-08-25 12:01:28 +02:00
a2b37bfd58 Fix typo: 'casual' -> 'causal' in code and documentation (#40371) (#40407) 2025-08-25 09:32:15 +00:00
0031c044f8 [docs] flax/jax purge (#40372)
flax/jax purge
2025-08-25 10:25:00 +01:00
14b89fed24 fix to accept cumulative_seqlens from TransformersKwargs in FA (#40194)
* fix to the typings which are unmatched to FA function signature

cumulative_seqlens_q/k -> cu_seq_lens_q/k:
- in the FlashAttentionKwargs in modeling_flash_attention_utils
- in the TransformersKwargs in generic
- in the PagedAttentionArgs in continuous_batching

It is **BC**, because they are created in `ContinuousBatchProcessor.setup_static_tensors:L762`, used in `ContinuousBatchingManager._model_forward:L1233` and destroyed with `ContinuousBatchProcessor`

* format changes by ruff

* Update src/transformers/integrations/flash_paged.py

unused function arg in `PagedAttentionCache.update`

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* revert continuous_batching signature, which is more meaningful

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-08-25 11:00:13 +02:00
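A sketch of the aligned names, with fields abridged: the typed kwargs now use the same `cu_seq_lens_q`/`cu_seq_lens_k` names as the flash-attention function signature, so values passed through `TransformersKwargs` reach FA without renaming:

```python
from typing import Optional

import torch
from typing_extensions import TypedDict

class FlashAttentionKwargs(TypedDict, total=False):
    cu_seq_lens_q: Optional[torch.LongTensor]  # cumulative query lengths
    cu_seq_lens_k: Optional[torch.LongTensor]  # cumulative key lengths
    max_length_q: Optional[int]
    max_length_k: Optional[int]
```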
ba095d387d 🧹 🧹 🧹 Get set decoder cleanup (#39509)
* simplify common get/set

* remove some noise

* change some 5 years old modeling utils

* update examples

* fix copies

* revert some changes

* fixes, gah

* format

* move to Mixin

* remove smolvlm specific require grad

* skip

* force defaults

* remodularise some stuff

* remodularise more stuff

* add safety for audio models

* style

* have a correct fallback, you daft donkey

* remove this argh

* change heuristic for audio models

* fixup

* revert

* this works

* this should be explicit

* fix Nth ESM exception

* tryout decoder

* this as well

* revert again

* 🧠

* aaah ESM has two modelings aaah

* broom broom

* format

* wrong copies

* copies

* modular cleanups

* format

* modularities

* wrong merge fix

* seriously

* align with new model

* new model
2025-08-25 10:57:56 +02:00
2c55c7fc94 Reactivate a lot of tests skipped for no reason anymore (#40378)
* reactivate all the tests

* some tests still failing
2025-08-25 10:44:43 +02:00
4f9b4e62bc Run FA2 tests in CI (#40397)
up

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-23 12:30:18 +02:00
28ca27cb2b HF papers in doc (#40381)
* HF papers

* clean

* Update src/transformers/models/gemma3n/configuration_gemma3n.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-22 15:07:08 -07:00
7d88f57fc6 Update README_zh-hans.md (#40380)
Fix a typo.
2025-08-22 18:22:26 +00:00
29ddcacea3 Rework the Cache documentation (#40373)
* start working the doc

* remove gemma2

* review
2025-08-22 17:06:28 +02:00
dab66f15a1 Chat Template Doc Fixes (#40173)
* draft commit

* draft commit

* Fixup chat_extras too

* Update conversations.md

* Update the toctree and titles

* Update the writing guide!

* Use @zucchini-nlp's suggestion

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conversations.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-22 15:48:33 +01:00
0a21e870c7 Bug Fix: Dynamically set return_lse flag in FlexAttention (#40352)
* bug fix - return_lse dynamically set

* addressed compatibility with return type - flex_attention_forward

* rename variables

* revert changes to commits
2025-08-22 13:49:26 +00:00
894b2d84b6 Add GptOssForTokenClassification for GPT-OSS models (#40190)
* Add GptOssForTokenClassification for GPT-OSS models

* After run make fixup
2025-08-22 15:14:46 +02:00
56d68c6706 Addiing ByteDance Seed Seed-OSS (#40272)
add seed oss
2025-08-22 14:54:28 +02:00
8a6908c10d fix(example): align parameter names with the latest function definition for gdino (#40369) 2025-08-22 12:27:58 +00:00
7db228a92a [configuration] allow to overwrite kwargs from subconfigs (#40241)
allow to overwrite kwargs from subconfigs
2025-08-22 13:31:25 +02:00
19ffe0219d [processor] move commonalities to mixin (#40339)
* move commonalities to mixin

* revert - unrelated

* fix copies

* fix style

* comments
2025-08-22 13:04:43 +02:00
d8f6d3790a ⚠️⚠️ Use dtype instead of torch_dtype everywhere! (#39782)
* update everywhere

* style

* pipelines

* switch it everywhere in tests

* switch it everywhere in docs

* switch in converters everywhere

* update in examples

* update in model docstrings

* style

* warnings

* style

* Update configuration_utils.py

* fix

* Update configuration_utils.py

* fixes and add first test

* add pipeline tests

* Update test_pipelines_common.py

* add config test

* Update test_modeling_common.py

* add new ones

* post rebase

* add new

* post rebase adds
2025-08-22 12:34:16 +02:00
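Usage after this change; per the bullet list above, the old `torch_dtype` name keeps working for now with a deprecation warning:

```python
import torch
from transformers import AutoModelForCausalLM

# `dtype` replaces the old `torch_dtype` argument everywhere.
model = AutoModelForCausalLM.from_pretrained("gpt2", dtype=torch.bfloat16)
print(model.dtype)  # torch.bfloat16
```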
9c25820978 [pipelines] add support to skip_special_tokens in the main text generation pipelines (#40356)
* add support to skip_special_tokens in pipelines

* add test

* rm redundant
2025-08-22 10:12:46 +00:00
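A usage sketch, assuming `skip_special_tokens` is accepted directly at call time as this PR describes:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
# Special tokens (e.g. EOS) are stripped from the decoded output.
out = generator("Hello", max_new_tokens=10, skip_special_tokens=True)
print(out[0]["generated_text"])
```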
5c40e7a225 Change multimodal data links to HF hub (#40309)
change multimodal data links to HF hub
2025-08-22 11:50:04 +02:00
e018b77c89 wav2vec2 fixes (#40341)
* Changed datasets to avoid a datasets error

* Changed back split to test
2025-08-22 11:32:29 +02:00
d7fe3111ff Fix idefics3 vision embeddings indices dtype (#40360)
fix idefics3 vision embeddings

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-22 11:10:45 +02:00
cf487cdf1f HunYuan opensource (#39606)
* merge opensource_hunyuan

* add head_dim

* fix assertion error

* fix seen_tokens

* ready_for_upstream (merge request !17)

Squash merge branch 'ready_for_upstream' into 'main'

* fix configuration type&docstring
* fix style

* ready_for_upstream (merge request !18)

Squash merge branch 'ready_for_upstream' into 'main'
* add doc
* fix testcode
* fix configuration type&docstring

* rename base model

* remove assert

* update

* remove tiktoken

* update

* fix moe and code style (#3)

* update

* fix format

* update

* revert makefile

* fix moe config

* fix numel()

* remove prepare_inputs_for_generation

* fix kv_seq_len

* add docs/toctree

* remove unused parameter & add licence

* add licence

* remove unused parameter

* fix code

* dense modular

update import

fix

fix

use mistralmodel

fix qknorm

add sliding_window

make style

fix

dense done

hunyuan moe

fix import

fix modular

fixup

fixup

* update model path

* fix mlp_bias

* fix modular

* Fix modeling (#5)

* fix attention

* use llamamodel

* fix code

* Fix qk (#6)

* fix qk_norm

* fix

* fix modular

* Fix moe (#7)

* fix some moe code

* fix einsum

* try top1

* use top1

* Fix rotary (#8)

* fix rotary

* fix modeling

* fix modular

* fix testcode

* remove A13B unit test

* Fix moe v1 (#9)

fix moe & gate

* Fix gate norm (#10)

* add norm_topk_prob

* Fix testcase (#11)

* fix&skip test

* Fix testcase (#12)


* skip testcase

* Fix norm topk (#13)

* hardcode norm_topk_prob

* fix testcase

---------

Co-authored-by: pridejcyang <pridejcyang@tencent.com>
Co-authored-by: Mingji Han <mingjihan@tencent.com>
2025-08-22 07:59:58 +00:00
8365f70e92 DOCS: Clarification on the use of label_names as an argument to TrainingArguments (#40353)
* Update trainer.md

* Update trainer.md

Removed the detail about label_names argument usage from the tip/ warning section

* Update training_args.py

Added the label_names usage clarification in the docstring

* Update trainer.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-21 17:19:04 -07:00
7c1169e21f [4/N]more docs to device agnostic (#40355)
* more docs to device agnostic

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* 1

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* 2

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Update vitpose.md

* Update camembert.md

* Update camembert.md

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-08-21 10:22:26 -07:00
9568b506ed [generate] handle support for cache classes when num enc layers != num dec layers (#40277)
* handle support for cache classes when num enc layers != num dec layers

* handle overwrites

* one more corner case

* Update src/transformers/generation/utils.py

* Update src/transformers/generation/utils.py

* Apply suggestions from code review

* handle corner case :o
2025-08-21 17:35:18 +01:00
7f38068ae0 Qwen2.5-VL test fixes for ROCm (#40308) 2025-08-21 18:13:07 +02:00
cb1df4d26a [FA] Fix some model tests (#40350)
* fix

* cleanup, revert aimv2 fa changes

* fix aria

* i searched a long time but the cross dependency is for the recent models so...

* this was something... evolla

* fix modernbert decoder + make fa test more robust

* nit
2025-08-21 18:08:21 +02:00
f46f29dd7c Remove more PyTorch 2.2 compatible code (#40337)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-21 15:19:53 +00:00
128f42d370 [detection] use consistent dtype for Conditional and DAB DETR positional embeddings (#40300)
fix: use consistent dtype for sine positional embeddings
2025-08-21 15:49:56 +01:00
2121d09239 [serve] add cors warnings (#40112)
* add cors warnings

* Update src/transformers/commands/serving.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Update src/transformers/commands/serving.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review

* make fixup

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-08-21 14:32:36 +01:00
b40b834ab1 Clean up XCodec and other codecs (#40348)
* Clean up xcodec addition.

* Clean up config.

* Switch to fixtures test.

* Small stuff.

* Polish XCodec and standardize across codecs.

* Update src/transformers/models/xcodec/modeling_xcodec.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Format and fix test.

* Update tol.

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-08-21 15:32:00 +02:00
75aa7c7252 [ModernBert] Prevent the attention mask from being None in ModernBertForSequenceClassification (#35991)
* [ModernBert] Prevent the attention mask from being None in ModernBertForSequenceClassification

* fix the modular conversion
2025-08-21 15:16:03 +02:00
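An illustrative default (not the exact ModernBERT code): when no mask is supplied, attend over every position so downstream mask-dependent code never sees `None`:

```python
import torch

def ensure_attention_mask(input_ids, attention_mask=None):
    # Fall back to a full mask of ones: every token attends to every token.
    if attention_mask is None:
        attention_mask = torch.ones_like(input_ids)
    return attention_mask
```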
04b751f07d Fix attention vizualizer (#40285)
* make visualizer rely on create causal mask

* format

* fixup

* fixup

* read token

* read token, duh

* what is up with that token

* small tests?

* adjust

* try with flush

* normalize for ANSI

* buffer shenanigans
2025-08-21 13:13:35 +00:00
1e1db12304 (small) fix conditional for input_ids and input_embeds in marian (#40045)
* (small) fix conditional for input_ids and input_embeds in marian

* address comment
2025-08-21 15:13:14 +02:00
7f2f53424e Update test_spm_converter_bytefallback_warning (#40284)
fff

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-21 14:09:28 +02:00
11a49dd9e3 T5 test and target device fixes (#40313)
* Fix cache setup related issues

* Fix target-device-related issues

* Ruff

* Address review comments
2025-08-21 14:07:29 +02:00
c4513a9fe6 Fix links in Glm4vMoe configuration classes to point to the correct H… (#40310)
* Fix links in Glm4vMoe configuration classes to point to the correct Hugging Face model repository

* run fixup to update links in Glm4vMoe configuration classes to point to the correct Hugging Face model repository
2025-08-21 11:42:53 +00:00
c7e6f9a485 Fix an infinite loop bug in recursive search of relative imports (#40326)
Fix bug in recursive search of relative imports
2025-08-21 11:39:43 +00:00
e95441bdb5 add type hints (#40319)
* add basic type hints to import module

* run make fixup

* remove optional

* fixes

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-08-21 12:19:59 +01:00
5c88d8fbcc Fix: Only call Trainer.align_special_tokens if model has "config" attribute (#40322)
* Only call Trainer.align_special_tokens if model has "config" attribute

* Add efficient test for training a model without model.config

* Reformat
2025-08-21 12:06:42 +01:00
c031f6f994 [docs] remove TF references from /en/model_doc (#40344)
* models up to F

* models up to M

* all models
2025-08-21 11:53:21 +01:00
7b060e5eb7 Add missing arguments to class constructors (#40068)
* Add missing arguments

Signed-off-by: cyy <cyyever@outlook.com>

* Fix typos

Signed-off-by: cyy <cyyever@outlook.com>

* More fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-08-21 10:22:38 +00:00
6ad7f29461 Fix deprecation warning version (#40343)
fix
2025-08-21 12:18:23 +02:00
adf84aec21 Add DeepseekV3ForSequenceClassification for Deepseek V3 models (#40200)
* Add Sequence Classification Support for Deepseek v3 model DeepseekV3ForSequenceClassification

* After run make fixup
2025-08-21 12:01:33 +02:00
1e2e28f3c8 Change Qwen2RMSNorm to RMSNorm from PyTorch (#40066)
* Unify Qwen2RMSNorm definitions and use RMSNorm from PyTorch

Signed-off-by: cyy <cyyever@outlook.com>

* subclass RMSNorm

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-08-21 11:58:35 +02:00
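A sketch of the "subclass RMSNorm" approach from the commit body: keep the existing class name for backward compatibility but inherit PyTorch's fused `nn.RMSNorm` (available since torch 2.4). The `eps` default here is an assumption:

```python
import torch
from torch import nn

class Qwen2RMSNorm(nn.RMSNorm):
    def __init__(self, hidden_size, eps=1e-6):
        super().__init__(hidden_size, eps=eps)

norm = Qwen2RMSNorm(8)
print(norm(torch.randn(2, 4, 8)).shape)  # torch.Size([2, 4, 8])
```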
022af24fcc Fix qwen-omni processor text only mode (#40336)
* Fix qwen-omni processor text only mode

* remove try except

---------

Co-authored-by: yuekaiz <yuekaiz@mgmt1-login.cm.cluster>
2025-08-21 11:57:32 +02:00
c99ed492c7 [docs] remove flax references from /en/model_doc (#40311)
* 1st commit

* all models up to D

* all models up to G

* all models up to M

* all remaining models
2025-08-21 10:52:54 +01:00
c2e3cc24e0 Fix chunked attention mask with left-padding (#40324)
* add fix

* add test

* raise proper warning for older versions

* fix

* fix and add 2nd test

* fix for flex and torch 2.5
2025-08-21 10:52:49 +02:00
242bb2cafc One cache class to rule them all (#40276)
* remove all classes

* fix generate

* start replacing everywhere

* finish removing everywhere

* typo

* typo

* fix

* typo

* remove num_layers=1

* CI

* fix all docstrings

* review

* style
2025-08-20 19:36:11 +02:00
1054494dd6 Update notification service amd_daily_ci_workflows definition (#40314) 2025-08-20 17:49:46 +02:00
139cd91713 Fix: Apply get_placeholder_mask in Ovis2 (#40280)
* Refactor special image mask

* Refactor get_placeholder_mask method

* Revert "Refactor special image mask"

This reverts commit 9eb1828ae930329656d6f323a510c5e6033e1f85.

* Fix

* Revert "Refactor get_placeholder_mask method"

This reverts commit 07aad6484bb08d6351d5b605e9db574d28edcd15.
2025-08-20 17:12:10 +02:00
5d906740d2 Update CI with nightly torch workflow file (#40306)
* fix nightly ci

* Apply suggestions from code review

Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com>
2025-08-20 16:59:00 +02:00
4977ec2ae8 [GPT OSS] Refactor the tests as it was not properly checking the outputs (#40288)
* it was long due!

* use the official kernel

* more permissive

* update the kernel as well

* mmm should it be this?

* up pu

* fixup

* Update test_modeling_gpt_oss.py

* style

* start with 20b
2025-08-20 16:47:41 +02:00
3b7230124b No more natten (#40287)
get rid off natten

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-20 16:10:15 +02:00
2df0c323cb byebye torch 2.1 (#40317)
* Bump minimum torch version to 2.2

* Remove is_torch_greater_or_equal_than_2_2

* update versions table

* Deprecate is_torch_sdpa_available (except for backward compat), remove require_torch_sdpa
2025-08-20 15:03:46 +01:00
c50f140be2 Add back _tp_plan attribute (#39944)
* Update modeling_utils.py

* make sure we update with the module's plan

* use public api

* oups

* update

* fix failing test

* Update src/transformers/integrations/tensor_parallel.py

* Update src/transformers/integrations/tensor_parallel.py

* fix

* make the API more friendly!

* fix tests

* fix styling

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-08-20 15:29:55 +02:00
a97213d131 Qwen2.5-Omni test fixes (#40307)
Updated expectations, and mp tests
2025-08-20 14:48:30 +02:00
ca543f822f Add support for Florence-2 (#38188)
* init

* add modular

* fixup

* update configuration

* add processing file

* update auto files

* update

* update modular

* green setup_and_quality ci

* it works

* fix some tests

* commit florence2

* update test

* make test cases done - 16 left

* style

* fix few test cases

* fix some tests

* fix init test

* update florence2 vision style

* hope is green

* fix init test

* fix init

* update modular

* refactor vision module

* fix: channel attention use dynamic scale

* update modular

* update

* update attention mask

* update

* fix naming

* Update src/transformers/models/florence2/processing_florence2.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* spatial block works

* more beautiful

* more more beautiful

* merge main

* merge main and fixup

* fix typing hint

* update modeling

* fix eager matches sdpa

* fix style

* fix compile test - all green

* remove florence2 language

* remove Florence2LanguageModel things

* fix style

* update florence2 model

* override prepare encoder_decoder for generation

* add weight conversion script

* rewrite channel attention to use sdpa

* eliminate 1 transpose op

* support fa2

* fix quality check

* chore: reformat `test_modeling_florence2.py`

* some refactor for processor

* some refactor for processor

* update naming convention and remove BC

* make it pass the test

* fix: correct Embedding Cosine

* update comments and docstring

* support input_embeds

* support input embeds ideally

* fix style

* fix style

* fix style again :D

* add test processor

* refactor processor and add test for processor

* reformat test processor

* make fixup

* fix schema check

* remove image_token

* ensure image token in tokenizer and fix integration tests

* fix processor test

* add more integration tests for large model and rename test_processor to test_processing

* test_assisted_decoding_sample should pass

* update doc and make model work with image text to text pipeline

* docs: add sdpa bagde

* resolve cyril's comments

* fix import torch error

* add helper get_placeholder_mask

* inherit from llava

* florence2 may not _supports_attention_backend because of bart ...

* move florence2 model card to multimodal

* let base model always return_dict

* fix style

* tiny update doc

* set _checkpoint_conversion_mapping = {}

* fix code quality

* support flex and compile graph and move external func to internal func

* remove condition because it's always true

* remove window funcs

* move post processor config out

* fix ci

* new intro to trigger test

* remove `kernel_size` argument

---------

Co-authored-by: ducviet00-h2 <viet.d.hoang@h2corporation.jp>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-08-20 14:28:06 +02:00
959239debc Remove unnecessary contiguous calls for modern torch (#40315) 2025-08-20 12:24:14 +00:00
7d2aa5d6e6 🚨 [Flash Attention] Fix sliding window size (#40163)
* swa fix

* add comment, make fix symmetrical

* modify fa inference test to force swa correctness check

* fixup comment
2025-08-20 14:23:14 +02:00
3128db6927 chore: fix typo in find_executable_batch_size to match new 0.9 ratio (#40206) 2025-08-20 12:18:06 +00:00
ca0aaa8c74 [fix] Pass adamw optimizer parameters to StableAdamW (#40184)
* fix: pass adamw optimizer parameters to StableAdamW

* add test for stable_adamw initialization with trainer arguments

* address copilot suggestion

* fix: update weight_decay handling in stable_adamw kwargs

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-20 11:52:23 +00:00
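A hedged sketch of the fix: forward the trainer's AdamW hyperparameters instead of relying on the optimizer's defaults. The kwargs below all exist on `TrainingArguments`; how they are passed into `StableAdamW` (from the `torch-optimi` package) is an assumption:

```python
def stable_adamw_kwargs(args):
    # `args` is a transformers.TrainingArguments instance.
    return {
        "lr": args.learning_rate,
        "betas": (args.adam_beta1, args.adam_beta2),
        "eps": args.adam_epsilon,
        "weight_decay": args.weight_decay,
    }
```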
a01f38b364 Fix GOT-OCR2 and Cohere2Vision image processor patches calculation (#40312)
fix got-ocr patches calculation

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-20 13:13:58 +02:00
a5f0b505a0 Remove OTel SDK dependencies (#40305) 2025-08-20 12:31:44 +02:00
d0f1a6ec36 Clean up X-Codec. (#40271)
* Clean up xcodec addition.

* Clean up config.

* Switch to fixtures test.

* Small stuff.
2025-08-20 12:16:28 +02:00
da9452a592 [docs] delete more TF/Flax docs (#40289)
* delete some TF docs

* update documentation checks to ignore tf/flax

* a few more removals

* nit

* Update utils/check_repo.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-08-20 10:44:14 +01:00
a4e1fee44d [FA] Fix dtype in varlen with position ids (#40295)
fix
2025-08-20 11:15:55 +02:00
126bc03b4e Allow to be able to run torch.compile tests with fullgraph=True (#40164)
* fix

* address comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-20 10:42:33 +02:00
1d46091737 Add MetaCLIP 2 (#39826)
* First draft

* Make fixup

* Use eos_token_id

* Improve tests

* Update clip

* Make fixup

* Fix processor tests

* Add conversion script

* Update docs

* Update tokenization_auto

* Make fixup

* Use check_model_inputs

* Rename to lowercase

* Undo CLIP changes

* Address comment

* Convert all checkpoints

* Update auto files

* Rename checkpoints
2025-08-20 09:25:43 +02:00
0f9c9088d0 [3/3] make docs device agnostic, all en docs for existing models done (#40298)
docs to device agnostic cont.

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-08-19 21:01:27 -07:00
eaa48c81e9 make model docs device agnostic (2) (#40256)
* doc cont.

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* more models

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* Update docs/source/en/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quicktour.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update mixtral.md

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-19 13:10:03 -07:00
42fe769928 SmolVLM test fixes (#40275)
* Fix SmolVLM tests

* Add the proper CUDA expectations as well

* Split 'A10 and A100 expectations

* Ruff

---------

Co-authored-by: Akos Hadnagy <akoshuggingface@mi325x8-123.atl1.do.cpe.ice.amd.com>
2025-08-19 21:22:06 +02:00
4c017465bd Adjust ROCm test output expectations (#40279)
Adjust ROCm output expectations
2025-08-19 21:21:45 +02:00
0f9ce43687 Standardize BertGeneration model card (#40250)
* Standardize BertGeneration model card: new format, usage examples, quantization

* Update docs/source/en/model_doc/bert-generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bert-generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bert-generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bert-generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bert-generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bert-generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bert-generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply reviewer feedback: update code examples

* Add missing code example

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-19 11:22:13 -07:00
6ceb13fb22 SmolVLM and InternVL: Ensure pixel values are converted to the correct dtype for fp16/bf16 (#40121)
* Ensure pixel values are converted to the correct dtype for fp16/bf16

* add to modular
2025-08-19 10:39:08 -07:00
92f40da608 Update model card for gpt neox japanese (#39862)
* Update GPT-NeoX-Japanese model card

* Apply suggestions from code review

* Update gpt_neox_japanese.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-19 09:18:46 -07:00
3a4b2756cf docs: Update TrOCR model card to new format (#40240)
* docs: Update TrOCR model card to new format

* Updated Sugegestions
2025-08-19 09:17:45 -07:00
46d38546f3 Standardize RAG model card (#40222)
* Standardize RAG model card

Update rag.md to follow the new Hugging Face model card template:
- Added friendly overview in plain language
- Added pipeline and AutoModel usage examples
- Included quantization example with BitsAndBytesConfig
- Added notes and resources sections
- Removed abstract and FlashAttention badge

* Standardize RAG model card

Update rag.md to follow the new Hugging Face model card template:
- Added friendly overview in plain language
- Added AutoModel usage example
- Included quantization example with BitsAndBytesConfig
2025-08-19 09:16:10 -07:00
bd96e1e1cc docs(layoutlm): add missing id=usage to <hfoptions> tag in LayoutLM model card (#40273)
docs(layoutlm): add missing 'id=usage' to <hfoptions> tag in LayoutLM model card
2025-08-19 09:14:43 -07:00
8636b309e6 Fix chat CLI GPU loading and request_id validation issues (#40230) (#40232)
* Fix chat CLI GPU loading and request_id validation issues (#40230)

This commit addresses two critical bugs in the transformers chat CLI:

1. **GPU Loading Issue**: Changed default device from "cpu" to "auto" in ChatArguments
   - Chat CLI now automatically uses GPU when available instead of defaulting to CPU
   - Matches the behavior of the underlying serving infrastructure

2. **Request ID Validation Error**: Added request_id field to TransformersCompletionCreateParamsStreaming schema
   - Fixes "Unexpected keys in the request: {'request_id'}" error on second message
   - Allows request_id to be properly sent and validated by the server

Both fixes target the exact root causes identified in issue #40230:
- Users will now get GPU acceleration by default when available
- Chat sessions will no longer break after the second message

* Remove unrelated request_id field from TransformersCompletionCreateParamsStreaming
2025-08-19 15:33:44 +00:00
bebeccb06a fix which routing method (#40283) 2025-08-19 16:35:13 +02:00
249d7c6929 Update image_processing_perception_lm_fast.py to allow for proper override of vision_input_type (#40252)
* Update image_processing_perception_lm_fast.py

Allow for a proper override of vision_input_type in hf fast image processor, otherwise we need to resort to manually setting the attribute.

* Update processing_perception_lm.py to match kwargs vision input type

* Update image_processing_perception_lm_fast.py kwargs to signature args
2025-08-19 11:41:27 +00:00
57bb6db6ee Skipping pytree registration in case fsdp is enabled (#40075)
* Skipping pytree registration in case fsdp is enabled

* Beauty changes

* Beauty changes

* Moved the is_fsdp_available function to import utils

* Moved is_fsdp_available to integrations.fsdp

* Skipping pytree registration in case fsdp is enabled

* Beauty changes

* Beauty changes

* Moved the is_fsdp_available function to import utils

* Moved is_fsdp_available to integrations.fsdp

* Added pytree registration inside dynamic cache class

* Making ci/cd lords happy

* Adding a check if DynamicCache is already a leaf

* Adding try/catch for multiple initializations of DynamicCache in test suites

* Moving dynamic cache pytree registration to executorch

* Adding try catch back
2025-08-19 11:58:05 +02:00
5b3b7ea472 Add Kosmos-2.5 (#31711)
Add Microsoft Kosmos-2.5

---------

Co-authored-by: kirp@umich.edu <tic-top>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-19 11:56:03 +02:00
c93594e286 [detection] fix correct k_proj weight and bias slicing in D-FINE (#40257)
Fix: correct k_proj weight and bias conversion in D-FINE
2025-08-19 09:44:37 +00:00
2f1a8ad4ba Fix setting attention for multimodal models (#39984)
* fix

* use non-explicit `None`

* keep previously set attn if exists
2025-08-19 11:35:11 +02:00
a2e76b908b 🚨🚨 Switch default compilation to fullgraph=False (#40137)
* switch default

* docstring

* docstring

* rework tests and remove outdated restrictions

* simplify

* we need a check for static cache

* fix

* rename var

* fix

* revert

* style

* rename test
2025-08-19 11:26:22 +02:00
2b59207a72 Fix slow static cache export tests (#40261) 2025-08-19 11:24:07 +02:00
56c44213b3 [detection] fix attention mask for RT-DETR-based models (#40269)
* Fix get_contrastive_denoising_training_group attention

* Add bool attention_mask conversion
2025-08-19 09:15:56 +00:00
5d9a715e30 set inputs_embeds to None while generating to avoid audio encoder forward in generation process (#40248)
* set inputs_embeds to None while generating to avoid audio encoder forward in generation process

* set input_features to none instead

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
2025-08-19 08:45:57 +00:00
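An illustrative pattern for the behavior described (names assumed, not the actual Qwen-Omni code): once the prefill step has encoded and merged the audio, the raw features are dropped so later decode steps skip the audio encoder:

```python
def prepare_inputs_for_generation(input_ids, input_features=None, past_key_values=None):
    if past_key_values is not None:
        # Audio was already encoded and merged during prefill; don't
        # re-run the audio encoder on every decoding step.
        input_features = None
    return {
        "input_ids": input_ids,
        "input_features": input_features,
        "past_key_values": past_key_values,
    }
```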
28746cdc7b Remove MI300 CI (#40270)
Remove MI300 CI (in history if we need it back)
2025-08-19 08:23:39 +00:00
debc92e60a Skip broken tests (#40157)
skip these tests
2025-08-19 10:04:08 +02:00
6b5bd11723 docs: Update OLMo model card (#40233)
* Updated OLMo model card

* Update OLMo description

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix typo

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix cli typo

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix cli example

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Add bitsandbytes info

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-18 13:35:39 -07:00
e472efb9ac Fix benchmark workflow (#40254)
Correct init_db.sql path

Co-authored-by: Akos Hadnagy <akoshuggingface@mi325x8-123.atl1.do.cpe.ice.amd.com>
2025-08-18 18:14:16 +00:00
59862209ca Correct typo and update notes in docs Readme (#40234)
* Correct typo and update notes in docs readme

* Update docs/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-18 10:31:12 -07:00
a7eabf1dde Model card for NLLB (#40074)
* initializing branch and draft PR

* updated model card .md file

* minor

* minor

* Update docs/source/en/model_doc/nllb.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* resolving comments + adding visuals

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/nllb.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* NllbTokenizerFast and NllbTokenizer added

* endline

* minor

* Update nllb.md

---------

Co-authored-by: Sahil Kabir <sahilkabir@Sahils-MacBook-Pro.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-18 10:05:59 -07:00
01c03bf4ee fix: Catch correct ConnectionError for additional_chat_templates (#39874)
* fix: Catch correct ConnectionError for additional_chat_templates

* fix: don't catch timeout

* fix: formatting
2025-08-18 17:25:47 +01:00
2bcf9f6c7e Fixes for EncoderDecoderCache (#40008)
* Add expectation to t5 for rocm 9.4

* Made EncoderDecoderCache compatible with nn.DataParallel

* Fixed t5gemma EncoderDecoderCache

* Added todos in autoformer

* Ruff

* Init is self-contained

* Review compliance

* Fixed kwargs init of EncoderDecoderCache
2025-08-18 17:51:05 +02:00
aa45824919 [CI] Fix repo consistency (#40249)
* fix

* doc

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-08-18 17:32:17 +02:00
d6fad86d23 [serve] guard imports (#39825)
guard imports
2025-08-18 16:28:10 +01:00
7a0ba0d7d8 [typing] fix type annotation error in DepthPro model image processor (#40238)
* fix type annotation error in DepthPro model image processor

* fix

* run make fix-copies
2025-08-18 15:42:13 +01:00
00b4dfb786 Add chat_template (jinja2) as an extra dependency (#40128)
* add jinja2 as a dependency

* Make jinja2 a core dependency in install_requires

- Add jinja2 to install_requires list in setup.py for automatic installation
- Add jinja2 to runtime version checks in dependency_versions_check.py
- Resolves issue where pip install transformers doesn't install jinja2

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Make jinja2 a core dependency in install_requires

* Make jinja2 an extra dependency instead of adding a core dep

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-18 14:31:40 +00:00
f417a1aad4 remove transpose_for_scores call in ESM-2 (#40210)
* remove transpose_for_scores call

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

* fix copied evolla code

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

---------

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
2025-08-18 14:28:59 +00:00
a36d51e801 🚨 Always return Cache objects in modelings (to align with generate) (#39765)
* watch the world burn

* fix models, pipelines

* make the error a warning

* remove kwargs and return_legacy_cache

* fix reformer
2025-08-18 16:26:35 +02:00
57e230cdb2 Fix more pylint warnings (#40204)
Fix pylint warnings

Signed-off-by: cyy <cyyever@outlook.com>
2025-08-18 14:17:16 +00:00
47938f8f8d Add Ovis2 model and processor implementation (#37088)
* Add Ovis2 model and processor implementation

* Apply style fixes

* Add unit tests for Ovis2 image processing and processor

* Refactor image processing functions for clarity and efficiency

* Add Ovis2 ImageProcessorFast

* Refactor Ovis2 code

* Refactor Ovis2 model components and update processor functionality

* Fix repo consistency issues for Ovis2: docstring, config cleanup

* Update Ovis2 model integration tests

* Update Ovis2 configuration and processing classes for improved documentation

* Remove duplicate entry for 'ovis2' in VLM_CLASS_NAMES

* Fix conflict

* Fix import order

* Update image processor class names

* Update Ovis2 model structure

* Refactor Ovis2 configuration

* Fix typos

* Refactor Ovis2 model classes and remove unused code

* Fix typos

* Refactor Ovis2 model initialization

* Fix typos

* Remove Ovis2 model mapping from MODEL_MAPPING_NAMES in modeling_auto.py

* Add license and update type hints

* Refactor token function and update docstring handling

* Add license

* Add Ovis2 model support and update documentation

* Refactor Ovis2 model structure and enhance multimodal capabilities

* Update Ovis2 weight mapping for consistency and clarity in key patterns

* Remove unused 'grids' parameter from Ovis2 model and Update processing logic to handle image grids more efficiently.

* Refactor Ovis2 model test structure to include Ovis2Model

* Add optional disable_grouping param to Ovis2ImageProcessorFast

* Refactor type hints in Ovis2 modules

* Add licensing information in Ovis2 modules and tests

* Refactor Ovis2 model by removing unused methods

* Refactor Ovis2 model tests by renaming test classes and removing skipped tests

* Refactor Ovis2 model output classes

* Refactor Ovis2 weight conversion and Update model embedding classes

* Refactor Ovis2 model imports and remove unused functions

* Enhance vision configuration extraction in Ovis2 weight conversion

* Refactor Ovis2 model's forward method to remove interpolation option

* Update Ovis2 model documentation

* Refactor Ovis2 model input handling and tokenizer configuration

* Update return type hints in Ovis2 model

* Remove commented-out code

* fix config for tests and remove key mappings

* Update tokenizer configuration to use add_special_tokens method

* skip torchscript

* Fix image placeholder generation in Ovis2Processor

* Refactor Ovis2 model to rename visual_table to visual_embeddings_table

* Enhance Ovis2 model by adding vision_feature_select_strategy parameter

* Refactor Ovis2 model weights conversion and architecture

* Refactor Ovis2 model by removing vision_feature_select_strategy parameter

* Update Ovis2 model examples

* Refactor Ovis2 model

* Update Ovis2 model

* Update Ovis2 model configuration

* Refactor Ovis2 model test setup

* Refactor flash attention support

* Refactor

* Fix typo

* Refactor

* Refactor model classes

* Update expected output in Ovis2

* Refactor docstrings

* Fix

* Fix

* Fix

* Update input in tests

* Fix

* Fix get_decoder method

* Refactor

* Refactor Ovis2

* Fix

* Fix

* Fix test

* Add get_placeholder_mask

* Refactor Ovis2 model tests

* Fix

* Refactor

* Fix

* Fix

* Fix Ovis2 test

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-08-18 16:05:49 +02:00
2fe43376cd AMD scheduled CI ref env file (#40243)
* Reference env-file to be used in docker running the CI

* Disable MI300 CI for now
2025-08-18 15:23:27 +02:00
e4bd2c858d Fix ESM token_dropout crash when using inputs_embeds instead of input_ids (#40181)
* fix: Error after calling ESM model with input embeddings not input ids

* propagate changes to other models
2025-08-18 13:22:10 +00:00
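An illustrative guard (not the exact ESM code): token dropout rescales embeddings based on the `<mask>` token positions, which are only known from `input_ids`, so it is skipped when the caller passes embeddings directly:

```python
import torch
from torch import nn

def embed(word_embeddings: nn.Embedding, mask_token_id: int,
          input_ids=None, inputs_embeds=None, token_dropout=True):
    if inputs_embeds is None:
        inputs_embeds = word_embeddings(input_ids)
    if token_dropout and input_ids is not None:  # guard added by the fix
        mask = (input_ids == mask_token_id).unsqueeze(-1)
        inputs_embeds = inputs_embeds.masked_fill(mask, 0.0)
    return inputs_embeds
```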
6333eb986a Fix more typos (#40212)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-18 12:52:12 +00:00
e5886f9194 [SAM 2] Change checkpoints in docs and tests (#40213)
* change checkpoints in docs and tests

* add notebook
2025-08-18 11:21:34 +02:00
eb2f9da096 fix vocab_size error in Qwen2_5_VLForConditionalGeneration loss_function (#40130)
* fix vocab_size error in Qwen2_5_VLForConditionalGeneration loss_function

Signed-off-by: luoxiaoc <xiaochuan.luo@intel.com>

* fix similar error at qwen2_vl and do make fix-copies

Signed-off-by: luoxiaoc <xiaochuan.luo@intel.com>

* pass in kwargs for loss_func at qwen2_vl and qwen2_5_vl

Signed-off-by: luoxiaoc <xiaochuan.luo@intel.com>

* Apply style fixes

---------

Signed-off-by: luoxiaoc <xiaochuan.luo@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-08-18 08:59:25 +00:00
6ce8f05375 Use correct model_input_names for PixtralImageProcessor (#40226)
add image_sizes to model_input_names
2025-08-18 08:06:52 +00:00
2914ceca20 Revert "Pin torch to 2.7.1 on CircleCI for now" + Final fix for too long with no output (#40201)
* Revert "Pin torch to 2.7.1 on CircleCI for now (#40174)"

This reverts commit 31b6e6e1dac0d32f74ec5cd6b3c1868534ccd7b5.

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-18 08:40:53 +02:00
cd22550692 docs: Update LayoutLM model card according to new standardized format (#40129)
* docs: Update LayoutLM model card with standardized format

* Apply suggestions from code review

This commit incorporates all suggestions provided in the recent review. Further changes will be committed separately to address remaining comments.

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Address remaining review comments

* Address few more review comments:
1. remove transformer-cli section
2. put resources after notes
3. change API refs to 2nd level header

* Update layoutlm.md

* Update layoutlm.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-15 09:33:47 -07:00
05000aefe1 Fix GPT-OSS swiglu_limit not passed in for MXFP4 (#40197)
Add swiglu_limit = 7.0
2025-08-15 17:04:25 +02:00
3f4c85fef0 Add X-Codec model (#38248)
* add working x-codec

* nit

* fix styling + copies

* fix docstring

* fix docstring and config attribute

* Update args + config

* update convertion script

* update docs + cleanup

* Ruff fix

* fix doctrings
2025-08-15 16:24:12 +02:00
29e4e35927 Benchmarking improvements (#39768)
* Start revamping benchmarking

* Start refactoring benchmarking

* Use Pandas for CSV

* import fix

* Remove benchmark files

* Remove sample data

* Address review comments
2025-08-15 15:59:11 +02:00
de437d0d7a Update: add type hints to check_tokenizers.py (#40094)
* Update check_tokenizers.py

chore(typing): add type hints to check_tokenizers script

- Annotate params/returns for helper functions
- Keep tokenizer instances as `Any` to avoid runtime coupling
- Make `check_LTR_mark` return `bool` explicitly (no behavior change)

* Update check_tokenizers.py

chore(typing): replace Any with PreTrainedTokenizerBase in check_tokenizers.py

- Use transformers.tokenization_utils_base.PreTrainedTokenizerBase for `slow` and `fast` params
- Covers both PreTrainedTokenizer and PreTrainedTokenizerFast
- Exposes required methods (encode, decode, encode_plus, tokenize)
- Removes generic Any typing while staying implementation-agnostic
2025-08-15 12:41:28 +00:00
28a03fb78a Fix various Pylint warnings (#40107)
Tidy code

Signed-off-by: cyy <cyyever@outlook.com>
2025-08-15 12:40:12 +00:00
ec85d2c44f Avoid CUDA stream sync (#40060)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-15 12:37:15 +00:00
c7afaa5b44 Remove _prepare_flash_attention_from_position_ids (#40069)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-15 12:35:03 +00:00
c167faa081 Fix typos (#40175)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-15 12:10:26 +00:00
5068fcd9a8 Add repr to EncoderDecoderCache (#40195)
* add repr

* oups
2025-08-15 12:57:49 +02:00
421175685d Fix fsdp for generic-task models (#40191)
* remove abc inheritance

* add fast test
2025-08-15 12:28:16 +02:00
4912d5b490 fix to avoid modifying a view in place (#40162)
* fix to avoid modifying a view in place

* add backward test in tensor parallel

* add test to test_modeling_gpt_oss.py

* linting
2025-08-15 10:30:49 +02:00
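A minimal reproduction of the failure mode this PR guards against, plus the usual fix of operating on a copy:

```python
import torch

base = torch.arange(6.0, requires_grad=True)
view = base[:3]  # shares storage with `base`

# view += 1.0  # RuntimeError: a view of a leaf Variable that requires
#              # grad is being used in an in-place operation.

out = view.clone()  # new storage, still tracked by autograd
out += 1.0          # fine: mutates the copy, not the shared storage
print(out)  # tensor([1., 2., 3.], grad_fn=<AddBackward0>)
```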
cc9997878a make model doc device agnostic (#40143)
* make model doc device agnostic

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* Update align.md

* Update aya_vision.md

* Update byt5.md

* refine

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* Update granitevision.md

* Update src/transformers/pytorch_utils.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add doc

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* 3 more

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-14 23:31:31 -07:00
85fce2e54c [MINOR:TYPO] Update base.py (#40169)
* [MINOR:TYPO] Update base.py

All other occurrences in the docs use lowercase. (https://github.com/search?q=repo%3Ahuggingface%2Ftransformers%20translation_XX_to_YY&type=code)

Also, using uppercase doesn't work: tested with "translation_EN_to_FR", which instead returns: `ValueError: The task does not provide any default models for options ('EN', 'FR')`

It might be a good idea to allow for uppercase, but that's for another issue.

* [MINOR:TYPO] Update __init__.py
2025-08-14 22:53:57 -07:00
52c6c1bb6e Update dynamic attnt setter for multimodals (#39908)
* update

* fix the test for DepthPro

* PR comments

* wait, I didn't delete this in prev commit?

* fix

* better way

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-08-14 21:46:13 +02:00
31b6e6e1da Pin torch to 2.7.1 on CircleCI for now (#40174)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-14 20:19:35 +02:00
b02f2d8b6a Add dates to the model docs (#39320)
* added dates to the models with a single hf papers link

* added the dates for models with multiple papers

* half of no_papers models done

* rest of no_papers models also done, only the exceptions left

* added copyright disclaimer to sam_hw, cohere, cohere2 + dates

* some more fixes, hf links + typo

* some new models + a rough script

* the script looks robust, changed all paper links to hf

* minor change to handle technical reports along with blogs

* ran make fixup to remove the white space

* refactor
2025-08-14 10:08:46 -07:00
8a658ac119 Standardize BARTpho model card: badges, new examples, fixed broken im… (#40051)
* Standardize BARTpho model card: badges, new examples, fixed broken image section, and links (#36979)

Update bartpho.md

* Update bartpho.md

Removed non-required/unsupported sections: Quantization, Attention visualizer, and Resources (plus stray tokenizer header).

Added code snippets which were suggested

* Update bartpho.md

Updated with necessary tags

* Update bartpho.md

* Update bartpho.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-14 09:55:27 -07:00
2b6cbedeb2 Add GptOssForSequenceClassification for GPT-OSS models (#40043)
* Add GptOssForSequenceClassification

* Tiny fix

* make fixup

* trigger CI rerun

* Check config type instead

---------

Co-authored-by: Yuefeng Zhan <yuefzh@microsoft.com>
2025-08-14 18:32:14 +02:00
b834cb8138 build: Add fast image processor tvp (#39529)
* build: add TvpImageProcessorFast

- Introduced TvpImageProcessorFast to enhance image processing capabilities.
- Updated image processing auto registration to include the new fast processor.
- Modified tests to accommodate both TvpImageProcessor and TvpImageProcessorFast, ensuring comprehensive coverage for both classes.

* fix: TvpImageProcessorFast with new resize method and update processing logic

* build: add TvpImageProcessorFast

* refactor: clean up whitespace and formatting in TvpImageProcessorFast and related tests

- Removed unnecessary whitespace and ensured consistent formatting in image_processing_tvp_fast.py.
- Updated import order in test_image_processing_tvp.py for clarity.
- Minor adjustments to maintain code readability and consistency.

* fix: Enhance TvpFastImageProcessorKwargs and update documentation

- Added TvpFastImageProcessorKwargs class to define valid kwargs for TvpImageProcessorFast.
- Updated the documentation in tvp.md to include the new class and its parameters.
- Refined the image processing logic in image_processing_tvp_fast.py for better handling of padding and resizing.
- Improved test cases in test_image_processing_tvp.py to ensure compatibility with the new processing logic and tensor inputs.

* fix: tested now with python 3.9

* fix: remove tvp kwargs from docs

* simplify processing

* remove import and fix tests

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-08-14 15:48:18 +00:00
6f259bc83e Fix docs typo (#40167)
* DINOv3 model

* working version

* linter revert

* linter revert

* linter revert

* fix init

* remove flex and add convert to hf script

* DINOv3 convnext

* working version of convnext

* adding to auto

* Dinov3 -> DINOv3

* PR feedback

* complete convert checkpoint

* fix assertion

* bf16 -> fp32

* add fast image processor

* fixup

* change conversion script

* Use Pixtral attention

* minor renaming

* simplify intermediates capturing

* refactor DINOv3ViTPatchEmbeddings

* Refactor DINOv3ViTEmbeddings

* [WIP] rope: remove unused params

* [WIP] rope: rename period -> inv_freq for consistency

* [WIP] rope: move augs

* change inv_freq init (not persistent anymore)

* [WIP] rope: move coords to init

* rope - done!

* use default LayerScale

* conversion: truncate expected outputs

* remove commented code

* Refactor MLP layers

* nit

* clean up config params

* nit docs

* simplify embeddings

* simplify compile compat lru_cache

* fixup

* dynamic patch coords

* move augmentation

* Fix docs

* fixup and type hints

* fix output capturing

* fix tests

* fixup

* fix auto mappings

* Add draft docs

* fix dtype cast issue

* add push to hub

* add image processor tests

* fixup

* add modular

* update modular

* convert and test convnext

* update conversion script

* update prefix

* Update LayerNorm

* refactor DINOv3ConvNextLayer

* rename

* refactor convnext model

* fix doc check

* fix docs

* fix convnext config

* tmp fix for check docstring

* remove unused arg

* fix tests

* (nit) change init

* standardize gated MLP

* clear namings and sat493m

* fix tensors on different devices

* revert linter

* pr

* pr feedback ruff format

* missing headers

* fix code snippet and collection link in docs

* DINOv3 description

* fix checkpoints in tests

* nit doc fixes in configs

* output_hidden_states

* x -> features

* remove sequential

---------

Co-authored-by: Cijo Jose <cijose@meta.com>
2025-08-14 17:29:53 +02:00
41980ce93e [bugfix] fix flash-attention2 unavailable error for Ascend NPU (#40151)
* [bugfix] fix flash-attention2 unavailable error for Ascend NPU

* remove redundant apply_rotary_emb usage

* fix ruff check error

* pad_input and unpad_input use the same implementation as fa2

* rollback redundant codes

* fix ruff check error

* optimize fa2 judgement logic
2025-08-14 14:21:39 +02:00
eba1d62091 [FA2] Fix it finally - revert fa kwargs preparation (#40161)
revert
2025-08-14 13:39:11 +02:00
1c5d2f7fb6 Replace self.tokenizer by self.processing_class (#40119) 2025-08-14 13:24:55 +02:00
cfe52ff4db [Continous Batching] set head_dim when config.head_dim is None (#40159)
* set head_dim when config.head_dim is None

* use model's actual TP setting
2025-08-14 13:23:27 +02:00
c47544b16f Fix CI: Use correct import in SAM for torchvision InterpolationMode (#40160)
fix ci
2025-08-14 10:53:23 +00:00
22e89e5385 [efficientloftr] fix bugs and follow original cross attn implementation strictly (#40141)
* fix: changed is_causal to be False

* fix: Added original cross attention bug

* fix: fixed the way border removal is computed

* fix: added missing normalization on coarse features

* test: fixed integration tests

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-14 10:42:59 +01:00
252364fd8e [Cohere2Vision] remove unused arg (#40103)
* remove unused arg

* remove the arg from test as well
2025-08-14 09:10:25 +00:00
e446372f76 Create self-scheduled-amd-mi355-caller.yml (#40134) 2025-08-14 01:33:45 +02:00
be1ab5103f Update Dockerfiles to install packages inside a virtual environment (#39098)
* Removed unnecessary virtual environment creation in Dockerfiles.

* Updated Dockerfiles to install packages in a virtual environment.

* use venv's python

* update

* build and trigger

* trigger

* build and trigger

* build and trigger

* build and trigger

* build and trigger

* build and trigger

* build and trigger

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-13 23:51:52 +02:00
591708d9ce Add pytest marker: torch_compile_test and torch_export_test (#39950)
* new marker

* trigger CI

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-13 23:47:15 +02:00
12e49cda32 Fix quantized cache with only cache_implementation in generate (#40144)
* fix args

* comment
2025-08-13 23:21:41 +02:00
e651ae0a32 🌐 [i18n-KO] Translated gemma3.md to Korean (#39865)
* docs: ko: gemma3.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
2025-08-13 13:25:20 -07:00
0f9c2595cd updated visualBERT modelcard (#40057)
* updated visualBERT modelcard

* fix: Review for VisualBERT card
2025-08-13 12:47:32 -07:00
412c9c3030 Remove an old badly designed test (#40142)
remove it
2025-08-13 20:47:00 +02:00
eb5768a86e [docs] Fix ko toctree (#40138)
Update _toctree.yml
2025-08-13 11:24:58 -07:00
68a13cd4a6 Add Segment Anything 2 (SAM2) (#32317)
* initial comment

* test

* initial conversion for outline

* intermediate commit for configuration

* chore: init files for sam2

* adding arbitary undefined config

* check

* add vision

* make style

* init sam2 base model

* Fix imports

* Linting

* chore: sam to sam2 classes

* Linting

* Add sam2 to models.__init__

* chore: match prompt encoder with sam2 code

* chore: prepare kwargs for mask decoder

* Add image/video predictors

* Add CUDA kernel

* Add output classes

* linting

* Add logging info

* tmp commit

* docs for sam2

* enable image processing

* check difference of original SAM2
- difference is the order of ToTensor()
- please see https://pytorch.org/vision/main/_modules/torchvision/transforms/functional.html#resize

* enable promptencoder of sam2

* fix promptencoder

* Confirmed that PromptEncoder is exactly the same (be aware of the bfloat16 vs float32 difference)

* Confirmed that ImageEncoder is exactly the same (be aware of the linting of init)

* Confirmed that MaskDecoder is exactly the same (TO DO: lint variable name)

* SamModel is now available (Need more chore for name)

* make fix-copies

* make style

* make CI happy

* Refactor VisionEncoder and PositionEmbedding

* TO DO : fix the image_embeddings and sparse_embeddings part

* pure image inference done

* reusable features fix and make style

* styling

* refactor memoryattention

* tmp

* tmp

* refactor memoryencoder
TO DO: convert and run inference on the video pipeline

* TO DO : fix the image_encoder shape

* conversion finish
TO DO: need to check video inference

* make style

* remove video model

* lint

* change

* python utils/check_docstrings.py --check_all

* python utils/check_config_attributes.py

* remove copies for sam2promptencoder due to configuration

* change __init__.py

* remove tensorflow version

* fix that to not use direct comparison

* make style

* add missing import

* fix image_embedding_size

* refactor Sam2 Attention

* add fully working video inference (refactoring todo)

* clarify _prepare_memory_conditioned_features

* simplify modeling code, remove unused paths

* use one model

* use auto_docstring

* refactor rope embeddings

* nit

* not using multimask when several points given

* add all sam2.1

* add video tmp

* add Sam2VideoSessionState + fast image proc + video proc

* remove init_states from model

* fix batch inference

* add image integration tests

* uniformize modeling code with other sam models and use modular

* pass vision tests and most model tests

* All tests passing

* add offloading inference state and video to cpu

* fix inference from image embedding and existing mask

* fix multi_boxes mask inference

* Fix batch images + batch boxes inference

* improve processing for image inference

* add support for mask generation pipeline

* add support for get_connected_components post processing in mask generation

* add fast image processor sam, image processor tests and use modular for sam2 image processor

* fix mistake in sam after #39120

* fix init weights

* refactor convert

* add integration tests for video + other improvements

* add needed missing docstrings

* Improve docstrings and

* improve inference speed by avoiding cuda sync

* add test

* skip test for vision_model

* minor fix for vision_model

* fix vision_model by adding sam2model and change the torch dependencies

* remove patch_size

* remove image_embedding_size

* fix patch_size

* fix test

* make style

* Separate hieradet and vision encoder in sam2

* fixup

* review changes part 1

* remove MemoryEncoderConfig and MemoryAttentionConfig

* pass q_stride instead of q_pool module

* add inference on streamed videos

* explicitely process streamed frames

* nit

* Improve docstrings in Sam2Model

* update sam2 modeling with better handling of inference state and cache, and separate Sam2Model and Sam2VideoModel

* improve video inference api

* change inference_state to inference_session

* use modular for Sam2Model

* fix convert sam2 hf

* modular

* Update src/transformers/models/sam2/video_processing_sam2.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix minor config

* fix attention loading error

* update modeling tests to use hub checkpoints

* Use CI A10 runner for integration tests values + higher tolerance for video integration tests

* PR review part 1

* fix doc

* nit improvements

* enforce one input format for points, labels and boxes

* nit

* last few nits from PR review

* fix style

* fix the input type

* fix docs

* add sam2 model as conversion script

* improve sam2 doc

* nit fixes + optimization

* split sam2 and sam2_video in two models

* PR review part 1

* fix None for default slow processor of sam2

* remove unnecessary code path in sam2_video

* refactor/simplify RoPE

* replace embedding module list with embedding matrix

* fix tests

* remove kernel

* nit

* use lru_cache for sine_pos_embeddings

* reorder sam2_video methods

* simplify sam2_video

* PR review part 1

* simplify sam2 video a lot

* more simplification

* update integration tests with updated conftest

* more explicit config for hieradet

* do post_processing outside of sam2 video model

* Improve Sam2VideoVisionRotaryEmbedding

* fix tests

* update docs and fix mask2former/oneformer

* avoid unnecessary reshapes/permute

* fix device concatenating points

* small dtype fix

* PR review

* nit

* fix style and finish up doc

* fix style

* fix docstrings

* fix modular

---------

Co-authored-by: RUFFY-369 <prakarshkaushik369@gmail.com>
Co-authored-by: Haitham Khedr <haithamkhedr@meta.com>
Co-authored-by: sangbum choi <sangbumchoi@sangbumui-MacBookAir.local>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-13 14:18:05 -04:00
25ad9c8c92 Fix Janus (#40140)
fix
2025-08-13 20:12:21 +02:00
bec6926696 gpt oss is important (#40139) 2025-08-13 19:49:54 +02:00
ab9108517a 🌐 [i18n-KO] Translated pipelines.md to Korean (#39577)
* docs: ko: pipelines.md

* feat: gpt draft

* Update docs/source/ko/main_classes/pipelines.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/main_classes/pipelines.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/main_classes/pipelines.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/main_classes/pipelines.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/main_classes/pipelines.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

Revised the translated document

* Update pipelines.md

Fixed the ToC

* Update pipelines.md

---------

Co-authored-by: xhaktm <tnwjd318@hs.ac.kr>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-13 10:26:17 -07:00
20c6b478cd 🚨 Use lru_cache for sine pos embeddings MaskFormer (#40007)
* use lru_cache for sine pos embeddings maskformer

* fix calls to pos embed

* change maxsize to 1 (see the sketch below)
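
A standalone sketch of the caching pattern adopted here, assuming a free function (the real MaskFormer code hangs this off the position-embedding module):

```python
import math
from functools import lru_cache

import torch

@lru_cache(maxsize=1)  # maxsize=1, as in the PR: only the most recent shape stays cached
def sine_pos_embeddings(seq_len: int, dim: int) -> torch.Tensor:
    # dim is assumed even; standard sine/cosine position embedding
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32) * (-math.log(10000.0) / dim))
    embeddings = torch.zeros(seq_len, dim)
    embeddings[:, 0::2] = torch.sin(position * div_term)
    embeddings[:, 1::2] = torch.cos(position * div_term)
    return embeddings  # callers must treat the cached tensor as read-only
```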
2025-08-13 17:05:22 +00:00
6b728f1830 🌐 [i18n-KO] Translated grounding-dino.md to Korean (#39861)
* docs: ko: grounding-dino.md

* feat: nmt draft

* fix: manual edits

* Update docs/source/ko/model_doc/grounding-dino.md

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>

* Update docs/source/ko/model_doc/grounding-dino.md

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>

* Update docs/source/ko/model_doc/grounding-dino.md

Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>

* docs: add AP explanation for better readability

---------

Co-authored-by: TaskerJang <bymyself103@naver.com>
Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-08-13 10:01:05 -07:00
127e33f759 🌐 [i18n-KO] Translated optimizers.md to Korean (#40011)
* docs: ko: optimizers.md

* feat: optimizers draft

* fix: manual edits

* docs: ko: update optimizers.md

* Update docs/source/ko/optimizers.md

Co-authored-by: Minseo Kim <75977640+luckyvickyricky@users.noreply.github.com>

* Update docs/source/ko/optimizers.md

Co-authored-by: Minseo Kim <75977640+luckyvickyricky@users.noreply.github.com>

* Update docs/source/ko/optimizers.md

Co-authored-by: Jaehyeon Shin <108786184+skwh54@users.noreply.github.com>

* docs: ko: final updates to optimizers and toctree

---------

Co-authored-by: Minseo Kim <75977640+luckyvickyricky@users.noreply.github.com>
Co-authored-by: Jaehyeon Shin <108786184+skwh54@users.noreply.github.com>
2025-08-13 10:00:47 -07:00
ac52c77a66 🌐 [i18n-KO] Translated gpt2.md to Korean (#39808)
* docs: ko: bamba.md

* feat: nmt draft

* fix: manual edits

* docs: ko: gpt2.md

* feat: nmt draft

* fix: manual edits

* Remove bamba.md from docs/source/ko/model_doc/

* Update _toctree.yml
2025-08-13 10:00:25 -07:00
5337f3052d 🚨🚨 [generate] ignore cache_implementation="hybrid" hub defaults (#40135)
* working?

* fix tests
2025-08-13 17:57:41 +02:00
e4223fa915 🌐 [i18n-KO] Translated main_classes/optimizer_schedules.md to Korean (#39713)
* docs: ko: main_classes/optimizer_schedules

* feat: nmt draft

* fix: improve TOC anchors and expressions in optimizer_schedules

- Add TOC anchors to all section headers
- Fix terminology and improve Korean expressions

* fix: Correct translation of 'weight decay fixed' to '가중치 감쇠가 적용된'

Changed '가중치 감쇠가 수정된' to '가중치 감쇠가 적용된' for more accurate translation of 'weight decay fixed' in the context of optimization.

* fix: Use more natural Korean inheritance expression

Changed '에서 상속받는' to '을 상속받는' to follow natural Korean grammar patterns for inheritance terminology.

* fix: Use consistent '미세 조정' translation for 'finetuned models'

Changed '파인튜닝된' to '미세 조정된 모델' to follow the established translation glossary for 'finetuned models' terminology.
2025-08-13 08:23:09 -07:00
9e21e50241 🌐 [i18n-KO] Translated jamba.md to Korean (#39890)
* docs: ko: jamba.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestion

Co-authored-by: Minseo Kim <75977640+luckyvickyricky@users.noreply.github.com>

---------

Co-authored-by: Minseo Kim <75977640+luckyvickyricky@users.noreply.github.com>
2025-08-13 08:22:28 -07:00
486844579b 🌐 [i18n-KO] Translated main_classes/processors.md to Korean (#39519)
* docs: ko: processors.md

* feat: nmt draft

* fix: manual edits

* Update docs/source/ko/main_classes/processors.md

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* Update docs/source/ko/main_classes/processors.md

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: TaskerJang <bymyself103@naver.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2025-08-13 08:21:38 -07:00
f445caeb0f Fix hidden torchvision>=0.15 dependency issue (#39928)
* use pil_torch_interpolation_mapping for NEAREST/NEAREST_EXACT

* fix min torchvision version

* use InterpolationMode directly

* remove unused is_torchvision_greater_or_equal,

* nit
2025-08-13 15:13:42 +00:00
11537c3e0c [trainer] handle case where EOS token is None in generation_config (#40127)
* handle case where EOS token is None in gen config

* update eli5 dataset
2025-08-13 15:57:17 +01:00
8ef5cd6579 DOCS: Add missing space in SECURITY.md (#40087) 2025-08-13 12:57:37 +00:00
ebceef343a Collated reports (#40080)
* Add initial collated reports script and job definition

* provide commit hash for this run. Also use hash in generated artifact name. Json formatting

* tidy

* Add option to upload collated reports to hf hub

* Add glob pattern for test report folders

* Fix glob

* Use machine_type as path filter instead of glob. Include machine_type in collated report
2025-08-13 14:48:15 +02:00
e78571f5ce decoding_method argument in generate (#40085)
* factor out expand inputs

* callable arg

* improve docs, add test

* Update docs/source/en/generation_strategies.md

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-08-13 12:45:50 +00:00
8d19231bca [serve] allow array content inputs for LLMs (#39829)
fix bug; add tests
2025-08-13 11:26:19 +01:00
34a1fc6426 Fix QuantoQuantizedCache import issues (#40109)
* fix quantoquantized
2025-08-13 10:22:59 +00:00
060b86e21d changed xLSTMRMSNorm to RMSNorm (#40113)
* changed xLSTMRMS.. to RMS...

* fix linter error

---------

Co-authored-by: Nikita <nikita@Nikitas-MacBook-Pro.local>
2025-08-13 11:10:42 +02:00
849c3778c6 [bugfix] Fix tensor device in Idefics2, Idefics3, and SmolVLM (#39975)
* [bugfix] ensure correct tensor device in Idefics2, Idefics3, and SmolVLM models

* to cuda
2025-08-13 09:58:50 +02:00
85d536a93b 🌐 [i18n-KO] Translated tiny_agents.md to Korean (#39913)
* docs: ko: tiny_agents.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits
2025-08-12 22:54:16 -07:00
31ab7168ff remove sequence parallel in llama4 (#40084) 2025-08-13 00:12:45 +02:00
a1a4fcd03e Add model card for MobileViT (#40033)
* Add model card for MobileViT

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update mobilevit.md

* Update mobilevit.md

* Update mobilevit.md

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update mobilevit.md

* Update mobilevit.md

* Update mobilevit.md

* Update mobilevit.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-12 11:36:59 -07:00
e5e73e4b95 [docs] Add reference to HF-maintained custom_generate collections (#39894)
decoding -> generation; add collections
2025-08-12 17:38:00 +01:00
0ce24f5a88 Fix Causality Handling in Flash Attention to Support Bidirectional Attention (#39707)
Fix the is_causal logic to enable bidirectional attention

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-08-12 16:16:28 +00:00
83dbebc429 [trainer] ensure special tokens in model configs are aligned with tokenizer at train time (#38441)
* tmp commit

* add test

* make fixup

* reset warns/info in test
2025-08-12 16:32:07 +01:00
9977cf1739 [Flash Attention] Fix flash attention integration (#40002)
* fix flash attention

* i got a stroke reading that comment

* change dropout kwarg back to before

* rename _fa3... as it's used for multiple variants and should work as fallback instead

* simplify imports and support kwargs for fa

* style

* fix comments order

* small fix

* skip kernels test (causes cuda illegal memories w/o cleanup), fix fa test in general esp for models like bart

* style

* allow fullgraph by preloading on init

* make globals "private"

* ci pls be happy

* change skip conditions based on backend flag (indicating missing mask interface)

* move globals support to a function to prepare kwargs

* style

* generalize supported kwargs

* small change to doc

* fix

* add comments

* style

* revert prep during generate

* style

* revert weird style changes

* add fa kwarg prep during generate with fixes back

* how did this even happen

* how

* add comment
2025-08-12 15:24:10 +00:00
b6ba595543 Default to dequantize if cpu in device_map for mxfp4 (#39993)
* default to dq if cpu

* an other check

* style

* revert some changes
2025-08-12 16:48:52 +02:00
a5fac1c394 Fix error on importing unavailable torch.distributed (#40038)
Currently model_debugging_utils.py would have an unguarded `import torch.distributed.tensor`. This PR ensures that the distributed module is available before including its tensor module.
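A sketch of the guard described above (assumed form, not the verbatim diff):

```python
import torch.distributed as dist

if dist.is_available():
    # only import the tensor submodule when torch was built with distributed support
    import torch.distributed.tensor
```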
2025-08-12 16:30:51 +02:00
085e02383c Fix Qwen3 MoE GGUF architecture mismatch (#39976)
* fix qwen3moe gguf architecture

* Fix Qwen3Moe GGUF loading

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Jinuk Kim <jusjinuk@snu.ac.kr>
2025-08-12 13:38:48 +00:00
2ce0dae390 Switch the order of args in StaticCache (for BC and future logic) (#40100)
* switch order for BC and future logic

* in generate as well
2025-08-12 15:30:44 +02:00
f7cbd5f3ef Fix regression in mllama vision encoder (#40083)
fix mllama vision encoder

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-08-12 15:29:45 +02:00
35dc88829c Replace logger.warning with logger.warning_once in GradientCheckpointingLayer (#40091) 2025-08-12 15:26:47 +02:00
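For context, `logger.warning_once` deduplicates a message per process, so the layer no longer warns on every forward pass (the message text below is illustrative):

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)
# emitted once, even if the gradient-checkpointed layer runs on every training step
logger.warning_once("Caching is incompatible with gradient checkpointing. Setting use_cache=False.")
```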
b1b46555cd Re-apply make style (#40106)
make style
2025-08-12 15:02:16 +02:00
a07b5e90f2 feat: add is_fast to ImageProcessor (#39603)
* feat: add `is_fast` to ImageProcessor

* test_image_processing_common.py 업데이트

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* feat: add missing BaseImageProcessorFast import

* fix: `issubclass` for discriminating subclass of BaseImageProcessorFast
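
Per the bullets above, the new flag reduces to an `issubclass` check against BaseImageProcessorFast; a hedged usage sketch (the checkpoint is chosen only as an example):

```python
from transformers import AutoImageProcessor
from transformers.image_processing_utils_fast import BaseImageProcessorFast

processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
# is_fast discriminates fast processors via an issubclass check
print(processor.is_fast, issubclass(type(processor), BaseImageProcessorFast))
```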

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-08-12 12:14:57 +00:00
952fac100d Enable SIM rules (#39806)
* Enable SIM rules

Signed-off-by: cyy <cyyever@outlook.com>

* More fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-08-12 12:14:26 +00:00
41d1717882 New DynamicSlidingWindowLayer & associated Cache (#40039)
* start adding the layer

* style

* improve

* modular

* fix

* fix

* improve

* generate integration

* comment

* remove old one

* remove

* fix

* fix

* fix

* fix all recompiles

* fix

* doc

* fix

* add text config check

* fix encoderdecoder cache

* add it for all models with sliding/hybrid support

* revert

* start fixing

* prophetnet

* fsmt

* fix ddp_data

* add test for mistral

* improve mistral test and add gemma2 test

* docstrings
2025-08-12 14:09:52 +02:00
ab455e0d88 Audio encodings now match conv2d weight dtype in Gemma3nAudioSSCPConvBlock (#39743)
audio encodings now match conv weight dtype in Gemma3nAudioSSCPConvBlock
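
A simplified stand-in for the fix (assumed shape, not the real Gemma3n block): cast inputs to the conv weight's dtype before the convolution.

```python
import torch
from torch import nn

class SSCPConvBlock(nn.Module):  # illustrative stand-in, not the actual Gemma3n class
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3, dtype=torch.bfloat16)

    def forward(self, audio_encodings: torch.Tensor) -> torch.Tensor:
        # match the conv2d weight dtype so fp32 encodings work with a bf16 conv
        return self.conv(audio_encodings.to(self.conv.weight.dtype))

block = SSCPConvBlock()
out = block(torch.randn(1, 1, 8, 8))  # fp32 in, bf16 conv, no dtype mismatch
```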
2025-08-12 12:08:28 +00:00
4b3a1a62cc Causal loss for ForConditionalGeneration (#39973)
* feat: add ForConditionalGeneration loss to LOSS_MAPPING

* consistent spelling of "recognized"
2025-08-12 14:03:09 +02:00
f6b6e17719 Add glm4.5&&glm4.5V doc (#40095)
* Docs: GLM-4-MoE & GLM-4V-MoE pages

* Docs: polish GLM-4V-MoE intro, remove placeholders; pin image

* Docs

---------

Co-authored-by: wujiahan <lambert@gmail.com>
2025-08-12 11:44:53 +00:00
1c5e17c025 Update Glm4V processor and add tests (#39988)
* update GLm4V and add tests

* Update tests/models/glm4v/test_processor_glm4v.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* remove min/max pixels for BC

* fix video tests

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-08-12 13:40:54 +02:00
913c0a8c33 [docs] Zero Shot Object Detection Task (#40096)
* refactor zsod task docs

* keeping the image guided od section

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/tasks/zero_shot_object_detection.md

Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
2025-08-12 11:43:38 +01:00
c6fbfab61b [fix] batch inference for llava_onevision (#40021)
* [fix] llava onevision batch inference

* style

* cannot pass inconsistent list & handle text-only case
2025-08-12 11:01:00 +02:00
86bb1fcd26 Revert FA2 kwargs construction (#40029)
* revert

* use imports

* went way too high in imports level

* style
2025-08-12 10:48:35 +02:00
3ff2e984d2 Fix PerceptionLM image preprocessing for non-tiled image input. (#40006)
* Fix PerceptionLM image preprocessing for non-tiled image input.

* Add test for single tile vanilla image processing.

* ruff format

* recover missing test skip

* Simplify test.

* minor test name fix
2025-08-12 08:40:22 +00:00
4668ef1459 Update notification service MI325 (#40078)
add mi325 to amd_daily_ci_workflows
2025-08-12 10:22:52 +02:00
1cea763ba4 feat: extract rev in attn_implementation kernels via @ (#40009)
* feat: extract rev in attn_implementation kernels via @

* fix: adjust for ruff

* fix: update regex and add explanatory comment

* fix: move attn_implementation kernel doc

* fix: remove extra line
2025-08-11 15:14:13 -04:00
e29919f993 [GPT Big Code] Fix attention scaling (#40041)
* fix

* update integration tests

* fmt

* add regression test
2025-08-11 19:01:31 +00:00
eca703026e chore: standardize DeBERTa model card (#37409)
* chore: standardize DeBERTa model card

* Apply suggestions from code review in docs

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: Update deberta.md with code cleanup suggestions

* Update docs/source/en/model_doc/deberta.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/deberta.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update deberta.md

* Update deberta.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-11 10:30:37 -07:00
43001fd3c6 Fix time_spent in notification_service.py. (#40081)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-11 18:30:58 +02:00
5521c62b89 added Textnet fast image processor (#39884)
* feat: add fast image processor implementation for TextNet model

* chore: override to_dict method in TextNetImageProcessorFast for slow processor compatibility tests

* chore: update init method

* chore: coding and style checks

* chore: fixed code quality issue

* chore: override resize to handle size_divisor, move all preprocessing logic to child class

* fix: autoImageProcessor issue for textnet

* chore: cleanup

* simplify resize

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-08-11 11:44:31 -04:00
6b70d79b61 Fix repo consistency (#40077)
fix
2025-08-11 15:26:22 +02:00
7dd82f307b guard on model.eval when using torch.compile + FSDP2 (#37413)
guard on model.eval

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-11 13:22:42 +02:00
68eb1a9a63 Remove deprecated cache-related objects (#40035)
remove them
2025-08-11 10:30:14 +02:00
480653d271 fix: move super().__init__ after vision_config init in Mistral3Config (#40063)
fix: move super().__init__ after vision_config init in Mistral3Config (#40062)
2025-08-11 09:21:54 +02:00
502f253e20 [gemma3] update conversion key mapping (#39778)
update conversion key mapping
2025-08-11 09:21:13 +02:00
3124d1b439 [qwen-vl] fix beam search with videos (#39726)
* fix

* fix copies
2025-08-11 09:21:04 +02:00
1372a5b8c4 fix: resolve triton version check compatibility on windows (#39986)
* fix: resolve triton version check compatibility on windows

* style: remove trailing space

* fix: fix typo

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-08-11 08:53:19 +02:00
99c747539e unpin torchcodec==0.5.0 and use torch 2.8 on daily CI (#40072)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-10 22:27:39 +02:00
b59140b696 Update HuBERT model card according to template (#39742)
* Update HuBERT model card according to template

Standardized HuBERT doc, added ASR examples, Flash Attention 2 support, and quantization section.

* Address review comments and changes requested to hubert.md

* Update hubert.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-10 11:32:45 -07:00
f4d57f2f0c Revert "fix notification_service.py about time_spent" (#40044)
Revert "fix `notification_service.py` about `time_spent` (#40037)"

This reverts commit d2ba153b29feb9cc0e9818c1ce63a07679b47250.
2025-08-08 22:32:24 +02:00
7b20915f4e GLM-4.5V Model Support (#39805)
* init

* update

* update

* ruff

* t patch is 2 by default, not 1

* draft

* back

* back1

* update

* config update

* update using glm-41 format

* add self.rope_scaling = config.rope_scaling

* update config

* update

* remove the processor

* update

* fix tests

* update

* for test

* update

* update 2126

* self.rope_scaling is missing in GLM4MOE; let's add it

* update

* update

* Update modular_glm4v_moe.py

* change config

* update apply_multimodal_rotary_pos_emb

* format

* update

* Delete 3-rollout_qas_thinking_answers.py

* use right name

* update with place holder

* update

* use right rotary

* Update image_processing_glm4v_fast.py

* rope_config_validation needs to rewrite the entire config file in modular

* update

* changed name

* update

* Update modeling_glm4v_moe.py

* _init_weights should be added in Glm4vMoePreTrainedModel

* remove use_qk_norm

* Update modular_glm4v_moe.py

* remove use_qk_norm as it is not used

* fix style

* deprecations are not needed on new models

* fix merge issues

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
2025-08-08 17:39:52 +02:00
d2ba153b29 fix notification_service.py about time_spent (#40037)
temp

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-08 17:11:16 +02:00
f639c0c780 Bnb failling tests (#40026)
* initial commit

* style

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-08-08 16:28:00 +02:00
a96cccd0dd Tie weights recursively on all submodels (#39996)
* recursive call

* add missing keys

* remove bad keys
2025-08-08 16:03:16 +02:00
a78263dbb5 fix 2025-08-08 15:32:23 +02:00
dc11a3cbb2 [core] Refactor the Cache logic to make it simpler and more general (#39797)
* Simplify the logic quite a bit

* Update cache_utils.py

* continue work

* continue simplifying a lot

* style

* Update cache_utils.py

* offloading much simpler

* style

* Update cache_utils.py

* update inits

* Update cache_utils.py

* consistency

* Update cache_utils.py

* update generate

* style

* fix

* fix

* add early_initialization

* fix

* fix mamba caches

* update

* fix

* fix

* fix

* fix tests

* fix configs

* revert

* fix tests

* alright

* Update modeling_gptj.py

* fix the constructors

* cache tests

* Update test_cache_utils.py

* fix

* simplify

* back to before -> avoid compile bug

* doc

* mistral test

* llama4 test dtype

* Update test_modeling_llama4.py

* CIs

* Finally find a nice impl

* Update cache_utils.py

* Update cache_utils.py

* add lazy methods in autodoc

* typo

* better doc

* Add detailed docstring for lazy init

* CIs

* style

* fix
2025-08-08 14:47:21 +02:00
95510ab018 Fix missing None default values for Gemma3n model in get_placeholder_mask (#39991) (#40024)
* Fix missing None default values for Gemma3n model in get_placeholder_mask (#39991)

* Switched definition of optional from `| None` to `Optional[]` (Issue #39991)

---------

Co-authored-by: Laurenz Ruzicka <Laurenz.Ruzicka@ait.ac.at>
2025-08-08 10:43:42 +00:00
5c3fb7f731 Harmonize past_key_value to past_key_valueS everywhere (#39956)
* all modulars and llama

* apply modular

* bert and gpt2 copies

* fix imports

* do it everywhere

* fix import

* finalize it

* fix

* oups set it in modular

* style

* fix

* Add 1 version to deprecation cycle

* Update modeling_layers.py
2025-08-08 11:52:57 +02:00
2469cce621 Fix an annoying flaky test (#40000)
annoying flaky test
2025-08-08 10:32:51 +02:00
fe1bf82159 Higgs modules_to_not_convert standardization (#39989)
fix higgs
2025-08-08 10:22:59 +02:00
b374c3d12e Fix broken image inference for Fuyu model (#39915)
* fix fuyu

Signed-off-by: Isotr0py <2037008807@qq.com>

* oops

Signed-off-by: Isotr0py <2037008807@qq.com>

* run test on GPU

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* clean unused

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* revert

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* add fuyu multimodal test

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-08 07:21:49 +00:00
4d57c39007 pin torchcodec==0.5.0 for now with torch 2.7.1 on daily CI (#40013)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-07 23:05:39 +02:00
3e0333fa4a Update expected output values after #39885 (part 2) (#40015)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-07 22:52:53 +02:00
12f248bced Raising error when quantizing a quantized model (#39998)
* error when quantizing a quantized model

* style
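
A hedged sketch of the new guard (function and attribute names assumed):

```python
def _validate_not_already_quantized(config):
    # refuse to re-quantize: the checkpoint already carries a quantization_config
    if getattr(config, "quantization_config", None) is not None:
        raise ValueError("The model is already quantized; quantizing it again is not supported.")
```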
2025-08-07 20:37:25 +00:00
efaf3714dc docs: fix duplication in 'en/optimizers.md' (#40014) 2025-08-07 13:28:43 -07:00
ca4cbb1e3f unpin torch<2.8 on circleci (#40012)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-07 21:31:17 +02:00
78922577e9 FA2 can continue generation from cache (#39843)
* add fa2 support to continue generation from cache

* update q-len
2025-08-07 19:26:23 +02:00
9bfbdd2945 Fix default values of getenv (#39867)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-07 17:25:40 +00:00
692d336908 Fix HGNetV2 Model Card and Image Classification Pipeline Usage Tips (#39965)
* fix hgnet docs and image-classification pipeline

* use positional argument

* fix dit close hfoptions tag

* fix alphabet order

* fix hgnet modular docstring

* Update hgnet_v2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update hgnet_v2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/hgnet_v2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: hgnet reference

* change hgnet to en doc

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-07 09:33:29 -07:00
0659214196 fix: remove CHAT_TEMPLATE import in tests for deepseek-vl (#40003)
* remove CHAT_TEMPLATE import in tests

* update and use prepare_processor_dict
2025-08-07 16:19:36 +00:00
27997eeb8d Fix missing video inputs for PerceptionLM. (#39971)
* Fix missing video inputs for PerceptionLM.

* Minor fix for vanilla input image (only C,H,W, no tiles dim).

* Revert "Minor fix for vanilla input image (only C,H,W, no tiles dim)."

This reverts commit 181d87b964e59c4118035a9fd4f530c6e551ba9f.
2025-08-07 15:54:45 +00:00
bf1bd6ac1f Fix int4 quantized model cannot work with cpu (#39724)
* Fix int4 quantized model cannot work with cpu

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update the comments

Signed-off-by: yuanwu <yuan.wu@intel.com>

* update

Signed-off-by: yuanwu <yuan.wu@intel.com>

* update

Signed-off-by: yuanwu <yuan.wu@intel.com>

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-07 15:24:00 +00:00
43d3b1931a Update expected output values after #39885 (part 1) (#39990)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-07 16:00:28 +02:00
d5a0809707 Fix consistency (#39995)
* modular

* fix
2025-08-07 15:52:40 +02:00
b347e93567 [typing] Fix return typehint for decoder and inv_freq annotation (#39610)
* fix return typehint for decoder and annotate inv_freq

* fix modular

* Fix consistency

* Move annotation on class level

* missing annotations

* add comment
2025-08-07 14:10:22 +01:00
7188e2e28c Bump transformers from 4.48.0 to 4.53.0 in /examples/tensorflow/language-modeling-tpu (#39967)
Bump transformers in /examples/tensorflow/language-modeling-tpu

Bumps [transformers](https://github.com/huggingface/transformers) from 4.48.0 to 4.53.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.48.0...v4.53.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 4.53.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-07 12:13:48 +01:00
2b19a06692 Fix gemma3n feature extractor's incorrect squeeze (#39919)
* fix gemma3n squeeze

Signed-off-by: Isotr0py <2037008807@qq.com>

* add regression test

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-07 18:34:28 +08:00
555cbf5917 [Idefics] fix device mismatch (#39981)
fix
2025-08-07 11:12:04 +02:00
597ed1a11d Various test fixes for AMD (#39978)
* Add amd expectation in internvl

* Add amd expectation to llama

* Added bnb decorator for a llava test that requires bnb

* Added amd expectation for mistral3

* Style
2025-08-07 10:57:04 +02:00
6121e9e46c Support input_embeds in torch exportable decoders (#39836)
* Support input_embeds in torch exportable decoders

* Hybrid cache update

* Manually change some callsites

* AI changes the rest of the call sites

* Make either input_ids/inputs_embeds mandatory

* Clean up

* Ruff check --fix

* Fix test

* pr review

* Revert config/generation_config changes

* Ruff check
2025-08-07 08:51:31 +00:00
cdeaad96b7 [superglue] Fixed the way batch mask was applied to the scores before match assignment computation (#39968)
fix: mask filling to score was wrong
2025-08-07 09:49:39 +01:00
2593932f10 Gemma3 fixes (#39960)
* Fix multiple devices issue

* Added expectations for rocm 9.4

* Ruff
2025-08-07 09:57:21 +02:00
513f76853b Modular fix: remove the model name in find_file_type (#39897)
* remove the model name in the class name

* add comment
2025-08-06 23:31:07 +00:00
743bb5f52e chore: update Deformable_Detr model card (#39902)
* chore: update Deformable_Detr model card

* fix: added pipeline, automodel examples and checkpoints link

* Update deformable_detr.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-06 12:45:14 -07:00
ac0b468465 [bugfix] fix flash_attention_2 unavailable error on Ascend NPU (#39844) 2025-08-06 17:48:52 +00:00
cf243a1bf8 Fix fix_and_overwrite mode of utils/check_docstring.py (#39369)
* bug in fix mode of check_docstring
2025-08-06 19:37:25 +02:00
6902ffa505 remove triton_kernels dep with kernels instead (#39926)
* remove dep

* style

* rm import

* fix

* style

* simplify

* style
2025-08-06 19:31:20 +02:00
cb2e0df2ec [image processor] fix glm4v (#39964)
* fix glm4v image process

* Update src/transformers/models/glm4v/image_processing_glm4v.py

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-06 17:46:58 +01:00
9ab75fc428 fix typo (#39936)
* fix typo

* fix modular instead

* fix

---------

Co-authored-by: y.korobko <y.korobko@tbank.ru>
2025-08-06 16:21:24 +00:00
43b3f58875 Fix grammatical error in MoE variable name: expert_hitted → expert_hit, hitted_experts → hit_experts (#39959)
* Fix grammatical error: expert_hitted -> expert_hit in MoE implementations

* Fix grammatical error: hitted_experts -> hit_experts in MoE implementation
2025-08-06 15:45:19 +00:00
dff6185d61 docs: fix typo in 'quantization-aware training' (#39904) 2025-08-06 14:52:43 +00:00
c7844c7a8e Enable gpt-oss mxfp4 on older hardware (sm75+) (#39940)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-06 13:39:21 +00:00
dd70a8cb9d Fix MXFP4 quantizer validation to allow CPU inference with dequantize option (#39953)
* Fix MXFP4 quantizer validation to enable CPU dequantization

Move dequantize check before CUDA availability check to allow
CPU inference when quantization_config.dequantize is True.
This enables users to run MXFP4 models on CPU by automatically
converting them to BF16 format.

* Add tests for MXFP4 quantizer CPU dequantization validation

* fix: format mxfp4 test file with ruff
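
Roughly, the reordered validation looks like this (a sketch under assumed names, not the quantizer's exact code):

```python
import torch

def validate_environment(quantization_config):
    # check dequantize first: CPU inference is fine when weights are dequantized to BF16
    if getattr(quantization_config, "dequantize", False):
        return
    if not torch.cuda.is_available():
        raise RuntimeError("MXFP4 inference requires CUDA; set dequantize=True to run on CPU.")
```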
2025-08-06 15:20:41 +02:00
82eb67e62a [docs] ko toc fix (#39927) 2025-08-06 10:12:34 +00:00
9e76a6bb54 circleci: pin torch 2.7.1 until torchcodec is updated (#39951)
circleci torch 2.7.1

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-06 11:18:00 +02:00
910b319357 Fix CI: Tests failing on CPU due to torch.device('cpu').index being None (#39933)
replace routing_weights.device.index with a
2025-08-06 10:22:43 +02:00
369c99d0ce Avoid utils/check_bad_commit.py failing due to rate limit (requesting api.github.com) (#39918)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-05 21:52:20 +02:00
b771e476a8 [CI] post-GptOss fixes for green CI (#39929) 2025-08-05 20:04:59 +02:00
eb6e26acf3 Dev version 2025-08-05 18:09:30 +02:00
c54203a32e gpt_oss last chat template changes (#39925)
Last chat template changes
2025-08-05 18:08:08 +02:00
7c38d8fc23 Add GPT OSS model from OpenAI (#39923)
* fix

* nice

* where i am at

* Bro this works

* Update src/transformers/integrations/tensor_parallel.py

* cleanups

* yups that was breaking

* Update src/transformers/models/openai_moe/modeling_openai_moe.py

* gather on experts and not mlp

* add changes for latest convert branch

* adds options to get output_router_logits from config

* bring chat temlate + special tokens back into the script.

* initial commmit

* update

* working with shards

* add model.safetensors.index.json

* fix

* fix

* mxfp4 flag

* rm print

* Fix PAD/EOS/BOS (#18)

* fix pad/eos/bos

* base model maybe one day

* add some doc

* special tokens based on harmony.

* add in tokenizer config as well.

* prepare for rebase with main

* Fix for initialize_tensor_parallelism  now returning 4-tuple

```
[rank0]:   File "/fsx/edward/work/openai-tsm-examples/examples/generate.py", line 17, in <module>
[rank0]:     model = AutoModelForCausalLM.from_pretrained(
[rank0]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/fsx/edward/work/new-model-addition-openai/src/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
[rank0]:     return model_class.from_pretrained(
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/fsx/edward/work/new-model-addition-openai/src/transformers/modeling_utils.py", line 316, in _wrapper
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/fsx/edward/work/new-model-addition-openai/src/transformers/modeling_utils.py", line 4748, in from_pretrained
[rank0]:     tp_plan, device_map, device_mesh = initialize_tensor_parallelism(tp_plan, tp_size=None)
[rank0]:     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: ValueError: too many values to unpack (expected 3)
```

* mxfp4

* mxfp4 draft

* fix

* fix import

* draft

* draft impl

* finally working !

* simplify

* add import

* working version

* consider blocks and scales

* device mesh fix

* initial commit

* add working dequant + quant logic

* update

* non nan, gibberish output

* working EP + quantization finally !

* start cleaning

* remove reversing process

* style

* some cleaning

* initial commmit

* more cleaning

* more cleaning

* simplify

* more cleaning

* rm duplicated function

* changing tp_plan

* update tp plan check

* add loading attribute

* dequantizing logic

* use subfunctions

* import cleaning

* update_param_name

* adds clamped swiglu

* add clamping to training path

* simplify dequant logic

* update

* Bad merge

* more simplifications & tests

* fix !

* fix registering custom attention

* fix order

* fixes

* some test nits

* nits

* nit

* fix

* Clamp sink logits

* Clean

* Soft-max trick

* Clean up

* p

* fix deepspeed

* update both modeling and modular for cleanup

* contiguous

* update tests

* fix top_k router call

* revert renaming

* test nits

* small fixes for EP

* fix path for our local tests

* update as I should not have broken that!

* fix the loss of mixtral

* revert part of the changes related to router_scores, kernel probably no ready for that!

* deleting a small nit

* update arch

* fix post processing

* update

* running version but not expected output

* moving to cuda

* initial commit

* revert

* erroring when loading on cpu

* updates

* del blocks, scales

* fix

* style

* rm comm

* comment

* add comment

* style

* remove duplicated lines

* Fix minor issue with weight_map conversion script

* fix sampling params

* rename to final name

* update pre-final version of template

* Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py

* fix batched inference

* serve fixes

* swizzle !

* update final chat template by Matt.

* fix responses; pin oai

* simplify

* Thanks Matt for his tireless efforts!

Co-authored-by: Rocketknight1 <Rocketknight1@users.noreply.github.com>

* Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* fix

* Use ROCm kernels from HUB

* Make kernel modes explicit

* update final chat template by Matt. x2

* Thanks Matt for his tireless efforts!

Co-authored-by: Rocketknight1 <Rocketknight1@users.noreply.github.com>

* Fix installation

* Update setup.py

Co-authored-by: Ákos Hadnagy <akos.hadnagy@gmail.com>

* allow no content

* fix: update message handling in write_tokenizer function

* Fix template logic for user message role

* last nits for CB and flash_paged!

* there was one bad merge

* fix CB (hardcode for now, its just using kv groups instead)

* fix

* better fix for device_map

* minor device fix

* Fix flash paged

* updates

* Revert "remove dtensors, not explicit (#39840)"

This reverts commit 6dfd561d9cd722dfc09f702355518c6d09b9b4e3.

* update

* Revert "remove dtensors, not explicit (#39840)"

This reverts commit 6dfd561d9cd722dfc09f702355518c6d09b9b4e3.

* fix merge

* fix

* Fix line break when a custom model identity is used

* nits testing

* to locals first and pass sliding window to flash paged

* register modes for MegaBlocksMoeMlp

* add integration test in fixtures -> now update the tests to use it!

* update integration tests

* initial fix

* style and update tests

* fix

* chore(gpt oss): remove mlp_bias from configuration

It was just a leftover.

* stats

* Integration tests

* whoops

* Shouldn't move model

* Ensure assistant messages without thinking always go to "final" channel

* More checks to ensure expected format

* Add pad_token_id to model configuration in write_model function (#51)

* Add oai fix fast tests (#59)

* Fix some fast tests

* Force some updates

* Remove unnecessary fixes

* Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py

* reasoning -> Reasoning

* Add additional integration tests

* fixup

* Slight fixes

* align chat template with harmony

* simplify

* Add comment

* torch testing assert close

* torch testing assert close

* torch testing assert close

* torch testing assert close

* torch testing assert close

* torch testing assert close

* Revert fixup

* skip 2 test remove todo

* merge

* padding side should be left for integration tests

* fix modular wrt to changes made to modeling

* style

* isort

* fix copies for the loss

* mmmm

---------

Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: edbeeching <edbeeching@gmail.com>
Co-authored-by: Vaibhavs10 <vaibhavs10@gmail.com>
Co-authored-by: MekkCyber <mekk.cyber@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Edward Beeching <edbeeching@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
Co-authored-by: Zhuohan Li <zhuohan@openai.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: joao@huggingface.co <joao@ip-10-53-88-32.ec2.internal>
Co-authored-by: Rocketknight1 <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Akos Hadnagy <akos@ahadnagy.com>
Co-authored-by: Ákos Hadnagy <akos.hadnagy@gmail.com>
Co-authored-by: Alvaro Moran <alvaro.moran@huggingface.co>
Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: Matt <rocketknight1@gmail.com>
2025-08-05 18:02:18 +02:00
738c1a3899 🌐 [i18n-KO] Translated cache_explanation.md to Korean (#39535)
* update: _toctree.yml

* docs: ko: cache_explanation.md

* feat: nmt draft

* fix: apply yijun-lee's comments

* fix: apply 4N3MONE's comments

* docs: update cache_position

* docs: update cache-storage-implementation

* update: add h2 tag in cache-position

---------

Co-authored-by: taehyeonjeon <xogus294@gmail.com>
2025-08-05 08:20:13 -07:00
d2ae766836 Export SmolvLM (#39614)
Export SmolVLM for ExecuTorch
2025-08-05 16:20:23 +02:00
c430047602 [docs] update object detection guide (#39909)
* Update object_detection.md

* Update object_detection.md
2025-08-05 14:07:21 +00:00
dedcbd6e3d run model debugging with forward arg (#39905)
* run model debugging a lot simpler

* fixup

* Update src/transformers/utils/generic.py

* fixup

* mode syle?

* guard a bit
2025-08-05 15:46:19 +02:00
20ce210ab7 Revert "remove dtensors, not explicit (#39840)" (#39912)
* Revert "remove dtensors, not explicit (#39840)"
This did not work with generation (lm_head needs extra care!)
This reverts commit 6dfd561d9cd722dfc09f702355518c6d09b9b4e3.

* update

* style?
2025-08-05 15:12:14 +02:00
2589a52c5c Fix aria tests (#39879)
* fix aria tests

* awful bug

* fix copies

* fix tests

* fix style

* revert this
2025-08-05 13:48:47 +02:00
6e4a9a5b43 Fix eval thread fork bomb (#39717) 2025-08-05 10:50:32 +00:00
98a3c49135 Replace video_fps with fps in tests (#39898)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-05 10:39:55 +00:00
1af1071081 Fix misleading WandB error when WANDB_DISABLED is set (#39891)
When users set `report_to="wandb"` but also have `WANDB_DISABLED=true` in their environment,
the previous error message was misleading: "WandbCallback requires wandb to be installed. Run pip install wandb."

This was confusing because wandb was actually installed, just disabled via the environment variable.

The fix detects this specific case and provides a clear, actionable error message explaining
the conflict and how to resolve it.
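
A minimal sketch of the detection described above (assumed shape, not the exact callback code):

```python
import importlib.util
import os

def _require_wandb():
    if os.getenv("WANDB_DISABLED", "").lower() in {"true", "1"}:
        raise RuntimeError(
            "report_to='wandb' conflicts with WANDB_DISABLED=true in the environment; "
            "unset WANDB_DISABLED or drop 'wandb' from report_to."
        )
    if importlib.util.find_spec("wandb") is None:
        raise RuntimeError("WandbCallback requires wandb to be installed. Run `pip install wandb`.")
```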
2025-08-05 10:18:18 +00:00
78ef84921b Avoid aliasing in cond's branches for torch 2.8 (#39488)
Avoid aliasing in cond's branches

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-08-05 11:18:11 +02:00
9e676e6a0e [qwen] remove unnecessary CUDA sync in qwen2_5_vl (#39870)
Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-05 08:54:16 +00:00
392be3b282 fix test_working_of_tp failure of accelerate ut (#39828)
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-08-05 08:52:57 +00:00
cc5de36454 [Exaone4] Fixes the attn implementation! (#39906)
* fix

* fix config
2025-08-05 09:29:16 +02:00
00d47757bf Reorder serving docs (#39634)
* Slight reorg

* LLMs + draft VLMs

* Actual VLM examples

* Initial responses

* Reorder

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/tiny_agents.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/open_webui.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/cursor.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Responses API

* Address Pedro's comments

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2025-08-05 08:43:06 +02:00
8c4ea670dc chore: update DETR model card (#39822)
* Update model card for DETR

* fix: applied suggested changes

* fix: simplified pipeline and modified notes and resources

* Update detr.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-04 12:25:53 -07:00
0bd91cc822 Add support for ModernBertForMultipleChoice (#39232)
* implement ModernBertForMultipleChoice

* fixup, style, repo consistency

* generate modeling_modernbert

* add tests + docs

* fix test
2025-08-04 20:45:43 +02:00
801e869b67 send some feedback when manually building doc via comment (#39889)
* fix

* fix

* fix

* Update .github/workflows/pr_build_doc_with_comment.yml

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-08-04 18:20:48 +00:00
ee7eb2d0b1 Update cohere2 vision test (#39888)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-04 20:08:18 +02:00
3bafa128dc [DOCS] : Improved mimi model card (#39824)
* [DOCS] : Improved mimi model card

* Removed additional header

* Review: addressed feedback

* Update mimi.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-04 10:07:06 -07:00
192acc2d0f Fix link to models in README (#39880)
Update README.md
2025-08-04 09:34:41 -07:00
7dca2ff8cf [typing] better return type hint for AutoModelForCausalLM and AutoModelForImageTextToText (#39881)
* Better return type hint for AutoModelForCausalLM and AutoModelForImageTextToText

* fix imports

* fix
2025-08-04 15:03:53 +00:00
3edd14610e Set torch.backends.cudnn.allow_tf32 = False for CI (#39885)
* fix

* fix

* [test all]

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-04 16:55:16 +02:00
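For reference, these are the standard PyTorch switches involved; disabling TF32 keeps CI numerics reproducible against float32 baselines on Ampere+ GPUs:

```python
import torch

# TF32 speeds up matmuls/convolutions on Ampere+ GPUs at reduced precision;
# turning it off avoids flaky tolerance failures in numeric tests.
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cudnn.allow_tf32 = False
```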
e3505cd4dc Replace Tokenizer with PreTrainedTokenizerFast in ContinuousBatchProcessor (#39858)
Replace Tokenizer with PreTrainedTokenizerFast in ContinuousBatchProcessor
2025-08-04 16:39:19 +02:00
380b2a0317 Rework add-new-model-like with modular and make test filenames coherent (#39612)
* remove tf/flax

* fix

* style

* Update add_new_model_like.py

* work in progress

* continue

* more cleanup

* simplify and first final version

* fixes -> it works

* add linter checks

* Update add_new_model_like.py

* fix

* add modular conversion at the end

* Update add_new_model_like.py

* add video processor

* Update add_new_model_like.py

* Update add_new_model_like.py

* Update add_new_model_like.py

* fix

* Update image_processing_auto.py

* Update image_processing_auto.py

* fix post rebase

* start test filenames replacement

* rename all test_processor -> test_processing

* fix copied from

* add docstrings

* Update add_new_model_like.py

* fix regex

* improve wording

* Update add_new_model_like.py

* Update add_new_model_like.py

* Update add_new_model_like.py

* start adding test

* fix

* fix

* proper first test

* tests

* fix

* fix

* fix

* fix

* modular can be used from anywhere

* protect import

* fix

* Update add_new_model_like.py

* fix
2025-08-04 14:41:09 +02:00
5fb5b6cfaf Fix quant docker for fp-quant (#39641)
* fix quant docker

* Apply style fixes

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-08-04 11:57:08 +00:00
16d6faef9a [core] Fix attn_implementation setter with missing sub_configs (#39855)
* fix

* add sub_configs

* remove case for attention setter

* fix None

* Add test

* Fix sub-configs

* fix tests_config

* fix consistency

* fix fsmt

* fix
2025-08-04 11:35:09 +01:00
2a9febd632 Add support for including in-memory videos (not just files/urls) in apply_chat_template (#39494)
* added code for handling a video object, as a dictionary of frames and metadata, in the chat template

* added new test where videos are passed as objects (dict of frames, metadata) in the chat template

* modified the hardcoded video_len check that did not match the increased number of test cases.

* Modify hardcoded video_len check that fails with increased number of tests

* update documentation of multi-modal chat templating with extra information about including video object in chat template.

* add array handling in load_video()

* temporary test video included

* skip testing smolvlm with videos that are list of frames

* update documentation & make fixup

* Address review comments
2025-08-04 11:49:42 +02:00
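A sketch of what passing an in-memory video object could look like, with a hypothetical frames/metadata schema and a placeholder `processor`; the exact keys accepted are defined by the PR above:

```python
import numpy as np

# Hypothetical schema: a dict of decoded frames plus per-video metadata.
frames = np.zeros((8, 224, 224, 3), dtype=np.uint8)  # 8 dummy RGB frames
video = {"frames": frames, "metadata": {"fps": 2.0, "duration": 4.0}}

messages = [{
    "role": "user",
    "content": [
        {"type": "video", "video": video},
        {"type": "text", "text": "Describe the clip."},
    ],
}]
# inputs = processor.apply_chat_template(messages, tokenize=True, return_tensors="pt")
```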
0d511f7a77 Use comment to build doc on PRs (#39846)
* try

* try

* try

* try

* try

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-04 10:24:45 +02:00
4819adbbaa Refactor label name handling for PEFT models in Trainer class (#39265)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-04 06:29:57 +00:00
166fcad3f8 Improve is_wandb_available function to verify WandB installation (#39875)
Improve `is_wandb_available` function to verify WandB installation by checking for a key attribute
2025-08-04 08:22:52 +02:00
6dfd561d9c remove dtensors, not explicit (#39840)
* remove dtensors, not explicit

Co-authored-by: 3outeille <3outeille@users.noreply.github.com>

* style

* fix test

* update

* as we broke saving try to fix

* output layouts should exit

* nit

* devicemesh exists if it was distributed

* use _device_mesh of self

* update

* lol

* fix

* nit

* update

* fix!

* this???

* grumble grumble

* ?

* fuck me

---------

Co-authored-by: 3outeille <3outeille@users.noreply.github.com>
2025-08-01 22:02:47 +02:00
b727c2b20e Allow TrackioCallback to work when pynvml is not installed (#39851)
Allow TrackioCallback to work when pynvml is not installed
2025-08-01 18:57:25 +02:00
1ec0feccdd [image-processing] deprecate plot_keypoint_matching, make visualize_keypoint_matching as a standard (#39830)
* fix: deprecate plot_keypoint_matching and make visualize_keypoint_matching for all Keypoint Matching models

* refactor: added copied from

* fix: make style

* fix: repo consistency

* fix: make style

* docs: added missing method in SuperGlue docs
2025-08-01 16:29:57 +00:00
7b4d9843ba Add fast image processor Janus, Deepseek VL, Deepseek VL hybrid (#39739)
* add fast image processor Janus, deepseek_vl, deepseek_vl_hybrid

* fix after review
2025-08-01 12:20:08 -04:00
88ead3f518 Fix responses add tests (#39848)
* Quick responses fix

* [serve] Fix responses API and add tests

* Remove typo

* Remove typo

* Tests
2025-08-01 18:06:08 +02:00
6ea646a03a Update ux cb (#39845)
* cleanup

* nits

* updates

* fix logging

* push updates?

* just pass exception

* update

* nits

* fix

* add tokencount

* style
2025-08-01 16:50:28 +02:00
3951d4ad5d Add MM Grounding DINO (#37925)
* first commit

Added a modular implementation for MM Grounding DINO from the starting point created by add-new-model-like. Added a conversion script from mmdetection to huggingface.

TODO: Some tests are failing and still need to be fixed.

* fixed a bug with modular definition of MMGroundingDinoForObjectDetection where box and class heads were not correctly assigned to inner model

* cleaned up a hack in the conversion script

* Fixed the expected values in integration tests

However, cross-attention masking and CPU-GPU consistency tests are still failing.

* changes for make style and quality

* add documentation

* clean up contrastive embedding

* add mm grounding dino to loss mapping

* add model link to config docstring

* hack fix for mm grounding dino consistency tests

* add special cases for unused config attr check

* add all models and update docs

* update model doc to the new style

* Use super_kwargs for modular config

* Move init to the _init_weights function

* Add copied from for tests

* fixup

* update typehints

* Fix-copies for tests

* fix-copies

* Fix init test

* fix snippets in docs

* fix consistency

* fix consistency

* update conversion script

* fix nits in readme and remove old comments from conversion script

* add license

* remove unused config args

* remove unnecessary if/else in model init

* fix quality

* Update references

* fix test

* fixup

---------

Co-authored-by: qubvel <qubvel@gmail.com>
2025-08-01 15:43:23 +01:00
50145474b7 [typecheck] proper export of private symbols (#39729)
* Export private symbols

Signed-off-by: cyy <cyyever@outlook.com>

* Update src/transformers/__init__.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/__init__.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Fix format

Signed-off-by: cyy <cyyever@outlook.com>

* Add a comment for exported symbols

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-01 13:36:47 +01:00
c962f1515e [attn_implementation] remove recursive, allows custom kernels with wrappers (#39823)
* fix?

* fixme and style

* Update src/transformers/modeling_utils.py

* update

* update

* fix

* small fixees

* nit

* nits

* fix init check?

* fix

* fix default

* or fucks me

* nits

* include a small nit

* does this make it happy?

* fixup

* fix the remaining ones
2025-08-01 12:18:28 +02:00
d3b8627b56 [VLMs] split out "get placeholder mask" to helper (#39777)
* batch update all models

* update

* forgot about llava onevision

* update

* fix tests

* delete file

* typo

* fix emu3 once and forever

* update cohere2 vision as well
2025-08-01 08:01:06 +00:00
a115b67392 Fix tp cb (#39838)
* fixes

* one more
2025-08-01 09:59:04 +02:00
2c0af41ce5 Fix bad markdown links (#39819)
Fix bad markdown links.
2025-07-31 09:14:14 -07:00
4fcf455517 Fix broken links (#39809)
Replace links of the form `[text]((url))` with `[text](url)`, which is the
correct Markdown link format.
2025-07-31 13:23:04 +00:00
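The broken pattern is mechanical enough that a one-line regex rewrite covers it; a self-contained sketch:

```python
import re

BAD_LINK = re.compile(r"\[([^\]]+)\]\(\(([^()]+)\)\)")  # matches [text]((url))

def fix_links(markdown: str) -> str:
    return BAD_LINK.sub(r"[\1](\2)", markdown)

print(fix_links("See [the docs]((https://huggingface.co/docs))."))
# -> See [the docs](https://huggingface.co/docs).
```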
b937d47455 [cohere2 vision] move doc to multimodal section (#39820)
move doc to multimodal section
2025-07-31 15:13:02 +02:00
6ba8a1ff45 Update documentation for Cohere2Vision models (#39817)
* Update docs with pipeline example

* Add Cohere2Vision to list of vision models

* Sort models
2025-07-31 11:58:45 +00:00
e1688d28d3 [Model] Cohere2 Vision (#39810)
* Add cohere2_vision to support CohereLabs/command-a-vision-07-2025

* update and add modular file

* update processors and check with orig impl later

* delete unused files

* image processor reduce LOC and re-use GotOCR2

* update the config to use modular

* model tests pass

* processor fixes

* check model outputs decorator

* address one more comment

* Update tokens. Temp - need to read from tokenizer

* fix for multi-gpu

* Fix image token handling

* update image token expansion logic

* fix a few issues with remote code loading

* not related but modular forces us to change all files now

* Add overview and code sample to cohere vision docs

* add scripts. TMP.

* Update inference script

* Create script

* set dtype in export script

* TO revert: modular export fix

* Fix scripts

* Revert "TO revert: modular export fix"

This reverts commit bdb2f305b61027a05f0032ce70d6ca698879191c.

* Use modular weights

* Upload to hub

Removed OOD weights and script

* Updated docs

* fix import error

Update docs

Added pipeline test

* Updated docs

* Run modular script

remove modular for config

Added patch_size

Added docstrings in modular

Fix OOM

Add docs, fixup integration tests. 8-gpu passing

* tiny updates

* address comments + fixup

* add test for chat template

* check model outputs workaround

* aya vision fix check model inputs

* Revert "add test for chat template"

This reverts commit 42c756e397f588d76b449ff1f93292d8ee0202d8.

* revert more changes

* last revert

* skip and merge

* faulty copy from

---------

Co-authored-by: Julian Mack <julian.mack@cohere.com>
Co-authored-by: kyle-cohere <kyle@cohere.com>
2025-07-31 10:57:34 +00:00
6c3f27ba61 [docs] fix korean docs yet again (#39813)
fix korean docs yet again
2025-07-31 09:13:25 +00:00
cb289ad243 feat(tokenization): add encode_message to tokenize messages one by one (#39507)
* feat(tokenization): add encode_message to tokenize messages one by one

* Fix the `encode_message` method, remove the `add_generation_prompt` parameter and add the corresponding error handling. Update the document to reflect this change and verify the error handling in the test.

* Optimize the `encode_message` method, improve the processing logic of the empty dialogue history, and ensure that the chat template can be applied correctly when the dialogue history is empty. Update the document to reflect these changes.

* Delete the `_encode_message` method, simplify the message encoding logic, and keep the `encode_message` method fully functional. Update the documentation to reflect these changes.

* Docs fix

* Revert changes in docstring of pad()

* Revert changes in docstring

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Rename the `encode_message` method to `encode_message_with_chat_template` to support the chat template, and adjust the relevant test cases to reflect this change.

* Optimize the call format of the `apply_chat_template` method, and merge multi-line calls into a single line to improve code readability.

---------

Co-authored-by: pco111 <15262555+pco111@user.noreply.gitee.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-31 10:55:45 +02:00
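A usage sketch under stated assumptions: the method name comes from the PR above, but the exact signature (one message dict in, token ids out) is assumed here:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

conversation = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]

# Tokenize messages one at a time instead of re-templating the full history.
ids = []
for message in conversation:
    ids += tok.encode_message_with_chat_template(message)  # signature assumed
```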
4f93cc9174 fix: providing a tensor to cache_position in model.generate kwargs always crashes because of boolean test (#39300)
* fix: cache_position: RuntimeError: Boolean value of Tensor with more than one value is ambiguous

* test cache_position

* move test

* propagate changes

---------

Co-authored-by: Masataro Asai <guicho2.71828@gmail.com>
2025-07-30 17:30:28 +00:00
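The underlying pitfall is easy to reproduce: truth-testing a multi-element tensor is ambiguous, so presence checks must be explicit.

```python
import torch

cache_position = torch.tensor([0, 1, 2])

# Buggy: `if cache_position:` raises
# "RuntimeError: Boolean value of Tensor with more than one value is ambiguous".

# Fixed: test for presence, not truthiness.
if cache_position is not None:
    print(cache_position.shape)  # torch.Size([3])
```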
9b3203f47b Add callback to monitor progress in whisper transcription (#37483)
* Add callback to monitor progress in whisper transcription

* Added `` around variables, rewording

* Add example of `monitor_progress`.

---------

Co-authored-by: Eric B <ebezzam@gmail.com>
2025-07-30 17:40:53 +02:00
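A hedged usage sketch; the `monitor_progress` keyword comes from the PR, but the callback's argument shape is an assumption here:

```python
from transformers import pipeline

def monitor_progress(progress):
    # Assumed: the callback receives some progress indication per chunk.
    print(f"transcription progress: {progress}")

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
# result = asr("audio.flac", monitor_progress=monitor_progress)
```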
7abb5d3992 Update mT5 model card (#39702)
* Update mt5 model card

* Fix casing of model title

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-30 08:35:04 -07:00
1019b00028 Update model card for Cohere2 (Command R7B) (#39604)
* Update model card for Cohere2 (Command R7B)

* fix: applied suggested changes
2025-07-30 08:34:26 -07:00
ecbb5ee194 standardized BARThez model card (#39701)
* standardized barthez model card according to template

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* suggested changes to barthez model card

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-30 08:33:13 -07:00
8e077a3e45 Fix re-compilations for cross attention cache (#39788)
fix recompilations for cross attn cache
2025-07-30 14:52:03 +02:00
1e0665a191 Simplify conditional code (#39781)
* Use !=

Signed-off-by: cyy <cyyever@outlook.com>

* Use get

Signed-off-by: cyy <cyyever@outlook.com>

* Format

* Simplify bool operations

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-30 12:32:10 +00:00
b94929eb49 Fix an invalid condition (#39762)
Fix an invalid judgement

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-30 12:19:17 +00:00
bb2ac66453 fix chameleonvision UT failure (#39646)
* fix chameleonvision UT failure

Signed-off-by: matrix.yao@intel.com <Yao Matrix>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: matrix.yao@intel.com <Yao Matrix>
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: root <Yao Matrix>
2025-07-30 12:09:26 +00:00
5348445dfa Super tiny update (#39727)
super tiny update
2025-07-30 12:21:41 +02:00
54cbea5615 more info in model_results.json (#39783)
more info

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-30 11:43:10 +02:00
01d5f94695 [ASR pipline] fix with datasets 4.0 (#39504)
* fix

* handle edge case

* make
2025-07-30 08:13:40 +00:00
8ab21be570 enable static cache on vision encoder decoder (#39773)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-30 08:10:46 +00:00
67cfe11528 Fix Evolla and xLSTM tests (#39769)
* fix all evolla

* xlstm
2025-07-30 09:51:55 +02:00
ec4033457e Don't set run_name when none (#39695)
* Don't set run_name when none

* revert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-30 01:39:29 +00:00
551a89a4a3 Standardize CLAP model card format (#39738)
* Standardize CLAP model card format

* Apply review feedback

* Remove Resources section
2025-07-29 14:13:04 -07:00
da70b1389a docs: Update EfficientLoFTR documentation (#39620)
* docs: Update EfficientLoFTR documentation

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-29 13:54:44 -07:00
ddd2100767 Fix OmDet test after arg deprecation (#39766)
fix arg name
2025-07-29 22:10:36 +02:00
4abb053b6c Remove python3.7 reference from doc link (#39706) 2025-07-29 09:17:13 -07:00
33aa49df9d [docs] Ko doc fixes after toc update (#39660)
* update docs

* doc builder working

* make fixup
2025-07-29 17:05:26 +01:00
c4e2069898 Fix Cache.max_cache_len max value for Hybrid models (#39737)
* fix gemma

* fix min

* fix quant init issue

* fix gemma 3n

* skip quant cache test

* fix modular

* new test for Gemma

* include cyril change

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-29 17:12:50 +02:00
075dbbceaa fix(trainer): Correct loss scaling for incomplete gradient accumulation steps (#39659)
* Fix issue[#38837]: wrong loss scaled in last step of epoch

* chore: trigger CI

* Update src/transformers/trainer.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Update src/transformers/modeling_flash_attention_utils.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

---------

Co-authored-by: taihang <taihang@U-2RHYVWX7-2207.local>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-07-29 17:12:31 +02:00
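A toy illustration of the scaling pitfall being fixed (not the Trainer's actual code): when the last accumulation window is shorter, dividing by the fixed step count mis-weights it.

```python
losses = [1.0] * 10   # 10 micro-batch losses in one epoch
accum_steps = 4

total = 0.0
for start in range(0, len(losses), accum_steps):
    window = losses[start:start + accum_steps]  # final window has only 2 items
    # Buggy: sum(window) / accum_steps under-weights the shorter final window.
    # Fixed: divide by the number of micro-batches actually accumulated.
    total += sum(window) / len(window)

print(total)  # 3.0 -- each window contributes its true mean loss
```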
1d061536cf 🌐 [i18n-KO] Translated how_to_hack_models.md to Korean (#39536)
* docs: ko: how_to_hack_models.md

* feat: nmt draft

* fix: manual edits
2025-07-29 08:09:16 -07:00
43fe41c0a8 🌐 [i18n-KO] Translated perf_train_gpu_one.md to Korean (#39552)
* docs: ko: perf_train_gpu_one.md

* feat: nmt draft

* fix: manual edits

* fix: Manually added missing backticks

* Update docs/source/ko/perf_train_gpu_one.md

fix: remove space between heading and GPU anchor

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/perf_train_gpu_one.md

fix: clarify table headers to indicate training speed boost and memory savings

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/perf_train_gpu_one.md

fix: improve readability

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/perf_train_gpu_one.md

fix: rephrase explanation of data preloading to improve readability

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2025-07-29 08:08:57 -07:00
9f38763731 🌐 [i18n-KO] Translated pipeline_gradio.md to Korean (#39520)
* docs: ko: pipeline_gradio.md

* feat: nmt draft

* fix: manual edits

* docs: ko: pipeline_gradio.md
2025-07-29 08:04:30 -07:00
f72311796b 🌐 [i18n-KO] Translated tokenizer.md to Korean (#39532)
* docs: ko: tokenizer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Yijun Lee <yijun-lee@users.noreply.github.com>

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2025-07-29 08:04:14 -07:00
d346d46752 🌐 [i18n-KO] Translated tvp.md to Korean (#39578)
* docs: ko: tvp.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

---------

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>
2025-07-29 08:04:00 -07:00
2f59c15b33 🌐 [i18n-KO] Translated albert.md to Korean (#39524)
* docs: ko: albert.md

* feat: nmt draft

* fix: manual edits
2025-07-29 08:03:40 -07:00
98386dcee9 🌐 [i18n-KO] Translated main_classes/peft.md (#39515)
* docs: ko: main_classes/peft.md

* feat: nmt draft

* docs: add missing TOC to documentation for `PeftAdapterMixin` section

Added a table of contents (TOC) to the documentation, specifically for the `transformers.integrations.PeftAdapterMixin` section, following the structure and content outlined in [this link](https://huggingface.co/docs/transformers/main/en/main_classes/peft#transformers.integrations.PeftAdapterMixin).

* fix: Improve naturalness of purpose expression in Korean

Changed '관리하기 위한' to '관리할 수 있도록' for more natural Korean expression when describing the purpose of providing functions.

* fix: Simplify plural form and make expression more concise

Changed '~할 수 없기 때문에' to '~할 수 없어' for more concise expression while maintaining clarity.

* fix: Replace technical term '주입' with more natural '적용'

Changed '주입할 수 없어' to '적용할 수 없어' for better readability.
Considered alternatives:

'삽입': Too literal translation of 'inject'
'입력': Could be misunderstood as data input
'통합': Implies merging two systems
'추가': Simple but less precise

'적용' was chosen as it's the most natural and widely used term in Korean technical documentation for this context.

* fix: update toctree path for PEFT to lowercase

Changed the toctree path from 'PEFT' (uppercase) to 'peft' (lowercase) to match the correct directory naming convention and prevent broken links.

* docs: update as per reviewer feedback after rebase
2025-07-29 08:03:17 -07:00
1ad216bd7d [modernbert] fix regression (#39750)
* fix regression

* add FA2 test
2025-07-29 16:58:59 +02:00
379209b603 add libcst to extras["testing"] in setup.py (#39761)
add

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-29 16:58:51 +02:00
abf101af1f Fix version issue in modeling_utils.py (#39759)
fix version issue
2025-07-29 16:15:30 +02:00
8db4d79161 Enable xpu allocator on caching_allocator_warmup (#39654)
* add xpu allocator

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix typo

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix variable name

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm useless default value

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-29 16:06:52 +02:00
fb141e2c90 Support loading Qwen3 MoE GGUF (#39638)
* support loading qwen3 gguf

* qwen3moe test cases

* fix whitespaces

* fix ggml tests
2025-07-29 13:44:44 +00:00
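Loading a GGUF checkpoint goes through the usual `from_pretrained` path with a `gguf_file` argument; the repo id and filename below are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Qwen/Qwen3-30B-A3B-GGUF"      # placeholder repo id
gguf = "qwen3-30b-a3b-q4_k_m.gguf"    # placeholder quantized file

tokenizer = AutoTokenizer.from_pretrained(repo, gguf_file=gguf)
model = AutoModelForCausalLM.from_pretrained(repo, gguf_file=gguf)
```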
ccb2e0e03b Fix GPT2 with cross attention (#39754)
* fix

* use new mask API

* style

* fix copies and attention tests

* fix head pruning tests
2025-07-29 15:40:31 +02:00
dfd616e658 Avoid OOM when other tests are failing (#39758)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-29 15:35:44 +02:00
65df73aa88 AMD disable torchcodec (#39757)
Temporarily disable torchcodec installation because of bizarre segfault
2025-07-29 13:07:25 +00:00
63b3200779 Use --gpus all in workflow files (#39752)
gpu all

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-29 14:53:33 +02:00
95faabf0a6 Apply several ruff SIM rules (#37283)
* Apply ruff SIM118 fix

Signed-off-by: cyy <cyyever@outlook.com>

* Apply ruff SIM910 fix

Signed-off-by: cyy <cyyever@outlook.com>

* Apply ruff SIM101 fix

Signed-off-by: cyy <cyyever@outlook.com>

* Format code

Signed-off-by: cyy <cyyever@outlook.com>

* More fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-29 11:40:34 +00:00
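The three rules named above map to small mechanical rewrites:

```python
d = {"a": 1}

# SIM118: membership tests go through the dict itself, not .keys().
"a" in d                       # instead of: "a" in d.keys()

# SIM910: .get() already defaults to None.
d.get("b")                     # instead of: d.get("b", None)

# SIM101: merge repeated isinstance checks into one call.
x = 3.0
isinstance(x, (int, float))    # instead of: isinstance(x, int) or isinstance(x, float)
```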
cf97f6cfd1 Fix mamba regression (#39728)
* fix mamba regression

* fix compile test
2025-07-29 12:44:28 +02:00
66984ed4f6 Update IMPORTANT_MODELS list (#39734) 2025-07-29 12:34:57 +02:00
de8d0cec30 update GemmaIntegrationTest::test_model_2b_bf16_dola again (#39731)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-29 11:42:55 +02:00
85d5aeb324 Fix: add back base model plan (#39733)
* Fix: add back base model plan

* Fix: typo

* fixup

* remove unused import

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
2025-07-29 11:37:33 +02:00
2a90193dd8 [Fix] import two missing typos in models/__init__.py for typo checking (#39745)
* [Fix] import lost gemma3n for type checking in vscode

* [Fix] import missing qwen2_5_omni typo

* [Refactor] sort by ascii order
2025-07-29 11:35:22 +02:00
f2aca3eccc fix cache inheritance (#39748)
* fix cache inheritance

* style
2025-07-29 11:24:44 +02:00
f3598a95c7 extend more trainer test cases to XPU, all pass (#39652)
extend more trainer test cases to XPU

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
2025-07-29 10:51:00 +02:00
75794792ad BLIPs clean-up (#35560)
* blips clean up

* update processor

* readability

* fix processor length

* fix copies

* tmp

* update and fix copies

* why keep these, delete?

* fix test fetcher

* irrelevant comment

* fix tests

* fix tests

* fix copies
2025-07-29 10:03:06 +02:00
4f8f51be4e Add Fast Segformer Processor (#37024)
* Add Fast Segformer Processor

* Modified the params according to segformer model

* modified test_image_processing_Segformer_fast args

- removed redundant params like do_center_crop,center_crop which aren't present in the original segformer class

* added segmentation_maps processing logic form the slow segformer processing module with references from beitimageprocessing fast

* fixed code_quality

* added recommended fixes and tests to make sure everything processess smoothly

* Fixed SegmentationMapsLogic

- modified the preprocessing of segmentation maps to use tensors
- added batch support

* fixed some mismatched files

* modified the tolerance for tests

* use modular

* fix ci

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-07-28 19:22:32 +00:00
c353f2bb5e Superpoint fast image processor (#37804)
* feat: superpoint fast image processor

* fix: reran fast cli command to generate fast config

* feat: updated test cases

* fix: removed old model add

* fix: format fix

* Update src/transformers/models/superpoint/image_processing_superpoint_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* fix: ported to torch and made requested changes

* fix: removed changes to init

* fix: init fix

* fix: init format fix

* fixed testcases and ported to torch

* fix: format fixes

* failed test case fix

* fix superpoint fast

* fix docstring

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-07-28 18:15:06 +00:00
14adcbd937 Fix AMD dockerfile for audio models (#39669) 2025-07-28 19:05:41 +02:00
1c6b47451d Fix cache-related tests (#39676)
* fix

* fix kyutai at last

* fix unrelated tests and copies

* update musicgen as well

* revert tensor

* fix old test failures

* why it wasn't added?
2025-07-28 17:30:11 +02:00
fc2bd1eac0 Fix Layer device placement in Caches (#39732)
* fix device placement

* style

* typo in comment
2025-07-28 16:37:11 +02:00
7623aa3e5f Fix Qwen2AudioForConditionalGeneration.forward() and test_flash_attn_kernels_inference_equivalence (#39503)
* Add missing cache_position argument.

* Pass cache_position to language model.

* Overwrite prepare_inputs_for_generation.

* Set model to half precision for Flash Attention test.

* Cast model to bfloat16.
2025-07-28 16:35:08 +02:00
28f2619868 skip Glm4MoeModelTest::test_torch_compile_for_training (#39670)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-28 16:30:40 +02:00
88aed92b59 Update QAPipelineTests::test_large_model_course after #39193 (#39666)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-28 16:26:49 +02:00
da823fc04e mllama outputs refactor (#39643)
* mllama outputs refactor

* forgot kwargs

* fix output

* add can_record_outputs

* correct @check_model_inputs placement

* ruff and copies

* rebase

* feedback

* only return hidden_states

---------

Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-153.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-14.ec2.internal>
2025-07-28 15:59:20 +02:00
686bb3b098 Remove all expired deprecation cycles (#39725)
* remove all deprecation cycles

* style

* fix

* remove

* remove

* fix

* Update modular_dpt.py

* back

* typo

* typo

* final fix

* remove all args
2025-07-28 15:43:41 +02:00
a0fa500a3d [CI] Add Eric to comment slow ci (#39601)
add to ci
2025-07-28 13:24:00 +00:00
4c7da9fedf PATCH: add back n-dim device-mesh + fix tp trainer saving (#39693)
* Feat: something

* Feat: initial changes

* tmp changes to unblock

* Refactor

* remove todo

* Feat: docstring

* Fix: saving of distributed model in trainer

* Fix: distributed saving with trainer

* Feat: add pure tp saving

* Only require tp dim if ndim > 1

* Fix: default to None

* Fix: better comments/errors

* Fix: properly check tp_size attribute

* Fix: properly check for None in tp_size

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-28 12:29:58 +00:00
cbede2969b Add self-hosted runner scale set workflow for mi325 CI (#39651) 2025-07-28 13:32:25 +02:00
b56d721397 [configuration] remove redundant classmethod (#38812)
* remove redundant classmethod

* warning message, add space between words

* fix tests

* fix copies
2025-07-28 10:38:48 +00:00
02ea23cbde update ernie model card (#39657)
* update ernie model doc

Signed-off-by: Zhang Jun <jzhang533@gmail.com>

* address ruff format error reported by ci

Signed-off-by: Zhang Jun <jzhang533@gmail.com>

* address check_repository_consistency error reported by ci

Signed-off-by: Zhang Jun <jzhang533@gmail.com>

---------

Signed-off-by: Zhang Jun <jzhang533@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-28 10:21:18 +00:00
8b237b8639 [processors] add tests for helper fn (#39629)
* add tests for helpers

* duplicate test for each model

* why llava next video has no helper

* oops must have been in the commit

* fix test after rebase

* add copy from
2025-07-28 09:41:58 +00:00
6638b3642d xpu optimization for generation case (#39573)
* xpu optimization for generation case

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fix ci failure

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2025-07-28 11:34:58 +02:00
5c15eb55d2 fix(tokenization): check token.content for trie (#39587)
fix: check token.content for trie
2025-07-28 11:28:56 +02:00
6a61e16626 Fix missing initialization of FastSpeech2Conformer (#39689)
* fix missing initialization of FastSpeech2Conformer

* switch order and reactivate tests

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-28 10:47:39 +02:00
a6393e7d28 fix missing model._tp_size from ep refactor (#39688)
* fix missing model._tp_size from ep refactor

* restore setting device_mesh too
2025-07-26 12:26:36 +02:00
18a7c29ff8 More robust tied weight test (#39681)
* Update test_modeling_common.py

* remove old ones

* Update test_modeling_common.py

* Update test_modeling_common.py

* add

* Update test_modeling_musicgen_melody.py
2025-07-25 22:03:21 +02:00
c3401d6fad dev version 4.55 2025-07-25 21:11:20 +02:00
97f8c71f52 Add padding-free to Granite hybrid moe models (#39677)
* start fixing kwarg handling

* fmt

* updates padding free tests

* docs

* add missing kwargs modeling_granitemoe.py

* run modular util

* rm unrelated changes from modular util
2025-07-25 20:10:50 +02:00
d6e9f71a6e Fix tied weight test (#39680)
Update test_modeling_common.py
2025-07-25 20:09:33 +02:00
5da6ad2731 fix break for ckpt without _tp_plan (#39658)
* fix break for ckpt without _tp_plan

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

---------

Co-authored-by: wangzhengtao <wangzhengtao@msh.team>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-25 20:03:48 +02:00
c06d4cd6ce Add EXAONE 4.0 model (#39129)
* Add EXAONE 4.0 model

* Refactor EXAONE 4.0 modeling code

* Fix cache slicing on SWA + FA2

* Fix cache slicing on FA2 + HybridCache

* Update EXAONE 4.0 modeling code for main branch

* Update o_proj for asymmetric projection

* Address PR feedback

* Add EXAONE 4.0 docs

* Update EXAONE 4.0 modeling code for main branch

* update

* fix updates

* updates

* fix

* fix

* fix

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-25 19:58:28 +02:00
3e4d584a5b Support typing.Literal as type of tool parameters or return value (#39633)
* support `typing.Literal` as type of tool parameters

* validate the `args` of `typing.Literal` roughly

* add test to get json schema for `typing.Literal` type hint

* fix: add `"type"` attribute to the parsed result of `typing.Literal`

* test: add argument `booleanish` to test multi-type literal

* style: auto fixup
2025-07-25 17:51:28 +00:00
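A sketch of what the feature enables, using `transformers.utils.get_json_schema`; the exact schema emitted for the `Literal` is defined by this PR, so treat the comment as approximate:

```python
from typing import Literal
from transformers.utils import get_json_schema

def get_weather(city: str, unit: Literal["celsius", "fahrenheit"]) -> str:
    """
    Get the current weather for a city.

    Args:
        city: Name of the city.
        unit: Temperature unit to report.
    """
    return f"20 degrees {unit} in {city}"

schema = get_json_schema(get_weather)
# With Literal support, `unit` should parse to an enum-style entry,
# roughly {"type": "string", "enum": ["celsius", "fahrenheit"]}.
print(schema)
```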
300d42a43e Add ep (#39501)
* EP + updates

Co-authored-by: Nouamane Tazi <NouamaneTazi@users.noreply.github.com>
Co-authored-by: drbh <drbh@users.noreply.github.com>

* remove unrelated change

* not working yet but let's see where it goes!

* update the api a bit

* update

* where I am at for now

* fix ep

* refactor the API

* yups

* fix

* fixup

* clean modeling

* just support llama4 for now!

* properly avoid

* fix

* nits

* Update src/transformers/models/llama4/modeling_llama4.py

* Update src/transformers/integrations/tensor_parallel.py

* style

* ,,,,

* update

---------

Co-authored-by: Nouamane Tazi <NouamaneTazi@users.noreply.github.com>
Co-authored-by: drbh <drbh@users.noreply.github.com>
2025-07-25 19:46:17 +02:00
abaa043d60 bad_words_ids no longer slow on mps (#39556)
* fix: bad_words_ids no longer slow on mps

* fix: SequenceBiasLogitsProcessor slow `_prepare_bias_variables` method

* fix: re-adding a deleted comment

* fix: bug in no_bad_words_logits

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-25 19:45:41 +02:00
6630c5b714 Add xlstm model (#39665)
* Add xLSTM cleanly with optimizations.

* Fix style.

* Fix modeling test.

* Make xLSTM package optional.

* Fix: Update torch version check.

* Fix: Bad variable naming in test.

* Fix: Import structure cleaning with Ruff.

* Fix: Update docstrings.

* Fix: Mitigate unused config attr tests by explicit usage.

* Fix: Skip tests, if xlstm library is not installed.

* Feat: Enable longer context window for inference by chunking.

* Fix: Make training test pass by lowering target accuracy.

* Chore: Increase test verbosity for failing generation test.

* Update docs/source/en/model_doc/xlstm.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix: Make xlstm available even without CUDA.

* Chore: Remove unnecessary import.

* Fix: Remove BOS insertion.

* Chore: Improve xLSTMCache documentation.

* Integrate basic xLSTM fallback code.

* Chore: Remove unnecessary import.

* Chore: Remove duplicate LayerNorm.

* chore: update copyright, minor reformatting

* fix: refactor mLSTMStateType due to missing torch import

* fix: add missing import

* Chore: Replace einops.

* fix: apply ruff formatting

* fix: run `make fix-copies` to re-generate dummy_pt_objects.py

* fix: make type hints Python 3.9 compatible

* fix: remove obsolete import

* fix: remove obsolete method from docs

* chore: remove obsolete `force_bos_token_insert` from config

* Chore: Remove duplicated xLSTMCache class.

* Fix: Formatting of modeling_xlstm.py

* Chore: Remove xlstm package requirement from test. Re-add update_rnn_state.

* Fix: Update xLSTMCache docstring.

* Feat: Add proper initialization of xLSTM.

* Chore: Re-format files.

* Chore: Adapt format.

* Fix: xLSTMCache import restructuring.

* Fix: Add __all__ lists to modeling and configuration files.

* Chore: Reformat.

* Fix: Remove unnecessary update_rnn_state function.

* Fix: Undo test accuracy quickfix.

* Fix: Update copyright year, remove config copy.

* Chore: Flatten all internal configs to xLSTMConfig.

* Fix: Unused config variables check.

* Chore: Remove unnecessary imports.

* Fix: Unify xlstm cache argument from batch_size to max_batch_size.

* Chore: Remove bad default arg value for xLSTMCache.

* Chore: Rename core configuration arguments to HF default in xLSTM.

* Chore: Fix formatting.

* Fix: xLSTM Cache config access.

* Fix: Update xlstm tests for config update.

* Feat: Re-add embbeding_dim, num_blocks config options for compat with xLSTM-7B.

* Fix: Configuration xLSTM python3.9 syntax.

* Fix: Difference to main in test_utils.py assertion.

* Fix: Bad syntax in xlstm config for python3.9.

* Fix: xLSTMConfig docstring.

* Fix: xLSTMConfig docstring.

* Fix typing issues in xLSTM and BeiT, Paligemma.

* Fix: Exclude xLSTM from test cache utils.

* Chore: Fix style.

* Chore: Fix format.

* Chore: Remove unnecessary LayerNorm, NormLayer layer abstractions.

* Chore: Remove asserts and replace with ValueErrors.

* Chore: Update __init__.py structure of xLSTM.

* Chore: Clean xLSTM initialization of weights.

* Fix index names in modeling_xlstm.py

* Update xlstm model test typing annotations.

* Fix: Remove all asserts.

* Revert changes to the main __init__.py

* Fix: Move xLSTMCache to modeling_xlstm.py

* Fix: Remove xLSTMForCausalLM mapping from modeling_auto.py

* Remove xLSTMCache from dummy_pt_objects.py

* Fix: Remove extended torchdynamo compilation check integrating cuda graph captures.

* Revert test_cache_utils.py xLSTM change.

* Fix: Move xLSTM init functions before init call.

* Remove xLSTMCache from generation utils.

* Fix: Clean xLSTM init functionality for recursive calls.

* Fix: Move xLSTMCache before its first call.

* Fix formatting.

* Add partial docstring for xLSTMModel forward.

* Fix xLSTMCache docstring in xLSTMModel.

* Remove xLSTMCache from public documentation. Update auto_docstring.

* Remove all aggressive shape comments

* style

* Fix names

* simplify

* remove output_hidden_states

* Update modeling_xlstm.py

* Update modeling_xlstm.py

* Update test_modeling_xlstm.py

* Update modeling_xlstm.py

* Update modeling_xlstm.py

* fix

* fix

* style

* style

---------

Co-authored-by: Korbinian Poeppel <korbinian.poeppel@nx-ai.com>
Co-authored-by: Korbinian Pöppel <37810656+kpoeppel@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sebastian Böck <sebastian.boeck@nx-ai.com>
Co-authored-by: Korbinian Poeppel <poeppel@ml.jku.at>
2025-07-25 19:39:17 +02:00
ed9a96bc6d Use auto_docstring for perception_lm fast image processor (#39679) 2025-07-25 17:32:48 +00:00
d913b39ef3 fix: HWIO to OIHW (#39200)
* fix: HWIO to OIHW

* Bug in attention type

* Conversion script docstring

* style

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
2025-07-25 19:23:15 +02:00
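The layout fix amounts to a single axis permutation when porting Flax/JAX conv weights to PyTorch; a minimal sketch:

```python
import torch

# Flax/JAX convolutions store kernels as (H, W, I, O); PyTorch expects (O, I, H, W).
w_hwio = torch.randn(3, 3, 64, 128)               # 3x3 kernel, 64 in, 128 out
w_oihw = w_hwio.permute(3, 2, 0, 1).contiguous()
print(w_oihw.shape)                               # torch.Size([128, 64, 3, 3])
```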
a26f0fabb8 Fix auto_docstring crashing when dependencies are missing (#39564)
* add try except to not crash auto_docstring when some dependency are missing

* safeguard None value in placeholder dict
2025-07-25 19:19:23 +02:00
69cff312f5 Add support for DeepseekAI's DeepseekVL (#36248)
* upload initial code

* update deepseek-vl adaptor

* update hierarchy of vision model classes

* update aligner model

* add text model

* Added Image Processor

* Added Image Processor

* Added Image Processor

* apply masks

* remove projection; add aligner

* remove interpolate_pos_encoding

* remove unused params in config

* cleaning

* Add the __init__ file

* added processing deepseek_vl class

* modified the deepseek-vl processor

* modified the deepseek-vl processor

* update __init__

* Update the image processor class name

* Added Deepseek to src/transformers/__init__.py file

* Added Deepseek to image_processing_auto.py

* update the __init__ file

* update deepseek_vl image processor

* Update Deepseek Processor

* upload fast image processor

* Revert "upload fast image processor"

This reverts commit 68c8fd50bafbb9770ac70c9de02448e2519219b4.

* update image processor

* flatten hierarchy

* remove DeepseekVLModel

* major update (complete modeling)

* auto modeling and other files

* formatting

* fix quality

* replace torchvision in modeling

* set default do_normalize to False

* add fast image processor template using tool

* update image processors

* add fast image processor to other files

* update license

* Added deepseek image testcases

* update image test

* update processor

* write CHAT_TEMPLATE

* update model for processor

* fix processor

* minor fixes and formatting

* fix image processing and tests

* fix interpolation in sam

* fix output_attentions in DeepseekVLModel

* upload test_modeling

* fix tests because of vocab size

* set use_high_res_vision=False in tests

* fix all modeling tests

* fix styling

* remove explicit background_color from image processors

* added test_processor

* added test_processor

* fix processor tests

* update docs

* update docs

* update docs

* update conversion script

* Fixed typos

* minor fixes from review

- remove model_id comments in examples
- remove from pre-trained auto mapping
- move to image-text-to-text from vision-to-seq in auto mapping
- add image_token_index to __init__ for config
- remove outdated temporary config in conversion script
- update example to use chat_template in docstring example
- update license 2021->2025

* fix type in config docstring

Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>

* update get_image_features

* fix config

* improve DeepseekVLImageProcessor.preprocess

* return image_hidden_states

* use AutoTokenizer and AutoImageProcessor in Processor

* fix model outputs

* make num_image_tokens configurable

* fix docstring of processor

* move system prompt to chat template

* fix repo consistency

* fix return_dict

* replace SamVisionEncoder with SamVisionModel

* update to remove deepcopy

* 🛠️  Major Architectural Changes (Adds DeepseekVLHybrid)

* fix quality checks

* add missing hybrid in auto modeling

* run make style

* update sam_hq

* update high_res_size in test

* update docs following #36979

* update code with auto_docstring

* update conversion scripts

* fix style

* fix failing test because of tuple

* set weights_only=True in conversion script

* use safetensors.torch.load_file instead of torch.load in conversion script

* make output_dir optional in conversion script

* fix code snippets in docs (now the examples work fine)

* integration tests for DeepseekVL

* update expected texts

* make style

* integration tests for DeepseekVLHybrid

* fix class name

* update expected texts for hybrid

* run "make style"

* update since changes in main

* run make-style

* nits since changes in main

* undo changes in sam

* fix tests

* fix tests; update with main

* update with main: output_attention/output_hidden_states

* fix copied part in deepseek_vl

* run fix-copies

* fix output_hidden_states

* sam: fix _init_weigths

* use modular for DeepseekVL

* make image processor more modular

* modular: use JanusPreTrainedModel

* janus: provide kwargs in loss

* update processors in conversion script

* Revert "sam: fix _init_weigths"

This reverts commit db625d0c68956c0dad45edd7a469b6a074905c27.

* run fix-copies

---------

Co-authored-by: Shakib-IO <shakib.khan17@northsouth.edu>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
2025-07-25 19:18:50 +02:00
a98bbc294c Add missing flag for CacheLayer (#39678)
* fix

* Update cache_utils.py
2025-07-25 19:12:13 +02:00
45c7bfb157 Add evolla rebase main (#36232)
* add evolla

* adding protein encoder part

* add initial processing test

* save processor

* add docstring

* add evolla processor

* add two test

* change vision to protein

* change resampler to sequence_compressor

* change vision to protein

* initial update for llama

* add initial update for llamaForCausalLM

* add `test_processor`, `test_saprot_output`, `test_protein_encoder_output`

* change evolla, but still working on it

* add test_single_forward

* pass test_attention_outputs

* pass test_hidden_states_output

* pass test_save_load and test_from_pretrained_no_checkpoint

* pass test_cpu_offload

* skip some tests

* update new progress

* skip test_model_is_small

* pass test_model_weights_reload_no_missing_tied_weights

* pass test_model_get_set_embeddings

* pass test_cpu_offload

* skip test_resize_embeddings

* add pipeline_model_mapping

* remove old setUp

* pass processor save_pretrained and load_pretrained

* remove pooling layer

* pass test_inputs_embeds_matches_input_ids

* pass test_model_is_small

* pass test_attention_outputs

* pass test_initialization

* pass test_model_get_set_embeddings

* pass test_single_forward

* skip test_disk_offload_bin and test_disk_offload_safetensors

* fix most tests

* pass test_protein_encoder_output

* remove useless code

* add EvollaForProteinText2Text

* pass test_saprot_output

* pass all EvollaModelTest test and remove processor test

* add processor test to its own file

* skip is_training since esm skipped it and the saprot code causes an error when setting is_training to True

* pass processor tests

* solve all except config

* pass most cases

* change init

* add doc to `configuration_evolla.py`

* remove image_processing test

* remove extra processor test

* remove extra modules

* remove extra modules

* change all configs into one config

* pass all evolla test

* pass `make fixup`

* update short summary

* update Evolla-10B-hf

* pass check_dummies.py and check_code_quality

* fix  `tests/models/auto/test_tokenization_auto.py::AutoTokenizerTest::test_model_name_edge_cases_in_mappings`

* remove dummy codes

* change format

* fix llava issue

* update format

* update to solve llama3 access issue

* update to make forward right

* solve processor save load problem from instructblip solution

* remove unexpected file

* skip `test_generation_tester_mixin_inheritance`

* add `test_single_forward_correct` and `test_inference_natural_language_protein_reasoning`

* add `modular_evolla.py`

* solved issue #36362

* run `make fixup`

* update modular

* solve float32 training

* add fix

* solve `utils/check_docstrings.py`

* update

* update

* update

* remove other files and replace sequential and einsum

* add use case in document

* update the models

* update model

* change some wrong code

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* fix issues mentioned in PR

* update style and rearrange the placement

* fix return_dict argument issue

* solve SaProtConfig issue

* Solve EvollaSaProtRotaryEmbedding issue

* solve attention_mask issue

* solve almosst all issues

* make style

* update config

* remove unrelated pickle file

* delete pickle files

* fix config

* simplify a lot

* remove past k-v from encoder

* continue work

* style

* skip it from init

* fix init

* fix init

* simplify more

* fill in docstrings

* change test for generation

* skip test

* fix style

---------

Co-authored-by: Chenchen Han <13980209828@163.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-25 19:11:57 +02:00
2670da66ce update expected outputs for whisper after #38778 (#39304)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-25 16:48:10 +00:00
4b125e2993 fix kyutai tests (#39416)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-07-25 18:42:04 +02:00
4f17bf0572 Fixes the BC (#39636)
* fix

* update

* Update src/transformers/utils/generic.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* fixup

* fixes

* fix more models

* fix fix fix

* add embedding to more models

* update

* update

* fix

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-07-25 18:41:21 +02:00
ddb0546d14 Delete bad rebasing functions (#39672)
* remove outdated stuff

* remove comment

* use register

* remove finally clause (to allow further check if fallback to sdpa)

* general exception

* add wrapper

* revert check

* typo
2025-07-25 18:28:09 +02:00
a91653561e [Ernie 4.5] Post merge adaptations (#39664)
* ernie 4.5 fixes

* Apply style fixes

* fix

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-25 17:36:18 +02:00
5d0ba3e479 [CI] revert device in test_export_static_cache (#39662)
* revert device

* add todo
2025-07-25 15:36:12 +00:00
850bdeaa95 Fix ModernBERT Decoder model (#39671)
fix
2025-07-25 16:20:12 +01:00
17f02102c5 🚨[Fast Image Processor] Force Fast Image Processor for Qwen2_VL/2_5_VL + Refactor (#39591)
* init

* Force qwen2VL image proc to fast

* refactor qwen2 vl fast

* fix copies

* Update after PR review and update tests to use return_tensors="pt"

* fix processor tests

* add BC for min pixels/max pixels
2025-07-25 11:11:28 -04:00
f90de364c2 Rename huggingface_cli to hf (#39630)
* Rename huggingface_cli to hf

* hfh
2025-07-25 14:10:04 +02:00
3b3f9c0c46 fix(voxtral): correct typo in apply_transcription_request (#39572)
* fix(voxtral): correct typo in apply_transcription_request

* temporary wrapper: apply_transcrition_request

* Update processing_voxtral.py

* style: sort imports in processing_voxtral.py

* docs(voxtral): fix typo in voxtral.md

* make style

* doc update

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-07-25 12:09:44 +00:00
2a82cf06ad make fixup (#39661) 2025-07-25 11:27:45 +00:00
e3760501b0 [docs] fix ko cache docs (#39644)
fix ko docs
2025-07-25 10:06:03 +01:00
91f591f7bc Make pytorch examples UV-compatible (#39635)
* update release.py

* add uv headers in some pytorch examples

* rest of pytorch examples

* style
2025-07-25 10:46:22 +02:00
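UV compatibility here means a PEP 723 inline-metadata header at the top of each example, so `uv run example.py` can resolve dependencies on the fly; a minimal sketch (dependency list illustrative):

```python
# /// script
# requires-python = ">=3.9"
# dependencies = [
#     "transformers",
#     "torch",
# ]
# ///

# `uv run example.py` reads the header above, creates an ephemeral
# environment with those packages, then runs the script.
from transformers import pipeline

print(pipeline("sentiment-analysis")("uv makes examples self-contained")[0])
```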
c46c17db57 revert change to cu_seqlen_k and max_k when preparing from position_ids (#39653) 2025-07-25 10:28:22 +02:00
4600c27c4f Fix: explicit not none check for tensors in flash attention (#39639)
fix: explicit not none check for tensors
2025-07-25 10:09:14 +02:00
c392d47c9b [attention] fix test for packed padfree masking (#39582)
* fix most tests

* skip a few more tests

* address comments

* fix chameleon tests

* forgot to uncomment

* qwen has its own tests with images, rename it as well
2025-07-25 07:44:52 +00:00
565c035a2e Add owlv2 fast processor (#39041)
* add owlv2 fast image processor

* add Owlv2ImageProcessorFast to Owlv2Processor image_processor_class

* add Owlv2ImageProcessorFast to Owlv2Processor image_processor_class

* change references from owlVit to owlv2 in docstrings for post-process methods

* change type hints from List, Dict, Tuple to list, dict, tuple

* remove unused typing imports

* add disable grouping argument to group images by shape

* run make quality and repo-consistency

* use modular

* fix auto_docstring

---------

Co-authored-by: Lewis Marshall <lewism@elderda.co.uk>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-07-25 02:40:11 +00:00
5a81d7e0b3 revert behavior of _prepare_from_posids (#39622)
* revert behavior of _prepare_from_posids

* add back cu_seqlens_k and max_k for inference
2025-07-24 20:31:00 +02:00
ad6fd2da0e [Voxtral] values for A10 runners (#39605)
* values for A10 runners

* make

* as for Llava

* does not apply to Voxtral
2025-07-24 18:52:35 +02:00
4741e1f1b7 [timm] new timm pin (#39640) 2025-07-24 16:01:59 +00:00
12b612830d [efficientloftr] fix model_id in tests (#39621)
fix: wrong EfficientLoFTR model id in tests
2025-07-24 10:41:06 +01:00
947a37e8f5 Update recent processors for vLLM backend (#39583)
* update recent models and make sure it runs with vLLM

* delete!
2025-07-24 10:29:27 +02:00
7b897fe583 [Docs] Translate audio_classification.md from English to Spanish (#39513)
* Docs: translate audio_classification to Spanish

* Update audio_classification.md

* Remove space
* Normalize backticks

* Update audio_classification.md

* Apply corrections recommended by aaronjimv

* Update _toctree.yml

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-23 15:55:13 -07:00
9b7244f189 standardized YOLOS model card according to template in #36979 (#39528)
* standardized YOLOS model card according to template in #36979

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* standardized YOLOS model card according to template in #36979

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/yolos.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* replaced YOLOS architecture image, deleted quantization and AttentionMaskVisualizer sections

* removed cli section

* Update yolos.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-23 11:00:25 -07:00
ec8a09a5fe Feature/standardize opt model card (#39568)
* docs: Standardize OPT model card with enhanced details

* Remove incorrect link from OPT model card

* Address review feedback on OPT model card

* Update opt.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-23 10:57:48 -07:00
c5a80dd6c4 🔴 Fix EnCodec internals and integration tests (#39431)
* EnCodec fixes and update integration tests.

* Apply padding mask when normalize is False.

* Update comment of copied function.

* Fix padding mask within modeling.

* Revert padding function.

* Simplify handling of padding_mask.

* Address variable codebook size.

* Add output for padding for consistency with original model, fix docstrings.

* last_frame_pad_length as int

* Update example code.

* Improve docstring/comments.

* Shorten expected output.

* Consistent docstring.

* Parameterize tests.

* Properties for derived variables.

* Update expected outputs from GitHub runner.

* Consistent outputs with runner GPUs.
2025-07-23 19:39:27 +02:00
7a4e2e7868 Fix DAC integration tests and checkpoint conversion. (#39313)
* Fix DAC (slow) integration tests.

* Fix DAC conversion.

* Address comments

* Sync with main, uncomment nn.utils.parametrizations.weight_norm.

* Update DAC integration tests with expected outputs.

* Added info about encoder/decoder error and longer decoder outputs.

* Parameterize tests.

* Set expected values to GitHub runners.
2025-07-23 19:21:26 +02:00
596a75f6e9 Move openai import (#39613) 2025-07-23 19:05:39 +02:00
a0e5a7d34b Transformers serve VLM (#39454)
* Add support for VLMs in Transformers Serve

* Raushan comments

* Update src/transformers/commands/serving.py

Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>

* Quick fix

* CPU -> Auto

* Update src/transformers/commands/serving.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fixup

---------

Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-23 17:03:18 +02:00
ea56eb6bed Fix important models CI (#39576)
* relax test boundaries and fix from config

* eager is always supported.
2025-07-23 16:24:29 +02:00
0fe03afeb8 Fix typos and grammar issues in documentation and code (#39598)
- Fix Cyrillic 'Р' to Latin 'P' in Portuguese language link (README.md)
- Fix 'meanginful' to 'meaningful' in training documentation
- Fix duplicate 'Cohere' reference in modular transformers documentation
- Fix duplicate 'the the' in trainer and chat command comments

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-23 12:43:11 +00:00
82603b6cc2 Allow device_mesh have multiple dim (#38949)
* Feat: something

* Feat: initial changes

* tmp changes to unblock

* Refactor

* remove todo

* Feat: docstring

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-23 12:27:36 +00:00
10c990f7e2 enable triton backend on awq xpu (#39443)
* enable triton backend on awq xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_awq.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* fix dtype check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-23 12:10:38 +00:00
e7e6efcbbd [idefics3] fix for vLLM (#39470)
* fix idefics3 for vllm tests

* fix copies
2025-07-23 14:00:43 +02:00
a62f65a989 fix moe routing_weights (#39581)
* fix moe routing_weights

* fix ernie4_5_moe routing_weights

* fix integration test

---------

Co-authored-by: llbdyiu66 <llbdyiu66@users.noreply.github.com>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-07-23 11:20:23 +00:00
623ab01039 FP-Quant support (#38696)
* quartet

* quartet qat -> quartet

* format

* bf16 backward

* interfaces

* forward_method

* quartet -> fp_quant

* style

* List -> list

* list typing

* fixed format and annotations

* test_fp_quant

* docstrings and default dtypes

* better docstring and removed noop checks

* docs

* pseudoquantization support to test on non-blackwell

* pseudoquant

* Pseudoquant docs

* Update docs/source/en/quantization/fp_quant.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/quantization/fp_quant.md

* Update docs/source/en/quantization/fp_quant.md

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update tests/quantization/fp_quant_integration/test_fp_quant.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update tests/quantization/fp_quant_integration/test_fp_quant.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* small test fixes

* dockerfile update

* spec link

* removed `_process_model_after_weight_loading`

* toctree

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-07-23 11:41:10 +02:00
eb1a007f7f Rename supports_static_cache to can_compile_fullgraph (#39505)
* update all

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* apply suggestions

* fix copies

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-23 09:35:18 +00:00
b357cbb19d [Trackio] Allow single-gpu training and monitor power (#39595)
Allow non-distributed training and monitor power
2025-07-23 11:22:50 +02:00
019b74977d Generic task-specific base classes (#39584)
* first shot

* Update modeling_layers.py

* fix mro order

* finalize llama

* all modular and copied from from llama

* fix
2025-07-23 10:49:47 +02:00
5dba4bc7b2 Fix DynamicCache and simplify Cache classes a bit (#39590)
* fix

* use kwargs

* simplify

* Update cache_utils.py

* Update cache_utils.py

* Update test_cache_utils.py

* fix

* style
2025-07-23 10:13:45 +02:00
d9b35c635e Mask2former & Maskformer Fast Image Processor (#35685)
* add maskformerfast

* test

* revert do_reduce_labels and add testing

* make style & fix-copies

* add mask2former and make fix-copies
TO DO:
	add test for mask2former

* make fix-copies

* fill docstring

* enable mask2former fast processor

* python utils/custom_init_isort.py

* make fix-copies

* fix PR's comments

* modular file update

* add license

* make style

* modular file

* make fix-copies

* merge

* temp commit

* finish up maskformer mask2former

* remove zero shot examples

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-07-23 02:47:47 +00:00
6e9972962f 🎯 Trackio integration (#38814)
* First attempt

* fix

* fix

* Enhance TrackioCallback to log GPU memory usage and allocation

* Enhance Trackio integration in callbacks and training arguments documentation

* re order

* remove unused lines

* fix torch optional
2025-07-22 14:50:20 -07:00
c6d0500d15 [WIP] Add OneformerFastImageProcessor (#38343)
* [WIP] OneformerFastImageProcessor

* update init

* Fully working oneformer image processor fast

* change Nearest to NearestExact interpolation where needed

* fix doc

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-07-22 20:41:39 +00:00
4884b6bf41 Fix link in "Inference server backends" doc (#39589)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-22 16:44:08 +00:00
075a65657a Torchdec RuntimeError catch (#39580)
* fix

* fix

* maybe better

* style
2025-07-22 18:35:03 +02:00
2936902a76 [Paged-Attention] Handle continuous batching for repetition penalty (#39457)
* Handle continuous batching for repetition penalty

* fix last scores and with token mask creation

* add test

* Update src/transformers/generation/continuous_batching.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/generation/logits_process.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix formatting

* remove unneeded cast

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-22 18:13:40 +02:00
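The change above has to apply the penalty per request rather than across the whole batch. A minimal sketch of the classic repetition-penalty rule for a single request (illustrative only, not the continuous-batching implementation):

```python
import torch

def apply_repetition_penalty(scores: torch.Tensor, prev_ids: torch.Tensor, penalty: float = 1.2) -> torch.Tensor:
    # Classic CTRL-style penalty for one request: logits of tokens already
    # generated are divided by `penalty` when positive and multiplied when
    # negative, making repeats less likely.
    seen = torch.unique(prev_ids)
    vals = scores[seen]
    scores[seen] = torch.where(vals > 0, vals / penalty, vals * penalty)
    return scores

# Under continuous batching, rows of the logits matrix belong to different
# requests, so the "previously seen tokens" mask must be built per row.
logits = torch.randn(50)
print(apply_repetition_penalty(logits, torch.tensor([3, 7, 3])))
```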
cbcb8e6c1f updated mistral3 model card (#39531)
* updated mistral3 model card (#1)

* updated mistral3 model card

* applying suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* made all changes to mistral3.md

* adding space between paragraphs in docs/source/en/model_doc/mistral3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* removing duplicate in mistral3.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* adding 4 backticks to preserve formatting

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-22 09:01:55 -07:00
601260fd96 Update docs/source/ko/_toctree.yml (#39516)
docs: update `docs/source/ko/_toctree.yml`
2025-07-22 09:00:42 -07:00
c338fd43b0 [cache refactor] Move all the caching logic to a per-layer approach (#39106)
* Squash for refactor: Replace monolithic cache classes with modular LayeredCache (#38077)

- Introduces CacheLayer and Cache base classes
- Ports Static, Dynamic, Offloaded, Quantized, Hybrid, etc. to use layers
- Implements method/attr dispatch across layers to reduce boilerplate
- Adds CacheProcessor hooks for offloading, quantization, etc.
- Updates and passes tests

* fix quantized, add tests

* remove CacheProcessorList

* raushan review, arthur review

* joao review: minor things

* remove cache configs, make CacheLayer a mixin (joaos review)

* back to storage inside Cache()

* remove cachebase for decorator

* no more __getattr__

* fix tests

* joaos review except docs

* fix ast deprecations for python 3.14: replace node.n by node.value and use `ast.Constant`

More verbose exceptions in `fix_docstring` on docstring formatting issues.

* Revert "back to storage inside Cache()"

This reverts commit 27916bc2737806bf849ce2148cb1e66d59573913.

* cyril review

* simplify cache export

* fix lfm2 cache

* HybridChunked to layer

* BC proxy object for cache.key_cache[i]=...

* reorder classes

* bfff come on LFM2

* better tests for hybrid and hybridChunked

* complete coverage for hybrid chunked caches (prefill chunking)

* reimplementing HybridChunked

* cyril review

* fix ci

* docs for cache refactor

* docs

* oopsie

* oopsie

* fix after merge

* cyril review

* arthur review

* opsie

* fix lfm2

* opsie2
2025-07-22 16:10:25 +02:00
b16688e96a General weight initialization scheme (#39579)
* general + modulars from llama

* all modular models

* style and fix musicgen

* fix

* Update configuration_musicgen.py

* Update modeling_utils.py
2025-07-22 16:04:20 +02:00
015b62bf3e Add AMD GPU expectations for LLaVA tests (#39486)
* Add AMD GPU expectation to llava tests

* FMT

* Remove debug print

* Address review  comments
2025-07-22 14:01:54 +00:00
efceeaf267 Kernels flash attn (#39474)
* use partial to wrap around `transformers` utils!

* try to refactor?

* revert one wrong change

* just a nit

* push

* revert whatever was wrong!

* some nits

* fixes when there is no attention mask

* bring the licence back

* some fixes

* nit

* style

* remove prints

* correct dtype

* fa flags for testing

* update

* use paged attention if requested!

* updates

* a clone was needed, not sure why

* automatically create cu seq lens when input is flash, this at least makes sure layers don't re-compute

* simplify and improve?

* flash attention is somewhat broken on recent CUDA versions, so allow the option to use something else

* fix!

* protect kernels import

* update

* properly parse generation config being passed

* revert and update

* add two tests

* some fixes

* fix test FA2

* takes comment into account

* fixup

* revert changes

* revert the clone, it is only needed because the metal kernel is not doing it?

* [docs] update attention implementation and cache docs (#39547)

* update docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix mps on our side for now

* Update src/transformers/integrations/flash_paged.py

* no qa

---------

Co-authored-by: Vasqu <antonprogamer@gmail.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-22 15:41:06 +02:00
b62557e712 Add AMD expectations to Mistral3 tests (#39481)
Add AMD expectations to mistral3 tests
2025-07-22 15:40:16 +02:00
1806583390 [docs] Create page on inference servers with transformers backend (#39550)
* draft docs on inference servers

* Update docs/source/en/_toctree.yml

Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* update

* doc build failed

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/transformers_as_backend.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply last suggestions

---------

Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-22 15:31:10 +02:00
cd98c1fee3 [docs] update attention implementation and cache docs (#39547)
* update docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-22 15:06:43 +02:00
ef99537f37 Add AMD test expectations to DETR model (#39539)
* Add AMD test expectations to DETR model

* Fix baseline expectation

* Address review comments

* Make formatting a bit more consistent
2025-07-22 12:07:10 +00:00
30567c28e8 [timm_wrapper] add support for gradient checkpointing (#39287)
* feat: add support for gradient checkpointing in TimmWrapperModel and TimmWrapperForImageClassification

* ruff fix

* refactor + add test for not supported model

* ruff

* Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-07-22 11:07:52 +00:00
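For context, gradient checkpointing trades compute for memory by recomputing activations in the backward pass instead of storing them, and the timm wrapper now supports the standard transformers toggle. A minimal usage sketch (the checkpoint name is only an example):

```python
from transformers import TimmWrapperModel

# Load a timm backbone through the transformers wrapper, then enable the
# standard transformers gradient-checkpointing switch added by this PR.
model = TimmWrapperModel.from_pretrained("timm/resnet18.a1_in1k")
model.gradient_checkpointing_enable()
model.train()
```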
a44dcbe513 Fixes needed for n-d parallelism and TP (#39562)
Handle non-DTensors cases in TP Layers

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-22 10:24:59 +00:00
0cae633ce1 Bump AMD container for 2.7.1 PyTorch (#39458)
* Bump AMD container for 2.7.1 PyTorch

* Forgot to update pinned packages
2025-07-22 12:11:38 +02:00
a88ea9cbc8 Add EfficientLoFTR model (#36355)
* initial commit

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix: various typos, typehints, refactors from suggestions

* fix: fine_matching method

* Added EfficientLoFTRModel and AutoModelForKeypointMatching class

* fix: got rid of compilation breaking instructions

* docs: added todo for plot

* fix: used correct hub repo

* docs: added comments

* fix: run modular

* doc: added PyTorch badge

* fix: model repo typo in config

* fix: make modular

* fix: removed mask values from outputs

* feat: added plot_keypoint_matching to EfficientLoFTRImageProcessor

* feat: added SuperGlueForKeypointMatching to AutoModelForKeypointMatching list

* fix: reformat

* refactor: renamed aggregation_sizes config parameter into q, kv aggregation kernel size and stride

* doc: added q, kv aggregation kernel size and stride doc to config

* refactor: converted efficientloftr implementation from modular to copied from mechanism

* tests: overwrote batching_equivalence for "keypoints" specific tests

* fix: changed EfficientLoFTRConfig import in test_modeling_rope_utils

* fix: make fix-copies

* fix: make style

* fix: update rope function to make meta tests pass

* fix: rename plot_keypoint_matching to visualize_output for clarity

* refactor: optimize image pair processing by removing redundant target size calculations

* feat: add EfficientLoFTRImageProcessor to image processor mapping

* refactor: removed logger and updated attention forward

* refactor: added auto_docstring and can_return_tuple decorators

* refactor: update type imports

* refactor: update type hints from List/Dict to list/dict for consistency

* refactor: update MODEL_MAPPING_NAMES and __all__ to include LightGlue and AutoModelForKeypointMatching

* fix: change type hint for size parameter in EfficientLoFTRImageProcessor to Optional[dict]

* fix typing

* fix some typing issues

* nit

* a few more typehint fixes

* Remove output_attentions and output_hidden_states from modeling code

* else -> elif to support efficientloftr

* nit

* tests: added EfficientLoFTR image processor tests

* refactor: reorder functions

* chore: update copyright year in EfficientLoFTR test file

* Use default rope

* Add docs

* Update visualization method

* fix doc order

* remove 2d rope test

* Update src/transformers/models/efficientloftr/modeling_efficientloftr.py

* fix docs

* Update src/transformers/models/efficientloftr/image_processing_efficientloftr.py

* update gradient

* refactor: removed unused codepath

* Add motivation to keep postprocessing in modeling code

* refactor: removed unnecessary variable declarations

* docs: use load_image from image_utils

* refactor: moved stage in and out channels computation to configuration

* refactor: set an intermediate_size parameter to be more explicit

* refactor: removed all mentions of attention masks as they are not used

* refactor: moved position_embeddings to be computed once in the model instead of every layer

* refactor: removed unnecessary hidden expansion parameter from config

* refactor: removed completely hidden expansions

* refactor: removed position embeddings slice function

* tests: fixed broken tests because of previous commit

* fix is_grayscale typehint

* not refactoring

* not renaming

* move h/w to embeddings class

* Precompute embeddings in init

* fix: replaced cuda device in convert script to accelerate device

* fix: replaced stevenbucaille repo to zju-community

* Remove accelerator.device from conversion script

* refactor: moved parameter computation in configuration instead of figuring it out when instantiating a Module

* fix: removed unused attributes in configuration

* fix: missing self

* fix: refactoring and tests

* fix: make style

---------

Co-authored-by: steven <steven.bucaille@buawei.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-07-22 10:53:16 +01:00
3bc726b381 [gemma3] fix bidirectional image mask (#39396)
* fix gemma3 mask

* make compile happy, and use only torch ops

* no full attention between images

* update tests

* fix tests

* add a fast test
2025-07-22 10:04:56 +02:00
fbeaf96f9e Update OLMoE model card (#39344)
* Update OLMoE model card

* Checks Test

* Add license and code

* Update docs/source/en/model_doc/olmoe.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update olmoe.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-21 16:41:01 -07:00
641aaed7c0 Update modernbertdecoder docs (#39453)
* update docs with paper and real model

* nit

* Apply suggestions from code review

Thanks to @stevhliu!

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Remove usage examples, add quantization

---------

Co-authored-by: oweller2 <oweller2@dsailogin.mgmt.ai.cluster>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-21 16:40:22 -07:00
049a674e68 [CI] Fix post merge ernie 4.5 (#39561)
fix repo consistency
2025-07-21 20:56:24 +02:00
b3ebc761e2 [Fast image processors] Improve handling of image-like inputs other than images (segmentation_maps) (#39489)
* improve handlike of other image-like inputs in fast image processors

* fix issues with _prepare_images_structure

* update sam image processor fast

* use dict update
2025-07-21 14:12:14 -04:00
b4115a426e [Ernie 4.5] Add ernie text models (#39228)
* init

* copied from remote

* add proper structure and llama like structure

* fixup

* revert to state that works

* get closer to llama

* slow and steady

* some removal

* masks work

* it is indeed the rope implementation, how dafuq does it mesh with the cache now hmm

* nice

* getting closer

* closer to transformers style

* let's simplify this, batching works now

* simplified

* working version with modular

* it is indeed the rotation per weights, make it complete llama style

* cleanup conversion, next to look at -> tokenizer

* remove llama artefacts

* fix modeling tests (common ones)

* style

* integration test + first look into tokenization (will need more work, focussing on modeling other models first)

* style

* working moe version, based on remote

* lets keep it simple and go step by step - transformers annotations for modular and transformers style rope (complex view)

* more cleanup

* refactor namings and remove addition forXXX classes

* our MoE won't cut it, it seems; correction bias seems to be missing in the remote code version

* tokenization change (remote)

* our moe version works when adding normalization :D

* cleanup moe

* nits

* cleanup modeling -> let's get to modular next

* style

* modular v1

* minor things + attempt at conversion (which doesn't work)

* no conversion follow glm, fixup modular and other nits

* modular cleanup

* fixes

* tests, tests, tests + some moe dtype forcing

* simplify modular, fix fatal fa2 bug, remaining tests

* fix import issue?

* some initial docs, fix faulty bnb behavior --> need to fix some tests because the gate needs to be float

* fix sdpa test, load on init dtype only

* fixup post merge

* style

* fix doc links

* tokenization cleanup beginnings

* simplify tokenizer by a lot as its basically llama

* tokenizer is full llama with different defaults + extra special tokens

* sync og special tokens of ernie

* fix decoding with numbers (also done in remote, what timing), beginning of tokenizer tests

* align with remote and preserve special tokens, adjust tests to ernie legacy behavior, warning for questionable behavior (also in llama)

* nits

* docs

* my daily post merge it is

* check

* tokenization update with explanations and conversion script

* review on modular (til), revert some tokenizer things i did prior, remove mtp comment (low prio)

* post merge fixes

* fixup tokenization, llama fast is the way to go

* more fixups

* check

* import fixes

* correction bias following the paddle code

* fix

* fix TP plan, fix correction bias sharding during forward

* style

* whoops

* fix tied weights

* docs and last nit

* license

* flasky tests

* move repo id, update when merged on the hub
2025-07-21 19:51:49 +02:00
69b158260f Refactor embedding input/output getter/setter (#39339)
* simplify common get/set

* remove some noise

* change some 5 years old modeling utils

* update examples

* fix copies

* revert some changes

* fixes, gah

* format

* move to Mixin

* remove smolvlm specific require grad

* skip

* force defaults

* remodularise some stuff

* remodularise more stuff

* add safety for audio models

* style

* have a correct fallback, you daft donkey

* remove this argh

* change heuristic for audio models

* fixup

* revert

* this works

* revert again

* 🧠

* aaah ESM has two modelings aaah

* add informative but short comment

* add `input_embed_layer` mixin attribute

* style

* walrus has low precedence

* modular fix

* this was breaking parser
2025-07-21 18:18:14 +02:00
2da97f0943 🌐 [i18n-KO] Translated perf_infer_gpu_multi.md to Korean (#39441)
* docs: ko: perf_infer_gpu_many.md

* feat: nmt draft

* docs: refine KO translation and enhance naturalness

* docs: add missing TOC to documentation

* Align toctree and filename with original: perf_infer_gpu_multi

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Refine Korean translation

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/perf_infer_gpu_multi.md

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2025-07-21 09:14:15 -07:00
82807e56b1 [Fast image processor] refactor fast image processor glm4v (#39490)
refactor fast image processor glm4v
2025-07-21 11:18:46 -04:00
4b4f04fcca fix ndim check of device_mesh for TP (#39538) 2025-07-21 13:09:33 +00:00
1aa7256f01 Refactor MambaCache to modeling_mamba.py (#38086)
* Refactor MambaCache to modeling_mamba.py (parity with Zamba)

* ruff

* fix dummies

* update

* update

* remove mamba ref in cache tests

* remove cache_implementation from tests

* update

* ruff

* ruff

* sneaky regression

* model consistency

* fix test_multi_gpu_data_parallel_forward

* fix falcon slow tests

* ruff

* ruff

* add sample false

* try to fix slow tests

* Revert "fix test_multi_gpu_data_parallel_forward"

This reverts commit 66b7162c7c5c5ce8a73ccf48cffc8a96343ebb33.

* fix tests on nvidia t4, remove dataparallel tests from mamba

* ruff

* remove DDP tests from mamba and falcon_mamba

* add explicit error for MambaCache

* mamba2 also needs to init cache in prepare_inputs_for_generation

* ruff

* ruff

* move MambaCache to its own file

* ruff

* unprotected import fix

* another attempt to fix unprotected imports

* Revert "another attempt to fix unprotected imports"

This reverts commit 2338354fcab630de5899321f5daced5fb312c2a2.

* fixing unprotected import, attempt 3

* Update src/transformers/cache_utils.py

* ruff's fault

* fix arthur review

* modular falcon mamba

* found a hack

* fix config docs

* fix docs

* add export info

* merge modular falcon branch

* oopsie

* fix fast path failing

* new approach

* oopsie

* fix types

* Revert new pragma in modular

This reverts commit 80b1cf160ee251536f07c40b8a0857d499e70db6.

* trying another modular workaround

* review & fix ci

* oopsie

* clear prepare_inputs on mamba/mamba2/falcon_mamba
2025-07-21 14:59:36 +02:00
a419a40234 Fix Docstring of BarkProcessor (#39546)
* Fix Docstring of BarkProcessor

* Fix typo

* Add type hint of return value for BarkProcessor.__call__
2025-07-21 12:56:44 +00:00
9323d0873c use the enable_gqa param in torch.nn.functional.scaled_dot_product_at… (#39412)
* use the enable_gqa param in torch.nn.functional.scaled_dot_product_attention

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* ci failure fix

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add check

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fix ci failure

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* refine code, extend to cuda

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* refine code

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fix review comments

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* refine the PR

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-07-21 14:46:43 +02:00
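For reference, `enable_gqa` lets `scaled_dot_product_attention` broadcast a smaller set of key/value heads across the query heads itself, so callers no longer need to repeat K/V manually. A minimal sketch (assumes PyTorch 2.5+, where the flag exists):

```python
import torch
import torch.nn.functional as F

# Grouped-query attention: 8 query heads share 2 key/value heads.
# With enable_gqa=True, SDPA broadcasts the KV heads internally,
# so no repeat_interleave/expand of K and V is needed beforehand.
q = torch.randn(1, 8, 16, 64)   # (batch, q_heads, seq, head_dim)
k = torch.randn(1, 2, 16, 64)   # (batch, kv_heads, seq, head_dim)
v = torch.randn(1, 2, 16, 64)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True, enable_gqa=True)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```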
6b3a1f2f51 Fix missing initializations for models created in 2023 (#39239)
* fix SwiftFormer

* fix Kosmos2

* fix Owlv2

* fix Sam

* fix Vits

* fix Pvt

* fix MobileViTV2

* fix PatchTST

* fix Bros

* fix Informer

* fix BridgeTower

* fix Mra and Yoso

* fix Rwkv

* fix EfficientNet

* fix NllbMoe

* fix Tvp

* fix Clap

* fix Autoformer

* fix SwiftFormer

* fix Mgpstr

* fix Align

* fix VitMatte

* fix SpeechT5

* add conditional check for parameters

* fix SpeechT5

* fix TimmBackbone and Clvp

* fix SwiftFormer

* fix SeamlessM4T and SeamlessM4Tv2

* fix Align

* fix Owlv2 and OwlViT

* add reviewed changes

* add reviewed changes

* fix typo

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-07-21 14:43:52 +02:00
970d9a75ce Raise TypeError instead of ValueError for invalid types (#38660)
* Raise TypeError instead of ValueError for invalid types.

* Removed un-necessary changes.

* Resolved conflicts

* Code quality

* Fix failing tests.

* Fix failing tests.
2025-07-21 12:42:00 +00:00
822c5e45b2 Fix pylint warnings (#39477)
* Fix pylint warnings

Signed-off-by: cyy <cyyever@outlook.com>

* Fix variable names

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-21 12:38:05 +00:00
dc017cd763 Fix Qwen Omni integration test (#39553)
fix
2025-07-21 14:11:46 +02:00
fdc0566e15 🚨🚨🚨 [Trainer] Enable average_tokens_across_devices by default in TrainingArguments (#39395)
Enable average_tokens_across_devices by default in TrainingArguments

Fixes #39392

This change improves loss calculation correctness for multi-GPU training by enabling proper token averaging across devices by default.

Co-authored-by: Krishnan Vignesh <krishnanvignesh@Krishnans-MacBook-Air.local>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-21 12:11:20 +00:00
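A usage sketch of the new default; the flag itself predates this PR, only its default value changed:

```python
from transformers import TrainingArguments

# With token averaging enabled, per-device loss sums are divided by the
# global (all-reduced) count of non-padding label tokens rather than a
# per-device count, so GPUs that happen to get shorter sequences no
# longer contribute over-weighted gradients.
args = TrainingArguments(output_dir="out", average_tokens_across_devices=True)
print(args.average_tokens_across_devices)
```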
8c102e2eb1 Rename _supports_flash_attn_2 in examples and tests (#39471)
* delete `_supports_flash_attn_2` from examples and tests

* simplify docs
2025-07-21 14:02:57 +02:00
3a152e3a5c Fix the check in flex test (#39548)
* fix the check

* fix flags

* flags
2025-07-21 13:29:44 +02:00
78fb2d2760 Fix bad tensor shape in failing Hubert test. (#39502)
Fix bad tensor shape in Hubert test.
2025-07-21 12:25:52 +01:00
39ba5f3cc2 GLM-4 Update (#39393)
* one commit with full

* Create glm4_moe.md

* Update check_config_docstrings.py

* Update __init__.py

* update

* argue

* argue: router problem

* 1

* Update test_modeling_glm4_moe.py

* Update test_modeling_glm4_moe.py

* Update test_modeling_glm4_moe.py

* Update modular_glm4_moe.py

* update

* use dsv3 pretrainmodel in modular

* update for test

* update new modular

* use LlamaAttention and avoid CohereAttention because it repeats the norm

* update the modular

* update attn modular

* update

* Update modular_glm4_moe.py

* MTP layer needs to be ignored

* fix gradient error using the dots_1 method

* Update test_modeling_glm4_moe.py

* Update test_modeling_glm4_moe.py

* Update test_modeling_glm4_moe.py

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-07-21 13:24:34 +02:00
344012b3a6 [qwen2 vl] fix packing with all attentions (#39447)
* fix qwen2 vl packing in FA2

* why? delete!

* qwen2-5-vl seems to work now

* update

* fix tests

* start by adapting FA2 tests

* add similar tests for sdpa/eager

* address comments

* why is this even in conditional model and not base model?
2025-07-21 12:19:15 +02:00
e42681b48b [gemma3] support sequence classification task (#39465)
* add seq clf class

* fix docs and add in auto-map

* skip tests

* optional pixels
2025-07-21 11:03:20 +02:00
34133d0a79 Fix placeholders replacement logic in auto_docstring (#39433)
Fix and simplify placeholders replacement logic
2025-07-18 22:56:23 +00:00
433d2a23d7 Update SAM/SAM HQ attention implementation + fix Cuda sync issues (#39386)
* update attention implementation and improve inference speed

* modular sam_hq + fix integration tests on A10

* fixup

* fix after review

* softmax in correct place

* return attn_weights in sam/sam_hq
2025-07-18 18:46:27 -04:00
541bed22d6 Improve @auto_docstring doc and rename args_doc.py to auto_docstring.py (#39439)
* rename `args_doc.py` to `auto_docstring.py` and improve doc

* modifs after review
2025-07-18 18:00:34 +00:00
de0dd3139d Add fast image processor SAM (#39385)
* add fast image processor sam

* nits
2025-07-18 17:27:16 +00:00
561a79a2f4 Fix BatchEncoding.to() for nested elements (#38985) 2025-07-18 14:14:45 +01:00
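The gist of that fix is recursing into nested containers when moving tensors to a device. A hedged, generic sketch of the pattern (not the actual `BatchEncoding` code):

```python
import torch

def recursive_to(obj, device):
    # Move tensors nested inside dicts/lists/tuples, leaving other
    # values (ints, strings, None) untouched.
    if isinstance(obj, torch.Tensor):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: recursive_to(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(recursive_to(v, device) for v in obj)
    return obj

batch = {"input_ids": torch.zeros(2, 4), "images": [torch.zeros(3, 8, 8)]}
moved = recursive_to(batch, "cpu")
print(moved["images"][0].device)
```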
f4d076561f [gemma3] Fix do_convert_rgb in image processors. (#39438)
* [gemma3] Fix do_convert_rgb in image processors.

* [gemma3] Fix do_convert_rgb in image processors.
2025-07-18 12:33:00 +00:00
bcc0091937 [chat template] return assistant mask in processors (#38545)
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
2025-07-18 12:23:20 +00:00
328ca9cf1d [dependencies] Update datasets pin (#39500)
* pyarrow pin

* make fixup

* test?

* like this?

* like this?

* like this?

* datasets pin

* comment
2025-07-18 12:05:28 +00:00
fb58377700 Slack CI bot: set default result for non-existing artifacts (#39499)
* Set default result for non-existing artifacts

* FMT

* Address review comments
2025-07-18 11:45:47 +00:00
4ded9a4113 🚨🚨 Fix and simplify attention implementation dispatch and subconfigs handling (#39423)
* first try

* Update modeling_utils.py

* Update modeling_utils.py

* big refactor

* Update modeling_utils.py

* style

* docstrings and simplify inner workings of configs

* remove all trace of _internal

* Update modeling_utils.py

* fix logic error

* Update modeling_utils.py

* recursive on config

* Update configuration_utils.py

* fix

* Update configuration_dpt.py

* Update configuration_utils.py

* Update configuration_utils.py

* Update modeling_idefics.py

* Update modeling_utils.py

* fix for old models

* more old models fixup

* Update modeling_utils.py

* Update configuration_utils.py

* Remove outdated test

* remove the deepcopy!! 🥵🥵

* Update test_modeling_gpt_bigcode.py

* fix qwen dispatch

* restrict to only models supporting it

* style

* switch name

* Update modeling_utils.py

* Update modeling_utils.py

* add tests!

* fix

* typo

* remove bad copies

* fix

* Update modeling_utils.py

* additional check

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix

* skip
2025-07-18 13:41:54 +02:00
2b819ba4e3 [dependencies] temporary pyarrow pin (#39496)
* pyarrow pin

* make fixup

* test?

* like this?

* like this?

* like this?
2025-07-18 10:05:40 +00:00
967045082f Add voxtral (#39429)
* draft

* draft update (conversion working)

* mend

* draft update

* draft update: working generate

* refactor

* VoxtralProcessor draft

* processor update

* update convert_tekken_tokenizer

* refactor processor

* update convert

* make style

* better handle prefil

* make style

* add tests

* add mistral_common audio loading

* processor update

* revert changes

* audio utils update

* add audio to apply chat template mistral update

* voxtral processor update

* fix

* update conversion script

* make Mistral tokenizer from_pretrained work from a local dir

* fix updates

* add integration tests

* add batched version

* processor docstring

* make style

* revert convert_tekken_tokenizer changes

* revert processing_qwen2.5 changes

* add multi-turn test

* processor improvements

* address review changes

* Update src/transformers/tokenization_mistral_common.py

Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>

* update audio utils

* nits

* integration test update

* correct _support

* update tests

* test update

* update integration tests

* fix

* fix

* fix

* add test_apply_chat_template_with_audio

* add model doc

* model doc

* nit

* doc update

* nit

* processor improvement

* ensure default is 3B

* nits

* make

* make

* convert modular

* update checkpoint

* fix test

* make

* make

* autos

* make

* make

* nit

* nit

* nit

---------

Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-18 00:02:04 +00:00
73869f2e81 Fix typing order (#39467)
* fix type order

* change all Union[str, dict] to Union[dict, str]

* add hf_parser test && fix test order

* add deepspeed dependency

* replace deepspeed with accelerator
2025-07-17 15:47:31 +00:00
bda75b4011 Add unified logits_to_keep support to LLMClass (#39472)
* add supports for logits_to_keep for qwen25vl and glm4v

* Update relevant modular files
2025-07-17 17:07:12 +02:00
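A usage sketch of `logits_to_keep`; the model id is only an example of a decoder whose forward accepts the argument:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# `logits_to_keep=1` asks the forward pass to materialize logits only for
# the last position, avoiding a full (batch, seq_len, vocab_size) tensor
# when only the next token is needed.
model_id = "Qwen/Qwen2.5-0.5B"  # assumed example checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
inputs = tok("Hello there", return_tensors="pt")
out = model(**inputs, logits_to_keep=1)
print(out.logits.shape)  # torch.Size([1, 1, vocab_size])
```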
bf6c997685 [serve] Add speech to text (/v1/audio/transcriptions) (#39434)
* Scaffolding

* Explicit content

* Naïve Responses API streaming implementation

* Cleanup

* Scaffolding

* Explicit content

* Naïve Responses API streaming implementation

* Cleanup

* use openai

* validate request, including detecting unused fields

* dict indexing

* dict var access

* tmp commit (tests failing)

* add slow

* use oai output type in completions

* (little rebase errors)

* working spec?

* guard type hint

* type hints. fix state (CB can now load different models)

* type hints; fn names; error type

* add docstrings

* responses + kv cache

* metadata support; fix kv cache; error event

* add output_index and content_index

* docstrings

* add test_build_response_event

* docs/comments

* gate test requirements; terminate cb manager on model switch

* nasty type hints

* more type hints

* disable validation by default; enable force models

* todo

* experiment: base model from typed dict

* audio working

* fix bad rebase

* load audio with librosa

* implement timed models

* almost working

* make fixup

* fix tests

* transcription request type

* tokenizer -> processor

* add example in docs

---------

Co-authored-by: Lysandre <hi@lysand.re>
2025-07-17 14:29:57 +00:00
8b3de61a65 Update integration_utils.py (#39469)
* Update integration_utils.py

sanitize mlflow upload metric

* Update integration_utils.py

change import order to pass CI

* Update integration_utils.py

add comments

* Update integration_utils.py

Remove whitespace from blank line
2025-07-17 13:57:49 +00:00
7fd60047c8 fix: ImageTextToTextPipeline handles user-defined generation_config (#39374)
fix: ImageTextToTextPipeline handles user-defined generation_config passed to the pipeline

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-07-17 13:23:29 +00:00
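A usage sketch of the fixed behavior; the model id and image URL are placeholders:

```python
from transformers import GenerationConfig, pipeline

# After the fix, a user-supplied GenerationConfig is honored by the
# image-text-to-text pipeline instead of being dropped.
pipe = pipeline("image-text-to-text", model="llava-hf/llava-interleave-qwen-0.5b-hf")
gen_cfg = GenerationConfig(max_new_tokens=20, do_sample=False)
messages = [{"role": "user", "content": [
    {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"},
    {"type": "text", "text": "Describe this image."},
]}]
out = pipe(text=messages, generation_config=gen_cfg)
print(out)
```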
60b5471da3 Enable some ruff checks for performance and readability (#39383)
* Fix inefficient sequence tests

Signed-off-by: cyy <cyyever@outlook.com>

* Enable PERF102

Signed-off-by: cyy <cyyever@outlook.com>

* Enable PLC1802

Signed-off-by: cyy <cyyever@outlook.com>

* Enable PLC0208

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-17 13:21:59 +00:00
fc700c2a26 Fix convert_and_export_with_cache failures for GPU models (#38976)
* Add the `device` option for `generate()`

* Add device for default tensors to avoid tensor mismatch

* [test] Enable test_static_cache_exportability for torch_device

* infer device from the prompt_token_ids

* Add device for generated tensor

* [Test] Make `test_export_static_cache` tests to run on devices rather than only CPU

* fix format

* infer device from the model
2025-07-17 13:12:32 +00:00
54680d75c9 Update GemmaIntegrationTest::test_model_2b_bf16_dola (#39362)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-17 14:06:23 +01:00
322400af58 fix a comment typo in utils.py (#39459) 2025-07-17 13:06:04 +00:00
43f07018cf Use newer typing notation (#38934)
Signed-off-by: cyy <cyyever@outlook.com>
2025-07-17 13:05:21 +00:00
565dd0bad7 Fix tests due to breaking change in accelerate (#39451)
* update values

* fix
2025-07-17 13:51:50 +01:00
26fed50460 fix max_length calculating using cu_seq_lens (#39341) 2025-07-17 10:54:23 +02:00
cdfe6164b3 fix(pipelines): QA pipeline returns fewer than top_k results in batch mode (#39193)
* fixing the bug

* Try a simpler approach

* make fixup

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2025-07-17 10:24:30 +02:00
b85ed49e0a Corrections to PR #38642 and enhancements to Wav2Vec2Processor __call__ and pad docstrings (#38822)
* Correcting PR #38642.  The PR removed references to the deprecated method "as_target_processor()" in the
__call__ and pad method docstrings, which is correct, but also removed all references to PreTrainedTokenizer,
which is incorrect.  This commit adds back the reference to PreTrainedTokenizer and also takes the
opportunity to enhance the docstrings with the invocation procedure post removal of "as_target_processor()"
and adds information on return values.

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: René Tio <tor@Jammer.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-16 14:13:07 -07:00
787a0128a9 create ijepa modelcard (ref : PR #36979 ). (#39354)
* wip: adding first version of the IJEPA model card.

* refactor based on the @stevhliu feedbacks

* refactor:
- revert the accidental removal of the autodoc API description and the image reference architecture

- general context update.

* - change of model for the example quantization.
- merging the quantization content.
2025-07-16 12:40:22 -07:00
48f2233cdf Improve grammar and clarity in perf_hardware.md (#39428) 2025-07-16 12:15:15 -07:00
e68ebb695f fix cached file error when repo type is dataset (#36909)
* fix cached file

* Update hub.py
2025-07-16 18:02:26 +02:00
35a416c400 Fix indentation bug in SmolVLM image processor causing KeyError (#39452)
Fix indentation bug in Idefics3 image processor

- Fix KeyError when do_image_splitting=False
- Move split_images_grouped assignment inside loop
- Ensures all image shapes are stored, not just the last one
- This fixes the bug in both Idefics3 and generated SmolVLM processors

cc @yonigozlan

Co-authored-by: Krishnan Vignesh <krishnanvignesh@Krishnans-MacBook-Air.local>
2025-07-16 11:59:28 -04:00
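The bug pattern is plain Python: an assignment dedented out of its loop keeps only the last iteration's value. A minimal sketch with illustrative names (not the actual processor code):

```python
images = [["a"], ["b", "c"], ["d"]]

# Buggy: the assignment outside the loop runs once, after the loop,
# so only the final batch's shapes are recorded.
shapes_buggy = {}
for i, batch in enumerate(images):
    value = [len(x) for x in batch]
shapes_buggy[i] = value

# Fixed: keep the assignment inside the loop so every batch is stored.
shapes_fixed = {}
for i, batch in enumerate(images):
    shapes_fixed[i] = [len(x) for x in batch]

print(shapes_buggy)  # {2: [...]} -> later lookups of keys 0 and 1 raise KeyError
print(shapes_fixed)  # {0: [...], 1: [...], 2: [...]}
```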
2c58705dc2 Updated Megatron conversion script for gpt2 checkpoints (#38969)
* update script to support new megatron gpt format

* fixed quality failures

---------

Co-authored-by: Luke Friedrichs <LckyLke>
2025-07-16 15:54:29 +00:00
26be7f717e [CI] Fix partially red CI (#39448)
fix
2025-07-16 15:53:43 +02:00
0a88751940 Fixes #39204: add fallback if get_base_model missing (#39226)
* Fixes #39204: add fallback if get_base_model missing

* Inline try_get_base_model logic as suggested in PR review

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-16 15:51:30 +02:00
ba506f87db make the loss context manager easier to extend (#39321) 2025-07-16 15:47:24 +02:00
9f1ac6f185 Remove something that should have never been there (#38254)
* what the hell

* update

* style

* style

* typing

* fix init issue

* fix granite moe hybrid as well
2025-07-16 15:22:44 +02:00
a7ca5b5d67 Fix processor tests (#39450)
fix
2025-07-16 15:01:35 +02:00
71818f570b [Bugfix] [Quantization] Remove unused init arg (#39324)
remove unused arg from ct config init

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-07-16 14:57:42 +02:00
cc24b0378e Better typing for model.config (#39132)
* Apply to all models config annotation

* Update modular to preserve order

* Apply modular

* fix define docstring

* fix dinov2 consistency (docs<->modular)

* fix InstructBlipVideoForConditionalGeneration docs<->modular consistency

* fixup

* remove duplicate code

* Delete config_class attribute from the modeling code

* Add config_class attribute in base model

* Update init sub class

* Deprecated models update

* Update new models

* Fix remote code BC issue

* fixup

* fixing more corner cases

* fix new models

* add test

* modular docs update

* fix comment a bit

* fix for py3.9
2025-07-16 14:50:35 +02:00
4b258454a7 Fix typo in generation configuration for Janus model weight conversion (#39432)
* Fix typo in generation configuration for Janus model weight conversion

* Fix typo

* Update Janus model generation configuration

* Update Janus model to use generation_kwargs
2025-07-16 14:28:02 +02:00
de5ca373ac Responses API in transformers serve (#39155)
* Scaffolding

* Explicit content

* Naïve Responses API streaming implementation

* Cleanup

* Responses API (to be merged into #39155) (#39338)

* Scaffolding

* Explicit content

* Naïve Responses API streaming implementation

* Cleanup

* use openai

* validate request, including detecting unused fields

* dict indexing

* dict var access

* tmp commit (tests failing)

* add slow

* use oai output type in completions

* (little rebase errors)

* working spec?

* guard type hint

* type hints. fix state (CB can now load different models)

* type hints; fn names; error type

* add docstrings

* responses + kv cache

* metadata support; fix kv cache; error event

* add output_index and content_index

* docstrings

* add test_build_response_event

* docs/comments

* gate test requirements; terminate cb manager on model switch

* nasty type hints

* more type hints

* disable validation by default; enable force models

* todo

---------

Co-authored-by: Lysandre <hi@lysand.re>

* Slight bugfixes

* PR comments from #39338

* make fixup

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2025-07-16 14:16:16 +02:00
c8524aeb07 [cache] make all classes cache compatible finally (#38635)
* dump

* push other models

* fix simple greedy generation

* xmod

* add fsmt and clean up some mentions of the old cache format

* gpt-bigcode now follows standards

* delete tuple cache reference in generation

* fix some models

* fix some models

* fix mambas and support cache in tapas

* fix some more tests

* fix copies

* delete `_reorder_cache`

* another fix copies

* fix typos and delete unnecessary test

* fix rag generate, needs special cache reordering

* fix tapas and superglue

* reformer create special cache

* recurrent gemma `reorder_cache` was a no-op, delete

* fix-copies

* fix blip and musicgen pipeline tests

* fix reformer

* fix reformer, again...

* delete `_supports_cache_class`

* delete `supports_quantized_cache`

* fix failing tests

* fix copies

* some minor clean up

* style

* style

* fix copies

* fix tests

* fix copies

* create causal mask now needs positions?

* fix copies

* style

* Update tests/test_modeling_common.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* clean-up of non-generative model after merging main

* check `is_decoder` for cache

* delete transpose for scores

* remove tuple cache from docs everywhere

* fix tests

* fix copies

* fix copies once more

* properly deprecate `encoder_attention_mask` in Bert-like models

* import `deprecate_kwarg` where needed

* fix copies again

* fix copies

* delete `next_decoder_cache`

* fix-copies asks to update for PLM

* fix copies

* rebasing had a few new models, fix them and merge asap!

* fix copies once more

* fix slow tests

* fix tests and update PLM checkpoint

* add read token and revert accidentally removed line

* oh come on, style

* just skip it, read token has no access to PLM yet

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-16 14:00:17 +02:00
6cb43defd0 docs: add missing numpy import to minimal example (#39444)
docs: add numpy import to minimal example
2025-07-16 11:57:13 +00:00
61163099f1 Remove runtime conditions for type checking (#37340)
Remove dynamic conditions for type checking

Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-16 13:36:48 +02:00
bfc9ddf5c6 Add StableAdamW Optimizer (#39446)
* Added StableAdamW as an optimizer option for Trainer. Also wrote tests to verify its behaviour.

* Fixed issue with

* Added docs for StableAdamW. Also fixed a typo in schedule free optimizers

---------

Co-authored-by: Gautham Krithiwas <gauthamkrithiwas2003@gmail.com>
2025-07-16 13:35:53 +02:00
b9ee528246 add test scanner (#39419)
* add test scanner

* add doc + license

* refactor for only 1 tree traversal

* add back test of only one method

* document single method scan

* format

* fixup generate tests

* minor fix

* fixup

* fixup doc
2025-07-16 12:45:46 +02:00
79941c61ce Fix missing definition of diff_file_url in notification service (#39445)
Fix missing definition of diff_file_url
2025-07-16 12:09:18 +02:00
e048d48bd0 Add cosine_with_min_lr_schedule_with_warmup_lr_rate scheduler in Trainer (#31870)
* add cosine_with_min_lr_schedule_with_warmup_lr_rate scheduler in trainer

* Update src/transformers/optimization.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update optimization.py

fix the error of the unclosed "("

* Update optimization.py

remove whitespace in line 402 in order to pass the quality test

* Update src/transformers/optimization.py

* Update src/transformers/optimization.py

* Apply style fixes

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-16 12:01:08 +02:00
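The schedule above decays to a floor instead of zero. A minimal sketch of the idea (illustrative, not the Trainer's exact implementation):

```python
import math

def cosine_with_min_lr(step, *, warmup_steps, total_steps, min_lr_rate=0.1):
    # Linear warmup to the peak LR, then cosine decay that bottoms out at
    # `min_lr_rate` * peak instead of zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr_rate + (1.0 - min_lr_rate) * cosine

# Multiply these factors into the base learning rate, e.g. via
# torch.optim.lr_scheduler.LambdaLR.
print([round(cosine_with_min_lr(s, warmup_steps=10, total_steps=100), 3)
       for s in (0, 10, 55, 100)])  # [0.0, 1.0, 0.55, 0.1]
```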
0cf08e90dd Change log level from warning to info for scheduled request logging in ContinuousBatchProcessor (#39372)
Change log level from warning to info for scheduled request logging in ContinuousBatchProcessor
2025-07-16 11:54:20 +02:00
ae4e306a40 Defaults to adamw_torch_fused for Pytorch>=2.8 (#37358)
* Defaults to adamw_torch_fused for latest Pytorch

Signed-off-by: cyy <cyyever@outlook.com>

* Fix test

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-16 09:52:33 +00:00
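A sketch of the version-gated default selection (illustrative; the real dispatch lives in `TrainingArguments`):

```python
import torch
from packaging import version

# Pick the fused AdamW implementation when the installed PyTorch is new
# enough, otherwise fall back to the plain torch AdamW.
torch_version = version.parse(torch.__version__.split("+")[0])
default_optim = "adamw_torch_fused" if torch_version >= version.parse("2.8.0") else "adamw_torch"
print(default_optim)
```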
4524a68c66 Fix L270 - hasattr("moe_args") returning False error (#38715)
* Fix L270 - hasattr("moe_args") returning False error

* Update src/transformers/models/llama4/convert_llama4_weights_to_hf.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-16 09:45:58 +00:00
d33a1c389f [chat template] add a testcase for kwargs (#39415)
add a testcase
2025-07-16 11:31:35 +02:00
99c9763398 Fixed a bug calculating cross entropy loss in JetMoeForCausalLM (#37830)
fix: 🐛 Fixed a bug in calculating Cross Entropy loss in JetMoeForCausalLM

In the original code, we shift the logits and pass shift_logits into self.loss_function, but self.loss_function shifts them again, so we were actually doing "next-next-token prediction", which is incorrect. I have removed the logits shifting before calling self.loss_function (see the sketch below).

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-16 11:22:00 +02:00
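
The double-shift bug in this entry is easy to reproduce. A minimal sketch (generic causal-LM loss, not the actual JetMoe code):

```python
import torch
import torch.nn.functional as F

def causal_lm_loss(logits, labels):
    # Standard next-token objective: the loss function itself shifts by one.
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1)
    )

logits, labels = torch.randn(2, 8, 32), torch.randint(0, 32, (2, 8))
# Buggy call site: shifting *before* the call shifts twice -> "next-next-token".
loss_wrong = causal_lm_loss(logits[..., :-1, :], labels[..., 1:])
# Fixed call site: pass unshifted tensors; the loss applies the single shift.
loss_right = causal_lm_loss(logits, labels)
```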
667ad02374 Remove double soft-max in load-balancing loss. Fixes #39055 . (#39056)
Remove double soft-max in load-balancing loss. Fixes #39055
2025-07-16 09:20:23 +00:00
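
For reference, a sketch of a top-1 load-balancing auxiliary loss with the softmax applied exactly once (illustrative only, not the exact transformers implementation):

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor, num_experts: int):
    # Router probabilities: a single softmax over the expert dimension.
    routing_weights = torch.softmax(router_logits, dim=-1)
    # The bug was a second softmax here: softmax of a probability vector
    # flattens it toward uniform, silently weakening the auxiliary loss.
    prob_per_expert = routing_weights.mean(dim=0)          # routing mass per expert
    tokens_per_expert = F.one_hot(
        routing_weights.argmax(dim=-1), num_classes=num_experts
    ).float().mean(dim=0)                                  # token share per expert
    return num_experts * (prob_per_expert * tokens_per_expert).sum()

print(load_balancing_loss(torch.randn(16, 4), num_experts=4))
```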
31d81943c9 [Core] [Offloading] Fix saving offloaded submodules (#39280)
* fix counting meta tensors, fix onloading meta tensors

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove unrelated fix

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove unrelated change

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add clarifying comment

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add test_save_offloaded_model_with_direct_params

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix merge conflict, add decorators

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-07-16 08:44:40 +00:00
add43c4d09 [autodocstring] add video and audio inputs (#39420)
* add  video and audio inputs in auto docstring

* fix copies
2025-07-16 09:41:50 +02:00
0dc2df5dda CI workflow for performed test regressions (#39198)
* WIP script to compare test runs for models

* Update line normalization logic

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-07-16 04:20:02 +02:00
1bc9ac5107 docs: update LightGlue docs (#39407)
* docs: update LightGlue docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-15 12:40:50 -07:00
d9574f2fe3 docs: update SuperGlue docs (#39406)
* docs: update SuperGlue docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-15 12:40:26 -07:00
9f41f67135 [vlm] fix loading of retrieval VLMs (#39242)
* fix vlm with retrieval

* we can't use AutoModel because new ColQwen was released after refactor

* no need for colqwen

* tied weight keys are necessary if using ImageTextToText

* need to apply renaming in tied weights, only for ColPali

* overwrite tied keys in ColPali

* fix copies, modular can't handle if-statements
2025-07-15 17:23:54 +02:00
b1d14086e4 handle training summary when creating modelcard but offline mode is set (#37095)
* handle training summary when creating modelcard but offline mode is set

* chore: lint
2025-07-15 17:21:15 +02:00
67f42928f0 Remove residual quantization attribute from dequantized models (#39373)
* fix: removing quantization trace attribute from dequantized model

Fixes #39295

* add: test `to(dtype=torch.float16)` after dequantization
2025-07-15 17:16:10 +02:00
30c508dbcb Remove deprecated audio utils functions (#39330)
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-15 14:02:25 +00:00
d8e05951b8 Fix bugs in pytorch example run_clm when streaming is enabled (#39286) 2025-07-15 15:37:28 +02:00
a989bf8d84 Fix bugs from pipeline preprocessor overhaul (#39425)
* Correct load classes for VideoClassificationPipeline

* Correct load classes for the ASR pipeline
2025-07-15 14:28:59 +01:00
53c9dcd6fd refactor: remove set_tracer_provider and set_meter_provider calls (#39422) 2025-07-15 14:22:12 +02:00
f03b384149 Fix invalid property (#39384)
Signed-off-by: cyy <cyyever@outlook.com>
2025-07-15 12:11:37 +00:00
c4d41567fa set document_question_answering pipeline _load_tokenizer to True (#39411)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-15 12:05:49 +00:00
f56b49f48f Ignore extra position embeddings weights for ESM (#39063)
* Ignore extra position embeddings weights

* Slight name fix
2025-07-15 11:57:32 +00:00
2b79f14375 support loading qwen3 gguf (#38645)
* support loading qwen3 gguf

* Add qwen3 into GGUF_TO_FAST_CONVERTERS for tokenizer conversion

* Add testcase

* Fix formatting
2025-07-15 09:53:41 +00:00
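
With this change, a GGUF-quantized qwen3 checkpoint can be loaded through the existing `gguf_file` entry point. A sketch (the repo and file names below are placeholders, not taken from the PR):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B-GGUF"          # hypothetical repo name
gguf_file = "qwen3-0.6b-q4_k_m.gguf"       # hypothetical file name

# gguf_file triggers on-the-fly dequantization into a regular transformers model.
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)
```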
0e4b7938d0 Add ModernBERT Decoder Models - ModernBERT, but trained with CLM! (#38967)
* working locally; need to style and test

* added docs and initial tests; need to debug and flesh out

* fixed tests

* working long context; batches

* working fa2 and eager

* update tests

* add missing configs

* remove default autoset

* fix spacing

* fix most tests

* fixed tests

* fix to init

* refactor to match new transformers updates

* remove static cache option

* fa2 fix

* fix docs

* in progress

* working on tests

* fixed issue with attn outputs

* remove debug

* fix local config attr

* update doc string

* fix docstring

* add docs to toc

* correct typo in toc

* add new updates from main w.r.t. ModernBERT RoPE

* fix local param

---------

Co-authored-by: oweller2 <oweller2@dsailogin.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l07.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@n02.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l08.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l01.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l02.mgmt.ai.cluster>
2025-07-15 10:40:41 +02:00
0b724114cf Fix typo in /v1/models output payload (#39414) 2025-07-15 08:59:25 +01:00
8d6259b0b8 [refactor] set attention implementation (#38974)
* update

* fix some tests

* init from config, changes it in-place, add deepcopy in tests

* fix modernbert

* don't delete this config attr

* update

* style and copies

* skip tests in generation

* fix style

* accidentally removed flash-attn-3, revert

* docs

* forgot about flags set to False

* fix copies

* address a few comments

* fix copies

* custom code BC
2025-07-15 09:34:06 +02:00
6017f5e8ed [siglip] fix pooling comment (#39378)
* feat(siglip2): add forward pass with pooled output logic in Siglip2TextModel

* test(siglip2): add test_text_model.py to verify pooled output behavior

* style(siglip2): fix formatting in test_text_model.py using Ruff

* fix(siglip2): remove misleading 'sticky EOS' comment and sync modular-classic files

* fix(siglip2): remove misleading 'sticky EOS' comment and sync modular-classic files

* chore(siglip2): regenerate classic model after modular change

* Update
2025-07-14 17:47:19 +00:00
8d40ca5749 Update phi4_multimodal.md (#38830)
* Update phi4_multimodal.md

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update phi4_multimodal.md

* Update phi4_multimodal.md

* Update phi4_multimodal.md

* Update phi4_multimodal.md

* Update phi4_multimodal.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-14 10:35:17 -07:00
3635415af2 [Docs] Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic (#39391)
Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic
2025-07-14 09:25:06 -07:00
3a48e9534c Use np.pad instead of np.lib.pad. (#39346)
* Use np.pad instead of np.lib.pad.

* Update audio_utils.py

Formatting
2025-07-14 16:05:28 +00:00
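
The replacement is a drop-in rename; `np.pad` takes the same arguments:

```python
import numpy as np

audio = np.ones(5, dtype=np.float32)
# np.lib.pad was only an alias; np.pad is the stable public entry point.
padded = np.pad(audio, (2, 3), mode="constant", constant_values=0.0)
print(padded.shape)  # (10,)
```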
3d8be20cd2 Totally rewrite how pipelines load preprocessors (#38947)
* Totally rewrite how pipelines load preprocessors

* Delete more mappings

* Fix conditionals, thanks Cyril!
2025-07-14 16:40:04 +01:00
903944a411 [examples] fix do_reduce_labels argument for run_semantic_segmentation_no_trainer (#39322)
* don't use the do_reduce_labels argument in the model

* use do_reduce_labels in AutoImageProcessor
2025-07-14 10:16:49 +00:00
8165c703ab Fix Lfm2 and common tests (#39398)
* fix

* better fix

* typo
2025-07-14 12:02:59 +02:00
878d60a3cb Deprecate AutoModelForVision2Seq (#38900)
deprecate vision2seq
2025-07-14 11:42:06 +02:00
ad333d4852 [Qwen2.5-VL] Fix torch.finfo() TypeError for integer attention_mask_tensor (#39333)
* Update modeling_qwen2_5_vl.py

### 🐛 Bug Description

When using Unsloth’s Qwen2.5-VL vision models (both 3B and 7B) with the latest HuggingFace Transformers (commit: 520b9dcb42cef21662c304583368ff6645116a45), the model crashes due to a type mismatch in the attention mask handling.

---

### 🔥 Error Traceback

* Fix dtype compatibility in attention mask processing

Replace hardcoded torch.finfo() usage with dtype-aware function selection to handle both integer and floating-point attention mask tensors.
Technical Details:

Problem: Line 1292 assumes floating-point dtype for attention_mask_tensor
Solution: Add dtype check to use torch.iinfo() for integer types and torch.finfo() for float types
Files Modified: transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

* Update modeling_qwen2_5_vl.py

* Update modeling_qwen2_5_vl.py

* Fix: Cast to float before applying torch.finfo

* # Fix: Use appropriate function based on dtype

* Update modular_qwen2_5_vl.py

* Fix: Cast to float before applying torch.finfo

* Fix: Use appropriate function based on dtype

* Fix: Use appropriate function based on dtype

* Updated modeling_glm4v.py

* Only apply conversion for floating point tensors (inverted masks)

* corrected the format issue

reformatted modeling_glm4v.py

All done!  🍰 
1 file reformatted

* Fix: Cast to float before applying torch.finfo

Corrected the format issue

* Fix torch.finfo() for integer attention mask

#39333

* Run make fix-copies and make style for CI compliance

- Updated dependency versions table
- Fixed code formatting and style issues
- Sorted auto mappings
- Updated documentation TOC

* Fix torch.finfo() TypeError for

Fix torch.finfo() TypeError for integer attention_mask_tensor #39333

* Fix torch.finfo() TypeError for integer
2025-07-14 07:47:39 +00:00
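
The essence of the fix, as a standalone sketch (the helper name is ours, not the code's):

```python
import torch

def mask_min_value(attention_mask: torch.Tensor):
    # torch.finfo only accepts floating dtypes, so integer masks need iinfo.
    if torch.is_floating_point(attention_mask):
        return torch.finfo(attention_mask.dtype).min
    return torch.iinfo(attention_mask.dtype).min

print(mask_min_value(torch.zeros(2, 2, dtype=torch.int64)))    # iinfo path
print(mask_min_value(torch.zeros(2, 2, dtype=torch.float16)))  # finfo path
```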
c30af65521 [BLIP] remove cache from Qformer (#39335)
* remove cache from Qformer

* fix

* this was never correct...
2025-07-14 09:20:01 +02:00
66cd995618 [shieldgemma] fix checkpoint loading (#39348)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-14 08:34:58 +02:00
a1ad9197c5 Fix overriding Fast Image/Video Processors instance attributes affecting other instances (#39363)
* fix and add tests

* nit
2025-07-12 23:39:06 +00:00
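
An illustrative sketch of the bug class behind this fix (the attribute names are assumptions; the commit itself does not show them):

```python
class FastProcessor:
    _default_size = {"height": 224, "width": 224}  # one shared mutable dict

    def __init__(self, size=None):
        # Buggy: self.size = size or self._default_size  # aliases the class dict
        self.size = dict(size or self._default_size)     # fixed: per-instance copy

a, b = FastProcessor(), FastProcessor()
a.size["height"] = 384
print(b.size["height"])  # 224 with the copy; would be 384 with the aliasing bug
```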
dc98fb3e5e update docker file to use latest timm (for perception_lm) (#39380)
update docker file for timm

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-12 23:19:37 +02:00
5c30f7e390 Update Model Card for Encoder Decoder Model (#39272)
* update model card.

* add back the model contributors for mamba and mamba2.

* update the model card.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update batches with correct alignment.

* update examples and remove quantization example.

* update the examples.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update example.

* correct the example.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-11 11:23:08 -07:00
0d7efe3e4b fix gpt2 usage doc (#39351)
fix typo of gpt2 doc usage
2025-07-11 10:59:41 -07:00
a646fd55fd Updated CamemBERT model card to new standardized format (#39227)
* Updated CamemBERT model card to new standardized format

* Applied review suggestions for CamemBERT: restored API refs, added examples, badges, and attribution

* Updated CamemBERT usage examples, quantization, badges, and format

* Updated CamemBERT badges

* Fixed CLI Section
2025-07-11 10:59:09 -07:00
af74ec65a7 Update Readme to Run Multiple Choice Script from Example Directory (#39323)
* Update Readme to run in current place

* Update Readme files to execute PyTorch examples from their respective folders
2025-07-11 10:58:26 -07:00
70e57e4710 Add mistral common support (#38906)
* wip: correct docstrings

* Add mistral-common support.

* quality

* wip: add requested methods

* wip: fix tests

* wip: add internally some methods not being supported in mistral-common

* wip

* wip: add opencv dependency and update test list

* wip: add mistral-common to testing dependencies

* wip: revert some test changes

* wip: ci

* wip: ci

* clean

* check

* check

* check

* wip: add hf image format to apply_chat_template and return pixel_values

* wip: make mistral-common non-installed safe

* wip: clean zip

* fix: from_pretrained

* fix: path and base64

* fix: path and import root

* wip: add docs

* clean

* clean

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-07-11 16:26:58 +00:00
665418dacc Remove device check in HQQ quantizer (#39299)
* Remove device check in HQQ quantizer

Fix https://github.com/huggingface/transformers/issues/38439

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-11 14:59:51 +00:00
601bea2c4e Verbose error in fix mode for utils/check_docstrings.py (#38915)
* fix ast deprecations for python 3.14: replace node.n by node.value and use `ast.Constant`

More verbose exceptions in `fix_docstring` on docstring formatting issues.
2025-07-11 14:36:10 +00:00
24f771a043 fix failing test_sdpa_can_dispatch_on_flash (#39259)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-11 16:30:56 +02:00
ee74397d20 update cb TP (#39361)
* update cb TP

* safety
2025-07-11 15:54:25 +02:00
9bc675b3b6 Fix link for testpypi (#39360)
fix link
2025-07-11 15:34:01 +02:00
bf607f6d3b PerceptionLM (#37878)
* plm template

* A working plm with fixed image features

* hacked processor

* First version that reproduced PLM output using PE from timm.

* Simplify and fix tie_word_embeddings

* Use PIL resize. Simplify conversion.

* First version that works with video input.

* simplified image preprocessing (not batched)

* Minor fixes after rebasing on main.

* Video processor based on new API.

* Revert to use _preprocess for image processor.

* refactor with modular

* fix tie_word_embedding

* Testing with timm PE

* check in missed conversion from modular to model.py

* First working version of PLM with Eva PE. PLM-1B and 3B outputs are exactly the same as before. PLM-8B output has some differences.

* address review comments

* Fixed batching if video and image examples mixed.

* Simplify PE configuration.

* Enable AutoModel for PerceptionEncoder.

* Update PE config style.

* update all headers

* Minor fixes.

* Move lm_head to PerceptionLMForConditionalGeneration.
Fix vit_G model specification.

* Fix for testing_modeling_perception_lm.py

* Image processing refactoring to use more common parts.

* Fix processor test.

* update tests to use model from hub

* More test fixes.

* integration test GT update after rebasing; probably due to video preprocessing

* update test media path to hub

* Stop tracking local scripts

* address some review comments

* refactor image processing.

* small fixes

* update documentation and minor fixes

* remove scripts

* Minor fix for CI

* Fix image processing

* CI and doc fix

* CI formatting fix

* ruff fix

* ruff formatting

* ran utils/sort_auto_mappings.py

* update docstring

* more docstring updates

* add vision_input_type default fallback for image processing

* more verbose variable naming

* test update

* Remove PE and PEConfig use AutoModel(TimmWrapper) instead

* Minor cleanup.

* Minor Fix: remove any ref to PE. Ruff format and check.

* fix docstring

* Fix modular/model consistency. Improve docstring.

* Fix PerceptionLMForConditionalGenerationModelTest

* ruff fix

* fix for check_repo

* minor formatting

* dummy size arg to fix for processor test.

* Update docstring for PerceptionLMConfig

* Minor fixes from review feedback.

* Revert some minor changes per reviewer feedback.

* update base_model_prefix

* address reviewer feedback

* fix comment in modeling file

* address reviewer feedback

* ruff format

* Pre-merge test update.

* reapply modular and fix checkpoint name

* processor test path

* use modular a bit more

* remove dead code

* add token decorator

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-11 11:07:32 +02:00
4b47b2b8ea Updated Switch Transformers model card with standardized format (Issue #36979) (#39305)
* Updated Switch Transformers model card with standardized format (Issue #36979)

* Apply reviewer suggestions to the new standardised Switch Transformer's model card

* Update switch_transformers.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-10 15:34:10 -07:00
fe1a5b73e6 [modular] speedup check_modular_conversion with multiprocessing (#37456)
* Change topological sort to return level-based output (lists of lists)

* Update main for modular converter

* Update test

* update check_modular_conversion

* Update gitignore

* Fix missing conversion for glm4

* Update

* Fix error msg

* Fixup

* fix docstring

* update docs

* Add comment

* delete qwen3_moe
2025-07-10 19:07:59 +01:00
571a8c2131 Add a default value for position_ids in masking_utils (#39310)
* set default

* Update masking_utils.py

* add small test
2025-07-10 18:53:40 +02:00
bdc8028cb3 [Core] [Offloading] Enable saving offloaded models with multiple shared tensor groups (#39263)
* fix counting meta tensors, fix onloading meta tensors

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove unrelated fix

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add test

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-07-10 18:33:30 +02:00
df49b399dc [tests] tag serve tests as slow (#39343)
* maybe they need more cpu resources?

* add todo
2025-07-10 15:40:08 +00:00
36e80a18da [modeling][lfm2] LFM2: Remove deprecated seen_tokens (#39342)
* [modeling][lfm2] remove deprecated seen_tokens

* [modular][lfm2] remove deprecated seen_tokens from modular file
2025-07-10 17:27:55 +02:00
9682d07f92 LFM2 (#39340)
* [modeling][lfm2] LFM2 model on 4.53.0 interface

* [configuration] hook in LFM2 keys

* [modeling][lfm2] update modeling interface for 4.53.1

* [modeling][lfm2] apply mask to hidden conv states

* [misc] ruff format/lint

* [modeling][lfm2] minor: NotImplemented legacy cache conversion

* Create lfm2.md

* create nice modular

* style

* Update modeling_auto.py

* clean and start adding tests

* style

* Update test_modeling_lfm2.py

* Update __init__.py

* small test model size

* config

* small fix

* fix

* remove useless config attrs -> block_dim and conv_dim are hidden_size

* fix prepare inputs

* fix config

* test

* typo

* skip tests accordingly

* config docstrings

* add doc to .md

* skip config docstring check

---------

Co-authored-by: Maxime Labonne <81252890+mlabonne@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-10 16:07:33 +02:00
38c3931362 [server] add tests and fix passing a custom generation_config (#39230)
* add tests; fix passing a custom generation_config

* tool integration test

* add install step

* add accelerate as dep to serving

* add todo
2025-07-10 13:41:38 +00:00
6b09c8eab0 Handle DAC conversion when using weight_norm with newer PyTorch versions (#36393)
* Update convert_dac_checkpoint.py

* Update convert_dac_checkpoint.py

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-07-10 10:36:58 +00:00
92043bde29 fix phi3 tests (#39312)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-10 11:51:55 +02:00
520b9dcb42 fix Glm4v batch videos forward (#39172)
* changes for video

* update modular

* change get_video_features

* update video token replacement

* update modular

* add test and fix typo

* lint

* fix order

* lint

* fix

* remove dependency

* lint

* lint

* remove todo

* resize video for test

* lint..

* fix test

* new a processor for video_test

* fix test
2025-07-10 10:44:28 +02:00
bc161d5d06 Delete deprecated stuff (#38838)
* delete deprecated stuff

* fix copies

* remove unused tests

* fix modernbert and fuyu

* Update src/transformers/cache_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* bye bye `seen_tokens`

* address comments

* update typings

* encoder-decoder models follow the same pattern as whisper

* fix copies

* why is it set to False?

* fix switch transformers

* fix encoder decoder models shared weight

* fix copies and RAG

* remove `next_cache`

* fix gptj/git

* fix copies

* fix copies

* style...

* another forgotten docstring

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-10 05:18:44 +00:00
c6ee0b1da8 Fix broken SAM after #39120 (#39289)
fix
2025-07-09 17:46:22 -04:00
aff7df8436 enable static cache on TP model (#39164)
* enable static cache on TP model

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check tp size before init kv cache

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix docstring

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add tp tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix other cache head size

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-09 21:14:45 +00:00
2ef59646b8 Fix max_length_q and max_length_k types to flash_attn_varlen_func (#37206)
Also add notes asking users to set `TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1`
or call `torch._dynamo.config.capture_scalar_outputs = True`, as currently
this will cause a graph break.

Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-07-09 23:12:39 +02:00
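
The two workarounds mentioned in the note, spelled out:

```python
import os

# Option 1: environment variable, set before any torch.compile run.
os.environ["TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS"] = "1"

# Option 2: the equivalent in-code switch.
import torch
torch._dynamo.config.capture_scalar_outputs = True
```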
2d600a4363 Granite speech speedups (#39197)
* ensure the query is updated during training

avoid unused parameters that DDP does not like

* avoid a crash when `kwargs` contain `padding=True`

trainers often pass this argument automatically

* minor

* Remove mel_spec lazy init, and rename to mel_filters.
This ensures save_pretrained will not crash when saving the processor during training
d5d007a1a0/src/transformers/feature_extraction_utils.py (L595)

* minor - most feature extractors have a `sampling_rate` property

* speedup relative position embeddings

* fix several issues in model saving/loading:
- avoid modifying `self._hf_peft_config_loaded` when saving
- adapter_config automatically points to the original base model - a finetuned version should point to the model save dir.
- fixing model weights names, that are changed by adding an adapter.

* minor

* minor

* minor

* fixing a crash without peft active

* add todo to replace einsum

* granite speech speedups:
1. register attention_dist to avoid cpu-to-gpu transfer every layer.
2. pad_sequence is much faster than per-sample-padding + concat.
3. avoid returning audio back to cpu when using a compute device.

* support audio.shape=(1,L)
2025-07-09 23:09:50 +02:00
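
Speedup (2) in a nutshell, as a sketch with hypothetical feature shapes:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

features = [torch.randn(n, 80) for n in (93, 120, 57)]  # variable-length samples
# One vectorized call instead of padding each sample and concatenating.
batch = pad_sequence(features, batch_first=True)
print(batch.shape)  # torch.Size([3, 120, 80])
```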
5111c8ea2f Fix typo: langauge -> language (#39317) 2025-07-09 12:06:46 -07:00
2781ad092d docs: update LLaVA-NeXT model card (#38894)
* docs: update LLaVA-NeXT model card

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* [docs] Updated llava_next model card

* Update docs/source/en/model_doc/llava_next.md remove image sources

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* [fix] Change Flash Attention to SDPA badge

* [doc] fixed quantization example

* docs: updated contribution details and badges

* Update llava_next.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-09 11:32:40 -07:00
16dd7f48d0 skip files in src/ for doctest (for now) (#39316)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-09 19:36:48 +02:00
d61c0d087c Updated the Model docs - for the MARIAN model (#39138)
* Update marian.md

This update improves the Marian model card to follow the Hugging Face standardized model card format. The changes include:

- Added a clear description of MarianMT, its architecture, and how it differs from other models.
- Provided usage examples for Pipeline and AutoModel.
- Added a quantization example for optimizing model inference.
- Included instructions and examples for multilingual translation with language codes.
- Added an Attention Mask Visualizer example.
- Added a Resources section with relevant links to papers, the Marian framework, language codes, tokenizer guides, and quantization documentation.
- Fixed formatting issues in the code blocks for correct rendering.

This update improves the readability, usability, and consistency of the Marian model documentation for users.

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update marian.md

* Update marian.md

* Update marian.md

* Update marian.md

* Update docs/source/en/model_doc/marian.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update marian.md

* Update marian.md

* Update marian.md

* Update marian.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-09 10:23:03 -07:00
161cf3415e add stevhliu to the list in self-comment-ci.yml (#39315)
add

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-09 19:07:44 +02:00
3be10c6d19 Fix consistency and a few docstrings warnings (#39314)
* Update modeling_deepseek_v2.py

* fix docstrings

* fix

* fix
2025-07-09 18:40:37 +02:00
4652677c89 🌐 [i18n-KO] Translated quark.md to Korean (#39268)
* initial translation

* removed english parts

* maintain consistency

* Update docs/source/ko/quantization/quark.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/quantization/quark.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/quantization/quark.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/quantization/quark.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* add toctree

* fixed indentation

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2025-07-09 09:29:51 -07:00
c980904204 Add DeepSeek V2 Model into Transformers (#36400)
* add initial structure

* doc fixes, add model base logic

* update init files

* some fixes to config and modular

* some improvements for attention

* format

* remove unused attn

* some fixes for moe layer and for decoder

* adapt _compute_yarn_parameters for deepseek

* format

* small fix

* fix for decoder forward

* add tests, small refactoring

* fix dummies

* fix init

* fix doc

* fix config docs

* add sequence doc, fix init for gate

* fix issues in tests

* fix config doc

* remove unused args

* some fixes and refactoring after review

* fix doc for config

* small fixes for config args

* revert config refactoring

* small refactoring

* minor fixes after rebase

* small fix after merge

* fix modular

* remove rotaryembd from public init

* small test fix

* some rotary pos calculation improvement

* fix format

* some improvements and fixes

* fix config

* some refactoring

* adjust some unit tests

* skip test

* small fixes and tests adjustment

* reapply modular

* fix all tests except Integration

* fix integration tests

* cleanup BC stuff

* rope

* fix integrations tests based on a10

* style

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-09 17:04:28 +02:00
accbd8e0fe [sliding window] revert and deprecate (#39301)
* bring back and deprecate

* oops

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-07-09 16:10:38 +02:00
1cefb5d788 [modular] Allow method with the same name in case of @property decorator (#39308)
* fix

* add example

* fix

* Update modular_model_converter.py
2025-07-09 15:46:53 +02:00
4798c05c64 skip test_torchscript_* for now until the majority of the community ask for it (#39307)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-09 15:35:48 +02:00
fe5f3c85d2 fix aria tests (#39277)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-09 13:49:33 +02:00
0687d481e2 [flash attn 3] bring back flags (#39294)
* flash attn 3 flag

* fix copies
2025-07-09 09:45:01 +02:00
25343aafee Fix SDPA attention precision issue in Qwen2.5-VL (#37363)
* solve conflicts and remove  redundant attention_mask in qwenvit

* update decoded text check

* remove trailing whitespace
2025-07-09 07:03:44 +02:00
0e1c281745 [Tests] Update model_id in AIMv2 Tests (#39281)
* Update model_id in tests

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-08 21:46:32 +02:00
7ef592c96c Update T5gemma (#39210)
* bug fix: add vocab_size to t5gemmaconfig for pipeline.

* Update checkpoint placeholder

* minor change

* minor change

* minor change: update example.

* fix: add vocab_size as an explicit arg.

* bug fix:

remove vocab_size verification; instead, re-set encoder/decoder vocab size.

Note: in t5gemma, the vocab size of encoder/decoder should always be the same.

* add `add_generation_prompt` for message preprocessing.
2025-07-08 19:08:48 +02:00
1ecd52e50a Add torchcodec in docstrings/tests for datasets 4.0 (#39156)
* fix dataset run_object_detection

* bump version

* keep same dataset actually

* torchcodec in docstrings and testing utils

* torchcodec in dockerfiles and requirements

* remove duplicate

* add torchcodec to all the remaining docker files

* fix tests

* support torchcodec in audio classification and ASR

* [commit to revert] build ci-dev images

* [commit to revert] trigger circleci

* [commit to revert] build ci-dev images

* fix

* fix modeling_hubert

* backward compatible run_object_detection

* revert ci trigger commits

* fix mono conversion and support torch tensor as input

* revert map_to_array docs + fix it

* revert mono

* nit in docstring

* style

* fix modular

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-08 17:06:12 +02:00
1255480fd2 [lightglue] add support for remote code DISK keypoint detector (#39253)
* feat: add trust_remote_code in LightGlueConfig

* fix: made sure trust_remote_code is provided only when necessary

* fix: make style

* docs: added missing trust_remote_code docstring

* refactor: refactored LightGlue config init

* fix: removed unnecessary argument
2025-07-08 15:03:04 +00:00
838a0268b8 fix flaky test_generate_compile_model_forward (#39276)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-08 15:36:05 +02:00
29d0030e23 Refactor PretrainedConfig.__init__ method to make it more explicit (#39158)
* cleanup

* fix no `__init__` test

* fix missing inits
2025-07-08 14:24:39 +01:00
1580f64653 [smollm3] add tokenizer mapping for smollm3 (#39271)
add tok mapping to smollm3
2025-07-08 10:44:01 +00:00
db05e4ff33 [pagged-attention] fix off-by-1 error in pagged attention generation (#39258)
* fix off-by-1 error in pagged attention generation

* formatting

* use update_with_token
2025-07-08 12:34:22 +02:00
6f1a43896c [CI] fix docs (#39273)
* fix docs

* add ko gloassary file to toctree
2025-07-08 11:31:03 +01:00
fbdaa7b099 Add Aimv2 model (#36625)
* Model skelton

* changes

* temp push

* changes

* Added support for aimv2-native

* More changes

* More changes

* Stupid mistake correction

* Added config and refactor

* Added vision model

* update

* Refactor for lit variant

* Added Text Model

* Minor fixes

* nits

* update

* Preliminary tests

* More fixes

* Updated tests 🤗

* Refactor

* Updated testcase

* Updated config

* make fixup

* more fixes

* Bug fix and updates

* deadcode

* Fixes

* nit

* up

* Happy CI 

* Reduce LOC

* nit

* nit

* make style

* return_dict refactor

* bug fix

* fix

* doc update

* nit

* make fixup

* Minor update

* _init_weights modification

* update tests

* Minor fixes post review

* Update w.r.t GradientCheckpointingLayer

* docs update

* update

* nit

* Use more Modular 😉

* Change name from AIMv2 to Aimv2

* Nit

* make style

* Add model doc pointer

* make style

* Update model doc section

* updates

* Modify attn mask and interface

* update test

* Final change

* Utilize flash and flex attn

* keep attn mask

* camelcase model name in test file

* Fix docstring

* Fix config warning finally and create_causal_mask

* disable torchscript

* remove unused arg

* remove from tests

* balance model size for tests

* fix device

* tests

* tests

* flaky test

* fix import

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-08 11:53:21 +02:00
d8590b4b0c Add Doge model (#35891)
* Add Doge Model

* Fix code quality

* Rollback an error commit

* Fix config for open-source weights

* Revert "Fix config for open-source weights"

This reverts commit 229cdcac10a6a4274d1dd13b729bc14c98eb0c76.

* Add modular_doge

* Update Doge inherits from Llama

* Fix import bug

* [docs] Add usage of doge model

* Fix Doge import pretrainedconfig from modeling_utils to configuration_utils

* [docs] remove trust remote code from doge

* Fix dynamo bug in doge model

* Update docstrings

* Import apply_rotary_pos_emb and repeat_kv from Llama

* Fix all nits

* Fix code quality

* Fix some bugs

* Fix code quality

* Remove inherited `_update_causal_mask` from Llama
This leads to incorrect weight initialization.

* Fix the wrong tensor orderings in DogeCDMoE

* Fix attention mask bug
We have to provide attention_mask for dynamic mask computation

* Modify most implementations to inherit from Llama
But there are two problems:
1. `flex_attention_forward` is not updated properly
2. `Example` error in the forward method of DogeForCausalLM

* Modify CDMoE for batch efficient implementation

* Uniform MoE configuration names, just like QwenMoE

* Fix code quality

* Fix code quality

* Fix code quality

* Add tp plan of CDMoE Module

* Hybrid DMA with sliding window

* Update valid tokens greater than window size

* Fix code quality

* Add `convert_doge_weights_to_hf`

* Fix STATE_DICT_MAPPING in convert_doge_weights_to_hf.py

* Fix nits in modular_doge

* Fix code quality

* Fix all nits

* Fix all nits

* Make sure the attention function is updated inside the class

* Fix code quality issues in the Doge model and add a test for it

* Fix `test_generate`

* Fix code quality

* Fix nits following suggestions

* Fix code quality

* Fix code quality issues

* Fix nits

* Fix code quality nits

* Fix the missing parameters in the configuration.

* Fix the missing parameters in the configuration.

* Fix nits

* Add initialization of attention

* Fix last nits

* Simplify dynamic mask generation logic

* Rename router_logits to gate_logits for matching latest changes of MixtralModel

* Rename typings for matching latest changes of MixtralModel

* Fixes typo in comment

* Update src/transformers/models/doge/modular_doge.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix code quality issues to match other modular

* Fix code quality issues to match other modular

* Fix the static compilation errors

* Update model weights link

* Fix code quality issues to match other modular

* reapply modular and support for new outputs

* style

* simplify a lot

* fix import location

* reapply modular

* fix

* fix integration test

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-08 11:44:29 +02:00
d370bc64c6 Fix errors when using verl to train the GLM4.1v model (#39199)
* Fix errors when using verl to train the GLM4.1v model

* Support glm4v load from AutoModelForVision2Seq
* Set glm4v model _checkpoint_conversion_mapping attr from None to {}

* Update modeling_auto.py
2025-07-08 09:39:31 +00:00
5fb8bb3e1a fix recompiles due to instance key, and deepcopy issues (#39270)
* fix recompiles due to instance key, and deepcopy issues

* dict
2025-07-08 11:38:11 +02:00
356fd68109 fix(generation): stop beam search per-instance when heuristic satisfied (#38778)
* fix(decoding): stop beam search per-instance when heuristic satisfied

Previously, when early_stopping was set to `False`, the early-stopping heuristic only halted generation when **all** batch instances reached the criterion. This caused instances that the heuristic deemed impossible to improve to keep generating, leading to inconsistent and overly long outputs across the batch.

Now we apply the heuristic **per-instance**: once an instance of the batch has all its beams deemed impossible to improve, we mark that instance finished while letting the others continue. This restores the expected behavior and ensures consistency in batched generation.

* Add test case GenerationIntegrationTests.test_beam_search_early_stop_heuristic

* Update naming improvement_possibility -> is_early_stop_heuristic_unsatisfied

* Add comments for early stop heuristic

* Update src/transformers/generation/utils.py

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-08 08:59:37 +00:00
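
A minimal sketch of the per-instance bookkeeping described above (tensor names are ours; this is not the actual `generation/utils.py` code):

```python
import torch

def mark_finished(done, best_possible, worst_finished):
    # Old behavior: halt only once the heuristic held for ALL instances:
    #   all_done = bool((best_possible <= worst_finished).all())
    # New behavior: mark each instance done independently, so hopeless
    # instances stop while the rest of the batch keeps generating.
    return done | (best_possible <= worst_finished)

done = torch.tensor([False, False])
print(mark_finished(done, torch.tensor([0.1, 0.9]), torch.tensor([0.5, 0.5])))
# tensor([ True, False])
```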
0b0ede8b2b remove broken block (#39255)
* remove broken block

* fixup
2025-07-08 10:41:44 +02:00
a21557fa3e Skip test_eager_matches sdpa generate and update an integration test for blip-like models (#39248)
* skip

* skip

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-08 10:38:25 +02:00
ea3c2c0277 Fix license text, duplicate assignment, and typo in constant names (#39250)
- Complete Apache License text in Italian documentation
- Remove duplicate variable assignment in Perceiver converter
- Fix typo in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES constant
2025-07-08 10:20:52 +02:00
b2816da802 fix xpu failures on PT 2.7 and 2.8 w/o IPEX and enable hqq cases on XPU (#39187)
* chameleon xpu bnb groundtruth update on bnb triton backend since we are
deprecating ipex backend

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable hqq uts on XPU, all passed

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix comment

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-07-08 10:18:26 +02:00
17b3c96c00 Glm 4 doc (#39247)
* update the glm4 model readme

* update test

* update GLM-4.1V model

* update as format

* update

* fix some tests

* fix the rest

* fix on a10, not t4

* nit: dummy import

---------

Co-authored-by: raushan <raushan@huggingface.co>
2025-07-08 08:22:04 +02:00
bbca9782ca Update LED model card (#39233)
* Update LED model card

* Remove extra arguments

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-07 15:56:57 -07:00
41e865bb8d fix some flaky tests in tests/generation/test_utils.py (#39254)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-07 19:49:41 +02:00
93747d89ea Simplify Mixtral and its modular children (#39252)
* simplify mixtral a lot

* fix

* other moes

* mixtral

* qwen3

* back

* Update modular_qwen3_moe.py
2025-07-07 19:40:41 +02:00
3993ee1e98 Add segmentation_maps support to MobileNetV2ImageProcessor (#37312)
* Add `segmentation_maps` support to mobilenet_v2 image processor and `reduce_labels` to mobilevit

* Changed mobilenetv2 tests to support fastimageprocessor

* added `segmentation_maps` support to fast image processor

* reverted to upstream/main

* Add optional

* Use autodocstring

* Changed docs

* Docs fix

* Changed fp to match beit fp

* Change typing imports

* Fixed repo inconsistency

* Added fast-slow equivalence tests

* Removed unnecessary call

* Add `reduce_labels` to Mobilevit fast processor

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-07-07 13:34:59 -04:00
b96f213fcf Clarify per_device_train_batch_size scaling in TrainingArguments (#38… (#38857)
Clarify global batch size calculation in TrainingArguments (#38484)
2025-07-07 16:57:42 +00:00
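
The clarified relationship, as a worked example (values are hypothetical):

```python
per_device_train_batch_size = 8
gradient_accumulation_steps = 4
world_size = 2  # number of devices/processes

global_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * world_size
)
print(global_batch_size)  # 64 samples per optimizer step
```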
9698052560 Add Korean translation for glossary.md (#38804)
* Add Korean translation for glossary.md

* Update docs/source/ko/glossary.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/glossary.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Joosun40 <77312900+Joosun40@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2025-07-07 09:12:55 -07:00
bf203aa9da Update tiny-agents example (#39245) 2025-07-07 15:58:36 +02:00
c4e39ee59c adjust input and output texts for test_modeling_recurrent_gemma.py (#39190)
* adjust input and output texts for test_modeling_recurrent_gemma.py

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* fix bug

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* adjust

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update Expectation match

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* fix

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-07 15:13:25 +02:00
14cba7ad33 enable xpu on kv-cache and hqq doc (#39246)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-07 13:12:02 +00:00
32db48db73 Fix patch helper (#39216)
remove -1
2025-07-07 15:11:48 +02:00
a3618d485a RotaryEmbeddings change is not None -> isinstance(..., dict) (#39145)
is None -> isinstance dict
2025-07-07 14:05:28 +01:00
9b09fe479f fix fastspeech2_conformer tests (#39229)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-07 15:04:26 +02:00
00e9efceab [bugfix] fix flash attention 2 unavailable error on Ascend NPU (#39166)
[bugfix] fix flash attention 2 error on Ascend NPU
2025-07-07 13:03:39 +00:00
056fa73fae [modular] Simplify logic and docstring handling (#39185)
* simplify a lot

* Update modular_model_converter.py

* finalize

* remove outdated functions

* apply it

* and examples
2025-07-07 14:52:57 +02:00
f16fbfb89a Make _compute_dynamic_ntk_parameters exportable (#39171)
* Make _compute_dynamic_ntk_parameters exportable

* add unit test
2025-07-07 14:48:31 +02:00
4243bb844d fix bug where using FSDP V1 leads to the model device not being set properly (#39177)
* fix bug where using FSDP V1 leads to the model device not being set properly

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update the code

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-07-07 14:47:04 +02:00
34c16167eb Don't send new comment if the previous one is less than 30 minutes (unless the content is changed) (#39170)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-07 14:43:50 +02:00
b8f397e456 fix typo in Gemma3n notes (#39196) 2025-07-07 14:41:33 +02:00
5348fbc005 [modular] Follow global indexing and attribute setting, and their dependencies (#39180)
* export global indexing statements

* add example

* style

* examples
2025-07-07 14:36:43 +02:00
8570bc29f3 Fix missing fast tokenizer/image_processor in whisper/qwen2.5-omni processor (#39244)
* fix missing fast tokenizer in whisper processor

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix processor test

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix qwen2.5 omni processor

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-07 13:54:18 +02:00
b283d52f7f [vjepa2] replace einsum with unsqueeze (#39234) 2025-07-07 11:14:08 +01:00
a325409a50 Expectations re-order and corrected FA3 skip (#39195)
* Fix Expectations and a FA3 skip

* Fixed docstring

* Added context for Default expectation
2025-07-07 11:42:33 +02:00
b0a8e0b8d7 [video processors] Support float fps for precise frame sampling (#39134)
* [video processors] Support float fps for precise frame sampling

Enable fractional fps values (e.g., 1.5, 29.97) in video processors
for more precise frame sampling control.

- Change fps type from int to float across all video processors
- Maintain backward compatibility with integer values

Extends: #38105

* [video processors] Refine fps typing to Union[int, float]

Change fps type from Optional[float] to Optional[Union[int, float]]
for more explicit type information about supporting both integer
and floating-point frame rates.

- Update type hints and docstrings across 8 files
- Maintain backward compatibility
- Clarify support for both int and float values

Extends: #38105

* Revert "[video processors] Support float fps for precise frame sampling"

This reverts commit 7360d6e661b413ca0239e5ef61f9b1abbeab8e65.
2025-07-07 03:43:43 +00:00
ca7e1a3756 Refactor the way we handle outputs for new llamas and new models (#39120)
* just update 2 files

* update other models as well just making fix-copies

* also add the changes needed to modeling utils

* put this on the pretrained model instead

* nits and fixes

* update generic, fix to use config value

* update other modelings

* use transformers kwargs instead

* update

* update

* update other models

* update

* updates

* update

* update

* update

* fix

* finally

* very small nits

* this fixes more tests

* fix other models as well!

* update modularqwen2

* update models based on qwen2

* update

* update

* remove the **flash stuff in favor of normal kwargs

* update

* propagate gemma?

* remove output attentions

* propagate

* support cross attention edge case

* same

* test this

* fixes

* more fix

* update

* update

* fix conflicts

* update

* fix emu3

* fix emu3

* move the fix a bit

* what a nightmare

* some fixes, loss_kwargs should never had been

* finish fixing gemma3n

* fix small lm3

* fix another one

* fix csm now

* fix csm and mistral

* fix mistral now

* small fixes

* fix janus

* only for some models

* fixup

* phix phi3

* more fixes?

* does this fix it?

* update

* holy shit it was just graph breaks

* protect torch

* updates

* fix samhq?

* fix moonshine

* more moonshine fixes, 3 failures left!

* nits

* generic needs to support more

* more fixes to moonshine!

* fix cross attention outputs!

* fix csm!

* nits

* fix stupid kosmos2

* current updates

* fixes

* use output recorder?

* nicer!

* a little bit of magic

* update

* fix protect

* fix

* small fixes

* protect import

* fix a bunch of more models

* fix fixups

* fix some of the last ones

* nit

* partly fix phi

* update

* fix import path

* make something that is fullgraph compatible just to be sure

* typing was wrong on llama so the rest was wrong as well

* fucking ugly but at least it is still exportable

* style

* supposed to fix moonshine, it still breaks

* fix some default

* fix the last bits of sam

* update samhq

* more fixes to sam hq

* nit

* fix all output+hidden states and output_attentions!

* fix?

* fix diffllama

* updates to fix initialization on the sam pips

* ups there was a bug

* fix the last sam hq test

* fix gotocr

* fix gotocr2!

* fixes

* skip stupid tests

* there was one left :)

* fixup

* fix fix copies issues with this test file

* fix copies for sam_hq

* rm some comments

* skip 2 more failing tests

* fix

* fix everything

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* add more doc!

* fix public init

* fix modular qwen3

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-07-05 11:34:28 +02:00
e6a8063ef1 Update expected values (after switching to A10) - part 8 - Final (#39220)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-04 13:35:53 +02:00
cd8a041a4f Update expected values (after switching to A10) - part 7 (#39218)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-04 12:48:10 +02:00
0cf27916f0 Add packed tensor format support for flex/sdpa/eager through the mask! (#39194)
* Add the necesary logic to mask_utils

* add it everywhere

* Update masking_utils.py

* style

* Update masking_utils.py

* Update modeling_mimi.py

* Update masking_utils.py

* add support for more than batch size 1

* Update masking_utils.py

* add test

* style

* Update test_masking_utils.py

* Update masking_utils.py

* add require_token

* fix tests

* fix
2025-07-04 09:01:56 +02:00
037755ed54 Update expected values (after switching to A10) - part 6 (#39207)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-03 22:45:30 +02:00
1168f57abf Update expected values (after switching to A10) - part 5 (#39205)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-03 19:56:02 +02:00
7d9e52f376 Fix continuous batching in transformers serve (#39149)
* Fix CB

* Nit

* Update src/transformers/commands/serving.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Add todos

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-03 18:15:31 +02:00
85d93cc6e3 [serve] Cursor support, move docs into separate page, add more examples (#39133)
* jan docs

* rm

* [cursor] tmp commit

* Cursor working :D

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/commands/serving.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* cursor docs

* try to fix agents/tools docs?

* try to fix agents/tools docs?

* Update docs/source/en/serving.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add transformers chat example with transformers serve

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2025-07-03 17:04:16 +01:00
e15b06d8dc [typing] better return typehints for from_pretrained (#39184)
* config

* processor

* feature-extractor

* jukebox

* fixup

* update other methods in config

* remove "PretrainedConfig" annotations
2025-07-03 14:22:47 +00:00
a25fc3592e Update expected values (after switching to A10) - part 4 (#39189)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-03 15:13:06 +02:00
b31e9d19a6 [Dia] Change ckpt path in docs (#39181)
fix ckpt path
2025-07-03 10:02:58 +00:00
18e0cae207 Fix many HPU failures in the CI (#39066)
* more torch.hpu patches

* increase top_k because it results in flaky behavior when Temperature, TopP and TopK are used together, which ends up killing beams early.

* remove temporal fix

* fix scatter operation when input and src are the same

* trigger

* fix and reduce

* skip finding batch size as it makes the hpu go loco

* fix fsdp (yay all are passing)

* fix checking equal nan values

* style

* remove models list

* order

* rename to cuda_extensions

* Update src/transformers/trainer.py
2025-07-03 11:17:27 +02:00
bff964c429 Decouple device_map='auto' and tp_plan='auto' (#38942)
* dissociate

* better place

* fix
2025-07-03 11:07:11 +02:00
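The two now-decoupled loading modes, sketched under assumptions (model ids are placeholders; `tp_plan="auto"` expects a multi-process `torchrun` launch, so it is left commented out):

```python
# device_map="auto" shards the model across available devices by memory,
# while tp_plan="auto" applies tensor parallelism; after this PR they are
# independent options rather than entangled ones.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2", device_map="auto")
# Under `torchrun --nproc-per-node 2 ...`:
# model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", tp_plan="auto")
```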
8178c43112 when delaying optimizer creation only prepare the model (#39152) 2025-07-03 09:04:16 +02:00
91221da2f1 [glm4v] fix video inference (#39174)
fix video inference
2025-07-03 05:20:41 +00:00
ebfbcd42da Test fixes for Aria (and some Expectation for llava_next_video) (#39131)
* Expectations for llava_next_video

* Updated image src in aria

* Fix test_small_model_integration_test

* Fix small model integration llama

* Fix a bunch of tests

* Style

* Shortened generation in test from 900 to 90
2025-07-02 23:41:14 +02:00
37a239ca50 Update expected values (after switching to A10) - part 3 (#39179)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-02 22:48:30 +02:00
9326fc332d Update expected values (after switching to A10) - part 2 (#39165)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* empty

* [skip ci]

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-02 22:47:55 +02:00
25cd65ac43 Random serve fixes (#39176)
* Fix index out of bounds exception on wrong kv reuse

* Prevent loading same model twice

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2025-07-02 22:09:58 +02:00
548794b886 [serve] Model name or path should be required (#39178)
* Model name or path should be required

* Fix + add tests

* Change print to log so it doesn't display in transformers chat
2025-07-02 22:06:47 +02:00
2d561713f8 [generate] document non-canonical beam search default behavior (#39000) 2025-07-02 18:29:16 +01:00
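For reference, a minimal beam-search call whose defaults the PR documents (a sketch, not code from the PR; the tiny model id is a placeholder):

```python
# Beam search keeps `num_beams` candidate sequences and returns the highest
# scoring one; the PR above documents where transformers' defaults deviate
# from the canonical algorithm.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")  # placeholder tiny model
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")
inputs = tok("The quick brown fox", return_tensors="pt")
out = model.generate(**inputs, num_beams=4, max_new_tokens=10, early_stopping=True)
print(tok.decode(out[0], skip_special_tokens=True))
```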
df12d87d18 [docs] ViTPose (#38630)
* vitpose

* fix?

* fix?

* feedback

* fix

* feedback

* feedback

* update sample image
2025-07-02 07:56:29 -07:00
2b4a12b5bf Reduce Glm4v model test size significantly (#39173)
* fix test size

* Update test_modeling_glm4v.py
2025-07-02 15:55:05 +02:00
e355c0a11c Fix missing initializations for models created in 2024 (#38987)
* fix GroundingDino

* fix SuperGlue

* fix GroundingDino

* fix MambaModel

* fix OmDetTurbo

* fix SegGpt

* fix Qwen2Audio

* fix Mamba2

* fix DabDetr

* fix Dac

* fix FalconMamba

* skip timm initialization

* fix Encodec and MusicgenMelody

* fix Musicgen

* skip timm initialization test

* fix OmDetTurbo

* clean the code

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* add reviewed changes

* add back timm

* style

* better check for parametrizations

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-02 15:03:57 +02:00
1125513a8d Blip2 fixes (#39080)
* Fixed some devices errors

* Fixed other device issues and more expectations

* Reverted support flags

* style

* More granular support

* Fixed some rebase stuff

* add a not None check before .to
2025-07-02 14:39:39 +02:00
28df7f854a Fix multimodal processor get duplicate arguments when receive kwargs for initialization (#39125)
* fix processor tokenizer override

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* add regression test

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* check image processor same

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-02 19:57:15 +08:00
b61023a1b7 🚨🚨🚨 [eomt] make EoMT compatible with pipeline (#39122)
* Make EoMT compatible with pipeline

* Implicit patch offsets

* remove patch offsets from arg

* Modify tests

* Update example

* fix proc testcase

* Add few more args

* add pipeline test suite

* fix

* docstring fixes

* add pipeline test

* changes w.r.t review

* 🙈 MB

* should fix device mismatch

* debug

* Fixes device mismatch

* use decorator

* we can split mlp

* expected values update

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2025-07-02 12:25:26 +01:00
4d5822e65d [smolvlm] fix video inference (#39147)
* fix smolvlm

* better to do as before: set sampling params in the overridden `apply_chat_template`

* style

* update with `setdefault`
2025-07-02 12:05:10 +02:00
9b2f5b66d8 fix default value of config to match checkpoints in LLaVa-OV models (#39163) 2025-07-02 09:45:50 +00:00
e8e0c76162 Add activation sparsity reference in gemma3n doc (#39160)
Add activation sparsity reference in the description of gemma3n
2025-07-02 04:11:03 +02:00
8e87adc45f fix llama tests (#39161)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 23:27:22 +02:00
4c1715b610 Update expected values (after switching to A10) (#39157)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* empty

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 20:54:31 +02:00
ab59cc27fe Suggest jobs to use in run-slow (#39100)
* pr

* pr

* pr

* pr

* pr

* pr

* pr

* pr

* pr

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 20:19:06 +02:00
db2f535443 update bnb ground truth (#39117)
* update bnb result

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* set seed to avoid sampling different results

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix int8 tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix typo

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add comments

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-01 20:06:37 +02:00
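The determinism fix above boils down to seeding before sampling; a minimal sketch using the helper that transformers ships:

```python
# set_seed seeds python's random, numpy and torch (incl. CUDA) in one call,
# so sampled generations are reproducible across runs.
from transformers import set_seed

set_seed(42)
# ... any generate(..., do_sample=True) call after this point is reproducible
```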
260846efad fix: remove undefined variable (#39146) 2025-07-01 19:10:29 +02:00
cdfe49a4d0 Change @lru_cache() to @lru_cache to match styles from #38883. (#39093)
Match styles in #38883
2025-07-01 18:29:16 +02:00
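The style change in one self-contained sketch (since Python 3.8, `functools.lru_cache` can be applied directly as a decorator, so the empty-argument call is redundant):

```python
from functools import lru_cache

@lru_cache()          # old style: decorator factory called with no arguments
def fib_old(n: int) -> int:
    return n if n < 2 else fib_old(n - 1) + fib_old(n - 2)

@lru_cache            # new style: equivalent, one fewer pair of parentheses
def fib_new(n: int) -> int:
    return n if n < 2 else fib_new(n - 1) + fib_new(n - 2)

assert fib_old(20) == fib_new(20) == 6765
```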
f46798193e Fix: Ensure wandb logs config in offline mode (#38992)
* Fix: Ensure wandb logs config in offline mode

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-07-01 16:17:58 +00:00
fe838d6631 Fix missing fsdp & trainer jobs in daily CI (#39153)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-01 18:10:30 +02:00
1283877571 [superglue] fix wrong concatenation which made batching results wrong (#38850) 2025-07-01 12:14:44 +00:00
f8b88866f5 [VLMs] support passing embeds along with pixels (#38467)
* VLMs can work with embeds now

* update more models

* fix tests

* fix copies

* fixup

* fix

* style

* unskip tests

* fix copies

* fix tests

* style

* omni modality models

* qwen models had extra indentation

* fix some other tests

* fix copies

* fix test last time

* unrelated changes revert

* we can't rely only on embeds

* delete file

* de-flake mistral3

* fix qwen models

* fix style

* fix tests

* fix copies

* deflake the test

* modular reverted by fixes, fix again

* flaky test, overwritten

* fix copies

* style
2025-07-01 11:33:20 +00:00
20901f1d68 [typing] LlamaAttention return typehint (#38998)
* helo llama

* helo llama

* helo llama

* apply modular

* fix dia

---------

Co-authored-by: qubvel <qubvel@gmail.com>
2025-07-01 11:29:52 +01:00
7a25f8dfdb [qwen2-vl] fix FA2 inference (#39121)
* fix FA2

* update is causal flag and remove mask for FA2

* update for FA2 with varlen path

* how were the tests passing with different devices?

* add comment and ref to the PR

* move mask preparation to base pretrained model

* seq len is the first dim, not second

* fix copies to fix GLM4V
2025-07-01 10:18:37 +00:00
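A hedged sketch of enabling the FA2 code path this fix targets (requires the `flash-attn` package and a supported CUDA GPU; the checkpoint id is the public Qwen2-VL one, but any Qwen2-VL checkpoint should behave the same):

```python
# Select the flash-attention-2 attention backend at load time.
import torch
from transformers import Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```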
def9663239 feat: support indivisible shards for TP model loading and TPlizing. (#37220)
* feat: support uneven loading and sharding
resolve merge conflicts
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: allow for empty tensor computations

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* test: add llama1b test case

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* due to q_proj colwise it has to be a multiple of 2

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* refactor: use slice API

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* refactor: use slice API

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* refactor: use slice API

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* refactor: use slice API

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

---------

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
2025-07-01 10:03:22 +00:00
06c4a4d499 fix caching_allocator_warmup with tie weights (#39070)
* fix caching_allocator_warmup with tie weights

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-01 11:32:20 +02:00
e435574721 🚨 Don't use cache in non-generative models (#38751)
* deprecate for 1 version

* style

* fix some tests

* fix esm

* skip for now, GC requires positional args but we have keyword args

* remove transpose for scores in modified models only

* skip fx trace tests
2025-07-01 09:08:21 +00:00
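The `use_cache` switch at the heart of this change, shown on a causal LM for brevity (a sketch with a placeholder tiny model; the PR itself is about non-generative models, where the cache brings no benefit at all):

```python
# Disabling the KV cache on a plain forward pass: no past_key_values are
# allocated or returned.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")  # placeholder tiny model
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")
inputs = tok("hello", return_tensors="pt")
out = model(**inputs, use_cache=False)
assert out.past_key_values is None  # no cache was allocated
```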
dbc98328da Several fixes for Gemma3n (#39135)
* remove the skips

* fix the epsilon to a small value (does not make sense otherwise)

* safeguard

* overload test_eager_matches_sdpa

* Update test_modeling_common.py

* skip appropriate tests

* correct no_split_layer

* fix all devices issue

* fix backward

* fix
2025-07-01 10:34:53 +02:00
d53518c5f2 Fix key mapping for VLMs (#39029)
* fix key mapping for VLMs

* use __mro__ instead

* update key mapping in save_pretrained
2025-07-01 09:47:53 +02:00
3457e8e73e [Whisper] update token timestamps tests (#39126)
* fixes

* update comment

* update for A10

* all a10

* all a10

* all a10

* all a10

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-30 21:55:36 +02:00
fe35eca7bd Update BigBirdPegasus model card (#39104)
* Update bigbird_pegasus.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-30 10:42:56 -07:00
29a3f5ed8c switch default xpu tp backend to pytorch built-in XCCL from pytorch 2.8 (#39024)
* switch default xpu tp backend to pytorch built-in XCCL from pytorch 2.8

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Update docs/source/en/perf_infer_gpu_multi.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update perf_infer_gpu_multi.md

* Update perf_infer_gpu_multi.md

* Update perf_infer_gpu_multi.md

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-30 08:54:05 -07:00
9e0c865b8b docs: correct two typos in awesome-transformers.md (#39102)
* docs(awesome-projects): fix typo “Itt leverages” → “It leverages” (#39101)

closes #39101

* docs(awesome-projects): fix grammar “We provides” → “We provide” (#39101)

closes #39101
2025-06-30 08:53:43 -07:00
03db2700ab Enable XPU doc (#38929)
* fix example with dataset

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update torchao doc

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update torchao doc

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device type

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert torchao change

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao doc

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert torchao change

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update xpu torchao doc

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update chat_templating_multimodal.md

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* use full name for int8

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert int8 title

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-06-30 07:56:55 -07:00
ea0ea392e5 Fix chat (#39128) 2025-06-30 13:47:48 +00:00
ed36f8490e Licenses (#39127)
* Licenses

* Licenses
2025-06-30 15:25:36 +02:00
e8f90b5397 Split transformers chat and transformers serve (#38443)
* Next token

* Split chat and serve

* Support both generation methods

* Style

* Generation Config

* temp

* temp

* Finalize serving.py

Co-authored-by: célina <hanouticelina@gmail.com>

* Finalize chat.py

* Update src/transformers/commands/serving.py

Co-authored-by: célina <hanouticelina@gmail.com>

* Lucain's comments

Co-authored-by: Lucain <lucain@huggingface.co>

* Update

* Last comments on PR

* Better error handling

* Better error handling

* CI errors

* CI errors

* Add tests

* Fix tests

* Fix tests

* [chat] Split chat/serve (built on top of lysandre's PR) (#39031)

* Next token

* Split chat and serve

* Support both generation methods

* Style

* Generation Config

* temp

* temp

* Finalize serving.py

Co-authored-by: célina <hanouticelina@gmail.com>

* Finalize chat.py

* Update src/transformers/commands/serving.py

Co-authored-by: célina <hanouticelina@gmail.com>

* Lucain's comments

Co-authored-by: Lucain <lucain@huggingface.co>

* Update

* Last comments on PR

* Better error handling

* Better error handling

* CI errors

* CI errors

* Add tests

* Fix tests

* Fix tests

* streaming tool call

* abstract tool state; set tool start as eos

* todos

* server working on models without tools

* rm chat's deprecated flags

* chat defaults

* kv cache persists across calls

* add server docs

* link

* Update src/transformers/commands/serving.py

* Apply suggestions from code review

* i love merge conflicts

* solve multi turn with tiny-agents

* On the fly switching of the models

* Remove required positional arg

---------

Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: célina <hanouticelina@gmail.com>
Co-authored-by: Lucain <lucain@huggingface.co>

* Protect names

* Fix tests

---------

Co-authored-by: célina <hanouticelina@gmail.com>
Co-authored-by: Lucain <lucain@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-06-30 15:10:53 +02:00
539c6c2fa8 All CI jobs with A10 (#39119)
all a10

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-30 14:23:27 +02:00
ed9f252608 docs: Gemma 3n audio encoder (#39087)
Updating Gemma 3n docs and docstrings to clarify the relationship
between the newly trained audio encoder used in Gemma 3n and the USM
model from the original paper.
2025-06-30 14:10:51 +02:00
4a79bf947d Fix some bug for finetune and batch infer For GLM-4.1V (#39090)
* update

* 1
2025-06-30 12:16:22 +02:00
2100ee6545 fix UT failures on XPU w/ stock PyTorch 2.7 & 2.8 (#39116)
* fix UT failures on XPU w/ stock PyTorch 2.7 & 2.8

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* zamba2

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* xx

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* internvl

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* tp cases

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-06-30 11:49:03 +02:00
ccf2ca162e skip some test_sdpa_can_dispatch_on_flash (#39092)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-27 23:08:14 +02:00
a11f692895 Fixes the failing test test_is_split_into_words in test_pipelines_token_classification.py (#39079)
* Fix test pipelines token classification for is_split_into_words

* Fix incorrect import format
2025-06-27 19:25:32 +01:00
18143c76bf Sandeepyadav1478/2025 06 19 deberta v2 model card update (#38895)
* [docs]: update deberta-v2.md model card

* chore: req updates

* chore: address code review feedback and update docs

* chore: review feedback and updates

* chore: model selection updates

* chores: quantizations review updates
2025-06-27 10:35:30 -07:00
02a769b058 [fix] Add FastSpeech2ConformerWithHifiGan (#38207)
* add to mapping

* oops

* oops

* add to config_mapping_names

* revert

* fix?

* config-mapping-names

* fix?

* fix?
2025-06-27 09:38:21 -07:00
c2dc72bb5f TST Fix PEFT integration test bitsandbytes config (#39082)
TST Fix PEFT integration test bitsandbytes config

The PEFT integration tests still used load_in_{4,8}_bit, which is
deprecated, moving to properly setting BitsAndBytesConfig. For 4bit,
also ensure that nf4 is being used to prevent

> RuntimeError: quant_type must be nf4 on CPU, got fp4
2025-06-27 18:33:11 +02:00
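The migration described above, as a hedged sketch (requires the `bitsandbytes` package; the model id is a placeholder, not the one used in the PEFT tests):

```python
# An explicit BitsAndBytesConfig replaces the deprecated load_in_4bit /
# load_in_8bit kwargs; selecting nf4 avoids the CPU error quoted above
# ("quant_type must be nf4 on CPU, got fp4").
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",  # fp4 is the default; nf4 also works on CPU
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=quant_config,
)
```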
c8064bea9a Fix: unprotected import of tp plugin (#39083) 2025-06-27 17:28:05 +02:00
dd7dc4a4a2 Add Fast Image Processor for Chameleon (#37140)
* Add Fast Image Processor for Chameleon

* add warning to resize and move blend_rgba to convert_to_rgb

* Remove unrelated files

* Update image_processing_chameleon_fast to use auto_docstring

* fix equivalence test

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-06-27 15:26:57 +00:00
6d773fc3bc fix dots1 tests (#39088)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-27 16:54:11 +02:00
c8764ab935 guard torch distributed check (#39057)
* guard torch distributed check

* Update src/transformers/pipelines/base.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-06-27 14:49:47 +00:00
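A common shape for the guard added here (a sketch, not the PR's exact code): check availability before asking about initialization, since torch builds without distributed support cannot answer the second question.

```python
import torch.distributed as dist

def distributed_is_ready() -> bool:
    # is_available() must come first: is_initialized() is only meaningful
    # on builds compiled with distributed support.
    return dist.is_available() and dist.is_initialized()
```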
49d9fd49bd Add Fast Image Processor for mobileViT (#37143)
* Add image_processing_mobilevit_fast.py

* Fix copies

* update _preprocess for channel_flip

* Update for batched image processing

* Resolve merge conflicts with main

* Fix import order and remove trailing whitespace (ruff clean-up)

* Fix copy inconsistencies

* Add NotImplementedError for post_process_semantic_segmentation to satisfy repo checks

* Add auto_docstring

* Adjust style

* Update docs/source/en/model_doc/mobilevit.md

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Update src/transformers/models/mobilevit/image_processing_mobilevit_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Update src/transformers/models/mobilevit/image_processing_mobilevit_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Delete not used function

* test: add missing tests for  and

* Add post_process_semantic_segmentation to mobilevit_fast.py

* Add preprocess function to image_processing_mobilevit_fast.py

* ruff check for formatting

* fix: modify preprocess method to handle BatchFeature correctly

* Remove logic for default value assignment

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Remove normalization and RGB conversion logic not used in slow processor

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Simplify return_tensors logic using one-liner conditional expression

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Remove unused normalization and format parameters

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* add **kwargs and remove default values in _preprocess

* add slow_fast equivalence tests for segmentation

* style: autoformat code with ruff

* Fix slow_fast equivalence test

* merge + remove skipped test

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-06-27 14:40:24 +00:00
4336ecd1ea add fast image processor nougat (#37661)
* add fast image processor nougat

* test fixes

* docstring white space

* last fixes

* docstring_type

* tolerance unit test

* fix tolerance

* fix rtol

* remove trailing white space

* remove white space

* note for tolerance unit test

* fix tests

* remove print

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-06-27 14:39:43 +00:00
0c35280e58 TST PEFT integration tests with pipeline generate (#39086)
Some PEFT integration tests involving text generation pipelines were
failing since #38129 because the base model is too small to generate
longer sequences. Setting max_new_tokens fixes this.
2025-06-27 15:58:10 +02:00
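The fix in one line, sketched (model id is a placeholder tiny checkpoint): cap generation length explicitly so small test models are not pushed past what they can produce.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="sshleifer/tiny-gpt2")
out = generator("Hello there", max_new_tokens=20)  # explicit cap fixes the failure mode
print(out[0]["generated_text"])
```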
993665a5ff fixed typo for docstring in prepare_inputs method (#39071) 2025-06-27 13:57:56 +00:00
839893c86b fix mistral3 tests (#38989)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-27 15:44:10 +02:00
2b85b6ce19 [Whisper] 🚨 Fix pipeline word timestamp: timestamp token is end of token time !!! (#36632)
* timestamp token is end of token time !!!

* ensure correct alignment between tokens and timestamp tokens

* ignore input tokens for DTW computation

* use num_frames to avoid token timestamp hallucinations

* token timestamps test updates !

* num_frames: deprecate and use attention_mask instead

* avoid breaking change

* fix the pipeline usage for chunk approach

* make style

* better logging

* better logging

* make style

* update tests with correct values
2025-06-27 12:51:43 +00:00
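A hedged usage sketch of the word-timestamp path this fix corrects (the audio path is a placeholder; with the fix, word-level `(start, end)` spans are aligned as the PR describes, treating the timestamp token as the end of the token's audio):

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
result = asr("sample.wav", return_timestamps="word")
for chunk in result["chunks"]:
    print(chunk["timestamp"], chunk["text"])  # (start, end) in seconds
```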
9c8d3a70b8 Pipeline: fix unnecessary warnings (#35753)
* return attention mask

* use correct model input name

* fix

* make
2025-06-27 14:32:03 +02:00
1750c518dd Add EoMT Model || 🚨 Fix Mask2Former loss calculation (#37610)
* Initial Commit

* up

* More changes

* up

* Only mask_logits mismatch

* close enough logits debug later

* fixes

* format

* Add dummy loss

* Close enough processing for semantic seg

* nit

* Added panoptic postprocessor

* refactor

* refactor

* finally fixed panoptic postprocessor

* temp update

* Refactor ForUniversalSegmentation class

* nits and config update

* Few fixes and inference matches

* change mapping

* Added training support but loss slightly off 🥲

* Loss is matching 😀

* update

* Initial tests skeleton

* changes

* tests update

* more modular

* initial tests

* updates

* better docstrings

* changes

* proc tests passing :)

* Image processor update

* tiny change

* QOL changes

* Update test w.r.t latest attn refactor

* repo-consistency fixes

* up

* Image proc fix and integration tests :)

* docs update

* integration tests

* fix

* docs update 🥰

* minor fix

* Happy CI

* fix

* obvious refactoring

* refactoring w.r.t review

* Add fast image proc skeleton

* Fast Image proc and cleanups

* Use more modular

* tests update

* Add more tests

* Nit

* QOL updates

* change init_weights to torch default

* add eager func because of make style

* up

* changes

* typo fix

* Updates

* More deterministic tests

* More modular

* go more modular 🚀

* up

* dump

* add support for giant ckpts

* overhaul

* modular

* refactor

* instace seg is ready

* cleanup

* forgot this

* docs cleanup

* minor changes

* EoMT -> Eomt

* Happy CI

* remove redundant comment

* Change model references

* final change

* check annealing per block

* My other PR changes 😂

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-06-27 14:18:18 +02:00
0106a50a6b fix a bunch of XPU UT failures on stock PyTorch 2.7 and 2.8 (#39069)
* fix a bunch of XPU UT failures on stock PyTorch 2.7 and 2.8

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* qwen3

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* quanto

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* models

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* idefics2

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-06-27 14:01:53 +02:00
cb17103bd5 Uninstallling Flash attention from quantization docker (#39078)
* update

* revert
2025-06-27 13:51:46 +02:00
371c471113 Fix initialization of OneFormer (#38901)
* fix initialization of OneFormer

* remove redundant initializations

* remove redundant initializations

* remove redundant initializations

* keep BC
2025-06-27 12:39:37 +02:00
540a10848c fix Gemma3nProcessorTest (#39068)
* fix

* fix

* oops, forgot style

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-06-27 12:28:10 +02:00
0d66ef7792 Cleanup Attention class for Siglip and dependent models (#39040)
* cleanup attention class

* More models

* more models

* Changes

* make style

* Should fix CI

* This should work 🙏
2025-06-27 12:14:09 +02:00
1ccc73dee9 [Whisper] fix shape mismatch in tests (#39074)
fix shape mismatch
2025-06-27 09:27:42 +00:00
a52478253b [docs] Tensor parallelism (#38241)
* updates

* feedback

* badges

* fix?

* fix?

* fix?

* fix?
2025-06-26 14:40:45 -07:00
84e8696cae [docs] @auto_docstring (#39011)
* refactor

* feedback
2025-06-26 14:21:54 -07:00
018855de63 Update PEGASUS-X model card (#38971)
* Update PEGASUS-X model card

* Add cache_implementation argument in quantization code example

* Update CLI example

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Remove TensorFlow and Flax badges

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-26 13:54:48 -07:00
757c26fb40 [docs] Model contribution (#38995)
improve
2025-06-26 12:25:14 -07:00
b372bb5ed1 fix layoutlmv3 tests (#39050)
* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-26 20:07:17 +02:00
f171e7e884 Update SuperPoint model card (#38896)
* docs: first draft to more standard SuperPoint documentation

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs: reverted changes on Auto classes

* docs: addressed the rest of the comments

* docs: remove outdated reference to keypoint detection task guide in SuperPoint documentation

* Update superpoint.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-26 10:13:06 -07:00
2f50230c59 fix t5gemma tests (#39052)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-26 18:48:14 +02:00
23b7e73f05 fix test_compare_unprocessed_logit_scores (#39053)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-26 18:36:56 +02:00
58c7689226 [Flex Attn] Fix torch 2.5.1 incompatibilities (#37406)
* remove compile on mask creation, ensure kv blocks do not explode on indices

* trigger ci

* switch dynamic compilation to false

* patch new masking functions as well

* add len check

* i was wrong

* last comment
2025-06-26 18:23:55 +02:00
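A general sketch of the workaround named above ("switch dynamic compilation to false"): compile with `dynamic=False` so shape-specialized graphs are built instead of dynamic-shape tracing, which torch 2.5.1 handles poorly on these masking functions.

```python
import torch

def double(x: torch.Tensor) -> torch.Tensor:
    return x * 2

# dynamic=False forces recompilation per input shape rather than tracing
# with symbolic shapes.
compiled_double = torch.compile(double, dynamic=False)
print(compiled_double(torch.ones(4)))
```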
5154497607 Dev version 2025-06-26 18:04:36 +02:00
0a8081b03d [Modeling] Fix encoder CPU offloading for whisper (#38994)
* fix cpu offloading for whisper

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* unskip offloading tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* revert small change

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-06-26 15:56:33 +00:00
c63cfd6a83 Gemma 3n (#39059)
* Gemma 3n

* initial commit of Gemma 3n scaffold

* Fixing param pass-through on Gemma3p5RMSNorm

* Adds Einsum layer to Gemma 3n

* Updating EinsumLayer API

* Undoing erroneous force push

* Reverting RMSNorm to with_scale by default

* Adds LAuReL to Gemma 3n

* Adds AltUp to Gemma 3n

* Adding Gemma3p5 overall and text config with vision and audio config placeholders (#3)

* Adding gemma3p5 text configs

* Adding audio config placeholders

* Adding a placeholder for vision configs

* Updating MobileNetVisionConfig, inheriting TimmWrapperConfig

* Updating text configs

* Update src/transformers/models/gemma3p5/modular_gemma3p5.py

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Removing altup configs to accept the suggested configs

* Update src/transformers/models/gemma3p5/modular_gemma3p5.py

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Updating altup config

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Addressing review comments and updating text configs

* Adding a config for activation sparsity

* Updating configs to pass through options to super class init and adjust some name prefixes

* Updating laurel and altup with corrected config values

* Normalizing sub_config initializers

---------

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Updating MLP with activation sparsity (#2)

* Updating DecoderBlock for Gemma 3n (#3)

* Initial Gemm3nTextModel (#4)

NOTE: This implementation WILL CHANGE in the coming weeks, however, changes will be strictly additive and this will remain a suitable baseline for downstream implementations to reference.

* Adding KV Cache Sharing

* Adds Einsum layer to Gemma 3n

* Updating EinsumLayer API

* Refactored kv cache sharing in attention

* Adding KVStore for cache sharing

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update src/transformers/cache_utils.py

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Undoing erroneous force push

* Reverting RMSNorm to with_scale by default

* Adds LAuReL to Gemma 3n

* Updating KV Cache Sharing implementation

* Updating the q and k norm definitions in the attention module

* Fixing name error for q,k,v RMS norm to use the right 3n module

* Updating MLP with activation sparsity

* Updating DecoderBlock for Gemma 3.5

* Updating kv cache sharing implementation with the use of a cache buffer and refactoring some lines of code

* Isolating KV Cache logic to relevant components

* Fixing logic error in Gemma3nAttention.forward

* Refactoring caching contributions and fixing kv_store initialization

* Simplifying Configs

* Remove errant self from super init call

* Bug fix in the Attention module - changing self.head_dim to config.head_dim

* Bug fixes in the LaurelBlock and RMS Norm super init call

* removing redundant code from a merge

* Adding per_layer_inputs to TextModel

* Adding preprocess embeddings with altup

* Adds per-layer-to-single output and a host of TODOs

* Integrating altup predict with the model workflow and other minor bug fixes

* Using nn.Embedding temporarily for text model

* It goes forward

* Minor refactor of attention sparsity and RoPE initialization

* Fixing duplicate rope_scaling param bug when loading from pretrained

---------

Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>

* Normalizing on altup_num_inputs config option

* regenerating modeling file after syncing to HEAD

* Use torch.std(..., unbiased=False) for activation sparsity (#8)

* Refactoring to a single QVK Norm (#13)

* AltUp: support scale_corrected_output (#14)

* Converts einsums to nn.Linear (#7)

* Converts einsums to nn.Linear

* Removing unused variables

* Aligning SharedKVCache with HybridCache (#11)

* Aligning SharedKVStore with HybridCache

* Remove KVStore. Refactor apply_rotary_pos_emb for sharing

* Addressing review comments

* Supporting split modality embeddings in Gemma3n (#10)

* Adding the Embedder class

* Update modular

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Update modular

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Update modular

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Update modular

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Update modular

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Update modular

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Addressing review comments, adding audio embedding layers, integrating embedder with the remaining architecture, adding a forward method for conditional generation

* Apply suggestions from code review

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Update modular

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>

* Addressing review comments, prop drilling audio and vision configs to the text config

* Removing TODO's that have been addressed

* Simplify Embedder init and add audio embeddings

* Embeddings refactor. Adds Gemma3nAudioEmbedder and Gemma3nVisionEmbedder

* Refactoring vision and audio embeddings into ConditionalGeneration model

---------

Co-authored-by: Ryan Mullins <ryan@ryanmullins.org>
Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Updating attention mask for Gemma 3.5 (#15)

* xxx_token_index to xxx_token_id

* remvoing deprecated last_cache_position

* Removing references to SigLIP

* Always init per-layer inputs

* Using torch.finfo().min for epsilon_tensor

* Gemma3nDecoderLayer inherits from Gemma3DecoderLayer. Remove gating lambdas

* fix modular GEMMA3N_INPUTS_DOCSTRING

* Gemma3nAttention inherits from Gemma3Attention

* Modular inheritance fixes

* CausalLM conversion script for 4B model (#16)

* Add Gemma3n Audio Encoder (#6)

* initial commit of Gemma 3.5 scaffold

* Fixing param pass-through on Gemma3nRMSNorm

* Adds Einsum layer to Gemma 3.5

* Updating EinsumLayer API

* Undoing erroneous force push

* Reverting RMSNorm to with_scale by default

* Adds LAuReL to Gemma 3n

* Adds AltUp to Gemma 3n

* Adding Gemma3n overall and text config with vision and audio config placeholders (#3)

* Adding gemma3n text configs

* Adding audio config placeholders

* Adding a placeholder for vision configs

* Updating MobileNetVisionConfig, inheriting TimmWrapperConfig

* Updating text configs

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Removing altup configs to accept the suggested configs

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Updating altup config

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Addressing review comments and updating text configs

* Adding a config for activation sparsity

* Updating configs to pass through options to super class init and adjust some name prefixes

* Updating laurel and altup with corrected config values

* Normalizing sub_config initializers

---------

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Updating MLP with activation sparsity (#2)

* Updating DecoderBlock for Gemma 3.5 (#3)

* Initial Gemm3nTextModel (#4)

NOTE: This implementation WILL CHANGE in the coming weeks, however, changes will be strictly additive and this will remain a suitable baseline for downstream implementations to reference.

* Adding KV Cache Sharing

* Adds Einsum layer to Gemma 3.5

* Updating EinsumLayer API

* Refactored kv cache sharing in attention

* Adding KVStore for cache sharing

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update src/transformers/cache_utils.py

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Undoing erroneous force push

* Reverting RMSNorm to with_scale by default

* Adds LAuReL to Gemma 3n

* Updating KV Cache Sharing implementation

* Updating the q and k norm definitions in the attention module

* Fixing name error for q,k,v RMS norm to use the right Gemma 3n module

* Updating MLP with activation sparsity

* Updating DecoderBlock for Gemma 3.5

* Updating kv cache sharing implementation with the use of a cache buffer and refactoring some lines of code

* Isolating KV Cache logic to relevant components

* Fixing logic error in Gemma3nAttention.forward

* Refactoring caching contributions and fixing kv_store initialization

* Simplifying Configs

* Remove errant self from super init call

* Bug fix in the Attention module - changing self.head_dim to config.head_dim

* Bug fixes in the LaurelBlock and RMS Norm super init call

* removing redundant code from a merge

* Adding per_layer_inputs to TextModel

* Adding preprocess embeddings with altup

* Adds per-layer-to-single output and a host of TODOs

* Integrating altup predict with the model workflow and other minor bug fixes

* Using nn.Embedding temporarily for text model

* It goes forward

* Minor refactor of attention sparsity and RoPE initialization

* Fixing duplicate rope_scaling param bug when loading from pretrained

---------

Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>

* Normalizing on altup_num_inputs config option

* Adding audio encoder config

* Adds high-level components for Audio Encoder

* Implement uniform reducer for Audio Encoder

* Adding placeholders for Conformer components in Audio Encoder

* Adding placeholders for SubSampleConvProjection components in Audio Encoder

* Adding SequenceLayer component placeholders

* Implementing Gemma3nAudioEncoder with nn.Sequential

* Implementing Gemma3nAudioSubSampleConvProjection with nn.Sequential

* Implementing Conformer model with SequenceLayers

* Use OrderedDict in nn.Sequential initializers

* Implements sl.Residual in Torch with nn.Sequential and OrderedDict

* Adopting a base SequenceLayer class with default forward() method

* Implementing sl.GatedLinearUnit in Torch

* Implementing sl.Swish in Torch

* Implementing sl.ReLU in Torch

* Implementing sl.Scale in Torch

* Removing sl.Dropout after tree-shaking

* Implementing sl.RMSNorm in Torch with fake shape

* Implementing sl.GroupNorm in Torch

* Implementing sl.Conv2d in Torch

* Implementing sl.Dense in Torch

* Removing sl.Delay layers, which act as pass-throughs

* Connecting shapes to configs in initializers

* Removing sl.Emit

* Implementing sl.ExpandDims in Torch

* Adding sl.GradientClipping to Torch

* Implementing sl.DenseShaped in Torch

* Implementing sl.LDPA in Torch

* Removing unused sl.CombinedQKVProj class

* Fixing erroneous type hint

* Implementing sl.DepthwiseConv1D in Torch

* Implementing sl.MaskInvalid in Torch

* Fixes for initialization

* Fixes for saving weights

* Removing einsums per feedback from HF staff

* Removing Sequence Layers idioms from audio encoder

* Fixes for reviewer comments

* CausalLM conversion script for 4B model

* inv_timescales to non-persistent buffer

* Addressing audio encoder Attention feedback

* Addressing Gemma3nAudioSSCPConvBlock feedback

* Addressing Gemma3nAudioConformerAttention feedback

* Addressing padding feedback

* Weights conversion loads audio state dict

* Always use vision_config so saving works

* Token id updates for configs

* Stubs for interleaving audio embs

* Addressing reviewer feedback

---------

Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>

* Fixing cache access error

* Removing duplicate code from a bad merge

* Gemma 3n Text + Vision Part 1 (#17)

* testing utilities for numerics comparisons

* Corrected einsum to nn.Linear weights conversion

* Inherit scaled word embs from Gemma3 not Bart

* Fixing transposes for collapsed linears

* More transpose fixes

* numpy api fix

* RMSNorm: Explicit kwargs, scale_shift=0.0 when with_scale=True

* Force AltUp  to float32

* Updating debugging script for AudioEncoder debugging

* Support divide_weight_by_sqrt_fan_in from JAX for per-layer inputs

* Correcting attention einsum conversions

* RMSNorm in type of x

* Fixing duplicate laurel norm/gating

* KV sharing using the right previous indices

* Refactor kv shared index computation. Correct frac_shared_layers

* Use num_shared_layers instead of inferring from a fraction

* fixing a bug for logging

* Fix shared data_ptrs in altup inits

* rope: adjust proj -> norm -> rope to preserve computation (#20)

* rope: adjust proj -> norm -> rope to preserve computation

* Removing some breaking language model fluff in ConditionalGeneration

* Consolidate query_states transforms

---------

Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com>
Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Vectorize the loops in AltUp (#19)

* Vectorize the loops in AltUp

* fix typo

* Expanding to support batched inputs

* remove extra debug script

* Fix AltUp.forward

---------

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Add 'scale_shift=0.0, with_scale=True' to the final norm in TextModel

* Convert norm to 1/sqrt (#21)

* Convert norm to 1/sqrt

* Scale shift change per Phil's rec

* Adding default activation sparsity

* Fixing 2B config in weights conversion script

* Fixing RMSNorm parameters - adding scale_shift and with_scale

* Correcting query pre-attention scaling

* Adding query_rescale_scalar to text config

* Adding layer_idx to MLP

* Permafix for input_layernorm

* Use 1/sqrt instead of rsqrt in DecoderLayer

* Fix o_proj conversion

* Conversion script update for vision encoder

* Removing logging for debugging timm model

* Fixing bugs in Gemma3nForConditionalGeneration for text generation

* Generating the modeling_gemma3n.py file

* Removing the addition of an erroneous line in the modeling file

* Adding gemma3n text model to modeling_auto

* Bugfix: Updating the interleaving of inputs_embeds and vision_embeds

* Updating the modeling file with the latest bugfix changes

* Updating models/auto for Gemma 3n

* using AutoTokenizer in forward test

* Adding processing_gemma3n.py

* Gemma 3n configured for AutoModel. Conversion script updated.

* Removing errant merge artifacts

---------

Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>

* Removing errant debugging statements from Gemma 3

* Gemma3n audio model (#18)

* testing utilities for numerics comparisons

* Implement CumulativeGroupNorm and add to SubSampleConvProjection and SSCPConvBlock

* Add audio version of forward script based on RyanMullins' implementation

* Updating to match encoder tests. WIP: config question needs resolving

* Updates to audio classes to enable end-to-end running

* Removing vestigial classes, cleaning up print statements

* Adding SiLU / Swish to audio conformer feed forward block

* Shifted Gemma3p5Audio naming prefix to Gemma3NanoAudio

* Adding outputs to audio test

* Fixes to padding in SSCP and 1D convolution, align RMS Norm with wider model

* Update forward test to load from local weights

* Update conversion to process / output audio layers

* Update __all__ to export audio encoder

* AutoModel registration for Gemma 3n Audio

* Use AutoModel for ConditionalGeneration.audio_tower

* Fixing input_proj_linear transpose

* Fixing Gemma3NanoAudioConformerAttention.post conversion

* Fixing Gemma3NanoAudioSSCPConvBlock.conv weights conversion

* Correcting indentation issue on Gemma3p5RMSNorm

---------

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Text + Vision Part 2 (#23)

* Updates for ConditionalGeneration.get_image_features

* Adding a WIP draft of image_processing_gemma3p5.py

* Update src/transformers/models/gemma3p5/modular_gemma3p5.py

Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>

* Modular conversion after github suggested change

* Text + image gives good results

* Fixing image size preset

* Updating configs for the 2B variant in the conversion script

* Using final generation config in conversion script

---------

Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>

* Audio Integration (#12)

* initial commit of Gemma 3n scaffold

* Fixing param pass-through on Gemma3nRMSNorm

* Adds Einsum layer to Gemma 3n

* Updating EinsumLayer API

* Undoing erroneous force push

* Reverting RMSNorm to with_scale by default

* Adds LAuReL to Gemma 3n

* Adds AltUp to Gemma 3n

* Adding Gemma 3n overall and text config with vision and audio config placeholders (#3)

* Adding Gemma 3n text configs

* Adding audio config placeholders

* Adding a placeholder for vision configs

* Updating MobileNetVisionConfig, inheriting TimmWrapperConfig

* Updating text configs

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Removing altup configs to accept the suggested configs

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Updating altup config

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Addressing review comments and updating text configs

* Adding a config for activation sparsity

* Updating configs to pass through options to super class init and adjust some name prefixes

* Updating laurel and altup with corrected config values

* Normalizing sub_config initializers

---------

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Updating MLP with activation sparsity (#2)

* Updating DecoderBlock for Gemma 3n (#3)

* Initial Gemma3nTextModel (#4)

NOTE: This implementation WILL CHANGE in the coming weeks, however, changes will be strictly additive and this will remain a suitable baseline for downstream implementations to reference.

* Adding KV Cache Sharing

* Adds Einsum layer to Gemma 3n

* Updating EinsumLayer API

* Refactored kv cache sharing in attention

* Adding KVStore for cache sharing

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update modular

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Update src/transformers/cache_utils.py

Co-authored-by: Ryan Mullins <ryanmullins@google.com>

* Undoing erroneous force push

* Reverting RMSNorm to with_scale by default

* Adds LAuReL to Gemma 3n

* Updating KV Cache Sharing implementation

* Updating the q and k norm definitions in the attention module

* Fixing name error for q,k,v RMS norm to use the right 3n module

* Updating MLP with activation sparsity

* Updating DecoderBlock for Gemma 3n

* Updating kv cache sharing implementation with the use of a cache buffer and refactoring some lines of code

* Isolating KV Cache logic to relevant components

* Fixing logic error in Gemma3nAttention.forward

* Refactoring caching contributions and fixing kv_store initialization

* Simplifying Configs

* Remove errant self from super init call

* Bug fix in the Attention module - changing self.head_dim to config.head_dim

* Bug fixes in the LaurelBlock and RMS Norm super init call

* removing redundant code from a merge

* Adding per_layer_inputs to TextModel

* Adding preprocess embeddings with altup

* Adds per-layer-to-single output and a host of TODOs

* Integrating altup predict with the model workflow and other minor bug fixes

* Using nn.Embedding temporarily for text model

* It goes forward

* Minor refactor of attention sparsity and RoPE initialization

* Fixing duplicate rope_scaling param bug when loading from pretrained

---------

Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>

* Normalizing on altup_num_inputs config option

* Adding audio encoder config

* Adds high-level components for Audio Encoder

* Implement uniform reducer for Audio Encoder

* Adding placeholders for Conformer components in Audio Encoder

* Adding placeholders for SubSampleConvProjection components in Audio Encoder

* Adding SequenceLayer component placeholders

* Implementing Gemma3nAudioEncoder with nn.Sequential

* Implementing Gemma3nAudioSubSampleConvProjection with nn.Sequential

* Implementing Conformer model with SequenceLayers

* Use OrderedDict in nn.Sequential initializers

* Implements sl.Residual in Torch with nn.Sequential and OrderedDict

* Adopting a base SequenceLayer class with default forward() method

* Implementing sl.GatedLinearUnit in Torch

* Implementing sl.Swish in Torch

* Implementing sl.ReLU in Torch

* Implementing sl.Scale in Torch

* Removing sl.Dropout after tree-shaking

* Implementing sl.RMSNorm in Torch with fake shape

* Implementing sl.GroupNorm in Torch

* Implementing sl.Conv2d in Torch

* Implementing sl.Dense in Torch

* Removing sl.Delay layers, which act as pass-throughs

* Connecting shapes to configs in initializers

* Removing sl.Emit

* Implementing sl.ExpandDims in Torch

* Adding sl.GradientClipping to Torch

* Implementing sl.DenseShaped in Torch

* Implementing sl.LDPA in Torch

* Removing unused sl.CombinedQKVProj class

* Fixing erroneous type hint

* Implementing sl.DepthwiseConv1D in Torch

* Implementing sl.MaskInvalid in Torch

* Fixes for initialization

* Fixes for saving weights

* Removing einsums per feedback from HF staff

* Removing Sequence Layers idioms from audio encoder

* Fixes for reviewer comments

* Converting sl.Frontend to FeatureExtractor

* Updates for ConditionalGeneration.get_image_features

* Adding a WIP draft of image_processing_gemma3n.py

* Update modular

Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>

* Modular conversion after github suggested change

* Text + image gives good results

* Fixing image size preset

* Draft of audio data in chat template

* Removing image processing. Using SigLIP instead.

* Audio input going end-to-end

* Fixing dtype issues in audio encoder

* x-lib formatting consistency

* Adding example data

* Save preprocessor_config.json from conversion script

* Instrumentation for debugging

* Additional instrumentation for preprocessing debugging

* Updates to preprocessor, padding; produces correct end-to-end results on sample

* Tackling configuration TODOs

* Start of feature extractor refactor

* Adds Numpy version of USM extractor, removes Torch version and dependencies

* Fixing AltUp.correct coef permute

* Supporting batches of single audio segment inputs

* Docstrings updates for config

* In-lining audio feature extraction

* Adjustments to conversion script and smoke test script

---------

Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: pculliton <phillipculliton@gmail.com>

* Gemma 3n renaming

* Removing test data and utilities

* Renaming test files

* Gemma 3n refactor

* Fix tokenizer config in conversion script

* Address reviewer feedback

* FeatureExtractor returns float32 by default

* Adding basic tests for audio, and input name for audio encoder

* Audio integration test, updates to model_id for other integration tests

* Use scales for q and k norms (#26)

* Update audio integration test to use HF dataset

* Reviewer feedback

* Expand embedding table to full vocab size in weights conversion

* Mix-n-match MatFormers for Gemma 3n (#25)

* Remove in-place operations (#30)

* chore: removing inplace ops

* remove [tensor] * n pattern

* chore: reviewer feedback in AudioEncoder and AltUp

* More grad clipping

* Dynamo compatibility

* fix: cache slicing error

* chore: simplify shared kv cache slicing

* chore: vision encoder rename in timm

* fix: image processor do_normalize=False

* fixup: style

* chore: model_doc

* fix: docs for code quality

* chore: repo consistency

* fix: RMSNorm in float as in prior Gemmas

* fix: per_layer_inputs = None

* chore: Gemma3nForCausalLM from Gemma3nForConditionalGeneration checkpoint

* chore: repo consistency

* Add initial unit tests for Gemma3nAudioFeatureExtractor (#27)

* Add initial unit tests for Gemma3nAudioFeatureExtractor

* Add basic unit tests for Gemma3nProcessor (#28)

Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com>

* parameterize tests

---------

Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com>

* chore: code style

* fix: test cases

* style and consistency

* fix config in the test to be coherent with layer cache sharing

* fix hidden states in tests and code

* inits and mappings

* fix modality prefixes

* test order and prefixes

* fix test exception

* fix class order and reduce model size for faster tests

* restore _checkpoint_conversion_mapping to load Causal from Conditional

* fix config mapping!

* fix: reviewer feedback

---------

Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: pculliton <phillipculliton@gmail.com>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* fix import test

* add model args

* auto_docstring

* replace test path

* consistency

* skip tests for now

* fix docstring for doc builder

* skip unused attr

---------

Co-authored-by: SindhuRaghuram97 <114270661+SindhuRaghuram97@users.noreply.github.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: pculliton <phillipculliton@gmail.com>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
2025-06-26 17:55:47 +02:00
3e5cc12855 [tests] remove tests from libraries with deprecated support (flax, tensorflow_text, ...) (#39051)
* rm tf/flax tests

* more flax deletions

* revert fixture change

* reverted test that should not be deleted; rm tf/flax test

* revert

* fix a few add-model-like tests

* fix add-model-like checkpoint source

* a few more

* test_get_model_files_only_pt fix

* fix test_retrieve_info_for_model_with_xxx

* fix test_retrieve_model_classes

* relative paths are the devil

* add todo
2025-06-26 16:25:00 +01:00
cfff7ca9a2 [Whisper] Pipeline: handle long form generation (#35750)
* handle long form generation

* add warning

* correct incorrect in place token change

* update test to catch edge case

* make style

* update warning

* add doc
2025-06-26 14:33:31 +00:00
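A hedged sketch of long-form transcription through the pipeline (the file name is a placeholder): inputs longer than Whisper's 30-second window are handled by the long-form generation path this PR wires into the pipeline.

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
text = asr("one_hour_meeting.wav")["text"]  # long input, handled end to end
```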
02ecdcfc0f add _keep_in_fp32_modules_strict (#39058)
* add _keep_in_fp32_modules_strict

* complete test
2025-06-26 13:55:28 +00:00
d973e62fdd fix condition where torch_dtype auto collides with model_kwargs. (#39054)
* fix condition where torch_dtype auto collides with model_kwargs.

* update tests

* update comment

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-26 14:52:57 +02:00
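The colliding option in isolation, as a sketch (model id is a placeholder): `torch_dtype="auto"` defers to the dtype recorded in the checkpoint's config instead of a user-specified `torch.dtype`, which is what made the `model_kwargs` collision possible.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2", torch_dtype="auto")
print(model.dtype)  # dtype taken from the checkpoint config (or float32 fallback)
```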
44b231671d [qwen2-vl] fix vision attention scaling (#39043)
scale lost its `-` when refactoring
2025-06-26 14:06:52 +02:00
ae15715df1 polishing docs: error fixes for clarity (#39042)
* fix duplicate deprecate_models.py

* fix duplicate modular_model_converter.py
2025-06-26 11:56:31 +00:00
3abeaba7e5 Create test for #38916 (custom generate from local dir with imports) (#39015)
* create test for #38916 (custom generate from local dir with imports)
2025-06-26 13:54:36 +02:00
25c44d4b68 Internvl fix (#38946)
* Image processor compile fix (#38540)

* Added a compile-friendly version of resize to BaseImgProcessorFast

* Changed qwen2 processor to use its parent class .resize

* Style

* underlying issue only happens on AMD w/ comment and bool check

* Fixed some utils functions

* Fixed the same issue for bridgetower

* Fixed the same issue for llava_next

* Repo consistency for llava onevision

* Update src/transformers/image_processing_utils_fast.py

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>

---------

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>

* Added an Expectation to an internvl test

* Made qwen2_vl use the resize method of its parent class

* Changed to torch.where

---------

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>
2025-06-26 13:44:59 +02:00
f85b47d1b8 [Generate] Fix no grad on some models (#39008)
fixes on torch no grad for generate
2025-06-26 13:06:09 +02:00
583db52bc6 Add Dia model (#38405)
* add dia model

* add tokenizer files

* cleanup some stuff

* brut copy paste code

* rough cleanup of the modeling code

* nuke some stuff

* more nuking

* more cleanups

* updates

* add multiLayerEmbedding vectorization

* nits

* more modeling simplifications

* updates

* update rope

* update rope

* just fixup

* update configuration files

* more cleanup!

* default config values

* update

* forgotten comma

* another comma!

* update, more cleanups

* just more nits

* more config cleanups

* time for the encoder

* fix

* small nit

* nits

* n

* refactor a bit

* cleanup

* update cv script

* fix last issues

* fix last nits

* styling

* small fixes

* just run 1 generation

* fixes

* nits

* fix conversion

* fix

* more fixes

* full generate

* ouf!

* fixes!

* updates

* fix

* fix cvrt

* fixup

* nits

* delete wrong test

* update

* update

* test tokenization

* let's start changing things bit by bit - fix encoder step

* removing custom generation, moving to GenerationMixin

* add encoder decoder attention masks for generation

* mask changes, correctness checked against ad29837 in dia repo

* refactor a bit already --> next cache

* too important not to push :)

* minimal cleanup + more todos

* make main overwrite modeling utils

* add cfg filter & eos filter

* add eos countdown & delay pattern

* update eos countdown

* add max step eos countdown

* fix tests

* fix some things

* fix generation with testing

* move cfg & eos stuff to logits processor

* make RepetitionPenaltyLogitsProcessor flexible

- can accept 3D scores like (batch_size, channel, vocab)

* fix input_ids concatenation dimension in GenerationMixin for flexibility

* Add DiaHangoverLogitsProcessor and DiaExponentialDecayLengthPenalty classes; refactor logits processing in DiaForConditionalGeneration to utilize new configurations and improve flexibility.

* Add stopping criteria

* refactor

* move delay pattern from processor to modeling like musicgen.

- add docs
- change eos countdown to eos delay pattern

* fix processor & fix tests

* refactor types

* refactor imports

* format code

* fix docstring to pass ci

* add docstring to DiaConfig & add DiaModel to test

* fix docstring

* add docstring

* fix some bugs

* check

* porting / merging results from other branch - IMPORTANT: it very likely breaks generation, the goal is to have a proper forward path first

* experimental testing of left padding for first channel

* whoops

* Fix merge to make generation work

* fix cfg filter

* add position ids

* add todos, break things

* revert changes to generation --> we will force 2d but go 3d on custom stuff

* refactor a lot, change prepare decoder ids to work with left padding (needs testing), add todos

* some first fixes to get to 10. in generation

* some more generation fixes / adjustment

* style + rope fixes

* move cfg out, simplify a few things, more todos

* nit

* start working on custom logit processors

* nit

* quick fixes

* cfg top k

* more refactor of logits processing, needs a decision if gen config gets the new attributes or if we move it to config or similar

* lets keep changes to core code minimal, only eos scaling is questionable atm

* simpler eos delay logits processor

* that was for debugging :D

* proof of concept rope

* small fix on device mismatch

* cfg fixes + delay logits max len

* transformers rope

* modular dia

* more cleanup

* keep modeling consistently 3D, generate handles 2D internally

* decoder starts with bos if nothing

* post processing prototype

* style

* lol

* force sample / greedy + fixes on padding

* style

* fixup tokenization

* nits

* revert

* start working on dia tests

* fix a lot of tests

* more test fixes

* nit

* more test fixes + some features to simplify code more

* more cleanup

* forgot that one

* autodocs

* small consistency fixes

* fix regression

* small fixes

* dia feature extraction

* docs

* wip processor

* fix processor order

* processing goes brrr

* transpose before

* small fix

* fix major bug but now needs a closer look into the custom processors, esp cfg

* small thing on logits

* nits

* simplify indices and shifts

* add simpler version of padding tests back (temporarily)

* add logit processor tests

* starting tests on processor

* fix mask application during generation

* some fixes on the weights conversion

* style + fixup logits order

* simplify conversion

* nit

* remove padding tests

* nits on modeling

* hmm

* fix tests

* trigger

* probably gonna be reverted, just a quick design around audio tokenizer

* fixup typing

* post merge + more typing

* initial design for audio tokenizer

* more design changes

* nit

* more processor tests and style related things

* add to init

* protect import

* not sure why tbh

* add another protect

* more fixes

* wow

* it aint stopping :D

* another missed type issue

* ...

* change design around audio tokenizer to prioritize init and go for auto - in regards to the review

* change to new causal mask function + docstrings

* change ternary

* docs

* remove todo, i dont think its essential tbh

* remove pipeline as current pipelines do not fit in the current scheme, same as csm

* closer to wrapping up the processor

* text to audio, just for demo purposes (will likely be reverted)

* check if it's this

* save audio function

* ensure no grad

* fixes on prefixed audio, hop length is used via preprocess dac, device fixes

* integration tests (tested locally on a100) + some processor utils / fixes

* style

* nits

* another round of smaller things

* docs + some fixes (generate one might be big)

* mystery solved

* small fix on conversion

* add abstract audio tokenizer, change init check to abstract class

* nits

* update docs + fix some processing :D

* change inheritance scheme for audio tokenizer

* delete dead / unnecessary code in copied generate loop

* last nits on new pipeline behavior (+ todo on tests) + style

* trigger

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
2025-06-26 11:04:23 +00:00
5995cfa0a0 Fix Bad Outputs in Fast Path for GraniteMoeHybrid (#39033)
Fix bug in previous state setting
2025-06-26 09:45:57 +02:00
22b0a89878 Granite speech speedup + model saving bugfix (#39028)
* ensure the query is updated during training

avoid unused parameters that DDP does not like

* avoid a crash when `kwargs` contain `padding=True`

trainers often pass this argument automatically

* minor

* Remove mel_spec lazy init, and rename to mel_filters.
this ensures save_pretrained will not crash when saving the processor during training
d5d007a1a0/src/transformers/feature_extraction_utils.py (L595)

* minor - most feature extractors have a `sampling_rate` property

* speedup relative position embeddings

* fix several issues in model saving/loading:
- avoid modifying `self._hf_peft_config_loaded` when saving
- adapter_config automatically points to the original base model - a finetuned version should point to the model save dir.
- fixing model weights names, that are changed by adding an adapter.

* minor

* minor

* minor

* fixing a crash without peft active

* add todo to replace einsum
2025-06-26 09:44:17 +02:00
1d45d90e5d [tests] remove TF tests (uses of require_tf) (#38944)
* remove uses of require_tf

* remove redundant import guards

* this class has no tests

* nits

* del tf rng comment
2025-06-25 17:29:10 +00:00
d37f751797 Two ReDOS fixes (#39013)
* two_redos_fixes

* Fix two redos issues

* Just don't use RE at all
2025-06-25 17:31:26 +01:00
551e48f182 [Kyutai-STT] correct model type + model id (#39035)
* correct model type + model id

* update doc

* init fix

* style !!!
2025-06-25 16:09:00 +00:00
dad0e87c79 Add SmolLM3 (#38755)
* init smollm3

* integration tests

* config quirks

* docs stub

* tests round 2

* tests round 3

* tests round 4

* bring SWA back

* config checker pls

* final checkpoint

* style and copies

* Update src/transformers/models/smollm3/modular_smollm3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/smollm3/modular_smollm3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-06-25 15:12:15 +00:00
3233e9b7c3 refactor: remove custom BarkLayerNorm (#39003)
`nn.LayerNorm` supports `bias=False` since PyTorch 2.1
2025-06-25 16:07:52 +01:00
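The rationale in one line, assuming PyTorch >= 2.1: the built-in module now covers the bias-free case that previously needed a custom class:

```python
import torch.nn as nn

# nn.LayerNorm accepts bias=False since PyTorch 2.1, so a custom
# bias-free LayerNorm (like Bark's) can be replaced with:
norm = nn.LayerNorm(768, bias=False)
```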
3c1d4dfbac Fix grammatical error in models documentation (#39019) 2025-06-25 14:55:22 +00:00
858f9b71a8 Remove script datasets in tests (#38940)
* remove trust_remote_code

* again

* Revert "Skip some tests for now (#38931)"

This reverts commit 31d30b72245aacfdf70249165964b53790d9c4d8.

* again

* style

* again

* again

* style

* fix integration test

* fix tests

* style

* fix

* fix

* fix the last ones

* style

* last one

* fix last

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-25 14:31:20 +00:00
3c322c9cdf fix gemma3 grad acc (#37208)
* fix gemma3 grad acc

* fix

* fix

* fix

* fix

* rmv print

* rm

* Update setup.py

* Apply style fixes

* propagate the changes

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
2025-06-25 16:28:44 +02:00
860b898d03 fix: astronomical loss with ModernBERT when using gradient checkpointing (#38982) (#38983)
* fix: astronomical loss with ModernBERT when using gradient checkpointing

* update the modeling fix

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
2025-06-25 16:11:18 +02:00
a2eb75c891 Support for Flash Attention 3 (#38972)
* Support `flash_attn_3`
Implements fwd and tests for Flash Attention 3 https://github.com/Dao-AILab/flash-attention/commits/main/hopper

- Includes checks for dropout>0 and ALiBi in `modeling_utils.PreTrainedModel._check_and_enable_flash_attn_3` (dropout will likely be supported soon, so this will need to be updated, as will `modeling_flash_attention_utils._flash_attention_forward` at the `if _IS_FLASH_ATTN_3_AVAILABLE: ...` branch)

An example Llama implementation is included in `modeling_llama.py` but other models would still need to be updated

Based on https://github.com/huggingface/transformers/pull/36190 which has model implementations and examples which could be merged

* Add tests for Flash Attention 2 and 3 parity

* ci fix

* FA2 compatibility
- `_prepare_flash_attention_from_position_ids` -> `prepare_fa2_from_position_ids`
- Remove bettertransformer check in Flash Attention 3
- Merge tests
- Add licensing

* ci fix

* Test naming consistency

* ci fix

* Deprecation warning for `prepare_fa2_from_position_ids`

* ci fix
2025-06-25 14:39:27 +02:00
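A hedged sketch of opting into the new backend; the checkpoint id is illustrative, and the `attn_implementation` string mirrors the existing `flash_attention_2` convention this PR builds on:

```python
import torch
from transformers import AutoModelForCausalLM

# Assumes flash-attn 3 is installed and the model supports it (e.g. Llama,
# per the example implementation in this PR).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",  # illustrative checkpoint
    attn_implementation="flash_attention_3",
    torch_dtype=torch.bfloat16,
)
```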
de98fb25a3 Fix seamless_m4t not working on Gaudi (#38363)
* Fix seamless_m4t not working on Gaudi

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Refine the patch

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Fix seamless_m4t_v2 crash

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Use the patched_gather

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Remove debug logs

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Remove useless modifications

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add hpu check

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add comments

Signed-off-by: yuanwu <yuan.wu@intel.com>

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-06-25 12:40:01 +02:00
7503cb9113 [Model] add dots1 (#38143)
* add dots1

* address comments

* fix

* add link to dots1 doc

* format

---------

Co-authored-by: taishan <rgtjf1@163.com>
2025-06-25 11:38:25 +02:00
3ef8896906 Encoder-Decoder Gemma (#38332)
* Initial submit

* Fix bugs:
1. add __init__ file
2. tied word embedding
3. support flash/flex attention
4. model saving and loading

* Code refactor:
* Rename encdecgemma to t5gemma.
* Split attention into self- and cross-attention
* Split stack into encoder and decoder
* Add test cases
* Add auto configuration

* Update configurations.

* Fix bugs related to copy and attribute checks

* Fix type union

* Fix merge errors

* run ruff format

* Run make style and update tests.

* Add t5gemma model doc.

* ruff and style formatting.

* Add missed module config.

* Add dummy checkpoint link to pass tests (needs updating when real checkpoints are uploaded).

* Update model doc.

* Minor updates following Arthur's comments:
* replace docstrings with auto_docstrings
* remove checkpoint layers
* remove deprecate_kwargs

* fix rebase errors

* Fix docstring issues.

* fix t5gemma doc issue.

* run ruff format

* Updates:
* split encoder-only model out
* make t5gemmamodel encoder-decoder only
* update token and sequence classification
* update tests
2025-06-25 09:05:10 +00:00
af9870265e GLM-4.1V Model support (#38431)
* 20250508 Model Architecture

* Update modeling_glm4v.py

* Update modeling_glm4v.py

* Update modeling_glm4v.py

* update 1447

* 0526

* update

* format

* problem

* update

* update with only image embed diff

* Final

* upload

* update

* 1

* upload with ruff

* update

* update

* work

* 1

* 1

* update with new note

* 2

* Update convert_glm4v_mgt_weights_to_hf.py

* Update tokenization_auto.py

* update with new format

* remove rmsnorm

* draft with videos

* draft

* update

* update

* fix for review problem

* try to remove min_pixel

* update

* for test

* remove timestamps

* remove item

* update with remove

* change

* update 2200

* update

* Delete app.py

* format

* update

* Update test_video_processing_glm4v.py

* 1

* 2

* use new name

* Update test_video_processing_glm4v.py

* remove docs

* change

* update for image processors update

* 2108

* 2128

* Update modular_glm4v.py

* 1

* update some

* update

* rename

* 1

* remove tests output

* 2

* add configuration

* update

* Update test_video_processing_glm4v.py

* fix simple forward tests

* update with modular

* 1

* fix more tests

* fix generation test

* fix beam search and init

* modular changed

* fix beam search in case of single-image/video. Fails if multiple visuals per text

* update processor

* update test

* pass

* fix beam search

* update

* param correct

* Update convert_glm4v_mgt_weights_to_hf.py

* 1

* Update test_modeling_glm4v.py

* 4

* 2

* 2123 video process

* 2

* revert

* 1

* 2

* revert processing

* update preprocessor

* changed

* 1

* update

* update

* 6

* update

* update

* update

* Delete tmp.txt

* config

* Update video_processing_glm4v.py

* apply modular correctly

* move functions

* fix order

* update the longest_edge

* style

* simplify a lot

* fix random order of classes

* skip integration tests

* correctly fix the tests

* fix TP plan

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-06-25 10:43:05 +02:00
7b3807387b Drop unnecessary tokens in GPT2Model generation (#39016)
Drop unnecessary tokens in GPT2Model generation.

Co-authored-by: Yi Pan <conlesspan@outlook.com>
2025-06-25 08:29:00 +00:00
e212ff9e6a [video processor] support torchcodec and decrease cuda memory usage (#38880)
* don't move the whole video to GPU

* add torchcodec

* add tests

* make style

* instructblip as well

* consistency

* Update src/transformers/utils/import_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/utils/import_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/video_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-06-25 08:23:37 +00:00
11d0feacce [AutoModelForMaskGeneration] Remove duplicate code (#38622)
Remove duplicate code
2025-06-25 10:00:13 +02:00
3ee72af6b6 Fix graph break in torch.compile when using FA2 with attention_mask=None and batch size > 1 (#37332)
* Fix graph break in torch.compile when using FA2 with attention_mask=None and batch size > 1

* fix code format

* add test; replace position_ids with query_states because position_ids.shape[0] is always 1

* add assert loss is not nan
2025-06-25 07:58:34 +00:00
ae32f1ad11 Add zero dim tensor check when using flash_attention (#38280)
* Add zero dim tensor check when using flash_attention

Signed-off-by: ranzhejiang <zhejiang.ran@intel.com>

* Add zero dim tensor check when using flash_attention

Signed-off-by: ranzhejiang <zhejiang.ran@intel.com>

---------

Signed-off-by: ranzhejiang <zhejiang.ran@intel.com>
2025-06-25 09:48:50 +02:00
ca402e2116 [LightGlue] Fixed attribute usage from descriptor_dim to keypoint_detector_descriptor_dim (#39021)
fix: fix descriptor dimension handling in LightGlue model
2025-06-24 23:32:07 +01:00
48b6ef0238 Add Hugging Face authentication procedure for IDEs (PyCharm, VS Code,… (#38954)
* Add Hugging Face authentication procedure for IDEs (PyCharm, VS Code, etc.)

* Update quicktour.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-24 11:48:15 -07:00
ea9a30923e [HPU][Critical Issue Fix] ThreadPool instead of Pool for parallel pre-processing (#39002)
* ThreadPool instead of Pool for parallel pre-processing

* ThreadPool only if hpu available
2025-06-24 20:24:50 +02:00
995666edb5 Skip sdpa dispatch on flash test due to unsupported head dims (#39010) 2025-06-24 20:16:56 +02:00
f367c6337d Update self-comment-ci.yml user list (#39014)
add ivarflakstad to self-comment-ci.yml
2025-06-24 20:13:36 +02:00
67d36dc1d7 Fix bugs in DynamicCache (#37880)
* Fix bugs in DynamicCache

* Update

* Update

* Lint

* lint

* Rename test

* update

* update
2025-06-24 19:43:40 +02:00
6bdd4ec952 Add kyutai stt (#38909)
* first draft

* cleaner version

* update tests + modeling

* add tests

* init

* update test_modeling_common

* fix tests

* csm Processor draft

* conversion update

* mimi cache padding convolutions draft

* mimi streaming updates

* update mimi padding cache test

* update cache padding mimi test

* make style mimi

* updates generate moshi asr

* moshi asr integration tests (single + batched)

* update tests

* update conversion script

* good default sliding window value

* update generate

* update test checkpoint

* nit

* fix mimi

* fix codec prefix

* revert

* revert

* update config

* update config

* unnecessary mimi input restriction

* remove delay in tokens

* remove _prepare_4d_causal_attention_mask_with_cache_position and _update_causal_mask

* test update

* modular update

* make style

* nit

* rename

* create codec model generation config at init

* remove delay

* max_new_tokens/length warning

* correct conv1 padding cache import for modular

* nit

* fix on encoder_past_key_values

* convert modular

* move frame_size to config

* move frame_size to config

* update test name

* handle first token is bos

* better handling of max_new_tokens

* fix

* fix batch size in test input prep

* update docstring

* convert modular

* make style

* make style

* add feature extractor

* correct modular convention name for feature_extraction file

* update conversion script

* doc processor

* update doc

* update init

* update model type

* fixes

* update tests

* fix

* make

* add doc

* nit

* fix

* doc

* auto mappings

* doc

* nit

* convert modular

* doc

* nit

* extend _keep_in_fp32_modules to enforce fp32

* renaming to stt

* doc update + test update

* doc fixes

* doc fix

* doc fix

* fix musicgen tests

* fix musicgen tests

* make style

* fix musicgen tests

* correct frame_rate config param for mimi

* update mimi test

* revert update mimi test

* enforce cpu test

* move cache init in cache class

* convert modular

* docstring update

* update model id

* feature_extractor -> feature_extraction (SEW)

* convert modular

* update model id
2025-06-24 18:01:15 +02:00
08bf7f1afe Add kernelize to transformers (#38205)
* fix

* fix

* fix flow

* remove non compiling path

* change

* style

* fix

* update

* update pin

* revert
2025-06-24 17:38:54 +02:00
be10d4df60 Granite speech - minor fixes to support training with the HF trainer (#38833)
* ensure the query is updated during training

avoid unused parameters that DDP does not like

* avoid a crash when `kwargs` contain `padding=True`

trainers often pass this argument automatically

* minor

* Remove mel_spec lazy init, and rename to mel_filters.
this ensures save_pretrained will not crash when saving the processor during training
d5d007a1a0/src/transformers/feature_extraction_utils.py (L595)

* minor - most feature extractors have a `sampling_rate` property
2025-06-24 17:06:52 +02:00
e1e11b0299 Fix undeterministic order in modular dependencies (#39005)
* sort correctly

* Update modeling_minimax.py

* Update modular_model_converter.py
2025-06-24 17:04:33 +02:00
bdf5fb70aa Skip non-selected experts for qwen3_moe (#38133)
* fix(qwen3moe): skip experts with no workload

* avoid tolist and also update other moe models

* fix: should squeeze 0-dim only
2025-06-24 16:33:48 +02:00
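A self-contained sketch of the optimization (shapes and names assumed, not the actual qwen3_moe code): dispatch only to experts that actually received tokens, iterating over a tensor rather than a Python list, in line with the "avoid tolist" note above:

```python
import torch
import torch.nn.functional as F

num_experts, top_k = 8, 2
hidden_states = torch.randn(16, 32)                      # (tokens, hidden)
router_logits = torch.randn(16, num_experts)
_, selected_experts = router_logits.topk(top_k, dim=-1)  # (tokens, top_k)

# (tokens, top_k, num_experts) one-hot -> which experts have any workload
expert_mask = F.one_hot(selected_experts, num_classes=num_experts)
hit = (expert_mask.sum(dim=(0, 1)) > 0).nonzero().squeeze(-1)

for expert_idx in hit:  # skips idle experts entirely, no .tolist()
    token_idx, _ = torch.where(selected_experts == expert_idx)
    expert_input = hidden_states[token_idx]
    # ... run only these tokens through expert `expert_idx`'s FFN ...
```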
719058c625 Update attention_visualizer.py (#37860) 2025-06-24 16:21:36 +02:00
9f42c1f192 Added scikit-learn to the example image-classification requirements.txt (#37506)
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-06-24 15:24:02 +02:00
1636a7bcb9 Fixes for Arcee model (#39001)
* fix modular

* Update modular_arcee.py

* fix
2025-06-24 15:23:52 +02:00
71de20b818 Add Arcee model support (#38621)
* Add Arcee model support to transformers

- Add ArceeConfig and model mappings for all task types (CausalLM, SequenceClassification, QuestionAnswering, TokenClassification)
- Add auto-loading support through AutoModel, AutoConfig, and AutoTokenizer
- Use LlamaTokenizer for tokenization
- Add FX graph support for Arcee models
- Create lazy loading module structure for Arcee

* feat: update YARN scaling and RoPE validation for Arcee model

* feat: add auto_docstring checkpoint config to Arcee model classes

* docs: add pre-trained model weights reference to Arcee configuration files

* refactor: move RoPE utilities to dedicated modeling_rope_utils module

* Add comprehensive test suite for Arcee model

- Add test_modeling_arcee.py following standard transformers test patterns
- Include tests for all model variants (CausalLM, SequenceClassification, QuestionAnswering, TokenClassification)
- Add specific test for ReLU² activation in ArceeMLP
- Add RoPE scaling tests including YARN support
- Follow CausalLMModelTest pattern used by similar models

* Add documentation for Arcee model

- Add comprehensive model documentation with usage examples
- Include all model variants in autodoc
- Add to table of contents in proper alphabetical order
- Fixes documentation coverage for Arcee model classes

* Make style/fixup

* fix copyright year

* Sync modular conversion

* revert in legacy supported models in src/transformers/utils/fx

* cleaned redundant code in modular_arcee.py

* cleaned testing

* removed pretraining tp

* fix styles

* integration testing

---------

Co-authored-by: Pranav <veldurthipranav@gmail.com>
Co-authored-by: Pranav <56645758+pranav4501@users.noreply.github.com>
2025-06-24 15:05:29 +02:00
23c89a6732 [Attention] Small fix on output attentions (#38948)
small fix
2025-06-24 14:42:10 +02:00
4f650040a6 Removing extra space in large command for speech-pretraining example (#38705)
Removing extra space in Large command
2025-06-24 12:24:56 +00:00
d3d835d4fc [qwen] refactor attentions for vision/audio (#38930)
* refactor attentions in vision/audio

* remove fa2 import

* make config the only args

* pass along kwargs from modality encoders

* style
2025-06-24 10:53:52 +02:00
2e4c045540 🔴 Update default dtype for pipelines to auto (#38882)
* check typing

* Fallback to fp32 if auto not supported.

* up.

* feedback from review.

* make style.
2025-06-24 10:39:18 +02:00
21cb353b7b [docs] Typos - Single GPU efficient training features (#38964)
* Typos

- corrected bf16 training argument
- corrected header for SDPA

* improved readability for SDPA suggested by @stevhliu

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-23 12:33:10 -07:00
f9be71b34d Fix rag (#38585)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-23 17:42:46 +02:00
9eac19eb59 [Feature] Support is_split_into_words in the TokenClassificationPipeline. (#38818)
* some fixes

* some fixes

* now the pipeline can take a list of tokens as input via the is_split_into_words argument

* now the pipeline can take a list of tokens as input via the is_split_into_words argument

* now the pipeline can take a list of tokens as input via the is_split_into_words argument, and we can handle batches of tokenized input

* now the pipeline can take a list of tokens as input via the is_split_into_words argument, and we can handle batches of tokenized input

* solving test problems

* some fixes

* some fixes

* modify tests

* aligning start and end correctly

* adding tests

* some formatting

* some formatting

* some fixes

* some fixes

* some fixes

* resolve conflicts

* removing unimportant lines

* removing unimportant lines

* generalize to other languages

* generalize to other languages

* generalize to other languages

* generalize to other languages
2025-06-23 15:31:32 +00:00
2ce02b98bf fix mistral and mistral3 tests (#38978)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-23 17:07:18 +02:00
b6b4d43d6d Add support for auto_docstring with model outputs (#38242)
* experiment auto_docstring model outputs

* Fix PatchTSMixer

* Add check model output docstring to check_auto_docstring and fix all model outputs docstring

* add reordering of docstring in check_docstrings

* add check for redundant docstring in check_docstrings, remove redundant docstrings

* refactor check_auto_docstring

* make style

* fix copies

* remove commented code

* change List-> list Tuple-> tuple in docstrings

* fix modular

* make style

* Fix modular vipllava

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-06-23 10:39:41 -04:00
0c98f24889 fix: add __bool__ operator to tokenizer to avoid bloated asserts (#38899)
* fix: add __bool__ operator to tokenizer to avoid bloated asserts

When a user does 'assert tokenizer' to ensure that the tokenizer is not None, they inadvertently set off a rather expensive process in the '__len__()' operator. This fix adds a trivial '__bool__()' that returns True, so that a None tokenizer asserts and an actual tokenizer returns True when asserted, without calling the length op.

* typo
2025-06-23 14:32:16 +00:00
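The fix as described above, reduced to a sketch (class context assumed; only the relevant method is shown):

```python
class PreTrainedTokenizerBase:  # sketch, not the full class
    def __bool__(self) -> bool:
        # `assert tokenizer` previously fell back to __len__(), which walks
        # the whole vocabulary; returning True keeps the assert cheap while
        # `assert None` still fails as expected.
        return True
```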
d29482cc91 Add Idefics2/3 and SmolVLM Fast image processors + improvements for fast image processors (#38157)
* add working idefics2 fast and improvements for fast nested image processing

* add fast image processors idefics 3 and smolvlm

* cleanup tests

* fix doc idefics2

* PR review and fix issues after merge

* Force providing disable_grouping to group_images_by_shape

* simplify group_images_by_shape

* fix modular

* Fix nits after review
2025-06-23 14:17:25 +00:00
1a96127e46 Break tie in Expectations and gemma3 fixes (#38943)
* Added major / minor version to Expectations ordering

* Added fixes to gemma3

* Style
2025-06-23 15:13:27 +02:00
84d19be41e Apply GradientCheckpointingLayer to the whole repo (#38913)
* first batch (4)

* align

* altclip

* beit

* bert

* yolos

* dino, pvt_v2

* bark, bart, bert_generation

* big_bird, biogpt

* blenderbot, bloom

* bridgetower

* camembert, canine, chameleon

* chinese clip, clap, clip

* codegen, conditional detr, convbert

* dab_detr, data2vec

* dbrx, deberta

* deberta, decision_transformer, deformable_detr

* deit, deta, mctct

* detr, dinov2, distilbert

* donut, dpt, electra

* ernie, esm, falcon

* flava, fnet, falcon_mamba

* focalnet, git, gpt2

* gpt - bigcode, neo, neox

* gptj, groupvit

* idefics2, idefics3

* ijepa, imagegpt, internvl

* jetmoe, kosmos2, layoutlm

* layoutlm2-3, led

* lilt, longformer, longt5, luke

* m2m, mamba1-2

* marian, markuplm, mask2former

* maskformer

* mbart, megatron_bert, mimi

* mixtral, mlcd

* mobilevit1-2, modernbert

* moshi, mpt, mra

* mt5, musicgen

* mvp, nemotron

* nllb_moe

* nystromformer, omdet_turbo

* opt, owlvit, owlv2

* pegasus, pegasus_x, persimmon

* phimoe, pix2struct, pixtral

* plbart, pop2piano, prophetnet

* qwen2*

* qwen2, qwen3 moe, recurrent gemma

* rembert

* roberta

* roberta prelayernorm

* roc_bert, roformer, rwkv

* sam, sam_hq

* seggpt, smolvlm, speech_to_text

* splinter, stablelm, swin

* swin2sr, switch_transformer, t5, table_transformer

* tapas, time_series_transformer, timesformer

* trocr, tvp, umt5

* videomae, vilt, visual_bert

* vit, vit_mae, vit_msn

* vitpose_backbone, vits, vivit

* whisper, x_clip, xglm

* xlm_roberta, xmod

* yoso

* zamba

* vitdet, wav2vec2, wav2vec2_bert

* unispeech, wav2vec2_conformer

* wavlm

* speecht5

* swinv2

* sew / _d

* seamless_m4t / _v2

* deprecated models update

* bros

* gemma2, gemma3

* got, hiera, hubert, llama4, mllama, oneformer, phi, olmoe, informer

* fixup

* Add use_cache=False and past_key_value=None to GradientCheckpointingLayer

* fixup

* fix prophetnet

* fix bigbird_pegasus

* fix blenderbot

* fix mbart

* fix mvp

* fix zamba2

* fix bart

* fix blenderbot_small

* fix codegen

* Update gradient checkpointing layer to support more past_key_values arg names

* fix data2vec vision

* fix deformable_detr

* fix gptj

* fix led

* fix m2m_100

* add comment

* fix nllb_moe

* Fix pegasus_x

* fix plbart

* udop

* fix-copies: beit, wav2vec2

* fix gpt_bigcode

* fixup

* fix t5

* fix switch_transformers

* fix longt5

* fix mt5

* update tapas

* fix blip2

* update blip

* fix musicgen

* fix gpt2, trocr

* fix copies

* !!! Revert zamba, mllama

* update autoformer

* update bros

* update args / kwargs for BERT and copies

* 2nd round of updates

* update conditional detr

* Pass encoder_hidden_states as positional arg

* Update to pass encoder_decoder_position_bias as positional arg

* fixup

* biogpt modular

* modular gemma2

* modular gemma3

* modular gpt_neox

* modular informer

* modular internvl

* modular mixtral

* modular mlcd

* modular modernbert

* modular phi

* modular qwen2_5_omni

* modular qwen2_5_vl

* modular sam_hq

* modular sew

* wav2vec2_bert

* modular wav2vec2_conformer

* modular wavlm

* fixup

* Update by modular instructblipvideo

* modular data2vec_audio

* nit modular mistral

* apply modular minimax

* fix modular moonshine

* revert zamba2

* fix mask2former

* refactor idefics
2025-06-23 14:24:48 +02:00
07aab1af1e Remove dead protected imports (#38980)
* remove them

* more
2025-06-23 13:44:50 +02:00
74f5e4a1fa [modular] CLI allows positional arguments, and more defaults names for the optional arg (#38979)
* More defaults

* Update modular_model_converter.py
2025-06-23 12:40:01 +02:00
334bf913dc Fix(informer): Correct tensor shape for input_size=1 (#38856)
* Fix(time_series): Correct scaler tensor shape in base model

The create_network_inputs function in TimeSeriesTransformerModel
handled the scaler's loc and scale tensors inconsistently.
When input_size=1, the tensors were not squeezed, leading to
downstream dimension errors for models like Informer.

This commit refactors the logic to unconditionally apply .squeeze(1),
which correctly handles all input_size cases and fixes the bug at its source.

Fixes #38745

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2025-06-23 11:50:51 +02:00
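A toy reproduction of the shape issue described above (assumed shapes, not the library code): with `input_size=1` the scaler's `loc`/`scale` keep a singleton dim unless squeezed, and an unconditional `.squeeze(1)` handles every case uniformly:

```python
import torch

batch, input_size = 4, 1
loc = torch.zeros(batch, 1, input_size)   # scaler stats: (batch, 1, input_size)
scale = torch.ones(batch, 1, input_size)

# unconditional squeeze of dim 1 -> (batch, input_size) for any input_size
loc, scale = loc.squeeze(1), scale.squeeze(1)
print(loc.shape)  # torch.Size([4, 1])
```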
c184550daf Fix DTensor import compatibility for PyTorch < 2.5 (#38836) 2025-06-23 11:25:56 +02:00
984ff89e73 Gaudi3 CI (#38790) 2025-06-23 10:56:51 +02:00
2166b6b4ff Update blip model card (#38513)
* Update docs/source/en/model_doc/blip.md

* fix(docs/source/en/model_doc/blip.md): fix redundant typo error

* fix (docs/source/en/model_doc/blip.md): modify of review contents

* fix(docs/source/en/model_doc/blip.md): modify code block

* Update blip.md

---------

Co-authored-by: devkade <mouseku@moana-master>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-20 13:46:19 -07:00
166e823f77 Fix custom generate from local directory (#38916)
Fix custom generate from local directory:
1. Create parent dirs before copying files (custom_generate dir)
2. Correctly copy relative imports to the submodule file.
3. Update docs.
2025-06-20 17:36:57 +01:00
3d34b92116 Switch to use A10 progressively (#38936)
* try

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-20 16:10:35 +00:00
b8059e1f8f Fix more flaky test_initialization (#38932)
* try

* try

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-20 17:28:32 +02:00
5ee60f970a Correctly raise error for awq quantization (#38945)
fix warning
2025-06-20 17:18:06 +02:00
8ac2d75353 Pin PyTorch extras for AMD containers (#38941)
* Pin additional Torch packages

* Remove unused def

---------

Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com>
2025-06-20 12:17:21 +00:00
9120567b02 Add kwargs for timm.create_model in TimmWrapper (#38860)
* Add init kwargs for timm wrapper

* model_init_kwargs -> model_args

* add save-load test

* fixup
2025-06-20 12:00:09 +00:00
ff95974bc6 [static cache] fix device map per layer in VLMs (#38488)
return lm as decoder
2025-06-20 13:49:29 +02:00
aa42987c1e Remove ALL_LAYERNORM_LAYERS (#38922)
* remove it everywhere

* Update trainer_pt_utils.py

* Update trainer_pt_utils.py

* style

* sort list in test

* CIs

* use recursion the same way as before (for intermediate layer names)
2025-06-20 12:06:48 +02:00
38a9b70786 add pytorch-xpu Dockerfile (#38875)
* first commit

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* use rls pytorch

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-06-20 11:42:44 +02:00
9bcdd5cde9 Modernbert fixes (#38912)
* Removed deprecated argument in modernbert RotaryEmbedding

* Skip test_sdpa_can_dispatch_on_flash for modernbert

---------

Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-06-20 11:22:32 +02:00
31d30b7224 Skip some tests for now (#38931)
* try

* [test all]

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-20 11:05:49 +02:00
0725cd6953 Remove deprecated classes in modeling_utils.py (#38919)
* remove deprecated classes

* style
2025-06-19 19:25:20 +02:00
797860c68c feat: add flexible Liger Kernel configuration to TrainingArguments (#38911)
* feat: add flexible Liger Kernel configuration to TrainingArguments

Add support for granular Liger Kernel configuration through a new
`liger_kernel_config` parameter in TrainingArguments. This allows users
to selectively enable/disable specific kernels (rope, swiglu, cross_entropy,
etc.) instead of the current approach, which relies on the default configuration.

Features:
- Add `liger_kernel_config` dict parameter to TrainingArguments
- Support selective kernel application for all supported models
- Maintain full backward compatibility with existing `use_liger_kernel` flag

Example usage:
```python
TrainingArguments(
    use_liger_kernel=True,
    liger_kernel_config={
        "rope": True,
        "swiglu": True,
        "cross_entropy": False,
        "fused_linear_cross_entropy": True
    }
)
```

Closes #38905

* Address comments and update Liger section in Trainer docs
2025-06-19 15:54:08 +00:00
89b35be618 Allow make-fixup on main branch, albeit slowly (#38892)
* Allow make-fixup on main branch, albeit slowly

* Make the other style checks work correctly on main too

* More update

* More makefile update
2025-06-19 15:22:59 +01:00
9a02e7602d feat: Add granite architectures to auto tokenizer name mappings (#38802)
Branch: GraniteTokenizerMapping

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-19 15:20:42 +01:00
54a02160eb Fix ReDOS in tokenizer digit substitution (#38844)
* Fix regexes vulnerable to ReDOS

* Let's just use regex

* Import regex/re correctly
2025-06-19 14:53:52 +01:00
af6120b3eb Skip sdpa tests if submodule does not support sdpa (#38907) 2025-06-19 13:11:01 +00:00
5d26a38735 Fix FalconMambaIntegrationTests (#38566)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-19 13:50:33 +02:00
a9ce8c69c9 align xpu's autocast behavior w/ cuda by using device agnostic torch APIs (#38284)
* switch to device agnostic autocast in nemotron to align xpu behavior w/ cuda

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix issue

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* use torch.autocast as other modeling code does for decision_transformer & gpt2 & imagegpt

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* refine

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* update get_autocast_gpu_dtype to device agnostic one

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-06-19 11:48:23 +00:00
0a53df1a77 Fix unnecessary super calls (#38897)
Signed-off-by: cyy <cyyever@outlook.com>
2025-06-19 11:45:51 +00:00
b949747b54 Fix fsmt tests (#38904)
* fix 1

* fix 2

* fix 3

* fix 4

* fix 5

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-19 10:56:34 +02:00
11738f8537 [phi-4] use mel filters from audio utils (#36966)
* use mel_filter_bank from audio utils

* Apply style fixes

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-19 12:35:32 +09:00
f7b21822e3 Use raise from e in hub.py utility (#37241)
Use raise from e

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-06-19 03:06:25 +00:00
3756bf192c Add support for specifying revisions when pushing to Hub via internal Trainer call (#36852)
* Update training_args.py

* Update trainer.py

* fixes

* fix

* remove extraneous comments

* explicit revision arg

* add msg

* fixup

* fix field name

* rename field revision to hub_revision

* restore gradient_checkpointing doc

* fix ws

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-06-19 02:35:33 +00:00
458e0b376c Update bamba model card (#38853)
* Update bamba model card

* Update the doc for bamba

* Update docs/source/en/model_doc/bamba.md

Bamba paragraph

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bamba.md

Bamba collection url

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bamba.md

Update Padding-Free Training to Notes heading

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bamba.md

update examples

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bamba.md

Update additional info

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bamba.md

consistent casing

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bamba.md

simplify sentences

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Include pipeline and cli examples + fix formatting

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bamba.md

update cli id

* Update quantization example

* Fix auto code formatter changes

* Update cli command + include BambaModel

* Update docs/source/en/model_doc/bamba.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-18 16:01:25 -07:00
ea01334873 [video processor] fix slow tests (#38881)
* we need to check against mapping to be safe

* need to check only when inferring from image type, otherwise it messes up custom code

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-06-18 22:39:56 +02:00
b922b22ec2 36978 | Fast image processor for DPT model (#37481)
* chore: ran codegen script

* test: test_image_processor_properties

* test: test_image_processor_from_dict_with_kwargs

* test: wip - test_padding

* test: test_padding

* test: test_keep_aspect_ratio

* wip

* test

* test: wip

* test: wip

* test: test_call_segmentation_maps, wip

* chore: tidy up

* test: test_call_segmentation_maps

* fix: test_save_load_fast_slow

* test: reduce labels

* chore: make fixup

* chore: rm comment

* chore: tidy

* chore remove comment

* refactor: no need to infer channel dimension

* refactor: encapsulate logic for preparing segmentation maps

* refactor: improve readability of segmentation_map preparation

* improvement: batched version of pad_image

* chore: fixup

* docs

* chore: make quality

* chore: remove unnecessary comment

* fix: add SemanticSegmentationMixin

* feat: add post_process_depth_estimation to fast dpt image processor

* chore: fix formatting

* remove max_height, max_width

* fix: better way of processing segmentation maps
- copied from Beit Fast processor

* chore: formatting + remove TODO

* chore: fixup styles

* chore: remove unnecessary line break

* chore: code review suggestion to remove autodocstring

* fix: add do_reduce_labels logic + refactor
- refactor preprocess logic to make it consistent with other processors
- add missing reduce labels logic

* refactor: remove deprecated mixin

* chore: fixup

* use modular for dpt + final nit changes

* fix style

---------

Co-authored-by: Samuel Rae <samuelrae@Samuels-Air.fritz.box>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-06-18 17:33:29 +00:00
c27f628e98 Docs: Add custom fine-tuning tutorial to TrOCR model page (#38847)
* Update trocr.md

Docs: add community fine‑tuning notebook link to TrOCR page

* apply suggested changes from PR review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/trocr.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-18 09:38:58 -07:00
0a289d1630 log: Add logging when using split_batches and per_device_train_batch_size (#38633)
* log: Add logging when user uses split_batches and per_device_train_batch_size

* refactor: remove whitespace from blank line

* Update src/transformers/training_args.py

Change logging level to info

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-18 16:26:46 +00:00
c55d806355 [bugfix] fix ATTN_MASK_NPU device mismatch error on multi-device NPU … (#38876)
[bugfix] fix ATTN_MASK_NPU device mismatch error on multi-device NPU setups
2025-06-18 16:26:22 +00:00
9cd7570f34 Fix loop var naming (#38885) 2025-06-18 13:45:01 +00:00
1fc67a25c6 More PYUP fixes (#38883)
More pyup fixes

Signed-off-by: cyy <cyyever@outlook.com>
2025-06-18 14:38:08 +01:00
12d4c5b66f null deepspeed_plugin in args for wandb callback fake trainer (#38867) 2025-06-18 13:10:22 +00:00
3620b32cc8 Fixed markdown for BertTokenizer's '[CLS]' token. (#38506) 2025-06-18 13:09:58 +00:00
cb0f604192 Fix HQQ model param device transfer issue (#38466)
* Fix HQQ model param device transfer issue

* modify a comment

* clear the code and add test for hqq device/dtype

* fix test hqq code quality of imports

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-18 15:09:00 +02:00
c77bcd889f Fix qwen3_moe tests (#38865)
* try 1

* try 2

* try 3

* try 4

* try 5

* try 6

* try 7

* try 8

* try 9

* try 10

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-18 14:36:03 +02:00
5a95ed5ca0 🚨🚨 Fix initialization of Mask2Former (#38864)
* Correctly fix init

Co-authored-by: BUI Van Tuan <buivantuan07@gmail.com>

* add back the block, breaking BC, but this matches the original author's code

* override the test for params needing it

---------

Co-authored-by: BUI Van Tuan <buivantuan07@gmail.com>
2025-06-18 09:46:22 +02:00
309e8c96f2 Fix phi4_multimodal tests (#38816)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-18 09:39:17 +02:00
3526e25d3d enable misc test cases on XPU (#38852)
* enable misc test cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* tweak bamba ground truth on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* remove print

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* one more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-06-18 09:20:49 +02:00
d058f81e5b Post-PR fixes! (#38868)
* Post-PR fixes!

* make fix-copies
2025-06-17 19:58:47 +01:00
508a704055 No more Tuple, List, Dict (#38797)
* No more Tuple, List, Dict

* make fixup

* More style fixes

* Docstring fixes with regex replacement

* Trigger tests

* Redo fixes after rebase

* Fix copies

* [test all]

* update

* [test all]

* update

* [test all]

* make style after rebase

* Patch the hf_argparser test

* Patch the hf_argparser test

* style fixes

* style fixes

* style fixes

* Fix docstrings in Cohere test

* [test all]

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-17 19:37:18 +01:00
a396f4324b Update roc bert docs (#38835)
* Moved the sources to the right

* small Changes

* Some Changes to moonshine

* Added the install to pipeline

* updated the moonshine model card

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated documentation according to changes

* Fixed the model with the commits

* Changes to the roc_bert

* Final Update to the branch

* Adds Quantization to the model

* Finished fixing the roc_bert docs

* Fixed Moshi

* Fixed Problems

* Fixed Problems

* Fixed Problems

* Fixed Problems

* Fixed Problems

* Fixed Problems

* Added the install to pipeline

* updated the moonshine model card

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated documentation according to changes

* Fixed the model with the commits

* Fixed the problems

* Final Fix

* Final Fix

* Final Fix

* Update roc_bert.md

---------

Co-authored-by: Your Name <sohamprabhu@Mac.fios-router.home>
Co-authored-by: Your Name <sohamprabhu@Sohams-MacBook-Air.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-17 11:02:18 -07:00
3ae52cc312 Update CvT documentation with improved usage examples and additional … (#38731)
* Update CvT documentation with improved usage examples and additional notes

* initial update

* cvt

* Update docs/source/en/model_doc/cvt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update cvt.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-17 10:30:03 -07:00
e5a9ce48f7 Add LightGlue model (#31718)
* init

* chore: various changes to LightGlue

* chore: various changes to LightGlue

* chore: various changes to LightGlue

* chore: various changes to LightGlue

* Fixed dynamo bug and image padding tests

* refactor: applied refactoring changes from SuperGlue's concat, batch and stack functions to LightGlue file

* tests: removed sdpa support and changed expected values

* chore: added some docs and refactoring

* chore: fixed copy to superpoint.image_processing_superpoint.convert_to_grayscale

* feat: adding batch implementation

* feat: added validation for preprocess and post process method to LightGlueImageProcessor

* chore: changed convert_lightglue_to_hf script to comply with new standard

* chore: changed lightglue test values to match new lightglue config pushed to hub

* chore: simplified convert_lightglue_to_hf conversion map

* feat: adding batching implementation

* chore: make style

* feat: added threshold to post_process_keypoint_matching method

* fix: added missing instructions that turn keypoints back to absolute coordinates before the matching forward

* fix: added typehint and docs

* chore: make style

* [run-slow] lightglue

* fix: add matches different from -1 to compute valid matches in post_process_keypoint_matching

* tests: added CUDA proof tests similar to SuperGlue

* chore: various changes to modeling_lightglue.py

- Added "Copies from" statements for copied functions from modeling_superglue.py
- Added missing docstrings
- Removed unused functions or classes
- Removed unnecessary statements
- Added missing typehints
- Added comments to the main forward method

* chore: various changes to convert_lightglue_to_hf.py

- Added model saving
- Added model reloading

* chore: fixed imports in lightglue files

* [run-slow] lightglue

* chore: make style

* [run-slow] lightglue

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* [run-slow] lightglue

* chore: Applied some suggestions from review

- Added missing typehints
- Refactor "cuda" to device variable
- Variable renaming
- LightGlue output order changed
- Make style

* fix: added missing grayscale argument in image processor when using the SuperPoint keypoint detector

* fix: changed lightglue HF repo to lightglue_superpoint with grayscale default to True

* refactor: make keypoints `(batch_size, num_keypoints, keypoint_dim)` through forward and unsqueeze only before attention layer

* refactor: refactor do_layer_keypoint_pruning

* tests: added tests with no early stop and keypoint pruning

* refactor: various refactoring to modeling_lightglue.py

- Removed unused functions
- Renamed variables for consistency
- Added comments for clarity
- Set methods to private in LightGlueForKeypointMatching
- Replaced tensor initialization to list then concatenation
- Used more pythonic list comprehension for repetitive instructions

* refactor: added comments and renamed filter_matches to get_matches_from_scores

* tests: added copied from statement with superglue tests

* docs: added comment to prepare_keypoint_matching_output function in tests

* [run-slow] lightglue

* refactor: reordered _concat_early_stopped_outputs in LightGlue class

* [run-slow] lightglue

* docs: added lightglue.md model doc

* docs: added Optional typehint to LightGlueKeypointMatchingOutput

* chore: removed pad_images function

* chore: set do_grayscale default value to True in LightGlueImageProcessor

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* docs: added missing LightGlueConfig typehint in nn.Module __init__ methods

* docs: removed unnecessary code in docs

* docs: import SuperPointConfig only from a TYPE_CHECKING context

* chore: use PretrainedConfig arguments `num_hidden_layers` and `num_attention_heads` instead of `num_layers` and `num_heads`

* chore: added organization as arg in convert_lightglue_to_hf.py script

* refactor: set device variable

* chore: added "gelu" in LightGlueConfig as hidden_act parameter

* docs: added comments to reshape.flip.reshape instruction to perform cross attention

* refactor: used batched inference for keypoint detector forward pass

* fix: added fix for SDPA tests

* docs: fixed docstring for LightGlueImageProcessor

* [run-slow] lightglue

* refactor: removed unused line

* refactor: added missing arguments in LightGlueConfig init method

* docs: added missing LightGlueConfig typehint in init methods

* refactor: added checkpoint url as default variable to verify the model's outputs only if it is the default url

* fix: moved print message inside if statement

* fix: added log assignment r removal in convert script

* fix: got rid of confidence_thresholds as registered buffers

* refactor: applied suggestions from SuperGlue PR

* docs: changed copyright to 2025

* refactor: modular LightGlue

* fix: removed unnecessary import

* feat: added plot_keypoint_matching method to LightGlueImageProcessor with matplotlib soft dependency

* fix: added missing import error for matplotlib

* Updated convert script to push on ETH org

* fix: added missing licence

* fix: make fix-copies

* refactor: use cohere apply_rotary_pos_emb function

* fix: update model references to use ETH-CVG/lightglue_superpoint

* refactor: add and use intermediate_size attribute in config to inherit CLIPMLP for LightGlueMLP

* refactor: explicit variables instead of slicing

* refactor: use can_return_tuple decorator in LightGlue model

* fix: make fix-copies

* docs: Update model references in `lightglue.md` to use the correct pretrained model from ETH-CVG

* Refactor LightGlue configuration and processing classes

- Updated type hints for `keypoint_detector_config` in `LightGlueConfig` to use `SuperPointConfig` directly.
- Changed `size` parameter in `LightGlueImageProcessor` to be optional.
- Modified `position_embeddings` in `LightGlueAttention` and `LightGlueAttentionBlock` to be optional tuples.
- Cleaned up import statements across multiple files for better readability and consistency.

* refactor: Update LightGlue configuration to enforce eager attention implementation

- Added `attn_implementation="eager"` to `keypoint_detector_config` in `LightGlueConfig` and `LightGlueAttention` classes.
- Removed unnecessary logging related to attention implementation fallback.
- Cleaned up import statements for better readability.

* refactor: renamed message into attention_output

* fix: ensure device compatibility in LightGlueMatchAssignmentLayer descriptor normalization

- Updated the normalization of `m_descriptors` to use the correct device for the tensor, ensuring compatibility across different hardware setups.

* refactor: removed Conv layers from init_weights since LightGlue doesn't have any

* refactor: replace add_start_docstrings with auto_docstring in LightGlue models

- Updated LightGlue model classes to utilize the new auto_docstring utility for automatic documentation generation.
- Removed legacy docstring handling to streamline the code and improve maintainability.

* refactor: simplify LightGlue image processing tests by inheriting from SuperGlue

- Refactored `LightGlueImageProcessingTester` and `LightGlueImageProcessingTest` to inherit from their SuperGlue counterparts, reducing code duplication.
- Removed redundant methods and properties, streamlining the test setup and improving maintainability.

* test: forced eager attention implementation to LightGlue model tests

- Updated `LightGlueModelTester` to include `attn_implementation="eager"` in the model configuration.
- This change aligns the test setup with the recent updates in LightGlue configuration for eager attention.

* refactor: update LightGlue model references

* fix: import error

* test: enhance LightGlue image processing tests with setup method

- Added a setup method in `LightGlueImageProcessingTest` to initialize `LightGlueImageProcessingTester`.
- Included a docstring for `LightGlueImageProcessingTester` to clarify its purpose.

* refactor: added LightGlue image processing implementation to modular file

* refactor: moved attention blocks into the transformer layer

* fix: added missing import

* fix: added missing import in __all__ variable

* doc: added comment about enforcing eager attention because of SuperPoint

* refactor: added SuperPoint eager attention comment and moved functions to the closest they are used

---------

Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-06-17 18:10:23 +02:00
2507169bf6 Fix qwen3 tests (#38862)
* fix

* update

* update

* update

* update

* update

* update

* format

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-17 15:21:36 +02:00
41e0c921cb Improve auxiliary_in_channels default behavior in UperNet (#37540)
Improve auxiliary_in_channels behavior in UperNet

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-06-17 12:56:46 +00:00
c61ca64aaa Fix qwen2_5_vl tests (#38845)
* fix

* breakpoint()

* breakpoint()

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-17 10:55:24 +02:00
37367c7d9f Allow customization of sdpa in executorch.py (#38827)
An earlier PR put the executorch-specific sdpa and mask function in the export function. This prevented any customization of sdpa prior to export. By moving this to __init__, we keep the original behavior but allow users such as optimum-executorch to override sdpa by setting model.config._attn_implementation.
2025-06-17 10:38:20 +02:00
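A minimal sketch of the customization path this change enables (the checkpoint is an arbitrary placeholder and the export step is elided): set the private config attribute the commit mentions before handing the model to an ExecuTorch export pipeline.

```python
# Hedged sketch: override the attention implementation prior to export.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-LlamaForCausalLM")
model.config._attn_implementation = "sdpa"  # customize before export
# ... then run the usual optimum-executorch export on `model`
```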
9c878d2f64 Fix incorrect width ratio calculation in Llama4 image processor (#38842) 2025-06-17 07:33:36 +00:00
bf370e446b [video processor] fix BC when no video config is found (#38840)
fix auto video processor
2025-06-17 09:20:16 +02:00
e61160c5db Remove merge conflict artifacts in Albert model doc (#38849) 2025-06-16 14:21:18 -07:00
64e9b049d9 Updated aya_vision.md (#38749)
* Update aya_vision.md

* Suggested changes made to aya_vision.md

* Quantization Example added - aya_vision.md

* Polished - aya_vision.md

* Update aya_vision.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-16 10:46:30 -07:00
5ab0f447ab GraniteMoeHybrid: Allow for only shared expert case. (#38801)
* Allow for only shared expert case.

* Style
2025-06-16 16:15:42 +01:00
a7593a1d1f [BugFix] QA pipeline edge case: align_to_words=True in QuestionAnsweringPipeline can lead to duplicate answers (#38761)
* fixed the problem of align_to_words=True leading to duplicate answers

* adding tests

* some fixes

* some fixes

* changed handle_duplicate_answers to default to False

* some fixes

* some fixes

* make the duplicate handling the default behaviour and merge duplicates

* make the duplicate handling the default behaviour
2025-06-16 15:01:22 +00:00
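A usage sketch of the behavior described above (the checkpoint is an arbitrary SQuAD model): with align_to_words=True, sub-word answers are snapped to word boundaries, which previously could yield duplicate spans that this change now merges by default.

```python
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
answers = qa(
    question="Where is the Eiffel Tower?",
    context="The Eiffel Tower is located in Paris, France.",
    align_to_words=True,  # word-aligned spans; duplicates are now merged
    top_k=3,
)
print(answers)
```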
18c7f32daa Fix broken tag in Longformer model card (#38828) 2025-06-16 07:44:40 -07:00
b44b04ee9a Fix broken notebooks link in Italian training docs (#38834) 2025-06-16 07:38:51 -07:00
9300728665 Fix peft integration (#38841)
Update peft.py
2025-06-16 10:39:25 +02:00
608884960e add default mapping to peft integration 2025-06-16 10:23:51 +02:00
ce6ac53ac1 bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming (#38627)
* propage key mapping to peft

* propage key mapping to peft

* make requested changes

* revert
2025-06-16 10:10:23 +02:00
925da8ac56 Fix redundant code in Janus (#38826)
* minor mistake

* modify return statements
2025-06-16 06:53:59 +00:00
d2fd3868bb [internvl] fix video inference (#38811)
fix
2025-06-16 08:37:30 +02:00
d5d007a1a0 Updated Albert model Card (#37753)
* Updated Albert model Card

* Update docs/source/en/model_doc/albert.md

added the quotes in <hfoption id="Pipeline">

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/albert.md

updated checkpoints

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/albert.md

changed !Tips description

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/albert.md

updated text

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/albert.md

updated transformer-cli implementation

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/albert.md

changed text

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/albert.md

removed repeated description

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update albert.md

removed lines

* Update albert.md

updated pipeline code

* Update albert.md

updated auto model code, removed quantization as model size is not large, removed the attention visualizer part

* Update docs/source/en/model_doc/albert.md

updated notes

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update albert.md

reduced a repeated point in the notes

* Update docs/source/en/model_doc/albert.md

updated transformer-CLI

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/albert.md

removed extra notes

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-13 14:58:06 -07:00
443aafd3d6 [docs] updated roberta model card (#38777)
* updated roberta model card

* fixes suggested after reviewing

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-13 12:02:44 -07:00
fdb5da59dd [docs] Update docs moved to the course (#38800)
* update

* update

* update not_doctested.txt

* slow_documentation_tests.txt
2025-06-13 12:02:27 -07:00
8b73799500 fixed docstring in modular_qwen2_5_vl.py (#38798)
* fixed docstring in modular_qwen2_5_vl.py

* Regenerate file to match docstring update
2025-06-13 11:09:51 -07:00
9bec2654ed Add V-JEPA for video classification model (#38788)
* adding model and conversion scripts

* add imports to test vjepa conversion

* fix imports and make conversion work

* fix computation for short side

* replace attention with library attention function

* cleanup more attention classes

* remove config overrides

* add test cases, fix some of the failing ones

* fix the model outputs

* fix outputs of the model per review

* fix too big model test case

* fix styling __init__.py

* fix initialization test

* remove all asserts per review

* update sorting unsorting logic as per feedback

* remove is_video per review

* remove another is_video segment

* remove unwanted stuff

* small fixes

* add docstrings for the model

* revert adding vjepa2 config here

* update styling

* add config docstrings (wip)

* fix dpr issue

* removed test failing issues

* update styles

* merge predictor configs into main config

* remove processing code, add video processor

* remove permute which is not necessary now

* fix styles

* updated vjepa2 to be in video_processing_auto

* update comment for preprocessing

* test integration test and fix the outputs

* update test values, change test to look at repeated frames for a given image

* add a simple video processing test

* refactoring pixel_values_videos and upload ckpts to original

* fix torch_fx test cases

* remove unused config

* add all config docstrings

* add more integration tests

* add basic doc

* revert unwanted styling changes

* working make fixup

* Fix model_type in config

* Add ForVideoClassification model

* update attention implementation to fit new hf standards

* fix the preprocessing logic, ensure it matches the original model

* remove use_rope logic, cleanup

* fix docstrings

* Further cleanup, update doc

* Fix model prefix

* fix get_vision_features

* VJEPA2Embeddings style refactor

* nit, style comment

* change modules default values

* Only `str` activation in config

* GradientCheckpointingLayer

* fixup

* fix conversion script

* Remove return_dict

* remove None return typehint

* Refactor VJEPA2Layer, remove use_SiLU

* Fix fx tests

* dpr -> drop_path_rates

* move *ModelOutput on top

* format docs bit

* update docs

* update docs

* update doc example

* remove prune_heads from model

* remove unused config params

* refactor embed signature

* Add vjepa to docs

* Fix config docstring

* attention head

* update defaults

* Update docs/source/en/model_doc/vjepa2.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/vjepa2.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix import

* Min refactoring

* Update HUB_SOURCE and HUB_REPO in conversion script

* Add missing headers

* VJEPA -> V-JEPA in docs

* Add image to doc

* fix style

* fix init weights

* change checkpoint name in modeling tests

* Initial cls head setup

* remove rop attention from head (not needed)

* remove swigluffn - not needed

* Add siglip layer

* Replace with siglip layer

* Rename Siglip - VJEPA2

* remove unused modules

* remove siglip mlp

* nit

* remove MLP

* Refactor head cross attention

* refactor VJEPA2HeadCrossAttentionLayer

* nit renaming

* fixup

* remove commented code

* Add cls head params to config

* depth from config

* move pooler + classifier  to the model

* Update for cls model signature

* move layers, rename a bit

* fix docs

* update weights init

* remove typehint for init

* add to auto-mapping

* enable tests

* Add conversion script

* fixup

* add to docs

* fix docs

* nit

* refactor for mapping

* clean

* Add integration test

* Fixing multi gpu test

* update not-split-modules

* update video cls test tolerance

* Increase test_inference_image tolerance

* Update no-split modules for multi gpu

* Apply suggestions from code review

* fixing multi-gpu

* fix docstring

* Add cls snippet to docs

* Update checkpoint
2025-06-13 17:56:15 +01:00
2ff964bcb4 Fix trainer.py not showing signature columns (#38465)
Fix trainer.py not showing signature columns
2025-06-13 15:39:29 +00:00
4c3c177ecf Fix a minor security issue (#38815)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-13 17:37:46 +02:00
93445aed06 change fsdp_strategy to fsdp in TrainingArguments in accelerate doc (#38807) 2025-06-13 15:32:40 +00:00
b82a45b3b4 Refactor DBRX tests to use CausalLMModelTest base classes (#38475)
* Refactor DBRX tests to use CausalLMModelTest base classes

- Changed DbrxModelTester to inherit from CausalLMModelTester
- Changed DbrxModelTest to inherit from CausalLMModelTest
- Removed duplicate methods that are already in base classes
- Added required class attributes for model classes
- Updated pipeline_model_mapping to include feature-extraction
- Kept DBRX-specific configuration and test methods
- Disabled RoPE tests as DBRX's rotary embedding doesn't accept config parameter

This refactoring reduces code duplication and follows the pattern established
in other causal LM model tests like Gemma.

* Apply style fixes

* Trigger tests

* Refactor DBRX test

* Make sure the DBRX-specific settings are handled

* Use the attribute_map

* Fix attribute map

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-13 16:22:12 +01:00
64041694a8 Use wandb.run.url instead of wandb.run.get_url() (deprecated) (#38817) 2025-06-13 15:20:04 +00:00
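The one-line migration this commit performs, sketched below (project name is a placeholder, and a configured wandb environment is assumed); `run.url` is the documented property replacing the deprecated accessor.

```python
import wandb

run = wandb.init(project="transformers-demo")  # placeholder project
print(run.url)  # preferred
# print(run.get_url())  # deprecated accessor being replaced
```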
9ff246db00 Expectation fixes and added AMD expectations (#38729) 2025-06-13 16:14:58 +02:00
e39172ecab Fix llava_next tests (#38813)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-13 15:19:41 +02:00
b3b7789cbc Better pipeline type hints (#38049)
* image-classification

* depth-estimation

* zero-shot-image-classification

* image-feature-extraction

* image-segmentation

* mask-generation

* object-detection

* zero-shot-object-detection

* image-to-image

* image-text-to-text

* image-to-text

* text-classification

* text-generation

* text-to-audio

* text2text_generation

* fixup

* token-classification

* document-qa

* video-classification

* audio-classification

* automatic-speech-recognition

* feature-extraction

* fill-mask

* zero-shot-audio-classification

* Add pipeline function typing

* Add code generator and checker for pipeline types

* Add to makefile

* style

* Add to CI

* Style
2025-06-13 13:44:07 +01:00
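A simplified sketch of the typing pattern such a change introduces, assuming `typing.overload` with `Literal` task names (the real mapping covers every task listed above; class names here are stand-ins):

```python
from typing import Literal, overload

class TextClassificationPipeline: ...
class ImageClassificationPipeline: ...

@overload
def pipeline(task: Literal["text-classification"]) -> TextClassificationPipeline: ...
@overload
def pipeline(task: Literal["image-classification"]) -> ImageClassificationPipeline: ...
def pipeline(task: str):
    # runtime dispatch; the overloads above exist purely for type checkers
    mapping = {
        "text-classification": TextClassificationPipeline,
        "image-classification": ImageClassificationPipeline,
    }
    return mapping[task]()
```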
c989ddd294 Simplify and update trl examples (#38772)
* Simplify and update trl examples

* Remove optim_args from SFTConfig in Trainer documentation

* Update docs/source/en/trainer.md

* Apply suggestions from code review

* Update docs/source/en/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Quentin Gallouédec <qgallouedec@Quentins-MacBook-Pro.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-13 12:03:49 +00:00
de24fb63ed Use HF papers (#38184)
* Use hf papers

* Hugging Face papers

* doi to hf papers

* style
2025-06-13 11:07:09 +00:00
1031ed5166 Disable custom MRA kernels for ROCm (#38738)
* Disable custom MRA kernels for ROCm

* Move platform check code to utils

* Ruff

* Ruff again

* Fix querying HIP version

* Revert some changes

* Add missing return statement

---------

Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com>
2025-06-13 12:25:28 +02:00
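A hedged sketch of the platform check the commit moves into utils: `torch.version.hip` is a version string on ROCm builds and None otherwise, which is one reliable way to detect HIP.

```python
import torch

def is_rocm_platform() -> bool:
    # torch.version.hip is set on ROCm builds, None on CUDA builds
    return getattr(torch.version, "hip", None) is not None

# gate the custom kernels on a CUDA (non-HIP) build
load_custom_mra_kernels = torch.cuda.is_available() and not is_rocm_platform()
```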
7f00b325f8 Unbreak optimum-executorch (#38646)
* Unbreak optimum-executorch

* use static cache if the config has layer_types but no sliding_window

* revert view on kv_arange

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2025-06-13 11:13:32 +02:00
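One reading of the cache-selection bullet above, sketched as a helper (the function name and the fallback behavior are assumptions, not the PR's actual code):

```python
def pick_cache_implementation(config):
    # Assumed rule from the commit bullet: a config that declares layer_types
    # but no sliding window gets a static cache; otherwise keep the default.
    if getattr(config, "layer_types", None) and getattr(config, "sliding_window", None) is None:
        return "static"
    return None  # leave the default cache selection untouched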
5f59a9b439 Fix configs and doc for the Qwens (#38808)
fix doc and configs
2025-06-13 11:10:55 +02:00
8222a9325d Fix erroneous docstring for the ordering of SWA layers (#38794) 2025-06-13 10:46:44 +02:00
e26ae89281 [docs] update cache docs with new info (#38775)
* update docs with new info

* Update docs/source/en/kv_cache.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-13 07:10:56 +00:00
324cc77dc3 refactor create_token_type_ids_from_sequences (#37681)
* rm build_input.. from old file

* refactor create_token_type_ids_from_sequences

* handle when cls_token_id is None

* updated fix

* markuplm

* refactoring rest of models

* copies

* revert funnel

* rm incorrect file

* ruff

* ruff
2025-06-12 23:24:43 +02:00
85f060e9b0 Updated moonshine modelcard (#38711)
* Moved the sources to the right

* small Changes

* Some Changes to moonshine

* Added the install to the pipeline

* updated the moonshine model card

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated documentation according to changes

* Fixed the model with the commits

* Update moonshine.md

* Update moshi.md

---------

Co-authored-by: Your Name <sohamprabhu@Mac.fios-router.home>
Co-authored-by: Your Name <sohamprabhu@Sohams-MacBook-Air.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-12 10:27:17 -07:00
645cf297cc Add missing div in Pegasus model card (#38773)
Add missing div
2025-06-12 10:27:07 -07:00
346f341630 [Docs] New DiT model card (#38721)
* documenation finished

* Update dit.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-12 10:26:50 -07:00
4b8ec667e9 Remove all traces of low_cpu_mem_usage (#38792)
* remove it from all py files

* remove it from the doc

* remove it from examples

* style

* remove traces of _fast_init

* Update test_peft_integration.py

* CIs
2025-06-12 16:39:33 +02:00
3542e0b844 build: 📌 Remove upper bound on PyTorch (#38789)
build: 📌 remove upper bound on torch dependency, as the fix for the issue that originally resulted in the pin has been released in torch 2.7.1
2025-06-12 16:34:13 +02:00
eea35a15b0 Fix mllama (#38704)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-12 16:15:35 +02:00
038a59e2cd Initialize flash attn flag (#38768)
_flash_supports_window_size is used further down in this file and relied on by e.g. [ring-flash-attention](https://github.com/zhuzilin/ring-flash-attention/blob/123f924/ring_flash_attn/adapters/hf_adapter.py#L9-L11). Even though it is an unexported name, it still makes sense to keep the state of `globals()` in this file consistent.
2025-06-12 14:06:13 +00:00
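A sketch of the initialization pattern described above: define the flag unconditionally, then refine it when flash-attn is importable, so external code probing this module's globals() always finds it.

```python
import inspect

# Always defined, so downstream importers never hit a NameError.
_flash_supports_window_size = False

try:
    from flash_attn import flash_attn_func

    _flash_supports_window_size = (
        "window_size" in inspect.signature(flash_attn_func).parameters
    )
except ImportError:
    pass  # flag stays False when flash-attn is absent
```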
910355a010 Fix Typos in Comments: "quantitation" → "quantization", "averege" → "average" (#38766)
* Update convert_llama4_weights_to_hf.py

* Update modeling_visual_bert.py
2025-06-12 14:04:39 +00:00
6a5fd0c6d2 Reword README in light of model definitions (#38762)
* Slight readme reword

* reword

* reword

* reword

* Slight readme reword
2025-06-12 14:43:31 +01:00
c87058beb8 Fix llava_onevision tests (#38791)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-12 15:06:49 +02:00
d4e7aa5526 Fix qwen_2_5 omni (#38658)
* fix

* fix

* break style

* break style

* Apply style fixes

* break style

* Apply style fixes

* fix modular

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-12 14:43:54 +02:00
e1812864ab [docs] Add int4wo + 2:4 sparsity example to TorchAO README (#38592)
* update quantization readme

* update

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-06-12 12:17:07 +00:00
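For reference, a hedged sketch of plain int4 weight-only loading via TorchAO (the checkpoint and group size are arbitrary; the 2:4 sparsity variant from the PR depends on the TorchAO version and is not shown here):

```python
import torch
from transformers import AutoModelForCausalLM, TorchAoConfig

quant_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # placeholder checkpoint
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quant_config,
)
```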
bc68defcac Update PULL_REQUEST_TEMPLATE.md (#38770) 2025-06-12 14:03:33 +02:00
960fda25d1 Reduce verbosity for average_tokens_across_devices=True and world size = 1 (#38785)
* Warning to info for average_tokens_across_devices and world size = 1

* Update src/transformers/training_args.py
2025-06-12 14:02:53 +02:00
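The setting in question, for context (output_dir is a placeholder): with a single process the flag is simply a no-op, which is why a warning was overkill.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",  # placeholder
    average_tokens_across_devices=True,  # no-op when world_size == 1
)
```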
89c46b648d Skip some export tests on torch 2.7 (#38677)
* skip

* fix

* better check

* Update import_utils.py

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-06-12 12:47:15 +02:00
27459025b8 [video processors] support frame sampling within processors (#38105)
* apply updates smolVLM (still needs workaround for chat template)

* add other models

* dump qwen omni for now, come back later

* port qwen omni from their impl

* wait, all qwens sample videos in same way!

* clean up

* make smolvlm backwards compatible and fix padding

* fix some tests

* fix smolvlm tests

* more clean up and test fixing

* delete unused arg

* fix

* address comments

* style

* fix test
2025-06-12 09:34:30 +00:00
887054c714 Fix masking utils (#38783)
* fix

* Update masking_utils.py

* Update masking_utils.py
2025-06-12 11:00:46 +02:00
7c58336949 [Hotfix] Fix style bot (#38779)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-12 10:20:36 +02:00
7c6b1707c3 [masking utils] check None instead of try/except (#38561)
* fix vllm's compile backend

* fix the test

* apply the same changes in other masking strategies
2025-06-12 06:50:28 +00:00
9487765f07 Add Qwen2 MoE model card (#38649)
* Add Qwen2 MoE model card

* Revisions to qwen2 moe model card

* Add Qwen2 MoE model card
2025-06-11 15:14:01 -07:00
32dbf4bddb Update altCLIP model card (#38306)
* Update altclip.md

* Update altclip.md

* Update altclip.md

* Update altclip.md

* Update altclip.md

* Update altclip.md

* Rename altclip.md to altclip.mdx

* Rename altclip.mdx to altclip.md

* Update altclip.md

* Update altclip.md

* Update altclip.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-11 14:48:34 -07:00
1dcb022e8f chore(pixtral): omit block attention mask when using flash attention (#38741)
* chore(pixtral): omit block attention mask when using flash attention

Since flash_attention_2 relies solely on position_ids, omitting the block attention mask avoids unnecessary memory usage and prevents OOM on large inputs.

* remove unnecessary attention_mask assignment
2025-06-11 18:55:23 +00:00
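A hedged sketch of the resulting control flow (the function and helper names are illustrative, not the PR's code): build the block mask only for attention backends that consume it.

```python
def build_pixtral_attention_mask(config, position_ids, make_block_mask):
    # flash_attention_2 recovers image-block boundaries from position_ids,
    # so materializing a dense block mask would only waste memory.
    if config._attn_implementation == "flash_attention_2":
        return None
    return make_block_mask(position_ids)  # eager/SDPA paths still need it
```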
60d4b35b20 Make style bot trigger CI after push (#38754)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-11 20:40:04 +02:00
bb44d2a0f6 Update pegasus model card (#38675)
* Update Pegasus model card

* Fix transformers-cli command

* Update code examples to use bfloat16

* Reverted code examples to use float16

* Fix typo, update checkpoints link

* Update str formatting in code examples

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix typo

* Remove inaccurate badges

* Revert badge removal

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Include cache_implementation argument in quantization example

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-11 10:56:25 -07:00
b84ebb7f3c fix(qwen3_moe): pass kwargs to self_attn (#38691)
This is needed to avoid `.item()` calls in `_flash_attention_forward`.
2025-06-11 19:26:08 +02:00
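A hypothetical sketch of the pass-through (the class is illustrative): forwarding **kwargs lets precomputed flash-attention arguments such as cumulative sequence lengths reach the attention call without data-dependent `.item()` synchronizations.

```python
import torch
from torch import nn

class DecoderLayerSketch(nn.Module):
    """Illustrative decoder layer; only the kwargs forwarding matters here."""

    def __init__(self, self_attn: nn.Module):
        super().__init__()
        self.self_attn = self_attn

    def forward(self, hidden_states: torch.Tensor, attention_mask=None, **kwargs):
        # **kwargs (e.g. cu_seq_lens) were previously dropped here, forcing
        # _flash_attention_forward to recompute them via .item() calls.
        return self.self_attn(hidden_states, attention_mask=attention_mask, **kwargs)
```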
9f563ada70 Deprecate TF + JAX (#38758)
* Scatter deprecation warnings around

* Delete the tests

* Make logging work properly!
2025-06-11 17:28:06 +01:00
337757cbd5 Update repo consistency check (#38763) 2025-06-11 17:02:03 +01:00
e2bdc13375 Remove IPEX requirement for bitsandbytes on CPU (#38594)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-11 17:46:34 +02:00
063bef0865 Prepare for TF+Jax deprecation (#38760)
* Prepare for TF+Jax deprecation

* Remove .circleci jobs
2025-06-11 16:03:31 +01:00
11ad9be153 Better typing for num_items_in_batch (#38728)
* fix

* style

* type checking ?

* maybe this ?

* fix

* can't be an int anymore

* fix
2025-06-11 16:26:41 +02:00
84710a4291 Add V-JEPA 2 (#38746)
* adding model and conversion scripts

* add imports to test vjepa conversion

* fix imports and make conversion work

* fix computation for short side

* replace attention with library attention function

* cleanup more attention classes

* remove config overrides

* add test cases, fix some of the failing ones

* fix the model outputs

* fix outputs of the model per review

* fix too big model test case

* fix styling __init__.py

* fix initialization test

* remove all asserts per review

* update sorting unsorting logic as per feedback

* remove is_video per review

* remove another is_video segment

* remove unwanted stuff

* small fixes

* add docstrings for the model

* revert adding vjepa2 config here

* update styling

* add config docstrings (wip)

* fix dpr issue

* removed test failing issues

* update styles

* merge predictor configs into main config

* remove processing code, add video processor

* remove permute which is not necessary now

* fix styles

* updated vjepa2 to be in video_processing_auto

* update comment for preprocessing

* test integration test and fix the outputs

* update test values, change test to look at repeated frames for a given image

* add a simple video processing test

* refactoring pixel_values_videos and upload ckpts to original

* fix torch_fx test cases

* remove unused config

* add all config docstrings

* add more integration tests

* add basic doc

* revert unwanted styling changes

* working make fixup

* Fix model_type in config

* update attention implementation to fit new hf standards

* fix the preprocessing logic, ensure it matches the original model

* remove use_rope logic, cleanup

* fix docstrings

* Further cleanup, update doc

* Fix model prefix

* fix get_vision_features

* VJEPA2Embeddings style refactor

* nit, style comment

* change modules default values

* Only `str` activation in config

* GradientCheckpointingLayer

* fixup

* fix conversion script

* Remove return_dict

* remove None return typehint

* Refactor VJEPA2Layer, remove use_SiLU

* Fix fx tests

* dpr -> drop_path_rates

* move *ModelOutput on top

* format docs bit

* update docs

* update docs

* update doc example

* remove prune_heads from model

* remove unused config params

* refactor embed signature

* Add vjepa to docs

* Fix config docstring

* update defaults

* Update docs/source/en/model_doc/vjepa2.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/vjepa2.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix import

* Min refactoring

* Update HUB_SOURCE and HUB_REPO in conversion script

* Add missing headers

* VJEPA -> V-JEPA in docs

* Add image to doc

* fix style

* fix init weights

* change checkpoint name in modeling tests

---------

Co-authored-by: Koustuv Sinha <koustuv.sinha@mail.mcgill.ca>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Koustuv Sinha <koustuvsinha@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2025-06-11 15:00:08 +01:00
a6f0e2b64a Add z-loss to Bamba for v2 (#37842)
* Remove const

* Fix arg ref

* Sharded save

* Add z_loss flag

* Add modeling zloss

* Demodularize clm forward for zloss

* Also demodularize init for z_loss flag

* PR comments (mostly modularizing right)

* Demodularize forward

* Better name zloss and explain typematch

* Fully propagate coeff name

* style fixes

* zloss default float

* Remove conflicting annotations

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-06-11 15:29:17 +02:00
6b610d89f1 Revert "Trigger doc-builder job after style bot" (#38735)
Revert "Trigger doc-builder job after style bot (#38398)"

This reverts commit 51e0fac29fc3994d49dfbfd1c8d085d29360d393.
2025-06-11 14:56:39 +02:00
0bf53e69e2 [DeepSeek-V3] implement when q_lora_rank is None (#38743)
* implement when q_lora_rank is None

* make style and quality
2025-06-11 13:35:10 +01:00
b426c2b313 fix: bf16 with TPU is allowed in configuration (#38670)
* fix: tpu bf16

* fix: style

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-11 12:35:01 +00:00
c8c1e525ed from 1.11.0, torchao.prototype.low_bit_optim is promoted to torchao.optim (#38689)
* since 1.11.0, torchao.prototype.low_bit_optim is promoted to
torchao.optim

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix review comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-11 12:16:25 +00:00
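A version-tolerant import sketch implied by the commit (the optimizer name is one example, torchao is assumed installed, and the release number in the title appears to mean torchao 0.11.0):

```python
try:
    # newer torchao: optimizers promoted out of prototype
    from torchao.optim import AdamW8bit
except ImportError:
    # older torchao releases
    from torchao.prototype.low_bit_optim import AdamW8bit

optimizer_cls = AdamW8bit
```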
56a7cf5546 fix: Add method to get image features in PaliGemmaForConditionalGeneration (#38730)
* fix: Add method to retrieve image features in PaliGemmaForConditionalGeneration

* feat: Add get_image_features method to multiple models for image feature extraction

* fix: reformat the files with ruff.

* feat: Add methods for packing and retrieving image and video features across multiple models

modified:
- modeling_chameleon.py
- modeling_llava_next.py
- modular_llava_next_video.py
- modeling_qwen2_vl.py

and generate the:
- modeling_llava_next_video.py
- modeling_llava_onevision.py
- modeling_qwen2_5_vl.py

* feat: Implement get_image_features method in Aria, Mistral3, and VipLlava models with updated parameters

* fix: reformatted the code with fix-style
2025-06-11 10:26:31 +00:00
380e6ea406 [llava] fix integration tests with Siglip (#38732)
fix llava siglip test
2025-06-11 08:09:16 +00:00
f1849eab22 Fixed a multiple-devices issue in SmolVLM model (#38736)
Fixed a multiple-devices issue in SmolVLMModel (#38557)

* Fixed a multiple-devices issue in SmolVLMModel

* Changed the modular to reflect changes
2025-06-11 10:08:01 +02:00
aa798b7ac9 New canine model card (#38631)
* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Commit for new_gpt_model_card.

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* commit for new canine model card.

* Update docs/source/en/model_doc/canine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/canine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/canine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/canine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/canine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/canine.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* implemented suggestion by @stevhliu.

* Update canine.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-10 09:30:05 -07:00
e28fb26e7d Add AGENTS.md (#38734)
* More name sync

* repeatedly underlining "WRITE LESS, ROBOT"

* fewer, commas, please

* Clarify "copied from"

* Clarify "copied from"

* Mention test dependencies

* Added a line on preferring `modular` style
2025-06-10 16:27:37 +00:00
cb4c56ce0d Fix typo in Language Modeling example scripts and update TPU type (#38652)
* Fix typo that prevented the examples from running correctly

* return DistributedType.TPU in the accelerator.distributed_type comparison
2025-06-10 13:43:35 +00:00
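The corrected comparison, sketched from the commit message (the enum lives in accelerate; note that newer accelerate releases rename the TPU member to XLA, so the member name here is version-dependent):

```python
from accelerate import Accelerator
from accelerate.utils import DistributedType

accelerator = Accelerator()
if accelerator.distributed_type == DistributedType.TPU:
    print("running on TPU")  # TPU-specific branch the examples rely on
```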
8ff22e9d3b [add-new-model-like] Robust search & proper outer '),' in tokenizer mapping (#38703)
* [add-new-model-like] Robust search & proper outer '),' in tokenizer mapping

* code-style: arrange the importation in add_new_model_like.py

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-10 12:25:12 +00:00
8340e8746e Use OSError (#38712)
Signed-off-by: cyy <cyyever@outlook.com>
2025-06-10 12:13:49 +00:00
8257734b5f Fix llava tests (#38722)
* update

* fix 1

* fix 2

* fix 3

* fix 4

* fix 5

* fix 6

* fix 7

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-10 13:53:17 +02:00
71f7385942 Logging message for `is_bitsandbytes_available()` (#38528)
* bnb import log

* bnb import log

* log message change

* moved error issue into quantizer_bnb_4_bit.py

* ruff

* arg added for bnb check

* required changes

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-10 10:15:01 +00:00
04cdf83244 Update some tests for torch 2.7.1 (#38701)
* fix 1

* fix 2

* fix 3

* fix 4

* fp16

* break

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-10 11:46:52 +02:00
afdb821318 Fix smart resize (#38706)
* Fix smart_resize bug

* Add smart_resize test

* Remove unnecessary error checking

* Fix smart_resize tests

---------

Co-authored-by: Richard Dong <rdong@rdong.c.groq-143208.internal>
2025-06-10 08:59:22 +00:00
81799d8b55 Standardize ByT5 model card format (#38699)
* Standardize ByT5 model card format

* Apply review feedback from @stevhliu

* Fix Notes formatting and wording

* Fix `aya_vision` test (#38674)

* fix 1: load_in_4bit=True,

* fix 2: decorateor

* fixfix 2: breakpoint

* fixfix 3: update

* fixfix 4: fast

* fixfix 5: cond

* fixfix 5: cond

* fixfix 6: cuda 8

* ruff

* breakpoint

* dtype

* a10

* a10

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix autodoc formatting for ByT5Tokenizer

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-09 15:02:50 -07:00
e55983e2b9 Fix aya_vision test (#38674)
* fix 1: load_in_4bit=True,

* fix 2: decorateor

* fixfix 2: breakpoint

* fixfix 3: update

* fixfix 4: fast

* fixfix 5: cond

* fixfix 5: cond

* fixfix 6: cuda 8

* ruff

* breakpoint

* dtype

* a10

* a10

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-09 22:18:52 +02:00
b61c47f5a5 Created model card for xlm-roberta-xl (#38597)
* Created model card for xlm-roberta-xl

* Update XLM-RoBERTa-XL model card with improved descriptions and usage examples

* Minor option labeling fix

* Added MaskedLM version of XLM RoBERTa XL to model card

* Added quantization example for XLM RoBERTa XL model card

* minor fixes to xlm roberta xl model card

* Minor fixes to mask format in xlm roberta xl model card
2025-06-09 13:00:38 -07:00
e594e75f1b Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout (#38596)
* Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout

* Added CLI command example and quantization example for XLM RoBERTa model card.

* Minor change to transformers CLI and quantization example for XLM roberta model card
2025-06-09 12:26:31 -07:00
29ca043856 Created model card for XLM model (#38595)
* Created model card for XLM model

* Revised model card structure and content of XLM model

* Update XLM model documentation with improved examples and code snippets for predicting <mask> tokens using Pipeline and AutoModel.
2025-06-09 12:26:23 -07:00
25f711aa89 Drop as_target_processor from the _call_ and pad methods (#38642)
Drop as_target_processor from _call_ and pad methods; reformat docstrings for readability
2025-06-09 12:26:09 -07:00
837ddac1ec Docs: update bitsandbytes torch.compile compatibility (#38651) 2025-06-09 14:51:57 -04:00
b9faf2f930 Fix TypeError: 'NoneType' object is not iterable for esm (#38667) (#38668)
Add post_init() calls to EsmForMaskedLM, EsmForTokenClassification and EsmForSequenceClassification.
2025-06-09 15:23:20 +00:00
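A hedged sketch of the fix pattern (the imports reach into internal modules and the class name is illustrative): head models should end __init__ with self.post_init() so weight initialization and tying actually run; skipping it can leave attributes unset, producing the TypeError in the title.

```python
from transformers.models.esm.modeling_esm import (
    EsmLMHead,
    EsmModel,
    EsmPreTrainedModel,
)

class EsmForMaskedLMSketch(EsmPreTrainedModel):
    """Illustrative only; mirrors the pattern the fix applies."""

    def __init__(self, config):
        super().__init__(config)
        self.esm = EsmModel(config, add_pooling_layer=False)
        self.lm_head = EsmLMHead(config)
        self.post_init()  # the call the commit adds
```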
11dca07a10 Fix retrieve function signature and remove faiss requirement (#38624)
Signed-off-by: Fiona Waters <fiwaters6@gmail.com>
2025-06-09 15:17:33 +00:00
b31d462c61 Fix some models import (#38694)
Fix models import
2025-06-09 16:09:24 +01:00
282d6684dc Fix attention mask expansion when converting to executorch (#38637) 2025-06-09 15:00:55 +00:00
19224c3642 fix: "check out" as verb (#38678)
"check out" as verb
2025-06-09 14:07:31 +00:00
237ff80387 Fixed modeling_auto.py MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variable (#38664)
fix: grouped the two MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variables
2025-06-09 13:40:46 +00:00
d7b87b415a Fix qwen2-audio chat template audio placeholder insertion (#38640)
* fix qwen2-audio template

Signed-off-by: Isotr0py <2037008807@qq.com>

* add message['type'] back

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-09 09:56:42 +00:00
10627c1a0f Use torch 2.7.1 on daily CI (#38620)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-08 14:37:45 +02:00
ebeec13609 Fix InternVL integration test (#38612)
* fix

* fix

* fix OOM

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-07 08:30:47 +02:00
3fb7e7bc01 Skip torchscript tests for 2 models (#38643)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-06 20:17:37 +02:00
dc76eff12b remove ipex_optimize_model usage (#38632)
* remove ipex_optimize_model usage

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update Dockerfile

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
2025-06-06 20:04:44 +02:00
5009252a05 Better CI (#38552)
better CI

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-06 17:59:14 +02:00
2e889c18e1 fix torch_dtype on awq (#38463)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-06 17:14:00 +02:00
871901cb3d fix total batch size calculation in trainer (#38286)
* fix total batch size calculation

* update

Signed-off-by: inkcherry <mingzhi.liu@intel.com>

* Update src/transformers/trainer.py

---------

Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-06 14:54:00 +00:00
02f946a038 Don't run AriaForConditionalGenerationModelTest on CircleCI (#38615)
get rid of this model

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-06 11:30:31 +02:00
3d15606e64 fix: support grad clipping for TP through replicating non-sharded modules (#36132)
* feat: fix tp grad norm:

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* feat: use implicit replication

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

---------

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-06 11:07:22 +02:00
fca6748246 Improve test_initialization for SwiftFormer (#38636)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-06 10:47:10 +02:00
92a87134ea update ColQwen2ModelIntegrationTest (#38583)
* update

* update

* update

* update

* 4 bit

* 8 bit

* final

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-06 10:41:17 +02:00
dbfc79c17c [generation] bring back tests on vision models (#38603)
* bring back generation tests on VLMs

* remove head mask tests overwritten
2025-06-06 08:23:15 +00:00
90c4b90a10 Use torch 2.7.1 on CircleCI jobs (#37856)
2.7.1

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-06 10:16:57 +02:00
3e35ea1782 Improve test_initialization (#38607)
* fix flaky init tests

* fix flaky init tests

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-06 10:08:05 +02:00
89542fb81c enable more test cases on xpu (#38572)
* enable glm4 integration cases on XPU, set xpu expectation for blip2

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* refine wording

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* refine test case names

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* run

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* add gemma2 and chameleon

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix review comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-06-06 09:29:51 +02:00
31023b6909 Fix MiniMax (docs and integration tests checkpoint) (#38575)
* update checkpoints for integration tests

* minor fixes in docs
2025-06-06 08:43:11 +02:00
593e29c5e2 Updated Aria model card (#38472)
* Update aria.md

* Update aria.md

* Suggested Updates - aria.md
2025-06-05 14:36:54 -07:00
77cf4936fe [Nit] Add Note on SigOpt being in Public Archive Mode (#38610)
* add note on sigopt

* update

* Update docs/source/en/hpo_train.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-05 14:07:23 -07:00
c75bf2c36e Fix typo in LLaVa documentation (#38618)
* Fix typo in LLaVa documentation

In exactly one section, LlavaImageProcessor was spelt wrongly as LLavaImageProcessor, which throws off copy-pasting the section.

* Fix LlavaImageProcessor url to make it valid (and copypaste-able)

Earlier, the URL contained the entire HF prefix. This commit removes that to ensure that the code block can be copied and run as is.
2025-06-05 13:25:07 -07:00
5399c1d670 docs: fix dark mode logo display. (#38586) 2025-06-05 13:06:59 -07:00
481b953170 Fix return_dict=False giving errors in a few VLM models (#38519)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-05 21:19:07 +02:00
88912b8e95 Remove isort from dependencies (#38616)
Removed isort as a dependency
2025-06-05 16:42:49 +00:00
fa921ad854 fix spelling errors (#38608)
* fix errors test_modeling_mllama.py

* fix error test_modeling_video_llava.py

* fix errors test_processing_common.py
2025-06-05 13:57:23 +01:00
0f833528c9 Avoid overwriting existing local implementation when loading remote custom model (#38474)
* avoid overwriting existing local implementation when loading custom remote model

Signed-off-by: Isotr0py <2037008807@qq.com>

* update comments

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-05 13:54:40 +01:00
8f630651b0 Allow mlm_probability to be set to None when mlm=False in DataCollatorForLanguageModeling (#38522) (#38537)
* mlm_probability in DataCollatorForLanguageModeling should be validated only when mlm is True (#38522)

* Change mlm_probability to Optional in DataCollatorForLanguageModeling (#38537)

---------

Co-authored-by: eak <eak@ivalua.com>
2025-06-05 13:54:12 +01:00
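A minimal usage sketch of the relaxed validation: with mlm=False the collator prepares causal-LM labels, so mlm_probability=None is now accepted instead of raising.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 ships without a pad token

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False, mlm_probability=None)
batch = collator([tokenizer("hello world"), tokenizer("hi")])
print(batch["labels"].shape)
```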
65f5fa71cd Bump torch from 2.6.0 to 2.7.1 in /examples/flax/vision (#38606)
Bumps [torch](https://github.com/pytorch/pytorch) from 2.6.0 to 2.7.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.6.0...v2.7.1)

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.7.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-05 13:38:02 +01:00
8c59cdb3f8 pin pandas (#38605)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-05 11:33:06 +02:00
8cfcfe58c0 Remove custom pytest and pluggy (#38589)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-05 10:23:40 +02:00
0d69fa6dcd [qwen-omni] fix sliding window (#38525)
fix
2025-06-05 10:11:58 +02:00
1fed6166c0 added fast image processor for ZoeDepth and expanded tests accordingly (#38515)
* added fast image processor for ZoeDepth and expanded tests accordingly

* added fast image processor for ZoeDepth and expanded tests accordingly, hopefully fixed repo consistency issue too now

* final edits for zoedepth fast image processor

* final minor edit for zoedepth fast image processor
2025-06-04 22:59:17 +00:00
a510be20f3 Updated deprecated typing imports with equivalents for Python 3.9+ (#38546)
* Replace deprecated typing imports with collections.abc equivalents for Python 3.9+

* Fixed code quality

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-06-04 16:57:23 +00:00
8e1266de2b New gpt neo model card (#38505)
* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Commit for new_gpt_model_card.

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/gpt_neo.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-04 09:56:47 -07:00
8046aff520 tests/roformer: fix couple roformer tests on gpus (#38570)
Fix "RuntimeError: Expected all tensors to be on the same device,
but found at least two devices, cuda:0 and cpu" error running the
following roformer tests on GPUs (CUDA or XPU):

```
tests/models/roformer/test_modeling_roformer.py::RoFormerSinusoidalPositionalEmbeddingTest::test_basic
tests/models/roformer/test_modeling_roformer.py::RoFormerSelfAttentionRotaryPositionEmbeddingTest::test_apply_rotary_position_embeddings
```

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-06-04 18:45:56 +02:00
b9c17c5dc0 [Dinov2] Enable device_map="auto" support (#38487)
* Fix: resolve import order and duplicate import (ruff I001, F811)

* Format: clean up Dinov2 test file with ruff formatter

* Add _no_split_modules = ['Dinov2Layer'] to enable device_map='auto'

* Revert dinov2_with_registers _no_split_modules to original state

* Remove redundant device_map test as suggested

* Remove unused import after deleting test

* removed the torch import and the redundant test function

* Update tests/models/dinov2/test_modeling_dinov2.py

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-06-04 15:42:40 +00:00
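The mechanism this change leans on, sketched below (the surrounding class is illustrative; only the attribute matters): modules named in _no_split_modules are never sharded across devices by accelerate's device_map="auto" planner.

```python
from transformers import PretrainedConfig, PreTrainedModel

class BackboneSketchPreTrainedModel(PreTrainedModel):
    """Illustrative base class; only the attribute below is the point."""

    config_class = PretrainedConfig
    base_model_prefix = "backbone"
    # keep each transformer block whole on a single device
    _no_split_modules = ["Dinov2Layer"]
```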
ae3733f06e feat: add repository field to benchmarks table (#38582)
* feat: add `repository` field to benchmarks table

* fix: remove unwanted `,`
2025-06-04 15:40:52 +02:00
1285aec4cc Docs: fix code formatting in torchao docs (#38504) 2025-06-04 12:35:21 +00:00
6c5d4b1dd2 allow custom head_dim for qwen2_moe (#37188)
allow custom head_dim

Co-authored-by: ryan.agile <ryan.agile@kakaobrain.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-06-04 12:27:30 +00:00
82fa68ca14 fix(attention_visualizer): add default value for image_seq_length (#38577) 2025-06-04 12:20:31 +00:00
1dc619e59f [FlexAttn] Fix models with unique characteristics (#38433)
* fix

* style

* check

* check 2

* add deepseek workaround
2025-06-04 13:37:28 +02:00
ff3fad61e3 Fix deepseekv3 (#38562)
* fix 1

* fix 2

* fix 3

* fix 4

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-04 11:40:14 +02:00
6085cded38 update utils/notification_service.py for AMD vs Nvidia (#38563)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-04 11:38:25 +02:00
3c995c1fdc Fix chameleon tests (#38565)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-04 10:13:35 +02:00
55736eea99 Add support for MiniMax's MiniMax-Text-01 (#35831)
* end-to-end architecture

* lightning-attn: refactor, clean, optimize

* put minimax_text_01 in other files

* use latest __init__ standards and auto-generate modular

* support attention_mask for lightning-attn

* Revert "use latest __init__ standards and auto-generate modular"

This reverts commit d8d3c409d89e335c98a8cd36f47304a76eac7493.

* fix modular conversion

* pass both attention masks instead of tuple

* formatting

* Updated Dynamic Cache

* created MiniMaxText01Cache

* fix hardcoded slope_rate

* update attn_type_list in config

* fix lightning when use_cache=False

* copy tests from mixtral

* (checkpoint) all tests pass for normal attention

* fix all unittests

* fix import sorting

* fix consistency and formatting tests

* fix config

* update tests, since changes in main

* fix seq_len error

* create dummy docs

* fix checkpoint

* add checkpoint in config docstring

* run modular_conversion

* update docs

* fix checkpoint path and update tests

* fix ruff

* remove repeated expected_slice

* update docs

* rename "minimax-text-01" to "minimax"

* inherit config from mixtral

* remove from docs in other languages

* undo files that should be untouched

* move minimax to end in conversation docs

* use MiniMaxForCausalLM as it is

* ruff fixes

* run modular

* fix docstring example in causallm

* refactor attention loop and decay factors

* refactor config in modular

* run modular

* refactor cache

* rename static_cache to linear_cache

* make positional embeddings necessary

* remove unnecessary layernorms declarations

* fix import in tests

* refactor attention in next tokens

* remove outdated code

* formatting and modular

* update tests

* rename layernorm alpha/beta factors

* register decay factors as buffers

* remove unused declarations of decay factors

* update config for alpha/beta factors

* run modular

* remove head_dim in tests

* remove minimax from fx.py

* remove stuff that is not really needed

* update __init__

* update qkv torch.split

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* fix qkv torch.split

* quality fixes

* remove mistakenly added dummy

* purge unused ModelTester code

* fix-copies

* run fix-copies

* fix head_dim

* write cache formatting tests

* remove postnorm

* avoid contiguous in attention current states

* update expected_slice

* add generation test for integration

* fix dtype in generation test

* update authors

* update with changes in main

* update gradient checkpointing and minor fixes

* fix mutable attn_type_list

* rename: attn_type -> layer_type

* update for layer_types

* update integration tests

* update checkpoint

* clean overview in docs

---------

Co-authored-by: Shakib-IO <shakib.khan17@northsouth.edu>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-06-04 09:38:40 +02:00
037acf1d10 [janus] Fix failing tests on mi3XX (#38426)
* Fix multiple devices error on Janus

* Fix AttributeError on Janus BOI token

* Initialize lm first in Janus to get correct device map

* Added expectations for Janus test_model_generate_images

* Fixed JanusVisionEncoderLayer being split across devices

* Code formatting

* Adding modeling file

* Reverted changes out of scope for this PR
2025-06-04 09:38:10 +02:00
78d771c3c2 [docs] Format fix (#38414)
fix table
2025-06-03 09:53:23 -07:00
0f41c41a46 Fix hqq issue (#38551)
* bc

* style
2025-06-03 17:58:31 +02:00
279000bb70 Name change AOPermod -> ModuleFqn (#38456)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-06-03 15:43:31 +00:00
e8b292e35f Fix utils/notification_service.py (#38556)
* fix

* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-03 13:59:31 +00:00
8cb96787a6 Explicitly setting encoding in tokenization_utils_base.py (#38553)
Update tokenization_utils_base.py

Add encoding explicitly
2025-06-03 12:08:35 +00:00
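
A small illustrative sketch (hypothetical function, assuming the usual motivation for such a change): passing `encoding` explicitly to `open()` avoids depending on the platform's default locale encoding:

```python
import json

# Hypothetical loader; the point is the explicit encoding argument.
def load_vocab(vocab_file: str) -> dict:
    # Without encoding="utf-8", Windows may default to cp1252 and
    # mis-decode vocabulary files.
    with open(vocab_file, encoding="utf-8") as f:
        return json.load(f)
```
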
caf708da1b [TP] Change command in tests to python3 (#38555)
* Fix: change to `python3`

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-03 11:03:33 +00:00
fdf86fb440 [bugfix] [WIP] fix apply_rotary_emb error on Ascend NPU (#38491)
[bugfix] fix apply_rotary_emb error on Ascend NPU
2025-06-03 09:31:49 +00:00
ca0a682796 Update docker image to use av (#38548)
* Update

* Update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-03 11:04:41 +02:00
814432423c update emu3 test (#38543)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-06-03 11:02:01 +02:00
55ec319de6 Don't use default attn if pre-set in sub-config (#38526)
* don't use default attn if pre-set in sub-config

* style

* add a test maybe
2025-06-03 07:53:07 +00:00
bf68dd9e6e [tests] expand flex-attn test for vision models (#38434)
* expand the test for VLMs

* typo

* mark models `supports_flex` + expand test for additional kwargs

* flex attn for refactored vision models

* fix copies

* fix

* unskip

* style

* address comments
2025-06-03 07:40:44 +00:00
de4cf5a38e Fix blip2 tests (#38510)
* fix 1: not sure

* fix 2: _supports_flex_attn = False

* fix 3: embedding_output = self.layernorm(query_embeds.to(self.layernorm.weight.dtype))

* fix 4: query_embeds = query_embeds.to(self.layernorm.weight.dtype)

* fix 5: text_embeds = text_embeds.to(dtype=torch.float16)

* fix 5: question_embeds.to(dtype=torch.float16)

* fix 6: text_embeds = text_embeds.to(dtype=self.itm_head.weight.dtype)

* fix 7: image_embeds and question_embeds

* fix 8: fix other 2 fp16 tests

* fix 9: fix T5 OOM

* fix 10: fix T5 OOM

* fix 11: fix T5

* fix 11: fix T5 beam

* fix 12: _supports_sdpa=False

* fix 12: style and expect

* revert

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-02 22:46:35 +02:00
ccc859620a Fix Gemma2IntegrationTest (#38492)
* fix

* fix

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* skip-ci

* update

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-02 22:45:09 +02:00
1094dd34f7 Remove type annotation in Siglip Attention Module (#38503)
* Remove type annotation

* remove print statement
2025-06-02 17:51:07 +02:00
afb35a10ed Num parameters in model.safetensors.index.json (#38531)
Num parameters in index.json
2025-06-02 17:16:31 +02:00
cceab972ba [flax/mistral] support sliding_window: null in config (#37402)
flax/mistral: Allow sliding_window to be set to none
2025-06-02 16:45:02 +02:00
1a25fd2f6d Fix amp deprecation issue (#38100)
apex amp is deprecated
2025-06-02 16:15:41 +02:00
05ad826002 remove unhandled parameter (#38145) 2025-06-02 15:57:32 +02:00
c72ba69441 Add ColQwen2 to 🤗 transformers (#35778)
* feat: add colqwen2 (wip)

* tests: fix test_attention_outputs

* tests: reduce hidden size to accelerate tests

* tests: fix `test_attention_outputs` 🥳

* fix: fix wrong parent class for `ColQwen2ForRetrievalOutput`

* fix: minor typing and style changes

* chore: run `make style`

* feat: remove redundant `max_num_visual_tokens` attribute in `ColQwen2Processor`

* tests: tweak comments

* style: apply ruff formatter

* feat: move default values for `visual_prompt_prefix` and `query_prefix`

* docs: update ColQwen2 model card

* docs: tweak model cards

* docs: add required example config checkpoint

* tests: update expected scores in integration test

* docs: tweak quickstart snippets

* fix: address PR comments

* tests: fix colqwen2 tests + tweak comment in colpali test

* tests: unskip useful tests

* fix: fix bug when `visual_prompt_prefix` or `query_prefix` is an empty string

* fix: fix ColPali outputs when `return_dict == False`

* fix: fix issue with PaliGemma output not being a dict

* docs: set default dtype to bfloat16 in quickstart snippets

* fix: fix error when `return_dict=False` in ColPali and ColQwen2

* tests: fix special tokens not being replaced in input_ids

* style: fix lint

* fix: `ColQwen2Processor`'s `padding_side` is now set from `processor_config.json`

* fix: remove unused `padding_side` in ColQwen2 model

* docs: update ColQwen2's model doc

* fix: fix hardcoded vlm backbone class in ColQwen2Config

* fix: remove `padding_side` from ColQwen2Processor as it should be fed from kwargs

* docs: fix typo in model docstring

* docs: add illuin mention in model docs

* fix: let `padding_side` be handled by `tokenizer_config.json`

* docs: add colpali reference url in colqwen2's model doc

* docs: add Hf mention in model docs

* docs: add late interaction mention in model docs

* docs: tweak colqwen2 model doc

* docs: update reference checkpoint for ColPali to v1.3

* docs: simplify quickstart snippets

* docs: remove redundant `.eval()`

* refactor: use `can_return_tuple` decorator for ColPali and ColQwen2

* docs: fix copyright date

* docs: add missing copyright in tests

* fix: raise error when `initializer_range` is not in config

* docs: remove redundant `.eval()` in colpali doc

* fix: fix `get_text_config` now that Qwen2VL has a proper `text_config` attribute

See https://github.com/huggingface/transformers/pull/37268 for details about changes in Qwen2VL's config.

* fix: add missing `initializer_range` attribute in `ColQwen2Config`

* fix: use `get_text_config` in `resize_token_embeddings`

* update colqwen2 with auto_docstring

* docs: fix wrong copyright year

* chore: remove `raise` as `initializer_range` has a default value in `ColQwen2Config`

* refactor: merge `inner_forward` into `forward`

* Refactor colqwen2 after refactoring of qwen2VL, use modular for modeling code

* protect torch import in modular to protect in processing

* protect torch import in modular to protect in processing

* tests: fix hf model path in ColQwen2 integration test

* docs: clarify `attn_implementation` and add comments

* docs: add fallback snippet for using offline PIL dummy images

* docs: temporarily revert attn_implementation to `None` while sdpa is not fixed

* docs: tweaks in colpali/colqwen2 quick start snippets

* fix: add missing flags to enable SDPA/Flex Attention in ColQwen2 model

* fix: add missing changes in modular file

* fix modeling tests

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-06-02 12:58:01 +00:00
beaed8ce01 [generate] move SinkCache to a custom_generate repo (#38399)
remove sink cache
2025-06-02 12:13:30 +02:00
fe5bfaa4b5 [generate] add soft deprecations on custom generation methods (#38406)
soft deprecations
2025-06-02 12:11:46 +02:00
a75b9ffb5c Update Loss Functions to Accept Tensor num_items_in_batch (#38029)
* Update Loss Functions to Accept Tensor num_items_in_batch

* Fix device mismatch by moving num_items_in_batch to loss device in fixed_cross_entropy

* fix the ruff check

* delete the unused if statement

* fix the type problem
2025-06-02 11:31:44 +02:00
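
A minimal sketch of the device fix described in the bullets above; the exact signature of `fixed_cross_entropy` is assumed here:

```python
import torch
import torch.nn.functional as F

def fixed_cross_entropy(logits, labels, num_items_in_batch=None, ignore_index=-100):
    # Sum per-token losses, then normalize by the global token count.
    loss = F.cross_entropy(logits, labels, ignore_index=ignore_index, reduction="sum")
    if num_items_in_batch is not None:
        # The fix: the count may now be a tensor living on another device,
        # so move it to the loss device before dividing.
        if torch.is_tensor(num_items_in_batch):
            num_items_in_batch = num_items_in_batch.to(loss.device)
        loss = loss / num_items_in_batch
    return loss
```
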
493cf1554b [seamless_m4t] Skip some tests when speech is not available (#38430)
* Added the require_speech decorator

* Added require_speech to some seamless_m4t tests

* Changed skip message
2025-06-02 09:17:28 +00:00
64d14ef28d Fix setting FLASH_ATTENTION_DETERMINISTIC after importing (#37185)
transformers.enable_full_determinism enables deterministic
flash attention using `FLASH_ATTENTION_DETERMINISTIC`
800510c67b/src/transformers/trainer_utils.py (L79)

However, the current check uses a global variable `deterministic_g`,
which reads the environment variable as soon as the module is imported.
This causes issues because users may call
`transformers.enable_full_determinism` after
`transformers.modeling_flash_attention_utils` has already been imported.
This behavior was introduced in
https://github.com/huggingface/transformers/pull/33932/files#r1806668579
to fix a graph break.

As a result, this PR fixes the issue by delaying the environment variable
check until the first time `_flash_attention_forward` is executed, so the
bug is fixed without reintroducing a graph break.

Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-06-02 11:08:20 +02:00
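
A minimal sketch of the deferred check, assuming a hypothetical helper name; `deterministic_g` and `FLASH_ATTENTION_DETERMINISTIC` come from the commit text:

```python
import os

_deterministic_g = None  # no longer resolved at import time

def flash_attn_is_deterministic() -> bool:
    # Read the env var on first use instead of at import, so calling
    # enable_full_determinism() after the import still takes effect,
    # and the cached value avoids a graph break on later calls.
    global _deterministic_g
    if _deterministic_g is None:
        _deterministic_g = os.environ.get("FLASH_ATTENTION_DETERMINISTIC", "0") == "1"
    return _deterministic_g
```
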
fde1120b6c Remove deprecated use_flash_attention_2 parameter (#37131)
Signed-off-by: cyy <cyyever@outlook.com>
2025-06-02 11:06:25 +02:00
51d732709e [docs] add xpu environment variable for gpu selection (#38194)
* squash commits

* rename gpu

* rename accelerator

* change _toctree.yml

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: sdp <sdp@a4bf01943ff7.jf.intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-05-30 16:05:07 +00:00
c7f2b79dd8 protect dtensor import (#38496)
protect
2025-05-30 17:36:00 +02:00
051a8acc9a Align TP check (#38328)
align tp check
2025-05-30 17:15:39 +02:00
e0545ef0b8 [Tests] Reduced model size for albert-test model (#38480)
* Reduced model size for albert-test model

* Run checks

* Removed test_save_load

* Removed test skipping functions
2025-05-30 14:22:32 +00:00
f962c862ff Bump torch from 2.2.0 to 2.6.0 in /examples/flax/vision (#37618)
Bumps [torch](https://github.com/pytorch/pytorch) from 2.2.0 to 2.6.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.2.0...v2.6.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.6.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-30 14:04:52 +01:00
98568d1e25 Fix incorrect bbox_embed initialization when decoder_bbox_embed_share=False in GroundingDINO (#38238)
* A shallow copy in groundingdino
Fixes #37333

* Remove an empty line in the GroundingDinoForObjectDetection class

* Translate comments in the GroundingDinoForObjectDetection class from French to English
2025-05-30 15:02:18 +02:00
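
An illustrative sketch of the bug class this fixes (names are placeholders, not the actual GroundingDINO code): building a per-layer head list from a single module object shares its parameters unless each entry is copied:

```python
import copy
import torch.nn as nn

bbox_embed = nn.Linear(256, 4)  # stand-in for the bbox regression head
num_decoder_layers = 6

# Buggy when heads must be independent: every entry is the same module.
shared = nn.ModuleList([bbox_embed for _ in range(num_decoder_layers)])

# What decoder_bbox_embed_share=False requires: one copy per layer.
independent = nn.ModuleList(
    [copy.deepcopy(bbox_embed) for _ in range(num_decoder_layers)]
)
```
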
d0fccbf7ef Fix convert_internvl_weights_to_hf.py to support local paths (#38264)
fix(internvl): add local path support to convert_internvl_weights_to_hf.py
2025-05-30 14:56:32 +02:00
858ce6879a make it go brrrr (#38409)
* make it go brrrr

* date time

* update

* fix

* up

* uppp

* up

* no number i

* update

* fix

* [paligemma] fix processor with suffix (#38365)

fix pg processor

* [video utils] group and reorder by number of frames (#38374)

fix

* Fix convert to original state dict for VLMs (#38385)

* fix convert to original state dict

* fix

* lint

* Update modeling_utils.py

* update

* warn

* no verbose

* final

* ouft

* style

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-05-30 11:19:42 +02:00
ab5067e7fd fix: handle no scheduler passed by user (#38407) 2025-05-30 11:00:44 +02:00
42ef218b58 [Qwen2.5-Omni] Fix dtype of cos,sin when used with flash attention (#38453)
* Fix dtype of cos,sin when used with flash attention

* Fix dtype of cos,sin when used with flash attention
2025-05-29 18:24:40 +00:00
81cff7ad34 Fix Gemma3IntegrationTest (#38471)
* check

* check

* check

* check

* check

* check

* check

* test style bot

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-29 16:51:12 +02:00
e508965df7 Cleanup BatchFeature and BatchEncoding (#38459)
* Use dict comprehension to create dict

* Fix type annotation

Union[Any] doesn't really make any sense

* Remove methods that are already implemented in the `UserDict` parent
class
2025-05-29 14:13:43 +00:00
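
A tiny illustration of the cleanup rationale (hypothetical stand-in class): a `UserDict` subclass already inherits the dict protocol, so redefining those methods is redundant:

```python
from collections import UserDict

class BatchFeatureLike(UserDict):  # hypothetical stand-in for BatchFeature
    pass

bf = BatchFeatureLike({"input_ids": [101, 102]})
assert list(bf.keys()) == ["input_ids"]  # inherited from UserDict
assert "input_ids" in bf                 # __contains__ inherited too
```
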
8e5cefcb1e Fix TypeError in save_pretrained error handling (fixes #38422) (#38449) 2025-05-29 13:58:16 +00:00
ad9dd3d17b 🔴 [VLM] modeling updates (#38317)
* updates

* fixup

* fix tests

* fix test

* fix

* let it be here for now, till monday

* two more fixes

* persimmon

* fixup

* fix

* fixup

* make sure fuyu runs now that LM has new attn API

* fixup + tests

* qwen vl uses new mask interface as well

* qwen image features format

* update

* remove image_sizes

* address comments

* i am dumb...
2025-05-29 11:08:23 +00:00
a6f7acb603 [Tests] Clean up test cases for few models (#38315)
* Update tests

* revert aria change

* too slow hence revert
2025-05-29 08:21:28 +00:00
8010f3cf61 feat: add cache retention for requests (#38446)
* feat: add cache retention for requests

* fix: propagate `manual_eviction` param & refactor `finish_request`

`finish_request` now only takes `request_id: str` as an input rather
than the full `RequestState`, which was not needed and simplifies
calling from `ContinuousBatchingManager::evict_request_from_cache`

* refactor: pop req from `active_requests`

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-28 18:15:10 +00:00
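
A hedged sketch of the refactor described above; class and attribute names follow the commit text, but the bodies are illustrative:

```python
class ContinuousBatchingManager:
    def __init__(self):
        self.active_requests = {}  # request_id -> request state

    def evict_request_from_cache(self, request_id: str):
        # Callers now pass only the id, not the full RequestState.
        self.finish_request(request_id)

    def finish_request(self, request_id: str):
        # Pop the request from active_requests; in the real code this is
        # also where its cache blocks would be released.
        self.active_requests.pop(request_id, None)
```
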
66da700145 Fix GLM4 checkpoints (#38412)
* fix

* fix

* fix

* fix

* fix

* fix

* test style bot

* Apply style fixes

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-28 16:40:08 +00:00
2872e8bac5 Merge type hints from microsoft/python-type-stubs (post dropping support for Python 3.8) (#38335)
* Merge type hints from microsoft/python-type-stubs (post Python 3.8)

* Remove mention of pylance

* Resolved conflict

* Merge type hints from microsoft/python-type-stubs (post Python 3.8)

* Remove mention of pylance

* Resolved conflict

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Avasam <samuel.06@hotmail.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-05-28 16:21:40 +00:00
942c60956f Model card for mobilenet v1 and v2 (#37948)
* doc: #36979

* doc: update hfoptions

* add model checkpoints links

* add model checkpoints links

* update example output

* update style #36979

* add pipeline tags

* improve comments

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggested changes

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-28 09:20:19 -07:00
9a8510572b Updated the model card for ViTMAE (#38302)
* Update vit_mae.md

* badge float:right

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vit_mae.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update model_doc/vit_mae.md

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-28 09:19:43 -07:00
c9fcbd5bf9 Updated the Model docs - for the ALIGN model (#38072)
* Updated the Model docs - for the ALIGN model

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated align.md

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/align.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update align.md

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-28 09:19:09 -07:00
cba94e9272 Fix handling of slow/fast image processors in image_processing_auto.py (#38161)
Fix wrong error when torchvision is not installed
2025-05-28 16:00:23 +00:00
21b10d9aa4 Fix from_args_and_dict ProcessorMixin (#38296)
* fix-from-args-and-dict-processormixin

* change used_kwargs to valid_kwargs

* remove manual valid_kwargs

* fix copies

* fix modular aria
2025-05-28 11:46:33 -04:00
f844733568 Fix MoE gradient test (#38438) 2025-05-28 16:44:20 +01:00
0ed6f7e6b4 Remove redundant test_sdpa_equivalence test (#38436)
* Remove redundant test

* make fixup
2025-05-28 17:22:25 +02:00
51e0fac29f Trigger doc-builder job after style bot (#38398)
* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-28 17:15:34 +02:00
c24d18bbae Fix convert weights for InternVL (#38233)
Fix internvl convert weights
2025-05-28 11:14:56 -04:00
8850427242 Fix typo in tokenization_utils_base.py docstring (#38418)
Fix typo in tokenization_utils_base.py
2025-05-28 14:52:10 +00:00
bab40c6838 [core] support tensor-valued _extra_state values in from_pretrained (#38155)
Support tensor-valued _extra_state values

TransformerEngine uses the pytorch get/set_extra_state API to store FP8
layer config information as bytes Tensor in the _extra_state entry in
the state dict. With recent changes to from_pretrained, this
functionality has broken and loading a model that uses this API doesn't
appear to work. This PR fixes the save/load pretrained functions for
extra state entries that use a pytorch tensor, and adds a (currently
x-failing) test for a dictionary extra state.

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
2025-05-28 15:38:42 +02:00
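
A minimal sketch of the PyTorch extra-state hooks the commit message refers to (the layer below is a hypothetical TransformerEngine-style example, not its actual code):

```python
import torch
import torch.nn as nn

class FP8LinearLike(nn.Linear):
    def get_extra_state(self):
        # Serialize layer config as a bytes tensor; it is saved in the
        # state dict under an `_extra_state` key.
        payload = bytearray(b'{"fp8_recipe": "hybrid"}')  # illustrative
        return torch.frombuffer(payload, dtype=torch.uint8)

    def set_extra_state(self, state):
        # Restore the config from the tensor on load.
        self.fp8_config = bytes(state.tolist()).decode("utf-8")
```
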
badc71b9f6 🔴[Attention] Attention refactor for Whisper-based models (#38235)
* start refactoring whisper

* revert for now

* first step

* carry over attn fixes

* check if this works

* whisper has an off by one somewhere - cutting mask in any interface

* make it based on interface

* remove some tests that were skipped but now work

* some fixes for whisper tests

* interface changes

* change the order of fix

* some attention adjustments for eager + TP

* fix scaling

* mask changes

* why does whisper contain those extra seq lens?

* fix from config for fa2 as input_ids is invalid

* fix another test

* another fix

* disable flex attn due to compile issues

* copies and refactor for qwen audio since it somewhat relies on whisper

* fix scaling and smaller things

* retrigger

* new new interface version + more fixups

* adjust qwen

* add comment

* forgot this one

* change copies as whisper cuts on the mask

* add guard

* add flex attention

* switch to new mask function + add skips for torchscript

* remove old api with cache position

* last changes?

* trigger ci
2025-05-28 13:32:38 +02:00
565a0052ed make Llama4TextMoe forward more readable (#37529)
* update forward of Llama4TextMoe

* remove redundant transpose

* fix formatting

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-28 11:54:45 +02:00
defeb04299 Fix CircleCI not triggered when PR is opened from a branch of huggingface/transformers (#38413)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-28 11:25:43 +02:00
593276fe1e Update error when using additional and/or masks (#38429)
update error
2025-05-28 11:08:49 +02:00
3aab6e95cb Disable mi210 scheduled CI (#38411) 2025-05-28 10:35:41 +02:00
fb82a98717 enable large_gpu and torchao cases on XPU (#38355)
* cohere2 done

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* enable torchao cases on XPU

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* rename

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix comments

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: Matrix YAO <matrix.yao@intel.com>
2025-05-28 10:30:16 +02:00
cea254c909 Update CsmForConditionalGenerationIntegrationTest (#38424)
* require_read_token

* ruff

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-28 10:20:43 +02:00
baddbdd24b [qwen-vl] Look for vocab size in text config (#38372)
fix qwen
2025-05-28 09:32:26 +02:00
a974e3b4e1 Fix an error in verify_tp_plan for keys without '.' (#38420) 2025-05-28 09:30:43 +02:00
b1eae943a2 Change slack channel for mi250 CI (#38410) 2025-05-28 09:20:34 +02:00
5f49e180a6 Add mi300 to amd daily ci workflows definition (#38415) 2025-05-28 09:17:41 +02:00
3b3ebcec40 Updated model card for OLMo2 (#38394)
* Updated OLMo2 model card

* added command line

* Add suggestions

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added suggestions

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Indented code block as per suggestions

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-27 16:24:36 -07:00
f5307272f5 Falcon-H1 - Fix auto_docstring and add can_return_tuple decorator (#38260)
Fix auto_docstring and add can_return_tuple
2025-05-27 16:18:05 -04:00
a092f6babf Update granite.md (#37791)
* Update granite.md

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update granite.md

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/granite.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* minor fixes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-27 12:55:15 -07:00
be7aa3210b New bart model card (#37858)
* Modified BART documentation wrt issue #36979.

* Modified BART documentation wrt issue #36979.

* fixed a typo.

* Update docs/source/en/model_doc/bart.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bart.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bart.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bart.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bart.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bart.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* blank commit.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-27 11:51:41 -07:00
587c1b0ed1 Updated BERTweet model card. (#37981)
* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

* Updated BERTweet model card.

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/bertweet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated toctree (EN).

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-27 11:51:22 -07:00
b73faef52f Updated BigBird Model card as per #36979. (#37959)
* Updated BigBird Model card as per #36979.

* Update docs/source/en/model_doc/big_bird.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/big_bird.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/big_bird.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/big_bird.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-27 11:24:28 -07:00
538e847c06 Updated Zoedepth model card (#37898)
* Edited zoedepth model card according to specifications.

* Edited Zoedepth model file

* made suggested changes.
2025-05-27 10:06:53 -07:00
4f7b0ff8d1 Update Model Card for Mamba-2 (#37951)
* update model page.

* update model page.

* Update docs/source/en/model_doc/mamba2.md

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* update the model page.

* update.

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Apply the suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add an quantization example and update the toctree.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* remove the additional comma

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-27 10:06:39 -07:00
9c50576860 [mllama] Allow pixel_values with inputs_embeds (#38334)
* Allow pixel_values and inputs_embeds at the same time

* remove unnecessary overwritten tests
2025-05-27 16:33:56 +00:00
0f5a8243c4 [tests] remove overload for deleted test (test_offloaded_cache_implementation) (#37896)
* remove overload for deleted tests

* make fixup
2025-05-27 16:45:15 +01:00
f85fd90407 [cleanup] delete deprecated kwargs in qwen2_audio 🧹 (#38404)
delete deprecated
2025-05-27 16:08:53 +01:00
b9f8f863d9 [CSM] update model id (#38211)
* update model id

* codec_model eval

* add processor img

* use ungated repo for processor tests
2025-05-27 17:03:55 +02:00
07dd6b2495 Add report_repo_id to mi300 workflow (#38401) 2025-05-27 16:35:07 +02:00
3142bd8592 [CSM] infer codec model with no_grad + audio eos label (#38215)
* infer codec model with no_grad

* codec_model eval

* training labels: add audio eos token
2025-05-27 14:10:17 +00:00
10ae443ec0 Fix Qwen2.5-VL Video Processor (#38366)
* Update processing_qwen2_5_vl.py

* Update processing_qwen2_5_vl.py

* Update modular_qwen2_5_vl.py

* Fix CI

* Update modular_qwen2_5_vl.py

* Update processing_qwen2_5_vl.py

* Update video_processing_utils.py
2025-05-27 13:46:37 +02:00
80902ae9b1 [chat] use the checkpoint's generation_config.json as base parameterization (#38330)
* use model gen config

* unwanted diff
2025-05-27 10:35:33 +00:00
008e0d87c5 Fix convert to original state dict for VLMs (#38385)
* fix convert to original state dict

* fix

* lint

* Update modeling_utils.py
2025-05-27 10:27:59 +00:00
c769483188 [chat] improvements for thinking models and reduce default verbosity (#38322)
misc improvements
2025-05-27 10:20:58 +00:00
55f2333366 guard size mismatch check so it only applies to quantized models (#38397)
fix
2025-05-27 11:45:03 +02:00
1a5be2f5c0 [aya vision] fix processor for vLLM (#38371)
accidentally merged two PRs in one (;-_-)
2025-05-27 09:43:53 +00:00
19fdb75cf0 [video utils] group and reorder by number of frames (#38374)
fix
2025-05-27 11:32:33 +02:00
b0735dc0c1 [paligemma] fix processor with suffix (#38365)
fix pg processor
2025-05-27 11:31:56 +02:00
9e1017b479 [transformers x vLLM] standardize processors (#37915)
* standardize

* fix tests

* batch update some processors, not final yet

* oke, now I tested that everything indeed runs. Still needs prettification

* emu3

* fixup

* gemma3 but it doesn't generate anything

* fuyu

* update

* why?

* Update src/transformers/models/aya_vision/processing_aya_vision.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* bc

* why do we need to guard this import every time?

* i hate guarded imports

* i am blind

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-27 11:30:30 +02:00
b5ececb900 Fix image token mask in Gemma3 (#38295)
fix mask
2025-05-27 11:15:52 +02:00
c4e71e8fff Add AMD MI300 CI caller leveraging self-hosted runner scale set workflow in hf-workflows (#38132) 2025-05-26 23:13:02 +02:00
706b00928f Stop autoconverting custom code checkpoints (#37751)
* Stop autoconverting custom code checkpoints

* make fixup

* Better auto class detection

* Match the kwarg ordering
2025-05-26 19:15:28 +01:00
07848a8405 update gemma tests (#38384)
* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-26 19:54:04 +02:00
cd0f3ce73b [cli] cli usable without torch (#38386)
cli without torch
2025-05-26 16:54:18 +00:00
ba6d72226d 🚨 🚨 Fix custom code saving (#37716)
* Firstly: Better detection of when we're a custom class

* Trigger tests

* Let's break everything

* make fixup

* fix mistaken line doubling

* Let's try to get rid of it from config classes at least

* Let's try to get rid of it from config classes at least

* Fixup image processor

* no more circular import

* Let's go back to setting `_auto_class` again

* Let's go back to setting `_auto_class` again

* stash commit

* Revert the irrelevant changes until we figure out AutoConfig

* Change tests since we're breaking expectations

* make fixup

* do the same for all custom classes

* Cleanup for feature extractor tests

* Cleanup tokenization tests too

* typo

* Fix tokenizer tests

* make fixup

* fix image processor test

* make fixup

* Remove warning from register_for_auto_class

* Stop adding model info to auto map entirely

* Remove todo

* Remove the other todo

* Let's start slapping _auto_class on models why not

* Let's start slapping _auto_class on models why not

* Make sure the tests know what's up

* Make sure the tests know what's up

* Completely remove add_model_info_to_*

* Start adding _auto_class to models

* Start adding _auto_class to models

* Add a flaky decorator

* Add a flaky decorator and import

* stash commit

* More message cleanup

* make fixup

* fix indent

* Fix trust_remote_code prompts

* make fixup

* correct indentation

* Reincorporate changes into dynamic_module_utils

* Update call to trust_remote_code

* make fixup

* Fix video processors too

* Fix video processors too

* Remove is_flaky additions

* make fixup
2025-05-26 17:37:30 +01:00
701caef704 Stop TF weight rename reDOS (#38325)
* let's try a non-regex solution

* make fixup

* Slight adjustment

* Let's just use the original code with a check

* slight tweak to conditional

* slight tweak to conditional
2025-05-26 16:58:51 +01:00
0a4e8e2855 fix typo: tokenizer -> tokenize (#38357) 2025-05-26 15:29:16 +00:00
63964b7c67 fix typos (#38336)
* Update video_processor.md

* Update deepseek_v3.md
2025-05-26 14:42:37 +00:00
8b03c8eaf2 Better check in initialize_weights (#38382)
* Update modeling_utils.py

* CIs

* CIs
2025-05-26 16:20:23 +02:00
eb74cf977b Use one utils/notification_service.py (#38379)
* step 1

* step 2

* step 3

* step 4

* step 5

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-26 16:15:29 +02:00
98328fd9a1 for now disable compile (#38383) 2025-05-26 15:57:11 +02:00
78079abeff Improved cache docs (#38060)
* improved cache docs

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-26 13:53:41 +00:00
7a9b071bfd [Falcon H1] Fix slow path forward pass (#38320)
* Create push-important-models.yml

* feat: add falcon-h1

* fixup

* address comment

* fix

* fix copies

* fix copies

* fix

* fix

* fix

* fix

* fix copies

* fix

* fix copies

* fix test import to at least trigger the CIs

* yups

* update

* fix make fix copies

* fix inits?

* fix style

* skip annoying test

* add integration test for Falcon H1

* fix copies

* fix

* fix typo

* make style

* fix slow path generations

* clean debug traces

* debug

* remove debug traces final confirmation

* clean debug traces final

* fix format and lineup

* make style

* debug

* Update src/transformers/models/falcon_h1/modular_falcon_h1.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* address comments

* fix fix-copies

* fix integration test

* Merge pull request #7 from ydshieh/fix-slow-path

update

* another update (#8)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: younesbelkada <younes.belkada@tii.ae>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-26 15:30:35 +02:00
b5b76b5561 Protect get_default_device for torch<2.3 (#38376)
* Update modeling_utils.py

* CIs
2025-05-26 15:00:09 +02:00
bff32678cc Fix incorrect batching audio index calculation for Phi-4-Multimodal (#38103)
* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* add tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* Update src/transformers/models/phi4_multimodal/feature_extraction_phi4_multimodal.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-26 12:41:31 +00:00
9f0402bc4d Fix all import errors based on older torch versions (#38370)
* Update masking_utils.py

* fix

* fix

* fix

* Update masking_utils.py

* Update executorch.py

* fix
2025-05-26 12:11:54 +02:00
d03a3ca692 [OPT] Fix attention scaling (#38290)
* fix opt attention scaling

* add comment to why we do this
2025-05-26 11:02:16 +02:00
a5a0c7b888 switch to device agnostic device calling for test cases (#38247)
* use device agnostic APIs in test cases

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* add one more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* xpu now supports integer device id, aligning to CUDA behaviors

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* update to use device_properties

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* update comment

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix comments

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-26 10:18:53 +02:00
cba279f46c [VLMs] add helpers for get/set embedding (#38144)
* add helpers in VLMs

* fix tied weight key test
2025-05-26 09:50:32 +02:00
6e3063422c Uninstall kernels for AMD docker images (#38354)
Uninstall kernels for AMD docker images

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-25 19:42:25 +02:00
4a03044ddb Hot fix for AMD CI workflow (#38349)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-25 11:15:31 +02:00
d0c9c66d1c new failure CI reports for all jobs (#38298)
* new failures

* report_repo_id

* report_repo_id

* report_repo_id

* More fixes

* More fixes

* More fixes

* ruff

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-24 19:15:02 +02:00
31f8a0fe8a [docs]: update roformer.md model card (#37946)
* Update roformer model card

* fix example purpose description

* fix model description according to the comments

* revert changes for autodoc

* remove unneeded tags

* fix review issues

* fix hfoption

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-23 16:27:56 -07:00
36f97ae15b docs(swinv2): Update SwinV2 model card to new standard format (#37942)
* docs(swinv2): Update SwinV2 model card to new standard format

* docs(swinv2): Apply review suggestions

Incorporates feedback from @stevhliu to:
- Enhance the introductory paragraph with more details about scaling and SimMIM.
- Generalize the tip from "image classification tasks" to "vision tasks".

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-23 13:04:13 -07:00
33d23c39ed Update BioGPT model card (#38214)
* Update BioGPT model card

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/biogpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* correction for CPU fallback

* added quantization code and method

* fixed transformers-cli call

---------

Co-authored-by: Aguedo <aguedo@fakeemail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-23 13:03:47 -07:00
dffb118013 Remove duplicate docstring: resample (#38305)
Duplicate of the line above.
2025-05-23 13:02:58 -07:00
e0aad278fe Never fallback to eager implicitly (#38327)
* remove arg everywhere

* Update warnings

* add more models

* Update sdpa_attention.py

* fix style

* fix

* readd warnings but not for flex

* Update test_modeling_common.py

* skip

* fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-23 19:48:01 +02:00
e64ed0304c Use Gradient Checkpointing Layer in Jamba & Blip Related Models (#38310)
* Use gradient checkpointing class in blip classes

* Use gradient checkpointing class in jamba/bamba
2025-05-23 19:35:25 +02:00
53fb245eb6 🚨 🚨 Inherited CausalLM Tests (#37590)
* stash commit

* Experiment 1: Try just Gemma

* Experiment 1: Just try Gemma

* make fixup

* Trigger tests

* stash commit

* Try adding Gemma3 as well

* make fixup

* Correct attrib names

* Correct pipeline model mapping

* Add in all_model_classes for Gemma1 again

* Move the pipeline model mapping around again

* make fixup

* Revert Gemma3 changes since it's a VLM

* Let's try Falcon

* Correct attributes

* Correct attributes

* Let's try just overriding get_config() for now

* Do Nemotron too

* And Llama!

* Do llama/persimmon

* Correctly skip tests

* Fix Persimmon

* Include Phimoe

* Fix Gemma2

* Set model_tester_class correctly

* Add GLM

* More models!

* models models models

* make fixup

* Add Qwen3 + Qwen3MoE

* Correct import

* make fixup

* Add the QuestionAnswering classes

* Add the QuestionAnswering classes

* Move pipeline mapping to the right place

* Jetmoe too

* Stop RoPE testing models with no RoPE

* Fix up JetMOE a bit

* Fix up JetMOE a bit

* Can we just force pad_token_id all the time?

* make fixup

* fix starcoder2

* Move pipeline mapping

* Fix RoPE skipping

* Fix RecurrentGemma tests

* Fix Falcon tests

* Add MoE attributes

* Fix values for RoPE testing

* Make sure we set bos_token_id and eos_token_id in an appropriate range

* make fixup

* Fix GLM4

* Add mamba attributes

* Revert bits of JetMOE

* Re-add the JetMOE skips

* Update tests/causal_lm_tester.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add licence

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-23 18:29:31 +01:00
d5f992f5e6 Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag (#36835)
* Get parallel loader working. Include tests.

* Update the tests for parallel loading

* Rename env variables.

* Add docs for parallel model weight loading.

* Touch up parallel model loading docs.

* Touch up parallel model loading docs again.

* Edit comment in test_modeling_utils_parallel_loading.py

* Make sure HF_PARALLEL_LOADING_WORKERS is spelled correctly in modeling_utils.py

* Correct times for parallelized loading, previous times were for a "hot" filesystem

* Update parallel model loading so the spawn method is encapsulated. DRY up the code by leveraging get_submodule.

* Update docs on model loading parallelism so that details on setting the multiprocessing start method are removed, now that the package handles this step internally.

* Fix style on model loading parallelism changes.

* Merge latest version of master's modeling_utils.

* Removed unused variable.

* Fix argument packing for the parallel loader.

* Fix state dict being undefined in the parallel model loader.

* Rename variables used in parallel model loading for clarity. Use get_module_from_name().

* Switch to the use of threads for parallel model loading.

* Update docs for parallel loading.

* Remove the use of json.loads when evaluating HF_ENABLE_PARALLEL_LOADING. Prefer simple casting.

* Move parallelized shard loading into its own function.

* Remove use of is_true(). Favor checking env var true values for HF_ENABLE_PARALLEL_LOADING.

* Update copyright to 2025 in readme for paralell model loading.

* Remove garbage collection line in load_shard_file, implicit garbage collection already occurs.

* Run formatter on modeling_utils.py

* Apply style fixes

* Delete tests/utils/test_modeling_utils_parallel_loading.py

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-05-23 16:39:47 +00:00
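
A hedged sketch of the opt-in behavior described above, using the env var names from the commit text; the default worker count and the `load_shard_file` callback are assumptions:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def load_all_shards(shard_files, load_shard_file):
    # Threads, not processes, per the final design in this PR.
    if os.environ.get("HF_ENABLE_PARALLEL_LOADING", "false").lower() in ("1", "true", "yes"):
        workers = int(os.environ.get("HF_PARALLEL_LOADING_WORKERS", "8"))  # assumed default
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(load_shard_file, shard_files))
    return [load_shard_file(f) for f in shard_files]
```
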
1ed19360b1 [FlexAttention] Reenable flex for encoder-decoder and make the test more robust (#38321)
* reenable most flex attention test cases

* style

* trigger

* trigger
2025-05-23 18:16:43 +02:00
bb567d85a4 refactor can_save_slow_tokenizer (#37722)
* refactor to remove the property can_save_slow_tokenizer; the check can be done within the `if` of save_vocab

* move property to fast

* revert if

* check if vocab_file is attr

* fix check for sp

* fix if condition

* fix if condition

* fix if condition
2025-05-23 17:29:38 +02:00
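
A small sketch of the check as the bullets describe it (hypothetical helper; the real change inlines this inside the save path):

```python
import os

def can_save_slow_tokenizer(tokenizer) -> bool:
    # Savable only if the fast tokenizer carries a vocab_file attribute
    # that points at an existing file.
    vocab_file = getattr(tokenizer, "vocab_file", None)
    return vocab_file is not None and os.path.isfile(vocab_file)
```
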
3c289e2104 [performance_optim] reduce frequency of declaring attention_mask in Ascend NPU flash attention (#38278)
[performance_optim] reduce frequency of declaring attention_mask in ASCEND NPU flash attention
2025-05-23 17:24:51 +02:00
f5d45d89c4 🚨Early-error🚨 config will error out if output_attentions=True and the attn implementation is wrong (#38288)
* Protect ParallelInterface

* early error out on output attention setting for no warning in modeling

* modular update

* fixup

* update model tests

* update

* oups

* set model's config

* more cases

* ??

* properly fix

* fixup

* update

* last ones

* update

* fix?

* fix wrong merge commit

* fix hub test

* nits

* wow I am tired

* updates

* fix pipeline!

---------

Co-authored-by: Lysandre <hi@lysand.re>
2025-05-23 17:17:38 +02:00
896833c183 Fix some tests (especially compile with fullgraph=True on Python<3.11) (#38319)
* fix tests

* better fix for python<3.11

* fixes

* style
2025-05-23 17:11:40 +02:00
a63bc17416 add vasqu to self-comment-ci.yml (#38324)
add vasqu

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-23 17:09:44 +02:00
54cd86708d [custom_generate] don't forward custom_generate and trust_remote_code (#38304)
* prevent infinite loops

* docs

* more links to custom generation methods
2025-05-23 14:49:39 +00:00
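
A toy sketch of the recursion guard (all names hypothetical): the kwargs that selected the custom method are consumed rather than forwarded, so a custom method that delegates back to `generate()` cannot loop forever:

```python
def default_generate(model, prompt, **kwargs):
    return prompt + " <generated>"  # placeholder

def load_custom_generate(repo_id, trust_remote_code):
    # Hypothetical loader for a hub-hosted generation method.
    def custom(model, prompt, **kwargs):
        # Delegating back is safe: the selector kwargs were stripped.
        return default_generate(model, prompt, **kwargs).upper()
    return custom

def generate(model, prompt, custom_generate=None, trust_remote_code=None, **kwargs):
    if custom_generate is not None:
        method = load_custom_generate(custom_generate, trust_remote_code)
        # custom_generate / trust_remote_code are not forwarded here.
        return method(model, prompt, **kwargs)
    return default_generate(model, prompt, **kwargs)
```
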
135163e9c5 Expose AutoModelForTimeSeriesPrediction for import (#38307)
* expose AutoModelForTimeSeriesPrediction for import

* add in docs
2025-05-23 13:09:29 +00:00
a6b51e7341 [Whisper + beam search] fix usage of beam_indices (#38259)
* tmp

* fix test_tiny_token_timestamp_batch_generation

* better comments

* test

* comments

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-05-23 10:05:44 +00:00
3e960e032d [tf/flax] handle forced_decoder_ids deletion (#38316)
fix tf/flax, attr checks
2025-05-23 09:44:58 +00:00
9eb0a37c9e Adds use_repr to model_addition_debugger_context (#37984)
* Adds use_repr to model_addition_debugger_context

* Updating docs for use_repr option
2025-05-23 09:35:13 +00:00
38f9c5b15b Fix typo: change 'env' to 'environment' in .circleci/config.yml (#38273)
* Fix typo: change 'env' to 'environment' in .circleci/config.yml

* Remove CIRCLE_TOKEN environment variable from artifact retrieval step

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-23 10:45:27 +02:00
11b670a282 Fix run_slow (#38314)
Signed-off-by: cyy <cyyever@outlook.com>
2025-05-23 10:18:30 +02:00
b01984a51d [emu3] fix conversion script (#38297)
* fix conversion script and update weights

* fixup

* remove commented line
2025-05-23 09:49:56 +02:00
2b585419b4 [Tests] Cleanup Janus Testcase (#38311)
* Cleanup janus testcase

* shift code to setup
2025-05-23 09:29:16 +02:00
b59386dc0a Oups typo for HybridChunkedCache (#38303)
typo
2025-05-22 17:52:37 +02:00
211f2b0875 Add CB (#38085)
* stash for now

* initial commit

* small updated

* up

* up

* works!

* nits and fixes

* don't loop too much

* finish working example

* update

* fix the small freeblocks issue

* feat: stream inputs to continuous batch

* fix: update attn from `eager` to `sdpa`

* refactor: fmt

* refactor: cleanup unnecessary code

* feat: add `update` fn to `PagedAttentionCache`

* feat: broken optimal block size computation

* fix: debugging invalid cache logic

* fix: attention mask

* refactor: use custom prompts for example

* feat: add streaming output

* fix: prefill split

refactor: add doc strings and remove unsound/redundant logic
fix: compute optimal blocks logic

* fix: send decoded tokens when `prefilling_split` -> `decoding`

* refactor: move logic to appropriate parent class

* fix: remove truncation as we split prefilling anyways

refactor: early return when we have enough selected requests

* feat: add paged attention forward

* push graph

* add paged sdpa

* update

* better mps defaults

* feat: add progress bar for `generate_batch`

* feat: add opentelemetry metrics (ttft + batch fill %age)

* feat: add tracing

* Add cuda graphs (#38059)

* draft cudagraphs addition

* nits

* styling

* update

* fix

* kinda draft of what it should look like

* fixes

* lol

* not sure why inf everywhere

* can generate but output is shit

* some fixes

* we should have a single device synch

* broken outputs but it does run

* refactor

* updates

* updates with some fixes

* fix mask causality

* another commit that casts after

* add error

* simplify example

* update

* updates

* revert llama changes

* fix merge conflicts

* fix: tracing and metrics

* my updates

* update script default values

* fix block allocation issue

* fix prefill split attention mask

* no bugs

* add paged eager

* fix

* update

* style

* feat: add pytorch traces

* fix

* fix

* refactor: remove pytorch profiler data

* style

* nits

* cleanup

* draft test file

* fix

* fix

* fix paged and graphs

* small renamings

* cleanups and push

* refactor: move tracing and metrics logic to utils

* refactor: trace more blocks of code

* nits

* nits

* update

* to profile or not to profile

* refactor: create new output object

* causal by default

* cleanup but generations are still off for IDK what reason

* simplifications but not running still

* this does work.

* small quality of life updates

* nits

* update

* fix the scheduler

* fix warning

* ol

* fully fixed

* nits

* different generation parameters

* nice

* just style

* feat: add cache memory usage

* feat: add kv cache free memory

* feat: add active/waiting count & req latency

* do the sampling

* fix: synchronize CUDA only if available and improve error handling in ContinuousBatchingManager

* fix on mps

* feat: add dashboard & histogram buckets

* perf: improve waiting reqs data structures

* attempt to compile, but we should only do it on mps AFAIK

* feat: decouple scheduling logic

* just a draft

* cleanup and fixup

* optional

* style

* update

* update

* remove the draft documentation

* fix import as well

* update

* fix the test

* style doomed

---------

Co-authored-by: Luc Georges <luc.sydney.georges@gmail.com>
2025-05-22 17:43:48 +02:00
73286d8e29 Fix HybridChunkedCache & Llama4 (#38299)
* Update cache_utils.py

* Update cache_utils.py
2025-05-22 17:25:51 +02:00
d95c864a25 🔴🔴🔴 [Attention] Refactor Attention Interface for Bart-based Models (#38108)
* starting attn refactor for encoder decoder models via bart (eager + sdpa)

* flash attention works, remove unnecessary code

* flex attention support for bart!, gotta check if the renaming is not too aggressive

* some comments

* skip flex grad test for standalone as done with the other test

* revert flex attn rename (for now), sdpa simplify, and todos

* more todos

* refactor mask creation for reuse

* modular attempt at biogpt

* first batch of other models

* fix attn dropout

* fix autoformer copies

* hubert

* another batch of models

* copies/style + last round of bart models --> whisper next?

* remove unnecessary _reshape function and remove copy to whisper

* add skip for decoder-only models out of enc-dec (same as in bart)

* bring back licences

* remove comment, added to pr read instead

* mostly docs

* disable sew flex attn as its attn mask is unclear for now

* oops

* test fixes for enc-dec

* torch fx fixes + try at flex attn

* skip on mbart

* some more fixes

* musicgen skip / delete old attn class logic + sdpa compose compile skip

* disable flex attn for musicgen, not worth the effort

* more fixes and style

* flex attention test for dropout and encoder decoder that dont have main input names

* informer fixes

* the weirdest thing I've encountered yet...

* style

* remove empty tensor attempt, found core root in previous commits

* disable time series due to tests being very text centric on inputs

* add speech to text to be ignoring the other attns, also due to tests

* update docs

* remaining issues resolved ?

* update docs for current state --> nllb moe and pegasus x sdpa is questionable :D

* some models have not set the is_causal flag...

* change dtype in softmax to old behaviour + some modular fixes

* I hate it but it is what it is

* fixes from main for bart

* forgot this one

* some model fixes

* style

* current status

* marian works now

* fixing some copies

* some copy fixes + time series x informer

* last models possibly and fixes on style/copies

* some post merge fixes

* more fixes

* make attention interface callable and move warnings there

* style lol

* add comment to "unsupported"

* remove callable interface and change interface warnings + some copies

* fix

* ternary is ugly af, make it simpler

* how did that happen

* fix flex attn test

* failing the test

* no more fallback! fixing copies next

* style + attn fixed

* fixing copies and mask creation

* wrong copy

* fixup tests and disable flex attn for now

* fixup last tests?
2025-05-22 17:12:58 +02:00
9895819514 Update CI Docker base image for AMD tests (#38261)
use newer Pytorch base image for AMD CI tests
2025-05-22 16:38:40 +02:00
dfbee79ca3 refine transformers env output (#38274)
* refine `transformers env` output

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-22 15:22:18 +02:00
1234683309 More typing in src/transformers/training_args.py (#38106)
* Annotate `framework` in src/transformers/training_args.py

Signed-off-by: cyy <cyyever@outlook.com>

* Fix typing

Signed-off-by: cyy <cyyever@outlook.com>

* Revert framework change

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-05-22 13:14:33 +02:00
03a4c024dc Fix tp error when torch distributed is already initialized (#38294)
fix tp error
2025-05-22 12:34:05 +02:00
dcaf47dde5 add liger-kernel to docker file (#38292)
add

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-22 11:58:17 +02:00
163138a911 🚨🚨[core] Completely rewrite the masking logic for all attentions (#37866)
* start

* start having a clean 4d mask primitive

* Update mask_utils.py

* Update mask_utils.py

* switch name

* Update masking_utils.py

* add a new AttentionMask tensor class

* fix import

* nits

* fixes

* use full and quadrants

* general sdpa mask for all caches

* style

* start some tests

* tests with sliding, chunked

* add styling

* test hybrid

* Update masking_utils.py

* small temp fixes

* Update modeling_gemma2.py

* compile compatible

* Update masking_utils.py

* improve

* start making it more general

* Update masking_utils.py

* generate

* make it work with flex style primitives!

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* improve

* Update cache_utils.py

* Update masking_utils.py

* simplify - starting to look good!

* Update masking_utils.py

* name

* Update masking_utils.py

* style

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* small fix for flex

* flex compile

* FA2

* Update masking_utils.py

* Escape for TGI/vLLM!

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* General case without cache

* rename

* full test on llama4

* small fix for FA2 guard with chunk

* Update modeling_gemma2.py

* post rebase cleanup

* FA2 supports static cache!

* Update modeling_flash_attention_utils.py

* Update flex_attention.py

* Update masking_utils.py

* Update masking_utils.py

* Update utils.py

* override for export

* Update executorch.py

* Update executorch.py

* Update executorch.py

* Update executorch.py

* Update masking_utils.py

* Update masking_utils.py

* output attentions

* style

* Update masking_utils.py

* Update executorch.py

* Add docstring

* Add license and put mask visualizer at the end

* Update test_modeling_common.py

* fix broken test

* Update test_modeling_gemma.py

* Update test_modeling_gemma2.py

* Use fullgraph=False with FA2

* Update utils.py

* change name

* Update masking_utils.py

* improve doc

* change name

* Update modeling_attn_mask_utils.py

* more explicit logic based on model's property

* pattern in config

* extend

* fixes

* make it better

* generalize to other test models

* fix

* Update masking_utils.py

* fix

* do not check mask equivalence if layer types are different

* executorch

* Update modeling_gemma2.py

* Update masking_utils.py

* use layer_idx instead

* adjust

* Update masking_utils.py

* test

* fix imports

* Update modeling_gemma2.py

* other test models

* Update modeling_llama4.py

* Update masking_utils.py

* improve

* simplify

* Update masking_utils.py

* typos

* typo

* fix

* Update masking_utils.py

* default DynamicCache

* remove default cache

* simplify

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* simplify

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* export

* Update executorch.py

* Update executorch.py

* Update flex_attention.py

* Update executorch.py

* upstream to modular gemma 1 & 2

* Update modular_mistral.py

* switch names

* use dict

* put it in the Layer directly

* update copy model source for mask functions

* apply so many modular (hopefully 1 shot)

* use explicit dicts to make style happy

* protect import

* check docstring

* better default in hybrid caches

* qwens

* Update modular_qwen2.py

* simplify core logic!

* Update executorch.py

* qwen3 moe

* Update masking_utils.py

* Update masking_utils.py

* simplify a lot sdpa causal skip

* Update masking_utils.py

* post-rebase

* gemma3 finally

* style

* check it before

* gemma3

* More general with newer torch

* align gemma3

* Update utils.py

* Update utils.py

* Update masking_utils.py

* Update test_modeling_common.py

* Update flex_attention.py

* Update flex_attention.py

* Update flex_attention.py

* test

* executorch

* Update test_modeling_common.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update masking_utils.py

* Update executorch.py

* Update test_modeling_common.py

* fix copies

* device

* sdpa can be used without mask -> pass the torchscript tests in this case

* Use enum for check

* revert enum and add check instead

* remove broken test

* cohere2

* some doc & reorganize the Interface

* Update tensor_parallel.py

* Update tensor_parallel.py

* doc and dummy

* Update test_modeling_paligemma2.py

* Update modeling_falcon_h1.py

* Update masking_utils.py

* executorch patch

* style

* CIs

* use register in executorch

* final comments!

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-05-22 11:38:26 +02:00
f8630c778c [Whisper] handle deprecation of forced_decoder_ids (#38232)
* fix

* working saved forced_decoder_ids

* docstring

* add deprecation message

* exception message ordering

* circular import comment
2025-05-22 09:16:38 +00:00
aa02a5d902 [whisper] move processor test into processor test file 🧹 (#38266)
move processor tests
2025-05-22 10:07:11 +01:00
b26157d64c add XPU info print in print_env (#38282)
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-22 11:03:56 +02:00
b369a65480 docs(swin): Update Swin model card to standard format (#37628)
* docs(swin): Update Swin model card to standard format

* docs(swin): Refine link to Microsoft organization for Swin models

Apply suggestion from @stevhliu in PR #37628.

This change updates the link pointing to the official Microsoft Swin Transformer checkpoints on the Hugging Face Hub.

The link now directs users specifically to the Microsoft organization page, filtered for Swin models, providing a clearer and more canonical reference compared to the previous general search link.

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(swin): Clarify padding description and link to backbone docs

Apply suggestion from @stevhliu in PR #37628.

This change introduces two improvements to the Swin model card:

1.  Refines the wording describing how Swin handles input padding for better clarity.
2.  Adds an internal documentation link to the general "backbones" page when discussing Swin's capability as a backbone model.

These updates enhance readability and improve navigation within the Transformers documentation.

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* docs(swin): Change Swin paper link to huggingface.co/papers as suggested

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-21 16:16:43 -07:00
28d3148b07 Update Model Card for Mamba (#37863)
* update model card.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update quantization example.

* update example.

* update

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-21 10:58:23 -07:00
7b7bb8df97 Protect ParallelInterface (#38262)
Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-21 17:45:38 +02:00
5c13cc0f94 Remove Japanese sequence_classification doc and update references (#38246) 2025-05-21 08:33:41 -07:00
71009e4b68 assign the correct torchao data layout for xpu (#37781)
* assign the correct data layout for xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check torch version before using torchao xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix the log

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix zero point type

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix check torch version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-05-21 17:21:55 +02:00
d6c34cdcd0 Fix: missing else branch to handle "--load_best_model_at_end" in training_args.py (#38217)
Update training_args.py
2025-05-21 14:28:56 +00:00
ae3e4e2d97 Improve typing in TrainingArgument (#36944)
* Better error message in TrainingArgument typing checks

* Better typing

* Small fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-05-21 13:54:38 +00:00
174684a9b6 Simplify DTensor Check for modeling_utils.py (#38245)
Update modeling_utils.py
2025-05-21 13:35:44 +00:00
e4decee9c0 [whisper] small changes for faster tests (#38236) 2025-05-21 14:11:08 +01:00
ddf67d2d73 Clearer error on import failure (#38257)
Clearer error
2025-05-21 14:32:29 +02:00
9a962dd9ed Add tearDown method to Quark to solve OOM issues (#38234)
fix
2025-05-21 14:26:44 +02:00
101b3fa4ea fix multi-image case for llava-onevision (#38084)
* _get_padding_size module

* do not patchify images when processing multi image

* modify llava onevision image processor fast

* tensor to list of tensors

* backward compat

* reuse pad_to_square in llave & some clarification

* add to doc

* fix: consider no image cases (text only or video)

* add integration test

* style & repo_consistency
2025-05-21 11:50:46 +02:00
a21f11fca2 [compile] re-enable for Qwen-VL models (#38127)
* compile qwen models

* delete TODO comment

* fix embeds test

* fix assisted decoding

* add comments
2025-05-21 09:50:39 +00:00
4542086db7 [Falcon H1] Fix Typo in Integration Test (#38256)
* Create push-important-models.yml

* feat: add falcon-h1

* fixup

* address comment

* fix

* fix copies

* fix copies

* fix

* fix

* fix

* fix

* fix copies

* fix

* fix copies

* fix test import to at least trigger the CIs

* yups

* update

* fix make fix copies

* fix inits?

* fix style

* skip annoying test

* add integration test for Falcon H1

* fix copies

* fix

* fix typo

* make style

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: younesbelkada <younes.belkada@tii.ae>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-05-21 11:25:26 +02:00
6829936ee0 [MODEL] Add Falcon H1 (#38249)
* Create push-important-models.yml

* feat: add falcon-h1

* fixup

* address comment

* fix

* fix copies

* fix copies

* fix

* fix

* fix

* fix

* fix copies

* fix

* fix copies

* fix test import to at least trigger the CIs

* yups

* update

* fix make fix copies

* fix inits?

* fix style

* skip annoying test

* add integration test for Falcon H1

* fix copies

* fix

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: dhia.rhaiem <dhia.rhaiem@tii.ae>
2025-05-21 10:43:11 +02:00
e288ee00d8 tp plan should not be NONE (#38255)
* accept custom device_mesh

* fix device_map

* assert that num_heads % tp_size == 0

* todo.

* ReplicateParallel

* handle tied weights

* handle dtensor in save_pretrained with safe_serialization

* tp test works

* doesn't work

* fix shard_and_distribute_module's rank should be local_rank

* tp=4 is correct

* dp+tp is broken

* todo allreduce with dtensors on another dim is annoying

* workaround to sync dp grads when using dtensors

* loading a checkpoint works

* wandb and compare losses with different tp/dp

* cleaning

* cleaning

* .

* .

* logs

* CP2 DP2 no mask works after commenting out attn_mask and is_causal from scaled_dot_product_attention

* DP=2 TP=2 now works even with tied embeddings

* model.parameters() and model.module.parameters() are empty..

* reformat sanity_check_tensor_sync

* set atol=1e-4 for CP to pass

* try populate _parameters from named_modules

* refactors
TP2 DP2 works
CP2 DP2 works

* is_causal=True and pack sequences, no attn mask, and preshuffle dataset

* fix packing

* CP=4 doesn't work

* fix labels and position_ids for CP

* DP CP works with transformers 🥳🥳🥳

* refactor

* add example cp

* fixup

* revert sdpa changes

* example cleared

* add CP, DP to the mesh init

* nit

* clean

* use `ALL_PARALLEL_STYLES`

* style

* FSDP works

* log on 1 rank

* .

* fix?

* FSDP1 also has .parameters() bug

* reported gradnorm when using FSDP1 is wrong, but loss is correct so it's okay

* .

* style and fixup

* move stuff around

* fix tests

* style

* let's make it a check

* add missing licences

* warning should be an info

* tp plan should not be NONE

* test all

* god damn it

* test all

---------

Co-authored-by: nouamanetazi <nouamane98@gmail.com>
2025-05-21 10:22:38 +02:00
711d78d104 Revert parallelism temporarily (#38240)
* Revert "Protect ParallelInterface"

This reverts commit cb513e35f9c096d60558bd43110837cbb66611ce.

* Revert "parallelism goes brrr (#37877)"

This reverts commit 1c2f36b480e02c9027d2523746d34e27b39e01a4.

* Empty commit
2025-05-20 22:43:04 +02:00
feec294dea CI reporting improvements (#38230)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-20 19:34:58 +02:00
cb513e35f9 Protect ParallelInterface 2025-05-20 18:27:50 +02:00
f4ef41c45e v4.53.0.dev0 2025-05-20 18:12:56 +02:00
f834d368f6 [gemma3] fix bidirectional attention mask (#38080)
* fix attn mask

* attn viz doesn't show yellow cubes between images

* bucketize made it hard with different number of crops

* fixup
2025-05-20 17:35:04 +02:00
2edb0e4b4d [mllama] fix loading and inference (#38223)
fix loading
2025-05-20 17:34:55 +02:00
390f153469 Add padding-free to bamba (#35861)
* add seq_idx and fa kwargs

* update tests

* docs and grad ckpt support

* fmt

* better names

* test_raise_missing_padding_free_kwarg_errs

* + seq_idx in doc strings

* padding free training docs

* add link to pr plots

* raise err on attn_mask with padding free

* rm raising missing padding free err test

* BambaFlashAttentionKwargs

* run modular util for modular_granitemoehybrid.py
2025-05-20 17:13:59 +02:00
2a79471318 Fixing Bitnet after use_rms_norm introduction (#38229)
* fix

* make style
2025-05-20 17:13:21 +02:00
9661896083 Enable Quantize KV Cache for Mistral Model (#35042)
fix #35041
2025-05-20 16:50:26 +02:00
1c2f36b480 parallelism goes brrr (#37877)
* accept custom device_mesh

* fix device_map

* assert that num_heads % tp_size == 0

* todo.

* ReplicateParallel

* handle tied weights

* handle dtensor in save_pretrained with safe_serialization

* tp test works

* doesn't work

* fix shard_and_distribute_module's rank should be local_rank

* tp=4 is correct

* dp+tp is broken

* todo allreduce with dtensors on another dim is annoying

* workaround to sync dp grads when using dtensors

* loading a checkpoint works

* wandb and compare losses with different tp/dp

* cleaning

* cleaning

* .

* .

* logs

* CP2 DP2 no mask works after commenting out attn_mask and is_causal from scaled_dot_product_attention

* DP=2 TP=2 now works even with tied embeddings

* model.parameters() and model.module.parameters() are empty..

* reformat sanity_check_tensor_sync

* set atol=1e-4 for CP to pass

* try populate _parameters from named_modules

* refactors
TP2 DP2 works
CP2 DP2 works

* is_causal=True and pack sequences, no attn mask, and preshuffle dataset

* fix packing

* CP=4 doesn't work

* fix labels and position_ids for CP

* DP CP works with transformers 🥳🥳🥳

* refactor

* add example cp

* fixup

* revert sdpa changes

* example cleared

* add CP, DP to the mesh init

* nit

* clean

* use `ALL_PARALLEL_STYLES`

* style

* FSDP works

* log on 1 rank

* .

* fix?

* FSDP1 also has .parameters() bug

* reported gradnorm when using FSDP1 is wrong, but loss is correct so it's okay

* .

* style and fixup

* move stuff around

* fix tests

* style

* let's make it a check

* warning should be an info

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-05-20 16:22:52 +02:00
b591d925be Fix Llama4 (#38222)
Update modeling_llama4.py
2025-05-20 16:00:46 +02:00
3f0b7d0fac Mamba2 remove unecessary test parameterization (#38227) 2025-05-20 13:54:04 +00:00
9cde2f5d42 Minor llama4 fixes (#38123)
* fix wrong scaling value/default Cache init

* style

* fix various issues on integration tests

* change expected outputs

* fixup

* fix config access

* protect default scaling
2025-05-20 13:15:54 +00:00
856f034f45 fix dead flax links modeling_flax_pytorch_utils.py (#38212) 2025-05-20 13:03:41 +00:00
bb3c6426d8 Make train_dataset attribute in _get_train_sampler optional (#38226)
make it optional
2025-05-20 12:59:53 +00:00
2ad152f84c In Llama4 fix wrongly inverted causal attention mask when using SDPA implementation (#38094)
When preparing the causal attention mask at this point, the mask comes
in as a float tensor with the dtype's min value marking masked positions.
It is not correct to convert it to bool and treat it as a bool mask,
as this inverts the mask:
`torch.nn.functional.scaled_dot_product_attention` expects a masked value to be `False`.

I suspect that the `sdpa` implementation variant may not have been
thoroughly tested and that is why this error was not caught earlier.
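
To illustrate the inversion (a minimal, self-contained sketch, not the Llama4 code): casting the float mask to bool marks exactly the wrong positions, because the large negative value at masked positions is truthy.

```python
import torch

min_val = torch.finfo(torch.float32).min
# 0.0 means "attend", min_val means "masked" in the additive float convention
float_mask = torch.tensor([[0.0, min_val],
                           [0.0, 0.0]])

inverted = float_mask.bool()  # min_val -> True: masked positions look attendable
correct = float_mask == 0.0   # True exactly where attention is allowed,
                              # which is what scaled_dot_product_attention expects
```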

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-20 14:47:59 +02:00
de70c8426e Disable torchscript tests for AriaForConditionalGenerationModelTest (#38225)
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-20 14:37:55 +02:00
8ea61c4530 Add support to Marimo Notebooks and Enverge.ai (#38210)
* Add support to Marimo notebooks

* Concise logic

* Simplify logic

* Ruff fixes
2025-05-20 12:26:34 +00:00
d34e21e7dd New cache tests and refactored Hybrid Cache (#37972) 2025-05-20 12:46:13 +02:00
183fb3637c Add Llama4TextModel to AutoModel mapping (#38162)
Add Llama4TextModel to AutoModel mapping

using Llama4TextConfig on AutoModel.from_config raises a ValueError when it is expected to instantiate a Llama4TextModel
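
A hedged sketch of the resolution this enables (assuming Llama4TextConfig is exported at the top level; the default config may build a large model, so this is illustrative only):

```python
from transformers import AutoModel, Llama4TextConfig

config = Llama4TextConfig()  # in practice, load or shrink a real config
model = AutoModel.from_config(config)
assert type(model).__name__ == "Llama4TextModel"  # previously: ValueError
```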
2025-05-20 10:01:00 +00:00
f022bf9322 Remove trust_remote_code=True tests from bnb quantization tests (MPT now integrated) (#38206)
bnb quant tests: remove obsolete trust_remote_code test

The MPT model is now natively integrated in Transformers and no longer requires trust_remote_code=True. This removes the failing test_get_keys_to_not_convert_trust_remote_code and related usage, which depended on remote code and caused CI issues due to missing dependencies (e.g., triton_pre_mlir).
2025-05-20 11:43:11 +02:00
0a52bd2403 [fix] sliding window attention mask (#38045)
* fix sliding attn

* make style

* Update tests/test_modeling_common.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* on a second thought, should default to `True` for BC

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-05-20 09:32:19 +00:00
555715f418 Fix broken example generation script for Llama3 (#38062)
Fix broken example generation script for llama3
2025-05-20 10:53:43 +02:00
7a611f0afd Fix: make docs work better with doc builder (#38213) 2025-05-20 08:23:03 +00:00
3bd1c20149 enable misc cases on XPU & use device agnostic APIs for cases in tests (#38192)
* use device agnostic APIs in tests

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* more

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* add reset_peak_memory_stats API

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-20 10:09:01 +02:00
dbc4b91db4 Qwen2.5-Omni: Update modeling_qwen2_5_omni.py to fix error when loading quantized weights with AutoAWQ. (#38013)
* Update modular_qwen2_5_omni.py

fix the error when loading a quantized model with AutoAWQ.

* Update modeling_qwen2_5_omni.py

sync code to modular_qwen2_5_omni.py
2025-05-20 09:53:51 +02:00
46a4b7c909 Feat: save_pretrained for tensor parallel (and other parallelisms) models (#37919)
* tmp: initial save pretrained with dtensors

* Feat: add correctness tests

* Refactor: version checks

* Temp: 1:1 checkpoint llama4

* refactor

* Tests

* Feat: works

* Style

* Feat: version checks + minor fixes

* Style

* Fix: version checks in tests

* Feat: move more stuff into tensor_parallel.py
2025-05-19 18:16:21 +00:00
9ecee14378 [doc] fix bugs in how_to_hack_models.md (#38198)
fix several bugs
2025-05-19 10:37:54 -07:00
f524439cc5 Translating model_doc/bert.md to Chinese (#37806)
* Translated model_doc/bert.md

* Revise grammatical errors

* Changed _toctree.yml

* Revise some errors
2025-05-19 10:14:57 -07:00
6e738411e1 Tensor parallel docs (#38178)
* Feat: initial docs

* Feat: update doc

* Final typos/changes

* Refactor: reorder top to bottom.
2025-05-19 17:05:01 +00:00
9c500015c5 🚨🚨🚨 [pipelines] update defaults in pipelines that can generate (#38129)
* pipeline generation defaults

* add max_new_tokens=20 in test pipelines

* pop all kwargs that are used to parameterize generation config

* add class attr that tell us whether a pipeline calls generate

* tmp commit

* pt text gen pipeline tests passing

* remove failing tf tests

* fix text gen pipeline mixin test corner case

* update text_to_audio pipeline tests

* trigger tests

* a few more tests

* skips

* some more audio tests

* not slow

* broken

* lower severity of generation mode errors

* fix all asr pipeline tests

* nit

* skip

* image to text pipeline tests

* text2test pipeline

* last pipelines

* fix flaky

* PR comments

* handle generate attrs more carefully in models that cant generate

* same as above
2025-05-19 18:02:06 +01:00
6f9da7649f [image-text-to-text pipeline] Accept a chat as a positional arg (#38204)
accept chat as a positional arg
2025-05-19 17:26:09 +01:00
7c9b0ca08c [SAM-HQ] Update names in the docs (#38058)
Update names
2025-05-19 09:21:14 -07:00
04282a9ef5 Remove Deprecated verbose arg in LayerWiseDummyScheduler (#38197)
Remove Deprecated args in LayerWiseDummyScheduler
2025-05-19 13:49:11 +00:00
aef12349b6 Make HF implementation match original OLMo 2 models for lower precisions (#38131)
* Make HF implementation match OLMo models for lower precisions

* Add test of 1B logits in bfloat16

* Run make fixup
2025-05-19 15:35:23 +02:00
9644acb7cb [docs] add Audio import (#38195)
add Audio import
2025-05-19 13:16:35 +00:00
7d93f93f83 [docs] minor fixes in models.md (#38193)
minor fix
2025-05-19 13:14:21 +00:00
47f8578d96 Pass eps to Mistral3RMSNorm (#38026)
Pass eps to Mistral3RMSNorm

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-19 15:09:25 +02:00
6c6302817d Resolve Python logger warnings (#38183)
* Resolve Python logger warnings

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

* Apply style fixes

---------

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-19 12:53:07 +00:00
003deb16f1 Support for transformers explicit filename (#38152)
* Support for transformers explicit filename

* Tests

* Rerun tests
2025-05-19 14:33:47 +02:00
dbb9813dff [generation] Less verbose warnings by default (#38179)
* tmp commit (imports broken)

* working version; update tests

* remove line break

* shorter msg

* dola checks need num_beams=1; other minor PR comments

* update early trainer failing on bad gen config

* make fixup

* test msg
2025-05-19 10:03:37 +00:00
656e2eab3f Add adam_kwargs for Apollo Optimizer (#38168)
Add adam_kwargs for Apollo
2025-05-19 08:59:49 +00:00
6bb6821d93 Refactor get_XXX_dataloader from Trainer (#38090)
* Remove test_dataloader

* refactor
2025-05-19 10:43:27 +02:00
40a493c7ed [tests] remove test_sdpa_equivalence (redundant) (#37911)
* rm test_sdpa_equivalence

* make fixup

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-16 18:37:27 +01:00
ea29f61ed9 fix bug in distributed loss test (#38166)
* fix bug in distributed loss test and change some config to pass at both 2&8 gpus

* fix doc
2025-05-16 16:21:35 +00:00
a4389494c7 Fix import torchao.prototype.low_bit_optim since torchao v0.11 (#38174)
* Fix ModuleNotFoundError torchao.prototype.low_bit_optim since torchao v0.11.0

* Fix space on blank line

* update torchao's AdamW4bit and AdamW8bit import for v0.11.0

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-16 18:02:33 +02:00
0ba95564b7 Add args support for fast image processors (#37018)
* add args support to fast image processors

* add comment for clarity

* fix-copies

* Handle child class args passed as both args or kwargs in call and preprocess functions

* revert support args passed as kwargs in overwritten preprocess

* fix image processor errors
2025-05-16 12:01:46 -04:00
d69945e5fc [ESM] Add flash-attention-2 backend for ESM-2 (#38023)
* Add flash-attention-2 backend for ESM-2

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

* update extended_attention_mask for fa2

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

* add test_flash_attn_2_equivalence test

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

---------

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
2025-05-16 14:11:56 +01:00
7b5e327c6e Feat: add warnings for unused keys and rules in tensor parallel (#37893)
Feat: tensor parallel plan verification
2025-05-16 14:52:47 +02:00
120935234f remove some commands from fetch_tests CircleCI job (#38176)
delete

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-16 14:42:50 +02:00
91f6fa00f4 Disable convert to draft workflow (#38177)
delete

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-16 14:42:14 +02:00
5036ec8872 Disable Trigger CircleCI by ready for review (#38171)
delete

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-16 14:02:48 +02:00
7f28da2850 clean autoawq cases on xpu (#38163)
* clean autoawq cases on xpu

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-16 13:56:43 +02:00
01ad9f4b49 Bart: new cache format (#35314)
* bart compile

* add mbart

* some more models touched by fix-copies

* more

* more models

* even more models

* fix copies

* fix tests

* fix copies

* fix

* biogpt accepts position ids now (breaking?)

* fix failing non-slow tests

* fix some tests

* should not be removed

* small update

* Update src/transformers/models/bart/modeling_bart.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* update for last `main`

* fix copies

* clone `update_causal_mask` from llama

* tmp

* fixup

* why? how?

* fix bart tests

* don't skip test

* address comments

* fix tests

* fix

* fixup and delete the file

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-05-16 13:26:54 +02:00
3ab47b6ce3 [VLMs] add helpers to get multimodal encodings (#37743)
* add helpers in VLMs

* fix tests and copies

* fix blip tests

* make fix-copies

* fix copies

* fixup
2025-05-16 13:20:10 +02:00
1e921a3a9c Add optional RMSNorm support to BitNet quantization (config + layers) (#38087)
* enable optional RMS in BitLinear

* Fix naming

* Import RMS from Llama using config.*

* make fix-copies

* ran CI loop

* remove default BitNetQuantConfig values

* Fix BitNetQuantConfig to be Optional

* Fix config docstrings to match Optional

* Edit docstrings to match standards

---------

Co-authored-by: steinmetzc <codysteinmetz7@gmail.com>
Co-authored-by: codys12 <steinmetzc@dh-mgmt4.hpc.msoe.edu>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-05-16 12:38:06 +02:00
57a79f51b2 Fix Qwen2.5 Omni SinusoidsPositionEmbedding precision (#38151)
* Fix Qwen2.5 Omni `SinusoidsPositionEmbedding` precision

fixes https://github.com/QwenLM/Qwen2.5-Omni/issues/271

* Update modular_qwen2_5_omni.py
2025-05-16 12:24:50 +02:00
44fa04ae8d Include output embedding as well with include_embedding flag (#37935)
* Include output embedding as well with `include_embedding` flag

Summary:
att

Test Plan:
python tests/quantization/torchao_integration/test_torchao.py -k test_include_embedding

Reviewers:

Subscribers:

Tasks:

Tags:

* format

* rename include_embedding to include_input_output_embeddings

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-05-16 12:06:11 +02:00
34c1e29cdd enable autoround cases on XPU (#38167)
* enable autoround cases on XPU

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-16 09:08:35 +00:00
0f77ca72ca [FIX] Save speed metrics to logs (#38136)
Previously, we calculated speed metrics and did not do anything with the result.
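
For context, a hedged sketch of surfacing the result via the public speed_metrics helper (the actual Trainer wiring may differ):

```python
import time

from transformers.trainer_utils import speed_metrics

start = time.time()
num_samples = 1024  # hypothetical: samples processed during the measured phase
# ... run evaluation here ...
metrics = speed_metrics("eval", start, num_samples=num_samples)
# the fix amounts to actually reporting these, e.g. merging them into the
# metrics dict that gets logged, instead of discarding the return value
print(metrics)  # {'eval_runtime': ..., 'eval_samples_per_second': ...}
```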
2025-05-15 16:58:50 +02:00
27ef46e846 Omit creation of positional IDs within ESM if applicable (#38089)
* omit pos emb creation

* rft

---------

Co-authored-by: sgottreich <sgottreich@absci.com>
2025-05-15 14:09:21 +00:00
fe9426f12d disable deepspeed when setting up fake trainer (#38101)
* disable deepspeed when setting up fake trainer

* Apply style fixes

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-15 15:34:04 +02:00
7caa57e85e enable trainer test cases on xpu (#38138)
* enable trainer test cases on xpu

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-15 12:17:44 +00:00
b11b28cc4e Hotfix: Flash Attention 2 support in Pixtral (#38146)
setting attention_mask to None when flash_attention_2 is selected
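
A hedged sketch of the guard (not the literal Pixtral code): flash-attention-2 manages padding itself, so the prepared mask must not be forwarded to that backend.

```python
def adjust_attention_mask(attention_mask, attn_implementation: str):
    # flash-attention-2 handles (un)padding internally and cannot consume
    # the mask prepared for the other backends, so drop it here
    if attn_implementation == "flash_attention_2":
        return None
    return attention_mask
```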

Co-authored-by: aurelien.lac <aurelien.lac@lighton.ai>
2025-05-15 11:45:35 +02:00
0e0e5c1044 [generate] Run custom generation code from the Hub (#36405)
* mvp

* remove trust_remote_code

* generate_from_hub

* handle requirements; docs

* english

* doc PR suggestions

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* changed remote code path to generate/generate.py

* model repo has custom generate -> override base generate

* check for proper inheritance

* some doc updates (missing: tag-related docs)

* update docs to model repo

* nit

* nit

* nits

* Update src/transformers/dynamic_module_utils.py

* Apply suggestions from code review

* Update docs/source/en/generation_strategies.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* trust remote code is required

* use new import utils for requirements version parsing

* use  org examples

* add tests

* Apply suggestions from code review

Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com>

* ascii file structure; tag instructions on readme.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Manuel de Prada Corral <6536835+manueldeprada@users.noreply.github.com>
2025-05-15 10:35:54 +01:00
955e61b0da Remove head mask in generative models (#35786)
* just squash into one commit

* delete print
2025-05-15 10:44:19 +02:00
0173a99e73 enable csm integration cases on xpu, all passed (#38140)
* enable csm test cases on XPU, all passed

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-15 09:46:29 +02:00
e5a48785d9 [Qwen3] Qwen3 MoE add tp plan for expert mlps (#38135)
fix tp plan
2025-05-15 09:12:39 +02:00
4005e30c80 Fix incorrect attention mask truncate in WhisperFlashAttention2 (#36477)
* Fix incorrect attention mask truncate in whisper flash attention

* also fix incorrect attention mask truncate in qwen2 audio

* Nit attention mask truncate modeling_qwen2_audio.py

* Nit attention mask truncate modeling_whisper.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-05-14 20:08:31 +00:00
aa27fa75cd enable d_fine finetuning properly (#37962)
add pre_output in the front

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-14 16:53:04 +01:00
e021bf6bf8 Add manueldeprada to run_slow whitelist (#38126)
Add manueldeprada to run_slow allowed users
2025-05-14 15:16:58 +02:00
ef27b2bc22 [docs] add uv installation instructions for source builds (#37968) 2025-05-14 13:09:41 +00:00
4a2decd192 Update trainer.md (#38113)
Fix typo in torch.compile method parameters
2025-05-14 12:40:00 +00:00
935bbbc711 Add config validation and style tweaks (#37589)
* Add config validation and style tweaks

* Fix style issues

* Fix style issues

* style

* Small fixes for copy/paste errors

---------

Co-authored-by: Cyrile <cyrile.delestre@arkea.com>
2025-05-14 12:22:10 +00:00
1b00966395 Fix auto batch size finder test (#38125)
Ensure --auto_find_batch_size is the last test arg so indexing is correct
2025-05-14 12:12:04 +00:00
fe918d13b9 Fix temporal padding in Qwen2VLImageProcessor when the number of frames is not divisible by temporal_patch_size (#38076)
Qwen2VL: Fix temporal padding in Qwen2VLImageProcessor when frames are not divisible by temporal_patch_size
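
A minimal sketch of the padding rule (illustrative; the exact processor code may differ): repeat the last frame until the frame count is divisible by temporal_patch_size.

```python
import numpy as np

def pad_temporal(frames: np.ndarray, temporal_patch_size: int) -> np.ndarray:
    """frames: (num_frames, height, width, channels)."""
    remainder = frames.shape[0] % temporal_patch_size
    if remainder:
        # repeat the last frame to reach the next multiple
        pad = np.repeat(frames[-1:], temporal_patch_size - remainder, axis=0)
        frames = np.concatenate([frames, pad], axis=0)
    return frames
```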
2025-05-14 12:28:21 +02:00
aaf224d570 [video processor] fix tests (#38104)
* fix tests

* delete

* fix one more test

* fix qwen + some tests are failing irrespective of `VideoProcessor`

* delete file
2025-05-14 10:24:07 +00:00
9b5ce556aa enable finegrained_fp8 and granite_speech cases on XPU (#38036)
* enable finegrained_fp8 cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* change back to auto

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* rename per comments

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-05-14 08:58:40 +00:00
b311a3f506 Fix description and formatting errors in code docs (#38074)
* Update stopping_criteria.py

Fix description and formatting errors.

* Update stopping_criteria.py

Align formatting with existing files for consistency.
2025-05-13 17:17:15 +00:00
b499a14b17 Add style bot (#38102)
add style bot
2025-05-13 19:07:17 +02:00
e0f225cb10 [CSM] update test for t4 runners (#38110)
update test for t4 runners
2025-05-13 11:59:26 -04:00
342961f669 Add Fast Image Processor for vilt (#37304)
* init vilt image processor fast

* Refactor image processor tests to use loop for all processors

* Add ViltImageProcessorFast with PyTorch-based optimized image processing

* Change made automatically by make fixup command

* Change made automatically by make fix-copies command

* Fix type hints in ViltImageProcessorFast for Python compatibility

* Define constants for image resizing based on COCO dataset aspect ratio

* Add missing property initializations to ViltImageProcessorFast

* Extract resize logic into dedicated method in ViltImageProcessorFast

* Extract padding logic into dedicated method

* Implement shape-based image grouping for optimized processing in Vilt

* Update test suite to verify ViltImageProcessorFast attributes

* Move variable declarations to _preprocess method parameters

* Remove unused parameters

* Rename _resize method to resize to override existing function

* Remove whitespace

* Remove unnecessary type check and conversion for stacked_images

* Remove redundant loop and apply padding directly to stacked images

* Refactor pad function to return images and mask as tuple instead of dict

* Add tests comparing padding masks in slow and fast implementations

* Update ViltImageProcessor tests to ensure compatibility between slow and fast implementations

* Replace add_start_docstrings with auto_docstring in ViltImageProcessorFast

* Move docstrings of custom args to ViltFastImageProcessorKwargs

* Use reorder_images function for both masks and images

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-13 15:40:53 +00:00
8771766a70 Fix InternVL interpolate_pos_encoding and add to video_processing_auto (#38092)
* fix InternVL interpolate_pos_encoding

* fix modular and auto_video_processor for internvl
2025-05-13 11:18:40 -04:00
582d5e0e11 fix check_bad commit.py gives wrong results (#38107)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-13 16:58:22 +02:00
a5cc7a67d7 [bug] fix llava processor to calculate unpadding size correctly (#37988)
* fix llava processor to calculate unpad size correctly

* repo consistency

* Revert "repo consistency" & "setUp in llava family"

This reverts commit 26a50af8db5b15bb6b700db3d53342fe69579d8e.

* add edge case test for padding & unpadding

* compute unpadding size from original size

* make test config explicit

* Revert "compute unpadding size from original size"

This reverts commit 752cd27ad9710ab056c17a9986760c4651975540.

* Revert "add edge case test for padding & unpadding"

This reverts commit ccbd094d69c3f8f6a259159164284f60ba835bce.

* revert unpad logic

* remove irrelevant tests

* model test

* remove processor from model test

---------

Co-authored-by: jaycha <jaycha@ncsoft.com>
2025-05-13 13:49:09 +00:00
67b3d45eb6 Fix past_key_values type hint in model output types (#37953)
* F: Fix type hint.

* F: Use Cache type.

* F: Sort import.

* U: Format.

* U: Address reviews.
2025-05-13 13:36:49 +00:00
07feaad8fb Fix bug in prefill_chunk_size that ignores disable_compile flag (#38067)
Fix bug in prefill_chunk_size implementation that ignores disable_compile flag
2025-05-13 13:23:23 +00:00
e40f301f1f [smolvlm] skip the test (#38099)
skip the test
2025-05-13 12:50:43 +00:00
e27d230ddd Disable report callbacks for certain training tests (#38088)
* Disable report callbacks for certain training tests

* Disable report callbacks for test_auto_batch_size_finder
2025-05-13 14:49:55 +02:00
ab65ba47ad fix: Propagate lr_scheduler_kwargs options to create LR Scheduler when LayerWiseDummyOptimizer is used (#34559)
fix: fix get_scheduler
2025-05-13 13:56:45 +02:00
8fb60bf6be add timeout for downloading the librispeech_asr dataset (#38073)
* add timeout

* change 10 to 60
2025-05-13 11:50:12 +01:00
3ad35d0bca update require_read_token (#38093)
* update require_read_token

* new repo

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-13 12:07:07 +02:00
e3b70b0d1c Refactor image processor phi4 (#36976)
* refactor image processor phi4

* nits fast image proc

* add image tests phi4

* Fix image processing tests

* update integration tests

* remove revision and add comment in integration tests
2025-05-12 15:13:40 -04:00
4143f94d51 uninstall kernels from docker images (#38083)
uninstall kernels

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-12 18:03:47 +02:00
a63cb7578e update seed_worker to set seed based on worker_id and rank (#37980)
* update seed_worker to set seed based on worker_id and rank (see the sketch after this list)

* test case

* set output_dir as remove tmp dir
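
A minimal sketch of the idea (names hypothetical): fold the worker id and the process rank into the seed so no two dataloader workers across ranks share one.

```python
import random

import numpy as np
import torch

def seed_worker(worker_id: int, num_workers: int, rank: int) -> None:
    # distinct, reproducible seed for every (rank, worker) pair
    base_seed = torch.initial_seed() % 2**32
    worker_seed = (base_seed + rank * num_workers + worker_id) % 2**32
    random.seed(worker_seed)
    np.random.seed(worker_seed)
```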
2025-05-12 15:59:16 +00:00
e387821a96 Fix tot update in trainer (#37923)
* fix total updates in epoch

* add test; fix max_steps

* replace with multi-gpu decorator
2025-05-12 17:45:24 +02:00
f0e975c6cf fix the inconsistent docstring in apply_chat_template (#38069)
The commit (5cf11e5ab9) fixed the type hints for the parameter `tools` in apply_chat_template, but the docstring was not changed.
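
For reference, `tools` accepts JSON tool schemas or plain Python functions (converted from their signatures and docstrings); a hedged usage sketch with an arbitrary chat model:

```python
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather in a city.

    Args:
        city: The name of the city.
    """
    ...

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
messages = [{"role": "user", "content": "What's the weather in Paris?"}]
prompt = tokenizer.apply_chat_template(
    messages, tools=[get_weather], tokenize=False, add_generation_prompt=True
)
```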
2025-05-12 16:32:01 +01:00
31791b16a1 chore(qwen2): display warning log only when sliding window attention … (#36316)
* chore(qwen2): display warning log only when sliding window attention is enabled

* Align modeling_qwen2.py and modular_qwen2.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-05-12 16:31:44 +01:00
8ea72d12a2 Fix mt5 test on AMD devices (#38081) 2025-05-12 16:59:00 +02:00
5c85018072 docs: fix md style (#38057) 2025-05-12 15:56:31 +01:00
7eaa90b87b Add AMD expectation to test_gpt2_sample (#38079) 2025-05-12 16:51:21 +02:00
4220039b29 Fix OneFormer integration test (#38016)
* Fix integration tests

* format
2025-05-12 16:02:41 +02:00
8efe3a9d77 [chat] generate parameterization powered by GenerationConfig and UX-related changes (#38047)
* accept arbitrary kwargs

* move user commands to a separate fn

* work with generation config files

* rm cmmt

* docs

* base generate flag doc section

* nits

* nits

* nits

* no <br>

* better basic args description
2025-05-12 14:04:41 +01:00
a5c6172c81 [VLM] fix loading issues (#38051)
* fix qwen2-vl loading

* fix a few nore models

* delete print

* fix copies
2025-05-12 10:14:04 +00:00
a31fa218ad 🔴 Video processors as a separate class (#35206)
* initial design

* update all video processors

* add tests

* need to add qwen2-vl (not tested yet)

* add qwen2-vl in auto map

* fix copies

* isort

* resolve conflicts, kinda

* nit:

* qwen2-vl is happy now

* qwen2-5 happy

* other models are happy

* fix copies

* fix tests

* add docs

* CI green now?

* add more tests

* even more changes + tests

* doc builder fail

* nit

* Update src/transformers/models/auto/processing_auto.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* small update

* imports correctly

* dump, otherwise this is getting unmanageable T-T

* dump

* update

* another update

* update

* tests

* move

* modular

* docs

* test

* another update

* init

* remove flakiness in tests

* fixup

* clean up and remove commented lines

* docs

* skip this one!

* last fix after rebasing

* run fixup

* delete slow files

* remove unnecessary tests + clean up a bit

* small fixes

* fix tests

* more updates

* docs

* fix tests

* update

* style

* fix qwen2-5-vl

* fixup

* fixup

* unflatten batch when preparing

* dump, come back soon

* add docs and fix some tests

* how to guard this with new dummies?

* chat templates in qwen

* address some comments

* remove `Fast` suffix

* fixup

* oops should be imported from transforms

* typo in requires dummies

* new model added with video support

* fixup once more

* last fixup I hope

* revert image processor name + comments

* oh, this is why fetch test is failing

* fix tests

* fix more tests

* fixup

* add new models: internvl, smolvlm

* update docs

* import once

* fix failing tests

* do we need to guard it here again, why?

* new model was added, update it

* remove testcase from tester

* fix tests

* make style

* unrelated CI failure, let's just fix it here

* mark flaky for now, fails 15 out of 100

* style

* maybe we can do this way?

* don't download images in setup class

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-12 11:55:51 +02:00
716819b830 fix(conversion): Fix size mismatch error during TF->PT model loading (#38014) 2025-05-10 11:11:07 +00:00
8f08318769 enable generation fsdp/utils cases on XPU (#38009)
* enable generation fsdp/utils test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* xx

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* use backend_xx APIs

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-09 20:52:41 +00:00
87e971e14d Fix linalg.norm for ConvNextV2 (#38015)
Fix norm
2025-05-09 17:44:28 +01:00
aaed2f5577 Fix cache update! (#38046)
* fix slicing

* better fix
2025-05-09 17:54:48 +02:00
7f1a97bae3 Fix reduce-labels in BEIT Fast Image Processor (#38042)
* Fixed reduce-labels

* Little doc fix

* Change docstring
2025-05-09 11:51:46 -04:00
9f9020fed3 Re-Enable Trigger CircleCI via GitHub Actions when "ready for review" (#37885) (#38041)
* check actions

* trigger CI

* check actions

* finally

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 16:57:54 +02:00
23d79cea75 Support for version spec in requires & arbitrary mismatching depths across folders (#37854)
* Support for version spec in requires & arbitrary mismatching depths

* Quality

* Testing
2025-05-09 15:26:27 +02:00
774dc274ac Do not erase a cache_position passed explicitly to generate(), if there is one (#37986)
Do not erase a cache_position initialization passed explicitly to generate(), if there is one.

But: Let initialization replace cache_position if it's set to None. I assume that if the value is explicitly passed but None, we should initialize anyway.
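
A hedged sketch of the preserved behavior (cache_position is normally managed internally; passing it explicitly is an advanced use and the exact semantics may differ):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any causal LM
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello", return_tensors="pt")
# one position index per prompt token; generate() now keeps this value
# instead of silently re-initializing it (None still triggers initialization)
out = model.generate(
    **inputs,
    cache_position=torch.arange(inputs["input_ids"].shape[1]),
    max_new_tokens=5,
)
```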
2025-05-09 10:56:21 +00:00
0010b41524 Disable Trigger CircleCI via GitHub Actions when `ready for review` (#38038)
disable

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 12:27:53 +02:00
d498528800 Trigger CircleCI via GitHub Actions when ready for review (#37885)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:45:03 +02:00
66e696ee15 [Temporary] Log some information in some pytest/pluggy internal places (#37996)
log pytest info

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 11:06:37 +02:00
a72cb31434 enable utils test cases on XPU (#38005)
* enable utils test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* Update tests/utils/test_skip_decorators.py

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* fix comment

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-09 08:45:01 +02:00
1dfad4beb2 make mistral3 pass on xpu (#37882)
* enabled mistral3 test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* calibrate A100 expectation

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

* update

* update

* update

* update

* update

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-09 06:41:11 +00:00
121f7037c7 fix document masking for chunked attention (#37429)
* fix document masking for chunked attention

* remove accidental debugging sum
2025-05-09 08:22:00 +02:00
5f5ccfdc54 [AutoDocstring] Based on inspect parsing of the signature (#33771)
* delete common docstring

* nit

* updates

* push

* fixup

* move stuff around fixup

* no need for dataclass

* damn nice modular

* add auto class docstring

* style

* modular update

* import autodocstring

* fixup

* maybe add original doc!

* more cleanup

* remove class doc as well

* update

* nits

* more cleanup

* fix

* wups

* small check

* updates

* some fixes

* fix doc

* update

* nits

* try?

* nit

* some updates

* a little bit better

* wherever we did not have help, we are not really adding it!

* revert llama config

* small fixes and small tests

* test

* fixup

* more fix-copies

* updates

* updates

* fix doc building

* style

* small fixes

* nits

* fix-copies

* fix merge issues faster

* fix merge conf

* nits jamba

* ?

* working autodoc for model class and forward except returns and example

* support return section and unpack kwargs description

* nits and cleanup

* fix-copies

* fix-copies

* nits

* Add support for llava-like models

* fixup

* add class args subset support

* add examples inferred from automodel/pipelines

* update ruff

* autodocstring for Aria, Albert + fixups

* Fix empty return blocks

* fix copies

* fix copies

* add autodoc for all fast image processors + align, altclip

* fix copies

* add auto_doc for audio_spectrogram, auto_former, bark, bamba

* Drastically improve speed + add bart beit bert

* add autodoc to all bert-like models

* Fix broken doc

* fix copies

* fix auto_docstring after merge

* add autodoc to models

* add models

* add models

* add models and improve support for optional, and custom shape in args docstring

* update fast image processors

* refactor auto_method_docstring in args_doc

* add models and fix docstring parsing

* add models

* add models

* remove debugging

* add models

* add fix_auto_docstrings and improve args_docs

* add support for additional_info in args docstring

* refactor (almost) all models

* fix check docstring

* fix -copies

* fill in all missing docstrings

* fix copies

* fix qwen3 moe docstring

* add documentation

* add back labels

* update docs and fix can_return_tuple in modular files

* fix LongformerForMaskedLM docstring

* add auto_docstring to _toctree

* remove auto_docstring tests temporarily

* fix copyrights new files

* fix can_return_tuple granite hybrid

* fix fast beit

* Fix empty config doc

* add support for COMMON_CUSTOM_ARGS in check_docstrings and add missing models

* fix code block not closed flava

* fix can_return_tuple sam hq

* Fix Flaubert dataclass

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-08 17:46:07 -04:00
d231f5a7d4 update bnb tests (#38011)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-05-08 20:35:24 +00:00
b3db4ddb22 enable mamba2 integration cases on xpu (#38006)
* enable mamba2 integration cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-08 19:48:09 +00:00
c7c2f08994 make test_speculative_decoding_non_distil device-agnostic (#38010)
* make device-agnostic

* use condition

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-08 19:19:47 +00:00
d23aae2b8c [VLMs] support attention backends (#37576)
* update models

* why rename

* return attn weights when sdpa

* fixes

* fix attn implementation composite

* fix moshi

* add message

* add typings

* use explicitly all flags for each attn type

* fix some tests

* import what is needed

* kosmos on main has new attention already, yay

* new models in main, run fixup

* won't fix kosmos yet

* fix-copies

* clean up after rebasing

* fix tests

* style

* don't cast attns to fp32

* did we update ruff? OK, let's just do what it asks

* fix pixtral after rebase
2025-05-08 18:18:54 +02:00
e296c63cd4 Fix wording in torchscript.md (#38004)
Fix wording in torchscript.md
2025-05-08 16:47:45 +01:00
1c65aef923 Fix incorrect installation instructions (for issue #37476) (#37640)
* debugging issue 36758

* debugging issue 36758

* debugging issue 36758

* updated attn_mask type specification in _flash_attention_forward

* removed pdb

* added a blank line

* removed indentation

* update constants

* remove unnecessary files

* created installation script, modified README

* modified requirements and install.sh

* undo irrelevant changes

* removed blank line

* fixing installation guide

* modified README, python requirements, and install script

* removed tests_output

* modified README

* discarded installation script and python<3.13 requirement
2025-05-08 16:32:58 +01:00
f2909e024c Skip test_push_to_hub_with_saves_each_epoch for now (#38022)
* update

* trigger CI

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 16:26:24 +02:00
f2b59c6173 [caches] Raise exception on offloaded static caches + multi device (#37974)
* skip tests on >1 gpu

* add todo
2025-05-08 14:37:36 +01:00
4279057d70 [CI] remove duplicated message on GH comment to run slow tests (#37970)
duplicated msg
2025-05-08 14:35:54 +01:00
3390534f36 Print commit SHA on slack message for new model notification. (#38019)
add commit info

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-08 15:26:19 +02:00
9f8fffed3c Fix Optional typing (#38018)
* Fix

* trigger
2025-05-08 14:51:45 +02:00
06c16de3d3 Enable RUF013 to enforce optional typing (#37266)
* Enable RUF013 for Optional typing

Signed-off-by: cyy <cyyever@outlook.com>

* Add Optional to types

* Format code

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-05-08 12:39:56 +02:00
f6664ee713 Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model (#37960)
* Add ALL_ATTENTION_FUNCTIONS compatibility for Pixtral model

* Fix invalid operand type

* Allow image_sizes to be optional in forward pass to fit tests

Disallow using sdpa and output_attentions

* Disallow using sdpa with output_attentions

* Delete useless comments, use eager attention from smolvlm, use pattern from mistral

* add _supports_attention_backend

* use kwargs instead of position_ids

---------

Co-authored-by: aurelien.lac <aurelien.lac@lighton.ai>
2025-05-08 12:13:13 +02:00
015b6dfbf8 Fix pad image transform for batched inputs (#37544)
* fix

* add batch dimension to expected output
2025-05-08 10:51:15 +01:00
5c47d08b0d Add Swin2SR ImageProcessorFast (#37169)
* Add fast image processor support for Swin2SR

* Add Swin2SR tests of fast image processing

* Update docs and remove unnecessary test func

* Fix docstring formatting

* Skip fast vs slow processing test

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-07 12:20:16 -04:00
17742bd9c8 🔴 [VLM] Add base model without head (#37033)
* I guess I reverted all CdGen classes

* style

* llava onevision

* fix copies

* fix some tests

* some more tests

* dump

* skip these

* nevermind, i am dumb

* revert fix not needed

* fixup

* fixup

* another fixup

* more fixup to make ci finally happy

* fixup after rebasing

* fix qwen tests

* add internVL + typos here and there

* image token index -> id

* style

* fix init weights

* revert blip-2 not supported

* address comments

* fix copies

* revert blip2 test file as well

* as discussed internally, revert back CdGen models

* fix some tests

* fix more tests for compile

* CI red

* fix copies

* enumerate explicitly allowed models

* address comments

* fix tests

* fixup

* style again

* add tests for new model class

* another fixup ( x _ x )

* [fixup] unused attributes can be removed post-deprecation
2025-05-07 17:47:51 +02:00
3fa8d9c20e [CSM] tiny fix on generation (#38001)
nit
2025-05-07 11:45:23 -04:00
798f948e88 Add CSM model (#36719)
* draft structure

* depth decoder with forward pre hook

* full model forward draft

* draft update

* depth decoder update

* ConversationalSpeechModelForCausalLM updates

* add generate

* max length criteria small fix

* update

* updates

* generation update

* update in loss compute

* conversion script

* update for correct input embeddings

* handle interleaved rope

* update

* update

* update

* support compile

* update training

* add doc

* update doc

* correct inits

* ConversationalSpeechModel -> Csm

* conf update

* name update

* tests CsmForCausalLMTest

* convert use cached_file

* conf + modeling updates

* generate utils handle third dim shape

* integration test

* modeling + conf updates

* common test handle more than 2 dims

* add nested audio list utils

* processing handle nested audio list

* csm processing draft

* mimi util

* init updates

* modular update

* convert modular

* processing update

* csm tests update

* generate tests handle third dim

* generate utils handle third dim

* propagate _get_initial_cache_position update

* tied_weight_keys update + convert correctly

* fix inputs_embeds

* revert audio nested list

* batch inference update + return audio

* audio_utils update

* processor update

* some more integration tests

* remove old test

* processing output labels

* improve

* fix

* update rope values with equivalent ones

* conversion update

* update tests

* handle depth decoder generation config

* remove default eos_token_id

* make style

* revert modeling_mimi

* add default generation_config

* remove sdpa since handled by default

* make

* fix conflict

* fix conflicts

* correct naming

* correct imports

* make

* causal -> conditional naming

* causal -> conditional naming

* auto update

* make

* make

* add doc

* test update

* fix weight init

* audio tokens offsets as buffer

* 4d mask in conditional class

* make

* doc update

* fix causal mask

* fix causal mask

* doc update

* doc update

* add processor doc

* update doc

* fix 4d causal mask

* update make_list_of_audio

* do not default to mutable

* remove duplicates

* remove useless reset_parameters

* use GradientCheckpointingLayer

* use can_return_tuple

* formatting

* prepend placeholder in _sample

* torch compile fix

* some more fixies

* convert modular

* fix

* default max_length in convert

* handle depth decoder generation config correctly

* clearer formulation

* handle output_loading_info

* handle softmax warning

* add doc

* propagate _get_initial_cache_position changes

* generation in its own module

* add processor tests

* fix compile with cuda graphs

* fix compile with cuda graphs

* add csm.md

* include CSM loss

* doc nit

* doc nit

* doc nit

* Update docs/source/en/model_doc/csm.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add save_audio to processor

* Update src/transformers/models/csm/modular_csm.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* doc update

* simplify audio_codes_mask computation

* doc update

* simplify loss computation

* fix static cache test

* fix

* remove comment

* simplify encoded length computation

* use hf-internal-testing

* doc update

* cast to float before numpy

* nit

* mem efficient codebook head

* nit

* cat input values with cutoffs

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-07 10:20:13 -04:00
c8607a17cb Add a check to import_utils.py to allow for use of faiss_gpu installation (#37997)
Adding check to import_utils.py for faiss_gpu
2025-05-07 14:27:41 +01:00
fb1e3a4daa remove duplicate code (#37991)
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-05-07 13:46:45 +01:00
8a9441d26d [chat template] separate jinja logic from tokenizers (#37602)
* split out jinja

* raise error
2025-05-07 14:18:03 +02:00
038f8fc159 make aya vision 5 integration tests pass on xpu (#37990)
* 5 aya vision integration pass on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-07 11:16:38 +02:00
a9384f849a [offload] respect max_memory argument when factoring in unused reserved memory (#37982) 2025-05-07 09:49:31 +01:00
0b037fd425 Fix Qwen models export with torch 2.7 (#37985)
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-05-07 09:13:08 +02:00
3c0796aaea [Fast Processor] BEiT (#37005)
* adding fast processor for beit

* adding resample

* address review issues and add segmentation maps logic

* style

* chore: adding tests

* reduce label test

* adding batched tests

* Update src/transformers/models/beit/image_processing_beit_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* fix imports and make segmentation masks

* fix tests

* build segmentation maps

* all tests pass

* style

* style fix

* style

* chore: delete demo.py file

* review suggestions

* Update docs/source/en/model_doc/beit.md

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-05-06 17:40:28 -04:00
ebbe9b12dd Fix donut backtracking (#37788)
* Fix donut backtracking

* make fixup

* Trigger tests

* Remove old line

* Update code

* Fix reversed slice
2025-05-06 17:39:04 +01:00
06c4d05fe6 Enable granite speech 3.3 tests (#37560)
* Enable granite speech 3.3 tests

* skip sdpa test for granite speech

* Explicitly move model to device

* Use granite speech 2b in tests

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-05-06 17:56:18 +02:00
031ef8802c fix FSDP + torch.compile bug when saving pretrained model (#37725)
* args keep_torch_compile=False in _save and _wrap_method

* Fix FSDP execution on evaluation for torch_compile mode

* add test trainer FSDP + Torch Compile

* fix quality code

* make style

* Revert " make style"

This reverts commit 77e797f8829c50992cc21496be3d9a3e480e1c97.

* make style
2025-05-06 17:51:28 +02:00
5534b80b7f enable xpu in test_trainer (#37774)
* enable xpu in test_trainer

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enhance _device_agnostic_dispatch to cover value

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* add default values for torch not available case

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-06 17:13:35 +02:00
7db5d5b9ea Fix typo (#37964) 2025-05-06 14:59:00 +01:00
af2866a8b1 [speech2text] fix init of sinusoidal embeddings (#37931)
* fix init (meta device -> bad numbers)

* fast test

* dont init sinusoidal twice

* make fixup
2025-05-06 14:49:00 +01:00
274e79b326 Fix typos (#37978)
fix typos
2025-05-06 14:45:20 +01:00
057ae00504 Small typo lines 47 and 199 perf_infer_gpu_one.md (#37938)
* Small typo line 199 perf_infer_gpu_one.md

* Typo l. 47 perf_infer_gpu_one.md
2025-05-06 14:32:55 +01:00
cc68070d41 fix docs serving typos. (#37936)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-05-06 14:32:44 +01:00
b1375177fc add job links to new model failure report (#37973)
* update for job link

* stye

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-06 15:10:29 +02:00
acded47fe7 [llava] one pixel is missing from padding when length is odd (#37819)
* [fix] one pixel should be added when length is odd

* [fix] add vision_aspect_ratio args & typo

* [fix] style

* [fix] do not fix fast file directly

* [fix] convert using modular

* remove duplicate codes

* match unpad logic with pad logic

* test odd-sized images for llava & aria

* test unpad odd-sized padding for llava family

* fix style

* add kwarg to onevision modular

* move vision_aspect_ratio from image_processor to processor
(llava_onevision)
2025-05-06 13:11:26 +02:00
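A hedged sketch of the off-by-one arithmetic behind this fix (the helper name is hypothetical): when the total padding is odd, one side must receive the extra pixel, and the unpad path has to mirror the same split or one pixel goes missing:

```
def pad_amounts(target: int, current: int) -> tuple[int, int]:
    # Split the total padding across the two sides; the extra pixel for an
    # odd total goes to the second side. Unpadding must mirror this split.
    total = target - current
    first = total // 2
    return first, total - first

print(pad_amounts(10, 7))  # (1, 2): the second side gets the extra pixel
```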
9981214d32 [tests] Smaller model in slow cache tests (#37922) 2025-05-06 11:15:25 +01:00
ff5ef95db7 add xpu memory check (#37969)
add xpu check
2025-05-06 11:57:49 +02:00
7cc78804ba 🚨🚨🚨 Fix forward of Dinov2ForImageClassification for models with registers (#37836)
* add num_tokens_to_discard to the forward of Dinov2ForImageClassification

* redefine forward in modular file, remove change to modeling_dinov2 file

* run make fixup

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-05-06 11:55:53 +02:00
471958b620 Add GraniteMoeHybrid support for 4.0 (#37658)
* initial config and MLA layer

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* first pass at decoder

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* completion of layers

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* modeling class

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* adding hybrid class to imports

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix imports granitemoehybrid

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix granitehybrid imports

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix granitehybrid import

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix generated modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* add some comments

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* minor fixes in layers

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* add sharedMLP layer

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* correct layer names

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fixes in mamba config

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix mamba config

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* change name of MLP layer

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix seq mixer layers

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* correct mamba config

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fixes in param names

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* enable hybrid model

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update config

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix config granite hybrid

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix attention layer

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* cleanup to re-use mamba code

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* keep layer types

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* attention bias cleanup

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update mamba layer name

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* first pass at tests

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* first pass at tests

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* use granite attention

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix: self attn weights

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* pass at making pos_emb optional

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* initialize self_attn only as needed

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* overwrite forward to create HybridMambaCache

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* Log invalid layer types

* Add attention outputs test

* Only emit attentions/logits if not None

* Fix config test hidden size divisibility

* mark granitemoehybrid as stateful

* Initialize mamba convolutional layers

* Formatting fixes

* config docstring, removed some unused attrs

* Fix missing arg in models test

* Fix create and check decoder model test

* support logits to keep in granitemoe

* regen to pass logits_to_keep

* Allow None or rope

* Fix gradient checkpointing

* Add granitemoehybrid as special cache for generate check

* Remove unused MLA refs

* Fix mamba layer mask

* Remove logits to keep from config

* Minor docstring nits

* Update licenses

* Enable cache by default

* map layer types to layer block type

* First pass at granite moe hybrid docs

* Ignore granite moe hybrid in valid checkpoint check

* Align attention interfaces

* regenerate modular granitemoeshared attention interface

* Align granite moe hybrid attn interface

* run formatting

* Handle mamba initialization

* avoid conditional attr defs

* Move hybrid layer validation to config

* Add placeholder integration tests

* Docs nits / Update model names

* Clean up forward conditions

* Use gradient checkpointing layer

* Remove some copied bamba tests + inherit

align test init

delete more tests

Use common layer init with bamba tests

finish test consolidation

* avoid redundant intermediate std var

* use @can_return_tuple

* Remove unused moe state

* make skipped test names consistent

* Fix docstring order

* Add missing toc

* Always create the shared mlp

* Fix name in docstring

* link preview model in docs

---------

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Alex-Brooks <Alex.Brooks@ibm.com>
2025-05-06 06:47:43 +02:00
fe29b8c487 [Ready to Merge][HFQuantizer] Squelch pydantic warnings (#37726)
replace dict with model_dump

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-05-05 20:38:49 +02:00
46c0e1ff80 Fix incorrect type annotation in get_auxiliary_logits (#37955)
Correct type annotation from Dict(str, Tensor) to Dict[str, Tensor]
2025-05-05 19:00:49 +01:00
d80f53fa50 [generate] Fix vocab_size access for multimodal models (#37937)
Implements the last migrations for generation from `config.vocab_size` to `config.get_text_config().vocab_size`

In doing so, we enable multimodal models to fully leverage all existing generation features.
2025-05-05 15:56:56 +01:00
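For context, a minimal sketch of the access pattern this migration standardizes: `get_text_config()` returns the text sub-config on composite (multimodal) configs and the config itself otherwise; the checkpoint name below is only illustrative:

```
from transformers import AutoConfig

config = AutoConfig.from_pretrained("llava-hf/llava-1.5-7b-hf")  # multimodal config

# Old pattern: assumes vocab_size lives on the top-level config, which breaks
# for composite configs where it lives on the text sub-config.
# vocab_size = config.vocab_size

# New pattern: works for plain text configs (get_text_config() returns the
# config itself) and for multimodal configs alike.
vocab_size = config.get_text_config().vocab_size
print(vocab_size)
```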
7819911b0c Use T4 single GPU runner with more CPU RAM (#37961)
larger T4 single GPU

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-05 16:17:45 +02:00
3b067a15dd [core] reuse unused reserved cuda memory when loading models (#37920) 2025-05-05 15:14:05 +01:00
afbc293e2b More fault tolerant notification service (#37924)
* Let notification service succeed even when artifacts and reported jobs on github have a mismatch

* Use default trace msg if no trace msg available

* Add pop_default helper fn

* style
2025-05-05 15:19:48 +02:00
36ca58bf4f [D-FINE] Update names (#37957)
* Update names

* Fix modular

---------

Co-authored-by: qubvel <qubvel@gmail.com>
2025-05-05 13:05:46 +01:00
2932f318a2 [docs] logits docstring (#37929) 2025-05-02 16:38:35 +01:00
fa3c3f9cab Break weight tying when quantizing input embedding (#37905)
Summary:
Currently when we try to quantize input_embedding for some models, the output embedding
(lm_head) will also be quantized the same way, since they are tied, and this may not be what
we want. To break the tie, we added the option to allow people to
1. load unquantized weight
2. tie weights
3. quantize

so that the tie will be broken

Test Plan:
```
from transformers import (
  AutoModelForCausalLM,
  AutoProcessor,
  AutoTokenizer,
  TorchAoConfig,
)
from torchao.quantization.quant_api import (
    IntxWeightOnlyConfig,
    Int8DynamicActivationIntxWeightConfig,
    AOPerModuleConfig
)
from torchao.quantization.granularity import PerGroup, PerAxis
import torch

model_id = "microsoft/Phi-4-mini-instruct"

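# int8 weight-only quantization, one scale per output channel (PerAxis(0)), for the embedding table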
embedding_config = IntxWeightOnlyConfig(
    weight_dtype=torch.int8,
    granularity=PerAxis(0),
)
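# dynamic int8 activations with int4 weights in groups of 32 for the linear layers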
linear_config = Int8DynamicActivationIntxWeightConfig(
    weight_dtype=torch.int4,
    weight_granularity=PerGroup(32),
    weight_scale_dtype=torch.bfloat16,
)
quant_config = AOPerModuleConfig({"_default": linear_config, "model.embed_tokens": embedding_config})
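# untie_embedding_weights=True breaks the embed_tokens/lm_head tie before quantizing,
# so the quantized embedding config does not carry over to lm_head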
quantization_config = TorchAoConfig(quant_type=quant_config, include_embedding=True, untie_embedding_weights=True)
quantized_model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32, device_map="auto", quantization_config=quantization_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print(quantized_model)
print("embed_tokens.weight:", quantized_model.model.embed_tokens.weight)
print("lm head weight:", quantized_model.lm_head.weight)
from transformers.modeling_utils import find_tied_parameters
print(find_tied_parameters(quantized_model))
```

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-05-02 10:53:23 +02:00
8a0a508f2b Aligning modling code for GPT2 to work with vLLM (fallback) (#36934)
* aligning for vllm

* using input shape rather than attn outputs

* remove demo

* revert Conv1D

* style

* style

* Update src/transformers/models/gpt2/modeling_gpt2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix copies

* Apply suggestions from code review

Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* adding docs about vllm

* chore: style

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-02 09:55:16 +02:00
e94a4807df Add usage example for DINOv2 (#37398)
* Add usage example for DINOv2

* More explicit shape names

* More verbose text

* Moved example to Notes section

* Indentation
2025-05-01 08:54:22 -07:00
d20aa68193 🌐 [i18n-KO] Translated gpu_selection.md to Korean (#36757)
* Add _toctree.yml

* feat: serving.md draft

* Add _toctree.yml

* feat: gpu_selection.md nmt draft

* fix: TOC edit

* Update docs/source/ko/serving.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/gpu_selection.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/serving.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-01 08:44:12 -07:00
ee25d57ed1 Improve performance of load_state_dict (#37902)
Improve performance of load_state_dict
2025-05-01 16:35:17 +02:00
410aa01901 [chat] clean code and add base help (#37892) 2025-05-01 15:12:18 +01:00
5b573bebb9 Fix typos in strings and comments (#37910) 2025-05-01 14:58:58 +01:00
c80f65265b 🚨 rm already deprecated pad_to_max_length arg (#37617)
* rm already deprecated padding max length

* truncate_strategy AS AN ARG is already deprecated for a few years

* fix

* rm test_padding_to_max_length

* rm pad_to_max_length=True in other tests

* rm from common

* missed fnet
2025-05-01 15:21:55 +02:00
7a3e208892 fixed gemma3 collection path pointing to llama 2 collection. (#37899) 2025-04-30 12:50:54 -07:00
86777b5e2f Support AOPerModuleConfig and include_embedding (#37802)
* Support `AOPerModuleConfig` and include_embedding

Summary:
This PR adds support per module configuration for torchao
Also added per module quantization examples:

1. Quantizing different layers with different quantization configs
2. Skip quantization for certain layers

Test Plan:
python tests/quantization/torchao_integration/test_torchao.py -k test_include_embedding
python tests/quantization/torchao_integration/test_torchao.py -k test_per_module_config_skip


* format

* format

* include embedding: remove input embedding from modules not to convert

* more docs

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_torchao.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-30 20:16:29 +02:00
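A minimal sketch of the per-module mapping described above, assuming (per the PR's skip example) that mapping a module name to `None` leaves it unquantized; the config classes are torchao's and the module path is illustrative:

```
from torchao.quantization.quant_api import AOPerModuleConfig, Int4WeightOnlyConfig
from transformers import AutoModelForCausalLM, TorchAoConfig

quant_config = AOPerModuleConfig({
    "_default": Int4WeightOnlyConfig(group_size=128),  # every module not listed below
    "model.layers.0.self_attn.q_proj": None,           # assumed to skip quantization here
})
quantization_config = TorchAoConfig(quant_type=quant_config)
quantized_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct", device_map="auto", quantization_config=quantization_config
)
```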
c3aeaa8060 Enhance documentation to explain chat-based few-shot prompting (#37828)
* Enhance documentation to explain chat-based few-shot prompting

Updates the documentation on few-shot prompting to illustrate how to structure examples using the chat-based format for instruction-tuned models.

* Update docs/source/en/tasks/prompting.md

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update docs/source/en/tasks/prompting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/prompting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/prompting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/prompting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix typos

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-30 11:00:10 -07:00
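To illustrate the structure this documents, a hedged sketch: each few-shot example becomes a prior user/assistant turn, so the chat template formats the demonstrations the way an instruction-tuned model expects; the checkpoint below is just an example:

```
from transformers import pipeline

messages = [
    {"role": "user", "content": "Extract the city: 'I flew to Paris last week.'"},
    {"role": "assistant", "content": "Paris"},
    {"role": "user", "content": "Extract the city: 'She moved to Tokyo in 2020.'"},
    {"role": "assistant", "content": "Tokyo"},
    {"role": "user", "content": "Extract the city: 'We are driving to Lisbon tomorrow.'"},
]

generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-1.7B-Instruct")
result = generator(messages, max_new_tokens=10)
print(result[0]["generated_text"][-1]["content"])  # expected to follow the pattern: "Lisbon"
```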
36e2e33bbe Fix Qwen3 tp plan with FP8 (#37871)
* update for qwen 3

* fix style

* rm print
2025-04-30 18:14:10 +02:00
8e8025b384 [tests] reset logs in torch.compile test (#37894) 2025-04-30 16:04:28 +01:00
1b222903c3 [tests] Test all cache implementations (#37873) 2025-04-30 15:37:00 +01:00
2c1155519f Support FlaxPreTrainedModel to load model checkpoint from local subfolder safetensors (#37732)
Support FlaxPreTrainedModel loading model checkpoints from a subfolder of a local directory in safetensors format

Signed-off-by: Yan Zhao <zhao.y4@northeastern.edu>
2025-04-30 16:13:23 +02:00
5b223bbc8c update comment in image_processing_base.py to reference image_process… (#37864)
update comment in image_processing_base.py to reference image_processing_utils_fast
2025-04-30 14:31:29 +01:00
0dffcb0967 Fix: reassign in qwen3 moe model (#37848)
* Fix: reassign in qwen3 moe model

Fix: reassign in qwen3 moe model

* Remove redundant assignment to self.mlp

* make fix-copies

* Revert unwanted style change

* Revert unwanted style change

---------

Co-authored-by: li.ding <int.li.ding@enflame-tech.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
2025-04-30 13:49:59 +01:00
6c5d374d56 uniformize kwargs for VisionTextDualEncoder (#34563)
* Make kwargs uniform for VisionTextDualEncoder

* Add bc for flipped args
2025-04-30 14:32:59 +02:00
4fc976779e Fix qwen2-vl-docs. (#37879)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-30 13:32:21 +01:00
4eb6acc896 make sure lr is not a tensor (#37881)
* make sure lr is not a tensor

* revert change from #37704

* clean up to reduce extra LoC

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-30 14:23:39 +02:00
7be92f9a94 fix error for _register_pytree_node in torch2.1.0 and fix bf16 assertion in xpu and npu (#37839)
* fix error for _register_pytree_node and bf16 assertion

* fix format

* update xpu available assert function
2025-04-30 14:22:53 +02:00
455c3a33b0 update Clean_up_tokenization_spaces typos. (#37865)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-30 13:04:49 +01:00
d538293f62 Transformers cli clean command (#37657)
* transformers-cli -> transformers

* Chat command works with positional argument

* update doc references to transformers-cli

* doc headers

* deepspeed

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
2025-04-30 12:15:43 +01:00
63cd4c76f3 Llama Guard updates (#37872)
* Unhardcode use_chunked_attention, fix no_rope_layers

* Go back to exhaustive list of bools

* Conversion and modeling updates

* Fix rope

* Unhardcode rope

* Fix context length

* style

* Minor updates to conversion

* Use StaticCache

* Minor simplification

* DynamicCache 🤦

* Style

* Style
2025-04-30 10:34:43 +02:00
34f26e2c3e enable internvl UTs on XPU (#37779)
* enable internvl UTs on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style per comments

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-04-30 10:29:40 +02:00
a57274466f Allow override inputs to export recipe (#37508)
Add option to specify dynamic shapes during export

Co-authored-by: Guang Yang <guangyang@fb.com>
2025-04-30 10:19:27 +02:00
481de7204c Skip is_flaky tests in the CI (#37723)
* No more red flaky tests in the CI!

* Remove the CircleCI logic as well

* Revert most changes including is_flaky behaviour

* make fixup

* Move to a more sensible place

* Mark a flaky test that failed on this PR!

* correct import

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-30 09:52:21 +02:00
5f8d17268c Update modeling_llama4.py (#37841)
* Update modeling_llama4.py

* Update modeling_llama4.py

* do not pass device

---------

Co-authored-by: raushan <raushan@huggingface.co>
2025-04-30 00:36:02 +02:00
50f8caaa48 🌐 [i18n-KO] Translated electra.md to Korean (#36763)
* docs: ko: electra.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits
2025-04-29 14:03:39 -07:00
91f3e9422f Add Intel Gaudi doc (#37855)
* Add Intel Gaudi doc

* Use "TIP" instead of "NOTE"

* Address comments from reviews
2025-04-29 13:28:06 -07:00
c34afa5957 Processor chat template: pass custom kwargs (#37852) 2025-04-29 21:22:10 +02:00
66ad8b2db0 docs: Details for ambiguous channel dimension assignment (#37600)
* docs: Details for ambiguous channel dimension inference

* Update src/transformers/image_utils.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-29 08:12:38 -07:00
096f25ae1f Fix Bitnet tokenizer in pipeline (#37861)
add tokenizer
2025-04-29 15:35:02 +02:00
da7ae467c4 Fix cache get item return type hints (#37847)
F: Fix cache return hints

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-29 14:23:52 +01:00
aa6b79db43 Fix check of unnecessary packages (issue #37626) (#37825)
* Fix check of unnecessary packages (issue #37626)

* Reformat using ruff

* And a condition to avoid the risk of matching a random object in `import_utils`

* Reformat
2025-04-29 14:21:05 +01:00
517367fe9a Revert change that breaks on Torch 2.1 (#37531)
* Revert change that breaks on Torch 2.1

* Add TODO

* Trigger tests

* Trigger tests
2025-04-29 13:27:09 +01:00
755b0fa2fe [tests] reorganize cache tests and clean memory between tests (#37684) 2025-04-29 12:21:14 +01:00
3a1acc36ed [tests] fix flaky pattern in test_generate_continue_from_past_key_values (#37724) 2025-04-29 12:20:42 +01:00
4abeb50f6e Add D-FINE Model into Transformers (#36261)
* copy the last changes from broken PR

* small format

* some fixes and refactoring after review

* format

* add config attr for loss

* some fixes and refactoring

* fix copies

* fix style

* add test for d-fine resnet

* fix decoder layer prop

* fix dummies

* format init

* remove extra print

* refactor modeling, move resnet into separate folder

* fix resnet config

* change resnet to hgnet_v2, add clamp into decoder

* fix init

* fix config doc

* fix init

* fix dummies

* fix config docs

* fix hgnet_v2 config typo

* format modular

* add image classification for hgnet, some refactoring

* format tests

* fix dummies

* fix init

* fix style

* fix init for hgnet v2

* fix index.md, add init range for hgnet

* fix conversion

* add missing attr to encoder

* add loss for d-fine, add additional output for rt-detr decoder

* tests and docs fixes

* fix rt_detr v2 conversion

* some fixes for loss and decoder output

* some fixes for loss

* small fix for converted modeling

* add n model config, some todo comments for modular

* convert script adjustments and fixes, small refact

* remove extra output for rt_detr

* make some outputs optional, fix conversion

* some post merge fixes

* small fix

* last field fix

* fix not split for hgnet_v2

* disable parallelism test for hgnet_v2 image classification

* skip multi gpu for d-fine

* adjust after merge init

* remove extra comment

* fix repo name references

* small fixes for tests

* Fix checkpoint path

* Fix consistency

* Fixing docs

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-29 12:17:55 +01:00
4602059aae [modular] Fix the prefix-based renaming if the old and new model share a common name suffix (#37829)
* first try

* Fix and set examples

* style

* fix

* Update modular_test_detr.py

* Update image_processing_new_imgproc_model.py

* Update modular_model_converter.py
2025-04-29 10:43:23 +02:00
a847d4aa6b Fast image processor for VitMatte added and bug in slow version fixed (#37616)
* added fast image processor for VitMatte including updated and new tests; fixed a bug in the slow image processor that processed images incorrectly for input format ChannelDimension.FIRST, in which case the trimaps were not added in the correct dimension; this bug was also reflected in the tests through incorrectly shaped trimaps being passed

* final edits for fast vitmatte image processor and tests

* final edits for fast vitmatte image processor and tests

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-28 14:51:50 -04:00
65e940208c Samhq model addition (#35147)
* added the configuration for sam_hq

* added the modelling for sam_hq

* added the sam hq mask decoder with hq features

* added the code for the samhq

* added the code for the samhq

* added the code for the samhq

* Delete src/transformers/models/sam_hq/modelling_sam_hq.py

* added the code for the samhq

* added the code for the samhq

* added the changes for the modelling

* added the code for sam hq for image processing

* added code for the sam hq model

* added the required changes

* added the changes

* added the key mappings for the sam hq

* adding the working code of samhq

* added the required files

* adding the pt object

* added the push to hub account

* added the args for the sam mask decoder

* added the args for the sam hq vision config

* added some more documentation

* removed the unnecessary spaces

* all required changes

* removed the image processor

* added the required file

* added the changes for the checkcopies

* added the code for modular file

* added the changes for the __init file

* added the code for the interm embeds

* added the code for sam hq

* added the changes for modular file

* added the test file

* added the changes required

* added the changes required

* added the code for the

* added the cl errors

* added the changes

* added the required changes

* added the some code

* added the code for the removing image processor

* added the test dimensions

* added the code for the removing extra used variables

* added the code for modular file hf_mlp for a better name

* removed abbreviation in core functionality

* removed abbreviation in core functionality

* .contiguous() method is often used to ensure that the tensor is stored in a contiguous block of memory

* added the code which is after make fixup

* added some tests for the intermediate embeddings

* added the code for the torch support in sam hq

* added the code for the updated modular file

* added the changes for documentations as mentioned

* removed the heading

* add the changes for the code

* first mentioned issue resolved

* added the changes code to processor

* added the easy loading to init file

* added the changes to code

* added the code to changes

* added the code to work

* added the code for sam hq

* added the code for sam hq

* added the code for the point pad value

* added the small test for the image embeddings and intermediate embedding

* added the code

* added the code

* added the code for the tests

* added the code

* added the code for the processor file

* added the code

* added the code

* added the code

* added the code

* added the code

* added the code for tests and some checks

* added some code

* added the code

* added the code

* added some code

* added some code

* added the changes for required

* added the code

* added the code

* added the code

* added the code

* added the code

* added the code

* added the code

* added the code

* added the code

* added the code

* added some changes

* added some changes

* removed spaces and quality checks

* added some code

* added some code

* added some code

* added code quality checks

* added the checks for quality checks

* added some code which fixes test_inference_mask_generation_no_point

* added code for the test_inference_mask_generation_one_point_one_bb

* added code for the test_inference_mask_generation_one_point_one_bb_zero

* added code for the test_inference_mask_generation_one_box

* added some code in modelling for testing

* added some code which sorts masks with high score

* added some code

* added some code

* added some code for the move KEYS_TO_MODIFY_MAPPING

* added some code for the  unsqueeze removal

* added some code for the  unsqueeze removal

* added some code

* added some code

* add some code

* added some code

* added some code

* changed some testing values

* added changes to code in sam hq for readability purposes

* added pre commit checks

* added the samvisionmodel fix for compatibility

* added the changes made on sam by cyyever

* fixed the tests for samhq

* added some code

* added some code related to init file issue during merge conflicts

* removed the merge conflicts

* added changes mentioned by Arthur and molbap

* added changes mentioned by Arthur and molbap

* solving quality checks

* added the changes for input clearly

* added the changes

* added changes in mask generation file regarding model inputs and sam hq kwargs in processor file

* added changes in processor file

* added the setUp -> setUpClass conversion

* added the code mentioned for processor

* added changes for the code

* added some code

* added some code

* added some code

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-04-28 19:07:09 +02:00
9c5b1319d0 [config] revert #37603 (#37821)
revert
2025-04-28 16:28:30 +02:00
9e730689c3 change XLA deprecated api (#37741)
* deprecated api

* fix
2025-04-28 16:27:41 +02:00
2933894985 Fix error of HPU TP (#37782)
* Fix error of HPU TP

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add the init distrubuted for hpu

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Fix error of make style

Signed-off-by: yuanwu <yuan.wu@intel.com>

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
2025-04-28 15:47:16 +02:00
da4ff2a5f5 Add Optional to remaining types (#37808)
More Optional typing

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-28 14:20:45 +01:00
1a9188a54e FIX: Faulty PEFT tests (#37757)
Two PEFT tests are actually failing:

tests/peft_integration/test_peft_integration.py::PeftIntegrationTester::test_delete_adapter
tests/peft_integration/test_peft_integration.py::PeftIntegrationTester::test_peft_pipeline_no_warning

This must have been going on for some time but was apparently never
noticed. The cause is that the tests themselves are faulty; the PEFT
integration is correct in these cases.

test_delete_adapter

The first faulty test was introduced by #34650. AFAICT, it should never
have passed in the first place; the PEFT integration logic was not
changed in the meantime. At this point, the logs for the PR CI are gone,
so I'm not sure if the test passed back then or not.

test_peft_pipeline_no_warning

This test was introduced in #36783 and should also never have passed, as
the self.assertNoLogs context manager only returns None, thus the assert
should never have worked (mea culpa for suggesting this code snippet).
Here too, the CI logs are deleted by now, so I can't check if the test
already failed back then.
2025-04-28 15:10:46 +02:00
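For reference, a hedged sketch of the `assertNoLogs` failure mode described above (`assertNoLogs` is unittest's context manager, Python 3.10+; the test and helper names are invented): creating the context without entering it verifies nothing:

```
import logging
import unittest

logger = logging.getLogger("transformers")

def code_under_test():
    logger.warning("a warning the test ought to catch")

class ExampleTest(unittest.TestCase):
    def test_vacuous(self):
        # Buggy pattern: the context manager is created but never entered,
        # so nothing about the logging output is verified; this test passes.
        self.assertNoLogs("transformers", logging.WARNING)
        code_under_test()

    def test_correct(self):
        # Correct pattern: this test fails, because a warning is emitted
        # inside the block.
        with self.assertNoLogs("transformers", logging.WARNING):
            code_under_test()
```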
b262680af4 Add Bitnet model (#37742)
* Adding BitNet b1.58 Model

* Add testing code for BitNet

* Fix format issues

* Fix docstring format issues

* Fix docstring

* Fix docstring

* Fix: weight back to uint8

* Fix

* Fix format issues

* Remove copy comments

* Add model link to the docstring

* Fix: set tie_word_embeddings default to false

* Update

* Generate modeling file

* Change config name for automatically generating modeling file.

* Generate modeling file

* Fix class name

* Change testing branch

* Remove unused param

* Fix config docstring

* Add docstring for BitNetQuantConfig.

* Fix docstring

* Update docs/source/en/model_doc/bitnet.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update docs/source/en/model_doc/bitnet.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update bitnet config

* Update explanation between online and offline mode

* Remove space

* revert changes

* more revert

* spaces

* update

* fix-copies

* doc fix

* fix minor nits

* empty

* small nit

* empty

---------

Co-authored-by: Shuming Ma <shumingma@pku.edu.cn>
Co-authored-by: shumingma <shmingm@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-28 15:08:46 +02:00
82862ce443 [RT-DETR] Improve docs (#37814)
Fix docs
2025-04-28 13:19:24 +02:00
97e57b2545 Fix: Correct tensor shape comment in Mamba modeling (#37801)
* Fix: Correct tensor shape comment in Mamba modeling

* Update src/transformers/models/mamba/modeling_mamba.py

* Update src/transformers/models/mamba/modeling_mamba.py

---------

Co-authored-by: ShadyPi <11342288+shadypi@user.noreply.gitee.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-04-28 11:56:42 +01:00
33493542aa [doc] fix the code examples in qwen doc (#37803) 2025-04-28 11:56:32 +01:00
d5fa7d2d19 Fix typos in strings and comments (#37799) 2025-04-28 11:39:11 +01:00
f466603963 Define warmup allocator for torchao quantization (#37764)
* torchao allocator

* add comment

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-28 10:45:55 +02:00
a41b6d9b5c Fix the fsdp config cannot work issue. (#37549)
* Fix the fsdp config cannot work issue.

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Check the fsdp_config type

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add the accelerate_fsdp_config test

Signed-off-by: yuanwu <yuan.wu@intel.com>

* fix error of make style

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add key check

Signed-off-by: yuanwu <yuan.wu@intel.com>

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-28 10:44:51 +02:00
816b37010c Gemma3 is Torch Exportable (#37728)
* Gemma3 is Torch Exportable

* Expand the support to other mdoels using HybridCache

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2025-04-28 09:36:46 +02:00
397a5ede33 Fix error message in hub.py (#37796)
Fix error message
2025-04-25 14:03:06 -07:00
6ce675ee81 fix performance issue in convert_ids_to_tokens (#37773) 2025-04-25 22:00:50 +02:00
57c620bf8a chore: update SigLIP2 model card (#37624)
* update siglip2 model card

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* address comments

* separate naflex and fixres variant

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-25 12:46:17 -07:00
eb4afdd1fb [i18n-KO] Translated keypoint_detection.md to Korean (#36649)
* fix: manual edits

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/tasks/keypoint_detection.md

modify anchor to lowercase

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/tasks/keypoint_detection.md

connect the letters

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/tasks/keypoint_detection.md

modify to more natural wording

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/tasks/keypoint_detection.md

modify the file-extension wording

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/tasks/keypoint_detection.md

modify to more natural wording

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/tasks/keypoint_detection.md

modify to more natural wording

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/tasks/keypoint_detection.md

modify to a more natural expression

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-25 12:24:12 -07:00
555693fbfa fix mpt test of different outputs from cuda (#37691)
* fix mpt test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix mpt tests with Expectations

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix typo

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix output

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-25 18:04:56 +02:00
0cfbf9c95b Force torch>=2.6 with torch.load to avoid vulnerability issue (#37785)
* fix all main files

* fix test files

* oups forgot modular

* add link

* update message
2025-04-25 16:57:09 +02:00
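As background, a hedged sketch of why the version floor matters: `torch.load` is pickle-based, and only from torch 2.6 does it default to the restricted `weights_only=True` mode; the file path below is a placeholder:

```
import torch

# On torch < 2.6 the default was weights_only=False: full pickle, which can
# execute arbitrary code embedded in a malicious checkpoint.
# state = torch.load("checkpoint.bin")

# Restricted unpickler: only tensors and a small set of safe types are allowed.
state = torch.load("checkpoint.bin", weights_only=True)
```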
eefc86aa31 Fix tensor parallel with non-floating dtypes (#37790)
fix
2025-04-25 15:48:16 +02:00
214062201e Fix typos in strings and comments (#37784)
* Fix typos in strings and comments

* Fix
2025-04-25 13:47:25 +01:00
ba3bd37253 Align gpt2 mask preparation to #37612 (#37787)
Update modeling_gpt2.py
2025-04-25 12:50:30 +02:00
50d231a806 unpin pytest<8 (#37768)
* pytest 8

* pytest 8

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-25 12:34:33 +02:00
79d4bc761d [causal mask] fix preparation with multi-gpu (#37612)
* fix multi-gpu

* forgot non-copied models

* fixup
2025-04-25 09:34:18 +02:00
7bb619d710 🌐 [i18n-KO] Translated roberta.md to Korean (#37069)
* docs: ko: roberta.md

* fix: manual edits

* Apply suggestions from code review

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2025-04-24 10:00:24 -07:00
cfe666919e Update model card for Gemma (#37674)
* Update Gemma model card

* Updated after review

* Update following review
2025-04-24 09:58:46 -07:00
b2d70e9c49 Fix auto-round hfoption (#37759)
fix
2025-04-24 18:19:38 +02:00
acdbe627e3 Guard DeepSpeed imports (#37755)
* Guard DeepSpeed imports

* Fix import

* Import deepspeed consistently

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-24 18:16:34 +02:00
af6d2756d9 [deps] pin max torch version (#37760)
pin max pt version :(
2025-04-24 16:18:25 +01:00
0302aa1c6e Fix typos in comments (#37694)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-04-24 15:59:56 +01:00
af000ceb92 Fix load of rng state for resuming training from checkpoint (#37162)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-24 16:55:34 +02:00
0af0a5f969 Fix tied weight loading with TP and loading sub state_dicts (#37758)
Update modeling_utils.py
2025-04-24 16:47:40 +02:00
3af24f7e27 Refine parameter type annotations (#37666) 2025-04-24 15:37:13 +01:00
22e3da92b7 Fix wrong input shapes in doc-string of models (#37729)
* Fix wrong position_ids shape in doc

Supported by ClvpDecoder.forward, line 1212--1215:

src/transformers/models/clvp/modeling_clvp.py:
  1212	        if inputs_embeds is None:
  1213	            inputs_embeds = self.input_embeds_layer(input_ids)
  1214	        position_embeds = self.position_embeds_layer(position_ids)
  1215	        inputs_embeds = inputs_embeds + position_embeds

* Fix possibly wrong input_ids shape in doc

Since 'input_ids_length' was mentioned immediately after the shape `(batch_size, sequence_length)`, it doesn't make sense to me for `input_ids` to have such shape---IMO it ought to have shape `(batch_size, input_ids_length)` instead.

* Fix possibly wrong inputs_embeds shape in doc

Supported by CTRLModel.forward, line 448--449:

src/transformers/models/ctrl/modeling_ctrl.py:
   448	        if inputs_embeds is None:
   449	            inputs_embeds = self.w(input_ids)

This commit is introduced due to commit 6f36b56497828642b65f54ea26aa4064186de57a.

* Fix possibly wrong token_type_ids shape in doc

Supported by CTRLModel.forward, line 441--460:

src/transformers/models/ctrl/modeling_ctrl.py:
   441	        if token_type_ids is not None:
   442	            token_type_ids = token_type_ids.view(-1, input_shape[-1])
   443	            token_type_embeds = self.w(token_type_ids)
   444	            token_type_embeds *= np.sqrt(self.d_model_size)
   445	        else:
   446	            token_type_embeds = 0
   447
   448	        if inputs_embeds is None:
   449	            inputs_embeds = self.w(input_ids)
   450	        # inputs_embeds = embedded.unsqueeze(0) if len(input_ids.shape)<2 else embedded
   451	        seq_len = input_shape[-1]
   452	        mask = torch.triu(torch.ones(seq_len + past_length, seq_len + past_length), 1).to(device)
   453
   454	        inputs_embeds *= np.sqrt(self.d_model_size)
   455
   456	        # `self.pos_encoding` won't be sent to the correct device along the model, so we do it manually.
   457	        self.pos_encoding = self.pos_encoding.to(device)
   458	        pos_embeds = self.pos_encoding[position_ids, :]
   459
   460	        hidden_states = inputs_embeds + pos_embeds + token_type_embeds

This commit is introduced due to commit 6f36b56497828642b65f54ea26aa4064186de57a.

* Fix possibly wrong position_ids shape in doc

Supported by CTRLModel.forward, line 448--460:

src/transformers/models/ctrl/modeling_ctrl.py:
   448	        if inputs_embeds is None:
   449	            inputs_embeds = self.w(input_ids)
   450	        # inputs_embeds = embedded.unsqueeze(0) if len(input_ids.shape)<2 else embedded
   451	        seq_len = input_shape[-1]
   452	        mask = torch.triu(torch.ones(seq_len + past_length, seq_len + past_length), 1).to(device)
   453
   454	        inputs_embeds *= np.sqrt(self.d_model_size)
   455
   456	        # `self.pos_encoding` won't be sent to the correct device along the model, so we do it manually.
   457	        self.pos_encoding = self.pos_encoding.to(device)
   458	        pos_embeds = self.pos_encoding[position_ids, :]
   459
   460	        hidden_states = inputs_embeds + pos_embeds + token_type_embeds

This commit is introduced due to commit 6f36b56497828642b65f54ea26aa4064186de57a.

* Fix wrong token_type_ids shape in doc

Supported by TFCTRLMainLayer.call, line 376--394:

src/transformers/models/ctrl/modeling_tf_ctrl.py:
   376	        if token_type_ids is not None:
   377	            token_type_ids = tf.reshape(token_type_ids, [-1, shape_list(token_type_ids)[-1]])
   378	            token_type_embeds = self.w(token_type_ids)
   379	            token_type_embeds *= tf.math.sqrt(tf.cast(self.d_model_size, dtype=token_type_embeds.dtype))
   380	        else:
   381	            token_type_embeds = tf.constant(0.0)
   382	        position_ids = tf.reshape(position_ids, [-1, shape_list(position_ids)[-1]])
   383
   384	        if inputs_embeds is None:
   385	            check_embeddings_within_bounds(input_ids, self.w.input_dim)
   386	            inputs_embeds = self.w(input_ids)
   387	        seq_len = input_shape[-1]
   388	        mask = 1 - tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)
   389
   390	        inputs_embeds *= tf.math.sqrt(tf.cast(self.d_model_size, inputs_embeds.dtype))
   391
   392	        pos_embeds = tf.gather(self.pos_encoding, position_ids)
   393	        pos_embeds = tf.cast(pos_embeds, dtype=token_type_embeds.dtype)
   394	        hidden_states = inputs_embeds + pos_embeds + token_type_embeds

* Fix wrong position_ids shape in doc

Supported by TFCTRLMainLayer.call, line 384--394:

src/transformers/models/ctrl/modeling_tf_ctrl.py:
   384	        if inputs_embeds is None:
   385	            check_embeddings_within_bounds(input_ids, self.w.input_dim)
   386	            inputs_embeds = self.w(input_ids)
   387	        seq_len = input_shape[-1]
   388	        mask = 1 - tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)
   389
   390	        inputs_embeds *= tf.math.sqrt(tf.cast(self.d_model_size, inputs_embeds.dtype))
   391
   392	        pos_embeds = tf.gather(self.pos_encoding, position_ids)
   393	        pos_embeds = tf.cast(pos_embeds, dtype=token_type_embeds.dtype)
   394	        hidden_states = inputs_embeds + pos_embeds + token_type_embeds

* Fix wrong inputs_embeds shape in doc

Supported by TFCTRLMainLayer.call, line 384--394:

src/transformers/models/ctrl/modeling_tf_ctrl.py:
   384	        if inputs_embeds is None:
   385	            check_embeddings_within_bounds(input_ids, self.w.input_dim)
   386	            inputs_embeds = self.w(input_ids)
   387	        seq_len = input_shape[-1]
   388	        mask = 1 - tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)
   389
   390	        inputs_embeds *= tf.math.sqrt(tf.cast(self.d_model_size, inputs_embeds.dtype))
   391
   392	        pos_embeds = tf.gather(self.pos_encoding, position_ids)
   393	        pos_embeds = tf.cast(pos_embeds, dtype=token_type_embeds.dtype)
   394	        hidden_states = inputs_embeds + pos_embeds + token_type_embeds

* Fix wrong inputs_embeds shape in doc

Supported by ClvpDecoder.forward, line 1212--1213:

src/transformers/models/clvp/modeling_clvp.py:
  1212	        if inputs_embeds is None:
  1213	            inputs_embeds = self.input_embeds_layer(input_ids)

* Fix wrong position_ids shape in doc

Supported by FlaxGemmaPreTrainedModel.__call__, line 502--508:

src/transformers/models/gemma/modeling_flax_gemma.py:
   502	        batch_size, sequence_length = input_ids.shape
   503
   504	        if position_ids is None:
   505	            if past_key_values is not None:
   506	                raise ValueError("Make sure to provide `position_ids` when passing `past_key_values`.")
   507
   508	            position_ids = jnp.broadcast_to(jnp.arange(sequence_length)[None, :], (batch_size, sequence_length))

* Fix wrong position_ids shape in doc

Supported by FlaxGPT2PreTrainedModel.__call__, line 482--488:

src/transformers/models/gpt2/modeling_flax_gpt2.py:
   482	        batch_size, sequence_length = input_ids.shape
   483
   484	        if position_ids is None:
   485	            if past_key_values is not None:
   486	                raise ValueError("Make sure to provide `position_ids` when passing `past_key_values`.")
   487
   488	            position_ids = jnp.broadcast_to(jnp.arange(sequence_length)[None, :], (batch_size, sequence_length))

* Fix wrong position_ids shape in doc

Supported by GPT2Model.forward, line 918--921:

src/transformers/models/gpt2/modeling_gpt2.py:
   918	        if inputs_embeds is None:
   919	            inputs_embeds = self.wte(input_ids)
   920	        position_embeds = self.wpe(position_ids)
   921	        hidden_states = inputs_embeds + position_embeds.to(inputs_embeds.device)

* Fix wrong inputs_embeds shape in doc

Supported by GPT2Model.forward, line 918--919:

src/transformers/models/gpt2/modeling_gpt2.py:
   918	        if inputs_embeds is None:
   919	            inputs_embeds = self.wte(input_ids)

* Fix wrong labels shape in doc

Supported by GPT2LMHeadModel.forward, line 1156--1157:

src/transformers/models/gpt2/modeling_gpt2.py:
  1156	            Labels for language modeling. Note that the labels **are shifted** inside the model, i.e. you can set
  1157	            `labels = input_ids` Indices are selected in `[-100, 0, ..., config.vocab_size]` All labels set to `-100`

* Fix wrong labels shape in doc

Supported by GPT2DoubleHeadsModel.forward, line 1314--1315:

src/transformers/models/gpt2/modeling_gpt2.py:
  1314	            Labels for language modeling. Note that the labels **are shifted** inside the model, i.e. you can set
  1315	            `labels = input_ids`. Indices are selected in `[-100, 0, ..., config.vocab_size - 1]`. All labels set to

* Fix wrong token_type_ids shape in doc

Supported by TFGPT2MainLayer.call, line 486--500:

src/transformers/models/gpt2/modeling_tf_gpt2.py:
   486	        if inputs_embeds is None:
   487	            check_embeddings_within_bounds(input_ids, self.config.vocab_size)
   488	            inputs_embeds = self.wte(input_ids)
   489
   490	        position_embeds = self.wpe(position_ids)
   491
   492	        if token_type_ids is not None:
   493	            token_type_ids = tf.reshape(token_type_ids, [-1, shape_list(token_type_ids)[-1]])
   494	            token_type_embeds = self.wte(token_type_ids)
   495	        else:
   496	            token_type_embeds = tf.constant(0.0)
   497
   498	        position_embeds = tf.cast(position_embeds, dtype=inputs_embeds.dtype)
   499	        token_type_embeds = tf.cast(token_type_embeds, dtype=inputs_embeds.dtype)
   500	        hidden_states = inputs_embeds + position_embeds + token_type_embeds

* Fix wrong position_ids shape in doc

Supported by TFGPT2MainLayer.call, line 486--500:

src/transformers/models/gpt2/modeling_tf_gpt2.py:
   486	        if inputs_embeds is None:
   487	            check_embeddings_within_bounds(input_ids, self.config.vocab_size)
   488	            inputs_embeds = self.wte(input_ids)
   489
   490	        position_embeds = self.wpe(position_ids)
   491
   492	        if token_type_ids is not None:
   493	            token_type_ids = tf.reshape(token_type_ids, [-1, shape_list(token_type_ids)[-1]])
   494	            token_type_embeds = self.wte(token_type_ids)
   495	        else:
   496	            token_type_embeds = tf.constant(0.0)
   497
   498	        position_embeds = tf.cast(position_embeds, dtype=inputs_embeds.dtype)
   499	        token_type_embeds = tf.cast(token_type_embeds, dtype=inputs_embeds.dtype)
   500	        hidden_states = inputs_embeds + position_embeds + token_type_embeds

* Fix wrong inputs_embeds shape in doc

Supported by TFGPT2MainLayer.call, line 486--488:

src/transformers/models/gpt2/modeling_tf_gpt2.py:
   486	        if inputs_embeds is None:
   487	            check_embeddings_within_bounds(input_ids, self.config.vocab_size)
   488	            inputs_embeds = self.wte(input_ids)

* Fix wrong position_ids shape in doc

Supported by GPTBigCodeModel.forward, line 962--965:

src/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py:
   962	        if inputs_embeds is None:
   963	            inputs_embeds = self.wte(input_ids)
   964	        position_embeds = self.wpe(position_ids)
   965	        hidden_states = inputs_embeds + position_embeds.to(inputs_embeds.device)

* Fix wrong inputs_embeds shape in doc

Supported by GPTBigCodeModel.forward, line 962--963:

src/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py:
   962	        if inputs_embeds is None:
   963	            inputs_embeds = self.wte(input_ids)

* Fix wrong labels shape in doc

Supported by GPTBigCodeForCausalLM.forward, line 1158--1159:

src/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py:
  1158	            Labels for language modeling. Note that the labels **are shifted** inside the model, i.e. you can set
  1159	            `labels = input_ids` Indices are selected in `[-100, 0, ..., config.vocab_size]` All labels set to `-100`

* Fix wrong position_ids shape in doc

Supported by FlaxGPTNeoModule.__call__, line 549--552:

src/transformers/models/gpt_neo/modeling_flax_gpt_neo.py:
   549	        input_embeds = self.wte(input_ids.astype("i4"))
   550	        position_embeds = self.wpe(position_ids.astype("i4"))
   551
   552	        hidden_states = input_embeds + position_embeds

* Fix wrong position_ids shape in doc

Supported by GPTNeoModel.forward, lines 685--720:

src/transformers/models/gpt_neo/modeling_gpt_neo.py:
   685	        if inputs_embeds is None:
   686	            inputs_embeds = self.wte(input_ids)
   687
   688	        # kept for BC (non `Cache` `past_key_values` inputs)
   689	        return_legacy_cache = False
   690	        if use_cache and not isinstance(past_key_values, Cache):
   691	            return_legacy_cache = True
   692	            if past_key_values is None:
   693	                past_key_values = DynamicCache()
   694	            else:
   695	                past_key_values = DynamicCache.from_legacy_cache(past_key_values)
   696	                logger.warning_once(
   697	                    "We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and "
   698	                    "will be removed in v4.47. Please convert your cache or use an appropriate `Cache` class "
   699	                    "(https://huggingface.co/docs/transformers/kv_cache#legacy-cache-format)"
   700	                )
   701
   702	        seq_length = inputs_embeds.shape[1]
   703	        if cache_position is None:
   704	            past_seen_tokens = past_key_values.get_seq_length() if past_key_values is not None else 0
   705	            cache_position = torch.arange(past_seen_tokens, past_seen_tokens + seq_length, device=inputs_embeds.device)
   706
   707	        if position_ids is None:
   708	            position_ids = cache_position.unsqueeze(0)
   709
   710	        causal_mask = self._update_causal_mask(
   711	            attention_mask, inputs_embeds, cache_position, past_key_values, output_attentions
   712	        )
   713
   714	        # Prepare head mask if needed
   715	        # 1.0 in head_mask indicate we keep the head
   716	        # attention_probs has shape bsz x num_heads x N x N
   717	        # head_mask has shape n_layer x batch x num_heads x N x N
   718	        head_mask = self.get_head_mask(head_mask, self.config.num_layers)
   719	        position_embeds = self.wpe(position_ids)
   720	        hidden_states = inputs_embeds + position_embeds

* Fix wrong inputs_embeds shape in doc

Supported by GPTNeoModel.forward, lines 685--686:

src/transformers/models/gpt_neo/modeling_gpt_neo.py:
   685	        if inputs_embeds is None:
   686	            inputs_embeds = self.wte(input_ids)

* Fix wrong labels shape in doc

Supported by GPTNeoForCausalLM.forward, lines 968--969:

src/transformers/models/gpt_neo/modeling_gpt_neo.py:
   968	            Labels for language modeling. Note that the labels **are shifted** inside the model, i.e. you can set
   969	            `labels = input_ids` Indices are selected in `[-100, 0, ..., config.vocab_size]` All labels set to `-100`

* Fix wrong position_ids shape in doc

Supported by FlaxGPTJPreTrainedModel.__call__, lines 455--461:

src/transformers/models/gptj/modeling_flax_gptj.py:
   455	        batch_size, sequence_length = input_ids.shape
   456
   457	        if position_ids is None:
   458	            if past_key_values is not None:
   459	                raise ValueError("Make sure to provide `position_ids` when passing `past_key_values`.")
   460
   461	            position_ids = jnp.broadcast_to(jnp.arange(sequence_length)[None, :], (batch_size, sequence_length))

* Fix wrong token_type_ids shape in doc

Supported by TFGPTJMainLayer.call, lines 482--493:

src/transformers/models/gptj/modeling_tf_gptj.py:
   482	        if inputs_embeds is None:
   483	            check_embeddings_within_bounds(input_ids, self.wte.vocab_size)
   484	            inputs_embeds = self.wte(input_ids, mode="embedding")
   485
   486	        if token_type_ids is not None:
   487	            token_type_ids = tf.reshape(token_type_ids, [-1, shape_list(token_type_ids)[-1]])
   488	            token_type_embeds = self.wte(token_type_ids, mode="embedding")
   489	        else:
   490	            token_type_embeds = tf.constant(0.0)
   491
   492	        token_type_embeds = tf.cast(token_type_embeds, dtype=inputs_embeds.dtype)
   493	        hidden_states = inputs_embeds + token_type_embeds

* Fix wrong position_ids shape in doc

Supported by TFGPTJMainLayer.call, lines 434--449:

src/transformers/models/gptj/modeling_tf_gptj.py:
   434	        elif input_ids is not None:
   435	            input_shape = shape_list(input_ids)
   436	            input_ids = tf.reshape(input_ids, [-1, input_shape[-1]])
   437	        elif inputs_embeds is not None:
   438	            input_shape = shape_list(inputs_embeds)[:-1]
   439	        else:
   440	            raise ValueError("You have to specify either input_ids or inputs_embeds")
   441
   442	        if past_key_values is None:
   443	            past_length = 0
   444	            past_key_values = [None] * len(self.h)
   445	        else:
   446	            past_length = shape_list(past_key_values[0][0])[-2]
   447
   448	        if position_ids is None:
   449	            position_ids = tf.expand_dims(tf.range(past_length, input_shape[-1] + past_length), axis=0)

* Fix wrong inputs_embeds shape in doc

Supported by TFGPTJMainLayer.call, lines 482--484:

src/transformers/models/gptj/modeling_tf_gptj.py:
   482	        if inputs_embeds is None:
   483	            check_embeddings_within_bounds(input_ids, self.wte.vocab_size)
   484	            inputs_embeds = self.wte(input_ids, mode="embedding")

* Fix wrong labels shape in doc

Supported by TFGPTJForCausalLM.call, lines 812--813:

src/transformers/models/gptj/modeling_tf_gptj.py:
   812	            Labels for language modeling. Note that the labels **are shifted** inside the model, i.e. you can set
   813	            `labels = input_ids` Indices are selected in `[-100, 0, ..., config.vocab_size]` All labels set to `-100`

* Fix possibly wrong input_ids shape in doc

Since 'input_ids_length' was mentioned immediately after the shape `(batch_size, sequence_length)`, it doesn't make sense for `input_ids` to have that shape; it ought to have shape `(batch_size, input_ids_length)` instead.
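
A small sketch of the distinction (variable names are generic, not from the patch): during cached generation the model is called with only the new tokens, so `input_ids_length` can be much smaller than the full `sequence_length`, which also counts the cached past tokens.

    import torch

    past_length = 7                   # tokens already held in the KV cache
    input_ids = torch.tensor([[42]])  # only the newly fed token
    batch_size, input_ids_length = input_ids.shape
    sequence_length = past_length + input_ids_length
    print(batch_size, input_ids_length, sequence_length)  # 1 1 8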

* Fix possibly wrong token_type_ids shape in doc

Supported by ImageGPTModel.forward, lines 773--780:

src/transformers/models/imagegpt/modeling_imagegpt.py:
   773	        if inputs_embeds is None:
   774	            inputs_embeds = self.wte(input_ids)
   775	        position_embeds = self.wpe(position_ids)
   776	        hidden_states = inputs_embeds + position_embeds.to(inputs_embeds.device)
   777
   778	        if token_type_ids is not None:
   779	            token_type_embeds = self.wte(token_type_ids)
   780	            hidden_states = hidden_states + token_type_embeds

This commit was prompted by commit 8e594a4143cca79f165b99e4ed4c9f3a90047bf3.

* Fix possibly wrong position_ids shape in doc

Supported by ImageGPTModel.forward, lines 773--776:

src/transformers/models/imagegpt/modeling_imagegpt.py:
   773	        if inputs_embeds is None:
   774	            inputs_embeds = self.wte(input_ids)
   775	        position_embeds = self.wpe(position_ids)
   776	        hidden_states = inputs_embeds + position_embeds.to(inputs_embeds.device)

This commit was prompted by commit 8e594a4143cca79f165b99e4ed4c9f3a90047bf3.

* Fix possibly wrong inputs_embeds shape in doc

Supported by ImageGPTModel.forward, lines 773--774:

src/transformers/models/imagegpt/modeling_imagegpt.py:
   773	        if inputs_embeds is None:
   774	            inputs_embeds = self.wte(input_ids)

This commit was prompted by commit 8e594a4143cca79f165b99e4ed4c9f3a90047bf3.

* Fix possibly wrong labels shape in doc

Supported by ImageGPTForCausalImageModeling.forward, lines 923--924:

src/transformers/models/imagegpt/modeling_imagegpt.py:
   923	            Labels for language modeling. Note that the labels **are shifted** inside the model, i.e. you can set
   924	            `labels = input_ids` Indices are selected in `[-100, 0, ..., config.vocab_size]` All labels set to `-100`

This commit was prompted by commit 8e594a4143cca79f165b99e4ed4c9f3a90047bf3.

* Fix possibly wrong labels shape in doc

Supported by ImageGPTModel.forward, lines 665--666:

src/transformers/models/imagegpt/modeling_imagegpt.py:
   665	            Labels for language modeling. Note that the labels **are shifted** inside the model, i.e. you can set
   666	            `labels = input_ids` Indices are selected in `[-100, 0, ..., config.vocab_size]` All labels set to `-100`

This commit was prompted by commit 8e594a4143cca79f165b99e4ed4c9f3a90047bf3.

* Fix wrong position_ids shape in doc

Supported by FlaxLlamaPreTrainedModel.__call__, lines 484--490:

src/transformers/models/llama/modeling_flax_llama.py:
   484	        batch_size, sequence_length = input_ids.shape
   485
   486	        if position_ids is None:
   487	            if past_key_values is not None:
   488	                raise ValueError("Make sure to provide `position_ids` when passing `past_key_values`.")
   489
   490	            position_ids = jnp.broadcast_to(jnp.arange(sequence_length)[None, :], (batch_size, sequence_length))

* Fix wrong position_ids shape in doc

Supported by FlaxMistralPreTrainedModel.__call__, lines 478--484:

src/transformers/models/mistral/modeling_flax_mistral.py:
   478	        batch_size, sequence_length = input_ids.shape
   479
   480	        if position_ids is None:
   481	            if past_key_values is not None:
   482	                raise ValueError("Make sure to provide `position_ids` when passing `past_key_values`.")
   483
   484	            position_ids = jnp.broadcast_to(jnp.arange(sequence_length)[None, :], (batch_size, sequence_length))
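
As a quick standalone check (not part of the patch), the default `position_ids` built this way indeed has the documented `(batch_size, sequence_length)` shape:

    import jax.numpy as jnp

    batch_size, sequence_length = 2, 5
    position_ids = jnp.broadcast_to(
        jnp.arange(sequence_length)[None, :], (batch_size, sequence_length)
    )
    print(position_ids.shape)  # (2, 5)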
2025-04-24 15:36:03 +01:00
4d64c38593 [generate] fix default autocompile case on gpu (#37756) 2025-04-24 15:08:38 +01:00
43bb4c0456 Fix qwen2_5 get_rope_index tensor device locations (#37597)
* Fix qwen2_5 get_rope_index tensor device locations

* simpler fix

* edit right file for modular model

* add a test

* try normalizing type to fix non-video

* fix some imports

* add a video forward test with dummy input
2025-04-24 16:04:38 +02:00
dd2649fa98 updated hidden_features for FlaxDinov2SwiGLUFFN in Dinov2 (#37747)
Flax Dinov2: updated hidden_features in FlaxDinov2SwiGLUFFN

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-24 14:30:31 +01:00
8bdd4f2acd [generate] skip compilation on cpu offload (#37709)
* skip compilation on cpu offload

* add test

* better logic

* docstring

* boolean logic

* add disk offload check

* warn users if compilation options are set but compilation doesn't happen

* fix test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-24 14:08:17 +01:00
7c62e69326 GPT2Model StaticCache support (#35761)
* initial GPT2 changes

* causal_mask support

* return_legacy_cache

* cleanup

* fix1

* outputs shape fixes

* gpt2 return fix

* pkv, attn fixes

* fix dual_head

* is_causal arg fix

* decision transformer updated

* style fix

* batch_size from inputs_embeds

* DecisionTransformerModel fixes

* cross-attn support + cache warning

* x-attn @decision

* EDCache proper init

* simplified logic in `if use_cache:` for GPT2Model

* @deprecate_kwarg for DecisionTr attn fwd

* @deprecate_kwarg in gpt2

* deprecation version updated to 4.51

* kwargs in gradient_checkpointing_fn

* rename next_cache to past_key_values

* attention_mask prep

* +cache_position in GPT2DoubleHeadsModel

* undo kwargs in gradient checkpointing

* moved up `if self.gradient_checkpointing`

* consistency in decision_transformer

* pastkv, cache_pos in grad_checkpt args

* rm _reorder_cache

* output_attentions streamlined

* decision_transformer consistency

* return_legacy_cache improved

* ClvpForCausalLM used for legacy cache test now

* is_causal fixed

* attn_output cleanup

* consistency @ decision_transformer

* Updated deprecation notice version to 4.52

* upd deprecation

* consistent legacy cache code in decision transformers

* next_cache -> past_kv in decision_tr

* cache support flags in decision_transf

* rm legacy cache warning

* consistency in cache init for decision transf

* no Static Cache for Decision Transformer

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-24 14:46:35 +02:00
9f927c8250 [cache] fix HybridCache init when device is passed (#37718)
fix device init
2025-04-24 13:36:52 +01:00
4fee320926 Expand quantized data type support for tensor parallelism (#37719)
Update tensor_parallel.py

Co-authored-by: Xiao YU <Xiao.YU@xilinx.com>
2025-04-24 14:34:32 +02:00
0f7940bb3f Update MllamaForConditionalGenerationIntegrationTest (#37750)
* fix 1

* fix 2

* fix 3

* fix 4

* fix 5

* fix 6

* trigger CI

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-24 14:29:46 +02:00
7e6f36cd38 Skip all AriaForConditionalGenerationIntegrationTest on T4 (#37746)
* skip

* ruff

* trigger CI

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-24 14:11:56 +02:00
0327d0f7f2 [performance_optim] define flash attention mask on NPU device directly (#37698)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-24 14:06:47 +02:00
14e28bd721 Correctly raise errors when downloading tokenizer files (#37740)
* first try

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* standardize
2025-04-24 12:53:07 +02:00
0ec0495967 Fix embeds_to_talker device in Qwen2.5-Omni (#37739)
Fix `embeds_to_talker` device

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
2025-04-24 12:49:57 +02:00
72e4844059 fix: learning_rate logged as tensor causing save issue with deepspeed (#37704)
* fix: learning_rate logged as tensor causing save issue with deepspeed

* chore: lint

---------

Co-authored-by: NanoCode012 <chanvichet@Chanvichets-MacBook-Pro.local>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-24 12:20:47 +02:00
1cfcbfcab8 [VLMs] fix flash-attention tests (#37603)
* fix one test

* fa2 ln test

* remove keys from config recursively

* fix

* fixup
2025-04-24 11:48:11 +02:00
02baa61fab Make sure torch_is_available before using torch.distributed (#37693)
fix
2025-04-24 11:31:35 +02:00
864e9636ff [tests] fix test_nemotron_8b_generation_sdpa (#37665)
add max_new_tokens
2025-04-24 11:28:35 +02:00
9b3bf4a206 Fix torchao doc examples (#37697)
fix

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-24 11:10:27 +02:00
3ed56bea0f Fix inference bugs in Qwen2.5 Omni (#37701)
* Init `SinusoidsPositionEmbedding` with float to avoid precision problem

* fix hidden_state for talker

* Update modular_qwen2_5_omni.py

* Move hidden processing out from thinker

* fixup

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com>
2025-04-24 10:51:44 +02:00
b7f7aa78a0 Fix Aria tests (#37444)
* update aria tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check output for each device

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix style

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix style

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu output

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add comments and use assert list equal

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm pad token assign

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-24 10:51:29 +02:00
b6d65e40b2 Add Fast Image Processor for MobileNetV1 (#37111)
* fast image processor template for MobileNetV1 via transformers-cli

* Add fast image processors and unify tests for slow/fast image processor classes

* added loop over image_processor_list for all tests and removed boilerplate comments.

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-23 15:55:41 -04:00
dea1919be4 Add Fast Image Processor for PoolFormer (#37182)
* support poolformer fast image processor

* support test for crop_pct=None

* run make style

* Apply suggestions from code review

* rename test

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-23 15:55:33 -04:00
b491f128d6 Add Fast PVT Processor (#37204)
* Add Fast PVT Processor

* Update image_processing_pvt_fast.py

* Update image_processing_pvt_fast.py

* remove kwargs

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-23 15:55:20 -04:00
19e9079dc1 enable 4 test_trainer cases on XPU (#37645)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-23 21:29:42 +02:00
5cd6b64059 Process inputs directly in apply_chat_template in image-text-to-text pipeline (#35616)
* tokenize inputs directly in apply_chat_template

* refactor processing

* revert changes processing llava

* Update docs

* fix issue with str being iterable

* add test chat text only

* change function name
2025-04-23 13:31:33 -04:00
80ea2c05c2 [tests, qwen2_5_omni] fix flaky tests (#37721) 2025-04-23 17:54:12 +01:00
63c6331387 Qwen 2.5 Omni: apply video defaults (#37660)
* Apply video defaults for min_pixels and max_pixels

* fps kwarg should not be a list

* Update test to account for new resizing
2025-04-23 17:08:11 +02:00
1e9087368c [internvl] fix chat template (#37656)
* fix chat template

* update

* update conversion

* rename `fake_image_token` in tests
2025-04-23 16:56:36 +02:00
9ec8be56dd TransfoXL is deprecated, don't keep it in tested examples! (#37707)
* TransfoXL is deprecated, so we should remove it from examples that get tested

* Remove the tokenizer too

* Trigger tests
2025-04-23 14:59:38 +01:00
be9b0e8521 [CI] add back sacrebleu (and document why) (#37700)
* example test

* add back dep

* dev-ci

* dev-ci
2025-04-23 14:45:00 +01:00
1d7d7a942e Add maintainers for ROCm/Intel XPU/Ascend NPU (#37678)
* Add maintainers for ROCm/Intel XPU/Ascend NPU

* Correct capitalization for usernames

* Update .github/ISSUE_TEMPLATE/bug-report.yml

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* Update .github/ISSUE_TEMPLATE/bug-report.yml

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* Trigger tests

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-04-23 14:28:32 +01:00
cc9a245e6d [cleanup] remove /model_cards 🧹 🧹 (#37685)
rm model_cards
2025-04-23 12:45:27 +01:00
ca790303f7 Pin torch == 2.6 on PR CI docker images for now (#37695)
pin 2.6 on CircleCi images

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-23 11:47:23 +02:00
12f65ee752 enable cpu offloading for Bark on xpu (#37599)
* enable cpu offloading of bark modeling on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* remove debug print

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix review comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enhance test

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

* add deprecate message

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

* update

* trigger CI

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-23 11:37:15 +02:00
4f9893cbbc fix: remove classmethod from Qwen2_5OmniConfig.get_text_config (#37690)
- Since the `get_text_config` references an instance variable within
    the class (`self.thinker_config`), the `get_text_config` method
    should not be a classmethod.

  - Before this fix, users were getting the following error:

    '''
    AttributeError: type object 'Qwen2_5OmniConfig' has no attribute 'thinker_config'
    '''
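
A minimal reproduction of the failure mode (class and attribute names are illustrative): a classmethod receives the class, not an instance, so it cannot read instance attributes.

    class OmniConfig:
        def __init__(self):
            self.thinker_config = {"hidden_size": 1024}

        @classmethod
        def broken_get_text_config(cls):
            # AttributeError: the class object has no thinker_config attribute
            return cls.thinker_config

        def get_text_config(self):
            return self.thinker_config  # works: reads the instance attribute

    print(OmniConfig().get_text_config())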
2025-04-23 09:30:57 +02:00
1d9743edc2 Updated model card for mbart and mbart50 (#37619)
* new card for mbart and mbart50

* removed comment BADGES

* Update mBart overview

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix typo (MBart to mBart)

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* maybe fix typo

* update typo and combine notes

* changed notes

* changed the example sentence

* fixed grammatical error and removed some lines from notes example

* missed one word

* removed documentation resources and added some lines of example code back in notes.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-22 12:26:47 -07:00
fbfa1dd4db 🌐 [i18n-KO] Translated siglip.md to Korean (#37145)
* docs: ko: siglip.md

* feat: nmt draft

* fix: manual edits

* chore: Correct document title to kebab-case format

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Convert unnatural language to natural Korean

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2025-04-22 12:23:19 -07:00
ece79b0688 enable blip2 and emu3 cases on XPU (#37662)
* enable blip2 and emu3 modeling cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* remove extra new line

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-22 18:37:09 +02:00
ca4c114dc4 Add counters for dataset classes (#37636)
* add counters for dataset classes

* fix failed code style
2025-04-22 17:30:43 +01:00
d47cdae27e [Docs] Move models to appropriate section (#37338)
* Move models

* update

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-22 18:23:14 +02:00
dbfccd3c92 typo update in the parameter name (#37655)
See L118 and L143 for the class attribute `hidden_dim`
2025-04-22 18:14:20 +02:00
de8916dde6 [docs] only build en docs in push CI (#37677) 2025-04-22 17:05:11 +01:00
0f8c34b0a0 [cleanup] remove old scripts in /scripts 🧹 🧹 (#37676)
* rm old files

* not this one
2025-04-22 16:59:03 +01:00
6673081b21 enable 6 granite cases on xpu (#37569)
* enable 6 granite cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* make them all pass on A100

Signed-off-by: N <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* update

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-22 17:55:02 +02:00
9167461a7d enable mllama cases on xpu (#37644)
* enable mllama testing on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* more mllama cases enabling

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* make cases pass on A100

Signed-off-by: N <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: N <matrix.yao@intel.com>
2025-04-22 17:39:10 +02:00
de182ba269 Refactor bitsandbytes doc (#37668)
* doc

* torch ops

* fix

* nits

* Update docs/source/en/quantization/bitsandbytes.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-22 16:13:25 +02:00
dde9b03e3b Fix no_split_modules for Llama4 pretrained models (#37673) 2025-04-22 16:05:12 +02:00
9481e9e9f1 Fix autoround docs (#37675)
* fix

* empty
2025-04-22 15:33:13 +02:00
38c406844e Fixing quantization tests (#37650)
* fix

* style

* add capability check
2025-04-22 13:59:57 +02:00
b3492ff9f7 Add AutoRound quantization support (#37393)
* add auto-round support

* Update src/transformers/quantizers/auto.py

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

* fix style issue

Signed-off-by: wenhuach <wenhuach87@gmail.com>

* tiny change

* tiny change

* refine ut and doc

* revert unnecessary change

* tiny change

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* try to fix style issue

* fix doc issue

* Update tests/quantization/autoround/test_auto_round.py

* fix comments

* Update tests/quantization/autoround/test_auto_round.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/autoround/test_auto_round.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update doc

* Update src/transformers/quantizers/quantizer_auto_round.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update

* update

* fix

* try to fix style issue

* Update src/transformers/quantizers/auto.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update docs/source/en/quantization/auto_round.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update docs/source/en/quantization/auto_round.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Update docs/source/en/quantization/auto_round.md

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* update

* fix style issue

* update doc

* update doc

* Refine the doc

* refine doc

* revert one change

* set sym to True by default

* Enhance the unit test's robustness.

* update

* add torch dtype

* tiny change

* add awq convert test

* fix typo

* update

* fix packing format issue

* use one gpu

---------

Signed-off-by: wenhuach <wenhuach87@gmail.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Shen, Haihao <haihao.shen@intel.com>
2025-04-22 13:56:54 +02:00
9608908639 Correct warm-up with fp8 (#37670)
* start clean warmup for quantizers

* style

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-22 13:12:49 +02:00
6614209b96 Fix duplicated weights in fp8 quantization (#37667)
* fix fp8

* Update quantizer_finegrained_fp8.py

* fix circular import

* Update quantizer_finegrained_fp8.py
2025-04-22 13:12:27 +02:00
dcf6df5b0d [qwen-omni] fix training (#37517)
* fix

* add text config

* fixup

* fix docs
2025-04-22 12:36:07 +02:00
9167fadab9 Introduce GradientCheckpointingLayer (#37223)
* GradientCheckpointingLayer

* trigger

* Move GC layer to a separate file

* Update import

* Expose and document GC layer

* Fix dummy

* Apply to llama-based models

* Update modulars

* Update a few more models for consistency

* Update glm4

* Update Janus
2025-04-22 11:33:31 +01:00
413f9bbf80 Fixes #37219 : RecurrentGemma crashes for inputs longer than sliding window length (#37613)
* fix: RecurrentGemma crashes during inference for inputs longer than sliding window width

* fix recurrentgemma tests; add long test bigger than context window
2025-04-22 12:21:16 +02:00
964a1b6b7d Fix ValueError when eval_do_concat_batches=False with examples (#37621)
https://github.com/huggingface/transformers/issues/37593

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-22 12:13:25 +02:00
85665a4263 [tests] Stricter generate + compilation test -- no recompilations allowed (#37629)
* tmp commit

* stricter compilation test

* trigger tests

* rm todo
2025-04-22 11:12:18 +01:00
362fa37da2 [test] update test_past_key_values_format (#37614)
allow custom shapes
2025-04-22 11:07:34 +01:00
1cd110c6cb Add test to ensure unknown exceptions reraising in utils/hub.py::cached_files() (#37651)
* add test to ensure unknown exceptions are reraised in utils/hub.py::cached_files()
2025-04-22 11:38:10 +02:00
c69e23455d Support loading Gemma3 QAT GGUF models (#37649)
* fix gemma3 qat gguf support

Signed-off-by: isotr0py <2037008807@qq.com>

* update test

Signed-off-by: isotr0py <2037008807@qq.com>

* make ruff happy

Signed-off-by: isotr0py <2037008807@qq.com>

---------

Signed-off-by: isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-22 11:23:17 +02:00
7eb1107cc2 Restructure torchao quantization examples (#37592)
* Restructure torchao quantization examples

Summary:
Mainly structured the examples by hardwares and then listed
the recommended quantization methods for each hardware H100 GPU, A100 GPU and CPU

Also added example for push_to_hub

Test Plan:
not required

Reviewers:

Subscribers:

Tasks:

Tags:

* update

* drop float8 cpu

* address comments and simplify

* small update

* link update

* minor update
2025-04-22 11:20:34 +02:00
006530d285 [fix gemma] Set default value for output_attentions parameter in Gemma2 and Gemma… (#37633)
* Set default value for output_attentions parameter in Gemma2 and Gemma3 models

* update

* fix

* fix

---------

Co-authored-by: chenin <wangzhichen@encosmart.com>
2025-04-22 11:18:17 +02:00
31ea547b7a [fix] make legacy bnb code work (#37331)
* [fix] make legacy bnb code work

* [fix] use get with default instead of getter

* add test for bnb 8bit optim skip embed

* [fix] style

* add require annotation of bnb

---------

Co-authored-by: jaycha <jaycha@ncsoft.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-22 11:17:29 +02:00
5f791281c3 Fix Qwen2.5-Omni get_chunked_index chunking functionality (#37631)
* fix: qwen2.5 omni modular get_rope_index

* test: add test for qwen2.5 omni rope index (video with audio input)

* style

* expected_position_ids readability

* fix: use spatial_merge_size = 1 in unit test
2025-04-22 11:15:37 +02:00
fee1190601 Refactor phi doc (#37583)
* Added documentation for phi model

* Update phi.md

* Update phi.md

* Update phi.md

* Update docs/source/en/model_doc/phi.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated model card

* Update phi.md

* Update phi.md

* Update phi.md

* Update docs/source/en/model_doc/phi.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Jihad <jihadhammoud_@hotmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-21 10:31:04 -07:00
b2db54f66b Update longformer.md (#37622)
* Update longformer.md

* Update longformer.md

* Update docs/source/en/model_doc/longformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/longformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update longformer.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-21 10:30:51 -07:00
2c60a442f3 fix link in kv_cache.md (#37652)
fix typo in kv_cache.md
2025-04-21 09:01:11 -07:00
a42ba80fa5 Allow Exclusion of Input IDs from RepetitionPenaltyLogitsProcessor (#37625)
* Allow exclusion of input IDs for repetition penalty

* Add logit proc tests for rep penalty exclusion

* Expose rep pen flag through generate

* Only slice if needed

* keep current rep pen default behavior

* Revert exposing reppen changes through generate

* Fix test arg

* Update src/transformers/generation/logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Rename to rep penalty kwarg

* Add custom repetition penalty processor example (see the sketch after this list)

* Validate prompt_ignore_length
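
A hedged sketch of the idea as a custom `LogitsProcessor` (not the exact merged implementation; the parameter name follows the bullets above): the repetition penalty is applied only to tokens that appear after the prompt.

    import torch
    from transformers import LogitsProcessor

    class PromptExcludingRepetitionPenalty(LogitsProcessor):
        def __init__(self, penalty: float, prompt_ignore_length: int):
            self.penalty = penalty
            self.prompt_ignore_length = prompt_ignore_length

        def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
            generated = input_ids[:, self.prompt_ignore_length:]  # drop prompt tokens
            if generated.shape[1] == 0:
                return scores
            score = torch.gather(scores, 1, generated)
            # standard repetition-penalty update, restricted to non-prompt tokens
            score = torch.where(score < 0, score * self.penalty, score / self.penalty)
            return scores.scatter(1, generated, score)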

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-21 15:46:05 +01:00
1077603410 Remove torchvision requirement from AutoImageProcessor (#37457) 2025-04-21 14:59:33 +02:00
1930e750e4 [kernels] use original forward at compile time (#37604) 2025-04-21 13:22:47 +01:00
6daa3eeba5 Fix InternVL attention when using qk_norm (38B and 78B) (#37620)
* fix internvlvision attention when using qk_norm

* nit

* modular
2025-04-19 21:39:08 +02:00
27a25bee4f chore: update model card for SigLIP (#37585)
* edit siglip model card

* fix syntax

* Update docs/source/en/model_doc/siglip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/siglip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* address comments

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-18 13:30:41 -07:00
e1f379bb09 Fixing the example in generation strategy doc (#37598)
Update generation_strategies.md

The prompt text shown in the example does not match what is inside the generated output. As the generated output always includes the prompt, the correct prompt should be "Hugging Face is an open-source company".
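
For reference, a minimal sketch of the behavior being documented (the checkpoint name is just an example): `generate` returns the prompt tokens followed by the continuation, so the decoded output always starts with the prompt.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    inputs = tokenizer("Hugging Face is an open-source company", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=10)
    print(tokenizer.decode(output_ids[0]))  # begins with the prompt text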
2025-04-18 12:50:17 -07:00
4f58fc9c82 Deprecate modeling_utils.py classes (#37298)
* Move utils classes into models

* Add deprecation warnings

* Remove from docs

* Update config attributes check
2025-04-18 18:47:34 +01:00
a245011252 Add InternVL (2.5 MPO) (#35968)
* initial commit

* add convert internvl

* add first end-to-end working internvl

* nit prompt and image proc

* add working chat template

* add conversion llama-based models

* add tests

* pass all tests

* fix isort

* fix modular after main merge

* add video processing for internvl

* add support for interlaced images and videos

* Remove processing and config from modular, add more tests

* add llama model tests

* Modify processor for compatibility with refactored got ocr image processor

* add comments in processor

* Add docs and nits

* change video processing to use custom sample_indices_fn

* rebase and fix tests

* add processor tests

* Add changes from Raushan's review

* Use the new attention interface for the vision model

* nits

* add support for custom video_load_backend

* remove mention of InternVLTokenizer

* refactor vision model to simplify logic

* refactor processor for better readability

* fix copies

* fix require av processor test

* refactor internVL vision

* Update processor and fix processing tests

* fix docstring

* update convert_weights for internvl3

* change image processor to fast by default

* remove do_center_crop=True in convert_weights

* force use_cache to True

* push_to_hub before reloading

* fix internVLVision for larger models

* update convert weight for qk norm

* fix convert_weights

* fix eos_token_id in convert

* update docs and integration tests

* make modifs after review

* fix wrong k_norm and reduce modular

* change image_token_index to image_token_id

* change checkpoint to OpenGVLab org

* last nits

* explicitly del self.num_key_value_groups

* add extra special tokens
2025-04-18 18:57:33 +02:00
b0c6ff5e13 fix issue that some example with no trainer use accelerator.end_train… (#37435)
* fix issue that some examples with no trainer used accelerator.end_training in a wrong way

* reformat code

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-18 17:59:42 +02:00
6f5014ac31 fix 2 encoder_decoder issues on XPU (#37572)
* fix 2 encoder_decoder issues on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fmt

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-18 17:49:24 +02:00
2ba6b92a6f [VLMs] use only xxx_token_id for multimodal tokens (#37573)
* use only `xxx_token_id` for multimodal tokens

* update modeling files as well

* fixup

* why fixup doesn't fix modular docstring first?

* janus, need to update configs in the hub still

* last fixup
2025-04-18 17:03:39 +02:00
4afd3f4820 Model debugger upgrades (#37391)
* debugging improvements

* add debugging details

* add more debugging details

* debug more

* clean up layers + output

* add summary json file

* cleanup

* copies 👀

* remove hooks + add documentation

* draft a small test, why not

* respect the format (respect it)

* fixup imports

* nit

* add tests and configurable pruning of layers
2025-04-18 16:45:54 +02:00
e5ac23081e [Gemma3] compile (#37447) 2025-04-18 14:55:43 +01:00
a1b82563f1 enable 6 modeling cases on XPU (#37571)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-18 12:28:08 +02:00
3cd6627cd7 enable 6 gemma2 cases on XPU (#37564)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-18 12:10:34 +02:00
049b75ea72 Flag SpeechT5 flaky test (#37587)
flag flaky test
2025-04-18 11:35:46 +02:00
aa17cfb4d5 [Bugfix] Fix flash-attention func param mismatch and softmax_scale default value mistake on Ascend NPU (#37575)
[Bugfix] fix flash-attention func param mismatch and softmax_scale default value mistake on Ascend NPU

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-18 11:34:17 +02:00
14b3dbcf3b remove _run_third_party_device_tests (#37445)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-18 11:19:56 +02:00
f974214353 Fix some GPU OOM after #37553 (#37591)
* fix

* trigger CI

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-18 10:09:19 +02:00
438324c9cf Gaudi: Add the bf16 support for hpu (#37568)
* Fix: hpu can support bf16

Signed-off-by: yuanwu <yuan.wu@intel.com>

* hpu is not integrated into torch.

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Gaudi1 cannot support bf16

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/utils/import_utils.py

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-04-18 08:00:26 +02:00
bb2a44ad4b Fix Quark quantization config (#37578)
fix
2025-04-18 07:23:39 +02:00
4acf692ace Update Phi4 converter (#37594)
* fix converter

* Update phi4_multimodal.md
2025-04-17 23:08:24 +02:00
40cba20e87 Ensure positive warm-up size (#37581)
ensure > 0
2025-04-17 16:11:54 +02:00
346f1eebbd docs: fix typo (#37567)
Co-authored-by: Anthony <anthony.song@capitalone.com>
2025-04-17 14:54:44 +01:00
48dd89cf55 [phi4] update conversion (#37579)
* update conversion

* update
2025-04-17 15:43:04 +02:00
58e5e976e0 Small fix on context manager detection (#37562)
* small fixes

* Update modeling_utils.py

* test

* Update test_modeling_common.py

* Update test_modeling_timm_backbone.py

* more general

* simpler
2025-04-17 15:39:44 +02:00
c7d3cc67a1 Fix qwen2audio wanr -> warn (#37559)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2025-04-17 14:34:58 +01:00
dc06e7cecd [TimesFM] use the main revison instead of revision for integration test (#37558)
* use the main revison instead of revision

* test prediction

* check larger time steps
2025-04-17 11:26:03 +02:00
3bc44eaaee [qwen-vl] Standardize config (#37268)
* update

* fix tests

* fixup

* update

* skip this one

* fixup

* fix
2025-04-17 09:38:12 +02:00
4f96081aad [chat template] fix security vulnerability (#37523)
* fix security issues

* nit
2025-04-17 09:21:37 +02:00
a2ef3cf537 Add Janus model (#36053)
* Iterative generation using input embeds

* Add Janus model

* discard changes

* Janus imports

* Refactor config and processor

* Added Vision tower of Janus

* Import Janus Image processor

* Vision tower fixes

* Refactor code

* Added VQ Model

* Complete model integration

* temp conversion script

* processor refactor

* Adding files to facilitate pulling

* Fixes after debugging

* Skip test for these models

* Refactor to Text config

*  Added generate function

* Saving intermediate convert file. Still need to read configs from the hub and convert them to our format.

* Adding version that reads from the JSON files. Still have to tweak some parameters manually.

* relative imports

* Initial tests

* Refactor image processor

* Seemingly working version of the conversion script, will need to test further.

* Adding command message

* Fixing conflicting JanusTextConfig class

* Incorporating some of the discussed changes.

* Small fix to create dir.

* Removing system from JINJA template

* Adding draft processor tests

* style fixes

* Minor fixes and enhancement

* added generation config

* Initial tests

* Small modifications, tests are now passing.

* Small changes I noticed while reading code.

* more fixes

* Added JanusModel class

* Small merge adaptations

* Small merge adaptations

* Image processing tests passing

* More tests and fixes

* Convert script updated and refactored

* Tests and cleanup

* make style

* Postprocessing for image generation

* generate refactor

* fixes

* - Passing tests that write a part of the model to cpu (e.g. test_cpu_offload)
- Passing tests of dispatching SDPA
- Only gradient checkpointing tests are left.

* Removing temporary code

* Changes

* Writing change to modular

* Added JanusVisionModel. SDPA dispatch tests pass more robustly. Gradient checkpoint tests are next

* Gradient checkpoint tests passing

* Removing debug code

* Major generate refactor 😮‍💨

* Temp changes for testing

* Green quality CI

* 2 out of 4 integration tests passing

* breadcrumbs

* Usage Examples

* Regenerate modeling after merge

* dirty code

* JanusIntegrationTest are passing

* breadcrumbs

* happy CI

* fixes

* Changing template

* nits

* Text generation logits matching original codebase at 100% precision

* Remove ./tmp from git tracking

* Remove ./tmp from git tracking

* Checkpointing changes after reviewing

* Fixing code in docstrings

* Changing comments and small bug in convert file

* Fixing bug in image_token_id for 7B version

* Removing line that was added by both of us

* Pushing changes after discussion. Only one left is to change the key mapping for convert file.

* Updating module file

* New convert file using dict. Tested that it is equivalent to the old one by:
- comparing keys in a script
- comparing checksums of the output files between version generated with the current convert script and those generated with the old script. This is a more reliable test.
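
A sketch of that checksum comparison (the file paths are placeholders): two converter outputs are treated as equivalent when their bytes hash identically.

    import hashlib
    from pathlib import Path

    def sha256(path: str) -> str:
        return hashlib.sha256(Path(path).read_bytes()).hexdigest()

    # placeholder paths for the outputs of the old and new convert scripts
    assert sha256("old_convert/model.safetensors") == sha256("new_convert/model.safetensors")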

* revert changes

* mistake

* consistency change for CI

* make style

* doc fixes

* more fixes

* experimenting with masking out pad token

* checkpoint

* Batched generation with multi-images working for 1B models. Will test 7B next.

* Device fix.

* Writing changes to modular, previous ones were written to modeling just for quick testing.

* Using passed processor attention mask (only in modeling for now)

* Matching performance done in the non-standard way

* Working version of batched generation. Will change how some args are passed to make it more similar to language case

* More compliant version of the code

* Removed duplicated `_prepare_4d_causal_attention_mask_with_cache_position`

* Updating modular file, making masked filling with paddings more efficient

* Slightly more efficient version

* Modifying JanusVisionModel to be a wrapper

* Fixing test to comply with new names

* Modular overhaul

* More refactoring

* - Changing JanusVisionModel back
- Changing forward pass
- Adding boi token to the comparison

* - Removing whole context model_ids
- Using inherited implementation of prepare_inputs_for_generation

* Moving the way boi token is passed to the model

* Fixing sdpa test

* Minor changes

* testing changes

* Minor fix

* - Adding postprocessing test
- checking values of generated image on integration test

* changes

* Removing pooled attention vision module, fixing convert script as a consequence

* More changes

* Fixes

* Draft after merge

* Bug fixes

* More bug fix

* Fixing docs

* Nits

* Refactor return dict

* Moving image post processing test to main processor post process

* Passing guidance_scale as kwarg

* make style

* 🔥 refactor

* make style

* Update and green CI

* Nits and tests update

* up

* Added MID block

* fix

* Dead code

* update testcase

* update

* model_id change

* init_weight changes

---------

Co-authored-by: hsilva664 <metallic-silver@hotmail.com>
2025-04-17 09:18:51 +02:00
688f4707bf All models can be initialized on meta device (#37563)
* Update test_modeling_common.py

* fix all

* more fixes
2025-04-16 23:26:44 +02:00
0a83588c51 Bridgetower fast image processor (#37373)
* add support for fast tokenizer

* make style

* fix according to reviews

* make style

* relax slow_fast_equivalence mean diff

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-04-16 22:39:18 +02:00
4005730044 Fix Mamba2 Grouped SSD Support in the torch_forward Path (#37533)
* Fix mamba2 grouped support in bamba torch path

* patch zamba2 and mamba2

* Add a unit test for grouped SSD

* add comment for the new unit test

* add output_size arg value to repeat_interleave calls (see the sketch after this list)

* Add comment
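
For context, a small illustrative call (shapes are arbitrary) showing the `output_size` argument mentioned above: it pre-specifies the result length along `dim`, which lets `repeat_interleave` skip a device synchronization.

    import torch

    x = torch.randn(2, 3, 4)
    n_rep = 2
    y = torch.repeat_interleave(x, repeats=n_rep, dim=1, output_size=x.shape[1] * n_rep)
    print(y.shape)  # torch.Size([2, 6, 4])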
2025-04-16 22:16:01 +02:00
a7d2bbaaa8 Add EfficientNet Image PreProcessor (#37055)
* added efficientnet image preprocessor but tests fail

* ruff checks pass

* ruff formatted

* properly pass rescale_offset through the functions

* - corrected indentation, ordering of methods
- reshape test passes when cast to float64
- equivalence test doesn't pass

* all tests now pass
- changes order of rescale, normalize according to slow
- rescale_offset defaults to False according to slow
- resample was causing difference in fast and slow. Changing test to bilinear resolves this difference

* ruff reformat

* F.InterpolationMode.NEAREST_EXACT gives TypeError: Object of type InterpolationMode is not JSON serializable

* fixes offset not being applied when do_rescale and do_normalization are both true

* - using nearest_exact sampling
- added tests for rescale + normalize

* resolving reviews

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-16 21:59:24 +02:00
32eca7197a [vlm] adjust max length for special tokens (#37342)
* update

* apply suggestion

* fix tests for main branch

* remove unused logger

* add special tokens in tests

* nit

* fix more tests

* fix test

* pg also
2025-04-16 20:49:20 +02:00
c94c59fc47 Fix pixel attention mask padding in smolvlm (#37497)
* fix bad init

* also modif smolvlm

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-04-16 20:48:46 +02:00
5a6de703a7 Run test_can_load_with_global_device_set using a subprocess (#37553)
* fix

* fix

* fix

* Update tests/test_modeling_common.py

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-16 19:48:30 +02:00
9a4ce64770 🔴 Update CLIP vision attention to new attention interface (#37498)
* update attention interface

* fix test

* propagate attention changes

* revert weird changes

* fix modular

* what?

* ruff is mocking me

* ruff being ruff

* simplify test suite + fix FA2

* fixup tests  + propagate FA2 fixes

* add Copied From where relevant

* fix conflict between copies and modular

* recover FA2 training for CLIP + handle quantization

* don't ditch the warning

* tiny import fix

* code review (FA2 support, copied from)

* fix style

* modularity

* wrong copies

* future-proofing for TP

* mlcd inherits from CLIP
2025-04-16 18:15:22 +02:00
dc8227827d Fix TimesFm doc issue (#37552)
* fix doc

* code block
2025-04-16 16:28:42 +02:00
2f517200c1 Make Ignored Columns ValueError More Informative (#33299)
Make Ignored Columns Value Error More Informative

Included forward method signature columns in the ValueError so end users will know what columns are expected to be passed to the model in addition to those which are ignored.
2025-04-16 16:14:55 +02:00
0577cae808 Fix device issue for tapas (with as_tensor) (#37551)
* fix 1

* fix 2

* fix 3

* fix 4

* fix 5

* fix 6

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-16 16:02:53 +02:00
b33edf1b9b docs(typo): Update ISSUES.md, fix a small typo (#37542)
Update ISSUES.md
2025-04-16 15:01:04 +01:00
503541d7ef add FlashAttentionKwargs and seq_idx to flat collator (#36456)
* add flash attn kwargs to flattening collator

* add return_seq_idx option

* doc string edits

* cleaner max len updates

* various fixes

* temp testing code

* return int32 seq_idx and FlashAttnKwargs

* DataCollatorIntegrationTest impl

* fix batch dims and dtypes

* fill out remaining collator tests

* test name change and fmt

* rm unused var

* fmt

* minor change

* fmt

* add missing pos_ids check

* consistent {np,pt,tf} tests

* split pt tests into 3, like np/tf tests

* mv comment, rename fa test

* remove batch dim comment

* simply wrapping

* compute cu_seq_len/max_length once

* fmt

* remove tf code

* rm warning

* move separator_id back to 2nd pos

* use cleaner lists in tests

* ret -> batch

* fmt

* attr ordering

* use py ints for max_length_{k,q}
2025-04-16 15:45:03 +02:00
9ddcf5fce5 Update quantization docs (#37439) 2025-04-16 15:44:53 +02:00
a91020aed0 Add TimesFM Time Series Forecasting Model (#34082)
* initial documentation

* rename mask to attention_mask

* smaller tests

* fixup

* fix copies

* move to time series section

* sort docs

* isort fix

* batch_size is not a configuration

* rename to TimesFMModelForPrediction

* initial script

* add check_outputs

* remove dropout_rate

* works with torch.Tensor inputs

* rename script

* fix docstrings

* fix freq when window_size is given

* add loss

* fix _quantile_loss

* formatting

* fix isort

* add weight init

* add support for sdpa and flash_attention_2

* fixes for flash_attention

* formatting

* remove flash_attention

* fix tests

* fix file name

* fix quantile loss

* added initial TimesFMModelIntegrationTests

* fix formatting

* fix import order

* fix _quantile_loss

* add doc for SDPA

* use timesfm 2.0

* bug fix in timesfm decode function.

* compare mean forecasts

* refactor type hints, use CamelCase

* consolidate decode func

* more readable code for weight conversion

* fix-copies

* simpler init

* rename TimesFmMLP

* use T5LayerNorm

* fix tests

* use initializer_range

* TimesFmModel instead of TimesFmDecoder

* TimesFmPositionalEmbedding takes config for its init

* 2.0-500m-pytorch default configs

* use TimesFmModel

* fix formatting

* ignore TimesFmModel for testing

* fix docstring

* override generate as it's not needed

* add doc strings

* fix logging

* add docstrings to output data classes

* initial copy from t5

* added config and attention layers

* add TimesFMPositionalEmbedding

* calcuate scale_factor once

* add more configs and TimesFMResidualBlock

* fix input_dims

* standardize code format with black

* remove unneeded modules

* TimesFM Model

* order of imports

* copy from Google official implementation

* remove covariate forecasting

* Adapting TimesFM to HF format

* restructuring in progress

* adapted to HF convention

* timesfm test

* the model runs

* fixing unit tests

* fixing unit tests in progress

* add post_init

* do not change TimesFMOutput

* fixing unit tests

* all unit tests passed

* remove timesfm_layers

* add intermediate_size and initialize with config

* add _CHECKPOINT_FOR_DOC

* fix comments

* Revert "fix comments"

This reverts commit 8deeb3e191b3671bc1d74dbfe77b736a066c3d34.

* add _prepare_4d_attention_mask

* we do not have generative model classes

* use Cache

* return past_key_values

* modules initialized with config only

* update year

* Update docs/source/en/model_doc/timesfm.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add layer_idx to cache

* modular timesfm

* fix test

* unwrap sequential class

* fix toctree

* remove TimesFmOnnxConfig

* fix modular

* remove TimesFmStackedDecoder

* split qkv layer into individual layers

* rename projection layers

* use ALL_ATTENTION_FUNCTIONS

* is_causal is True

* rename config

* does not support flash_attn_2

* formatting

* fix typo in docsstring

* rename inputs

* add time series mapping

* Update src/transformers/models/olmo2/modeling_olmo2.py

* Update src/transformers/models/moonshine/modeling_moonshine.py

* use updated arguments

* fix class name

* add MODEL_FOR_TIME_SERIES_PREDICTION_MAPPING

* isort

* consolidate _preprocess into forward

* fix a typo

* fix a typo

* fix toc

* fix modular

* remove asserts

* use self.config._attn_implementation

* move to _postprocess_output

* remove timesfm_get_large_negative_number

* use view instead of multiple unsqueeze

* make helpers static methods of the Model

* use to_tuple

* use to_tuple if not return_dict

* remove unused initialization block as it's incorporated in nn.Linear

* remove unused num_key_value_groups

* use the same convention as the masking method

* update modular

* do not use unsqueeze

* use view instead of unsqueeze

* use buffer for inv_timescales

* formatting

* modular conversion

* remove unneeded initialization

* add missing docstrings

* remove cache

* use simple_eager_attention_forward

* support tp_plan

* support for flex and flash attention masks

* Revert "support for flex and flash attention masks"

This reverts commit def36c4fcf31599b3f4937c9334b7da1a20132c3.

* fix device

* fix tests on gpu

* remove unused large model test

* removed unneeded comments

* add example usage

* fix style

* add import

* Update docs/source/en/model_doc/timesfm.md

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* inherit from LlamaRMSNorm

* use can_return_tuple decorator

* remove return_dict

* fix year

* Update docs/source/en/model_doc/timesfm.md

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* pretrained does not inherit from GenerationMixin

* use model for integration test

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Rajat Sen <rsen91@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-16 15:00:53 +02:00
8669c016d2 Refactor torchao docs (#37490)
* refactor docs

* add serialization

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* reorder

* add link

* change automatic to autoquant

Co-authored-by: DerekLiu35 <91234588+DerekLiu35@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* nits

* refactor

* add colab

* update

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: DerekLiu35 <91234588+DerekLiu35@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-16 14:56:48 +02:00
e3d3b54638 Keep Quark loading through meta device (#37538) 2025-04-16 14:19:56 +02:00
61436a9323 convert scale and zero to cuda when using HQQ backend (#37425) 2025-04-16 14:13:20 +02:00
7752e7487c Fixes hqq by following a new path for bias parameter in pre_quantized models (#37530)
* fix

* add test
2025-04-16 13:58:14 +02:00
7dafcd0077 More appropriate cuda warmup in resource-constrained hardware (#37550)
* better allocation in resource constrained env

* Update modeling_utils.py

* CIs
2025-04-16 13:40:02 +02:00
6fd87d1172 Add Fast Grounding-Dino Processor (#37108)
* Add Fast Grounding-Dino Processor

* Added modular file

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-16 12:26:08 +02:00
ed53809ac5 enable 6 rt_detr_v2 cases on xpu (#37548)
* enable 6 rt_detr_v2 cases on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-16 11:23:56 +02:00
d91858c232 enable 3 mpt test cases on XPU (#37546)
* enable 3 mpt test cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-16 11:23:06 +02:00
4541c2cdef Fix BitsAndBytesConfig JSON serialization in TrainingArguments (#37520)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-16 11:18:17 +02:00
a335dc4d6d enable test_offloaded_cache_implementation on XPU (#37514)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-16 11:04:57 +02:00
33f6c5a5c8 enable several cases on XPU (#37516)
* enable several cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Update tests/test_modeling_common.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-16 11:01:04 +02:00
5ab7a7c640 enable 5 cases on XPU (#37507)
* make speecht5 test_batch_generation pass on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable 4 GlmIntegrationTest cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-16 09:28:02 +02:00
3165eb7c28 Refactor ColPali model documentation (#37309)
* Refactor ColPali model documentation

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Include quantization example + real images

* simpler image loading

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-15 13:52:11 -07:00
33c6fdb2cf Update VITS model card (#37335)
* Update VITS model card

* Update docs/source/en/model_doc/vits.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vits.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vits.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/vits.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update vits.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-15 13:16:05 -07:00
4cc6b60654 Fix broken add-fast-image-processor CLI (#37499) 2025-04-15 18:50:21 +02:00
51f544a4d4 Add Fast Conditional-DETR Processor (#37071)
* Add Fast Conditional-DETR Processor

* Update image_processing_conditional_detr_fast.py

* Add modular_conditional_detr.py

* Update image_processing_conditional_detr_fast.py

* Update tests

* make fix

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-15 18:33:34 +02:00
4f1dbe8152 Add Fast Chinese-CLIP Processor (#37012)
* Add Fast Chinese-CLIP Processor

* Update dummy_torchvision_objects.py

* Fix tests
2025-04-15 18:31:20 +02:00
c08997c52e VDR task guide (#37485)
* VDR task guide

* Add to toctree

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_document_retrieval.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-15 08:55:13 -07:00
57da364d8e fix and enhance pipeline_webserver.md (#36992)
* fix and enhance pipeline_webserver.md

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* Update docs/source/en/pipeline_webserver.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/pipeline_webserver.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* use pipe

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-15 08:35:05 -07:00
356b3cd71d Fix missing return type for MLCD docs (#37527)
* Fix missing return type for docs

* trigger
2025-04-15 14:04:16 +01:00
0ad3710d47 fix: Restore explicit error surfacing for unexpected hub exceptions (#37525)
* fix: Restore explicit error surfacing for unexpected hub exceptions

Prior to PR #36033, unexpected exceptions (e.g., ModuleNotFoundError) during hub model loading were not swallowed silently. They either matched specific except blocks or were raised.

After #36033, a catch-all except Exception block was introduced without a fallback else, causing unknown errors to be silently ignored and leading to misleading downstream behavior.

This commit adds an `else: raise e` to ensure only explicitly handled exceptions are suppressed. All others are surfaced, restoring pre-4.50 behavior and aiding in debugging and dependency visibility.

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-15 14:54:11 +02:00
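A minimal sketch of the pattern this commit describes, with generic names rather than the actual transformers hub-loading code: known failure modes are handled explicitly, and anything else is re-raised instead of being silently swallowed by the catch-all.

```python
# Hypothetical loader illustrating the `else: raise e` pattern described above;
# `load_from_hub` and `fetch` are illustrative names, not transformers APIs.
def load_from_hub(fetch):
    try:
        return fetch()
    except Exception as e:
        if isinstance(e, FileNotFoundError):
            # Explicitly handled case: fall back gracefully.
            return None
        else:
            # Unexpected errors (e.g. ModuleNotFoundError) are surfaced,
            # not swallowed silently.
            raise e


def flaky_fetch():
    raise ModuleNotFoundError("some optional dependency is missing")


try:
    load_from_hub(flaky_fetch)
except ModuleNotFoundError as err:
    print(f"surfaced instead of swallowed: {err}")
```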
f6c79f767c Add Fast Yolos Processor (#37292)
* Add Fast Yolos Processor

* Update modular file

* Fix copies

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-15 14:23:08 +02:00
ecaeee66bc Llama4: remove redundant transpose of router_logits (#37468)
* Llama4: remove redundant transpose of router_logits

* Fix formatting
2025-04-15 12:29:26 +01:00
6f7ea1cf00 Add MLCD model (#36182)
* Add MLCD model

* Update codes for auto-mapping

* Add test scripts for MLCD

* Update doc for MLCD model

* Fix import error

* Fix import error

* Fix CI error for attention_outputs

* Fix code style for CI

* Fix code style for CI

* Fix code style for CI

* Fix code style for CI

* Fix code style for CI

* Fix CI error for initialization

* Fix code style for CI

* Fix code style for CI

* Reformat codes and docs for CI test

* Reformat codes and docs for CI test

* Remove unused attributes for CI test

* Fix style for CI test

* List MLCD in flash_attn doc

* Fix: typos, modulars, refactors from suggestions

* Refactoring convert_mlcd_weights_to_hf.py from suggestions

* Fix: docs conflicts

* Fix error for CI test

* Fix style for CI test

* Add integration test for MLCD

* Refactoring by class inheritance

* Fix: refactor attention interface, adjust codes

* Fix: merging conflicts

* Fix: merging conflicts

* Fix: style for CI test

* Fix: style for CI test

* Fix: set test_resize_embeddings to be False

* Fix: initializer for CI test

* Fix: conflicts, CI test, warning and refactoring

* Fix: merging conflicts

* Refactor

* Update docs

* Fix mistakes

* Remove unused args and fix multi-gpu error

* Revert position_embeddings

* Solve conflicts

* Solve conflicts

* Remove dummy

* Update _init_weights

* Update _init_weights

* Update _init_weights for CI test
2025-04-15 11:33:09 +01:00
d6ac923ad9 Change default value of attn_temperature_tuning (#37501)
fix: change default value of `attn_temperature_tuning`
2025-04-15 12:10:38 +02:00
c8e0e603de Detect and use device context manager or global device in from_pretrained (#37216)
* Update modeling_utils.py

* improve

* Update modeling_utils.py

* Update test_modeling_common.py

* Update test_modeling_timm_backbone.py

* Update test_modeling_common.py

* Update test_modeling_common.py

* Update test_modeling_common.py

* Update test_modeling_common.py

* CIs
2025-04-15 09:59:20 +02:00
4e63a1747c Don't auto-assign reviewers when the author is in HF (#37500)
* Don't auto-assign reviewers when the author is in HF

* Trigger tests
2025-04-14 18:17:38 +01:00
8ab296501a Remove deprecation warning for num_logits_to_keep (#37149)
* remove everything

* style
2025-04-14 19:08:45 +02:00
20ceaca228 Add Fast owlvit Processor (#37164)
* Add Fast Owlvit Processor

* Update image_processing_owlvit_fast.py

* Update image_processing_owlvit_fast.py

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 17:58:09 +02:00
cb39f7dd5b [qwen-omni] fix processor (#37493)
* fix

* delete print

* accept kwargs in overriden models as well

* remove duplicate
2025-04-14 17:30:31 +02:00
d228f50acc Fixing gated repo issues (#37463)
using unsloth model
2025-04-14 17:19:10 +02:00
a5dfb98977 Fix wrong argparse type in modular checker script (#37472)
fix(util): wrong argparse type in modular checker script
2025-04-14 16:11:29 +01:00
a53a63c9c2 Add Fast Mobilenet-V2 Processor (#37113)
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 17:08:47 +02:00
4774a39d05 Add ImageProcessorFast to BiT processor (#37180)
* Add ImageProcessorFast to BiT processor

* propose a fast processor and add tests

* all tests pass except one

* run make

* remove useless print

* use same test as clip

* apply make

* Update src/transformers/models/bit/image_processing_bit_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Update setup.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Update src/transformers/models/bit/image_processing_bit_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* apply review comment

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 17:07:48 +02:00
e43f168eb3 Add Fast LeViT Processor (#37154)
* Add Fast LeViT Processor

* Update levit.md

* Update src/transformers/models/levit/image_processing_levit_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* ruff check

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 17:07:36 +02:00
1efcfa9ca4 Fix mask handling for flex attention in llama/gemma2/mistral/qwen2 (#37381)
* fix BlockMask handling when using flex_attention for llama/mistral/gemma2

* fix attention_mask types

* revert type hints and fixup

* remove unnecessary assertion
2025-04-14 15:53:27 +01:00
86064035f0 [bug] deprecated deta load_cuda_kernel, MultiScaleDeformableAttention (#37443)
* Update modeling_deta.py

* variable initialization
2025-04-14 15:44:30 +01:00
7cc9e61a3a Add Fast Image Processor for Donut (#37081)
* add donut fast image processor support

* run make style

* Update src/transformers/models/donut/image_processing_donut_fast.py

Co-authored-by: Parteek <parteekkamboj112@gmail.com>

* update test, remove none default values

* add do_align_axis = True test, fix bug in slow image processor

* run make style

* remove np usage

* make style

* Apply suggestions from code review

* Update src/transformers/models/donut/image_processing_donut_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* add size revert in preprocess

* make style

* fix copies

* add test for preprocess with kwargs

* make style

* handle None input_data_format in align_long_axis

---------

Co-authored-by: Parteek <parteekkamboj112@gmail.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 16:24:01 +02:00
4e53840920 Detect and fix most _init_weights() issues - make it work for composite models (#37070)
* Update test_modeling_common.py

* Fix Llama and its modular children

* Update test_modeling_common.py

* qwen3

* first try at prioritizing models

* Update test_modeling_common.py

* Update test_modeling_common.py

* Update test_modeling_common.py

* test

* fix

* fix

* more models

* more

* more

* more

* smarter init for composite models!

* fix post rebase

* smol

* fix missing args

* more

* typo

* Super elegant and efficient init for submodels

* Update modeling_utils.py

* style

* last fixes

* cleanup

* finalize cleanup

* CIs

* improve docstring

* Update modeling_utils.py

* llama4

* style

* CIs

* style

* add dpt

* granite speech

* qwen 2.5 omni

* better fix

* Parse the config file instead

* CIs
2025-04-14 16:19:04 +02:00
1897a02d83 Add Fast Image Processor for LayoutLMv3 (#37201)
* support fast image processor layoutlmv3

* make style

* add warning and update test

* make style

* Update src/transformers/models/layoutlmv3/image_processing_layoutlmv3_fast.py

* Update image_processing_auto.py

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 15:42:11 +02:00
7bff4bdcf6 Fixed broken links (#37466)
* Update broken link

* Update broken link
2025-04-14 14:16:07 +01:00
e16775d103 Add Fast Image Processor for LayoutLMv2 (#37203)
* add support layoutlmv2

* make style

* Apply suggestions from code review

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* add warning and clean up

* make style

* Update src/transformers/models/layoutlmv2/image_processing_layoutlmv2_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 15:06:41 +02:00
49b9a69a36 Add Fast Image Processor for Flava (#37135)
* support flava fast image processor

* run style and quality

* update test

* update according to reviews

* make style

* update comment on BICUBIC

* make style

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 15:05:31 +02:00
a5079a2c84 [ci] fix doc builder (#37489)
happy doc ci
2025-04-14 13:49:31 +02:00
e7f5724efd Add Fast Image Processor for Perceiver (#37176)
* add test and fast image processor

* make style

* Update src/transformers/models/perceiver/image_processing_perceiver_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* make style

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-04-14 13:49:13 +02:00
4b8c6d4cf8 Add Qwen2.5-Omni (#36752)
* Add qwen2.5-omni

* Remove einops dependency

* Add torchdiffeq dependency

* Sort init

* Add torchdiffeq to extras['diffeq']

* Fix repo consistency

* use cached_file

* del odeint

* renew pytest

* format

* Remove torchdiffeq

* format

* fixed batch infer bug

* Change positional_embedding to parameter

* Change default speaker

* Config revision

* Use modular & code clean

* code clean

* decouple padding with model & code cleaning

* sort init

* fix

* fix

* Second code review

* fix

* fix

* rename vars to full name + some comments

* update pytest

* Code clean & fix

* fix

* style

* more clean up

* fixup

* smaller vision model in tests

* fix processor test

* deflake the tests a bit (still flaky though)

* de-flake tests finally + add generation mixin

* final nits i hope

* make sure processor tests are complete

* replace with Qwen2_5OmniForConditionalGeneration

* fix tests after updating ckpt

* fix typos when cleaning, also we can't change ckpt

* fixup

* images and videos kwargs for processor

* thinker and talker loadable from hub ckpt

* address comments and update tests after rebase

* fixup

* skip for now

* fixup

* fixup

* remove torch dependency in processors

---------

Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.con>
Co-authored-by: feizi.wx <feizi.wx@alibaba-inc.com>
Co-authored-by: raushan <raushan@huggingface.co>
2025-04-14 12:36:41 +02:00
ac1df5fccd Fix tests failed with gated repos. (#37484)
* fix

* slow

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-14 12:08:13 +02:00
1ef64710d2 Remove fsspec dependency which isn't directly used by transformers (#37318)
Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-14 12:02:28 +02:00
47b9f06aa2 make test_snowman_image_captioning pass on XPU, by sharing same atol w/ ROCM (#37480)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-14 11:39:45 +02:00
78cea3e22c fix: (llama4) fix no_split_modules to be picked up for fsdpv1 and v2 sharding (#37462)
fix: fix no_split_modules to be picked up for fsdpv1 and v2 sharding

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
2025-04-14 10:44:32 +02:00
953196a43d Fix typing issues with SigLip2 (#37356)
* Fix issues

* Fix comment

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-11 22:24:23 +01:00
aaf129cdae [agents] remove agents 🧹 (#37368) 2025-04-11 18:42:37 +01:00
69e6ddf27f Delete hubconf.py (#37455)
* Delete hubconf.py

* Trigger tests
2025-04-11 18:12:45 +01:00
623d395aff Add Granite Speech Support (#36801)
* First pass at speech granite

Add encoder / projector, rename things

* Combine into one model file with causal lm outputs for forward

* Add loss calc

* Fix config loading

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Split new / old loading logic

* Use transformers integration for loading peft adapters

* Add generation wrapper for selective lora enablement

* Add note for qformer encoder automodel

* Guard torch/audio imports in feature extractor

* Handle granite speech autoclasses

* Handle optional deps in package structure for granite speech

* Add granite pretrained model def for init

* Add dummy objects for torch/torchaudio

* Add tests for granite speech processor

* Minor formatting fixes and refactoring

* Add options for falling back to config in forward

* Tentative model docstrings for granite speech

* Fix config type

* Remove legacy load

* Allow non-lora variants for granite speech

* Override weight tying for llm

* Use text config instead of llm config

* Add output embeddings getter to fix weight tying

* Fix relative imports

* computing the number of audio features, based on the raw audio sequence.

* collating audio inputs, and keeping the original lengths.

* asserted we have text. otherwise we can't specify the audio special token.

* asserting the number of audio symbols/audios matches correctly.
running _get_validated_audios only when audio is present

* indentation bugfix + supporting different feature lengths when expanding audio.

* redundant, done in _get_validated_text

* adapting the tests:
- we must have text (not either audio or text)
- _get_num_audio_features takes a list of raw lengths, provided it instead.

* Minor cleanup, remove unused import

* Add more tests for batch feature processing

* Allow setting offset in rel position embeddings

* Add config option for warning if peft is not installed w/ lora

* Port blip2 qformer code into granite speech

* Add sad test for numpy arr processing

* Allow numpy arrays / tuples in granite speech processor

* Fix config type for projector

* - pad instead of creating a zeros tensor, to keep the original dtype/device (supports bfloat16); see the sketch after this entry
- cast input_features to the model dtype (supports bfloat16)

* merge Blip2QFormerConfig to GraniteSpeechProjectorConfig

* prevent a crash when re-saving/loading the model (line 109)

* consider additional edge cases during preprocessing.

* consider additional edge cases during preprocessing.

* add features mask for batched inference (bugfix)

* Minor refactor, remove multiaudio processor tests

* Add set input/output embeddings for granite speech

* Fix feature dim check in processor test

* Pop input features in embed test for granite speech

* Small fixes for test edge cases

Add granite speech to seq2seq causal lm mapping names

* Add small tests for granite speech model

* Fix data parallelism test

* Standardize model class names

* Fix check for copies

* Fix misaligned init check

* Skip granite speech in checkpoint check

* Use default for tie_word_embeddings in granite speech

* Fix non documentation granite speech repo issues

* Fix comments and docstring checks

* Add placeholder docs for granite speech

* Fix test naming collision

* Code formatting

* Rerun torch dummy obj regen

* Fix save pretrained for granite speech

* Import sorting

* Fix tests typo

* Remove offset hack

* Pass args through encoder config

* Remove unused prune heads from blip2

* remove einsum, replacing it with explicit multiplication (relative positional encodings) and SDPA attention.

* remove Sequential from ConformerFeedForward and ConformerConvModule + fix for SDPA attention

* remove GraniteSpeechConformerScale

* rename to hidden_states

* rename conformer layers to self.layers, remove the first linear from the list to keep the list homogeneous.

* move pre-norm to the attention/feedforward blocks (avoid complex module wrapping)

* adding pre_norm into forward

* feature extractor refactoring to resemble how it's done in phi4multimodal.

* rename feature_extractor to audio_processor

* bugfix: input_feature_mask fix to get the exact number of tokens.

* Fix pytest decorator in processor test

* Add (disabled) integration tests for granite speech

* Fix handling of optional feature masking

* Loosen validation in processing for vLLM compatibility

* Formatting fixes

* Update init structure to mirror llama

* Make granite speech projector generic

* Update test config to reflect generic projector

* Formatting fixes

* Fix typos, add license

* Fix undefined var in input processing

* Cleanup and expose ctc encoder

* Add missing config docstrings

* Better var names, type hints, etc

* Set attn context size in init

* Add max pos emb to encoder config

* Cleanup feature extractor

* Add granite speech architecture details

* Remove granite speech qformer ref

* Add paper link, explicit calc for qkv

* Calculate padding directly in depthwise conv1d init

* Raise value error instead of asserting

* Reorder class defs (classes used at top)

* Precompute relpos distances

* Run formatting

* Pass attention distances through forward

* Apply suggestions from code review

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>

* Add todo for using common batch feature extraction

* Rename audios/features

* Ensure chat template may be provided to processor

* Move granite speech docs to audio models

* Add todos for input proc refactoring

* Fix import order

* Guard torch import

* Use relative imports

* Require torch backend for processor in granite speech

* Add backend guards in feature extractor

---------

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Co-authored-by: Avihu Dekel <avihu.dekel@ibm.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-04-11 18:52:00 +02:00
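A hedged sketch of the bfloat16-friendly collation mentioned in the bullets above; the shapes and the `collate_audio` helper are assumptions, not the actual feature-extractor code. The point is that `F.pad` inherits dtype and device from its input, whereas a fresh zeros buffer would need both re-specified by hand.

```python
import torch
import torch.nn.functional as F


def collate_audio(features):
    # features: list of (length_i, dim) tensors; dtype/device may vary (e.g. bfloat16).
    max_len = max(f.shape[0] for f in features)
    # Padding the tensor itself keeps its original dtype and device.
    return torch.stack([F.pad(f, (0, 0, 0, max_len - f.shape[0])) for f in features])


feats = [torch.randn(3, 4, dtype=torch.bfloat16), torch.randn(5, 4, dtype=torch.bfloat16)]
batch = collate_audio(feats)
print(batch.shape, batch.dtype)  # torch.Size([2, 5, 4]) torch.bfloat16
```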
435f88f1db nit: typing use Llama4TextConfig instead of Llama4Config (#37430)
nit: typing to text config

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
2025-04-11 17:29:34 +01:00
954f31cd81 Add XPU case to is_torch_bf16_gpu_available (#37132)
* Add xpu case to is_torch_bf16_gpu_available

Signed-off-by: cyy <cyyever@outlook.com>

* Refine error messages

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-11 17:28:47 +01:00
28eae8b4bd Add weights_only=True to torch.load (#37062) 2025-04-11 17:18:41 +01:00
bf46e44878 🚨 🚨 Allow saving and loading multiple "raw" chat template files (#36588)
* Add saving in the new format (but no loading yet!)

* Add saving in the new format (but no loading yet!)

* A new approach to template files!

* make fixup

* make fixup, set correct dir

* Some progress but need to rework for cached_file

* Rework loading handling again

* Small fixes

* Looks like it's working now!

* make fixup

* Working!

* make fixup

* make fixup

* Add TODO so I don't miss it

* Cleaner control flow with one less indent

* Copy the new logic to processing_utils as well

* Proper support for dicts of templates

* make fixup

* define the file/dir names in a single place

* Update the processor chat template reload test as well

* Add processor loading of multiple templates

* Flatten correctly to match tokenizers

* Better support when files are empty sometimes

* Stop creating those empty templates

* Revert changes now we don't have empty templates

* Revert changes now we don't have empty templates

* Don't support separate template files on the legacy path

* Rework/simplify loading code

* Make sure it's always a chat_template key in chat_template.json

* Update processor handling of multiple templates

* Add a full save-loading test to the tokenizer tests as well

* Correct un-flattening

* New test was incorrect

* Correct error/offline handling

* Better exception handling

* More error handling cleanup

* Add skips for test failing on main

* Reorder to fix errors

* make fixup

* clarify legacy processor file docs and location

* Update src/transformers/processing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Rename to _jinja and _legacy

* Stop saving multiple templates in the legacy format

* Cleanup the processing code

* Cleanup the processing code more

* make fixup

* make fixup

* correct reformatting

* Use correct dir name

* Fix import location

* Use save_jinja_files instead of save_raw_chat_template_files

* Correct the test for saving multiple processor templates

* Fix type hint

* Update src/transformers/utils/hub.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Patch llava_onevision test

* Update src/transformers/processing_utils.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Refactor chat template saving out into a separate function

* Update tests for the new default

* Don't do chat template saving logic when chat template isn't there

* Ensure save_jinja_files is propagated to tokenizer correctly

* Trigger tests

* Update more tests to new default

* Trigger tests

---------

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
2025-04-11 16:37:23 +01:00
897874748b Disable kernels for quantization (#37446)
fix
2025-04-11 16:35:38 +02:00
6a75528cbc prevent creating a view/leaf param for low rank optimizers w FSDP (#37379)
prevent creating a view/leaf param for low rank optimizers
2025-04-11 14:36:29 +02:00
6cef03ba66 [Regression] Fix Quark quantized model loading after refactorization (#37407) 2025-04-11 13:43:36 +02:00
a563999a02 [processor] clean up multimodal tests (#37362)
* clean up multimodal processor tests

* fixup

* fix tests

* fix one last test

* forgot
2025-04-11 13:32:19 +02:00
3c39c07939 Remove triton mlp kernel, not compiling for some models (#37449)
* remove mlp for now

* disable on docker
2025-04-11 12:47:13 +02:00
f797e3d98a Fix the test fetcher (#37452)
Test fetcher
2025-04-11 12:19:27 +02:00
442d356aa5 Add moe kernels (#37376)
* the fix that did not get in

* add kernels

* full graph does not work

* simpler is better

* Update src/transformers/integrations/hub_kernels.py

Co-authored-by: Daniël de Kok <me@danieldk.eu>

* Update src/transformers/integrations/fbgemm_fp8.py

Co-authored-by: Daniël de Kok <me@danieldk.eu>

* Update src/transformers/integrations/hub_kernels.py

Co-authored-by: Daniël de Kok <me@danieldk.eu>

* fixup

---------

Co-authored-by: Daniël de Kok <me@danieldk.eu>
2025-04-11 11:56:22 +02:00
7e9b57ce62 Update-kernel-pin (#37448)
* update `kernels`

* oups

* new pinned version
2025-04-11 11:19:21 +02:00
54a123f068 Simplify soft dependencies and update the dummy-creation process (#36827)
* Reverse dependency map shouldn't be created when test_all is set

* [test_all] Remove dummies

* Modular fixes

* Update utils/check_repo.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* [test_all] Better docs

* [test_all] Update src/transformers/commands/chat.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* [test_all] Remove deprecated AdaptiveEmbeddings from the tests

* [test_all] Doc builder

* [test_all] is_dummy

* [test_all] Import utils

* [test_all] Doc building should not require all deps

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-11 11:08:36 +02:00
931126b929 Fixes: Corrects file path for CUDA kernels (#37438)
Corrects the file path used to locate the CUDA kernels
for the Deformable Attention module. This ensures that
the kernels are loaded correctly, resolving potential
errors during module initialization and usage.
2025-04-11 09:41:46 +01:00
c7064cdba1 enhance require_deterministic_for_xpu (#37437)
* enhance require_deterministic_for_xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-11 08:06:08 +02:00
371c44d0ef Remove old code for PyTorch, Accelerator and tokenizers (#37234)
* Remove unneeded library version checks

Signed-off-by: cyy <cyyever@outlook.com>

* Remove PyTorch condition

Signed-off-by: cyy <cyyever@outlook.com>

* Remove PyTorch condition

Signed-off-by: cyy <cyyever@outlook.com>

* Fix ROCm get_device_capability

Signed-off-by: cyy <cyyever@outlook.com>

* Revert "Fix ROCm get_device_capability"

This reverts commit 0e756434bd7e74ffd73de5500476072b096570a6.

* Remove unnecessary check

Signed-off-by: cyy <cyyever@outlook.com>

* Revert changes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-10 20:54:21 +02:00
7ff896c0f2 [Feat] Support npu in modeling models (#37369) 2025-04-10 19:00:58 +02:00
10907e2846 Adding to self_comment_ci.yml (#37426)
add myself
2025-04-10 17:46:56 +02:00
7d76876498 (Part 2) feat: allow for tp_size attr for tplizing the model (#37054)
* feat: custom tp_size, new transformers tp interface

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: review cmt - error when tp_plan not set for tp_size

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: nit in docs

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

---------

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
2025-04-10 17:44:09 +02:00
dac443414e fix: use mtime by default in Trainer._rotate_checkpoints with automatic fallback (#37260)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-10 17:42:06 +02:00
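A hedged sketch of the rotation policy named in the title; the helper name and the exact fallback trigger are assumptions, not Trainer's actual implementation: sort checkpoint directories by modification time first, and fall back to the step number parsed from the directory name when mtime cannot be read.

```python
import os
import re
import tempfile


def sorted_checkpoints(folder, prefix="checkpoint"):
    paths = [
        os.path.join(folder, d)
        for d in os.listdir(folder)
        if d.startswith(prefix) and os.path.isdir(os.path.join(folder, d))
    ]
    try:
        # Primary ordering: filesystem modification time.
        return sorted(paths, key=os.path.getmtime)
    except OSError:
        # Automatic fallback: order by the step encoded in "checkpoint-<step>".
        def step(path):
            match = re.search(rf"{prefix}-(\d+)", path)
            return int(match.group(1)) if match else -1

        return sorted(paths, key=step)


with tempfile.TemporaryDirectory() as tmp:
    for s in (500, 100, 900):
        os.mkdir(os.path.join(tmp, f"checkpoint-{s}"))
    print([os.path.basename(p) for p in sorted_checkpoints(tmp)])
```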
6daec12d0b Add GGUF support to Gemma3 Text backbone (#37424)
* add gemma3 gguf support

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix typo and add gguf limit

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix a typo

Signed-off-by: Isotr0py <2037008807@qq.com>

* add vision conversion test

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix typos

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-10 17:15:43 +02:00
0ea1151222 Llama Kernel integration (#37092)
* initial commit

* style

* update

* change approach attention

* clean up

* fix import

* update

* update

* fix style

* change method

* attention

* add mlp back

* change name

* update name

* fix copies

* fix config

* fix
2025-04-10 17:13:25 +02:00
9c0c323e12 Fix require_read_token (#37422)
* nit

* fix

* fix
2025-04-10 17:01:40 +02:00
bde41d69b4 Correctly drop tokens in SwitchTransformer (#37123)
Previously, the identity function was used for dropped tokens, with an expert weight
that was never applied to the hidden states.
This was misleading, because dropping means the expert weight is zero.
Instead of trying to fix the weight, we take the easier approach of initializing with zeros.

Fixes issue https://github.com/huggingface/transformers/issues/37017
2025-04-10 16:58:57 +02:00
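A minimal sketch of the fix described above, on a generic token-dropping MoE layer rather than the actual SwitchTransformer code: the output buffer starts as zeros, so a dropped token contributes nothing instead of passing through as an identity.

```python
import torch
from torch import nn


def moe_forward(hidden_states, expert_index, experts, router_probs):
    # hidden_states: (tokens, dim); expert_index: (tokens,), -1 marks a dropped token.
    out = torch.zeros_like(hidden_states)  # dropped tokens keep an all-zero output
    for i, expert in enumerate(experts):
        mask = expert_index == i
        if mask.any():
            out[mask] = router_probs[mask].unsqueeze(-1) * expert(hidden_states[mask])
    return out


experts = [nn.Linear(8, 8) for _ in range(2)]
hidden = torch.randn(4, 8)
index = torch.tensor([0, 1, -1, 0])  # token 2 was dropped at capacity
probs = torch.rand(4)
print(moe_forward(hidden, index, experts, probs)[2])  # all zeros: dropped weight is zero
```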
7ecc5b88c0 Add image classifier donut & update loss calculation for all swins (#37224)
* add classifier head to donut

* add to transformers __init__

* add to auto model

* fix typo

* add loss for image classification

* add checkpoint

* remove no needed import

* reoder import

* format

* consistency

* add test of classifier

* add doc

* try ignore

* update loss for all swin models
2025-04-10 15:00:42 +02:00
5ae9b2cac0 Quark Quantization gated repo (#37412)
* fix

* empty commit

* empty

* nit

* fix maybe ?
2025-04-10 14:57:15 +02:00
d9e76656ae Fix new failure reports not including anything other than tests/models/ (#37415)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-10 14:47:23 +02:00
1ae8d54b04 [chat-template] Unify tests and clean up 🧼 (#37275)
* fix tests and some clean up

* make one general test for each modality

* remove redundant merging of kwargs

* edge cases

* dont enforce slow when reloading

* fix gemma3 tests

* has to adapt llama 4 after rebase

* remove also from overriden tests

* should be green now
2025-04-10 14:42:32 +02:00
10144ff116 use rms_norm_eps for the L2Norm for Llama4 (#37418)
use `rms_norm_eps`
2025-04-10 13:33:50 +02:00
aa478567f8 Allow rocm systems to run these tests (#37278)
* Allow rocm systems to run these tests

* Fix skipTest logic

* Use get_device_properties to check system capabilities
2025-04-10 13:33:01 +02:00
ae5ce22664 from_pretrained should handle xpu case (#37382)
* from_pretrained should handle xpu case

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fmt

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2025-04-10 13:23:17 +02:00
4f139f5a50 Send trainer/fsdp/deepspeed CI job reports to a single channel (#37411)
* send trainer/fsdp/deepspeed channel

* update

* change name

* no .

* final

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-10 13:17:31 +02:00
a2c2fb0108 update kernels to 0.4.3 (#37419)
* update `kernels`

* oups
2025-04-10 12:14:22 +02:00
0ddad2d655 mark llama4 as not supported with fa2 (#37416) 2025-04-10 11:48:46 +02:00
fbb2054ed5 Offloaded hybrid cache for Llama4 (#37401)
* first try (maybe race condition)

* Update cache_utils.py

* cannot avoid the race condition -> use 2 layers

* Update cache_utils.py

* Update cache_utils.py
2025-04-10 11:44:34 +02:00
6d8b0b3378 Fix Llama4 offset (#37414)
* add +1

* Update modeling_llama4.py
2025-04-10 11:40:58 +02:00
f5865d32a2 Restrict & Explain tp_plan for FBgemm (#37404)
* explain tp_plan

* add llama4 check

* add clarification
2025-04-10 11:33:33 +02:00
e39c732644 Handle torch ver in flexattn (#37400)
* Handle torch ver in flexattn

* update
2025-04-10 11:27:54 +02:00
bc0150bb04 Add warning when failed to acquire other user's lock at model download (#37395) 2025-04-10 11:18:27 +02:00
9cda4265d6 handle torch version edge cases (#37399) 2025-04-09 21:49:57 +02:00
e032d12e8a the fix that did not get in (#37370)
* debugging improvements

* add debugging details

* add more debugging details

* debug more

* the fix that did not get in

* First fix flex

* fix query offset

* fix flex first

* fix device mask creation for speed

* small mask creation sdpa

* Update flex_attention.py

* remove chunked prefill from HybridChunkedCache

* never seen such a messed-up merge

* clean up layers + output

* add summary json file

* Efficient general cache

* Update cache_utils.py

* cleanup

* fix?

* fix!

* oups typo

* not everywhere

* more fixes

* revert unrelated changes

* Fix but ugly for now -> should use pad instead

* oups

* re-initialize the cache

* Use pad to simplify

* style

* correct slicing

---------

Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-04-09 20:15:33 +02:00
f834ca2c19 Attention Quantization with FBGemm & TP (#37384)
* fix

* keep fused

* contiguous

* rm print

* update

* update

* rm print
2025-04-09 18:45:42 +02:00
c5c648dd74 Fix some failing AWQ tests (#37383)
* update AwqQuantizer

* fix style

* add an arg to get_modules_to_not_convert to add get_keys_to_not_convert(model)
2025-04-09 18:24:57 +02:00
71b35387fd Apply torchfix to replace deprecated functions: _pytree._register_pytree_node and torch.cpu.amp.autocast (#37372)
fix: apply torchfix
2025-04-09 16:11:18 +01:00
ad340908e4 Fix warning message for PEFT models in text-generation pipeline #36783 (#36887)
* add peft model in constant

* add test

* fix formating

* make fixup execute

* change code

* check by self.task

* add test

* fixup test code

* fix minor typo

* fix pipeline test

* apply maintainers reqests
2025-04-09 15:36:52 +01:00
2527f71a47 Add "selecting a quantization method" doc (#37159)
* initial draft

* make documentation simpler

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/selecting.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* turn pros and cons into tables

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add links to each quant method page

* separate calibration vs no calibration methods

* add calibration time estimates

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-09 15:51:37 +02:00
7ae0be722e update deepspeed docker (#37371)
* update

* create docker image

* 03

* uninstall pytest as it conflicts with transformers

* wrong one

* better

* see which package depends on pytest

* up

* reinstall

* fix

* deepspeedddddddd

* deepspeedddddddd

* deepspeedddddddd

* deepspeedddddddd

* deepspeedddddddd

* deepspeedddddddd

* deepspeedddddddd

* deepspeedddddddd

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-09 14:54:06 +02:00
e3eda6d188 Add glm4 (#37388)
* add changed

* Revert "add changed"

This reverts commit 0a0166a1fe80556115a49fbf0c2132de0f4f85c9.

* update with NEW MODEL class called GLM4

* update

* Update glm4.md

* Name

* style

* fix copies

* fixup test

---------

Co-authored-by: Yuxuan Zhang <2448370773@qq.com>
2025-04-09 14:02:04 +02:00
1e6ff5fd55 fix: llama4 conversion script no_rope_layers (#37359)
fix conversion script no_rope_layers

`no_rope_layers` should either be a list of NoPE layers or None, in which case it is created in the config from the `no_rope_layer_interval`

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2025-04-09 13:02:15 +02:00
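A hedged sketch of deriving the NoPE layer list from the interval, as the fix describes; the exact representation the Llama4 config uses may differ, and here we assume every `no_rope_layer_interval`-th layer skips RoPE.

```python
def build_no_rope_layers(num_hidden_layers, no_rope_layer_interval):
    # Assumed convention: layers at positions interval, 2*interval, ... are NoPE.
    return [
        layer_idx
        for layer_idx in range(num_hidden_layers)
        if (layer_idx + 1) % no_rope_layer_interval == 0
    ]


# no_rope_layers=None in the config would then be filled in from the interval:
print(build_no_rope_layers(12, 4))  # [3, 7, 11]
```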
6f4058aee3 Update composition flag usage (#36263)
* update composition flag usage

* remove print

* fix tests

* actually fix

* oh c'mon

* now should be fixed right?

* fix copies
2025-04-09 11:48:49 +02:00
08e3217baf Preserve requires_grad in pre quantized model (#37354)
* Preserve requires_grad in pre quantized model

Summary:
discovered this when running lm-eval for some models; the current
code always sets requires_grad to True

Test Plan:
lm_eval --model hf --model_args pretrained=jerryzh168/phi4-torchao-gguf-q4_k --tasks hellaswag --device cuda:0 --batch_size 8

Reviewers:

Subscribers:

Tasks:

Tags:

* ruff format

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-08 18:41:30 +02:00
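A minimal sketch of the bug and fix described in the summary above; the `replace_param` helper is hypothetical, not the quantizer's real API. Wrapping a tensor in `nn.Parameter(...)` defaults `requires_grad` to True, so the original flag has to be copied over explicitly.

```python
import torch
from torch import nn


def replace_param(module, name, new_data):
    old = getattr(module, name)
    # Preserve the flag; nn.Parameter(new_data) alone would reset it to True.
    setattr(module, name, nn.Parameter(new_data, requires_grad=old.requires_grad))


layer = nn.Linear(4, 4)
layer.weight.requires_grad_(False)  # e.g. a frozen pre-quantized weight
replace_param(layer, "weight", torch.zeros(4, 4))
print(layer.weight.requires_grad)  # False, as preserved
```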
4d0de5f73a 🚨 🚨 Setup -> setupclass conversion (#37282)
* More limited setup -> setupclass conversion

* make fixup

* Trigger tests

* Fixup UDOP

* Missed a spot

* tearDown -> tearDownClass where appropriate

* Couple more class fixes

* Fixups for UDOP and VisionTextDualEncoder

* Ignore errors when removing the tmpdir, in case it already got cleaned up somewhere

* CLIP fixes

* More correct classmethods

* Wav2Vec2Bert fixes

* More methods become static

* More class methods

* More class methods

* Revert changes for integration tests / modeling files

* Use a different tempdir for tests that actually write to it

* Remove addClassCleanup and just use teardownclass

* Remove changes in modeling files

* Cleanup get_processor_dict() for got_ocr2

* Fix regression on Wav2Vec2BERT test that was masked by this before

* Rework tests that modify the tmpdir

* make fix-copies

* revert clvp modeling test changes

* Fix CLIP processor test

* make fix-copies
2025-04-08 17:15:37 +01:00
c15a7adb28 fix(qwen): fix shape error when using tp (#36947)
* fix(qwen): fix shape error when using tp

* Update modeling_qwen2_vl.py

---------

Co-authored-by: shidongxing <shidongxing@pjlab.org.cn>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-04-08 17:47:30 +02:00
121f91d36c prune LM Head for USD (#36695)
* initial commit

* fix

* fix style

* set default to prune

* add tests

* comment

* remove prune flag from generate

* address Joao's comments

* deprecate_kwarg

* add doc

* fix target_vocab_size

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* fix deprecated argument assistant_model_device

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-08 16:44:10 +01:00
4321b0648c [core] remove GenerationMixin inheritance by default in PreTrainedModel (#37173) 2025-04-08 16:42:05 +01:00
aab0878327 Skip non-selected experts for mixtral and qwen2_moe (#32429)
* Skip non-selected experts for mixtral and qwen2_moe

* Fix: tensor tolist()

* WIP: tokenization test

* fix modular source of truth

* nits

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-04-08 17:41:28 +02:00
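A hedged sketch of the optimization, on a generic top-k MoE rather than the exact Mixtral/Qwen2-MoE code: only the experts that actually appear in the router's selection are run, instead of looping over all of them.

```python
import torch
from torch import nn


def sparse_moe(hidden, experts, router_logits, top_k=2):
    probs = router_logits.softmax(dim=-1)
    topk_probs, topk_idx = probs.topk(top_k, dim=-1)  # (tokens, top_k)
    out = torch.zeros_like(hidden)
    for expert_id in topk_idx.unique().tolist():  # visit selected experts only
        rows, slots = (topk_idx == expert_id).nonzero(as_tuple=True)
        weight = topk_probs[rows, slots].unsqueeze(-1)
        out[rows] += weight * experts[expert_id](hidden[rows])
    return out


experts = nn.ModuleList(nn.Linear(8, 8) for _ in range(8))
hidden = torch.randn(5, 8)
logits = torch.randn(5, 8)
print(sparse_moe(hidden, experts, logits).shape)  # torch.Size([5, 8])
```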
35f0f5b5da [llama 4] dynamic rope decorator (#37365)
l4 + dynamic rope decorator
2025-04-08 15:56:31 +01:00
530322ccb6 Set vision config to None for Gemma 1B conversion (#37366)
* Set vision config to None for Gemma 1B conversion

* Trigger tests

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2025-04-08 14:22:32 +01:00
8064cd9b4f fix deepspeed job (#37284)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-08 15:19:33 +02:00
cdfb018d03 A bit of cleaning 🧹🧹 (#37215)
* cleaning

* CIs
2025-04-08 14:33:58 +02:00
1e6b546ea6 Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
0fc683d1cd convert float for yarn related arguments in rope_scaling (#37139)
* convert float for yarn related arguments in rope_scaling

* sort keys alphabetically

---------

Co-authored-by: ryan.agile <ryan.agile@kakaobrain.com>
2025-04-08 13:58:22 +02:00
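A small sketch of the coercion this change describes; the key set is an assumption, not the exact list transformers validates. YaRN-related `rope_scaling` entries that arrive from JSON as ints or strings are normalized to float before any rotary math runs.

```python
YARN_FLOAT_KEYS = ("attention_factor", "beta_fast", "beta_slow", "factor")  # assumed set


def normalize_rope_scaling(rope_scaling):
    normalized = dict(rope_scaling)
    for key in YARN_FLOAT_KEYS:
        if normalized.get(key) is not None:
            normalized[key] = float(normalized[key])
    return normalized


print(normalize_rope_scaling({"rope_type": "yarn", "factor": 8, "beta_fast": "32"}))
# {'rope_type': 'yarn', 'factor': 8.0, 'beta_fast': 32.0}
```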
2515a5a290 Expose blip2qformer (#37254)
* Expose blip2qformer

* Add missing args to blip2 config
2025-04-08 12:04:33 +02:00
2da82e432d Multiple llama4 fixe (#37353)
* update for fixes

* more fixes

* fix dynamic cache?

* style

* fix both training and generating. Eager seems alright

* dynamic does not work

* fix most cases, use_cache or not, eager or not, no default cache (ex: not training but you want to get cache states)

* should be final fixes

* fix more stuff no cat

* style

* fix

* style

* final style

* quality

* fix

* revert
2025-04-08 11:14:49 +02:00
794fde7b1c Fixing flex attention for torch=2.6.0 (#37285)
* adding compile kwarg for torch 2.6

* fixing dynamic

* addressing comment

* typo

* Update src/transformers/integrations/flex_attention.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-04-07 23:04:46 +02:00
b54c2f4689 more fixes for post-training llama4 (#37329)
* more fixes for post-training llama4

* use target_length instead of guearded past_key_values
2025-04-07 21:20:23 +02:00
754a370bca Remove unnecessary attr assignment (#36837)
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-07 20:19:54 +01:00
31a62c2eb8 Updated Model-card for donut (#37290)
* Updated documentation for Donut model

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated code suggestions

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated code suggestion to Align with the AutoModel example

* Update docs/source/en/model_doc/donut.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Updated notes section included code examples

* close hfoption block and indent

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 11:54:47 -07:00
f830105183 Add bnb to the list of supported quantization methods for LLama4 (#37348)
* add bnb

* style

* update

* add pre_quantized check
2025-04-07 20:34:06 +02:00
e2b0224d94 Update Model Card for Jamba (#37152)
* Update model card for jamba

* Apply the suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review-2

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update model page.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update as per code review.

* Update docs/source/en/model_doc/jamba.md as per code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/jamba.md as per code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update as per code review.

* fixes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 11:02:59 -07:00
6cc109c354 Improvements in Gemma2 model card (#37076)
* Improved Model card for Gemma2

* Made changes in gemma2 as suggested

* Made more changes in the doc (adding image, notes, closing hfoptions)

* minor fixes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 10:51:26 -07:00
8bbcdf5409 Clean up the compressed-tensors integration (#37349)
clean up
2025-04-07 19:26:45 +02:00
3a826a45ca Update Model card for GPT2 (#37101)
* Update Model card for gpt2

* Update link for gpt2 space

* fixes docs based on suggestions

* Add transformers-cli and quantization example for GPT-2

* Remove resources and flash attention docs and fix typos
2025-04-07 10:15:28 -07:00
5e855095a2 Update falcon mamba card (#37253)
* feat: edit falcon mamba card

* fix: edit statement on falconmamba arch

* Update docs/source/en/model_doc/falcon_mamba.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon_mamba.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon_mamba.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: add right indent for tags

* fix: remove notas

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 10:12:44 -07:00
416b5a875d Update model-card for DINOv2 (#37104)
[docs] Update model-card for DINOv2
2025-04-07 10:11:08 -07:00
f8a16805c5 updated model card for Mistral (#37156)
* model card for Mistral

* Update docs/source/en/model_doc/mistral.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mistral.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mistral.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mistral.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/mistral.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions

* fix typo

* updated with comments

* updated with comments

* updated with comments

* remove hfoption block

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-07 10:05:36 -07:00
48e179857c Remove HQQ from caching allocator warmup (#37347)
Update modeling_utils.py
2025-04-07 18:33:48 +02:00
832cb684a0 Update translation template (#37294) 2025-04-07 09:29:37 -07:00
22065bd645 fix derived berts _init_weights (#37341)
* fix derived berts

* more

* roformer
2025-04-07 18:25:07 +02:00
f789f960c8 Avoid build crashes when torch.version.xpu doesn't exist and fix Llama4 processor tests (#37346)
* Avoid build crashes when torch.version.xpu doesn't exist

* Trigger tests

* Fix image token and skip inappropriate test

* Remove ignore_errors=True

* Add another skip
2025-04-07 17:05:54 +01:00
12bf24d6ae enable 2 llama UT cases on xpu (#37126)
* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* switch to use Expectations

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* extract gen bits from architecture and use it

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* add cross reference

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-07 16:02:14 +02:00
e7ad077012 byebye torch 2.0 (#37277)
* bump minimum Torch to 2.1; drop the broken `torch.compile` compatibility path

* dep table

* remove usage of is_torch_greater_or_equal_than_2_1

* remove usage of is_torch_greater_or_equal_than_2_1

* remove if is_torch_greater_or_equal("2.1.0")

* remove torch >= "2.1.0"

* deal with 2.0.0

* PyTorch 2.0+ --> PyTorch 2.1+

* ruff 1

* difficult ruff

* address comment

* address comment

---------

Co-authored-by: Jirka B <j.borovec+github@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-07 15:19:47 +02:00
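The commit above raises the minimum supported PyTorch to 2.1. A minimal sketch of the kind of guard such a bump implies (illustrative only, not the library's exact check), assuming `torch` and `packaging` are installed:

```python
from packaging import version
import torch

# Fail fast on the now-unsupported torch 2.0.x.
if version.parse(torch.__version__) < version.parse("2.1.0"):
    raise ImportError(f"torch >= 2.1.0 is required; found {torch.__version__}")
```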
99f9f1042f Fix torchao usage (#37034)
* fix load path

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix path

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix torchao usage

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert useless change

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert fp8 test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix fp8 test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix fp8 test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torch dtype

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-07 14:50:48 +02:00
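For context, a hedged sketch of the torchao quantization path this fix touches; `TorchAoConfig` is the documented entry point, and the checkpoint id is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, TorchAoConfig

# int4 weight-only quantization via torchao.
quant_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",  # placeholder checkpoint
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quant_config,
)
```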
0fb8d49e88 Use Python 3.9 syntax in examples (#37279)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-07 12:52:21 +01:00
08f36771b3 Fix init empty weights without accelerate (#37337)
* add the integration

* Update accelerate.py

* Update accelerate.py

* add find_tied_params as well

* Update accelerate.py

* add where copied from

* simplify

* add error
2025-04-07 11:37:29 +02:00
9db31ea585 Fix deepspeed with quantization (#37324)
* Update modeling_utils.py

* Update modeling_utils.py
2025-04-07 11:36:44 +02:00
debfe904c9 fix llama4 training (#37319) 2025-04-07 09:24:44 +02:00
54538ebee3 fix flex attn when optional args aren't passed (#37327) 2025-04-07 09:12:21 +02:00
d1b92369ca v4.52.0.dev0 2025-04-05 22:04:21 +02:00
25b7f27234 Add llama4 (#37307)
* remove one of the last deps

* update fast image processor after refactor

* styling

* more quality of life improvements

* nit

* update

* cleanups

* some cleanups

* vllm updates

* update fake image token

* [convert] Fix typo

* [convert] Strip extraneous bytes from shards

* [convert] Minor fixes

* [convert] Use num_experts

* multi-image fixes in modeling + processor

* fixup size

* 128 experts

* Use default rope

* Unfuse mlp

* simplify a lot inputs embeds merging

* remove .item() 👀

* fix from review

* Address feedback

* Use None "default" for rope_scaling. Add eot.

* set seed

* return aspect ratios and bug fixes

* Moe 128 rebased (#8)

* 128 experts

* Use default rope

* Unfuse mlp

* Address feedback

* Use None "default" for rope_scaling. Add eot.

* Meta/llama quant compat (#7)

* add quant compatible model & conversion code for llama4

* fix a few issues

* fix a few issues

* minor type mapping fix

---------

Co-authored-by: Lu Fang <fanglu@fb.com>

* use a new config parameter to determine which model definition to use for MoE

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Lu Fang <fanglu@fb.com>

* un-comment write_tokenizer from converting script

* remove un-used imports

* [llama4] Pop aspect_ratios from image processor output in Llama4Processor

Signed-off-by: Jon Swenson <jmswen@gmail.com>

* Fix parameter_count name

* Update src/transformers/models/llama4/configuration_llama4.py

* nit

* Add changes for no_rope, moe_layers, chunked attention. Just need to test all

* Update src/transformers/models/llama4/image_processing_llama4_fast.py

* nit

* fix post merge with main

* support flex attention

* fixes

* fix

* add layer

* small updates

* rebase and delete llm_compressor

* nit

* [llama4/mm] Add back <|image|> token that delimits global tile

* [llama4/mm] Fix Llama 4 image processing unit tests

* add explicit dtype

Signed-off-by: Jon Swenson <jmswen@gmail.com>

* sdpa works

* comment todo small

* fix model loading

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* revert

* nits

* small fix for TP on 1 node

* Read new params from config

* Add <|eom|>

* lol don't know how this got here

* adding fp8

* Save processor, fix chat template

* style

* Add boi/eoi tokens

We don't use them.

* fixes for now flex seems to work :)

* updates

* nits

* updates

* missing keys

* add context parallel

* update

* update

* fix

* nits

* add worldsize and make eager attn work for vision

* Ignore new key present in base models

* add tp_plan

* fix nope

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* minor fix

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* Clean up Llama4 vision model

* current updates

* add support for `attn_temperature_tuning`

* add floor scale

* add missing attn scales

* push what works, dirty trick for the device sync

* oups

* Fix pad_token_id

See
https://huggingface.co/ll-re/Llama-4-Scout-17B-16E/discussions/2/files
Confirmed in the original codebase.

* fix causallml loading

* rm

* fix tied-weights

* fix sdpa

* push current version

* should work with both short and long

* add compressed_tensors & fix fbgemm tp

* Fix flex impl

* style

* chunking

* try to revert the potentially breaking change

* fix auto factory

* fix shapes in general

* rm processing

* commit cache utils cleanup

* Fix context length

* fix

* allocate

* update tp_plan

* fix SDPA!

* Add support for sparse `Llama4TextMoe` layer from the kernel hub

* cleanup

* better merge

* update

* still broken fixing now

* nits

* revert print

* Write max_position_embeddings and max_model_length

* Update modeling_llama4.py

* Save attention_chunk_size

* Sync eos terminators

* Read initializer_range

* style

* remove `dict`

* fix

* eager should use `chunked_attention_mask`

* revert

* fixup

* fix config

* Revert "Merge pull request #36 from huggingface/sparse-llama4-moe"

This reverts commit ccda19f050867dd42ea143c5de60f3dec81375f0, reversing
changes made to a515579aed8c0fe9bf529b6c40446a289406d5d6.

* Fix typo and remove warning with compiled flex and chunked prefill

* Fix MoE vs FF (#41)

* fix

* Use correct no_rope_layers if provided one is empty list

* update tests

* fix

* skipping some tests

* fix fp8 loading

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* fix text generation pipeline

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* eager needs 4D mask

* fix

* Some cleanup

* fix

* update

* fix

* replace correctly module

* patch

* modulelist

* update

* update

* clean up

* Don't move to `cuda:0` in distributed mode

* restrict to compressed tensors for now

* rm print

* Docs!

* Fixes

* Update docs/source/en/model_doc/llama4.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fixes

* cuda graph fix

* revert some stuff

* fixup

* styling

* Update src/transformers/models/llama4/modeling_llama4.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* commit licence, cleanup here and there and style

* more styling changes

* fix dummies

* fix and clean docstrings

* remove comment

* remove warning

* Only fast image processor is supported

* nit

* trigger CI

* fix issue with flex encoder

* fix dynamic cache

* Code quality

* Code quality

* fix more tests for now

* Code quality

* Code quality

* Nuke bunch of failing stuff

* Code quality

* Code quality

* cleanup removal of slow image processor

* ruff fix fast image processor

* fix

* fix styling

* Docs

* Repo consistency

* Repo consistency

* fix sliding window issue

* separate llama cache

* styling

* Repo consistency

* Repo consistency

* push what works

* L4 Repo consistency

* Docs

* fix last remaining issues

---------

Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Keyun Tong <tongkeyun@gmail.com>
Co-authored-by: Zijing Liu <liuzijing2014@users.noreply.github.com>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Zijing Liu <liuzijing2014@gmail.com>
Co-authored-by: Jon Swenson <jmswen@gmail.com>
Co-authored-by: jmswen <jmswen@users.noreply.github.com>
Co-authored-by: MekkCyber <mekk.cyber@gmail.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>
Co-authored-by: Yong Hoon Shin <yhshin@meta.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: drisspg <drisspguessous@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: Ye (Charlotte) Qi <ye.charlotte.qi@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-05 22:02:22 +02:00
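A hedged sketch of text-only chat with the newly added Llama 4; the checkpoint id is an assumption, and the model is large enough that `device_map="auto"` is used to spread it across available accelerators:

```python
import torch
from transformers import AutoTokenizer, Llama4ForConditionalGeneration

ckpt = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed checkpoint id
tok = AutoTokenizer.from_pretrained(ckpt)
model = Llama4ForConditionalGeneration.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Explain chunked attention briefly."}]
input_ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
print(tok.decode(model.generate(input_ids, max_new_tokens=64)[0]))
```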
aa40fda346 Hf Xet extra (#37305)
* Hf Xet extra

* Hf Xet extra
2025-04-05 21:06:05 +02:00
e94571580b Fix deepspeed loading (part 2) (#37306)
* fix

* Update modeling_utils.py

* Update modeling_utils.py

* oups remove print
2025-04-05 20:41:42 +02:00
84aa13dd85 Fix deepspeed loading (#37281)
* Update modeling_utils.py

* Update modeling_utils.py

* fix and remove all imports

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Update modeling_utils.py
2025-04-05 17:05:45 +02:00
0ef339ff1b Update OpenAI GPT model card (#37255)
* Update OpenAI GPT model card

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update OpenAI GPT model card: add usage examples and notes section

* Add API autodoc tags after Notes section for OpenAI GPT model

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added missing badges

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-04 15:25:16 -07:00
46d73910d5 Updated T5 model card with standardized format (#37261)
* Updated T5 model card with standardized format

* Updated T5 model card with standardized format, fixed typo

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply reviewer suggestions

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-04 15:23:09 -07:00
579135a2f6 Updated model card for distilbert (#37157)
* Updated model card for distilbert

* Updated the distilbert model card

* Updated model card for distilbert

* Updated the distilbert model card

* Addressed code review comments

* Addressed review comments

* fix pipeline

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-04 15:22:46 -07:00
8cd57eb731 mobilebert model card update (#37256)
* mobilebert model card update

* Updates to model card mobilebert

---------

Co-authored-by: Reshan Gomis <reshang@verdentra.com>
2025-04-04 14:28:35 -07:00
ebe47ce3e9 Fix: Unexpected Keys, Improve run_compressed, Rename Test Folder (#37077) 2025-04-04 21:30:11 +02:00
531e4fcf0e Update model card for Depth Anything (#37065)
[docs] Update model card for Depth Anything
2025-04-04 11:36:05 -07:00
a4e55fcff8 Disable delay_optimizer_creation in Trainer to support fsdp2 (#37147)
* github why you do this

* fix

* make fixup

* disable cpu offload test

* fixup

* tmp reworks

* git branch movement

* make fixup

* add require_fsdp_v2_version

* dep issues

* update ruff and fixup
2025-04-04 20:11:37 +02:00
878562b68d fix test device spec relative path importing issue (#37190)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-04 18:22:55 +02:00
8ebc435267 Fix llava_onevision tests (#37280)
* Fix llava_onevision tests

* Trigger tests
2025-04-04 15:03:38 +01:00
ad3d157188 [RoPE] abstract dynamic RoPE update under a decorator (#37249)
* dynamic rope decorator

* longrope; shorter fwd pass

* proper docstring

* make fixup
2025-04-04 14:27:28 +01:00
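A generic sketch of the decorator pattern this commit introduces (not the library's actual implementation): wrap the RoPE forward so inverse frequencies are recomputed whenever a longer sequence than previously seen arrives; `compute_inv_freq` is an assumed helper.

```python
import functools

def dynamic_rope_update(rope_forward):
    @functools.wraps(rope_forward)
    def wrapper(self, x, position_ids):
        seen_len = int(position_ids.max()) + 1
        if seen_len > getattr(self, "max_seen_len", 0):
            self.max_seen_len = seen_len
            self.inv_freq = self.compute_inv_freq(seen_len)  # assumed helper
        return rope_forward(self, x, position_ids)
    return wrapper
```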
3d40bda30e Hugging Face Hub pin to v0.30.0 for Xet (#37166) 2025-04-04 14:58:22 +02:00
acbcb5d07d [Tests] flaky test_constrained_beam_search_generate_dict_output (#37276) 2025-04-04 13:38:42 +01:00
4ba0989eab Clarify error message to ensure min 28x28 image supplied for Qwen 2.5 VL (#37264)
fix: clarify error message for min 28x28 images

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-04 12:53:38 +01:00
352ec8ef22 pin specific natten version in docker file (#37274)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-04 13:47:16 +02:00
edd345b52e Fix deprecated PT functions (#37237)
* Fix deprecated PT functions

Signed-off-by: cyy <cyyever@outlook.com>

* Revert some changes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-04 12:31:11 +01:00
b016de1ae4 Fix utils/check_bad_commit.py (#37272)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-04 12:18:20 +02:00
f74d7da836 Introduce modular files for speech models (#35902)
* WAV_2_VEC_2 to WAV2VEC2

* added modular files for hubert, wavlm, wav2vec2_bert, data2vec_audio

* remove unnecessary definitions in modulars

* added modular files for UniSpeech, UniSpeechSat, Wav2Vec2Conformer

* docstring fix for UniSpeechForCTC

* removed unnecessary re-definition of modular classes

* reverted lazy imports change on modular_model_converter, type-alias for Wav2Vec2BaseModelOutput

* top-level import of deepspeed in seamless_m4t, speecht5

* avoid tracking imports inside classes, relocate lazy deepspeed, peft imports in their original locations

* convert modular

* tiny modular typing fixes

* some more modular fixes

* make style

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-04-04 11:46:27 +02:00
d130cd0e16 update error msg (#37207) 2025-04-04 10:21:30 +02:00
41b9b92b52 [qwen-vl] fix image processor (#37258)
* fix

* add test
2025-04-03 19:48:56 +02:00
8dd0a2b89c Update model card for electra (#37063)
* Update ELECTRA model card with new format

* Update ELECTRA model card with new format

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* close hfoption block

---------

Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:45:35 -07:00
15ac2b6ac5 Update Model Card for ModernBERT (#37052)
* Modify Model Card for ModernBERT.

* Update as per code review.

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update model card.

* Update model card.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:14:02 -07:00
b552708694 chore: Update model doc for code_llama (#37115)
* Update code_llama.md

aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598

sub part of https://github.com/huggingface/transformers/issues/36979

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* make changes as per code review

* chore: make the function smaller for attention mask visualizer

* chore[docs]: update code_llama.md with some more suggested changes

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* chore[docs]: Update code_llama.md with indentation changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:09:41 -07:00
2b84831a93 Update model card for Cohere (#37056)
* Update Cohere model card to follow standard template

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update cohere.md

Update code snippet for AutoModel, quantization, and transformers-cli

* Update cohere.md

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 09:51:40 -07:00
2d46a08b63 Purge unused ModelTester code (#37085)
* Purge correctly this time

* Remove more methods from recent PRs

* make fixup
2025-04-03 17:48:35 +01:00
1b29409d89 feat: updated model card for qwen_2.5_vl (#37099)
* feat: updated model card for qwen_2.5_vl

* applied suggested change 1

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* applied suggested change 2

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* applied suggested change 3

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: made requested changes for quantization and notes

* suggested model card change 4

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card with suggested change 5

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card with suggested change 6

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card with suggested change 7

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feat: applied requested changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 09:13:26 -07:00
8a828a747e Add Optional to types (#37163)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-03 16:38:01 +01:00
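The annotation fix this commit applies repo-wide, in one line: a parameter that defaults to `None` should be annotated `Optional` (or `X | None`), not with the bare type.

```python
from typing import Optional

def resize(image: list, size: Optional[int] = None) -> list:  # not `size: int = None`
    return image if size is None else image[:size]
```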
3f6af96732 Adding links to ShieldGemma 2 technical report (#37247) 2025-04-03 16:26:29 +01:00
9a1c1fe7ed [CI] green llama tests (#37244)
* green llama tests

* use cleanup instead

* better test comment; cleanup upgrade

* better test comment; cleanup upgrade
2025-04-03 14:15:53 +01:00
782d7d945d Allow flexible generation params arg when checking pipeline specs (#37211)
* Allow flexible generation params arg

* Trigger tests

* Add docstring and rename js_generate to hub_generate
2025-04-03 13:29:36 +01:00
afafb84b59 Add support for fast image processing in image-pretraining example (#37021)
* Add support for fast image processing in image-pretraining example

Fix typo: correct tuple formatting in IMAGE_PROCESSOR_MAPPING_NAMES

Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>

* Use fast image processor by default

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>

---------

Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-03 13:26:46 +01:00
34ccfebf32 Fix AST parsing when looking for remote code imports (#37245)
* Not all Call.func nodes have id because they can be methods

* Trigger tests

* Trigger tests
2025-04-03 13:00:51 +01:00
f697b3f824 enable 2 types of case on XPU (#37198)
enable 2 types of cases on XPU: 1. test_resize_tokens_embeddings_with_deepspeed_multi_gpu 2. test_resize_embeddings_untied_with_deepspeed_multi_gpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-03 11:37:55 +02:00
2099287a59 [CI] lazy loading external datasets (#37218) 2025-04-03 09:57:45 +01:00
a0803a9555 [tests] fix mamba integration simple inference precision issue (#37193)
* fix precision issue

* use float32
2025-04-03 10:38:03 +02:00
6ce238fe7a Fix test (#37213)
* Update test_modeling_common.py

* style
2025-04-03 10:24:34 +02:00
12048990a9 Add new dim to num_items_in_batch if necessary (#36967)
* Add new dim to `num_items_in_batch` if necessary

* Unsqueeze only in the DP case

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-03 09:57:03 +02:00
98601cc818 [Phi4] add multimodal chat template (#36996)
* phi4 chat template

* remove from valid kwargs
2025-04-03 09:52:09 +02:00
c9302c0983 Fix static cache export (#37229)
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-04-03 07:05:57 +02:00
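Not the export path itself, but a hedged sketch of the static-cache generation API the fix keeps exportable; the checkpoint id is a placeholder for any static-cache-capable model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "meta-llama/Llama-3.2-1B"  # placeholder checkpoint
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)

inputs = tok("Static caches preallocate KV memory,", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=16,
                     cache_implementation="static")
print(tok.decode(out[0], skip_special_tokens=True))
```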
2056287940 Updated model card for Qwen2 (#37192)
* Update qwen2.md

* Update qwen2.md

* Update qwen2.md

* Update qwen2.md

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update qwen2.md

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 18:10:41 -07:00
3e96a0c32b Update falcon model card (#37184)
* feat: updated model card for falcon

* fix: rewrite model description

* fix: add link to conversion script

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: Add suggested changes

* fix: typo in link for quantization

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: fix indent and close ticks

* fix: add indent

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 17:30:37 -07:00
199d7adf10 Updated the model card for CLIP (#37040)
* Update clip.md

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Incorporated suggested changes

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 14:57:38 -07:00
126abe3461 More ReDOS fixes! (#36964)
* More ReDOS fixes!

* Slight regex cleanup

* Cleanup regex replacement

* Drop that regex entirely too

* The regex didn't match config.json, let's make sure we don't either

* Cleanup allowed_value_chars a little

* Cleanup the import search

* Catch multi-condition blocks too

* Trigger tests

* Trigger tests
2025-04-02 18:46:14 +01:00
3d133cc557 Stop DOSing the Hub in the CI (#37209)
* As the title suggests, stop hammering the same files

* make fixup

* Use shutil instead of pathlib
2025-04-02 17:19:33 +01:00
e90d55ebcc [Tests] add min_new_tokens to prevent flaky length checks (#37175) 2025-04-02 15:24:00 +01:00
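A sketch of the stabilization the tests adopt: pinning both ends of the generated length so length assertions cannot be flaky. `gpt2` stands in as a small placeholder model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("Deterministic output length:", return_tensors="pt")
out = model.generate(**inputs, min_new_tokens=8, max_new_tokens=8,
                     pad_token_id=tok.eos_token_id)
# min == max forces exactly 8 new tokens, so this check can't flake.
assert out.shape[1] == inputs["input_ids"].shape[1] + 8
```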
cbfa14823b No more dtype_byte_size() (#37144)
* No more dtype_byte_size()

* Remove function once again

* Fix rebase cruft

* Trigger tests
2025-04-02 14:58:38 +01:00
7613cf1a45 Add py.typed (#37022) 2025-04-02 14:17:27 +01:00
32c12aaec3 [3/N] Use pyupgrade --py39-plus to improve code (#36936)
Use pyupgrade --py39-plus to improve code

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:16:06 +01:00
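Examples of the rewrites `pyupgrade --py39-plus` performs: builtin generics replace their `typing.*` spellings, which is valid syntax on Python 3.9+.

```python
sizes: list[int] = [1, 2, 3]        # was: typing.List[int]
table: dict[str, int] = {"a": 1}    # was: typing.Dict[str, int]
pair: tuple[int, str] = (0, "x")    # was: typing.Tuple[int, str]
```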
764ab0d46a Merge tensor operations with device transfer operations (#37097)
* Merge operations with to

Signed-off-by: cyy <cyyever@outlook.com>

* Use dtype

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:15:23 +01:00
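The pattern this commit applies throughout: fuse dtype conversion and device transfer into one `.to(...)` call instead of chaining two, which avoids an intermediate copy.

```python
import torch

x = torch.randn(4, 4)
# before: x = x.to("cpu").to(torch.float16)   # two materializations
x = x.to(device="cpu", dtype=torch.float16)   # one fused call
```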
c94c6ed397 Fix some code annotation typos. (#37102)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-02 14:00:41 +01:00
e94d607c8b fix: Add 'image-text-to-text' to TASK_MAPPING (#37107)
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-02 14:51:03 +02:00
adfc91cd46 Try to avoid/reduce some remaining CI job failures (#37202)
* try

* try

* Update tests/pipelines/test_pipelines_video_classification.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
6f5dc9c82e Fixes DynamicCache export issues due to control flow and inplace modifications (#36652)
* Remove unnecessary masked_fill in deberta models

* Enable some code when exporting but not compiling

* add missing import

* style

* replace if by torch.cond

* style

* use numel

* style

* add unit tests

* style

* change empty value for dynamic cache

* replace != [] by numel()

* fix import issue

* style
2025-04-02 12:04:40 +01:00
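One of the patterns the commit switches to, sketched under the assumption of an empty per-layer cache slot: a shape-based `numel()` check rather than comparing a tensor against `[]`, which export tracing handles far better.

```python
import torch

key_states = torch.empty(0)  # an empty cache slot

# before (breaks torch.export tracing): if key_states != []: ...
if key_states.numel() > 0:
    print("cache slot is populated")
```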
a165458901 Add device workaround for int4 weight only quantization after API update (#36980)
* merge

* fix import

* format

* reformat

* reformat

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-02 12:42:22 +02:00
ed95493ce0 Skip code 307 in RequestCounter (#36953)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:35:46 +02:00
211e4dc9a4 [chat-template] fix video loading (#37146)
* fix

* add video

* trigger

* push new images

* fix tests

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:27:50 +02:00
800510c67b [doc] Fix link for Quark quantization page (#37179)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-01 20:57:38 +02:00
41f5c3216c Revert #37031 (#37178)
Update modeling_utils.py
2025-04-01 19:48:15 +02:00
bc2dea3f54 Fix meta state dict loading with quantizers (#37136)
Update modeling_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-01 18:45:58 +02:00
35253076f4 Avoid pipeline test failing related to Hub call (#37170)
* cls

* cls

* cls

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-01 18:22:45 +02:00
bf41e54fc8 Fixes the inconsistency of the optionality of attention_mask (#37153)
* debugging issue 36758

* debugging issue 36758

* debugging issue 36758

* updated attn_mask type specification in _flash_attention_forward

* removed pdb

* added a blank line

* removed indentation
2025-04-01 15:31:10 +01:00
3249c5dc15 Refactor attention for SigLIP based models (#36981)
* Update Siglip attention implementation

* Update tests for Siglip

* Remove one level of indentation

* Update test to be more specific

* Fixup

* Idefics2

* Idefics3

* Emu3

* SmolVLM

* Phi4 (just init small update)

* Idefics2 (test fix)

* Update siglip2 tests

* Update eager

* trigger

* Clean up

* Transfer inputs to device in test

* Fixing test

* Fixing test

* Revert contiguous

* Remove unused is_flash_attn_2_available

* Move flaky to specific models
2025-04-01 15:37:25 +02:00
24e311f42b fix XPU UT error case brought by RNG difference between XPU and CUDA (#37121)
* fix XPU UT error case brought by RNG difference between XPU and CUDA

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Revert "enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu"

This reverts commit 3ef83a4f0204642daa45fda56e8aca1afed24b4f.

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-01 13:52:55 +01:00
897ff9af0e [ModernBERT] Never save 'reference_compile' config; should be set based on end user (#36305)
* Never save 'reference_compile' config; should be set based on end user

* Reformat (I ran 'make style' from the wrong env)

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-04-01 14:14:39 +02:00
c0bd8048a5 Make canine model exportable by removing unncessary complicated logic (#37124) 2025-04-01 12:31:12 +01:00
60b75d99b6 Only count num items in batch when needed (#36867)
only count num items when needed
2025-04-01 12:30:39 +02:00
fac70ff3c0 Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses (#36736)
* make _VALID_DICT_FIELDS a class attribute

* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00
ae34bd75fd Use public export API on torch 2.5 and later (#36781)
Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-01 10:47:38 +01:00
8f6b27eb5c enable test_assisted_decoding_in_different_gpu test on XPU (#37120)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-01 11:22:59 +02:00
737cbd2109 Fix llava xpu tests. (#37130)
* fix llava 4bit xpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix llava 4bit xpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:10:13 +02:00
3a6ab46a0b add gpt2 test on XPU (#37028)
* add gpt2 test on XPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* auto dtype has been fixed

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* convert model to train mode

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:09:29 +02:00
4b13a02920 Fix std initialization in Idefics variants (#37100)
* Nit 😅

* Another one

* fix

* run ci

* revert change
2025-04-01 09:18:54 +02:00
786d9c5ed9 Fix more inefficient PT operations (#37060)
* Fix inefficient operations

* Remove cpu() call

* Reorder detach()

* Reorder detach()

* tolist without detach

* item without detach

* Update src/transformers/models/rag/modeling_rag.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/encodec/test_modeling_encodec.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Use detach().cpu().numpy

* Revert some numpy operations

* More fixes

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
a1e389e637 Refactor return_dict logic to remove complicated if/else paths (#36794)
* SAM

* CLIP

* SigLIP

* GOT-OCR2 (depends on SAM)

* SigLIP2 (depends on SigLIP)

* trigger tests

* Fix SAM

* Fix missed indexing, use named attributes

* Llama

* Aria

* Bamba

* Update llama: missed outputs return type

* (fixup) Aria

* DiffLlama

* Emu3

* Gemma

* Gemma2

* Paligemma

* Fix paligemma

* Gemma3

* GLM

* Helium

* JetMoe

* Jamba

* Mistral

* Mistral

* Mixtral

* Nemotron

* Olmo

* Olmo2

* Persimmon

* Phi

* Phi3

* PhiMoe

* Qwen2

* Qwen2_moe

* StableLM

* Starcoder2

* Add return_dict decorator

* SAM

* Update decorator: compile, export, trace - friendly

* Llama (decorator)

* SAM (decorator)

* Add decorator `can_return_tuple`

* Llama

* Update to decorator

* Update CLIP

* Update decorator to store `_is_top_level_module` in self

* Update decorator to correctly handle compile/export

* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment

* Typing

* GPT NeoX

* Fixup

* Fix attribute Granite

* Fix return type mixtral

* Update Gemma3

* Fix Cohere amd Cohere2

* Fixup

* Fix corner case for Phi4, when activation is shared

* (fix-copies) deepseekv3, phi4

* Fixup

* Apply to qwen3/qwen3_moe

* Fix
2025-03-31 16:23:37 +01:00
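A generic sketch of the `can_return_tuple` idea introduced above (not the library's exact code): the forward always builds a `ModelOutput`, and the decorator converts it to a tuple at the top level only when the caller asked for one.

```python
import functools

def can_return_tuple(forward):
    @functools.wraps(forward)
    def wrapper(self, *args, return_dict=None, **kwargs):
        output = forward(self, *args, **kwargs)  # always a ModelOutput
        use_dict = return_dict if return_dict is not None else getattr(
            self.config, "use_return_dict", True)
        return output if use_dict else output.to_tuple()
    return wrapper
```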
f304318f5f Remove low_cpu_mem_usage and _fast_init (#36963)
* Remove low_cpu_mem_usage and _fast_init

* Update deepspeed.py

* Update modeling_utils.py

* remove the first 2 tests everywhere

* Update test_modeling_common.py

* remove what was remaining about fast_init

* fix logic and simplify

* mismatched keys logic update

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix 2 models init_weights

* extend to others

* remove grad

* Update modeling_fsmt.py

* init weights in tests

* style

* Update test_modeling_fsmt.py

* more old models

* fix more init_weights

* copies

* fix

* style

* Update modeling_lxmert.py

* fix inits

* more and more

* more

* should finalize

* style

* Update modeling_dinov2_with_registers.py

* fix

* Update modeling_encoder_decoder.py

* fix

* style

* Update modeling_lxmert.py

* post rebase cleanup

* Update modeling_informer.py

* back to start for device

* fix

* add test to detect all failing cases correctly

* Update test_modeling_common.py

* fix

* fix

* sam

* style

* Update modeling_maskformer_swin.py

* CIs

* CIs

* remove test - will add it on separate PR

* fix

* fix

* Update modeling_sam.py

* CIs

* CIs

* CIs

* convnext

* suggestions

* CIs

* fix copies after merge

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-31 17:18:43 +02:00
8805600406 [qwen3] fix generation tests (#37142)
* do not skip tests

* fix qwen3-moe as well

* fixup

* fixup
2025-03-31 16:33:41 +02:00
e686fed635 [Feature] Support using FlashAttention2 on Ascend NPU (#36696)
* [Feature] Support using flash-attention on Ascend NPU

* Fix qwen3 and qwen3_moe moduler conversion mismatch
2025-03-31 16:12:58 +02:00
a03cee7a1d skip (#37141)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-31 15:38:40 +02:00
3b07ca78bb Export T5 (encoder-decoder) to ExecuTorch (#36486)
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-03-31 12:10:26 +02:00
475664e2c6 [tests] remove cuda-only test marker in AwqConfigTest (#37032)
* enable on xpu

* add xpu support

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:53:02 +02:00
0710e9b1e8 Create and Expose SamVisionModel as public for better accessibility (#36493)
* move encoder below

* auto modeling

* write SamVisionTester

* fix vision attention shape

* fix SamVisionTest

* minor changes to SamVisionTest

* Revert "fix vision attention shape"

This reverts commit d2a4083ae5704716e33351aed03af8f3cc45f3ae.

* fix attention output shape in new tests

* remove encoder examples

* run modular on got_ocr2

* code formatting

* fix got_ocr2

* ruff fixes

* code quality

* add sam_vision in auto modeling and auto configuration

* remove composite test

* updated index.md

* add TFSamVisionEncoder to __init__

* fix public TFSamVisionEncoder

* remove outdated todo comment

* set test_torch_exportable

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* rename: VisionEncoder -> VisionModel

* bring back original SamVisionEncoder

* rename back: VisionEncoderOutput -> VisionModelOutput

* undo changes in SamModelTester

* reuse SamVisionEncoder in SamVisionModel

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-31 11:45:07 +02:00
f99c279d20 Remove deprecated code (#37059)
* Remove deprecated code

* fix get_loading_attributes

* fix error

* skip test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 11:15:35 +02:00
d1efaf0318 RWKV: fix mask warning typo (#37114)
rwkv: fix mask warning typo
2025-03-31 11:07:51 +02:00
19919689b2 Fix Gemma3 embedding scaling (#37109)
fix gemma3 embedding
2025-03-31 11:04:02 +02:00
d0b65bb479 [MLU] Fix FA2 check error, remove deepspeed-mlu deps. (#36159)
* add Cambricon MLUs support

* fix mlu device rng state

* up for quality check

* up mlu to support fp16

* fix mlu device dependency error

* fix mlu device dependency error

* enable mlu device for bf16

* fix mlu device memory tracker

* Cambricon support SDPA and flash_attn

* MLU devices: checks if `mlu` is available via a `cndev`-based check which won't trigger the drivers and leave mlu

* Fix mlu FA2 check. Remove deepspeed-mlu check. add mlu tests support.

* fix testing errors.

* Merge branch 'hf/main' into main

* fix get_device_count error.

* fix mlu testing utils.

* fix code quality and style.

* switch to @require_torch_multi_accelerator
2025-03-31 11:02:49 +02:00
ad63d20dff fix whisper re-compile (#36712)
* fix whisper re-compile

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix copy

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix copies

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert useless changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:01:51 +02:00
286393fbb1 enable tp on CPU (#36299)
* enable tp on CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* get rank from cpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable TP tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm print

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix model id

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix conflict

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix index and add doc

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-03-31 10:55:47 +02:00
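A hedged sketch of the tensor-parallel entry point this commit extends to CPU backends; `tp_plan` is the documented knob, the checkpoint id is a placeholder, and a process group must exist (e.g. launch with `torchrun`):

```python
from transformers import AutoModelForCausalLM

# Run with e.g.: torchrun --nproc-per-node 4 this_script.py
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # placeholder checkpoint
    tp_plan="auto",             # shard weights across the process group
)
```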
4705b04c74 Fix 4090/ada not detected as having FP8 support (#37067)
fix 4090/ada not detected as having FP8 support

Signed-off-by: Qubitium <qubitium@modelcloud.ai>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 10:53:48 +02:00
2b4734bd49 Support passing flash_attn_kwargs when gradient_checkpointing is enabled (#37037)
* support passing flash_attn_kwargs when gradient_checkpointing is enabled

* make modeling_deepspeek_v3.py consistent with modular_deepseek_v3.py
2025-03-31 10:53:02 +02:00
bd41b9c1ac Gaudi: Fix the pipeline failure with hpu device (#36990)
* Gaudi: fix the issue of is_torch_hpu_available() returning false

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Fix make fixup

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add comments for the implicit behavior of import

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/utils/import_utils.py

* Update src/transformers/utils/import_utils.py

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-03-31 10:23:47 +02:00
6acd5aecb3 Adding Qwen3 and Qwen3MoE (#36878)
* Initial commit for Qwen3

* fix and add tests for qwen3 & qwen3_moe

* rename models for tests.

* fix

* fix

* fix and add docs.

* fix model name in docs.

* simplify modular and fix configuration issues

* Fix the red CI: ruff was updated

* revert ruff, version was wrong

* fix qwen3moe.

* fix

* make sure MOE can load

* fix copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-03-31 09:50:49 +02:00
0d6a60fe55 🌐 [i18n-KO] Translated qwen2_vl.md to Korean (#36750)
* fix: manual edits

* fix: resolve suggestions

* Update toctree.yml
2025-03-30 15:00:27 -07:00
b7fc2daf8b Kenlm (#37091)
* kenlm

* kenlm

* kenlm

* kenlm

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 21:42:54 +01:00
bab605dd04 [Cache] rename dtype attribute 🚨 🚨 (#37044)
* yoink

* same pattern in all cache
2025-03-28 19:08:02 +01:00
9fd9476005 [generate] beam search -- fix output cropping (#37080)
* handle jagged beams

* better comment

* bart -- beam search tests print special tokens

* more bart test updates

* more tests!

* better comment
2025-03-28 18:57:51 +01:00
257bc670fb fixed typo. (#37057)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-28 17:12:14 +00:00
2bea6bf24e Fix AttentionInterface following feedback (#37010)
* up

* typo

* update doc

* Update attention_interface.md
2025-03-28 18:00:35 +01:00
a86dad56bc Fix state_dict map location when quantized (#37086)
* Update modeling_utils.py

* Update modeling_utils.py
2025-03-28 17:57:16 +01:00
d6064754ea Update w/ new account (#37084)
* Update w/ new account

* DS
2025-03-28 12:43:00 -04:00
581cf96e0c fix tied weigths issue (#37031)
* fix

* comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 16:36:44 +01:00
eca74d1367 [WIP] add deepseek-v3 (#35926)
* init commit

* style

* take comments into account

* add deepseekv3 modeling

* remove redundant code

* apply make style

* apply fix-copies

* make format

* add init files

* rename deepseekv3 into deepseek_v3 based on its model_type

* rename deepseekv3 into deepseek_v3 based on its model_type

* deepseek-v3 not deepseek_v3

* set model_type as deepseek_v3

* use default docs

* apply make

* fill type and docstring

* add rope_config_validation

* use custom DeepseekV3MLP

* hold code only for checkpoints configuration; remove redundant

* revise rope yarn for DeepSeek variation

* rename DeepSeek-V3

* some refactoring

* revise load_hook to work properly; make moe func trainable; use llama instead of mixtral

* fix attention forward

* use -1 for the unchanged dim when using expand

* refactor DeepseekV3TopkRouter

* use reshape_for_rope instead of load_hook; revise attention forward for TP; rename q_head_dim to qk_head_dim

* register pre_hook and hook both

* make style

* use n_shared_experts

* Update src/transformers/models/deepseek_v3/configuration_deepseek_v3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add test file

* update modeling_file according to modular file

* make style

* add mapping for DeepseekV3ForSequenceClassification

* remove aux_loss_alpha

* add deepseek_v3 for perf

* add deepseek_v3

* rename test as deepseekv3

* use tiny-deepseek-v3

* remove DeepseekV3ForSequenceClassification

* cache before padding

* remote output_router_logits

* Revert "remote output_router_logits"

This reverts commit f264f800d04950390db8413b9efb24cef8186330.

* remove output_router_logits

* make e_score_correction_bias as buffer

* skip tests not compatible

* make style

* make e_score_correction_bias as buffer

* use rope_interleave instead of load_hook

* skip tests not compatible with MLA

* add doc for rope_interleave

* fix typo

* remove torch.no_grad for selecting topk

* fix post merge issue

* merge with main and simplify

* nits

* final

* small fixes

* fix

* support TP better

* stash

* changes currently requires

* remove synch

* more fixes for TP

* temp fix for TP : some attention layers's FP8 scales are too small + shared is local colwise and anything is local if FP8 because weights are used

* updates to have generation work!

* push most of the changes

* reorder functions + call for contributions!

* update readme

* nits

* update

* ruff was updated on main

* merge with main and fix copies

* revert unrelated changes

* route all tokens to all experts when testing to avoid no-gradient issues

* finish fixing all tests

* fixup

* nit

* clean config

* last readme changes

* nit

* do nit

* typo

* last nit

* one more one more

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-165-131.ec2.internal>
2025-03-28 15:56:59 +01:00
52cc204dd7 [blip-2] Fix dtype mismatch when keep in fp32 (#37068)
* fix fp32 BLIP2

* no need to reorder that

* check for `Noneness` as well before casting dtype
2025-03-28 15:52:11 +01:00
aa3778afc2 Change deprecated PT functions (#37041)
Change deprecated functions
2025-03-28 14:26:22 +00:00
c90e6e9625 Fix some typos about benchmark scripts. (#37027)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-28 14:10:20 +00:00
1fcaad6df9 Use lru_cache for tokenization tests (#36818)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 15:09:35 +01:00
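A sketch of the test speed-up above: memoize tokenizer construction with `functools.lru_cache` so repeated test methods share one instance instead of re-reading files each time.

```python
import functools
from transformers import AutoTokenizer

@functools.lru_cache(maxsize=None)
def cached_tokenizer(name: str):
    # Construct once per checkpoint name; later calls reuse the instance.
    return AutoTokenizer.from_pretrained(name)

tok_a = cached_tokenizer("gpt2")
tok_b = cached_tokenizer("gpt2")
assert tok_a is tok_b
```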
3af425d4c6 fix: AttributeError: 'LlavaProcessor' object has no attribute 'image_token_id' (#37026)
* Add image_token_id and video_token_id handling in Llava processors

* fix: image to video

* fix: correct image and video token ID handling in Llava processors

* fix: improve image and video token ID handling in Llava processors
2025-03-28 10:46:24 +01:00
064cd7cdac Fix SDPA implementation in Qwen2-VL (issues with torch==2.6.0) (#36891)
* fix sdpa implementation

* ruff

* also modify 2_5 for consistency
2025-03-28 09:54:21 +01:00
348f3285c5 fix: Fully remove legacy cache from Llama (#36958)
* bug: fully remove legacy cache from Llama

* bug: fix CI issues

* bug: update jetmoe model

* bug: apply =check_modular_conversion.py= fix

* bug: apply make fix-copies

* bug: fix ruff

* PR suggestions

* Remove trailing commas in auto-gen files

* Trivial new line removal
2025-03-27 17:22:44 +00:00
d6b3c7486b fixed typo (#37036) 2025-03-27 15:37:53 +00:00
6cc9c8d7d1 Remove deprecated batch_size parameter (#37007) 2025-03-27 15:01:56 +00:00
4cc65e990f Replace default split function with jnp.split() in flax models (#37001)
Replace split with jnp's split function for flax models (#36854)
2025-03-27 14:59:57 +00:00
41a0e58e5b Set weights_only in torch.load (#36991) 2025-03-27 14:55:50 +00:00
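The hardening this commit applies, in one line: restrict `torch.load` to tensor deserialization so an untrusted checkpoint cannot execute arbitrary pickled code. The file path here is a placeholder.

```python
import torch

state_dict = torch.load("model.bin", map_location="cpu", weights_only=True)
```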
de77f5b1ec Fix typing for None valued variables (#37004)
Fix typing for None-able variables
2025-03-27 14:46:32 +00:00
8c5e29bad5 Avoid unnecessary device operations in loss computing (#36950)
* Avoid unnecessary tensor copy in loss computing

* Add type
2025-03-27 14:45:14 +00:00
471cf1de63 clean pipeline question_answering. (#36986)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-27 14:35:33 +00:00
29f322d04d [generate, cache] handle more complex device maps (#37014) 2025-03-27 14:33:20 +00:00
fb8e6c50e4 [audio utils] fix fft_bin_width computation (#36603)
* fix fft_bin_width computation

* update docstring + enforce correct params

* update test with correct value

* update test

* update feature extractors for concerned models

* update

* make

* update docstring

* update docstring
2025-03-27 15:20:02 +01:00
e97c760006 [chat templates] support loading audio from video (#36955)
* add audio from video

* typos

* delete print

* comments
2025-03-27 14:46:11 +01:00
c7bc79bd2a Fixup for distill_any_depth conversion script (#37043)
* Fixup

* trigger
2025-03-27 13:29:25 +00:00
d1eafe8d4e Optimize to_py_obj for python-native numeric lists and scalars (#36885)
* Optimize to_py_obj for python-native numeric lists and scalars

* Fix bug that tuple is not converted to list

* Try np.array for more robust type checking

* Apply review and add tests for to_py_obj
2025-03-27 14:16:46 +01:00
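A rough sketch of the optimization described above, not the exact implementation: python-native scalars and flat numeric lists bypass the numpy round-trip, tuples are converted to lists, and `np.array` remains the robust fallback for everything else.

```python
import numpy as np


def to_py_obj_sketch(obj):
    # Fast path: native scalars need no conversion at all.
    if isinstance(obj, (bool, int, float)):
        return obj
    # Fast path: flat lists/tuples of native scalars; tuples become lists.
    if isinstance(obj, (list, tuple)) and all(isinstance(x, (bool, int, float)) for x in obj):
        return list(obj)
    # Fallback: numpy handles tensors, arrays, and nested structures robustly.
    return np.array(obj).tolist()
```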
0e56fb69a2 fix pegasus init weights and other copied models (#36844)
* fix pegasus init weights

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix the rest of models

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix informer init

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* init weight before checking

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix roformer tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix roformer tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-03-27 14:14:30 +01:00
7e813f9cf0 Add Distill Any Depth (#36614)
* Added conversion Script

* Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Updated Conversion Script

* Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-27 13:10:03 +00:00
92429057d9 Skip FP8 linear tests for device capability < 9.0 (#37008)
* skip fp8 linear

* add capability check

* format
2025-03-27 12:38:37 +01:00
279c2e302a remove redundant code in trainer (#36994)
* Update optimization.py

* Update optimization.py
2025-03-27 11:35:15 +01:00
d13c390d01 Mark 2 tests as flaky for now (#37038)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-27 10:59:47 +01:00
d6d930a64b [Modeling] Load FP8 safetensors such as DeepSeek (#36828)
support loading fp8

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-03-27 10:47:10 +01:00
927ce1d39f Fix PixtralProcessor patch_size when spatial_merge_size is used (#37019) 2025-03-27 10:46:23 +01:00
49b5ab6a27 Support QuestionAnswering Module for ModernBert based models. (#35566)
* push ModernBertForQuestionAnswering

* update ModernBertForQuestionAnswering

* update __init__ loading

* set imports for ModernBertForQuestionAnswering

* update ModernBertForQuestionAnswering

* remove debugging logs

* update init_weights method

* remove custom initialization for ModernBertForQuestionAnswering

* apply make fix-copies

* apply make style

* apply make fix-copies

* append ModernBertForQuestionAnswering to the pipeline supported models

* remove unused file

* remove invalid autoload value

* update en/model_doc/modernbert.md

* apply make fixup command

* make fixup

* Update dummies

* update usage tips for ModernBertForQuestionAnswering

* update usage tips for ModernBertForQuestionAnswering

* add init

* add lint

* add consistency

* update init test

* change text to trigger stuck text

* use self.loss_function instead of custom loss

By @Cyrilvallez

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update modeling_modernbert.py

make comparable commit to even it out

* Match whitespace

* whitespace

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Orion Weller <wellerorion@gmail.com>
Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-03-26 21:24:18 +01:00
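A hedged usage sketch for the new head. The checkpoint shown is the base ModernBERT model and is assumed here for illustration only; a QA fine-tune is needed before the predicted spans are meaningful.

```python
from transformers import AutoTokenizer, ModernBertForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = ModernBertForQuestionAnswering.from_pretrained("answerdotai/ModernBERT-base")

inputs = tokenizer(
    "What does the head predict?",  # question
    "The QA head predicts start and end logits over the context tokens.",  # context
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.start_logits.shape, outputs.end_logits.shape)
```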
5b08db8844 fix transformers_cli import relative path issue (#36989)
* fix transformers_cli relative import path issue

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-26 18:45:56 +00:00
3a8ec8c467 [docs] Attention mask image (#36970)
add image
2025-03-26 10:11:34 -07:00
2b550c47b2 Remove deprecated training arguments (#36946)
* Remove deprecated training arguments

* More fixes

* More fixes

* More fixes
2025-03-26 16:44:48 +00:00
44715225e3 fix typos in the code comments and error messages (#36993)
* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments
2025-03-26 16:09:48 +00:00
79d6f9fd70 Log the correct learning rate (#36973)
* fix learning rate log

* fix lr log

* add lr
2025-03-26 16:52:00 +01:00
13d36e89fe Fix device_map check for ggml files (#37003)
fix
2025-03-26 16:24:57 +01:00
021006e1b0 Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support. (#36975)
* Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support.

Related to https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1573 and https://github.com/huggingface/transformers/issues/36949 , this resolves a bug in allowing ROCm/HIP support in bitsandbytes.

* Related to bitsandbytes-foundation/bitsandbytes#1573 and huggingface#36949, this resolves a bug in the bitsandbytes integration, allowing ROCm/HIP support in bitsandbytes.

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-26 16:18:08 +01:00
788e1092e9 Allow easy registration of custom attention functions (#36889)
* Update modeling_utils.py

* style

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* add to init

* Update modeling_utils.py

* style

* update

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Add some doc

* Update _toctree.yml

* readd it for tgi/vllm compat

* CIs

* CIs
2025-03-26 16:15:06 +01:00
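A sketch of what registering a custom attention function could look like after the change above; the `AttentionInterface` name and the callback signature are assumptions to verify against the merged docs, and the toy implementation skips details real backends handle (grouped-query heads, dropout, output layout).

```python
import torch

from transformers import AttentionInterface, AutoModelForCausalLM


def my_eager_attention(module, query, key, value, attention_mask, **kwargs):
    # Toy scaled dot-product attention with an optional additive mask.
    scores = query @ key.transpose(-2, -1) / (query.size(-1) ** 0.5)
    if attention_mask is not None:
        scores = scores + attention_mask
    probs = torch.softmax(scores, dim=-1)
    # Real implementations also transpose the output back to
    # (batch, seq, heads, dim); omitted here for brevity.
    return probs @ value, probs


AttentionInterface.register("my_eager", my_eager_attention)
model = AutoModelForCausalLM.from_pretrained("gpt2", attn_implementation="my_eager")
```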
ad5d40de9c Fix get_device_properties (#36997)
Fix: remove remnant self from get_device_properties

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-26 15:46:34 +01:00
8084b26294 Fix Optional type annotation (#36841)
* Fix annotation

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-26 13:53:44 +00:00
b56d8f07e4 Install networkx==3.2.1 manually in some CircleCI jobs after #36957 (#37000)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 14:49:09 +01:00
78afa1c537 Use torch.expm1 (#36995) 2025-03-26 13:06:33 +00:00
181d453069 byebye CircleCI TF jobs (#36998)
* byebye tf jobs

* byebye tf jobs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 12:49:50 +01:00
e7139d06f5 Fix tensor dtype mismatch (#36985)
* Fix tensor dtype mismatch

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 10:37:46 +01:00
be37d34f44 🚨Deprecate legacy argument for image-text-to-text models and adopt new behavior by default (#36307)
* deprecate legacy argument and adopt new behavior by default

* revert back modification git
2025-03-25 17:32:17 -04:00
ab4656f6b7 update bot comment again (#36974)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 19:42:09 +01:00
ba531278ca Add ruff target-version (#36971) 2025-03-25 19:41:25 +01:00
a844297088 [docs] Fix image link (#36869)
* fix image link

* fix

* update

* fix
2025-03-25 11:34:21 -07:00
d68a91aebf Remove extra tensor clone in PyTorch code (#36748)
* Use detach().clone()

* Eliminate contiguous()

* Merge clone and other calls with to

* Merge clone and other calls with to
2025-03-25 17:42:15 +00:00
121830ab47 update examples after ruff being updated (#36972)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 18:15:47 +01:00
a41677a68b Updated docker files to use uv for installing packages (#36957)
* Updated docker files to use uv pip install as uv is blazingly fast.

* Removed -y flag for uv pip uninstall.

* Passed --no-build-isolation flag

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 18:12:51 +01:00
3dce98a437 typo fixed in README_fr.md (#36951) 2025-03-25 09:29:36 -07:00
ebd2029483 Change GPUS to GPUs (#36945)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 17:25:39 +01:00
69632aadb7 Update after #36962 (#36965)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 16:16:06 +01:00
c6814b4ee8 Update ruff to 0.11.2 (#36962)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 16:00:11 +01:00
bc1c90a755 [Utils] torch version checks optionally accept dev versions (#36847) 2025-03-25 10:58:58 +00:00
80b4c5dcc9 Fix cuda index issue in cache allocator (#36937)
fix
2025-03-25 11:51:41 +01:00
0f733110a6 Support return_tensors in audio chat templates (#34601)
* add audio chat templates

* update

* update

* nit

* green ci

* we dont care about the order anymore

* clean up after rebase

* overridden tests rename

* rename shieldgemma also

* one more rename

* require_read_token

* remove images/videos

* retrigger CI flaky
2025-03-25 11:08:47 +01:00
19085c28da fix typos in the tests directory (#36932)
* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: format codes
2025-03-25 10:49:24 +01:00
69bcb86c58 Export for Phi4-mini (#36780)
* Export for Phi4-mini

* Update tests/models/phi3/test_modeling_phi3.py

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 10:46:38 +01:00
be2c0e7bff Fixing _pre_quantization_dtype when torch_dtype is None (#36930)
fix
2025-03-25 10:43:27 +01:00
4303d88c09 Add Phi4 multimodal (#36939)
* raw start

* update

* update

* add to imports

* update

* up

* simplify configs

* clean configs

* style

* typos

* Update convert_phi4_multimodal_weights_to_hf.py

* Update convert_phi4_multimodal_weights_to_hf.py

* fix

* up

* up

* up

* Update convert_phi4_multimodal_weights_to_hf.py

* Update convert_phi4_multimodal_weights_to_hf.py

* up

* up

* up

* Update feature_extraction_phi4_multimodal.py

* up

* up

* up

* up

* up

* simplify configs

* typo

* cut code

* typo

* typo

* typo

* re

* typo

* up

* up

* up

* add tests

* fix

* fix

* Update test_modeling_phi4_multimodal.py

* up

* Update test_modeling_phi4_multimodal.py

* doc

* fix

* up

* up

* up

* up

* up

* up

* simplify

* up

* simplify

* config docstrings

* cleanup

* clean

* typo

* typo

* fix

* Update phi4_multimodal.md

* fix

* fix

* Update test_modeling_phi4_multimodal.py

* update

* simplify reshapes and permutes

* up

* simplify special tokens

* simplify processor a lot

* Update processing_phi4_multimodal.py

* Update processing_phi4_multimodal.py

* switch to fast processor

* image processor

* Update image_processing_phi4_multimodal_fast.py

* add lora extraction to converter

* Update convert_phi4_multimodal_weights_to_hf.py

* Update __init__.py

* add AudioInput type in audio_utils

* rewrite feature_extraction: support torch batched FFT

* input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values

* test update

* not mono channel warning update

* remove auto maps from processor

* kargs dispatch in processor

* simplify kwargs dispatch

* simplify merging

* remove default sampling rate

* style

* Update test_modeling_phi4_multimodal.py

* update doc

* doc

* torch only feature extractor

* make fake tokens adjustable

* Update feature_extraction_phi4_multimodal.py

* fix

* Update processing_phi4_multimodal.py

* simplify mask

* last touch

* fix copies

* style

* Update audio_utils.py

* style

* Update feature_extraction_phi4_multimodal.py

* Update __init__.py

* docstrings

* copies

* fix all checks

* back to fix-copies

* trigger CIs

* Update feature_extraction_phi4_multimodal.py

* improve tests with multimodal inputs

* trigger CIs

---------

Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-03-25 09:55:21 +01:00
47e5432805 Deprecate #36741 and map Causal to Conditional (#36917)
* deprecate the prev fix

* reword warning and update docs

* reword warning

* tests

* dont bloat `get_text_config()`
2025-03-25 09:13:56 +01:00
2b8a15cc3f Disallow Offload to disk for gguf files (#36933)
update

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-24 19:30:01 +01:00
91455c1825 Fix processor kwargs qwen2 vl (#36890)
* Fix qwen2_vl and qwen2_5_vl processors custom images kwargs

* change version warning
2025-03-24 13:19:26 -04:00
48385aa4f4 Added support for seed in DataCollatorForWholeWordMask (#36903)
* Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests.

Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user.

* formatting issues

* Used better way to generate seed in TF. Made tests more consistent.
2025-03-24 16:57:17 +00:00
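A hedged usage sketch: `seed` is the new argument, and the two replacement probabilities are the previously hardcoded values the commit says are now honored (exact argument names should be checked against the merged signature).

```python
from transformers import AutoTokenizer, DataCollatorForWholeWordMask

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForWholeWordMask(
    tokenizer=tokenizer,
    mlm_probability=0.15,
    mask_replace_prob=0.8,    # previously hardcoded, now user-controlled
    random_replace_prob=0.1,  # previously hardcoded, now user-controlled
    seed=42,                  # new: makes the whole-word masking reproducible
)
batch = collator([tokenizer("whole word masking demo")])
```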
5932606d8e More precise comment (#36935)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-24 17:03:09 +01:00
2be2984462 Fix pytorch deform attn path (#36923)
* Fix pytorch path for DeformableAttention

* Apply for GroundingDino
2025-03-24 15:58:51 +00:00
00d077267a [2/N] Use pyupgrade --py39-plus to improve code (#36857)
Use pyupgrade --py39-plus to improve code
2025-03-24 15:42:25 +00:00
a6ecb54159 Update trainer_pt_utils.py docstrings for consistency (#36912)
* Update trainer_pt_utils.py

* update docstrings trainer_pt_utils.py for consistency

* Update src/transformers/trainer_pt_utils.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-03-24 14:46:41 +00:00
cbf924b76c Fix typos (#36910)
* fix typos

* fix typos

* fix typos

* fix typos
2025-03-24 14:08:29 +00:00
340500b1a9 Use another repo. for Mistral3 processor testing (#36925)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-24 14:36:05 +01:00
9e125d9a2e Fix Compressed tensors to_dict_diff (#36922)
fix
2025-03-24 13:06:33 +01:00
57f551c78d [chameleon] fix num image token check (#36918)
* [chameleon] fix num image token check

* embed after merging image token

* skip this also

* mistral require_read_token
2025-03-24 12:36:08 +01:00
a41e08aa19 tests: fix asyncio.wait() usage for python>=3.11 (#36898)
tests: fix asyncio.wait() usage for python>=3.7

Passing coroutines directly to `asyncio.wait()` is deprecated since
python 3.8 and removed starting from python 3.11. Instead, it's required
to explicitly wrap each coroutine in a task with `asyncio.create_task()`,
which first appeared in python 3.7.

We run into this issue with the following Transformers tests on a
system with python 3.11 or later (for example, Ubuntu 24.04 has python 3.12):

* `tests/trainer/test_trainer_distributed.py`
* `tests/extended/test_trainer_ext.py`

The error will be:
```
src/transformers/testing_utils.py:2380: in execute_subprocess_async
    result = loop.run_until_complete(
/usr/lib/python3.12/asyncio/base_events.py:687: in run_until_complete
    return future.result()
src/transformers/testing_utils.py:2368: in _stream_subprocess
    await asyncio.wait(
...
E           TypeError: Passing coroutines is forbidden, use tasks explicitly.

```

See: https://docs.python.org/3.10/library/asyncio-task.html#asyncio.wait
See: https://docs.python.org/3.7/library/asyncio-task.html#asyncio.create_task

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-24 11:53:59 +01:00
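The fix pattern, distilled from the message above (`read_stream` and `p` are illustrative stand-ins for the helpers in `testing_utils`):

```python
import asyncio


async def read_stream(stream):
    ...  # illustrative stand-in


async def stream_subprocess(p):
    # Before (TypeError on Python >= 3.11): coroutines passed directly.
    #   await asyncio.wait([read_stream(p.stdout), read_stream(p.stderr)])
    # After: wrap each coroutine in a task explicitly; create_task() has
    # existed since Python 3.7, so this is safe on all supported versions.
    await asyncio.wait(
        [
            asyncio.create_task(read_stream(p.stdout)),
            asyncio.create_task(read_stream(p.stderr)),
        ]
    )
```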
e28be7a692 [Fix] Add original_max_position_embeddings to YARN rope_scaling optional keys (#36877)
[fix] Update optional keys in _validate_yarn_parameters to include original_max_position_embeddings
2025-03-24 11:05:19 +01:00
48da44be24 Fix torch version guard at import (#36907)
fix
2025-03-24 10:33:33 +01:00
fe4ca2f4a7 fix Gemma3 Config (#36893)
* fix Gemma3 Config

* fix config in modular gemm3
2025-03-24 10:05:44 +01:00
c9d1e5238a Update installation.md (#36826)
* Update installation.md

* Update README.md
2025-03-21 16:32:02 -07:00
d253de6d58 [docs] Model docs (#36469)
* initial

* fix

* fix

* update

* fix

* fixes

* quantization

* attention mask visualizer

* multimodal

* small changes

* fix code samples
2025-03-21 15:35:22 -07:00
beb9b5b022 Fix Pan and Scan on batched images Gemma3 (#36864)
* process flattened images in fast image proc

* process flattened images in low proc and add tests

* remove print

* add unbalanced batch test for PaS image proc

* fix integration tests
2025-03-21 13:56:00 -04:00
dd3933dd65 Simplify keep_in_fp32_modules logic (#36722)
* better regex everywhere

* fix

* Update test_modeling_instructblip.py

* BC with explanations this time otherwise it makes no sense at all

* Update test_modeling_instructblip.py

* style

* CIs

* update _keep_in_fp32_modules in blip2

* Update modeling_utils.py

* Update modeling_utils.py

* style

* CIs

* add check

* trigger CIs

* Update modeling_utils.py

* trigger CIs
2025-03-21 16:12:59 +01:00
90e2df5d55 fix: loss computation after embeddings resize - mllama (#36840)
* move loss to generation class

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* code cleanup

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* test for resize and loss computation

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix tests

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix:test for resize and loss

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix resize embedding mllama test

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* review changes

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
2025-03-21 14:47:59 +01:00
4542b8fb27 push v4.51.0.dev0 2025-03-21 13:45:25 +01:00
523f6e743c Fix: dtype cannot be str (#36262)
* fix

* this wasn't supposed to be here, revert

* refine tests a bit more
2025-03-21 13:27:47 +01:00
3f9ff19b4e Minor Gemma 3 fixes (#36884)
fix attention mask dtype + outputs type
2025-03-21 13:15:22 +01:00
f94b0c59f2 Use deformable_detr kernel from the Hub (#36853)
* Use `deformable_detr` kernel from the Hub

Remove the `deformable_detr` kernel from `kernels/` and use the
pre-built kernel from the Hub instead.

* Add license header

* Add `kernels` as an extra `hub-kernels`

Also add it to `testing`, so that the kernel replacement gets tested
when using CUDA in CI.
2025-03-21 13:08:47 +01:00
2638d54e78 Gemma 3 tests expect greedy decoding (#36882)
tests expect greedy decoding
2025-03-21 12:36:39 +01:00
b8aadc31d5 🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing (#36859)
* supersede paligemma forward to shift pos id indexing

* fix prepare_inputs_ as well

* fix modular error

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-03-21 12:36:27 +01:00
6321876b5b add eustlb as an actor 2025-03-21 12:32:12 +01:00
94f487626a [generate] model defaults being inherited only happens for newer models (#36881) 2025-03-21 11:01:09 +00:00
f19d018bff Revert "Update deprecated Jax calls (#35919)" (#36880)
* Revert "Update deprecated Jax calls (#35919)"

This reverts commit f0d5b2ff04e1354d32beac70984adcc8100352a0.

* Revert "Update deprecated Jax calls (#35919)"

This reverts commit f0d5b2ff04e1354d32beac70984adcc8100352a0.

* update
2025-03-21 11:01:44 +01:00
62116c967f Make ViTPooler configurable (#36517)
* Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output

* Add documentation and allow functions as activations (instead of just string)

* formatting change

* Use ACT2FN

* Formatting change

* Formatting changes

* force pooler_act to be string

* force pooler_act to be string

* Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy

* Making the same change in ijepa to make check_modular_conversion happy

* Add IJepaConfig to make CI happy

* rename pooler_size to pooler_output_size as defined in the config

* typo

* revert change to ignore variable

* Ran utils/check_docstrings.py --fix_and_overwrite

* revert unrelated change

* remove redundant defaults

* rename self.act -> self.activation

* tanh activation function in mapping
2025-03-21 11:01:07 +01:00
26c83490d2 chore: fix typos in the tests directory (#36813)
* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* fix: format codes

* chore: fix copy mismatch issue

* fix: format codes

* chore: fix copy mismatch issue

* chore: fix copy mismatch issue

* chore: fix copy mismatch issue

* chore: restore previous words

* chore: revert unexpected changes
2025-03-21 10:20:05 +01:00
0adbc873d0 Remove call to .item in get_batch_samples (#36861) 2025-03-21 10:14:26 +01:00
6bb8565f0c FIX FSDP plugin update for QLoRA (#36720)
The _fsdp_qlora_plugin_updates checks for LoraConfig but other PEFT
methods can also support quantized models, e.g. VeRA. Therefore, the
isinstance check is now looking for PeftConfig in general.

Moreover, the fsdp_plugin variable may be undefined in the 2nd if
condition, leading to an `UnboundLocalError` error. This is fixed by not
assigning the variable at all.

I checked for tests that may need updating but only found
test_fsdp_config_transformers_auto_wrap associated with this change.
AFAICT, this test does not cover the changed code, since the test does
not start the training loop. Therefore, I haven't updated any tests. LMK
if/how this fix should be tested.

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-21 10:11:47 +01:00
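A schematic of the two changes described above; the surrounding function and attribute paths are simplified for illustration, not the actual Trainer code.

```python
from peft import PeftConfig  # was: a check against LoraConfig only


def _fsdp_qlora_plugin_updates(trainer):
    # 1) Accept any PEFT method that supports quantized models (e.g. VeRA),
    #    not just LoRA, by checking for the PeftConfig base class.
    if isinstance(getattr(trainer.model, "active_peft_config", None), PeftConfig):
        # 2) Read the plugin inline instead of assigning it up front, so the
        #    name can never be referenced while unbound (the UnboundLocalError).
        if getattr(trainer.accelerator.state, "fsdp_plugin", None) is not None:
            ...  # apply the QLoRA-specific FSDP plugin updates
```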
949cca4061 [CI] doc builder without custom image (#36862)
* no image

* test

* revert jax version updates

* make fixup

* update autodoc path for model_addition_debugger

* shieldgemma2

* add missing pages to toctree
2025-03-21 09:10:27 +00:00
97d2f9d8ae Mllama: raise better error (#35934)
* fix mllama

* update test

* fix test
2025-03-21 09:35:37 +01:00
6a2627918d Refactor Aya Vision with modular (#36688)
* refactor aya_vision with modular (incorrect docstring)

* Fix docstrings

* Fix other modulars

* fix docstring

* revert changes

* add tie_weights and resize_token_embeddings
2025-03-20 15:34:56 -04:00
9e771bf402 Add support for seed in DataCollatorForLanguageModeling (#36497)
Add support for `seed` in `DataCollatorForLanguageModeling`. Also wrote tests for verifying behaviour.
2025-03-20 18:27:43 +00:00
ecd60d01c3 [CI] fix update metadata job (#36850)
fix update_metadata job
2025-03-20 17:17:36 +00:00
42c489f2ae Gemma3: fix test (#36820)
* fix test

* require_read_token and public repo ids

* flash-attn test uncomment

* fix torchscript
2025-03-20 18:14:53 +01:00
068b663f90 [torchao] revert to get_apply_tensor_subclass (#36849)
* revert to old name

* empty commit

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-20 18:00:13 +01:00
1d3f35f30a Add model visual debugger (#36798)
* draft of model tracer visualiser

* add context manager in addition to decorator

* add debug utils to init

* move model debugging utils to dedicated file

* add documentation

* protect some imports

* format

* move and protect imports

* format

* doc: improve errors in case of broken dummy imports.

* format

* use automatic torch backend

* update doc

* fix backend

* (TEMP) move to dummies while backend wait

* update documentation

* doc
2025-03-20 17:37:29 +01:00
6515c25953 Add Prompt Depth Anything Model (#35401)
* add prompt depth anything model by modular transformer

* add prompt depth anything docs and imports

* update code style according transformers doc

* update code style: import order issue is fixed by custom_init_isort

* fix depth shape from B,1,H,W to B,H,W, which is the same as Depth Anything

* move prompt depth anything to vision models in _toctree.yml

* update backbone test; there is no need for resnet18 backbone test

* update init file & pass RUN_SLOW tests

* update len(prompt_depth) to prompt_depth.shape[0]

Co-authored-by: Joshua Lochner <admin@xenova.com>

* fix torch_int/model_doc

* fix typo

* update PromptDepthAnythingImageProcessor

* fix typo

* fix typo for prompt depth anything doc

* update promptda overview image link of huggingface repo

* fix some typos in promptda doc

* Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality.

* add copy disclaimer for prompt depth anything image processing

* fix some format typos in image processing and conversion scripts

* fix nn.ReLU(False) to nn.ReLU()

* rename residual layer as it's a sequential layer

* move size compute to a separate line/variable for easier debug in modular prompt depth anything

* fix modular format for prompt depth anything

* update modular prompt depth anything

* fix scale to meter and some internal funcs warp

* fix code style in image_processing_prompt_depth_anything.py

* fix issues in image_processing_prompt_depth_anything.py

* fix issues in image_processing_prompt_depth_anything.py

* fix issues in prompt depth anything

* update conversion script, similar to mllama

* update testing for modeling prompt depth anything

* update testing for image_processing_prompt_depth_anything

* fix assertion in image_processing_prompt_depth_anything

* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/prompt_depth_anything.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/prompt_depth_anything.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* update some testing

* fix testing

* fix

* add return doc for forward of prompt depth anything

* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix prompt depth order

* fix format for testing prompt depth anything

* fix minor issues in prompt depth anything doc

* fix format for modular prompt depth anything

* revert format for modular prompt depth anything

* revert format for modular prompt depth anything

* update format for modular prompt depth anything

* fix parallel testing errors

* fix doc for prompt depth anything

* Add header

* Fix imports

* Licence header

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-20 16:12:44 +00:00
66291778dd Refactor Attention implementation for ViT-based models (#36545)
* Refactor vit attention

* Refactor ViT-based models

* 🚨🚨🚨 Fix prefix for DPT

* Update params order

* trigger tests

* Fix Dinov2 attention

* Fix DPT attention impl propagation for backbone config

* Common test fix: config is modif. inplace - avoid it

* view->reshape

* Fixup

* Fixup

* Enable IJepa FA2

* Add FA2 in corresponding model docs
2025-03-20 15:15:01 +00:00
730d2a52e7 DeepSpeed tensor parallel+ZeRO (#36825)
add ds tp change
2025-03-20 16:12:01 +01:00
1a374799ce Support loading Quark quantized models in Transformers (#36372)
* add quark quantizer

* add quark doc

* clean up doc

* fix tests

* make style

* more style fixes

* cleanup imports

* cleaning

* precise install

* Update docs/source/en/quantization/quark.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/quark_integration/test_quark.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* remove import guard as suggested

* update copyright headers

* add quark to transformers-quantization-latest-gpu Dockerfile

* make tests pass on transformers main + quark==0.7

* add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Bowen Bao <bowenbao@amd.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-20 15:40:51 +01:00
ce091b1bda Use pyupgrade --py39-plus to improve code (#36843) 2025-03-20 14:39:44 +00:00
3e8f0fbf44 Fix hqq skipped modules and dynamic quant (#36821)
* Fix hqq skip_modules and dynamic_quant

* fix skipped modules loading

* add dynamic/skip HqqConfig test
2025-03-20 15:31:49 +01:00
055afdb6bb Fix ONNX export for sequence classification head (#36332)
* set dtype to int32

* fix style
2025-03-20 14:22:48 +00:00
487dab1b2b Shieldgemma2 (#36678)
* single commit

* correct config

* fixup

* dummy pt

* Use ShieldGemma2Config in conversion script

* Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py

* Adding shieldgemma2 to models.__init__.py

* Adding ShieldGemma2 to main __init__.py

* Update shieldgemma2.md

* Update shieldgemma2.md

* Adding tests. Addressing review feedback.

* Minor docs update

* Fixing code quality feedback from CI

* Fixing empty messages bug reported by ghunkins

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Ren Pang <ain-soph@live.com>
2025-03-20 15:14:38 +01:00
a63e92e2f0 Fix: remove the redundant snippet of _whole_word_mask (#36759)
remove the redundant snippet of _whole_word_mask
2025-03-20 14:10:43 +00:00
8124a234ca Gemma 3: Adding explicit GenerationConfig and refactoring conversion … (#36833)
Gemma 3: Adding explicit GenerationConfig and refactoring conversion script
2025-03-20 15:03:32 +01:00
cf8091c017 Fix import for torch 2.0, 2.1 - guard typehint for "device_mesh" (#36768)
* Fix device_mesh

* Remove rebase leftover
2025-03-20 11:55:47 +00:00
388e6659bf Update min safetensors bis (#36823)
* update setup.py

* style
2025-03-20 12:50:07 +01:00
b47d9b2f8a [generate] clarify docstrings: when to inherit GenerationMixin (#36605) 2025-03-20 10:58:54 +00:00
8e97b44087 [modular] Sort modular skips (#36304) 2025-03-20 10:55:12 +00:00
63380b77d4 Pass state dict (#35234)
* Pass state_dict argument to get_peft_model_state_dict

* Style fix

* Change arguments order
2025-03-20 11:54:59 +01:00
957b05b413 [qwen2 audio] remove redundant code and update docs (#36282) 2025-03-20 10:54:51 +00:00
f0d5b2ff04 Update deprecated Jax calls (#35919)
* Remove deprecated arguments for jax.numpy.clip.

* Remove deprecated arguments for jax.numpy.clip.

* Update jax version to 0.4.27 to 0.4.38.

* Avoid use of deprecated xla_bridge.get_backend().platform

Co-authored-by: Jake Vanderplas <jakevdp@google.com>

---------

Co-authored-by: Jake Vanderplas <jakevdp@google.com>
2025-03-20 11:51:51 +01:00
1ddb64937c Fix fp16 ONNX export for RT-DETR and RT-DETRv2 (#36460)
* Fix FP16 ONNX export

* Fix typo

* Sync omdet-turbo

* Refactor encoder for better readability

* Fix _no_split_modules

* Fix int -> torch_int

* Fix rt_detr

* Apply to rt-detr-v2

* Fixup

* Fix copies
2025-03-20 10:43:51 +00:00
e7337ee7be Pass num_items_in_batch directly to loss computation (#36753)
* Pass num_items_in_batch directly to loss computation

* use self loss instead

* fix loss kwrgs

* fix vocab size
2025-03-20 10:35:35 +00:00
8b479e39bb Saving Trainer.collator.tokenizer in when Trainer.processing_class is None (#36552)
* feat: Saving tokenizer in collator when processing_class is None

* chore: Style issue

* chore: Typo

* dbg: Check why test failed

* dbg: Remove logics and another test failed which successed before, so should be the stablibility issue

* test: Init unit-test

* chore: Style

* chore: Add err log

* fix: Case

* Update tests/trainer/test_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* chore: Try to use get_regression_trainer

* fix: Impl and style

* fix: Style

* fix: Case

* fix: Import err

* fix: Missed import

* fix: Import block un-sorted problem

* fix: Try another tokenizer

* fix: Test logic

* chore: Light updates

* chore: Reformat

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-20 11:27:47 +01:00
3f03c379d2 fix tiktoken convert to pass AddedToken to Tokenizer (#36566)
* pass AddedToken to Tokenizer

* ruff

* handle dict for special tokens

* option: test tokenizer from tiktoken same as fast

* ruff

* ruff
2025-03-20 11:26:49 +01:00
8f64b177f6 [ForCausalLMLoss] allow users to pass shifted labels (#36607)
* [ForCausalLMLoss] allow users to pass shifted labels

Signed-off-by: Stas Bekman <stas@stason.org>

* style

Signed-off-by: Stas Bekman <stas@stason.org>

---------

Signed-off-by: Stas Bekman <stas@stason.org>
2025-03-20 11:25:22 +01:00
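A hedged sketch of what passing pre-shifted labels means here: the causal LM loss normally shifts internally so that position t predicts token t+1; this change lets callers perform that shift themselves and hand in the result. The exact keyword accepted by `ForCausalLMLoss` should be confirmed against the merged signature.

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([[10, 11, 12, 13]])

# The internal shift: pad with ignore_index (-100) on the right, then drop
# the first position, so logits[..., t, :] lines up with labels[..., t + 1].
shift_labels = F.pad(labels, (0, 1), value=-100)[..., 1:]

# Hypothetical call shape after this change:
# loss = ForCausalLMLoss(logits, labels=None, vocab_size=V, shift_labels=shift_labels)
```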
94555437e2 Disable inductor config setter by default (#36608)
* Disable inductor config setter by default

This is hard to debug and should be off by default

* remove default settings in autoquant too

* Add info to torchao.md about recommended settings

* satisfying Ruff format

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-20 11:23:14 +01:00
8733297b41 Fix swanlab global step (#36728)
* fix

* global step
2025-03-20 11:13:37 +01:00
b815fae359 Move the warning to the documentation for DataCollatorWithFlattening (#36707)
Remove init warning
2025-03-20 11:09:57 +01:00
9be4728af8 Just import torch AdamW instead (#36177)
* Just import torch AdamW instead

* Update docs too

* Make AdamW undocumented

* make fixup

* Add a basic wrapper class

* Add it back to the docs

* Just remove AdamW entirely

* Remove some AdamW references

* Drop AdamW from the public init

* make fix-copies

* Cleanup some references

* make fixup

* Delete lots of transformers.AdamW references

* Remove extra references to adamw_hf
2025-03-19 18:29:40 +00:00
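For downstream code, the migration is a one-line import change:

```python
import torch
from torch.optim import AdamW  # replaces the removed transformers.AdamW

model = torch.nn.Linear(8, 2)
optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)
```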
51bd0ceb9e Update configuration_qwen2.py (#36735)
* Update configuration_qwen2_moe.py

* Update modeling_qwen2_moe.py

* ruff fmt

* docstring add qkv_bias
2025-03-19 18:15:54 +00:00
107fedc1e2 quick fix fast_image_processor register error (#36716)
* fix fast_image_processor register error

* update error message

* remove redundant import

* fix format
2025-03-19 18:05:45 +00:00
258dd9cc69 Add Space to Bitsandbytes doc (#36834)
* add space

* address review
2025-03-19 18:56:07 +01:00
f39f4960f3 Support tracable dynamicKVcache (#36311)
* Support tracable dynamicKVcache

* Fix lint

* More fine grained test

* Lint

* Update

* Update

* Fix up

* Apply suggestions from code review

* Update src/transformers/cache_utils.py

* Update tests/utils/test_cache_utils.py

* Apply suggestions from code review

* Update

* Change error message

* Rename

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-19 16:52:30 +00:00
63c3116530 One more fix for reviewer assignment (#36829)
* one more fix

* one more fix

* Trigger tests
2025-03-19 16:25:24 +00:00
7c233980f4 [gemma 3] multimodal checkpoints + AutoModelForCausalLM (#36741) 2025-03-19 15:04:19 +00:00
b11050d6a2 enable OffloadedCache on XPU from PyTorch 2.7 (#36654)
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model

* follow Marc's suggestion to use _tie_weights to fix

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* enable OffloadedCache on XPU since PyTorch 2.7

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* don't change bart

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* make code more concise per review comments

Signed-off-by: N <matrix.yao@intel.com>

* fix review comments

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* Revert "fix review comments"

This reverts commit acf1484b86c7cc58b2dee69e7008c0eeb4c97b1b.

* fix review comments

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* fix style

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-19 15:15:52 +01:00
e8d960329e Add option for ao base configs (#36526) 2025-03-19 14:59:47 +01:00
fef8b7f8e9 Add attention visualization tool (#36630)
* add utils file

* style

* nits

* nits

* update

* updates

* update

* fix init issues

* big updates

* nits

* nits?

* small updates

* nits

* there were still some models left

* style

* fixes

* updates

* nits _ fixes

* push changes

* update

* update

* update

* Apply suggestions from code review

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* style

* styling and return a string for testing

* small updates

* always bidirectional for now

* update

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-03-19 13:58:46 +01:00
0fe0bae0a8 [Generation] remove leftover code from end-to-end compilation (#36685) 2025-03-19 11:28:33 +00:00
a861db01e5 Fix Device map for bitsandbytes tests (#36800)
fix
2025-03-19 11:57:13 +01:00
b9374a0763 Remove "dist": "loadfile" for pytest in CircleCI jobs (#36811)
* fasterrrrr

* avoid crash in example jobs

* avoid crash in TF example jobs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-19 11:15:09 +01:00
4fa91b1be5 fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model (#36572)
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model

* follow Marc's suggestion to use _tie_weights to fix

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix review comments.

Signed-off-by: N <matrix.yao@intel.com>

* fix quality

Signed-off-by: N <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: N <matrix.yao@intel.com>
2025-03-19 10:48:47 +01:00
706703bba6 Expectations test utils (#36569)
* Add expectation classes + tests

* Use typing Union instead of |

* Use bits to track score in properties cmp method

* Add exceptions and tests + comments

* Remove compute cap minor as it is not needed currently

* Simplify. Remove Properties class

* Add example Exceptions usage

* Expectations as dict subclass

* Update example Exceptions usage

* Refactor. Improve type name. Document score fn.

* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
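A hedged sketch of the resulting utility, pieced together from the bullets above (a dict subclass keyed by device properties, with a score function picking the best-matching entry); the key layout and method names are assumptions to verify against `testing_utils`.

```python
from transformers.testing_utils import Expectations

# Keys are assumed to be (device_type, major_version); None is a wildcard.
expectations = Expectations(
    {
        ("cuda", 8): 0.9123,
        ("cuda", 7): 0.9120,
        ("rocm", None): 0.9117,
        (None, None): 0.9100,  # default fallback for any other device
    }
)
expected = expectations.get_expectation()  # best match for the current device
```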
179d02ffb8 [generate] vectorized beam search (#35802) 2025-03-18 18:39:36 +00:00
12f2ebef63 Support custom dosctrings in modular (#36726)
* Override docstrings in modular if not none

* Update doc
2025-03-18 14:00:54 -04:00
Gar
00915d3041 Fix chameleon's TypeError because inputs_embeds may None (#36673)
* fix chameleon TypeError when inputs_embeds is None

* reformat

* hotfix
2025-03-18 18:59:30 +01:00
14b597f518 Fix casting dtype for quantization (#36799)
* fix

* remove print
2025-03-18 18:46:03 +01:00
30580f035b Fix Mistral3 tests (#36797)
* fix processor tests

* fix modeling tests

* fix test processor chat template

* revert modeling test changes
2025-03-18 13:08:12 -04:00
db1d4c5a0b Loading optimizations (#36742)
* improvements

* Update modeling_utils.py

* add some doc about loading

* Update modeling_utils.py
2025-03-18 16:38:44 +01:00
7baf00089a Update SHA for tj-actions/changed-files (#36795)
* trigger

* trigger

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-18 16:19:39 +01:00
3017536ebf fix hqq due to recent modeling changes (#36771)
* fix-hqq

* style

* test
2025-03-18 12:20:27 +01:00
e959530b8f Add Mistral3 (#36790)
* initial start

* style and dummies

* Create convert_mistral3_weights_to_hf.py

* update

* typo

* typo

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* up

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* update

* update

* Update image_processing_mistral3.py

* Update convert_mistral3_weights_to_hf.py

* fix patch merger

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* up

* update modular to fit

* style

* Update convert_mistral3_weights_to_hf.py

* typo

* Update modular_mistral3.py

* simplify a lot all shape shenanigans

* simplify

* add working test processor

* Add partially working common modeling tests

* All tests working and remove mistral3 image processors

* add docs and fixup

* fix inference with image size >1540

* 🚨fix test image proc pixtral

* Remove vision_feature_select_strategy

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* clean

* fix test checkpoints

* Update test_modeling_mistral3.py

* Update test_modeling_mistral3.py

* style

* Use Pixtral processor

* up

* finish cleaning processor to use pixtral directly

* Update __init__.py

* Update processing_pixtral.py

* doc

* Update __init__.py

* Update mistral3.md

* Update _toctree.yml

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>
2025-03-18 12:04:42 +01:00
bd92073692 Fix gemma3_text tokenizer in mapping (#36793) 2025-03-18 11:50:22 +01:00
7426d02ea8 Fixing typo in gemma3 image_processor_fast and adding a small test (#36776)
Co-authored-by: zebz13 <zeb@fedora>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-18 11:35:06 +01:00
19b9d8ae13 chore: fix typos in tests directory (#36785)
* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory
2025-03-18 10:31:13 +01:00
7f5077e536 fix typos in the tests directory (#36717) 2025-03-17 17:45:57 +00:00
cbfb8d7b27 doc: Clarify is_decoder usage in PretrainedConfig documentation (#36724)
* fix: clarify decoder usage in PretrainedConfig documentation

* Apply suggestions from code review

updated doc

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-17 09:40:25 -07:00
ac1a1b66b9 [docs] Update README (#36265)
* update

* feedback

* feedback

* update versions
2025-03-17 09:37:19 -07:00
cff4caa0c1 [CI] remove redundant checks in test_eager_matches_sdpa_inference (#36740) 2025-03-17 16:29:18 +00:00
e3af4fec91 [MINOR:TYPO] Update hubert.md (#36733)
* [MINOR:TYPO] Update hubert.md

- typo fix (wave2vec instead of hubert)
- make code snippet copiable and runnable

* Run tests
2025-03-17 09:07:51 -07:00
c8a2b25f91 Fix TrainingArguments.torch_empty_cache_steps post_init check (#36734)
Fixed a mistaken use of De Morgan's law: changed the "not (X or Y)"
check to the correct "not (X and Y)" so that a ValueError is raised.

Added corresponding test to check "positive int or None" condition.

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-17 16:09:46 +01:00
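A generic reconstruction of the corrected check, not the literal diff. By De Morgan's law, `not (A or B)` equals `(not A) and (not B)`, so negating "is None or is a positive int" with the wrong connective silently rejects valid values and accepts invalid ones.

```python
def validate_torch_empty_cache_steps(steps):
    # Valid values: None, or a positive int.
    is_valid = steps is None or (isinstance(steps, int) and steps > 0)
    if not is_valid:
        raise ValueError("torch_empty_cache_steps must be None or a positive integer")


validate_torch_empty_cache_steps(None)  # ok
validate_torch_empty_cache_steps(10)    # ok
# validate_torch_empty_cache_steps(0)   # raises ValueError
```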
8e67230860 Fix test isolation for clear_import_cache utility (#36345)
* test fixup

* test fixup

* fixing tests for unused imports

* style fixes

* fix

* style fixes

* style fix

* remove isolated module cache

* rm custom subprocess definition

* run using existing fn

* style fixup

* make fixup

* remove redundant comments

* rm redundant skipif + style changes
2025-03-17 16:09:09 +01:00
27361bd218 fix xpu tests (#36656)
* fix awq xpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix llava next video bnb tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-17 15:57:49 +01:00
da7d64f4ff Allow ray datasets to be used with trainer (#36699)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-17 15:44:47 +01:00
2256875a77 fix can_generate (#36570)
* fix can_generate

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix can generate for speecht5 and blip

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix speecht5 tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-03-17 14:56:18 +01:00
9e94801146 enable/disable compile for quants methods (#36519)
* disable compile for most quants methods

* fix

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update tests/quantization/bnb/test_mixed_int8.py

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* changes from joao suggestions

---------

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-17 11:38:21 +01:00
c53d53da89 🚨🚨🚨 Fix sdpa in SAM and refactor relative position embeddings (#36422)
* fall back to eager if output_attentions

* improve relative position embeddings

* run modular on got_ocr2

* run-slow: sam

* fix run-length encoding

* fix tf processor errors

* update tf_sam

* fix compile error

* re-run tests
2025-03-17 09:39:52 +00:00
fc8764c9a6 [Generation, Gemma 3] When passing a custom generation_config, overwrite default values with the model's base generation_config (#36684) 2025-03-15 12:40:09 +00:00
f263e88dcf Update self-push-caller.yml 2025-03-15 11:32:04 +01:00
6f3e0b68e0 Fix grad accum arbitrary value (#36691) 2025-03-14 22:03:01 +01:00
2c2495cc7b Fix post_init() code duplication (#36727)
* Update modeling_utils.py

* CIs
2025-03-14 17:36:02 +01:00
25992b493c 🌐 [i18n-KO] Translated codegen.md to Korean (#36698)
* Initial translation

* Add _toctree.yml
2025-03-14 09:31:18 -07:00
42ebb6c23e [tests] Parameterized test_eager_matches_sdpa_inference (#36650) 2025-03-14 14:41:27 +00:00
9215cc62d4 Try working around the processor registration bugs (#36184)
* Try working around the processor registration bugs

* oops

* Update error message

* Clarify error

* Docstring docstring docstring

* The extra content is indexed by config class, so let's grab some values out of there

* Commit my confusion as a TODO

* Resolve my confusion

* Cleanup and mostly revert to the original

* Better autoclass fallback

* Don't nest f-strings you lunatic

* Clearer error message

* Less getattr()

* Revert a lot of changes to try a different approach!

* Try the global registry

* Check the dynamic list as well as the transformers root

* Move the dynamic list somewhere safer

* Move the dynamic list somewhere even safer

* More import cleanup

* Simplify all the register_for_auto_class methods

* Set _auto_class in the register() methods

* Stop setting the cls attribute in register()

* Restore specifying the model class for Model derivatives only

* Fix accidentally taking the .__class__ of a class

* Revert register_for_auto_class changes

* Fix get_possibly_dynamic_module

* No more ALL_CUSTOM_CLASSES

* Fix up get_possibly_dynamic_module as well

* Revert unnecessary formatting changes

* Trigger tests
2025-03-14 13:56:21 +00:00
691d1b52c3 Fix/best model checkpoint fix (#35885)
* Set best_model_checkpoint only when ckpt exists.

Rather than setting it explicitly without checking whether the checkpoint directory even exists, as before, the setting logic now lives inside _save_checkpoint and only sets best_model_checkpoint if the checkpoint exists.

* Added best_global_step to TrainerState.

* Added tests for best_model_checkpoint.

* Fixed hard-coded values in test to prevent fail.

* Added helper func and removed hard-coded best_step.

* Added side effect patch generator for _eval.

* Added evaluate side effect func.

* Removed erroneous patching.

* Fixed minor bug.

* Applied Ruff.

* Fixed Ruff problem in make style.

* Used Trainer.set_initial_training_values.
2025-03-14 14:24:53 +01:00
3bd1a0ddf1 [model loading] don't gc.collect() if only 1 shard is used (#36721)
* don't gc collect if 1 shard is used

* delete state dict anyways
2025-03-14 12:56:56 +00:00
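A schematic of the loading-loop change; the function and its arguments are illustrative, not the actual `modeling_utils` code.

```python
import gc


def load_shards(shard_files, load_one, assign_into_model):
    for shard_file in shard_files:
        state_dict = load_one(shard_file)
        assign_into_model(state_dict)
        del state_dict  # always drop the reference ("delete state dict anyways")
        if len(shard_files) > 1:
            # Only pay for a full collection when several shards must pass
            # through memory; single-shard checkpoints skip the costly gc call.
            gc.collect()
```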
8cb522b419 Cleanup the regex used for doc preprocessing (#36648)
* Cleanup the regex used for doc preprocessing

* Run tests
2025-03-14 12:18:49 +00:00
72861e11eb Make the flaky list a little more general (#36704)
* Make the flaky list a little more general

* Trigger tests

* Make the flaky list a little more general
2025-03-14 12:15:32 +00:00
53742b11f5 Gemma3 processor typo (#36710)
* fix typo when  is on

* tiny

* add test and remove 'text_crops'

* lint
2025-03-14 13:07:55 +01:00
69bc848480 Add support for fast image processors in add-new-model-like CLI (#36313)
* add support for fast image processors in add-new-model-like

* fix header not found add-fast-image-processor-cli

* Encourage adding fast image processor

* nit

* start improve doc

* update docs

* make requested modifs
2025-03-13 14:16:37 -04:00
48ef468c74 Final CI cleanup (#36703)
* make fixup

* make fixup

* Correct skip decorator

* Add TODOs

* add is_flaky() parentheses
2025-03-13 17:26:09 +00:00
b070025aa6 Add GGUF support to T5-Encoder (#36700)
* add gguf support to t5encoder

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove gguf from model_kwargs

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-03-13 17:57:33 +01:00
4a60bae8e2 Handling an exception related to HQQ quantization in modeling (#36702)
* adding exception

* style

* add types
2025-03-13 17:53:36 +01:00
09a309d273 fix: fsdp sharded state dict wont work for save_only_model knob (#36627)
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-13 17:17:35 +01:00
2a004f9ff1 Add loading speed test (#36671)
* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* trigger CIs

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* better error messages

* Update test_modeling_utils.py

* Update test_modeling_utils.py
2025-03-13 17:07:30 +01:00
a3201cea14 [CI] Automatic rerun of certain test failures (#36694) 2025-03-13 15:40:23 +00:00
d84569387f chore: fix typos in utils module (#36668)
* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module
2025-03-13 15:12:44 +00:00
32c95bd847 Fix dtype for params without tp_plan (#36681)
* Update tensor_parallel.py

* CIs
2025-03-13 15:28:14 +01:00
bb965d8e87 fix type annotation for ALL_ATTENTION_FUNCTIONS (#36690)
Corrects the type annotation to match actual usage. The variable was typed as
Dict[str, Dict[str, Callable]] but is actually used as Dict[str, Callable]
where keys are attention mechanism names and values are the corresponding
attention functions directly. This change makes the type annotation consistent
with how the dictionary is used in the codebase.
2025-03-13 14:27:50 +00:00
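The corrected annotation in isolation, matching the usage described above:

```python
from typing import Callable, Dict


def sdpa_attention_forward(*args, **kwargs):
    ...  # placeholder body for the sketch


# Keys are attention mechanism names; values are the attention functions
# themselves, not nested dicts (the old annotation wrongly said
# Dict[str, Dict[str, Callable]]).
ALL_ATTENTION_FUNCTIONS: Dict[str, Callable] = {"sdpa": sdpa_attention_forward}

attn_fn = ALL_ATTENTION_FUNCTIONS["sdpa"]  # a Callable, as the annotation now says
```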
1c287aecfc Change Qwen2_VL image processors to have init and call accept the same kwargs (#36207)
Change qwen2VL image processors to have init and call accept the same kwargs
2025-03-13 10:15:17 -04:00
65b8e38aac Upgrading torch version and cuda version in quantization docker (#36264)
* update

* small update

* no spqr quant

* testing

* testing

* test nightly

* gptqmodel

* flute

* fix hadamard

* running tests

* new docker

* fix docker

* run tests

* testing new docker

* new docker

* run tests

* new docker

* run tests

* final test

* update

* update

* run tests

* new docker

* launch tests

* test_docker

* running tests

* add comments

* fixing yml

* revert
2025-03-13 12:39:16 +01:00
87b30c3589 fix wandb hp search unable to resume from sweep_id (#35883)
* fix wandb hp search unable to resume from sweep_id

* format styles

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-13 12:32:26 +01:00
47cc4da351 Changing the test model in Quanto kv cache (#36670)
changing model
2025-03-13 12:23:34 +01:00
bc3d5781e7 Fix slicing for 0-dim param (#36580)
* fix

* switch to ellipsis instead

* Add co-author
Co-authored-by: fxmarty-amd <fxmarty-amd@users.noreply.github.com>

* Add co-author second try
Co-authored-by: fxmarty-amd <felmarty@amd.com>
2025-03-13 12:16:13 +01:00
fbb18ce68b Update config.torch_dtype correctly (#36679)
* fix

* style

* new test
2025-03-13 12:08:02 +01:00
c4161238bd [Cache] Don't initialize the cache on meta device (#36543) 2025-03-13 10:13:29 +00:00
79254c9b61 Fix rescale normalize inconsistencies in fast image processors (#36388)
* fix fused rescale normalize inconsistencies

* fix siglip2 fast image processor

* refactor kwargs validation and fused normalize rescale

* cleanup kwargs handling in preprocess

* update new procs after refactor
2025-03-12 23:18:34 -04:00
48292a9848 Refactor siglip2 fast image processor (#36406)
* refactor siglip2 fast image processor, add unused_kwargs in base fast image processor

* nits

* change unused_kwargs default to None

* update siglip2 fast image proc
2025-03-12 20:28:27 -04:00
ea219ed164 Remove differences between init and preprocess kwargs for fast image processors (#36186)
* Remove differences between init and preprocess kwargs in fast image processors

* make modifs got_ocr2

* update gemma3
2025-03-12 19:44:05 -04:00
cc3a361b46 [quants] refactor logic for modules_to_not_convert (#36672) 2025-03-12 23:43:30 +01:00
bc3253f076 Remove hardcoded slow image processor class in processors supporting fast ones (#36266)
* Add fast image processor class to processors supporting them

* fix test kosmos2
2025-03-12 18:39:25 -04:00
0013ba61e5 Fix Failing GPTQ tests (#36666)
fix tests
2025-03-12 20:03:02 +01:00
c7eb95581a Don't accidentally mutate the base_model_tp_plan (#36677)
* Don't accidentally mutate the base_model_tp_plan

* Co-authored by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Trigger tests

* Marking grad accum test as slow

* Add a flaky decorator

* Add a flaky decorator

* Use cyril's codeblock

* Don't copy() when it's None

* Use cyril's new codeblock

* make fixup
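
A hedged sketch of the defensive-copy idea (the plan contents are hypothetical):

```python
# Copy the class-level plan before mutating it per instance -- but only when
# it exists, since calling .copy() on None would fail.
base_model_tp_plan = {"layers.*.self_attn": "colwise"}  # hypothetical plan
tp_plan = base_model_tp_plan.copy() if base_model_tp_plan is not None else None
```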
2025-03-12 18:59:13 +00:00
071a161d3e [core] Large/full refactor of from_pretrained (#36033)
* squash everything together
start to simplify inner logic

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

continue refactor

fix

small fixes

add type hints/docstring

Update modeling_utils.py

remove _fast_init

keep improving

Update modeling_utils.py

Update modeling_utils.py

new first tp loading version

style

fix weird in-place op

trigger CIs

Update modeling_utils.py

much clearer renaming of keys

fix

update

Update test_modeling_common.py

trigger CIs

update

update

style

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

fix

fast download first prototype

remove old function

remove old functions

Remove unused function and move back _get_tp_registry

fix tp plan registry

simplify

CIs

Update hub.py

Update modeling_utils.py

simplify

simplify renaming logic

remove unused check

add sanity check back (a test depends on it)

Update modeling_utils.py

finalize sound renaming logic

style

add forgotten check

Update modeling_utils.py

add key_mapping keyword

style

Update modeling_utils.py

add comment

minor updates

minor change for clarity

fix small prefix issue and simplify

style

trigger CIs

typo fix

Post rebase fix

post rebase cleanup

simplify tp

typo

oupsi

typo

correctly escape

improvements based on Marc's review

finalize Marc's review comments

 squash everything

* improve

* Update modeling_utils.py

* Update modeling_utils.py

* fix

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Update modeling_utils.py

* simplify

* style

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix dtype issue

* Update modeling_utils.py

* style

* remove test that does not make sense

* style

* small fixes

* style

* fix

* cleanup after rebase

* style

* typo

* escape

* tp for task specific top modules

* Update modeling_utils.py

* Update modeling_utils.py

* fix allocation

* CIs

* CIs

* CIs

* improve docstring

* CIs

* Update modeling_utils.py

* fix
2025-03-12 13:39:25 +01:00
7652804d23 Fix bnb regression due to empty state dict (#36663)
fix
2025-03-12 11:40:46 +01:00
994cad2790 [CI] gemma 3 make fix-copies (#36664)
* make fixup

* trigger ci
2025-03-12 10:35:13 +00:00
2829013d2d fix block mask typing (#36661)
* fix block mask typing

* updated

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* gemma

* fix

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-03-12 11:29:11 +01:00
89f6956015 HPU support (#36424)
* test

* fix

* fix

* skip some and run some first

* test fsdp

* fix

* patches for generate

* test distributed

* copy

* don't test distributed loss for hpu

* require fp16 and run first

* changes from marc's PR fixing zero3

* better alternative

* return True when fp16 is supported on Gaudi without creating a bridge

* fix

* fix tested dtype in deepspeed inference test

* test

* fix

* test

* fix

* skip

* require fp16

* run first fsdp

* Apply suggestions from code review

* address comments

* address comments and refactor test

* reduce precision

* avoid doing Gaudi1-specific stuff in the generation loop

* document test_gradient_accumulation_loss_alignment_with_model_loss test a bit more
2025-03-12 09:08:12 +01:00
50d3530aa0 Gemma3 (#36658)
* Fix converter

* [Broken] Adds Gemma 3 to Hugging Face Transformers

* Consolidating Config and Processor params across impls

* Sorting out configuration parameters. Adds qk_norm before RoPE. Still not sure if RoPE is right.

* Additional plumbing for CausalLM and ConditionalGeneration variants

* incomplete draft of Orbax conversion script

* More complete checkpoint conversion

* Supporting Gemma 3 1B checkpoints

* Updating RoPE for multiple frequencies

* Adjustments to rotary embedder

* Proof of life for text-only operation

* Updating the conversion script to handle multimodal projection weights

* Fixing text-only conversions

* Cleaner conversion script with multimodal support and a simpler processor

* Additional refactors to the Gemma3Processor

* Simplified Processor to work over text representations

* Updated conversion script to join text and vision embeddings at conversion time

* Logging for debugging

* Update src/transformers/models/gemma2/modeling_gemma2.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Removed extraneous Config params

* Switching to fast tokenizer for checkpoint conversions

* isolating siglip for performance testing

* Minor changes for debugging tests against baselines

* Adding average pooling for soft tokens

* Updating processor code to enable simpler embedding interleaving for an arbitrary number of images in prompts

* Updating conversion script for ShieldGemma 2 conversion compatibility

* Allow disable_compile to be provided as a kwarg

* Refresh from modular

* Updated conversion script and corrected sliding window

* Fix type mismatch in cache_position (#4)

* Fix dtype (#5)

* Fix type mismatch in cache_position

* Actually fix in the modular file

Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>

---------

Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>

* fixes for embedding table overflow and missing image_soft_token_mask from Gemma3Processor

* Adding 2D pooling for image embeddings

* Revert "Adding 2D pooling for image embeddings"

This reverts commit 65350cf531296f050b2078a5b8e46f61642b2648.

* Gemma3 average pooling changed from 1D to 2D

* Major refactor to Gemma3MultimodalInputProjection

* Updating Gemma 3 Auto* registrations

* Add option to save Gemma 3 chat template with tokenizer during weights conversion

* Removing unused imports

* Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditionalGeneration

* Removing duplicate config property

* Removing final logit softcapping and 1-indexing of position ids

* Fixing image processor config and none --> None typo

* Fixing sliding window size for 1B

* Updating image_mean and image_std in Image Processor

* Attention masking changed to lower triangular

* Moving image special tokens to conversion script

* Mirror image processor defaults from conversion script into Gemma3ProcessorKwargs

* Remove special token variables from symbol space

* Moving image soft token mask computation from Gemma3Processor to Gemma3ForConditionalGeneration

* tie lm_head and embedding weights

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Correct tied weights in Gemma3CausalLM

* iterative bidirectional attention

* resolving merge conflicts

* Reverting to Gemma 2 HybridCache with sliding window support and a sliding_window_pattern of 6

* Correcting RoPE scaling

* clean up first pass, dummy model generation works

* final clean up before fixing tests

* causal lm test works, so fine

* Fix conversion

* Update src/transformers/models/gemma3/processing_gemma3.py

* model tests are happy

* processor tests are happy

* image processing tests added

* fixup

* Fix pre-processing in conversion

* Inputs merging

* Do not normalize vision embeddings

* Apply Ryan's (and team) changes to attention

* token type ids + mask

* template

* move embed scale, add rope scale, fix tests

* Add chat template to tokenizer

* Use prefix for causal model loading

* use existing code for sliding mask from gemma2

* self.embed_tokens already normalizes

* Correcting Gemma3TextConfig parameters in conversion script

* typo, modular overwrites my fixes

* enable device map for text model

* Conversion updates

* ultra nit: no einsums

* update image token

* copy deepcopy config + some docs

* add some test, still WIP

* Refactoring --include_chat_template logic in converter

* Update src/transformers/models/gemma3/modular_gemma3.py

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

* Add eos tokens for instruct models

* dump so i can work on dgx

* Removing add_bos by default

* dump

* add fast im proc

* docs for PaS + fixup

* another fixup

* one more fixup

* fix tests

* Inverting prior BOS change

* ultra nit

* Reverting to Tokenizer saved with add_bos_token=True and chat template starting with BOS

* resize embeds, remove sqrt, add slow test outputs

* FA2 but quality is meh

* nit

* skip FA2, no idea what happened

* last bit for green CI

* please, green CI for docs

* T_T

* Fix for Gemma3 logits

* Support both options for system prompt

* Update src/transformers/models/gemma3/image_processing_gemma3_fast.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Docs updates now that assets are live

* Style fixes

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: Lysandre <hi@lysand.re>
2025-03-12 09:06:17 +01:00
81aa9b2e07 fix typos in the docs directory (#36639)
* chore: fix typos in the docs directory

* chore: fix typos in the docs directory

* chore: fix typos in the docs directory
2025-03-11 09:41:41 -07:00
cb384dcd7a Fix gguf docs (#36601)
* update

* doc

* update

* Update docs/source/en/gguf.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-11 15:29:14 +01:00
1e4286fd59 Remove research projects (#36645)
* Remove research projects

* Add new README to explain where the projects went

* Trigger tests

* Cleanup all references to research_projects
2025-03-11 13:47:38 +00:00
ed1807bab3 [docs] Update docs dependency (#36635)
update
2025-03-11 13:42:49 +00:00
b80b3ec529 Stop warnings from unnecessary torch.tensor() overuse (#36538) 2025-03-11 13:41:13 +00:00
556d2c23c6 Remove remote code warning (#36285)
* Remove redundant pipeline warning

* Remove redundant pipeline warning
2025-03-11 13:29:15 +00:00
b1a51ea464 Fix AriaForConditionalGeneration flex attn test (#36604)
AriaForConditionalGeneration depends on idefics3 vision transformer which does not support flex attn
2025-03-11 11:05:49 +01:00
d126f35427 Proper_flex (#36643)
* proper performant flex attention implementation

* wrapper for flex attention to compile only when triggered

* wrapper for flex attention to compile only when triggered

* attention mask type detection

* Update src/transformers/integrations/flex_attention.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* nit

* nit

* nit

* nit

* gemma2 support

* add citation for torchtune

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update flex_attention.py

* nit

* nit

* nit

* reset gemma2 modifications

* nit

* nit

* nit

* licencing

* apply changes to other models

* safe import

---------

Co-authored-by: Sung Ching Liu <sunny19981005@outlook.com>
Co-authored-by: Sung Ching Liu <22844540+bursteratom@users.noreply.github.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-03-11 10:24:12 +01:00
d8663cb8c5 Fix bugs in mllama image processing (#36156)
* fix: handle input_channel_dim == channels_last

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* fix: default PIL images to channels_last

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fixup from review batch

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* test: add 1x1 PIL image to ambiguous channel test

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* fix(mllama): avoid 0 dimension for image with impractical aspect ratio

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

---------

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-11 10:22:48 +01:00
1c4b62b219 Refactor some core stuff (#36539)
* some config changes

* update

* current state

* update

* update

* updates and cleanup

* something that works

* fixup

* fixes

* nits

* nit

* nits and fix

* Update src/transformers/integrations/tensor_parallel.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/integrations/tensor_parallel.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* cleanup

* style

* safe import

* fix

* updates

* rename stuff an clean

* style

* small updates

* ups

* oups

* nit

* protect imports

* update tp

* rodfl

* arf

* turbo nit on init

* fix import error

* frumble gumbgle

* try to fix the import error

* should fix the non model test

* update keep in float32

* update

* fix

* nits

* fix subconfigs

* test was weird

* nit

* fix failing test

* fix instruct blip

* fixes

* style

* x.com

* fix overwrite

* ok last bit of failing test

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2025-03-11 09:26:28 +01:00
e9756cdbc7 [docs] Serving LLMs (#36522)
* initial

* fix

* model-impl
2025-03-10 13:14:19 -07:00
af9b2eaa54 chore: fix typos in language models (#36586)
* chore: fix typos in language models

* chore: fix typos in mistral model

* chore: fix model copy from issue

* chore: fix model copy from issue

* chore: fix model copy from issue

* chore: fix model copy from issue

* chore: fix model copy from issue
2025-03-10 15:54:49 +00:00
a929c466d0 Fix auto-assign reviewers (#36631)
* Fix auto-assign reviewers

* Clean up endanchor a bit

* We don't actually need the end anchor at all
2025-03-10 15:52:13 +00:00
858545047c [HybridCache] disable automatic compilation (#36620) 2025-03-10 09:24:26 +00:00
94ae1ba5b5 Fix check for XPU. PyTorch >= 2.6 no longer needs ipex. (#36593) 2025-03-07 14:09:35 +00:00
a1cf9f3390 Fixed datatype related issues in DataCollatorForLanguageModeling (#36457)
Fixed 2 issues regarding `tests/trainer/test_data_collator.py::TFDataCollatorIntegrationTest::test_all_mask_replacement`:
1. I got the error `RuntimeError: "bernoulli_tensor_cpu_p_" not implemented for 'Long'`. This is because `mask_replacement_prob=1` arrives as a `torch.long` dtype, which `torch.bernoulli` doesn't accept. I fixed this by manually casting the probability arguments in the `__post_init__` function of `DataCollatorForLanguageModeling`.
2. I also got the error `tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute Equal as input #1(zero-based) was expected to be a int64 tensor but is a int32 tensor [Op:Equal]` due to the line `tf.reduce_all((batch["input_ids"] == inputs) | (batch["input_ids"] == tokenizer.mask_token_id))` in `test_data_collator.py`. This occurs because the type of the `inputs` variable is `tf.int32`. Solved this by manually casting it to `tf.int64` in the test, as the expected return type of `batch["input_ids"]` is `tf.int64`.
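
A minimal sketch of the two casts described above (shapes and values are illustrative):

```python
import torch
import tensorflow as tf

# (1) torch.bernoulli needs a floating-point probability tensor, so an
#     integer mask_replacement_prob=1 must be cast before sampling.
mask_replacement_prob = 1
prob = torch.full((8,), float(mask_replacement_prob))  # float32 probabilities
mask = torch.bernoulli(prob)

# (2) the TF equality check needs both sides as int64, matching input_ids.
inputs = tf.cast(tf.constant([[101, 102]], dtype=tf.int32), tf.int64)
```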
2025-03-07 14:09:27 +00:00
4fce7a0f0f Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/decision_transformer (#36582)
Bump jinja2 in /examples/research_projects/decision_transformer

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-07 13:35:59 +00:00
f2fb41948e Update "who to tag" / "who can review" (#36394)
update who to tag
2025-03-07 13:09:31 +00:00
1b9978c360 Update chat_extras.md with content correction (#36599)
Update chat_extras.md - content

Fixed a typo in the content that may confuse readers.
2025-03-07 13:09:02 +00:00
f2e197c30a Github action for auto-assigning reviewers (#35846)
* First draft of github action on PR opening for auto-assigning reviewers

* fix missing import

* Don't reassign reviewers if we already have them

* Temporarily comment out the opened line so we can test the script

* Correct path for codeowners file

* Update workflow permissions

* Update workflow permissions

* Update debug logs

* Strip inline comments

* Remove prefix

* Request reviews instead of assigning

* Request reviews instead of assigning

* Add TODO

* Use pull-request-target instead

* Update the script

* Set back to pull_request for testing

* Set to pull_request_target, testing works!

* Add licence

* Tighten up one of the globs

* Refactor things to be a bit less convoluted

* Only assign reviewers when marked ready for review
2025-03-07 12:18:49 +00:00
8a16edce67 Export base streamer. (#36500)
* Export base streamer. 

Previously, the base streamer class was not exported, so the set of available streamers was fixed to 3 streamer classes. 

This change makes it so that customers may extend the default base streamer class.
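
A toy custom streamer, assuming `BaseStreamer` is importable from `transformers.generation` after this change (the counting logic is illustrative):

```python
from transformers.generation import BaseStreamer

class TokenCountingStreamer(BaseStreamer):
    """Counts streamed tokens instead of printing them."""

    def __init__(self):
        self.n_tokens = 0

    def put(self, value):
        # `value` is a tensor of newly generated token ids.
        self.n_tokens += value.numel()

    def end(self):
        print(f"generation finished after {self.n_tokens} tokens")
```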

* make fixup

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2025-03-07 11:16:09 +00:00
6f775970c7 avoid errors when the size of input_ids passed to PrefixConstrainedLogitsProcessor is zero (#36489)
* avoid errors when the size of `input_ids` passed to PrefixConstrainedLogitsProcessor is zero

* use more reasonable process

* avoid early return

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-07 11:02:49 +00:00
51ed61e2f0 Mention UltraScale Playbook 🌌 in docs (#36589) 2025-03-06 14:48:11 -08:00
159445d044 fix: argument (#36558)
752ef3fd4e/utils/modular_model_converter.py (L1729)
2025-03-06 13:11:19 -08:00
5275ef6f3d [XGLM] tag tests as slow (#36592)
these tests should be slow
2025-03-06 17:54:41 +00:00
c1b24c0b73 [bark] fix loading of generation config (#36587) 2025-03-06 16:55:19 +00:00
0440dbc0e1 Integrate SwanLab for offline/online experiment tracking and local visualization (#36433)
* add swanlab integration

* feat(integrate): add SwanLab as an optional experiment tracking tool in transformers

- Integrated SwanLab into the transformers library as an alternative for experiment tracking.
- Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`.
- Added necessary dependencies and documentation for SwanLab integration.
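
A minimal sketch of opting in (output_dir is a placeholder):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./out",   # placeholder path
    report_to="swanlab",  # enables the SwanLab callback described above
)
```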

* Fix the spelling error of SwanLabCallback in callback.md

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Fix typo in comment

* Fix typo in comment

* Fix typos and update comments

* fix annotation

* chore: opt some comments

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: AAssets <20010618@qq.com>
Co-authored-by: ZeYi Lin <944270057@qq.com>
Co-authored-by: KAAANG <79990647+SAKURA-CAT@users.noreply.github.com>
2025-03-06 17:35:30 +01:00
bc30dd1efb Modular Conversion --fix_and_overwrite on Windows (#36583)
* Modular Conversion --fix_and_overwrite on Windows

* -newline on read
2025-03-06 13:12:30 +00:00
9e385109cf Delete redundancy if case in model_utils (#36559)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-06 11:36:11 +00:00
acc49e390d Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/pplm (#36540)
Bump transformers in /examples/research_projects/pplm

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-06 11:35:47 +00:00
9e84b38135 chore: enhance message descriptions in parameters, comments, logs and docstrings (#36554)
* chore: enhance message descriptions in parameters, comments, logs and docstrings

* chore: enhance message descriptions in parameters, comments, logs and docstrings

* Update src/transformers/hf_argparser.py

* Update src/transformers/keras_callbacks.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-03-06 11:02:35 +00:00
6966fa1901 Fix typos. (#36551)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-05 16:31:43 -08:00
996f512d52 Fix typos in tests (#36547)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-05 15:04:06 -08:00
752ef3fd4e guard torch version for uint16 (#36520)
* u16

* style

* fix
2025-03-05 11:27:01 +01:00
66f29aaaf5 chore: enhance messages in docstrings (#36525)
chore: enhance the message in docstrings
2025-03-04 16:31:20 +00:00
89d27fa6ff Fix links in quantization doc (#36528)
fix quantization doc
2025-03-04 16:43:03 +01:00
c0c5acff07 Fix bamba tests amd (#36535) 2025-03-04 15:24:27 +01:00
37508816d6 chore: Fix typos in docs and examples (#36524)
Fix typos in docs and examples

Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-04 13:47:41 +00:00
84f0186e89 Add aya (#36521)
* initial commit

* small fix

* move stuff to image processing file

* remove stuff in validate turn and fix return tensor

* remove liquid stuff

* in the process of addressing comments

* changes to get the right tokenization

* new __init__ works

* fixing default std and mean

* works

* small testing scipt -- to be deleted before merge

* remove redundant code

* addressing comments

* fix inits, add docs templates

* refactor processor, switch to gotocr image processor

* remove image proc from init

* refactor to working llava-style architecture

* Change AyaVisionModel to AyaVisionForConditionalGeneration

* add tests

* fixups

* update doc

* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model

* better variable names + remove code paths

* Updates to aya_vision.md

* address comments

* adding copied from

* make style and remove unused projector_hidden_act from config

* sort init

* include usage of fast image proc and proc on cuda in doc

* update checkpoint in test processor

* update checkpoint in test processor 2

* remove test_model and update docstring

* skip failing tests

---------

Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-03-04 12:24:33 +01:00
c0f8d055ce [docs] Redesign (#31757)
* toctree

* not-doctested.txt

* collapse sections

* feedback

* update

* rewrite get started sections

* fixes

* fix

* loading models

* fix

* customize models

* share

* fix link

* contribute part 1

* contribute pt 2

* fix toctree

* tokenization pt 1

* Add new model (#32615)

* v1 - working version

* fix

* fix

* fix

* fix

* rename to correct name

* fix title

* fixup

* rename files

* fix

* add copied from on tests

* rename to `FalconMamba` everywhere and fix bugs

* fix quantization + accelerate

* fix copies

* add `torch.compile` support

* fix tests

* fix tests and add slow tests

* copies on config

* merge the latest changes

* fix tests

* add few lines about instruct

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix tests

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* "to be not" -> "not to be" (#32636)

* "to be not" -> "not to be"

* Update sam.md

* Update trainer.py

* Update modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* fix hfoption tag

* tokenization pt. 2

* image processor

* fix toctree

* backbones

* feature extractor

* fix file name

* processor

* update not-doctested

* update

* make style

* fix toctree

* revision

* make fixup

* fix toctree

* fix

* make style

* fix hfoption tag

* pipeline

* pipeline gradio

* pipeline web server

* add pipeline

* fix toctree

* not-doctested

* prompting

* llm optims

* fix toctree

* fixes

* cache

* text generation

* fix

* chat pipeline

* chat stuff

* xla

* torch.compile

* cpu inference

* toctree

* gpu inference

* agents and tools

* gguf/tiktoken

* finetune

* toctree

* trainer

* trainer pt 2

* optims

* optimizers

* accelerate

* parallelism

* fsdp

* update

* distributed cpu

* hardware training

* gpu training

* gpu training 2

* peft

* distrib debug

* deepspeed 1

* deepspeed 2

* chat toctree

* quant pt 1

* quant pt 2

* fix toctree

* fix

* fix

* quant pt 3

* quant pt 4

* serialization

* torchscript

* scripts

* tpu

* review

* model addition timeline

* modular

* more reviews

* reviews

* fix toctree

* reviews reviews

* continue reviews

* more reviews

* modular transformers

* more review

* zamba2

* fix

* all frameworks

* pytorch

* supported model frameworks

* flashattention

* rm check_table

* not-doctested.txt

* rm check_support_list.py

* feedback

* updates/feedback

* review

* feedback

* fix

* update

* feedback

* updates

* update

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
6aa9888463 Remove unused code (#36459) 2025-03-03 18:31:10 +00:00
9fe82793ee [Style] fix E721 warnings (#36474)
* fix E721 warnings

* config.hidden_size is not a tuple

* fix copies

* fix-copies

* not a tuple

* undo

* undo
2025-03-03 18:03:42 +00:00
1975be4d97 Fix edge case for continue_final_message (#36404)
* Fix edge case for continue_final_message

* lstrip() correctly

* Add regression test

* Add a clearer error message when the final message is not present

* Add a clearer error message when the final message is not present

* Fix massive bug!
2025-03-03 18:03:03 +00:00
2aff938992 Fix pipeline+peft interaction (#36480)
* Fix pipeline-peft interaction

* once again you have committed a debug breakpoint

* Remove extra testing line

* Add a test to check adapter loading

* Correct adapter path

* make fixup

* Remove unnecessary check

* Make check a little more stringent
2025-03-03 18:01:43 +00:00
28159aee63 chore: fix message descriptions in arguments and comments (#36504)
chore: fix message descriptions in arguments and comments
2025-03-03 17:54:57 +00:00
acb8586dd9 Fix some typos in docs (#36502)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-03-03 17:53:53 +00:00
0463901c92 fix torch_dtype, contiguous, and load_state_dict regression (#36512)
* fix regression

* fix param

* fix load_state_dict

* style

* better fix for module

* fix tests

* quick fix for now

* rm print
2025-03-03 18:35:37 +01:00
3e83ee75ec Fix kwargs UserWarning in SamImageProcessor (#36479)
transformers/image_processing_utils.py:41: UserWarning: The following named arguments are not valid for `SamImageProcessor.preprocess` and were ignored: 'point_pad_value'
2025-03-03 16:23:34 +00:00
9e3a1072c2 Check TRUST_REMOTE_CODE for RealmRetriever for security (#36511)
* fix

* repush

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-03 15:08:12 +01:00
4d8259d245 Fix loading zero3 weights (#36455)
* Check if fixes

* Fix zero3 loading

* Quality

* Fix marc nit

* Add fast tests

* Migrate to integrations.deepspeed rather than modeling_utils

* Style
2025-03-03 15:05:58 +01:00
dcbdf7e962 Fix _load_state_dict_into_meta_model with device_map=None (#36488)
* Fix _load_state_dict_into_meta_model with device_map=None

* Update src/transformers/modeling_utils.py
2025-03-02 08:33:36 +01:00
a40f1ac602 Fix couples of issues from #36335 (#36453)
* fix

* style

* better allocation

* fix

* fix

* style

* revert disk

* exit

* style

* return if nothing to cache

* dtensor guard

* fix regressiion

* fix regression

* fix

* fix
2025-03-01 07:12:17 +01:00
2c5d038f92 Add Got-OCR 2 Fast image processor and refactor slow one (#36185)
* refactor image processor slow got ocr

* add working image processor fast

* fix fast image processor, update doc

* use one big loop for processing patches
2025-03-01 00:56:00 -05:00
51083d1bac [docs] fix bug in deepspeed config (#36081)
bug fix
2025-02-28 07:09:54 -08:00
02776d2c6a Fix loading models with mismatched sizes (#36463)
* Fix loading model with mismatched sizes

* trigger tests
2025-02-28 11:48:59 +01:00
222505c7e4 [GroundingDino] Fix grounding dino loss 🚨 (#31828)
* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* fixed: failing tests

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Addressed comments

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* add: cardinality loss and make box loss as copy from

* change: default for reduction loss is sum

* fix: vectorized generate fake box

* fix copies

* Addressed comments

* addressed comments

* addressed one-hot

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* Addressed comments

* fixed test

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* add: cardinality loss and make box loss as copy from

* fix copies

* Revert "Update tests/models/grounding_dino/test_modeling_grounding_dino.py"

This reverts commit aa74c4c57c430e54cc74c414d6269edb65c73e83.

* [run-slow] groundigdino

* remove nestedtensor

* [run-slow] groundig_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* check

* check

* add: enconder intermediate outputs to ImageLoss forward

* add: GroundingDinoForObjectDetectionLoss in the loss directory

* make style

* fix the loss function

* remove class_reduction since sum is the default

* remove class_reduction

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* simple fix

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* minor fix

* Update src/transformers/loss/loss_for_object_detection.py

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 19:15:58 +00:00
482d17be60 Fix hub_retry (#36449)
* cry

* trigger

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 14:38:25 +01:00
6a876462c3 Lazy import libraries in src/transformers/image_utils.py (#36435)
* Lazy import libraries in `src/transformers/image_utils.py`

* `make fixup`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Protect imports

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
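
A sketch of the lazy-import pattern used here (illustrative helper; the real module guards its optional dependencies the same way):

```python
def open_image(path: str):
    # Deferred import: PIL is only required when the helper is actually
    # called, keeping `import transformers` itself cheap.
    from PIL import Image

    return Image.open(path)
```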
2025-02-27 12:53:42 +00:00
8aed019764 [generate] torch.distributed-compatible DynamicCache (#36373)
* test

* docstring

* prepare distributed cache data

* fix cat dim

* test mvp

* add test checks

* like this?

* working test and solution

* nit

* nit

* add shape info
2025-02-27 11:48:57 +00:00
17792556b2 [save_pretrained ] Skip collecting duplicated weight (#36409)
* Skip collecting duplicated weight

* format
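
A sketch of why duplicates appear in the first place: tied parameters share storage, so a naive state-dict walk collects the same tensor twice:

```python
import torch

emb = torch.nn.Embedding(10, 4)
lm_head = torch.nn.Linear(4, 10, bias=False)
lm_head.weight = emb.weight  # weight tying

# Same underlying storage -- saving both entries would duplicate the data.
print(lm_head.weight.data_ptr() == emb.weight.data_ptr())  # True
```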
2025-02-27 10:57:11 +01:00
2d6cc0dfde Add contents: write (#36445)
fix permission

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 10:55:37 +01:00
549db241e5 Fix another permission (#36444)
* fix permission

* fix permission

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 10:29:06 +01:00
a8e4fe45fd Fix permission (#36443)
fix permission

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 10:08:31 +01:00
d0727d92cd Change PR to draft when it is (re)opened (#36417)
* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 09:44:33 +01:00
8ede897c30 restrict cache allocator to non quantized model (#36428) 2025-02-26 22:16:15 +01:00
a7fbab33ae Fix Expected output for compressed-tensors tests (#36425)
fix
2025-02-26 21:17:24 +01:00
1603018e7a Update form pretrained to make TP a first class citizen (#36335)
* clean code

* oups

* fix merge

* yups

* fix if

* now you can play

* fix shape issue

* try non blocking

* fix

* updates

* up

* updates

* fix most of the tests

* update

* update

* small updates

* up

* fix the remaining bug?

* update

* rename when you read from the file

* buffer issues

* current status

* cleanup

* properly allocate dumb memory

* update a small bug

* fix colwise rep issue

* fix keep in float 32 that was keeping everything in float 32

* typo

* more fixes with keep_in_fp32_modules as we used to search on it

* fix ROPE dtype for TP

* remove what's breaking the tests

* updates

* update and fixes

* small cleanup after merging

* allocate 2x to be safe

* style, auto

* update

* yup nit

* fix

* remove slow as fuck torch api :(

* work

* fixup

* update

* bringing the fix back

* fix and update

* fixes

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* updates because some suggestions were wrong 👀

* update?

* fuck this bloated function

* typo

* fix the dumb prefix thing once and for all

* fixes here and there

* updates

* remove prints

* fix strict cases

* style

* properly fix keys on load!

* update

* fix base model prefix issue

* style

* update

* fix all?

* remove 1 print

* fix the final tests

* fixup

* last nits

* fix the detach issue which cause a 2x slowdown

* fixup

* small fixes

* ultra nit

* fix

* fix

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 20:12:38 +01:00
981c276a02 Fix compressed tensors config (#36421)
* fix config

* update

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 17:56:15 +01:00
d18d9c3205 Universal Speculative Decoding CandidateGenerator (#35029)
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* `UniversalSpeculativeDecodingGenerator`

* Use `UniversalSpeculativeDecodingGenerator` when `generation_config.do_sample=True`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* add `TestGenerateWithDifferentModels`

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* `UniversalSpeculativeDecodingGenerator`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* fix device issue

* fix get_assistant_input_ids

* add `TestAssistedCandidateGeneratorDifferentTokenizers`

* formatting

* `AssistantVocabTranslatorCache` refactor & tests

* revert changes in `src/transformers/generation/logits_process.py`

* refactor `AssistedCandidateGenerator`

* refactor `AssistedCandidateGeneratorDifferentTokenizers`

* formatting

* refactor `UniversalSpeculativeDecodingGenerator`

* fix negative value for max_new_tokens

* fix generation length target + attention_mask vs. assistant + attent

* fix device

* fix negative max_new_tokens bug

* fix UAG

* minor

* formatting

* `AssistedCandidateGeneratorDifferentTokenizers` `lookbehind`s init

* resolve conflict & formatting

* rerun CI tests

* remove space...

* remove old code

* fix candidate_input_ids device

* minor

* formatting

* Fix prepare + apply (#7)

* fix prepare + apply

* move to cpu

* simplify suppress_tokens

* fix bugs and refactoring

* device move

* handle self.config.vocab_size > len(target_tokenizer.get_vocab())

* no need to normalize in candidate_generator

* address Nadav's comments + minor

* optimize device move + SuppressTokensLogitsProcessor

* AssistantToTargetTranslator, SuppressTokensLogitsProcessor and tokenizers mapping improvements

* padding size

* padding improvement

* fix and simplify get_target_logits

* renaming in get_target_logits

* minor

* add filter_value and suppress_tokens_id

* style + rename

* remove TODO

* restore original SelectTokensLogitsProcessor with modification

* fix style

* fix _update_past_and_masks and optimize code

* remove assistant_vocab_size arg

* fix attention_mask

* call _prepare_attention_mask also if not has_past_key_values

* handling attention mask for first generation

* comment

* restore test

* remove SelectTokensLogitsProcessor

* _update_past_and_masks implementation for USD

* Add unittests for Universal Assisted generation

* fix style

* update tests

* Remove unused import and fix `test_speculation_depth` test

* exclude special and reserved tokens from tokenizer for UAG

* mv `test_universal_assisted_generation.py` to `generation/test_candidate_generator.py`

* Remove unused imports and fix style using `make style` (#9)

* formatting

* Swap gated `meta-llama/llama-3.2` with `allenai/llama` (#10)

* Fix space sign disagreement (#12)

* default values for AssistantToTargetTranslator fields

* fix space sign

* minor

* fix test + style

* Default values for some fields of assistant to target translator (#11)

* default values for AssistantToTargetTranslator fields

* fix

* add support to empty logit_processors

* Update candidate_generator.py (#15)

fix typo

* BUG fix in _prepare_assistant_input_ids (#14)

* fix _prepare_assistant_input_ids

* target_to_assistant_input_ids

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il>

---------

Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il>

* typo (`target_to_assistant_input_ids`)

* formatting

* merge upstream/main

* Fix minor review comments (#16)

* Fix: `token_ids.to(torch.int64)` (#18)

* tok ids to `torch.int64` (reference: https://huggingface.co/docs/transformers.js/en/api/tokenizers)

* `LongTensor`

* fix dtype

* `assistant_input_ids.to(dtype=torch.long)`

* Remove unused import from test_candidate_generator.py

* Remove unused import from test_candidate_generator.py

* Remove `numpy` import

* resolve pr comments (#19)

* `AssistantToTargetTranslator` docstring

* (per gante's comment) `filter_value` and `suppress_tokens_id` to class constants

* update `AssistantToTargetTranslator` docstring

* (gante's comment) replace `match-case`

* formatting

* Fix Joao's comments (#21)

* remove threading

* fix logits_processor

* fix test device

* fix style (#23)

* Move atm (#24)

* move AssistantToTargetTranslator

* fixup

* fix logit_processor

* add atm_translator test

* refactor test

* remove threading from test

* add require_torch in tests

* move AssistantVocabTranslatorCache + add tests

* ruff fix

---------

Co-authored-by: jmamou <jonathan.mamou@intel.com>
Co-authored-by: Gaurav <gauravj@d-matrix.ai>
Co-authored-by: Gaurav Jain <gaurjain14@gmail.com>
Co-authored-by: gauravjain14 <41287729+gauravjain14@users.noreply.github.com>
2025-02-26 16:14:02 +00:00
082834dd79 fix: prevent model access error during Optuna hyperparameter tuning (#36395)
* fix: prevent model access error during Optuna hyperparameter tuning

The `transformers.integrations.integration_utils.run_hp_search_optuna` function releases model memory and sets trainer.model to None after each trial. This causes an AttributeError when subsequent Trainer.train calls attempt to access the model before reinitialization. This is only an issue when `fp16_full_eval` or `bf16_full_eval` flags are enabled.
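
A hedged sketch of the affected code path (`build_model` is a hypothetical model factory):

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(output_dir="./out", fp16_full_eval=True)  # flag that triggered the bug
trainer = Trainer(model_init=lambda trial: build_model(), args=args)  # build_model is hypothetical

# Each trial re-enters Trainer.train; pre-fix, trainer.model was None on re-entry.
best = trainer.hyperparameter_search(backend="optuna", n_trials=5)
```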

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 17:06:48 +01:00
6513e5e402 add recommendations for NPU using flash_attn (#36383)
* add recommendations for Ascend NPU using flash_attn

* update recommend_message_npu

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 14:51:08 +01:00
b4965cecc5 Fixing the docs corresponding to the breaking change in torch 2.6. (#36420) 2025-02-26 14:11:52 +01:00
9a217fc327 Deprecate transformers.agents (#36415) 2025-02-26 11:38:47 +01:00
41925e4213 Add retry hf hub decorator (#35213)
* Add retry torch decorator

* New approach

* Empty commit

* Empty commit

* Style

* Use logger.error

* Add a test

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Fix err

* Update tests/utils/test_modeling_utils.py

---------

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 20:53:11 +01:00
9ebfda3263 Fixed VitDet for non-square Images (#35969)
* size tuple

* delete original input_size

* use zip

* process the other case

* Update src/transformers/models/vitdet/modeling_vitdet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* [VITDET] Test non-square image

* [Fix] Make Quality

* make fix style

* Update src/transformers/models/vitdet/modeling_vitdet.py

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-25 19:31:24 +00:00
cbe0ea59f3 Security fix for benchmark.yml (#36402)
security

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-25 17:22:09 +01:00
88d10517b4 Fix convert_to_rgb for SAM ImageProcessor (#36369) 2025-02-25 15:10:21 +00:00
e1ce948908 [CLI] add import guards (#36376)
* add import guards

* nit
2025-02-25 15:06:50 +00:00
fb83befb14 Fix pytorch integration tests for SAM (#36397)
Fix device in tests
2025-02-25 14:53:34 +00:00
ca6ebcb9bc chore: fix function argument descriptions (#36392) 2025-02-25 14:28:34 +00:00
7c8916ddb5 fix audio classification pipeline fp16 test on cuda (#36359)
* fix audio classification pipeline fp16 test on cuda

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add comments

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update tests/pipelines/test_pipelines_audio_classification.py

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 15:01:25 +01:00
c3700b0eee [tests] enable autoawq tests on XPU (#36327)
add autoawq

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 13:38:09 +01:00
b4b9da6d9b tests: revert change of torch_require_multi_gpu to be device agnostic (#35721)
* tests: revert change of torch_require_multi_gpu to be device agnostic

Commit 11c27dd33 modified `torch_require_multi_gpu()` to be device-agnostic
instead of CUDA-specific. This broke some tests which are rightfully
CUDA-specific, such as:

* `tests/trainer/test_trainer_distributed.py::TestTrainerDistributed`

In the current Transformers tests architecture `require_torch_multi_accelerator()`
should be used to mark multi-GPU tests agnostic to device.

This change addresses the issue introduced by 11c27dd33 and reverts
modification of `torch_require_multi_gpu()`.

Fixes: 11c27dd33 ("Enable BNB multi-backend support (#31098)")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
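
A short sketch of the intended split (both decorator names appear above; the test bodies are placeholders):

```python
from transformers.testing_utils import (
    require_torch_multi_accelerator,
    require_torch_multi_gpu,
)

@require_torch_multi_gpu  # CUDA-specific multi-GPU test
def test_cuda_only_distributed():
    ...

@require_torch_multi_accelerator  # device-agnostic multi-accelerator test
def test_any_accelerator_distributed():
    ...
```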

* fix bug: modification of frozen set

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 13:36:10 +01:00
d80d52b007 addressing the issue #34611 to make FlaxDinov2 compatible with any batch size (#35138)
fixed the batch_size error, all tests are passing

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-02-25 10:44:44 +00:00
3a02fe56c2 Added handling for length <2 of suppress_tokens for whisper (#36336)
* Update generation_whisper.py

Added handling for suppress_tokens of length < 2 for Whisper

* Updated None check for suppress_tokens to avoid ambiguity

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-02-25 10:32:49 +00:00
da4ab2a1b6 Fix doc formatting in forward passes & modular (#36243)
* fix indentation issues + modular without magic keyword

* style

* Update doc.py

* style

* Fix all decorators indentation

* all models

* style

* style

* Update doc.py

* fix

* general fix

* style
2025-02-25 11:09:01 +01:00
92abc0dae8 Update _get_eval_sampler to reflect Trainer.tokenizer is deprecation self.tokenizer -> self.processing_class (#36315)
* fix warning self.tokenizer -> self.processing_class

* formating change
2025-02-25 11:07:50 +01:00
9d6abf9778 enable torchao quantization on CPU (#36146)
* enable torchao quantization on CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix int4

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable CPU torchao tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix style

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao available

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao available

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao config cannot convert to json

* fix docs

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm to_dict to rebase

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* limited torchao version for CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix skip

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix cpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-25 11:06:52 +01:00
401543a825 Fix is_causal fail with compile (#36374)
fix
2025-02-25 10:44:56 +01:00
bc65f3fc1c [modular] Do not track imports in functions (#36279)
* Add check

* just check for function

* Update examples
2025-02-25 10:29:47 +01:00
4b5cf5496d Load models much faster on accelerator devices!! (#36380)
* caching allocator warmup

* Update modeling_utils.py

* reuse expanded map

* style
2025-02-25 09:41:22 +01:00
931e5f4ac3 Update modeling_llava_onevision.py (#36391)
Fixed a potential bug in modeling_llava_onevision.py
2025-02-25 09:34:50 +01:00
2ab7bdc403 notify new model merged to main (#36375)
notify new model

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-24 17:53:18 +01:00
05dfed06d7 [Modeling] Reduce runtime when loading missing keys (#36312)
* hoist keys

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove hoist

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-02-24 16:10:28 +00:00
18276b03f7 fix(type): padding_side type should be Optional[str] (#36326) 2025-02-24 16:09:42 +00:00
f4684a6eb2 Update amd pytorch index to match base image (#36347)
pip pytorch index should match docker base image
2025-02-24 16:17:20 +01:00
2af272c101 Add autoquant support for torchao quantizer (#35503)
* Add autoquant support for torchao quantizer

Summary:
att, also verified that autoquantized model can be saved and loaded:

save: https://gist.github.com/jerryzh168/01d367aaf44dbbbfd4068a4a10a00061
load: https://gist.github.com/jerryzh168/d5c6c401b2abdf18e0b6771341f1525c

Test Plan:
tested locally with above script
model uploaded to https://huggingface.co/jerryzh168/llama3-8b-autoquant
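
A hedged sketch of using the new mode, assuming `TorchAoConfig` accepts an `"autoquant"` quantization type as added here (checkpoint name and exact argument spelling are placeholders):

```python
from transformers import AutoModelForCausalLM, TorchAoConfig

quant_config = TorchAoConfig("autoquant")  # assumed spelling of the new mode
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # placeholder checkpoint
    quantization_config=quant_config,
    device_map="auto",
)
```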

Reviewers:

Subscribers:

Tasks:

Tags:

* add test

* ruff fix

* ruff reformat

* add docs and min_sqnr support

* format

* format

* fix test

* update doc

* format

* remove disable_compile

* format
2025-02-24 15:54:16 +01:00
977a61f743 Change slack channel for mi250 CI to amd-hf-ci (#36346) 2025-02-24 15:50:06 +01:00
884a8ea1f0 Improve model loading for compressed tensor models (#36152)
* Disable warnings for stacked compressors
* Introduce two new hooks in HfQuantizer lifecycle
to allow updates to missing and unexpected keys
* Update missing and unexpected keys
for stacked compressors
* Add tests
* Fix: run_compressed cases
* Fix: uncompressed cases

* Rename compressed_tensor folder to compressed_tensors
Move RunCompressedTest to the same file
Update tests to unittest
2025-02-24 13:47:21 +01:00
4dbf17c17f [tests] enable bnb tests on xpu (#36233)
* fix failed test

* fix device

* fix more device cases

* add more cases

* fix empty cache

* Update test_4bit.py

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-24 11:30:15 +01:00
92c5ca9dd7 Fix exploitable regexes in Nougat and GPTSan/GPTJNeoXJapanese (#36121)
* Fix potential regex catastrophic backtracking in NougatTokenizerFast

The original regex pattern in tokenization_nougat_fast.py was vulnerable to
catastrophic backtracking due to greedy quantifiers and nested alternations.
This commit replaces it with a more efficient pattern that:

1. Uses explicit character classes instead of dot (.)
2. Handles whitespace more precisely
3. Avoids unnecessary backtracking
4. Supports both lowercase and uppercase roman numerals
5. Maintains the same functionality while being more robust
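
An illustrative pattern (not the actual Nougat one) showing the failure mode being avoided:

```python
import re

# Nested quantifiers over overlapping alternatives can backtrack exponentially
# on non-matching input -- the classic (a+)+ shape.
evil = re.compile(r"^(a+)+$")
# evil.match("a" * 30 + "b")  # do not run: exponential backtracking

# Same language without the nesting: fails fast on the same input.
safe = re.compile(r"^a+$")
print(safe.match("a" * 30 + "b"))  # None, returned immediately
```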

* Try another regex

* Trying deepseek's answer

* Start with a simplification

* Another simplification

* Just rewrite the whole function myself

* Fix gptneox and gptsan

* Simplify the regex even further

* Tighten up the price regex a little

* Add possessive version of the regex

* Fix regex

* Much cleaner regexes

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-02-21 19:49:51 +00:00
547911e727 Uses Collection in transformers.image_transforms.normalize (#36301)
* Uses Collection instead of Sequence in transformers.image_transforms.normalize

* Uses collections.abc.Collection in lieu of deprecated typing one
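
A small sketch of why `Collection` is the looser, structural bound (a numpy array is used as the example input):

```python
from collections.abc import Collection, Sequence

import numpy as np

arr = np.array([0.5, 0.5, 0.5])
# Collection structurally matches anything with __len__/__iter__/__contains__...
print(isinstance(arr, Collection))  # True
# ...while Sequence requires explicit registration, which ndarray lacks.
print(isinstance(arr, Sequence))    # False
```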
2025-02-21 18:38:41 +01:00
7c5bd24ffa [tests] make quanto tests device-agnostic (#36328)
* make device-agnostic

* name change
2025-02-21 14:20:40 +01:00
678885bbbd [CI] Check test if the GenerationTesterMixin inheritance is correct 🐛 🔫 (#36180) 2025-02-21 10:18:20 +00:00
a957b7911a Add SigLIP 2 (#36323)
* Docs

* Inits

* Auto classes

* Add siglip base

* Add base tests

* Fix Siglip V1 for fix res version

* Add image processor

* Update conversion

* Experimenting with vectorized embeddings

* Fixup

* Add modular Siglip2Processor

* Add modular configuration

* Rename num patches

* Correct image and text features merging

* Working conversion script

* Refactoring conversion script

* Remove unused code in conversion script

* Shorten dict a bit

* Refactoring conversion

* Done conversion refactoring

* Fixup

* Modular siglip2

* Make model exportable and compilable without graph breaks

* Remove position_ids from image_processor

* Remove position ids from modeling file

* Update modular

* Type hint

* Fixup

* Set defaults to processor

* Add integration test

* Revert spatial shapes back to tensor

* Change order

* Fix most of the tests

* Fix docstring

* Remove interpolate_pos_encoding arg (not needed)

* Update docs

* Standardize processing

* Fix attention_mask in vision head

* Siglip v1: remove double transpose in FA2

* Update modular file

* Update FA2 test

* Update expected logits

* Fix interpolation for siglip2 image processor

* Skip init test

* Skip dispatch on flash test

* Fix modeling tests

* Fixup

* Add dummy objects

* Fix some docstrings

* Add siglip2 in index.md

* Fix consistency

* Add docs

* Remove size and data format

* Add image processor tests

* Fix

* Add fast image processor

* Fix style

* Fix

* Docs

* Set lowercase for tokenizer

* Adjust head size for Siglip v1

* Update siglip2 for consistency with siglip1

* Update siglip2 conversion

* Update pipeline

* Update checkpoints in tests

* Update checkpoint name

* Fix pooling for image classification model

* Fix FA2 test

* Update processor

* Fix check repo

* Update docs

* Fix typos

* Fix docstring for fast image processor

* Add siglip2 to FA2 docs

* Fix fast ip tests

* Fix consistency

* Fix tokenizer class for siglip v1

* Fix missing header

* Refactor scaling for clip, siglip, siglip2

* Remove unused imports

* Make fast IP default for siglip2

* Update docs

* Update checkpoints

* Update modular

* Update paper link

* Fixup

* Fix name in toctree

* Fix test
2025-02-21 09:04:19 +00:00
14552cbd7c VLMs: even more clean-up (#36249)
* squash

* style
2025-02-21 09:46:31 +01:00
e18f233f6c Fix default attention mask of generate in MoshiForConditionalGeneration (#36171) 2025-02-20 19:53:27 +00:00
27d1707586 [smolvlm] make CI green (#36306)
* add smolvlm to toctree

* add requirements

* dev-ci

* no docker changes

* dev-ci

* update torch-light.dockerfile

* derp

* dev-ci
2025-02-20 18:56:11 +01:00
effaef334b fix: prevent second save in the end of training if last step was saved already (#36219)
* fix: prevent second save in the end of training

* fix: prevent second save in the end of training

* test: added test for no duplicate save on epoch save strategy

* fix: removed TrainerControl

* chore: style formatting

---------

Co-authored-by: JaktensTid <jaktenstid1@gmail.com>
2025-02-20 17:38:52 +01:00
5412ff1a13 Fix typo in Pixtral example (#36302)
Fix typo
2025-02-20 14:13:48 +00:00
4397dfcb71 SmolVLM2 (#36126)
* smolvlm init

* updates

* fixing bugs

* minimal run, no checks

* minimal run, no checks

* passing first check + adding url support

* updating video dataloading logic

* fixing image logic

* trying modular, but fails

* modular is working, changing processor to match PR comments and general transformers logic

* fixing kwargs

* offloading video loading logic to image_util

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* update

* add idefics3-based tests

* add keyword to all

* add PreTrainedModel

* updating video loading logic

* working inference

* updates for PR comments

* updates for PR comments

* moving SmolVLMPretrainedModel higher to fix import error

* CI test pass

* CI test pass

* removing lambda

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* processor tests

* add example in docs

* typo

* fix copies

* skip compile tests - sdpa for VisionTransformer

* fix init

* raise import error for num2words

* update doc for FA2

* more doc fix

* CI

* updates for PR comments

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* fixing processor -- the tokenizer was not defined properly (a gpt2 tokenizer) and was missing attributes such as the fake image token

* adding smolvlm to VQA models

* removing vqa auto class

* Update src/transformers/models/smolvlm/processing_smolvlm.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* removing smolvlmvisiontransformer from index.md

* my bad, video processing had typos

* fixing docs

* renaming params in SmolVLMModel.inputs_merger

* removing un-needed dtype/device in model forward

* ruff for CI

* update docs

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* return cache position

* return cache position

* return cache also in modular

* needed to run modular again

* fix training tests

* push vectorized inputs merger

* format

* format

* reduce number of mappings

* addressing PR comments

* happy CI, happy me :)

* skip non-nested images

* adjust integration test for smaller GPUs

* format

* fix kwargs in chat template apply

* skip this for now

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-02-20 15:00:26 +01:00
f2ab182dca Ignore conversion files in test fetcher (#36251)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-20 13:32:02 +01:00
e8531a0e33 Fix broken CI on release branch due to missing conversion files (#36275)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-20 13:22:10 +01:00
5e2183f344 Make cache traceable (#35873)
simply make cache traceable
2025-02-20 09:59:25 +01:00
31bb662db1 Fix callback handler reference (#36250)
* fix reference

* style
2025-02-19 18:17:33 +01:00
78d6484675 docs: Update README_zh-hans.md (#36269)
Update README_zh-hans.md

docs: Fix awkward sentence in README
2025-02-19 09:04:46 -08:00
e5cea20743 Add Example for Custom quantization (#36286)
* add example

* rename
2025-02-19 17:09:23 +01:00
e3d99ec2f5 [tests] make test_from_pretrained_low_cpu_mem_usage_equal less flaky (#36255)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-19 15:14:02 +00:00
99adc74462 [tests] remove flax-pt equivalence and cross tests (#36283) 2025-02-19 15:13:27 +00:00
fa8cdccd91 [tests] deflake dither test (#36284) 2025-02-19 15:13:10 +00:00
60226c6ff3 TP initialization module-by-module (#35996)
* module-by-module loading!

* Update modeling_utils.py

* style and comments

* Update modeling_utils.py

* Update modeling_utils.py

* Update test

* Update modeling_utils.py

* Update modeling_utils.py

* Update test_tp.py

* Update test_tp.py

* Update modeling_utils.py

* re-trigger CIs

* re-trigger CIs
2025-02-19 14:04:57 +01:00
0863eef248 [tests] remove pt_tf equivalence tests (#36253) 2025-02-19 11:55:11 +00:00
1a81d774b1 Add dithering to the Speech2TextFeatureExtractor API. (#34638)
* Add dithering to the `Speech2TextFeatureExtractor` API.

- in kaldi : 4a8b7f6732/src/feat/feature-window.cc (L145)
- with dithering enabled and no seed fixed, the features become non-deterministic
  due to the small Gaussian noise added to the audio (i.e. two runs lead to
  slightly different outputs); a small sketch follows this entry

* update the PR

- add dithering also for WhisperFeatureExtractor
- not adding to Wav2Vec2FeatureExtractor (no FBANK computation)

* add unit-tests for dithering, fix docstrings

* ruff

* utils/check_copies.py --fix_and_overwrite

* update code, add seed to unit-test

* adding explanation of dithering
2025-02-19 11:50:02 +01:00
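A sketch of what Kaldi-style dithering amounts to, with illustrative names rather than the exact feature-extractor API:

```python
from typing import Optional

import numpy as np

def apply_dither(waveform: np.ndarray, dither: float = 0.0,
                 rng: Optional[np.random.Generator] = None) -> np.ndarray:
    # Add tiny Gaussian noise before FBANK extraction so exact-zero frames
    # don't blow up in the log, at the cost of determinism: without a fixed
    # seed, two runs produce slightly different features (hence the seeded
    # unit tests mentioned above).
    if dither == 0.0:
        return waveform
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal(waveform.shape).astype(waveform.dtype)
    return waveform + dither * noise

audio = np.zeros(16000, dtype=np.float32)  # 1 second of silence at 16 kHz
dithered = apply_dither(audio, dither=1e-5, rng=np.random.default_rng(0))
print(dithered.std())                      # ~1e-5 rather than exactly 0
```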
9f51dc2535 Add support for post-processing kwargs in image-text-to-text pipeline (#35374)
* fix error and improve pipeline

* add processing_kwargs to apply_chat_template

* change default post_process kwarg to args

* Fix slow tests

* fix copies
2025-02-18 17:43:36 -05:00
9b479a245b Uniformize LlavaNextVideoProcessor kwargs (#35613)
* Uniformize processor kwargs and add tests

* add videos_kwargs tests

* fix copies

* fix llava_next_video chat template tests

* remove unnecessary default kwargs
2025-02-18 14:13:51 -05:00
8ee50537fe Qwen2VL fix cos,sin dtypes to float when used with deepspeed (#36188)
* fix dtype of cos,sin when used with deepspeed

* move sin,cos casting within flash attention functions

* fix cos,sin float casting in modular

---------

Co-authored-by: ardalan.mehrani <ardalan.mehrani@ardalanmehranis-MacBook-Pro.local>
Co-authored-by: ardalan.mehrani <ardalan.mehrani@bytedance.com>
2025-02-18 19:18:29 +01:00
8eaae6bee9 Added Support for Custom Quantization (#35915)
* Added Support for Custom Quantization

* Update code

* code reformatted

* Updated Changes

* Updated Changes

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-18 16:14:19 +01:00
07182b2e10 GitModelIntegrationTest - flatten the expected slice tensor (#36260)
Flatten the expected slice tensor
2025-02-18 16:04:19 +01:00
4d2de5f63c Fix XGLM loss computation (PyTorch and TensorFlow) (#35878)
* Fix XGLM loss computation (PyTorch and TensorFlow)

* Update expected output string in XGLM sample test

This updates the expected output string of test_xglm_sample for torch
2.0 to the correct one and removes the one for torch 1.13.1 + cu116
(transformers moved to torch 2.0 with PR #35358).

* Update expected output IDs in XGLM generation test
2025-02-18 15:37:48 +01:00
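For reference, the standard next-token loss that causal LMs in the library compute; a generic sketch, not the XGLM-specific patch:

```python
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Position t predicts token t+1: drop the last logit, drop the first
    # label, then take cross-entropy over the flattened pairs.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,  # masked/padding positions contribute nothing
    )

logits = torch.randn(2, 8, 32)           # (batch, seq_len, vocab_size)
labels = torch.randint(0, 32, (2, 8))
print(causal_lm_loss(logits, labels))
```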
c3ba53303b feat: add support for tensor parallel training workflow with accelerate (#34194)
* feat: add support for tensor parallel flow using accelerate

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: add tp degree to env variable

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: add version check for accelerate to allow TP

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* docs: tensor parallelism

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* nit: rename plugin name

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: guard accelerate version before allow tp

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* docs: add more docs and updates related to TP

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

---------

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-18 14:05:46 +01:00
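The version-guard pattern mentioned in the bullets, sketched with an illustrative threshold (not the real minimum accelerate version):

```python
import accelerate
from packaging import version

MIN_ACCELERATE_FOR_TP = version.parse("1.3.0")  # illustrative bound only

if version.parse(accelerate.__version__) < MIN_ACCELERATE_FOR_TP:
    raise ImportError(
        "Tensor-parallel training needs accelerate >= "
        f"{MIN_ACCELERATE_FOR_TP}, found {accelerate.__version__}"
    )
```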
e6cc410d5b Remove flakiness in VLMs (#36242)
* fix

* nit

* no logits processor needed

* two more tests on assisted decoding
2025-02-18 11:41:07 +01:00
fdcfdbfd22 Fix TorchAoConfig not JSON serializable (#36206)
**Summary:** TorchAoConfig optionally contains a
`torchao.dtypes.Layout` object which is a dataclass and not
JSON serializable, and so the following fails:

```
import json
from torchao.dtypes import TensorCoreTiledLayout
from transformers import TorchAoConfig

config = TorchAoConfig("int4_weight_only", layout=TensorCoreTiledLayout())

config.to_json_string()

json.dumps(config.to_dict())
```

This also causes `quantized_model.save_pretrained(...)` to
fail because the first step of this call is to JSON serialize
the config. Fixes https://github.com/pytorch/ao/issues/1704.

**Test Plan:**
python tests/quantization/torchao_integration/test_torchao.py -k test_json_serializable

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-18 11:05:42 +01:00
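A generic way to clear this class of error, shown with a stand-in dataclass; one plausible shape of the fix, not necessarily the exact code the PR landed:

```python
import dataclasses
import json

@dataclasses.dataclass
class TensorCoreTiledLayout:  # stand-in for the torchao layout dataclass
    inner_k_tiles: int = 8

def _json_default(obj):
    # json.dumps calls this hook for any object it cannot encode natively;
    # dataclasses become plain dicts, everything else still fails loudly.
    if dataclasses.is_dataclass(obj):
        return dataclasses.asdict(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

config_dict = {"quant_type": "int4_weight_only", "layout": TensorCoreTiledLayout()}
print(json.dumps(config_dict, default=_json_default))
# {"quant_type": "int4_weight_only", "layout": {"inner_k_tiles": 8}}
```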
626666c444 Au revoir flaky test_fast_is_faster_than_slow (#36240)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-17 18:30:07 +01:00
429f1a682d [tests] remove test_export_to_onnx (#36241) 2025-02-17 16:52:44 +00:00
dae8708c36 Add compressed tensor in quant dockerfile (#36239)
add compressed_tensors in the dockerfile
2025-02-17 17:48:57 +01:00
3e970dbbf1 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/codeparrot/examples (#36237)
Bump transformers in /examples/research_projects/codeparrot/examples

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-17 16:28:43 +00:00
77aa9fc076 [generate] Fix encoder decoder models attention mask (#36018) 2025-02-17 15:42:28 +00:00
55493f1390 [tests] remove tf/flax tests in /generation (#36235) 2025-02-17 14:59:22 +00:00
c877c9fa5b v4.45.0-dev0 2025-02-17 15:21:20 +01:00
7ec35bc3bd Add missing atol to torch.testing.assert_close where rtol is specified (#36234) 2025-02-17 14:57:50 +01:00
dad513e0c2 [generate] remove cache v4.47 deprecations (#36212) 2025-02-17 13:55:03 +00:00
936aeb70ab AMD DeepSpeed image additional HIP dependencies (#36195)
* Add hipsolver and hipblastlt as dependencies

* Upgrade torch libs with rocm6.2.4 index
2025-02-17 11:50:49 +01:00
23d6095e8f Fix LlavaForConditionalGenerationModelTest::test_config after #36077 (#36230)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-17 11:49:07 +01:00
fae0f3dde8 [tests] fix EsmModelIntegrationTest::test_inference_bitsandbytes (#36225)
fix failed test
2025-02-17 11:10:33 +01:00
dd16acb8a3 set test_torchscript = False for Blip2 testing (#35972)
* just skip

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-14 17:43:32 +01:00
0a9923a609 Use args.num_workers in check_modular_conversion.py (#36200)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-14 17:31:03 +01:00
a570e2ba87 add shared experts for upcoming Granite 4.0 language models (#35894)
* Modular GraniteMoE with shared Experts.

Signed-off-by: Shawn Tan <shawntan@ibm.com>

* Modified

* Import order.

* Modified for style

* Fix space.

* Test

* Remove extra granitemoe file.

* New converted file and tests

* Modified __init__ files.

* Formatting.

* Dummy PT objects

* register granitemoe shared model

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix linting of a file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix import in modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update generated modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* add documentation

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update docstrings

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update generated modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix docstrings in config class

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* merge main

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by: Shawn Tan <shawntan@ibm.com>
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Shawn Tan <shawntan@ibm.com>
Co-authored-by: Shawn Tan <shawn@wtf.sg>
Co-authored-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Sukriti Sharma <Ssukriti@users.noreply.github.com>
2025-02-14 16:55:28 +01:00
7ae7e87a09 Add @require_bitsandbytes to Aria test_batched_generation (#36192) 2025-02-14 15:48:47 +01:00
bcfc9d795e [Bugfix] Fix reloading of pixtral/llava configs (#36077)
* add is_composition flag to LlavaConfig

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* WIP: pixtral text config

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add test

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use is_composition for pixtral

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Revert "use is_composition for pixtral"

This reverts commit a53d5f9fc5149c84419b0e9e03db6d99362add53.

* Revert "Revert "use is_composition for pixtral""

This reverts commit 3ab1c99404e2c2963fba0bcf94b9786d6365db0f.

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-02-14 15:27:05 +01:00
0c78ef6cd3 🔴 VLM: compile compatibility (#35724)
* llavas

* add mroe models

* fix `compile_forward` test for all models

* fix copies

* make style

* also doesn't support cache class

* fix some tests

* not copied from

* ci green?

* fix tests

* fix copies

* fix tests

* check with `numel` and remove `item`

* fix copies

* fix copies

* Update src/transformers/models/cohere2/modeling_cohere2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* opt remove cross attn

* gemma2

* fixup

* fixup

* fix newly added test

* maybe fixed?

* green please?

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-14 15:23:49 +01:00
b45cf0e90a Guard against unset resolved_archive_file (#35628)
* archive_file may not be specified
When loading a pre-trained model from a gguf file, resolved_archive_file may not be set. Guard against that case in the safetensors availability check.

* Remap partial disk offload to cpu for GGUF files
GGUF files don't support disk offload so attempt to remap them to the CPU when device_map is auto. If device_map is anything else but None, raise a NotImplementedError.

* Don't remap auto device_map and raise RuntimeError
If device_map=auto and modules are selected for disk offload, don't attempt to map them to any other device. Raise a runtime error when a GGUF model is configured to map any modules to disk.

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-14 14:44:31 +01:00
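A minimal sketch of the final behavior the bullets converge on, with a hypothetical helper name:

```python
def check_gguf_device_map(device_map: dict) -> None:
    # GGUF checkpoints don't support disk offload, so abort loading if
    # accelerate's device_map (auto-inferred or user-provided) places any
    # module on "disk", rather than silently remapping it elsewhere.
    disk_modules = [name for name, device in device_map.items() if device == "disk"]
    if disk_modules:
        raise RuntimeError(
            f"GGUF models cannot offload modules to disk: {disk_modules}"
        )

check_gguf_device_map({"model.embed_tokens": 0, "lm_head": "cpu"})  # passes
check_gguf_device_map({"lm_head": "disk"})                          # raises
```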
96f01a36ac Revert qwen2 breaking changes related to attention refactor (#36162)
* ditto

* add a test

* update

* test needs fa2

* update test and configuration

* test requires fa2

* style
2025-02-14 13:44:14 +01:00
cb586a3999 Add require_read_token to fp8 tests (#36189)
fix
2025-02-14 12:27:35 +01:00
5f726f8b8e New HIGGS quantization interfaces, JIT kernel compilation support. (#36148)
* new flute

* new higgs working

* small adjustments

* progress and quallity

* small updates

* style

---------

Co-authored-by: Andrey Panferov <panferov.andrey3@wb.ru>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-14 12:26:45 +01:00
15ec971b8e Prepare processors for VideoLLMs (#36149)
* allow processor to preprocess conversation + video metadata

* allow callable

* add test

* fix test

* nit: fix

* add metadata frames_indices

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* port updates from Orr and add one more test

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* typo

* as dataclass

* style

* docstring + make sure tests green

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-02-14 11:34:08 +01:00
33d1d715b0 Add ImageProcessorFast to Qwen2.5-VL processor (#36164)
* add qwen2 fast image processor to modular file

Signed-off-by: isotr0py <2037008807@qq.com>

* fix modular

Signed-off-by: isotr0py <2037008807@qq.com>

* fix circle import

Signed-off-by: isotr0py <2037008807@qq.com>

* add docs

Signed-off-by: isotr0py <2037008807@qq.com>

* fix typo

Signed-off-by: isotr0py <2037008807@qq.com>

* add modular generated files

Signed-off-by: isotr0py <2037008807@qq.com>

* revert qwen2vl fast image processor

Signed-off-by: isotr0py <2037008807@qq.com>

* remove qwen2.5-vl image processor from modular

Signed-off-by: isotr0py <2037008807@qq.com>

* re-generate qwen2.5-vl files

Signed-off-by: isotr0py <2037008807@qq.com>

* remove unnecessary test

Signed-off-by: isotr0py <2037008807@qq.com>

* fix auto map

Signed-off-by: isotr0py <2037008807@qq.com>

* cleanup

Signed-off-by: isotr0py <2037008807@qq.com>

* fix model_input_names

Signed-off-by: isotr0py <2037008807@qq.com>

* remove import

Signed-off-by: isotr0py <2037008807@qq.com>

* make fix-copies

Signed-off-by: isotr0py <2037008807@qq.com>

---------

Signed-off-by: isotr0py <2037008807@qq.com>
2025-02-14 17:34:55 +08:00
1931a35140 Chat template docs (#36163)
* decompose chat template docs

* add docs

* update model docs

* qwen2-5

* pixtral

* remove old chat template

* also support video given as a list of frames

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* remove audio for now

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-14 10:32:14 +01:00
3bf02cf440 CI: fix test-save-trainer (#36191)
* fix

* also the docstring
2025-02-14 10:20:56 +01:00
0ae93d31ce Add support for partial rotary embeddings in Phi3 model (#35947)
* Added support for partial_rotary_factor

* addressed comments

* refactored
2025-02-14 09:37:38 +01:00
336dc69d63 Uniformize OwlViT and Owlv2 processors (#35700)
* uniformize owlvit processor

* uniformize owlv2

* nit

* add positional arg test owlvit

* run-slow: owlvit, owlv2

* run-slow: owlvit, owlv2

* remove one letter variable
2025-02-13 17:30:26 -05:00
e6a7981711 Fix make_batched_videos and add tests (#36143)
* add support for initial shift in video processing and other fixes

* revert modifications to video loading functions
2025-02-13 17:14:30 -05:00
8fd4bc7d1d Fix a mistake in #36175 (#36179)
fix my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 18:33:02 +01:00
b1a2de075d Follow up to SpQR integration (#36176)
fix
2025-02-13 17:40:59 +01:00
12962fe84b Fix the key name for _load_rng_state under torch.cuda (#36138)
fix load key name for _load_rng_state under torch.cuda

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 11:35:08 -05:00
bfe46c98b5 Make check_repository_consistency run faster by MP (#36175)
* speeddddd

* speeddddd

* speeddddd

* speeddddd

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 17:25:17 +01:00
5f0fd1185b Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks (#35837)
* Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks

* Make rotary_pos_emb optional & fix type

* Adapt pre-computed cos/sin to Qwen2.5VL

* More concise
2025-02-13 17:10:58 +01:00
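The shape of the optimization, sketched with illustrative names: the rotary cos/sin tables depend only on the patch positions, so they can be built once and threaded through every ViT block instead of being recomputed per block:

```python
import torch

def precompute_rope(positions: torch.Tensor, dim: int, theta: float = 10000.0):
    # cos/sin tables for rotary embeddings, built once from the (flattened)
    # patch positions before the vision blocks run.
    inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    freqs = torch.outer(positions.float(), inv_freq)
    return freqs.cos(), freqs.sin()

positions = torch.arange(256)    # e.g. 16x16 patches, flattened
cos, sin = precompute_rope(positions, dim=64)
# for block in vision_blocks:                  # reused, not recomputed
#     hidden_states = block(hidden_states, cos=cos, sin=sin)
```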
d72642bccc Use tqdm auto (#35726)
* Remove traces of the progressbar

* Use tqdm auto
2025-02-13 15:41:30 +00:00
62c7ea0201 CI: avoid human error, automatically infer generative models (#33212)
* tmp commit

* move tests to the right class

* remove ALL all_generative_model_classes = ...

* skip tf roberta

* skip InstructBlipForConditionalGenerationDecoderOnlyTest

* videollava

* reduce diff

* reduce diff

* remove  on vlms

* fix a few more

* manual rebase bits

* more manual rebase

* remove all manual generative model class test entries

* fix up to ernie

* a few more removals

* handle remaining cases

* recurrent gemma

* it's better here

* make fixup

* tf idefics is broken

* tf bert + generate is broken

* don't touch tf :()

* don't touch tf :(

* make fixup

* better comments for test skips

* revert tf changes

* remove empty line removal

* one more

* missing one
2025-02-13 16:27:11 +01:00
06231fdfc7 add disable compile option (#36161)
* add disable compile code

* fix
2025-02-13 16:24:46 +01:00
0ca7259217 fix training issues (#36158)
* fix training issues

* Update

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 16:24:28 +01:00
845b0a2616 Efficient Inference Kernel for SpQR (#34976)
* Resolve vptq conflict

* Rename spqr package to spqr_quant

* Get rid of aqlm mention

* Start working on tests

* Resolve ruff code checks

* Ruff format

* Isort

* Test updates

* Add gpu tag

* Rename to modules_to_not_convert

* Config update

* Docs and config update

* Docs and config update

* Update to update_torch_dtype

* spqr config parameter validation

* Ruff update

* Apply ruff fixes

* Test fixes

* Ruff update

* Mark tests as @slow again; Ruff; Docstring update

* Ruff

* Remove absolute path

* Resolve typo

* Remove redundant log

* Check accelerate/spqr availability

* Ruff fix

* Check if the config contains proper shapes

* Ruff test

* Documentation update

* overview update

* Ruff checks

* Ruff code quality

* Make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update spqr.md

* Enable gptqmodel (#35012)

* gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update readme

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix warning

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix version check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert unrelated changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix requires gptq

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format again

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attributes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix memory check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix result check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix : Nemotron Processor in GGUF conversion (#35708)

* fixing nemotron processor

* make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add missing TOC to doc

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-13 16:22:58 +01:00
c5506f4f00 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/adversarial (#36168)
Bump transformers in /examples/research_projects/adversarial

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 15:06:16 +00:00
d7c5d1b539 Bump transformers from 4.38.0 to 4.48.0 in /examples/tensorflow/language-modeling-tpu (#36167)
Bump transformers in /examples/tensorflow/language-modeling-tpu

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 14:46:38 +00:00
636ee57489 [generate] revert change in Aria: the maximum cache length must match max_length (#36120)
* revert inputs_embeds len

* Update test_utils.py

* make fixup
2025-02-13 14:36:33 +00:00
b41591d847 Fix : fix doc fp8 (#36173)
* fix

* fix
2025-02-13 15:29:59 +01:00
b079dd1fa2 Fix red CI (#36174)
test was weird
2025-02-13 14:27:55 +01:00
d114a6f78e [Modular] skip modular checks based on diff (#36130)
skip modular checks based on diff
2025-02-13 12:53:21 +00:00
6397916dd2 Remove loading custom kernel for RT-DETRv2 (#36098)
* Remove loading custom kernels

* Remove config param

* Fixup
2025-02-13 12:01:53 +00:00
efe72fe21f Adding FP8 Quantization to transformers (#36026)
* first commit

* adding kernels

* fix create_quantized_param

* fix quantization logic

* end2end

* fix style

* fix imports

* fix consistency

* update

* fix style

* update

* udpate after review

* make style

* update

* update

* fix

* update

* fix docstring

* update

* update after review

* update

* fix scheme

* update

* update

* fix

* update

* fix docstring

* add source

* fix test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 13:01:19 +01:00
c82319b493 Helium documentation fixes (#36170)
* Helium documentation fixes

* Update helium.md

* Update helium.md

* Update helium.md
2025-02-13 12:20:53 +01:00
8f137b2427 Move DataCollatorForMultipleChoice from the docs to the package (#34763)
* Add implementation for DataCollatorForMultipleChoice based on docs.

* Add DataCollatorForMultipleChoice to import structure.

* Remove custom DataCollatorForMultipleChoice implementations from example scripts.

* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.

* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.

* Apply suggested changes and run make fixup.

* fix copies, style and fixup

* add missing documentation

* nits

* fix docstring

* style

* nits

* isort

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-02-13 12:01:28 +01:00
35c155052d Fix PretrainedTokenizerFast check => Fix PretrainedTokenizerFast Save (#35835)
* Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* add tokenizer class type test

* code review

* code opt

* fix bug

* Update test_tokenization_fast.py

* ruff check

* make style

* code opt

* Update test_tokenization_fast.py

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
2025-02-13 12:00:33 +01:00
3c912c9089 docs: fix return type annotation of get_default_model_revision (#35982) 2025-02-13 11:59:15 +01:00
6a1ab634b6 qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 (#36083)
* qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1

* fix

* fix

* fix

* fix

* add tests

* fix test bugs

* fix

* fix failed tests

* fix
2025-02-13 11:35:28 +01:00
d419862889 Fix tests for vision models (#35654)
* Trigger tests

* [run-slow] beit, detr, dinov2, vit, textnet

* Fix BEiT interpolate_pos_encoding

* Fix DETR test

* Update DINOv2 test

* Fix textnet

* Fix vit

* Fix DPT

* fix data2vec test

* Fix textnet test

* Update interpolation check

* Fix ZoeDepth tests

* Update interpolate embeddings for BEiT

* Apply suggestions from code review
2025-02-13 10:28:37 +00:00
e60ae0d078 Replace deprecated update_repo_visibility (#35970) 2025-02-13 11:27:55 +01:00
9065cf0d92 Fix Gemma2 dtype issue when storing weights in float16 precision (#35398)
fix gemma2 dtype issue when storing weights in float16 precision
2025-02-13 11:17:37 +01:00
08ab1abff4 Add reminder config to issue template and print DS version in env (#35156)
* update env command to log deepspeed version

* suppress deepspeed import logging

* Add reminder to include configs to repro description in bug report.

* make fixup

* [WIP] update import utils for deepspeed

* Change to using is_deepspeed_available() from integrations.

* make fixup
2025-02-13 10:55:49 +01:00
950cfb0b4f Fix PaliGemma Pad Token Masking During Training #35855 (#35859)
* change order of unmasking of tokens

* library import

* class setup

* test function

* refactor

* add commit message

* test modified

* explicit initialisation of weights + made model smaller

* removed separate testing file

* fixup

* fixup core

* test attention mask with token types

* tests fixup

* removed PaliGemmaAttentionMaskTest class

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-13 10:11:44 +01:00
1614d196e8 Mllama fsdp (#36000)
* pixel input assignment revoked

* double send

* Update src/transformers/models/mllama/modeling_mllama.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-13 09:49:39 +01:00
847854b023 Add git LFS to AMD docker image (#36016)
Add git lfs to AMD docker image
2025-02-12 22:27:21 +01:00
9985d06add skip test_initialization for VitPoseBackboneModelTest for now (#36154)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-12 18:24:24 +01:00
4a5a7b991a Fix test fetcher (#36129)
* fix

* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-12 17:35:41 +01:00
1fae54c721 Add more rigorous non-slow grad accum tests (#35668)
* Add more rigorous non-slow grad accum tests

* Further nits

* Re-add space

* Readability

* Use tinystories instead

* Revert transformer diff

* tweak threshs
2025-02-12 10:26:21 -05:00
f869d486d3 Update doc re list of models supporting TP (#35864)
Update doc about models' TP support
2025-02-12 15:53:27 +01:00
281c0c8b5b adding option to save/reload scaler (#34932)
* Adding option to save/reload scaler

* Removing duplicate variable

* Adding save/reload test

* Small fixes on deterministic algorithm call

* Moving LLM test to another file to isolate its environment

* Moving back to old file and using subprocess to run test isolated

* Reverting back accidental change

* Reverting back accidental change
2025-02-12 15:48:16 +01:00
a33ac830af Fix multi gpu loss sync condition, add doc and test (#35743)
* Fix multi gpu loss sync condition, add doc and test

* rename function and class

* loss should not scale during inference

* fix typo
2025-02-12 15:41:31 +01:00
08c4959a23 Optim: APOLLO optimizer integration (#36062)
* Added APOLLO optimizer integration

* fix comment

* Remove redundancy: Modularize low-rank optimizer construction

* Remove redundancy: Remove useless comment

* Fix comment: Add typing

* Fix comment: Rewrite apollo desc
2025-02-12 15:33:43 +01:00
2440512723 multi-gpu: fix tensor device placements for various models (#35763)
* multi-gpu: fix inputs_embeds + position_embeds

Fixing the following errors in few models:
```
>       hidden_states = inputs_embeds + pos_embeds
E       RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
```

Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* multi-gpu: fix tensor device placements for various models

Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Apply make fix-copies

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-02-12 15:28:18 +01:00
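The class of fix applied across these models, reduced to its essence:

```python
import torch

inputs_embeds = torch.randn(1, 4, 8)   # lives on one device (e.g. xpu:2)
pos_embeds = torch.randn(1, 4, 8)      # may live on another (e.g. xpu:3)

# Before: `inputs_embeds + pos_embeds` raises the RuntimeError quoted above
# when the two tensors end up on different devices under multi-GPU sharding.
# After: align devices explicitly at the point of use.
hidden_states = inputs_embeds + pos_embeds.to(inputs_embeds.device)
```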
befea8c4f0 🚨 Remove cache migration script (#35810)
* Remove cache migration script

* remove dummy move_cache
2025-02-12 15:12:38 +01:00
d52a9d08ce Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142)
Bump cryptography in /examples/research_projects/decision_transformer

Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:34:52 +00:00
31e4831b98 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/vqgan-clip (#36136)
Bump transformers in /examples/research_projects/vqgan-clip

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:21:09 +00:00
243aeb7c4a Fix Gradient Checkpointing for Deberta & Deberta-V2 using PEFT / Adapters (#35898)
Replace In-Place Operations for Deberta and Deberta-V2
2025-02-12 14:21:01 +01:00
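Why in-place ops clash with gradient checkpointing and adapters, in miniature; a generic sketch, not the Deberta diff:

```python
import torch

x = torch.ones(3, requires_grad=True)
h = torch.sigmoid(x)   # sigmoid's backward needs its own output h

# An in-place update rewrites h and bumps its version counter, so backward
# (or a checkpointed re-run) fails with "one of the variables needed for
# gradient computation has been modified by an inplace operation":
#   h += 1
# The out-of-place equivalent leaves the saved activation intact:
h = h + 1
h.sum().backward()
print(x.grad)
```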
8a2f062eac [commands] remove deprecated/inoperational commands (#35718)
rm deprecated/inoperational commands
2025-02-12 12:23:58 +00:00
8fc6ecba4f VLM: enable skipped tests (#35746)
* fix cached tests

* fix some tests

* fix pix2struct

* fix
2025-02-12 12:55:46 +01:00
d6897b46bd Add utility for Reload Transformers imports cache for development workflow #35508 (#35858)
* Reload transformers fix from cache

* add imports

* add test fn for clearing import cache

* ruff fix to core import logic

* ruff fix to test file

* fixup for imports

* fixup for test

* lru restore

* test check

* fix style changes

* added documentation for usecase

* fixing

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-12 12:45:11 +01:00
1cc7ca3295 Whisper: remove redundant assisted generation tests (#34814)
* remove redundant test

* delete another test

* revert default max_length

* (wrong place, moving)
2025-02-12 11:37:19 +00:00
0cd5e2dfd0 added warning to Trainer when label_names is not specified for PeftModel (#32085)
* feat: added warning to Trainer when label_names is not specified for PeftModel

* Update trainer.py

* feat: peft detect with `_is_peft_model`

* Update src/transformers/trainer.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Applied formatting in trainer.py

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-12 12:34:47 +01:00
377d8e2b9c add RAdamScheduleFree optimizer (#35313)
* add RAdamScheduleFree optimizer

* revert schedulefree version to the minimum requirement

* refine is_schedulefree_available so that it can take min_version

* refine documents

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-12 11:31:51 +01:00
f5fff672db Add pipeline parallel plan to PretrainedConfig and PreTrainedModel (#36091)
* Add `base_model_pp_plan` to `PretrainedConfig`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add `_pp_plan` to `PreTrainedModel`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add both to Llama for testing

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Fix type error

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update to suggested schema

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* `_pp_plan` keys are not patterns

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Simplify schema

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Fix typing error

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update input name for Llama

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Aria

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Bamba

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Cohere 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to diffllama and emu3

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Gemma 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to GLM and GPT NeoX

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Granite and Helium

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Mistral and Mixtral

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to OLMo 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Phi and Phi 3

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan for Starcoder 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add enum for accessing inputs and outputs

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update type hints to use tuples

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Change outer list to tuple

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-02-12 10:51:48 +01:00
11afab19c0 [docs] update awq doc (#36079)
* update awq doc

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add note for inference

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-11 10:35:28 -08:00
9b69986e8a [docs] minor doc fix (#36127)
fix
2025-02-11 10:31:12 -08:00
1b57de8dcf Make output_dir Optional in TrainingArguments #27866 (#35735)
* make output_dir optional

* initiated a basic testing module to validate and verify the changes

* Test output_dir defaults to 'tmp_trainer' when unspecified.

* test existing functionality of output_dir.

* test that output dir only created when needed

* final check

* added doc string and changed the tmp_trainer to trainer_output

* make style fixes to test file.

* another round of fixup

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-11 18:54:36 +01:00
03534a92f8 update tiktoken integ to use converted (#36135) 2025-02-11 18:27:22 +01:00
3a5c328fd8 Fix CI issues (#35662)
* make explicit gpu dep

* [run-slow] bamba
2025-02-11 18:17:01 +01:00
775252abd4 Fix max size deprecated warning (#34998)
* Remove unused `max_size` variable in processor which was always `None` and triggered an unnecessary deprecation warning

* Remove unused `max_size` variable in processor which was always `None` and triggered an unnecessary deprecation warning

* Remove deprecated warnings and eliminate `max_size` usage

* Test use `int` as argument for `size`
Add a test to ensure the change passes and preserves backward compatibility

* The test pipelines still use `max_size`
Remove `max_size` from test pipelines and replace it with a `size` `Dict` using `'shortest_edge'` and `'longest_edge'` as keys

* Reformatting

* Reformatting

* Revert "Reformatting"

This reverts commit c3040acee75440357cffd1f60c9d29ff5b2744b8.

* Revert "Reformatting"

This reverts commit ac4522e5c9a02d2d0c298295026db68ea26453df.

* Revert "The test pipelines still use `max_size`"

This reverts commit eaed96f041ffc32459536e1524d87f7a12ddee29.

* Revert "Test use `int` as argument for `size`"

This reverts commit 1925ee38c7c5eabb11832316712df1d4ba8043d0.

* Revert "Remove deprecated warnings and eliminate `max_size` usage"

This reverts commit d8e7e6ff9025931468fc1f3827cda1fa391003d5.

* Change version `4.26` to "a future version"

* Reformatting

* Revert "Change version `4.26` to "a future version""

This reverts commit 2b53f9e4
2025-02-11 18:14:31 +01:00
5489fea557 update awesome-transformers.md. (#36115)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-02-11 15:55:49 +00:00
76048be419 fix: typos in documentation files (#36122)
* Update tools.py

* Update text_generation.py

* Update question_answering.py
2025-02-11 13:47:20 +00:00
f42d46ccb4 Add common test for torch.export and fix some vision models (#35124)
* Add is_torch_greater_or_equal test decorator

* Add common test for torch.export

* Fix bit

* Fix focalnet

* Fix imagegpt

* Fix seggpt

* Fix swin2sr

* Enable torch.export test for vision models

* Enable test for video models

* Remove json

* Enable for hiera

* Enable for ijepa

* Fix detr

* Fic conditional_detr

* Fix maskformer

* Enable test maskformer

* Fix test for deformable detr

* Fix custom kernels for export in rt-detr and deformable-detr

* Enable test for all DPT

* Remove custom test for deformable detr

* Simplify test to use only kwargs for export

* Add comment

* Move compile_compatible_method_lru_cache to utils

* Fix beit export

* Fix deformable detr

* Fix copies data2vec<->beit

* Fix typos, update test to work with dict

* Add seed to the test

* Enable test for vit_mae

* Fix beit tests

* [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr

* Add vitpose test

* Add textnet test

* Add dinov2 with registers

* Update tests/test_modeling_common.py

* Switch to torch.testing.assert_close

* Fix maskformer

* Remove save-load from test

* Add dab_detr

* Add depth_pro

* Fix and test RT-DETRv2

* Fix dab_detr
2025-02-11 11:37:31 +00:00
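What the common test boils down to, shown with a toy module (illustrative, not the actual test code): the model must trace through `torch.export` without graph breaks, called with kwargs only, as one bullet above notes:

```python
import torch

class TinyVision(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Conv2d(3, 8, kernel_size=4, stride=4)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        # (B, 3, H, W) -> (B, num_patches, hidden): patchify then flatten
        return self.proj(pixel_values).flatten(2).transpose(1, 2)

model = TinyVision().eval()
exported = torch.export.export(
    model, args=(), kwargs={"pixel_values": torch.randn(1, 3, 32, 32)}
)
out = exported.module()(pixel_values=torch.randn(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 64, 8])
```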
1779f5180e Fix nighlty CIs: missing atols (#35903)
fix some missing atols
2025-02-11 10:49:21 +01:00
1feebb5b41 AutoformerForPrediction test add atol (#36017) 2025-02-10 19:22:24 +01:00
be2ac0916a [generate] shape checks in tests compatible with fixed-length caches (+ some minor fixes) (#35993)
* shape checks compatible with static cache

* add test

* tmp

* manually turn on eager attn when we want to output attn

* typo

* generalize to encoder-decoder models

* force compilation on cpu

* tmp commit

* fix static cache shape checks

* models with odd caches

* fix copies

* shorter cache search loop

* use decoder_past_key_values everywhere

* better test variable names and comments

* signature

* rename _check_outputs into _check_generate_outputs

* add comments

* HybridCache future test note
2025-02-10 17:50:54 +00:00
9510ae39d9 fix bnb warning (#36116)
fix
2025-02-10 17:34:50 +01:00
09261ccf12 [Bugfix] fix file name of docstring in utils/check_table.py (#36108)
fix file name

Co-authored-by: kkscilife <qa-caif-cicd@pjlab.org.cn>
2025-02-10 15:48:02 +00:00
d4a6b4099b Revert checkpoint tmp dir (#36112)
* Revert "Fix OS err (#36094)"

This reverts commit ba29a439adbe6f371710d0514659127264ae24b3.

* Revert "Save checkpoint to temporary directory to handle partial saves during failures (#35580)"

This reverts commit 20d17358c468b7aefca9e54c3461eb88d1ee34f9.
2025-02-10 16:22:03 +01:00
0baf003915 Refactor OPT model (#36101)
* remove cross attention

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* remove is_decoder

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix pkv

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-02-10 14:27:16 +01:00
924f1c717a Remove Multi-threaded image conversion for fast image processors (#36105)
remove multithreaded image conversion

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 07:59:34 -05:00
3897f2caf8 Enable pytest live log and show warning logs on GitHub Actions CI runs (#35912)
* fix

* remove

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-10 13:36:20 +01:00
48a309d0d2 Support constant lr with cooldown (#35453)
* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and decay methods to 'get_wsd_schedule'

* support num_training_steps and num_stable_steps for get_wsd_schedule

* support num_training_steps and num_stable_steps for get_wsd_schedule

* get wsd scheduler before the `num_training_steps` decision

* fix code_quality

* Update stable branch logic

* fix code_quality

* Move stable stage decide to `get_wsd_schedule`

* Update docstring of `get_wsd_schedule`

* Update `num_train_steps` to optional

* Update `num_train_steps` to optional

* Update docstring of `get_wsd_schedule`

* Update src/transformers/optimization.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 13:21:55 +01:00
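The warmup-stable-decay shape in its simplest form; a from-scratch sketch, whereas the real `get_wsd_schedule` supports several warmup and decay curves:

```python
from torch.optim.lr_scheduler import LambdaLR

def wsd_lambda(num_warmup_steps: int, num_stable_steps: int, num_decay_steps: int):
    # Linear warmup to the base LR, hold it constant, then linear cooldown.
    def lr_lambda(step: int) -> float:
        if step < num_warmup_steps:
            return step / max(1, num_warmup_steps)
        if step < num_warmup_steps + num_stable_steps:
            return 1.0
        decay_step = step - num_warmup_steps - num_stable_steps
        return max(0.0, 1.0 - decay_step / max(1, num_decay_steps))
    return lr_lambda

# usage: scheduler = LambdaLR(optimizer, wsd_lambda(100, 1000, 200))
```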
9a6be63fdb Add Apple's Depth-Pro for depth estimation (#34583)
* implement config and model building blocks

* refactor model architechture

* update model outputs

* update init param to include use_fov_model

* update param name in config

* fix hidden_states and attentions outputs for fov

* sort config

* complete minor todos

* update patching

* update config for encoder

* fix config

* use correct defaults in config

* update merge for compatibility with different image size

* restructure encoder for custom configuration

* make fov model compatible with custom config

* replace word "decoder" with "fusion"

* weight conversion script

* fix fov squeeze

* update conversion script (without test)

* upload ruff image processing

* create fast image processing

* use torch interpolation for image processing

* complete post_process_depth_estimation

* config: fix imports and sort args

* apply inference in weight conversion

* use mllama script instead for weight conversion

* clean weight conversion script

* add depth-pro status in other files

* fill docstring in config

* formatting

* more formatting

* formatting with ruff

* formatting with style

* fix copied classes

* add examples; update weight convert script

* fix using check_table.py and isort

* fix config docstring

* add depth pro to sdpa docs

* undo unintentional changes in configuration_gemma.py

* minor fixes

* test image processing

* fixes and tests

* more fixes

* use output states from image_encoder instead

* Revert "use output states from image_encoder instead"

This reverts commit 2408ec54e4f27d2abbecdb8374e58f34d91d8e96.

* make embeddings dynamic

* reshape output hidden states and attentions as part of computation graph

* fix ruff formatting

* fix docstring failure

* use num_fov_head_layers in tests

* update doc

* check consistency with config

* ruff formatting

* update test case

* fix ruff formatting

* add tests for fov

* use interpolation in postprocess

* run and fix slow tests locally

* use scaled_images_features for image and fov encoder

* return fused_hidden_states in fusion stage

* fix example

* fix ruff

* fix copyright license for all files

* add __all__ for each file

* minor fixes
- fix "download" spelling
- add push_to_hub option
- fix Optional type hinting
- apply single loop for DepthProImageProcessor.preprocess

* return list in post_process_depth_estimation

* minor fixes
- capitalize start of docstring
- use ignore copy
- fix examples
- move docstring templates and custom output classes to top
- remove "-> None" typehinting from __init__
- type hinting for forward passes
- fix docstrings for custom output classes

* fix "ruff check"

* update upsample and projection

* major changes: (image size and merge optimization)
- add support for images of any size
- optimize merge operation
- remove image_size from config
- use full names instead of B, C, H, W
- remove interpolation from fusion stage
- add interpolation after merge
- move validations to config
- update integration test
- add type hints for functions

* fix push_to_hub option in weights conversion

* remove image_size in weights conversion

* major changes in the architecture
- remove all DepthProViT modules and support different backbones using the AutoModel API
- set default use_fov_model to False
- validate parameters in configuration
- update interpolate function: use "nearest" for faster computation
- update reshape_feature function: remove all special tokens, possible from different backbones
- update merge function: use padding from config instead of merge_out_size
- remove patch_to_batch and batch_to_patch conversions for now
- calculate out_size dynamically in the encoder
- leave head_mask calculation to the backbone
- fix bugs with merge
- add more comments
- update tests

* placeholder for unused config attributes

* improve docs amid review

* minor change in docs

* further optimize merge

* fix formatting

* remove unused patch/batch conversion functions

* use original F.interpolate

* improve function naming

* minor changes
- use torch_int instead of int
- use proper for newly initialized tensors
- use user provided return_dict for patch_encoder
- use if-else block instead in self.use_fov_model

* rearchitect upsample block for improved modularity

* update upsample keys in weight conversion

* improve padding in merge_patches

* use double-loop for merge

* update comments

* create feature_extractor, reduce some forward code

* introduce config.use_mask_token in dinov2

* minor fixes

* minor fixes for onnx

* update __init__ to latest format

* remove DepthProConfig.to_dict()

* major changes in backbone

* update config in weight conversion

* formatting

* converted model is fp32

* improve naming and docs for feature_extractor->reconstruct_feature_maps

* minor fixes; amid review

* create intermediate vars in func call

* use torch.testing.assert_close

* use ModuleList instead of Sequential and ModuleDict

* update docs

* include fov in integration tests

* update docs

* improve initialization of convolution layers

* fix unused fov keys

* update tests

* ruff format

* fix test amid Kaiming initialization

* add depthpro to toctree

* add residual layer to _no_split_modules

* architecture rework

* Update src/transformers/models/depth_pro/image_processing_depth_pro.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* update docs

* improve merge_patches

* use flatten with fov_output

* ruff formatting

* update resources section in docs

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix typo "final_kernal_size"

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix output typehint for DepthProDepthEstimator

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* residual operation in 2 steps

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* use image_size instead of global patch_size in interpolation

* replace all Sequential with ModuleList

* update fov

* update heads

* fix and update conversion script for heads

* ruff formatting

* remove float32 conversion

* use "Fov" instead of "FOV" in class names

* use "Fov" instead of "FOV" in config docs

* remove prune_heads

* update fusion stage

* use device in examples

* update processor

* ruff fixes

* add do_rescale in image_processor_dict

* skip test: test_fast_is_faster_than_slow

* ruff formatting

* DepthProImageProcessorFast in other files

* revert antialias removal

* add antialias in BaseImageProcessorFast

* Revert "revert antialias removal"

This reverts commit 5caa0bd8f9f7463b98410c04e6cfe8fef3adee18.

* Revert "add antialias in BaseImageProcessorFast"

This reverts commit 3ae1134780ae236872985523d9c0a444eabcc179.

* update processor for grouping and antialias

* try test_fast_is_faster_than_slow without "skip" or "flaky"

* update checkpoint

* update checkpoint

* use @is_flaky for processor test

* update checkpoint to "apple/DepthPro-hf"

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-10 11:32:45 +00:00
c399921965 Paligemma: revert #36084 (#36113)
* revert

* type check
2025-02-10 12:04:24 +01:00
eebd2c972c Chat template: update for processor (#35953)
* update

* we need batched nested input to always process correctly

* update a bit

* fix copies
2025-02-10 09:52:19 +01:00
5bd7694781 Processors: allow tuples of images when checking (#36084)
allow tuples of images
2025-02-10 09:35:13 +01:00
3a3b06ace4 fix MllamaVisionAttention typehint (#35975)
* fix MllamaVisionAttention typehint

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Update src/transformers/models/mllama/modeling_mllama.py

Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>

* fix suggestion

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
2025-02-10 09:17:10 +01:00
6b55046213 [docs] fix not-working example code in perf_infer_gpu_one.md (#36087)
* bug fix

* update memory limit
2025-02-07 12:42:22 -08:00
14ca7f1452 [docs] fix typo (#36080)
typo fix
2025-02-07 12:42:09 -08:00
c361b1e3d9 [docs] fix model checkpoint name (#36075)
update model name
2025-02-07 12:41:52 -08:00
ba29a439ad Fix OS err (#36094)
* Try via local_main_process first

* try 2
2025-02-07 09:57:43 -05:00
a18b7fdd9e Move audio top_k tests to the right file and add slow decorator (#36072)
* Move audio top_k tests to the right file and add slow decorator because we load a real model

* empty commit to trigger tests
2025-02-07 14:32:30 +00:00
014047e1c8 Fix bug in apply_rotary_pos_emb_flashatt: in Qwen2-5-VL (#36065) 2025-02-07 10:43:45 +01:00
006d9249ec Adding RT-DETRv2 for object detection (#34773)
* cookiecutter add rtdetrv2

* make modular working

* working model

* working model

* finalize modular inheritance

* finalize modular inheritance

* Update src/transformers/models/rtdetrv2/modular_rtdetrv2.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* update modular and add rename

* remove output ckpt

* define loss_kwargs

* fix CamelCase naming

* fix naming + files

* fix modular and convert file

* additional changes

* fix modular

* fix import error (switch to lazy)

* fix autobackbone

* make style

* add

* update testing

* fix loss

* remove old folder

* fix testing for v2

* update docstring

* fix docstring

* add resnetv2 (with modular bug to fix)

* remove resnetv2 backbone

* fix changes

* small fixes

* remove rtdetrv2resnetconfig

* add rtdetrv2 name to convert

* make style

* Update docs/source/en/model_doc/rt_detr_v2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix modular typo after review

* add reviewed changes

* add final review changes

* Update docs/source/en/model_doc/rt_detr_v2.md

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/rt_detr_v2/__init__.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/rt_detr_v2/convert_rt_detr_v2_weights_to_hf.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* add review changes

* remove rtdetrv2 resnet

* removing this weird project change

* change ckpt name from jadechoghari to author

* implement review and update testing

* update naming and remove wrong ckpt

* name

* make fix-copies

* Fix RT-DETR loss

* Add resources, fix name

* Fix repo in docs

* Fix table name

---------

Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: qubvel <qubvel@gmail.com>
2025-02-06 19:28:45 +00:00
6246c03260 [docs] fix outdated example code in trainer.md (#36066)
fix bugs
2025-02-06 10:54:22 -08:00
4563ba2c6f Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797)
* Fix StopStringCriteria to handle tokens above len(tokenizer)

This fixes #35244 by clipping token IDs to be within the tokenizer's vocabulary size before performing the embedding lookup. This prevents index errors when model.config.vocab_size > len(tokenizer).

The fix:
1. Adds a clamp operation to ensure token IDs are within bounds
2. Adds a test case to verify the behavior

* Use self.stop_strings instead of stop_strings

* Handle clipping correctly

* make fixup

* Update test to the new embedding vecs

* Use much bigger values in the mismatch test

* Typo fix

* Slight simplification

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-02-06 16:53:28 +00:00
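The clamp described above is a one-liner; a sketch with illustrative sizes and IDs, showing how token IDs from a model whose `config.vocab_size` exceeds `len(tokenizer)` are clipped into the embedding's valid range before the lookup:

```py
import torch

vocab_size = 32000                        # stand-in for len(tokenizer)
embedding = torch.nn.Embedding(vocab_size, 8)

input_ids = torch.tensor([[5, 31999, 32005]])          # 32005 is out of range
clipped = torch.clamp(input_ids, max=vocab_size - 1)   # clip before the lookup
vectors = embedding(clipped)                           # no IndexError
```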
28f73bc307 Fix model kwargs (#35875)
* Save state

* Make a failing test

* Better test

* mpt -> done, many more to go

* Rm extraneous

* Bamba

* Bert

* big_bird

* biogpt

* bloom

* codegen

* ctrl

* data2vec

* dbrx

* Through up to Dbrx

* electra

* ernie

* falcon

* Fuyu/persimmon

* Include noop kwargs to base models

* Rebase

* Skip musicgen

* Refactor/skip mllama

* Revert makefile

* Rm file

* Fix PT failing, need to modify rest of loss funcs to not resize

* Propagate some

* Continue

* More

* More options

* Mostly fixed

* Proved that it's the same

* Bloom is good

* Make ability to override loss func possible

* Fixup

* Clean

* Fix xglm

* Quality tests

* Skip OCR2

* Make specific loss for xglm

* Make order the same/line up 1:1

* xglm

* Skip fx output loss bloom model

* Didn't pass in pad_token_id

* Fix quality
2025-02-06 11:35:25 -05:00
1590c66430 Fix words typos in ggml test. (#36060)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-02-06 15:32:40 +00:00
1ce0e2992e Nail in edge case of torch dtype being overridden permanently in the case of an error (#35845)
* Nail in edge case of torch dtype

* Rm unused func

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Refactor tests to only mock what we need, don't introduce injection functions

* SetUp/TearDown

* Do super

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-06 09:05:23 -05:00
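The edge case here: loading temporarily changes the process-global torch default dtype, and an exception mid-load used to leave that change in place. The usual cure is a try/finally restore; a sketch of the pattern, not the exact modeling_utils code:

```py
import torch


def build_in_dtype(dtype):
    old_dtype = torch.get_default_dtype()
    torch.set_default_dtype(dtype)
    try:
        # stand-in for model construction, which may raise partway through
        return torch.nn.Linear(4, 4)
    finally:
        torch.set_default_dtype(old_dtype)  # restored even on error


model = build_in_dtype(torch.float16)
assert torch.get_default_dtype() == torch.float32  # float32 in a fresh process
```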
e3458af726 Save checkpoint to temporary directory to handle partial saves during failures (#35580)
Save checkpoint to temporary folder first

Since partial/missing files due to failures throw error during load
2025-02-06 08:48:05 -05:00
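The idea behind this fix, sketched under the assumption of a same-filesystem rename: write the checkpoint into a staging directory and only move it into place once the write finished, so an interrupted save never leaves partial files at the load path (names illustrative, not the Trainer's exact code):

```py
import os
import shutil
import tempfile

import torch


def save_checkpoint(state_dict, output_dir):
    # write into a staging directory on the same filesystem first
    staging = tempfile.mkdtemp(dir=os.path.dirname(output_dir) or ".")
    torch.save(state_dict, os.path.join(staging, "pytorch_model.bin"))
    if os.path.exists(output_dir):  # replace any previous checkpoint
        shutil.rmtree(output_dir)
    os.rename(staging, output_dir)  # a crash before this line leaves no partial dir


save_checkpoint({"weight": torch.zeros(2)}, "checkpoint-100")
```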
3dd1de39bb Paligemma: fix generation with Gemma2 (#36044)
* fix paligemma

* nit

* use `kwargs` in models that can load any LM
2025-02-06 14:31:32 +01:00
dce9970884 Update test_flash_attn_2_can_dispatch_composite_models (#36050)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-06 12:09:49 +01:00
37faa97d9b Fix repo consistency (#36063)
* fix 1

* fix 2

* fix modular

* simplify at the same time

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-02-06 11:53:15 +01:00
ed98ad35e6 Fix usage of unpad_input function (#35925)
Fix usage of the unpad_input function

See https://github.com/huggingface/transformers/issues/35899

In the [commit](cdbbe844b1), the return type of `unpad_input` was changed.
Now the code supports both older and newer versions

Co-authored-by: Pavel Gein <pavel.gein@gmail.com>
2025-02-06 11:33:42 +01:00
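A sketch of the compatibility pattern this implies: newer flash-attn releases return one extra value from `unpad_input`, so unpacking by arity supports both. Which value was added and in which release are assumptions here; the linked issue and commit have the specifics:

```py
from flash_attn.bert_padding import unpad_input  # requires the flash-attn package


def unpad_input_compat(hidden_states, attention_mask):
    result = unpad_input(hidden_states, attention_mask)
    if len(result) == 5:  # assumed: newer flash-attn adds a seqlens value
        hidden, indices, cu_seqlens, max_seqlen, _ = result
    else:                 # older releases return four values
        hidden, indices, cu_seqlens, max_seqlen = result
    return hidden, indices, cu_seqlens, max_seqlen
```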
7aee036e54 Iterative generation using Input embeds and past_key_values (#35890)
* Iterative generation using input embeds

* ruff fix

* Added Testcase

* Updated comment

* ♻️ Refactored testcase

* Skip test for these models

* Continue generation using input embeds and cache

* Skip generate_continue_from_embeds test

* Refactor `prepare_input_for_generation` func

* Continue generation using input embeds and cache

* Modular changes fix

* Overwrite 'prepare_inputs_for_generation' function
2025-02-06 11:06:05 +01:00
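A hedged usage sketch of the feature (checkpoint name illustrative): generate from embeddings rather than token IDs, ask for a dict output, and keep the returned cache for the next call:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # checkpoint illustrative
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The quick brown", return_tensors="pt")
embeds = model.get_input_embeddings()(inputs.input_ids)

out = model.generate(
    inputs_embeds=embeds,
    attention_mask=inputs.attention_mask,
    max_new_tokens=5,
    return_dict_in_generate=True,
)
# out.past_key_values can now be fed back via `past_key_values=` together
# with the embeddings of the continuation, which is what this PR enables.
```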
b5f327f350 Add Qwen2VLImageProcessorFast into Qwen2VLProcessor (#35987)
* Add `Qwen2VLImageProcessorFast` into `Qwen2VLProcessor`

* Use `AutoImageProcessor` instead

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-02-06 10:03:09 +01:00
0de15c988b Fix Audio Classification Pipeline top_k Documentation Mismatch and Bug #35736 (#35771)
* added condition for top_k Doc mismatch fix

* initialization of test file for top_k changes

* added test for returning all labels

* added test for few labels

* tests/test_audio_classification_top_k.py

* final fix

* ruff fix

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-05 16:25:08 +00:00
694aaa7fbc Fix how we compute the final non-padding token for ForSequenceClassification models (#35911)
* Fix how we compute the final non-padding token for Gemma (and probably other models)

* .size() -> .shape[]

* Propagating changes to other models

* Propagating changes to other models

* Change it for all ForSequenceClassification models

* Fix batch dim

* More TF fixes

* Copy the TF fix around as well

* Correct layer name for TFCTRL

* Cleaner .to()

* Clean up the nested if-else

* Use argmax() instead of .max().values
2025-02-05 16:23:33 +00:00
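The argmax trick from the last bullet, sketched under the assumption of right padding: multiplying position indices by the non-padding mask makes the argmax land on each row's last real token:

```py
import torch

pad_token_id = 0
input_ids = torch.tensor([[5, 6, 7, 0, 0],
                          [8, 9, 0, 0, 0]])
logits = torch.randn(2, 5, 3)  # (batch, seq_len, num_labels)

non_pad_mask = (input_ids != pad_token_id).int()
token_indices = torch.arange(input_ids.shape[-1])
last_non_pad = (token_indices * non_pad_mask).argmax(dim=-1)   # tensor([2, 1])
pooled_logits = logits[torch.arange(input_ids.shape[0]), last_non_pad]
```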
531d1511f5 [docs] no hard-coding cuda (#36043)
make device-agnostic
2025-02-05 08:22:33 -08:00
7399f8021e [docs] fix bugs in the bitsandbytes documentation (#35868)
* fix doc

* update model
2025-02-05 08:21:20 -08:00
0a1a8e3c7e [docs] no hard coding cuda as bnb has multi-backend support (#35867)
* change cuda to DEVICE

* Update docs/source/en/llm_tutorial.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-05 08:20:02 -08:00
9dc1efa5d4 DeepSpeed github repo move sync (#36021)
deepspeed github repo move
2025-02-05 08:19:31 -08:00
c772bff31a add support for empty list as input to create_model_card (#36042)
handle cases where it is a list
2025-02-05 13:29:17 +01:00
315a9f494e Add XPU type for work-around -inf mask causing sdpa NaN issue in modeling files (#35647)
* add xpu for unmask

* change modular for generated matching

* add lastest modeling for helium
2025-02-05 13:28:31 +01:00
d8080d55c7 Fix synced multi-GPU generation with LLMs and VLMs (#35893)
* Fix synced multi-GPU generation

* fix copies

---------

Co-authored-by: Davit Manukyan <ManukyanD>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-02-05 11:15:11 +01:00
4831a94ee7 Fix Gemma2 synced multi-GPU generation (#35232)
* Fix Gemma2 synced multi-GPU generation

* Fix import ordering in modular_gemma2.py
2025-02-05 10:07:50 +01:00
fa56dcc2ab Refactoring of ImageProcessorFast (#35069)
* add init and base image processing functions

* add add_fast_image_processor to transformers-cli

* add working fast image processor clip

* add fast image processor to doc, working tests

* remove "to be implemented" SigLip

* fix unprotected import

* fix unprotected vision import

* update ViTImageProcessorFast

* increase threshold for slow/fast equivalence

* add fast img blip

* add fast class in tests with cli

* improve cli

* add fast image processor convnext

* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision

* add device kwarg to ImagesKwargs for fast processing on cuda

* cleanup

* fix unprotected import

* group images by sizes and add batch processing

* Add batch equivalence tests, skip when center_crop is used

* cleanup

* update init and cli

* fix-copies

* refactor convnext, cleanup base

* fix

* remove patching mixins, add piped torchvision transforms for ViT

* fix unbatched processing

* fix f strings

* protect imports

* change llava onevision to class transforms (test)

* fix convnext

* improve formatting (following Pavel review)

* fix handling device arg

* improve cli

* fix

* fix inits

* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs

* uniformize qwen2_vl fast

* fix docstrings

* add add fast image processor llava

* remove min_pixels max_pixels from accepted size

* nit

* nit

* refactor fast image processors docstrings

* cleanup and remove fast class transforms

* update add fast image processor transformers cli

* cleanup docstring

* uniformize pixtral fast and make _process_image explicit

* fix prepare image structure llava next/onevision

* Use typed kwargs instead of explicit args

* nit fix import Unpack

* clearly separate pops and gets in base preprocess. Use explicit typed kwargs

* make qwen2_vl preprocess arguments hashable
2025-02-04 17:52:31 -05:00
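The "group images by sizes" step is the key batching trick here: bucket images by shape so each bucket stacks into one tensor for a single vectorized transform, then scatter the results back to the original order. A hedged sketch, with an illustrative helper rather than the library's exact one:

```py
from collections import defaultdict

import torch


def group_images_by_shape(images):
    groups, order = defaultdict(list), []
    for image in images:
        key = tuple(image.shape[-2:])            # bucket by (H, W)
        order.append((key, len(groups[key])))    # remember where each image went
        groups[key].append(image)
    stacked = {key: torch.stack(imgs) for key, imgs in groups.items()}
    return stacked, order


images = [torch.rand(3, 224, 224), torch.rand(3, 336, 336), torch.rand(3, 224, 224)]
stacked, order = group_images_by_shape(images)
# run one vectorized transform per stacked[key], then restore the input order:
restored = [stacked[key][i] for key, i in order]
```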
8d73a38606 Add DAB-DETR for object detection (#30803)
* initial commit

* encoder+decoder layer changes WIP

* architecture checks

* working version of detection + segmentation

* fix modeling outputs

* fix return dict + output att/hs

* found the position embedding masking bug

* pre-training version

* added image processors

* typo in init.py

* iterupdate set to false

* fixed num_labels in class_output linear layer bias init

* multihead attention shape fixes

* test improvements

* test update

* dab-detr model_doc update

* dab-detr model_doc update2

* test fix:test_retain_grad_hidden_states_attentions

* config file clean and renaming variables

* config file clean and renaming variables fix

* updated convert_to_hf file

* small fixes

* style and quality checks

* return_dict fix

* Merge branch main into add_dab_detr

* small comment fix

* skip test_inputs_embeds test

* image processor updates + image processor test updates

* check copies test fix update

* updates for check_copies.py test

* updates for check_copies.py test2

* tied weights fix

* fixed image processing tests and fixed shared weights issues

* added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py

* delete prints from test file

* SafeTensor modification to solve HF Trainer issue

* removing the safetensor modifications

* make fix-copies and HF upload have been added.

* fixed index.md

* fixed repo consistency

* style fix and DabDetrImageProcessor docstring update

* requested modifications after the first review

* Update src/transformers/models/dab_detr/image_processing_dab_detr.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* repo consistency has been fixed

* update copied NestedTensor function after main merge

* Update src/transformers/models/dab_detr/modeling_dab_detr.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* temp commit

* temp commit2

* temp commit 3

* unit tests are fixed

* fixed repo consistency

* updated expected_boxes variable values based on related notebook results in DABDETRIntegrationTests file.

* temporary config modifications and repo consistency fixes

* Put dilation parameter back to config

* pattern embeddings have been added to the rename_keys method

* add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES

* delete FeatureExtractor part from docs.md

* requested modifications in modeling_dab_detr.py

* [run_slow] dab_detr

* deleted last segmentation code part, updated conversion script and changed the hf path in test files

* temp commit of requested modifications

* temp commit of requested modifications 2

* updated config file, resolved codepaths and refactored conversion script

* updated decodelayer block types and refactored conversion script

* style and quality update

* small modifications based on the request

* attentions are refactored

* removed loss functions from modeling file, added loss function to lossutils, tried to move the MLP layer generation to config but it failed

* deleted imageprocessor

* fixed conversion script + quality and style

* fixed config_att

* [run_slow] dab_detr

* changing model path in conversion file and in test file

* fix Decoder variable naming

* testing the old loss function

* switched back to the new loss function and testing with the old attention functions

* switched back to the new last good result modeling file

* moved back to the version from when I asked for review

* missing new line at the end of the file

* old version test

* turn back to newest model version but change image processor

* style fix

* style fix after merge main

* [run_slow] dab_detr

* [run_slow] dab_detr

* added device and type for head bias data part

* [run_slow] dab_detr

* fixed model head bias data fill

* changed test_inference_object_detection_head assertTrues to torch test assert_close

* fixes part 1

* quality update

* self.bbox_embed in decoder has been restored

* changed Assert true torch closeall methods to torch testing assertclose

* modelcard markdown file has been updated

* deleted intermediate list from decoder module

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-04 17:28:27 +00:00
fe52679e74 Update tests regarding attention types after #35235 (#36024)
* update

* update

* update

* dev-ci

* more changes

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 18:04:47 +01:00
014a1fa2c8 CircleCI with python 3.9 (#36027)
update docker files

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 17:40:20 +01:00
c98b467905 feat(ci): ignore trufflehog unverified results (#36031) 2025-02-04 16:39:36 +01:00
9855acb9c5 Hotfix for self-comment-ci.yml (#36030)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 16:28:05 +01:00
9f486badd5 Display warning for unknown quants config instead of an error (#35963)
* add supports_quant_method check

* fix

* add test and fix suggestions

* change logic slightly

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-04 15:17:01 +01:00
f19bfa50e7 Comment bot CI for other jobs (generation / quantization) (#35341)
* quantization CI on PRs

* fix

* fix

* add 2 members

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 14:42:51 +01:00
a93b80588b Fix RMSNormGated in Zamba2 (#35943)
* First commit

* Finish model implementation

* First commit

* Finish model implementation

* Register zamba2

* generated modeling and configuration

* generated modeling and configuration

* added hybrid cache

* fix attention_mask in mamba

* dropped unused loras

* fix flash2

* config docstrings

* fix config and fwd pass

* make fixup fixes

* text_modeling_zamba2

* small fixes

* make fixup fixes

* Fix modular model converter

* added inheritances in modular, renamed zamba cache

* modular rebase

* new modular conversion

* fix generated modeling file

* fixed import for Zamba2RMSNormGated

* modular file cleanup

* make fixup and model tests

* dropped inheritance for Zamba2PreTrainedModel

* make fixup and unit tests

* Add inheritance of rope from GemmaRotaryEmbedding

* moved rope to model init

* drop del self.self_attn and del self.feed_forward

* fix tests

* renamed lora -> adapter

* rewrote adapter implementation

* fixed tests

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Dropped adapter in-place sum

* removed rope from attention init

* updated rope

* created get_layers method

* make fixup fix

* make fixup fixes

* make fixup fixes

* update to new attention standard

* update to new attention standard

* make fixup fixes

* minor fixes

* cache_position

* removed cache_position postion_ids use_cache

* remove config from modular

* removed config from modular (2)

* import apply_rotary_pos_emb from llama

* fixed rope_kwargs

* Instantiate cache in Zamba2Model

* fix cache

* fix @slow decorator

* small fix in modular file

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* several minor fixes

* inherit mamba2decoder fwd and drop position_ids in mamba

* removed docstrings from modular

* reinstate zamba2 attention decoder fwd

* use regex for tied keys

* Revert "use regex for tied keys"

This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.

* use regex for tied keys

* add cpu to slow forward tests

* dropped config.use_shared_mlp_adapter

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* re-convert from modular

* extended Zamba2RMSNormGated to n_groups>1

* removed einops import

* set _supports_sdpa = True

* add use_mem_eff_path flag for fused mamba2 fwd

* added docstring for use_mem_eff_path flag

---------

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-04 14:28:04 +01:00
bc9a6d8302 Fix device mismatch error in Whisper model during feature extraction (#35866)
* Fix device mismatch error in whisper feature extraction

* Set default device

* Address code review feedback

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-02-04 12:23:08 +01:00
9afb904b15 Refactor (and fix) gpt_neox (#35610)
* start a nice modular

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* update

* Update modular_gpt_neox.py

* convert

* fix attribute

* fix attrs

* oups

* fix

* fix

* fix

* fix

* fix

* fix order to pass test (see with accelerate team)

* trigger CIs

* modular

* update

* up

* Update test_modeling_gpt_neox.py

* Update test_modeling_gpt_neox.py

* trigger CIs

* correctly pass arg

* simplify

* remove key warning

* update tp -> it's compatible since the view is before

* trigger CIs
2025-02-04 11:18:43 +01:00
ad30598923 Update Mistral converter (#35967)
* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

* update

* style

* move it to integrations

* style

* trigger CIs

* trigger CIs
2025-02-04 11:13:12 +01:00
b1954fd64a layernorm_decay_fix (#35927)
* layernorm_decay_fix

* W293 fix

* ruff format fix

* black format

* ruff format

* erase last layer

* add test_get_parameter_names_rmsnorm

* rmsnorm fix
2025-02-04 11:01:49 +01:00
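The convention being fixed, in a sketch: biases and norm-layer weights (LayerNorm, and with this PR RMSNorm variants too) are excluded from weight decay when building optimizer parameter groups. Matching norm modules by class name is an illustrative shortcut, not the exact `get_parameter_names` logic:

```py
import torch

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.LayerNorm(8))

decay, no_decay = [], []
for module in model.modules():
    is_norm = isinstance(module, torch.nn.LayerNorm) or "rmsnorm" in type(module).__name__.lower()
    for name, param in module.named_parameters(recurse=False):
        if is_norm or name == "bias":
            no_decay.append(param)   # norm weights and biases: no weight decay
        else:
            decay.append(param)

optimizer = torch.optim.AdamW(
    [{"params": decay, "weight_decay": 0.01},
     {"params": no_decay, "weight_decay": 0.0}]
)
```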
2ba040a71f apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582)
* apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag

* test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask

* test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token

* test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right

---------

Co-authored-by: Eduard Allakhverdov <goncharova@airi.net>
Co-authored-by: d.tarasov <d.tarasov@airi.net>
2025-02-04 10:27:52 +01:00
9c02cb6233 Fix custom kernel for DeformableDetr, RT-Detr, GroundingDINO, OmDet-Turbo in PyTorch 2.6.0 (#35979)
Updates type().is_cuda() -> .is_cuda(); .data<> -> .data_ptr<>
2025-02-04 09:07:25 +00:00
5d75a25b03 Qwen2-VL: fix rope delta calculation (#36013)
* fix rope deltas calculation

* add test

* style
2025-02-04 09:48:29 +01:00
e284c7e954 Update Granite Vision Model Path / Tests (#35998)
* Update granite vision model path

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Enable granite vision test

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-02-03 20:06:03 +01:00
9d2056f12b Add mean_resizing for every VLM's resize_token_embeddings() (#35717)
* refine all resize_token_embedding()

* ruff format

* hotfix
2025-02-03 15:03:49 +01:00
7eecdf2a86 Update-tp test (#35844)
* update test for now

* up

* cleanup

* update todo
2025-02-03 09:37:02 +01:00
62db3e6ed6 use torch 2.6 for daily CI (#35985)
use torch 2.6 for CI

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-31 18:58:23 +01:00
2b46943195 Add GOT-OCR 2.0 to Transformers (#34721)
* init modular got_ocr2

* Get correct got_ocr architecture

* add processing

* run modular with processing

* add working inference

* apply modular

* Refactor and fix style

* Refactor, cleanup, fix style

* fix init order

* Fix docs

* add base modeling tests

* fix style and consistency

* rename doc file

* fix repo consistency

* fix inference with box

* add image processing and support for crop_to_multi_page

* Fix batch inference

* add tests

* fixup

* fix slow test

* fix docstrings

* Add model doc

* update to new init

* fix input autocast pixel_values dtype

* update doc

* move doc to multimodal

* Reformat crop_image_to_patches and add docstrings

* Fix example in forward docstring

* Address Pablo review

* [run slow] got_ocr2

* remove defaults defined twice

* apply modular

* add torch_device to integration tests

* update modular

* follow-up Pavel review

* add device variable in doc

* fix doc multi-page

* Force eager attention for vision encoder to avoid attn implementation conflict

* revert qwen2vl doc changes

* use Qwen2ForCausalLM instead of Qwen2Model

* make fixup

* refactor gotocr2 to llava style

* uniformize function names and reduce checks

* final nits

* fix pixel_values dtype error

* change checkpoint names

* fix modular
2025-01-31 11:28:13 -05:00
5bbee12ac9 [Moshi] disable automatic compilation if the model can't compile (#35992)
moshi can't compile
2025-01-31 15:53:06 +00:00
e6f4a4ebbf [Moonshine] compute head_dim_padding at init (#35984)
compute head_dim_padding at init
2025-01-31 14:26:52 +01:00
d7188ba600 Add support for nested images to LLava and VipLLava (#35558)
* move make_flat_list_of_images and make_batched_videos to image_utils

* remove unnecessary is_vision_available

* move make_nested_list_of_images to image_utils

* fix fast pixtral image processor

* fix import mllama

* fix make_nested_list_of_images

* add tests

* convert 4d arrays/tensors to list

* add test_make_batched_videos

* add support nested batch of videos

* fix image processing qwen2vl
2025-01-30 16:49:20 -05:00
e4227eb4d4 Handle empty change indices in SAM's mask to rle conversion (#35665)
* Handle empty change indices in RLE conversion for masks

* [test] Add unit tests for RLE encoding of masks in SamProcessor

* [test] Update RLE conversion tests to use TensorFlow implementation

* [test] Fix formatting in SamProcessorTest according to check_code_quality action

* [test] Fix formatting in SamProcessorTest according to check_code_quality

* [test] Refactored rle test cases into one test and used tf tensors in tf test cases

* [test] Fix: removed self parameter from refactored methods

* [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow

* [test] Added descriptions to individual run-length encoding tests for PyTorch and TensorFlow.
2025-01-30 19:08:38 +00:00
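Run-length encoding a binary mask records the distances between consecutive value changes, so a uniform (all-zero or all-one) mask has no change indices at all, which was the unhandled edge case. A minimal sketch of COCO-style column-major RLE:

```py
import numpy as np


def mask_to_rle(mask):
    flat = mask.flatten(order="F")                          # column-major, COCO-style
    changes = np.flatnonzero(flat[1:] != flat[:-1]) + 1     # empty for uniform masks
    points = np.concatenate([[0], changes, [flat.size]])
    counts = np.diff(points).tolist()
    if flat[0] == 1:                                        # RLE counts zeros first
        counts = [0] + counts
    return {"size": list(mask.shape), "counts": counts}


print(mask_to_rle(np.zeros((2, 3), dtype=np.uint8)))  # {'size': [2, 3], 'counts': [6]}
```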
47bd4296d6 not to use A100 for benchmark.yml (#35974)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 18:55:36 +01:00
693328f2bc Support batching for UsefulSensors Moonshine (#35922)
* Add support for attention masking in moonshine.

Tested against Open ASR Leaderboard with batch size 256.

* Update comments and ensure attention masks are passed everywhere.

Perform attention mask downsampling inside of moonshine forward call.

* Hide padding behind conditional. Fix encoder/decoder masking.

- Correctly pipe encoder attention mask into decoder
- Add correct scaling factor if one is not already provided.
- Fix formatting with ruff

* Add auto generated modeling_moonshine file.

* Update formatting in generated model file.

* Address review comments.

* Fix typo.

* Add `pad_head_dim_to_multiple_of` to moonshine config.

* Correct args order for MoonshineConfig.

* Update configuration moonshine too.

* Update src/transformers/models/moonshine/modular_moonshine.py

* Update src/transformers/models/moonshine/configuration_moonshine.py

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-30 17:08:07 +01:00
5757681837 Less flaky for TimmBackboneModelTest::test_batching_equivalence (#35971)
* fix

* remove is_flaky

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 16:56:26 +01:00
e320d5542e Revert p_mask to a list in DQA pipeline (#35964)
* p_mask back to being a list

* Remove breakpoint
2025-01-30 15:37:59 +00:00
365fecb4d0 Whisper: fix static cache CI (#35852)
* fix

* remove overriden method

* small change
2025-01-30 12:43:00 +01:00
9725e5be2f Pixtral: vectorize patch embeddings and enable tests (#35122)
* initial POC

* - batch mix feature

* fix tests

* fix tests

* make style

* do not skip and instead fix tests

* update

* return back the test

* correct text with the correct ckpt
2025-01-30 12:40:18 +01:00
8bc4c89ee9 [bart] minor test fixes (#35965)
fix tests
2025-01-30 10:00:11 +00:00
19f2ec80cf Fix is_causal being a tensor (#35791)
* fix is_causal being a tensor

* convert in sdpa attention only when jit tracing
2025-01-30 09:22:33 +01:00
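The second bullet, sketched: only under torch.jit tracing does the computed `is_causal` flag end up as a 0-dim tensor, and `F.scaled_dot_product_attention` wants a plain bool, so the conversion is gated on tracing state. A sketch of the gate, not the full attention wrapper:

```py
import torch


def normalize_is_causal(is_causal):
    # Under torch.jit tracing the flag can be a 0-dim tensor; materialize it
    # as a Python bool only in that case.
    if torch.jit.is_tracing() and isinstance(is_causal, torch.Tensor):
        is_causal = is_causal.item()
    return is_causal
```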
7547f55e5d fix iterator overflow when gradient accumulation is 1 (#35960) 2025-01-29 14:45:09 -05:00
4d3b1076a1 [generate] move max time tests (#35962)
* move max time tests to their right place

* move test to the right place
2025-01-29 17:56:46 +00:00
4d1d489617 Update README.md (#35958)
There should be a dot after pip install .
2025-01-29 15:46:26 +00:00
f0ae65c198 [tests] further fix Tester object has no attribute '_testMethodName' (#35781)
* bug fix

* update with more cases

* more entries

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:05:33 +01:00
ec7790f0d3 update docker file transformers-pytorch-deepspeed-latest-gpu (#35940)
update docker file for deepspeed

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:01:27 +01:00
5d257111c1 Trainer Refactor: Part 1 (#35567)
* start

* So far: 30%

* Small fix

* Continuing update

* Continuing

* Forgot to check if not None

* Continuing refactor

* Fix if else

* Fix ref

* Should make tests pass

* Keep grad norm same

* Document

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Err instead of info for logging RNG state error

* Separate out to func

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-01-29 09:50:54 -05:00
23d782ead2 Output dicts support in text generation pipeline (#35092)
* Support for generate argument return_dict_in_generate=True, instead of returning an error

* fix: call test with return_dict_in_generate=True

* fix: Only import torch if it is present

* update: Encapsulate output_dict changes

* fix: added back original comments

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-01-29 14:44:46 +00:00
cf90404807 Fix flaky test_assisted_decoding_matches_greedy_search (#35951)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:50:07 +01:00
692afa102d Update squad_convert_example_to_features to work with numpy v2 (#35955)
* Fix

* Fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:33:06 +01:00
c600e89f5c Update unwrap_and_save_reload_schedule to use weights_only=False (#35952)
* fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:30:57 +01:00
42c8ccfd4c fix test_generated_length_assisted_generation (#34935)
fix test_generated_length_assisted_generation
2025-01-29 12:03:45 +00:00
ec7afad609 use torch constraints to check if covariance is positive definite during mean resizing. (#35693)
* use torch constraints to check for psd

* small nit

* Small change

* Small change for the ci

* nit
2025-01-28 17:33:42 +01:00
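What the constraint check enables, sketched with illustrative shapes and fallback: mean resizing samples newly added embedding rows from a multivariate normal fitted to the old embeddings, which is only valid when the empirical covariance is positive definite:

```py
import torch
from torch.distributions import constraints

old_embeddings = torch.randn(1000, 64)
mu = old_embeddings.mean(dim=0)
sigma = torch.cov(old_embeddings.T)          # (64, 64) empirical covariance

if constraints.positive_definite.check(sigma).all():
    dist = torch.distributions.MultivariateNormal(mu, covariance_matrix=sigma)
    new_rows = dist.sample((8,))             # 8 newly added tokens
else:
    new_rows = mu.repeat(8, 1)               # fallback: plain mean init
```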
61cbb723fc Remove INC notebook reference in documentation (#35936)
remove INC notebook in documentation
2025-01-28 17:10:02 +01:00
478c4f2d0d fix(FA): QKV not being casted to target_dtype for FA with dpo lora (#35834)
fix(FA): QKV not being casted to target_dtype due to dtype check
2025-01-28 17:06:56 +01:00
ece8c42488 Test: generate with torch.compile(model.forward) as a fast test (#34544) 2025-01-28 14:10:38 +00:00
f48ecd7608 Fix TP initialization (#35860)
* fix tp

* Update modeling_utils.py

* style

* style

* Update test_tp.py

* Update test_tp.py

* style

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py
2025-01-28 15:07:37 +01:00
f85ba20449 Qwen-2-5-VL: fix CI (#35935)
fix
2025-01-28 14:51:57 +01:00
3f860dba55 Fix mask slicing for models with HybridCache (#35681)
* correctly slice

* check mask

* Update modular_gemma2.py

* fix

* add tests

* fix typo

* finally fix mask slicing

* Finally correctly slice in all cases!!

* add test for all attention functions

* small fix in tests

* trick around dynamo tracing issue

* last update

* more robust

* kwargs propagation

* make it explicit for checkpointing

* apply modular
2025-01-28 14:35:00 +01:00
b764c20b09 Fix: loading DBRX back from saved path (#35728)
* fix dtype as dict for some models + add test

* add comment in tests
2025-01-28 11:38:45 +01:00
3613f568cd Add default TP plan for all models with backend support (#35870)
* Add some tp plans!

* More tp plans!

* Add it in the comment

* style

* Update configuration_mixtral.py

* Update configuration_phi.py

* update the layout according to special archs

* fix mixtral

* style

* trigger CIs

* trigger CIs

* CIs

* olmo2

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-28 11:20:58 +01:00
96625d85fd Use rocm6.2 for AMD images (#35930)
* Use rocm6.2 as rocm6.3 only has nightly pytorch wheels atm

* Use stable wheel index for torch libs
2025-01-28 11:10:28 +01:00
bf16a182ba Remove _supports_static_cache = True for some model classes (#34975)
* use mask_fill

* remove comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-28 10:42:10 +01:00
86d7564611 [docs] Fix Zamba2 (#35916)
fix code block
2025-01-27 11:44:10 -08:00
414658f94f Close Zamba2Config code block (#35914)
* close zamba2 code block

* Add Zamba2 to toctree
2025-01-27 19:09:42 +00:00
63e9c941eb Fix the config class comparison for remote code models (#35592)
* Fix the config class comparison when repeatedly saving and loading remote code models

* once again you have committed your debug breakpoint
2025-01-27 18:37:30 +00:00
c550a1c640 [docs] uv install (#35821)
uv install
2025-01-27 08:49:28 -08:00
cd6591bfb2 Fix typing in audio_utils.chroma_filter_bank (#35888)
* Fix typing in audio_utils.chroma_filter_bank

* Apply make style

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-27 16:06:03 +00:00
e57b459997 Split and clean up GGUF quantization tests (#35502)
* clean up ggml test

Signed-off-by: Isotr0py <2037008807@qq.com>

* port remaining tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* further cleanup

Signed-off-by: Isotr0py <2037008807@qq.com>

* format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix broken tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* update comment

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* reorganize tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* k-quants use qwen2.5-0.5B

Signed-off-by: Isotr0py <2037008807@qq.com>

* move ggml tokenization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove dead code

Signed-off-by: Isotr0py <2037008807@qq.com>

* add assert for serialization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* use str for parameterize

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-27 15:46:57 +01:00
5c576f5a66 🚨🚨🚨 image-classification pipeline single-label and multi-label prob type squashing fns (sigmoid vs softmax) are backwards (#35848)
single-label and multi-label prob type squashing fns (sigmoid vs softmax) were backwards for image-classification pipeline
2025-01-27 15:34:57 +01:00
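The distinction the fix restores: single-label logits are squashed jointly with softmax, so the probabilities sum to 1, while multi-label logits are squashed independently with sigmoid:

```py
import torch

logits = torch.tensor([2.0, 0.5, -1.0])

single_label = logits.softmax(dim=-1)  # ~tensor([0.7856, 0.1753, 0.0391]), sums to 1
multi_label = logits.sigmoid()         # ~tensor([0.8808, 0.6225, 0.2689]), independent
```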
5450e7c84a 🔴 🔴 🔴 Added segmentation maps support for DPT image processor (#34345)
* Added `segmentation_maps` support for DPT image processor

* Added tests for dpt image processor

* Moved preprocessing into separate functions

* Added # Copied from statements

* Fixed # Copied from statements

* Added `segmentation_maps` support for DPT image processor

* Added tests for dpt image processor

* Moved preprocessing into separate functions

* Added # Copied from statements

* Fixed # Copied from statements
2025-01-27 15:14:00 +01:00
a50befa9b9 Update deepspeed amd image (#35906) 2025-01-27 14:32:36 +01:00
33cb1f7b61 Add Zamba2 (#34517)
* First commit

* Finish model implementation

* First commit

* Finish model implementation

* Register zamba2

* generated modeling and configuration

* generated modeling and configuration

* added hybrid cache

* fix attention_mask in mamba

* dropped unused loras

* fix flash2

* config docstrings

* fix config and fwd pass

* make fixup fixes

* text_modeling_zamba2

* small fixes

* make fixup fixes

* Fix modular model converter

* added inheritances in modular, renamed zamba cache

* modular rebase

* new modular conversion

* fix generated modeling file

* fixed import for Zamba2RMSNormGated

* modular file cleanup

* make fixup and model tests

* dropped inheritance for Zamba2PreTrainedModel

* make fixup and unit tests

* Add inheritance of rope from GemmaRotaryEmbedding

* moved rope to model init

* drop del self.self_attn and del self.feed_forward

* fix tests

* renamed lora -> adapter

* rewrote adapter implementation

* fixed tests

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Dropped adapter in-place sum

* removed rope from attention init

* updated rope

* created get_layers method

* make fixup fix

* make fixup fixes

* make fixup fixes

* update to new attention standard

* update to new attention standard

* make fixup fixes

* minor fixes

* cache_position

* removed cache_position postion_ids use_cache

* remove config from modular

* removed config from modular (2)

* import apply_rotary_pos_emb from llama

* fixed rope_kwargs

* Instantiate cache in Zamba2Model

* fix cache

* fix @slow decorator

* small fix in modular file

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* several minor fixes

* inherit mamba2decoder fwd and drop position_ids in mamba

* removed docstrings from modular

* reinstate zamba2 attention decoder fwd

* use regex for tied keys

* Revert "use regex for tied keys"

This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.

* use regex for tied keys

* add cpu to slow forward tests

* dropped config.use_shared_mlp_adapter

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* re-convert from modular

---------

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-27 10:51:23 +01:00
14a9bb520e Fix fast image processor warnings in object detection examples (#35892)
Have the DETR examples default to using the fast image processor
2025-01-27 08:32:44 +00:00
f11f57c925 [doctest] Fixes (#35863)
doctest fixes
2025-01-26 15:26:38 -08:00
fc269f77da Add Rocketknight1 to self-comment-ci.yml (#35881)
my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-24 19:07:07 +00:00
bcb841f007 add xpu device check in device_placement (#35865)
add xpu device
2025-01-24 19:13:07 +01:00
b912f5ee43 use torch.testing.assertclose instead to get more details about error in cis (#35659)
* use torch.testing.assertclose instead to get more details about error in cis

* fix

* style

* test_all

* revert for I bert

* fixes and updates

* more image processing fixes

* more image processors

* fix mamba and co

* style

* less strict

* ok I won't be strict

* skip and be done

* up
2025-01-24 16:55:28 +01:00
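Why the switch pays off in CI: on failure, `torch.testing.assert_close` reports how many elements mismatched and the greatest absolute and relative differences, where `assertTrue(torch.allclose(...))` only reports that the assertion failed:

```py
import torch

expected = torch.tensor([1.0, 2.0, 3.0])
actual = torch.tensor([1.0, 2.0, 3.1])

try:
    torch.testing.assert_close(actual, expected, rtol=1e-3, atol=1e-3)
except AssertionError as err:
    # The message names the mismatched element count and the greatest
    # absolute/relative differences, unlike a bare False from allclose.
    print(err)
```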
72d1a4cd53 Fix Llava-NeXT / Llava-NeXT Video / Llava-OneVision's token unpadding mismatch (#35779)
* Fix Llava OneVision's token padding

* Fix Llava next and Llava next video's token unpadding for consistency
2025-01-24 09:10:27 +01:00
b5aaf87509 Fix test_pipelines_video_classification that was always failing (#35842)
* Fix test_pipelines_video_classification that was always failing

* Update video pipeline docstring to reflect actual return type

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-23 19:22:32 +01:00
328e2ae4c0 fix apply_chat_template() padding choice (#35828)
fix apply_chat_template() padding choice to bool, str, PaddingStrategy and the docstring of pad()
2025-01-23 17:32:32 +00:00
d2a424b550 Fix typo (#35854) 2025-01-23 17:32:18 +00:00
045c02f209 [DOC] Fix contamination and missing paragraph in translation (#35851)
Fix contamination and missing paragraph in translation
2025-01-23 08:33:44 -08:00
71cc8161b2 Granite Vision Support (#35579)
* Add multimodal granite support

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

Support multiple image feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Remove failing validation for visual encoders with no cls

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Update llava based models / configs to support list of feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Add tests for multiple feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Use conditional instead of except for misaligned feature shapes

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* crop cls from each hidden state

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Fix formatting

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Support single vision feature int in vipllava

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Fix typo in vision feature selection strategy validation

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add tentative integration test for granite vision models

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add granite vision docs

Replace multimodal granite refs with granite vision

Add granite vision / llava next alias

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Use image url in granitevision example

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-01-23 17:15:52 +01:00
8f1509a96c Fix more CI tests (#35661)
add tooslow for the fat ones
2025-01-23 14:45:42 +01:00
0a950e0bbe Fix uploading processors/tokenizers to WandB on train end (#35701)
* rename tokenizer to processing_class in WandbCallback.on_train_end

* rename tokenizer to processing_class in ClearMLCallback and DVCLiveCallback
2025-01-23 13:32:15 +01:00
4ec425ffad Fix GA loss for Deepspeed (#35808)
* Fix GA loss for Deepspeed

* Turn off loss scaling in DeepSpeed engine by scale_wrt_gas

* Add comment linking to PR
2025-01-23 11:45:02 +01:00
f3f6c86582 add qwen2.5vl (#35569)
* add qwen2.5vl

* fix

* pass check table

* add modular file

* fix style

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* pass copy check

* use modular

* fix

* fix

* fix

* update flashatt2&sdpa support_list

* Update docs/source/en/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update config

* update

* fix hf path

* rename Qwen2_5_VLVideosKwargs

* fix

* fix

* update

* executed modular

* rollback init

* fix

* formatted

* simpler init

* fix

* fix

* fix

* fix

* fix

* update docs

* fix

* fix

* update Qwen2VLRotaryEmbedding for yarn

* fix

---------

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: gewenbin0992 <gewenbin292@163.com>
Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>
2025-01-23 11:23:00 +01:00
d3af76df58 [Backend support] Allow num_logits_to_keep as Tensor + add flag (#35757)
* support

* Update modeling_utils.py

* style

* most models

* Other models

* fix-copies

* tests + generation utils
2025-01-23 09:47:54 +01:00
8736e91ad6 [ tests] remove some flash attention class tests (#35817)
remove class from tests
2025-01-23 09:44:21 +01:00
2c3a44f9a7 Fix NoneType type as it requires py>=3.10 (#35843)
fix type
2025-01-22 15:56:53 +00:00
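The incompatibility behind this fix: PEP 604's `X | None` union syntax is evaluated when the function is defined and only exists on Python 3.10+, so code that must also run on 3.9 needs `typing.Optional`:

```py
from typing import Optional


def resize(size: Optional[int] = None) -> Optional[int]:  # fine on 3.8/3.9
    return size

# def resize(size: int | None = None) -> int | None:      # TypeError before 3.10,
#     ...                                                  # since the union in the
#                                                          # annotation is evaluated
#                                                          # at definition time
```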
fdcc62c855 Add PyTorch version check for FA backend on AMD GPUs (#35813)
Disable FA backend for SDPA on AMD GPUs (PyTorch < 2.4.1)
2025-01-22 16:09:23 +01:00
3b9770581e Fix compatibility issues when using auto_gptq with these older versions (#35830)
The convert_model method of optimum accepts only a single nn.Module model parameter for versions below 1.23.99.
2025-01-22 15:46:47 +01:00
62bd83947a [chat] docs fix (#35840)
docs fix
2025-01-22 14:32:27 +00:00
487e2f63bd Fix head_dim in config extracted from Gemma2 GGUF model (#35818)
fix gemma2 head dim

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-01-22 15:22:04 +01:00
b3d6722469 [Chat] Add Chat from TRL 🐈 (#35714)
* tmp commit

* add working chat

* add docs

* docs 2

* use auto dtype by default
2025-01-22 13:30:12 +00:00
a7738f5a89 Fix : Nemotron tokenizer for GGUF format (#35836)
fix nemotron gguf
2025-01-22 12:28:40 +01:00
ec28957f94 [pipeline] missing import regarding assisted generation (#35752)
missing import
2025-01-22 10:34:28 +00:00
36c9181f5c [gpt2] fix generation tests (#35822)
fix gpt2 generation tests
2025-01-22 09:41:04 +00:00
f439e28d32 Hotfix: missing working-directory in self-comment-ci.yml (#35833)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-22 10:25:50 +01:00
373e50e970 Init cache on meta device (#35164)
* init cache on meta device

* offloaded static + enable tests

* tests weren't running before  :(

* update

* fix mamba

* fix copies

* update

* address comments and fix tests

* fix copies

* Update src/transformers/cache_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update

* mamba fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-22 09:49:17 +01:00
870e2c8ea0 Another security patch for self-comment-ci.yml (#35816)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-22 09:29:54 +01:00
f4f33a20a2 Remove pyav pin to allow python 3.11 to be used (#35823)
* Remove pyav pin to allow python 3.11 to be used

* Run make fixup

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-21 20:16:18 +00:00
90b46e983f Remove old benchmark code (#35730)
* remove traces of the old deprecated benchmarks

* also remove old tf benchmark example, which uses deleted code

* run doc builder
2025-01-21 17:56:43 +00:00
870eb7b41b [Mimi] update test expected values for t4 runners (#35696)
update values for t4
2025-01-21 18:23:36 +01:00
8ac851b0b3 Improve modular documentation (#35737)
* start a nice doc

* keep improving the doc

* Finalize doc

* Update modular_transformers.md

* apply suggestion
2025-01-21 17:53:30 +01:00
107f9f5127 add Qwen2-VL image processor fast (#35733)
* add qwen2_vl image processor fast

* add device to ImagesKwargs

* remove automatic fix copies

* fix fast_is_faster_than_slow

* remove unnecessary import
2025-01-21 11:49:05 -05:00
3df90103b8 move fastspeech to audio models (#35788) 2025-01-21 08:32:09 -08:00
741d55237a [i18n-ar] Translated file: docs/source/ar/tasks/masked_language_modeling.md into Arabic (#35198)
* Add the Arabic translation: masked_language_modeling.md

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

* Add language_modeling.md

* Add sequence_classification.md

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-21 08:29:58 -08:00
568941bf11 Optimized set_initialized_submodules. (#35493) 2025-01-21 17:01:28 +01:00
7051c5fcc8 Remove deprecated get_cached_models (#35809)
* Remove deprecated get_cached_models

* imports
2025-01-21 16:08:31 +01:00
97fbaf0861 Fixed typo in autoawq version number in an error message for IPEX backend requirements. (#35815)
Fixed typo in version number for IPEX backend required minimal autoawq version
2025-01-21 14:42:44 +00:00
dbd8474125 Fix : BLOOM tie_word_embeddings in GGUF (#35812)
* fix bloom ggml

* fix falcon output

* make style
2025-01-21 15:35:54 +01:00
678bd7f1ce Auto-add timm tag to timm-wrapper models. (#35794)
Works for fine-tuned or exported models:

```py
from transformers import AutoModelForImageClassification

checkpoint = "timm/vit_base_patch16_224.augreg2_in21k_ft_in1k"
model = AutoModelForImageClassification.from_pretrained(checkpoint)

model.push_to_hub("pcuenq/tw1")
```

The uploaded model will now show snippets for both the timm and the
transformers libraries.
2025-01-21 14:34:45 +01:00
dc10f7906a Support adamw_torch_8bit (#34993)
* var

* more

* test
2025-01-21 14:17:49 +01:00
f82b19cb6f add a new flax example for Bert model inference (#34794)
* add a new example for flax inference cases

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix for "make fixup"

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-21 14:09:29 +01:00
edbabf6b82 [Doc] Adding blog post to model doc for TimmWrapper (#35744)
* adding blog post to model doc

* Update docs/source/en/model_doc/timm_wrapper.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* review suggestions

* review suggestions

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-21 12:32:39 +00:00
fd8d61fdb2 Byebye test_batching_equivalence's flakiness (#35729)
* fix

* fix

* skip

* better error message

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-21 13:11:33 +01:00
78f5ee0217 Add LlavaImageProcessor (#33191)
* First draft

* Add equivalence test

* Update docstrings

* Add tests

* Use numpy

* Fix tests

* Improve variable names

* Improve docstring

* Add link

* Remove script

* Add copied from

* Address comment

* Add note in docs

* Add docstring, data format

* Improve test

* Add test

* update

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* loop once only

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-21 12:47:04 +01:00
8e4cedd9ca Update AMD Docker image (#35804) 2025-01-21 12:11:23 +01:00
705aeaaa12 Fix "test_chat_template_dict" in video LLMs (#35660)
* fix "test_chat_template_dict" in llava_onevision

* Update src/transformers/models/llava_next_video/processing_llava_next_video.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* get one video called once

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-21 10:23:40 +01:00
e867b97443 Deterministic sorting in modular converter when adding new functions (#35795)
deterministic sort
2025-01-21 09:38:48 +01:00
920f34a772 modular_model_converter bugfix on assignments (#35642)
* added bugfix in modular converter to keep modular assignments for docstrings, expected outputs etc.

* revert starcoder2 docstring copying, add forward in EMU3 to enable docstring assignment, remove verbatim assignments in modular converter

* added _FOR_DOC in assignments to keep, corrected wrong checkpoint name in ijepa's configuration
2025-01-21 08:06:44 +01:00
234168c4dc Fixes, improvements to timm import behaviour (#35800)
* Fix timm dummy import logic

* Add requires to TimmWrapperConfig.from_dict so users see a helpful import error message if timm not installed
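
A minimal sketch of the pattern this commit describes, assuming a hypothetical availability helper (transformers has its own internal checks; none of the names below are the actual implementation):

```py
import importlib.util

def is_timm_available() -> bool:
    # Hypothetical helper: report whether the optional `timm` backend is installed.
    return importlib.util.find_spec("timm") is not None

class TimmWrapperConfigSketch:
    @classmethod
    def from_dict(cls, config_dict: dict):
        # Fail early with an actionable message instead of an opaque error later.
        if not is_timm_available():
            raise ImportError(
                "TimmWrapperConfig requires the `timm` library: `pip install timm`."
            )
        return cls()
```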
2025-01-20 13:17:01 -08:00
44393df089 Tool calling: support more types (#35776)
* Tool calling: support NoneType for function return type
2025-01-20 19:15:34 +01:00
f19135afc7 fix low-precision audio classification pipeline (#35435)
* fix low-precision audio classification pipeline

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torch import

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torch import

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:20:51 +00:00
641238eb76 Fix vits low-precision dtype (#35418)
* fix vits dtype

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* use weight dtype

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:19:31 +00:00
729b569531 fix document qa bf16 pipeline (#35456)
* fix document qa bf16 pipeline

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:18:07 +00:00
ec97417827 Don't import torch.distributed when it's not available (#35777)
This is a continuation of 217c47e31bc0cd442443e5b4a62c8bc2785d53ee but
for another module. This issue was spotted in nixpkgs (again) when
building the lm-eval package, which used a different path in the
transformers library to reach the same failure (see the sketch below).

Related: #35133
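
A sketch of the guarded-import pattern applied here (illustrative, not the exact diff):

```py
import torch

# Some torch builds (as in nixpkgs) ship without distributed support, so gate
# the import on availability instead of importing unconditionally.
if torch.distributed.is_available():
    import torch.distributed as dist
else:
    dist = None  # callers must handle the no-distributed case
```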
2025-01-20 17:10:35 +01:00
5f0f4b1b93 Patch moonshine (#35731)
* update expected logits for T4 runners

* update doc

* correct order of the args for better readability

* remove generate wrap

* convert modular
2025-01-20 16:19:29 +01:00
a142f16131 transformers.image_transforms.normalize wrong types (#35773)
transformers.image_transforms.normalize documented and checked the wrong type for the std and mean arguments

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-20 15:00:46 +00:00
3998fa8aab [fix] cannot import name 'Pop2PianoFeatureExtractor' from 'transformers' (#35604)
* update pop2piano __init__

* add lib check

* update fix

* revert
2025-01-20 15:21:45 +01:00
b80e334e71 Skip Falcon 7B GGML Test (#35783)
skip test
2025-01-20 15:00:34 +01:00
68947282fc remove code owners as it was generating too much noise BUT (#35784)
remove code owners
2025-01-20 14:18:03 +01:00
135e86aa54 Remove read_video and run 2025-01-20 13:40:57 +01:00
88b95e6179 [generate] update docstring of SequenceBiasLogitsProcessor (#35699)
* fix docstring

* space
2025-01-20 11:00:15 +00:00
56afd2f488 fix register_buffer in MimiEuclideanCodebook (#35759)
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-20 11:54:58 +01:00
abe57b6f17 Add SuperGlue model (#29886)
* Initial commit with template code generated by transformers-cli

* Multiple additions to SuperGlue implementation :

- Added the SuperGlueConfig
- Added the SuperGlueModel and its implementation
- Added basic weight conversion script
- Added new ImageMatchingOutput dataclass

* Few changes for SuperGlue

* Multiple changes :
- Added keypoint detection config to SuperGlueConfig
- Completed convert_superglue_to_pytorch and successfully ran inference

* Reverted unintentional change

* Multiple changes :
 - Added SuperGlue to a bunch of places
 - Divided SuperGlue into SuperGlueForImageMatching and SuperGlueModel
 - Added testing images

* Moved things in init files

* Added docs (to be finished depending on the final implementation)

* Added necessary imports and some doc

* Removed unnecessary import

* Fixed make fix-copies bug and ran it

* Deleted SuperGlueModel
Fixed convert script

* Added SuperGlueImageProcessor

* Changed SuperGlue to support batching pairs of images and modified ImageMatchingOutput accordingly

* Changed convert_superglue_to_hf.py script to experiment with different ways of reading an image and see their impact on performance

* Added initial tests for SuperGlueImageProcessor

* Added AutoModelForImageMatching in missing places and tests

* Fixed keypoint_detector_output instructions

* Fix style

* Adapted to latest main changes

* Added integration test

* Fixed bugs to pass tests

* Added keypoints returned by keypoint detector in the output of SuperGlue

* Added doc to SuperGlue

* SuperGlue returning all attention and hidden states for a fixed number of keypoints

* Make style

* Changed SuperGlueImageProcessor tests

* Revert "SuperGlue returning all attention and hidden states for a fixed number of keypoints"
Changed tests accordingly

This reverts commit 5b3b669c

* Added back hidden_states and attentions masked outputs with tests

* Renamed ImageMatching occurrences to KeypointMatching

* Changed SuperGlueImageProcessor to raise error when batch_size is not even

* Added docs and clarity to hidden state and attention grouping function

* Fixed some code and done refactoring

* Fixed typo in SuperPoint output doc

* Fixed some of the formatting and variable naming problems

* Removed useless function call

* Removed AutoModelForKeypointMatching

* Fixed SuperGlueImageProcessor to only accept pairs of images

* Added more fixes to SuperGlueImageProcessor

* Simplified the batching of attention and hidden states

* Simplified stack functions

* Moved attention instructions into class

* Removed unused do_batch_norm argument

* Moved weight initialization to the proper place

* Replaced deepcopy for instantiation

* Fixed small bug

* Changed from stevenbucaille to magic-leap repo

* Renamed London Bridge images to Tower Bridge

* Fixed formatting

* Renamed remaining "london" to "tower"

* Apply suggestions from code review

Small changes in the docs

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added AutoModelForKeypointMatching

* Changed images used in example

* Several changes to image_processing_superglue and style

* Fixed resample type hint

* Changed SuperGlueImageProcessor and added test case for list of 2 images

* Changed list_of_tuples implementation

* Fix in dummy objects

* Added normalize_keypoint, log_sinkhorn_iterations and log_optimal_transport docstring

* Added missing docstring

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Moved forward block at bottom

* Added docstring to forward method

* Added docstring to match_image_pair method

* Changed test_model_common_attributes to test_model_get_set_embeddings test method signature

* Removed AutoModelForKeypointMatching

* Removed image fixtures and added load_dataset

* Added padding of images in SuperGlueImageProcessor

* Cleaned up convert_superglue_to_hf script

* Added missing docs and fixed unused argument

* Fixed SuperGlueImageProcessor tests

* Transposed all hidden states from SuperGlue to reflect the standard (..., seq_len, feature_dim) shape

* Added SuperGlueForKeypointMatching back to modeling_auto

* Fixed image processor padding test

* Changed SuperGlue docs

* changes:
 - Abstraction to batch, concat and stack inconsistent tensors
 - Changed conv1d's to linears to match standard attention implementations
 - Renamed all tensors to be tensor0 and not tensor_0 and be consistent
 - Changed match image pair to run keypoint detection on all images first, create batching tensors and then fill these tensors match after match
 - Various changes in docs, etc

* Changes to SuperGlueImageProcessor:
- Reworked the input image pairs checking function and added tests accordingly
- Added Copied from statements
- Added do_grayscale tag (also for SuperPointImageProcessor)
- Misc changes for better code

* Formatting changes

* Reverted conv1d to linear conversion because of numerical differences

* fix: changed some code to be more straightforward (e.g. filtering keypoints) and converted plot from opencv to matplotlib

* fix: removed unnecessary test

* chore: removed commented code and added back hidden states transpositions

* chore: changed from "inconsistent" to "ragged" function names as suggested

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* docs: applied suggestions

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* docs: updated to display matched output

* chore: applied suggestion for check_image_pairs_input function

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* chore: changed check_image_pairs_input function name to validate_and_format_image_pairs and used validate_preprocess_arguments function

* tests: simplified tests for image input format and shapes

* feat: converted SuperGlue's use of Conv1d with kernel_size of 1 with Linear layers. Changed tests and conversion script accordingly

* feat: several changes to address comments

Conversion script:
- Reverted fuse batchnorm to linear conversion
- Changed all 'nn.Module' to respective SuperGlue models
- Changed conversion script to use regex mapping and match other recent scripts

Modeling SuperGlue:
- Added batching with mask and padding to attention
- Removed unnecessary concat, stack and batch ragged pairs functions
- Reverted batchnorm layer
- Renamed query, key, value and merge layers into q, k, v, out proj
- Removed Union of different Module into nn.Module in _init_weights method typehint
- Changed several method's signature to combine image0 and image1 inputs with appropriate doc changes
- Updated SuperGlue's doc with torch.no_grad()

Updated test to reflect changes in SuperGlue model

* refactor: changed validate_and_format_image_pairs function with clarity

* refactor: changed from one SuperGlueMLP class to a list of SuperGlueMLP class

* fix: fixed forgotten init weight change from last commit

* fix: fixed rebase mistake

* fix: removed leftover commented code

* fix: added typehint and changed some of arguments default values

* fix: fixed attribute default values for SuperGlueConfig

* feat: added SuperGlueImageProcessor post process keypoint matching method with tests

* fix: fixed SuperGlue attention and hidden state tuples aggregation

* chore: fixed mask optionality and reordered tensor reshapes to be cleaner

* chore: fixed docs and error message returned in validate_and_format_image_pairs function

* fix: fixed returned keypoints to be the ones that SuperPoint returns

* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue

* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue (bis)

* fix: Changed SuperGlueMultiLayerPerceptron instantiation to avoid if statement

* fix: Changed convert_superglue_to_hf script to reflect latest SuperGlue changes and got rid of nn.Modules

* WIP: implement Attention from an existing class (like BERT)

* docs: Changed docs to include more appealing matching plot

* WIP: Implement Attention

* chore: minor typehint change

* chore: changed convert superglue script by removing all classes and apply conv to linear conversion in state dict + rearrange keys to comply with changes in model's layers organisation

* Revert "Fixed typo in SuperPoint output doc"

This reverts commit 2120390e827f94fcd631c8e5728d9a4980f4a503.

* chore: added comments in SuperGlueImageProcessor

* chore: changed SuperGlue organization HF repo to magic-leap-community

* [run-slow] refactor: small change in layer instantiation

* [run-slow] chore: replaced remaining stevenbucaille org to magic-leap-community

* [run-slow] chore: make style

* chore: update image matching fixture dataset HF repository

* [run-slow] superglue

* tests: overwriting test_batching_equivalence

* [run-slow] superglue

* tests: changed test to cope with value changing depending on cuda version

* [run-slow] superglue

* tests: changed matching_threshold value

* [run-slow] superglue

* [run-slow] superglue

* tests: changed tests for integration

* [run-slow] superglue

* fix: Changed tensor view and permutations to match original implementation results

* fix: updated convert script and integration test to include last change in model

* fix: increase tolerance for CUDA variances

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* [run-slow] superglue

* chore: removed blank whitespaces

* [run-slow] superglue

* Revert SuperPoint image processor accident changes

* [run-slow] superglue

* refactor: reverted copy from BERT class

* tests: lower the tolerance in integration tests for SuperGlue

* [run-slow] superglue

* chore: set do_grayscale to False in SuperPoint and SuperGlue image processors

* [run-slow] superglue

* fix: fixed imports in SuperGlue files

* chore: changed do_grayscale SuperGlueImageProcessing default value to True

* docs: added typehint to post_process_keypoint_matching method in SuperGlueImageProcessor

* fix: set matching_threshold default value to 0.0 instead of 0.2

* feat: added matching_threshold to post_process_keypoint_matching method

* docs: update superglue.md to include matching_threshold parameter

* docs: updated SuperGlueConfig docstring for matching_threshold default value

* refactor: removed unnecessary parameters in SuperGlueConfig

* fix: changed from matching_threshold to threshold

* fix: re-revert changes to make SuperGlue attention classes copies of BERT

* [run-slow] superglue

* fix: added missing device argument in post_processing method

* [run-slow] superglue

* fix: add matches different from -1 to compute valid matches in post_process_keypoint_matching (and docstring)

* fix: add device to image_sizes tensor instantiation

* tests: added checks on do_grayscale test

* chore: reordered and added Optional typehint to KeypointMatchingOutput

* LightGluePR suggestions:
- use `post_process_keypoint_matching` as default docs example
- add `post_process_keypoint_matching` in autodoc
- add `SuperPointConfig` import under TYPE_CHECKING condition
- format SuperGlueConfig docstring
- add device in convert_superglue_to_hf
- Fix typo
- Fix KeypointMatchingOutput docstring
- Removed unnecessary line
- Added missing SuperGlueConfig in __init__ methods

* LightGluePR suggestions:
- use batching to get keypoint detection

* refactor: processing images done in 1 for loop instead of 4

* fix: use @ instead of torch.einsum for scores computation

* style: added #fmt skip to long tensor values

* refactor: rollbacked validate_and_format_image_pairs valid and invalid case to more simple ones

* refactor: prepare_imgs

* refactor: simplified `validate_and_format_image_pairs`

* docs: fixed doc

---------

Co-authored-by: steven <steven.bucaillle@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-20 10:32:39 +00:00
872dfbdd46 [ViTPose] Convert more checkpoints (#35638)
* Convert more checkpoints

* Update docs, convert huge variant

* Update model name

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Remove print statements

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Link to collection

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-20 11:29:47 +01:00
332fa024d6 Security fix for self-comment-ci.yml (#35548)
* Revert "Disable  `.github/workflows/self-comment-ci.yml` for now (#35366)"

This reverts commit ccc4a5a59b2d4134a49971915db0710e7a8c7824.

* fix

* fix

* fix

* least permission

* add env

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-20 11:16:03 +01:00
8571bb145a Fix CI for VLMs (#35690)
* fix some easy test

* more tests

* remove logit check here also

* add require_torch_large_gpu in Emu3
2025-01-20 11:15:39 +01:00
5fa3534475 Use AMD CI workflow defined in hf-workflows (#35058)
* Use AMD CI workflow defined in hf-workflows
2025-01-17 20:52:57 +01:00
7d4b3ddde4 ci: fix xpu skip condition for test_model_parallel_beam_search (#35742)
`return unittest.skip()` used in `test_model_parallel_beam_search` in the
skip condition for xpu did not actually mark the test as skipped when
running under pytest:
* 148 passed, 1 skipped

Other tests use `self.skipTest()`. Reusing this approach and moving the
condition outside the loop (since it does not depend on it) skips
correctly for xpu (see the sketch below):
* 148 skipped

Secondly, `device_map="auto"` is now implemented for XPU for IPEX>=2.5 and
torch>=2.6, so we can now enable these tests for XPU for new IPEX/torch
versions.

Fixes: 1ea3ad1ae ("[tests] use `torch_device` instead of `auto` for model testing (#29531)")

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
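
A sketch of the difference (test and condition are illustrative):

```py
import unittest

class ExampleTest(unittest.TestCase):
    def test_model_parallel_beam_search(self):
        on_xpu = True  # stand-in for the real device check
        # Wrong: unittest.skip() returns a decorator; returning it from a test
        # body raises nothing, so pytest records the test as passed.
        #   if on_xpu:
        #       return unittest.skip("not supported on xpu")
        # Right: self.skipTest() raises SkipTest, so the runner marks it skipped.
        if on_xpu:
            self.skipTest("xpu needs IPEX >= 2.5 and torch >= 2.6 for device_map='auto'")
```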
2025-01-17 16:47:27 +01:00
8ad6bd0f1b Stop mutating input dicts in audio classification pipeline (#35754) 2025-01-17 15:41:56 +00:00
936a731534 Revert "Unable to use MimiModel with DeepSpeed ZeRO-3" (#35755)
Revert "Unable to use `MimiModel` with DeepSpeed ZeRO-3 (#34735)"

This reverts commit 54fd7e92604e8ecb2f4601aae2f75322af9184c5.
2025-01-17 16:29:26 +01:00
10e8cd0d63 Restore is_torch_greater_or_equal_than for backward compatibility (#35734)
* Restore is_torch_greater_or_equal_than for backward compatibility

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* review comments

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-01-17 16:22:44 +01:00
099d93d2e9 Grounding DINO Processor standardization (#34853)
* Add input ids to model output

* Add text preprocessing for processor

* Fix snippet

* Add test for equivalence

* Add type checking guard

* Fixing typehint

* Fix test for added `input_ids` in output

* Add deprecations and "text_labels" to output

* Adjust tests

* Fix test

* Update code examples

* Minor docs and code improvement

* Remove one-liner functions and rename class to CamelCase

* Update docstring

* Fixup
2025-01-17 14:18:16 +00:00
42b2857b01 OmDet Turbo processor standardization (#34937)
* Fix docstring

* Fix docstring

* Add `classes_structure` to model output

* Update omdet postprocessing

* Adjust tests

* Update code example in docs

* Add deprecation to "classes" key in output

* Types, docs

* Fixing test

* Fix missed clip_boxes

* [run-slow] omdet_turbo

* Apply suggestions from code review

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Make CamelCase class

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-01-17 14:10:19 +00:00
94ae9a8da1 OwlViT/Owlv2 post processing standardization (#34929)
* Refactor owlvit post_process_object_detection + add text_labels

* Fix copies in grounding dino

* Sync with Owlv2 postprocessing

* Add post_process_grounded_object_detection method to processor, deprecate post_process_object_detection

* Add test cases

* Move text_labels to processors only

* [run-slow] owlvit owlv2

* [run-slow] owlvit, owlv2

* Update snippets

* Update docs structure

* Update deprecated objects for check_repo

* Update docstring for post processing of image guided object detection
2025-01-17 13:58:28 +00:00
add5f0566c Added liger_kernel compatibility with PeftModel (#35680)
* Added liger_kernel compatibility with `PeftModel`

* Amending based on review comments

* Amending based on review comments
2025-01-17 14:43:20 +01:00
df6d42a914 check is added for the report_to variable in TrainingArguments (#35403)
check for report_to variable is added
2025-01-17 14:39:32 +01:00
54fd7e9260 Unable to use MimiModel with DeepSpeed ZeRO-3 (#34735)
use torch.tensor(), not torch.Tensor()

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
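
The difference in one sketch (the ZeRO-3 interaction aside, torch.tensor() is the supported factory):

```py
import torch

bad = torch.Tensor([1, 2, 3])   # legacy constructor: always float32
good = torch.tensor([1, 2, 3])  # recommended factory: dtype inferred (int64 here)
print(bad.dtype, good.dtype)    # torch.float32 torch.int64
```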
2025-01-17 14:06:20 +01:00
ab1afd56f5 Fix some tests (#35682)
* cohere tests

* glm tests

* cohere2 model name

* create decorator

* update

* fix cohere2 completions

* style

* style

* style

* add cuda in comments
2025-01-17 12:10:43 +00:00
8c1b5d3782 🚨🚨🚨 An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. (#35615)
* An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope and considerably reduce the number of characters searched on every load (see the sketch below).

* Fix fix on load issue

* Fix gamma/beta warning test

* A style complaint

* Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming.

* Habitual elif redundant with the return
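
A minimal sketch of the scoped rename (not the actual transformers code):

```py
def rename_legacy_norm_keys(state_dict: dict) -> dict:
    # Only keys under "LayerNorm." are renamed, so unrelated parameters whose
    # names merely contain "gamma"/"beta" are left untouched.
    renamed = {}
    for key, value in state_dict.items():
        if "LayerNorm." in key:
            key = key.replace("LayerNorm.gamma", "LayerNorm.weight")
            key = key.replace("LayerNorm.beta", "LayerNorm.bias")
        renamed[key] = value
    return renamed
```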
2025-01-16 17:25:44 -08:00
02a492a838 Added resource class configuration option for check_circleci_user job (#32866)
Added resource class configuration option for check_circleci_user job.
2025-01-16 21:31:18 +01:00
94af1c0aa2 [generate] return Cache object even if passed in a legacy format (#35673)
* generate returns a Cache object by default

* fix tests

* fix test for encoder-decoder models
2025-01-16 17:06:24 +00:00
2818307e93 [generate] can instantiate GenerationConfig(cache_implementation="static") (#35679)
fix failing instantiation
2025-01-16 17:04:54 +00:00
aaa969e97d Remove pt_to_tf (#35672)
* rm command

* remove exception
2025-01-16 17:03:37 +00:00
80dbbd103c 🧹 remove generate-related objects and methods scheduled for removal in v4.48 (#35677)
* remove things scheduled for removal

* make fixup
2025-01-16 17:03:20 +00:00
aeeceb9916 [cache] add a test to confirm we can use cache at train time (#35709)
* add test

* augment test as suggested

* Update tests/utils/test_modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* rerun tests

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-16 17:02:34 +00:00
57bf1a12a0 Remove batch size argument warning when unjustified (#35519)
* use max batch size

* revert unneccessary change

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-16 17:48:11 +01:00
91be6a5eb2 Modular: support for importing functions from any file (#35692)
* fix function imports

* improve comment

* Update modeling_switch_function.py

* make checks more robust

* improvement

* rename

* final test update
2025-01-16 16:37:53 +00:00
8ebe9d7166 Optimize ForCausalLMLoss by removing unnecessary contiguous() call to reduce memory overhead (#35646)
Optimize ForCausalLMLoss by removing unnecessary contiguous() calls to reduce memory overhead
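
A rough sketch of the shifted causal-LM loss without the extra copies (not the library's exact function):

```py
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Shift so each position predicts the next token; reshape() copes with the
    # non-contiguous slices itself, so no explicit .contiguous() copy is needed.
    shift_logits = logits[..., :-1, :]
    shift_labels = labels[..., 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )

loss = causal_lm_loss(torch.randn(2, 5, 10), torch.randint(0, 10, (2, 5)))
```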
2025-01-16 15:47:43 +00:00
1302c32a84 Add proper jinja2 error (#35533)
* Cleanup jinja2 imports

* Raise a proper error if Jinja is missing

* make fixup
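
A minimal sketch of the improved failure mode (helper name and message are illustrative):

```py
try:
    import jinja2
except ImportError:
    jinja2 = None

def compile_chat_template(template: str):
    # Raise a clear ImportError at the point Jinja2 is actually needed,
    # instead of an opaque NameError later.
    if jinja2 is None:
        raise ImportError("Rendering chat templates requires jinja2: `pip install jinja2`.")
    return jinja2.Environment().from_string(template)
```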
2025-01-16 15:31:11 +00:00
3292e96a4f [generation] fix type hint (#35725)
fix type hint
2025-01-16 15:09:59 +00:00
8b78d9d6e7 Fix the bug that Trainer cannot correctly call torch_jit_model_eval (#35722)
Fix the bug where accelerator.autocast does not pass parameters correctly when calling torch_jit_model_eval (#35706)
2025-01-16 15:53:37 +01:00
2cbcc5877d Fix condition when GA loss bug fix is not performed (#35651)
* fix condition when GA loss bug fix is not performed

* max loss diff is 2.29

* fix typo

* add an extra validation that loss should not vary too much
2025-01-16 13:59:53 +01:00
fd4f14c968 Fix: Falcon tie_word_embeddings in GGUF (#35715)
* fix falcon tie_word_embeddings

* fix style
2025-01-16 13:18:22 +01:00
bef7dded22 Replace deprecated batch_size with max_batch_size when using HybridCache (#35498)
* Replace deprecated batch_size with max_batch_size

- Functionality remains the same, because the property getter batch_size(self) returned max_batch_size anyway (sketched below).
- This change just avoids an unnecessary warning about deprecation.

* Use max_batch_size instead of deprecated batch_size with HybridCache

* Use max_batch_size instead of deprecated batch_size with HybridCache

- Change generated code to match original source
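
A sketch of the deprecation pattern described above (class name is illustrative):

```py
import warnings

class HybridCacheSketch:
    def __init__(self, max_batch_size: int):
        self.max_batch_size = max_batch_size

    @property
    def batch_size(self) -> int:
        # Legacy alias: same value, but reading it warns.
        warnings.warn("`batch_size` is deprecated, use `max_batch_size`.", FutureWarning)
        return self.max_batch_size

cache = HybridCacheSketch(max_batch_size=4)  # preferred spelling, no warning
```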
2025-01-16 11:48:41 +00:00
99e0ab6ed8 Fix typo in /docs/source/ja/model_doc/decision_transformer.md URL (#35705)
doc: Update original code repository URL
2025-01-15 07:36:50 -08:00
12dfd99007 Fix : Nemotron Processor in GGUF conversion (#35708)
* fixing nemotron processor

* make style
2025-01-15 14:25:44 +01:00
387663e571 Enable gptqmodel (#35012)
* gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update readme

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix warning

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix version check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert unrelated changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix requires gptq

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format again

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attributes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix memory check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix result check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-15 14:22:49 +01:00
615bf9c5e4 Add future import for Py < 3.10 (#35666)
* Add future import for Py < 3.10

* make fixup

* Same issue in convert_olmo2_weights_to_hf.py
2025-01-15 12:45:43 +00:00
09d5f76274 Clean-up composite configs (#34603)
* remove manual assignment tie-word-embeddings

* remove another unused attribute

* fix tests

* fix tests

* remove unnecessary overwrites

* fix

* decoder=True

* clean pix2struct

* run-all

* forgot `_tied_weights_keys` when adding Emu3

* also Aria + fix-copies

* and clean aria
2025-01-15 10:04:07 +01:00
c61fcde910 Enhance DataCollatorForLanguageModeling with Configurable Token Replacement Probabilities (#35251)
* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over token masking and replacing

* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over token masking and replacing

* Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling
2025-01-14 17:01:10 +00:00
b0cdbd9119 Enhanced Installation Section in README.md (#35094)
* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.

* Update README.md

Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.

* Update installation.md

Updated installation.md to include virtual environment and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting and GPU setup instructions.

* Update installation.md

* Update installation.md

* Update installation.md

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update README.md

Removed numbering from README.md.

* Update README.md

Removed unnecessary "a)" formatting as per maintainer feedback.

* Update README.md

Added blank lines around code snippets for better readability.

* Update README.md

Removed the line "b) Install a backend framework:" from README.md as per feedback.

* Update README.md

Simplified "For Windows:" to "Windows" in README.md as per feedback as well as "For macOS/Linux:" to "macOS/Linux"

* Update README.md

Removed unnecessary heading and retained valid code snippet.

* Update README.md

Removed unnecessary heading "d) Optional: Install from source for the latest updates" as per feedback.

* Update README.md

Removed "GPU Setup (Optional)" section to align with minimal design feedback.

* Update installation.md

Removed "Create and Activate a Virtual Environment" section from installation.md as per feedback.

* Update installation.md

Adjusted "Troubleshooting" to a second-level heading and added an introductory line as per feedback.

* Update installation.md

Updated troubleshooting section with simplified headings and formatted code blocks as per feedback.

* Update installation.md

Integrated GPU setup instructions into the "Install with pip" section for better content flow.

* Update README.md

Removed Troubleshooting section from README.md for minimalism as per maintainer feedback.
2025-01-14 08:05:08 -08:00
a11041ffad Fix : add require_read_token for gemma2 gated model (#35687)
fix gemma2 gated model test
2025-01-14 11:47:05 +01:00
df2a812e95 Fix expected output for ggml test (#35686)
fix expected output
2025-01-14 11:46:55 +01:00
050636518a Fix : HQQ config when hqq not available (#35655)
* fix

* make style

* adding require_hqq

* make style
2025-01-14 11:37:37 +01:00
715fdd6459 Update torchao.md: use auto-compilation (#35490)
* Update torchao.md: use auto-compilation

* Update torchao.md: indicate updating transformers to the latest

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-01-14 11:33:48 +01:00
4b8d1f7fca Fix : adding einops lib in the CI docker for some bitsandbytes tests (#35652)
* fix docker

* fix
2025-01-14 07:36:10 +01:00
34f76bb62b Fix zero_shot_image_classification documentation guide link in SigLIP (#35671) 2025-01-13 11:08:17 -08:00
c23a1c1932 Add-helium (#35669)
* Add the helium model.

* Add a missing helium.

* And add another missing helium.

* Use float for the rmsnorm mul.

* Add the Helium tokenizer converter.

* Add the pad token as suggested by Arthur.

* Update the RMSNorm + some other tweaks.

* Fix more rebase issues.

* fix copies and style

* fixes and add helium.md

* add missing tests

* update the backlink

* oups

* style

* update init, and expected results

* small fixes

* match test outputs

* style fixup, fix doc builder

* add dummies and we should be good to go!

* update sdpa and fa2 documentation

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-01-13 18:41:15 +01:00
a3f82328ed [i18n-ar] Translated file : docs/source/ar/tasks/token_classification.md into Arabic (#35193)
* Create token_classification.md

* Update token_classification.md

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-13 09:32:15 -08:00
2fa876d2d8 [tests] make cuda-only tests device-agnostic (#35607)
* intial commit

* remove unrelated files

* further remove

* Update test_trainer.py

* fix style
2025-01-13 14:48:39 +01:00
e6f9b03464 [Compile] Only test compiling model forward pass (#35658)
* rename test to only compile forward!

* style emu
2025-01-13 13:43:29 +01:00
84a6789145 Enable different torch dtype in sub models (#34873)
* fix

* fix test

* add tests

* add more tests

* fix tests

* supposed to be a torch.dtype test

* handle BC and make fp32 default
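
A hedged sketch of the feature as described: torch_dtype may be a dict keyed by sub-config name, with "" covering the remaining top-level weights. The checkpoint and keys below are placeholders; valid keys depend on the model's composite config.

```py
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "some/vision-language-checkpoint",  # placeholder
    torch_dtype={"text_config": torch.bfloat16, "vision_config": torch.float16, "": torch.float32},
)
```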
2025-01-13 13:42:08 +01:00
87089176d9 [Phi] bias should be True (#35650)
bias should be True
2025-01-13 13:15:07 +01:00
91f14f1fc4 Removed some duplicated code (#35637)
* Removed duplicate class field definition.

* Removed duplicate code in try-except block.

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-01-13 12:34:21 +01:00
b8c34d97fc Fix whisper compile (#35413)
Fix compile error

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-13 11:31:51 +01:00
cd44bdb4b8 Fix device in rope module when using dynamic updates (#35608)
fix rope device
2025-01-13 10:11:17 +01:00
15bd3e61f8 Update codeowners with individual model owners (#35595)
* Update codeowners with individual model owners

* rip yoach

* add comment

* Replace - with _

* Add @qubvel for zero-shot object-detection

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add yoni for omdet-turbo

* Update CODEOWNERS

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Refactor / comment the CODEOWNERS file

* Capture modular files as well

* Add dummies without owner

* More cleanup

* Set Niels on a few more models that he added

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-01-10 17:59:36 +00:00
1e3c6c1f7d Skip MobileNetV1ModelTest::test_batching_equivalence for now (#35614)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 18:32:36 +01:00
04eae987f3 Fix flaky test_beam_search_low_memory (#35611)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 17:31:03 +01:00
b02828e4af Let EarlyStoppingCallback not require load_best_model_at_end (#35101)
* Bookmark

* Add warning
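
Usage sketch (argument values are illustrative); early stopping no longer hard-requires load_best_model_at_end=True, it only warns:

```py
from transformers import EarlyStoppingCallback, TrainingArguments

callback = EarlyStoppingCallback(early_stopping_patience=3, early_stopping_threshold=0.0)
args = TrainingArguments(output_dir="out", eval_strategy="epoch", metric_for_best_model="eval_loss")
# trainer = Trainer(model=..., args=args, callbacks=[callback])  # wire into a Trainer as usual
```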
2025-01-10 10:25:32 -05:00
0aaf124fb9 Added error when sequence length is bigger than max_position_embeddings (#32156)
* Added error when sequence length is bigger than max_position_embeddings

* Fixed formatting

* Fixed bug

* Changed copies to match

* Fixed bug

* Applied suggestions

* Removed redundant code

* Fixed bugs

* Bug fix

* Bug fix

* Added requested Changes

* Fixed bug

* Fixed unwanted change

* Fixed unwanated changes

* Fixed formatting
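
A minimal sketch of the validation this adds (error text is illustrative):

```py
def check_sequence_length(input_ids, max_position_embeddings: int) -> None:
    seq_len = input_ids.shape[-1]
    if seq_len > max_position_embeddings:
        raise ValueError(
            f"Sequence length ({seq_len}) exceeds the model's "
            f"max_position_embeddings ({max_position_embeddings})."
        )
```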
2025-01-10 15:23:54 +00:00
1211e616a4 Use inherit tempdir makers for tests + fix failing DS tests (#35600)
* Use existing APIs to make tempdir folders

* Fixup deepspeed too

* output_dir -> tmp_dir
2025-01-10 10:01:58 -05:00
bbc00046b9 Fix flaky test_custom_4d_attention_mask (#35606)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 15:40:04 +01:00
f63829c87b v4.49.0-dev 2025-01-10 12:31:11 +01:00
52e1f87c7d [WIP] Emu3: add model (#33770)
* model can convert to HF and be loaded back

* nit

* works in single batch generation but hallucinates

* use the image tokens

* add image generation

* now it works

* add tests

* update

* add modular but it doesn't work for porting docstrings :(

* skip some tests

* add slow tests

* modular removed the import?

* guess this works

* update

* update

* fix copies

* fix test

* fix copies

* update

* docs

* fix tests

* last fix tests?

* pls

* repo consistency

* more style

* style

* remove file

* address comments

* tiny bits

* update after the new modular

* fix tests

* add one more cond in check attributes

* decompose down/up/mid blocks

* allow static cache generation in VLMs

* nit

* fix copies

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix VAE upsampling

* Update src/transformers/models/emu3/modular_emu3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* state overwritten stuff explicitly

* fix copies

* add the flag for flex attn

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-10 12:23:00 +01:00
ccc0381d36 Fix flex_attention in training mode (#35605)
* fix flex

* add test

* style
2025-01-10 11:49:12 +01:00
a9bd1e6284 Remove benchmark.py after #34275 2025-01-10 11:09:06 +01:00
e0646f3dce Chat template: return vectorized output in processors (#34275)
* update chat template

* style

* fix tests

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typehints + docs

* fix tests

* remove unnecessary warnings

* forgot code style :(

* allow users to pass backend and num frames

* Update docs/source/en/chat_templating.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typo fix

* style

* address comments

* align with "pipeline" template

* update docs

* update docs

* unpack for all kwargs?

* wrong conflict resolution while rebasing

* tmp

* update docs

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-10 11:05:29 +01:00
5f087d1335 Add Moonshine (#34784)
* config draft

* full encoder forward

* full decoder forward

* fix sdpa and FA2

* fix sdpa and FA2

* moonshine model

* moonshine model forward

* fix attention with past_key_values

* add MoonshineForConditionalGeneration

* fix cache handling and causality for cross attention

* no causal attention mask for the encoder

* model addition (imports etc)

* small nit

* nits

* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* add rope_theta

* nits

* model doc

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* imports

* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES

* updates modular

* make

* make fix-copies

* ruff check examples fix

* fix check_modular_conversion

* nit

* nits

* nits

* copied from -> imports

* imports fix

* integrate attention refacto

* modular edge case

* remove encoder

* convolutions params in config

* run modular_model_converter

* make

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* MoonshineModelTest

* correct typo

* make style

* integration tests

* make

* modular convert

* name conversion update (up_proj -> fc1 etc)

* update config

* update MLP

* update attention

* update encoder layer

* update decoder layer

* update convolutions parameters

* update encoder

* remove INPUTS_DOCSTRING

* update decoder

* update conditional generation

* update pretrained model

* imports

* modular converted

* update doc

* fix

* typo

* update doc

* update license

* update init

* split config in file

* two classes for MLP

* attention from GLM

* from GlmRotaryEmbedding

* split MLP

* apply arthur's review suggestions

* apply arthur's review suggestions

* apply arthur's review suggestions

* auto feature extractor

* convert modular

* fix + make

* convert modular

* make

* unsplit config

* use correct checkpoint

* wrap generate

* update tests

* typos

* make

* typo

* update doc

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-01-10 11:00:54 +01:00
6f127d3f81 Skip torchscript tests if a cache object is in model's outputs (#35596)
* fix 1

* fix 1

* comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 10:46:03 +01:00
6b73ee8905 ModernBert: reuse GemmaRotaryEmbedding via modular + Integration tests (#35459)
* Introduce 5 integration tests for the 4 model classes + torch export

* ModernBert: reuse GemmaRotaryEmbedding via modular

* Revert #35589, keep rope_kwargs; rely on them in modular_modernbert

* Revert "Revert #35589, keep rope_kwargs; rely on them in modular_modernbert"

This reverts commit 11b44b9ee83e199cbfb7c5ba2d11f7a7fdbba2d3.

* Don't set rope_kwargs; override 'self.rope_init_fn' call instead
2025-01-10 10:25:10 +01:00
8de7b1ba8d Add flex_attn to diffllama (#35601)
Add sdpa to diffllama
2025-01-09 20:49:11 +01:00
1e3ddcb2d0 ModernBERT bug fixes (#35404)
* bug fixes

* organize imports

* wrap cpu warning in reference_compile

* Avoid needing repad_logits_with_grad, always repad with grads when training

I'm not 100% sure that the conditional with "or labels is None" makes sense though - not sure what the intention is there. Perhaps we can remove that?

* Revert "Avoid needing repad_logits_with_grad, always repad with grads when training"

This reverts commit cedcb4e89bcea199a1135a0933e71f534b656239.

* Fix grammar: keep -> keeps

* Propagate grammar fix with modular_model_converter

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
2025-01-09 20:15:38 +01:00
e97d7a5be5 add _supports_flex_attn = True for models that do support it (#35598)
* add `_supports_flex_attn = True`

* fix repo consistency
2025-01-09 20:03:33 +01:00
c9c682d19c [doc] deepspeed universal checkpoint (#35015)
* universal checkpoint

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-09 09:50:51 -08:00
3a4ae6eace Refactor/fix Cohere2 (#35594)
* refactor/fix cohere2

* add kwargs

* tests

* remove func and import it
2025-01-09 17:54:57 +01:00
32e0db8a69 [tokenizers] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer (#35593)
* Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer

in PreTrainedTokenizerFast, rather than relying on subclasses to take care of this.

* Simplify setting self.add_prefix_space, ensure pre_tok exists

* Wrap in try-except to catch 'Custom PreTokenizer cannot be serialized'

862d1a346a/bindings/python/src/pre_tokenizers.rs (L672) produces the exception. It is triggered by the roformer tests, as RoFormerTokenizerFast uses a custom PreTokenizer.

* Propagate add_prefix_space in T5TokenizerFast to superclass
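
What "propagated" means in practice, sketched with GPT-2's ByteLevel pre-tokenizer:

```py
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)
# The Python-side kwarg now reaches the Rust backend's pre-tokenizer:
print(tok.backend_tokenizer.pre_tokenizer.add_prefix_space)  # True
```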
2025-01-09 17:46:50 +01:00
46276f9a7f Fix modular edge case + modular sorting order (#35562)
* look-ahead negation

* re add examples by default

* Fix the bug in topological sort

* Update create_dependency_mapping.py

* start adding test

* finalize test

* more tests

* style

* style
2025-01-09 17:17:52 +01:00
d3fe9fa3fe PR for Issue #22694: Fixed Training Evaluation table display for VSCode (#35557) 2025-01-09 15:05:47 +00:00
395b114bd1 Small fix rope kwargs (#35589)
* don't know why this keeps popping up?

* remove unused rope_kwargs
2025-01-09 15:40:36 +01:00
82dd6c14bb Fix flaky SwitchTransformersModelTest::test_training_gradient (#35587)
* fix

* Update tests/models/switch_transformers/test_modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-09 15:36:22 +01:00
eb4579cf43 tokenizer train from iterator without pre_tokenizers (#35396)
* fix if else issues

* add a test

* fix the test

* style
2025-01-09 15:34:43 +01:00
320512df46 feat: add TP plan for granite (#35573)
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
2025-01-09 15:25:55 +01:00
633da1b10e [Idefics3] Move image features to same device as input embeds (#35100)
* [Idefics3] Move image features to same device as input embeds

* Update src/transformers/models/idefics3/modeling_idefics3.py

* make style

---------

Co-authored-by: Saif Rehman Nasir <shyshin@github.com>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-09 14:25:36 +01:00
832c6191ed Add inputs_embeds param to ModernBertModel (#35373)
* update modular_modernbert -- add inputs_embeds param to ModernBertModel

* Fix implementation issues; extend to other classes; docstring

First of all, the inputs_embeds shouldn't fully replace `self.embeddings(input_ids)`, because this call also does layer normalization and dropout. So, now both input_ids and inputs_embeds are passed to the ModernBertEmbeddings, much like how BertEmbeddings is implemented.

I also added `inputs_embeds` to the docstring, and propagated the changes to the other model classes.

I also introduced an error if input_ids and inputs_embeds are both or neither provided.

Lastly, I fixed an issue with device being based solely on input_ids with attention_mask.

* Propagate inputs_embeds to ModernBertForMaskedLM correctly

Also reintroduce inputs_embeds test

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2025-01-09 14:17:26 +01:00
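
The point made in this PR - that `inputs_embeds` must still pass through the embedding layer's normalization and dropout rather than bypass it - can be sketched like this (the class and sizes are illustrative, not the real ModernBERT module):

```python
import torch
from torch import nn


class EmbeddingsSketch(nn.Module):
    def __init__(self, vocab_size: int = 32000, hidden_size: int = 768, dropout: float = 0.1):
        super().__init__()
        self.tok_embeddings = nn.Embedding(vocab_size, hidden_size)
        self.norm = nn.LayerNorm(hidden_size)
        self.drop = nn.Dropout(dropout)

    def forward(self, input_ids=None, inputs_embeds=None):
        # Exactly one of the two inputs must be given, as the PR enforces.
        if (input_ids is None) == (inputs_embeds is None):
            raise ValueError("Specify exactly one of input_ids or inputs_embeds")
        hidden = inputs_embeds if inputs_embeds is not None else self.tok_embeddings(input_ids)
        # Norm and dropout apply in both cases, so inputs_embeds is not a shortcut.
        return self.drop(self.norm(hidden))
```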
1b2f942af7 Fix flaky test_batching_equivalence (#35564)
* yes!

* oh no!!!

* oh no!!!

* style

* oh no!!!

* oh no!!!

* oh no!!!

* oh no!!!

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-09 14:00:08 +01:00
4adc415b6d Setup loss_type in config at model init time (#34616)
* setup loss_type in config at model init time

ensures no additional graph break introduced when torch.compile'ed

fixes #34615

Signed-off-by: ChanderG <mail@chandergovind.org>

* lookup loss mapping at init time instead of manual setup

Signed-off-by: ChanderG <mail@chandergovind.org>

* remove redundant lookup at loss_function time

* overwride losstype at init time

---------

Signed-off-by: ChanderG <mail@chandergovind.org>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-01-09 13:32:21 +01:00
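
The reasoning here is that resolving the loss function from the class name once at `__init__` time removes string matching from the forward pass, which is what introduced the extra graph break under `torch.compile`. A rough sketch of the idea, with stand-in names rather than the actual transformers internals:

```python
import re

# Stand-in for the real loss-function registry in transformers.
LOSS_MAPPING = {
    "ForCausalLM": "causal_lm_loss",
    "ForSequenceClassification": "sequence_classification_loss",
}


class ModelSketch:
    def __init__(self, config):
        # One-time lookup: match the task suffix of the class name against the
        # registry so forward() never has to do this while being traced.
        suffixes = re.findall(r"For[A-Za-z]+$", self.__class__.__name__)
        if suffixes and suffixes[0] in LOSS_MAPPING:
            config.loss_type = suffixes[0]
        else:
            config.loss_type = "ForCausalLM"  # default fallback
        self.config = config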
c8ab6ce6ce Re-add missing __all__ for Cohere and Phi3 (#35578)
re-add missing __all__
2025-01-09 11:29:31 +01:00
487c31a21f Minor fix in video text 2 text docs (#35546)
minor fix in docs
2025-01-09 11:20:36 +01:00
965a2fb320 More model refactoring! (#35359)
* cohere

* style

* phi3

* style

* small fix

* small fix

* phi3 longrope

* oups

* Update rope (only for phi3 still)

* Update test_modeling_rope_utils.py

* Update modeling_phi3.py

* fix

* fix copies

* style

* Fix copied from bad renaming
2025-01-09 11:09:09 +01:00
137965ca7d Don't show warning for inv_freq buffers (#35255)
dont show warning
2025-01-09 10:46:01 +01:00
8cad65a698 Fix multi-gpu loss (#35395)
push to device
2025-01-09 10:14:31 +01:00
2e2f8015c0 update code owners (#35576)
update
2025-01-09 09:55:41 +01:00
a6256ec098 [i18n-ar] Translated file: docs/source/ar/tasks/multiple_choice.md into Arabic (#35199)
* إضافة الترجمة العربية: multiple_choice.md

* Update multiple_choice.md

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Add files via upload

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-08 14:17:58 -08:00
b32938aeee Fix all output_dir in test_trainer.py to use tmp_dir (#35266)
* update codecarbon

* replace directly-specified-test-dirs with tmp_dir

* pass tmp_dir to all get_regression_trainer

* test_trainer.py: Use tmp_dir consistently for all output_dir arguments

* fix some with...as tmp_dir blocks

* reflect the comments to improve test_trainer.py

* refresh .gitignore
2025-01-08 19:44:39 +01:00
76da6ca034 Pipeline: simple API for assisted generation (#34504)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:08:02 +00:00
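
A hedged usage sketch for the assisted-generation pipeline API this PR introduces; the checkpoints are illustrative and the keyword name `assistant_model` is inferred from the feature, so double-check it against the released docs:

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",   # main model (example)
    assistant_model="meta-llama/Llama-3.2-1B",  # small draft model (example)
)
# The draft model proposes tokens that the main model verifies in one pass,
# which usually lowers latency without changing the output distribution.
print(pipe("Speculative decoding works by", max_new_tokens=40)[0]["generated_text"])
```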
3f483beab9 [PixtralLarge] Update Pixtral conversion script to support large format! (#34801)
* update conversion script

* update for bias again

* remove pdv

* use my dir

* Update how we initialize the tokenizer

* Convert in bfloat16

* Undo that one again

* fix config dump

* .to() was broken for BatchMixFeature

* quick debug breakpoint

* put the breakpoint in the right place

* Add a config flag for the multimodal projector bias

* Add a config flag for the multimodal projector bias

* Conversion script can load chat templates

* Indent config for comparison

* Stop clobbering the config

* Re-enable the config clobber

* Get rid of the config manual save - it has no effect!

* Handle adapter bias correctly

* Default vision transformer activation to silu

* Remove legacy processing path

* One commit with all the debug breakpoints before I delete them all, in case I need to revert

* Update conversion

* Remove vLLM debugging instrumentation

* Drop xformers

* Remove debug enumerates

* make fixup

* make fixup

* Break copied from in pixtral

* Propagate multimodal_projector_bias change

* Propagate multimodal_projector_bias change

* Remove debug device .to()

* Restore attention weights output

* Fix Pixtral test

* Drop image_seq_length

* Drop image_seq_length

* Put the legacy processing code back

* Add the bias option to the llava_next_video config

* Add the bias option to the llava_next_video config

* Make certain args required in converter

* Make certain args required in converter

* typo

* make fixup

* Reverting some dtype changes since it seems to work without them

---------

Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-166-244.ec2.internal>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:39:47 +01:00
4c2c12b3de [docs] Remove Hiera from AUDIO MODELS in docs (#35544)
Remove Hiera from AUDIO MODELS

Hiera is a visual model and should not appear in audio model...
2025-01-08 16:33:21 +00:00
854dc7941b overwrite top_k when creating audio classification pipeline (#35541)
* overwrite top_k when creating audio classification pipeline

* Update src/transformers/pipelines/audio_classification.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 16:32:27 +00:00
8c555ca3d7 add code owners (#35528)
* add co owners

* normal processing

* /src/transformers/models/*/*_modeling*

* Update CODEOWNERS

* Update CODEOWNERS

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* nit

* Apply suggestions from code review

Co-authored-by: Alvaro Moran <6949769+tengomucho@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update CODEOWNERS

* rather put `@Rocketknight1`

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Alvaro Moran <6949769+tengomucho@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
2025-01-08 17:14:44 +01:00
8490d3159c Add ViTPose (#30530)
* First draft

* Make fixup

* Make forward pass work

* Improve code

* More improvements

* More improvements

* Make predictions match

* More improvements

* Improve image processor

* Fix model tests

* Add classic decoder

* Convert classic decoder

* Verify image processor

* Fix classic decoder logits

* Clean up

* Add post_process_pose_estimation

* Improve post_process_pose_estimation

* Use AutoBackbone

* Add support for MoE models

* Fix tests, improve num_experts

* Improve variable names

* Make fixup

* More improvements

* Improve post_process_pose_estimation

* Compute centers and scales

* Improve postprocessing

* More improvements

* Fix ViTPoseBackbone tests

* Add docstrings, fix image processor tests

* Update index

* Use is_cv2_available

* Add model to toctree

* Add cv2 to doc tests

* Remove script

* Improve conversion script

* Add coco_to_pascal_voc

* Add box_to_center_and_scale to image_transforms

* Update tests

* Add integration test

* Fix merge

* Address comments

* Replace numpy by pytorch, improve docstrings

* Remove get_input_embeddings

* Address comments

* Move coco_to_pascal_voc

* Address comment

* Fix style

* Address comments

* Fix test

* Address comment

* Remove udp

* Remove comment

* [WIP] need to check if the numpy function is the same as cv2

* add scipy affine_transform

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* refactor convert

* add output_shape

* add atol 5e-2

* Use hf_hub_download in conversion script

* make box_to_center more applicable

* skip test_get_set_embedding

* fix to accept array and fix CI

* add co-contributor

* make it to tensor type output

* add torch

* change to torch tensor

* add more test

* minor change

* CI test change

* import torch should be above ImageProcessor

* make style

* try not use torch in def

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/configuration_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix

* fix

* add caution

* make more detail about dataset_index

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* add docs

* Update docs/source/en/model_doc/vitpose.md

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Revert "Update src/transformers/__init__.py"

This reverts commit 7ffa504450bb9dbccf9c7ea668441b98a1939d5c.

* change name

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/vitpose/test_modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* move vitpose only function to image_processor

* raise valueerror when using timm backbone

* use out_indices

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove camel-case of def flip_back

* rename vitposeEstimatorOutput

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix confused camelcase of MLP

* remove in-place logic

* clear scale description

* make consistent batch format

* docs update

* formatting docstring

* add batch tests

* test docs change

* Update src/transformers/models/vitpose/image_processing_vitpose.py

* Update src/transformers/models/vitpose/configuration_vitpose.py

* change ViT to Vit

* change to enable MoE

* make fix-copies

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* extract udp

* add more described docs

* simple fix

* change to accept target_size

* make style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change to `verify_backbone_config_arguments`

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove unnecessary copy

* make config immutable

* enable gradient checkpointing

* update inappropriate docstring

* linting docs

* split function for visibility

* make style

* check isinstances

* change to acceptable use_pretrained_backbone

* make style

* remove copy in docs

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* simple fix + make style

* change input config of activation function to string

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* tmp docs

* delete index.md

* make fix-copies

* simple fix

* change conversion to sam2/mllama style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* refactor convert

* add supervision

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* remove redundant def

* separate code block for visualization

* add validation for num_moe

* final commit

* add labels

* [run-slow] vitpose, vitpose_backbone

* Update src/transformers/models/vitpose/convert_vitpose_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* enable all conversion

* final commit

* [run-slow] vitpose, vitpose_backbone

* ruff check --fix

* [run-slow] vitpose, vitpose_backbone

* rename split module

* [run-slow] vitpose, vitpose_backbone

* fix pos_embed

* Simplify init

* Revert "fix pos_embed"

This reverts commit 2c56a4806e30bc9b5753b142fa04b913306c54ff.

* refactor single loop

* allow flag to enable custom model

* efficiency of MoE to not use unused experts

* make style

* Fix range -> arange to avoid warning

* Revert MoE router, the new one does not work

* Fix postprocessing a bit (labels)

* Fix type hint

* Fix docs snippets

* Fix links to checkpoints

* Fix checkpoints in tests

* Fix test

* Add image to docs

---------

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 16:02:14 +00:00
4349a0e401 fix: Qwen2-VL generate with inputs_embeds (#35466)
* fix: Qwen2-VL generate with inputs_embeds

* change: optional input_ids in get_rope_index
2025-01-08 16:36:03 +01:00
88e18b3c63 Update doc for metric_for_best_model when save_strategy="best". (#35389)
* Updated docstring for _determine_best_metric.

* Updated docstring for metric_for_best_model.

* Added test case for save strategy.

* Updated incorrect test case.

* Changed eval_strategy to match save_strategy.

* Separated test cases for metric.

* Allow load_best_model when save_strategy == "best".

* Updated docstring for metric_for_best_model.
2025-01-08 16:32:35 +01:00
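
A configuration sketch of what this change enables; the argument values are illustrative. With `save_strategy="best"`, a checkpoint is only written when `metric_for_best_model` improves, so the evaluation and save settings have to be consistent (which is what the test updates above address):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    eval_strategy="epoch",             # evaluation must run for "best" to be decidable
    save_strategy="best",              # only save when the tracked metric improves
    metric_for_best_model="eval_loss",
    greater_is_better=False,           # lower eval_loss is better
    load_best_model_at_end=True,       # now allowed together with save_strategy="best"
)
```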
29e74b7cbc Add: num_additional_image_tokens to models (#35052)
* Add: num_additional_image_tokens to models

* docs: update docstring for num_additional_image_tokens in configuration files

* Add num_additional_image_tokens to LlavaNextVideo model and update feature selection logic

* revert

* Fix: adjust num_image_tokens calculation in LlavaProcessor

* Remove num_additional_image_tokens initialization from configuration files

* Fix test error

* revert

* Fix: adjust num_image_tokens calculation in LlavaNextVideoProcessor

* fix conflict

* Fix: adjust num_image_tokens calculation in VideoLlavaProcessor

* make style

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-08 16:20:01 +01:00
657bb14f98 Enable auto task for timm models in pipeline (#35531)
* Enable auto task for timm models

* Add pipeline test
2025-01-08 15:14:17 +00:00
1a6c1d3a9a Bump torch requirement to >= 2 (#35479)
Bump torch requirement, follow-up of #35358
2025-01-08 15:59:32 +01:00
59e5b3f01b Timm wrapper label names (#35553)
* Add timm wrapper label names mapping

* Add index to classification pipeline

* Revert adding index for pipelines

* Add custom model check for loading timm labels

* Add tests for labels

* [run-slow] timm_wrapper

* Add note regarding label2id mapping
2025-01-08 14:09:46 +00:00
f1639ea51d Update missing model error message (#35370)
* Update missing model error message

* Update missing model error message

* Update missing model error message

* Fix capitalization
2025-01-08 15:05:06 +01:00
bd39b0627b Update doc and default value of TextNetImageProcessor (#35563)
update doc and default value
2025-01-08 13:47:52 +00:00
651cfb400f Add support for modular with fast image processors (#35379)
* Add support for modular with fast image processors

* fix order and remove copied from

* add comment for "image_processing*_fast"
2025-01-08 08:37:57 -05:00
430d3d43a5 [Docs] links to logits-processor-zoo (#35552)
links to logits-processor-zoo
2025-01-08 13:36:30 +00:00
3c1895aa65 Fix Qwen2VL processor to handle odd number of frames (#35431)
* fix: processing odd number of frames

* feat: add test case

* update: test one frame

* feat: support custom patch size

* fix: test with videos

* revert: change on patch repeat

* fix: much wow

* update: fixups

* fixup pls

* ruff fixup

* fix typo at least
2025-01-08 13:49:00 +01:00
3fde88b19d support chat generator as input of TextGenerationPipeline (#35551)
* support chat generator as input of TextGenerationPipeline

* missing import

* fix tests

* again

* simpler

* add test
2025-01-08 13:27:07 +01:00
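
A usage sketch of the generator input this PR supports; the checkpoint is an example. Instead of materializing every conversation up front, a generator of chats can now be streamed into the pipeline:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-135M-Instruct")


def chats():
    # Each yielded item is one chat: a list of role/content messages.
    for question in ("What is a tokenizer?", "What is a pipeline?"):
        yield [{"role": "user", "content": question}]


for output in pipe(chats(), max_new_tokens=32):
    print(output)
```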
ebdd1ad400 Pass correct num_items_in_batch value into the training_step function (#35438)
pass correct `num_items_in_batch` to compute_loss
2025-01-08 13:16:03 +01:00
0e0516c119 MODERNBERT_INPUTS_DOCSTRING: past_key_values are ignored (#35513)
* MODERNBERT_INPUTS_DOCSTRING: past_key_values are ignored

* sync to modular_modernbert.py
2025-01-08 11:45:40 +01:00
d1681ec2b6 VLMs: major clean up 🧼 (#34502)
only lllava models are modified
2025-01-08 10:35:23 +01:00
7176e06b52 Add TextNet (#34979)
* WIP

* Add config and modeling for Fast model

* Refactor modeling and add tests

* More changes

* WIP

* Add tests

* Add conversion script

* Add conversion scripts, integration tests, image processor

* Fix style and copies

* Add fast model to init

* Add fast model in docs and other places

* Fix import of cv2

* Rename image processing method

* Fix build

* Fix Build

* fix style and fix copies

* Fix build

* Fix build

* Fix Build

* Clean up docstrings

* Fix Build

* Fix Build

* Fix Build

* Fix build

* Add test for image_processing_fast and add documentation tests

* some refactorings

* Fix failing tests

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Introduce TextNet

* Fix failures

* Refactor textnet model

* Fix failures

* Add cv2 to setup

* Fix failures

* Fix failures

* Add CV2 dependency

* Fix bugs

* Fix build issue

* Fix failures

* Remove textnet from modeling fast

* Fix build and other things

* Fix build

* some cleanups

* some cleanups

* Some more cleanups

* Fix build

* Incorporate PR feedbacks

* More cleanup

* More cleanup

* More cleanup

* Fix build

* Remove all the references of fast model

* More cleanup

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix Build

* Fix build

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Fix style

* Fix build

* Incorporate PR feedbacks

* Fix image processing mean and std

* Incorporate PR feedbacks

* fix build failure

* Add assertion to image processor

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* fix style failures

* fix build

* Fix Imageclassification's linear layer, also introduce TextNetImageProcessor

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix build

* Incorporate PR feedbacks

* Remove some script

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix image processing in textnet

* Incorporate PR Feedbacks

* Fix CI failures

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Add textnet to readme

* Improve readability

* Incorporate PR feedbacks

* fix code style

* fix key error and get convert working

* tvlt shouldn't be here

* fix test modeling test

* Fix tests, make fixup

* Make fixup

* Make fixup

* Remove TEXTNET_PRETRAINED_MODEL_ARCHIVE_LIST

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_image_processing_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* space typo

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/configuration_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* make conv layer kernel sizes and strides default to None

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix keyword bug

* add batch init and make fixup

* Make fixup

* Update integration test

* Add figure

* Update textnet.md

* add testing and fix errors (classification, imgprocess)

* fix error check

* make fixup

* make fixup

* revert to original docstring

* add make style

* remove conflict for now

* Update modeling_auto.py

got confused by `timm_wrapper` - it was giving some conflicts

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* add changes

* Update textnet.md

* add doc

* add authors hf ckpt + rename

* add feedback: classifier/docs

---------

Co-authored-by: raghavanone <opensourcemaniacfreak@gmail.com>
Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 09:52:51 +01:00
b05df6611e [docs] Remove sortish_sampler (#35539)
remove
2025-01-07 12:06:19 -08:00
a7d1441d65 Correctly list the chat template file in the Tokenizer saved files list (#34974)
* Correctly list the chat template file in the saved files list

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add save file checking to test

* make fixup

* better filename handling

* make fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-07 19:11:02 +00:00
cdca3cf9e3 [Whisper] fix docstrings typo (#35338)
fix typo
2025-01-07 09:20:27 -08:00
7f7677307c [Qwen2Audio] handle input ids expansion during processing (#35534)
* add audio_token attribute to proc

* expand input_ids

* and legacy and expanded input_ids

* test update

* split lines

* add possibility not to provide eos and bos audio tokens

* raise errors

* test incorrect number of audio tokens

* add example

* fmt

* typo
2025-01-07 16:47:27 +01:00
628cd838a3 Release GPU memory after Optuna trial (#35440)
* Release GPU memory after trial

* Update to use release_memory from accelerate.utils.memory after suggestion
2025-01-07 16:26:28 +01:00
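
A sketch of the trial cleanup this commit moved onto accelerate's helper; `build_and_train` and `evaluate` are hypothetical stand-ins for whatever the objective function does:

```python
from accelerate.utils.memory import release_memory


def objective(trial):
    model = build_and_train(trial)   # hypothetical: allocates GPU memory
    score = evaluate(model)          # hypothetical
    # Drop the reference, run gc, and empty the CUDA cache so the next
    # Optuna trial starts from a clean allocator state.
    (model,) = release_memory(model)
    return score
```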
665a4942e4 Check whether rescale is requested before checking is_scaled_image (#35439) 2025-01-07 11:39:45 +00:00
f408d55448 Fix bug when requesting input normalization with EnCodec (#34756)
* EnCodec: unsqueeze padding mask

* add test for normalization
2025-01-07 11:50:02 +01:00
96bf3d6cc5 Add diffllama (#34083)
* first adding diffllama

* add Diff Attention and other but still with errors

* complete making attention Diff-Attention

* fix some bugs which may be caused by transformer-cli while adding model

* fix a bug caused by forgetting KV cache...

* Update src/transformers/models/diffllama/modeling_diffllama.py

You don't need to divide by 2 if we use the same number of attention heads as llama. Instead you can just split in forward.

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new codes are more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new codes are more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix dividing twice by sqrt(self.head_dim)

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix dividing twice by sqrt(self.head_dim)

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place,
and make it more visible

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* I found the Attention was implemented incorrectly relative to the paper, still as of e072544a3bfc69b8a903e062729f861108ffecd3.

* re-implemented

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* align with transformers code style

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix typo

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* change SdpaAttention to DiffSdpaAttention

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bug

* Update src/transformers/models/diffllama/modeling_diffllama.py

resolve "not same outputs" problem

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bugs of places of "GroupNorm with scale" and etc

* Revert "fix bugs of places of "GroupNorm with scale" and etc"

This reverts commit 26307d92f6acd55e9fe89f2facff350f05760960.

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* remove missed type

* add diffllama model_doc

* apply make style/quality

* apply review comment about model

* apply review comment about test

* place diffllama alphabetically on the src/transformers/__init__.py

* fix forgot code

* Support parameters that are not initialized with standard deviation 0 in the conventional method

* add DiffLlamaConfig to CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPOINT_CHECK on utils/check_config_docstrings.py

* remove unused property of config

* add to supported model list

* add to sdpa supported model list

* fix copyright, remove pretraining_tensor_parallel, and modify for initialization test

* remove unused import and etc.

* empty commit

* empty commit

* empty commit

* apply modular transformers but with bugs

* revert prev commit

* create src/transformers/model/diffllama/modular_diffllama.py

* run utils/modular_model_converter.py

* empty commit

* leaner modular diffllama

* remove more and more in modular_diffllama.pt

* remove more and more in modular_diffllama.pt

* resolve missing docstring entries

* force reset

* convert modular

---------

Co-authored-by: Minho Ryu <ryumin93@gmail.com>
2025-01-07 11:34:56 +01:00
ed73ae210b NPU support SDPA (#35165)
Co-authored-by: root <weichunyude@163.com>
2025-01-07 11:30:05 +01:00
02ed609285 Replace tokenizer to processing_class in Seq2SeqTrainer (#35452) 2025-01-07 09:51:12 +00:00
9fd123ac31 ci: mark model_parallel tests as cuda specific (#35269)
The `parallelize()` API is deprecated in favor of accelerate's `device_map="auto"`
and therefore is not accepting new features. At the same time, the `parallelize()`
implementation is currently CUDA-specific. This commit marks the respective
CI tests with `@require_torch_gpu`.

Fixes: #35252

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-01-07 10:16:34 +01:00
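
What the marker looks like in practice; the test body is hypothetical:

```python
from transformers.testing_utils import require_torch_gpu


@require_torch_gpu  # parallelize() is CUDA-only, so skip on CPU and non-CUDA accelerators
def test_model_parallelization():
    model = build_sharded_model()  # hypothetical setup
    model.parallelize()
    ...
```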
bd442c6d3a Zamba new attention standard (#35375)
* updated zamba to new attention standard

* make fixup fixes
2025-01-07 10:08:45 +01:00
12ba96aa3c [Dinov2 with Registers] Some fixes (#35411)
* First draft

* Thanks claude

* Remove print statement

* Use torch_int

* Address comments

* Address comment
2025-01-06 21:10:59 +01:00
ca00950057 added logic for deleting adapters once loaded (#34650)
* added logic for deleting adapters once loaded

* updated to the latest version of transformers, merged utility function into the source

* updated with missing check

* added peft version check

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* changes according to reviewer

* added test for deleting adapter(s)

* styling changes

* styling changes in test

* removed redundant code

* formatted my contributions with ruff

* optimized error handling

* ruff formatted with correct config

* resolved formatting issues

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-01-06 18:36:40 +00:00
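
A hedged sketch of the load/delete round trip this PR enables (requires `peft` to be installed; checkpoint and adapter ids are examples):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model.load_adapter("ybelkada/opt-350m-lora", adapter_name="lora_1")

# ... use the adapter ...

# New in this PR: remove the adapter weights again without reloading the model.
model.delete_adapter("lora_1")
```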
1650e0e514 Fixed typo in Llama configuration docstring (#35520)
Update configuration_llama.py

There is no `num_heads` parameter, only `num_attention_heads`
2025-01-06 09:54:08 -08:00
3b1be043cd 🌐 [i18n-KO] Remove duplicates in toctree (#35496)
fix(docs): remove duplicates in toctree
2025-01-06 09:14:22 -08:00
3951da1a6b [GGUF] Refactor and decouple gguf checkpoint loading logic (#34385)
* draft load_gguf refactor

* update

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove llama mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove qwen2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove unused function

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate stablelm mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate phi3 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate t5 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate bloom mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix bloom

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate starcoder2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate gpt2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mistral mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate nemotron mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mamba mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mamba mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix mamba

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix qwen2moe

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove qwen2moe mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* clean up

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove falcon 7b map

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove all ggml tensors mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* add comments

Signed-off-by: Isotr0py <2037008807@qq.com>

* update messages

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix tensors in parsed parameters

Signed-off-by: Isotr0py <2037008807@qq.com>

* add gguf check

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-06 18:02:38 +01:00
86fa3cedad Bump jinja2 from 3.1.4 to 3.1.5 in /examples/research_projects/decision_transformer (#35408)
Bump jinja2 in /examples/research_projects/decision_transformer

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-06 16:58:29 +00:00
44a26c871c Update llm_optims docs for sdpa_kernel (#35481)
update: use sdpa_kernel
2025-01-06 08:54:31 -08:00
18e896bd8f 🌐 [i18n-KO] Translated altclip.md to Korean (#34594)
* docs: ko: model_doc/timesformer.md

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>

* Update docs/source/ko/model_doc/altclip.md

* add snippet

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
2025-01-06 08:45:26 -08:00
a821b9c7ab Add check for if num_items_in_batch is not None (#35102) 2025-01-06 10:11:21 -05:00
203e978826 Add position_ids in XLMRobertaXLForCausalLM.prepare_inputs_for_generation (#35044)
* fix

* fix

* cleanup

* style

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-06 16:10:21 +01:00
c451a72cd7 Add French translation of task_summary and tasks_explained (#33407)
* Add French translation of task_summary and tasks_explained

---------

Co-authored-by: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com>
2025-01-06 14:23:52 +01:00
9895f7df81 Idefics: fix docstring (#35079)
nit: fix docstring
2025-01-06 10:58:04 +01:00
32aa2db04a Fix Llava conversion for models that use safetensors to store weights (#35406)
* fix llava-med-v1.5-mistral-7b conversion

Signed-off-by: Isotr0py <2037008807@qq.com>

* add weights_only=True

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-06 09:59:38 +01:00
b2f2977533 Applies the rest of the init refactor except to modular files (#35238)
* [test_all] Applies the rest of the init refactor except to modular files

* Revert modular that doesn't work

* [test_all] TFGPT2Tokenizer
2025-01-05 18:30:08 +01:00
e5fd865eba Add Gemma2 GGUF support (#34002)
* initial setup for ggml.py

* initial setup of GGUFGemma2Converter class

* Add gemma2 model to gguf.md doc

* Partial work on GGUF_TENSOR_MAPPING

* initial setup of GGUF_TENSOR_MAPPING for Gemma2

* refactor: rename GemmaConvert class to GemmaConverter for naming consistency

* feat: complete gemma2 tensor mapping implementation

* feat: add initial implementation of GGUFGemmaConverter

* feat: complete GGUFGemmaConverter implementation

* feat: add test code for gemma2

* refactor: minor code cleanup

* refactor: minor code cleanup

* fix: resolve suggestions

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2025-01-03 14:50:07 +01:00
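
Usage sketch for the new Gemma2 GGUF path; the repo and file names are examples. The GGUF tensors are dequantized into a regular transformers model on load:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "google/gemma-2-2b-it-GGUF"      # example GGUF repo
gguf_file = "gemma-2-2b-it-q4_k_m.gguf"    # example quantized file

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```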
1fe2d53d4e Reuse "if not" logic in image_processing. (#35405) 2025-01-03 14:44:57 +01:00
30a9971632 Use sdpa_kernel in tests (#35472)
* update: use sdpa_kernel

* update: rerun test
2025-01-03 14:39:52 +01:00
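
The replacement API in a nutshell: `torch.nn.attention.sdpa_kernel` (PyTorch 2.3+) supersedes the deprecated `torch.backends.cuda.sdp_kernel` context manager. The shapes below are arbitrary and the snippet assumes a CUDA device:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Restrict scaled_dot_product_attention to the flash-attention backend only.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)
```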
cba49cb2a6 Change is_soundfile_availble to is_soundfile_available (#35030) 2025-01-03 14:37:42 +01:00
42865860ec Fix paligemma warning message (#35486)
fix log input
2025-01-02 11:36:53 +01:00
b2b04e86e7 Fix docs typos. (#35465)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-01-02 11:29:46 +01:00
6b1e86fd4d Fix new BNB test failures (#35345) 2025-01-02 11:24:52 +01:00
5b516b06c8 Reintroduce Python 3.9 support for ModernBERT (#35458)
Co-authored-by: Koichi Yasuoka <yasuoka@kanji.zinbun.kyoto-u.ac.jp>
2025-01-02 11:23:07 +01:00
919220dab1 Update translated docs for sdpa_kernel (#35461)
* docs: update sdpa_kernel for translation

* fix: nn.attention

* update: infer many
2024-12-31 08:37:58 -08:00
eb2b452432 [i18n-ar] Translated file: docs/source/ar/tasks/summarization.md into Arabic (#35195)
* إضافة الترجمة العربية: summarization.md

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-31 08:35:54 -08:00
d5aebc6465 [i18n-ar] Translated file: docs/source/ar/tasks/question_answering.md into Arabic (#35196)
* إضافة الترجمة العربية: question_answering.md

* Update question_answering.md

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-30 11:56:05 -08:00
b5f97977ed Update docs for sdpa_kernel (#35410)
update: sdp_kernel -> sdpa_kernel
2024-12-30 09:50:34 -08:00
5cabc75b4b Add compute_loss_func to Seq2SeqTrainer (#35136) 2024-12-29 15:01:35 +01:00
90f256c90c Update perf_infer_gpu_one.md: fix a typo (#35441) 2024-12-29 14:57:08 +01:00
5c75087aee Fix model_accepts_loss_kwargs for timm model (#35257)
* Fix for timm model

* Add comment
2024-12-27 16:33:44 +00:00
3b0a94ef9e Fix f-string to show ACCELERATE_MIN_VERSION on error (#35189)
fix f-string to show ACCELERATE_MIN_VERSION on error

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-27 13:21:44 +01:00
f63da20a9f CLIP conversion script - Change fairseq to OpenAI (#35384)
Change fairseq to OpenAI
2024-12-27 13:12:32 +01:00
7f97d01675 Fix: Rename keyword argument in_channels to num_channels (#35289)
Fix: Rename keyword argument in_channels to num_channels in some default backbone configs
2024-12-27 13:07:31 +01:00
4eb17b26e7 Drop inplace operation for loss computation with gradient accumulation (#35416)
Fix inplace loss computation
2024-12-26 14:58:53 +01:00
24c91f095f [GPTQ, CompressedTensors] Fix unsafe imports and metadata check (#34815)
* fix gptq creation when optimum is not installed + fix metadata checking

* fix compressed tensors as well

* style

* pray for ci luck on flaky tests :prayge:

* trigger ci

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-24 19:32:44 +01:00
6e0515e99c Add DINOv2 with registers (#35348)
* added changes from 32905

* fixed mistakes caused by select all paste

* rename diff_dinov2...

* ran tests

* Fix modular

* Fix tests

* Use new init

* Simplify drop path

* Convert all checkpoints

* Add figure and summary

* Update paths

* Update docs

* Update docs

* Update toctree

* Update docs

---------

Co-authored-by: BernardZach <bernardzach00@gmail.com>
Co-authored-by: Zach Bernard <132859071+BernardZach@users.noreply.github.com>
2024-12-24 13:21:59 +01:00
d8c1db2f56 enable non-cuda awq model support without modify version (#35334)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2024-12-24 12:36:00 +01:00
ccc4a5a59b Disable .github/workflows/self-comment-ci.yml for now (#35366)
* disable

* disable

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-24 10:53:57 +01:00
93aafdc620 Add compile test for fast image processor (#35184)
* add compile test for fast image processor

* override pixtral test
2024-12-23 13:12:45 -05:00
82fcac0a7e Adding logger.info about update_torch_dtype in some quantizers (#35046)
adding logger.info
2024-12-23 17:01:00 +01:00
a1780b7ba5 bugfix Idefics3 processor - handle gracefully cases with text and no images (#35363)
* bugfix processing empty images

* fix

* fix

* Update src/transformers/models/idefics3/processing_idefics3.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* adding tests

* fix

* fix

* fix

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-23 16:59:01 +01:00
64c05eecd6 HIGGS Quantization Support (#34997)
* higgs init

* working with crunches

* per-model workspaces

* style

* style 2

* tests and style

* higgs tests passing

* protecting torch import

* removed torch.Tensor type annotations

* torch.nn.Module inheritance fix maybe

* hide inputs inside quantizer calls

* style structure something

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* reworked num_sms

* Update src/transformers/integrations/higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* revamped device checks

* docstring upd

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* edited tests and device map assertions

* minor edits

* updated flute cuda version in docker

* Added p=1 and 2,3bit HIGGS

* flute version check update

* incorporated `modules_to_not_convert`

* less hardcoding

* Fixed comment

* Added docs

* Fixed gemma support

* example in docs

* fixed torch_dtype for HIGGS

* Update docs/source/en/quantization/higgs.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Collection link

* dequantize interface

* newer flute version, torch.compile support

* unittest message fix

* docs update compile

* isort

* ValueError instead of assert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-23 16:54:49 +01:00
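
Hedged usage sketch for the new quantizer; the checkpoint is an example and the FLUTE kernel package must be installed for HIGGS to run:

```python
from transformers import AutoModelForCausalLM, HiggsConfig

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",       # example checkpoint
    quantization_config=HiggsConfig(bits=4),  # 2- and 3-bit variants were added too
    device_map="auto",
)
```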
ef1f54a0a7 add bnb support for Ascend NPU (#31512)
* add bnb support for Ascend NPU

* delete comment
2024-12-23 16:36:16 +01:00
59178780a6 Fix : VPTQ test (#35394)
fix_test
2024-12-23 16:27:46 +01:00
3a4ced9ab4 Fix typing in docstring for PaliGemmaProcessor (#35278)
Updated typing for `tokenizer` in the `PaliGemmaProcessor` to be `GemmaTokenizerFast` instead of `LlamaTokenizerFast`
2024-12-23 16:22:04 +01:00
3cd3cd50ac Scale loss before backward (#35207) 2024-12-23 16:16:38 +01:00
f5264a86ee Deprecate _is_quantized_training_enabled (#34991)
deprecate

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-23 15:51:31 +01:00
e10be82b71 uniformize kwargs for SAM (#34578)
* Make kwargs uniform for SAM

* Remove unused attribute

* Make point_pad_value part of image_kwargs

* Update annotations

* Code review - use existing methods

* Use ProcessorTesterMixin

* Do not add ProcessorTesterMixin everywhere
2024-12-23 13:54:57 +01:00
2bb60982ac Patch GPTNeoX to use adequate FA2 if position_ids is provided (#35318) 2024-12-23 13:45:55 +01:00
5e7aedebeb make LlamaModel._update_causal_mask torch compilable (#35187)
* make LlamaModel._update_causal_mask torch compilable

* chore: lint (make fix-copies)

* fix-copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-12-23 13:10:00 +01:00
401aa39d7b bitsandbytes: simplify 8bit dequantization (#35068) 2024-12-23 13:04:59 +01:00
05260a1fc1 Fix new FA2 if is_causal is passed explicitly (#35390)
* fix

* Update modeling_decision_transformer.py

* Update flash_attention.py
2024-12-22 20:00:07 +01:00
8f38f58f3d owlvit/2 dynamic input resolution (#34764)
* owlvit/2 dynamic input resolution.

* adapt box grid to patch_dim_h patch_dim_w

* fix ci

* clarify variable naming

* clarify variable naming..

* compute box_bias dynamically inside box_predictor

* change style part of code

* [run-slow] owlvit, owlv2
2024-12-21 08:51:09 +00:00
608e163b52 [docs] Follow up register_pipeline (#35310)
example json
2024-12-20 09:22:44 -08:00
94fe0b915b Improved Documentation Of Audio Classification (#35368)
* Improved Documentation Of Audio Classification

* Updated documentation as per review

* Updated audio_classification.md

* Update audio_classification.md
2024-12-20 09:17:28 -08:00
c96cc039c3 Improve modular transformers documentation (#35322)
* Improve modular transformers documentation

- Adds hints to general contribution guides
- Lists which utils scripts are available to generate single-files from modular files and check their content

* Show commands in copyable code cells

---------

Co-authored-by: Joel Koch <joel@bitcrowd.net>
2024-12-20 09:16:02 -08:00
504c4d3692 Make test_generate_with_static_cache even less flaky (#34995)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 16:03:26 +01:00
0fc2970363 Use weights_only=True with torch.load for transfo_xl (#35241)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 15:40:55 +01:00
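
The change in one line: `weights_only=True` restricts `torch.load` unpickling to tensors and plain containers, so loading an untrusted checkpoint can no longer execute arbitrary code.

```python
import torch

state_dict = torch.load("checkpoint.bin", weights_only=True)  # safe(r) load
```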
6fae2a84ae Update test fetcher when we want to test all (#35364)
* [test-all]

* style

* [test-all]

* [test_all]

* [test_all]

* style
2024-12-20 15:10:43 +01:00
34ad1bd287 update codecarbon (#35243)
* update codecarbon

* replace directly-specified-test-dirs with tmp_dir

* Revert "replace directly-specified-test-dirs with tmp_dir"

This reverts commit 310a6d962ec83db3f6d4f96daeeba5c6746f736c.

* revert the change of .gitignore

* Update .gitignore

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2024-12-20 15:04:36 +01:00
40292aa4e9 bugfix: torch.export failure caused by _make_causal_mask (#35291)
* bugfix: torch.export failure caused by `_make_causal_mask`

Recent changes in torch dynamo prevent mutations on tensors converted with aten::_to_copy. To address this, we can clone such a tensor before performing the in-place operation `masked_fill_`, but only when the code is being compiled by torch dynamo.
(relevant issue: https://github.com/pytorch/pytorch/issues/127571)

* chore: use `is_torchdynamo_compiling` instead of `torch._dynamo.is_compiling`
2024-12-20 14:37:04 +01:00
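
A sketch of the workaround this bugfix applies, using a simplified mask expansion; `is_torchdynamo_compiling` is the real transformers utility, the rest is illustrative:

```python
import torch
from transformers.utils import is_torchdynamo_compiling


def expand_mask_sketch(attention_mask: torch.Tensor, dtype=torch.float32) -> torch.Tensor:
    # The .to(dtype) lowers to aten::_to_copy; dynamo forbids mutating its
    # output in place, so clone first, but only when actually compiling.
    inverted = 1.0 - attention_mask.to(dtype)
    if is_torchdynamo_compiling():
        inverted = inverted.clone()
    return inverted.masked_fill_(inverted.bool(), torch.finfo(dtype).min)
```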
05de764e9c Aurevoir PyTorch 1 (#35358)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 14:36:31 +01:00
4567ee8057 fix zoedepth initialization error under deepspeed zero3 (#35011)
fix zoe bug in deepspeed zero3
2024-12-20 11:42:40 +00:00
c3a43594b7 Add Tensor Parallel support for Qwen2VL (#35050)
feat: add parallel support for qwen2vl
2024-12-20 12:40:38 +01:00
0d51d65905 Cleaner attention interfaces (#35342)
* cleaner attention interfaces

* correctly set the _attn_implementation when adding other functions to it

* update

* Update modeling_utils.py

* CIs
2024-12-20 12:09:34 +01:00
eafbb0eca7 Implement AsyncTextIteratorStreamer for asynchronous streaming (#34931)
* Add AsyncTextIteratorStreamer class

* export AsyncTextIteratorStreamer

* export AsyncTextIteratorStreamer

* improve docs

* missing import

* missing import

* doc example fix

* doc example output fix

* add pytest-asyncio

* first attempt at tests

* missing import

* add pytest-asyncio

* fallback to wait_for and raise TimeoutError on timeout

* check for TimeoutError

* autodoc

* reorder imports

* fix style

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-20 12:08:12 +01:00
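
A usage sketch for the new streamer; the checkpoint is an example. Generation runs in a background thread while the event loop consumes decoded text pieces as they arrive:

```python
import asyncio
from threading import Thread

from transformers import AsyncTextIteratorStreamer, AutoModelForCausalLM, AutoTokenizer


async def main():
    name = "HuggingFaceTB/SmolLM2-135M-Instruct"  # example checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    inputs = tok("Streaming generation means", return_tensors="pt")

    # Must be constructed inside the running event loop.
    streamer = AsyncTextIteratorStreamer(tok, skip_prompt=True)
    Thread(
        target=model.generate,
        kwargs={**inputs, "streamer": streamer, "max_new_tokens": 30},
    ).start()

    async for chunk in streamer:  # yields text asynchronously
        print(chunk, end="", flush=True)


asyncio.run(main())
```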
b5a557e5fe Reduce CircleCI usage (#35355)
* reduce 1

* reduce 1

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 10:18:15 +01:00
4e27a4009d FEAT : Adding VPTQ quantization method to HFQuantizer (#34770)
* init vptq

* add integration

* add vptq support

fix readme

* add tests && format

* format

* address comments

* format

* format

* address comments

* format

* address comments

* remove debug code

* Revert "remove debug code"

This reverts commit ed3b3eaaba82caf58cb3aa6e865d98e49650cf66.

* fix test

---------

Co-authored-by: Yang Wang <wyatuestc@gmail.com>
2024-12-20 09:45:53 +01:00
5a2aedca1e [Mamba2] Fix caching, slow path, and multi-gpu (#35154)
* fixup mamba2 - caching and several other small fixes

* fixup cached forward

* correct fix this time

* fixup cache - we do not need to extend the attn mask; it's handled by generate (gives total ids + mask at each step)

* remove unnecessary (un)squeeze

* fixup cache position

* simplify a few things

* [run-slow] mamba2

* multi gpu attempt two

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* add newer slow path fix

* [run-slow] mamba2
2024-12-20 09:27:47 +01:00
ff9141bb85 fix onnx export of speech foundation models (#34224)
* added expanded attention/padding masks prior to indexing the hidden_states

* consistency fix in WavLMForSequenceClassification

---------

Co-authored-by: Nikos Antoniou <nikosantoniou@Nikos-MacBook-Pro.local>
2024-12-20 09:22:05 +01:00
f42084e641 [docs] Add link to ModernBERT Text Classification GLUE finetuning script (#35347)
Add link to ModernBERT Text Classification GLUE finetuning script
2024-12-19 14:45:52 -08:00
0ade1caa35 Modernbert Release Fixes (#35344)
* fix ForSequenceClassification

* unmodularize rope layer

* fix linting warning

* Avoid complex PoolingHead, only one prediction head needed

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2024-12-19 17:22:37 +01:00
1fa807fa63 Fix some fa2 tests (#35340)
* remove fa2 test

* remove other failing tests

* style
2024-12-19 17:05:25 +01:00
667ed5635e Add ModernBERT to Transformers (#35158)
* initial cut of modernbert for transformers

* small bug fixes

* fixes

* Update import

* Use compiled mlp->mlp_norm to match research implementation

* Propagate changes in modular to modeling

* Replace duplicate attn_out_dropout in favor of attention_dropout

cc @warner-benjamin let me know if the two should remain separate!

* Update BOS to CLS and EOS to SEP

Please confirm @warner-benjamin

* Set default classifier bias to False, matching research repo

* Update tie_word_embeddings description

* Fix _init_weights for ForMaskedLM

* Match base_model_prefix

* Add compiled_head to match research repo outputs

* Fix imports for ModernBertForMaskedLM

* Just use "gelu" default outright for classifier

* Fix config name typo: initalizer -> initializer

* Remove some unused parameters in docstring. Still lots to edit there!

* Compile the embeddings forward

Not having this resulted in very slight differences - so small it wasn't even noticed for the base model, only for the large model.

But the tiny difference for large propagated at the embedding layer through the rest of the model, leading to notable differences of ~0.0084 average per value, up to 0.2343 for the worst case.

* Add drafts for ForSequenceClassification/ForTokenClassification

* Add initial SDPA support (not exactly equivalent to FA2 yet!)

During testing, FA2 and SDPA still differ by about 0.0098 per value in the token embeddings. It still predicts the correct mask fills, but I'd like to get it fully 1-1 if possible.

* Only use attention dropout if training

* Add initial eager attention support (also not equivalent to FA2 yet!)

Frustratingly, I also can't get eager to be equivalent to FA2 (or sdpa), but it does get really close, i.e. avg ~0.010 difference per value.

Especially if I use fp32 for both FA2&eager, avg ~0.0029 difference per value

The fill-mask results are good with eager.

* Add initial tests, output_attentions, output_hidden_states, prune_heads

Tests are based on BERT, not all tests pass yet: 23 failed, 79 passed, 100 skipped

* Remove kwargs from ModernBertForMaskedLM

Disable sparse_prediction by default to match the normal HF, can be enabled via config

* Remove/adjust/skip improper tests; warn if padding but no attn mask

* Run formatting etc.

* Run python utils/custom_init_isort.py

* FlexAttention with unpadded sequences (matches FA2 within bf16 numerics)

* Reformat init_weights based on review

* self -> module in attention forwards

* Remove if config.tie_word_embeddings

* Reformat output projection on a different line

* Remove pruning

* Remove assert

* Call contiguous() to simplify paths

* Remove prune_qkv_linear_layer

* Format code

* Keep as kwargs, only use if needed

* Remove unused codepaths & related config options

* Remove 3d attn_mask test; fix token classification tuple output

* Reorder: attention_mask above position_ids, fixes gradient checkpointing

* Fix usage if no FA2 or torch v2.5+

* Make torch.compile/triton optional

Should we rename 'compile'? It's a bit vague

* Separate pooling options into separate functions (cls, mean) - cls as default

* Simplify _pad_modernbert_output, remove unused labels path

* Update tied weights to remove decoder.weight, simplify decoder loading

* Adaptively set config.compile based on hf_device_map/device/resize, etc.

* Update ModernBertConfig docstring

* Satisfy some consistency checks, add unfinished docs

* Only set compile to False if there's more than 1 device

* Add docstrings for public ModernBert classes

* Dont replace docstring returns - ends up being duplicate

* Fix mistake in toctree

* Reformat toctree

* Patched FlexAttention, SDPA, Eager with Local Attention

* Implement FA2 -> SDPA -> Eager attn_impl defaulting, crucial

both to match the original performance, and to get the highest inference speed without requiring users to manually pick FA2

* Patch test edge case with Idefics3 not working with 'attn_implementation="sdpa"'

* Repad all_hidden_states as well

* rename config.compile to reference_compile

* disable flex_attention since it crashes

* Update modernbert.md

* Using dtype min to mask in eager

* Fully remove flex attention for now

It's only compatible with the nightly torch 2.6, so we'll leave it be for now. It's also slower than eager/sdpa.

Also, update compile -> reference_compile in one more case

* Call contiguous to allow for .view()

* Copyright 2020 -> 2024

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update/simplify __init__ structure

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove "... if dropout_prob > 0 else identity"

As dropout with 0.0 should be efficient like identity

* re-use existing pad/unpad functions instead of creating new ones

* remove flexattention method

* Compute attention_mask and local_attention_mask once in modeling

* Simplify sequence classification prediction heads, only CLS now

Users can make custom heads if they feel like it

Also removes the unnecessary pool parameter

* Simplify module.training in eager attn

* Also export ModernBertPreTrainedModel

* Update the documentation with links to finetuning scripts

* Explain local_attention_mask parameter in docstring

* Simplify _autoset_attn_implementation, rely on super()

* Keep "in" to initialize Prediction head

Doublechecked with Benjamin that it's correct/what we used for pretraining

* add back mean pooling

* Use the pooling head in TokenClassification

* update copyright

* Reset config._attn_implementation_internal on failure

* Allow optional attention_mask in ForMaskedLM head

* fix failing run_slow tests

* Add links to the paper

* Remove unpad_no_grad, always pad/unpad without gradients

* local_attention_mask -> sliding_window_mask

* Revert "Use the pooling head in TokenClassification"

This reverts commit 99c38badd1dbce01d7aef41095fbf2f5cce87279.

There was no real motivation, no info on whether having this bigger head does anything useful.

* Simplify pooling, 2 options via if-else

---------

Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Said Taghadouini <taghadouinisaid@gmail.com>
Co-authored-by: Benjamin Clavié <ben@clavie.eu>
Co-authored-by: Antoine Chaffin <ant54600@hotmail.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-19 14:03:35 +01:00
56ff1e92fd PaliGemma: Make sure to add <eos> to suffix if <image> is present in text (#35201)
Move suffix processing code to out of if statement
2024-12-19 09:53:48 +01:00
4592cc9e98 Update comment CI bot (#35323)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-19 09:45:27 +01:00
d19b11f59b Fix documentation for ColPali (#35321)
* docs: fix typo quickstart snippet in ColPali's model card

* docs: clean the ColPali's model card

* docs: make the `ColPaliForRetrieval`'s docstring more concise

* docs: add missing bash command used to convert weights for `vidore/colpali-v1.3-hf`
2024-12-19 09:08:28 +01:00
9613933b02 Add the Bamba Model (#34982)
* initial commit for PR

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>

* rename dynamic cache

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add more unit tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Add modular bamba file

* Remove trainer changes from unrelated PR

* Modify modular and config to get model running

* Fix some CI errors and beam search

* Fix a plethora of bugs from CI/docs/etc

* Add bamba to models with special caches

* Update to newer mamba PR for mamba sublayer

* fix test_left_padding_compatibility

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix remaining tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* missed this test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* ran make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* move slow tag to integration obj

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* address comments

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* left out one part of modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* change model

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Make Rotary modular as well

* Update bamba.md

Added overview, updated Model inference card, and added config

* Update bamba.md

* Update bamba.md

* Update bamba.md

Minor fixes

* Add docs for config and model back

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Add warning when using fast kernels

* replaced generate example

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Address comments from PR

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Propagate attention fixes

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix attention interfaces to the new API

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix API for decoder layer

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Remove extra weights

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

---------

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
2024-12-18 20:18:17 +01:00
9a94dfe123 feat: add benchmarks_entrypoint.py (#34495)
* feat: add `benchmarks_entrypoint.py`

Adding `benchmarks_entrypoint.py` file, which will be run from the
benchmarks CI.

This python script will list all python files from the `benchmark/`
folder and run the included `run_benchmark` function, allowing people to
add new benchmark scripts.

* feat: add `MetricsRecorder`

* feat: update dashboard

* fix: add missing arguments to `MetricsRecorder`

* feat: update dash & add datasource + `default.yml`

* fix: move responsibility to create `MetricsRecorder` in bench script

* fix: update incorrect datasource UID

* fix: incorrect variable values

* debug: benchmark entrypoint script

* refactor: update log level

* fix: update broken import

* feat: add debug log in `MetricsRecorder`

* debug: set log level to debug

* fix: set connection `autocommit` to `True`
2024-12-18 18:59:07 +01:00
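A minimal sketch of the discovery loop described above; real argument handling and `MetricsRecorder` wiring are omitted, and `run_benchmark`'s actual signature may differ:

```python
import importlib.util
from pathlib import Path

# Import every python file under benchmark/ and invoke its run_benchmark().
for path in sorted(Path("benchmark").glob("*.py")):
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    if hasattr(module, "run_benchmark"):
        module.run_benchmark()
```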
2c47618c1a 🚨All attention refactor🚨 (#35235)
* refactor LlamaAttention

* minimal changes

* fix llama

* update

* modular gemmas

* modular nits

* modular updates

* nits

* simplify

* gpt2

* more modular and fixes

* granite

* modular modular modular

* nits

* update

* qwen2 + starcoder2

* mostly gemma2

* Update image_processing_auto.py

* fix

* Update modular_starcoder2.py

* fix

* remove all copied from attentions

* remove gcv

* make fix-copies

* oups

* oups2.0

* fix some modulars + all copied from

* should be good now

* revert unwanted changes

* Update modeling_decision_transformer.py

* finish cleanup

* Update modeling_olmo.py

* consistency

* re-add gradient checkpointing attribute

* fix

* style

* make config necessary

* bis

* bis

* Update modeling_my_new_model2.py

* is_causal attr

* fix

* remove past kv return from decoder layer

* fix

* default rope config

* correctly fix rope config

* fix bias

* fix gpt2 attention output

* fix test

* fix inits

* fix default sdpa

* fix default sdpa implementation

* harmonize classes

* fix mistral

* fix sliding window models

* mixtral

* be more explicit

* style

* fix

* several fixes

* Update modeling_dbrx.py

* fix test

* olmo + phi

* rotary

* style

* phi

* phi again

* again

* kwargs

* Update test_modeling_common.py

* skip fx tracing tests

* Update modeling_utils.py

* gemma 2

* again

* Update modeling_recurrent_gemma.py

* gemma2

* granite

* style

* starcoder

* Update sdpa_attention.py

* switch args

* Update modeling_mllama.py

* fix

* cache type tests

* gpt2

* Update test_modeling_common.py

* fix

* consistency

* fix shape with encoder

* should be the last one

* tests non model

* most comments

* small oupsi

* be more explicit in modulars

* more explicit modulars

* CIs! it works locally

* add kwargs to _flash_attention_forward

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00
75be5a0a5b [Whisper] fix docstrings typo (#35319)
typos docstring
2024-12-18 16:38:19 +01:00
69e31eb1bf change bnb tests (#34713)
* fix training tests

* fix xpu check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm pdb

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix 4bit logits check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix 4bit logits check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add xpu check on int8 training

* fix training tests

* add llama test on bnb

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* only cpu and xpu disable autocast training

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
2024-12-18 09:49:59 -05:00
da334bcfa8 [Whisper] 🚨 Fix whisper decoding 🚨 (#34135)
* do not remove decoder_input_ids for the first segment

* do not remove eos token in generate_with_fallback

* when removing padding tokens, do not remove eos token

* remove eos token in generate (and not in generate_with_fallback!)

* reconcile short-form/long-form behavior

* correct avg_logprobs calculation

* handle eos token in segments

* handle decoder_input_ids and eos token in _prepare_decoder_input_ids

* fix incorrect time precision

* always remove eos token

* always remove decoder_input_ids

* no need to handle decoder_inputs_ids and eos token

* no need to remove decoder_input_ids

* no need to handle eos token

* fix num_beams in _retrieve_logit_processors

* remove todo inconsistency

* no need to add eos token

* last_timestamp_pos should indeed be timestamp token pos

* patch generate to enable compatibility with GenerationTesterMixin tests

* adapt test_generate_continue_from_past_key_values

* adapt test_prompt_lookup_decoding_matches_greedy_search

* adapt generic GenerationMixin tests to whisper's generate

* fix speculative decoding

* fix

* [run-slow] whisper

* change HF_HUB_TOKEN for require_read_token

* [run-slow] whisper

* prioritize kwargs over generation_config

* remove unnecessary args

* [run-slow] whisper

* update tests

* [run-slow] whisper

* add comment

* update test

* [run-slow] whisper

* update test + revert require_read_token

* docstring updates

* revert tokenizer decode args change

* do not use a patch + docstring updates

* [run-slow] whisper

* make

* [run-slow] whisper

* add a flag to force unique call to generate

* test update

* [run-slow] whisper

* add force_unique_generate_call arg

* do not use a patch

* correct the timestamps for the pad tokens

* docstring update

* docstring update

* docstring update

* update TF tests

* add require_read_token

* [run-slow] whisper

* test reset dynamo

* [run-slow] whisper

* fix

* [run-slow] whisper

* avoid iterating twice on current_segments

* [run-slow] whisper

* [run-slow] whisper

---------

Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 14:13:21 +01:00
f1b7634fc8 Trigger GitHub CI with a comment on PR (#35211)
* fix

* fix

* comment

* final

* final

* final

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 13:56:49 +01:00
c7e48053aa [tests] make cuda-only tests device-agnostic (#35222)
fix cuda-only tests
2024-12-18 10:14:22 +01:00
1eee1cedfd Fix loading with only state dict and low_cpu_mem_usage = True (#35217)
* fix loading with only state dict and config

* style

* add tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-18 09:54:32 +01:00
0531d7513b [docs] Improve register_pipeline (#35300)
register_pipeline
2024-12-17 10:27:23 -08:00
77080f023f Fixed typo in audio_classification.md (#35305) 2024-12-17 09:45:51 -08:00
8bfd7eeeef Add Cohere2 docs details (#35294)
* Add Cohere2 docs details

* Update docs/source/en/model_doc/cohere2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-17 09:36:31 -08:00
a7feae190f Fix remove unused parameter in docs (#35306)
remove unused parameter in example

Co-authored-by: zzzzzsa <zzzzzsaqwq@gmail.com>
2024-12-17 09:34:41 -08:00
927c3e39ec Fix image preview in multi-GPU inference docs (#35303)
fix: link for img
2024-12-17 09:33:50 -08:00
4302b27719 Fix typos in translated quicktour docs (#35302)
* fix: quicktour typos

* fix: one more
2024-12-17 09:32:00 -08:00
deac971c46 🚨🚨🚨 Limit backtracking in Nougat regexp (#35264)
* Limit backtracking in regexp

* Update

* [run-slow] nougat

* Update
2024-12-17 16:34:18 +00:00
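An illustrative example (not Nougat's actual patterns) of the class of fix: capping a repetition so the regex engine cannot backtrack without bound on pathological input:

```python
import re

# Unbounded nested repetition such as (?:\w+\s?)+ can backtrack
# catastrophically; an explicit upper bound keeps matching cheap.
unbounded = re.compile(r"(?:[#*] ?)+")
bounded = re.compile(r"(?:[#*] ?){1,10}")  # at most 10 repeats

print(bounded.match("## * list marker soup"))
```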
d29a06e39a remove benchmark job in push-important-models.yml (#35292)
remove-bench

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-17 17:27:26 +01:00
e0ae9b5974 🚨🚨🚨 Delete conversion scripts when making release wheels (#35296)
* Delete conversion scripts when making release wheels

* make fixup

* Update docstring
2024-12-17 14:18:42 +00:00
6eb00dd2f0 Support for SDPA for SAM models (#34110)
* feat: add support for sdpa and gradient checkpointing

* fix: ruff format

* fix: config sdpa

* fix: sdpa layer naming convention

* fix: update test_eager_matches_sdpa_inference to handle vision_hidden_states

* test: skip incompatible tests and fix loading issue with sdpa

- Updated tests to skip flash and dynamic compile cases.
- Minor adjustment to ensure correct loading of model with sdpa for dispatch test.

* style: apply Ruff formatting

* ruff fix again after rebase

* [run-slow] sam

* [run-slow] sam

* refactor: Address review comments and improve sub-config handling in SAM model tests

- Added attributes for sub_configs as per PR #34410.
- Enabled tests for configs, ensuring the composite model (SAM) has several sub-configs in the main config.
- Added class attribute _is_composite=True to the tester class
- test_sdpa_can_dispatch_composite_models added

* [run-slow] sam

* style: ruff

* [run-slow] sam

* style: ruff again ...

* [run-slow] sam
2024-12-17 14:46:05 +01:00
747f361da1 Add sdpa for Beit (#34941)
* Add sdpa for Beit

* Updates

* [run-slow] beit

* Update inference benchmarks

* Update

* Fix - add missed to super().forward()

* Updates

* Fix missing import
2024-12-17 14:44:47 +01:00
6c08b3b6e5 Add Falcon3 documentation (#35307)
* Add Falcon3 documentation

* Update Falcon3 documentation

* Change Falcon to Falcon3

* Update docs and run make fix-copies

* Add blog post and huggingface models links
2024-12-17 14:23:13 +01:00
f33a0cebb3 Add ColPali to 🤗 transformers (#33736)
* feat: run `add-new-model-like`

* feat: add paligemma code with "copied from"

* feat: add ColPaliProcessor

* feat: add ColPaliModel

* feat: add ColPaliConfig

* feat: rename `ColPaliForConditionalGeneration` to `ColPaliModel`

* fixup modeling colpali

* fix: fix root import shortcuts

* fix: fix `modeling_auto` dict

* feat: comment out ColPali test file

* fix: fix typos from `add-new-model-like`

* feat: explicit the forward input args

* feat: move everything to `modular_colpali.py`

* fix: put back ColPaliProcessor

* feat: add auto-generated files

* fix: run `fix-copies`

* fix: remove DOCSTRING constants to make modular converter work

* fix: fix typo + modular converter

* fix: add missing imports

* feat: no more errors when loading ColPaliModel

* fix: remove unused args in forward + tweak doc

* feat: rename `ColPaliModel` to `ColPaliForRetrieval`

* fix: apply `fix-copies`

* feat: add ColPaliProcessor to `modular_colpali`

* fix: run make quality + make style

* fix: remove duplicate line in configuration_auto

* feat: make ColPaliModel inherit from PaliGemmaForConditionalGeneration

* fix: tweak and use ColPaliConfig

* feat: rename `score` to `post_process_retrieval`

* build: run modular formatter + make style

* feat: convert colpali weights + fixes

* feat: remove old weight converter file

* feat: add and validate tests

* feat: replace hardcoded path to "vidore/colpali-v1.2-hf" in tests

* fix: add bfloat16 conversion in weight converter

* feat: replace pytest with unittest in modeling colpali test

* feat: add sanity check for weight conversion (doesn't work yet)

* feat: add shape sanity check in weight converter

* feat: make ColPaliProcessor args explicit

* doc: add doc for ColPali

* fix: trying to fix output mismatch

* feat: tweaks

* fix: ColPaliModelOutput inherits from ModelOutput instead of PaliGemmaCausalLMOutputWithPast

* fix: address comments on PR

* fix: adapt tests to the Hf norm

* wip: try things

* feat: add `__call__` method to `ColPaliProcessor`

* feat: remove need for dummy image in `process_queries`

* build: run new modular converter

* fix: fix incorrect method override

* Fix tests, processing, modular, convert

* fix tokenization auto

* hotfix: manually fix processor -> fixme once convert modular is fixed

* fix: convert weights working

* feat: rename and improve convert weight script

* feat: tweaks

* feat: remove `device` input for `post_process_retrieval`

* refactor: remove unused `get_torch_device`

* Fix all tests

* docs: update ColPali model doc

* wip: fix convert weights to hf

* fix logging modular

* docs: add acknowledgements in model doc

* docs: add missing docstring to ColPaliProcessor

* docs: tweak

* docs: add doc for `ColPaliForRetrievalOutput.forward`

* feat: add modifications from colpali-engine v0.3.2 in ColPaliProcessor

* fix: fix and upload colapli hf weights

* refactor: rename `post_process_retrieval` to `score_retrieval`

* fix: fix wrong typing for `score_retrieval`

* test: add integration test for ColPali

* chore: rerun convert modular

* build: fix root imports

* Update docs/source/en/index.md

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* fix: address PR comments

* wip: reduce the prediction gap in weight conversion

* docs: add comment in weight conversion script

* docs: add example for `ColPaliForRetrieval.forward`

* tests: change dataset path to the new one in hf-internal

* fix: colpali weight conversion works

* test: add fine-grained check for ColPali integration test

* fix: fix typos in convert weight script

* docs: move input docstring in a variable

* fix: remove hardcoded torch device in test

* fix: run the new modular refactor

* docs: fix python example for ColPali

* feat: add option to choose `score_retrieval`'s output dtype and device

* docs: update doc for `score_retrieval`

* feat: add `patch_size` property in ColPali model

* chore: run `make fix-copies`

* docs: update description for ColPali cookbooks

* fix: remove `ignore_index` methods

* feat: remove non-transformers specific methods

* feat: update `__init__.py` to new hf format

* fix: fix root imports in transformers

* feat: remove ColPali's inheritance from PaliGemma

* Fix CI issues

* nit remove prints

* feat: remove ColPali config and model from `modular_colpali.py`

* feat: add `ColPaliPreTrainedModel` and update modeling and configuration code

* fix: fix auto-removed imports in root `__init__.py`

* fix: various fixes

* fix: fix `_init_weight`

* temp: comment `AutoModel.from_config` for experiments

* fix: add missing `output_attentions` arg in ColPali's forward

* fix: fix `resize_token_embeddings`

* fix: make `input_ids` optional in forward

* feat: rename `projection_layer` to `embedding_proj_layer`

* wip: fix convert colpali weight script

* fix tests and convert weights from original repo

* fix unprotected import

* fix unprotected torch import

* fix style

* change vlm_backbone_config to vlm_config

* fix unprotected import in modular this time

* fix: load config from Hub + tweaks in convert weight script

* docs: move example usage from model docstring to model markdown

* docs: fix input docstring for ColPali's forward method

* fix: use `sub_configs` for ColPaliConfig

* fix: remove non-needed sanity checks in weight conversion script + tweaks

* fix: fix issue with `replace_return_docstrings` in ColPali's `forward`

* docs: update docstring for `ColPaliConfig`

* test: change model path in ColPali test

* fix: fix ColPaliConfig

* fix: fix weight conversion script

* test: fix expected weights for ColPali model

* docs: update ColPali markdown

* docs: fix minor typo in ColPaliProcessor

* Fix tests and add _no_split_modules

* add text_config to colpali config

* [run slow] colpali

* move inputs to torch_device in integration test

* skip test_model_parallelism

* docs: clarify quickstart snippet in ColPali's model card

* docs: update ColPali's model card

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-17 11:26:43 +01:00
a7f5479b45 fix modular order (#35297)
* fix modular order

* fix

* style
2024-12-17 08:05:35 +01:00
f5620a7634 Improved documentation of Automatic speech recognition (#35268)
Improved documentation quality of Automatic speech recognition
2024-12-16 09:50:11 -08:00
eb92bc44b7 Fix errors in quicktour[zh] (#35272)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-16 09:23:34 -08:00
886f690e76 Translating "translate perf_infer_gpu_multi.md" to Chinese (#35271)
add "translate perf_infer_gpu_multi"
2024-12-16 09:22:35 -08:00
22834eeba1 Fix typos in Translated Audio Classification Docs (#35287)
* fix: qwen2 model ids

* fix: line

* fix: more format

* update: reformat

* fix: doc typos
2024-12-16 08:51:32 -08:00
9feae5fb01 [Whisper] patch float type on mps (#35295)
* fix float type on mps

* make
2024-12-16 16:52:47 +01:00
d5b81e1ca1 Delete redundancy for loop checks. (#35288)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-16 13:36:27 +00:00
d0f32212ed Temporarily disable amd push ci (#35293)
Temporarily disable amd push ci (reduce noise)
2024-12-16 14:18:50 +01:00
85eb339231 Fix : model used to test ggml conversion of Falcon-7b is incorrect (#35083)
fixing test model
2024-12-16 13:21:44 +01:00
14910281a7 Blip: fix offloading and MP tests (#35239)
* fix device map

* fix offloading + model parallel test
2024-12-16 12:44:33 +01:00
66531a1ec3 Aggregate test summary files in CircleCI workflow runs (#34989)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* fix

* fix

* fix

* update

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-16 11:06:17 +01:00
5615a39369 Fall back to slow image processor in ImageProcessingAuto when no fast processor available (#34785)
* refactor image_processing_auto logic

* fix fast image processor tests

* Fix tests fast vit image processor

* Add safeguard when use_fast True and torchvision not available

* change default use_fast back to None, add warnings

* remove debugging print

* call get_image_processor_class_from_name once
2024-12-15 14:00:36 -05:00
ca03842cdc [i18n-Chinese] Translating perf_train_cpu.md to Chinese (#35242)
add "1"
2024-12-13 14:46:49 -08:00
add53e25ff don't use no_sync when deepspeed doesn't support it for certain zero stages (#35157)
* don't use no_sync when deepspeed doesn't support it for certain zero stages

* chore: lint

* fix no_sync context for deepspeed across all zero types

* chore: lint
2024-12-13 19:23:00 +01:00
7237b3ecfc Fix FSDP no longer working (#35212)
Fix FSDP failing
2024-12-13 19:20:51 +01:00
6009642459 Translating agents_advanced.md to Chinese (#35231)
add "translate agents_advanced"
2024-12-13 10:12:00 -08:00
e94083bf90 Fixed typos in Audio Classification Documentation (#35263)
* Fixed typos in Audio Classification Documentation

* removed space in '8000 kHZ'

* Changes made as per review
2024-12-13 09:43:44 -08:00
bc6ae0d55e Update AMD docker image (rocm 6.1) (#35259)
* Use rocm 6.3 as base amd image and add nvidia-ml-py to exclude list

* Align rocm base image with torch wheels @6.1. Seems like the most stable combo
2024-12-13 15:41:03 +01:00
8096161b76 Use rsfE with pytest (#35119)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-13 14:36:22 +01:00
bdd4201fdb [tests] fix "Tester object has no attribute '_testMethodName'" (#34910)
* add more cases

* fix method not found in unittest

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>

* fix more cases

* add more models

* add all

* no unittest.case

* remove for oneformer

* fix style

---------

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-12-13 14:33:45 +01:00
3d213b57fe skip Fuyu from test_generate (#35246)
* skip Fuyu from test_generate

* make fixup, quality, repo-consistency
2024-12-13 10:12:49 +01:00
64478c7631 Add Cohere2 model (#35224) 2024-12-13 09:35:50 +01:00
e4e404fdd0 Run model as compressed/uncompressed mode (#34719)
* draft, run model as compressed/uncompressed mode

* draft

* run run_compressed=False

* run_compressed as attr

* set run_compressed=False using quantization_config

* remove redundant line

* make is_qat_trainable dependent on run_compressed status

* add tests

* lint

* full in docstring

* add decompress

* comments

* decompress if model is compressed and not run_compressed

* apply_quant_config logic fix -- populate statedict properly

* comments

* remove non-compressed model

* make is_compressed as property

* cosmetic

* run apply_quant_config for non-compressed models -- populate scales and zero-points

* add pathway for decompressing sparse models

* typo on is_quantization_compressed

* lint

* fix typo
2024-12-13 08:23:31 +01:00
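A hedged usage sketch, assuming `run_compressed` is exposed through `CompressedTensorsConfig` as described above; the checkpoint name is hypothetical:

```python
from transformers import AutoModelForCausalLM, CompressedTensorsConfig

# run_compressed=False decompresses the weights at load time so the model
# runs (and is QAT-trainable) in uncompressed mode.
quantization_config = CompressedTensorsConfig(run_compressed=False)
model = AutoModelForCausalLM.from_pretrained(
    "org/some-compressed-checkpoint",  # hypothetical compressed-tensors checkpoint
    quantization_config=quantization_config,
)
```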
31f9a289a6 Fix typo in chat template example (#35250)
Fix template example typo
2024-12-12 16:53:21 -08:00
11ba1d472c [Init refactor] Modular changes (#35240)
* Modular changes

* Gemma

* Gemma
2024-12-12 19:23:28 +01:00
a691ccb0c2 Change back to Thread for SF conversion (#35236)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-12 16:05:04 +01:00
e3ee49fcfb Refactoring AssistedCandidateGenerator for Improved Modularity and Reusability (#35009)
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* NOTHING. add space to rerun github actions tests

* remove it...

* replace: `self.prev_tokens` -> `self.prev_assistant_ids`

* NOTHING. rerun CI tests

* remove it

* introduce `self.prev_target_ids_len`

* fix style

* fix style

---------

Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
2024-12-12 15:47:05 +01:00
63766abe36 Support Python 3.10+ Union style in chat template type hints parsing (#35103)
* fix(utils): Support the newest Union type in chat template

* fix(utils/chat_template): Backward compatibility for the newest Union type

* Update src/transformers/utils/chat_template_utils.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-12-12 14:07:06 +00:00
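A short sketch of what the fix enables: a PEP 604 `int | str` hint now parses the same way the older `typing.Union[int, str]` spelling does when building a tool schema:

```python
from transformers.utils import get_json_schema

def lookup(key: int | str) -> str:
    """
    Look up a record by numeric id or by name.

    Args:
        key: The record id or record name.
    """
    return str(key)

print(get_json_schema(lookup))  # key is typed as an int-or-string union
```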
5cf11e5ab9 Fix type hints for apply_chat_template (#35216) 2024-12-12 13:59:24 +00:00
3db8e27816 Fixed typo of 'indentifier' in audio_utils.py (#35226) 2024-12-12 13:45:04 +00:00
a9ccdfd8e3 docs: clarify initializer_range parameter description in Idefics3VisionConfig (#35215) 2024-12-11 11:26:18 -08:00
6181c6b095 Fix seamless TTS generate (#34968)
* fix seamless tts generate

* apply same fix for v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* remove TODO

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* ignore failing test on multigpus

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2
2024-12-11 15:38:42 +01:00
33c12e4d80 Fix CI (#35208)
fix aria
2024-12-11 14:24:52 +01:00
7d303efa5f Cleanup: continue the init refactor (#35170)
* Round 2

* Round 3
2024-12-11 14:12:34 +01:00
5fcf6286bf Add TimmWrapper (#34564)
* Add files

* Init

* Add TimmWrapperModel

* Fix up

* Some fixes

* Fix up

* Remove old file

* Sort out import orders

* Fix some model loading

* Compatible with pipeline and trainer

* Fix up

* Delete test_timm_model_1/config.json

* Remove accidentally committed files

* Delete src/transformers/models/modeling_timm_wrapper.py

* Remove empty imports; fix transformations applied

* Tidy up

* Add image classification model to special cases

* Create pretrained model; enable device_map='auto'

* Enable most tests; fix init order

* Sort imports

* [run-slow] timm_wrapper

* Pass num_classes into timm.create_model

* Remove train transforms from image processor

* Update timm creation with pretrained=False

* Fix gamma/beta issue for timm models

* Fixing gamma and beta renaming for timm models

* Simplify config and model creation

* Remove attn_implementation diff

* Fixup

* Docstrings

* Fix warning msg text according to test case

* Fix device_map auto

* Set dtype and device for pixel_values in forward

* Enable output hidden states

* Enable tests for hidden_states and model parallel

* Remove default scriptable arg

* Refactor inner model

* Update timm version

* Fix _find_mismatched_keys function

* Change inheritance for Classification model (fix weights loading with device_map)

* Minor bugfix

* Disable save pretrained for image processor

* Rename hook method for loaded keys correction

* Rename state dict keys on save, remove `timm_model` prefix, make checkpoint compatible with `timm`

* Managing num_labels <-> num_classes attributes

* Enable loading checkpoints in Trainer to resume training

* Update error message for output_hidden_states

* Add output hidden states test

* Decouple base and classification models

* Add more test cases

* Add save-load-to-timm test

* Fix test name

* Fixup

* Add do_pooling

* Add test for do_pooling

* Fix doc

* Add tests for TimmWrapperModel

* Add validation for `num_classes=0` in timm config + test for DINO checkpoint

* Adjust atol for test

* Fix docs

* dev-ci

* dev-ci

* Add tests for image processor

* Update docs

* Update init to new format

* Update docs in configuration

* Fix some docs in image processor

* Improve docs for modeling

* fix for is_timm_checkpoint

* Update code examples

* Fix header

* Fix typehint

* Increase tolerance a bit

* Fix Path

* Fixing model parallel tests

* Disable "parallel" tests

* Add comment for metadata

* Refactor AutoImageProcessor for timm wrapper loading

* Remove custom test_model_outputs_equivalence

* Add require_timm decorator

* Fix comment

* Make image processor work with older timm versions and tensor input

* Save config instead of whole model in image processor tests

* Add docstring for `image_processor_filename`

* Sanitize kwargs for timm image processor

* Fix doc style

* Update check for tensor input

* Update normalize

* Remove _load_timm_model function

---------

Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
2024-12-11 12:40:30 +00:00
bcc50cc7ce [PEFT] Better Trainer error when prompt learning with loading best model at the end (#35087)
Original issue: https://github.com/huggingface/peft/issues/2256

There is a potential error when using load_best_model_at_end=True with a
prompt learning PEFT method. This is because Trainer uses load_adapter
under the hood but with some prompt learning methods, there is an
optimization on the saved model to remove parameters that are not
required for inference, which in turn requires a change to the model
architecture. This is why load_adapter will fail in such cases and users
should instead set load_best_model_at_end=False and use
PeftModel.from_pretrained. As this is not obvious, we now intercept the
error and add a helpful error message.
2024-12-11 12:44:39 +01:00
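A workaround sketch matching the guidance above; the model id and checkpoint path are illustrative:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Train with load_best_model_at_end=False, then reload the best prompt-learning
# checkpoint explicitly instead of relying on Trainer's load_adapter path.
base = AutoModelForCausalLM.from_pretrained("some-base-model")
model = PeftModel.from_pretrained(base, "output_dir/checkpoint-500")
```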
d363e71d0e 🧹 Remove deprecated RotaryEmbedding parts in the Attention layers (#34858)
* update

* style

* fix missing args

* remove last trace of old rope classes

* remove deprecated copied from

* fix copies

* trigger CIs

* post rebase clean-up

* reverse mistral

* cleanup after dropping commits

* Add comment
2024-12-11 11:16:52 +01:00
9094b87dd4 BLIP: enable device map (#34850)
fix device map
2024-12-11 11:03:30 +01:00
10feacd88a [i18n-<languageCode>] Translating agents.md to Chinese (#35139)
* add "translate agents.md"

* add "agents.md"

* add "translate warnings"

* add "totree"

* add "remove transformer_agent"

* add "remove transformer _agent file"

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-10 15:16:37 -08:00
e8508924fd Update data collator docstrings to accurately reference Nvidia tensor core compute capability version (#35188)
update data collator docs to reflect correct tensor core compute capability

Co-authored-by: John Graham Reynolds <john.graham.reynolds@vumc.org>
2024-12-10 15:16:01 -08:00
5290f6a62d [docs] Fix FlashAttention link (#35171)
fix link
2024-12-10 11:36:25 -08:00
91b8ab18b7 [i18n-<languageCode>] Translating Benchmarks.md to Chinese (#35137)
* add "Translating Benchmarks.md to Chinese "

* Removed all the English original text (which was previously kept as comments in the document) and refined some of the Chinese expressions.
2024-12-10 09:58:47 -08:00
217c47e31b Only import torch.distributed if it is available (#35133) 2024-12-10 18:19:30 +01:00
52d135426f Multiple typo fixes in NLP, Audio docs (#35181)
Fixed multiple typos in Tutorials, NLP, and Audio sections
2024-12-10 09:08:55 -08:00
425af6cdc2 [i18n-ar] Translated file : docs/source/ar/community.md into Arabic (#33027)
* Add docs/source/ar/community.md to Add_docs_source_ar_community.md

* Update community.md

* Update community.md

* Update community.md

* Update _toctree.yml - add community.md

* Update docs/source/ar/community.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Create how_to_hack_models.md

* Create modular_transformers.md

* Create tiktoken.md

* Update _toctree.yml

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tiktoken.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tiktoken.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-10 09:08:27 -08:00
e5c45a6679 Fixing GGUF support for StableLm (#35060)
fix

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-10 16:30:09 +01:00
3e2769a3c9 Fix DBRX LayerNorm init method (#35177)
fix dbrx layernorm init
2024-12-10 14:31:22 +00:00
5fba3f99c0 Remove unnecessary masked_fill in deberta models (#35182) 2024-12-10 13:52:20 +00:00
6acb4e43a7 Support BatchNorm in Hubert pos_conv_emb as in fairseq (#34389)
* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* Correct the new defaults (#34377)

* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue

* [auto. ping] Avoid sending empty info + add more team members (#34383)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix glm  (#34388)

* Fix duplicated

* fix import

* Use non nested images and batched text Idefics2/3  (#34222)

* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests

* Fix onnx non-exportable inplace aten op (#34376)

* fix onnx non-exportable inplace op

* mistral, qwen2, qwen2_vl, starcoder2

* fixup copies

* Fix right padding in LLaVA models (#34305)

* fix right pad llavas

* device mismatch

* no filter (#34391)

* no filter

* no filter

* no filter

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* SynthID: better example (#34372)

* better example

* Update src/transformers/generation/configuration_utils.py

* Update src/transformers/generation/logits_process.py

* nits

* Tests: upgrade `test_eager_matches_sdpa_generate` (#34386)

* Fix bnb training test failure (#34414)

* Fix bnb training test: compatibility with OPTSdpaAttention

* Avoid check expected exception when it is on CUDA (#34408)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix typos in agents_advanced.md (#34405)

* [docs] Cache implementations (#34325)

cache

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* [run-slow] hubert

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Rudy Delouya <rudy.delouya@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-12-10 14:18:23 +01:00
80f2b1610f Fix file path for shard_num 1 with mllama converter (#35053)
"#35049 fix path for num_shard 1"
2024-12-10 09:11:45 +00:00
0938b57770 Assisted decoding multi-gpu (#35116)
* fix

* move a few lines up
2024-12-10 09:59:17 +01:00
dada0fd85f Fix num_items_in_batch not being an integer (#35115)
In method `Trainer#get_batch_samples`, the return values should be a
list of batch samples and an integer indicating the number of items that
exist in the batch. However, this was not actually the case: what was
returned instead of an integer was a tensor with one element. In the
multi-GPU setup, this tensor is placed in a different device than the
loss tensor, causing the loss function to raise a `RuntimeError`.

The problem arises from
5d7739f15a/src/transformers/trainer.py (L5139-L5144),
where the outer `sum` operates over a list of tensors which means that
the final result is also a tensor. To counter this issue, a new check
(after the accelerator gathering) has been added in order to convert a
potential tensor to an integer before returning the
`num_items_in_batch`.
2024-12-10 08:40:40 +01:00
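A minimal sketch of the normalization described above:

```python
import torch

num_items_in_batch = torch.tensor([17])  # stand-in for the gathered value

# A one-element tensor may sit on a different device than the loss, so
# convert it to a plain Python int before it is used in loss scaling.
if torch.is_tensor(num_items_in_batch):
    num_items_in_batch = int(num_items_in_batch.item())
print(num_items_in_batch)  # 17
```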
34f4080ff5 [CI] Fix bnb quantization tests with accelerate>=1.2.0 (#35172) 2024-12-09 13:55:16 -05:00
fa8763ce17 Fixed typo of 'avilable' in prompts.py (#35145) 2024-12-09 16:40:32 +00:00
4bc39de5c3 Super tiny fix logging message (#35132)
Update integration_utils.py
2024-12-09 16:31:32 +00:00
8e806a336f Cleanup: continue the init refactor (#35167)
Round 2
2024-12-09 16:09:50 +01:00
7238387f67 Fix typo in EETQ Tests (#35160)
fix
2024-12-09 14:13:36 +01:00
de8a0b7547 Option to set 'non_blocking' for to(device) in BatchEncoding and BatchFeature (#34883)
* Option to set 'non_blocking' for to(device) operation for performance improvements. Defaults to 'false', thus no behavioral changes.

* Enabling non_blocking in to() operation of BatchFeature.

* Improved docstring on utilization of non_blocking

* Force non_blocking as keyword argument

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Daniel Bogdoll <dbogdoll@umich.edu>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-12-09 11:29:04 +01:00
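A usage sketch for the new keyword; note that overlapping host-to-device copies only pays off when the source memory is pinned:

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["hello world"], return_tensors="pt")

if torch.cuda.is_available():
    # non_blocking is keyword-only, per the change above; it defaults to
    # False, so existing behavior is unchanged.
    batch = batch.to("cuda", non_blocking=True)
```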
1452dc2514 Corrected typo in agent system prompts (#35143) 2024-12-09 10:42:23 +01:00
9e420e0269 [I-JEPA] Update docs (#35148)
Update docs
2024-12-09 10:01:31 +01:00
1ccca8f48c Fix GA loss bugs and add unit test (#35121)
* fix GA bugs and add unit test

* narrow down model loss unit test diff gap

* format code to make ruff happy

* send num_items_in_batch argument to decoder

* fix GA loss bug in BertLMHeadModel

* use TinyStories-33M to narrow down diff gap

* fotmat code

* missing .config

* avoid add extra args

---------

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-12-09 09:57:41 +01:00
c8c8dffbe4 Update I-JEPA checkpoints path (#35120)
Update checkpoints path
2024-12-06 13:42:51 +00:00
7f95372c62 Add feature dim attributes to BitLinear for easier PEFT integration (#34946)
Update bitnet.py, extremely small change to allow for easier PEFT integration

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-06 13:39:45 +01:00
9ad4c93536 Add Aria (#34157)
* Add Aria

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-06 12:17:34 +01:00
15ab310c3a Fix private forked repo. CI (#35114)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-06 12:03:31 +01:00
98e8062df3 [docs] top_p, top_k, temperature docstrings (#35065)
clarify
2024-12-05 11:24:51 -08:00
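A sketch exercising the clarified knobs; the checkpoint is a small illustrative one:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The secret to baking good bread is", return_tensors="pt")

# temperature rescales the logits, top_k keeps only the k most likely
# tokens, and top_p keeps the smallest set with cumulative probability >= p.
outputs = model.generate(
    **inputs, do_sample=True, temperature=0.7, top_k=50, top_p=0.9, max_new_tokens=30
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```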
44f88d8ccb [docs] Update Python version in translations (#35096)
update: doc version
2024-12-05 11:06:54 -08:00
66ab300aaf Dev version 2024-12-05 19:12:22 +01:00
a5bb528471 Fix signatures for processing kwargs (#35105)
* add conversion script

* remove pg2 refs

* fixup style

* small update

* get correct scaling

* add back missing bos

* fix missing config keys

* might revert this pos_embeddings

* fixup 9b config

* fix 9b

* fixup 9b conversion for good + add back num_hidden_layers

* add correct query scaling for 2b, 9b, 27b

* fixup 27b conversion

* Additional variant: 27b-896

* Use CPU for conversion to reduce GPU RAM requirements

* fix causal mask generation + formatting

* fix in-training causal mask generation edge case

* trigger CI

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* move conversion file to main model dir

* handle multi-images + bos token

* address comments for input ids

* revert ci fixes

* [run-slow] paligemma

* fix

* [run-slow] paligemma

* skip end 2 end

* [run-slow] paligemma

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-05 18:15:48 +01:00
e27465c801 Adaptive dynamic number of speculative tokens (#34156)
* initial commit

* update strategy

* add tradeoff FPR TPR with cost

* all probs

* fix

* fix

* fix style

* Update src/transformers/generation/configuration_utils.py

shorter docstring

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* import guard

* fix style

* add is_sklearn_available condition

* vectorizing to flatten the for-loop

* fix style

* disable adaptation for UAG

* update doc

* add TestAssistedCandidateGeneratorUpdateStrategy

* fix style

* protect import

* fix style

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-12-05 17:07:33 +01:00
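For context, a minimal assisted-generation sketch of the mechanism the adaptive strategy above tunes; the small/large checkpoint pairing is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1.4b-deduped")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1.4b-deduped")
# A small draft model proposes candidate tokens that the big model verifies.
assistant = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m-deduped")

inputs = tokenizer("Alice and Bob", return_tensors="pt")
outputs = model.generate(**inputs, assistant_model=assistant, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```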
b0a51e5cff Fix flaky Hub CI (test_trainer.py) (#35062)
* fix

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* check

* check

* check

* check

* check

* check

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* check

* check

* check

* Final space

* Final adjustment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2024-12-05 17:02:27 +01:00
a928d9c128 [trainer] fix the GA model_accepts_loss_kwargs (#34915)
* fix

* style

* values

* fix
2024-12-05 16:37:46 +01:00
e682c17e4a BLIP: this is correct now (#35081)
this is correct now
2024-12-05 16:30:09 +01:00
50189e36a6 Add I-JEPA (#33125)
* first draft

* add IJepaEmbeddings class

* fix copy-from for IJepa model

* add weight conversion script

* update attention class names in IJepa model

* style changes

* Add push_to_hub option to convert_ijepa_checkpoint function

* add initial tests for I-JEPA

* minor style changes to conversion script

* make fixup related

* rename conversion script

* Add I-JEPA to sdpa docs

* minor fixes

* adjust conversion script

* update conversion script

* adjust sdpa docs

* [run_slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* formatting issues

* adjust modeling to modular code

* add IJepaModel to objects to ignore in docstring checks

* [run-slow] ijepa

* fix formatting issues

* add usage instruction snippet to docs

* change pos encoding, add checkpoint for doc

* add verify logits for all models

* [run-slow] ijepa

* update docs to include image feature extraction instructions

* remove pooling layer from IJepaModel in image classification class

* [run-slow] ijepa

* remove pooling layer from IJepaModel constructor

* update docs

* [run-slow] ijepa

* [run-slow] ijepa

* small changes

* [run-slow] ijepa

* style adjustments

* update copyright in init file

* adjust modular ijepa

* [run-slow] ijepa
2024-12-05 16:14:46 +01:00
95a855e212 Deprecate quanto and switch to optimum-quanto (#35001)
* deprecate quanto

* fix style
2024-12-05 16:11:09 +01:00
482cb28a18 Fix tie_word_embeddings handling for GGUF models (#35085)
* fix tie_word_embeddings

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2024-12-05 16:00:41 +01:00
35447054f5 Update Mistral conversion script (#34829)
* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py
2024-12-05 15:47:20 +01:00
93f87d3cf5 [tokenizers] bump to 0.21 (#34972)
bump to 0.21
2024-12-05 15:46:02 +01:00
54aae121eb [Whisper] Fix whisper tokenizer (#34537)
* handle single timestamp ending

* include last timestamp token

* handle single timestamp ending

* avoid floating points arithm limitations

* ensure float64 operations

* new test

* make fixup

* make copies

* handle edge case double tokens ending with different tokens

* handle single timestamp ending

* make fixup

* handle conditioning on prev segments

* fix

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* [run-slow] whisper

* don't call item() to avoid unnecessary sync

* fix

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
2024-12-05 13:46:29 +01:00
beb2c66ec3 Informative (#35059)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-05 09:50:27 +01:00
1ed1de2fec [docs] Increase visibility of torch_dtype="auto" (#35067)
* auto-dtype

* feedback
2024-12-04 09:18:44 -08:00
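The pattern the doc change promotes, in one line; the checkpoint is illustrative:

```python
from transformers import AutoModelForCausalLM

# torch_dtype="auto" loads weights in the dtype recorded in the checkpoint's
# config (e.g. bfloat16) instead of upcasting everything to float32.
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype="auto")
print(model.dtype)
```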
baa3b22137 [docs] add a comment that offloading requires CUDA GPU (#35055)
* add commen to offloading

* Update docs/source/en/kv_cache.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-04 07:48:34 -08:00
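A sketch of the feature the docs note refers to, assuming a CUDA GPU and an illustrative model id: KV-cache offloading keeps past key/values in CPU RAM and streams them back layer by layer.

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # illustrative model id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="cuda"
)
inputs = tok("The capital of France is", return_tensors="pt").to("cuda")
# "offloaded" stores past key/values on CPU; per the note above this path needs CUDA
out = model.generate(**inputs, max_new_tokens=20, cache_implementation="offloaded")
```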
1da1e0d7f2 Support for easier multimodal use of modular (#35056)
* update modular and add examples

* style

* improve example comments

* style

* fix small logic issue for imports

* fix relative order issue when files do not make sense

* Improve comments

* trigger CIs
2024-12-04 15:13:11 +01:00
46df859975 [GPTNeoX] Flex Attention + Refactor (#34896)
* gpt neox flex attention + refactor

* some formatting

* small fix on dropout

* add assertion on flex attn test

* flaky ci :(

* add head mask support

* style

* handle dtype, replace torch where

* fixup flex with output attns

* code review and several other fixes

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

* remove unnecessary comment

* remove incorrect comment

* make flex attn check more agnostic to versions and centralized

* change peft input dtype check to value since q and k could be affected by other stuff like RoPE

* I forgot

* flaky

* code review and small fixes

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:48:28 +01:00
accb7204f9 Add Pytorch Tensor Parallel support for Qwen2, Qwen2Moe, Starcoder2 (#35007)
* add base tp plan for qwen2 and qwen2moe

* add parallel tp for starcoder2

* fix modular conversion

* add infer dim for qkv states

* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:43:36 +01:00
c7a109ec81 Fix pad_token_tensor is None in warning (#34005)
Fix pad_token_tensor is None in warning
2024-12-04 11:15:25 +01:00
329f5dbf97 [docs] use device-agnostic API instead of hard-coded cuda (#35048)
replace cuda
2024-12-03 10:54:15 -08:00
b8cdc262d5 [docs] use device-agnostic instead of cuda (#35047)
* fix on xpu

* [run_all]

* add the missing import for Image lib

* add more devices in comment

* bug fix

* replace cuda
2024-12-03 10:53:45 -08:00
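Both device-agnostic docs commits above replace hard-coded "cuda" strings; a minimal sketch of the pattern using only core torch availability checks:

```py
import torch

# pick whichever accelerator is present instead of hard-coding "cuda"
if torch.cuda.is_available():
    device = "cuda"
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    device = "xpu"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(device)
```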
346597b644 Translate community.md into Chinese (#35013)
* community translation

* Update docs/source/zh/community.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-03 10:22:02 -08:00
3deaa8179d [docs] fix example code bug (#35054)
fix code bug
2024-12-03 09:18:39 -08:00
125de41643 fix speecht5 failure issue in test_peft_gradient_checkpointing_enable… (#34454)
* fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* [run-slow] speecht5

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-12-03 13:58:54 +00:00
7a7f27697a Fix BertGeneration (#35043)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-03 13:56:59 +01:00
901f504580 Add token cost + runtime monitoring to Agent and HfEngine children (#34548)
* Add monitoring to Agent and HfEngine children
2024-12-03 13:14:52 +01:00
ee37bf0d95 Automatic compilation in generate: do not rely on inner function (#34923)
* compiled forward in PreTrainedModel

* update

* style

* update name

* trigger CIs

* Add way to use custom compile args

* style

* switch parameterization to generation_config

* Add to inits

* Update configuration_utils.py

* inits

* style

* docs

* style

* Update configuration_utils.py

* back without dataclass for repo consistency

* Update configuration_utils.py

* style

* style

* style once again

* add config serialization

* update

* true dataclass

* trigger CIs

* merge compile methods + remove serialization of compile config
2024-12-03 11:20:31 +01:00
f9c7e6021e Translate bertology.md into Chinese (#34908)
* bertology translation

* Update docs/source/zh/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: blueingman <15329507600@163.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-02 11:42:40 -08:00
527dc04e46 [docs] add the missing import for Image and bug fix (#34776)
* add the missing import for Image lib

* add more devices in comment

* bug fix
2024-12-02 11:40:20 -08:00
4955e4e638 [i18n-ar] Translated file : docs/source/ar/notebooks.md into Arabic (#33049)
* Add docs/source/ar/notebooks.md to Add_docs_source_ar_notebooks.md

* Update notebooks.md

* Update _toctree.yml
2024-12-02 11:40:04 -08:00
f0dec874f0 add docstring example for compute_loss_func (#35020) 2024-12-02 11:39:09 -08:00
31299670cd Multiple typo fixes in Tutorials docs (#35035)
* Fixed typo in multi gpu docs and OLMoE version

* Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction

* Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs
2024-12-02 15:26:34 +00:00
31830474bf Fix test_eager_matches_sdpa_inference for XPU backend (#34889)
* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Fix test_eager_matches_sdpa_inference for XPU backend

As of PyTorch 2.5 XPU backend supports only torch.nn.attention.SDPBackend.MATH
which is implemented on PyTorch level using aten operators and is device
agnostic with respect to implementation of each aten operator. Thus, we can
reuse CUDA (or CPU) MATH weights for XPU.

Fixes: #34888
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-12-02 16:21:04 +01:00
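A sketch of the API swap in the first bullet (PyTorch >= 2.3): the deprecated torch.backends.cuda.sdp_kernel becomes the device-neutral torch.nn.attention.sdpa_kernel context manager.

```py
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

q = k = v = torch.randn(1, 4, 8, 16)
# force the MATH backend, the only SDPA backend XPU supports as of PyTorch 2.5
with sdpa_kernel(SDPBackend.MATH):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
```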
f41d5d8f74 Add type hints for forward functions in Gemma2 (#35034)
* feat: add gemma2 type hints

* fix: mask is optional
2024-12-02 14:03:36 +00:00
7b5f76e32e Typo in warning switching to optimum-quanto (#35028)
fix typos
2024-12-02 13:47:05 +00:00
c24c79ebf9 Optimize memory usage of mllama encoder (#34930)
mllama encoder memory optimization
2024-12-02 11:46:45 +01:00
9ab8c5b503 fix variable undefined bug when return_tensors is not specified in llava processing (#34953)
* fix variable undefined bug when return_tensors is not specified in llava processor

* improve readability
2024-12-02 11:44:42 +01:00
3480cbb97e Only cast cu_seqlens when tracing (#35016)
* Only cast `cu_seqlens` when tracing

* Formatting
2024-12-02 11:39:39 +01:00
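A sketch of the kind of guard the title describes, assuming the int32 cast is only needed on the traced/export path (tensor contents illustrative):

```py
import torch

cu_seqlens = torch.tensor([0, 5, 9], dtype=torch.int64)
# cast only while tracing (e.g. during torch.onnx.export); eager execution keeps int64
if torch.jit.is_tracing():
    cu_seqlens = cu_seqlens.to(torch.int32)
```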
19dabe9636 Update FillMaskPipeline.__call__ signature and docstring (#35006)
Update `FillMaskPipeline.__call__`

- Remove unused `*args`
- Update docstring with `inputs` over `args`
2024-11-29 13:44:56 +00:00
f7427f58ed fix: double verbs (#35008) 2024-11-29 13:19:57 +00:00
737f4dc4b6 Update timm version (#35005)
* Bump timm

* dev-ci
2024-11-29 12:46:59 +00:00
89d7bf584f 🚨🚨🚨 Uniformize kwargs for TrOCR Processor (#34587)
* Make kwargs uniform for TrOCR

* Add tests

* Put back current_processor

* Remove args

* Add todo comment

* Code review - breaking change
2024-11-29 11:58:11 +00:00
0b5b5e6a70 Let server decide default repo visibility (#34999)
* Let server decide default repo visibility

* code style
2024-11-28 17:05:08 +01:00
f491096f7d Fix docker CI : install autogptq from source (#35000)
* Fixed Docker

* Test ci

* Finally

* add comment
2024-11-28 16:31:36 +01:00
01ad80f820 Improve .from_pretrained type annotations (#34973)
* Fix from_pretrained type annotations

* Better typing for image processor's `from_pretrained`
2024-11-28 15:05:19 +00:00
9d6f0ddcec Add optimized PixtralImageProcessorFast (#34836)
* Add optimized PixtralImageProcessorFast

* make style

* Add dummy_vision_object

* Review comments

* Format

* Fix dummy

* Format

* np.ceil for math.ceil
2024-11-28 16:04:05 +01:00
6300212946 Fix utils/check_bad_commit.py (for auto ping in CI) (#34943)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-28 15:34:38 +01:00
5e8c1d713d Offloaded cache: fix generate (#34921)
* fix cache impl

* require_torch_gpu

* fix mamba

* fix copies
2024-11-28 15:05:56 +01:00
57ca9e6d2f Allow compressed-tensors quantized model to be trained (#34520)
* populate quantization_config for kv-cache-scheme only configs

* make compressed-tensors quantized models trainable

* populate versions on quant config

* pass oneshot then finetune

* remove breakpoint

* SunMarc comments and fix to_dict logic

* lint

* lint

* test

* comment

* comments'
2024-11-28 15:05:16 +01:00
44af935ec5 Refine the code of Universal Assisted Generation (#34823)
* removed the useless attributes

* add configs for window size

* fixed the wrong kwargs

* added docstring
2024-11-28 15:04:24 +01:00
2b053fdf1a 🚨🚨🚨 Changed DINOv2Config default patch size to 14 (#34568)
Changed DINOv2Config default patch size to 14
2024-11-28 14:48:06 +01:00
4f0bf9864c Fix save_pretrained for partially offloaded models (#34890)
* delete unnecessary reference

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update comment, explicit delete state_dict

* Update src/transformers/modeling_utils.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-11-28 14:46:56 +01:00
f4b674f269 [PEFT] Set eval mode when loading PEFT adapter (#34509)
* [PEFT] Set eval mode when loading PEFT adapter

Resolves #34469

When calling model.load_adapter to load a PEFT adapter, by default the
adapter should be set to eval mode. This is now correctly done. Users
can still pass is_trainable=True to load the adapter in training mode.

* Linter
2024-11-28 13:56:25 +01:00
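A sketch of the new default (base model and adapter ids illustrative): load_adapter now puts adapter modules in eval mode unless is_trainable=True is passed.

```py
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # illustrative base model
# default: the loaded adapter is set to eval mode (dropout etc. disabled)
model.load_adapter("ybelkada/opt-350m-lora")  # illustrative adapter id
# opt back into training mode explicitly
model.load_adapter("ybelkada/opt-350m-lora", adapter_name="train", is_trainable=True)
```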
5523e38b55 Fixed typo in VisitWebpageTool (#34978)
Fixed typo in VisitWebpageTool
2024-11-27 12:49:21 -08:00
4120cb257f Fix typo in code block in vipllava.md (#34957)
fix typo in code block in vipllava.md
2024-11-27 08:19:34 -08:00
2910015d6d [i18n-zh]Translated perf_train_special.md into Chinese (#34948)
* Add translation for perf_train_special documentation

* Update docs/source/zh/perf_train_special.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/perf_train_special.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update _toctree.yml

* Update _toctree.yml

* Update perf_train_special.md

* Update perf_train_special.md

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-11-27 07:57:43 -08:00
637225508f [docs] add explanation to release_memory() (#34911)
* explain release_memory

* Update docs/source/en/llm_tutorial_optimization.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-27 07:47:28 -08:00
0600f46353 🌐 [i18n-KO] Translated encoder-decoder.md to Korean (#34880)
* Initial version of translation, english still remaining

* Revised Translation, removed english. _toctree not updated

* updated _toctree.yml && 3rd ver translation

* updated _toctree.yml && 3rd ver translation

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2024-11-27 07:47:14 -08:00
5f8b24ee12 Fix flaky test execution caused by Thread (#34966)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 16:32:50 +01:00
0d99a938aa Avoid calling get_max_length (#34971)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 15:15:35 +01:00
8f48ccf548 Fix : Add PEFT from source to CI docker (#34969)
* Docker fix peft

* Test new docker

* uncomment
2024-11-27 14:10:47 +01:00
4c1388f48e [FlexAttention] Update gemma2 (#34942)
* update tests

* now maybe this fixes the previously failing tests!

* nit default

* Update src/transformers/models/gemma2/modular_gemma2.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix-copies

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2024-11-27 11:50:48 +01:00
6c3f168b36 [i18n-zh]Translated tiktoken.md into chinese (#34936)
* Add translation for tiktoken documentation

* Update tiktoken.md

* Update tiktoken.md
2024-11-26 10:09:52 -08:00
5bfb40bc8e docs: HUGGINGFACE_HUB_CACHE -> HF_HUB_CACHE (#34904) 2024-11-26 09:37:18 -08:00
784d22078a [doc] use full path for run_qa.py (#34914)
use full path for run_qa.py
2024-11-26 09:23:44 -08:00
6bc0c219c1 [docs] use device-agnostic API instead of cuda (#34913)
add device-agnostic API

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-11-26 09:23:34 -08:00
64b73e61f8 [i18n-ar] Translated file : docs/source/ar/benchmarks.md into Arabic (#33023)
* Add docs/source/ar/benchmarks.md to Add_docs_source_ar_benchmarks.md

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update benchmarks.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-11-26 09:23:11 -08:00
a0ba631519 Update the Python version in the Chinese README to match the English README. (#34870)
Update Python Version
2024-11-26 09:22:34 -08:00
1f6b423f0c Fix torch.onnx.export of Qwen2-VL vision encoder (#34852)
* Fix torch.onnx.export of Qwen2-VL vision encoder

This PR fixes onnx export support for the vision encoder of Qwen2-VL, whose code converts `cu_seqlens` to `torch.int32`, leading to errors later on when the values are used for slicing.

c57eafdaa1/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py (L1044-L1046)

## Error:
```
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Slice, node name: /blocks.0/attn/Slice_4): axes has inconsistent type tensor(int64)
```

## Code to reproduce issue:
```py

import requests
from PIL import Image
import torch
from transformers import (
    AutoProcessor,
    Qwen2VLForConditionalGeneration,
)

# Constants
VISION_MODEL_NAME = "vision_encoder.onnx"

# Load model and processor
model_id = "hf-internal-testing/tiny-random-Qwen2VLForConditionalGeneration"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id).eval()
processor = AutoProcessor.from_pretrained(model_id)

# Prepare inputs
url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
conversation = [
    {
        "role": "user",
        "content": [
            { "type": "image" },
            { "type": "text", "text": "Describe this image."},
        ],
    },
]
images = [image]
text_prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=[text_prompt], images=images, padding=True, return_tensors="pt")

## Vision model
vision_inputs = dict(
    pixel_values=inputs["pixel_values"],
    grid_thw=inputs["image_grid_thw"],
)
vision_inputs_positional = tuple(vision_inputs.values())
vision_outputs = model.visual.forward(*vision_inputs_positional)  # Test forward pass
torch.onnx.export(
    model.visual,
    args=vision_inputs_positional,
    f=VISION_MODEL_NAME,
    export_params=True,
    opset_version=14,
    do_constant_folding=True,
    input_names=list(vision_inputs.keys()),
    output_names=["image_features"],
    dynamic_axes={
        "pixel_values": {
            0: "batch_size * grid_t * grid_h * grid_w",
            1: "channel * temporal_patch_size * patch_size * patch_size",
        },
        "grid_thw": {0: "batch_size"},
        "image_features": {0: "batch_size * grid_t * grid_h * grid_w"},
    },
)

# Load and check the exported model
import onnx
model = onnx.load(VISION_MODEL_NAME)
onnx.checker.check_model(model, full_check=True)
inferred = onnx.shape_inference.infer_shapes(model, check_type=True)
```

* Formatting

* [run-slow] qwen2_vl
2024-11-26 16:14:36 +01:00
d5cf91b346 Separate chat templates into a single file (#33957)
* Initial draft

* Add .jinja file loading for processors

* Add processor saving of naked chat template files

* make fixup

* Add save-load test for tokenizers

* Add save-load test for tokenizers

* stash commit

* Try popping the file

* make fixup

* Pop the arg correctly

* Pop the arg correctly

* Add processor test

* Fix processor code

* stash commit

* Processor clobbers child tokenizer's chat template

* Processor clobbers child tokenizer's chat template

* make fixup

* Split processor/tokenizer files to avoid interactions

* fix test

* Expand processor tests

* Rename arg to "save_raw_chat_template" across all classes

* Update processor warning

* Move templates to single file

* Move templates to single file

* Improve testing for processor/tokenizer clashes

* Improve testing for processor/tokenizer clashes

* Extend saving test

* Test file priority correctly

* make fixup

* Don't pop the chat template file before the slow tokenizer gets a look

* Remove breakpoint

* make fixup

* Fix error
2024-11-26 14:18:04 +00:00
5a45617887 change apply_rotary_pos_emb of Glmmodel for GLM-Edge Series model (#34629)
* change apply_rotary_pos_emb

* upload for glm-edge

* remove useless part

* follow the suggestion

* fix

* format

* format

* test

* format again

* format again

* remove modular change

* remove modular change

* does this apply_rotary_pos_emb need modifying?

* fix with this

* format

* format

* ruff check

* modify modular_glm failed

* remove partial_rotary_factor of function  partial_rotary_factor

* fix wrong change of examples/research_projects

* revert

* remove line 118

* use q_rot
2024-11-26 15:05:42 +01:00
1141eff1bd Add Pytorch Tensor Parallel support for Mistral (#34927)
add base tp support
2024-11-26 14:28:07 +01:00
4d1d0f29a4 [Whisper] Fix whisper integration tests (#34111)
* fix test_tiny_timestamp_generation

* fix test_large_timestamp_generation

* fix test_whisper_shortform_single_batch_prev_cond

* fix test_whisper_shortform_multi_batch_hard_prev_cond

* return_timestamps necessary with long form

* fix test_default_multilingual_transcription_long_form

* fix test_tiny_token_timestamp_generation_longform

* fix test_whisper_longform_multi_batch_hard

* Update tests/models/whisper/test_modeling_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* fix typo

* do not expect special tokens

* fix test_whisper_longform_single_batch_beam

* fix test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* these tests do not make sense anymore

* this test does not make sense anymore

* make fixup

* suggested nits

* add test with forced_decoder_ids

* this test does not make sense anymore

* change assert for unittest test cases

* make fixup

* test with prompt_ids and task and language

* fix unittest test case call

* fix test_tiny_generation

* fix test_tiny_en_generation

* fix test_tiny_en_batched_generation

* fix test_tiny_longform_timestamps_generation

* fix test_tiny_timestamp_generation

* fix test_large_generation

* fix test_large_batched_generation

* fix test_large_generation_multilingual

* fix test_large_timestamp_generation

* fix test_large_timestamp_generation

* fix test_tiny_token_timestamp_generation_longform

* fix test_tiny_en_batched_generation

* make fixup

* [run-slow] whisper

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-11-26 12:23:08 +01:00
0e805e6d1e Skipping aqlm non working inference tests till fix merged (#34865) 2024-11-26 11:09:30 +01:00
73b4ab1085 VideoLLaVA: add default values (#34916)
add default values
2024-11-26 08:20:06 +01:00
bdb29ff9f3 Fix import structure for Fast Image processors (#34859)
* Fix import structure image_processor_fast

* update to new inits
2024-11-25 16:27:56 -05:00
bfc3556b20 making gpt2 fx traceable (#34633)
* making gpt2 fx traceable

* running make fix-copies

* Revert "running make fix-copies"

This reverts commit 5a3437cb5b63799243bceae7d21a2aed8d0418c7.
2024-11-25 19:30:38 +01:00
95c10fedb3 Updated documentation and added conversion utility (#34319)
* Updated documentation and added conversion utility

* Update docs/source/en/tiktoken.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tiktoken.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Moved util function to integration folder + allow for str

* Update formatting

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Updated formatting

* style changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 18:44:09 +01:00
890ea7de93 Fix failing GGML test (#34871)
fix_test
2024-11-25 18:04:52 +01:00
b76a292bde Upgrade torch version to 2.5 in dockerfile for quantization CI (#34924)
* Upgrade Torch 2.5

* uncomment
2024-11-25 17:38:20 +01:00
a830df2909 Fix test_auto_backbone_timm_model_from_pretrained (#34877)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-25 17:20:41 +01:00
a464afbe2a fix static cache data type mismatch (#34799)
* fix gptj data type mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add low precision static cache tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix low-precision static cache tests

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* avoid config change

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* change data type convert in cache copy

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* cast key value after k v out

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2024-11-25 16:59:38 +01:00
b13916c09d [AWQ, CI] Bump AWQ version used in docker image (#34922)
The old AWQ version is failing with the latest (unreleased)
transformers, giving the error:

> ImportError: cannot import name 'shard_checkpoint' from
'transformers.modeling_utils'

This has been resolved in awq v0.2.7:

https://github.com/casper-hansen/AutoAWQ/pull/644
2024-11-25 16:49:57 +01:00
4e6b19cd95 Fix : BitNet tests (#34895)
* fix_tests_bitnet

* fix format
2024-11-25 16:47:14 +01:00
9121ab8fe8 Rename OLMo November to OLMo2 (#34864)
* Rename/move OLMo Nov files to OLMo2

* Rename Olmo1124 and its variants to Olmo2
2024-11-25 16:31:22 +01:00
1de3598d30 Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/lxmert (#34917)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-25 15:19:29 +00:00
f4c04ba32b Fix Qwen2 failing tests (#34819)
* fix: qwen2 model ids

* fix: line

* fix: more format

* update: reformat
2024-11-25 15:53:04 +01:00
11cc2295c7 [peft] Given that self.active_adapter is deprecated, avoid using it (#34804)
* Given that self.active_adapter is deprecated, avoid using it

* Remove misleading comment - `self.active_adapter` is not used (and deprecated)
2024-11-25 15:29:52 +01:00
74db22f905 Fix convert_tokens_to_string when decoder is None (#34569)
* Fix convert_tokens_to_string when decoder is None

* revert unrelated changs

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-11-25 14:35:24 +01:00
97514a8ba3 chore: fix some typos (#34891)
Signed-off-by: wanxiangchwng <cui.shuang@foxmail.com>
2024-11-25 13:05:59 +00:00
62ab94dea8 Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/visual_bert (#34887)
Bump tornado in /examples/research_projects/visual_bert

Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-25 12:54:55 +00:00
c50b5675d6 prepare_fa2_from_position_ids function bugfix (#33269)
contiguous() is called before view() for key and value within the prepare_fa2_from_position_ids function
2024-11-25 13:51:26 +01:00
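A minimal illustration of why the ordering matters (shapes illustrative): view() needs contiguous memory, which a transposed key/value tensor lacks.

```py
import torch

num_heads, head_dim = 4, 8
key = torch.randn(2, 3, num_heads, head_dim).transpose(1, 2)  # non-contiguous
# key.view(-1, num_heads, head_dim) would raise a RuntimeError here;
# calling .contiguous() first, as the fix does, makes the reshape legal
key = key.contiguous().view(-1, num_heads, head_dim)
```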
a0f4f3174f allow unused input parameters passthrough when chunking in asr pipelines (#33889)
* allow unused parameter passthrough when chunking in asr pipelines

* format code

* format

* run fixup

* update tests

* update parameters to pipeline in test

* update parameters in tests

* change spelling in gitignore

* revert .gitignore to main

* add git ignore of devcontainer folder

* assert asr output follows expected inference output type

* run fixup

* Remove .devcontainer from .gitignore

* remove compliance check
2024-11-25 11:36:44 +01:00
4dc1a69349 Sum gathered input tokens (#34554)
* sum gathered input tokens

* ruff line-length is 119, format the code

---------

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-11-25 11:27:13 +01:00
1e492afd61 🔴 Mllama: fix base prefix (#34874)
fix base prefix
2024-11-25 11:20:20 +01:00
857d46ca0c [Deberta/Deberta-v2] Refactor code base to support compile, export, and fix LLM (#22105)
* some modification for roadmap

* revert some changes

* yups

* weird

* make it work

* settling

* fix-copies

* fixup

* renaming

* more fix-copies

* move stuff around

* remove torch script warnings

* ignore copies

* revert bad changes

* woops

* just styling

* nit

* revert

* style fixup

* nits configuration style

* fixup

* nits

* will this fix the tf pt issue?

* style

* ???????

* update

* eval?

* update error message

* updates

* style

* grumble grumble

* update

* style

* nit

* skip torch fx tests that were failing

* style

* skip the failing tests

* skip another test and make style
2024-11-25 10:43:16 +01:00
098962dac2 BLIP: fix generation after hub update (#34876)
* fix blip generation

* dont remove it yet

* Update src/transformers/models/blip_2/modeling_blip_2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* modular

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 10:41:55 +01:00
c1a8520419 Cache: init empty cache when use_cache (#34274)
* fix

* fix tests

* fix copies

* add docs

* Revert "add docs"

This reverts commit 32d35634f12ba02781d2ebdee0c8dcfbe992a7b9.

* qwen move deltas

* mllama can potentially fullgraph compile

* enable mllama compile and fix tests

* remove mllama fixes
2024-11-25 10:11:33 +01:00
1339a14dca Add safe_globals to resume training on PyTorch 2.6 (#34632)
Starting from version 2.4, PyTorch introduced a stricter check for the objects which
can be loaded with torch.load(). Starting from version 2.6, loading with weights_only=True
requires allowlisting of such objects.

This commit adds an allowlist of some numpy objects used to load model checkpoints.
Usage is restricted by a context manager. Users can still additionally call
torch.serialization.add_safe_globals() to add other objects into the safe globals list.

The Accelerate library ran into the same problem and addressed it with PR-3036.

Fixes: #34631
See: https://github.com/pytorch/pytorch/pull/137602
See: https://pytorch.org/docs/stable/notes/serialization.html#torch.serialization.add_safe_globals
See: https://github.com/huggingface/accelerate/pull/3036

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-11-25 10:03:43 +01:00
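A self-contained sketch of the add_safe_globals escape hatch mentioned above (the class is a stand-in; real checkpoints need the numpy objects they actually pickle allowlisted):

```py
import torch

class TrainerState:  # stand-in for a non-tensor object pickled into a checkpoint
    def __init__(self, step):
        self.step = step

torch.save({"state": TrainerState(3), "w": torch.zeros(2)}, "ckpt.pt")
# torch.load(..., weights_only=True) rejects non-allowlisted classes on PyTorch >= 2.6,
# so the class must be registered before loading
torch.serialization.add_safe_globals([TrainerState])
ckpt = torch.load("ckpt.pt", weights_only=True)
```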
318fe25f22 Fix: Enable prefill phase key value caching of nemotron/minitron models (#34742)
* modeling nemotron kv caching bugfix

Signed-off-by: jeongin601 <0200angela@gmail.com>

* test file deleted

Signed-off-by: jeongin601 <0200angela@gmail.com>

* code refinement

Signed-off-by: jeongin601 <0200angela@gmail.com>

* remove unused variables

Signed-off-by: jeongin601 <0200angela@gmail.com>

* import block sorted

* removed deprecation warning

Signed-off-by: jeongin601 <0200angela@gmail.com>

* removed support for tuple shape past_key_values

Signed-off-by: jeongin601 <0200angela@gmail.com>

* Update conditional statement for cache initialization

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Signed-off-by: jeongin601 <0200angela@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 09:45:35 +01:00
3a8eb74668 Fix support for image processors modifications in modular (#34866)
* add fix and examples

* fix camel case naming
2024-11-22 18:14:24 -05:00
54be2d7ae8 Bitnet test fix to avoid using gated model (#34863)
small test fix
2024-11-22 17:18:49 +01:00
286ffaaf0a [CI] Skip EETQ tests while package is broken with latest transformers (#34854)
* CI Skip EETQ tests while package is broken

EETQ tries to import the shard_checkpoint function from transformers but
the function has been removed. Therefore, trying to use EETQ currently
results in an import error. This fix results in EETQ tests being skipped
if there is an import error.

The issue has been reported to EETQ:

https://github.com/NetEase-FuXi/EETQ/issues/34

* Raise helpful error when trying to use eetq

* Forgot to raise the error in the else clause
2024-11-22 17:13:30 +01:00
861758e235 smol improvements to support more flexible usage (#34857)
* smol improvements to support more flexible usage

* ruff
2024-11-22 16:34:38 +01:00
42b36d7395 Speculative decoding: Test the target distribution (to prevent issues like #32867) (#34553)
* Update test_utils.py

* formatting

* Update test_utils.py

* formatting

* formatting

* Update test_utils.py

* formatting

* Update test_utils.py

* formatting

* format

* comments at standard positions
2024-11-22 16:02:37 +01:00
597efd21d2 Auto compile when static cache (#34247)
* generate with compile

* nits

* simple

* generate with compile

* nits

* simple

* safe

* style

* Update src/transformers/generation/utils.py

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* remove TOKENIZER forked warning

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-22 15:33:35 +01:00
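A sketch of the user-facing trigger (model id illustrative): requesting the static cache is what opts generate into the automatic compilation this PR adds.

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # illustrative model id
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("Hello", return_tensors="pt")
# the static cache has fixed shapes, which makes the forward pass compilable
out = model.generate(**inputs, max_new_tokens=10, cache_implementation="static")
```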
d9e6f307e7 Remove quantization related config from dequantized model (#34856)
* Remove quantization related config from dequantized model

* Fix whitespace
2024-11-22 10:06:29 +01:00
1867be666d Update checks for torch.distributed.tensor to require torch >= 2.5 (#34816)
* Update checks for torch.distributed.tensor

* Update PR with feedback

* Formatting fix for import order

* Remove unused function
2024-11-22 10:05:26 +01:00
6a912ff2c5 Watermarking: fix order (#34849)
fix watermarking order
2024-11-22 08:25:14 +01:00
4e90b99ed9 Refactor StarCoder2 using modular (#34015)
* Create modular_starcoder2.py

* Update modular_starcoder2.py

* update

* finalize modular

* revert # no-unravel

* Add support

* style

* Update modular_model_converter.py

* update docstring
2024-11-21 14:52:39 +01:00
18871599c9 Fix heuristic scheduling for UAG (#34805)
* fix heuristic schedule

* fix style

* fix format
2024-11-21 14:46:35 +01:00
d6a5c23f71 Fix ds nvme (#34444)
* skip nested deepspeed.zero.Init call

* make fixup

* solve conflict

* solve conflict

* put back local

* use context mangers instead of local thread

* Skip recursive calls to deepspeed.zero.Init

* Skip recursive calls to deepspeed.zero.Init

* back to old notebooks

* make style
2024-11-21 13:52:22 +01:00
ae5cbf804b Improve gguf tensor processing (#34515)
* add tensor processing system to separate logic for models

* format refactoring

* small fix

* make some methods private

* move custom methods to processors

* refactor tensor processing

* format fix
2024-11-21 13:40:49 +01:00
c57eafdaa1 Add Nemotron GGUF Loading Support (#34725)
* Add Nemotron GGUF Loading Support

* fix the Nemotron architecture assignation

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-21 11:37:34 +01:00
d4e1acbb7c Change logging level from warning to info for max_steps overriding num_train_epochs (#34810)
Update trainer.py
2024-11-21 11:37:02 +01:00
28fb02fc05 VLMs: enable generation tests - last batch (#34484)
* add tests for 3 more vlms

* fix fuyu back

* skip test
2024-11-21 11:00:22 +01:00
40821a2478 Fix CI slack reporting issue (#34833)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-20 21:36:13 +01:00
3cb8676a91 Fix CI by tweaking torchao tests (#34832) 2024-11-20 20:28:51 +01:00
bf42c3bd4b Fix hyperparameter search when optuna+deepseed (#34642)
* Fix hyperparameter search when optuna+deepseed

* Adding free_memory to the search setup

---------

Co-authored-by: Corentin-Royer <corentin.royer@ibm.com>
2024-11-20 18:02:58 +01:00
67890de3b8 Torchao weights only + prequantized compability (#34355)
* weights only compability

* better tests from code review

* pin torch version

* add weights_only check
2024-11-20 17:24:45 +01:00
f297af55df Fix: take into account meta device (#34134)
* Do not load for meta device

* Make some minor improvements

* Add test

* Update tests/utils/test_modeling_utils.py

Update test parameters

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Make the test simpler

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-20 11:32:07 +01:00
8cadf76e1c fix(DPT,Depth-Anything) torch.export (#34103)
* Fix torch.export issue in dpt based models

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Simplify the if statements

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Move activation definitions of zoe_depth to init()

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add test_export for dpt and zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add depth anything

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Remove zoedepth non-automated zoedepth changes and zoedepth test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything, zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-20 11:31:21 +01:00
9d16441e4f Fix the memory usage issue of logits in generate() (#34813) 2024-11-20 11:25:37 +01:00
9470d65324 Fix low memory beam search (#34746)
* fix

* higher max positions in tests
2024-11-20 07:46:35 +01:00
145fbd46cb LLaVA OV: fix unpadding precision (#34779)
* fix

* propagate

* type check
2024-11-20 07:46:13 +01:00
3033509327 Translate attention.md into Chinese (#34716)
* try

* tryagain

* tryagggain

* translated

* translated2

* Update docs/source/zh/attention.md

Co-authored-by: Huazhong Ji <hzji210@gmail.com>

---------

Co-authored-by: Huazhong Ji <hzji210@gmail.com>
2024-11-19 10:03:12 -08:00
befbbf2f98 Added image-text-to-text pipeline to task guide (#34783)
* Added image-text-to-text pipeline to task guide

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Merge codeblocks

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-19 09:49:10 -08:00
469eddbe2d Fix check_training_gradient_checkpointing (#34806)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-19 17:48:34 +01:00
05ebe8b9b0 Run test_medium_seamless_m4t_pt in subprocess to avoid many failures (#34812)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-19 17:32:10 +01:00
eedc113914 Add Image Processor Fast Deformable DETR (#34353)
* add deformable detr image processor fast

* add fast processor to doc

* fix copies

* nit docstring

* Add tests gpu/cpu and fix docstrings

* fix docstring

* import changes from detr

* fix imports

* rebase and fix

* fix input data format change in detr and rtdetr fast
2024-11-19 11:18:58 -05:00
b99ca4d28b Add support for OpenAI api "image_url" input in chat for image-text-to-text pipeline (#34562)
* add support for openai api image_url input

* change continue to elif

* Explicitly add support for OpenAI/TGI chat format

* rewrite content to transformers chat format and add tests

* Add support for typing of image type in chat templates

* add base64 to possible image types

* refactor nesting
2024-11-19 11:08:37 -05:00
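A sketch of the OpenAI-style message shape the pipeline now rewrites into the transformers chat format (model id and URL illustrative):

```py
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/llava-interleave-qwen-0.5b-hf")  # illustrative
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
out = pipe(text=messages, max_new_tokens=20)
```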
15dd625a0f Bump aiohttp from 3.10.2 to 3.10.11 in /examples/research_projects/decision_transformer (#34792)
Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.11.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.2...v3.10.11)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-19 16:08:07 +00:00
dc42330388 fix crash in tiiuae/falcon-11B-vlm image-to-text generation (#34728)
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-11-19 16:51:32 +01:00
427b62ed1a Fix post process function called in the instance segmentation example of mask2former (#34588)
* Fix post process function called in the instance segmentation example of mask2former

* fix description and additional notes for post_process_instance_segmentation of maskformers

* remove white space in maskformers post_process_instance_segmentation doc

* change image.size[::-1] to height and width for clarity in segmentation examples
2024-11-19 16:49:25 +01:00
fdb9230485 Add do_convert_rgb to vit (#34523)
* Add: do_convert_rgb

* Add: doc string

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Add: do_convert_rgb to fast

* Add: convert_to_rgb

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-11-19 16:48:05 +01:00
7b9e51c1a0 Feature: print tokens per second during training (#34507)
* Log tokens per second during training

* Nitpicks

* Move logic into _maybe_log_save_evaluate

* Use speed_metrics
2024-11-19 16:46:04 +01:00
5fa4f64605 🚨🚨🚨 fix(Mask2Former): torch export 🚨🚨🚨 (#34393)
* fix(Mask2Former): torch export

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* revert level_start_index and create a level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add a comment to explain the level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Address comment

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add torch.export.export test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* rename arg

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* remove spatial_shapes

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Use the version check from pytorch_utils

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] mask2former

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-19 16:44:53 +01:00
581524389a MLU devices : Checks if mlu is available via an cndev-based check which won't trigger the drivers and leave mlu (#34326)
* add Cambricon MLUs support

* fix mlu device rng state

* up for quality check

* up mlu to support fp16

* fix mlu device dependency error

* fix mlu device dependency error

* enable mlu device for bf16

* fix mlu device memory tracker

* Cambricon support SDPA and flash_attn

* MLU devices : Checks if `mlu` is available via an `cndev-based` check which won't trigger the drivers and leave mlu
2024-11-19 16:37:39 +01:00
e3a5889ef0 Modular fix (#34802)
* Modular fix

* style

* remove logger warning

* Update modular_model_converter.py
2024-11-19 16:08:57 +01:00
ce1d328e3b Fix cache_utils for optimum.quanto kvcache quantization (#34750)
* add co-author

Co-authored-by: w3rew <w3rew@users.noreply.github.com>

* fix docs

* fix cache

* remove print

---------

Co-authored-by: w3rew <w3rew@users.noreply.github.com>
2024-11-19 14:16:34 +01:00
4bff54f921 Gemma capping (#34282)
* softcapping

* soft cap before the mask

* style

* ...

* super nit

* update

* fixes

* update

* small issue with modular

* fix modular imports

* update

* fixup

* simplify a hell lot

* simplify cleaning imports

* finish fixing

* update our design

* nits

* use a deprecation cycle

* updates

* Fix modular (recursive deps need to always be computed after merges!)

* push

* fix

* update

* fix modular order

* make fix-copies

* updates

* update

* ?

* don't compile for now

* ?

* fix some stuff

* donc!

* fix copies

* update

* fixup

* ?

* fix two tests

* fix?

* for now, don't use head info

* eager when outputting attention and sdpa or flash as it's the simplest behaviour (for our tests as well :))

* fix-copies

* revert sdpa check

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* rebase, fix-copies and push

* add a slow integration test

* update the test

* fix left padding issue

* fix test

* remove duplicate scaling

* quality

* add a small test and make sure it works

* 2b

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-19 13:52:38 +01:00
54739a320e Self-speculation (Layer-Skip Llama) (#34240)
* 😅

* early exit (#34244)

* mvp

* docs and tests

* a few fixes

* no shared cache

* Apply suggestions from code review

Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>

* docs

* make fix-copies

* cohere fix

* [test all]

* [test all] consistent model code copies

* [test all] make fix-copies :D

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>

* Update src/transformers/generation/candidate_generator.py

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* [test all] don't use a stand-alone attribute; fix test

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-11-19 12:20:07 +00:00
5de58d5955 fix cpu bnb path (#34647)
* fix cpu bnb path

* Update src/transformers/generation/utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix awq quantizer env check

* fix awq quantizer device check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-19 12:44:44 +01:00
3cd78be34e Fix: siglip image processor rgb_convert is not being applied correctly. (#34301)
Fix: do_convert_rgb
2024-11-19 12:40:36 +01:00
0db91c3c8d Support gradient checkpointing in Qwen2VL ViT (#34724)
* Support gradient checkpointing in Qwen2VL ViT

* Enable gradient checkpoint tests for Qwen2VL

* [run-slow] qwen2_vl
2024-11-19 12:30:44 +01:00
1a0cd69435 feat: allow to use hf-hub models for timm backbone (#34729)
Currently a backbone name like 'hf-hub:bioptimus/H-optimus-0' throws an
error, even though it could work.

Co-authored-by: Christian Gebbe <>
2024-11-19 10:26:35 +00:00
d8a5d31d9c Trainer hyperparameter search kwargs docs update (#34459)
* doc: Trainer.hyperparameter_search docstring discrepancy solved

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-19 11:23:03 +01:00
dadb286f06 protect tensor parallel usage (#34800)
protect
2024-11-19 09:54:11 +01:00
eed11f34ab Fix Whisper CI (#34617)
* Revert "Revert "Fix Whisper CI" (#34605)"

This reverts commit 74d3824cc0725829e7d92e1d43b97be1f18454f8.

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-18 21:37:50 +01:00
759a378ee5 Allow handling files as args for a tool created with Tool.from_space (#34687)
* Allow handling files as args for a tool created with `Tool.from_space`
2024-11-18 20:15:35 +01:00
20142ab542 Simplify Tensor Parallel implementation with PyTorch TP (#34184)
* Simplify Tensor Parallel implementation with PyTorch TP

* Move tp_plan to config

* Lint

* Format and warning

* Disable copy-from check

* Conditionally get attr from config

* make fix-copies

* Move base_model_tp_plan to PretrainedConfig

* Move TP into from_pretrained

* Add device context for load

* Do not serialize

* Move _tp_plan setting to post_init

* Add has_tp_plan

* Add test_tp

* Add 'Multi-gpu inference' doc

* Add backward support for device type identification

* Auto-detect accelerator

* supports_tp_plan

* copyright year

* Fix copy
2024-11-18 19:51:49 +01:00
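A sketch of the user-facing API from the bullet list above (model id illustrative); launch with torchrun so each GPU gets one process:

```py
# run with: torchrun --nproc-per-node 4 tp_demo.py
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",  # illustrative model id
    tp_plan="auto",          # applies the config's base_model_tp_plan across ranks
    torch_dtype=torch.bfloat16,
)
```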
7df93d6ffb fix: Wrong task mentioned in docs (#34757) 2024-11-18 18:42:28 +00:00
7693b62268 Fix callback key name (#34762)
Fixes typo.
2024-11-18 18:41:12 +00:00
1ef6c5f1c5 fix: Update pixel_values parameter in hf_model input (#34782) 2024-11-18 18:40:01 +00:00
e80a65ba4f [tests] add XPU part to testing (#34778)
add XPU part to testing

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-11-18 09:59:11 -08:00
9568a9dfc5 [docs] add XPU besides CUDA, MPS etc. (#34777)
add XPU
2024-11-18 09:58:50 -08:00
8568bf1bcf [docs] make empty_cache device-agnostic (#34774)
make device-agnostic
2024-11-18 09:58:26 -08:00
36759f3312 make sure to disable gradients for integer tensor (#32943) 2024-11-18 16:49:37 +01:00
1c471fc307 Fix skip of test_training_gradient_checkpointing (#34723)
19d58d31f introduced a context manager to manage subtests of
test_training_gradient_checkpointing. However, the test body was not
moved under the "with" statement. Thus, while tests were correctly
marked as skipped, the test bodies were still executed. In some cases,
as with llama, this caused attribute errors.

Fixes: #34722
Fixes: 19d58d31f ("Add MLLama (#33703)")

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-11-18 15:45:40 +01:00
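A minimal, runnable illustration of the failure mode: unittest's subTest context records the SkipTest but swallows it, so any code left outside the with block still executes.

```py
import unittest

class ExampleTest(unittest.TestCase):
    def test_buggy_skip(self):
        with self.subTest("gradient checkpointing"):
            self.skipTest("not supported")
        print("body still executed")  # runs even though the subtest was marked skipped

    def test_fixed_skip(self):
        with self.subTest("gradient checkpointing"):
            self.skipTest("not supported")
            print("never reached")  # the fix: keep the body under the "with"

if __name__ == "__main__":
    unittest.main()
```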
c772d4d91e fix a typo bug where 'id2label' was incorrectly written as 'i2label' when reading config (#34637)
fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from pretrained config
2024-11-18 14:41:48 +01:00
eb0ab3ed4b Fix broken link (#34618) 2024-11-18 14:13:26 +01:00
1646ffb4d1 VLMs: patch_size -> num_image_tokens in processing (#33424)
* use num additional tokens

* fix copies + docs

* another fix copies :)

* add docs

* move order for BC
2024-11-18 13:21:07 +01:00
3ee24e2208 Add OLMo November 2024 (#34551)
* Add model skeletion with transformers-cli add-new-model-like

* Convert config to modular, add rms_norm_eps, delete clip_qkv

* Convert model to modular, add RMSNorm

* Add flash attention with qk norm and no qkv clipping

* Add decoder layer with RMSNorm after attention/feedforward layers

* Add base and causal model

* Add converter improvements from OLMo repo

* Update weight loading in OLMo to HF converter

* Set correct default for rms_norm_eps

* Set correct pipeline_model_mapping in test

* Run make fixup

* Fix model type

* Re-run modular conversion

* Manually set config docs to fix build errors

* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors

* Start updating tests

* Update tests

* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124

* Rename input_layernorm and post_attention_layernorm to reflect their ops better

* Use correct tokenizer

* Remove test unsupported by GPT2 tokenizer

* Create GenerationConfig outside of from_pretrained call

* Use simpler init file structure

* Add explicit __all__ to support simplified init

* Make safetensor serialization the default

* Update OLMo November 2024 docs
2024-11-18 10:43:10 +01:00
13493215ab 🧼 remove v4.44 deprecations (#34245)
* remove v4.44 deprecations

* PR comments

* deprecations scheduled for v4.50

* hub version update

* make fixup

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-15 23:07:24 +01:00
8d50fda644 Remove FSDP wrapping from sub-models. (#34452)
* Remove FSDP wrapping from sub-models.

* solve conflict trainer.py

* make fixup

* add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size

* put back extract_model_from_parallel

* use transformers unwrap_model
2024-11-15 23:00:03 +01:00
b0c0ba7b4d FSDP grad accum fix (#34645)
* add gradient accumulation steps tests for fsdp

* invert no_sync context to fix training for fsdp
2024-11-15 22:28:06 +01:00
52ea4aa589 add xpu path for awq (#34712)
* add xpu path for awq

* update readme
2024-11-15 15:45:24 +01:00
7b3d615bc2 fix(wandb): pass fake dataset to avoid exception in trainer (see #34455) (#34720) 2024-11-15 15:44:02 +01:00
f5dbfab7f3 Update llava.md (#34749)
LLava -> Llava
2024-11-15 15:39:57 +01:00
8ba3e1505e Retain newlines in chat template when continue_final_message=True (#34253)
* Retain newlines in chat template when

* Add try/except

* Add regression test

* Simplify test

* Apply suggestions from code review

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-11-15 14:27:04 +00:00
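A sketch of the option in question (tokenizer id illustrative): with continue_final_message=True, the rendered prompt must end with the assistant text verbatim, trailing newline included, so generation resumes mid-message.

```py
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")  # illustrative
messages = [
    {"role": "user", "content": "List three primes."},
    {"role": "assistant", "content": "Sure:\n1. 2\n"},
]
# the prompt now ends exactly with "Sure:\n1. 2\n" instead of closing the turn
prompt = tok.apply_chat_template(messages, continue_final_message=True, tokenize=False)
print(prompt)
```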
a3d69a8994 [docs] add xpu device check (#34684)
* add XPU path

* use accelerate API

* Update docs/source/en/tasks/semantic_segmentation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update more places with accelerate API

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-13 14:16:59 -08:00
68f8186a89 Fix example in EsmConfig docstring (#34653) 2024-11-13 13:55:58 -08:00
e7c36a9d57 [docs] Broken link in generation_strategies (#34717)
[docs] Broken link
2024-11-13 13:44:42 -08:00
be8748a53c 🌐 [i18n-KO] Translated marian.md to Korean (#34698)
* initial translation

* removed english

* Fixed Trivial Typos, updated _toctree.yml
2024-11-13 13:14:23 -08:00
33eef99250 Agents: Small fixes in streaming to gradio + add tests (#34549)
* Better support transformers.agents in gradio: small fixes and additional tests
2024-11-11 20:52:09 +01:00
6de2a4d1f1 [i18n-ar] Translated file : docs/source/ar/torchscript.md into Arabic (#33079)
* Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Merge troubleshooting.md with this Branch

* Update _toctree.yml

* Update torchscript.md

* Update troubleshooting.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-11-11 10:41:01 -08:00
25f510a9c6 [docs] update not-working model revision (#34682)
update revision
2024-11-11 07:09:31 -08:00
3ea3ab62d8 Agents: turn any Space into a Tool with Tool.from_space() (#34561)
* Agents: you can now load a Space as a tool
2024-11-10 12:22:40 +01:00
134ba90da9 Update llm_engine.py (#33332)
* Update llm_engine.py
- Added support for optional token and max_tokens parameters in the constructor.
- Provided usage examples and detailed documentation for each method.
2024-11-10 12:19:20 +01:00
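For illustration, a minimal sketch of the constructor parameters described above, assuming the `HfApiEngine` class from the agents module; the model id and values are placeholders, not taken from the diff:

```python
# Hedged sketch: optional `token` and `max_tokens` in the engine constructor.
from transformers.agents import HfApiEngine

engine = HfApiEngine(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model id
    token="hf_xxx",   # optional Hub token (placeholder)
    max_tokens=1500,  # optional cap on generated tokens
)
```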
768f3c016e [i18n-ar] Translated file : docs/source/ar/trainer.md into Arabic (#33080)
* Add docs/source/ar/trainer.md to Add_docs_source_ar_trainer.md

* Update docs/source/ar/trainer.md (the same review suggestion, applied 39 times)

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update trainer.md

* Update trainer.md

* Update trainer.md

* Create _toctree.yml

* Delete docs/source/ar/_toctree.yml

* Update _toctree.yml - add trainer

* Update _toctree.yml

* merge serialization.md into this branch

* merge sagemaker.md into this PR

* Update _toctree.yml

* Update docs/source/ar/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-09 11:26:28 -08:00
a06a0d1263 🌐 [i18n-KO] Translated bert.md to Korean (#34627)
* Translated bert.md; needs an additional check

* Translation 2nd ver, changed _toctree.yml

* Fixed Typo

* Update bert.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update bert.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update bert.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update bert.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-07 18:56:09 -08:00
1cf17077bf 🌐 [i18n-KO] Translated timesformer.md to Korean (#33972)
* docs: ko: model_doc/timesformer.md

* feat: nmt draft

* fix: manual edits

* fix_toctree

* fix toctree on Video Models
2024-11-07 11:04:27 -08:00
6938524a28 fix(dvclive): pass fake dataset to avoid exception in trainer init (#34455)
fix(dvclive): pass fake dataset to avoid exception in trainer
2024-11-07 15:57:34 +01:00
7bbc624743 🌐 [i18n-KO] Translated convbert.md to Korean (#34599)
* docs: ko: convbert.md

* Update _toctree.yml

* feat: nmt draft
2024-11-05 09:32:17 -08:00
e83aaaa86b Fix use_parallel_residual and qkv_bias for StableLM GGUF config extraction (#34450)
* fix stablelm qkv_bias

* fix stablelm qkv_bias and use_parallel_residual

* remove original_model.config for stablelm gguf test
2024-11-05 18:26:20 +01:00
9f28d0c5d0 Fix torchvision interpolation CI (#34539)
fix-torch-interpolation-ci
2024-11-05 11:02:14 -05:00
d2bae7ee9d Changing __repr__ in torchao to show quantized Linear (#34202)
* Changing __repr__ in torchao

* small update

* make style

* small update

* add LinearActivationQuantizedTensor

* remove some cases

* update imports & handle return None

* update
2024-11-05 16:11:02 +01:00
f2d5dfbab2 Remove @slow for test_eager_matches_sdpa_inference (#34558)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-05 16:10:42 +01:00
082e57e0d4 Fix #34494 assistant tokens when truncated (#34531)
* Fix assistant tokens when truncated

* fix test

* fix test

* step
2024-11-05 15:10:15 +00:00
74d3824cc0 Revert "Fix Whisper CI" (#34605)
Revert "Fix Whisper CI (#34541)"

This reverts commit eb811449a2389e48930c45f84c88fd041735cf92.
2024-11-05 15:12:47 +01:00
45b0c7680c Remove unused test_dataset (#34516) 2024-11-05 14:01:25 +00:00
663c851239 DistilBERT is ExecuTorch compatible (#34475)
* DistilBERT is ExecuTorch compatible

* [run_slow] distilbert

* [run_slow] distilbert

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-11-05 13:41:48 +01:00
893ad04fad Load sub-configs from composite configs (#34410)
* save/load sub-configs

* nit forgot these

* fix copies

* move test to common

* use dict for sub-configs

* add load-save-laod test

* clean up modeling check

* oops, these are the correct keys

* fix some tests, missed some composite configs

* this model was missed
2024-11-05 11:34:01 +01:00
5e1fd4e204 FIX: Broken repr of TorchAoConfig (#34560)
FIX Broken repr of TorchAoConfig

The __repr__ method references a non-existent self.kwargs. This is now
fixed.

There does not appear to be a uniform way of defining __repr__ for
quantization configs. I copied the method as implemented for HQQ:

e2ac16b28a/src/transformers/utils/quantization_config.py (L285-L287)
2024-11-05 10:26:13 +01:00
d0b1d8d888 Skip DeepSpeed ZeRO Stage 3 model initialization when bnb (#34395)
* Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized.

* Propagate the quantization state using a context manager

* make fixup
2024-11-05 10:06:07 +01:00
eb811449a2 Fix Whisper CI (#34541)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-04 21:35:37 +01:00
bfa021be05 fix TrainerState doc because num_input_tokens_seen is unused by defau… (#34593)
fix TrainerState doc because num_input_tokens_seen is unused by default config

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-11-04 09:42:20 -08:00
0a6795af12 🌐 [i18n-KO] Update README_ko.md (#33098)
* Update README_ko.md

Delete the blank paragraph in the language selection button and edit to synchronize with the English version of README.md

* [i18n-KO] Update README_ko.md

* Additional edit to keep consistency with the main [documentation](https://huggingface.co/docs/transformers/v4.44.2/ko/index). (Edits to keep consistency with the main documentation.)

* Update README_ko.md

Additional update.
* Change docs link to the Korean-translated page if it exists.

* Change doc link to the Korean-translated page if it exists.

Change the doc link and delete the 'migration' row of the Learn more [더 알아보기] table, since it does not exist in the main version of the docs.

* modify a link of the main README.md

from
`https://huggingface.co/docs/transformers/index#supported-frameworks`

to
`https://huggingface.co/docs/transformers/index#supported-models-and-frameworks`

since the title of 'supported table' changed.

* [i18n-ko] edit links and sync with main `README.md`

* docs/change comment to Korean1

Change English comment to Korean

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* docs/change comment to Korean2

Change English comment to Korean

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* revise to original

to separate `edit_README_ko_md` and `README.md`

* Synchronization with English documentation.

Synchronization with English documentation, and translated a line of comment from English to Korean.

---------

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
2024-11-04 09:42:07 -08:00
1112c54604 🌐 [i18n-KO] Translated perf_train_special.md to Korean (#34590)
* Translated to Ko, 1st version

* updated _toctree.yml
2024-11-04 09:41:44 -08:00
a86bd6f2d8 [i18n-HI] Translated TFLite page to Hindi (#34572)
* [i18n-HI] Translated TFLite page to Hindi

* [i18n-HI] Translated TFLite page to Hindi

* Update docs/source/hi/tflite.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

---------

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
2024-11-04 09:40:30 -08:00
48831b7d11 Add text support to the Trainer's TensorBoard integration (#34418)
* feat: add text support to TensorBoardCallback

* feat: ignore long strings in trainer progress

* docs: add docstring for max_str_len

* style: remove trailing whitespace

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-04 17:36:27 +01:00
34927b0f73 MPS: isin_mps_friendly can support 0D tensors (#34538)
* apply fix

* tested

* make fixup
2024-11-04 16:18:50 +00:00
187439c3fa VLM: special multimodal Tokenizer (#34461)
* kinda works

* update

* add tests

* update

* use special tokens in processors

* typo

* fix copies

* fix

* fix moshi after rebase

* update

* fix tests

* update

* Update docs/source/en/main_classes/tokenizer.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update docs

* test for load time adding tokens

* fix some more tests which are now fetched better

* one more fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-04 16:37:51 +01:00
ef976a7e18 Update trainer for easier handling of accumulate, compile fixes, and proper reporting (#34511)
* Update trainer for easier handling of accumulate + proper reporting

* test

* Fixup tests

* Full fix

* Fix style

* rm comment

* Fix tests

* Minimize test + remove py 311 check

* Unused import

* Forward contrib credits from discussions

* Fix reported metrics

* Refactor, good as it's going to get

* rm pad tok id check

* object detection and audio are being annoying

* Fin

* Fin x2

---------

Co-authored-by: Gyanateet Dutta <Ryukijano@users.noreply.github.com>
2024-11-04 07:47:34 -05:00
33868a057c [i18n-HI] Translated accelerate page to Hindi (#34443)
* [i18n-HI] Translated accelerate page to Hindi

* Update docs/source/hi/accelerate.md (the same review suggestion, applied 4 times)

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

---------

Co-authored-by: Kay <kay@Kays-MacBook-Pro.local>
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
2024-11-01 08:26:45 -07:00
e2ac16b28a Large modular logic refactoring (#34487)
* rework converter

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* cleaning

* cleaning

* finalize imports

* imports

* Update modular_model_converter.py

* Better renaming to avoid visiting same file multiple times

* start converting files

* style

* address most comments

* style

* remove unused stuff in get_needed_imports

* style

* move class dependency functions outside class

* Move main functions outside class

* style

* Update modular_model_converter.py

* rename func

* add augmented dependencies

* Update modular_model_converter.py

* Add types_to_file_type + tweak annotation handling

* Allow assignment dependency mapping + fix regex

* style + update modular examples

* fix modular_roberta example (wrong redefinition of __init__)

* slightly correct order in which dependencies will appear

* style

* review comments

* Performance + better handling of dependencies when they are imported

* style

* Add advanced new classes capabilities

* style

* add forgotten check

* Update modeling_llava_next_video.py

* Add priority list ordering in check_conversion as well

* Update check_modular_conversion.py

* Update configuration_gemma.py
2024-11-01 10:13:51 +01:00
86701f2b6f 🔴 🔴 fix query_pre_attn_scalar different from num_heads in default gemma2 config (#34540)
* fix query_pre_attn_scalar different from num_heads in default config

* propagate modular changes

* fix copies

* fix modular copies

* fix copies?

* correct copies fix
2024-11-01 09:06:17 +01:00
4cc0813e28 BLIP: enable generation tests (#34174)
* blip2 tests

* instructblips

* copies

* fix slow tests

* fix

* uncomment this

* clean up after rebase

* should be model main input

* fix overwritten tests

* oops len should be multiple of frame number

* style

* fix some tests
2024-11-01 08:54:48 +01:00
6beb3f1691 Blip: get/set input embeddings correctly (#34152)
* set-get embeds

* add tests

* fix tests

* remove

* return dict True

* fix tests

* why did i remove this

* enable torchscript tests
2024-11-01 08:39:39 +01:00
b53e44e847 [i18n-ar] Translated file : docs/source/ar/multilingual.md into Arabic (#33048)
* Add docs/source/ar/multilingual.md to Add_docs_source_ar_multilingual.md

* Update docs/source/ar/multilingual.md (the same review suggestion, applied 16 times)

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

* Add Translated files to branch for merge

* Update _toctree.yml

* Update _toctree.yml

* Update custom_models.md

* Update chat_templating.md

* Update docs/source/ar/create_a_model.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update create_a_model.md

* Update gguf.md

* Update gguf.md

* Update gguf.md

* Update gguf.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-31 16:10:09 -07:00
2801d7bcf6 update doc (#34478)
* update doc

* Update docs/source/en/perf_train_cpu.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* delete closing tip

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-31 15:59:23 -07:00
df8640cedb [CLIPSeg] Make interpolate_pos_encoding default to True (#34419)
* Remove interpolate_pos_encoding

* Make fixup

* Make interpolate_pos_encoding default to True

* Reuse existing interpolation

* Add integration test
2024-10-31 22:15:04 +01:00
203e27059b Add image text to text pipeline (#34170)
* Standardize image-text-to-text-models-output

add post_process_image_text_to_text to chameleon and cleanup

Fix legacy kwarg behavior and deprecation warning

add post_process_image_text_to_text to qwen2_vl and llava_onevision

Add post_process_image_text_to_text to idefics3, mllama, pixtral processor

* nit var name post_process_image_text_to_text udop

* nit fix deprecation warnings

* Add image-text-to-text pipeline

* add support for image url in chat template for pipeline

* Reformat to be fully compatible with chat templates

* Add tests chat template

* Fix imports and tests

* Add pipeline tag

* change logic handling of single prompt and multiple images

* add pipeline mapping to models

* fix batched inference

* fix tests

* Add manual batching for preprocessing

* Fix outputs with nested images

* Add support for all common processing kwargs

* Add default padding when multiple text inputs (batch size>1)

* nit change version deprecation warning

* Add support for text only inference

* add chat_template warnings

* Add pipeline tests and add copied from post process function

* Fix batched pipeline tests

* nit

* Fix pipeline tests blip2

* remove unnecessary max_new_tokens

* revert processing kosmos2 and remove unnecessary max_new_tokens

* fix pipeline tests idefics

* Force try loading processor if pipeline supports it

* revert load_processor change

* hardcode loading only processor

* remove unnecessary try except

* skip imagetexttotext tests for kosmos2 as tiny model causes problems

* Make code clearer

* Address review comments

* remove preprocessing logic from pipeline

* fix fuyu

* add BC resize fuyu

* Move post_process_image_text_to_text to ProcessorMixin

* add guard in post_process

* fix zero shot object detection pipeline

* add support for generator input in pipeline

* nit

* change default image-text-to-text model to llava onevision

* fix owlv2 size dict

* Change legacy deprecation warning to only show when True
2024-10-31 15:48:11 -04:00
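A minimal sketch of the new pipeline, assuming the chat-style message format and the llava-onevision default mentioned above; the checkpoint id is an assumption:

```python
# Hedged sketch of the image-text-to-text pipeline with a chat-style prompt.
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/llava-onevision-qwen2-0.5b-ov-hf")
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
        {"type": "text", "text": "Describe this image."},
    ],
}]
print(pipe(text=messages, max_new_tokens=30))
```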
c443d8d536 Bug Fix for issue #34294 (#34295)
Update SiglipVisionEmbeddings.forward to cast input to correct dtype before embedding it.
2024-10-31 18:51:15 +01:00
114dd812dd make test_eager_matches_sdpa_inference less flaky (#34512)
* try

* try

* try

* try

* try

* try

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 18:34:00 +01:00
294c170ff9 feat: add benchmarks pg indexes (#34536)
* feat: add benchmarks pg indexes

* refactor: remove debug `df -h`
2024-10-31 17:41:06 +01:00
b5919e12f7 fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests (#34518)
* fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-10-31 16:47:58 +01:00
4ca004eac6 Qwen2VL: skip base input_ids-inputs_embeds equivalence check (#34535)
it has complex inputs_embeds computation
2024-10-31 15:42:13 +00:00
ab98f0b0a1 avoid calling gc.collect and cuda.empty_cache (#34514)
* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 16:36:13 +01:00
dca93ca076 Fix step shifting when accumulate gradient (#33673)
* replace total_batched_samples with step while counting grad accum step

* remove unused variable

* simplify condition for update step

* fix format by ruff

* simplify update step condition using accelerator.sync_gradients

* simplify update condition using do_sync_step

* remove print for test

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-31 09:53:23 -04:00
1b86772de5 Fix: img size mismatch caused by incorrect unpadding in LLaVA-Next (#34522)
Fix: unpadding img mismatch
2024-10-31 14:32:45 +01:00
f38531619d enable QA bf16 pipeline (#34483)
* enable QA bf16 pipeline

* add tests
2024-10-31 12:55:53 +00:00
405b562698 UPDATE Documentation for #TRANSLATING.md Documentation into Multiple Languages.(Changes made) (#34226)
* Update TRANSLATING.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update TRANSLATING.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-30 12:37:39 -07:00
48872fd6ae Add Image Processor Fast RT-DETR (#34354)
* add fast image processor rtdetr

* add gpu/cpu test and fix docstring

* remove prints

* add to doc

* nit docstring

* avoid iterating over images/annotations several times

* change torch typing

* Add image processor fast documentation
2024-10-30 13:49:47 -04:00
9f06fb0505 Fix super tiny extra space typo (#34440)
Update training_args.py
2024-10-30 16:55:16 +01:00
5251fe6271 Add GGUF for Mamba (#34200)
* add mamba architecture for gguf

* add logic for weights conversion, some fixes and refactoring

* add lm_head layers, unit test refactoring

* more fixes for tests

* remove lm_head creation

* remove unused comments
2024-10-30 16:52:17 +01:00
eab6c491d4 Use torch 2.5 in scheduled CI (#34465)
* torch 2.5

* try

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-30 14:54:10 +01:00
241d79026f fix pixtral processor (#34486)
* fix pixtral processor

* test out full length batches + remove undue ValueError

* fix up processing

* fix tests

* fix

* last fixup

* style

* [run-slow] pixtral

* [run-slow] pixtral

* fix config key

* skip torchscript tests

* [run-slow] pixtral

* add missing key

* [run-slow] pixtral

* fix docs

* [run-slow] pixtral

* fix wrong url for integration test

* [run-slow] pixtral

* pixtralVisionModel does not have a lm head

* [run-slow] pixtral
2024-10-30 14:17:20 +01:00
8a734ea2c3 Tests: move generate tests to the right mixin and delete redundant tests (#34464)
* tmp commit

* tmp commit

* cull overwrites of deleted tests

* typo

* more specific docstring

* make fixup

* parameterize at the top?

* correction

* more deletions :D

* tmp commit

* for VLMs too

* fix _check_outputs

* test nit

* make fixup

* fix another flaky

* test_generate_from_inputs_embeds -- handle missing attention mask
2024-10-30 10:59:08 +00:00
913330ca9f VLMs: fix number of image tokens (#34332)
* fix

* fix tests

* add tests

* style

* style

* fix qwen after rebase

* fix video llava
2024-10-30 10:21:37 +01:00
0f764a5af7 Mllama: update docs (#34334)
* update docs

* be more explicit

* use available methods
2024-10-30 10:11:50 +01:00
25a9fc584a Fix format mistake in string repr of tokenizer objects (#34493)
* fix repr string format for tokenizer objects

The repr of tokenizer tokens looks confusing and just stupid, like this: `Tokenizer(...), added_tokens_decoder={1: ..., 2: ...}`. The dict that is the value of the added_tokens_decoder attribute is outside of the parentheses of the tokenizer object, whereas all other attributes are inside the parentheses like they should be.

This commit fixes this bug.

* cos: add newline before closing parenthesis of repr string
2024-10-30 10:03:41 +01:00
cd277618d4 Roberta is ExecuTorch compatible (#34425)
* Roberta is ExecuTorch compatible

* [run_slow] roberta

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-30 08:36:45 +00:00
9bee9ff5db Un-deprecate timeout arg in pipelines (#34382)
* Un-deprecate timeout

* Put "timeout" on the allowed list

* make fixup
2024-10-29 18:45:14 +00:00
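A hedged example of the un-deprecated kwarg, assuming `timeout` bounds how long the pipeline waits when fetching a remote input:

```python
# Sketch: pass `timeout` (seconds) when the pipeline downloads a remote image.
from transformers import pipeline

classifier = pipeline("image-classification")
preds = classifier(
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    timeout=5.0,
)
print(preds)
```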
e4449bb790 fix incorrect warning (#34416) 2024-10-29 14:08:42 -04:00
f55595b177 Fix performance in get_imports regexp (#34298)
* fix: Fix performance in get_imports regexp

* Minimize get_imports content regexp
2024-10-29 17:29:24 +00:00
4e2e8809ff Bump werkzeug from 3.0.3 to 3.0.6 in /examples/research_projects/decision_transformer (#34420)
Bump werkzeug in /examples/research_projects/decision_transformer

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to 3.0.6.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/3.0.3...3.0.6)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-29 16:42:40 +00:00
e9ad460494 Adding optimizer_cls_and_kwargs to Trainer.__init__ (#34358)
* Adding `optimizer_cls_and_kwargs` to `Trainer.__init__`

* formatting

* make fix-copies docstring

* added more docs for optimizer_cls_and_kwargs

* add docs for Trainer(optimizer_cls_and_kwargs)

* reverting anchor names
2024-10-29 16:23:16 +01:00
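A minimal sketch of the new argument; `model` and `train_dataset` are assumed to be defined elsewhere:

```python
# Hedged sketch: inject a custom optimizer class plus kwargs without
# subclassing Trainer or overriding create_optimizer().
import torch
from transformers import Trainer, TrainingArguments

trainer = Trainer(
    model=model,                  # assumed: a loaded model
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,  # assumed: a prepared dataset
    optimizer_cls_and_kwargs=(torch.optim.AdamW, {"lr": 5e-5, "weight_decay": 0.01}),
)
```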
f339042b0b Albert is ExecuTorch compatible (#34476)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:22:13 +01:00
34620e8f0a MobileBERT is ExecuTorch compatible (#34473)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:14:31 +01:00
56c45d5757 Bug fix for drop path decay rate in swin transformer (#34291)
* potential bug fix for drop path

* variable name change

* forgot to rename the variables

* back to original

* modify dpr properly

* check_copies auto fix

* corresponding swin2 changes

* auto fix

* linting

* default value for drop_path_rate as 0.0

* Update src/transformers/models/glm/modeling_glm.py

* maskformer fix

* ruff format

* changes made to tf code as well

* lint

---------

Co-authored-by: abhijit deo <167164474+deo-abhijit@users.noreply.github.com>
2024-10-29 16:09:18 +01:00
0ab0a42651 fix-qwen2vl-no-position_ids (#33487) 2024-10-29 15:27:34 +01:00
8755dd26b7 manual head_dim for mixtral model (#34281) 2024-10-29 14:31:36 +01:00
5392f12e16 Bert is ExecuTorch compatible (#34424)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 14:30:02 +01:00
004530aa05 Fix regression loading dtype (#34409)
* fix regression

* add test for torchao

* expected output

* better fix
2024-10-29 11:41:04 +01:00
9e3d704e23 Fixes for Modular Converter on Windows (#34266)
* Separator in regex

* Standardize separator for relative path in auto generated message

* open() encoding

* Replace `\` on `os.path.abspath`

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-29 11:40:41 +01:00
626c610a4d Fix perplexity computation in perplexity.md (#34387)
fix average NLL in perplexity.md
2024-10-29 11:10:10 +01:00
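The fix concerns how per-chunk NLLs are averaged; a token-weighted mean, sketched with made-up numbers, looks like this:

```python
# Hedged sketch: perplexity as exp of the token-weighted average NLL.
import torch

chunk_nlls = [torch.tensor(2.1), torch.tensor(1.8)]  # mean NLL per chunk (illustrative)
chunk_tokens = [512, 384]                            # valid target tokens per chunk

avg_nll = sum(nll * n for nll, n in zip(chunk_nlls, chunk_tokens)) / sum(chunk_tokens)
print(torch.exp(avg_nll))  # perplexity
```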
439334c8fb Simplify running tests in a subprocess (#34213)
* check

* check

* check

* check

* add docstring

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-29 10:48:57 +01:00
a1835195d1 🚨🚨🚨 [SuperPoint] Fix keypoint coordinate output and add post processing (#33200)
* feat: Added int conversion and unwrapping

* test: added tests for post_process_keypoint_detection of SuperPointImageProcessor

* docs: changed docs to include post_process_keypoint_detection method and switched from opencv to matplotlib

* test: changed test to not depend on SuperPointModel forward

* test: added missing require_torch decorator

* docs: changed pyplot parameters for the keypoints to be more visible in the example

* tests: changed import torch location to make test_flax and test_tf

* Revert "tests: changed import torch location to make test_flax and test_tf"

This reverts commit 39b32a2f69500bc7af01715fc7beae2260549afe.

* tests: fixed import

* chore: applied suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* tests: fixed import

* tests: fixed import (bis)

* tests: fixed import (ter)

* feat: added choice of type for target_size and changed tests accordingly

* docs: updated code snippet to reflect the addition of target size type choice in post process method

* tests: fixed imports (...)

* tests: fixed imports (...)

* style: formatting file

* docs: fixed typo from image[0] to image.size[0]

* docs: added output image and fixed some tests

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix: included SuperPointKeypointDescriptionOutput in TYPE_CHECKING if statement and changed tests results to reflect changes to SuperPoint from absolute keypoints coordinates to relative

* docs: changed SuperPoint's docs to print output instead of just accessing

* style: applied make style

* docs: added missing output type and precision in docstring of post_process_keypoint_detection

* perf: deleted loop to perform keypoint conversion in one statement

* fix: moved keypoint conversion at the end of model forward

* docs: changed SuperPointInterestPointDecoder to SuperPointKeypointDecoder class name and added relative (x, y) coordinates information to its method

* fix: changed type hint

* refactor: removed unnecessary brackets

* revert: SuperPointKeypointDecoder to SuperPointInterestPointDecoder

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-29 09:36:03 +00:00
655bec2da7 use a tiny model to test generation config, which avoids timeout (#34482)
* use a tiny model to test generation config, which avoids timeout

* remove trailing whitespace
2024-10-29 09:39:06 +01:00
63ca6d9771 Fix CI (#34458)
* fix

* fix mistral
2024-10-29 08:26:04 +01:00
808d6c50f8 Generation: fix test (#34369)
* fix test

* fix copies
2024-10-29 07:57:10 +01:00
fe76b60370 LLaVA: latency issues (#34460)
* fix llavas

* code style

* green ci
2024-10-29 07:54:51 +01:00
a769ed45e1 Add post_process_depth_estimation for GLPN (#34413)
* add depth postprocessing for GLPN

* remove previous temp fix for glpn tests

* Style changes for GLPN's `post_process_depth_estimation`

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* additional style fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-28 19:44:20 +01:00
6cc4a67b3d feat: run benchmarks on A100 (#34287) 2024-10-28 19:33:17 +01:00
d21dbd1520 enable average tokens across devices (#34373)
* enable average tokens across devices

* reduce earlier in case model needs it

* simplify if statement

* reformat code to make ruff happy

* add doc for argument: average_tokens_across_devices

* cannot find world size when pytorch is unavailable

* format code

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-28 18:59:38 +01:00
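A sketch of the new flag, assuming it is exposed as a `TrainingArguments` field as the doc bullet above suggests:

```python
# Hedged sketch: average token counts across devices for loss normalization.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    average_tokens_across_devices=True,
)
```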
a17f287ac0 [i18n-ar] Translated file : docs/source/ar/fast_tokenizers.md into Arabic (#33034)
* Add docs/source/ar/fast_tokenizers.md to Add_docs_source_ar_fast_tokenizers.md

* Update _toctree.yml

* Update _toctree.yml

* Update docs/source/ar/_toctree.yml

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md (the same review suggestion, applied 10 times)

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-10-28 10:54:37 -07:00
084e946cfd Apply linting to the important code blocks to make it readable (#34449)
Enhance user experience using py-linting
2024-10-28 10:48:18 -07:00
1f7539c829 🌐 [i18n-KO] Translated model_doc/barthez.md to Korean (#33980)
* docs: ko: model_doc/barthez.md

* feat: nmt draft

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-28 10:46:49 -07:00
fc1ae7f30f [docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details (#34322)
* [docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details

* [docs] correct input documentation for MISTRAL model to reference `input_ids` instead of `decoder_input_ids`

* [docs] clarify cache_position description in MISTRAL model documentation
2024-10-28 09:14:07 -07:00
c1753436db New option called "best" for args.save_strategy. (#31817)
* Add _determine_best_metric and new saving logic.

1. Logic to determine the best metric was separated out from
`_save_checkpoint`.
2. In `_maybe_log_save_evaluate`, whether or not a new best metric was
achieved is determined after each evaluation, and if the save strategy
is "best' then the TrainerControl is updated accordingly.

* Added SaveStrategy.

Same as IntervalStrategy, but with a new attribute called BEST.

* IntervalStrategy -> SaveStrategy

* IntervalStrategy -> SaveStrategy for save_strat.

* Interval -> Save in docstring.

* Updated docstring for save_strategy.

* Added SaveStrategy and made according changes.

`save_strategy` previously followed `IntervalStrategy` but now follows
`SaveStrategy`.

Changes were made accordingly to the code and the docstring.

* Changes from `make fixup`.

* Removed redundant metrics argument.

* Added new test_save_best_checkpoint test.

1. Checks for both cases where `metric_for_best_model` is explicitly
provided and when it's not provided.
2. The first case should have two checkpoints saved, whereas the second
should have three saved.

* Changed should_training_end saving logic.

The Trainer saves a checkpoint at the end of training by default as
long as `save_strategy != SaveStrategy.NO`. This condition was modified
to also cover `SaveStrategy.BEST`, because it would be counterintuitive
to ask for only the best checkpoint to be saved and still have the last
one saved as well.

* `args.metric_for_best_model` default to loss.

* Undo metric_for_best_model update.

* Remove checking metric_for_best_model.

* Added test cases for loss and no metric.

* Added error for metric and changed default best_metric.

* Removed unused import.

* `new_best_metric` -> `is_new_best_metric`

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Applied `is_new_best_metric` to all.

Changes were made for consistency and also to fix a potential bug.

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-28 16:02:22 +01:00
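A sketch of the new strategy, assuming it plugs into the usual evaluation and saving arguments:

```python
# Hedged sketch: save a checkpoint only when a new best metric is achieved.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    eval_strategy="epoch",
    save_strategy="best",
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```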
8b3b9b48fc exclude fsdp from delay_optimizer_creation (#34140)
* exclude fsdp from delay_optimizer_creation

* add test case for trainer: FSDP mode and fp8 as mixed precision

* rearrange imports

* ruff formatted

* adapt _init_fsdp to fp8

* use _init_fsdp only when resume_from_checkpoint

* In case of FSDP, self.layer will be CheckpointWrapper, which has no len() method

* delete _init_fsdp

* solve conflict

* fix conflict

* make fixup
2024-10-28 13:50:16 +01:00
92bcdff2ef Fix batch size handling in prediction_loop for DataLoaderShard (#34343)
* Fix batch size handling in prediction_loop for DataLoaderShard

Updated the prediction_loop method in the Trainer class to correctly handle batch size when using DataLoaderShard. This ensures that the batch size is retrieved from total_batch_size for distributed training scenarios, preventing TypeError related to NoneType during evaluation.

* Update src/transformers/trainer.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Applied the fix to remove unused imports

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-28 13:23:52 +01:00
9360f1827d Tiny update after #34383 (#34404)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 12:01:05 +01:00
fc465bb196 pin tensorflow_probability<0.22 in docker files (#34381)
0.21

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 11:59:46 +01:00
fddbd3c13c Fix pix2struct (#34374)
* fix

* fix and test use_cache test

* style

* remove atol
2024-10-28 11:24:56 +01:00
1d06379331 [docs] Cache implementations (#34325)
cache
2024-10-25 08:52:45 -07:00
6a62a6d1b5 Fix typos in agents_advanced.md (#34405) 2024-10-25 08:52:29 -07:00
f73f5e62e2 Avoid check expected exception when it is on CUDA (#34408)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-25 17:14:07 +02:00
e447185b1f Fix bnb training test failure (#34414)
* Fix bnb training test: compatibility with OPTSdpaAttention
2024-10-25 10:23:20 -04:00
186b8dc190 Tests: upgrade test_eager_matches_sdpa_generate (#34386) 2024-10-25 11:55:07 +01:00
8814043c8c SynthID: better example (#34372)
* better example

* Update src/transformers/generation/configuration_utils.py

* Update src/transformers/generation/logits_process.py

* nits
2024-10-25 11:46:46 +01:00
223855314f no filter (#34391)
* no filter

* no filter

* no filter

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-25 12:32:39 +02:00
9f365fe0ac Fix right padding in LLaVA models (#34305)
* fix right pad llavas

* device mismatch
2024-10-25 11:02:07 +02:00
5779bac4c4 Fix onnx non-exportable inplace aten op (#34376)
* fix onnx non-exportable inplace op

* mistral, qwen2, qwen2_vl, starcoder2

* fixup copies
2024-10-25 09:44:09 +02:00
940a6bd343 Use non nested images and batched text Idefics2/3 (#34222)
* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests
2024-10-24 20:00:13 -04:00
3d99f1746e Fix glm (#34388)
* Fix duplicated

* fix import
2024-10-24 19:17:52 +02:00
a308d28d39 [auto. ping] Avoid sending empty info + add more team members (#34383)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-24 19:07:23 +02:00
4c6e0c9252 Correct the new defaults (#34377)
* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue
2024-10-24 18:42:03 +02:00
1c5918d910 Fix torch.fx issue related to the new loss_kwargs keyword argument (#34380)
* Fix FX

* Unskip tests
2024-10-24 18:34:28 +02:00
d9989e0b9a [PEFT] Add warning for missing key in LoRA adapter (#34068)
When loading a LoRA adapter, so far, there was only a warning when there
were unexpected keys in the checkpoint. Now, there is also a warning
when there are missing keys.

This change is consistent with
https://github.com/huggingface/peft/pull/2118 in PEFT and the planned PR
https://github.com/huggingface/diffusers/pull/9622 in diffusers.

Apart from this change, the error message for unexpected keys was
slightly altered for consistency (it should be more readable now). Also,
besides adding a test for the missing keys warning, a test for
unexpected keys warning was also added, as it was missing so far.
2024-10-24 17:56:40 +02:00
fe35073319 Ignore unsupported kwarg in ProcessorMixin call (#34285)
Fix accept any common kwargs
2024-10-24 11:46:39 -04:00
e288616606 refactor: remove redundant if-condition and improve type correctness for convert_tokens_to_ids (#34030)
* chore: remove redundant if-condition

* fix: import `Iterable`
2024-10-24 17:40:26 +02:00
450b9cbfac Add code sample docstrings and checkpoint reference for GLM models (#34360)
* Add code sample docstrings and checkpoint reference for GLM models

* Update modular_glm.py

* Update modeling_glm.py
2024-10-24 17:28:51 +02:00
6432ad8bb5 Fix pil_torch_interpolation_mapping import in image_processing_detr_fast (#34375)
fix pil_torch_interpolation_mapping import
2024-10-24 09:22:50 -04:00
dd267fca72 Add T5 GGUF loading support (#33389)
* add: GGUFT5Converter

* add: tensormapping for t5

* add: test code for t5

* fix: Remove whitespace from blank line

* add: t5 fp16 tests

* fix: whitespace formatting

* fix: minor formatting

* fix: testing every weight
2024-10-24 15:10:59 +02:00
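A hedged sketch of loading such a checkpoint via the existing `gguf_file` path; the repo and file names are illustrative assumptions:

```python
# Sketch: load a GGUF-quantized T5 checkpoint (dequantized on load).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo = "some-org/t5-small-gguf"  # hypothetical repo id
fname = "t5-small-Q8_0.gguf"     # hypothetical file name

tokenizer = AutoTokenizer.from_pretrained(repo, gguf_file=fname)
model = AutoModelForSeq2SeqLM.from_pretrained(repo, gguf_file=fname)
```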
30c76d5b28 add code generation to natural language processing section (#34333) 2024-10-24 14:42:47 +02:00
2112027d0c Zamba is an LM (#34342)
* Zamba is an LM

* Addition
2024-10-24 14:29:33 +02:00
b29c24ff1e CI: fix failures (#34371)
fix
2024-10-24 13:44:53 +02:00
f0b3ef9e2e translated gguf.md into Chinese (#34163)
* translated gguf.md into Chinese

* Apply suggestions from code review

I have updated the PR accordingly. Thank you very much for the detailed guidance, and I'll pay more attention to the details next time.

Co-authored-by: Isotr0py <2037008807@qq.com>

* Apply suggestions from code review

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-10-24 11:47:58 +02:00
9643069465 v4.47.0.dev0 2024-10-24 11:23:29 +02:00
f0e640adfa Drop support for Python 3.8 (#34314)
* drop python 3.8

* update docker files

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-24 11:16:55 +02:00
05863817d6 Better defaults (#34026)
* be nice to our users

* nit

* fixup

* default to -1

* oups

* turbo nit

* auto infer framework
2024-10-24 11:11:55 +02:00
65753d6065 Remove graph breaks for torch.compile() in flash_attention_forward when Llama Model is padding-free tuned (#33932)
* fix: fixes for graph breaks

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: formatting

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: import error

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: Add Fa2Kwargs

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* Revert "PR changes"

This reverts commit 39d2868e5c93cc5f3f3c7c6ff981b66614c0e0e4.

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* addition of documentation

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* change in _flash_attention_forward

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* revert make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix copies

* style

* loss kwargs typing

* style and pull latest changes

---------

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-24 11:02:54 +02:00
b0f0c61899 Add SynthID (watermarking by Google DeepMind) (#34350)
* Add SynthIDTextWatermarkLogitsProcessor

* Resolving comments.

* Resolving comments.

* Resolving commits,

* Improving SynthIDWatermark tests.

* switch to PT version

* detector as pretrained model + style

* update training + style

* rebase

* Update logits_process.py

* Improving SynthIDWatermark tests.

* Shift detector training to wikitext negatives and stabilize with lower learning rate.

* Clean up.

* in for 7B

* cleanup

* Support Python 3.8.

* README and final cleanup.

* HF Hub upload and initialize.

* Update requirements for synthid_text.

* Adding SynthIDTextWatermarkDetector.

* Detector testing.

* Documentation changes.

* Copyrights fix.

* Fix detector api.

* ironing out errors

* ironing out errors

* training checks

* make fixup and make fix-copies

* docstrings and add to docs

* copyright

* BC

* test docstrings

* move import

* protect type hints

* top level imports

* watermarking example

* direct imports

* tpr fpr meaning

* process_kwargs

* SynthIDTextWatermarkingConfig docstring

* assert -> exception

* example updates

* no immutable dict (can't be serialized)

* pack fn

* einsum equivalent

* import order

* fix test on gpu

* add detector example

---------

Co-authored-by: Sumedh Ghaisas <sumedhg@google.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: sumedhghaisas2 <138781311+sumedhghaisas2@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
2024-10-23 21:18:52 +01:00
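A sketch of watermarked generation with the config class this PR adds; the model id and key values are placeholders:

```python
# Hedged sketch: generate with a SynthID text watermark applied.
from transformers import AutoModelForCausalLM, AutoTokenizer, SynthIDTextWatermarkingConfig

model_id = "google/gemma-2b"  # placeholder model id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

wm_config = SynthIDTextWatermarkingConfig(
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57, 29],  # placeholder keys
    ngram_len=5,
)
inputs = tok("Tell me a story:", return_tensors="pt")
out = model.generate(**inputs, do_sample=True, max_new_tokens=20, watermarking_config=wm_config)
print(tok.batch_decode(out, skip_special_tokens=True)[0])
```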
e50bf61dec Fix red CI: benchmark script (#34351)
* don't trigger always

* fix

* oups

* update

* ??

* ?

* aie
2024-10-23 18:33:52 +02:00
c42b3223db skip test_pipeline_depth_estimation temporarily (#34316)
skip

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-23 17:27:51 +02:00
d9f733625c Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)
* Enable grad accum fix across all models + trainer fully in forward()

* handle peft case

* Account for DDP: need to run scale tests

* Use accelerator state

* Quality

* Guard

* Experiment w/ only fairseq fix

* Fairseq only

* Revert multiply_grads fix

* Mult by grad accum to fully bring back solution

* Style

* Good to go now

* Skip fx tests for now

* Bookmark

* Working now
2024-10-23 11:24:57 -04:00
1fb575fcf0 Support boolean tool args (#34208)
Support boolean tool arguments
2024-10-23 16:48:21 +02:00
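For illustration, a tool with a boolean argument, assuming the agents `tool` decorator and its docstring convention; the tool itself is a toy:

```python
# Hedged sketch: boolean tool arguments are now parsed correctly.
from transformers.agents import tool

@tool
def search_corpus(query: str, case_sensitive: bool = False) -> str:
    """Searches a toy corpus for a query string.

    Args:
        query: The text to look for.
        case_sensitive: Whether matching should respect letter case.
    """
    corpus = "Transformers provides thousands of pretrained models."
    haystack = corpus if case_sensitive else corpus.lower()
    needle = query if case_sensitive else query.lower()
    return "found" if needle in haystack else "not found"
```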
343c8cb86f Added Deberta model type support (#34308)
* Added Deberta model type for 'add_prefix_space' functionality

* housekeeping

---------

Co-authored-by: Filippos Ventirozos <filippos.ventirozos@autotrader.co.uk>
2024-10-23 11:15:36 +02:00
5ba85de7a4 [docs] Fix Korean toctree (#34324)
fix
2024-10-23 10:52:51 +02:00
049682a5a6 Example doc for token classification of Llama and Dependent/Copied Models (#34139)
* Added Example Doc for token classification on all tokenClassificationModels copied from llama

* Refactor code to add code sample docstrings for Gemma and Gemma2 models (including modular Gemma)

* Refactor code to update model checkpoint names for Qwen2 models
2024-10-22 10:26:16 -07:00
644d5287b2 🌐 [i18n-KO] Translated model_doc/bartpho.md to Korean (#33981)
* docs: ko: model_doc/bartpho.md

* feat: nmt draft

* Update docs/source/ko/model_doc/bartpho.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:52 -07:00
b03dc0a87e 🌐 [i18n-KO] Translated bert japanese.md to Korean (#33890)
* docs: ko: bert-japanese.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

---------

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:31 -07:00
4b14aa1bcd 🌐 [i18n-KO] Translated executorch.md to Korean (#33888)
* docs: ko: executorch.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/main_classes/executorch.md

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/_toctree.yml

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:20 -07:00
688eeac81e [docs] fix typo (#34235)
fix typo
2024-10-22 09:46:07 -07:00
a65a6ce7fe fix error in _get_eval_sampler when group_by_length enabled (#34237)
* remove self in _get_eval_sampler

* remove self in front of _get_eval_sampler
2024-10-22 18:02:42 +02:00
e7c3fa7f57 Fix continue_final_message for image-text-to-text chat templates (#34236)
* fix continue_final_message for vlms

* Add one test for vlms continue_final_message chat template
2024-10-22 11:57:44 -04:00
96f67c068b Feature: Add MLFLOW_MAX_LOG_PARAMS to MLflowCallback (#34279) 2024-10-22 16:34:17 +02:00
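A sketch of the new knob, assuming it is read from the environment when the callback sets up its run:

```python
# Hedged sketch: cap how many parameters MLflowCallback logs.
import os

os.environ["MLFLOW_MAX_LOG_PARAMS"] = "100"  # log at most 100 params per run
```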
eef6b0ba42 Add option for running ffmpeg_microphone_live as a background process (#32838)
* Add option for running ffmpeg_microphone_live as a background process

* Code quality checks for audio_utils

* Code clean up for audio_utils

* Fixing logic in ffmpeg_microphone calls in audio_utils

* Allowing any arbitrary arguments to be passed to ffmpeg_microphone_live

* Formatting

* Fixing last problems with adding ffmpeg_additional_args

* Fixing default arguments and formatting issues

* Fixing comments for ffmpeg_additional_args

* Adding two shorts tests for ffmpeg_microphone_live

* Fixing test bug
2024-10-22 15:56:41 +02:00
c14ccbcd64 Olmo is ExecuTorch Compatible (#34181)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:53:01 +02:00
7a08a772cc Qwen2.5 is ExecuTorch Compatible (#34102)
Qwen2 is ExecuTorch Compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:52:23 +02:00
c31a6ff474 Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (#32550)
* add colorize_depth and matplotlib availability check

* add post_process_depth_estimation for zoedepth + tests

* add post_process_depth_estimation for DPT + tests

* add post_process_depth_estimation in DepthEstimationPipeline & special case for zoedepth

* run `make fixup`

* fix import related error on tests

* fix more import related errors on test

* forgot some `torch` calls in declarations

* remove `torch` call in zoedepth tests that caused error

* updated docs for depth estimation

* small fix for `colorize` input/output types

* remove `colorize_depth`, fix various names, remove matplotlib dependency

* fix formatting

* run fixup

* different images for test

* update examples in `forward` functions

* fixed broken links

* fix output types for docs

* possible format fix inside `<Tip>`

* Readability related updates

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Readability related update

* cleanup after merge

* refactor `post_process_depth_estimation` to return dict; simplify ZoeDepth's `post_process_depth_estimation`

* rewrite dict merging to support python 3.8

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-22 15:50:54 +02:00
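A hedged sketch of the pipeline path after this change; the checkpoint id is an assumption:

```python
# Sketch: depth estimation now routes through post_process_depth_estimation.
from transformers import pipeline

depth_estimator = pipeline("depth-estimation", model="Intel/zoedepth-nyu-kitti")
result = depth_estimator("http://images.cocodataset.org/val2017/000000039769.jpg")
print(result["depth"])            # PIL image resized to the input
print(result["predicted_depth"])  # raw depth tensor
```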
104599d7a8 Fix: tensor of examples of the same length triggers invalid stacking (#34166)
* Fix issue where tensor of examples of the same length triggers invalid stacking

* Update data_collator.py
2024-10-22 15:49:21 +02:00
51e395d13e Fix FA2 attention for models supporting sliding window (#34093)
Fix FA2
2024-10-22 15:37:21 +02:00
eb6a734995 [RT-DETR] Fix onnx inference bug for Optype (Where) (#33877)
* feat: [RT-DETR] Add onnx runtime config and fix onnx inference bug Optype (Where)

* fix lint

* use dtype istead of torch.float32

* add doc

* remove onnx config

* use dtype info

* use tensor to fix lint
2024-10-22 15:14:07 +02:00
84b17e03f1 Update PR templates (#34065)
update PR template
2024-10-22 15:11:54 +02:00
681fc43713 Sync video classification pipeline with huggingface_hub spec (#34288)
* Sync video classification pipeline

* Add disclaimer
2024-10-22 13:33:49 +01:00
93352e81f5 Fix Korean doc _toctree.yml (#34293)
Fix korean doc _toctree.yml
2024-10-22 11:05:56 +02:00
b644178ed4 [docs] Fix GenerationConfig params (#34299)
fix generationconfigs
2024-10-22 11:03:25 +02:00
73d65e637b T5 compile compatibilty (#34089)
* this worked in normal generation, needs more tests

* fix almost all tests in t5

* nit

* longt5, umt5, mt5

* style

* udop, pix2struct

* more models

* fix some tests

* fix onnx tests

* tracing tests fixed

* compile enabled and tested for t5 models

* fix small bug in slow tests

* [run-slow] t5

* uncomment

* style

* update with new generation refactoring

* nit

* fix copies

* this is the fix, had to change t5 to fix copies

* update

* [run-slow] t5

* [run-slow] t5

* update

* add test for encoder only T5

* clean up after rebase

* fix pop2piano

* add comment

* style

* fix copies after rebase

* fix copies, missed this one
2024-10-22 08:23:53 +02:00
5077bc034f VLM: add more modularity (#34175)
* update

* fix tests + fix copies

* fix tests once more
2024-10-22 07:56:35 +02:00
21d5025826 Attn implementation for composite models (#32238)
* first try

* codestyle

* idefics2 is happy

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma

* fix-copies

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo

* blip-2 needs to init vision from config

* when was this removed O_o

* minor fix

* tests

* this way?

* tests

* model-agnostic code

* codestyle

* add tests for idefics

* modify general test for VLMs

* no generation test for vlm yet!

* no generation test here also

* warn in ViT-SDPA if output attn

* add more tests

* user can pass dict as attn impl

* repo consistency

* update

* musicgen

* no prints

* forgot speech enc-dec and clip

* how many composite models we have?

* musicgen melody is same as musicgen

* +siglip

* fix tests + add some more

* remove idefics custom overriden code

* make idefics2 automappable

* nits

* skip tests

* doctests

* Update src/transformers/models/idefics2/configuration_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/clip/test_modeling_clip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* major update, no need for automap

* clean up

* add FA2 test

* more tests

* style

* skip tests

* why did these start failing now?

* no attributes for FA2 needed

* one tiny test

* address comment about FA2 false warning

* style

* add new models and resolve conflicts

* fix copies

* let it be this way for now, come back tomorrow to review

* some more fixes

* update

* more updates

* update

* fix copies

* style and tests

* another big update

* fix tests

* fix tests

* update

* another update

* fix tests

* fix copies

* fix tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-22 06:54:44 +02:00
32590b5ecb Fix method name which changes in tutorial (#34252)
The method `model_download_tool` was called `model_download_counter` earlier in the tutorial, which raises an error when following the code.
2024-10-21 14:21:52 -03:00
f701b98e4a Add a doc section on writing generation prompts (#34248)
Add a section on writing generation prompts
2024-10-21 14:35:57 +01:00
a4122813d1 Add DetrImageProcessorFast (#34063)
* add fully functionning image_processing_detr_fast

* Create tensors on the correct device

* fix copies

* fix doc

* add tests equivalence cpu gpu

* fix doc en

* add relative imports and copied from

* Fix copies and nit
2024-10-21 09:05:05 -04:00
24bdc94da5 Change Paligemma import logging to work with modular (#34211)
* change import logging

* fix CI
2024-10-21 08:55:27 -04:00
ca541bd4f4 Generation tests: don't rely on main input name (#34228)
* don't rely on main input name

* update
2024-10-21 10:00:14 +02:00
816f442496 Only cast logits to float when computing loss (#34147)
* Only cast logits to float when computing loss

Some misses from #31292 and #33902

* Move logits.float() into existing if labels is not None branch
2024-10-18 18:15:26 +02:00
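A minimal sketch of the pattern this change describes: keep logits in their compute dtype and upcast to float32 only inside the loss branch. The helper name `lm_loss` is illustrative, not from the PR:

```python
import torch
import torch.nn.functional as F

def lm_loss(logits, labels=None):
    # Upcast to float32 only when a loss is actually computed, per the
    # commit above; inference-only calls keep the cheaper dtype.
    loss = None
    if labels is not None:
        logits = logits.float()  # the upcast now lives inside the labels branch
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    return loss

print(lm_loss(torch.randn(2, 4, 10, dtype=torch.bfloat16), torch.zeros(2, 4, dtype=torch.long)))
```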
e46e3bc173 Fix UDOP dtype issue (#34180)
* Trigger UDOP tests

* Try forcing dtype in LayoutLMV3

* Do checks to see where uint8 is getting in

* Do checks to see where uint8 is getting in

* Found it!

* Add .astype(np.float32)

* Remove forced check, make fixup

* Checking where exactly the uint8 creeps in

* More checking on the uint8 issues

* Manually upcast in rescale()

* Remove UDOP trigger
2024-10-18 16:54:58 +01:00
6604764007 add Glm (#33823)
* Create modular_glm.py

* Update modular_glm.py

* Finalize architecture without all attentions

* Add all attentions modules

* Finalize modular

* Update given last version

* Last update

* Finalize model

* Finalize converter

* Update convert_glm_weights_to_hf.py

* style

* style

* Create __init__.py

* Aff all inits

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Correct the rotary embeddings

* Remove apply_residual_connection_post_layernorm (always false)

* remove use_rms_norm (always true)

* remove past_layer_norm (always true)

* Update __init__.py

* Update config and license

* start adding tests and doc

* Add doc + style

* Update test_modeling_glm.py

* Add dummies

* Apply correct modeling

* Refactor attention to follow llama

* Update __init__.py

* Update convert_glm_weights_to_hf.py

* Correct bias

* remove linear_bias and pdrop (never used)

* apply modular

* Simplify converter

* remove dummies + style

* add model_input_names

* Add pretraining_tp to config for when eager attention is used

* Update modular to remove all pretraining_tp

* Update test_modeling_glm.py

* Update the __all__

* Update __all__

* Update __init__.py

* Update test_modeling_glm.py

* add revisions

* Add the correct repos and revisions

* style

* Update __init__.py

* update exports

* remove import of modular files

* style

* Apply Llama changes + refine converter

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* style

* Use new modular converter

* add pretrainedmodel to init

* style

* Update test_modeling_glm.py

* Move config outside modular to please CI about docstrings

* Add dummies to please CI

* Update glm.md

* Update glm.md
2024-10-18 17:41:12 +02:00
e95ea479ee Informative 2 (#34154)
* Informative

* style

* Informative 2

* Apply suggestions from code review

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

---------

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2024-10-18 14:12:15 +02:00
0437d6cd03 Fix broken test decorator require_torch_up_to_2_accelerators (#34201)
* fix broken require_torch_up_to_2_accelerators

* make style
2024-10-18 13:54:55 +02:00
5a5b590d06 BLIP: fix input expansion logic (#34225)
fix
2024-10-18 12:17:30 +02:00
b54109c746 Fix-red-ci (#34230)
* fix copies, skip fx for llama

* styke

* re-fix copies

* last?

* style
2024-10-17 23:38:35 +02:00
6ba31a8a94 Enable users to use their own loss functions + deal with prefetching for grad accum (#34198)
* bookmark

* Bookmark

* Bookmark

* Actually implement

* Pass in kwarg explicitly

* Adjust for if we do or don't have labels

* Bookmark fix for od

* bookmark

* Fin

* closer

* Negate accelerate grad accum div

* Fixup not training long enough

* Add in compute_loss to take full model output

* Document

* compute_loss -> compute_loss_fn

* Add a test

* Refactor

* Refactor

* Uncomment tests

* Update tests/trainer/test_trainer.py

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-10-17 17:01:56 -04:00
7a06d07e14 Support Llama 3.2 conversion (text models) (#33778)
* Support Llama 3.2 conversion (text models)

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Fix rope factor

* Update chat template

Initialize from a well-known template.
The guidance is that the changes should be applied to 3.1 models as
well.

* Remove import

* Support Llama Guard 3 conversion

* Tokenizer details

* Fix eos added token in base models

* Fix generation config for base models

* Specify revision for known tokenizers

* Style

* Reuse chat templates for older models

* Improve error when converting tokenizer < Llama 3

---------

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2024-10-17 22:37:37 +02:00
c1c7e89620 Fix Gradient Accumulation issue (#34191)
* quick fix

* 3 losses

* oups

* fix

* nits

* check how it scales for special models

* propagate for conditiona detr

* propagate

* propagate

* propagate

* fixes

* propagate changes

* update

* fixup

* nits

* f string

* fixes

* more fixes

* ?

* nit

* arg annoying f string

* nits

* grumble

* update

* nit

* refactor

* fix fetch tests

* nit

* nit

* Update src/transformers/loss/loss_utils.py

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

* update

* nit

* fixup

* make pass

* nits

* port code to more models

* fixup

* nits

* arf

* update

* update

* nits

* update

* fix

* update

* nits

* fine

* agjkfslga.jsdlkgjklas

* nits

* fix fx?

* update

* update

* style

* fix imports

* update

* update

* fixup to fix the torch fx?

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2024-10-17 22:34:40 +02:00
f51ac9e059 Generate: visit non-llm prepare_inputs_for_generation (#34199)
* tmp

* all visited

* test all

* Update src/transformers/models/moshi/modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* delete another one :D

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-17 16:53:48 +01:00
1d2c29f0b3 Fix bus error when using GPT2 on M1 macs (#34031)
There's a bug on M1 macs with transformers >= 4.43.0 and torch >= 2.1.0, where if a model has tied embeddings, the fast loading from #31771 causes a bus error when the model is actually run. This can be solved by disabling `_supports_param_buffer_assignment` for these models.

More info in comments in #33357
2024-10-17 17:39:04 +02:00
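A sketch of what the workaround amounts to; the real fix sets the attribute on the affected model classes inside the library rather than from user code:

```python
# _supports_param_buffer_assignment is a transformers model class attribute;
# setting it to False makes loading fall back to the safe copy-based path,
# avoiding the bus error with tied embeddings on M1.
from transformers import GPT2LMHeadModel

GPT2LMHeadModel._supports_param_buffer_assignment = False
model = GPT2LMHeadModel.from_pretrained("gpt2")
```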
9470c00042 Llama3 and Llama2 are ExecuTorch compatible (#34101)
Llama3_1b and Llama2_7b are ExecuTorch compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-17 17:33:19 +02:00
7f5088503f removes decord (#33987)
* removes decord dependency

optimize

np

Revert "optimize"

This reverts commit faa136b51ec4ec5858e5b0ae40eb7ef89a88b475.

helpers as documentation

pydoc

missing keys

* make fixup

* require_av

---------

Co-authored-by: ad <hi@arnaudiaz.com>
2024-10-17 17:27:34 +02:00
f2846ad2b7 Fix for tokenizer.apply_chat_template with continue_final_message=True (#34214)
* Strip final message

* Do full strip instead of rstrip

* Retrigger CI

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-10-17 15:45:07 +01:00
b57c7bce21 fix(Wav2Vec2ForCTC): torch export (#34023)
* fix(Wav2Vec2ForCTC): torch export

Resolves the issue described in #34022 by implementing the
masking of the hidden states using an elementwise multiplication
rather than indexing with assignment.

The torch.export functionality seems to mark the tensor as frozen
even though the update is legal.

This change is a workaround for now to allow the export of the
model as a FxGraph. Further investigation is required to find
the real solution in pytorch.

* [run-slow] hubert, unispeech, unispeech_sat, wav2vec2
2024-10-17 15:41:55 +01:00
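A minimal sketch of the masking swap described above, with toy tensors:

```python
# Indexed in-place assignment trips torch.export, so the same zeroing is
# expressed as an elementwise multiply instead.
import torch

hidden_states = torch.randn(2, 10, 8)
mask = torch.ones(2, 10, dtype=torch.bool)
mask[:, 5:] = False  # positions to zero out

# export-unfriendly version: hidden_states[~mask] = 0.0
hidden_states = hidden_states * mask.unsqueeze(-1).to(hidden_states.dtype)
```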
fce1fcfe71 Ping team members for new failed tests in daily CI (#34171)
* ping

* fix

* fix

* fix

* remove runner

* update members

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-17 16:11:52 +02:00
aa3e35ac67 Fix warning message for fp32_cpu_offloading in bitsandbytes configs (#34079)
* change cpu offload warning for fp8 quantization

* change cpu offload warning for fp4 quantization

* change cpu offload variable name for fp8 and fp4 quantization
2024-10-17 15:11:33 +02:00
6d2b203339 Update trainer._get_eval_sampler() to support group_by_length arg (#33514)
Update 'trainer._get_eval_sampler()' to support 'group_by_length' argument

Trainer didn't support grouping by length for evaluation, which made evaluation slow with 'eval_batch_size'>1.

The updated 'trainer._get_eval_sampler()' method was based on 'trainer._get_train_sampler()'.
2024-10-17 14:43:29 +02:00
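A sketch of what length-grouped evaluation amounts to, assuming the same LengthGroupedSampler the train-side sampler uses; the exact wiring inside 'trainer._get_eval_sampler()' may differ:

```python
from transformers.trainer_pt_utils import LengthGroupedSampler

lengths = [3, 120, 5, 118, 4, 119]  # toy precomputed sequence lengths
sampler = LengthGroupedSampler(batch_size=2, lengths=lengths)
print(list(sampler))  # indices ordered so each batch sees similar lengths
```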
3f06f95ebe Revert "Fix FSDP resume Initialization issue" (#34193)
Revert "Fix FSDP resume Initialization issue (#34032)"

This reverts commit 4de1bdbf637fe6411c104c62ab385f660bfb1064.
2024-10-16 15:25:18 -04:00
3a10c6192b Avoid using torch's Tensor or PIL's Image in chat template utils if not available (#34165)
* fix(utils): Avoid using torch Tensor or PIL Image if not available

* Trigger CI

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-10-16 16:01:18 +01:00
bd5dc10fd2 Fix wrong name for llava onevision and qwen2_vl in tokenization auto (#34177)
* nit fix wrong llava onevision name in tokenization auto

* add qwen2_vl and fix style
2024-10-16 16:48:52 +02:00
cc7d8b87e1 Revert accelerate error caused by 46d09af (#34197)
Revert `accelerate` bug
2024-10-16 16:13:41 +02:00
98bad9c6d6 [fix] fix token healing tests and usage errors (#33931)
* auto-gptq requirement is removed & model is changed & tokenizer pad token is assigned

* values func is changed with extensions & sequence key value bug is fixed

* map key value check is added in ExtensionsTree

* empty trimmed_ids bug is fixed

* tail_id IndexError is fixed

* empty trimmed_ids bug fix is updated for failed test

* too much specific case for specific tokenizer is removed

* input_ids check is updated

* require auto-gptq import is removed

* key error check is changed with empty list check

* empty input_ids check is added

* empty trimmed_ids fix is checked with numel function

* usage change comments are added

* test changes are commented

* comment style and quality bugs are fixed

* test comment style and quality bug is fixed
2024-10-16 14:22:55 +02:00
9ba021ea75 Moshi integration (#33624)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* first moshi WIP

* converting weights working + configuration + generation configuration

* finalize converting script - still missing tokenizer and FE and processor

* fix saving model w/o default config

* working generation

* use GenerationMixin instead of inheriting

* add delay pattern mask

* fix right order: moshi codes then user codes

* unconditional inputs + generation config

* get rid of MoshiGenerationConfig

* blank user inputs

* update convert script: fix conversion, add tokenizer, feature extractor and bf16

* add and correct Auto classes

* update modeling code, configuration and tests

* make fixup

* fix some copies

* WIP: add integration tests

* add dummy objects

* propose better readability and code organisation

* update tokenization tests

* update docstrings, eval and modeling

* add .md

* make fixup

* add MoshiForConditionalGeneration to ignore Auto

* revert mimi changes

* re

* further fix

* Update moshi.md

* correct md formatting

* move prepare causal mask to class

* fix copies

* fix depth decoder causal

* fix and correct some tests

* make style and update .md

* correct config checkpoint

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* make style

* Update src/transformers/models/moshi/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* change firm in copyrights

* update config with nested dict

* replace einsum

* make style

* change split to True

* add back split=False

* remove tests in convert

* Update tests/models/moshi/test_modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add default config repo + add model to FA2 docstrings

* remove logits float

* fix some tokenization tests and ignore some others

* make style tokenization tests

* update modeling with sliding window + update modeling tests

* [run-slow] moshi

* remove prepare for generation from CausalLM

* isort

* remove copied from

* ignore offload tests

* update causal mask and prepare 4D mask aligned with recent changes

* further test refine + add back prepare_inputs_for_generation for depth decoder

* correct conditional use of prepare mask

* update slow integration tests

* fix multi-device forward

* remove previous solution to device_map

* save_load is flaky

* fix generate multi-devices

* fix device

* move tensor to int

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-10-16 11:21:49 +02:00
d087165db0 IDEFICS: support inputs embeds (#34043)
* support embeds

* use cache from config

* style...

* fix tests after rebase
2024-10-16 09:25:26 +02:00
9d6998c759 🌐 [i18n-KO] Translated blip-2.md to Korean (#33516)
* docs: ko: model_doc/blip-2

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/model_doc/blip-2.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2024-10-15 11:21:22 -07:00
554ed5d1e0 🌐 [i18n-KO] Translated trainer_utils.md to Korean (#33817)
* docs: ko: trainer_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2024-10-15 11:21:05 -07:00
8c33cf4eec 🌐 [i18n-KO] Translated gemma2.md to Korean (#33937)
* docs: ko: gemma2.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions
2024-10-15 11:20:46 -07:00
67acb0b123 🌐 [i18n-KO] Translated vivit.md to Korean (#33935)
* docs: ko: model_doc/vivit.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits
2024-10-15 10:31:44 -07:00
0f49deacbf [feat] LlavaNext add feature size check to avoid CUDA Runtime Error (#33608)
* [feat] add feature size check to avoid CUDA Runtime Error

* [minor] add error handling to all llava models

* [minor] avoid nested if else

* [minor] add error message to Qwen2-vl and chameleon

* [fix] token dimension for check

* [minor] add feature dim check for videos too

* [fix] dimension check

* [fix] test reference values

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2024-10-15 16:19:18 +02:00
d00f1ca860 Fix optuna ddp hp search (#34073) 2024-10-15 15:42:07 +02:00
65442718c4 Add support for inheritance from class with different suffix in modular (#34077)
* add support for different suffix in modular

* add dummy example, pull new changes for modular

* nide lines order change
2024-10-15 14:55:09 +02:00
d314ce70bf Generate: move logits to same device as input_ids (#34076)
tmp commit
2024-10-15 14:32:09 +02:00
5ee9e786d1 Fix default behaviour in TextClassificationPipeline for regression problem type (#34066)
* update code

* update docstrings

* update tests
2024-10-15 13:06:20 +01:00
4de1bdbf63 Fix FSDP resume Initialization issue (#34032)
* Fix FSDP Initialization for resume training

* Added init_fsdp function to work with dummy values

* Fix FSDP initialization for resuming training

* Added CUDA decorator for tests

* Added torch_gpu decorator to FSDP tests

* Fixup for failing code quality tests
2024-10-15 13:48:10 +02:00
293e6271c6 Add sdpa for Vivit (#33757)
* chore:add sdpa to vivit

* fix:failing slow test_inference_interpolate_pos_encoding(failing on main branch too)

* chore:fix nits

* ci:fix repo consistency failure

* chore:add info and benchmark to model doc

* [run_slow] vivit

* chore:revert interpolation test fix for new issue

* [run_slow] vivit

* [run_slow] vivit

* [run_slow] vivit

* chore:add fallback for output_attentions being True

* [run_slow] vivit

* style:make fixup

* [run_slow] vivit
2024-10-15 11:27:54 +02:00
23874f5948 Idefics: enable generation tests (#34062)
* add idefics

* conflicts after merging main

* enable tests but need to fix some

* fix tests

* no print

* fix/skip some slow tests

* continue not skip

* rebasing broken smth, this is the fix
2024-10-15 11:17:14 +02:00
dd4216b766 Update README.md with Enterprise Hub (#34150) 2024-10-15 10:45:22 +02:00
fa3f2db5c7 Add documentation for docker (#33156)
* initial commit

* nit
2024-10-14 11:58:45 +02:00
5114c9b9e9 Specify that users should be careful with their own files (#34153)
* Informative

* style
2024-10-14 11:40:39 +02:00
013d3ac2b5 Fixed error message in mllama (#34106) 2024-10-14 10:30:35 +02:00
cb5ca3265f Add GGUF for starcoder2 (#34094)
* add starcoder2 arch support for gguf

* fix q6 test
2024-10-14 10:22:49 +02:00
4c439173df Fix a typo (#34148)
Correct a typo

"If you want you tokenizer..."->"If you want your tokenizer...."
2024-10-14 10:15:25 +02:00
7434c0ed21 Mistral-related models for QnA (#34045)
* mistral qna start

* mixtral qna

* oops

* qwen2 qna

* qwen2moe qna

* add missing input embed methods

* add copied to all methods, can't directly from llama due to the prefix

* make top level copied from
2024-10-14 08:53:32 +02:00
37ea04013b Generate: Fix modern llm generate calls with synced_gpus (#34095) 2024-10-12 16:45:52 +01:00
617b21273a fix(ci): benchmarks dashboard was failing due to missing quotations (#34100) 2024-10-11 19:52:06 +02:00
144852fb6b refactor: benchmarks (#33896)
* refactor: benchmarks

Based on a discussion with @LysandreJik & @ArthurZucker, the goal of
this PR is to improve transformers' benchmark system.

This is a WIP, for the moment the infrastructure required to make things
work is not ready. Will update the PR description when it is the case.

* feat: add db init in benchmarks CI

* fix: pg_config is missing in runner

* fix: add psql to the runner

* fix: connect info from env vars + PR comments

* refactor: set database as env var

* fix: invalid working directory

* fix: `commit_msg` -> `commit_message`

* fix: git marking checked out repo as unsafe

* feat: add logging

* fix: invalid device

* feat: update grafana dashboard for prod grafana

* feat: add `commit_id` to header table

* feat: commit latest version of dashboard

* feat: move measurements into json field

* feat: remove drop table migration queries

* fix: `torch.arrange` -> `torch.arange`

* fix: add missing `s` to `cache_position` positional argument

* fix: change model

* revert: `cache_positions` -> `cache_position`

* fix: set device for `StaticCache`

* fix: set `StaticCache` dtype

* feat: limit max cache len

* fix script

* raise error on failure!

* not try catch

* try to skip generate compilation

* update

* update docker image!

* update

* update again!@

* update

* updates

* ???

* ??

* use `torch.cuda.synchronize()`

* fix json

* nits

* fix

* fixed!

* f**k

* feat: add TTNT panels

* feat: add try except

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-11 18:03:29 +02:00
80bee7b114 Avoid many test failures for LlavaNextVideoForConditionalGeneration (#34070)
* skip

* [run-slow] llava_next_video

* skip

* [run-slow] video_llava, llava_next_video

* skip

* [run-slow] llava_next_video

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-11 17:41:50 +02:00
37ac078535 Generate: move prepare_inputs_for_generation in encoder-decoder llms (#34048) 2024-10-11 16:11:18 +01:00
fd70464fa7 Fix flaky tests (#34069)
* fix mllama only

* allow image token index
2024-10-11 14:41:46 +01:00
3a24ba82ad Fix NaNs in cost_matrix for mask2former (#34074)
Fix NaNs in cost_matrix

Sometimes that happens :(
2024-10-11 15:35:55 +02:00
7b06473b8f avoid many failures for ImageGPT (#34071)
* skip

* [run-slow] imagegpt

* skip

* [run-slow] imagegpt

* [run-slow] imagegpt,video_llava

* skip

* [run-slow] imagegpt,video_llava

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-11 15:24:01 +02:00
1c66be8062 Fix PushToHubMixin when pusing to a PR revision (#34090) 2024-10-11 15:06:15 +02:00
409dd2d19c Fix failing conversion (#34010)
* Fix

* Tests

* Typo

* Typo
2024-10-11 14:59:23 +02:00
9dca0c9116 Fix DAC slow tests (#34088)
* Fix DAC slow tests and fix decode

* [run-slow] dac
2024-10-11 14:43:03 +02:00
f052e94bcc Fix flax failures (#33912)
* Few fixes here and there

* Remove typos

* Remove typos
2024-10-11 14:38:35 +02:00
e878eaa9fc Tests: upcast logits to float() (#34042)
upcast
2024-10-11 11:51:49 +01:00
4b9bfd32f0 Update SSH workflow file (#34084)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-11 10:53:12 +02:00
be9aeba581 Idefics: fix position ids (#33907)
* fix position ids

* fix labels also

* fix copies

* oops, not that one

* dont deprecate
2024-10-11 10:28:34 +02:00
7d97cca8dd Generate using exported model and enable gemma2-2b in ExecuTorch (#33707)
* Generate using exported model and enable gemma2-2b in ExecuTorch

* [run_slow] gemma, gemma2

* truncate expected output message

* Bump required torch version to support gemma2 export

* [run_slow] gemma, gemma2

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-11 10:16:31 +02:00
70b07d97cf Default synced_gpus to True when using FullyShardedDataParallel (#33483)
* Default synced_gpus to True when using FullyShardedDataParallel

Fixes #30228

Related:

* https://github.com/pytorch/pytorch/issues/100069
* https://github.com/pytorch/pytorch/issues/123962

Similar to DeepSpeed ZeRO Stage 3, when using FSDP with multiple GPUs and differently sized data per rank, the ranks reach different synchronization points at the same time, leading to deadlock.

To avoid this, we can automatically set synced_gpus to True if we detect that a PreTrainedModel is being managed by FSDP using _is_fsdp_managed_module, which was added in 2.0.0 for torch.compile: https://github.com/pytorch/pytorch/blob/v2.0.0/torch/distributed/fsdp/_dynamo_utils.py (see the sketch after this log entry)

* Remove test file

* ruff formatting

* ruff format

* Update copyright year

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add test for FSDP-wrapped model generation

Before #33483, these tests would have hung for 10 minutes before crashing due to a timeout error

* Ruff format

* Move argparse import

* Remove barrier

I think this might cause more problems if one of the workers was killed

* Move import into function to decrease load time

https://github.com/huggingface/transformers/pull/33483#discussion_r1787972735

* Add test for accelerate and Trainer

https://github.com/huggingface/transformers/pull/33483#discussion_r1790309675

* Refactor imports

* Ruff format

* Use nullcontext

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-10 14:09:04 -04:00
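A sketch of the detection described in the entry above; `resolve_synced_gpus` is an illustrative name, not the PR's function:

```python
# If any submodule is FSDP-managed, default synced_gpus to True so every
# rank keeps calling forward until all ranks finish generating.
# _is_fsdp_managed_module is the private torch helper the commit names.
import torch.nn as nn
from torch.distributed.fsdp._dynamo_utils import _is_fsdp_managed_module

def resolve_synced_gpus(model: nn.Module, synced_gpus=None) -> bool:
    if synced_gpus is not None:
        return synced_gpus  # an explicit user setting wins
    return any(_is_fsdp_managed_module(m) for m in model.modules())
```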
24b82f3cd5 Small Fix to modular converter (#34051)
* small_fix

* supporting both src/tranformers and examples/

* make style
2024-10-10 18:43:27 +02:00
211f1d93db provide trust_remote_code for search feat extractor in model config (#34036) 2024-10-10 16:33:46 +01:00
8363fd8346 Update Blip2 is_pipeline_test_to_skip method signature (#34067)
Update method signature
2024-10-10 16:32:08 +01:00
e7dfb917f8 [TESTS] ASR pipeline (#33925)
* fix whisper translation

* correct slow_unfinished_sequence test

* make fixup
2024-10-10 17:31:22 +02:00
a37a06a20b Fix data_seed unused (#33731)
* fixing data_seed unused

* fix accelerate version needed

* fix style

* update the fix following accelerate fix
2024-10-10 15:28:00 +02:00
b2f09fb90f [Docs] Update compressed_tensors.md (#33961)
* Update compressed_tensors.md

Fix some unfinished sections

* Update docs/source/en/quantization/compressed_tensors.md

Co-authored-by: Xiao Yuan <yuanx749@gmail.com>

---------

Co-authored-by: Xiao Yuan <yuanx749@gmail.com>
2024-10-10 15:22:41 +02:00
4a3f1a686f check if eigenvalues of covariance matrix are complex. (#34037)
check if eigenvalues of covariance matrix are complex for PSD checking
2024-10-10 14:44:05 +02:00
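A hedged sketch of what such a check can look like; the threshold and exact placement are assumptions:

```python
# torch.linalg.eigvals returns a complex tensor even for symmetric input,
# so a PSD test inspects imaginary and real parts separately instead of
# comparing complex values directly.
import torch

cov = torch.tensor([[2.0, 0.5], [0.5, 1.0]])
eig = torch.linalg.eigvals(cov)
is_psd = bool((eig.imag.abs() < 1e-8).all() and (eig.real >= -1e-8).all())
print(is_psd)  # True
```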
fb0c6b521d Universal Assisted Generation: Assisted generation with any assistant model (by Intel Labs) (#33383)
* Update candidate_generator.py

* Update utils.py

* add lookbehind params to _get_candidate_generator

* make fixup

* add unit tests

* fix failing tests

* add docstrings

* fix docstrings; remove non-optimized AnyTokenizer

* added any tokenizer generation correctness test

* make fixup

* fix assertion syntax

* PR review fixes

* address additional PR comments

* fix tests

* remove stopping criteria arg

* make fixup

* add AssistantConfig

* fix prev_tokens branching

* pass tokenizers through `generate()` kwargs

* fix lookbehind values; tokenizer params WIP

* fixup

* AssistantConfig

* remove AssistantConfig; apply PR suggestions

* restructure tests

* fixup

* fix assistant_tokenizer arg validation

* fixup

* fix tests in TestAssistedCandidateGeneratorDifferentTokenizers

* fix class docstring

* PR suggestions

* doc

* doc update and improvements to `_validate_assistant()`

---------

Co-authored-by: mosheber <moshe.berchansky@intel.com>
2024-10-10 14:41:53 +02:00
dda3f91d06 Specifying torch dtype in Qwen2VLForConditionalGeneration (#33953)
* Specifying torch dtype

* Reverting change & changing fallback _from_config() dtype
2024-10-10 14:39:33 +02:00
f8a260e2a4 Sync QuestionAnsweringPipeline (#34039)
* Sync QuestionAnsweringPipeline

* typo fixes

* Update deprecation warnings
2024-10-10 13:38:14 +01:00
c9afee5392 Add gguf support for gpt2 (#34044)
* add gpt2 gguf support

* add doc change

* small refactoring
2024-10-10 13:42:18 +02:00
66e08dba71 Fix pipelines tests (#34049)
* Fix wrong skip annotation

* Remove error raise
2024-10-10 12:04:06 +01:00
a84c413773 HfArgumentParser: allow for hyhenated field names in long-options (#33990)
Allow for hyphenated field names in long-options

argparse converts hyphens into underscores before assignment (e.g., an
option passed as `--long-option` will be stored under `long_option`), so
there is no need to pass options as literal attributes, as in
`--long_option` (with an underscore instead of a hyphen). This commit
ensures that this behavior is respected by `parse_args_into_dataclasses`
as well.

Issue: #33933

Co-authored-by: Daniel Marti <mrtidm@amazon.com>
2024-10-10 11:58:26 +02:00
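The argparse behavior the entry above relies on, in miniature:

```python
# Hyphens in a long option become underscores on the parsed namespace, so
# --long-option and --long_option land on the same attribute.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--long-option")
args = parser.parse_args(["--long-option", "value"])
print(args.long_option)  # "value"
```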
adea67541a Phi3: fix attn for sliding window (#33586)
* fix phi3 attn for sliding window

* fix tests

* address most comment

* style

* update after rebase

* add more models

* fix tests
2024-10-10 11:50:39 +02:00
a265600c60 add sdpa to OPT (#33298)
* add sdpa to OPT

* chore: remove redundant whitespace in OPTDecoder class

* fixup

* bug fix

* add sdpa and attention generate test

* fixup

* Refactor OPTAttention forward method for improved readability and maintainability

* undo refactor for _shape and key,val states

* add OPT to doc, fixup didn't find it for some reason

* change order

* change default attn_implementation in testing to eager

* [run-slow] opt

* change test_eager_matches_sdpa_generate to the llama one

* Update default attention implementation in testing common

* [run-slow] opt

* remove unneeded print

* [run-slow] opt

* refactor model testers to have attn_implementation="eager"

* [run-slow] opt

* convert test_eager_matches_sdpa_generate to opt-350M

* bug fix when creating mask for opt

* [run-slow] opt

* if layer head mask default to eager

* if head mask is not none fall to eager

* [run-slow] opt

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Clean up Unpack imports (#33631)

clean up Unpack imports

* Fix DPT /Dinov2 sdpa regression on main (#33660)

* fallback to eager if output attentions.

* fix copies

* handle dependency errors in check_imports (#33622)

* handle dependency errors in check_imports

* change log level to warning

* add back self.max_position_embeddings = config.max_position_embeddings (#33550)

* add back self.max_position_embeddings = config.max_position_embeddings

* fix-copies

* Fix Llava conversion for LlavaQwen2ForCausalLM with Clip vision tower (#33613)

fix llavaqwen2 model conversion

* Uniformize kwargs for Udop processor and update docs (#33628)

* Add optional kwargs and uniformize udop

* cleanup Unpack

* nit Udop

* Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin`  (#33203)

* Enable BNB multi-backend support (#31098)

* enable cpu bnb path

* fix style

* fix code style

* fix 4 bit path

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* add multi backend refactor tests

* fix style

* tweak 4bit quantizer + fix corresponding tests

* tweak 8bit quantizer + *try* fixing corresponding tests

* fix dequant bnb 8bit

* account for Intel CPU in variability of expected outputs

* enable cpu and xpu device map

* further tweaks to account for Intel CPU

* fix autocast to work with both cpu + cuda

* fix comments

* fix comments

* switch to testing_utils.torch_device

* allow for xpu in multi-gpu tests

* fix tests 4bit for CPU NF4

* fix bug with is_torch_xpu_available needing to be called as func

* avoid issue where test reports attr err due to other failure

* fix formatting

* fix typo from resolving of merge conflict

* polish based on last PR review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix CI

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix error log

* fix error msg

* add \n in error log

* make quality

* rm bnb cuda restriction in doc

* cpu model don't need dispatch

* fix doc

* fix style

* check cuda available in testing

* fix tests

* Update docs/source/en/model_doc/chameleon.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix doc

* fix check multibackends

* fix import sort

* remove check torch in bnb

* docs: update bitsandbytes references with multi-backend info

* docs: fix small mistakes in bnb paragraph

* run formatting

* revert bnb check

* move bnb multi-backend check to import_utils

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* fix bnb check

* minor fix for bnb

* check lib first

* fix code style

* Revert "run formatting"

This reverts commit ac108c6d6b34f45a5745a736ba57282405cfaa61.

* fix format

* give warning when bnb version is low and no cuda found

* fix device assignment check to be multi-device capable

* address akx feedback on get_avlbl_dev fn

* revert partially, as we don't want the function to be that public, as docs would be too much (enforced)

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix error string after refactoring into get_chat_template (#33652)

* Fix error string after refactoring into get_chat_template

* Take suggestion from CR

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* uniformize git processor (#33668)

* uniformize git processor

* update doctring

* Modular `transformers`: modularity and inheritance for new model additions (#33248)

* update exampel

* update

* push the converted diff files for testing and ci

* correct one example

* fix class attributes and docstring

* nits

* oups

* fixed config!

* update

* nits

* class attributes are not matched against the other, this is missing

* fixed overwriting self.xxx now onto the attributes I think

* partial fix, now order with docstring

* fix docstring order?

* more fixes

* update

* fix missing docstrings!

* examples don't all work yet

* fixup

* nit

* updated

* hick

* update

* delete

* update

* update

* update

* fix

* all default

* no local import

* fix more diff

* some fix related to "safe imports"

* push fixed

* add helper!

* style

* add a check

* all by default

* add the

* update

* FINALLY!

* nit

* fix config dependencies

* man that is it

* fix fix

* update diffs

* fix the last issue

* re-default to all

* alll the fixes

* nice

* fix properties vs setter

* fixup

* updates

* update dependencies

* make sure to install what needs to be installed

* fixup

* quick fix for now

* fix!

* fixup

* update

* update

* updates

* whitespaces

* nit

* fix

* simplify everything, and make it file agnostic (should work for image processors)

* style

* finish fixing all import issues

* fixup

* empty modeling should not be written!

* Add logic to find who depends on what

* update

* cleanup

* update

* update gemma to support positions

* some small nits

* this is the correct docstring for gemma2

* fix merging of docstrings

* update

* fixup

* update

* take doc into account

* styling

* update

* fix hidden activation

* more fixes

* final fixes!

* fixup

* fixup instruct  blip video

* update

* fix bugs

* align gemma2 with the rest as well

* updats

* revert

* update

* more reversion

* grind

* more

* arf

* update

* order will matter

* finish del stuff

* update

* rename to modular

* fixup

* nits

* update makefile

* fixup

* update order of the checks!

* fix

* fix docstring that has a call inside

* fix conversion check

* style

* add some initial documentation

* update

* update doc

* some fixup

* updates

* yups

* Mostly todo, gimme a minute

* update

* fixup

* revert some stuff

* Review docs for the modular transformers (#33472)

Docs

* good update

* fixup

* mmm current updates lead to this code

* okay, this fixes it

* cool

* fixes

* update

* nit

* updates

* nits

* fix doc

* update

* revert bad changes

* update

* updates

* proper update

* update

* update?

* up

* update

* cool

* nits

* nits

* bon bon

* fix

* ?

* minimise changes

* update

* update

* update

* updates?

* fixed gemma2

* kind of a hack

* nits

* update

* remove `diffs` in favor of `modular`

* fix make fix copies

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Fix CIs post merging modular transformers (#33681)

update

* Fixed docstring for cohere model regarding unavailability of prune_he… (#33253)

* Fixed docstring for cohere model regarding unavailability of prune_heads() method

The docstring mentions that the cohere model supports the prune_heads() method. I have fixed the docstring by explicitly mentioning that it doesn't support that functionality.

* Update src/transformers/models/cohere/modeling_cohere.py

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Generation tests: update imagegpt input name, remove unused functions (#33663)

* Improve Error Messaging for Flash Attention 2 on CPU (#33655)

Update flash-attn error message on CPU

Rebased to latest branch

* Gemma2: fix config initialization (`cache_implementation`) (#33684)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used (#33556)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used

* Fixed formatting with `ruff`.

* Uniformize kwargs for image-text-to-text processors (#32544)

* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* FIx imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino

* 🚨🚨 Setting default behavior of assisted decoding (#33657)

* tests: fix pytorch tensor placement errors (#33485)

This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"

According to pytorch documentation, torch.Tensor.numpy(force=False)
performs the conversion only if the tensor is on CPU (plus a few other
restrictions), which is not the case here. For our case we need force=True,
since we just need the data and don't care about tensor coherency (see the
sketch after this log entry).

Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* bump tokenizers, fix added tokens fast (#32535)

* update based on tokenizers release

* update

* nits

* update

* revert re addition

* don't break that yet

* fmt

* revert unwanted

* update tokenizers version

* update dep table

* update

* update in conversion script as well

* some fix

* revert

* fully revert

* fix training

* remove set trace

* fixup

* update

* update

* [Pixtral] Improve docs, rename model (#33491)

* Improve docs, rename model

* Fix style

* Update repo id

* fix code quality after merge

* HFQuantizer implementation for compressed-tensors library (#31704)

* Add compressed-tensors HFQuantizer implementation

* flag serializable as False

* run

* revive lines deleted by ruff

* fixes to load+save from sparseml, edit config to quantization_config, and load back

* address satrat comment

* compressed_tensors to compressed-tensors and revert back is_serializable

* rename quant_method from sparseml to compressed-tensors

* tests

* edit tests

* clean up tests

* make style

* cleanup

* cleanup

* add test skip for when compressed tensors is not installed

* remove pydantic import + style

* delay torch import in test

* initial docs

* update main init for compressed tensors config

* make fix-copies

* docstring

* remove fill_docstring

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* review comments

* review comments

* comments - suppress warnings on state dict load, tests, fixes

* bug-fix - remove unnecessary call to apply quant lifecycle

* run_compressed compatibility

* revert changes not needed for compression

* no longer need unexpected keys fn

* unexpected keys not needed either

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add to_diff_dict

* update docs and expand testing

* Update _toctree.yml with compressed-tensors

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update doc

* add note about saving a loaded model

---------

Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>

* update model card for opt

* add batch size to inference table

* [slow-run] opt

* [run-slow] opt

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: chengchengpei <5881383+chengchengpei@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Tibor Reiss <75096465+tibor-reiss@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Muhammad Naufil <m.naufil1@gmail.com>
Co-authored-by: sizhky <yyeshr@gmail.com>
Co-authored-by: Umar Butler <umar@umar.au>
Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
2024-10-10 11:49:34 +02:00
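A minimal sketch of the torch.Tensor.numpy(force=True) fix squashed into the entry above:

```python
# force=True detaches and copies to CPU, so tensors that require grad (or
# live on an accelerator) convert cleanly.
import torch

t = torch.ones(3, requires_grad=True)
# t.numpy() would raise: can't call numpy() on a tensor that requires grad
arr = t.numpy(force=True)  # roughly t.detach().cpu().numpy()
print(arr)
```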
69b5ccb887 Add Translate docs into Arabic - section files CONCEPTUAL GUIDES (#33982)
Add Translate docs into Arabic - section files CONCEPTUAL GUIDES:

* Philosophy [i18n-ar] Translated file: docs/source/ar/philosophy.md into Arabic #33064
* Glossary [i18n-ar] Translated file: docs/source/ar/glossary.md into Arabic #33038
* What 🤗 Transformers can do [i18n-ar] Translated file: docs/source/ar/task_summary.md into Arabic #33073
* How 🤗 Transformers solve tasks [i18n-ar] Translated file: docs/source/ar/tasks_explained.md into Arabic #33074
* The Transformer model family [i18n-ar] Translated file: docs/source/ar/model_summary.md into Arabic #33047
* Summary of the tokenizers [i18n-ar] Translated file: docs/source/ar/tokenizer_summary.md into Arabic #33078
* Attention [i18n-ar] Translated file: docs/source/ar/attention.md into Arabic #33021
* Padding and truncation [i18n-ar] Translated file: docs/source/ar/pad_truncation.md into Arabic #33050
* BERTology [i18n-ar] Translated file: docs/source/ar/bertology.md into Arabic #33024
* Perplexity of fixed-length models [i18n-ar] Translated file: docs/source/ar/perplexity.md into Arabic #33063
* Pipelines for webserver inference [i18n-ar] Translated file: docs/source/ar/pipeline_webserver.md into Arabic #33066
* Model training anatomy [i18n-ar] Translated file: docs/source/ar/model_memory_anatomy.md into Arabic #33045
* Getting the most out of LLMs [i18n-ar] Translated file: docs/source/ar/llm_tutorial_optimization.md into Arabic #33043
2024-10-09 14:51:19 -07:00
88d01d9119 🌐 [i18n-KO] Translated generation_utils.md to Korean (#33818)
* docs: ko: generation_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update generation_utils.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:55:07 -07:00
c02cf48729 🌐 [i18n-KO] Translated main_classes/callback.md to Korean (#33572)
* docs: ko: callback.md

* feat: nmt draft & manual edits

* fix: resolve suggestions

* Update docs/source/ko/main_classes/callback.md

* Apply suggestions from code review

* Apply suggestions from code review

Confirmed! Thank you so much for the detailed review!

Co-authored-by: boyunJang <gobook1234@naver.com>

* Update _toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:54:38 -07:00
0354d44926 🌐 [i18n-KO] Translated text_generation.md to Korean (#33777)
* docs: ko: text_generation.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:20:01 -07:00
973e6066d4 🌐 [i18n-KO] Translated model_doc/patchtst.md to Korean (#33589)
* docs: ko: model_doc/patchtst.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

---------

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:15:24 -07:00
61a6dce7e4 🌐 [i18n-KO] Translated main_classes/data_collator.md to Korean (#33954)
* docs: ko: main_classes/data_collator.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 11:14:43 -07:00
6ac5f25bb6 🌐 [i18n-KO] Translated modeling_utils.md to Korean (#33808)
* docs: ko: modeling_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
2024-10-09 10:50:03 -07:00
8dca259826 🌐 [i18n-KO] Translated model_doc/graphormer.md to Korean (#33569)
* docs: ko: model_doc/graphormer.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-09 10:44:28 -07:00
4ad923344d 🌐 [i18n-KO] Translated model_doc/informer.md to Korean (#33585)
* docs: ko: model_doc/informer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-09 10:41:06 -07:00
04f51c42c8 🌐 [i18n-KO] Translated model_doc/time_series_transformer.md to Korean (#33596)
* docs: ko: model_doc/time_series_transformer.md

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:40:48 -07:00
32cc15c6a2 🌐 [i18n-KO] Translated model_doc/trajectory_transformer.md to Korean (#33597)
* docs: ko: model_doc/trajectory_transformer.md

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-09 10:40:36 -07:00
f0fbef1c63 🌐 [i18n-KO] Translated main_classes/model.md to Korean (#33606)
* feat: nmt draft

* fix: manual edits

* docs: ko: main_classes/model.md

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:40:06 -07:00
48b54205d0 🌐 [i18n-KO] Translated model_doc/mamba2.md to Korean (#33629)
* docs: ko: model_doc/mamba2.md

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestion

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:39:54 -07:00
03e6fa0061 🌐 [i18n-KO] Translated main_classes/keras_callbacks.md to Korean (#33955)
* docs: ko: main_classes/keras_callbacks.md

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-09 10:34:01 -07:00
13929a0ec6 🌐 [i18n-KO] Translated model_doc/deberta.md to Korean (#33967)
* docs: ko: model_doc/deberta.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
2024-10-09 10:33:34 -07:00
41794e6098 🌐 [i18n-KO] Translated model_doc/bart.md to Korean (#33893)
* docs: ko: model_doc/bart.md

* fix: anchor edits

* feat: nmt draft

* Update docs/source/ko/model_doc/bart.md

* Update docs/source/ko/model_doc/bart.md

* fix: manual edits

* Update docs/source/ko/model_doc/bart.md

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 10:33:14 -07:00
36d410dab6 FEAT : Adding BitNet quantization method to HFQuantizer (#33410)
* rebasing changes

* fixing style

* adding some doc to functions

* remove bitblas

* change dtype

* fixing check_code_quality

* fixing import order

* adding doc to tree

* Small update on BitLinear

* adding some tests

* sorting imports

* small update

* reformatting

* reformatting

* reformatting with ruff

* adding assert

* changes after review

* update disk offloading

* adapting after review

* Update after review

* add is_serializable back

* fixing style

* adding serialization test

* make style

* small updates after review
2024-10-09 17:51:41 +02:00
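For context on the BitNet quantizer added above, a minimal usage sketch (not taken from the PR): the model id below is a hypothetical placeholder, and it assumes a repo whose config already carries a BitNet quantization_config, since BitNet checkpoints load pre-quantized.

```
# Hedged sketch of loading a pre-quantized BitNet checkpoint.
# "some-org/llama-bitnet" is a hypothetical repo id, not a real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/llama-bitnet"  # hypothetical pre-quantized BitNet repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The HFQuantizer reads the quantization_config stored in the repo's config,
# so no explicit quantization argument is needed at load time.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))
```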
48461c0fe2 Make pipeline able to load processor (#32514)
* Refactor get_test_pipeline

* Fixup

* Fixing tests

* Add processor loading in tests

* Restructure processors loading

* Add processor to the pipeline

* Move model loading on top of the test

* Update `get_test_pipeline`

* Fixup

* Add class-based flags for loading processors

* Change `is_pipeline_test_to_skip` signature

* Skip t5 failing test for slow tokenizer

* Fixup

* Fix copies for T5

* Fix typo

* Add try/except for tokenizer loading (kosmos-2 case)

* Fixup

* Llama no longer fails for long generation

* Revert processor pass in text-generation test

* Fix docs

* Switch back to json file for image processors and feature extractors

* Add processor type check

* Remove except for tokenizers

* Fix docstring

* Fix empty lists for tests

* Fixup

* Fix load check

* Ensure we have non-empty test cases

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/pipelines/base.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Rework comment

* Better docs, add note about pipeline components

* Change warning to error raise

* Fixup

* Refine pipeline docs

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-10-09 16:46:11 +01:00
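A short sketch of what this change enables, assuming a multimodal checkpoint such as the example id below: pipeline() can now resolve the checkpoint's processor itself, and a pre-built processor can also be passed in explicitly.

```
# Sketch only; "llava-hf/llava-1.5-7b-hf" is used as an example checkpoint.
from transformers import AutoProcessor, pipeline

model_id = "llava-hf/llava-1.5-7b-hf"

# The pipeline now loads the processor on its own...
pipe = pipeline("image-to-text", model=model_id)

# ...or accepts one explicitly, mirroring the existing tokenizer argument.
processor = AutoProcessor.from_pretrained(model_id)
pipe = pipeline("image-to-text", model=model_id, processor=processor)
```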
4fb28703ad Fix PIL dep for tests (#34028)
Fix PIL dep for tests
2024-10-09 10:45:06 -04:00
5ee52ae0bc Mllama: fix tests (#34000)
* fix tests

* don't need this

* style
2024-10-09 14:02:56 +02:00
295a90cb40 Generate: remove most decoder-only LLMs prepare_inputs_for_generation (#33870) 2024-10-09 12:15:48 +01:00
cdee5285ca Fix Failed tests with mobile bert resize tokens embedding (#33950)
* Fix Failed tests with mobile bert

* Cast to the correct dtype

* Code fixup

* Fix padding_idx larger than embedding_size

* Reduce covariance more. use 1e-7 instead of 1e-5

* Comment fix

* Reduce covariance more. use 1e-9 instead of 1e-7

* Copy new config

* all but MRA fixed

* fix mra

* very flaky

* skip instead

* make fixup

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
2024-10-09 11:23:50 +01:00
faa0f63b93 Add gguf support for StableLM (#33793)
* add stablelm gguf architecture support

* add additional quantization tests

* resolve merge conflict, add weight conversion tests for fp16
2024-10-09 12:16:13 +02:00
e783f12f20 [Patch helper] update to not have to checkout main (#34006)
add more support
2024-10-09 09:21:46 +02:00
698b36da72 🌐 [i18n-KO] Translated modular_transformers.md to Korean (#33772)
* docs: ko: modular_transformers.md

* feat: nmt draft

* fix inline TOC

* fix: manual edits

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* fix: resolve suggestions

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-08 18:30:41 -07:00
6151bc47ba 🌐 [i18n-KO] Translated image_processing_utils.md to Korean (#33804)
* docs: ko: image_processing_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-08 18:19:37 -07:00
d31d076b53 🌐 [i18n-KO] Translated output.md to Korean (#33607)
* nmt draft

* fix toctree

* minor fix

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Update docs/source/ko/main_classes/output.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-08 18:19:21 -07:00
109b1e7591 🌐 [i18n-KO] Translated blip.md to Korean (#33515)
* docs: ko: model_doc/blip

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/model_doc/blip.md

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2024-10-08 17:59:31 -07:00
5809b43a62 🌐 [i18n-KO] Translated biogpt.md to Korean (#33773)
* docs: ko: biogpt.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestion

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-08 17:57:51 -07:00
c674f2e313 🌐 [i18n-KO] Translated openai-gpt.md to Korean (#33801)
* docs: ko: openai-gpt.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-08 17:57:33 -07:00
c15d01fa1d 🌐 [i18n-KO] Translated file_utils.md to Korean (#33803)
* docs: ko: file_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
2024-10-08 17:57:17 -07:00
f0f8077025 🌐 [i18n-KO] Translated swin.md to Korean (#33510)
* ko: doc: model_doc/swin.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/model_doc/swin.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* resolve conflicts

* resolve conflicts - 2

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2024-10-08 17:57:03 -07:00
0d0ec1dbfb 🌐 [i18n-KO] Translated tokenization_utils.md to Korean (#33813)
* docs: ko: tokenization_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-08 17:56:30 -07:00
386401eca0 🌐 [i18n-KO] Translated main_classes/onnx.md to Korean (#33601)
* docs: ko: main_classes/onnx.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
2024-10-08 17:15:46 -07:00
db5f117b8a 🌐 [i18n-KO] Translated model_doc/deberta-v2.md to Korean (#33968)
* docs: ko: model_doc/deberta-v2.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
2024-10-08 17:15:33 -07:00
cd9a3c49b8 🌐 [i18n-KO] Translated model_doc/dbrx.md to Korean (#33951)
* docs: ko: model_doc/dbrx.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
2024-10-08 17:14:42 -07:00
d6d07f9c77 🌐 [i18n-KO] Translated model_doc/cohere.md to Korean (#33885)
* docs: ko: model_doc/cohere.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
2024-10-08 17:14:25 -07:00
48e80284fa 🌐 [i18n-KO] Translated model_doc/mistral.md to Korean (#33648)
* docs: ko: model_doc/mistral.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:14:12 -07:00
adb14b93f4 🌐 [i18n-KO] Translated model_doc/llama3.md to Korean (#33635)
* docs: ko: model_doc/llama3.md

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:13:57 -07:00
291e707868 🌐 [i18n-KO] Translated model_doc/paligemma.md to Korean (#33612)
* docs: ko: model_doc/paligemma.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:13:25 -07:00
dd43dafa39 🌐 [i18n-KO] Translated model_doc/clip.md to Korean (#33610)
* docs: ko: model_doc/clip.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:13:07 -07:00
acde6c7d9d 🌐 [i18n-KO] Translated model_doc/patchtsmixer.md to Korean (#33587)
* docs: ko: model_doc/patchtsmixer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:11:48 -07:00
bb825dde73 🌐 [i18n-KO] Translated model_doc/autoformer.md to Korean (#33574)
* docs: ko: model_doc/autoformer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions
2024-10-08 17:11:19 -07:00
1d458437dd 🌐 [i18n-KO] Translated model_doc/mamba.md to Korean (#33626)
* docs: ko: model_doc/mamba.md

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:11:11 -07:00
47da2c528b 🌐 [i18n-KO] Translated main_classes/configuration.md to Korean (#33952)
* docs: ko: main_classes/configuration.md

* feat: nmt draft
2024-10-08 17:11:02 -07:00
2e8de976bd 🌐 [i18n-KO] Translated main_classes/quantization.md to Korean (#33959)
* docs: ko: main_classes/quantization.md

* feat: nmt draft

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

* fix: resolve suggestions

---------

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-08 17:10:41 -07:00
2fe77783c3 🌐 [i18n-KO] Translated rag.md to Korean (#33989)
* fix: toctree edits

* feat: nmt-draft

* fix: edit Inline TOC
2024-10-08 17:10:26 -07:00
1ed98773e5 🌐 [i18n-KO] Translated gpt_neox_japanese.md to Korean (#33894)
* docs: ko: gpt_neox_japanese.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/model_doc/gpt_neox_japanese.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/model_doc/gpt_neox_japanese.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/model_doc/gpt_neox_japanese.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

---------

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
2024-10-08 17:08:06 -07:00
79af52ad9a 🌐 [i18n-KO] Translated bertweet.md to Korean (#33891)
* docs: ko: bertweet.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/model_doc/bertweet.md

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
2024-10-08 17:07:13 -07:00
d49999ce11 🌐 [i18n-KO] Translated feature_extractor.md to Korean (#33775)
* docs: ko: feature_extractor.md

* feat: nmt draft

* fix: manual edits
2024-10-08 17:06:56 -07:00
573942d96a Fix trainer_seq2seq.py's __init__ type annotations (#34021)
* Fix `trainer_seq2seq.py`'s `__init__` type annotations

* Update src/transformers/trainer_seq2seq.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Fix issue pointed out by `muellerzr`

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-10-08 16:43:30 -04:00
04b4e441dc Remove decoder_config=None (#34014)
* remove unnecessary line

* changed to the right one
2024-10-08 15:57:12 +02:00
1909def2de fix awq tests due to ipex backend (#34011)
fix awq tests
2024-10-08 15:56:05 +02:00
4f2bf135af Fix typing issue (#34012) 2024-10-08 15:15:40 +02:00
f4b741d674 Fixup DeepSpeed things (#34007) 2024-10-08 09:04:24 -04:00
17806d11ba Improve modular converter (#33991)
* improve modular

* style

* Update modular_model_converter.py

* pretty print warning

* style

* Support to remove unused classes as part of added dependencies as well

* nits

* correct bug

* add example

* style

* Add documentation
2024-10-08 14:53:58 +02:00
fb360a6c7a BatchFeature.to() supports non-tensor keys (#33918)
* Fix issue in oneformer preprocessing

* [run slow] oneformer

* [run_slow] oneformer

* Make the same fixes in DQA and object detection pipelines

* Fix BatchFeature.to() instead

* Revert pipeline-specific changes

* Add the same check in Pixtral's methods

* Add the same check in BatchEncoding

* make sure torch is imported
2024-10-08 13:43:32 +01:00
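A minimal sketch of the fixed behavior: BatchFeature.to() now moves only tensor entries and passes non-tensor values (such as OneFormer's list-of-strings task inputs) through untouched instead of raising. The keys below are illustrative.

```
# Sketch of the post-fix behavior; keys are illustrative.
import torch
from transformers import BatchFeature

batch = BatchFeature({
    "pixel_values": torch.randn(1, 3, 224, 224),
    "task_inputs": ["semantic"],  # non-tensor entry that used to break .to()
})
batch = batch.to("cpu")  # tensors are moved; the string list is passed through
assert isinstance(batch["task_inputs"][0], str)
```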
3b44d2f042 Image pipelines spec compliance (#33899)
* Update many similar visual pipelines

* Add input tests

* Add ImageToText as well

* Add output tests

* Add output tests

* Add output tests

* OutputElement -> Output

* Correctly test elements

* make fixup

* fix typo in the task list

* Fix VQA testing

* Add copyright to image_classification.py

* Revert changes to VQA pipeline because outputs have differences - will move to another PR

* make fixup

* Remove deprecation warnings
2024-10-08 13:34:28 +01:00
e2001c3413 Add auto model for image-text-to-text (#32472)
* Add Auto model for image-text-to-text

* Remove donut from processing auto, add chameleon to image-text-to-text models

* add qwen2_vl and llava_onevision

* add pixtral to auto model for image-text-to-text

* add mllama and idefics3

* remove models in IGNORE_NON_AUTO_CONFIGURED

* add AutoModelForImageTextToText to tests and doc
2024-10-08 14:26:43 +02:00
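Sketch of the new auto class, with an example checkpoint id; any of the mapped models (llava, qwen2_vl, pixtral, mllama, idefics3, ...) should resolve the same way.

```
# Hedged sketch; the checkpoint id is an example.
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id)
```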
0dbc7090ba Processors: don't default padding side (#33942)
* don't default padding side

* fix
2024-10-08 10:58:49 +02:00
a3add29097 Add support for __all__ and potentially deleting functions (#33859)
* Add support for __all__ and potentially deleting functions

* updates

* update

* nits

* remove dummies

* fix warning

* fixup

* style

* update

* fixup

* skip copied from when # skip

* remove log

* bring dummies back

* fixup

* remove copied from

* fixup

* remove warnings from `make fix-copies`

* fix doc issues

* nits

* Better error message !

* add support for more flexible naming!

* style

* breaking style?

* fix super() renaming issues

* del not needed when you don't call super().__init__()

* style

* no more fmt on :)

* properly remove `self`

* fixup

* fix

* doc nits

* add some doc 🫡
2024-10-08 10:19:17 +02:00
bead0fa8dc Cache: slight change in naming (#32421)
* squash

* codestyle

* Update src/transformers/cache_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* propagate changes to all cache classes

* + whisper

* fix tests

* more fixes

* add deprecation warning

* fix copies

* address comments

* fix mistral also

* these didn't have "copied from"

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-10-08 09:43:40 +02:00
d6ba1ac041 🌐 [i18n-KO] Translated gemma.md to Korean (#33936)
* docs: ko: gemma.md

* feat: nmt draft

* fix: manual edits
2024-10-07 15:59:14 -07:00
46f146a2b5 🌐 [i18n-KO] Translated vit.md to Korean (#33884)
* docs: ko: model_doc/vit.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/model_doc/vit.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* Update docs/source/ko/model_doc/vit.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 15:35:11 -07:00
1ecca92f03 🌐 [i18n-KO] Translated swin2sr.md to Korean (#33795)
* ko: doc: model_doc/swin2sr.md

* feat: nmt draft

* Update docs/source/ko/model_doc/swin2sr.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2024-10-07 15:34:56 -07:00
8258219c4c 🌐 [i18n-KO] Translated auto.md to Korean (#33590)
* docs: ko: model_doc/auto.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* fix: resolve suggestions

---------

Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2024-10-07 15:34:45 -07:00
253a9a9d6f 🌐 [i18n-KO] Translated logging.md to Korean (#33543)
* docs: ko: main_classes/logging.md

* feat: nmt-draft

* fix: update toctree.yml

* Update docs/source/ko/main_classes/logging.md

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/main_classes/logging.md

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>

---------

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-10-07 15:34:34 -07:00
178d707b7e 🌐 [i18n-KO] Translated chameleon.md to Korean (#33799)
* docs: ko: chameleon.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 15:06:13 -07:00
13432f8409 🌐 [i18n-KO] Translated trainer.md to Korean (#33797)
* docs: ko: trainer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 15:05:57 -07:00
e9fbe62965 🌐 [i18n-KO] Translated pipelines_utils.md to Korean (#33809)
* docs: ko: pipelines_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-07 15:05:17 -07:00
9c61ba2f25 🌐 [i18n-KO] Translated time_series_utils.md to Korean (#33806)
* docs: ko: time_series_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-07 15:05:00 -07:00
9c8bd3fc1b 🌐 [i18n-KO] Translated esm.md to Korean (#33796)
* docs: ko: esm.md

* feat: nmt draft

* fix: manual edits
2024-10-07 13:39:22 -07:00
6996f2186a 🌐 [i18n-KO] Translated audio_utils.md to Korean (#33802)
* docs: ko: audio_utils.md

* feat: nmt draft

* fix: manual edits
2024-10-07 13:39:10 -07:00
410c73af1d 🌐 [i18n-KO] Translated swinv2.md to Korean (#33566)
* docs: ko: model_doc/swinv2.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits
2024-10-07 12:50:43 -07:00
6c18cefed0 🌐 [i18n-KO] Translated gguf.md to Korean (#33764)
* docs: ko: gguf.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-10-07 12:49:08 -07:00
c91fe85b78 Fix undefined default_config in configuration_utils.py (#33934) 2024-10-07 18:32:20 +02:00
736c7cde51 [pytest collection] Fix flax test collection (#34004)
a bit weird, but to filter I had to use this
2024-10-07 18:11:13 +02:00
55be7c4c48 Enable customized optimizer for DeepSpeed (#32049)
* transformers: enable custom optimizer for DeepSpeed

* transformers: modify error message

---------

Co-authored-by: datakim1201 <roy.kim@maum.ai>
2024-10-07 15:36:54 +02:00
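Roughly what this change enables, sketched under assumptions ("ds_config.json" and train_dataset are placeholders): a custom optimizer can now be handed to Trainer while training under DeepSpeed, provided the DeepSpeed config does not also define one.

```
# Sketch; "ds_config.json" and train_dataset are assumed to exist.
import torch
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.LinearLR(optimizer)

args = TrainingArguments(output_dir="out", deepspeed="ds_config.json")
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder dataset
    optimizers=(optimizer, scheduler),  # custom optimizer, now allowed with DeepSpeed
)
```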
7bae833728 properly fix and RUN_SLOW (#33965)
* properly fix and RUN_SLOW

* lots of models were affected

* fix-copies

* more fixes
2024-10-07 14:45:57 +02:00
e782e95e34 Fix Tensor + Embedding error in some cases when using SiglipVisionModel (#33994)
Fix Tensor + Embedding error in some cases

Co-authored-by: kaitolucifer <kaito.o@ghelia.com>
2024-10-07 11:17:34 +02:00
9b4b0c07db [Red CIs] Fix hub failures (#34001)
maybe setup should work?
2024-10-07 10:56:24 +02:00
ad1a250719 [Docs] Add Developer Guide: How to Hack Any Transformers Model (#33979)
* docs: add example for separating q, k, v projections in SAM

* docs: How to Hack Any Transformers Model

* docs: remove changes from sam model docs

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-07 10:08:20 +02:00
f5aeb7c1a5 [Docs] Improve VLM docs (#33393)
* Improve docs

* Update docs/source/en/model_doc/llava.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/llava.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Address comment

* Address comment

* Improve pixtral docs

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-07 09:54:07 +02:00
1f33023cfa Flash-attn performance: remove cuda sync during inference (#33570)
Switch conditions to use short-circuit during inference
2024-10-07 09:52:19 +02:00
4953ddf036 Add position ids in forward pass to opt model (#33121)
* start working on adding position ids

* add docs

* Refactor modeling_biogpt.py and modeling_opt.py for code consistency

* fix 2 PR comments

* move position_ids to end of args

* remove trailing white space

* add comment with TODO

* bug fix gradient checkpointing

* fixup

* missed on position_ids

* remove _attention_to_position_ids and refactor embedding class

* remove redundant code

---------

Co-authored-by: Avishai Elmakies <avishai.elma@cs.huji.ac.il>
2024-10-07 09:20:49 +02:00
1bd604d11c [WIP] Add Tokenizer for MyT5 Model (#31286)
* Initial commit for MyT5 model

* custom implementation of MyT5 tokenizer, unused files deleted

* unittest for myt5 tokenizer

* update of import structure and style

* removed remnants of MyT5Config

* fixed docstrings

* Updates after review: filled documentation file, new docstrings and tests added

* Fixed code style issues

* fixed copied from to refer to function

* updated loading myt5 tokenizer in tests, added sample byte map file to fixtures

* changes after review

* removed redundant copied from

* removed redundant copied from

* optimization and loading model from HF

* [run_slow] myt5

* [run-slow] myt5

* Updated en documentation for myt5

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-06 10:33:16 +02:00
5ef432e474 [TF] Fix Tensorflow XLA Generation on limited seq_len models (#33903)
* fix tf xla generation on limited seq_len models

* [run-slow] opt

* [run-slow] opt
2024-10-05 16:20:50 +02:00
22e102ad98 Bug fix gguf qwen2moe (#33940)
* fix qwen2moe tensors mapping, add unit tests

* add expert tensor split logic, test refactoring

* small params refactoring

* add comment to tensor reshaping
2024-10-05 16:19:01 +02:00
56be9f1925 add test for Jamba with new model jamba-tiny-dev (#33863)
* add test for jamba with new model

* ruff fix

---------

Co-authored-by: Yehoshua Cohen <yehoshuaco@ai21.com>
2024-10-05 16:03:12 +02:00
a7e4e1a77c Updating char_to_token documentation to note behaviour when trim_offsets is True (#33919)
Updating char_to_token documentation.
2024-10-05 14:13:26 +02:00
612065efeb Paligemma: fix static cache test (#33941)
* fix

* not flaky anymore + style
2024-10-05 09:47:37 +02:00
38f9f10dd9 Cache: revert DynamicCache init for BC (#33861)
* tmp commit

* tmp commit

* make fixup

* missing removal

* fix condition

* fix end-to-end compilation

* if -> elif

* BC

* BC

* use @deprecate_kwarg("num_hidden_layers", version="4.47.0")

* wups the import

* 🥴

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-04 22:47:08 +02:00
f92d354823 fix red check-copies (#33964) 2024-10-04 22:45:37 +02:00
f319ba16fa Add Zamba (#30950)
* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* Moved mamba init into `_init_weights`

* Update index.md

* Rebase

* Rebase

* Updates from make fixup

* Update zamba.md

* Batched inference

* Update

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/model_doc/zamba.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update configuration_zamba.py

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Update configuration_zamba.py

* Update modeling_zamba.py

* Update modeling_zamba.py

* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba

* Update ZambaForCausalLM

* Moved mamba init into `_init_weights`

* Update ZambaForCausalLM

* Describe diffs with original mamba layer

* make fixup fixes

* quality test fixes

* Fix Zamba model path

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* circleci fixes

* Update

* circleci fixes

* fix zamba test from merge

* fix ValueError for disabling mamba kernels

* add HF copyright

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* shared_transf --> shared_transformer

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fixes

* Move attention head dim to config

* Fix circle/ci tests

* Update modeling_zamba.py

* apply GenerationMixin inheritance change from upstream

* apply import ordering

* update needed transformers version for zamba

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add contribution author

* add @slow to avoid CI

* Update src/transformers/models/zamba/modeling_zamba.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Define attention_hidden_size

* Added doc for attention_head_size

* trigger CI

* Fix doc of attention_hidden_size

* [run-slow] zamba

* Fixed shared layer logic, swapped up<->gate in mlp

* shared_transformer -> shared_transf

* reformat HybridLayer __init__

* fix docstrings in zamba config

* added definition of _get_input_ids_and_config

* fixed formatting of _get_input_ids_and_config

---------

Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-10-04 22:28:05 +02:00
e3775539c8 PhiMoE (#33363)
* onboard phimoe model

* removed debug code

* added unit tests

* updated docs

* formatted

* fixed unit tests

* fixed test case

* fixed format

* refactored code

* fixed expected outputs in the integration tests

* Added a warning msg

* Addressed comments

* Addressed comments

* fixed test cases

* added paper link

* Addressed comments

* Refactored PhimoeForCausalLM forward fn

* Refactored PhimoeRotaryEmbedding class

* fixed test cases

* fixed testcase

* fixed test case

* Addressed comments

* fixed test cases

* fixed testcases

* Used cache position instead to get the seq len
2024-10-04 21:39:45 +02:00
46579c0e77 hot fix self.position_embeddings->self.position_embedding (#33958) 2024-10-04 21:35:31 +02:00
0d1692a49b Fix attn mask ignore logic in training-time trace (#32613)
* fix attn mask logic for training-time trace

* add test

* fix

* fix

* fix

* fix

* fix

* format

* [run-slow] llama

* avoid accelerate

* [run-slow] llama
2024-10-04 19:00:45 +02:00
614660fdb9 Removed unnecessary transpose in Switch Transformer Routing (#33582)
removed switch transformer routing transpose
2024-10-04 17:39:03 +02:00
78ef58325c 🔴 🚨 Resizing token embeddings: initialize from old embeddings' normal distribution. (#33325)
* initialize new embeddings from a normal distribution

* Fix typo in comments

* Fix typo in comments

* Fix style

* Fix variables naming

* Add tests

* Fix style

* code consistency nit

* Add deepspeed support

* Add deepspeed support

* Convert embedding weights to float32 before computations

* Add deepspeed tests

* Cover when vocab_size is smaller than embedding_size

* Style fix

* Add tests for vocab_size smaller than hidden_size

* Style fix

* Nits in tests

* Nits in tests

* Check for deepspeed before importing it

* Increase vocab_size for positive definite covariance matrix test

* Add warning

* Add multivariate_resizing flag and implement resizing for lm_heads

* Fix typo

* Fix wrong bias indexing

* Fix bias is zero check

* remove multivariate_resizing flag from tests

* Initialize bias from the old bias' normal distribution

* Fixup

* Code usability

* Use mean_resizing instead of multivariate_resizing

* Fix up

* Fix comments and docs
2024-10-04 16:29:55 +02:00
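The breaking change above, sketched: added embedding rows are now initialized from the mean (and covariance) of the existing embeddings, and mean_resizing=False restores the old config-based normal init.

```
# Sketch of the new resize_token_embeddings behavior; "gpt2" is an example checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.add_tokens(["<special_marker>"])

model.resize_token_embeddings(len(tokenizer))                       # mean-based init (new default)
model.resize_token_embeddings(len(tokenizer), mean_resizing=False)  # previous N(0, std) init
```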
b916efcb3c Enables CPU AWQ model with IPEX version. (#33460)
* enable cpu awq ipex linear

* add doc for cpu awq with ipex kernel

* add tests for cpu awq

* fix code style

* fix doc and tests

* Update docs/source/en/quantization/awq.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/autoawq/test_awq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix comments

* fix log

* fix log

* fix style

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-04 16:25:10 +02:00
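Sketch of the CPU path added above, assuming an existing AWQ checkpoint (the repo id is an example) and an installed intel-extension-for-pytorch:

```
# Hedged sketch; repo id is an example AWQ checkpoint.
from transformers import AutoModelForCausalLM, AwqConfig

quantization_config = AwqConfig(version="ipex")
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/TinyLlama-1.1B-Chat-v1.0-AWQ",  # example AWQ repo
    quantization_config=quantization_config,
    device_map="cpu",
)
```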
de4112e4d2 Add a section on writing tool templates to the chat template docs (#33924)
* Add a section on writing tool templates to the chat template docs

* Small cleanups
2024-10-04 14:40:44 +01:00
2e719e35fd [PR run-slow] (#33939)
* force latest torch

* Update .github/workflows/self-pr-slow-ci.yml

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2024-10-04 14:46:15 +02:00
061c2c4c38 Ignore keys on validate_rope (#33753)
* ignore keys on check rope

* add tests

* fix tests, so maybe better leave at logger lvl
2024-10-04 12:39:37 +02:00
4a173b88b5 [i18n-ru] Fixes typo in the README_ru.md (#33882) 2024-10-04 11:21:38 +02:00
b6a01df6e9 [Doc]: Broken link in Kubernetes doc (#33879)
* add relative path in .md and redirects to conf.py

* add redirects to conf.py and update .md

* modify links in .md
2024-10-04 11:20:56 +02:00
124713c32b Fix distil whisper segment computation (#33920)
* Fix distil whisper segment computation

* [run-slow] whisper
2024-10-04 11:18:01 +02:00
2bd4d5897d Minor error condition bug fix (#33781)
* Error condition bug fix

* Update error message

* Update src/transformers/models/qwen2_vl/modeling_qwen2_vl.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Making change in the rest of the repo

* Formatting

* Formatting with ruff

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-04 08:25:32 +02:00
550673a70c Remove logits.float() (#33902)
* Remove logits.float() if not computing loss

* Remove warning about 4.46 logits dtype change if not computing loss
2024-10-04 08:21:12 +02:00
074aa3b3fd Uniformize kwargs for Idefics/2 processors (#32568)
* Add uniformize idefics processor kwargs and tests

* Uniformize idefics2 processor kwargs

* add image_processor tests idefics

* add BC args order change idefics2 processor and update doc

* Add support for multiple images per prompt in image-text-to-text mode idefics

* Fix processor input args in idefics tests

* improve test processing common, remove unnecessary tests, update process uniformization

* fix docstrings idefics

* fix tests processors idefics/2
2024-10-03 18:08:24 +02:00
b0c5660e88 Config: lower save_pretrained exception to warning (#33906)
* lower to warning

* msg

* make fixup

* rm extra comma
2024-10-03 16:45:14 +01:00
15a4d24805 Add support for weights_only flag when loading state_dict (#32481)
* Add support for `weights_only` flag when loading state_dict

Summary:
This is to enable loading a state_dict with wrapper tensor subclasses (used in torchao
for quantized weights)

Test Plan:
tested locally with torchao weights, also need https://github.com/huggingface/transformers/pull/32306:
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import TorchAoConfig
from torchao.utils import benchmark_model
import torchao

DEVICE_TYPE = "cuda"

def init_model_and_benchmark(model_id, torch_dtype=torch.bfloat16, quantization_config=None):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    if quantization_config is not None:
        model = AutoModelForCausalLM.from_pretrained(model_id, device_map=DEVICE_TYPE, torch_dtype=torch.bfloat16, quantization_config=quantization_config)
    else:
        model = AutoModelForCausalLM.from_pretrained(model_id, device_map=DEVICE_TYPE, torch_dtype=torch.bfloat16, weights_only=False)

    # sanity check: run the model
    input_text = "What are we having for dinner?"
    input_ids = tokenizer(input_text, return_tensors="pt").to(DEVICE_TYPE)
    output = model.generate(**input_ids, max_new_tokens=1000)
    print(tokenizer.decode(output[0], skip_special_tokens=True))

    NUM_WARMUP = 1
    NUM_RUNS = 5

    if quantization_config is not None:
        torchao.quantization.utils.recommended_inductor_config_setter()

    model = torch.compile(model, mode="max-autotune")

    benchmark_model(model.generate, NUM_WARMUP, kwargs=input_ids, device_type=DEVICE_TYPE)
    print("running benchmark")
    results = benchmark_model(model.generate, NUM_RUNS, kwargs=input_ids, device_type=DEVICE_TYPE)
    return model, results

model_id = "jerryzh168/test-model"
torchao.quantization.utils.recommended_inductor_config_setter()
bf16_model, bf16_time = init_model_and_benchmark(model_id)
print(f"bf16: {bf16_time}")
```

Reviewers:

Subscribers:

Tasks:

Tags:

* format
2024-10-03 17:03:42 +02:00
a220c5b99f add setter for trainer processor (#33911)
* add setter for trainer processor

* Update src/transformers/trainer.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2024-10-03 16:34:10 +02:00
6500f78c86 [PEFT] Support low_cpu_mem_usage option for PEFT loading adapters (#33725)
* [PEFT] Support low_cpu_mem_usage for PEFT loading

PEFT added support for low_cpu_mem_usage=True when loading adapters in
https://github.com/huggingface/peft/pull/1961. This feature is now
available when installing PEFT v0.13.0. With this PR, this option is
also supported when loading PEFT adapters directly into transformers
models.

Additionally, with this PR,
https://github.com/huggingface/diffusers/pull/9510 will be unblocked,
which implements this option in diffusers.

* Fix typo
2024-10-03 16:15:36 +02:00
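A sketch of the new option (both repo ids are placeholders; requires peft >= 0.13.0): the flag is forwarded when loading a PEFT adapter directly into a transformers model.

```
# Sketch; both repo ids are placeholders.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("some-org/base-model")
model.load_adapter("some-user/lora-adapter", low_cpu_mem_usage=True)
```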
bf0ffe3d29 [Tests] Diverse Whisper fixes (#33665)
* fix beam indices in token_timestamps

* fix attention_mask in FA2

* correct translation example with the right example

* correct how somes tests are using outputs + correct num_frames

* fix shortform batch prev cond tests

* make fix-copies

* make fix-copies

* take care of shifting beam indices

* [run-slow] whisper

* [run-slow] whisper
2024-10-03 15:59:01 +02:00
ab97a78130 Fix: use unidic-lite instead of ipadic as the tokenizer dictionary for Japanese (#33372)
* Fix: use unidic-lite instead of ipadic as the tokenizer dictionary for Japanese

Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local>

* fix the default name

---------

Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local>
Co-authored-by: Kan Takahiro <kan@Kans-Mac-mini.local>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-03 15:30:03 +02:00
d29738f5b4 Generate tests: modality-agnostic input preparation (#33685) 2024-10-03 14:01:24 +01:00
f2bf4fcf3d Add SplinterTokenizer unit test (#32652)
* add unit tests for splinter_tokenizer

* add unit test for splinter tokenizer; pass in the question_token to be saved when save_pretrained is called

* remove unused import

* remove vocab_splinter.txt, add Copied from, use fmt:on and fmt:off to prevent autoformatting on long lines

* remove all the spaces

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-03 14:49:56 +02:00
95a2f5f6c3 Fix module initialization for root module under Zero3 (#33632)
* Use all state dict keys when checking if root module is initialized.

* Apply style corrections

* Add comment explaining change.

* Change comment phrasing.
2024-10-03 14:41:50 +02:00
4df3ccddb7 Migrate the CI runners to the new clusters (#33849)
* try fixing push-ci

* move to new runners

* move benchmark.yml to new runners

* move doctest_job.yml to new runners

* move doctests.yml to new runners

* move push-important-models.yml to new runners

* move self-pr-slow-ci.yml to new runners

* fix typo

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* fix working directory

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* fix working directory

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* improve code

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2024-10-03 14:39:49 +02:00
6f0ce52760 VLM Generate: tag test_static_cache_matches_dynamic as flaky (#33630)
flaky
2024-10-03 12:27:02 +01:00
f1a5f81296 Update a KeyError on _save_checkpoint to prevent confusion about missing … (#33832)
* Update a KeyError on _save_checkpoint to prevent confusion about missing metric keys

* Fix grammar error and case sensitivity.

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add KeyError update on _evaluate function to align with _save_checkpoint function

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-03 10:27:49 +02:00
dc8156fdd8 Fix dt proj bias reassigned (#33314)
* When we set self.dt_proj.bias = None, it removes the bias parameter from the model. When we later tried to assign a tensor to self.dt_proj.bias, it caused a TypeError because PyTorch expects a Parameter object.
2024-10-03 09:51:03 +02:00
d7950bff82 uniformize processor Mllama (#33876)
* uniformize processor Mllama

* nit syntax

* nit
2024-10-02 16:50:15 +02:00
62e8c759c3 rename all test_processing_*.py to test_processor_*.py (#33878)
* rename all test_processing_*.py to test_processor_*.py and fix duplicate test processor paligemma

* fix copies

* fix broken tests

* fix-copies

* fix test processor bridgetower
2024-10-02 16:43:43 +02:00
2f25ab95db Handle Trainer tokenizer kwarg deprecation with decorator (#33887)
* Handle deprecation with decorator

* Fix for seq2seq Trainer
2024-10-02 15:28:20 +01:00
ee71c9853a Optim deformable detr (#33600)
* optimize deformable detr

* fix copies

* remove deformable_detr_basline

* fix hardcoded float16 and .float()

* [run slow] deformable-detr,grounding-dino,mask2former,oneformer,rt-detr

* [run slow] deformable_detr,grounding_dino,mask2former,oneformer,rt_detr
2024-10-02 15:46:27 +02:00
cac4a4876b [Quantization] Switch to optimum-quanto (#31732)
* switch to optimum-quanto rebase squach

* fix import check

* again

* test try-except

* style
2024-10-02 15:14:34 +02:00
b7474f211d Trainer - deprecate tokenizer for processing_class (#32385)
* Trainer - deprecate tokenizer for processing_class

* Extend chage across Seq2Seq trainer and docs

* Add tests

* Update to FutureWarning and add deprecation version
2024-10-02 14:08:46 +01:00
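The deprecation above in a sketch (train_dataset is a placeholder): tokenizer= still works with a FutureWarning, while processing_class= is the new name and also accepts feature extractors and processors.

```
# Sketch; train_dataset is assumed to exist.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,  # placeholder
    processing_class=tokenizer,   # was: tokenizer=tokenizer
)
```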
e7c8af7f33 Add sdpa for DistilBert (#33724)
* Add sdpa for DistilBert

* [run_slow] distilbert

* [run_slow] distilbert

* [run_slow] distilbert

* Try without slow tests

* [run_slow] distilbert

* [run_slow] distilbert
2024-10-02 13:55:19 +01:00
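One-line sketch of opting in: DistilBERT can now be loaded with PyTorch's scaled_dot_product_attention backend.

```
# Sketch of selecting the new attention backend.
from transformers import AutoModel

model = AutoModel.from_pretrained("distilbert-base-uncased", attn_implementation="sdpa")
```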
614c79a9b0 Fix kwargs passed by AutoQuantizationConfig.from_pretrained (#33798)
fix kwargs

Co-authored-by: kylesayrs <kyle@neuralmagic.com>
2024-10-02 14:12:03 +02:00
b09234cfc1 Allow for nightly packages of compressed_tensors (#33828)
* only check spec

* correct typo in nightly package name
2024-10-02 14:11:44 +02:00
fe484726aa Add falcon gguf (#33437)
* feat(gguf): add falcon q2 k

* fix(gguf): remove useless renaming

* feat(gguf): separate falcon 7b and 40b

* feat(gguf): apply fixup

* fix(test): error rebase

* feat(gguf): add fp16 weight comparison for falcon

* feat(gguf): test weight of all layers

* test(gguf): add falcon 40b under skip decorator

* feat(gguf): quick example for extracting model size
2024-10-02 14:10:39 +02:00
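Sketch of loading one of the newly supported Falcon GGUF files; the repo and file names below are hypothetical placeholders, not artifacts from the PR.

```
# Hedged sketch; repo_id and gguf_file are hypothetical names.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "some-org/falcon-7b-gguf"   # hypothetical GGUF repo
gguf_file = "falcon-7b-Q2_K.gguf"     # hypothetical file name

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```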
181c962aab populate quantization_config for kv-cache-scheme only configs (#33874) 2024-10-02 14:06:40 +02:00
e5d14f39ad Don't run reminder bot for now (#33883)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-02 11:51:01 +02:00
50290cf7a0 Uniformize model processors (#31368)
* add initial design for uniform processors + align model

* add uniform processors for altclip + chinese_clip

* add uniform processors for blip + blip2

* fix mutable default 👀

* add configuration test

* handle structured kwargs w defaults + add test

* protect torch-specific test

* fix style

* fix

* rebase

* update processor to generic kwargs + test

* fix style

* add sensible kwargs merge

* update test

* fix assertEqual

* move kwargs merging to processing common

* rework kwargs for type hinting

* just get Unpack from extensions

* run-slow[align]

* handle kwargs passed as nested dict

* add from_pretrained test for nested kwargs handling

* [run-slow]align

* update documentation + imports

* update audio inputs

* protect audio types, silly

* try removing imports

* make things simpler

* simplerer

* move out kwargs test to common mixin

* [run-slow]align

* skip tests for old processors

* [run-slow]align, clip

* !$#@!! protect imports, darn it

* [run-slow]align, clip

* [run-slow]align, clip

* update common processor testing

* add altclip

* add chinese_clip

* add pad_size

* [run-slow]align, clip, chinese_clip, altclip

* remove duplicated tests

* fix

* add blip, blip2, bridgetower

Added tests for bridgetower, which override the common ones. Also modified common
tests to force center cropping when present

* fix

* update doc

* improve documentation for default values

* add model_max_length testing

This parameter depends on tokenizers received.

* Raise if kwargs are specified in two places

* fix

* removed copied from

* match defaults

* force padding

* fix tokenizer test

* clean defaults

* move tests to common

* add missing import

* fix

* adapt bridgetower tests to shortest edge

* uniformize donut processor + tests

* add wav2vec2

* extend common testing to audio processors

* add testing + bert version

* propagate common kwargs to different modalities

* BC order of arguments

* check py version

* revert kwargs merging

* add draft overlap test

* update

* fix blip2 and wav2vec due to updates

* fix copies

* ensure overlapping kwargs do not disappear

* replace .pop by .get to handle duplicated kwargs

* fix copies

* fix missing import

* clearly add wav2vec2_bert to uniformized models

* fix copies

* increase number of features

* fix style

* [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert

* [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert

* fix concatenation

* [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert

* Update tests/test_processing_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* 🧹

* address comments

* clean up + tests

* [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-02 10:41:08 +02:00
2292be6c1b Fix: typo (#33880)
Update llm_tutorial.md: typo
2024-10-02 09:12:21 +01:00
61ac161a9d Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711)
* add support for custom inputs and batched inputs in ProcessorTesterMixin

* Fix batch_size behavior ProcessorTesterMixin

* Change format prepare inputs batched

* Remove override test pixtral processor

* Remove unnecessary tests and cleanup after new prepare_inputs functions

* Fix instructBlipVideo image processor
2024-10-01 23:52:03 +02:00
1baa08897d Repo consistency fix after #33339 (#33873)
* Repo consistency fix after #33339

* [run-slow] omdet_turbo
2024-10-01 21:03:15 +01:00
68a2b50069 [Fix] ViViT interpolate_pos_encoding (#33815)
* fix:test_inference_interpolate_pos_encoding

* style:make style;make fixup

* test: add suggestion to test_modeling_vivit

* chore:add suggestions

* style:make style

* [run_slow] vivit

* ci:slow test fix

* [run_slow] vivit
2024-10-01 20:14:35 +01:00
8635802af9 Move weight initialization deformabledetr (#33339)
* fix(copy): fixup copy

* fix(deformable_detr): move weight initialization to the right place

* fix(grounding_dino): move weight initialization to the right place

* fix(rt_detr): move weight initialization to the right place

* [run-slow] deformable_detr, grounding_dino, rt_detr
2024-10-01 20:08:57 +01:00
a43e84cb3b Make ASR pipeline compliant with Hub spec + add tests (#33769)
* Remove max_new_tokens arg

* Add ASR pipeline to testing

* make fixup

* Factor the output test out into a util

* Full error reporting

* Full error reporting

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Small comment

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-10-01 18:15:04 +01:00
0256520794 fix: repair depth estimation multiprocessing (#33759)
* fix: repair depth estimation multiprocessing

* test: add test for multiprocess depth estimation
2024-10-01 17:59:59 +01:00
f205da9660 Avoid using context that is not accessible to external contributors (#33866)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-01 17:42:45 +02:00
0c4c2d7e07 Add include_loss_for_metrics (#33088)
* Add include_loss_for_metrics

* Fix styling

* Initialize inputs and losses to avoid AttributeError

* Ruff styling

* Refactor compute_metrics and update EvalPrediction

* Change Naming

* Added include_for_metrics to group both args

* Fix style

* Change warnings to logger

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-01 16:51:41 +02:00
5f9f58fc59 Validate the eval dataset in advance. (#33743)
* Validate the eval dataset in advance.

* format

* format

* format

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* format

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-01 16:45:06 +02:00
f8110a6ddf Raise accelerate dependency error in case of defaulting low_cpu_mem_usage=True (#33830)
Clarify warning, add import check
2024-10-01 16:44:38 +02:00
326b2bad1c This PR contains additional changes for #33143 (#33581)
* fix: Fix optimizer bug in ModelCard

* fix: fix W293

* Fixes in modelcard.py for issue #33143

---------

Co-authored-by: moontidef <53668275+relic-yuexi@users.noreply.github.com>
2024-10-01 16:42:30 +02:00
b1c914e463 Fix device mismatch errors (#33851)
fix device mismatch errors
2024-10-01 15:55:57 +02:00
ac28a23b3d Workaround for bark issue in pipelines (#33824)
* Quick workaround for bark + generation_config issue

* make fixup

* [run slow] bark
2024-10-01 14:40:12 +01:00
acdfdd9387 add attention weight up-cast to float32 in chameleon (#33822)
add attention weight float32 cast  in chameleon
2024-10-01 15:19:16 +02:00
351873a145 fix: skip dropout in eval for flash_attn in various models (#33844)
* fix(m2m_100): skip dropout in eval for flash_attn

* fix(misc): skip dropout in eval for flash attn various models

* chore(m2m_100): copy flash attn from bart

* chore: run make fix-copies

* [run-slow] bart, m2m_100
2024-10-01 14:39:21 +02:00
88d960937c Refactor image features selection in LlaVa (#33696)
* refactor image features selection

* break line

* remove whitespace

* add pr comments: include projection and rename function

* make fix-copies

* fix get_image_feature in vip llava
2024-10-01 14:37:31 +02:00
22266be970 Generate: move llama prepare_inputs_for_generation to GenerationMixin (#33677) 2024-10-01 12:32:54 +01:00
d19ab15421 post reminder comment only once (#33848)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-01 12:52:53 +02:00
fbde09c8c9 fix check for hidden size in text model for deepspeed zero3 auto entries (#33829)
* fix check for hidden size in text model for deepspeed zero3 auto entries

* fix typo
2024-10-01 12:28:26 +02:00
808997a634 Fix passing str dtype to static cache (#33741)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-01 09:50:17 +02:00
c269c5c74d Fix Mamba slow path bug with dtype mismatch. (#32691)
* Fix Mamba slow path bug with dtype mismatch.

* Update test_modeling_mamba.py

* Improve style.

* Fix issue with cache position of dtype mismatch test.

* Change test for slow path.

* Revert changes.

* Switch to buggy code and add test to catch it.

* Fix the dtype mismatch bug and add test code to verify it.

* Fix minor bug with test.

* Fix incorrect dtype of model output.

* Fix incorrect dtype of cache.

* Fix incorrect dtype of ssm cache.

* Fix incorrect dtype of conv state.

* Remove assertion for ssm state.

* Add assertion for conv state dtype.

* Fix all issues with dtype mismatch test.
2024-10-01 09:28:40 +02:00
570c89625b Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/lxmert (#33821)
Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 21:57:57 +02:00
90dca5a71b minor typo fix (#33784)
fix typo
2024-09-30 21:42:22 +02:00
b77846a6e6 Fix link in gguf.md (#33768)
Change hyphen to underscore for URL in link to convert_hf_to_gguf.py
2024-09-30 20:17:33 +02:00
baa765f813 Fixes for issue #33763 in idefics2 model (#33766) 2024-09-30 18:08:48 +01:00
18c5b216f1 Fix ViT-MAE decoder interpolate (#33330)
* Fix ViT-MAE decoder interpolate

* Add unit test for `interpolate_pos_encoding` w/ custom sizes

* [run_slow] vit_mae
2024-09-30 18:47:13 +02:00
1dba608df9 [modular] fixes! (#33820)
* fix converter for function definitions

* small changes

* no prints

* style
2024-09-30 16:43:55 +02:00
1d29a75a6a Add Slow CI reminder bot (#33506)
* add workflow

* update

* fix

* Update .github/workflows/slow_ci_remainder.yml

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-30 16:26:54 +02:00
f5247aca01 Hqq serialization (#33141)
* HQQ model serialization attempt

* fix hqq dispatch and unexpected keys

* style

* remove check_old_param

* revert to check HQQLinear in quantizer_hqq.py

* revert to check HQQLinear in quantizer_hqq.py

* update HqqConfig default params

* make ci happy

* make ci happy

* revert to HQQLinear check in quantizer_hqq.py

* check hqq_min version 0.2.0

* set axis=1 as default in quantization_config.py

* validate_env with hqq>=0.2.0 version message

* deprecated hqq kwargs message

* make ci happy

* remove run_expected_keys_check hack + bump to 0.2.1 min hqq version

* fix unexpected_keys hqq update

* add pre_quantized check

* add update_expected_keys to base quantizerr

* ci base.py fix?

* ci base.py fix?

* fix "quantization typo" src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix post merge

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-30 14:47:18 +02:00
4d5b458704 Fix typo in documentation (#33805)
fix typo
2024-09-30 12:02:23 +02:00
4bb49d4e00 Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 (#33456)
* Enable non-safetensor serialization and deserialization for TorchAoConfig quantized model

Summary:
After https://github.com/huggingface/huggingface_hub/pull/2440 we added non-safetensor serialization and deserialization
in huggingface, with this we can now add the support in transformers

Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor
see README for more details

Test Plan:
tested locally

Reviewers:

Subscribers:

Tasks:

Tags:

* formatting

* formatting

* minor fix

* formatting

* address comments

* comments

* minor fix

* update doc

* refactor compressed tensor quantizer
2024-09-30 11:30:29 +02:00
2e24ee4dfa Fix typing in load_balancing_loss_func function of modeling_mixtral.py. (#33641)
* fix return type

* update to union

* fix gate_logits typing

* fix num_experts type

* fix typing

* run fix-copies

* add doc for top_k

* run fix-copies

* empty commit to trigger CI
2024-09-27 18:10:07 +02:00
d3821c4aed Make audio classification pipeline spec-compliant and add test (#33730)
* Make audio classification pipeline spec-compliant and add test

* Check that test actually running in CI

* Try a different pipeline for the CI

* Move the test so it gets triggered

* Move it again, this time into task_tests!

* make fixup

* indentation fix

* comment

* Move everything from testing_utils to test_pipeline_mixin

* Add output testing too

* revert small diff with main

* make fixup

* Clarify comment

* Update tests/pipelines/test_pipelines_audio_classification.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update tests/test_pipeline_mixin.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Rename function and js_args -> hub_args

* Cleanup the spec recursion

* Check keys for all outputs

---------

Co-authored-by: Lucain <lucainp@gmail.com>
2024-09-27 17:01:06 +01:00
4973fc5769 Model addition timeline (#33762)
* Model addition timeline

* Link guide

* Update docs/source/en/add_new_model.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/add_new_model.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Review comments

* Add contact email

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-27 17:15:13 +02:00
75cd270e5e Cleanup return_text and return_full_text options in TextGenerationPipeline (#33542)
* Cleanup return_text and return_full_text options in TextGenerationPipeline

* Cleanup return_text and return_full_text options in TextGenerationPipeline

* Cleanup return_text and return_full_text options in TextGenerationPipeline

* Cleanup return_text and return_full_text options in TextGenerationPipeline

* Revert pipeline code, but update docs instead

* Restore pipeline test
2024-09-27 15:01:31 +01:00
0d09c44bd4 remove warning v2 (#33761) 2024-09-27 14:54:28 +02:00
4196590aa0 Bump torch from 1.13.1 to 2.2.0 in /examples/flax/vision (#33748)
Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-27 13:24:11 +02:00
9d200cfbee Add gguf support for bloom (#33473)
* add bloom arch support for gguf

* apply format

* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming

* optimize bloom GGUF_TENSOR_MAPPING

* implement reverse reshaping for bloom gguf

* add qkv weights test

* add q_8 test for bloom
2024-09-27 12:13:40 +02:00
3e039d3827 Paligemma support for multi-image (#33447)
* update

* Update src/transformers/models/paligemma/processing_paligemma.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update docs

* better example in tests

* support image tokens

* read token

* Update tests/models/paligemma/test_processing_paligemma.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* nit: naming

* Update docs/source/en/model_doc/paligemma.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* conflicts after rebasing

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2024-09-27 11:23:14 +02:00
55b7a0404e Make siglip examples clearer and error free (#33667)
Update siglip.md

This was already partially fixed relative to the deployed docs. But the partial fix made it inconsistent. Additionally, giving the full text ("This is a photo of...") is likely not the desired output.
2024-09-27 10:33:55 +02:00
7f9a9ca1e0 [MllamaImageProcessing] Update doc (#33747)
* update docstring

* style
2024-09-27 10:27:11 +02:00
5f4420587a [clean_up_tokenization_spaces] Pl bart was failing, updating (#33735)
`clean_up_tokenization_spaces=True` for pl bart
2024-09-27 10:26:51 +02:00
294477aafb Doc and config mismatch for DeBERTa (#33713)
* Update modeling_deberta_v2.py

* Update configuration_deberta.py

* Revert "Update modeling_deberta_v2.py"

* Revert "Update configuration_deberta.py"

* fix the config doc mismatch

---------

Co-authored-by: Fedor Krasnov <fedor.krasnov@gmail.com>
2024-09-27 10:19:46 +02:00
4f29a60bee Update Albumentations Versions (#33704)
update albumentations versions
2024-09-27 10:13:30 +02:00
1ec7a70fef fix trainer tr_loss add error (#33651) 2024-09-27 10:10:03 +02:00
e1b150862e Fix modular model converter unable to generate Processor classes (#33737)
fix: fix wrong file type for processor in `modular_model_converter.py`
2024-09-27 00:00:39 +02:00
e32521bf24 fix: add docstring for image_size in Convnextv2 config (#33734)
add docstring for image_size
2024-09-26 13:56:06 -07:00
6730485b02 clean_up_tokenization_spaces=False if unset (#31938)
* clean_up_tokenization_spaces=False if unset

* deprecate warning

* updating param for old models

* update models

* make fix-copies

* fix-copies and update bert models

* warning msg

* update prophet and clvp

* updating test since space before is arbitrarily removed

* remove warning for 4.45
2024-09-26 19:38:20 +02:00
3557f9a14a Generate: can_generate() recursive check (#33718)
* add recursive check and test warnings

* missing space

* models without can_generate
2024-09-26 18:11:14 +01:00
9f97c39384 Fix position embeddings singular/plural (#33678)
* fix position embeddings

* [run-slow] blip, blip_2, instructblip, instructblipvideo

* fix init

* [run-slow] blip, blip_2, instructblip, instructblipvideo

* fix copies

* [run-slow] blip, blip_2, instructblip, instructblipvideo

* [run-slow] blip, blip_2, instructblip, instructblipvideo

* handle exception where list + tensors are cat'd

* [run-slow] blip, blip_2, instructblip, instructblipvideo

* add missing default

* [run-slow] blip, blip_2, instructblip, instructblipvideo
2024-09-26 19:07:00 +02:00
77b47e6645 Fix docs and docstrings Omdet-Turbo (#33726)
Fix weights path in docs
2024-09-26 12:18:23 -04:00
c716fc0e48 fix: use correct var names for check_tokenizers script (#33702) 2024-09-26 17:24:46 +02:00
46841d3eb2 [MllamaProcessor] Update errors and API with multiple image (#33715)
* update error

* update and add a test

* update

* update
2024-09-26 16:33:25 +02:00
0a21381ba3 Uniformize kwargs for chameleon processor (#32181)
* uniformize kwargs of Chameleon

* fix linter nit

* rm stride default

* add tests for chameleon processor

* fix tests

* add comment on get_component

* rm Chameleon's slow tokenizer

* add check order images text + nit

* update docs and tests

* Fix LlamaTokenizer tests

* fix gated repo access

* fix wrong import

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2024-09-26 10:18:07 -04:00
f2c388e3f9 Add Idefics 3! (#32473)
* Add Idefics 3!

* fixes to make both pipelines identical

* fix for quantized models

* First pass at the review

* remove vocab size from the main config (it's still in the text_config)

* hot fix for merve

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* re-add model_type for text_config

* remove support for old_cache

* remove hidden_size from main config

* rename idefics3 HF repo

* few changes suggested in the PR

* fix to input_data_format computation

* remove overwrite of _autoset_attn_implementation following @zucchini-nlp suggestion

* improve example

* few improvements from amy's review

* big change to enable processing input images as numpy arrays

* Changes to the code to uniformize processor kwargs

* image processing tests

* image processing tests fixes and some bugs they discovered

* addressed review comments from Yoni

* fix modeling tests

* remove special tokens that are not special

* fixes tests

* skip failing tests - they also fail for idefics2

* added paper and readded the tests with multi gpu, who knows

* Update docs/source/en/model_doc/idefics3.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* review amy until image_processing_idefics3

* last comments from Amy

* review amy

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/modeling_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/idefics3.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* doc improvement - amy review

* fix runtime error during fine-tuning

* amy's review

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/idefics3/modeling_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* ruff

* amy's comment on the order

* ruff ruff

* fix copies

* square images when they are not split

* ruff :(

* Update src/transformers/models/idefics3/image_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics3/test_processing_idefics3.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix small bug introduced in refactor

* amy's image processing changes

* fixes peft tests and ruff

* modify to_pil_image from transformers. and review from emanuele.

* add modified to_pil_image

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-25 21:28:49 +02:00
f0eabf6c7d Dev release 2024-09-25 20:14:35 +02:00
a55adee890 adding positional encoder changes and tests (#32600)
* adding positional encoder changes and tests

* adding ruff suggestions

* changes added by python utils/check_copies.py --fix_and_overwrite

* removing pos_encoding added by script

* adding interpolation to clipseg

* formatting

* adding further testing to altclip and better documentation to kosmos2

* skipping test_inputs_embeds_matches_input_ids_with_generate in git model

* fixing clipseg comment suggestions

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing bridgetower test

* fixing altclip tensor output POS test

* adding ruff formatting

* fixing several tests

* formatting with ruff

* adding positional encoder changes and tests

* adding ruff suggestions

* changes added by python utils/check_copies.py --fix_and_overwrite

* removing pos_encoding added by script

* adding interpolation to clipseg

* formatting

* adding further testing to altclip and better documentation to kosmos2

* skipping test_inputs_embeds_matches_input_ids_with_generate in git model

* fixing clipseg comment suggestions

* fixing bridgetower test

* fixing altclip tensor output POS test

* adding ruff formatting

* fixing several tests

* formatting with ruff

* adding right pretrained model

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing test_inference_image_segmentation

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing test_inference_interpolate_pos_encoding for the git model as there is no vision_model_output

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* adding ruff formatting

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* adding new interpolate_pos_encoding function

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* fixing interpolate_POS function

* adapting output tensor in tests

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* modifying output tensor

* [run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip

* adding the correct tensor

* [run_slow]  clipseg

* fixing spaces

* [run_slow]  clipseg

* [run_slow]  clipseg

---------

Co-authored-by: Manuel Sanchez Hernandez <manuel.sanchez.hernandez@schibsted.com>
2024-09-25 19:05:01 +01:00
5581 changed files with 727528 additions and 687377 deletions


@@ -7,12 +7,25 @@ parameters:
nightly:
type: boolean
default: false
GHA_Actor:
type: string
default: ""
GHA_Action:
type: string
default: ""
GHA_Event:
type: string
default: ""
GHA_Meta:
type: string
default: ""
jobs:
# Ensure running with CircleCI/huggingface
check_circleci_user:
docker:
- image: python:3.10-slim
resource_class: small
parallelism: 1
steps:
- run: echo $CIRCLE_PROJECT_USERNAME
@@ -57,15 +70,15 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py
# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
# We used:
# https://circleci.com/docs/api/v2/index.html#operation/getJobArtifacts : to get the job artifacts
# We could not pass a nested dict, which is why we create the test_file_... parameters for every single job
- store_artifacts:
path: test_preparation/transformed_artifacts.json
- store_artifacts:
@@ -99,8 +112,6 @@ jobs:
- run:
name: "Retrieve Artifact Paths"
env:
CIRCLE_TOKEN: ${{ secrets.CI_ARTIFACT_TOKEN }}
command: |
project_slug="gh/${CIRCLE_PROJECT_USERNAME}/${CIRCLE_PROJECT_REPONAME}"
job_number=${CIRCLE_BUILD_NUM}
@@ -109,7 +120,7 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py
# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
@@ -145,7 +156,7 @@ jobs:
path: ~/transformers/installed.txt
- run: python -c "from transformers import *" || (echo '🚨 import failed, this means you introduced unprotected imports! 🚨'; exit 1)
- run: ruff check examples tests src utils
- run: ruff format tests src utils --check
- run: ruff format examples tests src utils --check
- run: python utils/custom_init_isort.py --check_only
- run: python utils/sort_auto_mappings.py --check_only
- run: python utils/check_doc_toc.py
@@ -170,23 +181,34 @@ jobs:
path: ~/transformers/installed.txt
- run: python utils/check_copies.py
- run: python utils/check_modular_conversion.py
- run: python utils/check_table.py
- run: python utils/check_dummies.py
- run: python utils/check_repo.py
- run: python utils/check_inits.py
- run: python utils/check_pipeline_typing.py
- run: python utils/check_config_docstrings.py
- run: python utils/check_config_attributes.py
- run: python utils/check_doctest_list.py
- run: make deps_table_check_updated
- run: python utils/update_metadata.py --check-only
- run: python utils/check_docstrings.py
- run: python utils/check_support_list.py
workflows:
version: 2
setup_and_quality:
when:
not: <<pipeline.parameters.nightly>>
and:
- equal: [<<pipeline.project.git_url>>, https://github.com/huggingface/transformers]
- not: <<pipeline.parameters.nightly>>
jobs:
- check_circleci_user
- check_code_quality
- check_repository_consistency
- fetch_tests
setup_and_quality_2:
when:
not:
equal: [<<pipeline.project.git_url>>, https://github.com/huggingface/transformers]
jobs:
- check_circleci_user
- check_code_quality


@@ -16,10 +16,9 @@
import argparse
import copy
import os
import random
from dataclasses import dataclass
from typing import Any, Dict, List, Optional
import glob
from typing import Any, Optional
import yaml
@@ -28,36 +27,70 @@ COMMON_ENV_VARIABLES = {
"TRANSFORMERS_IS_CI": True,
"PYTEST_TIMEOUT": 120,
"RUN_PIPELINE_TESTS": False,
"RUN_PT_TF_CROSS_TESTS": False,
"RUN_PT_FLAX_CROSS_TESTS": False,
# will be adjusted in `CircleCIJob.to_dict`.
"RUN_FLAKY": True,
"DISABLE_SAFETENSORS_CONVERSION": True,
}
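# In `CircleCIJob.to_dict` (further down in this file), these defaults are copied per
# job and then specialized; a minimal sketch mirroring that code:
#
#     env = COMMON_ENV_VARIABLES.copy()
#     # Do not run tests decorated by @is_flaky on pull requests
#     env["RUN_FLAKY"] = os.environ.get("CIRCLE_PULL_REQUEST", "") == ""
#     env.update(self.additional_env)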
# Disable the use of {"s": None} as the output is way too long, making navigation on CircleCI impractical
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "dist": "loadfile", "vvv": None, "rsf":None}
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "vvv": None, "rsfE":None}
DEFAULT_DOCKER_IMAGE = [{"image": "cimg/python:3.8.12"}]
# Strings that commonly appear in the output of flaky tests when they fail. These are used with `pytest-rerunfailures`
# to rerun the tests that match these patterns.
FLAKY_TEST_FAILURE_PATTERNS = [
"OSError", # Machine/connection transient error
"Timeout", # Machine/connection transient error
"ConnectionError", # Connection transient error
"FileNotFoundError", # Raised by `datasets` on Hub failures
"PIL.UnidentifiedImageError", # Raised by `PIL.Image.open` on connection issues
"HTTPError", # Also catches HfHubHTTPError
"AssertionError: Tensor-likes are not close!", # `torch.testing.assert_close`, we might have unlucky random values
# TODO: error downloading tokenizer's `merged.txt` from hub can cause all the exceptions below. Throw and handle
# them under a single message.
"TypeError: expected str, bytes or os.PathLike object, not NoneType",
"TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType",
"Converting from Tiktoken failed",
"KeyError: <class ",
"TypeError: not a string",
]
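# A minimal sketch of how these patterns are consumed in `CircleCIJob.to_dict` below,
# where they become pytest-rerunfailures flags:
#
#     joined_flaky_patterns = "|".join(FLAKY_TEST_FAILURE_PATTERNS)
#     repeat_on_failure_flags = f"--reruns 5 --reruns-delay 2 --only-rerun '({joined_flaky_patterns})'"
#
# so a failing test is rerun (up to 5 times, 2 seconds apart) only when its output
# matches one of the strings above.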
class EmptyJob:
job_name = "empty"
def to_dict(self):
steps = [{"run": 'ls -la'}]
if self.job_name == "collection_job":
steps.extend(
[
"checkout",
{"run": "pip install requests || true"},
{"run": """while [[ $(curl --location --request GET "https://circleci.com/api/v2/workflow/$CIRCLE_WORKFLOW_ID/job" --header "Circle-Token: $CCI_TOKEN"| jq -r '.items[]|select(.name != "collection_job")|.status' | grep -c "running") -gt 0 ]]; do sleep 5; done || true"""},
{"run": 'python utils/process_circleci_workflow_test_reports.py --workflow_id $CIRCLE_WORKFLOW_ID || true'},
{"store_artifacts": {"path": "outputs"}},
{"run": 'echo "All required jobs have now completed"'},
]
)
return {
"docker": copy.deepcopy(DEFAULT_DOCKER_IMAGE),
"steps":["checkout"],
"resource_class": "small",
"steps": steps,
}
@dataclass
class CircleCIJob:
name: str
additional_env: Dict[str, Any] = None
docker_image: List[Dict[str, str]] = None
install_steps: List[str] = None
additional_env: dict[str, Any] = None
docker_image: list[dict[str, str]] = None
install_steps: list[str] = None
marker: Optional[str] = None
parallelism: Optional[int] = 0
pytest_num_workers: int = 12
pytest_options: Dict[str, Any] = None
resource_class: Optional[str] = "2xlarge"
tests_to_run: Optional[List[str]] = None
pytest_num_workers: int = 8
pytest_options: dict[str, Any] = None
resource_class: Optional[str] = "xlarge"
tests_to_run: Optional[list[str]] = None
num_test_files_per_worker: Optional[int] = 10
# This should be only used for doctest job!
command_timeout: Optional[int] = None
@@ -76,7 +109,9 @@ class CircleCIJob:
self.docker_image[0]["image"] = f"{self.docker_image[0]['image']}:dev"
print(f"Using {self.docker_image} docker image")
if self.install_steps is None:
self.install_steps = ["uv venv && uv pip install ."]
self.install_steps = ["uv pip install ."]
# Use a custom patched pytest to force exit the process at the end, to avoid `Too long with no output (exceeded 10m0s): context deadline exceeded`
self.install_steps.append("uv pip install git+https://github.com/ydshieh/pytest.git@8.4.1-ydshieh")
if self.pytest_options is None:
self.pytest_options = {}
if isinstance(self.tests_to_run, str):
@@ -95,6 +130,14 @@ class CircleCIJob:
def to_dict(self):
env = COMMON_ENV_VARIABLES.copy()
if self.job_name != "tests_hub":
# fmt: off
# not critical
env.update({"HF_TOKEN": "".join(["h", "f", "_", "H", "o", "d", "V", "u", "M", "q", "b", "R", "m", "t", "b", "z", "F", "Q", "O", "Q", "A", "J", "G", "D", "l", "V", "Q", "r", "R", "N", "w", "D", "M", "V", "C", "s", "d"])})
# fmt: on
# Do not run tests decorated by @is_flaky on pull requests
env['RUN_FLAKY'] = os.environ.get("CIRCLE_PULL_REQUEST", "") == ""
env.update(self.additional_env)
job = {
@@ -112,7 +155,9 @@ class CircleCIJob:
# Examples special case: we need to download NLTK files in advance to avoid concurrency issues
timeout_cmd = f"timeout {self.command_timeout} " if self.command_timeout else ""
marker_cmd = f"-m '{self.marker}'" if self.marker is not None else ""
additional_flags = f" -p no:warning -o junit_family=xunit1 --junitxml=test-results/junit.xml"
junit_flags = " -p no:warning -o junit_family=xunit1 --junitxml=test-results/junit.xml"
joined_flaky_patterns = "|".join(FLAKY_TEST_FAILURE_PATTERNS)
repeat_on_failure_flags = f"--reruns 5 --reruns-delay 2 --only-rerun '({joined_flaky_patterns})'"
parallel = f' << pipeline.parameters.{self.job_name}_parallelism >> '
steps = [
"checkout",
@@ -133,18 +178,38 @@ class CircleCIJob:
"command": """dpkg-query --show --showformat='${Installed-Size}\t${Package}\n' | sort -rh | head -25 | sort -h | awk '{ package=$2; sub(".*/", "", package); printf("%.5f GB %s\n", $1/1024/1024, package)}' || true"""}
},
{"run": {"name": "Create `test-results` directory", "command": "mkdir test-results"}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>>' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>> --header "Circle-Token: $CIRCLE_TOKEN"' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Split tests across parallel nodes: show current parallel tests",
"command": f"TESTS=$(circleci tests split --split-by=timings {self.job_name}_test_list.txt) && echo $TESTS > splitted_tests.txt && echo $TESTS | tr ' ' '\n'" if self.parallelism else f"awk '{{printf \"%s \", $0}}' {self.job_name}_test_list.txt > splitted_tests.txt"
}
},
# During the CircleCI docker image build, the data may already have been downloaded (or not).
# If it was, the files are inside the directory `/test_data/`.
{"run": {"name": "fetch hub objects before pytest", "command": "cp -r /test_data/* . 2>/dev/null || true; python3 utils/fetch_hub_objects_for_ci.py"}},
{"run": {"name": "download and unzip hub cache", "command": 'curl -L -o huggingface-cache.tar.gz https://huggingface.co/datasets/hf-internal-testing/hf_hub_cache/resolve/main/huggingface-cache.tar.gz && apt-get install pigz && tar --use-compress-program="pigz -d -p 8" -xf huggingface-cache.tar.gz && mv -n hub/* /root/.cache/huggingface/hub/ && ls -la /root/.cache/huggingface/hub/'}},
{"run": {
"name": "Run tests",
"command": f"({timeout_cmd} python3 -m pytest {marker_cmd} -n {self.pytest_num_workers} {additional_flags} {' '.join(pytest_flags)} $(cat splitted_tests.txt) | tee tests_output.txt)"}
"command": f"({timeout_cmd} python3 -m pytest {marker_cmd} -n {self.pytest_num_workers} {junit_flags} {repeat_on_failure_flags} {' '.join(pytest_flags)} $(cat splitted_tests.txt) | tee tests_output.txt)"}
},
{"run": {"name": "Expand to show skipped tests", "when": "always", "command": f"python3 .circleci/parse_test_outputs.py --file tests_output.txt --skip"}},
{"run": {"name": "Failed tests: show reasons", "when": "always", "command": f"python3 .circleci/parse_test_outputs.py --file tests_output.txt --fail"}},
{"run": {"name": "Errors", "when": "always", "command": f"python3 .circleci/parse_test_outputs.py --file tests_output.txt --errors"}},
{"run":
{
"name": "Check for test crashes",
"when": "always",
"command": """if [ ! -f tests_output.txt ]; then
echo "ERROR: tests_output.txt does not exist - tests may not have run properly"
exit 1
elif grep -q "crashed and worker restarting disabled" tests_output.txt; then
echo "ERROR: Worker crash detected in test output"
echo "Found: crashed and worker restarting disabled"
exit 1
else
echo "Tests output file exists and no worker crashes detected"
fi"""
},
},
{"run": {"name": "Expand to show skipped tests", "when": "always", "command": "python3 .circleci/parse_test_outputs.py --file tests_output.txt --skip"}},
{"run": {"name": "Failed tests: show reasons", "when": "always", "command": "python3 .circleci/parse_test_outputs.py --file tests_output.txt --fail"}},
{"run": {"name": "Errors", "when": "always", "command": "python3 .circleci/parse_test_outputs.py --file tests_output.txt --errors"}},
{"store_test_results": {"path": "test-results"}},
{"store_artifacts": {"path": "test-results/junit.xml"}},
{"store_artifacts": {"path": "reports"}},
@@ -163,147 +228,79 @@ class CircleCIJob:
# JOBS
torch_and_tf_job = CircleCIJob(
"torch_and_tf",
docker_image=[{"image":"huggingface/transformers-torch-tf-light"}],
additional_env={"RUN_PT_TF_CROSS_TESTS": True},
marker="is_pt_tf_cross_test",
pytest_options={"rA": None, "durations": 0},
)
torch_and_flax_job = CircleCIJob(
"torch_and_flax",
additional_env={"RUN_PT_FLAX_CROSS_TESTS": True},
docker_image=[{"image":"huggingface/transformers-torch-jax-light"}],
marker="is_pt_flax_cross_test",
pytest_options={"rA": None, "durations": 0},
)
torch_job = CircleCIJob(
"torch",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="not generate",
parallelism=6,
pytest_num_workers=8
)
generate_job = CircleCIJob(
"generate",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
# networkx==3.3 (after #36957) causes some issues
# TODO: remove this once it works directly
install_steps=["uv pip install ."],
marker="generate",
parallelism=6,
pytest_num_workers=8
)
tokenization_job = CircleCIJob(
"tokenization",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=16
)
processor_job = CircleCIJob(
"processors",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=6
)
tf_job = CircleCIJob(
"tf",
docker_image=[{"image":"huggingface/transformers-tf-light"}],
parallelism=6,
pytest_num_workers=16,
)
flax_job = CircleCIJob(
"flax",
docker_image=[{"image":"huggingface/transformers-jax-light"}],
parallelism=6,
pytest_num_workers=16
)
pipelines_torch_job = CircleCIJob(
"pipelines_torch",
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-torch-light"}],
marker="is_pipeline_test",
parallelism=4
parallelism=4,
)
pipelines_tf_job = CircleCIJob(
"pipelines_tf",
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-tf-light"}],
marker="is_pipeline_test",
parallelism=4
)
custom_tokenizers_job = CircleCIJob(
"custom_tokenizers",
additional_env={"RUN_CUSTOM_TOKENIZERS": True},
docker_image=[{"image": "huggingface/transformers-custom-tokenizers"}],
)
examples_torch_job = CircleCIJob(
"examples_torch",
additional_env={"OMP_NUM_THREADS": 8},
docker_image=[{"image":"huggingface/transformers-examples-torch"}],
# TODO @ArthurZucker remove this once docker is easier to build
install_steps=["uv venv && uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
pytest_num_workers=8,
install_steps=["uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
pytest_num_workers=4,
)
examples_tensorflow_job = CircleCIJob(
"examples_tensorflow",
additional_env={"OMP_NUM_THREADS": 8},
docker_image=[{"image":"huggingface/transformers-examples-tf"}],
pytest_num_workers=16,
)
hub_job = CircleCIJob(
"hub",
additional_env={"HUGGINGFACE_CO_STAGING": True},
docker_image=[{"image":"huggingface/transformers-torch-light"}],
install_steps=[
'uv venv && uv pip install .',
'uv pip install .',
'git config --global user.email "ci@dummy.com"',
'git config --global user.name "ci"',
],
marker="is_staging_test",
pytest_num_workers=2,
resource_class="medium",
)
onnx_job = CircleCIJob(
"onnx",
docker_image=[{"image":"huggingface/transformers-torch-tf-light"}],
install_steps=[
"uv venv",
"uv pip install .[torch,tf,testing,sentencepiece,onnxruntime,vision,rjieba]",
],
pytest_options={"k onnx": None},
pytest_num_workers=1,
)
exotic_models_job = CircleCIJob(
"exotic_models",
docker_image=[{"image":"huggingface/transformers-exotic-models"}],
pytest_num_workers=12,
parallelism=4,
pytest_options={"durations": 100},
)
repo_utils_job = CircleCIJob(
"repo_utils",
docker_image=[{"image":"huggingface/transformers-consistency"}],
@@ -311,13 +308,14 @@ repo_utils_job = CircleCIJob(
resource_class="large",
)
non_model_job = CircleCIJob(
"non_model",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
# networkx==3.3 (after #36957) causes some issues
# TODO: remove this once it works directly
install_steps=["uv pip install .[serving]"],
marker="not generate",
parallelism=6,
pytest_num_workers=8,
)
@@ -333,7 +331,7 @@ doc_test_job = CircleCIJob(
additional_env={"TRANSFORMERS_VERBOSITY": "error", "DATASETS_VERBOSITY": "error", "SKIP_CUDA_DOCTEST": "1"},
install_steps=[
# Add an empty file to keep the test step running correctly even if no file is selected to be tested.
"uv venv && pip install .",
"uv pip install .",
"touch dummy.py",
command,
"cat pr_documentation_tests_temp.txt",
@@ -345,13 +343,14 @@ doc_test_job = CircleCIJob(
pytest_num_workers=1,
)
REGULAR_TESTS = [torch_and_tf_job, torch_and_flax_job, torch_job, tf_job, flax_job, hub_job, onnx_job, tokenization_job, processor_job, generate_job, non_model_job] # fmt: skip
EXAMPLES_TESTS = [examples_torch_job, examples_tensorflow_job]
PIPELINE_TESTS = [pipelines_torch_job, pipelines_tf_job]
REGULAR_TESTS = [torch_job, hub_job, tokenization_job, processor_job, generate_job, non_model_job] # fmt: skip
EXAMPLES_TESTS = [examples_torch_job]
PIPELINE_TESTS = [pipelines_torch_job]
REPO_UTIL_TESTS = [repo_utils_job]
DOC_TESTS = [doc_test_job]
ALL_TESTS = REGULAR_TESTS + EXAMPLES_TESTS + PIPELINE_TESTS + REPO_UTIL_TESTS + DOC_TESTS + [custom_tokenizers_job] + [exotic_models_job] # fmt: skip
def create_circleci_config(folder=None):
if folder is None:
folder = os.getcwd()
@@ -361,19 +360,35 @@ def create_circleci_config(folder=None):
if len(jobs) == 0:
jobs = [EmptyJob()]
print("Full list of job name inputs", {j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs})
else:
print("Full list of job name inputs", {j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs})
# Add a job that waits for all the test jobs and aggregates their test summary files at the end
collection_job = EmptyJob()
collection_job.job_name = "collection_job"
jobs = [collection_job] + jobs
config = {
"version": "2.1",
"parameters": {
# Only used to accept the parameters from the trigger
"nightly": {"type": "boolean", "default": False},
"tests_to_run": {"type": "string", "default": ''},
# Only used to accept the parameters from GitHub Actions trigger
"GHA_Actor": {"type": "string", "default": ""},
"GHA_Action": {"type": "string", "default": ""},
"GHA_Event": {"type": "string", "default": ""},
"GHA_Meta": {"type": "string", "default": ""},
"tests_to_run": {"type": "string", "default": ""},
**{j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs},
**{j.job_name + "_parallelism":{"type":"integer", "default":1} for j in jobs},
},
"jobs" : {j.job_name: j.to_dict() for j in jobs},
"workflows": {"version": 2, "run_tests": {"jobs": [j.job_name for j in jobs]}}
"jobs": {j.job_name: j.to_dict() for j in jobs}
}
if "CIRCLE_TOKEN" in os.environ:
# For private forked repo. (e.g. new model addition)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [{j.job_name: {"context": ["TRANSFORMERS_CONTEXT"]}} for j in jobs]}}
else:
# For public repo. (e.g. `transformers`)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [j.job_name for j in jobs]}}
with open(os.path.join(folder, "generated_config.yml"), "w") as f:
f.write(yaml.dump(config, sort_keys=False, default_flow_style=False).replace("' << pipeline", " << pipeline").replace(">> '", " >>"))
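
As a minimal sketch (with illustrative job names), the dynamic pipeline parameters generated above follow this pattern:

    job_names = ["torch", "generate"]  # illustrative; the real list is built from the jobs above
    parameters = {
        "nightly": {"type": "boolean", "default": False},
        **{f"{name}_test_list": {"type": "string", "default": ""} for name in job_names},
        **{f"{name}_parallelism": {"type": "integer", "default": 1} for name in job_names},
    }
    # Each generated job then fetches its own file list via `<< pipeline.parameters.<name>_test_list >>`.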


@@ -1,5 +1,6 @@
import re
import argparse
import re
def parse_pytest_output(file_path):
skipped_tests = {}


@@ -16,7 +16,7 @@ body:
id: system-info
attributes:
label: System Info
description: Please share your system info with us. You can run the command `transformers-cli env` and copy-paste its output below.
description: Please share your system info with us. You can run the command `transformers env` and copy-paste its output below.
placeholder: transformers version, platform, python version, ...
validations:
required: true
@@ -36,26 +36,37 @@ body:
Models:
- text models: @ArthurZucker
- vision models: @amyeroberts, @qubvel
- speech models: @ylacombe, @eustlb
- text models: @ArthurZucker @Cyrilvallez
- vision models: @yonigozlan @molbap
- audio models: @eustlb @ebezzam @vasqu
- multimodal models: @zucchini-nlp
- graph models: @clefourrier
Library:
- flax: @sanchit-gandhi
- generate: @zucchini-nlp (visual-language models) or @gante (all others)
- continuous batching: @remi-or @ArthurZucker @McPatate
- pipelines: @Rocketknight1
- tensorflow: @gante and @Rocketknight1
- tokenizers: @ArthurZucker and @itazap
- trainer: @muellerzr @SunMarc
- trainer: @SunMarc
- attention: @vasqu @ArthurZucker @CyrilVallez
- model loading (from pretrained, etc): @CyrilVallez
- distributed: @3outeille @ArthurZucker
- CIs: @ydshieh
Integrations:
- deepspeed: HF Trainer/Accelerate: @muellerzr
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc
- quantization: @SunMarc @MekkCyber
- kernels: @MekkCyber @drbh
- peft: @BenjaminBossan @githubnemo
Devices/Backends:
- AMD ROCm: @ivarflakstad
- Intel XPU: @IlyasMoutawwakil
- Ascend NPU: @ivarflakstad
Documentation: @stevhliu
@@ -63,19 +74,6 @@ body:
- for issues with a model, report at https://discuss.huggingface.co/ and tag the model's creator.
HF projects:
- accelerate: [different repo](https://github.com/huggingface/accelerate)
- datasets: [different repo](https://github.com/huggingface/datasets)
- diffusers: [different repo](https://github.com/huggingface/diffusers)
- rust tokenizers: [different repo](https://github.com/huggingface/tokenizers)
Maintained examples (not research project or legacy):
- Flax: @sanchit-gandhi
- PyTorch: See Models above and tag the person corresponding to the modality of the example.
- TensorFlow: @Rocketknight1
Research projects are not maintained and should be taken as is.
placeholder: "@Username ..."
@@ -106,6 +104,7 @@ body:
label: Reproduction
description: |
Please provide a code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
Please include relevant config information with your code, for example your Trainers, TRL, Peft, and DeepSpeed configs.
If you have code snippets, error messages, stack traces please provide them here as well.
Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Do not use screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.


@@ -23,7 +23,7 @@ Some notes:
* Please translate in a gender-neutral way.
* Add your translations to the folder called `<languageCode>` inside the [source folder](https://github.com/huggingface/transformers/tree/main/docs/source).
* Register your translation in `<languageCode>/_toctree.yml`; please follow the order of the [English version](https://github.com/huggingface/transformers/blob/main/docs/source/en/_toctree.yml).
* Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue. Please ping @stevhliu and @MKhalusova for review.
* Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue. Please ping @stevhliu for review.
* 🙋 If you'd like others to help you with the translation, you can also post in the 🤗 [forums](https://discuss.huggingface.co/).
## Get Started section


@@ -6,7 +6,7 @@ body:
id: system-info
attributes:
label: System Info
description: Please share your system info with us. You can run the command `transformers-cli env` and copy-paste its output below.
description: Please share your system info with us. You can run the command `transformers env` and copy-paste its output below.
render: shell
placeholder: transformers version, platform, python version, ...
validations:


@@ -39,41 +39,40 @@ members/contributors who may be interested in your PR.
Models:
- text models: @ArthurZucker
- vision models: @amyeroberts, @qubvel
- speech models: @ylacombe, @eustlb
- text models: @ArthurZucker @Cyrilvallez
- vision models: @yonigozlan @molbap
- audio models: @eustlb @ebezzam @vasqu
- multimodal models: @zucchini-nlp
- graph models: @clefourrier
Library:
- flax: @sanchit-gandhi
- generate: @zucchini-nlp (visual-language models) or @gante (all others)
- continuous batching: @remi-or @ArthurZucker @McPatate
- pipelines: @Rocketknight1
- tensorflow: @gante and @Rocketknight1
- tokenizers: @ArthurZucker
- trainer: @muellerzr and @SunMarc
- chat templates: @Rocketknight1
- tokenizers: @ArthurZucker and @itazap
- trainer: @SunMarc
- attention: @vasqu @ArthurZucker @CyrilVallez
- model loading (from pretrained, etc): @CyrilVallez
- distributed: @3outeille @ArthurZucker
- CIs: @ydshieh
Integrations:
- deepspeed: HF Trainer/Accelerate: @muellerzr
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc
- quantization: @SunMarc @MekkCyber
- kernels: @MekkCyber @drbh
- peft: @BenjaminBossan @githubnemo
Devices/Backends:
- AMD ROCm: @ivarflakstad
- Intel XPU: @IlyasMoutawwakil
- Ascend NPU: @ivarflakstad
Documentation: @stevhliu
HF projects:
- accelerate: [different repo](https://github.com/huggingface/accelerate)
- datasets: [different repo](https://github.com/huggingface/datasets)
- diffusers: [different repo](https://github.com/huggingface/diffusers)
- rust tokenizers: [different repo](https://github.com/huggingface/tokenizers)
Maintained examples (not research project or legacy):
- Flax: @sanchit-gandhi
- PyTorch: See Models above and tag the person corresponding to the modality of the example.
- TensorFlow: @Rocketknight1
Research projects are not maintained and should be taken as is.
-->

.github/copilot-instructions.md (new file)

@@ -0,0 +1,39 @@
# copilot-instructions.md Guide for Hugging Face Transformers
This copilot-instructions.md file provides guidance for code agents working with this codebase.
## Core Project Structure
- `/src/transformers`: This contains the core source code for the library
- `/models`: Code for individual models. Models inherit from base classes in the root `/src/transformers` directory.
- `/tests`: This contains the core test classes for the library. These are usually inherited rather than directly run.
- `/models`: Tests for individual models. Model tests inherit from common tests in the root `/tests` directory.
- `/docs`: This contains the documentation for the library, including guides, tutorials, and API references.
## Coding Conventions for Hugging Face Transformers
- PRs should be as brief as possible. Bugfix PRs in particular can often be only one or two lines long, and do not need large comments, docstrings or new functions in this case. Aim to minimize the size of the diff.
- When writing tests, they should be added to an existing file. The only exception is for PRs to add a new model, when a new test directory should be created for that model.
- Code style is enforced in the CI. You can install the style tools with `pip install -e .[quality]`. You can then run `make fixup` to apply style and consistency fixes to your code.
## Copying and inheritance
Many models in the codebase have similar code, but it is not shared by inheritance because we want each model file to be self-contained.
We use two mechanisms to keep this code in sync:
- "Copied from" syntax. Functions or entire classes can have a comment at the top like this: `# Copied from transformers.models.llama.modeling_llama.rotate_half` or `# Copied from transformers.models.t5.modeling_t5.T5LayerNorm with T5->MT5`
These comments are actively checked by the style tools, and copies will automatically be updated when the base code is updated. If you need to update a copied function, you should
either update the base function and use `make fixup` to propagate the change to all copies, or simply remove the `# Copied from` comment if that is inappropriate.
- "Modular" files. These files briefly define models by composing them using inheritance from other models. They are not meant to be used directly. Instead, the style tools
automatically generate a complete modeling file, like `modeling_bert.py`, from the modular file like `modular_bert.py`. If a model has a modular file, the modeling file
should never be edited directly! Instead, changes should be made in the modular file, and then you should run `make fixup` to update the modeling file automatically.
When adding new models, you should prefer `modular` style and inherit as many classes as possible from existing models.
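
As a minimal sketch of the `# Copied from` mechanism described above (`rotate_half` is a real Llama helper; the target model file is illustrative):

    import torch

    # in an illustrative model file that copies the Llama helper
    # Copied from transformers.models.llama.modeling_llama.rotate_half
    def rotate_half(x):
        """Rotates half the hidden dims of the input."""
        x1 = x[..., : x.shape[-1] // 2]
        x2 = x[..., x.shape[-1] // 2 :]
        return torch.cat((-x1, x2), dim=-1)

    # `make fixup` checks this body against the Llama original and rewrites it if they drift.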
## Testing
After making changes, you should usually run `make fixup` to ensure any copies and modular files are updated, and then test all affected models. This includes both
the model you made the changes in and any other models that were updated by `make fixup`. Tests can be run with `pytest tests/models/[name]/test_modeling_[name].py`
If your changes affect code in other classes like tokenizers or processors, you should run those tests instead, like `test_processing_[name].py` or `test_tokenization_[name].py`.
In order to run tests, you may need to install dependencies. You can do this with `pip install -e .[testing]`. You will probably also need to `pip install torch accelerate` if your environment does not already have them.

.github/scripts/assign_reviewers.py (new file)

@@ -0,0 +1,122 @@
# coding=utf-8
# Copyright 2025 the HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import os
import re
from collections import Counter
from pathlib import Path
import github
from github import Github
def pattern_to_regex(pattern):
if pattern.startswith("/"):
start_anchor = True
pattern = re.escape(pattern[1:])
else:
start_anchor = False
pattern = re.escape(pattern)
# Replace `*` with "any number of non-slash characters"
pattern = pattern.replace(r"\*", "[^/]*")
if start_anchor:
pattern = r"^\/?" + pattern # Allow an optional leading slash after the start of the string
return pattern
def get_file_owners(file_path, codeowners_lines):
# Process lines in reverse (last matching pattern takes precedence)
for line in reversed(codeowners_lines):
# Skip comments and empty lines, strip inline comments
line = line.split('#')[0].strip()
if not line:
continue
# Split into pattern and owners
parts = line.split()
pattern = parts[0]
# Can be empty, e.g. for dummy files with explicitly no owner!
owners = [owner.removeprefix("@") for owner in parts[1:]]
# Check if file matches pattern
file_regex = pattern_to_regex(pattern)
if re.search(file_regex, file_path) is not None:
return owners # Remember, can still be empty!
return [] # Should never happen, but just in case
def pr_author_is_in_hf(pr_author, codeowners_lines):
# Check if the PR author is in the codeowners file
for line in codeowners_lines:
line = line.split('#')[0].strip()
if not line:
continue
# Split into pattern and owners
parts = line.split()
owners = [owner.removeprefix("@") for owner in parts[1:]]
if pr_author in owners:
return True
return False
def main():
script_dir = Path(__file__).parent.absolute()
with open(script_dir / "codeowners_for_review_action") as f:
codeowners_lines = f.readlines()
g = Github(os.environ['GITHUB_TOKEN'])
repo = g.get_repo("huggingface/transformers")
with open(os.environ['GITHUB_EVENT_PATH']) as f:
event = json.load(f)
# The PR number is available in the event payload
pr_number = event['pull_request']['number']
pr = repo.get_pull(pr_number)
pr_author = pr.user.login
if pr_author_is_in_hf(pr_author, codeowners_lines):
print(f"PR author {pr_author} is in codeowners, skipping review request.")
return
existing_reviews = list(pr.get_reviews())
if existing_reviews:
print(f"Already has reviews: {[r.user.login for r in existing_reviews]}")
return
users_requested, teams_requested = pr.get_review_requests()
users_requested = list(users_requested)
if users_requested:
print(f"Reviewers already requested: {users_requested}")
return
locs_per_owner = Counter()
for file in pr.get_files():
owners = get_file_owners(file.filename, codeowners_lines)
for owner in owners:
locs_per_owner[owner] += file.changes
# Assign the top 2 based on locs changed as reviewers, but skip the owner if present
locs_per_owner.pop(pr_author, None)
top_owners = locs_per_owner.most_common(2)
print("Top owners", top_owners)
top_owners = [owner[0] for owner in top_owners]
try:
pr.create_review_request(top_owners)
except github.GithubException as e:
print(f"Failed to request review for {top_owners}: {e}")
if __name__ == "__main__":
main()
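
A minimal usage sketch of the matching helpers above (the patterns and file path are illustrative, loosely modeled on the codeowners file below):

    example_codeowners = [
        "* @Rocketknight1 @ArthurZucker\n",
        "trainer.py @zach-huggingface @SunMarc\n",
        "/src/transformers/models/*/processing* @molbap @yonigozlan\n",
    ]
    # Last matching pattern wins: the specific `trainer.py` rule overrides the `*` fallback.
    print(get_file_owners("src/transformers/trainer.py", example_codeowners))
    # -> ['zach-huggingface', 'SunMarc']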


@@ -0,0 +1,370 @@
# Top-level rules are matched only if nothing else matches
* @Rocketknight1 @ArthurZucker # if no one is pinged based on the other rules, he will do the dispatch
*.md @stevhliu
*tokenization* @ArthurZucker
docs/ @stevhliu
/benchmark/ @McPatate
/docker/ @ydshieh @ArthurZucker
# More high-level globs catch cases when specific rules later don't apply
/src/transformers/models/*/processing* @molbap @yonigozlan
/src/transformers/models/*/image_processing* @yonigozlan
/src/transformers/models/*/image_processing_*_fast* @yonigozlan
# Owners of subsections of the library
/src/transformers/generation/ @gante
/src/transformers/pipeline/ @Rocketknight1 @yonigozlan
/src/transformers/integrations/ @SunMarc @MekkCyber @zach-huggingface
/src/transformers/quantizers/ @SunMarc @MekkCyber
tests/ @ydshieh
tests/generation/ @gante
/src/transformers/models/auto/ @ArthurZucker
/src/transformers/utils/ @ArthurZucker @Rocketknight1
/src/transformers/loss/ @ArthurZucker
/src/transformers/onnx/ @michaelbenayoun
# Specific files come after the sections/globs, so they take priority
/.circleci/config.yml @ArthurZucker @ydshieh
/utils/tests_fetcher.py @ydshieh
trainer.py @zach-huggingface @SunMarc
trainer_utils.py @zach-huggingface @SunMarc
/utils/modular_model_converter.py @Cyrilvallez @ArthurZucker
# Owners of individual models are specific / high priority, and so they come last
# mod* captures modeling and modular files
# Text models
/src/transformers/models/albert/mod*_albert* @ArthurZucker
/src/transformers/models/bamba/mod*_bamba* @ArthurZucker
/src/transformers/models/bart/mod*_bart* @ArthurZucker
/src/transformers/models/barthez/mod*_barthez* @ArthurZucker
/src/transformers/models/bartpho/mod*_bartpho* @ArthurZucker
/src/transformers/models/bert/mod*_bert* @ArthurZucker
/src/transformers/models/bert_generation/mod*_bert_generation* @ArthurZucker
/src/transformers/models/bert_japanese/mod*_bert_japanese* @ArthurZucker
/src/transformers/models/bertweet/mod*_bertweet* @ArthurZucker
/src/transformers/models/big_bird/mod*_big_bird* @ArthurZucker
/src/transformers/models/bigbird_pegasus/mod*_bigbird_pegasus* @ArthurZucker
/src/transformers/models/biogpt/mod*_biogpt* @ArthurZucker
/src/transformers/models/blenderbot/mod*_blenderbot* @ArthurZucker
/src/transformers/models/blenderbot_small/mod*_blenderbot_small* @ArthurZucker
/src/transformers/models/bloom/mod*_bloom* @ArthurZucker
/src/transformers/models/bort/mod*_bort* @ArthurZucker
/src/transformers/models/byt5/mod*_byt5* @ArthurZucker
/src/transformers/models/camembert/mod*_camembert* @ArthurZucker
/src/transformers/models/canine/mod*_canine* @ArthurZucker
/src/transformers/models/codegen/mod*_codegen* @ArthurZucker
/src/transformers/models/code_llama/mod*_code_llama* @ArthurZucker
/src/transformers/models/cohere/mod*_cohere* @ArthurZucker
/src/transformers/models/cohere2/mod*_cohere2* @ArthurZucker
/src/transformers/models/convbert/mod*_convbert* @ArthurZucker
/src/transformers/models/cpm/mod*_cpm* @ArthurZucker
/src/transformers/models/cpmant/mod*_cpmant* @ArthurZucker
/src/transformers/models/ctrl/mod*_ctrl* @ArthurZucker
/src/transformers/models/dbrx/mod*_dbrx* @ArthurZucker
/src/transformers/models/deberta/mod*_deberta* @ArthurZucker
/src/transformers/models/deberta_v2/mod*_deberta_v2* @ArthurZucker
/src/transformers/models/dialogpt/mod*_dialogpt* @ArthurZucker
/src/transformers/models/diffllama/mod*_diffllama* @ArthurZucker
/src/transformers/models/distilbert/mod*_distilbert* @ArthurZucker
/src/transformers/models/dpr/mod*_dpr* @ArthurZucker
/src/transformers/models/electra/mod*_electra* @ArthurZucker
/src/transformers/models/encoder_decoder/mod*_encoder_decoder* @ArthurZucker
/src/transformers/models/ernie/mod*_ernie* @ArthurZucker
/src/transformers/models/ernie_m/mod*_ernie_m* @ArthurZucker
/src/transformers/models/esm/mod*_esm* @ArthurZucker
/src/transformers/models/falcon/mod*_falcon* @ArthurZucker
/src/transformers/models/falcon3/mod*_falcon3* @ArthurZucker
/src/transformers/models/falcon_mamba/mod*_falcon_mamba* @ArthurZucker
/src/transformers/models/fastspeech2_conformer/mod*_fastspeech2_conformer* @ArthurZucker
/src/transformers/models/flan_t5/mod*_flan_t5* @ArthurZucker
/src/transformers/models/flan_ul2/mod*_flan_ul2* @ArthurZucker
/src/transformers/models/flaubert/mod*_flaubert* @ArthurZucker
/src/transformers/models/fnet/mod*_fnet* @ArthurZucker
/src/transformers/models/fsmt/mod*_fsmt* @ArthurZucker
/src/transformers/models/funnel/mod*_funnel* @ArthurZucker
/src/transformers/models/fuyu/mod*_fuyu* @ArthurZucker
/src/transformers/models/gemma/mod*_gemma* @ArthurZucker
/src/transformers/models/gemma2/mod*_gemma2* @ArthurZucker
/src/transformers/models/glm/mod*_glm* @ArthurZucker
/src/transformers/models/openai_gpt/mod*_openai_gpt* @ArthurZucker
/src/transformers/models/gpt_neo/mod*_gpt_neo* @ArthurZucker
/src/transformers/models/gpt_neox/mod*_gpt_neox* @ArthurZucker
/src/transformers/models/gpt_neox_japanese/mod*_gpt_neox_japanese* @ArthurZucker
/src/transformers/models/gptj/mod*_gptj* @ArthurZucker
/src/transformers/models/gpt2/mod*_gpt2* @ArthurZucker
/src/transformers/models/gpt_bigcode/mod*_gpt_bigcode* @ArthurZucker
/src/transformers/models/gptsan_japanese/mod*_gptsan_japanese* @ArthurZucker
/src/transformers/models/gpt_sw3/mod*_gpt_sw3* @ArthurZucker
/src/transformers/models/granite/mod*_granite* @ArthurZucker
/src/transformers/models/granitemoe/mod*_granitemoe* @ArthurZucker
/src/transformers/models/herbert/mod*_herbert* @ArthurZucker
/src/transformers/models/ibert/mod*_ibert* @ArthurZucker
/src/transformers/models/jamba/mod*_jamba* @ArthurZucker
/src/transformers/models/jetmoe/mod*_jetmoe* @ArthurZucker
/src/transformers/models/jukebox/mod*_jukebox* @ArthurZucker
/src/transformers/models/led/mod*_led* @ArthurZucker
/src/transformers/models/llama/mod*_llama* @ArthurZucker @Cyrilvallez
/src/transformers/models/longformer/mod*_longformer* @ArthurZucker
/src/transformers/models/longt5/mod*_longt5* @ArthurZucker
/src/transformers/models/luke/mod*_luke* @ArthurZucker
/src/transformers/models/m2m_100/mod*_m2m_100* @ArthurZucker
/src/transformers/models/madlad_400/mod*_madlad_400* @ArthurZucker
/src/transformers/models/mamba/mod*_mamba* @ArthurZucker
/src/transformers/models/mamba2/mod*_mamba2* @ArthurZucker
/src/transformers/models/marian/mod*_marian* @ArthurZucker
/src/transformers/models/markuplm/mod*_markuplm* @ArthurZucker
/src/transformers/models/mbart/mod*_mbart* @ArthurZucker
/src/transformers/models/mega/mod*_mega* @ArthurZucker
/src/transformers/models/megatron_bert/mod*_megatron_bert* @ArthurZucker
/src/transformers/models/megatron_gpt2/mod*_megatron_gpt2* @ArthurZucker
/src/transformers/models/mistral/mod*_mistral* @ArthurZucker
/src/transformers/models/mixtral/mod*_mixtral* @ArthurZucker
/src/transformers/models/mluke/mod*_mluke* @ArthurZucker
/src/transformers/models/mobilebert/mod*_mobilebert* @ArthurZucker
/src/transformers/models/modernbert/mod*_modernbert* @ArthurZucker
/src/transformers/models/mpnet/mod*_mpnet* @ArthurZucker
/src/transformers/models/mpt/mod*_mpt* @ArthurZucker
/src/transformers/models/mra/mod*_mra* @ArthurZucker
/src/transformers/models/mt5/mod*_mt5* @ArthurZucker
/src/transformers/models/mvp/mod*_mvp* @ArthurZucker
/src/transformers/models/myt5/mod*_myt5* @ArthurZucker
/src/transformers/models/nemotron/mod*_nemotron* @ArthurZucker
/src/transformers/models/nezha/mod*_nezha* @ArthurZucker
/src/transformers/models/nllb/mod*_nllb* @ArthurZucker
/src/transformers/models/nllb_moe/mod*_nllb_moe* @ArthurZucker
/src/transformers/models/nystromformer/mod*_nystromformer* @ArthurZucker
/src/transformers/models/olmo/mod*_olmo* @ArthurZucker
/src/transformers/models/olmo2/mod*_olmo2* @ArthurZucker
/src/transformers/models/olmoe/mod*_olmoe* @ArthurZucker
/src/transformers/models/open_llama/mod*_open_llama* @ArthurZucker
/src/transformers/models/opt/mod*_opt* @ArthurZucker
/src/transformers/models/pegasus/mod*_pegasus* @ArthurZucker
/src/transformers/models/pegasus_x/mod*_pegasus_x* @ArthurZucker
/src/transformers/models/persimmon/mod*_persimmon* @ArthurZucker
/src/transformers/models/phi/mod*_phi* @ArthurZucker
/src/transformers/models/phi3/mod*_phi3* @ArthurZucker
/src/transformers/models/phimoe/mod*_phimoe* @ArthurZucker
/src/transformers/models/phobert/mod*_phobert* @ArthurZucker
/src/transformers/models/plbart/mod*_plbart* @ArthurZucker
/src/transformers/models/prophetnet/mod*_prophetnet* @ArthurZucker
/src/transformers/models/qdqbert/mod*_qdqbert* @ArthurZucker
/src/transformers/models/qwen2/mod*_qwen2* @ArthurZucker
/src/transformers/models/qwen2_moe/mod*_qwen2_moe* @ArthurZucker
/src/transformers/models/rag/mod*_rag* @ArthurZucker
/src/transformers/models/realm/mod*_realm* @ArthurZucker
/src/transformers/models/recurrent_gemma/mod*_recurrent_gemma* @ArthurZucker
/src/transformers/models/reformer/mod*_reformer* @ArthurZucker
/src/transformers/models/rembert/mod*_rembert* @ArthurZucker
/src/transformers/models/retribert/mod*_retribert* @ArthurZucker
/src/transformers/models/roberta/mod*_roberta* @ArthurZucker
/src/transformers/models/roberta_prelayernorm/mod*_roberta_prelayernorm* @ArthurZucker
/src/transformers/models/roc_bert/mod*_roc_bert* @ArthurZucker
/src/transformers/models/roformer/mod*_roformer* @ArthurZucker
/src/transformers/models/rwkv/mod*_rwkv* @ArthurZucker
/src/transformers/models/splinter/mod*_splinter* @ArthurZucker
/src/transformers/models/squeezebert/mod*_squeezebert* @ArthurZucker
/src/transformers/models/stablelm/mod*_stablelm* @ArthurZucker
/src/transformers/models/starcoder2/mod*_starcoder2* @ArthurZucker
/src/transformers/models/switch_transformers/mod*_switch_transformers* @ArthurZucker
/src/transformers/models/t5/mod*_t5* @ArthurZucker
/src/transformers/models/t5v1.1/mod*_t5v1.1* @ArthurZucker
/src/transformers/models/tapex/mod*_tapex* @ArthurZucker
/src/transformers/models/transfo_xl/mod*_transfo_xl* @ArthurZucker
/src/transformers/models/ul2/mod*_ul2* @ArthurZucker
/src/transformers/models/umt5/mod*_umt5* @ArthurZucker
/src/transformers/models/xmod/mod*_xmod* @ArthurZucker
/src/transformers/models/xglm/mod*_xglm* @ArthurZucker
/src/transformers/models/xlm/mod*_xlm* @ArthurZucker
/src/transformers/models/xlm_prophetnet/mod*_xlm_prophetnet* @ArthurZucker
/src/transformers/models/xlm_roberta/mod*_xlm_roberta* @ArthurZucker
/src/transformers/models/xlm_roberta_xl/mod*_xlm_roberta_xl* @ArthurZucker
/src/transformers/models/xlm_v/mod*_xlm_v* @ArthurZucker
/src/transformers/models/xlnet/mod*_xlnet* @ArthurZucker
/src/transformers/models/yoso/mod*_yoso* @ArthurZucker
/src/transformers/models/zamba/mod*_zamba* @ArthurZucker
# Vision models
/src/transformers/models/beit/mod*_beit* @yonigozlan @molbap
/src/transformers/models/bit/mod*_bit* @yonigozlan @molbap
/src/transformers/models/conditional_detr/mod*_conditional_detr* @yonigozlan @molbap
/src/transformers/models/convnext/mod*_convnext* @yonigozlan @molbap
/src/transformers/models/convnextv2/mod*_convnextv2* @yonigozlan @molbap
/src/transformers/models/cvt/mod*_cvt* @yonigozlan @molbap
/src/transformers/models/deformable_detr/mod*_deformable_detr* @yonigozlan @molbap
/src/transformers/models/deit/mod*_deit* @yonigozlan @molbap
/src/transformers/models/depth_anything/mod*_depth_anything* @yonigozlan @molbap
/src/transformers/models/depth_anything_v2/mod*_depth_anything_v2* @yonigozlan @molbap
/src/transformers/models/deta/mod*_deta* @yonigozlan @molbap
/src/transformers/models/detr/mod*_detr* @yonigozlan @molbap
/src/transformers/models/dinat/mod*_dinat* @yonigozlan @molbap
/src/transformers/models/dinov2/mod*_dinov2* @yonigozlan @molbap
/src/transformers/models/dinov2_with_registers/mod*_dinov2_with_registers* @yonigozlan @molbap
/src/transformers/models/dit/mod*_dit* @yonigozlan @molbap
/src/transformers/models/dpt/mod*_dpt* @yonigozlan @molbap
/src/transformers/models/efficientformer/mod*_efficientformer* @yonigozlan @molbap
/src/transformers/models/efficientnet/mod*_efficientnet* @yonigozlan @molbap
/src/transformers/models/focalnet/mod*_focalnet* @yonigozlan @molbap
/src/transformers/models/glpn/mod*_glpn* @yonigozlan @molbap
/src/transformers/models/hiera/mod*_hiera* @yonigozlan @molbap
/src/transformers/models/ijepa/mod*_ijepa* @yonigozlan @molbap
/src/transformers/models/imagegpt/mod*_imagegpt* @yonigozlan @molbap
/src/transformers/models/levit/mod*_levit* @yonigozlan @molbap
/src/transformers/models/mask2former/mod*_mask2former* @yonigozlan @molbap
/src/transformers/models/maskformer/mod*_maskformer* @yonigozlan @molbap
/src/transformers/models/mobilenet_v1/mod*_mobilenet_v1* @yonigozlan @molbap
/src/transformers/models/mobilenet_v2/mod*_mobilenet_v2* @yonigozlan @molbap
/src/transformers/models/mobilevit/mod*_mobilevit* @yonigozlan @molbap
/src/transformers/models/mobilevitv2/mod*_mobilevitv2* @yonigozlan @molbap
/src/transformers/models/nat/mod*_nat* @yonigozlan @molbap
/src/transformers/models/poolformer/mod*_poolformer* @yonigozlan @molbap
/src/transformers/models/pvt/mod*_pvt* @yonigozlan @molbap
/src/transformers/models/pvt_v2/mod*_pvt_v2* @yonigozlan @molbap
/src/transformers/models/regnet/mod*_regnet* @yonigozlan @molbap
/src/transformers/models/resnet/mod*_resnet* @yonigozlan @molbap
/src/transformers/models/rt_detr/mod*_rt_detr* @yonigozlan @molbap
/src/transformers/models/segformer/mod*_segformer* @yonigozlan @molbap
/src/transformers/models/seggpt/mod*_seggpt* @yonigozlan @molbap
/src/transformers/models/superpoint/mod*_superpoint* @yonigozlan @molbap
/src/transformers/models/swiftformer/mod*_swiftformer* @yonigozlan @molbap
/src/transformers/models/swin/mod*_swin* @yonigozlan @molbap
/src/transformers/models/swinv2/mod*_swinv2* @yonigozlan @molbap
/src/transformers/models/swin2sr/mod*_swin2sr* @yonigozlan @molbap
/src/transformers/models/table_transformer/mod*_table_transformer* @yonigozlan @molbap
/src/transformers/models/textnet/mod*_textnet* @yonigozlan @molbap
/src/transformers/models/timm_wrapper/mod*_timm_wrapper* @yonigozlan @molbap
/src/transformers/models/upernet/mod*_upernet* @yonigozlan @molbap
/src/transformers/models/van/mod*_van* @yonigozlan @molbap
/src/transformers/models/vit/mod*_vit* @yonigozlan @molbap
/src/transformers/models/vit_hybrid/mod*_vit_hybrid* @yonigozlan @molbap
/src/transformers/models/vitdet/mod*_vitdet* @yonigozlan @molbap
/src/transformers/models/vit_mae/mod*_vit_mae* @yonigozlan @molbap
/src/transformers/models/vitmatte/mod*_vitmatte* @yonigozlan @molbap
/src/transformers/models/vit_msn/mod*_vit_msn* @yonigozlan @molbap
/src/transformers/models/vitpose/mod*_vitpose* @yonigozlan @molbap
/src/transformers/models/yolos/mod*_yolos* @yonigozlan @molbap
/src/transformers/models/zoedepth/mod*_zoedepth* @yonigozlan @molbap
# Audio models
/src/transformers/models/audio_spectrogram_transformer/mod*_audio_spectrogram_transformer* @eustlb
/src/transformers/models/bark/mod*_bark* @eustlb
/src/transformers/models/clap/mod*_clap* @eustlb
/src/transformers/models/dac/mod*_dac* @eustlb
/src/transformers/models/encodec/mod*_encodec* @eustlb
/src/transformers/models/hubert/mod*_hubert* @eustlb
/src/transformers/models/mctct/mod*_mctct* @eustlb
/src/transformers/models/mimi/mod*_mimi* @eustlb
/src/transformers/models/mms/mod*_mms* @eustlb
/src/transformers/models/moshi/mod*_moshi* @eustlb
/src/transformers/models/musicgen/mod*_musicgen* @eustlb
/src/transformers/models/musicgen_melody/mod*_musicgen_melody* @eustlb
/src/transformers/models/pop2piano/mod*_pop2piano* @eustlb
/src/transformers/models/seamless_m4t/mod*_seamless_m4t* @eustlb
/src/transformers/models/seamless_m4t_v2/mod*_seamless_m4t_v2* @eustlb
/src/transformers/models/sew/mod*_sew* @eustlb
/src/transformers/models/sew_d/mod*_sew_d* @eustlb
/src/transformers/models/speech_to_text/mod*_speech_to_text* @eustlb
/src/transformers/models/speech_to_text_2/mod*_speech_to_text_2* @eustlb
/src/transformers/models/speecht5/mod*_speecht5* @eustlb
/src/transformers/models/unispeech/mod*_unispeech* @eustlb
/src/transformers/models/unispeech_sat/mod*_unispeech_sat* @eustlb
/src/transformers/models/univnet/mod*_univnet* @eustlb
/src/transformers/models/vits/mod*_vits* @eustlb
/src/transformers/models/wav2vec2/mod*_wav2vec2* @eustlb
/src/transformers/models/wav2vec2_bert/mod*_wav2vec2_bert* @eustlb
/src/transformers/models/wav2vec2_conformer/mod*_wav2vec2_conformer* @eustlb
/src/transformers/models/wav2vec2_phoneme/mod*_wav2vec2_phoneme* @eustlb
/src/transformers/models/wavlm/mod*_wavlm* @eustlb
/src/transformers/models/whisper/mod*_whisper* @eustlb
/src/transformers/models/xls_r/mod*_xls_r* @eustlb
/src/transformers/models/xlsr_wav2vec2/mod*_xlsr_wav2vec2* @eustlb
# Video models
/src/transformers/models/timesformer/mod*_timesformer* @Rocketknight1
/src/transformers/models/videomae/mod*_videomae* @Rocketknight1
/src/transformers/models/vivit/mod*_vivit* @Rocketknight1
# Multimodal models
/src/transformers/models/align/mod*_align* @zucchini-nlp
/src/transformers/models/altclip/mod*_altclip* @zucchini-nlp
/src/transformers/models/aria/mod*_aria* @zucchini-nlp
/src/transformers/models/blip/mod*_blip* @zucchini-nlp
/src/transformers/models/blip_2/mod*_blip_2* @zucchini-nlp
/src/transformers/models/bridgetower/mod*_bridgetower* @zucchini-nlp
/src/transformers/models/bros/mod*_bros* @zucchini-nlp
/src/transformers/models/chameleon/mod*_chameleon* @zucchini-nlp
/src/transformers/models/chinese_clip/mod*_chinese_clip* @zucchini-nlp
/src/transformers/models/clip/mod*_clip* @zucchini-nlp
/src/transformers/models/clipseg/mod*_clipseg* @zucchini-nlp
/src/transformers/models/clvp/mod*_clvp* @zucchini-nlp
/src/transformers/models/colpali/mod*_colpali* @zucchini-nlp @yonigozlan
/src/transformers/models/data2vec/mod*_data2vec* @zucchini-nlp
/src/transformers/models/deplot/mod*_deplot* @zucchini-nlp
/src/transformers/models/donut/mod*_donut* @zucchini-nlp
/src/transformers/models/flava/mod*_flava* @zucchini-nlp
/src/transformers/models/git/mod*_git* @zucchini-nlp
/src/transformers/models/grounding_dino/mod*_grounding_dino* @yonigozlan
/src/transformers/models/groupvit/mod*_groupvit* @zucchini-nlp
/src/transformers/models/idefics/mod*_idefics* @zucchini-nlp
/src/transformers/models/idefics2/mod*_idefics2* @zucchini-nlp
/src/transformers/models/idefics3/mod*_idefics3* @zucchini-nlp
/src/transformers/models/instructblip/mod*_instructblip* @zucchini-nlp
/src/transformers/models/instructblipvideo/mod*_instructblipvideo* @zucchini-nlp
/src/transformers/models/kosmos_2/mod*_kosmos_2* @zucchini-nlp
/src/transformers/models/layoutlm/mod*_layoutlm* @NielsRogge
/src/transformers/models/layoutlmv2/mod*_layoutlmv2* @NielsRogge
/src/transformers/models/layoutlmv3/mod*_layoutlmv3* @NielsRogge
/src/transformers/models/layoutxlm/mod*_layoutxlm* @NielsRogge
/src/transformers/models/lilt/mod*_lilt* @zucchini-nlp
/src/transformers/models/llava/mod*_llava* @zucchini-nlp @ArthurZucker
/src/transformers/models/llava_next/mod*_llava_next* @zucchini-nlp
/src/transformers/models/llava_next_video/mod*_llava_next_video* @zucchini-nlp
/src/transformers/models/llava_onevision/mod*_llava_onevision* @zucchini-nlp
/src/transformers/models/lxmert/mod*_lxmert* @zucchini-nlp
/src/transformers/models/matcha/mod*_matcha* @zucchini-nlp
/src/transformers/models/mgp_str/mod*_mgp_str* @zucchini-nlp
/src/transformers/models/mllama/mod*_mllama* @zucchini-nlp
/src/transformers/models/nougat/mod*_nougat* @NielsRogge
/src/transformers/models/omdet_turbo/mod*_omdet_turbo* @yonigozlan
/src/transformers/models/oneformer/mod*_oneformer* @zucchini-nlp
/src/transformers/models/owlvit/mod*_owlvit* @yonigozlan
/src/transformers/models/owlv2/mod*_owlv2* @yonigozlan
/src/transformers/models/paligemma/mod*_paligemma* @zucchini-nlp @molbap
/src/transformers/models/perceiver/mod*_perceiver* @zucchini-nlp
/src/transformers/models/pix2struct/mod*_pix2struct* @zucchini-nlp
/src/transformers/models/pixtral/mod*_pixtral* @zucchini-nlp @ArthurZucker
/src/transformers/models/qwen2_audio/mod*_qwen2_audio* @zucchini-nlp @ArthurZucker
/src/transformers/models/qwen2_vl/mod*_qwen2_vl* @zucchini-nlp @ArthurZucker
/src/transformers/models/sam/mod*_sam* @zucchini-nlp @ArthurZucker
/src/transformers/models/siglip/mod*_siglip* @zucchini-nlp
/src/transformers/models/speech_encoder_decoder/mod*_speech_encoder_decoder* @zucchini-nlp
/src/transformers/models/tapas/mod*_tapas* @NielsRogge
/src/transformers/models/trocr/mod*_trocr* @zucchini-nlp
/src/transformers/models/tvlt/mod*_tvlt* @zucchini-nlp
/src/transformers/models/tvp/mod*_tvp* @zucchini-nlp
/src/transformers/models/udop/mod*_udop* @zucchini-nlp
/src/transformers/models/video_llava/mod*_video_llava* @zucchini-nlp
/src/transformers/models/vilt/mod*_vilt* @zucchini-nlp
/src/transformers/models/vipllava/mod*_vipllava* @zucchini-nlp
/src/transformers/models/vision_encoder_decoder/mod*_vision_encoder_decoder* @Rocketknight1
/src/transformers/models/vision_text_dual_encoder/mod*_vision_text_dual_encoder* @Rocketknight1
/src/transformers/models/visual_bert/mod*_visual_bert* @zucchini-nlp
/src/transformers/models/xclip/mod*_xclip* @zucchini-nlp
# Reinforcement learning models
/src/transformers/models/decision_transformer/mod*_decision_transformer* @Rocketknight1
/src/transformers/models/trajectory_transformer/mod*_trajectory_transformer* @Rocketknight1
# Time series models
/src/transformers/models/autoformer/mod*_autoformer* @Rocketknight1
/src/transformers/models/informer/mod*_informer* @Rocketknight1
/src/transformers/models/patchtsmixer/mod*_patchtsmixer* @Rocketknight1
/src/transformers/models/patchtst/mod*_patchtst* @Rocketknight1
/src/transformers/models/time_series_transformer/mod*_time_series_transformer* @Rocketknight1
# Graph models
/src/transformers/models/graphormer/mod*_graphormer* @clefourrier
# Finally, files with no owners that shouldn't generate pings; these are usually automatically generated and checked in the CI
utils/dummy*
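
The ownership rules above are gitignore-style globs, so one `mod*_<model>*` entry covers both the `modeling_*` and `modular_*` files of a model while leaving tokenizer or processor files unowned. A minimal sketch of that matching behavior, using Python's `fnmatch` as an approximation of the CODEOWNERS matcher:

```python
# Approximate a CODEOWNERS glob with shell-style matching (illustration only).
from fnmatch import fnmatch

pattern = "mod*_electra*"
for name in ["modeling_electra.py", "modular_electra.py", "tokenization_electra.py"]:
    # modeling_/modular_ files match and ping the owner; tokenization files do not
    print(f"{name}: {fnmatch(name, pattern)}")
```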

@@ -54,7 +54,7 @@ jobs:
- name: Create model files
run: |
. ~/venv/bin/activate
transformers-cli add-new-model-like --config_file tests/fixtures/add_distilbert_like_config.json --path_to_repo .
transformers add-new-model-like --config_file tests/fixtures/add_distilbert_like_config.json --path_to_repo .
make style
make fix-copies

.github/workflows/assign-reviewers.yml (new file)

@@ -0,0 +1,26 @@
name: Assign PR Reviewers
on:
pull_request_target:
branches:
- main
types: [ready_for_review]
jobs:
assign_reviewers:
permissions:
pull-requests: write
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install PyGithub
- name: Run assignment script
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: python .github/scripts/assign_reviewers.py
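
The workflow only wires up the trigger and token; the actual assignment logic lives in `.github/scripts/assign_reviewers.py`, which this diff does not include. A hypothetical sketch of how such a script could use PyGithub (the `RULES` mapping and every name below are illustrative assumptions, not the script's real contents):

```python
# Hypothetical sketch of .github/scripts/assign_reviewers.py (not part of this diff).
# Assumes PyGithub, installed above, and the standard GITHUB_* variables of Actions.
import json
import os
from fnmatch import fnmatch

from github import Github

# Illustrative rules only; a real mapping would presumably mirror CODEOWNERS.
RULES = {"src/transformers/models/whisper/*": ["eustlb"]}

def main() -> None:
    # GITHUB_EVENT_PATH points at the JSON payload of the triggering event.
    with open(os.environ["GITHUB_EVENT_PATH"]) as f:
        event = json.load(f)
    pr_number = event["pull_request"]["number"]

    gh = Github(os.environ["GITHUB_TOKEN"])
    repo = gh.get_repo(os.environ["GITHUB_REPOSITORY"])
    pr = repo.get_pull(pr_number)

    reviewers = set()
    for changed in pr.get_files():
        for pattern, owners in RULES.items():
            if fnmatch(changed.filename, pattern):
                reviewers.update(owners)
    reviewers.discard(pr.user.login)  # authors cannot review their own PR
    if reviewers:
        pr.create_review_request(reviewers=sorted(reviewers))

if __name__ == "__main__":
    main()
```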

@@ -1,42 +1,73 @@
name: Self-hosted runner (benchmark)
on:
schedule:
- cron: "17 2 * * *"
workflow_call:
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
HF_HOME: /mnt/cache
TF_FORCE_GPU_ALLOW_GROWTH: true
jobs:
benchmark:
name: Benchmark
runs-on: [single-gpu, nvidia-gpu, a10, ci]
strategy:
matrix:
# group: [aws-g5-4xlarge-cache, aws-p4d-24xlarge-plus] (A100 runner is not enabled)
group: [aws-g5-4xlarge-cache]
runs-on:
group: ${{ matrix.group }}
if: |
(github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark') )||
(github.event_name == 'push' && github.ref == 'refs/heads/main')
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
image: huggingface/transformers-pytorch-gpu
options: --gpus all --privileged --ipc host
steps:
- name: Update clone
working-directory: /transformers
- name: Get repo
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha || github.sha }}
- name: Install libpq-dev & psql
run: |
git fetch && git checkout ${{ github.sha }}
apt update
apt install -y libpq-dev postgresql-client
- name: Install benchmark script dependencies
run: python3 -m pip install -r benchmark/requirements.txt
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e ".[torch]"
- name: Benchmark (daily)
if: github.event_name == 'schedule'
working-directory: /transformers
- name: Run database init script
run: |
python3 -m pip install optimum-benchmark>=0.3.0
HF_TOKEN=${{ secrets.TRANSFORMERS_BENCHMARK_TOKEN }} python3 benchmark/benchmark.py --repo_id hf-internal-testing/benchmark_results --path_in_repo $(date +'%Y-%m-%d') --config-dir benchmark/config --config-name generation --commit=${{ github.sha }} backend.model=google/gemma-2b backend.cache_implementation=null,static backend.torch_compile=false,true --multirun
psql -f benchmark/utils/init_db.sql
env:
PGDATABASE: metrics
PGHOST: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGHOST }}
PGUSER: transformers_benchmarks
PGPASSWORD: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGPASSWORD }}
- name: Benchmark (merged to main event)
if: github.event_name == 'push' && github.ref_name == 'main'
working-directory: /transformers
- name: Run benchmark
run: |
python3 -m pip install optimum-benchmark>=0.3.0
HF_TOKEN=${{ secrets.TRANSFORMERS_BENCHMARK_TOKEN }} python3 benchmark/benchmark.py --repo_id hf-internal-testing/benchmark_results_merge_event --path_in_repo $(date +'%Y-%m-%d') --config-dir benchmark/config --config-name generation --commit=${{ github.sha }} backend.model=google/gemma-2b backend.cache_implementation=null,static backend.torch_compile=false,true --multirun
git config --global --add safe.directory /__w/transformers/transformers
if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
commit_id=$(echo "${{ github.event.pull_request.head.sha }}")
elif [ "$GITHUB_EVENT_NAME" = "push" ]; then
commit_id=$GITHUB_SHA
fi
commit_msg=$(git show -s --format=%s | cut -c1-70)
python3 benchmark/benchmarks_entrypoint.py "huggingface/transformers" "$BRANCH_NAME" "$commit_id" "$commit_msg"
env:
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
# Enable this to see debug logs
# HF_HUB_VERBOSITY: debug
# TRANSFORMERS_VERBOSITY: debug
PGHOST: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGHOST }}
PGUSER: transformers_benchmarks
PGPASSWORD: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGPASSWORD }}
BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
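
For context on the database steps: `psql -f benchmark/utils/init_db.sql` creates the schema, and `benchmarks_entrypoint.py` is then expected to write results through the same `PG*` variables. The entrypoint is not shown in this diff; the sketch below assumes a libpq-based client such as psycopg2, which reads `PGHOST`/`PGUSER`/`PGPASSWORD`/`PGDATABASE` from the environment, and the `benchmark_runs` table name is an assumption (the real schema lives in `init_db.sql`):

```python
# Hypothetical sketch of benchmark/benchmarks_entrypoint.py (not part of this diff).
import sys

import psycopg2

def record_run(repo: str, branch: str, commit_id: str, commit_msg: str) -> None:
    # Empty DSN: host/user/password/dbname are taken from the PG* env vars by libpq.
    conn = psycopg2.connect("")
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO benchmark_runs (repo, branch, commit_id, commit_msg) "
            "VALUES (%s, %s, %s, %s)",
            (repo, branch, commit_id, commit_msg),
        )
    conn.close()

if __name__ == "__main__":
    record_run(*sys.argv[1:5])  # mirrors the four positional args passed above
```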

.github/workflows/benchmark_v2.yml (new file)

@@ -0,0 +1,57 @@
name: Benchmark v2 Framework
on:
workflow_dispatch:
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
jobs:
benchmark-v2:
name: Benchmark v2
runs-on: ${{ inputs.runner }}
if: |
(github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark')) ||
(github.event_name == 'schedule')
container:
image: ${{ inputs.container_image }}
options: ${{ inputs.container_options }}
steps:
- name: Get repo
uses: actions/checkout@v4
with:
ref: ${{ inputs.commit_sha || github.sha }}
- name: Install benchmark dependencies
run: |
python3 -m pip install -r benchmark_v2/requirements.txt
- name: Reinstall transformers in edit mode
run: |
python3 -m pip uninstall -y transformers
python3 -m pip install -e ".[torch]"
- name: Show installed libraries and their versions
run: |
python3 -m pip list
python3 -c "import torch; print(f'PyTorch version: {torch.__version__}')"
python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python3 -c "import torch; print(f'CUDA device count: {torch.cuda.device_count()}')" || true
nvidia-smi || true
- name: Run benchmark v2
working-directory: benchmark_v2
run: |
echo "Running benchmarks"
python3 run_benchmarks.py \
--commit-id '${{ inputs.commit_sha || github.sha }}' \
--run-id '${{ inputs.run_id }}' \
--push-to-hub '${{ inputs.benchmark_repo_id}}' \
--token '${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}' \
--log-level INFO
env:
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
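
The `Run benchmark v2` step fixes the CLI surface of `benchmark_v2/run_benchmarks.py` even though the script body is outside this diff. A minimal sketch of an entry point accepting exactly those flags (everything beyond the flag names is an assumption):

```python
# Sketch of the CLI implied by the workflow invocation above (illustration only).
import argparse
import logging

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Run transformers benchmarks")
    parser.add_argument("--commit-id", required=True, help="commit SHA being benchmarked")
    parser.add_argument("--run-id", required=True, help="GitHub Actions run id")
    parser.add_argument("--push-to-hub", default=None, help="Hub dataset repo id for results")
    parser.add_argument("--token", default=None, help="token used to upload results")
    parser.add_argument("--log-level", default="INFO")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    logging.basicConfig(level=getattr(logging, args.log_level.upper(), logging.INFO))
    logging.info("Benchmarking commit %s (run %s)", args.commit_id, args.run_id)
```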

@@ -0,0 +1,17 @@
name: Benchmark v2 Scheduled Runner - A10 Single-GPU
on:
workflow_dispatch:
jobs:
benchmark-v2-default:
name: Benchmark v2 - Default Models
uses: ./.github/workflows/benchmark_v2.yml
with:
runner: aws-g5-4xlarge-cache-use1-public-80
container_image: huggingface/transformers-pytorch-gpu
container_options: --gpus all --privileged --ipc host --shm-size "16gb"
commit_sha: ${{ github.sha }}
run_id: ${{ github.run_id }}
benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
secrets: inherit

@@ -0,0 +1,17 @@
name: Benchmark v2 Scheduled Runner - MI325 Single-GPU
on:
workflow_dispatch:
jobs:
benchmark-v2-default:
name: Benchmark v2 - Default Models
uses: ./.github/workflows/benchmark_v2.yml
with:
runner: amd-mi325-ci-1gpu
container_image: huggingface/transformers-pytorch-amd-gpu
container_options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache
commit_sha: ${{ github.sha }}
run_id: ${{ github.run_id }}
benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
secrets: inherit

@@ -26,7 +26,7 @@ jobs:
strategy:
matrix:
file: ["quality", "consistency", "custom-tokenizers", "torch-light", "tf-light", "exotic-models", "torch-tf-light", "torch-jax-light", "jax-light", "examples-torch", "examples-tf"]
file: ["quality", "consistency", "custom-tokenizers", "torch-light", "exotic-models", "examples-torch"]
continue-on-error: true
steps:
@@ -34,11 +34,11 @@ jobs:
name: Set tag
run: |
if ${{contains(github.event.head_commit.message, '[build-ci-image]')}}; then
echo "TAG=huggingface/transformers-${{ matrix.file }}:dev" >> "$GITHUB_ENV"
echo "TAG=huggingface/transformers-${{ matrix.file }}:dev" >> "$GITHUB_ENV"
echo "setting it to DEV!"
else
echo "TAG=huggingface/transformers-${{ matrix.file }}" >> "$GITHUB_ENV"
fi
-
name: Set up Docker Buildx

@@ -5,6 +5,7 @@ on:
branches:
- build_ci_docker_image*
repository_dispatch:
workflow_dispatch:
workflow_call:
inputs:
image_postfix:
@@ -19,7 +20,7 @@ concurrency:
jobs:
latest-docker:
name: "Latest PyTorch + TensorFlow [dev]"
name: "Latest PyTorch [dev]"
runs-on:
group: aws-general-8-plus
steps:
@@ -63,14 +64,14 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the transformers-all-latest-gpu-push-ci docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-torch-deepspeed-docker:
name: "Latest PyTorch + DeepSpeed"
runs-on:
group: aws-general-8-plus
group: aws-g4dn-2xlarge-cache
steps:
-
name: Set up Docker Buildx
@@ -99,7 +100,7 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER}}
title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
@@ -140,7 +141,7 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu-push-ci docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
@@ -176,7 +177,7 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-doc-builder docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
@@ -214,28 +215,28 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-pytorch-gpu docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-pytorch-amd:
name: "Latest PyTorch (AMD) [dev]"
runs-on:
group: aws-general-8-plus
group: aws-highcpu-32-priv
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
@@ -263,45 +264,7 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-pytorch-amd-gpu-push-ci build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-tensorflow:
name: "Latest TensorFlow [dev]"
# Push CI doesn't need this image
if: inputs.image_postfix != '-push-ci'
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-tensorflow-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-tensorflow-gpu
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-tensorflow-gpu build
title: 🤗 Results of the huggingface/transformers-pytorch-amd-gpu-push-ci build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
@@ -310,19 +273,19 @@ jobs:
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
@@ -350,7 +313,7 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the transformers-pytorch-deepspeed-amd-gpu build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
@@ -388,6 +351,6 @@ jobs:
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the transformers-quantization-latest-gpu build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}

@@ -2,6 +2,10 @@ name: Build docker images (Nightly CI)
on:
workflow_call:
inputs:
job:
required: true
type: string
push:
branches:
- build_nightly_ci_docker_image*
@@ -12,7 +16,8 @@ concurrency:
jobs:
latest-with-torch-nightly-docker:
name: "Nightly PyTorch + Stable TensorFlow"
name: "Nightly PyTorch"
if: inputs.job == 'latest-with-torch-nightly-docker' || inputs.job == ''
runs-on:
group: aws-general-8-plus
steps:
@@ -41,8 +46,9 @@ jobs:
nightly-torch-deepspeed-docker:
name: "Nightly PyTorch + DeepSpeed"
if: inputs.job == 'nightly-torch-deepspeed-docker' || inputs.job == ''
runs-on:
group: aws-general-8-plus
group: aws-g4dn-2xlarge-cache
steps:
-
name: Set up Docker Buildx

@@ -16,8 +16,20 @@ jobs:
commit_sha: ${{ github.sha }}
package: transformers
notebook_folder: transformers_doc
languages: ar de en es fr hi it ko pt tr zh ja te
languages: en
custom_container: huggingface/transformers-doc-builder
secrets:
token: ${{ secrets.HUGGINGFACE_PUSH }}
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
build_other_lang:
uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
with:
commit_sha: ${{ github.sha }}
package: transformers
notebook_folder: transformers_doc
languages: ar de es fr hi it ja ko pt zh
custom_container: huggingface/transformers-doc-builder
secrets:
token: ${{ secrets.HUGGINGFACE_PUSH }}
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}

@@ -14,5 +14,4 @@ jobs:
commit_sha: ${{ github.event.pull_request.head.sha }}
pr_number: ${{ github.event.number }}
package: transformers
languages: ar de en es fr hi it ko pt tr zh ja te
custom_container: huggingface/transformers-doc-builder
languages: en

.github/workflows/check_failed_tests.yml (new file)

@@ -0,0 +1,255 @@
name: Process failed tests
on:
workflow_call:
inputs:
docker:
required: true
type: string
start_sha:
required: true
type: string
job:
required: true
type: string
slack_report_channel:
required: true
type: string
ci_event:
required: true
type: string
report_repo_id:
required: true
type: string
commit_sha:
required: false
type: string
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
CUDA_VISIBLE_DEVICES: 0,1
jobs:
check_new_failures:
name: "Find commits for new failing tests"
strategy:
matrix:
run_idx: [1]
runs-on:
group: aws-g5-4xlarge-cache
outputs:
process: ${{ steps.check_file.outputs.process }}
container:
image: ${{ inputs.docker }}
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- uses: actions/download-artifact@v4
with:
name: ci_results_${{ inputs.job }}
path: /transformers/ci_results_${{ inputs.job }}
- name: Check file
id: check_file
working-directory: /transformers
run: |
if [ -f ci_results_${{ inputs.job }}/new_failures.json ]; then
echo "`ci_results_${{ inputs.job }}/new_failures.json` exists, continue ..."
echo "process=true" >> $GITHUB_ENV
echo "process=true" >> $GITHUB_OUTPUT
else
echo "`ci_results_${{ inputs.job }}/new_failures.json` doesn't exist, abort."
echo "process=false" >> $GITHUB_ENV
echo "process=false" >> $GITHUB_OUTPUT
fi
- uses: actions/download-artifact@v4
if: ${{ env.process == 'true' }}
with:
pattern: setup_values*
path: setup_values
merge-multiple: true
- name: Prepare some setup values
if: ${{ env.process == 'true' }}
run: |
if [ -f setup_values/prev_workflow_run_id.txt ]; then
echo "PREV_WORKFLOW_RUN_ID=$(cat setup_values/prev_workflow_run_id.txt)" >> $GITHUB_ENV
else
echo "PREV_WORKFLOW_RUN_ID=" >> $GITHUB_ENV
fi
if [ -f setup_values/other_workflow_run_id.txt ]; then
echo "OTHER_WORKFLOW_RUN_ID=$(cat setup_values/other_workflow_run_id.txt)" >> $GITHUB_ENV
else
echo "OTHER_WORKFLOW_RUN_ID=" >> $GITHUB_ENV
fi
- name: Update clone
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Get target commit
working-directory: /transformers/utils
if: ${{ env.process == 'true' }}
run: |
echo "END_SHA=$(TOKEN=${{ secrets.ACCESS_REPO_INFO_TOKEN }} python3 -c 'import os; from get_previous_daily_ci import get_last_daily_ci_run_commit; commit=get_last_daily_ci_run_commit(token=os.environ["TOKEN"], workflow_run_id=os.environ["PREV_WORKFLOW_RUN_ID"]); print(commit)')" >> $GITHUB_ENV
- name: Checkout to `start_sha`
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: git fetch && git checkout ${{ inputs.start_sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
if: ${{ env.process == 'true' }}
run: |
nvidia-smi
- name: Environment
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: |
python3 utils/print_env.py
- name: Install pytest-flakefinder
if: ${{ env.process == 'true' }}
run: python3 -m pip install pytest-flakefinder
- name: Show installed libraries and their versions
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: pip freeze
- name: Check failed tests
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: python3 utils/check_bad_commit.py --start_commit ${{ inputs.start_sha }} --end_commit ${{ env.END_SHA }} --file ci_results_${{ inputs.job }}/new_failures.json --output_file new_failures_with_bad_commit_${{ inputs.job }}_${{ matrix.run_idx }}.json
- name: Show results
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: |
ls -l new_failures_with_bad_commit_${{ inputs.job }}_${{ matrix.run_idx }}.json
cat new_failures_with_bad_commit_${{ inputs.job }}_${{ matrix.run_idx }}.json
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: new_failures_with_bad_commit_${{ inputs.job }}_${{ matrix.run_idx }}
path: /transformers/new_failures_with_bad_commit_${{ inputs.job }}_${{ matrix.run_idx }}.json
process_new_failures_with_commit_info:
name: "process bad commit reports"
needs: check_new_failures
if: needs.check_new_failures.outputs.process == 'true'
runs-on:
group: aws-g5-4xlarge-cache
container:
image: ${{ inputs.docker }}
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- uses: actions/download-artifact@v4
with:
name: ci_results_${{ inputs.job }}
path: /transformers/ci_results_${{ inputs.job }}
- uses: actions/download-artifact@v4
with:
pattern: new_failures_with_bad_commit_${{ inputs.job }}*
path: /transformers/new_failures_with_bad_commit_${{ inputs.job }}
merge-multiple: true
- name: Check files
working-directory: /transformers
run: |
ls -la /transformers
ls -la /transformers/new_failures_with_bad_commit_${{ inputs.job }}
# Currently, we only run with a single runner by using `run_idx: [1]`. We might try multiple runners
# to further reduce false positives caused by flaky tests, which would require extra processing to merge reports.
- name: Merge files
shell: bash
working-directory: /transformers
run: |
cp /transformers/new_failures_with_bad_commit_${{ inputs.job }}/new_failures_with_bad_commit_${{ inputs.job }}_1.json new_failures_with_bad_commit.json
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Process report
shell: bash
working-directory: /transformers
env:
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
JOB_NAME: ${{ inputs.job }}
REPORT_REPO_ID: ${{ inputs.report_repo_id }}
run: |
python3 utils/process_bad_commit_report.py
- name: Process report
shell: bash
working-directory: /transformers
env:
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
JOB_NAME: ${{ inputs.job }}
REPORT_REPO_ID: ${{ inputs.report_repo_id }}
run: |
{
echo 'REPORT_TEXT<<EOF'
python3 utils/process_bad_commit_report.py
echo EOF
} >> "$GITHUB_ENV"
- name: Prepare Slack report title
working-directory: /transformers
run: |
pip install slack_sdk
echo "title=$(python3 -c 'import sys; sys.path.append("utils"); from utils.notification_service import job_to_test_map; ci_event = "${{ inputs.ci_event }}"; job = "${{ inputs.job }}"; test_name = job_to_test_map[job]; title = f"New failed tests of {ci_event}" + ":" + f" {test_name}"; print(title)')" >> $GITHUB_ENV
- name: Send processed report
if: ${{ !endsWith(env.REPORT_TEXT, '{}') }}
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
with:
# Slack channel id, channel name, or user id to post message.
# See also: https://api.slack.com/methods/chat.postMessage#channels
channel-id: '#${{ inputs.slack_report_channel }}'
# For posting a rich message using Block Kit
payload: |
{
"blocks": [
{
"type": "header",
"text": {
"type": "plain_text",
"text": "${{ env.title }}"
}
},
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "${{ env.REPORT_TEXT }}"
}
}
]
}
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
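
The heavy lifting in this workflow is `utils/check_bad_commit.py`, which receives `--start_commit`/`--end_commit` plus `new_failures.json` and pins each new failure to the commit that introduced it. The script itself is not part of this diff; below is a simplified sketch of that kind of search, written as a linear walk (the real utility may use a smarter bisect-style strategy):

```python
# Simplified first-bad-commit search over a start..end range (illustration only).
import subprocess
from typing import Optional

def test_passes_at(commit: str, test_id: str) -> bool:
    subprocess.run(["git", "checkout", "-q", commit], check=True)
    result = subprocess.run(["python3", "-m", "pytest", "-q", test_id])
    return result.returncode == 0

def first_bad_commit(start: str, end: str, test_id: str) -> Optional[str]:
    # Commits strictly after `start` up to `end`, oldest first.
    commits = subprocess.run(
        ["git", "rev-list", "--reverse", f"{start}..{end}"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    for commit in commits:
        if not test_passes_at(commit, test_id):
            return commit  # first commit at which the test starts failing
    return None
```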

.github/workflows/collated-reports.yml (new file)

@@ -0,0 +1,43 @@
name: CI collated reports
on:
workflow_call:
inputs:
job:
required: true
type: string
report_repo_id:
required: true
type: string
machine_type:
required: true
type: string
gpu_name:
description: Name of the GPU used for the job. It's enough that the value contains the name of the GPU, e.g. "noise-h100-more-noise". Case insensitive.
required: true
type: string
jobs:
collated_reports:
name: Collated reports
runs-on: ubuntu-22.04
if: always()
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
- name: Collated reports
shell: bash
env:
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
CI_SHA: ${{ github.sha }}
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
run: |
pip install huggingface_hub
python3 utils/collated_reports.py \
--path . \
--machine-type ${{ inputs.machine_type }} \
--commit-hash ${{ env.CI_SHA }} \
--job ${{ inputs.job }} \
--report-repo-id ${{ inputs.report_repo_id }} \
--gpu-name ${{ inputs.gpu_name }}
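
`utils/collated_reports.py` is likewise outside this diff. Given the flags above, a plausible sketch is that it bundles the downloaded report artifacts into one JSON document and pushes it to the reporting dataset via `huggingface_hub`; the file layout and payload shape below are assumptions:

```python
# Hypothetical sketch of utils/collated_reports.py (not part of this diff).
import json
import os
from pathlib import Path

from huggingface_hub import upload_file

def collate(path: str, machine_type: str, commit: str, job: str, repo_id: str, gpu: str) -> None:
    # Gather the short failure reports from the downloaded artifacts.
    reports = {str(p): p.read_text() for p in Path(path).rglob("failures_short.txt")}
    payload = {"machine_type": machine_type, "commit": commit, "job": job,
               "gpu_name": gpu, "reports": reports}
    out = Path("collated_reports.json")
    out.write_text(json.dumps(payload, indent=2))
    upload_file(
        path_or_fileobj=str(out),
        path_in_repo=f"{commit}/collated_reports_{job}.json",  # assumed layout
        repo_id=repo_id,
        repo_type="dataset",
        token=os.environ["TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN"],
    )
```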

@@ -16,7 +16,6 @@ env:
RUN_SLOW: yes
OMP_NUM_THREADS: 16
MKL_NUM_THREADS: 16
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
jobs:
@@ -27,10 +26,11 @@ jobs:
fail-fast: false
matrix:
split_keys: ${{ fromJson(inputs.split_keys) }}
runs-on: [single-gpu, nvidia-gpu, t4, ci]
runs-on:
group: aws-g5-4xlarge-cache
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers

@@ -14,10 +14,11 @@ env:
jobs:
setup:
name: Setup
runs-on: [single-gpu, nvidia-gpu, t4, ci]
runs-on:
group: aws-g5-4xlarge-cache
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
job_splits: ${{ steps.set-matrix.outputs.job_splits }}
split_keys: ${{ steps.set-matrix.outputs.split_keys }}
@@ -85,4 +86,4 @@ jobs:
uses: actions/upload-artifact@v4
with:
name: doc_test_results
path: doc_test_results

.github/workflows/get-pr-info.yml (new file)

@@ -0,0 +1,157 @@
name: Get PR commit SHA
on:
workflow_call:
inputs:
pr_number:
required: true
type: string
outputs:
PR_HEAD_REPO_FULL_NAME:
description: "The full name of the repository from which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_FULL_NAME }}
PR_BASE_REPO_FULL_NAME:
description: "The full name of the repository to which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_FULL_NAME }}
PR_HEAD_REPO_OWNER:
description: "The owner of the repository from which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_OWNER }}
PR_BASE_REPO_OWNER:
description: "The owner of the repository to which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_OWNER }}
PR_HEAD_REPO_NAME:
description: "The name of the repository from which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_NAME }}
PR_BASE_REPO_NAME:
description: "The name of the repository to which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_NAME }}
PR_HEAD_REF:
description: "The branch name of the pull request in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REF }}
PR_BASE_REF:
description: "The branch name in the base repository (to merge into)"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REF }}
PR_HEAD_SHA:
description: "The head sha of the pull request branch in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_SHA }}
PR_BASE_SHA:
description: "The head sha of the target branch in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_SHA }}
PR_MERGE_COMMIT_SHA:
description: "The sha of the merge commit for the pull request (created by GitHub) in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_SHA }}
PR_HEAD_COMMIT_DATE:
description: "The date of the head sha of the pull request branch in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_COMMIT_DATE }}
PR_MERGE_COMMIT_DATE:
description: "The date of the merge commit for the pull request (created by GitHub) in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_DATE }}
PR_HEAD_COMMIT_TIMESTAMP:
description: "The timestamp of the head sha of the pull request branch in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_COMMIT_TIMESTAMP }}
PR_MERGE_COMMIT_TIMESTAMP:
description: "The timestamp of the merge commit for the pull request (created by GitHub) in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_TIMESTAMP }}
PR:
description: "The PR"
value: ${{ jobs.get-pr-info.outputs.PR }}
PR_FILES:
description: "The files touched in the PR"
value: ${{ jobs.get-pr-info.outputs.PR_FILES }}
jobs:
get-pr-info:
runs-on: ubuntu-22.04
name: Get PR commit SHA better
outputs:
PR_HEAD_REPO_FULL_NAME: ${{ steps.pr_info.outputs.head_repo_full_name }}
PR_BASE_REPO_FULL_NAME: ${{ steps.pr_info.outputs.base_repo_full_name }}
PR_HEAD_REPO_OWNER: ${{ steps.pr_info.outputs.head_repo_owner }}
PR_BASE_REPO_OWNER: ${{ steps.pr_info.outputs.base_repo_owner }}
PR_HEAD_REPO_NAME: ${{ steps.pr_info.outputs.head_repo_name }}
PR_BASE_REPO_NAME: ${{ steps.pr_info.outputs.base_repo_name }}
PR_HEAD_REF: ${{ steps.pr_info.outputs.head_ref }}
PR_BASE_REF: ${{ steps.pr_info.outputs.base_ref }}
PR_HEAD_SHA: ${{ steps.pr_info.outputs.head_sha }}
PR_BASE_SHA: ${{ steps.pr_info.outputs.base_sha }}
PR_MERGE_COMMIT_SHA: ${{ steps.pr_info.outputs.merge_commit_sha }}
PR_HEAD_COMMIT_DATE: ${{ steps.pr_info.outputs.head_commit_date }}
PR_MERGE_COMMIT_DATE: ${{ steps.pr_info.outputs.merge_commit_date }}
PR_HEAD_COMMIT_TIMESTAMP: ${{ steps.get_timestamps.outputs.head_commit_timestamp }}
PR_MERGE_COMMIT_TIMESTAMP: ${{ steps.get_timestamps.outputs.merge_commit_timestamp }}
PR: ${{ steps.pr_info.outputs.pr }}
PR_FILES: ${{ steps.pr_info.outputs.files }}
if: ${{ inputs.pr_number != '' }}
steps:
- name: Extract PR details
id: pr_info
uses: actions/github-script@v6
with:
script: |
const { data: pr } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: ${{ inputs.pr_number }}
});
const { data: head_commit } = await github.rest.repos.getCommit({
owner: pr.head.repo.owner.login,
repo: pr.head.repo.name,
ref: pr.head.ref
});
const { data: merge_commit } = await github.rest.repos.getCommit({
owner: pr.base.repo.owner.login,
repo: pr.base.repo.name,
ref: pr.merge_commit_sha,
});
const { data: files } = await github.rest.pulls.listFiles({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: ${{ inputs.pr_number }}
});
core.setOutput('head_repo_full_name', pr.head.repo.full_name);
core.setOutput('base_repo_full_name', pr.base.repo.full_name);
core.setOutput('head_repo_owner', pr.head.repo.owner.login);
core.setOutput('base_repo_owner', pr.base.repo.owner.login);
core.setOutput('head_repo_name', pr.head.repo.name);
core.setOutput('base_repo_name', pr.base.repo.name);
core.setOutput('head_ref', pr.head.ref);
core.setOutput('base_ref', pr.base.ref);
core.setOutput('head_sha', pr.head.sha);
core.setOutput('base_sha', pr.base.sha);
core.setOutput('merge_commit_sha', pr.merge_commit_sha);
core.setOutput('pr', pr);
core.setOutput('head_commit_date', head_commit.commit.committer.date);
core.setOutput('merge_commit_date', merge_commit.commit.committer.date);
core.setOutput('files', files);
console.log('PR head commit:', {
head_commit: head_commit,
commit: head_commit.commit,
date: head_commit.commit.committer.date
});
console.log('PR merge commit:', {
merge_commit: merge_commit,
commit: merge_commit.commit,
date: merge_commit.commit.committer.date
});
- name: Convert dates to timestamps
id: get_timestamps
run: |
head_commit_date=${{ steps.pr_info.outputs.head_commit_date }}
merge_commit_date=${{ steps.pr_info.outputs.merge_commit_date }}
echo $head_commit_date
echo $merge_commit_date
head_commit_timestamp=$(date -d "$head_commit_date" +%s)
merge_commit_timestamp=$(date -d "$merge_commit_date" +%s)
echo $head_commit_timestamp
echo $merge_commit_timestamp
echo "head_commit_timestamp=$head_commit_timestamp" >> $GITHUB_OUTPUT
echo "merge_commit_timestamp=$merge_commit_timestamp" >> $GITHUB_OUTPUT

.github/workflows/get-pr-number.yml (new file)

@@ -0,0 +1,36 @@
name: Get PR number
on:
workflow_call:
outputs:
PR_NUMBER:
description: "The extracted PR number"
value: ${{ jobs.get-pr-number.outputs.PR_NUMBER }}
jobs:
get-pr-number:
runs-on: ubuntu-22.04
name: Get PR number
outputs:
PR_NUMBER: ${{ steps.set_pr_number.outputs.PR_NUMBER }}
steps:
- name: Get PR number
shell: bash
run: |
if [[ "${{ github.event.issue.number }}" != "" && "${{ github.event.issue.pull_request }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
elif [[ "${{ github.event.pull_request.number }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.pull_request.number }}" >> $GITHUB_ENV
elif [[ "${{ github.event.pull_request }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.number }}" >> $GITHUB_ENV
else
echo "PR_NUMBER=" >> $GITHUB_ENV
fi
- name: Check PR number
shell: bash
run: |
echo "${{ env.PR_NUMBER }}"
- name: Set PR number
id: set_pr_number
run: echo "PR_NUMBER=${{ env.PR_NUMBER }}" >> "$GITHUB_OUTPUT"

@@ -12,12 +12,22 @@ on:
slice_id:
required: true
type: number
runner:
required: true
type: string
docker:
required: true
type: string
commit_sha:
required: false
type: string
report_name_prefix:
required: false
default: run_models_gpu
type: string
runner_type:
required: false
type: string
report_repo_id:
required: false
type: string
env:
HF_HOME: /mnt/cache
@@ -28,9 +38,7 @@ env:
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
@@ -46,6 +54,8 @@ jobs:
container:
image: ${{ inputs.docker }}
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
machine_type: ${{ steps.set_machine_type.outputs.machine_type }}
steps:
- name: Echo input and matrix info
shell: bash
@@ -67,7 +77,7 @@ jobs:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
@@ -99,14 +109,15 @@ jobs:
run: pip freeze
- name: Set `machine_type` for report and artifact names
id: set_machine_type
working-directory: /transformers
shell: bash
run: |
echo "${{ inputs.machine_type }}"
if [ "${{ inputs.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
if [ "${{ inputs.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ inputs.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
elif [ "${{ inputs.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ inputs.machine_type }}
@@ -114,26 +125,58 @@ jobs:
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
echo "machine_type=$machine_type" >> $GITHUB_OUTPUT
- name: Create report directory if it doesn't exist
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
echo "dummy" > /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports/dummy.txt
ls -la /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
- name: Run all tests on GPU
working-directory: /transformers
run: python3 -m pytest -rsfE -v --make-reports=${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
run: |
script -q -c "PATCH_TESTING_METHODS_TO_COLLECT_OUTPUTS=yes _PATCHED_TESTING_METHODS_OUTPUT_DIR=/transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports python3 -m pytest -rsfE -v --make-reports=${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports tests/${{ matrix.folders }}" test_outputs.txt
ls -la
# Extract the exit code from the output file
EXIT_CODE=$(tail -1 test_outputs.txt | grep -o 'COMMAND_EXIT_CODE="[0-9]*"' | cut -d'"' -f2)
exit ${EXIT_CODE:-1}
- name: Failure short reports
if: ${{ failure() }}
# This step is only to show information in the GitHub Actions log.
# Always mark this step as successful, even if the report directory or the file `failures_short.txt` in it doesn't exist
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
run: cat /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports/failures_short.txt
- name: Run test
shell: bash
- name: Captured information
if: ${{ failure() }}
continue-on-error: true
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
cat /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports/captured_info.txt
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
- name: Copy test_outputs.txt
if: ${{ always() }}
continue-on-error: true
run: |
cp /transformers/test_outputs.txt /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
- name: "Test suite reports artifacts: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
name: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
collated_reports:
name: Collated Reports
if: ${{ always() }}
needs: run_models_gpu
uses: huggingface/transformers/.github/workflows/collated-reports.yml@main
with:
job: run_models_gpu
report_repo_id: ${{ inputs.report_repo_id }}
gpu_name: ${{ inputs.runner_type }}
machine_type: ${{ needs.run_models_gpu.outputs.machine_type }}
secrets: inherit

@@ -1,129 +0,0 @@
name: model jobs
on:
workflow_call:
inputs:
folder_slices:
required: true
type: string
machine_type:
required: true
type: string
slice_id:
required: true
type: number
runner:
required: true
type: string
docker:
required: true
type: string
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
run_models_gpu:
name: " "
strategy:
max-parallel: 1 # For now, not to parallelize. Can change later if it works well.
fail-fast: false
matrix:
folders: ${{ fromJson(inputs.folder_slices)[inputs.slice_id] }}
runs-on: ['${{ inputs.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ inputs.folder_slices }}"
echo "${{ matrix.folders }}"
echo "${{ toJson(fromJson(inputs.folder_slices)[inputs.slice_id]) }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
# set the artifact folder names (because the character `/` is not allowed).
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: Update / Install some packages (for Past CI)
if: ${{ contains(inputs.docker, '-past-') }}
working-directory: /transformers
run: |
python3 -m pip install -U datasets
- name: Update / Install some packages (for Past CI)
if: ${{ contains(inputs.docker, '-past-') && contains(inputs.docker, '-pytorch-') }}
working-directory: /transformers
run: |
python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: python3 -m pytest -rsfE -v --make-reports=${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }} -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Run test
shell: bash
run: |
mkdir -p /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ inputs.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ inputs.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports


@ -0,0 +1,120 @@
name: model jobs
on:
workflow_call:
inputs:
folder_slices:
required: true
type: string
slice_id:
required: true
type: number
runner:
required: true
type: string
machine_type:
required: true
type: string
report_name_prefix:
required: false
default: run_models_gpu
type: string
env:
RUN_SLOW: yes
PT_HPU_LAZY_MODE: 0
TRANSFORMERS_IS_CI: yes
PT_ENABLE_INT64_SUPPORT: 1
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache/.cache/huggingface
jobs:
run_models_gpu:
name: " "
strategy:
max-parallel: 8
fail-fast: false
matrix:
folders: ${{ fromJson(inputs.folder_slices)[inputs.slice_id] }}
runs-on:
group: ${{ inputs.runner }}
container:
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
options: --runtime=habana
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
--env HABANA_VISIBLE_DEVICES
--env HABANA_VISIBLE_MODULES
--cap-add=sys_nice
--shm-size=64G
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ inputs.folder_slices }}"
echo "${{ matrix.folders }}"
echo "${{ toJson(fromJson(inputs.folder_slices)[inputs.slice_id]) }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install dependencies
run: |
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn
- name: HL-SMI
run: |
hl-smi
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
- name: Environment
run: python3 utils/print_env.py
- name: Show installed libraries and their versions
run: pip freeze
- name: Set `machine_type` for report and artifact names
shell: bash
run: |
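# Map the Gaudi runner labels to the single-gpu / multi-gpu names used in report and artifact paths.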
if [ "${{ inputs.machine_type }}" = "1gaudi" ]; then
machine_type=single-gpu
elif [ "${{ inputs.machine_type }}" = "2gaudi" ]; then
machine_type=multi-gpu
else
machine_type=${{ inputs.machine_type }}
fi
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Run all tests on Gaudi
run: python3 -m pytest -v --make-reports=${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports
echo "hello" > reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
path: reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports


@ -0,0 +1,68 @@
# Used to notify core maintainers about a new model PR being merged
name: New model PR merged notification
on:
push:
branches:
- main
paths:
- 'src/transformers/models/*/modeling_*'
jobs:
notify_new_model:
name: Notify new model
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check new model
shell: bash
run: |
python -m pip install gitpython
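# `get_new_model` prints the directory of a newly added model (or an empty string); only the last line of the output is kept.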
python -c 'from utils.pr_slow_ci_models import get_new_model; new_model = get_new_model(diff_with_last_commit=True); print(new_model)' | tee output.txt
echo "NEW_MODEL=$(tail -n 1 output.txt)" >> $GITHUB_ENV
echo "COMMIT_SHA=$(git log -1 --format=%H)" >> $GITHUB_ENV
- name: print commit sha
if: ${{ env.NEW_MODEL != ''}}
shell: bash
run: |
echo "$COMMIT_SHA"
- name: print new model
if: ${{ env.NEW_MODEL != ''}}
shell: bash
run: |
echo "$NEW_MODEL"
- name: Notify
if: ${{ env.NEW_MODEL != ''}}
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
with:
# Slack channel id, channel name, or user id to post message.
# See also: https://api.slack.com/methods/chat.postMessage#channels
channel-id: transformers-new-model-notification
# For posting a rich message using Block Kit
payload: |
{
"blocks": [
{
"type": "header",
"text": {
"type": "plain_text",
"text": "New model!",
"emoji": true
}
},
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "<https://github.com/huggingface/transformers/commit/${{ env.COMMIT_SHA }}|New model: ${{ env.NEW_MODEL }}> GH_ArthurZucker, GH_lysandrejik, GH_ydshieh\ncommit SHA: ${{ env.COMMIT_SHA }}"
}
}
]
}
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}

.github/workflows/pr-style-bot.yml vendored Normal file

@ -0,0 +1,18 @@
# To run this bot, comment "@bot /style" on a PR
name: Style Bot
on:
issue_comment:
types: [created]
permissions:
pull-requests: write
jobs:
style:
uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@main
with:
python_quality_dependencies: "[quality]"
style_command_type: "default"
secrets:
bot_token: ${{ secrets.HF_STYLE_BOT_ACTION }}


@ -0,0 +1,134 @@
name: PR - build doc via comment
on:
issue_comment:
types:
- created
branches-ignore:
- main
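# Only one doc build per issue: a newer `build-doc` comment cancels the in-flight run.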
concurrency:
group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, 'build-doc') }}
cancel-in-progress: true
permissions: {}
jobs:
get-pr-number:
name: Get PR number
if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "molbap", "gante", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "eustlb", "MekkCyber", "vasqu", "ivarflakstad", "stevhliu", "ebezzam", "itazap"]'), github.actor) && (startsWith(github.event.comment.body, 'build-doc')) }}
uses: ./.github/workflows/get-pr-number.yml
get-pr-info:
name: Get PR commit SHA
needs: get-pr-number
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
uses: ./.github/workflows/get-pr-info.yml
with:
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
verify_pr_commit:
name: Verify that the PR commit corresponds to this comment event by comparing timestamps
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
runs-on: ubuntu-22.04
needs: get-pr-info
env:
COMMENT_DATE: ${{ github.event.comment.created_at }}
PR_MERGE_COMMIT_DATE: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_DATE }}
PR_MERGE_COMMIT_TIMESTAMP: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_TIMESTAMP }}
steps:
- run: |
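# Convert the ISO-8601 comment date to epoch seconds so the two timestamps can be compared numerically.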
COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
echo "COMMENT_DATE: $COMMENT_DATE"
echo "PR_MERGE_COMMIT_DATE: $PR_MERGE_COMMIT_DATE"
echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
echo "PR_MERGE_COMMIT_TIMESTAMP: $PR_MERGE_COMMIT_TIMESTAMP"
if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
exit -1;
fi
create_run:
name: Create run
needs: [get-pr-number, get-pr-info]
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != '' }}
permissions:
statuses: write
runs-on: ubuntu-22.04
steps:
- name: Create Run
id: create_run
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Create a commit status (pending) for a run of this workflow. The status has to be updated later in `update_run_status`.
# See https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-pr-info.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=pending" -f "description=Custom doc building job" -f "context=custom-doc-build"
reply_to_comment:
name: Reply to the comment
if: ${{ needs.create_run.result == 'success' }}
needs: [get-pr-number, create_run]
permissions:
pull-requests: write
runs-on: ubuntu-22.04
steps:
- name: Reply to the comment
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/issues/${{ needs.get-pr-number.outputs.PR_NUMBER }}/comments \
-f "body=[Building docs for all languages...](${{ env.GITHUB_RUN_URL }})"
build-doc:
name: Build doc
needs: [get-pr-number, get-pr-info]
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != '' }}
uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
with:
commit_sha: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
package: transformers
languages: ar de en es fr hi it ja ko pt zh
update_run_status:
name: Update Check Run Status
needs: [ get-pr-info, create_run, build-doc ]
permissions:
statuses: write
if: ${{ always() && needs.create_run.result == 'success' }}
runs-on: ubuntu-22.04
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
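# Treat `skipped` the same as `success` when deciding the final commit status.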
STATUS_OK: ${{ contains(fromJSON('["skipped", "success"]'), needs.create_run.result) }}
steps:
- name: Get `build-doc` job status
run: |
echo "${{ needs.build-doc.result }}"
echo $STATUS_OK
if [ "$STATUS_OK" = "true" ]; then
echo "STATUS=success" >> $GITHUB_ENV
else
echo "STATUS=failure" >> $GITHUB_ENV
fi
- name: Update PR commit statuses
run: |
echo "${{ needs.build-doc.result }}"
echo "${{ env.STATUS }}"
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-pr-info.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=${{ env.STATUS }}" -f "description=Custom doc building job" -f "context=custom-doc-build"

.github/workflows/pr_run_slow_ci.yml vendored Normal file

@ -0,0 +1,177 @@
name: PR slow CI
on:
pull_request_target:
types: [opened, synchronize, reopened]
jobs:
get-pr-number:
name: Get PR number
uses: ./.github/workflows/get-pr-number.yml
get-pr-info:
name: Get PR commit SHA
needs: get-pr-number
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
uses: ./.github/workflows/get-pr-info.yml
with:
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
get-jobs:
name: Get test files to run
runs-on: ubuntu-22.04
needs: [get-pr-number, get-pr-info]
outputs:
jobs: ${{ steps.get_jobs.outputs.jobs_to_run }}
steps:
- name: Get repository content
id: repo_content
uses: actions/github-script@v6
with:
script: |
const { data: tests_dir } = await github.rest.repos.getContent({
owner: '${{ needs.get-pr-info.outputs.PR_HEAD_REPO_OWNER }}',
repo: '${{ needs.get-pr-info.outputs.PR_HEAD_REPO_NAME }}',
path: 'tests',
ref: '${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}',
});
const { data: tests_models_dir } = await github.rest.repos.getContent({
owner: '${{ needs.get-pr-info.outputs.PR_HEAD_REPO_OWNER }}',
repo: '${{ needs.get-pr-info.outputs.PR_HEAD_REPO_NAME }}',
path: 'tests/models',
ref: '${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}',
});
const { data: tests_quantization_dir } = await github.rest.repos.getContent({
owner: '${{ needs.get-pr-info.outputs.PR_HEAD_REPO_OWNER }}',
repo: '${{ needs.get-pr-info.outputs.PR_HEAD_REPO_NAME }}',
path: 'tests/quantization',
ref: '${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}',
});
core.setOutput('tests_dir', tests_dir);
core.setOutput('tests_models_dir', tests_models_dir);
core.setOutput('tests_quantization_dir', tests_quantization_dir);
# This checks out the main branch
- uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Write pr_files file
run: |
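# The quoted heredoc delimiter ('EOF') keeps the JSON payload verbatim (no shell expansion of `$`, backticks, etc.).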
cat > pr_files.txt << 'EOF'
${{ needs.get-pr-info.outputs.PR_FILES }}
EOF
- name: Write tests_dir file
run: |
cat > tests_dir.txt << 'EOF'
${{ steps.repo_content.outputs.tests_dir }}
EOF
- name: Write tests_models_dir file
run: |
cat > tests_models_dir.txt << 'EOF'
${{ steps.repo_content.outputs.tests_models_dir }}
EOF
- name: Write tests_quantization_dir file
run: |
cat > tests_quantization_dir.txt << 'EOF'
${{ steps.repo_content.outputs.tests_quantization_dir }}
EOF
- name: Run script to get jobs to run
id: get_jobs
run: |
python utils/get_pr_run_slow_jobs.py | tee output.txt
echo "jobs_to_run: $(tail -n 1 output.txt)"
echo "jobs_to_run=$(tail -n 1 output.txt)" >> $GITHUB_OUTPUT
send_comment:
# Will delete the previous comment and send a new one if:
# - either the content is changed
# - or the previous comment is 30 minutes or more old
name: Send a comment to suggest jobs to run
if: ${{ needs.get-jobs.outputs.jobs != '' }}
needs: [get-pr-number, get-jobs]
permissions:
pull-requests: write
runs-on: ubuntu-22.04
steps:
- name: Check and update comment if needed
uses: actions/github-script@v7
env:
BODY: "\n\nrun-slow: ${{ needs.get-jobs.outputs.jobs }}"
with:
script: |
const prNumber = ${{ needs.get-pr-number.outputs.PR_NUMBER }};
const commentPrefix = "**[For maintainers]** Suggested jobs to run (before merge)";
const thirtyMinutesAgo = new Date(Date.now() - 30 * 60 * 1000); // 30 minutes ago
const newBody = `${commentPrefix}${process.env.BODY}`;
// Get all comments on the PR
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber
});
// Find existing comments that start with our prefix
const existingComments = comments.filter(comment =>
comment.user.login === 'github-actions[bot]' &&
comment.body.startsWith(commentPrefix)
);
let shouldCreateNewComment = true;
let commentsToDelete = [];
if (existingComments.length > 0) {
// Get the most recent comment
const mostRecentComment = existingComments
.sort((a, b) => new Date(b.created_at) - new Date(a.created_at))[0];
const commentDate = new Date(mostRecentComment.created_at);
const isOld = commentDate < thirtyMinutesAgo;
const isDifferentContent = mostRecentComment.body !== newBody;
console.log(`Most recent comment created: ${mostRecentComment.created_at}`);
console.log(`Is older than 30 minutes: ${isOld}`);
console.log(`Has different content: ${isDifferentContent}`);
if (isOld || isDifferentContent) {
// Delete all existing comments and create new one
commentsToDelete = existingComments;
console.log(`Will delete ${commentsToDelete.length} existing comment(s) and create new one`);
} else {
// Content is same and comment is recent, skip
shouldCreateNewComment = false;
console.log('Comment is recent and content unchanged, skipping update');
}
} else {
console.log('No existing comments found, will create new one');
}
// Delete old comments if needed
for (const comment of commentsToDelete) {
console.log(`Deleting comment #${comment.id} (created: ${comment.created_at})`);
await github.rest.issues.deleteComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: comment.id
});
}
// Create new comment if needed
if (shouldCreateNewComment) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: newBody
});
console.log('✅ New comment created');
} else {
console.log('No comment update needed');
}


@ -4,18 +4,6 @@ on:
push:
branches: [ main ]
env:
OUTPUT_SLACK_CHANNEL_ID: "C06L2SGMEEA"
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
jobs:
get_modified_models:
name: "Get all modified files"
@ -25,118 +13,145 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@3f54ebb830831fc121d3263c1857cfbdc310cdb9 #v42
- name: Get changed files using `actions/github-script`
id: get-changed-files
uses: actions/github-script@v7
with:
files: src/transformers/models/**
- name: Run step if only the files listed above change
if: steps.changed-files.outputs.any_changed == 'true'
id: set-matrix
script: |
let files = [];
// Only handle push events
if (context.eventName === 'push') {
const afterSha = context.payload.after;
const branchName = context.payload.ref.replace('refs/heads/', '');
let baseSha;
if (branchName === 'main') {
console.log('Push to main branch, comparing to parent commit');
// Get the parent commit of the pushed commit
const { data: commit } = await github.rest.repos.getCommit({
owner: context.repo.owner,
repo: context.repo.repo,
ref: afterSha
});
baseSha = commit.parents[0]?.sha;
if (!baseSha) {
throw new Error('No parent commit found for the pushed commit');
}
} else {
console.log(`Push to branch ${branchName}, comparing to main`);
baseSha = 'main';
}
const { data: comparison } = await github.rest.repos.compareCommits({
owner: context.repo.owner,
repo: context.repo.repo,
base: baseSha,
head: afterSha
});
// Include added, modified, and renamed files
files = comparison.files
.filter(file => file.status === 'added' || file.status === 'modified' || file.status === 'renamed')
.map(file => file.filename);
}
// Include all files under src/transformers/ (not just models subdirectory)
const filteredFiles = files.filter(file =>
file.startsWith('src/transformers/')
);
core.setOutput('changed_files', filteredFiles.join(' '));
core.setOutput('any_changed', filteredFiles.length > 0 ? 'true' : 'false');
- name: Parse changed files with Python
if: steps.get-changed-files.outputs.any_changed == 'true'
env:
ALL_CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
CHANGED_FILES: ${{ steps.get-changed-files.outputs.changed_files }}
id: set-matrix
run: |
model_arrays=()
for file in $ALL_CHANGED_FILES; do
model_path="${file#*models/}"
model_path="models/${model_path%%/*}"
if grep -qFx "$model_path" utils/important_models.txt; then
# Append the file to the matrix string
model_arrays+=("$model_path")
fi
done
matrix_string=$(printf '"%s", ' "${model_arrays[@]}" | sed 's/, $//')
echo "matrix=[$matrix_string]" >> $GITHUB_OUTPUT
test_modified_files:
python3 - << 'EOF'
import os
import sys
import json
# Add the utils directory to Python path
sys.path.insert(0, 'utils')
# Import the important models list
from important_files import IMPORTANT_MODELS
print(f"Important models: {IMPORTANT_MODELS}")
# Get the changed files from the previous step
changed_files_str = os.environ.get('CHANGED_FILES', '')
changed_files = changed_files_str.split() if changed_files_str else []
# Filter to only Python files
python_files = [f for f in changed_files if f.endswith('.py')]
print(f"Python files changed: {python_files}")
result_models = set()
# Specific files that trigger all models
transformers_utils_files = [
'modeling_utils.py',
'modeling_rope_utils.py',
'modeling_flash_attention_utils.py',
'modeling_attn_mask_utils.py',
'cache_utils.py',
'masking_utils.py',
'pytorch_utils.py'
]
# Single loop through all Python files
for file in python_files:
# Check for files under src/transformers/models/
if file.startswith('src/transformers/models/'):
remaining_path = file[len('src/transformers/models/'):]
if '/' in remaining_path:
model_dir = remaining_path.split('/')[0]
if model_dir in IMPORTANT_MODELS:
result_models.add(model_dir)
print(f"Added model directory: {model_dir}")
# Check for specific files under src/transformers/ or src/transformers/generation/ files
elif file.startswith('src/transformers/generation/') or \
(file.startswith('src/transformers/') and os.path.basename(file) in transformers_utils_files):
print(f"Found core file: {file} - including all important models")
result_models.update(IMPORTANT_MODELS)
break # No need to continue once we include all models
# Convert to sorted list and create matrix
result_list = sorted(list(result_models))
print(f"Final model list: {result_list}")
if result_list:
matrix_json = json.dumps(result_list)
print(f"matrix={matrix_json}")
# Write to GITHUB_OUTPUT
with open(os.environ['GITHUB_OUTPUT'], 'a') as f:
f.write(f"matrix={matrix_json}\n")
else:
print("matrix=[]")
with open(os.environ['GITHUB_OUTPUT'], 'a') as f:
f.write("matrix=[]\n")
EOF
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled.yml
needs: get_modified_models
name: Slow & FA2 tests
runs-on: [single-gpu, nvidia-gpu, a10, ci]
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
if: ${{ needs.get_modified_models.outputs.matrix != '[]' && needs.get_modified_models.outputs.matrix != '' && fromJson(needs.get_modified_models.outputs.matrix)[0] != null }}
strategy:
fail-fast: false
matrix:
model-name: ${{ fromJson(needs.get_modified_models.outputs.matrix) }}
steps:
- name: Check out code
uses: actions/checkout@v4
- name: Install locally transformers & other libs
run: |
apt install sudo
sudo -H pip install --upgrade pip
sudo -H pip uninstall -y transformers
sudo -H pip install -U -e ".[testing]"
MAX_JOBS=4 pip install flash-attn --no-build-isolation
pip install bitsandbytes
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Show installed libraries and their versions
run: pip freeze
- name: Run FA2 tests
id: run_fa2_tests
run:
pytest -rsfE -m "flash_attn_test" --make-reports=${{ matrix.model-name }}_fa2_tests/ tests/${{ matrix.model-name }}/test_modeling_*
- name: "Test suite reports artifacts: ${{ matrix.model-name }}_fa2_tests"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.model-name }}_fa2_tests
path: /transformers/reports/${{ matrix.model-name }}_fa2_tests
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ env.OUTPUT_SLACK_CHANNEL_ID }}
title: 🤗 Results of the FA2 tests - ${{ matrix.model-name }}
status: ${{ steps.run_fa2_tests.conclusion }}
slack_token: ${{ secrets.CI_SLACK_BOT_TOKEN }}
- name: Run integration tests
id: run_integration_tests
if: always()
run:
pytest -rsfE -k "IntegrationTest" --make-reports=tests_integration_${{ matrix.model-name }} tests/${{ matrix.model-name }}/test_modeling_*
- name: "Test suite reports artifacts: tests_integration_${{ matrix.model-name }}"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: tests_integration_${{ matrix.model-name }}
path: /transformers/reports/tests_integration_${{ matrix.model-name }}
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ env.OUTPUT_SLACK_CHANNEL_ID }}
title: 🤗 Results of the Integration tests - ${{ matrix.model-name }}
status: ${{ steps.run_integration_tests.conclusion}}
slack_token: ${{ secrets.CI_SLACK_BOT_TOKEN }}
- name: Tailscale # In order to be able to SSH when a test fails
if: ${{ runner.debug == '1'}}
uses: huggingface/tailscale-action@v1
with:
authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
waitForSSH: true
benchmark:
name: Benchmark workflow
needs: get_modified_models
if: ${{ needs.get_modified_models.outputs.matrix != '[]' && needs.get_modified_models.outputs.matrix != '' && fromJson(needs.get_modified_models.outputs.matrix)[0] != null }}
uses: ./.github/workflows/benchmark.yml
if: needs.get_modified_models.outputs.matrix != '' && needs.get_modified_models.outputs.matrix != '[]'
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-push"
docker: huggingface/transformers-all-latest-gpu
ci_event: push
report_repo_id: hf-internal-testing/transformers_ci_push
commit_sha: ${{ github.sha }}
models: ${{ needs.get_modified_models.outputs.matrix }}
secrets: inherit

.github/workflows/self-comment-ci.yml vendored Normal file

@ -0,0 +1,415 @@
name: PR comment GitHub CI
on:
issue_comment:
types:
- created
branches-ignore:
- main
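# Only one slow-CI run per issue: a newer `run-slow` comment cancels the in-flight run.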
concurrency:
group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow') }}
cancel-in-progress: true
permissions: read-all
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
CUDA_VISIBLE_DEVICES: 0,1
jobs:
get-pr-number:
runs-on: ubuntu-22.04
name: Get PR number
# For security: only allow team members to run
if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "molbap", "gante", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "eustlb", "MekkCyber", "vasqu", "ivarflakstad", "stevhliu", "ebezzam", "remi-or", "itazap"]'), github.actor) && (startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow')) }}
outputs:
PR_NUMBER: ${{ steps.set_pr_number.outputs.PR_NUMBER }}
steps:
- name: Get PR number
shell: bash
run: |
if [[ "${{ github.event.issue.number }}" != "" && "${{ github.event.issue.pull_request }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
else
echo "PR_NUMBER=" >> $GITHUB_ENV
fi
- name: Check PR number
shell: bash
run: |
echo "${{ env.PR_NUMBER }}"
- name: Set PR number
id: set_pr_number
run: echo "PR_NUMBER=${{ env.PR_NUMBER }}" >> "$GITHUB_OUTPUT"
get-sha:
runs-on: ubuntu-22.04
needs: get-pr-number
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
outputs:
PR_HEAD_SHA: ${{ steps.get_sha.outputs.PR_HEAD_SHA }}
PR_MERGE_SHA: ${{ steps.get_sha.outputs.PR_MERGE_SHA }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
- name: Get SHA (and verify timestamps against the issue comment date)
id: get_sha
env:
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
COMMENT_DATE: ${{ github.event.comment.created_at }}
run: |
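# Resolve both the PR head commit and the merge commit; the merge SHA is re-checked later in each job as a security guard.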
git fetch origin refs/pull/$PR_NUMBER/head:refs/remotes/pull/$PR_NUMBER/head
git checkout refs/remotes/pull/$PR_NUMBER/head
echo "PR_HEAD_SHA: $(git log -1 --format=%H)"
echo "PR_HEAD_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
git fetch origin refs/pull/$PR_NUMBER/merge:refs/remotes/pull/$PR_NUMBER/merge
git checkout refs/remotes/pull/$PR_NUMBER/merge
echo "PR_MERGE_SHA: $(git log -1 --format=%H)"
echo "PR_MERGE_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
PR_MERGE_COMMIT_TIMESTAMP=$(git log -1 --date=unix --format=%cd)
echo "PR_MERGE_COMMIT_TIMESTAMP: $PR_MERGE_COMMIT_TIMESTAMP"
COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
echo "COMMENT_DATE: $COMMENT_DATE"
echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
exit -1;
fi
# use a python script to handle this complex logic
# case 1: `run-slow` (automatically infer a limited set of models to run, in particular any newly added model)
# case 2: `run-slow model_1, model_2`
get-tests:
runs-on: ubuntu-22.04
needs: [get-pr-number, get-sha]
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
outputs:
models: ${{ steps.models_to_run.outputs.models }}
quantizations: ${{ steps.models_to_run.outputs.quantizations }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Get models to test
env:
PR_COMMENT: ${{ github.event.comment.body }}
run: |
python -m pip install GitPython
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" | tee output.txt
echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" --quantization | tee output2.txt
echo "quantizations=$(tail -n 1 output2.txt)" >> $GITHUB_ENV
- name: Show models to test
id: models_to_run
run: |
echo "${{ env.models }}"
echo "models=${{ env.models }}" >> $GITHUB_ENV
echo "models=${{ env.models }}" >> $GITHUB_OUTPUT
echo "${{ env.quantizations }}"
echo "quantizations=${{ env.quantizations }}" >> $GITHUB_OUTPUT
reply_to_comment:
name: Reply to the comment
if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-pr-number, get-tests]
permissions:
pull-requests: write
runs-on: ubuntu-22.04
steps:
- name: Reply to the comment
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
MODELS: ${{ needs.get-tests.outputs.models }}
BODY: "\n\nmodels: ${{ needs.get-tests.outputs.models }}\nquantizations: ${{ needs.get-tests.outputs.quantizations }}"
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/issues/${{ needs.get-pr-number.outputs.PR_NUMBER }}/comments \
-f "body=This comment contains run-slow, running the specified jobs: ${{ env.BODY }} ..."
create_run:
name: Create run
if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-sha, get-tests, reply_to_comment]
permissions:
statuses: write
runs-on: ubuntu-22.04
steps:
- name: Create Run
id: create_run
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Create a commit status (pending) for a run of this workflow. The status has to be updated later in `update_run_status`.
# See https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=pending" -f "description=Slow CI job" -f "context=pytest/custom-tests"
run_models_gpu:
name: Run all tests for the model
if: ${{ needs.get-tests.outputs.models != '[]' }}
needs: [get-pr-number, get-sha, get-tests, create_run]
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.get-tests.outputs.models) }}
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ matrix.folders }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
# set the artifact folder names (because the character `/` is not allowed).
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Checkout to PR merge commit
working-directory: /transformers
run: |
git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git log -1 --format=%H
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
working-directory: /transformers
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
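# Map the AWS runner labels to the single-gpu / multi-gpu names used in report and artifact paths.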
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: |
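# Some test folders need a restricted GPU set; the helper script picks the visible devices for each folder.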
export CUDA_VISIBLE_DEVICES="$(python3 utils/set_cuda_devices_for_ci.py --test_folder ${{ matrix.folders }})"
echo $CUDA_VISIBLE_DEVICES
python3 -m pytest -v -rsfE --make-reports=${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
run_quantization_torch_gpu:
name: Run all tests for a quantization
if: ${{ needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-pr-number, get-sha, get-tests, create_run]
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.get-tests.outputs.quantizations) }}
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-quantization-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo folder ${{ matrix.folders }}
shell: bash
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'quantization/'/'quantization_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Checkout to PR merge commit
working-directory: /transformers
run: |
git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git log -1 --format=%H
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
working-directory: /transformers
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run quantization tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports
update_run_status:
name: Update Check Run Status
needs: [get-sha, create_run, run_models_gpu, run_quantization_torch_gpu]
permissions:
statuses: write
if: ${{ always() && needs.create_run.result == 'success' }}
runs-on: ubuntu-22.04
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
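# A job skipped because its matrix was empty must not fail the commit status, so `skipped` counts as success.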
STATUS_OK: ${{ contains(fromJSON('["skipped", "success"]'), needs.run_models_gpu.result) && contains(fromJSON('["skipped", "success"]'), needs.run_quantization_torch_gpu.result) }}
steps:
- name: Get `run_models_gpu` job status
run: |
echo "${{ needs.run_models_gpu.result }}"
echo "${{ needs.run_quantization_torch_gpu.result }}"
echo $STATUS_OK
if [ "$STATUS_OK" = "true" ]; then
echo "STATUS=success" >> $GITHUB_ENV
else
echo "STATUS=failure" >> $GITHUB_ENV
fi
- name: Update PR commit statuses
run: |
echo "${{ needs.run_models_gpu.result }}"
echo "${{ env.STATUS }}"
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=${{ env.STATUS }}" -f "description=Slow CI job" -f "context=pytest/custom-tests"


@ -1,43 +1,56 @@
name: Self-hosted runner (nightly-ci)
name: Nvidia CI with nightly torch
on:
repository_dispatch:
schedule:
- cron: "17 2 * * *"
# triggered when the daily scheduled Nvidia CI is completed.
# This way, we can compare the results more easily.
workflow_run:
workflows: ["Nvidia CI"]
branches: ["main"]
types: [completed]
push:
branches:
- run_nightly_ci*
- run_ci_with_nightly_torch*
# Used for `push` events to easily override the target workflow runs to compare against
env:
prev_workflow_run_id: ""
other_workflow_run_id: ""
jobs:
build_nightly_ci_images:
name: Build Nightly CI Docker Images
if: (github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_nightly_ci'))
build_nightly_torch_ci_images:
name: Build CI Docker Images with nightly torch
uses: ./.github/workflows/build-nightly-ci-docker-images.yml
with:
job: latest-with-torch-nightly-docker
secrets: inherit
setup:
name: Setup
runs-on: ubuntu-22.04
steps:
- name: Setup
run: |
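# Persist the workflow-run ids as an artifact so later jobs know which previous runs to compare against.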
mkdir "setup_values"
echo "${{ inputs.prev_workflow_run_id || env.prev_workflow_run_id }}" > "setup_values/prev_workflow_run_id.txt"
echo "${{ inputs.other_workflow_run_id || env.other_workflow_run_id }}" > "setup_values/other_workflow_run_id.txt"
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: setup_values
path: setup_values
model-ci:
name: Model CI
needs: [build_nightly_ci_images]
needs: build_nightly_torch_ci_images
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-past-future"
runner: ci
docker: huggingface/transformers-all-latest-torch-nightly-gpu
ci_event: Nightly CI
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
needs: [build_nightly_ci_images]
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-past-future"
runner: ci
# test deepspeed nightly build with the latest release torch
docker: huggingface/transformers-pytorch-deepspeed-latest-gpu
ci_event: Nightly CI
working-directory-prefix: /workspace
report_repo_id: hf-internal-testing/transformers_daily_ci_with_torch_nightly
commit_sha: ${{ github.event.workflow_run.head_sha || github.sha }}
secrets: inherit


@ -21,39 +21,6 @@ jobs:
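# `run_number % 10` rotates through 10 buckets so each past framework/version job below runs once every 10 triggers.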
echo "$(python3 -c 'print(int(${{ github.run_number }}) % 10)')"
echo "run_number=$(python3 -c 'print(int(${{ github.run_number }}) % 10)')" >> $GITHUB_OUTPUT
run_past_ci_pytorch_1-13:
name: PyTorch 1.13
needs: get_number
if: needs.get_number.outputs.run_number == 0 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.13"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_pytorch_1-12:
name: PyTorch 1.12
needs: get_number
if: needs.get_number.outputs.run_number == 1 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.12"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_pytorch_1-11:
name: PyTorch 1.11
needs: get_number
if: needs.get_number.outputs.run_number == 2 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.11"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_tensorflow_2-11:
name: TensorFlow 2.11
needs: get_number


@ -1,135 +0,0 @@
name: PR slow CI
on:
pull_request:
paths:
- "src/transformers/models/*/modeling_*.py"
- "tests/**/test_*.py"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
find_models_to_run:
runs-on: ubuntu-22.04
name: Find models to run slow tests
# Triggered only if the required label `run-slow` is added
if: ${{ contains(github.event.pull_request.labels.*.name, 'run-slow') }}
outputs:
models: ${{ steps.models_to_run.outputs.models }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: ${{ github.event.pull_request.head.sha }}
- name: Get commit message
run: |
echo "commit_message=$(git show -s --format=%s)" >> $GITHUB_ENV
- name: Get models to run slow tests
run: |
echo "${{ env.commit_message }}"
python -m pip install GitPython
python utils/pr_slow_ci_models.py --commit_message "${{ env.commit_message }}" | tee output.txt
echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
- name: Models to run slow tests
id: models_to_run
run: |
echo "${{ env.models }}"
echo "models=${{ env.models }}" >> $GITHUB_OUTPUT
run_models_gpu:
name: Run all tests for the model
# Runs only if `find_models_to_run` produced a non-empty model list (the `run-slow` label was added
# on either a new model PR or via a commit message)
if: ${{ needs.find_models_to_run.outputs.models != '[]' }}
needs: find_models_to_run
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.find_models_to_run.outputs.models) }}
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', nvidia-gpu, t4, ci]
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ matrix.folders }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
# set the artifact folder names (because the character `/` is not allowed).
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Update clone
working-directory: /transformers
run: git fetch && git fetch origin pull/${{ github.event.pull_request.number }}/head:pull/${{ github.event.pull_request.number }}/merge && git checkout pull/${{ github.event.pull_request.number }}/merge
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: |
export CUDA_VISIBLE_DEVICES="$(python3 utils/set_cuda_devices_for_ci.py --test_folder ${{ matrix.folders }})"
echo $CUDA_VISIBLE_DEVICES
python3 -m pytest -v -rsfE --make-reports=${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports


@ -1,25 +1,25 @@
name: Self-hosted runner (AMD mi210 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi210
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi210
secrets: inherit
name: Self-hosted runner (AMD mi210 CI caller)
on:
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi210
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi210
secrets: inherit


@ -1,25 +1,25 @@
name: Self-hosted runner (AMD mi250 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi250
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi250
secrets: inherit
name: Self-hosted runner (AMD mi250 CI caller)
on:
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi250
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi250
secrets: inherit


@ -1,25 +0,0 @@
name: Self-hosted runner (AMD mi300 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi300
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && (startsWith(github.ref_name, 'run_amd_push_ci_caller') || startsWith(github.ref_name, 'mi300-ci'))))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi300
secrets: inherit


@ -14,7 +14,6 @@ env:
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 60
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
jobs:


@ -25,7 +25,7 @@ jobs:
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v41
uses: tj-actions/changed-files@1c8e6069583811afb28f97afeaf8e7da80c6be5c
- name: Was setup changed
id: was_changed
@ -51,4 +51,4 @@ jobs:
needs: build-docker-containers
steps:
- name: Trigger push CI via workflow_run
run: echo "Trigger push CI via workflow_run"


@ -24,7 +24,6 @@ env:
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 60
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
@ -32,11 +31,12 @@ jobs:
name: Setup
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', nvidia-gpu, t4, push-ci]
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu-push-ci
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
test_map: ${{ steps.set-matrix.outputs.test_map }}
@ -131,11 +131,12 @@ jobs:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.setup.outputs.matrix) }}
machine_type: [single-gpu]
runs-on: ['${{ matrix.machine_type }}', nvidia-gpu, t4, push-ci]
machine_type: [aws-g5-4xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu-push-ci
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
env:
# For the meaning of these environment variables, see the job `Setup`
CI_BRANCH_PUSH: ${{ github.event.ref }}
@ -162,6 +163,23 @@ jobs:
echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
echo "env.CI_SHA = ${{ env.CI_SHA }}"
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Update clone using environment variables
working-directory: /transformers
run: |
@ -203,19 +221,19 @@ jobs:
- name: Run all non-slow selected tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -n 2 --dist=loadfile -v --make-reports=${{ matrix.machine_type }}_tests_gpu_${{ matrix.folders }} ${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}
python3 -m pytest -n 2 --dist=loadfile -v --make-reports=${{ env.machine_type }}_tests_gpu_${{ matrix.folders }} ${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_tests_gpu_${{ matrix.folders }}/failures_short.txt
run: cat /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_tests_gpu_${{ matrix.folders }}
name: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}
run_tests_multi_gpu:
name: Model tests
@ -226,8 +244,9 @@ jobs:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.setup.outputs.matrix) }}
machine_type: [multi-gpu]
runs-on: ['${{ matrix.machine_type }}', nvidia-gpu, t4, push-ci]
machine_type: [aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu-push-ci
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
@ -257,6 +276,23 @@ jobs:
echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
echo "env.CI_SHA = ${{ env.CI_SHA }}"
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Update clone using environment variables
working-directory: /transformers
run: |
@ -300,19 +336,19 @@ jobs:
MKL_SERVICE_FORCE_INTEL: 1
working-directory: /transformers
run: |
python3 -m pytest -n 2 --dist=loadfile -v --make-reports=${{ matrix.machine_type }}_tests_gpu_${{ matrix.folders }} ${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}
python3 -m pytest -n 2 --dist=loadfile -v --make-reports=${{ env.machine_type }}_tests_gpu_${{ matrix.folders }} ${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_tests_gpu_${{ matrix.folders }}/failures_short.txt
run: cat /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_tests_gpu_${{ matrix.folders }}
name: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}
run_tests_torch_cuda_extensions_single_gpu:
name: Torch CUDA extension tests
@@ -321,100 +357,9 @@ jobs:
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu]
runs-on: ['${{ matrix.machine_type }}', nvidia-gpu, t4, push-ci]
container:
image: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
env:
# For the meaning of these environment variables, see the job `Setup`
CI_BRANCH_PUSH: ${{ github.event.ref }}
CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
CI_SHA_PUSH: ${{ github.event.head_commit.id }}
CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
steps:
# Necessary to get the correct branch name and commit SHA for `workflow_run` event
# We also take into account the `push` event (we might want to test some changes in a branch)
- name: Prepare custom environment variables
shell: bash
# For the meaning of these environment variables, see the job `Setup`
run: |
CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
echo $CI_BRANCH_PUSH
echo $CI_BRANCH_WORKFLOW_RUN
echo $CI_SHA_PUSH
echo $CI_SHA_WORKFLOW_RUN
[[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
[[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
- name: print environment variables
run: |
echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
echo "env.CI_SHA = ${{ env.CI_SHA }}"
- name: Update clone using environment variables
working-directory: /workspace/transformers
run: |
echo "original branch = $(git branch --show-current)"
git fetch && git checkout ${{ env.CI_BRANCH }}
echo "updated branch = $(git branch --show-current)"
git checkout ${{ env.CI_SHA }}
echo "log = $(git log -n 1)"
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /workspace/transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: Remove cached torch extensions
run: rm -rf /github/home/.cache/torch_extensions/
# To avoid unknown test failures
- name: Pre build DeepSpeed *again*
working-directory: /workspace
run: |
python3 -m pip uninstall -y deepspeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install deepspeed --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
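# (The DS_BUILD_CPU_ADAM / DS_BUILD_FUSED_ADAM flags make DeepSpeed precompile these
# ops at install time instead of JIT-compiling them during the first test run.)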
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Environment
working-directory: /workspace/transformers
run: |
python utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /workspace/transformers
run: pip freeze
- name: Run all non-slow selected tests on GPU
working-directory: /workspace/transformers
# TODO: Here we pass all tests in the 2 folders for simplicity. It's better to pass only the identified tests.
run: |
python -m pytest -n 1 --dist=loadfile -v --make-reports=${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /workspace/transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: /workspace/transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
run_tests_torch_cuda_extensions_multi_gpu:
name: Torch CUDA extension tests
needs: setup
if: contains(fromJson(needs.setup.outputs.matrix), 'deepspeed') || contains(fromJson(needs.setup.outputs.matrix), 'extended')
strategy:
fail-fast: false
matrix:
machine_type: [multi-gpu]
runs-on: ['${{ matrix.machine_type }}', nvidia-gpu, t4, push-ci]
machine_type: [aws-g5-4xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
@@ -444,6 +389,23 @@ jobs:
echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
echo "env.CI_SHA = ${{ env.CI_SHA }}"
- name: Set `machine_type` for report and artifact names
working-directory: /workspace/transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Update clone using environment variables
working-directory: /workspace/transformers
run: |
@@ -484,19 +446,129 @@ jobs:
working-directory: /workspace/transformers
# TODO: Here we pass all tests in the 2 folders for simplicity. It's better to pass only the identified tests.
run: |
python -m pytest -n 1 --dist=loadfile -v --make-reports=${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended
python -m pytest -n 1 --dist=loadfile -v --make-reports=${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /workspace/transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
run: cat /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: /workspace/transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
run_tests_torch_cuda_extensions_multi_gpu:
name: Torch CUDA extension tests
needs: setup
if: contains(fromJson(needs.setup.outputs.matrix), 'deepspeed') || contains(fromJson(needs.setup.outputs.matrix), 'extended')
strategy:
fail-fast: false
matrix:
machine_type: [aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
env:
# For the meaning of these environment variables, see the job `Setup`
CI_BRANCH_PUSH: ${{ github.event.ref }}
CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
CI_SHA_PUSH: ${{ github.event.head_commit.id }}
CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
steps:
# Necessary to get the correct branch name and commit SHA for `workflow_run` event
# We also take into account the `push` event (we might want to test some changes in a branch)
- name: Prepare custom environment variables
shell: bash
# For the meaning of these environment variables, see the job `Setup`
run: |
CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
echo $CI_BRANCH_PUSH
echo $CI_BRANCH_WORKFLOW_RUN
echo $CI_SHA_PUSH
echo $CI_SHA_WORKFLOW_RUN
[[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
[[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
- name: print environment variables
run: |
echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
echo "env.CI_SHA = ${{ env.CI_SHA }}"
- name: Set `machine_type` for report and artifact names
working-directory: /workspace/transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Update clone using environment variables
working-directory: /workspace/transformers
run: |
echo "original branch = $(git branch --show-current)"
git fetch && git checkout ${{ env.CI_BRANCH }}
echo "updated branch = $(git branch --show-current)"
git checkout ${{ env.CI_SHA }}
echo "log = $(git log -n 1)"
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /workspace/transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: Remove cached torch extensions
run: rm -rf /github/home/.cache/torch_extensions/
# To avoid unknown test failures
- name: Pre build DeepSpeed *again*
working-directory: /workspace
run: |
python3 -m pip uninstall -y deepspeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install deepspeed --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Environment
working-directory: /workspace/transformers
run: |
python utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /workspace/transformers
run: pip freeze
- name: Run all non-slow selected tests on GPU
working-directory: /workspace/transformers
# TODO: Here we pass all tests in the 2 folders for simplicity. It's better to pass only the identified tests.
run: |
python -m pytest -n 1 --dist=loadfile -v --make-reports=${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
send_results:
name: Send results to webhook
@@ -575,6 +647,6 @@ jobs:
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
run: |
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service.py "${{ needs.setup.outputs.matrix }}"

View File

@@ -1,55 +0,0 @@
name: Self-hosted runner (AMD mi210 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit

View File

@@ -1,55 +1,59 @@
name: Self-hosted runner (AMD mi250 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
name: Self-hosted runner (AMD mi250 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
report_repo_id: optimum-amd/transformers_daily_ci
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
report_repo_id: optimum-amd/transformers_daily_ci
secrets: inherit
example-ci:
name: Example CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
report_repo_id: optimum-amd/transformers_daily_ci
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
report_repo_id: optimum-amd/transformers_daily_ci
secrets: inherit

View File

@@ -0,0 +1,67 @@
name: Self-hosted runner scale set (AMD mi325 scheduled CI caller)
# Note: For every job in this workflow, the name of the runner scale set is finalized in the runner yaml, i.e. huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml
# For example, 1gpu scale set: amd-mi325-ci-1gpu
# 2gpu scale set: amd-mi325-ci-2gpu
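# (Assumed naming convention, as a Python sketch: f"{runner_group}-ci-{n}gpu")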
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_models_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: amd-mi325
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi325
report_repo_id: optimum-amd/transformers_daily_ci
env_file: /etc/podinfo/gha-gpu-isolation-settings
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: amd-mi325
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi325
report_repo_id: optimum-amd/transformers_daily_ci
env_file: /etc/podinfo/gha-gpu-isolation-settings
secrets: inherit
example-ci:
name: Example CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_examples_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: amd-mi325
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi325
report_repo_id: optimum-amd/transformers_daily_ci
env_file: /etc/podinfo/gha-gpu-isolation-settings
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: amd-mi325
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi325
report_repo_id: optimum-amd/transformers_daily_ci
env_file: /etc/podinfo/gha-gpu-isolation-settings
secrets: inherit

View File

@@ -0,0 +1,63 @@
name: Self-hosted runner scale set (AMD mi355 scheduled CI caller)
# Note: For every job in this workflow, the name of the runner scale set is finalized in the runner yaml, i.e. huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml
# For example, 1gpu: amd-mi355-ci-1gpu
# 2gpu: amd-mi355-ci-2gpu
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_models_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: hfc-amd-mi355
docker: huggingface/testing-rocm7.0-preview
ci_event: Scheduled CI (AMD) - mi355
report_repo_id: hf-transformers-bot/transformers-ci-dummy
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: hfc-amd-mi355
docker: huggingface/testing-rocm7.0-preview
ci_event: Scheduled CI (AMD) - mi355
report_repo_id: hf-transformers-bot/transformers-ci-dummy
secrets: inherit
example-ci:
name: Example CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_examples_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: hfc-amd-mi355
docker: huggingface/testing-rocm7.0-preview
ci_event: Scheduled CI (AMD) - mi355
report_repo_id: hf-transformers-bot/transformers-ci-dummy
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#amd-hf-ci"
runner_group: hfc-amd-mi355
docker: huggingface/testing-rocm7.0-preview
ci_event: Scheduled CI (AMD) - mi355
report_repo_id: hf-transformers-bot/transformers-ci-dummy
secrets: inherit

View File

@@ -1,349 +0,0 @@
name: Self-hosted runner (scheduled-amd)
# Note: For the AMD CI, we rely on a caller workflow and on the workflow_call event to trigger the
# CI in order to run it on both MI210 and MI250, without having to use matrix here which pushes
# us towards the limit of allowed jobs on GitHub Actions.
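# (Each caller workflow above passes a different `runner` input, mi210 or mi250, to
# this reusable workflow, instead of a `matrix` over runners here.)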
on:
workflow_call:
inputs:
job:
required: true
type: string
slack_report_channel:
required: true
type: string
runner:
required: true
type: string
docker:
required: true
type: string
ci_event:
required: true
type: string
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
NUM_SLICES: 2
# Important note: each job (run_tests_single_gpu, run_tests_multi_gpu, run_examples_gpu, run_pipelines_torch_gpu) requires all the previous jobs before running.
# This is done to avoid parallelizing the scheduled tests, leaving runners available
# for the push CI that runs on the same machine.
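# (e.g. check_runner_status -> check_runners -> setup -> run_models_gpu below,
# chained via `needs:`.)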
jobs:
check_runner_status:
name: Check Runner Status
runs-on: ubuntu-22.04
steps:
- name: Checkout transformers
uses: actions/checkout@v4
with:
fetch-depth: 2
- name: Check Runner Status
run: python utils/check_self_hosted_runner.py --target_runners hf-amd-mi210-ci-1gpu-1,hf-amd-mi250-ci-1gpu-1,hf-amd-mi300-ci-1gpu-1 --token ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
check_runners:
name: Check Runners
needs: check_runner_status
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: huggingface/transformers-pytorch-amd-gpu
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
setup:
if: contains(fromJSON('["run_models_gpu"]'), inputs.job)
name: Setup
needs: check_runners
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: huggingface/transformers-pytorch-amd-gpu
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
steps:
- name: Update clone
working-directory: /transformers
run: |
git fetch && git checkout ${{ github.sha }}
- name: Cleanup
working-directory: /transformers
run: |
rm -rf tests/__pycache__
rm -rf tests/models/__pycache__
rm -rf reports
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- id: set-matrix
name: Identify models to test
working-directory: /transformers/tests
run: |
echo "folder_slices=$(python3 ../utils/split_model_tests.py --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
echo "slice_ids=$(python3 -c 'd = list(range(${{ env.NUM_SLICES }})); print(d)')" >> $GITHUB_OUTPUT
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
run_models_gpu:
if: ${{ inputs.job == 'run_models_gpu' }}
name: Single GPU tests
needs: setup
strategy:
max-parallel: 1 # For now, not to parallelize. Can change later if it works well.
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
uses: ./.github/workflows/model_jobs_amd.yml
with:
folder_slices: ${{ needs.setup.outputs.folder_slices }}
machine_type: ${{ matrix.machine_type }}
slice_id: ${{ matrix.slice_id }}
runner: ${{ inputs.runner }}
docker: ${{ inputs.docker }}
secrets: inherit
run_pipelines_torch_gpu:
if: ${{ inputs.job == 'run_pipelines_torch_gpu' }}
name: PyTorch pipelines
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all pipeline tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -n 1 -v --dist=loadfile --make-reports=${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports tests/pipelines -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports
run_examples_gpu:
if: ${{ inputs.job == 'run_examples_gpu' }}
name: Examples directory
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run examples tests on GPU
working-directory: /transformers
run: |
pip install -r examples/pytorch/_tests_requirements.txt
python3 -m pytest -v --make-reports=${{ matrix.machine_type }}_run_examples_gpu_test_reports examples/pytorch -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_examples_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_examples_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_examples_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_examples_gpu_test_reports
run_torch_cuda_extensions_gpu:
if: ${{ inputs.job == 'run_torch_cuda_extensions_gpu' }}
name: Torch ROCm deepspeed tests
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: python3 -m pytest -v --make-reports=${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
send_results:
name: Slack Report
needs: [
check_runner_status,
check_runners,
setup,
run_models_gpu,
run_pipelines_torch_gpu,
run_examples_gpu,
run_torch_cuda_extensions_gpu
]
if: ${{ always() }}
uses: ./.github/workflows/slack-report.yml
with:
job: ${{ inputs.job }}
# This would be `skipped` if `setup` is skipped.
setup_status: ${{ needs.setup.result }}
slack_report_channel: ${{ inputs.slack_report_channel }}
# This would be an empty string if `setup` is skipped.
folder_slices: ${{ needs.setup.outputs.folder_slices }}
quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
ci_event: ${{ inputs.ci_event }}
secrets: inherit

View File

@@ -1,5 +1,4 @@
name: Self-hosted runner (scheduled)
name: Nvidia CI
on:
repository_dispatch:
@@ -7,72 +6,53 @@ on:
- cron: "17 2 * * *"
push:
branches:
- run_scheduled_ci*
- multi_jobs_to_check_bad_commit
workflow_dispatch:
inputs:
prev_workflow_run_id:
description: 'previous workflow run id to compare'
type: string
required: false
default: ""
other_workflow_run_id:
description: 'other workflow run id to compare'
type: string
required: false
default: ""
# Used for `push` to easily modify the target workflow runs to compare against
env:
prev_workflow_run_id: "18548615847"
other_workflow_run_id: ""
jobs:
setup:
name: Setup
runs-on: ubuntu-22.04
steps:
- name: Setup
run: |
mkdir "setup_values"
echo "${{ inputs.prev_workflow_run_id || env.prev_workflow_run_id }}" > "setup_values/prev_workflow_run_id.txt"
echo "${{ inputs.other_workflow_run_id || env.other_workflow_run_id }}" > "setup_values/other_workflow_run_id.txt"
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: setup_values
path: setup_values
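# (The `setup_values` artifact written here is read back by the `Prepare some setup
# values` step in slack-report.yml, shown further below.)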
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-models"
runner: daily-ci
slack_report_channel: "#transformers-ci-dummy"
docker: huggingface/transformers-all-latest-gpu
ci_event: Daily CI
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-pipeline-torch"
runner: daily-ci
docker: huggingface/transformers-pytorch-gpu
ci_event: Daily CI
secrets: inherit
tf-pipeline:
name: TF pipeline CI
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_pipelines_tf_gpu
slack_report_channel: "#transformers-ci-daily-pipeline-tf"
runner: daily-ci
docker: huggingface/transformers-tensorflow-gpu
ci_event: Daily CI
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-examples"
runner: daily-ci
docker: huggingface/transformers-all-latest-gpu
ci_event: Daily CI
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-deepspeed"
runner: daily-ci
docker: huggingface/transformers-pytorch-deepspeed-latest-gpu
ci_event: Daily CI
working-directory-prefix: /workspace
secrets: inherit
quantization-ci:
name: Quantization CI
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_quantization_torch_gpu
slack_report_channel: "#transformers-ci-daily-quantization"
runner: daily-ci
docker: huggingface/transformers-quantization-latest-gpu
ci_event: Daily CI
runner_type: "a10"
report_repo_id: hf-internal-testing/transformers_daily_ci
commit_sha: ${{ github.sha }}
secrets: inherit

View File

@@ -0,0 +1,341 @@
name: Self-hosted runner (scheduled-intel-gaudi)
on:
workflow_call:
inputs:
job:
required: true
type: string
slack_report_channel:
required: true
type: string
runner_scale_set:
required: true
type: string
ci_event:
required: true
type: string
report_repo_id:
required: true
type: string
env:
NUM_SLICES: 2
RUN_SLOW: yes
PT_HPU_LAZY_MODE: 0
TRANSFORMERS_IS_CI: yes
PT_ENABLE_INT64_SUPPORT: 1
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache/.cache/huggingface
jobs:
setup:
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu"]'), inputs.job)
name: Setup
runs-on: ubuntu-latest
outputs:
slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
quantization_matrix: ${{ steps.set-matrix.outputs.quantization_matrix }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.10"
- id: set-matrix
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu"]'), inputs.job)
name: Identify models to test
working-directory: tests
run: |
if [ "${{ inputs.job }}" = "run_models_gpu" ]; then
echo "folder_slices=$(python3 ../utils/split_model_tests.py --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
echo "slice_ids=$(python3 -c 'd = list(range(${{ env.NUM_SLICES }})); print(d)')" >> $GITHUB_OUTPUT
elif [ "${{ inputs.job }}" = "run_trainer_and_fsdp_gpu" ]; then
echo "folder_slices=[['trainer'], ['fsdp']]" >> $GITHUB_OUTPUT
echo "slice_ids=[0, 1]" >> $GITHUB_OUTPUT
fi
- id: set-matrix-quantization
if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
name: Identify quantization method to test
working-directory: tests
run: |
echo "quantization_matrix=$(python3 -c 'import os; tests = os.getcwd(); quantization_tests = os.listdir(os.path.join(tests, "quantization")); d = sorted(list(filter(os.path.isdir, [f"quantization/{x}" for x in quantization_tests]))) ; print(d)')" >> $GITHUB_OUTPUT
run_models_gpu:
if: ${{ inputs.job == 'run_models_gpu' }}
name: " "
needs: setup
strategy:
fail-fast: false
matrix:
machine_type: [1gaudi, 2gaudi]
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
uses: ./.github/workflows/model_jobs_intel_gaudi.yml
with:
slice_id: ${{ matrix.slice_id }}
machine_type: ${{ matrix.machine_type }}
folder_slices: ${{ needs.setup.outputs.folder_slices }}
runner: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
secrets: inherit
run_trainer_and_fsdp_gpu:
if: ${{ inputs.job == 'run_trainer_and_fsdp_gpu' }}
name: " "
needs: setup
strategy:
fail-fast: false
matrix:
machine_type: [1gaudi, 2gaudi]
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
uses: ./.github/workflows/model_jobs_intel_gaudi.yml
with:
slice_id: ${{ matrix.slice_id }}
machine_type: ${{ matrix.machine_type }}
folder_slices: ${{ needs.setup.outputs.folder_slices }}
runner: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
report_name_prefix: run_trainer_and_fsdp_gpu
secrets: inherit
run_pipelines_torch_gpu:
if: ${{ inputs.job == 'run_pipelines_torch_gpu' }}
name: Pipelines
strategy:
fail-fast: false
matrix:
machine_type: [1gaudi, 2gaudi]
runs-on:
group: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
container:
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
options: --runtime=habana
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
--env HABANA_VISIBLE_DEVICES
--env HABANA_VISIBLE_MODULES
--cap-add=sys_nice
--shm-size=64G
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install dependencies
run: |
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn librosa soundfile
- name: HL-SMI
run: |
hl-smi
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
- name: Environment
run: python3 utils/print_env.py
- name: Show installed libraries and their versions
run: pip freeze
- name: Set `machine_type` for report and artifact names
shell: bash
run: |
if [ "${{ matrix.machine_type }}" = "1gaudi" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "2gaudi" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Run all pipeline tests on Intel Gaudi
run: |
python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports tests/pipelines -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: |
cat reports/${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
path: reports/${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
run_examples_gpu:
if: ${{ inputs.job == 'run_examples_gpu' }}
name: Examples directory
strategy:
fail-fast: false
matrix:
machine_type: [1gaudi]
runs-on:
group: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
container:
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
options: --runtime=habana
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
--env HABANA_VISIBLE_DEVICES
--env HABANA_VISIBLE_MODULES
--cap-add=sys_nice
--shm-size=64G
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install dependencies
run: |
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn librosa soundfile
- name: HL-SMI
run: |
hl-smi
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
- name: Environment
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
run: |
pip freeze
- name: Set `machine_type` for report and artifact names
shell: bash
run: |
if [ "${{ matrix.machine_type }}" = "1gaudi" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "2gaudi" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Run examples tests on Intel Gaudi
run: |
pip install -r examples/pytorch/_tests_requirements.txt
python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_examples_gpu_test_reports examples/pytorch -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: |
cat reports/${{ env.machine_type }}_run_examples_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_examples_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_examples_gpu_test_reports
path: reports/${{ env.machine_type }}_run_examples_gpu_test_reports
run_torch_cuda_extensions_gpu:
if: ${{ inputs.job == 'run_torch_cuda_extensions_gpu' }}
name: Intel Gaudi deepspeed tests
strategy:
fail-fast: false
matrix:
machine_type: [1gaudi, 2gaudi]
runs-on:
group: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
container:
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
options: --runtime=habana
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
--env HABANA_VISIBLE_DEVICES
--env HABANA_VISIBLE_MODULES
--cap-add=sys_nice
--shm-size=64G
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install dependencies
run: |
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn librosa soundfile
pip install git+https://github.com/HabanaAI/DeepSpeed.git@1.20.0
- name: HL-SMI
run: |
hl-smi
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
- name: Environment
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
run: |
pip freeze
- name: Set `machine_type` for report and artifact names
shell: bash
run: |
if [ "${{ matrix.machine_type }}" = "1gaudi" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "2gaudi" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Run all deepspeed tests on intel Gaudi
run: |
python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: |
cat reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
send_results:
name: Slack Report
needs:
[
setup,
run_models_gpu,
run_examples_gpu,
run_torch_cuda_extensions_gpu,
run_pipelines_torch_gpu,
run_trainer_and_fsdp_gpu,
]
if: ${{ always() }}
uses: ./.github/workflows/slack-report.yml
with:
job: ${{ inputs.job }}
setup_status: ${{ needs.setup.result }}
slack_report_channel: ${{ inputs.slack_report_channel }}
quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
folder_slices: ${{ needs.setup.outputs.folder_slices }}
report_repo_id: ${{ inputs.report_repo_id }}
ci_event: ${{ inputs.ci_event }}
secrets: inherit

View File

@@ -0,0 +1,67 @@
name: Self-hosted runner (Intel Gaudi3 scheduled CI caller)
on:
repository_dispatch:
workflow_dispatch:
schedule:
- cron: "17 2 * * *"
jobs:
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
with:
job: run_models_gpu
ci_event: Scheduled CI (Intel) - Gaudi3
runner_scale_set: itac-bm-emr-gaudi3-dell
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
secrets: inherit
pipeline-ci:
name: Pipeline CI
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
with:
job: run_pipelines_torch_gpu
ci_event: Scheduled CI (Intel) - Gaudi3
runner_scale_set: itac-bm-emr-gaudi3-dell
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
with:
job: run_examples_gpu
ci_event: Scheduled CI (Intel) - Gaudi3
runner_scale_set: itac-bm-emr-gaudi3-dell
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
with:
job: run_torch_cuda_extensions_gpu
ci_event: Scheduled CI (Intel) - Gaudi3
runner_scale_set: itac-bm-emr-gaudi3-dell
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
secrets: inherit
trainer-fsdp-ci:
name: Trainer/FSDP CI
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
with:
job: run_trainer_and_fsdp_gpu
ci_event: Scheduled CI (Intel) - Gaudi3
runner_scale_set: itac-bm-emr-gaudi3-dell
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
secrets: inherit

View File

@@ -1,4 +1,4 @@
name: Self-hosted runner (scheduled)
name: Nvidia CI (job definitions)
# Note that each job's dependencies are defined in a corresponding Dockerfile.
#
@@ -15,9 +15,6 @@ on:
slack_report_channel:
required: true
type: string
runner:
required: true
type: string
docker:
required: true
type: string
@@ -28,6 +25,19 @@ on:
default: ''
required: false
type: string
report_repo_id:
required: true
type: string
commit_sha:
required: false
type: string
runner_type:
required: false
type: string
models:
default: ""
required: false
type: string
env:
HF_HOME: /mnt/cache
@@ -38,24 +48,22 @@ env:
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
NUM_SLICES: 2
jobs:
setup:
if: contains(fromJSON('["run_models_gpu", "run_quantization_torch_gpu"]'), inputs.job)
name: Setup
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu", "run_quantization_torch_gpu"]'), inputs.job)
strategy:
matrix:
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
@@ -64,7 +72,7 @@ jobs:
- name: Update clone
working-directory: /transformers
run: |
git fetch && git checkout ${{ github.sha }}
git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Cleanup
working-directory: /transformers
@@ -78,12 +86,17 @@ jobs:
run: pip freeze
- id: set-matrix
if: ${{ inputs.job == 'run_models_gpu' }}
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu"]'), inputs.job)
name: Identify models to test
working-directory: /transformers/tests
run: |
echo "folder_slices=$(python3 ../utils/split_model_tests.py --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
echo "slice_ids=$(python3 -c 'd = list(range(${{ env.NUM_SLICES }})); print(d)')" >> $GITHUB_OUTPUT
if [ "${{ inputs.job }}" = "run_models_gpu" ]; then
echo "folder_slices=$(python3 ../utils/split_model_tests.py --models '${{ inputs.models }}' --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
echo "slice_ids=$(python3 -c 'd = list(range(${{ env.NUM_SLICES }})); print(d)')" >> $GITHUB_OUTPUT
elif [ "${{ inputs.job }}" = "run_trainer_and_fsdp_gpu" ]; then
echo "folder_slices=[['trainer'], ['fsdp']]" >> $GITHUB_OUTPUT
echo "slice_ids=[0, 1]" >> $GITHUB_OUTPUT
fi
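# `folder_slices` is a list of NUM_SLICES folder lists and `slice_id` indexes into it;
# the trainer/fsdp branch hardcodes two one-folder slices instead of splitting the model tests.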
- id: set-matrix-quantization
if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
@@ -103,15 +116,38 @@ jobs:
strategy:
fail-fast: false
matrix:
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
uses: ./.github/workflows/model_jobs.yml
with:
folder_slices: ${{ needs.setup.outputs.folder_slices }}
machine_type: ${{ matrix.machine_type }}
slice_id: ${{ matrix.slice_id }}
runner: ${{ inputs.runner }}
docker: ${{ inputs.docker }}
commit_sha: ${{ inputs.commit_sha || github.sha }}
runner_type: ${{ inputs.runner_type }}
report_repo_id: ${{ inputs.report_repo_id }}
secrets: inherit
run_trainer_and_fsdp_gpu:
if: ${{ inputs.job == 'run_trainer_and_fsdp_gpu' }}
name: " "
needs: setup
strategy:
fail-fast: false
matrix:
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
slice_id: [0, 1]
uses: ./.github/workflows/model_jobs.yml
with:
folder_slices: ${{ needs.setup.outputs.folder_slices }}
machine_type: ${{ matrix.machine_type }}
slice_id: ${{ matrix.slice_id }}
docker: ${{ inputs.docker }}
commit_sha: ${{ inputs.commit_sha || github.sha }}
runner_type: ${{ inputs.runner_type }}
report_repo_id: ${{ inputs.report_repo_id }}
report_name_prefix: run_trainer_and_fsdp_gpu
secrets: inherit
run_pipelines_torch_gpu:
@@ -120,7 +156,7 @@ jobs:
strategy:
fail-fast: false
matrix:
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
@@ -129,7 +165,7 @@ jobs:
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
@@ -154,9 +190,9 @@ jobs:
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
@@ -182,91 +218,22 @@ jobs:
name: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
run_pipelines_tf_gpu:
if: ${{ inputs.job == 'run_pipelines_tf_gpu' }}
name: TensorFlow pipelines
strategy:
fail-fast: false
matrix:
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-tensorflow-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: |
git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Run all pipeline tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -n 1 -v --dist=loadfile --make-reports=${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports tests/pipelines
- name: Failure short reports
if: ${{ always() }}
run: |
cat /transformers/reports/${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports
run_examples_gpu:
if: ${{ inputs.job == 'run_examples_gpu' }}
name: Examples directory
strategy:
fail-fast: false
matrix:
machine_type: [aws-g4dn-2xlarge-cache]
machine_type: [aws-g5-4xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
@@ -291,9 +258,9 @@ jobs:
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
@@ -326,7 +293,7 @@ jobs:
strategy:
fail-fast: false
matrix:
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
@@ -335,7 +302,7 @@ jobs:
steps:
- name: Update clone
working-directory: ${{ inputs.working-directory-prefix }}/transformers
run: git fetch && git checkout ${{ github.sha }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: ${{ inputs.working-directory-prefix }}/transformers
@@ -366,7 +333,7 @@ jobs:
run: |
python3 -m pip uninstall -y deepspeed
rm -rf DeepSpeed
git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build
git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build
DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
- name: NVIDIA-SMI
@@ -383,14 +350,14 @@ jobs:
run: pip freeze
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
working-directory: ${{ inputs.working-directory-prefix }}/transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
@@ -425,7 +392,7 @@ jobs:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.setup.outputs.quantization_matrix) }}
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
@@ -443,7 +410,7 @@ jobs:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
@@ -468,9 +435,9 @@ jobs:
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
if [ "${{ matrix.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
elif [ "${{ matrix.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
@ -507,6 +474,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 2
ref: ${{ inputs.commit_sha || github.sha }}
- name: Install transformers
run: pip install transformers
@ -542,14 +510,14 @@ jobs:
needs: [
setup,
run_models_gpu,
run_trainer_and_fsdp_gpu,
run_pipelines_torch_gpu,
run_pipelines_tf_gpu,
run_examples_gpu,
run_torch_cuda_extensions_gpu,
run_quantization_torch_gpu,
run_extract_warnings
]
if: ${{ always() }}
if: always() && !cancelled()
uses: ./.github/workflows/slack-report.yml
with:
job: ${{ inputs.job }}
@ -560,5 +528,22 @@ jobs:
folder_slices: ${{ needs.setup.outputs.folder_slices }}
quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
ci_event: ${{ inputs.ci_event }}
report_repo_id: ${{ inputs.report_repo_id }}
commit_sha: ${{ inputs.commit_sha || github.sha }}
secrets: inherit
check_new_failures:
if: ${{ always() && inputs.ci_event == 'Daily CI' && needs.send_results.result == 'success' }}
name: Check new failures
needs: send_results
uses: ./.github/workflows/check_failed_tests.yml
with:
docker: ${{ inputs.docker }}
start_sha: ${{ inputs.commit_sha || github.sha }}
job: ${{ inputs.job }}
slack_report_channel: ${{ inputs.slack_report_channel }}
ci_event: ${{ inputs.ci_event }}
report_repo_id: ${{ inputs.report_repo_id }}
secrets: inherit


@ -21,6 +21,13 @@ on:
ci_event:
required: true
type: string
report_repo_id:
required: true
type: string
commit_sha:
required: false
type: string
env:
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
@ -29,7 +36,7 @@ jobs:
send_results:
name: Send results to webhook
runs-on: ubuntu-22.04
if: always()
if: always() && !cancelled()
steps:
- name: Preliminary job status
shell: bash
@ -38,9 +45,28 @@ jobs:
echo "Setup status: ${{ inputs.setup_status }}"
- uses: actions/checkout@v4
with:
fetch-depth: 2
ref: ${{ inputs.commit_sha || github.sha }}
- uses: actions/download-artifact@v4
- name: Prepare some setup values
run: |
if [ -f setup_values/prev_workflow_run_id.txt ]; then
echo "PREV_WORKFLOW_RUN_ID=$(cat setup_values/prev_workflow_run_id.txt)" >> $GITHUB_ENV
else
echo "PREV_WORKFLOW_RUN_ID=" >> $GITHUB_ENV
fi
if [ -f setup_values/other_workflow_run_id.txt ]; then
echo "OTHER_WORKFLOW_RUN_ID=$(cat setup_values/other_workflow_run_id.txt)" >> $GITHUB_ENV
else
echo "OTHER_WORKFLOW_RUN_ID=" >> $GITHUB_ENV
fi
- name: Send message to Slack
if: ${{ inputs.job != 'run_quantization_torch_gpu' }}
shell: bash
env:
CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
CI_SLACK_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID }}
@ -49,20 +75,25 @@ jobs:
SLACK_REPORT_CHANNEL: ${{ inputs.slack_report_channel }}
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
CI_EVENT: ${{ inputs.ci_event }}
CI_SHA: ${{ github.sha }}
CI_WORKFLOW_REF: ${{ github.workflow_ref }}
# This `CI_TITLE` would be empty for `schedule` or `workflow_run` events.
CI_TITLE: ${{ github.event.head_commit.message }}
CI_SHA: ${{ inputs.commit_sha || github.sha }}
CI_TEST_JOB: ${{ inputs.job }}
SETUP_STATUS: ${{ inputs.setup_status }}
REPORT_REPO_ID: ${{ inputs.report_repo_id }}
# We pass `needs.setup.outputs.matrix` as the argument. A processing in `notification_service.py` to change
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
# For a job that doesn't depend on (i.e. `needs`) `setup`, the value for `inputs.folder_slices` would be an
# empty string, and the called script still gets one argument (which is the empty string).
run: |
sudo apt-get install -y curl
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service.py "${{ inputs.folder_slices }}"
if [ "${{ inputs.quantization_matrix }}" != "" ]; then
python utils/notification_service.py "${{ inputs.quantization_matrix }}"
else
python utils/notification_service.py "${{ inputs.folder_slices }}"
fi
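For illustration, the renaming that the comment above attributes to `notification_service.py` amounts to replacing `/` with `_`, since artifact names cannot contain slashes. A minimal sketch, assuming that is the only transformation applied (the helper name is hypothetical):

```py
# Hypothetical helper: map a test folder from the matrix to the name of
# the artifact it was uploaded under ("/" is not allowed in artifact names).
def folder_to_artifact_name(folder: str) -> str:
    return folder.replace("/", "_")

assert folder_to_artifact_name("models/bert") == "models_bert"
assert folder_to_artifact_name("quantization/bnb") == "quantization_bnb"
```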
# Upload complete failure tables, as they might be big and only truncated versions could be sent to Slack.
- name: Failure table artifacts
@ -70,32 +101,3 @@ jobs:
with:
name: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
- name: Send message to Slack for quantization workflow
if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
env:
CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
SLACK_REPORT_CHANNEL: ${{ inputs.slack_report_channel }}
CI_EVENT: ${{ inputs.ci_event }}
CI_SHA: ${{ github.sha }}
CI_TEST_JOB: ${{ inputs.job }}
SETUP_STATUS: ${{ inputs.setup_status }}
# We pass `needs.setup.outputs.quantization_matrix` as the argument. A processing in `notification_service_quantization.py` to change
# `quantization/bnb` to `quantization_bnb` is required, as the artifact names use `_` instead of `/`.
run: |
sudo apt-get install -y curl
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service_quantization.py "${{ inputs.quantization_matrix }}"
# Upload complete failure tables, as they might be big and only truncated versions could be sent to Slack.
- name: Failure table artifacts
if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
uses: actions/upload-artifact@v4
with:
name: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}


@ -5,7 +5,7 @@ on:
inputs:
runner_type:
description: 'Type of runner to test (a10 or t4)'
required: true
docker_image:
description: 'Name of the Docker image'
required: true
@ -15,20 +15,50 @@ on:
env:
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo page in order to get access. # This token is created under the bot `hf-transformers-bot`.
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo page in order to get access. # This token is created under the bot `hf-transformers-bot`.
TF_FORCE_GPU_ALLOW_GROWTH: true
CUDA_VISIBLE_DEVICES: 0,1
RUN_PT_TF_CROSS_TESTS: 1
jobs:
get_runner:
name: "Get runner to use"
runs-on: ubuntu-22.04
outputs:
RUNNER: ${{ steps.set_runner.outputs.RUNNER }}
steps:
- name: Get runner to use
shell: bash
env:
NUM_GPUS: ${{ github.event.inputs.num_gpus }}
RUNNER_TYPE: ${{ github.event.inputs.runner_type }}
run: |
if [[ "$NUM_GPUS" == "single" && "$RUNNER_TYPE" == "t4" ]]; then
echo "RUNNER=aws-g4dn-4xlarge-cache" >> $GITHUB_ENV
elif [[ "$NUM_GPUS" == "multi" && "$RUNNER_TYPE" == "t4" ]]; then
echo "RUNNER=aws-g4dn-12xlarge-cache" >> $GITHUB_ENV
elif [[ "$NUM_GPUS" == "single" && "$RUNNER_TYPE" == "a10" ]]; then
echo "RUNNER=aws-g5-4xlarge-cache" >> $GITHUB_ENV
elif [[ "$NUM_GPUS" == "multi" && "$RUNNER_TYPE" == "a10" ]]; then
echo "RUNNER=aws-g5-12xlarge-cache" >> $GITHUB_ENV
else
echo "RUNNER=" >> $GITHUB_ENV
fi
- name: Set runner to use
id: set_runner
run: |
echo ${{ env.RUNNER }}
echo "RUNNER=${{ env.RUNNER }}" >> $GITHUB_OUTPUT
ssh_runner:
name: "SSH"
runs-on: ["${{ github.event.inputs.num_gpus }}-gpu", nvidia-gpu, "${{ github.event.inputs.runner_type }}", ci]
needs: get_runner
runs-on:
group: ${{ needs.get_runner.outputs.RUNNER }}
container:
image: ${{ github.event.inputs.docker_image }}
options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
@ -49,7 +79,7 @@ jobs:
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: NVIDIA-SMI
run: |
nvidia-smi
@ -57,9 +87,11 @@ jobs:
- name: Store Slack infos
# Because SSH can be enabled dynamically if the workflow fails, we need to store the Slack infos to be able to retrieve them during the waitforssh step
shell: bash
env:
GITHUB_ACTOR: ${{ github.actor }}
run: |
echo "${{ github.actor }}"
github_actor=${{ github.actor }}
echo "$GITHUB_ACTOR"
github_actor=$GITHUB_ACTOR
github_actor=${github_actor/'-'/'_'}
echo "$github_actor"
echo "github_actor=$github_actor" >> $GITHUB_ENV


@ -16,3 +16,5 @@ jobs:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
with:
extra_args: --results=verified,unknown


@ -19,7 +19,7 @@ jobs:
- name: Setup environment
run: |
pip install --upgrade pip
pip install datasets pandas==2.0.3
pip install datasets pandas
pip install .[torch,tf,flax]
- name: Update metadata

.gitignore

@ -13,6 +13,7 @@ tests/fixtures/cached_*_text.txt
logs/
lightning_logs/
lang_code_data/
reports/
# Distribution / packaging
.Python
@ -97,6 +98,7 @@ celerybeat-schedule
# Environments
.env
.venv
.venv*
env/
venv/
ENV/
@ -167,3 +169,9 @@ tags
# ruff
.ruff_cache
# modular conversion
*.modular_backup
# Cursor IDE files
.cursor/

AGENTS.md

@ -0,0 +1,39 @@
# AGENTS.md Guide for Hugging Face Transformers
This AGENTS.md file provides guidance for code agents working with this codebase.
## Core Project Structure
- `/src/transformers`: This contains the core source code for the library
- `/models`: Code for individual models. Models inherit from base classes in the root `/src/transformers` directory.
- `/tests`: This contains the core test classes for the library. These are usually inherited rather than directly run.
- `/models`: Tests for individual models. Model tests inherit from common tests in the root `/tests` directory.
- `/docs`: This contains the documentation for the library, including guides, tutorials, and API references.
## Coding Conventions for Hugging Face Transformers
- PRs should be as brief as possible. Bugfix PRs in particular can often be only one or two lines long, and do not need large comments, docstrings or new functions in this case. Aim to minimize the size of the diff.
- When writing tests, they should be added to an existing file. The only exception is for PRs to add a new model, when a new test directory should be created for that model.
- Code style is enforced in the CI. You can install the style tools with `pip install -e .[quality]`. You can then run `make fixup` to apply style and consistency fixes to your code.
## Copying and inheritance
Many models in the codebase have similar code, but it is not shared by inheritance because we want each model file to be self-contained.
We use two mechanisms to keep this code in sync:
- "Copied from" syntax. Functions or entire classes can have a comment at the top like this: `# Copied from transformers.models.llama.modeling_llama.rotate_half` or `# Copied from transformers.models.t5.modeling_t5.T5LayerNorm with T5->MT5`
These comments are actively checked by the style tools, and copies will automatically be updated when the base code is updated. If you need to update a copied function, you should
either update the base function and use `make fixup` to propagate the change to all copies, or simply remove the `# Copied from` comment if that is inappropriate. A sketch of this syntax is shown after this list.
- "Modular" files. These files briefly define models by composing them using inheritance from other models. They are not meant to be used directly. Instead, the style tools
automatically generate a complete modeling file, like `modeling_bert.py`, from the modular file like `modular_bert.py`. If a model has a modular file, the modeling file
should never be edited directly! Instead, changes should be made in the modular file, and then you should run `make fixup` to update the modeling file automatically.
When adding new models, you should prefer `modular` style.
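As a concrete illustration of the "Copied from" mechanism, here is a minimal sketch; the function body matches the Llama rotary-embedding helper referenced above, but treat it as an example rather than the canonical source:

```py
import torch

# Copied from transformers.models.llama.modeling_llama.rotate_half
def rotate_half(x):
    """Rotates half the hidden dims of the input."""
    # The style tools check this copy against the Llama original; editing
    # the original and running `make fixup` rewrites this function too.
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)
```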
## Testing
After making changes, you should usually run `make fixup` to ensure any copies and modular files are updated, and then test all affected models. This includes both
the model you made the changes in and any other models that were updated by `make fixup`. Tests can be run with `pytest tests/models/[name]/test_modeling_[name].py`
If your changes affect code in other classes like tokenizers or processors, you should run those tests instead, like `test_processing_[name].py` or `test_tokenization_[name].py`.
In order to run tests, you may need to install dependencies. You can do this with `pip install -e .[testing]`. You will probably also need to `pip install torch accelerate` if your environment does not already have them.
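If you prefer to drive pytest from Python instead of the shell, an equivalent sketch (with `bert` as a placeholder model name) is:

```py
import pytest

# Run one model's test file programmatically; replace "bert" with the
# model you actually changed.
exit_code = pytest.main(["tests/models/bert/test_modeling_bert.py", "-q"])
raise SystemExit(exit_code)
```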


@ -68,8 +68,7 @@ already reported** (use the search bar on GitHub under Issues). Your issue shoul
Once you've confirmed the bug hasn't already been reported, please include the following information in your issue so we can quickly resolve it:
* Your **OS type and version** and **Python**, **PyTorch** and
**TensorFlow** versions when applicable.
* Your **OS type and version** and **Python**, and **PyTorch** versions when applicable.
* A short, self-contained, code snippet that allows us to reproduce the bug in
less than 30s.
* The *full* traceback if an exception is raised.
@ -78,7 +77,7 @@ Once you've confirmed the bug hasn't already been reported, please include the f
To get the OS and software versions automatically, run the following command:
```bash
transformers-cli env
transformers env
```
You can also run the same command from the root of the repository:
@ -132,7 +131,7 @@ You will need basic `git` proficiency to contribute to
manual. Type `git --help` in a shell and enjoy! If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.
You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main/setup.py#L449)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:
You'll need **[Python 3.9](https://github.com/huggingface/transformers/blob/main/setup.py#L449)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:
1. Fork the [repository](https://github.com/huggingface/transformers) by
clicking on the **[Fork](https://github.com/huggingface/transformers/fork)** button on the repository's page. This creates a copy of the code
@ -165,8 +164,7 @@ You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main
mode with the `-e` flag.
Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
failure with this command. If that's the case make sure to install the Deep Learning framework you are working with
(PyTorch, TensorFlow and/or Flax) then do:
failure with this command. If that's the case, make sure to install PyTorch, then do:
```bash
pip install -e ".[quality]"
@ -221,10 +219,10 @@ You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main
[Checks on a Pull Request](https://huggingface.co/docs/transformers/pr_checks) guide.
If you're modifying documents under the `docs/source` directory, make sure the documentation can still be built. This check will also run in the CI when you open a pull request. To run a local check
make sure you install the documentation builder:
make sure you install the [documentation builder](https://github.com/huggingface/doc-builder).
```bash
pip install ".[docs]"
pip install hf-doc-builder
```
Run the following command from the root of the repository:
@ -280,13 +278,14 @@ are working on it).<br>
useful to avoid duplicated work, and to differentiate it from PRs ready to be merged.<br>
☐ Make sure existing tests pass.<br>
☐ If adding a new feature, also add tests for it.<br>
- If you are adding a new model, make sure you use
`ModelTester.all_model_classes = (MyModel, MyModelWithLMHead,...)` to trigger the common tests.
- If you are adding new `@slow` tests, make sure they pass using
`RUN_SLOW=1 python -m pytest tests/models/my_new_model/test_my_new_model.py`.
- If you are adding a new tokenizer, write tests and make sure
`RUN_SLOW=1 python -m pytest tests/models/{your_model_name}/test_tokenization_{your_model_name}.py` passes.
- CircleCI does not run the slow tests, but GitHub Actions does every night!<br>
☐ All public methods must have informative docstrings (see
[`modeling_bert.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/models/bert/modeling_bert.py)
@ -342,9 +341,8 @@ RUN_SLOW=yes python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/t
```
Like the slow tests, there are other environment variables available which are not enabled by default during testing:
- `RUN_CUSTOM_TOKENIZERS`: Enables tests for custom tokenizers.
- `RUN_PT_FLAX_CROSS_TESTS`: Enables tests for PyTorch + Flax integration.
- `RUN_PT_TF_CROSS_TESTS`: Enables tests for TensorFlow + PyTorch integration.
More environment variables and additional information can be found in the [testing_utils.py](https://github.com/huggingface/transformers/blob/main/src/transformers/testing_utils.py).


@ -26,7 +26,7 @@ There are two main venues to receive support: [the forums](https://discuss.huggi
[The user forums](https://discuss.huggingface.co/) are supported by the wide community of the library users and backed up by developers when needed.
If you have a difficulty with deploying this library or some questions, or you'd like to discuss a new feature, please first consider discussing those things at the forums. Only when you feel your subject matter has been crystalized and you still need support from the library developers do proceed to file an [issue](https://github.com/huggingface/transformers/issues).
If you have a difficulty with deploying this library or some questions, or you'd like to discuss a new feature, please first consider discussing those things at the forums. Only when you feel your subject matter has been crystallized and you still need support from the library developers do proceed to file an [issue](https://github.com/huggingface/transformers/issues).
In particular, all "Please explain" questions or objectively very user-specific feature requests belong to the forums. Here are some examples of such questions:
@ -38,7 +38,6 @@ In particular all "Please explain" questions or objectively very user-specific f
* "How to train T5 on De->En translation?"
## The GitHub Issues
Everything which hints at a bug should be opened as an [issue](https://github.com/huggingface/transformers/issues).
@ -154,7 +153,7 @@ You are not required to read the following guidelines before opening an issue. H
cd examples/seq2seq
torchrun --nproc_per_node=2 ./finetune_trainer.py \
--model_name_or_path sshleifer/distill-mbart-en-ro-12-4 --data_dir wmt_en_ro \
--output_dir output_dir --overwrite_output_dir \
--output_dir output_dir \
--do_train --n_train 500 --num_train_epochs 1 \
--per_device_train_batch_size 1 --freeze_embeds \
--src_lang en_XX --tgt_lang ro_RO --task translation \
@ -247,7 +246,6 @@ You are not required to read the following guidelines before opening an issue. H
Try not to use italics and bold text too much, as these often make the text more difficult to read.
12. If you are cross-referencing a specific comment in a given thread or another issue, always link to that specific comment, rather than using the issue link. If you do the latter it could be quite impossible to find which specific comment you're referring to.
To get the link to the specific comment do not copy the url from the location bar of your browser, but instead, click the `...` icon in the upper right corner of the comment and then select "Copy Link".
@ -257,15 +255,14 @@ You are not required to read the following guidelines before opening an issue. H
1. https://github.com/huggingface/transformers/issues/9257
2. https://github.com/huggingface/transformers/issues/9257#issuecomment-749945162
13. If you are replying to a last comment, it's totally fine to make your reply with just your comment in it. The readers can follow the information flow here.
But if you're replying to a comment that happened some comments back, it's always good practice to quote just the relevant lines you're replying to. The `>` is used for quoting, or you can always use the menu to do so. For example, your editor box will look like:
```
> How big is your gpu cluster?
> How big is your GPU cluster?
Our cluster is made of 256 gpus.
Our cluster is made of 256 GPUs.
```
If you are addressing multiple comments, quote the relevant parts of each before your answer. Some people use the same comment to do multiple replies, others separate them into separate comments. Either way works. The latter approach helps for linking to a specific comment.


@ -3,18 +3,24 @@
# make sure to test the local checkout in scripts and not the pre-installed one (don't use quotes!)
export PYTHONPATH = src
check_dirs := examples tests src utils
check_dirs := examples tests src utils scripts benchmark benchmark_v2
exclude_folders := ""
modified_only_fixup:
$(eval modified_py_files := $(shell python utils/get_modified_files.py $(check_dirs)))
@if test -n "$(modified_py_files)"; then \
echo "Checking/fixing $(modified_py_files)"; \
ruff check $(modified_py_files) --fix --exclude $(exclude_folders); \
ruff format $(modified_py_files) --exclude $(exclude_folders);\
@current_branch=$$(git branch --show-current); \
if [ "$$current_branch" = "main" ]; then \
echo "On main branch, running 'style' target instead..."; \
$(MAKE) style; \
else \
echo "No library .py files were modified"; \
modified_py_files=$$(python utils/get_modified_files.py $(check_dirs)); \
if [ -n "$$modified_py_files" ]; then \
echo "Checking/fixing files: $${modified_py_files}"; \
ruff check $${modified_py_files} --fix --exclude $(exclude_folders); \
ruff format $${modified_py_files} --exclude $(exclude_folders); \
else \
echo "No library .py files were modified"; \
fi; \
fi
# Update src/transformers/dependency_versions_table.py
@ -37,16 +43,16 @@ autogenerate_code: deps_table_update
repo-consistency:
python utils/check_copies.py
python utils/check_modular_conversion.py
python utils/check_table.py
python utils/check_dummies.py
python utils/check_repo.py
python utils/check_inits.py
python utils/check_pipeline_typing.py
python utils/check_config_docstrings.py
python utils/check_config_attributes.py
python utils/check_doctest_list.py
python utils/update_metadata.py --check-only
python utils/check_docstrings.py
python utils/check_support_list.py
python utils/add_dates.py
# this target runs checks on all files
@ -81,9 +87,9 @@ fixup: modified_only_fixup extra_style_checks autogenerate_code repo-consistency
fix-copies:
python utils/check_copies.py --fix_and_overwrite
python utils/check_modular_conversion.py --fix_and_overwrite
python utils/check_table.py --fix_and_overwrite
python utils/check_modular_conversion.py --fix_and_overwrite
python utils/check_dummies.py --fix_and_overwrite
python utils/check_pipeline_typing.py --fix_and_overwrite
python utils/check_doctest_list.py --fix_and_overwrite
python utils/check_docstrings.py --fix_and_overwrite

README.md

@ -25,6 +25,7 @@ limitations under the License.
</p>
<p align="center">
<a href="https://huggingface.com/models"><img alt="Checkpoints on Hub" src="https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/models&color=brightgreen"></a>
<a href="https://circleci.com/gh/huggingface/transformers"><img alt="Build" src="https://img.shields.io/circleci/build/github/huggingface/transformers/main"></a>
<a href="https://github.com/huggingface/transformers/blob/main/LICENSE"><img alt="GitHub" src="https://img.shields.io/github/license/huggingface/transformers.svg?color=blue"></a>
<a href="https://huggingface.co/docs/transformers/index"><img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers/index.svg?down_color=red&down_message=offline&up_message=online"></a>
@ -43,266 +44,279 @@ limitations under the License.
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ja.md">日本語</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_hd.md">हिन्दी</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ru.md">Русский</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_pt-br.md">Рortuguês</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_pt-br.md">Português</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_te.md">తెలుగు</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_fr.md">Français</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_de.md">Deutsch</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_it.md">Italiano</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_vi.md">Tiếng Việt</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ar.md">العربية</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ur.md">اردو</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_bn.md">বাংলা</a> |
</p>
</h4>
<h3 align="center">
<p>State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow</p>
<p>State-of-the-art pretrained models for inference and training</p>
</h3>
<h3 align="center">
<a href="https://hf.co/course"><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/course_banner.png"></a>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/transformers_as_a_model_definition.png"/>
</h3>
🤗 Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.
Transformers acts as the model-definition framework for state-of-the-art machine learning models in text, computer
vision, audio, video, and multimodal models, for both inference and training.
These models can be applied on:
It centralizes the model definition so that this definition is agreed upon across the ecosystem. `transformers` is the
pivot across frameworks: if a model definition is supported, it will be compatible with the majority of training
frameworks (Axolotl, Unsloth, DeepSpeed, FSDP, PyTorch-Lightning, ...), inference engines (vLLM, SGLang, TGI, ...),
and adjacent modeling libraries (llama.cpp, mlx, ...) which leverage the model definition from `transformers`.
* 📝 Text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages.
* 🖼️ Images, for tasks like image classification, object detection, and segmentation.
* 🗣️ Audio, for tasks like speech recognition and audio classification.
We pledge to help support new state-of-the-art models and democratize their usage by having their model definition be
simple, customizable, and efficient.
Transformer models can also perform tasks on **several modalities combined**, such as table question answering, optical character recognition, information extraction from scanned documents, video classification, and visual question answering.
There are over 1M+ Transformers [model checkpoints](https://huggingface.co/models?library=transformers&sort=trending) on the [Hugging Face Hub](https://huggingface.com/models) you can use.
🤗 Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our [model hub](https://huggingface.co/models). At the same time, each python module defining an architecture is fully standalone and can be modified to enable quick research experiments.
Explore the [Hub](https://huggingface.com/) today to find a model and use Transformers to help you get started right away.
🤗 Transformers is backed by the three most popular deep learning libraries — [Jax](https://jax.readthedocs.io/en/latest/), [PyTorch](https://pytorch.org/) and [TensorFlow](https://www.tensorflow.org/) — with a seamless integration between them. It's straightforward to train your models with one before loading them for inference with the other.
## Installation
## Online demos
Transformers works with Python 3.9+, and [PyTorch](https://pytorch.org/get-started/locally/) 2.1+.
You can test most of our models directly on their pages from the [model hub](https://huggingface.co/models). We also offer [private model hosting, versioning, & an inference API](https://huggingface.co/pricing) for public and private models.
Create and activate a virtual environment with [venv](https://docs.python.org/3/library/venv.html) or [uv](https://docs.astral.sh/uv/), a fast Rust-based Python package and project manager.
Here are a few examples:
In Natural Language Processing:
- [Masked word completion with BERT](https://huggingface.co/google-bert/bert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France)
- [Named Entity Recognition with Electra](https://huggingface.co/dbmdz/electra-large-discriminator-finetuned-conll03-english?text=My+name+is+Sarah+and+I+live+in+London+city)
- [Text generation with Mistral](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
- [Natural Language Inference with RoBERTa](https://huggingface.co/FacebookAI/roberta-large-mnli?text=The+dog+was+lost.+Nobody+lost+any+animal)
- [Summarization with BART](https://huggingface.co/facebook/bart-large-cnn?text=The+tower+is+324+metres+%281%2C063+ft%29+tall%2C+about+the+same+height+as+an+81-storey+building%2C+and+the+tallest+structure+in+Paris.+Its+base+is+square%2C+measuring+125+metres+%28410+ft%29+on+each+side.+During+its+construction%2C+the+Eiffel+Tower+surpassed+the+Washington+Monument+to+become+the+tallest+man-made+structure+in+the+world%2C+a+title+it+held+for+41+years+until+the+Chrysler+Building+in+New+York+City+was+finished+in+1930.+It+was+the+first+structure+to+reach+a+height+of+300+metres.+Due+to+the+addition+of+a+broadcasting+aerial+at+the+top+of+the+tower+in+1957%2C+it+is+now+taller+than+the+Chrysler+Building+by+5.2+metres+%2817+ft%29.+Excluding+transmitters%2C+the+Eiffel+Tower+is+the+second+tallest+free-standing+structure+in+France+after+the+Millau+Viaduct)
- [Question answering with DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased-distilled-squad?text=Which+name+is+also+used+to+describe+the+Amazon+rainforest+in+English%3F&context=The+Amazon+rainforest+%28Portuguese%3A+Floresta+Amaz%C3%B4nica+or+Amaz%C3%B4nia%3B+Spanish%3A+Selva+Amaz%C3%B3nica%2C+Amazon%C3%ADa+or+usually+Amazonia%3B+French%3A+For%C3%AAt+amazonienne%3B+Dutch%3A+Amazoneregenwoud%29%2C+also+known+in+English+as+Amazonia+or+the+Amazon+Jungle%2C+is+a+moist+broadleaf+forest+that+covers+most+of+the+Amazon+basin+of+South+America.+This+basin+encompasses+7%2C000%2C000+square+kilometres+%282%2C700%2C000+sq+mi%29%2C+of+which+5%2C500%2C000+square+kilometres+%282%2C100%2C000+sq+mi%29+are+covered+by+the+rainforest.+This+region+includes+territory+belonging+to+nine+nations.+The+majority+of+the+forest+is+contained+within+Brazil%2C+with+60%25+of+the+rainforest%2C+followed+by+Peru+with+13%25%2C+Colombia+with+10%25%2C+and+with+minor+amounts+in+Venezuela%2C+Ecuador%2C+Bolivia%2C+Guyana%2C+Suriname+and+French+Guiana.+States+or+departments+in+four+nations+contain+%22Amazonas%22+in+their+names.+The+Amazon+represents+over+half+of+the+planet%27s+remaining+rainforests%2C+and+comprises+the+largest+and+most+biodiverse+tract+of+tropical+rainforest+in+the+world%2C+with+an+estimated+390+billion+individual+trees+divided+into+16%2C000+species)
- [Translation with T5](https://huggingface.co/google-t5/t5-base?text=My+name+is+Wolfgang+and+I+live+in+Berlin)
In Computer Vision:
- [Image classification with ViT](https://huggingface.co/google/vit-base-patch16-224)
- [Object Detection with DETR](https://huggingface.co/facebook/detr-resnet-50)
- [Semantic Segmentation with SegFormer](https://huggingface.co/nvidia/segformer-b0-finetuned-ade-512-512)
- [Panoptic Segmentation with Mask2Former](https://huggingface.co/facebook/mask2former-swin-large-coco-panoptic)
- [Depth Estimation with Depth Anything](https://huggingface.co/docs/transformers/main/model_doc/depth_anything)
- [Video Classification with VideoMAE](https://huggingface.co/docs/transformers/model_doc/videomae)
- [Universal Segmentation with OneFormer](https://huggingface.co/shi-labs/oneformer_ade20k_dinat_large)
In Audio:
- [Automatic Speech Recognition with Whisper](https://huggingface.co/openai/whisper-large-v3)
- [Keyword Spotting with Wav2Vec2](https://huggingface.co/superb/wav2vec2-base-superb-ks)
- [Audio Classification with Audio Spectrogram Transformer](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
In Multimodal tasks:
- [Table Question Answering with TAPAS](https://huggingface.co/google/tapas-base-finetuned-wtq)
- [Visual Question Answering with ViLT](https://huggingface.co/dandelin/vilt-b32-finetuned-vqa)
- [Image captioning with LLaVa](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
- [Zero-shot Image Classification with SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384)
- [Document Question Answering with LayoutLM](https://huggingface.co/impira/layoutlm-document-qa)
- [Zero-shot Video Classification with X-CLIP](https://huggingface.co/docs/transformers/model_doc/xclip)
- [Zero-shot Object Detection with OWLv2](https://huggingface.co/docs/transformers/en/model_doc/owlv2)
- [Zero-shot Image Segmentation with CLIPSeg](https://huggingface.co/docs/transformers/model_doc/clipseg)
- [Automatic Mask Generation with SAM](https://huggingface.co/docs/transformers/model_doc/sam)
## 100 projects using Transformers
Transformers is more than a toolkit to use pretrained models: it's a community of projects built around it and the
Hugging Face Hub. We want Transformers to enable developers, researchers, students, professors, engineers, and anyone
else to build their dream projects.
In order to celebrate the 100,000 stars of transformers, we have decided to put the spotlight on the
community, and we have created the [awesome-transformers](./awesome-transformers.md) page which lists 100
incredible projects built in the vicinity of transformers.
If you own or use a project that you believe should be part of the list, please open a PR to add it!
## If you are looking for custom support from the Hugging Face team
<a target="_blank" href="https://huggingface.co/support">
<img alt="HuggingFace Expert Acceleration Program" src="https://cdn-media.huggingface.co/marketing/transformers/new-support-improved.png" style="max-width: 600px; border: 1px solid #eee; border-radius: 4px; box-shadow: 0 1px 2px 0 rgba(0, 0, 0, 0.05);">
</a><br>
## Quick tour
To immediately use a model on a given input (text, image, audio, ...), we provide the `pipeline` API. Pipelines group together a pretrained model with the preprocessing that was used during that model's training. Here is how to quickly use a pipeline to classify positive versus negative texts:
```python
>>> from transformers import pipeline
# Allocate a pipeline for sentiment-analysis
>>> classifier = pipeline('sentiment-analysis')
>>> classifier('We are very happy to introduce pipeline to the transformers repository.')
[{'label': 'POSITIVE', 'score': 0.9996980428695679}]
```py
# venv
python -m venv .my-env
source .my-env/bin/activate
# uv
uv venv .my-env
source .my-env/bin/activate
```
The second line of code downloads and caches the pretrained model used by the pipeline, while the third evaluates it on the given text. Here, the answer is "positive" with a confidence of 99.97%.
Install Transformers in your virtual environment.
Many tasks have a pre-trained `pipeline` ready to go, in NLP but also in computer vision and speech. For example, we can easily extract detected objects in an image:
```py
# pip
pip install "transformers[torch]"
``` python
>>> import requests
>>> from PIL import Image
>>> from transformers import pipeline
# Download an image with cute cats
>>> url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png"
>>> image_data = requests.get(url, stream=True).raw
>>> image = Image.open(image_data)
# Allocate a pipeline for object detection
>>> object_detector = pipeline('object-detection')
>>> object_detector(image)
[{'score': 0.9982201457023621,
'label': 'remote',
'box': {'xmin': 40, 'ymin': 70, 'xmax': 175, 'ymax': 117}},
{'score': 0.9960021376609802,
'label': 'remote',
'box': {'xmin': 333, 'ymin': 72, 'xmax': 368, 'ymax': 187}},
{'score': 0.9954745173454285,
'label': 'couch',
'box': {'xmin': 0, 'ymin': 1, 'xmax': 639, 'ymax': 473}},
{'score': 0.9988006353378296,
'label': 'cat',
'box': {'xmin': 13, 'ymin': 52, 'xmax': 314, 'ymax': 470}},
{'score': 0.9986783862113953,
'label': 'cat',
'box': {'xmin': 345, 'ymin': 23, 'xmax': 640, 'ymax': 368}}]
# uv
uv pip install "transformers[torch]"
```
Here, we get a list of objects detected in the image, with a box surrounding the object and a confidence score. Here is the original image on the left, with the predictions displayed on the right:
Install Transformers from source if you want the latest changes in the library or are interested in contributing. However, the *latest* version may not be stable. Feel free to open an [issue](https://github.com/huggingface/transformers/issues) if you encounter an error.
```shell
git clone https://github.com/huggingface/transformers.git
cd transformers
# pip
pip install '.[torch]'
# uv
uv pip install '.[torch]'
```
## Quickstart
Get started with Transformers right away with the [Pipeline](https://huggingface.co/docs/transformers/pipeline_tutorial) API. The `Pipeline` is a high-level inference class that supports text, audio, vision, and multimodal tasks. It handles preprocessing the input and returns the appropriate output.
Instantiate a pipeline and specify the model to use for text generation. The model is downloaded and cached so you can easily reuse it. Finally, pass some text to prompt the model.
```py
from transformers import pipeline
pipeline = pipeline(task="text-generation", model="Qwen/Qwen2.5-1.5B")
pipeline("the secret to baking a really good cake is ")
[{'generated_text': 'the secret to baking a really good cake is 1) to use the right ingredients and 2) to follow the recipe exactly. the recipe for the cake is as follows: 1 cup of sugar, 1 cup of flour, 1 cup of milk, 1 cup of butter, 1 cup of eggs, 1 cup of chocolate chips. if you want to make 2 cakes, how much sugar do you need? To make 2 cakes, you will need 2 cups of sugar.'}]
```
To chat with a model, the usage pattern is the same. The only difference is you need to construct a chat history (the input to `Pipeline`) between you and the system.
> [!TIP]
> You can also chat with a model directly from the command line.
> ```shell
> transformers chat Qwen/Qwen2.5-0.5B-Instruct
> ```
```py
import torch
from transformers import pipeline
chat = [
{"role": "system", "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
{"role": "user", "content": "Hey, can you tell me any fun things to do in New York?"}
]
pipeline = pipeline(task="text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct", dtype=torch.bfloat16, device_map="auto")
response = pipeline(chat, max_new_tokens=512)
print(response[0]["generated_text"][-1]["content"])
```
Expand the examples below to see how `Pipeline` works for different modalities and tasks.
<details>
<summary>Automatic speech recognition</summary>
```py
from transformers import pipeline
pipeline = pipeline(task="automatic-speech-recognition", model="openai/whisper-large-v3")
pipeline("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
```
</details>
<details>
<summary>Image classification</summary>
<h3 align="center">
<a><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png" width="400"></a>
<a><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample_post_processed.png" width="400"></a>
<a><img src="https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"></a>
</h3>
You can learn more about the tasks supported by the `pipeline` API in [this tutorial](https://huggingface.co/docs/transformers/task_summary).
```py
from transformers import pipeline
In addition to `pipeline`, to download and use any of the pretrained models on your given task, all it takes is three lines of code. Here is the PyTorch version:
```python
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = AutoModel.from_pretrained("google-bert/bert-base-uncased")
>>> inputs = tokenizer("Hello world!", return_tensors="pt")
>>> outputs = model(**inputs)
pipeline = pipeline(task="image-classification", model="facebook/dinov2-small-imagenet1k-1-layer")
pipeline("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")
[{'label': 'macaw', 'score': 0.997848391532898},
{'label': 'sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita',
'score': 0.0016551691805943847},
{'label': 'lorikeet', 'score': 0.00018523589824326336},
{'label': 'African grey, African gray, Psittacus erithacus',
'score': 7.85409429227002e-05},
{'label': 'quail', 'score': 5.502637941390276e-05}]
```
And here is the equivalent code for TensorFlow:
```python
>>> from transformers import AutoTokenizer, TFAutoModel
</details>
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-uncased")
<details>
<summary>Visual question answering</summary>
>>> inputs = tokenizer("Hello world!", return_tensors="tf")
>>> outputs = model(**inputs)
<h3 align="center">
<a><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-few-shot.jpg"></a>
</h3>
```py
from transformers import pipeline
pipeline = pipeline(task="visual-question-answering", model="Salesforce/blip-vqa-base")
pipeline(
image="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-few-shot.jpg",
question="What is in the image?",
)
[{'answer': 'statue of liberty'}]
```
The tokenizer is responsible for all the preprocessing the pretrained model expects and can be called directly on a single string (as in the above examples) or a list. It will output a dictionary that you can use in downstream code or simply directly pass to your model using the ** argument unpacking operator.
</details>
The model itself is a regular [Pytorch `nn.Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) or a [TensorFlow `tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model) (depending on your backend) which you can use as usual. [This tutorial](https://huggingface.co/docs/transformers/training) explains how to integrate such a model into a classic PyTorch or TensorFlow training loop, or how to use our `Trainer` API to quickly fine-tune on a new dataset.
## Why should I use transformers?
## Why should I use Transformers?
1. Easy-to-use state-of-the-art models:
- High performance on natural language understanding & generation, computer vision, and audio tasks.
- Low barrier to entry for educators and practitioners.
- High performance on natural language understanding & generation, computer vision, audio, video, and multimodal tasks.
- Low barrier to entry for researchers, engineers, and developers.
- Few user-facing abstractions with just three classes to learn.
- A unified API for using all our pretrained models.
1. Lower compute costs, smaller carbon footprint:
- Researchers can share trained models instead of always retraining.
- Practitioners can reduce compute time and production costs.
- Dozens of architectures with over 400,000 pretrained models across all modalities.
- Share trained models instead of training from scratch.
- Reduce compute time and production costs.
- Dozens of model architectures with 1M+ pretrained checkpoints across all modalities.
1. Choose the right framework for every part of a model's lifetime:
1. Choose the right framework for every part of a models lifetime:
- Train state-of-the-art models in 3 lines of code.
- Move a single model between TF2.0/PyTorch/JAX frameworks at will.
- Seamlessly pick the right framework for training, evaluation, and production.
- Move a single model between PyTorch/JAX/TF2.0 frameworks at will.
- Pick the right framework for training, evaluation, and production.
1. Easily customize a model or an example to your needs:
- We provide examples for each architecture to reproduce the results published by its original authors.
- Model internals are exposed as consistently as possible.
- Model files can be used independently of the library for quick experiments.
## Why shouldn't I use transformers?
<a target="_blank" href="https://huggingface.co/enterprise">
<img alt="Hugging Face Enterprise Hub" src="https://github.com/user-attachments/assets/247fb16d-d251-4583-96c4-d3d76dda4925">
</a><br>
## Why shouldn't I use Transformers?
- This library is not a modular toolbox of building blocks for neural nets. The code in the model files is not refactored with additional abstractions on purpose, so that researchers can quickly iterate on each of the models without diving into additional abstractions/files.
- The training API is not intended to work on any model but is optimized to work with the models provided by the library. For generic machine learning loops, you should use another library (possibly, [Accelerate](https://huggingface.co/docs/accelerate)).
- While we strive to present as many use cases as possible, the scripts in our [examples folder](https://github.com/huggingface/transformers/tree/main/examples) are just that: examples. It is expected that they won't work out-of-the-box on your specific problem and that you will be required to change a few lines of code to adapt them to your needs.
- The training API is optimized to work with PyTorch models provided by Transformers. For generic machine learning loops, you should use another library like [Accelerate](https://huggingface.co/docs/accelerate).
- The [example scripts](https://github.com/huggingface/transformers/tree/main/examples) are only *examples*. They may not necessarily work out-of-the-box on your specific use case and you'll need to adapt the code for it to work.
## Installation
## 100 projects using Transformers
### With pip
Transformers is more than a toolkit to use pretrained models, it's a community of projects built around it and the
Hugging Face Hub. We want Transformers to enable developers, researchers, students, professors, engineers, and anyone
else to build their dream projects.
This repository is tested on Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.
In order to celebrate Transformers' 100,000 stars, we wanted to put the spotlight on the
community with the [awesome-transformers](./awesome-transformers.md) page which lists 100
incredible projects built with Transformers.
You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
If you own or use a project that you believe should be part of the list, please open a PR to add it!
First, create a virtual environment with the version of Python you're going to use and activate it.
## Example models
Then, you will need to install at least one of Flax, PyTorch, or TensorFlow.
Please refer to [TensorFlow installation page](https://www.tensorflow.org/install/), [PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) and/or [Flax](https://github.com/google/flax#quick-install) and [Jax](https://github.com/google/jax#installation) installation pages regarding the specific installation command for your platform.
You can test most of our models directly on their [Hub model pages](https://huggingface.co/models).
When one of those backends has been installed, 🤗 Transformers can be installed using pip as follows:
Expand each modality below to see a few example models for various use cases.
```bash
pip install transformers
```
<details>
<summary>Audio</summary>
If you'd like to play with the examples or need the bleeding edge of the code and can't wait for a new release, you must [install the library from source](https://huggingface.co/docs/transformers/installation#installing-from-source).
- Audio classification with [Whisper](https://huggingface.co/openai/whisper-large-v3-turbo)
- Automatic speech recognition with [Moonshine](https://huggingface.co/UsefulSensors/moonshine)
- Keyword spotting with [Wav2Vec2](https://huggingface.co/superb/wav2vec2-base-superb-ks)
- Speech to speech generation with [Moshi](https://huggingface.co/kyutai/moshiko-pytorch-bf16)
- Text to audio with [MusicGen](https://huggingface.co/facebook/musicgen-large)
- Text to speech with [Bark](https://huggingface.co/suno/bark)
### With conda
</details>
🤗 Transformers can be installed using conda as follows:
<details>
<summary>Computer vision</summary>
```shell script
conda install conda-forge::transformers
```
- Automatic mask generation with [SAM](https://huggingface.co/facebook/sam-vit-base)
- Depth estimation with [DepthPro](https://huggingface.co/apple/DepthPro-hf)
- Image classification with [DINO v2](https://huggingface.co/facebook/dinov2-base)
- Keypoint detection with [SuperPoint](https://huggingface.co/magic-leap-community/superpoint)
- Keypoint matching with [SuperGlue](https://huggingface.co/magic-leap-community/superglue_outdoor)
- Object detection with [RT-DETRv2](https://huggingface.co/PekingU/rtdetr_v2_r50vd)
- Pose Estimation with [VitPose](https://huggingface.co/usyd-community/vitpose-base-simple)
- Universal segmentation with [OneFormer](https://huggingface.co/shi-labs/oneformer_ade20k_swin_large)
- Video classification with [VideoMAE](https://huggingface.co/MCG-NJU/videomae-large)
> **_NOTE:_** Installing `transformers` from the `huggingface` channel is deprecated.
</details>
Follow the installation pages of Flax, PyTorch or TensorFlow to see how to install them with conda.
<details>
<summary>Multimodal</summary>
> **_NOTE:_** On Windows, you may be prompted to activate Developer Mode in order to benefit from caching. If this is not an option for you, please let us know in [this issue](https://github.com/huggingface/huggingface_hub/issues/1062).
- Audio or text to text with [Qwen2-Audio](https://huggingface.co/Qwen/Qwen2-Audio-7B)
- Document question answering with [LayoutLMv3](https://huggingface.co/microsoft/layoutlmv3-base)
- Image or text to text with [Qwen-VL](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
- Image captioning [BLIP-2](https://huggingface.co/Salesforce/blip2-opt-2.7b)
- OCR-based document understanding with [GOT-OCR2](https://huggingface.co/stepfun-ai/GOT-OCR-2.0-hf)
- Table question answering with [TAPAS](https://huggingface.co/google/tapas-base)
- Unified multimodal understanding and generation with [Emu3](https://huggingface.co/BAAI/Emu3-Gen)
- Vision to text with [Llava-OneVision](https://huggingface.co/llava-hf/llava-onevision-qwen2-0.5b-ov-hf)
- Visual question answering with [Llava](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
- Visual referring expression segmentation with [Kosmos-2](https://huggingface.co/microsoft/kosmos-2-patch14-224)
## Model architectures
</details>
**[All the model checkpoints](https://huggingface.co/models)** provided by 🤗 Transformers are seamlessly integrated from the huggingface.co [model hub](https://huggingface.co/models), where they are uploaded directly by [users](https://huggingface.co/users) and [organizations](https://huggingface.co/organizations).
<details>
<summary>NLP</summary>
Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/models&color=brightgreen)
- Masked word completion with [ModernBERT](https://huggingface.co/answerdotai/ModernBERT-base)
- Named entity recognition with [Gemma](https://huggingface.co/google/gemma-2-2b)
- Question answering with [Mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)
- Summarization with [BART](https://huggingface.co/facebook/bart-large-cnn)
- Translation with [T5](https://huggingface.co/google-t5/t5-base)
- Text generation with [Llama](https://huggingface.co/meta-llama/Llama-3.2-1B)
- Text classification with [Qwen](https://huggingface.co/Qwen/Qwen2.5-0.5B)
🤗 Transformers currently provides the following architectures: see [here](https://huggingface.co/docs/transformers/model_summary) for a high-level summary of each them.
To check if each model has an implementation in Flax, PyTorch or TensorFlow, or has an associated tokenizer backed by the 🤗 Tokenizers library, refer to [this table](https://huggingface.co/docs/transformers/index#supported-frameworks).
These implementations have been tested on several datasets (see the example scripts) and should match the performance of the original implementations. You can find more details on performance in the Examples section of the [documentation](https://github.com/huggingface/transformers/tree/main/examples).
## Learn more
| Section | Description |
|-|-|
| [Documentation](https://huggingface.co/docs/transformers/) | Full API documentation and tutorials |
| [Task summary](https://huggingface.co/docs/transformers/task_summary) | Tasks supported by 🤗 Transformers |
| [Preprocessing tutorial](https://huggingface.co/docs/transformers/preprocessing) | Using the `Tokenizer` class to prepare data for the models |
| [Training and fine-tuning](https://huggingface.co/docs/transformers/training) | Using the models provided by 🤗 Transformers in a PyTorch/TensorFlow training loop and the `Trainer` API |
| [Quick tour: Fine-tuning/usage scripts](https://github.com/huggingface/transformers/tree/main/examples) | Example scripts for fine-tuning models on a wide range of tasks |
| [Model sharing and uploading](https://huggingface.co/docs/transformers/model_sharing) | Upload and share your fine-tuned models with the community |
</details>
## Citation


@ -14,7 +14,7 @@ Models uploaded on the Hugging Face Hub come in different formats. We heavily re
models in the [`safetensors`](https://github.com/huggingface/safetensors) format (which is the default prioritized
by the transformers library), as developed specifically to prevent arbitrary code execution on your system.
To avoid loading models from unsafe formats(e.g. [pickle](https://docs.python.org/3/library/pickle.html), you should use the `use_safetensors` parameter. If doing so, in the event that no .safetensors file is present, transformers will error when loading the model.
To avoid loading models from unsafe formats (e.g. [pickle](https://docs.python.org/3/library/pickle.html)), you should use the `use_safetensors` parameter. If doing so, in the event that no .safetensors file is present, transformers will error when loading the model.
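A minimal sketch of opting in to that behavior (the checkpoint name is just an example):

```py
from transformers import AutoModel

# With use_safetensors=True, loading errors out instead of falling back
# to a pickle-based checkpoint when no .safetensors file is present.
model = AutoModel.from_pretrained("google-bert/bert-base-uncased", use_safetensors=True)
```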
### Remote code
@ -27,13 +27,6 @@ These models require the `trust_remote_code=True` parameter to be set when using
the content of the modeling files when using this argument. We recommend setting a revision in order to ensure you
protect yourself from updates on the repository.
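For example, pinning a revision when trusting remote code could look like the sketch below (the repository name and commit hash are placeholders):

```py
from transformers import AutoModel

# trust_remote_code executes modeling code fetched from the Hub repository,
# so pin a specific commit to protect against later changes to that repo.
model = AutoModel.from_pretrained(
    "some-org/some-custom-model",     # hypothetical repository
    trust_remote_code=True,
    revision="0123456789abcdef0123456789abcdef01234567",  # placeholder sha
)
```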
#### Tools
Through the `Agent` framework, remote tools can be downloaded to be used by the Agent. You're to specify these tools
yourself, but please keep in mind that their code will be run on your machine if the Agent chooses to run them.
Please inspect the code of the tools before passing them to the Agent to protect your runtime and local setup.
## Reporting a Vulnerability
Feel free to submit vulnerability reports to [security@huggingface.co](mailto:security@huggingface.co), where someone from the HF security team will review and recommend next steps. If reporting a vulnerability specific to open source, please note [Huntr](https://huntr.com) is a vulnerability disclosure program for open source software.


@ -6,7 +6,7 @@ developers, researchers, students, professors, engineers, and anyone else to bui
In this list, we showcase incredibly impactful and novel projects that have pushed the field forward. We celebrate
100 of these projects as we reach the milestone of 100k stars as a community; but we're very open to pull requests
adding other projects to the list. If you believe a project should be here and it's not, then please open a PR
to add it.
## [gpt4all](https://github.com/nomic-ai/gpt4all)
@ -15,7 +15,7 @@ to add it.
Keywords: Open-source, LLaMa, GPT-J, instruction, assistant
## [recommenders](https://github.com/microsoft/recommenders)
## [recommenders](https://github.com/recommenders-team/recommenders)
This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. It goes over several aspects required to build efficient recommendation systems: data preparation, modeling, evaluation, model selection & optimization, as well as operationalization.
@ -29,7 +29,7 @@ Keywords: inpainting, SD, Stable Diffusion
## [flair](https://github.com/flairNLP/flair)
FLAIR is a powerful PyTorch NLP framework, convering several important tasks: NER, sentiment-analysis, part-of-speech tagging, text and document embeddings, among other things.
FLAIR is a powerful PyTorch NLP framework, covering several important tasks: NER, sentiment-analysis, part-of-speech tagging, text and document embeddings, among other things.
Keywords: NLP, text embedding, document embedding, biomedical, NER, PoS, sentiment-analysis
@ -39,17 +39,17 @@ MindsDB is a low-code ML platform, which automates and integrates several ML fra
Keywords: Database, low-code, AI table
## [langchain](https://github.com/hwchase17/langchain)
## [langchain](https://github.com/langchain-ai/langchain)
[langchain](https://github.com/hwchase17/langchain) is aimed at assisting in the development of apps merging both LLMs and other sources of knowledge. The library allows chaining calls to applications, creating a sequence across many tools.
[langchain](https://github.com/langchain-ai/langchain) is aimed at assisting in the development of apps merging both LLMs and other sources of knowledge. The library allows chaining calls to applications, creating a sequence across many tools.
Keywords: LLMs, Large Language Models, Agents, Chains
## [LlamaIndex](https://github.com/jerryjliu/llama_index)
## [LlamaIndex](https://github.com/run-llama/llama_index)
[LlamaIndex](https://github.com/jerryjliu/llama_index) is a project that provides a central interface to connect your LLM's with external data. It provides various kinds of indices and retreival mechanisms to perform different LLM tasks and obtain knowledge-augmented results.
[LlamaIndex](https://github.com/run-llama/llama_index) is a project that provides a central interface to connect your LLMs with external data. It provides various kinds of indices and retrieval mechanisms to perform different LLM tasks and obtain knowledge-augmented results.
Keywords: LLMs, Large Language Models, Data Retrieval, Indices, Knowledge Augmentation
## [ParlAI](https://github.com/facebookresearch/ParlAI)
@ -146,9 +146,9 @@ Keywords: Framework, simplicity, NLP
Keywords: LLM, Agents, HF Hub
## [transformers.js](https://xenova.github.io/transformers.js/)
## [transformers.js](https://github.com/huggingface/transformers.js/)
[transformers.js](https://xenova.github.io/transformers.js/) is a JavaScript library targeted at running models from transformers directly within the browser.
[transformers.js](https://github.com/huggingface/transformers.js/) is a JavaScript library targeted at running models from transformers directly within the browser.
Keywords: Transformers, JavaScript, browser
@ -257,7 +257,7 @@ Stable-Dreamfusion is a pytorch implementation of the text-to-3D model Dreamfusi
Keywords: Text-to-3D, Stable Diffusion
## [txtai](https://github.com/neuml/txtai)
[txtai](https://github.com/neuml/txtai) is an open-source platform for semantic search and workflows powered by language models. txtai builds embeddings databases, which are a union of vector indexes and relational databases enabling similarity search with SQL. Semantic workflows connect language models together into unified applications.
Keywords: Semantic search, LLM
@ -288,7 +288,7 @@ Keywords: Music understanding, Music generation
## [dalle-flow](https://github.com/jina-ai/dalle-flow)
DALL·E Flow is an interactive workflow for generating high-definition images from a text prompt. Itt leverages DALL·E-Mega, GLID-3 XL, and Stable Diffusion to generate image candidates, and then calls CLIP-as-service to rank the candidates w.r.t. the prompt.
DALL·E Flow is an interactive workflow for generating high-definition images from a text prompt. It leverages DALL·E-Mega, GLID-3 XL, and Stable Diffusion to generate image candidates, and then calls CLIP-as-service to rank the candidates w.r.t. the prompt.
The preferred candidate is fed to GLID-3 XL for diffusion, which often enriches the texture and background. Finally, the candidate is upscaled to 1024x1024 via SwinIR.
Keywords: High-definition image generation, Stable Diffusion, DALL-E Mega, GLID-3 XL, CLIP, SwinIR
@ -309,8 +309,8 @@ Keywords: OCR, LaTeX, Math formula
OpenCLIP is an open source implementation of OpenAI's CLIP.
The goal of this repository is to enable training models with contrastive image-text supervision, and to investigate their properties such as robustness to distribution shift.
The starting point is an implementation of CLIP that matches the accuracy of the original CLIP models when trained on the same dataset.
Specifically, a ResNet-50 model trained with this codebase on OpenAI's 15 million image subset of YFCC achieves 32.7% top-1 accuracy on ImageNet.
@ -437,7 +437,7 @@ Keywords: DALL-E, Russian
Keywords: Knowledge Extraction, Knowledge Graphs
## [Nebuly](https://github.com/nebuly-ai/nebuly)
## [Nebuly](https://github.com/nebuly-ai/optimate)
Nebuly is the next-generation platform to monitor and optimize your AI costs in one place. The platform connects to all your AI cost sources (compute, API providers, AI software licenses, etc.) and centralizes them in one place to give you full visibility on a model basis. The platform also provides optimization recommendations and a co-pilot model that can guide you during the optimization process. The platform builds on top of open-source tools, allowing you to optimize the different steps of your AI stack to squeeze out the best possible cost performance.
@ -526,7 +526,7 @@ Keywords: Model deployment, CLoud, Mobile, Edge
## [underthesea](https://github.com/undertheseanlp/underthesea)
[underthesea](https://github.com/undertheseanlp/underthesea) is a Vietnamese NLP toolkit. Underthesea is a suite of open source Python modules data sets and tutorials supporting research and development in Vietnamese Natural Language Processing. We provides extremely easy API to quickly apply pretrained NLP models to your Vietnamese text, such as word segmentation, part-of-speech tagging (PoS), named entity recognition (NER), text classification and dependency parsing.
[underthesea](https://github.com/undertheseanlp/underthesea) is a Vietnamese NLP toolkit. Underthesea is a suite of open source Python modules, data sets, and tutorials supporting research and development in Vietnamese Natural Language Processing. We provide an extremely easy API to quickly apply pretrained NLP models to your Vietnamese text, such as word segmentation, part-of-speech tagging (PoS), named entity recognition (NER), text classification and dependency parsing.
Keywords: Vietnamese, NLP
@ -596,7 +596,7 @@ Keywords: Data-Centric AI, Data Quality, Noisy Labels, Outlier Detection, Active
## [BentoML](https://github.com/bentoml/BentoML)
[BentoML](https://github.com/bentoml) is the unified framework for building, shipping, and scaling production-ready AI applications incorporating traditional ML, pre-trained AI models, Generative and Large Language Models.
All Hugging Face models and pipelines can be seamlessly integrated into BentoML applications, enabling the running of models on the most suitable hardware and independent scaling based on usage.
Keywords: BentoML, Framework, Deployment, AI Applications
@ -606,4 +606,3 @@ Keywords: BentoML, Framework, Deployment, AI Applications
[LLaMA Factory](https://github.com/hiyouga/LLaMA-Factory) offers a user-friendly fine-tuning framework that incorporates PEFT. The repository includes training (fine-tuning) and inference examples for LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, and other LLMs. A ChatGLM version is also available in [ChatGLM-Efficient-Tuning](https://github.com/hiyouga/ChatGLM-Efficient-Tuning).
Keywords: PEFT, fine-tuning, LLaMA-2, ChatGLM, Qwen

benchmark/.gitignore (new file)

@ -0,0 +1 @@
benchmark_results/

benchmark/README.md (new file)

@ -0,0 +1,49 @@
# Benchmarks
To add a new benchmark, define a Python function named `run_benchmark` in a Python file placed in this `benchmark/` directory.
The expected function signature is the following:
```py
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
```
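A minimal benchmark satisfying this contract could look like the following sketch (the body is illustrative; a real benchmark would load a model, run timed measurements, and record metrics):

```py
from logging import Logger

def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
    # Load a model, run timed measurements, and record them here.
    logger.info(f"benchmarking {branch}@{commit_id}: {commit_msg}")
```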
## Writing metrics to the database
`MetricsRecorder` is thread-safe in the sense of the Python [`Thread`](https://docs.python.org/3/library/threading.html#threading.Thread) API. This means you can start a background thread to take the device measurements without blocking the main thread, which executes the model measurements.
See [`llama.py`](./llama.py) for an example of this in practice.
```py
from benchmarks_entrypoint import MetricsRecorder
import psycopg2
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
metrics_recorder = MetricsRecorder(psycopg2.connect("dbname=metrics"), logger, branch, commit_id, commit_msg)
# `gpu_name` and `model_id` (like the measurement values below) are placeholders your benchmark provides
benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
# To collect device measurements
metrics_recorder.collect_device_measurements(
benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
)
# To collect your model measurements
metrics_recorder.collect_model_measurements(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
)
```
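The background-thread pattern mentioned above can be sketched as follows, assuming a `collect_metrics` helper that polls device statistics in a loop until asked to stop (see `llama.py` for a complete version; `benchmark_id` and `metrics_recorder` come from the snippet above):

```py
from threading import Event, Thread

stop_collecting = Event()
# `collect_metrics` is assumed to call `metrics_recorder.collect_device_measurements`
# repeatedly until `stop_collecting` is set.
thread = Thread(target=collect_metrics, args=[benchmark_id, stop_collecting, metrics_recorder])
thread.start()
# ... run the model measurements on the main thread ...
stop_collecting.set()
thread.join()
```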

benchmark/benches/llama.py (new file)

@ -0,0 +1,353 @@
# Copyright 2025 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
from logging import Logger
from threading import Event, Thread
from time import perf_counter, sleep
# Add the parent directory to Python path to import benchmarks_entrypoint
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import gpustat
import psutil
import psycopg2
from benchmarks_entrypoint import MetricsRecorder
# Optional heavy ML dependencies - only required when actually running the benchmark
try:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, StaticCache
TRANSFORMERS_AVAILABLE = True
except ImportError:
TRANSFORMERS_AVAILABLE = False
torch = None
AutoModelForCausalLM = None
AutoTokenizer = None
GenerationConfig = None
StaticCache = None
os.environ["HF_XET_HIGH_PERFORMANCE"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "1"
# Only set torch precision if torch is available
if TRANSFORMERS_AVAILABLE:
torch.set_float32_matmul_precision("high")
def collect_metrics(benchmark_id, continue_metric_collection, metrics_recorder):
p = psutil.Process(os.getpid())
while not continue_metric_collection.is_set():
with p.oneshot():
cpu_util = p.cpu_percent()
mem_megabytes = p.memory_info().rss / (1024 * 1024)
gpu_stats = gpustat.GPUStatCollection.new_query()
gpu_util = gpu_stats[0]["utilization.gpu"]
gpu_mem_megabytes = gpu_stats[0]["memory.used"]
metrics_recorder.collect_device_measurements(
benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
)
sleep(0.01)
def run_benchmark(
logger: Logger,
repository: str,
branch: str,
commit_id: str,
commit_msg: str,
metrics_recorder=None,
num_tokens_to_generate=100,
):
# Check if required ML dependencies are available
if not TRANSFORMERS_AVAILABLE:
logger.error("Transformers and torch are required to run the LLaMA benchmark. Please install them with:")
logger.error("pip install torch transformers")
logger.error("Skipping LLaMA benchmark due to missing dependencies.")
return
continue_metric_collection = Event()
metrics_thread = None
model_id = "meta-llama/Llama-2-7b-hf"
# If no metrics_recorder is provided, create one for backward compatibility
if metrics_recorder is None:
try:
metrics_recorder = MetricsRecorder(
psycopg2.connect("dbname=metrics"), logger, repository, branch, commit_id, commit_msg, True
)
should_close_recorder = True
except Exception as e:
logger.error(f"Failed to create metrics recorder: {e}")
return
else:
should_close_recorder = False
try:
gpu_stats = gpustat.GPUStatCollection.new_query()
gpu_name = gpu_stats[0]["name"]
benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
logger.info(f"running benchmark #{benchmark_id} on {gpu_name} for {model_id}")
metrics_thread = Thread(
target=collect_metrics,
args=[benchmark_id, continue_metric_collection, metrics_recorder],
)
metrics_thread.start()
logger.info("started background thread to fetch device metrics")
os.environ["TOKENIZERS_PARALLELISM"] = "false" # silence warnings when compiling
device = "cuda"
logger.info("downloading weights")
# This is to avoid counting download in model load time measurement
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.float16)
gen_config = GenerationConfig(do_sample=False, top_p=1, temperature=1)
logger.info("loading model")
start = perf_counter()
model = AutoModelForCausalLM.from_pretrained(
model_id, dtype=torch.float16, generation_config=gen_config
).eval()
model.to(device)
torch.cuda.synchronize()
end = perf_counter()
model_load_time = end - start
logger.info(f"loaded model in: {model_load_time}s")
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = "Why dogs are so cute?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
# Specify the max length (including both the prompt and the response)
# When calling `generate` with `cache_implementation="static" later, this is also used to create a `StaticCache` object
# with sequence length = `max_length`. The longer it is, the more the cache can be re-used.
seq_length = inputs["input_ids"].shape[1]
model.generation_config.max_length = seq_length + num_tokens_to_generate
batch_size = inputs["input_ids"].shape[0]
# Copied from the gpt-fast repo
def multinomial_sample_one_no_sync(probs_sort): # Does multinomial sampling without a cuda synchronization
q = torch.empty_like(probs_sort).exponential_(1)
return torch.argmax(probs_sort / q, dim=-1, keepdim=True).to(dtype=torch.int)
def logits_to_probs(logits, temperature: float = 1.0, top_k: int | None = None):
logits = logits / max(temperature, 1e-5)
if top_k is not None:
v, _ = torch.topk(logits, min(top_k, logits.size(-1)))
pivot = v.select(-1, -1).unsqueeze(-1)
logits = torch.where(logits < pivot, -float("Inf"), logits)
probs = torch.nn.functional.softmax(logits, dim=-1)
return probs
def sample(logits, temperature: float = 1.0, top_k: int | None = None):
probs = logits_to_probs(logits[0, -1], temperature, top_k)
idx_next = multinomial_sample_one_no_sync(probs)
return idx_next, probs
# First eager forward pass
logger.info("running first eager forward pass")
start = perf_counter()
_ = model(**inputs)
torch.cuda.synchronize()
end = perf_counter()
first_eager_fwd_pass_time = end - start
logger.info(f"completed first eager forward pass in: {first_eager_fwd_pass_time}s")
# Second eager forward pass (should be faster)
logger.info("running second eager forward pass")
start = perf_counter()
_ = model(**inputs)
torch.cuda.synchronize()
end = perf_counter()
second_eager_fwd_pass_time = end - start
logger.info(f"completed second eager forward pass in: {second_eager_fwd_pass_time}s")
# First eager generation
logger.info("running first eager generation")
start = perf_counter()
output = model.generate(**inputs)
torch.cuda.synchronize()
end = perf_counter()
first_eager_generate_time = end - start
logger.info(f"completed first eager generation in: {first_eager_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
# Second eager generation (should be faster)
logger.info("running second eager generation")
start = perf_counter()
output = model.generate(**inputs)
torch.cuda.synchronize()
end = perf_counter()
second_eager_generate_time = end - start
logger.info(f"completed second eager generation in: {second_eager_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
logger.info("running generation timing loop")
input_pos = torch.arange(0, seq_length, device=device)
inputs = inputs["input_ids"]
start = perf_counter()
with torch.nn.attention.sdpa_kernel(torch.nn.attention.SDPBackend.MATH):
logits = model(inputs, position_ids=input_pos).logits
next_token, probs = sample(logits, temperature=0.6, top_k=5)
torch.cuda.synchronize()
end = perf_counter()
time_to_first_token = end - start
input_pos = torch.tensor([seq_length], device=device, dtype=torch.int)
next_token = next_token.clone()
start = perf_counter()
with torch.nn.attention.sdpa_kernel(torch.nn.attention.SDPBackend.MATH):
logits = model(next_token, position_ids=input_pos).logits
next_token, probs = sample(logits, temperature=0.6, top_k=5)
torch.cuda.synchronize()
end = perf_counter()
time_to_second_token = end - start
input_pos = torch.tensor([seq_length + 1], device=device, dtype=torch.int)
next_token = next_token.clone()
start = perf_counter()
with torch.nn.attention.sdpa_kernel(torch.nn.attention.SDPBackend.MATH):
logits = model(next_token, position_ids=input_pos).logits
next_token, probs = sample(logits, temperature=0.6, top_k=5)
torch.cuda.synchronize()
end = perf_counter()
time_to_third_token = end - start
logger.info("running longer generation timing loop")
total_time = 0
for i in range(20):
input_pos = torch.tensor([seq_length + 2 + i], device=device, dtype=torch.int)
next_token = next_token.clone()
start = perf_counter()
with torch.nn.attention.sdpa_kernel(torch.nn.attention.SDPBackend.MATH):
logits = model(next_token, position_ids=input_pos).logits
next_token, probs = sample(logits, temperature=0.6, top_k=5)
torch.cuda.synchronize()
end = perf_counter()
total_time += end - start
mean_time_to_next_token = total_time / 20
logger.info("running compilation benchmarks")
# Now compile the model
model = torch.compile(model, mode="max-autotune", fullgraph=True)
# StaticCache for generation
with torch.device(device):
model.setup_caches(max_batch_size=batch_size, max_seq_len=seq_length + num_tokens_to_generate)
input_pos = torch.arange(0, seq_length, device=device)
inputs = tokenizer(prompt, return_tensors="pt").to(device)["input_ids"]
logger.info("compiling model")
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.float16, generation_config=gen_config)
model.to(device)
model = torch.compile(model, mode="max-autotune", fullgraph=True)
past_key_values = StaticCache(
model.config,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
)
# 1st call
start = perf_counter()
output = model.generate(**inputs, past_key_values=past_key_values)
end = perf_counter()
first_compile_generate_time = end - start
logger.info(f"completed first compile generation in: {first_compile_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
past_key_values = StaticCache(
model.config,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
)
# 2nd call
start = perf_counter()
output = model.generate(**inputs, past_key_values=past_key_values)
end = perf_counter()
second_compile_generate_time = end - start
logger.info(f"completed second compile generation in: {second_compile_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
past_key_values = StaticCache(
model.config,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
)
# 3rd call
start = perf_counter()
output = model.generate(**inputs, past_key_values=past_key_values)
end = perf_counter()
third_compile_generate_time = end - start
logger.info(f"completed third compile generation in: {third_compile_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
past_key_values = StaticCache(
model.config,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
)
# 4th call
start = perf_counter()
output = model.generate(**inputs, past_key_values=past_key_values)
end = perf_counter()
fourth_compile_generate_time = end - start
logger.info(f"completed fourth compile generation in: {fourth_compile_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
metrics_recorder.collect_model_measurements(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
)
except Exception as e:
logger.error(f"Caught exception: {e}")
continue_metric_collection.set()
if metrics_thread is not None:
metrics_thread.join()
# Only close the recorder if we created it locally
if should_close_recorder:
metrics_recorder.close()


@ -31,9 +31,7 @@ from contextlib import contextmanager
from pathlib import Path
from git import Repo
from huggingface_hub import HfApi
from optimum_benchmark import Benchmark
from optimum_benchmark_wrapper import main
@ -90,7 +88,7 @@ def summarize(run_dir, metrics, expand_metrics=False):
model = benchmark.config.backend["model"]
# Ths looks like `benchmark.input_shapes.batch_size=1,benchmark.input_shapes.sequence_length=5`.
# This looks like `benchmark.input_shapes.batch_size=1,benchmark.input_shapes.sequence_length=5`.
# (we rely on the usage of hydra's `${hydra.job.override_dirname}`.)
benchmark_name = re.sub(f"backend.model={model},*", "", report_dir)
benchmark_name = str(Path(benchmark_name).parts[-1])


@ -0,0 +1,502 @@
# Copyright 2025 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
import importlib.util
import json
import logging
import os
import sys
import uuid
from datetime import datetime
import pandas as pd
try:
from psycopg2.extensions import register_adapter
from psycopg2.extras import Json
register_adapter(dict, Json)
PSYCOPG2_AVAILABLE = True
except ImportError:
PSYCOPG2_AVAILABLE = False
class ImportModuleException(Exception):
pass
class MetricsRecorder:
def __init__(
self,
connection,
logger: logging.Logger,
repository: str,
branch: str,
commit_id: str,
commit_msg: str,
collect_csv_data: bool = True,
):
self.conn = connection
self.use_database = connection is not None
if self.use_database:
self.conn.autocommit = True
self.logger = logger
self.repository = repository
self.branch = branch
self.commit_id = commit_id
self.commit_msg = commit_msg
self.collect_csv_data = collect_csv_data
# For CSV export - store all data in pandas DataFrames (only if CSV collection is enabled)
if self.collect_csv_data:
# Initialize empty DataFrames with proper schemas
self.benchmarks_df = pd.DataFrame(
columns=[
"benchmark_id",
"repository",
"branch",
"commit_id",
"commit_message",
"metadata",
"created_at",
]
)
self.device_measurements_df = pd.DataFrame(
columns=["benchmark_id", "cpu_util", "mem_megabytes", "gpu_util", "gpu_mem_megabytes", "time"]
)
self.model_measurements_df = pd.DataFrame(
columns=[
"benchmark_id",
"time",
"model_load_time",
"first_eager_forward_pass_time_secs",
"second_eager_forward_pass_time_secs",
"first_eager_generate_time_secs",
"second_eager_generate_time_secs",
"time_to_first_token_secs",
"time_to_second_token_secs",
"time_to_third_token_secs",
"time_to_next_token_mean_secs",
"first_compile_generate_time_secs",
"second_compile_generate_time_secs",
"third_compile_generate_time_secs",
"fourth_compile_generate_time_secs",
]
)
else:
self.benchmarks_df = None
self.device_measurements_df = None
self.model_measurements_df = None
def initialise_benchmark(self, metadata: dict[str, str]) -> str:
"""
Creates a new benchmark, returns the benchmark id (UUID)
"""
# Generate a unique UUID for this benchmark
benchmark_id = str(uuid.uuid4())
if self.use_database:
with self.conn.cursor() as cur:
cur.execute(
"INSERT INTO benchmarks (benchmark_id, repository, branch, commit_id, commit_message, metadata) VALUES (%s, %s, %s, %s, %s, %s)",
(benchmark_id, self.repository, self.branch, self.commit_id, self.commit_msg, metadata),
)
self.logger.debug(f"initialised benchmark #{benchmark_id}")
# Store benchmark data for CSV export (if enabled)
if self.collect_csv_data:
# Add row to pandas DataFrame
new_row = pd.DataFrame(
[
{
"benchmark_id": benchmark_id,
"repository": self.repository,
"branch": self.branch,
"commit_id": self.commit_id,
"commit_message": self.commit_msg,
"metadata": json.dumps(metadata),
"created_at": datetime.utcnow().isoformat(),
}
]
)
self.benchmarks_df = pd.concat([self.benchmarks_df, new_row], ignore_index=True)
mode_info = []
if self.use_database:
mode_info.append("database")
if self.collect_csv_data:
mode_info.append("CSV")
mode_str = " + ".join(mode_info) if mode_info else "no storage"
self.logger.debug(f"initialised benchmark #{benchmark_id} ({mode_str} mode)")
return benchmark_id
def collect_device_measurements(self, benchmark_id: str, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes):
"""
Collect device metrics, such as CPU & GPU usage. These are "static", as in you cannot pass arbitrary arguments to the function.
"""
# Store device measurements for CSV export (if enabled)
if self.collect_csv_data:
# Add row to pandas DataFrame
new_row = pd.DataFrame(
[
{
"benchmark_id": benchmark_id,
"cpu_util": cpu_util,
"mem_megabytes": mem_megabytes,
"gpu_util": gpu_util,
"gpu_mem_megabytes": gpu_mem_megabytes,
"time": datetime.utcnow().isoformat(),
}
]
)
self.device_measurements_df = pd.concat([self.device_measurements_df, new_row], ignore_index=True)
# Store in database if available
if self.use_database:
with self.conn.cursor() as cur:
cur.execute(
"INSERT INTO device_measurements (benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes) VALUES (%s, %s, %s, %s, %s)",
(benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes),
)
self.logger.debug(
f"collected device measurements for benchmark #{benchmark_id} [CPU util: {cpu_util}, mem MBs: {mem_megabytes}, GPU util: {gpu_util}, GPU mem MBs: {gpu_mem_megabytes}]"
)
def collect_model_measurements(self, benchmark_id: str, measurements: dict[str, float]):
# Store model measurements for CSV export (if enabled)
if self.collect_csv_data:
# Add row to pandas DataFrame with flattened measurements
row_data = {"benchmark_id": benchmark_id, "time": datetime.utcnow().isoformat()}
# Flatten the measurements dict into the row
row_data.update(measurements)
new_row = pd.DataFrame([row_data])
self.model_measurements_df = pd.concat([self.model_measurements_df, new_row], ignore_index=True)
# Store in database if available
if self.use_database:
with self.conn.cursor() as cur:
cur.execute(
"""
INSERT INTO model_measurements (
benchmark_id,
measurements
) VALUES (%s, %s)
""",
(
benchmark_id,
measurements,
),
)
self.logger.debug(f"collected model measurements for benchmark #{benchmark_id}: {measurements}")
def export_to_csv(self, output_dir: str = "benchmark_results"):
"""
Export all collected data to CSV files using pandas DataFrames
"""
if not self.collect_csv_data:
self.logger.warning("CSV data collection is disabled - no CSV files will be generated")
return
if not os.path.exists(output_dir):
os.makedirs(output_dir)
self.logger.info(f"Created output directory: {output_dir}")
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
files_created = []
# Export using pandas DataFrames
self._export_pandas_data(output_dir, timestamp, files_created)
self.logger.info(f"CSV export complete! Created {len(files_created)} files in {output_dir}")
def _export_pandas_data(self, output_dir: str, timestamp: str, files_created: list):
"""
Export CSV files using pandas DataFrames
"""
# Export benchmarks
benchmarks_file = os.path.join(output_dir, f"benchmarks_{timestamp}.csv")
self.benchmarks_df.to_csv(benchmarks_file, index=False)
files_created.append(benchmarks_file)
self.logger.info(f"Exported {len(self.benchmarks_df)} benchmark records to {benchmarks_file}")
# Export device measurements
device_file = os.path.join(output_dir, f"device_measurements_{timestamp}.csv")
self.device_measurements_df.to_csv(device_file, index=False)
files_created.append(device_file)
self.logger.info(f"Exported {len(self.device_measurements_df)} device measurement records to {device_file}")
# Export model measurements (already flattened)
model_file = os.path.join(output_dir, f"model_measurements_{timestamp}.csv")
self.model_measurements_df.to_csv(model_file, index=False)
files_created.append(model_file)
self.logger.info(f"Exported {len(self.model_measurements_df)} model measurement records to {model_file}")
# Create comprehensive summary using pandas operations
summary_file = os.path.join(output_dir, f"benchmark_summary_{timestamp}.csv")
self._create_summary(summary_file)
files_created.append(summary_file)
def _create_summary(self, summary_file: str):
"""
Create a comprehensive summary CSV using pandas operations
"""
if len(self.benchmarks_df) == 0:
# Create empty summary file
summary_df = pd.DataFrame()
summary_df.to_csv(summary_file, index=False)
self.logger.info(f"Created empty benchmark summary at {summary_file}")
return
# Start with benchmarks as the base
summary_df = self.benchmarks_df.copy()
# Add model measurements (join on benchmark_id)
if len(self.model_measurements_df) > 0:
# Drop 'time' column from model measurements to avoid conflicts
model_df = self.model_measurements_df.drop(columns=["time"], errors="ignore")
summary_df = summary_df.merge(model_df, on="benchmark_id", how="left")
# Calculate device measurement aggregates using pandas groupby
if len(self.device_measurements_df) > 0:
device_agg = (
self.device_measurements_df.groupby("benchmark_id")
.agg(
{
"cpu_util": ["mean", "max", "std", "count"],
"mem_megabytes": ["mean", "max", "std"],
"gpu_util": ["mean", "max", "std"],
"gpu_mem_megabytes": ["mean", "max", "std"],
}
)
.round(3)
)
# Flatten column names
device_agg.columns = [f"{col[0]}_{col[1]}" for col in device_agg.columns]
device_agg = device_agg.reset_index()
# Rename count column to be more descriptive
if "cpu_util_count" in device_agg.columns:
device_agg = device_agg.rename(columns={"cpu_util_count": "device_measurement_count"})
# Merge with summary
summary_df = summary_df.merge(device_agg, on="benchmark_id", how="left")
# Export the comprehensive summary
summary_df.to_csv(summary_file, index=False)
self.logger.info(f"Created comprehensive benchmark summary with {len(summary_df)} records at {summary_file}")
def close(self):
if self.use_database and self.conn:
self.conn.close()
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter("[%(levelname)s - %(asctime)s] %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
def parse_arguments() -> tuple[str, str, str, str, bool, str]:
"""
Parse command line arguments for the benchmarking CLI.
"""
parser = argparse.ArgumentParser(description="CLI for benchmarking the huggingface/transformers.")
parser.add_argument(
"repository",
type=str,
help="The repository name on which the benchmarking is performed.",
)
parser.add_argument(
"branch",
type=str,
help="The branch name on which the benchmarking is performed.",
)
parser.add_argument(
"commit_id",
type=str,
help="The commit hash on which the benchmarking is performed.",
)
parser.add_argument(
"commit_msg",
type=str,
help="The commit message associated with the commit, truncated to 70 characters.",
)
parser.add_argument("--csv", action="store_true", default=False, help="Enable CSV output files generation.")
parser.add_argument(
"--csv-output-dir",
type=str,
default="benchmark_results",
help="Directory for CSV output files (default: benchmark_results).",
)
args = parser.parse_args()
# CSV is disabled by default, only enabled when --csv is used
generate_csv = args.csv
return args.repository, args.branch, args.commit_id, args.commit_msg, generate_csv, args.csv_output_dir
def import_from_path(module_name, file_path):
try:
spec = importlib.util.spec_from_file_location(module_name, file_path)
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
return module
except Exception as e:
raise ImportModuleException(f"failed to load python module: {e}")
def create_database_connection():
"""
Try to create a database connection. Returns None if connection fails.
"""
if not PSYCOPG2_AVAILABLE:
logger.warning("psycopg2 not available - running in CSV-only mode")
return None
try:
import psycopg2
conn = psycopg2.connect("dbname=metrics")
logger.info("Successfully connected to database")
return conn
except Exception as e:
logger.warning(f"Failed to connect to database: {e}. Running in CSV-only mode")
return None
def create_global_metrics_recorder(
repository: str, branch: str, commit_id: str, commit_msg: str, generate_csv: bool = False
) -> MetricsRecorder:
"""
Create a global metrics recorder that will be used across all benchmarks.
"""
connection = create_database_connection()
recorder = MetricsRecorder(connection, logger, repository, branch, commit_id, commit_msg, generate_csv)
# Log the storage mode
storage_modes = []
if connection is not None:
storage_modes.append("database")
if generate_csv:
storage_modes.append("CSV")
if not storage_modes:
logger.warning("Running benchmarks with NO data storage (no database connection, CSV disabled)")
logger.warning("Use --csv flag to enable CSV output when database is unavailable")
else:
logger.info(f"Running benchmarks with: {' + '.join(storage_modes)} storage")
return recorder
if __name__ == "__main__":
benchmarks_folder_path = os.path.dirname(os.path.realpath(__file__))
benches_folder_path = os.path.join(benchmarks_folder_path, "benches")
repository, branch, commit_id, commit_msg, generate_csv, csv_output_dir = parse_arguments()
# Create a global metrics recorder
global_metrics_recorder = create_global_metrics_recorder(repository, branch, commit_id, commit_msg, generate_csv)
successful_benchmarks = 0
failed_benchmarks = 0
# Automatically discover all benchmark modules in benches/ folder
benchmark_modules = []
if os.path.exists(benches_folder_path):
logger.debug(f"Scanning for benchmarks in: {benches_folder_path}")
for entry in os.scandir(benches_folder_path):
if not entry.name.endswith(".py"):
continue
if entry.name.startswith("__"): # Skip __init__.py, __pycache__, etc.
continue
# Check if the file has a run_benchmark function
try:
logger.debug(f"checking if benches/{entry.name} has run_benchmark function")
module = import_from_path(entry.name.split(".")[0], entry.path)
if hasattr(module, "run_benchmark"):
benchmark_modules.append(entry.name)
logger.debug(f"discovered benchmark: {entry.name}")
else:
logger.debug(f"skipping {entry.name} - no run_benchmark function found")
except Exception as e:
logger.debug(f"failed to check benches/{entry.name}: {e}")
else:
logger.warning(f"Benches directory not found: {benches_folder_path}")
if benchmark_modules:
logger.info(f"Discovered {len(benchmark_modules)} benchmark(s): {benchmark_modules}")
else:
logger.warning("No benchmark modules found in benches/ directory")
for module_name in benchmark_modules:
module_path = os.path.join(benches_folder_path, module_name)
try:
logger.debug(f"loading: {module_name}")
module = import_from_path(module_name.split(".")[0], module_path)
logger.info(f"running benchmarks in: {module_name}")
# Check if the module has an updated run_benchmark function that accepts metrics_recorder
try:
# Try the new signature first
module.run_benchmark(logger, repository, branch, commit_id, commit_msg, global_metrics_recorder)
except TypeError:
# Fall back to the old signature for backward compatibility
logger.warning(
f"Module {module_name} using old run_benchmark signature - database connection will be created per module"
)
module.run_benchmark(logger, repository, branch, commit_id, commit_msg)
successful_benchmarks += 1
except ImportModuleException as e:
logger.error(e)
failed_benchmarks += 1
except Exception as e:
logger.error(f"error running benchmarks for {module_name}: {e}")
failed_benchmarks += 1
# Export CSV results at the end (if enabled)
try:
if generate_csv:
global_metrics_recorder.export_to_csv(csv_output_dir)
logger.info(f"CSV reports have been generated and saved to the {csv_output_dir} directory")
else:
logger.info("CSV generation disabled - no CSV files created (use --csv to enable)")
logger.info(f"Benchmark run completed. Successful: {successful_benchmarks}, Failed: {failed_benchmarks}")
except Exception as e:
logger.error(f"Failed to export CSV results: {e}")
finally:
global_metrics_recorder.close()
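Based on `parse_arguments` above, a typical invocation of this entrypoint might look like the following (all values are placeholders):

```bash
python benchmarks_entrypoint.py huggingface/transformers main abc123de "fix: some commit" --csv --csv-output-dir benchmark_results
```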


@ -19,7 +19,7 @@ backend:
model: meta-llama/Llama-2-7b-hf
cache_implementation: static
torch_compile: true
torch_dtype: float16
dtype: float16
torch_compile_config:
backend: inductor
mode: reduce-overhead

benchmark/default.yml (new file)

@ -0,0 +1,10 @@
apiVersion: 1
providers:
- name: 'Transformers Benchmarks'
orgId: 1
type: file
updateIntervalSeconds: 10
allowUiUpdates: true
options:
path: /etc/grafana/dashboards

(File diff suppressed because it is too large.)


@ -0,0 +1,17 @@
apiVersion: 1
datasources:
- name: grafana-postgresql-datasource
uid: be28nkzirtb0gd
type: postgres
url: $GRAFANA_POSTGRES_DATASOURCE_URL
user: $GRAFANA_POSTGRES_DATASOURCE_USER
secureJsonData:
password: $GRAFANA_POSTGRES_DATASOURCE_PWD
jsonData:
database: metrics
maxOpenConns: 100
maxIdleConns: 100
maxIdleConnsAuto: true
connMaxLifetime: 14400
postgresVersion: 1000
timescaledb: false


@ -3,7 +3,11 @@ import subprocess
def main(config_dir, config_name, args):
subprocess.run(["optimum-benchmark", "--config-dir", f"{config_dir}", "--config-name", f"{config_name}"] + ["hydra/job_logging=disabled", "hydra/hydra_logging=disabled"] + args)
subprocess.run(
["optimum-benchmark", "--config-dir", f"{config_dir}", "--config-name", f"{config_name}"]
+ ["hydra/job_logging=disabled", "hydra/hydra_logging=disabled"]
+ args
)
if __name__ == "__main__":


@ -0,0 +1,6 @@
gpustat==1.1.1
psutil==6.0.0
psycopg2==2.9.9
torch>=2.4.0
hf_xet
pandas>=1.5.0

benchmark_v2/.gitignore (new file)

@ -0,0 +1,2 @@
benchmark_results/
benchmark_results_profiles/

benchmark_v2/README.md (new file)

@ -0,0 +1,138 @@
# Benchmarking v2
A comprehensive benchmarking framework for transformer models that supports multiple execution modes (eager, compiled, kernelized), detailed performance metrics collection, and structured output format.
## Quick Start
### Running All Benchmarks
```bash
# Run all benchmarks with default settings
python run_benchmarks.py
# Specify output directory
python run_benchmarks.py --output-dir my_results
# Run with custom parameters
python run_benchmarks.py \
--warmup-iterations 5 \
--measurement-iterations 10 \
--num-tokens-to-generate 200
```
### Uploading Results to HuggingFace Dataset
You can automatically upload benchmark results to a HuggingFace Dataset for tracking and analysis:
```bash
# Upload to a public dataset with auto-generated run ID
python run_benchmarks.py --upload-to-hub username/benchmark-results
# Upload with a custom run ID for easy identification
python run_benchmarks.py --upload-to-hub username/benchmark-results --run-id experiment_v1
# Upload with custom HuggingFace token (if not set in environment)
python run_benchmarks.py --upload-to-hub username/benchmark-results --token hf_your_token_here
```
**Dataset Directory Structure:**
```
dataset_name/
├── 2025-01-15/
│ ├── runs/ # Non-scheduled runs (manual, PR, etc.)
│ │ └── 123-1245151651/ # GitHub run number and ID
│ │ └── benchmark_results/
│ │ ├── benchmark_summary_20250115_143022.json
│ │ └── model-name/
│ │ └── model-name_benchmark_20250115_143022.json
│ └── benchmark_results_abc123de/ # Scheduled runs (daily CI)
│ ├── benchmark_summary_20250115_143022.json
│ └── model-name/
│ └── model-name_benchmark_20250115_143022.json
└── 2025-01-16/
└── ...
```
**Authentication for Uploads:**
For uploading results, you need a HuggingFace token with write permissions to the target dataset. You can provide the token in several ways (in order of precedence):
1. Command line: `--token hf_your_token_here`
2. Environment variable: `HF_TOKEN`
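For instance, combining the environment variable with an upload (token and dataset names are placeholders):

```bash
HF_TOKEN=hf_your_token_here python run_benchmarks.py --upload-to-hub username/benchmark-results
```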
### Running Specific Benchmarks
```bash
# Include only specific benchmarks
python run_benchmarks.py --include llama
# Exclude specific benchmarks
python run_benchmarks.py --exclude old_benchmark
```

## Output Format
Results are saved as JSON files with the following structure:
```json
{
"model_name": "llama_2_7b",
"benchmark_scenarios": [
{
"scenario_name": "eager_variant",
"metadata": {
"timestamp": "2025-01-XX...",
"commit_id": "abc123...",
"hardware_info": {
"gpu_name": "NVIDIA A100",
"gpu_memory_total": 40960,
"cpu_count": 64
},
"config": {
"variant": "eager",
"warmup_iterations": 3,
"measurement_iterations": 5
}
},
"measurements": {
"latency": {
"mean": 2.45,
"median": 2.43,
"std": 0.12,
"min": 2.31,
"max": 2.67,
"p95": 2.61,
"p99": 2.65
},
"time_to_first_token": {
"mean": 0.15,
"std": 0.02
},
"tokens_per_second": {
"mean": 87.3,
"unit": "tokens/sec"
}
},
"gpu_metrics": {
"gpu_utilization_mean": 85.2,
"gpu_memory_used_mean": 12450
}
}
]
}
```
### Debug Mode
```bash
python run_benchmarks.py --log-level DEBUG
```
## Contributing
To add new benchmarks:
1. Create a new file in `benches/`
2. Implement the `ModelBenchmark` interface
3. Add a runner function (`run_<benchmark_name>` or `run_benchmark`)
4. Run `run_benchmarks.py`; benchmarks in `benches/` are discovered automatically (see the sketch below)
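A rough sketch of a new benchmark file follows; the exact runner signature may differ, so treat existing files in `benches/` as the authoritative reference:

```py
# benches/my_benchmark.py -- hypothetical example
import logging

def run_my_benchmark(logger: logging.Logger, output_dir: str, **kwargs):
    # Build BenchmarkConfig scenarios, run them, and write JSON results under `output_dir`.
    logger.info("running my_benchmark")
```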


@ -0,0 +1,215 @@
import hashlib
import json
import logging
from typing import Any
KERNELIZATION_AVAILABLE = False
try:
from kernels import Mode, kernelize # noqa: F401
KERNELIZATION_AVAILABLE = True
except ImportError:
pass
logger = logging.getLogger(__name__)
class BenchmarkConfig:
"""Configuration for a single benchmark scenario."""
def __init__(
self,
warmup_iterations: int = 5,
measurement_iterations: int = 20,
gpu_monitoring: bool = False, # False by default because it slows down the benchmark by a lot
batch_size: int = 1,
sequence_length: int = 128,
num_tokens_to_generate: int = 128,
attn_implementation: str = "eager",
sdpa_backend: str | None = None,
compile_mode: str | None = None,
compile_options: dict[str, Any] | None = None,
kernelize: bool = False,
name: str | None = None,
skip_validity_check: bool = False,
) -> None:
# Benchmark parameters
self.warmup_iterations = warmup_iterations
self.measurement_iterations = measurement_iterations
self.gpu_monitoring = gpu_monitoring
# Input parameters
self.batch_size = batch_size
self.sequence_length = sequence_length
self.num_tokens_to_generate = num_tokens_to_generate
# Generation parameters
self.attn_implementation = attn_implementation
self.sdpa_backend = sdpa_backend
# Optimization parameters
self.compile_mode = compile_mode
self.compile_options = compile_options if compile_options is not None else {}
self.kernelize = kernelize
# Constant parameters
self.dtype = "torch.bfloat16"
self.device = "cuda"
self.check_validity(skip_validity_check)
self.name = name if name is not None else self.infer_name()
def check_validity(self, skip_validity_check: bool = False) -> None:
if skip_validity_check:
return
# Flash attention does not support compile mode, so we turn it off # FIXME: it would be better to support it
is_fa = self.attn_implementation == "flash_attention_2"
is_fa |= self.attn_implementation == "sdpa" and self.sdpa_backend == "flash_attention"
if is_fa:
logger.warning("Flash attention does not support compile mode. Turning off compile mode.")
self.compile_mode = None
@property
def hash(self) -> str:
return hashlib.sha256(json.dumps(self.to_dict()).encode()).hexdigest()
def infer_name(self, compact: bool = True) -> str:
"""Infer a human-readable name for the benchmark config, either compact or verbose."""
if compact:
iter_str = f"w{self.warmup_iterations}_i{self.measurement_iterations}"
gpu_monitor_str = "monitored" if self.gpu_monitoring else "unmonitored"
dimensions_str = f"b{self.batch_size}_s{self.sequence_length}_n{self.num_tokens_to_generate}"
attn_code = self.attn_implementation
attn_code += f"_{self.sdpa_backend}" if self.attn_implementation == "sdpa" else ""
compile_str = f"compiled_{self.compile_mode}" if self.compile_mode is not None else "uncompiled"
kernelize_str = "kernelized" if self.kernelize else "unkernelized"
sep = "-"
else:
iter_str = f"{self.warmup_iterations} warmup, {self.measurement_iterations} iterations"
gpu_monitor_str = ("with" if self.gpu_monitoring else "no") + " GPU monitoring"
dimensions_str = f"batch size {self.batch_size}, sequence length {self.sequence_length}, {self.num_tokens_to_generate} generated tokens"
attn_code = f"{self.attn_implementation} attention"
attn_code += f" with {self.sdpa_backend} backend" if self.attn_implementation == "sdpa" else ""
compile_str = "compiled" if self.compile_mode is not None else "not compiled"
kernelize_str = "kernelized" if self.kernelize else "not kernelized"
sep = ", "
return sep.join([iter_str, gpu_monitor_str, dimensions_str, attn_code, compile_str, kernelize_str])
def to_dict(self) -> dict[str, Any]:
return {
"name": self.name,
"warmup_iterations": self.warmup_iterations,
"measurement_iterations": self.measurement_iterations,
"gpu_monitoring": self.gpu_monitoring,
"batch_size": self.batch_size,
"sequence_length": self.sequence_length,
"num_tokens_to_generate": self.num_tokens_to_generate,
"attn_implementation": self.attn_implementation,
"sdpa_backend": self.sdpa_backend,
"compile_mode": self.compile_mode,
"compile_options": self.compile_options | {}, # to avoid inplace modification of the original dict
"kernelize": self.kernelize,
}
@classmethod
def from_dict(cls, data: dict[str, Any], skip_validity_check: bool = False) -> "BenchmarkConfig":
return cls(
warmup_iterations=data.get("warmup_iterations", 5),
measurement_iterations=data.get("measurement_iterations", 20),
gpu_monitoring=data.get("gpu_monitoring", False),
batch_size=data.get("batch_size", 1),
sequence_length=data.get("sequence_length", 128),
num_tokens_to_generate=data.get("num_tokens_to_generate", 128),
attn_implementation=data.get("attn_implementation", "eager"),
sdpa_backend=data.get("sdpa_backend"),
compile_mode=data.get("compile_mode"),
compile_options=data.get("compile_options"),
kernelize=data.get("kernelize", False),
name=data.get("name"),
skip_validity_check=skip_validity_check,
)
def cross_generate_configs(
attn_impl_and_sdpa_backend: list[tuple[str, str | None]],
compiled_mode: list[str | None],
kernelized: list[bool],
warmup_iterations: int = 5,
measurement_iterations: int = 20,
batch_size: int = 1,
sequence_length: int = 128,
num_tokens_to_generate: int = 128,
gpu_monitoring: bool = False, # this slows down the benchmark by a lot so we disable it by default
) -> list[BenchmarkConfig]:
# Create kwargs common to all configs
kwargs = {
"warmup_iterations": warmup_iterations,
"measurement_iterations": measurement_iterations,
"batch_size": batch_size,
"sequence_length": sequence_length,
"num_tokens_to_generate": num_tokens_to_generate,
"gpu_monitoring": gpu_monitoring,
}
# Cross-generate all combinations of attn_implementation, compiled_mode, and kernelized
configs = []
for attn_implementation, sdpa_backend in list(dict.fromkeys(attn_impl_and_sdpa_backend)):
for cm in list(dict.fromkeys(compiled_mode)):
for kernelize_on in list(dict.fromkeys(kernelized)):
config = BenchmarkConfig(
attn_implementation=attn_implementation,
sdpa_backend=sdpa_backend,
compile_mode=cm,
kernelize=kernelize_on,
**kwargs,
)
configs.append(config)
return configs
def generate_all_configs(
warmup_iterations: int = 5,
measurement_iterations: int = 20,
batch_size: int = 1,
sequence_length: int = 128,
num_tokens_to_generate: int = 128,
gpu_monitoring: bool = False,
) -> list[BenchmarkConfig]:
all_attn_implementations = [
("flash_attention_2", None),
("eager", None),
("sdpa", "math"),
("sdpa", "flash_attention"),
("flex_attention", None),
]
return cross_generate_configs(
attn_impl_and_sdpa_backend=all_attn_implementations,
compiled_mode=[None, "default", "reduce-overhead", "max-autotune", "max-autotune-no-cudagraphs"],
kernelized=[False, KERNELIZATION_AVAILABLE],
warmup_iterations=warmup_iterations,
measurement_iterations=measurement_iterations,
batch_size=batch_size,
sequence_length=sequence_length,
num_tokens_to_generate=num_tokens_to_generate,
gpu_monitoring=gpu_monitoring,
)
def generate_main_configs(
warmup_iterations: int = 5,
measurement_iterations: int = 20,
batch_size: int = 1,
sequence_length: int = 128,
num_tokens_to_generate: int = 128,
gpu_monitoring: bool = False,
) -> list[BenchmarkConfig]:
# Create kwargs common to all configs
kwargs = {
"warmup_iterations": warmup_iterations,
"measurement_iterations": measurement_iterations,
"batch_size": batch_size,
"sequence_length": sequence_length,
"num_tokens_to_generate": num_tokens_to_generate,
"gpu_monitoring": gpu_monitoring,
}
return [ # TODO: test max-autotune instead of default
BenchmarkConfig(attn_implementation="flex_attention", compile_mode="default", **kwargs),
BenchmarkConfig(attn_implementation="eager", compile_mode="default", **kwargs),
BenchmarkConfig(attn_implementation="flash_attention_2", **kwargs),
]


@ -0,0 +1,389 @@
import gc
import json
import logging
import os
import pathlib
import re
import time
from contextlib import nullcontext
from datetime import datetime
from queue import Queue
from typing import Any
import torch
from tqdm import trange
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
CompileConfig,
GenerationConfig,
GenerationMixin,
)
from transformers.generation.streamers import BaseStreamer
from .benchmark_config import BenchmarkConfig
from .data_classes import BenchmarkMetadata, BenchmarkResult, GPURawMetrics, pretty_print_dict
from .hardware_metrics import GPUMonitor
try:
from kernels import Mode, kernelize # noqa: F401
except ImportError:
kernelize = None
Mode = None
DEFAULT_PROMPT = "\n".join([
"The French Revolution was a period of political and societal change in France that began with the Estates General of 1789 and ended with the Coup of 18 Brumaire on 9 November 1799.",
"Many of the revolution's ideas are considered fundamental principles of liberal democracy, and its values remain central to modern French political discourse.",
"It was caused by a combination of social, political, and economic factors which the existing regime proved unable to manage.",
"Financial crisis and widespread social distress led to the convocation of the Estates General in May 1789, its first meeting since 1614.",
"The representatives of the Third Estate broke away and re-constituted themselves as a National Assembly in June.",
"The Storming of the Bastille in Paris on 14 July led to a series of radical measures by the Assembly, including the abolition of feudalism, state control over the Catholic Church in France, and issuing the Declaration of the Rights of Man and of the Citizen.",
"The next three years were dominated by a struggle for political control.",
"King Louis XVI's attempted flight to Varennes in June 1791 further discredited the monarchy, and military defeats after the outbreak of the French Revolutionary Wars in April 1792 led to the insurrection of 10 August 1792.",
"As a result, the monarchy was replaced by the French First Republic in September, followed by the execution of Louis XVI himself in January 1793.",
"After another revolt in June 1793, the constitution was suspended, and political power passed from the National Convention to the Committee of Public Safety, dominated by radical Jacobins led by Maximilien Robespierre.",
"About 16,000 people were sentenced by the Revolutionary Tribunal and executed in the Reign of Terror, which ended in July 1794 with the Thermidorian Reaction.",
"Weakened by external threats and internal opposition, the Committee of Public Safety was replaced in November 1795 by the Directory.",
"Its instability ended in the coup of 18 Brumaire and the establishment of the Consulate, with Napoleon Bonaparte as First Consul.",
]) # fmt: skip
def compact_json_numeric_arrays(data: dict):
# Match arrays that contain only numbers (ints/floats), whitespace, commas, and newlines
pattern = r"\[\s*\n\s*((?:\d+(?:\.\d+)?\s*,\s*)*\d+(?:\.\d+)?)\s*\n\s*\]"
def replace_numeric_array(match):
# Get the array content
content = match.group(1)
# Remove extra whitespace but keep commas
compact_content = re.sub(r"\s+", " ", content).strip()
return f"[{compact_content}]"
return re.sub(pattern, replace_numeric_array, json.dumps(data, indent=4, default=str), flags=re.DOTALL)
def get_git_revision() -> str:
base_path = pathlib.Path(__file__).parent.parent.parent
git_dir = base_path / ".git"
with (git_dir / "HEAD").open("r") as head:
ref = head.readline().split(" ")[-1].strip()
with (git_dir / ref).open("r") as git_hash:
return git_hash.readline().strip()
def get_sdpa_backend(backend_name: str | None) -> torch.nn.attention.SDPBackend | None:
"""Get the SDPA backend enum from string name."""
if backend_name is None:
return None
try:
backend_map = {
"math": torch.nn.attention.SDPBackend.MATH,
"flash_attention": torch.nn.attention.SDPBackend.FLASH_ATTENTION,
"efficient_attention": torch.nn.attention.SDPBackend.EFFICIENT_ATTENTION,
"cudnn_attention": torch.nn.attention.SDPBackend.CUDNN_ATTENTION,
}
return backend_map.get(backend_name.lower())
except AttributeError:
# torch.nn.attention.SDPBackend not available in older torch versions
return None
def flush_memory():
"""Flush GPU memory and run garbage collection."""
gc.collect()
# Dynamo resets
torch._dynamo.reset()
torch._dynamo.reset_code_caches()
if hasattr(torch._inductor, "codecache"):
# Clear FX graph cache
if hasattr(torch._inductor.codecache, "FxGraphCache"):
torch._inductor.codecache.FxGraphCache.clear()
# Clear PyCodeCache
if hasattr(torch._inductor.codecache, "PyCodeCache"):
torch._inductor.codecache.PyCodeCache.cache_clear()
# Clear TritonFuture cache (for async compilation)
if hasattr(torch._inductor.codecache, "TritonFuture"):
if hasattr(torch._inductor.codecache.TritonFuture, "_compile_cache"):
torch._inductor.codecache.TritonFuture._compile_cache.clear()
# Clear CUDA cache
if torch.cuda.is_available():
torch.cuda.empty_cache()
torch.cuda.reset_max_memory_allocated()
torch.cuda.reset_peak_memory_stats()
torch.cuda.synchronize()
gc.collect()
class BenchmarkStreamer(BaseStreamer):
def __init__(self, **kwargs) -> None:
self.timestamps = []
self.text_queue = Queue()
# These two are read by __next__ below; without them iteration would raise AttributeError
self.stop_signal = None
self.timeout = None
def put(self, value):
"""Receives tokens and logs the timestamp of the generation."""
self.timestamps.append(time.perf_counter())
def end(self):
self.timestamps.append(time.perf_counter())
def __iter__(self):
return self
def __next__(self):
value = self.text_queue.get(timeout=self.timeout)
if value == self.stop_signal:
raise StopIteration()
else:
return value
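# Sketch of how the streamer's timestamps are consumed (illustrative only):
# generate() calls put() once per decoding step, so the gaps between
# consecutive timestamps approximate per-step latencies.
def _demo_streamer_gaps(streamer: BenchmarkStreamer) -> list[float]:
    ts = streamer.timestamps
    return [t1 - t0 for t0, t1 in zip(ts, ts[1:])]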
class BenchmarkRunner:
"""Main benchmark runner that coordinates benchmark execution."""
def __init__(self, logger: logging.Logger, output_dir: str | None = None, commit_id: str | None = None) -> None:
# Those stay constant for the whole run
self.logger = logger
if output_dir is None:
output_dir = os.path.join(os.path.dirname(os.path.dirname(__file__)), "benchmark_results")
self.output_dir = output_dir
self.commit_id = get_git_revision() if commit_id is None else commit_id
os.makedirs(self.output_dir, exist_ok=True)
self.profile_dir = None
# Attributes that are reset for each model
self._setup_for = ""
# Attributes that are reset for each run
self.model: GenerationMixin | None = None
def cleanup(self) -> None:
del self.model
self.model = None
flush_memory()
def setup_one_run(self, model_id: str, config: BenchmarkConfig) -> None:
# Some attributes only need to be set once per model
if self._setup_for != model_id:
self.tokenizer = AutoTokenizer.from_pretrained(model_id)
# We set the EOS token to the padding token for open-ended generation
self.tokenizer.eos_token = self.tokenizer.pad_token
self._setup_for = model_id
# Prepare inputs
self.inputs = self.tokenizer(
[DEFAULT_PROMPT for _ in range(config.batch_size)],
return_tensors="pt",
max_length=config.sequence_length,
truncation=True,
return_attention_mask=True,
).to(config.device)
self.inputs["use_cache"] = True
# Prepare generation config
gen_config = GenerationConfig(
do_sample=False, top_p=1.0, temperature=1.0, max_new_tokens=config.num_tokens_to_generate
)
# Prepare compile config
if config.compile_mode is not None:
gen_config.compile_config = CompileConfig(mode=config.compile_mode, options=config.compile_options)
gen_config.cache_implementation = "static"
# Load model
self.logger.debug(f"Loading model {model_id} on device {config.device}...")
dtype = getattr(torch, config.dtype.removeprefix("torch."))
self.model = AutoModelForCausalLM.from_pretrained(
model_id, dtype=dtype, attn_implementation=config.attn_implementation, generation_config=gen_config
)
self.model = self.model.eval().to(config.device)
# Kernelize the model if needed
if config.kernelize:
self.model = kernelize(self.model, mode=Mode.INFERENCE)
def run_one_benchmark(self, model_id: str, config: BenchmarkConfig, num_tokens_to_profile: int = 0) -> dict[str, Any] | None:
sdpa_ctx = nullcontext()
if config.attn_implementation == "sdpa":
sdpa_backend = get_sdpa_backend(config.sdpa_backend)
sdpa_ctx = torch.nn.attention.sdpa_kernel(sdpa_backend)
with sdpa_ctx, torch.no_grad():
self.logger.info(f"Running benchmark scenario: {config.name}")
# Quick validation: try one measurement first to see if this scenario works
flush_memory()
e2e_latency, token_generation_times, shape_and_decoded_output, gpu_metrics = self.time_generate(
max_new_tokens=1, gpu_monitor=None
)
if e2e_latency < 0:
self.logger.warning(f"Skipping config {config.name}: {e2e_latency = } (no GPU monitoring)")
return None
# Warmup runs
self.logger.info(f"Warming up with {config.warmup_iterations} iterations...")
for _ in trange(config.warmup_iterations):
_ = self.time_generate(max_new_tokens=config.num_tokens_to_generate)
self.logger.info("Warmup over.")
# Measurement runs
result = BenchmarkResult()
self.logger.info(f"Benchmarking with {config.measurement_iterations} iterations.")
for _ in trange(config.measurement_iterations):
e2e_latency, token_generation_times, shape_and_decoded_output, gpu_metrics = self.time_generate(
max_new_tokens=config.num_tokens_to_generate,
gpu_monitor=(GPUMonitor(logger=self.logger) if config.gpu_monitoring else None),
)
result.accumulate(e2e_latency, token_generation_times, shape_and_decoded_output, gpu_metrics)
self.logger.info("Benchmarking done. Cleaning up.")
# Profile if needed
if num_tokens_to_profile > 0:
self.profile_generate(num_tokens_to_profile, config.name)
return {
"metadata": BenchmarkMetadata(model_id=model_id, commit_id=self.commit_id),
"measurements": result,
"config": config,
}
def time_generate(
self,
max_new_tokens: int,
gpu_monitor: GPUMonitor | None = None,
) -> tuple[float, list[float], str, GPURawMetrics | None]:
"""Time the latency of a call to model.generate() with the given (inputs) and (max_new_tokens)."""
# Prepare gpu monitoring if needed
if gpu_monitor is not None:
gpu_monitor.start()
# Prepare streamer
streamer = BenchmarkStreamer()
# Generate and time
wall_time_0 = time.perf_counter()
outputs = self.model.generate(
**self.inputs,
max_new_tokens=max_new_tokens,
streamer=streamer,
)
wall_time_1 = time.perf_counter()
# Stop gpu monitoring if needed
gpu_metrics = gpu_monitor.stop_and_collect() if gpu_monitor is not None else None
# Check if generation had the right number of tokens
input_tokens = self.inputs["input_ids"].size(-1)
batch_size, output_tokens = outputs.shape
new_tokens = output_tokens - input_tokens
if new_tokens != max_new_tokens:
raise RuntimeError(f"Generated {new_tokens} tokens, expected {max_new_tokens}")
# Decode outputs
decoded_output = self.tokenizer.decode(outputs[0, input_tokens:], skip_special_tokens=True)
shape_and_decoded_output = f"{tuple(outputs.shape)} | {decoded_output}"
# Compute intermediate quantities
e2e_latency = wall_time_1 - wall_time_0
token_generation_times = [t - wall_time_0 for t in streamer.timestamps[1:]]
return e2e_latency, token_generation_times, shape_and_decoded_output, gpu_metrics
def profile_generate(self, num_tokens_to_profile: int, config_name: str) -> None:
"""Profile the latency of a call to model.generate() with the given (inputs) and (max_new_tokens)."""
profiler = torch.profiler.profile(
activities=[torch.profiler.ProfilerActivity.CPU, torch.profiler.ProfilerActivity.CUDA],
record_shapes=True,
)
with profiler as prof:
_ = self.model.generate(
**self.inputs,
max_new_tokens=num_tokens_to_profile,
)
if self.profile_dir is None:
self.profile_dir = self.output_dir + "_profiles"
os.makedirs(self.profile_dir, exist_ok=True)
prof.export_chrome_trace(f"{self.profile_dir}/{config_name}.json")
def run_benchmarks(
self,
model_id: str,
benchmark_configs: list[BenchmarkConfig],
num_tokens_to_profile: int = 0,
pretty_print_summary: bool = True,
) -> dict[str, Any]:
all_results = {}
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
start_time = time.perf_counter()
n_configs = len(benchmark_configs)
for i, config in enumerate(benchmark_configs):
# Handle SDPA backend if not determined by the config (needs to be done before skipping duplicates)
if config.attn_implementation == "sdpa" and config.sdpa_backend is None:
default_backend = "flash_attention" # FIXME: torch has a _cur_sdpa_kernel_backends but it fails
self.logger.warning(f"No SDPA backend provided, using {default_backend} instead.")
config.sdpa_backend = default_backend
# Skip if already run
if config.hash in all_results:
self.logger.info(f"Skipping duplicate config {config.name} for model {model_id} ({i + 1}/{n_configs})")
continue
# Otherwise, run the benchmark
self.setup_one_run(model_id, config)
self.logger.info(
f"Running benchmark of model {model_id} with scenario: {config.name} ({i + 1}/{n_configs})"
)
# Launch benchmark in a try/except block to avoid stopping the whole run if one benchmark fails
try:
results = self.run_one_benchmark(model_id, config, num_tokens_to_profile)
if results is not None:
all_results[config.hash] = results
except Exception as e:
self.logger.error(f"Error running with scenario: {config.name}:\n{repr(e)}")
# Cleanup model and save results
self.cleanup()
self.save_results(model_id, all_results, timestamp=timestamp)
if pretty_print_summary:
print()
print("=" * 100)
print(f"Finished benchmarks in {time.perf_counter() - start_time:.2f} seconds")
print(f"Total number of benchmarks: {len(all_results)}")
if len(all_results) > 0:
print("First run metadata:")
first_key = list(all_results.keys())[0]
first_metadata = all_results[first_key]["metadata"].to_dict()
hardware_info = first_metadata.pop("hardware_info")
pretty_print_dict(first_metadata | hardware_info, tabs=1)
for result in all_results.values():
print("=" * 100)
print(f"Config: {result['config'].infer_name(compact=False)}\n")
result["measurements"].pprint(batch_size=result["config"].batch_size, tabs=1)
print("=" * 100)
return all_results
def save_results(self, model_name: str, results: dict, timestamp: str = "") -> str:
"""Save benchmark results to JSON file."""
# Create model-specific subdirectory
model_name = model_name.replace("/", "_")
model_dir = os.path.join(self.output_dir, model_name)
os.makedirs(model_dir, exist_ok=True)
# Create filename with timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S") if not timestamp else timestamp
filename = f"{model_name}_benchmark_{timestamp}.json"
filepath = os.path.join(model_dir, filename)
# Convert results to dict
converted_results = {}
for cfg_hash in results.keys():
converted_results[cfg_hash] = {
"metadata": results[cfg_hash]["metadata"].to_dict(),
"measurements": results[cfg_hash]["measurements"].to_dict(),
"config": results[cfg_hash]["config"].to_dict(),
}
# Save to JSON file
with open(filepath, "w") as f:
f.write(compact_json_numeric_arrays(converted_results))
self.logger.info(f"Results saved to {filepath}")
return filepath


@ -0,0 +1,160 @@
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any
import numpy as np
from .hardware_metrics import GPURawMetrics, HardwareInfo
def compute_basic_statistics(measurements: list[float]) -> dict[str, float]:
return {
"avg": np.mean(measurements),
"std": np.std(measurements),
"min": np.min(measurements),
"med": np.median(measurements),
"max": np.max(measurements),
"p95": np.percentile(measurements, 95),
}
def add_unit_to_duration(stats: dict[str, float]) -> dict[str, str]:
for key in list(stats.keys()):
value = stats[key]
if value > 3600:
stats[key] = f"{(value / 3600):.2f}hr"
elif value > 60:
stats[key] = f"{(value / 60):.2f}min"
elif value > 1:
stats[key] = f"{value:.2f}s"
elif value > 1e-3:
stats[key] = f"{(value * 1e3):.2f}ms"
elif value > 1e-6:
stats[key] = f"{(value * 1e6):.2f}us"
else:
stats[key] = f"{(value * 1e9):.2f}ns"
return stats
def equalize_lengths_and_collate(stats: list[dict[str, str]]) -> list[str]:
keys = ["avg", "std", "min", "med", "max", "p95"]
for key in keys:
max_length = max(len(stat[key]) for stat in stats)
for stat in stats:
stat[key] = stat[key].ljust(max_length, " ")
return [" ".join([f"{key}={stat[key]}" for key in keys]) for stat in stats]
def pretty_print_dict(data: dict[str, Any], tabs: int = 0) -> None:
max_key_length = max([len(key) for key in data.keys()])
for key, value in data.items():
tabs_str = " " * tabs
padded_key = key.ljust(max_key_length + 1, ".")
print(f"{tabs_str}{padded_key}: {value}")
@dataclass
class BenchmarkMetadata:
"""Metadata collected for each benchmark run."""
model_id: str
timestamp: str
commit_id: str
hardware_info: HardwareInfo
def __init__(self, model_id: str, commit_id: str):
self.model_id = model_id
self.timestamp = datetime.now(timezone.utc).isoformat()  # datetime.utcnow() is deprecated
self.commit_id = commit_id
self.hardware_info = HardwareInfo()
def to_dict(self) -> dict[str, Any]:
return {
"model_id": self.model_id,
"timestamp": self.timestamp,
"commit_id": self.commit_id,
"hardware_info": self.hardware_info.to_dict(),
}
class BenchmarkResult:
"""Result from a series of benchmark runs."""
def __init__(self) -> None:
self.e2e_latency = []
self.token_generation_times = [] # time at which each token was generated (relative to start of the generation)
self.shape_and_decoded_outputs = []
self.gpu_metrics = []
def accumulate(
self,
e2e_latency: float,
token_generation_times: list[float],
shape_and_decoded_output: str,
gpu_metrics: GPURawMetrics | None,
) -> None:
self.e2e_latency.append(e2e_latency)
self.token_generation_times.append(token_generation_times)
self.shape_and_decoded_outputs.append(shape_and_decoded_output)
self.gpu_metrics.append(gpu_metrics)
def to_dict(self) -> dict[str, Any]:
# Save GPU metrics as None if it contains only None values
if all(gm is None for gm in self.gpu_metrics):
gpu_metrics = None
else:
gpu_metrics = [gm.to_dict() for gm in self.gpu_metrics]
return {
"e2e_latency": self.e2e_latency,
"token_generation_times": self.token_generation_times,
"shape_and_decoded_outputs": self.shape_and_decoded_outputs,
"gpu_metrics": gpu_metrics,
}
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "BenchmarkResult":
# Handle GPU metrics, which is saved as None if it contains only None values
if data["gpu_metrics"] is None:
gpu_metrics = [None for _ in range(len(data["e2e_latency"]))]
else:
gpu_metrics = [GPURawMetrics.from_dict(gm) for gm in data["gpu_metrics"]]
# Create a new instance and accumulate the data
new_instance = cls()
for i in range(len(data["e2e_latency"])):
new_instance.accumulate(
e2e_latency=data["e2e_latency"][i],
token_generation_times=data["token_generation_times"][i],
shape_and_decoded_output=data["shape_and_decoded_outputs"][i],
gpu_metrics=gpu_metrics[i],
)
return new_instance
def get_measured_ttft(self) -> list[float]:
return [dt[0] for dt in self.token_generation_times if len(dt) > 0]
def get_measured_itl(self) -> list[float]:
return [(dt[-1] - dt[0]) / (len(dt) - 1) for dt in self.token_generation_times if len(dt) > 1]
def get_throughput(self, batch_size: int) -> list[float]:
return [
batch_size * len(dt) / e2e_latency
for e2e_latency, dt in zip(self.e2e_latency, self.token_generation_times)
]
def pprint(self, batch_size: int = 0, tabs: int = 0) -> None:
stats_to_collate = [
add_unit_to_duration(compute_basic_statistics(self.e2e_latency)),
add_unit_to_duration(compute_basic_statistics(self.get_measured_ttft())),
add_unit_to_duration(compute_basic_statistics(self.get_measured_itl())),
]
if batch_size > 0:
throughput_stats = compute_basic_statistics(self.get_throughput(batch_size))
stats_to_collate.append({key: f"{value:.2f}tok/s" for key, value in throughput_stats.items()})
collated_stats = equalize_lengths_and_collate(stats_to_collate)
dict_to_pprint = {
"E2E Latency": collated_stats[0],
"Time to First Token": collated_stats[1],
"Inter-Token Latency": collated_stats[2],
}
if batch_size > 0:
dict_to_pprint["Throughput"] = collated_stats[3]
pretty_print_dict(dict_to_pprint, tabs=tabs)
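# End-to-end sketch with made-up measurements (illustrative values, not from a
# real run): for token_generation_times = [0.5, 0.6, 0.7, 0.8], TTFT is 0.5s
# and ITL is (0.8 - 0.5) / 3 = 0.1s per token after the first.
def _demo_benchmark_result() -> None:
    result = BenchmarkResult()
    result.accumulate(
        e2e_latency=1.2,
        token_generation_times=[0.5, 0.6, 0.7, 0.8],
        shape_and_decoded_output="(1, 132) | <decoded text>",
        gpu_metrics=None,
    )
    # get_measured_ttft() -> [0.5]; get_measured_itl() -> [~0.1]
    result.pprint(batch_size=1)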


@ -0,0 +1,171 @@
import json
import logging
import subprocess
import sys
import threading
import time
from dataclasses import dataclass
from enum import Enum
from logging import Logger
import gpustat
import psutil
import torch
# Helpers to collect information about the hardware
def get_device_name_and_memory_total() -> tuple[str, float]:
"""Returns the name and total memory (in GB) of GPU 0."""
device_name = torch.cuda.get_device_properties(0).name
device_memory_total = torch.cuda.get_device_properties(0).total_memory / 1024**3
return device_name, device_memory_total
class HardwareInfo:
"""A class to hold information about the hardware."""
def __init__(self) -> None:
# Retrieve GPU stats
try:
self.gpu_name, self.gpu_memory_total_gb = get_device_name_and_memory_total()
except Exception:
self.gpu_name, self.gpu_memory_total_gb = None, None
# Retrieve python, torch and CUDA version
self.python_version = f"{sys.version.split()[0]}"
self.torch_version = torch.__version__
if hasattr(torch, "cuda") and torch.cuda.is_available():
self.cuda_version = torch.version.cuda
else:
self.cuda_version = None
# Retrieve general hardware information
self.cpu_count = psutil.cpu_count()
self.memory_total_mb = int(psutil.virtual_memory().total / (1024 * 1024))
def to_dict(self) -> dict[str, None | int | float | str]:
return {
"gpu_name": self.gpu_name,
"gpu_memory_total_gb": self.gpu_memory_total_gb,
"python_version": self.python_version,
"torch_version": self.torch_version,
"cuda_version": self.cuda_version,
"cpu_count": self.cpu_count,
"memory_total_mb": self.memory_total_mb,
}
# Functions to get information about the GPU
def get_amd_gpu_stats() -> tuple[int, float]:
"""Returns the utilization and memory used of an AMD GPU, both in percent"""
rocm_smi_output = subprocess.check_output(["rocm-smi", "--json", "--showuse", "--showmeminfo", "VRAM"])
gpu_stats = json.loads(rocm_smi_output.decode("utf-8"))
gpu_stats = [
(card_id, stats["GPU use (%)"], stats["VRAM Total Used Memory (B)"]) for card_id, stats in gpu_stats.items()
]
gpu_stats.sort(key=lambda x: x[1], reverse=True)
return int(gpu_stats[0][1]), float(gpu_stats[0][2]) / 1024**3
def get_nvidia_gpu_stats() -> tuple[int, float]:
"""Returns the utilization and memory used of an NVIDIA GPU, both in percent"""
gpu_stats = gpustat.GPUStatCollection.new_query()
gpu_stats = gpu_stats[0]
return int(gpu_stats["utilization.gpu"]), float(gpu_stats["memory.used"]) / 1024**3
class GPUStatsCollector:
"""A class to get statistics about the GPU. It serves as a wrapper that holds the GPU total memory and its name,
which is used to call the right function to get the utilization and memory used."""
def __init__(self) -> None:
self.device_name, self.device_memory_total = get_device_name_and_memory_total()
# Monkey patch the get_utilization_and_memory_used method based on the GPU type
if "amd" in self.device_name.lower():
self.get_utilization_and_memory_used = get_amd_gpu_stats
elif "nvidia" in self.device_name.lower():
self.get_utilization_and_memory_used = get_nvidia_gpu_stats
else:
raise RuntimeError(f"Unsupported GPU: {self.device_name}")
def get_measurements(self) -> tuple[int, float]:
"""Get the utilization and memory used of the GPU, both in percent"""
raise NotImplementedError("This method is meant to be monkey patched during __init__")
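# Usage sketch: after __init__, get_utilization_and_memory_used is bound to the
# vendor-specific module-level function, so the NotImplementedError above is
# never reached on supported GPUs.
def _demo_gpu_stats() -> tuple[int, float]:
    collector = GPUStatsCollector()  # raises RuntimeError on unsupported GPUs
    return collector.get_utilization_and_memory_used()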
# Simple data classes to hold the raw GPU metrics
class GPUMonitoringStatus(Enum):
"""Status of GPU monitoring."""
SUCCESS = "success"
FAILED = "failed"
NO_GPUS_AVAILABLE = "no_gpus_available"
NO_SAMPLES_COLLECTED = "no_samples_collected"
@dataclass
class GPURawMetrics:
"""Raw values for GPU utilization and memory used."""
# monitoring_status is listed first so the sample fields can default to None,
# which is what stop_and_collect passes when no samples were collected
monitoring_status: GPUMonitoringStatus
utilization: list[float] | None = None  # in percent
memory_used: list[float] | None = None  # in GB
timestamps: list[float] | None = None  # in seconds
timestamp_0: float | None = None  # in seconds
def to_dict(self) -> dict[str, None | int | float | str]:
return {
"utilization": self.utilization,
"memory_used": self.memory_used,
"timestamps": self.timestamps,
"timestamp_0": self.timestamp_0,
"monitoring_status": self.monitoring_status.value,
}
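# Note: BenchmarkResult.from_dict (in benchmark_result.py) calls a
# GPURawMetrics.from_dict that this diff does not show; a minimal sketch of
# the inverse of to_dict (an assumption, not the file's original code):
@classmethod
def from_dict(cls, data: dict) -> "GPURawMetrics":
    data = dict(data)
    data["monitoring_status"] = GPUMonitoringStatus(data["monitoring_status"])
    return cls(**data)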
# Main class, used to monitor the GPU utilization during benchmark execution
class GPUMonitor:
"""Monitor GPU utilization during benchmark execution."""
def __init__(self, sample_interval_sec: float = 0.1, logger: Logger | None = None):
self.sample_interval_sec = sample_interval_sec
self.logger = logger if logger is not None else logging.getLogger(__name__)
self.num_available_gpus = torch.cuda.device_count()
if self.num_available_gpus == 0:
raise RuntimeError("No GPUs detected by torch.cuda.device_count().")
self.gpu_stats_getter = GPUStatsCollector()
def start(self):
"""Start monitoring GPU metrics."""
# Create a fresh stop event and sample buffers for this monitoring session
self.stop_event = threading.Event()
self.gpu_utilization = []
self.gpu_memory_used = []
self.timestamps = []
self.thread = threading.Thread(target=self._monitor_loop)
self.thread.start()
self.logger.debug("GPU monitoring started")
def stop_and_collect(self) -> GPURawMetrics:
"""Stop monitoring and return collected metrics."""
self.stop_event.set()
self.thread.join()
if self.gpu_utilization:
timestamp_0 = self.timestamps[0]
metrics = GPURawMetrics(
utilization=self.gpu_utilization,
memory_used=self.gpu_memory_used,
timestamps=[t - timestamp_0 for t in self.timestamps],
timestamp_0=timestamp_0,
monitoring_status=GPUMonitoringStatus.SUCCESS,
)
self.logger.debug(f"GPU monitoring completed: {len(self.gpu_utilization)} samples collected")
else:
metrics = GPURawMetrics(monitoring_status=GPUMonitoringStatus.NO_SAMPLES_COLLECTED)
return metrics
def _monitor_loop(self):
"""Background monitoring loop using threading.Event for communication."""
while not self.stop_event.is_set():
utilization, memory_used = self.gpu_stats_getter.get_utilization_and_memory_used()
self.gpu_utilization.append(utilization)
self.gpu_memory_used.append(memory_used)
self.timestamps.append(time.time())
if self.stop_event.wait(timeout=self.sample_interval_sec):
break
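# Usage sketch (assumes at least one CUDA device is visible): sampling runs in
# a background thread between start() and stop_and_collect().
def _demo_gpu_monitor() -> GPURawMetrics:
    monitor = GPUMonitor(sample_interval_sec=0.1)
    monitor.start()
    time.sleep(1.0)  # stand-in for the workload being measured
    return monitor.stop_and_collect()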


@ -0,0 +1,7 @@
numpy>=1.21.0
psutil>=5.8.0
gpustat>=1.0.0
torch>=2.0.0
transformers>=4.30.0
datasets>=2.10.0
huggingface_hub>=0.16.0

benchmark_v2/run_benchmarks.py Executable file

@ -0,0 +1,116 @@
#!/usr/bin/env python3
# Copyright 2025 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Top-level benchmarking script that generates benchmark configurations, runs them with
BenchmarkRunner, and organizes outputs into model-specific subfolders.
"""
import argparse
import logging
import sys
import uuid
from framework.benchmark_config import BenchmarkConfig, generate_all_configs, generate_main_configs
from framework.benchmark_runner import BenchmarkRunner
if __name__ == "__main__":
# Parse arguments
parser = argparse.ArgumentParser()
parser.add_argument("--output-dir", type=str, default=None, help="Output dir for benchmark results")
parser.add_argument("--log-level", type=str, choices=["DEBUG", "INFO", "WARNING", "ERROR"], default="INFO")
parser.add_argument("--model-id", type=str, help="Specific model ID to benchmark (if supported by benchmarks)")
parser.add_argument("--warmup", type=int, default=3, help="Number of warmup iterations")
parser.add_argument("--iterations", type=int, default=10, help="Number of measurement iterations")
parser.add_argument("--batch-size", "-b", type=int, nargs="+", help="Batch size")
parser.add_argument("--sequence-length", "-s", type=int, nargs="+", help="Sequence length")
parser.add_argument("--num-tokens-to-generate", "-n", type=int, nargs="+", help="Number of tokens to generate")
parser.add_argument("--cross-generate", action="store_true", help="Cross-generate all combinations of configs")
parser.add_argument("--num-tokens-to-profile", "-p", type=int, default=0, help="Number of tokens to profile")
parser.add_argument("--commit-id", type=str, help="Git commit ID (if not provided, will auto-detect from git)")
args = parser.parse_args()
# Setup logging
benchmark_run_uuid = str(uuid.uuid4())[:8]
numeric_level = getattr(logging, args.log_level.upper())
handlers = [logging.StreamHandler(sys.stdout)]
logging.basicConfig(
level=numeric_level, format="[%(levelname)s - %(asctime)s] %(name)s: %(message)s", handlers=handlers
)
logger = logging.getLogger("benchmark_v2")
logger.info("Starting benchmark discovery and execution")
logger.info(f"Benchmark run UUID: {benchmark_run_uuid}")
logger.info(f"Output directory: {args.output_dir}")
# Error out if one of the required arguments is not provided
if args.batch_size is None or args.sequence_length is None or args.num_tokens_to_generate is None:
raise ValueError(
"The arguments --batch-size, --sequence-length and --num-tokens-to-generate are all required"
)
# If there is only one (batch_size, sequence_length, num_tokens_to_generate), we benchmark across configs
elif len(args.batch_size) * len(args.sequence_length) * len(args.num_tokens_to_generate) == 1:
if args.cross_generate:
benchmark_configs = generate_all_configs(
warmup_iterations=args.warmup,
measurement_iterations=args.iterations,
batch_size=args.batch_size[0],
sequence_length=args.sequence_length[0],
num_tokens_to_generate=args.num_tokens_to_generate[0],
)
else:
benchmark_configs = generate_main_configs(
warmup_iterations=args.warmup,
measurement_iterations=args.iterations,
batch_size=args.batch_size[0],
sequence_length=args.sequence_length[0],
num_tokens_to_generate=args.num_tokens_to_generate[0],
)
# Otherwise, we benchmark across all combinations of dimensions
else:
main_config = generate_main_configs(
warmup_iterations=args.warmup,
measurement_iterations=args.iterations,
batch_size=args.batch_size[0],
sequence_length=args.sequence_length[0],
num_tokens_to_generate=args.num_tokens_to_generate[0],
)[0]
benchmark_configs = []
for num_tokens_to_generate in args.num_tokens_to_generate:
for sequence_length in args.sequence_length:
for batch_size in args.batch_size:
cfg_dict = main_config.to_dict()
cfg_dict["batch_size"] = batch_size
cfg_dict["sequence_length"] = sequence_length
cfg_dict["num_tokens_to_generate"] = num_tokens_to_generate
cfg_dict.pop("name")
benchmark_configs.append(BenchmarkConfig.from_dict(cfg_dict))
runner = BenchmarkRunner(logger, args.output_dir, args.commit_id)
results = runner.run_benchmarks(
args.model_id,
benchmark_configs,
args.num_tokens_to_profile,
pretty_print_summary=True,
)
# runner.save_results(args.model_id, results)
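# Example invocations (model id and shapes are illustrative):
#   python run_benchmarks.py --model-id meta-llama/Llama-2-7b-hf -b 1 -s 128 -n 64
#   python run_benchmarks.py --model-id meta-llama/Llama-2-7b-hf -b 1 2 -s 128 256 -n 64
# The first benchmarks the main configs for a single input shape; the second
# sweeps the cross-product of all provided batch sizes and sequence lengths.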


@ -16,6 +16,7 @@
# by pytest before any tests are run
import doctest
import os
import sys
import warnings
from os.path import abspath, dirname, join
@ -23,12 +24,18 @@ from os.path import abspath, dirname, join
import _pytest
import pytest
from transformers.testing_utils import HfDoctestModule, HfDocTestParser
from transformers.testing_utils import (
HfDoctestModule,
HfDocTestParser,
is_torch_available,
patch_testing_methods_to_collect_info,
patch_torch_compile_force_graph,
)
NOT_DEVICE_TESTS = {
"test_tokenization",
"test_processor",
"test_tokenization_mistral_common",
"test_processing",
"test_beam_constraints",
"test_configuration_utils",
@ -46,12 +53,7 @@ NOT_DEVICE_TESTS = {
"test_keep_in_fp32_modules",
"test_gradient_checkpointing_backward_compatibility",
"test_gradient_checkpointing_enable_disable",
"test_save_load_fast_init_from_base",
"test_fast_init_context_manager",
"test_fast_init_tied_embeddings",
"test_save_load_fast_init_to_base",
"test_torch_save_load",
"test_initialization",
"test_forward_signature",
"test_model_get_set_embeddings",
"test_model_main_input_name",
@ -61,17 +63,12 @@ NOT_DEVICE_TESTS = {
"test_load_save_without_tied_weights",
"test_tied_weights_keys",
"test_model_weights_reload_no_missing_tied_weights",
"test_pt_tf_model_equivalence",
"test_mismatched_shapes_have_properly_initialized_weights",
"test_matched_shapes_have_loaded_weights_when_some_mismatched_shapes_exist",
"test_can_load_ignoring_mismatched_shapes",
"test_model_is_small",
"test_tf_from_pt_safetensors",
"test_flax_from_pt_safetensors",
"ModelTest::test_pipeline_", # None of the pipeline tests from PipelineTesterMixin (of which XxxModelTest inherits from) are running on device
"ModelTester::test_pipeline_",
"/repo_utils/",
"/utils/",
"/agents/",
}
# allow having multiple repository checkouts and not needing to remember to rerun
@ -85,17 +82,14 @@ warnings.simplefilter(action="ignore", category=FutureWarning)
def pytest_configure(config):
config.addinivalue_line(
"markers", "is_pt_tf_cross_test: mark test to run only when PT and TF interactions are tested"
)
config.addinivalue_line(
"markers", "is_pt_flax_cross_test: mark test to run only when PT and FLAX interactions are tested"
)
config.addinivalue_line("markers", "is_pipeline_test: mark test to run only when pipelines are tested")
config.addinivalue_line("markers", "is_staging_test: mark test to run only in the staging environment")
config.addinivalue_line("markers", "accelerate_tests: mark test that require accelerate")
config.addinivalue_line("markers", "agent_tests: mark the agent tests that are run on their specific schedule")
config.addinivalue_line("markers", "not_device_test: mark the tests always running on cpu")
config.addinivalue_line("markers", "torch_compile_test: mark test which tests torch compile functionality")
config.addinivalue_line("markers", "torch_export_test: mark test which tests torch export functionality")
os.environ["DISABLE_SAFETENSORS_CONVERSION"] = "true"
def pytest_collection_modifyitems(items):
@ -140,3 +134,18 @@ class CustomOutputChecker(OutputChecker):
doctest.OutputChecker = CustomOutputChecker
_pytest.doctest.DoctestModule = HfDoctestModule
doctest.DocTestParser = HfDocTestParser
if is_torch_available():
import torch
# The flag below controls whether to allow TF32 on cuDNN. This flag defaults to True.
# We set it to `False` for CI. See https://github.com/pytorch/pytorch/issues/157274#issuecomment-3090791615
torch.backends.cudnn.allow_tf32 = False
# patch `torch.compile`: if `TORCH_COMPILE_FORCE_FULLGRAPH=1` (or values considered as true, e.g. yes, y, etc.),
# the patched version will always run with `fullgraph=True`.
patch_torch_compile_force_graph()
if os.environ.get("PATCH_TESTING_METHODS_TO_COLLECT_OUTPUTS", "").lower() in ("yes", "true", "on", "y", "1"):
patch_testing_methods_to_collect_info()

docker/README.md Normal file

@ -0,0 +1,9 @@
# Dockers for `transformers`
In this folder you will find various dockerfiles, and some subfolders.
- dockerfiles (e.g. `consistency.dockerfile`) present under `~/docker` are used for our "fast" CIs. You should be able to use them for tasks that only need a CPU. For example, `torch-light` is a very lightweight container (703MiB).
- subfolders contain dockerfiles used for our `slow` CIs, which *can* be used for GPU tasks, but they are **BIG** as they were not specifically designed for a single model / single task. Thus `~/docker/transformers-pytorch-gpu` includes additional dependencies that let us run ALL model tests (say `librosa` or `tesseract`, which you do not need to run LLMs).
Note that in both cases, you need to run `uv pip install -e .`, which should take around 5 seconds. We do it outside the dockerfile to suit the needs of our CI: we check out a new branch each time, and the `transformers` code is thus updated.
We are open to contributions, and invite the community to create dockerfiles with potential arguments that properly choose extras depending on the model's dependencies! :hugs:


@ -4,13 +4,11 @@ USER root
ARG REF=main
RUN apt-get update && apt-get install -y time git g++ pkg-config make git-lfs
ENV UV_PYTHON=/usr/local/bin/python
RUN pip install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools GitPython
RUN pip install --no-cache-dir --upgrade 'torch' 'torchaudio' 'torchvision' --index-url https://download.pytorch.org/whl/cpu
# tensorflow pin matching setup.py
RUN pip install uv && uv pip install --no-cache-dir -U pip setuptools GitPython
RUN uv pip install --no-cache-dir --upgrade 'torch<2.9' 'torchaudio' 'torchvision' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir pypi-kenlm
RUN uv pip install --no-cache-dir "tensorflow-cpu<2.16" "tf-keras<2.16"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,quality,testing,torch-speech,vision]"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[quality,testing,torch-speech,vision]"
RUN git lfs install
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean


@ -1,9 +1,10 @@
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git cmake wget xz-utils build-essential g++5 libprotobuf-dev protobuf-compiler
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git cmake wget xz-utils build-essential g++5 libprotobuf-dev protobuf-compiler git-lfs curl
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip --no-cache-dir install uv && uv pip install --no-cache-dir -U pip setuptools
RUN wget https://github.com/ku-nlp/jumanpp/releases/download/v2.0.0-rc3/jumanpp-2.0.0-rc3.tar.xz
RUN tar xvf jumanpp-2.0.0-rc3.tar.xz
@ -14,13 +15,21 @@ RUN mv catch.hpp ../libs/
RUN cmake .. -DCMAKE_INSTALL_PREFIX=/usr/local
RUN make install -j 10
WORKDIR /
RUN uv pip install --no-cache --upgrade 'torch' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir --no-deps accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir "transformers[ja,testing,sentencepiece,jieba,spacy,ftfy,rjieba]" unidic unidic-lite
RUN uv pip install --no-cache --upgrade 'torch<2.9' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir --no-deps accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[ja,testing,sentencepiece,spacy,ftfy,rjieba]" unidic unidic-lite
# spacy is not used so not tested; it causes failures. TODO: fix later
RUN python3 -m unidic download
RUN pip uninstall -y transformers
RUN uv run python -m unidic download
# fetch test data and hub objects within CircleCI docker images to reduce even more connections
# we don't need a full clone of `transformers` to run `fetch_hub_objects_for_ci.py`
# the data are downloaded to the directory `/test_data` and during CircleCI's CI runtime, we need to move them to the root of `transformers`
RUN mkdir test_data && cd test_data && curl -O https://raw.githubusercontent.com/huggingface/transformers/${REF}/utils/fetch_hub_objects_for_ci.py && python3 fetch_hub_objects_for_ci.py
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
RUN apt remove -y g++ cmake xz-utils libprotobuf-dev protobuf-compiler
RUN apt remove -y g++ cmake xz-utils libprotobuf-dev protobuf-compiler


@ -1,12 +0,0 @@
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git
RUN apt-get install -y g++ cmake
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv
RUN uv pip install --no-cache-dir -U pip setuptools albumentations seqeval
RUN pip install --upgrade --no-cache-dir "transformers[tf-cpu,sklearn,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3"
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*


@ -1,11 +1,19 @@
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git-lfs ffmpeg curl
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "transformers[sklearn,sentencepiece,vision,testing]" seqeval albumentations jiwer
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
RUN pip --no-cache-dir install uv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-cache-dir 'torch<2.9' 'torchaudio' 'torchvision' 'torchcodec' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing]" seqeval albumentations jiwer
# fetch test data and hub objects within CircleCI docker images to reduce even more connections
# we don't need a full clone of `transformers` to run `fetch_hub_objects_for_ci.py`
# the data are downloaded to the directory `/test_data` and during CircleCI's CI runtime, we need to move them to the root of `transformers`
RUN mkdir test_data && cd test_data && curl -O https://raw.githubusercontent.com/huggingface/transformers/${REF}/utils/fetch_hub_objects_for_ci.py && python3 fetch_hub_objects_for_ci.py
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*


@ -2,16 +2,23 @@ FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git libgl1-mesa-glx libgl1 g++ tesseract-ocr
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git libgl1 g++ tesseract-ocr git-lfs curl
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN pip --no-cache-dir install uv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-cache-dir 'torch<2.9' 'torchaudio' 'torchvision' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir --no-deps timm accelerate
RUN pip install -U --upgrade-strategy eager --no-cache-dir pytesseract python-Levenshtein opencv-python nltk
RUN uv pip install -U --no-cache-dir pytesseract python-Levenshtein opencv-python nltk
# RUN uv pip install --no-cache-dir natten==0.15.1+torch210cpu -f https://shi-labs.com/natten/wheels
RUN pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[testing, vision]" 'scikit-learn' 'torch-stft' 'nose' 'dataset'
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[testing, vision]" 'scikit-learn' 'torch-stft' 'nose' 'dataset'
# RUN git clone https://github.com/facebookresearch/detectron2.git
# RUN python3 -m pip install --no-cache-dir -e detectron2
RUN pip install 'git+https://github.com/facebookresearch/detectron2.git@92ae9f0b92aba5867824b4f12aa06a22a60a45d3'
RUN pip uninstall -y transformers
RUN uv pip install 'git+https://github.com/facebookresearch/detectron2.git@92ae9f0b92aba5867824b4f12aa06a22a60a45d3' --no-build-isolation
# fetch test data and hub objects within CircleCI docker images to reduce even more connections
# we don't need a full clone of `transformers` to run `fetch_hub_objects_for_ci.py`
# the data are downloaded to the directory `/test_data` and during CircleCI's CI runtime, we need to move them to the root of `transformers`
RUN mkdir test_data && cd test_data && curl -O https://raw.githubusercontent.com/huggingface/transformers/${REF}/utils/fetch_hub_objects_for_ci.py && python3 fetch_hub_objects_for_ci.py
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*


@ -1,10 +0,0 @@
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git g++ cmake
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir "scipy<1.13" "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,testing,sentencepiece,flax-speech,vision]"
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean


@ -1,10 +0,0 @@
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git cmake g++
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,tf-cpu,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3" tensorflow_probability
RUN apt-get clean && rm -rf /var/lib/apt/lists/*


@ -2,10 +2,17 @@ FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git pkg-config openssh-client git
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git pkg-config openssh-client git ffmpeg curl
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN pip --no-cache-dir install uv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-cache-dir 'torch<2.9' 'torchaudio' 'torchvision' 'torchcodec' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing]"
RUN pip uninstall -y transformers
# fetch test data and hub objects within CircleCI docker images to reduce even more connections
# we don't need a full clone of `transformers` to run `fetch_hub_objects_for_ci.py`
# the data are downloaded to the directory `/test_data` and during CircleCI's CI runtime, we need to move them to the root of `transformers`
RUN mkdir test_data && cd test_data && curl -O https://raw.githubusercontent.com/huggingface/transformers/${REF}/utils/fetch_hub_objects_for_ci.py && python3 fetch_hub_objects_for_ci.py
RUN uv pip uninstall transformers


@ -2,8 +2,8 @@ FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y time git
RUN apt-get update && apt-get install -y time git
ENV UV_PYTHON=/usr/local/bin/python
RUN pip install uv && uv venv
RUN pip install uv
RUN uv pip install --no-cache-dir -U pip setuptools GitPython "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[ruff]" urllib3
RUN apt-get install -y jq curl && apt-get clean && rm -rf /var/lib/apt/lists/*
RUN apt-get install -y jq curl && apt-get clean && rm -rf /var/lib/apt/lists/*


@ -1,12 +0,0 @@
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ pkg-config openssh-client git
RUN apt-get install -y cmake
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[tf-cpu,sklearn,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3"
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean


@ -1,16 +0,0 @@
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-deps accelerate
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN pip install --no-cache-dir "scipy<1.13" "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,audio,sklearn,sentencepiece,vision,testing]"
# RUN pip install --no-cache-dir "scipy<1.13" "transformers[flax,testing,sentencepiece,flax-speech,vision]"
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean


@ -2,10 +2,16 @@ FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git git-lfs
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git-lfs ffmpeg curl
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN pip --no-cache-dir install uv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-cache-dir 'torch<2.9' 'torchaudio' 'torchvision' 'torchcodec' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing,tiktoken]"
RUN pip uninstall -y transformers
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing,tiktoken,num2words,video]"
# fetch test data and hub objects within CircleCI docker images to reduce even more connections
# we don't need a full clone of `transformers` to run `fetch_hub_objects_for_ci.py`
# the data are downloaded to the directory `/test_data` and during CircleCI's CI runtime, we need to move them to the root of `transformers`
RUN mkdir test_data && cd test_data && curl -O https://raw.githubusercontent.com/huggingface/transformers/${REF}/utils/fetch_hub_objects_for_ci.py && python3 fetch_hub_objects_for_ci.py
RUN uv pip uninstall transformers

Some files were not shown because too many files have changed in this diff.