transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-10-20 09:03:53 +08:00

Author	SHA1	Message	Date
Steven Liu	b9bd8c45a1	[CI] Build translated docs (#41632 ) fix	2025-10-16 14:01:33 +02:00
Anton Vlasjuk	baecdb8a97	[`Ernie 4.5 Moe`] Fix Moe and offloading (#41385 ) fix	2025-10-16 13:59:01 +02:00
Anton Vlasjuk	44539827d5	[`Executorch`] Simplify for encoder models (#41627 ) * Trigger Build * revert extra treatment for executorch as we default to no vmapping now	2025-10-16 13:57:52 +02:00
jiqing-feng	143acfe2ce	fix check inputs for text2text pipeline (#41556 ) fix check inputs Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-10-16 11:42:41 +00:00
Andrei Panferov	67fae90519	Fix FP-Quant quantization fallback CPU dispatch. (#41619 ) * fp_quant fix * Update quantizer_fp_quant.py	2025-10-16 11:41:01 +00:00
Lucain	af2a66ced9	Migrate transformers cli to Typer (#41487 ) * Add typer-slim as explicit dependency * Migrate CLI to Typer * code quality * bump release candidate * adapt test_cli.py * Remove ./commands + adapt tests * fix quality * consistency * doctested * do not serve model in chat * style * will it fix them? * fix test * capitalize classes * Rebase * Rebase * tests + fixup tests + fixup * csutom error message * fix ? * should be good * fix caplog globally * inner caplog * last attempt * Retry * Let's try with capsys disabled --------- Co-authored-by: Lysandre <hi@lysand.re>	2025-10-16 13:29:42 +02:00
Yoni Gozlan	a59124e27e	Add missing dates to docs (#41576 ) add dates	2025-10-16 09:32:28 +00:00
Cyril Vallez	81f97b17d2	Remove randomly added script (#41650 ) remove	2025-10-16 11:23:53 +02:00
Cyril Vallez	c0a5cf19ad	Fix tokenization test (#41649 ) fix	2025-10-16 11:14:20 +02:00
Cyril Vallez	3ef6f2c415	Allow passing `tp_plan` in `from_pretrained` directly (#41435 ) * start * allow passing it * fix plans * fix * fix * style * style * fix * add_test * oupsi indent * fix * fix * fix for CI without accelerator * fix import	2025-10-16 11:12:07 +02:00
Yuxuan Zhang	59efd86da2	Add aux loss for GLM-4.5V (#41564 ) * add aux * update * update config to text_config * use qwen data class to avoid repeat again * format * update * use 1e-4 * update * update for remove init * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Raushan Turganbay <raushan@huggingface.co>	2025-10-16 09:04:21 +00:00
Raushan Turganbay	7b7d17f9bf	🚨 [v5] Toggle the serialization format in processors (#41474 ) * toggle the serialization * prob this fixes it * fix tests * typo * delete legacy save entirely * remove extra nesting in if * revert test and serialzie a public attr instead of private	2025-10-16 10:19:22 +02:00
Merve Noyan	e20df45bf6	Add Backbone API fine-tuning tutorial (#41590 ) --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-10-15 18:42:32 +02:00
Jack	19df66dcba	Update executorch.md (#41582 ) * Update executorch.md * Update executorch.md * Update executorch.md * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-10-15 09:01:46 -07:00
Steven Liu	9f71e3a604	[docs] Duplicate entry (#41591 ) fix	2025-10-15 17:02:36 +02:00
Marc Sun	bc9900562d	Fix quantization base class (#41613 ) * fix * fix --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-10-15 16:58:17 +02:00
Marc Sun	72fd67929b	Remove deprecated code (#41616 ) remove Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-10-15 16:57:52 +02:00
Yih-Dar	da382917aa	Remove the head masking block in some vision models (#41620 ) * old * new --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 15:51:01 +02:00
Raushan Turganbay	313afcc468	[chat template] update when "push_to_hub" (#39815 ) * update templates push to hub * rvert jinja suffix and move it to processor file	2025-10-15 13:49:59 +00:00
Raushan Turganbay	7bba4d1202	Fix video processing channel format (#41603 ) fix	2025-10-15 15:48:01 +02:00
Zhen	ab92534377	enable sdpa enable gqa logic for Ascend NPU (#41601 ) * enable gqa logic for Ascend NPU * remove redundant comments * fix comments about Ascend NPU --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>	2025-10-15 13:45:28 +00:00
i3hz	56a727dde5	Add fast path for bidirectional mask creation to fix regression (#41586 ) * fixed performance regression * also fixed the older_torch function * Update src/transformers/masking_utils.py Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * fix * more general * fix slicing * fix data dependent --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-10-15 15:30:39 +02:00
Yih-Dar	dc6fdeb705	Update a dataset reop link (#41618 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 14:41:38 +02:00
Harry Mellor	3953b65440	Reinstate early CUDA init fix (#41617 ) * Reinstate early CUDA init fix Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Delay import further Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> --------- Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-15 14:41:10 +02:00
Yih-Dar	96d245a83d	torch 2.9 don't ❤️ torchcodec 💔 (#41610 ) pin Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 14:34:00 +02:00
Yuanyuan Chen	bb0c3af995	More markdown file fixes (#41599 ) * Format markdown files Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Format markdown files Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Format markdown files Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-15 12:29:27 +00:00
Marc Sun	70e871959c	Fix trainer simple tests (#41449 ) * fix * fix ray * train to tune * breaking changes wrt generation config * Fix ! * fix * fix * fix deepspeed ! * fix * fix * fix * improve logic * revert and fix * revert comment * oups * revert change * fix * style * typo in comment --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-10-15 14:09:00 +02:00
Cyril Vallez	c4210796e0	Import `expand_device_map` instead of redefining it (#41608 ) remove it	2025-10-15 14:00:09 +02:00
Anton Vlasjuk	fcd1ccdb78	[`Docs`] Fix changed references (#41614 ) * fix * fix * other ln	2025-10-15 13:59:13 +02:00
Marc Sun	2b2c20f315	Update issue template (#41573 ) * update * fix	2025-10-15 13:54:37 +02:00
Marc Sun	e2122c4bcb	remove ray_scope and check_quantized_param (#41587 ) remove	2025-10-15 13:10:35 +02:00
Wang, Yi	e89cef6625	fix some case failures lead by "`torch.compile` recompiled part of th… (#41558 ) * fix some case failures lead by "`torch.compile` recompiled part of the forward pass" in xpu Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * update comment Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2025-10-15 10:45:29 +00:00
Philip Roeleveld	26b7f66850	Add `logits_to_keep` to many older CausalLM models (#41335 ) * Add logits_to_keep to CausalLM models * Skip failing test for git model * Remove unused return_dict from kosmos2 signature * Revert BlipForQuestionAnswering	2025-10-15 11:56:01 +02:00
Cyril Vallez	5db730786d	[device_map] Accelerate loading by computing device_map much faster (#41548 ) * start * add the important fix * continue * big cleanup * type hints * add method * fix typehints * typehints * fix * oupsi * remove space * improve function * CI	2025-10-15 11:18:57 +02:00
Lysandre Debut	13a35a5057	Enable non-streaming mode in `transformers serve` (#41446 ) * Enable non-streaming in transformers serve Remove typos Remove typos Remove typos * Fix tests * Arthur review	2025-10-15 09:37:26 +02:00
Rémi Ouazan	94df0e6560	Benchmark overhaul (#41408 ) * Big refactor, still classes to move around and script to re-complexify * Move to streamer, isolate benches, propagate num tokens * Some refacto * Added compile mode to name * Re-order * Move to dt_tokens * Better format * Fix and disable use_cache by default * Fixed compile and SDPA backend default * Refactor results format * Added default compile mode * Always use cache * Fixed cache and added flex * Plan for missing modules * Experiments: no cg and shuffle * Disable compile for FA * Remove wall time, add sweep mode, get git commit * Review compliance, start * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Update benchmark_v2/framework/benchmark_runner.py Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Disable workflow * Pretty print * Added some pretty names to have pretty logs * Review n2 compliance (end?) * Style and end of PR --------- Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>	2025-10-14 21:41:43 +02:00
Rémi Ouazan	9e4199ede3	Gemma3 fixes (#41572 ) * Multiple device error fix * FA2 equivalence fix * Move the train fwd in cfg test * Style * Added comment * Made the comment more clear	2025-10-14 18:33:27 +02:00
Yuanyuan Chen	4c8d293599	Fix typsetting and content of llm_tutorial_optimization.md (#41172 ) * Fix typsetting of llm_tutorial_optimization Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Fix errors Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-14 08:40:26 -07:00
Marc Sun	a99b1be3c7	Revert some breaking changes bnb (#41581 ) fix	2025-10-14 16:28:16 +02:00
Rémi Ouazan	82cae9eb52	Add __iter__ to DynamicCache (#41569 ) * Add __iter__ to DynamicCache * Fix tests that use ddp init	2025-10-14 16:16:32 +02:00
NielsRogge	4fad35ee4a	[VisionEncoderDecoderModel] Update loss function (#40863 ) Update loss function	2025-10-14 16:03:00 +02:00
Mohamed Mekkouri	ae6f6cc3e0	Revert "add rmsnorm kernels support for Intel XPU" (#41579 ) Revert "add rmsnorm kernels support for Intel XPU (#41563)" This reverts commit fd787c5f6d667d3e00def70f588972af4437f631.	2025-10-14 15:49:33 +02:00
kaixuanliu	fd787c5f6d	add rmsnorm kernels support for Intel XPU (#41563 ) Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>	2025-10-14 13:26:09 +00:00
Chunyu	4e4f2af586	Add conditional checks to _check_and_adjust_attn_implementation() (#41542 )	2025-10-14 13:00:07 +00:00
Merve Noyan	3648fde486	Add DINOv3Backbone for ConvNext variant (#40651 ) --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-10-14 14:57:04 +02:00
Yih-Dar	abf5b57a68	delete some tokenizer tests using pickle (#41514 ) * hate pickle * hate pickle * hate pickle --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-14 14:50:51 +02:00
Mohamed Mekkouri	8fe4db5399	[kernels] rm mra kernels (#41507 ) * fix modeling * remove kernel * fix style	2025-10-14 13:34:04 +02:00
Daniel Quintas	c620c38bb0	[Qwen3VLMoe] Fixed: Expected self.dtype to be equal to src.dtype - routing_weights casting (#41420 ) * Fixed Expected self.dtype to be equal to src.dtype on eval * Fixed Expected self.dtype to be equal to src.dtype on eval * Fixed Expected self.dtype to be equal to src.dtype on eval * generated modeling_qwen3_vl_moe.py file * Fixed Ernie_4_5_MoE router casting * Fixed routing_weights dtype casting (ernie4_5_moe, hunyuan_v1_moe, qwen2_moe, qwen3_moe, qwen3_next,qwen3_omni_moe) * rollback hunyuan_v1_moe changes --------- Co-authored-by: Daniel Oliveira <daniel-oliveira-11@hotmail.com> Co-authored-by: Daniel Oliveira <36623265+daniel3303@users.noreply.github.com>	2025-10-14 13:14:49 +02:00
Rémi Ouazan	0798797ec9	Fix an import error with PreTrainModel (#41571 )	2025-10-14 13:13:37 +02:00
Julien Denize	0566b6f5bd	Patch MistralCommonTokenizer (#41439 ) * Fix token_to_id and add add_generation_prompt * Fix spm download * Refactor spm * Try another possibly non-gated spm * Improve get_vocab * lint * Improve get_vocab * Add warn to piece_to_id * Improve from_pretrained raise and revert model spm * Revert fast	2025-10-14 11:13:19 +00:00

20920 Commits