transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-10-20 17:13:56 +08:00

Author	SHA1	Message	Date
Judy	50ca781d78	🌐 [i18n-KO] Translated `code_llama.md` to Korean (#40558 ) * docs: ko: code_llama.md * feat: nmt draft * fix: manual edits * Apply suggestions from code review Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by: HyunZ118 <156191095+HyunZ118@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> --------- Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by: HyunZ118 <156191095+HyunZ118@users.noreply.github.com>	2025-10-16 11:27:46 -07:00
SSUM	8739fc05c4	[i18n-KO] Translated `big_bird.md` to Korean (#40445 ) * docs: ko: BigBird.md * feat: nmt draft * fix: manual edits	2025-10-16 11:23:56 -07:00
HyunZ118	77b5ad65ee	🌐 [i18n-KO] Translated sam_hq.md to Korean (#41340 ) * fix: manual edits * Apply suggestions from code review Apply suggestions from code review Co-authored-by: HyunSang Jang <tasker.dev103@gmail.com> * Apply suggestions from code review Apply suggestions from code review Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by: HyunSang Jang <tasker.dev103@gmail.com> Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>	2025-10-16 11:10:16 -07:00
Judy	a9731a725e	🌐 [i18n-KO] Translated `chat_extras.md` to Korean (#39863 ) * docs: ko: chat_extras.md * feat: nmt draft * fix: manual edits * Apply suggestions from code review * Apply suggestions from code review * Update docs/source/ko/chat_extras.md	2025-10-16 10:41:03 -07:00
Marc Sun	bdbc2d037b	[Trainer] [Breaking change] `use_cache` default to `False` (#41585 ) * use_cache default to `False` when training * style * Fix comment * add checks * style * set * switch	2025-10-16 18:51:36 +02:00
Mohamed Mekkouri	fe11cbb808	Erroring when KernelConfig is passed without use_kernels = True (#41657 ) * update * update	2025-10-16 18:08:46 +02:00
Yih-Dar	6344371a91	improve `utils/check_bad_commit.py` (#41658 ) * robust * robust * robust --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-16 15:51:19 +00:00
Cyril Vallez	a408384a88	Improve package version check (#41661 ) fix	2025-10-16 17:31:58 +02:00
Rémi Ouazan	f7c33abab3	Small changes to benchmarking script (#41662 )	2025-10-16 17:25:49 +02:00
Marc Sun	9839d57a02	Fix serving continuous batching (#41624 ) * udpate-serving-cb * style * style * check none * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-10-16 17:24:21 +02:00
Cyril Vallez	e85d5ab2bb	Fix dtype casting with quantization (#41665 ) fix dtype casting	2025-10-16 17:19:32 +02:00
Raushan Turganbay	1c36d407d5	Add in-out modalities as class attribute per model (#41366 ) * update all models * fix copies * explanation comment * better notation in omni model * style * fix copies * output_modalities under generation mixin * fix copies * oh, glm4v also needs conversion	2025-10-16 17:11:06 +02:00
Rémi Ouazan	0215846d98	Switch to CB if cache_implementation == paged (#41655 ) * Add a switch to CB in case of paged cache * Added paged as a valid cache implem * Added a fallback on inputs_ids as a name * Rookie mistake * Removed paged from cache implems * Added warning about some beam search args * Moved up CB warning	2025-10-16 17:00:18 +02:00
Yuanyuan Chen	9e99198e5e	Use \| for Optional and Union typing (#41646 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-16 14:29:54 +00:00
Anton Vlasjuk	bf815e9b5e	[`Masks`] Fix mask handling in eager for vision models (#41625 ) add mask handling in case of models that do use it	2025-10-16 16:27:26 +02:00
vb	4a43e3d57c	purge HF_HUB_ENABLE_HF_TRANSFER; promote Xet (#41656 )	2025-10-16 16:17:09 +02:00
Fabian Joswig	8725ce10ed	[Fix] Deepseek V3 expert bias routing (#41647 ) * [Fix] Deepseek V3 expert bias routing * [Fix] fix-copies * [Fix] Run make style	2025-10-16 14:04:48 +00:00
Mohamed Mekkouri	1fb3fc4db0	[kernels] refactor function kernel calling (#41577 ) * refactor function kernel callling * nit * don't pass the mapping * use _kernels_available * rm import	2025-10-16 15:43:02 +02:00
Pablo Montalvo	9176af574a	Double router compute? (#41653 ) * weird double router compute? * flip it	2025-10-16 15:17:21 +02:00
Yuanyuan Chen	503c933f36	Fix confusing cls assignment (#41642 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-16 13:01:07 +00:00
Yuanyuan Chen	2aff20aff6	Fix typos in documentation (#41641 ) Fix typos Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-16 12:58:46 +00:00
Yuanyuan Chen	981370c038	Format MarkDown documentation and tiny fixes (#41638 ) * Fix MarkDown syntax Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-16 12:58:06 +00:00
Rémi Ouazan	eef9fb2af3	Fix EncoderDecoder cache (#41612 ) * Fix EncoderDecoder cache * Add the option for the ddp data tuples to have 2 elems * Modifiy the order of the KV and sliding * Adapted RAG and Whisper to new EncoderDecoderCache * A single comma * Remove kwargs in map * Fixed order in manual injection cache test * Slight changes to support legacy format * Removed Nonnes	2025-10-16 14:55:41 +02:00
Mario Koddenbrock	35dc8f0a2e	Adjust device logging level and add minor fixes (#41636 ) This commit addresses a noisy warning and improves the robustness of the base pipeline implementation. - The device placement message in the pipeline base class has been changed from a `warning` to a `debug` log. This reduces log noise for users who are aware of their device setup, while still providing the information for debugging purposes. - Additionally, potential `UnboundLocalError` exceptions in the `_pad` and `check_model_type` functions have been prevented by initializing variables before their conditional assignment.	2025-10-16 12:47:39 +00:00
Rémi Ouazan	2935a1be19	Fix fp32_ln for various models (#41605 ) * Add is_causal to KosmosTextAttention * Move get target_dtype to be imported elsewhere * Fix fp32 flash attention bug in bark * Fix is_causal in mllama * Fix fp32 issue on StableLM * Fix repo-consistency	2025-10-16 14:18:49 +02:00
Steven Liu	b9bd8c45a1	[CI] Build translated docs (#41632 ) fix	2025-10-16 14:01:33 +02:00
Anton Vlasjuk	baecdb8a97	[`Ernie 4.5 Moe`] Fix Moe and offloading (#41385 ) fix	2025-10-16 13:59:01 +02:00
Anton Vlasjuk	44539827d5	[`Executorch`] Simplify for encoder models (#41627 ) * Trigger Build * revert extra treatment for executorch as we default to no vmapping now	2025-10-16 13:57:52 +02:00
jiqing-feng	143acfe2ce	fix check inputs for text2text pipeline (#41556 ) fix check inputs Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-10-16 11:42:41 +00:00
Andrei Panferov	67fae90519	Fix FP-Quant quantization fallback CPU dispatch. (#41619 ) * fp_quant fix * Update quantizer_fp_quant.py	2025-10-16 11:41:01 +00:00
Lucain	af2a66ced9	Migrate transformers cli to Typer (#41487 ) * Add typer-slim as explicit dependency * Migrate CLI to Typer * code quality * bump release candidate * adapt test_cli.py * Remove ./commands + adapt tests * fix quality * consistency * doctested * do not serve model in chat * style * will it fix them? * fix test * capitalize classes * Rebase * Rebase * tests + fixup tests + fixup * csutom error message * fix ? * should be good * fix caplog globally * inner caplog * last attempt * Retry * Let's try with capsys disabled --------- Co-authored-by: Lysandre <hi@lysand.re>	2025-10-16 13:29:42 +02:00
Yoni Gozlan	a59124e27e	Add missing dates to docs (#41576 ) add dates	2025-10-16 09:32:28 +00:00
Cyril Vallez	81f97b17d2	Remove randomly added script (#41650 ) remove	2025-10-16 11:23:53 +02:00
Cyril Vallez	c0a5cf19ad	Fix tokenization test (#41649 ) fix	2025-10-16 11:14:20 +02:00
Cyril Vallez	3ef6f2c415	Allow passing `tp_plan` in `from_pretrained` directly (#41435 ) * start * allow passing it * fix plans * fix * fix * style * style * fix * add_test * oupsi indent * fix * fix * fix for CI without accelerator * fix import	2025-10-16 11:12:07 +02:00
Yuxuan Zhang	59efd86da2	Add aux loss for GLM-4.5V (#41564 ) * add aux * update * update config to text_config * use qwen data class to avoid repeat again * format * update * use 1e-4 * update * update for remove init * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Raushan Turganbay <raushan@huggingface.co>	2025-10-16 09:04:21 +00:00
Raushan Turganbay	7b7d17f9bf	🚨 [v5] Toggle the serialization format in processors (#41474 ) * toggle the serialization * prob this fixes it * fix tests * typo * delete legacy save entirely * remove extra nesting in if * revert test and serialzie a public attr instead of private	2025-10-16 10:19:22 +02:00
Merve Noyan	e20df45bf6	Add Backbone API fine-tuning tutorial (#41590 ) --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-10-15 18:42:32 +02:00
Jack	19df66dcba	Update executorch.md (#41582 ) * Update executorch.md * Update executorch.md * Update executorch.md * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-10-15 09:01:46 -07:00
Steven Liu	9f71e3a604	[docs] Duplicate entry (#41591 ) fix	2025-10-15 17:02:36 +02:00
Marc Sun	bc9900562d	Fix quantization base class (#41613 ) * fix * fix --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-10-15 16:58:17 +02:00
Marc Sun	72fd67929b	Remove deprecated code (#41616 ) remove Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-10-15 16:57:52 +02:00
Yih-Dar	da382917aa	Remove the head masking block in some vision models (#41620 ) * old * new --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 15:51:01 +02:00
Raushan Turganbay	313afcc468	[chat template] update when "push_to_hub" (#39815 ) * update templates push to hub * rvert jinja suffix and move it to processor file	2025-10-15 13:49:59 +00:00
Raushan Turganbay	7bba4d1202	Fix video processing channel format (#41603 ) fix	2025-10-15 15:48:01 +02:00
Zhen	ab92534377	enable sdpa enable gqa logic for Ascend NPU (#41601 ) * enable gqa logic for Ascend NPU * remove redundant comments * fix comments about Ascend NPU --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>	2025-10-15 13:45:28 +00:00
i3hz	56a727dde5	Add fast path for bidirectional mask creation to fix regression (#41586 ) * fixed performance regression * also fixed the older_torch function * Update src/transformers/masking_utils.py Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * fix * more general * fix slicing * fix data dependent --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-10-15 15:30:39 +02:00
Yih-Dar	dc6fdeb705	Update a dataset reop link (#41618 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 14:41:38 +02:00
Harry Mellor	3953b65440	Reinstate early CUDA init fix (#41617 ) * Reinstate early CUDA init fix Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * Delay import further Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> --------- Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-15 14:41:10 +02:00
Yih-Dar	96d245a83d	torch 2.9 don't ❤️ torchcodec 💔 (#41610 ) pin Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 14:34:00 +02:00

20895 Commits