transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-10-20 09:03:53 +08:00

Author	SHA1	Message	Date
Yuanyuan Chen	12a50f294d	Enable FURB rules in ruff (#41395 ) * Apply ruff FURB rules Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Enable ruff FURB rules Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Revert changes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-17 15:00:40 +00:00
Cyril Vallez	39b6d3bf7e	Remove skipped tests without parents (#41691 ) remove	2025-10-17 16:25:40 +02:00
Cyril Vallez	75da795d8f	🚨 Remove torch.fx support (#41683 ) * remove all * fix comments * better checks * doc	2025-10-17 16:12:46 +02:00
Yuanyuan Chen	080d704af1	Fix Pylint warnings (#41644 ) * Fix pylint warnings Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Raise with an exception Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-17 13:09:42 +00:00
Raushan Turganbay	10de06dace	🚨 [v5] Refactor RoPE for layer types (#39847 ) * update * batch update model code * typos * too many diffs, dump * dump again * another dump * fix copies * make `rope_scaling_dict` self attr * fix a few more tests * another update * fix a few more tests, hopefully last ones * fox copies * fix copies again * fix newly added models, I hate rebasing on main * update config files * modular files * fix rope utils test * docstring has to be indented more, why? * oops forgot to update some modualr files * copy from doesn't copy decorators? * fix overriden test as well * add a new test * fix failing tests again * update docstrings * fix phi3 * fix two models * fix copies * forgot to add * stupid bug from modular conversion * fix slow tests * update to call rotary emb once per model forward * 3K tests failing?! * update * update more models * fix copies * fix the rest of tests hopefully * fix after rebase * fix the rope tests * fix docs omni * change a bit * models with layer types * why it was deleted? * fix a few tests * fix last test! * delete extra empty lines * add a test case * more changes * fix models * typing hint for nested rope params * missed when resolving conflicts * delete layer types and fix typo * fix copies * fix copies * update docs text * docs * huuge update all models * fix copies * rename attr to align with new format * delete redundant rope tests * trigger ci * update the case * this is why i hate rebasing * maybe fixed? * oops * now fix? * fix last tests and copies * fix copies? * fix minimax and gemma3n * update typo * deprecation end version * final fix copies :fingers-crossed: * oh my, add the docs in toctree * oke, this is really the last fix * fix copies and hope that tests won't start failing again * use rope scaling if saved * fix slow tests * fix cwm and unrelated deepseek * fix last * update * hope it works now, it took so long * lets keep None for now, I will try to remove after checking tests * some more fixes, i find and replace does not always find all cases * last fix of tests * arthur's comment for extra foreward kwargs * delete unused code * fix slow qwen tests * delete layer types from models * faulty modular conversion * fix qwen omni * fix copies and style * address my comment --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-17 14:57:27 +02:00
Cyril Vallez	0b3aef1da9	🚨 Remove torchscript support (#41688 ) * remove a lot * remove the rest * doc	2025-10-17 13:38:27 +02:00
Lucain	252d7cd952	Remove deprecated `use_auth_token` parameter (#41666 ) * Remove deprecated use_auth_token * code styl * fix test * Update examples/pytorch/speech-recognition/README.md	2025-10-17 09:57:46 +00:00
Julien	354567d955	Adding superglue fast image processing (#41394 ) * Default implementation - no time improvement * Improved implementation - apparently 2 times faster with only simple function refactor * elementary torch first approach, still need further implementation of torch first method * torch-first approach finished * refactor processor * refactor test * partial doc update * EfficientLoFTRImageProcessorFast based implementation * EfficientLoFTRImageProcessorFast based implementation * Logic checked - Test Passed - Validated execution speed * use modular for efficientloftr * fix import --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-10-16 19:34:09 +00:00
Yuanyuan Chen	9e99198e5e	Use \| for Optional and Union typing (#41646 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-16 14:29:54 +00:00
Rémi Ouazan	eef9fb2af3	Fix EncoderDecoder cache (#41612 ) * Fix EncoderDecoder cache * Add the option for the ddp data tuples to have 2 elems * Modifiy the order of the KV and sliding * Adapted RAG and Whisper to new EncoderDecoderCache * A single comma * Remove kwargs in map * Fixed order in manual injection cache test * Slight changes to support legacy format * Removed Nonnes	2025-10-16 14:55:41 +02:00
Anton Vlasjuk	44539827d5	[`Executorch`] Simplify for encoder models (#41627 ) * Trigger Build * revert extra treatment for executorch as we default to no vmapping now	2025-10-16 13:57:52 +02:00
Andrei Panferov	67fae90519	Fix FP-Quant quantization fallback CPU dispatch. (#41619 ) * fp_quant fix * Update quantizer_fp_quant.py	2025-10-16 11:41:01 +00:00
Lucain	af2a66ced9	Migrate transformers cli to Typer (#41487 ) * Add typer-slim as explicit dependency * Migrate CLI to Typer * code quality * bump release candidate * adapt test_cli.py * Remove ./commands + adapt tests * fix quality * consistency * doctested * do not serve model in chat * style * will it fix them? * fix test * capitalize classes * Rebase * Rebase * tests + fixup tests + fixup * csutom error message * fix ? * should be good * fix caplog globally * inner caplog * last attempt * Retry * Let's try with capsys disabled --------- Co-authored-by: Lysandre <hi@lysand.re>	2025-10-16 13:29:42 +02:00
Cyril Vallez	c0a5cf19ad	Fix tokenization test (#41649 ) fix	2025-10-16 11:14:20 +02:00
Cyril Vallez	3ef6f2c415	Allow passing `tp_plan` in `from_pretrained` directly (#41435 ) * start * allow passing it * fix plans * fix * fix * style * style * fix * add_test * oupsi indent * fix * fix * fix for CI without accelerator * fix import	2025-10-16 11:12:07 +02:00
Raushan Turganbay	7b7d17f9bf	🚨 [v5] Toggle the serialization format in processors (#41474 ) * toggle the serialization * prob this fixes it * fix tests * typo * delete legacy save entirely * remove extra nesting in if * revert test and serialzie a public attr instead of private	2025-10-16 10:19:22 +02:00
Marc Sun	bc9900562d	Fix quantization base class (#41613 ) * fix * fix --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-10-15 16:58:17 +02:00
Raushan Turganbay	313afcc468	[chat template] update when "push_to_hub" (#39815 ) * update templates push to hub * rvert jinja suffix and move it to processor file	2025-10-15 13:49:59 +00:00
Yih-Dar	dc6fdeb705	Update a dataset reop link (#41618 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 14:41:38 +02:00
Marc Sun	70e871959c	Fix trainer simple tests (#41449 ) * fix * fix ray * train to tune * breaking changes wrt generation config * Fix ! * fix * fix * fix deepspeed ! * fix * fix * fix * improve logic * revert and fix * revert comment * oups * revert change * fix * style * typo in comment --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-10-15 14:09:00 +02:00
Philip Roeleveld	26b7f66850	Add `logits_to_keep` to many older CausalLM models (#41335 ) * Add logits_to_keep to CausalLM models * Skip failing test for git model * Remove unused return_dict from kosmos2 signature * Revert BlipForQuestionAnswering	2025-10-15 11:56:01 +02:00
Lysandre Debut	13a35a5057	Enable non-streaming mode in `transformers serve` (#41446 ) * Enable non-streaming in transformers serve Remove typos Remove typos Remove typos * Fix tests * Arthur review	2025-10-15 09:37:26 +02:00
Rémi Ouazan	9e4199ede3	Gemma3 fixes (#41572 ) * Multiple device error fix * FA2 equivalence fix * Move the train fwd in cfg test * Style * Added comment * Made the comment more clear	2025-10-14 18:33:27 +02:00
Rémi Ouazan	82cae9eb52	Add __iter__ to DynamicCache (#41569 ) * Add __iter__ to DynamicCache * Fix tests that use ddp init	2025-10-14 16:16:32 +02:00
Merve Noyan	3648fde486	Add DINOv3Backbone for ConvNext variant (#40651 ) --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-10-14 14:57:04 +02:00
Yih-Dar	abf5b57a68	delete some tokenizer tests using pickle (#41514 ) * hate pickle * hate pickle * hate pickle --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-14 14:50:51 +02:00
Julien Denize	0566b6f5bd	Patch MistralCommonTokenizer (#41439 ) * Fix token_to_id and add add_generation_prompt * Fix spm download * Refactor spm * Try another possibly non-gated spm * Improve get_vocab * lint * Improve get_vocab * Add warn to piece_to_id * Improve from_pretrained raise and revert model spm * Revert fast	2025-10-14 11:13:19 +00:00
Kehan Li	cad74496ca	[model] Add VideoLLaMA3 implementation (#40499 ) * Add VideoLLaMA3 implementation * Run style fix * Switch to modular * Fix config and smart_resize * Fix * Fix * Fix style * Fix * Ruff fix * Rename * Rename * Fix * Clean * Fix consistency * Add doc * Fix * Fix * Fix doc * Update generated code * remove test_initialization * fix tests * simplify * tests * Add VideoLlama3IntegrationTest * replace asserts * fix tests --------- Co-authored-by: steven-ccq <55176896+steven-ccq@users.noreply.github.com> Co-authored-by: steven-ccq <1456320989@qq.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-10-13 15:54:34 +02:00
Akilesh	3813a8e3a1	Add VideoMAE video processor (#41534 ) * Add video processor for VideoMAE * Document VideoMAE video processor * Add regression tests for VideoMAE video processor * refactor: Use direct batch key access for pixel_values_videos * test: add parity test for VideoMAEVideoProcessor vs VideoMAEImageProcessor * docs(videomae): update model docstring example to demonstrate VideoMAEVideoProcessor (TorchCodec-based decoding and sampling)	2025-10-13 15:42:27 +02:00
Joao Gante	d621be8286	🚨 [v5] `generate` delegates default cache initialization to the model (#41505 )	2025-10-13 13:20:48 +01:00
Sai-Suraj-27	58f9e13313	Fixed Type-hints in function defintions (#41525 ) * Explicitly annotate default None parameters as Optional * make style. * make style. * Fixed check_copies. * fix consistency.	2025-10-13 11:48:37 +02:00
Yoni Gozlan	eb28242251	Add MLlama fast image processor (#41391 ) * Merge conflict * add fast processor * add fast processor * make style * add new convert rgb * use nested group by shape in mllama fast, add support for multiple inputs in group by shape * refactor after review --------- Co-authored-by: Vincent <phamvinh257@gmail.com>	2025-10-13 09:16:05 +00:00
Yih-Dar	3927ffed31	[testing] reduce runtime of `HunYuanMoEV1IntegrationTest:test_model_generation` (#41373 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-10 22:27:01 +02:00
Marc Sun	feca4f3de7	remove `tpu_num_cores` (#41383 ) * remove-tpu-num-cores * fix * let's remove it * style * Update examples/legacy/seq2seq/finetune_tpu.sh Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-10-10 15:53:28 +02:00
Cyril Vallez	c6042a4169	Remove outdated flags (#41512 ) remove flags	2025-10-10 14:34:47 +02:00
Marc Sun	f9f8bf5a10	Revert `local_rank` deletion and some cleaning (#41504 ) * forgot those * clean * Fix * merge * fix * fix	2025-10-10 12:23:04 +02:00
eustlb	c5094a4f97	[voxtral] language detection + skipping lang:xx (#41225 ) * proc + doc update * improve doc * add lang:xx in decode * update voxtral test * nit * nit * update test value * use regex	2025-10-10 09:18:30 +00:00
Yao Matrix	f4487ec521	fix gemma3n case failure (#41426 ) * fix gemma3n case failure Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update dependency_versions_table.py * change the case argument passing way to make the case PASS, generation_config way need re-visit Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-10-10 09:15:27 +00:00
Cyril Vallez	e8194fe84f	Fix some tests (#41503 ) * fix * fix * doc	2025-10-10 11:05:09 +02:00
Joao Gante	9556b36b2f	[causallm tester] automate pipeline mappings + bloom tests (#41318 )	2025-10-10 10:02:00 +01:00
Lysandre Debut	17c31a98ac	Streaming should be handled at the request-level rather than at the istance level (#41444 ) * Streaming should be handled at the request-level rather than at the instance level * Add tests * Require torch GPU	2025-10-10 10:24:55 +02:00
Marc Sun	0419ff881d	Remove `local_rank` arg from `TrainingArguments` (#41382 )	2025-10-09 18:54:12 +02:00
Marc Sun	081391b20e	deprecate `jit_mode_eval` (#41376 )	2025-10-09 18:50:45 +02:00
Marc Sun	776eea8612	deprecate `overwrite_output_dir` (#41323 ) * dep * style * rm * wut * style	2025-10-09 18:36:19 +02:00
Marc Sun	1a3a5f5289	Remove SigOpt (#41479 ) * remove sigopt * style	2025-10-09 18:05:55 +02:00
Marc Sun	823fab4860	Fix bnb fsdp loading for pre-quantized checkpoint (#41415 ) * fix * fix * get_param_name * fix device name	2025-10-09 18:05:35 +02:00
Jacob Kahn	0eae41ad36	Add Code World Model (CWM) (#41199 ) * [wip][cwm] Code World Model stubs and setup in HF Transformers * [wip] Get other things working * [wip] Working * Tokenizer pad * fix: cwm window attn * temp remove test * temp remove test * Fixes * Temporarily add auto config remapping option until VLLM 0.11 is out * Fix model type and add layer validation * Lint, remove CwmForSequenceClassification * Lint, tests * Remove CwmForSequenceClassification * Lint * Remove intermediary layer expors/doc errorss, fix tests * Lint * run python utils/sort_auto_mappings.py --check_only * Remove Cwm processor mapping, get check_repo passing * Remove CwmTextConfig from test * Add docstring for CwmConfig * remove global_window and window_pattern params from config * Fix docstrings * Revert change to auto docstring util * lint * Fixes minus test improvements * Alter tests to simply check logits * lint * Have slow tests use repo, make CwmPretrainedModel passthrough * Remove decoder layer implementation, use Llama3Decoder + CwmAttetion * Use linear w/o bias for CwmAttention, add token-level integration test * Don't ignore config attention bias * Remove attention bias parameter entirely from config --------- Co-authored-by: galco <galco@meta.com>	2025-10-09 17:57:45 +02:00
YangKai0616	26b5b52676	[Fix] Fix test file error (#40973 ) Fix test file error	2025-10-09 15:30:53 +00:00
Anton Vlasjuk	34b861abd1	🚨 [`Attention Masks`] Bidirectional masks for encoder and encoder-decoder models (#41265 ) * new masks * fixes * adjust comments * fix unnecessary mask creation on sdpa * simplify masks more * propogate to other models * style + repo consistency * copies * no comment * fix attempt * finally fix grounding dinos * fix distilbert * fix executorch * move to own module * address first few comments WIP * revert device comments, simplify executorch further * fix typo * add a test for cuda graphs * move cleanup... * fix conflict with new main * fix esm and evolla	2025-10-09 16:56:11 +02:00
Marc Sun	b44d91570f	[v5] remove load_in_4bit and load_in_8bit (#41287 ) * [v5] remove load_in_4bit and load_in_8bit * fix * reveert * fix --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-10-09 16:34:04 +02:00

5649 Commits