Commit Graph

56 Commits

2b5e4c0d13 Import Callable from collections.abc (#41130)
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-10-09 12:12:43 +00:00
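
For context, this sweep moves Callable imports from typing to collections.abc: typing.Callable is deprecated by PEP 585 in favour of collections.abc.Callable, which is subscriptable since Python 3.9. A minimal illustration of the pattern (the example function is mine, not from the diff):

    from collections.abc import Callable  # preferred over `from typing import Callable`

    def apply_twice(fn: Callable[[int], int], x: int) -> int:
        # Callable[[int], int] annotates a one-argument int -> int function
        return fn(fn(x))
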
46db0edf3b 🚨🚨 Remove all traces of legacy cache format (#41378)
* remove

* more

* add back

* tests

* revert classes

* tests

* add exceptions

* reapply modular

* rename

* oupsi

* start with whisper

* fix tests

* fix

* fix

* fix

* typing
2025-10-08 11:14:44 +02:00
242eb9cbdc Remove deprecation warning (#41425)
* remove

* fix space
2025-10-07 19:21:14 +02:00
c27b67f0cd 🚨 [v5] Remove relative position embeddings (for bert like models) (#41170)
* remove from modeling files

* remaining changes

* style / copies

* revert deprecated models and fixup some models

* oops
2025-10-06 14:21:41 +02:00
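
For readers tracking the breaking change: BERT-style models expose relative position embeddings through the position_embedding_type config field, and that option is what this commit drops. A small sketch of the pre-v5 option versus the remaining default (illustrative only, not the diff itself):

    from transformers import BertConfig, BertModel

    # Pre-v5 option being removed here:
    # config = BertConfig(position_embedding_type="relative_key")   # or "relative_key_query"

    # v5: only the default absolute position embeddings remain
    config = BertConfig(position_embedding_type="absolute")
    model = BertModel(config)
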
9db58abd6e Check model inputs - hidden states (#40994)
* update all models

* fix copies

* skip aria tests

* update other models

* skip should be in test, not tester

* i think this is more descriptive as a name

* find and replace for new models
2025-10-06 11:48:52 +02:00
163601c619 Standardize PretrainedConfig to PreTrainedConfig (#41300)
* replace

* add metaclass for full BC

* doc

* consistency

* update deprecation message

* revert
2025-10-06 11:34:02 +02:00
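
The "add metaclass for full BC" bullet refers to keeping the old PretrainedConfig spelling importable while everything moves to PreTrainedConfig. A generic sketch of that backward-compatibility idea with placeholder class names (not the library's actual implementation):

    import warnings

    class NewConfig:                       # stands in for the renamed class
        pass

    class _DeprecatedAliasMeta(type):
        # Keep isinstance(obj, OldConfig) working for instances of the new class
        def __instancecheck__(cls, instance):
            return isinstance(instance, NewConfig)

    class OldConfig(NewConfig, metaclass=_DeprecatedAliasMeta):
        """Deprecated alias kept only for backward compatibility."""
        def __init__(self, *args, **kwargs):
            warnings.warn("OldConfig is deprecated, use NewConfig instead", FutureWarning)
            super().__init__(*args, **kwargs)

    assert isinstance(NewConfig(), OldConfig)   # old-style isinstance checks still pass
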
52f5eca7c9 🚨 [v5] Remove headmasking (#41076)
* first attempt at removing

* copies

* last bits in core

* quick fixes

* tests purge

* docs and examples

* some fixes

* more

* another round of cleanups

* fix

* fix a bunch of models

* fix dummy bert

* fix

* fix new model

* fix signature change

* fix

* fix style/copies

* new models

* fix copies didn't find that, damn

* test

* this shouldn't have happened during model addition
2025-09-30 16:04:57 +02:00
4df2529d79 🚨🚨🚨 Fully remove Tensorflow and Jax support library-wide (#40760)
* setup

* start the purge

* continue the purge

* more and more

* more

* continue the quest: remove loading tf/jax checkpoints

* style

* fix configs

* oups forgot conflict

* continue

* still grinding

* always more

* in the zone

* never stop

* should fix doc

* fix

* fix

* fix

* fix tests

* still tests

* fix non-deterministic

* style

* remove last rebase issues

* onnx configs

* still on the grind

* always more references

* nearly the end

* could it really be the end?

* small fix

* add converters back

* post rebase

* latest qwen

* add back all converters

* explicitly add functions in converters

* re-add
2025-09-18 18:27:39 +02:00
fd2a29d468 Fix more typos (#40627)
Fix typos

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
2025-09-08 16:05:40 +00:00
21e708c8fd Fix for missing default values in encoder decoder (#40517)
* Added default_value for is_updated and type check

* Forgot one

* Repo consistency
2025-09-01 16:11:23 +02:00
becab2c601 Use the config for DynamicCache initialization in all modelings (#40420)
* update all

* remove the most horrible old code

* style
2025-08-28 14:32:30 +02:00
ba095d387d 🧹 🧹 🧹 Get set decoder cleanup (#39509)
* simplify common get/set

* remove some noise

* change some 5 years old modeling utils

* update examples

* fix copies

* revert some changes

* fixes, gah

* format

* move to Mixin

* remove smolvlm specific require grad

* skip

* force defaults

* remodularise some stuff

* remodularise more stuff

* add safety for audio models

* style

* have a correct fallback, you daft donkey

* remove this argh

* change heuristic for audio models

* fixup

* revert

* this works

* this should be explicit

* fix Nth ESM exception

* tryout decoder

* this as well

* revert again

* 🧠

* aaah ESM has two modelings aaah

* broom broom

* format

* wrong copies

* copies

* modular cleanups

* format

* modularities

* wrong mergefix

* seriously

* align with new model

* new model
2025-08-25 10:57:56 +02:00
56c44213b3 [detection] fix attention mask for RT-DETR-based models (#40269)
* Fix get_contrastive_denoising_training_group attention

* Add bool attention_mask conversion
2025-08-19 09:15:56 +00:00
a36d51e801 🚨 Always return Cache objects in modelings (to align with generate) (#39765)
* watch the world burn

* fix models, pipelines

* make the error a warning

* remove kwargs and return_legacy_cache

* fix reformer
2025-08-18 16:26:35 +02:00
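
Background for this breaking change: the legacy cache format was a tuple of per-layer (key, value) tensor pairs, while generate works with Cache objects such as DynamicCache. A short sketch of converting between the two with the helpers available in releases around this commit (the #41378 entry at the top of this log later removes the legacy format entirely):

    import torch
    from transformers import DynamicCache

    # Legacy format: tuple of per-layer (key_states, value_states) tensors
    legacy = ((torch.zeros(1, 2, 5, 4), torch.zeros(1, 2, 5, 4)),)   # one layer, seq_len 5

    cache = DynamicCache.from_legacy_cache(legacy)   # tuple -> Cache object
    print(cache.get_seq_length())                    # 5
    roundtrip = cache.to_legacy_cache()              # Cache object -> tuple
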
6333eb986a Fix more typos (#40212)
Signed-off-by: cyy <cyyever@outlook.com>
2025-08-18 12:52:12 +00:00
5c3fb7f731 Harmonize past_key_value to past_key_valueS everywhere (#39956)
* all modulars and llama

* apply modular

* bert and gpt2 copies

* fix imports

* do it everywhere

* fix import

* finalize it

* fix

* oups set it in modular

* style

* fix

* Add 1 version to deprecation cycle

* Update modeling_layers.py
2025-08-08 11:52:57 +02:00
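
The rename is purely about the argument spelling used in forward signatures; a schematic before/after on an illustrative decoder layer (class and body are mine, not the actual code):

    class DecoderLayerSketch:
        # Before: forward(self, hidden_states, past_key_value=None, ...)
        # After:  forward(self, hidden_states, past_key_values=None, ...)
        def forward(self, hidden_states, past_key_values=None, **kwargs):
            # `past_key_values` is the cache object threaded through every layer
            return hidden_states
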
1e0665a191 Simplify conditional code (#39781)
* Use !=

Signed-off-by: cyy <cyyever@outlook.com>

* Use get

Signed-off-by: cyy <cyyever@outlook.com>

* Format

* Simplify bool operations

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-30 12:32:10 +00:00
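
The three bullets name plain-Python simplifications; a quick illustration of each (examples are mine, not the actual diff):

    config = {"hidden_size": 768}

    # "Use !=": replace `not a == b` with the direct comparison
    is_custom_width = config["hidden_size"] != 1024

    # "Use get": replace `x[k] if k in x else default` with dict.get
    num_layers = config.get("num_hidden_layers", 12)

    # "Simplify bool operations": return the expression instead of True/False branches
    def is_large(cfg):
        return cfg.get("hidden_size", 0) >= 1024
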
eb1a007f7f Rename supports_static_cache to can_compile_fullgraph (#39505)
* update all

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* apply suggestions

* fix copies

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-23 09:35:18 +00:00
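
The rename concerns a class-level capability flag on model classes; schematically (attribute spelling follows the commit title, and the exact underscore prefix used in the codebase may differ):

    class MyModelSketch:
        # Before: named after static caches
        # supports_static_cache = True
        # After: named after what it actually guarantees -- that forward can be
        # compiled with torch.compile(fullgraph=True) without graph breaks
        can_compile_fullgraph = True
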
69b158260f Refactor embedding input/output getter/setter (#39339)
* simplify common get/set

* remove some noise

* change some 5 years old modeling utils

* update examples

* fix copies

* revert some changes

* fixes, gah

* format

* move to Mixin

* remove smolvlm specific require grad

* skip

* force defaults

* remodularise some stuff

* remodularise more stuff

* add safety for audio models

* style

* have a correct fallback, you daft donkey

* remove this argh

* change heuristic for audio models

* fixup

* revert

* this works

* revert again

* 🧠

* aaah ESM has two modelings aaah

* add informative but short comment

* add `input_embed_layer` mixin attribute

* style

* walrus has low precedence

* modular fix

* this was breaking parser
2025-07-21 18:18:14 +02:00
8c102e2eb1 Rename _supports_flash_attn_2 in examples and tests (#39471)
* delete `_supports_flash_attn_2` from examples and tests

* simplify docs
2025-07-21 14:02:57 +02:00
1cefb5d788 [modular] Allow method with the same name in case of @property decorator (#39308)
* fix

* add example

* fix

* Update modular_model_converter.py
2025-07-09 15:46:53 +02:00
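
The Python pattern the converter now has to tolerate is a getter and setter sharing one method name through @property; a minimal example:

    class ImageSizeSketch:
        def __init__(self, height):
            self._height = height

        @property
        def height(self):            # getter: obj.height
            return self._height

        @height.setter
        def height(self, value):     # setter deliberately reuses the same name
            self._height = value
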
056fa73fae [modular] Simplify logic and docstring handling (#39185)
* simplify a lot

* Update modular_model_converter.py

* finalize

* remove outdated functions

* apply it

* and examples
2025-07-07 14:52:57 +02:00
5348fbc005 [modular] Follow global indexing and attribute setting, and their dependencies (#39180)
* export global indexing statements

* add example

* style

* examples
2025-07-07 14:36:43 +02:00
ca7e1a3756 Refactor the way we handle outputs for new llamas and new models (#39120)
* just update 2 files

* update other models as well just making fix-copies

* also add the changes needed to modeling utils

* put this on the pretrained model instead

* nits and fixes

* update generic, fix to use config value

* update other modelings

* use transformers kwargs instead

* update

* update

* update other models

* update

* updates

* update

* update

* update

* fix

* finally

* very small nits

* this fixes more tests

* fix other models as well!

* update modularqwen2

* update models based on qwen2

* update

* update

* remove the **flash stuff in favor of normal kwargs

* update

* propagate gemma?

* remove output attentions

* propagate

* support cross attention edge case

* same

* test this

* fixes

* more fix

* update

* update

* fix conflicts

* update

* fix emu3

* fix emu3

* move the fix a bit

* what a nightmare

* some fixes, loss_kwargs should never had been

* finish fixing gemma3n

* fix small lm3

* fix another one

* fix csm now

* fix csm and mistral

* fix mistral now

* small fixes

* fix janusss

* only for some models

* fixup

* phix phi3

* more fixes?

* does this fix it?

* update

* holy shit it was just graph breaks

* protect torch

* updates

* fix samhq?

* fix moonshine

* more moonshine fixes, 3 failures left!

* nits

* generic needs to support more

* more fixes to moonshine!

* fix cross attention outputs!

* fix csm!

* nits

* fix stupid kosmos2

* current updates

* fixes

* use output recorder?

* nicer!

* a little bit of magic

* update

* fix protect

* fix

* small fixes

* protect import

* fix a bunch of more models

* fix fixups

* fix some of the last ones

* nit

* partly fix phi

* update

* fix import path

* make something that is fullgraph compatible just to be sure

* typing was wrong on llama so the rest was wrong as well

* fucking ugly but at least it is still exportable

* style

* supposed to fix moonshine, it still breaks

* fix some default

* fix the last bits of sam

* update samhq

* more fixes to sam hq

* nit

* fix all output+hidden states and output_attentions!

* fix?

* fix diffllama

* updates to fix initialization on the sam pips

* oops, there was a bug

* fix the last sam hq test

* fix gotocr

* fix gotocr2!

* fixes

* skip stupid tests

* there was one left :)

* fixup

* fix fix copies issues with this test file

* fix copies for sam_hq

* rm some comments

* skip 2 more failing tests

* fix

* fix everything

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* add more doc!

* fix public init

* fix modular qwen3

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-07-05 11:34:28 +02:00
e1e11b0299 Fix non-deterministic order in modular dependencies (#39005)
* sort correctly

* Update modeling_minimax.py

* Update modular_model_converter.py
2025-06-24 17:04:33 +02:00
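
One common way to make a dependency order reproducible, in the spirit of the "sort correctly" bullet, is to break ties alphabetically while walking the dependency graph; a generic sketch of that idea (not the converter's actual code):

    from graphlib import TopologicalSorter

    def deterministic_order(dependencies: dict[str, set[str]]) -> list[str]:
        # Sort the ready nodes by name at every step so the result never depends
        # on dict/set iteration order.
        sorter = TopologicalSorter(dependencies)
        sorter.prepare()
        order = []
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            order.extend(ready)
            sorter.done(*ready)
        return order

    print(deterministic_order({"modeling": {"config"}, "processing": {"config"}, "config": set()}))
    # ['config', 'modeling', 'processing'] on every run
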
508a704055 No more Tuple, List, Dict (#38797)
* No more Tuple, List, Dict

* make fixup

* More style fixes

* Docstring fixes with regex replacement

* Trigger tests

* Redo fixes after rebase

* Fix copies

* [test all]

* update

* [test all]

* update

* [test all]

* make style after rebase

* Patch the hf_argparser test

* Patch the hf_argparser test

* style fixes

* style fixes

* style fixes

* Fix docstrings in Cohere test

* [test all]

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-06-17 19:37:18 +01:00
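
The sweep switches annotations to the built-in generics allowed since Python 3.9 (PEP 585); for example (the function itself is illustrative):

    # Before: from typing import Dict, List, Tuple
    #         def sizes(values: List[int]) -> Tuple[int, ...]: ...

    # After: built-in generics, with Optional kept where a None default is allowed
    from typing import Optional

    def sizes(values: list[int], scale: Optional[float] = None) -> tuple[int, ...]:
        factor = scale if scale is not None else 1.0
        return tuple(int(v * factor) for v in values)
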
de24fb63ed Use HF papers (#38184)
* Use hf papers

* Hugging Face papers

* doi to hf papers

* style
2025-06-13 11:07:09 +00:00
19224c3642 fix: "check out" as verb (#38678)
"check out" as verb
2025-06-09 14:07:31 +00:00
bf68dd9e6e [tests] expand flex-attn test for vision models (#38434)
* expand the test for VLMs

* typo

* mark models `supports_flex` + expand test for additional kwargs

* flex attn for refactored vision models

* fix copies

* fix

* unskip

* style

* address comments
2025-06-03 07:40:44 +00:00
4602059aae [modular] Fix the prefix-based renaming if the old and new model share a common name suffix (#37829)
* first try

* Fix and set examples

* style

* fix

* Update modular_test_detr.py

* Update image_processing_new_imgproc_model.py

* Update modular_model_converter.py
2025-04-29 10:43:23 +02:00
da4ff2a5f5 Add Optional to remaining types (#37808)
More Optional typing

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-28 14:20:45 +01:00
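
The change spells out implicit-Optional parameters, i.e. arguments whose default is None but whose annotation did not say so; for example (function is illustrative):

    from typing import Optional

    # Before: def crop(image, size: int = None): ...
    # After: the annotation states that None is allowed
    def crop(image, size: Optional[int] = None):
        return image if size is None else image[:size]
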
8ab296501a Remove deprecation warning for num_logits_to_keep (#37149)
* remove everything

* style
2025-04-14 19:08:45 +02:00
0fb8d49e88 Use Python 3.9 syntax in examples (#37279)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-07 12:52:21 +01:00
2b4734bd49 Support passing flash_attn_kwargs when gradient_checkpointing is enabled (#37037)
* support passing flash_attn_kwargs when gradient_checkpointing is enabled

* make modeling_deepseek_v3.py consistent with modular_deepseek_v3.py
2025-03-31 10:53:02 +02:00
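
Background: gradient checkpointing re-runs a layer through a checkpoint wrapper that traditionally forwards only positional tensor arguments, so keyword extras such as FlashAttention kwargs were silently dropped. A generic sketch of the usual fix, binding the kwargs before checkpointing (illustrative, not the library's exact code):

    from functools import partial
    import torch.utils.checkpoint as checkpoint

    def run_layer(layer, hidden_states, gradient_checkpointing=False, **flash_attn_kwargs):
        if gradient_checkpointing:
            # Bind the keyword arguments first so the checkpoint wrapper only has
            # to pass tensors positionally.
            bound_layer = partial(layer, **flash_attn_kwargs)
            return checkpoint.checkpoint(bound_layer, hidden_states, use_reentrant=False)
        return layer(hidden_states, **flash_attn_kwargs)
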
121830ab47 update examples after ruff being updated (#36972)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 18:15:47 +01:00
da4ab2a1b6 Fix doc formatting in forward passes & modular (#36243)
* fix indentation issues + modular without magic keyword

* style

* Update doc.py

* style

* Fix all decorators indentation

* all models

* style

* style

* Update doc.py

* fix

* general fix

* style
2025-02-25 11:09:01 +01:00
bc65f3fc1c [modular] Do not track imports in functions (#36279)
* Add check

* just check for function

* Update examples
2025-02-25 10:29:47 +01:00
315a9f494e Add XPU type for work-around -inf mask causing sdpa NaN issue in modeling files (#35647)
* add xpu for unmask

* change modular for generated matching

* add lastest modeling for helium
2025-02-05 13:28:31 +01:00
fa56dcc2ab Refactoring of ImageProcessorFast (#35069)
* add init and base image processing functions

* add add_fast_image_processor to transformers-cli

* add working fast image processor clip

* add fast image processor to doc, working tests

* remove "to be implemented" SigLip

* fix unprotected import

* fix unprotected vision import

* update ViTImageProcessorFast

* increase threshold for slow/fast equivalence

* add fast img blip

* add fast class in tests with cli

* improve cli

* add fast image processor convnext

* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision

* add device kwarg to ImagesKwargs for fast processing on cuda

* cleanup

* fix unprotected import

* group images by sizes and add batch processing

* Add batch equivalence tests, skip when center_crop is used

* cleanup

* update init and cli

* fix-copies

* refactor convnext, cleanup base

* fix

* remove patching mixins, add piped torchvision transforms for ViT

* fix unbatched processing

* fix f strings

* protect imports

* change llava onevision to class transforms (test)

* fix convnext

* improve formatting (following Pavel review)

* fix handling device arg

* improve cli

* fix

* fix inits

* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs

* uniformize qwen2_vl fast

* fix docstrings

* add add fast image processor llava

* remove min_pixels max_pixels from accepted size

* nit

* nit

* refactor fast image processors docstrings

* cleanup and remove fast class transforms

* update add fast image processor transformers cli

* cleanup docstring

* uniformize pixtral fast and  make _process_image explicit

* fix prepare image structure llava next/onevision

* Use typed kwargs instead of explicit args

* nit fix import Unpack

* clearly separate pops and gets in base preprocess. Use explicit typed kwargs

* make qwen2_vl preprocess arguments hashable
2025-02-04 17:52:31 -05:00
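
From the user side, the fast (torchvision-backed) processors added here are reached through the use_fast flag on AutoImageProcessor; a small usage sketch (checkpoint name illustrative):

    from transformers import AutoImageProcessor

    # use_fast=True selects the *ImageProcessorFast variant when one exists
    # for the checkpoint (CLIP, ViT, ConvNeXT, LLaVA, ... after this refactor).
    processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32", use_fast=True)
    # The refactor also adds a `device` kwarg for processing directly on GPU:
    # inputs = processor(images=image, return_tensors="pt", device="cuda")
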
9d2056f12b Add mean_resizing for every VLMs' resizing_token_embeddings() (#35717)
* refine all resize_token_embedding()

* ruff format

* hotfix
2025-02-03 15:03:49 +01:00
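
Mean resizing initializes newly added embedding rows from the statistics of the existing embedding matrix instead of a plain random init; from the user side it is a flag on resize_token_embeddings. A usage sketch (checkpoint name illustrative):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    tokenizer.add_tokens(["<special_a>", "<special_b>"])
    # mean_resizing=True draws the new rows from the mean/covariance of the
    # existing embeddings rather than from a fresh normal distribution.
    model.resize_token_embeddings(len(tokenizer), mean_resizing=True)
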
91be6a5eb2 Modular: support for importing functions from any file (#35692)
* fix function imports

* improve comment

* Update modeling_switch_function.py

* make checks more robust

* improvement

* rename

* final test update
2025-01-16 16:37:53 +00:00
09d5f76274 Clean-up composite configs (#34603)
* remove manual assignment tie-word-embeddings

* remove another unused attribute

* fix tests

* fix tests

* remove unnecessary overwrites

* fix

* decoder=True

* clean pix2struct

* run-all

* forgot `_tied_weights_keys` when adding Emu3

* also Aria + fix-copies

* and clean aria
2025-01-15 10:04:07 +01:00
46276f9a7f Fix modular edge case + modular sorting order (#35562)
* look-ahead negation

* re add examples by default

* Fix the bug in topological sort

* Update create_dependency_mapping.py

* start adding test

* finalize test

* more tests

* style

* style
2025-01-09 17:17:52 +01:00
665a4942e4 Check whether rescale is requested before checking is_scaled_image (#35439) 2025-01-07 11:39:45 +00:00
2c47618c1a 🚨All attention refactor🚨 (#35235)
* refactor LlamaAttention

* minimal changes

* fix llama

* update

* modular gemmas

* modular nits

* modular updates

* nits

* simplify

* gpt2

* more modular and fixes

* granite

* modular modular modular

* nits

* update

* qwen2 + starcoder2

* mostly gemma2

* Update image_processing_auto.py

* fix

* Update modular_starcoder2.py

* fix

* remove all copied from attentions

* remove gcv

* make fix-copies

* oups

* oups2.0

* fix some modulars + all copied from

* should be good now

* revert unwanted changes

* Update modeling_decision_transformer.py

* finish cleanup

* Update modeling_olmo.py

* consistency

* re-add gradient checkpointing attribute

* fix

* style

* make config necessary

* bis

* bis

* Update modeling_my_new_model2.py

* is_causal attr

* fix

* remove past kv return from decoder layer

* fix

* default rope config

* correctly fix rope config

* fix bias

* fix gpt2 attention output

* fix test

* fix inits

* fix default sdpa

* fix default sdpa implementation

* harmonize classes

* fix mistral

* fix sliding window models

* mixtral

* be more explicit

* style

* fix

* several fixes

* Update modeling_dbrx.py

* fix test

* olmo + phi

* rotary

* style

* phi

* phi again

* again

* kwargs

* Update test_modeling_common.py

* skip fx tracing tests

* Update modeling_utils.py

* gemma 2

* again

* Update modeling_recurrent_gemma.py

* gemma2

* granite

* style

* starcoder

* Update sdpa_attention.py

* switch args

* Update modeling_mllama.py

* fix

* cache type tests

* gpt2

* Update test_modeling_common.py

* fix

* consistency

* fix shape with encoder

* should be the last one

* tests non model

* most comments

* small oupsi

* be more explicit in modulars

* more explicit modulars

* CIs! it works locally

* add kwargs to _flash_attention_forward

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00
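
The refactor replaces per-model attention branches with shared backend functions that every model looks up by its configured implementation; a simplified, illustrative sketch of that dispatch idea (not the library's actual code):

    import torch
    import torch.nn.functional as F

    def eager_attention_forward(query, key, value, attention_mask=None, scaling=None, **kwargs):
        # Plain matmul-softmax-matmul attention with an additive mask
        scaling = scaling if scaling is not None else query.shape[-1] ** -0.5
        scores = torch.matmul(query, key.transpose(-1, -2)) * scaling
        if attention_mask is not None:
            scores = scores + attention_mask
        weights = F.softmax(scores, dim=-1)
        return torch.matmul(weights, value), weights

    ATTENTION_FUNCTIONS = {"eager": eager_attention_forward}   # "sdpa", "flash_attention_2", ...

    def select_attention(config):
        # Each layer picks one shared backend instead of carrying its own copy
        return ATTENTION_FUNCTIONS[getattr(config, "_attn_implementation", "eager")]
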
d363e71d0e 🧹 Remove deprecated RotaryEmbedding parts in the Attention layers (#34858)
* update

* style

* fix missing args

* remove last trace of old rope classes

* remove deprecated copied from

* fix copies

* trigger CIs

* post rebase clean-up

* reverse mistral

* cleanup after dropping commits

* Add comment
2024-12-11 11:16:52 +01:00
1da1e0d7f2 Support for easier multimodal use of modular (#35056)
* update modular and add examples

* style

* improve example comments

* style

* fix small logic issue for imports

* fix relative order issue when files do not make sense

* Improve comments

* trigger CIs
2024-12-04 15:13:11 +01:00
3a8eb74668 Fix support for image processors modifications in modular (#34866)
* add fix and examples

* fix camel case naming
2024-11-22 18:14:24 -05:00
e3a5889ef0 Modular fix (#34802)
* Modular fix

* style

* remove logger warning

* Update modular_model_converter.py
2024-11-19 16:08:57 +01:00
e2ac16b28a Large modular logic refactoring (#34487)
* rework converter

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* cleaning

* cleaning

* finalize imports

* imports

* Update modular_model_converter.py

* Better renaming to avoid visiting same file multiple times

* start converting files

* style

* address most comments

* style

* remove unused stuff in get_needed_imports

* style

* move class dependency functions outside class

* Move main functions outside class

* style

* Update modular_model_converter.py

* rename func

* add augmented dependencies

* Update modular_model_converter.py

* Add types_to_file_type + tweak annotation handling

* Allow assignment dependency mapping + fix regex

* style + update modular examples

* fix modular_roberta example (wrong redefinition of __init__)

* slightly correct order in which dependencies will appear

* style

* review comments

* Performance + better handling of dependencies when they are imported

* style

* Add advanced new classes capabilities

* style

* add forgotten check

* Update modeling_llava_next_video.py

* Add priority list ordering in check_conversion as well

* Update check_modular_conversion.py

* Update configuration_gemma.py
2024-11-01 10:13:51 +01:00