transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-10-20 17:13:56 +08:00

Author	SHA1	Message	Date
Yih-Dar	307c523854	further improve `utils/check_bad_commit.py` (#41658 ) (#41690 ) * fix * Update utils/check_bad_commit.py Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>	2025-10-17 23:07:00 +02:00
Yuanyuan Chen	12a50f294d	Enable FURB rules in ruff (#41395 ) * Apply ruff FURB rules Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Enable ruff FURB rules Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Revert changes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-17 15:00:40 +00:00
Cyril Vallez	75da795d8f	🚨 Remove torch.fx support (#41683 ) * remove all * fix comments * better checks * doc	2025-10-17 16:12:46 +02:00
Raushan Turganbay	10de06dace	🚨 [v5] Refactor RoPE for layer types (#39847 ) * update * batch update model code * typos * too many diffs, dump * dump again * another dump * fix copies * make `rope_scaling_dict` self attr * fix a few more tests * another update * fix a few more tests, hopefully last ones * fox copies * fix copies again * fix newly added models, I hate rebasing on main * update config files * modular files * fix rope utils test * docstring has to be indented more, why? * oops forgot to update some modualr files * copy from doesn't copy decorators? * fix overriden test as well * add a new test * fix failing tests again * update docstrings * fix phi3 * fix two models * fix copies * forgot to add * stupid bug from modular conversion * fix slow tests * update to call rotary emb once per model forward * 3K tests failing?! * update * update more models * fix copies * fix the rest of tests hopefully * fix after rebase * fix the rope tests * fix docs omni * change a bit * models with layer types * why it was deleted? * fix a few tests * fix last test! * delete extra empty lines * add a test case * more changes * fix models * typing hint for nested rope params * missed when resolving conflicts * delete layer types and fix typo * fix copies * fix copies * update docs text * docs * huuge update all models * fix copies * rename attr to align with new format * delete redundant rope tests * trigger ci * update the case * this is why i hate rebasing * maybe fixed? * oops * now fix? * fix last tests and copies * fix copies? * fix minimax and gemma3n * update typo * deprecation end version * final fix copies :fingers-crossed: * oh my, add the docs in toctree * oke, this is really the last fix * fix copies and hope that tests won't start failing again * use rope scaling if saved * fix slow tests * fix cwm and unrelated deepseek * fix last * update * hope it works now, it took so long * lets keep None for now, I will try to remove after checking tests * some more fixes, i find and replace does not always find all cases * last fix of tests * arthur's comment for extra foreward kwargs * delete unused code * fix slow qwen tests * delete layer types from models * faulty modular conversion * fix qwen omni * fix copies and style * address my comment --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-17 14:57:27 +02:00
Cyril Vallez	0b3aef1da9	🚨 Remove torchscript support (#41688 ) * remove a lot * remove the rest * doc	2025-10-17 13:38:27 +02:00
Yih-Dar	6344371a91	improve `utils/check_bad_commit.py` (#41658 ) * robust * robust * robust --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-16 15:51:19 +00:00
Yuanyuan Chen	9e99198e5e	Use \| for Optional and Union typing (#41646 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-16 14:29:54 +00:00
Lucain	af2a66ced9	Migrate transformers cli to Typer (#41487 ) * Add typer-slim as explicit dependency * Migrate CLI to Typer * code quality * bump release candidate * adapt test_cli.py * Remove ./commands + adapt tests * fix quality * consistency * doctested * do not serve model in chat * style * will it fix them? * fix test * capitalize classes * Rebase * Rebase * tests + fixup tests + fixup * csutom error message * fix ? * should be good * fix caplog globally * inner caplog * last attempt * Retry * Let's try with capsys disabled --------- Co-authored-by: Lysandre <hi@lysand.re>	2025-10-16 13:29:42 +02:00
Yih-Dar	dc6fdeb705	Update a dataset reop link (#41618 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-15 14:41:38 +02:00
regisss	d7c9fbdb64	Enable modular files from other libraries (#41372 ) Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-10-13 13:48:32 +02:00
Pablo Montalvo	b92fc0c6e1	[QoL] modular conversion shows LoC saved (#41500 ) smol qol conversion	2025-10-10 11:55:23 +02:00
Yuanyuan Chen	2b5e4c0d13	Import Callable from collections.abc (#41130 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-09 12:12:43 +00:00
Raushan Turganbay	d1c6310d6a	🚨 [v5] Rendundant code in nested configs (#41314 ) * batch update models * delete even more * fix modular super init location * fix * fix copies * fix again, these have force-set values in configs * fix copies	2025-10-09 13:47:44 +02:00
Joao Gante	f50fd7fb6b	[v5] rm `utils/tf_ops/` (#41402 ) rm utils/tf_ops/	2025-10-09 10:27:47 +01:00
Marc Sun	13791d8f48	[v5] Bump min version of bitsandbytes to 0.46.1 (#41283 ) * bump bitsandbytes to 0.46.1 * huge cleanup * style * fix * req * fix * importerror * fix	2025-10-08 12:04:26 +02:00
Paul Pak	0c9a72e457	[Model] Lfm2Moe (#41401 ) * [new-models] LFM2-MoE Signed-off-by: Paul Pak <paulpak58@gmail.com> * [docs] add in template lfm2_moe doc files Signed-off-by: Paul Pak <paulpak58@gmail.com> * [configuration] update configuration class Signed-off-by: Paul Pak <paulpak58@gmail.com> * [modular][lfm] minor: fix rotary_emb typo Signed-off-by: Paul Pak <paulpak58@gmail.com> * [modeling] modular/modeling files for Lfm2Moe Signed-off-by: Paul Pak <paulpak58@gmail.com> * [modeling][lfm2_moe] fix Lfm2Moe modular/modeling Signed-off-by: Paul Pak <paulpak58@gmail.com> * [configuration][lfm2_moe] update configuration keys with latest config changes Signed-off-by: Paul Pak <paulpak58@gmail.com> * [misc] make fixup Signed-off-by: Paul Pak <paulpak58@gmail.com> * [modular][lfm2_moe] address comments: dtype, mlp, buffers Signed-off-by: Paul Pak <paulpak58@gmail.com> * [configuration][lfm2_moe] add initializer_range Signed-off-by: Paul Pak <paulpak58@gmail.com> * [modular][lfm2_moe] include init_weights to pass test_initialization Signed-off-by: Paul Pak <paulpak58@gmail.com> * [tests][causal_lm] include pos_emb as possible rope attribute Signed-off-by: Paul Pak <paulpak58@gmail.com> * [modeling][lfm2_moe] remove load_balancing_loss_func due to lack of support for hooking expert biases Signed-off-by: Paul Pak <paulpak58@gmail.com> * [misc] make style Signed-off-by: Paul Pak <paulpak58@gmail.com> * [modeling][lfm2_moe] MoE refactor PR update in LFM2Moe Signed-off-by: Paul Pak <paulpak58@gmail.com> * [tests] lfm2_moe: unit tests Signed-off-by: Paul Pak <paulpak58@gmail.com> * [misc] update LFM2-8B-A1B repo id Signed-off-by: Paul Pak <paulpak58@gmail.com> * [tests] lfm2: update ModelTests for lfm2 Signed-off-by: Paul Pak <paulpak58@gmail.com> * Update LFM2 documentation Updated the LFM2 documentation to reflect the addition of a new model size and clarified architectural details. * Add Lfm2Moe documentation Add Lfm2Moe model documentation with overview and example usage. * [misc] fix ci Signed-off-by: Paul Pak <paulpak58@gmail.com> * [docs] remove trust_remote_code Signed-off-by: Paul Pak <paulpak58@gmail.com> * [misc] ci: fix modular Signed-off-by: Paul Pak <paulpak58@gmail.com> * reapply modular * simplify * remove static address and inplace op * simplify * simplify a bit more the modular * imports --------- Signed-off-by: Paul Pak <paulpak58@gmail.com> Co-authored-by: Maxime Labonne <81252890+mlabonne@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-10-07 15:09:58 +02:00
Yuanyuan Chen	fa36c973fc	Remove unnecessary list comprehension (#41305 ) Remove unnecessary comprehension Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-06 14:49:02 +00:00
Arthur	0452f28544	[`ModularChecker`] QOL for the modular checker (#41361 ) * update * fancy table fancy prints * download to cache folder, never need it everagain * stule * update based on review	2025-10-06 12:52:10 +02:00
Cyril Vallez	163601c619	Standardize `PretrainedConfig` to `PreTrainedConfig` (#41300 ) * replace * add metaclass for full BC * doc * consistency * update deprecation message * revert	2025-10-06 11:34:02 +02:00
Raushan Turganbay	5339f72b9b	🚨 [unbloating] unify `TypedDict` usage in processing (#40931 ) * just squash commits into one * fix style	2025-10-03 14:17:59 +02:00
Pablo Montalvo	cd4422922e	Add modular detector (#41289 ) * doc * doc * no remote code * safe-ize the release + remove remote * fixes * add some documentation as well	2025-10-03 14:11:10 +02:00
Yih-Dar	e1f1d32af0	Remove some previous team members from allow list of triggering Github Actions (#41263 ) * delete * delete --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-10-02 16:32:28 +02:00
Yuanyuan Chen	1cc9069551	Fix unnecessary single-item container checks (#41279 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-02 12:35:11 +00:00
Yuanyuan Chen	1d91a8a454	Use max/min (#41280 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-02 12:15:27 +00:00
Arthur	7938e91faa	MoE + vllm = 😻 (#40132 ) * update modeling mixtral * oups[13;2u * fix * better naming? * compute softmax and top_k inside the experts * update minamax as well * models that will need an update * more models that need a fix * stash * fix mixtral * update olmoe * update * update * current changes * nits * molmoe is now fixed * olmoe is good to go! * refactor qwen2_moe * fixes * fixed moe * fix qwen2 modular * nit * qwen2_moie test script works * tricky rope ! * fix qwen3 * DeepSeek v3 MoE Standardization (#40538) * DeepSeek-v3 Shared Shared * Dependents of DS3 * Standardize GLM4V MoE (#40539) * up * Standardize VitPose's MoE (#40549) * VitPose * outside * outside * outside * fix * update dbrx * dbrx... the magix * Refactor Ernie 4.5's MoE (#40547) * Isolate Ernie fixes * fix moe --------- Co-authored-by: Vasqu <antonprogamer@gmail.com> * fix style * style * fix copies * style * latest changes * fixes * had to stage * current updaters * up * another modular * modular graniteMoe * some update * draft another modular moe * updaters * up * fix nit * q3 nit * fix phi moe * we're going up up up up its our mooooment * fix switch transformers this time around * up * gptsan japanese is deprecated forget about it * fix mixtral to not be a linear (gives us more freedom) * update * fix copies gone wrong try catch nothing * fix mixtral * new refactor again * update aria as well * up dbrx and deepseekv3 * nit * fix phimoe? * fix deepseek v3 * nits * don't bother with this one please * up olmoe * ?? * fix olmoe * yups * fiupx * ish * hot patch * new qwen3 * updates * up * nit * fix copies * fix * nits * we're going up up up * nits * switch_transformesr edge case * lol modular gptsan? * fix deepseek * finally all modeling match modular * update * up * up * dang * up * up aria * fix dbrx * nits here and there * finish fixing dbrx * fix deepseek * upd * up * fix flex olmo * updated * update jamba * JAMBA is stil a bit todo * forward forward * fix dots11 * update * fix hunyuan * fix some other * update phimoe * fuck you phimoe you are now submitted * submit granitemoe as well * try to fix some other models, reduces some of the failures * fix olmoe and qwem2moe * up * up * fix qwen2_moe * update modular make it again, simpler * nits * up * up * fix * someswitch reductions * up * fix qwen3vl * some fixes to jetmo * these should be shipped to the modular to fix jetmoe * fix most of the nllb failures * more nllb fixes * fix the modular * remove nllb modular as it sucks for now * ? * fix granitemoe * granitemoehybrid don't have rope * use rope when rope, no rope when no rope * updates * finish fixing dumbgrainite * fix most of minimax * fix * update modular * ? * up * up jetmoe still broken * up * fix, now align the moe * fix jetmoe * fix styling and qwen3 repo consitency * updatge * up up * update ruff? * nits * modeling is goot now for switch * fix * more fixses to switch! * fix some siwtch test * ? * ? * up * fix switch modular! * nit? * uip * subtest * can't believe I wasted so much time on this... * fix * updates * nits * nit jamba is fucking annoying * ? * fix? * oups * good good * styling * up * make sure qwen2 sliding works! * fix dbrx small * lol * nits * fix one test * fix load balancing loss issue * fix jamba * fix nllbmoe * fix jamba consistency and doc? * up * thse are correct * up * up * up * some of the final cleanup * update * up * fix some revert in granimoe * bring back attention multipliers for the granite family we'll see later on if they need removal * small jamba fix docstring and typing * fix phimoe * yup * fix unk returndict in granitemoes * up * fix qwen config * fix phiemoe check quality * nits * update based on caught non relative imports! * fix dbrx * Apply suggestions from code review Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> * fix copies * fiuxp * fix dot1 regression! * fix phimoe issue * fix phi moe * fix float() for some models * fix jamba regression * ui * more dtype issues * fix deepseek2 and 3? * proper update * fix modular deepseek! * jamba jambaaaaaa --------- Co-authored-by: Lysandre Debut <hi@lysand.re> Co-authored-by: Vasqu <antonprogamer@gmail.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-10-02 12:12:44 +02:00
Yuanyuan Chen	30b79effb5	Remove SageMakerTrainer (#41267 ) * Remove SageMakerTrainer Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More removal Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-02 09:16:32 +00:00
Yuanyuan Chen	e4913bdf50	🚨 [v5] Remove SinkCache (#41107 ) Remove SinkCache Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-01 13:46:55 +00:00
Yuanyuan Chen	3016717f0d	Use removeprefix and removesuffix (#41240 ) * Use removeprefix and removesuffix Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * More fixes Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-10-01 13:13:04 +00:00
Yuanyuan Chen	ca975f1cb8	[V5] Remove deprecated transformers.onnx (#41214 ) * Remove deprecated transformers.onnx Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> * Remove onnx docs Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> --------- Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-10-01 12:17:04 +00:00
Joao Gante	1d1ac07893	[repo utils] Update `models_to_deprecate.py` (#41231 ) * update models_to_deprecate * exclude this file * handle typos and aliases * don't commit files * PR suggestions; make fixup	2025-10-01 12:01:52 +00:00
Yoni Gozlan	bd37c45354	Add EdgeTAM (#39800 ) * initial comment * test * initial conversion for outline * intermediate commit for configuration * chore:init files for sam2 * adding arbitary undefined config * check * add vision * make style * init sam2 base model * Fix imports * Linting * chore:sam to sam2 classes * Linting * Add sam2 to models.__init__ * chore:match prompt encoder with sam2 code * chore:prepare kwargs for mask decoder * Add image/video predictors * Add CUDA kernel * Add output classes * linting * Add logging info * tmp commit * docs for sam2 * enable image processing * check difference of original SAM2 - difference is the order of ToTensor() - please see https://pytorch.org/vision/main/_modules/torchvision/transforms/functional.html#resize * enable promptencoder of sam2 * fix promprencoder * Confirmed that PromptEncoder is exactly same (Be aware of bfloat16 and float32 difference) * Confirmed that ImageEncoder is exactly same (Be aware the linting of init) * Confirmed that MaskDecoder is exactly same (TO DO: lint variable name) * SamModel is now available (Need more chore for name) * make fix-copies * make style * make CI happy * Refactor VisionEncoder and PostioinEmbedding * TO DO : fix the image_embeddings and sparse_embeddings part * pure image inference done * reusable features fix and make style * styling * refactor memoryattention * tmp * tmp * refactor memoryencoder TO DO : convert and inference the video pipeline * TO DO : fix the image_encoder shape * conversion finish TO DO: need to check video inference * make style * remove video model * lint * change * python utils/check_docstringspy --check_all * python utils/check_config_attributes.py * remove copies for sam2promptencoder due to configuration * change __init__.py * remove tensorflow version * fix that to not use direct comparison * make style * add missing import * fix image_embedding_size * refactor Sam2 Attention * add fully working video inference (refactoring todo) * clarify _prepare_memory_conditioned_features * simplify modeling code, remove unused paths * use one model * use auto_docstring * refactor rope embeddings * nit * not using multimask when several points given * add all sam2.1 * add video tmp * add Sam2VideoSessionState + fast image proc + video proc * remove init_states from model * fix batch inference * add image integration tests * uniformize modeling code with other sam models and use modular * pass vision tests an most model tests * All tests passing * add offloading inference state and video to cpu * fix inference from image embedding and existing mask * fix multi_boxes mask inference * Fix batch images + batch boxes inference * improve processing for image inference * add support for mask generation pipeline * add support for get_connected_components post processing in mask generation * add fast image processor sam, image processor tests and use modular for sam2 image processor * fix mistake in sam after #39120 * fix init weights * refactor convert * add integration tests for video + other improvements * add needed missing docstrings * Improve docstrings and * improve inference speed by avoiding cuda sync * add test * skip test for vision_model * minor fix for vision_model * fix vision_model by adding sam2model and change the torch dependencies * remove patch_size * remove image_embedding_size * fix patch_size * fix test * make style * Separate hieradet and vision encoder in sam2 * fixup * review changes part 1 * remove MemoryEncoderConfig and MemoryAttentionConfig * pass q_stride instead of q_pool module * add inference on streamed videos * explicitely process streamed frames * nit * Improve docstrings in Sam2Model * update sam2 modeling with better gestion of inference state and cache, and separate Sam2Model and Sam2VideoModel * improve video inference api * change inference_state to inference_session * use modular for Sam2Model * fix convert sam2 hf * modular * Update src/transformers/models/sam2/video_processing_sam2.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix minor config * fix attention loading error * update modeling tests to use hub checkpoints * Use CI A10 runner for integration tests values + higher tolerance for video integration tests * PR review part 1 * fix doc * nit improvements * enforce one input format for points, labels and boxes * nit * last few nits from PR review * fix style * fix the input type * fix docs * add sam2 model as conversion script * improve sam2 doc * add rough necessarry changes * first working edgetam * fix issue with object pointers * Use modular as much as possible * nit fixes + optimization * refactor spatial perceiver * cleanup after merge * add working edgetam * improve perceiver resampler code * simplify/unify rope attention logic * Improve comments in apply_rotary_pos_emb_2d * add working tests * fix test timmwrapper * add docs * make fixup * nits * fix modular * fix modular * PR review part 1 * split apply_rotary_pos_emb_2d * add granularity to _prepare_memory_conditioned_features * add dates to doc * add separate mlp for memory attention * Fix memory on wrong device * store processed frames in dict * update checkpoints in tests * update dates --------- Co-authored-by: sangbumchoi <danielsejong55@gmail.com> Co-authored-by: RUFFY-369 <prakarshkaushik369@gmail.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: Haitham Khedr <haithamkhedr@meta.com> Co-authored-by: sangbum choi <sangbumchoi@sangbumui-MacBookAir.local> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-09-29 11:54:54 -04:00
Marc Sun	ad74fba085	[v5] Remove `model_parallel` deprecated feature (#41166 ) * fix * remove model parallel * style * removed a bit too much * rm comments * fix	2025-09-29 16:14:03 +02:00
Joao Gante	4fade1148f	[tests] `CausalLMTester` automatically infers other test classes from `base_model_class` 🐛 🔫 (#41066 ) * halfway through the models * update test checks * refactor all * another one * use tuples * more deletions * solve bad inheritance patterns * type * PR ready? * automatic model class inference from the base class * vaultgemma * make fixup * make fixup * rebase with gpt2 * make fixup :'( * gpt2 is special	2025-09-29 15:05:08 +02:00
Rémi Ouazan	50d2448a1a	Enable fa in amd docker (#41069 ) * Add FA to docker * Use caching mechanism for qwen2_5 * Fix a typo in important models list * Partial fixes for gemma3 * Added a commit ID for FA repo * Detailled the expectation storage format * Rebase fix * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-09-26 13:57:58 +02:00
Yoni Gozlan	53838edde7	Improve `add_dates` script (#41167 ) * utils/add_dates.py * put lfm2-vl in correct category	2025-09-25 16:00:05 -04:00
Nithin Rao	a579de7f5e	Add Parakeet (#39062 ) * first commit Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * update to handle masking for bs>1 Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * Add tests and docs Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * update model ids Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * update docs and improve style Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * update librosa location Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * import guard torch too Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * ruff code checks fix Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * ruff format check Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * updated to parakeet names Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * update script Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * Add tokenizer decoding Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * Remove other model dependency Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * clean tests Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * fix tests Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * linting Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * fix ruff lint warnings Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * move to seperate folders Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * add parakeet ctc model code Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * simplify encoder structure Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * update documentation Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * add parakeet to toctree Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * fix tests Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * add parakeet doc Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * Address comments Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * Update featurizer to compute lens directly Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * fix ruff tests Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * fix encoding format Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * fix minor ctc decoding Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> * revert modular_model_converter.py changes * revert check_config_attributes.py changes * refactor: fastconformer & parakeet_ctc -> parakeet * modeling update * test update * propagate feature extractor updates * propagate doc changes * propagate doc changes * propagate tokenization changes * propagate conversion changes * remove fastconformer tests * remove modular * update processor * update processor * tset update * diverse fixes * 100% macthing greedy batched * Update conversion script. * Refactor docs. * Reafactor auto loading. * Refactor and fix tokenization and processing. * Update integration test. * Modeling fixes: - ensure correct attention mask shape - ensure layer drop returns valid output - correct blank token ID when computing CTC loss * Format and repo consistency. * Update model doc. * Fix feature extraction tests. * Fix (most) tokenizer tests. * Add pipeline example. * Fixes * Use eager_attention_forward from Llama. * Small tweaks. * Replace Sequential with ModuleList * Add check if not all layers copied * Clean tokenizer. * Standardize FastSpeech2ConformerConvolutionModule for Parakeet. * Switch to modular for modeling and processing. * Add processor tests. * Fix modeling tests. * Formating and docstrings. * Add `return_attention_mask` like other feature extractors. * clean up after merging main. * nits on modeling * configuration update * nit * simplification: use PretrainedTokenizerFast, simplify processor * add dtype arg to mel_filter_bank * feature extraction: simplify! * modeling update * change to ParakeetTokenizerFast * correct attention mask handling * auto update * proc update * test update * feature extraction fixes * modeling update * conversion script update * udpate tests feature integration * update tokenization and tests * processor tests * revert audio_utils * config docstring update * blank_token -> pad_token * modeling udpate * doc update * fix tests * fix test * fix tests * address review comments * add comment * add comment * explicitly not support flash * atttention straightforward masking * fix * tokenizer update: skipping blank tokens by default * doc update * fix max_positions_embeddings handling * nits * change atol faeture extraction integration tests * doc update + fix loss * doc update * nit * update integration test for A10 * repo id name * nit --------- Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com> Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> Co-authored-by: Eric B <ebezzam@gmail.com>	2025-09-25 13:52:24 +00:00
Lucain	44682e7131	Adapt and test huggingface_hub v1.0.0 (#40889 ) * Adapt and test huggingface_hub v1.0.0.rc0 * forgot to bump hfh * bump * code quality * code quality * relax dependency table * fix has_file * install hfh 1.0.0.rc0 in circle ci jobs * repostiryo * push to hub now returns a commit url * catch HfHubHTTPError * check commit on branch * add it back * fix ? * remove deprecated test * uncomment another test * trigger * no proxies * many more small changes * fix load PIL Image from httpx * require 1.0.0.rc0 * fix mocked tests * fix others * unchange * unchange * args * Update .circleci/config.yml * Bump to 1.0.0.rc1 * bump kernels version * fix deps	2025-09-25 11:13:50 +00:00
Yih-Dar	03c92884b5	Update team member list for some CI workflows (#41094 ) * update list * update list --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-09-23 09:48:40 +00:00
Rémi Ouazan	2ec37649e2	Ci utils (#40978 ) * Add CI reports dir to gitignore * Add utils to run local CI * Review compliance * Style * License	2025-09-22 16:16:19 +02:00
BakerBunker	ebbcf00ad1	Adding support for Qwen3Omni (#41025 ) * Add Qwen3Omni * make fix-copies, import properly * nit * fix wrong setup. Why was audio_token_id renamed ? * upds * more processing fixes * yup * fix more generation tests * down to 1? * fix import issue * style, update check repo * up * fix quality at my best * final quality? * fix doc building * FINAL COMMIT: SKIP IMPORTANT BUT FAILING TESTS FOR MERGE * SKIP THE TEMPLATE ONE --------- Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.com> Co-authored-by: Arthur <arthur.zucker@gmail.com>	2025-09-21 23:46:27 +02:00
Yuanyuan Chen	767f8a4c75	Fix typoes in src and tests (#40845 ) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-09-19 13:18:38 +00:00
Ita Zaporozhets	ddfa3d4402	blt wip (#38579 ) * blt wip * cpu version * cpu friendly with full entropy model (real time patching) * adding config file instead of args file * enable MPS * refactoring unused code * single config class in config file * inherit from PreTrainedModel * refactor LMTransformer --> BLTPatcher * add conversion script * load from new checkpoing with form_pretrained * fixed demo from_pretrained * clean up * clean a few comments * cleanup folder * clean up dir * cleaned up modeling further * rename classes * adding transformers Attention class and RotaryEmbedding class * exchanged blt modules for transformers modules: attention, rotary_emb, create_causal_mask, etc * seperate out patcher config, update modeling and conversion script * rename vars to be more transformers-like * rm unused functions * adding cross attention from transformers * pass arg * rename weights * updated conversion script * overwritten commit! fixing PR * apply feedback * adding BLTRMSNorm like Llama * add repeat_kv and eager_attention_forward copied from * BLTMLP identical to MllamTextMLP * clean up some args' * more like mllama, but busier inits * BLTTransformerLayer config * decoder, encoder, global configs * wip working on modular file * cleaning up patch and configs * clean up patcher helpers * clean up patcher helpers further * clean up * some config renaming * clean up unused configs * clean up configs * clean up configs * update modular * clean * update demo * config more like mllama, seperated subconfigs from subdicts * read from config instead of self args * update demo file * model weights to causal lm weights * missed file * added tied weights keys * BLTForCausalLM * adding files after add-new-model-like * update demo * working on tests * first running integration tests * added integration tests * adding tokenization tests, integration tests, and cleaned up tokenization file, + ruff * tokenizer clean up * modular file * fixing rebase * ruff * adding correct basemodel output and updating config with checkpoint vals (for testing) * BLTModelTests git status * enabling inputs_embeds, although won't be equal to input_ids since need ids for patching logic * fix sdpa == causal tests * fix small model test and some gradient checkpointing * skip training GC tests * fix test * updated modular * update modular * ruff * adding modular + modeling * modular * more modern is_casual check * cleaning up modular * more modular reduction * ruff * modular fix * fix styling * return 2 * return 2 * fix some tests * fix bltcrossattention after modular break * some fixes / feedback * try cache generate fix * try cache generate fix * fix generate tests * attn_impl workaround * refactoring to use recent TransformersKwargs changes * fix hidden_states shape test * refactor to new outputs * simplify outputs a bit * rm unneeded decoderlayer overwriting * rename blt * forgot tokenizer test renamed * Reorder * Reorder * working on modular * updates from modular * new modular * ruff and such * update pretrainedmodel modular * using cohere2 apply_rotary_pos_emb * small changes * apply feedback r2 * fix cross_attention * apply more feedback * update modeling fix * load submodules from pretrainedmodel * set initializer_range to subconfigs * rm cross_attnetion_states pass when not needed * add 7b projection layer support * check repo * make copies * lost cohere2 rotate_half * ruff * copies? * don't tie weights for submodules * tie weights setting * check docstrings * apply feedback * rebase * rebased modeling * update docs * applying feedback * few more fixes * fix can_record_outputs * fast tokenizer * no more modulelist * tok auto * rm tokenizersss * fix docs * ruff * fix after rebase * fix test, configs are not subscriptable --------- Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-30.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-103.ec2.internal> Co-authored-by: Lysandre <hi@lysand.re> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-36.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-45.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-173-121.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-103.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-178.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-79.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-169-239.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-111.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-100.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-153.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-15.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-131.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-138.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-215.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-142.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-147.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-0.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-58.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-165-202.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-244.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-186.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-192.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-14.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-171-249.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-75.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-78.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-163-134.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-180.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-175-241.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-225.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-9.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-34.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-166-68.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-175.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-170-160.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-95.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-172-73.ec2.internal>	2025-09-19 11:55:55 +02:00
Cyril Vallez	4df2529d79	🚨🚨🚨 Fully remove Tensorflow and Jax support library-wide (#40760 ) * setup * start the purge * continue the purge * more and more * more * continue the quest: remove loading tf/jax checkpoints * style * fix configs * oups forgot conflict * continue * still grinding * always more * in tje zone * never stop * should fix doc * fic * fix * fix * fix tests * still tests * fix non-deterministic * style * remove last rebase issues * onnx configs * still on the grind * always more references * nearly the end * could it really be the end? * small fix * add converters back * post rebase * latest qwen * add back all converters * explicitly add functions in converters * re-add	2025-09-18 18:27:39 +02:00
Yih-Dar	5ac3c5171a	Track the CI (model) jobs that don't produce test output files (process being killed etc.) (#40981 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-09-18 18:27:27 +02:00
Yih-Dar	738b223f57	Add captured actual outputs to CI artifacts (#40965 ) * fix * fix * Remove `# TODO: ???` as it make me `???` * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-09-18 15:40:53 +02:00
Yih-Dar	270da89708	Remove `runner_map` (#40880 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-09-16 15:18:07 +02:00
JJJYmmm	c0dbe095b0	Adding Support for Qwen3-VL Series (#40795 ) * add qwen3vl series * make fixup * fix import * re-protect import * fix it finally (need to merge main into the branch) * skip processor test (need the checkpoint) * oups typo * simplify modular * remove unecesary attr * fix layer * remove unused rope_deltas args * reuse image def * remove unnesesary imports --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-09-15 12:46:18 +02:00
Ryan Mullins	291772b6b5	add: differential privacy research model (#40851 ) * VaultGemma * Removing Sequence and Token classification models. Removing integration tests for now * Remove pass-only modular code. style fixes * Update vaultgemma.md * Update docs/source/en/model_doc/vaultgemma.md Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * Update docs/source/en/model_doc/vaultgemma.md Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * Add links to model doc * Correct model doc usage examples * Updating model doc to describe differences from Gemma 2 * Update model_doc links * Adding integration tests * style fixes * repo consistency * attribute exception --------- Co-authored-by: Amer <amersinha@gmail.com> Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>	2025-09-12 17:36:03 +02:00
Yuanyuan Chen	6c9f412105	Fix typos in tests and util (#40780 ) Fix typos Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>	2025-09-10 11:45:40 +00:00
Joao Gante	7aaef98cbe	rm src/transformers/convert_pytorch_checkpoint_to_tf2.py (#40718 ) * rm src/transformers/convert_pytorch_checkpoint_to_tf2.py * doctest skip	2025-09-09 16:34:54 +01:00

1 2 3 4 5 ...

1202 Commits