Compare commits

...

1135 Commits

Author SHA1 Message Date
28c9541c1c Attention Quantization with FBGemm & TP (#37384)
* fix

* keep fused

* contiguous

* rm print

* update

* update

* rm print
2025-04-10 17:27:47 +02:00
56512077ad use rms_norm_eps for the L2Norm for Llama4 (#37418)
use `rms_norm_eps`
2025-04-10 17:26:22 +02:00
ac9c7f37a5 mark llama4 as not supported with fa2 (#37416) 2025-04-10 17:24:54 +02:00
894783c902 Fix Llama4 offset (#37414)
* add +1

* Update modeling_llama4.py
2025-04-10 17:24:45 +02:00
eae06ff6d4 v4.51.2 2025-04-09 20:27:06 +02:00
3148a41fa8 the fix that did not get in (#37370)
* debugging improvements

* add debugging details

* add more debugging details

* debug more

* the fix that did not get in

* First fix flex

* fix query offset

* fix flex first

* fix device mask creation for speed

* small mask creation sdpa

* Update flex_attention.py

* remove chunked prefill from HybridChunkedCache

* never seen such a fucked up merged

* clean up layers + output

* add summary json file

* Efficient general cache

* Update cache_utils.py

* cleanup

* fix?

* fix!

* oups typo

* not everywhere

* more fixes

* revert unrelated changes

* Fix but ugly for now -> should use pad instead

* oups

* re-initialize the cache

* Use pad to simplify

* style

* correct slicing

---------

Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-04-09 20:26:49 +02:00
10baffb599 Multiple llama4 fixe (#37353)
* update for fixes

* more fixes

* fuxix dynamic cache?

* style

* fix both traiining and generating. Eager seems alright

* dynamic does not work

* fix most cases, use_cache or not, eager or not, no default cache (ex: not training but you want to get cache states)

* should be final fixes

* fix more stuff no cat

* style

* fix

* style

* final sytle

* qualityeioiwhjfaopsejdpofqsdjkfjha;wesdhgfkjlqsw.denghjkaswednkgs

* fix

* revert
2025-04-08 11:15:06 +02:00
4a88ffae40 v4.51.1 2025-04-08 00:27:58 +02:00
f19aec737e Fixing flex attention for torch=2.6.0 (#37285)
* adding compile kwarg for torch 2.6

* fixing dynamic

* addressing comment

* typo

* Update src/transformers/integrations/flex_attention.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-04-08 00:22:21 +02:00
d8f0695e84 more fixes for post-training llama4 (#37329)
* more fixes for post-training llama4

* use target_length instead of guearded past_key_values
2025-04-08 00:22:17 +02:00
d27c8c38f4 Remove HQQ from caching allocator warmup (#37347)
Update modeling_utils.py
2025-04-08 00:22:07 +02:00
04c0cedcdf fix derived berts _init_weights (#37341)
* fix derived berts

* more

* roformer
2025-04-08 00:21:44 +02:00
4f536ba0ae Fix init empty weights without accelerate (#37337)
* add the integration

* Update accelerate.py

* Update accelerate.py

* add find_tied_params as well

* Update accelerate.py

* add where copied from

* simplify

* add error
2025-04-08 00:21:36 +02:00
6b82af0a5b Fix deepspeed with quantization (#37324)
* Update modeling_utils.py

* Update modeling_utils.py
2025-04-08 00:21:32 +02:00
2bf3d4aca8 fix llama4 training (#37319) 2025-04-08 00:21:28 +02:00
a79b7abede fix flex attn when optional args aren't passed (#37327) 2025-04-08 00:21:24 +02:00
0720e206c6 Release: v4.51.0 2025-04-05 22:03:17 +02:00
25b7f27234 Add llama4 (#37307)
* remove one of the last deps

* update fast image processor after refactor

* styling

* more quality of life improvements

* nit

* update

* cleanups

* some cleanups

* vllm updates

* update fake image token

* [convert] Fix typo

* [convert] Strip extraneous bytes from shards

* [convert] Minor fixes

* [convert] Use num_experts

* multi-image fixes in modeling + processor

* fixup size

* 128 experts

* Use default rope

* Unfuse mlp

* simplify a lot inputs embeds merging

* remove .item() 👀

* fix from review

* Address feedback

* Use None "default" for rope_scaling. Add eot.

* set seed

* return aspect ratios and bug fixes

* Moe 128 rebased (#8)

* 128 experts

* Use default rope

* Unfuse mlp

* Address feedback

* Use None "default" for rope_scaling. Add eot.

* Meta/llama quant compat (#7)

* add quant compatible model & conversion code for llama4

* fix a few issues

* fix a few issues

* minor type mapping fix

---------

Co-authored-by: Lu Fang <fanglu@fb.com>

* use a new config parameter to determine which model definition to use for MoE

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Lu Fang <fanglu@fb.com>

* un-comment write_tokenizer from converting script

* remove un-used imports

* [llama4] Pop aspect_ratios from image processor output in Llama4Processor

Signed-off-by: Jon Swenson <jmswen@gmail.com>

* Fix parameter_count name

* Update src/transformers/models/llama4/configuration_llama4.py

* nit

* Add changes for no_rope, moe_layers, chunked attention. Just need to test all

* Update src/transformers/models/llama4/image_processing_llama4_fast.py

* nit

* fix post merge with main

* support flex attention

* fixes

* fix

* add layer

* small updates

* rebase and delete llm_compressor

* nit

* [llama4/mm] Add back <|image|> token that delimits global tile

* [llama4/mm] Fix Llama 4 image processing unit tests

* add explicit dtype

Signed-off-by: Jon Swenson <jmswen@gmail.com>

* sdpa works

* comment todo small

* fix model loading

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* revert

* nits

* small fix for TP on 1 node

* Read new params from config

* Add <|eom|>

* lol don't know how this got here

* adding fp8

* Save processor, fix chat template

* style

* Add boi/eoi tokens

We don't use them.

* fixes for now flex seems to work :)

* updates

* nits

* updates

* missking keys

* add context parallel

* update

* update

* fix

* nits

* add worldsize and make eager attn work for vision

* Ignore new key present in base models

* add tp_plan

* fix nope

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* minor fix

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* Clean up Llama4 vision model

* current updates

* add support for `attn_temperature_tuning`

* add floor scale

* add missing attn scales

* push what works, dirty trick for the device synch

* oups

* Fix pad_token_id

See
https://huggingface.co/ll-re/Llama-4-Scout-17B-16E/discussions/2/files
Confirmed in the original codebase.

* fix causallml loading

* rm

* fix tied-weights

* fix sdpa

* push current version

* should work with both short and long

* add compressed_tensos & fix fbgemm tp

* Fix flex impl

* style

* chunking

* try to revert the potentially breaking change

* fix auto factory

* fix shapes in general

* rm processing

* commit cache utils cleanup

* Fix context length

* fix

* allocate

* update tp_plan

* fix SDPA!

* Add support for sparse `Llama4TextMoe` layer from the kernel hub

* cleanup

* better merge

* update

* still broken fixing now

* nits

* revert print

* Write max_position_embeddings and max_model_length

* Update modeling_llama4.py

* Save attention_chunk_size

* Sync eos terminators

* Read initializer_range

* style

* remove `dict`

* fix

* eager should use `chunked_attention_mask`

* revert

* fixup

* fix config

* Revert "Merge pull request #36 from huggingface/sparse-llama4-moe"

This reverts commit ccda19f050867dd42ea143c5de60f3dec81375f0, reversing
changes made to a515579aed8c0fe9bf529b6c40446a289406d5d6.

* Fix typo and remove warning with compiled flex and chunked prefill

* Fix MoE vs FF (#41)

* fix

* Use correct no_rope_layers if provided one is empty list

* update tests

* fix

* skipping some tests

* fix fp8 loading

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* fix text geneartion pipeline

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>

* eager needs 4D mask

* fix

* Some cleanup

* fix

* update

* fix

* replace correctly module

* patch

* modulelist

* update

* update

* clean up

* Don't move to `cuda:0` in distributed mode

* restrict to compressed tensors for now

* rm print

* Docs!

* Fixes

* Update docs/source/en/model_doc/llama4.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fixes

* cuda graph fix

* revert some stuff

* fixup

* styling

* Update src/transformers/models/llama4/modeling_llama4.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* commit licence, cleanup here and there and style

* more styling changes

* fix dummies

* fix and clean docstrings

* remove comment

* remove warning

* Only fast image processor is supported

* nit

* trigger CI

* fix issue with flex encoder

* fix dynamic cache

* Code quality

* Code quality

* fix more tests for now

* Code quality

* Code quality

* Nuke bunch of failing stuff

* Code quality

* Code quality

* cleanup removal of slow image processor

* ruff fix fast image processor

* fix

* fix styling

* Docs

* Repo consistency

* Repo consistency

* fix sliding window issue

* separate llama cache

* styling

* Repo consistency

* Repo consistency

* push waht works

* L4 Repo consistency

* Docs

* fix last last alst alst alst alstsaltlsltlaslt

---------

Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Keyun Tong <tongkeyun@gmail.com>
Co-authored-by: Zijing Liu <liuzijing2014@users.noreply.github.com>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Zijing Liu <liuzijing2014@gmail.com>
Co-authored-by: Jon Swenson <jmswen@gmail.com>
Co-authored-by: jmswen <jmswen@users.noreply.github.com>
Co-authored-by: MekkCyber <mekk.cyber@gmail.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>
Co-authored-by: Yong Hoon Shin <yhshin@meta.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: drisspg <drisspguessous@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Lysandre <hi@lysand.re>
Co-authored-by: Ye (Charlotte) Qi <ye.charlotte.qi@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-05 22:02:22 +02:00
aa40fda346 Hf Xet extra (#37305)
* Hf Xet extra

* Hf Xet extra
2025-04-05 21:06:05 +02:00
e94571580b Fix deepspeed loading (part 2) (#37306)
* fix

* Update modeling_utils.py

* Update modeling_utils.py

* oups remove print
2025-04-05 20:41:42 +02:00
84aa13dd85 Fix deepspeed loading (#37281)
* Update modeling_utils.py

* Update modeling_utils.py

* fix and remove all imports

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Update modeling_utils.py
2025-04-05 17:05:45 +02:00
0ef339ff1b Update OpenAI GPT model card (#37255)
* Update OpenAI GPT model card

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update OpenAI GPT model card: add usage examples and notes section

* Add API autodoc tags after Notes section for OpenAI GPT model

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/openai-gpt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added missing badges

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-04 15:25:16 -07:00
46d73910d5 Updated T5 model card with standardized format (#37261)
* Updated T5 model card with standardized format

* Updated T5 model card with standardized format, fixed typo

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply reviewer suggestions

* Update docs/source/en/model_doc/t5.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-04 15:23:09 -07:00
579135a2f6 Updated model card for distilbert (#37157)
* Updated model card for distilbert

* Updated the distilbert model card

* Updated model card for distilbert

* Updated the distilbert model card

* Addressed code review comments

* Addressed review comments

* fix pipeline

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-04 15:22:46 -07:00
8cd57eb731 mobilebert model card update (#37256)
* mobilebert model card update

* Updates to model card mobilebert

---------

Co-authored-by: Reshan Gomis <reshang@verdentra.com>
2025-04-04 14:28:35 -07:00
ebe47ce3e9 Fix: Unexpected Keys, Improve run_compressed, Rename Test Folder (#37077) 2025-04-04 21:30:11 +02:00
531e4fcf0e Update model card for Depth Anything (#37065)
[docs] Update model card for Depth Anything
2025-04-04 11:36:05 -07:00
a4e55fcff8 Disable delay_optimizer_creation in Trainer to support fsdp2 (#37147)
* github why you do this

* fix

* make fixup

* disable cpu offload test

* fixup

* tmp reworks

* git branch movement

* make fixup

* add require_fsdp_v2_version

* dep issues

* update ruff and fixup
2025-04-04 20:11:37 +02:00
878562b68d fix test device spec relative path importing issue (#37190)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-04 18:22:55 +02:00
8ebc435267 Fix llava_onevision tests (#37280)
* Fix llava_onevision tests

* Trigger tests
2025-04-04 15:03:38 +01:00
ad3d157188 [RoPE] abstract dynamic RoPE update under a decorator (#37249)
* dynamic rope decorator

* longrope; shorter fwd pass

* propper docstring

* make fixup
2025-04-04 14:27:28 +01:00
3d40bda30e Hugging Face Hub pin to v0.30.0 for Xet (#37166) 2025-04-04 14:58:22 +02:00
acbcb5d07d [Tests] flaky test_constrained_beam_search_generate_dict_output (#37276) 2025-04-04 13:38:42 +01:00
4ba0989eab Clarify error message to ensure min 28x28 image supplied for Qwen 2.5 VL (#37264)
fix: clarify error message for min 28x28 images

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-04 12:53:38 +01:00
352ec8ef22 pin specific natten version in docker file (#37274)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-04 13:47:16 +02:00
edd345b52e Fix deprecated PT functions (#37237)
* Fix deprecated PT functions

Signed-off-by: cyy <cyyever@outlook.com>

* Revert some changes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-04 12:31:11 +01:00
b016de1ae4 Fix utils/check_bad_commit.py (#37272)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-04 12:18:20 +02:00
f74d7da836 Introduce modular files for speech models (#35902)
* WAV_2_VEC_2 to WAV2VEC2

* added modular files for hubert, wavlm, wav2vec2_bert, data2vec_audio

* remove unnessary definitions in modulars

* added modular files for UniSpeech, UniSpeechSat, Wav2Vec2Conformer

* docstring fix for UniSpeechForCTC

* removed unneccessary re-definition of modular classes

* reverted lazy imports change on modular_model_converter, type-alias for Wav2Vec2BaseModelOutput

* top-level import of deepspeed in seamless_m4t, speecht5

* avoid tracking imports inside classes, relocate lazy deepspeed, peft imports in their original locations

* convert modular

* tiny modular typing fixes

* some more modular fixes

* make style

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-04-04 11:46:27 +02:00
d130cd0e16 update error msg (#37207) 2025-04-04 10:21:30 +02:00
41b9b92b52 [qwen-vl] fix image processor (#37258)
* fix

* add test
2025-04-03 19:48:56 +02:00
8dd0a2b89c Update model card for electra (#37063)
* Update ELECTRA model card with new format

* Update ELECTRA model card with new format

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* close hfoption block

---------

Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:45:35 -07:00
15ac2b6ac5 Update Model Card for ModernBERT (#37052)
* Modify Model Card for ModernBERT.

* Update as per code review.

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update model card.

* Update model card.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:14:02 -07:00
b552708694 chore: Update model doc for code_llama (#37115)
* Update code_llama.md

aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598

sub part of https://github.com/huggingface/transformers/issues/36979

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* make changes as per code review

* chore: make the function smaller for attention mask visualizer

* chore[docs]: update code_llama.md with some more suggested changes

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* chore[docs] : Update code_llama.md with indentation changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:09:41 -07:00
2b84831a93 Update model card for Cohere (#37056)
* Update Cohere model card to follow standard template

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update cohere.md

Update code snippet for AutoModel, quantization, and transformers-cli

* Update cohere.md

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 09:51:40 -07:00
2d46a08b63 Purge unused ModelTester code (#37085)
* Purge correctly this time

* Remove more methods from recent PRs

* make fixup
2025-04-03 17:48:35 +01:00
1b29409d89 feat: updated model card for qwen_2.5_vl (#37099)
* feat: updated model card for qwen_2.5_vl

* applied suggested change 1

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* applied suggested change 2

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* applied suggested change 3

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: made requested changes for quantization and notes

* suggeested model card change 4

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card wiht suggested change 5

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card wiht suggested change 6

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card wiht suggested change 7

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feat: applied requested changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 09:13:26 -07:00
8a828a747e Add Optional to types (#37163)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-03 16:38:01 +01:00
3f6af96732 Adding links to ShieldGemma 2 technical report (#37247) 2025-04-03 16:26:29 +01:00
9a1c1fe7ed [CI] green llama tests (#37244)
* green llama tests

* use cleanup instead

* better test comment; cleanup upgrade

* better test comment; cleanup upgrade
2025-04-03 14:15:53 +01:00
782d7d945d Allow flexible generation params arg when checking pipeline specs (#37211)
* Allow flexible generation params arg

* Trigger tests

* Add docstring and rename js_generate to hub_generate
2025-04-03 13:29:36 +01:00
afafb84b59 Add support for fast image processing in image-pretraining example (#37021)
* Add support for fast image processing in image-pretraining example

Fix typo: correct tuple formatting in IMAGE_PROCESSOR_MAPPING_NAMES

Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>

* Use fast image processor by default

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>

---------

Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-03 13:26:46 +01:00
34ccfebf32 Fix AST parsing when looking for remote code imports (#37245)
* Not all Call.func nodes have id because they can be methods

* Trigger tests

* Trigger tests
2025-04-03 13:00:51 +01:00
f697b3f824 enable 2 types of case on XPU (#37198)
enable 2 types of case on XPU 1. test_resize_tokens_embeddings_with_deepspeed_multi_gpu 2. test_resize_embeddings_untied_with_deepspeed_multi_gpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-03 11:37:55 +02:00
2099287a59 [CI] lazy loading external datasets (#37218) 2025-04-03 09:57:45 +01:00
a0803a9555 [tests] fix mamba integration simple inference precision issue (#37193)
* fix precision issue

* use float32
2025-04-03 10:38:03 +02:00
6ce238fe7a Fix test (#37213)
* Update test_modeling_common.py

* style
2025-04-03 10:24:34 +02:00
12048990a9 Add new dim to num_items_in_batch if necessary (#36967)
* Add new dim to `num_items_in_batch` if necessary

* Unsqueeze only in the DP case

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-03 09:57:03 +02:00
98601cc818 [Phi4] add multimodal chat template (#36996)
* phi4 chat template

* remove from valid kwargs
2025-04-03 09:52:09 +02:00
c9302c0983 Fix static cache export (#37229)
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-04-03 07:05:57 +02:00
2056287940 Updated model card for Qwen2 (#37192)
* Update qwen2.md

* Update qwen2.md

* Update qwen2.md

* Update qwen2.md

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update qwen2.md

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 18:10:41 -07:00
3e96a0c32b Update falcon model card (#37184)
* feat: updated model card for falcon

* fix:rewrite model description

* fix: add link to conversion script

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: Add suggested changes

* fix: typo in link for quantization

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: fix indent and close ticks

* fix: add indent

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 17:30:37 -07:00
199d7adf10 Updated the model card for CLIP (#37040)
* Update clip.md

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Incorporated suggested changes

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 14:57:38 -07:00
126abe3461 More ReDOS fixes! (#36964)
* More ReDOS fixes!

* Slight regex cleanup

* Cleanup regex replacement

* Drop that regex entirely too

* The regex didn't match config.json, let's make sure we don't either

* Cleanup allowed_value_chars a little

* Cleanup the import search

* Catch multi-condition blocks too

* Trigger tests

* Trigger tests
2025-04-02 18:46:14 +01:00
3d133cc557 Stop DOSing the Hub in the CI (#37209)
* As the title suggests, stop hammering the same files

* make fixup

* Use shutil instead of pathlib
2025-04-02 17:19:33 +01:00
e90d55ebcc [Tests] add min_new_tokens to prevent flaky length checks (#37175) 2025-04-02 15:24:00 +01:00
cbfa14823b No more dtype_byte_size() (#37144)
* No more dtype_byte_size()

* Remove function once again

* Fix rebase cruft

* Trigger tests
2025-04-02 14:58:38 +01:00
7613cf1a45 Add py.typed (#37022) 2025-04-02 14:17:27 +01:00
32c12aaec3 [3/N] Use pyupgrade --py39-plus to improve code (#36936)
Use pyupgrade --py39-plus to improve code

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:16:06 +01:00
764ab0d46a Merge tensor operations with device transfer operations (#37097)
* Merge operations with to

Signed-off-by: cyy <cyyever@outlook.com>

* Use dtype

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:15:23 +01:00
c94c6ed397 Fix some code annotation typos. (#37102)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-02 14:00:41 +01:00
e94d607c8b fix: Add 'image-text-to-text' to TASK_MAPPING (#37107)
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-02 14:51:03 +02:00
adfc91cd46 Try to avoid/reduce some remaining CI job failures (#37202)
* try

* try

* Update tests/pipelines/test_pipelines_video_classification.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
6f5dc9c82e Fixes DynamicCache export issues due to control flow and inplace modifications (#36652)
* Remove unnecessary masked_fill in deberta models

* Enable some code when exporting but not compiling

* add missing import

* style

* replace if by torch.cond

* style

* use numel

* style

* add unit tests

* style

* change empty value for dynamic cache

* replace != [] by numel()

* fix import issue

* style
2025-04-02 12:04:40 +01:00
a165458901 Add device workaround for int4 weight only quantization after API update (#36980)
* merge

* fix import

* format

* reformat

* reformat

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-02 12:42:22 +02:00
ed95493ce0 Skip code 307 in RequestCounter (#36953)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:35:46 +02:00
211e4dc9a4 [chat-template] fix video loading (#37146)
* fix

* add video

* trigger

* push new iamges

* fix tests

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:27:50 +02:00
800510c67b [doc] Fix link for Quark quantization page (#37179)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-01 20:57:38 +02:00
41f5c3216c Revert #37031 (#37178)
Update modeling_utils.py
2025-04-01 19:48:15 +02:00
bc2dea3f54 Fix meta state dict loading with quantizers (#37136)
Update modeling_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-01 18:45:58 +02:00
35253076f4 Avoid pipeline test failing related to Hub call (#37170)
* cls

* cls

* cls

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-01 18:22:45 +02:00
bf41e54fc8 Fixes the inconsistency of the optionality of attention_mask (#37153)
* debugging issue 36758

* debugging issue 36758

* debugging issue 36758

* updated attn_mask type specification in _flash_attention_forward

* removed pdb

* added a blank line

* removed indentation
2025-04-01 15:31:10 +01:00
3249c5dc15 Refactor attention for SigLIP based models (#36981)
* Update Siglip attention implementation

* Update tests for Siglip

* Remove one level of indentation

* Update test to be more specific

* Fixup

* Idefics2

* Idefics3

* Emu3

* SmolVLM

* Phi4 (just init small update)

* Idefics2 (test fix)

* Update siglip2 tests

* Update eager

* trigger

* Clean up

* Transfer inputs to device in test

* Fixing test

* Fixing test

* Revert contiguous

* Remove unused is_flash_attn_2_available

* Move flaky to specific models
2025-04-01 15:37:25 +02:00
24e311f42b fix XPU UT error case brough by RNG difference btw XPU and CUDA (#37121)
* fix XPU UT error case brough by RNG difference btw XPU and CUDA

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Revert "enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu"

This reverts commit 3ef83a4f0204642daa45fda56e8aca1afed24b4f.

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-01 13:52:55 +01:00
897ff9af0e [ModernBERT] Never save 'reference_compile' config; should be set based on end user (#36305)
* Never save 'reference_compile' config; should be set based on end user

* Reformat (I ran 'make style' from the wrong env)

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-04-01 14:14:39 +02:00
c0bd8048a5 Make canine model exportable by removing unncessary complicated logic (#37124) 2025-04-01 12:31:12 +01:00
60b75d99b6 Only count num items in batch when needed (#36867)
only count num itels when needed
2025-04-01 12:30:39 +02:00
fac70ff3c0 Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses (#36736)
* make _VALID_DICT_FIELDS as a class attribute

* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00
ae34bd75fd Use public export API on torch 2.5 and future (#36781)
Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-01 10:47:38 +01:00
8f6b27eb5c enable test_assisted_decoding_in_different_gpu test on XPU (#37120)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-01 11:22:59 +02:00
737cbd2109 Fix llava xpu tests. (#37130)
* fix llava 4bit xpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix llava 4bit xpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:10:13 +02:00
3a6ab46a0b add gpt2 test on XPU (#37028)
* add gpt2 test on XPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* auto dtype has been fixed

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* convert model to train mode

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-01 11:09:29 +02:00
4b13a02920 Fix std initialization in Idefics variants (#37100)
* Nit 😅

* Another one

* fix

* run ci

* revert change
2025-04-01 09:18:54 +02:00
786d9c5ed9 Fix more inefficient PT operations (#37060)
* Fix inefficient operations

* Remove cpu() call

* Reorder detach()

* Reorder detach()

* tolist without detach

* item without detach

* Update src/transformers/models/rag/modeling_rag.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/encodec/test_modeling_encodec.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Use detach().cpu().numpy

* Revert some numpy operations

* More fixes

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
a1e389e637 Refactor return_dict logic to remove complicated if/else paths (#36794)
* SAM

* CLIP

* SigLIP

* GOT-OCR2 (depends on SAM)

* SigLIP2 (depends on SigLIP)

* trigger tests

* Fix SAM

* Fix missed indexing, use named attributes

* Llama

* Aria

* Bamba

* Update llama: missed outputs return type

* (fixup) Aria

* DiffLlama

* Emu3

* Gemma

* Gemma2

* Paligemma

* Fix paligemma

* Gemma3

* GLM

* Helium

* JetMoe

* Jamba

* Mistral

* Mistral

* Mixtral

* Nemotron

* Olmo

* Olmo2

* Persimmon

* Phi

* Phi3

* PhiMoe

* Qwen2

* Qwen2_moe

* StableLM

* Starcoder2

* Add return_dict decorator

* SAM

* Update decorator: compile, export, trace - friendly

* Llama (decorator)

* SAM (decorator)

* Add decorator `can_return_tuple`

* Llama

* Update to decorator

* Update CLIP

* Update decorator to store `_is_top_level_module` in self

* Update decorator to correctly handle compile/export

* Remove is_torchdynamo_compiling constraint, all work fine with self attribute assignment

* Typing

* GPT NeoX

* Fixup

* Fix attribute Granite

* Fix return type mixtral

* Update Gemma3

* Fix Cohere amd Cohere2

* Fixup

* Fix corner case for Phi4, when activation is shared

* (fix-copies) deepseekv3, phi4

* Fixup

* Apply to qwen3/qwen3_moe

* Fix
2025-03-31 16:23:37 +01:00
f304318f5f Remove low_cpu_mem_usage and _fast_init (#36963)
* Remove low_cpu_mem_usage and _fast_init

* Update deepspeed.py

* Update modeling_utils.py

* remove the first 2 tests everywhere

* Update test_modeling_common.py

* remove what was remaining about fast_init

* fix logic and simplify

* mismatched keys logic update

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix 2 models init_weights

* extend to others

* remove grad

* Update modeling_fsmt.py

* init weights in tests

* style

* Update test_modeling_fsmt.py

* more old models

* fix more init_weights

* copies

* fix

* style

* Update modeling_lxmert.py

* fix inits

* more and more

* more

* should finalize

* style

* Update modeling_dinov2_with_registers.py

* fix

* Update modeling_encoder_decoder.py

* fix

* style

* Update modeling_lxmert.py

* post rebase cleanup

* Update modeling_informer.py

* back to start for device

* fix

* add test to detect all failing cases correctly

* Update test_modeling_common.py

* fix

* fix

* sam

* style

* Update modeling_maskformer_swin.py

* CIs

* CIs

* remove test - will add it on separate PR

* fix

* fix

* Update modeling_sam.py

* CIs

* CIs

* CIs

* convnext

* suggestions

* CIs

* fix copies after merge

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-31 17:18:43 +02:00
8805600406 [qwen3] fix generation tests (#37142)
* do not skip tests

* fix qwen3-moe as well

* fixup

* fixup
2025-03-31 16:33:41 +02:00
e686fed635 [Feature] Support using FlashAttention2 on Ascend NPU (#36696)
* [Feature] Support using flash-attention on Ascend NPU

* Fix qwen3 and qwen3_moe moduler conversion mismatch
2025-03-31 16:12:58 +02:00
a03cee7a1d skip (#37141)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-31 15:38:40 +02:00
3b07ca78bb Export T5 (encoder-decoder) to ExecuTorch (#36486)
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-03-31 12:10:26 +02:00
475664e2c6 [tests] remove cuda-only test marker in AwqConfigTest (#37032)
* enable on xpu

* add xpu support

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:53:02 +02:00
0710e9b1e8 Create and Expose SamVisionModel as public for better accessibility (#36493)
* move encoder below

* auto modeling

* write SamVisionTester

* fix vision attention shape

* fix SamVisionTest

* minor changes to SamVisionTest

* Revert "fix vision attention shape"

This reverts commit d2a4083ae5704716e33351aed03af8f3cc45f3ae.

* fix attention output shape in new tests

* remove encoder examples

* run modular on got_ocr2

* code formatting

* fix got_ocr2

* ruff fixes

* code quality

* add sam_vision in auto modeling and auto configuration

* remove composite test

* updated index.md

* add TFSamVisionEncoder to __init__

* fix public TFSamVisionEncoder

* remove outdated todo comment

* set test_torch_exportable

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* rename: VisionEncoder -> VisionModel

* bring back original SamVisionEncoder

* rename back: VisionEncoderOutput -> VisionModelOutput

* undo changes in SamModelTester

* reuse SamVisionEncoder in SamVisionModel

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-31 11:45:07 +02:00
f99c279d20 Remove deprecated code (#37059)
* Remove deprecated code

* fix get_loading_attributes

* fix error

* skip test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 11:15:35 +02:00
d1efaf0318 RWKV: fix mask warning typo (#37114)
rwkv: fix mask warning typo
2025-03-31 11:07:51 +02:00
19919689b2 Fix Gemma3 embedding scaling (#37109)
fix gemma3 embedding
2025-03-31 11:04:02 +02:00
d0b65bb479 [MLU] Fix FA2 check error, remove deepspeed-mlu deps. (#36159)
* add Cambricon MLUs support

* fix mlu device rng state

* up for quality check

* up mlu to support fp16

* fix mlu device dependency error

* fix mlu device dependency error

* enable mlu device for bf16

* fix mlu device memory tracker

* Cambricon support SDPA and flash_attn

* MLU devices : Checks if `mlu` is available via an `cndev-based` check which won't trigger the drivers and leave mlu

* Fix mlu FA2 check. Remove deepspeed-mlu check. add mlu tests support.

* fix testing errors.

* Merge branch 'hf/main' into main

* fix get_device_count error.

* fix mlu testing utils.

* fix code quality and style.

* switch to @require_torch_multi_accelerator
2025-03-31 11:02:49 +02:00
ad63d20dff fix whisper re-compile (#36712)
* fix whisper re-compile

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix copy

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix copies

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert useless changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-31 11:01:51 +02:00
286393fbb1 enable tp on CPU (#36299)
* enable tp on CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* get rank from cpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable TP tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* em print

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix model id

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix conflict

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix index and add doc

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-03-31 10:55:47 +02:00
4705b04c74 Fix 4090/ada not detected as having FP8 support (#37067)
fix 4090/ada not detected as having FP8 support

Signed-off-by: Qubitium <qubitium@modelcloud.ai>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 10:53:48 +02:00
2b4734bd49 Support passing flash_attn_kwargs when gradient_checkpointing is enabled (#37037)
* support passing flash_attn_kwargs when gradient_checkpointing is enabled

* make modeling_deepspeek_v3.py consistent with modular_deepseek_v3.py
2025-03-31 10:53:02 +02:00
bd41b9c1ac Gaudi: Fix the pipeline failed issue with hpu device (#36990)
* Gaudi: fix the issue of is_torch_hpu_available() returns false

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Fix make fixup

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Add comments for the implicit behavior of import

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/utils/import_utils.py

* Update src/transformers/utils/import_utils.py

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-03-31 10:23:47 +02:00
6acd5aecb3 Adding Qwen3 and Qwen3MoE (#36878)
* Initial commit for Qwen3

* fix and add tests for qwen3 & qwen3_moe

* rename models for tests.

* fix

* fix

* fix and add docs.

* fix model name in docs.

* simplify modular and fix configuration issues

* Fix the red CI: ruff was updated

* revert ruff, version was wrong

* fix qwen3moe.

* fix

* make sure MOE can load

* fix copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-03-31 09:50:49 +02:00
0d6a60fe55 🌐 [i18n-KO] Translated qwen2_vl.md to Korean (#36750)
* fix: manual edits

* fix: resolve suggestions

* Update toctree.yml
2025-03-30 15:00:27 -07:00
b7fc2daf8b Kenlm (#37091)
* kenlm

* kenlm

* kenlm

* kenlm

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 21:42:54 +01:00
bab605dd04 [Cache] rename dtype attribute 🚨 🚨 (#37044)
* yoink

* same pattern in all cache
2025-03-28 19:08:02 +01:00
9fd9476005 [generate] beam search -- fix output cropping (#37080)
* handle jagged beams

* better comment

* bart -- beam search tests print special tokens

* more bart test updates

* more tests!

* better comment
2025-03-28 18:57:51 +01:00
257bc670fb fixed typo. (#37057)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-28 17:12:14 +00:00
2bea6bf24e Fix AttentionInterface following feedback (#37010)
* up

* typo

* update doc

* Update attention_interface.md
2025-03-28 18:00:35 +01:00
a86dad56bc Fix state_dict map location when quantized (#37086)
* Update modeling_utils.py

* Update modeling_utils.py
2025-03-28 17:57:16 +01:00
d6064754ea Update w/ new account (#37084)
* Update w/ new account

* DS
2025-03-28 12:43:00 -04:00
581cf96e0c fix tied weigths issue (#37031)
* fix

* comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 16:36:44 +01:00
eca74d1367 [WIP] add deepseek-v3 (#35926)
* init commit

* style

* take comments into account

* add deepseekv3 modeling

* remove redundant code

* apply make style

* apply fix-copies

* make format

* add init files

* rename deepseekv3 into deepseek_v3 based on its model_type

* rename deepseekv3 into deepseek_v3 based on its model_type

* deepseek-v3 not deepseek_v3

* set model_type as deepseek_v3

* use default docs

* apply make

* fill type and docstring

* add rope_config_validation

* use custom DeepseekV3MLP

* hold code only for checkpoints congifuration; remove redundant

* revise rope yarn for DeepSeek variation

* rename DeepSeek-V3

* some refactoring

* revise load_hook to work properly; make moe func trainable; use llama instead of mixtral

* fix attention forward

* use -1 for not-changing dim when to use exapnd

* refactor DeepseekV3TopkRouter

* use reshape_for_rope instead of load_hook; revise attention forward for TP; rename q_head_dim with qk_head_dim

* register pre_hook and hook both

* make style

* use n_shared_experts

* Update src/transformers/models/deepseek_v3/configuration_deepseek_v3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add test file

* update modeling_file according to modular file

* make style

* add mapping for DeepseekV3ForSequenceClassification

* remove aux_loss_alpha

* add deepseek_v3 for perf

* add deepseek_v3

* rename test as deepseekv3

* use tiny-deepseek-v3

* remove DeepseekV3ForSequenceClassification

* cache before padding

* remote output_router_logits

* Revert "remote output_router_logits"

This reverts commit f264f800d04950390db8413b9efb24cef8186330.

* remove output_router_logits

* make e_score_correction_bias as buffer

* skip tests not compatible

* make style

* make e_score_correction_bias as buffer

* use rope_interleave instead of load_hook

* skip tests not compatible with MLA

* add doc for rope_interleave

* fix typo

* remove torch.no_grad for selecting topk

* fix post merge issue

* mrege with main and simplify

* nits

* final

* small fixes

* fix

* support TP better

* stash

* changes currently requires

* remove synch

* more fixes for TP

* temp fix for TP : some attention layers's FP8 scales are too small + shared is local colwise and anything is local if FP8 because weights are used

* updates to have generation work!

* push most of the changes

* reorder functions + call for contributions!

* update readme

* nits

* update

* ruff was updated on main

* merge with main and fix copies

* revert unrelated changes

* route all tokens to all experts when testing to avoid no gradient iddues

* finish fixing all tests

* fixup

* nit

* clean config

* last readme changes

* nit

* do cnit

* typo

* last nit

* one more one more

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-165-131.ec2.internal>
2025-03-28 15:56:59 +01:00
52cc204dd7 [blip-2] Fix dtype mismatch when keep in fp32 (#37068)
* fix fp32 BLIP2

* no need to reorder that

* check for `Noneness` as well before casting dtype
2025-03-28 15:52:11 +01:00
aa3778afc2 Change deprecated PT functions (#37041)
Change deprecated functions
2025-03-28 14:26:22 +00:00
c90e6e9625 Fix some typos about benchmark scripts. (#37027)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-28 14:10:20 +00:00
1fcaad6df9 Use lru_cache for tokenization tests (#36818)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-28 15:09:35 +01:00
jp
3af425d4c6 fix: AttributeError: 'LlavaProcessor' object has no attribute 'image_token_id' (#37026)
* Add image_token_id and video_token_id handling in Llava processors

* fix: image to video

* fix: correct image and video token ID handling in Llava processors

* fix: improve image and video token ID handling in Llava processors
2025-03-28 10:46:24 +01:00
064cd7cdac Fix SDPA implementation in Qwen2-VL (issues with torch==2.6.0) (#36891)
* fix sdpa implementation

* ruff

* also modify 2_5 for consistency
2025-03-28 09:54:21 +01:00
348f3285c5 fix: Fully remove legacy cache from Llama (#36958)
* bug: fully remove legacy cache from Llama

* bug: fix CI issues

* bug: update jetmoe model

* bug: apply =check_modular_conversion.py= fix

* bug: apply make fix-copies

* bug: fix ruff

* PR suggestions

* Remove trailing commas in auto-gen files

* Trivial new line removal
2025-03-27 17:22:44 +00:00
d6b3c7486b fixed typo (#37036) 2025-03-27 15:37:53 +00:00
6cc9c8d7d1 Remove deprecated batch_size parameter (#37007) 2025-03-27 15:01:56 +00:00
4cc65e990f Replace default split function with jnp.split() in flax models (#37001)
Replace split with jnp's split function for flax models (#36854)
2025-03-27 14:59:57 +00:00
41a0e58e5b Set weights_only in torch.load (#36991) 2025-03-27 14:55:50 +00:00
de77f5b1ec Fix typing for None valued variables (#37004)
Fix typing for None-able variables
2025-03-27 14:46:32 +00:00
8c5e29bad5 Avoid unnecessary device operations in loss computing (#36950)
* Avoid unnecessary tensor copy in loss computing

* Add type
2025-03-27 14:45:14 +00:00
471cf1de63 clean pipeline question_answering. (#36986)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-27 14:35:33 +00:00
29f322d04d [generate, cache] handle more complex device maps (#37014) 2025-03-27 14:33:20 +00:00
fb8e6c50e4 [audio utils] fix fft_bin_width computation (#36603)
* fix fft_bin_width computation

* update docstring + enforce correct params

* update test with correct value

* udpate test

* update feature extractors for concerned models

* update

* make

* udpate docstring

* udpate docstring
2025-03-27 15:20:02 +01:00
e97c760006 [chat templates} support loading audio from video (#36955)
* add audio from video

* typos

* delete print

* comments
2025-03-27 14:46:11 +01:00
c7bc79bd2a Fixup for distill_any_depth conversion script (#37043)
* Fixup

* trigger
2025-03-27 13:29:25 +00:00
d1eafe8d4e Optimize to_py_obj for python-native numeric lists and scalars (#36885)
* Optimize to_py_obj for python-native numeric lists and scalars

* Fix bug that tuple is not converted to list

* Try np.array for more robust type checking

* Apply review and add tests for to_py_obj
2025-03-27 14:16:46 +01:00
0e56fb69a2 fix pegasus init weights and other copied models (#36844)
* fix pegasus init weights

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix the rest of models

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix informer init

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* init weight before checking

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix roformer tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix roformer tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-03-27 14:14:30 +01:00
7e813f9cf0 Add Distill Any Depth (#36614)
* Added conversion Script

* Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Updated Conversion Script

* Update src/transformers/models/depth_anything/convert_distill_any_depth_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-27 13:10:03 +00:00
92429057d9 Skip FP8 linear tests For device capability < 9.0(#37008)
* skip fp8 linear

* add capability check

* format
2025-03-27 12:38:37 +01:00
279c2e302a remove redundant code in trainer (#36994)
* Update optimization.py

* Update optimization.py
2025-03-27 11:35:15 +01:00
d13c390d01 Mark 2 tests as flaky for now (#37038)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-27 10:59:47 +01:00
d6d930a64b [Modeling] Load FP8 safetensors such as DeepSeek (#36828)
support loading fp8

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-03-27 10:47:10 +01:00
927ce1d39f Fix PixtralProcessor patch_size when spatial_merge_size is used (#37019) 2025-03-27 10:46:23 +01:00
49b5ab6a27 Support QuestionAnswering Module for ModernBert based models. (#35566)
* push ModernBertForQuestionAnswering

* update ModernBertForQuestionAnswering

* update __init__ loading

* set imports for ModernBertForQuestionAnswering

* update ModernBertForQuestionAnswering

* remove debugging logs

* update init_weights method

* remove custom initialization for ModernBertForQuestionAnswering

* apply make fix-copies

* apply make style

* apply make fix-copies

* append ModernBertForQuestionAnswering to the pipeline supported models

* remove unused file

* remove invalid autoload value

* update en/model_doc/modernbert.md

* apply make fixup command

* make fixup

* Update dummies

* update usage tips for ModernBertForQuestionAnswering

* update usage tips for ModernBertForQuestionAnswering

* add init

* add lint

* add consistency

* update init test

* change text to trigger stuck text

* use self.loss_function instead of custom loss

By @Cyrilvallez

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update modeling_modernbert.py

make comparable commit to even it out

* Match whitespace

* whitespace

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Orion Weller <wellerorion@gmail.com>
Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-03-26 21:24:18 +01:00
5b08db8844 fix transformers_cli import relative path issue (#36989)
* fix transformers_cli relative import path issue

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-26 18:45:56 +00:00
3a8ec8c467 [docs] Attention mask image (#36970)
add image
2025-03-26 10:11:34 -07:00
2b550c47b2 Remove deprecated training arguments (#36946)
* Remove deprecated training arguments

* More fixes

* More fixes

* More fixes
2025-03-26 16:44:48 +00:00
44715225e3 fix typos in the code comments and error messages (#36993)
* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments

* chore: enhance code comments
2025-03-26 16:09:48 +00:00
79d6f9fd70 Log the correct learning rate (#36973)
* fix learning rate log

* fix lr log

* add lr
2025-03-26 16:52:00 +01:00
13d36e89fe Fix device_map check for ggml files (#37003)
fix
2025-03-26 16:24:57 +01:00
021006e1b0 Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support. (#36975)
* Fix removing "cpu" from frozenset in bitsandbytes.py to allow better ROCm support.

Related to https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1573 and https://github.com/huggingface/transformers/issues/36949 , this resolves a bug in allowing ROCm/HIP support in bitsandbytes.

* Related to bitsandbytes-foundation/bitsandbytes#1573 and huggingface#36949 , this resolves a bug in the biteandbytes integration, allowing ROCm/HIP support in bitsandbytes.

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-26 16:18:08 +01:00
788e1092e9 Allow easy registration of custom attention functions (#36889)
* Update modeling_utils.py

* style

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* add to init

* Update modeling_utils.py

* style

* update

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Add some doc

* Update _toctree.yml

* readd it for tgi/vllm compat

* CIs

* CIs
2025-03-26 16:15:06 +01:00
ad5d40de9c Fix get_device_properties (#36997)
Fix remove remnant self from get_device_properties

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-26 15:46:34 +01:00
8084b26294 Fix Optional type annotation (#36841)
* Fix annotation

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-26 13:53:44 +00:00
b56d8f07e4 Install networkx==3.2.1 manually in some CircleCI jobs after #36957 (#37000)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 14:49:09 +01:00
78afa1c537 Use torch.expm1 (#36995) 2025-03-26 13:06:33 +00:00
181d453069 byebye CircleCI TF jobs (#36998)
* byebye tf jobs

* byebye tf jobs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 12:49:50 +01:00
e7139d06f5 Fix tensor dtype mismatch (#36985)
* Fix tensor dtype mismatch

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-26 10:37:46 +01:00
be37d34f44 🚨Deprecate legacy argument for image-text-to-text models and adopt new behavior by default (#36307)
* deprecate legacy argument and adopt new behavior by default

* revert back modification git
2025-03-25 17:32:17 -04:00
ab4656f6b7 update bot comment again (#36974)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 19:42:09 +01:00
ba531278ca Add ruff target-version (#36971) 2025-03-25 19:41:25 +01:00
a844297088 [docs] Fix image link (#36869)
* fix image link

* fix

* update

* fix
2025-03-25 11:34:21 -07:00
d68a91aebf Remove extra tensor clone in PyTorch code (#36748)
* Use detach().clone()

* Eliminate continuous()

* Merge clone and other calls with to

* Merge clone and other calls with to
2025-03-25 17:42:15 +00:00
121830ab47 update examples after ruff being updated (#36972)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 18:15:47 +01:00
a41677a68b Updated docker files to use uv for installing packages (#36957)
* Updated docker files to use uv pip install as uv is blazingly fast.

* Removed -y flag for uv pip uninstall.

* Passed --no-build-isolation flag

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 18:12:51 +01:00
3dce98a437 typo fixed in README_fr.md (#36951) 2025-03-25 09:29:36 -07:00
ebd2029483 Change GPUS to GPUs (#36945)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 17:25:39 +01:00
69632aadb7 Update after #36962 (#36965)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 16:16:06 +01:00
c6814b4ee8 Update ruff to 0.11.2 (#36962)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 16:00:11 +01:00
bc1c90a755 [Utils] torch version checks optionally accept dev versions (#36847) 2025-03-25 10:58:58 +00:00
80b4c5dcc9 Fix cuda index issue in cache allocator (#36937)
fix
2025-03-25 11:51:41 +01:00
0f733110a6 Support return_tensors in audio chat templates (#34601)
* add audio chat templates

* update

* update

* nit

* green ci

* we dont care about the order anymore

* clean up after rebase

* overriden tests rename

* rename shieldgemma also

* one more rename

* require_read_token

* removde images/videos

* retrigger CI flaky
2025-03-25 11:08:47 +01:00
19085c28da fix typos in the tests directory (#36932)
* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: fix typos in test codes

* chore: format codes
2025-03-25 10:49:24 +01:00
69bcb86c58 Export for Phi4-mini (#36780)
* Export for Phi4-mini

* Update tests/models/phi3/test_modeling_phi3.py

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 10:46:38 +01:00
be2c0e7bff Fixing _pre_quantization_dtype when torch_dtype is None (#36930)
fix
2025-03-25 10:43:27 +01:00
4303d88c09 Add Phi4 multimodal (#36939)
* raw start

* update

* update

* add to imports

* update

* up

* simplify configs

* clean configs

* style

* typos

* Update convert_phi4_multimodal_weights_to_hf.py

* Update convert_phi4_multimodal_weights_to_hf.py

* fix

* up

* up

* up

* Update convert_phi4_multimodal_weights_to_hf.py

* Update convert_phi4_multimodal_weights_to_hf.py

* up

* up

* up

* Update feature_extraction_phi4_multimodal.py

* up

* up

* up

* up

* up

* simplify configs

* typo

* cut code

* typo

* typo

* typo

* re

* typo

* up

* up

* up

* add tests

* fix

* fix

* Update test_modeling_phi4_multimodal.py

* up

* Update test_modeling_phi4_multimodal.py

* doc

* fix

* up

* up

* up

* up

* up

* up

* simplify

* up

* simplify

* config docstrings

* cleanup

* clean

* typo

* typo

* fix

* Update phi4_multimodal.md

* fix

* fix

* Update test_modeling_phi4_multimodal.py

* update

* simplify reshapes and permutes

* up

* simplify special tokens

* simplify processor a lot

* Update processing_phi4_multimodal.py

* Update processing_phi4_multimodal.py

* switch to fast processor

* image processor

* Update image_processing_phi4_multimodal_fast.py

* add lora extraction to converter

* Update convert_phi4_multimodal_weights_to_hf.py

* Update __init__.py

* add AudioInput type in audio_utils

* rewrite feature_extraction: support torch batched FFT

* input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values

* test update

* not mono channel warning update

* remove auto maps from processor

* kargs dispatch in processor

* simplify kwargs dispatch

* simplify merging

* remove default sampling rate

* style

* Update test_modeling_phi4_multimodal.py

* update doc

* doc

* torch only feature extractor

* make fake tokens adjustable

* Update feature_extraction_phi4_multimodal.py

* fix

* Update processing_phi4_multimodal.py

* simplify mask

* last touch

* fix copies

* style

* Update audio_utils.py

* style

* Update feature_extraction_phi4_multimodal.py

* Update __init__.py

* docstrings

* copies

* fix all checks

* back to fix-copies

* trigger CIs

* Update feature_extraction_phi4_multimodal.py

* improve tests with multimodal inputs

* trigger CIs

---------

Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-03-25 09:55:21 +01:00
47e5432805 Deprecate #36741 and map Causal to Conditional (#36917)
* deprecate the prev fix

* reword warning and update docs

* reword warning

* tests

* dont bloat `get_text_config()`
2025-03-25 09:13:56 +01:00
2b8a15cc3f Disallow Offload to disk for gguf files (#36933)
update

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-24 19:30:01 +01:00
91455c1825 Fix processor kwargs qwen2 vl (#36890)
* Fix qwen2_vl and qwen2_5_vl processors cutom images kwargs

* change version warning
2025-03-24 13:19:26 -04:00
48385aa4f4 Added support for seed in DataCollatorForWholeWordMask (#36903)
* Added support for seed in `DataCollatorForWholeWordMask`, and also wrote tests.

Also fixed bugs where the code hardcoded values for mask replacement probability and random replacement probability, instead of using the values passed by the user.

* formatting issues

* Used better way to generate seed in TF. Made tests more consistent.
2025-03-24 16:57:17 +00:00
5932606d8e More precise comment (#36935)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-24 17:03:09 +01:00
2be2984462 Fix pytorch defomr attn path (#36923)
* Fix pytorch path for DeformableAttention

* Apply for GroundingDino
2025-03-24 15:58:51 +00:00
00d077267a [2/N] Use pyupgrade --py39-plus to improve code (#36857)
Use pyupgrade --py39-plus to improve code
2025-03-24 15:42:25 +00:00
a6ecb54159 Update trainer_pt_utils.py docstrings for consistency (#36912)
* Update trainer_pt_utils.py

* update docstrings trainer_pt_utils.py for consistency

* Update src/transformers/trainer_pt_utils.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-03-24 14:46:41 +00:00
cbf924b76c Fix typos (#36910)
* fix typos

* fix typos

* fix typos

* fix typos
2025-03-24 14:08:29 +00:00
340500b1a9 Use another repo. for Mistral3 processor testing (#36925)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-24 14:36:05 +01:00
9e125d9a2e Fix Compressed tensors to_dict_diff (#36922)
fix
2025-03-24 13:06:33 +01:00
57f551c78d [chameleon] fix num image token check (#36918)
* [chameleon] fix num image token check

* embed after merging image token

* skip this also

* mistral require_read_token
2025-03-24 12:36:08 +01:00
a41e08aa19 tests: fix asyncio.wait() usage for python>=3.11 (#36898)
tests: fix asyncio.wait() usage for python>=3.7

Passing coroutings directly to `asyncio.wait()` is deprecated since
python 3.8 and removed starting from python 3.11. Instead, it's required
to explicitly wrap coroutine in the task with `asyncio.create_task()` which
first appeared in python 3.7.

We step into this issue running the following Transformers tests on a
system with python 3.11 or later (for example, Ubuntu 24.04 has python 3.12):

* `tests/trainer/test_trainer_distributed.py`
* `tests/extended/test_trainer_ext.py`

The error will be:
```
src/transformers/testing_utils.py:2380: in execute_subprocess_async
    result = loop.run_until_complete(
/usr/lib/python3.12/asyncio/base_events.py:687: in run_until_complete
    return future.result()
src/transformers/testing_utils.py:2368: in _stream_subprocess
    await asyncio.wait(
...
E           TypeError: Passing coroutines is forbidden, use tasks explicitly.

```

See: https://docs.python.org/3.10/library/asyncio-task.html#asyncio.wait
See: https://docs.python.org/3.10/library/asyncio-task.html#asyncio.wait
See: https://docs.python.org/3.7/library/asyncio-task.html#asyncio.create_task

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-24 11:53:59 +01:00
e28be7a692 [Fix] Add original_max_position_embeddings to YARN rope_scaling optional keys (#36877)
[fix] Update optional keys in _validate_yarn_parameters to include original_max_position_embeddings
2025-03-24 11:05:19 +01:00
48da44be24 Fix torch version guard at import (#36907)
fix
2025-03-24 10:33:33 +01:00
fe4ca2f4a7 fix Gemma3 Config (#36893)
* fix Gemma3 Config

* fix config in modular gemm3
2025-03-24 10:05:44 +01:00
c9d1e5238a Update installation.md (#36826)
* Update installation.md

* Update README.md
2025-03-21 16:32:02 -07:00
d253de6d58 [docs] Model docs (#36469)
* initial

* fix

* fix

* update

* fix

* fixes

* quantization

* attention mask visualizer

* multimodal

* small changes

* fix code samples
2025-03-21 15:35:22 -07:00
beb9b5b022 Fix Pan and Scan on batched images Gemma3 (#36864)
* process flattened images in fast image proc

* process flattened images in low proc and add tests

* remove print

* add unbalanced batch test pas image proc

* fix integration tests
2025-03-21 13:56:00 -04:00
dd3933dd65 Simplify keep_in_fp32_modules logic (#36722)
* better regex everywhere

* fix

* Update test_modeling_instructblip.py

* BC with explanations this time otherwise it makes no sense at all

* Update test_modeling_instructblip.py

* style

* CIs

* update _keep_in_fp32_modules in blip2

* Update modeling_utils.py

* Update modeling_utils.py

* style

* CIs

* add check

* trigger CIs

* Update modeling_utils.py

* trigger CIs
2025-03-21 16:12:59 +01:00
90e2df5d55 fix: loss computation after embeddings resize - mllama (#36840)
* move loss to generation class

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* code cleanup

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* test for resize and loss computation

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix tests

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix:test for resize and loss

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix resize embedding mllama test

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* review changes

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
2025-03-21 14:47:59 +01:00
4542b8fb27 push v4.51.0.dev0 2025-03-21 13:45:25 +01:00
523f6e743c Fix: dtype cannot be str (#36262)
* fix

* this wan't supposed to be here, revert

* refine tests a bit more
2025-03-21 13:27:47 +01:00
3f9ff19b4e Minor Gemma 3 fixes (#36884)
fix attention mask dtype + outputs type
2025-03-21 13:15:22 +01:00
f94b0c59f2 Use deformable_detr kernel from the Hub (#36853)
* Use `deformable_detr` kernel from the Hub

Remove the `deformable_detr` kernel from `kernels/` and use the
pre-built kernel from the Hub instead.

* Add license header

* Add `kernels` as an extra `hub-kernels`

Also add it to `testing`, so that the kernel replacement gets tested
when using CUDA in CI.
2025-03-21 13:08:47 +01:00
2638d54e78 Gemma 3 tests expect greedy decoding (#36882)
tests expect greedy decoding
2025-03-21 12:36:39 +01:00
b8aadc31d5 🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing (#36859)
* supersede paligemma forward to shift pos id indexing

* fix prepare_inputs_ as well

* fix modular error

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-03-21 12:36:27 +01:00
6321876b5b add eustlb as an actor 2025-03-21 12:32:12 +01:00
94f487626a [generate] model defaults being inherited only happens for newer models (#36881) 2025-03-21 11:01:09 +00:00
f19d018bff Revert "Update deprecated Jax calls (#35919)" (#36880)
* Revert "Update deprecated Jax calls (#35919)"

This reverts commit f0d5b2ff04e1354d32beac70984adcc8100352a0.

* Revert "Update deprecated Jax calls (#35919)"

This reverts commit f0d5b2ff04e1354d32beac70984adcc8100352a0.

* udpate
2025-03-21 11:01:44 +01:00
62116c967f Make ViTPooler configurable (#36517)
* Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output

* Add documentation and allow functions as activations (instead of just string)

* formatting change

* Use ACT2FN

* Formatting change

* Formatting changes

* force pooler_act to be string

* force pooler_act to be string

* Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy

* Making the same change in ijepa to make check_modular_conversion happy

* Add IJepaConfig to make CI happy

* rename pooler_size to pooler_output_size as defined in the config

* typo

* revert change to ignore variable

* Ran utils/check_docstrings.py --fix_and_overwrite

* revert unrelated change

* remove redundant defaults

* rename self.act -> self.activation

* tanh activation function in mapping
2025-03-21 11:01:07 +01:00
26c83490d2 chore: fix typos in the tests directory (#36813)
* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* chore: fix typos in the tests

* fix: format codes

* chore: fix copy mismatch issue

* fix: format codes

* chore: fix copy mismatch issue

* chore: fix copy mismatch issue

* chore: fix copy mismatch issue

* chore: restore previous words

* chore: revert unexpected changes
2025-03-21 10:20:05 +01:00
0adbc873d0 Remove call to .item in get_batch_samples (#36861) 2025-03-21 10:14:26 +01:00
6bb8565f0c FIX FSDP plugin update for QLoRA (#36720)
The _fsdp_qlora_plugin_updates checks for LoraConfig but other PEFT
methods can also support quantized models, e.g. VeRA. Therefore, the
isinstance check is now looking for PeftConfig in general.

Moreover, the fsdp_plugin variable may be undefined in the 2nd if
condition, leading to an `UnboundLocalError` error. This is fixed by not
assigning the variable at all.

I checked for tests that may need updating but only found
test_fsdp_config_transformers_auto_wrap associated with this change.
AFAICT, this test does not cover the changed code, since the test does
not start the training loop. Therefore, I haven't updated any tests. LMK
if/how this fix should be tested.

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-21 10:11:47 +01:00
949cca4061 [CI] doc builder without custom image (#36862)
* no image

* test

* revert jax version updates

* make fixup

* update autodoc path for model_addition_debugger

* shieldgemma2

* add missing pages to toctree
2025-03-21 09:10:27 +00:00
97d2f9d8ae Mllama: raise better error (#35934)
* fix mllama

* update test

* fix test
2025-03-21 09:35:37 +01:00
6a2627918d Refactor Aya Vision with modular (#36688)
* refactor aya_vision with modular (incorrect docstring)

* Fix docstrings

* Fix other modulars

* fix docstring

* revert changes

* add tie_weights and resize_token_embeddings
2025-03-20 15:34:56 -04:00
9e771bf402 Add support for seed in DataCollatorForLanguageModeling (#36497)
Add support for `seed` in `DataCollatorForLanguageModeling`. Also wrote tests for verifying behaviour.
2025-03-20 18:27:43 +00:00
ecd60d01c3 [CI] fix update metadata job (#36850)
fix updata_metadata job
2025-03-20 17:17:36 +00:00
42c489f2ae Gemma3: fix test (#36820)
* fix test

* require_read_token and public repo ids

* flash-attn test uncomment

* fix torchscript
2025-03-20 18:14:53 +01:00
068b663f90 [torchao] revert to get_apply_tensor_subclass (#36849)
* revert to old name

* empty commit

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-20 18:00:13 +01:00
1d3f35f30a Add model visual debugger (#36798)
* draft of model tracer visualiser

* add context manager in addition to decorator

* add debug utils to init

* move model debugging utils to dedicated file

* add documentation

* protect some imports

* format

* move and protect imports

* format

* doc: improve errors in case of broken dummy imports.

* format

* use automatic torch backend

* update doc

* fix backend

* (TEMP) move to dummies while backend wait

* update documentation

* doc
2025-03-20 17:37:29 +01:00
6515c25953 Add Prompt Depth Anything Model (#35401)
* add prompt depth anything model by modular transformer

* add prompt depth anything docs and imports

* update code style according transformers doc

* update code style: import order issue is fixed by custom_init_isort

* fix depth shape from B,1,H,W to B,H,W which is as the same as Depth Anything

* move prompt depth anything to vision models in _toctree.yml

* update backbone test; there is no need for resnet18 backbone test

* update init file & pass RUN_SLOW tests

* update len(prompt_depth) to prompt_depth.shape[0]

Co-authored-by: Joshua Lochner <admin@xenova.com>

* fix torch_int/model_doc

* fix typo

* update PromptDepthAnythingImageProcessor

* fix typo

* fix typo for prompt depth anything doc

* update promptda overview image link of huggingface repo

* fix some typos in promptda doc

* Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality.

* add copy disclaimer for prompt depth anything image processing

* fix some format typos in image processing and conversion scripts

* fix nn.ReLU(False) to nn.ReLU()

* rename residual layer as it's a sequential layer

* move size compute to a separate line/variable for easier debug in modular prompt depth anything

* fix modular format for prompt depth anything

* update modular prompt depth anything

* fix scale to meter and some internal funcs warp

* fix code style in image_processing_prompt_depth_anything.py

* fix issues in image_processing_prompt_depth_anything.py

* fix issues in image_processing_prompt_depth_anything.py

* fix issues in prompt depth anything

* update converting script similar to mllamma

* update testing for modeling prompt depth anything

* update testing for image_processing_prompt_depth_anything

* fix assertion in image_processing_prompt_depth_anything

* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/prompt_depth_anything.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/prompt_depth_anything.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* update some testing

* fix testing

* fix

* add return doc for forward of prompt depth anything

* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix prompt depth order

* fix format for testing prompt depth anything

* fix minor issues in prompt depth anything doc

* fix format for modular prompt depth anything

* revert format for modular prompt depth anything

* revert format for modular prompt depth anything

* update format for modular prompt depth anything

* fix parallel testing errors

* fix doc for prompt depth anything

* Add header

* Fix imports

* Licence header

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-20 16:12:44 +00:00
66291778dd Refactor Attention implementation for ViT-based models (#36545)
* Refactor vit attention

* Refactor ViT-based models

* 🚨🚨🚨 Fix prefix for DPT

* Update params order

* trigger tests

* Fix Dinov2 attention

* Fix DPT attention impl propagation for backbone config

* Common test fix: config is modif. inplace - avoid it

* view->reshape

* Fixup

* Fixup

* Enable IJepa FA2

* Add FA2 in corresponding model docs
2025-03-20 15:15:01 +00:00
730d2a52e7 DeepSpeed tensor parallel+ZeRO (#36825)
add ds tp change
2025-03-20 16:12:01 +01:00
1a374799ce Support loading Quark quantized models in Transformers (#36372)
* add quark quantizer

* add quark doc

* clean up doc

* fix tests

* make style

* more style fixes

* cleanup imports

* cleaning

* precise install

* Update docs/source/en/quantization/quark.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/quark_integration/test_quark.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* remove import guard as suggested

* update copyright headers

* add quark to transformers-quantization-latest-gpu Dockerfile

* make tests pass on transformers main + quark==0.7

* add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Bowen Bao <bowenbao@amd.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-20 15:40:51 +01:00
ce091b1bda Use pyupgrade --py39-plus to improve code (#36843) 2025-03-20 14:39:44 +00:00
3e8f0fbf44 Fix hqq skipped modules and dynamic quant (#36821)
* Fix hqq skip_modules and dynamic_quant

* fix skipped modules loading

* add dynamic/skip HqqConfig test
2025-03-20 15:31:49 +01:00
055afdb6bb Fix ONNX export for sequence classification head (#36332)
* set dtype to int32

* fix style
2025-03-20 14:22:48 +00:00
487dab1b2b Shieldgemma2 (#36678)
* single commit

* correct config

* fixup

* dummy pt

* Use ShieldGemma2Config in conversion script

* Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py

* Adding shieldgemma2 to models.__init__.py

* Adding ShieldGemma2 to main __init__.py

* Update shieldgemma2.md

* Update shieldgemma2.md

* Adding tests. Addressing review feedback.

* Minor docs update

* Fixing code quality feedback from CI

* Fixing empty messages bug reported by ghunkins

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Ren Pang <ain-soph@live.com>
2025-03-20 15:14:38 +01:00
a63e92e2f0 Fix: remove the redundant snippet of _whole_word_mask (#36759)
remove the redundant snippet of _whole_word_mask
2025-03-20 14:10:43 +00:00
8124a234ca Gemma 3: Adding explicit GenerationConfig and refactoring conversion … (#36833)
Gemma 3: Adding explicit GenerationConfig and refactoring conversion script
2025-03-20 15:03:32 +01:00
cf8091c017 Fix import for torch 2.0, 2.1 - guard typehint for "device_mesh" (#36768)
* Fix device_mesh

* Remove rebase leftover
2025-03-20 11:55:47 +00:00
388e6659bf Update min safetensors bis (#36823)
* update setup.py

* style
2025-03-20 12:50:07 +01:00
b47d9b2f8a [generate] clarify docstrings: when to inherit GenerationMixin (#36605) 2025-03-20 10:58:54 +00:00
8e97b44087 [modular] Sort modular skips (#36304) 2025-03-20 10:55:12 +00:00
63380b77d4 Pass state dict (#35234)
* Pass state_dict argument to get_peft_model_state_dict

* Style fix

* Change arguments order
2025-03-20 11:54:59 +01:00
957b05b413 [qwen2 audio] remove redundant code and update docs (#36282) 2025-03-20 10:54:51 +00:00
f0d5b2ff04 Update deprecated Jax calls (#35919)
* Remove deprecated arguments for jax.numpy.clip.

* Remove deprecated arguments for jax.numpy.clip.

* Update jax version to 0.4.27 to 0.4.38.

* Avoid use of deprecated xla_bridge.get_backend().platform

Co-authored-by: Jake Vanderplas <jakevdp@google.com>

---------

Co-authored-by: Jake Vanderplas <jakevdp@google.com>
2025-03-20 11:51:51 +01:00
1ddb64937c Fix fp16 ONNX export for RT-DETR and RT-DETRv2 (#36460)
* Fix FP16 ONNX export

* Fix typo

* Sync omdet-turbo

* Refactor encoder for better readability

* Fix _no_split_modules

* Fix int -> torch_int

* Fix rt_detr

* Apply to rt-detr-v2

* Fixup

* Fix copies
2025-03-20 10:43:51 +00:00
e7337ee7be Pass num_items_in_batch directly to loss computation (#36753)
* Pass num_items_in_batch directly to loss computation

* use self loss instead

* fix loss kwrgs

* fix vocab size
2025-03-20 10:35:35 +00:00
8b479e39bb Saving Trainer.collator.tokenizer in when Trainer.processing_class is None (#36552)
* feat: Saving tokenizer in collator when processing_class is None

* chore: Style issue

* chore: Typo

* dbg: Check why test failed

* dbg: Remove logics and another test failed which successed before, so should be the stablibility issue

* test: Init unit-test

* chore: Style

* chore: Add err log

* fix: Case

* Update tests/trainer/test_trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* chore: Try to use get_regression_trainer

* fix: Impl and style

* fix: Style

* fix: Case

* fix: Import err

* fix: Missed import

* fix: Import block un-sorted problem

* fix: Try another tokenizer

* fix: Test logic

* chore: Light updates

* chore: Reformat

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-20 11:27:47 +01:00
3f03c379d2 fix tiktoken convert to pass AddedToken to Tokenizer (#36566)
* pass AddedToken to Tokenizer

* ruff

* handle dict for special tokens

* option: test tokenizer from tiktoken same as fast

* ruff

* ruff
2025-03-20 11:26:49 +01:00
8f64b177f6 [ForCausalLMLoss] allow users to pass shifted labels (#36607)
* [ForCausalLMLoss] allow users to pass shifted labels

Signed-off-by: Stas Bekman <stas@stason.org>

* style

Signed-off-by: Stas Bekman <stas@stason.org>

---------

Signed-off-by: Stas Bekman <stas@stason.org>
2025-03-20 11:25:22 +01:00
94555437e2 Disable inductor config setter by default (#36608)
* Disable inductor config setter by default

This is hard to debug and should be off by default

* remove default settings in autoquant too

* Add info to torchao.md about recommended settings

* satisfying Ruff format

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-20 11:23:14 +01:00
8733297b41 Fix swanlab global step (#36728)
* fix

* global step
2025-03-20 11:13:37 +01:00
b815fae359 Move the warning to the documentation for DataCollatorWithFlattening (#36707)
Remove init warning
2025-03-20 11:09:57 +01:00
9be4728af8 Just import torch AdamW instead (#36177)
* Just import torch AdamW instead

* Update docs too

* Make AdamW undocumented

* make fixup

* Add a basic wrapper class

* Add it back to the docs

* Just remove AdamW entirely

* Remove some AdamW references

* Drop AdamW from the public init

* make fix-copies

* Cleanup some references

* make fixup

* Delete lots of transformers.AdamW references

* Remove extra references to adamw_hf
2025-03-19 18:29:40 +00:00
51bd0ceb9e Update configuration_qwen2.py (#36735)
* Update configuration_qwen2_moe.py

* Update modeling_qwen2_moe.py

* ruff fmt

* docstring add qkv_bias
2025-03-19 18:15:54 +00:00
107fedc1e2 quick fix fast_image_processor register error (#36716)
* fix fast_image_processor register error

* update error message

* remove redundant import

* fix format
2025-03-19 18:05:45 +00:00
258dd9cc69 Add Space to Bitsandbytes doc (#36834)
* add space

* address review
2025-03-19 18:56:07 +01:00
f39f4960f3 Support tracable dynamicKVcache (#36311)
* Support tracable dynamicKVcache

* Fix lint

* More fine grained test

* Lint

* Update

* Update

* Fix up

* Apply suggestions from code review

* Update src/transformers/cache_utils.py

* Update tests/utils/test_cache_utils.py

* Apply suggestions from code review

* Update

* Change error message

* Rename

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-19 16:52:30 +00:00
63c3116530 One more fix for reviewer assignment (#36829)
* one more fix

* one more fix

* Trigger tests
2025-03-19 16:25:24 +00:00
7c233980f4 [gemma 3] multimodal checkpoints + AutoModelForCausalLM (#36741) 2025-03-19 15:04:19 +00:00
b11050d6a2 enable OffloadedCache on XPU from PyTorch 2.7 (#36654)
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model

* follow Marc's suggestion to use _tie_weights to fix

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* enable OffloadedCache on XPU since PyTorch 2.7

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* don't change bart

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* make code more concise per review comments

Signed-off-by: N <matrix.yao@intel.com>

* fix review comments

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* Revert "fix review comments"

This reverts commit acf1484b86c7cc58b2dee69e7008c0eeb4c97b1b.

* fix review comments

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* fix style

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: N <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-19 15:15:52 +01:00
e8d960329e Add option for ao base configs (#36526) 2025-03-19 14:59:47 +01:00
fef8b7f8e9 Add attention visualization tool (#36630)
* add utils  fiel

* style

* nits

* nits

* update

* updaets

* update

* fix init issues

* big updates

* nits

* nits?

* small updates

* nites

* there were still some models left

* style

* fixes

* updates

* nits _ fixes

* push changes

* update

* update

* update

* Apply suggestions from code review

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* style

* styling and return a string for testing

* small updates

* always biderectional for now

* update

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-03-19 13:58:46 +01:00
0fe0bae0a8 [Generation] remove leftover code from end-to-end compilation (#36685) 2025-03-19 11:28:33 +00:00
a861db01e5 Fix Device map for bitsandbytes tests (#36800)
fix
2025-03-19 11:57:13 +01:00
b9374a0763 Remove dist": "loadfile" for pytest in CircleCI jobs (#36811)
* fasterrrrr

* avoid crash in example jobs

* avoid crash in TF example jobs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-19 11:15:09 +01:00
4fa91b1be5 fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model (#36572)
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model

* follow Marc's suggestion to use _tie_weights to fix

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix review comments.

Signed-off-by: N <matrix.yao@intel.com>

* fix quality

Signed-off-by: N <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: N <matrix.yao@intel.com>
2025-03-19 10:48:47 +01:00
706703bba6 Expectations test utils (#36569)
* Add expectation classes + tests

* Use typing Union instead of |

* Use bits to track score in properties cmp method

* Add exceptions and tests + comments

* Remove compute cap minor as it is not needed currently

* Simplify. Remove Properties class

* Add example Exceptions usage

* Expectations as dict subclass

* Update example Exceptions usage

* Refactor. Improve type name. Document score fn.

* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
179d02ffb8 [generate] vectorized beam search (#35802) 2025-03-18 18:39:36 +00:00
12f2ebef63 Support custom dosctrings in modular (#36726)
* Override docstrings in modular if not none

* Update doc
2025-03-18 14:00:54 -04:00
Gar
00915d3041 Fix chameleon's TypeError because inputs_embeds may None (#36673)
* fix chameleon TypeError when inputs_embeds is None

* reformat

* hotfix
2025-03-18 18:59:30 +01:00
14b597f518 Fix casting dtype for qunatization (#36799)
* fix

* remove print
2025-03-18 18:46:03 +01:00
30580f035b Fix Mistral3 tests (#36797)
* fix processor tests

* fix modeling tests

* fix test processor chat template

* revert modeling test changes
2025-03-18 13:08:12 -04:00
db1d4c5a0b Loading optimizations (#36742)
* improvements

* Update modeling_utils.py

* add some doc about loading

* Update modeling_utils.py
2025-03-18 16:38:44 +01:00
7baf00089a Update SHA for tj-actions/changed-files (#36795)
* trigger

* trigger

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-18 16:19:39 +01:00
3017536ebf fix hqq due to recent modeling changes (#36771)
* fix-hqq

* style

* test
2025-03-18 12:20:27 +01:00
e959530b8f Add Mistral3 (#36790)
* initial start

* style and dummies

* Create convert_mistral3_weights_to_hf.py

* update

* typo

* typo

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* up

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* update

* update

* Update image_processing_mistral3.py

* Update convert_mistral3_weights_to_hf.py

* fix patch merger

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* up

* update modular to fit

* style

* Update convert_mistral3_weights_to_hf.py

* typo

* Update modular_mistral3.py

* simplify a lot all shape shenanigans

* simplify

* add working test processor

* Add partially working common modeling tests

* All tests working and remove mistral3 image processors

* add docs and fixup

* fix inference with image size >1540

* 🚨fix test image proc pixtral

* Remove vision_feature_select_strategy

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* Update convert_mistral3_weights_to_hf.py

* clean

* fix test checkpoints

* Update test_modeling_mistral3.py

* Update test_modeling_mistral3.py

* style

* Use Pixtral processor

* up

* finish cleaning processor to use pixtral directly

* Update __init__.py

* Update processing_pixtral.py

* doc

* Update __init__.py

* Update mistral3.md

* Update _toctree.yml

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com>
2025-03-18 12:04:42 +01:00
bd92073692 Fix gemma3_text tokenizer in mapping (#36793) 2025-03-18 11:50:22 +01:00
7426d02ea8 Fixing typo in gemma3 image_processor_fast and adding a small test (#36776)
Co-authored-by: zebz13 <zeb@fedora>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-18 11:35:06 +01:00
19b9d8ae13 chore: fix typos in tests directory (#36785)
* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory

* chore: fix typos in tests directory
2025-03-18 10:31:13 +01:00
7f5077e536 fix typos in the tests directory (#36717) 2025-03-17 17:45:57 +00:00
cbfb8d7b27 doc: Clarify is_decoder usage in PretrainedConfig documentation (#36724)
* fix: clarify decoder usage in PretrainedConfig documentation

* Apply suggestions from code review

updated doc

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-17 09:40:25 -07:00
ac1a1b66b9 [docs] Update README (#36265)
* update

* feedback

* feedback

* update versions
2025-03-17 09:37:19 -07:00
cff4caa0c1 [CI] remove redundant checks in test_eager_matches_sdpa_inference (#36740) 2025-03-17 16:29:18 +00:00
e3af4fec91 [MINOR:TYPO] Update hubert.md (#36733)
* [MINOR:TYPO] Update hubert.md

- typo fix (wave2vec instead of hubert)
- make code snippet copiable and runnable

* Run tests
2025-03-17 09:07:51 -07:00
c8a2b25f91 Fix TrainingArguments.torch_empty_cache_steps post_init check (#36734)
Mistaken use of De Morgan's law. Fixed "not (X or Y)"
to correct "not (X and Y)" check to raise a ValueError.

Added corresponding test to check "positive int or None" condition.

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-17 16:09:46 +01:00
8e67230860 Fix test isolation for clear_import_cache utility (#36345)
* test fixup

* test fixup

* fixing tests for unused imports

* style fixes

* fix

* style fixes

* styke fix

* remove isolated module cache

* rm custom subprocess defination

* run using exsiting fn

* style fixup

* make fixup

* remove redundant comments

* rm redundat skipif + style changes
2025-03-17 16:09:09 +01:00
27361bd218 fix xpu tests (#36656)
* fix awq xpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix llava next video bnb tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-17 15:57:49 +01:00
da7d64f4ff Allow ray datasets to be used with trainer (#36699)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-17 15:44:47 +01:00
2256875a77 fix can_generate (#36570)
* fix can_generate

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix can generate for speecht5 and blip

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix speecht5 tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-03-17 14:56:18 +01:00
9e94801146 enable/disable compile for quants methods (#36519)
* disable compile for most quants methods

* fix

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update tests/quantization/bnb/test_mixed_int8.py

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* changes from joao suggestions

---------

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-17 11:38:21 +01:00
c53d53da89 🚨🚨🚨 Fix sdpa in SAM and refactor relative position embeddings (#36422)
* fall back to eager if output_attentions

* improve relative position embeddings

* run modular on got_ocr2

* run-slow: sam

* fix run-length encoding

* fix tf processor errors

* update tf_sam

* fix compile error

* re-run tests
2025-03-17 09:39:52 +00:00
fc8764c9a6 [Generation, Gemma 3] When passing a custom generation_config, overwrite default values with the model's base generation_config (#36684) 2025-03-15 12:40:09 +00:00
f263e88dcf Update self-push-caller.yml 2025-03-15 11:32:04 +01:00
6f3e0b68e0 Fix grad accum arbitrary value (#36691) 2025-03-14 22:03:01 +01:00
2c2495cc7b Fix post_init() code duplication (#36727)
* Update modeling_utils.py

* CIs
2025-03-14 17:36:02 +01:00
25992b493c 🌐 [i18n-KO] Translated codegen.md to Korean (#36698)
* Initial translation

* Add _toctree.yml
2025-03-14 09:31:18 -07:00
42ebb6c23e [tests] Parameterized test_eager_matches_sdpa_inference (#36650) 2025-03-14 14:41:27 +00:00
9215cc62d4 Try working around the processor registration bugs (#36184)
* Try working around the processor registration bugs

* oops

* Update error message

* Clarify error

* Docstring docstring docstring

* The extra content is indexed by config class, so let's grab some values out of there

* Commit my confusion as a TODO

* Resolve my confusion

* Cleanup and mostly revert to the original

* Better autoclass fallback

* Don't nest f-strings you lunatic

* Clearer error message

* Less getattr()

* Revert a lot of changes to try a different approach!

* Try the global registry

* Check the dynamic list as well as the transformers root

* Move the dynamic list somewhere safer

* Move the dynamic list somewhere even safer

* More import cleanup

* Simplify all the register_for_auto_class methods

* Set _auto_class in the register() methods

* Stop setting the cls attribute in register()

* Restore specifying the model class for Model derivatives only

* Fix accidentally taking the .__class__ of a class

* Revert register_for_auto_class changes

* Fix get_possibly_dynamic_module

* No more ALL_CUSTOM_CLASSES

* Fix up get_possibly_dynamic_module as well

* Revert unnecessary formatting changes

* Trigger tests
2025-03-14 13:56:21 +00:00
691d1b52c3 Fix/best model checkpoint fix (#35885)
* Set best_model_checkpoint only when ckpt exists.

Rather than set it explicitly without checking if the checkpoint directory even exists as before, now we moved the setting logic inside of _save_checkpoint and are only setting it if it exists.

* Added best_global_step to TrainerState.

* Added tests for best_model_checkpoint.

* Fixed hard-coded values in test to prevent fail.

* Added helper func and removed hard-coded best_step.

* Added side effect patch generator for _eval.

* Added evaluate side effect func.

* Removed erroneous patching.

* Fixed minor bug.

* Applied Ruff.

* Fixed Ruff problem in make style.

* Used Trainer.set_initial_training_values.
2025-03-14 14:24:53 +01:00
3bd1a0ddf1 [model loading] don't gc.collect() if only 1 shard is used (#36721)
* don't gc collect if 1 shard is used

* delete state dict anyways
2025-03-14 12:56:56 +00:00
8cb522b419 Cleanup the regex used for doc preprocessing (#36648)
* Cleanup the regex used for doc preprocessing

* Run tests
2025-03-14 12:18:49 +00:00
72861e11eb Make the flaky list a little more general (#36704)
* Make the flaky list a little more general

* Trigger tests

* Make the flaky list a little more general
2025-03-14 12:15:32 +00:00
53742b11f5 Gemma3 processor typo (#36710)
* fix typo when  is on

* tiny

* add test and remove 'text_crops'

* lint
2025-03-14 13:07:55 +01:00
69bc848480 Add support for fast image processors in add-new-model-like CLI (#36313)
* add support for fast image processors in add-new-model-like

* fix header not found add-fast-image-processor-cli

* Encourage adding fast image processor

* nit

* start improve doc

* update docs

* make requested modifs
2025-03-13 14:16:37 -04:00
48ef468c74 Final CI cleanup (#36703)
* make fixup

* make fixup

* Correct skip decorator

* Add TODOs

* add is_flaky() parentheses
2025-03-13 17:26:09 +00:00
b070025aa6 Add GGUF support to T5-Encoder (#36700)
* add gguf support to t5encoder

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove gguf from model_kwargs

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-03-13 17:57:33 +01:00
4a60bae8e2 Handling an exception related to HQQ quantization in modeling (#36702)
* adding exception

* style

* add types
2025-03-13 17:53:36 +01:00
09a309d273 fix: fsdp sharded state dict wont work for save_only_model knob (#36627)
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-13 17:17:35 +01:00
2a004f9ff1 Add loading speed test (#36671)
* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* trigger CIs

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* better error messages

* Update test_modeling_utils.py

* Update test_modeling_utils.py
2025-03-13 17:07:30 +01:00
a3201cea14 [CI] Automatic rerun of certain test failures (#36694) 2025-03-13 15:40:23 +00:00
d84569387f chore: fix typos in utils module (#36668)
* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module
2025-03-13 15:12:44 +00:00
32c95bd847 Fix dtype for params without tp_plan (#36681)
* Update tensor_parallel.py

* CIs
2025-03-13 15:28:14 +01:00
bb965d8e87 fix type annotation for ALL_ATTENTION_FUNCTIONS (#36690)
Corrects the type annotation to match actual usage. The variable was typed as
Dict[str, Dict[str, Callable]] but is actually used as Dict[str, Callable]
where keys are attention mechanism names and values are the corresponding
attention functions directly. This change makes the type annotation consistent
with how the dictionary is used in the codebase.
2025-03-13 14:27:50 +00:00
1c287aecfc Change Qwen2_VL image processors to have init and call accept the same kwargs (#36207)
Change qwen2VL image processors to have init and call accept the same kwargs
2025-03-13 10:15:17 -04:00
65b8e38aac Upgrading torch version and cuda version in quantization docker (#36264)
* update

* small update

* no spqr quant

* testing

* testing

* test nightly

* gptqmodel

* flute

* fix hadamard

* running tests

* new docker

* fix docker

* run tests

* testing new docker

* new docker

* run tests

* new docker

* run tests

* final test

* update

* update

* run tests

* new docker

* launch tests

* test_docker

* running tests

* add comments

* fixing yml

* revert
2025-03-13 12:39:16 +01:00
87b30c3589 fix wandb hp search unable to resume from sweep_id (#35883)
* fix wandb hp search unable to resume from sweep_id

* format styles

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-03-13 12:32:26 +01:00
47cc4da351 Changing the test model in Quanto kv cache (#36670)
changing model
2025-03-13 12:23:34 +01:00
bc3d5781e7 Fix slicing for 0-dim param (#36580)
* fix

* switch to ellipsis instead

* Add co-author
Co-authored-by: fxmarty-amd <fxmarty-amd@users.noreply.github.com>

* Add co-author second try
Co-authored-by: fxmarty-amd <felmarty@amd.com>
2025-03-13 12:16:13 +01:00
fbb18ce68b Update config.torch_dtype correctly (#36679)
* fix

* style

* new test
2025-03-13 12:08:02 +01:00
c4161238bd [Cache] Don't initialize the cache on meta device (#36543) 2025-03-13 10:13:29 +00:00
79254c9b61 Fix rescale normalize inconsistencies in fast image processors (#36388)
* fix fused rescale normalize inconsistencies

* fix siglip2 fast image processor

* refactor kwargs validation and fused nirmalize rescale

* cleanup kwargs handling in preprocess

* update new procs after refactor
2025-03-12 23:18:34 -04:00
48292a9848 Refactor siglip2 fast image processor (#36406)
* refactor siglip2 fast image processor, add unused_kwargs in base fast image processor

* nits

* change unused_kwargs default to None

* update siglip2 fast image proc
2025-03-12 20:28:27 -04:00
ea219ed164 Remove differences between init and preprocess kwargs for fast image processors (#36186)
* Remove differences between init and preprocess kwargs in fast image processors

* make modifs got_ocr2

* update gemma3
2025-03-12 19:44:05 -04:00
cc3a361b46 [quants] refactor logic for modules_to_not_convert (#36672) 2025-03-12 23:43:30 +01:00
bc3253f076 Remove hardcoded slow image processor class in processors supporting fast ones (#36266)
* Add fast image processor class to processors supporting them

* fix test kosmos2
2025-03-12 18:39:25 -04:00
0013ba61e5 Fix Failing GPTQ tests (#36666)
fix tests
2025-03-12 20:03:02 +01:00
c7eb95581a Don't accidentally mutate the base_model_tp_plan (#36677)
* Don't accidentally mutate the base_model_tp_plan

* Co-authored by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Trigger tests

* Marking grad accum test as slow

* Add a flaky decorator

* Add a flaky decorator

* Use cyril's codeblock

* Don't copy() when it's None

* Use cyril's new codeblock

* make fixup
2025-03-12 18:59:13 +00:00
071a161d3e [core] Large/full refactor of from_pretrained (#36033)
* squash everything together
start to simplify inner logic

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

continue refactor

fix

small fixes

add type hints/docstring

Update modeling_utils.py

remove _fast_init

keep improving

Update modeling_utils.py

Update modeling_utils.py

new first tp loading version

style

fix weird in-place op

trigger CIs

Update modeling_utils.py

much clearer renaming of keys

fix

update

Update test_modeling_common.py

trigger CIs

update

update

style

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

fix

fast download first prototype

remove old function

remove old functions

Remove unused function and move back _get_tp_registry

fix tp plan registry

simplify

CIs

Update hub.py

Update modeling_utils.py

simplify

simplify renaming logic

remove unused check

add sanity check back (a test depends on it)

Update modeling_utils.py

finalize sound renaming logic

style

add forgotten check

Update modeling_utils.py

add key_mapping keyword

style

Update modeling_utils.py

add comment

minor updates

minor change for clarity

fix small prefix issue and simplify

style

trigger CIs

typo fix

Post rebase fix

post rebase cleanup

simplify tp

typo

oupsi

typo

correctly escape

improvements based on Marc's review

finalize Marc's review comments

 squash everything

* improve

* Update modeling_utils.py

* Update modeling_utils.py

* fix

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Update modeling_utils.py

* simplify

* style

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix dtype issue

* Update modeling_utils.py

* style

* remove test that does not make sense

* style

* small fixes

* style

* fix

* cleanup after rebase

* style

* typo

* escape

* tp for task specific top modules

* Update modeling_utils.py

* Update modeling_utils.py

* fix allocation

* CIs

* CIs

* CIs

* improve docstring

* CIs

* Update modeling_utils.py

* fix
2025-03-12 13:39:25 +01:00
7652804d23 Fix bnb regression due to empty state dict (#36663)
fix
2025-03-12 11:40:46 +01:00
994cad2790 [CI] gemma 3 make fix-copies (#36664)
* make fixup

* trigger ci
2025-03-12 10:35:13 +00:00
2829013d2d fix block mask typing (#36661)
* fix block mask typing

* updated

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* gemma

* fix

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-03-12 11:29:11 +01:00
89f6956015 HPU support (#36424)
* test

* fix

* fix

* skip some and run some first

* test fsdp

* fix

* patches for generate

* test distributed

* copy

* don't test distributed loss for hpu

* require fp16 and run first

* changes from marc's PR fixing zero3

* better alternative

* return True when fp16 support on gaudi without creating bridge

* fix

* fix tested dtype in deepspeed inference test

* test

* fix

* test

* fix

* skip

* require fp16

* run first fsdp

* Apply suggestions from code review

* address comments

* address comments and refactor test

* reduce precison

* avoid doing gaudi1 specific stuff in the genreation loop

* document test_gradient_accumulation_loss_alignment_with_model_loss test a bit more
2025-03-12 09:08:12 +01:00
50d3530aa0 Gemma3 (#36658)
* Fix converter

* [Broken] Adds Gemma 3 to Hugging Face Transformers

* Consolidating Config and Processor params across impls

* Sorting out configuration parameters. Adds qk_norm before RoPE. Still not sure if RoPE is right.

* Additional plumbing for CausalLM and ConditionalGeneration variants

* incomplete draft of Orbax conversion script

* More complete checkpoint conversion

* Supporting Gemma 3 1B checkpoints

* Updating RoPE for multiple frequencies

* Adjustments to rotary embedder

* Proof of life for text-only operation

* Updating the conversion script to handle multimodal projection weights

* Fixing tet-only conversions

* Cleaner conversion script with multimodal support and a simpler processor

* Additional refatcors to the Gemma3Processor

* Simplified Processor to work over text representations

* Updated conversion script to join text and vision embeddings at converion time

* Logging for debugging

* Update src/transformers/models/gemma2/modeling_gemma2.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Removed extraneous Config params

* Switching to fast tokenizer for checkpoint conversions

* isolating siglip for performance tetsing

* Minor changes for debugging tests against baselines

* Adding average pooling for soft tokens

* Updating processor code to enable simpler embedding interleaving for arbitrary number of images in prompts

* Updating conversion script for ShieldGemma 2 conversion compatibility

* Allow disable_compile to be provided as a kwarg

* Refresh from modular

* Updated conversion script and corrected sliding window

* Fix type mismatch in cache_position (#4)

* Fix dtype (#5)

* Fix type mismatch in cache_position

* Actually fix in the modular file

Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>

---------

Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>

* fixes for embedding table overflow and missing image_soft_token_mask from Gemma3Processor

* Adding 2D pooling for image embeddings

* Revert "Adding 2D pooling for image embeddings"

This reverts commit 65350cf531296f050b2078a5b8e46f61642b2648.

* Gemma3 average pooling changed from 1D to 2D

* Major refactor to Gemma3MultimodalInputProjection

* Updating Gemm 3 Auto* registrations

* Add option to save Gemma 3 chat template with tokenizer during weights conversion

* Removing unused imports

* Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditionalGeneration

* Removing duplicate config property

* Removing final logit softcapping and 1-indexing of position ids

* Fixing image processor config and none --> None typo

* Fixing sliding window size for 1B

* Updating image_mean and image_std in Image Processor

* Attention masking changed to lower triangular

* Moving image special tokens to conversion script

* Mirror image processor defaults from conversion script into Gemma3ProcessorKwargs

* Remove special token variables from symbol space

* Moving image soft token mask computation from Gemma3Processor to Gemma3ForConditionalGeneration

* tie lm_head and embedding weights

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Correct tied weights in Gemma3CausalLM

* iterative bidirectional attention

* resolving merge conflicts

* Reverting to Gemma 2 HybridCache with sldiing window support and a sliding_window_pattern of 6

* Correcting RoPE scaling

* clean up first pass, dummy model geenration works

* final clean up before fixing tests

* causal lm test works, so fine

* Fix conversion

* Update src/transformers/models/gemma3/processing_gemma3.py

* model tests are happy

* processor tests are happy

* image processing tests added

* fixup

* Fix pre-processing in conversion

* Inputs merging

* Do not normalize vision embeddings

* Apply Ryan's (and team) changes to attention

* token type ids + mask

* template

* move embed scale, add rope scale, fix tests

* Add chat template to tokenizer

* Use prefix for causal model loading

* use existing code for sliding mask from gemma2

* self.embed_tokens already normalizes

* Correcting Gemma3TextConfig parameters in conversion script

* typo, modular overwrites my fixes

* enable device map for text model

* Conversion updates

* ultra nit: no einsums

* update image token

* copy deepcopy config + some docs

* add some test, still WIP

* Refactoring --include_chat_tempalte logic in converter

* Update src/transformers/models/gemma3/modular_gemma3.py

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

* Add eos tokens for instruct models

* dump so i can work on dgx

* Removing add_bos by default

* dump

* add fast im proc

* docs for PaS + fixup

* another fixup

* one more fixup

* fix tests

* Inverting prior BOS change

* ultra nit

* Reverting to Tokenizer saved with add_bos_token=True and chat template starting with BOS

* resize embeds, remove sqrt, add slow test outputs

* FA2 but quality is meh

* nit

* skip FA2, no idea what happened

* last bit for green CI

* please, green CI for docs

* T_T

* Fix for Gemma3 logits

* Support both options for system prompt

* Update src/transformers/models/gemma3/image_processing_gemma3_fast.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/gemma3.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Docs updates now that assets are live

* Style fixes

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: Lysandre <hi@lysand.re>
2025-03-12 09:06:17 +01:00
81aa9b2e07 fix typos in the docs directory (#36639)
* chore: fix typos in the docs directory

* chore: fix typos in the docs directory

* chore: fix typos in the docs directory
2025-03-11 09:41:41 -07:00
cb384dcd7a Fix gguf docs (#36601)
* update

* doc

* update

* Update docs/source/en/gguf.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-11 15:29:14 +01:00
1e4286fd59 Remove research projects (#36645)
* Remove research projects

* Add new README to explain where the projects went

* Trigger tests

* Cleanup all references to research_projects
2025-03-11 13:47:38 +00:00
ed1807bab3 [docs] Update docs dependency (#36635)
update
2025-03-11 13:42:49 +00:00
b80b3ec529 Stop warnings from unnecessary torch.tensor() overuse (#36538) 2025-03-11 13:41:13 +00:00
556d2c23c6 Remove remote code warning (#36285)
* Remove redundant pipeline warning

* Remove redundant pipeline warning
2025-03-11 13:29:15 +00:00
b1a51ea464 Fix AriaForConditionalGeneration flex attn test (#36604)
AriaForConditionalGeneration depends on idefics3 vision transformer which does not support flex attn
2025-03-11 11:05:49 +01:00
d126f35427 Proper_flex (#36643)
* proper performant flex attention implementation

* wrapper for flex attention to compile only when triggered

* wrapper for flex attention to compile only when triggered

* attention mask type detection

* Update src/transformers/integrations/flex_attention.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* nit

* nit

* nit

* nit

* gemma2 support

* add citation for torchtune

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update flex_attention.py

* nit

* nit

* nit

* reset gemma2 modifications

* nit

* nit

* nit

* licencing

* apply changes to other models

* safe import

---------

Co-authored-by: Sung Ching Liu <sunny19981005@outlook.com>
Co-authored-by: Sung Ching Liu <22844540+bursteratom@users.noreply.github.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-03-11 10:24:12 +01:00
d8663cb8c5 Fix bugs in mllama image processing (#36156)
* fix: handle input_channel_dim == channels_last

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* fix: default PIL images to channels_last

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fixup from review batch

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* test: add 1x1 PIL image to ambiguous channel test

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

* fix(mllama): avoid 0 dimension for image with impractical aspect ratio

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

---------

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-03-11 10:22:48 +01:00
1c4b62b219 Refactor some core stuff (#36539)
* some config changes

* update

* current state

* update

* update

* updates and cleanup

* something that works

* fixup

* fixes

* nits

* nit

* nits and fix

* Update src/transformers/integrations/tensor_parallel.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/integrations/tensor_parallel.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* cleanup

* style

* safe import

* fix

* updates

* rename stuff an clean

* style

* small updates

* ups

* oups

* nit

* protect imports

* update tp

* rodfl

* arf

* turbo nit on init

* fix import error

* frumble gumbgle

* try to fix the import error

* should fix the non model test

* update keep in float32

* update

* fix

* nits

* fix subvconfigs

* test was weird

* nit

* fix failing test

* fix instruct blip

* fixes

* style

* x.com

* fix overwrite

* ok last bit of failing test

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2025-03-11 09:26:28 +01:00
e9756cdbc7 [docs] Serving LLMs (#36522)
* initial

* fix

* model-impl
2025-03-10 13:14:19 -07:00
af9b2eaa54 chore: fix typos in language models (#36586)
* chore: fix typos in language models

* chore: fix typos in mistral model

* chore: fix model copy from issue

* chore: fix model copy from issue

* chore: fix model copy from issue

* chore: fix model copy from issue

* chore: fix model copy from issue
2025-03-10 15:54:49 +00:00
a929c466d0 Fix auto-assign reviewers (#36631)
* Fix auto-assign reviewers

* Clean up endanchor a bit

* We don't actually need the end anchor at all
2025-03-10 15:52:13 +00:00
858545047c [HybridCache] disable automatic compilation (#36620) 2025-03-10 09:24:26 +00:00
94ae1ba5b5 Fix check for XPU. PyTorch >= 2.6 no longer needs ipex. (#36593) 2025-03-07 14:09:35 +00:00
a1cf9f3390 Fixed datatype related issues in DataCollatorForLanguageModeling (#36457)
Fixed 2 issues regarding `tests/trainer/test_data_collator.py::TFDataCollatorIntegrationTest::test_all_mask_replacement`:
1. I got the error `RuntimeError: "bernoulli_tensor_cpu_p_" not implemented for 'Long'`. This is because the `mask_replacement_prob=1` and `torch.bernoulli` doesn't accept this type (which would be a `torch.long` dtype instead. I fixed this by manually casting the probability arguments in the `__post_init__` function of `DataCollatorForLanguageModeling`.
2. I also got the error `tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute Equal as input #1(zero-based) was expected to be a int64 tensor but is a int32 tensor [Op:Equal]` due to the line `tf.reduce_all((batch["input_ids"] == inputs) | (batch["input_ids"] == tokenizer.mask_token_id))` in `test_data_collator.py`. This occurs because the type of the `inputs` variable is `tf.int32`. Solved this by manually casting it to `tf.int64` in the test, as the expected return type of `batch["input_ids"]` is `tf.int64`.
2025-03-07 14:09:27 +00:00
4fce7a0f0f Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/decision_transformer (#36582)
Bump jinja2 in /examples/research_projects/decision_transformer

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-07 13:35:59 +00:00
f2fb41948e Update "who to tag" / "who can review" (#36394)
update who to tag
2025-03-07 13:09:31 +00:00
1b9978c360 Update chat_extras.md with content correction (#36599)
Update chat_extras.md - content

Fixed a typo in the content, that may confuse the readers.
2025-03-07 13:09:02 +00:00
f2e197c30a Github action for auto-assigning reviewers (#35846)
* First draft of github action on PR opening for auto-assigning reviewers

* fix missing import

* Don't reassign reviewers if we already have them

* Temporarily comment out the opened line so we can test the script

* Correct path for codeowners file

* Update workflow permissions

* Update workflow permissions

* Update debug logs

* Strip inline comments

* Remove prefix

* Request reviews instead of assigning

* Request reviews instead of assigning

* Add TODO

* Use pull-request-target instead

* Update the script

* Set back to pull_request for testing

* Set to pull_request_target, testing works!

* Add licence

* Tighten up one of the globs

* Refactor things to be a bit less convoluted

* Only assign reviewers when marked ready for review
2025-03-07 12:18:49 +00:00
8a16edce67 Export base streamer. (#36500)
* Export base streamer. 

Previously, the base streamer class was not exported so the set of available streamers was fixed to 3 streamer classes. 

This change makes it so that customers may extend the default base streamer class.

* make fixup

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2025-03-07 11:16:09 +00:00
6f775970c7 avoid errors when the size of input_ids passed to PrefixConstrainedLogitsProcessor is zero (#36489)
* avoid errors when the size of `input_ids` passed to PrefixConstrainedLogitsProcessor is zero

* use more reasonable process

* avoid early return

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-07 11:02:49 +00:00
51ed61e2f0 Mention UltraScale Playbook 🌌 in docs (#36589) 2025-03-06 14:48:11 -08:00
159445d044 fix: argument (#36558)
752ef3fd4e/utils/modular_model_converter.py (L1729)
2025-03-06 13:11:19 -08:00
5275ef6f3d [XGLM] tag tests as slow (#36592)
these tests should be slow
2025-03-06 17:54:41 +00:00
c1b24c0b73 [bark] fix loading of generation config (#36587) 2025-03-06 16:55:19 +00:00
0440dbc0e1 Integrate SwanLab for offline/online experiment tracking and local visualization (#36433)
* add swanlab integration

* feat(integrate): add SwanLab as an optional experiment tracking tool in transformers

- Integrated SwanLab into the transformers library as an alternative for experiment tracking.
- Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`.
- Added necessary dependencies and documentation for SwanLab integration.

* Fix the spelling error of SwanLabCallback in callback.md

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Fix typo in comment

* Fix typo in comment

* Fix typos and update comments

* fix annotation

* chore: opt some comments

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: AAssets <20010618@qq.com>
Co-authored-by: ZeYi Lin <944270057@qq.com>
Co-authored-by: KAAANG <79990647+SAKURA-CAT@users.noreply.github.com>
2025-03-06 17:35:30 +01:00
bc30dd1efb Modular Conversion --fix_and_overwrite on Windows (#36583)
* Modular Conversion --fix_and_overwrite on Windows

* -newline on read
2025-03-06 13:12:30 +00:00
9e385109cf Delete redundancy if case in model_utils (#36559)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-06 11:36:11 +00:00
acc49e390d Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/pplm (#36540)
Bump transformers in /examples/research_projects/pplm

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-06 11:35:47 +00:00
9e84b38135 chore: enhance message descriptions in parameters,comments,logs and docstrings (#36554)
* chore: enhance message descriptons in parameters,comments,logs and docstrings

* chore: enhance message descriptons in parameters,comments,logs and docstrings

* Update src/transformers/hf_argparser.py

* Update src/transformers/keras_callbacks.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-03-06 11:02:35 +00:00
6966fa1901 Fix typos . (#36551)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-03-05 16:31:43 -08:00
996f512d52 Fix typos in tests (#36547)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-05 15:04:06 -08:00
752ef3fd4e guard torch version for uint16 (#36520)
* u16

* style

* fix
2025-03-05 11:27:01 +01:00
66f29aaaf5 chore: enhance messages in docstrings (#36525)
chore: enhance the message in docstrings
2025-03-04 16:31:20 +00:00
89d27fa6ff Fix links in quantization doc (#36528)
fix quantization doc
2025-03-04 16:43:03 +01:00
c0c5acff07 Fix bamba tests amd (#36535) 2025-03-04 15:24:27 +01:00
37508816d6 chore: Fix typos in docs and examples (#36524)
Fix typos in docs and examples

Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-04 13:47:41 +00:00
84f0186e89 Add aya (#36521)
* initial commit

* small fix

* move stuff to image processing file

* remove stuff in validate turn and fix return tensor

* remove liquid stuff

* in the process of addressing comments

* changes to get the right tokenization

* new __init__ works

* fixing defulat std and mean

* works

* small testing scipt -- to be deleted before merge

* remove redundant code

* addressing comments

* fix inits, add docs templates

* refactor processor, switch to gotocr image processor

* remove image proc from init

* refactor to working llava-style architecture

* Change AyaVisionModel to AyaVisionForConditionalGeneration

* add tests

* fixups

* update doc

* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model

* better variable names + remove code paths

* Updates to aya_vision.md

* address comments

* adding copied from

* make style and remove unused projector_hidden_act from config

* sort init

* include usage of fast image proc and proc on cuda in doc

* update checkpoint iin test processor

* update checkpoint in test processor 2

* remove test_model and update docstring

* skip failing tests

---------

Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-03-04 12:24:33 +01:00
c0f8d055ce [docs] Redesign (#31757)
* toctree

* not-doctested.txt

* collapse sections

* feedback

* update

* rewrite get started sections

* fixes

* fix

* loading models

* fix

* customize models

* share

* fix link

* contribute part 1

* contribute pt 2

* fix toctree

* tokenization pt 1

* Add new model (#32615)

* v1 - working version

* fix

* fix

* fix

* fix

* rename to correct name

* fix title

* fixup

* rename files

* fix

* add copied from on tests

* rename to `FalconMamba` everywhere and fix bugs

* fix quantization + accelerate

* fix copies

* add `torch.compile` support

* fix tests

* fix tests and add slow tests

* copies on config

* merge the latest changes

* fix tests

* add few lines about instruct

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix tests

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* "to be not" -> "not to be" (#32636)

* "to be not" -> "not to be"

* Update sam.md

* Update trainer.py

* Update modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* fix hfoption tag

* tokenization pt. 2

* image processor

* fix toctree

* backbones

* feature extractor

* fix file name

* processor

* update not-doctested

* update

* make style

* fix toctree

* revision

* make fixup

* fix toctree

* fix

* make style

* fix hfoption tag

* pipeline

* pipeline gradio

* pipeline web server

* add pipeline

* fix toctree

* not-doctested

* prompting

* llm optims

* fix toctree

* fixes

* cache

* text generation

* fix

* chat pipeline

* chat stuff

* xla

* torch.compile

* cpu inference

* toctree

* gpu inference

* agents and tools

* gguf/tiktoken

* finetune

* toctree

* trainer

* trainer pt 2

* optims

* optimizers

* accelerate

* parallelism

* fsdp

* update

* distributed cpu

* hardware training

* gpu training

* gpu training 2

* peft

* distrib debug

* deepspeed 1

* deepspeed 2

* chat toctree

* quant pt 1

* quant pt 2

* fix toctree

* fix

* fix

* quant pt 3

* quant pt 4

* serialization

* torchscript

* scripts

* tpu

* review

* model addition timeline

* modular

* more reviews

* reviews

* fix toctree

* reviews reviews

* continue reviews

* more reviews

* modular transformers

* more review

* zamba2

* fix

* all frameworks

* pytorch

* supported model frameworks

* flashattention

* rm check_table

* not-doctested.txt

* rm check_support_list.py

* feedback

* updates/feedback

* review

* feedback

* fix

* update

* feedback

* updates

* update

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-03-03 10:33:46 -08:00
6aa9888463 Remove unused code (#36459) 2025-03-03 18:31:10 +00:00
9fe82793ee [Style] fix E721 warnings (#36474)
* fix E721 warnings

* config.hidden_size is not a tuple

* fix copies

* fix-copies

* not a tuple

* undo

* undo
2025-03-03 18:03:42 +00:00
1975be4d97 Fix edge case for continue_final_message (#36404)
* Fix edge case for continue_final_message

* lstrip() correctly

* Add regression test

* Add a clearer error message when the final message is not present

* Add a clearer error message when the final message is not present

* Fix massive bug!
2025-03-03 18:03:03 +00:00
2aff938992 Fix pipeline+peft interaction (#36480)
* Fix pipeline-peft interaction

* once again you have committed a debug breakpoint

* Remove extra testing line

* Add a test to check adapter loading

* Correct adapter path

* make fixup

* Remove unnecessary check

* Make check a little more stringent
2025-03-03 18:01:43 +00:00
28159aee63 chore: fix message descriptions in arguments and comments (#36504)
chore: fix messagedescriptions in arguments and comments
2025-03-03 17:54:57 +00:00
acb8586dd9 Fix some typos in docs (#36502)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-03-03 17:53:53 +00:00
0463901c92 fix torch_dtype, contiguous, and load_state_dict regression (#36512)
* fix regression

* fix param

* fix load_state_dict

* style

* better fix for module

* fix tests

* quick fix for now

* rm print
2025-03-03 18:35:37 +01:00
3e83ee75ec Fix kwargs UserWarning in SamImageProcessor (#36479)
transformers/image_processing_utils.py:41: UserWarning: The following named arguments are not valid for `SamImageProcessor.preprocess` and were ignored: 'point_pad_value'
2025-03-03 16:23:34 +00:00
9e3a1072c2 Check TRUST_REMOTE_CODE for RealmRetriever for security (#36511)
* fix

* repush

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-03 15:08:12 +01:00
4d8259d245 Fix loading zero3 weights (#36455)
* Check if fixes

* Fix zero3 loading

* Quality

* Fix marc nit

* Add fast tests

* Migrate to integrations.deepspeed rather than modeling_utils

* Style
2025-03-03 15:05:58 +01:00
dcbdf7e962 Fix _load_state_dict_into_meta_model with device_map=None (#36488)
* Fix _load_state_dict_into_meta_model with device_map=None

* Update src/transformers/modeling_utils.py
2025-03-02 08:33:36 +01:00
a40f1ac602 Fix couples of issues from #36335 (#36453)
* fix

* style

* better allocation

* fix

* fix

* style

* revert disk

* exit

* style

* return if nothing to cache

* dtensor guard

* fix regressiion

* fix regression

* fix

* fix
2025-03-01 07:12:17 +01:00
2c5d038f92 Add Got-OCR 2 Fast image processor and refactor slow one (#36185)
* refactor image processor slow got ocr

* add working image processor fast

* fix fast image processor, update doc

* use one big loop for processing patches
2025-03-01 00:56:00 -05:00
51083d1bac [docs] fix bug in deepspeed config (#36081)
bug fix
2025-02-28 07:09:54 -08:00
02776d2c6a Fix loading models with mismatched sizes (#36463)
* Fix loading model with mismatched sizes

* trigger tests
2025-02-28 11:48:59 +01:00
222505c7e4 [GroundingDino] Fix grounding dino loss 🚨 (#31828)
* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* fixed: failing tests

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Addressed comments

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* add: cardinality loss and make box loss as copy from

* change: default for reduction loss is sum

* fix: vectorized generate fake box

* fix copies

* Addressed comments

* addressed comments

* addressed one-hot

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* Addressed comments

* fixed test

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* add: cardinality loss and make box loss as copy from

* fix copies

* Revert "Update tests/models/grounding_dino/test_modeling_grounding_dino.py"

This reverts commit aa74c4c57c430e54cc74c414d6269edb65c73e83.

* [run-slow] groundigdino

* remove nestedtensor

* [run-slow] groundig_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* check

* check

* add: enconder intermediate outputs to ImageLoss forward

* add: GroundingDinoForObjectDetectionLoss in the loss directory

* make style

* fix the loss function

* remove class_reduction since it sum is default

* remove class_reduction

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* simple fix

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* minor fix

* Update src/transformers/loss/loss_for_object_detection.py

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 19:15:58 +00:00
482d17be60 Fix hub_retry (#36449)
* cry

* trigger

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 14:38:25 +01:00
6a876462c3 Lazy import libraries in src/transformers/image_utils.py (#36435)
* Lazy import libraries in `src/transformers/image_utils.py`

* `make fixup`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Protect imports

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-02-27 12:53:42 +00:00
8aed019764 [generate] torch.distributed-compatible DynamicCache (#36373)
* test

* docstring

* prepare distributed cache data

* fix cat dim

* test mvp

* add test checks

* like this?

* working test and solution

* nit

* nit

* add shape info
2025-02-27 11:48:57 +00:00
17792556b2 [save_pretrained ] Skip collecting duplicated weight (#36409)
* Skip collecting duplicated weight

* format
2025-02-27 10:57:11 +01:00
2d6cc0dfde Add contents: write (#36445)
fix permission

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 10:55:37 +01:00
549db241e5 Fix another permission (#36444)
* fix permission

* fix permission

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 10:29:06 +01:00
a8e4fe45fd Fix permission (#36443)
fix permission

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 10:08:31 +01:00
d0727d92cd Change PR to draft when it is (re)opened (#36417)
* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-27 09:44:33 +01:00
8ede897c30 restrict cache allocator to non quantized model (#36428) 2025-02-26 22:16:15 +01:00
a7fbab33ae Fix Expected output for compressed-tensors tests (#36425)
fix
2025-02-26 21:17:24 +01:00
1603018e7a Update form pretrained to make TP a first class citizen (#36335)
* clean code

* oups

* fix merge

* yups

* fix if

* now you can play

* fix shape issue

* try non blocking

* fix

* updates

* up

* updates

* fix most of thetests

* update

* update

* small updates

* up

* fix the remaining bug?

* update

* rename when you read from the file

* buffer issues

* current status

* cleanup

* properly allocate dumb memory

* update a small bug

* fix colwise rep issue

* fix keep in float 32 that was keeping everything in float 32

* typo

* more fixes with keep_in_fp32_modules as we use to serach on it

* fix ROPE dtype for TP

* remove what's breaking the tests

* updates

* update and fixes

* small cleanup after merging

* allocate 2x to be safe

* style, auto

* update

* yup nit

* fix

* remove slow as fuck torch api :(

* work

* fixup

* update

* brting the fix back

* fix and update

* fixes

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* updates because some suggestions were wrong 👀

* update?

* fuck this bloated function

* typo

* fix the dumb prefix thing once and forall

* fixes here and there

* updates

* remove prints

* fix strict cases

* styel

* properly fix keys on load!

* update

* fix base model prefix issue

* style

* update

* fix all?

* remoce 1 print

* fix the final etsts

* fixup

* last nits

* fix the detach issue which cause a 2x slowdown

* fixup

* small fixes

* ultra nit

* fix

* fix

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 20:12:38 +01:00
981c276a02 Fix compressed tensors config (#36421)
* fix config

* update

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 17:56:15 +01:00
d18d9c3205 Universal Speculative Decoding CandidateGenerator (#35029)
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* `UniversalSpeculativeDecodingGenerator`

* Use `UniversalSpeculativeDecodingGenerator` when `generation_config.do_sample=True`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* add `TestGenerateWithDifferentModels`

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* `UniversalSpeculativeDecodingGenerator`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* fix device issue

* fix get_assistant_input_ids

* add `TestAssistedCandidateGeneratorDifferentTokenizers`

* formatting

* `AssistantVocabTranslatorCache` refactor & tests

* revert changes in `src/transformers/generation/logits_process.py`

* refactor `AssistedCandidateGenerator`

* refactor `AssistedCandidateGeneratorDifferentTokenizers`

* formatting

* refactor `UniversalSpeculativeDecodingGenerator`

* fix negative value for max_new_tokens

* fix generation length target + attention_mask vs. assistant + attent

* fix device

* fix negative max_new_tokens bug

* fix UAG

* minor

* formatting

* `AssistedCandidateGeneratorDifferentTokenizers` `lookbehind`s init

* resolve conflict & formatting

* rerun CI tests

* remove space...

* remove old code

* fix candidate_input_ids device

* minor

* formatting

* Fix prepare + apply (#7)

* fix prepare + apply

* move to cpu

* simplity suppress_tokens

* fix bugs and refacatoring

* device move

* handle self.config.vocab_size > len(target_tokenizer.get_vocab())

* no need to normalize in candidate_generator

* address Nadav's comments + minor

* optimize device move + SuppressTokensLogitsProcessor

* AssistantToTargetTranslator, SuppressTokensLogitsProcessor and tokenizers mapping improvements

* padding size

* padding improvement

* fix and simplify get_target_logits

* renaming in get_target_logits

* minor

* add filter_value and suppress_tokens_id

* style + rename

* remove TODO

* restore original SelectTokensLogitsProcessor with modification

* fix style

* fix _update_past_and_masks and optimize code

* remove assistant_vocab_size arg

* fix attention_mask

* call _prepare_attention_mask also if not has_past_key_values

* handling attention mask for first generation

* comment

* restore test

* remove SelectTokensLogitsProcessor

* _update_past_and_masks implementation for USD

* Add unittests for Universal Assisted generation

* fix style

* update tests

* Remove unused import and fix `test_speculation_depth` test

* exclude special and reserved tokens from tokenizer for UAG

* mv `test_universal_assisted_generation.py` to `generation/test_candidate_generator.py`

* Remove unused imports and fix style using `make style` (#9)

* formatting

* Swap gated `meta-llama/llama-3.2` with `allenai/llama` (#10)

* Fix space sign disagreement (#12)

* default values for AssistantToTargetTranslator fileds

* fix space sign

* minor

* fix test + style

* Default values for some fields of assistant to target translator (#11)

* default values for AssistantToTargetTranslator fileds

* fix

* add support to empty logit_processors

* Update candidate_generator.py (#15)

fix typo

* BUG fix in _prepare_assistant_input_ids (#14)

* fix _prepare_assistant_input_ids

* target_to_assistant_input_ids

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il>

---------

Co-authored-by: Nadav Timor <nadav.timor@weizmann.ac.il>

* typo (`target_to_assistant_input_ids`)

* formatting

* merge upstream/main

* Fix minor review comments (#16)

* Fix: `token_ids.to(torch.int64)` (#18)

* tok ids to `torch.int64` (reference: https://huggingface.co/docs/transformers.js/en/api/tokenizers)

* `LongTensor`

* fix dtype

* `assistant_input_ids.to(dtype=torch.long)`

* Remove unused import from test_candidate_generator.py

* Remove unused import from test_candidate_generator.py

* Remove `numpy` import

* resolve pr comments (#19)

* `AssistantToTargetTranslator` docstring

* (per gante's comment) `filter_value` and `suppress_tokens_id` to class constants

* update `AssistantToTargetTranslator` docstring

* (gante's comment) replace `match-case`

* formatting

* Fix Joao's comments (#21)

* remove threading

* fix logits_processor

* fix test device

* fix style (#23)

* Move atm (#24)

* move AssistantToTargetTranslator

* fixup

* fix logit_processor

* add atm_translator test

* refactor test

* remove threading from test

* add require_torch in tests

* move AssistantVocabTranslatorCache + add tests

* ruff fix

---------

Co-authored-by: jmamou <jonathan.mamou@intel.com>
Co-authored-by: Gaurav <gauravj@d-matrix.ai>
Co-authored-by: Gaurav Jain <gaurjain14@gmail.com>
Co-authored-by: gauravjain14 <41287729+gauravjain14@users.noreply.github.com>
2025-02-26 16:14:02 +00:00
082834dd79 fix: prevent model access error during Optuna hyperparameter tuning (#36395)
* fix: prevent model access error during Optuna hyperparameter tuning

The `transformers.integrations.integration_utils.run_hp_search_optuna` function releases model memory and sets trainer.model to None after each trial. This causes an AttributeError when  subsequent Trainer.train calls attempt to access the model before reinitialization. This is only an issue when `fp16_full_eval` or `bf16_full_eval` flags are enabled.

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 17:06:48 +01:00
6513e5e402 add recommendations for NPU using flash_attn (#36383)
* add recommendations for Ascend NPU using flash_attn

* update recommend_message_npu

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-26 14:51:08 +01:00
b4965cecc5 Fixing the docs corresponding to the breaking change in torch 2.6. (#36420) 2025-02-26 14:11:52 +01:00
9a217fc327 Deprecate transformers.agents (#36415) 2025-02-26 11:38:47 +01:00
41925e4213 Add retry hf hub decorator (#35213)
* Add retry torch decorator

* New approach

* Empty commit

* Empty commit

* Style

* Use logger.error

* Add a test

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Fix err

* Update tests/utils/test_modeling_utils.py

---------

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 20:53:11 +01:00
9ebfda3263 Fixed VitDet for non-squre Images (#35969)
* size tuple

* delete original input_size

* use zip

* process the other case

* Update src/transformers/models/vitdet/modeling_vitdet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* [VITDET] Test non-square image

* [Fix] Make Quality

* make fix style

* Update src/transformers/models/vitdet/modeling_vitdet.py

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-25 19:31:24 +00:00
cbe0ea59f3 Security fix for benchmark.yml (#36402)
security

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-25 17:22:09 +01:00
88d10517b4 Fix convert_to_rgb for SAM ImageProcessor (#36369) 2025-02-25 15:10:21 +00:00
e1ce948908 [CLI] add import guards (#36376)
* add import guards

* nit
2025-02-25 15:06:50 +00:00
fb83befb14 Fix pytorch integration tests for SAM (#36397)
Fix device in tests
2025-02-25 14:53:34 +00:00
ca6ebcb9bc chore: fix function argument descriptions (#36392) 2025-02-25 14:28:34 +00:00
7c8916ddb5 fix audio classification pipeline fp16 test on cuda (#36359)
* fix audio classification pipeline fp16 test on cuda

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add comments

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update tests/pipelines/test_pipelines_audio_classification.py

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 15:01:25 +01:00
c3700b0eee [tests] enable autoawq tests on XPU (#36327)
add autoawq

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 13:38:09 +01:00
b4b9da6d9b tests: revert change of torch_require_multi_gpu to be device agnostic (#35721)
* tests: revert change of torch_require_multi_gpu to be device agnostic

The 11c27dd33 modified `torch_require_multi_gpu()` to be device agnostic
instead of being CUDA specific. This broke some tests which are rightfully
CUDA specific, such as:

* `tests/trainer/test_trainer_distributed.py::TestTrainerDistributed`

In the current Transformers tests architecture `require_torch_multi_accelerator()`
should be used to mark multi-GPU tests agnostic to device.

This change addresses the issue introduced by 11c27dd33 and reverts
modification of `torch_require_multi_gpu()`.

Fixes: 11c27dd33 ("Enable BNB multi-backend support (#31098)")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* fix bug: modification of frozen set

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-25 13:36:10 +01:00
d80d52b007 addressing the issue #34611 to make FlaxDinov2 compatible with any batch size (#35138)
fixed the batch_size error, all tests are passing

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-02-25 10:44:44 +00:00
3a02fe56c2 Added handling for length <2 of suppress_tokens for whisper (#36336)
* Update generation_whisper.py

Added handling for <2 length of suppress_tokens for whisper

* Updated None check for suppress_tokens to avoid ambiguity

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-02-25 10:32:49 +00:00
da4ab2a1b6 Fix doc formatting in forward passes & modular (#36243)
* fix indentation issues + modular without magic keyword

* style

* Update doc.py

* style

* Fix all decorators indentation

* all models

* style

* style

* Update doc.py

* fix

* general fix

* style
2025-02-25 11:09:01 +01:00
92abc0dae8 Update _get_eval_sampler to reflect Trainer.tokenizer is deprecation self.tokenizer -> self.processing_class (#36315)
* fix warning self.tokenizer -> self.processing_class

* formating change
2025-02-25 11:07:50 +01:00
9d6abf9778 enable torchao quantization on CPU (#36146)
* enable torchao quantization on CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix int4

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable CPU torchao tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix style

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cuda tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao available

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao available

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torchao config cannot convert to json

* fix docs

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm to_dict to rebase

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* limited torchao version for CPU

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix skip

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix cpu test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-25 11:06:52 +01:00
401543a825 Fix is_causal fail with compile (#36374)
fix
2025-02-25 10:44:56 +01:00
bc65f3fc1c [modular] Do not track imports in functions (#36279)
* Add check

* just check for function

* Update examples
2025-02-25 10:29:47 +01:00
4b5cf5496d Load models much faster on accelerator devices!! (#36380)
* caching allocator warmup

* Update modeling_utils.py

* reuse expanded map

* style
2025-02-25 09:41:22 +01:00
931e5f4ac3 Update modeling_llava_onevision.py (#36391)
Fixed a potential bug in modeling_llava_onevision.py
2025-02-25 09:34:50 +01:00
2ab7bdc403 notify new model merged to main (#36375)
notify new model

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-24 17:53:18 +01:00
05dfed06d7 [Modeling] Reduce runtime when loading missing keys (#36312)
* hoist keys

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove hoist

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-02-24 16:10:28 +00:00
18276b03f7 fix(type): padding_side type should be Optional[str] (#36326) 2025-02-24 16:09:42 +00:00
f4684a6eb2 Update amd pytorch index to match base image (#36347)
pip pytorch index should match docker base image
2025-02-24 16:17:20 +01:00
2af272c101 Add autoquant support for torchao quantizer (#35503)
* Add autoquant support for torchao quantizer

Summary:
att, also verified that autoquantized model can be saved and loaded:

save: https://gist.github.com/jerryzh168/01d367aaf44dbbbfd4068a4a10a00061
load: https://gist.github.com/jerryzh168/d5c6c401b2abdf18e0b6771341f1525c

Test Plan:
tested locally with above script
model uploaded to https://huggingface.co/jerryzh168/llama3-8b-autoquant

Reviewers:

Subscribers:

Tasks:

Tags:

* add test

* ruff fix

* ruff reformat

* add docs and min_sqnr support

* format

* format

* fix test

* update doc

* format

* remove disable_compile

* format
2025-02-24 15:54:16 +01:00
977a61f743 Change slack channel for mi250 CI to amd-hf-ci (#36346) 2025-02-24 15:50:06 +01:00
884a8ea1f0 Improve model loading for compressed tensor models (#36152)
* Disable warnings for stacked compressors
* Introduce two new hooks in HfQuantizer lifecycle
to allow updates to missing and unexpected keys
* Update missing and unexpected keys
for stacked compressors
* Add tests
* Fix: run_compressed cases
* Fix: uncompressed cases

* Rename compressed_tensor folder to compressed_tensors
Move RunCompressedTest to the same file
Update tests to unittest
2025-02-24 13:47:21 +01:00
4dbf17c17f [tests] enable bnb tests on xpu (#36233)
* fix failed test

* fix device

* fix more device cases

* add more cases

* fix empty cache

* Update test_4bit.py

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-02-24 11:30:15 +01:00
92c5ca9dd7 Fix exploitable regexes in Nougat and GPTSan/GPTJNeoXJapanese (#36121)
* Fix potential regex catastrophic backtracking in NougatTokenizerFast

The original regex pattern in tokenization_nougat_fast.py was vulnerable to
catastrophic backtracking due to greedy quantifiers and nested alternations.
This commit replaces it with a more efficient pattern that:

1. Uses explicit character classes instead of dot (.)
2. Handles whitespace more precisely
3. Avoids unnecessary backtracking
4. Supports both lowercase and uppercase roman numerals
5. Maintains the same functionality while being more robust

* Try another regex

* Trying deepseek's answer

* Start with a simplification

* Another simplification

* Just rewrite the whole function myself

* Fix gptneox and gptsan

* Simplify the regex even further

* Tighten up the price regex a little

* Add possessive version of the regex

* Fix regex

* Much cleaner regexes

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-02-21 19:49:51 +00:00
547911e727 Uses Collection in transformers.image_transforms.normalize (#36301)
* Uses Collection instead of Sequence in transformers.image_transforms.normalize

* Uses collections.abc.Collection in lieu of deprecated typing one
2025-02-21 18:38:41 +01:00
7c5bd24ffa [tests] make quanto tests device-agnostic (#36328)
* make device-agnostic

* name change
2025-02-21 14:20:40 +01:00
678885bbbd [CI] Check test if the GenerationTesterMixin inheritance is correct 🐛 🔫 (#36180) 2025-02-21 10:18:20 +00:00
a957b7911a Add SigLIP 2 (#36323)
* Docs

* Inits

* Auto classes

* Add siglip base

* Add base tests

* Fix Siglip V1 for fix res version

* Add image processor

* Update conversion

* Experimenting with vectorized embeddings

* Fixup

* Add modular Siglip2Processor

* Add modular configuration

* Rename num patches

* Correct image and text features merging

* Working conversion script

* Refactoring conversion script

* Remove unused code in conversion script

* Shorten dict a bit

* Refactoring conversion

* Done conversion refactoring

* Fixup

* Modular siglip2

* Make model exportable and compilable without graph breaks

* Remove position_ids from image_processor

* REmove position ids from modeling file

* Update modular

* Type hint

* Fixup

* Set defaults to processor

* Add integration test

* Revert spatial shapes back to tensor

* Change order

* Fix most of the tests

* Fix docstring

* Remove interpolate_pos_encoding arg (not needed)

* Update docs

* Standardize processing

* Fix attention_mask in vision head

* Siglip v1: remove double transpose in FA2

* Update modular file

* Update FA2 test

* Update expected logits

* Fix interpolation for siglip2 image processor

* Skip init test

* Skip dispatch on flash test

* Fix modeling tests

* Fixup

* Add dummy objects

* Fix some docstrings

* Add siglip2 in index.md

* Fix consistency

* Add docs

* Remove size and data format

* Add image processor tests

* Fix

* Add fast image processor

* Fix style

* Fix

* Docs

* Set lowercase for tokenizer

* Adjust head size for Siglip v1

* Update siglip2 for consistency with siglip1

* Update siglip2 conversion

* Update pipeline

* Update checkpoints in tests

* Update checkpoint name

* Fix pooling for image classification model

* Fix FA2 test

* Update processor

* Fix check repo

* Update docs

* Fix typos

* Fix docstring for fast image processor

* Add siglip2 to FA2 docs

* Fix fast ip tests

* Fix constitency

* Fix tokenizer class for siglip v1

* Fix missing header

* Refactor scaling for clip, siglip, siglip2

* Remove unused imports

* Make fast IP default for siglip2

* Update docs

* Update checkpoints

* Update modular

* Update paper link

* Fixup

* Fix name in toctree

* Fix test
2025-02-21 09:04:19 +00:00
14552cbd7c VLMs: even more clean-up (#36249)
* squash

* style
2025-02-21 09:46:31 +01:00
e18f233f6c Fix default attention mask of generate in MoshiForConditionalGeneration (#36171) 2025-02-20 19:53:27 +00:00
27d1707586 [smolvlm] make CI green (#36306)
* add smolvlm to toctree

* add requirements

* dev-ci

* no docker changes

* dev-ci

* update torch-light.dockerfile

* derp

* dev-ci
2025-02-20 18:56:11 +01:00
effaef334b fix: prevent second save in the end of training if last step was saved already (#36219)
* fix: prevent second save in the end of training

* fix: prevent second save in the end of training

* test: added test for no duplicate save on epoch save strategy

* fix: removed TrainerControl

* chore: style formatting

---------

Co-authored-by: JaktensTid <jaktenstid1@gmail.com>
2025-02-20 17:38:52 +01:00
12v
5412ff1a13 Fix typo in Pixtral example (#36302)
Fix typo
2025-02-20 14:13:48 +00:00
4397dfcb71 SmolVLM2 (#36126)
* smolvlm init

* updates

* fixing bugs

* minimal run, no checks

* minimal run, no checks

* passing first check + adding url support

* updating video dataloading logic

* fixing image logic

* trying modular, but fails

* modular is working, changing processor to match PR comments and general transformers logic

* fixing kwargs

* offloading video loading logic to  image_util

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* update

* add idefics3-based tests

* add keyword to all

* add PreTrainedModel

* updateing video loading logic

* working inference

* updates for PR comments

* updates for PR comments

* moving SmolVLMPretrainedModel higher to fix import error

* CI test pass

* CI test pass

* removing lambda

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* processor tests

* add example in docs

* typo

* fix copies

* skip compile tests - sdpa for VisionTransformer

* fix init

* raise import error for num2words

* update doc for FA2

* more doc fix

* CI

* updates for PR comments

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* fixing processor -- tokenizer not defined properly, (gpt2 tokenizer), and does not have the attributes of fake image token, etc

* adding smolvlm to VQA models

* removing vqa auto class

* Update src/transformers/models/smolvlm/processing_smolvlm.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* removing smolvlmvisiontransformer from index.md

* my bad, video processing had typos

* fixing docs

* renaming params in SmolVLMModel.inputs_merger

* removing un-needed dtype/device in model forward

* ruff for CI

* update docs

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* return cache position

* return cache position

* return cache also in modular

* needed to run modular again

* fix training tests

* push vectorized inputs merger

* format

* format

* reduce number of mappings

* addressing PR comments

* happy CI, happy me :)

* skip non-nested images

* adjust integration test for smaller GPUs

* format

* fix kwargs in chat template apply

* skip this for now

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-02-20 15:00:26 +01:00
f2ab182dca Ignore conversion files in test fetcher (#36251)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-20 13:32:02 +01:00
e8531a0e33 Fix broken CI on release branch due to missing conversion files (#36275)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-20 13:22:10 +01:00
5e2183f344 Make cache traceable (#35873)
simply make cache traceable
2025-02-20 09:59:25 +01:00
31bb662db1 Fix callback handler reference (#36250)
* fix reference

* style
2025-02-19 18:17:33 +01:00
78d6484675 docs: Update README_zh-hans.md (#36269)
Update README_zh-hans.md

docs: Fix awkward sentence in README
2025-02-19 09:04:46 -08:00
e5cea20743 Add Example for Custom quantization (#36286)
* add example

* rename
2025-02-19 17:09:23 +01:00
e3d99ec2f5 [tests] make test_from_pretrained_low_cpu_mem_usage_equal less flaky (#36255)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-19 15:14:02 +00:00
99adc74462 [tests] remove flax-pt equivalence and cross tests (#36283) 2025-02-19 15:13:27 +00:00
fa8cdccd91 [tests] deflake dither test (#36284) 2025-02-19 15:13:10 +00:00
60226c6ff3 TP initialization module-by-module (#35996)
* module-by-module loading!

* Update modeling_utils.py

* dtyle and comments

* Update modeling_utils.py

* Update modeling_utils.py

* Update test

* Update modeling_utils.py

* Update modeling_utils.py

* Update test_tp.py

* Update test_tp.py

* Update modeling_utils.py

* re-trigger CIs

* re-trigger CIs
2025-02-19 14:04:57 +01:00
0863eef248 [tests] remove pt_tf equivalence tests (#36253) 2025-02-19 11:55:11 +00:00
1a81d774b1 Add dithering to the Speech2TextFeatureExtractor API. (#34638)
* Add dithering to the `Speech2TextFeatureExtractor` API.

- in kaldi : 4a8b7f6732/src/feat/feature-window.cc (L145)
- with dithering without a seed, the features become non-deterministic due
  to small Gaussian noise added to the audio (i.e. 2 runs lead to little
  different outputs)

* update the PR

- add dithering also for WhisperFeatureExtractor
- not adding to Wav2Vec2FeatureExtractor (no FBANK computation)

* add unit-tests for dithering, fix docstrings

* ruff

* utils/check_copies.py --fix_and_overwrite

* update code, add seed to unit-test

* adding explanation of dithering
2025-02-19 11:50:02 +01:00
9f51dc2535 Add support for post-processing kwargs in image-text-to-text pipeline (#35374)
* fix error and improve pipeline

* add processing_kwargs to apply_chat_template

* change default post_process kwarg to args

* Fix slow tests

* fix copies
2025-02-18 17:43:36 -05:00
9b479a245b Uniformize LlavaNextVideoProcessor kwargs (#35613)
* Uniformize processor kwargs and add tests

* add videos_kwargs tests

* fix copies

* fix llava_next_video chat template tests

* remove unnecessary default kwargs
2025-02-18 14:13:51 -05:00
8ee50537fe Qwen2VL fix cos,sin dtypes to float when used with deepspeed (#36188)
* fix dtype of cos,sin when used with deepspeed

* move sin,cos casting withing flash attention functions

* fix cos,sin float casting in modular

---------

Co-authored-by: ardalan.mehrani <ardalan.mehrani@ardalanmehranis-MacBook-Pro.local>
Co-authored-by: ardalan.mehrani <ardalan.mehrani@bytedance.com>
2025-02-18 19:18:29 +01:00
8eaae6bee9 Added Support for Custom Quantization (#35915)
* Added Support for Custom Quantization

* Update code

* code reformatted

* Updated Changes

* Updated Changes

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-18 16:14:19 +01:00
07182b2e10 GitModelIntegrationTest - flatten the expected slice tensor (#36260)
Flatten the expected slice tensor
2025-02-18 16:04:19 +01:00
4d2de5f63c Fix XGLM loss computation (PyTorch and TensorFlow) (#35878)
* Fix XGLM loss computation (PyTorch and TensorFlow)

* Update expected output string in XGLM sample test

This updates the expected output string of test_xglm_sample for torch
2.0 to the correct one and removes the one for torch 1.13.1 + cu116
(transformers moved to torch 2.0 with PR #35358).

* Update expected output IDs in XGLM generation test
2025-02-18 15:37:48 +01:00
c3ba53303b feat: add support for tensor parallel training workflow with accelerate (#34194)
* feat: add support for tensor parallel flow using accelerate

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: add tp degree to env variable

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: add version check for accelerate to allow TP

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* docs: tensor parallelism

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* nit: rename plugin name

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* fix: guard accelerate version before allow tp

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

* docs: add more docs and updates related to TP

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

---------

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-18 14:05:46 +01:00
e6cc410d5b Remove flakiness in VLMs (#36242)
* fix

* nit

* no logits processor needed

* two more tests on assisted decoding
2025-02-18 11:41:07 +01:00
fdcfdbfd22 Fix TorchAoConfig not JSON serializable (#36206)
**Summary:** TorchAoConfig optionally contains a
`torchao.dtypes.Layout` object which is a dataclass and not
JSON serializable, and so the following fails:

```
import json
from torchao.dtypes import TensorCoreTiledLayout
from transformers import TorchAoConfig

config = TorchAoConfig("int4_weight_only", layout=TensorCoreTiledLayout())

config.to_json_string()

json.dumps(config.to_dict())
```

This also causes `quantized_model.save_pretrained(...)` to
fail because the first step of this call is to JSON serialize
the config. Fixes https://github.com/pytorch/ao/issues/1704.

**Test Plan:**
python tests/quantization/torchao_integration/test_torchao.py -k test_json_serializable

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-18 11:05:42 +01:00
626666c444 Au revoir flaky test_fast_is_faster_than_slow (#36240)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-17 18:30:07 +01:00
429f1a682d [tests] remove test_export_to_onnx (#36241) 2025-02-17 16:52:44 +00:00
dae8708c36 Add compressed tensor in quant dockerfile (#36239)
add compressed_tensors in the dockerfile
2025-02-17 17:48:57 +01:00
3e970dbbf1 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/codeparrot/examples (#36237)
Bump transformers in /examples/research_projects/codeparrot/examples

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-17 16:28:43 +00:00
77aa9fc076 [generate] Fix encoder decoder models attention mask (#36018) 2025-02-17 15:42:28 +00:00
55493f1390 [tests] remove tf/flax tests in /generation (#36235) 2025-02-17 14:59:22 +00:00
c877c9fa5b v4.45.0-dev0 2025-02-17 15:21:20 +01:00
7ec35bc3bd Add missing atol to torch.testing.assert_close where rtol is specified (#36234) 2025-02-17 14:57:50 +01:00
dad513e0c2 [generate] remove cache v4.47 deprecations (#36212) 2025-02-17 13:55:03 +00:00
936aeb70ab AMD DeepSpeed image additional HIP dependencies (#36195)
* Add hipsolver and hipblastlt as dependencies

* Upgrade torch libs with rocm6.2.4 index
2025-02-17 11:50:49 +01:00
23d6095e8f Fix LlavaForConditionalGenerationModelTest::test_config after #36077 (#36230)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-17 11:49:07 +01:00
fae0f3dde8 [tests] fix EsmModelIntegrationTest::test_inference_bitsandbytes (#36225)
fix failed test
2025-02-17 11:10:33 +01:00
dd16acb8a3 set test_torchscript = False for Blip2 testing (#35972)
* just skip

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-14 17:43:32 +01:00
0a9923a609 Use args.num_workers in check_modular_conversion.py (#36200)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-14 17:31:03 +01:00
a570e2ba87 add shared experts for upcoming Granite 4.0 language models (#35894)
* Modular GraniteMoE with shared Experts.

Signed-off-by: Shawn Tan <shawntan@ibm.com>

* Modified

* Import order.

* Modified for style

* Fix space.

* Test

* Remove extra granitemoe file.

* New converted file and tests

* Modified __init__ files.

* Formatting.

* Dummy PT objects

* register granitemoe shared model

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix linting of a file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix import in modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update generated modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* add documentation

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update docstrings

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* update generated modeling file

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix docstrings in config class

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* merge main

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by: Shawn Tan <shawntan@ibm.com>
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Shawn Tan <shawntan@ibm.com>
Co-authored-by: Shawn Tan <shawn@wtf.sg>
Co-authored-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Sukriti Sharma <Ssukriti@users.noreply.github.com>
2025-02-14 16:55:28 +01:00
7ae7e87a09 Add @require_bitsandbytes to Aria test_batched_generation (#36192) 2025-02-14 15:48:47 +01:00
bcfc9d795e [Bugfix] Fix reloading of pixtral/llava configs (#36077)
* add is_composition flag to LlavaConfig

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* WIP: pixtral text config

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add test

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use is_composition for pixtral

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Revert "use is_composition for pixtral"

This reverts commit a53d5f9fc5149c84419b0e9e03db6d99362add53.

* Revert "Revert "use is_composition for pixtral""

This reverts commit 3ab1c99404e2c2963fba0bcf94b9786d6365db0f.

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-02-14 15:27:05 +01:00
0c78ef6cd3 🔴 VLM: compile compatibility (#35724)
* llavas

* add mroe models

* fix `compile_forward` test for all models

* fix copies

* make style

* also doesn't support cache class

* fix some tests

* not copied from

* ci green?

* fix tests

* fix copies

* fix tests

* check with `numel` and remove `item`

* fix copies

* fix copies

* Update src/transformers/models/cohere2/modeling_cohere2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* opt remove cross attn

* gemma2

* fixup

* fixup

* fix newly added test

* maybe fixed?

* green please?

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-14 15:23:49 +01:00
b45cf0e90a Guard against unset resolved_archive_file (#35628)
* archive_file may not be specified
When loading a pre-trained model from a gguf file, resolved_archive_file may not be set. Guard against that case in the safetensors availability check.

* Remap partial disk offload to cpu for GGUF files
GGUF files don't support disk offload so attempt to remap them to the CPU when device_map is auto. If device_map is anything else but None, raise a NotImplementedError.

* Don't remap auto device_map and raise RuntimeError
If device_map=auto and modules are selected for disk offload, don't attempt to map them to any other device. Raise a runtime error when a GGUF model is configured to map any modules to disk.

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-14 14:44:31 +01:00
96f01a36ac Revert qwen2 breaking changes related to attention refactor (#36162)
* dito

* add a test

* upsate

* test needs fa2

* update test and configuration

* test requires fa2

* style
2025-02-14 13:44:14 +01:00
cb586a3999 Add require_read_token to fp8 tests (#36189)
fix
2025-02-14 12:27:35 +01:00
5f726f8b8e New HIGGS quantization interfaces, JIT kernel compilation support. (#36148)
* new flute

* new higgs working

* small adjustments

* progress and quallity

* small updates

* style

---------

Co-authored-by: Andrey Panferov <panferov.andrey3@wb.ru>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-14 12:26:45 +01:00
15ec971b8e Prepare processors for VideoLLMs (#36149)
* allow processor to preprocess conversation + video metadata

* allow callable

* add test

* fix test

* nit: fix

* add metadata frames_indices

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* port updates from Orr and add one more test

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* typo

* as dataclass

* style

* docstring + maek sure tests green

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-02-14 11:34:08 +01:00
33d1d715b0 Add ImageProcessorFast to Qwen2.5-VL processor (#36164)
* add qwen2 fast image processor to modular file

Signed-off-by: isotr0py <2037008807@qq.com>

* fix modular

Signed-off-by: isotr0py <2037008807@qq.com>

* fix circle import

Signed-off-by: isotr0py <2037008807@qq.com>

* add docs

Signed-off-by: isotr0py <2037008807@qq.com>

* fix typo

Signed-off-by: isotr0py <2037008807@qq.com>

* add modular generated files

Signed-off-by: isotr0py <2037008807@qq.com>

* revert qwen2vl fast image processor

Signed-off-by: isotr0py <2037008807@qq.com>

* remove qwen2.5-vl image processor from modular

Signed-off-by: isotr0py <2037008807@qq.com>

* re-generate qwen2.5-vl files

Signed-off-by: isotr0py <2037008807@qq.com>

* remove unnecessary test

Signed-off-by: isotr0py <2037008807@qq.com>

* fix auto map

Signed-off-by: isotr0py <2037008807@qq.com>

* cleanup

Signed-off-by: isotr0py <2037008807@qq.com>

* fix model_input_names

Signed-off-by: isotr0py <2037008807@qq.com>

* remove import

Signed-off-by: isotr0py <2037008807@qq.com>

* make fix-copies

Signed-off-by: isotr0py <2037008807@qq.com>

---------

Signed-off-by: isotr0py <2037008807@qq.com>
2025-02-14 17:34:55 +08:00
1931a35140 Chat template docs (#36163)
* decompose chat template docs

* add docs

* update model docs

* qwen2-5

* pixtral

* remove old chat template

* also video as list frames supported

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* remove audio for now

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-14 10:32:14 +01:00
3bf02cf440 CI: fix test-save-trainer (#36191)
* fix

* also the docstring
2025-02-14 10:20:56 +01:00
0ae93d31ce Add support for partial rotary embeddings in Phi3 model (#35947)
* Added support for partial_rotary_factor

* addressed comments

* refactored
2025-02-14 09:37:38 +01:00
336dc69d63 Uniformize OwlViT and Owlv2 processors (#35700)
* uniformize owlvit processor

* uniformize owlv2

* nit

* add positional arg test owlvit

* run-slow: owlvit, owlv2

* run-slow: owlvit, owlv2

* remove one letter variable
2025-02-13 17:30:26 -05:00
e6a7981711 Fix make_batched_videos and add tests (#36143)
* add support for initial shift in video processing and other fixes

* revert modifications video loading functions
2025-02-13 17:14:30 -05:00
8fd4bc7d1d Fix a mistake in #36175 (#36179)
fix my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 18:33:02 +01:00
b1a2de075d Follow up to SpQR integration (#36176)
fix
2025-02-13 17:40:59 +01:00
12962fe84b Fix the key name for _load_rng_state under torch.cuda (#36138)
fix load key name for _load_rng_state under torch.cuda

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 11:35:08 -05:00
bfe46c98b5 Make check_repository_consistency run faster by MP (#36175)
* speeddddd

* speeddddd

* speeddddd

* speeddddd

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 17:25:17 +01:00
5f0fd1185b Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks (#35837)
* Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks

* Make rotary_pos_emb optional & fix type

* Adapt pre-computed cos/sin to Qwen2.5VL

* More concise
2025-02-13 17:10:58 +01:00
d72642bccc Use tqdm auto (#35726)
* Remove traces of the progressbar

* Use tqdm auto
2025-02-13 15:41:30 +00:00
62c7ea0201 CI: avoid human error, automatically infer generative models (#33212)
* tmp commit

* move tests to the right class

* remove ALL all_generative_model_classes = ...

* skip tf roberta

* skip InstructBlipForConditionalGenerationDecoderOnlyTest

* videollava

* reduce diff

* reduce diff

* remove  on vlms

* fix a few more

* manual rebase bits

* more manual rebase

* remove all manual generative model class test entries

* fix up to ernie

* a few more removals

* handle remaining cases

* recurrent gemma

* it's better here

* make fixup

* tf idefics is broken

* tf bert + generate is broken

* don't touch tf :()

* don't touch tf :(

* make fixup

* better comments for test skips

* revert tf changes

* remove empty line removal

* one more

* missing one
2025-02-13 16:27:11 +01:00
06231fdfc7 add disable compile option (#36161)
* add disable compile code

* fix
2025-02-13 16:24:46 +01:00
0ca7259217 fix training issues (#36158)
* fix training issues

* Update

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 16:24:28 +01:00
845b0a2616 Efficient Inference Kernel for SpQR (#34976)
* Resolve vptq conflict

* Rename spqr package to spqr_quant

* Get rid of aqlm mention

* Start working on tests

* Resolve ruff code checks

* Ruff format

* Isort

* Test updates

* Add gpu tag

* Rename to modules_to_not_convert

* Config update

* Docs and config update

* Docs and config update

* Update to update_torch_dtype

* spqr config parameter validation

* Ruff update

* Apply ruff fixes

* Test fixes

* Ruff update

* Mark tests as @slow again; Ruff; Docstring update

* Ruff

* Remove absolute path

* Resolve typo

* Remove redundandt log

* Check accelerate/spqr availability

* Ruff fix

* Check if the config contains proper shapes

* Ruff test

* Documentation update

* overview update

* Ruff checks

* Ruff code quality

* Make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update spqr.md

* Enable gptqmodel (#35012)

* gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update readme

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix warning

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix version check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert unrelated changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix requires gptq

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format again

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attibutes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix memory check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix result check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix : Nemotron Processor in GGUF conversion (#35708)

* fixing nemotron processor

* make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add missing TOC to doc

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-13 16:22:58 +01:00
c5506f4f00 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/adversarial (#36168)
Bump transformers in /examples/research_projects/adversarial

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 15:06:16 +00:00
d7c5d1b539 Bump transformers from 4.38.0 to 4.48.0 in /examples/tensorflow/language-modeling-tpu (#36167)
Bump transformers in /examples/tensorflow/language-modeling-tpu

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 14:46:38 +00:00
636ee57489 [generate] revert change in Aria: the maximum cache length must match max_length (#36120)
* revert inputs_embeds len

* Update test_utils.py

* make fixup
2025-02-13 14:36:33 +00:00
b41591d847 Fix : fix doc fp8 (#36173)
* fix

* fix
2025-02-13 15:29:59 +01:00
b079dd1fa2 Fix red CI (#36174)
test was weird
2025-02-13 14:27:55 +01:00
d114a6f78e [Modular] skip modular checks based on diff (#36130)
skip modular checks based on diff
2025-02-13 12:53:21 +00:00
6397916dd2 Remove loading custom kernel for RT-DETRv2 (#36098)
* Remove loading custom kernels

* Remove config param

* Fixup
2025-02-13 12:01:53 +00:00
efe72fe21f Adding FP8 Quantization to transformers (#36026)
* first commit

* adding kernels

* fix create_quantized_param

* fix quantization logic

* end2end

* fix style

* fix imports

* fix consistency

* update

* fix style

* update

* udpate after review

* make style

* update

* update

* fix

* update

* fix docstring

* update

* update after review

* update

* fix scheme

* update

* update

* fix

* update

* fix docstring

* add source

* fix test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 13:01:19 +01:00
c82319b493 Helium documentation fixes (#36170)
* Helium documentation fixes

* Update helium.md

* Update helium.md

* Update helium.md
2025-02-13 12:20:53 +01:00
8f137b2427 Move DataCollatorForMultipleChoice from the docs to the package (#34763)
* Add implementation for DataCollatorForMultipleChoice based on docs.

* Add DataCollatorForMultipleChoice to import structure.

* Remove custom DataCollatorForMultipleChoice implementations from example scripts.

* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.

* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.

* Apply suggested changes and run make fixup.

* fix copies, style and fixup

* add missing documentation

* nits

* fix docstring

* style

* nits

* isort

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-02-13 12:01:28 +01:00
35c155052d Fix PretrainedTokenizerFast check => Fix PretrainedTokenizerFast Save (#35835)
* Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* add tokenizer class type test

* code review

* code opt

* fix bug

* Update test_tokenization_fast.py

* ruff check

* make style

* code opt

* Update test_tokenization_fast.py

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
2025-02-13 12:00:33 +01:00
3c912c9089 docs: fix return type annotation of get_default_model_revision (#35982) 2025-02-13 11:59:15 +01:00
6a1ab634b6 qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 (#36083)
* qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1

* fix

* fix

* fix

* fix

* add tests

* fix test bugs

* fix

* fix failed tests

* fix
2025-02-13 11:35:28 +01:00
d419862889 Fix tests for vision models (#35654)
* Trigger tests

* [run-slow] beit, detr, dinov2, vit, textnet

* Fix BEiT interpolate_pos_encoding

* Fix DETR test

* Update DINOv2 test

* Fix textnet

* Fix vit

* Fix DPT

* fix data2vec test

* Fix textnet test

* Update interpolation check

* Fix ZoeDepth tests

* Update interpolate embeddings for BEiT

* Apply suggestions from code review
2025-02-13 10:28:37 +00:00
e60ae0d078 Replace deprecated update_repo_visibility (#35970) 2025-02-13 11:27:55 +01:00
9065cf0d92 Fix Gemma2 dtype issue when storing weights in float16 precision (#35398)
fix gemma2 dtype issue when storing weights in float16 precision
2025-02-13 11:17:37 +01:00
08ab1abff4 Add reminder config to issue template and print DS version in env (#35156)
* update env command to log deepspeed version

* suppress deepspeed import logging

* Add reminder to include configs to repro description in bug report.

* make fixup

* [WIP] update import utils for deepspeed

* Change to using is_deepspeed_available() from integrations.

* make fixup
2025-02-13 10:55:49 +01:00
950cfb0b4f Fix PaliGemma Pad Token Masking During Training #35855 (#35859)
* change order of unmasking of tokens

* library import

* class setup

* test function

* refactor

* add commit message

* test modified

* explict initiliasation of weights + made model smaller

* removed sepete testing file

* fixup

* fixup core

* test attention mask with token types

* tests fixup

* removed PaliGemmaAttentionMaskTest class

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-13 10:11:44 +01:00
1614d196e8 Mllama fsdp (#36000)
* pixel input assignment revoked

* double send

* Update src/transformers/models/mllama/modeling_mllama.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-13 09:49:39 +01:00
847854b023 Add git LFS to AMD docker image (#36016)
Add git lfs to AMD docker image
2025-02-12 22:27:21 +01:00
9985d06add skip test_initialization for VitPoseBackboneModelTest for now (#36154)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-12 18:24:24 +01:00
4a5a7b991a Fix test fetcher (#36129)
* fix

* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-12 17:35:41 +01:00
1fae54c721 Add more rigerous non-slow grad accum tests (#35668)
* Add more rigerous non-slow grad accum tests

* Further nits

* Re-add space

* Readbility

* Use tinystories instead

* Revert transformer diff

* tweak threshs
2025-02-12 10:26:21 -05:00
f869d486d3 Update doc re list of models supporting TP (#35864)
Update doc about models' TP support
2025-02-12 15:53:27 +01:00
281c0c8b5b adding option to save/reload scaler (#34932)
* Adding option to save/reload scaler

* Removing duplicate variable

* Adding save/reload test

* Small fixes on deterministic algorithm call

* Moving LLM test to another file to isolate its environment

* Moving back to old file and using subprocess to run test isolated

* Reverting back accidental change

* Reverting back accidental change
2025-02-12 15:48:16 +01:00
a33ac830af Fix multi gpu loss sync condition, add doc and test (#35743)
* Fix multi gpu loss sync condition, add doc and test

* rename function and class

* loss should not scale during inference

* fix typo
2025-02-12 15:41:31 +01:00
08c4959a23 Optim: APOLLO optimizer integration (#36062)
* Added APOLLO optimizer integration

* fix comment

* Remove redundancy: Modularize low-rank optimizer construction

* Remove redundancy: Remove useless comment

* Fix comment: Add typing

* Fix comment: Rewrite apollo desc
2025-02-12 15:33:43 +01:00
2440512723 multi-gpu: fix tensor device placements for various models (#35763)
* milti-gpu: fix inputs_embeds + position_embeds

Fixing the following errors in few models:
```
>       hidden_states = inputs_embeds + pos_embeds
E       RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
```

Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* multi-gpu: fix tensor device placements for various models

Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Apply make fix-copies

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-02-12 15:28:18 +01:00
befea8c4f0 🚨 Remove cache migration script (#35810)
* Remove cache migration script

* remove dummy move_cache
2025-02-12 15:12:38 +01:00
d52a9d08ce Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142)
Bump cryptography in /examples/research_projects/decision_transformer

Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:34:52 +00:00
31e4831b98 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/vqgan-clip (#36136)
Bump transformers in /examples/research_projects/vqgan-clip

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:21:09 +00:00
243aeb7c4a Fix Gradient Checkpointing for Deberta & Deberta-V2 using PEFT / Adapters (#35898)
Replace In-Place Operations for Deberta and Deberta-V2
2025-02-12 14:21:01 +01:00
8a2f062eac [commands] remove deprecated/inoperational commands (#35718)
rm deprecated/inoperational commands
2025-02-12 12:23:58 +00:00
8fc6ecba4f VLM: enable skipped tests (#35746)
* fix cached tests

* fix some tests

* fix pix2struct

* fix
2025-02-12 12:55:46 +01:00
d6897b46bd Add utility for Reload Transformers imports cache for development workflow #35508 (#35858)
* Reload transformers fix form cache

* add imports

* add test fn for clearing import cache

* ruff fix to core import logic

* ruff fix to test file

* fixup for imports

* fixup for test

* lru restore

* test check

* fix style changes

* added documentation for usecase

* fixing

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-12 12:45:11 +01:00
1cc7ca3295 Whisper: remove redundant assisted generation tests (#34814)
* remove redundant test

* delete another test

* revert default max_length

* (wrong place, moving)
2025-02-12 11:37:19 +00:00
0cd5e2dfd0 added warning to Trainer when label_names is not specified for PeftModel (#32085)
* feat: added warning to Trainer when label_names is not specified for PeftModel

* Update trainer.py

* feat: peft detectw ith `_is_peft_model`

* Update src/transformers/trainer.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Applied formatting in trainer.py

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-12 12:34:47 +01:00
377d8e2b9c add RAdamScheduleFree optimizer (#35313)
* add RAdamScheduleFree optimizer

* revert schedulefree version to the minimum requirement

* refine is_schedulefree_available so that it can take min_version

* refine documents

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-12 11:31:51 +01:00
f5fff672db Add pipeline parallel plan to PretrainedConfig and PreTrainedModel (#36091)
* Add `base_model_pp_plan` to `PretrainedConfig`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add `_pp_plan` to `PreTrainedModel`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add both to Llama for testing

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Fix type error

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update to suggested schema

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* `_pp_plan` keys are not patterns

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Simplify schema

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Fix typing error

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update input name for Llama

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Aria

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Bamba

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Cohere 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to diffllama and emu3

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Gemma 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to GLM and GPT NeoX

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Granite and Helium

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Mistral and Mixtral

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to OLMo 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Phi and Phi 3

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan for Starcoder 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add enum for accessing inputs and outputs

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update type hints to use tuples

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Change outer list to tuple

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-02-12 10:51:48 +01:00
11afab19c0 [docs] update awq doc (#36079)
* update awq doc

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add note for inference

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-11 10:35:28 -08:00
9b69986e8a [docs] minor doc fix (#36127)
fix
2025-02-11 10:31:12 -08:00
1b57de8dcf Make output_dir Optional in TrainingArguments #27866 (#35735)
* make output_dir optional

* inintaied a basic testing module to validate and verify the changes

* Test output_dir default to 'tmp_trainer' when  unspecified.

* test existing functionality of output_dir.

* test that output dir only created when needed

* final check

* added doc string and changed the tmp_trainer to trainer_output

* amke style fixes to test file.

* another round of fixup

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-11 18:54:36 +01:00
03534a92f8 update tiktoken integ to use converted (#36135) 2025-02-11 18:27:22 +01:00
3a5c328fd8 Fix CI issues (#35662)
* make explicit gpu dep

* [run-slow] bamba
2025-02-11 18:17:01 +01:00
775252abd4 Fix max size deprecated warning (#34998)
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning

* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning

* Remove deprecated warnings and eliminate `max_size` usage

* Test use `int` as argument for `size`
Add a test to ensure test can pass successfully and backward compatibility

* The test pipelines still use `max_size`
Remove `max_size` from test pipelines and replace by `size` by a `Dict` with `'shortest_edge'` `'longest_edge'` as keys

* Reformatting

* Reformatting

* Revert "Reformatting"

This reverts commit c3040acee75440357cffd1f60c9d29ff5b2744b8.

* Revert "Reformatting"

This reverts commit ac4522e5c9a02d2d0c298295026db68ea26453df.

* Revert "The test pipelines still use `max_size`"

This reverts commit eaed96f041ffc32459536e1524d87f7a12ddee29.

* Revert "Test use `int` as argument for `size`"

This reverts commit 1925ee38c7c5eabb11832316712df1d4ba8043d0.

* Revert "Remove deprecated warnings and eliminate `max_size` usage"

This reverts commit d8e7e6ff9025931468fc1f3827cda1fa391003d5.

* Change version `4.26` to "a future version"

* Reformatting

* Revert "Change version `4.26` to "a future version""

This reverts commit 2b53f9e4
2025-02-11 18:14:31 +01:00
5489fea557 update awesome-transformers.md. (#36115)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-02-11 15:55:49 +00:00
76048be419 fix: typos in documentation files (#36122)
* Update tools.py

* Update text_generation.py

* Update question_answering.py
2025-02-11 13:47:20 +00:00
f42d46ccb4 Add common test for torch.export and fix some vision models (#35124)
* Add is_torch_greater_or_equal test decorator

* Add common test for torch.export

* Fix bit

* Fix focalnet

* Fix imagegpt

* Fix seggpt

* Fix swin2sr

* Enable torch.export test for vision models

* Enable test for video models

* Remove json

* Enable for hiera

* Enable for ijepa

* Fix detr

* Fic conditional_detr

* Fix maskformer

* Enable test maskformer

* Fix test for deformable detr

* Fix custom kernels for export in rt-detr and deformable-detr

* Enable test for all DPT

* Remove custom test for deformable detr

* Simplify test to use only kwargs for export

* Add comment

* Move compile_compatible_method_lru_cache to utils

* Fix beit export

* Fix deformable detr

* Fix copies data2vec<->beit

* Fix typos, update test to work with dict

* Add seed to the test

* Enable test for vit_mae

* Fix beit tests

* [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr

* Add vitpose test

* Add textnet test

* Add dinov2 with registers

* Update tests/test_modeling_common.py

* Switch to torch.testing.assert_close

* Fix masformer

* Remove save-load from test

* Add dab_detr

* Add depth_pro

* Fix and test RT-DETRv2

* Fix dab_detr
2025-02-11 11:37:31 +00:00
1779f5180e Fix nighlty CIs: missing atols (#35903)
fix osme missing atols
2025-02-11 10:49:21 +01:00
1feebb5b41 AutoformerForPrediction test add atol (#36017) 2025-02-10 19:22:24 +01:00
be2ac0916a [generate] shape checks in tests compatible with fixed-length caches (+ some minor fixes) (#35993)
* shape checks compatible with static cache

* add test

* tmp

* manually turn on eager attn when we want to output attn

* typo

* generalize to encoder-decoder models

* force compilation on cpu

* tmp commit

* fix static cache shape checks

* models with odd caches

* fix copies

* shorter cache search loop

* use decoder_past_key_values everywhere

* better test variable names and comments

* signature

* rename _check_outputs into _check_generate_outputs

* add comments

* HybridCache future test note
2025-02-10 17:50:54 +00:00
9510ae39d9 fix bnb warning (#36116)
fix
2025-02-10 17:34:50 +01:00
09261ccf12 [Bugfix] fix file name of docstring in utils/check_table.py (#36108)
fix file name

Co-authored-by: kkscilife <qa-caif-cicd@pjlab.org.cn>
2025-02-10 15:48:02 +00:00
d4a6b4099b Revert checkpoint tmp dir (#36112)
* Revert "Fix OS err (#36094)"

This reverts commit ba29a439adbe6f371710d0514659127264ae24b3.

* Revert "Save checkpoint to temporary directory to handle partial saves during failures (#35580)"

This reverts commit 20d17358c468b7aefca9e54c3461eb88d1ee34f9.
2025-02-10 16:22:03 +01:00
0baf003915 Refactor OPT model (#36101)
* remove cross attention

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* remove is_decoder

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix pkv

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-02-10 14:27:16 +01:00
924f1c717a Remove Multi-threaded image conversion for fast image processors (#36105)
remove multithreaded image conversion

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 07:59:34 -05:00
3897f2caf8 Enable pytest live log and show warning logs on GitHub Actions CI runs (#35912)
* fix

* remove

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-10 13:36:20 +01:00
48a309d0d2 Support constant lr with cooldown (#35453)
* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and decay methods to 'get_wsd_schedule'

* support num_training_steps and num_stable_steps for get_wsd_schedule

* support num_training_steps and num_stable_steps for get_wsd_schedule

* get wsd scheduler before the `num_training_steps` decision

* fix code_quality

* Update stable branch logic

* fix code_quality

* Move stable stage decide to `get_wsd_schedule`

* Update docstring of `get_wsd_schedule`

* Update `num_train_steps` to optional

* Update `num_train_steps` to optional

* Update docstring of `get_wsd_schedule`

* Update src/transformers/optimization.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 13:21:55 +01:00
9a6be63fdb Add Apple's Depth-Pro for depth estimation (#34583)
* implement config and model building blocks

* refactor model architechture

* update model outputs

* update init param to include use_fov_model

* update param name in config

* fix hidden_states and attentions outputs for fov

* sort config

* complete minor todos

* update patching

* update config for encoder

* fix config

* use correct defaults in config

* update merge for compatibility with different image size

* restructure encoder for custom configuration

* make fov model compatible with custom config

* replace word "decoder" with "fusion"

* weight conversion script

* fix fov squeeze

* update conversion script (without test)

* upload ruff image processing

* create fast image processing

* use torch interpolation for image processing

* complete post_process_depth_estimation

* config: fix imports and sort args

* apply inference in weight conversion

* use mllama script instead for weight conversion

* clean weight conversion script

* add depth-pro status in other files

* fill docstring in config

* formatting

* more formatting

* formatting with ruff

* formatting with style

* fix copied classes

* add examples; update weight convert script

* fix using check_table.py and isort

* fix config docstring

* add depth pro to sdpa docs

* undo unintentional changes in configuration_gemma.py

* minor fixes

* test image processing

* fixes and tests

* more fixes

* use output states from image_encoder instead

* Revert "use output states from image_encoder instead"

This reverts commit 2408ec54e4f27d2abbecdb8374e58f34d91d8e96.

* make embeddings dynamic

* reshape output hidden states and attentions as part of computation graph

* fix ruff formating

* fix docstring failure

* use num_fov_head_layers in tests

* update doc

* check consistency with config

* ruff formatting

* update test case

* fix ruff formatting

* add tests for fov

* use interpolation in postprocess

* run and fix slow tests locally

* use scaled_images_features for image and fov encoder

* return fused_hidden_states in fusion stage

* fix example

* fix ruff

* fix copyright license for all files

* add __all__ for each file

* minor fixes
- fix download spell
- add push_to_hub option
- fix Optional type hinting
- apply single loop for DepthProImageProcessor.preprocess

* return list in post_process_depth_estimation

* minor fixes
- capitalize start of docstring
- use ignore copy
- fix examples
- move docstring templates and custom output classes to top
- remove "-> None" typehinting from __init__
- type hinting for forward passes
- fix docstrings for custom output classes

* fix "ruff check"

* update upsample and projection

* major changes: (image size and merge optimization)
- add support for images of any size
- optimize merge operation
- remove image_size from config
- use full names instead of B, C, H, W
- remove interpolation from fusion stage
- add interpolation after merge
- move validations to config
- update integration test
- add type hints for functions

* fix push_to_hub option in weights conversion

* remove image_size in weights conversion

* major changes in the architecture
- remove all DepthProViT modules and support different backbones using the AutoModel API
- set default use_fov_model to False
- validate parameters in configuration
- update interpolate function: use "nearest" for faster computation
- update reshape_feature function: remove all special tokens, possible from different backbones
- update merge function: use padding from config instead of merge_out_size
- remove patch_to_batch and batch_to_patch conversions for now
- calculate out_size dynamically in the encoder
- leave head_mask calculation to the backbone
- fix bugs with merge
- add more comments
- update tests

* placeholder for unused config attributes

* improve docs amid review

* minor change in docs

* further optimize merge

* fix formatting

* remove unused patch/batch convertion functions

* use original F.interpolate

* improve function naming

* minor chages
- use torch_int instead of int
- use proper for newly initialized tensors
- use user provided return_dict for patch_encoder
- use if-else block instead in self.use_fov_model

* rearchitect upsample block for improved modularity

* update upsample keys in weight conversion

* improve padding in merge_patches

* use double-loop for merge

* update comments

* create feature_extractor, reduce some forward code

* introduce config.use_mask_token in dinov2

* minor fixes

* minor fixes for onnx

* update __init__ to latest format

* remove DepthProConfig.to_dict()

* major changes in backbone

* update config in weight conversion

* formatting

* converted model is fp32

* improve naming and docs for feature_extractor->reconstruct_feature_maps

* minor fixes; amid review

* create intermediate vars in func call

* use torch.testing.assert_close

* use ModuleList instead of Sequential and ModuleDict

* update docs

* include fov in integraiton tests

* update docs

* improve initialization of convolution layers

* fix unused fov keys

* update tests

* ruff format

* fix test, amid kaimming initialization

* add depthpro to toctree

* add residual layer to _no_split_modules

* architecture rework

* Update src/transformers/models/depth_pro/image_processing_depth_pro.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* update docs

* improve merge_patches

* use flatten with fov_output

* ruff formatting

* update resources section in docs

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix typo "final_kernal_size"

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix output typehint for DepthProDepthEstimator

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* residual operation in 2 steps

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* use image_size instead of global patch_size in interpolation

* replace all Sequential with ModuleList

* update fov

* update heads

* fix and update conversion script for heads

* ruff formatting

* remove float32 conversion

* use "Fov" instead of "FOV" in class names

* use "Fov" instead of "FOV" in config docs

* remove prune_heads

* update fusion stage

* use device in examples

* update processor

* ruff fixes

* add do_rescale in image_processor_dict

* skip test: test_fast_is_faster_than_slow

* ruff formatting

* DepthProImageProcessorFast in other files

* revert antialias removal

* add antialias in BaseImageProcessorFast

* Revert "revert antialias removal"

This reverts commit 5caa0bd8f9f7463b98410c04e6cfe8fef3adee18.

* Revert "add antialias in BaseImageProcessorFast"

This reverts commit 3ae1134780ae236872985523d9c0a444eabcc179.

* update processor for grouping and antialias

* try test_fast_is_faster_than_slow without "skip" or "flanky"

* update checkpoint

* update checkpoint

* use @is_flanky for processor test

* update checkpoint to "apple/DepthPro-hf"

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-10 11:32:45 +00:00
c399921965 Paligemma: revert #36084 (#36113)
* revert

* type check
2025-02-10 12:04:24 +01:00
eebd2c972c Chat template: update for processor (#35953)
* update

* we need batched nested input to always process correctly

* update a bit

* fix copies
2025-02-10 09:52:19 +01:00
5bd7694781 Processors: allow tuples of images when checking (#36084)
allow tuples of images
2025-02-10 09:35:13 +01:00
3a3b06ace4 fix MllamaVisionAttention typehint (#35975)
* fix MllamaVisionAttention typehint

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Update src/transformers/models/mllama/modeling_mllama.py

Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>

* fix suggestion

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
2025-02-10 09:17:10 +01:00
6b55046213 [docs] fix not-working example code in perf_infer_gpu_one.md (#36087)
* bug fix

* update memory limit
2025-02-07 12:42:22 -08:00
14ca7f1452 [docs] fix typo (#36080)
typo fix
2025-02-07 12:42:09 -08:00
c361b1e3d9 [docs] fix model checkpoint name (#36075)
update model name
2025-02-07 12:41:52 -08:00
ba29a439ad Fix OS err (#36094)
* Try via local_main_process first

* try 2
2025-02-07 09:57:43 -05:00
a18b7fdd9e Move audio top_k tests to the right file and add slow decorator (#36072)
* Move audio top_k tests to the right file and add slow decorator because we load a real model

* empty commit to trigger tests
2025-02-07 14:32:30 +00:00
014047e1c8 Fix bug in apply_rotary_pos_emb_flashatt: in Qwen2-5-VL (#36065) 2025-02-07 10:43:45 +01:00
006d9249ec Adding RT-DETRv2 for object detection (#34773)
* cookiecutter add rtdetrv2

* make modular working

* working modelgit add .

* working modelgit add .

* finalize moduar inheritence

* finalize moduar inheritence

* Update src/transformers/models/rtdetrv2/modular_rtdetrv2.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* update modular and add rename

* remove output ckpt

* define loss_kwargs

* fix CamelCase naming

* fix naming + files

* fix modular and convert file

* additional changes

* fix modular

* fix import error (switch to lazy)

* fix autobackbone

* make style

* add

* update testing

* fix loss

* remove old folder

* fix testing for v2

* update docstring

* fix docstring

* add resnetv2 (with modular bug to fix)

* remove resnetv2 backbone

* fix changes

* small fixes

* remove rtdetrv2resnetconfig

* add rtdetrv2 name to convert

* make style

* Update docs/source/en/model_doc/rt_detr_v2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix modular typo after review

* add reviewed changes

* add final review changes

* Update docs/source/en/model_doc/rt_detr_v2.md

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/rt_detr_v2/__init__.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/rt_detr_v2/convert_rt_detr_v2_weights_to_hf.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* add review changes

* remove rtdetrv2 resnet

* removing this weird project change

* change ckpt name from jadechoghari to author

* implement review and update testing

* update naming and remove wrong ckpt

* name

* make fix-copies

* Fix RT-DETR loss

* Add resources, fix name

* Fix repo in docs

* Fix table name

---------

Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: qubvel <qubvel@gmail.com>
2025-02-06 19:28:45 +00:00
6246c03260 [docs] fix outdated example code in trainer.md (#36066)
fix bugs
2025-02-06 10:54:22 -08:00
4563ba2c6f Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797)
* Fix StopStringCriteria to handle tokens above len(tokenizer)

This fixes #35244 by clipping token IDs to be within the tokenizer's vocabulary size before performing the embedding lookup. This prevents index errors when model.config.vocab_size > len(tokenizer).

The fix:
1. Adds a clamp operation to ensure token IDs are within bounds
2. Adds a test case to verify the behavior

* Use self.stop_strings instead of stop_strings

* Handle clipping correctly

* make fixup

* Update test to the new embedding vecs

* Use much bigger values in the mismatch test

* Typo fix

* Slight simplification

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-02-06 16:53:28 +00:00
28f73bc307 Fix model kwargs (#35875)
* Save state

* Make a failing test

* Better test

* mpt -> done, many more to go

* Rm extranious

* Bamba

* Bert

* big_bird

* biogpt

* bloom

* codegen

* ctrl

* data2vec

* dbrx

* Through up to Dbrx

* electra

* ernie

* falcon

* Fuyu/persimmon

* Include noop kwargs to base models

* Rebase

* Skip musigen

* Refactor/skip mllama

* Revert makefile

* Rm file

* Fix PT failing, need to modify rest of loss funcs to not resize

* Propagate some

* Continue

* More

* More options

* Mostly fixed

* Proved that it's the same

* Bloom is good

* Make ability to override loss func possible

* Fixup

* Clean

* Fix xglm

* Quality tests

* Skip OCR2

* Make specific loss for xglm

* Make order the same/line up 1:1

* xglm

* Skip fx output loss bloom model

* Didn't pass in pad_token_id

* Fix quality
2025-02-06 11:35:25 -05:00
1590c66430 Fix words typos in ggml test. (#36060)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-02-06 15:32:40 +00:00
1ce0e2992e Nail in edge case of torch dtype being overriden permantly in the case of an error (#35845)
* Nail in edge case of torch dtype

* Rm unused func

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Refactor tests to only mock what we need, don't introduce injection functions

* SetUp/TearDown

* Do super

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-06 09:05:23 -05:00
e3458af726 Save checkpoint to temporary directory to handle partial saves during failures (#35580)
Save checkpoint to temporary folder first

Since partial/missing files due to failures throw error during load
2025-02-06 08:48:05 -05:00
3dd1de39bb Paligemma: fix generation with Gemma2 (#36044)
* fix paligemma

* nit

* use `kwargs` in models that can load any LM
2025-02-06 14:31:32 +01:00
dce9970884 Update test_flash_attn_2_can_dispatch_composite_models (#36050)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-06 12:09:49 +01:00
37faa97d9b Fix repo consistency (#36063)
* fix 1

* fix 2

* fix modular

* simplify at the same time

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-02-06 11:53:15 +01:00
ed98ad35e6 Fix usage of unpad_input function (#35925)
Fix usage of unpad_function

See https://github.com/huggingface/transformers/issues/35899

In the [commit](cdbbe844b1) return type of `unpad_input` was changed.
Now the code support older and newer versions

Co-authored-by: Pavel Gein <pavel.gein@gmail.com>
2025-02-06 11:33:42 +01:00
7aee036e54 Iterative generation using Input embeds and past_key_values (#35890)
* Iterative generation using input embeds

* ruff fix

* Added Testcase

* Updated comment

* ♻️ Refactored testcase

* Skip test for these models

* Continue generation using input embeds and cache

* Skip generate_continue_from_embeds test

* Refactor `prepare_input_for_generation` func

* Continue generation using input embeds and cache

* Modular changes fix

* Overwrite 'prepare_inputs_for_generation' function
2025-02-06 11:06:05 +01:00
b5f327f350 Add Qwen2VLImageProcessorFast into Qwen2VLProcessor (#35987)
* Add `Qwen2VLImageProcessorFast` into `Qwen2VLProcessor`

* Use `AutoImageProcessor` instead

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-02-06 10:03:09 +01:00
0de15c988b Fix Audio Classification Pipeline top_k Documentation Mismatch and Bug #35736 (#35771)
* added condition for top_k Doc mismatch fix

* initilation of test file for top_k changes

* added test for returning all labels

* added test for few labels

* tests/test_audio_classification_top_k.py

* final fix

* ruff fix

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-05 16:25:08 +00:00
694aaa7fbc Fix how we compute the final non-padding token for ForSequenceClassification models (#35911)
* Fix how we compute the final non-padding token for Gemma (and probably other models)

* .size() -> .shape[]

* Propagating changes to other models

* Propagating changes to other models

* Change it for all ForSequenceClassification models

* Fix batch dim

* More TF fixes

* Copy the TF fix around as well

* Correct layer name for TFCTRL

* Cleaner .to()

* Clean up the nested if-else

* Use argmax() instead of .max().values
2025-02-05 16:23:33 +00:00
531d1511f5 [docs] no hard-coding cuda (#36043)
make device-agnostic
2025-02-05 08:22:33 -08:00
7399f8021e [docs] fix bugs in the bitsandbytes documentation (#35868)
* fix doc

* update model
2025-02-05 08:21:20 -08:00
0a1a8e3c7e [docs] no hard coding cuda as bnb has multi-backend support (#35867)
* change cuda to DEVICE

* Update docs/source/en/llm_tutorial.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-05 08:20:02 -08:00
9dc1efa5d4 DeepSpeed github repo move sync (#36021)
deepspeed github repo move
2025-02-05 08:19:31 -08:00
c772bff31a add support for empty list as input to create_model_card (#36042)
handle cases where it is list
2025-02-05 13:29:17 +01:00
315a9f494e Add XPU type for work-around -inf mask causing sdpa NaN issue in modeling files (#35647)
* add xpu for unmask

* change modular for generated matching

* add lastest modeling for helium
2025-02-05 13:28:31 +01:00
d8080d55c7 Fix synced multi-GPU generation with LLMs and VLMs (#35893)
* Fix synced multi-GPU generation

* fix copies

---------

Co-authored-by: Davit Manukyan <ManukyanD>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-02-05 11:15:11 +01:00
4831a94ee7 Fix Gemma2 synced multi-GPU generation (#35232)
* Fix Gemma2 synced multi-GPU generation

* Fix import ordering in modular_gemma2.py
2025-02-05 10:07:50 +01:00
fa56dcc2ab Refactoring of ImageProcessorFast (#35069)
* add init and base image processing functions

* add add_fast_image_processor to transformers-cli

* add working fast image processor clip

* add fast image processor to doc, working tests

* remove "to be implemented" SigLip

* fix unprotected import

* fix unprotected vision import

* update ViTImageProcessorFast

* increase threshold slow fast ewuivalence

* add fast img blip

* add fast class in tests with cli

* improve cli

* add fast image processor convnext

* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision

* add device kwarg to ImagesKwargs for fast processing on cuda

* cleanup

* fix unprotected import

* group images by sizes and add batch processing

* Add batch equivalence tests, skip when center_crop is used

* cleanup

* update init and cli

* fix-copies

* refactor convnext, cleanup base

* fix

* remove patching mixins, add piped torchvision transforms for ViT

* fix unbatched processing

* fix f strings

* protect imports

* change llava onevision to class transforms (test)

* fix convnext

* improve formatting (following Pavel review)

* fix handling device arg

* improve cli

* fix

* fix inits

* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs

* uniformize qwen2_vl fast

* fix docstrings

* add add fast image processor llava

* remove min_pixels max_pixels from accepted size

* nit

* nit

* refactor fast image processors docstrings

* cleanup and remove fast class transforms

* update add fast image processor transformers cli

* cleanup docstring

* uniformize pixtral fast and  make _process_image explicit

* fix prepare image structure llava next/onevision

* Use typed kwargs instead of explicit args

* nit fix import Unpack

* clearly separate pops and gets in base preprocess. Use explicit typed kwargs

* make qwen2_vl preprocess arguments hashable
2025-02-04 17:52:31 -05:00
8d73a38606 Add DAB-DETR for object detection (#30803)
* initial commit

* encoder+decoder layer changes WIP

* architecture checks

* working version of detection + segmentation

* fix modeling outputs

* fix return dict + output att/hs

* found the position embedding masking bug

* pre-training version

* added iamge processors

* typo in init.py

* iterupdate set to false

* fixed num_labels in class_output linear layer bias init

* multihead attention shape fixes

* test improvements

* test update

* dab-detr model_doc update

* dab-detr model_doc update2

* test fix:test_retain_grad_hidden_states_attentions

* config file clean and renaming variables

* config file clean and renaming variables fix

* updated convert_to_hf file

* small fixes

* style and qulity checks

* return_dict fix

* Merge branch main into add_dab_detr

* small comment fix

* skip test_inputs_embeds test

* image processor updates + image processor test updates

* check copies test fix update

* updates for check_copies.py test

* updates for check_copies.py test2

* tied weights fix

* fixed image processing tests and fixed shared weights issues

* added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py

* delete prints from test file

* SafeTensor modification to solve HF Trainer issue

* removing the safetensor modifications

* make fix copies and hf uplaod has been added.

* fixed index.md

* fixed repo consistency

* styel fix and dabdetrimageprocessor docstring update

* requested modifications after the first review

* Update src/transformers/models/dab_detr/image_processing_dab_detr.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* repo consistency has been fixed

* update copied NestedTensor function after main merge

* Update src/transformers/models/dab_detr/modeling_dab_detr.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* temp commit

* temp commit2

* temp commit 3

* unit tests are fixed

* fixed repo consistency

* updated expected_boxes varible values based on related notebook results in DABDETRIntegrationTests file.

* temporarialy config modifications and repo consistency fixes

* Put dilation parameter back to config

* pattern embeddings have been added to the rename_keys method

* add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES

* delete FeatureExtractor part from docs.md

* requested modifications in modeling_dab_detr.py

* [run_slow] dab_detr

* deleted last segmentation code part, updated conversion script and changed the hf path in test files

* temp commit of requested modifications

* temp commit of requested modifications 2

* updated config file, resolved codepaths and refactored conversion script

* updated decodelayer block types and refactored conversion script

* style and quality update

* small modifications based on the request

* attentions are refactored

* removed loss functions from modeling file, added loss function to lossutils, tried to move the MLP layer generation to config but it failed

* deleted imageprocessor

* fixed conversion script + quality and style

* fixed config_att

* [run_slow] dab_detr

* changing model path in conversion file and in test file

* fix Decoder variable naming

* testing the old loss function

* switched back to the new loss function and testing with the odl attention functions

* switched back to the new last good result modeling file

* moved back to the version when I asked the review

* missing new line at the end of the file

* old version test

* turn back to newest mdoel versino but change image processor

* style fix

* style fix after merge main

* [run_slow] dab_detr

* [run_slow] dab_detr

* added device and type for head bias data part

* [run_slow] dab_detr

* fixed model head bias data fill

* changed test_inference_object_detection_head assertTrues to torch test assert_close

* fixes part 1

* quality update

* self.bbox_embed in decoder has been restored

* changed Assert true torch closeall methods to torch testing assertclose

* modelcard markdown file has been updated

* deleted intemediate list from decoder module

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-04 17:28:27 +00:00
fe52679e74 Update tests regarding attention types after #35235 (#36024)
* update

* update

* update

* dev-ci

* more changes

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 18:04:47 +01:00
014a1fa2c8 CircleCI with python 3.9 (#36027)
update docker files

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 17:40:20 +01:00
c98b467905 feat(ci): ignore trufflehog unverified results (#36031) 2025-02-04 16:39:36 +01:00
9855acb9c5 Hotfix for self-comment-ci.yml (#36030)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 16:28:05 +01:00
9f486badd5 Display warning for unknown quants config instead of an error (#35963)
* add supports_quant_method check

* fix

* add test and fix suggestions

* change logic slightly

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-04 15:17:01 +01:00
f19bfa50e7 Commont bot CI for other jobs (generation / quantization) (#35341)
* quantization CI on PRs

* fix

* fix

* add 2 members

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 14:42:51 +01:00
a93b80588b Fix RMSNormGated in Zamba2 (#35943)
* First commit

* Finish model implementation

* First commit

* Finish model implementation

* Register zamba2

* generated modeling and configuration

* generated modeling and configuration

* added hybrid cache

* fix attention_mask in mamba

* dropped unused loras

* fix flash2

* config docstrings

* fix config and fwd pass

* make fixup fixes

* text_modeling_zamba2

* small fixes

* make fixup fixes

* Fix modular model converter

* added inheritances in modular, renamed zamba cache

* modular rebase

* new modular conversion

* fix generated modeling file

* fixed import for Zamba2RMSNormGated

* modular file cleanup

* make fixup and model tests

* dropped inheritance for Zamba2PreTrainedModel

* make fixup and unit tests

* Add inheritance of rope from GemmaRotaryEmbedding

* moved rope to model init

* drop del self.self_attn and del self.feed_forward

* fix tests

* renamed lora -> adapter

* rewrote adapter implementation

* fixed tests

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Dropped adapter in-place sum

* removed rope from attention init

* updated rope

* created get_layers method

* make fixup fix

* make fixup fixes

* make fixup fixes

* update to new attention standard

* update to new attention standard

* make fixup fixes

* minor fixes

* cache_position

* removed cache_position postion_ids use_cache

* remove config from modular

* removed config from modular (2)

* import apply_rotary_pos_emb from llama

* fixed rope_kwargs

* Instantiate cache in Zamba2Model

* fix cache

* fix @slow decorator

* small fix in modular file

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* several minor fixes

* inherit mamba2decoder fwd and drop position_ids in mamba

* removed docstrings from modular

* reinstate zamba2 attention decoder fwd

* use regex for tied keys

* Revert "use regex for tied keys"

This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.

* use regex for tied keys

* add cpu to slow forward tests

* dropped config.use_shared_mlp_adapter

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* re-convert from modular

* extended Zamba2RMSNormGated to n_groups>1

* removed einops import

* set _supports_sdpa = True

* add use_mem_eff_path flag for fused mamba2 fwd

* added docstring for use_mem_eff_ath flag

---------

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-04 14:28:04 +01:00
bc9a6d8302 Fix device mismatch error in Whisper model during feature extraction (#35866)
* Fix device mismatch error in whisper feature extraction

* Set default device

* Address code review feedback

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-02-04 12:23:08 +01:00
9afb904b15 Refactor (and fix) gpt_neox (#35610)
* start a nice modular

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* update

* Update modular_gpt_neox.py

* convert

* fix attribute

* fix attrs

* oups

* fix

* fix

* fix

* fix

* fix

* fix order to pass test (see with accelerate team)

* trigger CIs

* modular

* update

* up

* Update test_modeling_gpt_neox.py

* Update test_modeling_gpt_neox.py

* trigger CIs

* correctly pass arg

* simplify

* remove key warning

* update tp -> it's compatible since the view is before

* trigger CIs
2025-02-04 11:18:43 +01:00
ad30598923 Update Mistral converter (#35967)
* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

* update

* style

* move it to integrations

* style

* trigger CIs

* trigger CIs
2025-02-04 11:13:12 +01:00
b1954fd64a layernorm_decay_fix (#35927)
* layernorm_decay_fix

* W293 fix

* ruff format fix

* black format

* ruff format

* erase last layer

* add test_get_parameter_names_rmsnorm

* rmsnorm fix
2025-02-04 11:01:49 +01:00
2ba040a71f apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582)
* apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag

* test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask

* test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token

* test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right

---------

Co-authored-by: Eduard Allakhverdov <goncharova@airi.net>
Co-authored-by: d.tarasov <d.tarasov@airi.net>
2025-02-04 10:27:52 +01:00
9c02cb6233 Fix custom kernel for DeformableDetr, RT-Detr, GroindingDINO, OmDet-Turbo in Pytorch 2.6.0 (#35979)
Updates type().is_cuda() -> .is_cuda(); .data<> -> .data_ptr<>
2025-02-04 09:07:25 +00:00
5d75a25b03 Qwen2-VL: fix rope delta calculation (#36013)
* fix rope delats calculation

* add test

* style
2025-02-04 09:48:29 +01:00
e284c7e954 Update Granite Vision Model Path / Tests (#35998)
* Update granite vision model path

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Enable granite vision test

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-02-03 20:06:03 +01:00
Gar
9d2056f12b Add mean_resizing for every VLMs' resizing_token_embeddings() (#35717)
* refine all resize_token_embedding()

* ruff format

* hotfix
2025-02-03 15:03:49 +01:00
7eecdf2a86 Update-tp test (#35844)
* update test for now

* up

* cleanup

* update todo
2025-02-03 09:37:02 +01:00
62db3e6ed6 use torch 2.6 for daily CI (#35985)
use torch 2.6 for CI

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-31 18:58:23 +01:00
2b46943195 Add GOT-OCR 2.0 to Transformers (#34721)
* init modular got_ocr2

* Get correct got_ocr architecture

* add processing

* run modular with processing

* add working inference

* apply modular

* Refactor and fix style

* Refactor, cleanup, fix style

* fix init order

* Fix docs

* add base modeling tests

* fix style and consistency

* rename doc file

* fix repo consistency

* fix inference with box

* add image processing and support for crop_to_multi_page

* Fix batch inference

* add tests

* fixup

* fix slow test

* fix docstrings

* Add model doc

* update to new init

* fix input autocast pixel_values dtype

* update doc

* move doc to multimodal

* Reformat crop_image_to_patches and add docstrings

* Fix example in forward docstring

* Address Pablo review

* [run slow] got_ocr2

* remove defaults defined twice

* apply modular

* add torch_device to integration tests

* update modular

* follow-up Pavel review

* add device variable in doc

* fix doc multi-page

* Force eager attention for vision encoder to avoid attn implementation conflict

* revert qwen2vl doc changes

* use Qwen2ForCausalLM instead of Qwen2Model

* make fixup

* refactor gotocr2 to llava style

* uniformize function names and reduce checks

* final nits

* fix pixel_values dtype error

* change checkpoint names

* fix modular
2025-01-31 11:28:13 -05:00
5bbee12ac9 [Moshi] disable automatic compilation if the model can't compile (#35992)
moshi cant compile
2025-01-31 15:53:06 +00:00
e6f4a4ebbf [Moonshine] compute head_dim_padding at init (#35984)
compute head_dim_padding at init
2025-01-31 14:26:52 +01:00
d7188ba600 Add support for nested images to LLava and VipLLava (#35558)
* move make_flat_list_of_images and make_batched_videos to image_utils

* remove unnecessary is_vision_available

* move make_nested_list_of_images to image_utils

* fix fast pixtral image processor

* fix import mllama

* fix make_nested_list_of_images

* add tests

* convert 4d arrays/tensors to list

* add test_make_batched_videos

* add support nested batch of videos

* fix image processing qwen2vl
2025-01-30 16:49:20 -05:00
e4227eb4d4 Handle empty change indices in SAM's mask to rle conversion (#35665)
* Handle empty change indices in RLE conversion for masks

* [test] Add unit tests for RLE encoding of masks in SamProcessor

* [test] Update RLE conversion tests to use TensorFlow implementation

* [test] Fix formatting in SamProcessorTest according to check_code_quality action

* [test] Fix formatting in SamProcessorTest according to check_code_quality

* [test] Refactored rle test cases into one test and used tf tensors in tf test cases

* [test] Fix: removed self parameter from refactored methods

* [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow

* [test] Added description to individual to run-length encoding tests for PyTorch and TensorFlow.
2025-01-30 19:08:38 +00:00
47bd4296d6 not to use A100 for benchmark.yml (#35974)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 18:55:36 +01:00
693328f2bc Support batching for UsefulSensors Moonshine (#35922)
* Add support for attention masking in moonshine.

Tested against Open ASR Leaderboard with batch size 256.

* Update comments and ensure attention masks are passed everywhere.

Perform attention mask downsampling inside of moonshine forward call.

* Hide padding behind conditional. Fix encoder/decoder masking.

- Correctly pipe encoder attention mask into decoder
- Add correct scaling factor if one is not already provided.
- Fix formatting with ruff

* Add auto generated modeling_moonshine file.

* Update formatting in generated model file.

* Address review comments.

* Fix typo.

* Add `pad_head_dim_to_multiple_of` to moonshine config.

* Correct args order for MooonshineConfig.

* Update configuration moonshine too.

* Update src/transformers/models/moonshine/modular_moonshine.py

* Update src/transformers/models/moonshine/configuration_moonshine.py

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-30 17:08:07 +01:00
5757681837 Less flaky for TimmBackboneModelTest::test_batching_equivalence (#35971)
* fix

* remove is_flaky

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 16:56:26 +01:00
e320d5542e Revert p_mask to a list in DQA pipeline (#35964)
* p_mask back to being a list

* Remove breakpoint
2025-01-30 15:37:59 +00:00
365fecb4d0 Whisper: fix static cache CI (#35852)
* fix

* remove overriden method

* small change
2025-01-30 12:43:00 +01:00
9725e5be2f Pixtral: vectorize patch embeddings and enable tests (#35122)
* initial POC

* - batch mix feature

* fix tests

* fix tests

* make style

* do not skip and instead fix tests

* update

* return back the test

* correct text with the correct ckpt
2025-01-30 12:40:18 +01:00
8bc4c89ee9 [bart] minor test fixes (#35965)
fix tests
2025-01-30 10:00:11 +00:00
19f2ec80cf Fix is_causal being a tensor (#35791)
* fix is_causal being a tensor

* convert in sdpa attention only when  jit tracing
2025-01-30 09:22:33 +01:00
7547f55e5d fix iterator overflow when gradient accumulation is 1 (#35960) 2025-01-29 14:45:09 -05:00
4d3b1076a1 [generate] move max time tests (#35962)
* move max time tests to their right place

* move test to the right place
2025-01-29 17:56:46 +00:00
4d1d489617 Update README.md (#35958)
There should be a dot after pip install .
2025-01-29 15:46:26 +00:00
f0ae65c198 [tests] further fix Tester object has no attribute '_testMethodName' (#35781)
* bug fix

* update with more cases

* more entries

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:05:33 +01:00
ec7790f0d3 update docker file transformers-pytorch-deepspeed-latest-gpu (#35940)
update docker file for deepspeed

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:01:27 +01:00
5d257111c1 Trainer Refactor: Part 1 (#35567)
* start

* So far: 30%

* Small fix

* Continuing update

* Continuing

* Forgot to check if not None

* Continuing refactor

* Fix if else

* Fix ref

* Should make tests pass

* Keep grad norm same

* Document

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Err instead of info for logging RNG state error

* Seperate out to func

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-01-29 09:50:54 -05:00
23d782ead2 Output dicts support in text generation pipeline (#35092)
* Support for generate_argument: return_dict_in_generate=True, instead of returning a error

* fix: call test with return_dict_in_generate=True

* fix: Only import torch if it is present

* update: Encapsulate output_dict changes

* fix: added back original comments

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-01-29 14:44:46 +00:00
cf90404807 Fix flaky test_assisted_decoding_matches_greedy_search (#35951)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:50:07 +01:00
692afa102d Update squad_convert_example_to_features to work with numpy v2 (#35955)
* Fix

* Fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:33:06 +01:00
c600e89f5c Update unwrap_and_save_reload_schedule to use weights_only=False (#35952)
* fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:30:57 +01:00
42c8ccfd4c fix test_generated_length_assisted_generation (#34935)
fix test_generated_length_assisted_generation
2025-01-29 12:03:45 +00:00
ec7afad609 use torch constraints to check if covariance is positive definite during mean resizing. (#35693)
* use torch constraints to check for psd

* small nit

* Small change

* Small change for the ci

* nit
2025-01-28 17:33:42 +01:00
61cbb723fc Remove INC notebook reference in documentation (#35936)
remove INC notebook in documentation
2025-01-28 17:10:02 +01:00
478c4f2d0d fix(FA): QKV not being casted to target_dtype for FA with dpo lora (#35834)
fix(FA): QKV not being casted to target_dtype due to dtype check
2025-01-28 17:06:56 +01:00
ece8c42488 Test: generate with torch.compile(model.forward) as a fast test (#34544) 2025-01-28 14:10:38 +00:00
f48ecd7608 Fix TP initialization (#35860)
* fix tp

* Update modeling_utils.py

* style

* style

* Update test_tp.py

* Update test_tp.py

* style

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py
2025-01-28 15:07:37 +01:00
f85ba20449 Qwen-2-5-VL: fix CI (#35935)
fix
2025-01-28 14:51:57 +01:00
3f860dba55 Fix mask slicing for models with HybridCache (#35681)
* correctly slice

* check mask

* Update modular_gemma2.py

* fix

* add tests

* fix typo

* finally fix mask slicing

* Finally correctly slice in all cases!!

* add test for all attention functions

* small fix in tests

* trick around dynamo tracing issue

* last update

* more robust

* kwargs propagation

* make it explicit for checkpointing

* apply modular
2025-01-28 14:35:00 +01:00
b764c20b09 Fix: loading DBRX back from saved path (#35728)
* fix dtype as dict for some models + add test

* add comment in tests
2025-01-28 11:38:45 +01:00
3613f568cd Add default TP plan for all models with backend support (#35870)
* Add some tp plans!

* More tp plans!

* Add it in the comment

* style

* Update configuration_mixtral.py

* Update configuration_phi.py

* update the layout according to special archs

* fix mixtral

* style

* trigger CIs

* trigger CIs

* CIs

* olmo2

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-28 11:20:58 +01:00
96625d85fd Use rocm6.2 for AMD images (#35930)
* Use rocm6.2 as rocm6.3 only has nightly pytorch wheels atm

* Use stable wheel index for torch libs
2025-01-28 11:10:28 +01:00
bf16a182ba Remove _supports_static_cache = True for some model classes (#34975)
* use mask_fill

* remove comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-28 10:42:10 +01:00
86d7564611 [docs] Fix Zamba2 (#35916)
fix code block
2025-01-27 11:44:10 -08:00
414658f94f Close Zamba2Config code block (#35914)
* close zamba2 code block

* Add Zamba2 to toctree
2025-01-27 19:09:42 +00:00
63e9c941eb Fix the config class comparison for remote code models (#35592)
* Fix the config class comparison when repeatedly saving and loading remote code models

* once again you have committed your debug breakpoint
2025-01-27 18:37:30 +00:00
c550a1c640 [docs] uv install (#35821)
uv install
2025-01-27 08:49:28 -08:00
cd6591bfb2 Fix typing in audio_utils.chroma_filter_bank (#35888)
* Fix typing in audio_utils.chroma_filter_bank

* Apply make style

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-27 16:06:03 +00:00
e57b459997 Split and clean up GGUF quantization tests (#35502)
* clean up ggml test

Signed-off-by: Isotr0py <2037008807@qq.com>

* port remaining tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* further cleanup

Signed-off-by: Isotr0py <2037008807@qq.com>

* format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix broken tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* update comment

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* reorganize tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* k-quants use qwen2.5-0.5B

Signed-off-by: Isotr0py <2037008807@qq.com>

* move ggml tokenization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove dead code

Signed-off-by: Isotr0py <2037008807@qq.com>

* add assert for serilization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* use str for parameterize

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-27 15:46:57 +01:00
5c576f5a66 🚨🚨🚨 image-classification pipeline single-label and multi-label prob type squashing fns (sigmoid vs softmax) are backwards (#35848)
single-label and multi-label prob type squashing fns (sigmoid vs softmax) were backwards for image-classification pipeline
2025-01-27 15:34:57 +01:00
5450e7c84a 🔴 🔴 🔴 Added segmentation maps support for DPT image processor (#34345)
* Added `segmentation_maps` support for DPT image processor

* Added tests for dpt image processor

* Moved preprocessing into separate functions

* Added # Copied from statements

* Fixed # Copied from statements

* Added `segmentation_maps` support for DPT image processor

* Added tests for dpt image processor

* Moved preprocessing into separate functions

* Added # Copied from statements

* Fixed # Copied from statements
2025-01-27 15:14:00 +01:00
a50befa9b9 Update deepspeed amd image (#35906) 2025-01-27 14:32:36 +01:00
33cb1f7b61 Add Zamba2 (#34517)
* First commit

* Finish model implementation

* First commit

* Finish model implementation

* Register zamba2

* generated modeling and configuration

* generated modeling and configuration

* added hybrid cache

* fix attention_mask in mamba

* dropped unused loras

* fix flash2

* config docstrings

* fix config and fwd pass

* make fixup fixes

* text_modeling_zamba2

* small fixes

* make fixup fixes

* Fix modular model converter

* added inheritances in modular, renamed zamba cache

* modular rebase

* new modular conversion

* fix generated modeling file

* fixed import for Zamba2RMSNormGated

* modular file cleanup

* make fixup and model tests

* dropped inheritance for Zamba2PreTrainedModel

* make fixup and unit tests

* Add inheritance of rope from GemmaRotaryEmbedding

* moved rope to model init

* drop del self.self_attn and del self.feed_forward

* fix tests

* renamed lora -> adapter

* rewrote adapter implementation

* fixed tests

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Dropped adapter in-place sum

* removed rope from attention init

* updated rope

* created get_layers method

* make fixup fix

* make fixup fixes

* make fixup fixes

* update to new attention standard

* update to new attention standard

* make fixup fixes

* minor fixes

* cache_position

* removed cache_position postion_ids use_cache

* remove config from modular

* removed config from modular (2)

* import apply_rotary_pos_emb from llama

* fixed rope_kwargs

* Instantiate cache in Zamba2Model

* fix cache

* fix @slow decorator

* small fix in modular file

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* several minor fixes

* inherit mamba2decoder fwd and drop position_ids in mamba

* removed docstrings from modular

* reinstate zamba2 attention decoder fwd

* use regex for tied keys

* Revert "use regex for tied keys"

This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.

* use regex for tied keys

* add cpu to slow forward tests

* dropped config.use_shared_mlp_adapter

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* re-convert from modular

---------

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-27 10:51:23 +01:00
14a9bb520e Fix fast image processor warnings in object detection examples (#35892)
Have the DETR examples default to using the fast image  processor
2025-01-27 08:32:44 +00:00
f11f57c925 [doctest] Fixes (#35863)
doctest fixes
2025-01-26 15:26:38 -08:00
fc269f77da Add Rocketknight1 to self-comment-ci.yml (#35881)
my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-24 19:07:07 +00:00
bcb841f007 add xpu device check in device_placement (#35865)
add xpu device
2025-01-24 19:13:07 +01:00
b912f5ee43 use torch.testing.assertclose instead to get more details about error in cis (#35659)
* use torch.testing.assertclose instead to get more details about error in cis

* fix

* style

* test_all

* revert for I bert

* fixes and updates

* more image processing fixes

* more image processors

* fix mamba and co

* style

* less strick

* ok I won't be strict

* skip and be done

* up
2025-01-24 16:55:28 +01:00
72d1a4cd53 Fix Llava-NeXT / Llava-NeXT Video / Llava-OneVision's token unpadding mismatch (#35779)
* Fix Llava OneVision's token padding

* Fix Llava next and Llava next video's token unpadding for consistency
2025-01-24 09:10:27 +01:00
b5aaf87509 Fix test_pipelines_video_classification that was always failing (#35842)
* Fix test_pipelines_video_classification that was always failing

* Update video pipeline docstring to reflect actual return type

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-23 19:22:32 +01:00
328e2ae4c0 fix apply_chat_template() padding choice (#35828)
fix apply_chat_template() padding choice to bool, str, PaddingStrategy and the docstring of pad()
2025-01-23 17:32:32 +00:00
d2a424b550 Fix typo (#35854) 2025-01-23 17:32:18 +00:00
045c02f209 [DOC] Fix contamination and missing paragraph in translation (#35851)
Fix contamination and missing paragraph in translation
2025-01-23 08:33:44 -08:00
71cc8161b2 Granite Vision Support (#35579)
* Add multimodal granite support

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

Support multiple image feature layres

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Remove failing validation for visual encoders with no cls

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Update llava based models / configs to support list of feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Add tests for multiple feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Use conditional instead of except for misaligned feature shapes

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* crop cls from each hidden state

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Fix formatting

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Support single vision feature int in vipllava

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Fix typo in vision feature selection strategy validation

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add tentative integration test for granite vision models

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add granite vision docs

Replace multimodal granite refs with granite vision

Add granite vision / llava next alias

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Use image url in granitevision example

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-01-23 17:15:52 +01:00
8f1509a96c Fix more CI tests (#35661)
add tooslow for the fat ones
2025-01-23 14:45:42 +01:00
0a950e0bbe Fix uploading processors/tokenizers to WandB on train end (#35701)
* rename tokenizer to processing_class in WandbCallback.on_train_end

* rename tokenizer to processing_class in ClearMLCallback and DVCLiveCallback
2025-01-23 13:32:15 +01:00
4ec425ffad Fix GA loss for Deepspeed (#35808)
* Fix GA loss for Deepspeed

* Turn off loss scaling in DeepSpeed engine by scale_wrt_gas

* Add comment linking to PR
2025-01-23 11:45:02 +01:00
f3f6c86582 add qwen2.5vl (#35569)
* add qwen2.5vl

* fix

* pass check table

* add modular file

* fix style

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* padd copy check

* use modular

* fix

* fix

* fix

* update flashatt2&sdpa support_list

* Update docs/source/en/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update config

* update

* fix hf path

* rename Qwen2_5_VLVideosKwargs

* fix

* fix

* update

* excuted modular

* rollback init

* fix

* formated

* simpler init

* fix

* fix

* fix

* fix

* fix

* update docs

* fix

* fix

* update Qwen2VLRotaryEmbedding for yarn

* fix

---------

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: gewenbin0992 <gewenbin292@163.com>
Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>
2025-01-23 11:23:00 +01:00
d3af76df58 [Backend support] Allow num_logits_to_keep as Tensor + add flag (#35757)
* support

* Update modeling_utils.py

* style

* most models

* Other models

* fix-copies

* tests + generation utils
2025-01-23 09:47:54 +01:00
8736e91ad6 [ tests] remove some flash attention class tests (#35817)
remove class from tests
2025-01-23 09:44:21 +01:00
2c3a44f9a7 Fix NoneType type as it requires py>=3.10 (#35843)
fix type
2025-01-22 15:56:53 +00:00
fdcc62c855 Add PyTorch version check for FA backend on AMD GPUs (#35813)
Disable FA backend for SDPA on AMD GPUs (PyTorch < 2.4.1)
2025-01-22 16:09:23 +01:00
3b9770581e Fix compatibility issues when using auto_gptq with these older versions (#35830)
convert_model method of optimum only accepts a single nn.Module type model parameter for versions less than 1.23.99.
2025-01-22 15:46:47 +01:00
62bd83947a [chat] docs fix (#35840)
docs fix
2025-01-22 14:32:27 +00:00
487e2f63bd Fix head_dim in config extracted from Gemma2 GGUF model (#35818)
fix gemma2 head dim

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-01-22 15:22:04 +01:00
b3d6722469 [Chat] Add Chat from TRL 🐈 (#35714)
* tmp commit

* add working chat

* add docts

* docs 2

* use auto dtype by default
2025-01-22 13:30:12 +00:00
a7738f5a89 Fix : Nemotron tokenizer for GGUF format (#35836)
fix nemotron gguf
2025-01-22 12:28:40 +01:00
ec28957f94 [pipeline] missing import regarding assisted generation (#35752)
missing import
2025-01-22 10:34:28 +00:00
36c9181f5c [gpt2] fix generation tests (#35822)
fix gpt2 generation tests
2025-01-22 09:41:04 +00:00
f439e28d32 Hotfix: missing working-directory in self-comment-ci.yml (#35833)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-22 10:25:50 +01:00
373e50e970 Init cache on meta device (#35164)
* init cache on meta device

* offloaded static + enable tests

* tests weren't running before  :(

* update

* fix mamba

* fix copies

* update

* address comments and fix tests

* fix copies

* Update src/transformers/cache_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update

* mamba fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-22 09:49:17 +01:00
870e2c8ea0 Another security patch for self-comment-ci.yml (#35816)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-22 09:29:54 +01:00
f4f33a20a2 Remove pyav pin to allow python 3.11 to be used (#35823)
* Remove pyav pin to allow python 3.11 to be used

* Run make fixup

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-21 20:16:18 +00:00
90b46e983f Remove old benchmark code (#35730)
* remove traces of the old deprecated benchmarks

* also remove old tf benchmark example, which uses deleted code

* run doc builder
2025-01-21 17:56:43 +00:00
870eb7b41b [Mimi] update test expected values for t4 runners (#35696)
update values for t4
2025-01-21 18:23:36 +01:00
8ac851b0b3 Improve modular documentation (#35737)
* start a nice doc

* keep improving the doc

* Finalize doc

* Update modular_transformers.md

* apply suggestion
2025-01-21 17:53:30 +01:00
107f9f5127 add Qwen2-VL image processor fast (#35733)
* add qwen2_vl image processor fast

* add device to ImagesKwargs

* remove automatic fix copies

* fix fast_is_faster_than_slow

* remove unnecessary import
2025-01-21 11:49:05 -05:00
3df90103b8 move fastspeech to audio models (#35788) 2025-01-21 08:32:09 -08:00
741d55237a [i18n-ar] Translated file: docs/source/ar/tasks/masked_language_modeling.md into Arabic (#35198)
* إضافة الترجمة العربية: masked_language_modeling.md

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

* Add language_modeling.md

* Add Sequence_classifiation.md

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-21 08:29:58 -08:00
568941bf11 Optimized set_initialized_submodules. (#35493) 2025-01-21 17:01:28 +01:00
7051c5fcc8 Remove deprecated get_cached_models (#35809)
* Remove deprecated get_cached_models

* imports
2025-01-21 16:08:31 +01:00
97fbaf0861 Fixed typo in autoawq version number in an error message for IPEX backend requirements. (#35815)
Fixed typo in version number for IPEX backend required minimal autoawq version
2025-01-21 14:42:44 +00:00
dbd8474125 Fix : BLOOM tie_word_embeddings in GGUF (#35812)
* fix bloom ggml

* fix falcon output

* make style
2025-01-21 15:35:54 +01:00
678bd7f1ce Auto-add timm tag to timm-wrapper models. (#35794)
Works for fine-tuned or exported models:

```py
from transformers import AutoModelForImageClassification

checkpoint = "timm/vit_base_patch16_224.augreg2_in21k_ft_in1k"
model = AutoModelForImageClassification.from_pretrained(checkpoint)

model.push_to_hub("pcuenq/tw1")
```

The uploaded model will now show snippets for both the timm and the
transformers libraries.
2025-01-21 14:34:45 +01:00
dc10f7906a Support adamw_torch_8bit (#34993)
* var

* more

* test
2025-01-21 14:17:49 +01:00
f82b19cb6f add a new flax example for Bert model inference (#34794)
* add a new example for flax inference cases

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix for "make fixup"

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-21 14:09:29 +01:00
edbabf6b82 [Doc] Adding blog post to model doc for TimmWrapper (#35744)
* adding blog post to model doc

* Update docs/source/en/model_doc/timm_wrapper.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* review suggestions

* review suggestions

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-21 12:32:39 +00:00
fd8d61fdb2 Byebye test_batching_equivalence's flakiness (#35729)
* fix

* fix

* skip

* better error message

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-21 13:11:33 +01:00
78f5ee0217 Add LlavaImageProcessor (#33191)
* First draft

* Add equivalence test

* Update docstrings

* Add tests

* Use numpy

* Fix tests

* Improve variable names

* Improve docstring

* Add link

* Remove script

* Add copied from

* Address comment

* Add note in docs

* Add docstring, data format

* Improve test

* Add test

* update

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* loop once only

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-21 12:47:04 +01:00
8e4cedd9ca Update AMD Docker image (#35804) 2025-01-21 12:11:23 +01:00
705aeaaa12 Fix "test_chat_template_dict" in video LLMs (#35660)
* fix  "test_chat_template_dict" in llava_onevision

* Update src/transformers/models/llava_next_video/processing_llava_next_video.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* get one video calles once

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-21 10:23:40 +01:00
e867b97443 Deterministic sorting in modular converter when adding new functions (#35795)
deterministic sort
2025-01-21 09:38:48 +01:00
920f34a772 modular_model_converter bugfix on assignments (#35642)
* added bugfix in modular converter to keep modular assignments for docstrings, expected outputs etc.

* revert stracoder2 docstring copying, add forward in EMU3 to enable docstring assingment, remove verbatim assignments in modular converter

* added _FOR_DOC in assignments to keep, corrected wrong checkpoint name in ijepa's configuration
2025-01-21 08:06:44 +01:00
234168c4dc Fixes, improvements to timm import behaviour (#35800)
* Fix timm dummy import logic

* Add requires to TimmWrapperConfig.from_dict so users see a helpful import error message if timm not installed
2025-01-20 13:17:01 -08:00
44393df089 Tool calling: support more types (#35776)
* Tool calling: support NoneType for function return type
2025-01-20 19:15:34 +01:00
f19135afc7 fix low-precision audio classification pipeline (#35435)
* fix low-precision audio classification pipeline

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torch import

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torch import

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:20:51 +00:00
641238eb76 Fix vits low-precision dtype (#35418)
* fix vits dtype

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* use weight dtype

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:19:31 +00:00
729b569531 fix document qa bf16 pipeline (#35456)
* fix document qa bf16 pipeline

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:18:07 +00:00
ec97417827 Don't import torch.distributed when it's not available (#35777)
This is a continuation of 217c47e31bc0cd442443e5b4a62c8bc2785d53ee but
for another module. This issue was spotted in nixpkgs (again) when
building lm-eval package that used a different path in transformers
library to reach the same failure.

Related: #35133
2025-01-20 17:10:35 +01:00
5f0f4b1b93 Patch moonshine (#35731)
* udpate expected logits for T4 runners

* update doc

* correct order of the args for better readability

* remove generate wrap

* convert modular
2025-01-20 16:19:29 +01:00
a142f16131 transformers.image_transforms.normalize wrong types (#35773)
transformers.image_transforms.normalize documents and checks for the wrong type for std and mean arguments

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-20 15:00:46 +00:00
3998fa8aab [fix] cannot import name 'Pop2PianoFeatureExtractor' from 'transformers' (#35604)
* update pop2piano __init__

* add lib check

* update fix

* revert
2025-01-20 15:21:45 +01:00
b80e334e71 Skip Falcon 7B GGML Test (#35783)
skip test
2025-01-20 15:00:34 +01:00
68947282fc remove code owners as it was generating too much noise BUT (#35784)
remove code owners
2025-01-20 14:18:03 +01:00
135e86aa54 Remove read_video and run 2025-01-20 13:40:57 +01:00
88b95e6179 [generate] update docstring of SequenceBiasLogitsProcessor (#35699)
* fix docstring

* space
2025-01-20 11:00:15 +00:00
56afd2f488 fix register_buffer in MimiEuclideanCodebook (#35759)
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-20 11:54:58 +01:00
abe57b6f17 Add SuperGlue model (#29886)
* Initial commit with template code generated by transformers-cli

* Multiple additions to SuperGlue implementation :

- Added the SuperGlueConfig
- Added the SuperGlueModel and its implementation
- Added basic weight conversion script
- Added new ImageMatchingOutput dataclass

* Few changes for SuperGlue

* Multiple changes :
- Added keypoint detection config to SuperGlueConfig
- Completed convert_superglue_to_pytorch and succesfully run inference

* Reverted unintentional change

* Multiple changes :
 - Added SuperGlue to a bunch of places
 - Divided SuperGlue into SuperGlueForImageMatching and SuperGlueModel
 - Added testing images

* Moved things in init files

* Added docs (to be finished depending on the final implementation)

* Added necessary imports and some doc

* Removed unnecessary import

* Fixed make fix-copies bug and ran it

* Deleted SuperGlueModel
Fixed convert script

* Added SuperGlueImageProcessor

* Changed SuperGlue to support batching pairs of images and modified ImageMatchingOutput in consequences

* Changed convert_superglue_to_hf.py script to experiment different ways of reading an image and seeing its impact on performances

* Added initial tests for SuperGlueImageProcessor

* Added AutoModelForImageMatching in missing places and tests

* Fixed keypoint_detector_output instructions

* Fix style

* Adapted to latest main changes

* Added integration test

* Fixed bugs to pass tests

* Added keypoints returned by keypoint detector in the output of SuperGlue

* Added doc to SuperGlue

* SuperGlue returning all attention and hidden states for a fixed number of keypoints

* Make style

* Changed SuperGlueImageProcessor tests

* Revert "SuperGlue returning all attention and hidden states for a fixed number of keypoints"
Changed tests accordingly

This reverts commit 5b3b669c

* Added back hidden_states and attentions masked outputs with tests

* Renamed ImageMatching occurences into KeypointMatching

* Changed SuperGlueImageProcessor to raise error when batch_size is not even

* Added docs and clarity to hidden state and attention grouping function

* Fixed some code and done refactoring

* Fixed typo in SuperPoint output doc

* Fixed some of the formatting and variable naming problems

* Removed useless function call

* Removed AutoModelForKeypointMatching

* Fixed SuperGlueImageProcessor to only accept paris of images

* Added more fixes to SuperGlueImageProcessor

* Simplified the batching of attention and hidden states

* Simplified stack functions

* Moved attention instructions into class

* Removed unused do_batch_norm argument

* Moved weight initialization to the proper place

* Replaced deepcopy for instantiation

* Fixed small bug

* Changed from stevenbucaille to magic-leap repo

* Renamed London Bridge images to Tower Bridge

* Fixed formatting

* Renamed remaining "london" to "tower"

* Apply suggestions from code review

Small changes in the docs

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added AutoModelForKeypointMatching

* Changed images used in example

* Several changes to image_processing_superglue and style

* Fixed resample type hint

* Changed SuperGlueImageProcessor and added test case for list of 2 images

* Changed list_of_tuples implementation

* Fix in dummy objects

* Added normalize_keypoint, log_sinkhorn_iterations and log_optimal_transport docstring

* Added missing docstring

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Moved forward block at bottom

* Added docstring to forward method

* Added docstring to match_image_pair method

* Changed test_model_common_attributes to test_model_get_set_embeddings test method signature

* Removed AutoModelForKeypointMatching

* Removed image fixtures and added load_dataset

* Added padding of images in SuperGlueImageProcessor

* Cleaned up convert_superglue_to_hf script

* Added missing docs and fixed unused argument

* Fixed SuperGlueImageProcessor tests

* Transposed all hidden states from SuperGlue to reflect the standard (..., seq_len, feature_dim) shape

* Added SuperGlueForKeypointMatching back to modeling_auto

* Fixed image processor padding test

* Changed SuperGlue docs

* changes:
 - Abstraction to batch, concat and stack of inconsistent tensors
 - Changed conv1d's to linears to match standard attention implementations
 - Renamed all tensors to be tensor0 and not tensor_0 and be consistent
 - Changed match image pair to run keypoint detection on all image first, create batching tensors and then filling these tensors matches after matches
 - Various changes in docs, etc

* Changes to SuperGlueImageProcessor:
- Reworked the input image pairs checking function and added tests accordingly
- Added Copied from statements
- Added do_grayscale tag (also for SuperPointImageProcessor)
- Misc changes for better code

* Formatting changes

* Reverted conv1d to linear conversion because of numerical differences

* fix: changed some code to be more straightforward (e.g. filtering keypoints) and converted plot from opencv to matplotlib

* fix: removed unnecessary test

* chore: removed commented code and added back hidden states transpositions

* chore: changed from "inconsistent" to "ragged" function names as suggested

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* docs: applied suggestions

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* docs: updated to display matched output

* chore: applied suggestion for check_image_pairs_input function

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* chore: changed check_image_pairs_input function name to validate_and_format_image_pairs and used validate_preprocess_arguments function

* tests: simplified tests for image input format and shapes

* feat: converted SuperGlue's use of Conv1d with kernel_size of 1 with Linear layers. Changed tests and conversion script accordingly

* feat: several changes to address comments

Conversion script:
- Reverted fuse batchnorm to linear conversion
- Changed all 'nn.Module' to respective SuperGlue models
- Changed conversion script to use regex mapping and match other recent scripts

Modeling SuperGlue:
- Added batching with mask and padding to attention
- Removed unnecessary concat, stack and batch ragged pairs functions
- Reverted batchnorm layer
- Renamed query, key, value and merge layers into q, k, v, out proj
- Removed Union of different Module into nn.Module in _init_weights method typehint
- Changed several method's signature to combine image0 and image1 inputs with appropriate doc changes
- Updated SuperGlue's doc with torch.no_grad()

Updated test to reflect changes in SuperGlue model

* refactor: changed validate_and_format_image_pairs function with clarity

* refactor: changed from one SuperGlueMLP class to a list of SuperGlueMLP class

* fix: fixed forgotten init weight change from last commit

* fix: fixed rebase mistake

* fix: removed leftover commented code

* fix: added typehint and changed some of arguments default values

* fix: fixed attribute default values for SuperGlueConfig

* feat: added SuperGlueImageProcessor post process keypoint matching method with tests

* fix: fixed SuperGlue attention and hidden state tuples aggregation

* chore: fixed mask optionality and reordered tensor reshapes to be cleaner

* chore: fixed docs and error message returned in validate_and_format_image_pairs function

* fix: fixed returned keypoints to be the ones that SuperPoint returns

* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue

* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue (bis)

* fix: Changed SuperGlueMultiLayerPerceptron instantiation to avoid if statement

* fix: Changed convert_superglue_to_hf script to reflect latest SuperGlue changes and got rid of nn.Modules

* WIP: implement Attention from an existing class (like BERT)

* docs: Changed docs to include more appealing matching plot

* WIP: Implement Attention

* chore: minor typehint change

* chore: changed convert superglue script by removing all classes and apply conv to linear conversion in state dict + rearrange keys to comply with changes in model's layers organisation

* Revert "Fixed typo in SuperPoint output doc"

This reverts commit 2120390e827f94fcd631c8e5728d9a4980f4a503.

* chore: added comments in SuperGlueImageProcessor

* chore: changed SuperGlue organization HF repo to magic-leap-community

* [run-slow] refactor: small change in layer instantiation

* [run-slow] chore: replaced remaining stevenbucaille org to magic-leap-community

* [run-slow] chore: make style

* chore: update image matching fixture dataset HF repository

* [run-slow] superglue

* tests: overwriting test_batching_equivalence

* [run-slow] superglue

* tests: changed test to cope with value changing depending on cuda version

* [run-slow] superglue

* tests: changed matching_threshold value

* [run-slow] superglue

* [run-slow] superglue

* tests: changed tests for integration

* [run-slow] superglue

* fix: Changed tensor view and permutations to match original implementation results

* fix: updated convert script and integration test to include last change in model

* fix: increase tolerance for CUDA variances

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* [run-slow] superglue

* chore: removed blank whitespaces

* [run-slow] superglue

* Revert SuperPoint image processor accident changes

* [run-slow] superglue

* refactor: reverted copy from BERT class

* tests: lower the tolerance in integration tests for SuperGlue

* [run-slow] superglue

* chore: set do_grayscale to False in SuperPoint and SuperGlue image processors

* [run-slow] superglue

* fix: fixed imports in SuperGlue files

* chore: changed do_grayscale SuperGlueImageProcessing default value to True

* docs: added typehint to post_process_keypoint_matching method in SuperGlueImageProcessor

* fix: set matching_threshold default value to 0.0 instead of 0.2

* feat: added matching_threshold to post_process_keypoint_matching method

* docs: update superglue.md to include matching_threshold parameter

* docs: updated SuperGlueConfig docstring for matching_threshold default value

* refactor: removed unnecessary parameters in SuperGlueConfig

* fix: changed from matching_threshold to threshold

* fix: re-revert changes to make SuperGlue attention classes copies of BERT

* [run-slow] superglue

* fix: added missing device argument in post_processing method

* [run-slow] superglue

* fix: add matches different from -1 to compute valid matches in post_process_keypoint_matching (and docstring)

* fix: add device to image_sizes tensor instantiation

* tests: added checks on do_grayscale test

* chore: reordered and added Optional typehint to KeypointMatchingOutput

* LightGluePR suggestions:
- use `post_process_keypoint_matching` as default docs example
- add `post_process_keypoint_matching` in autodoc
- add `SuperPointConfig` import under TYPE_CHECKING condition
- format SuperGlueConfig docstring
- add device in convert_superglue_to_hf
- Fix typo
- Fix KeypointMatchingOutput docstring
- Removed unnecessary line
- Added missing SuperGlueConfig in __init__ methods

* LightGluePR suggestions:
- use batching to get keypoint detection

* refactor: processing images done in 1 for loop instead of 4

* fix: use @ instead of torch.einsum for scores computation

* style: added #fmt skip to long tensor values

* refactor: rollbacked validate_and_format_image_pairs valid and invalid case to more simple ones

* refactor: prepare_imgs

* refactor: simplified `validate_and_format_image_pairs`

* docs: fixed doc

---------

Co-authored-by: steven <steven.bucaillle@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-20 10:32:39 +00:00
872dfbdd46 [ViTPose] Convert more checkpoints (#35638)
* Convert more checkpoints

* Update docs, convert huge variant

* Update model name

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Remove print statements

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Link to collection

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-20 11:29:47 +01:00
332fa024d6 Security fix for self-comment-ci.yml (#35548)
* Revert "Disable  `.github/workflows/self-comment-ci.yml` for now (#35366)"

This reverts commit ccc4a5a59b2d4134a49971915db0710e7a8c7824.

* fix

* fix

* fix

* least permission

* add env

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-20 11:16:03 +01:00
8571bb145a Fix CI for VLMs (#35690)
* fix some easy test

* more tests

* remove logit check here also

* add require_torch_large_gpu in Emu3
2025-01-20 11:15:39 +01:00
5fa3534475 Use AMD CI workflow defined in hf-workflows (#35058)
* Use AMD CI workflow defined in hf-workflows
2025-01-17 20:52:57 +01:00
7d4b3ddde4 ci: fix xpu skip condition for test_model_parallel_beam_search (#35742)
`return unittest.skip()` used in the `test_model_parallel_beam_search` in
skip condition for xpu did not actually mark test to be skipped running
under pytest:
* 148 passed, 1 skipped

Other tests use `self.skipTest()`. Reusing this approach and moving the
condition outside the loop (since it does not depend on it) allows to skip
for xpu correctly:
* 148 skipped

Secondly, `device_map="auto"` is now implemented for XPU for IPEX>=2.5 and
torch>=2.6, so we can now enable these tests for XPU for new IPEX/torch
versions.

Fixes: 1ea3ad1ae ("[tests] use `torch_device` instead of `auto` for model testing (#29531)")

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-01-17 16:47:27 +01:00
8ad6bd0f1b Stop mutating input dicts in audio classification pipeline (#35754) 2025-01-17 15:41:56 +00:00
936a731534 Revert "Unable to use MimiModel with DeepSpeed ZeRO-3" (#35755)
Revert "Unable to use `MimiModel` with DeepSpeed ZeRO-3 (#34735)"

This reverts commit 54fd7e92604e8ecb2f4601aae2f75322af9184c5.
2025-01-17 16:29:26 +01:00
10e8cd0d63 Restore is_torch_greater_or_equal_than for backward compatibility (#35734)
* Restore is_torch_greater_or_equal_than for backward compatibility

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* review comments

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-01-17 16:22:44 +01:00
099d93d2e9 Grounding DINO Processor standardization (#34853)
* Add input ids to model output

* Add text preprocessing for processor

* Fix snippet

* Add test for equivalence

* Add type checking guard

* Fixing typehint

* Fix test for added `input_ids` in output

* Add deprecations and "text_labels" to output

* Adjust tests

* Fix test

* Update code examples

* Minor docs and code improvement

* Remove one-liner functions and rename class to CamelCase

* Update docstring

* Fixup
2025-01-17 14:18:16 +00:00
42b2857b01 OmDet Turbo processor standardization (#34937)
* Fix docstring

* Fix docstring

* Add `classes_structure` to model output

* Update omdet postprocessing

* Adjust tests

* Update code example in docs

* Add deprecation to "classes" key in output

* Types, docs

* Fixing test

* Fix missed clip_boxes

* [run-slow] omdet_turbo

* Apply suggestions from code review

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Make CamelCase class

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-01-17 14:10:19 +00:00
94ae9a8da1 OwlViT/Owlv2 post processing standardization (#34929)
* Refactor owlvit post_process_object_detection + add text_labels

* Fix copies in grounding dino

* Sync with Owlv2 postprocessing

* Add post_process_grounded_object_detection method to processor, deprecate post_process_object_detection

* Add test cases

* Move text_labels to processors only

* [run-slow] owlvit owlv2

* [run-slow] owlvit, owlv2

* Update snippets

* Update docs structure

* Update deprecated objects for check_repo

* Update docstring for post processing of image guided object detection
2025-01-17 13:58:28 +00:00
add5f0566c Added liger_kernel compatibility with PeftModel (#35680)
* Added liger_kernel compatibility with `PeftModel`

* Amending based on review comments

* Amending based on review comments
2025-01-17 14:43:20 +01:00
df6d42a914 check is added for the report_to variable in TrainingArguments (#35403)
check for report_to variable is added
2025-01-17 14:39:32 +01:00
54fd7e9260 Unable to use MimiModel with DeepSpeed ZeRO-3 (#34735)
use torch.tensor(), not torch.Tensor()

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-17 14:06:20 +01:00
ab1afd56f5 Fix some tests (#35682)
* cohere tests

* glm tests

* cohere2 model name

* create decorator

* update

* fix cohere2 completions

* style

* style

* style

* add cuda in comments
2025-01-17 12:10:43 +00:00
8c1b5d3782 🚨🚨🚨 An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. (#35615)
* An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably.

* Fix fix on load issue

* Fix gamma/beta warning test

* A style complaint

* Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming.

* Habitual elif redunant with the return
2025-01-16 17:25:44 -08:00
02a492a838 Added resource class configuration option for check_circleci_user job (#32866)
Added resource class configuration option for check_circleci_user job.
2025-01-16 21:31:18 +01:00
94af1c0aa2 [generate] return Cache object even if passed in a legacy format (#35673)
* generate returns a Cache object by default

* fix tests

* fix test for encoder-decoder models
2025-01-16 17:06:24 +00:00
2818307e93 [generate] can instantiate GenerationConfig(cache_implementation="static") (#35679)
fix failing instantiation
2025-01-16 17:04:54 +00:00
aaa969e97d Remove pt_to_tf (#35672)
* rm command

* remove exception
2025-01-16 17:03:37 +00:00
80dbbd103c 🧹 remove generate-related objects and methods scheduled for removal in v4.48 (#35677)
* remove things scheduled for removal

* make fixup
2025-01-16 17:03:20 +00:00
aeeceb9916 [cache] add a test to confirm we can use cache at train time (#35709)
* add test

* augment test as suggested

* Update tests/utils/test_modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* rerun tests

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-16 17:02:34 +00:00
57bf1a12a0 Remove batch size argument warning when unjustified (#35519)
* use max batch size

* revert unneccessary change

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-16 17:48:11 +01:00
91be6a5eb2 Modular: support for importing functions from any file (#35692)
* fix function imports

* improve comment

* Update modeling_switch_function.py

* make checks more robust

* improvement

* rename

* final test update
2025-01-16 16:37:53 +00:00
8ebe9d7166 Optimize ForCausalLMLoss by removing unnecessary contiguous() call to reduce memory overhead (#35646)
Optimize ForCausalLMLoss by removing unnecessary contiguous() calls to reduce memory overhead
2025-01-16 15:47:43 +00:00
1302c32a84 Add proper jinja2 error (#35533)
* Cleanup jinja2 imports

* Raise a proper error if Jinja is missing

* make fixup
2025-01-16 15:31:11 +00:00
3292e96a4f [generation] fix type hint (#35725)
fix type hint
2025-01-16 15:09:59 +00:00
8b78d9d6e7 Fix the bug that Trainer cannot correctly call torch_jit_model_eval (#35722)
Fix the bug that the accelerator.autocast does not pass parameters correctly when calling torch_jit_model_eval (#35706)
2025-01-16 15:53:37 +01:00
2cbcc5877d Fix condition when GA loss bug fix is not performed (#35651)
* fix condition when GA loss bug fix is not performed

* max loss diff is 2.29

* fix typo

* add an extra validation that loss should not vary too much
2025-01-16 13:59:53 +01:00
fd4f14c968 Fix: Falcon tie_word_embeddings in GGUF (#35715)
* fix falcon tie_word_embeddings

* fix style
2025-01-16 13:18:22 +01:00
bef7dded22 Replace deprecated batch_size with max_batch_size when using HybridCache (#35498)
* Replace deprecated batch_size with max_batch_size

- Functionality remains the same, because property getter batch_size(self) returned max_batch_size anyways.
- This change just avoids an unnecessary warning about deprecation.

* Use max_batch_size instead of deprecated batch_size with HybridCache

* Use max_batch_size instead of deprecated batch_size with HybridCache

- Change generated code to match original source
2025-01-16 11:48:41 +00:00
99e0ab6ed8 Fix typo in /docs/source/ja/model_doc/decision_transformer.md URL (#35705)
doc: Update original code repository URL
2025-01-15 07:36:50 -08:00
12dfd99007 Fix : Nemotron Processor in GGUF conversion (#35708)
* fixing nemotron processor

* make style
2025-01-15 14:25:44 +01:00
387663e571 Enable gptqmodel (#35012)
* gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update readme

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix warning

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix version check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert unrelated changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix requires gptq

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format again

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attibutes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix memory check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix result check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-15 14:22:49 +01:00
615bf9c5e4 Add future import for Py < 3.10 (#35666)
* Add future import for Py < 3.10

* make fixup

* Same issue in convert_olmo2_weights_to_hf.py
2025-01-15 12:45:43 +00:00
09d5f76274 Clean-up composite configs (#34603)
* remove manual assignment tie-word-embeddings

* remove another unused attribute

* fix tests

* fix tests

* remove unnecessary overwrites

* fix

* decoder=True

* clean pix2struct

* run-all

* forgot `_tied_weights_keys` when adding Emu3

* also Aria + fix-copies

* and clean aria
2025-01-15 10:04:07 +01:00
c61fcde910 Enhance DataCollatorForLanguageModeling with Configurable Token Replacement Probabilities (#35251)
* DataCollatorForLanguageModeling class was updated with new parameters that provides more control over the token masking and relacing

* DataCollatorForLanguageModeling class was updated with new parameters that provides more control over the token masking and relacing

* Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling
2025-01-14 17:01:10 +00:00
b0cdbd9119 Enhanced Installation Section in README.md (#35094)
* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.

* Update README.md

Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.

* Update installation.md

Updated installation.md to include virtual environment and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting and GPU setup instructions.

* Update installation.md

* Update installation.md

* Update installation.md

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update README.md

Removed numbering from README.md.

* Update README.md

Removed unnecessary "a)" formatting as per maintainer feedback.

* Update README.md

Added blank lines around code snippets for better readability.

* Update README.md

Removed the line "b) Install a backend framework:" from README.md as per feedback.

* Update README.md

Simplified "For Windows:" to "Windows" in README.md as per feedback as well as "For macOS/Linux:" to "macOS/Linux"

* Update README.md

Removed unnecessary heading and retained valid code snippet.

* Update README.md

Removed unnecessary heading "d) Optional: Install from source for the latest updates" as per feedback.

* Update README.md

Removed "GPU Setup (Optional)" section to align with minimal design feedback.

* Update installation.md

Removed "Create and Activate a Virtual Environment" section from installation.md as per feedback.

* Update installation.md

Adjusted "Troubleshooting" to a second-level heading and added an introductory line as per feedback.

* Update installation.md

Updated troubleshooting section with simplified headings and formatted code blocks as per feedback.

* Update installation.md

Integrated GPU setup instructions into the "Install with pip" section for better content flow.

* Update README.md

Removed Troubleshooting section from README.md for minimalism as per maintainer feedback.
2025-01-14 08:05:08 -08:00
a11041ffad Fix : add require_read_token for gemma2 gated model (#35687)
fix gemma2 gated model test
2025-01-14 11:47:05 +01:00
df2a812e95 Fix expected output for ggml test (#35686)
fix expected output
2025-01-14 11:46:55 +01:00
050636518a Fix : HQQ config when hqq not available (#35655)
* fix

* make style

* adding require_hqq

* make style
2025-01-14 11:37:37 +01:00
715fdd6459 Update torchao.md: use auto-compilation (#35490)
* Update torchao.md: use auto-compilation

* Update torchao.md: indicate updating transformers to the latest

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-01-14 11:33:48 +01:00
4b8d1f7fca Fix : adding einops lib in the CI docker for some bitsandbytes tests (#35652)
* fix docker

* fix
2025-01-14 07:36:10 +01:00
34f76bb62b Fix zero_shot_image_classification documentation guide link in SigLIP (#35671) 2025-01-13 11:08:17 -08:00
c23a1c1932 Add-helium (#35669)
* Add the helium model.

* Add a missing helium.

* And add another missing helium.

* Use float for the rmsnorm mul.

* Add the Helium tokenizer converter.

* Add the pad token as suggested by Arthur.

* Update the RMSNorm + some other tweaks.

* Fix more rebase issues.

* fix copies and style

* fixes and add helium.md

* add missing tests

* udpate the backlink

* oups

* style

* update init, and expected results

* small fixes

* match test outputs

* style fixup, fix doc builder

* add dummies and we should be good to go!z

* update sdpa and fa2 documentation

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-01-13 18:41:15 +01:00
a3f82328ed [i18n-ar] Translated file : docs/source/ar/tasks/token_classification.md into Arabic (#35193)
* Create token_classification.md

* Update token_classification.md

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-13 09:32:15 -08:00
2fa876d2d8 [tests] make cuda-only tests device-agnostic (#35607)
* intial commit

* remove unrelated files

* further remove

* Update test_trainer.py

* fix style
2025-01-13 14:48:39 +01:00
e6f9b03464 [Compile] Only test compiling model forward pass (#35658)
* rename test to only compile forward!

* style emu
2025-01-13 13:43:29 +01:00
84a6789145 Enable different torch dtype in sub models (#34873)
* fix

* fix test

* add tests

* add more tests

* fix tests

* supposed to be a torch.dtype test

* handle BC and make fp32 default
2025-01-13 13:42:08 +01:00
87089176d9 [Phi] bias should be True (#35650)
bias should be True
2025-01-13 13:15:07 +01:00
91f14f1fc4 Removed some duplicated code (#35637)
* Removed duplicate class field definition.

* Removed duplicate code in try-except block.

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-01-13 12:34:21 +01:00
b8c34d97fc Fix whisper compile (#35413)
Fix compile error

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-13 11:31:51 +01:00
cd44bdb4b8 Fix device in rope module when using dynamic updates (#35608)
fix rope device
2025-01-13 10:11:17 +01:00
15bd3e61f8 Update codeowners with individual model owners (#35595)
* Update codeowners with individual model owners

* rip yoach

* add comment

* Replace - with _

* Add @qubvel for zero-shot object-detection

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add yoni for omdet-turbo

* Update CODEOWNERS

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Refactor / comment the CODEOWNERS file

* Capture modular files as well

* Add dummies without owner

* More cleanup

* Set Niels on a few more models that he added

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-01-10 17:59:36 +00:00
1e3c6c1f7d Skip MobileNetV1ModelTest::test_batching_equivalence for now (#35614)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 18:32:36 +01:00
04eae987f3 Fix flaky test_beam_search_low_memory (#35611)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 17:31:03 +01:00
b02828e4af Let EarlyStoppingCallback not require load_best_model_at_end (#35101)
* Bookmark

* Add warning
2025-01-10 10:25:32 -05:00
0aaf124fb9 Added error when sequence length is bigger than max_position_embeddings (#32156)
* Added error when sequence length is bigger than max_position_embeddings

* Fixed formatting

* Fixed bug

* Changed copies to match

* Fixed bug

* Applied suggestions

* Removed redundant code

* Fixed bugs

* Bug fix

* Bug fix

* Added requested Changes

* Fixed bug

* Fixed unwanted change

* Fixed unwanated changes

* Fixed formatting
2025-01-10 15:23:54 +00:00
1211e616a4 Use inherit tempdir makers for tests + fix failing DS tests (#35600)
* Use existing APIs to make tempdir folders

* Fixup deepspeed too

* output_dir -> tmp_dir
2025-01-10 10:01:58 -05:00
bbc00046b9 Fix flaky test_custom_4d_attention_mask (#35606)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 15:40:04 +01:00
f63829c87b v4.49.0-dev 2025-01-10 12:31:11 +01:00
52e1f87c7d [WIP] Emu3: add model (#33770)
* model can convert to HF and be loaded back

* nit

* works in single batch generation but hallucinates

* use the image tokens

* add image generation

* now it works

* add tests

* update

* add modulare but it doesn't work for porting docstring :(

* skip some tests

* add slow tests

* modular removed the import?

* guess this works

* update

* update

* fix copies

* fix test

* fix copies

* update

* docs

* fix tests

* last fix tests?

* pls

* repo consistency

* more style

* style

* remove file

* address comments

* tiny bits

* update after the new modular

* fix tests

* add one more cond in check attributes

* decompose down/up/mid blocks

* allow static cache generation in VLMs

* nit

* fix copies

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix VAE upsampling

* Update src/transformers/models/emu3/modular_emu3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* state overwritten stuff explicitly

* fix copies

* add the flag for flex attn

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-10 12:23:00 +01:00
ccc0381d36 Fix flex_attention in training mode (#35605)
* fix flex

* add test

* style
2025-01-10 11:49:12 +01:00
a9bd1e6284 Remove benchmark.py after #34275 2025-01-10 11:09:06 +01:00
e0646f3dce Chat template: return vectorized output in processors (#34275)
* update chat template

* style

* fix tests

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typehints + docs

* fix tests

* remove unnecessary warnings

* forgot code style :(

* allow users to pass backend and num frames

* Update docs/source/en/chat_templating.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typo fix

* style

* address comments

* align with "pipeline" template

* update docs

* update docs

* unpack for all kwargs?

* wrong conflict resolution while rebasing

* tmp

* update docs

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-10 11:05:29 +01:00
5f087d1335 Add Moonshine (#34784)
* config draft

* full encoder forward

* full decoder forward

* fix sdpa and FA2

* fix sdpa and FA2

* moonshine model

* moonshine model forward

* fix attention with past_key_values

* add MoonshineForConditionalGeneration

* fix cache handling and causality for cross attention

* no causal attention mask for the encoder

* model addition (imports etc)

* small nit

* nits

* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* add rope_theta

* nits

* model doc

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* imports

* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES

* updates modular

* make

* make fix-copies

* ruff check examples fix

* fix check_modular_conversion

* nit

* nits

* nits

* copied from -> imports

* imports fix

* integrate attention refacto

* modular edge case

* remove encoder

* convolutions params in config

* run modular_model_converter

* make

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* MoonshineModelTest

* correct typo

* make style

* integration tests

* make

* modular convert

* name conversion update (up_proj -> fc1 etc)

* update config

* update MLP

* update attention

* update encoder layer

* update decoder layer

* update convolutions parameters

* update encoder

* remove INPUTS_DOCSTRING

* update decoder

* update conditional generation

* update pretrained model

* imports

* modular converted

* update doc

* fix

* typo

* update doc

* update license

* update init

* split config in file

* two classes for MLP

* attention from GLM

* from GlmRotaryEmbedding

* split MLP

* apply arthur's review suggestions

* apply arthur's review suggestions

* apply arthur's review suggestions

* auto feature extractor

* convert modular

* fix + make

* convert modular

* make

* unsplit config

* use correct checkpoint

* wrap generate

* update tests

* typos

* make

* typo

* update doc

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-01-10 11:00:54 +01:00
6f127d3f81 Skip torchscript tests if a cache object is in model's outputs (#35596)
* fix 1

* fix 1

* comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 10:46:03 +01:00
6b73ee8905 ModernBert: reuse GemmaRotaryEmbedding via modular + Integration tests (#35459)
* Introduce 5 integration tests for the 4 model classes + torch export

* ModernBert: reuse GemmaRotaryEmbedding via modular

* Revert #35589, keep rope_kwargs; rely on them in modular_modernbert

* Revert "Revert #35589, keep rope_kwargs; rely on them in modular_modernbert"

This reverts commit 11b44b9ee83e199cbfb7c5ba2d11f7a7fdbba2d3.

* Don't set rope_kwargs; override 'self.rope_init_fn' call instead
2025-01-10 10:25:10 +01:00
8de7b1ba8d Add flex_attn to diffllama (#35601)
Add sdpa to diffllama
2025-01-09 20:49:11 +01:00
1e3ddcb2d0 ModernBERT bug fixes (#35404)
* bug fixes

* organize imports

* wrap cpu warning in reference_compile

* Avoid needing repad_logits_with_grad, always repad with grads when training

I'm not 100% that the conditional with "or labels is None" makes sense though - not sure what the intention is there. Perhaps we can remove that?

* Revert "Avoid needing repad_logits_with_grad, always repad with grads when training"

This reverts commit cedcb4e89bcea199a1135a0933e71f534b656239.

* Fix grammar: keep -> keeps

* Propagate grammar fix with modular_model_converter

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
2025-01-09 20:15:38 +01:00
e97d7a5be5 add _supports_flex_attn = True for models that do support it (#35598)
* add `_supports_flex_attn = True`

* fix repo consistency
2025-01-09 20:03:33 +01:00
c9c682d19c [doc] deepspeed universal checkpoint (#35015)
* universal checkpoint

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-09 09:50:51 -08:00
3a4ae6eace Refactor/fix Cohere2 (#35594)
* refactor/fix cohere2

* add kwargs

* tests

* remove func and import it
2025-01-09 17:54:57 +01:00
32e0db8a69 [tokenizers] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer (#35593)
* Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer

in PreTrainedTokenizerFast, rather than relying on subclasses to take care of this.

* Simplify setting self.add_prefix_space, ensure pre_tok exists

* Wrap in try-except to catch 'Custom PreTokenizer cannot be serialized'

862d1a346a/bindings/python/src/pre_tokenizers.rs (L672) produces the Exception. They're triggered by the roformer tests, as the RoFormerTokenizerFast uses a custom PreTokenizer.

* Propagate add_prefix_space in T5TokenizerFast to superclass
2025-01-09 17:46:50 +01:00
46276f9a7f Fix modular edge case + modular sorting order (#35562)
* look-ahead negation

* re add examples by default

* Fix the bug in topological sort

* Update create_dependency_mapping.py

* start adding test

* finalize test

* more tests

* style

* style
2025-01-09 17:17:52 +01:00
d3fe9fa3fe PR for Issue #22694: Fixed Training Evaluation table display for VSCode (#35557) 2025-01-09 15:05:47 +00:00
395b114bd1 Small fix rope kwargs (#35589)
* don't know why this keeps popping up?

* remove unused rope_kwargs
2025-01-09 15:40:36 +01:00
82dd6c14bb Fix flaky SwitchTransformersModelTest::test_training_gradient (#35587)
* fix

* Update tests/models/switch_transformers/test_modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-09 15:36:22 +01:00
eb4579cf43 tokenizer train from iterator without pre_tokenizers (#35396)
* fix if else issues

* add a test

* fix the test

* style
2025-01-09 15:34:43 +01:00
320512df46 feat: add TP plan for granite (#35573)
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
2025-01-09 15:25:55 +01:00
633da1b10e [Idefics3] Move image features to same device as input embeds (#35100)
* [Idefics3] Move image features to same device as input embeds

* Update src/transformers/models/idefics3/modeling_idefics3.py

* make style

---------

Co-authored-by: Saif Rehman Nasir <shyshin@github.com>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-09 14:25:36 +01:00
832c6191ed Add inputs_embeds param to ModernBertModel (#35373)
* update modular_modernbert -- add inputs_embeds param to ModernBertModel

* Fix implementation issues; extend to other classes; docstring

First of all, the inputs_embeds shouldn't fully replace `self.embeddings(input_ids)`, because this call also does layer normalization and dropout. So, now both input_ids and inputs_embeds is passed to the ModernBertEmbeddings, much like how BertEmbeddings is implemented.

I also added `inputs_embeds` to the docstring, and propagated the changes to the other model classes.

I also introduced an error if input_ids and input_embeds are both or neither provided.

Lastly, I fixed an issue with device being based solely on input_ids with attention_mask.

* Propagate inputs_embeds to ModernBertForMaskedLM correctly

Also reintroduce inputs_embeds test

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2025-01-09 14:17:26 +01:00
1b2f942af7 Fix flaky test_batching_equivalence (#35564)
* yes!

* oh no!!!

* oh no!!!

* style

* oh no!!!

* oh no!!!

* oh no!!!

* oh no!!!

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-09 14:00:08 +01:00
4adc415b6d Setup loss_type in config at model init time (#34616)
* setup loss_type in config at model init time

ensures no additional graph break introduced when torch.compile'ed

fixes #34615

Signed-off-by: ChanderG <mail@chandergovind.org>

* lookup loss mapping at init time instead of manual setup

Signed-off-by: ChanderG <mail@chandergovind.org>

* remove redundant lookup at loss_function time

* overwride losstype at init time

---------

Signed-off-by: ChanderG <mail@chandergovind.org>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-01-09 13:32:21 +01:00
c8ab6ce6ce Re-add missing __all__ for Cohere and Phi3 (#35578)
re-add missing __all__
2025-01-09 11:29:31 +01:00
487c31a21f Minor fix in video text 2 text docs (#35546)
minor fix in docs
2025-01-09 11:20:36 +01:00
965a2fb320 More model refactoring! (#35359)
* cohere

* style

* phi3

* style

* small fix

* small fix

* phi3 longrope

* oups

* Update rope (only for phi3 still)

* Update test_modeling_rope_utils.py

* Update modeling_phi3.py

* fix

* fix copies

* style

* Fix copied from bad renaming
2025-01-09 11:09:09 +01:00
137965ca7d Don't show warning for inv_freq buffers (#35255)
dont show warning
2025-01-09 10:46:01 +01:00
8cad65a698 Fix multi-gpu loss (#35395)
push to device
2025-01-09 10:14:31 +01:00
2e2f8015c0 update code owners (#35576)
update
2025-01-09 09:55:41 +01:00
a6256ec098 [i18n-ar] Translated file: docs/source/ar/tasks/multiple_choice.md into Arabic (#35199)
* إضافة الترجمة العربية: multiple_choice.md

* Update multiple_choice.md

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Add files via upload

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-08 14:17:58 -08:00
b32938aeee Fix all output_dir in test_trainer.py to use tmp_dir (#35266)
* update codecarbon

* replace directly-specified-test-dirs with tmp_dir

* pass tmp_dir to all get_regression_trainer

* test_trainer.py: Use tmp_dir consistently for all output_dir arguments

* fix some with...as tmp_dir blocks

* reflect the comments to improve test_trainer.py

* refresh .gitignore
2025-01-08 19:44:39 +01:00
76da6ca034 Pipeline: simple API for assisted generation (#34504)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:08:02 +00:00
3f483beab9 [PixtralLarge] Update Pixtral conversion script to support large format! (#34801)
* update conversion script

* update for bias again

* remove pdv

* use my dir

* Update how we initialize the tokenizer

* Convert in bfloat16

* Undo that one again

* fix config dump

* .to() was broken for BatchMixFeature

* quick debug breakpoint

* put the breakpoint in the right place

* Add a config flag for the multimodal projector bias

* Add a config flag for the multimodal projector bias

* Conversion script can load chat templates

* Indent config for comparison

* Stop clobbering the config

* Re-enable the config clobber

* Get rid of the config manual save - it has no effect!

* Handle adapter bias correctly

* Default vision transformer activation to silu

* Remove legacy processing path

* One commit with all the debug breakpoints before I delete them all, in case I need to revert

* Update conversion

* Remove vLLM debugging instrumentation

* Drop xformers

* Remove debug enumerates

* make fixup

* make fixup

* Break copied from in pixtral

* Propagate multimodal_projector_bias change

* Propagate multimodal_projector_bias change

* Remove debug device .to()

* Restore attention weights output

* Fix Pixtral test

* Drop image_seq_length

* Drop image_seq_length

* Put the legacy processing code back

* Add the bias option to the llava_next_video config

* Add the bias option to the llava_next_video config

* Make certain args required in converter

* Make certain args required in converter

* typo

* make fixup

* Reverting some dtype changes since it seems to work without them

---------

Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-166-244.ec2.internal>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:39:47 +01:00
4c2c12b3de [docs] Remove Hiera from AUDIO MODELS in docs (#35544)
Remove Hiera from AUDIO MODELS

Hiera is a visual model and should not appear in audio model...
2025-01-08 16:33:21 +00:00
854dc7941b ovewrite top_k when crate audio classification pipeline (#35541)
* ovewrite top_k when crate audio classification pipeline

* Update src/transformers/pipelines/audio_classification.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 16:32:27 +00:00
8c555ca3d7 add code owners (#35528)
* add co owners

* normal processing

* /src/transformers/models/*/*_modeling*

* Update CODEOWNERS

* Update CODEOWNERS

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* nit

* Apply suggestions from code review

Co-authored-by: Alvaro Moran <6949769+tengomucho@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update CODEOWNERS

* rather put `@Rocketknight1`

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Alvaro Moran <6949769+tengomucho@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
2025-01-08 17:14:44 +01:00
8490d3159c Add ViTPose (#30530)
* First draft

* Make fixup

* Make forward pass worké

* Improve code

* More improvements

* More improvements

* Make predictions match

* More improvements

* Improve image processor

* Fix model tests

* Add classic decoder

* Convert classic decoder

* Verify image processor

* Fix classic decoder logits

* Clean up

* Add post_process_pose_estimation

* Improve post_process_pose_estimation

* Use AutoBackbone

* Add support for MoE models

* Fix tests, improve num_experts%

* Improve variable names

* Make fixup

* More improvements

* Improve post_process_pose_estimation

* Compute centers and scales

* Improve postprocessing

* More improvements

* Fix ViTPoseBackbone tests

* Add docstrings, fix image processor tests

* Update index

* Use is_cv2_available

* Add model to toctree

* Add cv2 to doc tests

* Remove script

* Improve conversion script

* Add coco_to_pascal_voc

* Add box_to_center_and_scale to image_transforms

* Update tests

* Add integration test

* Fix merge

* Address comments

* Replace numpy by pytorch, improve docstrings

* Remove get_input_embeddings

* Address comments

* Move coco_to_pascal_voc

* Address comment

* Fix style

* Address comments

* Fix test

* Address comment

* Remove udp

* Remove comment

* [WIP] need to check if the numpy function is same as cv

* add scipy affine_transform

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* refactor convert

* add output_shape

* add atol 5e-2

* Use hf_hub_download in conversion script

* make box_to_center more applicable

* skipt test_get_set_embedding

* fix to accept array and fix CI

* add co-contributor

* make it to tensor type output

* add torch

* change to torch tensor

* add more test

* minor change

* CI test change

* import torch should be above ImageProcessor

* make style

* try not use torch in def

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/configuration_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix

* fix

* add caution

* make more detail about dataset_index

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* add docs

* Update docs/source/en/model_doc/vitpose.md

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Revert "Update src/transformers/__init__.py"

This reverts commit 7ffa504450bb9dbccf9c7ea668441b98a1939d5c.

* change name

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/vitpose/test_modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* move vitpose only function to image_processor

* raise valueerror when using timm backbone

* use out_indices

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove camel-case of def flip_back

* rename vitposeEstimatorOutput

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix confused camelcase of MLP

* remove in-place logic

* clear scale description

* make consistent batch format

* docs update

* formatting docstring

* add batch tests

* test docs change

* Update src/transformers/models/vitpose/image_processing_vitpose.py

* Update src/transformers/models/vitpose/configuration_vitpose.py

* chagne ViT to Vit

* change to enable MoE

* make fix-copies

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* extract udp

* add more described docs

* simple fix

* change to accept target_size

* make style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change to `verify_backbone_config_arguments`

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove unnecessary copy

* make config immutable

* enable gradient checkpointing

* update inappropriate docstring

* linting docs

* split function for visibility

* make style

* check isinstances

* change to acceptable use_pretrained_backbone

* make style

* remove copy in docs

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* simple fix + make style

* change input config of activation function to string

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* tmp docs

* delete index.md

* make fix-copies

* simple fix

* change conversion to sam2/mllama style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* refactor convert

* add supervision

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* remove reduntant def

* seperate code block for visualization

* add validation for num_moe

* final commit

* add labels

* [run-slow] vitpose, vitpose_backbone

* Update src/transformers/models/vitpose/convert_vitpose_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* enable all conversion

* final commit

* [run-slow] vitpose, vitpose_backbone

* ruff check --fix

* [run-slow] vitpose, vitpose_backbone

* rename split module

* [run-slow] vitpose, vitpose_backbone

* fix pos_embed

* Simplify init

* Revert "fix pos_embed"

This reverts commit 2c56a4806e30bc9b5753b142fa04b913306c54ff.

* refactor single loop

* allow flag to enable custom model

* efficiency of MoE to not use unused experts

* make style

* Fix range -> arange to avoid warning

* Revert MOE router, a new one does not work

* Fix postprocessing a bit (labels)

* Fix type hint

* Fix docs snippets

* Fix links to checkpoints

* Fix checkpoints in tests

* Fix test

* Add image to docs

---------

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 16:02:14 +00:00
4349a0e401 fix: Qwen2-VL generate with inputs_embeds (#35466)
* fix: Qwen2-VL generate with inputs_embeds

* change: optional input_ids in get_rope_index
2025-01-08 16:36:03 +01:00
88e18b3c63 Update doc for metric_for_best_model when save_strategy="best". (#35389)
* Updated docstring for _determine_best_metric.

* Updated docstring for metric_for_best_model.

* Added test case for save strategy.

* Updated incorrect test case.

* Changed eval_strategy to match save_strategy.

* Separated test cases for metric.

* Allow load_best_model when save_strategy == "best".

* Updated docstring for metric_for_best_model.
2025-01-08 16:32:35 +01:00
jp
29e74b7cbc Add: num_additional_image_tokens to models (#35052)
* Add: num_additional_image_tokens to models

* docs: update docstring for num_additional_image_tokens in configuration files

* Add num_additional_image_tokens to LlavaNextVideo model and update feature selection logic

* revert

* Fix: adjust num_image_tokens calculation in LlavaProcessor

* Remove num_additional_image_tokens initialization from configuration files

* Fix test error

* revert

* Fix: adjust num_image_tokens calculation in LlavaNextVideoProcessor

* fix conflict

* Fix: adjust num_image_tokens calculation in VideoLlavaProcessor

* make style

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-08 16:20:01 +01:00
657bb14f98 Enable auto task for timm models in pipeline (#35531)
* Enable auto task for timm models

* Add pipeline test
2025-01-08 15:14:17 +00:00
1a6c1d3a9a Bump torch requirement to >= 2 (#35479)
Bump torch requirement, follow-up of #35358
2025-01-08 15:59:32 +01:00
59e5b3f01b Timm wrapper label names (#35553)
* Add timm wrapper label names mapping

* Add index to classification pipeline

* Revert adding index for pipelines

* Add custom model check for loading timm labels

* Add tests for labels

* [run-slow] timm_wrapper

* Add note regarding label2id mapping
2025-01-08 14:09:46 +00:00
f1639ea51d Update missing model error message (#35370)
* Update missing model error message

* Update missing model error message

* Update missing model error message

* Fix capitalization
2025-01-08 15:05:06 +01:00
bd39b0627b Update doc and default value of TextNetImageProcessor (#35563)
update doc and default value
2025-01-08 13:47:52 +00:00
651cfb400f Add support for modular with fast image processors (#35379)
* Add support for modular with fast image processors

* fix order and remove copied from

* add comment for "image_processing*_fast"
2025-01-08 08:37:57 -05:00
430d3d43a5 [Docs] links to logits-processor-zoo (#35552)
links to logits-processor-zoo
2025-01-08 13:36:30 +00:00
3c1895aa65 Fix Qwen2VL processor to handle odd number of frames (#35431)
* fix: processing odd number of frames

* feat: add test case

* update: test one frame

* feat: support custom patch size

* fix: test with videos

* revert: change on patch repeat

* fix: much wow

* update: fixups

* fixup pls

* ruff fixup

* fix typo at least
2025-01-08 13:49:00 +01:00
3fde88b19d support chat generator as input of TextGenerationPipeline (#35551)
* support chat generator as input of TextGenerationPipeline

* missing import

* fix tests

* again

* simpler

* add test
2025-01-08 13:27:07 +01:00
ebdd1ad400 Pass correct num_items_in_batch value into the training_step function (#35438)
pass correct `num_items_in_batch` to compute_loss
2025-01-08 13:16:03 +01:00
0e0516c119 MODERNBERT_INPUTS_DOCSTRING: past_key_values are ignored (#35513)
* MODERNBERT_INPUTS_DOCSTRING: past_key_values are ignored

* sync to modular_modernbert.py
2025-01-08 11:45:40 +01:00
d1681ec2b6 VLMs: major clean up 🧼 (#34502)
only lllava models are modified
2025-01-08 10:35:23 +01:00
7176e06b52 Add TextNet (#34979)
* WIP

* Add config and modeling for Fast model

* Refactor modeling and add tests

* More changes

* WIP

* Add tests

* Add conversion script

* Add conversion scripts, integration tests, image processor

* Fix style and copies

* Add fast model to init

* Add fast model in docs and other places

* Fix import of cv2

* Rename image processing method

* Fix build

* Fix Build

* fix style and fix copies

* Fix build

* Fix build

* Fix Build

* Clean up docstrings

* Fix Build

* Fix Build

* Fix Build

* Fix build

* Add test for image_processing_fast and add documentation tests

* some refactorings

* Fix failing tests

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Introduce TextNet

* Fix failures

* Refactor textnet model

* Fix failures

* Add cv2 to setup

* Fix failures

* Fix failures

* Add CV2 dependency

* Fix bugs

* Fix build issue

* Fix failures

* Remove textnet from modeling fast

* Fix build and other things

* Fix build

* some cleanups

* some cleanups

* Some more cleanups

* Fix build

* Incorporate PR feedbacks

* More cleanup

* More cleanup

* More cleanup

* Fix build

* Remove all the references of fast model

* More cleanup

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix Build

* Fix build

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Fix style

* Fix build

* Incorporate PR feedbacks

* Fix image processing mean and std

* Incorporate PR feedbacks

* fix build failure

* Add assertion to image processor

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* fix style failures

* fix build

* Fix Imageclassification's linear layer, also introduce TextNetImageProcessor

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix build

* Incorporate PR feedbacks

* Remove some script

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix image processing in textnet

* Incorporate PR Feedbacks

* Fix CI failures

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Add textnet to readme

* Improve readability

* Incorporate PR feedbacks

* fix code style

* fix key error and convert working

* tvlt shouldn't be here

* fix test modeling test

* Fix tests, make fixup

* Make fixup

* Make fixup

* Remove TEXTNET_PRETRAINED_MODEL_ARCHIVE_LIST

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_image_processing_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* space typo

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/configuration_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* make conv layer kernel sizes and strides default to None

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix keyword bug

* add batch init and make fixup

* Make fixup

* Update integration test

* Add figure

* Update textnet.md

* add testing and fix errors (classification, imgprocess)

* fix error check

* make fixup

* make fixup

* revert to original docstring

* add make style

* remove conflict for now

* Update modeling_auto.py

got a confusion in `timm_wrapper` - was giving some conflicts

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* add changes

* Update textnet.md

* add doc

* add authors hf ckpt + rename

* add feedback: classifier/docs

---------

Co-authored-by: raghavanone <opensourcemaniacfreak@gmail.com>
Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 09:52:51 +01:00
b05df6611e [docs] Remove sortish_sampler (#35539)
remove
2025-01-07 12:06:19 -08:00
a7d1441d65 Correctly list the chat template file in the Tokenizer saved files list (#34974)
* Correctly list the chat template file in the saved files list

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add save file checking to test

* make fixup

* better filename handling

* make fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-07 19:11:02 +00:00
cdca3cf9e3 [Whisper] fix docstrings typo (#35338)
fix typo
2025-01-07 09:20:27 -08:00
7f7677307c [Qwen2Audio] handle input ids expansion during processing (#35534)
* add audio_token attribute to proc

* expand input_ids

* and legacy and expanded input_ids

* test update

* split lines

* add possibility not to provide eos and bos audio tokens

* raise errors

* test incorrect number of audio tokens

* add example

* fmt

* typo
2025-01-07 16:47:27 +01:00
628cd838a3 Release GPU memory after Optuna trial (#35440)
* Release GPU memory after trial

* Update to use release_memory from accelerate.utils.memory after suggestion
2025-01-07 16:26:28 +01:00
665a4942e4 Check whether rescale is requested before checking is_scaled_image (#35439) 2025-01-07 11:39:45 +00:00
f408d55448 Fix bug when requesting input normalization with EnCodec (#34756)
* EnCodec: unsqueeze padding mask

* add test for normalization
2025-01-07 11:50:02 +01:00
96bf3d6cc5 Add diffllama (#34083)
* first adding diffllama

* add Diff Attention and other but still with errors

* complate make attention Diff-Attention

* fix some bugs which may be caused by transformer-cli while adding model

* fix a bug caused by forgetting KV cache...

* Update src/transformers/models/diffllama/modeling_diffllama.py

You don't need to divide by 2 if we use same number of attention heads as llama. instead you can just split in forward.

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changeing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new codes are more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new codes are more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changeing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix 2times divide by sqrt(self.head_dim)

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix 2times divide by sqrt(self.head_dim)

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changeing "num_heads // 2" place.
and more visible

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* I found Attention missed implemented from paper still on e072544a3bfc69b8a903e062729f861108ffecd3.

* re-implemented

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* align with transformers code style

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix typo

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* change SdpaAttention to DiffSdpaAttention

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bug

* Update src/transformers/models/diffllama/modeling_diffllama.py

resolve "not same outputs" problem

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bugs of places of "GroupNorm with scale" and etc

* Revert "fix bugs of places of "GroupNorm with scale" and etc"

This reverts commit 26307d92f6acd55e9fe89f2facff350f05760960.

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* remove missed type

* add diffllama model_doc

* apply make style/quality

* apply review comment about model

* apply review comment about test

* place diffllama alphabetically on the src/transformers/__init__.py

* fix forgot code

* Supports parameters that are not initialized with standard deviation 0 in the conventional method

* add DiffLlamaConfig to CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPOINT_CHECK on utils/check_config_docstrings.py

* remove unused property of config

* add to supported model list

* add to spda supported model list

* fix copyright, remove pretraining_tensor_parallel, and modify for initialization test

* remove unused import and etc.

* empty commit

* empty commit

* empty commit

* apply modular transformers but with bugs

* revert prev commit

* create src/transformers/model/diffllama/modular_diffllama.py

* run utils/modular_model_converter.py

* empty commit

* leaner modular diffllama

* remove more and more in modular_diffllama.pt

* remove more and more in modular_diffllama.pt

* resolve missing docstring entries

* force reset

* convert modular

---------

Co-authored-by: Minho Ryu <ryumin93@gmail.com>
2025-01-07 11:34:56 +01:00
ed73ae210b NPU support SDPA (#35165)
Co-authored-by: root <weichunyude@163.com>
2025-01-07 11:30:05 +01:00
02ed609285 Replace tokenizer to processing_class in Seq2SeqTrainer (#35452) 2025-01-07 09:51:12 +00:00
9fd123ac31 ci: mark model_parallel tests as cuda specific (#35269)
`parallelize()` API is deprecated in favor of accelerate's `device_map="auto"`
and therefore is not accepting new features. At the same time `parallelize()`
implementation is currently CUDA-specific. This commit marks respective
ci tests with `@require_torch_gpu`.

Fixes: #35252

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-01-07 10:16:34 +01:00
bd442c6d3a Zamba new attention standard (#35375)
* updated zamba to new attention standard

* make fixup fixes
2025-01-07 10:08:45 +01:00
12ba96aa3c [Dinov2 with Registers] Some fixes (#35411)
* First draft

* Thanks claude

* Remove print statement

* Use torch_int

* Address comments

* Address comment
2025-01-06 21:10:59 +01:00
ca00950057 added logic for deleting adapters once loaded (#34650)
* added logic for deleting adapters once loaded

* updated to the latest version of transformers, merged utility function into the source

* updated with missing check

* added peft version check

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* changes according to reviewer

* added test for deleting adapter(s)

* styling changes

* styling changes in test

* removed redundant code

* formatted my contributions with ruff

* optimized error handling

* ruff formatted with correct config

* resolved formatting issues

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-01-06 18:36:40 +00:00
1650e0e514 Fixed typo in Llama configuration docstring (#35520)
Update configuration_llama.py

There is no `num_heads` parameter, only `num_attention_heads`
2025-01-06 09:54:08 -08:00
3b1be043cd 🌐 [i18n-KO] Remove duplicates in toctree (#35496)
fix(docs): remove duplicates in toctree
2025-01-06 09:14:22 -08:00
3951da1a6b [GGUF] Refactor and decouple gguf checkpoint loading logic (#34385)
* draft load_gguf refactor

* update

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove llama mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove qwen2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove unused function

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate stablelm mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate phi3 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate t5 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate bloom mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix bloom

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate starcoder2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate gpt2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mistral mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate nemotron mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mamba mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mamba mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix mamba

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix qwen2moe

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove qwen2moe mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* clean up

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove falcon 7b map

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove all ggml tensors mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* add comments

Signed-off-by: Isotr0py <2037008807@qq.com>

* update messages

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix tensors in parsed parameters

Signed-off-by: Isotr0py <2037008807@qq.com>

* add gguf check

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-06 18:02:38 +01:00
86fa3cedad Bump jinja2 from 3.1.4 to 3.1.5 in /examples/research_projects/decision_transformer (#35408)
Bump jinja2 in /examples/research_projects/decision_transformer

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-06 16:58:29 +00:00
44a26c871c Update llm_optims docs for sdpa_kernel (#35481)
update: use sdpa_kernel
2025-01-06 08:54:31 -08:00
18e896bd8f 🌐 [i18n-KO] Translated altclip.md to Korean (#34594)
* docs: ko: model_doc/timesformer.md

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>

* Update docs/source/ko/model_doc/altclip.md

* add snippet

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
2025-01-06 08:45:26 -08:00
a821b9c7ab Add check for if num_items_in_batch is not None (#35102) 2025-01-06 10:11:21 -05:00
203e978826 Add position_ids in XLMRobertaXLForCausalLM.prepare_inputs_for_generation (#35044)
* fix

* fix

* cleanup

* style

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-06 16:10:21 +01:00
c451a72cd7 Add French translation of task_summary and tasks_explained (#33407)
* Add French translation of task_summary and tasks_explained

---------

Co-authored-by: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com>
2025-01-06 14:23:52 +01:00
9895f7df81 Idefics: fix docstring (#35079)
nit: fix docstring
2025-01-06 10:58:04 +01:00
32aa2db04a Fix Llava conversion for models that use safetensors to store weights (#35406)
* fix llava-med-v1.5-mistral-7b conversion

Signed-off-by: Isotr0py <2037008807@qq.com>

* add weights_only=True

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-06 09:59:38 +01:00
b2f2977533 Applies the rest of the init refactor except to modular files (#35238)
* [test_all] Applies the rest of the init refactor except to modular files

* Revert modular that doesn't work

* [test_all] TFGPT2Tokenizer
2025-01-05 18:30:08 +01:00
e5fd865eba Add Gemma2 GGUF support (#34002)
* initial setup for ggml.py

* initial setup of GGUFGemma2Converter class

* Add gemma2 model to gguf.md doc

* Partial work on GGUF_TENSOR_MAPPING

* initial setup of GGUF_TENSOR_MAPPING for Gemma2

* refactor: rename GemmaConvert class to GemmaConverter for naming consistency

* feat: complete gemma2 tensor mapping implementation

* feat: add initial implementation of GGUFGemmaConverter

* feat: complete GGUFGemmaConverter implementation

* feat: add test code for gemma2

* refactor: minor code cleanup

* refactor: minor code cleanup

* fix: resolve suggestions

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2025-01-03 14:50:07 +01:00
1fe2d53d4e Reuse "if not" logic in image_processing. (#35405) 2025-01-03 14:44:57 +01:00
30a9971632 Use sdpa_kernel in tests (#35472)
* update: use sdpa_kernel

* update: rerun test
2025-01-03 14:39:52 +01:00
cba49cb2a6 Change is_soundfile_availble to is_soundfile_available (#35030) 2025-01-03 14:37:42 +01:00
42865860ec Fix paligemma warning message (#35486)
fix log input
2025-01-02 11:36:53 +01:00
b2b04e86e7 Fix docs typos. (#35465)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-01-02 11:29:46 +01:00
6b1e86fd4d Fix new BNB test failures (#35345) 2025-01-02 11:24:52 +01:00
5b516b06c8 Reintroduce Python 3.9 support for ModernBERT (#35458)
Co-authored-by: Koichi Yasuoka <yasuoka@kanji.zinbun.kyoto-u.ac.jp>
2025-01-02 11:23:07 +01:00
919220dab1 Update translated docs for sdpa_kernel (#35461)
* docs: update sdpa_kernel for translation

* fix: nn.attention

* update: infer many
2024-12-31 08:37:58 -08:00
eb2b452432 [i18n-ar] Translated file: docs/source/ar/tasks/summarization.md into Arabic (#35195)
* إضافة الترجمة العربية: summarization.md

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-31 08:35:54 -08:00
d5aebc6465 [i18n-ar] Translated file: docs/source/ar/tasks/question_answering.md into Arabic (#35196)
* إضافة الترجمة العربية: question_answering.md

* Update question_answering.md

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-30 11:56:05 -08:00
b5f97977ed Update docs for sdpa_kernel (#35410)
update: sdp_kernel -> sdpa_kernel
2024-12-30 09:50:34 -08:00
5cabc75b4b Add compute_loss_func to Seq2SeqTrainer (#35136) 2024-12-29 15:01:35 +01:00
90f256c90c Update perf_infer_gpu_one.md: fix a typo (#35441) 2024-12-29 14:57:08 +01:00
5c75087aee Fix model_accepts_loss_kwargs for timm model (#35257)
* Fix for timm model

* Add comment
2024-12-27 16:33:44 +00:00
3b0a94ef9e Fix f-string to show ACCELERATE_MIN_VERSION on error (#35189)
fix f-string to show ACCELERATE_MIN_VERSION on error

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-27 13:21:44 +01:00
f63da20a9f CLIP conversion script - Change fairseq to OpenAI (#35384)
Change fairseq to OpenAI
2024-12-27 13:12:32 +01:00
7f97d01675 Fix: Rename keyword argument in_channels to num_channels (#35289)
Fix: Rename keyword argument in_channels to num_channels in some default backbone configs
2024-12-27 13:07:31 +01:00
4eb17b26e7 Drop inplace operation for loss computation with gradient accumulation (#35416)
Fix inplace loss computation
2024-12-26 14:58:53 +01:00
24c91f095f [GPTQ, CompressedTensors] Fix unsafe imports and metada check (#34815)
* fix gptq creation when optimum is not installed + fix metadata checking

* fix compressed tensors as well

* style

* pray for ci luck on flaky tests :prayge:

* trigger ci

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-24 19:32:44 +01:00
6e0515e99c Add DINOv2 with registers (#35348)
* added changes from 32905

* fixed mistakes caused by select all paste

* rename diff_dinov2...

* ran tests

* Fix modular

* Fix tests

* Use new init

* Simplify drop path

* Convert all checkpoints

* Add figure and summary

* Update paths

* Update docs

* Update docs

* Update toctree

* Update docs

---------

Co-authored-by: BernardZach <bernardzach00@gmail.com>
Co-authored-by: Zach Bernard <132859071+BernardZach@users.noreply.github.com>
2024-12-24 13:21:59 +01:00
d8c1db2f56 enable non-cuda awq model support without modify version (#35334)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2024-12-24 12:36:00 +01:00
ccc4a5a59b Disable .github/workflows/self-comment-ci.yml for now (#35366)
* disable

* disable

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-24 10:53:57 +01:00
93aafdc620 Add compile test for fast image processor (#35184)
* add compile test for fast image processor

* override pixtral test
2024-12-23 13:12:45 -05:00
82fcac0a7e Adding logger.info about update_torch_dtype in some quantizers (#35046)
adding logger.info
2024-12-23 17:01:00 +01:00
a1780b7ba5 bugfix Idefics3 processor - handle gracefully cases with text and no images (#35363)
* bugfix processing empty images

* fix

* fix

* Update src/transformers/models/idefics3/processing_idefics3.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* adding tests

* fix

* fix

* fix

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-23 16:59:01 +01:00
64c05eecd6 HIGGS Quantization Support (#34997)
* higgs init

* working with crunches

* per-model workspaces

* style

* style 2

* tests and style

* higgs tests passing

* protecting torch import

* removed torch.Tensor type annotations

* torch.nn.Module inheritance fix maybe

* hide inputs inside quantizer calls

* style structure something

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* reworked num_sms

* Update src/transformers/integrations/higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* revamped device checks

* docstring upd

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* edited tests and device map assertions

* minor edits

* updated flute cuda version in docker

* Added p=1 and 2,3bit HIGGS

* flute version check update

* incorporated `modules_to_not_convert`

* less hardcoding

* Fixed comment

* Added docs

* Fixed gemma support

* example in docs

* fixed torch_dtype for HIGGS

* Update docs/source/en/quantization/higgs.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Collection link

* dequantize interface

* newer flute version, torch.compile support

* unittest message fix

* docs update compile

* isort

* ValueError instead of assert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-23 16:54:49 +01:00
ef1f54a0a7 add bnb support for Ascend NPU (#31512)
* add bnb support for Ascend NPU

* delete comment
2024-12-23 16:36:16 +01:00
59178780a6 Fix : VPTQ test (#35394)
fix_test
2024-12-23 16:27:46 +01:00
3a4ced9ab4 Fix typing in docstring for PaliGemmaProcessor (#35278)
Updated typing for `tokenizer` in the `PaliGemmaProcessor` to be `GemmaTokenizerFast` instead of `LlamaTokenizerFast`
2024-12-23 16:22:04 +01:00
3cd3cd50ac Scale loss before backward (#35207) 2024-12-23 16:16:38 +01:00
f5264a86ee Deprecate _is_quantized_training_enabled (#34991)
deperecate

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-23 15:51:31 +01:00
e10be82b71 uniformize kwargs for SAM (#34578)
* Make kwargs uniform for SAM

* Remove unused attribute

* Make point_pad_value part of image_kwargs

* Update annotations

* Code review - use existing methods

* Use ProcessorTesterMixin

* Do not add ProcessorTesterMixin everywhere
2024-12-23 13:54:57 +01:00
2bb60982ac Patch GPTNeoX to use adequate FA2 if position_ids is provided (#35318) 2024-12-23 13:45:55 +01:00
5e7aedebeb make LlamaModel._update_causal_mask torch compilable (#35187)
* make LlamaModel._update_causal_mask torch compilable

* chore: lint (make fix-copies)

* fix-copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-12-23 13:10:00 +01:00
401aa39d7b bitsandbytes: simplify 8bit dequantization (#35068) 2024-12-23 13:04:59 +01:00
05260a1fc1 Fix new FA2 if is_causal is passed explicitly (#35390)
* fix

* Update modeling_decision_transformer.py

* Update flash_attention.py
2024-12-22 20:00:07 +01:00
8f38f58f3d owlvit/2 dynamic input resolution (#34764)
* owlvit/2 dynamic input resolution.

* adapt box grid to patch_dim_h patch_dim_w

* fix ci

* clarify variable naming

* clarify variable naming..

* compute box_bias dynamically inside box_predictor

* change style part of code

* [run-slow] owlvit, owlv2
2024-12-21 08:51:09 +00:00
608e163b52 [docs] Follow up register_pipeline (#35310)
example json
2024-12-20 09:22:44 -08:00
UV
94fe0b915b Improved Documentation Of Audio Classification (#35368)
* Improved Documentation Of Audio Classification

* Updated documentation as per review

* Updated audio_classification.md

* Update audio_classification.md
2024-12-20 09:17:28 -08:00
c96cc039c3 Improve modular transformers documentation (#35322)
* Improve modular transformers documentation

- Adds hints to general contribution guides
- Lists which utils scripts are available to generate single-files from modular files and check their content

* Show commands in copyable code cells

---------

Co-authored-by: Joel Koch <joel@bitcrowd.net>
2024-12-20 09:16:02 -08:00
504c4d3692 Make test_generate_with_static_cache even less flaky (#34995)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 16:03:26 +01:00
0fc2970363 Use weights_only=True with torch.load for transfo_xl (#35241)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 15:40:55 +01:00
6fae2a84ae Update test fetcher when we want to test all (#35364)
* [test-all]

* style

* [test-all]

* [test_all]

* [test_all]

* style
2024-12-20 15:10:43 +01:00
34ad1bd287 update codecarbon (#35243)
* update codecarbon

* replace directly-specified-test-dirs with tmp_dir

* Revert "replace directly-specified-test-dirs with tmp_dir"

This reverts commit 310a6d962ec83db3f6d4f96daeeba5c6746f736c.

* revert the change of .gitignore

* Update .gitignore

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2024-12-20 15:04:36 +01:00
40292aa4e9 bugfix: torch.export failure caused by _make_causal_mask (#35291)
* bugfix: torch.export failure caused by `_make_causal_mask`

Recent changes in torch dynamo prevent mutations on tensors converted with aten::_to_copy. To address this, we can clone such tensor before performing in-place operation `masked_fill_` only when the code is being compiled by torch dynamo.
(relevant issue: https://github.com/pytorch/pytorch/issues/127571)

* chore: use `is_torchdynamo_compiling` instead of `torch._dynamo.is_compiling`
2024-12-20 14:37:04 +01:00
05de764e9c Aurevoir PyTorch 1 (#35358)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 14:36:31 +01:00
4567ee8057 fix zoedepth initialization error under deepspeed zero3 (#35011)
fix zoe bug in deepspeed zero3
2024-12-20 11:42:40 +00:00
c3a43594b7 Add Tensor Parallel support for Qwen2VL (#35050)
feat: add parallel support for qwen2vl
2024-12-20 12:40:38 +01:00
0d51d65905 Cleaner attention interfaces (#35342)
* cleaner attention interfaces

* correctly set the _attn_implementation when adding other functions to it

* update

* Update modeling_utils.py

* CIs
2024-12-20 12:09:34 +01:00
eafbb0eca7 Implement AsyncTextIteratorStreamer for asynchronous streaming (#34931)
* Add AsyncTextIteratorStreamer class

* export AsyncTextIteratorStreamer

* export AsyncTextIteratorStreamer

* improve docs

* missing import

* missing import

* doc example fix

* doc example output fix

* add pytest-asyncio

* first attempt at tests

* missing import

* add pytest-asyncio

* fallback to wait_for and raise TimeoutError on timeout

* check for TimeoutError

* autodoc

* reorder imports

* fix style

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-20 12:08:12 +01:00
b5a557e5fe Reduce CircleCI usage (#35355)
* reduce 1

* reduce 1

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 10:18:15 +01:00
4e27a4009d FEAT : Adding VPTQ quantization method to HFQuantizer (#34770)
* init vptq

* add integration

* add vptq support

fix readme

* add tests && format

* format

* address comments

* format

* format

* address comments

* format

* address comments

* remove debug code

* Revert "remove debug code"

This reverts commit ed3b3eaaba82caf58cb3aa6e865d98e49650cf66.

* fix test

---------

Co-authored-by: Yang Wang <wyatuestc@gmail.com>
2024-12-20 09:45:53 +01:00
5a2aedca1e [Mamba2] Fix caching, slow path, and multi-gpu (#35154)
* fixup mamba2 - caching and several other small fixes

* fixup cached forward

* correct fix this time

* fixup cache - we do not need to extend the attn mask it's handled by generate (gives total ids + mask at each step)

* remove unnecessary (un)squeeze

* fixup cache position

* simplify a few things

* [run-slow] mamba2

* multi gpu attempt two

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* add newer slow path fix

* [run-slow] mamba2
2024-12-20 09:27:47 +01:00
ff9141bb85 fix onnx export of speech foundation models (#34224)
* added expanded attention/padding masks prior to indexing the hidden_states

* consistency fix in WavLMForSequenceClassification

---------

Co-authored-by: Nikos Antoniou <nikosantoniou@Nikos-MacBook-Pro.local>
2024-12-20 09:22:05 +01:00
f42084e641 [docs] Add link to ModernBERT Text Classification GLUE finetuning script (#35347)
Add link to ModernBERT Text Classification GLUE finetuning script
2024-12-19 14:45:52 -08:00
0ade1caa35 Modernbert Release Fixes (#35344)
* fix ForSequenceClassification

* unmodularize rope layer

* fix linting warning

* Avoid complex PoolingHead, only one prediction head needed

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2024-12-19 17:22:37 +01:00
1fa807fa63 Fix some fa2 tests (#35340)
* remove fa2 test

* remove other failing tests

* style
2024-12-19 17:05:25 +01:00
667ed5635e Add ModernBERT to Transformers (#35158)
* initial cut of modernbert for transformers

* small bug fixes

* fixes

* Update import

* Use compiled mlp->mlp_norm to match research implementation

* Propagate changes in modular to modeling

* Replace duplicate attn_out_dropout in favor of attention_dropout

cc @warner-benjamin let me know if the two should remain separate!

* Update BOS to CLS and EOS to SEP

Please confirm @warner-benjamin

* Set default classifier bias to False, matching research repo

* Update tie_word_embeddings description

* Fix _init_weights for ForMaskedLM

* Match base_model_prefix

* Add compiled_head to match research repo outputs

* Fix imports for ModernBertForMaskedLM

* Just use "gelu" default outright for classifier

* Fix config name typo: initalizer -> initializer

* Remove some unused parameters in docstring. Still lots to edit there!

* Compile the embeddings forward

Not having this resulted in very slight differences - so small it wasn't even noticed for the base model, only for the large model.

But the tiny difference for large propagated at the embedding layer through the rest of the model, leading to notable differences of ~0.0084 average per value, up to 0.2343 for the worst case.

* Add drafts for ForSequenceClassification/ForTokenClassification

* Add initial SDPA support (not exactly equivalent to FA2 yet!)

During testing, FA2 and SDPA still differ by about 0.0098 per value in the token embeddings. It still predicts the correct mask fills, but I'd like to get it fully 1-1 if possible.

* Only use attention dropout if training

* Add initial eager attention support (also not equivalent to FA2 yet!)

Frustratingly, I also can't get eager to be equivalent to FA2 (or sdpa), but it does get really close, i.e. avg ~0.010 difference per value.

Especially if I use fp32 for both FA2&eager, avg ~0.0029 difference per value

The fill-mask results are good with eager.

* Add initial tests, output_attentions, output_hidden_states, prune_heads

Tests are based on BERT, not all tests pass yet: 23 failed, 79 passed, 100 skipped

* Remove kwargs from ModernBertForMaskedLM

Disable sparse_prediction by default to match the normal HF, can be enabled via config

* Remove/adjust/skip improper tests; warn if padding but no attn mask

* Run formatting etc.

* Run python utils/custom_init_isort.py

* FlexAttention with unpadded sequences(matches FA2 within bf16 numerics)

* Reformat init_weights based on review

* self -> module in attention forwards

* Remove if config.tie_word_embeddings

* Reformat output projection on a different line

* Remove pruning

* Remove assert

* Call contiguous() to simplify paths

* Remove prune_qkv_linear_layer

* Format code

* Keep as kwargs, only use if needed

* Remove unused codepaths & related config options

* Remove 3d attn_mask test; fix token classification tuple output

* Reorder: attention_mask above position_ids, fixes gradient checkpointing

* Fix usage if no FA2 or torch v2.5+

* Make torch.compile/triton optional

Should we rename 'compile'? It's a bit vague

* Separate pooling options into separate functions (cls, mean) - cls as default

* Simplify _pad_modernbert_output, remove unused labels path

* Update tied weights to remove decoder.weight, simplify decoder loading

* Adaptively set config.compile based on hf_device_map/device/resize, etc.

* Update ModernBertConfig docstring

* Satisfy some consistency checks, add unfinished docs

* Only set compile to False if there's more than 1 device

* Add docstrings for public ModernBert classes

* Dont replace docstring returns - ends up being duplicate

* Fix mistake in toctree

* Reformat toctree

* Patched FlexAttention, SDPA, Eager with Local Attention

* Implement FA2 -> SDPA -> Eager attn_impl defaulting, crucial

both to match the original performance, and to get the highest inference speed without requiring users to manually pick FA2

* Patch test edge case with Idefics3 not working with 'attn_implementation="sdpa"'

* Repad all_hidden_states as well

* rename config.compile to reference_compile

* disable flex_attention since it crashes

* Update modernbert.md

* Using dtype min to mask in eager

* Fully remove flex attention for now

It's only compatible with the nightly torch 2.6, so we'll leave it be for now. It's also slower than eager/sdpa.

Also, update compile -> reference_compile in one more case

* Call contiguous to allow for .view()

* Copyright 2020 -> 2024

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update/simplify __init__ structure

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove "... if dropout_prob > 0 else identity"

As dropout with 0.0 should be efficient like identity

* re-use existing pad/unpad functions instead of creating new ones

* remove flexattention method

* Compute attention_mask and local_attention_mask once in modeling

* Simplify sequence classification prediction heads, only CLS now

Users can make custom heads if they feel like it

Also removes the unnecessary pool parameter

* Simplify module.training in eager attn

* Also export ModernBertPreTrainedModel

* Update the documentation with links to finetuning scripts

* Explain local_attention_mask parameter in docstring

* Simplify _autoset_attn_implementation, rely on super()

* Keep "in" to initialize Prediction head

Doublechecked with Benjamin that it's correct/what we used for pretraining

* add back mean pooling

* Use the pooling head in TokenClassification

* update copyright

* Reset config._attn_implementation_internal on failure

* Allow optional attention_mask in ForMaskedLM head

* fix failing run_slow tests

* Add links to the paper

* Remove unpad_no_grad, always pad/unpad without gradients

* local_attention_mask -> sliding_window_mask

* Revert "Use the pooling head in TokenClassification"

This reverts commit 99c38badd1dbce01d7aef41095fbf2f5cce87279.

There was no real motivation, no info on whether having this bigger head does anything useful.

* Simplify pooling, 2 options via if-else

---------

Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Said Taghadouini <taghadouinisaid@gmail.com>
Co-authored-by: Benjamin Clavié <ben@clavie.eu>
Co-authored-by: Antoine Chaffin <ant54600@hotmail.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-19 14:03:35 +01:00
56ff1e92fd PaliGemma: Make sure to add <eos> to suffix if <image> is present in text (#35201)
Move suffix processing code to out of if statement
2024-12-19 09:53:48 +01:00
4592cc9e98 Update comment CI bot (#35323)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-19 09:45:27 +01:00
d19b11f59b Fix documentation for ColPali (#35321)
* docs: fix typo quickstart snippet in ColPali's model card

* docs: clean the ColPali's model card

* docs: make the `ColPaliForRetrieval`'s docstring more concise

* docs: add missing bash command used to convert weights for `vidore/colpali-v1.3-hf`
2024-12-19 09:08:28 +01:00
9613933b02 Add the Bamba Model (#34982)
* initial commit for PR

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>

* rename dynamic cache

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add more unit tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Add modular bamba file

* Remove trainer changes from unrelated PR

* Modify modular and cofig to get model running

* Fix some CI errors and beam search

* Fix a plethora of bugs from CI/docs/etc

* Add bamba to models with special caches

* Updat to newer mamba PR for mamba sublayer

* fix test_left_padding_compatibility

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix remaining tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* missed this test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* ran make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* move slow tag to integration obj

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* address comments

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* left out one part of modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* change model

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Make Rotary modular as well

* Update bamba.md

Added overview, update Model inference card and added config

* Update bamba.md

* Update bamba.md

* Update bamba.md

Minor fixes

* Add docs for config and model back

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Add warning when using fast kernels

* replaced generate example

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Address comments from PR

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Propagate attention fixes

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix attention interfaces to the new API

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix API for decoder layer

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Remove extra weights

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

---------

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
2024-12-18 20:18:17 +01:00
9a94dfe123 feat: add benchmarks_entrypoint.py (#34495)
* feat: add `benchmarks_entrypoint.py`

Adding `benchmarks_entrypoint.py` file, which will be run from the
benchmarks CI.

This python script will list all python files from the `benchmark/`
folder and run the included `run_benchmark` function, allowing people to
add new benchmarks scripts.

* feat: add `MetricsRecorder`

* feat: update dashboard

* fix: add missing arguments to `MetricsRecorder`

* feat: update dash & add datasource + `default.yml`

* fix: move responsibility to create `MetricsRecorder` in bench script

* fix: update incorrect datasource UID

* fix: incorrect variable values

* debug: benchmark entrypoint script

* refactor: update log level

* fix: update broken import

* feat: add debug log in `MetricsRecorder`

* debug: set log level to debug

* fix: set connection `autocommit` to `True`
2024-12-18 18:59:07 +01:00
2c47618c1a 🚨All attention refactor🚨 (#35235)
* refactor LlamaAttention

* minimal changes

* fix llama

* update

* modular gemmas

* modular nits

* modular updates

* nits

* simplify

* gpt2

* more modualr and fixes

* granite

* modular modular modular

* nits

* update

* qwen2 + starcoder2

* mostly gemma2

* Update image_processing_auto.py

* fix

* Update modular_starcoder2.py

* fix

* remove all copied from attentions

* remove gcv

* make fix-copies

* oups

* oups2.0

* fix some modulars + all copied from

* should be good now

* revert unwanted changes

* Update modeling_decision_transformer.py

* finish cleanup

* Update modeling_olmo.py

* consistency

* re-add gradient checkpointing attribute

* fix

* style

* make config necessary

* bis

* bis

* Update modeling_my_new_model2.py

* is_causal attr

* fix

* remove past kv return from decoder layer

* fix

* default rope config

* correctly fix rope config

* fix bias

* fix gpt2 attention output

* fix test

* fix inits

* fix default sdpa

* fix default sdpa implementation

* harmonize classes

* fix mistral

* fix sliding window models

* mixtral

* be more explicit

* style

* fix

* several fixes

* Update modeling_dbrx.py

* fix test

* olmo + phi

* rotary

* syle

* phi

* phi again

* again

* kwargs

* Update test_modeling_common.py

* skip fx tracing tests

* Update modeling_utils.py

* gemma 2

* again

* Update modeling_recurrent_gemma.py

* gemma2

* granite

* style

* starcoder

* Update sdpa_attention.py

* switch args

* Update modeling_mllama.py

* fix

* cache type tests

* gpt2

* Update test_modeling_common.py

* fix

* consistency

* fix shape with encoder

* should be the last one

* tests non model

* most comments

* small oupsi

* be more explicit in modulars

* more explicit modulars

* CIs! it works locally

* add kwargs to _flash_attention_forward

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00
75be5a0a5b [Whisper] fix docstrings typo (#35319)
typos docstring
2024-12-18 16:38:19 +01:00
69e31eb1bf change bnb tests (#34713)
* fix training tests

* fix xpu check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm pdb

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix 4bit logits check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix 4bit logits check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add xpu check on int8 training

* fix training tests

* add llama test on bnb

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* only cpu and xpu disable autocast training

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
2024-12-18 09:49:59 -05:00
da334bcfa8 [Whisper] 🚨 Fix whisper decoding 🚨 (#34135)
* do not remove decoder_input_ids for the first segment

* do not remove eos token in generate_with_fallback

* when removing padding tokens, do not remove eos token

* remove eos token in generate (and not in generate_with_fallback!)

* reconciliate short-from/ long-form behavior

* correct avg_logprobs calculation

* handle eos token in segments

* handle decoder_input_ids and eos token in _prepare_decoder_input_ids

* fix incorrect time precision

* always remove eos token

* always remove decoder_input_ids

* no need to handle decoder_inputs_ids and eos token

* no need to remove decoder_input_ids

* no need to handle eos token

* fix num_beams in _retrieve_logit_processors

* remove todo unconsistency

* no need to add eos token

* last_timestamp_pos should indeed be timestamp token pos

* patch generate to enable compatibility with GenerationTesterMixin tests

* adapt test_generate_continue_from_past_key_values

* adapt test_prompt_lookup_decoding_matches_greedy_search

* adapt generic GenerationMixin tests to whisper's generate

* fix speculative decoding

* fix

* [run-slow] whisper

* change HF_HUB_TOKEN for require_read_token

* [run-slow] whisper

* prioritize kwargs over generation_config

* remove unnecessary args

* [run-slow] whisper

* update tests

* [run-slow] whisper

* add comment

* update test

* [run-slow] whisper

* update test + revert require_read_token

* docstring updates

* revert tokenizer decode args change

* do not use a patch + docstring updates

* [run-slow] whisper

* make

* [run-slow] whisper

* add a flag to force unique call to generate

* test update

* [run-slow] whisper

* add force_unique_generate_call arg

* do not use a patch

* correct the timestamps for the pad tokens

* docstring update

* docstring update

* docstring update

* upodate TF tests

* add require_read_token

* [run-slow] whisper

* test reset dynamo

* [run-slow] whisper

* fix

* [run-slow] whisper

* avoid iterating twice on current_segments

* [run-slow] whisper

* [run-slow] whisper

---------

Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 14:13:21 +01:00
f1b7634fc8 Trigger GitHub CI with a comment on PR (#35211)
* fix

* fix

* comment

* final

* final

* final

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 13:56:49 +01:00
c7e48053aa [tests] make cuda-only tests device-agnostic (#35222)
fix cuda-only tests
2024-12-18 10:14:22 +01:00
1eee1cedfd Fix loading with only state dict and low_cpu_mem_usage = True (#35217)
* fix loading with only state dict and config

* style

* add tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-18 09:54:32 +01:00
0531d7513b [docs] Improve register_pipeline (#35300)
register_pipeline
2024-12-17 10:27:23 -08:00
UV
77080f023f Fixed typo in audio_classification.md (#35305) 2024-12-17 09:45:51 -08:00
8bfd7eeeef Add Cohere2 docs details (#35294)
* Add Cohere2 docs details

* Update docs/source/en/model_doc/cohere2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-17 09:36:31 -08:00
a7feae190f Fix remove unused parameter in docs (#35306)
remove unused parameter in example

Co-authored-by: zzzzzsa <zzzzzsaqwq@gmail.com>
2024-12-17 09:34:41 -08:00
927c3e39ec Fix image preview in multi-GPU inference docs (#35303)
fix: link for img
2024-12-17 09:33:50 -08:00
4302b27719 Fix typos in translated quicktour docs (#35302)
* fix: quicktour typos

* fix: one more
2024-12-17 09:32:00 -08:00
deac971c46 🚨🚨🚨 Limit backtracking in Nougat regexp (#35264)
* Limit backtracking in regexp

* Update

* [run-slow] nougat

* Update
2024-12-17 16:34:18 +00:00
d29a06e39a remove benchmark job in push-important-models.yml (#35292)
remove-bench

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-17 17:27:26 +01:00
e0ae9b5974 🚨🚨🚨 Delete conversion scripts when making release wheels (#35296)
* Delete conversion scripts when making release wheels

* make fixup

* Update docstring
2024-12-17 14:18:42 +00:00
6eb00dd2f0 Support for SDPA for SAM models (#34110)
* feat: add support for sdpa and gradient checkpointing

* fix: ruff format

* fix: config sdpa

* fix: sdpa layer naming convention

* fix: update test_eager_matches_sdpa_inference to handle vision_hidden_states

* test: skip incompatible tests and fix loading issue with sdpa

- Updated tests to skip cases flash and dynamic compile.
- Minor adjustment to ensure correct loading of model with sdpa for dispatch test.

* style: apply Ruff formatting

* ruff fix again after rebase

* [run-slow] sam

* [run-slow] sam

* refactor: Address review comments and improve sub-config handling in SAM model tests

- Added attributes for sub_configs as per PR #34410.
- Enabled tests for configs, ensuring the composite model (SAM) has several sub-configs in the main config.
- Added class attribute _is_composite=True to the tester class
- test_sdpa_can_dispatch_composite_models added

* [run-slow] sam

* style: ruff

* [run-slow] sam

* style: ruff again ...

* [run-slow] sam
2024-12-17 14:46:05 +01:00
747f361da1 Add sdpa for Beit (#34941)
* Add sdpa for Beit

* Updates

* [run-slow] beit

* Update inference benchmarks

* Update

* Fix - add missed to super().forward()

* Updates

* Fix missing import
2024-12-17 14:44:47 +01:00
6c08b3b6e5 Add Falcon3 documentation (#35307)
* Add Falcon3 documentation

* Update Falcon3 documentation

* Change Falcon to Falcon3

* Update docs and run make fix-copies

* Add blog post and huggingface models links
2024-12-17 14:23:13 +01:00
f33a0cebb3 Add ColPali to 🤗 transformers (#33736)
* feat: run `add-new-model-like`

* feat: add paligemma code with "copied from"

* feat: add ColPaliProcessor

* feat: add ColPaliModel

* feat: add ColPaliConfig

* feat: rename `ColPaliForConditionalGeneration` to `ColPaliModel`

* fixup modeling colpali

* fix: fix root import shortcuts

* fix: fix `modeling_auto` dict

* feat: comment out ColPali test file

* fix: fix typos from `add-new-model-like`

* feat: explicit the forward input args

* feat: move everything to `modular_colpali.py`

* fix: put back ColPaliProcesor

* feat: add auto-generated files

* fix: run `fix-copies`

* fix: remove DOCStRING constants to make modular converter work

* fix: fix typo + modular converter

* fix: add missing imports

* feat: no more errors when loading ColPaliModel

* fix: remove unused args in forward + tweak doc

* feat: rename `ColPaliModel` to `ColPaliForRetrieval`

* fix: apply `fix-copies`

* feat: add ColPaliProcessor to `modular_colpali`

* fix: run make quality + make style

* fix: remove duplicate line in configuration_auto

* feat: make ColPaliModel inehrit from PaliGemmaForConditionalGeneration

* fix: tweak and use ColPaliConfig

* feat: rename `score` to `post_process_retrieval`

* build: run modular formatter + make style

* feat: convert colpali weights + fixes

* feat: remove old weight converter file

* feat: add and validate tests

* feat: replace harcoded path to "vidore/colpali-v1.2-hf" in tests

* fix: add bfloat16 conversion in weight converter

* feat: replace pytest with unittest in modeling colpali test

* feat: add sanity check for weight conversion (doesn't work yet)

* feat: add shape sanity check in weigth converter

* feat: make ColPaliProcessor args explicit

* doc: add doc for ColPali

* fix: trying to fix output mismatch

* feat: tweaks

* fix: ColPaliModelOutput inherits from ModelOutput instead of PaliGemmaCausalLMOutputWithPast

* fix: address comments on PR

* fix: adapt tests to the Hf norm

* wip: try things

* feat: add `__call__` method to `ColPaliProcessor`

* feat: remove need for dummy image in `process_queries`

* build: run new modular converter

* fix: fix incorrect method override

* Fix tests, processing, modular, convert

* fix tokenization auto

* hotfix: manually fix processor -> fixme once convert modular is fixed

* fix: convert weights working

* feat: rename and improve convert weight script

* feat: tweaks

* fest: remove `device` input for `post_process_retrieval`

* refactor: remove unused `get_torch_device`

* Fix all tests

* docs: update ColPali model doc

* wip: fix convert weights to hf

* fix logging modular

* docs: add acknowledgements in model doc

* docs: add missing docstring to ColPaliProcessor

* docs: tweak

* docs: add doc for `ColPaliForRetrievalOutput.forward`

* feat: add modifications from colpali-engine v0.3.2 in ColPaliProcessor

* fix: fix and upload colapli hf weights

* refactor: rename `post_process_retrieval` to `score_retrieval`

* fix: fix wrong typing for `score_retrieval`

* test: add integration test for ColPali

* chore: rerun convert modular

* build: fix root imports

* Update docs/source/en/index.md

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* fix: address PR comments

* wip: reduce the prediction gap in weight conversion

* docs: add comment in weight conversion script

* docs: add example for `ColPaliForRetrieval.forward`

* tests: change dataset path to the new one in hf-internal

* fix: colpali weight conversion works

* test: add fine-grained check for ColPali integration test

* fix: fix typos in convert weight script

* docs: move input docstring in a variable

* fix: remove hardcoded torch device in test

* fix: run the new modular refactor

* docs: fix python example for ColPali

* feat: add option to choose `score_retrieval`'s output dtype and device

* docs: update doc for `score_retrieval`

* feat: add `patch_size` property in ColPali model

* chore: run `make fix-copies`

* docs: update description for ColPali cookbooks

* fix: remove `ignore_index` methods

* feat: remove non-transformers specific methods

* feat: update `__init__.py` to new hf format

* fix: fix root imports in transformers

* feat: remove ColPali's inheritance from PaliGemma

* Fix CI issues

* nit remove prints

* feat: remove ColPali config and model from `modular_colpali.py`

* feat: add `ColPaliPreTrainedModel` and update modeling and configuration code

* fix: fix auto-removed imports in root `__init__.py`

* fix: various fixes

* fix: fix `_init_weight`

* temp: comment `AutoModel.from_config` for experiments

* fix: add missing `output_attentions` arg in ColPali's forward

* fix: fix `resize_token_embeddings`

* fix: make `input_ids` optional in forward

* feat: rename `projection_layer` to `embedding_proj_layer`

* wip: fix convert colpali weight script

* fix tests and convert weights from original repo

* fix unprotected import

* fix unprotected torch import

* fix style

* change vlm_backbone_config to vlm_config

* fix unprotected import in modular this time

* fix: load config from Hub + tweaks in convert weight script

* docs: move example usage from model docstring to model markdown

* docs: fix input docstring for ColPali's forward method

* fix: use `sub_configs` for ColPaliConfig

* fix: remove non-needed sanity checks in weight conversion script + tweaks

* fix: fix issue with `replace_return_docstrings` in ColPali's `forward`

* docs: update docstring for `ColPaliConfig`

* test: change model path in ColPali test

* fix: fix ColPaliConfig

* fix: fix weight conversion script

* test: fix expected weights for ColPali model

* docs: update ColPali markdown

* docs: fix minor typo in ColPaliProcessor

* Fix tests and add _no_split_modules

* add text_config to colpali config

* [run slow] colpali

* move inputs to torch_device in integration test

* skip test_model_parallelism

* docs: clarify quickstart snippet in ColPali's model card

* docs: update ColPali's model card

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-17 11:26:43 +01:00
a7f5479b45 fix modular order (#35297)
* fix modular ordre

* fix

* style
2024-12-17 08:05:35 +01:00
UV
f5620a7634 Improved documentation of Automatic speech recognition (#35268)
Improved documentation quality of Automatic speech recognition
2024-12-16 09:50:11 -08:00
eb92bc44b7 Fix wrongs in quicktour[zh] (#35272)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-16 09:23:34 -08:00
886f690e76 Translating "translate perf_infer_gpu_multi.md" to Chinese (#35271)
add "translate perf_infer_gpu_multi"
2024-12-16 09:22:35 -08:00
22834eeba1 Fix typos in Translated Audio Classification Docs (#35287)
* fix: qwen2 model ids

* fix: line

* fix: more format

* update: reformat

* fix: doc typos
2024-12-16 08:51:32 -08:00
9feae5fb01 [Whisper] patch float type on mps (#35295)
* fix float type on mps

* make
2024-12-16 16:52:47 +01:00
d5b81e1ca1 Delete redundancy for loop checks. (#35288)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-16 13:36:27 +00:00
d0f32212ed Temporarily disable amd push ci (#35293)
Temporarily disable amd push ci (reduce noise)
2024-12-16 14:18:50 +01:00
85eb339231 Fix : model used to test ggml conversion of Falcon-7b is incorrect (#35083)
fixing test model
2024-12-16 13:21:44 +01:00
14910281a7 Blip: fix offloading and MP tests (#35239)
* fix device map

* fix offloading + model parallel test
2024-12-16 12:44:33 +01:00
66531a1ec3 Aggeregate test summary files in CircleCI workflow runs (#34989)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* fix

* fix

* fix

* update

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-16 11:06:17 +01:00
5615a39369 Fall back to slow image processor in ImageProcessingAuto when no fast processor available (#34785)
* refactor image_processing_auto logic

* fix fast image processor tests

* Fix tests fast vit image processor

* Add safeguard when use_fast True and torchvision not available

* change default use_fast back to None, add warnings

* remove debugging print

* call get_image_processor_class_from_name once
2024-12-15 14:00:36 -05:00
ca03842cdc [i18n-Chinese] Translating perf_train_cpu.md to Chinese (#35242)
add "1"
2024-12-13 14:46:49 -08:00
add53e25ff don't use no_sync when deepspeed doesn't support it for certain zero stages (#35157)
* don't use no_sync when deepspeed doesn't support it for certain zero stages

* chore: lint

* fix no_sync context for deepspeed across all zero types

* chore: lint
2024-12-13 19:23:00 +01:00
7237b3ecfc Fix FSDP no longer working (#35212)
Fix FSDP failing
2024-12-13 19:20:51 +01:00
6009642459 Translating agents_advanced.md to Chinese (#35231)
add "translate agents_advanced"
2024-12-13 10:12:00 -08:00
UV
e94083bf90 Fixed typos in Audio Classification Documentation (#35263)
* Fixed typos in Audio Classification Documentation

* removed space in '8000 kHZ'

* Changes made as per review
2024-12-13 09:43:44 -08:00
bc6ae0d55e Update AMD docker image (rocm 6.1) (#35259)
* Use rocm 6.3 as base amd image and add nvidia-ml-py to exclude list

* Align rocm base image with torch wheels @6.1. Seems like the most stable combo
2024-12-13 15:41:03 +01:00
8096161b76 Use rsfE with pytest (#35119)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-13 14:36:22 +01:00
bdd4201fdb [tests] fix "Tester object has no attribute '_testMethodName'" (#34910)
* add more cases

* fix method not found in unittest

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>

* fix more cases

* add more models

* add all

* no unittest.case

* remove for oneformer

* fix style

---------

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-12-13 14:33:45 +01:00
3d213b57fe skip Fuyu from test_generate (#35246)
* skip Fuyu from test_generate

* make fixup, quality, repo-consistency
2024-12-13 10:12:49 +01:00
64478c7631 Add Cohere2 model (#35224) 2024-12-13 09:35:50 +01:00
e4e404fdd0 Run model as compressed/uncompressed mode (#34719)
* draft, run model as compreszed/uncompressed mode

* draft

* run run_compressed=False

* run_compressed as attr

* set run_compressed=False using quantization_config

* remove redundant line

* make is_qat_trainable dependent on run_compressed status

* add tests

* lint

* full in docstring

* add decompress

* comments

* decompress if model is compresssed and not run_compressed

* apply_quant_config logic fix -- populate statedict properly

* comments

* remove non  compressed model

* make is_compressed as property

* cosmetic

* run apply_quant_config for non-compressed models -- popualte scales and zeropoints

* add pahtway for decompressing sparse models

* typo on is_quantization_compressed

* lint

* fix typo
2024-12-13 08:23:31 +01:00
31f9a289a6 Fix typo in chat template example (#35250)
Fix template example typo
2024-12-12 16:53:21 -08:00
11ba1d472c [Init refactor] Modular changes (#35240)
* Modular changes

* Gemma

* Gemma
2024-12-12 19:23:28 +01:00
a691ccb0c2 Change back to Thread for SF conversion (#35236)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-12 16:05:04 +01:00
e3ee49fcfb Refactoring AssistedCandidateGenerator for Improved Modularity and Reusability (#35009)
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* NOTHING. add space to rerun github actions tests

* remove it...

* replace: `self.prev_tokens` -> `self.prev_assistant_ids`

* NOTHING. rerun CI tests

* remove it

* introduce `self.prev_target_ids_len`

* fix style

* fix style

---------

Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
2024-12-12 15:47:05 +01:00
63766abe36 Support Python 3.10+ Union style in chat template type hints parsing (#35103)
* fix(utils): Support the newest Union type in chat template

* fix(utils/chat_template): Backward compatibility for the newest Union type

* Update src/transformers/utils/chat_template_utils.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-12-12 14:07:06 +00:00
5cf11e5ab9 Fix type hints for apply_chat_template (#35216) 2024-12-12 13:59:24 +00:00
UV
3db8e27816 Fixed typo of 'indentifier' in audio_utils.py (#35226) 2024-12-12 13:45:04 +00:00
a9ccdfd8e3 docs: clarify initializer_range parameter description in Idefics3VisionConfig (#35215) 2024-12-11 11:26:18 -08:00
6181c6b095 Fix seamless TTS generate (#34968)
* fix seamless tts generate

* apply same fix for v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* remove TODO

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* ignore failing test on multigpus

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2
2024-12-11 15:38:42 +01:00
33c12e4d80 Fix CI (#35208)
fix aria
2024-12-11 14:24:52 +01:00
7d303efa5f Cleanup: continue the init refactor (#35170)
* Round 2

* Round 3
2024-12-11 14:12:34 +01:00
5fcf6286bf Add TimmWrapper (#34564)
* Add files

* Init

* Add TimmWrapperModel

* Fix up

* Some fixes

* Fix up

* Remove old file

* Sort out import orders

* Fix some model loading

* Compatible with pipeline and trainer

* Fix up

* Delete test_timm_model_1/config.json

* Remove accidentally commited files

* Delete src/transformers/models/modeling_timm_wrapper.py

* Remove empty imports; fix transformations applied

* Tidy up

* Add image classifcation model to special cases

* Create pretrained model; enable device_map='auto'

* Enable most tests; fix init order

* Sort imports

* [run-slow] timm_wrapper

* Pass num_classes into timm.create_model

* Remove train transforms from image processor

* Update timm creation with pretrained=False

* Fix gamma/beta issue for timm models

* Fixing gamma and beta renaming for timm models

* Simplify config and model creation

* Remove attn_implementation diff

* Fixup

* Docstrings

* Fix warning msg text according to test case

* Fix device_map auto

* Set dtype and device for pixel_values in forward

* Enable output hidden states

* Enable tests for hidden_states and model parallel

* Remove default scriptable arg

* Refactor inner model

* Update timm version

* Fix _find_mismatched_keys function

* Change inheritance for Classification model (fix weights loading with device_map)

* Minor bugfix

* Disable save pretrained for image processor

* Rename hook method for loaded keys correction

* Rename state dict keys on save, remove `timm_model` prefix, make checkpoint compatible with `timm`

* Managing num_labels <-> num_classes attributes

* Enable loading checkpoints in Trainer to resume training

* Update error message for output_hidden_states

* Add output hidden states test

* Decouple base and classification models

* Add more test cases

* Add save-load-to-timm test

* Fix test name

* Fixup

* Add do_pooling

* Add test for do_pooling

* Fix doc

* Add tests for TimmWrapperModel

* Add validation for `num_classes=0` in timm config + test for DINO checkpoint

* Adjust atol for test

* Fix docs

* dev-ci

* dev-ci

* Add tests for image processor

* Update docs

* Update init to new format

* Update docs in configuration

* Fix some docs in image processor

* Improve docs for modeling

* fix for is_timm_checkpoint

* Update code examples

* Fix header

* Fix typehint

* Increase tolerance a bit

* Fix Path

* Fixing model parallel tests

* Disable "parallel" tests

* Add comment for metadata

* Refactor AutoImageProcessor for timm wrapper loading

* Remove custom test_model_outputs_equivalence

* Add require_timm decorator

* Fix comment

* Make image processor work with older timm versions and tensor input

* Save config instead of whole model in image processor tests

* Add docstring for `image_processor_filename`

* Sanitize kwargs for timm image processor

* Fix doc style

* Update check for tensor input

* Update normalize

* Remove _load_timm_model function

---------

Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
2024-12-11 12:40:30 +00:00
bcc50cc7ce [PEFT] Better Trainer error when prompt learning with loading best model at the end (#35087)
Original issue: https://github.com/huggingface/peft/issues/2256

There is a potential error when using load_best_model_at_end=True with a
prompt learning PEFT method. This is because Trainer uses load_adapter
under the hood but with some prompt learning methods, there is an
optimization on the saved model to remove parameters that are not
required for inference, which in turn requires a change to the model
architecture. This is why load_adapter will fail in such cases and users
should instead set load_best_model_at_end=False and use
PeftModel.from_pretrained. As this is not obvious, we now intercept the
error and add a helpful error message.
2024-12-11 12:44:39 +01:00
d363e71d0e 🧹 Remove deprecated RotaryEmbedding parts in the Attention layers (#34858)
* update

* style

* fix missing args

* remove last trace of old rope classes

* remove deprecated copied from

* fix copies

* trigger CIs

* post rebase clean-up

* reverse mistral

* cleanup after dropping commits

* Add comment
2024-12-11 11:16:52 +01:00
9094b87dd4 BLIP: enable device map (#34850)
fix device map
2024-12-11 11:03:30 +01:00
10feacd88a [i18n-<languageCode>] Translating agents.md to Chinese (#35139)
* add "translate agents.md"

* add "agents.md"

* add "translate warnings"

* add "totree"

* add "remove transformer_agent"

* add "remove transformer _agent file"

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-10 15:16:37 -08:00
e8508924fd Update data collator docstrings to accurately reference Nvidia tensor core compute capability version (#35188)
update data collator docs to reflect correct tensor core compute capability

Co-authored-by: John Graham Reynolds <john.graham.reynolds@vumc.org>
2024-12-10 15:16:01 -08:00
5290f6a62d [docs] Fix FlashAttention link (#35171)
fix link
2024-12-10 11:36:25 -08:00
91b8ab18b7 [i18n-<languageCode>] Translating Benchmarks.md to Chinese (#35137)
* add "Translating Benchmarks.md to Chinese "

* Removed all the English original text (which was previously kept as comments in the document) and refined some of the Chinese expressions.
2024-12-10 09:58:47 -08:00
217c47e31b Only import torch.distributed if it is available (#35133) 2024-12-10 18:19:30 +01:00
52d135426f Multiple typo fixes in NLP, Audio docs (#35181)
Fixed multiple typos in Tutorials, NLP, and Audio sections
2024-12-10 09:08:55 -08:00
425af6cdc2 [i18n-ar] Translated file : docs/source/ar/community.md into Arabic (#33027)
* Add docs/source/ar/community.md to Add_docs_source_ar_community.md

* Update community.md

* Update community.md

* Update community.md

* Update _toctree.yml - add community.md

* Update docs/source/ar/community.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Create how_to_hack_models.md

* Create modular_transformers.md

* Create tiktoken.md

* Update _toctree.yml

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tiktoken.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tiktoken.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-10 09:08:27 -08:00
e5c45a6679 Fixing GGUF support for StableLm (#35060)
fix

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-10 16:30:09 +01:00
3e2769a3c9 Fix DBRX LayerNorm init method (#35177)
fix dbrx layernorm init
2024-12-10 14:31:22 +00:00
5fba3f99c0 Remove unnecessary masked_fill in deberta models (#35182) 2024-12-10 13:52:20 +00:00
6acb4e43a7 Support BatchNorm in Hubert pos_conv_emb as in fairseq (#34389)
* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* Correct the new defaults (#34377)

* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue

* [auto. ping] Avoid sending empty info + add more team members (#34383)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix glm  (#34388)

* Fix duplicated

* fix import

* Use non nested images and batched text Idefics2/3  (#34222)

* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests

* Fix onnx non-expotable inplace aten op (#34376)

* fix onnx non-expotable inplace op

* mistral, qwen2, qwen2_vl, starcoder2

* fixup copies

* Fix right padding in LLaVA models (#34305)

* fix right pad llavas

* device mismatch

* no filter (#34391)

* no filter

* no filter

* no filter

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* SynthID: better example (#34372)

* better example

* Update src/transformers/generation/configuration_utils.py

* Update src/transformers/generation/logits_process.py

* nits

* Tests: upgrade `test_eager_matches_sdpa_generate` (#34386)

* Fix bnb training test failure (#34414)

* Fix bnb training test: compatibility with OPTSdpaAttention

* Avoid check expected exception when it is on CUDA (#34408)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix typos in agents_advanced.md (#34405)

* [docs] Cache implementations (#34325)

cache

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* [run-slow] hubert

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Rudy Delouya <rudy.delouya@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-12-10 14:18:23 +01:00
80f2b1610f Fix file path for shard_num 1 with mllama converter (#35053)
"#35049 fix path for num_shard 1"
2024-12-10 09:11:45 +00:00
0938b57770 Assisted decoding multi-gpu (#35116)
* fix

* move a few lines up
2024-12-10 09:59:17 +01:00
dada0fd85f Fix num_items_in_batch not being an integer (#35115)
In method `Trainer#get_batch_samples`, the return values should be a
list of batch samples and an integer indicating the number of items that
exist in the batch. However, this was not actually a case and what was
returned instead of an integer, was a tensor with one element. In the
multi-GPU setup, this tensor is placed in a different device than the
loss tensor, causing the loss function to raise a `RuntimeError`.

The problem arises from
5d7739f15a/src/transformers/trainer.py (L5139-L5144),
where the outer `sum` operates over a list of tensors which means that
the final result is also a tensor. To counter this issue, a new check
(after the accelerator gathering) has been added in order to convert a
potential tensor to an integer before returning the
`num_items_in_batch`.
2024-12-10 08:40:40 +01:00
34f4080ff5 [CI] Fix bnb quantization tests with accelerate>=1.2.0 (#35172) 2024-12-09 13:55:16 -05:00
UV
fa8763ce17 Fixed typo of 'avilable' in prompts.py (#35145) 2024-12-09 16:40:32 +00:00
4bc39de5c3 Super tiny fix logging message (#35132)
Update integration_utils.py
2024-12-09 16:31:32 +00:00
8e806a336f Cleanup: continue the init refactor (#35167)
Round 2
2024-12-09 16:09:50 +01:00
7238387f67 Fix typo in EETQ Tests (#35160)
fix
2024-12-09 14:13:36 +01:00
de8a0b7547 Option to set 'non_blocking' for to(device) in BatchEncoding and BatchFeature (#34883)
* Option to set 'non_blocking' for to(device) operation for performance improvements. Defaults to 'false', thus no behavioral changes.

* Enabling non_blocking in to() operation of BatchFeature.

* Improved docstring on utilization of non_blocking

* Force non_blocking as keyword argument

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Daniel Bogdoll <dbogdoll@umich.edu>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-12-09 11:29:04 +01:00
UV
1452dc2514 Corrected typo in agent system prompts (#35143) 2024-12-09 10:42:23 +01:00
9e420e0269 [I-JEPA] Update docs (#35148)
Update docs
2024-12-09 10:01:31 +01:00
1ccca8f48c Fix GA loss bugs and add unit test (#35121)
* fix GA bugs and add unit test

* narrow down model loss unit test diff gap

* format code to make ruff happy

* send num_items_in_batch argument to decoder

* fix GA loss bug in BertLMHeadModel

* use TinyStories-33M to narrow down diff gap

* fotmat code

* missing .config

* avoid add extra args

---------

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-12-09 09:57:41 +01:00
c8c8dffbe4 Update I-JEPA checkpoints path (#35120)
Update checkpoints path
2024-12-06 13:42:51 +00:00
7f95372c62 Add feature dim attributes to BitLinear for easier PEFT integration (#34946)
Update bitnet.py, extremely small change to allow for easier PEFT integration

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-06 13:39:45 +01:00
9ad4c93536 Add Aria (#34157)
* Add Aria
---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-06 12:17:34 +01:00
15ab310c3a Fix private forked repo. CI (#35114)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-06 12:03:31 +01:00
98e8062df3 [docs] top_p, top_k, temperature docstrings (#35065)
clarify
2024-12-05 11:24:51 -08:00
44f88d8ccb [docs] Update Python version in translations (#35096)
update: doc version
2024-12-05 11:06:54 -08:00
66ab300aaf Dev version 2024-12-05 19:12:22 +01:00
a5bb528471 Fix signatures for processing kwargs (#35105)
* add conversion script

* remove pg2 refs

* fixup style

* small update

* get correct scaling

* add back missing bos

* fix missing config keys

* might revert this pos_embeddings

* fixup 9b config

* fix 9b

* fixup 9b conversion for good + add back num_hidden_layers

* add correct query scaling for 2b, 9b, 27b

* fixup 27b conversion

* Additional variant: 27b-896

* Use CPU for conversion to reduce GPU RAM requirements

* fix causal mask generation + formatting

* fix in-training causal mask generation edge case

* trigger CI

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* move conversion file to main model dir

* handle multi-images + bos token

* address comments for input ids

* revert ci fixes

* [run-slow] paligemma

* fix

* [run-slow] paligemma

* skip end 2 end

* [run-slow] paligemma

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-05 18:15:48 +01:00
e27465c801 Adaptive dynamic number of speculative tokens (#34156)
* initial commit

* update strategy

* add tradeoff FPR TPR with cost

* all probs

* fix

* fix

* fix style

* Update src/transformers/generation/configuration_utils.py

shorter docstring

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* import guard

* fix style

* add is_sklearn_available condition

* vectorizing to flatten the for-loop

* fix style

* disable adaptation for UAG

* update doc

* add TestAssistedCandidateGeneratorUpdateStrategy

* fix style

* protect import

* fix style

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-12-05 17:07:33 +01:00
b0a51e5cff Fix flaky Hub CI (test_trainer.py) (#35062)
* fix

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* check

* check

* check

* check

* check

* check

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* check

* check

* check

* Final space

* Final adjustment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2024-12-05 17:02:27 +01:00
a928d9c128 [trainer] fix the GA model_accepts_loss_kwargs (#34915)
* fix

* style

* values

* fix
2024-12-05 16:37:46 +01:00
e682c17e4a BLIP: this is correct now (#35081)
this is correct now
2024-12-05 16:30:09 +01:00
50189e36a6 Add I-JEPA (#33125)
* first draft

* add IJepaEmbeddings class

* fix copy-from for IJepa model

* add weight conversion script

* update attention class names in IJepa model

* style changes

* Add push_to_hub option to convert_ijepa_checkpoint function

* add initial tests for I-JEPA

* minor style changes to conversion script

* make fixup related

* rename conversion script

* Add I-JEPA to sdpa docs

* minor fixes

* adjust conversion script

* update conversion script

* adjust sdpa docs

* [run_slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* formatting issues

* adjust modeling to modular code

* add IJepaModel to objects to ignore in docstring checks

* [run-slow] ijepa

* fix formatting issues

* add usage instruction snippet to docs

* change pos encoding, add checkpoint for doc

* add verify logits for all models

* [run-slow] ijepa

* update docs to include image feature extraction instructions

* remove pooling layer from IJepaModel in image classification class

* [run-slow] ijepa

* remove pooling layer from IJepaModel constructor

* update docs

* [run-slow] ijepa

* [run-slow] ijepa

* small changes

* [run-slow] ijepa

* style adjustments

* update copyright in init file

* adjust modular ijepa

* [run-slow] ijepa
2024-12-05 16:14:46 +01:00
95a855e212 Deprecate quanto and switch to optimum-quanto (#35001)
* deprecate quanto

* fix style
2024-12-05 16:11:09 +01:00
482cb28a18 Fix tie_word_embeddings handling for GGUF models (#35085)
* fix tie_word_embeddings

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2024-12-05 16:00:41 +01:00
35447054f5 Update Mistral conversion script (#34829)
* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py
2024-12-05 15:47:20 +01:00
93f87d3cf5 [tokenizers] bump to 0.21 (#34972)
bump to 0.21
2024-12-05 15:46:02 +01:00
54aae121eb [Whisper] Fix whisper tokenizer (#34537)
* handle single timestamp ending

* include last timestamp token

* handle single timestamp ending

* avoid floating points arithm limitations

* ensure float64 operations

* new test

* make fixup

* make copies

* handle edge case double tokens ending with different tokens

* handle single timestamp ending

* make fixup

* handle conditioning on prev segments

* fix

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* [run-slow] whisper

* don't call item() to avoid unnecessary sync

* fix

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
2024-12-05 13:46:29 +01:00
beb2c66ec3 Informative (#35059)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-05 09:50:27 +01:00
1ed1de2fec [docs] Increase visibility of torch_dtype="auto" (#35067)
* auto-dtype

* feedback
2024-12-04 09:18:44 -08:00
baa3b22137 [docs] add a comment that offloading requires CUDA GPU (#35055)
* add commen to offloading

* Update docs/source/en/kv_cache.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-04 07:48:34 -08:00
1da1e0d7f2 Support for easier multimodal use of modular (#35056)
* update modular and add examples

* style

* improve example comments

* style

* fix small logic issue for imports

* fix relative order issue when files do not make sense

* Improve comments

* trigger CIs
2024-12-04 15:13:11 +01:00
46df859975 [GPTNeoX] Flex Attention + Refactor (#34896)
* gpt neox flex attention + refactor

* some formatting

* small fix on dropout

* add assertion on flex attn test

* flaky ci :(

* add head mask support

* style

* handle dtype, replace torch where

* fixup flex with output attns

* code review and several other fixes

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

* remove unnecessary comment

* remove incorrect comment

* make flex attn check more agnostic tor versions and centralized

* change peft input dtype check to value since q and k could be affected by other stuff like RoPE

* i forgor

* flaky

* code review and small fixes

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:48:28 +01:00
accb7204f9 Add Pytorch Tensor Parallel support for Qwen2, Qwen2Moe, Starcoder2 (#35007)
* add base tp plan for qwen2 and qwen2moe

* add parallel tp for starcoder2

* fix modular conversion

* add infer dim for qkv states

* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:43:36 +01:00
c7a109ec81 Fix pad_token_tensor is None in warning (#34005)
Fix pad_token_tensor is None in warning
2024-12-04 11:15:25 +01:00
329f5dbf97 [docs] use device-agnostic API instead of hard-coded cuda (#35048)
replace cuda
2024-12-03 10:54:15 -08:00
b8cdc262d5 [docs] use device-agnostic instead of cuda (#35047)
* fix on xpu

* [run_all]

* add the missing import for Image lib

* add more devices in comment

* bug fix

* replace cuda
2024-12-03 10:53:45 -08:00
346597b644 Translate community.md into Chinese (#35013)
* community translation

* Update docs/source/zh/community.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-03 10:22:02 -08:00
3deaa8179d [docs] fix example code bug (#35054)
fix code bug
2024-12-03 09:18:39 -08:00
125de41643 fix speecht5 failure issue in test_peft_gradient_checkpointing_enable… (#34454)
* fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* [run-slow] speecht5

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-12-03 13:58:54 +00:00
7a7f27697a Fix BertGeneration (#35043)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-03 13:56:59 +01:00
901f504580 Add token cost + runtime monitoring to Agent and HfEngine children (#34548)
* Add monitoring to Agent and HfEngine children
2024-12-03 13:14:52 +01:00
ee37bf0d95 Automatic compilation in generate: do not rely on inner function (#34923)
* compiled forward in PreTrainedModel

* update

* style

* update name

* trigger CIs

* Add way to use custom compile args

* style

* switch parameterization to generation_config

* Add to inits

* Update configuration_utils.py

* inits

* style

* docs

* style

* Update configuration_utils.py

* back without dataclass for repo consistency

* Update configuration_utils.py

* style

* style

* style once again

* add config serialization

* update

* true dataclass

* trigger CIs

* merge compile methods + remove serialization of compile config
2024-12-03 11:20:31 +01:00
f9c7e6021e Translate bertlogy.md into Chinese (#34908)
* bertology translation

* Update docs/source/zh/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: blueingman <15329507600@163.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-02 11:42:40 -08:00
527dc04e46 [docs] add the missing import for Image and bug fix (#34776)
* add the missing import for Image lib

* add more devices in comment

* bug fix
2024-12-02 11:40:20 -08:00
4955e4e638 [i18n-ar] Translated file : docs/source/ar/notebooks.md into Arabic (#33049)
* Add docs/source/ar/notebooks.md to Add_docs_source_ar_notebooks.md

* Update notebooks.md

* Update _toctree.yml
2024-12-02 11:40:04 -08:00
f0dec874f0 add docstring example for compute_loss_func (#35020) 2024-12-02 11:39:09 -08:00
31299670cd Multiple typo fixes in Tutorials docs (#35035)
* Fixed typo in multi gpu docs and OLMoE version

* Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction

* Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs
2024-12-02 15:26:34 +00:00
31830474bf Fix test_eager_matches_sdpa_inference for XPU backend (#34889)
* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Fix test_eager_matches_sdpa_inference for XPU backend

As of PyTorch 2.5 XPU backend supports only torch.nn.attention.SDPBackend.MATH
which is implemented on PyTorch level using aten operators and is device
agnostic with respect to implementation of each aten operator. Thus, we can
reuse CUDA (or CPU) MATH weights for XPU.

Fixes: #34888
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-12-02 16:21:04 +01:00
f41d5d8f74 Add type hints for forward functions in Gemma2 (#35034)
* feat: add gemma2 type hints

* fix: mask is optional
2024-12-02 14:03:36 +00:00
7b5f76e32e Typo in warning switching to optimum-quanto (#35028)
fix typos
2024-12-02 13:47:05 +00:00
c24c79ebf9 Optimize memory usage of mllama encoder (#34930)
mllama encoder memory optimization
2024-12-02 11:46:45 +01:00
9ab8c5b503 fix variable undefined bug when return_tensors is not specified in llava processing (#34953)
* fix variable undefined bug when return_tensors is not specified in llava processor

* improve readability
2024-12-02 11:44:42 +01:00
3480cbb97e Only cast cu_seqlens when tracing (#35016)
* Only cast `cu_seqlens` when tracing

* Formatting
2024-12-02 11:39:39 +01:00
19dabe9636 Update FillMaskPipeline.__call__ signature and docstring (#35006)
Update `FillMaskPipeline.__call__`

- Remove unused `*args`
- Update docstring with `inputs` over `args`
2024-11-29 13:44:56 +00:00
f7427f58ed fix: double verbs (#35008) 2024-11-29 13:19:57 +00:00
737f4dc4b6 Update timm version (#35005)
* Bump timm

* dev-ci
2024-11-29 12:46:59 +00:00
89d7bf584f 🚨🚨🚨 Uniformize kwargs for TrOCR Processor (#34587)
* Make kwargs uniform for TrOCR

* Add tests

* Put back current_processor

* Remove args

* Add todo comment

* Code review - breaking change
2024-11-29 11:58:11 +00:00
0b5b5e6a70 Let server decide default repo visibility (#34999)
* Let server decide default repo visibility

* code style
2024-11-28 17:05:08 +01:00
f491096f7d Fix docker CI : install autogptq from source (#35000)
* Fixed Docker

* Test ci

* Finally

* add comment
2024-11-28 16:31:36 +01:00
01ad80f820 Improve .from_pretrained type annotations (#34973)
* Fix from_pretrained type annotations

* Better typing for image processor's `from_pretrained`
2024-11-28 15:05:19 +00:00
9d6f0ddcec Add optimized PixtralImageProcessorFast (#34836)
* Add optimized PixtralImageProcessorFast

* make style

* Add dummy_vision_object

* Review comments

* Format

* Fix dummy

* Format

* np.ceil for math.ceil
2024-11-28 16:04:05 +01:00
6300212946 Fix utils/check_bad_commit.py (for auto ping in CI) (#34943)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-28 15:34:38 +01:00
5e8c1d713d Offloaded cache: fix generate (#34921)
* fix cache impl

* require_torch_gpu

* fix mamba

* fix copies
2024-11-28 15:05:56 +01:00
57ca9e6d2f Allow compressed-tensors quantized model to be trained (#34520)
* populate quantization_config for kv-cache-scheme only configs

* make compressed-tensors quantized models trainable

* populate versions on quant config

* pass oneshot then finetune

* remove breakpoint

* SunMarc comments and fix to_dict logic

* lint

* lint

* test

* comment

* comments'
2024-11-28 15:05:16 +01:00
44af935ec5 Refine the code of Universal Assisted Generation (#34823)
* removed the useless attritbutes

* add configs for window size

* fixed the wrong kwargs

* added docstring
2024-11-28 15:04:24 +01:00
2b053fdf1a 🚨🚨🚨 Changed DINOv2Config default patch size to 14 (#34568)
Changed DINOv2Config default patch size to 14
2024-11-28 14:48:06 +01:00
4f0bf9864c Fix save_pretrained for partially offloaded models (#34890)
* delete unnecessary reference

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update comment, explicit delete state_dict

* Update src/transformers/modeling_utils.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-11-28 14:46:56 +01:00
f4b674f269 [PEFT] Set eval mode when loading PEFT adapter (#34509)
* [PEFT] Set eval mode when loading PEFT adapter

Resolves #34469

When calling model.load_adapter to load a PEFT adapter, by default the
adapter should be set to eval mode. This is now correctly done. Users
can still pass is_trainable=True to load the adapter in training mode.

* Linter
2024-11-28 13:56:25 +01:00
5523e38b55 Fixed typo in VisitWebpageTool (#34978)
Fixed typo in VisitWebpageTool
2024-11-27 12:49:21 -08:00
4120cb257f Fix typo in code block in vipllava.md (#34957)
fix typo in code block in vipllava.md
2024-11-27 08:19:34 -08:00
2910015d6d [i18n-zh]Translated perf_train_special.md into Chinese (#34948)
* Add translation for perf_train_special documentation

* Update docs/source/zh/perf_train_special.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/perf_train_special.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update _toctree.yml

* Update _toctree.yml

* Update perf_train_special.md

* Update perf_train_special.md

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-11-27 07:57:43 -08:00
637225508f [docs] add explanation to release_memory() (#34911)
* explain release_memory

* Update docs/source/en/llm_tutorial_optimization.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-27 07:47:28 -08:00
0600f46353 🌐 [i18n-KO] Translated encoder-decoder.md to Korean (#34880)
* Initial version of translation, english still remaining

* Revised Translation, removed english. _toctree not updated

* updated _toctree.yml && 3rd ver translation

* updated _toctree.yml && 3rd ver translation

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2024-11-27 07:47:14 -08:00
5f8b24ee12 Fix flaky test execution caused by Thread (#34966)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 16:32:50 +01:00
0d99a938aa Avoid calling get_max_length (#34971)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 15:15:35 +01:00
8f48ccf548 Fix : Add PEFT from source to CI docker (#34969)
* Docker fix peft

* Test new docker

* uncomment
2024-11-27 14:10:47 +01:00
4c1388f48e [FlexAttention] Update gemma2 (#34942)
* update tests

* now maybe this fixes the previous fialing tests!

* nit default

* Update src/transformers/models/gemma2/modular_gemma2.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix-copies

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2024-11-27 11:50:48 +01:00
6c3f168b36 [i18n-zh]Translated tiktoken.md into chinese (#34936)
* Add translation for tiktoken documentation

* Update tiktoken.md

* Update tiktoken.md
2024-11-26 10:09:52 -08:00
5bfb40bc8e docs: HUGGINGFACE_HUB_CACHE -> HF_HUB_CACHE (#34904) 2024-11-26 09:37:18 -08:00
784d22078a [doc] use full path for run_qa.py (#34914)
use full path for run_qa.py
2024-11-26 09:23:44 -08:00
6bc0c219c1 [docs] use device-agnostic API instead of cuda (#34913)
add device-agnostic API

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-11-26 09:23:34 -08:00
64b73e61f8 [i18n-ar] Translated file : docs/source/ar/benchmarks.md into Arabic (#33023)
* Add docs/source/ar/benchmarks.md to Add_docs_source_ar_benchmarks.md

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update benchmarks.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-11-26 09:23:11 -08:00
a0ba631519 Update the Python version in the Chinese README to match the English README. (#34870)
Update Python Version
2024-11-26 09:22:34 -08:00
1f6b423f0c Fix torch.onnx.export of Qwen2-VL vision encoder (#34852)
* Fix torch.onnx.export of Qwen2-VL vision encoder

This PR fixes onnx export support for the vision encoder of Qwen2-VL, which converts the `cu_seqlens` to `torch.int32`, leading to errors later on when using the values for slicing.

c57eafdaa1/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py (L1044-L1046)

## Error:
```
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Slice, node name: /blocks.0/attn/Slice_4): axes has inconsistent type tensor(int64)
```

## Code to reproduce issue:
```py

import requests
from PIL import Image
import torch
from transformers import (
    AutoProcessor,
    Qwen2VLForConditionalGeneration,
)

# Constants
VISION_MODEL_NAME = "vision_encoder.onnx"

# Load model and processor
model_id = "hf-internal-testing/tiny-random-Qwen2VLForConditionalGeneration"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id).eval()
processor = AutoProcessor.from_pretrained(model_id)

# Prepare inputs
url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
conversation = [
    {
        "role": "user",
        "content": [
            { "type": "image" },
            { "type": "text", "text": "Describe this image."},
        ],
    },
]
images = [image]
text_prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=[text_prompt], images=images, padding=True, return_tensors="pt")

## Vision model
vision_inputs = dict(
    pixel_values=inputs["pixel_values"],
    grid_thw=inputs["image_grid_thw"],
)
vision_inputs_positional = tuple(vision_inputs.values())
vision_outputs = model.visual.forward(*vision_inputs_positional)  # Test forward pass
torch.onnx.export(
    model.visual,
    args=vision_inputs_positional,
    f=VISION_MODEL_NAME,
    export_params=True,
    opset_version=14,
    do_constant_folding=True,
    input_names=list(vision_inputs.keys()),
    output_names=["image_features"],
    dynamic_axes={
        "pixel_values": {
            0: "batch_size * grid_t * grid_h * grid_w",
            1: "channel * temporal_patch_size * patch_size * patch_size",
        },
        "grid_thw": {0: "batch_size"},
        "image_features": {0: "batch_size * grid_t * grid_h * grid_w"},
    },
)

# Load and check the exported model model
import onnx
model = onnx.load(VISION_MODEL_NAME)
onnx.checker.check_model(model, full_check=True)
inferred = onnx.shape_inference.infer_shapes(model, check_type=True)
```

* Formatting

* [run-slow] qwen2_vl
2024-11-26 16:14:36 +01:00
d5cf91b346 Separate chat templates into a single file (#33957)
* Initial draft

* Add .jinja file loading for processors

* Add processor saving of naked chat template files

* make fixup

* Add save-load test for tokenizers

* Add save-load test for tokenizers

* stash commit

* Try popping the file

* make fixup

* Pop the arg correctly

* Pop the arg correctly

* Add processor test

* Fix processor code

* stash commit

* Processor clobbers child tokenizer's chat template

* Processor clobbers child tokenizer's chat template

* make fixup

* Split processor/tokenizer files to avoid interactions

* fix test

* Expand processor tests

* Rename arg to "save_raw_chat_template" across all classes

* Update processor warning

* Move templates to single file

* Move templates to single file

* Improve testing for processor/tokenizer clashes

* Improve testing for processor/tokenizer clashes

* Extend saving test

* Test file priority correctly

* make fixup

* Don't pop the chat template file before the slow tokenizer gets a look

* Remove breakpoint

* make fixup

* Fix error
2024-11-26 14:18:04 +00:00
5a45617887 change apply_rotary_pos_emb of Glmmodel for GLM-Edge Series model (#34629)
* change apply_rotary_pos_emb

* upload for glm-edge

* remove useless part

* follow the suggestion

* fix

* format

* format

* test

* format again

* format again

* remove modular change

* remove modular change

* this apply_rotary_pos_emb need modify?

* fix with this

* format

* format

* ruff check

* modify modular_glm failed

* remove partial_rotary_factor of function  partial_rotary_factor

* fix wrong change of examples/research_projects

* revert

* remove line 118

* use q_rot
2024-11-26 15:05:42 +01:00
1141eff1bd Add Pytorch Tensor Parallel support for Mistral (#34927)
add base tp support
2024-11-26 14:28:07 +01:00
4d1d0f29a4 [Whisper] Fix whisper integration tests (#34111)
* fix test_tiny_timestamp_generation

* fix test_large_timestamp_generation

* fix test_whisper_shortform_single_batch_prev_cond

* fix test_whisper_shortform_multi_batch_hard_prev_cond

* return_timestamps necessary with long form

* fix test_default_multilingual_transcription_long_form

* fix test_tiny_token_timestamp_generation_longform

* fix test_whisper_longform_multi_batch_hard

* Update tests/models/whisper/test_modeling_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* fix typo

* do not expect special tokens

* fix test_whisper_longform_single_batch_beam

* fix test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* these tests does not make sense anymore

* this test does not make sense anymore

* make fixup

* suggested nits

* add test with forced_decoder_ids

* this test does not make sense anymore

* change assert for unittest test cases

* make fixup

* test with prompt_ids and task and language

* fix unittest test case call

* fix test_tiny_generation

* fix test_tiny_en_generation

* fix test_tiny_en_batched_generation

* fix test_tiny_longform_timestamps_generation

* fix test_tiny_timestamp_generation

* fix test_large_generation

* fix test_large_batched_generation

* fix test_large_generation_multilingual

* fix test_large_timestamp_generation

* fix test_large_timestamp_generation

* fix test_tiny_token_timestamp_generation_longform

* fix test_tiny_en_batched_generation

* make fixup

* [run-slow] whisper

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-11-26 12:23:08 +01:00
0e805e6d1e Skipping aqlm non working inference tests till fix merged (#34865) 2024-11-26 11:09:30 +01:00
73b4ab1085 VideoLLaVA: add default values (#34916)
add default values
2024-11-26 08:20:06 +01:00
bdb29ff9f3 Fix import structure for Fast Image processors (#34859)
* Fix import structure image_processor_fast

* update to new inits
2024-11-25 16:27:56 -05:00
bfc3556b20 making gpt2 fx traceable (#34633)
* making gpt2 fx tracable

* running make fix-copies

* Revert "running make fix-copies"

This reverts commit 5a3437cb5b63799243bceae7d21a2aed8d0418c7.
2024-11-25 19:30:38 +01:00
95c10fedb3 Updated documentation and added conversion utility (#34319)
* Updated documentation and added conversion utility

* Update docs/source/en/tiktoken.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tiktoken.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Moved util function to integration folder + allow for str

* Update formatting

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Updated formatting

* style changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 18:44:09 +01:00
890ea7de93 Fix failling GGML test (#34871)
fix_test
2024-11-25 18:04:52 +01:00
b76a292bde Upgrade torch version to 2.5 in dockerfile for quantization CI (#34924)
* Upgrade Torch 2.5

* uncomment
2024-11-25 17:38:20 +01:00
a830df2909 Fix test_auto_backbone_timm_model_from_pretrained (#34877)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-25 17:20:41 +01:00
a464afbe2a fix static cache data type miss-match (#34799)
* fix gptj data type missmatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add low precision static cache tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix low-precision static cache tests

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* avoid config change

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* change data type convert in cache copy

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* cast key value after k v out

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2024-11-25 16:59:38 +01:00
b13916c09d [AWQ, CI] Bump AWQ version used in docker image (#34922)
The old AWQ version is failing with the latest (unreleased)
transformers, giving the error:

> ImportError: cannot import name 'shard_checkpoint' from
'transformers.modeling_utils'

This has been resolved in awq v0.2.7:

https://github.com/casper-hansen/AutoAWQ/pull/644
2024-11-25 16:49:57 +01:00
4e6b19cd95 Fix : BitNet tests (#34895)
* fix_tests_bitnet

* fix format
2024-11-25 16:47:14 +01:00
9121ab8fe8 Rename OLMo November to OLMo2 (#34864)
* Rename/move OLMo Nov files to OLMo2

* Rename Olmo1124 and its variants to Olmo2
2024-11-25 16:31:22 +01:00
1de3598d30 Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/lxmert (#34917)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-25 15:19:29 +00:00
f4c04ba32b Fix Qwen2 failing tests (#34819)
* fix: qwen2 model ids

* fix: line

* fix: more format

* update: reformat
2024-11-25 15:53:04 +01:00
11cc2295c7 [peft] Given that self.active_adapter is deprecated, avoid using it (#34804)
* Given that self.active_adapter is deprecated, avoid using it

* Remove misleading comment - `self.active_adapter` is not used (and deprecated)
2024-11-25 15:29:52 +01:00
74db22f905 Fix convert_tokens_to_string when decoder is None (#34569)
* Fix convert_tokens_to_string when decoder is None

* revert unrelated changs

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-11-25 14:35:24 +01:00
97514a8ba3 chore: fix some typos (#34891)
Signed-off-by: wanxiangchwng <cui.shuang@foxmail.com>
2024-11-25 13:05:59 +00:00
62ab94dea8 Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/visual_bert (#34887)
Bump tornado in /examples/research_projects/visual_bert

Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-25 12:54:55 +00:00
c50b5675d6 prepare_fa2_from_position_ids function bugfix (#33269)
contiguous() is called before view() for key and value within prepare_fa2_from_position_ids function
2024-11-25 13:51:26 +01:00
a0f4f3174f allow unused input parameters passthrough when chunking in asr pipelines (#33889)
* allow unused parameter passthrough when chunking in asr pipelines

* format code

* format

* run fixup

* update tests

* update parameters to pipline in test

* updates parametrs in tests

* change spelling in gitignore

* revert .gitignore to main

* add git ignore of devcontainer folder

* assert asr output follows expected inference output type

* run fixup

* Remove .devcontainer from .gitignore

* remove compliance check
2024-11-25 11:36:44 +01:00
4dc1a69349 Sum gathered input tokens (#34554)
* sum gathered input tokens

* ruff line-length is 119, format the code

---------

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-11-25 11:27:13 +01:00
1e492afd61 🔴 Mllama: fix base prefix (#34874)
fix base prefix
2024-11-25 11:20:20 +01:00
857d46ca0c [Deberta/Deberta-v2] Refactor code base to support compile, export, and fix LLM (#22105)
* some modification for roadmap

* revert some changes

* yups

* weird

* make it work

* sttling

* fix-copies

* fixup

* renaming

* more fix-copies

* move stuff around

* remove torch script warnings

* ignore copies

* revert bad changes

* woops

* just styling

* nit

* revert

* style fixup

* nits configuration style

* fixup

* nits

* will this fix the tf pt issue?

* style

* ???????

* update

* eval?

* update error message

* updates

* style

* grumble grumble

* update

* style

* nit

* skip torch fx tests that were failing

* style

* skip the failing tests

* skip another test and make style
2024-11-25 10:43:16 +01:00
098962dac2 BLIP: fix generation after hub update (#34876)
* fix blip generation

* dont remove it yet

* Update src/transformers/models/blip_2/modeling_blip_2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* modular

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 10:41:55 +01:00
c1a8520419 Cache: init empty cache when use_cache (#34274)
* fix

* fix tests

* fix copies

* add docs

* Revert "add docs"

This reverts commit 32d35634f12ba02781d2ebdee0c8dcfbe992a7b9.

* qwen move deltas

* mllama can potentiall fullgraph compile

* enable mllama compile and fix tests

* remove mllama fixes
2024-11-25 10:11:33 +01:00
1339a14dca Add safe_globals to resume training on PyTorch 2.6 (#34632)
Starting from version 2.4 PyTorch introduces a stricter check for the objects which
can be loaded with torch.load(). Starting from version 2.6 loading with weights_only=True
requires allowlisting of such objects.

This commit adds allowlist of some numpy objects used to load model checkpoints.
Usage is restricted by context manager. User can still additionally call
torch.serialization.add_safe_globals() to add other objects into the safe globals list.

Accelerate library also stepped into same problem and addressed it with PR-3036.

Fixes: #34631
See: https://github.com/pytorch/pytorch/pull/137602
See: https://pytorch.org/docs/stable/notes/serialization.html#torch.serialization.add_safe_globals
See: https://github.com/huggingface/accelerate/pull/3036

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-11-25 10:03:43 +01:00
318fe25f22 Fix: Enable prefill phase key value caching of nemotron/minitron models (#34742)
* modeling nemotron kv caching bugfix

Signed-off-by: jeongin601 <0200angela@gmail.com>

* test file deleted

Signed-off-by: jeongin601 <0200angela@gmail.com>

* code refinement

Signed-off-by: jeongin601 <0200angela@gmail.com>

* remove unused variables

Signed-off-by: jeongin601 <0200angela@gmail.com>

* import block sorted

* removed deprecation warning

Signed-off-by: jeongin601 <0200angela@gmail.com>

* removed support for tuple shape past_key_values

Signed-off-by: jeongin601 <0200angela@gmail.com>

* Update conditional statement for cache initialization

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Signed-off-by: jeongin601 <0200angela@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 09:45:35 +01:00
3a8eb74668 Fix support for image processors modifications in modular (#34866)
* add fix and examples

* fix camel case naming
2024-11-22 18:14:24 -05:00
54be2d7ae8 Bitnet test fix to avoid using gated model (#34863)
small test fix
2024-11-22 17:18:49 +01:00
286ffaaf0a [CI] Skip EETQ tests while package is broken with latest transformers (#34854)
* CI Skip EETQ tests while package is broken

EETQ tries to import the shard_checkpoint function from transformers but
the function has been removed. Therefore, trying to use EETQ currently
results in an import error. This fix results in EETQ tests being skipped
if there is an import error.

The issue has been reported to EETQ:

https://github.com/NetEase-FuXi/EETQ/issues/34

* Raise helpful error when trying to use eetq

* Forget to raise the error in else clause
2024-11-22 17:13:30 +01:00
861758e235 smol improvements to support more flexible usage (#34857)
* smol improvements to support more flexible usage

* ruff
2024-11-22 16:34:38 +01:00
42b36d7395 Speculative decoding: Test the target distribution (to prevent issues like #32867) (#34553)
* Update test_utils.py

* formatting

* Update test_utils.py

* formatting

* formatting

* Update test_utils.py

* formatting

* Update test_utils.py

* formatting

* format

* comments at standard positions
2024-11-22 16:02:37 +01:00
597efd21d2 Auto compile when static cache (#34247)
* generate with compile

* nits

* simple

* generate with compile

* nits

* simple

* safe

* style

* Update src/transformers/generation/utils.py

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* remove TOKENIZER forked warning

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-22 15:33:35 +01:00
d9e6f307e7 Remove quantization related config from dequantized model (#34856)
* Remove quantization related config from dequantized model

* Fix whitespace
2024-11-22 10:06:29 +01:00
1867be666d Update checks for torch.distributed.tensor to require torch >= 2.5 (#34816)
* Update checks for torch.distributed.tensor

* Update PR with feedback

* Formatting fix for import order

* Remove unused function
2024-11-22 10:05:26 +01:00
6a912ff2c5 Watermarking: fix order (#34849)
fix watermarking order
2024-11-22 08:25:14 +01:00
4e90b99ed9 Refactor StarCoder2 using modular (#34015)
* Create modular_starcoder2.py

* Update modular_starcoder2.py

* update

* finalize modular

* revert # no-unravel

* Add support

* style

* Update modular_model_converter.py

* update docstring
2024-11-21 14:52:39 +01:00
18871599c9 Fix heuristic scheduling for UAG (#34805)
* fix heuristic schedule

* fix style

* fix format
2024-11-21 14:46:35 +01:00
d6a5c23f71 Fix ds nvme (#34444)
* skip nested deepspeed.zero.Init call

* make fixup

* solve conflict

* solve conflict

* put back local

* use context mangers instead of local thread

* Skip recursive calls to deepspeed.zero.Init

* Skip recursive calls to deepspeed.zero.Init

* back to old notebooks

* make style
2024-11-21 13:52:22 +01:00
ae5cbf804b Improve gguf tensor processing (#34515)
* add tensor processing system to separate logic for models

* format refactoring

* small fix

* make some methods private

* move custom methods to processors

* refactor tensor processing

* format fix
2024-11-21 13:40:49 +01:00
c57eafdaa1 Add Nemotron GGUF Loading Support (#34725)
* Add Nemotron GGUF Loading Support

* fix the Nemotron architecture assignation

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-21 11:37:34 +01:00
d4e1acbb7c Change logging level from warning to info for max_steps overriding num_train_epochs (#34810)
Update trainer.py
2024-11-21 11:37:02 +01:00
28fb02fc05 VLMs: enable generation tests - last batch (#34484)
* add tests for 3 more vlms

* fix fuyu back

* skip test
2024-11-21 11:00:22 +01:00
40821a2478 Fix CI slack reporting issue (#34833)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-20 21:36:13 +01:00
3cb8676a91 Fix CI by tweaking torchao tests (#34832) 2024-11-20 20:28:51 +01:00
bf42c3bd4b Fix hyperparameter search when optuna+deepseed (#34642)
* Fix hyperparameter search when optuna+deepseed

* Adding free_memory to the search setup

---------

Co-authored-by: Corentin-Royer <corentin.royer@ibm.com>
2024-11-20 18:02:58 +01:00
67890de3b8 Torchao weights only + prequantized compability (#34355)
* weights only compability

* better tests from code review

* ping torch version

* add weights_only check
2024-11-20 17:24:45 +01:00
f297af55df Fix: take into account meta device (#34134)
* Do not load for meta device

* Make some minor improvements

* Add test

* Update tests/utils/test_modeling_utils.py

Update test parameters

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Make the test simpler

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-20 11:32:07 +01:00
8cadf76e1c fix(DPT,Depth-Anything) torch.export (#34103)
* Fix torch.export issue in dpt based models

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Simplify the if statements

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Move activation definitions of zoe_depth to init()

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add test_export for dpt and zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add depth anything

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Remove zoedepth non-automated zoedepth changes and zoedepth test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything, zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-20 11:31:21 +01:00
9d16441e4f Fix the memory usage issue of logits in generate() (#34813) 2024-11-20 11:25:37 +01:00
3671 changed files with 213875 additions and 224363 deletions

View File

@ -13,6 +13,7 @@ jobs:
check_circleci_user:
docker:
- image: python:3.10-slim
resource_class: small
parallelism: 1
steps:
- run: echo $CIRCLE_PROJECT_USERNAME
@ -30,6 +31,14 @@ jobs:
parallelism: 1
steps:
- checkout
- run: if [[ "$CIRCLE_PULL_REQUEST" == "" && "$CIRCLE_BRANCH" != "main" && "$CIRCLE_BRANCH" != *-release ]]; then echo "Not a PR, not the main branch and not a release branch, skip test!"; circleci-agent step halt; fi
- run: 'curl -L -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" https://api.github.com/repos/$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME/pulls/${CIRCLE_PULL_REQUEST##*/} >> github.txt'
- run: cat github.txt
- run: (python3 -c 'import json; from datetime import datetime; fp = open("github.txt"); data = json.load(fp); fp.close(); f = "%Y-%m-%dT%H:%M:%SZ"; created = datetime.strptime(data["created_at"], f); updated = datetime.strptime(data["updated_at"], f); s = (updated - created).total_seconds(); print(int(s))' || true) > elapsed.txt
- run: if [ "$(cat elapsed.txt)" == "" ]; then echo 60 > elapsed.txt; fi
- run: cat elapsed.txt
- run: if [ "$(cat elapsed.txt)" -lt "30" ]; then echo "PR is just opened, wait some actions from GitHub"; sleep 30; fi
- run: 'if grep -q "\"draft\": true," github.txt; then echo "draft mode, skip test!"; circleci-agent step halt; fi'
- run: uv pip install -U -e .
- run: echo 'export "GIT_COMMIT_MESSAGE=$(git show -s --format=%s)"' >> "$BASH_ENV" && source "$BASH_ENV"
- run: mkdir -p test_preparation
@ -57,15 +66,15 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py
# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
# We used:
# https://circleci.com/docs/api/v2/index.html#operation/getJobArtifacts : to get the job artifacts
# We could not pass a nested dict, which is why we create the test_file_... parameters for every single job
- store_artifacts:
path: test_preparation/transformed_artifacts.json
- store_artifacts:
@ -109,7 +118,7 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py
# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
@ -145,7 +154,7 @@ jobs:
path: ~/transformers/installed.txt
- run: python -c "from transformers import *" || (echo '🚨 import failed, this means you introduced unprotected imports! 🚨'; exit 1)
- run: ruff check examples tests src utils
- run: ruff format tests src utils --check
- run: ruff format examples tests src utils --check
- run: python utils/custom_init_isort.py --check_only
- run: python utils/sort_auto_mappings.py --check_only
- run: python utils/check_doc_toc.py
@ -170,7 +179,6 @@ jobs:
path: ~/transformers/installed.txt
- run: python utils/check_copies.py
- run: python utils/check_modular_conversion.py
- run: python utils/check_table.py
- run: python utils/check_dummies.py
- run: python utils/check_repo.py
- run: python utils/check_inits.py
@ -180,7 +188,6 @@ jobs:
- run: make deps_table_check_updated
- run: python utils/update_metadata.py --check-only
- run: python utils/check_docstrings.py
- run: python utils/check_support_list.py
workflows:
version: 2

View File

@ -28,21 +28,52 @@ COMMON_ENV_VARIABLES = {
"TRANSFORMERS_IS_CI": True,
"PYTEST_TIMEOUT": 120,
"RUN_PIPELINE_TESTS": False,
"RUN_PT_TF_CROSS_TESTS": False,
"RUN_PT_FLAX_CROSS_TESTS": False,
}
# Disable the use of {"s": None} as the output is way too long, causing the navigation on CircleCI impractical
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "dist": "loadfile", "vvv": None, "rsf":None}
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "vvv": None, "rsfE":None}
DEFAULT_DOCKER_IMAGE = [{"image": "cimg/python:3.8.12"}]
# Strings that commonly appear in the output of flaky tests when they fail. These are used with `pytest-rerunfailures`
# to rerun the tests that match these patterns.
FLAKY_TEST_FAILURE_PATTERNS = [
"OSError", # Machine/connection transient error
"Timeout", # Machine/connection transient error
"ConnectionError", # Connection transient error
"FileNotFoundError", # Raised by `datasets` on Hub failures
"PIL.UnidentifiedImageError", # Raised by `PIL.Image.open` on connection issues
"HTTPError", # Also catches HfHubHTTPError
"AssertionError: Tensor-likes are not close!", # `torch.testing.assert_close`, we might have unlucky random values
# TODO: error downloading tokenizer's `merged.txt` from hub can cause all the exceptions below. Throw and handle
# them under a single message.
"TypeError: expected str, bytes or os.PathLike object, not NoneType",
"TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType",
"Converting from Tiktoken failed",
"KeyError: <class ",
"TypeError: not a string",
]
class EmptyJob:
job_name = "empty"
def to_dict(self):
steps = [{"run": 'ls -la'}]
if self.job_name == "collection_job":
steps.extend(
[
"checkout",
{"run": "pip install requests || true"},
{"run": """while [[ $(curl --location --request GET "https://circleci.com/api/v2/workflow/$CIRCLE_WORKFLOW_ID/job" --header "Circle-Token: $CCI_TOKEN"| jq -r '.items[]|select(.name != "collection_job")|.status' | grep -c "running") -gt 0 ]]; do sleep 5; done || true"""},
{"run": 'python utils/process_circleci_workflow_test_reports.py --workflow_id $CIRCLE_WORKFLOW_ID || true'},
{"store_artifacts": {"path": "outputs"}},
{"run": 'echo "All required jobs have now completed"'},
]
)
return {
"docker": copy.deepcopy(DEFAULT_DOCKER_IMAGE),
"steps":["checkout"],
"resource_class": "small",
"steps": steps,
}
@ -54,9 +85,9 @@ class CircleCIJob:
install_steps: List[str] = None
marker: Optional[str] = None
parallelism: Optional[int] = 0
pytest_num_workers: int = 12
pytest_num_workers: int = 8
pytest_options: Dict[str, Any] = None
resource_class: Optional[str] = "2xlarge"
resource_class: Optional[str] = "xlarge"
tests_to_run: Optional[List[str]] = None
num_test_files_per_worker: Optional[int] = 10
# This should be only used for doctest job!
@ -112,7 +143,9 @@ class CircleCIJob:
# Examples special case: we need to download NLTK files in advance to avoid cuncurrency issues
timeout_cmd = f"timeout {self.command_timeout} " if self.command_timeout else ""
marker_cmd = f"-m '{self.marker}'" if self.marker is not None else ""
additional_flags = f" -p no:warning -o junit_family=xunit1 --junitxml=test-results/junit.xml"
junit_flags = f" -p no:warning -o junit_family=xunit1 --junitxml=test-results/junit.xml"
joined_flaky_patterns = "|".join(FLAKY_TEST_FAILURE_PATTERNS)
repeat_on_failure_flags = f"--reruns 5 --reruns-delay 2 --only-rerun '({joined_flaky_patterns})'"
parallel = f' << pipeline.parameters.{self.job_name}_parallelism >> '
steps = [
"checkout",
@ -133,14 +166,15 @@ class CircleCIJob:
"command": """dpkg-query --show --showformat='${Installed-Size}\t${Package}\n' | sort -rh | head -25 | sort -h | awk '{ package=$2; sub(".*/", "", package); printf("%.5f GB %s\n", $1/1024/1024, package)}' || true"""}
},
{"run": {"name": "Create `test-results` directory", "command": "mkdir test-results"}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>>' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>> --header "Circle-Token: $CIRCLE_TOKEN"' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Split tests across parallel nodes: show current parallel tests",
"command": f"TESTS=$(circleci tests split --split-by=timings {self.job_name}_test_list.txt) && echo $TESTS > splitted_tests.txt && echo $TESTS | tr ' ' '\n'" if self.parallelism else f"awk '{{printf \"%s \", $0}}' {self.job_name}_test_list.txt > splitted_tests.txt"
}
},
{"run": {"name": "fetch hub objects before pytest", "command": "python3 utils/fetch_hub_objects_for_ci.py"}},
{"run": {
"name": "Run tests",
"command": f"({timeout_cmd} python3 -m pytest {marker_cmd} -n {self.pytest_num_workers} {additional_flags} {' '.join(pytest_flags)} $(cat splitted_tests.txt) | tee tests_output.txt)"}
"command": f"({timeout_cmd} python3 -m pytest {marker_cmd} -n {self.pytest_num_workers} {junit_flags} {repeat_on_failure_flags} {' '.join(pytest_flags)} $(cat splitted_tests.txt) | tee tests_output.txt)"}
},
{"run": {"name": "Expand to show skipped tests", "when": "always", "command": f"python3 .circleci/parse_test_outputs.py --file tests_output.txt --skip"}},
{"run": {"name": "Failed tests: show reasons", "when": "always", "command": f"python3 .circleci/parse_test_outputs.py --file tests_output.txt --fail"}},
@ -163,58 +197,39 @@ class CircleCIJob:
# JOBS
torch_and_tf_job = CircleCIJob(
"torch_and_tf",
docker_image=[{"image":"huggingface/transformers-torch-tf-light"}],
additional_env={"RUN_PT_TF_CROSS_TESTS": True},
marker="is_pt_tf_cross_test",
pytest_options={"rA": None, "durations": 0},
)
torch_and_flax_job = CircleCIJob(
"torch_and_flax",
additional_env={"RUN_PT_FLAX_CROSS_TESTS": True},
docker_image=[{"image":"huggingface/transformers-torch-jax-light"}],
marker="is_pt_flax_cross_test",
pytest_options={"rA": None, "durations": 0},
)
torch_job = CircleCIJob(
"torch",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="not generate",
parallelism=6,
pytest_num_workers=8
)
generate_job = CircleCIJob(
"generate",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
# networkx==3.3 (after #36957) cause some issues
# TODO: remove this once it works directly
install_steps=["uv venv && uv pip install . && uv pip install networkx==3.2.1"],
marker="generate",
parallelism=6,
pytest_num_workers=8
)
tokenization_job = CircleCIJob(
"tokenization",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=16
)
processor_job = CircleCIJob(
"processors",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=6
)
tf_job = CircleCIJob(
"tf",
docker_image=[{"image":"huggingface/transformers-tf-light"}],
parallelism=6,
pytest_num_workers=16,
)
@ -222,7 +237,8 @@ flax_job = CircleCIJob(
"flax",
docker_image=[{"image":"huggingface/transformers-jax-light"}],
parallelism=6,
pytest_num_workers=16
pytest_num_workers=16,
resource_class="2xlarge",
)
@ -231,7 +247,7 @@ pipelines_torch_job = CircleCIJob(
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-torch-light"}],
marker="is_pipeline_test",
parallelism=4
parallelism=4,
)
@ -240,7 +256,7 @@ pipelines_tf_job = CircleCIJob(
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-tf-light"}],
marker="is_pipeline_test",
parallelism=4
parallelism=4,
)
@ -257,7 +273,7 @@ examples_torch_job = CircleCIJob(
docker_image=[{"image":"huggingface/transformers-examples-torch"}],
# TODO @ArthurZucker remove this once docker is easier to build
install_steps=["uv venv && uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
pytest_num_workers=8,
pytest_num_workers=4,
)
@ -265,7 +281,7 @@ examples_tensorflow_job = CircleCIJob(
"examples_tensorflow",
additional_env={"OMP_NUM_THREADS": 8},
docker_image=[{"image":"huggingface/transformers-examples-tf"}],
pytest_num_workers=16,
pytest_num_workers=2,
)
@ -280,6 +296,7 @@ hub_job = CircleCIJob(
],
marker="is_staging_test",
pytest_num_workers=2,
resource_class="medium",
)
@ -292,13 +309,13 @@ onnx_job = CircleCIJob(
],
pytest_options={"k onnx": None},
pytest_num_workers=1,
resource_class="small",
)
exotic_models_job = CircleCIJob(
"exotic_models",
docker_image=[{"image":"huggingface/transformers-exotic-models"}],
pytest_num_workers=12,
parallelism=4,
pytest_options={"durations": 100},
)
@ -315,9 +332,11 @@ repo_utils_job = CircleCIJob(
non_model_job = CircleCIJob(
"non_model",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
# networkx==3.3 (after #36957) cause some issues
# TODO: remove this once it works directly
install_steps=["uv venv && uv pip install . && uv pip install networkx==3.2.1"],
marker="not generate",
parallelism=6,
pytest_num_workers=8,
)
@ -345,13 +364,14 @@ doc_test_job = CircleCIJob(
pytest_num_workers=1,
)
REGULAR_TESTS = [torch_and_tf_job, torch_and_flax_job, torch_job, tf_job, flax_job, hub_job, onnx_job, tokenization_job, processor_job, generate_job, non_model_job] # fmt: skip
EXAMPLES_TESTS = [examples_torch_job, examples_tensorflow_job]
PIPELINE_TESTS = [pipelines_torch_job, pipelines_tf_job]
REGULAR_TESTS = [torch_job, flax_job, hub_job, onnx_job, tokenization_job, processor_job, generate_job, non_model_job] # fmt: skip
EXAMPLES_TESTS = [examples_torch_job]
PIPELINE_TESTS = [pipelines_torch_job]
REPO_UTIL_TESTS = [repo_utils_job]
DOC_TESTS = [doc_test_job]
ALL_TESTS = REGULAR_TESTS + EXAMPLES_TESTS + PIPELINE_TESTS + REPO_UTIL_TESTS + DOC_TESTS + [custom_tokenizers_job] + [exotic_models_job] # fmt: skip
def create_circleci_config(folder=None):
if folder is None:
folder = os.getcwd()
@ -361,7 +381,13 @@ def create_circleci_config(folder=None):
if len(jobs) == 0:
jobs = [EmptyJob()]
print("Full list of job name inputs", {j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs})
else:
print("Full list of job name inputs", {j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs})
# Add a job waiting all the test jobs and aggregate their test summary files at the end
collection_job = EmptyJob()
collection_job.job_name = "collection_job"
jobs = [collection_job] + jobs
config = {
"version": "2.1",
"parameters": {
@ -371,9 +397,14 @@ def create_circleci_config(folder=None):
**{j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs},
**{j.job_name + "_parallelism":{"type":"integer", "default":1} for j in jobs},
},
"jobs" : {j.job_name: j.to_dict() for j in jobs},
"workflows": {"version": 2, "run_tests": {"jobs": [j.job_name for j in jobs]}}
"jobs": {j.job_name: j.to_dict() for j in jobs}
}
if "CIRCLE_TOKEN" in os.environ:
# For private forked repo. (e.g. new model addition)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [{j.job_name: {"context": ["TRANSFORMERS_CONTEXT"]}} for j in jobs]}}
else:
# For public repo. (e.g. `transformers`)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [j.job_name for j in jobs]}}
with open(os.path.join(folder, "generated_config.yml"), "w") as f:
f.write(yaml.dump(config, sort_keys=False, default_flow_style=False).replace("' << pipeline", " << pipeline").replace(">> '", " >>"))

View File

@ -38,21 +38,21 @@ body:
- text models: @ArthurZucker
- vision models: @amyeroberts, @qubvel
- speech models: @ylacombe, @eustlb
- speech models: @eustlb
- graph models: @clefourrier
Library:
- flax: @sanchit-gandhi
- flax: @gante and @Rocketknight1
- generate: @zucchini-nlp (visual-language models) or @gante (all others)
- pipelines: @Rocketknight1
- tensorflow: @gante and @Rocketknight1
- tokenizers: @ArthurZucker and @itazap
- trainer: @muellerzr @SunMarc
- trainer: @zach-huggingface @SunMarc
Integrations:
- deepspeed: HF Trainer/Accelerate: @muellerzr
- deepspeed: HF Trainer/Accelerate: @SunMarc @zach-huggingface
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc @MekkCyber
@ -72,7 +72,7 @@ body:
Maintained examples (not research project or legacy):
- Flax: @sanchit-gandhi
- Flax: @Rocketknight1
- PyTorch: See Models above and tag the person corresponding to the modality of the example.
- TensorFlow: @Rocketknight1
@ -106,6 +106,7 @@ body:
label: Reproduction
description: |
Please provide a code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
Please include relevant config information with your code, for example your Trainers, TRL, Peft, and DeepSpeed configs.
If you have code snippets, error messages, stack traces please provide them here as well.
Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Do not use screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.

View File

@ -41,22 +41,22 @@ Models:
- text models: @ArthurZucker
- vision models: @amyeroberts, @qubvel
- speech models: @ylacombe, @eustlb
- speech models: @eustlb
- graph models: @clefourrier
Library:
- flax: @sanchit-gandhi
- flax: @gante and @Rocketknight1
- generate: @zucchini-nlp (visual-language models) or @gante (all others)
- pipelines: @Rocketknight1
- tensorflow: @gante and @Rocketknight1
- tokenizers: @ArthurZucker
- trainer: @muellerzr and @SunMarc
- trainer: @zach-huggingface and @SunMarc
- chat templates: @Rocketknight1
Integrations:
- deepspeed: HF Trainer/Accelerate: @muellerzr
- deepspeed: HF Trainer/Accelerate: @SunMarc @zach-huggingface
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc @MekkCyber
@ -72,7 +72,7 @@ HF projects:
Maintained examples (not research project or legacy):
- Flax: @sanchit-gandhi
- Flax: @Rocketknight1
- PyTorch: See Models above and tag the person corresponding to the modality of the example.
- TensorFlow: @Rocketknight1

102
.github/scripts/assign_reviewers.py vendored Normal file
View File

@ -0,0 +1,102 @@
# coding=utf-8
# Copyright 2025 the HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import github
import json
from github import Github
import re
from collections import Counter
from pathlib import Path
def pattern_to_regex(pattern):
if pattern.startswith("/"):
start_anchor = True
pattern = re.escape(pattern[1:])
else:
start_anchor = False
pattern = re.escape(pattern)
# Replace `*` with "any number of non-slash characters"
pattern = pattern.replace(r"\*", "[^/]*")
if start_anchor:
pattern = r"^\/?" + pattern # Allow an optional leading slash after the start of the string
return pattern
def get_file_owners(file_path, codeowners_lines):
# Process lines in reverse (last matching pattern takes precedence)
for line in reversed(codeowners_lines):
# Skip comments and empty lines, strip inline comments
line = line.split('#')[0].strip()
if not line:
continue
# Split into pattern and owners
parts = line.split()
pattern = parts[0]
# Can be empty, e.g. for dummy files with explicitly no owner!
owners = [owner.removeprefix("@") for owner in parts[1:]]
# Check if file matches pattern
file_regex = pattern_to_regex(pattern)
if re.search(file_regex, file_path) is not None:
return owners # Remember, can still be empty!
return [] # Should never happen, but just in case
def main():
script_dir = Path(__file__).parent.absolute()
with open(script_dir / "codeowners_for_review_action") as f:
codeowners_lines = f.readlines()
g = Github(os.environ['GITHUB_TOKEN'])
repo = g.get_repo("huggingface/transformers")
with open(os.environ['GITHUB_EVENT_PATH']) as f:
event = json.load(f)
# The PR number is available in the event payload
pr_number = event['pull_request']['number']
pr = repo.get_pull(pr_number)
pr_author = pr.user.login
existing_reviews = list(pr.get_reviews())
if existing_reviews:
print(f"Already has reviews: {[r.user.login for r in existing_reviews]}")
return
users_requested, teams_requested = pr.get_review_requests()
users_requested = list(users_requested)
if users_requested:
print(f"Reviewers already requested: {users_requested}")
return
locs_per_owner = Counter()
for file in pr.get_files():
owners = get_file_owners(file.filename, codeowners_lines)
for owner in owners:
locs_per_owner[owner] += file.changes
# Assign the top 2 based on locs changed as reviewers, but skip the owner if present
locs_per_owner.pop(pr_author, None)
top_owners = locs_per_owner.most_common(2)
print("Top owners", top_owners)
top_owners = [owner[0] for owner in top_owners]
try:
pr.create_review_request(top_owners)
except github.GithubException as e:
print(f"Failed to request review for {top_owners}: {e}")
if __name__ == "__main__":
main()

View File

@ -0,0 +1,370 @@
# Top-level rules are matched only if nothing else matches
* @Rocketknight1 @ArthurZucker # if no one is pinged based on the other rules, he will do the dispatch
*.md @stevhliu
*tokenization* @ArthurZucker
docs/ @stevhliu
/benchmark/ @McPatate
/docker/ @ydshieh @ArthurZucker
# More high-level globs catch cases when specific rules later don't apply
/src/transformers/models/*/processing* @molbap @yonigozlan @qubvel
/src/transformers/models/*/image_processing* @qubvel
/src/transformers/models/*/image_processing_*_fast* @yonigozlan
# Owners of subsections of the library
/src/transformers/generation/ @gante
/src/transformers/pipeline/ @Rocketknight1 @yonigozlan
/src/transformers/integrations/ @SunMarc @MekkCyber @zach-huggingface
/src/transformers/quantizers/ @SunMarc @MekkCyber
tests/ @ydshieh
tests/generation/ @gante
/src/transformers/models/auto/ @ArthurZucker
/src/transformers/utils/ @ArthurZucker @Rocketknight1
/src/transformers/loss/ @ArthurZucker
/src/transformers/onnx/ @michaelbenayoun
# Specific files come after the sections/globs, so they take priority
/.circleci/config.yml @ArthurZucker @ydshieh
/utils/tests_fetcher.py @ydshieh
trainer.py @zach-huggingface @SunMarc
trainer_utils.py @zach-huggingface @SunMarc
/utils/modular_model_converter.py @Cyrilvallez @ArthurZucker
# Owners of individual models are specific / high priority, and so they come last
# mod* captures modeling and modular files
# Text models
/src/transformers/models/albert/mod*_albert* @ArthurZucker
/src/transformers/models/bamba/mod*_bamba* @ArthurZucker
/src/transformers/models/bart/mod*_bart* @ArthurZucker
/src/transformers/models/barthez/mod*_barthez* @ArthurZucker
/src/transformers/models/bartpho/mod*_bartpho* @ArthurZucker
/src/transformers/models/bert/mod*_bert* @ArthurZucker
/src/transformers/models/bert_generation/mod*_bert_generation* @ArthurZucker
/src/transformers/models/bert_japanese/mod*_bert_japanese* @ArthurZucker
/src/transformers/models/bertweet/mod*_bertweet* @ArthurZucker
/src/transformers/models/big_bird/mod*_big_bird* @ArthurZucker
/src/transformers/models/bigbird_pegasus/mod*_bigbird_pegasus* @ArthurZucker
/src/transformers/models/biogpt/mod*_biogpt* @ArthurZucker
/src/transformers/models/blenderbot/mod*_blenderbot* @ArthurZucker
/src/transformers/models/blenderbot_small/mod*_blenderbot_small* @ArthurZucker
/src/transformers/models/bloom/mod*_bloom* @ArthurZucker
/src/transformers/models/bort/mod*_bort* @ArthurZucker
/src/transformers/models/byt5/mod*_byt5* @ArthurZucker
/src/transformers/models/camembert/mod*_camembert* @ArthurZucker
/src/transformers/models/canine/mod*_canine* @ArthurZucker
/src/transformers/models/codegen/mod*_codegen* @ArthurZucker
/src/transformers/models/code_llama/mod*_code_llama* @ArthurZucker
/src/transformers/models/cohere/mod*_cohere* @ArthurZucker
/src/transformers/models/cohere2/mod*_cohere2* @ArthurZucker
/src/transformers/models/convbert/mod*_convbert* @ArthurZucker
/src/transformers/models/cpm/mod*_cpm* @ArthurZucker
/src/transformers/models/cpmant/mod*_cpmant* @ArthurZucker
/src/transformers/models/ctrl/mod*_ctrl* @ArthurZucker
/src/transformers/models/dbrx/mod*_dbrx* @ArthurZucker
/src/transformers/models/deberta/mod*_deberta* @ArthurZucker
/src/transformers/models/deberta_v2/mod*_deberta_v2* @ArthurZucker
/src/transformers/models/dialogpt/mod*_dialogpt* @ArthurZucker
/src/transformers/models/diffllama/mod*_diffllama* @ArthurZucker
/src/transformers/models/distilbert/mod*_distilbert* @ArthurZucker
/src/transformers/models/dpr/mod*_dpr* @ArthurZucker
/src/transformers/models/electra/mod*_electra* @ArthurZucker
/src/transformers/models/encoder_decoder/mod*_encoder_decoder* @ArthurZucker
/src/transformers/models/ernie/mod*_ernie* @ArthurZucker
/src/transformers/models/ernie_m/mod*_ernie_m* @ArthurZucker
/src/transformers/models/esm/mod*_esm* @ArthurZucker
/src/transformers/models/falcon/mod*_falcon* @ArthurZucker
/src/transformers/models/falcon3/mod*_falcon3* @ArthurZucker
/src/transformers/models/falcon_mamba/mod*_falcon_mamba* @ArthurZucker
/src/transformers/models/fastspeech2_conformer/mod*_fastspeech2_conformer* @ArthurZucker
/src/transformers/models/flan_t5/mod*_flan_t5* @ArthurZucker
/src/transformers/models/flan_ul2/mod*_flan_ul2* @ArthurZucker
/src/transformers/models/flaubert/mod*_flaubert* @ArthurZucker
/src/transformers/models/fnet/mod*_fnet* @ArthurZucker
/src/transformers/models/fsmt/mod*_fsmt* @ArthurZucker
/src/transformers/models/funnel/mod*_funnel* @ArthurZucker
/src/transformers/models/fuyu/mod*_fuyu* @ArthurZucker
/src/transformers/models/gemma/mod*_gemma* @ArthurZucker
/src/transformers/models/gemma2/mod*_gemma2* @ArthurZucker
/src/transformers/models/glm/mod*_glm* @ArthurZucker
/src/transformers/models/openai_gpt/mod*_openai_gpt* @ArthurZucker
/src/transformers/models/gpt_neo/mod*_gpt_neo* @ArthurZucker
/src/transformers/models/gpt_neox/mod*_gpt_neox* @ArthurZucker
/src/transformers/models/gpt_neox_japanese/mod*_gpt_neox_japanese* @ArthurZucker
/src/transformers/models/gptj/mod*_gptj* @ArthurZucker
/src/transformers/models/gpt2/mod*_gpt2* @ArthurZucker
/src/transformers/models/gpt_bigcode/mod*_gpt_bigcode* @ArthurZucker
/src/transformers/models/gptsan_japanese/mod*_gptsan_japanese* @ArthurZucker
/src/transformers/models/gpt_sw3/mod*_gpt_sw3* @ArthurZucker
/src/transformers/models/granite/mod*_granite* @ArthurZucker
/src/transformers/models/granitemoe/mod*_granitemoe* @ArthurZucker
/src/transformers/models/herbert/mod*_herbert* @ArthurZucker
/src/transformers/models/ibert/mod*_ibert* @ArthurZucker
/src/transformers/models/jamba/mod*_jamba* @ArthurZucker
/src/transformers/models/jetmoe/mod*_jetmoe* @ArthurZucker
/src/transformers/models/jukebox/mod*_jukebox* @ArthurZucker
/src/transformers/models/led/mod*_led* @ArthurZucker
/src/transformers/models/llama/mod*_llama* @ArthurZucker @Cyrilvallez
/src/transformers/models/longformer/mod*_longformer* @ArthurZucker
/src/transformers/models/longt5/mod*_longt5* @ArthurZucker
/src/transformers/models/luke/mod*_luke* @ArthurZucker
/src/transformers/models/m2m_100/mod*_m2m_100* @ArthurZucker
/src/transformers/models/madlad_400/mod*_madlad_400* @ArthurZucker
/src/transformers/models/mamba/mod*_mamba* @ArthurZucker
/src/transformers/models/mamba2/mod*_mamba2* @ArthurZucker
/src/transformers/models/marian/mod*_marian* @ArthurZucker
/src/transformers/models/markuplm/mod*_markuplm* @ArthurZucker
/src/transformers/models/mbart/mod*_mbart* @ArthurZucker
/src/transformers/models/mega/mod*_mega* @ArthurZucker
/src/transformers/models/megatron_bert/mod*_megatron_bert* @ArthurZucker
/src/transformers/models/megatron_gpt2/mod*_megatron_gpt2* @ArthurZucker
/src/transformers/models/mistral/mod*_mistral* @ArthurZucker
/src/transformers/models/mixtral/mod*_mixtral* @ArthurZucker
/src/transformers/models/mluke/mod*_mluke* @ArthurZucker
/src/transformers/models/mobilebert/mod*_mobilebert* @ArthurZucker
/src/transformers/models/modernbert/mod*_modernbert* @ArthurZucker
/src/transformers/models/mpnet/mod*_mpnet* @ArthurZucker
/src/transformers/models/mpt/mod*_mpt* @ArthurZucker
/src/transformers/models/mra/mod*_mra* @ArthurZucker
/src/transformers/models/mt5/mod*_mt5* @ArthurZucker
/src/transformers/models/mvp/mod*_mvp* @ArthurZucker
/src/transformers/models/myt5/mod*_myt5* @ArthurZucker
/src/transformers/models/nemotron/mod*_nemotron* @ArthurZucker
/src/transformers/models/nezha/mod*_nezha* @ArthurZucker
/src/transformers/models/nllb/mod*_nllb* @ArthurZucker
/src/transformers/models/nllb_moe/mod*_nllb_moe* @ArthurZucker
/src/transformers/models/nystromformer/mod*_nystromformer* @ArthurZucker
/src/transformers/models/olmo/mod*_olmo* @ArthurZucker
/src/transformers/models/olmo2/mod*_olmo2* @ArthurZucker
/src/transformers/models/olmoe/mod*_olmoe* @ArthurZucker
/src/transformers/models/open_llama/mod*_open_llama* @ArthurZucker
/src/transformers/models/opt/mod*_opt* @ArthurZucker
/src/transformers/models/pegasus/mod*_pegasus* @ArthurZucker
/src/transformers/models/pegasus_x/mod*_pegasus_x* @ArthurZucker
/src/transformers/models/persimmon/mod*_persimmon* @ArthurZucker
/src/transformers/models/phi/mod*_phi* @ArthurZucker
/src/transformers/models/phi3/mod*_phi3* @ArthurZucker
/src/transformers/models/phimoe/mod*_phimoe* @ArthurZucker
/src/transformers/models/phobert/mod*_phobert* @ArthurZucker
/src/transformers/models/plbart/mod*_plbart* @ArthurZucker
/src/transformers/models/prophetnet/mod*_prophetnet* @ArthurZucker
/src/transformers/models/qdqbert/mod*_qdqbert* @ArthurZucker
/src/transformers/models/qwen2/mod*_qwen2* @ArthurZucker
/src/transformers/models/qwen2_moe/mod*_qwen2_moe* @ArthurZucker
/src/transformers/models/rag/mod*_rag* @ArthurZucker
/src/transformers/models/realm/mod*_realm* @ArthurZucker
/src/transformers/models/recurrent_gemma/mod*_recurrent_gemma* @ArthurZucker
/src/transformers/models/reformer/mod*_reformer* @ArthurZucker
/src/transformers/models/rembert/mod*_rembert* @ArthurZucker
/src/transformers/models/retribert/mod*_retribert* @ArthurZucker
/src/transformers/models/roberta/mod*_roberta* @ArthurZucker
/src/transformers/models/roberta_prelayernorm/mod*_roberta_prelayernorm* @ArthurZucker
/src/transformers/models/roc_bert/mod*_roc_bert* @ArthurZucker
/src/transformers/models/roformer/mod*_roformer* @ArthurZucker
/src/transformers/models/rwkv/mod*_rwkv* @ArthurZucker
/src/transformers/models/splinter/mod*_splinter* @ArthurZucker
/src/transformers/models/squeezebert/mod*_squeezebert* @ArthurZucker
/src/transformers/models/stablelm/mod*_stablelm* @ArthurZucker
/src/transformers/models/starcoder2/mod*_starcoder2* @ArthurZucker
/src/transformers/models/switch_transformers/mod*_switch_transformers* @ArthurZucker
/src/transformers/models/t5/mod*_t5* @ArthurZucker
/src/transformers/models/t5v1.1/mod*_t5v1.1* @ArthurZucker
/src/transformers/models/tapex/mod*_tapex* @ArthurZucker
/src/transformers/models/transfo_xl/mod*_transfo_xl* @ArthurZucker
/src/transformers/models/ul2/mod*_ul2* @ArthurZucker
/src/transformers/models/umt5/mod*_umt5* @ArthurZucker
/src/transformers/models/xmod/mod*_xmod* @ArthurZucker
/src/transformers/models/xglm/mod*_xglm* @ArthurZucker
/src/transformers/models/xlm/mod*_xlm* @ArthurZucker
/src/transformers/models/xlm_prophetnet/mod*_xlm_prophetnet* @ArthurZucker
/src/transformers/models/xlm_roberta/mod*_xlm_roberta* @ArthurZucker
/src/transformers/models/xlm_roberta_xl/mod*_xlm_roberta_xl* @ArthurZucker
/src/transformers/models/xlm_v/mod*_xlm_v* @ArthurZucker
/src/transformers/models/xlnet/mod*_xlnet* @ArthurZucker
/src/transformers/models/yoso/mod*_yoso* @ArthurZucker
/src/transformers/models/zamba/mod*_zamba* @ArthurZucker
# Vision models
/src/transformers/models/beit/mod*_beit* @amyeroberts @qubvel
/src/transformers/models/bit/mod*_bit* @amyeroberts @qubvel
/src/transformers/models/conditional_detr/mod*_conditional_detr* @amyeroberts @qubvel
/src/transformers/models/convnext/mod*_convnext* @amyeroberts @qubvel
/src/transformers/models/convnextv2/mod*_convnextv2* @amyeroberts @qubvel
/src/transformers/models/cvt/mod*_cvt* @amyeroberts @qubvel
/src/transformers/models/deformable_detr/mod*_deformable_detr* @amyeroberts @qubvel
/src/transformers/models/deit/mod*_deit* @amyeroberts @qubvel
/src/transformers/models/depth_anything/mod*_depth_anything* @amyeroberts @qubvel
/src/transformers/models/depth_anything_v2/mod*_depth_anything_v2* @amyeroberts @qubvel
/src/transformers/models/deta/mod*_deta* @amyeroberts @qubvel
/src/transformers/models/detr/mod*_detr* @amyeroberts @qubvel
/src/transformers/models/dinat/mod*_dinat* @amyeroberts @qubvel
/src/transformers/models/dinov2/mod*_dinov2* @amyeroberts @qubvel
/src/transformers/models/dinov2_with_registers/mod*_dinov2_with_registers* @amyeroberts @qubvel
/src/transformers/models/dit/mod*_dit* @amyeroberts @qubvel
/src/transformers/models/dpt/mod*_dpt* @amyeroberts @qubvel
/src/transformers/models/efficientformer/mod*_efficientformer* @amyeroberts @qubvel
/src/transformers/models/efficientnet/mod*_efficientnet* @amyeroberts @qubvel
/src/transformers/models/focalnet/mod*_focalnet* @amyeroberts @qubvel
/src/transformers/models/glpn/mod*_glpn* @amyeroberts @qubvel
/src/transformers/models/hiera/mod*_hiera* @amyeroberts @qubvel
/src/transformers/models/ijepa/mod*_ijepa* @amyeroberts @qubvel
/src/transformers/models/imagegpt/mod*_imagegpt* @amyeroberts @qubvel
/src/transformers/models/levit/mod*_levit* @amyeroberts @qubvel
/src/transformers/models/mask2former/mod*_mask2former* @amyeroberts @qubvel
/src/transformers/models/maskformer/mod*_maskformer* @amyeroberts @qubvel
/src/transformers/models/mobilenet_v1/mod*_mobilenet_v1* @amyeroberts @qubvel
/src/transformers/models/mobilenet_v2/mod*_mobilenet_v2* @amyeroberts @qubvel
/src/transformers/models/mobilevit/mod*_mobilevit* @amyeroberts @qubvel
/src/transformers/models/mobilevitv2/mod*_mobilevitv2* @amyeroberts @qubvel
/src/transformers/models/nat/mod*_nat* @amyeroberts @qubvel
/src/transformers/models/poolformer/mod*_poolformer* @amyeroberts @qubvel
/src/transformers/models/pvt/mod*_pvt* @amyeroberts @qubvel
/src/transformers/models/pvt_v2/mod*_pvt_v2* @amyeroberts @qubvel
/src/transformers/models/regnet/mod*_regnet* @amyeroberts @qubvel
/src/transformers/models/resnet/mod*_resnet* @amyeroberts @qubvel
/src/transformers/models/rt_detr/mod*_rt_detr* @amyeroberts @qubvel
/src/transformers/models/segformer/mod*_segformer* @amyeroberts @qubvel
/src/transformers/models/seggpt/mod*_seggpt* @amyeroberts @qubvel
/src/transformers/models/superpoint/mod*_superpoint* @amyeroberts @qubvel
/src/transformers/models/swiftformer/mod*_swiftformer* @amyeroberts @qubvel
/src/transformers/models/swin/mod*_swin* @amyeroberts @qubvel
/src/transformers/models/swinv2/mod*_swinv2* @amyeroberts @qubvel
/src/transformers/models/swin2sr/mod*_swin2sr* @amyeroberts @qubvel
/src/transformers/models/table_transformer/mod*_table_transformer* @amyeroberts @qubvel
/src/transformers/models/textnet/mod*_textnet* @amyeroberts @qubvel
/src/transformers/models/timm_wrapper/mod*_timm_wrapper* @amyeroberts @qubvel
/src/transformers/models/upernet/mod*_upernet* @amyeroberts @qubvel
/src/transformers/models/van/mod*_van* @amyeroberts @qubvel
/src/transformers/models/vit/mod*_vit* @amyeroberts @qubvel
/src/transformers/models/vit_hybrid/mod*_vit_hybrid* @amyeroberts @qubvel
/src/transformers/models/vitdet/mod*_vitdet* @amyeroberts @qubvel
/src/transformers/models/vit_mae/mod*_vit_mae* @amyeroberts @qubvel
/src/transformers/models/vitmatte/mod*_vitmatte* @amyeroberts @qubvel
/src/transformers/models/vit_msn/mod*_vit_msn* @amyeroberts @qubvel
/src/transformers/models/vitpose/mod*_vitpose* @amyeroberts @qubvel
/src/transformers/models/yolos/mod*_yolos* @amyeroberts @qubvel
/src/transformers/models/zoedepth/mod*_zoedepth* @amyeroberts @qubvel
# Audio models
/src/transformers/models/audio_spectrogram_transformer/mod*_audio_spectrogram_transformer* @eustlb
/src/transformers/models/bark/mod*_bark* @eustlb
/src/transformers/models/clap/mod*_clap* @eustlb
/src/transformers/models/dac/mod*_dac* @eustlb
/src/transformers/models/encodec/mod*_encodec* @eustlb
/src/transformers/models/hubert/mod*_hubert* @eustlb
/src/transformers/models/mctct/mod*_mctct* @eustlb
/src/transformers/models/mimi/mod*_mimi* @eustlb
/src/transformers/models/mms/mod*_mms* @eustlb
/src/transformers/models/moshi/mod*_moshi* @eustlb
/src/transformers/models/musicgen/mod*_musicgen* @eustlb
/src/transformers/models/musicgen_melody/mod*_musicgen_melody* @eustlb
/src/transformers/models/pop2piano/mod*_pop2piano* @eustlb
/src/transformers/models/seamless_m4t/mod*_seamless_m4t* @eustlb
/src/transformers/models/seamless_m4t_v2/mod*_seamless_m4t_v2* @eustlb
/src/transformers/models/sew/mod*_sew* @eustlb
/src/transformers/models/sew_d/mod*_sew_d* @eustlb
/src/transformers/models/speech_to_text/mod*_speech_to_text* @eustlb
/src/transformers/models/speech_to_text_2/mod*_speech_to_text_2* @eustlb
/src/transformers/models/speecht5/mod*_speecht5* @eustlb
/src/transformers/models/unispeech/mod*_unispeech* @eustlb
/src/transformers/models/unispeech_sat/mod*_unispeech_sat* @eustlb
/src/transformers/models/univnet/mod*_univnet* @eustlb
/src/transformers/models/vits/mod*_vits* @eustlb
/src/transformers/models/wav2vec2/mod*_wav2vec2* @eustlb
/src/transformers/models/wav2vec2_bert/mod*_wav2vec2_bert* @eustlb
/src/transformers/models/wav2vec2_conformer/mod*_wav2vec2_conformer* @eustlb
/src/transformers/models/wav2vec2_phoneme/mod*_wav2vec2_phoneme* @eustlb
/src/transformers/models/wavlm/mod*_wavlm* @eustlb
/src/transformers/models/whisper/mod*_whisper* @eustlb
/src/transformers/models/xls_r/mod*_xls_r* @eustlb
/src/transformers/models/xlsr_wav2vec2/mod*_xlsr_wav2vec2* @eustlb
# Video models
/src/transformers/models/timesformer/mod*_timesformer* @Rocketknight1
/src/transformers/models/videomae/mod*_videomae* @Rocketknight1
/src/transformers/models/vivit/mod*_vivit* @Rocketknight1
# Multimodal models
/src/transformers/models/align/mod*_align* @zucchini-nlp
/src/transformers/models/altclip/mod*_altclip* @zucchini-nlp
/src/transformers/models/aria/mod*_aria* @zucchini-nlp
/src/transformers/models/blip/mod*_blip* @zucchini-nlp
/src/transformers/models/blip_2/mod*_blip_2* @zucchini-nlp
/src/transformers/models/bridgetower/mod*_bridgetower* @zucchini-nlp
/src/transformers/models/bros/mod*_bros* @zucchini-nlp
/src/transformers/models/chameleon/mod*_chameleon* @zucchini-nlp
/src/transformers/models/chinese_clip/mod*_chinese_clip* @zucchini-nlp
/src/transformers/models/clip/mod*_clip* @zucchini-nlp
/src/transformers/models/clipseg/mod*_clipseg* @zucchini-nlp
/src/transformers/models/clvp/mod*_clvp* @zucchini-nlp
/src/transformers/models/colpali/mod*_colpali* @zucchini-nlp @yonigozlan
/src/transformers/models/data2vec/mod*_data2vec* @zucchini-nlp
/src/transformers/models/deplot/mod*_deplot* @zucchini-nlp
/src/transformers/models/donut/mod*_donut* @zucchini-nlp
/src/transformers/models/flava/mod*_flava* @zucchini-nlp
/src/transformers/models/git/mod*_git* @zucchini-nlp
/src/transformers/models/grounding_dino/mod*_grounding_dino* @qubvel
/src/transformers/models/groupvit/mod*_groupvit* @zucchini-nlp
/src/transformers/models/idefics/mod*_idefics* @zucchini-nlp
/src/transformers/models/idefics2/mod*_idefics2* @zucchini-nlp
/src/transformers/models/idefics3/mod*_idefics3* @zucchini-nlp
/src/transformers/models/instructblip/mod*_instructblip* @zucchini-nlp
/src/transformers/models/instructblipvideo/mod*_instructblipvideo* @zucchini-nlp
/src/transformers/models/kosmos_2/mod*_kosmos_2* @zucchini-nlp
/src/transformers/models/layoutlm/mod*_layoutlm* @NielsRogge
/src/transformers/models/layoutlmv2/mod*_layoutlmv2* @NielsRogge
/src/transformers/models/layoutlmv3/mod*_layoutlmv3* @NielsRogge
/src/transformers/models/layoutxlm/mod*_layoutxlm* @NielsRogge
/src/transformers/models/lilt/mod*_lilt* @zucchini-nlp
/src/transformers/models/llava/mod*_llava* @zucchini-nlp @arthurzucker
/src/transformers/models/llava_next/mod*_llava_next* @zucchini-nlp
/src/transformers/models/llava_next_video/mod*_llava_next_video* @zucchini-nlp
/src/transformers/models/llava_onevision/mod*_llava_onevision* @zucchini-nlp
/src/transformers/models/lxmert/mod*_lxmert* @zucchini-nlp
/src/transformers/models/matcha/mod*_matcha* @zucchini-nlp
/src/transformers/models/mgp_str/mod*_mgp_str* @zucchini-nlp
/src/transformers/models/mllama/mod*_mllama* @zucchini-nlp
/src/transformers/models/nougat/mod*_nougat* @NielsRogge
/src/transformers/models/omdet_turbo/mod*_omdet_turbo* @qubvel @yonigozlan
/src/transformers/models/oneformer/mod*_oneformer* @zucchini-nlp
/src/transformers/models/owlvit/mod*_owlvit* @qubvel
/src/transformers/models/owlv2/mod*_owlv2* @qubvel
/src/transformers/models/paligemma/mod*_paligemma* @zucchini-nlp @molbap
/src/transformers/models/perceiver/mod*_perceiver* @zucchini-nlp
/src/transformers/models/pix2struct/mod*_pix2struct* @zucchini-nlp
/src/transformers/models/pixtral/mod*_pixtral* @zucchini-nlp @ArthurZucker
/src/transformers/models/qwen2_audio/mod*_qwen2_audio* @zucchini-nlp @ArthurZucker
/src/transformers/models/qwen2_vl/mod*_qwen2_vl* @zucchini-nlp @ArthurZucker
/src/transformers/models/sam/mod*_sam* @zucchini-nlp @ArthurZucker
/src/transformers/models/siglip/mod*_siglip* @zucchini-nlp
/src/transformers/models/speech_encoder_decoder/mod*_speech_encoder_decoder* @zucchini-nlp
/src/transformers/models/tapas/mod*_tapas* @NielsRogge
/src/transformers/models/trocr/mod*_trocr* @zucchini-nlp
/src/transformers/models/tvlt/mod*_tvlt* @zucchini-nlp
/src/transformers/models/tvp/mod*_tvp* @zucchini-nlp
/src/transformers/models/udop/mod*_udop* @zucchini-nlp
/src/transformers/models/video_llava/mod*_video_llava* @zucchini-nlp
/src/transformers/models/vilt/mod*_vilt* @zucchini-nlp
/src/transformers/models/vipllava/mod*_vipllava* @zucchini-nlp
/src/transformers/models/vision_encoder_decoder/mod*_vision_encoder_decoder* @Rocketknight1
/src/transformers/models/vision_text_dual_encoder/mod*_vision_text_dual_encoder* @Rocketknight1
/src/transformers/models/visual_bert/mod*_visual_bert* @zucchini-nlp
/src/transformers/models/xclip/mod*_xclip* @zucchini-nlp
# Reinforcement learning models
/src/transformers/models/decision_transformer/mod*_decision_transformer* @Rocketknight1
/src/transformers/models/trajectory_transformer/mod*_trajectory_transformer* @Rocketknight1
# Time series models
/src/transformers/models/autoformer/mod*_autoformer* @Rocketknight1
/src/transformers/models/informer/mod*_informer* @Rocketknight1
/src/transformers/models/patchtsmixer/mod*_patchtsmixer* @Rocketknight1
/src/transformers/models/patchtst/mod*_patchtst* @Rocketknight1
/src/transformers/models/time_series_transformer/mod*_time_series_transformer* @Rocketknight1
# Graph models
/src/transformers/models/graphormer/mod*_graphormer* @clefourrier
# Finally, files with no owners that shouldn't generate pings, usually automatically generated and checked in the CI
utils/dummy*

26
.github/workflows/assign-reviewers.yml vendored Normal file
View File

@ -0,0 +1,26 @@
name: Assign PR Reviewers
on:
pull_request_target:
branches:
- main
types: [ready_for_review]
jobs:
assign_reviewers:
permissions:
pull-requests: write
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install PyGithub
- name: Run assignment script
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: python .github/scripts/assign_reviewers.py

View File

@ -18,7 +18,8 @@ jobs:
name: Benchmark
strategy:
matrix:
group: [aws-g5-4xlarge-cache, aws-p4d-24xlarge-plus]
# group: [aws-g5-4xlarge-cache, aws-p4d-24xlarge-plus] (A100 runner is not enabled)
group: [aws-g5-4xlarge-cache]
runs-on:
group: ${{ matrix.group }}
if: |
@ -63,7 +64,7 @@ jobs:
commit_id=$GITHUB_SHA
fi
commit_msg=$(git show -s --format=%s | cut -c1-70)
python3 benchmark/llama.py "${{ github.head_ref || github.ref_name }}" "$commit_id" "$commit_msg"
python3 benchmark/benchmarks_entrypoint.py "$BRANCH_NAME" "$commit_id" "$commit_msg"
env:
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
# Enable this to see debug logs
@ -72,3 +73,4 @@ jobs:
PGHOST: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGHOST }}
PGUSER: transformers_benchmarks
PGPASSWORD: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGPASSWORD }}
BRANCH_NAME: ${{ github.head_ref || github.ref_name }}

View File

@ -26,7 +26,7 @@ jobs:
strategy:
matrix:
file: ["quality", "consistency", "custom-tokenizers", "torch-light", "tf-light", "exotic-models", "torch-tf-light", "torch-jax-light", "jax-light", "examples-torch", "examples-tf"]
file: ["quality", "consistency", "custom-tokenizers", "torch-light", "tf-light", "exotic-models", "torch-tf-light", "jax-light", "examples-torch", "examples-tf"]
continue-on-error: true
steps:
@ -34,11 +34,11 @@ jobs:
name: Set tag
run: |
if ${{contains(github.event.head_commit.message, '[build-ci-image]')}}; then
echo "TAG=huggingface/transformers-${{ matrix.file }}:dev" >> "$GITHUB_ENV"
echo "TAG=huggingface/transformers-${{ matrix.file }}:dev" >> "$GITHUB_ENV"
echo "setting it to DEV!"
else
echo "TAG=huggingface/transformers-${{ matrix.file }}" >> "$GITHUB_ENV"
fi
-
name: Set up Docker Buildx

View File

@ -15,4 +15,3 @@ jobs:
pr_number: ${{ github.event.number }}
package: transformers
languages: ar de en es fr hi it ko pt tr zh ja te
custom_container: huggingface/transformers-doc-builder

View File

@ -0,0 +1,25 @@
name: Change PR to draft
on:
pull_request_target:
types: [opened, reopened]
jobs:
convert_pr_to_draft:
runs-on: ubuntu-22.04
name: Convert PR to draft
permissions:
pull-requests: write
contents: write
if: github.event.pull_request.draft == false
steps:
- name: Convert PR to draft
shell: bash
env:
PR_NUMBER: ${{ github.event.number }}
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
REPO: ${{ github.repository }}
run: |
echo $PR_NUMBER
gh pr ready $PR_NUMBER --repo $REPO --undo
gh pr comment $PR_NUMBER --repo $REPO --body "Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the \`Ready for review\` button (at the bottom of the PR page). This will assign reviewers and trigger CI."

View File

@ -22,7 +22,6 @@ env:
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1

View File

@ -30,7 +30,6 @@ env:
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:

View File

@ -30,7 +30,6 @@ env:
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:

View File

@ -0,0 +1,68 @@
# Used to notify core maintainers about new model PR being merged
name: New model PR merged notification
on:
push:
branches:
- main
paths:
- 'src/transformers/models/*/modeling_*'
jobs:
notify_new_model:
name: Notify new model
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check new model
shell: bash
run: |
python -m pip install gitpython
python -c 'from utils.pr_slow_ci_models import get_new_model; new_model = get_new_model(diff_with_last_commit=True); print(new_model)' | tee output.txt
echo "NEW_MODEL=$(tail -n 1 output.txt)" >> $GITHUB_ENV
echo "COMMIT_SHA=$(git log -1 --format=%H)" >> $GITHUB_ENV
- name: print commit sha
if: ${{ env.NEW_MODEL != ''}}
shell: bash
run: |
echo "$COMMIT_SHA"
- name: print new model
if: ${{ env.NEW_MODEL != ''}}
shell: bash
run: |
echo "$NEW_MODEL"
- name: Notify
if: ${{ env.NEW_MODEL != ''}}
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
with:
# Slack channel id, channel name, or user id to post message.
# See also: https://api.slack.com/methods/chat.postMessage#channels
channel-id: transformers-new-model-notification
# For posting a rich message using Block Kit
payload: |
{
"blocks": [
{
"type": "header",
"text": {
"type": "plain_text",
"text": "New model!",
"emoji": true
}
},
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "<https://github.com/huggingface/transformers/commit/${{ env.COMMIT_SHA }}|New model: ${{ env.NEW_MODEL }}> GH_ArthurZucker, GH_lysandrejik, GH_ydshieh"
}
}
]
}
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}

View File

@ -7,14 +7,13 @@ on:
env:
OUTPUT_SLACK_CHANNEL_ID: "C06L2SGMEEA"
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
jobs:
get_modified_models:
@ -25,13 +24,13 @@ jobs:
steps:
- name: Check out code
uses: actions/checkout@v4
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@3f54ebb830831fc121d3263c1857cfbdc310cdb9 #v42
uses: tj-actions/changed-files@1c8e6069583811afb28f97afeaf8e7da80c6be5c
with:
files: src/transformers/models/**
- name: Run step if only the files listed above change
if: steps.changed-files.outputs.any_changed == 'true'
id: set-matrix
@ -60,41 +59,41 @@ jobs:
if: ${{ needs.get_modified_models.outputs.matrix != '[]' && needs.get_modified_models.outputs.matrix != '' && fromJson(needs.get_modified_models.outputs.matrix)[0] != null }}
strategy:
fail-fast: false
matrix:
matrix:
model-name: ${{ fromJson(needs.get_modified_models.outputs.matrix) }}
steps:
- name: Check out code
uses: actions/checkout@v4
- name: Install locally transformers & other libs
run: |
apt install sudo
sudo -H pip install --upgrade pip
sudo -H pip uninstall -y transformers
sudo -H pip install -U -e ".[testing]"
sudo -H pip uninstall -y transformers
sudo -H pip install -U -e ".[testing]"
MAX_JOBS=4 pip install flash-attn --no-build-isolation
pip install bitsandbytes
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Show installed libraries and their versions
run: pip freeze
- name: Run FA2 tests
id: run_fa2_tests
run:
pytest -rsfE -m "flash_attn_test" --make-reports=${{ matrix.model-name }}_fa2_tests/ tests/${{ matrix.model-name }}/test_modeling_*
- name: "Test suite reports artifacts: ${{ matrix.model-name }}_fa2_tests"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.model-name }}_fa2_tests
path: /transformers/reports/${{ matrix.model-name }}_fa2_tests
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
@ -103,13 +102,13 @@ jobs:
title: 🤗 Results of the FA2 tests - ${{ matrix.model-name }}
status: ${{ steps.run_fa2_tests.conclusion}}
slack_token: ${{ secrets.CI_SLACK_BOT_TOKEN }}
- name: Run integration tests
id: run_integration_tests
if: always()
run:
pytest -rsfE -k "IntegrationTest" --make-reports=tests_integration_${{ matrix.model-name }} tests/${{ matrix.model-name }}/test_modeling_*
- name: "Test suite reports artifacts: tests_integration_${{ matrix.model-name }}"
if: ${{ always() }}
uses: actions/upload-artifact@v4
@ -119,7 +118,7 @@ jobs:
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ env.OUTPUT_SLACK_CHANNEL_ID }}
title: 🤗 Results of the Integration tests - ${{ matrix.model-name }}
@ -134,10 +133,3 @@ jobs:
slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
waitForSSH: true
benchmark:
name: Benchmark workflow
needs: get_modified_models
if: ${{ needs.get_modified_models.outputs.matrix != '[]' && needs.get_modified_models.outputs.matrix != '' && fromJson(needs.get_modified_models.outputs.matrix)[0] != null }}
uses: ./.github/workflows/benchmark.yml
secrets: inherit

416
.github/workflows/self-comment-ci.yml vendored Normal file
View File

@ -0,0 +1,416 @@
name: PR comment GitHub CI
on:
issue_comment:
types:
- created
branches-ignore:
- main
concurrency:
group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow') }}
cancel-in-progress: true
permissions: read-all
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
CUDA_VISIBLE_DEVICES: 0,1
jobs:
get-pr-number:
runs-on: ubuntu-22.04
name: Get PR number
# For security: only allow team members to run
if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "qubvel", "molbap", "gante", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "muellerzr", "eustlb"]'), github.actor) && (startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow')) }}
outputs:
PR_NUMBER: ${{ steps.set_pr_number.outputs.PR_NUMBER }}
steps:
- name: Get PR number
shell: bash
run: |
if [[ "${{ github.event.issue.number }}" != "" && "${{ github.event.issue.pull_request }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
else
echo "PR_NUMBER=" >> $GITHUB_ENV
fi
- name: Check PR number
shell: bash
run: |
echo "${{ env.PR_NUMBER }}"
- name: Set PR number
id: set_pr_number
run: echo "PR_NUMBER=${{ env.PR_NUMBER }}" >> "$GITHUB_OUTPUT"
get-sha:
runs-on: ubuntu-22.04
needs: get-pr-number
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
outputs:
PR_HEAD_SHA: ${{ steps.get_sha.outputs.PR_HEAD_SHA }}
PR_MERGE_SHA: ${{ steps.get_sha.outputs.PR_MERGE_SHA }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
- name: Get SHA (and verify timestamps against the issue comment date)
id: get_sha
env:
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
COMMENT_DATE: ${{ github.event.comment.created_at }}
run: |
git fetch origin refs/pull/$PR_NUMBER/head:refs/remotes/pull/$PR_NUMBER/head
git checkout refs/remotes/pull/$PR_NUMBER/head
echo "PR_HEAD_SHA: $(git log -1 --format=%H)"
echo "PR_HEAD_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
git fetch origin refs/pull/$PR_NUMBER/merge:refs/remotes/pull/$PR_NUMBER/merge
git checkout refs/remotes/pull/$PR_NUMBER/merge
echo "PR_MERGE_SHA: $(git log -1 --format=%H)"
echo "PR_MERGE_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
PR_MERGE_COMMIT_TIMESTAMP=$(git log -1 --date=unix --format=%cd)
echo "PR_MERGE_COMMIT_TIMESTAMP: $PR_MERGE_COMMIT_TIMESTAMP"
COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
echo "COMMENT_DATE: $COMMENT_DATE"
echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
exit -1;
fi
# use a python script to handle this complex logic
# case 1: `run-slow` (auto. infer with limited number of models, but in particular, new model)
# case 2: `run-slow model_1, model_2`
get-tests:
runs-on: ubuntu-22.04
needs: [get-pr-number, get-sha]
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
outputs:
models: ${{ steps.models_to_run.outputs.models }}
quantizations: ${{ steps.models_to_run.outputs.quantizations }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Get models to test
env:
PR_COMMENT: ${{ github.event.comment.body }}
run: |
python -m pip install GitPython
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" | tee output.txt
echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" --quantization | tee output2.txt
echo "quantizations=$(tail -n 1 output2.txt)" >> $GITHUB_ENV
- name: Show models to test
id: models_to_run
run: |
echo "${{ env.models }}"
echo "models=${{ env.models }}" >> $GITHUB_ENV
echo "models=${{ env.models }}" >> $GITHUB_OUTPUT
echo "${{ env.quantizations }}"
echo "quantizations=${{ env.quantizations }}" >> $GITHUB_OUTPUT
reply_to_comment:
name: Reply to the comment
if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-pr-number, get-tests]
permissions:
pull-requests: write
runs-on: ubuntu-22.04
steps:
- name: Reply to the comment
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
MODELS: ${{ needs.get-tests.outputs.models }}
BODY: "This comment contains run-slow, running the specified jobs:\n\nmodels: ${{ needs.get-tests.outputs.models }}\nquantizations: ${{ needs.get-tests.outputs.quantizations }}"
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/issues/${{ needs.get-pr-number.outputs.PR_NUMBER }}/comments \
-f "body=This comment contains run-slow, running the specified jobs: ${{ env.BODY }} ..."
create_run:
name: Create run
if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-sha, get-tests, reply_to_comment]
permissions:
statuses: write
runs-on: ubuntu-22.04
steps:
- name: Create Run
id: create_run
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Create a commit status (pending) for a run of this workflow. The status has to be updated later in `update_run_status`.
# See https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=pending" -f "description=Slow CI job" -f "context=pytest/custom-tests"
run_models_gpu:
name: Run all tests for the model
if: ${{ needs.get-tests.outputs.models != '[]' }}
needs: [get-pr-number, get-sha, get-tests, create_run]
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.get-tests.outputs.models) }}
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ matrix.folders }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
# set the artifact folder names (because the character `/` is not allowed).
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Checkout to PR merge commit
working-directory: /transformers
run: |
git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git log -1 --format=%H
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
working-directory: /transformers
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: |
export CUDA_VISIBLE_DEVICES="$(python3 utils/set_cuda_devices_for_ci.py --test_folder ${{ matrix.folders }})"
echo $CUDA_VISIBLE_DEVICES
python3 -m pytest -v -rsfE --make-reports=${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
run_quantization_torch_gpu:
name: Run all tests for a quantization
if: ${{ needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-pr-number, get-sha, get-tests, create_run]
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.get-tests.outputs.quantizations) }}
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-quantization-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo folder ${{ matrix.folders }}
shell: bash
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'quantization/'/'quantization_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Checkout to PR merge commit
working-directory: /transformers
run: |
git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git log -1 --format=%H
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
working-directory: /transformers
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run quantization tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports
update_run_status:
name: Update Check Run Status
needs: [get-sha, create_run, run_models_gpu, run_quantization_torch_gpu]
permissions:
statuses: write
if: ${{ always() && needs.create_run.result == 'success' }}
runs-on: ubuntu-22.04
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
STATUS_OK: ${{ contains(fromJSON('["skipped", "success"]'), needs.run_models_gpu.result) && contains(fromJSON('["skipped", "success"]'), needs.run_quantization_torch_gpu.result) }}
steps:
- name: Get `run_models_gpu` job status
run: |
echo "${{ needs.run_models_gpu.result }}"
echo "${{ needs.run_quantization_torch_gpu.result }}"
echo $STATUS_OK
if [ "$STATUS_OK" = "true" ]; then
echo "STATUS=success" >> $GITHUB_ENV
else
echo "STATUS=failure" >> $GITHUB_ENV
fi
- name: Update PR commit statuses
run: |
echo "${{ needs.run_models_gpu.result }}"
echo "${{ env.STATUS }}"
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=${{ env.STATUS }}" -f "description=Slow CI job" -f "context=pytest/custom-tests"

View File

@ -21,39 +21,6 @@ jobs:
echo "$(python3 -c 'print(int(${{ github.run_number }}) % 10)')"
echo "run_number=$(python3 -c 'print(int(${{ github.run_number }}) % 10)')" >> $GITHUB_OUTPUT
run_past_ci_pytorch_1-13:
name: PyTorch 1.13
needs: get_number
if: needs.get_number.outputs.run_number == 0 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.13"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_pytorch_1-12:
name: PyTorch 1.12
needs: get_number
if: needs.get_number.outputs.run_number == 1 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.12"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_pytorch_1-11:
name: PyTorch 1.11
needs: get_number
if: needs.get_number.outputs.run_number == 2 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.11"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_tensorflow_2-11:
name: TensorFlow 2.11
needs: get_number

View File

@ -1,151 +0,0 @@
name: PR slow CI
on:
pull_request:
paths:
- "src/transformers/models/*/modeling_*.py"
- "tests/**/test_*.py"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
find_models_to_run:
runs-on: ubuntu-22.04
name: Find models to run slow tests
# Triggered only if the required label `run-slow` is added
if: ${{ contains(github.event.pull_request.labels.*.name, 'run-slow') }}
outputs:
models: ${{ steps.models_to_run.outputs.models }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: ${{ github.event.pull_request.head.sha }}
- name: Get commit message
run: |
echo "commit_message=$(git show -s --format=%s)" >> $GITHUB_ENV
- name: Get models to run slow tests
run: |
echo "${{ env.commit_message }}"
python -m pip install GitPython
python utils/pr_slow_ci_models.py --commit_message "${{ env.commit_message }}" | tee output.txt
echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
- name: Models to run slow tests
id: models_to_run
run: |
echo "${{ env.models }}"
echo "models=${{ env.models }}" >> $GITHUB_OUTPUT
run_models_gpu:
name: Run all tests for the model
# Triggered only `find_models_to_run` is triggered (label `run-slow` is added) which gives the models to run
# (either a new model PR or via a commit message)
if: ${{ needs.find_models_to_run.outputs.models != '[]' }}
needs: find_models_to_run
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.find_models_to_run.outputs.models) }}
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ matrix.folders }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
# set the artifact folder names (because the character `/` is not allowed).
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Update clone
working-directory: /transformers
run: git fetch && git fetch origin pull/${{ github.event.pull_request.number }}/head:pull/${{ github.event.pull_request.number }}/merge && git checkout pull/${{ github.event.pull_request.number }}/merge
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e . && python3 -m pip install --upgrade torch torchaudio torchvision
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: |
export CUDA_VISIBLE_DEVICES="$(python3 utils/set_cuda_devices_for_ci.py --test_folder ${{ matrix.folders }})"
echo $CUDA_VISIBLE_DEVICES
python3 -m pytest -v -rsfE --make-reports=${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports

View File

@ -1,25 +1,25 @@
name: Self-hosted runner (AMD mi210 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi210
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi210
secrets: inherit
name: Self-hosted runner (AMD mi210 CI caller)
on:
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi210
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi210
secrets: inherit

View File

@ -1,25 +1,25 @@
name: Self-hosted runner (AMD mi250 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi250
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi250
secrets: inherit
name: Self-hosted runner (AMD mi250 CI caller)
on:
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi250
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi250
secrets: inherit

View File

@ -1,10 +1,10 @@
name: Self-hosted runner (AMD mi300 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*

View File

@ -14,7 +14,6 @@ env:
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 60
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
jobs:

View File

@ -25,7 +25,7 @@ jobs:
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v41
uses: tj-actions/changed-files@1c8e6069583811afb28f97afeaf8e7da80c6be5c
- name: Was setup changed
id: was_changed
@ -51,4 +51,4 @@ jobs:
needs: build-docker-containers
steps:
- name: Trigger push CI via workflow_run
run: echo "Trigger push CI via workflow_run"
run: echo "Trigger push CI via workflow_run"

View File

@ -24,7 +24,6 @@ env:
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 60
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
@ -293,7 +292,7 @@ jobs:
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Update clone using environment variables
working-directory: /transformers
run: |
@ -406,7 +405,7 @@ jobs:
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Update clone using environment variables
working-directory: /workspace/transformers
run: |
@ -516,7 +515,7 @@ jobs:
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Update clone using environment variables
working-directory: /workspace/transformers
run: |
@ -648,6 +647,6 @@ jobs:
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
run: |
pip install huggingface_hub
pip install slack_sdk
pip install slack_sdk
pip show slack_sdk
python utils/notification_service.py "${{ needs.setup.outputs.matrix }}"

View File

@ -1,55 +1,55 @@
name: Self-hosted runner (AMD mi210 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
name: Self-hosted runner (AMD mi210 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
example-ci:
name: Example CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit

View File

@ -1,55 +1,55 @@
name: Self-hosted runner (AMD mi250 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
name: Self-hosted runner (AMD mi250 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_models_gpu
slack_report_channel: "#amd-hf-ci"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#amd-hf-ci"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
example-ci:
name: Example CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_examples_gpu
slack_report_channel: "#amd-hf-ci"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#amd-hf-ci"
runner: mi250
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit

View File

@ -1,349 +0,0 @@
name: Self-hosted runner (scheduled-amd)
# Note: For the AMD CI, we rely on a caller workflow and on the workflow_call event to trigger the
# CI in order to run it on both MI210 and MI250, without having to use matrix here which pushes
# us towards the limit of allowed jobs on GitHub Actions.
on:
workflow_call:
inputs:
job:
required: true
type: string
slack_report_channel:
required: true
type: string
runner:
required: true
type: string
docker:
required: true
type: string
ci_event:
required: true
type: string
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
NUM_SLICES: 2
# Important note: each job (run_tests_single_gpu, run_tests_multi_gpu, run_examples_gpu, run_pipelines_torch_gpu) requires all the previous jobs before running.
# This is done so that we avoid parallelizing the scheduled tests, to leave available
# runners for the push CI that is running on the same machine.
jobs:
check_runner_status:
name: Check Runner Status
runs-on: ubuntu-22.04
steps:
- name: Checkout transformers
uses: actions/checkout@v4
with:
fetch-depth: 2
- name: Check Runner Status
run: python utils/check_self_hosted_runner.py --target_runners hf-amd-mi210-ci-1gpu-1,hf-amd-mi250-ci-1gpu-1,hf-amd-mi300-ci-1gpu-1 --token ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
check_runners:
name: Check Runners
needs: check_runner_status
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: huggingface/transformers-pytorch-amd-gpu
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
setup:
if: contains(fromJSON('["run_models_gpu"]'), inputs.job)
name: Setup
needs: check_runners
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: huggingface/transformers-pytorch-amd-gpu
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
steps:
- name: Update clone
working-directory: /transformers
run: |
git fetch && git checkout ${{ github.sha }}
- name: Cleanup
working-directory: /transformers
run: |
rm -rf tests/__pycache__
rm -rf tests/models/__pycache__
rm -rf reports
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- id: set-matrix
name: Identify models to test
working-directory: /transformers/tests
run: |
echo "folder_slices=$(python3 ../utils/split_model_tests.py --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
echo "slice_ids=$(python3 -c 'd = list(range(${{ env.NUM_SLICES }})); print(d)')" >> $GITHUB_OUTPUT
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
run_models_gpu:
if: ${{ inputs.job == 'run_models_gpu' }}
name: Single GPU tests
needs: setup
strategy:
max-parallel: 1 # For now, not to parallelize. Can change later if it works well.
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
uses: ./.github/workflows/model_jobs_amd.yml
with:
folder_slices: ${{ needs.setup.outputs.folder_slices }}
machine_type: ${{ matrix.machine_type }}
slice_id: ${{ matrix.slice_id }}
runner: ${{ inputs.runner }}
docker: ${{ inputs.docker }}
secrets: inherit
run_pipelines_torch_gpu:
if: ${{ inputs.job == 'run_pipelines_torch_gpu' }}
name: PyTorch pipelines
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all pipeline tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -n 1 -v --dist=loadfile --make-reports=${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports tests/pipelines -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports
run_examples_gpu:
if: ${{ inputs.job == 'run_examples_gpu' }}
name: Examples directory
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run examples tests on GPU
working-directory: /transformers
run: |
pip install -r examples/pytorch/_tests_requirements.txt
python3 -m pytest -v --make-reports=${{ matrix.machine_type }}_run_examples_gpu_test_reports examples/pytorch -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_examples_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_examples_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_examples_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_examples_gpu_test_reports
run_torch_cuda_extensions_gpu:
if: ${{ inputs.job == 'run_torch_cuda_extensions_gpu' }}
name: Torch ROCm deepspeed tests
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: python3 -m pytest -v --make-reports=${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
send_results:
name: Slack Report
needs: [
check_runner_status,
check_runners,
setup,
run_models_gpu,
run_pipelines_torch_gpu,
run_examples_gpu,
run_torch_cuda_extensions_gpu
]
if: ${{ always() }}
uses: ./.github/workflows/slack-report.yml
with:
job: ${{ inputs.job }}
# This would be `skipped` if `setup` is skipped.
setup_status: ${{ needs.setup.result }}
slack_report_channel: ${{ inputs.slack_report_channel }}
# This would be an empty string if `setup` is skipped.
folder_slices: ${{ needs.setup.outputs.folder_slices }}
quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
ci_event: ${{ inputs.ci_event }}
secrets: inherit

View File

@ -40,7 +40,6 @@ env:
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
NUM_SLICES: 2
@ -366,7 +365,7 @@ jobs:
run: |
python3 -m pip uninstall -y deepspeed
rm -rf DeepSpeed
git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build
git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build
DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
- name: NVIDIA-SMI
@ -571,4 +570,4 @@ jobs:
with:
docker: ${{ inputs.docker }}
start_sha: ${{ github.sha }}
secrets: inherit
secrets: inherit

View File

@ -70,7 +70,7 @@ jobs:
with:
name: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
- name: Send message to Slack for quantization workflow
@ -90,7 +90,7 @@ jobs:
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service_quantization.py "${{ inputs.quantization_matrix }}"
python utils/notification_service_quantization.py "${{ inputs.quantization_matrix }}"
# Upload complete failure tables, as they might be big and only truncated versions could be sent to Slack.
- name: Failure table artifacts
@ -98,4 +98,4 @@ jobs:
uses: actions/upload-artifact@v4
with:
name: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}

View File

@ -5,7 +5,7 @@ on:
inputs:
runner_type:
description: 'Type of runner to test (a10 or t4)'
required: true
required: true
docker_image:
description: 'Name of the Docker image'
required: true
@ -15,15 +15,14 @@ on:
env:
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
CUDA_VISIBLE_DEVICES: 0,1
RUN_PT_TF_CROSS_TESTS: 1
jobs:
get_runner:
@ -78,7 +77,7 @@ jobs:
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: NVIDIA-SMI
run: |
nvidia-smi

View File

@ -16,3 +16,5 @@ jobs:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
with:
extra_args: --results=verified,unknown

View File

@ -19,7 +19,7 @@ jobs:
- name: Setup environment
run: |
pip install --upgrade pip
pip install datasets pandas==2.0.3
pip install datasets pandas
pip install .[torch,tf,flax]
- name: Update metadata

View File

@ -221,10 +221,10 @@ You'll need **[Python 3.9](https://github.com/huggingface/transformers/blob/main
[Checks on a Pull Request](https://huggingface.co/docs/transformers/pr_checks) guide.
If you're modifying documents under the `docs/source` directory, make sure the documentation can still be built. This check will also run in the CI when you open a pull request. To run a local check
make sure you install the documentation builder:
make sure you install the [documentation builder](https://github.com/huggingface/doc-builder).
```bash
pip install ".[docs]"
pip install hf-doc-builder
```
Run the following command from the root of the repository:
@ -343,8 +343,6 @@ RUN_SLOW=yes python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/t
Like the slow tests, there are other environment variables available which are not enabled by default during testing:
- `RUN_CUSTOM_TOKENIZERS`: Enables tests for custom tokenizers.
- `RUN_PT_FLAX_CROSS_TESTS`: Enables tests for PyTorch + Flax integration.
- `RUN_PT_TF_CROSS_TESTS`: Enables tests for TensorFlow + PyTorch integration.
More environment variables and additional information can be found in the [testing_utils.py](https://github.com/huggingface/transformers/blob/main/src/transformers/testing_utils.py).

View File

@ -263,9 +263,9 @@ You are not required to read the following guidelines before opening an issue. H
But if you're replying to a comment that happened some comments back it's always a good practice to quote just the relevant lines you're replying it. The `>` is used for quoting, or you can always use the menu to do so. For example your editor box will look like:
```
> How big is your gpu cluster?
> How big is your GPU cluster?
Our cluster is made of 256 gpus.
Our cluster is made of 256 GPUs.
```
If you are addressing multiple comments, quote the relevant parts of each before your answer. Some people use the same comment to do multiple replies, others separate them into separate comments. Either way works. The latter approach helps for linking to a specific comment.

View File

@ -37,7 +37,6 @@ autogenerate_code: deps_table_update
repo-consistency:
python utils/check_copies.py
python utils/check_modular_conversion.py
python utils/check_table.py
python utils/check_dummies.py
python utils/check_repo.py
python utils/check_inits.py
@ -46,7 +45,6 @@ repo-consistency:
python utils/check_doctest_list.py
python utils/update_metadata.py --check-only
python utils/check_docstrings.py
python utils/check_support_list.py
# this target runs checks on all files
@ -82,7 +80,6 @@ fixup: modified_only_fixup extra_style_checks autogenerate_code repo-consistency
fix-copies:
python utils/check_copies.py --fix_and_overwrite
python utils/check_modular_conversion.py --fix_and_overwrite
python utils/check_table.py --fix_and_overwrite
python utils/check_dummies.py --fix_and_overwrite
python utils/check_doctest_list.py --fix_and_overwrite
python utils/check_docstrings.py --fix_and_overwrite

366
README.md
View File

@ -25,6 +25,7 @@ limitations under the License.
</p>
<p align="center">
<a href="https://huggingface.com/models"><img alt="Checkpoints on Hub" src="https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/models&color=brightgreen"></a>
<a href="https://circleci.com/gh/huggingface/transformers"><img alt="Build" src="https://img.shields.io/circleci/build/github/huggingface/transformers/main"></a>
<a href="https://github.com/huggingface/transformers/blob/main/LICENSE"><img alt="GitHub" src="https://img.shields.io/github/license/huggingface/transformers.svg?color=blue"></a>
<a href="https://huggingface.co/docs/transformers/index"><img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers/index.svg?down_color=red&down_message=offline&up_message=online"></a>
@ -54,255 +55,254 @@ limitations under the License.
</h4>
<h3 align="center">
<p>State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow</p>
<p>State-of-the-art pretrained models for inference and training</p>
</h3>
<h3 align="center">
<a href="https://hf.co/course"><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/course_banner.png"></a>
</h3>
🤗 Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.
Transformers is a library of pretrained text, computer vision, audio, video, and multimodal models for inference and training. Use Transformers to fine-tune models on your data, build inference applications, and for generative AI use cases across multiple modalities.
These models can be applied on:
There are over 500K+ Transformers [model checkpoints](https://huggingface.co/models?library=transformers&sort=trending) on the [Hugging Face Hub](https://huggingface.com/models) you can use.
* 📝 Text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages.
* 🖼️ Images, for tasks like image classification, object detection, and segmentation.
* 🗣️ Audio, for tasks like speech recognition and audio classification.
Explore the [Hub](https://huggingface.com/) today to find a model and use Transformers to help you get started right away.
Transformer models can also perform tasks on **several modalities combined**, such as table question answering, optical character recognition, information extraction from scanned documents, video classification, and visual question answering.
## Installation
🤗 Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our [model hub](https://huggingface.co/models). At the same time, each python module defining an architecture is fully standalone and can be modified to enable quick research experiments.
Transformers works with Python 3.9+ [PyTorch](https://pytorch.org/get-started/locally/) 2.0+, [TensorFlow](https://www.tensorflow.org/install/pip) 2.6+, and [Flax](https://flax.readthedocs.io/en/latest/) 0.4.1+.
🤗 Transformers is backed by the three most popular deep learning libraries — [Jax](https://jax.readthedocs.io/en/latest/), [PyTorch](https://pytorch.org/) and [TensorFlow](https://www.tensorflow.org/) — with a seamless integration between them. It's straightforward to train your models with one before loading them for inference with the other.
Create and activate a virtual environment with [venv](https://docs.python.org/3/library/venv.html) or [uv](https://docs.astral.sh/uv/), a fast Rust-based Python package and project manager.
## Online demos
```py
# venv
python -m venv .my-env
source .my-env/bin/activate
You can test most of our models directly on their pages from the [model hub](https://huggingface.co/models). We also offer [private model hosting, versioning, & an inference API](https://huggingface.co/pricing) for public and private models.
Here are a few examples:
In Natural Language Processing:
- [Masked word completion with BERT](https://huggingface.co/google-bert/bert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France)
- [Named Entity Recognition with Electra](https://huggingface.co/dbmdz/electra-large-discriminator-finetuned-conll03-english?text=My+name+is+Sarah+and+I+live+in+London+city)
- [Text generation with Mistral](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
- [Natural Language Inference with RoBERTa](https://huggingface.co/FacebookAI/roberta-large-mnli?text=The+dog+was+lost.+Nobody+lost+any+animal)
- [Summarization with BART](https://huggingface.co/facebook/bart-large-cnn?text=The+tower+is+324+metres+%281%2C063+ft%29+tall%2C+about+the+same+height+as+an+81-storey+building%2C+and+the+tallest+structure+in+Paris.+Its+base+is+square%2C+measuring+125+metres+%28410+ft%29+on+each+side.+During+its+construction%2C+the+Eiffel+Tower+surpassed+the+Washington+Monument+to+become+the+tallest+man-made+structure+in+the+world%2C+a+title+it+held+for+41+years+until+the+Chrysler+Building+in+New+York+City+was+finished+in+1930.+It+was+the+first+structure+to+reach+a+height+of+300+metres.+Due+to+the+addition+of+a+broadcasting+aerial+at+the+top+of+the+tower+in+1957%2C+it+is+now+taller+than+the+Chrysler+Building+by+5.2+metres+%2817+ft%29.+Excluding+transmitters%2C+the+Eiffel+Tower+is+the+second+tallest+free-standing+structure+in+France+after+the+Millau+Viaduct)
- [Question answering with DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased-distilled-squad?text=Which+name+is+also+used+to+describe+the+Amazon+rainforest+in+English%3F&context=The+Amazon+rainforest+%28Portuguese%3A+Floresta+Amaz%C3%B4nica+or+Amaz%C3%B4nia%3B+Spanish%3A+Selva+Amaz%C3%B3nica%2C+Amazon%C3%ADa+or+usually+Amazonia%3B+French%3A+For%C3%AAt+amazonienne%3B+Dutch%3A+Amazoneregenwoud%29%2C+also+known+in+English+as+Amazonia+or+the+Amazon+Jungle%2C+is+a+moist+broadleaf+forest+that+covers+most+of+the+Amazon+basin+of+South+America.+This+basin+encompasses+7%2C000%2C000+square+kilometres+%282%2C700%2C000+sq+mi%29%2C+of+which+5%2C500%2C000+square+kilometres+%282%2C100%2C000+sq+mi%29+are+covered+by+the+rainforest.+This+region+includes+territory+belonging+to+nine+nations.+The+majority+of+the+forest+is+contained+within+Brazil%2C+with+60%25+of+the+rainforest%2C+followed+by+Peru+with+13%25%2C+Colombia+with+10%25%2C+and+with+minor+amounts+in+Venezuela%2C+Ecuador%2C+Bolivia%2C+Guyana%2C+Suriname+and+French+Guiana.+States+or+departments+in+four+nations+contain+%22Amazonas%22+in+their+names.+The+Amazon+represents+over+half+of+the+planet%27s+remaining+rainforests%2C+and+comprises+the+largest+and+most+biodiverse+tract+of+tropical+rainforest+in+the+world%2C+with+an+estimated+390+billion+individual+trees+divided+into+16%2C000+species)
- [Translation with T5](https://huggingface.co/google-t5/t5-base?text=My+name+is+Wolfgang+and+I+live+in+Berlin)
In Computer Vision:
- [Image classification with ViT](https://huggingface.co/google/vit-base-patch16-224)
- [Object Detection with DETR](https://huggingface.co/facebook/detr-resnet-50)
- [Semantic Segmentation with SegFormer](https://huggingface.co/nvidia/segformer-b0-finetuned-ade-512-512)
- [Panoptic Segmentation with Mask2Former](https://huggingface.co/facebook/mask2former-swin-large-coco-panoptic)
- [Depth Estimation with Depth Anything](https://huggingface.co/docs/transformers/main/model_doc/depth_anything)
- [Video Classification with VideoMAE](https://huggingface.co/docs/transformers/model_doc/videomae)
- [Universal Segmentation with OneFormer](https://huggingface.co/shi-labs/oneformer_ade20k_dinat_large)
In Audio:
- [Automatic Speech Recognition with Whisper](https://huggingface.co/openai/whisper-large-v3)
- [Keyword Spotting with Wav2Vec2](https://huggingface.co/superb/wav2vec2-base-superb-ks)
- [Audio Classification with Audio Spectrogram Transformer](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
In Multimodal tasks:
- [Table Question Answering with TAPAS](https://huggingface.co/google/tapas-base-finetuned-wtq)
- [Visual Question Answering with ViLT](https://huggingface.co/dandelin/vilt-b32-finetuned-vqa)
- [Image captioning with LLaVa](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
- [Zero-shot Image Classification with SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384)
- [Document Question Answering with LayoutLM](https://huggingface.co/impira/layoutlm-document-qa)
- [Zero-shot Video Classification with X-CLIP](https://huggingface.co/docs/transformers/model_doc/xclip)
- [Zero-shot Object Detection with OWLv2](https://huggingface.co/docs/transformers/en/model_doc/owlv2)
- [Zero-shot Image Segmentation with CLIPSeg](https://huggingface.co/docs/transformers/model_doc/clipseg)
- [Automatic Mask Generation with SAM](https://huggingface.co/docs/transformers/model_doc/sam)
## 100 projects using Transformers
Transformers is more than a toolkit to use pretrained models: it's a community of projects built around it and the
Hugging Face Hub. We want Transformers to enable developers, researchers, students, professors, engineers, and anyone
else to build their dream projects.
In order to celebrate the 100,000 stars of transformers, we have decided to put the spotlight on the
community, and we have created the [awesome-transformers](./awesome-transformers.md) page which lists 100
incredible projects built in the vicinity of transformers.
If you own or use a project that you believe should be part of the list, please open a PR to add it!
## Serious about AI in your organisation? Build faster with the Hugging Face Enterprise Hub.
<a target="_blank" href="https://huggingface.co/enterprise">
<img alt="Hugging Face Enterprise Hub" src="https://github.com/user-attachments/assets/247fb16d-d251-4583-96c4-d3d76dda4925">
</a><br>
## Quick tour
To immediately use a model on a given input (text, image, audio, ...), we provide the `pipeline` API. Pipelines group together a pretrained model with the preprocessing that was used during that model's training. Here is how to quickly use a pipeline to classify positive versus negative texts:
```python
>>> from transformers import pipeline
# Allocate a pipeline for sentiment-analysis
>>> classifier = pipeline('sentiment-analysis')
>>> classifier('We are very happy to introduce pipeline to the transformers repository.')
[{'label': 'POSITIVE', 'score': 0.9996980428695679}]
# uv
uv venv .my-env
source .my-env/bin/activate
```
The second line of code downloads and caches the pretrained model used by the pipeline, while the third evaluates it on the given text. Here, the answer is "positive" with a confidence of 99.97%.
Install Transformers in your virtual environment.
Many tasks have a pre-trained `pipeline` ready to go, in NLP but also in computer vision and speech. For example, we can easily extract detected objects in an image:
```py
# pip
pip install transformers
``` python
>>> import requests
>>> from PIL import Image
>>> from transformers import pipeline
# Download an image with cute cats
>>> url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png"
>>> image_data = requests.get(url, stream=True).raw
>>> image = Image.open(image_data)
# Allocate a pipeline for object detection
>>> object_detector = pipeline('object-detection')
>>> object_detector(image)
[{'score': 0.9982201457023621,
'label': 'remote',
'box': {'xmin': 40, 'ymin': 70, 'xmax': 175, 'ymax': 117}},
{'score': 0.9960021376609802,
'label': 'remote',
'box': {'xmin': 333, 'ymin': 72, 'xmax': 368, 'ymax': 187}},
{'score': 0.9954745173454285,
'label': 'couch',
'box': {'xmin': 0, 'ymin': 1, 'xmax': 639, 'ymax': 473}},
{'score': 0.9988006353378296,
'label': 'cat',
'box': {'xmin': 13, 'ymin': 52, 'xmax': 314, 'ymax': 470}},
{'score': 0.9986783862113953,
'label': 'cat',
'box': {'xmin': 345, 'ymin': 23, 'xmax': 640, 'ymax': 368}}]
# uv
uv pip install transformers
```
Here, we get a list of objects detected in the image, with a box surrounding the object and a confidence score. Here is the original image on the left, with the predictions displayed on the right:
Install Transformers from source if you want the latest changes in the library or are interested in contributing. However, the *latest* version may not be stable. Feel free to open an [issue](https://github.com/huggingface/transformers/issues) if you encounter an error.
```shell
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install .
```
## Quickstart
Get started with Transformers right away with the [Pipeline](https://huggingface.co/docs/transformers/pipeline_tutorial) API. The `Pipeline` is a high-level inference class that supports text, audio, vision, and multimodal tasks. It handles preprocessing the input and returns the appropriate output.
Instantiate a pipeline and specify model to use for text generation. The model is downloaded and cached so you can easily reuse it again. Finally, pass some text to prompt the model.
```py
from transformers import pipeline
pipeline = pipeline(task="text-generation", model="Qwen/Qwen2.5-1.5B")
pipeline("the secret to baking a really good cake is ")
[{'generated_text': 'the secret to baking a really good cake is 1) to use the right ingredients and 2) to follow the recipe exactly. the recipe for the cake is as follows: 1 cup of sugar, 1 cup of flour, 1 cup of milk, 1 cup of butter, 1 cup of eggs, 1 cup of chocolate chips. if you want to make 2 cakes, how much sugar do you need? To make 2 cakes, you will need 2 cups of sugar.'}]
```
To chat with a model, the usage pattern is the same. The only difference is you need to construct a chat history (the input to `Pipeline`) between you and the system.
> [!TIP]
> You can also chat with a model directly from the command line.
> ```shell
> transformers-cli chat --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct
> ```
```py
import torch
from transformers import pipeline
chat = [
{"role": "system", "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
{"role": "user", "content": "Hey, can you tell me any fun things to do in New York?"}
]
pipeline = pipeline(task="text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16, device_map="auto")
response = pipeline(chat, max_new_tokens=512)
print(response[0]["generated_text"][-1]["content"])
```
Expand the examples below to see how `Pipeline` works for different modalities and tasks.
<details>
<summary>Automatic speech recognition</summary>
```py
from transformers import pipeline
pipeline = pipeline(task="automatic-speech-recognition", model="openai/whisper-large-v3")
pipeline("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
```
</details>
<details>
<summary>Image classification</summary>
<h3 align="center">
<a><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png" width="400"></a>
<a><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample_post_processed.png" width="400"></a>
<a><img src="https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"></a>
</h3>
You can learn more about the tasks supported by the `pipeline` API in [this tutorial](https://huggingface.co/docs/transformers/task_summary).
```py
from transformers import pipeline
In addition to `pipeline`, to download and use any of the pretrained models on your given task, all it takes is three lines of code. Here is the PyTorch version:
```python
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = AutoModel.from_pretrained("google-bert/bert-base-uncased")
>>> inputs = tokenizer("Hello world!", return_tensors="pt")
>>> outputs = model(**inputs)
pipeline = pipeline(task="image-classification", model="facebook/dinov2-small-imagenet1k-1-layer")
pipeline("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")
[{'label': 'macaw', 'score': 0.997848391532898},
{'label': 'sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita',
'score': 0.0016551691805943847},
{'label': 'lorikeet', 'score': 0.00018523589824326336},
{'label': 'African grey, African gray, Psittacus erithacus',
'score': 7.85409429227002e-05},
{'label': 'quail', 'score': 5.502637941390276e-05}]
```
And here is the equivalent code for TensorFlow:
```python
>>> from transformers import AutoTokenizer, TFAutoModel
</details>
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-uncased")
<details>
<summary>Visual question answering</summary>
>>> inputs = tokenizer("Hello world!", return_tensors="tf")
>>> outputs = model(**inputs)
<h3 align="center">
<a><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-few-shot.jpg"></a>
</h3>
```py
from transformers import pipeline
pipeline = pipeline(task="visual-question-answering", model="Salesforce/blip-vqa-base")
pipeline(
image="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-few-shot.jpg",
question="What is in the image?",
)
[{'answer': 'statue of liberty'}]
```
The tokenizer is responsible for all the preprocessing the pretrained model expects and can be called directly on a single string (as in the above examples) or a list. It will output a dictionary that you can use in downstream code or simply directly pass to your model using the ** argument unpacking operator.
</details>
The model itself is a regular [Pytorch `nn.Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) or a [TensorFlow `tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model) (depending on your backend) which you can use as usual. [This tutorial](https://huggingface.co/docs/transformers/training) explains how to integrate such a model into a classic PyTorch or TensorFlow training loop, or how to use our `Trainer` API to quickly fine-tune on a new dataset.
## Why should I use transformers?
## Why should I use Transformers?
1. Easy-to-use state-of-the-art models:
- High performance on natural language understanding & generation, computer vision, and audio tasks.
- Low barrier to entry for educators and practitioners.
- High performance on natural language understanding & generation, computer vision, audio, video, and multimodal tasks.
- Low barrier to entry for researchers, engineers, and developers.
- Few user-facing abstractions with just three classes to learn.
- A unified API for using all our pretrained models.
1. Lower compute costs, smaller carbon footprint:
- Researchers can share trained models instead of always retraining.
- Practitioners can reduce compute time and production costs.
- Dozens of architectures with over 400,000 pretrained models across all modalities.
- Share trained models instead of training from scratch.
- Reduce compute time and production costs.
- Dozens of model architectures with 1M+ pretrained checkpoints across all modalities.
1. Choose the right framework for every part of a model's lifetime:
1. Choose the right framework for every part of a models lifetime:
- Train state-of-the-art models in 3 lines of code.
- Move a single model between TF2.0/PyTorch/JAX frameworks at will.
- Seamlessly pick the right framework for training, evaluation, and production.
- Move a single model between PyTorch/JAX/TF2.0 frameworks at will.
- Pick the right framework for training, evaluation, and production.
1. Easily customize a model or an example to your needs:
- We provide examples for each architecture to reproduce the results published by its original authors.
- Model internals are exposed as consistently as possible.
- Model files can be used independently of the library for quick experiments.
## Why shouldn't I use transformers?
<a target="_blank" href="https://huggingface.co/enterprise">
<img alt="Hugging Face Enterprise Hub" src="https://github.com/user-attachments/assets/247fb16d-d251-4583-96c4-d3d76dda4925">
</a><br>
## Why shouldn't I use Transformers?
- This library is not a modular toolbox of building blocks for neural nets. The code in the model files is not refactored with additional abstractions on purpose, so that researchers can quickly iterate on each of the models without diving into additional abstractions/files.
- The training API is not intended to work on any model but is optimized to work with the models provided by the library. For generic machine learning loops, you should use another library (possibly, [Accelerate](https://huggingface.co/docs/accelerate)).
- While we strive to present as many use cases as possible, the scripts in our [examples folder](https://github.com/huggingface/transformers/tree/main/examples) are just that: examples. It is expected that they won't work out-of-the-box on your specific problem and that you will be required to change a few lines of code to adapt them to your needs.
- The training API is optimized to work with PyTorch models provided by Transformers. For generic machine learning loops, you should use another library like [Accelerate](https://huggingface.co/docs/accelerate).
- The [example scripts]((https://github.com/huggingface/transformers/tree/main/examples)) are only *examples*. They may not necessarily work out-of-the-box on your specific use case and you'll need to adapt the code for it to work.
## Installation
## 100 projects using Transformers
### With pip
Transformers is more than a toolkit to use pretrained models, it's a community of projects built around it and the
Hugging Face Hub. We want Transformers to enable developers, researchers, students, professors, engineers, and anyone
else to build their dream projects.
This repository is tested on Python 3.9+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.
In order to celebrate Transformers 100,000 stars, we wanted to put the spotlight on the
community with the [awesome-transformers](./awesome-transformers.md) page which lists 100
incredible projects built with Transformers.
You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
If you own or use a project that you believe should be part of the list, please open a PR to add it!
First, create a virtual environment with the version of Python you're going to use and activate it.
## Example models
Then, you will need to install at least one of Flax, PyTorch, or TensorFlow.
Please refer to [TensorFlow installation page](https://www.tensorflow.org/install/), [PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) and/or [Flax](https://github.com/google/flax#quick-install) and [Jax](https://github.com/google/jax#installation) installation pages regarding the specific installation command for your platform.
You can test most of our models directly on their [Hub model pages](https://huggingface.co/models).
When one of those backends has been installed, 🤗 Transformers can be installed using pip as follows:
Expand each modality below to see a few example models for various use cases.
```bash
pip install transformers
```
<details>
<summary>Audio</summary>
If you'd like to play with the examples or need the bleeding edge of the code and can't wait for a new release, you must [install the library from source](https://huggingface.co/docs/transformers/installation#installing-from-source).
- Audio classification with [Whisper](https://huggingface.co/openai/whisper-large-v3-turbo)
- Automatic speech recognition with [Moonshine](https://huggingface.co/UsefulSensors/moonshine)
- Keyword spotting with [Wav2Vec2](https://huggingface.co/superb/wav2vec2-base-superb-ks)
- Speech to speech generation with [Moshi](https://huggingface.co/kyutai/moshiko-pytorch-bf16)
- Text to audio with [MusicGen](https://huggingface.co/facebook/musicgen-large)
- Text to speech with [Bark](https://huggingface.co/suno/bark)
### With conda
</details>
🤗 Transformers can be installed using conda as follows:
<details>
<summary>Computer vision</summary>
```shell script
conda install conda-forge::transformers
```
- Automatic mask generation with [SAM](https://huggingface.co/facebook/sam-vit-base)
- Depth estimation with [DepthPro](https://huggingface.co/apple/DepthPro-hf)
- Image classification with [DINO v2](https://huggingface.co/facebook/dinov2-base)
- Keypoint detection with [SuperGlue](https://huggingface.co/magic-leap-community/superglue_outdoor)
- Keypoint matching with [SuperGlue](https://huggingface.co/magic-leap-community/superglue)
- Object detection with [RT-DETRv2](https://huggingface.co/PekingU/rtdetr_v2_r50vd)
- Pose Estimation with [VitPose](https://huggingface.co/usyd-community/vitpose-base-simple)
- Universal segmentation with [OneFormer](https://huggingface.co/shi-labs/oneformer_ade20k_swin_large)
- Video classification with [VideoMAE](https://huggingface.co/MCG-NJU/videomae-large)
> **_NOTE:_** Installing `transformers` from the `huggingface` channel is deprecated.
</details>
Follow the installation pages of Flax, PyTorch or TensorFlow to see how to install them with conda.
<details>
<summary>Multimodal</summary>
> **_NOTE:_** On Windows, you may be prompted to activate Developer Mode in order to benefit from caching. If this is not an option for you, please let us know in [this issue](https://github.com/huggingface/huggingface_hub/issues/1062).
- Audio or text to text with [Qwen2-Audio](https://huggingface.co/Qwen/Qwen2-Audio-7B)
- Document question answering with [LayoutLMv3](https://huggingface.co/microsoft/layoutlmv3-base)
- Image or text to text with [Qwen-VL](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
- Image captioning [BLIP-2](https://huggingface.co/Salesforce/blip2-opt-2.7b)
- OCR-based document understanding with [GOT-OCR2](https://huggingface.co/stepfun-ai/GOT-OCR-2.0-hf)
- Table question answering with [TAPAS](https://huggingface.co/google/tapas-base)
- Unified multimodal understanding and generation with [Emu3](https://huggingface.co/BAAI/Emu3-Gen)
- Vision to text with [Llava-OneVision](https://huggingface.co/llava-hf/llava-onevision-qwen2-0.5b-ov-hf)
- Visual question answering with [Llava](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
- Visual referring expression segmentation with [Kosmos-2](https://huggingface.co/microsoft/kosmos-2-patch14-224)
## Model architectures
</details>
**[All the model checkpoints](https://huggingface.co/models)** provided by 🤗 Transformers are seamlessly integrated from the huggingface.co [model hub](https://huggingface.co/models), where they are uploaded directly by [users](https://huggingface.co/users) and [organizations](https://huggingface.co/organizations).
<details>
<summary>NLP</summary>
Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/models&color=brightgreen)
- Masked word completion with [ModernBERT](https://huggingface.co/answerdotai/ModernBERT-base)
- Named entity recognition with [Gemma](https://huggingface.co/google/gemma-2-2b)
- Question answering with [Mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)
- Summarization with [BART](https://huggingface.co/facebook/bart-large-cnn)
- Translation with [T5](https://huggingface.co/google-t5/t5-base)
- Text generation with [Llama](https://huggingface.co/meta-llama/Llama-3.2-1B)
- Text classification with [Qwen](https://huggingface.co/Qwen/Qwen2.5-0.5B)
🤗 Transformers currently provides the following architectures: see [here](https://huggingface.co/docs/transformers/model_summary) for a high-level summary of each them.
To check if each model has an implementation in Flax, PyTorch or TensorFlow, or has an associated tokenizer backed by the 🤗 Tokenizers library, refer to [this table](https://huggingface.co/docs/transformers/index#supported-frameworks).
These implementations have been tested on several datasets (see the example scripts) and should match the performance of the original implementations. You can find more details on performance in the Examples section of the [documentation](https://github.com/huggingface/transformers/tree/main/examples).
## Learn more
| Section | Description |
|-|-|
| [Documentation](https://huggingface.co/docs/transformers/) | Full API documentation and tutorials |
| [Task summary](https://huggingface.co/docs/transformers/task_summary) | Tasks supported by 🤗 Transformers |
| [Preprocessing tutorial](https://huggingface.co/docs/transformers/preprocessing) | Using the `Tokenizer` class to prepare data for the models |
| [Training and fine-tuning](https://huggingface.co/docs/transformers/training) | Using the models provided by 🤗 Transformers in a PyTorch/TensorFlow training loop and the `Trainer` API |
| [Quick tour: Fine-tuning/usage scripts](https://github.com/huggingface/transformers/tree/main/examples) | Example scripts for fine-tuning models on a wide range of tasks |
| [Model sharing and uploading](https://huggingface.co/docs/transformers/model_sharing) | Upload and share your fine-tuned models with the community |
</details>
## Citation

View File

@ -15,7 +15,7 @@ to add it.
Keywords: Open-source, LLaMa, GPT-J, instruction, assistant
## [recommenders](https://github.com/microsoft/recommenders)
## [recommenders](https://github.com/recommenders-team/recommenders)
This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. It goes over several aspects required to build efficient recommendation systems: data preparation, modeling, evaluation, model selection & optimization, as well as operationalization
@ -29,7 +29,7 @@ Keywords: inpainting, SD, Stable Diffusion
## [flair](https://github.com/flairNLP/flair)
FLAIR is a powerful PyTorch NLP framework, convering several important tasks: NER, sentiment-analysis, part-of-speech tagging, text and document embeddings, among other things.
FLAIR is a powerful PyTorch NLP framework, covering several important tasks: NER, sentiment-analysis, part-of-speech tagging, text and document embeddings, among other things.
Keywords: NLP, text embedding, document embedding, biomedical, NER, PoS, sentiment-analysis
@ -39,15 +39,15 @@ MindsDB is a low-code ML platform, which automates and integrates several ML fra
Keywords: Database, low-code, AI table
## [langchain](https://github.com/hwchase17/langchain)
## [langchain](https://github.com/langchain-ai/langchain)
[langchain](https://github.com/hwchase17/langchain) is aimed at assisting in the development of apps merging both LLMs and other sources of knowledge. The library allows chaining calls to applications, creating a sequence across many tools.
[langchain](https://github.com/langchain-ai/langchain) is aimed at assisting in the development of apps merging both LLMs and other sources of knowledge. The library allows chaining calls to applications, creating a sequence across many tools.
Keywords: LLMs, Large Language Models, Agents, Chains
## [LlamaIndex](https://github.com/jerryjliu/llama_index)
## [LlamaIndex](https://github.com/run-llama/llama_index)
[LlamaIndex](https://github.com/jerryjliu/llama_index) is a project that provides a central interface to connect your LLM's with external data. It provides various kinds of indices and retreival mechanisms to perform different LLM tasks and obtain knowledge-augmented results.
[LlamaIndex](https://github.com/run-llama/llama_index) is a project that provides a central interface to connect your LLM's with external data. It provides various kinds of indices and retrieval mechanisms to perform different LLM tasks and obtain knowledge-augmented results.
Keywords: LLMs, Large Language Models, Data Retrieval, Indices, Knowledge Augmentation
@ -146,9 +146,9 @@ Keywords: Framework, simplicity, NLP
Keywords: LLM, Agents, HF Hub
## [transformers.js](https://xenova.github.io/transformers.js/)
## [transformers.js](https://github.com/huggingface/transformers.js/)
[transformers.js](https://xenova.github.io/transformers.js/) is a JavaScript library targeted at running models from transformers directly within the browser.
[transformers.js](https://github.com/huggingface/transformers.js/) is a JavaScript library targeted at running models from transformers directly within the browser.
Keywords: Transformers, JavaScript, browser
@ -437,7 +437,7 @@ Keywords: DALL-E, Russian
Keywords: Knowledge Extraction, Knowledge Graphs
## [Nebuly](https://github.com/nebuly-ai/nebuly)
## [Nebuly](https://github.com/nebuly-ai/optimate)
Nebuly is the next-generation platform to monitor and optimize your AI costs in one place. The platform connects to all your AI cost sources (compute, API providers, AI software licenses, etc) and centralizes them in one place to give you full visibility on a model basis. The platform also provides optimization recommendations and a co-pilot model that can guide during the optimization process. The platform builds on top of the open-source tools allowing you to optimize the different steps of your AI stack to squeeze out the best possible cost performances.

49
benchmark/README.md Normal file
View File

@ -0,0 +1,49 @@
# Benchmarks
You might want to add new benchmarks.
You will need to define a python function named `run_benchmark` in your python file and the file must be located in this `benchmark/` directory.
The expected function signature is the following:
```py
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
```
## Writing metrics to the database
`MetricsRecorder` is thread-safe, in the sense of the python [`Thread`](https://docs.python.org/3/library/threading.html#threading.Thread). This means you can start a background thread to do the readings on the device measurements while not blocking the main thread to execute the model measurements.
cf [`llama.py`](./llama.py) to see an example of this in practice.
```py
from benchmarks_entrypoint import MetricsRecorder
import psycopg2
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
metrics_recorder = MetricsRecorder(psycopg2.connect("dbname=metrics"), logger, branch, commit_id, commit_msg)
benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
# To collect device measurements
metrics_recorder.collect_device_measurements(
benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
)
# To collect your model measurements
metrics_recorder.collect_model_measurements(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
)
```

View File

@ -0,0 +1,143 @@
import argparse
import importlib.util
import logging
import os
from typing import Dict
import sys
from psycopg2.extras import Json
from psycopg2.extensions import register_adapter
register_adapter(dict, Json)
class ImportModuleException(Exception):
pass
class MetricsRecorder:
def __init__(self, connection, logger: logging.Logger, branch: str, commit_id: str, commit_msg: str):
self.conn = connection
self.conn.autocommit = True
self.logger = logger
self.branch = branch
self.commit_id = commit_id
self.commit_msg = commit_msg
def initialise_benchmark(self, metadata: Dict[str, str]) -> int:
"""
Creates a new benchmark, returns the benchmark id
"""
# gpu_name: str, model_id: str
with self.conn.cursor() as cur:
cur.execute(
"INSERT INTO benchmarks (branch, commit_id, commit_message, metadata) VALUES (%s, %s, %s, %s) RETURNING benchmark_id",
(self.branch, self.commit_id, self.commit_msg, metadata),
)
benchmark_id = cur.fetchone()[0]
logger.debug(f"initialised benchmark #{benchmark_id}")
return benchmark_id
def collect_device_measurements(self, benchmark_id: int, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes):
"""
Collect device metrics, such as CPU & GPU usage. These are "static", as in you cannot pass arbitrary arguments to the function.
"""
with self.conn.cursor() as cur:
cur.execute(
"INSERT INTO device_measurements (benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes) VALUES (%s, %s, %s, %s, %s)",
(benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes),
)
self.logger.debug(
f"inserted device measurements for benchmark #{benchmark_id} [CPU util: {cpu_util}, mem MBs: {mem_megabytes}, GPU util: {gpu_util}, GPU mem MBs: {gpu_mem_megabytes}]"
)
def collect_model_measurements(self, benchmark_id: int, measurements: Dict[str, float]):
with self.conn.cursor() as cur:
cur.execute(
"""
INSERT INTO model_measurements (
benchmark_id,
measurements
) VALUES (%s, %s)
""",
(
benchmark_id,
measurements,
),
)
self.logger.debug(f"inserted model measurements for benchmark #{benchmark_id}: {measurements}")
def close(self):
self.conn.close()
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter("[%(levelname)s - %(asctime)s] %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
def parse_arguments():
"""
Parse command line arguments for the benchmarking CLI.
"""
parser = argparse.ArgumentParser(description="CLI for benchmarking the huggingface/transformers.")
parser.add_argument(
"branch",
type=str,
help="The branch name on which the benchmarking is performed.",
)
parser.add_argument(
"commit_id",
type=str,
help="The commit hash on which the benchmarking is performed.",
)
parser.add_argument(
"commit_msg",
type=str,
help="The commit message associated with the commit, truncated to 70 characters.",
)
args = parser.parse_args()
return args.branch, args.commit_id, args.commit_msg
def import_from_path(module_name, file_path):
try:
spec = importlib.util.spec_from_file_location(module_name, file_path)
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
return module
except Exception as e:
raise ImportModuleException(f"failed to load python module: {e}")
if __name__ == "__main__":
benchmarks_folder_path = os.path.dirname(os.path.realpath(__file__))
branch, commit_id, commit_msg = parse_arguments()
for entry in os.scandir(benchmarks_folder_path):
try:
if not entry.name.endswith(".py"):
continue
if entry.path == __file__:
continue
logger.debug(f"loading: {entry.name}")
module = import_from_path(entry.name.split(".")[0], entry.path)
logger.info(f"running benchmarks in: {entry.name}")
module.run_benchmark(logger, branch, commit_id, commit_msg)
except ImportModuleException as e:
logger.error(e)
except Exception as e:
logger.error(f"error running benchmarks for {entry.name}: {e}")

10
benchmark/default.yml Normal file
View File

@ -0,0 +1,10 @@
apiVersion: 1
providers:
- name: 'Transformers Benchmarks'
orgId: 1
type: file
updateIntervalSeconds: 10
allowUiUpdates: true
options:
path: /etc/grafana/dashboards

View File

@ -30,7 +30,7 @@
"title": "Go to data",
"tooltip": "Go to data",
"type": "link",
"url": "http://transformers-benchmarks.huggingface.co/d/fdz33iyzln9c0a/transformers-benchmarks?orgId=1&from=${StartTime}&to=${EndTime}"
"url": "http://transformers-benchmarks.hf.co/d/fdz33iyzln9c0a/transformers-benchmarks?orgId=1&from=${StartTime}&to=${EndTime}"
}
],
"liveNow": true,
@ -77,7 +77,7 @@
"properties": [
{
"id": "custom.width",
"value": 196
"value": 202
}
]
},
@ -101,7 +101,7 @@
"properties": [
{
"id": "custom.width",
"value": 581
"value": 524
}
]
},
@ -113,7 +113,19 @@
"properties": [
{
"id": "custom.width",
"value": 379
"value": 353
}
]
},
{
"matcher": {
"id": "byName",
"options": "model_id"
},
"properties": [
{
"id": "custom.width",
"value": 216
}
]
}
@ -143,12 +155,14 @@
"targets": [
{
"datasource": {
"type": "grafana-postgresql-datasource"
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT commit_id as commit_id, commit_message, gpu_name, created_at AS date FROM benchmarks WHERE branch = '${branch}' ORDER BY benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT commit_id, commit_message, metadata->>'gpu_name' as gpu_name, metadata->>'model_id' as model_id, created_at AS date FROM benchmarks WHERE branch = '${branch}' AND metadata->>'gpu_name' = '${gpu_name}' ORDER BY benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -306,13 +320,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'first_eager_forward_pass_time_secs' AS double precision) AS first_eager_forward_pass_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'first_eager_forward_pass_time_secs' AS double precision) AS first_eager_forward_pass_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -431,13 +446,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'second_eager_forward_pass_time_secs' AS double precision) AS second_eager_forward_pass_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'second_eager_forward_pass_time_secs' AS double precision) AS second_eager_forward_pass_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -565,13 +581,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'time_to_first_token_secs' AS double precision) AS time_to_first_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'time_to_first_token_secs' AS double precision) AS time_to_first_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -686,13 +703,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'time_to_second_token_secs' AS double precision) AS time_to_second_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'time_to_second_token_secs' AS double precision) AS time_to_second_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -807,13 +825,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'time_to_third_token_secs' AS double precision) AS time_to_third_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'time_to_third_token_secs' AS double precision) AS time_to_third_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -928,13 +947,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'time_to_next_token_mean_secs' AS double precision) AS time_to_next_token_mean_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'time_to_next_token_mean_secs' AS double precision) AS time_to_next_token_mean_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -1062,13 +1082,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'first_compile_generate_time_secs' AS double precision) AS first_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'first_compile_generate_time_secs' AS double precision) AS first_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -1183,13 +1204,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'second_compile_generate_time_secs' AS double precision) AS second_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'second_compile_generate_time_secs' AS double precision) AS second_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -1304,13 +1326,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'third_compile_generate_time_secs' AS double precision) AS third_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'third_compile_generate_time_secs' AS double precision) AS third_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -1425,13 +1448,14 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
"rawQuery": true,
"rawSql": "SELECT CAST(m.measurements->'fourth_compile_generate_time_secs' AS double precision) AS fourth_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND gpu_name = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"rawSql": "SELECT CAST(m.measurements->'fourth_compile_generate_time_secs' AS double precision) AS fourth_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
"refId": "A",
"sql": {
"columns": [
@ -1480,11 +1504,7 @@
"id": 15,
"panels": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
"datasource": {},
"fieldConfig": {
"defaults": {
"color": {
@ -1528,8 +1548,7 @@
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
"color": "green"
},
{
"color": "red",
@ -1563,8 +1582,9 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
@ -1665,11 +1685,7 @@
"type": "timeseries"
},
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
"datasource": {},
"fieldConfig": {
"defaults": {
"color": {
@ -1713,8 +1729,7 @@
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
"color": "green"
},
{
"color": "red",
@ -1748,8 +1763,9 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
@ -1850,11 +1866,7 @@
"type": "timeseries"
},
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
"datasource": {},
"fieldConfig": {
"defaults": {
"color": {
@ -1898,8 +1910,7 @@
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
"color": "green"
},
{
"color": "red",
@ -1933,8 +1944,9 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
@ -2035,11 +2047,7 @@
"type": "timeseries"
},
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
"datasource": {},
"fieldConfig": {
"defaults": {
"color": {
@ -2083,8 +2091,7 @@
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
"color": "green"
},
{
"color": "red",
@ -2118,8 +2125,9 @@
"targets": [
{
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "bdz2yss7sxo1sc"
"uid": "be28nkzirtb0gd"
},
"editorMode": "code",
"format": "table",
@ -2224,7 +2232,6 @@
"type": "row"
}
],
"refresh": "",
"schemaVersion": 39,
"tags": [],
"templating": {
@ -2236,6 +2243,7 @@
"value": "main"
},
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
@ -2248,7 +2256,7 @@
"name": "branch",
"options": [],
"query": "SELECT DISTINCT branch FROM benchmarks;",
"refresh": 2,
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
@ -2261,6 +2269,7 @@
"value": "1729701492845"
},
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
@ -2281,10 +2290,11 @@
{
"current": {
"selected": false,
"text": "1730120430069",
"value": "1730120430069"
"text": "1730393397577",
"value": "1730393397577"
},
"datasource": {
"default": true,
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
@ -2312,15 +2322,16 @@
"type": "grafana-postgresql-datasource",
"uid": "be28nkzirtb0gd"
},
"definition": "SELECT DISTINCT gpu_name FROM benchmarks;",
"definition": "SELECT DISTINCT metadata->>'gpu_name' FROM benchmarks;",
"description": "",
"hide": 0,
"includeAll": false,
"label": "GPU",
"multi": false,
"name": "gpu_name",
"options": [],
"query": "SELECT DISTINCT gpu_name FROM benchmarks;",
"refresh": 2,
"query": "SELECT DISTINCT metadata->>'gpu_name' FROM benchmarks;",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
@ -2328,7 +2339,7 @@
},
{
"current": {
"selected": false,
"selected": true,
"text": "10",
"value": "10"
},
@ -2359,6 +2370,6 @@
"timezone": "browser",
"title": "Transformers benchmarks",
"uid": "fdz33iyzln9c0a",
"version": 4,
"version": 10,
"weekStart": ""
}

View File

@ -0,0 +1,17 @@
apiVersion: 1
datasources:
- name: grafana-postgresql-datasource
uid: be28nkzirtb0gd
type: postgres
url: $GRAFANA_POSTGRES_DATASOURCE_URL
user: $GRAFANA_POSTGRES_DATASOURCE_USER
secureJsonData:
password: $GRAFANA_POSTGRES_DATASOURCE_PWD
jsonData:
database: metrics
maxOpenConns: 100
maxIdleConns: 100
maxIdleConnsAuto: true
connMaxLifetime: 14400
postgresVersion: 1000
timescaledb: false

View File

@ -3,7 +3,7 @@ CREATE TABLE IF NOT EXISTS benchmarks (
branch VARCHAR(255),
commit_id VARCHAR(72),
commit_message VARCHAR(70),
gpu_name VARCHAR(255),
metadata jsonb,
created_at timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
);

View File

@ -1,71 +1,25 @@
import argparse
import json
import logging
from logging import Logger
import os
import sys
from statistics import mean
from threading import Event, Thread
from time import perf_counter, sleep
from typing import Optional
from benchmarks_entrypoint import MetricsRecorder
import gpustat
import psutil
import psycopg2
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, StaticCache
from psycopg2.extras import Json
from psycopg2.extensions import register_adapter
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter("[%(levelname)s - %(asctime)s] %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
os.environ["TOKENIZERS_PARALLELISM"] = "1"
torch.set_float32_matmul_precision("high")
register_adapter(dict, Json)
def parse_arguments():
"""
Parse command line arguments for the benchmarking CLI.
"""
parser = argparse.ArgumentParser(description="CLI for benchmarking the huggingface/transformers.")
parser.add_argument(
"branch",
type=str,
help="The branch name on which the benchmarking is performed.",
)
parser.add_argument(
"commit_id",
type=str,
help="The commit hash on which the benchmarking is performed.",
)
parser.add_argument(
"commit_msg",
type=str,
help="The commit message associated with the commit, truncated to 70 characters.",
)
args = parser.parse_args()
return args.branch, args.commit_id, args.commit_msg
def collect_metrics(benchmark_id, continue_metric_collection):
def collect_metrics(benchmark_id, continue_metric_collection, metrics_recorder):
p = psutil.Process(os.getpid())
conn = psycopg2.connect("dbname=metrics")
cur = conn.cursor()
while not continue_metric_collection.is_set():
with p.oneshot():
cpu_util = p.cpu_percent()
@ -73,47 +27,41 @@ def collect_metrics(benchmark_id, continue_metric_collection):
gpu_stats = gpustat.GPUStatCollection.new_query()
gpu_util = gpu_stats[0]["utilization.gpu"]
gpu_mem_megabytes = gpu_stats[0]["memory.used"]
cur.execute(
"INSERT INTO device_measurements (benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes) VALUES (%s, %s, %s, %s, %s)",
(benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes),
metrics_recorder.collect_device_measurements(
benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
)
sleep(0.01)
conn.commit()
conn.close()
def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
continue_metric_collection = Event()
metrics_thread = None
model_id = "meta-llama/Llama-2-7b-hf"
metrics_recorder = MetricsRecorder(psycopg2.connect("dbname=metrics"), logger, branch, commit_id, commit_msg)
try:
gpu_stats = gpustat.GPUStatCollection.new_query()
gpu_name = gpu_stats[0]["name"]
conn = psycopg2.connect("dbname=metrics")
cur = conn.cursor()
cur.execute(
"INSERT INTO benchmarks (branch, commit_id, commit_message, gpu_name) VALUES (%s, %s, %s, %s) RETURNING benchmark_id",
(branch, commit_id, commit_msg, gpu_name),
benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
logger.info(f"running benchmark #{benchmark_id} on {gpu_name} for {model_id}")
metrics_thread = Thread(
target=collect_metrics,
args=[benchmark_id, continue_metric_collection, metrics_recorder],
)
conn.commit()
benchmark_id = cur.fetchone()[0]
logger.info(f"running benchmark #{benchmark_id} on {gpu_name}")
metrics_thread = Thread(target=collect_metrics, args=[benchmark_id, continue_metric_collection])
metrics_thread.start()
logger.info("started background thread to fetch device metrics")
os.environ["TOKENIZERS_PARALLELISM"] = "false" # silence warnings when compiling
device = "cuda"
ckpt = "meta-llama/Llama-2-7b-hf"
logger.info("downloading weights")
# This is to avoid counting download in model load time measurement
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
gen_config = GenerationConfig(do_sample=False, top_p=1, temperature=1)
logger.info("loading model")
start = perf_counter()
model = AutoModelForCausalLM.from_pretrained(
ckpt, torch_dtype=torch.float16, generation_config=gen_config
model_id, torch_dtype=torch.float16, generation_config=gen_config
).eval()
model.to(device)
torch.cuda.synchronize()
@ -121,7 +69,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
model_load_time = end - start
logger.info(f"loaded model in: {model_load_time}s")
tokenizer = AutoTokenizer.from_pretrained(ckpt)
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = "Why dogs are so cute?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
@ -170,7 +118,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
with torch.no_grad():
past_key_values = StaticCache(
model.config,
batch_size=batch_size,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + num_tokens_to_generate,
@ -196,7 +144,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
past_key_values = StaticCache(
model.config,
batch_size=batch_size,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + num_tokens_to_generate,
@ -239,7 +187,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
# TODO use decode_one_token(model, input_id.clone(), cache_position) for verification
past_key_values = StaticCache(
model.config,
batch_size=batch_size,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + num_tokens_to_generate + 10,
@ -256,7 +204,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
time_to_first_token = end - start
logger.info(f"completed first compile generation in: {time_to_first_token}s")
cache_position += 1
all_generated_tokens += next_token.clone().detach().cpu().tolist()
all_generated_tokens += next_token.tolist()
cache_position = torch.tensor([seq_length], device=device)
### First compile, decoding
@ -267,9 +215,9 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
torch.cuda.synchronize()
end = perf_counter()
time_to_second_token = end - start
logger.info(f"completed second compile generation in: {time_to_first_token}s")
logger.info(f"completed second compile generation in: {time_to_second_token}s")
cache_position += 1
all_generated_tokens += next_token.clone().detach().cpu().tolist()
all_generated_tokens += next_token.tolist()
### Second compile, decoding
start = perf_counter()
@ -279,15 +227,15 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
torch.cuda.synchronize()
end = perf_counter()
time_to_third_token = end - start
logger.info(f"completed third compile forward in: {time_to_first_token}s")
logger.info(f"completed third compile forward in: {time_to_third_token}s")
cache_position += 1
all_generated_tokens += next_token.clone().detach().cpu().tolist()
all_generated_tokens += next_token.tolist()
### Using cuda graphs decoding
start = perf_counter()
for _ in range(1, num_tokens_to_generate):
all_generated_tokens += next_token.clone().detach().cpu().tolist()
all_generated_tokens += next_token.tolist()
next_token = decode_one_token(
model, next_token.clone(), cache_position=cache_position, past_key_values=past_key_values
)
@ -306,7 +254,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
past_key_values = StaticCache(
model.config,
batch_size=batch_size,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
@ -323,7 +271,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
past_key_values = StaticCache(
model.config,
batch_size=batch_size,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
@ -339,7 +287,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
past_key_values = StaticCache(
model.config,
batch_size=batch_size,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
@ -350,12 +298,12 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
output = model.generate(**inputs, past_key_values=past_key_values)
end = perf_counter()
third_compile_generate_time = end - start
logger.info(f"completed second compile generation in: {third_compile_generate_time}s")
logger.info(f"completed third compile generation in: {third_compile_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
past_key_values = StaticCache(
model.config,
batch_size=batch_size,
max_batch_size=batch_size,
device=device,
dtype=torch.float16,
max_cache_len=seq_length + 128,
@ -365,44 +313,30 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
output = model.generate(**inputs, past_key_values=past_key_values)
end = perf_counter()
fourth_compile_generate_time = end - start
logger.info(f"completed second compile generation in: {fourth_compile_generate_time}s")
logger.info(f"completed fourth compile generation in: {fourth_compile_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
cur.execute(
"""
INSERT INTO model_measurements (
benchmark_id,
measurements
) VALUES (%s, %s)
""",
(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
),
metrics_recorder.collect_model_measurements(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
)
conn.commit()
conn.close()
except Exception as e:
logger.error(f"Caught exception: {e}")
continue_metric_collection.set()
if metrics_thread is not None:
metrics_thread.join()
if __name__ == "__main__":
branch, commit_id, commit_msg = parse_arguments()
run_benchmark(branch, commit_id, commit_msg, num_tokens_to_generate=20)
metrics_recorder.close()

View File

@ -46,10 +46,6 @@ NOT_DEVICE_TESTS = {
"test_keep_in_fp32_modules",
"test_gradient_checkpointing_backward_compatibility",
"test_gradient_checkpointing_enable_disable",
"test_save_load_fast_init_from_base",
"test_fast_init_context_manager",
"test_fast_init_tied_embeddings",
"test_save_load_fast_init_to_base",
"test_torch_save_load",
"test_initialization",
"test_forward_signature",
@ -61,7 +57,6 @@ NOT_DEVICE_TESTS = {
"test_load_save_without_tied_weights",
"test_tied_weights_keys",
"test_model_weights_reload_no_missing_tied_weights",
"test_pt_tf_model_equivalence",
"test_mismatched_shapes_have_properly_initialized_weights",
"test_matched_shapes_have_loaded_weights_when_some_mismatched_shapes_exist",
"test_model_is_small",
@ -85,12 +80,6 @@ warnings.simplefilter(action="ignore", category=FutureWarning)
def pytest_configure(config):
config.addinivalue_line(
"markers", "is_pt_tf_cross_test: mark test to run only when PT and TF interactions are tested"
)
config.addinivalue_line(
"markers", "is_pt_flax_cross_test: mark test to run only when PT and FLAX interactions are tested"
)
config.addinivalue_line("markers", "is_pipeline_test: mark test to run only when pipelines are tested")
config.addinivalue_line("markers", "is_staging_test: mark test to run only in the staging environment")
config.addinivalue_line("markers", "accelerate_tests: mark test that require accelerate")

View File

@ -2,8 +2,8 @@
In this folder you will find various docker files, and some subfolders.
- dockerfiles (ex: `consistency.dockerfile`) present under `~/docker` are used for our "fast" CIs. You should be able to use them for tasks that only need CPU. For example `torch-light` is a very light weights container (703MiB).
- subfloder contain dockerfiles used for our `slow` CIs, which *can* be used for GPU tasks, but they are **BIG** as they were not specifically designed for a single model / single task. Thus the `~/docker/transformers-pytorch-gpu` includes additional dependencies to allow us to run ALL model tests (say `librosa` or `tesseract`, which you do not need to run LLMs)
- subfolders contain dockerfiles used for our `slow` CIs, which *can* be used for GPU tasks, but they are **BIG** as they were not specifically designed for a single model / single task. Thus the `~/docker/transformers-pytorch-gpu` includes additional dependencies to allow us to run ALL model tests (say `librosa` or `tesseract`, which you do not need to run LLMs)
Note that in both case, you need to run `uv pip install -e .`, which should take around 5 seconds. We do it outside the dockerfile for the need of our CI: we checkout a new branch each time, and the `transformers` code is thus updated.
We are open to contribution, and invite the community to create dockerfiles with potential arguments that properly choose extras depending on the model's dependencies! :hugs:
We are open to contribution, and invite the community to create dockerfiles with potential arguments that properly choose extras depending on the model's dependencies! :hugs:

View File

@ -1,16 +1,16 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
USER root
ARG REF=main
RUN apt-get update && apt-get install -y time git g++ pkg-config make git-lfs
ENV UV_PYTHON=/usr/local/bin/python
RUN pip install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools GitPython
RUN pip install --no-cache-dir --upgrade 'torch' 'torchaudio' 'torchvision' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir --upgrade 'torch' 'torchaudio' 'torchvision' --index-url https://download.pytorch.org/whl/cpu
# tensorflow pin matching setup.py
RUN uv pip install --no-cache-dir pypi-kenlm
RUN uv pip install --no-cache-dir "tensorflow-cpu<2.16" "tf-keras<2.16"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,quality,testing,torch-speech,vision]"
RUN git lfs install
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean

View File

@ -1,5 +1,6 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git cmake wget xz-utils build-essential g++5 libprotobuf-dev protobuf-compiler
ENV UV_PYTHON=/usr/local/bin/python
@ -16,11 +17,11 @@ RUN make install -j 10
RUN uv pip install --no-cache --upgrade 'torch' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir --no-deps accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir "transformers[ja,testing,sentencepiece,jieba,spacy,ftfy,rjieba]" unidic unidic-lite
RUN uv pip install --no-cache-dir --no-deps accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[ja,testing,sentencepiece,jieba,spacy,ftfy,rjieba]" unidic unidic-lite
# spacy is not used so not tested. Causes to failures. TODO fix later
RUN python3 -m unidic download
RUN pip uninstall -y transformers
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
RUN apt remove -y g++ cmake xz-utils libprotobuf-dev protobuf-compiler
RUN apt remove -y g++ cmake xz-utils libprotobuf-dev protobuf-compiler

View File

@ -1,12 +1,13 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git
RUN apt-get install -y g++ cmake
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv
RUN uv pip install --no-cache-dir -U pip setuptools albumentations seqeval
RUN pip install --upgrade --no-cache-dir "transformers[tf-cpu,sklearn,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3"
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
RUN uv pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[tf-cpu,sklearn,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3"
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

View File

@ -1,11 +1,12 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "transformers[sklearn,sentencepiece,vision,testing]" seqeval albumentations jiwer
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
RUN uv pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing]" seqeval albumentations jiwer
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

View File

@ -1,17 +1,17 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git libgl1-mesa-glx libgl1 g++ tesseract-ocr
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir --no-deps timm accelerate
RUN pip install -U --upgrade-strategy eager --no-cache-dir pytesseract python-Levenshtein opencv-python nltk
# RUN uv pip install --no-cache-dir natten==0.15.1+torch210cpu -f https://shi-labs.com/natten/wheels
RUN pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[testing, vision]" 'scikit-learn' 'torch-stft' 'nose' 'dataset'
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[testing, vision]" 'scikit-learn' 'torch-stft' 'nose' 'dataset'
# RUN git clone https://github.com/facebookresearch/detectron2.git
# RUN python3 -m pip install --no-cache-dir -e detectron2
RUN pip install 'git+https://github.com/facebookresearch/detectron2.git@92ae9f0b92aba5867824b4f12aa06a22a60a45d3'
RUN pip uninstall -y transformers
RUN uv pip install 'git+https://github.com/facebookresearch/detectron2.git@92ae9f0b92aba5867824b4f12aa06a22a60a45d3' --no-build-isolation
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

View File

@ -1,10 +1,10 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git g++ cmake
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir "scipy<1.13" "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,testing,sentencepiece,flax-speech,vision]"
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean
RUN uv pip install --no-cache-dir "scipy<1.13" "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,testing,sentencepiece,flax-speech,vision]"
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean

View File

@ -1,10 +1,10 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git cmake g++
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,tf-cpu,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,tf-cpu,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3" tensorflow_probability
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

View File

@ -1,11 +1,11 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git pkg-config openssh-client git
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing]"
RUN pip uninstall -y transformers
RUN uv pip uninstall transformers

View File

@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
@ -6,4 +6,4 @@ RUN apt-get update && apt-get install -y time git
ENV UV_PYTHON=/usr/local/bin/python
RUN pip install uv && uv venv
RUN uv pip install --no-cache-dir -U pip setuptools GitPython "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[ruff]" urllib3
RUN apt-get install -y jq curl && apt-get clean && rm -rf /var/lib/apt/lists/*
RUN apt-get install -y jq curl && apt-get clean && rm -rf /var/lib/apt/lists/*

View File

@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
@ -6,7 +6,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-de
RUN apt-get install -y cmake
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[tf-cpu,sklearn,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3"
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean
RUN uv pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[tf-cpu,sklearn,testing,sentencepiece,tf-speech,vision]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3"
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean

View File

@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
@ -6,11 +6,11 @@ RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git g++
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-deps accelerate
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN pip install --no-cache-dir "scipy<1.13" "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,audio,sklearn,sentencepiece,vision,testing]"
RUN uv pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir "scipy<1.13" "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,audio,sklearn,sentencepiece,vision,testing]"
# RUN pip install --no-cache-dir "scipy<1.13" "transformers[flax,testing,sentencepiece,flax-speech,vision]"
RUN pip uninstall -y transformers
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean

View File

@ -1,11 +1,11 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git git-lfs
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-deps timm accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing,tiktoken]"
RUN pip uninstall -y transformers
RUN uv pip install --no-cache-dir librosa "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[sklearn,sentencepiece,vision,testing,tiktoken,num2words,video]"
RUN uv pip uninstall transformers

View File

@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
RUN echo ${REF}
@ -7,13 +7,13 @@ RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-de
ENV UV_PYTHON=/usr/local/bin/python
RUN pip --no-cache-dir install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-cache-dir --no-deps accelerate --extra-index-url https://download.pytorch.org/whl/cpu
RUN pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir 'torch' 'torchvision' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
RUN git lfs install
RUN uv pip install --no-cache-dir pypi-kenlm
RUN pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[tf-cpu,sklearn,sentencepiece,vision,testing]"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[tf-cpu,sklearn,sentencepiece,vision,testing]"
RUN uv pip install --no-cache-dir "protobuf==3.20.3" librosa
RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean
RUN uv pip uninstall transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean

View File

@ -9,7 +9,7 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).
ARG PYTORCH='2.5.1'
ARG PYTORCH='2.6.0'
# (not always a valid torch version)
ARG INTEL_TORCH_EXT='2.3.0'
# Example: `cu102`, `cu113`, etc.
@ -57,7 +57,8 @@ RUN python3 -m pip uninstall -y ninja
# For `dinat` model
# The `XXX` part in `torchXXX` needs to match `PYTORCH` (to some extent)
RUN python3 -m pip install --no-cache-dir natten==0.15.1+torch220$CUDA -f https://shi-labs.com/natten/wheels
# pin `0.17.4` otherwise `cannot import name 'natten2dav' from 'natten.functional'`
RUN python3 -m pip install --no-cache-dir natten==0.17.4+torch250cu121 -f https://shi-labs.com/natten/wheels
# For `nougat` tokenizer
RUN python3 -m pip install --no-cache-dir python-Levenshtein
@ -65,6 +66,9 @@ RUN python3 -m pip install --no-cache-dir python-Levenshtein
# For `FastSpeech2ConformerTokenizer` tokenizer
RUN python3 -m pip install --no-cache-dir g2p-en
# For Some bitsandbytes tests
RUN python3 -m pip install --no-cache-dir einops
# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop

View File

@ -48,8 +48,8 @@ RUN python3 -m pip uninstall -y torch-tensorrt apex
# Pre-build **nightly** release of DeepSpeed, so it would be ready for testing (otherwise, the 1st deepspeed test will timeout)
RUN python3 -m pip uninstall -y deepspeed
# This has to be run inside the GPU VMs running the tests. (So far, it fails here due to GPU checks during compilation.)
# Issue: https://github.com/microsoft/DeepSpeed/issues/2010
# RUN git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build && \
# Issue: https://github.com/deepspeedai/DeepSpeed/issues/2010
# RUN git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build && \
# DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 DS_BUILD_UTILS=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1
RUN python3 -m pip install -U "itsdangerous<2.1.0"

View File

@ -1,17 +1,18 @@
FROM rocm/dev-ubuntu-22.04:6.0.2
# rocm/pytorch has no version with 2.1.0
FROM rocm/dev-ubuntu-22.04:6.2.4
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y --no-install-recommends git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-dev python3-pip python3-dev ffmpeg && \
apt install -y --no-install-recommends git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-dev python3-pip python3-dev ffmpeg git-lfs && \
apt clean && \
rm -rf /var/lib/apt/lists/*
RUN git lfs install
RUN python3 -m pip install --no-cache-dir --upgrade pip numpy
RUN python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
RUN python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2.4
RUN python3 -m pip install --no-cache-dir --upgrade importlib-metadata setuptools ninja git+https://github.com/facebookresearch/detectron2.git pytesseract "itsdangerous<2.1.0"
@ -30,5 +31,5 @@ RUN python3 -m pip uninstall -y tensorflow flax
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop
# Remove nvml as it is not compatible with ROCm. apex is not tested on NVIDIA either.
RUN python3 -m pip uninstall py3nvml pynvml apex -y
# Remove nvml and nvidia-ml-py as it is not compatible with ROCm. apex is not tested on NVIDIA either.
RUN python3 -m pip uninstall py3nvml pynvml nvidia-ml-py apex -y

View File

@ -1,11 +1,11 @@
FROM rocm/dev-ubuntu-22.04:5.6
FROM rocm/dev-ubuntu-22.04:6.2.4
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
ARG PYTORCH='2.1.1'
ARG TORCH_VISION='0.16.1'
ARG TORCH_AUDIO='2.1.1'
ARG ROCM='5.6'
ARG PYTORCH='2.6.0'
ARG TORCH_VISION='0.21.0'
ARG TORCH_AUDIO='2.6.0'
ARG ROCM='6.2.4'
RUN apt update && \
apt install -y --no-install-recommends \
@ -16,9 +16,11 @@ RUN apt update && \
python-is-python3 \
rocrand-dev \
rocthrust-dev \
rocblas-dev \
hipsolver-dev \
hipsparse-dev \
hipblas-dev \
rocblas-dev && \
hipblaslt-dev && \
apt clean && \
rm -rf /var/lib/apt/lists/*
@ -45,4 +47,4 @@ RUN cd transformers && python3 setup.py develop
RUN python3 -c "from deepspeed.launcher.runner import main"
# Remove nvml as it is not compatible with ROCm
RUN python3 -m pip uninstall py3nvml pynvml -y
RUN python3 -m pip uninstall py3nvml pynvml nvidia-ml-py apex -y

View File

@ -1,5 +1,5 @@
# https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-11.html#rel-23-11
FROM nvcr.io/nvidia/pytorch:23.04-py3
FROM nvcr.io/nvidia/pytorch:23.11-py3
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive

View File

@ -34,8 +34,8 @@ RUN python3 -m pip uninstall -y torch-tensorrt apex
# Pre-build **nightly** release of DeepSpeed, so it would be ready for testing (otherwise, the 1st deepspeed test will timeout)
RUN python3 -m pip uninstall -y deepspeed
# This has to be run inside the GPU VMs running the tests. (So far, it fails here due to GPU checks during compilation.)
# Issue: https://github.com/microsoft/DeepSpeed/issues/2010
# RUN git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build && \
# Issue: https://github.com/deepspeedai/DeepSpeed/issues/2010
# RUN git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build && \
# DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 DS_BUILD_UTILS=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1
## For `torchdynamo` tests

View File

@ -11,7 +11,7 @@ ARG REF=main
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF
# If set to nothing, will install the latest version
ARG PYTORCH='2.5.1'
ARG PYTORCH='2.6.0'
ARG TORCH_VISION=''
ARG TORCH_AUDIO=''
# Example: `cu102`, `cu113`, etc.

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -9,9 +9,9 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).
ARG PYTORCH='2.4.1'
ARG PYTORCH='2.6.0'
# Example: `cu102`, `cu113`, etc.
ARG CUDA='cu118'
ARG CUDA='cu121'
RUN apt update
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-pip ffmpeg
@ -26,8 +26,6 @@ RUN echo torch=$VERSION
# Currently, let's just use their latest releases (when `torch` is installed with a release version)
RUN python3 -m pip install --no-cache-dir -U $VERSION torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/$CUDA
RUN python3 -m pip install --no-cache-dir -e ./transformers[dev-torch]
RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
# needed in bnb and awq
@ -36,15 +34,26 @@ RUN python3 -m pip install --no-cache-dir einops
# Add bitsandbytes for mixed int8 testing
RUN python3 -m pip install --no-cache-dir bitsandbytes
# Add auto-gptq for gtpq quantization testing
RUN python3 -m pip install --no-cache-dir auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
# Add gptqmodel for gtpq quantization testing, installed from source for pytorch==2.6.0 compatibility
RUN python3 -m pip install lm_eval
RUN git clone https://github.com/ModelCloud/GPTQModel.git && cd GPTQModel && pip install -v . --no-build-isolation
# Add optimum for gptq quantization testing
RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/optimum@main#egg=optimum
# Add PEFT
RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/peft@main#egg=peft
# Add aqlm for quantization testing
RUN python3 -m pip install --no-cache-dir aqlm[gpu]==1.0.2
# Add vptq for quantization testing
RUN pip install vptq
# Add spqr for quantization testing
# Commented for now as No matching distribution found we need to reach out to the authors
# RUN python3 -m pip install --no-cache-dir spqr_quant[gpu]
# Add hqq for quantization testing
RUN python3 -m pip install --no-cache-dir hqq
@ -52,14 +61,29 @@ RUN python3 -m pip install --no-cache-dir hqq
RUN python3 -m pip install --no-cache-dir gguf
# Add autoawq for quantization testing
# >=v0.2.3 needed for compatibility with torch 2.2.1
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.3/autoawq-0.2.3+cu118-cp310-cp310-linux_x86_64.whl
# New release v0.2.8
RUN python3 -m pip install --no-cache-dir autoawq[kernels]
# Add quanto for quantization testing
RUN python3 -m pip install --no-cache-dir optimum-quanto
# Add eetq for quantization testing
RUN python3 -m pip install git+https://github.com/NetEase-FuXi/EETQ.git
RUN git clone https://github.com/NetEase-FuXi/EETQ.git && cd EETQ/ && git submodule update --init --recursive && pip install .
# # Add flute-kernel and fast_hadamard_transform for quantization testing
# # Commented for now as they cause issues with the build
# # TODO: create a new workflow to test them
# RUN python3 -m pip install --no-cache-dir flute-kernel==0.4.1
# RUN python3 -m pip install --no-cache-dir git+https://github.com/Dao-AILab/fast-hadamard-transform.git
# Add compressed-tensors for quantization testing
RUN python3 -m pip install --no-cache-dir compressed-tensors
# Add AMD Quark for quantization testing
RUN python3 -m pip install --no-cache-dir amd-quark
# Add transformers in editable mode
RUN python3 -m pip install --no-cache-dir -e ./transformers[dev-torch]
# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.

View File

@ -30,26 +30,26 @@
- local: conversations
title: الدردشة مع المحولات
title: البرامج التعليمية
# - sections:
# - isExpanded: false
# sections:
# - local: tasks/sequence_classification
# title: تصنيف النصوص
# - local: tasks/token_classification
# title: تصنيف الرموز
# - local: tasks/question_answering
# title: الإجابة على الأسئلة
# - local: tasks/language_modeling
# title: نمذجة اللغة السببية
# - local: tasks/masked_language_modeling
# title: نمذجة اللغة المقنعة
# - local: tasks/translation
# title: الترجمة
# - local: tasks/summarization
# title: التلخيص
# - local: tasks/multiple_choice
# title: الاختيار المتعدد
# title: معالجة اللغات الطبيعية
- sections:
- isExpanded: false
sections:
- local: tasks/sequence_classification
title: تصنيف النصوص
- local: tasks/token_classification
title: تصنيف الرموز
- local: tasks/question_answering
title: الإجابة على الأسئلة
- local: tasks/language_modeling
title: نمذجة اللغة السببية
- local: tasks/masked_language_modeling
title: نمذجة اللغة المقنعة
- local: tasks/translation
title: الترجمة
- local: tasks/summarization
title: التلخيص
- local: tasks/multiple_choice
title: الاختيار المتعدد
title: معالجة اللغات الطبيعية
# - isExpanded: false
# sections:
# - local: tasks/audio_classification
@ -107,10 +107,10 @@
# - local: tasks/prompting
# title: دليل إرشادي لمحفزات النماذج اللغوية الكبيرة
# title: الإرشاد
# title: أدلة المهام
title: أدلة المهام
- sections:
- local: fast_tokenizers
title: استخدم مجزئيات النصوص السريعة من 🤗 Tokenizers
title: استخدم مجزئيات النصوص السريعة من 🤗 Tokenizers
- local: multilingual
title: الاستدلال باستخدام نماذج متعددة اللغات
- local: create_a_model
@ -129,16 +129,20 @@
title: التصدير إلى TFLite
- local: torchscript
title: التصدير إلى TorchScript
# - local: benchmarks
# title: المعايير
# - local: notebooks
# title: دفاتر الملاحظات مع الأمثلة
# - local: community
# title: موارد المجتمع
- local: notebooks
title: دفاتر الملاحظات مع الأمثلة
- local: community
title: موارد المجتمع
- local: troubleshooting
title: استكشاف الأخطاء وإصلاحها
- local: gguf
title: التوافق مع ملفات GGUF
- local: tiktoken
title: التوافق مع ملفات TikToken
- local: modular_transformers
title: الوحدات النمطية في `transformers`
- local: how_to_hack_models
title: اختراق النموذج (الكتابة فوق فئة لاستخدامك)
title: أدلة المطورين
# - sections:
# - local: quantization/overview
@ -151,6 +155,8 @@
# title: AWQ
# - local: quantization/aqlm
# title: AQLM
# - local: quantization/vptq
# title: VPTQ
# - local: quantization/quanto
# title: Quanto
# - local: quantization/eetq
@ -875,7 +881,7 @@
# - local: internal/pipelines_utils
# title: مرافق خطوط الأنابيب
# - local: internal/tokenization_utils
# title: مرافق مقسم النصوص
# title: مرافق مقسم النصوص
# - local: internal/trainer_utils
# title: مرافق المدرب
# - local: internal/generation_utils

View File

@ -195,7 +195,7 @@ You have access to the following tools:
To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.
At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task, then the tools that you want to use.
Then in the 'Code:' sequence, you shold write the code in simple Python. The code sequence must end with '/End code' sequence.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '/End code' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need.
These print outputs will then be available in the 'Observation:' field, for using this information as input for the next step.
@ -205,7 +205,7 @@ Here are a few examples using notional tools:
---
{examples}
Above example were using notional tools that might not exist for you. You only have acces to those tools:
Above example were using notional tools that might not exist for you. You only have access to those tools:
<<tool_names>>
You also can perform computations in the python code you generate.

View File

@ -15,4 +15,4 @@
- الوصول إلى جميع أوزان الانتباه لكل رأس في BERT/GPT/GPT-2،
- استرجاع قيم ومشتقات مخرجات الرأس لحساب درجة أهمية الرأس وحذفه كما هو موضح في https://arxiv.org/abs/1905.10650.
ولمساعدتك على فهم واستخدام هذه الميزات بسهولة، أضفنا مثالًا برمجيًا محددًا: [bertology.py](https://github.com/huggingface/transformers/tree/main/examples/research_projects/bertology/run_bertology.py) أثناء استخراج المعلومات وتقليص من نموذج تم تدريبه مسبقًا على GLUE.
ولمساعدتك على فهم واستخدام هذه الميزات بسهولة، أضفنا مثالًا برمجيًا محددًا: [bertology.py](https://github.com/huggingface/transformers-research-projects/tree/main/bertology/run_bertology.py) أثناء استخراج المعلومات وتقليص من نموذج تم تدريبه مسبقًا على GLUE.

View File

@ -0,0 +1,66 @@
# مجتمع المطورين
هذه الصفحة تجمع الموارد حول 🤗 Transformers التي طورها المجتمع.
## موارد المجتمع:
| المصدر | الوصف | المؤلف |
|:----------|:-------------|------:|
| [Hugging Face Transformers Glossary Flashcards](https://www.darigovresearch.com/huggingface-transformers-glossary-flashcards) | مجموعة من البطاقات التعليمية القائمة على [Transformers Docs Glossary](glossary) والتي تم وضعها في شكل يمكن تعلمه/مراجعته بسهولة باستخدام [Anki](https://apps.ankiweb.net/) وهو تطبيق مفتوح المصدر متعدد المنصات مصمم خصيصًا للاحتفاظ بالمعرفة على المدى الطويل. شاهد هذا [فيديو تمهيدي حول كيفية استخدام البطاقات التعليمية](https://www.youtube.com/watch?v=Dji_7PILrw). | [Darigov Research](https://www.darigovresearch.com/) |
## دفاتر ملاحظات المجتمع:
| الدفتر | الوصف | المؤلف | |
|:----------|:-------------|:-------------|------:|
| [Fine-tune a pre-trained Transformer to generate lyrics](https://github.com/AlekseyKorshuk/huggingartists) | كيفية توليد كلمات الأغاني على غرار فنانك المفضل من خلال ضبط نموذج GPT-2 | [Aleksey Korshuk](https://github.com/AlekseyKorshuk) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AlekseyKorshuk/huggingartists/blob/master/huggingartists-demo.ipynb) |
| [Train T5 in Tensorflow 2](https://github.com/snapthat/TF-T5-text-to-text) | كيفية تدريب T5 لأي مهمة باستخدام Tensorflow 2. يوضح هذا الدفتر مهمة السؤال والجواب المنفذة في Tensorflow 2 باستخدام SQUAD | [Muhammad Harris](https://github.com/HarrisDePerceptron) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snapthat/TF-T5-text-to-text/blob/master/snapthatT5/notebooks/TF-T5-Datasets%20Training.ipynb) |
| [Train T5 on TPU](https://github.com/patil-suraj/exploring-T5/blob/master/T5_on_TPU.ipynb) | كيفية تدريب T5 على SQUAD مع Transformers و Nlp | [Suraj Patil](https://github.com/patil-suraj) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/exploring-T5/blob/master/T5_on_TPU.ipynb#scrollTo=QLGiFCDqvuil) |
| [Fine-tune T5 for Classification and Multiple Choice](https://github.com/patil-suraj/exploring-T5/blob/master/t5_fine_tuning.ipynb) | كيفية ضبط نموذج T5 للتصنيف والمهام متعددة الخيارات باستخدام تنسيق النص إلى نص مع PyTorch Lightning | [Suraj Patil](https://github.com/patil-suraj) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/exploring-T5/blob/master/t5_fine_tuning.ipynb) |
| [Fine-tune DialoGPT on New Datasets and Languages](https://github.com/ncoop57/i-am-a-nerd/blob/master/_notebooks/2020-05-12-chatbot-part-1.ipynb) | كيفية ضبط نموذج DialoGPT على مجموعة بيانات جديدة لروبوتات الدردشة المحادثية المفتوحة | [Nathan Cooper](https://github.com/ncoop57) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ncoop57/i-am-a-nerd/blob/master/_notebooks/2020-05-12-chatbot-part-1.ipynb) |
| [Long Sequence Modeling with Reformer](https://github.com/patrickvonplaten/notebooks/blob/master/PyTorch_Reformer.ipynb) | كيفية التدريب على تسلسلات طويلة تصل إلى 500,000 رمز باستخدام Reformer | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/PyTorch_Reformer.ipynb) |
| [Fine-tune BART for Summarization](https://github.com/ohmeow/ohmeow_website/blob/master/posts/2021-05-25-mbart-sequence-classification-with-blurr.ipynb) | كيفية ضبط نموذج BART للتلخيص باستخدام fastai باستخدام blurr | [Wayde Gilliam](https://ohmeow.com/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ohmeow/ohmeow_website/blob/master/posts/2021-05-25-mbart-sequence-classification-with-blurr.ipynb) |
| [Fine-tune a pre-trained Transformer on anyone's tweets](https://colab.research.google.com/github/borisdayma/huggingtweets/blob/master/huggingtweets-demo.ipynb) | كيفية توليد تغريدات على غرار حساب Twitter المفضل لديك من خلال ضبط نموذج GPT-2 | [Boris Dayma](https://github.com/borisdayma) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/borisdayma/huggingtweets/blob/master/huggingtweets-demo.ipynb) |
| [Optimize 🤗 Hugging Face models with Weights & Biases](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/huggingface/Optimize_Hugging_Face_models_with_Weights_%26_Biases.ipynb) | دليل كامل لعرض تكامل W&B مع Hugging Face | [Boris Dayma](https://github.com/borisdayma) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/huggingface/Optimize_Hugging_Face_models_with_Weights_%26_Biases.ipynb) |
| [Pretrain Longformer](https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb) | كيفية بناء نسخة "طويلة" من النماذج المسبقة التدريب الموجودة | [Iz Beltagy](https://beltagy.net) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb) |
| [Fine-tune Longformer for QA](https://github.com/patil-suraj/Notebooks/blob/master/longformer_qa_training.ipynb) | كيفية ضبط نموذج Longformer لمهمة QA | [Suraj Patil](https://github.com/patil-suraj) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/longformer_qa_training.ipynb) |
| [Evaluate Model with 🤗nlp](https://github.com/patrickvonplaten/notebooks/blob/master/How_to_evaluate_Longformer_on_TriviaQA_using_NLP.ipynb) | كيفية تقييم نموذج Longformer على TriviaQA مع `nlp` | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1m7eTGlPmLRgoPkkA7rkhQdZ9ydpmsdLE?usp=sharing) |
| [Fine-tune T5 for Sentiment Span Extraction](https://github.com/enzoampil/t5-intro/blob/master/t5_qa_training_pytorch_span_extraction.ipynb) | كيفية ضبط نموذج T5 لاستخراج المشاعر باستخدام تنسيق النص إلى نص مع PyTorch Lightning | [Lorenzo Ampil](https://github.com/enzoampil) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/enzoampil/t5-intro/blob/master/t5_qa_training_pytorch_span_extraction.ipynb) |
| [Fine-tune DistilBert for Multiclass Classification](https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_multiclass_classification.ipynb) | كيفية ضبط نموذج DistilBert للتصنيف متعدد الفئات باستخدام PyTorch | [Abhishek Kumar Mishra](https://github.com/abhimishra91) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abhimishra91/transformers-tutorials/blob/master/transformers_multiclass_classification.ipynb)|
|[Fine-tune BERT for Multi-label Classification](https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_multi_label_classification.ipynb)|كيفية ضبط نموذج BERT للتصنيف متعدد التصنيفات باستخدام PyTorch|[Abhishek Kumar Mishra](https://github.com/abhimishra91) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abhimishra91/transformers-tutorials/blob/master/transformers_multi_label_classification.ipynb)|
|[Fine-tune T5 for Summarization](https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_summarization_wandb.ipynb)|كيفية ضبط نموذج T5 للتلخيص في PyTorch وتتبع التجارب باستخدام WandB|[Abhishek Kumar Mishra](https://github.com/abhimishra91) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abhimishra91/transformers-tutorials/blob/master/transformers_summarization_wandb.ipynb)|
|[Speed up Fine-Tuning in Transformers with Dynamic Padding / Bucketing](https://github.com/ELS-RD/transformers-notebook/blob/master/Divide_Hugging_Face_Transformers_training_time_by_2_or_more.ipynb)|كيفية تسريع الضبط الدقيق بعامل 2 باستخدام الضبط الديناميكي/التقسيم|[Michael Benesty](https://github.com/pommedeterresautee) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1CBfRU1zbfu7-ijiOqAAQUA-RJaxfcJoO?usp=sharing)|
|[Pretrain Reformer for Masked Language Modeling](https://github.com/patrickvonplaten/notebooks/blob/master/Reformer_For_Masked_LM.ipynb)| كيفية تدريب نموذج Reformer مع طبقات الانتباه ثنائية الاتجاه | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tzzh0i8PgDQGV3SMFUGxM7_gGae3K-uW?usp=sharing)|
|[Expand and Fine Tune Sci-BERT](https://github.com/lordtt13/word-embeddings/blob/master/COVID-19%20Research%20Data/COVID-SciBERT.ipynb)| كيفية زيادة مفردات نموذج SciBERT المسبق التدريب من AllenAI على مجموعة بيانات CORD وإنشاء خط أنابيب لها. | [Tanmay Thakur](https://github.com/lordtt13) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1rqAR40goxbAfez1xvF3hBJphSCsvXmh8)|
|[Fine Tune BlenderBotSmall for Summarization using the Trainer API](https://github.com/lordtt13/transformers-experiments/blob/master/Custom%20Tasks/fine-tune-blenderbot_small-for-summarization.ipynb)| كيفية ضبط نموذج BlenderBotSmall للتلخيص على مجموعة بيانات مخصصة، باستخدام واجهة برمجة التطبيقات Trainer. | [Tanmay Thakur](https://github.com/lordtt13) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/19Wmupuls7mykSGyRN_Qo6lPQhgp56ymq?usp=sharing)|
|[Fine-tune Electra and interpret with Integrated Gradients](https://github.com/elsanns/xai-nlp-notebooks/blob/master/electra_fine_tune_interpret_captum_ig.ipynb) | كيفية ضبط نموذج Electra للتحليل العاطفي وتفسير التنبؤات باستخدام Captum Integrated Gradients | [Eliza Szczechla](https://elsanns.github.io) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elsanns/xai-nlp-notebooks/blob/master/electra_fine_tune_interpret_captum_ig.ipynb)|
|[fine-tune a non-English GPT-2 Model with Trainer class](https://github.com/philschmid/fine-tune-GPT-2/blob/master/Fine_tune_a_non_English_GPT_2_Model_with_Huggingface.ipynb) | كيفية ضبط نموذج GPT-2 غير الإنجليزي باستخدام فئة Trainer | [Philipp Schmid](https://www.philschmid.de) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/philschmid/fine-tune-GPT-2/blob/master/Fine_tune_a_non_English_GPT_2_Model_with_Huggingface.ipynb)|
|[Fine-tune a DistilBERT Model for Multi Label Classification task](https://github.com/DhavalTaunk08/Transformers_scripts/blob/master/Transformers_multilabel_distilbert.ipynb) | كيفية ضبط نموذج DistilBERT لمهمة التصنيف متعدد التصنيفات | [Dhaval Taunk](https://github.com/DhavalTaunk08) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhavalTaunk08/Transformers_scripts/blob/master/Transformers_multilabel_distilbert.ipynb)|
|[Fine-tune ALBERT for sentence-pair classification](https://github.com/NadirEM/nlp-notebooks/blob/master/Fine_tune_ALBERT_sentence_pair_classification.ipynb) | كيفية ضبط نموذج ALBERT أو أي نموذج آخر قائم على BERT لمهمة التصنيف المزدوج للجمل | [Nadir El Manouzi](https://github.com/NadirEM) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NadirEM/nlp-notebooks/blob/master/Fine_tune_ALBERT_sentence_pair_classification.ipynb)|
|[Fine-tune Roberta for sentiment analysis](https://github.com/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb) | كيفية ضبط نموذج Roberta للتحليل العاطفي | [Dhaval Taunk](https://github.com/DhavalTaunk08) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb)|
|[Evaluating Question Generation Models](https://github.com/flexudy-pipe/qugeev) | ما مدى دقة الإجابات على الأسئلة التي يولدها نموذجك التحويلي seq2seq؟ | [Pascal Zoleko](https://github.com/zolekode) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1bpsSqCQU-iw_5nNoRm_crPq6FRuJthq_?usp=sharing)|
|[Classify text with DistilBERT and Tensorflow](https://github.com/peterbayerle/huggingface_notebook/blob/main/distilbert_tf.ipynb) | كيفية ضبط نموذج DistilBERT للتصنيف النصي في TensorFlow | [Peter Bayerle](https://github.com/peterbayerle) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/peterbayerle/huggingface_notebook/blob/main/distilbert_tf.ipynb)|
|[Leverage BERT for Encoder-Decoder Summarization on CNN/Dailymail](https://github.com/patrickvonplaten/notebooks/blob/master/BERT2BERT_for_CNN_Dailymail.ipynb) | كيفية البدء السريع لنموذج *EncoderDecoderModel* مع نقطة تفتيش *google-bert/bert-base-uncased* للتلخيص على CNN/Dailymail | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/BERT2BERT_for_CNN_Dailymail.ipynb)|
|[Leverage RoBERTa for Encoder-Decoder Summarization on BBC XSum](https://github.com/patrickvonplaten/notebooks/blob/master/RoBERTaShared_for_BBC_XSum.ipynb) | كيفية البدء السريع لنموذج *EncoderDecoderModel* المشترك مع نقطة تفتيش *FacebookAI/roberta-base* للتلخيص على BBC/XSum | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/RoBERTaShared_for_BBC_XSum.ipynb)|
|[Fine-tune TAPAS on Sequential Question Answering (SQA)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Fine_tuning_TapasForQuestionAnswering_on_SQA.ipynb) | كيفية ضبط نموذج *TapasForQuestionAnswering* مع نقطة تفتيش *tapas-base* على مجموعة بيانات Sequential Question Answering (SQA) | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Fine_tuning_TapasForQuestionAnswering_on_SQA.ipynb)|
|[Evaluate TAPAS on Table Fact Checking (TabFact)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Evaluating_TAPAS_on_the_Tabfact_test_set.ipynb) | كيفية تقييم نموذج *TapasForSequenceClassification* المضبوط مسبقًا مع نقطة تفتيش *tapas-base-finetuned-tabfact* باستخدام مزيج من مكتبتي 🤗 datasets و 🤗 transformers | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Evaluating_TAPAS_on_the_Tabfact_test_set.ipynb)|
|[Fine-tuning mBART for translation](https://colab.research.google.com/github/vasudevgupta7/huggingface-tutorials/blob/main/translation_training.ipynb) | كيفية ضبط نموذج mBART باستخدام Seq2SeqTrainer للترجمة من الهندية إلى الإنجليزية | [Vasudev Gupta](https://github.com/vasudevgupta7) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vasudevgupta7/huggingface-tutorials/blob/main/translation_training.ipynb)|
|[Fine-tune LayoutLM on FUNSD (a form understanding dataset)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForTokenClassification_on_FUNSD.ipynb) | كيفية ضبط نموذج *LayoutLMForTokenClassification* على مجموعة بيانات FUNSD لاستخراج المعلومات من المستندات الممسوحة ضوئيًا | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForTokenClassification_on_FUNSD.ipynb)|
|[Fine-Tune DistilGPT2 and Generate Text](https://colab.research.google.com/github/tripathiaakash/DistilGPT2-Tutorial/blob/main/distilgpt2_fine_tuning.ipynb) | كيفية ضبط نموذج DistilGPT2 وتوليد النص | [Aakash Tripathi](https://github.com/tripathiaakash) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/tripathiaakash/DistilGPT2-Tutorial/blob/main/distilgpt2_fine_tuning.ipynb)|
|[Fine-Tune LED on up to 8K tokens](https://github.com/patrickvonplaten/notebooks/blob/master/Fine_tune_Longformer_Encoder_Decoder_(LED)_for_Summarization_on_pubmed.ipynb) | كيفية ضبط نموذج LED على pubmed للتلخيص طويل المدى | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Fine_tune_Longformer_Encoder_Decoder_(LED)_for_Summarization_on_pubmed.ipynb)|
|[Evaluate LED on Arxiv](https://github.com/patrickvonplaten/notebooks/blob/master/LED_on_Arxiv.ipynb) | كيفية تقييم نموذج LED للتلخيص طويل المدى بشكل فعال | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/LED_on_Arxiv.ipynb)|
|[Fine-tune LayoutLM on RVL-CDIP (a document image classification dataset)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForSequenceClassification_on_RVL_CDIP.ipynb) | كيفية ضبط نموذج *LayoutLMForSequenceClassification* على مجموعة بيانات RVL-CDIP لتصنيف المستندات الممسوحة ضوئيًا | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForSequenceClassification_on_RVL_CDIP.ipynb)|
|[Wav2Vec2 CTC decoding with GPT2 adjustment](https://github.com/voidful/huggingface_notebook/blob/main/xlsr_gpt.ipynb) | كيفية فك تشفير تسلسل CTC مع تعديل نموذج اللغة | [Eric Lam](https://github.com/voidful) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1e_zQHYbO2YKEaUgzb1ww1WwiAyydAj?usp=sharing)|
|[Fine-tune BART for summarization in two languages with Trainer class](https://github.com/elsanns/xai-nlp-notebooks/blob/master/fine_tune_bart_summarization_two_langs.ipynb) | كيفية ضبط نموذج BART للتلخيص بلغتين باستخدام فئة Trainer | [Eliza Szczechla](https://github.com/elsanns) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elsanns/xai-nlp-notebooks/blob/master/fine_tune_bart_summarization_two_langs.ipynb)|
|[Evaluate Big Bird on Trivia QA](https://github.com/patrickvonplaten/notebooks/blob/master/Evaluating_Big_Bird_on_TriviaQA.ipynb) | كيفية تقييم نموذج BigBird للأسئلة والأجوبة على وثائق طويلة على Trivia QA | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Evaluating_Big_Bird_on_TriviaQA.ipynb)|
| [Create video captions using Wav2Vec2](https://github.com/Muennighoff/ytclipcc/blob/main/wav2vec_youtube_captions.ipynb) | كيفية إنشاء تعليقات توضيحية على YouTube من أي فيديو من خلال تفريغ الصوت باستخدام Wav2Vec | [Niklas Muennighoff](https://github.com/Muennighoff) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Muennighoff/ytclipcc/blob/main/wav2vec_youtube_captions.ipynb) |
| [Fine-tune the Vision Transformer on CIFAR-10 using PyTorch Lightning](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_PyTorch_Lightning.ipynb) | كيفية ضبط نموذج Vision Transformer (ViT) على CIFAR-10 باستخدام مكتبات HuggingFace Transformers و Datasets و PyTorch Lightning | [Niels Rogge](https://github.com/nielsrogge) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_PyTorch_Lightning.ipynb) |
| [Fine-tune the Vision Transformer on CIFAR-10 using the 🤗 Trainer](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_the_%F0%9F%A4%97_Trainer.ipynb) | كيفية ضبط نموذج Vision Transformer (ViT) على CIFAR-10 باستخدام مكتبات HuggingFace Transformers و Datasets و 🤗 Trainer | [Niels Rogge](https://github.com/nielsrogge) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_the_%F0%9F%A4%97_Trainer.ipynb) |
| [Evaluate LUKE on Open Entity, an entity typing dataset](https://github.com/studio-ousia/luke/blob/master/notebooks/huggingface_open_entity.ipynb) | كيفية تقييم نموذج *LukeForEntityClassification* على مجموعة بيانات Open Entity | [Ikuya Yamada](https://github.com/ikuyamada) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/studio-ousia/luke/blob/master/notebooks/huggingface_open_entity.ipynb) |
| [Evaluate LUKE on TACRED, a relation extraction dataset](https://github.com/studio-ousia/luke/blob/master/notebooks/huggingface_tacred.ipynb) | كيفية تقييم نموذج *LukeForEntityPairClassification* على مجموعة بيانات TACRED | [Ikuya Yamada](https://github.com/ikuyamada) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/studio-ousia/luke/blob/master/notebooks/huggingface_tacred.ipynb) |
| [Evaluate LUKE on CoNLL-2003, an important NER benchmark](https://github.com/studio-ousia/luke/blob/master/notebooks/huggingface_conll_2003.ipynb) | كيفية تقييم نموذج *LukeForEntitySpanClassification* على مجموعة بيانات CoNLL-2003 | [Ikuya Yamada](https://github.com/ikuyamada) |[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/studio-ousia/luke/blob/master/notebooks/huggingface_conll_2003.ipynb) |
| [Evaluate BigBird-Pegasus on PubMed dataset](https://github.com/vasudevgupta7/bigbird/blob/main/notebooks/bigbird_pegasus_evaluation.ipynb) | كيفية تقييم نموذج *BigBirdPegasusForConditionalGeneration* على مجموعة بيانات PubMed | [Vasudev Gupta](https://github.com/vasudevgupta7) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vasudevgupta7/bigbird/blob/main/notebooks/bigbird_pegasus_evaluation.ipynb) |
| [Speech Emotion Classification with Wav2Vec2](https://github.com/m3hrdadfi/soxan/blob/main/notebooks/Emotion_recognition_in_Greek_speech_using_Wav2Vec2.ipynb) | كيفية استخدام نموذج Wav2Vec2 المسبق التدريب لتصنيف المشاعر على مجموعة بيانات MEGA | [Mehrdad Farahani](https://github.com/m3hrdadfi) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/m3hrdadfi/soxan/blob/main/notebooks/Emotion_recognition_in_Greek_speech_using_Wav2Vec2.ipynb) |
| [Detect objects in an image with DETR](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/DETR/DETR_minimal_example_(with_DetrFeatureExtractor).ipynb) | كيفية استخدام نموذج *DetrForObjectDetection* المدرب للكشف عن الأجسام في صورة وتصوير الانتباه | [Niels Rogge](https://github.com/NielsRogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/DETR/DETR_minimal_example_(with_DetrFeatureExtractor).ipynb) |
| [Fine-tune DETR on a custom object detection dataset](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/DETR/Fine_tuning_DetrForObjectDetection_on_custom_dataset_(balloon).ipynb) | كيفية ضبط نموذج *DetrForObjectDetection* على مجموعة بيانات الكشف عن الأجسام المخصصة | [Niels Rogge](https://github.com/NielsRogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/DETR/Fine_tuning_DetrForObjectDetection_on_custom_dataset_(balloon).ipynb) |
| [Finetune T5 for Named Entity Recognition](https://github.com/ToluClassics/Notebooks/blob/main/T5_Ner_Finetuning.ipynb) | كيفية ضبط نموذج *T5* على مهمة التعرف على الكيانات المسماة | [Ogundepo Odunayo](https://github.com/ToluClassics) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1obr78FY_cBmWY5ODViCmzdY6O1KB65Vc?usp=sharing) |
| [Fine-Tuning Open-Source LLM using QLoRA with MLflow and PEFT](https://github.com/mlflow/mlflow/blob/master/docs/source/llms/transformers/tutorials/fine-tuning/transformers-peft.ipynb) | كيفية استخدام [QLoRA](https://github.com/artidoro/qlora) و [PEFT](https://huggingface.co/docs/peft/en/index) لضبط نموذج LLM بطريقة فعالة من حيث الذاكرة، مع استخدام [MLflow](https://mlflow.org/docs/latest/llms/transformers/index.html) لإدارة تتبع التجارب | [Yuki Watanabe](https://github.com/B-Step62) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mlflow/mlflow/blob/master/docs/source/llms/transformers/tutorials/fine-tuning/transformers-peft.ipynb) |

View File

@ -0,0 +1,163 @@
# كيفية تعديل أي نموذج من نماذج Transformers
توفر مكتبة [🤗 Transformers](https://github.com/huggingface/transformers) مجموعة من النماذج المسبقة التدريب والأدوات لمعالجة اللغات الطبيعية، والرؤية، وما إلى ذلك. على الرغم من أن هذه النماذج تغطي مجموعة واسعة من التطبيقات، فقد تواجه حالات استخدام لا تدعمها المكتبة بشكل افتراضي. يُمكن للتخصيص أن يفتح إمكانيات جديدة، مثل إضافة طبقات جديدة، أو تعديل البنية المعمارية، أو تحسين آليات الانتباه. سيُوضح لك هذا الدليل كيفية تعديل نماذج Transformers الموجودة لتلبية احتياجاتك المحددة. الشيء الرائع هو أنك لست بحاجة إلى الخروج من إطار عمل Transformers لإجراء هذه التغييرات. ي يمكنك تعديل النماذج مباشرةً في Transformers والاستفادة من الميزات مثل [واجهة برمجة التطبيقات Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer)، و [PreTrainedModel](https://huggingface.co/docs/transformers/main/en/main_classes/model#transformers.PreTrainedModel)، والضبط الدقيق الفعال باستخدام أدوات مثل [PEFT](https://huggingface.co/docs/peft/index).
سنرشدك في هذا الدليل لكيفية تخصيص نماذج Transformers الموجودة لتلبية متطلباتك، دون فقدان مزايا الإطار. ستتعلم كيفية:
- تعديل بنية نموذج ما من خلال تغيير آلية الانتباه الخاصة به.
- تطبيق تقنيات مثل Low-Rank Adaptation (LoRA) على مكونات نموذج محددة.
نحن نشجعك على المساهمة باختراقاتك الخاصة ومشاركتها هنا مع المجتمع1
## مثال: تعديل آلية الانتباه في نموذج Segment Anything (SAM)
نموذج **Segment Anything (SAM)** هو نموذج رائد في مجال تجزئة الصور. في تنفيذه الافتراضي، يستخدم SAM إسقاطًا مجمعًا للاستعلام والمفتاح والقيمة (`qkv`) في آلية الانتباه الخاصة به. ومع ذلك، قد ترغب في ضبط مكونات محددة فقط من آلية الانتباه، مثل إسقاطات الاستعلام (`q`) والقيمة (`v`)، لتقليل عدد المعلمات القابلة للتدريب والموارد الحسابية المطلوبة.
### الدافع
من خلال تقسيم الإسقاط المجمع `qkv` إلى إسقاطات منفصلة `q` و `k` و `v`، يمكنك تطبيق تقنيات مثل **LoRA** (Low-Rank Adaptation) على إسقاطي `q` و `v` فقط. يسمح لك هذا بما يلي:
- ضبط عدد أقل من المعلمات، مما يقلل من العبء الحسابي.
- تحقيق أداء أفضل من خلال التركيز على مكونات محددة.
- تجربة استراتيجيات تعديل مختلفة في آلية الانتباه.
### التنفيذ
#### **الخطوة 1: إنشاء فئة اهتمام مخصصة**
بعد ذلك، قم بإنشاء فئة فرعية من فئة `SamVisionAttention` الأصلية وعدلها لتضم إسقاطات `q` و `k` و `v` منفصلة.
```python
import torch
import torch.nn as nn
from transformers.models.sam.modeling_sam import SamVisionAttention
class SamVisionAttentionSplit(SamVisionAttention, nn.Module):
def __init__(self, config, window_size):
super().__init__(config, window_size)
del self.qkv
# إسقاطات منفصلة q و k و v
self.q = nn.Linear(config.hidden_size, config.hidden_size, bias=config.qkv_bias)
self.k = nn.Linear(config.hidden_size, config.hidden_size, bias=config.qkv_bias)
self.v = nn.Linear(config.hidden_size, config.hidden_size, bias=config.qkv_bias)
self._register_load_state_dict_pre_hook(self.split_q_k_v_load_hook)
def split_q_k_v_load_hook(self, state_dict, prefix, *args):
keys_to_delete = []
for key in list(state_dict.keys()):
if "qkv." in key:
# تقسيم q و k و v من الإسقاط المجمع
q, k, v = state_dict[key].chunk(3, dim=0)
# استبدال الإسقاطات الفردية q و k و v
state_dict[key.replace("qkv.", "q.")] = q
state_dict[key.replace("qkv.", "k.")] = k
state_dict[key.replace("qkv.", "v.")] = v
# وضع علامة على مفتاح qkv القديم للحذف
keys_to_delete.append(key)
# حذف مفاتيح qkv القديمة
for key in keys_to_delete:
del state_dict[key]
def forward(self, hidden_states: torch.Tensor, output_attentions=False) -> torch.Tensor:
batch_size, height, width, _ = hidden_states.shape
qkv_shapes = (batch_size * self.num_attention_heads, height * width, -1)
query = self.q(hidden_states).reshape((batch_size, height * width,self.num_attention_heads, -1)).permute(0,2,1,3).reshape(qkv_shapes)
key = self.k(hidden_states).reshape((batch_size, height * width,self.num_attention_heads, -1)).permute(0,2,1,3).reshape(qkv_shapes)
value = self.v(hidden_states).reshape((batch_size, height * width,self.num_attention_heads, -1)).permute(0,2,1,3).reshape(qkv_shapes)
attn_weights = (query * self.scale) @ key.transpose(-2, -1)
if self.use_rel_pos:
attn_weights = self.add_decomposed_rel_pos(
attn_weights, query, self.rel_pos_h, self.rel_pos_w, (height, width), (height, width)
)
attn_weights = torch.nn.functional.softmax(attn_weights, dtype=torch.float32, dim=-1).to(query.dtype)
attn_probs = nn.functional.dropout(attn_weights, p=self.dropout, training=self.training)
attn_output = (attn_probs @ value).reshape(batch_size, self.num_attention_heads, height, width, -1)
attn_output = attn_output.permute(0, 2, 3, 1, 4).reshape(batch_size, height, width, -1)
attn_output = self.proj(attn_output)
if output_attentions:
outputs = (attn_output, attn_weights)
else:
outputs = (attn_output, None)
return outputs
```
**الشرح:**
- **الإسقاطات المنفصلة:** يتم إزالة الإسقاط المُجمع `qkv`، وإنشاء إسقاطات خطية منفصلة `q` و `k` و `v`.
- **دالة استدعاء تحميل الأوزان:** تقوم طريقة `_split_qkv_load_hook` بتقسيم أوزان `qkv` المسبقة التدريب إلى أوزان `q` و `k` و `v` منفصلة عند تحميل النموذج. يضمن هذا التوافق مع أي نموذج مسبق التدريب.
- **التنفيذ الأمامي:** يتم حساب الاستعلامات والمفاتيح والقيم بشكل منفصل، وتستمر آلية الانتباه كالمعتاد.
#### **الخطوة 2: استبدال فئة الانتباه الأصلية**
استبدل فئة `SamVisionAttention` الأصلية بفئتك المخصصة بحيث يستخدم النموذج آلية الانتباه المعدلة.
```python
from transformers import SamModel
from transformers.models.sam import modeling_sam
# استبدال فئة الاهتمام في وحدة نمطية modeling_sam
modeling_sam.SamVisionAttention = SamVisionAttentionSplit
# تحميل نموذج SAM المسبق التدريب
model = SamModel.from_pretrained("facebook/sam-vit-base")
```
**الشرح:**
- **استبدال الفئة:** من خلال تعيين فئتك المخصصة إلى `modeling_sam.SamVisionAttention`، فإن أي حالات من فئة `SamVisionAttention` في النموذج ستستخدم النسخة المعدلة. وبالتالي، عند استدعاء `SamModel`، سيتم استخدام `SamVisionAttentionSplit` المحددة حديثًا.
- **تحميل النموذج:** يتم تحميل النموذج باستخدام `from_pretrained`، ويتم دمج آلية الانتباه المخصصة.
#### **الخطوة 3: تطبيق LoRA على إسقاطات محددة**
مع وجود إسقاطات `q` و `k` و `v` منفصلة، يمكنك الآن تطبيق LoRA على مكونات محددة، مثل إسقاطات `q` و `v`.
```python
from peft import LoraConfig, get_peft_model
config = LoraConfig(
r=16,
lora_alpha=32,
target_modules=["q", "v"], # تطبيق LoRA على إسقاطات q و v
lora_dropout=0.1,
task_type="mask-generation"
)
# تطبيق LoRA على النموذج
model = get_peft_model(model, config)
```
**الشرح:**
- **تكوين LoRA:** تحدد `LoraConfig` المرتبة `r`، وعامل القياس `lora_alpha`، والوحدات المستهدفة (`"q"` و `"v"`)، ومعدل التخلي، ونوع المهمة.
- **تطبيق LoRA:** تقوم دالة `get_peft_model` بتطبيق LoRA على الوحدات المحددة في النموذج.
- **تقليل المعلمات:** من خلال التركيز على `q` و `v`، فإنك تقلل عدد المعلمات القابلة للتدريب، مما يؤدي إلى تسريع التدريب وتقليل استخدام الذاكرة.
#### **الخطوة 4: التحقق من عدد المعلمات القابلة للتدريب**
من السهل التحقق من عدد المعلمات القابلة للتدريب ومعرفة تأثير تعديلك.
```python
model.print_trainable_parameters()
```
**الناتج المتوقع:**
```
عدد المعلمات القابلة للتدريب: 608,256 || جميع المعلمات: 94,343,728 || نسبة المعلمات القابلة للتدريب: 0.6447
عدد المعلمات القابلة للتدريب: 912,384 || جميع المعلمات: 94,647,856 || نسبة المعلمات القابلة للتدريب: 0.9640 # مع k
```
## المساهمة بابداعاتك الخاصة
يمكن لتعديل النماذج المسبقة التدريب أن يفتح آفاقًا جديدة للبحث والتطبيق. من خلال فهم وتعديل الآليات الداخلية للنماذج مثل SAM، يمكنك تخصيصها لتلبية احتياجاتك المحددة، وتحسين الأداء، وتجربة أفكار جديدة.
إذا قمت بتطوير تعديﻻتك الخاصة لنماذج Transformers وترغب في مشاركتها، ففكر في المساهمة في هذه الوثيقة.
- **إنشاء طلب سحب (Pull Request):** شارك تغييراتك وتحسيناتك في التعليمات البرمجية مباشرة في المستودع.
- **كتابة التوثيق:** قدم تفسيرات وأمثلة واضحة لتعديلاتك.
- **التفاعل مع المجتمع:** ناقش أفكارك واحصل على تعليقات من المطورين والباحثين الآخرين من خلال فتح مشكلة.

View File

@ -144,7 +144,7 @@ conda install conda-forge::transformers
تُحمّل النماذج المُسبقة التدريب وتُخزّن مؤقتًا في: `~/.cache/huggingface/hub`. هذا هو المجلد الافتراضي الذي يُحدده متغير البيئة `TRANSFORMERS_CACHE`. على Windows، يكون دليل ذاكرة التخزين المؤقت الافتراضي هو `C:\Users\username\.cache\huggingface\hub`. يمكنك تغيير متغيرات البيئة shell الموضحة أدناه - حسب الأولوية - لتحديد دليل ذاكرة تخزين مؤقت مختلف:
1. متغير البيئة (افتراضي): `HUGGINGFACE_HUB_CACHE` أو `TRANSFORMERS_CACHE`.
1. متغير البيئة (افتراضي): `HF_HUB_CACHE` أو `TRANSFORMERS_CACHE`.
2. متغير البيئة: `HF_HOME`.
3. متغير البيئة: `XDG_CACHE_HOME` + `/huggingface`.

View File

@ -0,0 +1,184 @@
# المحولات النمطية
مكتبة `transformers` هي إطار عمل ذو فلسفة محدد؛ يتم تعريف فلسفتنا في [الدليل المفاهيمي](./philosophy).
جوهر هذه الفلسفة يتمثل في مبدأ [نموذج واحد، ملف واحد](https://huggingface.co/blog/transformers-design-philosophy)
في المكتبة. الجانب السلبي لهذا المكون هو تقييده لوراثة واستيراد مكونات الملفات.
نتيجة لذلك، تتكرر مكونات النموذج عبر العديد من الملفات. يحتوي `transformers` على عدد كبير من طبقات الانتباه، يقارب عدد النماذج، والكثير منها متطابق. يتسبب هذا في تباعد عمليات التنفيذ المستقلة مع تطبيق الإصلاحات والتغييرات.
على أجزاء محددة من التعليمات البرمجية.
ولمعالجة ذلك، اعتمدنا مفهوم "النسخ" في المكتبة. فبإضافة تعليق يُشير إلى أن التعليمات البرمجية هي نسخة من أخرى، نضمن من خلال أنظمة CI والأوامر المحلية عدم تباعد النسخ. لكن هذه العملية، رغم بساطتها، تُسبب إرهاقاً. كما أنها تزيد العبء على المساهمين، وهو ما نهدف إلى تجاوزه.
غالباً ما تتطلب مساهمات النماذج إضافة تعليمات برمجية (حوالي 1000 سطر)، ومعالج (حوالي 500 سطر)، واختبارات، ووثائق، إلخ. ونادراً ما تقل مساهمات النماذج عن 3000-5000 سطر من التعليمات البرمجية، معظمها أكواد نمطية. هذا يرفع مستوى المساهمات،
ونهدف مع المحولات النمطية إلى خفض هذا المستوى إلى حدّ مقبول.
## ما هو؟
تقدم المحولات النمطية مفهوم ملف "نمطي" لمجلد نموذج. يقبل هذا الملف النمطي تعليمات برمجية
غير مقبولة عادة في ملفات النمذجة/المعالجة، حيث يسمح بالاستيراد من نماذج مجاورة وكذلك
الوراثة من الفئات إلى فئات أخرى.
يعرّف هذا الملف النمطي النماذج والمعالجات وفئة التكوين التي سيتم تعريفها في وحداتهم
المتعلقة.
وأخيرًا، يقدم هذا الميزة أداة `linter` جديدة والتي ستعمل على "تفكيك" الملف النمطي إلى بنية "نموذج واحد، ملف واحد"
هيكل الدليل. سيتم إنشاء هذه الملفات تلقائيًا في كل مرة يتم فيها تشغيل البرنامج النصي؛ مما يقلل من المساهمات المطلوبة
إلى الملف النمطي، وبالتالي فقط إلى التغييرات بين النموذج المساهم والنماذج الأخرى.
سيقوم مستخدمو النموذج في النهاية باستيراد واستخدام واجهة الملف الواحد، لذا لا يتوقع حدوث أي تغيير هنا. من خلال القيام بذلك،
نأمل في الجمع بين أفضل ما في العالمين: تمكين المساهمات البسيطة مع الالتزام بفلسفتنا.
لذلك، هذا بديل لعلامات `# Copied from`، ويمكن توقع انتقال النماذج المساهمة سابقًا إلى
تنسيق المحولات النمطية الجديد في الأشهر المقبلة.
### التفاصيل
تُبسط أداة "linter" الوراثة، مُنشئةً جميع الملفات المفردة من الملف النمطي، مع الحفاظ على شفافيتها أمام مستخدمي Python. حاليًا، تُبسط الأداة مستوىً واحدًا من الوراثة
على سبيل المثال:
- إذا ورثت فئة التكوين من فئة أخرى وأضافت/حذفت معامل، فسيتم إما الإشارة إلى الملف المولد مباشرةً
(في حالة الإضافة) أو إزالته تمامًا (في حالة الحذف).
- إذا ورثت فئة من فئة أخرى، على سبيل المثال: `class GemmaModel(LlamaModel):`، تُستنتج التبعيات تلقائيًا
سيتم استنتاج جميع الوحدات الفرعية تلقائيًا من الفئة الأصلية.
- إذا قمت بتعريف وظائف جديدة في الملف `modular` واستخدمتها داخل الفئات، فستستنتج أداة linter ذلك تلقائيًا
يجب أن تكون قادرًا على كتابة كل شيء (المجزىء اللغوي، ومُعالِج الصور، والنموذج، والتكوين) في الملف `modular`، وسيتم إنشاء الملفات المُقابلة تلقائيًا.
### التطبيق
[TODO] نقدم اختبارًا جديدًا، للتأكد من أن المحتوى المولد يتطابق مع ما هو موجود في `modular_xxxx.py`
### الأمثلة
هنا مثال سريع باستخدام BERT و RoBERTa. النموذجان مرتبطان ارتباطًا وثيقًا: يختلف تنفيذهما النموذجي في طبقة تضمين.
بدلاً من إعادة تعريف النموذج بالكامل، إليك كيف يبدو ملف `modular_roberta.py` لفئات النمذجة والتكوين (لأغراض المثال، يتم تجاهل المجزىء اللغوي في هذا الوقت حيث أنه مختلف جدًا).
```python
from torch import nn
from ..bert.configuration_bert import BertConfig
from ..bert.modeling_bert import (
BertModel,
BertEmbeddings,
BertForMaskedLM
)
# تكوين RoBERTa مطابق لتكوين BERT
class RobertaConfig(BertConfig):
model_type = 'roberta'
# نعيد تعريف الإضافات هنا لتسليط الضوء على اختلاف معرف الحشو، ونعيد تعريف الإضافات الموضعية
class RobertaEmbeddings(BertEmbeddings):
def __init__(self, config):
super().__init__(config())
self.padding_idx = config.pad_token_id
self.position_embeddings = nn.Embedding(
config.max_position_embeddings, config.hidden_size, padding_idx=self.padding_idx
)
# نموذج RoBERTa مطابق لنموذج BERT، باستثناء طبقة الإضافات.
# نعيد تعريف الإضافات أعلاه، لذا هنا لا توجد حاجة لعمل إضافي
class RobertaModel(BertModel):
def __init__(self, config):
super().__init__(config)
self.embeddings = RobertaEmbeddings(config)
# الرؤوس الآن تحتاج فقط إلى إعادة تعريف النموذج داخل `RobertaModel` الصحيح
class RobertaForMaskedLM(BertForMaskedLM):
def __init__(self, config):
super().__init__(config)
self.model = RobertaModel(config)
```
لاحظ أنه إذا لم تستخدم الاعتماد الذي حددته، فستحصل على الخطأ التالي:
```bash
ValueError: You defined `RobertaEmbeddings` in the modular_roberta.py, it should be used
when you define `BertModel`, as it is one of it's direct dependencies. Make sure
you use it in the `__init__` function.
```
بالإضافة إلى ذلك، قد تجد قائمة بالأمثلة هنا:
## ما هو ليس كذلك
ليس بديلاً لتعليمات برمجة النمذجة (بعد؟)، وإذا لم يكن نموذجك يعتمد على أي شيء آخر موجود من قبل، فيمكنك إضافة ملف `نمذجة` كالعادة.
## الاستخدام المتقدم
### إزالة السمات والوظائف
لإزالة السمات التي لا تستخدم في نموذجك النمطي، والتي لا تريد رؤيتها في النمذجة المفككة:
```python
class GemmaModel(LlamaModel): | class GemmaModel(PreTrainedModel):
def __init__(self, config): | def __init__(self, config):
super().__init__(self, eos_token) | super().__init__(config)
del self.embed_tokens | self.padding_idx = config.pad_token_id
| self.vocab_size = config.vocab_size
|
| self.layers = nn.ModuleList(
| [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
| )
| self.norm = LlamaRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
| self.rotary_emb = LlamaRotaryEmbedding(config=config)
| self.gradient_checkpointing = False
|
| # Initialize weights and apply final processing
| self.post_init()
```
إذا قمت بالتحقق من `LlamaModel` الأصلي، فستجد `embed_tokens` الذي تمت إزالته هنا (كما هو متوقع!)
إزالة وظيفة مشابهة، تحتاج فقط إلى كتابتها مع `raise ValueError("")` لمحاكاة السلوك الذي تريده فعليًا عند إزالة وظيفة أصلية في بايثون.
```python
class GemmaTokenizer(LlamaTokenizer):
...
def get_spm_processor(self):
raise AttributeError("Not needed for Gemma")
def unk_token_length(self):
raise AttributeError("Not needed for Gemma")
```
### تعريف وظائف جديدة
إذا قمت بتعريف وظيفة جديدة في الملف `modular` لاستخدامها داخل فئة، على سبيل المثال
```python
def my_new_function(*args, **kwargs):
# Do something here
pass
class GemmaModel(LlamaModel):
def forward(*args, **kwargs):
# Call the function
example = my_new_function(*args, **kwargs)
# continue here
```
سيتم نسخ وظيفة `my_new_function` (وبشكل متكرر، أي وظائف أخرى جديدة يتم استدعاؤها في جسمها) تلقائيًا
في الملف الذي يتم استخدامه.
### استدعاء `super()`
قمنا مؤخرًا بشحن بعض الميزات التي تسمح لك بالانتقال من:
```python
class GemmaTokenizer(LlamaTokenizer, PretrainedTokenizerFast): | class GemmaModel(nn.Module):
def __init__(self, eos_token="</s>"): | def __init__(self):
eos_token = AddedToken(eos_token) | eos_token = AddedToken(eos_token)
PretrainedTokenizerFast.__init__(self, eos_token) | super().__init__(eos_token)
```
هذا مفيد عندما لا تريد تفكيك استدعاء `super()`، وتريد التمييز بين أي استدعاء super init تقوم به!
### التسمية الخاصة
ندعم الآن أيضًا حالات خاصة مثل
```python
class GemmaVisionModel(CLIPModel):
pass
```
حيث اسم فئة `GemmaVision` الخاصة بك ليس هو نفسه `Gemma` النمطي. هذا مفيد للغاية للنماذج المركبة.

140
docs/source/ar/notebooks.md Normal file
View File

@ -0,0 +1,140 @@
# دفاتر ملاحظات 🤗 Transformers
يمكنك أن تجد هنا قائمة بدفاتر الملاحظات الرسمية التي تقدمها Hugging Face.
كما نود أن ندرج هنا محتوى مثيرًا للاهتمام تم إنشاؤه بواسطة المجتمع.
إذا كتبت دفتر ملاحظات يستفيد من 🤗 Transformers وتود إدراجه هنا، فيُرجى فتح طلب سحب حتى يمكن تضمينه ضمن دفاتر ملاحظات المجتمع.
## دفاتر ملاحظات Hugging Face 🤗
### دفاتر ملاحظات التوثيق
يمكنك فتح أي صفحة من صفحات التوثيق كدفتر ملاحظات في Colab (يوجد زر مباشرة على تلك الصفحات) ولكنها مدرجة هنا أيضًا إذا كنت بحاجة إليها:
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [جولة سريعة في المكتبة](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/quicktour.ipynb) | عرض لمختلف واجهات برمجة التطبيقات في Transformers |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/quicktour.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/en/transformers_doc/quicktour.ipynb)|
| [ملخص المهام](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/task_summary.ipynb) | كيفية تشغيل نماذج مكتبة Transformers مهمة تلو الأخرى |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/task_summary.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/task_summary.ipynb)|
| [معالجة البيانات مسبقًا](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/preprocessing.ipynb) | كيفية استخدام محلل لغوي لمعالجة بياناتك مسبقًا |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/preprocessing.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/preprocessing.ipynb)|
| [الضبط الدقيق لنموذج مُدرَّب مسبقًا](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/training.ipynb) | كيفية استخدام المدرب لضبط نموذج مُدرَّب مسبقًا بدقة |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/training.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/training.ipynb)|
| [ملخص للمحللات اللغوية](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/tokenizer_summary.ipynb) | الاختلافات بين خوارزمية المحلل اللغوي |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/tokenizer_summary.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/tokenizer_summary.ipynb)|
| [النماذج متعددة اللغات](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/multilingual.ipynb) | كيفية استخدام النماذج متعددة اللغات للمكتبة |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/multilingual.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/multilingual.ipynb)|
### أمثلة PyTorch
#### معالجة اللغة الطبيعية[[pytorch-nlp]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [تدريب محللك اللغوي](https://github.com/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb) | كيفية تدريب واستخدام محللك اللغوي الخاص بك |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)|
| [تدريب نموذج لغتك](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch.ipynb) | كيفية البدء بسهولة في استخدام المحولات |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف النص](https://github.com/huggingface/notebooks/blob/main/examples/text_classification.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على أي مهمة GLUE. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)|
| [كيفية ضبط نموذج بدقة على النمذجة اللغوية](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على مهمة LM سببية أو مقنعة. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف الرموز المميزة](https://github.com/huggingface/notebooks/blob/main/examples/token_classification.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على مهمة تصنيف الرموز المميزة (NER، PoS). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb)|
| [كيفية ضبط نموذج بدقة على الإجابة على الأسئلة](https://github.com/huggingface/notebooks/blob/main/examples/question_answering.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على SQUAD. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb)|
| [كيفية ضبط نموذج بدقة على الاختيار من متعدد](https://github.com/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على SWAG. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)|
| [كيفية ضبط نموذج بدقة على الترجمة](https://github.com/huggingface/notebooks/blob/main/examples/translation.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على WMT. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/translation.ipynb)|
| [كيفية ضبط نموذج بدقة على التلخيص](https://github.com/huggingface/notebooks/blob/main/examples/summarization.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على XSUM. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/summarization.ipynb)|
| [كيفية تدريب نموذج لغة من البداية](https://github.com/huggingface/blog/blob/main/notebooks/01_how_to_train.ipynb)| تسليط الضوء على جميع الخطوات لتدريب نموذج Transformer بشكل فعال على بيانات مخصصة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/01_how_to_train.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/blog/blob/main/notebooks/01_how_to_train.ipynb)|
| [كيفية إنشاء نص](https://github.com/huggingface/blog/blob/main/notebooks/02_how_to_generate.ipynb)| كيفية استخدام أساليب فك التشفير المختلفة لإنشاء اللغة باستخدام المحولات | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/02_how_to_generate.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/blog/blob/main/notebooks/02_how_to_generate.ipynb)|
| [كيفية إنشاء نص (مع قيود)](https://github.com/huggingface/blog/blob/main/notebooks/53_constrained_beam_search.ipynb)| كيفية توجيه إنشاء اللغة باستخدام القيود التي يوفرها المستخدم | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/53_constrained_beam_search.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/blog/blob/main/notebooks/53_constrained_beam_search.ipynb)|
| [Reformer](https://github.com/huggingface/blog/blob/main/notebooks/03_reformer.ipynb)| كيف يدفع Reformer حدود النمذجة اللغوية | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/blog/blob/main/notebooks/03_reformer.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/patrickvonplaten/blog/blob/main/notebooks/03_reformer.ipynb)|
#### رؤية الكمبيوتر[[pytorch-cv]]
| دفتر الملاحظات | الوصف | | |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------:|
| [كيفية ضبط نموذج بدقة على تصنيف الصور (Torchvision)](https://github.com/huggingface/notebooks/blob/main/examples/image_classification.ipynb) | يوضح كيفية معالجة البيانات مسبقًا باستخدام Torchvision وضبط أي نموذج رؤية مُدرَّب مسبقًا بدقة على تصنيف الصور | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف الصور (Albumentations)](https://github.com/huggingface/notebooks/blob/main/examples/image_classification_albumentations.ipynb) | يوضح كيفية معالجة البيانات مسبقًا باستخدام Albumentations وضبط أي نموذج رؤية مُدرَّب مسبقًا بدقة على تصنيف الصور | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification_albumentations.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification_albumentations.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف الصور (Kornia)](https://github.com/huggingface/notebooks/blob/main/examples/image_classification_kornia.ipynb) | يوضح كيفية معالجة البيانات مسبقًا باستخدام Kornia وضبط أي نموذج رؤية مُدرَّب مسبقًا بدقة على تصنيف الصور | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification_kornia.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification_kornia.ipynb)|
| [كيفية إجراء الكشف عن الأشياء بدون لقطات مع OWL-ViT](https://github.com/huggingface/notebooks/blob/main/examples/zeroshot_object_detection_with_owlvit.ipynb) | يوضح كيفية إجراء الكشف عن الأشياء بدون لقطات على الصور باستخدام استعلامات نصية | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/zeroshot_object_detection_with_owlvit.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/zeroshot_object_detection_with_owlvit.ipynb)|
| [كيفية ضبط نموذج وصف الصور بدقة](https://github.com/huggingface/notebooks/blob/main/examples/image_captioning_blip.ipynb) | يوضح كيفية ضبط BLIP بدقة لوصف الصور على مجموعة بيانات مخصصة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_captioning_blip.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_captioning_blip.ipynb)|
| [كيفية بناء نظام تشابه الصور مع Transformers](https://github.com/huggingface/notebooks/blob/main/examples/image_similarity.ipynb) | يوضح كيفية بناء نظام تشابه الصور | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_similarity.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_similarity.ipynb)|
| [كيفية ضبط نموذج SegFormer بدقة على التجزئة الدلالية](https://github.com/huggingface/notebooks/blob/main/examples/semantic_segmentation.ipynb) | يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج SegFormer مُدرَّب مسبقًا بدقة على التجزئة الدلالية | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/semantic_segmentation.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/semantic_segmentation.ipynb)|
| [كيفية ضبط نموذج VideoMAE بدقة على تصنيف الفيديو](https://github.com/huggingface/notebooks/blob/main/examples/video_classification.ipynb) | يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج VideoMAE مُدرَّب مسبقًا بدقة على تصنيف الفيديو | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/video_classification.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/video_classification.ipynb)|
#### الصوت[[pytorch-audio]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية ضبط نموذج التعرف على الكلام باللغة الإنجليزية بدقة](https://github.com/huggingface/notebooks/blob/main/examples/speech_recognition.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج كلام مُدرَّب مسبقًا بدقة على TIMIT | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/speech_recognition.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/speech_recognition.ipynb)|
| [كيفية ضبط نموذج التعرف على الكلام بأي لغة بدقة](https://github.com/huggingface/notebooks/blob/main/examples/multi_lingual_speech_recognition.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج كلام مُدرَّب مسبقًا متعدد اللغات بدقة على Common Voice | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multi_lingual_speech_recognition.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/multi_lingual_speech_recognition.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف الصوت](https://github.com/huggingface/notebooks/blob/main/examples/audio_classification.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج كلام مُدرَّب مسبقًا بدقة على Keyword Spotting | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/audio_classification.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/audio_classification.ipynb)|
#### التسلسلات البيولوجية[[pytorch-bio]]
| دفتر الملاحظات | الوصف | | |
|:----------|:----------------------------------------------------------------------------------------|:-------------|------:|
| [كيفية ضبط نموذج بروتين مُدرَّب مسبقًا بدقة](https://github.com/huggingface/notebooks/blob/main/examples/protein_language_modeling.ipynb) | شاهد كيفية ترميز البروتينات وضبط نموذج "لغة" بروتين مُدرَّب مسبقًا كبير بدقة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_language_modeling.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/protein_language_modeling.ipynb) |
| [كيفية إنشاء طيات بروتينية](https://github.com/huggingface/notebooks/blob/main/examples/protein_folding.ipynb) | شاهد كيفية الانتقال من تسلسل البروتين إلى نموذج بروتين كامل وملف PDB | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_folding.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/protein_folding.ipynb) |
| [كيفية ضبط نموذج محول النيوكليوتيدات بدقة](https://github.com/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling.ipynb) | شاهد كيفية ترميز الحمض النووي وضبط نموذج "لغة" الحمض النووي مُدرَّب مسبقًا كبير بدقة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling.ipynb) |
| [ضبط نموذج محول النيوكليوتيدات بدقة باستخدام LoRA](https://github.com/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling_with_peft.ipynb) | تدريب نماذج DNA أكبر بكثير بطريقة فعالة من حيث الذاكرة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling_with_peft.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling_with_peft.ipynb) |
#### طرائق أخرى[[pytorch-other]]
| دفتر الملاحظات | الوصف | | |
|:----------|:----------------------------------------------------------------------------------------|:-------------|------:|
| [التنبؤ الاحتمالي بالسلاسل الزمنية](https://github.com/huggingface/notebooks/blob/main/examples/time-series-transformers.ipynb) | شاهد كيفية تدريب Time Series Transformer على مجموعة بيانات مخصصة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/time-series-transformers.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/time-series-transformers.ipynb) |
#### دفاتر ملاحظات الأدوات المساعدة [[pytorch-utility]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية تصدير النموذج إلى ONNX](https://github.com/huggingface/notebooks/blob/main/examples/onnx-export.ipynb)| تسليط الضوء على كيفية التصدير وتشغيل أعباء عمل الاستدلال من خلال ONNX | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/onnx-export.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/onnx-export.ipynb)|
| [كيفية استخدام المعايير](https://github.com/huggingface/notebooks/blob/main/examples/benchmark.ipynb)| كيفية قياس أداء النماذج باستخدام المحولات | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/benchmark.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/benchmark.ipynb)|
### أمثلة TensorFlow
#### معالجة اللغة الطبيعية[[tensorflow-nlp]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [تدريب محللك اللغوي](https://github.com/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb) | كيفية تدريب واستخدام محللك اللغوي الخاص بك |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)|
| [تدريب نموذج لغتك](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch-tf.ipynb) | كيفية البدء بسهولة في استخدام المحولات |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch-tf.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف النص](https://github.com/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على أي مهمة GLUE. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb)|
| [كيفية ضبط نموذج بدقة على النمذجة اللغوية](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على مهمة LM سببية أو مقنعة. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف الرموز المميزة](https://github.com/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على مهمة تصنيف الرموز المميزة (NER، PoS). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb)|
| [كيفية ضبط نموذج بدقة على الإجابة على الأسئلة](https://github.com/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على SQUAD. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb)|
| [كيفية ضبط نموذج بدقة على الاختيار من متعدد](https://github.com/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على SWAG. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb)|
| [كيفية ضبط نموذج بدقة على الترجمة](https://github.com/huggingface/notebooks/blob/main/examples/translation-tf.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على WMT. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb)|
| [كيفية ضبط نموذج بدقة على التلخيص](https://github.com/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج مُدرَّب مسبقًا بدقة على XSUM. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb)|
#### رؤية الكمبيوتر[[tensorflow-cv]]
| دفتر الملاحظات | الوصف | | |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------|:-------------|------:|
| [كيفية ضبط نموذج بدقة على تصنيف الصور](https://github.com/huggingface/notebooks/blob/main/examples/image_classification-tf.ipynb) | يوضح كيفية معالجة البيانات مسبقًا وضبط أي نموذج رؤية مُدرَّب مسبقًا بدقة على تصنيف الصور | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification-tf.ipynb)|
| [كيفية ضبط نموذج SegFormer بدقة على التجزئة الدلالية](https://github.com/huggingface/notebooks/blob/main/examples/semantic_segmentation-tf.ipynb) | يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج SegFormer مُدرَّب مسبقًا بدقة على التجزئة الدلالية | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/semantic_segmentation-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/semantic_segmentation-tf.ipynb)|
#### التسلسلات البيولوجية[[tensorflow-bio]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية ضبط نموذج بروتين مُدرَّب مسبقًا بدقة](https://github.com/huggingface/notebooks/blob/main/examples/protein_language_modeling-tf.ipynb) | شاهد كيفية ترميز البروتينات وضبط نموذج "لغة" بروتين مُدرَّب مسبقًا كبير بدقة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_language_modeling-tf.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/protein_language_modeling-tf.ipynb) |
#### دفاتر ملاحظات الأدوات المساعدة [[tensorflow-utility]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية تدريب نماذج TF/Keras على TPU](https://github.com/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) | شاهد كيفية التدريب بسرعة عالية على أجهزة TPU من Google | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) |
### دفاتر ملاحظات Optimum
🤗 [Optimum](https://github.com/huggingface/optimum) هو امتداد لـ 🤗 Transformers، يوفر مجموعة من أدوات تحسين الأداء التي تمكن من تحقيق أقصى قدر من الكفاءة لتدريب وتشغيل النماذج على الأجهزة المستهدفة.
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية تكميم نموذج باستخدام ONNX Runtime لتصنيف النص](https://github.com/huggingface/notebooks/blob/main/examples/text_classification_quantization_ort.ipynb)| يوضح كيفية تطبيق التكميم الثابت والديناميكي على نموذج باستخدام [ONNX Runtime](https://github.com/microsoft/onnxruntime) لأي مهمة GLUE. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification_quantization_ort.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification_quantization_ort.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف النص باستخدام ONNX Runtime](https://github.com/huggingface/notebooks/blob/main/examples/text_classification_ort.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج بدقة على أي مهمة GLUE باستخدام [ONNX Runtime](https://github.com/microsoft/onnxruntime). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification_ort.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification_ort.ipynb)|
| [كيفية ضبط نموذج بدقة على التلخيص باستخدام ONNX Runtime](https://github.com/huggingface/notebooks/blob/main/examples/summarization_ort.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج بدقة على XSUM باستخدام [ONNX Runtime](https://github.com/microsoft/onnxruntime). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization_ort.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/summarization_ort.ipynb)|
## دفاتر ملاحظات المجتمع:
تتوفر المزيد من دفاتر الملاحظات التي طورها المجتمع [هنا](https://hf.co/docs/transformers/community#community-notebooks).

View File

@ -347,8 +347,8 @@ tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
```py
>>> from transformers import AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
```
</pt>
<tf>
@ -356,8 +356,8 @@ tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
```py
>>> from transformers import TFAutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
```
</tf>
</frameworkcontent>

View File

@ -2,7 +2,7 @@
بالإضافة إلى دفاتر الملاحظات [notebooks](./notebooks) الخاصة بـ 🤗 Transformers، هناك أيضًا نصوص برمجية توضيحية تُظهر كيفية تدريب نموذج لمهمة باستخدام [PyTorch](https://github.com/huggingface/transformers/tree/main/examples/pytorch) أو [TensorFlow](https://github.com/huggingface/transformers/tree/main/examples/tensorflow) أو [JAX/Flax](https://github.com/huggingface/transformers/tree/main/examples/flax).
كما ستجد النصوص البرمجية التي استخدمناها في [مشاريع الأبحاث](https://github.com/huggingface/transformers/tree/main/examples/research_projects) و [الأمثلة القديمة](https://github.com/huggingface/transformers/tree/main/examples/legacy) والتي ساهم بها المجتمع بشكل أساسي. هذه النصوص البرمجية غير مدعومة بشكل نشط وقد تتطلب إصدارًا محددًا من مكتبة 🤗 Transformers والذي من المحتمل أن يكون غير متوافق مع الإصدار الأحدث من المكتبة.
كما ستجد النصوص البرمجية التي استخدمناها في [مشاريع الأبحاث](https://github.com/huggingface/transformers-research-projects/) و [الأمثلة القديمة](https://github.com/huggingface/transformers/tree/main/examples/legacy) والتي ساهم بها المجتمع بشكل أساسي. هذه النصوص البرمجية غير مدعومة بشكل نشط وقد تتطلب إصدارًا محددًا من مكتبة 🤗 Transformers والذي من المحتمل أن يكون غير متوافق مع الإصدار الأحدث من المكتبة.
لا يُتوقع أن تعمل النصوص البرمجية التوضيحية بشكل مباشر على كل مشكلة، وقد تحتاج إلى تكييف النص البرمجي مع المشكلة التي تحاول حلها. ولمساعدتك في ذلك، تعرض معظم النصوص البرمجية كيفية معالجة البيانات قبل التدريب بشكل كامل، مما يتيح لك تحريرها حسب الحاجة لحالتك الاستخدام.

View File

@ -116,11 +116,11 @@ optimum-cli export onnx --model keras-io/transformers-qa distilbert_base_cased_s
<Tip warning={true}>
لم يعد يتم دعم `tranformers.onnx` يُرجى تصدير النماذج باستخدام 🤗 Optimum كما هو موضح أعلاه. سيتم إزالة هذا القسم في الإصدارات القادمة.
لم يعد يتم دعم `transformers.onnx` يُرجى تصدير النماذج باستخدام 🤗 Optimum كما هو موضح أعلاه. سيتم إزالة هذا القسم في الإصدارات القادمة.
</Tip>
لتصدير نموذج 🤗 Transformers إلى ONNX باستخدام `tranformers.onnx`، ثبّت التبعيات الإضافية:
لتصدير نموذج 🤗 Transformers إلى ONNX باستخدام `transformers.onnx`، ثبّت التبعيات الإضافية:
```bash
pip install transformers[onnx]

View File

@ -0,0 +1,422 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# نمذجة اللغة السببية (Causal language modeling)
[[open-in-colab]]
هناك نوعان من نمذجة اللغة، السببية والمقنعة. يوضح هذا الدليل نمذجة اللغة السببية.
تُستخدم نماذج اللغة السببية غالبًا لتوليد النص. يمكنك استخدام هذه النماذج للتطبيقات الإبداعية مثل
اختيار مغامرة النص الخاصة بك أو مساعد ترميز ذكي مثل Copilot أو CodeParrot.
<Youtube id="Vpjb1lu0MDk"/>
تتنبأ نمذجة اللغة السببية بالرمز التالي في تسلسل من الرموز، ولا يمكن للنموذج سوى الاهتمام بالرموز على
اليسار. هذا يعني أن النموذج لا يمكنه رؤية الرموز المستقبلية. GPT-2 هو مثال على نموذج اللغة السببية.
سيوضح لك هذا الدليل كيفية:
1. ضبط دقيق [DistilRoBERTa](https://huggingface.co/distilbert/distilroberta-base) على مجموعة فرعية [r/askscience](https://www.reddit.com/r/askscience/) من مجموعة بيانات [ELI5](https://huggingface.co/datasets/eli5).
2. استخدام النموذج المدرب الخاص بك للاستنتاج.
<Tip>
لرؤية جميع العمارات ونقاط التحقق المتوافقة مع هذه المهمة، نوصي بالتحقق من [task-page](https://huggingface.co/tasks/text-generation)
</Tip>
قبل أن تبدأ، تأكد من تثبيت جميع المكتبات الضرورية:
```bash
pip install transformers datasets evaluate
```
نحن نشجعك على تسجيل الدخول إلى حساب Hugging Face الخاص بك حتى تتمكن من تحميل ومشاركة نموذجك مع المجتمع. عند المطالبة، أدخل رمزك لتسجيل الدخول:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## تحميل مجموعة بيانات ELI5
ابدأ بتحميل أول 5000 مثال من [ELI5-Category](https://huggingface.co/datasets/eli5_category) مجموعة البيانات مع مكتبة 🤗 Datasets. سيعطيك هذا فرصة للتجربة والتأكد من أن كل شيء يعمل قبل قضاء المزيد من الوقت في التدريب على مجموعة البيانات الكاملة.
```py
>>> from datasets import load_dataset
>>> eli5 = load_dataset("eli5_category", split="train[:5000]")
```
قم بتقسيم مجموعة بيانات `train` إلى مجموعتي تدريب واختبار باستخدام الخاصية [`~datasets.Dataset.train_test_split`]:
```py
>>> eli5 = eli5.train_test_split(test_size=0.2)
```
ثم ألق نظرة على مثال:
```py
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers': {'a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'score': [21, 19, 5, 3],
'text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']]},
'title_urls': ['url'],
'selftext_urls': ['url']}
```
على الرغم من أن هذا قد يبدو معقدًا، إلا أنك مهتم حقًا بحقل `text`. ما هو رائع حول مهام نمذجة اللغة
أنت لا تحتاج إلى تسميات (تُعرف أيضًا باسم المهمة غير الخاضعة للإشراف) لأن الكلمة التالية تعمل كتسمية.
## معالجة مسبقة (Preprocess)
<Youtube id="ma1TrR7gE7I"/>
الخطوة التالية هي تحميل مجزء النص DistilGPT2 لمعالجة حقل `text` الفرعي:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
```
ستلاحظ من المثال أعلاه، الحقل `text` هو في الواقع متداخل داخل `answers`. هذا يعني أنك ستحتاج إلى
استخراج حقل `text` الفرعي من بنيته المتداخلة باستخدام الدالة [`flatten`](https://huggingface.co/docs/datasets/process#flatten):
```py
>>> eli5 = eli5.flatten()
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers.a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'answers.text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'answers.score': [21, 19, 5, 3],
'answers.text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']],
'title_urls': ['url'],
'selftext_urls': ['url']}
```
كل حقل فرعي هو الآن عموداً منفصلاً مسبوقاً بـ `answers`، وحقل `text` هو قائمة الآن. بدلاً من ذلك
من تجزائة نص كل جملة بشكل منفصل، قم بتحويل القائمة إلى سلسلة حتى تتمكن من تجزئة نصها بشكل مجمّع.
هنا أول دالة معالجة مسبقة لدمج قائمة السلاسل لكل مثال ومجزىء النتيجة:
```py
>>> def preprocess_function(examples):
... return tokenizer([" ".join(x) for x in examples["answers.text"]])
```
لتطبيق دالة المعالجة المسبقة هذه على مجموعة البيانات بأكملها، استخدم الدالة 🤗 Datasets [`~datasets.Dataset.map`]. يمكنك تسريع هذه العملية `map` عن طريق تعيين `batched=True` لمعالجة عناصر متعددة من مجموعة البيانات في وقت واحد، وزيادة عدد العمليات مع `num_proc`. احذف أي أعمدة لا تحتاجها:
```py
>>> tokenized_eli5 = eli5.map(
... preprocess_function,
... batched=True,
... num_proc=4,
... remove_columns=eli5["train"].column_names,
... )
```
تحتوي هذه المجموعة من البيانات على تسلسلات الرموز، ولكن بعضها أطول من الطول الأقصى للمدخلات للنموذج.
يمكنك الآن استخدام دالة ما قبل المعالجة ثانية لـ:
- تجميع كل التسلسلات.
- تقسيم التسلسلات المجمّعة إلى أجزاء أقصر محددة، بحجم `block_size`، والتي يجب أن تكون أقصر من الطول الأقصى للمدخلات ومناسبة لذاكرة GPU.
```py
>>> block_size = 128
>>> def group_texts(examples):
... # ربط جميع النصوص.
... concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}
... total_length = len(concatenated_examples[list(examples.keys())[0]])
... # نتجاهل الباقي الصغير، يمكننا إضافة الحشو إذا كان النموذج يدعمه بدلاً من هذا الإسقاط، يمكنك
... # تخصيص هذا الجزء حسب احتياجاتك.
... if total_length >= block_size:
... total_length = (total_length // block_size) * block_size
... # التقسيم إلى أجزاء بحجم block_size.
... result = {
... k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
... for k, t in concatenated_examples.items()
... }
... result["labels"] = result["input_ids"].copy()
... return result
```
طبق دالة `group_texts` على كامل المجموعة من البيانات:
```py
>>> lm_dataset = tokenized_eli5.map(group_texts, batched=True, num_proc=4)
```
الآن قم بإنشاء دفعة من الأمثلة باستخدام [`DataCollatorForLanguageModeling`]. من الأفضل أن تقوم بـ *الحشو الديناميكي* للجمل إلى الطول الأطول في الدفعة أثناء التجميع، بدلاً من حشو كامل المجموعة من البيانات إلى الطول الأقصى.
<frameworkcontent>
<pt>
استخدم رمز نهاية التسلسل كرمز للحشو، وحدد `mlm_probability` لحجب الرموز بشكل عشوائي عند كل تكرار للبيانات:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> tokenizer.pad_token = tokenizer.eos_token
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
```
</pt>
<tf>
استخدم رمز نهاية التسلسل كرمز للحشو، وحدد `mlm_probability` لحجب الرموز بشكل عشوائي عند كل تكرار للبيانات:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False, return_tensors="tf")
```
</tf>
</frameworkcontent>
## التدريب (Train)
<frameworkcontent>
<pt>
<Tip>
إذا لم تكن على دراية بتدريب نموذج باستخدام [`Trainer`], اطلع على [البرنامج التعليمي الأساسي](../training#train-with-pytorch-trainer)!
</Tip>
أنت جاهز الآن لبدء تدريب نموذجك! قم بتحميل DistilGPT2 باستخدام [`AutoModelForCausalLM`]:
```py
>>> from transformers import AutoModelForCausalLM, TrainingArguments, Trainer
>>> model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
```
في هذه المرحلة، تبقى ثلاث خطوات فقط:
1. حدد معلمات التدريب الخاصة بك في [`TrainingArguments`]. المعامل الوحيد المطلوب هو `output_dir` الذي يحدد أين سيتم حفظ نموذجك. ستقوم بدفع هذا النموذج إلى Hub بتحديد `push_to_hub=True` (يجب أن تكون مسجلاً الدخول إلى Hugging Face لتحميل نموذجك).
2. قم بتمرير معاملات التدريب إلى [`Trainer`] إلى جانب النموذج، والمجموعات من البيانات، ومجمّع البيانات.
3. قم باستدعاء [`~Trainer.train`] لتدريب نموذجك.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_eli5_clm-model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=lm_dataset["train"],
... eval_dataset=lm_dataset["test"],
... data_collator=data_collator,
... tokenizer=tokenizer,
... )
>>> trainer.train()
```
بمجرد اكتمال التدريب، استخدم طريقة [`~transformers.Trainer.evaluate`] لتقييم نموذجك والحصول على احتمالية الارتباك:
```py
>>> import math
>>> eval_results = trainer.evaluate()
>>> print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
Perplexity: 49.61
```
ثم شارك نموذجك على Hub باستخدام طريقة [`~transformers.Trainer.push_to_hub`] حتى يتمكن الجميع من استخدام نموذجك:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
إذا لم تكن على دراية بتدريب نموذج باستخدام Keras، اطلع على [البرنامج التعليمي الأساسي](../training#train-a-tensorflow-model-with-keras)!
</Tip>
لتدريب نموذج في TensorFlow، ابدأ بإعداد دالة المحسن، وجدول معدل التعلم، وبعض معاملات التدريب:
```py
>>> from transformers import create_optimizer, AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
ثم يمكنك تحميل DistilGPT2 باستخدام [`TFAutoModelForCausalLM`]:
```py
>>> from transformers import TFAutoModelForCausalLM
>>> model = TFAutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
```
حول مجموعات بياناتك إلى تنسيق `tf.data.Dataset` باستخدام [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... lm_dataset["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_test_set = model.prepare_tf_dataset(
... lm_dataset["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
قم بتهيئة النموذج للتدريب باستخدام [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). لاحظ أن جميع نماذج Transformers لديها دالة خسارة ذات صلة بالمهمة الافتراضية، لذلك لا تحتاج إلى تحديد واحدة ما لم ترغب في ذلك:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # لا يوجد حجة للخسارة!
```
يمكن القيام بذلك عن طريق تحديد مكان دفع نموذجك ومجمّع البيانات في [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> callback = PushToHubCallback(
... output_dir="my_awesome_eli5_clm-model",
... tokenizer=tokenizer,
... )
```
أخيراً، أنت جاهز لبدء تدريب نموذجك! قم باستدعاء [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) مع مجموعات بيانات التدريب والتحقق من الصحة، وعدد العصور، والتعليقات الخاصة بك لتدريب النموذج:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=[callback])
```
بمجرد اكتمال التدريب، يتم تحميل نموذجك تلقائيًا إلى Hub حتى يتمكن الجميع من استخدامه!
</tf>
</frameworkcontent>
<Tip>
للحصول على مثال أكثر تعمقًا حول كيفية تدريب نموذج للنمذجة اللغوية السببية، اطلع على الدفتر المقابل
[دفتر PyTorch](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)
أو [دفتر TensorFlow](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
</Tip>
## الاستدلال (Inference)
رائع، الآن بعد أن قمت بتدريب نموذج، يمكنك استخدامه للاستدلال!
قم بابتكار سؤال تود توليد نص منه:
```py
>>> prompt = "Somatic hypermutation allows the immune system to"
```
أبسط طريقة لتجربة نموذجك المدرب للاستدلال هي استخدامه في [`pipeline`]. قم بتنفيذ `pipeline` لتوليد النص مع نموذجك، ومرر نصك إليه:
```py
>>> from transformers import pipeline
>>> generator = pipeline("text-generation", model="username/my_awesome_eli5_clm-model")
>>> generator(prompt)
[{'generated_text': "Somatic hypermutation allows the immune system to be able to effectively reverse the damage caused by an infection.\n\n\nThe damage caused by an infection is caused by the immune system's ability to perform its own self-correcting tasks."}]
```
<frameworkcontent>
<pt>
قسم النص وإرجع `input_ids` كتنسورات PyTorch:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_clm-model")
>>> inputs = tokenizer(prompt, return_tensors="pt").input_ids
```
استخدم طريقة [`~generation.GenerationMixin.generate`] لتوليد النص.
للمزيد من التفاصيل حول استراتيجيات توليد النص المختلفة والبارامترات للتحكم في التوليد، راجع صفحة [استراتيجيات توليد النص](../generation_strategies).
```py
>>> from transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("username/my_awesome_eli5_clm-model")
>>> outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
```
فك ترميز الرموز المولدة مرة أخرى إلى نص:
```py
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
["Somatic hypermutation allows the immune system to react to drugs with the ability to adapt to a different environmental situation. In other words, a system of 'hypermutation' can help the immune system to adapt to a different environmental situation or in some cases even a single life. In contrast, researchers at the University of Massachusetts-Boston have found that 'hypermutation' is much stronger in mice than in humans but can be found in humans, and that it's not completely unknown to the immune system. A study on how the immune system"]
```
</pt>
<tf>
قم بتقسيم النص وإرجاع `input_ids` كـ TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_clm-model")
>>> inputs = tokenizer(prompt, return_tensors="tf").input_ids
```
استخدم طريقة [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] لإنشاء الملخص. للمزيد من التفاصيل حول استراتيجيات توليد النص المختلفة والبارامترات للتحكم في التوليد، راجع صفحة [استراتيجيات توليد النص](../generation_strategies).
```py
>>> from transformers import TFAutoModelForCausalLM
>>> model = TFAutoModelForCausalLM.from_pretrained("username/my_awesome_eli5_clm-model")
>>> outputs = model.generate(input_ids=inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
```
فك ترميز الرموز المولدة مرة أخرى إلى نص:
```py
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Somatic hypermutation allows the immune system to detect the presence of other viruses as they become more prevalent. Therefore, researchers have identified a high proportion of human viruses. The proportion of virus-associated viruses in our study increases with age. Therefore, we propose a simple algorithm to detect the presence of these new viruses in our samples as a sign of improved immunity. A first study based on this algorithm, which will be published in Science on Friday, aims to show that this finding could translate into the development of a better vaccine that is more effective for']
```
</tf>
</frameworkcontent>

View File

@ -0,0 +1,442 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# نمذجة اللغة المقنعة (Masked language modeling)
[[open-in-colab]]
<Youtube id="mqElG5QJWUg"/>
تتنبأ نمذجة اللغة المقنعة برمز مقنع في تسلسل، ويمكن للنموذج الانتباه إلى الرموز بشكل ثنائي الاتجاه. هذا
يعني أن النموذج لديه إمكانية الوصول الكاملة إلى الرموز الموجودة على اليسار واليمين. تعد نمذجة اللغة المقنعة ممتازة للمهام التي
تتطلب فهمًا سياقيًا جيدًا لتسلسل كامل. BERT هو مثال على نموذج لغة مقنع.
سيوضح لك هذا الدليل كيفية:
1. تكييف [DistilRoBERTa](https://huggingface.co/distilbert/distilroberta-base) على مجموعة فرعية [r/askscience](https://www.reddit.com/r/askscience/) من مجموعة بيانات [ELI5](https://huggingface.co/datasets/eli5).
2. استخدام نموذج المدرب الخاص بك للاستدلال.
<Tip>
لمعرفة جميع البنى والنسخ المتوافقة مع هذه المهمة، نوصي بالتحقق من [صفحة المهمة](https://huggingface.co/tasks/fill-mask)
</Tip>
قبل أن تبدأ، تأكد من تثبيت جميع المكتبات الضرورية:
```bash
pip install transformers datasets evaluate
```
نحن نشجعك على تسجيل الدخول إلى حساب Hugging Face الخاص بك حتى تتمكن من تحميل ومشاركة نموذجك مع المجتمع. عندما تتم مطالبتك، أدخل رمزك لتسجيل الدخول:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## تحميل مجموعة بيانات ELI5
ابدأ بتحميل أول 5000 مثال من مجموعة بيانات [ELI5-Category](https://huggingface.co/datasets/eli5_category) باستخدام مكتبة 🤗 Datasets. سيعطيك هذا فرصة للتجربة والتأكد من أن كل شيء يعمل قبل قضاء المزيد من الوقت في التدريب على مجموعة البيانات الكاملة.
```py
>>> from datasets import load_dataset
>>> eli5 = load_dataset("eli5_category", split="train[:5000]")
```
قم بتقسيم مجموعة البيانات `train` إلى مجموعتي تدريب واختبار باستخدام الدالة [`~datasets.Dataset.train_test_split`]:
```py
>>> eli5 = eli5.train_test_split(test_size=0.2)
```
ثم ألق نظرة على مثال:
```py
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers': {'a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'score': [21, 19, 5, 3],
'text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']]},
'title_urls': ['url'],
'selftext_urls': ['url']}
```
على الرغم من أن هذا قد يبدو كثيرًا، إلا أنك مهتم حقًا بحقل `text`. ما هو رائع حول مهام نمذجة اللغة هو أنك لا تحتاج إلى تسميات (تُعرف أيضًا باسم المهمة غير الخاضعة للإشراف) لأن الكلمة التالية *هي* التسمية.
## معالجة مسبقة (Preprocess)
<Youtube id="8PmhEIXhBvI"/>
بالنسبة لنمذجة اللغة المقنعة، فإن الخطوة التالية هي تحميل معالج DistilRoBERTa لمعالجة حقل `text` الفرعي:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilroberta-base")
```
ستلاحظ من المثال أعلاه، أن حقل `text` موجود بالفعل داخل `answers`. هذا يعني أنك ستحتاج إلى استخراج حقل `text` الفرعي من بنيته المضمنة باستخدام الدالة [`flatten`](https://huggingface.co/docs/datasets/process#flatten):
```py
>>> eli5 = eli5.flatten()
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers.a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'answers.text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'answers.score': [21, 19, 5, 3],
'answers.text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']],
'title_urls': ['url'],
'selftext_urls': ['url']}
```
كل حقل فرعي هو الآن عمود منفصل كما هو موضح بواسطة بادئة `answers`، وحقل `text` هو قائمة الآن. بدلاً من
معالجة كل جملة بشكل منفصل، قم بتحويل القائمة إلى سلسلة حتى تتمكن من معالجتها بشكل مشترك.
هنا أول دالة معالجة مسبقة لربط قائمة السلاسل لكل مثال ومعالجة النتيجة:
```py
>>> def preprocess_function(examples):
... return tokenizer([" ".join(x) for x in examples["answers.text"]])
```
لتطبيق دالة المعالجة المسبقة على مجموعة البيانات بأكملها، استخدم الدالة 🤗 Datasets [`~datasets.Dataset.map`]. يمكنك تسريع دالة `map` عن طريق تعيين `batched=True` لمعالجة عدة عناصر في وقت واحد، وزيادة عدد العمليات باستخدام `num_proc`. احذف أي أعمدة غير ضرورية:
```py
>>> tokenized_eli5 = eli5.map(
... preprocess_function,
... batched=True,
... num_proc=4,
... remove_columns=eli5["train"].column_names,
... )
```
تحتوي مجموعة البيانات هذه على تسلسلات رمزية، ولكن بعضها أطول من الطول الأقصى للمدخلات للنموذج.
يمكنك الآن استخدام دالة معالجة مسبقة ثانية لـ:
- تجميع جميع التسلسلات
- تقسيم التسلسلات المجمّعة إلى أجزاء أقصر محددة بـ `block_size`، والتي يجب أن تكون أقصر من الحد الأقصى لطول المدخلات ومناسبة لذاكرة GPU.
```py
>>> block_size = 128
>>> def group_texts(examples):
... # تجميع جميع النصوص.
... concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}
... total_length = len(concatenated_examples[list(examples.keys())[0]])
... # نتجاهل الجزء المتبقي الصغير، يمكننا إضافة الحشو إذا كان النموذج يدعمه بدلاً من هذا الإسقاط، يمكنك
... # تخصيص هذا الجزء حسب احتياجاتك.
... if total_length >= block_size:
... total_length = (total_length // block_size) * block_size
... # تقسيمها إلى أجزاء بحجم block_size.
... result = {
... k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
... for k, t in concatenated_examples.items()
... }
... return result
```
طبق دالة `group_texts` على مجموعة البيانات بأكملها:
```py
>>> lm_dataset = tokenized_eli5.map(group_texts, batched=True, num_proc=4)
```
الآن، قم بإنشاء دفعة من الأمثلة باستخدام [`DataCollatorForLanguageModeling`]. من الأكثر كفاءة أن تقوم بـ *الحشو الديناميكي* ليصل طولها إلى أطول جملة في الدفعة أثناء التجميع، بدلاً من حشو مجموعة البيانات بأكملها إلى الطول الأقصى.
<frameworkcontent>
<pt>
استخدم رمز نهاية التسلسل كرمز الحشو وحدد `mlm_probability` لحجب الرموز عشوائياً كل مرة تكرر فيها البيانات:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> tokenizer.pad_token = tokenizer.eos_token
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
```
</pt>
<tf>
استخدم رمز نهاية التسلسل كرمز الحشو وحدد `mlm_probability` لحجب الرموز عشوائياً كل مرة تكرر فيها البيانات:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15, return_tensors="tf")
```
</tf>
</frameworkcontent>
## التدريب (Train)
<frameworkcontent>
<pt>
<Tip>
إذا لم تكن على دراية بتعديل نموذج باستخدام [`Trainer`], ألق نظرة على الدليل الأساسي [هنا](../training#train-with-pytorch-trainer)!
</Tip>
أنت مستعد الآن لبدء تدريب نموذجك! قم بتحميل DistilRoBERTa باستخدام [`AutoModelForMaskedLM`]:
```py
>>> from transformers import AutoModelForMaskedLM
>>> model = AutoModelForMaskedLM.from_pretrained("distilbert/distilroberta-base")
```
في هذه المرحلة، تبقى ثلاث خطوات فقط:
1. حدد معلمات التدريب الخاصة بك في [`TrainingArguments`]. المعلمة الوحيدة المطلوبة هي `output_dir` والتي تحدد مكان حفظ نموذجك. ستقوم بدفع هذا النموذج إلى Hub عن طريق تعيين `push_to_hub=True` (يجب أن تكون مسجلاً الدخول إلى Hugging Face لتحميل نموذجك).
2. قم بتمرير معلمات التدريب إلى [`Trainer`] مع النموذج، ومجموعات البيانات، ومجمّع البيانات.
3. قم باستدعاء [`~Trainer.train`] لتعديل نموذجك.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_eli5_mlm_model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=lm_dataset["train"],
... eval_dataset=lm_dataset["test"],
... data_collator=data_collator,
... tokenizer=tokenizer,
... )
>>> trainer.train()
```
بمجرد اكتمال التدريب، استخدم طريقة [`~transformers.Trainer.evaluate`] لتقييم النموذج والحصول على مقياس
الحيرة:
```py
>>> import math
>>> eval_results = trainer.evaluate()
>>> print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
Perplexity: 8.76
```
ثم شارك نموذجك على Hub باستخدام طريقة [`~transformers.Trainer.push_to_hub`] حتى يتمكن الجميع من استخدام نموذجك:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
إذا لم تكن على دراية بتعديل نموذج باستخدام Keras، ألق نظرة على الدليل الأساسي [هنا](../training#train-a-tensorflow-model-with-keras)!
</Tip>
لتعديل نموذج في TensorFlow، ابدأ بإعداد دالة محسن، وجدول معدل التعلم، وبعض معلمات التدريب:
```py
>>> from transformers import create_optimizer, AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
ثم يمكنك تحميل DistilRoBERTa باستخدام [`TFAutoModelForMaskedLM`]:
```py
>>> from transformers import TFAutoModelForMaskedLM
>>> model = TFAutoModelForMaskedLM.from_pretrained("distilbert/distilroberta-base")
```
قم بتحويل مجموعات بياناتك إلى تنسيق `tf.data.Dataset` باستخدام [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... lm_dataset["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_test_set = model.prepare_tf_dataset(
... lm_dataset["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
قم بتهيئة النموذج للتدريب باستخدام [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). لاحظ أن نماذج Transformers لديها جميعها دالة خسارة افتراضية ذات صلة بالمهمة، لذلك لا تحتاج إلى تحديد واحدة ما لم تكن تريد ذلك:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # لا توجد حجة للخسارة!
```
يمكن القيام بذلك عن طريق تحديد مكان دفع نموذجك ومعالج الرموز في [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> callback = PushToHubCallback(
... output_dir="my_awesome_eli5_mlm_model",
... tokenizer=tokenizer,
... )
```
أخيراً، أنت مستعد لبدء تدريب نموذجك! قم باستدعاء [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) مع مجموعات بيانات التدريب والتحقق، وعدد العصور، والتعليقات الخاصة بك لتعديل النموذج:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=[callback])
```
بمجرد اكتمال التدريب، يتم تحميل نموذجك تلقائياً إلى Hub حتى يتمكن الجميع من استخدامه!
</tf>
</frameworkcontent>
<Tip>
لمثال أكثر تفصيلاً حول كيفية تعديل نموذج للنمذجة اللغوية المقنعة، ألق نظرة على الدفتر المقابل
[دفتر PyTorch](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)
أو [دفتر TensorFlow](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
</Tip>
## الاستدلال
رائع، الآن بعد أن قمت بتعديل نموذج، يمكنك استخدامه للاستدلال!
جهّز بعض النصوص التي تريد أن يملأ النموذج الفراغات فيها، واستخدم الرمز الخاص `<mask>` للإشارة إلى الفراغ:
```py
>>> text = "The Milky Way is a <mask> galaxy."
```
أبسط طريقة لتجربة نموذجك المعدل للاستدلال هي استخدامه في [`pipeline`]. قم بإنشاء كائن `pipeline` لملء الفراغ مع نموذجك، ومرر نصك إليه. إذا أردت، يمكنك استخدام معلمة `top_k` لتحديد عدد التنبؤات التي تريد إرجاعها:
```py
>>> from transformers import pipeline
>>> mask_filler = pipeline("fill-mask", "username/my_awesome_eli5_mlm_model")
>>> mask_filler(text, top_k=3)
[{'score': 0.5150994658470154,
'token': 21300,
'token_str': ' spiral',
'sequence': 'The Milky Way is a spiral galaxy.'},
{'score': 0.07087188959121704,
'token': 2232,
'token_str': ' massive',
'sequence': 'The Milky Way is a massive galaxy.'},
{'score': 0.06434620916843414,
'token': 650,
'token_str': ' small',
'sequence': 'The Milky Way is a small galaxy.'}]
```
<frameworkcontent>
<pt>
قم بتجزئة النص وإرجاع `input_ids` كمتجهات PyTorch. ستحتاج أيضًا إلى تحديد موضع رمز `<mask>`:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> inputs = tokenizer(text, return_tensors="pt")
>>> mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
```
قم بتمرير المدخلات إلى النموذج وإرجاع `logits` للرمز المقنع:
```py
>>> from transformers import AutoModelForMaskedLM
>>> model = AutoModelForMaskedLM.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> logits = model(**inputs).logits
>>> mask_token_logits = logits[0, mask_token_index, :]
```
ثم قم بإرجاع الرموز الثلاثة المقنعة ذات الاحتمالية الأعلى وطباعتها:
```py
>>> top_3_tokens = torch.topk(mask_token_logits, 3, dim=1).indices[0].tolist()
>>> for token in top_3_tokens:
... print(text.replace(tokenizer.mask_token, tokenizer.decode([token])))
The Milky Way is a spiral galaxy.
The Milky Way is a massive galaxy.
The Milky Way is a small galaxy.
```
</pt>
<tf>
قم بتقسيم النص إلى رموز وإرجاع `input_ids` كـ TensorFlow tensors. ستحتاج أيضًا إلى تحديد موضع رمز `<mask>`:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> inputs = tokenizer(text, return_tensors="tf")
>>> mask_token_index = tf.where(inputs["input_ids"] == tokenizer.mask_token_id)[0, 1]
```
قم بتمرير المدخلات إلى النموذج وإرجاع `logits` للرمز المقنع:
```py
>>> from transformers import TFAutoModelForMaskedLM
>>> model = TFAutoModelForMaskedLM.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> logits = model(**inputs).logits
>>> mask_token_logits = logits[0, mask_token_index, :]
```
ثم قم بإرجاع الرموز الثلاثة المقنعة ذات الاحتمالية الأعلى وطباعتها:
```py
>>> top_3_tokens = tf.math.top_k(mask_token_logits, 3).indices.numpy()
>>> for token in top_3_tokens:
... print(text.replace(tokenizer.mask_token, tokenizer.decode([token])))
The Milky Way is a spiral galaxy.
The Milky Way is a massive galaxy.
The Milky Way is a small galaxy.
```
</tf>
</frameworkcontent>

View File

@ -0,0 +1,452 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# الاختيار من متعدد (Multiple choice)
[[open-in-colab]]
مهمة الاختيار من متعدد مشابهة لمهمة الإجابة على الأسئلة، ولكن مع توفير عدة إجابات محتملة مع سياق، ويُدرّب النموذج على تحديد الإجابة الصحيحة.
سيوضح لك هذا الدليل كيفية:
1. ضبط نموذج [BERT](https://huggingface.co/google-bert/bert-base-uncased) باستخدام الإعداد `regular` لمجموعة بيانات [SWAG](https://huggingface.co/datasets/swag) لاختيار الإجابة الأفضل من بين الخيارات المتعددة المتاحة مع السياق.
2. استخدام النموذج المضبوط للاستدلال.
قبل البدء، تأكد من تثبيت جميع المكتبات الضرورية:
```bash
pip install transformers datasets evaluate
```
نشجعك على تسجيل الدخول إلى حساب Hugging Face الخاص بك حتى تتمكن من تحميل نموذجك ومشاركته مع المجتمع. عند المطالبة، أدخل الرمز المميز الخاص بك لتسجيل الدخول:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## تحميل مجموعة بيانات SWAG
ابدأ بتحميل تهيئة `regular` لمجموعة بيانات SWAG من مكتبة 🤗 Datasets:
```py
>>> from datasets import load_dataset
>>> swag = load_dataset("swag", "regular")
```
ثم ألق نظرة على مثال:
```py
>>> swag["train"][0]
{'ending0': 'passes by walking down the street playing their instruments.',
'ending1': 'has heard approaching them.',
'ending2': "arrives and they're outside dancing and asleep.",
'ending3': 'turns the lead singer watches the performance.',
'fold-ind': '3416',
'gold-source': 'gold',
'label': 0,
'sent1': 'Members of the procession walk down the street holding small horn brass instruments.',
'sent2': 'A drum line',
'startphrase': 'Members of the procession walk down the street holding small horn brass instruments. A drum line',
'video-id': 'anetv_jkn6uvmqwh4'}
```
على الرغم من أن الحقول تبدو كثيرة، إلا أنها في الواقع بسيطة جداً:
- `sent1` و `sent2`: يعرض هذان الحقلان بداية الجملة، وبدمجهما معًا، نحصل على حقل `startphrase`.
- `ending`: يقترح نهاية محتملة للجملة، واحدة منها فقط هي الصحيحة.
- `label`: يحدد نهاية الجملة الصحيحة.
## المعالجة المسبقة (Preprocess)
الخطوة التالية هي استدعاء مُجزئ BERT لمعالجة بدايات الجمل والنهايات الأربع المحتملة:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
```
تحتاج دالة المعالجة المسبقة التي تريد إنشاءها إلى:
1. إنشاء أربع نسخ من حقل `sent1` ودمج كل منها مع `sent2` لإعادة إنشاء كيفية بدء الجملة.
2. دمج `sent2` مع كل من نهايات الجمل الأربع المحتملة.
3. تتجميع هاتين القائمتين لتتمكن من تجزئتهما، ثم إعادة ترتيبها بعد ذلك بحيث يكون لكل مثال حقول `input_ids` و `attention_mask` و `labels` مقابلة.
```py
>>> ending_names = ["ending0", "ending1", "ending2", "ending3"]
>>> def preprocess_function(examples):
... first_sentences = [[context] * 4 for context in examples["sent1"]]
... question_headers = examples["sent2"]
... second_sentences = [
... [f"{header} {examples[end][i]}" for end in ending_names] for i, header in enumerate(question_headers)
... ]
... first_sentences = sum(first_sentences, [])
... second_sentences = sum(second_sentences, [])
... tokenized_examples = tokenizer(first_sentences, second_sentences, truncation=True)
... return {k: [v[i : i + 4] for i in range(0, len(v), 4)] for k, v in tokenized_examples.items()}
```
لتطبيق دالة المعالجة المسبقة على مجموعة البيانات بأكملها، استخدم طريقة [`~datasets.Dataset.map`] الخاصة بـ 🤗 Datasets. يمكنك تسريع دالة `map` عن طريق تعيين `batched=True` لمعالجة عناصر متعددة من مجموعة البيانات في وقت واحد:
```py
tokenized_swag = swag.map(preprocess_function, batched=True)
```
لا يحتوي 🤗 Transformers على مجمع بيانات للاختيار من متعدد، لذلك ستحتاج إلى تكييف [`DataCollatorWithPadding`] لإنشاء دفعة من الأمثلة. من الأكفأ إضافة حشو (padding) ديناميكي للجمل إلى أطول طول في دفعة أثناء التجميع، بدلاً من حشو مجموعة البيانات بأكملها إلى الحد الأقصى للطول.
يقوم `DataCollatorForMultipleChoice` بتجميع جميع مدخلات النموذج، ويطبق الحشو، ثم يعيد تجميع النتائج في شكلها الأصلي:
<frameworkcontent>
<pt>
```py
>>> from dataclasses import dataclass
>>> from transformers.tokenization_utils_base import PreTrainedTokenizerBase, PaddingStrategy
>>> from typing import Optional, Union
>>> import torch
>>> @dataclass
... class DataCollatorForMultipleChoice:
... """
... Data collator that will dynamically pad the inputs for multiple choice received.
... """
... tokenizer: PreTrainedTokenizerBase
... padding: Union[bool, str, PaddingStrategy] = True
... max_length: Optional[int] = None
... pad_to_multiple_of: Optional[int] = None
... def __call__(self, features):
... label_name = "label" if "label" in features[0].keys() else "labels"
... labels = [feature.pop(label_name) for feature in features]
... batch_size = len(features)
... num_choices = len(features[0]["input_ids"])
... flattened_features = [
... [{k: v[i] for k, v in feature.items()} for i in range(num_choices)] for feature in features
... ]
... flattened_features = sum(flattened_features, [])
... batch = self.tokenizer.pad(
... flattened_features,
... padding=self.padding,
... max_length=self.max_length,
... pad_to_multiple_of=self.pad_to_multiple_of,
... return_tensors="pt",
... )
... batch = {k: v.view(batch_size, num_choices, -1) for k, v in batch.items()}
... batch["labels"] = torch.tensor(labels, dtype=torch.int64)
... return batch
```
</pt>
<tf>
```py
>>> from dataclasses import dataclass
>>> from transformers.tokenization_utils_base import PreTrainedTokenizerBase, PaddingStrategy
>>> from typing import Optional, Union
>>> import tensorflow as tf
>>> @dataclass
... class DataCollatorForMultipleChoice:
... """
... Data collator that will dynamically pad the inputs for multiple choice received.
... """
... tokenizer: PreTrainedTokenizerBase
... padding: Union[bool, str, PaddingStrategy] = True
... max_length: Optional[int] = None
... pad_to_multiple_of: Optional[int] = None
... def __call__(self, features):
... label_name = "label" if "label" in features[0].keys() else "labels"
... labels = [feature.pop(label_name) for feature in features]
... batch_size = len(features)
... num_choices = len(features[0]["input_ids"])
... flattened_features = [
... [{k: v[i] for k, v in feature.items()} for i in range(num_choices)] for feature in features
... ]
... flattened_features = sum(flattened_features, [])
... batch = self.tokenizer.pad(
... flattened_features,
... padding=self.padding,
... max_length=self.max_length,
... pad_to_multiple_of=self.pad_to_multiple_of,
... return_tensors="tf",
... )
... batch = {k: tf.reshape(v, (batch_size, num_choices, -1)) for k, v in batch.items()}
... batch["labels"] = tf.convert_to_tensor(labels, dtype=tf.int64)
... return batch
```
</tf>
</frameworkcontent>
## التقييم (Evaluate)
يُفضل غالبًا تضمين مقياس أثناء التدريب لتقييم أداء نموذجك. يمكنك تحميل طريقة تقييم بسرعة باستخدام مكتبة 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index). لهذه المهمة، قم بتحميل مقياس [الدقة](https://huggingface.co/spaces/evaluate-metric/accuracy) (انظر إلى [الجولة السريعة](https://huggingface.co/docs/evaluate/a_quick_tour) لـ 🤗 Evaluate لمعرفة المزيد حول كيفية تحميل المقياس وحسابه):
```py
>>> import evaluate
>>> accuracy = evaluate.load("accuracy")
```
ثم أنشئ دالة لتمرير التنبؤات والتسميات إلى [`~evaluate.EvaluationModule.compute`] لحساب الدقة:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions, labels = eval_pred
... predictions = np.argmax(predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=labels)
```
دالتك `compute_metrics` جاهزة الآن، وستعود إليها عند إعداد تدريبك.
## التدريب (Train)
<frameworkcontent>
<pt>
<Tip>
إذا لم تكن معتادًا على ضبط نموذج باستخدام [`Trainer`], فراجع الدرس الأساسي [هنا](../training#train-with-pytorch-trainer)!
</Tip>
أنت جاهز لبدء تدريب نموذجك الآن! قم بتحميل BERT باستخدام [`AutoModelForMultipleChoice`]:
```py
>>> from transformers import AutoModelForMultipleChoice, TrainingArguments, Trainer
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-uncased")
```
في هذه المرحلة، تبقى ثلاث خطوات فقط:
1. حدد معلمات التدريب الخاصة بك في [`TrainingArguments`]. المعلمة الوحيدة المطلوبة هي `output_dir` التي تحدد مكان حفظ نموذجك. ستدفع هذا النموذج إلى Hub عن طريق تعيين `push_to_hub=True` (يجب عليك تسجيل الدخول إلى Hugging Face لتحميل نموذجك). في نهاية كل حقبة، سيقوم [`Trainer`] بتقييم الدقة وحفظ نقطة فحص التدريب.
2. مرر معلمات التدريب إلى [`Trainer`] جنبًا إلى جنب مع النموذج ومُجمِّع البيانات والمعالج ودالة تجميع البيانات ودالة `compute_metrics`.
3. استدعي [`~Trainer.train`] لضبط نموذجك.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_swag_model",
... eval_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... learning_rate=5e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_swag["train"],
... eval_dataset=tokenized_swag["validation"],
... processing_class=tokenizer,
... data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer),
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
بمجرد اكتمال التدريب، شارك نموذجك مع Hub باستخدام طريقة [`~transformers.Trainer.push_to_hub`] حتى يتمكن الجميع من استخدام نموذجك:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
إذا لم تكن معتادًا على ضبط نموذج باستخدام Keras، فراجع الدرس الأساسي [هنا](../training#train-a-tensorflow-model-with-keras)!
</Tip>
لضبط نموذج في TensorFlow، ابدأ بإعداد دالة مُحسِّن وجدول معدل التعلم وبعض معلمات التدريب:
```py
>>> from transformers import create_optimizer
>>> batch_size = 16
>>> num_train_epochs = 2
>>> total_train_steps = (len(tokenized_swag["train"]) // batch_size) * num_train_epochs
>>> optimizer, schedule = create_optimizer(init_lr=5e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
```
ثم يمكنك تحميل BERT باستخدام [`TFAutoModelForMultipleChoice`]:
```py
>>> from transformers import TFAutoModelForMultipleChoice
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-uncased")
```
حوّل مجموعات البيانات الخاصة بك إلى تنسيق `tf.data.Dataset` باستخدام [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> data_collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_swag["train"],
... shuffle=True,
... batch_size=batch_size,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_swag["validation"],
... shuffle=False,
... batch_size=batch_size,
... collate_fn=data_collator,
... )
```
قم بتهيئة النموذج للتدريب باستخدام [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). لاحظ أن جميع نماذج Transformers تحتوي على دالة خسارة مناسبة للمهمة بشكل افتراضي، لذلك لا تحتاج إلى تحديد واحدة ما لم ترغب في ذلك:
```py
>>> model.compile(optimizer=optimizer) # لا توجد وسيطة خسارة!
```
الخطوتان الأخيرتان قبل بدء التدريب هما: حساب دقة التنبؤات، وتوفير طريقة لرفع النموذج إلى Hub. ويمكن تحقيق ذلك باستخدام [استدعاءات Keras](../main_classes/keras_callbacks)
مرر دالتك `compute_metrics` إلى [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
حدد مكان دفع نموذجك ومعالجك في [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_model",
... tokenizer=tokenizer,
... )
```
ثم قم بتضمين الاستدعاءات معًا:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
أخيرًا، أنت جاهز لبدء تدريب نموذجك! استدعِ[`fit`](https://keras.io/api/models/model_training_apis/#fit-method) مع مجموعات بيانات التدريب والتحقق من الصحة وعدد الحقب والاستدعاءات لضبط النموذج:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=2, callbacks=callbacks)
```
بمجرد اكتمال التدريب، يتم تحميل نموذجك تلقائيًا إلى Hub حتى يتمكن الجميع من استخدامه!
</tf>
</frameworkcontent>
<Tip>
للحصول على مثال أكثر تعمقًا حول كيفية ضبط نموذج للاختيار من متعدد، ألق نظرة على [دفتر ملاحظات PyTorch](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)
أو [دفتر ملاحظات TensorFlow](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb) المقابل.
</Tip>
## الاستدلال (Inference)
رائع، الآن بعد أن قمت بضبط نموذج، يمكنك استخدامه للاستدلال!
قم بإنشاء نص واقتراح إجابتين محتملتين:
```py
>>> prompt = "France has a bread law, Le Décret Pain, with strict rules on what is allowed in a traditional baguette."
>>> candidate1 = "The law does not apply to croissants and brioche."
>>> candidate2 = "The law applies to baguettes."
```
<frameworkcontent>
<pt>
قم بتحليل كل مطالبة وزوج إجابة مرشح وأعد تنسورات PyTorch. يجب عليك أيضًا إنشاء بعض `العلامات`:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_swag_model")
>>> inputs = tokenizer([[prompt, candidate1], [prompt, candidate2]], return_tensors="pt", padding=True)
>>> labels = torch.tensor(0).unsqueeze(0)
```
مرر مدخلاتك والعلامات إلى النموذج وأرجع`logits`:
```py
>>> from transformers import AutoModelForMultipleChoice
>>> model = AutoModelForMultipleChoice.from_pretrained("username/my_awesome_swag_model")
>>> outputs = model(**{k: v.unsqueeze(0) for k, v in inputs.items()}, labels=labels)
>>> logits = outputs.logits
```
استخرج الفئة ذات الاحتمالية الأكبر:
```py
>>> predicted_class = logits.argmax().item()
>>> predicted_class
0
```
</pt>
<tf>
قم بتحليل كل مطالبة وزوج إجابة مرشح وأعد موترات TensorFlow:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_swag_model")
>>> inputs = tokenizer([[prompt, candidate1], [prompt, candidate2]], return_tensors="tf", padding=True)
```
مرر مدخلاتك إلى النموذج وأعد القيم logits:
```py
>>> from transformers import TFAutoModelForMultipleChoice
>>> model = TFAutoModelForMultipleChoice.from_pretrained("username/my_awesome_swag_model")
>>> inputs = {k: tf.expand_dims(v, 0) for k, v in inputs.items()}
>>> outputs = model(inputs)
>>> logits = outputs.logits
```
استخرج الفئة ذات الاحتمالية الأكبر:
```py
>>> predicted_class = int(tf.math.argmax(logits, axis=-1)[0])
>>> predicted_class
0
```
</tf>
</frameworkcontent>

View File

@ -0,0 +1,432 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# الإجابة على الأسئلة (Question answering)
[[open-in-colab]]
<Youtube id="ajPx5LwJD-I"/>
تُقدّم مهام الإجابة على الأسئلة إجابةً بناءً على سؤال. إذا سبق لك أن سألت مساعدًا افتراضيًا مثل Alexa أو Siri أو Google عن حالة الطقس، فأنت قد استخدمت نموذج للإجابة على الأسئلة من قبل. هناك نوعان شائعان لمهام الإجابة على الأسئلة:
- الاستخراجية: استخراج الإجابة من السياق المحدد.
- التلخيصية: إنشاء إجابة من السياق تجيب على السؤال بشكل صحيح.
سيوضح لك هذا الدليل كيفية:
1. ضبط [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) على مجموعة بيانات [SQuAD](https://huggingface.co/datasets/squad) للإجابة على الأسئلة الاستخراجية.
2. استخدام النموذج المضبوط للاستدلال.
<Tip>
لمشاهدة جميع الهياكل والنسخ المتوافقة مع هذه المهمة، نوصي بالرجوع إلى [صفحة المهمة](https://huggingface.co/tasks/question-answering)
</Tip>
قبل البدء، تأكد من تثبيت جميع المكتبات الضرورية:
```bash
pip install transformers datasets evaluate
```
نشجعك على تسجيل الدخول إلى حساب Hugging Face الخاص بك حتى تتمكن من تحميل نموذجك ومشاركته مع المجتمع. عند المطالبة، أدخل الرمز المميز الخاص بك لتسجيل الدخول:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## تحميل مجموعة بيانات SQuAD
ابدأ بتحميل جزء أصغر من مجموعة بيانات SQuAD من مكتبة 🤗 Datasets. سيتيح لك ذلك فرصة للتجربة والتحقق من عمل كل شيء بشكل صحيح قبل قضاء المزيد من الوقت في التدريب على مجموعة البيانات الكاملة.
```py
>>> from datasets import load_dataset
>>> squad = load_dataset("squad", split="train[:5000]")
```
قم بتقسيم تقسيم `train` لمجموعة البيانات إلى مجموعة تدريب واختبار باستخدام طريقة [`~datasets.Dataset.train_test_split`]:
```py
>>> squad = squad.train_test_split(test_size=0.2)
```
ثم ألق نظرة على مثال:
```py
>>> squad["train"][0]
{'answers': {'answer_start': [515], 'text': ['Saint Bernadette Soubirous']},
'context': 'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.',
'id': '5733be284776f41900661182',
'question': 'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?',
'title': 'University_of_Notre_Dame'
}
```
هناك العديد من الحقول المهمة هنا:
- `answers`: موقع بداية الرمز المميز للإجابة ونص الإجابة.
- `context`: معلومات أساسية يحتاج النموذج إلى استخراج الإجابة منها.
- `question`: السؤال الذي يجب على النموذج الإجابة عليه.
## المعالجة المسبقة (Preprocess)
<Youtube id="qgaM0weJHpA"/>
الخطوة التالية هي تحميل المحلل اللغوى DistilBERT لمعالجة حقلي `question` و `context`:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
هناك بعض خطوات المعالجة المسبقة الخاصة بمهام الإجابة على الأسئلة التي يجب أن تكون على دراية بها:
1. قد تحتوي بعض الأمثلة في مجموعة البيانات على `context` طويلًا يتجاوز الحد الأقصى لطول مدخل النموذج. للتعامل مع النصوص الأطول، يتم اقتطاع `context` فقط عن طريق تعيين `truncation="only_second"`.
2. بعد ذلك، يتم تحديد مواضع بداية ونهاية الإجابة في `context` الأصلي عن طريق تعيين
`return_offset_mapping=True`.
3. باستخدام التعيين، يمكن الآن تحديد رموز بداية ونهاية الإجابة. استخدم طريقة [`~tokenizers.Encoding.sequence_ids`]
لتحديد أجزاء الإزاحة التي تتوافق مع `question` و `context`.
فيما يلي كيفية إنشاء دالة لقص وتعيين رموز البداية والنهاية لـ `answer` إلى `context`:
```py
>>> def preprocess_function(examples):
... questions = [q.strip() for q in examples["question"]]
... inputs = tokenizer(
... questions,
... examples["context"],
... max_length=384,
... truncation="only_second",
... return_offsets_mapping=True,
... padding="max_length",
... )
... offset_mapping = inputs.pop("offset_mapping")
... answers = examples["answers"]
... start_positions = []
... end_positions = []
... for i, offset in enumerate(offset_mapping):
... answer = answers[i]
... start_char = answer["answer_start"][0]
... end_char = answer["answer_start"][0] + len(answer["text"][0])
... sequence_ids = inputs.sequence_ids(i)
... # Find the start and end of the context
... idx = 0
... while sequence_ids[idx] != 1:
... idx += 1
... context_start = idx
... while sequence_ids[idx] == 1:
... idx += 1
... context_end = idx - 1
... # If the answer is not fully inside the context, label it (0, 0)
... if offset[context_start][0] > end_char or offset[context_end][1] < start_char:
... start_positions.append(0)
... end_positions.append(0)
... else:
... # Otherwise it's the start and end token positions
... idx = context_start
... while idx <= context_end and offset[idx][0] <= start_char:
... idx += 1
... start_positions.append(idx - 1)
... idx = context_end
... while idx >= context_start and offset[idx][1] >= end_char:
... idx -= 1
... end_positions.append(idx + 1)
... inputs["start_positions"] = start_positions
... inputs["end_positions"] = end_positions
... return inputs
```
لتطبيق المعالجة المسبقة على كامل مجموعة البيانات، استخدم [`~datasets.Dataset.map`] من مكتبة 🤗 Datasets. يمكنك تسريع دالة `map` عن طريق تعيين `batched=True` لمعالجة عناصر متعددة من مجموعة البيانات دفعة واحدة. قم بإزالة أي أعمدة لا تحتاجها:
```py
>>> tokenized_squad = squad.map(preprocess_function, batched=True, remove_columns=squad["train"].column_names)
```
الآن قم بإنشاء دفعة من الأمثلة باستخدام [`DefaultDataCollator`]. بخلاف مجمّعات البيانات الأخرى في 🤗 Transformers، لا يطبق [`DefaultDataCollator`] أي معالجة مسبقة إضافية مثل الحشو.
<frameworkcontent>
<pt>
```py
>>> from transformers import DefaultDataCollator
>>> data_collator = DefaultDataCollator()
```
</pt>
<tf>
```py
>>> from transformers import DefaultDataCollator
>>> data_collator = DefaultDataCollator(return_tensors="tf")
```
</tf>
</frameworkcontent>
## التدريب (Train)
<frameworkcontent>
<pt>
<Tip>
إذا لم تكن معتادًا على ضبط نموذج باستخدام [`Trainer`], ألق نظرة على البرنامج التعليمي الأساسي [هنا](../training#train-with-pytorch-trainer)!
</Tip>
أنت جاهز لبدء تدريب نموذجك الآن! قم بتحميل DistilBERT باستخدام [`AutoModelForQuestionAnswering`]:
```py
>>> from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer
>>> model = AutoModelForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
في هذه المرحلة، تبقى ثلاث خطوات فقط:
1. حدد المعاملات الفائقة للتدريب في [`TrainingArguments`]. المعامل الوحيد المطلوب هو `output_dir` الذي يحدد مكان حفظ نموذجك. ستدفع هذا النموذج إلى Hub عن طريق تعيين `push_to_hub=True` (يجب عليك تسجيل الدخول إلى Hugging Face لتحميل نموذجك).
2. مرر معاملات التدريب إلى [`Trainer`] جنبًا إلى جنب مع النموذج، ومجموعة البيانات، والمُحلّل النصي، ومُجمّع البيانات.
3. استدعِ ـ [`~Trainer.train`] لضبط النموذج.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_qa_model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_squad["train"],
... eval_dataset=tokenized_squad["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... )
>>> trainer.train()
```
بمجرد اكتمال التدريب، شارك نموذجك في Hub باستخدام الدالة [`~transformers.Trainer.push_to_hub`] حتى يتمكن الجميع من استخدام نموذجك:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
إذا لم تكن معتادًا على ضبط نموذج باستخدام Keras، فألق نظرة على البرنامج التعليمي الأساسي [هنا](../training#train-a-tensorflow-model-with-keras)!
</Tip>
لضبط نموذج في TensorFlow، ابدأ بإعداد دالة مُحسِّن، وجدول معدل التعلم، وبعض المعاملات الفائقة للتدريب:
```py
>>> from transformers import create_optimizer
>>> batch_size = 16
>>> num_epochs = 2
>>> total_train_steps = (len(tokenized_squad["train"]) // batch_size) * num_epochs
>>> optimizer, schedule = create_optimizer(
... init_lr=2e-5,
... num_warmup_steps=0,
... num_train_steps=total_train_steps,
... )
```
ثم يمكنك تحميل DistilBERT باستخدام [`TFAutoModelForQuestionAnswering`]:
```py
>>> from transformers import TFAutoModelForQuestionAnswering
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
حوّل مجموعات البيانات الخاصة بك إلى تنسيق `tf.data.Dataset` باستخدام [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_squad["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_squad["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
قم بتكوين النموذج للتدريب باستخدام [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer)
```
آخر شيء يجب إعداده قبل بدء التدريب هو توفير طريقة لدفع نموذجك إلى Hub. يمكن القيام بذلك عن طريق تحديد مكان دفع نموذجك ومعالجك المعجمي في [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> callback = PushToHubCallback(
... output_dir="my_awesome_qa_model",
... tokenizer=tokenizer,
... )
```
أخيرًا، أنت جاهز لبدء تدريب نموذجك! اتصل بـ [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) مع مجموعات بيانات التدريب والتحقق من الصحة، وعدد العهود، ومعاودة الاتصال الخاصة بك لضبط النموذج:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=[callback])
```
بمجرد اكتمال التدريب، يتم تحميل نموذجك تلقائيًا إلى Hub حتى يتمكن الجميع من استخدامه!
</tf>
</frameworkcontent>
<Tip>
للحصول على مثال أكثر تعمقًا حول كيفية ضبط نموذج للإجابة على الأسئلة، ألق نظرة على [دفتر ملاحظات PyTorch](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb) المقابل
أو [دفتر ملاحظات TensorFlow](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb).
</Tip>
## التقييم (Evaluate)
يتطلب التقييم للإجابة على الأسئلة قدرًا كبيرًا من المعالجة اللاحقة. لتوفير وقتك، يتخطى هذا الدليل خطوة التقييم. لا يزال [`Trainer`] يحسب خسارة التقييم أثناء التدريب، مما يعني أنك لست تجهل تمامًا أداء نموذجك.
إذا كان لديك المزيد من الوقت وتهتم بكيفية تقييم نموذجك للإجابة على الأسئلة، فألق نظرة على فصل [الإجابة على الأسئلة](https://huggingface.co/course/chapter7/7?fw=pt#post-processing) من دورة 🤗 Hugging Face!
## الاستدلال (Inference)
رائع، الآن بعد أن قمت بضبط نموذج، يمكنك استخدامه للاستدلال!
حدد سؤالًا وسياقًا ليقوم النموذج بالتنبؤ بالإجابة عليه:
```py
>>> question = "How many programming languages does BLOOM support?"
>>> context = "BLOOM has 176 billion parameters and can generate text in 46 languages natural languages and 13 programming languages."
```
أبسط طريقة لتجربة نموذجك المُدرَّب للاستدلال هي استخدامه في [`pipeline`]. قم بإنشاء كائن لـ `pipeline` للإجابة على الأسئلة باستخدام نموذجك، ومرِّر النص إليه:
```py
>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model="my_awesome_qa_model")
>>> question_answerer(question=question, context=context)
{'score': 0.2058267742395401,
'start': 10,
'end': 95,
'answer': '176 مليار معامل ويمكنه إنشاء نصوص بـ 46 لغة طبيعية و 13'}
```
يمكنك أيضًا تكرار نتائج `pipeline` يدويًا إذا أردت:
<frameworkcontent>
<pt>
قسّم النص وأرجع تنسورات PyTorch:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_qa_model")
>>> inputs = tokenizer(question, context, return_tensors="pt")
```
مرر مدخلاتك إلى النموذج وأرجع `logits`:
```py
>>> import torch
>>> from transformers import AutoModelForQuestionAnswering
>>> model = AutoModelForQuestionAnswering.from_pretrained("my_awesome_qa_model")
>>> with torch.no_grad():
... outputs = model(**inputs)
```
احصل على أعلى احتمال من مخرجات النموذج لموضعي البداية والنهاية:
```py
>>> answer_start_index = outputs.start_logits.argmax()
>>> answer_end_index = outputs.end_logits.argmax()
```
استخلاص الإجابة من الرموز المتوقعة:
```py
>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens)
'176 billion parameters and can generate text in 46 languages natural languages and 13'
```
</pt>
<tf>
قم بتحليل النص المعجمي وأعد موترات TensorFlow:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_qa_model")
>>> inputs = tokenizer(question, context, return_tensors="tf")
```
مرر مدخلاتك إلى النموذج وأعد `logits`:
```py
>>> from transformers import TFAutoModelForQuestionAnswering
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("my_awesome_qa_model")
>>> outputs = model(**inputs)
```
احصل على أعلى احتمال من مخرجات النموذج لموضعي البداية والنهاية:
```py
>>> answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
>>> answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])
```
استخلاص الإجابة من الرموز المتوقعة:
```py
>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens)
'176 billion parameters and can generate text in 46 languages natural languages and 13'
```
</tf>
</frameworkcontent>

View File

@ -0,0 +1,387 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# تصنيف النص(Text classification)
[[open-in-colab]]
<Youtube id="leNG9fN9FQU"/>
تصنيف النص هو مهمة NLP شائعة حيث يُعيّن تصنيفًا أو فئة للنص. تستخدم بعض أكبر الشركات تصنيف النصوص في الإنتاج لمجموعة واسعة من التطبيقات العملية. أحد أكثر أشكال تصنيف النص شيوعًا هو تحليل المشاعر، والذي يقوم بتعيين تسمية مثل 🙂 إيجابية، 🙁 سلبية، أو 😐 محايدة لتسلسل نصي.
سيوضح لك هذا الدليل كيفية:
1. ضبط [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) على مجموعة بيانات [IMDb](https://huggingface.co/datasets/imdb) لتحديد ما إذا كانت مراجعة الفيلم إيجابية أو سلبية.
2. استخدام نموذج الضبط الدقيق للتنبؤ.
<Tip>
لرؤية جميع البنى ونقاط التحقق المتوافقة مع هذه المهمة، نوصي بالتحقق من [صفحة المهمة](https://huggingface.co/tasks/text-classification).
</Tip>
قبل أن تبدأ، تأكد من تثبيت جميع المكتبات الضرورية:
```bash
pip install transformers datasets evaluate accelerate
```
نحن نشجعك على تسجيل الدخول إلى حساب Hugging Face الخاص بك حتى تتمكن من تحميل ومشاركة نموذجك مع المجتمع. عند المطالبة، أدخل رمزك لتسجيل الدخول:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## تحميل مجموعة بيانات IMDb
ابدأ بتحميل مجموعة بيانات IMDb من مكتبة 🤗 Datasets:
```py
>>> from datasets import load_dataset
>>> imdb = load_dataset("imdb")
```
ثم ألق نظرة على مثال:
```py
>>> imdb["test"][0]
{
"label": 0,
"text": "I love sci-fi and am willing to put up with a lot. Sci-fi movies/TV are usually underfunded, under-appreciated and misunderstood. I tried to like this, I really did, but it is to good TV sci-fi as Babylon 5 is to Star Trek (the original). Silly prosthetics, cheap cardboard sets, stilted dialogues, CG that doesn't match the background, and painfully one-dimensional characters cannot be overcome with a 'sci-fi' setting. (I'm sure there are those of you out there who think Babylon 5 is good sci-fi TV. It's not. It's clichéd and uninspiring.) While US viewers might like emotion and character development, sci-fi is a genre that does not take itself seriously (cf. Star Trek). It may treat important issues, yet not as a serious philosophy. It's really difficult to care about the characters here as they are not simply foolish, just missing a spark of life. Their actions and reactions are wooden and predictable, often painful to watch. The makers of Earth KNOW it's rubbish as they have to always say \"Gene Roddenberry's Earth...\" otherwise people would not continue watching. Roddenberry's ashes must be turning in their orbit as this dull, cheap, poorly edited (watching it without advert breaks really brings this home) trudging Trabant of a show lumbers into space. Spoiler. So, kill off a main character. And then bring him back as another actor. Jeeez! Dallas all over again.",
}
```
هناك حقولان في هذه المجموعة من البيانات:
- `text`: نص مراجعة الفيلم.
- `label`: قيمة إما `0` لمراجعة سلبية أو `1` لمراجعة إيجابية.
## المعالجة المسبقة(Preprocess)
الخطوة التالية هي تحميل المُجزِّئ النص DistilBERT لتهيئة لحقل `text`:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
أنشئ دالة لتهيئة حقل `text` وتقصير السلاسل النصية بحيث لا يتجاوز طولها الحد الأقصى لإدخالات DistilBERT:
```py
>>> def preprocess_function(examples):
... return tokenizer(examples["text"], truncation=True)
```
لتطبيق دالة التهيئة على مجموعة البيانات بأكملها، استخدم دالة 🤗 Datasets [`~datasets.Dataset.map`] . يمكنك تسريع `map` باستخدام `batched=True` لمعالجة دفعات من البيانات:
```py
tokenized_imdb = imdb.map(preprocess_function, batched=True)
```
الآن قم بإنشاء دفعة من الأمثلة باستخدام [`DataCollatorWithPadding`]. الأكثر كفاءة هو استخدام الحشو الديناميكي لجعل الجمل متساوية في الطول داخل كل دفعة، بدلًا من حشو كامل البيانات إلى الحد الأقصى للطول.
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorWithPadding
>>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorWithPadding
>>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer, return_tensors="tf")
```
</tf>
</frameworkcontent>
## التقييم(Evaluate)
يُعدّ تضمين مقياس أثناء التدريب مفيدًا لتقييم أداء النموذج. يمكنك تحميل طريقة تقييم بسرعة باستخدام مكتبة 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) . بالنسبة لهذه المهمة، قم بتحميل مقياس [الدقة](https://huggingface.co/spaces/evaluate-metric/accuracy) (راجع جولة 🤗 Evaluate [السريعة](https://huggingface.co/docs/evaluate/a_quick_tour) لمعرفة المزيد حول كيفية تحميل وحساب مقياس):
```py
>>> import evaluate
>>> accuracy = evaluate.load("accuracy")
```
ثم أنشئ دالة تقوم بتمرير تنبؤاتك وتصنيفاتك إلى [`~evaluate.EvaluationModule.compute`] لحساب الدقة:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions, labels = eval_pred
... predictions = np.argmax(predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=labels)
```
دالة `compute_metrics` جاهزة الآن، وستعود إليها عند إعداد التدريب.
## التدريب(Train)
قبل أن تبدأ في تدريب نموذجك، قم بإنشاء خريطة من المعرفات المتوقعة إلى تسمياتها باستخدام `id2label` و `label2id`:
```py
>>> id2label = {0: "NEGATIVE", 1: "POSITIVE"}
>>> label2id = {"NEGATIVE": 0, "POSITIVE": 1}
```
<frameworkcontent>
<pt>
<Tip>
إذا لم تكن على دراية بضبط نموذج دقيق باستخدام [`Trainer`], فالق نظرة على البرنامج التعليمي الأساسي [هنا](../training#train-with-pytorch-trainer)!
</Tip>
أنت مستعد الآن لبدء تدريب نموذجك! قم بتحميل DistilBERT مع [`AutoModelForSequenceClassification`] جنبًا إلى جنب مع عدد التصنيفات المتوقعة، وتصنيفات الخرائط:
```py
>>> from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer
>>> model = AutoModelForSequenceClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
... )
```
في هذه المرحلة، هناك ثلاث خطوات فقط متبقية:
1. حدد مُعامِلات التدريب في [`TrainingArguments`]. المُعامل المطلوب الوحيد هو `output_dir`، لتحديد مكان حفظ النموذج. يمكنك رفع النموذج إلى Hub بتعيين `push_to_hub=True` (يجب تسجيل الدخول إلى Hugging Face لرفع النموذج). سيقوم `Trainer` بتقييم الدقة وحفظ نقاط التحقق في نهاية كل حقبة.
2. مرر مُعامِلات التدريب إلى `Trainer` مع النموذج، ومجموعة البيانات، والمحلل اللغوي، ومُجمِّع البيانات، ووظيفة `compute_metrics`.
3. استدعِ [`~Trainer.train`] لضبط النموذج.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_model",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=2,
... weight_decay=0.01,
... eval_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_imdb["train"],
... eval_dataset=tokenized_imdb["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
<Tip>
يستخدم [`Trainer`] الحشو الديناميكي افتراضيًا عند تمرير `tokenizer` إليه. في هذه الحالة، لا تحتاج لتحديد مُجمِّع البيانات صراحةً.
</Tip>
بعد اكتمال التدريب، شارك نموذجك على Hub باستخدام الطريقة [`~transformers.Trainer.push_to_hub`] ليستخدمه الجميع:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
إذا لم تكن على دراية بضبط نموذج باستخدام Keras، قم بالاطلاع على البرنامج التعليمي الأساسي [هنا](../training#train-a-tensorflow-model-with-keras)!
</Tip>
لضبط نموذج في TensorFlow، ابدأ بإعداد دالة المحسن، وجدول معدل التعلم، وبعض معلمات التدريب:
```py
>>> from transformers import create_optimizer
>>> import tensorflow as tf
>>> batch_size = 16
>>> num_epochs = 5
>>> batches_per_epoch = len(tokenized_imdb["train"]) // batch_size
>>> total_train_steps = int(batches_per_epoch * num_epochs)
>>> optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
```
ثم يمكنك تحميل DistilBERT مع [`TFAutoModelForSequenceClassification`] بالإضافة إلى عدد التصنيفات المتوقعة، وتعيينات التسميات:
```py
>>> from transformers import TFAutoModelForSequenceClassification
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
... )
```
قم بتحويل مجموعات بياناتك إلى تنسيق `tf.data.Dataset` باستخدام [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_imdb["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_imdb["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
قم بتهيئة النموذج للتدريب باستخدام [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). لاحظ أن جميع نماذج Transformers لديها دالة خسارة ذات صلة بالمهمة بشكل افتراضي، لذلك لا تحتاج إلى تحديد واحدة ما لم ترغب في ذلك:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # No loss argument!
```
آخر أمرين يجب إعدادهما قبل بدء التدريب هو حساب الدقة من التوقعات، وتوفير طريقة لدفع نموذجك إلى Hub. يتم ذلك باستخدام [Keras callbacks](../main_classes/keras_callbacks).
قم بتمرير دالة `compute_metrics` الخاصة بك إلى [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
حدد مكان دفع نموذجك والمجزئ اللغوي في [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_model",
... tokenizer=tokenizer,
... )
```
ثم اجمع الاستدعاءات معًا:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
أخيرًا، أنت مستعد لبدء تدريب نموذجك! قم باستدعاء [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) مع مجموعات بيانات التدريب والتحقق، وعدد الحقبات، واستدعاءاتك لضبط النموذج:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=callbacks)
```
بمجرد اكتمال التدريب، يتم تحميل نموذجك تلقائيًا إلى Hub حتى يتمكن الجميع من استخدامه!
</tf>
</frameworkcontent>
<Tip>
للحصول على مثال أكثر عمقًا حول كيفية ضبط نموذج لتصنيف النصوص، قم بالاطلاع على الدفتر المقابل
[دفتر PyTorch](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)
أو [دفتر TensorFlow](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb).
</Tip>
## الاستدلال(Inference)
رائع، الآن بعد أن قمت بضبط نموذج، يمكنك استخدامه للاستدلال!
احصل على بعض النصوص التي ترغب في إجراء الاستدلال عليها:
```py
>>> text = "This was a masterpiece. Not completely faithful to the books, but enthralling from beginning to end. Might be my favorite of the three."
```
أسهل طريقة لتجربة النموذج المضبوط للاستدلال هي استخدامه ضمن [`pipeline`]. قم بإنشاء `pipeline` لتحليل المشاعر مع نموذجك، ومرر نصك إليه:
```py
>>> from transformers import pipeline
>>> classifier = pipeline("sentiment-analysis", model="stevhliu/my_awesome_model")
>>> classifier(text)
[{'label': 'POSITIVE', 'score': 0.9994940757751465}]
```
يمكنك أيضًا تكرار نتائج `pipeline` يدويًا إذا أردت:
<frameworkcontent>
<pt>
قم يتجزئة النص وإرجاع تنسورات PyTorch:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_model")
>>> inputs = tokenizer(text, return_tensors="pt")
```
مرر المدخلات إلى النموذج واسترجع `logits`:
```py
>>> from transformers import AutoModelForSequenceClassification
>>> model = AutoModelForSequenceClassification.from_pretrained("stevhliu/my_awesome_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
استخرج الفئة ذات الاحتمالية الأعلى، واستخدم `id2label` لتحويلها إلى تصنيف نصي:
```py
>>> predicted_class_id = logits.argmax().item()
>>> model.config.id2label[predicted_class_id]
'POSITIVE'
```
</pt>
<tf>
قم بتحليل النص وإرجاع تنسيقات TensorFlow:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_model")
>>> inputs = tokenizer(text, return_tensors="tf")
```
قم بتمرير مدخلاتك إلى النموذج وإرجاع `logits`:
```py
>>> from transformers import TFAutoModelForSequenceClassification
>>> model = TFAutoModelForSequenceClassification.from_pretrained("stevhliu/my_awesome_model")
>>> logits = model(**inputs).logits
```
استخرج الفئة ذات الاحتمالية الأعلى، واستخدم `id2label` لتحويلها إلى تصنيف نصي:
```py
>>> predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])
>>> model.config.id2label[predicted_class_id]
'POSITIVE'
```
</tf>
</frameworkcontent>

File diff suppressed because one or more lines are too long

View File

@ -0,0 +1,550 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# تصنيف الرموز(Token classification)
[[open-in-colab]]
<Youtube id="wVHdVlPScxA"/>
يهدف تصنيف الرموز إلى إعطاء تسمية لكل رمز على حدة في الجملة. من أكثر مهام تصنيف الرموز شيوعًا هو التعرف على الكيانات المسماة (NER). يحاول NER تحديد تسمية لكل كيان في الجملة، مثل شخص، أو مكان، أو منظمة.
سيوضح لك هذا الدليل كيفية:
1. ضبط [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) على مجموعة بيانات [WNUT 17](https://huggingface.co/datasets/wnut_17) للكشف عن كيانات جديدة.
2. استخدام نموذجك المضبوط بدقة للاستدلال.
<Tip>
للاطلاع جميع البنى والنقاط المتوافقة مع هذه المهمة، نوصي بالرجوع من [صفحة المهمة](https://huggingface.co/tasks/token-classification).
</Tip>
قبل أن تبدأ، تأكد من تثبيت جميع المكتبات الضرورية:
```bash
pip install transformers datasets evaluate seqeval
```
نحن نشجعك على تسجيل الدخول إلى حساب HuggingFace الخاص بك حتى تتمكن من تحميل ومشاركة نموذجك مع المجتمع. عندما يُطلب منك، أدخل رمزك لتسجيل الدخول:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## تحميل مجموعة بيانات WNUT 17
ابدأ بتحميل مجموعة بيانات WNUT 17 من مكتبة 🤗 Datasets:
```py
>>> from datasets import load_dataset
>>> wnut = load_dataset("wnut_17")
```
ثم ألق نظرة على مثال:
```py
>>> wnut["train"][0]
{'id': '0',
'ner_tags': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 8, 8, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0],
'tokens': ['@paulwalk', 'It', "'s", 'the', 'view', 'from', 'where', 'I', "'m", 'living', 'for', 'two', 'weeks', '.', 'Empire', 'State', 'Building', '=', 'ESB', '.', 'Pretty', 'bad', 'storm', 'here', 'last', 'evening', '.']
}
```
يمثل كل رقم في `ner_tags` كياناً. حوّل الأرقام إلى أسماء التصنيفات لمعرفة ماهية الكيانات:
```py
>>> label_list = wnut["train"].features[f"ner_tags"].feature.names
>>> label_list
[
"O",
"B-corporation",
"I-corporation",
"B-creative-work",
"I-creative-work",
"B-group",
"I-group",
"B-location",
"I-location",
"B-person",
"I-person",
"B-product",
"I-product",
]
```
يشير الحرف الذي يسبق كل `ner_tag` إلى موضع الرمز للكيان:
- `B-` يشير إلى بداية الكيان.
- `I-` يشير إلى أن الرمز يقع ضمن نفس الكيان (على سبيل المثال، الرمز `State` هو جزء من كيان مثل `Empire State Building`).
- `0` يشير إلى أن الرمز لا يمثل أي كيان.
## المعالجة المسبقة(Preprocess)
<Youtube id="iY2AZYdZAr0"/>
الخطوة التالية هي تحميل مُجزِّئ النصوص DistilBERT للمعالجة المسبقة لحقل `tokens`:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
كما رأيت في حقل `tokens` المثال أعلاه، يبدو أن المدخل قد تم تحليله بالفعل. لكن المدخل لم يُجزأ بعد ويتعيّن عليك ضبط `is_split_into_words=True` لتقسيم الكلمات إلى كلمات فرعية. على سبيل المثال:
```py
>>> example = wnut["train"][0]
>>> tokenized_input = tokenizer(example["tokens"], is_split_into_words=True)
>>> tokens = tokenizer.convert_ids_to_tokens(tokenized_input["input_ids"])
>>> tokens
['[CLS]', '@', 'paul', '##walk', 'it', "'", 's', 'the', 'view', 'from', 'where', 'i', "'", 'm', 'living', 'for', 'two', 'weeks', '.', 'empire', 'state', 'building', '=', 'es', '##b', '.', 'pretty', 'bad', 'storm', 'here', 'last', 'evening', '.', '[SEP]']
```
ومع ذلك، يضيف هذا بعض الرموز الخاصة `[CLS]` و`[SEP]` وتقسيم الكلمات إلى أجزاء يُنشئ عدم تطابق بين المُدخلات والتسميات. قد يتم تقسيم كلمة واحدة تقابل تسمية واحدة الآن إلى كلمتين فرعيتين. ستحتاج إلى إعادة محاذاة الرموز والتسميات عن طريق:
1. ربط كل رمز بالكلمة الأصلية باستخدام الخاصية [`word_ids`](https://huggingface.co/docs/transformers/main_classes/tokenizer#transformers.BatchEncoding.word_ids).
2. تعيين التسمية `-100` للرموز الخاصة `[CLS]` و`[SEP]` بحيث يتم تجاهلها بواسطة دالة الخسارة PyTorch (انظر [CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html)).
3. تسمية الرمز الأول فقط لكلمة معينة. قم بتعيين `-100` لأجزاء الكلمة الأخرى.
هنا كيف يمكنك إنشاء وظيفة لإعادة محاذاة الرموز والتسميات، وقص الجمل لتتجاوز الحد الأقصى لطول مُدخلات DistilBERT:
```py
>>> def tokenize_and_align_labels(examples):
... tokenized_inputs = tokenizer(examples["tokens"], truncation=True, is_split_into_words=True)
... labels = []
... for i, label in enumerate(examples[f"ner_tags"]):
... word_ids = tokenized_inputs.word_ids(batch_index=i) # تعيين الرموز إلى كلماتهم المقابلة.
... previous_word_idx = None
... label_ids = []
... for word_idx in word_ids: # تعيين الرموز الخاصة إلى -100.
... if word_idx is None:
... label_ids.append(-100)
... elif word_idx != previous_word_idx: # تسمية الرمز الأول فقط لكلمة معينة.
... label_ids.append(label[word_idx])
... else:
... label_ids.append(-100)
... previous_word_idx = word_idx
... labels.append(label_ids)
... tokenized_inputs["labels"] = labels
... return tokenized_inputs
```
لتطبيق هذه العملية على كامل مجموعة البيانات، استخدم الدالة [`~datasets.Dataset.map`] لمجموعة بيانات 🤗. يمكنك تسريع الدالة `map` عن طريق تعيين `batched=True` لمعالجة عناصر متعددة من مجموعة البيانات في وقت واحد:
```py
>>> tokenized_wnut = wnut.map(tokenize_and_align_labels, batched=True)
```
الآن قم بإنشاء دفعة من الأمثلة باستخدام [`DataCollatorWithPadding`].من الأفضل استخدام *الحشو الديناميكي* للجمل إلى أطول طول في دفعة أثناء التجميع، بدلاً من حشو مجموعة البيانات بالكامل إلى الطول الأقصى.
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorForTokenClassification
>>> data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorForTokenClassification
>>> data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer, return_tensors="tf")
```
</tf>
</frameworkcontent>
## التقييم(Evaluate)
يُعدّ تضمين مقياس أثناء التدريب مفيدًا في تقييم أداء نموذجك. يمكنك تحميل طريقة تقييم بسرعة مع مكتبة 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index). لهذه المهمة، قم بتحميل إطار [seqeval](https://huggingface.co/spaces/evaluate-metric/seqeval) (انظر جولة 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) لمعرفة المزيد حول كيفية تحميل وحساب مقياس). يُخرج seqeval عدة نتائج: الدقة، والاستذكار، ومقياس F1، والدقة.
```py
>>> import evaluate
>>> seqeval = evaluate.load("seqeval")
```
احصل على تسميات الكيانات المسماة (NER) أولاً،ثم أنشئ دالة تُمرر تنبؤاتك وتسمياتك الصحيحة إلى [`~evaluate.EvaluationModule.compute`] لحساب النتائج:
```py
>>> import numpy as np
>>> labels = [label_list[i] for i in example[f"ner_tags"]]
>>> def compute_metrics(p):
... predictions, labels = p
... predictions = np.argmax(predictions, axis=2)
... true_predictions = [
... [label_list[p] for (p, l) in zip(prediction, label) if l != -100]
... for prediction, label in zip(predictions, labels)
... ]
... true_labels = [
... [label_list[l] for (p, l) in zip(prediction, label) if l != -100]
... for prediction, label in zip(predictions, labels)
... ]
... results = seqeval.compute(predictions=true_predictions, references=true_labels)
... return {
... "precision": results["overall_precision"],
... "recall": results["overall_recall"],
... "f1": results["overall_f1"],
... "accuracy": results["overall_accuracy"],
... }
```
دالة `compute_metrics` جاهزة للاستخدام، وستحتاج إليها عند إعداد التدريب.
## التدريب(Train)
قبل تدريب النموذج، جهّز خريطة تربط بين المعرّفات المتوقعة وتسمياتها باستخدام `id2label` و `label2id`:
```py
>>> id2label = {
... 0: "O",
... 1: "B-corporation",
... 2: "I-corporation",
... 3: "B-creative-work",
... 4: "I-creative-work",
... 5: "B-group",
... 6: "I-group",
... 7: "B-location",
... 8: "I-location",
... 9: "B-person",
... 10: "I-person",
... 11: "B-product",
... 12: "I-product",
... }
>>> label2id = {
... "O": 0,
... "B-corporation": 1,
... "I-corporation": 2,
... "B-creative-work": 3,
... "I-creative-work": 4,
... "B-group": 5,
... "I-group": 6,
... "B-location": 7,
... "I-location": 8,
... "B-person": 9,
... "I-person": 10,
... "B-product": 11,
... "I-product": 12,
... }
```
<frameworkcontent>
<pt>
<Tip>
إذا لم تكن على دراية بتعديل نموذج باستخدام [`Trainer`], ألق نظرة على الدليل التعليمي الأساسي [هنا](../training#train-with-pytorch-trainer)!
</Tip>
أنت مستعد الآن لبدء تدريب نموذجك! قم بتحميل DistilBERT مع [`AutoModelForTokenClassification`] إلى جانب عدد التصنيفات المتوقعة، وخريطة التسميات:
```py
>>> from transformers import AutoModelForTokenClassification, TrainingArguments, Trainer
>>> model = AutoModelForTokenClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=13, id2label=id2label, label2id=label2id
... )
```
في هذه المرحلة، هناك ثلاث خطوات فقط متبقية:
1. حدد معلمات التدريب الخاصة بك في [`TrainingArguments`]. المعامل الوحيد المطلوب هو `output_dir` الذي يحدد مكان حفظ نموذجك. ستقوم بدفع هذا النموذج إلى Hub عن طريق تعيين `push_to_hub=True` (يجب أن تكون مسجلاً الدخول إلى Hugging Face لتحميل نموذجك). في نهاية كل حقبة، سيقوم [`Trainer`] بتقييم درجات seqeval وحفظ تسخة التدريب.
2. قم بتمرير معاملات التدريب إلى [`Trainer`] إلى جانب النموذج، ومجموعة البيانات، والمُجزِّئ اللغوي، و`data collator`، ودالة `compute_metrics`.
3.استدعِ [`~Trainer.train`] لتدريب نموذجك.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_wnut_model",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=2,
... weight_decay=0.01,
... eval_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_wnut["train"],
... eval_dataset=tokenized_wnut["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
بمجرد اكتمال التدريب، شارك نموذجك على Hub باستخدام طريقة [`~transformers.Trainer.push_to_hub`] حتى يتمكن الجميع من استخدام نموذجك:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
إذا لم تكن على دراية بتعديل نموذج باستخدام Keras، ألق نظرة على الدليل التعليمي الأساسي [هنا](../training#train-a-tensorflow-model-with-keras)!
</Tip>
للتعديل على نموذج في TensorFlow، ابدأ بإعداد دالة محسن، وجدول معدل التعلم، وبعض معلمات التدريب:
```py
>>> from transformers import create_optimizer
>>> batch_size = 16
>>> num_train_epochs = 3
>>> num_train_steps = (len(tokenized_wnut["train"]) // batch_size) * num_train_epochs
>>> optimizer, lr_schedule = create_optimizer(
... init_lr=2e-5,
... num_train_steps=num_train_steps,
... weight_decay_rate=0.01,
... num_warmup_steps=0,
... )
```
ثم يمكنك تحميل DistilBERT مع [`TFAutoModelForTokenClassification`] إلى جانب عدد التسميات المتوقعة، وتخطيطات التسميات:
```py
>>> from transformers import TFAutoModelForTokenClassification
>>> model = TFAutoModelForTokenClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=13, id2label=id2label, label2id=label2id
... )
```
قم بتحويل مجموعات بياناتك إلى تنسيق `tf.data.Dataset` مع [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_wnut["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_wnut["validation"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
هيّئ النموذج للتدريب باستخدام [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). لاحظ أن نماذج Transformers تتضمن دالة خسارة افتراضية مرتبطة بالمهمة، لذلك لا تحتاج إلى تحديد واحدة إلا إذا كنت ترغب في ذلك:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # No loss argument!
```
آخر أمرين يجب إعدادهما قبل بدء التدريب هو حساب درجات seqeval من التنبؤات، وتوفير طريقة لدفع نموذجك إلى Hub. يتم ذلك باستخدام [Keras callbacks](../main_classes/keras_callbacks).
مرر دالة `compute_metrics` الخاصة بك إلى [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
حدد مكان دفع نموذجك والمحلل اللغوي في [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_wnut_model",
... tokenizer=tokenizer,
... )
```
ثم جمّع callbacks الخاصة بك معًا:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
أخيرًا، أنت جاهز الآن لبدء تدريب نموذجك! قم باستدعاء [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) مع بيانات التدريب والتحقق، وعدد الحقبات، وcallbacks لتعديل النموذج:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=callbacks)
```
بمجرد اكتمال التدريب، يتم تحميل نموذجك تلقائيًا إلى Hub حتى يتمكن الجميع من استخدامه!
</tf>
</frameworkcontent>
<Tip>
للحصول على مثال أكثر تفصيلاً حول كيفية تعديل نموذج لتصنيف الرموز، ألق نظرة على الدفتر المقابل
[دفتر PyTorch](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb)
أو [دفتر TensorFlow](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb).
</Tip>
## الاستدلال(Inference)
رائع، الآن بعد أن قمت بتعديل نموذج، يمكنك استخدامه للاستدلال!
احصل على بعض النصوص التي تريد تشغيل الاستدلال عليها:
```py
>>> text = "The Golden State Warriors are an American professional basketball team based in San Francisco."
```
أبسط طريقة لتجربة نموذجك المُدرب مسبقًا للاستدلال هي استخدامه في [`pipeline`]. قم بتنفيذ `pipeline` لتصنيف الكيانات المسماة مع نموذجك، ومرر نصك إليه:
```py
>>> from transformers import pipeline
>>> classifier = pipeline("ner", model="stevhliu/my_awesome_wnut_model")
>>> classifier(text)
[{'entity': 'B-location',
'score': 0.42658573,
'index': 2,
'word': 'golden',
'start': 4,
'end': 10},
{'entity': 'I-location',
'score': 0.35856336,
'index': 3,
'word': 'state',
'start': 11,
'end': 16},
{'entity': 'B-group',
'score': 0.3064001,
'index': 4,
'word': 'warriors',
'start': 17,
'end': 25},
{'entity': 'B-location',
'score': 0.65523505,
'index': 13,
'word': 'san',
'start': 80,
'end': 83},
{'entity': 'B-location',
'score': 0.4668663,
'index': 14,
'word': 'francisco',
'start': 84,
'end': 93}]
```
يمكنك أيضًا تكرار نتائج `pipeline` يدويًا إذا أردت:
<frameworkcontent>
<pt>
قسّم النص إلى رموز وأرجع المُوتّرات بلغة PyTorch:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> inputs = tokenizer(text, return_tensors="pt")
```
مرر مدخلاتك إلى النموذج واحصل على `logits`:
```py
>>> from transformers import AutoModelForTokenClassification
>>> model = AutoModelForTokenClassification.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
استخرج الفئة ذات الاحتمالية الأعلى، واستخدم جدول `id2label` الخاصة بالنموذج لتحويلها إلى تسمية نصية:
```py
>>> predictions = torch.argmax(logits, dim=2)
>>> predicted_token_class = [model.config.id2label[t.item()] for t in predictions[0]]
>>> predicted_token_class
['O',
'O',
'B-location',
'I-location',
'B-group',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'B-location',
'B-location',
'O',
'O']
```
</pt>
<tf>
قسّم النص إلى رموز وأرجع المُوتّرات ب TensorFlow:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> inputs = tokenizer(text, return_tensors="tf")
```
مرر مدخلاتك إلى النموذج واحصل على `logits`:
```py
>>> from transformers import TFAutoModelForTokenClassification
>>> model = TFAutoModelForTokenClassification.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> logits = model(**inputs).logits
```
استخرج الفئة ذات الاحتمالية الأعلى، واستخدم جدول `id2label` الخاصة بالنموذج لتحويلها إلى تسمية نصية:
```py
>>> predicted_token_class_ids = tf.math.argmax(logits, axis=-1)
>>> predicted_token_class = [model.config.id2label[t] for t in predicted_token_class_ids[0].numpy().tolist()]
>>> predicted_token_class
['O',
'O',
'B-location',
'I-location',
'B-group',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'B-location',
'B-location',
'O',
'O']
```
</tf>
</frameworkcontent>

View File

@ -0,0 +1,407 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# الترجمة(Translation)
[[open-in-colab]]
<Youtube id="1JvfrvZgi6c"/>
الترجمة هي عملية تحويل سلسلة نصية من لغة إلى أخرى. وهي إحدى المهام التي يمكن صياغتها كمسألة تسلسل إلى تسلسل، وهو إطار عمل قوي لإنتاج مخرجات من مدخلات، مثل الترجمة أو التلخيص. تُستخدم أنظمة الترجمة عادةً للترجمة بين نصوص لغات مختلفة، ويمكن استخدامها أيضًا لترجمة الكلام أو لمهام تجمع بين النصوص والكلام، مثل تحويل النص إلى كلام أو تحويل الكلام إلى نص.
سيوضح لك هذا الدليل كيفية:
1. ضبط دقيق لنموذج [T5](https://huggingface.co/google-t5/t5-small) على المجموعة الفرعية الإنجليزية-الفرنسية من مجموعة بيانات [OPUS Books](https://huggingface.co/datasets/opus_books) لترجمة النص الإنجليزي إلى الفرنسية.
2. استخدام النموذج المضبوط بدقة للاستدلال.
<Tip>
لمشاهدة جميع البنى والنسخ المتوافقة مع هذه المهمة، نوصي بالتحقق من [صفحة المهمة](https://huggingface.co/tasks/translation).
</Tip>
قبل البدء، تأكد من تثبيت جميع المكتبات الضرورية:
```bash
pip install transformers datasets evaluate sacrebleu
```
نشجعك على تسجيل الدخول إلى حساب Hugging Face الخاص بك حتى تتمكن من تحميل نموذجك ومشاركته مع المجتمع. عند الطلب، أدخل الرمز المميز الخاص بك لتسجيل الدخول:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## تحميل مجموعة بيانات OPUS Books
ابدأ بتحميل المجموعة الفرعية الإنجليزية-الفرنسية من مجموعة بيانات [OPUS Books](https://huggingface.co/datasets/opus_books) من مكتبة 🤗 Datasets:
```py
>>> from datasets import load_dataset
>>> books = load_dataset("opus_books", "en-fr")
```
قسّم مجموعة البيانات إلى مجموعة تدريب ومجموعة اختبار باستخدام طريقة [`~datasets.Dataset.train_test_split`]:
```py
>>> books = books["train"].train_test_split(test_size=0.2)
```
ثم ألقِ نظرة على مثال:
```py
>>> books["train"][0]
{'id': '90560',
'translation': {'en': 'But this lofty plateau measured only a few fathoms, and soon we reentered Our Element.',
'fr': 'Mais ce plateau élevé ne mesurait que quelques toises, et bientôt nous fûmes rentrés dans notre élément.'}}
```
`translation`: ترجمة إنجليزية وفرنسية للنص.
## المعالجة المسبقة(Preprocess)
<Youtube id="XAR8jnZZuUs"/>
الخطوة التالية هي تحميل مُجزئ T5 لمعالجة أزواج اللغة الإنجليزية-الفرنسية:
```py
>>> from transformers import AutoTokenizer
>>> checkpoint = "google-t5/t5-small"
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
```
يجب أن تقوم دالة المعالجة المسبقة التي تُريد إنشاءها بما يلي:
1. إضافة بادئة إلى المُدخل بمُوجه حتى يعرف T5 أن هذه مهمة ترجمة. تتطلب بعض النماذج القادرة على أداء مهام متعددة توجيهًا لمهام مُحددة.
2. تعيين اللغة الهدف (الفرنسية) في معامل `text_target` لضمان معالجة المُجزئ للنص بشكل صحيح. إذا لم تُعيّن `text_target`، فسيُعالج المُجزئ النص على أنه إنجليزي.
3. اقتطاع التسلسلات بحيث لا يزيد طولها عن الحد الأقصى الذي يحدده معامل `max_length`.
```py
>>> source_lang = "en"
>>> target_lang = "fr"
>>> prefix = "translate English to French: "
>>> def preprocess_function(examples):
... inputs = [prefix + example[source_lang] for example in examples["translation"]]
... targets = [example[target_lang] for example in examples["translation"]]
... model_inputs = tokenizer(inputs, text_target=targets, max_length=128, truncation=True)
... return model_inputs
```
لتطبيق دالة المعالجة المسبقة على مجموعة البيانات بأكملها، استخدم طريقة [`~datasets.Dataset.map`] من 🤗 Datasets. يمكنك تسريع دالة `map` عن طريق تعيين `batched=True` لمعالجة عناصر متعددة من مجموعة البيانات في وقت واحد:
```py
>>> tokenized_books = books.map(preprocess_function, batched=True)
```
الآن أنشئ دفعة من الأمثلة باستخدام [`DataCollatorForSeq2Seq`]. من الأكثر كفاءة *الحشو الديناميكي* للجمل إلى أطول طول في دفعة أثناء التجميع، بدلاً من حشو مجموعة البيانات بأكملها إلى الحد الأقصى للطول.
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint, return_tensors="tf")
```
</tf>
</frameworkcontent>
## التقييم (Evaluate)
غالباً ما يكون تضمين مقياس أثناء التدريب مفيداً لتقييم أداء نموذجك. يمكنك تحميل طريقة تقييم بسرعة باستخدام مكتبة 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index). لهذه المهمة، حمّل مقياس [SacreBLEU](https://huggingface.co/spaces/evaluate-metric/sacrebleu) (راجع [الجولة السريعة](https://huggingface.co/docs/evaluate/a_quick_tour) لـ 🤗 Evaluate لمعرفة المزيد حول كيفية تحميل وحساب مقياس):
```py
>>> import evaluate
>>> metric = evaluate.load("sacrebleu")
```
ثم أنشئ دالة تُمرر تنبؤاتك وتسمياتك إلى [`~evaluate.EvaluationModule.compute`] لحساب درجة SacreBLEU:
```py
>>> import numpy as np
>>> def postprocess_text(preds, labels):
... preds = [pred.strip() for pred in preds]
... labels = [[label.strip()] for label in labels]
... return preds, labels
>>> def compute_metrics(eval_preds):
... preds, labels = eval_preds
... if isinstance(preds, tuple):
... preds = preds[0]
... decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
... labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
... decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
... decoded_preds, decoded_labels = postprocess_text(decoded_preds, decoded_labels)
... result = metric.compute(predictions=decoded_preds, references=decoded_labels)
... result = {"bleu": result["score"]}
... prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
... result["gen_len"] = np.mean(prediction_lens)
... result = {k: round(v, 4) for k, v in result.items()}
... return result
```
دالة `compute_metrics` الخاصة بك جاهزة الآن، وسوف تعود إليها عند إعداد التدريب.
## التدريب (Train)
<frameworkcontent>
<pt>
<Tip>
إذا لم تكن معتادًا على ضبط دقيق نموذج باستخدام [`Trainer`], فألقِ نظرة على البرنامج التعليمي الأساسي [هنا](../training#train-with-pytorch-trainer)!
</Tip>
أنت جاهز لبدء تدريب نموذجك الآن! حمّل T5 باستخدام [`AutoModelForSeq2SeqLM`]:
```py
>>> from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer
>>> model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```
في هذه المرحلة، تبقى ثلاث خطوات فقط:
1. حدد مُعاملات للتدريب في [`Seq2SeqTrainingArguments`]. المُعامل الوحيدة المطلوبة هي `output_dir` التي تحدد مكان حفظ النموذج الخاص بك. ستقوم بدفع هذا النموذج إلى Hub عن طريق تعيين `push_to_hub=True` (يجب عليك تسجيل الدخول إلى Hugging Face لتحميل نموذجك). في نهاية كل حقبة، سيقوم [`Trainer`] بتقييم مقياس SacreBLEU وحفظ نقطة تدقيق التدريب.
2. مرر مُعاملات التدريب إلى [`Seq2SeqTrainer`] جنبًا إلى جنب مع النموذج ومجموعة البيانات والمعالج اللغوي وجامع البيانات ووظيفة `compute_metrics`.
3. نفّذ [`~Trainer.train`] لضبط نموذجك.
```py
>>> training_args = Seq2SeqTrainingArguments(
... output_dir="my_awesome_opus_books_model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... weight_decay=0.01,
... save_total_limit=3,
... num_train_epochs=2,
... predict_with_generate=True,
... fp16=True, #change to bf16=True for XPU
... push_to_hub=True,
... )
>>> trainer = Seq2SeqTrainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_books["train"],
... eval_dataset=tokenized_books["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
بمجرد اكتمال التدريب، شارك نموذجك مع Hub باستخدام طريقة [`~transformers.Trainer.push_to_hub`] حتى يتمكن الجميع من استخدام نموذجك:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
إذا لم تكن معتادًا على ضبط نموذج باستخدام Keras، فألق نظرة على البرنامج التعليمي الأساسي [هنا](../training#train-a-tensorflow-model-with-keras)!
</Tip>
لضبط نموذج في TensorFlow، ابدأ بإعداد دالة مُحسِّن وجدول معدل تعلم وبعض المعلمات الفائقة للتدريب:
```py
>>> from transformers import AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
ثم يمكنك تحميل T5 باستخدام [`TFAutoModelForSeq2SeqLM`]:
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```
حوّل مجموعات البيانات الخاصة بك إلى تنسيق `tf.data.Dataset` باستخدام [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_books["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_test_set = model.prepare_tf_dataset(
... tokenized_books["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
قم بتكوين النموذج للتدريب باستخدام [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). لاحظ أن جميع نماذج Transformers تحتوي على دالة خسارة ذات صلة بالمهمة بشكل افتراضي، لذلك لا تحتاج إلى تحديد واحدة إلا إذا كنت ترغب في ذلك:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # No loss argument!
```
آخر شيئين يجب إعدادهما قبل بدء التدريب هما حساب مقياس SacreBLEU من التوقعات، وتوفير طريقة لدفع نموذجك إلى Hub. يتم كلاهما باستخدام [استدعاءات Keras](../main_classes/keras_callbacks).
مرر دالة `compute_metrics` الخاصة بك إلى [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_test_set)
```
حدد مكان دفع نموذجك ومعالجك اللغوي في [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_opus_books_model",
... tokenizer=tokenizer,
... )
```
ثم اجمع استدعاءاتك معًا:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
أخيرًا، أنت جاهز لبدء تدريب نموذجك! اتصل بـ [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) مع مجموعات بيانات التدريب والتحقق من الصحة وعدد الحقب واستدعاءاتك لضبط النموذج:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=callbacks)
```
بمجرد اكتمال التدريب، يتم تحميل نموذجك تلقائيًا إلى Hub حتى يتمكن الجميع من استخدامه!
</tf>
</frameworkcontent>
<Tip>
للحصول على مثال أكثر تعمقًا لكيفية ضبط نموذج للترجمة، ألق نظرة على [دفتر ملاحظات PyTorch](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation.ipynb) المقابل
أو [دفتر ملاحظات TensorFlow](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb).
</Tip>
## الاستدلال (Inference)
رائع، الآن بعد أن قمت بضبط نموذج، يمكنك استخدامه للاستدلال!
أحضر بعض النصوص التي ترغب في ترجمتها إلى لغة أخرى. بالنسبة لـ T5، تحتاج إلى إضافة بادئة إلى مدخلاتك اعتمادًا على المهمة التي تعمل عليها. للترجمة من الإنجليزية إلى الفرنسية، يجب عليك إضافة بادئة إلى مدخلاتك كما هو موضح أدناه:
```py
>>> text = "translate English to French: Legumes share resources with nitrogen-fixing bacteria."
```
أبسط طريقة لتجربة نموذجك المضبوط للاستدلال هي استخدامه في [`pipeline`]. قم بإنشاء مثيل لـ `pipeline` للترجمة باستخدام نموذجك، ومرر النص الخاص بك إليه:
```py
>>> from transformers import pipeline
# تغيير `xx` إلى لغة الإدخال و `yy` إلى لغة المخرجات المطلوبة.
# أمثلة: "en" للغة الإنجليزية، "fr" للغة الفرنسية، "de" للغة الألمانية، "es" للغة الإسبانية، "zh" للغة الصينية، إلخ؛ translation_en_to_fr تترجم من الإنجليزية إلى الفرنسية
# يمكنك عرض جميع قوائم اللغات هنا - https://huggingface.co/languages
>>> translator = pipeline("translation_xx_to_yy", model="username/my_awesome_opus_books_model")
>>> translator(text)
[{'translation_text': 'Legumes partagent des ressources avec des bactéries azotantes.'}]
```
يمكنك أيضًا تكرار نتائج `pipeline` يدويًا إذا أردت:
<frameworkcontent>
<pt>
قم بتحويل النص إلى رموز وإرجاع `input_ids` كموترات PyTorch:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="pt").input_ids
```
استخدم الدالة [`~generation.GenerationMixin.generate`] لإنشاء الترجمة. لمزيد من التفاصيل حول استراتيجيات توليد النصوص المختلفة والمعلمات للتحكم في التوليد، تحقق من واجهة برمجة تطبيقات [توليد النصوص](../main_classes/text_generation).
```py
>>> from transformers import AutoModelForSeq2SeqLM
>>> model = AutoModelForSeq2SeqLM.from_pretrained("username/my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
فك تشفير معرفات الرموز المولدة مرة أخرى إلى نص:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lignées partagent des ressources avec des bactéries enfixant l'azote.'
```
</pt>
<tf>
قم بتحويل النص إلى رموز وإرجاع `input_ids` كموترات TensorFlow:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="tf").input_ids
```
استخدم طريقة [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] لإنشاء الترجمة. لمزيد من التفاصيل حول استراتيجيات توليد النصوص المختلفة والمعلمات للتحكم في التوليد، تحقق من واجهة برمجة تطبيقات [توليد النصوص](../main_classes/text_generation).
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("username/my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
فك تشفير معرفات الرموز المولدة مرة أخرى إلى نص:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lugumes partagent les ressources avec des bactéries fixatrices d'azote.'
```
</tf>
</frameworkcontent>

View File

@ -0,0 +1,41 @@
# Tiktoken والتفاعل مع Transformers
يتم دمج دعم ملفات نموذج tiktoken بسلاسة في 🤗 transformers عند تحميل النماذج
`from_pretrained` مع ملف `tokenizer.model` tiktoken على Hub، والذي يتم تحويله تلقائيًا إلى [المحلل اللغوي السريع](https://huggingface.co/docs/transformers/main/en/main_classes/tokenizer#transformers.PreTrainedTokenizerFast).
### النماذج المعروفة التي تم إصدارها مع `tiktoken.model`:
- gpt2
- llama3
## مثال على الاستخدام
من أجل تحميل ملفات `tiktoken` في `transformers`، تأكد من أن ملف `tokenizer.model` هو ملف tiktoken وسيتم تحميله تلقائيًا عند التحميل `from_pretrained`. إليك كيفية تحميل مجزىء لغوي ونموذج، والذي
يمكن تحميله من نفس الملف بالضبط:
```py
from transformers import AutoTokenizer
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder="original")
```
## إنشاء مجزىء لغوي tiktoken
لا يحتوي ملف `tokenizer.model` على أي معلومات حول الرموز أو الأنماط الإضافية. إذا كانت هذه الأمور مهمة، قم بتحويل المحلل اللغوي إلى `tokenizer.json`، وهو التنسيق المناسب لـ [`PreTrainedTokenizerFast`].
قم بتوليد ملف `tokenizer.model` باستخدام [tiktoken.get_encoding](https://github.com/openai/tiktoken/blob/63527649963def8c759b0f91f2eb69a40934e468/tiktoken/registry.py#L63) ثم قم بتحويله إلى `tokenizer.json` باستخدام [`convert_tiktoken_to_fast`].
```py
from transformers.integrations.tiktoken import convert_tiktoken_to_fast
from tiktoken import get_encoding
# يمكنك تحميل ترميزك المخصص أو الترميز الذي توفره OpenAI
encoding = get_encoding("gpt2")
convert_tiktoken_to_fast(encoding, "config/save/dir")
```
يتم حفظ ملف `tokenizer.json` الناتج في الدليل المحدد ويمكن تحميله باستخدام [`PreTrainedTokenizerFast`].
```py
tokenizer = PreTrainedTokenizerFast.from_pretrained("config/save/dir")
```

View File

@ -673,6 +673,29 @@ tpu_use_sudo: false
use_cpu: false
```
</hfoption>
<hfoption id="Tensor Parallelism with PyTorch 2">
```yml
compute_environment: LOCAL_MACHINE
tp_config:
tp_size: 4
distributed_type: TP
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
</hfoptions>
يُعد أمر [`accelerate_launch`](https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-launch) هو الطريقة المُوصى بها لتشغيل نص البرمجى للتدريب على نظام موزع باستخدام Accelerate و [`Trainer`] مع المعلمات المحددة في `config_file.yaml`. يتم حفظ هذا الملف في مجلد ذاكرة التخزين المؤقت لـ Accelerate ويتم تحميله تلقائيًا عند تشغيل `accelerate_launch`.

View File

@ -283,8 +283,6 @@ RUN_SLOW=yes python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/t
Wie bei den langsamen Tests gibt es auch andere Umgebungsvariablen, die standardmäßig beim Testen nicht gesetzt sind:
* `RUN_CUSTOM_TOKENIZERS`: Aktiviert Tests für benutzerdefinierte Tokenizer.
* `RUN_PT_FLAX_CROSS_TESTS`: Aktiviert Tests für die Integration von PyTorch + Flax.
* `RUN_PT_TF_CROSS_TESTS`: Aktiviert Tests für die Integration von TensorFlow + PyTorch.
Weitere Umgebungsvariablen und zusätzliche Informationen finden Sie in der [testing_utils.py](src/transformers/testing_utils.py).

View File

@ -88,7 +88,7 @@ Die Bibliothek enthält derzeit JAX-, PyTorch- und TensorFlow-Implementierungen,
1. **[DeiT](model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
1. **[DETR](model_doc/detr)** (from Facebook) released with the paper [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
1. **[DialoGPT](model_doc/dialogpt)** (from Microsoft Research) released with the paper [DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation](https://arxiv.org/abs/1911.00536) by Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan.
1. **[DistilBERT](model_doc/distilbert)** (from HuggingFace), released together with the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108) by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into [DistilGPT2](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), RoBERTa into [DistilRoBERTa](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), Multilingual BERT into [DistilmBERT](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) and a German version of DistilBERT.
1. **[DistilBERT](model_doc/distilbert)** (from HuggingFace), released together with the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108) by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into [DistilGPT2](https://github.com/huggingface/transformers-research-projects/tree/main/distillation), RoBERTa into [DistilRoBERTa](https://github.com/huggingface/transformers-research-projects/tree/main/distillation), Multilingual BERT into [DistilmBERT](https://github.com/huggingface/transformers-research-projects/tree/main/distillation) and a German version of DistilBERT.
1. **[DiT](model_doc/dit)** (from Microsoft Research) released with the paper [DiT: Self-supervised Pre-training for Document Image Transformer](https://arxiv.org/abs/2203.02378) by Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei.
1. **[DPR](model_doc/dpr)** (from Facebook) released with the paper [Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/abs/2004.04906) by Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
1. **[DPT](master/model_doc/dpt)** (from Intel Labs) released with the paper [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413) by René Ranftl, Alexey Bochkovskiy, Vladlen Koltun.

View File

@ -149,7 +149,7 @@ conda install conda-forge::transformers
Vorgefertigte Modelle werden heruntergeladen und lokal zwischengespeichert unter: `~/.cache/huggingface/hub`. Dies ist das Standardverzeichnis, das durch die Shell-Umgebungsvariable "TRANSFORMERS_CACHE" vorgegeben ist. Unter Windows wird das Standardverzeichnis durch `C:\Benutzer\Benutzername\.cache\huggingface\hub` angegeben. Sie können die unten aufgeführten Shell-Umgebungsvariablen - in der Reihenfolge ihrer Priorität - ändern, um ein anderes Cache-Verzeichnis anzugeben:
1. Shell-Umgebungsvariable (Standard): `HUGGINGFACE_HUB_CACHE` oder `TRANSFORMERS_CACHE`.
1. Shell-Umgebungsvariable (Standard): `HF_HUB_CACHE` oder `TRANSFORMERS_CACHE`.
2. Shell-Umgebungsvariable: `HF_HOME`.
3. Shell-Umgebungsvariable: `XDG_CACHE_HOME` + `/huggingface`.

View File

@ -109,7 +109,7 @@ label: NEGATIVE, with score: 0.5309
Die [`pipeline`] kann auch über einen ganzen Datensatz iterieren. Starten wir mit der Installation der [🤗 Datasets](https://huggingface.co/docs/datasets/) Bibliothek:
```bash
pip install datasets
pip install datasets
```
Erstellen wir eine [`pipeline`] mit der Aufgabe die wir lösen und dem Modell welches wir nutzen möchten.
@ -156,7 +156,7 @@ Die [`pipeline`] kann jedes Modell aus dem [Model Hub](https://huggingface.co/mo
<frameworkcontent>
<pt>
Use the [`AutoModelForSequenceClassification`] and [`AutoTokenizer`] to load the pretrained model and it's associated tokenizer (more on an `AutoClass` below):
Use the [`AutoModelForSequenceClassification`] and [`AutoTokenizer`] to load the pretrained model and its associated tokenizer (more on an `AutoClass` below):
```py
>>> from transformers import AutoTokenizer, AutoModelForSequenceClassification
@ -166,7 +166,7 @@ Use the [`AutoModelForSequenceClassification`] and [`AutoTokenizer`] to load the
```
</pt>
<tf>
Use the [`TFAutoModelForSequenceClassification`] and [`AutoTokenizer`] to load the pretrained model and it's associated tokenizer (more on an `TFAutoClass` below):
Use the [`TFAutoModelForSequenceClassification`] and [`AutoTokenizer`] to load the pretrained model and its associated tokenizer (more on an `TFAutoClass` below):
```py
>>> from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
@ -191,7 +191,7 @@ Wenn Sie kein Modell für Ihren Anwendungsfall finden können, müssen Sie ein v
<Youtube id="AhChOFRegn4"/>
Unter der Haube arbeiten die Klassen [`AutoModelForSequenceClassification`] und [`AutoTokenizer`] zusammen, um die [`pipeline`] zu betreiben. Eine [`AutoClass`](./model_doc/auto) ist eine Abkürzung, die automatisch die Architektur eines trainierten Modells aus dessen Namen oder Pfad abruft. Sie müssen nur die passende `AutoClass` für Ihre Aufgabe und den zugehörigen Tokenizer mit [`AutoTokenizer`] auswählen.
Unter der Haube arbeiten die Klassen [`AutoModelForSequenceClassification`] und [`AutoTokenizer`] zusammen, um die [`pipeline`] zu betreiben. Eine [`AutoClass`](./model_doc/auto) ist eine Abkürzung, die automatisch die Architektur eines trainierten Modells aus dessen Namen oder Pfad abruft. Sie müssen nur die passende `AutoClass` für Ihre Aufgabe und den zugehörigen Tokenizer mit [`AutoTokenizer`] auswählen.
Kehren wir zu unserem Beispiel zurück und sehen wir uns an, wie Sie die `AutoClass` verwenden können, um die Ergebnisse der [`pipeline`] zu replizieren.
@ -222,7 +222,7 @@ Anschließend wandelt der Tokenizer die Token in Zahlen um, um einen Tensor als
Der Tokenizer gibt ein Wörterbuch zurück, das Folgendes enthält:
* [input_ids](./glossary#input-ids): numerische Repräsentationen Ihrer Token.
* [atttention_mask](.glossary#attention-mask): gibt an, welche Token beachtet werden sollen.
* [attention_mask](.glossary#attention-mask): gibt an, welche Token beachtet werden sollen.
Genau wie die [`pipeline`] akzeptiert der Tokenizer eine Liste von Eingaben. Darüber hinaus kann der Tokenizer den Text auch auffüllen und kürzen, um einen Stapel mit einheitlicher Länge zurückzugeben:
@ -281,7 +281,7 @@ Jetzt können Sie Ihren vorverarbeiteten Stapel von Eingaben direkt an das Model
```
Das Modell gibt die endgültigen Aktivierungen in dem Attribut "logits" aus. Wenden Sie die Softmax-Funktion auf die "logits" an, um die Wahrscheinlichkeiten zu erhalten:
```py
>>> from torch import nn
@ -308,7 +308,7 @@ In der [Aufgabenzusammenfassung](./task_summary) steht, welche [AutoModel]-Klass
</Tip>
Jetzt können Sie Ihren vorverarbeiteten Stapel von Eingaben direkt an das Modell übergeben, indem Sie die Wörterbuchschlüssel direkt an die Tensoren übergeben:
```py
>>> tf_outputs = tf_model(tf_batch)
```
@ -383,8 +383,8 @@ Ein besonders cooles 🤗 Transformers-Feature ist die Möglichkeit, ein Modell
```py
>>> from transformers import AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
```
</pt>
<tf>
@ -392,8 +392,8 @@ Ein besonders cooles 🤗 Transformers-Feature ist die Möglichkeit, ein Modell
```py
>>> from transformers import TFAutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
```
</tf>
</frameworkcontent>

View File

@ -18,7 +18,7 @@ rendered properly in your Markdown viewer.
Neben den 🤗 Transformers [notebooks](./notebooks) gibt es auch Beispielskripte, die zeigen, wie man ein Modell für eine Aufgabe mit [PyTorch](https://github.com/huggingface/transformers/tree/main/examples/pytorch), [TensorFlow](https://github.com/huggingface/transformers/tree/main/examples/tensorflow) oder [JAX/Flax](https://github.com/huggingface/transformers/tree/main/examples/flax) trainiert.
Sie werden auch Skripte finden, die wir in unseren [Forschungsprojekten](https://github.com/huggingface/transformers/tree/main/examples/research_projects) und [Legacy-Beispielen](https://github.com/huggingface/transformers/tree/main/examples/legacy) verwendet haben und die größtenteils von der Community stammen. Diese Skripte werden nicht aktiv gepflegt und erfordern eine bestimmte Version von 🤗 Transformers, die höchstwahrscheinlich nicht mit der neuesten Version der Bibliothek kompatibel ist.
Sie werden auch Skripte finden, die wir in unseren [Forschungsprojekten](https://github.com/huggingface/transformers-research-projects/) und [Legacy-Beispielen](https://github.com/huggingface/transformers/tree/main/examples/legacy) verwendet haben und die größtenteils von der Community stammen. Diese Skripte werden nicht aktiv gepflegt und erfordern eine bestimmte Version von 🤗 Transformers, die höchstwahrscheinlich nicht mit der neuesten Version der Bibliothek kompatibel ist.
Es wird nicht erwartet, dass die Beispielskripte bei jedem Problem sofort funktionieren. Möglicherweise müssen Sie das Skript an das Problem anpassen, das Sie zu lösen versuchen. Um Ihnen dabei zu helfen, legen die meisten Skripte vollständig offen, wie die Daten vorverarbeitet werden, so dass Sie sie nach Bedarf für Ihren Anwendungsfall bearbeiten können.

View File

@ -11,4 +11,4 @@ black_avoid_patterns = {
"{processor_class}": "FakeProcessorClass",
"{model_class}": "FakeModelClass",
"{object_class}": "FakeObjectClass",
}
}

View File

@ -1,276 +1,310 @@
- sections:
- local: index
title: 🤗 Transformers
- local: quicktour
title: Quick tour
title: Transformers
- local: installation
title: Installation
- local: add_new_model
title: Adding a new model to `transformers`
- local: quicktour
title: Quickstart
title: Get started
- sections:
- local: pipeline_tutorial
title: Run inference with pipelines
- local: autoclass_tutorial
title: Write portable code with AutoClass
- local: preprocessing
title: Preprocess data
- local: training
title: Fine-tune a pretrained model
- local: run_scripts
title: Train with a script
- local: accelerate
title: Set up distributed training with 🤗 Accelerate
- local: peft
title: Load and train adapters with 🤗 PEFT
- local: model_sharing
title: Share your model
- local: agents
title: Agents 101
- local: agents_advanced
title: Agents, supercharged - Multi-agents, External tools, and more
- local: llm_tutorial
title: Generation with LLMs
- local: conversations
title: Chatting with Transformers
title: Tutorials
- sections:
- isExpanded: false
sections:
- local: tasks/sequence_classification
title: Text classification
- local: tasks/token_classification
title: Token classification
- local: tasks/question_answering
title: Question answering
- local: tasks/language_modeling
title: Causal language modeling
- local: tasks/masked_language_modeling
title: Masked language modeling
- local: tasks/translation
title: Translation
- local: tasks/summarization
title: Summarization
- local: tasks/multiple_choice
title: Multiple choice
title: Natural Language Processing
- isExpanded: false
sections:
- local: tasks/audio_classification
title: Audio classification
- local: tasks/asr
title: Automatic speech recognition
title: Audio
- isExpanded: false
sections:
- local: tasks/image_classification
title: Image classification
- local: tasks/semantic_segmentation
title: Image segmentation
- local: tasks/video_classification
title: Video classification
- local: tasks/object_detection
title: Object detection
- local: tasks/zero_shot_object_detection
title: Zero-shot object detection
- local: tasks/zero_shot_image_classification
title: Zero-shot image classification
- local: tasks/monocular_depth_estimation
title: Depth estimation
- local: tasks/image_to_image
title: Image-to-Image
- local: tasks/image_feature_extraction
title: Image Feature Extraction
- local: tasks/mask_generation
title: Mask Generation
- local: tasks/keypoint_detection
title: Keypoint Detection
- local: tasks/knowledge_distillation_for_image_classification
title: Knowledge Distillation for Computer Vision
title: Computer Vision
- isExpanded: false
sections:
- local: tasks/image_captioning
title: Image captioning
- local: tasks/document_question_answering
title: Document Question Answering
- local: tasks/visual_question_answering
title: Visual Question Answering
- local: tasks/text-to-speech
title: Text to speech
- local: tasks/image_text_to_text
title: Image-text-to-text
- local: tasks/video_text_to_text
title: Video-text-to-text
title: Multimodal
- isExpanded: false
sections:
- isExpanded: false
sections:
- sections:
- local: models
title: Loading models
- local: custom_models
title: Customizing models
- local: how_to_hack_models
title: Customizing model components
- local: model_sharing
title: Sharing
- local: add_new_model
title: Adding a new model to Transformers
- local: modular_transformers
title: Modular Transformers
- local: task_summary
title: What 🤗 Transformers can do
- local: tasks_explained
title: How 🤗 Transformers solve tasks
- local: model_summary
title: The Transformer model family
- local: attention
title: Attention mechanisms
- local: attention_interface
title: Customizing attention function
title: Models
- sections:
- local: fast_tokenizers
title: Tokenizers
- local: image_processors
title: Image processors
- local: backbones
title: Backbones
- local: feature_extractors
title: Feature extractors
- local: processors
title: Processors
- local: tokenizer_summary
title: Summary of the tokenizers
- local: pad_truncation
title: Padding and truncation
title: Preprocessors
title: Base classes
- isExpanded: false
sections:
- sections:
- local: pipeline_tutorial
title: Pipeline
- local: pipeline_gradio
title: Machine learning apps
- local: pipeline_webserver
title: Web server inference
- local: add_new_pipeline
title: Adding a new pipeline
title: Pipeline API
- sections:
- local: llm_tutorial
title: Text generation
- local: generation_strategies
title: Customize the generation strategy
- local: kv_cache
title: Best Practices for Generation with Cache
title: Generation
- isExpanded: false
sections:
- local: tasks/idefics
title: Image tasks with IDEFICS
title: Generation strategies
- local: generation_features
title: Generation features
- local: tasks/prompting
title: LLM prompting guide
title: Prompting
title: Task Guides
- sections:
- local: fast_tokenizers
title: Use fast tokenizers from 🤗 Tokenizers
- local: multilingual
title: Run inference with multilingual models
- local: create_a_model
title: Use model-specific APIs
- local: custom_models
title: Share a custom model
- local: chat_templating
title: Chat templates
- local: trainer
title: Trainer
- local: sagemaker
title: Run training on Amazon SageMaker
title: Prompt engineering
- local: llm_optims
title: Optimizing inference
- local: kv_cache
title: KV cache strategies
- local: serving
title: Serving
- local: cache_explanation
title: Caching
- local: llm_tutorial_optimization
title: Getting the most out of LLMs
- local: perplexity
title: Perplexity of fixed-length models
title: LLMs
- sections:
- local: conversations
title: Chat basics
- local: chat_templating
title: Templates
- local: chat_templating_multimodal
title: Multimodal templates
- local: chat_templating_writing
title: Template writing
- local: chat_extras
title: Tools and RAG
title: Chat with models
- sections:
- local: perf_torch_compile
title: torch.compile
- local: perf_infer_gpu_one
title: GPU
- local: perf_infer_gpu_multi
title: Distributed GPU inference
- local: perf_infer_cpu
title: CPU
- local: tf_xla
title: XLA
title: Optimization
- local: agents
title: Agents
- local: tools
title: Tools
title: Inference
- isExpanded: false
sections:
- sections:
- local: trainer
title: Trainer
- local: training
title: Fine-tuning
- local: optimizers
title: Optimizers
- local: hpo_train
title: Hyperparameter search
title: Trainer API
- sections:
- local: gpu_selection
title: GPU selection
- local: accelerate
title: Accelerate
- local: fsdp
title: FullyShardedDataParallel
- local: deepspeed
title: DeepSpeed
- local: debugging
title: Multi-GPU debugging
- local: perf_train_cpu_many
title: Distributed CPUs
- local: perf_train_gpu_many
title: Parallelism methods
title: Distributed training
- sections:
- local: perf_train_gpu_one
title: GPU
- local: perf_train_cpu
title: CPU
- local: perf_train_tpu_tf
title: TPU
- local: perf_train_special
title: Apple Silicon
- local: perf_hardware
title: Build your own machine
title: Hardware
- local: peft
title: PEFT
- local: model_memory_anatomy
title: Model training anatomy
title: Training
- isExpanded: false
sections:
- local: quantization/overview
title: Overview
- local: quantization/aqlm
title: AQLM
- local: quantization/awq
title: AWQ
- local: quantization/bitnet
title: BitNet
- local: quantization/bitsandbytes
title: bitsandbytes
- local: quantization/compressed_tensors
title: compressed-tensors
- local: quantization/eetq
title: EETQ
- local: quantization/fbgemm_fp8
title: FBGEMM
- local: quantization/finegrained_fp8
title: Fine-grained FP8
- local: gguf
title: GGUF
- local: quantization/gptq
title: GPTQ
- local: quantization/higgs
title: HIGGS
- local: quantization/hqq
title: HQQ
- local: quantization/optimum
title: Optimum
- local: quantization/quanto
title: Quanto
- local: quantization/quark
title: Quark
- local: quantization/torchao
title: torchao
- local: quantization/spqr
title: SpQR
- local: quantization/vptq
title: VPTQ
- local: quantization/contribute
title: Contribute
title: Quantization
- isExpanded: false
sections:
- local: serialization
title: Export to ONNX
title: ONNX
- local: tflite
title: Export to TFLite
title: LiteRT
- local: executorch
title: ExecuTorch
- local: torchscript
title: Export to TorchScript
- local: benchmarks
title: Benchmarks
title: TorchScript
title: Export to production
- isExpanded: false
sections:
- sections:
- sections:
- local: tasks/sequence_classification
title: Text classification
- local: tasks/token_classification
title: Token classification
- local: tasks/question_answering
title: Question answering
- local: tasks/language_modeling
title: Causal language modeling
- local: tasks/masked_language_modeling
title: Masked language modeling
- local: tasks/translation
title: Translation
- local: tasks/summarization
title: Summarization
- local: tasks/multiple_choice
title: Multiple choice
title: Natural language processing
- sections:
- local: tasks/audio_classification
title: Audio classification
- local: tasks/asr
title: Automatic speech recognition
title: Audio
- sections:
- local: tasks/image_classification
title: Image classification
- local: tasks/semantic_segmentation
title: Image segmentation
- local: tasks/video_classification
title: Video classification
- local: tasks/object_detection
title: Object detection
- local: tasks/zero_shot_object_detection
title: Zero-shot object detection
- local: tasks/zero_shot_image_classification
title: Zero-shot image classification
- local: tasks/monocular_depth_estimation
title: Depth estimation
- local: tasks/image_to_image
title: Image-to-Image
- local: tasks/image_feature_extraction
title: Image Feature Extraction
- local: tasks/mask_generation
title: Mask Generation
- local: tasks/keypoint_detection
title: Keypoint detection
- local: tasks/knowledge_distillation_for_image_classification
title: Knowledge Distillation for Computer Vision
title: Computer vision
- sections:
- local: tasks/image_captioning
title: Image captioning
- local: tasks/document_question_answering
title: Document Question Answering
- local: tasks/visual_question_answering
title: Visual Question Answering
- local: tasks/text-to-speech
title: Text to speech
- local: tasks/idefics
title: Image tasks with IDEFICS
- local: tasks/image_text_to_text
title: Image-text-to-text
- local: tasks/video_text_to_text
title: Video-text-to-text
title: Multimodal
title: Task recipes
- local: run_scripts
title: Training scripts
- local: glossary
title: Glossary
- local: philosophy
title: Philosophy
- local: notebooks
title: Notebooks with examples
- local: community
title: Community resources
- local: troubleshooting
title: Troubleshoot
- local: gguf
title: Interoperability with GGUF files
- local: tiktoken
title: Interoperability with TikToken files
- local: modular_transformers
title: Modularity in `transformers`
- local: how_to_hack_models
title: Model Hacking (overwriting a class to your usage)
title: Developer guides
- sections:
- local: quantization/overview
title: Getting started
- local: quantization/bitsandbytes
title: bitsandbytes
- local: quantization/gptq
title: GPTQ
- local: quantization/awq
title: AWQ
- local: quantization/aqlm
title: AQLM
- local: quantization/quanto
title: Quanto
- local: quantization/eetq
title: EETQ
- local: quantization/hqq
title: HQQ
- local: quantization/fbgemm_fp8
title: FBGEMM_FP8
- local: quantization/optimum
title: Optimum
- local: quantization/torchao
title: TorchAO
- local: quantization/bitnet
title: BitNet
- local: quantization/compressed_tensors
title: compressed-tensors
- local: quantization/contribute
title: Contribute new quantization method
title: Quantization Methods
- sections:
- local: performance
title: Overview
- local: llm_optims
title: LLM inference optimization
- sections:
- local: perf_train_gpu_one
title: Methods and tools for efficient training on a single GPU
- local: perf_train_gpu_many
title: Multiple GPUs and parallelism
- local: fsdp
title: Fully Sharded Data Parallel
- local: deepspeed
title: DeepSpeed
- local: perf_train_cpu
title: Efficient training on CPU
- local: perf_train_cpu_many
title: Distributed CPU training
- local: perf_train_tpu_tf
title: Training on TPU with TensorFlow
- local: perf_train_special
title: PyTorch training on Apple silicon
- local: perf_hardware
title: Custom hardware for training
- local: hpo_train
title: Hyperparameter Search using Trainer API
title: Efficient training techniques
- sections:
- local: perf_infer_cpu
title: CPU inference
- local: perf_infer_gpu_one
title: GPU inference
- local: perf_infer_gpu_multi
title: Multi-GPU inference
title: Optimizing inference
- local: big_models
title: Instantiate a big model
- local: debugging
title: Debugging
- local: tf_xla
title: XLA Integration for TensorFlow Models
- local: perf_torch_compile
title: Optimize inference using `torch.compile()`
title: Performance and scalability
- sections:
title: Resources
- isExpanded: false
sections:
- local: contributing
title: How to contribute to 🤗 Transformers?
- local: add_new_model
title: How to add a model to 🤗 Transformers?
- local: add_new_pipeline
title: How to add a pipeline to 🤗 Transformers?
title: Contribute to Transformers
- local: testing
title: Testing
title: Transformers model tests
- local: pr_checks
title: Checks on a Pull Request
title: Pull request checks
title: Contribute
- sections:
- local: philosophy
title: Philosophy
- local: glossary
title: Glossary
- local: task_summary
title: What 🤗 Transformers can do
- local: tasks_explained
title: How 🤗 Transformers solve tasks
- local: model_summary
title: The Transformer model family
- local: tokenizer_summary
title: Summary of the tokenizers
- local: attention
title: Attention mechanisms
- local: pad_truncation
title: Padding and truncation
- local: bertology
title: BERTology
- local: perplexity
title: Perplexity of fixed-length models
- local: pipeline_webserver
title: Pipelines for webserver inference
- local: model_memory_anatomy
title: Model training anatomy
- local: llm_tutorial_optimization
title: Getting the most out of LLMs
title: Conceptual guides
- sections:
- isExpanded: false
sections:
- sections:
- local: main_classes/agent
title: Agents and Tools
@ -298,6 +332,8 @@
title: Optimization
- local: main_classes/output
title: Model outputs
- local: main_classes/peft
title: PEFT
- local: main_classes/pipelines
title: Pipelines
- local: main_classes/processors
@ -316,12 +352,13 @@
title: Feature Extractor
- local: main_classes/image_processor
title: Image Processor
title: Main Classes
title: Main classes
- sections:
- isExpanded: false
sections:
- sections:
- local: model_doc/albert
title: ALBERT
- local: model_doc/bamba
title: Bamba
- local: model_doc/bart
title: BART
- local: model_doc/barthez
@ -362,6 +399,8 @@
title: CodeLlama
- local: model_doc/cohere
title: Cohere
- local: model_doc/cohere2
title: Cohere2
- local: model_doc/convbert
title: ConvBERT
- local: model_doc/cpm
@ -376,8 +415,12 @@
title: DeBERTa
- local: model_doc/deberta-v2
title: DeBERTa-v2
- local: model_doc/deepseek_v3
title: DeepSeek-V3
- local: model_doc/dialogpt
title: DialoGPT
- local: model_doc/diffllama
title: DiffLlama
- local: model_doc/distilbert
title: DistilBERT
- local: model_doc/dpr
@ -394,10 +437,10 @@
title: ESM
- local: model_doc/falcon
title: Falcon
- local: model_doc/falcon3
title: Falcon3
- local: model_doc/falcon_mamba
title: FalconMamba
- local: model_doc/fastspeech2_conformer
title: FastSpeech2Conformer
- local: model_doc/flan-t5
title: FLAN-T5
- local: model_doc/flan-ul2
@ -440,6 +483,12 @@
title: Granite
- local: model_doc/granitemoe
title: GraniteMoe
- local: model_doc/granitemoeshared
title: GraniteMoeShared
- local: model_doc/granitevision
title: GraniteVision
- local: model_doc/helium
title: Helium
- local: model_doc/herbert
title: HerBERT
- local: model_doc/ibert
@ -458,6 +507,8 @@
title: Llama2
- local: model_doc/llama3
title: Llama3
- local: model_doc/llama4
title: Llama4
- local: model_doc/longformer
title: Longformer
- local: model_doc/longt5
@ -486,12 +537,16 @@
title: MegatronGPT2
- local: model_doc/mistral
title: Mistral
- local: model_doc/mistral3
title: Mistral3
- local: model_doc/mixtral
title: Mixtral
- local: model_doc/mluke
title: mLUKE
- local: model_doc/mobilebert
title: MobileBERT
- local: model_doc/modernbert
title: ModernBert
- local: model_doc/mpnet
title: MPNet
- local: model_doc/mpt
@ -516,8 +571,8 @@
title: Nyströmformer
- local: model_doc/olmo
title: OLMo
- local: model_doc/olmo_1124
title: OLMo November 2024
- local: model_doc/olmo2
title: OLMo2
- local: model_doc/olmoe
title: OLMoE
- local: model_doc/open-llama
@ -534,6 +589,8 @@
title: Phi
- local: model_doc/phi3
title: Phi-3
- local: model_doc/phi4_multimodal
title: Phi4 Multimodal
- local: model_doc/phimoe
title: PhiMoE
- local: model_doc/phobert
@ -548,6 +605,10 @@
title: Qwen2
- local: model_doc/qwen2_moe
title: Qwen2MoE
- local: model_doc/qwen3
title: Qwen3
- local: model_doc/qwen3_moe
title: Qwen3MoE
- local: model_doc/rag
title: RAG
- local: model_doc/realm
@ -612,9 +673,10 @@
title: YOSO
- local: model_doc/zamba
title: Zamba
- local: model_doc/zamba2
title: Zamba2
title: Text models
- isExpanded: false
sections:
- sections:
- local: model_doc/beit
title: BEiT
- local: model_doc/bit
@ -627,6 +689,8 @@
title: ConvNeXTV2
- local: model_doc/cvt
title: CvT
- local: model_doc/dab-detr
title: DAB-DETR
- local: model_doc/deformable_detr
title: Deformable DETR
- local: model_doc/deit
@ -635,6 +699,8 @@
title: Depth Anything
- local: model_doc/depth_anything_v2
title: Depth Anything V2
- local: model_doc/depth_pro
title: DepthPro
- local: model_doc/deta
title: DETA
- local: model_doc/detr
@ -643,6 +709,8 @@
title: DiNAT
- local: model_doc/dinov2
title: DINOV2
- local: model_doc/dinov2_with_registers
title: DINOv2 with Registers
- local: model_doc/dit
title: DiT
- local: model_doc/dpt
@ -657,6 +725,8 @@
title: GLPN
- local: model_doc/hiera
title: Hiera
- local: model_doc/ijepa
title: I-JEPA
- local: model_doc/imagegpt
title: ImageGPT
- local: model_doc/levit
@ -677,6 +747,8 @@
title: NAT
- local: model_doc/poolformer
title: PoolFormer
- local: model_doc/prompt_depth_anything
title: Prompt Depth Anything
- local: model_doc/pvt
title: Pyramid Vision Transformer (PVT)
- local: model_doc/pvt_v2
@ -687,10 +759,14 @@
title: ResNet
- local: model_doc/rt_detr
title: RT-DETR
- local: model_doc/rt_detr_v2
title: RT-DETRv2
- local: model_doc/segformer
title: SegFormer
- local: model_doc/seggpt
title: SegGpt
- local: model_doc/superglue
title: SuperGlue
- local: model_doc/superpoint
title: SuperPoint
- local: model_doc/swiftformer
@ -703,6 +779,10 @@
title: Swin2SR
- local: model_doc/table-transformer
title: Table Transformer
- local: model_doc/textnet
title: TextNet
- local: model_doc/timm_wrapper
title: Timm Wrapper
- local: model_doc/upernet
title: UperNet
- local: model_doc/van
@ -719,13 +799,14 @@
title: ViTMatte
- local: model_doc/vit_msn
title: ViTMSN
- local: model_doc/vitpose
title: ViTPose
- local: model_doc/yolos
title: YOLOS
- local: model_doc/zoedepth
title: ZoeDepth
title: Vision models
- isExpanded: false
sections:
- sections:
- local: model_doc/audio-spectrogram-transformer
title: Audio Spectrogram Transformer
- local: model_doc/bark
@ -736,8 +817,8 @@
title: dac
- local: model_doc/encodec
title: EnCodec
- local: model_doc/hiera
title: Hiera
- local: model_doc/fastspeech2_conformer
title: FastSpeech2Conformer
- local: model_doc/hubert
title: Hubert
- local: model_doc/mctct
@ -746,6 +827,8 @@
title: Mimi
- local: model_doc/mms
title: MMS
- local: model_doc/moonshine
title: Moonshine
- local: model_doc/moshi
title: Moshi
- local: model_doc/musicgen
@ -793,8 +876,7 @@
- local: model_doc/xlsr_wav2vec2
title: XLSR-Wav2Vec2
title: Audio models
- isExpanded: false
sections:
- sections:
- local: model_doc/timesformer
title: TimeSformer
- local: model_doc/videomae
@ -802,12 +884,15 @@
- local: model_doc/vivit
title: ViViT
title: Video models
- isExpanded: false
sections:
- sections:
- local: model_doc/align
title: ALIGN
- local: model_doc/altclip
title: AltCLIP
- local: model_doc/aria
title: Aria
- local: model_doc/aya_vision
title: AyaVision
- local: model_doc/blip
title: BLIP
- local: model_doc/blip-2
@ -826,16 +911,24 @@
title: CLIPSeg
- local: model_doc/clvp
title: CLVP
- local: model_doc/colpali
title: ColPali
- local: model_doc/data2vec
title: Data2Vec
- local: model_doc/deplot
title: DePlot
- local: model_doc/donut
title: Donut
- local: model_doc/emu3
title: Emu3
- local: model_doc/flava
title: FLAVA
- local: model_doc/gemma3
title: Gemma3
- local: model_doc/git
title: GIT
- local: model_doc/got_ocr2
title: GOT-OCR2
- local: model_doc/grounding-dino
title: Grounding DINO
- local: model_doc/groupvit
@ -896,14 +989,22 @@
title: Pix2Struct
- local: model_doc/pixtral
title: Pixtral
- local: model_doc/qwen2_5_vl
title: Qwen2.5-VL
- local: model_doc/qwen2_audio
title: Qwen2Audio
- local: model_doc/qwen2_vl
title: Qwen2VL
- local: model_doc/sam
title: Segment Anything
- local: model_doc/shieldgemma2
title: ShieldGemma2
- local: model_doc/siglip
title: SigLIP
- local: model_doc/siglip2
title: SigLIP2
- local: model_doc/smolvlm
title: SmolVLM
- local: model_doc/speech-encoder-decoder
title: Speech Encoder Decoder Models
- local: model_doc/tapas
@ -931,15 +1032,13 @@
- local: model_doc/xclip
title: X-CLIP
title: Multimodal models
- isExpanded: false
sections:
- sections:
- local: model_doc/decision_transformer
title: Decision Transformer
- local: model_doc/trajectory_transformer
title: Trajectory Transformer
title: Reinforcement learning models
- isExpanded: false
sections:
- sections:
- local: model_doc/autoformer
title: Autoformer
- local: model_doc/informer
@ -951,8 +1050,7 @@
- local: model_doc/time_series_transformer
title: Time Series Transformer
title: Time series models
- isExpanded: false
sections:
- sections:
- local: model_doc/graphormer
title: Graphormer
title: Graph models
@ -960,6 +1058,8 @@
- sections:
- local: internal/modeling_utils
title: Custom Layers and Utilities
- local: internal/model_debugging_utils
title: Utilities for Model Debugging
- local: internal/pipelines_utils
title: Utilities for pipelines
- local: internal/tokenization_utils
@ -976,5 +1076,5 @@
title: General Utilities
- local: internal/time_series_utils
title: Utilities for Time Series
title: Internal Helpers
title: Internal helpers
title: API

View File

@ -1,4 +1,4 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@ -14,123 +14,152 @@ rendered properly in your Markdown viewer.
-->
# Distributed training with 🤗 Accelerate
# Accelerate
As models get bigger, parallelism has emerged as a strategy for training larger models on limited hardware and accelerating training speed by several orders of magnitude. At Hugging Face, we created the [🤗 Accelerate](https://huggingface.co/docs/accelerate) library to help users easily train a 🤗 Transformers model on any type of distributed setup, whether it is multiple GPU's on one machine or multiple GPU's across several machines. In this tutorial, learn how to customize your native PyTorch training loop to enable training in a distributed environment.
[Accelerate](https://hf.co/docs/accelerate/index) is a library designed to simplify distributed training on any type of setup with PyTorch by uniting the most common frameworks ([Fully Sharded Data Parallel (FSDP)](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/) and [DeepSpeed](https://www.deepspeed.ai/)) for it into a single interface. [`Trainer`] is powered by Accelerate under the hood, enabling loading big models and distributed training.
## Setup
Get started by installing 🤗 Accelerate:
This guide will show you two ways to use Accelerate with Transformers, using FSDP as the backend. The first method demonstrates distributed training with [`Trainer`], and the second method demonstrates adapting a PyTorch training loop. For more detailed information about Accelerate, please refer to the [documentation](https://hf.co/docs/accelerate/index).
```bash
pip install accelerate
```
Then import and create an [`~accelerate.Accelerator`] object. The [`~accelerate.Accelerator`] will automatically detect your type of distributed setup and initialize all the necessary components for training. You don't need to explicitly place your model on a device.
```py
>>> from accelerate import Accelerator
>>> accelerator = Accelerator()
```
## Prepare to accelerate
The next step is to pass all the relevant training objects to the [`~accelerate.Accelerator.prepare`] method. This includes your training and evaluation DataLoaders, a model and an optimizer:
```py
>>> train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
... train_dataloader, eval_dataloader, model, optimizer
... )
```
## Backward
The last addition is to replace the typical `loss.backward()` in your training loop with 🤗 Accelerate's [`~accelerate.Accelerator.backward`] method:
```py
>>> for epoch in range(num_epochs):
... for batch in train_dataloader:
... outputs = model(**batch)
... loss = outputs.loss
... accelerator.backward(loss)
... optimizer.step()
... lr_scheduler.step()
... optimizer.zero_grad()
... progress_bar.update(1)
```
As you can see in the following code, you only need to add four additional lines of code to your training loop to enable distributed training!
```diff
+ from accelerate import Accelerator
from transformers import AdamW, AutoModelForSequenceClassification, get_scheduler
+ accelerator = Accelerator()
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
optimizer = AdamW(model.parameters(), lr=3e-5)
- device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
- model.to(device)
+ train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
+ train_dataloader, eval_dataloader, model, optimizer
+ )
num_epochs = 3
num_training_steps = num_epochs * len(train_dataloader)
lr_scheduler = get_scheduler(
"linear",
optimizer=optimizer,
num_warmup_steps=0,
num_training_steps=num_training_steps
)
progress_bar = tqdm(range(num_training_steps))
model.train()
for epoch in range(num_epochs):
for batch in train_dataloader:
- batch = {k: v.to(device) for k, v in batch.items()}
outputs = model(**batch)
loss = outputs.loss
- loss.backward()
+ accelerator.backward(loss)
optimizer.step()
lr_scheduler.step()
optimizer.zero_grad()
progress_bar.update(1)
```
## Train
Once you've added the relevant lines of code, launch your training in a script or a notebook like Colaboratory.
### Train with a script
If you are running your training from a script, run the following command to create and save a configuration file:
Start by running [accelerate config](https://hf.co/docs/accelerate/main/en/package_reference/cli#accelerate-config) in the command line to answer a series of prompts about your training system. This creates and saves a configuration file to help Accelerate correctly set up training based on your setup.
```bash
accelerate config
```
Then launch your training with:
Depending on your setup and the answers you provide, an example configuration file for distributing training with FSDP on one machine with two GPUs may look like the following.
```bash
accelerate launch train.py
```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
fsdp_backward_prefetch_policy: BACKWARD_PRE
fsdp_forward_prefetch: false
fsdp_cpu_ram_efficient_loading: true
fsdp_offload_params: false
fsdp_sharding_strategy: FULL_SHARD
fsdp_state_dict_type: SHARDED_STATE_DICT
fsdp_sync_module_states: true
fsdp_transformer_layer_cls_to_wrap: BertLayer
fsdp_use_orig_params: true
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
### Train with a notebook
## Trainer
🤗 Accelerate can also run in a notebook if you're planning on using Colaboratory's TPUs. Wrap all the code responsible for training in a function, and pass it to [`~accelerate.notebook_launcher`]:
Pass the path to the saved configuration file to [`TrainingArguments`], and from there, pass your [`TrainingArguments`] to [`Trainer`].
```py
>>> from accelerate import notebook_launcher
from transformers import TrainingArguments, Trainer
>>> notebook_launcher(training_function)
training_args = TrainingArguments(
output_dir="your-model",
learning_rate=2e-5,
per_device_train_batch_size=16,
per_device_eval_batch_size=16,
num_train_epochs=2,
fsdp_config="path/to/fsdp_config",
fsdp_strategy="full_shard",
weight_decay=0.01,
eval_strategy="epoch",
save_strategy="epoch",
load_best_model_at_end=True,
push_to_hub=True,
)
trainer = Trainer(
model=model,
args=training_args,
train_dataset=dataset["train"],
eval_dataset=dataset["test"],
processing_class=tokenizer,
data_collator=data_collator,
compute_metrics=compute_metrics,
)
trainer.train()
```
For more information about 🤗 Accelerate and its rich features, refer to the [documentation](https://huggingface.co/docs/accelerate).
## Native PyTorch
Accelerate can also be added to any PyTorch training loop to enable distributed training. The [`~accelerate.Accelerator`] is the main entry point for adapting your PyTorch code to work with Accelerate. It automatically detects your distributed training setup and initializes all the necessary components for training. You don't need to explicitly place your model on a device because [`~accelerate.Accelerator`] knows which device to move your model to.
```py
from accelerate import Accelerator
accelerator = Accelerator()
device = accelerator.device
```
All PyTorch objects (model, optimizer, scheduler, dataloaders) should be passed to the [`~accelerate.Accelerator.prepare`] method now. This method moves your model to the appropriate device or devices, adapts the optimizer and scheduler to use [`~accelerate.optimizer.AcceleratedOptimizer`] and [`~accelerate.scheduler.AcceleratedScheduler`], and creates a new shardable dataloader.
```py
train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
train_dataloader, eval_dataloader, model, optimizer
)
```
Replace `loss.backward` in your training loop with Accelerates [`~accelerate.Accelerator.backward`] method to scale the gradients and determine the appropriate `backward` method to use depending on your framework (for example, DeepSpeed or Megatron).
```py
for epoch in range(num_epochs):
for batch in train_dataloader:
outputs = model(**batch)
loss = outputs.loss
accelerator.backward(loss)
optimizer.step()
lr_scheduler.step()
optimizer.zero_grad()
progress_bar.update(1)
```
Combine everything into a function and make it callable as a script.
```py
from accelerate import Accelerator
def main():
accelerator = Accelerator()
model, optimizer, training_dataloader, scheduler = accelerator.prepare(
model, optimizer, training_dataloader, scheduler
)
for batch in training_dataloader:
optimizer.zero_grad()
inputs, targets = batch
outputs = model(inputs)
loss = loss_function(outputs, targets)
accelerator.backward(loss)
optimizer.step()
scheduler.step()
if __name__ == "__main__":
main()
```
From the command line, call [accelerate launch](https://hf.co/docs/accelerate/main/en/package_reference/cli#accelerate-launch) to run your training script. Any additional arguments or parameters can be passed here as well.
To launch your training script on two GPUs, add the `--num_processes` argument.
```bash
accelerate launch --num_processes=2 your_script.py
```
Refer to the [Launching Accelerate scripts](https://hf.co/docs/accelerate/main/en/basic_tutorials/launch) for more details.

File diff suppressed because it is too large Load Diff

View File

@ -1,4 +1,4 @@
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@ -13,92 +13,66 @@ rendered properly in your Markdown viewer.
-->
# How to create a custom pipeline?
# Adding a new pipeline
In this guide, we will see how to create a custom pipeline and share it on the [Hub](https://hf.co/models) or add it to the
🤗 Transformers library.
Make [`Pipeline`] your own by subclassing it and implementing a few methods. Share the code with the community on the [Hub](https://hf.co) and register the pipeline with Transformers so that everyone can quickly and easily use it.
First and foremost, you need to decide the raw entries the pipeline will be able to take. It can be strings, raw bytes,
dictionaries or whatever seems to be the most likely desired input. Try to keep these inputs as pure Python as possible
as it makes compatibility easier (even through other languages via JSON). Those will be the `inputs` of the
pipeline (`preprocess`).
This guide will walk you through the process of adding a new pipeline to Transformers.
Then define the `outputs`. Same policy as the `inputs`. The simpler, the better. Those will be the outputs of
`postprocess` method.
## Design choices
Start by inheriting the base class `Pipeline` with the 4 methods needed to implement `preprocess`,
`_forward`, `postprocess`, and `_sanitize_parameters`.
At a minimum, you only need to provide [`Pipeline`] with an appropriate input for a task. This is also where you should begin when designing your pipeline.
Decide what input types [`Pipeline`] can accept. It can be strings, raw bytes, dictionaries, and so on. Try to keep the inputs in pure Python where possible because it's more compatible. Next, decide on the output [`Pipeline`] should return. Again, keeping the output in Python is the simplest and best option because it's easier to work with.
```python
Keeping the inputs and outputs simple, and ideally JSON-serializable, makes it easier for users to run your [`Pipeline`] without needing to learn new object types. It's also common to support many different input types for even greater ease of use. For example, making an audio file acceptable from a filename, URL, or raw bytes gives the user more flexibility in how they provide the audio data.
## Create a pipeline
With an input and output decided, you can start implementing [`Pipeline`]. Your pipeline should inherit from the base [`Pipeline`] class and include 4 methods.
```py
from transformers import Pipeline
class MyPipeline(Pipeline):
def _sanitize_parameters(self, **kwargs):
preprocess_kwargs = {}
if "maybe_arg" in kwargs:
preprocess_kwargs["maybe_arg"] = kwargs["maybe_arg"]
return preprocess_kwargs, {}, {}
def preprocess(self, inputs, maybe_arg=2):
model_input = Tensor(inputs["input_ids"])
return {"model_input": model_input}
def preprocess(self, inputs, args=2):
def _forward(self, model_inputs):
# model_inputs == {"model_input": model_input}
outputs = self.model(**model_inputs)
# Maybe {"logits": Tensor(...)}
return outputs
def postprocess(self, model_outputs):
best_class = model_outputs["logits"].softmax(-1)
return best_class
```
The structure of this breakdown is to support relatively seamless support for CPU/GPU, while supporting doing
pre/postprocessing on the CPU on different threads
1. `preprocess` takes the inputs and transforms them into the appropriate input format for the model.
`preprocess` will take the originally defined inputs, and turn them into something feedable to the model. It might
contain more information and is usually a `Dict`.
`_forward` is the implementation detail and is not meant to be called directly. `forward` is the preferred
called method as it contains safeguards to make sure everything is working on the expected device. If anything is
linked to a real model it belongs in the `_forward` method, anything else is in the preprocess/postprocess.
`postprocess` methods will take the output of `_forward` and turn it into the final output that was decided
earlier.
`_sanitize_parameters` exists to allow users to pass any parameters whenever they wish, be it at initialization
time `pipeline(...., maybe_arg=4)` or at call time `pipe = pipeline(...); output = pipe(...., maybe_arg=4)`.
The returns of `_sanitize_parameters` are the 3 dicts of kwargs that will be passed directly to `preprocess`,
`_forward`, and `postprocess`. Don't fill anything if the caller didn't call with any extra parameter. That
allows to keep the default arguments in the function definition which is always more "natural".
A classic example would be a `top_k` argument in the post processing in classification tasks.
```python
>>> pipe = pipeline("my-new-task")
>>> pipe("This is a test")
[{"label": "1-star", "score": 0.8}, {"label": "2-star", "score": 0.1}, {"label": "3-star", "score": 0.05}
{"label": "4-star", "score": 0.025}, {"label": "5-star", "score": 0.025}]
>>> pipe("This is a test", top_k=2)
[{"label": "1-star", "score": 0.8}, {"label": "2-star", "score": 0.1}]
```py
def preprocess(self, inputs, maybe_arg=2):
model_input = Tensor(inputs["input_ids"])
return {"model_input": model_input}
```
In order to achieve that, we'll update our `postprocess` method with a default parameter to `5`. and edit
`_sanitize_parameters` to allow this new parameter.
2. `_forward` shouldn't be called directly. `forward` is the preferred method because it includes safeguards to make sure everything works correctly on the expected device. Anything linked to the model belongs in `_forward` and everything else belongs in either `preprocess` or `postprocess`.
```py
def _forward(self, model_inputs):
outputs = self.model(**model_inputs)
return outputs
```
```python
3. `postprocess` generates the final output from the models output in `_forward`.
```py
def postprocess(self, model_outputs, top_k=5):
best_class = model_outputs["logits"].softmax(-1)
# Add logic to handle top_k
return best_class
```
4. `_sanitize_parameters` lets users pass additional parameters to [`Pipeline`]. This could be during initialization or when [`Pipeline`] is called. `_sanitize_parameters` returns 3 dicts of additional keyword arguments that are passed directly to `preprocess`, `_forward`, and `postprocess`. Don't add anything if a user didn't call the pipeline with extra parameters. This keeps the default arguments in the function definition which is always more natural.
For example, add a `top_k` parameter in `postprocess` to return the top 5 most likely classes. Then in `_sanitize_parameters`, check if the user passed in `top_k` and add it to `postprocess_kwargs`.
```py
def _sanitize_parameters(self, **kwargs):
preprocess_kwargs = {}
if "maybe_arg" in kwargs:
@ -110,55 +84,61 @@ def _sanitize_parameters(self, **kwargs):
return preprocess_kwargs, {}, postprocess_kwargs
```
Try to keep the inputs/outputs very simple and ideally JSON-serializable as it makes the pipeline usage very easy
without requiring users to understand new kinds of objects. It's also relatively common to support many different types
of arguments for ease of use (audio files, which can be filenames, URLs or pure bytes)
Now the pipeline can return the top most likely labels if a user chooses to.
```py
from transformers import pipeline
pipeline = pipeline("my-task")
# returns 3 most likely labels
pipeline("This is the best meal I've ever had", top_k=3)
# returns 5 most likely labels by default
pipeline("This is the best meal I've ever had")
```
## Adding it to the list of supported tasks
## Register a pipeline
To register your `new-task` to the list of supported tasks, you have to add it to the `PIPELINE_REGISTRY`:
Register the new task your pipeline supports in the `PIPELINE_REGISTRY`. The registry defines:
```python
- the machine learning framework the pipeline supports with either `pt_model` or `tf_model` (add both to ensure it works with either frameworks)
- a default model which should come from a specific revision (branch, or commit hash) where the model works as expected with `default`
- the expected input with `type`
```py
from transformers.pipelines import PIPELINE_REGISTRY
from transformers import AutoModelForSequenceClassification, TFAutoModelForSequenceClassification
PIPELINE_REGISTRY.register_pipeline(
"new-task",
pipeline_class=MyPipeline,
pt_model=AutoModelForSequenceClassification,
tf_model=TFAutoModelForSequenceClassification,
default={"pt": ("user/awesome-model", "branch-name")},
type="text",
)
```
You can specify a default model if you want, in which case it should come with a specific revision (which can be the name of a branch or a commit hash, here we took `"abcdef"`) as well as the type:
## Share your pipeline
```python
PIPELINE_REGISTRY.register_pipeline(
"new-task",
pipeline_class=MyPipeline,
pt_model=AutoModelForSequenceClassification,
default={"pt": ("user/awesome_model", "abcdef")},
type="text", # current support type: text, audio, image, multimodal
)
```
Share your pipeline with the community on the [Hub](https://hf.co) or you can add it directly to Transformers.
## Share your pipeline on the Hub
It's faster to upload your pipeline code to the Hub because it doesn't require a review from the Transformers team. Adding the pipeline to Transformers may be slower because it requires a review and you need to add tests to ensure your [`Pipeline`] works.
To share your custom pipeline on the Hub, you just have to save the custom code of your `Pipeline` subclass in a
python file. For instance, let's say we want to use a custom pipeline for sentence pair classification like this:
### Upload to the Hub
Add your pipeline code to the Hub in a Python file.
For example, a custom pipeline for sentence pair classification might look like the following code below. The implementation works for PyTorch and TensorFlow models.
```py
import numpy as np
from transformers import Pipeline
def softmax(outputs):
maxes = np.max(outputs, axis=-1, keepdims=True)
shifted_exp = np.exp(outputs - maxes)
return shifted_exp / shifted_exp.sum(axis=-1, keepdims=True)
class PairClassificationPipeline(Pipeline):
def _sanitize_parameters(self, **kwargs):
preprocess_kwargs = {}
@ -183,8 +163,7 @@ class PairClassificationPipeline(Pipeline):
return {"label": label, "score": score, "logits": logits}
```
The implementation is framework agnostic, and will work for PyTorch and TensorFlow models. If we have saved this in
a file named `pair_classification.py`, we can then import it and register it like this:
Save the code in a file named `pair_classification.py`, and import and register it as shown below.
```py
from pair_classification import PairClassificationPipeline
@ -199,56 +178,52 @@ PIPELINE_REGISTRY.register_pipeline(
)
```
Once this is done, we can use it with a pretrained model. For instance `sgugger/finetuned-bert-mrpc` has been
fine-tuned on the MRPC dataset, which classifies pairs of sentences as paraphrases or not.
The [register_pipeline](https://github.com/huggingface/transformers/blob/9feae5fb0164e89d4998e5776897c16f7330d3df/src/transformers/pipelines/base.py#L1387) function registers the pipeline details (task type, pipeline class, supported backends) to a models `config.json` file.
```json
"custom_pipelines": {
"pair-classification": {
"impl": "pair_classification.PairClassificationPipeline",
"pt": [
"AutoModelForSequenceClassification"
],
"tf": [
"TFAutoModelForSequenceClassification"
],
}
},
```
Call [`~Pipeline.push_to_hub`] to push the pipeline to the Hub. The Python file containing the code is copied to the Hub, and the pipelines model and tokenizer are also saved and pushed to the Hub. Your pipeline should now be available on the Hub under your namespace.
```py
from transformers import pipeline
classifier = pipeline("pair-classification", model="sgugger/finetuned-bert-mrpc")
pipeline = pipeline(task="pair-classification", model="sgugger/finetuned-bert-mrpc")
pipeline.push_to_hub("pair-classification-pipeline")
```
Then we can share it on the Hub by using the `push_to_hub` method:
```py
classifier.push_to_hub("test-dynamic-pipeline")
```
This will copy the file where you defined `PairClassificationPipeline` inside the folder `"test-dynamic-pipeline"`,
along with saving the model and tokenizer of the pipeline, before pushing everything into the repository
`{your_username}/test-dynamic-pipeline`. After that, anyone can use it as long as they provide the option
`trust_remote_code=True`:
To use the pipeline, add `trust_remote_code=True` when loading the pipeline.
```py
from transformers import pipeline
classifier = pipeline(model="{your_username}/test-dynamic-pipeline", trust_remote_code=True)
pipeline = pipeline(task="pair-classification", trust_remote_code=True)
```
## Add the pipeline to 🤗 Transformers
### Add to Transformers
If you want to contribute your pipeline to 🤗 Transformers, you will need to add a new module in the `pipelines` submodule
with the code of your pipeline, then add it to the list of tasks defined in `pipelines/__init__.py`.
Adding a custom pipeline to Transformers requires adding tests to make sure everything works as expected, and requesting a review from the Transformers team.
Then you will need to add tests. Create a new file `tests/test_pipelines_MY_PIPELINE.py` with examples of the other tests.
Add your pipeline code as a new module to the [pipelines](https://github.com/huggingface/transformers/tree/main/src/transformers/pipelines) submodule, and add it to the list of tasks defined in [pipelines/__init__.py](https://github.com/huggingface/transformers/blob/main/src/transformers/pipelines/__init__.py).
The `run_pipeline_test` function will be very generic and run on small random models on every possible
architecture as defined by `model_mapping` and `tf_model_mapping`.
Next, add a new test for the pipeline in [transformers/tests/pipelines](https://github.com/huggingface/transformers/tree/main/tests/pipelines). You can look at the other tests for examples of how to test your pipeline.
This is very important to test future compatibility, meaning if someone adds a new model for
`XXXForQuestionAnswering` then the pipeline test will attempt to run on it. Because the models are random it's
impossible to check for actual values, that's why there is a helper `ANY` that will simply attempt to match the
output of the pipeline TYPE.
The [run_pipeline_test](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_text_classification.py#L186) function should be very generic and run on the models defined in [model_mapping](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_text_classification.py#L48) and [tf_model_mapping](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_text_classification.py#L49). This is important for testing future compatibility with new models.
You also *need* to implement 2 (ideally 4) tests.
You'll also notice `ANY` is used throughout the [run_pipeline_test](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_text_classification.py#L186) function. The models are random, so you can't check the actual values. Using `ANY` allows the test to match the output of the pipeline type instead.
- `test_small_model_pt` : Define 1 small model for this pipeline (doesn't matter if the results don't make sense)
and test the pipeline outputs. The results should be the same as `test_small_model_tf`.
- `test_small_model_tf` : Define 1 small model for this pipeline (doesn't matter if the results don't make sense)
and test the pipeline outputs. The results should be the same as `test_small_model_pt`.
- `test_large_model_pt` (`optional`): Tests the pipeline on a real pipeline where the results are supposed to
make sense. These tests are slow and should be marked as such. Here the goal is to showcase the pipeline and to make
sure there is no drift in future releases.
- `test_large_model_tf` (`optional`): Tests the pipeline on a real pipeline where the results are supposed to
make sense. These tests are slow and should be marked as such. Here the goal is to showcase the pipeline and to make
sure there is no drift in future releases.
Finally, you should also implement the following 4 tests.
1. [test_small_model_pt](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_text_classification.py#L59) and [test_small_model_tf](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_text_classification.py#L150), use a small model for these pipelines to make sure they return the correct outputs. The results don't have to make sense. Each pipeline should return the same result.
1. [test_large_model_pt](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_zero_shot_image_classification.py#L187) nad [test_large_model_tf](https://github.com/huggingface/transformers/blob/db70426854fe7850f2c5834d633aff637f14772e/tests/pipelines/test_pipelines_zero_shot_image_classification.py#L220), use a realistic model for these pipelines to make sure they return meaningful results. These tests are slow and should be marked as slow.

View File

@ -13,211 +13,135 @@ specific language governing permissions and limitations under the License.
rendered properly in your Markdown viewer.
-->
# Agents and tools
> [!WARNING]
> Agents and tools are being spun out into the standalone [smolagents](https://huggingface.co/docs/smolagents/index) library. These docs will be deprecated in the future!
# Agents
[[open-in-colab]]
### What is an agent?
An agent is a system where a large language model (LLM) can execute more complex tasks through *planning* and using *tools*.
Large Language Models (LLMs) trained to perform [causal language modeling](./tasks/language_modeling) can tackle a wide range of tasks, but they often struggle with basic tasks like logic, calculation, and search. When prompted in domains in which they do not perform well, they often fail to generate the answer we expect them to.
- Planning helps a LLM reason its way through a task by breaking it down into smaller subtasks. For example, [`CodeAgent`] plans a series of actions to take and then generates Python code to execute all the actions at once.
One approach to overcome this weakness is to create an *agent*.
Another planning method is by self-reflection and refinement of its previous actions to improve its performance. The [`ReactJsonAgent`] is an example of this type of planning, and it's based on the [ReAct](https://hf.co/papers/2210.03629) framework. This agent plans and executes actions one at a time based on the feedback it receives from each action.
An agent is a system that uses an LLM as its engine, and it has access to functions called *tools*.
- Tools give a LLM access to external functions or APIs that it can use to help it complete a task. For example, [gradio-tools](https://github.com/freddyaboulton/gradio-tools) gives a LLM access to any of the [Gradio](https://www.gradio.app/) apps available on Hugging Face [Spaces](https://hf.co/spaces). These apps can be used for a wide range of tasks such as image generation, video generation, audio transcription, and more.
These *tools* are functions for performing a task, and they contain all necessary description for the agent to properly use them.
The agent can be programmed to:
- devise a series of actions/tools and run them all at once, like the [`CodeAgent`]
- plan and execute actions/tools one by one and wait for the outcome of each action before launching the next one, like the [`ReactJsonAgent`]
### Types of agents
#### Code agent
This agent has a planning step, then generates python code to execute all its actions at once. It natively handles different input and output types for its tools, thus it is the recommended choice for multimodal tasks.
#### React agents
This is the go-to agent to solve reasoning tasks, since the ReAct framework ([Yao et al., 2022](https://huggingface.co/papers/2210.03629)) makes it really efficient to think on the basis of its previous observations.
We implement two versions of ReactJsonAgent:
- [`ReactJsonAgent`] generates tool calls as a JSON in its output.
- [`ReactCodeAgent`] is a new type of ReactJsonAgent that generates its tool calls as blobs of code, which works really well for LLMs that have strong coding performance.
> [!TIP]
> Read [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) blog post to learn more about ReAct agents.
<div class="flex justify-center">
<img
class="block dark:hidden"
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif"
/>
<img
class="hidden dark:block"
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif"
/>
</div>
![Framework of a React Agent](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/open-source-llms-as-agents/ReAct.png)
For example, here is how a ReAct Code agent would work its way through the following question.
```py3
>>> agent.run(
... "How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?",
... )
=====New task=====
How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?
====Agent is executing the code below:
bert_blocks = search(query="number of blocks in BERT base encoder")
print("BERT blocks:", bert_blocks)
====
Print outputs:
BERT blocks: twelve encoder blocks
====Agent is executing the code below:
attention_layer = search(query="number of layers in Attention is All You Need")
print("Attention layers:", attention_layer)
====
Print outputs:
Attention layers: Encoder: The encoder is composed of a stack of N = 6 identical layers. Each layer has two sub-layers. The first is a multi-head self-attention mechanism, and the second is a simple, position- 2 Page 3 Figure 1: The Transformer - model architecture.
====Agent is executing the code below:
bert_blocks = 12
attention_layers = 6
diff = bert_blocks - attention_layers
print("Difference in blocks:", diff)
final_answer(diff)
====
Print outputs:
Difference in blocks: 6
Final answer: 6
```
### How can I build an agent?
To initialize an agent, you need these arguments:
- an LLM to power your agent - the agent is not exactly the LLM, its more like the agent is a program that uses an LLM as its engine.
- a system prompt: what the LLM engine will be prompted with to generate its output
- a toolbox from which the agent pick tools to execute
- a parser to extract from the LLM output which tools are to call and with which arguments
Upon initialization of the agent system, the tool attributes are used to generate a tool description, then baked into the agents `system_prompt` to let it know which tools it can use and why.
To start with, please install the `agents` extras in order to install all default dependencies.
To use agents in Transformers, make sure you have the extra `agents` dependencies installed.
```bash
pip install transformers[agents]
!pip install transformers[agents]
```
Build your LLM engine by defining a `llm_engine` method which accepts a list of [messages](./chat_templating) and returns text. This callable also needs to accept a `stop` argument that indicates when to stop generating.
Create an agent instance (refer to the [Agents](./main_classes/agent#agents) API for supported agents in Transformers) and a list of tools available for it to use, then [`~ReactAgent.run`] the agent on your task. The example below demonstrates how a ReAct agent reasons through a task.
```python
from huggingface_hub import login, InferenceClient
```py
from transformers import ReactCodeAgent
login("<YOUR_HUGGINGFACEHUB_API_TOKEN>")
agent = ReactCodeAgent(tools=[])
agent.run(
"How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?",
)
```
client = InferenceClient(model="meta-llama/Meta-Llama-3-70B-Instruct")
```bash
======== New task ========
How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?
==== Agent is executing the code below:
bert_layers = 12 # BERT base encoder has 12 layers
attention_layers = 6 # Encoder in Attention is All You Need has 6 layers
layer_diff = bert_layers - attention_layers
print("The difference in layers between BERT base encoder and Attention is All You Need is", layer_diff)
====
Print outputs:
The difference in layers between BERT base encoder and Attention is All You Need is 6
==== Agent is executing the code below:
final_answer("BERT base encoder has {} more layers than the encoder from Attention is All You Need.".format(layer_diff))
====
Print outputs:
>>> Final answer:
BERT base encoder has 6 more layers than the encoder from Attention is All You Need.
```
This guide will walk you through in more detail how to initialize an agent.
## LLM
An agent uses a LLM to plan and execute a task; it is the engine that powers the agent. To choose and build your own LLM engine, you need a method that:
1. the input uses the [chat template](./chat_templating) format, `List[Dict[str, str]]`, and it returns a string
2. the LLM stops generating outputs when it encounters the sequences in `stop_sequences`
```py
def llm_engine(messages, stop_sequences=["Task"]) -> str:
response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
answer = response.choices[0].message.content
return answer
```
You could use any `llm_engine` method as long as:
1. it follows the [messages format](./chat_templating) (`List[Dict[str, str]]`) for its input `messages`, and it returns a `str`.
2. it stops generating outputs at the sequences passed in the argument `stop_sequences`
Next, initialize an engine to load a model. To run an agent locally, create a [`TransformersEngine`] to load a preinitialized [`Pipeline`].
Additionally, `llm_engine` can also take a `grammar` argument. In the case where you specify a `grammar` upon agent initialization, this argument will be passed to the calls to llm_engine, with the `grammar` that you defined upon initialization, to allow [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) in order to force properly-formatted agent outputs.
However, you could also leverage Hugging Face's powerful inference infrastructure, [Inference API](https://hf.co/docs/api-inference/index) or [Inference Endpoints](https://hf.co/docs/inference-endpoints/index), to run your model. This is useful for loading larger models that are typically required for agentic behavior. In this case, load the [`HfApiEngine`] to run the agent.
You will also need a `tools` argument which accepts a list of `Tools` - it can be an empty list. You can also add the default toolbox on top of your `tools` list by defining the optional argument `add_base_tools=True`.
The agent requires a list of tools it can use to complete a task. If you aren't using any additional tools, pass an empty list. The default tools provided by Transformers are loaded automatically, but you can optionally set `add_base_tools=True` to explicitly enable them.
Now you can create an agent, like [`CodeAgent`], and run it. You can also create a [`TransformersEngine`] with a pre-initialized pipeline to run inference on your local machine using `transformers`.
For convenience, since agentic behaviours generally require stronger models such as `Llama-3.1-70B-Instruct` that are harder to run locally for now, we also provide the [`HfApiEngine`] class that initializes a `huggingface_hub.InferenceClient` under the hood.
<hfoptions id="engine">
<hfoption id="TransformersEngine">
```python
```py
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, TransformersEngine, CodeAgent
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct").to("cuda")
pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)
llm_engine = TransformersEngine(pipeline)
agent = CodeAgent(tools=[], llm_engine=llm_engine)
agent.run(
"What causes bread to rise?",
)
```
</hfoption>
<hfoption id="HfApiEngine">
```py
from transformers import CodeAgent, HfApiEngine
llm_engine = HfApiEngine(model="meta-llama/Meta-Llama-3-70B-Instruct")
agent = CodeAgent(tools=[], llm_engine=llm_engine, add_base_tools=True)
agent = CodeAgent(tools=[], llm_engine=llm_engine)
agent.run(
"Could you translate this sentence from French, say it out loud and return the audio.",
sentence="Où est la boulangerie la plus proche?",
)
```
This will be handy in case of emergency baguette need!
You can even leave the argument `llm_engine` undefined, and an [`HfApiEngine`] will be created by default.
</hfoption>
</hfoptions>
```python
from transformers import CodeAgent
The agent supports [constrained generation](https://hf.co/docs/text-generation-inference/conceptual/guidance) for generating outputs according to a specific structure with the `grammar` parameter. The `grammar` parameter should be specified in the `llm_engine` method or you can set it when initializing an agent.
agent = CodeAgent(tools=[], add_base_tools=True)
agent.run(
"Could you translate this sentence from French, say it out loud and give me the audio.",
sentence="Où est la boulangerie la plus proche?",
)
```
Note that we used an additional `sentence` argument: you can pass text as additional arguments to the model.
You can also use this to indicate the path to local or remote files for the model to use:
Lastly, an agent accepts additional inputs such as text and audio. In the [`HfApiEngine`] example above, the agent accepted a sentence to translate. But you could also pass a path to a local or remote file for the agent to access. The example below demonstrates how to pass a path to an audio file.
```py
from transformers import ReactCodeAgent
agent = ReactCodeAgent(tools=[], llm_engine=llm_engine, add_base_tools=True)
agent.run("Why does Mike not know many people in New York?", audio="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3")
agent = ReactCodeAgent(tools=[], llm_engine=llm_engine)
agent.run("Why doesn't he know many people in New York?", audio="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3")
```
## System prompt
The prompt and output parser were automatically defined, but you can easily inspect them by calling the `system_prompt_template` on your agent.
A system prompt describes how an agent should behave, a description of the available tools, and the expected output format.
```python
print(agent.system_prompt_template)
```
Tools are defined by the `<<tool_descriptions>>` token which is dynamically replaced during runtime with the actual tool. The tool description is derived from the tool name, description, inputs, output type, and a Jinja2 template. Refer to the [Tools](./tools) guide for more information about how to describe tools.
It's important to explain as clearly as possible the task you want to perform.
Every [`~Agent.run`] operation is independent, and since an agent is powered by an LLM, minor variations in your prompt might yield completely different results.
You can also run an agent consecutively for different tasks: each time the attributes `agent.task` and `agent.logs` will be re-initialized.
#### Code execution
A Python interpreter executes the code on a set of inputs passed along with your tools.
This should be safe because the only functions that can be called are the tools you provided (especially if it's only tools by Hugging Face) and the print function, so you're already limited in what can be executed.
The Python interpreter also doesn't allow imports by default outside of a safe list, so all the most obvious attacks shouldn't be an issue.
You can still authorize additional imports by passing the authorized modules as a list of strings in argument `additional_authorized_imports` upon initialization of your [`ReactCodeAgent`] or [`CodeAgent`]:
The example below is the system prompt for [`ReactCodeAgent`].
```py
>>> from transformers import ReactCodeAgent
>>> agent = ReactCodeAgent(tools=[], additional_authorized_imports=['requests', 'bs4'])
>>> agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?")
(...)
'Hugging Face Blog'
```
The execution will stop at any code trying to perform an illegal operation or if there is a regular Python error with the code generated by the agent.
> [!WARNING]
> The LLM can generate arbitrary code that will then be executed: do not add any unsafe imports!
### The system prompt
An agent, or rather the LLM that drives the agent, generates an output based on the system prompt. The system prompt can be customized and tailored to the intended task. For example, check the system prompt for the [`ReactCodeAgent`] (below version is slightly simplified).
```text
You will be given a task to solve as best you can.
You have access to the following tools:
<<tool_descriptions>>
@ -225,7 +149,7 @@ You have access to the following tools:
To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.
At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task, then the tools that you want to use.
Then in the 'Code:' sequence, you shold write the code in simple Python. The code sequence must end with '/End code' sequence.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '/End code' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need.
These print outputs will then be available in the 'Observation:' field, for using this information as input for the next step.
@ -235,7 +159,7 @@ Here are a few examples using notional tools:
---
{examples}
Above example were using notional tools that might not exist for you. You only have acces to those tools:
Above example were using notional tools that might not exist for you. You only have access to those tools:
<<tool_names>>
You also can perform computations in the python code you generate.
@ -249,183 +173,125 @@ Remember to make sure that variables you use are all defined.
Now Begin!
```
The system prompt includes:
- An *introduction* that explains how the agent should behave and what tools are.
- A description of all the tools that is defined by a `<<tool_descriptions>>` token that is dynamically replaced at runtime with the tools defined/chosen by the user.
- The tool description comes from the tool attributes, `name`, `description`, `inputs` and `output_type`, and a simple `jinja2` template that you can refine.
- The expected output format.
The system prompt can be tailored to the intended task. For example, you can add a better explanation of the output format or you can overwrite the system prompt template entirely with your own custom system prompt as shown below.
You could improve the system prompt, for example, by adding an explanation of the output format.
> [!WARNING]
> If you're writing a custom system prompt, make sure to include `<<tool_descriptions>>` in the template so the agent is aware of the available tools.
For maximum flexibility, you can overwrite the whole system prompt template by passing your custom prompt as an argument to the `system_prompt` parameter.
```python
```py
from transformers import ReactJsonAgent
from transformers.agents import PythonInterpreterTool
agent = ReactJsonAgent(tools=[PythonInterpreterTool()], system_prompt="{your_custom_prompt}")
```
> [!WARNING]
> Please make sure to define the `<<tool_descriptions>>` string somewhere in the `template` so the agent is aware
of the available tools.
## Code execution
For safety, only the tools you provide (and the default Transformers tools) and the `print` function are executed. The interpreter doesn't allow importing modules that aren't on a safe list.
### Inspecting an agent run
Here are a few useful attributes to inspect what happened after a run:
- `agent.logs` stores the fine-grained logs of the agent. At every step of the agent's run, everything gets stored in a dictionary that then is appended to `agent.logs`.
- Running `agent.write_inner_memory_from_logs()` creates an inner memory of the agent's logs for the LLM to view, as a list of chat messages. This method goes over each step of the log and only stores what it's interested in as a message: for instance, it will save the system prompt and task in separate messages, then for each step it will store the LLM output as a message, and the tool call output as another message. Use this if you want a higher-level view of what has happened - but not every log will be transcripted by this method.
## Tools
A tool is an atomic function to be used by an agent.
You can for instance check the [`PythonInterpreterTool`]: it has a name, a description, input descriptions, an output type, and a `__call__` method to perform the action.
When the agent is initialized, the tool attributes are used to generate a tool description which is baked into the agent's system prompt. This lets the agent know which tools it can use and why.
### Default toolbox
Transformers comes with a default toolbox for empowering agents, that you can add to your agent upon initialization with argument `add_base_tools = True`:
- **Document question answering**: given a document (such as a PDF) in image format, answer a question on this document ([Donut](./model_doc/donut))
- **Image question answering**: given an image, answer a question on this image ([VILT](./model_doc/vilt))
- **Speech to text**: given an audio recording of a person talking, transcribe the speech into text ([Whisper](./model_doc/whisper))
- **Text to speech**: convert text to speech ([SpeechT5](./model_doc/speecht5))
- **Translation**: translates a given sentence from source language to target language.
- **DuckDuckGo search***: performs a web search using DuckDuckGo browser.
- **Python code interpreter**: runs your the LLM generated Python code in a secure environment. This tool will only be added to [`ReactJsonAgent`] if you initialize it with `add_base_tools=True`, since code-based agent can already natively execute Python code
You can manually use a tool by calling the [`load_tool`] function and a task to perform.
```python
from transformers import load_tool
tool = load_tool("text-to-speech")
audio = tool("This is a text to speech tool")
```
### Create a new tool
You can create your own tool for use cases not covered by the default tools from Hugging Face.
For example, let's create a tool that returns the most downloaded model for a given task from the Hub.
You'll start with the code below.
```python
from huggingface_hub import list_models
task = "text-classification"
model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
print(model.id)
```
This code can quickly be converted into a tool, just by wrapping it in a function and adding the `tool` decorator:
To import modules that aren't on the list, add them as a list to the `additional_authorized_imports` parameter when initializing an agent.
```py
from transformers import tool
from transformers import ReactCodeAgent
@tool
def model_download_tool(task: str) -> str:
"""
This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.
It returns the name of the checkpoint.
Args:
task: The task for which
"""
model = next(iter(list_models(filter="text-classification", sort="downloads", direction=-1)))
return model.id
agent = ReactCodeAgent(tools=[], additional_authorized_imports=['requests', 'bs4'])
agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?")
```
The function needs:
- A clear name. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's put `model_download_tool`.
- Type hints on both inputs and output
- A description, that includes an 'Args:' part where each argument is described (without a type indication this time, it will be pulled from the type hint).
All these will be automatically baked into the agent's system prompt upon initialization: so strive to make them as clear as possible!
> [!TIP]
> This definition format is the same as tool schemas used in `apply_chat_template`, the only difference is the added `tool` decorator: read more on our tool use API [here](https://huggingface.co/blog/unified-tool-use#passing-tools-to-a-chat-template).
Then you can directly initialize your agent:
```py
from transformers import CodeAgent
agent = CodeAgent(tools=[model_download_tool], llm_engine=llm_engine)
agent.run(
"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?"
)
```
You get the following:
```text
======== New task ========
Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?
==== Agent is executing the code below:
most_downloaded_model = model_download_tool(task="text-to-video")
print(f"The most downloaded model for the 'text-to-video' task is {most_downloaded_model}.")
====
```
And the output:
`"The most downloaded model for the 'text-to-video' task is ByteDance/AnimateDiff-Lightning."`
### Manage your agent's toolbox
If you have already initialized an agent, it is inconvenient to reinitialize it from scratch with a tool you want to use. With Transformers, you can manage an agent's toolbox by adding or replacing a tool.
Let's add the `model_download_tool` to an existing agent initialized with only the default toolbox.
```python
from transformers import CodeAgent
agent = CodeAgent(tools=[], llm_engine=llm_engine, add_base_tools=True)
agent.toolbox.add_tool(model_download_tool)
```
Now we can leverage both the new tool and the previous text-to-speech tool:
```python
agent.run(
"Can you read out loud the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub and return the audio?"
)
```
| **Audio** |
|------------------------------------------------------------------------------------------------------------------------------------------------------|
| <audio controls><source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/damo.wav" type="audio/wav"/> |
Code execution stops if a tool isn't on the safe list, it isn't authorized, or if the code generated by the agent returns a Python error.
> [!WARNING]
> Beware when adding tools to an agent that already works well because it can bias selection towards your tool or select another tool other than the one already defined.
> A LLM can generate any arbitrary code that can be executed, so don't add any unsafe imports!
## Multi-agent
Use the `agent.toolbox.update_tool()` method to replace an existing tool in the agent's toolbox.
This is useful if your new tool is a one-to-one replacement of the existing tool because the agent already knows how to perform that specific task.
Just make sure the new tool follows the same API as the replaced tool or adapt the system prompt template to ensure all examples using the replaced tool are updated.
[Multi-agent](https://hf.co/papers/2308.08155) refers to multiple agents working together to solve a task. Performance is typically better because each agent is specialized for a particular subtask.
Multi-agents are created through a [`ManagedAgent`] class, where a *manager agent* oversees how other agents work together. The manager agent requires an agent and their name and description. These are added to the manager agents system prompt which lets it know how to call and use them.
### Use a collection of tools
You can leverage tool collections by using the ToolCollection object, with the slug of the collection you want to use.
Then pass them as a list to initialize you agent, and start using them!
The multi-agent example below creates a web search agent that is managed by another [`ReactCodeAgent`].
```py
from transformers import ToolCollection, ReactCodeAgent
from transformers.agents import ReactCodeAgent, HfApiEngine, DuckDuckGoSearchTool, ManagedAgent
image_tool_collection = ToolCollection(collection_slug="huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f")
agent = ReactCodeAgent(tools=[*image_tool_collection.tools], add_base_tools=True)
agent.run("Please draw me a picture of rivers and lakes.")
llm_engine = HfApiEngine()
web_agent = ReactCodeAgent(tools=[DuckDuckGoSearchTool()], llm_engine=llm_engine)
managed_web_agent = ManagedAgent(
agent=web_agent,
name="web_search",
description="Runs web searches for you. Give it your query as an argument."
)
manager_agent = ReactCodeAgent(
tools=[], llm_engine=llm_engine, managed_agents=[managed_web_agent]
)
manager_agent.run("Who is the CEO of Hugging Face?")
```
To speed up the start, tools are loaded only if called by the agent.
## Gradio integration
This gets you this image:
[Gradio](https://www.gradio.app/) is a library for quickly creating and sharing machine learning apps. The [gradio.Chatbot](https://www.gradio.app/docs/gradio/chatbot) supports chatting with a Transformers agent with the [`stream_to_gradio`] function.
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png">
Load a tool and LLM with an agent, and then create a Gradio app. The key is to use [`stream_to_gradio`] to stream the agents messages and display how it's reasoning through a task.
```py
import gradio as gr
from transformers import (
load_tool,
ReactCodeAgent,
HfApiEngine,
stream_to_gradio,
)
# Import tool from Hub
image_generation_tool = load_tool("m-ric/text-to-image")
llm_engine = HfApiEngine("meta-llama/Meta-Llama-3-70B-Instruct")
# Initialize the agent with the image generation tool
agent = ReactCodeAgent(tools=[image_generation_tool], llm_engine=llm_engine)
def interact_with_agent(task):
messages = []
messages.append(gr.ChatMessage(role="user", content=task))
yield messages
for msg in stream_to_gradio(agent, task):
messages.append(msg)
yield messages + [
gr.ChatMessage(role="assistant", content="⏳ Task not finished yet!")
]
yield messages
with gr.Blocks() as demo:
text_input = gr.Textbox(lines=1, label="Chat Message", value="Make me a picture of the Statue of Liberty.")
submit = gr.Button("Run illustrator agent!")
chatbot = gr.Chatbot(
label="Agent",
type="messages",
avatar_images=(
None,
"https://em-content.zobj.net/source/twitter/53/robot-face_1f916.png",
),
)
submit.click(interact_with_agent, [text_input], [chatbot])
if __name__ == "__main__":
demo.launch()
```
## Troubleshoot
For a better idea of what is happening when you call an agent, it is always a good idea to check the system prompt template first.
```py
print(agent.system_prompt_template)
```
If the agent is behaving unexpectedly, remember to explain the task you want to perform as clearly as possible. Every [`~Agent.run`] is different and minor variations in your system prompt may yield completely different results.
To find out what happened after a run, check the following agent attributes.
- `agent.logs` stores the finegrained agent logs. At every step of the agents run, everything is stored in a dictionary and appended to `agent.logs`.
- `agent.write_inner_memory_from_logs` only stores a high-level overview of the agents run. For example, at each step, it stores the LLM output as a message and the tool call output as a separate message. Not every detail from a step is transcripted by `write_inner_memory_from_logs`.
## Resources
Learn more about ReAct agents in the [Open-source LLMs as LangChain Agents](https://hf.co/blog/open-source-llms-as-agents) blog post.

Some files were not shown because too many files have changed in this diff Show More