Compare commits

...

825 Commits

SHA1 Message Date
8e3fbca4cd try 2025-02-14 16:02:47 +01:00
15ec971b8e Prepare processors for VideoLLMs (#36149)
* allow processor to preprocess conversation + video metadata

* allow callable

* add test

* fix test

* nit: fix

* add metadata frames_indices

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* port updates from Orr and add one more test

* Update src/transformers/processing_utils.py

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>

* typo

* as dataclass

* style

* docstring + make sure tests green

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-02-14 11:34:08 +01:00
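For illustration, the "video metadata as dataclass" idea above might look like the minimal sketch below; the field names (`total_num_frames`, `fps`, `duration`, `frames_indices`) are assumptions for this sketch, not the exact upstream definition.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VideoMetadata:
    """Illustrative container for video metadata passed to a processor."""
    total_num_frames: Optional[int] = None      # frames in the source video (assumed field)
    fps: Optional[float] = None                 # source frame rate (assumed field)
    duration: Optional[float] = None            # clip length in seconds (assumed field)
    frames_indices: Optional[list] = None       # indices of the sampled frames (assumed field)
```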
33d1d715b0 Add ImageProcessorFast to Qwen2.5-VL processor (#36164)
* add qwen2 fast image processor to modular file

Signed-off-by: isotr0py <2037008807@qq.com>

* fix modular

Signed-off-by: isotr0py <2037008807@qq.com>

* fix circle import

Signed-off-by: isotr0py <2037008807@qq.com>

* add docs

Signed-off-by: isotr0py <2037008807@qq.com>

* fix typo

Signed-off-by: isotr0py <2037008807@qq.com>

* add modular generated files

Signed-off-by: isotr0py <2037008807@qq.com>

* revert qwen2vl fast image processor

Signed-off-by: isotr0py <2037008807@qq.com>

* remove qwen2.5-vl image processor from modular

Signed-off-by: isotr0py <2037008807@qq.com>

* re-generate qwen2.5-vl files

Signed-off-by: isotr0py <2037008807@qq.com>

* remove unnecessary test

Signed-off-by: isotr0py <2037008807@qq.com>

* fix auto map

Signed-off-by: isotr0py <2037008807@qq.com>

* cleanup

Signed-off-by: isotr0py <2037008807@qq.com>

* fix model_input_names

Signed-off-by: isotr0py <2037008807@qq.com>

* remove import

Signed-off-by: isotr0py <2037008807@qq.com>

* make fix-copies

Signed-off-by: isotr0py <2037008807@qq.com>

---------

Signed-off-by: isotr0py <2037008807@qq.com>
2025-02-14 17:34:55 +08:00
1931a35140 Chat template docs (#36163)
* decompose chat template docs

* add docs

* update model docs

* qwen2-5

* pixtral

* remove old chat template

* also support video as a list of frames

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_template_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* remove audio for now

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-14 10:32:14 +01:00
3bf02cf440 CI: fix test-save-trainer (#36191)
* fix

* also the docstring
2025-02-14 10:20:56 +01:00
0ae93d31ce Add support for partial rotary embeddings in Phi3 model (#35947)
* Added support for partial_rotary_factor

* addressed comments

* refactored
2025-02-14 09:37:38 +01:00
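A partial rotary factor applies rotary position embeddings to only the leading fraction of each head's dimensions and passes the rest through unrotated. A minimal sketch, assuming `cos`/`sin` are already broadcast to the query's shape (not the exact Phi3 implementation):

```python
import torch

def apply_partial_rotary(q: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor,
                         partial_rotary_factor: float = 0.5) -> torch.Tensor:
    # Rotate only the first `rotary_dim` channels of the head dimension.
    rotary_dim = int(q.shape[-1] * partial_rotary_factor)
    q_rot, q_pass = q[..., :rotary_dim], q[..., rotary_dim:]
    x1, x2 = q_rot.chunk(2, dim=-1)
    rotated = torch.cat((-x2, x1), dim=-1)            # rotate_half on the rotated slice
    q_rot = q_rot * cos[..., :rotary_dim] + rotated * sin[..., :rotary_dim]
    return torch.cat((q_rot, q_pass), dim=-1)         # unrotated channels pass through
```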
336dc69d63 Uniformize OwlViT and Owlv2 processors (#35700)
* uniformize owlvit processor

* uniformize owlv2

* nit

* add positional arg test owlvit

* run-slow: owlvit, owlv2

* run-slow: owlvit, owlv2

* remove one letter variable
2025-02-13 17:30:26 -05:00
e6a7981711 Fix make_batched_videos and add tests (#36143)
* add support for initial shift in video processing and other fixes

* revert modifications to video loading functions
2025-02-13 17:14:30 -05:00
8fd4bc7d1d Fix a mistake in #36175 (#36179)
fix my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 18:33:02 +01:00
b1a2de075d Follow up to SpQR integration (#36176)
fix
2025-02-13 17:40:59 +01:00
12962fe84b Fix the key name for _load_rng_state under torch.cuda (#36138)
fix load key name for _load_rng_state under torch.cuda

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 11:35:08 -05:00
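The bug class here is a key-name mismatch between saving and loading the CUDA RNG state. A sketch of matched save/load helpers, assuming Trainer-style checkpointing (the function names are hypothetical):

```python
import random
import numpy as np
import torch

def save_rng_state(path: str) -> None:
    rng_states = {
        "python": random.getstate(),
        "numpy": np.random.get_state(),
        "cpu": torch.random.get_rng_state(),
    }
    if torch.cuda.is_available():
        rng_states["cuda"] = torch.cuda.get_rng_state_all()  # one state per GPU
    torch.save(rng_states, path)

def load_rng_state(path: str) -> None:
    checkpoint = torch.load(path)
    random.setstate(checkpoint["python"])
    np.random.set_state(checkpoint["numpy"])
    torch.random.set_rng_state(checkpoint["cpu"])
    if torch.cuda.is_available() and "cuda" in checkpoint:   # key must match the save side
        torch.cuda.set_rng_state_all(checkpoint["cuda"])
```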
bfe46c98b5 Make check_repository_consistency run faster by MP (#36175)
* speeddddd

* speeddddd

* speeddddd

* speeddddd

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 17:25:17 +01:00
5f0fd1185b Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks (#35837)
* Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks

* Make rotary_pos_emb optional & fix type

* Adapt pre-computed cos/sin to Qwen2.5VL

* More concise
2025-02-13 17:10:58 +01:00
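The optimization hoists the rotary cos/sin computation out of the per-block loop: compute the embeddings once, then hand the same tensors to every ViT block. A sketch under that assumption (the block signature is hypothetical):

```python
import torch

def precompute_vision_rope(rotary_pos_emb: torch.Tensor):
    """Compute cos/sin once before the ViT blocks instead of inside each block."""
    emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1)
    return emb.cos(), emb.sin()

# usage sketch:
#   cos, sin = precompute_vision_rope(rotary_pos_emb)
#   for block in vision_blocks:
#       hidden_states = block(hidden_states, position_embeddings=(cos, sin))
```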
d72642bccc Use tqdm auto (#35726)
* Remove traces of the progressbar

* Use tqdm auto
2025-02-13 15:41:30 +00:00
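`tqdm.auto` selects the right progress-bar flavor for the environment (a notebook widget in Jupyter, a plain text bar otherwise), so call sites don't hard-code either:

```python
from tqdm.auto import tqdm  # resolves to the notebook or console implementation

for _ in tqdm(range(100), desc="processing"):
    pass
```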
62c7ea0201 CI: avoid human error, automatically infer generative models (#33212)
* tmp commit

* move tests to the right class

* remove ALL all_generative_model_classes = ...

* skip tf roberta

* skip InstructBlipForConditionalGenerationDecoderOnlyTest

* videollava

* reduce diff

* reduce diff

* remove  on vlms

* fix a few more

* manual rebase bits

* more manual rebase

* remove all manual generative model class test entries

* fix up to ernie

* a few more removals

* handle remaining cases

* recurrent gemma

* it's better here

* make fixup

* tf idefics is broken

* tf bert + generate is broken

* don't touch tf :()

* don't touch tf :(

* make fixup

* better comments for test skips

* revert tf changes

* remove empty line removal

* one more

* missing one
2025-02-13 16:27:11 +01:00
06231fdfc7 add disable compile option (#36161)
* add disable compile code

* fix
2025-02-13 16:24:46 +01:00
0ca7259217 fix training issues (#36158)
* fix training issues

* Update

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 16:24:28 +01:00
845b0a2616 Efficient Inference Kernel for SpQR (#34976)
* Resolve vptq conflict

* Rename spqr package to spqr_quant

* Get rid of aqlm mention

* Start working on tests

* Resolve ruff code checks

* Ruff format

* Isort

* Test updates

* Add gpu tag

* Rename to modules_to_not_convert

* Config update

* Docs and config update

* Docs and config update

* Update to update_torch_dtype

* spqr config parameter validation

* Ruff update

* Apply ruff fixes

* Test fixes

* Ruff update

* Mark tests as @slow again; Ruff; Docstring update

* Ruff

* Remove absolute path

* Resolve typo

* Remove redundant log

* Check accelerate/spqr availability

* Ruff fix

* Check if the config contains proper shapes

* Ruff test

* Documentation update

* overview update

* Ruff checks

* Ruff code quality

* Make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update spqr.md

* Enable gptqmodel (#35012)

* gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update readme

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix warning

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix version check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert unrelated changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix requires gptq

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format again

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attributes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix memory check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix result check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix : Nemotron Processor in GGUF conversion (#35708)

* fixing nemotron processor

* make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add missing TOC to doc

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-13 16:22:58 +01:00
c5506f4f00 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/adversarial (#36168)
Bump transformers in /examples/research_projects/adversarial

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 15:06:16 +00:00
d7c5d1b539 Bump transformers from 4.38.0 to 4.48.0 in /examples/tensorflow/language-modeling-tpu (#36167)
Bump transformers in /examples/tensorflow/language-modeling-tpu

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 14:46:38 +00:00
636ee57489 [generate] revert change in Aria: the maximum cache length must match max_length (#36120)
* revert inputs_embeds len

* Update test_utils.py

* make fixup
2025-02-13 14:36:33 +00:00
b41591d847 Fix : fix doc fp8 (#36173)
* fix

* fix
2025-02-13 15:29:59 +01:00
b079dd1fa2 Fix red CI (#36174)
test was weird
2025-02-13 14:27:55 +01:00
d114a6f78e [Modular] skip modular checks based on diff (#36130)
skip modular checks based on diff
2025-02-13 12:53:21 +00:00
6397916dd2 Remove loading custom kernel for RT-DETRv2 (#36098)
* Remove loading custom kernels

* Remove config param

* Fixup
2025-02-13 12:01:53 +00:00
efe72fe21f Adding FP8 Quantization to transformers (#36026)
* first commit

* adding kernels

* fix create_quantized_param

* fix quantization logic

* end2end

* fix style

* fix imports

* fix consistency

* update

* fix style

* update

* update after review

* make style

* update

* update

* fix

* update

* fix docstring

* update

* update after review

* update

* fix scheme

* update

* update

* fix

* update

* fix docstring

* add source

* fix test

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 13:01:19 +01:00
c82319b493 Helium documentation fixes (#36170)
* Helium documentation fixes

* Update helium.md

* Update helium.md

* Update helium.md
2025-02-13 12:20:53 +01:00
8f137b2427 Move DataCollatorForMultipleChoice from the docs to the package (#34763)
* Add implementation for DataCollatorForMultipleChoice based on docs.

* Add DataCollatorForMultipleChoice to import structure.

* Remove custom DataCollatorForMultipleChoice implementations from example scripts.

* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.

* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.

* Apply suggested changes and run make fixup.

* fix copies, style and fixup

* add missing documentation

* nits

* fix docstring

* style

* nits

* isort

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-02-13 12:01:28 +01:00
35c155052d Fix PretrainedTokenizerFast check => Fix PretrainedTokenizerFast Save (#35835)
* Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* Update tokenization_utils_base.py

* add tokenizer class type test

* code review

* code opt

* fix bug

* Update test_tokenization_fast.py

* ruff check

* make style

* code opt

* Update test_tokenization_fast.py

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
2025-02-13 12:00:33 +01:00
3c912c9089 docs: fix return type annotation of get_default_model_revision (#35982) 2025-02-13 11:59:15 +01:00
6a1ab634b6 qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 (#36083)
* qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1

* fix

* fix

* fix

* fix

* add tests

* fix test bugs

* fix

* fix failed tests

* fix
2025-02-13 11:35:28 +01:00
d419862889 Fix tests for vision models (#35654)
* Trigger tests

* [run-slow] beit, detr, dinov2, vit, textnet

* Fix BEiT interpolate_pos_encoding

* Fix DETR test

* Update DINOv2 test

* Fix textnet

* Fix vit

* Fix DPT

* fix data2vec test

* Fix textnet test

* Update interpolation check

* Fix ZoeDepth tests

* Update interpolate embeddings for BEiT

* Apply suggestions from code review
2025-02-13 10:28:37 +00:00
e60ae0d078 Replace deprecated update_repo_visibility (#35970) 2025-02-13 11:27:55 +01:00
9065cf0d92 Fix Gemma2 dtype issue when storing weights in float16 precision (#35398)
fix gemma2 dtype issue when storing weights in float16 precision
2025-02-13 11:17:37 +01:00
08ab1abff4 Add reminder config to issue template and print DS version in env (#35156)
* update env command to log deepspeed version

* suppress deepspeed import logging

* Add reminder to include configs to repro description in bug report.

* make fixup

* [WIP] update import utils for deepspeed

* Change to using is_deepspeed_available() from integrations.

* make fixup
2025-02-13 10:55:49 +01:00
950cfb0b4f Fix PaliGemma Pad Token Masking During Training #35855 (#35859)
* change order of unmasking of tokens

* library import

* class setup

* test function

* refactor

* add commit message

* test modified

* explicit initialisation of weights + made model smaller

* removed separate testing file

* fixup

* fixup core

* test attention mask with token types

* tests fixup

* removed PaliGemmaAttentionMaskTest class

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-13 10:11:44 +01:00
1614d196e8 Mllama fsdp (#36000)
* pixel input assignment revoked

* double send

* Update src/transformers/models/mllama/modeling_mllama.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-13 09:49:39 +01:00
847854b023 Add git LFS to AMD docker image (#36016)
Add git lfs to AMD docker image
2025-02-12 22:27:21 +01:00
9985d06add skip test_initialization for VitPoseBackboneModelTest for now (#36154)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-12 18:24:24 +01:00
4a5a7b991a Fix test fetcher (#36129)
* fix

* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-12 17:35:41 +01:00
1fae54c721 Add more rigorous non-slow grad accum tests (#35668)
* Add more rigorous non-slow grad accum tests

* Further nits

* Re-add space

* Readability

* Use tinystories instead

* Revert transformer diff

* tweak threshs
2025-02-12 10:26:21 -05:00
f869d486d3 Update doc re list of models supporting TP (#35864)
Update doc about models' TP support
2025-02-12 15:53:27 +01:00
281c0c8b5b adding option to save/reload scaler (#34932)
* Adding option to save/reload scaler

* Removing duplicate variable

* Adding save/reload test

* Small fixes on deterministic algorithm call

* Moving LLM test to another file to isolate its environment

* Moving back to old file and using subprocess to run test isolated

* Reverting back accidental change

* Reverting back accidental change
2025-02-12 15:48:16 +01:00
a33ac830af Fix multi gpu loss sync condition, add doc and test (#35743)
* Fix multi gpu loss sync condition, add doc and test

* rename function and class

* loss should not scale during inference

* fix typo
2025-02-12 15:41:31 +01:00
08c4959a23 Optim: APOLLO optimizer integration (#36062)
* Added APOLLO optimizer integration

* fix comment

* Remove redundancy: Modularize low-rank optimizer construction

* Remove redundancy: Remove useless comment

* Fix comment: Add typing

* Fix comment: Rewrite apollo desc
2025-02-12 15:33:43 +01:00
2440512723 multi-gpu: fix tensor device placements for various models (#35763)
* multi-gpu: fix inputs_embeds + position_embeds

Fixing the following errors in few models:
```
>       hidden_states = inputs_embeds + pos_embeds
E       RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
```

Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* multi-gpu: fix tensor device placements for various models

Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Apply make fix-copies

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-02-12 15:28:18 +01:00
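The shape of the fix for the error shown above: when layers are sharded across devices, move one operand to the other's device before combining them. A minimal sketch (the function name is hypothetical):

```python
import torch

def add_position_embeddings(inputs_embeds: torch.Tensor,
                            pos_embeds: torch.Tensor) -> torch.Tensor:
    # Align devices explicitly; with sharded layers the two tensors may live
    # on different GPUs (e.g. xpu:2 vs xpu:3) and raise the RuntimeError above.
    return inputs_embeds + pos_embeds.to(inputs_embeds.device)
```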
befea8c4f0 🚨 Remove cache migration script (#35810)
* Remove cache migration script

* remove dummy move_cache
2025-02-12 15:12:38 +01:00
d52a9d08ce Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142)
Bump cryptography in /examples/research_projects/decision_transformer

Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:34:52 +00:00
31e4831b98 Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/vqgan-clip (#36136)
Bump transformers in /examples/research_projects/vqgan-clip

Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:21:09 +00:00
243aeb7c4a Fix Gradient Checkpointing for Deberta & Deberta-V2 using PEFT / Adapters (#35898)
Replace In-Place Operations for Deberta and Deberta-V2
2025-02-12 14:21:01 +01:00
8a2f062eac [commands] remove deprecated/inoperational commands (#35718)
rm deprecated/inoperational commands
2025-02-12 12:23:58 +00:00
8fc6ecba4f VLM: enable skipped tests (#35746)
* fix cached tests

* fix some tests

* fix pix2struct

* fix
2025-02-12 12:55:46 +01:00
d6897b46bd Add utility for Reload Transformers imports cache for development workflow #35508 (#35858)
* Reload transformers fix from cache

* add imports

* add test fn for clearing import cache

* ruff fix to core import logic

* ruff fix to test file

* fixup for imports

* fixup for test

* lru restore

* test check

* fix style changes

* added documentation for use case

* fixing

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-12 12:45:11 +01:00
1cc7ca3295 Whisper: remove redundant assisted generation tests (#34814)
* remove redundant test

* delete another test

* revert default max_length

* (wrong place, moving)
2025-02-12 11:37:19 +00:00
0cd5e2dfd0 added warning to Trainer when label_names is not specified for PeftModel (#32085)
* feat: added warning to Trainer when label_names is not specified for PeftModel

* Update trainer.py

* feat: peft detect with `_is_peft_model`

* Update src/transformers/trainer.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Applied formatting in trainer.py

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-12 12:34:47 +01:00
377d8e2b9c add RAdamScheduleFree optimizer (#35313)
* add RAdamScheduleFree optimizer

* revert schedulefree version to the minimum requirement

* refine is_schedulefree_available so that it can take min_version

* refine documents

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-12 11:31:51 +01:00
f5fff672db Add pipeline parallel plan to PretrainedConfig and PreTrainedModel (#36091)
* Add `base_model_pp_plan` to `PretrainedConfig`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add `_pp_plan` to `PreTrainedModel`

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add both to Llama for testing

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Fix type error

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update to suggested schema

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* `_pp_plan` keys are not patterns

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Simplify schema

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Fix typing error

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update input name for Llama

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Aria

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Bamba

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Cohere 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to diffllama and emu3

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Gemma 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to GLM and GPT NeoX

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Granite and Helium

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Mistral and Mixtral

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to OLMo 1 & 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan to Phi and Phi 3

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add pp plan for Starcoder 2

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Add enum for accessing inputs and outputs

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Update type hints to use tuples

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

* Change outer list to tuple

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-02-12 10:51:48 +01:00
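The plan maps submodule names (keys are literal names, not patterns) to tuples of input and output tensor names, so a pipeline-parallel runtime knows how activations flow between stages. A hedged sketch of the schema for a Llama-like decoder; the exact entries are assumptions:

```python
# Hypothetical illustration of the schema: module name -> (inputs, outputs).
base_model_pp_plan = {
    "embed_tokens": (["input_ids"], ["inputs_embeds"]),
    "layers": (["hidden_states", "attention_mask"], ["hidden_states"]),
    "norm": (["hidden_states"], ["hidden_states"]),
}
```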
11afab19c0 [docs] update awq doc (#36079)
* update awq doc

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/awq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add note for inference

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-11 10:35:28 -08:00
9b69986e8a [docs] minor doc fix (#36127)
fix
2025-02-11 10:31:12 -08:00
1b57de8dcf Make output_dir Optional in TrainingArguments #27866 (#35735)
* make output_dir optional

* initiated a basic testing module to validate and verify the changes

* Test that output_dir defaults to 'tmp_trainer' when unspecified.

* test existing functionality of output_dir.

* test that output dir only created when needed

* final check

* added doc string and changed the tmp_trainer to trainer_output

* make style fixes to test file.

* another round of fixup

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-11 18:54:36 +01:00
03534a92f8 update tiktoken integ to use converted (#36135) 2025-02-11 18:27:22 +01:00
3a5c328fd8 Fix CI issues (#35662)
* make explicit gpu dep

* [run-slow] bamba
2025-02-11 18:17:01 +01:00
775252abd4 Fix max size deprecated warning (#34998)
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning

* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning

* Remove deprecated warnings and eliminate `max_size` usage

* Test use `int` as argument for `size`
Add a test to ensure it passes successfully and preserves backward compatibility

* The test pipelines still use `max_size`
Remove `max_size` from test pipelines and replace it with `size`, a `Dict` with `'shortest_edge'` and `'longest_edge'` as keys

* Reformatting

* Reformatting

* Revert "Reformatting"

This reverts commit c3040acee75440357cffd1f60c9d29ff5b2744b8.

* Revert "Reformatting"

This reverts commit ac4522e5c9a02d2d0c298295026db68ea26453df.

* Revert "The test pipelines still use `max_size`"

This reverts commit eaed96f041ffc32459536e1524d87f7a12ddee29.

* Revert "Test use `int` as argument for `size`"

This reverts commit 1925ee38c7c5eabb11832316712df1d4ba8043d0.

* Revert "Remove deprecated warnings and eliminate `max_size` usage"

This reverts commit d8e7e6ff9025931468fc1f3827cda1fa391003d5.

* Change version `4.26` to "a future version"

* Reformatting

* Revert "Change version `4.26` to "a future version""

This reverts commit 2b53f9e4
2025-02-11 18:14:31 +01:00
5489fea557 update awesome-transformers.md. (#36115)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-02-11 15:55:49 +00:00
76048be419 fix: typos in documentation files (#36122)
* Update tools.py

* Update text_generation.py

* Update question_answering.py
2025-02-11 13:47:20 +00:00
f42d46ccb4 Add common test for torch.export and fix some vision models (#35124)
* Add is_torch_greater_or_equal test decorator

* Add common test for torch.export

* Fix bit

* Fix focalnet

* Fix imagegpt

* Fix seggpt

* Fix swin2sr

* Enable torch.export test for vision models

* Enable test for video models

* Remove json

* Enable for hiera

* Enable for ijepa

* Fix detr

* Fic conditional_detr

* Fix maskformer

* Enable test maskformer

* Fix test for deformable detr

* Fix custom kernels for export in rt-detr and deformable-detr

* Enable test for all DPT

* Remove custom test for deformable detr

* Simplify test to use only kwargs for export

* Add comment

* Move compile_compatible_method_lru_cache to utils

* Fix beit export

* Fix deformable detr

* Fix copies data2vec<->beit

* Fix typos, update test to work with dict

* Add seed to the test

* Enable test for vit_mae

* Fix beit tests

* [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr

* Add vitpose test

* Add textnet test

* Add dinov2 with registers

* Update tests/test_modeling_common.py

* Switch to torch.testing.assert_close

* Fix masformer

* Remove save-load from test

* Add dab_detr

* Add depth_pro

* Fix and test RT-DETRv2

* Fix dab_detr
2025-02-11 11:37:31 +00:00
1779f5180e Fix nighlty CIs: missing atols (#35903)
fix some missing atols
2025-02-11 10:49:21 +01:00
1feebb5b41 AutoformerForPrediction test add atol (#36017) 2025-02-10 19:22:24 +01:00
be2ac0916a [generate] shape checks in tests compatible with fixed-length caches (+ some minor fixes) (#35993)
* shape checks compatible with static cache

* add test

* tmp

* manually turn on eager attn when we want to output attn

* typo

* generalize to encoder-decoder models

* force compilation on cpu

* tmp commit

* fix static cache shape checks

* models with odd caches

* fix copies

* shorter cache search loop

* use decoder_past_key_values everywhere

* better test variable names and comments

* signature

* rename _check_outputs into _check_generate_outputs

* add comments

* HybridCache future test note
2025-02-10 17:50:54 +00:00
9510ae39d9 fix bnb warning (#36116)
fix
2025-02-10 17:34:50 +01:00
09261ccf12 [Bugfix] fix file name of docstring in utils/check_table.py (#36108)
fix file name

Co-authored-by: kkscilife <qa-caif-cicd@pjlab.org.cn>
2025-02-10 15:48:02 +00:00
d4a6b4099b Revert checkpoint tmp dir (#36112)
* Revert "Fix OS err (#36094)"

This reverts commit ba29a439adbe6f371710d0514659127264ae24b3.

* Revert "Save checkpoint to temporary directory to handle partial saves during failures (#35580)"

This reverts commit 20d17358c468b7aefca9e54c3461eb88d1ee34f9.
2025-02-10 16:22:03 +01:00
0baf003915 Refactor OPT model (#36101)
* remove cross attention

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* remove is_decoder

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix pkv

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-02-10 14:27:16 +01:00
924f1c717a Remove Multi-threaded image conversion for fast image processors (#36105)
remove multithreaded image conversion

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 07:59:34 -05:00
3897f2caf8 Enable pytest live log and show warning logs on GitHub Actions CI runs (#35912)
* fix

* remove

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-10 13:36:20 +01:00
48a309d0d2 Support constant lr with cooldown (#35453)
* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add support for constant learning rate with cooldown

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and cooldown methods to 'get_wsc_schedule'

* Add more warmup and decay methods to 'get_wsd_schedule'

* support num_training_steps and num_stable_steps for get_wsd_schedule

* support num_training_steps and num_stable_steps for get_wsd_schedule

* get wsd scheduler before the `num_training_steps` decision

* fix code_quality

* Update stable branch logic

* fix code_quality

* Move stable stage decide to `get_wsd_schedule`

* Update docstring of `get_wsd_schedule`

* Update `num_train_steps` to optional

* Update `num_train_steps` to optional

* Update docstring of `get_wsd_schedule`

* Update src/transformers/optimization.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-10 13:21:55 +01:00
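A warmup-stable-decay (WSD) schedule ramps the learning rate up, holds it constant, then cools it down. A minimal sketch with `LambdaLR`, not the exact `get_wsd_schedule` implementation:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

def wsd_lambda(num_warmup: int, num_stable: int, num_decay: int):
    def fn(step: int) -> float:
        if step < num_warmup:                          # linear warmup
            return step / max(1, num_warmup)
        if step < num_warmup + num_stable:             # constant plateau
            return 1.0
        decay_step = step - num_warmup - num_stable    # linear cooldown to zero
        return max(0.0, 1.0 - decay_step / max(1, num_decay))
    return fn

model = torch.nn.Linear(2, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = LambdaLR(optimizer, lr_lambda=wsd_lambda(100, 800, 100))
```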
9a6be63fdb Add Apple's Depth-Pro for depth estimation (#34583)
* implement config and model building blocks

* refactor model architecture

* update model outputs

* update init param to include use_fov_model

* update param name in config

* fix hidden_states and attentions outputs for fov

* sort config

* complete minor todos

* update patching

* update config for encoder

* fix config

* use correct defaults in config

* update merge for compatibility with different image size

* restructure encoder for custom configuration

* make fov model compatible with custom config

* replace word "decoder" with "fusion"

* weight conversion script

* fix fov squeeze

* update conversion script (without test)

* upload ruff image processing

* create fast image processing

* use torch interpolation for image processing

* complete post_process_depth_estimation

* config: fix imports and sort args

* apply inference in weight conversion

* use mllama script instead for weight conversion

* clean weight conversion script

* add depth-pro status in other files

* fill docstring in config

* formatting

* more formatting

* formatting with ruff

* formatting with style

* fix copied classes

* add examples; update weight convert script

* fix using check_table.py and isort

* fix config docstring

* add depth pro to sdpa docs

* undo unintentional changes in configuration_gemma.py

* minor fixes

* test image processing

* fixes and tests

* more fixes

* use output states from image_encoder instead

* Revert "use output states from image_encoder instead"

This reverts commit 2408ec54e4f27d2abbecdb8374e58f34d91d8e96.

* make embeddings dynamic

* reshape output hidden states and attentions as part of computation graph

* fix ruff formating

* fix docstring failure

* use num_fov_head_layers in tests

* update doc

* check consistency with config

* ruff formatting

* update test case

* fix ruff formatting

* add tests for fov

* use interpolation in postprocess

* run and fix slow tests locally

* use scaled_images_features for image and fov encoder

* return fused_hidden_states in fusion stage

* fix example

* fix ruff

* fix copyright license for all files

* add __all__ for each file

* minor fixes
- fix download spell
- add push_to_hub option
- fix Optional type hinting
- apply single loop for DepthProImageProcessor.preprocess

* return list in post_process_depth_estimation

* minor fixes
- capitalize start of docstring
- use ignore copy
- fix examples
- move docstring templates and custom output classes to top
- remove "-> None" typehinting from __init__
- type hinting for forward passes
- fix docstrings for custom output classes

* fix "ruff check"

* update upsample and projection

* major changes: (image size and merge optimization)
- add support for images of any size
- optimize merge operation
- remove image_size from config
- use full names instead of B, C, H, W
- remove interpolation from fusion stage
- add interpolation after merge
- move validations to config
- update integration test
- add type hints for functions

* fix push_to_hub option in weights conversion

* remove image_size in weights conversion

* major changes in the architecture
- remove all DepthProViT modules and support different backbones using the AutoModel API
- set default use_fov_model to False
- validate parameters in configuration
- update interpolate function: use "nearest" for faster computation
- update reshape_feature function: remove all special tokens, possibly from different backbones
- update merge function: use padding from config instead of merge_out_size
- remove patch_to_batch and batch_to_patch conversions for now
- calculate out_size dynamically in the encoder
- leave head_mask calculation to the backbone
- fix bugs with merge
- add more comments
- update tests

* placeholder for unused config attributes

* improve docs amid review

* minor change in docs

* further optimize merge

* fix formatting

* remove unused patch/batch convertion functions

* use original F.interpolate

* improve function naming

* minor chages
- use torch_int instead of int
- use proper for newly initialized tensors
- use user provided return_dict for patch_encoder
- use if-else block instead in self.use_fov_model

* rearchitect upsample block for improved modularity

* update upsample keys in weight conversion

* improve padding in merge_patches

* use double-loop for merge

* update comments

* create feature_extractor, reduce some forward code

* introduce config.use_mask_token in dinov2

* minor fixes

* minor fixes for onnx

* update __init__ to latest format

* remove DepthProConfig.to_dict()

* major changes in backbone

* update config in weight conversion

* formatting

* converted model is fp32

* improve naming and docs for feature_extractor->reconstruct_feature_maps

* minor fixes; amid review

* create intermediate vars in func call

* use torch.testing.assert_close

* use ModuleList instead of Sequential and ModuleDict

* update docs

* include fov in integraiton tests

* update docs

* improve initialization of convolution layers

* fix unused fov keys

* update tests

* ruff format

* fix test amid Kaiming initialization

* add depthpro to toctree

* add residual layer to _no_split_modules

* architecture rework

* Update src/transformers/models/depth_pro/image_processing_depth_pro.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* update docs

* improve merge_patches

* use flatten with fov_output

* ruff formatting

* update resources section in docs

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix typo "final_kernal_size"

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix output typehint for DepthProDepthEstimator

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* residual operation in 2 steps

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* use image_size instead of global patch_size in interpolation

* replace all Sequential with ModuleList

* update fov

* update heads

* fix and update conversion script for heads

* ruff formatting

* remove float32 conversion

* use "Fov" instead of "FOV" in class names

* use "Fov" instead of "FOV" in config docs

* remove prune_heads

* update fusion stage

* use device in examples

* update processor

* ruff fixes

* add do_rescale in image_processor_dict

* skip test: test_fast_is_faster_than_slow

* ruff formatting

* DepthProImageProcessorFast in other files

* revert antialias removal

* add antialias in BaseImageProcessorFast

* Revert "revert antialias removal"

This reverts commit 5caa0bd8f9f7463b98410c04e6cfe8fef3adee18.

* Revert "add antialias in BaseImageProcessorFast"

This reverts commit 3ae1134780ae236872985523d9c0a444eabcc179.

* update processor for grouping and antialias

* try test_fast_is_faster_than_slow without "skip" or "flaky"

* update checkpoint

* update checkpoint

* use @is_flaky for processor test

* update checkpoint to "apple/DepthPro-hf"

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-10 11:32:45 +00:00
c399921965 Paligemma: revert #36084 (#36113)
* revert

* type check
2025-02-10 12:04:24 +01:00
eebd2c972c Chat template: update for processor (#35953)
* update

* we need batched nested input to always process correctly

* update a bit

* fix copies
2025-02-10 09:52:19 +01:00
5bd7694781 Processors: allow tuples of images when checking (#36084)
allow tuples of images
2025-02-10 09:35:13 +01:00
3a3b06ace4 fix MllamaVisionAttention typehint (#35975)
* fix MllamaVisionAttention typehint

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Update src/transformers/models/mllama/modeling_mllama.py

Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>

* fix suggestion

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
2025-02-10 09:17:10 +01:00
6b55046213 [docs] fix not-working example code in perf_infer_gpu_one.md (#36087)
* bug fix

* update memory limit
2025-02-07 12:42:22 -08:00
14ca7f1452 [docs] fix typo (#36080)
typo fix
2025-02-07 12:42:09 -08:00
c361b1e3d9 [docs] fix model checkpoint name (#36075)
update model name
2025-02-07 12:41:52 -08:00
ba29a439ad Fix OS err (#36094)
* Try via local_main_process first

* try 2
2025-02-07 09:57:43 -05:00
a18b7fdd9e Move audio top_k tests to the right file and add slow decorator (#36072)
* Move audio top_k tests to the right file and add slow decorator because we load a real model

* empty commit to trigger tests
2025-02-07 14:32:30 +00:00
014047e1c8 Fix bug in apply_rotary_pos_emb_flashatt: in Qwen2-5-VL (#36065) 2025-02-07 10:43:45 +01:00
006d9249ec Adding RT-DETRv2 for object detection (#34773)
* cookiecutter add rtdetrv2

* make modular working

* working model

* working model

* finalize modular inheritance

* finalize modular inheritance

* Update src/transformers/models/rtdetrv2/modular_rtdetrv2.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* update modular and add rename

* remove output ckpt

* define loss_kwargs

* fix CamelCase naming

* fix naming + files

* fix modular and convert file

* additional changes

* fix modular

* fix import error (switch to lazy)

* fix autobackbone

* make style

* add

* update testing

* fix loss

* remove old folder

* fix testing for v2

* update docstring

* fix docstring

* add resnetv2 (with modular bug to fix)

* remove resnetv2 backbone

* fix changes

* small fixes

* remove rtdetrv2resnetconfig

* add rtdetrv2 name to convert

* make style

* Update docs/source/en/model_doc/rt_detr_v2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix modular typo after review

* add reviewed changes

* add final review changes

* Update docs/source/en/model_doc/rt_detr_v2.md

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/rt_detr_v2/__init__.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/rt_detr_v2/convert_rt_detr_v2_weights_to_hf.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* add review changes

* remove rtdetrv2 resnet

* removing this weird project change

* change ckpt name from jadechoghari to author

* implement review and update testing

* update naming and remove wrong ckpt

* name

* make fix-copies

* Fix RT-DETR loss

* Add resources, fix name

* Fix repo in docs

* Fix table name

---------

Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: qubvel <qubvel@gmail.com>
2025-02-06 19:28:45 +00:00
6246c03260 [docs] fix outdated example code in trainer.md (#36066)
fix bugs
2025-02-06 10:54:22 -08:00
4563ba2c6f Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797)
* Fix StopStringCriteria to handle tokens above len(tokenizer)

This fixes #35244 by clipping token IDs to be within the tokenizer's vocabulary size before performing the embedding lookup. This prevents index errors when model.config.vocab_size > len(tokenizer).

The fix:
1. Adds a clamp operation to ensure token IDs are within bounds
2. Adds a test case to verify the behavior

* Use self.stop_strings instead of stop_strings

* Handle clipping correctly

* make fixup

* Update test to the new embedding vecs

* Use much bigger values in the mismatch test

* Typo fix

* Slight simplification

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-02-06 16:53:28 +00:00
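The fix described above amounts to clamping token IDs into the tokenizer's range before the embedding lookup, since IDs above `len(tokenizer)` cannot belong to any stop string. A sketch (the helper name is hypothetical):

```python
import torch

def clip_to_tokenizer_vocab(input_ids: torch.Tensor, tokenizer_len: int) -> torch.Tensor:
    # IDs >= len(tokenizer) would index past the stop-string embedding table,
    # so clamping them into range avoids the index error without changing results.
    return input_ids.clamp(max=tokenizer_len - 1)
```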
28f73bc307 Fix model kwargs (#35875)
* Save state

* Make a failing test

* Better test

* mpt -> done, many more to go

* Rm extranious

* Bamba

* Bert

* big_bird

* biogpt

* bloom

* codegen

* ctrl

* data2vec

* dbrx

* Through up to Dbrx

* electra

* ernie

* falcon

* Fuyu/persimmon

* Include noop kwargs to base models

* Rebase

* Skip musicgen

* Refactor/skip mllama

* Revert makefile

* Rm file

* Fix PT failing, need to modify rest of loss funcs to not resize

* Propagate some

* Continue

* More

* More options

* Mostly fixed

* Proved that it's the same

* Bloom is good

* Make ability to override loss func possible

* Fixup

* Clean

* Fix xglm

* Quality tests

* Skip OCR2

* Make specific loss for xglm

* Make order the same/line up 1:1

* xglm

* Skip fx output loss bloom model

* Didn't pass in pad_token_id

* Fix quality
2025-02-06 11:35:25 -05:00
1590c66430 Fix words typos in ggml test. (#36060)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-02-06 15:32:40 +00:00
1ce0e2992e Nail in edge case of torch dtype being overriden permantly in the case of an error (#35845)
* Nail in edge case of torch dtype

* Rm unused func

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Refactor tests to only mock what we need, don't introduce injection functions

* SetUp/TearDown

* Do super

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-02-06 09:05:23 -05:00
e3458af726 Save checkpoint to temporary directory to handle partial saves during failures (#35580)
Save checkpoint to temporary folder first

Since partial/missing files due to failures throw errors during load
2025-02-06 08:48:05 -05:00
3dd1de39bb Paligemma: fix generation with Gemma2 (#36044)
* fix paligemma

* nit

* use `kwargs` in models that can load any LM
2025-02-06 14:31:32 +01:00
dce9970884 Update test_flash_attn_2_can_dispatch_composite_models (#36050)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-06 12:09:49 +01:00
37faa97d9b Fix repo consistency (#36063)
* fix 1

* fix 2

* fix modular

* simplify at the same time

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-02-06 11:53:15 +01:00
ed98ad35e6 Fix usage of unpad_input function (#35925)
Fix usage of the unpad_input function

See https://github.com/huggingface/transformers/issues/35899

In the [commit](cdbbe844b1), the return type of `unpad_input` was changed.
Now the code supports both older and newer versions

Co-authored-by: Pavel Gein <pavel.gein@gmail.com>
2025-02-06 11:33:42 +01:00
7aee036e54 Iterative generation using Input embeds and past_key_values (#35890)
* Iterative generation using input embeds

* ruff fix

* Added Testcase

* Updated comment

* ♻️ Refactored testcase

* Skip test for these models

* Continue generation using input embeds and cache

* Skip generate_continue_from_embeds test

* Refactor `prepare_input_for_generation` func

* Continue generation using input embeds and cache

* Modular changes fix

* Overwrite 'prepare_inputs_for_generation' function
2025-02-06 11:06:05 +01:00
b5f327f350 Add Qwen2VLImageProcessorFast into Qwen2VLProcessor (#35987)
* Add `Qwen2VLImageProcessorFast` into `Qwen2VLProcessor`

* Use `AutoImageProcessor` instead

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-02-06 10:03:09 +01:00
0de15c988b Fix Audio Classification Pipeline top_k Documentation Mismatch and Bug #35736 (#35771)
* added condition for top_k Doc mismatch fix

* initialization of test file for top_k changes

* added test for returning all labels

* added test for few labels

* tests/test_audio_classification_top_k.py

* final fix

* ruff fix

---------

Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-05 16:25:08 +00:00
694aaa7fbc Fix how we compute the final non-padding token for ForSequenceClassification models (#35911)
* Fix how we compute the final non-padding token for Gemma (and probably other models)

* .size() -> .shape[]

* Propagating changes to other models

* Propagating changes to other models

* Change it for all ForSequenceClassification models

* Fix batch dim

* More TF fixes

* Copy the TF fix around as well

* Correct layer name for TFCTRL

* Cleaner .to()

* Clean up the nested if-else

* Use argmax() instead of .max().values
2025-02-05 16:23:33 +00:00
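Per the final bullet, the computation uses `argmax()` over index-weighted positions to find the last non-padding token of each row. A minimal sketch of that trick (the helper name is hypothetical):

```python
import torch

def last_non_pad_index(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    # Weight every position by its index, zeroing out padding; argmax then
    # returns the largest index holding a non-pad token, per batch row.
    non_pad_mask = (input_ids != pad_token_id).int()
    token_indices = torch.arange(input_ids.shape[-1], device=input_ids.device)
    return (non_pad_mask * token_indices).argmax(dim=-1)
```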
531d1511f5 [docs] no hard-coding cuda (#36043)
make device-agnostic
2025-02-05 08:22:33 -08:00
7399f8021e [docs] fix bugs in the bitsandbytes documentation (#35868)
* fix doc

* update model
2025-02-05 08:21:20 -08:00
0a1a8e3c7e [docs] no hard coding cuda as bnb has multi-backend support (#35867)
* change cuda to DEVICE

* Update docs/source/en/llm_tutorial.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-05 08:20:02 -08:00
9dc1efa5d4 DeepSpeed github repo move sync (#36021)
deepspeed github repo move
2025-02-05 08:19:31 -08:00
c772bff31a add support for empty list as input to create_model_card (#36042)
handle cases where it is a list
2025-02-05 13:29:17 +01:00
315a9f494e Add XPU type for work-around -inf mask causing sdpa NaN issue in modeling files (#35647)
* add xpu for unmask

* change modular for generated matching

* add lastest modeling for helium
2025-02-05 13:28:31 +01:00
d8080d55c7 Fix synced multi-GPU generation with LLMs and VLMs (#35893)
* Fix synced multi-GPU generation

* fix copies

---------

Co-authored-by: Davit Manukyan <ManukyanD>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-02-05 11:15:11 +01:00
4831a94ee7 Fix Gemma2 synced multi-GPU generation (#35232)
* Fix Gemma2 synced multi-GPU generation

* Fix import ordering in modular_gemma2.py
2025-02-05 10:07:50 +01:00
fa56dcc2ab Refactoring of ImageProcessorFast (#35069)
* add init and base image processing functions

* add add_fast_image_processor to transformers-cli

* add working fast image processor clip

* add fast image processor to doc, working tests

* remove "to be implemented" SigLip

* fix unprotected import

* fix unprotected vision import

* update ViTImageProcessorFast

* increase threshold for slow/fast equivalence

* add fast img blip

* add fast class in tests with cli

* improve cli

* add fast image processor convnext

* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision

* add device kwarg to ImagesKwargs for fast processing on cuda

* cleanup

* fix unprotected import

* group images by sizes and add batch processing

* Add batch equivalence tests, skip when center_crop is used

* cleanup

* update init and cli

* fix-copies

* refactor convnext, cleanup base

* fix

* remove patching mixins, add piped torchvision transforms for ViT

* fix unbatched processing

* fix f strings

* protect imports

* change llava onevision to class transforms (test)

* fix convnext

* improve formatting (following Pavel review)

* fix handling device arg

* improve cli

* fix

* fix inits

* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs

* uniformize qwen2_vl fast

* fix docstrings

* add add fast image processor llava

* remove min_pixels max_pixels from accepted size

* nit

* nit

* refactor fast image processors docstrings

* cleanup and remove fast class transforms

* update add fast image processor transformers cli

* cleanup docstring

* uniformize pixtral fast and  make _process_image explicit

* fix prepare image structure llava next/onevision

* Use typed kwargs instead of explicit args

* nit fix import Unpack

* clearly separate pops and gets in base preprocess. Use explicit typed kwargs

* make qwen2_vl preprocess arguments hashable
2025-02-04 17:52:31 -05:00
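
A hedged usage sketch of the refactored fast image processors; the bullets above mention a `device` kwarg for CUDA processing, and the checkpoint and URL here are illustrative:

```py
import requests
from PIL import Image
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32", use_fast=True)
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
# device="cuda" assumes a CUDA GPU is available
inputs = processor(images=image, device="cuda", return_tensors="pt")
```
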
8d73a38606 Add DAB-DETR for object detection (#30803)
* initial commit

* encoder+decoder layer changes WIP

* architecture checks

* working version of detection + segmentation

* fix modeling outputs

* fix return dict + output att/hs

* found the position embedding masking bug

* pre-training version

* added image processors

* typo in init.py

* iterupdate set to false

* fixed num_labels in class_output linear layer bias init

* multihead attention shape fixes

* test improvements

* test update

* dab-detr model_doc update

* dab-detr model_doc update2

* test fix:test_retain_grad_hidden_states_attentions

* config file clean and renaming variables

* config file clean and renaming variables fix

* updated convert_to_hf file

* small fixes

* style and quality checks

* return_dict fix

* Merge branch main into add_dab_detr

* small comment fix

* skip test_inputs_embeds test

* image processor updates + image processor test updates

* check copies test fix update

* updates for check_copies.py test

* updates for check_copies.py test2

* tied weights fix

* fixed image processing tests and fixed shared weights issues

* added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py

* delete prints from test file

* SafeTensor modification to solve HF Trainer issue

* removing the safetensor modifications

* make fix-copies and HF upload have been added.

* fixed index.md

* fixed repo consistency

* style fix and DabDetrImageProcessor docstring update

* requested modifications after the first review

* Update src/transformers/models/dab_detr/image_processing_dab_detr.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* repo consistency has been fixed

* update copied NestedTensor function after main merge

* Update src/transformers/models/dab_detr/modeling_dab_detr.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* temp commit

* temp commit2

* temp commit 3

* unit tests are fixed

* fixed repo consistency

* updated expected_boxes variable values based on related notebook results in DABDETRIntegrationTests file.

* temporary config modifications and repo consistency fixes

* Put dilation parameter back to config

* pattern embeddings have been added to the rename_keys method

* add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES

* delete FeatureExtractor part from docs.md

* requested modifications in modeling_dab_detr.py

* [run_slow] dab_detr

* deleted last segmentation code part, updated conversion script and changed the hf path in test files

* temp commit of requested modifications

* temp commit of requested modifications 2

* updated config file, resolved codepaths and refactored conversion script

* updated decoder layer block types and refactored conversion script

* style and quality update

* small modifications based on the request

* attentions are refactored

* removed loss functions from modeling file, added loss function to lossutils, tried to move the MLP layer generation to config but it failed

* deleted imageprocessor

* fixed conversion script + quality and style

* fixed config_att

* [run_slow] dab_detr

* changing model path in conversion file and in test file

* fix Decoder variable naming

* testing the old loss function

* switched back to the new loss function and testing with the old attention functions

* switched back to the new last good result modeling file

* moved back to the version when I asked the review

* missing new line at the end of the file

* old version test

* turned back to the newest model version but changed the image processor

* style fix

* style fix after merge main

* [run_slow] dab_detr

* [run_slow] dab_detr

* added device and type for head bias data part

* [run_slow] dab_detr

* fixed model head bias data fill

* changed test_inference_object_detection_head assertTrues to torch test assert_close

* fixes part 1

* quality update

* self.bbox_embed in decoder has been restored

* changed Assert true torch closeall methods to torch testing assertclose

* modelcard markdown file has been updated

* deleted intermediate list from decoder module

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-02-04 17:28:27 +00:00
fe52679e74 Update tests regarding attention types after #35235 (#36024)
* update

* update

* update

* dev-ci

* more changes

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 18:04:47 +01:00
014a1fa2c8 CircleCI with python 3.9 (#36027)
update docker files

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 17:40:20 +01:00
c98b467905 feat(ci): ignore trufflehog unverified results (#36031) 2025-02-04 16:39:36 +01:00
9855acb9c5 Hotfix for self-comment-ci.yml (#36030)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 16:28:05 +01:00
9f486badd5 Display warning for unknown quants config instead of an error (#35963)
* add supports_quant_method check

* fix

* add test and fix suggestions

* change logic slightly

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-04 15:17:01 +01:00
f19bfa50e7 Comment bot CI for other jobs (generation / quantization) (#35341)
* quantization CI on PRs

* fix

* fix

* add 2 members

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-04 14:42:51 +01:00
a93b80588b Fix RMSNormGated in Zamba2 (#35943)
* First commit

* Finish model implementation

* First commit

* Finish model implementation

* Register zamba2

* generated modeling and configuration

* generated modeling and configuration

* added hybrid cache

* fix attention_mask in mamba

* dropped unused loras

* fix flash2

* config docstrings

* fix config and fwd pass

* make fixup fixes

* text_modeling_zamba2

* small fixes

* make fixup fixes

* Fix modular model converter

* added inheritances in modular, renamed zamba cache

* modular rebase

* new modular conversion

* fix generated modeling file

* fixed import for Zamba2RMSNormGated

* modular file cleanup

* make fixup and model tests

* dropped inheritance for Zamba2PreTrainedModel

* make fixup and unit tests

* Add inheritance of rope from GemmaRotaryEmbedding

* moved rope to model init

* drop del self.self_attn and del self.feed_forward

* fix tests

* renamed lora -> adapter

* rewrote adapter implementation

* fixed tests

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Dropped adapter in-place sum

* removed rope from attention init

* updated rope

* created get_layers method

* make fixup fix

* make fixup fixes

* make fixup fixes

* update to new attention standard

* update to new attention standard

* make fixup fixes

* minor fixes

* cache_position

* removed cache_position postion_ids use_cache

* remove config from modular

* removed config from modular (2)

* import apply_rotary_pos_emb from llama

* fixed rope_kwargs

* Instantiate cache in Zamba2Model

* fix cache

* fix @slow decorator

* small fix in modular file

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* several minor fixes

* inherit mamba2decoder fwd and drop position_ids in mamba

* removed docstrings from modular

* reinstate zamba2 attention decoder fwd

* use regex for tied keys

* Revert "use regex for tied keys"

This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.

* use regex for tied keys

* add cpu to slow forward tests

* dropped config.use_shared_mlp_adapter

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* re-convert from modular

* extended Zamba2RMSNormGated to n_groups>1

* removed einops import

* set _supports_sdpa = True

* add use_mem_eff_path flag for fused mamba2 fwd

* added docstring for use_mem_eff_path flag

---------

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-04 14:28:04 +01:00
bc9a6d8302 Fix device mismatch error in Whisper model during feature extraction (#35866)
* Fix device mismatch error in whisper feature extraction

* Set default device

* Address code review feedback

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-02-04 12:23:08 +01:00
9afb904b15 Refactor (and fix) gpt_neox (#35610)
* start a nice modular

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* Update modular_gpt_neox.py

* update

* Update modular_gpt_neox.py

* convert

* fix attribute

* fix attrs

* oups

* fix

* fix

* fix

* fix

* fix

* fix order to pass test (see with accelerate team)

* trigger CIs

* modular

* update

* up

* Update test_modeling_gpt_neox.py

* Update test_modeling_gpt_neox.py

* trigger CIs

* correctly pass arg

* simplify

* remove key warning

* update tp -> it's compatible since the view is before

* trigger CIs
2025-02-04 11:18:43 +01:00
ad30598923 Update Mistral converter (#35967)
* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

* update

* style

* move it to integrations

* style

* trigger CIs

* trigger CIs
2025-02-04 11:13:12 +01:00
b1954fd64a layernorm_decay_fix (#35927)
* layernorm_decay_fix

* W293 fix

* ruff format fix

* black format

* ruff format

* erase last layer

* add test_get_parameter_names_rmsnorm

* rmsnorm fix
2025-02-04 11:01:49 +01:00
2ba040a71f apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582)
* apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag

* test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask

* test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token

* test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right

---------

Co-authored-by: Eduard Allakhverdov <goncharova@airi.net>
Co-authored-by: d.tarasov <d.tarasov@airi.net>
2025-02-04 10:27:52 +01:00
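
A hedged sketch of the flag combination this PR makes consistent; the checkpoint is illustrative, and the returned mask is all zeros unless the chat template marks assistant spans with `{% generation %}`:

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
messages = [
    {"role": "user", "content": "Hi there"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
out = tokenizer.apply_chat_template(
    messages,
    return_assistant_tokens_mask=True,
    return_dict=True,
    return_tensors="pt",
)
assistant_mask = out["assistant_masks"]  # 1 where a token belongs to an assistant turn
```
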
9c02cb6233 Fix custom kernel for DeformableDetr, RT-Detr, GroundingDINO, OmDet-Turbo in Pytorch 2.6.0 (#35979)
Updates type().is_cuda() -> .is_cuda(); .data<> -> .data_ptr<>
2025-02-04 09:07:25 +00:00
5d75a25b03 Qwen2-VL: fix rope delta calculation (#36013)
* fix rope delta calculation

* add test

* style
2025-02-04 09:48:29 +01:00
e284c7e954 Update Granite Vision Model Path / Tests (#35998)
* Update granite vision model path

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Enable granite vision test

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-02-03 20:06:03 +01:00
9d2056f12b Add mean_resizing for every VLM's resize_token_embeddings() (#35717)
* refine all resize_token_embedding()

* ruff format

* hotfix
2025-02-03 15:03:49 +01:00
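
A hedged sketch of the `mean_resizing` behavior this PR extends to VLMs (the checkpoint is illustrative):

```py
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
# New embeddings are drawn from the mean (and covariance) of the existing
# embedding matrix instead of being randomly initialized.
model.resize_token_embeddings(model.config.vocab_size + 8, mean_resizing=True)
```
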
7eecdf2a86 Update-tp test (#35844)
* update test for now

* up

* cleanup

* update todo
2025-02-03 09:37:02 +01:00
62db3e6ed6 use torch 2.6 for daily CI (#35985)
use torch 2.6 for CI

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-31 18:58:23 +01:00
2b46943195 Add GOT-OCR 2.0 to Transformers (#34721)
* init modular got_ocr2

* Get correct got_ocr architecture

* add processing

* run modular with processing

* add working inference

* apply modular

* Refactor and fix style

* Refactor, cleanup, fix style

* fix init order

* Fix docs

* add base modeling tests

* fix style and consistency

* rename doc file

* fix repo consistency

* fix inference with box

* add image processing and support for crop_to_multi_page

* Fix batch inference

* add tests

* fixup

* fix slow test

* fix docstrings

* Add model doc

* update to new init

* fix input autocast pixel_values dtype

* update doc

* move doc to multimodal

* Reformat crop_image_to_patches and add docstrings

* Fix example in forward docstring

* Address Pablo review

* [run slow] got_ocr2

* remove defaults defined twice

* apply modular

* add torch_device to integration tests

* update modular

* follow-up Pavel review

* add device variable in doc

* fix doc multi-page

* Force eager attention for vision encoder to avoid attn implementation conflict

* revert qwen2vl doc changes

* use Qwen2ForCausalLM instead of Qwen2Model

* make fixup

* refactor gotocr2 to llava style

* uniformize function names and reduce checks

* final nits

* fix pixel_values dtype error

* change checkpoint names

* fix modular
2025-01-31 11:28:13 -05:00
5bbee12ac9 [Moshi] disable automatic compilation if the model can't compile (#35992)
moshi can't compile
2025-01-31 15:53:06 +00:00
e6f4a4ebbf [Moonshine] compute head_dim_padding at init (#35984)
compute head_dim_padding at init
2025-01-31 14:26:52 +01:00
d7188ba600 Add support for nested images to LLava and VipLLava (#35558)
* move make_flat_list_of_images and make_batched_videos to image_utils

* remove unnecessary is_vision_available

* move make_nested_list_of_images to image_utils

* fix fast pixtral image processor

* fix import mllama

* fix make_nested_list_of_images

* add tests

* convert 4d arrays/tensors to list

* add test_make_batched_videos

* add support nested batch of videos

* fix image processing qwen2vl
2025-01-30 16:49:20 -05:00
e4227eb4d4 Handle empty change indices in SAM's mask to rle conversion (#35665)
* Handle empty change indices in RLE conversion for masks

* [test] Add unit tests for RLE encoding of masks in SamProcessor

* [test] Update RLE conversion tests to use TensorFlow implementation

* [test] Fix formatting in SamProcessorTest according to check_code_quality action

* [test] Fix formatting in SamProcessorTest according to check_code_quality

* [test] Refactored rle test cases into one test and used tf tensors in tf test cases

* [test] Fix: removed self parameter from refactored methods

* [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow

* [test] Added descriptions to individual run-length encoding tests for PyTorch and TensorFlow.
2025-01-30 19:08:38 +00:00
47bd4296d6 do not use A100 for benchmark.yml (#35974)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 18:55:36 +01:00
693328f2bc Support batching for UsefulSensors Moonshine (#35922)
* Add support for attention masking in moonshine.

Tested against Open ASR Leaderboard with batch size 256.

* Update comments and ensure attention masks are passed everywhere.

Perform attention mask downsampling inside of moonshine forward call.

* Hide padding behind conditional. Fix encoder/decoder masking.

- Correctly pipe encoder attention mask into decoder
- Add correct scaling factor if one is not already provided.
- Fix formatting with ruff

* Add auto generated modeling_moonshine file.

* Update formatting in generated model file.

* Address review comments.

* Fix typo.

* Add `pad_head_dim_to_multiple_of` to moonshine config.

* Correct args order for MooonshineConfig.

* Update configuration moonshine too.

* Update src/transformers/models/moonshine/modular_moonshine.py

* Update src/transformers/models/moonshine/configuration_moonshine.py

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-30 17:08:07 +01:00
5757681837 Less flaky for TimmBackboneModelTest::test_batching_equivalence (#35971)
* fix

* remove is_flaky

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-30 16:56:26 +01:00
e320d5542e Revert p_mask to a list in DQA pipeline (#35964)
* p_mask back to being a list

* Remove breakpoint
2025-01-30 15:37:59 +00:00
365fecb4d0 Whisper: fix static cache CI (#35852)
* fix

* remove overriden method

* small change
2025-01-30 12:43:00 +01:00
9725e5be2f Pixtral: vectorize patch embeddings and enable tests (#35122)
* initial POC

* - batch mix feature

* fix tests

* fix tests

* make style

* do not skip and instead fix tests

* update

* return back the test

* correct text with the correct ckpt
2025-01-30 12:40:18 +01:00
8bc4c89ee9 [bart] minor test fixes (#35965)
fix tests
2025-01-30 10:00:11 +00:00
19f2ec80cf Fix is_causal being a tensor (#35791)
* fix is_causal being a tensor

* convert in sdpa attention only when jit tracing
2025-01-30 09:22:33 +01:00
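
A minimal sketch of the guard described above (an assumed shape, not the verbatim fix): a tensor-valued `is_causal` is materialized with `.item()` only while jit tracing, since SDPA otherwise expects a plain bool:

```py
import torch

def resolve_is_causal(is_causal):
    # Hypothetical helper illustrating the fix; only convert while tracing.
    if torch.jit.is_tracing() and isinstance(is_causal, torch.Tensor):
        is_causal = is_causal.item()
    return is_causal
```
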
7547f55e5d fix iterator overflow when gradient accumulation is 1 (#35960) 2025-01-29 14:45:09 -05:00
4d3b1076a1 [generate] move max time tests (#35962)
* move max time tests to their right place

* move test to the right place
2025-01-29 17:56:46 +00:00
4d1d489617 Update README.md (#35958)
There should be a dot after pip install .
2025-01-29 15:46:26 +00:00
f0ae65c198 [tests] further fix Tester object has no attribute '_testMethodName' (#35781)
* bug fix

* update with more cases

* more entries

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:05:33 +01:00
ec7790f0d3 update docker file transformers-pytorch-deepspeed-latest-gpu (#35940)
update docker file for deepspeed

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 16:01:27 +01:00
5d257111c1 Trainer Refactor: Part 1 (#35567)
* start

* So far: 30%

* Small fix

* Continuing update

* Continuing

* Forgot to check if not None

* Continuing refactor

* Fix if else

* Fix ref

* Should make tests pass

* Keep grad norm same

* Document

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Err instead of info for logging RNG state error

* Separate out to func

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-01-29 09:50:54 -05:00
23d782ead2 Output dicts support in text generation pipeline (#35092)
* Support for generate_argument: return_dict_in_generate=True, instead of returning an error

* fix: call test with return_dict_in_generate=True

* fix: Only import torch if it is present

* update: Encapsulate output_dict changes

* fix: added back original comments

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-01-29 14:44:46 +00:00
cf90404807 Fix flaky test_assisted_decoding_matches_greedy_search (#35951)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:50:07 +01:00
692afa102d Update squad_convert_example_to_features to work with numpy v2 (#35955)
* Fix

* Fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:33:06 +01:00
c600e89f5c Update unwrap_and_save_reload_schedule to use weights_only=False (#35952)
* fix

* Fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-29 14:30:57 +01:00
42c8ccfd4c fix test_generated_length_assisted_generation (#34935)
fix test_generated_length_assisted_generation
2025-01-29 12:03:45 +00:00
ec7afad609 use torch constraints to check if covariance is positive definite during mean resizing. (#35693)
* use torch constraints to check for psd

* small nit

* Small change

* Small change for the ci

* nit
2025-01-28 17:33:42 +01:00
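
A minimal sketch of the check mentioned in the title, using `torch.distributions.constraints`:

```py
import torch
from torch.distributions import constraints

# Verify a covariance matrix is positive definite before sampling from it.
cov = torch.eye(4)
print(bool(constraints.positive_definite.check(cov)))  # True
```
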
61cbb723fc Remove INC notebook reference in documentation (#35936)
remove INC notebook in documentation
2025-01-28 17:10:02 +01:00
478c4f2d0d fix(FA): QKV not being casted to target_dtype for FA with dpo lora (#35834)
fix(FA): QKV not being casted to target_dtype due to dtype check
2025-01-28 17:06:56 +01:00
ece8c42488 Test: generate with torch.compile(model.forward) as a fast test (#34544) 2025-01-28 14:10:38 +00:00
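
A hedged sketch of the pattern this fast test exercises (compile only the forward pass, then generate); the checkpoint is illustrative:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.forward = torch.compile(model.forward)  # compile forward, not generate
inputs = tokenizer("Hello", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=5)
```
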
f48ecd7608 Fix TP initialization (#35860)
* fix tp

* Update modeling_utils.py

* style

* style

* Update test_tp.py

* Update test_tp.py

* style

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py

* Update test_tp.py
2025-01-28 15:07:37 +01:00
f85ba20449 Qwen-2-5-VL: fix CI (#35935)
fix
2025-01-28 14:51:57 +01:00
3f860dba55 Fix mask slicing for models with HybridCache (#35681)
* correctly slice

* check mask

* Update modular_gemma2.py

* fix

* add tests

* fix typo

* finally fix mask slicing

* Finally correctly slice in all cases!!

* add test for all attention functions

* small fix in tests

* trick around dynamo tracing issue

* last update

* more robust

* kwargs propagation

* make it explicit for checkpointing

* apply modular
2025-01-28 14:35:00 +01:00
b764c20b09 Fix: loading DBRX back from saved path (#35728)
* fix dtype as dict for some models + add test

* add comment in tests
2025-01-28 11:38:45 +01:00
3613f568cd Add default TP plan for all models with backend support (#35870)
* Add some tp plans!

* More tp plans!

* Add it in the comment

* style

* Update configuration_mixtral.py

* Update configuration_phi.py

* update the layout according to special archs

* fix mixtral

* style

* trigger CIs

* trigger CIs

* CIs

* olmo2

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-28 11:20:58 +01:00
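
A hedged sketch of requesting the default TP plan at load time; this assumes a multi-GPU launch (e.g. `torchrun --nproc-per-node 4`), and the checkpoint is illustrative:

```py
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B", torch_dtype=torch.bfloat16, tp_plan="auto"
)
```
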
96625d85fd Use rocm6.2 for AMD images (#35930)
* Use rocm6.2 as rocm6.3 only has nightly pytorch wheels atm

* Use stable wheel index for torch libs
2025-01-28 11:10:28 +01:00
bf16a182ba Remove _supports_static_cache = True for some model classes (#34975)
* use mask_fill

* remove comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-28 10:42:10 +01:00
86d7564611 [docs] Fix Zamba2 (#35916)
fix code block
2025-01-27 11:44:10 -08:00
414658f94f Close Zamba2Config code block (#35914)
* close zamba2 code block

* Add Zamba2 to toctree
2025-01-27 19:09:42 +00:00
63e9c941eb Fix the config class comparison for remote code models (#35592)
* Fix the config class comparison when repeatedly saving and loading remote code models

* once again you have committed your debug breakpoint
2025-01-27 18:37:30 +00:00
c550a1c640 [docs] uv install (#35821)
uv install
2025-01-27 08:49:28 -08:00
cd6591bfb2 Fix typing in audio_utils.chroma_filter_bank (#35888)
* Fix typing in audio_utils.chroma_filter_bank

* Apply make style

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-27 16:06:03 +00:00
e57b459997 Split and clean up GGUF quantization tests (#35502)
* clean up ggml test

Signed-off-by: Isotr0py <2037008807@qq.com>

* port remaining tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* further cleanup

Signed-off-by: Isotr0py <2037008807@qq.com>

* format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix broken tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* update comment

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* reorganize tests

Signed-off-by: Isotr0py <2037008807@qq.com>

* k-quants use qwen2.5-0.5B

Signed-off-by: Isotr0py <2037008807@qq.com>

* move ggml tokenization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove dead code

Signed-off-by: Isotr0py <2037008807@qq.com>

* add assert for serialization test

Signed-off-by: Isotr0py <2037008807@qq.com>

* use str for parameterize

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-27 15:46:57 +01:00
5c576f5a66 🚨🚨🚨 image-classification pipeline single-label and multi-label prob type squashing fns (sigmoid vs softmax) are backwards (#35848)
single-label and multi-label prob type squashing fns (sigmoid vs softmax) were backwards for image-classification pipeline
2025-01-27 15:34:57 +01:00
5450e7c84a 🔴 🔴 🔴 Added segmentation maps support for DPT image processor (#34345)
* Added `segmentation_maps` support for DPT image processor

* Added tests for dpt image processor

* Moved preprocessing into separate functions

* Added # Copied from statements

* Fixed # Copied from statements

* Added `segmentation_maps` support for DPT image processor

* Added tests for dpt image processor

* Moved preprocessing into separate functions

* Added # Copied from statements

* Fixed # Copied from statements
2025-01-27 15:14:00 +01:00
a50befa9b9 Update deepspeed amd image (#35906) 2025-01-27 14:32:36 +01:00
33cb1f7b61 Add Zamba2 (#34517)
* First commit

* Finish model implementation

* First commit

* Finish model implementation

* Register zamba2

* generated modeling and configuration

* generated modeling and configuration

* added hybrid cache

* fix attention_mask in mamba

* dropped unused loras

* fix flash2

* config docstrings

* fix config and fwd pass

* make fixup fixes

* text_modeling_zamba2

* small fixes

* make fixup fixes

* Fix modular model converter

* added inheritances in modular, renamed zamba cache

* modular rebase

* new modular conversion

* fix generated modeling file

* fixed import for Zamba2RMSNormGated

* modular file cleanup

* make fixup and model tests

* dropped inheritance for Zamba2PreTrainedModel

* make fixup and unit tests

* Add inheritance of rope from GemmaRotaryEmbedding

* moved rope to model init

* drop del self.self_attn and del self.feed_forward

* fix tests

* renamed lora -> adapter

* rewrote adapter implementation

* fixed tests

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Dropped adapter in-place sum

* removed rope from attention init

* updated rope

* created get_layers method

* make fixup fix

* make fixup fixes

* make fixup fixes

* update to new attention standard

* update to new attention standard

* make fixup fixes

* minor fixes

* cache_position

* removed cache_position postion_ids use_cache

* remove config from modular

* removed config from modular (2)

* import apply_rotary_pos_emb from llama

* fixed rope_kwargs

* Instantiate cache in Zamba2Model

* fix cache

* fix @slow decorator

* small fix in modular file

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* several minor fixes

* inherit mamba2decoder fwd and drop position_ids in mamba

* removed docstrings from modular

* reinstate zamba2 attention decoder fwd

* use regex for tied keys

* Revert "use regex for tied keys"

This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.

* use regex for tied keys

* add cpu to slow forward tests

* dropped config.use_shared_mlp_adapter

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* re-convert from modular

---------

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-27 10:51:23 +01:00
14a9bb520e Fix fast image processor warnings in object detection examples (#35892)
Have the DETR examples default to using the fast image  processor
2025-01-27 08:32:44 +00:00
f11f57c925 [doctest] Fixes (#35863)
doctest fixes
2025-01-26 15:26:38 -08:00
fc269f77da Add Rocketknight1 to self-comment-ci.yml (#35881)
my bad

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-24 19:07:07 +00:00
bcb841f007 add xpu device check in device_placement (#35865)
add xpu device
2025-01-24 19:13:07 +01:00
b912f5ee43 use torch.testing.assert_close instead to get more details about errors in CIs (#35659)
* use torch.testing.assert_close instead to get more details about errors in CIs

* fix

* style

* test_all

* revert for I bert

* fixes and updates

* more image processing fixes

* more image processors

* fix mamba and co

* style

* less strict

* ok I won't be strict

* skip and be done

* up
2025-01-24 16:55:28 +01:00
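
For context, `torch.testing.assert_close` reports the max absolute/relative difference and the number of mismatched elements, which is what makes CI failures easier to diagnose:

```py
import torch

expected = torch.tensor([1.0, 2.0, 3.0])
actual = torch.tensor([1.0, 2.0, 3.001])
torch.testing.assert_close(actual, expected, rtol=0, atol=1e-2)  # passes
# With default tolerances this would fail and print a detailed mismatch report:
# torch.testing.assert_close(actual, expected)
```
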
72d1a4cd53 Fix Llava-NeXT / Llava-NeXT Video / Llava-OneVision's token unpadding mismatch (#35779)
* Fix Llava OneVision's token padding

* Fix Llava next and Llava next video's token unpadding for consistency
2025-01-24 09:10:27 +01:00
b5aaf87509 Fix test_pipelines_video_classification that was always failing (#35842)
* Fix test_pipelines_video_classification that was always failing

* Update video pipeline docstring to reflect actual return type

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-23 19:22:32 +01:00
328e2ae4c0 fix apply_chat_template() padding choice (#35828)
fix apply_chat_template() padding choice to bool, str, PaddingStrategy and the docstring of pad()
2025-01-23 17:32:32 +00:00
d2a424b550 Fix typo (#35854) 2025-01-23 17:32:18 +00:00
045c02f209 [DOC] Fix contamination and missing paragraph in translation (#35851)
Fix contamination and missing paragraph in translation
2025-01-23 08:33:44 -08:00
71cc8161b2 Granite Vision Support (#35579)
* Add multimodal granite support

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

Support multiple image feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Remove failing validation for visual encoders with no cls

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Update llava based models / configs to support list of feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Add tests for multiple feature layers

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Use conditional instead of except for misaligned feature shapes

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* crop cls from each hidden state

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Fix formatting

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Support single vision feature int in vipllava

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

* Fix typo in vision feature selection strategy validation

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add tentative integration test for granite vision models

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Add granite vision docs

Replace multimodal granite refs with granite vision

Add granite vision / llava next alias

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

* Use image url in granitevision example

Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>

---------

Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-01-23 17:15:52 +01:00
8f1509a96c Fix more CI tests (#35661)
add tooslow for the fat ones
2025-01-23 14:45:42 +01:00
0a950e0bbe Fix uploading processors/tokenizers to WandB on train end (#35701)
* rename tokenizer to processing_class in WandbCallback.on_train_end

* rename tokenizer to processing_class in ClearMLCallback and DVCLiveCallback
2025-01-23 13:32:15 +01:00
4ec425ffad Fix GA loss for Deepspeed (#35808)
* Fix GA loss for Deepspeed

* Turn off loss scaling in DeepSpeed engine by scale_wrt_gas

* Add comment linking to PR
2025-01-23 11:45:02 +01:00
f3f6c86582 add qwen2.5vl (#35569)
* add qwen2.5vl

* fix

* pass check table

* add modular file

* fix style

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>

* pass copy check

* use modular

* fix

* fix

* fix

* update flashatt2&sdpa support_list

* Update docs/source/en/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2_5_vl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update config

* update

* fix hf path

* rename Qwen2_5_VLVideosKwargs

* fix

* fix

* update

* executed modular

* rollback init

* fix

* formatted

* simpler init

* fix

* fix

* fix

* fix

* fix

* update docs

* fix

* fix

* update Qwen2VLRotaryEmbedding for yarn

* fix

---------

Co-authored-by: Minho Shim <6764739+minostauros@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: gewenbin0992 <gewenbin292@163.com>
Co-authored-by: gewenbin0992 <67409248+gewenbin0992@users.noreply.github.com>
2025-01-23 11:23:00 +01:00
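
A hedged loading sketch for the newly added model (the checkpoint name is illustrative and the weights are large):

```py
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")
```
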
d3af76df58 [Backend support] Allow num_logits_to_keep as Tensor + add flag (#35757)
* support

* Update modeling_utils.py

* style

* most models

* Other models

* fix-copies

* tests + generation utils
2025-01-23 09:47:54 +01:00
8736e91ad6 [ tests] remove some flash attention class tests (#35817)
remove class from tests
2025-01-23 09:44:21 +01:00
2c3a44f9a7 Fix NoneType type as it requires py>=3.10 (#35843)
fix type
2025-01-22 15:56:53 +00:00
fdcc62c855 Add PyTorch version check for FA backend on AMD GPUs (#35813)
Disable FA backend for SDPA on AMD GPUs (PyTorch < 2.4.1)
2025-01-22 16:09:23 +01:00
3b9770581e Fix compatibility issues when using auto_gptq with these older versions (#35830)
convert_model method of optimum only accepts a single nn.Module type model parameter for versions less than 1.23.99.
2025-01-22 15:46:47 +01:00
62bd83947a [chat] docs fix (#35840)
docs fix
2025-01-22 14:32:27 +00:00
487e2f63bd Fix head_dim in config extracted from Gemma2 GGUF model (#35818)
fix gemma2 head dim

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-01-22 15:22:04 +01:00
b3d6722469 [Chat] Add Chat from TRL 🐈 (#35714)
* tmp commit

* add working chat

* add docts

* docs 2

* use auto dtype by default
2025-01-22 13:30:12 +00:00
a7738f5a89 Fix : Nemotron tokenizer for GGUF format (#35836)
fix nemotron gguf
2025-01-22 12:28:40 +01:00
ec28957f94 [pipeline] missing import regarding assisted generation (#35752)
missing import
2025-01-22 10:34:28 +00:00
36c9181f5c [gpt2] fix generation tests (#35822)
fix gpt2 generation tests
2025-01-22 09:41:04 +00:00
f439e28d32 Hotfix: missing working-directory in self-comment-ci.yml (#35833)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-22 10:25:50 +01:00
373e50e970 Init cache on meta device (#35164)
* init cache on meta device

* offloaded static + enable tests

* tests weren't running before  :(

* update

* fix mamba

* fix copies

* update

* address comments and fix tests

* fix copies

* Update src/transformers/cache_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update

* mamba fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-22 09:49:17 +01:00
870e2c8ea0 Another security patch for self-comment-ci.yml (#35816)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-22 09:29:54 +01:00
f4f33a20a2 Remove pyav pin to allow python 3.11 to be used (#35823)
* Remove pyav pin to allow python 3.11 to be used

* Run make fixup

---------

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-21 20:16:18 +00:00
90b46e983f Remove old benchmark code (#35730)
* remove traces of the old deprecated benchmarks

* also remove old tf benchmark example, which uses deleted code

* run doc builder
2025-01-21 17:56:43 +00:00
870eb7b41b [Mimi] update test expected values for t4 runners (#35696)
update values for t4
2025-01-21 18:23:36 +01:00
8ac851b0b3 Improve modular documentation (#35737)
* start a nice doc

* keep improving the doc

* Finalize doc

* Update modular_transformers.md

* apply suggestion
2025-01-21 17:53:30 +01:00
107f9f5127 add Qwen2-VL image processor fast (#35733)
* add qwen2_vl image processor fast

* add device to ImagesKwargs

* remove automatic fix copies

* fix fast_is_faster_than_slow

* remove unnecessary import
2025-01-21 11:49:05 -05:00
3df90103b8 move fastspeech to audio models (#35788) 2025-01-21 08:32:09 -08:00
741d55237a [i18n-ar] Translated file: docs/source/ar/tasks/masked_language_modeling.md into Arabic (#35198)
* Added the Arabic translation: masked_language_modeling.md

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/masked_language_modeling.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

* Add language_modeling.md

* Add Sequence_classifiation.md

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-21 08:29:58 -08:00
568941bf11 Optimized set_initialized_submodules. (#35493) 2025-01-21 17:01:28 +01:00
7051c5fcc8 Remove deprecated get_cached_models (#35809)
* Remove deprecated get_cached_models

* imports
2025-01-21 16:08:31 +01:00
97fbaf0861 Fixed typo in autoawq version number in an error message for IPEX backend requirements. (#35815)
Fixed typo in the minimal autoawq version number required for the IPEX backend
2025-01-21 14:42:44 +00:00
dbd8474125 Fix : BLOOM tie_word_embeddings in GGUF (#35812)
* fix bloom ggml

* fix falcon output

* make style
2025-01-21 15:35:54 +01:00
678bd7f1ce Auto-add timm tag to timm-wrapper models. (#35794)
Works for fine-tuned or exported models:

```py
from transformers import AutoModelForImageClassification

checkpoint = "timm/vit_base_patch16_224.augreg2_in21k_ft_in1k"
model = AutoModelForImageClassification.from_pretrained(checkpoint)

model.push_to_hub("pcuenq/tw1")
```

The uploaded model will now show snippets for both the timm and the
transformers libraries.
2025-01-21 14:34:45 +01:00
dc10f7906a Support adamw_torch_8bit (#34993)
* var

* more

* test
2025-01-21 14:17:49 +01:00
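
A hedged sketch of opting into the new optimizer by name (assumed to mirror the existing torchao-backed `adamw_torch_4bit` option and to require torchao; `output_dir` is illustrative):

```py
from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", optim="adamw_torch_8bit")
```
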
f82b19cb6f add a new flax example for Bert model inference (#34794)
* add a new example for flax inference cases

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/flax/language-modeling/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix for "make fixup"

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-21 14:09:29 +01:00
edbabf6b82 [Doc] Adding blog post to model doc for TimmWrapper (#35744)
* adding blog post to model doc

* Update docs/source/en/model_doc/timm_wrapper.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* review suggestions

* review suggestions

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-21 12:32:39 +00:00
fd8d61fdb2 Byebye test_batching_equivalence's flakiness (#35729)
* fix

* fix

* skip

* better error message

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-21 13:11:33 +01:00
78f5ee0217 Add LlavaImageProcessor (#33191)
* First draft

* Add equivalence test

* Update docstrings

* Add tests

* Use numpy

* Fix tests

* Improve variable names

* Improve docstring

* Add link

* Remove script

* Add copied from

* Address comment

* Add note in docs

* Add docstring, data format

* Improve test

* Add test

* update

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/llava/image_processing_llava.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* loop once only

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-21 12:47:04 +01:00
8e4cedd9ca Update AMD Docker image (#35804) 2025-01-21 12:11:23 +01:00
705aeaaa12 Fix "test_chat_template_dict" in video LLMs (#35660)
* fix  "test_chat_template_dict" in llava_onevision

* Update src/transformers/models/llava_next_video/processing_llava_next_video.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* get one video called once

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-21 10:23:40 +01:00
e867b97443 Deterministic sorting in modular converter when adding new functions (#35795)
deterministic sort
2025-01-21 09:38:48 +01:00
920f34a772 modular_model_converter bugfix on assignments (#35642)
* added bugfix in modular converter to keep modular assignments for docstrings, expected outputs etc.

* revert starcoder2 docstring copying, add forward in EMU3 to enable docstring assignment, remove verbatim assignments in modular converter

* added _FOR_DOC in assignments to keep, corrected wrong checkpoint name in ijepa's configuration
2025-01-21 08:06:44 +01:00
234168c4dc Fixes, improvements to timm import behaviour (#35800)
* Fix timm dummy import logic

* Add requires to TimmWrapperConfig.from_dict so users see a helpful import error message if timm not installed
2025-01-20 13:17:01 -08:00
44393df089 Tool calling: support more types (#35776)
* Tool calling: support NoneType for function return type
2025-01-20 19:15:34 +01:00
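
A hedged sketch: a tool whose return annotation is `None` can now be converted to a JSON schema for tool-calling chat templates (the `log_message` helper below is illustrative):

```py
from transformers.utils import get_json_schema

def log_message(message: str) -> None:
    """
    Log a message.

    Args:
        message: The message to log.
    """
    print(message)

print(get_json_schema(log_message))
```
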
f19135afc7 fix low-precision audio classification pipeline (#35435)
* fix low-precision audio classification pipeline

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torch import

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix torch import

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:20:51 +00:00
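
A hedged sketch of the case this PR fixes, an audio-classification pipeline running a half-precision model (the checkpoint is illustrative; `device=0` assumes a GPU):

```py
import torch
from transformers import pipeline

pipe = pipeline(
    "audio-classification",
    model="superb/wav2vec2-base-superb-ks",
    torch_dtype=torch.float16,
    device=0,
)
```
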
641238eb76 Fix vits low-precision dtype (#35418)
* fix vits dtype

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* use weight dtype

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:19:31 +00:00
729b569531 fix document qa bf16 pipeline (#35456)
* fix document qa bf16 pipeline

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-20 16:18:07 +00:00
ec97417827 Don't import torch.distributed when it's not available (#35777)
This is a continuation of 217c47e31bc0cd442443e5b4a62c8bc2785d53ee but
for another module. This issue was spotted in nixpkgs (again) when
building lm-eval package that used a different path in transformers
library to reach the same failure.

Related: #35133
2025-01-20 17:10:35 +01:00
5f0f4b1b93 Patch moonshine (#35731)
* udpate expected logits for T4 runners

* update doc

* correct order of the args for better readability

* remove generate wrap

* convert modular
2025-01-20 16:19:29 +01:00
a142f16131 transformers.image_transforms.normalize wrong types (#35773)
transformers.image_transforms.normalize documented and checked the wrong type for the std and mean arguments

Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>
2025-01-20 15:00:46 +00:00
3998fa8aab [fix] cannot import name 'Pop2PianoFeatureExtractor' from 'transformers' (#35604)
* update pop2piano __init__

* add lib check

* update fix

* revert
2025-01-20 15:21:45 +01:00
b80e334e71 Skip Falcon 7B GGML Test (#35783)
skip test
2025-01-20 15:00:34 +01:00
68947282fc remove code owners as it was generating too much noise BUT (#35784)
remove code owners
2025-01-20 14:18:03 +01:00
135e86aa54 Remove read_video and run 2025-01-20 13:40:57 +01:00
88b95e6179 [generate] update docstring of SequenceBiasLogitsProcessor (#35699)
* fix docstring

* space
2025-01-20 11:00:15 +00:00
56afd2f488 fix register_buffer in MimiEuclideanCodebook (#35759)
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-20 11:54:58 +01:00
abe57b6f17 Add SuperGlue model (#29886)
* Initial commit with template code generated by transformers-cli

* Multiple additions to SuperGlue implementation :

- Added the SuperGlueConfig
- Added the SuperGlueModel and its implementation
- Added basic weight conversion script
- Added new ImageMatchingOutput dataclass

* Few changes for SuperGlue

* Multiple changes :
- Added keypoint detection config to SuperGlueConfig
- Completed convert_superglue_to_pytorch and successfully ran inference

* Reverted unintentional change

* Multiple changes :
 - Added SuperGlue to a bunch of places
 - Divided SuperGlue into SuperGlueForImageMatching and SuperGlueModel
 - Added testing images

* Moved things in init files

* Added docs (to be finished depending on the final implementation)

* Added necessary imports and some doc

* Removed unnecessary import

* Fixed make fix-copies bug and ran it

* Deleted SuperGlueModel
Fixed convert script

* Added SuperGlueImageProcessor

* Changed SuperGlue to support batching pairs of images and modified ImageMatchingOutput in consequences

* Changed convert_superglue_to_hf.py script to experiment different ways of reading an image and seeing its impact on performances

* Added initial tests for SuperGlueImageProcessor

* Added AutoModelForImageMatching in missing places and tests

* Fixed keypoint_detector_output instructions

* Fix style

* Adapted to latest main changes

* Added integration test

* Fixed bugs to pass tests

* Added keypoints returned by keypoint detector in the output of SuperGlue

* Added doc to SuperGlue

* SuperGlue returning all attention and hidden states for a fixed number of keypoints

* Make style

* Changed SuperGlueImageProcessor tests

* Revert "SuperGlue returning all attention and hidden states for a fixed number of keypoints"
Changed tests accordingly

This reverts commit 5b3b669c

* Added back hidden_states and attentions masked outputs with tests

* Renamed ImageMatching occurences into KeypointMatching

* Changed SuperGlueImageProcessor to raise error when batch_size is not even

* Added docs and clarity to hidden state and attention grouping function

* Fixed some code and done refactoring

* Fixed typo in SuperPoint output doc

* Fixed some of the formatting and variable naming problems

* Removed useless function call

* Removed AutoModelForKeypointMatching

* Fixed SuperGlueImageProcessor to only accept pairs of images

* Added more fixes to SuperGlueImageProcessor

* Simplified the batching of attention and hidden states

* Simplified stack functions

* Moved attention instructions into class

* Removed unused do_batch_norm argument

* Moved weight initialization to the proper place

* Replaced deepcopy for instantiation

* Fixed small bug

* Changed from stevenbucaille to magic-leap repo

* Renamed London Bridge images to Tower Bridge

* Fixed formatting

* Renamed remaining "london" to "tower"

* Apply suggestions from code review

Small changes in the docs

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added AutoModelForKeypointMatching

* Changed images used in example

* Several changes to image_processing_superglue and style

* Fixed resample type hint

* Changed SuperGlueImageProcessor and added test case for list of 2 images

* Changed list_of_tuples implementation

* Fix in dummy objects

* Added normalize_keypoint, log_sinkhorn_iterations and log_optimal_transport docstring

* Added missing docstring

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Moved forward block at bottom

* Added docstring to forward method

* Added docstring to match_image_pair method

* Changed test_model_common_attributes to test_model_get_set_embeddings test method signature

* Removed AutoModelForKeypointMatching

* Removed image fixtures and added load_dataset

* Added padding of images in SuperGlueImageProcessor

* Cleaned up convert_superglue_to_hf script

* Added missing docs and fixed unused argument

* Fixed SuperGlueImageProcessor tests

* Transposed all hidden states from SuperGlue to reflect the standard (..., seq_len, feature_dim) shape

* Added SuperGlueForKeypointMatching back to modeling_auto

* Fixed image processor padding test

* Changed SuperGlue docs

* changes:
 - Abstraction to batch, concat and stack of inconsistent tensors
 - Changed conv1d's to linears to match standard attention implementations
 - Renamed all tensors to be tensor0 and not tensor_0 and be consistent
 - Changed match image pair to run keypoint detection on all image first, create batching tensors and then filling these tensors matches after matches
 - Various changes in docs, etc

* Changes to SuperGlueImageProcessor:
- Reworked the input image pairs checking function and added tests accordingly
- Added Copied from statements
- Added do_grayscale tag (also for SuperPointImageProcessor)
- Misc changes for better code

* Formatting changes

* Reverted conv1d to linear conversion because of numerical differences

* fix: changed some code to be more straightforward (e.g. filtering keypoints) and converted plot from opencv to matplotlib

* fix: removed unnecessary test

* chore: removed commented code and added back hidden states transpositions

* chore: changed from "inconsistent" to "ragged" function names as suggested

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* docs: applied suggestions

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* docs: updated to display matched output

* chore: applied suggestion for check_image_pairs_input function

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* chore: changed check_image_pairs_input function name to validate_and_format_image_pairs and used validate_preprocess_arguments function

* tests: simplified tests for image input format and shapes

* feat: converted SuperGlue's use of Conv1d with kernel_size of 1 with Linear layers. Changed tests and conversion script accordingly

* feat: several changes to address comments

Conversion script:
- Reverted fuse batchnorm to linear conversion
- Changed all 'nn.Module' to respective SuperGlue models
- Changed conversion script to use regex mapping and match other recent scripts

Modeling SuperGlue:
- Added batching with mask and padding to attention
- Removed unnecessary concat, stack and batch ragged pairs functions
- Reverted batchnorm layer
- Renamed query, key, value and merge layers into q, k, v, out proj
- Removed Union of different Module into nn.Module in _init_weights method typehint
- Changed several method's signature to combine image0 and image1 inputs with appropriate doc changes
- Updated SuperGlue's doc with torch.no_grad()

Updated test to reflect changes in SuperGlue model

* refactor: changed validate_and_format_image_pairs function with clarity

* refactor: changed from one SuperGlueMLP class to a list of SuperGlueMLP class

* fix: fixed forgotten init weight change from last commit

* fix: fixed rebase mistake

* fix: removed leftover commented code

* fix: added typehint and changed some of arguments default values

* fix: fixed attribute default values for SuperGlueConfig

* feat: added SuperGlueImageProcessor post process keypoint matching method with tests

* fix: fixed SuperGlue attention and hidden state tuples aggregation

* chore: fixed mask optionality and reordered tensor reshapes to be cleaner

* chore: fixed docs and error message returned in validate_and_format_image_pairs function

* fix: fixed returned keypoints to be the ones that SuperPoint returns

* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue

* fix: fixed check on number of image sizes for post process compared to the pairs in outputs of SuperGlue (bis)

* fix: Changed SuperGlueMultiLayerPerceptron instantiation to avoid if statement

* fix: Changed convert_superglue_to_hf script to reflect latest SuperGlue changes and got rid of nn.Modules

* WIP: implement Attention from an existing class (like BERT)

* docs: Changed docs to include more appealing matching plot

* WIP: Implement Attention

* chore: minor typehint change

* chore: changed convert superglue script by removing all classes and apply conv to linear conversion in state dict + rearrange keys to comply with changes in model's layers organisation

* Revert "Fixed typo in SuperPoint output doc"

This reverts commit 2120390e827f94fcd631c8e5728d9a4980f4a503.

* chore: added comments in SuperGlueImageProcessor

* chore: changed SuperGlue organization HF repo to magic-leap-community

* [run-slow] refactor: small change in layer instantiation

* [run-slow] chore: replaced remaining stevenbucaille org to magic-leap-community

* [run-slow] chore: make style

* chore: update image matching fixture dataset HF repository

* [run-slow] superglue

* tests: overwriting test_batching_equivalence

* [run-slow] superglue

* tests: changed test to cope with value changing depending on cuda version

* [run-slow] superglue

* tests: changed matching_threshold value

* [run-slow] superglue

* [run-slow] superglue

* tests: changed tests for integration

* [run-slow] superglue

* fix: Changed tensor view and permutations to match original implementation results

* fix: updated convert script and integration test to include last change in model

* fix: increase tolerance for CUDA variances

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* [run-slow] superglue

* chore: removed blank whitespaces

* [run-slow] superglue

* Revert SuperPoint image processor accident changes

* [run-slow] superglue

* refactor: reverted copy from BERT class

* tests: lower the tolerance in integration tests for SuperGlue

* [run-slow] superglue

* chore: set do_grayscale to False in SuperPoint and SuperGlue image processors

* [run-slow] superglue

* fix: fixed imports in SuperGlue files

* chore: changed do_grayscale SuperGlueImageProcessing default value to True

* docs: added typehint to post_process_keypoint_matching method in SuperGlueImageProcessor

* fix: set matching_threshold default value to 0.0 instead of 0.2

* feat: added matching_threshold to post_process_keypoint_matching method

* docs: update superglue.md to include matching_threshold parameter

* docs: updated SuperGlueConfig docstring for matching_threshold default value

* refactor: removed unnecessary parameters in SuperGlueConfig

* fix: changed from matching_threshold to threshold

* fix: re-revert changes to make SuperGlue attention classes copies of BERT

* [run-slow] superglue

* fix: added missing device argument in post_processing method

* [run-slow] superglue

* fix: add matches different from -1 to compute valid matches in post_process_keypoint_matching (and docstring)

* fix: add device to image_sizes tensor instantiation

* tests: added checks on do_grayscale test

* chore: reordered and added Optional typehint to KeypointMatchingOutput

* LightGlue PR suggestions:
- use `post_process_keypoint_matching` as default docs example
- add `post_process_keypoint_matching` in autodoc
- add `SuperPointConfig` import under TYPE_CHECKING condition
- format SuperGlueConfig docstring
- add device in convert_superglue_to_hf
- Fix typo
- Fix KeypointMatchingOutput docstring
- Removed unnecessary line
- Added missing SuperGlueConfig in __init__ methods

* LightGlue PR suggestions:
- use batching to get keypoint detection

* refactor: processing images done in 1 for loop instead of 4

* fix: use @ instead of torch.einsum for scores computation

* style: added #fmt skip to long tensor values

* refactor: rollbacked validate_and_format_image_pairs valid and invalid case to more simple ones

* refactor: prepare_imgs

* refactor: simplified `validate_and_format_image_pairs`

* docs: fixed doc
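
A minimal usage sketch of the post-processing API these commits settle on; `post_process_keypoint_matching` and its `threshold` argument are the names introduced above, while the checkpoint name and image paths are assumptions:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Checkpoint name assumed from the magic-leap-community migration above.
processor = AutoImageProcessor.from_pretrained("magic-leap-community/superglue_outdoor")
model = AutoModel.from_pretrained("magic-leap-community/superglue_outdoor")

# One image pair; file paths are placeholders.
image1 = Image.open("image1.jpg")
image2 = Image.open("image2.jpg")
inputs = processor([image1, image2], return_tensors="pt")

with torch.no_grad():  # the doc change above recommends no_grad for inference
    outputs = model(**inputs)

# (height, width) for each image of each pair; threshold filters weak matches.
image_sizes = [[(image1.height, image1.width), (image2.height, image2.width)]]
matches = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)
```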

---------

Co-authored-by: steven <steven.bucaillle@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-20 10:32:39 +00:00
872dfbdd46 [ViTPose] Convert more checkpoints (#35638)
* Convert more checkpoints

* Update docs, convert huge variant

* Update model name

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Remove print statements

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Link to collection

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-20 11:29:47 +01:00
332fa024d6 Security fix for self-comment-ci.yml (#35548)
* Revert "Disable  `.github/workflows/self-comment-ci.yml` for now (#35366)"

This reverts commit ccc4a5a59b2d4134a49971915db0710e7a8c7824.

* fix

* fix

* fix

* least permission

* add env

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-20 11:16:03 +01:00
8571bb145a Fix CI for VLMs (#35690)
* fix some easy test

* more tests

* remove logit check here also

* add require_torch_large_gpu in Emu3
2025-01-20 11:15:39 +01:00
5fa3534475 Use AMD CI workflow defined in hf-workflows (#35058)
* Use AMD CI workflow defined in hf-workflows
2025-01-17 20:52:57 +01:00
7d4b3ddde4 ci: fix xpu skip condition for test_model_parallel_beam_search (#35742)
`return unittest.skip()` used in the `test_model_parallel_beam_search` xpu
skip condition did not actually mark the test as skipped when running
under pytest:
* 148 passed, 1 skipped

Other tests use `self.skipTest()`. Reusing this approach and moving the
condition outside the loop (since it does not depend on it) allows the test
to be skipped correctly for xpu:
* 148 skipped
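
A self-contained illustration of the pytest behavior described here (hypothetical test class, not the transformers suite):

```python
import unittest


class ExampleTest(unittest.TestCase):
    def test_wrong_skip(self):
        # unittest.skip() merely returns a decorator; returning it from a
        # running test ends the test normally, so pytest counts it as passed.
        return unittest.skip("this does NOT skip")

    def test_correct_skip(self):
        # self.skipTest() raises unittest.SkipTest, so pytest reports skipped.
        self.skipTest("this IS reported as skipped")
```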

Secondly, `device_map="auto"` is now implemented for XPU for IPEX>=2.5 and
torch>=2.6, so we can now enable these tests for XPU for new IPEX/torch
versions.

Fixes: 1ea3ad1ae ("[tests] use `torch_device` instead of `auto` for model testing (#29531)")

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-01-17 16:47:27 +01:00
8ad6bd0f1b Stop mutating input dicts in audio classification pipeline (#35754) 2025-01-17 15:41:56 +00:00
936a731534 Revert "Unable to use MimiModel with DeepSpeed ZeRO-3" (#35755)
Revert "Unable to use `MimiModel` with DeepSpeed ZeRO-3 (#34735)"

This reverts commit 54fd7e92604e8ecb2f4601aae2f75322af9184c5.
2025-01-17 16:29:26 +01:00
10e8cd0d63 Restore is_torch_greater_or_equal_than for backward compatibility (#35734)
* Restore is_torch_greater_or_equal_than for backward compatibility

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* review comments

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-01-17 16:22:44 +01:00
099d93d2e9 Grounding DINO Processor standardization (#34853)
* Add input ids to model output

* Add text preprocessing for processor

* Fix snippet

* Add test for equivalence

* Add type checking guard

* Fixing typehint

* Fix test for added `input_ids` in output

* Add deprecations and "text_labels" to output

* Adjust tests

* Fix test

* Update code examples

* Minor docs and code improvement

* Remove one-liner functions and rename class to CamelCase

* Update docstring

* Fixup
2025-01-17 14:18:16 +00:00
42b2857b01 OmDet Turbo processor standardization (#34937)
* Fix docstring

* Fix docstring

* Add `classes_structure` to model output

* Update omdet postprocessing

* Adjust tests

* Update code example in docs

* Add deprecation to "classes" key in output

* Types, docs

* Fixing test

* Fix missed clip_boxes

* [run-slow] omdet_turbo

* Apply suggestions from code review

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Make CamelCase class

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-01-17 14:10:19 +00:00
94ae9a8da1 OwlViT/Owlv2 post processing standardization (#34929)
* Refactor owlvit post_process_object_detection + add text_labels

* Fix copies in grounding dino

* Sync with Owlv2 postprocessing

* Add post_process_grounded_object_detection method to processor, deprecate post_process_object_detection

* Add test cases

* Move text_labels to processors only

* [run-slow] owlvit owlv2

* [run-slow] owlvit, owlv2

* Update snippets

* Update docs structure

* Update deprecated objects for check_repo

* Update docstring for post processing of image guided object detection
2025-01-17 13:58:28 +00:00
add5f0566c Added liger_kernel compatibility with PeftModel (#35680)
* Added liger_kernel compatibility with `PeftModel`

* Amending based on review comments

* Amending based on review comments
2025-01-17 14:43:20 +01:00
df6d42a914 check is added for the report_to variable in TrainingArguments (#35403)
check for report_to variable is added
2025-01-17 14:39:32 +01:00
54fd7e9260 Unable to use MimiModel with DeepSpeed ZeRO-3 (#34735)
use torch.tensor(), not torch.Tensor()
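
An illustration of the difference being fixed (generic PyTorch behavior, not the MimiModel code itself):

```python
import torch

torch.tensor([1, 2, 3])   # copies the data, infers dtype (int64 here)
torch.Tensor([1, 2, 3])   # legacy alias of torch.FloatTensor -- always float32
torch.Tensor(3)           # an UNINITIALIZED 1-D tensor of size 3, not the value 3
```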

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-01-17 14:06:20 +01:00
ab1afd56f5 Fix some tests (#35682)
* cohere tests

* glm tests

* cohere2 model name

* create decorator

* update

* fix cohere2 completions

* style

* style

* style

* add cuda in comments
2025-01-17 12:10:43 +00:00
8c1b5d3782 🚨🚨🚨 An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. (#35615)
* An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably.

* Fix fix on load issue

* Fix gamma/beta warning test

* A style complaint

* Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming.

* Habitual elif redundant with the return
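
A hedged sketch of the scoping idea (the helper name is hypothetical, not the actual transformers function): only rename legacy `gamma`/`beta` parameters when they belong to a LayerNorm key, instead of substring-replacing across every checkpoint key.

```python
def rename_legacy_norm_key(key: str) -> str:
    # Hypothetical helper illustrating the scoped rename described above.
    if "LayerNorm" in key:
        key = key.replace(".gamma", ".weight").replace(".beta", ".bias")
    return key


assert rename_legacy_norm_key("encoder.LayerNorm.gamma") == "encoder.LayerNorm.weight"
assert rename_legacy_norm_key("decoder.beta_proj.weight") == "decoder.beta_proj.weight"
```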
2025-01-16 17:25:44 -08:00
02a492a838 Added resource class configuration option for check_circleci_user job (#32866)
Added resource class configuration option for check_circleci_user job.
2025-01-16 21:31:18 +01:00
94af1c0aa2 [generate] return Cache object even if passed in a legacy format (#35673)
* generate returns a Cache object by default

* fix tests

* fix test for encoder-decoder models
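
A short sketch of the resulting contract; `DynamicCache.from_legacy_cache` is the existing conversion helper, and the tensor shapes below are arbitrary:

```python
import torch
from transformers import DynamicCache

# One layer of the legacy tuple-of-tuples format: (key_states, value_states),
# each shaped (batch, num_heads, seq_len, head_dim).
legacy = ((torch.zeros(1, 2, 3, 4), torch.zeros(1, 2, 3, 4)),)
cache = DynamicCache.from_legacy_cache(legacy)
print(type(cache).__name__)  # DynamicCache
```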
2025-01-16 17:06:24 +00:00
2818307e93 [generate] can instantiate GenerationConfig(cache_implementation="static") (#35679)
fix failing instantiation
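
The call this fixes, for reference:

```python
from transformers import GenerationConfig

# Previously raised at instantiation time; works after this fix.
generation_config = GenerationConfig(cache_implementation="static")
```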
2025-01-16 17:04:54 +00:00
aaa969e97d Remove pt_to_tf (#35672)
* rm command

* remove exception
2025-01-16 17:03:37 +00:00
80dbbd103c 🧹 remove generate-related objects and methods scheduled for removal in v4.48 (#35677)
* remove things scheduled for removal

* make fixup
2025-01-16 17:03:20 +00:00
aeeceb9916 [cache] add a test to confirm we can use cache at train time (#35709)
* add test

* augment test as suggested

* Update tests/utils/test_modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* rerun tests

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-16 17:02:34 +00:00
57bf1a12a0 Remove batch size argument warning when unjustified (#35519)
* use max batch size

* revert unneccessary change

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-16 17:48:11 +01:00
91be6a5eb2 Modular: support for importing functions from any file (#35692)
* fix function imports

* improve comment

* Update modeling_switch_function.py

* make checks more robust

* improvement

* rename

* final test update
2025-01-16 16:37:53 +00:00
8ebe9d7166 Optimize ForCausalLMLoss by removing unnecessary contiguous() call to reduce memory overhead (#35646)
Optimize ForCausalLMLoss by removing unnecessary contiguous() calls to reduce memory overhead
2025-01-16 15:47:43 +00:00
1302c32a84 Add proper jinja2 error (#35533)
* Cleanup jinja2 imports

* Raise a proper error if Jinja is missing

* make fixup
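
A hedged sketch of the pattern (not the exact transformers code): fail fast with an actionable message instead of an opaque error deep in the call stack.

```python
try:
    import jinja2  # noqa: F401
except ImportError:
    raise ImportError(
        "Chat templating requires jinja2 to be installed: `pip install jinja2`"
    )
```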
2025-01-16 15:31:11 +00:00
3292e96a4f [generation] fix type hint (#35725)
fix type hint
2025-01-16 15:09:59 +00:00
8b78d9d6e7 Fix the bug that Trainer cannot correctly call torch_jit_model_eval (#35722)
Fix the bug that the accelerator.autocast does not pass parameters correctly when calling torch_jit_model_eval (#35706)
2025-01-16 15:53:37 +01:00
2cbcc5877d Fix condition when GA loss bug fix is not performed (#35651)
* fix condition when GA loss bug fix is not performed

* max loss diff is 2.29

* fix typo

* add an extra validation that loss should not vary too much
2025-01-16 13:59:53 +01:00
fd4f14c968 Fix: Falcon tie_word_embeddings in GGUF (#35715)
* fix falcon tie_word_embeddings

* fix style
2025-01-16 13:18:22 +01:00
bef7dded22 Replace deprecated batch_size with max_batch_size when using HybridCache (#35498)
* Replace deprecated batch_size with max_batch_size

- Functionality remains the same, because property getter batch_size(self) returned max_batch_size anyways.
- This change just avoids an unnecessary warning about deprecation.

* Use max_batch_size instead of deprecated batch_size with HybridCache

* Use max_batch_size instead of deprecated batch_size with HybridCache

- Change generated code to match original source
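
A sketch of the change (checkpoint name assumed; `HybridCache` is the cache used by Gemma-2-style models):

```python
from transformers import AutoConfig, HybridCache

config = AutoConfig.from_pretrained("google/gemma-2-2b")  # assumed checkpoint
# batch_size=1 would still work here but emits a deprecation warning;
# max_batch_size is the equivalent, warning-free argument.
cache = HybridCache(config=config, max_batch_size=1, max_cache_len=256)
```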
2025-01-16 11:48:41 +00:00
99e0ab6ed8 Fix typo in /docs/source/ja/model_doc/decision_transformer.md URL (#35705)
doc: Update original code repository URL
2025-01-15 07:36:50 -08:00
12dfd99007 Fix : Nemotron Processor in GGUF conversion (#35708)
* fixing nemotron processor

* make style
2025-01-15 14:25:44 +01:00
387663e571 Enable gptqmodel (#35012)
* gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update readme

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix warning

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix version check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert unrelated changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix requires gptq

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format again

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attributes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix memory check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix result check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-15 14:22:49 +01:00
615bf9c5e4 Add future import for Py < 3.10 (#35666)
* Add future import for Py < 3.10

* make fixup

* Same issue in convert_olmo2_weights_to_hf.py
2025-01-15 12:45:43 +00:00
09d5f76274 Clean-up composite configs (#34603)
* remove manual assignment tie-word-embeddings

* remove another unused attribute

* fix tests

* fix tests

* remove unnecessary overwrites

* fix

* decoder=True

* clean pix2struct

* run-all

* forgot `_tied_weights_keys` when adding Emu3

* also Aria + fix-copies

* and clean aria
2025-01-15 10:04:07 +01:00
c61fcde910 Enhance DataCollatorForLanguageModeling with Configurable Token Replacement Probabilities (#35251)
* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over the token masking and replacing

* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over the token masking and replacing

* Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling
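
A sketch of the configurable collator; the `mask_replace_prob`/`random_replace_prob` parameter names are assumptions based on this PR's description, and the values below reproduce the classic BERT 80/10/10 mask/random/keep split:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm_probability=0.15,     # fraction of tokens selected for masking
    mask_replace_prob=0.8,    # assumed name: selected tokens replaced with [MASK]
    random_replace_prob=0.1,  # assumed name: selected tokens replaced with a random token
)
```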
2025-01-14 17:01:10 +00:00
b0cdbd9119 Enhanced Installation Section in README.md (#35094)
* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.

* Update README.md

Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.

* Update installation.md

Updated installation.md to include virtual environment and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting and GPU setup instructions.

* Update installation.md

* Update installation.md

* Update installation.md

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update installation.md

Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.

* Update README.md

Removed numbering from README.md.

* Update README.md

Removed unnecessary "a)" formatting as per maintainer feedback.

* Update README.md

Added blank lines around code snippets for better readability.

* Update README.md

Removed the line "b) Install a backend framework:" from README.md as per feedback.

* Update README.md

Simplified "For Windows:" to "Windows" in README.md as per feedback as well as "For macOS/Linux:" to "macOS/Linux"

* Update README.md

Removed unnecessary heading and retained valid code snippet.

* Update README.md

Removed unnecessary heading "d) Optional: Install from source for the latest updates" as per feedback.

* Update README.md

Removed "GPU Setup (Optional)" section to align with minimal design feedback.

* Update installation.md

Removed "Create and Activate a Virtual Environment" section from installation.md as per feedback.

* Update installation.md

Adjusted "Troubleshooting" to a second-level heading and added an introductory line as per feedback.

* Update installation.md

Updated troubleshooting section with simplified headings and formatted code blocks as per feedback.

* Update installation.md

Integrated GPU setup instructions into the "Install with pip" section for better content flow.

* Update README.md

Removed Troubleshooting section from README.md for minimalism as per maintainer feedback.
2025-01-14 08:05:08 -08:00
a11041ffad Fix : add require_read_token for gemma2 gated model (#35687)
fix gemma2 gated model test
2025-01-14 11:47:05 +01:00
df2a812e95 Fix expected output for ggml test (#35686)
fix expected output
2025-01-14 11:46:55 +01:00
050636518a Fix : HQQ config when hqq not available (#35655)
* fix

* make style

* adding require_hqq

* make style
2025-01-14 11:37:37 +01:00
715fdd6459 Update torchao.md: use auto-compilation (#35490)
* Update torchao.md: use auto-compilation

* Update torchao.md: indicate updating transformers to the latest

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-01-14 11:33:48 +01:00
4b8d1f7fca Fix : adding einops lib in the CI docker for some bitsandbytes tests (#35652)
* fix docker

* fix
2025-01-14 07:36:10 +01:00
34f76bb62b Fix zero_shot_image_classification documentation guide link in SigLIP (#35671) 2025-01-13 11:08:17 -08:00
c23a1c1932 Add-helium (#35669)
* Add the helium model.

* Add a missing helium.

* And add another missing helium.

* Use float for the rmsnorm mul.

* Add the Helium tokenizer converter.

* Add the pad token as suggested by Arthur.

* Update the RMSNorm + some other tweaks.

* Fix more rebase issues.

* fix copies and style

* fixes and add helium.md

* add missing tests

* update the backlink

* oups

* style

* update init, and expected results

* small fixes

* match test outputs

* style fixup, fix doc builder

* add dummies and we should be good to go!

* update sdpa and fa2 documentation

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-01-13 18:41:15 +01:00
a3f82328ed [i18n-ar] Translated file : docs/source/ar/tasks/token_classification.md into Arabic (#35193)
* Create token_classification.md

* Update token_classification.md

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/token_classification.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-13 09:32:15 -08:00
2fa876d2d8 [tests] make cuda-only tests device-agnostic (#35607)
* intial commit

* remove unrelated files

* further remove

* Update test_trainer.py

* fix style
2025-01-13 14:48:39 +01:00
e6f9b03464 [Compile] Only test compiling model forward pass (#35658)
* rename test to only compile forward!

* style emu
2025-01-13 13:43:29 +01:00
84a6789145 Enable different torch dtype in sub models (#34873)
* fix

* fix test

* add tests

* add more tests

* fix tests

* supposed to be a torch.dtype test

* handle BC and make fp32 default
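
A sketch of the feature (checkpoint and sub-config keys are assumptions): composite models can load each sub-model in its own dtype, with fp32 as the backward-compatible default for anything not listed.

```python
import torch
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "some/vision-language-checkpoint",  # placeholder
    torch_dtype={
        "text_config": torch.bfloat16,  # per-sub-model dtypes
        "vision_config": torch.float16,
    },
)
```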
2025-01-13 13:42:08 +01:00
87089176d9 [Phi] bias should be True (#35650)
bias should be True
2025-01-13 13:15:07 +01:00
91f14f1fc4 Removed some duplicated code (#35637)
* Removed duplicate class field definition.

* Removed duplicate code in try-except block.

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-01-13 12:34:21 +01:00
b8c34d97fc Fix whisper compile (#35413)
Fix compile error

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-01-13 11:31:51 +01:00
cd44bdb4b8 Fix device in rope module when using dynamic updates (#35608)
fix rope device
2025-01-13 10:11:17 +01:00
15bd3e61f8 Update codeowners with individual model owners (#35595)
* Update codeowners with individual model owners

* rip yoach

* add comment

* Replace - with _

* Add @qubvel for zero-shot object-detection

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add yoni for omdet-turbo

* Update CODEOWNERS

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* Refactor / comment the CODEOWNERS file

* Capture modular files as well

* Add dummies without owner

* More cleanup

* Set Niels on a few more models that he added

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-01-10 17:59:36 +00:00
1e3c6c1f7d Skip MobileNetV1ModelTest::test_batching_equivalence for now (#35614)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 18:32:36 +01:00
04eae987f3 Fix flaky test_beam_search_low_memory (#35611)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 17:31:03 +01:00
b02828e4af Let EarlyStoppingCallback not require load_best_model_at_end (#35101)
* Bookmark

* Add warning
2025-01-10 10:25:32 -05:00
0aaf124fb9 Added error when sequence length is bigger than max_position_embeddings (#32156)
* Added error when sequence length is bigger than max_position_embeddings

* Fixed formatting

* Fixed bug

* Changed copies to match

* Fixed bug

* Applied suggestions

* Removed redundant code

* Fixed bugs

* Bug fix

* Bug fix

* Added requested Changes

* Fixed bug

* Fixed unwanted change

* Fixed unwanted changes

* Fixed formatting
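
A hedged, self-contained sketch of the guard this PR adds (the error wording is illustrative):

```python
def check_sequence_length(seq_len: int, max_position_embeddings: int) -> None:
    # Raise instead of silently producing invalid positional embeddings.
    if seq_len > max_position_embeddings:
        raise ValueError(
            f"Sequence length ({seq_len}) exceeds the model's "
            f"max_position_embeddings ({max_position_embeddings})."
        )


check_sequence_length(512, 1024)     # fine
# check_sequence_length(2048, 1024)  # raises ValueError
```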
2025-01-10 15:23:54 +00:00
1211e616a4 Use inherit tempdir makers for tests + fix failing DS tests (#35600)
* Use existing APIs to make tempdir folders

* Fixup deepspeed too

* output_dir -> tmp_dir
2025-01-10 10:01:58 -05:00
bbc00046b9 Fix flaky test_custom_4d_attention_mask (#35606)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 15:40:04 +01:00
f63829c87b v4.49.0-dev 2025-01-10 12:31:11 +01:00
52e1f87c7d [WIP] Emu3: add model (#33770)
* model can convert to HF and be loaded back

* nit

* works in single batch generation but hallucinates

* use the image tokens

* add image generation

* now it works

* add tests

* update

* add modular but it doesn't work for porting docstring :(

* skip some tests

* add slow tests

* modular removed the import?

* guess this works

* update

* update

* fix copies

* fix test

* fix copies

* update

* docs

* fix tests

* last fix tests?

* pls

* repo consistency

* more style

* style

* remove file

* address comments

* tiny bits

* update after the new modular

* fix tests

* add one more cond in check attributes

* decompose down/up/mid blocks

* allow static cache generation in VLMs

* nit

* fix copies

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/emu3.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix VAE upsampling

* Update src/transformers/models/emu3/modular_emu3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* state overwritten stuff explicitly

* fix copies

* add the flag for flex attn

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-10 12:23:00 +01:00
ccc0381d36 Fix flex_attention in training mode (#35605)
* fix flex

* add test

* style
2025-01-10 11:49:12 +01:00
a9bd1e6284 Remove benchmark.py after #34275 2025-01-10 11:09:06 +01:00
e0646f3dce Chat template: return vectorized output in processors (#34275)
* update chat template

* style

* fix tests

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typehints + docs

* fix tests

* remove unnecessary warnings

* forgot code style :(

* allow users to pass backend and num frames

* Update docs/source/en/chat_templating.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/image_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/processing_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* typo fix

* style

* address comments

* align with "pipeline" template

* update docs

* update docs

* unpack for all kwargs?

* wrong conflict resolution while rebasing

* tmp

* update docs

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
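
A sketch of the vectorized chat-template behavior this PR enables (real checkpoint name, placeholder image URL): the processor can now hand back model-ready tensors directly.

```python
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/cat.png"},  # placeholder URL
        {"type": "text", "text": "Describe this image."},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True,
    tokenize=True, return_dict=True, return_tensors="pt",
)
```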
2025-01-10 11:05:29 +01:00
5f087d1335 Add Moonshine (#34784)
* config draft

* full encoder forward

* full decoder forward

* fix sdpa and FA2

* fix sdpa and FA2

* moonshine model

* moonshine model forward

* fix attention with past_key_values

* add MoonshineForConditionalGeneration

* fix cache handling and causality for cross attention

* no causal attention mask for the encoder

* model addition (imports etc)

* small nit

* nits

* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* add rope_theta

* nits

* model doc

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* imports

* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES

* updates modular

* make

* make fix-copies

* ruff check examples fix

* fix check_modular_conversion

* nit

* nits

* nits

* copied from -> imports

* imports fix

* integrate attention refacto

* modular edge case

* remove encoder

* convolutions params in config

* run modular_model_converter

* make

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* MoonshineModelTest

* correct typo

* make style

* integration tests

* make

* modular convert

* name conversion update (up_proj -> fc1 etc)

* update config

* update MLP

* update attention

* update encoder layer

* update decoder layer

* update convolutions parameters

* update encoder

* remove INPUTS_DOCSTRING

* update decoder

* update conditional generation

* update pretrained model

* imports

* modular converted

* update doc

* fix

* typo

* update doc

* update license

* update init

* split config in file

* two classes for MLP

* attention from GLM

* from GlmRotaryEmbedding

* split MLP

* apply arthur's review suggestions

* apply arthur's review suggestions

* apply arthur's review suggestions

* auto feature extractor

* convert modular

* fix + make

* convert modular

* make

* unsplit config

* use correct checkpoint

* wrap generate

* update tests

* typos

* make

* typo

* update doc

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-01-10 11:00:54 +01:00
6f127d3f81 Skip torchscript tests if a cache object is in model's outputs (#35596)
* fix 1

* fix 1

* comment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 10:46:03 +01:00
6b73ee8905 ModernBert: reuse GemmaRotaryEmbedding via modular + Integration tests (#35459)
* Introduce 5 integration tests for the 4 model classes + torch export

* ModernBert: reuse GemmaRotaryEmbedding via modular

* Revert #35589, keep rope_kwargs; rely on them in modular_modernbert

* Revert "Revert #35589, keep rope_kwargs; rely on them in modular_modernbert"

This reverts commit 11b44b9ee83e199cbfb7c5ba2d11f7a7fdbba2d3.

* Don't set rope_kwargs; override 'self.rope_init_fn' call instead
2025-01-10 10:25:10 +01:00
8de7b1ba8d Add flex_attn to diffllama (#35601)
Add sdpa to diffllama
2025-01-09 20:49:11 +01:00
1e3ddcb2d0 ModernBERT bug fixes (#35404)
* bug fixes

* organize imports

* wrap cpu warning in reference_compile

* Avoid needing repad_logits_with_grad, always repad with grads when training

I'm not 100% sure that the conditional with "or labels is None" makes sense though - not sure what the intention is there. Perhaps we can remove that?

* Revert "Avoid needing repad_logits_with_grad, always repad with grads when training"

This reverts commit cedcb4e89bcea199a1135a0933e71f534b656239.

* Fix grammar: keep -> keeps

* Propagate grammar fix with modular_model_converter

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
2025-01-09 20:15:38 +01:00
e97d7a5be5 add _supports_flex_attn = True for models that do support it (#35598)
* add `_supports_flex_attn = True`

* fix repo consistency
2025-01-09 20:03:33 +01:00
c9c682d19c [doc] deepspeed universal checkpoint (#35015)
* universal checkpoint

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/deepspeed.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-09 09:50:51 -08:00
3a4ae6eace Refactor/fix Cohere2 (#35594)
* refactor/fix cohere2

* add kwargs

* tests

* remove func and import it
2025-01-09 17:54:57 +01:00
32e0db8a69 [tokenizers] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer (#35593)
* Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer

in PreTrainedTokenizerFast, rather than relying on subclasses to take care of this.

* Simplify setting self.add_prefix_space, ensure pre_tok exists

* Wrap in try-except to catch 'Custom PreTokenizer cannot be serialized'

862d1a346a/bindings/python/src/pre_tokenizers.rs (L672) produces the Exception. They're triggered by the roformer tests, as the RoFormerTokenizerFast uses a custom PreTokenizer.

* Propagate add_prefix_space in T5TokenizerFast to superclass
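
A sketch of the now-consistent behavior with GPT-2's byte-level pre-tokenizer; the equality below is the expected outcome once the flag reaches `backend_tokenizer.pre_tokenizer`:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)
# The prefix space is added by the backend pre-tokenizer when missing, so
# both spellings should now encode identically.
print(tok("hello").input_ids == tok(" hello").input_ids)  # True
```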
2025-01-09 17:46:50 +01:00
46276f9a7f Fix modular edge case + modular sorting order (#35562)
* look-ahead negation

* re add examples by default

* Fix the bug in topological sort

* Update create_dependency_mapping.py

* start adding test

* finalize test

* more tests

* style

* style
2025-01-09 17:17:52 +01:00
d3fe9fa3fe PR for Issue #22694: Fixed Training Evaluation table display for VSCode (#35557) 2025-01-09 15:05:47 +00:00
395b114bd1 Small fix rope kwargs (#35589)
* don't know why this keeps popping up?

* remove unused rope_kwargs
2025-01-09 15:40:36 +01:00
82dd6c14bb Fix flaky SwitchTransformersModelTest::test_training_gradient (#35587)
* fix

* Update tests/models/switch_transformers/test_modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-09 15:36:22 +01:00
eb4579cf43 tokenizer train from iterator without pre_tokenizers (#35396)
* fix if else issues

* add a test

* fix the test

* style
2025-01-09 15:34:43 +01:00
320512df46 feat: add TP plan for granite (#35573)
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>
2025-01-09 15:25:55 +01:00
633da1b10e [Idefics3] Move image features to same device as input embeds (#35100)
* [Idefics3] Move image features to same device as input embeds

* Update src/transformers/models/idefics3/modeling_idefics3.py

* make style

---------

Co-authored-by: Saif Rehman Nasir <shyshin@github.com>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-09 14:25:36 +01:00
832c6191ed Add inputs_embeds param to ModernBertModel (#35373)
* update modular_modernbert -- add inputs_embeds param to ModernBertModel

* Fix implementation issues; extend to other classes; docstring

First of all, the inputs_embeds shouldn't fully replace `self.embeddings(input_ids)`, because this call also does layer normalization and dropout. So, now both input_ids and inputs_embeds are passed to the ModernBertEmbeddings, much like how BertEmbeddings is implemented.

I also added `inputs_embeds` to the docstring, and propagated the changes to the other model classes.

I also introduced an error if input_ids and inputs_embeds are both or neither provided.

Lastly, I fixed an issue with device being based solely on input_ids with attention_mask.
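
A hedged sketch of the flow described above (simplified, not the exact ModernBert code):

```python
def resolve_embeddings(embeddings_module, input_ids=None, inputs_embeds=None):
    # Exactly one input source may be provided.
    if (input_ids is None) == (inputs_embeds is None):
        raise ValueError("Provide exactly one of input_ids or inputs_embeds.")
    # Both branches go through the embeddings module, so inputs_embeds still
    # receive LayerNorm + dropout instead of bypassing them.
    return embeddings_module(input_ids=input_ids, inputs_embeds=inputs_embeds)
```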

* Propagate inputs_embeds to ModernBertForMaskedLM correctly

Also reintroduce inputs_embeds test

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2025-01-09 14:17:26 +01:00
1b2f942af7 Fix flaky test_batching_equivalence (#35564)
* yes!

* oh no!!!

* oh no!!!

* style

* oh no!!!

* oh no!!!

* oh no!!!

* oh no!!!

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-09 14:00:08 +01:00
4adc415b6d Setup loss_type in config at model init time (#34616)
* setup loss_type in config at model init time

ensures no additional graph break is introduced when torch.compile'd

fixes #34615

Signed-off-by: ChanderG <mail@chandergovind.org>

* lookup loss mapping at init time instead of manual setup

Signed-off-by: ChanderG <mail@chandergovind.org>

* remove redundant lookup at loss_function time

* override loss_type at init time

---------

Signed-off-by: ChanderG <mail@chandergovind.org>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-01-09 13:32:21 +01:00
c8ab6ce6ce Re-add missing __all__ for Cohere and Phi3 (#35578)
re-add missing __all__
2025-01-09 11:29:31 +01:00
487c31a21f Minor fix in video text 2 text docs (#35546)
minor fix in docs
2025-01-09 11:20:36 +01:00
965a2fb320 More model refactoring! (#35359)
* cohere

* style

* phi3

* style

* small fix

* small fix

* phi3 longrope

* oups

* Update rope (only for phi3 still)

* Update test_modeling_rope_utils.py

* Update modeling_phi3.py

* fix

* fix copies

* style

* Fix copied from bad renaming
2025-01-09 11:09:09 +01:00
137965ca7d Don't show warning for inv_freq buffers (#35255)
dont show warning
2025-01-09 10:46:01 +01:00
8cad65a698 Fix multi-gpu loss (#35395)
push to device
2025-01-09 10:14:31 +01:00
2e2f8015c0 update code owners (#35576)
update
2025-01-09 09:55:41 +01:00
a6256ec098 [i18n-ar] Translated file: docs/source/ar/tasks/multiple_choice.md into Arabic (#35199)
* Add the Arabic translation: multiple_choice.md

* Update multiple_choice.md

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/multiple_choice.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Add files via upload

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2025-01-08 14:17:58 -08:00
b32938aeee Fix all output_dir in test_trainer.py to use tmp_dir (#35266)
* update codecarbon

* replace directly-specified-test-dirs with tmp_dir

* pass tmp_dir to all get_regression_trainer

* test_trainer.py: Use tmp_dir consistently for all output_dir arguments

* fix some with...as tmp_dir blocks

* reflect the comments to improve test_trainer.py

* refresh .gitignore
2025-01-08 19:44:39 +01:00
76da6ca034 Pipeline: simple API for assisted generation (#34504)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
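
A sketch of the pipeline API added here; gpt2 + distilgpt2 are used as an example main/assistant pair because they share a tokenizer, a requirement for assisted generation:

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="openai-community/gpt2",
    assistant_model="distilbert/distilgpt2",
)
out = pipe("The quick brown fox", max_new_tokens=20)
```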
2025-01-08 17:08:02 +00:00
3f483beab9 [PixtralLarge] Update Pixtral conversion script to support large format! (#34801)
* update conversion script

* update for bias again

* remove pdv

* use my dir

* Update how we initialize the tokenizer

* Convert in bfloat16

* Undo that one again

* fix config dump

* .to() was broken for BatchMixFeature

* quick debug breakpoint

* put the breakpoint in the right place

* Add a config flag for the multimodal projector bias

* Add a config flag for the multimodal projector bias

* Conversion script can load chat templates

* Indent config for comparison

* Stop clobbering the config

* Re-enable the config clobber

* Get rid of the config manual save - it has no effect!

* Handle adapter bias correctly

* Default vision transformer activation to silu

* Remove legacy processing path

* One commit with all the debug breakpoints before I delete them all, in case I need to revert

* Update conversion

* Remove vLLM debugging instrumentation

* Drop xformers

* Remove debug enumerates

* make fixup

* make fixup

* Break copied from in pixtral

* Propagate multimodal_projector_bias change

* Propagate multimodal_projector_bias change

* Remove debug device .to()

* Restore attention weights output

* Fix Pixtral test

* Drop image_seq_length

* Drop image_seq_length

* Put the legacy processing code back

* Add the bias option to the llava_next_video config

* Add the bias option to the llava_next_video config

* Make certain args required in converter

* Make certain args required in converter

* typo

* make fixup

* Reverting some dtype changes since it seems to work without them

---------

Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-166-244.ec2.internal>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:39:47 +01:00
4c2c12b3de [docs] Remove Hiera from AUDIO MODELS in docs (#35544)
Remove Hiera from AUDIO MODELS

Hiera is a visual model and should not appear in audio model...
2025-01-08 16:33:21 +00:00
854dc7941b overwrite top_k when creating audio classification pipeline (#35541)
* overwrite top_k when creating audio classification pipeline

* Update src/transformers/pipelines/audio_classification.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 16:32:27 +00:00
8c555ca3d7 add code owners (#35528)
* add co owners

* normal processing

* /src/transformers/models/*/*_modeling*

* Update CODEOWNERS

* Update CODEOWNERS

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update CODEOWNERS

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* nit

* Apply suggestions from code review

Co-authored-by: Alvaro Moran <6949769+tengomucho@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update CODEOWNERS

* rather put `@Rocketknight1`

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Alvaro Moran <6949769+tengomucho@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
2025-01-08 17:14:44 +01:00
8490d3159c Add ViTPose (#30530)
* First draft

* Make fixup

* Make forward pass work

* Improve code

* More improvements

* More improvements

* Make predictions match

* More improvements

* Improve image processor

* Fix model tests

* Add classic decoder

* Convert classic decoder

* Verify image processor

* Fix classic decoder logits

* Clean up

* Add post_process_pose_estimation

* Improve post_process_pose_estimation

* Use AutoBackbone

* Add support for MoE models

* Fix tests, improve num_experts

* Improve variable names

* Make fixup

* More improvements

* Improve post_process_pose_estimation

* Compute centers and scales

* Improve postprocessing

* More improvements

* Fix ViTPoseBackbone tests

* Add docstrings, fix image processor tests

* Update index

* Use is_cv2_available

* Add model to toctree

* Add cv2 to doc tests

* Remove script

* Improve conversion script

* Add coco_to_pascal_voc

* Add box_to_center_and_scale to image_transforms

* Update tests

* Add integration test

* Fix merge

* Address comments

* Replace numpy by pytorch, improve docstrings

* Remove get_input_embeddings

* Address comments

* Move coco_to_pascal_voc

* Address comment

* Fix style

* Address comments

* Fix test

* Address comment

* Remove udp

* Remove comment

* [WIP] need to check if the numpy function is same as cv

* add scipy affine_transform

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* refactor convert

* add output_shape

* add atol 5e-2

* Use hf_hub_download in conversion script

* make box_to_center more applicable

* skip test_get_set_embedding

* fix to accept array and fix CI

* add co-contributor

* make it to tensor type output

* add torch

* change to torch tensor

* add more test

* minor change

* CI test change

* import torch should be above ImageProcessor

* make style

* try not use torch in def

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/configuration_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix

* fix

* add caution

* make more detail about dataset_index

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>

* add docs

* Update docs/source/en/model_doc/vitpose.md

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Revert "Update src/transformers/__init__.py"

This reverts commit 7ffa504450bb9dbccf9c7ea668441b98a1939d5c.

* change name

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/vitpose/test_modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* move vitpose only function to image_processor

* raise valueerror when using timm backbone

* use out_indices

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove camel-case of def flip_back

* rename vitposeEstimatorOutput

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix confused camelcase of MLP

* remove in-place logic

* clear scale description

* make consistent batch format

* docs update

* formatting docstring

* add batch tests

* test docs change

* Update src/transformers/models/vitpose/image_processing_vitpose.py

* Update src/transformers/models/vitpose/configuration_vitpose.py

* change ViT to Vit

* change to enable MoE

* make fix-copies

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* extract udp

* add more described docs

* simple fix

* change to accept target_size

* make style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vitpose/configuration_vitpose.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change to `verify_backbone_config_arguments`

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove unnecessary copy

* make config immutable

* enable gradient checkpointing

* update inappropriate docstring

* linting docs

* split function for visibility

* make style

* check isinstances

* change to acceptable use_pretrained_backbone

* make style

* remove copy in docs

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/modeling_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* simple fix + make style

* change input config of activation function to string

* Update docs/source/en/model_doc/vitpose.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* tmp docs

* delete index.md

* make fix-copies

* simple fix

* change conversion to sam2/mllama style

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vitpose/image_processing_vitpose.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* refactor convert

* add supervision

* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* remove redundant def

* separate code block for visualization

* add validation for num_moe

* final commit

* add labels

* [run-slow] vitpose, vitpose_backbone

* Update src/transformers/models/vitpose/convert_vitpose_to_hf.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* enable all conversion

* final commit

* [run-slow] vitpose, vitpose_backbone

* ruff check --fix

* [run-slow] vitpose, vitpose_backbone

* rename split module

* [run-slow] vitpose, vitpose_backbone

* fix pos_embed

* Simplify init

* Revert "fix pos_embed"

This reverts commit 2c56a4806e30bc9b5753b142fa04b913306c54ff.

* refactor single loop

* allow flag to enable custom model

* improve MoE efficiency by skipping unused experts

* make style

* Fix range -> arange to avoid warning

* Revert MOE router, a new one does not work

* Fix postprocessing a bit (labels)

* Fix type hint

* Fix docs snippets

* Fix links to checkpoints

* Fix checkpoints in tests

* Fix test

* Add image to docs

---------

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 16:02:14 +00:00
4349a0e401 fix: Qwen2-VL generate with inputs_embeds (#35466)
* fix: Qwen2-VL generate with inputs_embeds

* change: optional input_ids in get_rope_index
2025-01-08 16:36:03 +01:00
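
For context on the fixed path, a minimal sketch of generating from `inputs_embeds` with the high-level `generate()` API; the checkpoint name is illustrative and the sketch uses a text-only model, while the fix itself targets Qwen2-VL:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; the inputs_embeds path of generate() is shared
# across decoder-only models, including Qwen2-VL's language backbone.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")

inputs = tokenizer("Hello, world", return_tensors="pt")
# Embed the tokens ourselves instead of passing input_ids.
inputs_embeds = model.get_input_embeddings()(inputs.input_ids)

with torch.no_grad():
    out = model.generate(
        inputs_embeds=inputs_embeds,
        attention_mask=inputs.attention_mask,
        max_new_tokens=20,
    )
# With inputs_embeds, generate() returns only the newly generated tokens.
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
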
88e18b3c63 Update doc for metric_for_best_model when save_strategy="best". (#35389)
* Updated docstring for _determine_best_metric.

* Updated docstring for metric_for_best_model.

* Added test case for save strategy.

* Updated incorrect test case.

* Changed eval_strategy to match save_strategy.

* Separated test cases for metric.

* Allow load_best_model when save_strategy == "best".

* Updated docstring for metric_for_best_model.
2025-01-08 16:32:35 +01:00
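
A sketch of the configuration these docs describe, with illustrative values: under `save_strategy="best"`, a checkpoint is saved whenever `metric_for_best_model` improves:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",                    # illustrative
    eval_strategy="steps",               # evaluation must run to track a "best" metric
    eval_steps=500,
    save_strategy="best",                # save only when the metric improves
    metric_for_best_model="eval_loss",
    greater_is_better=False,             # lower eval_loss is better
    load_best_model_at_end=True,
)
```
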
29e74b7cbc Add: num_additional_image_tokens to models (#35052)
* Add: num_additional_image_tokens to models

* docs: update docstring for num_additional_image_tokens in configuration files

* Add num_additional_image_tokens to LlavaNextVideo model and update feature selection logic

* revert

* Fix: adjust num_image_tokens calculation in LlavaProcessor

* Remove num_additional_image_tokens initialization from configuration files

* Fix test error

* revert

* Fix: adjust num_image_tokens calculation in LlavaNextVideoProcessor

* fix conflict

* Fix: adjust num_image_tokens calculation in VideoLlavaProcessor

* make style

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2025-01-08 16:20:01 +01:00
657bb14f98 Enable auto task for timm models in pipeline (#35531)
* Enable auto task for timm models

* Add pipeline test
2025-01-08 15:14:17 +00:00
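
Assuming the PR works as the title suggests, `pipeline()` can now infer the task from a timm checkpoint instead of requiring it explicitly; a sketch with an illustrative checkpoint name:

```python
from transformers import pipeline

# No task argument: with auto task enabled, "image-classification" is
# inferred from the timm checkpoint's metadata.
pipe = pipeline(model="timm/resnet18.a1_in1k")  # illustrative checkpoint
print(pipe("cat.jpg")[0])                       # illustrative image path
```
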
1a6c1d3a9a Bump torch requirement to >= 2 (#35479)
Bump torch requirement, follow-up of #35358
2025-01-08 15:59:32 +01:00
59e5b3f01b Timm wrapper label names (#35553)
* Add timm wrapper label names mapping

* Add index to classification pipeline

* Revert adding index for pipelines

* Add custom model check for loading timm labels

* Add tests for labels

* [run-slow] timm_wrapper

* Add note regarding label2id mapping
2025-01-08 14:09:46 +00:00
f1639ea51d Update missing model error message (#35370)
* Update missing model error message

* Update missing model error message

* Update missing model error message

* Fix capitalization
2025-01-08 15:05:06 +01:00
bd39b0627b Update doc and default value of TextNetImageProcessor (#35563)
update doc and default value
2025-01-08 13:47:52 +00:00
651cfb400f Add support for modular with fast image processors (#35379)
* Add support for modular with fast image processors

* fix order and remove copied from

* add comment for "image_processing*_fast"
2025-01-08 08:37:57 -05:00
430d3d43a5 [Docs] links to logits-processor-zoo (#35552)
links to logits-processor-zoo
2025-01-08 13:36:30 +00:00
3c1895aa65 Fix Qwen2VL processor to handle odd number of frames (#35431)
* fix: processing odd number of frames

* feat: add test case

* update: test one frame

* feat: support custom patch size

* fix: test with videos

* revert: change on patch repeat

* fix: much wow

* update: fixups

* fixup pls

* ruff fixup

* fix typo at least
2025-01-08 13:49:00 +01:00
3fde88b19d support chat generator as input of TextGenerationPipeline (#35551)
* support chat generator as input of TextGenerationPipeline

* missing import

* fix tests

* again

* simpler

* add test
2025-01-08 13:27:07 +01:00
ebdd1ad400 Pass correct num_items_in_batch value into the training_step function (#35438)
pass correct `num_items_in_batch` to compute_loss
2025-01-08 13:16:03 +01:00
0e0516c119 MODERNBERT_INPUTS_DOCSTRING: past_key_values are ignored (#35513)
* MODERNBERT_INPUTS_DOCSTRING: past_key_values are ignored

* sync to modular_modernbert.py
2025-01-08 11:45:40 +01:00
d1681ec2b6 VLMs: major clean up 🧼 (#34502)
only llava models are modified
2025-01-08 10:35:23 +01:00
7176e06b52 Add TextNet (#34979)
* WIP

* Add config and modeling for Fast model

* Refactor modeling and add tests

* More changes

* WIP

* Add tests

* Add conversion script

* Add conversion scripts, integration tests, image processor

* Fix style and copies

* Add fast model to init

* Add fast model in docs and other places

* Fix import of cv2

* Rename image processing method

* Fix build

* Fix Build

* fix style and fix copies

* Fix build

* Fix build

* Fix Build

* Clean up docstrings

* Fix Build

* Fix Build

* Fix Build

* Fix build

* Add test for image_processing_fast and add documentation tests

* some refactorings

* Fix failing tests

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Introduce TextNet

* Fix failures

* Refactor textnet model

* Fix failures

* Add cv2 to setup

* Fix failures

* Fix failures

* Add CV2 dependency

* Fix bugs

* Fix build issue

* Fix failures

* Remove textnet from modeling fast

* Fix build and other things

* Fix build

* some cleanups

* some cleanups

* Some more cleanups

* Fix build

* Incorporate PR feedbacks

* More cleanup

* More cleanup

* More cleanup

* Fix build

* Remove all the references of fast model

* More cleanup

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix Build

* Fix build

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Fix style

* Fix build

* Incorporate PR feedbacks

* Fix image processing mean and std

* Incorporate PR feedbacks

* fix build failure

* Add assertion to image processor

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* fix style failures

* fix build

* Fix Imageclassification's linear layer, also introduce TextNetImageProcessor

* Fix build

* Fix build

* Fix build

* Fix build

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix build

* Incorporate PR feedbacks

* Remove some script

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Fix image processing in textnet

* Incorporate PR Feedbacks

* Fix CI failures

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Fix failing test

* Add textnet to readme

* Improve readability

* Incorporate PR feedbacks

* fix code style

* fix key error and convert working

* tvlt shouldn't be here

* fix test modeling test

* Fix tests, make fixup

* Make fixup

* Make fixup

* Remove TEXTNET_PRETRAINED_MODEL_ARCHIVE_LIST

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_image_processing_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* space typo

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* improve type annotation

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/configuration_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* make conv layer kernel sizes and strides default to None

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix keyword bug

* add batch init and make fixup

* Make fixup

* Update integration test

* Add figure

* Update textnet.md

* add testing and fix errors (classification, imgprocess)

* fix error check

* make fixup

* make fixup

* revert to original docstring

* add make style

* remove conflict for now

* Update modeling_auto.py

got confused in `timm_wrapper`, which was giving some conflicts

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update tests/models/textnet/test_modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/textnet/modeling_textnet.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* add changes

* Update textnet.md

* add doc

* add authors hf ckpt + rename

* add feedback: classifier/docs

---------

Co-authored-by: raghavanone <opensourcemaniacfreak@gmail.com>
Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 09:52:51 +01:00
b05df6611e [docs] Remove sortish_sampler (#35539)
remove
2025-01-07 12:06:19 -08:00
a7d1441d65 Correctly list the chat template file in the Tokenizer saved files list (#34974)
* Correctly list the chat template file in the saved files list

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add save file checking to test

* make fixup

* better filename handling

* make fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-07 19:11:02 +00:00
cdca3cf9e3 [Whisper] fix docstrings typo (#35338)
fix typo
2025-01-07 09:20:27 -08:00
7f7677307c [Qwen2Audio] handle input ids expansion during processing (#35534)
* add audio_token attribute to proc

* expand input_ids

* and legacy and expanded input_ids

* test update

* split lines

* add possibility not to provide eos and bos audio tokens

* raise errors

* test incorrect number of audio tokens

* add example

* fmt

* typo
2025-01-07 16:47:27 +01:00
628cd838a3 Release GPU memory after Optuna trial (#35440)
* Release GPU memory after trial

* Update to use release_memory from accelerate.utils.memory after suggestion
2025-01-07 16:26:28 +01:00
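
A minimal sketch of the pattern described above, using `release_memory` from `accelerate.utils.memory`; the Optuna objective shape and the `build_and_train` helper are illustrative assumptions:

```python
from accelerate.utils.memory import release_memory

def objective(trial):
    model, trainer = build_and_train(trial)        # hypothetical helper
    score = trainer.evaluate()["eval_loss"]
    # Drop the references and empty the CUDA cache before the next trial;
    # release_memory returns None placeholders for the passed objects.
    model, trainer = release_memory(model, trainer)
    return score
```
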
665a4942e4 Check whether rescale is requested before checking is_scaled_image (#35439) 2025-01-07 11:39:45 +00:00
f408d55448 Fix bug when requesting input normalization with EnCodec (#34756)
* EnCodec: unsqueeze padding mask

* add test for normalization
2025-01-07 11:50:02 +01:00
96bf3d6cc5 Add diffllama (#34083)
* first adding diffllama

* add Diff Attention and others, but still with errors

* complete making attention Diff-Attention

* fix some bugs which may be caused by transformer-cli while adding model

* fix a bug caused by forgetting KV cache...

* Update src/transformers/models/diffllama/modeling_diffllama.py

You don't need to divide by 2 if we use the same number of attention heads as Llama; instead you can just split in forward.

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new code is more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

new code is more meaningful than before

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix 2times divide by sqrt(self.head_dim)

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fix 2times divide by sqrt(self.head_dim)

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* Update src/transformers/models/diffllama/modeling_diffllama.py

fit to changing "num_heads // 2" place,
and more visible

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* I found the Attention was still mis-implemented from the paper as of e072544a3bfc69b8a903e062729f861108ffecd3.

* re-implemented

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* align with transformers code style

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix typo

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* adding groupnorm

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* change SdpaAttention to DiffSdpaAttention

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bug

* Update src/transformers/models/diffllama/modeling_diffllama.py

resolve "not same outputs" problem

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* fix bugs of places of "GroupNorm with scale" and etc

* Revert "fix bugs of places of "GroupNorm with scale" and etc"

This reverts commit 26307d92f6acd55e9fe89f2facff350f05760960.

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* simplify multiple of attention (matmul) operations into one by repeating value_states

Co-authored-by: Minho Ryu <ryumin93@gmail.com>

* remove missed type

* add diffllama model_doc

* apply make style/quality

* apply review comment about model

* apply review comment about test

* place diffllama alphabetically on the src/transformers/__init__.py

* fix forgot code

* Support parameters that are not initialized with standard deviation 0 in the conventional method

* add DiffLlamaConfig to CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPOINT_CHECK on utils/check_config_docstrings.py

* remove unused property of config

* add to supported model list

* add to spda supported model list

* fix copyright, remove pretraining_tensor_parallel, and modify for initialization test

* remove unused import and etc.

* empty commit

* empty commit

* empty commit

* apply modular transformers but with bugs

* revert prev commit

* create src/transformers/model/diffllama/modular_diffllama.py

* run utils/modular_model_converter.py

* empty commit

* leaner modular diffllama

* remove more and more in modular_diffllama.pt

* remove more and more in modular_diffllama.pt

* resolve missing docstring entries

* force reset

* convert modular

---------

Co-authored-by: Minho Ryu <ryumin93@gmail.com>
2025-01-07 11:34:56 +01:00
ed73ae210b NPU support SDPA (#35165)
Co-authored-by: root <weichunyude@163.com>
2025-01-07 11:30:05 +01:00
02ed609285 Replace tokenizer to processing_class in Seq2SeqTrainer (#35452) 2025-01-07 09:51:12 +00:00
9fd123ac31 ci: mark model_parallel tests as cuda specific (#35269)
The `parallelize()` API is deprecated in favor of accelerate's `device_map="auto"`
and is therefore not accepting new features. At the same time, the `parallelize()`
implementation is currently CUDA-specific. This commit marks the respective
CI tests with `@require_torch_gpu`.

Fixes: #35252

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2025-01-07 10:16:34 +01:00
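
For reference, the accelerate-backed replacement for `parallelize()` that the message points to; the checkpoint name is illustrative:

```python
from transformers import AutoModelForCausalLM

# device_map="auto" lets accelerate place model shards across available
# GPUs (with CPU/disk offload as a fallback), replacing parallelize().
model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")
```
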
bd442c6d3a Zamba new attention standard (#35375)
* updated zamba to new attention standard

* make fixup fixes
2025-01-07 10:08:45 +01:00
12ba96aa3c [Dinov2 with Registers] Some fixes (#35411)
* First draft

* Thanks claude

* Remove print statement

* Use torch_int

* Address comments

* Address comment
2025-01-06 21:10:59 +01:00
ca00950057 added logic for deleting adapters once loaded (#34650)
* added logic for deleting adapters once loaded

* updated to the latest version of transformers, merged utility function into the source

* updated with missing check

* added peft version check

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* changes according to reviewer

* added test for deleting adapter(s)

* styling changes

* styling changes in test

* removed redundant code

* formatted my contributions with ruff

* optimized error handling

* ruff formatted with correct config

* resolved formatting issues

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-01-06 18:36:40 +00:00
1650e0e514 Fixed typo in Llama configuration docstring (#35520)
Update configuration_llama.py

There is no `num_heads` parameter, only `num_attention_heads`
2025-01-06 09:54:08 -08:00
3b1be043cd 🌐 [i18n-KO] Remove duplicates in toctree (#35496)
fix(docs): remove duplicates in toctree
2025-01-06 09:14:22 -08:00
3951da1a6b [GGUF] Refactor and decouple gguf checkpoint loading logic (#34385)
* draft load_gguf refactor

* update

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove llama mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove qwen2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove unused function

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate stablelm mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate phi3 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate t5 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate bloom mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix bloom

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate starcoder2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate gpt2 mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mistral mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate nemotron mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mamba mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* deprecate mamba mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* code format

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix mamba

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix qwen2moe

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove qwen2moe mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* clean up

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove falcon 7b map

Signed-off-by: Isotr0py <2037008807@qq.com>

* remove all ggml tensors mapping

Signed-off-by: Isotr0py <2037008807@qq.com>

* add comments

Signed-off-by: Isotr0py <2037008807@qq.com>

* update messages

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix tensors in parsed parameters

Signed-off-by: Isotr0py <2037008807@qq.com>

* add gguf check

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-06 18:02:38 +01:00
86fa3cedad Bump jinja2 from 3.1.4 to 3.1.5 in /examples/research_projects/decision_transformer (#35408)
Bump jinja2 in /examples/research_projects/decision_transformer

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-06 16:58:29 +00:00
44a26c871c Update llm_optims docs for sdpa_kernel (#35481)
update: use sdpa_kernel
2025-01-06 08:54:31 -08:00
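
The rename tracked here is from the deprecated `torch.backends.cuda.sdp_kernel` context manager to `torch.nn.attention.sdpa_kernel`; a minimal sketch of current usage:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Toy q/k/v on GPU in half precision, as FlashAttention requires.
q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Restrict scaled_dot_product_attention to the FlashAttention backend
# inside this context (previously done via torch.backends.cuda.sdp_kernel).
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)
```
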
18e896bd8f 🌐 [i18n-KO] Translated altclip.md to Korean (#34594)
* docs: ko: model_doc/timesformer.md

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>

* Update docs/source/ko/model_doc/altclip.md

* add snippet

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
2025-01-06 08:45:26 -08:00
a821b9c7ab Add check for if num_items_in_batch is not None (#35102) 2025-01-06 10:11:21 -05:00
203e978826 Add position_ids in XLMRobertaXLForCausalLM.prepare_inputs_for_generation (#35044)
* fix

* fix

* cleanup

* style

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-06 16:10:21 +01:00
c451a72cd7 Add French translation of task_summary and tasks_explained (#33407)
* Add French translation of task_summary and tasks_explained

---------

Co-authored-by: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com>
2025-01-06 14:23:52 +01:00
9895f7df81 Idefics: fix docstring (#35079)
nit: fix docstring
2025-01-06 10:58:04 +01:00
32aa2db04a Fix Llava conversion for models that use safetensors to store weights (#35406)
* fix llava-med-v1.5-mistral-7b conversion

Signed-off-by: Isotr0py <2037008807@qq.com>

* add weights_only=True

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-06 09:59:38 +01:00
b2f2977533 Applies the rest of the init refactor except to modular files (#35238)
* [test_all] Applies the rest of the init refactor except to modular files

* Revert modular that doesn't work

* [test_all] TFGPT2Tokenizer
2025-01-05 18:30:08 +01:00
e5fd865eba Add Gemma2 GGUF support (#34002)
* initial setup for ggml.py

* initial setup of GGUFGemma2Converter class

* Add gemma2 model to gguf.md doc

* Partial work on GGUF_TENSOR_MAPPING

* initial setup of GGUF_TENSOR_MAPPING for Gemma2

* refactor: rename GemmaConvert class to GemmaConverter for naming consistency

* feat: complete gemma2 tensor mapping implementation

* feat: add initial implementation of GGUFGemmaConverter

* feat: complete GGUFGemmaConverter implementation

* feat: add test code for gemma2

* refactor: minor code cleanup

* refactor: minor code cleanup

* fix: resolve suggestions

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2025-01-03 14:50:07 +01:00
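
With this support merged, loading a Gemma2 GGUF checkpoint should follow the same `gguf_file` pattern as other GGUF-enabled architectures; repo and file names below are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "some-user/gemma-2-2b-GGUF"       # illustrative repo
filename = "gemma-2-2b.Q4_K_M.gguf"      # illustrative quantized file

# gguf_file tells transformers to dequantize the GGUF checkpoint into a
# regular PyTorch model using the new Gemma2 tensor mapping.
tokenizer = AutoTokenizer.from_pretrained(repo, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(repo, gguf_file=filename)
```
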
1fe2d53d4e Reuse "if not" logic in image_processing. (#35405) 2025-01-03 14:44:57 +01:00
30a9971632 Use sdpa_kernel in tests (#35472)
* update: use sdpa_kernel

* update: rerun test
2025-01-03 14:39:52 +01:00
cba49cb2a6 Change is_soundfile_availble to is_soundfile_available (#35030) 2025-01-03 14:37:42 +01:00
42865860ec Fix paligemma warning message (#35486)
fix log input
2025-01-02 11:36:53 +01:00
b2b04e86e7 Fix docs typos. (#35465)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-01-02 11:29:46 +01:00
6b1e86fd4d Fix new BNB test failures (#35345) 2025-01-02 11:24:52 +01:00
5b516b06c8 Reintroduce Python 3.9 support for ModernBERT (#35458)
Co-authored-by: Koichi Yasuoka <yasuoka@kanji.zinbun.kyoto-u.ac.jp>
2025-01-02 11:23:07 +01:00
919220dab1 Update translated docs for sdpa_kernel (#35461)
* docs: update sdpa_kernel for translation

* fix: nn.attention

* update: infer many
2024-12-31 08:37:58 -08:00
eb2b452432 [i18n-ar] Translated file: docs/source/ar/tasks/summarization.md into Arabic (#35195)
* إضافة الترجمة العربية: summarization.md

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/summarization.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-31 08:35:54 -08:00
d5aebc6465 [i18n-ar] Translated file: docs/source/ar/tasks/question_answering.md into Arabic (#35196)
* إضافة الترجمة العربية: question_answering.md

* Update question_answering.md

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tasks/question_answering.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-30 11:56:05 -08:00
b5f97977ed Update docs for sdpa_kernel (#35410)
update: sdp_kernel -> sdpa_kernel
2024-12-30 09:50:34 -08:00
5cabc75b4b Add compute_loss_func to Seq2SeqTrainer (#35136) 2024-12-29 15:01:35 +01:00
90f256c90c Update perf_infer_gpu_one.md: fix a typo (#35441) 2024-12-29 14:57:08 +01:00
5c75087aee Fix model_accepts_loss_kwargs for timm model (#35257)
* Fix for timm model

* Add comment
2024-12-27 16:33:44 +00:00
3b0a94ef9e Fix f-string to show ACCELERATE_MIN_VERSION on error (#35189)
fix f-string to show ACCELERATE_MIN_VERSION on error

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-27 13:21:44 +01:00
f63da20a9f CLIP conversion script - Change fairseq to OpenAI (#35384)
Change fairseq to OpenAI
2024-12-27 13:12:32 +01:00
7f97d01675 Fix: Rename keyword argument in_channels to num_channels (#35289)
Fix: Rename keyword argument in_channels to num_channels in some default backbone configs
2024-12-27 13:07:31 +01:00
4eb17b26e7 Drop inplace operation for loss computation with gradient accumulation (#35416)
Fix inplace loss computation
2024-12-26 14:58:53 +01:00
24c91f095f [GPTQ, CompressedTensors] Fix unsafe imports and metadata check (#34815)
* fix gptq creation when optimum is not installed + fix metadata checking

* fix compressed tensors as well

* style

* pray for ci luck on flaky tests :prayge:

* trigger ci

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-24 19:32:44 +01:00
6e0515e99c Add DINOv2 with registers (#35348)
* added changes from 32905

* fixed mistakes caused by select all paste

* rename diff_dinov2...

* ran tests

* Fix modular

* Fix tests

* Use new init

* Simplify drop path

* Convert all checkpoints

* Add figure and summary

* Update paths

* Update docs

* Update docs

* Update toctree

* Update docs

---------

Co-authored-by: BernardZach <bernardzach00@gmail.com>
Co-authored-by: Zach Bernard <132859071+BernardZach@users.noreply.github.com>
2024-12-24 13:21:59 +01:00
d8c1db2f56 enable non-cuda awq model support without modify version (#35334)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2024-12-24 12:36:00 +01:00
ccc4a5a59b Disable .github/workflows/self-comment-ci.yml for now (#35366)
* disable

* disable

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-24 10:53:57 +01:00
93aafdc620 Add compile test for fast image processor (#35184)
* add compile test for fast image processor

* override pixtral test
2024-12-23 13:12:45 -05:00
82fcac0a7e Adding logger.info about update_torch_dtype in some quantizers (#35046)
adding logger.info
2024-12-23 17:01:00 +01:00
a1780b7ba5 bugfix Idefics3 processor - handle cases with text and no images gracefully (#35363)
* bugfix processing empty images

* fix

* fix

* Update src/transformers/models/idefics3/processing_idefics3.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* adding tests

* fix

* fix

* fix

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-23 16:59:01 +01:00
64c05eecd6 HIGGS Quantization Support (#34997)
* higgs init

* working with crunches

* per-model workspaces

* style

* style 2

* tests and style

* higgs tests passing

* protecting torch import

* removed torch.Tensor type annotations

* torch.nn.Module inheritance fix maybe

* hide inputs inside quantizer calls

* style structure something

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* reworked num_sms

* Update src/transformers/integrations/higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* revamped device checks

* docstring upd

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* edited tests and device map assertions

* minor edits

* updated flute cuda version in docker

* Added p=1 and 2,3bit HIGGS

* flute version check update

* incorporated `modules_to_not_convert`

* less hardcoding

* Fixed comment

* Added docs

* Fixed gemma support

* example in docs

* fixed torch_dtype for HIGGS

* Update docs/source/en/quantization/higgs.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Collection link

* dequantize interface

* newer flute version, torch.compile support

* unittest message fix

* docs update compile

* isort

* ValueError instead of assert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-23 16:54:49 +01:00
ef1f54a0a7 add bnb support for Ascend NPU (#31512)
* add bnb support for Ascend NPU

* delete comment
2024-12-23 16:36:16 +01:00
59178780a6 Fix : VPTQ test (#35394)
fix_test
2024-12-23 16:27:46 +01:00
3a4ced9ab4 Fix typing in docstring for PaliGemmaProcessor (#35278)
Updated typing for `tokenizer` in the `PaliGemmaProcessor` to be `GemmaTokenizerFast` instead of `LlamaTokenizerFast`
2024-12-23 16:22:04 +01:00
3cd3cd50ac Scale loss before backward (#35207) 2024-12-23 16:16:38 +01:00
f5264a86ee Deprecate _is_quantized_training_enabled (#34991)
deprecate

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-23 15:51:31 +01:00
e10be82b71 uniformize kwargs for SAM (#34578)
* Make kwargs uniform for SAM

* Remove unused attribute

* Make point_pad_value part of image_kwargs

* Update annotations

* Code review - use existing methods

* Use ProcessorTesterMixin

* Do not add ProcessorTesterMixin everywhere
2024-12-23 13:54:57 +01:00
2bb60982ac Patch GPTNeoX to use adequate FA2 if position_ids is provided (#35318) 2024-12-23 13:45:55 +01:00
5e7aedebeb make LlamaModel._update_causal_mask torch compilable (#35187)
* make LlamaModel._update_causal_mask torch compilable

* chore: lint (make fix-copies)

* fix-copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-12-23 13:10:00 +01:00
401aa39d7b bitsandbytes: simplify 8bit dequantization (#35068) 2024-12-23 13:04:59 +01:00
05260a1fc1 Fix new FA2 if is_causal is passed explicitly (#35390)
* fix

* Update modeling_decision_transformer.py

* Update flash_attention.py
2024-12-22 20:00:07 +01:00
8f38f58f3d owlvit/2 dynamic input resolution (#34764)
* owlvit/2 dynamic input resolution.

* adapt box grid to patch_dim_h patch_dim_w

* fix ci

* clarify variable naming

* clarify variable naming..

* compute box_bias dynamically inside box_predictor

* change style part of code

* [run-slow] owlvit, owlv2
2024-12-21 08:51:09 +00:00
608e163b52 [docs] Follow up register_pipeline (#35310)
example json
2024-12-20 09:22:44 -08:00
UV
94fe0b915b Improved Documentation Of Audio Classification (#35368)
* Improved Documentation Of Audio Classification

* Updated documentation as per review

* Updated audio_classification.md

* Update audio_classification.md
2024-12-20 09:17:28 -08:00
c96cc039c3 Improve modular transformers documentation (#35322)
* Improve modular transformers documentation

- Adds hints to general contribution guides
- Lists which utils scripts are available to generate single-files from modular files and check their content

* Show commands in copyable code cells

---------

Co-authored-by: Joel Koch <joel@bitcrowd.net>
2024-12-20 09:16:02 -08:00
504c4d3692 Make test_generate_with_static_cache even less flaky (#34995)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 16:03:26 +01:00
0fc2970363 Use weights_only=True with torch.load for transfo_xl (#35241)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 15:40:55 +01:00
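
The safer invocation that this fix switches to, for reference:

```python
import torch

# weights_only=True restricts unpickling to plain tensors and containers,
# refusing arbitrary Python objects embedded in the checkpoint.
state_dict = torch.load("checkpoint.pt", weights_only=True)
```
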
6fae2a84ae Update test fetcher when we want to test all (#35364)
* [test-all]

* style

* [test-all]

* [test_all]

* [test_all]

* style
2024-12-20 15:10:43 +01:00
34ad1bd287 update codecarbon (#35243)
* update codecarbon

* replace directly-specified-test-dirs with tmp_dir

* Revert "replace directly-specified-test-dirs with tmp_dir"

This reverts commit 310a6d962ec83db3f6d4f96daeeba5c6746f736c.

* revert the change of .gitignore

* Update .gitignore

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2024-12-20 15:04:36 +01:00
40292aa4e9 bugfix: torch.export failure caused by _make_causal_mask (#35291)
* bugfix: torch.export failure caused by `_make_causal_mask`

Recent changes in torch dynamo prevent mutations on tensors converted with aten::_to_copy. To address this, we clone such tensors before performing the in-place operation `masked_fill_`, but only when the code is being compiled by torch dynamo.
(relevant issue: https://github.com/pytorch/pytorch/issues/127571)

* chore: use `is_torchdynamo_compiling` instead of `torch._dynamo.is_compiling`
2024-12-20 14:37:04 +01:00
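
A hedged sketch of the workaround described above; the mask construction is illustrative and not the actual `_make_causal_mask` code:

```python
import torch
from transformers.utils import is_torchdynamo_compiling

def fill_mask(mask: torch.Tensor, min_value: float) -> torch.Tensor:
    if is_torchdynamo_compiling():
        # Under dynamo/torch.export, tensors produced by aten::_to_copy
        # cannot be mutated in place, so clone first.
        mask = mask.clone()
    mask.masked_fill_(mask.bool(), min_value)  # in-place fill, illustrative
    return mask
```
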
05de764e9c Aurevoir PyTorch 1 (#35358)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 14:36:31 +01:00
4567ee8057 fix zoedepth initialization error under deepspeed zero3 (#35011)
fix zoe bug in deepspeed zero3
2024-12-20 11:42:40 +00:00
c3a43594b7 Add Tensor Parallel support for Qwen2VL (#35050)
feat: add parallel support for qwen2vl
2024-12-20 12:40:38 +01:00
0d51d65905 Cleaner attention interfaces (#35342)
* cleaner attention interfaces

* correctly set the _attn_implementation when adding other functions to it

* update

* Update modeling_utils.py

* CIs
2024-12-20 12:09:34 +01:00
eafbb0eca7 Implement AsyncTextIteratorStreamer for asynchronous streaming (#34931)
* Add AsyncTextIteratorStreamer class

* export AsyncTextIteratorStreamer

* export AsyncTextIteratorStreamer

* improve docs

* missing import

* missing import

* doc example fix

* doc example output fix

* add pytest-asyncio

* first attempt at tests

* missing import

* add pytest-asyncio

* fallback to wait_for and raise TimeoutError on timeout

* check for TimeoutError

* autodoc

* reorder imports

* fix style

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-20 12:08:12 +01:00
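
A usage sketch, assuming the new class mirrors `TextIteratorStreamer` but is consumed with `async for`; the model name is illustrative:

```python
import asyncio
from threading import Thread

from transformers import AsyncTextIteratorStreamer, AutoModelForCausalLM, AutoTokenizer

async def main():
    tokenizer = AutoTokenizer.from_pretrained("gpt2")   # illustrative model
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    streamer = AsyncTextIteratorStreamer(tokenizer, skip_special_tokens=True)

    inputs = tokenizer("Once upon a time", return_tensors="pt")
    # generate() blocks, so run it in a thread while we consume tokens here.
    Thread(
        target=model.generate,
        kwargs={**inputs, "streamer": streamer, "max_new_tokens": 20},
    ).start()

    async for text in streamer:  # chunks arrive as they are generated
        print(text, end="", flush=True)

asyncio.run(main())
```
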
b5a557e5fe Reduce CircleCI usage (#35355)
* reduce 1

* reduce 1

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-20 10:18:15 +01:00
4e27a4009d FEAT : Adding VPTQ quantization method to HFQuantizer (#34770)
* init vptq

* add integration

* add vptq support

fix readme

* add tests && format

* format

* address comments

* format

* format

* address comments

* format

* address comments

* remove debug code

* Revert "remove debug code"

This reverts commit ed3b3eaaba82caf58cb3aa6e865d98e49650cf66.

* fix test

---------

Co-authored-by: Yang Wang <wyatuestc@gmail.com>
2024-12-20 09:45:53 +01:00
5a2aedca1e [Mamba2] Fix caching, slow path, and multi-gpu (#35154)
* fixup mamba2 - caching and several other small fixes

* fixup cached forward

* correct fix this time

* fixup cache - we do not need to extend the attn mask it's handled by generate (gives total ids + mask at each step)

* remove unnecessary (un)squeeze

* fixup cache position

* simplify a few things

* [run-slow] mamba2

* multi gpu attempt two

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* [run-slow] mamba2

* add newer slow path fix

* [run-slow] mamba2
2024-12-20 09:27:47 +01:00
ff9141bb85 fix onnx export of speech foundation models (#34224)
* added expanded attention/padding masks prior to indexing the hidden_states

* consistency fix in WavLMForSequenceClassification

---------

Co-authored-by: Nikos Antoniou <nikosantoniou@Nikos-MacBook-Pro.local>
2024-12-20 09:22:05 +01:00
f42084e641 [docs] Add link to ModernBERT Text Classification GLUE finetuning script (#35347)
Add link to ModernBERT Text Classification GLUE finetuning script
2024-12-19 14:45:52 -08:00
0ade1caa35 Modernbert Release Fixes (#35344)
* fix ForSequenceClassification

* unmodularize rope layer

* fix linting warning

* Avoid complex PoolingHead, only one prediction head needed

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2024-12-19 17:22:37 +01:00
1fa807fa63 Fix some fa2 tests (#35340)
* remove fa2 test

* remove other failing tests

* style
2024-12-19 17:05:25 +01:00
667ed5635e Add ModernBERT to Transformers (#35158)
* initial cut of modernbert for transformers

* small bug fixes

* fixes

* Update import

* Use compiled mlp->mlp_norm to match research implementation

* Propagate changes in modular to modeling

* Replace duplicate attn_out_dropout in favor of attention_dropout

cc @warner-benjamin let me know if the two should remain separate!

* Update BOS to CLS and EOS to SEP

Please confirm @warner-benjamin

* Set default classifier bias to False, matching research repo

* Update tie_word_embeddings description

* Fix _init_weights for ForMaskedLM

* Match base_model_prefix

* Add compiled_head to match research repo outputs

* Fix imports for ModernBertForMaskedLM

* Just use "gelu" default outright for classifier

* Fix config name typo: initalizer -> initializer

* Remove some unused parameters in docstring. Still lots to edit there!

* Compile the embeddings forward

Not having this resulted in very slight differences - so small it wasn't even noticed for the base model, only for the large model.

But the tiny difference for the large model propagated from the embedding layer through the rest of the model, leading to notable differences of ~0.0084 average per value, up to 0.2343 for the worst case.

* Add drafts for ForSequenceClassification/ForTokenClassification

* Add initial SDPA support (not exactly equivalent to FA2 yet!)

During testing, FA2 and SDPA still differ by about 0.0098 per value in the token embeddings. It still predicts the correct mask fills, but I'd like to get it fully 1-1 if possible.

* Only use attention dropout if training

* Add initial eager attention support (also not equivalent to FA2 yet!)

Frustratingly, I also can't get eager to be equivalent to FA2 (or sdpa), but it does get really close, i.e. avg ~0.010 difference per value.

Especially if I use fp32 for both FA2&eager, avg ~0.0029 difference per value

The fill-mask results are good with eager.

* Add initial tests, output_attentions, output_hidden_states, prune_heads

Tests are based on BERT, not all tests pass yet: 23 failed, 79 passed, 100 skipped

* Remove kwargs from ModernBertForMaskedLM

Disable sparse_prediction by default to match the normal HF, can be enabled via config

* Remove/adjust/skip improper tests; warn if padding but no attn mask

* Run formatting etc.

* Run python utils/custom_init_isort.py

* FlexAttention with unpadded sequences(matches FA2 within bf16 numerics)

* Reformat init_weights based on review

* self -> module in attention forwards

* Remove if config.tie_word_embeddings

* Reformat output projection on a different line

* Remove pruning

* Remove assert

* Call contiguous() to simplify paths

* Remove prune_qkv_linear_layer

* Format code

* Keep as kwargs, only use if needed

* Remove unused codepaths & related config options

* Remove 3d attn_mask test; fix token classification tuple output

* Reorder: attention_mask above position_ids, fixes gradient checkpointing

* Fix usage if no FA2 or torch v2.5+

* Make torch.compile/triton optional

Should we rename 'compile'? It's a bit vague

* Separate pooling options into separate functions (cls, mean) - cls as default

* Simplify _pad_modernbert_output, remove unused labels path

* Update tied weights to remove decoder.weight, simplify decoder loading

* Adaptively set config.compile based on hf_device_map/device/resize, etc.

* Update ModernBertConfig docstring

* Satisfy some consistency checks, add unfinished docs

* Only set compile to False if there's more than 1 device

* Add docstrings for public ModernBert classes

* Dont replace docstring returns - ends up being duplicate

* Fix mistake in toctree

* Reformat toctree

* Patched FlexAttention, SDPA, Eager with Local Attention

* Implement FA2 -> SDPA -> Eager attn_impl defaulting, crucial

both to match the original performance, and to get the highest inference speed without requiring users to manually pick FA2

* Patch test edge case with Idefics3 not working with 'attn_implementation="sdpa"'

* Repad all_hidden_states as well

* rename config.compile to reference_compile

* disable flex_attention since it crashes

* Update modernbert.md

* Using dtype min to mask in eager

* Fully remove flex attention for now

It's only compatible with the nightly torch 2.6, so we'll leave it be for now. It's also slower than eager/sdpa.

Also, update compile -> reference_compile in one more case

* Call contiguous to allow for .view()

* Copyright 2020 -> 2024

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update/simplify __init__ structure

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove "... if dropout_prob > 0 else identity"

As dropout with 0.0 should be efficient like identity

* re-use existing pad/unpad functions instead of creating new ones

* remove flexattention method

* Compute attention_mask and local_attention_mask once in modeling

* Simplify sequence classification prediction heads, only CLS now

Users can make custom heads if they feel like it

Also removes the unnecessary pool parameter

* Simplify module.training in eager attn

* Also export ModernBertPreTrainedModel

* Update the documentation with links to finetuning scripts

* Explain local_attention_mask parameter in docstring

* Simplify _autoset_attn_implementation, rely on super()

* Keep "in" to initialize Prediction head

Doublechecked with Benjamin that it's correct/what we used for pretraining

* add back mean pooling

* Use the pooling head in TokenClassification

* update copyright

* Reset config._attn_implementation_internal on failure

* Allow optional attention_mask in ForMaskedLM head

* fix failing run_slow tests

* Add links to the paper

* Remove unpad_no_grad, always pad/unpad without gradients

* local_attention_mask -> sliding_window_mask

* Revert "Use the pooling head in TokenClassification"

This reverts commit 99c38badd1dbce01d7aef41095fbf2f5cce87279.

There was no real motivation, no info on whether having this bigger head does anything useful.

* Simplify pooling, 2 options via if-else

---------

Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Said Taghadouini <taghadouinisaid@gmail.com>
Co-authored-by: Benjamin Clavié <ben@clavie.eu>
Co-authored-by: Antoine Chaffin <ant54600@hotmail.fr>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-19 14:03:35 +01:00
56ff1e92fd PaliGemma: Make sure to add <eos> to suffix if <image> is present in text (#35201)
Move suffix processing code to out of if statement
2024-12-19 09:53:48 +01:00
4592cc9e98 Update comment CI bot (#35323)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-19 09:45:27 +01:00
d19b11f59b Fix documentation for ColPali (#35321)
* docs: fix typo quickstart snippet in ColPali's model card

* docs: clean the ColPali's model card

* docs: make the `ColPaliForRetrieval`'s docstring more concise

* docs: add missing bash command used to convert weights for `vidore/colpali-v1.3-hf`
2024-12-19 09:08:28 +01:00
9613933b02 Add the Bamba Model (#34982)
* initial commit for PR

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>

* rename dynamic cache

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add more unit tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Add modular bamba file

* Remove trainer changes from unrelated PR

* Modify modular and cofig to get model running

* Fix some CI errors and beam search

* Fix a plethora of bugs from CI/docs/etc

* Add bamba to models with special caches

* Updat to newer mamba PR for mamba sublayer

* fix test_left_padding_compatibility

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix remaining tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* missed this test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* ran make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* move slow tag to integration obj

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* address comments

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* left out one part of modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* change model

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Make Rotary modular as well

* Update bamba.md

Added overview, update Model inference card and added config

* Update bamba.md

* Update bamba.md

* Update bamba.md

Minor fixes

* Add docs for config and model back

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Add warning when using fast kernels

* replaced generate example

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Address comments from PR

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Propagate attention fixes

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix attention interfaces to the new API

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix API for decoder layer

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Remove extra weights

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

---------

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
2024-12-18 20:18:17 +01:00
9a94dfe123 feat: add benchmarks_entrypoint.py (#34495)
* feat: add `benchmarks_entrypoint.py`

Adding `benchmarks_entrypoint.py` file, which will be run from the
benchmarks CI.

This python script will list all python files from the `benchmark/`
folder and run the included `run_benchmark` function, allowing people to
add new benchmark scripts.

* feat: add `MetricsRecorder`

* feat: update dashboard

* fix: add missing arguments to `MetricsRecorder`

* feat: update dash & add datasource + `default.yml`

* fix: move responsibility to create `MetricsRecorder` in bench script

* fix: update incorrect datasource UID

* fix: incorrect variable values

* debug: benchmark entrypoint script

* refactor: update log level

* fix: update broken import

* feat: add debug log in `MetricsRecorder`

* debug: set log level to debug

* fix: set connection `autocommit` to `True`
2024-12-18 18:59:07 +01:00
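
Based on the description above, the entrypoint presumably discovers and runs benchmarks along these lines; everything beyond the `benchmark/` folder and the `run_benchmark` name is an assumption:

```python
import importlib.util
from pathlib import Path

def run_all_benchmarks(logger, branch, commit_id, commit_msg):
    # Import every python file in benchmark/ and call its run_benchmark.
    for path in sorted(Path("benchmark").glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if hasattr(module, "run_benchmark"):
            module.run_benchmark(logger, branch, commit_id, commit_msg)
```
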
2c47618c1a 🚨All attention refactor🚨 (#35235)
* refactor LlamaAttention

* minimal changes

* fix llama

* update

* modular gemmas

* modular nits

* modular updates

* nits

* simplify

* gpt2

* more modular and fixes

* granite

* modular modular modular

* nits

* update

* qwen2 + starcoder2

* mostly gemma2

* Update image_processing_auto.py

* fix

* Update modular_starcoder2.py

* fix

* remove all copied from attentions

* remove gcv

* make fix-copies

* oups

* oups2.0

* fix some modulars + all copied from

* should be good now

* revert unwanted changes

* Update modeling_decision_transformer.py

* finish cleanup

* Update modeling_olmo.py

* consistency

* re-add gradient checkpointing attribute

* fix

* style

* make config necessary

* bis

* bis

* Update modeling_my_new_model2.py

* is_causal attr

* fix

* remove past kv return from decoder layer

* fix

* default rope config

* correctly fix rope config

* fix bias

* fix gpt2 attention output

* fix test

* fix inits

* fix default sdpa

* fix default sdpa implementation

* harmonize classes

* fix mistral

* fix sliding window models

* mixtral

* be more explicit

* style

* fix

* several fixes

* Update modeling_dbrx.py

* fix test

* olmo + phi

* rotary

* style

* phi

* phi again

* again

* kwargs

* Update test_modeling_common.py

* skip fx tracing tests

* Update modeling_utils.py

* gemma 2

* again

* Update modeling_recurrent_gemma.py

* gemma2

* granite

* style

* starcoder

* Update sdpa_attention.py

* switch args

* Update modeling_mllama.py

* fix

* cache type tests

* gpt2

* Update test_modeling_common.py

* fix

* consistency

* fix shape with encoder

* should be the last one

* tests non model

* most comments

* small oupsi

* be more explicit in modulars

* more explicit modulars

* CIs! it works locally

* add kwargs to _flash_attention_forward

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00
75be5a0a5b [Whisper] fix docstrings typo (#35319)
typos docstring
2024-12-18 16:38:19 +01:00
69e31eb1bf change bnb tests (#34713)
* fix training tests

* fix xpu check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* rm pdb

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix 4bit logits check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix 4bit logits check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add xpu check on int8 training

* fix training tests

* add llama test on bnb

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* only cpu and xpu disable autocast training

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
2024-12-18 09:49:59 -05:00
da334bcfa8 [Whisper] 🚨 Fix whisper decoding 🚨 (#34135)
* do not remove decoder_input_ids for the first segment

* do not remove eos token in generate_with_fallback

* when removing padding tokens, do not remove eos token

* remove eos token in generate (and not in generate_with_fallback!)

* reconciliate short-from/ long-form behavior

* correct avg_logprobs calculation

* handle eos token in segments

* handle decoder_input_ids and eos token in _prepare_decoder_input_ids

* fix incorrect time precision

* always remove eos token

* always remove decoder_input_ids

* no need to handle decoder_inputs_ids and eos token

* no need to remove decoder_input_ids

* no need to handle eos token

* fix num_beams in _retrieve_logit_processors

* remove todo inconsistency

* no need to add eos token

* last_timestamp_pos should indeed be timestamp token pos

* patch generate to enable compatibility with GenerationTesterMixin tests

* adapt test_generate_continue_from_past_key_values

* adapt test_prompt_lookup_decoding_matches_greedy_search

* adapt generic GenerationMixin tests to whisper's generate

* fix speculative decoding

* fix

* [run-slow] whisper

* change HF_HUB_TOKEN for require_read_token

* [run-slow] whisper

* prioritize kwargs over generation_config

* remove unnecessary args

* [run-slow] whisper

* update tests

* [run-slow] whisper

* add comment

* update test

* [run-slow] whisper

* update test + revert require_read_token

* docstring updates

* revert tokenizer decode args change

* do not use a patch + docstring updates

* [run-slow] whisper

* make

* [run-slow] whisper

* add a flag to force unique call to generate

* test update

* [run-slow] whisper

* add force_unique_generate_call arg

* do not use a patch

* correct the timestamps for the pad tokens

* docstring update

* docstring update

* docstring update

* update TF tests

* add require_read_token

* [run-slow] whisper

* test reset dynamo

* [run-slow] whisper

* fix

* [run-slow] whisper

* avoid iterating twice on current_segments

* [run-slow] whisper

* [run-slow] whisper

---------

Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 14:13:21 +01:00
f1b7634fc8 Trigger GitHub CI with a comment on PR (#35211)
* fix

* fix

* comment

* final

* final

* final

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 13:56:49 +01:00
c7e48053aa [tests] make cuda-only tests device-agnostic (#35222)
fix cuda-only tests
2024-12-18 10:14:22 +01:00
1eee1cedfd Fix loading with only state dict and low_cpu_mem_usage = True (#35217)
* fix loading with only state dict and config

* style

* add tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
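
A minimal sketch of the loading pattern this commit repairs; the checkpoint file name is hypothetical, while `state_dict` and `low_cpu_mem_usage` are documented `from_pretrained` kwargs:

```py
import torch
from transformers import BertConfig, BertModel

config = BertConfig.from_pretrained("bert-base-uncased")
state_dict = torch.load("pytorch_model.bin", map_location="cpu")  # hypothetical local file

# Loading from an in-memory state dict (no checkpoint directory) together
# with low_cpu_mem_usage=True is the combination that previously failed.
model = BertModel.from_pretrained(
    None, config=config, state_dict=state_dict, low_cpu_mem_usage=True
)
```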
2024-12-18 09:54:32 +01:00
0531d7513b [docs] Improve register_pipeline (#35300)
register_pipeline
2024-12-17 10:27:23 -08:00
77080f023f Fixed typo in audio_classification.md (#35305) 2024-12-17 09:45:51 -08:00
8bfd7eeeef Add Cohere2 docs details (#35294)
* Add Cohere2 docs details

* Update docs/source/en/model_doc/cohere2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-17 09:36:31 -08:00
a7feae190f Fix remove unused parameter in docs (#35306)
remove unused parameter in example

Co-authored-by: zzzzzsa <zzzzzsaqwq@gmail.com>
2024-12-17 09:34:41 -08:00
927c3e39ec Fix image preview in multi-GPU inference docs (#35303)
fix: link for img
2024-12-17 09:33:50 -08:00
4302b27719 Fix typos in translated quicktour docs (#35302)
* fix: quicktour typos

* fix: one more
2024-12-17 09:32:00 -08:00
deac971c46 🚨🚨🚨 Limit backtracking in Nougat regexp (#35264)
* Limit backtracking in regexp

* Update

* [run-slow] nougat

* Update
2024-12-17 16:34:18 +00:00
d29a06e39a remove benchmark job in push-important-models.yml (#35292)
remove-bench

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-17 17:27:26 +01:00
e0ae9b5974 🚨🚨🚨 Delete conversion scripts when making release wheels (#35296)
* Delete conversion scripts when making release wheels

* make fixup

* Update docstring
2024-12-17 14:18:42 +00:00
6eb00dd2f0 Support for SDPA for SAM models (#34110)
* feat: add support for sdpa and gradient checkpointing

* fix: ruff format

* fix: config sdpa

* fix: sdpa layer naming convention

* fix: update test_eager_matches_sdpa_inference to handle vision_hidden_states

* test: skip incompatible tests and fix loading issue with sdpa

- Updated tests to skip flash and dynamic compile cases.
- Minor adjustment to ensure correct loading of model with sdpa for dispatch test.

* style: apply Ruff formatting

* ruff fix again after rebase

* [run-slow] sam

* [run-slow] sam

* refactor: Address review comments and improve sub-config handling in SAM model tests

- Added attributes for sub_configs as per PR #34410.
- Enabled tests for configs, ensuring the composite model (SAM) has several sub-configs in the main config.
- Added class attribute _is_composite=True to the tester class
- test_sdpa_can_dispatch_composite_models added

* [run-slow] sam

* style: ruff

* [run-slow] sam

* style: ruff again ...

* [run-slow] sam
2024-12-17 14:46:05 +01:00
747f361da1 Add sdpa for Beit (#34941)
* Add sdpa for Beit

* Updates

* [run-slow] beit

* Update inference benchmarks

* Update

* Fix - add missed to super().forward()

* Updates

* Fix missing import
2024-12-17 14:44:47 +01:00
6c08b3b6e5 Add Falcon3 documentation (#35307)
* Add Falcon3 documentation

* Update Falcon3 documentation

* Change Falcon to Falcon3

* Update docs and run make fix-copies

* Add blog post and huggingface models links
2024-12-17 14:23:13 +01:00
f33a0cebb3 Add ColPali to 🤗 transformers (#33736)
* feat: run `add-new-model-like`

* feat: add paligemma code with "copied from"

* feat: add ColPaliProcessor

* feat: add ColPaliModel

* feat: add ColPaliConfig

* feat: rename `ColPaliForConditionalGeneration` to `ColPaliModel`

* fixup modeling colpali

* fix: fix root import shortcuts

* fix: fix `modeling_auto` dict

* feat: comment out ColPali test file

* fix: fix typos from `add-new-model-like`

* feat: explicit the forward input args

* feat: move everything to `modular_colpali.py`

* fix: put back ColPaliProcessor

* feat: add auto-generated files

* fix: run `fix-copies`

* fix: remove DOCSTRING constants to make modular converter work

* fix: fix typo + modular converter

* fix: add missing imports

* feat: no more errors when loading ColPaliModel

* fix: remove unused args in forward + tweak doc

* feat: rename `ColPaliModel` to `ColPaliForRetrieval`

* fix: apply `fix-copies`

* feat: add ColPaliProcessor to `modular_colpali`

* fix: run make quality + make style

* fix: remove duplicate line in configuration_auto

* feat: make ColPaliModel inherit from PaliGemmaForConditionalGeneration

* fix: tweak and use ColPaliConfig

* feat: rename `score` to `post_process_retrieval`

* build: run modular formatter + make style

* feat: convert colpali weights + fixes

* feat: remove old weight converter file

* feat: add and validate tests

* feat: replace hardcoded path to "vidore/colpali-v1.2-hf" in tests

* fix: add bfloat16 conversion in weight converter

* feat: replace pytest with unittest in modeling colpali test

* feat: add sanity check for weight conversion (doesn't work yet)

* feat: add shape sanity check in weight converter

* feat: make ColPaliProcessor args explicit

* doc: add doc for ColPali

* fix: trying to fix output mismatch

* feat: tweaks

* fix: ColPaliModelOutput inherits from ModelOutput instead of PaliGemmaCausalLMOutputWithPast

* fix: address comments on PR

* fix: adapt tests to the Hf norm

* wip: try things

* feat: add `__call__` method to `ColPaliProcessor`

* feat: remove need for dummy image in `process_queries`

* build: run new modular converter

* fix: fix incorrect method override

* Fix tests, processing, modular, convert

* fix tokenization auto

* hotfix: manually fix processor -> fixme once convert modular is fixed

* fix: convert weights working

* feat: rename and improve convert weight script

* feat: tweaks

* feat: remove `device` input for `post_process_retrieval`

* refactor: remove unused `get_torch_device`

* Fix all tests

* docs: update ColPali model doc

* wip: fix convert weights to hf

* fix logging modular

* docs: add acknowledgements in model doc

* docs: add missing docstring to ColPaliProcessor

* docs: tweak

* docs: add doc for `ColPaliForRetrievalOutput.forward`

* feat: add modifications from colpali-engine v0.3.2 in ColPaliProcessor

* fix: fix and upload colpali hf weights

* refactor: rename `post_process_retrieval` to `score_retrieval`

* fix: fix wrong typing for `score_retrieval`

* test: add integration test for ColPali

* chore: rerun convert modular

* build: fix root imports

* Update docs/source/en/index.md

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* fix: address PR comments

* wip: reduce the prediction gap in weight conversion

* docs: add comment in weight conversion script

* docs: add example for `ColPaliForRetrieval.forward`

* tests: change dataset path to the new one in hf-internal

* fix: colpali weight conversion works

* test: add fine-grained check for ColPali integration test

* fix: fix typos in convert weight script

* docs: move input docstring in a variable

* fix: remove hardcoded torch device in test

* fix: run the new modular refactor

* docs: fix python example for ColPali

* feat: add option to choose `score_retrieval`'s output dtype and device

* docs: update doc for `score_retrieval`

* feat: add `patch_size` property in ColPali model

* chore: run `make fix-copies`

* docs: update description for ColPali cookbooks

* fix: remove `ignore_index` methods

* feat: remove non-transformers specific methods

* feat: update `__init__.py` to new hf format

* fix: fix root imports in transformers

* feat: remove ColPali's inheritance from PaliGemma

* Fix CI issues

* nit remove prints

* feat: remove ColPali config and model from `modular_colpali.py`

* feat: add `ColPaliPreTrainedModel` and update modeling and configuration code

* fix: fix auto-removed imports in root `__init__.py`

* fix: various fixes

* fix: fix `_init_weight`

* temp: comment `AutoModel.from_config` for experiments

* fix: add missing `output_attentions` arg in ColPali's forward

* fix: fix `resize_token_embeddings`

* fix: make `input_ids` optional in forward

* feat: rename `projection_layer` to `embedding_proj_layer`

* wip: fix convert colpali weight script

* fix tests and convert weights from original repo

* fix unprotected import

* fix unprotected torch import

* fix style

* change vlm_backbone_config to vlm_config

* fix unprotected import in modular this time

* fix: load config from Hub + tweaks in convert weight script

* docs: move example usage from model docstring to model markdown

* docs: fix input docstring for ColPali's forward method

* fix: use `sub_configs` for ColPaliConfig

* fix: remove non-needed sanity checks in weight conversion script + tweaks

* fix: fix issue with `replace_return_docstrings` in ColPali's `forward`

* docs: update docstring for `ColPaliConfig`

* test: change model path in ColPali test

* fix: fix ColPaliConfig

* fix: fix weight conversion script

* test: fix expected weights for ColPali model

* docs: update ColPali markdown

* docs: fix minor typo in ColPaliProcessor

* Fix tests and add _no_split_modules

* add text_config to colpali config

* [run slow] colpali

* move inputs to torch_device in integration test

* skip test_model_parallelism

* docs: clarify quickstart snippet in ColPali's model card

* docs: update ColPali's model card

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-17 11:26:43 +01:00
a7f5479b45 fix modular order (#35297)
* fix modular order

* fix

* style
2024-12-17 08:05:35 +01:00
f5620a7634 Improved documentation of Automatic speech recognition (#35268)
Improved documentation quality of Automatic speech recognition
2024-12-16 09:50:11 -08:00
eb92bc44b7 Fix wrongs in quicktour[zh] (#35272)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-16 09:23:34 -08:00
886f690e76 Translating "translate perf_infer_gpu_multi.md" to Chinese (#35271)
add "translate perf_infer_gpu_multi"
2024-12-16 09:22:35 -08:00
22834eeba1 Fix typos in Translated Audio Classification Docs (#35287)
* fix: qwen2 model ids

* fix: line

* fix: more format

* update: reformat

* fix: doc typos
2024-12-16 08:51:32 -08:00
9feae5fb01 [Whisper] patch float type on mps (#35295)
* fix float type on mps

* make
2024-12-16 16:52:47 +01:00
d5b81e1ca1 Delete redundancy for loop checks. (#35288)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-16 13:36:27 +00:00
d0f32212ed Temporarily disable amd push ci (#35293)
Temporarily disable amd push ci (reduce noise)
2024-12-16 14:18:50 +01:00
85eb339231 Fix : model used to test ggml conversion of Falcon-7b is incorrect (#35083)
fixing test model
2024-12-16 13:21:44 +01:00
14910281a7 Blip: fix offloading and MP tests (#35239)
* fix device map

* fix offloading + model parallel test
2024-12-16 12:44:33 +01:00
66531a1ec3 Aggeregate test summary files in CircleCI workflow runs (#34989)
* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* try 1

* fix

* fix

* fix

* update

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-16 11:06:17 +01:00
5615a39369 Fall back to slow image processor in ImageProcessingAuto when no fast processor available (#34785)
* refactor image_processing_auto logic

* fix fast image processor tests

* Fix tests fast vit image processor

* Add safeguard when use_fast True and torchvision not available

* change default use_fast back to None, add warnings

* remove debugging print

* call get_image_processor_class_from_name once
2024-12-15 14:00:36 -05:00
ca03842cdc [i18n-Chinese] Translating perf_train_cpu.md to Chinese (#35242)
add "1"
2024-12-13 14:46:49 -08:00
add53e25ff don't use no_sync when deepspeed doesn't support it for certain zero stages (#35157)
* don't use no_sync when deepspeed doesn't support it for certain zero stages

* chore: lint

* fix no_sync context for deepspeed across all zero types

* chore: lint
2024-12-13 19:23:00 +01:00
7237b3ecfc Fix FSDP no longer working (#35212)
Fix FSDP failing
2024-12-13 19:20:51 +01:00
6009642459 Translating agents_advanced.md to Chinese (#35231)
add "translate agents_advanced"
2024-12-13 10:12:00 -08:00
e94083bf90 Fixed typos in Audio Classification Documentation (#35263)
* Fixed typos in Audio Classification Documentation

* removed space in '8000 kHZ'

* Changes made as per review
2024-12-13 09:43:44 -08:00
bc6ae0d55e Update AMD docker image (rocm 6.1) (#35259)
* Use rocm 6.3 as base amd image and add nvidia-ml-py to exclude list

* Align rocm base image with torch wheels @6.1. Seems like the most stable combo
2024-12-13 15:41:03 +01:00
8096161b76 Use rsfE with pytest (#35119)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-13 14:36:22 +01:00
bdd4201fdb [tests] fix "Tester object has no attribute '_testMethodName'" (#34910)
* add more cases

* fix method not found in unittest

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>

* fix more cases

* add more models

* add all

* no unittest.case

* remove for oneformer

* fix style

---------

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-12-13 14:33:45 +01:00
3d213b57fe skip Fuyu from test_generate (#35246)
* skip Fuyu from test_generate

* make fixup, quality, repo-consistency
2024-12-13 10:12:49 +01:00
64478c7631 Add Cohere2 model (#35224) 2024-12-13 09:35:50 +01:00
e4e404fdd0 Run model as compressed/uncompressed mode (#34719)
* draft, run model as compressed/uncompressed mode

* draft

* run run_compressed=False

* run_compressed as attr

* set run_compressed=False using quantization_config

* remove redundant line

* make is_qat_trainable dependent on run_compressed status

* add tests

* lint

* full in docstring

* add decompress

* comments

* decompress if model is compressed and not run_compressed

* apply_quant_config logic fix -- populate statedict properly

* comments

* remove non-compressed model

* make is_compressed as property

* cosmetic

* run apply_quant_config for non-compressed models -- populate scales and zeropoints

* add pathway for decompressing sparse models

* typo on is_quantization_compressed

* lint

* fix typo
2024-12-13 08:23:31 +01:00
31f9a289a6 Fix typo in chat template example (#35250)
Fix template example typo
2024-12-12 16:53:21 -08:00
11ba1d472c [Init refactor] Modular changes (#35240)
* Modular changes

* Gemma

* Gemma
2024-12-12 19:23:28 +01:00
a691ccb0c2 Change back to Thread for SF conversion (#35236)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-12 16:05:04 +01:00
e3ee49fcfb Refactoring AssistedCandidateGenerator for Improved Modularity and Reusability (#35009)
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* NOTHING. add space to rerun github actions tests

* remove it...

* replace: `self.prev_tokens` -> `self.prev_assistant_ids`

* NOTHING. rerun CI tests

* remove it

* introduce `self.prev_target_ids_len`

* fix style

* fix style

---------

Co-authored-by: Jonathan Mamou <jonathan.mamou@intel.com>
2024-12-12 15:47:05 +01:00
63766abe36 Support Python 3.10+ Union style in chat template type hints parsing (#35103)
* fix(utils): Support the newest Union type in chat template

* fix(utils/chat_template): Backward compatibility for the newest Union type

* Update src/transformers/utils/chat_template_utils.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
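
A minimal illustration of what the parser now accepts, assuming the public `get_json_schema` helper that the chat-template utilities expose:

```py
from transformers.utils import get_json_schema

def get_temperature(location: str, unit: str | None = None) -> float:
    """
    Get the current temperature for a location.

    Args:
        location: The city to query.
        unit: Optional temperature unit, e.g. "celsius".
    """
    return 21.5  # placeholder value

# The PEP 604 `str | None` hint above is now parsed the same way as
# typing.Optional[str] when building the tool schema.
schema = get_json_schema(get_temperature)
```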
2024-12-12 14:07:06 +00:00
5cf11e5ab9 Fix type hints for apply_chat_template (#35216) 2024-12-12 13:59:24 +00:00
3db8e27816 Fixed typo of 'indentifier' in audio_utils.py (#35226) 2024-12-12 13:45:04 +00:00
a9ccdfd8e3 docs: clarify initializer_range parameter description in Idefics3VisionConfig (#35215) 2024-12-11 11:26:18 -08:00
6181c6b095 Fix seamless TTS generate (#34968)
* fix seamless tts generate

* apply same fix for v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* remove TODO

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2

* ignore failing test on multigpus

* [run-slow] seamless_m4t, seamless_m4t_v2

* [run-slow] seamless_m4t, seamless_m4t_v2
2024-12-11 15:38:42 +01:00
33c12e4d80 Fix CI (#35208)
fix aria
2024-12-11 14:24:52 +01:00
7d303efa5f Cleanup: continue the init refactor (#35170)
* Round 2

* Round 3
2024-12-11 14:12:34 +01:00
5fcf6286bf Add TimmWrapper (#34564)
* Add files

* Init

* Add TimmWrapperModel

* Fix up

* Some fixes

* Fix up

* Remove old file

* Sort out import orders

* Fix some model loading

* Compatible with pipeline and trainer

* Fix up

* Delete test_timm_model_1/config.json

* Remove accidentally committed files

* Delete src/transformers/models/modeling_timm_wrapper.py

* Remove empty imports; fix transformations applied

* Tidy up

* Add image classification model to special cases

* Create pretrained model; enable device_map='auto'

* Enable most tests; fix init order

* Sort imports

* [run-slow] timm_wrapper

* Pass num_classes into timm.create_model

* Remove train transforms from image processor

* Update timm creation with pretrained=False

* Fix gamma/beta issue for timm models

* Fixing gamma and beta renaming for timm models

* Simplify config and model creation

* Remove attn_implementation diff

* Fixup

* Docstrings

* Fix warning msg text according to test case

* Fix device_map auto

* Set dtype and device for pixel_values in forward

* Enable output hidden states

* Enable tests for hidden_states and model parallel

* Remove default scriptable arg

* Refactor inner model

* Update timm version

* Fix _find_mismatched_keys function

* Change inheritance for Classification model (fix weights loading with device_map)

* Minor bugfix

* Disable save pretrained for image processor

* Rename hook method for loaded keys correction

* Rename state dict keys on save, remove `timm_model` prefix, make checkpoint compatible with `timm`

* Managing num_labels <-> num_classes attributes

* Enable loading checkpoints in Trainer to resume training

* Update error message for output_hidden_states

* Add output hidden states test

* Decouple base and classification models

* Add more test cases

* Add save-load-to-timm test

* Fix test name

* Fixup

* Add do_pooling

* Add test for do_pooling

* Fix doc

* Add tests for TimmWrapperModel

* Add validation for `num_classes=0` in timm config + test for DINO checkpoint

* Adjust atol for test

* Fix docs

* dev-ci

* dev-ci

* Add tests for image processor

* Update docs

* Update init to new format

* Update docs in configuration

* Fix some docs in image processor

* Improve docs for modeling

* fix for is_timm_checkpoint

* Update code examples

* Fix header

* Fix typehint

* Increase tolerance a bit

* Fix Path

* Fixing model parallel tests

* Disable "parallel" tests

* Add comment for metadata

* Refactor AutoImageProcessor for timm wrapper loading

* Remove custom test_model_outputs_equivalence

* Add require_timm decorator

* Fix comment

* Make image processor work with older timm versions and tensor input

* Save config instead of whole model in image processor tests

* Add docstring for `image_processor_filename`

* Sanitize kwargs for timm image processor

* Fix doc style

* Update check for tensor input

* Update normalize

* Remove _load_timm_model function

---------

Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
2024-12-11 12:40:30 +00:00
bcc50cc7ce [PEFT] Better Trainer error when prompt learning with loading best model at the end (#35087)
Original issue: https://github.com/huggingface/peft/issues/2256

There is a potential error when using load_best_model_at_end=True with a
prompt-learning PEFT method. This is because Trainer uses load_adapter
under the hood, but with some prompt-learning methods the saved model is
optimized to remove parameters that are not required for inference, which
in turn requires a change to the model architecture. This is why
load_adapter will fail in such cases; users should instead set
load_best_model_at_end=False and use PeftModel.from_pretrained. As this
is not obvious, we now intercept the error and add a helpful error
message.
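
A sketch of the recommended workaround from the message above; the model name and checkpoint path are placeholders:

```py
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import PeftModel

# Avoid load_best_model_at_end=True with prompt-learning adapters:
# Trainer would call load_adapter internally, which fails for them.
args = TrainingArguments(output_dir="out", load_best_model_at_end=False)

# After training, reload the best checkpoint manually instead.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
best_model = PeftModel.from_pretrained(base_model, "out/checkpoint-500")  # hypothetical path
```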
2024-12-11 12:44:39 +01:00
d363e71d0e 🧹 Remove deprecated RotaryEmbedding parts in the Attention layers (#34858)
* update

* style

* fix missing args

* remove last trace of old rope classes

* remove deprecated copied from

* fix copies

* trigger CIs

* post rebase clean-up

* reverse mistral

* cleanup after dropping commits

* Add comment
2024-12-11 11:16:52 +01:00
9094b87dd4 BLIP: enable device map (#34850)
fix device map
2024-12-11 11:03:30 +01:00
10feacd88a [i18n-<languageCode>] Translating agents.md to Chinese (#35139)
* add "translate agents.md"

* add "agents.md"

* add "translate warnings"

* add "totree"

* add "remove transformer_agent"

* add "remove transformer _agent file"

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-10 15:16:37 -08:00
e8508924fd Update data collator docstrings to accurately reference Nvidia tensor core compute capability version (#35188)
update data collator docs to reflect correct tensor core compute capability

Co-authored-by: John Graham Reynolds <john.graham.reynolds@vumc.org>
2024-12-10 15:16:01 -08:00
5290f6a62d [docs] Fix FlashAttention link (#35171)
fix link
2024-12-10 11:36:25 -08:00
91b8ab18b7 [i18n-<languageCode>] Translating Benchmarks.md to Chinese (#35137)
* add "Translating Benchmarks.md to Chinese "

* Removed all the English original text (which was previously kept as comments in the document) and refined some of the Chinese expressions.
2024-12-10 09:58:47 -08:00
217c47e31b Only import torch.distributed if it is available (#35133) 2024-12-10 18:19:30 +01:00
52d135426f Multiple typo fixes in NLP, Audio docs (#35181)
Fixed multiple typos in Tutorials, NLP, and Audio sections
2024-12-10 09:08:55 -08:00
425af6cdc2 [i18n-ar] Translated file : docs/source/ar/community.md into Arabic (#33027)
* Add docs/source/ar/community.md to Add_docs_source_ar_community.md

* Update community.md

* Update community.md

* Update community.md

* Update _toctree.yml - add community.md

* Update docs/source/ar/community.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Create how_to_hack_models.md

* Create modular_transformers.md

* Create tiktoken.md

* Update _toctree.yml

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/how_to_hack_models.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/modular_transformers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tiktoken.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/tiktoken.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-12-10 09:08:27 -08:00
e5c45a6679 Fixing GGUF support for StableLm (#35060)
fix

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-12-10 16:30:09 +01:00
3e2769a3c9 Fix DBRX LayerNorm init method (#35177)
fix dbrx layernorm init
2024-12-10 14:31:22 +00:00
5fba3f99c0 Remove unnecessary masked_fill in deberta models (#35182) 2024-12-10 13:52:20 +00:00
6acb4e43a7 Support BatchNorm in Hubert pos_conv_emb as in fairseq (#34389)
* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* Correct the new defaults (#34377)

* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue

* [auto. ping] Avoid sending empty info + add more team members (#34383)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix glm  (#34388)

* Fix duplicated

* fix import

* Use non nested images and batched text Idefics2/3  (#34222)

* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests

* Fix ONNX non-exportable inplace aten op (#34376)

* fix ONNX non-exportable inplace op

* mistral, qwen2, qwen2_vl, starcoder2

* fixup copies

* Fix right padding in LLaVA models (#34305)

* fix right pad llavas

* device mismatch

* no filter (#34391)

* no filter

* no filter

* no filter

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* SynthID: better example (#34372)

* better example

* Update src/transformers/generation/configuration_utils.py

* Update src/transformers/generation/logits_process.py

* nits

* Tests: upgrade `test_eager_matches_sdpa_generate` (#34386)

* Fix bnb training test failure (#34414)

* Fix bnb training test: compatibility with OPTSdpaAttention

* Avoid check expected exception when it is on CUDA (#34408)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fix typos in agents_advanced.md (#34405)

* [docs] Cache implementations (#34325)

cache

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq

* [run-slow] hubert

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
Add conversion integration test, and make batchnorm explicit variable

* Support BatchNorm in Hubert pos_conv_emb as in fairseq
fix make fixup styling changes

* [run-slow] hubert

* [run-slow] hubert

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Rudy Delouya <rudy.delouya@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-12-10 14:18:23 +01:00
80f2b1610f Fix file path for shard_num 1 with mllama converter (#35053)
"#35049 fix path for num_shard 1"
2024-12-10 09:11:45 +00:00
0938b57770 Assisted decoding multi-gpu (#35116)
* fix

* move a few lines up
2024-12-10 09:59:17 +01:00
dada0fd85f Fix num_items_in_batch not being an integer (#35115)
In method `Trainer#get_batch_samples`, the return values should be a
list of batch samples and an integer indicating the number of items that
exist in the batch. However, this was not actually the case: instead of
an integer, a tensor with one element was returned. In the multi-GPU
setup, this tensor is placed on a different device than the loss tensor,
causing the loss function to raise a `RuntimeError`.

The problem arises from
5d7739f15a/src/transformers/trainer.py (L5139-L5144),
where the outer `sum` operates over a list of tensors which means that
the final result is also a tensor. To counter this issue, a new check
(after the accelerator gathering) has been added in order to convert a
potential tensor to an integer before returning the
`num_items_in_batch`.
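
A minimal stand-alone sketch of the added safeguard (not the exact Trainer code):

```py
import torch

# Stand-in for the gathered per-device counts: summing a list of
# tensors yields a tensor, not an int.
num_items_in_batch = sum([torch.tensor(3), torch.tensor(4)])

# The fix: convert a potential one-element tensor to a plain integer
# so it is not tied to any particular device.
if torch.is_tensor(num_items_in_batch):
    num_items_in_batch = num_items_in_batch.item()
```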
2024-12-10 08:40:40 +01:00
34f4080ff5 [CI] Fix bnb quantization tests with accelerate>=1.2.0 (#35172) 2024-12-09 13:55:16 -05:00
fa8763ce17 Fixed typo of 'avilable' in prompts.py (#35145) 2024-12-09 16:40:32 +00:00
4bc39de5c3 Super tiny fix logging message (#35132)
Update integration_utils.py
2024-12-09 16:31:32 +00:00
8e806a336f Cleanup: continue the init refactor (#35167)
Round 2
2024-12-09 16:09:50 +01:00
7238387f67 Fix typo in EETQ Tests (#35160)
fix
2024-12-09 14:13:36 +01:00
de8a0b7547 Option to set 'non_blocking' for to(device) in BatchEncoding and BatchFeature (#34883)
* Option to set 'non_blocking' for the to(device) operation for performance improvements. Defaults to 'False', so there are no behavioral changes.

* Enabling non_blocking in to() operation of BatchFeature.

* Improved docstring on utilization of non_blocking

* Force non_blocking as keyword argument

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Daniel Bogdoll <dbogdoll@umich.edu>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
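
A short usage sketch of the new keyword-only argument:

```py
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["hello world"], return_tensors="pt")

# non_blocking must be passed by keyword; with pinned host memory this
# lets the host-to-device copy overlap with other work. Default is False,
# so existing behavior is unchanged.
if torch.cuda.is_available():
    batch = batch.to("cuda", non_blocking=True)
```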
2024-12-09 11:29:04 +01:00
1452dc2514 Corrected typo in agent system prompts (#35143) 2024-12-09 10:42:23 +01:00
9e420e0269 [I-JEPA] Update docs (#35148)
Update docs
2024-12-09 10:01:31 +01:00
1ccca8f48c Fix GA loss bugs and add unit test (#35121)
* fix GA bugs and add unit test

* narrow down model loss unit test diff gap

* format code to make ruff happy

* send num_items_in_batch argument to decoder

* fix GA loss bug in BertLMHeadModel

* use TinyStories-33M to narrow down diff gap

* format code

* missing .config

* avoid add extra args

---------

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-12-09 09:57:41 +01:00
c8c8dffbe4 Update I-JEPA checkpoints path (#35120)
Update checkpoints path
2024-12-06 13:42:51 +00:00
7f95372c62 Add feature dim attributes to BitLinear for easier PEFT integration (#34946)
Update bitnet.py, extremely small change to allow for easier PEFT integration

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-06 13:39:45 +01:00
9ad4c93536 Add Aria (#34157)
* Add Aria
---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-06 12:17:34 +01:00
15ab310c3a Fix private forked repo. CI (#35114)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-06 12:03:31 +01:00
98e8062df3 [docs] top_p, top_k, temperature docstrings (#35065)
clarify
2024-12-05 11:24:51 -08:00
44f88d8ccb [docs] Update Python version in translations (#35096)
update: doc version
2024-12-05 11:06:54 -08:00
66ab300aaf Dev version 2024-12-05 19:12:22 +01:00
a5bb528471 Fix signatures for processing kwargs (#35105)
* add conversion script

* remove pg2 refs

* fixup style

* small update

* get correct scaling

* add back missing bos

* fix missing config keys

* might revert this pos_embeddings

* fixup 9b config

* fix 9b

* fixup 9b conversion for good + add back num_hidden_layers

* add correct query scaling for 2b, 9b, 27b

* fixup 27b conversion

* Additional variant: 27b-896

* Use CPU for conversion to reduce GPU RAM requirements

* fix causal mask generation + formatting

* fix in-training causal mask generation edge case

* trigger CI

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* update config

* move conversion file to main model dir

* handle multi-images + bos token

* address comments for input ids

* revert ci fixes

* [run-slow] paligemma

* fix

* [run-slow] paligemma

* skip end 2 end

* [run-slow] paligemma

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-05 18:15:48 +01:00
e27465c801 Adaptive dynamic number of speculative tokens (#34156)
* initial commit

* update strategy

* add tradeoff FPR TPR with cost

* all probs

* fix

* fix

* fix style

* Update src/transformers/generation/configuration_utils.py

shorter docstring

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* import guard

* fix style

* add is_sklearn_available condition

* vectorizing to flatten the for-loop

* fix style

* disable adaptation for UAG

* update doc

* add TestAssistedCandidateGeneratorUpdateStrategy

* fix style

* protect import

* fix style

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-12-05 17:07:33 +01:00
b0a51e5cff Fix flaky Hub CI (test_trainer.py) (#35062)
* fix

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* check

* check

* check

* check

* check

* check

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* check

* check

* check

* Final space

* Final adjustment

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2024-12-05 17:02:27 +01:00
a928d9c128 [trainer] fix the GA model_accepts_loss_kwargs (#34915)
* fix

* style

* values

* fix
2024-12-05 16:37:46 +01:00
e682c17e4a BLIP: this is correct now (#35081)
this is correct now
2024-12-05 16:30:09 +01:00
50189e36a6 Add I-JEPA (#33125)
* first draft

* add IJepaEmbeddings class

* fix copy-from for IJepa model

* add weight conversion script

* update attention class names in IJepa model

* style changes

* Add push_to_hub option to convert_ijepa_checkpoint function

* add initial tests for I-JEPA

* minor style changes to conversion script

* make fixup related

* rename conversion script

* Add I-JEPA to sdpa docs

* minor fixes

* adjust conversion script

* update conversion script

* adjust sdpa docs

* [run_slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* [run-slow] ijepa

* formatting issues

* adjust modeling to modular code

* add IJepaModel to objects to ignore in docstring checks

* [run-slow] ijepa

* fix formatting issues

* add usage instruction snippet to docs

* change pos encoding, add checkpoint for doc

* add verify logits for all models

* [run-slow] ijepa

* update docs to include image feature extraction instructions

* remove pooling layer from IJepaModel in image classification class

* [run-slow] ijepa

* remove pooling layer from IJepaModel constructor

* update docs

* [run-slow] ijepa

* [run-slow] ijepa

* small changes

* [run-slow] ijepa

* style adjustments

* update copyright in init file

* adjust modular ijepa

* [run-slow] ijepa
2024-12-05 16:14:46 +01:00
95a855e212 Deprecate quanto and switch to optimum-quanto (#35001)
* deprecate quanto

* fix style
2024-12-05 16:11:09 +01:00
482cb28a18 Fix tie_word_embeddings handling for GGUF models (#35085)
* fix tie_word_embeddings

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
2024-12-05 16:00:41 +01:00
35447054f5 Update Mistral conversion script (#34829)
* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py

* Update convert_mistral_weights_to_hf.py
2024-12-05 15:47:20 +01:00
93f87d3cf5 [tokenizers] bump to 0.21 (#34972)
bump to 0.21
2024-12-05 15:46:02 +01:00
54aae121eb [Whisper] Fix whisper tokenizer (#34537)
* handle single timestamp ending

* include last timestamp token

* handle single timestamp ending

* avoid floating points arithm limitations

* ensure float64 operations

* new test

* make fixup

* make copies

* handle edge case double tokens ending with different tokens

* handle single timestamp ending

* make fixup

* handle conditioning on prev segments

* fix

* Update src/transformers/models/whisper/generation_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* [run-slow] whisper

* don't call item() to avoid unnecessary sync

* fix

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
2024-12-05 13:46:29 +01:00
beb2c66ec3 Informative (#35059)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-05 09:50:27 +01:00
1ed1de2fec [docs] Increase visibility of torch_dtype="auto" (#35067)
* auto-dtype

* feedback
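
The pattern the docs now foreground, as a one-liner:

```py
from transformers import AutoModelForCausalLM

# "auto" loads weights in the dtype recorded in the checkpoint's config
# (e.g. bfloat16, when present) instead of defaulting to float32.
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype="auto")
print(model.dtype)
```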
2024-12-04 09:18:44 -08:00
baa3b22137 [docs] add a comment that offloading requires CUDA GPU (#35055)
* add commen to offloading

* Update docs/source/en/kv_cache.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
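
A minimal sketch of cache offloading with the constraint the docs now state (a CUDA GPU is required):

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

assert torch.cuda.is_available(), "KV cache offloading requires a CUDA GPU"

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").cuda()
inputs = tokenizer("The capital of France is", return_tensors="pt").to("cuda")

# "offloaded" keeps past key/values on the CPU and streams them back
# layer by layer during decoding.
out = model.generate(**inputs, max_new_tokens=8, cache_implementation="offloaded")
```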
2024-12-04 07:48:34 -08:00
1da1e0d7f2 Support for easier multimodal use of modular (#35056)
* update modular and add examples

* style

* improve example comments

* style

* fix small logic issue for imports

* fix relative order issue when files do not make sense

* Improve comments

* trigger CIs
2024-12-04 15:13:11 +01:00
46df859975 [GPTNeoX] Flex Attention + Refactor (#34896)
* gpt neox flex attention + refactor

* some formatting

* small fix on dropout

* add assertion on flex attn test

* flaky ci :(

* add head mask support

* style

* handle dtype, replace torch where

* fixup flex with output attns

* code review and several other fixes

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

* remove unnecessary comment

* remove incorrect comment

* make flex attn check more agnostic tor versions and centralized

* change peft input dtype check to value since q and k could be affected by other stuff like RoPE

* I forgot

* flaky

* code review and small fixes

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:48:28 +01:00
accb7204f9 Add Pytorch Tensor Parallel support for Qwen2, Qwen2Moe, Starcoder2 (#35007)
* add base tp plan for qwen2 and qwen2moe

* add parallel tp for starcoder2

* fix modular conversion

* add infer dim for qkv states

* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-04 14:43:36 +01:00
c7a109ec81 Fix pad_token_tensor is None in warning (#34005)
Fix pad_token_tensor is None in warning
2024-12-04 11:15:25 +01:00
329f5dbf97 [docs] use device-agnostic API instead of hard-coded cuda (#35048)
replace cuda
2024-12-03 10:54:15 -08:00
b8cdc262d5 [docs] use device-agnostic instead of cuda (#35047)
* fix on xpu

* [run_all]

* add the missing import for Image lib

* add more devices in comment

* bug fix

* replace cuda
2024-12-03 10:53:45 -08:00
346597b644 Translate community.md into Chinese (#35013)
* community translation

* Update docs/source/zh/community.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-03 10:22:02 -08:00
3deaa8179d [docs] fix example code bug (#35054)
fix code bug
2024-12-03 09:18:39 -08:00
125de41643 fix speecht5 failure issue in test_peft_gradient_checkpointing_enable… (#34454)
* fix speecht5 failure issue in test_peft_gradient_checkpointing_enable_disable

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* [run-slow] speecht5

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-12-03 13:58:54 +00:00
7a7f27697a Fix BertGeneration (#35043)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-03 13:56:59 +01:00
901f504580 Add token cost + runtime monitoring to Agent and HfEngine children (#34548)
* Add monitoring to Agent and HfEngine children
2024-12-03 13:14:52 +01:00
ee37bf0d95 Automatic compilation in generate: do not rely on inner function (#34923)
* compiled forward in PreTrainedModel

* update

* style

* update name

* trigger CIs

* Add way to use custom compile args

* style

* switch parameterization to generation_config

* Add to inits

* Update configuration_utils.py

* inits

* style

* docs

* style

* Update configuration_utils.py

* back without dataclass for repo consistency

* Update configuration_utils.py

* style

* style

* style once again

* add config serialization

* update

* true dataclass

* trigger CIs

* merge compile methods + remove serialization of compile config
2024-12-03 11:20:31 +01:00
f9c7e6021e Translate bertology.md into Chinese (#34908)
* bertology translation

* Update docs/source/zh/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: blueingman <15329507600@163.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2024-12-02 11:42:40 -08:00
527dc04e46 [docs] add the missing import for Image and bug fix (#34776)
* add the missing import for Image lib

* add more devices in comment

* bug fix
2024-12-02 11:40:20 -08:00
4955e4e638 [i18n-ar] Translated file : docs/source/ar/notebooks.md into Arabic (#33049)
* Add docs/source/ar/notebooks.md to Add_docs_source_ar_notebooks.md

* Update notebooks.md

* Update _toctree.yml
2024-12-02 11:40:04 -08:00
f0dec874f0 add docstring example for compute_loss_func (#35020) 2024-12-02 11:39:09 -08:00
31299670cd Multiple typo fixes in Tutorials docs (#35035)
* Fixed typo in multi gpu docs and OLMoE version

* Fixed typos in docs for agents, agents advanced, knowledge distillation, and image feature extraction

* Fixed incorrect usage of model.image_guided_detection in zero shot object detection docs
2024-12-02 15:26:34 +00:00
31830474bf Fix test_eager_matches_sdpa_inference for XPU backend (#34889)
* Use torch.nn.attention.sdpa_kernel instead of deprecated torch.backends.cuda.sdp_kernel

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Fix test_eager_matches_sdpa_inference for XPU backend

As of PyTorch 2.5 XPU backend supports only torch.nn.attention.SDPBackend.MATH
which is implemented on PyTorch level using aten operators and is device
agnostic with respect to implementation of each aten operator. Thus, we can
reuse CUDA (or CPU) MATH weights for XPU.

Fixes: #34888
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

* Use torch.amp.autocast instead of deprecated torch.cuda.amp.autocast in nemotron

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

---------

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
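
The replacement pattern described above, as a minimal sketch:

```py
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

q = k = v = torch.randn(1, 2, 4, 8)

# torch.nn.attention.sdpa_kernel replaces the deprecated
# torch.backends.cuda.sdp_kernel and is device-agnostic; MATH is the
# only backend XPU supports as of PyTorch 2.5.
with sdpa_kernel(SDPBackend.MATH):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
```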
2024-12-02 16:21:04 +01:00
f41d5d8f74 Add type hints for forward functions in Gemma2 (#35034)
* feat: add gemma2 type hints

* fix: mask is optional
2024-12-02 14:03:36 +00:00
7b5f76e32e Typo in warning switching to optimum-quanto (#35028)
fix typos
2024-12-02 13:47:05 +00:00
c24c79ebf9 Optimize memory usage of mllama encoder (#34930)
mllama encoder memory optimization
2024-12-02 11:46:45 +01:00
9ab8c5b503 fix variable undefined bug when return_tensors is not specified in llava processing (#34953)
* fix variable undefined bug when return_tensors is not specified in llava processor

* improve readability
2024-12-02 11:44:42 +01:00
3480cbb97e Only cast cu_seqlens when tracing (#35016)
* Only cast `cu_seqlens` when tracing

* Formatting
2024-12-02 11:39:39 +01:00
19dabe9636 Update FillMaskPipeline.__call__ signature and docstring (#35006)
Update `FillMaskPipeline.__call__`

- Remove unused `*args`
- Update docstring with `inputs` over `args`
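
The call pattern the updated signature documents:

```py
from transformers import pipeline

fill = pipeline("fill-mask", model="distilbert-base-uncased")

# A plain string (or list of strings) is passed as `inputs`; the unused
# *args have been removed from __call__.
predictions = fill("Paris is the [MASK] of France.")
print(predictions[0]["token_str"])
```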
2024-11-29 13:44:56 +00:00
f7427f58ed fix: double verbs (#35008) 2024-11-29 13:19:57 +00:00
737f4dc4b6 Update timm version (#35005)
* Bump timm

* dev-ci
2024-11-29 12:46:59 +00:00
89d7bf584f 🚨🚨🚨 Uniformize kwargs for TrOCR Processor (#34587)
* Make kwargs uniform for TrOCR

* Add tests

* Put back current_processor

* Remove args

* Add todo comment

* Code review - breaking change
2024-11-29 11:58:11 +00:00
0b5b5e6a70 Let server decide default repo visibility (#34999)
* Let server decide default repo visibility

* code style
2024-11-28 17:05:08 +01:00
f491096f7d Fix docker CI : install autogptq from source (#35000)
* Fixed Docker

* Test ci

* Finally

* add comment
2024-11-28 16:31:36 +01:00
01ad80f820 Improve .from_pretrained type annotations (#34973)
* Fix from_pretrained type annotations

* Better typing for image processor's `from_pretrained`
2024-11-28 15:05:19 +00:00
9d6f0ddcec Add optimized PixtralImageProcessorFast (#34836)
* Add optimized PixtralImageProcessorFast

* make style

* Add dummy_vision_object

* Review comments

* Format

* Fix dummy

* Format

* np.ceil for math.ceil
2024-11-28 16:04:05 +01:00
6300212946 Fix utils/check_bad_commit.py (for auto ping in CI) (#34943)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-28 15:34:38 +01:00
5e8c1d713d Offloaded cache: fix generate (#34921)
* fix cache impl

* require_torch_gpu

* fix mamba

* fix copies
2024-11-28 15:05:56 +01:00
57ca9e6d2f Allow compressed-tensors quantized model to be trained (#34520)
* populate quantization_config for kv-cache-scheme only configs

* make compressed-tensors quantized models trainable

* populate versions on quant config

* pass oneshot then finetune

* remove breakpoint

* SunMarc comments and fix to_dict logic

* lint

* lint

* test

* comment

* comments'
2024-11-28 15:05:16 +01:00
44af935ec5 Refine the code of Universal Assisted Generation (#34823)
* removed the useless attributes

* add configs for window size

* fixed the wrong kwargs

* added docstring
2024-11-28 15:04:24 +01:00
2b053fdf1a 🚨🚨🚨 Changed DINOv2Config default patch size to 14 (#34568)
Changed DINOv2Config default patch size to 14
2024-11-28 14:48:06 +01:00
4f0bf9864c Fix save_pretrained for partially offloaded models (#34890)
* delete unnecessary reference

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update comment, explicit delete state_dict

* Update src/transformers/modeling_utils.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* fix style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-11-28 14:46:56 +01:00
f4b674f269 [PEFT] Set eval mode when loading PEFT adapter (#34509)
* [PEFT] Set eval mode when loading PEFT adapter

Resolves #34469

When calling model.load_adapter to load a PEFT adapter, by default the
adapter should be set to eval mode. This is now correctly done. Users
can still pass is_trainable=True to load the adapter in training mode.

* Linter
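
A short sketch of the new default and the opt-out; the adapter repo id is hypothetical:

```py
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# The adapter now loads in eval mode by default; pass is_trainable=True
# to keep training it.
model.load_adapter("some-user/gpt2-lora", is_trainable=True)  # hypothetical adapter repo
```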
2024-11-28 13:56:25 +01:00
5523e38b55 Fixed typo in VisitWebpageTool (#34978)
Fixed typo in VisitWebpageTool
2024-11-27 12:49:21 -08:00
4120cb257f Fix typo in code block in vipllava.md (#34957)
fix typo in code block in vipllava.md
2024-11-27 08:19:34 -08:00
2910015d6d [i18n-zh]Translated perf_train_special.md into Chinese (#34948)
* Add translation for perf_train_special documentation

* Update docs/source/zh/perf_train_special.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/perf_train_special.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update _toctree.yml

* Update _toctree.yml

* Update perf_train_special.md

* Update perf_train_special.md

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-11-27 07:57:43 -08:00
637225508f [docs] add explanation to release_memory() (#34911)
* explain release_memory

* Update docs/source/en/llm_tutorial_optimization.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-27 07:47:28 -08:00
0600f46353 🌐 [i18n-KO] Translated encoder-decoder.md to Korean (#34880)
* Initial version of translation, english still remaining

* Revised Translation, removed english. _toctree not updated

* updated _toctree.yml && 3rd ver translation

* updated _toctree.yml && 3rd ver translation

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update encoder-decoder.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
2024-11-27 07:47:14 -08:00
5f8b24ee12 Fix flaky test execution caused by Thread (#34966)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 16:32:50 +01:00
0d99a938aa Avoid calling get_max_length (#34971)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-27 15:15:35 +01:00
8f48ccf548 Fix : Add PEFT from source to CI docker (#34969)
* Docker fix peft

* Test new docker

* uncomment
2024-11-27 14:10:47 +01:00
4c1388f48e [FlexAttention] Update gemma2 (#34942)
* update tests

* now maybe this fixes the previously failing tests!

* nit default

* Update src/transformers/models/gemma2/modular_gemma2.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix-copies

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2024-11-27 11:50:48 +01:00
6c3f168b36 [i18n-zh]Translated tiktoken.md into chinese (#34936)
* Add translation for tiktoken documentation

* Update tiktoken.md

* Update tiktoken.md
2024-11-26 10:09:52 -08:00
5bfb40bc8e docs: HUGGINGFACE_HUB_CACHE -> HF_HUB_CACHE (#34904) 2024-11-26 09:37:18 -08:00
784d22078a [doc] use full path for run_qa.py (#34914)
use full path for run_qa.py
2024-11-26 09:23:44 -08:00
6bc0c219c1 [docs] use device-agnostic API instead of cuda (#34913)
add device-agnostic API

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-11-26 09:23:34 -08:00
64b73e61f8 [i18n-ar] Translated file : docs/source/ar/benchmarks.md into Arabic (#33023)
* Add docs/source/ar/benchmarks.md to Add_docs_source_ar_benchmarks.md

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/benchmarks.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update benchmarks.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-11-26 09:23:11 -08:00
a0ba631519 Update the Python version in the Chinese README to match the English README. (#34870)
Update Python Version
2024-11-26 09:22:34 -08:00
1f6b423f0c Fix torch.onnx.export of Qwen2-VL vision encoder (#34852)
* Fix torch.onnx.export of Qwen2-VL vision encoder

This PR fixes ONNX export support for the vision encoder of Qwen2-VL, which converts the `cu_seqlens` to `torch.int32`, leading to errors later on when using the values for slicing.

c57eafdaa1/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py (L1044-L1046)

## Error:
```
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Slice, node name: /blocks.0/attn/Slice_4): axes has inconsistent type tensor(int64)
```

## Code to reproduce issue:
```py

import requests
from PIL import Image
import torch
from transformers import (
    AutoProcessor,
    Qwen2VLForConditionalGeneration,
)

# Constants
VISION_MODEL_NAME = "vision_encoder.onnx"

# Load model and processor
model_id = "hf-internal-testing/tiny-random-Qwen2VLForConditionalGeneration"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id).eval()
processor = AutoProcessor.from_pretrained(model_id)

# Prepare inputs
url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
conversation = [
    {
        "role": "user",
        "content": [
            { "type": "image" },
            { "type": "text", "text": "Describe this image."},
        ],
    },
]
images = [image]
text_prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=[text_prompt], images=images, padding=True, return_tensors="pt")

## Vision model
vision_inputs = dict(
    pixel_values=inputs["pixel_values"],
    grid_thw=inputs["image_grid_thw"],
)
vision_inputs_positional = tuple(vision_inputs.values())
vision_outputs = model.visual.forward(*vision_inputs_positional)  # Test forward pass
torch.onnx.export(
    model.visual,
    args=vision_inputs_positional,
    f=VISION_MODEL_NAME,
    export_params=True,
    opset_version=14,
    do_constant_folding=True,
    input_names=list(vision_inputs.keys()),
    output_names=["image_features"],
    dynamic_axes={
        "pixel_values": {
            0: "batch_size * grid_t * grid_h * grid_w",
            1: "channel * temporal_patch_size * patch_size * patch_size",
        },
        "grid_thw": {0: "batch_size"},
        "image_features": {0: "batch_size * grid_t * grid_h * grid_w"},
    },
)

# Load and check the exported ONNX model
import onnx
model = onnx.load(VISION_MODEL_NAME)
onnx.checker.check_model(model, full_check=True)
inferred = onnx.shape_inference.infer_shapes(model, check_type=True)
```

* Formatting

* [run-slow] qwen2_vl
2024-11-26 16:14:36 +01:00
d5cf91b346 Separate chat templates into a single file (#33957)
* Initial draft

* Add .jinja file loading for processors

* Add processor saving of naked chat template files

* make fixup

* Add save-load test for tokenizers

* Add save-load test for tokenizers

* stash commit

* Try popping the file

* make fixup

* Pop the arg correctly

* Pop the arg correctly

* Add processor test

* Fix processor code

* stash commit

* Processor clobbers child tokenizer's chat template

* Processor clobbers child tokenizer's chat template

* make fixup

* Split processor/tokenizer files to avoid interactions

* fix test

* Expand processor tests

* Rename arg to "save_raw_chat_template" across all classes

* Update processor warning

* Move templates to single file

* Move templates to single file

* Improve testing for processor/tokenizer clashes

* Improve testing for processor/tokenizer clashes

* Extend saving test

* Test file priority correctly

* make fixup

* Don't pop the chat template file before the slow tokenizer gets a look

* Remove breakpoint

* make fixup

* Fix error
2024-11-26 14:18:04 +00:00
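
A minimal sketch of the behaviour described in the PR above, assuming the `save_raw_chat_template` argument named in its bullets; the model id and output directory are placeholders.

```py
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
# With save_raw_chat_template=True the template is written to its own
# chat_template.jinja file instead of being embedded as an escaped string
# inside tokenizer_config.json.
tok.save_pretrained("./saved_model", save_raw_chat_template=True)
```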
5a45617887 change apply_rotary_pos_emb of GlmModel for GLM-Edge series models (#34629)
* change apply_rotary_pos_emb

* upload for glm-edge

* remove useless part

* follow the suggestion

* fix

* format

* format

* test

* format again

* format again

* remove modular change

* remove modular change

* does this apply_rotary_pos_emb need modification?

* fix with this

* format

* format

* ruff check

* modify modular_glm failed

* remove partial_rotary_factor of function  partial_rotary_factor

* fix wrong change of examples/research_projects

* revert

* remove line 118

* use q_rot
2024-11-26 15:05:42 +01:00
1141eff1bd Add Pytorch Tensor Parallel support for Mistral (#34927)
add base tp support
2024-11-26 14:28:07 +01:00
4d1d0f29a4 [Whisper] Fix whisper integration tests (#34111)
* fix test_tiny_timestamp_generation

* fix test_large_timestamp_generation

* fix test_whisper_shortform_single_batch_prev_cond

* fix test_whisper_shortform_multi_batch_hard_prev_cond

* return_timestamps necessary with long form

* fix test_default_multilingual_transcription_long_form

* fix test_tiny_token_timestamp_generation_longform

* fix test_whisper_longform_multi_batch_hard

* Update tests/models/whisper/test_modeling_whisper.py

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* fix typo

* do not expect special tokens

* fix test_whisper_longform_single_batch_beam

* fix test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* update test_whisper_longform_multi_batch_hard_prev_cond

* these tests do not make sense anymore

* this test does not make sense anymore

* make fixup

* suggested nits

* add test with forced_decoder_ids

* this test does not make sense anymore

* change assert for unittest test cases

* make fixup

* test with prompt_ids and task and language

* fix unittest test case call

* fix test_tiny_generation

* fix test_tiny_en_generation

* fix test_tiny_en_batched_generation

* fix test_tiny_longform_timestamps_generation

* fix test_tiny_timestamp_generation

* fix test_large_generation

* fix test_large_batched_generation

* fix test_large_generation_multilingual

* fix test_large_timestamp_generation

* fix test_large_timestamp_generation

* fix test_tiny_token_timestamp_generation_longform

* fix test_tiny_en_batched_generation

* make fixup

* [run-slow] whisper

---------

Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-11-26 12:23:08 +01:00
0e805e6d1e Skipping aqlm non working inference tests till fix merged (#34865) 2024-11-26 11:09:30 +01:00
73b4ab1085 VideoLLaVA: add default values (#34916)
add default values
2024-11-26 08:20:06 +01:00
bdb29ff9f3 Fix import structure for Fast Image processors (#34859)
* Fix import structure image_processor_fast

* update to new inits
2024-11-25 16:27:56 -05:00
bfc3556b20 making gpt2 fx traceable (#34633)
* making gpt2 fx traceable

* running make fix-copies

* Revert "running make fix-copies"

This reverts commit 5a3437cb5b63799243bceae7d21a2aed8d0418c7.
2024-11-25 19:30:38 +01:00
95c10fedb3 Updated documentation and added conversion utility (#34319)
* Updated documentation and added conversion utility

* Update docs/source/en/tiktoken.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tiktoken.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Moved util function to integration folder + allow for str

* Update formatting

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Updated formatting

* style changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 18:44:09 +01:00
890ea7de93 Fix failing GGML test (#34871)
fix_test
2024-11-25 18:04:52 +01:00
b76a292bde Upgrade torch version to 2.5 in dockerfile for quantization CI (#34924)
* Upgrade Torch 2.5

* uncomment
2024-11-25 17:38:20 +01:00
a830df2909 Fix test_auto_backbone_timm_model_from_pretrained (#34877)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-25 17:20:41 +01:00
a464afbe2a fix static cache data type mismatch (#34799)
* fix gptj data type mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add low precision static cache tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix low-precision static cache tests

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* avoid config change

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* change data type convert in cache copy

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* cast key value after k v out

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2024-11-25 16:59:38 +01:00
b13916c09d [AWQ, CI] Bump AWQ version used in docker image (#34922)
The old AWQ version is failing with the latest (unreleased)
transformers, giving the error:

> ImportError: cannot import name 'shard_checkpoint' from
'transformers.modeling_utils'

This has been resolved in awq v0.2.7:

https://github.com/casper-hansen/AutoAWQ/pull/644
2024-11-25 16:49:57 +01:00
4e6b19cd95 Fix: BitNet tests (#34895)
* fix_tests_bitnet

* fix format
2024-11-25 16:47:14 +01:00
9121ab8fe8 Rename OLMo November to OLMo2 (#34864)
* Rename/move OLMo Nov files to OLMo2

* Rename Olmo1124 and its variants to Olmo2
2024-11-25 16:31:22 +01:00
1de3598d30 Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/lxmert (#34917)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-25 15:19:29 +00:00
f4c04ba32b Fix Qwen2 failing tests (#34819)
* fix: qwen2 model ids

* fix: line

* fix: more format

* update: reformat
2024-11-25 15:53:04 +01:00
11cc2295c7 [peft] Given that self.active_adapter is deprecated, avoid using it (#34804)
* Given that self.active_adapter is deprecated, avoid using it

* Remove misleading comment - `self.active_adapter` is not used (and deprecated)
2024-11-25 15:29:52 +01:00
74db22f905 Fix convert_tokens_to_string when decoder is None (#34569)
* Fix convert_tokens_to_string when decoder is None

* revert unrelated changes

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-11-25 14:35:24 +01:00
97514a8ba3 chore: fix some typos (#34891)
Signed-off-by: wanxiangchwng <cui.shuang@foxmail.com>
2024-11-25 13:05:59 +00:00
62ab94dea8 Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/visual_bert (#34887)
Bump tornado in /examples/research_projects/visual_bert

Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-25 12:54:55 +00:00
c50b5675d6 prepare_fa2_from_position_ids function bugfix (#33269)
contiguous() is called before view() for key and value within prepare_fa2_from_position_ids function
2024-11-25 13:51:26 +01:00
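
A standalone illustration (not the transformers code itself) of why the fix above is needed: `.view()` requires a contiguous tensor, so `.contiguous()` must run first on transposed key/value tensors.

```py
import torch

kv = torch.randn(2, 3, 4).transpose(0, 1)  # transposing makes the tensor non-contiguous
# kv.view(-1, 4) would raise a RuntimeError here because the strides are
# incompatible; .contiguous() first materializes a compatible memory layout.
flat = kv.contiguous().view(-1, 4)
print(flat.shape)  # torch.Size([6, 4])
```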
a0f4f3174f allow unused input parameters passthrough when chunking in asr pipelines (#33889)
* allow unused parameter passthrough when chunking in asr pipelines

* format code

* format

* run fixup

* update tests

* update parameters to pipeline in test

* update parameters in tests

* change spelling in gitignore

* revert .gitignore to main

* add git ignore of devcontainer folder

* assert asr output follows expected inference output type

* run fixup

* Remove .devcontainer from .gitignore

* remove compliance check
2024-11-25 11:36:44 +01:00
4dc1a69349 Sum gathered input tokens (#34554)
* sum gathered input tokens

* ruff line-length is 119, format the code

---------

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-11-25 11:27:13 +01:00
1e492afd61 🔴 Mllama: fix base prefix (#34874)
fix base prefix
2024-11-25 11:20:20 +01:00
857d46ca0c [Deberta/Deberta-v2] Refactor code base to support compile, export, and fix LLM (#22105)
* some modification for roadmap

* revert some changes

* yups

* weird

* make it work

* settling

* fix-copies

* fixup

* renaming

* more fix-copies

* move stuff around

* remove torch script warnings

* ignore copies

* revert bad changes

* woops

* just styling

* nit

* revert

* style fixup

* nits configuration style

* fixup

* nits

* will this fix the tf pt issue?

* style

* ???????

* update

* eval?

* update error message

* updates

* style

* grumble grumble

* update

* style

* nit

* skip torch fx tests that were failing

* style

* skip the failing tests

* skip another test and make style
2024-11-25 10:43:16 +01:00
098962dac2 BLIP: fix generation after hub update (#34876)
* fix blip generation

* dont remove it yet

* Update src/transformers/models/blip_2/modeling_blip_2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* modular

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 10:41:55 +01:00
c1a8520419 Cache: init empty cache when use_cache (#34274)
* fix

* fix tests

* fix copies

* add docs

* Revert "add docs"

This reverts commit 32d35634f12ba02781d2ebdee0c8dcfbe992a7b9.

* qwen move deltas

* mllama can potentially fullgraph compile

* enable mllama compile and fix tests

* remove mllama fixes
2024-11-25 10:11:33 +01:00
1339a14dca Add safe_globals to resume training on PyTorch 2.6 (#34632)
Starting from version 2.4, PyTorch introduces a stricter check for the objects which
can be loaded with torch.load(). Starting from version 2.6, loading with weights_only=True
requires allowlisting of such objects.

This commit adds an allowlist of some numpy objects used to load model checkpoints.
Usage is restricted by a context manager. Users can still additionally call
torch.serialization.add_safe_globals() to add other objects to the safe globals list.

The Accelerate library also ran into the same problem and addressed it with PR-3036.

Fixes: #34631
See: https://github.com/pytorch/pytorch/pull/137602
See: https://pytorch.org/docs/stable/notes/serialization.html#torch.serialization.add_safe_globals
See: https://github.com/huggingface/accelerate/pull/3036

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-11-25 10:03:43 +01:00
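
A minimal sketch of the allowlisting pattern described above, assuming PyTorch >= 2.6; the exact numpy objects and the checkpoint path are placeholders.

```py
import numpy as np
import torch

# Allowlist the numpy reconstruction objects that checkpoints may pickle,
# then load under the strict weights_only=True mode.
torch.serialization.add_safe_globals([np.core.multiarray._reconstruct, np.ndarray, np.dtype])
state = torch.load("checkpoint.pt", weights_only=True)
```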
318fe25f22 Fix: Enable prefill phase key value caching of nemotron/minitron models (#34742)
* modeling nemotron kv caching bugfix

Signed-off-by: jeongin601 <0200angela@gmail.com>

* test file deleted

Signed-off-by: jeongin601 <0200angela@gmail.com>

* code refinement

Signed-off-by: jeongin601 <0200angela@gmail.com>

* remove unused variables

Signed-off-by: jeongin601 <0200angela@gmail.com>

* import block sorted

* removed deprecation warning

Signed-off-by: jeongin601 <0200angela@gmail.com>

* removed support for tuple shape past_key_values

Signed-off-by: jeongin601 <0200angela@gmail.com>

* Update conditional statement for cache initialization

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Signed-off-by: jeongin601 <0200angela@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-25 09:45:35 +01:00
3a8eb74668 Fix support for image processors modifications in modular (#34866)
* add fix and examples

* fix camel case naming
2024-11-22 18:14:24 -05:00
54be2d7ae8 Bitnet test fix to avoid using gated model (#34863)
small test fix
2024-11-22 17:18:49 +01:00
286ffaaf0a [CI] Skip EETQ tests while package is broken with latest transformers (#34854)
* CI Skip EETQ tests while package is broken

EETQ tries to import the shard_checkpoint function from transformers but
the function has been removed. Therefore, trying to use EETQ currently
results in an import error. This fix results in EETQ tests being skipped
if there is an import error.

The issue has been reported to EETQ:

https://github.com/NetEase-FuXi/EETQ/issues/34

* Raise helpful error when trying to use eetq

* Forget to raise the error in else clause
2024-11-22 17:13:30 +01:00
861758e235 smol improvements to support more flexible usage (#34857)
* smol improvements to support more flexible usage

* ruff
2024-11-22 16:34:38 +01:00
42b36d7395 Speculative decoding: Test the target distribution (to prevent issues like #32867) (#34553)
* Update test_utils.py

* formatting

* Update test_utils.py

* formatting

* formatting

* Update test_utils.py

* formatting

* Update test_utils.py

* formatting

* format

* comments at standard positions
2024-11-22 16:02:37 +01:00
597efd21d2 Auto compile when static cache (#34247)
* generate with compile

* nits

* simple

* generate with compile

* nits

* simple

* safe

* style

* Update src/transformers/generation/utils.py

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* remove TOKENIZER forked warning

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-22 15:33:35 +01:00
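
A minimal sketch of the feature above, using gpt2 as a placeholder model: requesting the static cache is what lets `generate()` compile the decoding forward pass.

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("Hello", return_tensors="pt")
# cache_implementation="static" allocates fixed-shape KV buffers,
# which makes the per-step forward pass compilable.
out = model.generate(**inputs, max_new_tokens=8, cache_implementation="static")
print(tok.decode(out[0]))
```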
d9e6f307e7 Remove quantization related config from dequantized model (#34856)
* Remove quantization related config from dequantized model

* Fix whitespace
2024-11-22 10:06:29 +01:00
1867be666d Update checks for torch.distributed.tensor to require torch >= 2.5 (#34816)
* Update checks for torch.distributed.tensor

* Update PR with feedback

* Formatting fix for import order

* Remove unused function
2024-11-22 10:05:26 +01:00
6a912ff2c5 Watermarking: fix order (#34849)
fix watermarking order
2024-11-22 08:25:14 +01:00
4e90b99ed9 Refactor StarCoder2 using modular (#34015)
* Create modular_starcoder2.py

* Update modular_starcoder2.py

* update

* finalize modular

* revert # no-unravel

* Add support

* style

* Update modular_model_converter.py

* update docstring
2024-11-21 14:52:39 +01:00
18871599c9 Fix heuristic scheduling for UAG (#34805)
* fix heuristic schedule

* fix style

* fix format
2024-11-21 14:46:35 +01:00
d6a5c23f71 Fix ds nvme (#34444)
* skip nested deepspeed.zero.Init call

* make fixup

* solve conflict

* solve conflict

* put back local

* use context managers instead of local thread

* Skip recursive calls to deepspeed.zero.Init

* Skip recursive calls to deepspeed.zero.Init

* back to old notebooks

* make style
2024-11-21 13:52:22 +01:00
ae5cbf804b Improve gguf tensor processing (#34515)
* add tensor processing system to separate logic for models

* format refactoring

* small fix

* make some methods private

* move custom methods to processors

* refactor tensor processing

* format fix
2024-11-21 13:40:49 +01:00
c57eafdaa1 Add Nemotron GGUF Loading Support (#34725)
* Add Nemotron GGUF Loading Support

* fix the Nemotron architecture assignment

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-21 11:37:34 +01:00
d4e1acbb7c Change logging level from warning to info for max_steps overriding num_train_epochs (#34810)
Update trainer.py
2024-11-21 11:37:02 +01:00
28fb02fc05 VLMs: enable generation tests - last batch (#34484)
* add tests for 3 more vlms

* fix fuyu back

* skip test
2024-11-21 11:00:22 +01:00
40821a2478 Fix CI slack reporting issue (#34833)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-20 21:36:13 +01:00
3cb8676a91 Fix CI by tweaking torchao tests (#34832) 2024-11-20 20:28:51 +01:00
bf42c3bd4b Fix hyperparameter search when optuna+deepseed (#34642)
* Fix hyperparameter search when optuna+deepseed

* Adding free_memory to the search setup

---------

Co-authored-by: Corentin-Royer <corentin.royer@ibm.com>
2024-11-20 18:02:58 +01:00
67890de3b8 Torchao weights only + prequantized compatibility (#34355)
* weights only compatibility

* better tests from code review

* pin torch version

* add weights_only check
2024-11-20 17:24:45 +01:00
f297af55df Fix: take into account meta device (#34134)
* Do not load for meta device

* Make some minor improvements

* Add test

* Update tests/utils/test_modeling_utils.py

Update test parameters

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Make the test simpler

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-20 11:32:07 +01:00
8cadf76e1c fix(DPT,Depth-Anything) torch.export (#34103)
* Fix torch.export issue in dpt based models

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Simplify the if statements

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Move activation definitions of zoe_depth to init()

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add test_export for dpt and zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add depth anything

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Remove zoedepth non-automated zoedepth changes and zoedepth test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything, zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-20 11:31:21 +01:00
9d16441e4f Fix the memory usage issue of logits in generate() (#34813) 2024-11-20 11:25:37 +01:00
9470d65324 Fix low memory beam search (#34746)
* fix

* higher max positions in tests
2024-11-20 07:46:35 +01:00
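
A minimal sketch of the code path fixed above, with gpt2 as a placeholder; `low_memory=True` runs beam search sequentially, trading speed for lower peak memory.

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("Hello", return_tensors="pt")
# Sequential (low-memory) beam search: beams are expanded one at a time.
out = model.generate(**inputs, num_beams=4, low_memory=True, max_new_tokens=8)
```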
145fbd46cb LLaVA OV: fix unpadding precision (#34779)
* fix

* propagate

* type check
2024-11-20 07:46:13 +01:00
3033509327 Translate attention.md into Chinese (#34716)
* try

* tryagain

* tryagggain

* translated

* translated2

* Update docs/source/zh/attention.md

Co-authored-by: Huazhong Ji <hzji210@gmail.com>

---------

Co-authored-by: Huazhong Ji <hzji210@gmail.com>
2024-11-19 10:03:12 -08:00
befbbf2f98 Added image-text-to-text pipeline to task guide (#34783)
* Added image-text-to-text pipeline to task guide

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Merge codeblocks

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-19 09:49:10 -08:00
469eddbe2d Fix check_training_gradient_checkpointing (#34806)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-19 17:48:34 +01:00
05ebe8b9b0 Run test_medium_seamless_m4t_pt in subprocess to avoid many failures (#34812)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-19 17:32:10 +01:00
eedc113914 Add Image Processor Fast Deformable DETR (#34353)
* add deformable detr image processor fast

* add fast processor to doc

* fix copies

* nit docstring

* Add tests gpu/cpu and fix docstrings

* fix docstring

* import changes from detr

* fix imports

* rebase and fix

* fix input data format change in detr and rtdetr fast
2024-11-19 11:18:58 -05:00
b99ca4d28b Add support for OpenAI api "image_url" input in chat for image-text-to-text pipeline (#34562)
* add support for openai api image_url input

* change continue to elif

* Explicitly add support for OpenAI/TGI chat format

* rewrite content to transformers chat format and add tests

* Add support for typing of image type in chat templates

* add base64 to possible image types

* refactor nesting
2024-11-19 11:08:37 -05:00
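
A minimal sketch of the OpenAI-style `image_url` content block the pipeline now accepts; the model id and image URL are placeholders.

```py
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/llava-interleave-qwen-0.5b-hf")
messages = [{
    "role": "user",
    "content": [
        # OpenAI/TGI chat format: a nested {"url": ...} dict instead of a bare string
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        {"type": "text", "text": "Describe this image."},
    ],
}]
print(pipe(text=messages, max_new_tokens=20))
```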
15dd625a0f Bump aiohttp from 3.10.2 to 3.10.11 in /examples/research_projects/decision_transformer (#34792)
Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.11.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.2...v3.10.11)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-19 16:08:07 +00:00
dc42330388 fix crash in tiiuae/falcon-11B-vlm image-to-text generation (#34728)
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-11-19 16:51:32 +01:00
427b62ed1a Fix post process function called in the instance segmentation example of mask2former (#34588)
* Fix post process function called in the instance segmentation example of mask2former

* fix description and additional notes for post_process_instance_segmentation of maskformers

* remove white space in maskformers post_process_instance_segmentation doc

* change image.size[::-1] to height and width for clarity in segmentation examples
2024-11-19 16:49:25 +01:00
fdb9230485 Add do_convert_rgb to vit (#34523)
* Add: do_convert_rgb

* Add: doc string

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Add: do_convert_rgb to fast

* Add: convert_to_rgb

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-11-19 16:48:05 +01:00
7b9e51c1a0 Feature: print tokens per second during training (#34507)
* Log tokens per second during training

* Nitpicks

* Move logic into _maybe_log_save_evaluate

* Use speed_metrics
2024-11-19 16:46:04 +01:00
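
A minimal sketch of how the new logging is enabled, assuming the existing `include_tokens_per_second` training argument is what gates it; `output_dir` is a placeholder.

```py
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    include_tokens_per_second=True,  # report tokens/sec alongside loss while training
    logging_steps=10,
)
```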
5fa4f64605 🚨🚨🚨 fix(Mask2Former): torch export 🚨🚨🚨 (#34393)
* fix(Mask2Former): torch export

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* revert level_start_index and create a level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add a comment to explain the level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Address comment

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add torch.export.export test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* rename arg

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* remove spatial_shapes

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Use the version check from pytorch_utils

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] mask2former

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-19 16:44:53 +01:00
581524389a MLU devices : Checks if mlu is available via a cndev-based check which won't trigger the drivers and leave mlu (#34326)
* add Cambricon MLUs support

* fix mlu device rng state

* up for quality check

* up mlu to support fp16

* fix mlu device dependency error

* fix mlu device dependency error

* enable mlu device for bf16

* fix mlu device memory tracker

* Cambricon support SDPA and flash_attn

* MLU devices : Checks if `mlu` is available via a `cndev-based` check which won't trigger the drivers and leave mlu
2024-11-19 16:37:39 +01:00
e3a5889ef0 Modular fix (#34802)
* Modular fix

* style

* remove logger warning

* Update modular_model_converter.py
2024-11-19 16:08:57 +01:00
ce1d328e3b Fix cache_utils for optimum.quanto kvcache quantization (#34750)
* add co-author

Co-authored-by: w3rew <w3rew@users.noreply.github.com>

* fix docs

* fix cache

* remove print

---------

Co-authored-by: w3rew <w3rew@users.noreply.github.com>
2024-11-19 14:16:34 +01:00
4bff54f921 Gemma capping (#34282)
* softcapping

* soft cap before the mask

* style

* ...

* super nit

* update

* fixes

* update

* small issue with modular

* fix modular imports

* update

* fixup

* simplify a hell lot

* simplify cleaning imports

* finish fixing

* update our design

* nits

* use a deprecation cycle

* updates

* Fix modular (recursive deps need to always be computed after merges!)

* push

* fix

* update

* fix modular order

* make fix-copies

* updates

* update

* ?

* don't compile for now

* ?

* fix some stuff

* done!

* fix copies

* update

* fixup

* ?

* fix two tests

* fix?

* for now, don't use head info

* eager when outputting attention and sdpa or flash, as it's the simplest behaviour (for our tests as well :))

* fix-copies

* revert sdpa check

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* rebase, fix-copies and push

* add a slow integration test

* update the test

* fix left padding issue

* fix test

* remove duplicate scaling

* quality

* add a small test and make sure it works

* 2b

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-19 13:52:38 +01:00
54739a320e Self-speculation (Layer-Skip Llama) (#34240)
* 😅

* early exit (#34244)

* mvp

* docs and tests

* a few fixes

* no shared cache

* Apply suggestions from code review

Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>

* docs

* make fix-copies

* cohere fix

* [test all]

* [test all] consistent model code copies

* [test all] make fix-copies :D

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>

* Update src/transformers/generation/candidate_generator.py

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* [test all] don't use a stand-alone attribute; fix test

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-11-19 12:20:07 +00:00
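
A minimal sketch of layer-skip self-speculative decoding via the `assistant_early_exit` generation argument added here; the checkpoint is a placeholder and should be a model trained with early-exit (LayerSkip) heads.

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "facebook/layerskip-llama3.2-1B"
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)
inputs = tok("Alice and Bob", return_tensors="pt")
# Draft tokens from an early exit at layer 4, then verify with the full model.
out = model.generate(**inputs, assistant_early_exit=4, max_new_tokens=16)
```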
5de58d5955 fix cpu bnb path (#34647)
* fix cpu bnb path

* Update src/transformers/generation/utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix awq quantizer env check

* fix awq quantizer device check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-19 12:44:44 +01:00
3cd78be34e Fix: siglip image processor rgb_convert is not being applied correctly. (#34301)
Fix: do_convert_rgb
2024-11-19 12:40:36 +01:00
0db91c3c8d Support gradient checkpointing in Qwen2VL ViT (#34724)
* Support gradient checkpointing in Qwen2VL ViT

* Enable gradient checkpoint tests for Qwen2VL

* [run-slow] qwen2_vl
2024-11-19 12:30:44 +01:00
1a0cd69435 feat: allow to use hf-hub models for timm backbone (#34729)
Currently a backbone name like 'hf-hub:bioptimus/H-optimus-0' throws an
error, even though it could work.

Co-authored-by: Christian Gebbe <>
2024-11-19 10:26:35 +00:00
d8a5d31d9c Trainer hyperparameter search kwargs docs update (#34459)
* doc: Trainer.hyperparameter_search docstring discrepancy solved

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-19 11:23:03 +01:00
dadb286f06 protect tensor parallel usage (#34800)
protect
2024-11-19 09:54:11 +01:00
eed11f34ab Fix Whisper CI (#34617)
* Revert "Revert "Fix Whisper CI" (#34605)"

This reverts commit 74d3824cc0725829e7d92e1d43b97be1f18454f8.

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-18 21:37:50 +01:00
759a378ee5 Allow handling files as args for a tool created with Tool.from_space (#34687)
* Allow handling files as args for a tool created with `Tool.from_space`
2024-11-18 20:15:35 +01:00
20142ab542 Simplify Tensor Parallel implementation with PyTorch TP (#34184)
* Simplify Tensor Parallel implementation with PyTorch TP

* Move tp_plan to config

* Lint

* Format and warning

* Disable copy-from check

* Conditionally get attr from config

* make fix-copies

* Move base_model_tp_plan to PretrainedConfig

* Move TP into from_pretrained

* Add device context for load

* Do not serialize

* Move _tp_plan setting to post_init

* Add has_tp_plan

* Add test_tp

* Add 'Multi-gpu inference' doc

* Add backward support for device type identification

* Auto-detect accelerator

* supports_tp_plan

* copyright year

* Fix copy
2024-11-18 19:51:49 +01:00
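
A minimal sketch of the new TP loading path, to be launched with e.g. `torchrun --nproc-per-node=4`; the model id is a placeholder for a checkpoint whose config carries a `base_model_tp_plan`.

```py
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    tp_plan="auto",  # shard supported layers across local GPUs with PyTorch TP
)
```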
7df93d6ffb fix: Wrong task mentioned in docs (#34757) 2024-11-18 18:42:28 +00:00
7693b62268 Fix callback key name (#34762)
Fixes typo.
2024-11-18 18:41:12 +00:00
1ef6c5f1c5 fix: Update pixel_values parameter in hf_model input (#34782) 2024-11-18 18:40:01 +00:00
e80a65ba4f [tests] add XPU part to testing (#34778)
add XPU part to testing

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-11-18 09:59:11 -08:00
9568a9dfc5 [docs] add XPU besides CUDA, MPS etc. (#34777)
add XPU
2024-11-18 09:58:50 -08:00
8568bf1bcf [docs] make empty_cache device-agnostic (#34774)
make device-agnostic
2024-11-18 09:58:26 -08:00
36759f3312 make sure to disable gradients for integer tensor (#32943) 2024-11-18 16:49:37 +01:00
1c471fc307 Fix skip of test_training_gradient_checkpointing (#34723)
19d58d31f has introduced a context manager to manage subtests of
test_training_gradient_checkpointing. However, the test body was not
moved under the "with" statement. Thus, while tests are correctly
marked as skipped, test bodies were still executed. In some cases,
as with llama, this caused attribute errors.

Fixes: #34722
Fixes: 19d58d31f ("Add MLLama (#33703)")

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-11-18 15:45:40 +01:00
c772d4d91e fix a typo bug where 'id2label' was incorrectly written as 'i2label' when reading config (#34637)
fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from pretrained config
2024-11-18 14:41:48 +01:00
eb0ab3ed4b Fix broken link (#34618) 2024-11-18 14:13:26 +01:00
1646ffb4d1 VLMs: patch_size -> num_image_tokens in processing (#33424)
* use num additional tokens

* fix copies + docs

* another fix copies :)

* add docs

* move order for BC
2024-11-18 13:21:07 +01:00
3ee24e2208 Add OLMo November 2024 (#34551)
* Add model skeletion with transformers-cli add-new-model-like

* Convert config to modular, add rms_norm_eps, delete clip_qkv

* Convert model to modular, add RMSNorm

* Add flash attention with qk norm and no qkv clipping

* Add decoder layer with RMSNorm after attention/feedforward layers

* Add base and causal model

* Add converter improvements from OLMo repo

* Update weight loading in OLMo to HF converter

* Set correct default for rms_norm_eps

* Set correct pipeline_model_mapping in test

* Run make fixup

* Fix model type

* Re-run modular conversion

* Manually set config docs to fix build errors

* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors

* Start updating tests

* Update tests

* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124

* Rename input_layernorm and post_attention_layernorm to reflect their ops better

* Use correct tokenizer

* Remove test unsupported by GPT2 tokenizer

* Create GenerationConfig outside of from_pretrained call

* Use simpler init file structure

* Add explicit __all__ to support simplified init

* Make safetensor serialization the default

* Update OLMo November 2024 docs
2024-11-18 10:43:10 +01:00
13493215ab 🧼 remove v4.44 deprecations (#34245)
* remove v4.44 deprecations

* PR comments

* deprecations scheduled for v4.50

* hub version update

* make fixup

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-15 23:07:24 +01:00
8d50fda644 Remove FSDP wrapping from sub-models. (#34452)
* Remove FSDP wrapping from sub-models.

* solve conflict trainer.py

* make fixup

* add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size

* put back extract_model_from_parallel

* use transformers unwrap_model
2024-11-15 23:00:03 +01:00
b0c0ba7b4d FSDP grad accum fix (#34645)
* add gradient accumulation steps tests for fsdp

* invert no_sync context to fix training for fsdp
2024-11-15 22:28:06 +01:00
52ea4aa589 add xpu path for awq (#34712)
* add xpu path for awq

* update readme
2024-11-15 15:45:24 +01:00
7b3d615bc2 fix(wandb): pass fake dataset to avoid exception in trainer (see #34455) (#34720) 2024-11-15 15:44:02 +01:00
f5dbfab7f3 Update llava.md (#34749)
LLava -> Llava
2024-11-15 15:39:57 +01:00
8ba3e1505e Retain newlines in chat template when continue_final_message=True (#34253)
* Retain newlines in chat template when

* Add try/except

* Add regression test

* Simplify test

* Apply suggestions from code review

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-11-15 14:27:04 +00:00
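
A minimal sketch of the fixed behaviour: with `continue_final_message=True` the rendered prompt keeps the last assistant turn open, and its trailing newlines are now preserved. The model id is a placeholder.

```py
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
chat = [
    {"role": "user", "content": "Write a haiku."},
    {"role": "assistant", "content": "Autumn moonlight\n"},
]
# The template stops mid-turn so generation continues the assistant message.
prompt = tok.apply_chat_template(chat, tokenize=False, continue_final_message=True)
```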
a3d69a8994 [docs] add xpu device check (#34684)
* add XPU path

* use accelerate API

* Update docs/source/en/tasks/semantic_segmentation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update more places with accelerate API

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-13 14:16:59 -08:00
68f8186a89 Fix example in EsmConfig docstring (#34653) 2024-11-13 13:55:58 -08:00
e7c36a9d57 [docs] Broken link in generation_strategies (#34717)
[docs] Broken link
2024-11-13 13:44:42 -08:00
be8748a53c 🌐 [i18n-KO] Translated marian.md to Korean (#34698)
* initial translation

* removed english

* Fixed Trivial Typos, updated _toctree.yml
2024-11-13 13:14:23 -08:00
33eef99250 Agents: Small fixes in streaming to gradio + add tests (#34549)
* Better support transformers.agents in gradio: small fixes and additional tests
2024-11-11 20:52:09 +01:00
6de2a4d1f1 [i18n-ar] Translated file : docs/source/ar/torchscript.md into Arabic (#33079)
* Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Merge troubleshooting.md with this Branch

* Update _toctree.yml

* Update torchscript.md

* Update troubleshooting.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-11-11 10:41:01 -08:00
25f510a9c6 [docs] update not-working model revision (#34682)
update revision
2024-11-11 07:09:31 -08:00
3ea3ab62d8 Agents: turn any Space into a Tool with Tool.from_space() (#34561)
* Agents: you can now load a Space as a tool
2024-11-10 12:22:40 +01:00
134ba90da9 Update llm_engine.py (#33332)
* Update llm_engine.py
- Added support for optional token and max_tokens parameters in the constructor.
- Provided usage examples and detailed documentation for each method.
2024-11-10 12:19:20 +01:00
768f3c016e [i18n-ar] Translated file : docs/source/ar/trainer.md into Arabic (#33080)
* Add docs/source/ar/trainer.md to Add_docs_source_ar_trainer.md

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update trainer.md

* Update trainer.md

* Update trainer.md

* Create _toctree.yml

* Delete docs/source/ar/_toctree.yml

* Update _toctree.yml - add trainer

* Update _toctree.yml

* merge serialization.md into this branch

* merge sagemaker.md into this PR

* Update _toctree.yml

* Update docs/source/ar/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-09 11:26:28 -08:00
a06a0d1263 🌐 [i18n-KO] Translated bert.md to Korean (#34627)
* Translated bert.md, needs additional check

* Translation 2nd ver, changed _toctree.yml

* Fixed Typo

* Update bert.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update bert.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update bert.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update bert.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-07 18:56:09 -08:00
1cf17077bf 🌐 [i18n-KO] Translated timesformer.md to Korean (#33972)
* docs: ko: model_doc/timesformer.md

* feat: nmt draft

* fix: manual edits

* fix_toctree

* fix toctree on Video Models
2024-11-07 11:04:27 -08:00
6938524a28 fix(dvclive): pass fake dataset to avoid exception in trainer init (#34455)
fix(dvclive): pass fake dataset to avoid exception in trainer
2024-11-07 15:57:34 +01:00
7bbc624743 🌐 [i18n-KO] Translated convbert.md to Korean (#34599)
* docs: ko: convbert.md

* Update _toctree.yml

* feat: nmt draft
2024-11-05 09:32:17 -08:00
e83aaaa86b Fix use_parallel_residual and qkv_bias for StableLM GGUF config extraction (#34450)
* fix stablelm qkv_bias

* fix stablelm qkv_bias and use_parallel_residual

* remove original_model.config for stablelm gguf test
2024-11-05 18:26:20 +01:00
9f28d0c5d0 Fix torchvision interpolation CI (#34539)
fix-torch-interpolation-ci
2024-11-05 11:02:14 -05:00
d2bae7ee9d Changing __repr__ in torchao to show quantized Linear (#34202)
* Changing __repr__ in torchao

* small update

* make style

* small update

* add LinearActivationQuantizedTensor

* remove some cases

* update imports & handle return None

* update
2024-11-05 16:11:02 +01:00
f2d5dfbab2 Remove @slow for test_eager_matches_sdpa_inference (#34558)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-05 16:10:42 +01:00
082e57e0d4 Fix #34494 assistant tokens when truncated (#34531)
* Fix assistant tokens when truncated

* fix test

* fix test

* step
2024-11-05 15:10:15 +00:00
74d3824cc0 Revert "Fix Whisper CI" (#34605)
Revert "Fix Whisper CI (#34541)"

This reverts commit eb811449a2389e48930c45f84c88fd041735cf92.
2024-11-05 15:12:47 +01:00
45b0c7680c Remove unused test_dataset (#34516) 2024-11-05 14:01:25 +00:00
663c851239 DistilBERT is ExecuTorch compatible (#34475)
* DistillBERT is ExecuTorch compatible

* [run_slow] distilbert

* [run_slow] distilbert

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-11-05 13:41:48 +01:00
893ad04fad Load sub-configs from composite configs (#34410)
* save/load sub-configs

* nit forgot these

* fix copies

* move test to common

* use dict for sub-configs

* add load-save-laod test

* clean up modeling check

* oops this are correct keys

* fix some tests, missed some composite configs

* this model was missed
2024-11-05 11:34:01 +01:00
5e1fd4e204 FIX: Broken repr of TorchAoConfig (#34560)
FIX Broken repr of TorchAoConfig

The __repr__ method references a non-existent self.kwargs. This is now
fixed.

There does not appear to be a uniform way of defining __repr__ for
quantization configs. I copied the method as implemented for HQQ:

e2ac16b28a/src/transformers/utils/quantization_config.py (L285-L287)
2024-11-05 10:26:13 +01:00
d0b1d8d888 Skip DeepSpeed ZeRO Stage 3 model initialization when bnb (#34395)
* Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized.

* Propagate the quantization state using a context manager

* make fixup
2024-11-05 10:06:07 +01:00
eb811449a2 Fix Whisper CI (#34541)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-04 21:35:37 +01:00
bfa021be05 fix TrainerState doc because num_input_tokens_seen is unused by defau… (#34593)
fix TrainerState doc because num_input_tokens_seen is unused by default config

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-11-04 09:42:20 -08:00
0a6795af12 🌐 [i18n-KO] Update README_ko.md (#33098)
* Update README_ko.md

Delete the blank paragraph in the language selection button and edit to synchronize with the English version of README.md

* [i18n-KO] Update README_ko.md

* Additional edit to keep consistency with the main [documentation](https://huggingface.co/docs/transformers/v4.44.2/ko/index). (Edit to maintain consistency with the main documentation.)

* Update README_ko.md

Additional update.
* Change docs link to Korean translated page if it exists.

* Change the doc link to the Korean translated page if it exists.

Change the doc link and delete the 'migration' row of the Learn more[더 알아보기] table, since it does not exist in the main version of the doc.

* modify a link of the main README.md

from
`https://huggingface.co/docs/transformers/index#supported-frameworks`

to
`https://huggingface.co/docs/transformers/index#supported-models-and-frameworks`

since the title of 'supported table' changed.

* [i18n-ko] edit links and sync with main `README.md`

* docs/change comment to Korean1

Change English comment to Korean

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* docs/change comment to Korean2

Change English comment to Korean

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* revise to original

to separate `edit_README_ko_md` and `README.md`

* Synchronization with English documentation.

Synchronization with English documentation, and translated a line of comment from English to Korean.

---------

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
2024-11-04 09:42:07 -08:00
1112c54604 🌐 [i18n-KO] Translated perf_train_special.md to Korean (#34590)
* Translated to Ko, 1st version

* updated _toctree.yml
2024-11-04 09:41:44 -08:00
a86bd6f2d8 [i18n-HI] Translated TFLite page to Hindi (#34572)
* [i18n-HI] Translated TFLite page to Hindi

* [i18n-HI] Translated TFLite page to Hindi

* Update docs/source/hi/tflite.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

---------

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
2024-11-04 09:40:30 -08:00
48831b7d11 Add text support to the Trainer's TensorBoard integration (#34418)
* feat: add text support to TensorBoardCallback

* feat: ignore long strings in trainer progress

* docs: add docstring for max_str_len

* style: remove trailing whitespace

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-04 17:36:27 +01:00
34927b0f73 MPS: isin_mps_friendly can support 0D tensors (#34538)
* apply fix

* tested

* make fixup
2024-11-04 16:18:50 +00:00
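
`isin_mps_friendly` is transformers' internal MPS-safe wrapper around `torch.isin`; a standalone illustration of the 0D case it now handles, shown with plain `torch.isin` for the expected semantics.

```py
import torch

elements = torch.tensor(3)             # 0D (scalar) tensor
test_set = torch.tensor([1, 2, 3])
print(torch.isin(elements, test_set))  # tensor(True)
```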
187439c3fa VLM: special multimodal Tokenizer (#34461)
* kinda works

* update

* add tests

* update

* use special tokens in processors

* typo

* fix copies

* fix

* fix moshi after rebase

* update

* fix tests

* update

* Update docs/source/en/main_classes/tokenizer.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update docs

* test for load time adding tokens

* fix some more tests which are now fetched better

* one more fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-04 16:37:51 +01:00
ef976a7e18 Update trainer for easier handling of accumulate, compile fixes, and proper reporting (#34511)
* Update trainer for easier handling of accumulate + proper reporting

* test

* Fixup tests

* Full fix

* Fix style

* rm comment

* Fix tests

* Minimize test + remove py 311 check

* Unused import

* Forward contrib credits from discussions

* Fix reported metrics

* Refactor, good as it's going to get

* rm pad tok id check

* object detection and audio are being annoying

* Fin

* Fin x2

---------

Co-authored-by: Gyanateet Dutta <Ryukijano@users.noreply.github.com>
2024-11-04 07:47:34 -05:00
33868a057c [i18n-HI] Translated accelerate page to Hindi (#34443)
* [i18n-HI] Translated accelerate page to Hindi

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

---------

Co-authored-by: Kay <kay@Kays-MacBook-Pro.local>
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
2024-11-01 08:26:45 -07:00
e2ac16b28a Large modular logic refactoring (#34487)
* rework converter

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* cleaning

* cleaning

* finalize imports

* imports

* Update modular_model_converter.py

* Better renaming to avoid visiting same file multiple times

* start converting files

* style

* address most comments

* style

* remove unused stuff in get_needed_imports

* style

* move class dependency functions outside class

* Move main functions outside class

* style

* Update modular_model_converter.py

* rename func

* add augmented dependencies

* Update modular_model_converter.py

* Add types_to_file_type + tweak annotation handling

* Allow assignment dependency mapping + fix regex

* style + update modular examples

* fix modular_roberta example (wrong redefinition of __init__)

* slightly correct order in which dependencies will appear

* style

* review comments

* Performance + better handling of dependencies when they are imported

* style

* Add advanced new classes capabilities

* style

* add forgotten check

* Update modeling_llava_next_video.py

* Add priority list ordering in check_conversion as well

* Update check_modular_conversion.py

* Update configuration_gemma.py
2024-11-01 10:13:51 +01:00
86701f2b6f 🔴 🔴 fix query_pre_attn_scalar different of num_heads in default gemma2 config (#34540)
* fix query_pre_attn_scalar different of num_heads in default config

* propagate modular changes

* fix copies

* fix modular copies

* fix copies?

* correct copies fix
2024-11-01 09:06:17 +01:00
4cc0813e28 BLIP: enable generation tests (#34174)
* blip2 tests

* instructblips

* copies

* fix slow tests

* fix

* uncomment this

* clean up after rebase

* should be model main input

* fix overwritten tests

* oops, len should be a multiple of the frame number

* style

* fix some tests
2024-11-01 08:54:48 +01:00
6beb3f1691 Blip: get/set input embeddings correctly (#34152)
* set-get embeds

* add tests

* fix tests

* remove

* return dict True

* fix tests

* why did i remove this

* enable torchscript tests
2024-11-01 08:39:39 +01:00
b53e44e847 [i18n-ar] Translated file : docs/source/ar/multilingual.md into Arabic (#33048)
* Add docs/source/ar/multilingual.md to Add_docs_source_ar_multilingual.md

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

* Add translated files to branch for merge

* Update _toctree.yml

* Update _toctree.yml

* Update custom_models.md

* Update chat_templating.md

* Update docs/source/ar/create_a_model.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update create_a_model.md

* Update gguf.md

* Update gguf.md

* Update gguf.md

* Update gguf.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-31 16:10:09 -07:00
2801d7bcf6 update doc (#34478)
* update doc

* Update docs/source/en/perf_train_cpu.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* delete closing tip

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-31 15:59:23 -07:00
df8640cedb [CLIPSeg] Make interpolate_pos_encoding default to True (#34419)
* Remove interpolate_pos_encoding

* Make fixup

* Make interpolate_pos_encoding default to True

* Reuse existing interpolation

* Add integration test
2024-10-31 22:15:04 +01:00
203e27059b Add image text to text pipeline (#34170)
* Standardize image-text-to-text-models-output

add post_process_image_text_to_text to chameleon and cleanup

Fix legacy kwarg behavior and deprecation warning

add post_process_image_text_to_text to qwen2_vl and llava_onevision

Add post_process_image_text_to_text to idefics3, mllama, pixtral processor

* nit var name post_process_image_text_to_text udop

* nit fix deprecation warnings

* Add image-text-to-text pipeline

* add support for image url in chat template for pipeline

* Reformat to be fully compatible with chat templates

* Add tests chat template

* Fix imports and tests

* Add pipeline tag

* change logic handling of single prompt and multiple images

* add pipeline mapping to models

* fix batched inference

* fix tests

* Add manual batching for preprocessing

* Fix outputs with nested images

* Add support for all common processing kwargs

* Add default padding when multiple text inputs (batch size>1)

* nit change version deprecation warning

* Add support for text only inference

* add chat_template warnings

* Add pipeline tests and add copied from post process function

* Fix batched pipeline tests

* nit

* Fix pipeline tests blip2

* remove unnecessary max_new_tokens

* revert processing kosmos2 and remove unnecessary max_new_tokens

* fix pipeline tests idefics

* Force try loading processor if pipeline supports it

* revert load_processor change

* hardcode loading only processor

* remove unnecessary try except

* skip image-text-to-text tests for kosmos2 as the tiny model causes problems

* Make code clearer

* Address review comments

* remove preprocessing logic from pipeline

* fix fuyu

* add BC resize fuyu

* Move post_process_image_text_to_text to ProcessorMixin

* add guard in post_process

* fix zero shot object detection pipeline

* add support for generator input in pipeline

* nit

* change default image-text-to-text model to llava onevision

* fix owlv2 size dict

* Change legacy deprecation warning to only show when True
2024-10-31 15:48:11 -04:00
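A hedged usage sketch of the pipeline added in #34170 above; the checkpoint is the llava-onevision default mentioned in the last bullet, and the chat-style input schema is assumed from the multimodal chat-template conventions.

```python
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/llava-onevision-qwen2-0.5b-ov-hf")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/cat.png"},  # placeholder URL
            {"type": "text", "text": "What is in this image?"},
        ],
    }
]
out = pipe(text=messages, max_new_tokens=20)
print(out[0]["generated_text"])
```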
c443d8d536 Bug Fix for issue #34294 (#34295)
Update SiglipVisionEmbeddings.forward to cast input to correct dtype before embedding it.
2024-10-31 18:51:15 +01:00
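A sketch of the kind of cast described in #34295 (a minimal standalone analogue, not a verbatim excerpt from modeling_siglip.py): the input is moved to the embedding weight's dtype so reduced-precision checkpoints accept float32 pixel values.

```python
import torch
from torch import nn

class PatchEmbeddings(nn.Module):
    def __init__(self):
        super().__init__()
        self.patch_embedding = nn.Linear(16, 8)  # stand-in for the conv patch embed

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        # The fix: cast the input to the embedding weight dtype so a reduced-
        # precision module accepts float32 pixel values without a dtype error.
        target_dtype = self.patch_embedding.weight.dtype
        return self.patch_embedding(pixel_values.to(dtype=target_dtype))

embed = PatchEmbeddings().to(torch.bfloat16)
out = embed(torch.randn(2, 16))  # float32 in, cast internally to bfloat16
print(out.dtype)  # torch.bfloat16
```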
114dd812dd make test_eager_matches_sdpa_inference less flaky (#34512)
* try

* try

* try

* try

* try

* try

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 18:34:00 +01:00
294c170ff9 feat: add benchmarks pg indexes (#34536)
* feat: add benchmarks pg indexes

* refactor: remove debug `df -h`
2024-10-31 17:41:06 +01:00
b5919e12f7 fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests (#34518)
* fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-10-31 16:47:58 +01:00
4ca004eac6 Qwen2VL: skip base input_ids-inputs_embeds equivalence check (#34535)
it has complex inputs_embeds computation
2024-10-31 15:42:13 +00:00
ab98f0b0a1 avoid calling gc.collect and cuda.empty_cache (#34514)
* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 16:36:13 +01:00
dca93ca076 Fix step shifting when accumulate gradient (#33673)
* replace total_batched_samples with step while counting grad accum step

* remove unused variable

* simplify condition for update step

* fix format by ruff

* simplify update step condition using accelerator.sync_gradients

* simplify update condition using do_sync_step

* remove print for test

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-31 09:53:23 -04:00
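A hedged sketch of the condition the bullets above converge on (toy model and data; the real Trainer code differs): rely on `accelerator.sync_gradients` rather than counting batched samples by hand.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator(gradient_accumulation_steps=4)
model = nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = DataLoader(TensorDataset(torch.randn(32, 8), torch.randn(32, 1)), batch_size=4)
model, optimizer, data = accelerator.prepare(model, optimizer, data)

global_step = 0
for x, y in data:
    with accelerator.accumulate(model):
        loss = nn.functional.mse_loss(model(x), y)
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()
    if accelerator.sync_gradients:  # True only on a real update step
        global_step += 1
print(global_step)  # 8 batches / 4 accumulation steps = 2 updates
```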
1b86772de5 Fix: img size mismatch caused by incorrect unpadding in LLaVA-Next (#34522)
Fix: unpadding img mismatch
2024-10-31 14:32:45 +01:00
f38531619d enable QA bf16 pipeline (#34483)
* enable QA bf16 pipeline

* add tests
2024-10-31 12:55:53 +00:00
405b562698 UPDATE Documentation for #TRANSLATING.md Documentation into Multiple Languages.(Changes made) (#34226)
* Update TRANSLATING.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update TRANSLATING.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-30 12:37:39 -07:00
48872fd6ae Add Image Processor Fast RT-DETR (#34354)
* add fast image processor rtdetr

* add gpu/cpu test and fix docstring

* remove prints

* add to doc

* nit docstring

* avoid iterating over images/annotations several times

* change torch typing

* Add image processor fast documentation
2024-10-30 13:49:47 -04:00
9f06fb0505 Fix super tiny extra space typo (#34440)
Update training_args.py
2024-10-30 16:55:16 +01:00
5251fe6271 Add GGUF for Mamba (#34200)
* add mamba architecture for gguf

* add logic for weights conversion, some fixes and refactoring

* add lm_head layers, unit test refactoring

* more fixes for tests

* remove lm_head creation

* remove unused comments
2024-10-30 16:52:17 +01:00
eab6c491d4 Use torch 2.5 in scheduled CI (#34465)
* torch 2.5

* try

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-30 14:54:10 +01:00
241d79026f fix pixtral processor (#34486)
* fix pixtral processor

* test out full length batches + remove undue ValueError

* fix up processing

* fix tests

* fix

* last fixup

* style

* [run-slow] pixtral

* [run-slow] pixtral

* fix config key

* skip torchscript tests

* [run-slow] pixtral

* add missing key

* [run-slow] pixtral

* fix docs

* [run-slow] pixtral

* fix wrong url for integration test

* [run-slow] pixtral

* PixtralVisionModel does not have an LM head

* [run-slow] pixtral
2024-10-30 14:17:20 +01:00
8a734ea2c3 Tests: move generate tests to the right mixin and delete redundant tests (#34464)
* tmp commit

* tmp commit

* cull overwrites of deleted tests

* typo

* more specific docstring

* make fixup

* parameterize at the top?

* correction

* more deletions :D

* tmp commit

* for VLMs too

* fix _check_outputs

* test nit

* make fixup

* fix another flaky

* test_generate_from_inputs_embeds -- handle missing attention mask
2024-10-30 10:59:08 +00:00
913330ca9f VLMs: fix number of image tokens (#34332)
* fix

* fix tests

* add tests

* style

* style

* fix qwen after rebase

* fix video llava
2024-10-30 10:21:37 +01:00
0f764a5af7 Mllama: update docs (#34334)
* update docs

* be more explicit

* use available methods
2024-10-30 10:11:50 +01:00
25a9fc584a Fix format mistake in string repr of tokenizer objects (#34493)
* fix repr string format for tokenizer objects

The repr of tokenizer objects was malformed, like this: `Tokenizer(...), added_tokens_decoder={1: ..., 2: ...}`. The dict that is the value of the added_tokens_decoder attribute is outside of the parentheses of the tokenizer object, whereas all other attributes are inside the parentheses like they should be.

This commit fixes this bug.

* cos: add newline before closing parenthesis of repr string
2024-10-30 10:03:41 +01:00
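Schematically (the token mapping below is invented for illustration), the fix moves the trailing attribute inside the closing parenthesis:

```python
# before: added_tokens_decoder dangled outside the parentheses
#   Tokenizer(name_or_path='x', vocab_size=10), added_tokens_decoder={1: ...}
# after: every attribute, including added_tokens_decoder, sits inside
#   Tokenizer(name_or_path='x', vocab_size=10, added_tokens_decoder={1: ...})
```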
cd277618d4 Roberta is ExecuTorch compatible (#34425)
* Roberta is ExecuTorch compatible

* [run_slow] roberta

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-30 08:36:45 +00:00
9bee9ff5db Un-deprecate timeout arg in pipelines (#34382)
* Un-deprecate timeout

* Put "timeout" on the allowed list

* make fixup
2024-10-29 18:45:14 +00:00
e4449bb790 fix incorrect warning (#34416) 2024-10-29 14:08:42 -04:00
f55595b177 Fix performance in get_imports regexp (#34298)
* fix: Fix performance in get_imports regexp

* Minimize get_imports content regexp
2024-10-29 17:29:24 +00:00
4e2e8809ff Bump werkzeug from 3.0.3 to 3.0.6 in /examples/research_projects/decision_transformer (#34420)
Bump werkzeug in /examples/research_projects/decision_transformer

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to 3.0.6.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/3.0.3...3.0.6)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-29 16:42:40 +00:00
e9ad460494 Adding optimizer_cls_and_kwargs to Trainer.__init__ (#34358)
* Adding `optimizer_cls_and_kwargs` to `Trainer.__init__`

* formatting

* make fix-copies docstring

* added more docs for optimizer_cls_and_kwargs

* add docs for Trainer(optimizer_cls_and_kwargs)

* reverting anchor names
2024-10-29 16:23:16 +01:00
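A minimal sketch of the new argument (checkpoint name is illustrative): pass the optimizer class and its kwargs as a tuple instead of a pre-built optimizer instance.

```python
import torch
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    # Trainer instantiates the optimizer itself, at the right time for
    # features like model wrapping; no need to build it up front.
    optimizer_cls_and_kwargs=(torch.optim.AdamW, {"lr": 5e-5, "weight_decay": 0.01}),
)
```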
f339042b0b Albert is ExecuTorch compatible (#34476)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:22:13 +01:00
34620e8f0a MobileBERT is ExecuTorch compatible (#34473)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:14:31 +01:00
56c45d5757 Bug fix for drop path decay rate in swin transformer (#34291)
* potential bug fix for drop path

* variable name change

* forgot to rename the variables

* back to original

* modify dpr properly

* check_copies auto fix

* corresponding swin2 changes

* auto fix

* linting

* default value for drop_path_rate as 0.0

* Update src/transformers/models/glm/modeling_glm.py

* maskformer fix

* ruff format

* changes made to tf code as well

* lint

---------

Co-authored-by: abhijit deo <167164474+deo-abhijit@users.noreply.github.com>
2024-10-29 16:09:18 +01:00
0ab0a42651 fix-qwen2vl-no-position_ids (#33487) 2024-10-29 15:27:34 +01:00
8755dd26b7 manual head_dim for mixtral model (#34281) 2024-10-29 14:31:36 +01:00
5392f12e16 Bert is ExecuTorch compatible (#34424)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 14:30:02 +01:00
004530aa05 Fix regression loading dtype (#34409)
* fix regression

* add test for torchao

* expected output

* better fix
2024-10-29 11:41:04 +01:00
9e3d704e23 Fixes for Modular Converter on Windows (#34266)
* Separator in regex

* Standardize separator for relative path in auto generated message

* open() encoding

* Replace `\` on `os.path.abspath`

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-29 11:40:41 +01:00
626c610a4d Fix perplexity computation in perplexity.md (#34387)
fix average NLL in perplexity.md
2024-10-29 11:10:10 +01:00
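The fix concerns the averaging step: perplexity is the exponential of the mean per-token negative log-likelihood, not of the summed NLL. A toy check:

```python
import torch

nlls = torch.tensor([2.1, 1.7, 2.4, 1.9])  # per-token negative log-likelihoods
ppl = torch.exp(nlls.mean())               # exp(average NLL), not exp(sum)
print(ppl)  # exp(2.025) ≈ 7.58
```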
439334c8fb Simplify running tests in a subprocess (#34213)
* check

* check

* check

* check

* add docstring

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-29 10:48:57 +01:00
a1835195d1 🚨🚨🚨 [SuperPoint] Fix keypoint coordinate output and add post processing (#33200)
* feat: Added int conversion and unwrapping

* test: added tests for post_process_keypoint_detection of SuperPointImageProcessor

* docs: changed docs to include post_process_keypoint_detection method and switched from opencv to matplotlib

* test: changed test to not depend on SuperPointModel forward

* test: added missing require_torch decorator

* docs: changed pyplot parameters for the keypoints to be more visible in the example

* tests: changed import torch location to make test_flax and test_tf

* Revert "tests: changed import torch location to make test_flax and test_tf"

This reverts commit 39b32a2f69500bc7af01715fc7beae2260549afe.

* tests: fixed import

* chore: applied suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* tests: fixed import

* tests: fixed import (bis)

* tests: fixed import (ter)

* feat: added choice of type for target_size and changed tests accordingly

* docs: updated code snippet to reflect the addition of target size type choice in post process method

* tests: fixed imports (...)

* tests: fixed imports (...)

* style: formatting file

* docs: fixed typo from image[0] to image.size[0]

* docs: added output image and fixed some tests

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix: included SuperPointKeypointDescriptionOutput in TYPE_CHECKING if statement and changed tests results to reflect changes to SuperPoint from absolute keypoints coordinates to relative

* docs: changed SuperPoint's docs to print output instead of just accessing

* style: applied make style

* docs: added missing output type and precision in docstring of post_process_keypoint_detection

* perf: deleted loop to perform keypoint conversion in one statement

* fix: moved keypoint conversion at the end of model forward

* docs: changed SuperPointInterestPointDecoder to SuperPointKeypointDecoder class name and added relative (x, y) coordinates information to its method

* fix: changed type hint

* refactor: removed unnecessary brackets

* revert: SuperPointKeypointDecoder to SuperPointInterestPointDecoder

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-29 09:36:03 +00:00
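A hedged usage sketch of the post-processing method named above (checkpoint and image size are illustrative): `post_process_keypoint_detection` converts the model's relative keypoint coordinates back to absolute pixel coordinates per image.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, SuperPointForKeypointDetection

processor = AutoImageProcessor.from_pretrained("magic-leap-community/superpoint")
model = SuperPointForKeypointDetection.from_pretrained("magic-leap-community/superpoint")

image = Image.new("RGB", (640, 480))  # placeholder image
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# target_sizes are (height, width) pairs used to rescale keypoints per image
results = processor.post_process_keypoint_detection(outputs, target_sizes=[(480, 640)])
print(results[0]["keypoints"].shape)  # (num_keypoints, 2) absolute coordinates
```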
655bec2da7 use a tiny model to test generation config which avoids timeout (#34482)
* use a tiny model to test generation config which avoids timeout

* remove trailing whitespace
2024-10-29 09:39:06 +01:00
63ca6d9771 Fix CI (#34458)
* fix

* fix mistral
2024-10-29 08:26:04 +01:00
808d6c50f8 Generation: fix test (#34369)
* fix test

* fix copies
2024-10-29 07:57:10 +01:00
fe76b60370 LLaVA: latency issues (#34460)
* fix llavas

* code style

* green ci
2024-10-29 07:54:51 +01:00
a769ed45e1 Add post_process_depth_estimation for GLPN (#34413)
* add depth postprocessing for GLPN

* remove previous temp fix for glpn tests

* Style changes for GLPN's `post_process_depth_estimation`

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* additional style fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-28 19:44:20 +01:00
6cc4a67b3d feat: run benchmarks on A100 (#34287) 2024-10-28 19:33:17 +01:00
d21dbd1520 enable average tokens across devices (#34373)
* enable average tokens across devices

* reduce earlier in case model needs it

* simplify if statement

* reformat code to make ruff happy

* add doc for argument: average_tokens_across_devices

* cannot find world size when pytorch is unavailable

* format code

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-28 18:59:38 +01:00
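A minimal sketch of the new flag (output_dir is a placeholder): when enabled, token counts are reduced across devices so the per-token loss is scaled consistently in multi-GPU training.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    average_tokens_across_devices=True,  # reduce token counts across devices
)
```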
a17f287ac0 [i18n-ar] Translated file : docs/source/ar/fast_tokenizers.md into Arabic (#33034)
* Add docs/source/ar/fast_tokenizers.md to Add_docs_source_ar_fast_tokenizers.md

* Update _toctree.yml

* Update _toctree.yml

* Update docs/source/ar/_toctree.yml

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-10-28 10:54:37 -07:00
084e946cfd Apply linting to the important code blocks to make them readable (#34449)
Enhance user experience using py-linting
2024-10-28 10:48:18 -07:00
1f7539c829 🌐 [i18n-KO] Translated model_doc/barthez.md to Korean (#33980)
* docs: ko: model_doc/barthez.md

* feat: nmt draft

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-28 10:46:49 -07:00
fc1ae7f30f [docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details (#34322)
* [docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details

* [docs] correct input documentation for MISTRAL model to reference `input_ids` instead of `decoder_input_ids`

* [docs] clarify cache_position description in MISTRAL model documentation
2024-10-28 09:14:07 -07:00
c1753436db New option called "best" for args.save_strategy. (#31817)
* Add _determine_best_metric and new saving logic.

1. Logic to determine the best metric was separated out from
`_save_checkpoint`.
2. In `_maybe_log_save_evaluate`, whether or not a new best metric was
achieved is determined after each evaluation, and if the save strategy
is "best" then the TrainerControl is updated accordingly.

* Added SaveStrategy.

Same as IntervalStrategy, but with a new attribute called BEST.

* IntervalStrategy -> SaveStrategy

* IntervalStrategy -> SaveStrategy for save_strategy.

* Interval -> Save in docstring.

* Updated docstring for save_strategy.

* Added SaveStrategy and made according changes.

`save_strategy` previously followed `IntervalStrategy` but now follows
`SaveStrategy`.

Changes were made accordingly to the code and the docstring.

* Changes from `make fixup`.

* Removed redundant metrics argument.

* Added new test_save_best_checkpoint test.

1. Checks for both cases where `metric_for_best_model` is explicitly
provided and when it's not provided.
2. The first case should have two checkpoints saved, whereas the second
should have three saved.

* Changed should_training_end saving logic.

The Trainer saves a checkpoint at the end of training by default as
long as `save_strategy != SaveStrategy.NO`. This condition was modified
to include `SaveStrategy.BEST`, because it would be counterintuitive to
ask for only the best checkpoint and still have the last one saved as
well.

* `args.metric_for_best_model` default to loss.

* Undo metric_for_best_model update.

* Remove checking metric_for_best_model.

* Added test cases for loss and no metric.

* Added error for metric and changed default best_metric.

* Removed unused import.

* `new_best_metric` -> `is_new_best_metric`

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Applied `is_new_best_metric` to all.

Changes were made for consistency and also to fix a potential bug.

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-28 16:02:22 +01:00
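A minimal usage sketch of the new strategy (values are placeholders): checkpoints are written only when `metric_for_best_model` improves.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    eval_strategy="epoch",             # a best metric requires evaluations
    save_strategy="best",              # the new SaveStrategy.BEST option
    metric_for_best_model="eval_loss",
    greater_is_better=False,           # lower eval_loss is better
)
```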
8b3b9b48fc exclude fsdp from delay_optimizer_creation (#34140)
* exclude fsdp from delay_optimizer_creation

* add test case for trainer: FSDP mode and fp8 as mixed precision

* rearrange imports

* ruff formatted

* adapt _init_fsdp to fp8

* use _init_fsdp only when resume_from_checkpoint

* In case of FSDP, self.layer will be CheckpointWrapper, which has no len() method

* delete _init_fsdp

* solve conflict

* fix conflict

* make fixup
2024-10-28 13:50:16 +01:00
92bcdff2ef Fix batch size handling in prediction_loop for DataLoaderShard (#34343)
* Fix batch size handling in prediction_loop for DataLoaderShard

Updated the prediction_loop method in the Trainer class to correctly handle batch size when using DataLoaderShard. This ensures that the batch size is retrieved from total_batch_size for distributed training scenarios, preventing TypeError related to NoneType during evaluation.

* Update src/transformers/trainer.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Applied the fix to remove unused imports

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-28 13:23:52 +01:00
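A hedged sketch of the described fallback (names follow the commit message, not the exact Trainer code): accelerate's `DataLoaderShard` exposes `total_batch_size` where a plain `DataLoader` would expose `batch_size`.

```python
def resolve_batch_size(dataloader):
    # Plain DataLoaders carry batch_size; DataLoaderShard may carry None there
    # and expose the distributed batch size as total_batch_size instead.
    batch_size = getattr(dataloader, "batch_size", None)
    if batch_size is None:
        batch_size = getattr(dataloader, "total_batch_size", None)
    return batch_size
```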
9360f1827d Tiny update after #34383 (#34404)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 12:01:05 +01:00
fc465bb196 pin tensorflow_probability<0.22 in docker files (#34381)
0.21

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 11:59:46 +01:00
fddbd3c13c Fix pix2struct (#34374)
* fix

* fix and test use_cache test

* style

* remove atol
2024-10-28 11:24:56 +01:00
1d06379331 [docs] Cache implementations (#34325)
cache
2024-10-25 08:52:45 -07:00
6a62a6d1b5 Fix typos in agents_advanced.md (#34405) 2024-10-25 08:52:29 -07:00
f73f5e62e2 Avoid check expected exception when it is on CUDA (#34408)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-25 17:14:07 +02:00
e447185b1f Fix bnb training test failure (#34414)
* Fix bnb training test: compatibility with OPTSdpaAttention
2024-10-25 10:23:20 -04:00
186b8dc190 Tests: upgrade test_eager_matches_sdpa_generate (#34386) 2024-10-25 11:55:07 +01:00
8814043c8c SynthID: better example (#34372)
* better example

* Update src/transformers/generation/configuration_utils.py

* Update src/transformers/generation/logits_process.py

* nits
2024-10-25 11:46:46 +01:00
223855314f no filter (#34391)
* no filter

* no filter

* no filter

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-25 12:32:39 +02:00
9f365fe0ac Fix right padding in LLaVA models (#34305)
* fix right pad llavas

* device mismatch
2024-10-25 11:02:07 +02:00
5779bac4c4 Fix onnx non-exportable inplace aten op (#34376)
* fix onnx non-exportable inplace op

* mistral, qwen2, qwen2_vl, starcoder2

* fixup copies
2024-10-25 09:44:09 +02:00
940a6bd343 Use non nested images and batched text Idefics2/3 (#34222)
* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests
2024-10-24 20:00:13 -04:00
3d99f1746e Fix glm (#34388)
* Fix duplicated

* fix import
2024-10-24 19:17:52 +02:00
a308d28d39 [auto. ping] Avoid sending empty info + add more team members (#34383)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-24 19:07:23 +02:00
4c6e0c9252 Correct the new defaults (#34377)
* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue
2024-10-24 18:42:03 +02:00
1c5918d910 Fix torch.fx issue related to the new loss_kwargs keyword argument (#34380)
* Fix FX

* Unskip tests
2024-10-24 18:34:28 +02:00
d9989e0b9a [PEFT] Add warning for missing key in LoRA adapter (#34068)
When loading a LoRA adapter, so far, there was only a warning when there
were unexpected keys in the checkpoint. Now, there is also a warning
when there are missing keys.

This change is consistent with
https://github.com/huggingface/peft/pull/2118 in PEFT and the planned PR
https://github.com/huggingface/diffusers/pull/9622 in diffusers.

Apart from this change, the error message for unexpected keys was
slightly altered for consistency (it should be more readable now). Also,
besides adding a test for the missing keys warning, a test for
unexpected keys warning was also added, as it was missing so far.
2024-10-24 17:56:40 +02:00
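A hedged usage sketch (requires `peft`; the adapter path is a placeholder): loading an adapter whose state dict is missing LoRA keys now emits a warning instead of passing silently.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
# Warns on unexpected keys (as before) and now also on missing keys.
model.load_adapter("path/to/lora_adapter")  # placeholder adapter location
```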
fe35073319 Ignore unsupported kwarg in ProcessorMixin call (#34285)
Fix accept any common kwargs
2024-10-24 11:46:39 -04:00
e288616606 refactor: remove redundant if-condition and improve type correctness for convert_tokens_to_ids (#34030)
* chore: remove redundant if-condition

* fix: import `Iterable`
2024-10-24 17:40:26 +02:00
450b9cbfac Add code sample docstrings and checkpoint reference for GLM models (#34360)
* Add code sample docstrings and checkpoint reference for GLM models

* Update modular_glm.py

* Update modeling_glm.py
2024-10-24 17:28:51 +02:00
6432ad8bb5 Fix pil_torch_interpolation_mapping import in image_processing_detr_fast (#34375)
fix pil_torch_interpolation_mapping import
2024-10-24 09:22:50 -04:00
dd267fca72 Add T5 GGUF loading support (#33389)
* add: GGUFT5Converter

* add: tensormapping for t5

* add: test code for t5

* fix: Remove whitespace from blank line

* add: t5 fp16 tests

* fix: whitespace formatting

* fix: minor formatting

* fix: testing every weights
2024-10-24 15:10:59 +02:00
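A minimal loading sketch (repo id and filename are placeholders): the `gguf_file` argument selects the GGUF checkpoint to dequantize inside the repo.

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "someone/t5-small-gguf"   # placeholder repo
filename = "t5-small-q8_0.gguf"      # placeholder GGUF file

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = T5ForConditionalGeneration.from_pretrained(model_id, gguf_file=filename)
```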
30c76d5b28 add code generation to natural language processing section (#34333) 2024-10-24 14:42:47 +02:00
2112027d0c Zamba is an LM (#34342)
* Zamba is an LM

* Addition
2024-10-24 14:29:33 +02:00
b29c24ff1e CI: fix failures (#34371)
fix
2024-10-24 13:44:53 +02:00
f0b3ef9e2e translated gguf.md into Chinese (#34163)
* translated gguf.md into Chinese

* Apply suggestions from code review

I have updated the PR accordingly. Thank you very much for the detailed guidance, and I'll pay more attention to the details next time.

Co-authored-by: Isotr0py <2037008807@qq.com>

* Apply suggestions from code review

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-10-24 11:47:58 +02:00
9643069465 v4.47.0.dev0 2024-10-24 11:23:29 +02:00
f0e640adfa Drop support for Python 3.8 (#34314)
* drop python 3.8

* update docker files

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-24 11:16:55 +02:00
05863817d6 Better defaults (#34026)
* be nice to our users

* nit

* fixup

* default to -1

* oops

* turbo nit

* auto infer framework
2024-10-24 11:11:55 +02:00
65753d6065 Remove graph breaks for torch.compile() in flash_attention_forward when Llama Model is padding-free tuned (#33932)
* fix: fixes for graph breaks

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: formatting

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: import error

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: Add Fa2Kwargs

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* Revert "PR changes"

This reverts commit 39d2868e5c93cc5f3f3c7c6ff981b66614c0e0e4.

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* addition of documentation

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* change in _flash_attention_forward

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* revert make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix copies

* style

* loss kwargs typing

* style and pull latest changes

---------

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-24 11:02:54 +02:00
2475 changed files with 153581 additions and 61106 deletions


@@ -13,6 +13,7 @@ jobs:
check_circleci_user:
docker:
- image: python:3.10-slim
resource_class: small
parallelism: 1
steps:
- run: echo $CIRCLE_PROJECT_USERNAME
@@ -57,15 +58,15 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py
# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
# We used:
# https://circleci.com/docs/api/v2/index.html#operation/getJobArtifacts : to get the job artifacts
# We could not pass a nested dict, which is why we create the test_file_... parameters for every single job
- store_artifacts:
path: test_preparation/transformed_artifacts.json
- store_artifacts:
@@ -109,7 +110,7 @@ jobs:
- run:
name: "Prepare pipeline parameters"
command: |
python utils/process_test_artifacts.py
python utils/process_test_artifacts.py
# To avoid too long generated_config.yaml on the continuation orb, we pass the links to the artifacts as parameters.
# Otherwise the list of tests was just too big. Explicit is good but for that it was a limitation.
@@ -159,7 +160,7 @@ jobs:
environment:
TRANSFORMERS_IS_CI: yes
PYTEST_TIMEOUT: 120
parallelism: 1
parallelism: 4
steps:
- checkout
- run: uv pip install -e ".[quality]"
@@ -168,19 +169,19 @@ jobs:
command: pip freeze | tee installed.txt
- store_artifacts:
path: ~/transformers/installed.txt
- run: python utils/check_copies.py
- run: python utils/check_modular_conversion.py
- run: python utils/check_table.py
- run: python utils/check_dummies.py
- run: python utils/check_repo.py
- run: python utils/check_inits.py
- run: python utils/check_config_docstrings.py
- run: python utils/check_config_attributes.py
- run: python utils/check_doctest_list.py
- run: make deps_table_check_updated
- run: python utils/update_metadata.py --check-only
- run: python utils/check_docstrings.py
- run: python utils/check_support_list.py
- run: if [ "$CIRCLE_NODE_INDEX" == "0" ]; then python utils/check_copies.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "1" ]; then python utils/check_modular_conversion.py --num_workers 4; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "3" ]; then python utils/check_table.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "0" ]; then python utils/check_dummies.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "2" ]; then python utils/check_repo.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "2" ]; then python utils/check_inits.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "2" ]; then python utils/check_config_docstrings.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "2" ]; then python utils/check_config_attributes.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "0" ]; then python utils/check_doctest_list.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "0" ]; then make deps_table_check_updated; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "3" ]; then python utils/update_metadata.py --check-only; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "3" ]; then python utils/check_docstrings.py; fi
- run: if [ "$CIRCLE_NODE_INDEX" == "0" ]; then python utils/check_support_list.py; fi
workflows:
version: 2


@@ -32,7 +32,7 @@ COMMON_ENV_VARIABLES = {
"RUN_PT_FLAX_CROSS_TESTS": False,
}
# Disable the use of {"s": None} as the output is way too long, causing the navigation on CircleCI impractical
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "dist": "loadfile", "vvv": None, "rsf":None}
COMMON_PYTEST_OPTIONS = {"max-worker-restart": 0, "dist": "loadfile", "vvv": None, "rsfE":None}
DEFAULT_DOCKER_IMAGE = [{"image": "cimg/python:3.8.12"}]
@@ -40,9 +40,23 @@ class EmptyJob:
job_name = "empty"
def to_dict(self):
steps = [{"run": 'ls -la'}]
if self.job_name == "collection_job":
steps.extend(
[
"checkout",
{"run": "pip install requests || true"},
{"run": """while [[ $(curl --location --request GET "https://circleci.com/api/v2/workflow/$CIRCLE_WORKFLOW_ID/job" --header "Circle-Token: $CCI_TOKEN"| jq -r '.items[]|select(.name != "collection_job")|.status' | grep -c "running") -gt 0 ]]; do sleep 5; done || true"""},
{"run": 'python utils/process_circleci_workflow_test_reports.py --workflow_id $CIRCLE_WORKFLOW_ID || true'},
{"store_artifacts": {"path": "outputs"}},
{"run": 'echo "All required jobs have now completed"'},
]
)
return {
"docker": copy.deepcopy(DEFAULT_DOCKER_IMAGE),
"steps":["checkout"],
"resource_class": "small",
"steps": steps,
}
@@ -54,9 +68,9 @@ class CircleCIJob:
install_steps: List[str] = None
marker: Optional[str] = None
parallelism: Optional[int] = 0
pytest_num_workers: int = 12
pytest_num_workers: int = 8
pytest_options: Dict[str, Any] = None
resource_class: Optional[str] = "2xlarge"
resource_class: Optional[str] = "xlarge"
tests_to_run: Optional[List[str]] = None
num_test_files_per_worker: Optional[int] = 10
# This should be only used for doctest job!
@@ -133,7 +147,7 @@ class CircleCIJob:
"command": """dpkg-query --show --showformat='${Installed-Size}\t${Package}\n' | sort -rh | head -25 | sort -h | awk '{ package=$2; sub(".*/", "", package); printf("%.5f GB %s\n", $1/1024/1024, package)}' || true"""}
},
{"run": {"name": "Create `test-results` directory", "command": "mkdir test-results"}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>>' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Get files to test", "command":f'curl -L -o {self.job_name}_test_list.txt <<pipeline.parameters.{self.job_name}_test_list>> --header "Circle-Token: $CIRCLE_TOKEN"' if self.name != "pr_documentation_tests" else 'echo "Skipped"'}},
{"run": {"name": "Split tests across parallel nodes: show current parallel tests",
"command": f"TESTS=$(circleci tests split --split-by=timings {self.job_name}_test_list.txt) && echo $TESTS > splitted_tests.txt && echo $TESTS | tr ' ' '\n'" if self.parallelism else f"awk '{{printf \"%s \", $0}}' {self.job_name}_test_list.txt > splitted_tests.txt"
}
@@ -185,7 +199,6 @@ torch_job = CircleCIJob(
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="not generate",
parallelism=6,
pytest_num_workers=8
)
generate_job = CircleCIJob(
@@ -193,28 +206,24 @@ generate_job = CircleCIJob(
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="generate",
parallelism=6,
pytest_num_workers=8
)
tokenization_job = CircleCIJob(
"tokenization",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=16
)
processor_job = CircleCIJob(
"processors",
docker_image=[{"image": "huggingface/transformers-torch-light"}],
parallelism=8,
pytest_num_workers=6
)
tf_job = CircleCIJob(
"tf",
docker_image=[{"image":"huggingface/transformers-tf-light"}],
parallelism=6,
pytest_num_workers=16,
)
@@ -222,7 +231,8 @@ flax_job = CircleCIJob(
"flax",
docker_image=[{"image":"huggingface/transformers-jax-light"}],
parallelism=6,
pytest_num_workers=16
pytest_num_workers=16,
resource_class="2xlarge",
)
@@ -231,7 +241,7 @@ pipelines_torch_job = CircleCIJob(
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-torch-light"}],
marker="is_pipeline_test",
parallelism=4
parallelism=4,
)
@@ -240,7 +250,7 @@ pipelines_tf_job = CircleCIJob(
additional_env={"RUN_PIPELINE_TESTS": True},
docker_image=[{"image":"huggingface/transformers-tf-light"}],
marker="is_pipeline_test",
parallelism=4
parallelism=4,
)
@@ -257,7 +267,6 @@ examples_torch_job = CircleCIJob(
docker_image=[{"image":"huggingface/transformers-examples-torch"}],
# TODO @ArthurZucker remove this once docker is easier to build
install_steps=["uv venv && uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
pytest_num_workers=8,
)
@@ -265,7 +274,6 @@ examples_tensorflow_job = CircleCIJob(
"examples_tensorflow",
additional_env={"OMP_NUM_THREADS": 8},
docker_image=[{"image":"huggingface/transformers-examples-tf"}],
pytest_num_workers=16,
)
@@ -280,6 +288,7 @@ hub_job = CircleCIJob(
],
marker="is_staging_test",
pytest_num_workers=2,
resource_class="medium",
)
@@ -292,13 +301,13 @@ onnx_job = CircleCIJob(
],
pytest_options={"k onnx": None},
pytest_num_workers=1,
resource_class="small",
)
exotic_models_job = CircleCIJob(
"exotic_models",
docker_image=[{"image":"huggingface/transformers-exotic-models"}],
pytest_num_workers=12,
parallelism=4,
pytest_options={"durations": 100},
)
@@ -317,7 +326,6 @@ non_model_job = CircleCIJob(
docker_image=[{"image": "huggingface/transformers-torch-light"}],
marker="not generate",
parallelism=6,
pytest_num_workers=8,
)
@@ -352,6 +360,7 @@ REPO_UTIL_TESTS = [repo_utils_job]
DOC_TESTS = [doc_test_job]
ALL_TESTS = REGULAR_TESTS + EXAMPLES_TESTS + PIPELINE_TESTS + REPO_UTIL_TESTS + DOC_TESTS + [custom_tokenizers_job] + [exotic_models_job] # fmt: skip
def create_circleci_config(folder=None):
if folder is None:
folder = os.getcwd()
@@ -361,7 +370,13 @@ def create_circleci_config(folder=None):
if len(jobs) == 0:
jobs = [EmptyJob()]
print("Full list of job name inputs", {j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs})
else:
print("Full list of job name inputs", {j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs})
# Add a job waiting all the test jobs and aggregate their test summary files at the end
collection_job = EmptyJob()
collection_job.job_name = "collection_job"
jobs = [collection_job] + jobs
config = {
"version": "2.1",
"parameters": {
@@ -371,9 +386,14 @@ def create_circleci_config(folder=None):
**{j.job_name + "_test_list":{"type":"string", "default":''} for j in jobs},
**{j.job_name + "_parallelism":{"type":"integer", "default":1} for j in jobs},
},
"jobs" : {j.job_name: j.to_dict() for j in jobs},
"workflows": {"version": 2, "run_tests": {"jobs": [j.job_name for j in jobs]}}
"jobs": {j.job_name: j.to_dict() for j in jobs}
}
if "CIRCLE_TOKEN" in os.environ:
# For private forked repo. (e.g. new model addition)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [{j.job_name: {"context": ["TRANSFORMERS_CONTEXT"]}} for j in jobs]}}
else:
# For public repo. (e.g. `transformers`)
config["workflows"] = {"version": 2, "run_tests": {"jobs": [j.job_name for j in jobs]}}
with open(os.path.join(folder, "generated_config.yml"), "w") as f:
f.write(yaml.dump(config, sort_keys=False, default_flow_style=False).replace("' << pipeline", " << pipeline").replace(">> '", " >>"))


@@ -106,6 +106,7 @@ body:
label: Reproduction
description: |
Please provide a code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
Please include relevant config information with your code, for example your Trainers, TRL, Peft, and DeepSpeed configs.
If you have code snippets, error messages, stack traces please provide them here as well.
Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Do not use screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.


@@ -16,8 +16,12 @@ env:
jobs:
benchmark:
name: Benchmark
strategy:
matrix:
# group: [aws-g5-4xlarge-cache, aws-p4d-24xlarge-plus] (A100 runner is not enabled)
group: [aws-g5-4xlarge-cache]
runs-on:
group: aws-g5-4xlarge-cache
group: ${{ matrix.group }}
if: |
(github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark') )||
(github.event_name == 'push' && github.ref == 'refs/heads/main')
@@ -60,9 +64,12 @@ jobs:
commit_id=$GITHUB_SHA
fi
commit_msg=$(git show -s --format=%s | cut -c1-70)
python3 benchmark/llama.py "${{ github.head_ref || github.ref_name }}" "$commit_id" "$commit_msg"
python3 benchmark/benchmarks_entrypoint.py "${{ github.head_ref || github.ref_name }}" "$commit_id" "$commit_msg"
env:
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
# Enable this to see debug logs
# HF_HUB_VERBOSITY: debug
# TRANSFORMERS_VERBOSITY: debug
PGHOST: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGHOST }}
PGUSER: transformers_benchmarks
PGPASSWORD: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGPASSWORD }}


@@ -106,7 +106,7 @@ jobs:
} >> "$GITHUB_ENV"
- name: Send processed report
if: ${{ env.REPORT_TEXT != '' }}
if: ${{ !endsWith(env.REPORT_TEXT, '{}') }}
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
with:
# Slack channel id, channel name, or user id to post message.


@@ -134,10 +134,3 @@ jobs:
slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
waitForSSH: true
benchmark:
name: Benchmark workflow
needs: get_modified_models
if: ${{ needs.get_modified_models.outputs.matrix != '[]' && needs.get_modified_models.outputs.matrix != '' && fromJson(needs.get_modified_models.outputs.matrix)[0] != null }}
uses: ./.github/workflows/benchmark.yml
secrets: inherit

.github/workflows/self-comment-ci.yml (new file, 417 lines)

@@ -0,0 +1,417 @@
name: PR comment GitHub CI
on:
issue_comment:
types:
- created
branches-ignore:
- main
concurrency:
group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow') }}
cancel-in-progress: true
permissions: read-all
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
get-pr-number:
runs-on: ubuntu-22.04
name: Get PR number
# For security: only allow team members to run
if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "qubvel", "molbap", "gante", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "muellerzr"]'), github.actor) && (startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow')) }}
outputs:
PR_NUMBER: ${{ steps.set_pr_number.outputs.PR_NUMBER }}
steps:
- name: Get PR number
shell: bash
run: |
if [[ "${{ github.event.issue.number }}" != "" && "${{ github.event.issue.pull_request }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
else
echo "PR_NUMBER=" >> $GITHUB_ENV
fi
- name: Check PR number
shell: bash
run: |
echo "${{ env.PR_NUMBER }}"
- name: Set PR number
id: set_pr_number
run: echo "PR_NUMBER=${{ env.PR_NUMBER }}" >> "$GITHUB_OUTPUT"
get-sha:
runs-on: ubuntu-22.04
needs: get-pr-number
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
outputs:
PR_HEAD_SHA: ${{ steps.get_sha.outputs.PR_HEAD_SHA }}
PR_MERGE_SHA: ${{ steps.get_sha.outputs.PR_MERGE_SHA }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
- name: Get SHA (and verify timestamps against the issue comment date)
id: get_sha
env:
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
COMMENT_DATE: ${{ github.event.comment.created_at }}
run: |
git fetch origin refs/pull/$PR_NUMBER/head:refs/remotes/pull/$PR_NUMBER/head
git checkout refs/remotes/pull/$PR_NUMBER/head
echo "PR_HEAD_SHA: $(git log -1 --format=%H)"
echo "PR_HEAD_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
git fetch origin refs/pull/$PR_NUMBER/merge:refs/remotes/pull/$PR_NUMBER/merge
git checkout refs/remotes/pull/$PR_NUMBER/merge
echo "PR_MERGE_SHA: $(git log -1 --format=%H)"
echo "PR_MERGE_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
PR_MERGE_COMMIT_TIMESTAMP=$(git log -1 --date=unix --format=%cd)
echo "PR_MERGE_COMMIT_TIMESTAMP: $PR_MERGE_COMMIT_TIMESTAMP"
COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
echo "COMMENT_DATE: $COMMENT_DATE"
echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
exit -1;
fi
# use a python script to handle this complex logic
# case 1: `run-slow` (auto. infer with limited number of models, but in particular, new model)
# case 2: `run-slow model_1, model_2`
get-tests:
runs-on: ubuntu-22.04
needs: [get-pr-number, get-sha]
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
outputs:
models: ${{ steps.models_to_run.outputs.models }}
quantizations: ${{ steps.models_to_run.outputs.quantizations }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Get models to test
env:
PR_COMMENT: ${{ github.event.comment.body }}
run: |
python -m pip install GitPython
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" | tee output.txt
echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" --quantization | tee output2.txt
echo "quantizations=$(tail -n 1 output2.txt)" >> $GITHUB_ENV
- name: Show models to test
id: models_to_run
run: |
echo "${{ env.models }}"
echo "models=${{ env.models }}" >> $GITHUB_ENV
echo "models=${{ env.models }}" >> $GITHUB_OUTPUT
echo "${{ env.quantizations }}"
echo "quantizations=${{ env.quantizations }}" >> $GITHUB_OUTPUT
reply_to_comment:
name: Reply to the comment
if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-pr-number, get-tests]
permissions:
pull-requests: write
runs-on: ubuntu-22.04
steps:
- name: Reply to the comment
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
MODELS: ${{ needs.get-tests.outputs.models }}
BODY: "This comment contains run-slow, running the specified jobs:\n\nmodels: ${{ needs.get-tests.outputs.models }}\nquantizations: ${{ needs.get-tests.outputs.quantizations }}"
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/issues/${{ needs.get-pr-number.outputs.PR_NUMBER }}/comments \
-f "body=This comment contains run-slow, running the specified jobs: ${{ env.BODY }} ..."
create_run:
name: Create run
if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-sha, get-tests, reply_to_comment]
permissions:
statuses: write
runs-on: ubuntu-22.04
steps:
- name: Create Run
id: create_run
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Create a commit status (pending) for a run of this workflow. The status has to be updated later in `update_run_status`.
# See https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=pending" -f "description=Slow CI job" -f "context=pytest/custom-tests"
run_models_gpu:
name: Run all tests for the model
if: ${{ needs.get-tests.outputs.models != '[]' }}
needs: [get-pr-number, get-sha, get-tests, create_run]
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.get-tests.outputs.models) }}
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ matrix.folders }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
# set the artifact folder names (because the character `/` is not allowed).
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Checkout to PR merge commit
working-directory: /transformers
run: |
git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git log -1 --format=%H
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
working-directory: /transformers
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: |
export CUDA_VISIBLE_DEVICES="$(python3 utils/set_cuda_devices_for_ci.py --test_folder ${{ matrix.folders }})"
echo $CUDA_VISIBLE_DEVICES
python3 -m pytest -v -rsfE --make-reports=${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
run_quantization_torch_gpu:
name: Run all tests for a quantization
if: ${{ needs.get-tests.outputs.quantizations != '[]' }}
needs: [get-pr-number, get-sha, get-tests, create_run]
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.get-tests.outputs.quantizations) }}
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-quantization-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo folder ${{ matrix.folders }}
shell: bash
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'quantization/'/'quantization_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Checkout to PR merge commit
working-directory: /transformers
run: |
git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
git log -1 --format=%H
- name: Verify merge commit SHA
env:
VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
working-directory: /transformers
run: |
PR_MERGE_SHA=$(git log -1 --format=%H)
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
exit -1;
fi
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run quantization tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports
update_run_status:
name: Update Check Run Status
needs: [get-sha, create_run, run_models_gpu, run_quantization_torch_gpu]
permissions:
statuses: write
if: ${{ always() && needs.create_run.result == 'success' }}
runs-on: ubuntu-22.04
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
STATUS_OK: ${{ contains(fromJSON('["skipped", "success"]'), needs.run_models_gpu.result) && contains(fromJSON('["skipped", "success"]'), needs.run_quantization_torch_gpu.result) }}
steps:
- name: Get `run_models_gpu` job status
run: |
echo "${{ needs.run_models_gpu.result }}"
echo "${{ needs.run_quantization_torch_gpu.result }}"
echo $STATUS_OK
if [ "$STATUS_OK" = "true" ]; then
echo "STATUS=success" >> $GITHUB_ENV
else
echo "STATUS=failure" >> $GITHUB_ENV
fi
- name: Update PR commit statuses
run: |
echo "${{ needs.run_models_gpu.result }}"
echo "${{ env.STATUS }}"
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
-f "target_url=$GITHUB_RUN_URL" -f "state=${{ env.STATUS }}" -f "description=Slow CI job" -f "context=pytest/custom-tests"


@ -21,39 +21,6 @@ jobs:
echo "$(python3 -c 'print(int(${{ github.run_number }}) % 10)')"
echo "run_number=$(python3 -c 'print(int(${{ github.run_number }}) % 10)')" >> $GITHUB_OUTPUT
run_past_ci_pytorch_1-13:
name: PyTorch 1.13
needs: get_number
if: needs.get_number.outputs.run_number == 0 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.13"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_pytorch_1-12:
name: PyTorch 1.12
needs: get_number
if: needs.get_number.outputs.run_number == 1 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.12"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_pytorch_1-11:
name: PyTorch 1.11
needs: get_number
if: needs.get_number.outputs.run_number == 2 && (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
uses: ./.github/workflows/self-past-caller.yml
with:
framework: pytorch
version: "1.11"
sha: ${{ github.sha }}
secrets: inherit
run_past_ci_tensorflow_2-11:
name: TensorFlow 2.11
needs: get_number


@ -1,151 +0,0 @@
name: PR slow CI
on:
pull_request:
paths:
- "src/transformers/models/*/modeling_*.py"
- "tests/**/test_*.py"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
find_models_to_run:
runs-on: ubuntu-22.04
name: Find models to run slow tests
# Triggered only if the required label `run-slow` is added
if: ${{ contains(github.event.pull_request.labels.*.name, 'run-slow') }}
outputs:
models: ${{ steps.models_to_run.outputs.models }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: "0"
ref: ${{ github.event.pull_request.head.sha }}
- name: Get commit message
run: |
echo "commit_message=$(git show -s --format=%s)" >> $GITHUB_ENV
- name: Get models to run slow tests
run: |
echo "${{ env.commit_message }}"
python -m pip install GitPython
python utils/pr_slow_ci_models.py --commit_message "${{ env.commit_message }}" | tee output.txt
echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
- name: Models to run slow tests
id: models_to_run
run: |
echo "${{ env.models }}"
echo "models=${{ env.models }}" >> $GITHUB_OUTPUT
run_models_gpu:
name: Run all tests for the model
# Triggered only if `find_models_to_run` is triggered (label `run-slow` is added), which gives the models to run
# (either a new model PR or via a commit message)
if: ${{ needs.find_models_to_run.outputs.models != '[]' }}
needs: find_models_to_run
strategy:
fail-fast: false
matrix:
folders: ${{ fromJson(needs.find_models_to_run.outputs.models) }}
machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
runs-on:
group: '${{ matrix.machine_type }}'
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Echo input and matrix info
shell: bash
run: |
echo "${{ matrix.folders }}"
- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
# set the artifact folder names (because the character `/` is not allowed).
run: |
echo "${{ matrix.folders }}"
matrix_folders=${{ matrix.folders }}
matrix_folders=${matrix_folders/'models/'/'models_'}
echo "$matrix_folders"
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
- name: Update clone
working-directory: /transformers
run: git fetch && git fetch origin pull/${{ github.event.pull_request.number }}/head:pull/${{ github.event.pull_request.number }}/merge && git checkout pull/${{ github.event.pull_request.number }}/merge
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e . && python3 -m pip install --upgrade torch torchaudio torchvision
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Set `machine_type` for report and artifact names
working-directory: /transformers
shell: bash
run: |
echo "${{ matrix.machine_type }}"
if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ matrix.machine_type }}
fi
echo "$machine_type"
echo "machine_type=$machine_type" >> $GITHUB_ENV
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: |
export CUDA_VISIBLE_DEVICES="$(python3 utils/set_cuda_devices_for_ci.py --test_folder ${{ matrix.folders }})"
echo $CUDA_VISIBLE_DEVICES
python3 -m pytest -v -rsfE --make-reports=${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
- name: Make sure report directory exists
shell: bash
run: |
mkdir -p /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
echo "hello" > /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
echo "${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
path: /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports


@ -1,25 +1,25 @@
name: Self-hosted runner (AMD mi210 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi210
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi210
secrets: inherit
name: Self-hosted runner (AMD mi210 CI caller)
on:
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi210
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi210
secrets: inherit


@ -1,25 +1,25 @@
name: Self-hosted runner (AMD mi250 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi250
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi250
secrets: inherit
name: Self-hosted runner (AMD mi250 CI caller)
on:
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*
paths:
- "src/**"
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
jobs:
run_amd_ci:
name: AMD mi250
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi250
secrets: inherit


@ -1,10 +1,10 @@
name: Self-hosted runner (AMD mi300 CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (push-caller)"]
branches: ["main"]
types: [completed]
#workflow_run:
# workflows: ["Self-hosted runner (push-caller)"]
# branches: ["main"]
# types: [completed]
push:
branches:
- run_amd_push_ci_caller*


@ -1,55 +1,55 @@
name: Self-hosted runner (AMD mi210 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
name: Self-hosted runner (AMD mi210 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
example-ci:
name: Example CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi210
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi210
secrets: inherit


@ -1,55 +1,55 @@
name: Self-hosted runner (AMD mi250 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
example-ci:
name: Example CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: ./.github/workflows/self-scheduled-amd.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
name: Self-hosted runner (AMD mi250 scheduled CI caller)
on:
workflow_run:
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
branches: ["main"]
types: [completed]
push:
branches:
- run_amd_scheduled_ci_caller*
jobs:
model-ci:
name: Model CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
torch-pipeline:
name: Torch pipeline CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_pipelines_torch_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
example-ci:
name: Example CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_examples_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit
deepspeed-ci:
name: DeepSpeed CI
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-daily-amd"
runner: mi250
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
ci_event: Scheduled CI (AMD) - mi250
secrets: inherit


@ -1,349 +0,0 @@
name: Self-hosted runner (scheduled-amd)
# Note: For the AMD CI, we rely on a caller workflow and on the workflow_call event to trigger the
# CI in order to run it on both MI210 and MI250, without having to use matrix here which pushes
# us towards the limit of allowed jobs on GitHub Actions.
on:
workflow_call:
inputs:
job:
required: true
type: string
slack_report_channel:
required: true
type: string
runner:
required: true
type: string
docker:
required: true
type: string
ci_event:
required: true
type: string
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
NUM_SLICES: 2
# Important note: each job (run_tests_single_gpu, run_tests_multi_gpu, run_examples_gpu, run_pipelines_torch_gpu) requires all the previous jobs before running.
# This is done so that we avoid parallelizing the scheduled tests, to leave available
# runners for the push CI that is running on the same machine.
jobs:
check_runner_status:
name: Check Runner Status
runs-on: ubuntu-22.04
steps:
- name: Checkout transformers
uses: actions/checkout@v4
with:
fetch-depth: 2
- name: Check Runner Status
run: python utils/check_self_hosted_runner.py --target_runners hf-amd-mi210-ci-1gpu-1,hf-amd-mi250-ci-1gpu-1,hf-amd-mi300-ci-1gpu-1 --token ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
check_runners:
name: Check Runners
needs: check_runner_status
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: huggingface/transformers-pytorch-amd-gpu
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
setup:
if: contains(fromJSON('["run_models_gpu"]'), inputs.job)
name: Setup
needs: check_runners
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: huggingface/transformers-pytorch-amd-gpu
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
steps:
- name: Update clone
working-directory: /transformers
run: |
git fetch && git checkout ${{ github.sha }}
- name: Cleanup
working-directory: /transformers
run: |
rm -rf tests/__pycache__
rm -rf tests/models/__pycache__
rm -rf reports
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- id: set-matrix
name: Identify models to test
working-directory: /transformers/tests
run: |
echo "folder_slices=$(python3 ../utils/split_model_tests.py --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
echo "slice_ids=$(python3 -c 'd = list(range(${{ env.NUM_SLICES }})); print(d)')" >> $GITHUB_OUTPUT
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
run_models_gpu:
if: ${{ inputs.job == 'run_models_gpu' }}
name: Single GPU tests
needs: setup
strategy:
max-parallel: 1 # For now, not to parallelize. Can change later if it works well.
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
uses: ./.github/workflows/model_jobs_amd.yml
with:
folder_slices: ${{ needs.setup.outputs.folder_slices }}
machine_type: ${{ matrix.machine_type }}
slice_id: ${{ matrix.slice_id }}
runner: ${{ inputs.runner }}
docker: ${{ inputs.docker }}
secrets: inherit
run_pipelines_torch_gpu:
if: ${{ inputs.job == 'run_pipelines_torch_gpu' }}
name: PyTorch pipelines
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all pipeline tests on GPU
working-directory: /transformers
run: |
python3 -m pytest -n 1 -v --dist=loadfile --make-reports=${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports tests/pipelines -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_pipelines_torch_gpu_test_reports
run_examples_gpu:
if: ${{ inputs.job == 'run_examples_gpu' }}
name: Examples directory
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run examples tests on GPU
working-directory: /transformers
run: |
pip install -r examples/pytorch/_tests_requirements.txt
python3 -m pytest -v --make-reports=${{ matrix.machine_type }}_run_examples_gpu_test_reports examples/pytorch -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_examples_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_examples_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_examples_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_examples_gpu_test_reports
run_torch_cuda_extensions_gpu:
if: ${{ inputs.job == 'run_torch_cuda_extensions_gpu' }}
name: Torch ROCm deepspeed tests
needs: check_runners
strategy:
fail-fast: false
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: ['${{ matrix.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
container:
image: ${{ inputs.docker }}
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
run: |
echo "ROCR: $ROCR_VISIBLE_DEVICES"
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Run all tests on GPU
working-directory: /transformers
run: python3 -m pytest -v --make-reports=${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended -m "not not_device_test"
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: cat /transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
- name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
path: /transformers/reports/${{ matrix.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
send_results:
name: Slack Report
needs: [
check_runner_status,
check_runners,
setup,
run_models_gpu,
run_pipelines_torch_gpu,
run_examples_gpu,
run_torch_cuda_extensions_gpu
]
if: ${{ always() }}
uses: ./.github/workflows/slack-report.yml
with:
job: ${{ inputs.job }}
# This would be `skipped` if `setup` is skipped.
setup_status: ${{ needs.setup.result }}
slack_report_channel: ${{ inputs.slack_report_channel }}
# This would be an empty string if `setup` is skipped.
folder_slices: ${{ needs.setup.outputs.folder_slices }}
quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
ci_event: ${{ inputs.ci_event }}
secrets: inherit


@ -366,7 +366,7 @@ jobs:
run: |
python3 -m pip uninstall -y deepspeed
rm -rf DeepSpeed
git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build
git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build
DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
- name: NVIDIA-SMI


@ -70,7 +70,7 @@ jobs:
with:
name: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
- name: Send message to Slack for quantization workflow
@ -90,7 +90,7 @@ jobs:
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service_quantization.py "${{ inputs.quantization_matrix }}"
python utils/notification_service_quantization.py "${{ inputs.quantization_matrix }}"
# Upload complete failure tables, as they might be big and only truncated versions could be sent to Slack.
- name: Failure table artifacts
@ -98,4 +98,4 @@ jobs:
uses: actions/upload-artifact@v4
with:
name: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}
path: ci_results_${{ inputs.job }}


@ -16,3 +16,5 @@ jobs:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
with:
extra_args: --results=verified,unknown


@ -132,7 +132,7 @@ You will need basic `git` proficiency to contribute to
manual. Type `git --help` in a shell and enjoy! If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.
You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main/setup.py#L449)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:
You'll need **[Python 3.9](https://github.com/huggingface/transformers/blob/main/setup.py#L449)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:
1. Fork the [repository](https://github.com/huggingface/transformers) by
clicking on the **[Fork](https://github.com/huggingface/transformers/fork)** button on the repository's page. This creates a copy of the code


@ -249,23 +249,43 @@ The model itself is a regular [Pytorch `nn.Module`](https://pytorch.org/docs/sta
### With pip
This repository is tested on Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.
This repository is tested on Python 3.9+, Flax 0.4.1+, PyTorch 2.0+, and TensorFlow 2.6+.
You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
First, create a virtual environment with the version of Python you're going to use and activate it.
Then, you will need to install at least one of Flax, PyTorch, or TensorFlow.
Please refer to [TensorFlow installation page](https://www.tensorflow.org/install/), [PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) and/or [Flax](https://github.com/google/flax#quick-install) and [Jax](https://github.com/google/jax#installation) installation pages regarding the specific installation command for your platform.
**macOS/Linux**
```bash
python -m venv env
source env/bin/activate
```
**Windows**
```bash
python -m venv env
env\Scripts\activate
```
To use 🤗 Transformers, you must install at least one of Flax, PyTorch, or TensorFlow. Refer to the official installation guides for platform-specific commands:
[TensorFlow installation page](https://www.tensorflow.org/install/),
[PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) and/or [Flax](https://github.com/google/flax#quick-install) and [Jax](https://github.com/google/jax#installation)
When one of those backends has been installed, 🤗 Transformers can be installed using pip as follows:
```bash
pip install transformers
```
If you'd like to play with the examples or need the bleeding edge of the code and can't wait for a new release, you must [install the library from source](https://huggingface.co/docs/transformers/installation#installing-from-source).
```bash
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install .
```
### With conda
🤗 Transformers can be installed using conda as follows:


@ -15,7 +15,7 @@ to add it.
Keywords: Open-source, LLaMa, GPT-J, instruction, assistant
## [recommenders](https://github.com/microsoft/recommenders)
## [recommenders](https://github.com/recommenders-team/recommenders)
This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. It goes over several aspects required to build efficient recommendation systems: data preparation, modeling, evaluation, model selection & optimization, as well as operationalization
@ -39,15 +39,15 @@ MindsDB is a low-code ML platform, which automates and integrates several ML fra
Keywords: Database, low-code, AI table
## [langchain](https://github.com/hwchase17/langchain)
## [langchain](https://github.com/langchain-ai/langchain)
[langchain](https://github.com/hwchase17/langchain) is aimed at assisting in the development of apps merging both LLMs and other sources of knowledge. The library allows chaining calls to applications, creating a sequence across many tools.
[langchain](https://github.com/langchain-ai/langchain) is aimed at assisting in the development of apps merging both LLMs and other sources of knowledge. The library allows chaining calls to applications, creating a sequence across many tools.
Keywords: LLMs, Large Language Models, Agents, Chains
## [LlamaIndex](https://github.com/jerryjliu/llama_index)
## [LlamaIndex](https://github.com/run-llama/llama_index)
[LlamaIndex](https://github.com/jerryjliu/llama_index) is a project that provides a central interface to connect your LLMs with external data. It provides various kinds of indices and retrieval mechanisms to perform different LLM tasks and obtain knowledge-augmented results.
[LlamaIndex](https://github.com/run-llama/llama_index) is a project that provides a central interface to connect your LLMs with external data. It provides various kinds of indices and retrieval mechanisms to perform different LLM tasks and obtain knowledge-augmented results.
Keywords: LLMs, Large Language Models, Data Retrieval, Indices, Knowledge Augmentation
@ -146,9 +146,9 @@ Keywords: Framework, simplicity, NLP
Keywords: LLM, Agents, HF Hub
## [transformers.js](https://xenova.github.io/transformers.js/)
## [transformers.js](https://github.com/huggingface/transformers.js/)
[transformers.js](https://xenova.github.io/transformers.js/) is a JavaScript library targeted at running models from transformers directly within the browser.
[transformers.js](https://github.com/huggingface/transformers.js/) is a JavaScript library targeted at running models from transformers directly within the browser.
Keywords: Transformers, JavaScript, browser
@ -437,7 +437,7 @@ Keywords: DALL-E, Russian
Keywords: Knowledge Extraction, Knowledge Graphs
## [Nebuly](https://github.com/nebuly-ai/nebuly)
## [Nebuly](https://github.com/nebuly-ai/optimate)
Nebuly is the next-generation platform to monitor and optimize your AI costs in one place. The platform connects to all your AI cost sources (compute, API providers, AI software licenses, etc.) and centralizes them in one place to give you full visibility on a model basis. The platform also provides optimization recommendations and a co-pilot model that can guide you during the optimization process. The platform builds on top of open-source tools, allowing you to optimize the different steps of your AI stack to squeeze out the best possible cost performance.

benchmark/README.md (new file)

@ -0,0 +1,49 @@
# Benchmarks
You might want to add new benchmarks. To do so, define a Python function named `run_benchmark` in a Python file placed in this `benchmark/` directory.
The expected function signature is the following:
```py
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
```
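For instance, a minimal benchmark module only needs to expose that entry point (the file name `my_benchmark.py` below is hypothetical):
```py
# benchmark/my_benchmark.py -- a minimal sketch of a new benchmark (hypothetical file name)
from logging import Logger


def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
    # The entrypoint discovers this module and calls run_benchmark with the git context.
    logger.info(f"benchmarking {branch}@{commit_id}: {commit_msg}")
    # ... load a model and record your measurements here ...
```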
## Writing metrics to the database
`MetricsRecorder` is thread-safe with respect to Python [`Thread`](https://docs.python.org/3/library/threading.html#threading.Thread)s. This means you can start a background thread to record the device measurements without blocking the main thread that executes the model measurements.
See [`llama.py`](./llama.py) for an example of this in practice.
```py
from benchmarks_entrypoint import MetricsRecorder
import psycopg2
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
metrics_recorder = MetricsRecorder(psycopg2.connect("dbname=metrics"), logger, branch, commit_id, commit_msg)
benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
# To collect device measurements
metrics_recorder.collect_device_measurements(
benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
)
# To collect your model measurements
metrics_recorder.collect_model_measurements(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
)
```
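Continuing from the snippet above, the background-reader pattern described in the note looks roughly like this (a sketch based on [`llama.py`](./llama.py); the measurement values are placeholders):
```py
from threading import Event, Thread
from time import sleep

def collect_metrics(benchmark_id, stop_event, metrics_recorder):
    # Runs in a background thread; MetricsRecorder is safe to share with the main thread.
    while not stop_event.is_set():
        # Replace the placeholders with real cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes readings.
        metrics_recorder.collect_device_measurements(benchmark_id, 0.0, 0, 0.0, 0)
        sleep(0.01)

stop_event = Event()
metrics_thread = Thread(target=collect_metrics, args=[benchmark_id, stop_event, metrics_recorder])
metrics_thread.start()
# ... run the model measurements on the main thread, then stop the reader ...
stop_event.set()
metrics_thread.join()
```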


@ -0,0 +1,144 @@
import argparse
import importlib.util
import logging
import os
from typing import Dict
import psycopg2
import sys
from psycopg2.extras import Json
from psycopg2.extensions import register_adapter
register_adapter(dict, Json)
class ImportModuleException(Exception):
pass
class MetricsRecorder:
def __init__(self, connection, logger: logging.Logger, branch: str, commit_id: str, commit_msg: str):
self.conn = connection
self.conn.autocommit = True
self.logger = logger
self.branch = branch
self.commit_id = commit_id
self.commit_msg = commit_msg
def initialise_benchmark(self, metadata: Dict[str, str]) -> int:
"""
Creates a new benchmark, returns the benchmark id
"""
# gpu_name: str, model_id: str
with self.conn.cursor() as cur:
cur.execute(
"INSERT INTO benchmarks (branch, commit_id, commit_message, metadata) VALUES (%s, %s, %s, %s) RETURNING benchmark_id",
(self.branch, self.commit_id, self.commit_msg, metadata),
)
benchmark_id = cur.fetchone()[0]
logger.debug(f"initialised benchmark #{benchmark_id}")
return benchmark_id
def collect_device_measurements(self, benchmark_id: int, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes):
"""
Collect device metrics, such as CPU & GPU usage. These are "static", as in you cannot pass arbitrary arguments to the function.
"""
with self.conn.cursor() as cur:
cur.execute(
"INSERT INTO device_measurements (benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes) VALUES (%s, %s, %s, %s, %s)",
(benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes),
)
self.logger.debug(
f"inserted device measurements for benchmark #{benchmark_id} [CPU util: {cpu_util}, mem MBs: {mem_megabytes}, GPU util: {gpu_util}, GPU mem MBs: {gpu_mem_megabytes}]"
)
def collect_model_measurements(self, benchmark_id: int, measurements: Dict[str, float]):
with self.conn.cursor() as cur:
cur.execute(
"""
INSERT INTO model_measurements (
benchmark_id,
measurements
) VALUES (%s, %s)
""",
(
benchmark_id,
measurements,
),
)
self.logger.debug(f"inserted model measurements for benchmark #{benchmark_id}: {measurements}")
def close(self):
self.conn.close()
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter("[%(levelname)s - %(asctime)s] %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
def parse_arguments():
"""
Parse command line arguments for the benchmarking CLI.
"""
parser = argparse.ArgumentParser(description="CLI for benchmarking the huggingface/transformers library.")
parser.add_argument(
"branch",
type=str,
help="The branch name on which the benchmarking is performed.",
)
parser.add_argument(
"commit_id",
type=str,
help="The commit hash on which the benchmarking is performed.",
)
parser.add_argument(
"commit_msg",
type=str,
help="The commit message associated with the commit, truncated to 70 characters.",
)
args = parser.parse_args()
return args.branch, args.commit_id, args.commit_msg
def import_from_path(module_name, file_path):
try:
spec = importlib.util.spec_from_file_location(module_name, file_path)
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
return module
except Exception as e:
raise ImportModuleException(f"failed to load python module: {e}")
if __name__ == "__main__":
benchmarks_folder_path = os.path.dirname(os.path.realpath(__file__))
branch, commit_id, commit_msg = parse_arguments()
for entry in os.scandir(benchmarks_folder_path):
try:
if not entry.name.endswith(".py"):
continue
if entry.path == __file__:
continue
logger.debug(f"loading: {entry.name}")
module = import_from_path(entry.name.split(".")[0], entry.path)
logger.info(f"runnning benchmarks in: {entry.name}")
module.run_benchmark(logger, branch, commit_id, commit_msg)
except ImportModuleException as e:
logger.error(e)
except Exception as e:
logger.error(f"error running benchmarks for {entry.name}: {e}")

benchmark/default.yml (new file)

@ -0,0 +1,10 @@
apiVersion: 1
providers:
- name: 'Transformers Benchmarks'
orgId: 1
type: file
updateIntervalSeconds: 10
allowUiUpdates: true
options:
path: /etc/grafana/dashboards

File diff suppressed because it is too large.


@ -0,0 +1,17 @@
apiVersion: 1
datasources:
- name: grafana-postgresql-datasource
uid: be28nkzirtb0gd
type: postgres
url: $GRAFANA_POSTGRES_DATASOURCE_URL
user: $GRAFANA_POSTGRES_DATASOURCE_USER
secureJsonData:
password: $GRAFANA_POSTGRES_DATASOURCE_PWD
jsonData:
database: metrics
maxOpenConns: 100
maxIdleConns: 100
maxIdleConnsAuto: true
connMaxLifetime: 14400
postgresVersion: 1000
timescaledb: false


@ -3,10 +3,14 @@ CREATE TABLE IF NOT EXISTS benchmarks (
branch VARCHAR(255),
commit_id VARCHAR(72),
commit_message VARCHAR(70),
gpu_name VARCHAR(255),
metadata jsonb,
created_at timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
);
CREATE INDEX IF NOT EXISTS benchmarks_benchmark_id_idx ON benchmarks (benchmark_id);
CREATE INDEX IF NOT EXISTS benchmarks_branch_idx ON benchmarks (branch);
CREATE TABLE IF NOT EXISTS device_measurements (
measurement_id SERIAL PRIMARY KEY,
benchmark_id int REFERENCES benchmarks (benchmark_id),
@ -17,6 +21,8 @@ CREATE TABLE IF NOT EXISTS device_measurements (
time timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
);
CREATE INDEX IF NOT EXISTS device_measurements_branch_idx ON device_measurements (benchmark_id);
CREATE TABLE IF NOT EXISTS model_measurements (
measurement_id SERIAL PRIMARY KEY,
benchmark_id int REFERENCES benchmarks (benchmark_id),
@ -24,3 +30,4 @@ CREATE TABLE IF NOT EXISTS model_measurements (
time timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
);
CREATE INDEX IF NOT EXISTS model_measurements_branch_idx ON model_measurements (benchmark_id);


@ -1,71 +1,25 @@
import argparse
import json
import logging
from logging import Logger
import os
import sys
from statistics import mean
from threading import Event, Thread
from time import perf_counter, sleep
from typing import Optional
from benchmarks_entrypoint import MetricsRecorder
import gpustat
import psutil
import psycopg2
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, StaticCache
from psycopg2.extras import Json
from psycopg2.extensions import register_adapter
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter("[%(levelname)s - %(asctime)s] %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
os.environ["TOKENIZERS_PARALLELISM"] = "1"
torch.set_float32_matmul_precision("high")
register_adapter(dict, Json)
def parse_arguments():
"""
Parse command line arguments for the benchmarking CLI.
"""
parser = argparse.ArgumentParser(description="CLI for benchmarking the huggingface/transformers.")
parser.add_argument(
"branch",
type=str,
help="The branch name on which the benchmarking is performed.",
)
parser.add_argument(
"commit_id",
type=str,
help="The commit hash on which the benchmarking is performed.",
)
parser.add_argument(
"commit_msg",
type=str,
help="The commit message associated with the commit, truncated to 70 characters.",
)
args = parser.parse_args()
return args.branch, args.commit_id, args.commit_msg
def collect_metrics(benchmark_id, continue_metric_collection):
def collect_metrics(benchmark_id, continue_metric_collection, metrics_recorder):
p = psutil.Process(os.getpid())
conn = psycopg2.connect("dbname=metrics")
cur = conn.cursor()
while not continue_metric_collection.is_set():
with p.oneshot():
cpu_util = p.cpu_percent()
@ -73,43 +27,41 @@ def collect_metrics(benchmark_id, continue_metric_collection):
gpu_stats = gpustat.GPUStatCollection.new_query()
gpu_util = gpu_stats[0]["utilization.gpu"]
gpu_mem_megabytes = gpu_stats[0]["memory.used"]
cur.execute(
"INSERT INTO device_measurements (benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes) VALUES (%s, %s, %s, %s, %s)",
(benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes),
metrics_recorder.collect_device_measurements(
benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
)
sleep(0.01)
conn.commit()
conn.close()
def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
continue_metric_collection = Event()
metrics_thread = None
model_id = "meta-llama/Llama-2-7b-hf"
metrics_recorder = MetricsRecorder(psycopg2.connect("dbname=metrics"), logger, branch, commit_id, commit_msg)
try:
gpu_stats = gpustat.GPUStatCollection.new_query()
gpu_name = gpu_stats[0]["name"]
conn = psycopg2.connect("dbname=metrics")
cur = conn.cursor()
cur.execute(
"INSERT INTO benchmarks (branch, commit_id, commit_message, gpu_name) VALUES (%s, %s, %s, %s) RETURNING benchmark_id",
(branch, commit_id, commit_msg, gpu_name),
benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
logger.info(f"running benchmark #{benchmark_id} on {gpu_name} for {model_id}")
metrics_thread = Thread(
target=collect_metrics,
args=[benchmark_id, continue_metric_collection, metrics_recorder],
)
conn.commit()
benchmark_id = cur.fetchone()[0]
metrics_thread = Thread(target=collect_metrics, args=[benchmark_id, continue_metric_collection])
metrics_thread.start()
logger.info("started background thread to fetch device metrics")
os.environ["TOKENIZERS_PARALLELISM"] = "false" # silence warnings when compiling
device = "cuda"
ckpt = "meta-llama/Llama-2-7b-hf"
logger.info("downloading weights")
# This is to avoid counting download in model load time measurement
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
gen_config = GenerationConfig(do_sample=False, top_p=1, temperature=1)
logger.info("loading model")
start = perf_counter()
model = AutoModelForCausalLM.from_pretrained(
ckpt, torch_dtype=torch.float16, generation_config=gen_config
model_id, torch_dtype=torch.float16, generation_config=gen_config
).eval()
model.to(device)
torch.cuda.synchronize()
@ -117,7 +69,7 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
model_load_time = end - start
logger.info(f"loaded model in: {model_load_time}s")
tokenizer = AutoTokenizer.from_pretrained(ckpt)
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = "Why dogs are so cute?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
@ -364,41 +316,27 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
logger.info(f"completed second compile generation in: {fourth_compile_generate_time}s")
logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
cur.execute(
"""
INSERT INTO model_measurements (
benchmark_id,
measurements
) VALUES (%s, %s)
""",
(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
),
metrics_recorder.collect_model_measurements(
benchmark_id,
{
"model_load_time": model_load_time,
"first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
"second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
"first_eager_generate_time_secs": first_eager_generate_time,
"second_eager_generate_time_secs": second_eager_generate_time,
"time_to_first_token_secs": time_to_first_token,
"time_to_second_token_secs": time_to_second_token,
"time_to_third_token_secs": time_to_third_token,
"time_to_next_token_mean_secs": mean_time_to_next_token,
"first_compile_generate_time_secs": first_compile_generate_time,
"second_compile_generate_time_secs": second_compile_generate_time,
"third_compile_generate_time_secs": third_compile_generate_time,
"fourth_compile_generate_time_secs": fourth_compile_generate_time,
},
)
conn.commit()
conn.close()
except Exception as e:
logger.error(f"Caught exception: {e}")
continue_metric_collection.set()
if metrics_thread is not None:
metrics_thread.join()
if __name__ == "__main__":
branch, commit_id, commit_msg = parse_arguments()
run_benchmark(branch, commit_id, commit_msg, num_tokens_to_generate=20)
metrics_recorder.close()


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
USER root
ARG REF=main


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git cmake wget xz-utils build-essential g++5 libprotobuf-dev protobuf-compiler


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
USER root
RUN apt-get update && apt-get install -y libsndfile1-dev espeak-ng time git


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libsndfile1-dev espeak-ng time git g++ cmake pkg-config openssh-client git


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
USER root


@ -1,4 +1,4 @@
FROM python:3.10-slim
FROM python:3.9-slim
ENV PYTHONDONTWRITEBYTECODE=1
ARG REF=main
RUN echo ${REF}


@ -1,4 +1,4 @@
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -9,7 +9,7 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).
ARG PYTORCH='2.4.0'
ARG PYTORCH='2.6.0'
# (not always a valid torch version)
ARG INTEL_TORCH_EXT='2.3.0'
# Example: `cu102`, `cu113`, etc.
@ -26,7 +26,7 @@ RUN git clone https://github.com/huggingface/transformers && cd transformers &&
# 1. Put several commands in a single `RUN` to avoid image/layer exporting issue. Could be revised in the future.
# 2. Regarding `torch` part, We might need to specify proper versions for `torchvision` and `torchaudio`.
# Currently, let's not bother to specify their versions explicitly (so installed with their latest release versions).
RUN python3 -m pip install --no-cache-dir -U tensorflow==2.13 protobuf==3.20.3 tensorflow_text tensorflow_probability && python3 -m pip install --no-cache-dir -e ./transformers[dev,onnxruntime] && [ ${#PYTORCH} -gt 0 -a "$PYTORCH" != "pre" ] && VERSION='torch=='$PYTORCH'.*' || VERSION='torch'; echo "export VERSION='$VERSION'" >> ~/.profile && echo torch=$VERSION && [ "$PYTORCH" != "pre" ] && python3 -m pip install --no-cache-dir -U $VERSION torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/$CUDA || python3 -m pip install --no-cache-dir -U --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/$CUDA
RUN python3 -m pip install --no-cache-dir -U tensorflow==2.13 protobuf==3.20.3 "tensorflow_text<2.16" "tensorflow_probability<0.22" && python3 -m pip install --no-cache-dir -e ./transformers[dev,onnxruntime] && [ ${#PYTORCH} -gt 0 -a "$PYTORCH" != "pre" ] && VERSION='torch=='$PYTORCH'.*' || VERSION='torch'; echo "export VERSION='$VERSION'" >> ~/.profile && echo torch=$VERSION && [ "$PYTORCH" != "pre" ] && python3 -m pip install --no-cache-dir -U $VERSION torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/$CUDA || python3 -m pip install --no-cache-dir -U --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/$CUDA
RUN python3 -m pip uninstall -y flax jax
@ -65,6 +65,9 @@ RUN python3 -m pip install --no-cache-dir python-Levenshtein
# For `FastSpeech2ConformerTokenizer` tokenizer
RUN python3 -m pip install --no-cache-dir g2p-en
# For Some bitsandbytes tests
RUN python3 -m pip install --no-cache-dir einops
# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop


@ -48,8 +48,8 @@ RUN python3 -m pip uninstall -y torch-tensorrt apex
# Pre-build **nightly** release of DeepSpeed, so it would be ready for testing (otherwise, the 1st deepspeed test will timeout)
RUN python3 -m pip uninstall -y deepspeed
# This has to be run inside the GPU VMs running the tests. (So far, it fails here due to GPU checks during compilation.)
# Issue: https://github.com/microsoft/DeepSpeed/issues/2010
# RUN git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build && \
# Issue: https://github.com/deepspeedai/DeepSpeed/issues/2010
# RUN git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build && \
# DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 DS_BUILD_UTILS=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1
RUN python3 -m pip install -U "itsdangerous<2.1.0"


@ -1,17 +1,18 @@
FROM rocm/dev-ubuntu-22.04:6.0.2
# rocm/pytorch has no version with 2.1.0
FROM rocm/dev-ubuntu-22.04:6.2.4
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y --no-install-recommends git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-dev python3-pip python3-dev ffmpeg && \
apt install -y --no-install-recommends git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-dev python3-pip ffmpeg git-lfs && \
apt clean && \
rm -rf /var/lib/apt/lists/*
RUN git lfs install
RUN python3 -m pip install --no-cache-dir --upgrade pip numpy
RUN python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
RUN python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
RUN python3 -m pip install --no-cache-dir --upgrade importlib-metadata setuptools ninja git+https://github.com/facebookresearch/detectron2.git pytesseract "itsdangerous<2.1.0"
@ -30,5 +31,5 @@ RUN python3 -m pip uninstall -y tensorflow flax
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop
# Remove nvml as it is not compatible with ROCm. apex is not tested on NVIDIA either.
RUN python3 -m pip uninstall py3nvml pynvml apex -y
# Remove nvml and nvidia-ml-py as it is not compatible with ROCm. apex is not tested on NVIDIA either.
RUN python3 -m pip uninstall py3nvml pynvml nvidia-ml-py apex -y

View File

@ -1,11 +1,11 @@
FROM rocm/dev-ubuntu-22.04:5.6
FROM rocm/dev-ubuntu-22.04:6.2.4
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
ARG PYTORCH='2.1.1'
ARG TORCH_VISION='0.16.1'
ARG TORCH_AUDIO='2.1.1'
ARG ROCM='5.6'
ARG PYTORCH='2.5.1'
ARG TORCH_VISION='0.20.0'
ARG TORCH_AUDIO='2.5.0'
ARG ROCM='6.2'
RUN apt update && \
apt install -y --no-install-recommends \
@ -45,4 +45,4 @@ RUN cd transformers && python3 setup.py develop
RUN python3 -c "from deepspeed.launcher.runner import main"
# Remove nvml as it is not compatible with ROCm
RUN python3 -m pip uninstall py3nvml pynvml -y
RUN python3 -m pip uninstall py3nvml pynvml nvidia-ml-py apex -y

View File

@ -1,5 +1,5 @@
# https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-11.html#rel-23-11
FROM nvcr.io/nvidia/pytorch:23.04-py3
FROM nvcr.io/nvidia/pytorch:23.11-py3
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive

View File

@ -34,8 +34,8 @@ RUN python3 -m pip uninstall -y torch-tensorrt apex
# Pre-build **nightly** release of DeepSpeed, so it would be ready for testing (otherwise, the 1st deepspeed test will timeout)
RUN python3 -m pip uninstall -y deepspeed
# This has to be run inside the GPU VMs running the tests. (So far, it fails here due to GPU checks during compilation.)
# Issue: https://github.com/microsoft/DeepSpeed/issues/2010
# RUN git clone https://github.com/microsoft/DeepSpeed && cd DeepSpeed && rm -rf build && \
# Issue: https://github.com/deepspeedai/DeepSpeed/issues/2010
# RUN git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build && \
# DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 DS_BUILD_UTILS=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1
## For `torchdynamo` tests

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -11,7 +11,7 @@ ARG REF=main
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF
# If set to nothing, will install the latest version
ARG PYTORCH='2.4.0'
ARG PYTORCH='2.6.0'
ARG TORCH_VISION=''
ARG TORCH_AUDIO=''
# Example: `cu102`, `cu113`, etc.

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -9,12 +9,12 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).
ARG PYTORCH='2.2.1'
ARG PYTORCH='2.5.1'
# Example: `cu102`, `cu113`, etc.
ARG CUDA='cu118'
RUN apt update
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python python3-pip ffmpeg
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-pip ffmpeg
RUN python3 -m pip install --no-cache-dir --upgrade pip
ARG REF=main
@ -36,15 +36,26 @@ RUN python3 -m pip install --no-cache-dir einops
# Add bitsandbytes for mixed int8 testing
RUN python3 -m pip install --no-cache-dir bitsandbytes
# Add auto-gptq for gptq quantization testing
RUN python3 -m pip install --no-cache-dir auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
# Add auto-gptq for gptq quantization testing, installed from source for pytorch==2.5.1 compatibility
# TORCH_CUDA_ARCH_LIST="7.5+PTX" is added to make the package compile for Tesla T4 gpus available for the CI.
RUN pip install gekko
RUN git clone https://github.com/PanQiWei/AutoGPTQ.git && cd AutoGPTQ && TORCH_CUDA_ARCH_LIST="7.5+PTX" python3 setup.py install
# Add optimum for gptq quantization testing
RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/optimum@main#egg=optimum
# Add PEFT
RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/peft@main#egg=peft
# Add aqlm for quantization testing
RUN python3 -m pip install --no-cache-dir aqlm[gpu]==1.0.2
# Add vptq for quantization testing
RUN python3 -m pip install --no-cache-dir vptq
# Add spqr for quantization testing
RUN python3 -m pip install --no-cache-dir spqr_quant[gpu]
# Add hqq for quantization testing
RUN python3 -m pip install --no-cache-dir hqq
@ -52,8 +63,8 @@ RUN python3 -m pip install --no-cache-dir hqq
RUN python3 -m pip install --no-cache-dir gguf
# Add autoawq for quantization testing
# >=v0.2.3 needed for compatibility with torch 2.2.1
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.3/autoawq-0.2.3+cu118-cp38-cp38-linux_x86_64.whl
# >=v0.2.7 needed for compatibility with transformers > 4.46
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.7.post2/autoawq-0.2.7.post2-py3-none-any.whl
# Add quanto for quantization testing
RUN python3 -m pip install --no-cache-dir optimum-quanto
@ -61,6 +72,10 @@ RUN python3 -m pip install --no-cache-dir optimum-quanto
# Add eetq for quantization testing
RUN python3 -m pip install git+https://github.com/NetEase-FuXi/EETQ.git
# Add flute-kernel and fast_hadamard_transform for quantization testing
RUN python3 -m pip install --no-cache-dir flute-kernel==0.3.0 -i https://flute-ai.github.io/whl/cu118
RUN python3 -m pip install --no-cache-dir fast_hadamard_transform==1.0.4.post1
# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -18,7 +18,7 @@ RUN [ ${#TENSORFLOW} -gt 0 ] && VERSION='tensorflow=='$TENSORFLOW'.*' || VERSIO
RUN python3 -m pip uninstall -y torch flax
RUN python3 -m pip install -U "itsdangerous<2.1.0"
RUN python3 -m pip install --no-cache-dir -U tensorflow_probability
RUN python3 -m pip install --no-cache-dir -U "tensorflow_probability<0.22"
# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.

View File

@ -276,14 +276,14 @@ building the return.
Here's an example of a single value return:
```
```python
Returns:
`List[int]`: A list of integers in the range [0, 1] --- 1 for a special token, 0 for a sequence token.
```
Here's an example of a tuple return, comprising several objects:
```
```python
Returns:
`tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
- ** loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --
@ -322,10 +322,9 @@ includes an example of how to transcribe speech to text in the
The syntax for Example docstrings can look as follows:
```
```python
Example:
```python
>>> from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
>>> from datasets import load_dataset
>>> import torch
@ -347,7 +346,6 @@ The syntax for Example docstrings can look as follows:
>>> transcription = processor.batch_decode(predicted_ids)
>>> transcription[0]
'MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL'
```
```
The docstring should give a minimal, clear example of how the respective model

View File

@ -1,57 +1,70 @@
### Translating the Transformers documentation into your language
# Translating the Transformers documentation into your language
As part of our mission to democratize machine learning, we'd love to make the Transformers library available in many more languages! Follow the steps below if you want to help translate the documentation into your language 🙏.
As part of our mission to democratize machine learning, we aim to make the Transformers library available in many more languages! Follow the steps below to help translate the documentation into your language.
**🗞️ Open an issue**
## Open an Issue
To get started, navigate to the [Issues](https://github.com/huggingface/transformers/issues) page of this repo and check if anyone else has opened an issue for your language. If not, open a new issue by selecting the "Translation template" from the "New issue" button.
1. Navigate to the Issues page of this repository.
2. Check if anyone has already opened an issue for your language.
3. If not, create a new issue by selecting the "Translation template" from the "New issue" button.
4. Post a comment indicating which chapters you'd like to work on, and we'll add your name to the list.
Once an issue exists, post a comment to indicate which chapters you'd like to work on, and we'll add your name to the list.
## Fork the Repository
1. First, fork the Transformers repo by clicking the Fork button in the top-right corner.
2. Clone your fork to your local machine for editing with the following command:
**🍴 Fork the repository**
```bash
git clone https://github.com/YOUR-USERNAME/transformers.git
```
Replace `YOUR-USERNAME` with your GitHub username.
First, you'll need to [fork the Transformers repo](https://docs.github.com/en/get-started/quickstart/fork-a-repo). You can do this by clicking on the **Fork** button on the top-right corner of this repo's page.
## Copy-paste the English version with a new language code
Once you've forked the repo, you'll want to get the files on your local machine for editing. You can do that by cloning the fork with Git as follows:
The documentation files are organized in the following directory:
```bash
git clone https://github.com/YOUR-USERNAME/transformers.git
```
- **docs/source**: This contains all documentation materials organized by language.
**📋 Copy-paste the English version with a new language code**
To copy the English version to your new language directory:
The documentation files are in one leading directory:
1. Navigate to your fork of the repository:
- [`docs/source`](https://github.com/huggingface/transformers/tree/main/docs/source): All the documentation materials are organized here by language.
```bash
cd ~/path/to/transformers/docs
```
You'll only need to copy the files in the [`docs/source/en`](https://github.com/huggingface/transformers/tree/main/docs/source/en) directory, so first navigate to your fork of the repo and run the following:
Replace `~/path/to` with your actual path.
```bash
cd ~/path/to/transformers/docs
cp -r source/en source/LANG-ID
```
2. Run the following command:
Here, `LANG-ID` should be one of the ISO 639-1 or ISO 639-2 language codes -- see [here](https://www.loc.gov/standards/iso639-2/php/code_list.php) for a handy table.
```bash
cp -r source/en source/LANG-ID
```
**✍️ Start translating**
Replace `LANG-ID` with the appropriate ISO 639-1 or ISO 639-2 language code (see [this table](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) for reference).
The fun part comes - translating the text!
## Start translating
The first thing we recommend is translating the part of the `_toctree.yml` file that corresponds to your doc chapter. This file is used to render the table of contents on the website.
Begin translating the text!
> 🙋 If the `_toctree.yml` file doesn't yet exist for your language, you can create one by copy-pasting from the English version and deleting the sections unrelated to your chapter. Just make sure it exists in the `docs/source/LANG-ID/` directory!
1. Start with the `_toctree.yml` file that corresponds to your documentation chapter. This file is essential for rendering the table of contents on the website.
The fields you should add are `local` (with the name of the file containing the translation; e.g. `autoclass_tutorial`), and `title` (with the title of the doc in your language; e.g. `Load pretrained instances with an AutoClass`) -- as a reference, here is the `_toctree.yml` for [English](https://github.com/huggingface/transformers/blob/main/docs/source/en/_toctree.yml):
- If the `_toctree.yml` file doesn't exist for your language, create one by copying the English version and removing unrelated sections.
- Ensure it is placed in the `docs/source/LANG-ID/` directory.
```yaml
- sections:
- local: pipeline_tutorial # Do not change this! Use the same name for your .md file
title: Pipelines for inference # Translate this!
...
title: Tutorials # Translate this!
```
Heres an example structure for the `_toctree.yml` file:
Once you have translated the `_toctree.yml` file, you can start translating the [MDX](https://mdxjs.com/) files associated with your docs chapter.
```yaml
- sections:
- local: pipeline_tutorial # Keep this name for your .md file
title: Pipelines for Inference # Translate this
...
title: Tutorials # Translate this
```
> 🙋 If you'd like others to help you with the translation, you should [open an issue](https://github.com/huggingface/transformers/issues) and tag @stevhliu.
2. Once you've translated the `_toctree.yml`, move on to translating the associated MDX files.
## Collaborate and share
If you'd like assistance with your translation, open an issue and tag `@stevhliu`. Feel free to share resources or glossaries to ensure consistent terminology.

View File

@ -30,26 +30,26 @@
- local: conversations
title: الدردشة مع المحولات
title: البرامج التعليمية
# - sections:
# - isExpanded: false
# sections:
# - local: tasks/sequence_classification
# title: تصنيف النصوص
# - local: tasks/token_classification
# title: تصنيف الرموز
# - local: tasks/question_answering
# title: الإجابة على الأسئلة
# - local: tasks/language_modeling
# title: نمذجة اللغة السببية
# - local: tasks/masked_language_modeling
# title: نمذجة اللغة المقنعة
# - local: tasks/translation
# title: الترجمة
# - local: tasks/summarization
# title: التلخيص
# - local: tasks/multiple_choice
# title: الاختيار المتعدد
# title: معالجة اللغات الطبيعية
- sections:
- isExpanded: false
sections:
- local: tasks/sequence_classification
title: تصنيف النصوص
- local: tasks/token_classification
title: تصنيف الرموز
- local: tasks/question_answering
title: الإجابة على الأسئلة
- local: tasks/language_modeling
title: نمذجة اللغة السببية
- local: tasks/masked_language_modeling
title: نمذجة اللغة المقنعة
- local: tasks/translation
title: الترجمة
- local: tasks/summarization
title: التلخيص
- local: tasks/multiple_choice
title: الاختيار المتعدد
title: معالجة اللغات الطبيعية
# - isExpanded: false
# sections:
# - local: tasks/audio_classification
@ -107,39 +107,43 @@
# - local: tasks/prompting
# title: دليل إرشادي لمحفزات النماذج اللغوية الكبيرة
# title: الإرشاد
# title: أدلة المهام
# - sections:
# - local: fast_tokenizers
# title: استخدم برامج التجزئة السريعة من 🤗 Tokenizers
# - local: multilingual
# title: تشغيل الاستنتاج باستخدام نماذج متعددة اللغات
# - local: create_a_model
# title: استخدام واجهات برمجة التطبيقات الخاصة بالنموذج
# - local: custom_models
# title: مشاركة نموذج مخصص
# - local: chat_templating
# title: قوالب لنماذج الدردشة
# - local: trainer
# title: المدرب
# - local: sagemaker
# title: تشغيل التدريب على Amazon SageMaker
# - local: serialization
# title: التصدير إلى ONNX
# - local: tflite
# title: التصدير إلى TFLite
# - local: torchscript
# title: التصدير إلى TorchScript
# - local: benchmarks
# title: المعايير
# - local: notebooks
# title: دفاتر الملاحظات مع الأمثلة
# - local: community
# title: موارد المجتمع
# - local: troubleshooting
# title: استكشاف الأخطاء وإصلاحها
# - local: gguf
# title: التوافق مع ملفات GGUF
# title: أدلة المطورين
title: أدلة المهام
- sections:
- local: fast_tokenizers
title: استخدم مجزئيات النصوص السريعة من 🤗 Tokenizers
- local: multilingual
title: الاستدلال باستخدام نماذج متعددة اللغات
- local: create_a_model
title: استخدام واجهات برمجة التطبيقات الخاصة بالنموذج
- local: custom_models
title: مشاركة نموذج مخصص
- local: chat_templating
title: قوالب لنماذج الدردشة
- local: trainer
title: المدرب
- local: sagemaker
title: تشغيل التدريب على Amazon SageMaker
- local: serialization
title: التصدير إلى ONNX
- local: tflite
title: التصدير إلى TFLite
- local: torchscript
title: التصدير إلى TorchScript
- local: notebooks
title: دفاتر الملاحظات مع الأمثلة
- local: community
title: موارد المجتمع
- local: troubleshooting
title: استكشاف الأخطاء وإصلاحها
- local: gguf
title: التوافق مع ملفات GGUF
- local: tiktoken
title: التوافق مع ملفات TikToken
- local: modular_transformers
title: الوحدات النمطية في `transformers`
- local: how_to_hack_models
title: اختراق النموذج (الكتابة فوق فئة لاستخدامك)
title: أدلة المطورين
# - sections:
# - local: quantization/overview
# title: نظرة عامة
@ -151,6 +155,8 @@
# title: AWQ
# - local: quantization/aqlm
# title: AQLM
# - local: quantization/vptq
# title: VPTQ
# - local: quantization/quanto
# title: Quanto
# - local: quantization/eetq
@ -875,7 +881,7 @@
# - local: internal/pipelines_utils
# title: مرافق خطوط الأنابيب
# - local: internal/tokenization_utils
# title: مرافق مقسم النصوص
# title: مرافق مقسم النصوص
# - local: internal/trainer_utils
# title: مرافق المدرب
# - local: internal/generation_utils

View File

@ -464,7 +464,7 @@ image = image_generator(prompt=improved_prompt)
before finally generating the image:
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png" />
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp" />
> [!WARNING]
> gradio-tools requires *textual* inputs and outputs even when working with different modalities such as image and audio objects. Image and audio inputs and outputs are currently incompatible.

View File

@ -0,0 +1,835 @@
# Chat model templates
## Introduction
An increasingly common use case for large language models (LLMs) is **chat**. In a chat context, rather than continuing a single string of text (as is the case with a standard language model), the model instead continues a conversation that consists of one or more messages, each of which includes a role, like "user" or "assistant", as well as message text.
Much like tokenization, different models expect very different input formats for chat. This is the reason we added **chat templates** as a feature. Chat templates are part of the tokenizer. They specify how to convert conversations, represented as lists of messages, into a single tokenizable string in the format that the model expects.
Let's make this concrete with a quick example using the `BlenderBot` model. BlenderBot has an extremely simple default template, which mostly just adds whitespace between rounds of dialogue:
```python
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
>>> chat = [
... {"role": "user", "content": "Hello, how are you?"},
... {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
... {"role": "user", "content": "I'd like to show off how chat templating works!"},
... ]
>>> tokenizer.apply_chat_template(chat, tokenize=False)
" Hello, how are you? I'm doing great. How can I help you today? I'd like to show off how chat templating works!</s>"
```
Notice how the entire chat is condensed into a single string. If we use `tokenize=True`, which is the default, that string will also be tokenized for us. To see a more complex template in action, though, let's use the `mistralai/Mistral-7B-Instruct-v0.1` model.
```python
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
>>> chat = [
... {"role": "user", "content": "Hello, how are you?"},
... {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
... {"role": "user", "content": "I'd like to show off how chat templating works!"},
... ]
>>> tokenizer.apply_chat_template(chat, tokenize=False)
"<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]</s>"
```
Note that this time, the tokenizer has added the control tokens `[INST]` and `[/INST]` to indicate the start and end of user messages (but not assistant messages!), and the entire conversation has been condensed into a single string. If we use `tokenize=True`, which is the default, that string will also be tokenized.
Now, try the same code, but swap in the `HuggingFaceH4/zephyr-7b-beta` model instead, and you'll get:
```text
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
I'd like to show off how chat templating works!</s>
```
Both Zephyr and Mistral-Instruct were fine-tuned from the same base model, Mistral-7B-v0.1. However, they were trained with totally different chat formats. Without chat templates, you would have to write manual formatting code for each model, and it's very easy to make minor errors that hurt performance! Chat templates handle the details of formatting for you, allowing you to write universal code that works for any model.
## How do I use chat templates?
As you saw in the example above, chat templates are easy to use. Simply build a list of messages, with `role` and `content` keys, and then pass it to [`~PreTrainedTokenizer.apply_chat_template`]. Once you do that, you'll get output that's ready to go! When using chat templates as input for model generation, it's also a good idea to use `add_generation_prompt=True` to add a [generation prompt](#what-are-generation-prompts).
Here's an example of preparing input for `model.generate()`, using Zephyr again:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
checkpoint = "HuggingFaceH4/zephyr-7b-beta"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # You may want to use bfloat16 and/or move to GPU here
messages = [
{
"role": "system",
"content": "You are a friendly chatbot who always responds in the style of a pirate",
},
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
print(tokenizer.decode(tokenized_chat[0]))
```
This will produce a string in the input format that Zephyr expects.
```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
```
Now that our input is formatted correctly for Zephyr, we can use the model to generate a response to the user's question:
```python
outputs = model.generate(tokenized_chat, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))
```
This will yield:
```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all.
```
That was easy after all!
## Is there an automated pipeline for chat?
Yes, there is! Our text generation pipelines support chat inputs, which makes it easy to use chat models. In the past, we used a dedicated "ConversationalPipeline" class, but this has now been deprecated and its functionality has been merged into the [`TextGenerationPipeline`]. Let's try the Zephyr example again, but this time using a pipeline:
```python
from transformers import pipeline
pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
messages = [
{
"role": "system",
"content": "You are a friendly chatbot who always responds in the style of a pirate",
},
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
print(pipe(messages, max_new_tokens=128)[0]['generated_text'][-1]) # Print the assistant's response
```
```text
{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
```
The pipeline will take care of all the details of tokenization and calling `apply_chat_template` for you - once the model has a chat template, all you need to do is initialize the pipeline and pass it the list of messages!
## What are "generation prompts"?
You may have noticed that the `apply_chat_template` method has an `add_generation_prompt` argument. This argument tells the template to add tokens that indicate the start of a bot response. For example, consider the following chat:
```python
messages = [
{"role": "user", "content": "Hi there!"},
{"role": "assistant", "content": "Nice to meet you!"},
{"role": "user", "content": "Can I ask a question?"}
]
```
Here's what this looks like without a generation prompt, for a model that uses the standard "ChatML" format:
```python
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
"""
```
And here's what it looks like **with** a generation prompt:
```python
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
```
Note that this time, we've added the tokens that indicate the start of a bot response. This ensures that when the model generates text, it will write a bot response instead of doing something unexpected, like continuing the user's message. Remember, chat models are still just language models - they're trained to continue text, and chat is just a special kind of text to them! You need to guide them with appropriate control tokens, so they know what they're supposed to be doing.
Not all models require generation prompts. Some models, like LLaMA, don't have any special tokens before bot responses. In these cases, the `add_generation_prompt` argument will have no effect. The exact effect that `add_generation_prompt` has will depend on the template being used.
## What does "continue_final_message" do?
When passing a list of messages to `apply_chat_template` or `TextGenerationPipeline`, you can choose to format the chat so that the model will continue the final message in the chat instead of starting a new one. This is done by removing any end-of-sequence tokens that indicate the end of the final message, so that the model will simply extend the final message when it begins to generate text. This is useful for "prefilling" the start of the model's response.
Here's an example:
```python
chat = [
{"role": "user", "content": "Can you format the answer in JSON?"},
{"role": "assistant", "content": '{"name": "'},
]
formatted_chat = tokenizer.apply_chat_template(chat, tokenize=True, return_dict=True, continue_final_message=True)
model.generate(**formatted_chat)
```
The model will generate text that continues the JSON string, rather than starting a new message. This approach can be very useful for improving the accuracy of the model's instruction-following when you know how you want it to start its replies.
Because `add_generation_prompt` adds the tokens that start a new message, and `continue_final_message` removes any end-of-message tokens from the final message, it does not make sense to use them together. As a result, you'll get an error if you try!
The default behavior of `TextGenerationPipeline` is to set `add_generation_prompt=True` so that it starts a new message. However, if the final message in the input chat has the "assistant" role, it will assume that this message is a prefill and switch to `continue_final_message=True` instead, because most models do not support multiple consecutive assistant messages. You can override this behavior by explicitly passing the `continue_final_message` argument when calling the pipeline.
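For instance, here is a minimal sketch of overriding that heuristic explicitly, reusing the Zephyr pipeline from earlier (the generation settings are illustrative, not prescribed by this doc):

```python
from transformers import pipeline

pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
messages = [
    {"role": "user", "content": "Can you format the answer in JSON?"},
    {"role": "assistant", "content": '{"name": "'},  # prefill of the reply
]
# The trailing assistant message would normally flip the pipeline to
# continue_final_message=True on its own; passing it explicitly makes
# the intent clear and overrides the heuristic.
out = pipe(messages, max_new_tokens=32, continue_final_message=True)
print(out[0]["generated_text"][-1]["content"])
```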
## Can I use chat templates in training?
Yes! This is a good way to ensure that the chat template matches the tokens the model sees during training. We recommend applying the chat template as a preprocessing step for your dataset. After this, you can simply continue like any other language model training task. When training, you should usually set `add_generation_prompt=False`, because the added tokens to prompt an assistant response will not be helpful during training. Let's see an example:
```python
from transformers import AutoTokenizer
from datasets import Dataset
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
chat1 = [
{"role": "user", "content": "Which is bigger, the moon or the sun?"},
{"role": "assistant", "content": "The sun."}
]
chat2 = [
{"role": "user", "content": "Which is bigger, a virus or a bacterium?"},
{"role": "assistant", "content": "A bacterium."}
]
dataset = Dataset.from_dict({"chat": [chat1, chat2]})
dataset = dataset.map(lambda x: {"formatted_chat": tokenizer.apply_chat_template(x["chat"], tokenize=False, add_generation_prompt=False)})
print(dataset['formatted_chat'][0])
```
And we get:
```text
<|user|>
Which is bigger, the moon or the sun?</s>
<|assistant|>
The sun.</s>
```
From here, just continue training like you would with a standard language modelling task, using the `formatted_chat` column.
<Tip>
By default, some *tokenizers* add special tokens like `<bos>` and `<eos>` to text they tokenize. Chat templates should already include all the special tokens they need, so additional special tokens will often be incorrect or duplicated, which will hurt model performance.
Therefore, if you format text with `apply_chat_template(tokenize=False)`, you should set the argument `add_special_tokens=False` when you tokenize that text later. If you use `apply_chat_template(tokenize=True)`, you don't need to worry about this!
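As a minimal sketch of that first case, reusing `chat1` and the tokenizer from the training example above:

```python
# Format without tokenizing; the template already inserts BOS/EOS itself.
formatted = tokenizer.apply_chat_template(chat1, tokenize=False)

# Tokenize later WITHOUT adding special tokens a second time.
input_ids = tokenizer(formatted, add_special_tokens=False)["input_ids"]
```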
</Tip>
## Advanced: Extra inputs to chat templates
The only argument that `apply_chat_template` requires is `messages`. However, you can pass any keyword argument to `apply_chat_template` and it will be accessible inside the template. This gives you a lot of freedom to use chat templates for many things. There are no restrictions on the names or the formats of these arguments - you can pass strings, lists, dicts, or whatever else you want.
That said, there are some common use-cases for these extra arguments, such as passing tools for function calling, or documents for retrieval-augmented generation. In these common cases, we have some opinionated recommendations about the names and formats of these arguments, which are described in the sections below. We encourage model authors to make their chat templates compatible with this format, to make it easy to transfer tool-calling code between models.
## Advanced: Tool use / function calling
"Tool use" models can choose to call functions as external tools before generating an answer. When passing tools to a tool-use model, you can simply pass a list of functions to the `tools` argument:
```python
from datetime import datetime
def current_time():
"""Get the current local time as a string."""
return str(datetime.now())
def multiply(a: float, b: float):
"""
A function that multiplies two numbers
Args:
a: The first number to multiply
b: The second number to multiply
"""
return a * b
tools = [current_time, multiply]
model_input = tokenizer.apply_chat_template(
messages,
tools=tools
)
```
For this to work correctly, you should write your functions in the format above, so that they can be parsed correctly as tools. Specifically, you should follow these rules:
- The function should have a descriptive name.
- Every argument must have a type hint.
- The function must have a docstring in the standard Google style (in other words, an initial function description followed by an `Args:` block that describes the arguments, unless the function does not have any arguments).
- Do not include types in the `Args:` block. In other words, write `a: The first number to multiply`, not `a (int): The first number to multiply`. Type hints should go in the function header instead.
- The function can have a return type and a `Returns:` block in the docstring. However, these are optional because most tool-use models ignore them.
### Passing tool results to the model
The sample code above is enough to list the available tools for your model, but what happens if it actually wants to use one? If that happens, you should:
1. Parse the model's output to get the tool name(s) and arguments.
2. Add the model's tool call(s) to the conversation.
3. Call the corresponding function(s) with those arguments.
4. Add the result(s) to the conversation.
### A complete tool use example
Let's walk through a tool use example, step by step. For this example, we will use the 8B `Hermes-2-Pro` model, as it is one of the highest-performing tool-use models in its size class at the time of writing. If you have the memory, you can consider using a larger model instead, like `Command-R` or `Mixtral-8x22B`, both of which also support tool use and offer even stronger performance.
First, let's load our model and tokenizer:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
checkpoint = "NousResearch/Hermes-2-Pro-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")
```

Next, let's define the chat we want the model to respond to:

```python
messages = [
{"role": "system", "content": "You are a bot that responds to weather queries. You should reply with the unit used in the queried location."},
{"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
]
```
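Note that the excerpt goes on to pass a `tools` argument that is never defined in this fragment. Based on the function-writing rules above and the `get_current_temperature` call the model makes below, the missing definition is presumably something like this sketch (the exact signature is an assumption):

```python
def get_current_temperature(location: str, unit: str):
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    """
    return 22.0  # Dummy value; a real tool would query a weather service

tools = [get_current_temperature]
```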
Now, let's apply the chat template and generate a response:
```python
inputs = tokenizer.apply_chat_template(messages, chat_template="tool_use", tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
```
And we get:
```text
<tool_call>
{"arguments": {"location": "Paris, France", "unit": "celsius"}, "name": "get_current_temperature"}
</tool_call><|im_end|>
```
The model has called the function with valid arguments, in the format requested by the function's docstring. It has inferred that we're most likely referring to the Paris in France, and it remembered that, as the home of SI units, the temperature in France should be displayed in Celsius.
Let's append the model's tool call to the conversation. Note that we generate a random tool call ID here. These are not used by all models, but they allow models to issue multiple tool calls at once and keep track of which response corresponds to which call. You can generate them any way you like, but they should be unique within each chat.
```python
tool_call_id = "vAHdf3" # Random ID, should be unique for each tool call
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"id": tool_call_id, "type": "function", "function": tool_call}]})
```
Now that we've added the tool call to the conversation, we can call the function and append the result to the conversation. Since we're just using a dummy function for this example that always returns 22.0, we can simply append that result directly. Note the tool call ID - it should match the ID used in the tool call above.
```python
messages.append({"role": "tool", "tool_call_id": tool_call_id, "name": "get_current_temperature", "content": "22.0"})
```
Finally, let's have the assistant read the function outputs and continue chatting with the user:
```python
inputs = tokenizer.apply_chat_template(messages, chat_template="tool_use", tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
```
And we get:
```text
The current temperature in Paris, France is 22.0 ° Celsius.<|im_end|>
```
<Tip>
Not all of the tool-calling features shown above are used by every tool-use model. Some use tool call IDs, while others simply use the function name and match tool calls to results by order, and there are several models that use neither and only issue one tool call at a time to avoid confusion. If you want your code to be compatible with as many models as possible, we recommend structuring your tool calls as shown here, and returning tool results in the order the model issued them. The chat templates on each model should handle the rest.
</Tip>
### Understanding tool schemas
Each function you pass to the `tools` argument of `apply_chat_template` is converted into a [JSON schema](https://json-schema.org/learn/getting-started-step-by-step). These schemas are then passed to the model's chat template. In other words, tool-use models do not see your functions directly, and they never see the code inside them. What they care about is the function **definitions** and the **arguments** they need to pass to them - they care about what the tools do and how to use them, not about how they work! It is up to you to read their outputs, detect whether they have requested to use a tool, pass the arguments to the tool function, and return the response in the chat.
Generating JSON schemas to pass to the template should be automatic and invisible as long as your functions follow the specification above, but if you encounter problems, or you simply want more control over the conversion, you can handle the conversion manually. Here is an example of a manual schema conversion:
```python
from transformers.utils import get_json_schema
def multiply(a: float, b: float):
"""
A function that multiplies two numbers
Args:
a: The first number to multiply
b: The second number to multiply
"""
return a * b
schema = get_json_schema(multiply)
print(schema)
```
This will yield:
```json
{
"type": "function",
"function": {
"name": "multiply",
"description": "A function that multiplies two numbers",
"parameters": {
"type": "object",
"properties": {
"a": {
"type": "number",
"description": "The first number to multiply"
},
"b": {
"type": "number",
"description": "The second number to multiply"
}
},
"required": ["a", "b"]
}
}
}
```
If you wish, you can edit these schemas, or even write them from scratch yourself without using `get_json_schema` at all. JSON schemas can be passed directly to the `tools` argument of `apply_chat_template` - this gives you a lot of power to define precise schemas for more complex functions. Be careful, though - the more complex your schemas, the more likely the model is to get confused when dealing with them! We recommend simple function signatures where possible, keeping arguments (and especially complex, nested arguments) to a minimum.
Here is an example of defining schemas by hand, and passing them directly to `apply_chat_template`:
```python
# A simple function that takes no arguments
current_time = {
"type": "function",
"function": {
"name": "current_time",
"description": "Get the current local time as a string.",
"parameters": {
'type': 'object',
'properties': {}
}
}
}
# A more complete function that takes two numerical arguments
multiply = {
'type': 'function',
'function': {
'name': 'multiply',
'description': 'A function that multiplies two numbers',
'parameters': {
'type': 'object',
'properties': {
'a': {
'type': 'number',
'description': 'The first number to multiply'
},
'b': {
'type': 'number', 'description': 'The second number to multiply'
}
},
'required': ['a', 'b']
}
}
}
model_input = tokenizer.apply_chat_template(
messages,
tools = [current_time, multiply]
)
```
## Advanced: Retrieval-augmented generation
"Retrieval-augmented generation" or "RAG" LLMs can search a corpus of documents for information before responding to a query. This allows models to vastly expand their knowledge base beyond their limited context size. Our recommendation for RAG models is that their template should accept a `documents` argument. This should be a list of documents, where each "document" is a single dict with `title` and `contents` keys, both of which are strings. Because this format is much simpler than the JSON schemas used for tools, no helper functions are necessary.
Here's an example of a RAG template in action:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the model and tokenizer
model_id = "CohereForAI/c4ai-command-r-v01-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
device = model.device # Get the device the model is loaded on
# Define the conversation input
conversation = [
{"role": "user", "content": "What has Man always dreamed of?"}
]
# Define documents for retrieval-augmented generation
documents = [
{
"title": "The Moon: Our Age-Old Foe",
"text": "Man has always dreamed of destroying the moon. In this essay, I shall..."
},
{
"title": "The Sun: Our Age-Old Friend",
"text": "Although often underappreciated, the sun provides several notable benefits..."
}
]
# Tokenize the conversation and documents using the RAG template, and return PyTorch tensors.
input_ids = tokenizer.apply_chat_template(
conversation=conversation,
documents=documents,
chat_template="rag",
tokenize=True,
add_generation_prompt=True,
return_tensors="pt").to(device)
# Generate a response
gen_tokens = model.generate(
input_ids,
max_new_tokens=100,
do_sample=True,
temperature=0.3,
)
# Decode and print the generated text
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
The `documents` input for retrieval-augmented generation is not widely supported, and many models have chat templates that simply ignore this input.
To verify whether a model supports the `documents` input, you can read its model card, or `print(tokenizer.chat_template)` to see if the `documents` key is used anywhere.
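A quick sketch of that check, which also handles the multi-template case described further below:

```python
template = tokenizer.chat_template
if isinstance(template, dict):
    # Some tokenizers store several named templates; search them all.
    template = " ".join(template.values())
print("documents" in (template or ""))
```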
<Tip>
One class of models that does support it, though, is Cohere's [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-08-2024) and [Command-R+](https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024), through their `rag` chat template. You can see additional examples of generation with this feature in their model cards.
</Tip>
## Advanced: How do chat templates work?
The chat template for a model is stored in the `tokenizer.chat_template` attribute. If no chat template is set, the default template for that model class is used instead. Let's take a look at the `Zephyr` chat template, though note that this one is slightly simplified from the actual template!
```
{%- for message in messages %}
{{- '<|' + message['role'] + '|>\n' }}
{{- message['content'] + eos_token }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|assistant|>\n' }}
{%- endif %}
```
If you've never seen one of these before, this is a [Jinja template](https://jinja.palletsprojects.com/en/3.1.x/templates/). Jinja is a templating language that allows you to write simple code that generates text. In many ways, the code and syntax resemble Python. In pure Python, this template would look something like this:
```python
for message in messages:
print(f'<|{message["role"]}|>')
print(message['content'] + eos_token)
if add_generation_prompt:
print('<|assistant|>')
```
Effectively, the template does three things:
- For each message, it prints the role enclosed in `<|` and `|>`, like `<|user|>` or `<|assistant|>`.
- Next, it prints the content of the message, followed by the end-of-sequence token `eos_token`.
- Finally, if `add_generation_prompt` is set, it prints the assistant token, so that the model knows it should begin generating an assistant response.
This is a pretty simple template, but Jinja gives you a lot of flexibility to do more complex things! Let's see a Jinja template that can format inputs similarly to the way LLaMA formats them (note that the real LLaMA template includes handling for default system messages and slightly different system message handling in general - don't use this one in your actual code!)
```
{%- for message in messages %}
{%- if message['role'] == 'user' %}
{{- bos_token + '[INST] ' + message['content'] + ' [/INST]' }}
{%- elif message['role'] == 'system' %}
{{- '<<SYS>>\\n' + message['content'] + '\\n<</SYS>>\\n\\n' }}
{%- elif message['role'] == 'assistant' %}
{{- ' ' + message['content'] + ' ' + eos_token }}
{%- endif %}
{%- endfor %}
```
Hopefully if you stare at this for a little while, you can see what this template is doing - it adds specific tokens like `[INST]` and `[/INST]` based on the role of each message. User, assistant, and system messages are clearly distinguishable to the model because of the tokens they're wrapped in.
## Advanced: Adding and editing chat templates
### How do I create a chat template?
Simple, just write a Jinja template and set `tokenizer.chat_template`. You may find it easier to start with an existing template from another model and simply edit it for your needs! For example, we could take the LLaMA template above and add `[ASST]` and `[/ASST]` to assistant messages:
```
{%- for message in messages %}
{%- if message['role'] == 'user' %}
{{- bos_token + '[INST] ' + message['content'].strip() + ' [/INST]' }}
{%- elif message['role'] == 'system' %}
{{- '<<SYS>>\\n' + message['content'].strip() + '\\n<</SYS>>\\n\\n' }}
{%- elif message['role'] == 'assistant' %}
{{- '[ASST] ' + message['content'] + ' [/ASST]' + eos_token }}
{%- endif %}
{%- endfor %}
```
Now, simply set the `tokenizer.chat_template` attribute. The next time you use [`~PreTrainedTokenizer.apply_chat_template`], it will use your new template! This attribute will be saved in the `tokenizer_config.json` file, so you can use [`~utils.PushToHubMixin.push_to_hub`] to upload your new template to the Hub and make sure everyone's using the right template for your model!
```python
template = tokenizer.chat_template
template = template.replace("SYS", "SYSTEM") # Change the system token
tokenizer.chat_template = template # Set the new template
tokenizer.push_to_hub("model_name") # Push the new template to the Hub!
```
The [`~PreTrainedTokenizer.apply_chat_template`] method, which uses your chat template, is called by the [`TextGenerationPipeline`] class, so once you set the correct chat template, your model will automatically become compatible with [`TextGenerationPipeline`].
<Tip>
If you're fine-tuning a model for chat, in addition to setting a chat template, you should probably add any new chat control tokens as special tokens in the tokenizer. Special tokens are never split, which ensures that your control tokens are always handled as single tokens rather than being tokenized in pieces. You should also set the tokenizer's `eos_token` attribute to the token that marks the end of assistant generations in your template. This will ensure that text generation tools can correctly figure out when to stop generating text.
</Tip>
### Why do some models have multiple templates?
Some models use different templates for different use cases. For example, they might use one template for normal chat and another for tool use, or retrieval-augmented generation. In these cases, `tokenizer.chat_template` is a dictionary. This can cause some confusion, and where possible, we recommend using a single template for all use-cases. You can use Jinja statements like `if tools is defined` and `{% macro %}` definitions to easily wrap multiple code paths in a single template.
When a tokenizer has multiple templates, `tokenizer.chat_template` will be a `dict`, where each key is the name of a template. The `apply_chat_template` method has special handling for certain template names: specifically, it will look for a template named `default` in most cases, and will raise an error if it can't find one. However, if a template named `tool_use` exists and the user has passed a `tools` argument, it will use that template instead. To access templates with other names, pass the name of the template you want to the `chat_template` argument of `apply_chat_template()`.
We find that this can be a bit confusing for users, though - so if you're writing a template yourself, we recommend trying to put it all in a single template where possible!
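To see whether a given tokenizer ships several named templates, a small sketch like this can help:

```python
if isinstance(tokenizer.chat_template, dict):
    print(list(tokenizer.chat_template.keys()))  # e.g. ['default', 'tool_use', 'rag']
else:
    print("This tokenizer has a single chat template.")
```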
## What template should I use?
When setting the template for a model that's already been trained for chat, you should ensure that the template exactly matches the message formatting that the model saw during training, or else you will probably experience performance degradation. This is true even if you're training the model further - you will probably get the best performance if you keep the chat tokens constant. This is very analogous to tokenization - you generally get the best performance for inference or fine-tuning when you precisely match the tokenization used during training.
If you're training a model from scratch, or fine-tuning a base language model for chat, on the other hand, you have a lot of freedom to choose an appropriate template! LLMs are smart enough to learn to handle lots of different input formats. One popular choice is the "ChatML" format, and this is a good, flexible choice for many use-cases. It looks like this:
```
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
```
If you like this one, here it is in one-liner form, ready to copy into your code. The one-liner also includes handy support for [generation prompts](#what-are-generation-prompts), but note that it doesn't add BOS or EOS tokens! If your model expects those, they won't be added automatically by `apply_chat_template` - in other words, the text will be tokenized with `add_special_tokens=False`. This is to avoid potential conflicts between the template and the `add_special_tokens` logic. If your model expects special tokens, make sure to add them to the template!
```python
tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
```
This template wraps each message in `<|im_start|>` and `<|im_end|>` tokens, and simply writes the role as a string, which allows for flexibility in the roles you train with. The output looks like this:
```text
<|im_start|>system
You are a helpful chatbot that will do its best not to say anything so stupid that people tweet about it.<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
I'm doing great!<|im_end|>
```
The "user", "system" and "assistant" roles are the standard for chat, and we recommend using them when it makes sense, particularly if you want your model to operate well with [`TextGenerationPipeline`]. However, you are not limited to these roles - templating is extremely flexible, and any string can be a role.
## I'd like to add some chat templates! How do I get started?
If you have any chat models, you should set their `tokenizer.chat_template` attribute and test it using [`~PreTrainedTokenizer.apply_chat_template`], then push the updated tokenizer to the Hub. This applies even if you're not the model owner - if you're using a model with an empty chat template, or one that's still using the default class template, please open a [pull request](https://huggingface.co/docs/hub/repositories-pull-requests-discussions) to the model repository so that this attribute can be set properly!
Once the attribute is set, that's it, you're done! `tokenizer.apply_chat_template` will now work correctly for that model, which means it is also automatically supported in places like `TextGenerationPipeline`!
By ensuring that models have this attribute, we can make sure that the whole community gets to use the full power of open-source models. Formatting mismatches have been haunting the field and silently harming performance for too long - it's time to put an end to them!
## Advanced: Template writing tips
<Tip>
The easiest way to get started with writing Jinja templates is to take a look at some existing ones. You can use `print(tokenizer.chat_template)` for any chat model to see what template it's using. In general, models that support tool use have much more complex templates than other models - so when you're just getting started, they're probably a bad example to learn from! You can also take a look at the [Jinja documentation](https://jinja.palletsprojects.com/en/3.1.x/templates/#synopsis) for details of general Jinja formatting and syntax.
</Tip>
Jinja templates in `transformers` are identical to Jinja templates elsewhere. The main thing to know is that the conversation history will be accessible inside your template as a variable called `messages`. You will be able to access `messages` in your template just like you can in Python, which means you can loop over it with `{% for message in messages %}` or access individual messages with `{{ messages[0] }}`, for example.
You can also use the following tips to write clean, efficient Jinja templates:
### Trimming whitespace
By default, Jinja will print any whitespace that comes before or after a block. This can be a problem for chat templates, which generally want to be very precise with whitespace! To avoid this, we strongly recommend writing your templates like this:
```
{%- for message in messages %}
{{- message['role'] + message['content'] }}
{%- endfor %}
```
rather than like this:
```
{% for message in messages %}
{{ message['role'] + message['content'] }}
{% endfor %}
```
Adding `-` will strip any whitespace that comes before the block. The second example looks innocent, but the newline and indentation may end up being included in the output, which is probably not what you want!
### Special variables
Inside your template, you will have access to several special variables. The most important of these is `messages`, which contains the chat history as a list of message dicts. However, there are several others. Not every variable will be used in every template. The most common other variables are:
- `tools` contains a list of tools in JSON schema format. It will be `None` or undefined if no tools are passed.
- `documents` contains a list of documents in the format `{"title": "Title", "contents": "Contents"}`, used for retrieval-augmented generation. It will be `None` or undefined if no documents are passed.
- `add_generation_prompt` is a bool that is `True` if the user has requested a generation prompt, and `False` otherwise. If this is set, your template should add the header for an assistant message to the end of the conversation. If your model doesn't have a specific header for assistant messages, you can ignore this flag.
- **Special tokens** like `bos_token` and `eos_token`. These are extracted from `tokenizer.special_tokens_map`. The exact tokens available inside each template will differ depending on the parent tokenizer.
<Tip>
You can actually pass any `kwarg` to `apply_chat_template`, and it will be accessible inside the template as a variable. In general, we recommend trying to stick to the core variables above, as it will make your model harder to use if users have to write custom code to pass model-specific kwargs. However, we're aware that this field moves quickly, so if you have a new use-case that doesn't fit in the core API, feel free to use a new kwarg for it! If a new kwarg becomes common, we may promote it into the core API and create a standard, documented format for it.
</Tip>
### Callable functions
There is also a short list of callable functions available to you inside your templates. These are (both are illustrated in the sketch after this list):
- `raise_exception(msg)`: Raises a `TemplateException`. This is useful for debugging, and for telling users when they're doing something that your template doesn't support.
- `strftime_now(format_str)`: Equivalent to `datetime.now().strftime(format_str)` in Python. This is used for getting the current date/time in a specific format, which is sometimes included in system messages.
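As an illustrative sketch (the exact wording and format here are assumptions, not a standard template), both functions might appear in a template like this:

```text
{%- if messages[0]['role'] != 'user' %}
    {{- raise_exception('The first message must come from the user!') }}
{%- endif %}
{{- 'Today is ' + strftime_now('%d %B %Y') + '.\n' }}
```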
### Compatibility with non-Python Jinja
There are multiple implementations of Jinja in various languages. They generally have the same syntax, but a key difference is that when you're writing a template in Python, you can use Python methods, such as `.lower()` on strings or `.items()` on dicts. This will break if someone tries to use your template on a non-Python implementation of Jinja. Non-Python implementations are particularly common in deployment environments, where JS and Rust are very popular.
Don't panic, though! There are a few easy changes you can make to your templates to ensure they're compatible across all implementations of Jinja (a short sketch follows this list):
- Replace Python methods with Jinja filters. These usually have the same name, for example, `string.lower()` becomes `string|lower`, and `dict.items()` becomes `dict|items`. One notable change is that `string.strip()` becomes `string|trim`. See the [list of built-in filters](https://jinja.palletsprojects.com/en/3.1.x/templates/#builtin-filters) in the Jinja documentation for more.
- Replace `True`, `False` and `None`, which are Python-specific, with `true`, `false` and `none`.
- Directly rendering a dict or a list may give different results in other implementations (for example, string entries might change from single-quoted ' to double-quoted "). Adding the `tojson` filter can help ensure consistency here.
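For example, here is a sketch of the same line written in the Python-only style and then portably:

```text
{# Python-only; breaks on JS or Rust Jinja implementations: #}
{{- message['content'].strip() }}

{# Portable equivalent using a Jinja filter: #}
{{- message['content'] | trim }}
```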
### Writing generation prompts
We mentioned above that `add_generation_prompt` is a special variable that will be accessible inside your template, and is controlled by the user setting the `add_generation_prompt` flag. If your model expects a header for assistant messages, then your template must support adding the header when `add_generation_prompt` is set.
Here is an example of a template that formats messages ChatML-style, with generation prompt support:
```text
{{- bos_token }}
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
```
The exact content of the assistant header will depend on your specific model, but it should always be the string that represents the start of an assistant message, so that if the user applies your template with `add_generation_prompt=True` and then generates text, the model will write an assistant response. Also note that some models do not need a generation prompt, because assistant messages always begin immediately after user messages. This is particularly common for LLaMA and Mistral models, where assistant messages begin immediately after the `[/INST]` token that ends user messages. In these cases, the template can ignore the `add_generation_prompt` flag.
Generation prompts are important! If your model requires a generation prompt but it is not set in the template, then model generations will likely be severely degraded, or the model may display unusual behavior like continuing the final user message!
### Writing and debugging larger templates
When this feature was introduced, most templates were quite small, the Jinja equivalent of a "one-liner" script. However, with new models and features like tool-use and RAG, some templates can be 100 lines long or more. When writing templates like these, it's a good idea to write them in a separate file, using a text editor. You can easily extract a chat template to a file:
```python
open("template.jinja", "w").write(tokenizer.chat_template)
```
Or load the edited template back into the tokenizer:
```python
tokenizer.chat_template = open("template.jinja").read()
```
كميزة إضافية، عندما تكتب قالبًا طويلاً متعدد الأسطر في ملف مُنفصل، ستتوافق أرقام الأسطر في هذا الملف تمامًا مع أرقام الأسطر في أخطاء تحليل القالب أو تنفيذه. سيُسهّل هذا كثيرًا تحديد مكان المشكلات.
### Writing templates for tools
Although chat templates do not enforce a specific API for tools (or for anything, really), we recommend that template authors try to stick to a standard API where possible. The whole point of chat templates is to allow code to be portable across models, so deviating from the standard tools API means users will have to write custom code to use tools with your model. Sometimes that's unavoidable, but often clever templating can make the standard API work!
Below, we list the elements of the standard API and give tips on writing templates that will work well with it.
#### Tool definitions
Your template should expect the variable `tools` to either be null (if no tools are passed) or a list of JSON schema dicts. Our chat template methods allow users to pass tools either as JSON schema or as Python functions, but when functions are passed, we automatically generate JSON schema and pass that to your template. As a result, the `tools` variable your template receives will always be a list of JSON schemas. Here is a sample tool JSON schema:
```json
{
"type": "function",
"function": {
"name": "multiply",
"description": "دالة تضرب عددين",
"parameters": {
"type": "object",
"properties": {
"a": {
"type": "number",
"description": "الرقم الأول للضرب"
},
"b": {
"type": "number",
"description": "الرقم الثاني للضرب"
}
},
"required": ["a", "b"]
}
}
}
```
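For reference, a schema like this is what gets generated when a user passes a plain Python function instead. The sketch below uses `get_json_schema`, the helper `transformers` uses for this conversion; treat the exact import path as an assumption and check the current docs:

```python
from transformers.utils import get_json_schema

def multiply(a: float, b: float):
    """
    A function that multiplies two numbers

    Args:
        a: The first number to multiply
        b: The second number to multiply
    """
    return a * b

# Produces a dict shaped like the JSON schema above, built from the
# function signature's type hints and the docstring.
print(get_json_schema(multiply))
```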
And here is some example code for handling tools in your chat template. Remember, this is just an example for one specific format - your model will probably need different formatting!
```text
{%- if tools %}
{%- for tool in tools %}
{{- '<tool>' + tool['function']['name'] + '\n' }}
{%- for argument in tool['function']['parameters']['properties'] %}
{{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
{%- endfor %}
{{- '\n</tool>' }}
{%- endfor %}
{%- endif %}
```
The specific tokens and tool descriptions your template renders should of course be chosen to match the ones your model was trained with. There is no requirement that your model understands JSON schema input, only that your template can translate JSON schema into your model's format. For example, Command-R was trained with tools defined using Python function headers, but the Command-R tool template accepts JSON schema, converts the types internally, and renders the input tools as Python headers. You can do a lot with templates!
#### Tool calls
Tool calls, if present, will be a list attached to a message with the "assistant" role. Note that `tool_calls` is always a list, even though most tool-calling models only support single tool calls at a time, which means the list will usually only have a single element. Here is a sample message dict containing a tool call:
```json
{
"role": "assistant",
"tool_calls": [
{
"type": "function",
"function": {
"name": "multiply",
"arguments": {
"a": 5,
"b": 6
}
}
}
]
}
```
And a common pattern for handling them would be something like this:
```text
{%- if message['role'] == 'assistant' and 'tool_calls' in message %}
{%- for tool_call in message['tool_calls'] %}
{{- '<tool_call>' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n</tool_call>' }}
{%- endfor %}
{%- endif %}
```
Again, you should render the tool call using the formatting and special tokens that your model expects.
#### Tool responses
Tool responses have a simple format: they are a message dict with the "tool" role, a "name" key giving the name of the called function, and a "content" key containing the result of the tool call. Here is a sample tool response:
```json
{
"role": "tool",
"name": "multiply",
"content": "30"
}
```
You don't need to use all of the keys in the tool response. For example, if your model doesn't expect the function name to be included in the tool response, rendering it can be as simple as:
```text
{%- if message['role'] == 'tool' %}
{{- "<tool_result>" + message['content'] + "</tool_result>" }}
{%- endif %}
```
Once again, remember that the actual formatting and special tokens are model-specific - you should take great care to ensure that tokens, whitespace and everything else exactly match the format your model was trained with!

# Community
This page regroups resources around 🤗 Transformers developed by the community.
## Community resources:
| Resource | Description | Author |
|:----------|:-------------|------:|
| [Hugging Face Transformers Glossary Flashcards](https://www.darigovresearch.com/huggingface-transformers-glossary-flashcards) | A set of flashcards based on the [Transformers Docs Glossary](glossary) that has been put into a form which can be easily learned/revised using [Anki](https://apps.ankiweb.net/), an open source, cross platform app specifically designed for long term knowledge retention. See this [Introductory video on how to use the flashcards](https://www.youtube.com/watch?v=Dji_7PILrw). | [Darigov Research](https://www.darigovresearch.com/) |
## Community notebooks:
| Notebook | Description | Author | |
|:----------|:-------------|:-------------|------:|
| [Fine-tune a pre-trained Transformer to generate lyrics](https://github.com/AlekseyKorshuk/huggingartists) | How to generate lyrics in the style of your favorite artist by fine-tuning a GPT-2 model | [Aleksey Korshuk](https://github.com/AlekseyKorshuk) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AlekseyKorshuk/huggingartists/blob/master/huggingartists-demo.ipynb) |
| [Train T5 in Tensorflow 2](https://github.com/snapthat/TF-T5-text-to-text) | How to train T5 for any task using Tensorflow 2. This notebook demonstrates a Question & Answer task implemented in Tensorflow 2 using SQUAD | [Muhammad Harris](https://github.com/HarrisDePerceptron) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snapthat/TF-T5-text-to-text/blob/master/snapthatT5/notebooks/TF-T5-Datasets%20Training.ipynb) |
| [Train T5 on TPU](https://github.com/patil-suraj/exploring-T5/blob/master/T5_on_TPU.ipynb) | How to train T5 on SQUAD with Transformers and Nlp | [Suraj Patil](https://github.com/patil-suraj) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/exploring-T5/blob/master/T5_on_TPU.ipynb#scrollTo=QLGiFCDqvuil) |
| [Fine-tune T5 for Classification and Multiple Choice](https://github.com/patil-suraj/exploring-T5/blob/master/t5_fine_tuning.ipynb) | How to fine-tune T5 for classification and multiple choice tasks using a text-to-text format with PyTorch Lightning | [Suraj Patil](https://github.com/patil-suraj) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/exploring-T5/blob/master/t5_fine_tuning.ipynb) |
| [Fine-tune DialoGPT on New Datasets and Languages](https://github.com/ncoop57/i-am-a-nerd/blob/master/_notebooks/2020-05-12-chatbot-part-1.ipynb) | How to fine-tune the DialoGPT model on a new dataset for open-dialog conversational chatbots | [Nathan Cooper](https://github.com/ncoop57) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ncoop57/i-am-a-nerd/blob/master/_notebooks/2020-05-12-chatbot-part-1.ipynb) |
| [Long Sequence Modeling with Reformer](https://github.com/patrickvonplaten/notebooks/blob/master/PyTorch_Reformer.ipynb) | How to train on sequences as long as 500,000 tokens with Reformer | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/PyTorch_Reformer.ipynb) |
| [Fine-tune BART for Summarization](https://github.com/ohmeow/ohmeow_website/blob/master/posts/2021-05-25-mbart-sequence-classification-with-blurr.ipynb) | How to fine-tune BART for summarization with fastai using blurr | [Wayde Gilliam](https://ohmeow.com/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ohmeow/ohmeow_website/blob/master/posts/2021-05-25-mbart-sequence-classification-with-blurr.ipynb) |
| [Fine-tune a pre-trained Transformer on anyone's tweets](https://colab.research.google.com/github/borisdayma/huggingtweets/blob/master/huggingtweets-demo.ipynb) | How to generate tweets in the style of your favorite Twitter account by fine-tuning a GPT-2 model | [Boris Dayma](https://github.com/borisdayma) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/borisdayma/huggingtweets/blob/master/huggingtweets-demo.ipynb) |
| [Optimize 🤗 Hugging Face models with Weights & Biases](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/huggingface/Optimize_Hugging_Face_models_with_Weights_%26_Biases.ipynb) | A complete tutorial showcasing W&B integration with Hugging Face | [Boris Dayma](https://github.com/borisdayma) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/huggingface/Optimize_Hugging_Face_models_with_Weights_%26_Biases.ipynb) |
| [Pretrain Longformer](https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb) | How to build a "long" version of existing pretrained models | [Iz Beltagy](https://beltagy.net) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb) |
| [Fine-tune Longformer for QA](https://github.com/patil-suraj/Notebooks/blob/master/longformer_qa_training.ipynb) | How to fine-tune a Longformer model for a QA task | [Suraj Patil](https://github.com/patil-suraj) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/longformer_qa_training.ipynb) |
| [Evaluate Model with 🤗nlp](https://github.com/patrickvonplaten/notebooks/blob/master/How_to_evaluate_Longformer_on_TriviaQA_using_NLP.ipynb) | How to evaluate a Longformer model on TriviaQA with `nlp` | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1m7eTGlPmLRgoPkkA7rkhQdZ9ydpmsdLE?usp=sharing) |
| [Fine-tune T5 for Sentiment Span Extraction](https://github.com/enzoampil/t5-intro/blob/master/t5_qa_training_pytorch_span_extraction.ipynb) | How to fine-tune T5 for sentiment span extraction using a text-to-text format with PyTorch Lightning | [Lorenzo Ampil](https://github.com/enzoampil) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/enzoampil/t5-intro/blob/master/t5_qa_training_pytorch_span_extraction.ipynb) |
| [Fine-tune DistilBert for Multiclass Classification](https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_multiclass_classification.ipynb) | How to fine-tune DistilBert for multiclass classification with PyTorch | [Abhishek Kumar Mishra](https://github.com/abhimishra91) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abhimishra91/transformers-tutorials/blob/master/transformers_multiclass_classification.ipynb) |
| [Fine-tune BERT for Multi-label Classification](https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_multi_label_classification.ipynb) | How to fine-tune BERT for multi-label classification using PyTorch | [Abhishek Kumar Mishra](https://github.com/abhimishra91) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abhimishra91/transformers-tutorials/blob/master/transformers_multi_label_classification.ipynb) |
| [Fine-tune T5 for Summarization](https://github.com/abhimishra91/transformers-tutorials/blob/master/transformers_summarization_wandb.ipynb) | How to fine-tune T5 for summarization in PyTorch and track experiments with WandB | [Abhishek Kumar Mishra](https://github.com/abhimishra91) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/abhimishra91/transformers-tutorials/blob/master/transformers_summarization_wandb.ipynb) |
| [Speed up Fine-Tuning in Transformers with Dynamic Padding / Bucketing](https://github.com/ELS-RD/transformers-notebook/blob/master/Divide_Hugging_Face_Transformers_training_time_by_2_or_more.ipynb) | How to speed up fine-tuning by a factor of 2 using dynamic padding / bucketing | [Michael Benesty](https://github.com/pommedeterresautee) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1CBfRU1zbfu7-ijiOqAAQUA-RJaxfcJoO?usp=sharing) |
| [Pretrain Reformer for Masked Language Modeling](https://github.com/patrickvonplaten/notebooks/blob/master/Reformer_For_Masked_LM.ipynb) | How to train a Reformer model with bi-directional self-attention layers | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1tzzh0i8PgDQGV3SMFUGxM7_gGae3K-uW?usp=sharing) |
| [Expand and Fine Tune Sci-BERT](https://github.com/lordtt13/word-embeddings/blob/master/COVID-19%20Research%20Data/COVID-SciBERT.ipynb) | How to increase the vocabulary of a pretrained SciBERT model from AllenAI on the CORD dataset and pipeline it. | [Tanmay Thakur](https://github.com/lordtt13) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1rqAR40goxbAfez1xvF3hBJphSCsvXmh8) |
| [Fine Tune BlenderBotSmall for Summarization using the Trainer API](https://github.com/lordtt13/transformers-experiments/blob/master/Custom%20Tasks/fine-tune-blenderbot_small-for-summarization.ipynb) | How to fine-tune BlenderBotSmall for summarization on a custom dataset, using the Trainer API. | [Tanmay Thakur](https://github.com/lordtt13) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/19Wmupuls7mykSGyRN_Qo6lPQhgp56ymq?usp=sharing) |
| [Fine-tune Electra and interpret with Integrated Gradients](https://github.com/elsanns/xai-nlp-notebooks/blob/master/electra_fine_tune_interpret_captum_ig.ipynb) | How to fine-tune Electra for sentiment analysis and interpret predictions with Captum Integrated Gradients | [Eliza Szczechla](https://elsanns.github.io) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elsanns/xai-nlp-notebooks/blob/master/electra_fine_tune_interpret_captum_ig.ipynb) |
| [fine-tune a non-English GPT-2 Model with Trainer class](https://github.com/philschmid/fine-tune-GPT-2/blob/master/Fine_tune_a_non_English_GPT_2_Model_with_Huggingface.ipynb) | How to fine-tune a non-English GPT-2 model with the Trainer class | [Philipp Schmid](https://www.philschmid.de) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/philschmid/fine-tune-GPT-2/blob/master/Fine_tune_a_non_English_GPT_2_Model_with_Huggingface.ipynb) |
| [Fine-tune a DistilBERT Model for Multi Label Classification task](https://github.com/DhavalTaunk08/Transformers_scripts/blob/master/Transformers_multilabel_distilbert.ipynb) | How to fine-tune a DistilBERT model for the multi-label classification task | [Dhaval Taunk](https://github.com/DhavalTaunk08) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhavalTaunk08/Transformers_scripts/blob/master/Transformers_multilabel_distilbert.ipynb) |
| [Fine-tune ALBERT for sentence-pair classification](https://github.com/NadirEM/nlp-notebooks/blob/master/Fine_tune_ALBERT_sentence_pair_classification.ipynb) | How to fine-tune an ALBERT model or another BERT-based model for the sentence-pair classification task | [Nadir El Manouzi](https://github.com/NadirEM) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NadirEM/nlp-notebooks/blob/master/Fine_tune_ALBERT_sentence_pair_classification.ipynb) |
| [Fine-tune Roberta for sentiment analysis](https://github.com/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb) | How to fine-tune a Roberta model for sentiment analysis | [Dhaval Taunk](https://github.com/DhavalTaunk08) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhavalTaunk08/NLP_scripts/blob/master/sentiment_analysis_using_roberta.ipynb) |
| [Evaluating Question Generation Models](https://github.com/flexudy-pipe/qugeev) | How accurate are the answers to questions generated by your seq2seq transformer model? | [Pascal Zoleko](https://github.com/zolekode) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1bpsSqCQU-iw_5nNoRm_crPq6FRuJthq_?usp=sharing) |
| [Classify text with DistilBERT and Tensorflow](https://github.com/peterbayerle/huggingface_notebook/blob/main/distilbert_tf.ipynb) | How to fine-tune DistilBERT for text classification in TensorFlow | [Peter Bayerle](https://github.com/peterbayerle) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/peterbayerle/huggingface_notebook/blob/main/distilbert_tf.ipynb) |
| [Leverage BERT for Encoder-Decoder Summarization on CNN/Dailymail](https://github.com/patrickvonplaten/notebooks/blob/master/BERT2BERT_for_CNN_Dailymail.ipynb) | How to warm-start an *EncoderDecoderModel* with a *google-bert/bert-base-uncased* checkpoint for summarization on CNN/Dailymail | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/BERT2BERT_for_CNN_Dailymail.ipynb) |
| [Leverage RoBERTa for Encoder-Decoder Summarization on BBC XSum](https://github.com/patrickvonplaten/notebooks/blob/master/RoBERTaShared_for_BBC_XSum.ipynb) | How to warm-start a shared *EncoderDecoderModel* with a *FacebookAI/roberta-base* checkpoint for summarization on BBC/XSum | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/RoBERTaShared_for_BBC_XSum.ipynb) |
| [Fine-tune TAPAS on Sequential Question Answering (SQA)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Fine_tuning_TapasForQuestionAnswering_on_SQA.ipynb) | How to fine-tune *TapasForQuestionAnswering* with a *tapas-base* checkpoint on the Sequential Question Answering (SQA) dataset | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Fine_tuning_TapasForQuestionAnswering_on_SQA.ipynb) |
| [Evaluate TAPAS on Table Fact Checking (TabFact)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Evaluating_TAPAS_on_the_Tabfact_test_set.ipynb) | How to evaluate a fine-tuned *TapasForSequenceClassification* with a *tapas-base-finetuned-tabfact* checkpoint using a combination of the 🤗 datasets and 🤗 transformers libraries | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Evaluating_TAPAS_on_the_Tabfact_test_set.ipynb) |
| [Fine-tuning mBART for translation](https://colab.research.google.com/github/vasudevgupta7/huggingface-tutorials/blob/main/translation_training.ipynb) | How to fine-tune mBART using Seq2SeqTrainer for Hindi to English translation | [Vasudev Gupta](https://github.com/vasudevgupta7) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vasudevgupta7/huggingface-tutorials/blob/main/translation_training.ipynb) |
| [Fine-tune LayoutLM on FUNSD (a form understanding dataset)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForTokenClassification_on_FUNSD.ipynb) | How to fine-tune *LayoutLMForTokenClassification* on the FUNSD dataset for information extraction from scanned documents | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForTokenClassification_on_FUNSD.ipynb) |
| [Fine-Tune DistilGPT2 and Generate Text](https://colab.research.google.com/github/tripathiaakash/DistilGPT2-Tutorial/blob/main/distilgpt2_fine_tuning.ipynb) | How to fine-tune DistilGPT2 and generate text | [Aakash Tripathi](https://github.com/tripathiaakash) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/tripathiaakash/DistilGPT2-Tutorial/blob/main/distilgpt2_fine_tuning.ipynb) |
| [Fine-Tune LED on up to 8K tokens](https://github.com/patrickvonplaten/notebooks/blob/master/Fine_tune_Longformer_Encoder_Decoder_(LED)_for_Summarization_on_pubmed.ipynb) | How to fine-tune LED on pubmed for long-range summarization | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Fine_tune_Longformer_Encoder_Decoder_(LED)_for_Summarization_on_pubmed.ipynb) |
| [Evaluate LED on Arxiv](https://github.com/patrickvonplaten/notebooks/blob/master/LED_on_Arxiv.ipynb) | How to effectively evaluate LED on long-range summarization | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/LED_on_Arxiv.ipynb) |
| [Fine-tune LayoutLM on RVL-CDIP (a document image classification dataset)](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForSequenceClassification_on_RVL_CDIP.ipynb) | How to fine-tune *LayoutLMForSequenceClassification* on the RVL-CDIP dataset for scanned document classification | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForSequenceClassification_on_RVL_CDIP.ipynb) |
| [Wav2Vec2 CTC decoding with GPT2 adjustment](https://github.com/voidful/huggingface_notebook/blob/main/xlsr_gpt.ipynb) | How to decode a CTC sequence with language model adjustment | [Eric Lam](https://github.com/voidful) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1e_zQHYbO2YKEaUgzb1ww1WwiAyydAj?usp=sharing) |
| [Fine-tune BART for summarization in two languages with Trainer class](https://github.com/elsanns/xai-nlp-notebooks/blob/master/fine_tune_bart_summarization_two_langs.ipynb) | How to fine-tune BART for summarization in two languages with the Trainer class | [Eliza Szczechla](https://github.com/elsanns) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elsanns/xai-nlp-notebooks/blob/master/fine_tune_bart_summarization_two_langs.ipynb) |
| [Evaluate Big Bird on Trivia QA](https://github.com/patrickvonplaten/notebooks/blob/master/Evaluating_Big_Bird_on_TriviaQA.ipynb) | How to evaluate BigBird on long document question answering on Trivia QA | [Patrick von Platen](https://github.com/patrickvonplaten) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Evaluating_Big_Bird_on_TriviaQA.ipynb) |
| [Create video captions using Wav2Vec2](https://github.com/Muennighoff/ytclipcc/blob/main/wav2vec_youtube_captions.ipynb) | How to create YouTube captions from any video by transcribing the audio with Wav2Vec | [Niklas Muennighoff](https://github.com/Muennighoff) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Muennighoff/ytclipcc/blob/main/wav2vec_youtube_captions.ipynb) |
| [Fine-tune the Vision Transformer on CIFAR-10 using PyTorch Lightning](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_PyTorch_Lightning.ipynb) | How to fine-tune the Vision Transformer (ViT) on CIFAR-10 using HuggingFace Transformers, Datasets and PyTorch Lightning | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_PyTorch_Lightning.ipynb) |
| [Fine-tune the Vision Transformer on CIFAR-10 using the 🤗 Trainer](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_the_%F0%9F%A4%97_Trainer.ipynb) | How to fine-tune the Vision Transformer (ViT) on CIFAR-10 using HuggingFace Transformers, Datasets and the 🤗 Trainer | [Niels Rogge](https://github.com/nielsrogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/VisionTransformer/Fine_tuning_the_Vision_Transformer_on_CIFAR_10_with_the_%F0%9F%A4%97_Trainer.ipynb) |
| [Evaluate LUKE on Open Entity, an entity typing dataset](https://github.com/studio-ousia/luke/blob/master/notebooks/huggingface_open_entity.ipynb) | How to evaluate *LukeForEntityClassification* on the Open Entity dataset | [Ikuya Yamada](https://github.com/ikuyamada) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/studio-ousia/luke/blob/master/notebooks/huggingface_open_entity.ipynb) |
| [Evaluate LUKE on TACRED, a relation extraction dataset](https://github.com/studio-ousia/luke/blob/master/notebooks/huggingface_tacred.ipynb) | How to evaluate *LukeForEntityPairClassification* on the TACRED dataset | [Ikuya Yamada](https://github.com/ikuyamada) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/studio-ousia/luke/blob/master/notebooks/huggingface_tacred.ipynb) |
| [Evaluate LUKE on CoNLL-2003, an important NER benchmark](https://github.com/studio-ousia/luke/blob/master/notebooks/huggingface_conll_2003.ipynb) | How to evaluate *LukeForEntitySpanClassification* on the CoNLL-2003 dataset | [Ikuya Yamada](https://github.com/ikuyamada) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/studio-ousia/luke/blob/master/notebooks/huggingface_conll_2003.ipynb) |
| [Evaluate BigBird-Pegasus on PubMed dataset](https://github.com/vasudevgupta7/bigbird/blob/main/notebooks/bigbird_pegasus_evaluation.ipynb) | How to evaluate *BigBirdPegasusForConditionalGeneration* on the PubMed dataset | [Vasudev Gupta](https://github.com/vasudevgupta7) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vasudevgupta7/bigbird/blob/main/notebooks/bigbird_pegasus_evaluation.ipynb) |
| [Speech Emotion Classification with Wav2Vec2](https://github.com/m3hrdadfi/soxan/blob/main/notebooks/Emotion_recognition_in_Greek_speech_using_Wav2Vec2.ipynb) | How to leverage a pretrained Wav2Vec2 model for emotion classification on the MEGA dataset | [Mehrdad Farahani](https://github.com/m3hrdadfi) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/m3hrdadfi/soxan/blob/main/notebooks/Emotion_recognition_in_Greek_speech_using_Wav2Vec2.ipynb) |
| [Detect objects in an image with DETR](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/DETR/DETR_minimal_example_(with_DetrFeatureExtractor).ipynb) | How to use a trained *DetrForObjectDetection* model to detect objects in an image and visualize attention | [Niels Rogge](https://github.com/NielsRogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/DETR/DETR_minimal_example_(with_DetrFeatureExtractor).ipynb) |
| [Fine-tune DETR on a custom object detection dataset](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/DETR/Fine_tuning_DetrForObjectDetection_on_custom_dataset_(balloon).ipynb) | How to fine-tune *DetrForObjectDetection* on a custom object detection dataset | [Niels Rogge](https://github.com/NielsRogge) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/DETR/Fine_tuning_DetrForObjectDetection_on_custom_dataset_(balloon).ipynb) |
| [Finetune T5 for Named Entity Recognition](https://github.com/ToluClassics/Notebooks/blob/main/T5_Ner_Finetuning.ipynb) | How to fine-tune *T5* on a named entity recognition task | [Ogundepo Odunayo](https://github.com/ToluClassics) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1obr78FY_cBmWY5ODViCmzdY6O1KB65Vc?usp=sharing) |
| [Fine-Tuning Open-Source LLM using QLoRA with MLflow and PEFT](https://github.com/mlflow/mlflow/blob/master/docs/source/llms/transformers/tutorials/fine-tuning/transformers-peft.ipynb) | How to use [QLoRA](https://github.com/artidoro/qlora) and [PEFT](https://huggingface.co/docs/peft/en/index) to fine-tune an LLM in a memory-efficient way, while using [MLflow](https://mlflow.org/docs/latest/llms/transformers/index.html) to manage experiment tracking | [Yuki Watanabe](https://github.com/B-Step62) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mlflow/mlflow/blob/master/docs/source/llms/transformers/tutorials/fine-tuning/transformers-peft.ipynb) |

# Create a custom architecture
An [`AutoClass`](model_doc/auto) automatically infers the model architecture and downloads pretrained configuration and weights. Generally, we recommend using an `AutoClass` to produce checkpoint-agnostic code. But users who want more control over specific model parameters can create a custom 🤗 Transformers model from just a few base classes. This could be particularly useful for anyone interested in studying, training or experimenting with a 🤗 Transformers model. In this guide, we dive deeper into creating a custom model without an `AutoClass`. Learn how to:
- Load and customize a model configuration.
- Create a model architecture.
- Create a slow and fast tokenizer for text.
- Create an image processor for vision tasks.
- Create a feature extractor for audio tasks.
- Create a processor for multimodal tasks.
## Configuration
A [configuration](main_classes/configuration) refers to a model's specific attributes. Each model configuration has different attributes; for instance, all NLP models share the `hidden_size`, `num_attention_heads`, `num_hidden_layers` and `vocab_size` attributes. These attributes specify things like the number of attention heads or hidden layers to construct a model with.
Get a closer look at [DistilBERT](model_doc/distilbert) by accessing [`DistilBertConfig`] to inspect its attributes:
```py
>>> from transformers import DistilBertConfig
>>> config = DistilBertConfig()
>>> print(config)
DistilBertConfig {
"activation": "gelu",
"attention_dropout": 0.1,
"dim": 768,
"dropout": 0.1,
"hidden_dim": 3072,
"initializer_range": 0.02,
"max_position_embeddings": 512,
"model_type": "distilbert",
"n_heads": 12,
"n_layers": 6,
"pad_token_id": 0,
"qa_dropout": 0.1,
"seq_classif_dropout": 0.2,
"sinusoidal_pos_embds": false,
"transformers_version": "4.16.2",
"vocab_size": 30522
}
```
[`DistilBertConfig`] displays all the default attributes used to build a base [`DistilBertModel`]. All attributes are customizable, creating space for experimentation. For example, you can customize a default model to:
- Try a different activation function with the `activation` parameter.
- Use a higher dropout ratio for the attention probabilities with the `attention_dropout` parameter.
```py
>>> my_config = DistilBertConfig(activation="relu", attention_dropout=0.4)
>>> print(my_config)
DistilBertConfig {
"activation": "relu",
"attention_dropout": 0.4,
```
Pretrained model attributes can be modified in the [`~PretrainedConfig.from_pretrained`] function:
```py
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
```
Once you are satisfied with your model configuration, you can save it with [`~PretrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory:
```py
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
```
To reuse the configuration file, load it with [`~PretrainedConfig.from_pretrained`]:
```py
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
```
<Tip>
You can also save your configuration file as a dictionary, or even as just the difference between your custom configuration attributes and the default configuration attributes! See the [configuration](main_classes/configuration) documentation for more details.
</Tip>
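For instance, a minimal sketch of those dictionary-based alternatives (`to_dict` and `to_diff_dict` are standard [`PretrainedConfig`] methods):

```py
>>> full_dict = my_config.to_dict()        # every attribute, defaults included
>>> diff_dict = my_config.to_diff_dict()   # only what differs from the defaults
>>> "activation" in diff_dict              # we changed it to "relu" above
True
```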
## Model
The next step is to create a [model](main_classes/models). The model - also loosely referred to as the architecture - defines what each layer is doing and what operations are happening. Attributes like `num_hidden_layers` from the configuration are used to define the architecture. Every model shares the base class [`PreTrainedModel`] and a few common methods like resizing input embeddings and pruning self-attention heads. In addition, all models are also a subclass of either [`torch.nn.Module`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html), [`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model) or [`flax.linen.Module`](https://flax.readthedocs.io/en/latest/api_reference/flax.linen/module.html). This means models are compatible with each of their respective framework's usage.
<frameworkcontent>
<pt>
Load your custom configuration attributes into the model:
```py
>>> from transformers import DistilBertModel
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
>>> model = DistilBertModel(my_config)
```
This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.
Create a pretrained model with [`~PreTrainedModel.from_pretrained`]:
```py
>>> model = DistilBertModel.from_pretrained("distilbert/distilbert-base-uncased")
```
When you load pretrained weights, the default model configuration is automatically loaded if the model is provided by 🤗 Transformers. However, you can still replace - some or all of - the default model configuration attributes with your own:
```py
>>> model = DistilBertModel.from_pretrained("distilbert/distilbert-base-uncased", config=my_config)
```
</pt>
<tf>
Load your custom configuration attributes into the model:
```py
>>> from transformers import TFDistilBertModel
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")
>>> tf_model = TFDistilBertModel(my_config)
```
This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.
Create a pretrained model with [`~TFPreTrainedModel.from_pretrained`]:
```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert/distilbert-base-uncased")
```
When you load pretrained weights, the default model configuration is automatically loaded if the model is provided by 🤗 Transformers. However, you can still replace - some or all of - the default model configuration attributes with your own:
```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert/distilbert-base-uncased", config=my_config)
```
</tf>
</frameworkcontent>
### Model heads
At this point, you have a base DistilBERT model which outputs the *hidden states*. The hidden states are passed as inputs to a model head to produce the final output. 🤗 Transformers provides a different model head for each task, as long as the model supports the task (i.e., you can't use DistilBERT for a sequence-to-sequence task like translation).
<frameworkcontent>
<pt>
For example, [`DistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.
```py
>>> from transformers import DistilBertForSequenceClassification
>>> model = DistilBertForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased")
```
Easily reuse this checkpoint for another task by switching to a different model head. For a question answering task, you would use the [`DistilBertForQuestionAnswering`] model head. The question answering head is similar to the sequence classification head, except it is a linear layer on top of the hidden states output.
```py
>>> from transformers import DistilBertForQuestionAnswering
>>> model = DistilBertForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
</pt>
<tf>
For example, [`TFDistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.
```py
>>> from transformers import TFDistilBertForSequenceClassification
>>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased")
```
Easily reuse this checkpoint for another task by switching to a different model head. For a question answering task, you would use the [`TFDistilBertForQuestionAnswering`] model head. The question answering head is similar to the sequence classification head, except it is a linear layer on top of the hidden states output.
```py
>>> from transformers import TFDistilBertForQuestionAnswering
>>> tf_model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
</tf>
</frameworkcontent>
## Tokenizer
The last base class you need before using a model for textual data is a [tokenizer](main_classes/tokenizer) to convert raw text to tensors. There are two types of tokenizers you can use with 🤗 Transformers:
- [`PreTrainedTokenizer`]: a Python implementation of a tokenizer.
- [`PreTrainedTokenizerFast`]: a tokenizer from our Rust-based [🤗 Tokenizer](https://huggingface.co/docs/tokenizers/python/latest/) library. This tokenizer type is significantly faster - especially during batch tokenization - thanks to its Rust implementation. The fast tokenizer also offers additional methods like *offset mapping*, which maps tokens to their original words or characters.
Both tokenizer types support common methods such as encoding and decoding, adding new tokens, and managing special tokens.
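As an illustration of one of those fast-only extras, offset mapping is exposed through the `return_offsets_mapping` argument, which only fast tokenizers accept (a small sketch):

```py
>>> from transformers import AutoTokenizer

>>> fast_tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
>>> encoding = fast_tokenizer("Hello world", return_offsets_mapping=True)
>>> encoding["offset_mapping"]  # (start, end) character spans, one per token
[(0, 0), (0, 5), (6, 11), (0, 0)]
```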
<Tip warning={true}>
Not every model supports a fast tokenizer. Take a look at this [table](index#supported-frameworks) to check whether a model has fast tokenizer support.
</Tip>
If you trained your own tokenizer, you can create one from your *vocabulary* file:
```py
>>> from transformers import DistilBertTokenizer
>>> my_tokenizer = DistilBertTokenizer(vocab_file="my_vocab_file.txt", do_lower_case=False, padding_side="left")
```
It is important to remember that the vocabulary of a custom tokenizer will be different from the vocabulary generated by a pretrained model's tokenizer. You need to use a pretrained model's vocabulary if you are using a pretrained model, otherwise the inputs won't make sense. Create a tokenizer with a pretrained model's vocabulary with the [`DistilBertTokenizer`] class:
```py
>>> from transformers import DistilBertTokenizer
>>> slow_tokenizer = DistilBertTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
Create a fast tokenizer with the [`DistilBertTokenizerFast`] class:
```py
>>> from transformers import DistilBertTokenizerFast
>>> fast_tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert/distilbert-base-uncased")
```
<Tip>
By default, [`AutoTokenizer`] will try to load a fast tokenizer. You can disable this behavior by setting `use_fast=False` in `from_pretrained`.
</Tip>
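A short sketch of both paths:

```py
>>> from transformers import AutoTokenizer

>>> # Default: the Rust-backed fast tokenizer, when the model has one.
>>> fast = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")

>>> # Opt out to get the pure-Python implementation instead.
>>> slow = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased", use_fast=False)
```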
## Image processor
An image processor processes vision inputs. It inherits from the base [`~image_processing_utils.ImageProcessingMixin`] class.
To use one, create an image processor associated with the model you're using. For example, create a default [`ViTImageProcessor`] if you are using [ViT](model_doc/vit) for image classification:
```py
>>> from transformers import ViTImageProcessor
>>> vit_extractor = ViTImageProcessor()
>>> print(vit_extractor)
ViTImageProcessor {
"do_normalize": true,
"do_resize": true,
"image_processor_type": "ViTImageProcessor",
"image_mean": [
0.5,
0.5,
0.5
],
"image_std": [
0.5,
0.5,
0.5
],
"resample": 2,
"size": 224
}
```
<Tip>
If you do not need any customization, just use the `from_pretrained` method to load a model's default image processor parameters.
</Tip>
Modify any of the [`ViTImageProcessor`] parameters to create your custom image processor:
```py
>>> from transformers import ViTImageProcessor
>>> my_vit_extractor = ViTImageProcessor(resample="PIL.Image.BOX", do_normalize=False, image_mean=[0.3, 0.3, 0.3])
>>> print(my_vit_extractor)
ViTImageProcessor {
"do_normalize": false,
"do_resize": true,
"image_processor_type": "ViTImageProcessor",
"image_mean": [
0.3,
0.3,
0.3
],
"image_std": [
0.5,
0.5,
0.5
],
"resample": "PIL.Image.BOX",
"size": 224
}
```
## Backbone
<div style="text-align: center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Backbone.png">
</div>
Computer vision models consist of a backbone, a neck, and a head. The backbone extracts features from an input image, the neck combines and enhances the extracted features, and the head is used for the main task (e.g., object detection). Start by initializing a backbone in the model config, and specify whether you want to load pretrained weights or randomly initialized weights. Then you can pass the model config to the model head.
For example, to load a [ResNet](../model_doc/resnet) backbone into a [MaskFormer](../model_doc/maskformer) model with an instance segmentation head:
<hfoptions id="backbone">
<hfoption id="pretrained weights">
Set `use_pretrained_backbone=True` to load pretrained ResNet weights for the backbone.
```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="microsoft/resnet-50", use_pretrained_backbone=True) # تكوين الجزء الأساسي والجزء الوسيط
model = MaskFormerForInstanceSegmentation(config) # جزء المعالجة النهائي
```
</hfoption>
<hfoption id="random weights">
Set `use_pretrained_backbone=False` to randomly initialize the ResNet backbone.
```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="microsoft/resnet-50", use_pretrained_backbone=False) # تكوين الجزء الأساسي والجزء الوسيط
model = MaskFormerForInstanceSegmentation(config) # جزء المعالجة النهائي
```
You can also load the backbone config separately and then pass it to the model config.
```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation, ResNetConfig
backbone_config = ResNetConfig()
config = MaskFormerConfig(backbone_config=backbone_config)
model = MaskFormerForInstanceSegmentation(config)
```
</hfoption>
<hfoption id="timm backbone">
[timm](https://hf.co/docs/timm/index) models are loaded within a model with `use_timm_backbone=True` or with [`TimmBackbone`] and [`TimmBackboneConfig`].
Use `use_timm_backbone=True` and `use_pretrained_backbone=True` to load pretrained timm weights for the backbone.
```python
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="resnet50", use_pretrained_backbone=True, use_timm_backbone=True) # تكوين الجزء الأساسي والجزء الوسيط
model = MaskFormerForInstanceSegmentation(config) # جزء المعالجة النهائي
```
Set `use_timm_backbone=True` and `use_pretrained_backbone=False` to load a randomly initialized timm backbone.
```python
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="resnet50", use_pretrained_backbone=False, use_timm_backbone=True) # تكوين الجزء الأساسي والجزء الوسيط
model = MaskFormerForInstanceSegmentation(config) # جزء المعالجة النهائي
```
You can also load the backbone config and use it to create a `TimmBackbone`, or pass it to the model config. Timm backbones will load pretrained weights by default. Set `use_pretrained_backbone=False` to load randomly initialized weights instead.
```python
from transformers import TimmBackboneConfig, TimmBackbone
backbone_config = TimmBackboneConfig("resnet50", use_pretrained_backbone=False)
# Create an instance of the backbone
backbone = TimmBackbone(config=backbone_config)
# Create a model with a timm backbone
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone_config=backbone_config)
model = MaskFormerForInstanceSegmentation(config)
```
</hfoption>
</hfoptions>
## Feature extractor
A feature extractor processes audio inputs. It inherits from the base [`~feature_extraction_utils.FeatureExtractionMixin`] class, and may also inherit from the [`SequenceFeatureExtractor`] class for processing audio inputs.
To use one, create a feature extractor associated with the model you're using. For example, create the default [`Wav2Vec2FeatureExtractor`] if you are using [Wav2Vec2](model_doc/wav2vec2) for audio classification:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> w2v2_extractor = Wav2Vec2FeatureExtractor()
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
"do_normalize": true,
"feature_extractor_type": "Wav2Vec2FeatureExtractor",
"feature_size": 1,
"padding_side": "right",
"padding_value": 0.0,
"return_attention_mask": false,
"sampling_rate": 16000
}
```
<Tip>
If you do not need any customization, just use the `from_pretrained` method to load a model's default feature extractor parameters.
</Tip>
Modify any of the [`Wav2Vec2FeatureExtractor`] parameters to create your custom feature extractor:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> w2v2_extractor = Wav2Vec2FeatureExtractor(sampling_rate=8000, do_normalize=False)
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
"do_normalize": false,
"feature_extractor_type": "Wav2Vec2FeatureExtractor"،
"feature_size": 1،
"padding_side": "right"،
"padding_value": 0.0،
"return_attention_mask": false،
"sampling_rate": 8000
}
```
## Processor
For models that support multimodal tasks, 🤗 Transformers offers a processor class that conveniently wraps processing classes such as a feature extractor and a tokenizer into a single object. For example, let's use the [`Wav2Vec2Processor`] for an automatic speech recognition (ASR) task. ASR transcribes audio to text, so you will need a feature extractor and a tokenizer.
Create a feature extractor to handle the audio inputs:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> feature_extractor = Wav2Vec2FeatureExtractor(padding_value=1.0, do_normalize=True)
```
Create a tokenizer to handle the text inputs:
```py
>>> from transformers import Wav2Vec2CTCTokenizer
>>> tokenizer = Wav2Vec2CTCTokenizer(vocab_file="my_vocab_file.txt")
```
Combine the feature extractor and tokenizer in [`Wav2Vec2Processor`]:
```py
>>> from transformers import Wav2Vec2Processor
>>> processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)
```
With two basic classes - configuration and model - plus an additional preprocessing class (tokenizer, image processor, feature extractor, or processor), you can create any of the models supported by 🤗 Transformers. Each of these base classes is configurable, allowing you to use the specific attributes you want. You can easily set up a model for training or modify an existing pretrained model for fine-tuning.
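As a closing sketch, here is how those pieces fit together end to end (a toy setup with random weights, not a recipe for a real training run):

```py
>>> from transformers import DistilBertConfig, DistilBertModel, DistilBertTokenizerFast

>>> config = DistilBertConfig(n_layers=4)  # a customized, smaller architecture
>>> model = DistilBertModel(config)        # randomly initialized, ready to train
>>> tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert/distilbert-base-uncased")

>>> inputs = tokenizer("Hello!", return_tensors="pt")
>>> outputs = model(**inputs)              # hidden states from the custom model
```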

# Building custom models
The 🤗 Transformers library is designed to be easily extensible. Every model is fully coded in a given subfolder of the repository with no abstraction, so you can easily copy a modeling file and tweak it to your needs.
If you are writing a brand new model, it might be easier to start from scratch. In this tutorial, we will show you how to write a custom model and its configuration so that it can be used within Transformers, and how to share it with the community (with the code it relies on) so that anyone can use it, even if it's not present in the 🤗 Transformers library. We'll see how to build upon transformers and extend the framework with your own hooks and custom code.
We will illustrate all of this on a ResNet model, by wrapping the ResNet class of the [timm library](https://github.com/rwightman/pytorch-image-models) into a [`PreTrainedModel`].
## Writing a custom configuration
Let's start by writing the model configuration. The configuration of a model is an object that holds all the information needed to build the model. As we will see later, the model needs a `config` object to be initialized, so that object really needs to be as complete as possible.
<Tip>
Models in the `transformers` library follow the convention of accepting a `config` object in their `__init__` method, and then passing the whole `config` to sub-layers in the model, rather than breaking the config object into multiple arguments. Writing your model in this style results in simpler code, with a clear "source of truth" for any hyperparameters, and makes it easier to reuse code from other models in `transformers`.
</Tip>
In our example, we will take a couple of arguments of the ResNet class that we might want to tweak. Different configurations will then give us the different types of ResNets that are possible. We then store those arguments, after checking the validity of a few of them.
```python
from transformers import PretrainedConfig
from typing import List
class ResnetConfig(PretrainedConfig):
model_type = "resnet"
def __init__(
self,
block_type="bottleneck",
layers: List[int] = [3, 4, 6, 3],
num_classes: int = 1000,
input_channels: int = 3,
cardinality: int = 1,
base_width: int = 64,
stem_width: int = 64,
stem_type: str = "",
avg_down: bool = False,
**kwargs,
):
if block_type not in ["basic", "bottleneck"]:
raise ValueError(f"`block_type` must be 'basic' or bottleneck', got {block_type}.")
if stem_type not in ["", "deep", "deep-tiered"]:
raise ValueError(f"`stem_type` must be '', 'deep' or 'deep-tiered', got {stem_type}.")
self.block_type = block_type
self.layers = layers
self.num_classes = num_classes
self.input_channels = input_channels
self.cardinality = cardinality
self.base_width = base_width
self.stem_width = stem_width
self.stem_type = stem_type
self.avg_down = avg_down
super().__init__(**kwargs)
```
The three important things to remember when writing your own configuration are:
- you have to inherit from `PretrainedConfig`,
- the `__init__` of your `PretrainedConfig` must accept any kwargs,
- those kwargs need to be passed along to the superclass `__init__`.
The inheritance is there to make sure you get all the functionality from the 🤗 Transformers library, while the other two constraints come from the fact that a `PretrainedConfig` has more fields than the ones you are setting. When reloading a config with the `from_pretrained` method, those fields need to be accepted by your config and then sent along to the superclass.
Defining a `model_type` for your configuration (here `model_type="resnet"`) is not mandatory, unless you want to register your model with the auto classes (see the last section).
With this done, you can easily create and save your configuration like you would do with any other model config in the library. Here is how we can create a resnet50d config and save it:
```py
resnet50d_config = ResnetConfig(block_type="bottleneck", stem_width=32, stem_type="deep", avg_down=True)
resnet50d_config.save_pretrained("custom-resnet")
```
This will save a file named `config.json` inside the folder `custom-resnet`. You can then reload your config with the `from_pretrained` method:
```py
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
```
You can also use any other method of the [`PretrainedConfig`] class, like [`~PretrainedConfig.push_to_hub`], to directly upload your config to the Hub.
## Writing a custom model
Now that we have our ResNet configuration, we can go on to write two models: one that extracts the hidden features from a batch of images (like [`BertModel`]) and one that is suitable for image classification (like [`BertForSequenceClassification`]).
As mentioned before, we'll only write a loose wrapper of the model to keep things simple for this example. The only step required before writing this class is a mapping between the block types and the actual block classes. Then the model is defined from the configuration by passing everything to the `ResNet` class:
```py
from transformers import PreTrainedModel
from timm.models.resnet import BasicBlock, Bottleneck, ResNet
from .configuration_resnet import ResnetConfig
BLOCK_MAPPING = {"basic": BasicBlock, "bottleneck": Bottleneck}
class ResnetModel(PreTrainedModel):
config_class = ResnetConfig
def __init__(self, config):
super().__init__(config)
block_layer = BLOCK_MAPPING[config.block_type]
self.model = ResNet(
block_layer,
config.layers,
num_classes=config.num_classes,
in_chans=config.input_channels,
cardinality=config.cardinality,
base_width=config.base_width,
stem_width=config.stem_width,
stem_type=config.stem_type,
avg_down=config.avg_down,
)
def forward(self, tensor):
return self.model.forward_features(tensor)
```
For the model that will classify images, we just change the forward method:
```py
import torch
class ResnetModelForImageClassification(PreTrainedModel):
config_class = ResnetConfig
def __init__(self, config):
super().__init__(config)
block_layer = BLOCK_MAPPING[config.block_type]
self.model = ResNet(
block_layer,
config.layers,
num_classes=config.num_classes,
in_chans=config.input_channels,
cardinality=config.cardinality,
base_width=config.base_width,
stem_width=config.stem_width,
stem_type=config.stem_type,
avg_down=config.avg_down,
)
def forward(self, tensor, labels=None):
logits = self.model(tensor)
if labels is not None:
            loss = torch.nn.functional.cross_entropy(logits, labels)
return {"loss": loss, "logits": logits}
return {"logits": logits}
```
In both cases, notice how we inherit from `PreTrainedModel` and call the superclass initialization with the `config` (a bit like when you write a regular `torch.nn.Module`). The line that sets `config_class` is not mandatory, unless you want to register your model with the auto classes (see the last section).
<Tip>
If your model is very similar to a model inside the library, you can re-use the same configuration as that model.
</Tip>
Your model can return anything you want, but returning a dictionary like we did for `ResnetModelForImageClassification`, with the loss included when labels are passed, will make your model directly usable inside the [`Trainer`] class. Using another output format is fine, as long as you are planning on using your own training loop or another library for training.
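To make that concrete, here is a hedged sketch of plugging the model into [`Trainer`]; the dataset name is a placeholder, not part of this tutorial:

```py
from transformers import Trainer, TrainingArguments

# `my_image_dataset` is a placeholder: each item is assumed to be
# {"tensor": <image tensor>, "labels": <class id>} so the batch keys line up
# with ResnetModelForImageClassification.forward defined above.
trainer = Trainer(
    model=resnet50d,
    args=TrainingArguments(output_dir="resnet50d-finetuned"),
    train_dataset=my_image_dataset,
)
trainer.train()  # Trainer reads the "loss" key our model returns
```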
Now that we have our model class, let's create one:
```py
resnet50d = ResnetModelForImageClassification(resnet50d_config)
```
You can use any of the [`PreTrainedModel`] methods, like [`~PreTrainedModel.save_pretrained`] or [`~PreTrainedModel.push_to_hub`]. We will use the second in the next section, and see how to push the model weights along with the code of our model. But first, let's load some pretrained weights inside our model.
In your own use case, you will probably be training your custom model on your own data. To go fast for this tutorial, we will use the pretrained version of resnet50d. Since our model is just a wrapper around it, it's going to be easy to transfer those weights:
```py
import timm
pretrained_model = timm.create_model("resnet50d", pretrained=True)
resnet50d.model.load_state_dict(pretrained_model.state_dict())
```
Now let's see how to make sure that when we do [`~PreTrainedModel.save_pretrained`] or [`~PreTrainedModel.push_to_hub`], the code of the model is saved.
## Registering a model with custom code to the auto classes
If you are writing a library that extends 🤗 Transformers, you may want to extend the auto classes to include your own model. This is different from pushing the code to the Hub in the sense that users will need to import your library to get the custom models (contrary to automatically downloading the model code from the Hub).
As long as your config has a `model_type` attribute that is different from existing model types, and your model classes have the right `config_class` attributes, you can just add them to the auto classes like this:
```py
from transformers import AutoConfig, AutoModel, AutoModelForImageClassification
AutoConfig.register("resnet", ResnetConfig)
AutoModel.register(ResnetConfig, ResnetModel)
AutoModelForImageClassification.register(ResnetConfig, ResnetModelForImageClassification)
```
Note that the first argument used when registering your custom config to [`AutoConfig`] needs to match the `model_type` of your custom config, and the first argument used when registering your custom models to any auto model class needs to match the `config_class` of those models.
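Once registered, the auto classes resolve your custom model like any built-in one. A small sketch, reusing the `custom-resnet` folder we saved earlier:

```py
from transformers import AutoConfig, AutoModelForImageClassification

config = AutoConfig.from_pretrained("custom-resnet")          # resolves to ResnetConfig
model = AutoModelForImageClassification.from_config(config)   # resolves to our custom class
```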
## Sending the code to the Hub
<Tip warning={true}>
This API is experimental and may have some slight breaking changes in the next releases.
</Tip>
First, make sure your model is fully defined in a `.py` file. It can rely on relative imports to some other files, as long as all the files are in the same directory (we don't support submodules for this feature yet). For our example, we'll define a `modeling_resnet.py` file and a `configuration_resnet.py` file in a folder of the current working directory named `resnet_model`. The configuration file contains the code for `ResnetConfig` and the modeling file contains the code of `ResnetModel` and `ResnetModelForImageClassification`.
```
.
└── resnet_model
├── __init__.py
├── configuration_resnet.py
└── modeling_resnet.py
```
The `__init__.py` can be empty; it's just there so that Python detects that `resnet_model` can be used as a module.
<Tip warning={true}>
If copying modeling files from the library, you will need to replace all the relative imports at the top of the file to import from the `transformers` package.
</Tip>
Note that you can re-use (or subclass) an existing configuration/model.
To share your model with the community, follow these steps: first import the ResNet model and config from the newly created files:
```py
from resnet_model.configuration_resnet import ResnetConfig
from resnet_model.modeling_resnet import ResnetModel, ResnetModelForImageClassification
```
Then you have to tell the library you want to copy the code files of those objects when using the `save_pretrained` method and properly register them with a given auto class (especially for models). Just run:
```py
ResnetConfig.register_for_auto_class()
ResnetModel.register_for_auto_class("AutoModel")
ResnetModelForImageClassification.register_for_auto_class("AutoModelForImageClassification")
```
Note that there is no need to specify an auto class for the configuration (there is only one auto class for them, [`AutoConfig`]) but it's different for models. Your custom model could be suitable for many different tasks, so you have to specify which one of the auto classes is the correct one for your model.
<Tip>
Use `register_for_auto_class()` if you want the code files to be copied. If you instead prefer to use code on the Hub from another repo, you don't need to call it. In cases where there's more than one auto class, you can modify the `config.json` directly using the following structure:
```json
"auto_map": {
"AutoConfig": "<your-repo-name>--<config-name>",
"AutoModel": "<your-repo-name>--<config-name>",
"AutoModelFor<Task>": "<your-repo-name>--<config-name>",
},
```
</Tip>
Next, let's create the config and models as we did before:
```py
resnet50d_config = ResnetConfig(block_type="bottleneck", stem_width=32, stem_type="deep", avg_down=True)
resnet50d = ResnetModelForImageClassification(resnet50d_config)
pretrained_model = timm.create_model("resnet50d", pretrained=True)
resnet50d.model.load_state_dict(pretrained_model.state_dict())
```
Now to send the model to the Hub, make sure you are logged in. Either run in your terminal:
```bash
huggingface-cli login
```
or from a notebook:
```py
from huggingface_hub import notebook_login
notebook_login()
```
You can then push to your own namespace (or an organization you are a member of) like this:
```py
resnet50d.push_to_hub("custom-resnet50d")
```
On top of the modeling weights and the configuration in json format, this also copied the modeling and configuration `.py` files in the folder `custom-resnet50d` and uploaded the result to the Hub. You can check the result in this [model repo](https://huggingface.co/sgugger/custom-resnet50d).
See the [sharing tutorial](model_sharing) for more information on the push-to-Hub method.
### Using a model with custom code
You can use any configuration, model, or tokenizer with custom code files in its repository with the auto classes and the `from_pretrained` method. All files and code uploaded to the Hub are scanned for malware (refer to the [Hub security](https://huggingface.co/docs/hub/security#malware-scanning) documentation for more information), but you should still review the model code and author to avoid executing malicious code on your machine. Set `trust_remote_code=True` to use a model with custom code:
```py
from transformers import AutoModelForImageClassification
model = AutoModelForImageClassification.from_pretrained("sgugger/custom-resnet50d", trust_remote_code=True)
```
It is also strongly encouraged to pass a commit hash as a `revision` to make sure the author of the model did not update the code later with some malicious new lines (unless you fully trust the authors of the model):
```py
commit_hash = "ed94a7c6247d8aedce4647f00f20de6875b5b292"
model = AutoModelForImageClassification.from_pretrained(
    "sgugger/custom-resnet50d", trust_remote_code=True, revision=commit_hash
)
```
Note that when browsing the commit history of the model repo on the Hub, there is a button to easily copy the commit hash of any commit.
@@ -0,0 +1,51 @@
# Use tokenizers from 🤗 Tokenizers
The [`PreTrainedTokenizerFast`] depends on the [🤗 Tokenizers](https://huggingface.co/docs/tokenizers) library. Tokenizers obtained from the 🤗 Tokenizers library can be loaded very simply into 🤗 Transformers.
Before getting into the specifics, let's first start by creating a dummy tokenizer in a few lines:
```python
>>> from tokenizers import Tokenizer
>>> from tokenizers.models import BPE
>>> from tokenizers.trainers import BpeTrainer
>>> from tokenizers.pre_tokenizers import Whitespace
>>> tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
>>> trainer = BpeTrainer(special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"])
>>> tokenizer.pre_tokenizer = Whitespace()
>>> files = [...]
>>> tokenizer.train(files, trainer)
```
We now have a tokenizer trained on the files we defined. We can either continue using it in this runtime, or save it to a JSON file for future re-use.
## Loading directly from the tokenizer object
Let's see how to leverage this tokenizer object in the 🤗 Transformers library. The [`PreTrainedTokenizerFast`] class allows for easy instantiation, by accepting the instantiated *tokenizer* object as an argument:
```python
>>> from transformers import PreTrainedTokenizerFast
>>> fast_tokenizer = PreTrainedTokenizerFast(tokenizer_object=tokenizer)
```
This object can now be used with all the methods shared by the 🤗 Transformers tokenizers! Head to [the tokenizer page](main_classes/tokenizer) for more information.
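For instance, a minimal usage sketch (the sample sentence is arbitrary):
```python
>>> encoding = fast_tokenizer("Hello, world!")
>>> encoding.tokens()  # inspect the subwords produced by the BPE model trained above
```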
## Loading from a JSON file
In order to load a tokenizer from a JSON file, let's first start by saving our tokenizer:
```python
>>> tokenizer.save("tokenizer.json")
```
The path to which we saved this file can be passed to the [`PreTrainedTokenizerFast`] initialization method using the `tokenizer_file` parameter:
```python
>>> from transformers import PreTrainedTokenizerFast
>>> fast_tokenizer = PreTrainedTokenizerFast(tokenizer_file="tokenizer.json")
```
This object can now be used with all the methods shared by the 🤗 Transformers tokenizers! Head to [the tokenizer page](main_classes/tokenizer) for more information.
docs/source/ar/gguf.md Normal file
@@ -0,0 +1,89 @@
# GGUF and interaction with Transformers
The GGUF file format is used to store models for inference with [GGML](https://github.com/ggerganov/ggml) and other libraries that depend on it, like the very popular [llama.cpp](https://github.com/ggerganov/llama.cpp) or [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
It is a file format [supported by the Hugging Face Hub](https://huggingface.co/docs/hub/en/gguf) with features allowing for quick inspection of tensors and metadata within the file.
This file format is designed as a "single-file format" where a single file usually contains both the configuration attributes, the tokenizer vocabulary, and other attributes, as well as all the tensors to be loaded in the model. These files come in different formats according to the quantization type of the file. We briefly go over some of them [here](https://huggingface.co/docs/hub/en/gguf#quantization-types).
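For instance, here is a minimal sketch of such an inspection (assuming the `GGUFReader` helper from the `gguf` Python package and a hypothetical local filename):
```py
from gguf import GGUFReader

# Parse the header only: tensor shapes and metadata are available
# without dequantizing or loading any weights.
reader = GGUFReader("tinyllama-1.1b-chat-v1.0.Q6_K.gguf")
print(f"{len(reader.tensors)} tensors")
for name in list(reader.fields)[:5]:  # print the first few metadata keys
    print(name)
```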
## Support within Transformers
We added the ability to load `gguf` files within `transformers` in order to offer further training/fine-tuning capabilities to gguf models, before converting those models back to `gguf` for use within the `ggml` ecosystem. When loading a model, we first dequantize it to fp32, before loading the weights to be used in PyTorch.
> [!NOTE]
> The support is still very exploratory and we welcome contributions in order to solidify it across quantization types and model architectures.
For now, here are the supported model architectures and quantization types:
### Supported quantization types
The initially supported quantization types are decided according to the popular quantized files that have been shared on the Hub.
- F32
- F16
- BF16
- Q4_0
- Q4_1
- Q5_0
- Q5_1
- Q8_0
- Q2_K
- Q3_K
- Q4_K
- Q5_K
- Q6_K
- IQ1_S
- IQ1_M
- IQ2_XXS
- IQ2_XS
- IQ2_S
- IQ3_XXS
- IQ3_S
- IQ4_XS
- IQ4_NL
> [!NOTE]
> To support gguf dequantization, installing `gguf>=0.10.0` is required.
### Supported model architectures
For now, the supported model architectures are the architectures that have been very popular on the Hub, namely:
- LLaMa
- Mistral
- Qwen2
- Qwen2Moe
- Phi3
- Bloom
- Falcon
- StableLM
- GPT2
- Starcoder2
- T5
## Example usage
In order to load `gguf` files in `transformers`, you should specify the `gguf_file` argument to the `from_pretrained` methods of both the tokenizer and the model. Here is how one would load a tokenizer and a model, which can be loaded from the exact same file:
```py
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
```
Now you have access to the full, unquantized version of the model in the PyTorch ecosystem, where you can combine it with a plethora of other tools.
In order to convert back to a `gguf` file, we recommend using the [`convert-hf-to-gguf.py` script](https://github.com/ggerganov/llama.cpp/blob/master/convert-hf-to-gguf.py) from llama.cpp.
Here's how you would complete the script above to save the model and export it back to `gguf`:
```py
tokenizer.save_pretrained('directory')
model.save_pretrained('directory')
!python ${path_to_llama_cpp}/convert-hf-to-gguf.py ${directory}
```
@@ -0,0 +1,163 @@
# How to Hack Any Transformers Model
The [🤗 Transformers](https://github.com/huggingface/transformers) library offers a collection of pre-trained models and tools for natural language processing, vision, and beyond. While these models cover a wide range of applications, you might encounter use cases the library doesn't support out of the box. Customizing models can unlock new possibilities, such as adding new layers, altering architectures, or optimizing attention mechanisms. This guide will show you how to modify existing Transformers models to fit your specific needs. The great thing is, you don't have to step away from the Transformers framework to make these changes: you can modify models directly in Transformers and still take advantage of features like the [Trainer API](https://huggingface.co/docs/transformers/main/en/main_classes/trainer), [PreTrainedModel](https://huggingface.co/docs/transformers/main/en/main_classes/model#transformers.PreTrainedModel), and efficient fine-tuning with tools like [PEFT](https://huggingface.co/docs/peft/index).
In this guide, we'll walk you through how to customize existing Transformers models to meet your requirements, without losing the benefits of the framework. You'll learn how to:
- Modify a model's architecture by changing its attention mechanism.
- Apply techniques like Low-Rank Adaptation (LoRA) to specific model components.
We encourage you to contribute your own hacks and share them here with the community!
## Example: Modifying the Attention Mechanism in the Segment Anything Model (SAM)
The **Segment Anything Model (SAM)** is a state-of-the-art model for image segmentation. In its default implementation, SAM uses a combined query-key-value (`qkv`) projection in its attention mechanism. However, you might want to fine-tune only specific components of the attention mechanism, such as the query (`q`) and value (`v`) projections, to reduce the number of trainable parameters and the computational resources required.
### Motivation
By splitting the combined `qkv` projection into separate `q`, `k`, and `v` projections, you can apply techniques like **LoRA** (Low-Rank Adaptation) to only the `q` and `v` projections. This allows you to:
- Fine-tune fewer parameters, reducing the computational overhead.
- Potentially achieve better performance by focusing on specific components.
- Experiment with different adaptation strategies in the attention mechanism.
### Implementation
#### **Step 1: Create a Custom Attention Class**
Next, subclass the original `SamVisionAttention` class and modify it to have separate `q`, `k`, and `v` projections.
```python
import torch
import torch.nn as nn
from transformers.models.sam.modeling_sam import SamVisionAttention

class SamVisionAttentionSplit(SamVisionAttention, nn.Module):
    def __init__(self, config, window_size):
        super().__init__(config, window_size)
        del self.qkv
        # Separate q, k, v projections
        self.q = nn.Linear(config.hidden_size, config.hidden_size, bias=config.qkv_bias)
        self.k = nn.Linear(config.hidden_size, config.hidden_size, bias=config.qkv_bias)
        self.v = nn.Linear(config.hidden_size, config.hidden_size, bias=config.qkv_bias)
        self._register_load_state_dict_pre_hook(self.split_q_k_v_load_hook)

    def split_q_k_v_load_hook(self, state_dict, prefix, *args):
        keys_to_delete = []
        for key in list(state_dict.keys()):
            if "qkv." in key:
                # Split q, k, v from the combined projection
                q, k, v = state_dict[key].chunk(3, dim=0)
                # Replace with the individual q, k, v projections
                state_dict[key.replace("qkv.", "q.")] = q
                state_dict[key.replace("qkv.", "k.")] = k
                state_dict[key.replace("qkv.", "v.")] = v
                # Mark the old qkv key for deletion
                keys_to_delete.append(key)
        # Remove the old qkv keys
        for key in keys_to_delete:
            del state_dict[key]

    def forward(self, hidden_states: torch.Tensor, output_attentions=False) -> torch.Tensor:
        batch_size, height, width, _ = hidden_states.shape
        qkv_shapes = (batch_size * self.num_attention_heads, height * width, -1)
        query = self.q(hidden_states).reshape((batch_size, height * width, self.num_attention_heads, -1)).permute(0, 2, 1, 3).reshape(qkv_shapes)
        key = self.k(hidden_states).reshape((batch_size, height * width, self.num_attention_heads, -1)).permute(0, 2, 1, 3).reshape(qkv_shapes)
        value = self.v(hidden_states).reshape((batch_size, height * width, self.num_attention_heads, -1)).permute(0, 2, 1, 3).reshape(qkv_shapes)

        attn_weights = (query * self.scale) @ key.transpose(-2, -1)

        if self.use_rel_pos:
            attn_weights = self.add_decomposed_rel_pos(
                attn_weights, query, self.rel_pos_h, self.rel_pos_w, (height, width), (height, width)
            )

        attn_weights = torch.nn.functional.softmax(attn_weights, dtype=torch.float32, dim=-1).to(query.dtype)
        attn_probs = nn.functional.dropout(attn_weights, p=self.dropout, training=self.training)
        attn_output = (attn_probs @ value).reshape(batch_size, self.num_attention_heads, height, width, -1)
        attn_output = attn_output.permute(0, 2, 3, 1, 4).reshape(batch_size, height, width, -1)
        attn_output = self.proj(attn_output)

        if output_attentions:
            outputs = (attn_output, attn_weights)
        else:
            outputs = (attn_output, None)
        return outputs
```
**Explanation:**
- **Separate Projections:** The combined `qkv` projection is removed, and separate `q`, `k`, and `v` linear projections are created.
- **Weight Loading Hook:** The `split_q_k_v_load_hook` method splits the pre-trained `qkv` weights into separate `q`, `k`, and `v` weights when the model is loaded. This ensures compatibility with any pre-trained model.
- **Forward Pass:** Queries, keys, and values are computed separately, and the attention mechanism proceeds as usual.
#### **Step 2: Replace the Original Attention Class**
Replace the original `SamVisionAttention` class with your custom class so that the model uses the modified attention mechanism.
```python
from transformers import SamModel
from transformers.models.sam import modeling_sam

# Replace the attention class in the modeling_sam module
modeling_sam.SamVisionAttention = SamVisionAttentionSplit

# Load the pre-trained SAM model
model = SamModel.from_pretrained("facebook/sam-vit-base")
```
**Explanation:**
- **Class Replacement:** By assigning your custom class to `modeling_sam.SamVisionAttention`, any instance of the `SamVisionAttention` class in the model will use the modified version. Thus, when `SamModel` is instantiated, it will use the newly defined `SamVisionAttentionSplit`.
- **Model Loading:** The model is loaded with `from_pretrained`, and the custom attention mechanism is integrated.
#### **Step 3: Apply LoRA to Specific Projections**
With separate `q`, `k`, and `v` projections in place, you can now apply LoRA to specific components, such as the `q` and `v` projections.
```python
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q", "v"],  # apply LoRA to the q and v projections
    lora_dropout=0.1,
    task_type="mask-generation"
)

# Apply LoRA to the model
model = get_peft_model(model, config)
```
**Explanation:**
- **LoRA Configuration:** The `LoraConfig` specifies the rank `r`, the scaling factor `lora_alpha`, the target modules (`"q"` and `"v"`), the dropout rate, and the task type.
- **Applying LoRA:** The `get_peft_model` function applies LoRA to the specified modules in the model.
- **Parameter Reduction:** By focusing on `q` and `v`, you reduce the number of trainable parameters, leading to faster training and lower memory usage.
#### **Step 4: Verify the Number of Trainable Parameters**
It's simple to verify the number of trainable parameters and see what impact your modification had.
```python
model.print_trainable_parameters()
```
**Expected output:**
```
trainable params: 608,256 || all params: 94,343,728 || trainable%: 0.6447
trainable params: 912,384 || all params: 94,647,856 || trainable%: 0.9640 # with k
```
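As a complementary check, here is a small sketch (assuming PEFT's usual `lora_A`/`lora_B` module wrapping) that lists which submodules LoRA actually touched:
```python
# With target_modules=["q", "v"], only the q and v projections should show up here.
lora_layers = [name for name, _ in model.named_modules() if "lora_A" in name]
print(lora_layers[:4])
```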
## Contribute Your Own Hacks
Modifying pre-trained models can open up new avenues for research and application. By understanding and adjusting the internal mechanisms of models like SAM, you can customize them to your specific needs, optimize performance, and experiment with new ideas.
If you've developed your own hacks for Transformers models and would like to share them, consider contributing to this doc.
- **Open a Pull Request:** Share your code changes and improvements directly in the repository.
- **Write Documentation:** Provide clear explanations and examples of your modifications.
- **Engage with the Community:** Discuss your ideas and get feedback from other developers and researchers by opening an issue.
@@ -144,7 +144,7 @@ conda install conda-forge::transformers
Pretrained models are downloaded and locally cached at `~/.cache/huggingface/hub`. This is the default directory given by the shell environment variable `TRANSFORMERS_CACHE`. On Windows, the default directory is `C:\Users\username\.cache\huggingface\hub`. You can change the shell environment variables shown below, in order of priority, to specify a different cache directory:
-1. Shell environment variable (default): `HUGGINGFACE_HUB_CACHE` or `TRANSFORMERS_CACHE`.
+1. Shell environment variable (default): `HF_HUB_CACHE` or `TRANSFORMERS_CACHE`.
2. Shell environment variable: `HF_HOME`.
3. Shell environment variable: `XDG_CACHE_HOME` + `/huggingface`.
@@ -28,7 +28,7 @@ picture-in-picture" allowfullscreen></iframe>
```py
>>> model = AutoModel.from_pretrained(
... "julien-c/EsperBERTo-small", revision="v2.0.1" # اسم العلامة، أو اسم الفرع، أو تجزئة الالتزام
... "julien-c/EsperBERTo-small", revision="4c77982" # اسم العلامة، أو اسم الفرع، أو تجزئة الالتزام
... )
```
@@ -0,0 +1,184 @@
# Modular Transformers
`transformers` is an opinionated framework; our philosophy is defined in the [conceptual guide](./philosophy).
The core of that philosophy is exemplified by the [single model, single file](https://huggingface.co/blog/transformers-design-philosophy) aspect of the library. This component's downside is that it limits the inheritance and importability of components from one file to another.
As a result, model components tend to be repeated across many files. `transformers` has as many attention layers defined as it has models, and a significant number of those are identical to each other. The unfortunate consequence is that independent implementations tend to diverge as fixes and changes get applied to specific parts of the code.
In order to address this, we adopted the concept of "copies" across the library. By adding a comment indicating that some code is a copy of another, we can enforce through CI and local commands that copies do not diverge. However, while the complexity is low, this is quite tedious to do, and it adds to the burden on contributors, which is what we aim to remove.
Model contributions often require adding modeling code (~1k lines), a processor (~500 lines), tests, docs, etc. Model contributions rarely come in at fewer than 3-5k lines of code, much of which is boilerplate. This raises the bar for contributions, and with Modular Transformers, we aim to lower that bar to an acceptable level.
## What is it?
Modular Transformers introduces the concept of a "modular" file to a model folder. This modular file accepts code that isn't typically accepted in modeling/processing files, as it allows importing from neighbouring models as well as inheritance from one class to another.
This modular file defines the models, processors, and configuration class that would otherwise be defined in their respective modules.
Finally, this feature introduces a new `linter` tool which will "unravel" the modular file into the "single model, single file" directory structure. These files will get auto-generated every time the script is run, reducing the required contributions to the modular file, and therefore only to the changes between the contributed model and others.
Model users will end up importing and using the single-file interface, so no change is expected here. By doing this, we hope to combine the best of both worlds: enabling simple contributions while sticking to our philosophy.
This is therefore a replacement for the `# Copied from` markers, and previously contributed models can be expected to move to the new Modular Transformers format in the coming months.
### Details
The "linter", which unravels the inheritance and creates all single files from the modular file, will flatten the inheritance while trying to remain invisible to Python users. At this time, the linter flattens a **single** level of inheritance.
For example:
- If a configuration class inherits from another and adds/deletes an argument, the generated file will either directly reference it (in case of addition) or completely remove it (in case of deletion).
- If a class inherits from another, for example `class GemmaModel(LlamaModel):`, the dependencies are automatically inferred. All submodules will be automatically inferred from the superclass.
- If you define new functions in the `modular` file and use them inside classes, the linter will automatically infer these as well.
You should be able to write everything (the tokenizer, the image processor, the model, the config) in the `modular` file, and the corresponding files will be created for you.
### Enforcement
[TODO] We are introducing a new test, to make sure the generated content matches what is present in the `modular_xxxx.py` file.
### Examples
Here is a quick example with BERT and RoBERTa. The two models are closely related: their modeling implementations differ only in the embedding layer.
Instead of redefining the model entirely, here is what the `modular_roberta.py` file looks like for the modeling and configuration classes (for the sake of the example, the tokenizer is ignored at this time as it is very different).
```python
from torch import nn
from ..bert.configuration_bert import BertConfig
from ..bert.modeling_bert import (
    BertModel,
    BertEmbeddings,
    BertForMaskedLM
)

# The RoBERTa config is identical to BERT's config
class RobertaConfig(BertConfig):
    model_type = 'roberta'

# We redefine the embeddings here to highlight the padding ID difference, and we redefine the position embeddings
class RobertaEmbeddings(BertEmbeddings):
    def __init__(self, config):
        super().__init__(config)
        self.padding_idx = config.pad_token_id
        self.position_embeddings = nn.Embedding(
            config.max_position_embeddings, config.hidden_size, padding_idx=self.padding_idx
        )

# The RoBERTa model is identical to the BERT model, except for the embedding layer.
# We redefine the embeddings above, so no additional work is needed here
class RobertaModel(BertModel):
    def __init__(self, config):
        super().__init__(config)
        self.embeddings = RobertaEmbeddings(config)

# The heads now only need to redefine the inner model as the correct `RobertaModel`
class RobertaForMaskedLM(BertForMaskedLM):
    def __init__(self, config):
        super().__init__(config)
        self.model = RobertaModel(config)
```
Note that if you do not use the dependency that you defined, you will get the following error:
```bash
ValueError: You defined `RobertaEmbeddings` in the modular_roberta.py, it should be used
when you define `BertModel`, as it is one of it's direct dependencies. Make sure
you use it in the `__init__` function.
```
Additionally, you may find a list of examples here:
## What it is not
It is not a replacement for the modeling code (yet?), and if your model is not based on anything else that has ever existed, then you can add a `modeling` file as usual.
## Advanced usage
### Removing attributes and functions
To remove attributes that are not used in your modular model, and that you don't want to see in the unravelled modeling:
```python
class GemmaModel(LlamaModel):              |       class GemmaModel(PreTrainedModel):
    def __init__(self, config):            |           def __init__(self, config):
        super().__init__(self, eos_token)  |               super().__init__(config)
        del self.embed_tokens              |               self.padding_idx = config.pad_token_id
                                           |               self.vocab_size = config.vocab_size
                                           |
                                           |               self.layers = nn.ModuleList(
                                           |                   [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
                                           |               )
                                           |               self.norm = LlamaRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
                                           |               self.rotary_emb = LlamaRotaryEmbedding(config=config)
                                           |               self.gradient_checkpointing = False
                                           |
                                           |               # Initialize weights and apply final processing
                                           |               self.post_init()
```
If you check the original `LlamaModel`, you will find the `embed_tokens` attribute that was removed here (as expected!).
Removing a function is similar: you just need to write it with a `raise ValueError("")` to mimic the behaviour you actually want when you remove a parent function in Python.
```python
class GemmaTokenizer(LlamaTokenizer):
    ...

    def get_spm_processor(self):
        raise AttributeError("Not needed for Gemma")

    def unk_token_length(self):
        raise AttributeError("Not needed for Gemma")
```
### Defining new functions
If you define a new function in the `modular` file to be used inside a class, say
```python
def my_new_function(*args, **kwargs):
    # Do something here
    pass

class GemmaModel(LlamaModel):
    def forward(*args, **kwargs):
        # Call the function
        example = my_new_function(*args, **kwargs)
        # continue here
```
the `my_new_function` function (and, recursively, any other new functions called in its body) will be automatically copied into the file where it is used.
### Calling `super()`
We recently shipped a few features that allow you to go from:
```python
class GemmaTokenizer(LlamaTokenizer, PretrainedTokenizerFast):     |     class GemmaModel(nn.Module):
    def __init__(self, eos_token="</s>"):                          |         def __init__(self):
        eos_token = AddedToken(eos_token)                          |             eos_token = AddedToken(eos_token)
        PretrainedTokenizerFast.__init__(self, eos_token)          |             super().__init__(eos_token)
```
This is useful when you **don't** want to unravel the call to `super()`, and when you want to differentiate which super init call you are making!
### Special naming
We now also support special cases like
```python
class GemmaVisionModel(CLIPModel):
    pass
```
where the name of your class `GemmaVision` is not the same as the modular `Gemma`. This is super useful for composite models.
@@ -0,0 +1,160 @@
# Multilingual models for inference
There are several multilingual models in 🤗 Transformers, and their inference usage differs from monolingual models. Not *all* multilingual model usage is different though. Some models, like [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased), can be used just like a monolingual model. This guide will show you how to use multilingual models whose usage differs for inference.
## XLM
XLM has ten different checkpoints, only one of which is monolingual. The nine remaining model checkpoints can be split into two categories: the checkpoints that use language embeddings and those that don't.
### XLM with language embeddings
The following XLM models use language embeddings to specify the language used at inference:
- `FacebookAI/xlm-mlm-ende-1024` (Masked language modeling, English-German)
- `FacebookAI/xlm-mlm-enfr-1024` (Masked language modeling, English-French)
- `FacebookAI/xlm-mlm-enro-1024` (Masked language modeling, English-Romanian)
- `FacebookAI/xlm-mlm-xnli15-1024` (Masked language modeling, XNLI languages)
- `FacebookAI/xlm-mlm-tlm-xnli15-1024` (Masked language modeling + translation, XNLI languages)
- `FacebookAI/xlm-clm-enfr-1024` (Causal language modeling, English-French)
- `FacebookAI/xlm-clm-ende-1024` (Causal language modeling, English-German)
Language embeddings are represented as a tensor of the same shape as the `input_ids` passed to the model. The values in these tensors depend on the language used and are identified by the tokenizer's `lang2id` and `id2lang` attributes.
In this example, load the `FacebookAI/xlm-clm-enfr-1024` checkpoint (Causal language modeling, English-French):
```py
>>> import torch
>>> from transformers import XLMTokenizer, XLMWithLMHeadModel
>>> tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-clm-enfr-1024")
>>> model = XLMWithLMHeadModel.from_pretrained("FacebookAI/xlm-clm-enfr-1024")
```
The `lang2id` attribute of the tokenizer displays this model's languages and their ids:
```py
>>> print(tokenizer.lang2id)
{'en': 0, 'fr': 1}
```
Next, create an example input:
```py
>>> input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")]) # batch size of 1
```
Set the language id to `"en"` and use it to define the language embedding. The language embedding is a tensor filled with `0` since that is the language id for English. This tensor should be the same size as `input_ids`.
```py
>>> language_id = tokenizer.lang2id["en"]  # 0
>>> langs = torch.tensor([language_id] * input_ids.shape[1])  # torch.tensor([0, 0, 0, ..., 0])

>>> # We reshape it to be of size (batch_size, sequence_length)
>>> langs = langs.view(1, -1)  # is now of shape [1, sequence_length] (we have a batch size of 1)
```
Now you can pass the `input_ids` and language embedding to the model:
```py
>>> outputs = model(input_ids, langs=langs)
```
The [run_generation.py](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation/run_generation.py) script can generate text with language embeddings using the `xlm-clm` checkpoints.
### XLM without language embeddings
The following XLM models do not require language embeddings during inference:
- `FacebookAI/xlm-mlm-17-1280` (Masked language modeling, 17 languages)
- `FacebookAI/xlm-mlm-100-1280` (Masked language modeling, 100 languages)
These models are used for generic sentence representations, unlike the previous XLM checkpoints.
## BERT
The following BERT models can be used for multilingual tasks:
- `google-bert/bert-base-multilingual-uncased` (Masked language modeling + Next sentence prediction, 102 languages)
- `google-bert/bert-base-multilingual-cased` (Masked language modeling + Next sentence prediction, 104 languages)
These models do not require language embeddings during inference. They should identify the language from the context and infer accordingly.
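For example, a minimal sketch with the `fill-mask` pipeline (the sample sentence is arbitrary):
```py
from transformers import pipeline

# No language embedding is needed; the model infers the language from context.
fill_mask = pipeline("fill-mask", model="google-bert/bert-base-multilingual-cased")
print(fill_mask("Paris is the [MASK] of France.")[0]["token_str"])
```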
## XLM-RoBERTa
The following XLM-RoBERTa models can be used for multilingual tasks:
- `FacebookAI/xlm-roberta-base` (Masked language modeling, 100 languages)
- `FacebookAI/xlm-roberta-large` (Masked language modeling, 100 languages)
XLM-RoBERTa was trained on 2.5TB of newly created and cleaned CommonCrawl data in 100 languages. It provides strong gains over previously released multilingual models like mBERT or XLM on downstream tasks like classification, sequence labeling, and question answering.
## M2M100
The following M2M100 models can be used for multilingual translation:
- `facebook/m2m100_418M` (Translation)
- `facebook/m2m100_1.2B` (Translation)
In this example, load the `facebook/m2m100_418M` checkpoint to translate from Chinese to English. You can set the source language in the tokenizer:
```py
>>> from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
>>> en_text = "Do not meddle in the affairs of wizards, for they are subtle and quick to anger."
>>> chinese_text = "不要插手巫師的事務, 因為他們是微妙的, 很快就會發怒."
>>> tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M", src_lang="zh")
>>> model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
```
Tokenize the text:
```py
>>> encoded_zh = tokenizer(chinese_text, return_tensors="pt")
```
M2M100 forces the target language id as the first generated token in order to translate into the target language. Set `forced_bos_token_id` to `en` in the `generate` method to translate to English:
```py
>>> generated_tokens = model.generate(**encoded_zh, forced_bos_token_id=tokenizer.get_lang_id("en"))
>>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
'Do not interfere with the matters of the witches, because they are delicate and will soon be angry.'
```
## MBart
The following MBart models can be used for multilingual translation:
- `facebook/mbart-large-50-one-to-many-mmt` (One-to-many multilingual machine translation, 50 languages)
- `facebook/mbart-large-50-many-to-many-mmt` (Many-to-many multilingual machine translation, 50 languages)
- `facebook/mbart-large-50-many-to-one-mmt` (Many-to-one multilingual machine translation, 50 languages)
- `facebook/mbart-large-50` (Multilingual translation, 50 languages)
- `facebook/mbart-large-cc25`
In this example, load the `facebook/mbart-large-50-many-to-many-mmt` checkpoint to translate Finnish to English. You can set the source language in the tokenizer:
```py
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> en_text = "Do not meddle in the affairs of wizards, for they are subtle and quick to anger."
>>> fi_text = "Älä sekaannu velhojen asioihin, sillä ne ovat hienovaraisia ja nopeasti vihaisia."
>>> tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-50-many-to-many-mmt", src_lang="fi_FI")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
```
Tokenize the text:
```py
>>> encoded_fi = tokenizer(fi_text, return_tensors="pt")
```
MBart forces the target language id as the first generated token in order to translate into the target language. Set `forced_bos_token_id` to `en` in the `generate` method to translate to English:
```py
>>> generated_tokens = model.generate(**encoded_fi, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
>>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
"Don't interfere with the wizard's affairs, because they are subtle, will soon get angry."
```
If you are using the `facebook/mbart-large-50-many-to-one-mmt` checkpoint, you don't need to force the target language id as the first generated token; otherwise, the usage is the same.
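As a sketch of that many-to-one case (re-using `fi_text` from above, with the checkpoint swapped in):
```py
>>> tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-50-many-to-one-mmt", src_lang="fi_FI")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")
>>> generated_tokens = model.generate(**tokenizer(fi_text, return_tensors="pt"))
>>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
```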
docs/source/ar/notebooks.md Normal file
@@ -0,0 +1,140 @@
# 🤗 Transformers Notebooks
You can find here a list of the official notebooks provided by Hugging Face.
Also, we would like to list here interesting content created by the community.
If you wrote some notebook(s) leveraging 🤗 Transformers and would like it to be listed here, please open a Pull Request so it can be included under the Community notebooks.
## Hugging Face's notebooks 🤗
### Documentation notebooks
You can open any page of the documentation as a notebook in Colab (there is a button directly on those pages) but they are also listed here if you need them:
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [Quicktour of the library](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/quicktour.ipynb) | A presentation of the various APIs in Transformers |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/quicktour.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/en/transformers_doc/quicktour.ipynb)|
| [Summary of the tasks](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/task_summary.ipynb) | How to run the models of the Transformers library task by task |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/task_summary.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/task_summary.ipynb)|
| [Preprocessing data](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/preprocessing.ipynb) | How to use a tokenizer to preprocess your data |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/preprocessing.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/preprocessing.ipynb)|
| [Fine-tuning a pretrained model](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/training.ipynb) | How to use the Trainer to fine-tune a pretrained model |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/training.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/training.ipynb)|
| [Summary of the tokenizers](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/tokenizer_summary.ipynb) | The differences between the tokenizer algorithms |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/tokenizer_summary.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/tokenizer_summary.ipynb)|
| [Multilingual models](https://github.com/huggingface/notebooks/blob/main/transformers_doc/en/multilingual.ipynb) | How to use the multilingual models of the library |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/transformers_doc/en/multilingual.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/transformers_doc/en/multilingual.ipynb)|
### PyTorch Examples
#### Natural Language Processing[[pytorch-nlp]]
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [Train your tokenizer](https://github.com/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb) | How to train and use your very own tokenizer |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)|
| [Train your language model](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch.ipynb) | How to easily start using transformers |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch.ipynb)|
| [How to fine-tune a model on text classification](https://github.com/huggingface/notebooks/blob/main/examples/text_classification.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on any GLUE task. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)|
| [How to fine-tune a model on language modeling](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on a causal or masked LM task. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)|
| [How to fine-tune a model on token classification](https://github.com/huggingface/notebooks/blob/main/examples/token_classification.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on a token classification task (NER, PoS). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb)|
| [How to fine-tune a model on question answering](https://github.com/huggingface/notebooks/blob/main/examples/question_answering.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on SQUAD. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb)|
| [How to fine-tune a model on multiple choice](https://github.com/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on SWAG. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)|
| [How to fine-tune a model on translation](https://github.com/huggingface/notebooks/blob/main/examples/translation.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on WMT. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/translation.ipynb)|
| [How to fine-tune a model on summarization](https://github.com/huggingface/notebooks/blob/main/examples/summarization.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on XSUM. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/summarization.ipynb)|
| [How to train a language model from scratch](https://github.com/huggingface/blog/blob/main/notebooks/01_how_to_train.ipynb)| Highlight all the steps to effectively train a Transformer model on custom data | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/01_how_to_train.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/blog/blob/main/notebooks/01_how_to_train.ipynb)|
| [How to generate text](https://github.com/huggingface/blog/blob/main/notebooks/02_how_to_generate.ipynb)| How to use different decoding methods for language generation with transformers | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/02_how_to_generate.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/blog/blob/main/notebooks/02_how_to_generate.ipynb)|
| [How to generate text (with constraints)](https://github.com/huggingface/blog/blob/main/notebooks/53_constrained_beam_search.ipynb)| How to guide language generation with user-provided constraints | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/53_constrained_beam_search.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/blog/blob/main/notebooks/53_constrained_beam_search.ipynb)|
| [Reformer](https://github.com/huggingface/blog/blob/main/notebooks/03_reformer.ipynb)| How Reformer pushes the limits of language modeling | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patrickvonplaten/blog/blob/main/notebooks/03_reformer.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/patrickvonplaten/blog/blob/main/notebooks/03_reformer.ipynb)|
#### Computer Vision[[pytorch-cv]]
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [How to fine-tune a model on image classification (Torchvision)](https://github.com/huggingface/notebooks/blob/main/examples/image_classification.ipynb) | Show how to preprocess the data using Torchvision and fine-tune any pretrained Vision model on Image Classification | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb)|
| [How to fine-tune a model on image classification (Albumentations)](https://github.com/huggingface/notebooks/blob/main/examples/image_classification_albumentations.ipynb) | Show how to preprocess the data using Albumentations and fine-tune any pretrained Vision model on Image Classification | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification_albumentations.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification_albumentations.ipynb)|
| [How to fine-tune a model on image classification (Kornia)](https://github.com/huggingface/notebooks/blob/main/examples/image_classification_kornia.ipynb) | Show how to preprocess the data using Kornia and fine-tune any pretrained Vision model on Image Classification | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification_kornia.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification_kornia.ipynb)|
| [How to perform zero-shot object detection with OWL-ViT](https://github.com/huggingface/notebooks/blob/main/examples/zeroshot_object_detection_with_owlvit.ipynb) | Show how to perform zero-shot object detection on images with text queries | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/zeroshot_object_detection_with_owlvit.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/zeroshot_object_detection_with_owlvit.ipynb)|
| [How to fine-tune an image captioning model](https://github.com/huggingface/notebooks/blob/main/examples/image_captioning_blip.ipynb) | Show how to fine-tune BLIP for image captioning on a custom dataset | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_captioning_blip.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_captioning_blip.ipynb)|
| [How to build an image similarity system with Transformers](https://github.com/huggingface/notebooks/blob/main/examples/image_similarity.ipynb) | Show how to build an image similarity system | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_similarity.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_similarity.ipynb)|
| [How to fine-tune a SegFormer model on semantic segmentation](https://github.com/huggingface/notebooks/blob/main/examples/semantic_segmentation.ipynb) | Show how to preprocess the data and fine-tune a pretrained SegFormer model on Semantic Segmentation | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/semantic_segmentation.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/semantic_segmentation.ipynb)|
| [How to fine-tune a VideoMAE model on video classification](https://github.com/huggingface/notebooks/blob/main/examples/video_classification.ipynb) | Show how to preprocess the data and fine-tune a pretrained VideoMAE model on Video Classification | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/video_classification.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/video_classification.ipynb)|
#### Audio[[pytorch-audio]]
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [How to fine-tune a speech recognition model in English](https://github.com/huggingface/notebooks/blob/main/examples/speech_recognition.ipynb)| Show how to preprocess the data and fine-tune a pretrained Speech model on TIMIT | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/speech_recognition.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/speech_recognition.ipynb)|
| [How to fine-tune a speech recognition model in any language](https://github.com/huggingface/notebooks/blob/main/examples/multi_lingual_speech_recognition.ipynb)| Show how to preprocess the data and fine-tune a multi-lingually pretrained speech model on Common Voice | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multi_lingual_speech_recognition.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/multi_lingual_speech_recognition.ipynb)|
| [How to fine-tune a model on audio classification](https://github.com/huggingface/notebooks/blob/main/examples/audio_classification.ipynb)| Show how to preprocess the data and fine-tune a pretrained Speech model on Keyword Spotting | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/audio_classification.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/audio_classification.ipynb)|
#### Biological Sequences[[pytorch-bio]]
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [How to fine-tune a pre-trained protein model](https://github.com/huggingface/notebooks/blob/main/examples/protein_language_modeling.ipynb) | See how to tokenize proteins and fine-tune a large pre-trained protein "language" model | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_language_modeling.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/protein_language_modeling.ipynb) |
| [How to generate protein folds](https://github.com/huggingface/notebooks/blob/main/examples/protein_folding.ipynb) | See how to go from protein sequence to a full protein model and PDB file | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_folding.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/protein_folding.ipynb) |
| [How to fine-tune a Nucleotide Transformer model](https://github.com/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling.ipynb) | See how to tokenize DNA and fine-tune a large pre-trained DNA "language" model | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling.ipynb) |
| [Fine-tune a Nucleotide Transformer model with LoRA](https://github.com/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling_with_peft.ipynb) | Train even larger DNA models in a memory-efficient way | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling_with_peft.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/nucleotide_transformer_dna_sequence_modelling_with_peft.ipynb) |
#### Other modalities[[pytorch-other]]
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [Probabilistic Time Series Forecasting](https://github.com/huggingface/notebooks/blob/main/examples/time-series-transformers.ipynb) | See how to train Time Series Transformer on a custom dataset | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/time-series-transformers.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/time-series-transformers.ipynb) |
#### Utility notebooks[[pytorch-utility]]
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [How to export model to ONNX](https://github.com/huggingface/notebooks/blob/main/examples/onnx-export.ipynb)| Highlight how to export and run inference workloads through ONNX | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/onnx-export.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/onnx-export.ipynb)|
| [How to use Benchmarks](https://github.com/huggingface/notebooks/blob/main/examples/benchmark.ipynb)| How to benchmark models with transformers | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/benchmark.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/benchmark.ipynb)|
### TensorFlow Examples
#### Natural Language Processing[[tensorflow-nlp]]
| Notebook | Description | | |
|:----------|:-------------|:-------------|------:|
| [Train your tokenizer](https://github.com/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb) | How to train and use your very own tokenizer |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/tokenizer_training.ipynb)|
| [Train your language model](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch-tf.ipynb) | How to easily start using transformers |[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling_from_scratch-tf.ipynb)|
| [How to fine-tune a model on text classification](https://github.com/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on any GLUE task. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb)|
| [How to fine-tune a model on language modeling](https://github.com/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on a causal or masked LM task. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb)|
| [How to fine-tune a model on token classification](https://github.com/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on a token classification task (NER, PoS). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb)|
| [How to fine-tune a model on question answering](https://github.com/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on SQUAD. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb)|
| [How to fine-tune a model on multiple choice](https://github.com/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on SWAG. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb)|
| [How to fine-tune a model on translation](https://github.com/huggingface/notebooks/blob/main/examples/translation-tf.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on WMT. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb)|
| [How to fine-tune a model on summarization](https://github.com/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb)| Show how to preprocess the data and fine-tune a pretrained model on XSUM. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb)|
#### رؤية الكمبيوتر[[tensorflow-cv]]
| دفتر الملاحظات | الوصف | | |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------|:-------------|------:|
| [كيفية ضبط نموذج بدقة على تصنيف الصور](https://github.com/huggingface/notebooks/blob/main/examples/image_classification-tf.ipynb) | يوضح كيفية معالجة البيانات مسبقًا وضبط أي نموذج رؤية مُدرَّب مسبقًا بدقة على تصنيف الصور | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/image_classification-tf.ipynb)|
| [كيفية ضبط نموذج SegFormer بدقة على التجزئة الدلالية](https://github.com/huggingface/notebooks/blob/main/examples/semantic_segmentation-tf.ipynb) | يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج SegFormer مُدرَّب مسبقًا بدقة على التجزئة الدلالية | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/semantic_segmentation-tf.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/semantic_segmentation-tf.ipynb)|
#### التسلسلات البيولوجية[[tensorflow-bio]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية ضبط نموذج بروتين مُدرَّب مسبقًا بدقة](https://github.com/huggingface/notebooks/blob/main/examples/protein_language_modeling-tf.ipynb) | شاهد كيفية ترميز البروتينات وضبط نموذج "لغة" بروتين مُدرَّب مسبقًا كبير بدقة | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_language_modeling-tf.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/protein_language_modeling-tf.ipynb) |
#### دفاتر ملاحظات الأدوات المساعدة [[tensorflow-utility]]
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية تدريب نماذج TF/Keras على TPU](https://github.com/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) | شاهد كيفية التدريب بسرعة عالية على أجهزة TPU من Google | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) |
### دفاتر ملاحظات Optimum
🤗 [Optimum](https://github.com/huggingface/optimum) هو امتداد لـ 🤗 Transformers، يوفر مجموعة من أدوات تحسين الأداء التي تمكن من تحقيق أقصى قدر من الكفاءة لتدريب وتشغيل النماذج على الأجهزة المستهدفة.
| دفتر الملاحظات | الوصف | | |
|:----------|:-------------|:-------------|------:|
| [كيفية تكميم نموذج باستخدام ONNX Runtime لتصنيف النص](https://github.com/huggingface/notebooks/blob/main/examples/text_classification_quantization_ort.ipynb)| يوضح كيفية تطبيق التكميم الثابت والديناميكي على نموذج باستخدام [ONNX Runtime](https://github.com/microsoft/onnxruntime) لأي مهمة GLUE. | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification_quantization_ort.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification_quantization_ort.ipynb)|
| [كيفية ضبط نموذج بدقة على تصنيف النص باستخدام ONNX Runtime](https://github.com/huggingface/notebooks/blob/main/examples/text_classification_ort.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج بدقة على أي مهمة GLUE باستخدام [ONNX Runtime](https://github.com/microsoft/onnxruntime). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification_ort.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/text_classification_ort.ipynb)|
| [كيفية ضبط نموذج بدقة على التلخيص باستخدام ONNX Runtime](https://github.com/huggingface/notebooks/blob/main/examples/summarization_ort.ipynb)| يوضح كيفية معالجة البيانات مسبقًا وضبط نموذج بدقة على XSUM باستخدام [ONNX Runtime](https://github.com/microsoft/onnxruntime). | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization_ort.ipynb)| [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/main/examples/summarization_ort.ipynb)|
## دفاتر ملاحظات المجتمع:
تتوفر المزيد من دفاتر الملاحظات التي طورها المجتمع [هنا](https://hf.co/docs/transformers/community#community-notebooks).

View File

@@ -347,8 +347,8 @@ tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
```py
>>> from transformers import AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
```
</pt>
<tf>
@@ -356,8 +356,8 @@ tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
```py
>>> from transformers import TFAutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
```
</tf>
</frameworkcontent>

View File

@@ -0,0 +1,8 @@
# Run training on Amazon SageMaker
The documentation has been moved to [hf.co/docs/sagemaker](https://huggingface.co/docs/sagemaker). This page will be removed in version 5.0 of Transformers.
### Table of Contents
- [Train Hugging Face models on Amazon SageMaker with the SageMaker Python SDK](https://huggingface.co/docs/sagemaker/train)
- [Deploy Hugging Face models to Amazon SageMaker with the SageMaker Python SDK](https://huggingface.co/docs/sagemaker/inference)

View File

@@ -0,0 +1,170 @@
# Export to ONNX
Deploying 🤗 Transformers models in production environments often requires, or can benefit from, exporting the models into a serialized format that can be loaded and executed on specialized runtimes and hardware.
🤗 Optimum is an extension of Transformers that enables exporting models from PyTorch or TensorFlow to serialized formats such as ONNX and TFLite through its `exporters` module. 🤗 Optimum also provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency.
This guide demonstrates how you can export 🤗 Transformers models to ONNX with 🤗 Optimum; for the guide on exporting models to TFLite, please refer to the [Export to TFLite page](tflite).
## Export to ONNX
[ONNX (Open Neural Network Exchange)](http://onnx.ai) is an open standard that defines a common set of operators and a common file format to represent deep learning models in a wide variety of frameworks, including PyTorch and TensorFlow. When a model is exported to the ONNX format, these operators are used to construct a computational graph (often called an _intermediate representation_) which represents the flow of data through the neural network.
By exposing a graph with standardized operators and data types, ONNX makes it easy to switch between frameworks. For example, a model trained in PyTorch can be exported to ONNX format and then imported in TensorFlow (and vice versa).
Once exported to the ONNX format, a model can be:
- optimized for inference via techniques such as [graph optimization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/optimization) and [quantization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/quantization).
- run with ONNX Runtime via the [`ORTModelForXXX` classes](https://huggingface.co/docs/optimum/onnxruntime/package_reference/modeling_ort), which follow the same `AutoModel` API you are used to in 🤗 Transformers.
- run with [optimized inference pipelines](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/pipelines), which have the same API as the [`pipeline`] function in 🤗 Transformers; see the sketch below.
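For instance, here is a hedged sketch of the pipeline route (this assumes the `optimum.pipelines` API and the local export directory created later in this guide; adjust the names to your setup):
```python
# Hedged sketch: run an exported ONNX model through an Optimum inference pipeline
# (assumes optimum.pipelines with accelerator="ort"; the model directory is the one
# produced by the optimum-cli export shown further below).
from optimum.pipelines import pipeline

onnx_qa = pipeline(
    "question-answering",
    model="distilbert_base_uncased_squad_onnx",
    accelerator="ort",
)
print(onnx_qa(question="What am I using?", context="Using DistilBERT with ONNX Runtime!"))
```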
🤗 Optimum provides support for the ONNX export by leveraging configuration objects. These configuration objects come ready-made for a number of model architectures, and are designed to be easily extendable to other architectures.
For the list of ready-made configurations, please refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/overview).
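If you prefer to check programmatically whether a ready-made configuration exists for a given architecture, one option is the `TasksManager` helper (a sketch assuming the `optimum.exporters.tasks` API of recent Optimum versions; check your installed version):
```python
# Hedged sketch: list the tasks with a ready-made ONNX export configuration for DistilBERT.
from optimum.exporters.tasks import TasksManager

print(list(TasksManager.get_supported_tasks_for_model_type("distilbert", "onnx").keys()))
```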
There are two ways to export a 🤗 Transformers model to ONNX; here we show both:
- export with 🤗 Optimum via the CLI.
- export with 🤗 Optimum via `optimum.onnxruntime`.
### Exporting a 🤗 Transformers model to ONNX with the CLI
To export a 🤗 Transformers model to ONNX, first install an extra dependency:
```bash
pip install optimum[exporters]
```
To check out all available arguments, refer to the [🤗 Optimum docs](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli), or view the help on the command line:
```bash
optimum-cli export onnx --help
```
To export a model checkpoint from the 🤗 Hub, for example `distilbert/distilbert-base-uncased-distilled-squad`, run the following command:
```bash
optimum-cli export onnx --model distilbert/distilbert-base-uncased-distilled-squad distilbert_base_uncased_squad_onnx/
```
You should see logs indicating the progress and showing where the resulting `model.onnx` is saved, like this:
```bash
Validating ONNX model distilbert_base_uncased_squad_onnx/model.onnx...
  -[✓] ONNX model output names match reference model (start_logits, end_logits)
  - Validating ONNX Model output "start_logits":
    -[✓] (2, 16) matches (2, 16)
    -[✓] all values close (atol: 0.0001)
  - Validating ONNX Model output "end_logits":
    -[✓] (2, 16) matches (2, 16)
    -[✓] all values close (atol: 0.0001)
The ONNX export succeeded and the exported model was saved at: distilbert_base_uncased_squad_onnx
```
The example above illustrates exporting a checkpoint from the 🤗 Hub. When exporting a local model, first make sure that you saved both the model's weights and the tokenizer files in the same directory (`local_path`). When using the CLI, pass the `local_path` to the `model` argument instead of the checkpoint name on the 🤗 Hub and provide the `--task` argument. You can review the list of supported tasks in the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/task_manager). If the `task` argument is not provided, it will default to the model architecture without any task-specific head.
```bash
optimum-cli export onnx --model local_path --task question-answering distilbert_base_uncased_squad_onnx/
```
The resulting `model.onnx` file can then be run on one of the [many accelerators](https://onnx.ai/supported-tools.html#deployModel) that support the ONNX standard. For example, we can load and run the model with [ONNX Runtime](https://onnxruntime.ai/) as follows:
```python
>>> from transformers import AutoTokenizer
>>> from optimum.onnxruntime import ORTModelForQuestionAnswering
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> model = ORTModelForQuestionAnswering.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> inputs = tokenizer("What am I using?", "Using DistilBERT with ONNX Runtime!", return_tensors="pt")
>>> outputs = model(**inputs)
```
The process is identical for TensorFlow checkpoints on the Hub. For instance, here's how you would export a pure TensorFlow checkpoint from the [Keras organization](https://huggingface.co/keras-io):
```bash
optimum-cli export onnx --model keras-io/transformers-qa distilbert_base_cased_squad_onnx/
```
### Exporting a 🤗 Transformers model to ONNX with `optimum.onnxruntime`
As an alternative to the CLI, you can export a 🤗 Transformers model to ONNX programmatically like so:
```python
>>> from optimum.onnxruntime import ORTModelForSequenceClassification
>>> from transformers import AutoTokenizer
>>> model_checkpoint = "distilbert_base_uncased_squad"
>>> save_directory = "onnx/"
>>> # Load a model from transformers and export it to ONNX
>>> ort_model = ORTModelForSequenceClassification.from_pretrained(model_checkpoint, export=True)
>>> tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
>>> # Save the onnx model and tokenizer
>>> ort_model.save_pretrained(save_directory)
>>> tokenizer.save_pretrained(save_directory)
```
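As a quick follow-up (a sketch, not from the original guide), the exported directory can be reloaded with the same `ORTModel` API and used for inference right away:
```python
# Hedged follow-up sketch: reload the ONNX model saved above and run it with ONNX Runtime.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

ort_model = ORTModelForSequenceClassification.from_pretrained("onnx/")
tokenizer = AutoTokenizer.from_pretrained("onnx/")
inputs = tokenizer("ONNX Runtime makes inference fast!", return_tensors="pt")
print(ort_model(**inputs).logits)
```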
### Exporting a model for an unsupported architecture
If you wish to contribute by adding support for a model that cannot currently be exported, you should first check if it is supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview), and if it is not, [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute) directly.
### Exporting a model with `transformers.onnx`
<Tip warning={true}>
`transformers.onnx` is no longer maintained; please export models with 🤗 Optimum as described above. This section will be removed in future versions.
</Tip>
To export a 🤗 Transformers model to ONNX with `transformers.onnx`, install the extra dependencies:
```bash
pip install transformers[onnx]
```
Use the `transformers.onnx` package as a Python module to export a checkpoint using a ready-made configuration:
```bash
python -m transformers.onnx --model=distilbert/distilbert-base-uncased onnx/
```
This exports an ONNX graph of the checkpoint defined by the `--model` argument. Pass any checkpoint on the 🤗 Hub or one that's stored locally.
The resulting `model.onnx` file can then be run on one of the many accelerators that support the ONNX standard. For example, load and run the model with ONNX Runtime as follows:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```
The required output names (like `["last_hidden_state"]`) can be obtained by taking a look at the ONNX configuration of each model. For example, for DistilBERT we have:
```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```
The process is identical for TensorFlow checkpoints on the Hub. For example, export a pure TensorFlow checkpoint like so:
```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```
To export a model that's stored locally, save the model's weights and tokenizer files in the same directory (e.g. `local-pt-checkpoint`), then export it to ONNX by pointing the `--model` argument of the `transformers.onnx` package to the desired directory:
```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```

View File

@@ -0,0 +1,422 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Causal language modeling
[[open-in-colab]]
There are two types of language modeling, causal and masked. This guide illustrates causal language modeling.
Causal language models are frequently used for text generation. You can use these models for creative applications like
choosing your own text adventure or an intelligent coding assistant like Copilot or CodeParrot.
<Youtube id="Vpjb1lu0MDk"/>
Causal language modeling predicts the next token in a sequence of tokens, and the model can only attend to tokens on
the left. This means the model cannot see future tokens. GPT-2 is an example of a causal language model.
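To make the "next token is the label" setup concrete, here is a minimal sketch (not part of the original guide; it just runs the pretrained DistilGPT2 used below):
```py
>>> # Minimal sketch: for causal LM the labels are the input_ids themselves;
>>> # the model shifts them internally so position i is trained to predict token i+1.
>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
>>> model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")

>>> enc = tokenizer("Hello world", return_tensors="pt")
>>> out = model(**enc, labels=enc["input_ids"])
>>> print(out.loss)  # next-token cross-entropy
```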
This guide will show you how to:
1. Finetune [DistilGPT2](https://huggingface.co/distilbert/distilgpt2) on the [r/askscience](https://www.reddit.com/r/askscience/) subset of the [ELI5](https://huggingface.co/datasets/eli5) dataset.
2. Use your finetuned model for inference.
<Tip>
To see all architectures and checkpoints compatible with this task, we recommend checking the [task-page](https://huggingface.co/tasks/text-generation).
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load ELI5 dataset
Start by loading the first 5000 examples from the [ELI5-Category](https://huggingface.co/datasets/eli5_category) dataset with the 🤗 Datasets library. This'll give you a chance to experiment and make sure everything works before spending more time training on the full dataset.
```py
>>> from datasets import load_dataset
>>> eli5 = load_dataset("eli5_category", split="train[:5000]")
```
Split the dataset's `train` split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> eli5 = eli5.train_test_split(test_size=0.2)
```
Then take a look at an example:
```py
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers': {'a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'score': [21, 19, 5, 3],
'text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']]},
'title_urls': ['url'],
'selftext_urls': ['url']}
```
While this may look like a lot, you're only really interested in the `text` field. What's cool about language modeling
tasks is you don't need labels (also known as an unsupervised task) because the next word *is* the label.
## Preprocess
<Youtube id="ma1TrR7gE7I"/>
The next step is to load a DistilGPT2 tokenizer to process the `text` subfield:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
```
You'll notice from the example above that the `text` field is actually nested inside `answers`. This means you'll need to
extract the `text` subfield from its nested structure with the [`flatten`](https://huggingface.co/docs/datasets/process#flatten) method:
```py
>>> eli5 = eli5.flatten()
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers.a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'answers.text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'answers.score': [21, 19, 5, 3],
'answers.text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']],
'title_urls': ['url'],
'selftext_urls': ['url']}
```
Each subfield is now a separate column as indicated by the `answers` prefix, and the `text` field is a list now. Instead
of tokenizing each sentence separately, convert the list to a string so you can jointly tokenize them.
Here is a first preprocessing function to join the list of strings for each example and tokenize the result:
```py
>>> def preprocess_function(examples):
...     return tokenizer([" ".join(x) for x in examples["answers.text"]])
```
To apply this preprocessing function over the entire dataset, use the 🤗 Datasets [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once, and increasing the number of processes with `num_proc`. Remove any columns you don't need:
```py
>>> tokenized_eli5 = eli5.map(
... preprocess_function,
... batched=True,
... num_proc=4,
... remove_columns=eli5["train"].column_names,
... )
```
This dataset contains the token sequences, but some of these are longer than the maximum input length for the model.
You can now use a second preprocessing function to:
- concatenate all the sequences.
- split the concatenated sequences into shorter chunks defined by `block_size`, which should be both shorter than the maximum input length and short enough for your GPU RAM.
```py
>>> block_size = 128

>>> def group_texts(examples):
...     # Concatenate all texts.
...     concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}
...     total_length = len(concatenated_examples[list(examples.keys())[0]])
...     # We drop the small remainder; we could add padding if the model supported it
...     # instead of dropping it. You can customize this part to your needs.
...     if total_length >= block_size:
...         total_length = (total_length // block_size) * block_size
...     # Split into chunks of block_size.
...     result = {
...         k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
...         for k, t in concatenated_examples.items()
...     }
...     result["labels"] = result["input_ids"].copy()
...     return result
```
Apply the `group_texts` function over the entire dataset:
```py
>>> lm_dataset = tokenized_eli5.map(group_texts, batched=True, num_proc=4)
```
Now create a batch of examples using [`DataCollatorForLanguageModeling`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
Use the end-of-sequence token as the padding token and set `mlm=False`. This will use the inputs as labels shifted to the right by one element:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> tokenizer.pad_token = tokenizer.eos_token
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
```
</pt>
<tf>
Use the end-of-sequence token as the padding token and set `mlm=False`. This will use the inputs as labels shifted to the right by one element:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False, return_tensors="tf")
```
</tf>
</frameworkcontent>
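If you want to check what the collator produces, here is a small illustrative peek (not part of the original guide), assuming the PyTorch collator defined above:
```py
>>> # Illustrative sketch: with mlm=False the collator copies input_ids to labels
>>> # (padding positions, if any, become -100); the model shifts the labels
>>> # internally to compute the next-token loss.
>>> samples = [lm_dataset["train"][i] for i in range(2)]
>>> batch = data_collator(samples)
>>> print(batch["input_ids"].shape)  # (2, block_size)
>>> print(batch["labels"][0, :8])  # matches input_ids here, since every chunk is already block_size long
```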
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the [basic tutorial](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load DistilGPT2 with [`AutoModelForCausalLM`]:
```py
>>> from transformers import AutoModelForCausalLM, TrainingArguments, Trainer
>>> model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model).
2. Pass the training arguments to [`Trainer`] along with the model, datasets, and data collator.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_eli5_clm-model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=lm_dataset["train"],
... eval_dataset=lm_dataset["test"],
... data_collator=data_collator,
... tokenizer=tokenizer,
... )
>>> trainer.train()
```
Once training is completed, use the [`~transformers.Trainer.evaluate`] method to evaluate your model and get its perplexity:
```py
>>> import math
>>> eval_results = trainer.evaluate()
>>> print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
Perplexity: 49.61
```
Then share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the [basic tutorial](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer, AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
Then you can load DistilGPT2 with [`TFAutoModelForCausalLM`]:
```py
>>> from transformers import TFAutoModelForCausalLM
>>> model = TFAutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... lm_dataset["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_test_set = model.prepare_tf_dataset(
... lm_dataset["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer)  # No loss argument!
```
The last thing to set up before you start training is a way to push your model to the Hub. This can be done by specifying where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> callback = PushToHubCallback(
... output_dir="my_awesome_eli5_clm-model",
... tokenizer=tokenizer,
... )
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callback to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=[callback])
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for causal language modeling, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with a prompt you'd like to generate text from:
```py
>>> prompt = "Somatic hypermutation allows the immune system to"
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for text generation with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> generator = pipeline("text-generation", model="username/my_awesome_eli5_clm-model")
>>> generator(prompt)
[{'generated_text': "Somatic hypermutation allows the immune system to be able to effectively reverse the damage caused by an infection.\n\n\nThe damage caused by an infection is caused by the immune system's ability to perform its own self-correcting tasks."}]
```
<frameworkcontent>
<pt>
Tokenize the text and return the `input_ids` as PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_clm-model")
>>> inputs = tokenizer(prompt, return_tensors="pt").input_ids
```
Use the [`~generation.GenerationMixin.generate`] method to generate text.
For more details about the different text generation strategies and parameters for controlling generation, check out the [Text generation strategies](../generation_strategies) page.
```py
>>> from transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("username/my_awesome_eli5_clm-model")
>>> outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
```
Decode the generated token ids back into text:
```py
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
["Somatic hypermutation allows the immune system to react to drugs with the ability to adapt to a different environmental situation. In other words, a system of 'hypermutation' can help the immune system to adapt to a different environmental situation or in some cases even a single life. In contrast, researchers at the University of Massachusetts-Boston have found that 'hypermutation' is much stronger in mice than in humans but can be found in humans, and that it's not completely unknown to the immune system. A study on how the immune system"]
```
</pt>
<tf>
Tokenize the text and return the `input_ids` as TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_clm-model")
>>> inputs = tokenizer(prompt, return_tensors="tf").input_ids
```
Use the [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] method to generate text. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text generation strategies](../generation_strategies) page.
```py
>>> from transformers import TFAutoModelForCausalLM
>>> model = TFAutoModelForCausalLM.from_pretrained("username/my_awesome_eli5_clm-model")
>>> outputs = model.generate(input_ids=inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
```
Decode the generated token ids back into text:
```py
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Somatic hypermutation allows the immune system to detect the presence of other viruses as they become more prevalent. Therefore, researchers have identified a high proportion of human viruses. The proportion of virus-associated viruses in our study increases with age. Therefore, we propose a simple algorithm to detect the presence of these new viruses in our samples as a sign of improved immunity. A first study based on this algorithm, which will be published in Science on Friday, aims to show that this finding could translate into the development of a better vaccine that is more effective for']
```
</tf>
</frameworkcontent>

View File

@@ -0,0 +1,442 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Masked language modeling
[[open-in-colab]]
<Youtube id="mqElG5QJWUg"/>
Masked language modeling predicts a masked token in a sequence, and the model can attend to tokens bidirectionally. This
means the model has full access to the tokens on the left and right. Masked language modeling is great for tasks that
require a good contextual understanding of an entire sequence. BERT is an example of a masked language model.
This guide will show you how to:
1. Finetune [DistilRoBERTa](https://huggingface.co/distilbert/distilroberta-base) on the [r/askscience](https://www.reddit.com/r/askscience/) subset of the [ELI5](https://huggingface.co/datasets/eli5) dataset.
2. Use your finetuned model for inference.
<Tip>
To see all architectures and checkpoints compatible with this task, we recommend checking the [task-page](https://huggingface.co/tasks/fill-mask)
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load ELI5 dataset
Start by loading the first 5000 examples from the [ELI5-Category](https://huggingface.co/datasets/eli5_category) dataset with the 🤗 Datasets library. This'll give you a chance to experiment and make sure everything works before spending more time training on the full dataset.
```py
>>> from datasets import load_dataset
>>> eli5 = load_dataset("eli5_category", split="train[:5000]")
```
Split the dataset's `train` split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> eli5 = eli5.train_test_split(test_size=0.2)
```
Then take a look at an example:
```py
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers': {'a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'score': [21, 19, 5, 3],
'text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']]},
'title_urls': ['url'],
'selftext_urls': ['url']}
```
While this may look like a lot, you're only really interested in the `text` field. What's cool about language modeling tasks is you don't need labels (also known as an unsupervised task) because the next word *is* the label.
## Preprocess
<Youtube id="8PmhEIXhBvI"/>
For masked language modeling, the next step is to load a DistilRoBERTa tokenizer to process the `text` subfield:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilroberta-base")
```
You'll notice from the example above that the `text` field is actually nested inside `answers`. This means you'll need to extract the `text` subfield from its nested structure with the [`flatten`](https://huggingface.co/docs/datasets/process#flatten) method:
```py
>>> eli5 = eli5.flatten()
>>> eli5["train"][0]
{'q_id': '7h191n',
'title': 'What does the tax bill that was passed today mean? How will it affect Americans in each tax bracket?',
'selftext': '',
'category': 'Economics',
'subreddit': 'explainlikeimfive',
'answers.a_id': ['dqnds8l', 'dqnd1jl', 'dqng3i1', 'dqnku5x'],
'answers.text': ["The tax bill is 500 pages long and there were a lot of changes still going on right to the end. It's not just an adjustment to the income tax brackets, it's a whole bunch of changes. As such there is no good answer to your question. The big take aways are: - Big reduction in corporate income tax rate will make large companies very happy. - Pass through rate change will make certain styles of business (law firms, hedge funds) extremely happy - Income tax changes are moderate, and are set to expire (though it's the kind of thing that might just always get re-applied without being made permanent) - People in high tax states (California, New York) lose out, and many of them will end up with their taxes raised.",
'None yet. It has to be reconciled with a vastly different house bill and then passed again.',
'Also: does this apply to 2017 taxes? Or does it start with 2018 taxes?',
'This article explains both the House and senate bills, including the proposed changes to your income taxes based on your income level. URL_0'],
'answers.score': [21, 19, 5, 3],
'answers.text_urls': [[],
[],
[],
['https://www.investopedia.com/news/trumps-tax-reform-what-can-be-done/']],
'title_urls': ['url'],
'selftext_urls': ['url']}
```
Each subfield is now a separate column as indicated by the `answers` prefix, and the `text` field is a list now. Instead
of tokenizing each sentence separately, convert the list to a string so you can jointly tokenize them.
Here is a first preprocessing function to join the list of strings for each example and tokenize the result:
```py
>>> def preprocess_function(examples):
...     return tokenizer([" ".join(x) for x in examples["answers.text"]])
```
To apply the preprocessing function over the entire dataset, use the 🤗 Datasets [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once, and increasing the number of processes with `num_proc`. Remove any columns you don't need:
```py
>>> tokenized_eli5 = eli5.map(
... preprocess_function,
... batched=True,
... num_proc=4,
... remove_columns=eli5["train"].column_names,
... )
```
This dataset contains the token sequences, but some of these are longer than the maximum input length for the model.
You can now use a second preprocessing function to:
- concatenate all the sequences.
- split the concatenated sequences into shorter chunks defined by `block_size`, which should be both shorter than the maximum input length and short enough for your GPU RAM.
```py
>>> block_size = 128

>>> def group_texts(examples):
...     # Concatenate all texts.
...     concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}
...     total_length = len(concatenated_examples[list(examples.keys())[0]])
...     # We drop the small remainder; we could add padding if the model supported it
...     # instead of dropping it. You can customize this part to your needs.
...     if total_length >= block_size:
...         total_length = (total_length // block_size) * block_size
...     # Split into chunks of block_size.
...     result = {
...         k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
...         for k, t in concatenated_examples.items()
...     }
...     return result
```
Apply the `group_texts` function over the entire dataset:
```py
>>> lm_dataset = tokenized_eli5.map(group_texts, batched=True, num_proc=4)
```
Now create a batch of examples using [`DataCollatorForLanguageModeling`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
Use the end-of-sequence token as the padding token and specify `mlm_probability` to randomly mask tokens each time you iterate over the data:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> tokenizer.pad_token = tokenizer.eos_token
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
```
</pt>
<tf>
Use the end-of-sequence token as the padding token and specify `mlm_probability` to randomly mask tokens each time you iterate over the data:
```py
>>> from transformers import DataCollatorForLanguageModeling
>>> data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15, return_tensors="tf")
```
</tf>
</frameworkcontent>
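If you'd like to see the masking in action, here is a small illustrative check (not part of the original guide), assuming the PyTorch collator defined above:
```py
>>> # Illustrative sketch: inspect one collated batch. About 15% of positions are
>>> # selected for masking; labels stay -100 everywhere else, so the loss is only
>>> # computed on the masked positions.
>>> samples = [lm_dataset["train"][i] for i in range(2)]
>>> batch = data_collator(samples)
>>> print(batch["input_ids"].shape)  # (2, block_size)
>>> print(int((batch["labels"][0] != -100).sum()))  # number of masked positions in the first example
```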
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load DistilRoBERTa with [`AutoModelForMaskedLM`]:
```py
>>> from transformers import AutoModelForMaskedLM
>>> model = AutoModelForMaskedLM.from_pretrained("distilbert/distilroberta-base")
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model).
2. Pass the training arguments to [`Trainer`] along with the model, datasets, and data collator.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> from transformers import TrainingArguments, Trainer

>>> training_args = TrainingArguments(
... output_dir="my_awesome_eli5_mlm_model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=lm_dataset["train"],
... eval_dataset=lm_dataset["test"],
... data_collator=data_collator,
... tokenizer=tokenizer,
... )
>>> trainer.train()
```
Once training is completed, use the [`~transformers.Trainer.evaluate`] method to evaluate your model and get its
perplexity:
```py
>>> import math
>>> eval_results = trainer.evaluate()
>>> print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
Perplexity: 8.76
```
Then share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer, AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
Then you can load DistilRoBERTa with [`TFAutoModelForMaskedLM`]:
```py
>>> from transformers import TFAutoModelForMaskedLM
>>> model = TFAutoModelForMaskedLM.from_pretrained("distilbert/distilroberta-base")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... lm_dataset["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_test_set = model.prepare_tf_dataset(
... lm_dataset["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer)  # No loss argument!
```
The last thing to set up before you start training is a way to push your model to the Hub. This can be done by specifying where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> callback = PushToHubCallback(
... output_dir="my_awesome_eli5_mlm_model",
... tokenizer=tokenizer,
... )
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callback to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=[callback])
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for masked language modeling, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with some text you'd like the model to fill in the blank with, and use the special `<mask>` token to indicate the blank:
```py
>>> text = "The Milky Way is a <mask> galaxy."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for fill-mask with your model, and pass your text to it. If you like, you can use the `top_k` parameter to specify how many predictions to return:
```py
>>> from transformers import pipeline
>>> mask_filler = pipeline("fill-mask", "username/my_awesome_eli5_mlm_model")
>>> mask_filler(text, top_k=3)
[{'score': 0.5150994658470154,
'token': 21300,
'token_str': ' spiral',
'sequence': 'The Milky Way is a spiral galaxy.'},
{'score': 0.07087188959121704,
'token': 2232,
'token_str': ' massive',
'sequence': 'The Milky Way is a massive galaxy.'},
{'score': 0.06434620916843414,
'token': 650,
'token_str': ' small',
'sequence': 'The Milky Way is a small galaxy.'}]
```
<frameworkcontent>
<pt>
Tokenize the text and return the `input_ids` as PyTorch tensors. You'll also need to specify the position of the `<mask>` token:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> inputs = tokenizer(text, return_tensors="pt")
>>> import torch

>>> mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
```
Pass your inputs to the model and return the `logits` of the masked token:
```py
>>> from transformers import AutoModelForMaskedLM
>>> model = AutoModelForMaskedLM.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> logits = model(**inputs).logits
>>> mask_token_logits = logits[0, mask_token_index, :]
```
Then return the three masked tokens with the highest probability and print them out:
```py
>>> top_3_tokens = torch.topk(mask_token_logits, 3, dim=1).indices[0].tolist()
>>> for token in top_3_tokens:
... print(text.replace(tokenizer.mask_token, tokenizer.decode([token])))
The Milky Way is a spiral galaxy.
The Milky Way is a massive galaxy.
The Milky Way is a small galaxy.
```
</pt>
<tf>
Tokenize the text and return the `input_ids` as TensorFlow tensors. You'll also need to specify the position of the `<mask>` token:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> inputs = tokenizer(text, return_tensors="tf")
>>> import tensorflow as tf

>>> mask_token_index = tf.where(inputs["input_ids"] == tokenizer.mask_token_id)[0, 1]
```
Pass your inputs to the model and return the `logits` of the masked token:
```py
>>> from transformers import TFAutoModelForMaskedLM
>>> model = TFAutoModelForMaskedLM.from_pretrained("username/my_awesome_eli5_mlm_model")
>>> logits = model(**inputs).logits
>>> mask_token_logits = logits[0, mask_token_index, :]
```
Then return the three masked tokens with the highest probability and print them out:
```py
>>> top_3_tokens = tf.math.top_k(mask_token_logits, 3).indices.numpy()
>>> for token in top_3_tokens:
... print(text.replace(tokenizer.mask_token, tokenizer.decode([token])))
The Milky Way is a spiral galaxy.
The Milky Way is a massive galaxy.
The Milky Way is a small galaxy.
```
</tf>
</frameworkcontent>

View File

@@ -0,0 +1,452 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Multiple choice
[[open-in-colab]]
A multiple choice task is similar to question answering, except several candidate answers are provided along with a context, and the model is trained to select the correct answer.
This guide will show you how to:
1. Finetune [BERT](https://huggingface.co/google-bert/bert-base-uncased) on the `regular` configuration of the [SWAG](https://huggingface.co/datasets/swag) dataset to select the best answer given multiple options and some context.
2. Use your finetuned model for inference.
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load SWAG dataset
Start by loading the `regular` configuration of the SWAG dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
>>> swag = load_dataset("swag", "regular")
```
Then take a look at an example:
```py
>>> swag["train"][0]
{'ending0': 'passes by walking down the street playing their instruments.',
'ending1': 'has heard approaching them.',
'ending2': "arrives and they're outside dancing and asleep.",
'ending3': 'turns the lead singer watches the performance.',
'fold-ind': '3416',
'gold-source': 'gold',
'label': 0,
'sent1': 'Members of the procession walk down the street holding small horn brass instruments.',
'sent2': 'A drum line',
'startphrase': 'Members of the procession walk down the street holding small horn brass instruments. A drum line',
'video-id': 'anetv_jkn6uvmqwh4'}
```
While it looks like there are a lot of fields here, it is actually pretty straightforward:
- `sent1` and `sent2`: these fields show how a sentence starts, and if you put the two together, you get the `startphrase` field.
- `ending0`, `ending1`, `ending2`, `ending3`: suggest a possible sentence ending, but only one of them is correct.
- `label`: identifies the correct sentence ending.
## Preprocess
The next step is to load a BERT tokenizer to process the sentence starts and the four possible endings:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
```
The preprocessing function you want to create needs to:
1. Make four copies of the `sent1` field and combine each of them with `sent2` to recreate how a sentence starts.
2. Combine `sent2` with each of the four possible sentence endings.
3. Flatten these two lists so you can tokenize them, and then unflatten them afterward so each example has corresponding `input_ids`, `attention_mask`, and `labels` fields.
```py
>>> ending_names = ["ending0", "ending1", "ending2", "ending3"]

>>> def preprocess_function(examples):
...     first_sentences = [[context] * 4 for context in examples["sent1"]]
...     question_headers = examples["sent2"]
...     second_sentences = [
...         [f"{header} {examples[end][i]}" for end in ending_names] for i, header in enumerate(question_headers)
...     ]
...     first_sentences = sum(first_sentences, [])
...     second_sentences = sum(second_sentences, [])
...     tokenized_examples = tokenizer(first_sentences, second_sentences, truncation=True)
...     return {k: [v[i : i + 4] for i in range(0, len(v), 4)] for k, v in tokenized_examples.items()}
```
To apply the preprocessing function over the entire dataset, use the 🤗 Datasets [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
```py
tokenized_swag = swag.map(preprocess_function, batched=True)
```
🤗 Transformers doesn't have a data collator for multiple choice, so you'll need to adapt [`DataCollatorWithPadding`] to create a batch of examples. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
`DataCollatorForMultipleChoice` flattens all the model inputs, applies padding, and then unflattens the results:
<frameworkcontent>
<pt>
```py
>>> from dataclasses import dataclass
>>> from transformers.tokenization_utils_base import PreTrainedTokenizerBase, PaddingStrategy
>>> from typing import Optional, Union
>>> import torch

>>> @dataclass
... class DataCollatorForMultipleChoice:
...     """
...     Data collator that will dynamically pad the inputs for multiple choice received.
...     """
...
...     tokenizer: PreTrainedTokenizerBase
...     padding: Union[bool, str, PaddingStrategy] = True
...     max_length: Optional[int] = None
...     pad_to_multiple_of: Optional[int] = None
...
...     def __call__(self, features):
...         label_name = "label" if "label" in features[0].keys() else "labels"
...         labels = [feature.pop(label_name) for feature in features]
...         batch_size = len(features)
...         num_choices = len(features[0]["input_ids"])
...         flattened_features = [
...             [{k: v[i] for k, v in feature.items()} for i in range(num_choices)] for feature in features
...         ]
...         flattened_features = sum(flattened_features, [])
...
...         batch = self.tokenizer.pad(
...             flattened_features,
...             padding=self.padding,
...             max_length=self.max_length,
...             pad_to_multiple_of=self.pad_to_multiple_of,
...             return_tensors="pt",
...         )
...
...         batch = {k: v.view(batch_size, num_choices, -1) for k, v in batch.items()}
...         batch["labels"] = torch.tensor(labels, dtype=torch.int64)
...         return batch
```
</pt>
<tf>
```py
>>> from dataclasses import dataclass
>>> from transformers.tokenization_utils_base import PreTrainedTokenizerBase, PaddingStrategy
>>> from typing import Optional, Union
>>> import tensorflow as tf
>>> @dataclass
... class DataCollatorForMultipleChoice:
... """
... Data collator that will dynamically pad the inputs for multiple choice received.
... """
... tokenizer: PreTrainedTokenizerBase
... padding: Union[bool, str, PaddingStrategy] = True
... max_length: Optional[int] = None
... pad_to_multiple_of: Optional[int] = None
... def __call__(self, features):
... label_name = "label" if "label" in features[0].keys() else "labels"
... labels = [feature.pop(label_name) for feature in features]
... batch_size = len(features)
... num_choices = len(features[0]["input_ids"])
... flattened_features = [
... [{k: v[i] for k, v in feature.items()} for i in range(num_choices)] for feature in features
... ]
... flattened_features = sum(flattened_features, [])
... batch = self.tokenizer.pad(
... flattened_features,
... padding=self.padding,
... max_length=self.max_length,
... pad_to_multiple_of=self.pad_to_multiple_of,
... return_tensors="tf",
... )
... batch = {k: tf.reshape(v, (batch_size, num_choices, -1)) for k, v in batch.items()}
... batch["labels"] = tf.convert_to_tensor(labels, dtype=tf.int64)
... return batch
```
</tf>
</frameworkcontent>
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> accuracy = evaluate.load("accuracy")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the accuracy:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions, labels = eval_pred
... predictions = np.argmax(predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=labels)
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load BERT with [`AutoModelForMultipleChoice`]:
```py
>>> from transformers import AutoModelForMultipleChoice, TrainingArguments, Trainer
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-uncased")
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the accuracy and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_swag_model",
... eval_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... learning_rate=5e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_swag["train"],
... eval_dataset=tokenized_swag["validation"],
... processing_class=tokenizer,
... data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer),
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer
>>> batch_size = 16
>>> num_train_epochs = 2
>>> total_train_steps = (len(tokenized_swag["train"]) // batch_size) * num_train_epochs
>>> optimizer, schedule = create_optimizer(init_lr=5e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
```
Then you can load BERT with [`TFAutoModelForMultipleChoice`]:
```py
>>> from transformers import TFAutoModelForMultipleChoice
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-uncased")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> data_collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_swag["train"],
... shuffle=True,
... batch_size=batch_size,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_swag["validation"],
... shuffle=False,
... batch_size=batch_size,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
```py
>>> model.compile(optimizer=optimizer) # No loss argument!
```
The last two things to set up before you start training are computing the accuracy from the predictions, and providing a way to push your model to the Hub. Both are done by using [Keras callbacks](../main_classes/keras_callbacks).
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_model",
... tokenizer=tokenizer,
... )
```
Then bundle your callbacks together:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=2, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for multiple choice, take a look at the corresponding [PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with some text and two candidate answers:
```py
>>> prompt = "France has a bread law, Le Décret Pain, with strict rules on what is allowed in a traditional baguette."
>>> candidate1 = "The law does not apply to croissants and brioche."
>>> candidate2 = "The law applies to baguettes."
```
<frameworkcontent>
<pt>
Tokenize each prompt and candidate answer pair and return PyTorch tensors. You should also create some `labels`:
```py
>>> import torch
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_swag_model")
>>> inputs = tokenizer([[prompt, candidate1], [prompt, candidate2]], return_tensors="pt", padding=True)
>>> labels = torch.tensor(0).unsqueeze(0)
```
Pass your inputs and labels to the model and return the `logits`:
```py
>>> from transformers import AutoModelForMultipleChoice
>>> model = AutoModelForMultipleChoice.from_pretrained("username/my_awesome_swag_model")
>>> outputs = model(**{k: v.unsqueeze(0) for k, v in inputs.items()}, labels=labels)
>>> logits = outputs.logits
```
Extract the class with the highest probability:
```py
>>> predicted_class = logits.argmax().item()
>>> predicted_class
0
```
</pt>
<tf>
Tokenize each prompt and candidate answer pair and return TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_swag_model")
>>> inputs = tokenizer([[prompt, candidate1], [prompt, candidate2]], return_tensors="tf", padding=True)
```
Pass your inputs to the model and return the `logits`:
```py
>>> import tensorflow as tf
>>> from transformers import TFAutoModelForMultipleChoice
>>> model = TFAutoModelForMultipleChoice.from_pretrained("username/my_awesome_swag_model")
>>> inputs = {k: tf.expand_dims(v, 0) for k, v in inputs.items()}
>>> outputs = model(inputs)
>>> logits = outputs.logits
```
Extract the class with the highest probability:
```py
>>> predicted_class = int(tf.math.argmax(logits, axis=-1)[0])
>>> predicted_class
0
```
</tf>
</frameworkcontent>


@ -0,0 +1,432 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Question answering
[[open-in-colab]]
<Youtube id="ajPx5LwJD-I"/>
Question answering tasks return an answer given a question. If you've ever asked a virtual assistant like Alexa, Siri or Google what the weather is, then you've used a question answering model before. There are two common types of question answering tasks:
- Extractive: extract the answer from the given context.
- Abstractive: generate an answer from the context that correctly answers the question.
This guide will show you how to:
1. Finetune [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) on the [SQuAD](https://huggingface.co/datasets/squad) dataset for extractive question answering.
2. Use your finetuned model for inference.
<Tip>
To see all architectures and checkpoints compatible with this task, we recommend checking the [task page](https://huggingface.co/tasks/question-answering).
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load the SQuAD dataset
Start by loading a smaller subset of the SQuAD dataset from the 🤗 Datasets library. This'll give you a chance to experiment and make sure everything works before spending more time training on the full dataset.
```py
>>> from datasets import load_dataset
>>> squad = load_dataset("squad", split="train[:5000]")
```
Split the dataset's `train` split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> squad = squad.train_test_split(test_size=0.2)
```
Then take a look at an example:
```py
>>> squad["train"][0]
{'answers': {'answer_start': [515], 'text': ['Saint Bernadette Soubirous']},
'context': 'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.',
'id': '5733be284776f41900661182',
'question': 'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?',
'title': 'University_of_Notre_Dame'
}
```
There are several important fields here:
- `answers`: the starting location of the answer token and the answer text.
- `context`: background information from which the model needs to extract the answer.
- `question`: the question a model should answer.
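Note that `answer_start` is a character offset into `context`, not a token index. Slicing the context with it recovers the answer text exactly; for the example shown above (your first example may differ since the split is shuffled), it looks like this:
```py
>>> sample = squad["train"][0]
>>> start = sample["answers"]["answer_start"][0]
>>> answer = sample["answers"]["text"][0]
>>> sample["context"][start : start + len(answer)]
'Saint Bernadette Soubirous'
```
This mismatch between character offsets and token indices is exactly what the offset mapping in the next section resolves.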
## Preprocess
<Youtube id="qgaM0weJHpA"/>
The next step is to load a DistilBERT tokenizer to process the `question` and `context` fields:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
There are a few preprocessing steps particular to question answering tasks you should be aware of:
1. Some examples in a dataset may have a very long `context` that exceeds the maximum input length of the model. To deal with longer sequences, truncate only the `context` by setting `truncation="only_second"`.
2. Next, map the start and end positions of the answer to the original `context` by setting `return_offsets_mapping=True`.
3. With the mapping in hand, you can now find the start and end tokens of the answer. Use the [`~tokenizers.Encoding.sequence_ids`] method to find which part of the offsets corresponds to the `question` and which corresponds to the `context`.
Here is how you can create a function to truncate and map the start and end tokens of the `answer` to the `context`:
```py
>>> def preprocess_function(examples):
... questions = [q.strip() for q in examples["question"]]
... inputs = tokenizer(
... questions,
... examples["context"],
... max_length=384,
... truncation="only_second",
... return_offsets_mapping=True,
... padding="max_length",
... )
... offset_mapping = inputs.pop("offset_mapping")
... answers = examples["answers"]
... start_positions = []
... end_positions = []
... for i, offset in enumerate(offset_mapping):
... answer = answers[i]
... start_char = answer["answer_start"][0]
... end_char = answer["answer_start"][0] + len(answer["text"][0])
... sequence_ids = inputs.sequence_ids(i)
... # Find the start and end of the context
... idx = 0
... while sequence_ids[idx] != 1:
... idx += 1
... context_start = idx
... while sequence_ids[idx] == 1:
... idx += 1
... context_end = idx - 1
... # If the answer is not fully inside the context, label it (0, 0)
... if offset[context_start][0] > end_char or offset[context_end][1] < start_char:
... start_positions.append(0)
... end_positions.append(0)
... else:
... # Otherwise it's the start and end token positions
... idx = context_start
... while idx <= context_end and offset[idx][0] <= start_char:
... idx += 1
... start_positions.append(idx - 1)
... idx = context_end
... while idx >= context_start and offset[idx][1] >= end_char:
... idx -= 1
... end_positions.append(idx + 1)
... inputs["start_positions"] = start_positions
... inputs["end_positions"] = end_positions
... return inputs
```
To apply the preprocessing function over the entire dataset, use 🤗 Datasets' [`~datasets.Dataset.map`] function. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once. Remove any columns you don't need:
```py
>>> tokenized_squad = squad.map(preprocess_function, batched=True, remove_columns=squad["train"].column_names)
```
Now create a batch of examples using [`DefaultDataCollator`]. Unlike other data collators in 🤗 Transformers, the [`DefaultDataCollator`] does not apply any additional preprocessing such as padding.
<frameworkcontent>
<pt>
```py
>>> from transformers import DefaultDataCollator
>>> data_collator = DefaultDataCollator()
```
</pt>
<tf>
```py
>>> from transformers import DefaultDataCollator
>>> data_collator = DefaultDataCollator(return_tensors="tf")
```
</tf>
</frameworkcontent>
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load DistilBERT with [`AutoModelForQuestionAnswering`]:
```py
>>> from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer
>>> model = AutoModelForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model).
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, and data collator.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_qa_model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_squad["train"],
... eval_dataset=tokenized_squad["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer
>>> batch_size = 16
>>> num_epochs = 2
>>> total_train_steps = (len(tokenized_squad["train"]) // batch_size) * num_epochs
>>> optimizer, schedule = create_optimizer(
... init_lr=2e-5,
... num_warmup_steps=0,
... num_train_steps=total_train_steps,
... )
```
Then you can load DistilBERT with [`TFAutoModelForQuestionAnswering`]:
```py
>>> from transformers import TFAutoModelForQuestionAnswering
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_squad["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_squad["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer)
```
The last thing to set up before you start training is to provide a way to push your model to the Hub. This can be done by specifying where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> callback = PushToHubCallback(
... output_dir="my_awesome_qa_model",
... tokenizer=tokenizer,
... )
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callback to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=[callback])
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for question answering, take a look at the corresponding [PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb).
</Tip>
## Evaluate
Evaluation for question answering requires a significant amount of postprocessing. To avoid taking up too much of your time, this guide skips the evaluation step. The [`Trainer`] still computes the evaluation loss during training, so you're not completely in the dark about your model's performance.
If you have more time and you're interested in how to evaluate your model for question answering, take a look at the [Question answering](https://huggingface.co/course/chapter7/7?fw=pt#post-processing) chapter from the 🤗 Hugging Face Course!
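If you just want a feel for what that postprocessing involves, here is a minimal sketch (our own illustration, not the course's implementation): instead of taking the start and end argmax independently, score candidate spans jointly and keep the best valid one:
```py
>>> import numpy as np
>>> def best_span(start_logits, end_logits, n_best=20, max_answer_length=30):
...     """Return the (start, end) token indices with the highest combined score.
...     Expects 1-D arrays of logits for a single example; only spans with
...     start <= end and a length below max_answer_length are considered."""
...     start_candidates = np.argsort(start_logits)[-n_best:]
...     end_candidates = np.argsort(end_logits)[-n_best:]
...     best = (0, 0, -float("inf"))
...     for start in start_candidates:
...         for end in end_candidates:
...             if start <= end < start + max_answer_length:
...                 score = start_logits[start] + end_logits[end]
...                 if score > best[2]:
...                     best = (start, end, score)
...     return best[0], best[1]
```
A full evaluation would additionally map these token indices back to character offsets and pass the decoded strings to the SQuAD metric.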
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with a question and some context you'd like the model to predict:
```py
>>> question = "How many programming languages does BLOOM support?"
>>> context = "BLOOM has 176 billion parameters and can generate text in 46 languages natural languages and 13 programming languages."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for question answering with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model="my_awesome_qa_model")
>>> question_answerer(question=question, context=context)
{'score': 0.2058267742395401,
'start': 10,
'end': 95,
 'answer': '176 billion parameters and can generate text in 46 languages natural languages and 13'}
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_qa_model")
>>> inputs = tokenizer(question, context, return_tensors="pt")
```
Pass your inputs to the model and return the `logits`:
```py
>>> import torch
>>> from transformers import AutoModelForQuestionAnswering
>>> model = AutoModelForQuestionAnswering.from_pretrained("my_awesome_qa_model")
>>> with torch.no_grad():
... outputs = model(**inputs)
```
Get the highest probability from the model output for the start and end positions:
```py
>>> answer_start_index = outputs.start_logits.argmax()
>>> answer_end_index = outputs.end_logits.argmax()
```
Decode the predicted tokens to get the answer:
```py
>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens)
'176 billion parameters and can generate text in 46 languages natural languages and 13'
```
</pt>
<tf>
Tokenize the text and return TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_qa_model")
>>> inputs = tokenizer(question, context, return_tensors="tf")
```
Pass your inputs to the model and return the `logits`:
```py
>>> from transformers import TFAutoModelForQuestionAnswering
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("my_awesome_qa_model")
>>> outputs = model(**inputs)
```
Get the highest probability from the model output for the start and end positions:
```py
>>> import tensorflow as tf
>>> answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
>>> answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])
```
Decode the predicted tokens to get the answer:
```py
>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens)
'176 billion parameters and can generate text in 46 languages natural languages and 13'
```
</tf>
</frameworkcontent>


@ -0,0 +1,387 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Text classification
[[open-in-colab]]
<Youtube id="leNG9fN9FQU"/>
Text classification is a common NLP task that assigns a label or class to text. Some of the largest companies run text classification in production for a wide range of practical applications. One of the most popular forms of text classification is sentiment analysis, which assigns a label like 🙂 positive, 🙁 negative, or 😐 neutral to a sequence of text.
This guide will show you how to:
1. Finetune [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) on the [IMDb](https://huggingface.co/datasets/imdb) dataset to determine whether a movie review is positive or negative.
2. Use your finetuned model for inference.
<Tip>
To see all architectures and checkpoints compatible with this task, we recommend checking the [task page](https://huggingface.co/tasks/text-classification).
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate accelerate
```
We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load the IMDb dataset
Start by loading the IMDb dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
>>> imdb = load_dataset("imdb")
```
Then take a look at an example:
```py
>>> imdb["test"][0]
{
"label": 0,
"text": "I love sci-fi and am willing to put up with a lot. Sci-fi movies/TV are usually underfunded, under-appreciated and misunderstood. I tried to like this, I really did, but it is to good TV sci-fi as Babylon 5 is to Star Trek (the original). Silly prosthetics, cheap cardboard sets, stilted dialogues, CG that doesn't match the background, and painfully one-dimensional characters cannot be overcome with a 'sci-fi' setting. (I'm sure there are those of you out there who think Babylon 5 is good sci-fi TV. It's not. It's clichéd and uninspiring.) While US viewers might like emotion and character development, sci-fi is a genre that does not take itself seriously (cf. Star Trek). It may treat important issues, yet not as a serious philosophy. It's really difficult to care about the characters here as they are not simply foolish, just missing a spark of life. Their actions and reactions are wooden and predictable, often painful to watch. The makers of Earth KNOW it's rubbish as they have to always say \"Gene Roddenberry's Earth...\" otherwise people would not continue watching. Roddenberry's ashes must be turning in their orbit as this dull, cheap, poorly edited (watching it without advert breaks really brings this home) trudging Trabant of a show lumbers into space. Spoiler. So, kill off a main character. And then bring him back as another actor. Jeeez! Dallas all over again.",
}
```
There are two fields in this dataset:
- `text`: the movie review text.
- `label`: a value that is either `0` for a negative review or `1` for a positive review.
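You can confirm what the integer labels mean directly from the dataset features (assuming the standard `imdb` class names):
```py
>>> imdb["train"].features["label"].names
['neg', 'pos']
```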
## Preprocess
The next step is to load a DistilBERT tokenizer to preprocess the `text` field:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
Create a preprocessing function to tokenize `text` and truncate sequences to be no longer than DistilBERT's maximum input length:
```py
>>> def preprocess_function(examples):
... return tokenizer(examples["text"], truncation=True)
```
To apply the preprocessing function over the entire dataset, use 🤗 Datasets' [`~datasets.Dataset.map`] function. You can speed up `map` by setting `batched=True` to process multiple elements of the dataset at once:
```py
tokenized_imdb = imdb.map(preprocess_function, batched=True)
```
Now create a batch of examples using [`DataCollatorWithPadding`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorWithPadding
>>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorWithPadding
>>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer, return_tensors="tf")
```
</tf>
</frameworkcontent>
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> accuracy = evaluate.load("accuracy")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the accuracy:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions, labels = eval_pred
... predictions = np.argmax(predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=labels)
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
Before you start training your model, create a map of the expected ids to their labels with `id2label` and `label2id`:
```py
>>> id2label = {0: "NEGATIVE", 1: "POSITIVE"}
>>> label2id = {"NEGATIVE": 0, "POSITIVE": 1}
```
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load DistilBERT with [`AutoModelForSequenceClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer
>>> model = AutoModelForSequenceClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
... )
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the accuracy and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_model",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=2,
... weight_decay=0.01,
... eval_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_imdb["train"],
... eval_dataset=tokenized_imdb["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
<Tip>
[`Trainer`] applies dynamic padding by default when you pass `tokenizer` to it. In that case, you don't need to specify a data collator explicitly.
</Tip>
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer
>>> import tensorflow as tf
>>> batch_size = 16
>>> num_epochs = 5
>>> batches_per_epoch = len(tokenized_imdb["train"]) // batch_size
>>> total_train_steps = int(batches_per_epoch * num_epochs)
>>> optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
```
Then you can load DistilBERT with [`TFAutoModelForSequenceClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import TFAutoModelForSequenceClassification
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
... )
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_imdb["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_imdb["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # No loss argument!
```
The last two things to set up before you start training are computing the accuracy from the predictions, and providing a way to push your model to the Hub. Both are done by using [Keras callbacks](../main_classes/keras_callbacks).
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_model",
... tokenizer=tokenizer,
... )
```
Then bundle your callbacks together:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for text classification, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Grab some text you'd like to run inference on:
```py
>>> text = "This was a masterpiece. Not completely faithful to the books, but enthralling from beginning to end. Might be my favorite of the three."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for sentiment analysis with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> classifier = pipeline("sentiment-analysis", model="stevhliu/my_awesome_model")
>>> classifier(text)
[{'label': 'POSITIVE', 'score': 0.9994940757751465}]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_model")
>>> inputs = tokenizer(text, return_tensors="pt")
```
Pass your inputs to the model and return the `logits`:
```py
>>> import torch
>>> from transformers import AutoModelForSequenceClassification
>>> model = AutoModelForSequenceClassification.from_pretrained("stevhliu/my_awesome_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
Extract the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label:
```py
>>> predicted_class_id = logits.argmax().item()
>>> model.config.id2label[predicted_class_id]
'POSITIVE'
```
</pt>
<tf>
Tokenize the text and return TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_model")
>>> inputs = tokenizer(text, return_tensors="tf")
```
Pass your inputs to the model and return the `logits`:
```py
>>> from transformers import TFAutoModelForSequenceClassification
>>> model = TFAutoModelForSequenceClassification.from_pretrained("stevhliu/my_awesome_model")
>>> logits = model(**inputs).logits
```
Extract the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label:
```py
>>> import tensorflow as tf
>>> predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])
>>> model.config.id2label[predicted_class_id]
'POSITIVE'
```
</tf>
</frameworkcontent>

File diff suppressed because one or more lines are too long


@ -0,0 +1,550 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Token classification
[[open-in-colab]]
<Youtube id="wVHdVlPScxA"/>
Token classification assigns a label to individual tokens in a sentence. One of the most common token classification tasks is Named Entity Recognition (NER). NER attempts to find a label for each entity in a sentence, such as a person, location, or organization.
This guide will show you how to:
1. Finetune [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) on the [WNUT 17](https://huggingface.co/datasets/wnut_17) dataset to detect new entities.
2. Use your finetuned model for inference.
<Tip>
To see all architectures and checkpoints compatible with this task, we recommend checking the [task page](https://huggingface.co/tasks/token-classification).
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate seqeval
```
We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load the WNUT 17 dataset
Start by loading the WNUT 17 dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
>>> wnut = load_dataset("wnut_17")
```
Then take a look at an example:
```py
>>> wnut["train"][0]
{'id': '0',
'ner_tags': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 8, 8, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0],
'tokens': ['@paulwalk', 'It', "'s", 'the', 'view', 'from', 'where', 'I', "'m", 'living', 'for', 'two', 'weeks', '.', 'Empire', 'State', 'Building', '=', 'ESB', '.', 'Pretty', 'bad', 'storm', 'here', 'last', 'evening', '.']
}
```
Each number in `ner_tags` represents an entity. Convert the numbers to their label names to find out what the entities are:
```py
>>> label_list = wnut["train"].features[f"ner_tags"].feature.names
>>> label_list
[
"O",
"B-corporation",
"I-corporation",
"B-creative-work",
"I-creative-work",
"B-group",
"I-group",
"B-location",
"I-location",
"B-person",
"I-person",
"B-product",
"I-product",
]
```
The letter that prefixes each `ner_tag` indicates the token position of the entity:
- `B-` indicates the beginning of an entity.
- `I-` indicates a token is contained inside the same entity (for example, the `State` token is part of an entity like `Empire State Building`).
- `O` indicates the token doesn't correspond to any entity.
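To see the scheme in action, pair each token of the example above with its tag name and keep only the entity tokens (a quick check, not part of the original guide):
```py
>>> sample = wnut["train"][0]
>>> [(token, label_list[tag]) for token, tag in zip(sample["tokens"], sample["ner_tags"]) if tag != 0]
[('Empire', 'B-location'), ('State', 'I-location'), ('Building', 'I-location'), ('ESB', 'B-location')]
```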
## Preprocess
<Youtube id="iY2AZYdZAr0"/>
The next step is to load a DistilBERT tokenizer to preprocess the `tokens` field:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
As you saw in the example `tokens` field above, it looks like the input has already been tokenized. But the input actually hasn't been tokenized yet, and you'll need to set `is_split_into_words=True` to tokenize the words into subwords. For example:
```py
>>> example = wnut["train"][0]
>>> tokenized_input = tokenizer(example["tokens"], is_split_into_words=True)
>>> tokens = tokenizer.convert_ids_to_tokens(tokenized_input["input_ids"])
>>> tokens
['[CLS]', '@', 'paul', '##walk', 'it', "'", 's', 'the', 'view', 'from', 'where', 'i', "'", 'm', 'living', 'for', 'two', 'weeks', '.', 'empire', 'state', 'building', '=', 'es', '##b', '.', 'pretty', 'bad', 'storm', 'here', 'last', 'evening', '.', '[SEP]']
```
However, this adds the special tokens `[CLS]` and `[SEP]`, and the subword tokenization creates a mismatch between the input and labels. A single word corresponding to a single label may now be split into two subwords. You'll need to realign the tokens and labels by:
1. Mapping all tokens to their corresponding word with the [`word_ids`](https://huggingface.co/docs/transformers/main_classes/tokenizer#transformers.BatchEncoding.word_ids) method.
2. Assigning the label `-100` to the special tokens `[CLS]` and `[SEP]` so they're ignored by the PyTorch loss function (see [CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html)).
3. Only labeling the first token of a given word. Assign `-100` to other subtokens from the same word.
Here is how you can create a function to realign the tokens and labels, and truncate sequences to be no longer than DistilBERT's maximum input length:
```py
>>> def tokenize_and_align_labels(examples):
... tokenized_inputs = tokenizer(examples["tokens"], truncation=True, is_split_into_words=True)
... labels = []
... for i, label in enumerate(examples[f"ner_tags"]):
... word_ids = tokenized_inputs.word_ids(batch_index=i) # Map tokens to their corresponding word.
... previous_word_idx = None
... label_ids = []
... for word_idx in word_ids: # Set the special tokens to -100.
... if word_idx is None:
... label_ids.append(-100)
... elif word_idx != previous_word_idx: # Only label the first token of a given word.
... label_ids.append(label[word_idx])
... else:
... label_ids.append(-100)
... previous_word_idx = word_idx
... labels.append(label_ids)
... tokenized_inputs["labels"] = labels
... return tokenized_inputs
```
To apply the preprocessing function over the entire dataset, use 🤗 Datasets' [`~datasets.Dataset.map`] function. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
```py
>>> tokenized_wnut = wnut.map(tokenize_and_align_labels, batched=True)
```
Now create a batch of examples using [`DataCollatorForTokenClassification`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorForTokenClassification
>>> data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorForTokenClassification
>>> data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer, return_tensors="tf")
```
</tf>
</frameworkcontent>
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [seqeval](https://huggingface.co/spaces/evaluate-metric/seqeval) framework (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric). Seqeval actually produces several scores: precision, recall, F1, and accuracy.
```py
>>> import evaluate
>>> seqeval = evaluate.load("seqeval")
```
Get the NER labels first, and then create a function that passes your true predictions and true labels to [`~evaluate.EvaluationModule.compute`] to calculate the scores:
```py
>>> import numpy as np
>>> labels = [label_list[i] for i in example[f"ner_tags"]]
>>> def compute_metrics(p):
... predictions, labels = p
... predictions = np.argmax(predictions, axis=2)
... true_predictions = [
... [label_list[p] for (p, l) in zip(prediction, label) if l != -100]
... for prediction, label in zip(predictions, labels)
... ]
... true_labels = [
... [label_list[l] for (p, l) in zip(prediction, label) if l != -100]
... for prediction, label in zip(predictions, labels)
... ]
... results = seqeval.compute(predictions=true_predictions, references=true_labels)
... return {
... "precision": results["overall_precision"],
... "recall": results["overall_recall"],
... "f1": results["overall_f1"],
... "accuracy": results["overall_accuracy"],
... }
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
Before you start training your model, create a map of the expected ids to their labels with `id2label` and `label2id`:
```py
>>> id2label = {
... 0: "O",
... 1: "B-corporation",
... 2: "I-corporation",
... 3: "B-creative-work",
... 4: "I-creative-work",
... 5: "B-group",
... 6: "I-group",
... 7: "B-location",
... 8: "I-location",
... 9: "B-person",
... 10: "I-person",
... 11: "B-product",
... 12: "I-product",
... }
>>> label2id = {
... "O": 0,
... "B-corporation": 1,
... "I-corporation": 2,
... "B-creative-work": 3,
... "I-creative-work": 4,
... "B-group": 5,
... "I-group": 6,
... "B-location": 7,
... "I-location": 8,
... "B-person": 9,
... "I-person": 10,
... "B-product": 11,
... "I-product": 12,
... }
```
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load DistilBERT with [`AutoModelForTokenClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import AutoModelForTokenClassification, TrainingArguments, Trainer
>>> model = AutoModelForTokenClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=13, id2label=id2label, label2id=label2id
... )
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the seqeval scores and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="my_awesome_wnut_model",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=2,
... weight_decay=0.01,
... eval_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_wnut["train"],
... eval_dataset=tokenized_wnut["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer
>>> batch_size = 16
>>> num_train_epochs = 3
>>> num_train_steps = (len(tokenized_wnut["train"]) // batch_size) * num_train_epochs
>>> optimizer, lr_schedule = create_optimizer(
... init_lr=2e-5,
... num_train_steps=num_train_steps,
... weight_decay_rate=0.01,
... num_warmup_steps=0,
... )
```
Then you can load DistilBERT with [`TFAutoModelForTokenClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import TFAutoModelForTokenClassification
>>> model = TFAutoModelForTokenClassification.from_pretrained(
... "distilbert/distilbert-base-uncased", num_labels=13, id2label=id2label, label2id=label2id
... )
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_wnut["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_wnut["validation"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # No loss argument!
```
The last two things to set up before you start training are computing the seqeval scores from the predictions, and providing a way to push your model to the Hub. Both are done by using [Keras callbacks](../main_classes/keras_callbacks).
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_wnut_model",
... tokenizer=tokenizer,
... )
```
Then bundle your callbacks together:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for token classification, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Grab some text you'd like to run inference on:
```py
>>> text = "The Golden State Warriors are an American professional basketball team based in San Francisco."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for NER with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> classifier = pipeline("ner", model="stevhliu/my_awesome_wnut_model")
>>> classifier(text)
[{'entity': 'B-location',
'score': 0.42658573,
'index': 2,
'word': 'golden',
'start': 4,
'end': 10},
{'entity': 'I-location',
'score': 0.35856336,
'index': 3,
'word': 'state',
'start': 11,
'end': 16},
{'entity': 'B-group',
'score': 0.3064001,
'index': 4,
'word': 'warriors',
'start': 17,
'end': 25},
{'entity': 'B-location',
'score': 0.65523505,
'index': 13,
'word': 'san',
'start': 80,
'end': 83},
{'entity': 'B-location',
'score': 0.4668663,
'index': 14,
'word': 'francisco',
'start': 84,
'end': 93}]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> inputs = tokenizer(text, return_tensors="pt")
```
Pass your inputs to the model and return the `logits`:
```py
>>> import torch
>>> from transformers import AutoModelForTokenClassification
>>> model = AutoModelForTokenClassification.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label:
```py
>>> predictions = torch.argmax(logits, dim=2)
>>> predicted_token_class = [model.config.id2label[t.item()] for t in predictions[0]]
>>> predicted_token_class
['O',
'O',
'B-location',
'I-location',
'B-group',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'B-location',
'B-location',
'O',
'O']
```
</pt>
<tf>
Tokenize the text and return TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> inputs = tokenizer(text, return_tensors="tf")
```
Pass your inputs to the model and return the `logits`:
```py
>>> from transformers import TFAutoModelForTokenClassification
>>> model = TFAutoModelForTokenClassification.from_pretrained("stevhliu/my_awesome_wnut_model")
>>> logits = model(**inputs).logits
```
Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label:
```py
>>> import tensorflow as tf
>>> predicted_token_class_ids = tf.math.argmax(logits, axis=-1)
>>> predicted_token_class = [model.config.id2label[t] for t in predicted_token_class_ids[0].numpy().tolist()]
>>> predicted_token_class
['O',
'O',
'B-location',
'I-location',
'B-group',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'O',
'B-location',
'B-location',
'O',
'O']
```
</tf>
</frameworkcontent>


@ -0,0 +1,407 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Translation
[[open-in-colab]]
<Youtube id="1JvfrvZgi6c"/>
Translation converts a sequence of text from one language to another. It is one of several tasks you can formulate as a sequence-to-sequence problem, a powerful framework for returning some output from an input, like translation or summarization. Translation systems are commonly used for translation between texts in different languages, but they can also be used for speech or some combination in between, like text-to-speech or speech-to-text.
This guide will show you how to:
1. Finetune [T5](https://huggingface.co/google-t5/t5-small) on the English-French subset of the [OPUS Books](https://huggingface.co/datasets/opus_books) dataset to translate English text to French.
2. Use your finetuned model for inference.
<Tip>
To see all architectures and checkpoints compatible with this task, we recommend checking the [task page](https://huggingface.co/tasks/translation).
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate sacrebleu
```
We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load the OPUS Books dataset
Start by loading the English-French subset of the [OPUS Books](https://huggingface.co/datasets/opus_books) dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
>>> books = load_dataset("opus_books", "en-fr")
```
Split the dataset into train and test sets with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> books = books["train"].train_test_split(test_size=0.2)
```
Then take a look at an example:
```py
>>> books["train"][0]
{'id': '90560',
'translation': {'en': 'But this lofty plateau measured only a few fathoms, and soon we reentered Our Element.',
'fr': 'Mais ce plateau élevé ne mesurait que quelques toises, et bientôt nous fûmes rentrés dans notre élément.'}}
```
`translation`: an English and French translation of the text.
## Preprocess
<Youtube id="XAR8jnZZuUs"/>
The next step is to load a T5 tokenizer to process the English-French language pairs:
```py
>>> from transformers import AutoTokenizer
>>> checkpoint = "google-t5/t5-small"
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
```
The preprocessing function you want to create needs to:
1. Prefix the input with a prompt so T5 knows this is a translation task. Some models capable of multiple NLP tasks require prompting for specific tasks.
2. Set the target language (French) in the `text_target` parameter to ensure the tokenizer processes the target text correctly. If you don't set `text_target`, the tokenizer processes the target text as English.
3. Truncate sequences to be no longer than the maximum length set by the `max_length` parameter.
```py
>>> source_lang = "en"
>>> target_lang = "fr"
>>> prefix = "translate English to French: "
>>> def preprocess_function(examples):
... inputs = [prefix + example[source_lang] for example in examples["translation"]]
... targets = [example[target_lang] for example in examples["translation"]]
... model_inputs = tokenizer(inputs, text_target=targets, max_length=128, truncation=True)
... return model_inputs
```
To apply the preprocessing function over the entire dataset, use the 🤗 Datasets [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
```py
>>> tokenized_books = books.map(preprocess_function, batched=True)
```
Now create a batch of examples using [`DataCollatorForSeq2Seq`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint, return_tensors="tf")
```
</tf>
</frameworkcontent>
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [SacreBLEU](https://huggingface.co/spaces/evaluate-metric/sacrebleu) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> metric = evaluate.load("sacrebleu")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the SacreBLEU score:
```py
>>> import numpy as np
>>> def postprocess_text(preds, labels):
... preds = [pred.strip() for pred in preds]
... labels = [[label.strip()] for label in labels]
... return preds, labels
>>> def compute_metrics(eval_preds):
... preds, labels = eval_preds
... if isinstance(preds, tuple):
... preds = preds[0]
... decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
... labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
... decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
... decoded_preds, decoded_labels = postprocess_text(decoded_preds, decoded_labels)
... result = metric.compute(predictions=decoded_preds, references=decoded_labels)
... result = {"bleu": result["score"]}
... prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
... result["gen_len"] = np.mean(prediction_lens)
... result = {k: round(v, 4) for k, v in result.items()}
... return result
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with fine-tuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load T5 with [`AutoModelForSeq2SeqLM`]:
```py
>>> from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer
>>> model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`Seq2SeqTrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the SacreBLEU metric and save the training checkpoint.
2. Pass the training arguments to [`Seq2SeqTrainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to fine-tune your model.
```py
>>> training_args = Seq2SeqTrainingArguments(
... output_dir="my_awesome_opus_books_model",
... eval_strategy="epoch",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... weight_decay=0.01,
... save_total_limit=3,
... num_train_epochs=2,
... predict_with_generate=True,
... fp16=True, #change to bf16=True for XPU
... push_to_hub=True,
... )
>>> trainer = Seq2SeqTrainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_books["train"],
... eval_dataset=tokenized_books["test"],
... processing_class=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To fine-tune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
Then you can load T5 with [`TFAutoModelForSeq2SeqLM`]:
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_books["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_test_set = model.prepare_tf_dataset(
... tokenized_books["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method). Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer) # No loss argument!
```
The last two things to set up before you start training are computing the SacreBLEU metric from the predictions, and providing a way to push your model to the Hub. Both are done with [Keras callbacks](../main_classes/keras_callbacks).
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_test_set)
```
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_opus_books_model",
... tokenizer=tokenizer,
... )
```
Then bundle your callbacks together:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to fine-tune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to fine-tune a model for translation, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation.ipynb) or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb).
</Tip>
## Inference
Great, now that you've fine-tuned a model, you can use it for inference!
Come up with some text you'd like to translate to another language. For T5, you need to prefix your input depending on the task you're working on. For translation from English to French, you should prefix your input as shown below:
```py
>>> text = "translate English to French: Legumes share resources with nitrogen-fixing bacteria."
```
The simplest way to try out your fine-tuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for translation with your model, and pass your text to it:
```py
>>> from transformers import pipeline
# Change `xx` to the input language and `yy` to the desired output language.
# Examples: "en" for English, "fr" for French, "de" for German, "es" for Spanish, "zh" for Chinese, etc.; translation_en_to_fr translates English to French.
# You can view all the language lists here: https://huggingface.co/languages
>>> translator = pipeline("translation_xx_to_yy", model="username/my_awesome_opus_books_model")
>>> translator(text)
[{'translation_text': 'Legumes partagent des ressources avec des bactéries azotantes.'}]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return the `input_ids` as PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="pt").input_ids
```
Use the [`~generation.GenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](../main_classes/text_generation) API.
```py
>>> from transformers import AutoModelForSeq2SeqLM
>>> model = AutoModelForSeq2SeqLM.from_pretrained("username/my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
Decode the generated token ids back into text:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lignées partagent des ressources avec des bactéries enfixant l'azote.'
```
</pt>
<tf>
Tokenize the text and return the `input_ids` as TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("username/my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="tf").input_ids
```
Use the [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](../main_classes/text_generation) API.
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("username/my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
Decode the generated token ids back into text:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lugumes partagent les ressources avec des bactéries fixatrices d'azote.'
```
</tf>
</frameworkcontent>

40 docs/source/ar/tflite.md Normal file
@@ -0,0 +1,40 @@
# Export to TFLite
[TensorFlow Lite](https://www.tensorflow.org/lite/guide) is a lightweight framework for deploying machine learning models on resource-constrained devices, such as mobile phones, embedded systems, and Internet of Things (IoT) devices. TFLite is designed to optimize and run models efficiently on these devices with limited compute, memory, and power consumption.
A TensorFlow Lite model is represented in a special efficient portable format identified by the `.tflite` file extension.
🤗 Optimum offers functionality to export 🤗 Transformers models to TFLite through its `exporters.tflite` module. For the list of supported model architectures, please refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/tflite/overview).
To export a model to TFLite, install the required dependencies:
```bash
pip install optimum[exporters-tf]
```
To check out all available arguments, refer to the [🤗 Optimum docs](https://huggingface.co/docs/optimum/main/en/exporters/tflite/usage_guides/export_a_model), or view the help on the command line:
```bash
optimum-cli export tflite --help
```
To export a model's checkpoint from the 🤗 Hub, for example `google-bert/bert-base-uncased`, run the following command:
```bash
optimum-cli export tflite --model google-bert/bert-base-uncased --sequence_length 128 bert_tflite/
```
You should see logs indicating the progress and showing where the resulting `model.tflite` file is saved, like this:
```bash
Validating TFLite model...
-[✓] TFLite model output names match reference model (logits)
- Validating TFLite Model output "logits":
-[✓] (1, 128, 30522) matches (1, 128, 30522)
-[x] values not close enough, max diff: 5.817413330078125e-05 (atol: 1e-05)
The TensorFlow Lite export succeeded with the warning: The maximum absolute difference between the output of the reference model and the TFLite exported model is not within the set tolerance 1e-05:
- logits: max diff = 5.817413330078125e-05.
The exported model was saved at: bert_tflite
```
The example above illustrates exporting a checkpoint from the 🤗 Hub. When exporting a local model, first make sure that you saved both the model's weights and tokenizer files in the same directory (`local_path`). When using the CLI, pass the `local_path` to the `model` argument instead of the checkpoint name on the 🤗 Hub.
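As a quick smoke test — a minimal sketch assuming the `bert_tflite/model.tflite` path from the CLI example above — you can load the exported file with TensorFlow's built-in TFLite interpreter and inspect its input signature:
```python
import tensorflow as tf

# Minimal sketch: load the exported file and inspect its inputs
# (the path is assumed from the CLI example above).
interpreter = tf.lite.Interpreter(model_path="bert_tflite/model.tflite")
interpreter.allocate_tensors()
print(interpreter.get_input_details())
```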

@@ -0,0 +1,41 @@
# Tiktoken and interaction with Transformers
Support for tiktoken model files is seamlessly integrated in 🤗 transformers when loading models
`from_pretrained` with a `tokenizer.model` tiktoken file on the Hub, which is automatically converted into our [fast tokenizer](https://huggingface.co/docs/transformers/main/en/main_classes/tokenizer#transformers.PreTrainedTokenizerFast).
### Known models that were released with a `tiktoken.model`:
- gpt2
- llama3
## Example usage
In order to load `tiktoken` files in `transformers`, ensure that the `tokenizer.model` file is a tiktoken file and it will automatically be loaded when loading `from_pretrained`. Here is how one would load a tokenizer and a model, which
can be loaded from the exact same file:
```py
from transformers import AutoTokenizer
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder="original")
```
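As a quick check — a sketch using the tokenizer loaded above — you can round-trip a sample string through the auto-converted fast tokenizer:
```py
# Sketch: encode and decode a sample string with the converted tokenizer.
ids = tokenizer("Hello world")["input_ids"]
print(ids)
print(tokenizer.decode(ids))
```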
## Create a tiktoken tokenizer
The `tokenizer.model` file contains no information about additional tokens or pattern strings. If these are important, convert the tokenizer to `tokenizer.json`, the appropriate format for [`PreTrainedTokenizerFast`].
Generate the `tokenizer.model` file with [tiktoken.get_encoding](https://github.com/openai/tiktoken/blob/63527649963def8c759b0f91f2eb69a40934e468/tiktoken/registry.py#L63) and then convert it to `tokenizer.json` with [`convert_tiktoken_to_fast`].
```py
from transformers.integrations.tiktoken import convert_tiktoken_to_fast
from tiktoken import get_encoding
# You can load your custom encoding or the one provided by OpenAI
encoding = get_encoding("gpt2")
convert_tiktoken_to_fast(encoding, "config/save/dir")
```
The resulting `tokenizer.json` file is saved to the specified directory and can be loaded with [`PreTrainedTokenizerFast`]:
```py
from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained("config/save/dir")
```
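As a rough consistency check — a sketch assuming the `encoding` object and save directory from the snippets above — the converted tokenizer should produce the same ids as the original tiktoken encoding on plain text:
```py
# Sketch: compare the converted fast tokenizer against the original
# tiktoken encoding on plain text (no special tokens added); the two
# id lists should match.
text = "Hello world"
print(encoding.encode(text))
print(tokenizer(text, add_special_tokens=False)["input_ids"])
```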

@@ -0,0 +1,154 @@
# Export to TorchScript
<Tip>
This is the very beginning of our experiments with TorchScript and we are still exploring its capabilities with variable-input-size models. It is a focus of interest to us, and we will deepen our analysis in upcoming releases, with more code examples, a more flexible implementation, and benchmarks comparing Python-based code with compiled TorchScript.
</Tip>
According to the [TorchScript documentation](https://pytorch.org/docs/stable/jit.html):
> TorchScript is a way to create serializable and optimizable models from PyTorch code.
There are two PyTorch modules, [JIT and TRACE](https://pytorch.org/docs/stable/jit.html), that allow developers to export their models to be reused in other programs such as efficiency-oriented C++ programs.
We provide an interface that allows you to export 🤗 Transformers models to TorchScript so they can be reused in a different environment than PyTorch-based Python programs. Here, we explain how to export and use our models using TorchScript.
Exporting a model requires two things:
- model instantiation with the `torchscript` flag
- a forward pass with dummy inputs
These necessities imply several things developers should be careful about, as detailed below.
## TorchScript flag and tied weights
The `torchscript` flag is necessary because most of the 🤗 Transformers language models have tied weights between their `Embedding` layer and their `Decoding` layer. TorchScript does not allow you to export models that have tied weights, so it is necessary to untie and clone the weights beforehand.
Models instantiated with the `torchscript` flag have their `Embedding` layer and `Decoding` layer separated, which means that they should not be trained down the line. Training would desynchronize the two layers, leading to unexpected results.
This is not the case for models that do not have a language model head, as those do not have tied weights. These models can be safely exported without the `torchscript` flag.
## Dummy inputs and standard lengths
The dummy inputs are used for a model's forward pass. While the inputs' values propagate through the layers, PyTorch keeps track of the different operations executed on each tensor. These recorded operations are then used to create the *trace* of the model.
The trace is created relative to the inputs' dimensions. It is therefore constrained by the dimensions of the dummy input, and will not work for any other sequence length or batch size. When trying with a different size, the following error is raised:
```
`The expanded size of the tensor (3) must match the existing size (7) at non-singleton dimension 2`
```
We recommend tracing the model with a dummy input size at least as large as the largest input that will be fed to the model during inference. Padding can help fill the missing values. However, since the model is traced with a larger input size, the dimensions of the matrix will also be large, resulting in more calculations.
Be careful of the total number of operations done on each input and follow the performance closely when exporting variable sequence-length models.
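For illustration — a sketch with an assumed maximum length of 128 — you can pad the dummy input up front so the trace covers the largest input you expect:
```python
from transformers import BertTokenizer

# Sketch: pad the dummy input to the largest sequence length expected at
# inference (128 is an assumed value) before passing it to torch.jit.trace.
enc = BertTokenizer.from_pretrained("google-bert/bert-base-uncased")
dummy = enc("Example input", padding="max_length", max_length=128, return_tensors="pt")
print(dummy["input_ids"].shape)  # (1, 128)
```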
## Using TorchScript in Python
This section demonstrates how to save and load models as well as how to use the trace for inference.
### Saving a model
To export a `BertModel` with TorchScript, instantiate `BertModel` from the `BertConfig` class and then save it to disk under the filename `traced_bert.pt`:
```python
from transformers import BertModel, BertTokenizer, BertConfig
import torch
enc = BertTokenizer.from_pretrained("google-bert/bert-base-uncased")
# Tokenizing input text
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = enc.tokenize(text)
# Masking one of the input tokens
masked_index = 8
tokenized_text[masked_index] = "[MASK]"
indexed_tokens = enc.convert_tokens_to_ids(tokenized_text)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
# Creating a dummy input
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])
dummy_input = [tokens_tensor, segments_tensors]
# Initializing the model with the torchscript flag
# Flag set to True even though it is not necessary as this model does not have an LM Head.
config = BertConfig(
    vocab_size_or_config_json_file=32000,
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    torchscript=True,
)
# Instantiating the model
model = BertModel(config)
# The model needs to be in evaluation mode
model.eval()
# If you are instantiating the model with *from_pretrained* you can also easily set the TorchScript flag
model = BertModel.from_pretrained("google-bert/bert-base-uncased", torchscript=True)
# Creating the trace
traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors])
torch.jit.save(traced_model, "traced_bert.pt")
```
### Loading a model
Now you can load the previously saved `BertModel`, `traced_bert.pt`, from disk and use it on the previously initialized `dummy_input`:
```python
loaded_model = torch.jit.load("traced_bert.pt")
loaded_model.eval()
all_encoder_layers, pooled_output = loaded_model(*dummy_input)
```
### Using a traced model for inference
Use the traced model for inference through its `__call__` method:
```python
traced_model(tokens_tensor, segments_tensors)
```
## Deploy Hugging Face TorchScript models to AWS with the Neuron SDK
AWS introduced the [Amazon EC2 Inf1](https://aws.amazon.com/ec2/instance-types/inf1/) instance family for low-cost, high-performance machine learning inference in the cloud. The Inf1 instances are powered by the AWS Inferentia chip, a custom-built hardware accelerator specializing in deep learning inference workloads. [AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/#) is the SDK for Inferentia that supports tracing and optimizing transformers models for deployment on Inf1. The Neuron SDK provides:
1. An easy-to-use API with one line of code change to trace and optimize a TorchScript model for inference in the cloud.
2. Out-of-the-box performance optimizations for [improved cost-performance](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/benchmark/).
3. Support for Hugging Face transformers models built with either [PyTorch](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.html) or [TensorFlow](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html).
### Implications
Transformers models based on the [BERT (Bidirectional Encoder Representations from Transformers)](https://huggingface.co/docs/transformers/main/model_doc/bert) architecture, or its variants such as [distilBERT](https://huggingface.co/docs/transformers/main/model_doc/distilbert) and [roBERTa](https://huggingface.co/docs/transformers/main/model_doc/roberta), run best on Inf1 for non-generative tasks such as extractive question answering, sequence classification, and token classification. However, text generation tasks can be adapted to run on Inf1 according to this [AWS Neuron MarianMT tutorial](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/transformers-marianmt.html). More information about models that can be converted out of the box on Inferentia can be found in the [Model Architecture Fit](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/models/models-inferentia.html#models-inferentia) section of the Neuron documentation.
### Dependencies
Using AWS Neuron to convert models requires a [Neuron SDK environment](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/index.html#installation-guide) which comes preconfigured on the [AWS Deep Learning AMI](https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-inferentia-launching.html).
### Converting a model for AWS Neuron
Convert a model for AWS NEURON using the same code from [Using TorchScript in Python](torchscript#using-torchscript-in-python) to trace a `BertModel`. Import the `torch.neuron` framework extension to access the components of the Neuron SDK through a Python API:
```python
from transformers import BertModel, BertTokenizer, BertConfig
import torch
import torch.neuron
```
You only need to modify the following line:
```diff
- torch.jit.trace(model, [tokens_tensor, segments_tensors])
+ torch.neuron.trace(model, [tokens_tensor, segments_tensors])
```
This enables the Neuron SDK to trace the model and optimize it for Inf1 instances.
To learn more about AWS Neuron SDK features, tools, example tutorials, and latest updates, please see the [AWS NeuronSDK documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html).

720 docs/source/ar/trainer.md Normal file
@@ -0,0 +1,720 @@
# Trainer
The [`Trainer`] provides a complete training and evaluation loop for PyTorch models implemented in the Transformers library. You only need to pass it the pieces required for training (model, tokenizer, dataset, evaluation function, training hyperparameters, etc.), and the [`Trainer`] class takes care of the rest. This makes it easier to start training faster without manually writing your own training loop. At the same time, [`Trainer`] is highly customizable and offers many training options, so you can tailor it to your exact training needs.
<Tip>
In addition to the [`Trainer`] class, Transformers also provides a [`Seq2SeqTrainer`] class for sequence-to-sequence tasks like translation or summarization. There is also the [`~trl.SFTTrainer`] class from the [TRL](https://hf.co/docs/trl) library, which wraps the [`Trainer`] class and is optimized for training language models like Llama-2 and Mistral with autoregressive techniques. [`~trl.SFTTrainer`] also supports features like sequence packing, LoRA, quantization, and DeepSpeed for efficiently scaling to any model size.
<br>
Feel free to check out the [API reference](./main_classes/trainer) for these other [`Trainer`]-type classes to learn more about when to use which one. In general, [`Trainer`] is the most versatile option and is appropriate for a broad range of tasks. [`Seq2SeqTrainer`] is designed for sequence-to-sequence tasks, and [`~trl.SFTTrainer`] is designed for training large language models.
</Tip>
Before you start, make sure [Accelerate](https://hf.co/docs/accelerate) - a library for running PyTorch training in distributed environments - is installed.
```bash
pip install accelerate
# upgrade
pip install accelerate --upgrade
```
This guide provides an overview of the [`Trainer`] class.
## Basic usage
[`Trainer`] includes all the code you'll find in a basic training loop:
1. Perform a training step to calculate the loss
2. Calculate the gradients with the [`~accelerate.Accelerator.backward`] method
3. Update the weights based on the gradients
4. Repeat this process until you've reached a predetermined number of epochs
The [`Trainer`] class abstracts all of this code away so you don't have to worry about manually writing a training loop every time, which also helps if you're just getting started with PyTorch and training. You only need to provide the essential components required for training, such as a model and a dataset, and the [`Trainer`] class handles everything else; a minimal sketch of the loop it replaces is shown below.
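For reference, here is a minimal sketch of the loop that [`Trainer`] replaces (it assumes a `model`, `optimizer`, and `train_dataloader` are already set up):
```py
# Minimal sketch of the loop Trainer abstracts away (model, optimizer, and
# train_dataloader are assumed to be defined).
num_epochs = 2
for epoch in range(num_epochs):
    for batch in train_dataloader:
        loss = model(**batch).loss   # training step: forward pass and loss
        loss.backward()              # compute the gradients
        optimizer.step()             # update the weights
        optimizer.zero_grad()        # reset the gradients for the next step
```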
If you want to specify any training options or hyperparameters, you can find them in the [`TrainingArguments`] class. For example, let's define where to save the model in `output_dir` and push the model to the Hub after training with `push_to_hub=True`.
```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="your-model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    weight_decay=0.01,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=True,
)
```
Pass `training_args` to the [`Trainer`] along with a model, a dataset, something to preprocess the dataset with (depending on your data type it could be a tokenizer, feature extractor, or image processor), a data collator, and a function to compute the metrics you want to track during training.
Finally, call [`~Trainer.train`] to start training!
```py
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

trainer.train()
```
### Checkpoints
The [`Trainer`] class saves your model checkpoints to the directory specified in the `output_dir` parameter of [`TrainingArguments`]. You'll find the checkpoints saved in a `checkpoint-000` subfolder, where the numbers at the end correspond to the training step. Saving checkpoints is useful for resuming training later.
```py
# resume from the latest checkpoint
trainer.train(resume_from_checkpoint=True)

# resume from a specific checkpoint saved in the output directory
trainer.train(resume_from_checkpoint="your-model/checkpoint-1000")
```
You can save your checkpoints (the optimizer state is not saved by default) to the Hub by setting `push_to_hub=True` in [`TrainingArguments`] to commit and push them. Other options for deciding how your checkpoints are saved are set up in the [`hub_strategy`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.hub_strategy) parameter:
* `hub_strategy="checkpoint"` pushes the latest checkpoint to a subfolder named "last-checkpoint" from which you can resume training
* `hub_strategy="all_checkpoints"` pushes all checkpoints to the directory defined in `output_dir` (you'll see one checkpoint per folder in your model repository)
When you resume training from a checkpoint, the [`Trainer`] tries to keep the Python, NumPy, and PyTorch RNG states the same as they were when the checkpoint was saved. But because PyTorch has various non-deterministic default settings, the RNG states aren't guaranteed to be the same. If you want to enable full determinism, take a look at the [Controlling sources of randomness](https://pytorch.org/docs/stable/notes/randomness#controlling-sources-of-randomness) guide to learn what you can enable to make your training fully deterministic. Keep in mind, though, that by making certain settings deterministic, training may be slower.
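As a sketch — assuming the `full_determinism` flag available in recent versions of [`TrainingArguments`] — you can also opt into fully deterministic (and slower) training directly:
```py
from transformers import TrainingArguments

# Sketch: full_determinism (available in recent versions) trades speed
# for fully reproducible runs.
training_args = TrainingArguments(output_dir="your-model", full_determinism=True)
```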
## Customize the Trainer
While the [`Trainer`] class is designed to be accessible and easy to use, it also offers a lot of customizability for more adventurous users. Many of the [`Trainer`]'s methods can be subclassed and overridden to support the functionality you want, without having to rewrite the entire training loop from scratch to accommodate it. These methods include:
* [`~Trainer.get_train_dataloader`] creates a training DataLoader
* [`~Trainer.get_eval_dataloader`] creates an evaluation DataLoader
* [`~Trainer.get_test_dataloader`] creates a test DataLoader
* [`~Trainer.log`] logs information about the various objects that watch training
* [`~Trainer.create_optimizer_and_scheduler`] creates an optimizer and learning rate scheduler if they weren't passed in the `__init__`; these can also be separately customized with [`~Trainer.create_optimizer`] and [`~Trainer.create_scheduler`] respectively
* [`~Trainer.compute_loss`] computes the loss on a batch of training inputs
* [`~Trainer.training_step`] performs the training step
* [`~Trainer.prediction_step`] performs the prediction and test step
* [`~Trainer.evaluate`] evaluates the model and returns the evaluation metrics
* [`~Trainer.predict`] makes predictions (with metrics if labels are available) on the test set
For example, here is how to customize the [`~Trainer.compute_loss`] method to use a weighted loss instead:
```py
import torch
from torch import nn
from transformers import Trainer

class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        # forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # compute custom loss for 3 labels with different weights
        loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 3.0], device=model.device))
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```
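The subclass is then used exactly like [`Trainer`] - a sketch, assuming `model`, `training_args`, and `dataset` are defined as in the basic usage section above:
```py
# Sketch: CustomTrainer is a drop-in replacement for Trainer.
trainer = CustomTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```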
### Callbacks
Another option for customizing the [`Trainer`] is to use [callbacks](callbacks). Callbacks *don't change* anything in the training loop. They inspect the training loop state and then execute some action (early stopping, logging results, etc.) depending on the state. In other words, a callback can't be used to implement something like a custom loss function; you need to override the [`~Trainer.compute_loss`] method for that.
For example, here is how to add an early stopping callback to the training loop after 10 steps:
```py
from transformers import TrainerCallback

class EarlyStoppingCallback(TrainerCallback):
    def __init__(self, num_steps=10):
        self.num_steps = num_steps

    def on_step_end(self, args, state, control, **kwargs):
        if state.global_step >= self.num_steps:
            return {"should_training_stop": True}
        else:
            return {}
```
Then pass it to the [`Trainer`]'s `callbacks` parameter:
```py
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback()],
)
```
## Logging
<Tip>
Check out the [logging](./main_classes/logging) API reference for more information about the different logging levels.
</Tip>
The [`Trainer`] is set to `logging.INFO` by default, which reports errors, warnings, and other basic information. A [`Trainer`] replica - in distributed environments - is set to `logging.WARNING`, which only reports errors and warnings. You can change the logging level with the [`log_level`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.log_level) and [`log_level_replica`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.log_level_replica) parameters in [`TrainingArguments`].
To configure the log level setting for each node, use the [`log_on_each_node`](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.TrainingArguments.log_on_each_node) parameter to determine whether to use the log level on each node or only on the main node.
<Tip>
[`Trainer`] sets the log level separately for each node in the [`Trainer.__init__`] method, so you may want to consider setting this sooner if you're using other Transformers functionalities before creating the [`Trainer`] object.
</Tip>
For example, to set your main code and modules to use the same log level according to each node:
```py
import logging
import sys

import datasets
import transformers

logger = logging.getLogger(__name__)

logging.basicConfig(
    format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
    datefmt="%m/%d/%Y %H:%M:%S",
    handlers=[logging.StreamHandler(sys.stdout)],
)

log_level = training_args.get_process_log_level()
logger.setLevel(log_level)
datasets.utils.logging.set_verbosity(log_level)
transformers.utils.logging.set_verbosity(log_level)

trainer = Trainer(...)
```
Use different combinations of `log_level` and `log_level_replica` to configure what gets logged on each of the nodes.
<hfoptions id="logging">
<hfoption id="single node">
```bash
my_app.py ... --log_level warning --log_level_replica error
```
</hfoption>
<hfoption id="multi-node">
Add the `log_on_each_node 0` parameter for multi-node environments.
```bash
my_app.py ... --log_level warning --log_level_replica error --log_on_each_node 0
# set to only report errors
my_app.py ... --log_level error --log_level_replica error --log_on_each_node 0
```
</hfoption>
</hfoptions>
## NEFTune
[NEFTune](https://hf.co/papers/2310.05914) is a technique that can improve performance by adding noise to the embedding vectors during training. To enable it in [`Trainer`], set the `neftune_noise_alpha` parameter in [`TrainingArguments`] to control how much noise is added.
```py
from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(..., neftune_noise_alpha=0.1)
trainer = Trainer(..., args=training_args)
```
NEFTune is disabled after training to restore the original embedding layer and avoid any unexpected behavior.
## Liger Kernel
[Liger-Kernel](https://github.com/linkedin/Liger-Kernel) is a collection of Triton kernels developed by LinkedIn designed specifically for LLM training. We have implemented Hugging Face compatible RMSNorm, RoPE, SwiGLU, CrossEntropy, FusedLinearCrossEntropy, with more to come. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. The kernels work out of the box with flash attention, PyTorch FSDP, and Microsoft DeepSpeed.
Gain +20% throughput and reduce memory usage by 60% on LLaMA 3-8B model training. Achieve longer context lengths and larger batch sizes. It's also useful if you want to scale up your model to multi-head training or large vocabulary sizes. Unleash multi-head training (medusa) and more. See details and examples in [Liger](https://github.com/linkedin/Liger-Kernel/tree/main/examples).
First, make sure to install the official Liger repository:
```bash
pip install liger-kernel
```
You should pass `use_liger_kernel=True` to apply the Liger kernel to your model, for example:
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="your-model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    weight_decay=0.01,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=True,
    use_liger_kernel=True,
)
```
The kernel supports the Llama, Gemma, Mistral, and Mixtral model architectures. The most up-to-date list of supported models can be found [here](https://github.com/linkedin/Liger-Kernel). When `use_liger_kernel` is set to `True`, the corresponding layers in the original model will be patched with Liger's efficient implementation, so you don't need to do anything extra other than setting the argument value.
## Optimizers
You can choose a built-in optimizer for training using:
```python
from transformers import TrainingArguments
training_args = TrainingArguments(..., optim="adamw_torch")
```
Refer to [`OptimizerNames`](https://github.com/huggingface/transformers/blob/main/src/transformers/training_args.py) for a full list of choices. We include advanced examples in the sections below.
You can also use an arbitrary PyTorch optimizer via:
```python
import torch
from transformers import Trainer

optimizer_cls = torch.optim.AdamW
optimizer_kwargs = {
    "lr": 4e-3,
    "betas": (0.9, 0.999),
    "weight_decay": 0.05,
}

trainer = Trainer(..., optimizer_cls_and_kwargs=(optimizer_cls, optimizer_kwargs))
```
### GaLore
Gradient Low-Rank Projection (GaLore) is a memory-efficient low-rank training strategy that allows full-parameter learning but is more memory-efficient than common low-rank adaptation methods, such as LoRA.
First, make sure to install the official GaLore repository:
```bash
pip install galore-torch
```
Then simply add one of `["galore_adamw", "galore_adafactor", "galore_adamw_8bit"]` in `optim`, together with `optim_target_modules`, which can be a list of strings, regexes, or full paths corresponding to the target module names you want to adapt. Below is an end-to-end example script (make sure to `pip install trl datasets`):
```python
import torch
import datasets
import trl
from transformers import TrainingArguments, AutoConfig, AutoTokenizer, AutoModelForCausalLM

train_dataset = datasets.load_dataset('imdb', split='train')

args = TrainingArguments(
    output_dir="./test-galore",
    max_steps=100,
    per_device_train_batch_size=2,
    optim="galore_adamw",
    optim_target_modules=[r".*.attn.*", r".*.mlp.*"],
)

model_id = "google/gemma-2b"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_config(config).to(0)

trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=512,
)

trainer.train()
```
To pass extra arguments supported by GaLore, pass `optim_args` accordingly, for example:
```python
import torch
import datasets
import trl
from transformers import TrainingArguments, AutoConfig, AutoTokenizer, AutoModelForCausalLM

train_dataset = datasets.load_dataset('imdb', split='train')

args = TrainingArguments(
    output_dir="./test-galore",
    max_steps=100,
    per_device_train_batch_size=2,
    optim="galore_adamw",
    optim_target_modules=[r".*.attn.*", r".*.mlp.*"],
    optim_args="rank=64, update_proj_gap=100, scale=0.10",
)

model_id = "google/gemma-2b"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_config(config).to(0)

trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=512,
)

trainer.train()
```
You can read more about the method in the [original repository](https://github.com/jiaweizzhao/GaLore) or the [paper](https://arxiv.org/abs/2403.03507).
Currently, you can only train Linear layers that are considered GaLore layers; these use low-rank decomposition for training, while the remaining layers are optimized in the conventional manner.
Note that it will take a bit of time before training starts (~3 minutes for a 2B model on an NVIDIA A100), but training should run smoothly afterwards.
You can also perform layer-wise optimization by appending `layerwise` to the optimizer name, as shown below:
```python
import torch
import datasets
import trl
from transformers import TrainingArguments, AutoConfig, AutoTokenizer, AutoModelForCausalLM

train_dataset = datasets.load_dataset('imdb', split='train')

args = TrainingArguments(
    output_dir="./test-galore",
    max_steps=100,
    per_device_train_batch_size=2,
    optim="galore_adamw_layerwise",
    optim_target_modules=[r".*.attn.*", r".*.mlp.*"],
)

model_id = "google/gemma-2b"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_config(config).to(0)

trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=512,
)

trainer.train()
```
Note that layerwise optimization is somewhat experimental and does not support DDP (Distributed Data Parallel); you can therefore only run the training script on a single GPU. Please see [this section](https://github.com/jiaweizzhao/GaLore?tab=readme-ov-file#train-7b-model-with-a-single-gpu-with-24gb-memory) for more details. Other features such as gradient clipping or DeepSpeed may not be supported out of the box. Please [raise an issue on GitHub](https://github.com/huggingface/transformers/issues) if you encounter such a problem.
### LOMO optimizers
The LOMO optimizers were introduced in [Full Parameter Fine-Tuning for Large Language Models with Limited Resources](https://hf.co/papers/2306.09782) and [AdaLomo: Low-memory Optimization with Adaptive Learning Rate](https://hf.co/papers/2310.10195).
Both are efficient full-parameter fine-tuning methods. The LOMO optimizers fuse the gradient computation and the parameter update in one step to reduce memory usage. The supported LOMO optimizers are `"lomo"` and `"adalomo"`. First, either install LOMO from pypi with `pip install lomo-optim` or install it from source with `pip install git+https://github.com/OpenLMLab/LOMO.git`.
<Tip>
According to the authors, it is recommended to use `AdaLomo` without `grad_norm` to get better performance and higher throughput.
</Tip>
Below is a simple script demonstrating how to fine-tune [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the IMDB dataset in full precision:
```python
import torch
import datasets
from transformers import TrainingArguments, AutoTokenizer, AutoModelForCausalLM
import trl

train_dataset = datasets.load_dataset('imdb', split='train')

args = TrainingArguments(
    output_dir="./test-lomo",
    max_steps=100,
    per_device_train_batch_size=4,
    optim="adalomo",
    gradient_checkpointing=True,
    logging_strategy="steps",
    logging_steps=1,
    learning_rate=2e-6,
    save_strategy="no",
    run_name="lomo-imdb",
)

model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, low_cpu_mem_usage=True).to(0)

trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=1024,
)

trainer.train()
```
### GrokAdamW optimizer
The GrokAdamW optimizer is designed to enhance training performance and stability, particularly for models that benefit from `grokking` signal functions. To use `GrokAdamW`, first install the optimizer package with `pip install grokadamw`.
<Tip>
GrokAdamW is particularly useful for models that require advanced optimization techniques to achieve better performance and stability.
</Tip>
Below is a simple script to demonstrate how to fine-tune [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the IMDB dataset using the GrokAdamW optimizer:
```python
import torch
import datasets
from transformers import TrainingArguments, AutoTokenizer, AutoModelForCausalLM, Trainer

# Load the IMDB dataset
train_dataset = datasets.load_dataset('imdb', split='train')

# Define the training arguments
args = TrainingArguments(
    output_dir="./test-grokadamw",
    max_steps=1000,
    per_device_train_batch_size=4,
    optim="grokadamw",
    logging_strategy="steps",
    logging_steps=1,
    learning_rate=2e-5,
    save_strategy="no",
    run_name="grokadamw-imdb",
)

# Load the model and tokenizer
model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, low_cpu_mem_usage=True).to(0)

# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
)

# Train the model
trainer.train()
```
This script demonstrates how to fine-tune the google/gemma-2b model on the IMDB dataset using the GrokAdamW optimizer. The TrainingArguments are configured to use GrokAdamW, and the dataset is passed to the Trainer for training.
### Schedule Free Optimizer
The Schedule Free optimizers were introduced in [The Road Less Scheduled](https://hf.co/papers/2405.15682).
Schedule-Free learning replaces the momentum of the base optimizer with a combination of averaging and interpolation, completely removing the need to anneal the learning rate with a traditional schedule.
The supported optimizers for SFO are `"schedule_free_adamw"` and `"schedule_free_sgd"`. First, install `schedulefree` from pypi with `pip install schedulefree`.
Below is a simple script to demonstrate how to fine-tune [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the IMDB dataset in full precision:
```python
import torch
import datasets
from transformers import TrainingArguments, AutoTokenizer, AutoModelForCausalLM
import trl

train_dataset = datasets.load_dataset('imdb', split='train')

args = TrainingArguments(
    output_dir="./test-schedulefree",
    max_steps=1000,
    per_device_train_batch_size=4,
    optim="schedule_free_adamw",
    gradient_checkpointing=True,
    logging_strategy="steps",
    logging_steps=1,
    learning_rate=2e-6,
    save_strategy="no",
    run_name="sfo-imdb",
)

model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, low_cpu_mem_usage=True).to(0)

trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=1024,
)

trainer.train()
```
## Accelerate and Trainer
The [`Trainer`] class is powered by [Accelerate](https://hf.co/docs/accelerate), a library for easily training PyTorch models in distributed environments with support for integrations such as [FullyShardedDataParallel (FSDP)](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/) and [DeepSpeed](https://www.deepspeed.ai/).
<Tip>
Learn more about FSDP sharding strategies, CPU offloading, and more with the [`Trainer`] in the [Fully Sharded Data Parallel guide](fsdp).
</Tip>
To use Accelerate with [`Trainer`], run the [`accelerate.config`](https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-config) command to set up training for your training environment. This command creates a `config_file.yaml` that is used when you launch your training script. For example, some of the configurations you can set up are:
<hfoptions id="config">
<hfoption id="DistributedDataParallel">
```yml
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
downcast_bf16: 'no'
gpu_ids: all
machine_rank: 0 #change rank as per the node
main_process_ip: 192.168.20.1
main_process_port: 9898
main_training_function: main
mixed_precision: fp16
num_machines: 2
num_processes: 8
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
<hfoption id="FSDP">
```yml
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch_policy: BACKWARD_PRE
  fsdp_forward_prefetch: true
  fsdp_offload_params: false
  fsdp_sharding_strategy: 1
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_sync_module_states: true
  fsdp_transformer_layer_cls_to_wrap: BertLayer
  fsdp_use_orig_params: true
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
<hfoption id="DeepSpeed">
```yml
compute_environment: LOCAL_MACHINE
deepspeed_config:
  deepspeed_config_file: /home/user/configs/ds_zero3_config.json
  zero3_init_flag: true
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
<hfoption id="DeepSpeed with Accelerate plugin">
```yml
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 0.7
  offload_optimizer_device: cpu
  offload_param_device: cpu
  zero3_init_flag: true
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
</hfoptions>
The [`accelerate_launch`](https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-launch) command is the recommended way to launch your training script on a distributed system with Accelerate and [`Trainer`], with the parameters specified in `config_file.yaml`. This file is saved to the Accelerate cache folder and automatically loaded when you run `accelerate_launch`.
For example, to run the [run_glue.py](https://github.com/huggingface/transformers/blob/f4db565b695582891e43a5e042e5d318e28f20b8/examples/pytorch/text-classification/run_glue.py#L4) training script with the FSDP configuration:
```bash
accelerate launch \
./examples/pytorch/text-classification/run_glue.py \
--model_name_or_path google-bert/bert-base-cased \
--task_name $TASK_NAME \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```
You could also specify the parameters from the `config_file.yaml` file directly in the command line:
```bash
accelerate launch --num_processes=2 \
--use_fsdp \
--mixed_precision=bf16 \
--fsdp_auto_wrap_policy=TRANSFORMER_BASED_WRAP \
--fsdp_transformer_layer_cls_to_wrap="BertLayer" \
--fsdp_sharding_strategy=1 \
--fsdp_state_dict_type=FULL_STATE_DICT \
./examples/pytorch/text-classification/run_glue.py \
--model_name_or_path google-bert/bert-base-cased \
--task_name $TASK_NAME \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```
Check out the [Launching your Accelerate scripts](https://huggingface.co/docs/accelerate/basic_tutorials/launch) tutorial to learn more about `accelerate_launch` and custom configurations.

@@ -0,0 +1,171 @@
# Troubleshoot
Sometimes errors occur, but we are here to help! This guide covers some of the most common issues we've seen and how you can resolve them. However, this guide isn't meant to be a comprehensive collection of every 🤗 Transformers issue. For more help with troubleshooting your issue, try:
<Youtube id="S2EEG3JIt2A"/>
1. Asking for help on the [forums](https://discuss.huggingface.co/). There are specific categories you can post your question to, like [Beginners](https://discuss.huggingface.co/c/beginners/5) or [🤗 Transformers](https://discuss.huggingface.co/c/transformers/9). Make sure you write a good descriptive forum post with some reproducible code to maximize the likelihood that your problem is solved!
<Youtube id="_PAli-V4wj0"/>
2. Creating an [Issue](https://github.com/huggingface/transformers/issues/new/choose) on the 🤗 Transformers repository if it is a bug related to the library. Try to include as much information describing the bug as possible to help us better figure out what's wrong and how we can fix it.
3. Checking the [Migration](migration) guide if you use an older version of 🤗 Transformers, since some important changes have been introduced between versions.
For more details about troubleshooting and getting help, take a look at [Chapter 8](https://huggingface.co/course/chapter8/1?fw=pt) of the Hugging Face course.
## Firewalled environments
Some GPU instances on cloud and intranet setups are firewalled to external connections, resulting in a connection error. When your script attempts to download model weights or datasets, the download will hang and then timeout with the following message:
```
ValueError: Connection error, and we cannot find the requested files in the cached path.
Please try again or make sure your Internet connection is on.
```
In this case, you should try to run 🤗 Transformers in [offline mode](installation#offline-mode) to avoid the connection error.
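For example — a minimal sketch using the documented `TRANSFORMERS_OFFLINE` environment variable — set the flag before importing the library so only locally cached files are used:
```py
import os

# Use only locally cached files; no network calls are attempted.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModel  # now loads from the local cache only
```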
## CUDA out of memory
Training large models with millions of parameters can be challenging without the appropriate hardware. A common error you may encounter when the GPU runs out of memory is:
```
CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 11.17 GiB total capacity; 9.70 GiB already allocated; 179.81 MiB free; 9.85 GiB reserved in total by PyTorch)
```
Here are some potential solutions you can try to reduce memory use:
- Reduce the [`per_device_train_batch_size`](main_classes/trainer#transformers.TrainingArguments.per_device_train_batch_size) value in [`TrainingArguments`].
- Try using [`gradient_accumulation_steps`](main_classes/trainer#transformers.TrainingArguments.gradient_accumulation_steps) in [`TrainingArguments`] to effectively increase the overall batch size, as in the sketch after this list.
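For example — a sketch with assumed values — you can cut the per-device batch size by 4x and compensate with 4 accumulation steps to keep the same effective batch size:
```py
from transformers import TrainingArguments

# Sketch: 4 (per-device batch) x 4 (accumulation steps) keeps an effective
# batch size of 16 while using far less activation memory per step.
training_args = TrainingArguments(
    output_dir="your-model",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
)
```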
<Tip>
Refer to the Performance [guide](performance) for more details about memory-saving techniques.
</Tip>
## Unable to load a saved TensorFlow model
TensorFlow's [model.save](https://www.tensorflow.org/tutorials/keras/save_and_load#save_the_entire_model) method saves the entire model - architecture, weights, training configuration - in a single file. However, when you load the model file again, you may run into an error because 🤗 Transformers may not load all the TensorFlow-related objects in the model file. To avoid issues with saving and loading TensorFlow models, we recommend you:
- Save the model weights as a `h5` file extension with [`model.save_weights`](https://www.tensorflow.org/tutorials/keras/save_and_load#save_the_entire_model) and then reload the model with [`~TFPreTrainedModel.from_pretrained`]:
```python
>>> from transformers import TFPreTrainedModel
>>> from tensorflow import keras
>>> model.save_weights("some_folder/tf_model.h5")
>>> model = TFPreTrainedModel.from_pretrained("some_folder")
```
- Save the model with [`~TFPretrainedModel.save_pretrained`] and load it again with [`~TFPreTrainedModel.from_pretrained`]:
```python
>>> from transformers import TFPreTrainedModel
>>> model.save_pretrained("path_to/model")
>>> model = TFPreTrainedModel.from_pretrained("path_to/model")
```
## ImportError
Another common error you may encounter, especially if it is a newly released model, is `ImportError`:
```
ImportError: cannot import name 'ImageGPTImageProcessor' from 'transformers' (unknown location)
```
For these error types, check that you have the latest version of Hugging Face Transformers installed to access the most recent models:
```bash
pip install transformers --upgrade
```
## CUDA error: device-side assert triggered
Sometimes you may run into a generic CUDA error about an error in the device code.
```
RuntimeError: CUDA error: device-side assert triggered
```
You should try to run the code on a CPU first to get a more descriptive error message. Add the following environment variable to the beginning of your code to switch to a CPU:
```python
>>> import os
>>> os.environ["CUDA_VISIBLE_DEVICES"] = ""
```
Another option is to get a better traceback from the GPU. Add the following environment variable to the beginning of your code to get the traceback to point to the source of the error:
```python
>>> import os
>>> os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```
## Incorrect output when padding tokens aren't masked
In some cases, the output `hidden_state` may be incorrect if the `input_ids` include padding tokens. To demonstrate this, load a model and tokenizer. You can access a model's `pad_token_id` to see its value. The `pad_token_id` may be `None` for some models, but you can always manually set it.
```python
>>> from transformers import AutoModelForSequenceClassification
>>> import torch
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-uncased")
>>> model.config.pad_token_id
0
```
The following example shows the output without masking the padding tokens:
```python
>>> input_ids = torch.tensor([[7592, 2057, 2097, 2393, 9611, 2115], [7592, 0, 0, 0, 0, 0]])
>>> output = model(input_ids)
>>> print(output.logits)
tensor([[ 0.0082, -0.2307],
[ 0.1317, -0.1683]], grad_fn=<AddmmBackward0>)
```
Here is the actual output of the second sequence:
```python
>>> input_ids = torch.tensor([[7592]])
>>> output = model(input_ids)
>>> print(output.logits)
tensor([[-0.1008, -0.4061]], grad_fn=<AddmmBackward0>)
```
Most of the time, you should provide an `attention_mask` to your model to ignore the padding tokens and avoid this silent error. Now the output of the second sequence matches its actual output:
<Tip>
By default, the tokenizer creates an `attention_mask` for you based on your specific tokenizer's defaults.
</Tip>
```python
>>> attention_mask = torch.tensor([[1, 1, 1, 1, 1, 1], [1, 0, 0, 0, 0, 0]])
>>> output = model(input_ids, attention_mask=attention_mask)
>>> print(output.logits)
tensor([[ 0.0082, -0.2307],
[-0.1008, -0.4061]], grad_fn=<AddmmBackward0>)
```
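In practice — a sketch reusing the classifier loaded above — you can let the tokenizer pad the batch and build the `attention_mask` itself:
```python
from transformers import AutoTokenizer

# Sketch: the tokenizer pads the batch and builds the attention_mask for you.
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
inputs = tokenizer(["hello world example", "hello"], padding=True, return_tensors="pt")
output = model(**inputs)  # `model` is the classifier loaded above
print(output.logits)
```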
🤗 Transformers doesn't automatically create an `attention_mask` to mask a padding token if it is provided because:
- Some models don't have a padding token.
- For some use-cases, users want a model to attend to a padding token.
## ValueError: Unrecognized configuration class XYZ for this kind of AutoModel
Generally, we recommend using the [`AutoModel`] class to load pretrained instances of models. This class can automatically infer and load the correct architecture from a given checkpoint based on the configuration. If you see this `ValueError` when loading a model from a checkpoint, this means the Auto class couldn't find a mapping from the configuration in the given checkpoint to the kind of model you are trying to load. Most commonly, this happens when a checkpoint doesn't support a given task.
For instance, you'll see this error in the following example because there is no GPT2 for question answering:
```py
>>> from transformers import AutoProcessor, AutoModelForQuestionAnswering
>>> processor = AutoProcessor.from_pretrained("openai-community/gpt2-medium")
>>> model = AutoModelForQuestionAnswering.from_pretrained("openai-community/gpt2-medium")
ValueError: Unrecognized configuration class <class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'> for this kind of AutoModel: AutoModelForQuestionAnswering.
Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig, BigBirdPegasusConfig, BloomConfig, ...
```
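If you run into this error, pick an auto class for a task the checkpoint does support. GPT2 is a causal language model, so, as a sketch, [`AutoModelForCausalLM`] loads it without complaint:
```py
>>> from transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2-medium")
```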

View File

@ -112,7 +112,7 @@ Before you write any code, we strongly recommend searching through the exist
You need basic `git` skills to contribute to 🤗 Transformers. `git` is not the easiest tool to use, but it has a very good manual. Type `git --help` in a shell and enjoy! If you prefer books, [Pro Git](https://git-scm.com/book/en/v2) is a very good reference.
You need **[Python 3.8](https://github.com/huggingface/transformers/blob/main/setup.py#L426)** or higher to contribute to 🤗 Transformers. Follow the steps below to start contributing:
You need **[Python 3.9](https://github.com/huggingface/transformers/blob/main/setup.py#L426)** or higher to contribute to 🤗 Transformers. Follow the steps below to start contributing:
1. Fork the [repository](https://github.com/huggingface/transformers) by clicking the **[Fork](https://github.com/huggingface/transformers/fork)** button on the repository's page. This creates a copy of the code under your GitHub account.

View File

@ -149,7 +149,7 @@ conda install conda-forge::transformers
Pretrained models are downloaded and locally cached at: `~/.cache/huggingface/hub`. This is the default directory given by the shell environment variable `TRANSFORMERS_CACHE`. On Windows, the default directory is `C:\Users\username\.cache\huggingface\hub`. You can change the shell environment variables listed below, in order of priority, to specify a different cache directory:
1. Shell environment variable (default): `HUGGINGFACE_HUB_CACHE` or `TRANSFORMERS_CACHE`.
1. Shell environment variable (default): `HF_HUB_CACHE` or `TRANSFORMERS_CACHE`.
2. Shell environment variable: `HF_HOME`.
3. Shell environment variable: `XDG_CACHE_HOME` + `/huggingface`.

View File

@ -43,7 +43,7 @@ Consequently, you can load a specific model version with the "revision" parameter
```py
>>> model = AutoModel.from_pretrained(
... "julien-c/EsperBERTo-small", revision="v2.0.1" # tag name, or branch name, or commit hash
... "julien-c/EsperBERTo-small", revision="4c77982" # tag name, or branch name, or commit hash
... )
```

View File

@ -109,7 +109,7 @@ label: NEGATIVE, with score: 0.5309
The [`pipeline`] can also iterate over an entire dataset. Start by installing the [🤗 Datasets](https://huggingface.co/docs/datasets/) library:
```bash
pip install datasets
```
Create a [`pipeline`] with the task you want to solve and the model you want to use.
@ -191,7 +191,7 @@ If you can't find a model for your use case, you'll need to fine-tune a pretrained
<Youtube id="AhChOFRegn4"/>
Under the hood, the [`AutoModelForSequenceClassification`] and [`AutoTokenizer`] classes work together to power the [`pipeline`]. An [`AutoClass`](./model_doc/auto) is a shortcut that automatically retrieves the architecture of a trained model from its name or path. You only need to select the appropriate `AutoClass` for your task and its associated tokenizer with [`AutoTokenizer`].
Let's return to our example and see how you can use the `AutoClass` to replicate the results of the [`pipeline`].
@ -281,7 +281,7 @@ Now you can pass your preprocessed batch of inputs directly to the model
```
The model outputs the final activations in the `logits` attribute. Apply the softmax function to the `logits` to retrieve the probabilities:
```py
>>> from torch import nn
@ -308,7 +308,7 @@ The [task summary](./task_summary) describes which [AutoModel] class
</Tip>
Now you can pass your preprocessed batch of inputs directly to the model by passing the dictionary keys directly to the tensors:
```py
>>> tf_outputs = tf_model(tf_batch)
```
@ -383,8 +383,8 @@ A particularly cool 🤗 Transformers feature is the ability to save a model
```py
>>> from transformers import AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
```
</pt>
<tf>
@ -392,8 +392,8 @@ A particularly cool 🤗 Transformers feature is the ability to save a model
```py
>>> from transformers import TFAutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
```
</tf>
</frameworkcontent>

View File

@ -11,4 +11,4 @@ black_avoid_patterns = {
"{processor_class}": "FakeProcessorClass",
"{model_class}": "FakeModelClass",
"{object_class}": "FakeObjectClass",
}

View File

@ -110,6 +110,17 @@
- local: kv_cache
title: Best Practices for Generation with Cache
title: Generation
- isExpanded: false
sections:
- local: chat_template_basics
title: Getting Started with Chat Templates for Text LLMs
- local: chat_template_multimodal
title: Multimodal Chat Templates for Vision and Audio LLMs
- local: chat_template_tools_and_documents
title: Expanding Chat Templates with Tools and Documents
- local: chat_template_advanced
title: Advanced Usage and Customizing Your Chat Templates
title: Chat Templates
- isExpanded: false
sections:
- local: tasks/idefics
@ -127,8 +138,6 @@
title: Use model-specific APIs
- local: custom_models
title: Share a custom model
- local: chat_templating
title: Chat templates
- local: trainer
title: Trainer
- local: sagemaker
@ -139,8 +148,6 @@
title: Export to TFLite
- local: torchscript
title: Export to TorchScript
- local: benchmarks
title: Benchmarks
- local: notebooks
title: Notebooks with examples
- local: community
@ -167,10 +174,16 @@
title: AWQ
- local: quantization/aqlm
title: AQLM
- local: quantization/vptq
title: VPTQ
- local: quantization/spqr
title: SpQR
- local: quantization/quanto
title: Quanto
- local: quantization/eetq
title: EETQ
- local: quantization/higgs
title: HIGGS
- local: quantization/hqq
title: HQQ
- local: quantization/fbgemm_fp8
@ -183,6 +196,8 @@
title: BitNet
- local: quantization/compressed_tensors
title: compressed-tensors
- local: quantization/finegrained_fp8
title: Fine-grained FP8
- local: quantization/contribute
title: Contribute new quantization method
title: Quantization Methods
@ -218,6 +233,8 @@
title: CPU inference
- local: perf_infer_gpu_one
title: GPU inference
- local: perf_infer_gpu_multi
title: Multi-GPU inference
title: Optimizing inference
- local: big_models
title: Instantiate a big model
@ -320,6 +337,8 @@
sections:
- local: model_doc/albert
title: ALBERT
- local: model_doc/bamba
title: Bamba
- local: model_doc/bart
title: BART
- local: model_doc/barthez
@ -360,6 +379,8 @@
title: CodeLlama
- local: model_doc/cohere
title: Cohere
- local: model_doc/cohere2
title: Cohere2
- local: model_doc/convbert
title: ConvBERT
- local: model_doc/cpm
@ -376,6 +397,8 @@
title: DeBERTa-v2
- local: model_doc/dialogpt
title: DialoGPT
- local: model_doc/diffllama
title: DiffLlama
- local: model_doc/distilbert
title: DistilBERT
- local: model_doc/dpr
@ -392,10 +415,10 @@
title: ESM
- local: model_doc/falcon
title: Falcon
- local: model_doc/falcon3
title: Falcon3
- local: model_doc/falcon_mamba
title: FalconMamba
- local: model_doc/fastspeech2_conformer
title: FastSpeech2Conformer
- local: model_doc/flan-t5
title: FLAN-T5
- local: model_doc/flan-ul2
@ -438,6 +461,10 @@
title: Granite
- local: model_doc/granitemoe
title: GraniteMoe
- local: model_doc/granitevision
title: GraniteVision
- local: model_doc/helium
title: Helium
- local: model_doc/herbert
title: HerBERT
- local: model_doc/ibert
@ -490,6 +517,8 @@
title: mLUKE
- local: model_doc/mobilebert
title: MobileBERT
- local: model_doc/modernbert
title: ModernBert
- local: model_doc/mpnet
title: MPNet
- local: model_doc/mpt
@ -514,6 +543,8 @@
title: Nyströmformer
- local: model_doc/olmo
title: OLMo
- local: model_doc/olmo2
title: OLMo2
- local: model_doc/olmoe
title: OLMoE
- local: model_doc/open-llama
@ -606,6 +637,10 @@
title: XLNet
- local: model_doc/yoso
title: YOSO
- local: model_doc/zamba
title: Zamba
- local: model_doc/zamba2
title: Zamba2
title: Text models
- isExpanded: false
sections:
@ -621,6 +656,8 @@
title: ConvNeXTV2
- local: model_doc/cvt
title: CvT
- local: model_doc/dab-detr
title: DAB-DETR
- local: model_doc/deformable_detr
title: Deformable DETR
- local: model_doc/deit
@ -629,6 +666,8 @@
title: Depth Anything
- local: model_doc/depth_anything_v2
title: Depth Anything V2
- local: model_doc/depth_pro
title: DepthPro
- local: model_doc/deta
title: DETA
- local: model_doc/detr
@ -637,6 +676,8 @@
title: DiNAT
- local: model_doc/dinov2
title: DINOV2
- local: model_doc/dinov2_with_registers
title: DINOv2 with Registers
- local: model_doc/dit
title: DiT
- local: model_doc/dpt
@ -651,6 +692,8 @@
title: GLPN
- local: model_doc/hiera
title: Hiera
- local: model_doc/ijepa
title: I-JEPA
- local: model_doc/imagegpt
title: ImageGPT
- local: model_doc/levit
@ -681,10 +724,14 @@
title: ResNet
- local: model_doc/rt_detr
title: RT-DETR
- local: model_doc/rt_detr_v2
title: RT-DETRv2
- local: model_doc/segformer
title: SegFormer
- local: model_doc/seggpt
title: SegGpt
- local: model_doc/superglue
title: SuperGlue
- local: model_doc/superpoint
title: SuperPoint
- local: model_doc/swiftformer
@ -697,6 +744,10 @@
title: Swin2SR
- local: model_doc/table-transformer
title: Table Transformer
- local: model_doc/textnet
title: TextNet
- local: model_doc/timm_wrapper
title: Timm Wrapper
- local: model_doc/upernet
title: UperNet
- local: model_doc/van
@ -713,10 +764,10 @@
title: ViTMatte
- local: model_doc/vit_msn
title: ViTMSN
- local: model_doc/vitpose
title: ViTPose
- local: model_doc/yolos
title: YOLOS
- local: model_doc/zamba
title: Zamba
- local: model_doc/zoedepth
title: ZoeDepth
title: Vision models
@ -732,8 +783,8 @@
title: dac
- local: model_doc/encodec
title: EnCodec
- local: model_doc/hiera
title: Hiera
- local: model_doc/fastspeech2_conformer
title: FastSpeech2Conformer
- local: model_doc/hubert
title: Hubert
- local: model_doc/mctct
@ -742,6 +793,8 @@
title: Mimi
- local: model_doc/mms
title: MMS
- local: model_doc/moonshine
title: Moonshine
- local: model_doc/moshi
title: Moshi
- local: model_doc/musicgen
@ -804,6 +857,8 @@
title: ALIGN
- local: model_doc/altclip
title: AltCLIP
- local: model_doc/aria
title: Aria
- local: model_doc/blip
title: BLIP
- local: model_doc/blip-2
@ -822,16 +877,22 @@
title: CLIPSeg
- local: model_doc/clvp
title: CLVP
- local: model_doc/colpali
title: ColPali
- local: model_doc/data2vec
title: Data2Vec
- local: model_doc/deplot
title: DePlot
- local: model_doc/donut
title: Donut
- local: model_doc/emu3
title: Emu3
- local: model_doc/flava
title: FLAVA
- local: model_doc/git
title: GIT
- local: model_doc/got_ocr2
title: GOT-OCR2
- local: model_doc/grounding-dino
title: Grounding DINO
- local: model_doc/groupvit
@ -892,6 +953,8 @@
title: Pix2Struct
- local: model_doc/pixtral
title: Pixtral
- local: model_doc/qwen2_5_vl
title: Qwen2.5-VL
- local: model_doc/qwen2_audio
title: Qwen2Audio
- local: model_doc/qwen2_vl
@ -973,4 +1036,4 @@
- local: internal/time_series_utils
title: Utilities for Time Series
title: Internal Helpers
title: API

View File

@ -184,7 +184,7 @@ class PairClassificationPipeline(Pipeline):
```
The implementation is framework agnostic, and will work for PyTorch and TensorFlow models. If we have saved this in
a file named `pair_classification.py`, we can then import it and register it like this:
a file named `pair_classification.py`, we can then import it and register it like this.
```py
from pair_classification import PairClassificationPipeline
@ -199,6 +199,22 @@ PIPELINE_REGISTRY.register_pipeline(
)
```
The [register_pipeline](https://github.com/huggingface/transformers/blob/9feae5fb0164e89d4998e5776897c16f7330d3df/src/transformers/pipelines/base.py#L1387) function registers the pipeline details (task type, pipeline class, supported backends) to a model's `config.json` file.
```json
"custom_pipelines": {
"pair-classification": {
"impl": "pair_classification.PairClassificationPipeline",
"pt": [
"AutoModelForSequenceClassification"
],
"tf": [
"TFAutoModelForSequenceClassification"
],
}
},
```
Once this is done, we can use it with a pretrained model. For instance `sgugger/finetuned-bert-mrpc` has been
fine-tuned on the MRPC dataset, which classifies pairs of sentences as paraphrases or not.
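As a sketch (assuming the registration above has run in the current session, and using the `second_text` argument from the example pipeline's preprocess signature), the registered task name can then be passed straight to `pipeline`:
```py
from transformers import pipeline
classifier = pipeline("pair-classification", model="sgugger/finetuned-bert-mrpc")
classifier("I hate you", second_text="I really hate you")
```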

View File

@ -225,7 +225,7 @@ You have access to the following tools:
To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.
At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task, then the tools that you want to use.
Then in the 'Code:' sequence, you shold write the code in simple Python. The code sequence must end with '/End code' sequence.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '/End code' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need.
These print outputs will then be available in the 'Observation:' field, for using this information as input for the next step.

View File

@ -66,10 +66,10 @@ manager_agent.run("Who is the CEO of Hugging Face?")
Let's take again the tool example from main documentation, for which we had implemented a `tool` decorator.
If you need to add variation, like custom attributes for your too, you can build your tool following the fine-grained method: building a class that inherits from the [`Tool`] superclass.
If you need to add variation, like custom attributes for your tool, you can build your tool following the fine-grained method: building a class that inherits from the [`Tool`] superclass.
The custom tool needs:
- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name is `model_download_counter`.
- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
- An attribute `description` is used to populate the agent's system prompt.
- An `inputs` attribute, which is a dictionary with keys `"type"` and `"description"`. It contains information that helps the Python interpreter make educated choices about the input.
- An `output_type` attribute, which specifies the output type.
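Putting those pieces together, a minimal sketch of such a class could look like this (the Hub query via `huggingface_hub.list_models` mirrors the earlier example; treat the exact strings as illustrative):
```python
from huggingface_hub import list_models
from transformers import Tool

class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = "Returns the most downloaded model checkpoint for a given task on the Hugging Face Hub."
    inputs = {
        "task": {
            "type": "string",
            "description": "the task category (such as text-classification)",
        }
    }
    output_type = "string"

    def forward(self, task: str):
        # Sort Hub models for the task by downloads and return the top checkpoint id
        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id
```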
@ -123,6 +123,54 @@ from transformers import load_tool, CodeAgent
model_download_tool = load_tool("m-ric/hf-model-downloads")
```
### Import a Space as a tool 🚀
You can directly import a Space from the Hub as a tool using the [`Tool.from_space`] method!
You only need to provide the id of the Space on the Hub, its name, and a description that will help your agent understand what the tool does. Under the hood, this will use the [`gradio-client`](https://pypi.org/project/gradio-client/) library to call the Space.
For instance, let's import the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) Space from the Hub and use it to generate an image.
```python
from transformers import Tool
image_generation_tool = Tool.from_space(
"black-forest-labs/FLUX.1-dev",
name="image_generator",
description="Generate an image from a prompt")
image_generation_tool("A sunny beach")
```
And voilà, here's your image! 🏖️
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp">
Then you can use this tool just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit` and generate an image of it.
```python
from transformers import ReactCodeAgent
agent = ReactCodeAgent(tools=[image_generation_tool])
agent.run(
"Improve this prompt, then generate an image of it.", prompt='A rabbit wearing a space suit'
)
```
```text
=== Agent thoughts:
improved_prompt could be "A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background"
Now that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.
=== Agent is executing the code below:
image = image_generator(prompt="A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background")
final_answer(image)
```
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp">
How cool is this? 🤩
### Use gradio-tools
[gradio-tools](https://github.com/freddyaboulton/gradio-tools) is a powerful library that allows using Hugging
@ -140,36 +188,6 @@ gradio_prompt_generator_tool = StableDiffusionPromptGeneratorTool()
prompt_generator_tool = Tool.from_gradio(gradio_prompt_generator_tool)
```
Now you can use it just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit`.
```python
image_generation_tool = load_tool('huggingface-tools/text-to-image')
agent = CodeAgent(tools=[prompt_generator_tool, image_generation_tool], llm_engine=llm_engine)
agent.run(
"Improve this prompt, then generate an image of it.", prompt='A rabbit wearing a space suit'
)
```
The model adequately leverages the tool:
```text
======== New task ========
Improve this prompt, then generate an image of it.
You have been provided with these initial arguments: {'prompt': 'A rabbit wearing a space suit'}.
==== Agent is executing the code below:
improved_prompt = StableDiffusionPromptGenerator(query=prompt)
while improved_prompt == "QUEUE_FULL":
improved_prompt = StableDiffusionPromptGenerator(query=prompt)
print(f"The improved prompt is {improved_prompt}.")
image = image_generator(prompt=improved_prompt)
====
```
Before finally generating the image:
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png">
> [!WARNING]
> gradio-tools require *textual* inputs and outputs even when working with different modalities like image and audio objects. Image and audio inputs and outputs are currently incompatible.
@ -179,7 +197,7 @@ We love Langchain and think it has a very compelling suite of tools.
To import a tool from LangChain, use the `from_langchain()` method.
Here is how you can use it to recreate the intro's search result using a LangChain web search tool.
This tool will need `pip install google-search-results` to work properly.
```python
from langchain.agents import load_tools
from transformers import Tool, ReactCodeAgent
@ -188,12 +206,12 @@ search_tool = Tool.from_langchain(load_tools(["serpapi"])[0])
agent = ReactCodeAgent(tools=[search_tool])
agent.run("How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?")
agent.run("How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?")
```
## Display your agent run in a cool Gradio interface
You can leverage `gradio.Chatbot`to display your agent's thoughts using `stream_to_gradio`, here is an example:
You can leverage `gradio.Chatbot` to display your agent's thoughts using `stream_to_gradio`, here is an example:
```py
import gradio as gr
@ -240,4 +258,4 @@ with gr.Blocks() as demo:
if __name__ == "__main__":
demo.launch()
```

View File

@ -138,12 +138,15 @@ Load a processor with [`AutoProcessor.from_pretrained`]:
<frameworkcontent>
<pt>
The `AutoModelFor` classes let you load a pretrained model for a given task (see [here](model_doc/auto) for a complete list of available tasks). For example, load a model for sequence classification with [`AutoModelForSequenceClassification.from_pretrained`]:
The `AutoModelFor` classes let you load a pretrained model for a given task (see [here](model_doc/auto) for a complete list of available tasks). For example, load a model for sequence classification with [`AutoModelForSequenceClassification.from_pretrained`].
> [!WARNING]
> By default, the weights are loaded in full precision (torch.float32) regardless of the actual data type the weights are stored in such as torch.float16. Set `torch_dtype="auto"` to load the weights in the data type defined in a model's `config.json` file to automatically load the most memory-optimal data type.
```py
>>> from transformers import AutoModelForSequenceClassification
>>> model = AutoModelForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased")
>>> model = AutoModelForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased", torch_dtype="auto")
```
Easily reuse the same checkpoint to load an architecture for a different task:
@ -151,7 +154,7 @@ Easily reuse the same checkpoint to load an architecture for a different task:
```py
>>> from transformers import AutoModelForTokenClassification
>>> model = AutoModelForTokenClassification.from_pretrained("distilbert/distilbert-base-uncased")
>>> model = AutoModelForTokenClassification.from_pretrained("distilbert/distilbert-base-uncased", torch_dtype="auto")
```
<Tip warning={true}>

View File

@ -1,387 +0,0 @@
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Benchmarks
<Tip warning={true}>
Hugging Face's Benchmarking tools are deprecated and it is advised to use external Benchmarking libraries to measure the speed
and memory complexity of Transformer models.
</Tip>
[[open-in-colab]]
Let's take a look at how 🤗 Transformers models can be benchmarked, along with best practices and already available benchmarks.
A notebook explaining in more detail how to benchmark 🤗 Transformers models can be found [here](https://github.com/huggingface/notebooks/tree/main/examples/benchmark.ipynb).
## How to benchmark 🤗 Transformers models
The [`PyTorchBenchmark`] and [`TensorFlowBenchmark`] classes let you flexibly benchmark 🤗 Transformers models, measuring the _peak memory usage_ and _required time_ for both _inference_ and _training_.
<Tip>
Here, _inference_ is defined by a single forward pass, and _training_ is defined by a single forward pass and
backward pass.
</Tip>
The benchmark classes [`PyTorchBenchmark`] and [`TensorFlowBenchmark`] expect an object of type [`PyTorchBenchmarkArguments`] and
[`TensorFlowBenchmarkArguments`], respectively, for instantiation. [`PyTorchBenchmarkArguments`] and [`TensorFlowBenchmarkArguments`] are data classes containing all relevant configurations for their corresponding benchmark class. The following example shows how a BERT model of type _bert-base-uncased_ can be benchmarked.
<frameworkcontent>
<pt>
```py
>>> from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments
>>> args = PyTorchBenchmarkArguments(models=["google-bert/bert-base-uncased"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512])
>>> benchmark = PyTorchBenchmark(args)
```
</pt>
<tf>
```py
>>> from transformers import TensorFlowBenchmark, TensorFlowBenchmarkArguments
>>> args = TensorFlowBenchmarkArguments(
... models=["google-bert/bert-base-uncased"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512]
... )
>>> benchmark = TensorFlowBenchmark(args)
```
</tf>
</frameworkcontent>
Here, three arguments are given to the benchmark argument data classes, namely `models`, `batch_sizes`, and
`sequence_lengths`. The argument `models` is required and expects a `list` of model identifiers from the
[model hub](https://huggingface.co/models). The `list` arguments `batch_sizes` and `sequence_lengths` define
the size of the `input_ids` on which the model is benchmarked. Many more parameters can be configured
via the benchmark argument data classes. For more detail on these, you can either directly consult the files
`src/transformers/benchmark/benchmark_args_utils.py`, `src/transformers/benchmark/benchmark_args.py` (for PyTorch)
and `src/transformers/benchmark/benchmark_args_tf.py` (for TensorFlow), or run the following shell
commands from the repository root to print out a descriptive list of all configurable parameters for PyTorch and TensorFlow
respectively.
<frameworkcontent>
<pt>
```bash
python examples/pytorch/benchmarking/run_benchmark.py --help
```
An instantiated benchmark object can then simply be run by calling `benchmark.run()`.
```py
>>> results = benchmark.run()
>>> print(results)
==================== INFERENCE - SPEED - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Time in s
--------------------------------------------------------------------------------
google-bert/bert-base-uncased 8 8 0.006
google-bert/bert-base-uncased 8 32 0.006
google-bert/bert-base-uncased 8 128 0.018
google-bert/bert-base-uncased 8 512 0.088
--------------------------------------------------------------------------------
==================== INFERENCE - MEMORY - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Memory in MB
--------------------------------------------------------------------------------
google-bert/bert-base-uncased 8 8 1227
google-bert/bert-base-uncased 8 32 1281
google-bert/bert-base-uncased 8 128 1307
google-bert/bert-base-uncased 8 512 1539
--------------------------------------------------------------------------------
==================== ENVIRONMENT INFORMATION ====================
- transformers_version: 2.11.0
- framework: PyTorch
- use_torchscript: False
- framework_version: 1.4.0
- python_version: 3.6.10
- system: Linux
- cpu: x86_64
- architecture: 64bit
- date: 2020-06-29
- time: 08:58:43.371351
- fp16: False
- use_multiprocessing: True
- only_pretrain_model: False
- cpu_ram_mb: 32088
- use_gpu: True
- num_gpus: 1
- gpu: TITAN RTX
- gpu_ram_mb: 24217
- gpu_power_watts: 280.0
- gpu_performance_state: 2
- use_tpu: False
```
</pt>
<tf>
```bash
python examples/tensorflow/benchmarking/run_benchmark_tf.py --help
```
An instantiated benchmark object can then simply be run by calling `benchmark.run()`.
```py
>>> results = benchmark.run()
>>> print(results)
==================== INFERENCE - SPEED - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Time in s
--------------------------------------------------------------------------------
google-bert/bert-base-uncased 8 8 0.005
google-bert/bert-base-uncased 8 32 0.008
google-bert/bert-base-uncased 8 128 0.022
google-bert/bert-base-uncased 8 512 0.105
--------------------------------------------------------------------------------
==================== INFERENCE - MEMORY - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Memory in MB
--------------------------------------------------------------------------------
google-bert/bert-base-uncased 8 8 1330
google-bert/bert-base-uncased 8 32 1330
google-bert/bert-base-uncased 8 128 1330
google-bert/bert-base-uncased 8 512 1770
--------------------------------------------------------------------------------
==================== ENVIRONMENT INFORMATION ====================
- transformers_version: 2.11.0
- framework: Tensorflow
- use_xla: False
- framework_version: 2.2.0
- python_version: 3.6.10
- system: Linux
- cpu: x86_64
- architecture: 64bit
- date: 2020-06-29
- time: 09:26:35.617317
- fp16: False
- use_multiprocessing: True
- only_pretrain_model: False
- cpu_ram_mb: 32088
- use_gpu: True
- num_gpus: 1
- gpu: TITAN RTX
- gpu_ram_mb: 24217
- gpu_power_watts: 280.0
- gpu_performance_state: 2
- use_tpu: False
```
</tf>
</frameworkcontent>
By default, the _time_ and the _required memory_ for _inference_ are benchmarked. In the example output above the first
two sections show the result corresponding to _inference time_ and _inference memory_. In addition, all relevant
information about the computing environment, _e.g._ the GPU type, the system, the library versions, etc... are printed
out in the third section under _ENVIRONMENT INFORMATION_. This information can optionally be saved in a _.csv_ file
when adding the argument `save_to_csv=True` to [`PyTorchBenchmarkArguments`] and
[`TensorFlowBenchmarkArguments`] respectively. In this case, every section is saved in a separate
_.csv_ file. The path to each _.csv_ file can optionally be defined via the argument data classes.
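For example, a minimal sketch that writes the inference results to CSV, assuming the file-name arguments exposed by the argument data classes (`inference_time_csv_file`, `inference_memory_csv_file`):
```py
>>> from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments
>>> args = PyTorchBenchmarkArguments(
...     models=["google-bert/bert-base-uncased"],
...     batch_sizes=[8],
...     sequence_lengths=[8, 32],
...     save_to_csv=True,
...     inference_time_csv_file="inference_time.csv",
...     inference_memory_csv_file="inference_memory.csv",
... )
>>> results = PyTorchBenchmark(args).run()
```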
Instead of benchmarking pre-trained models via their model identifier, _e.g._ `google-bert/bert-base-uncased`, the user can
alternatively benchmark an arbitrary configuration of any available model class. In this case, a `list` of
configurations must be inserted with the benchmark args as follows.
<frameworkcontent>
<pt>
```py
>>> from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments, BertConfig
>>> args = PyTorchBenchmarkArguments(
... models=["bert-base", "bert-384-hid", "bert-6-lay"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512]
... )
>>> config_base = BertConfig()
>>> config_384_hid = BertConfig(hidden_size=384)
>>> config_6_lay = BertConfig(num_hidden_layers=6)
>>> benchmark = PyTorchBenchmark(args, configs=[config_base, config_384_hid, config_6_lay])
>>> benchmark.run()
==================== INFERENCE - SPEED - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Time in s
--------------------------------------------------------------------------------
bert-base 8 8 0.006
bert-base 8 32 0.006
bert-base 8 128 0.018
bert-base 8 512 0.088
bert-384-hid 8 8 0.006
bert-384-hid 8 32 0.006
bert-384-hid 8 128 0.011
bert-384-hid 8 512 0.054
bert-6-lay 8 8 0.003
bert-6-lay 8 32 0.004
bert-6-lay 8 128 0.009
bert-6-lay 8 512 0.044
--------------------------------------------------------------------------------
==================== INFERENCE - MEMORY - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Memory in MB
--------------------------------------------------------------------------------
bert-base 8 8 1277
bert-base 8 32 1281
bert-base 8 128 1307
bert-base 8 512 1539
bert-384-hid 8 8 1005
bert-384-hid 8 32 1027
bert-384-hid 8 128 1035
bert-384-hid 8 512 1255
bert-6-lay 8 8 1097
bert-6-lay 8 32 1101
bert-6-lay 8 128 1127
bert-6-lay 8 512 1359
--------------------------------------------------------------------------------
==================== ENVIRONMENT INFORMATION ====================
- transformers_version: 2.11.0
- framework: PyTorch
- use_torchscript: False
- framework_version: 1.4.0
- python_version: 3.6.10
- system: Linux
- cpu: x86_64
- architecture: 64bit
- date: 2020-06-29
- time: 09:35:25.143267
- fp16: False
- use_multiprocessing: True
- only_pretrain_model: False
- cpu_ram_mb: 32088
- use_gpu: True
- num_gpus: 1
- gpu: TITAN RTX
- gpu_ram_mb: 24217
- gpu_power_watts: 280.0
- gpu_performance_state: 2
- use_tpu: False
```
</pt>
<tf>
```py
>>> from transformers import TensorFlowBenchmark, TensorFlowBenchmarkArguments, BertConfig
>>> args = TensorFlowBenchmarkArguments(
... models=["bert-base", "bert-384-hid", "bert-6-lay"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512]
... )
>>> config_base = BertConfig()
>>> config_384_hid = BertConfig(hidden_size=384)
>>> config_6_lay = BertConfig(num_hidden_layers=6)
>>> benchmark = TensorFlowBenchmark(args, configs=[config_base, config_384_hid, config_6_lay])
>>> benchmark.run()
==================== INFERENCE - SPEED - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Time in s
--------------------------------------------------------------------------------
bert-base 8 8 0.005
bert-base 8 32 0.008
bert-base 8 128 0.022
bert-base 8 512 0.106
bert-384-hid 8 8 0.005
bert-384-hid 8 32 0.007
bert-384-hid 8 128 0.018
bert-384-hid 8 512 0.064
bert-6-lay 8 8 0.002
bert-6-lay 8 32 0.003
bert-6-lay 8 128 0.011
bert-6-lay 8 512 0.074
--------------------------------------------------------------------------------
==================== INFERENCE - MEMORY - RESULT ====================
--------------------------------------------------------------------------------
Model Name Batch Size Seq Length Memory in MB
--------------------------------------------------------------------------------
bert-base 8 8 1330
bert-base 8 32 1330
bert-base 8 128 1330
bert-base 8 512 1770
bert-384-hid 8 8 1330
bert-384-hid 8 32 1330
bert-384-hid 8 128 1330
bert-384-hid 8 512 1540
bert-6-lay 8 8 1330
bert-6-lay 8 32 1330
bert-6-lay 8 128 1330
bert-6-lay 8 512 1540
--------------------------------------------------------------------------------
==================== ENVIRONMENT INFORMATION ====================
- transformers_version: 2.11.0
- framework: Tensorflow
- use_xla: False
- framework_version: 2.2.0
- python_version: 3.6.10
- system: Linux
- cpu: x86_64
- architecture: 64bit
- date: 2020-06-29
- time: 09:38:15.487125
- fp16: False
- use_multiprocessing: True
- only_pretrain_model: False
- cpu_ram_mb: 32088
- use_gpu: True
- num_gpus: 1
- gpu: TITAN RTX
- gpu_ram_mb: 24217
- gpu_power_watts: 280.0
- gpu_performance_state: 2
- use_tpu: False
```
</tf>
</frameworkcontent>
Again, _inference time_ and _required memory_ for _inference_ are measured, but this time for customized configurations
of the `BertModel` class. This feature can especially be helpful when deciding for which configuration the model
should be trained.
## Benchmark best practices
This section lists a couple of best practices one should be aware of when benchmarking a model.
- Currently, only single device benchmarking is supported. When benchmarking on GPU, it is recommended that the user
specifies on which device the code should be run by setting the `CUDA_VISIBLE_DEVICES` environment variable in the
shell, _e.g._ `export CUDA_VISIBLE_DEVICES=0` before running the code.
- The option `no_multi_processing` should only be set to `True` for testing and debugging. To ensure accurate
memory measurement, it is recommended to run each memory benchmark in a separate process by making sure
`no_multi_processing` is set to `False`.
- One should always state the environment information when sharing the results of a model benchmark. Results can vary
heavily between different GPU devices, library versions, etc.; as a consequence, benchmark results on their own are not
very useful for the community.
## Sharing your benchmark
Previously, all available core models (10 at the time) were benchmarked for _inference time_, across many different
settings: using PyTorch, with and without TorchScript, using TensorFlow, with and without XLA. All of those tests were
done across CPUs (except for TensorFlow XLA) and GPUs.
The approach is detailed in the [following blogpost](https://medium.com/huggingface/benchmarking-transformers-pytorch-and-tensorflow-e2917fb891c2) and the results are
available [here](https://docs.google.com/spreadsheets/d/1sryqufw2D0XlUH4sq3e9Wnxu5EAQkaohzrJbd5HdQ_w/edit?usp=sharing).
With the new _benchmark_ tools, it is easier than ever to share your benchmark results with the community:
- [PyTorch Benchmarking Results](https://github.com/huggingface/transformers/tree/main/examples/pytorch/benchmarking/README.md).
- [TensorFlow Benchmarking Results](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/benchmarking/README.md).

View File

@ -0,0 +1,463 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Advanced Usage and Customizing Your Chat Templates
In this page, we'll explore more advanced techniques for working with chat templates in Transformers. Whether you're looking to write your own templates, create custom components, or optimize your templates for efficiency, we'll cover everything you need to take your templates to the next level. Let's dive into the tools and strategies that will help you get the most out of your chat models.
## How do chat templates work?
The chat template for a model is stored on the `tokenizer.chat_template` attribute. Let's take a look at a `Zephyr` chat template, though note this
one is a little simplified from the actual one!
```
{%- for message in messages %}
{{- '<|' + message['role'] + '|>\n' }}
{{- message['content'] + eos_token }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|assistant|>\n' }}
{%- endif %}
```
If you've never seen one of these before, this is a [Jinja template](https://jinja.palletsprojects.com/en/3.1.x/templates/).
Jinja is a templating language that allows you to write simple code that generates text. In many ways, the code and
syntax resembles Python. In pure Python, this template would look something like this:
```python
for message in messages:
print(f'<|{message["role"]}|>')
print(message['content'] + eos_token)
if add_generation_prompt:
print('<|assistant|>')
```
Effectively, the template does three things:
1. For each message, print the role enclosed in `<|` and `|>`, like `<|user|>` or `<|assistant|>`.
2. Next, print the content of the message, followed by the end-of-sequence token.
3. Finally, if `add_generation_prompt` is set, print the assistant token, so that the model knows to start generating
an assistant response.
This is a pretty simple template but Jinja gives you a lot of flexibility to do more complex things! Let's see a Jinja
template that can format inputs similarly to the way LLaMA formats them (note that the real LLaMA template includes
handling for default system messages and slightly different system message handling in general - don't use this one
in your actual code!)
```
{%- for message in messages %}
{%- if message['role'] == 'user' %}
{{- bos_token + '[INST] ' + message['content'] + ' [/INST]' }}
{%- elif message['role'] == 'system' %}
{{- '<<SYS>>\\n' + message['content'] + '\\n<</SYS>>\\n\\n' }}
{%- elif message['role'] == 'assistant' %}
{{- ' ' + message['content'] + ' ' + eos_token }}
{%- endif %}
{%- endfor %}
```
Hopefully if you stare at this for a little bit you can see what this template is doing - it adds specific tokens like
`[INST]` and `[/INST]` based on the role of each message. User, assistant and system messages are clearly
distinguishable to the model because of the tokens they're wrapped in.
## How do I create a chat template?
Simple, just write a Jinja template and set `tokenizer.chat_template`. You may find it easier to start with an
existing template from another model and simply edit it for your needs! For example, we could take the LLaMA template
above and add "[ASST]" and "[/ASST]" to assistant messages:
```
{%- for message in messages %}
{%- if message['role'] == 'user' %}
{{- bos_token + '[INST] ' + message['content'].strip() + ' [/INST]' }}
{%- elif message['role'] == 'system' %}
{{- '<<SYS>>\\n' + message['content'].strip() + '\\n<</SYS>>\\n\\n' }}
{%- elif message['role'] == 'assistant' %}
{{- '[ASST] ' + message['content'] + ' [/ASST]' + eos_token }}
{%- endif %}
{%- endfor %}
```
Now, simply set the `tokenizer.chat_template` attribute. Next time you use [`~PreTrainedTokenizer.apply_chat_template`], it will
use your new template! This attribute will be saved in the `tokenizer_config.json` file, so you can use
[`~utils.PushToHubMixin.push_to_hub`] to upload your new template to the Hub and make sure everyone's using the right
template for your model!
```python
template = tokenizer.chat_template
template = template.replace("SYS", "SYSTEM") # Change the system token
tokenizer.chat_template = template # Set the new template
tokenizer.push_to_hub("model_name") # Upload your new template to the Hub!
```
The method [`~PreTrainedTokenizer.apply_chat_template`] which uses your chat template is called by the [`TextGenerationPipeline`] class, so
once you set the correct chat template, your model will automatically become compatible with [`TextGenerationPipeline`].
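As a quick sanity check, here is how rendering looks with the simplified Zephyr-style template from above (assuming the tokenizer's `eos_token` is `</s>`):
```python
messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Hi there!"},
]
# tokenize=False returns the rendered string; add_generation_prompt=True
# appends the assistant header so the model knows to respond
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
# <|system|>
# You are a friendly chatbot.</s><|user|>
# Hi there!</s><|assistant|>
```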
<Tip>
If you're fine-tuning a model for chat, in addition to setting a chat template, you should probably add any new chat
control tokens as special tokens in the tokenizer. Special tokens are never split,
ensuring that your control tokens are always handled as single tokens rather than being tokenized in pieces. You
should also set the tokenizer's `eos_token` attribute to the token that marks the end of assistant generations in your
template. This will ensure that text generation tools can correctly figure out when to stop generating text.
</Tip>
## Why do some models have multiple templates?
Some models use different templates for different use cases. For example, they might use one template for normal chat
and another for tool-use, or retrieval-augmented generation. In these cases, `tokenizer.chat_template` is a dictionary.
This can cause some confusion, and where possible, we recommend using a single template for all use-cases. You can use
Jinja statements like `if tools is defined` and `{% macro %}` definitions to easily wrap multiple code paths in a
single template.
When a tokenizer has multiple templates, `tokenizer.chat_template` will be a `dict`, where each key is the name
of a template. The `apply_chat_template` method has special handling for certain template names: Specifically, it will
look for a template named `default` in most cases, and will raise an error if it can't find one. However, if a template
named `tool_use` exists when the user has passed a `tools` argument, it will use that instead. To access templates
with other names, pass the name of the template you want to the `chat_template` argument of
`apply_chat_template()`.
We find that this can be a bit confusing for users, though - so if you're writing a template yourself, we recommend
trying to put it all in a single template where possible!
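For example (a sketch; `"rag"` is a hypothetical name here, and must be one of the keys actually present in `tokenizer.chat_template`):
```python
# Explicitly request a named template instead of "default"
prompt = tokenizer.apply_chat_template(
    messages,
    chat_template="rag",  # hypothetical template name for illustration
    tokenize=False,
)
```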
## What template should I use?
When setting the template for a model that's already been trained for chat, you should ensure that the template
exactly matches the message formatting that the model saw during training, or else you will probably experience
performance degradation. This is true even if you're training the model further - you will probably get the best
performance if you keep the chat tokens constant. This is very analogous to tokenization - you generally get the
best performance for inference or fine-tuning when you precisely match the tokenization used during training.
If you're training a model from scratch, or fine-tuning a base language model for chat, on the other hand,
you have a lot of freedom to choose an appropriate template! LLMs are smart enough to learn to handle lots of different
input formats. One popular choice is the `ChatML` format, and this is a good, flexible choice for many use-cases.
It looks like this:
```
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
```
If you like this one, here it is in one-liner form, ready to copy into your code. The one-liner also includes
handy support for [generation prompts](#what-are-generation-prompts), but note that it doesn't add BOS or EOS tokens!
If your model expects those, they won't be added automatically by `apply_chat_template` - in other words, the
text will be tokenized with `add_special_tokens=False`. This is to avoid potential conflicts between the template and
the `add_special_tokens` logic. If your model expects special tokens, make sure to add them to the template!
```python
tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
```
This template wraps each message in `<|im_start|>` and `<|im_end|>` tokens, and simply writes the role as a string, which
allows for flexibility in the roles you train with. The output looks like this:
```text
<|im_start|>system
You are a helpful chatbot that will do its best not to say anything so stupid that people tweet about it.<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
I'm doing great!<|im_end|>
```
The "user", "system" and "assistant" roles are the standard for chat, and we recommend using them when it makes sense,
particularly if you want your model to operate well with [`TextGenerationPipeline`]. However, you are not limited
to these roles - templating is extremely flexible, and any string can be a role.
## I want to add some chat templates! How should I get started?
If you have any chat models, you should set their `tokenizer.chat_template` attribute and test it using
[`~PreTrainedTokenizer.apply_chat_template`], then push the updated tokenizer to the Hub. This applies even if you're
not the model owner - if you're using a model with an empty chat template, or one that's still using the default class
template, please open a [pull request](https://huggingface.co/docs/hub/repositories-pull-requests-discussions) to the model repository so that this attribute can be set properly!
Once the attribute is set, that's it, you're done! `tokenizer.apply_chat_template` will now work correctly for that
model, which means it is also automatically supported in places like `TextGenerationPipeline`!
By ensuring that models have this attribute, we can make sure that the whole community gets to use the full power of
open-source models. Formatting mismatches have been haunting the field and silently harming performance for too long -
it's time to put an end to them!
<Tip>
The easiest way to get started with writing Jinja templates is to take a look at some existing ones. You can use
`print(tokenizer.chat_template)` for any chat model to see what template it's using. In general, models that support tool use have
much more complex templates than other models - so when you're just getting started, they're probably a bad example
to learn from! You can also take a look at the
[Jinja documentation](https://jinja.palletsprojects.com/en/3.1.x/templates/#synopsis) for details
of general Jinja formatting and syntax.
</Tip>
Jinja templates in `transformers` are identical to Jinja templates elsewhere. The main thing to know is that
the conversation history will be accessible inside your template as a variable called `messages`.
You will be able to access `messages` in your template just like you can in Python, which means you can loop over
it with `{% for message in messages %}` or access individual messages with `{{ messages[0] }}`, for example.
You can also use the following tips to write clean, efficient Jinja templates:
### Trimming whitespace
By default, Jinja will print any whitespace that comes before or after a block. This can be a problem for chat
templates, which generally want to be very precise with whitespace! To avoid this, we strongly recommend writing
your templates like this:
```
{%- for message in messages %}
{{- message['role'] + message['content'] }}
{%- endfor %}
```
rather than like this:
```
{% for message in messages %}
{{ message['role'] + message['content'] }}
{% endfor %}
```
Adding `-` will strip any whitespace that comes before the block. The second example looks innocent, but the newline
and indentation may end up being included in the output, which is probably not what you want!
### Special variables
Inside your template, you will have access to several special variables. The most important of these is `messages`,
which contains the chat history as a list of message dicts. However, there are several others. Not every
variable will be used in every template. The most common other variables are:
- `tools` contains a list of tools in JSON schema format. Will be `None` or undefined if no tools are passed.
- `documents` contains a list of documents in the format `{"title": "Title", "contents": "Contents"}`, used for retrieval-augmented generation (see the sketch after this list). Will be `None` or undefined if no documents are passed.
- `add_generation_prompt` is a bool that is `True` if the user has requested a generation prompt, and `False` otherwise. If this is set, your template should add the header for an assistant message to the end of the conversation. If your model doesn't have a specific header for assistant messages, you can ignore this flag.
- **Special tokens** like `bos_token` and `eos_token`. These are extracted from `tokenizer.special_tokens_map`. The exact tokens available inside each template will differ depending on the parent tokenizer.
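Here is a sketch of how the caller populates `documents` (the document contents are made up, and the call only has a visible effect if the template actually reads `documents`):
```python
documents = [
    {"title": "The Moon", "contents": "The Moon is Earth's only natural satellite."},
]
prompt = tokenizer.apply_chat_template(
    messages,
    documents=documents,  # exposed to the template as the `documents` variable
    add_generation_prompt=True,
    tokenize=False,
)
```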
<Tip>
You can actually pass any `kwarg` to `apply_chat_template`, and it will be accessible inside the template as a variable. In general,
we recommend trying to stick to the core variables above, as it will make your model harder to use if users have
to write custom code to pass model-specific `kwargs`. However, we're aware that this field moves quickly, so if you
have a new use-case that doesn't fit in the core API, feel free to use a new `kwarg` for it! If a new `kwarg`
becomes common we may promote it into the core API and create a standard, documented format for it.
</Tip>
### Callable functions
There is also a short list of callable functions available to you inside your templates. These are:
- `raise_exception(msg)`: Raises a `TemplateException`. This is useful for debugging, and for telling users when they're
doing something that your template doesn't support.
- `strftime_now(format_str)`: Equivalent to `datetime.now().strftime(format_str)` in Python. This is used for getting
the current date/time in a specific format, which is sometimes included in system messages (see the example after this list).
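For example, a system-message line that embeds the current date could use it like this (a sketch; the wording around the call is illustrative):
```
{{- "Today's date is " + strftime_now("%d %B %Y") + ".\n" }}
```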
### Compatibility with non-Python Jinja
There are multiple implementations of Jinja in various languages. They generally have the same syntax,
but a key difference is that when you're writing a template in Python you can use Python methods, such as
`.lower()` on strings or `.items()` on dicts. This will break if someone tries to use your template on a non-Python
implementation of Jinja. Non-Python implementations are particularly common in deployment environments, where JS
and Rust are very popular.
Don't panic, though! There are a few easy changes you can make to your templates to ensure they're compatible across
all implementations of Jinja:
- Replace Python methods with Jinja filters. These usually have the same name, for example `string.lower()` becomes
`string|lower`, and `dict.items()` becomes `dict|items`. One notable change is that `string.strip()` becomes `string|trim`.
See the [list of built-in filters](https://jinja.palletsprojects.com/en/3.1.x/templates/#builtin-filters)
in the Jinja documentation for more (a short before/after example follows this list).
- Replace `True`, `False` and `None`, which are Python-specific, with `true`, `false` and `none`.
- Directly rendering a dict or list may give different results in other implementations (for example, string entries
might change from single-quoted to double-quoted). Adding the `tojson` filter can help to ensure consistency here.
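As a short before/after sketch of the filter replacements:
```
{# Python-only; breaks on non-Python Jinja implementations #}
{{- message['content'].strip().lower() }}
{# Portable equivalent using Jinja filters #}
{{- message['content']|trim|lower }}
```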
### Writing generation prompts
We mentioned above that `add_generation_prompt` is a special variable that will be accessible inside your template,
and is controlled by the user setting the `add_generation_prompt` flag. If your model expects a header for
assistant messages, then your template must support adding the header when `add_generation_prompt` is set.
Here is an example of a template that formats messages ChatML-style, with generation prompt support:
```text
{{- bos_token }}
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
```
The exact content of the assistant header will depend on your specific model, but it should always be **the string
that represents the start of an assistant message**, so that if the user applies your template with
`add_generation_prompt=True` and then generates text, the model will write an assistant response. Also note that some
models do not need a generation prompt, because assistant messages always begin immediately after user messages.
This is particularly common for LLaMA and Mistral models, where assistant messages begin immediately after the `[/INST]`
token that ends user messages. In these cases, the template can ignore the `add_generation_prompt` flag.
Generation prompts are important! If your model requires a generation prompt but it is not set in the template, then
model generations will likely be severely degraded, or the model may display unusual behaviour like continuing
the final user message!
### Writing and debugging larger templates
When this feature was introduced, most templates were quite small, the Jinja equivalent of a "one-liner" script.
However, with new models and features like tool-use and RAG, some templates can be 100 lines long or more. When
writing templates like these, it's a good idea to write them in a separate file, using a text editor. You can easily
extract a chat template to a file:
```python
open("template.jinja", "w").write(tokenizer.chat_template)
```
Or load the edited template back into the tokenizer:
```python
tokenizer.chat_template = open("template.jinja").read()
```
As an added bonus, when you write a long, multi-line template in a separate file, line numbers in that file will
exactly correspond to line numbers in template parsing or execution errors. This will make it much easier to
identify the source of issues.
## Writing templates for tools
Although chat templates do not enforce a specific API for tools (or for anything, really), we recommend
template authors try to stick to a standard API where possible. The whole point of chat templates is to allow code
to be transferable across models, so deviating from the standard tools API means users will have to write
custom code to use tools with your model. Sometimes it's unavoidable, but often with clever templating you can
make the standard API work!
Below, we'll list the elements of the standard API, and give tips on writing templates that will work well with it.
### Tool definitions
Your template should expect that the variable `tools` will either be null (if no tools are passed) or a list
of JSON schema dicts. Our chat template methods allow users to pass tools as either JSON schema or Python functions, but when
functions are passed, we automatically generate JSON schema and pass that to your template. As a result, the
`tools` variable that your template receives will always be a list of JSON schemas. Here is
a sample tool JSON schema:
```json
{
"type": "function",
"function": {
"name": "multiply",
"description": "A function that multiplies two numbers",
"parameters": {
"type": "object",
"properties": {
"a": {
"type": "number",
"description": "The first number to multiply"
},
"b": {
"type": "number",
"description": "The second number to multiply"
}
},
"required": ["a", "b"]
}
}
}
```
And here is some example code for handling tools in your chat template. Remember, this is just an example for a
specific format - your model will probably need different formatting!
```text
{%- if tools %}
{%- for tool in tools %}
{{- '<tool>' + tool['function']['name'] + '\n' }}
{%- for argument in tool['function']['parameters']['properties'] %}
{{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
{%- endfor %}
{{- '\n</tool>' }}
{%- endfor %}
{%- endif %}
```
The specific tokens and tool descriptions your template renders should of course be chosen to match the ones your model
was trained with. There is no requirement that your **model** understands JSON schema input, only that your template can translate
JSON schema into your model's format. For example, [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024)
was trained with tools defined using Python function headers, but the Command-R tool template accepts JSON schema,
converts types internally and renders the input tools as Python headers. You can do a lot with templates!
### Tool calls
Tool calls, if present, will be a list attached to a message with the "assistant" role. Note that `tool_calls` is
always a list, even though most tool-calling models only support single tool calls at a time, which means
the list will usually only have a single element. Here is a sample message dict containing a tool call:
```json
{
"role": "assistant",
"tool_calls": [
{
"type": "function",
"function": {
"name": "multiply",
"arguments": {
"a": 5,
"b": 6
}
}
}
]
}
```
And a common pattern for handling them would be something like this:
```text
{%- if message['role'] == 'assistant' and 'tool_calls' in message %}
{%- for tool_call in message['tool_calls'] %}
{{- '<tool_call>' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n</tool_call>' }}
{%- endfor %}
{%- endif %}
```
Again, you should render the tool call with the formatting and special tokens that your model expects.
### Tool responses
Tool responses have a simple format: They are a message dict with the "tool" role, a "name" key giving the name
of the called function, and a "content" key containing the result of the tool call. Here is a sample tool response:
```json
{
"role": "tool",
"name": "multiply",
"content": "30"
}
```
You don't need to use all of the keys in the tool response. For example, if your model doesn't expect the function
name to be included in the tool response, then rendering it can be as simple as:
```text
{%- if message['role'] == 'tool' %}
{{- "<tool_result>" + message['content'] + "</tool_result>" }}
{%- endif %}
```
Again, remember that the actual formatting and special tokens are model-specific - you should take a lot of care
to ensure that tokens, whitespace and everything else exactly match the format your model was trained with!
@ -0,0 +1,287 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Getting Started with Chat Templates for Text LLMs
An increasingly common use case for LLMs is **chat**. In a chat context, rather than continuing a single string
of text (as is the case with a standard language model), the model instead continues a conversation that consists
of one or more **messages**, each of which includes a **role**, like "user" or "assistant", as well as message text.
Much like tokenization, different models expect very different input formats for chat. This is the reason we added
**chat templates** as a feature. Chat templates are part of the tokenizer for text-only LLMs or processor for multimodal LLMs. They specify how to convert conversations, represented as lists of messages, into a single tokenizable string in the format that the model expects.
We'll explore the basic usage of chat templates with text-only LLMs in this page. For detailed guidance on multimodal models, we have a dedicated [documentation page for multimodal models](./chat_template_multimodal), which covers how to work with image, video and audio inputs in your templates.
Let's make this concrete with a quick example using the `mistralai/Mistral-7B-Instruct-v0.1` model:
```python
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
>>> chat = [
... {"role": "user", "content": "Hello, how are you?"},
... {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
... {"role": "user", "content": "I'd like to show off how chat templating works!"},
... ]
>>> tokenizer.apply_chat_template(chat, tokenize=False)
"<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]"
```
Notice how the tokenizer has added the control tokens [INST] and [/INST] to indicate the start and end of
user messages (but not assistant messages!), and the entire chat is condensed into a single string.
If we use `tokenize=True`, which is the default setting, that string will also be tokenized for us.
Now, try the same code, but swap in the `HuggingFaceH4/zephyr-7b-beta` model instead, and you should get:
```text
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
I'd like to show off how chat templating works!</s>
```
Both Zephyr and Mistral-Instruct were fine-tuned from the same base model, `Mistral-7B-v0.1`. However, they were trained
with totally different chat formats. Without chat templates, you would have to write manual formatting code for each
model, and it's very easy to make minor errors that hurt performance! Chat templates handle the details of formatting
for you, allowing you to write universal code that works for any model.
## How do I use chat templates?
As you can see in the example above, chat templates are easy to use. Simply build a list of messages, with `role`
and `content` keys, and then pass it to the [`~PreTrainedTokenizer.apply_chat_template`] or [`~ProcessorMixin.apply_chat_template`] method
depending on what type of model you are using. Once you do that,
you'll get output that's ready to go! When using chat templates as input for model generation, it's also a good idea
to use `add_generation_prompt=True` to add a [generation prompt](#what-are-generation-prompts).
Here's an example of preparing input for `model.generate()`, using `Zephyr` again:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
checkpoint = "HuggingFaceH4/zephyr-7b-beta"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint) # You may want to use bfloat16 and/or move to GPU here
messages = [
{
"role": "system",
"content": "You are a friendly chatbot who always responds in the style of a pirate",
},
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
print(tokenizer.decode(tokenized_chat[0]))
```
This will yield a string in the input format that Zephyr expects.
```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
```
Now that our input is formatted correctly for Zephyr, we can use the model to generate a response to the user's question:
```python
outputs = model.generate(tokenized_chat, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))
```
This will yield:
```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all.
```
Arr, 'twas easy after all!
## Is there an automated pipeline for chat?
Yes, there is! Our text generation pipelines support chat inputs, which makes it easy to use chat models. In the past,
we used a dedicated "ConversationalPipeline" class, but this has since been deprecated and its functionality
has been merged into the [`TextGenerationPipeline`]. Let's try the `Zephyr` example again, but this time using
a pipeline:
```python
from transformers import pipeline
pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
messages = [
{
"role": "system",
"content": "You are a friendly chatbot who always responds in the style of a pirate",
},
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
print(pipe(messages, max_new_tokens=128)[0]['generated_text'][-1]) # Print the assistant's response
```
```text
{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
```
The pipeline will take care of all the details of tokenization and calling `apply_chat_template` for you -
once the model has a chat template, all you need to do is initialize the pipeline and pass it the list of messages!
## What are "generation prompts"?
You may have noticed that the `apply_chat_template` method has an `add_generation_prompt` argument. This argument tells
the template to add tokens that indicate the start of a bot response. For example, consider the following chat:
```python
messages = [
{"role": "user", "content": "Hi there!"},
{"role": "assistant", "content": "Nice to meet you!"},
{"role": "user", "content": "Can I ask a question?"}
]
```
Here's what this will look like without a generation prompt, for a model that uses standard "ChatML" formatting:
```python
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
"""
```
And here's what it looks like **with** a generation prompt:
```python
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
```
Note that this time, we've added the tokens that indicate the start of a bot response. This ensures that when the model
generates text it will write a bot response instead of doing something unexpected, like continuing the user's
message. Remember, chat models are still just language models - they're trained to continue text, and chat is just a
special kind of text to them! You need to guide them with appropriate control tokens, so they know what they're
supposed to be doing.
Not all models require generation prompts. Some models, like LLaMA, don't have any
special tokens before bot responses. In these cases, the `add_generation_prompt` argument will have no effect. The exact
effect that `add_generation_prompt` has will depend on the template being used.
## What does "continue_final_message" do?
When passing a list of messages to `apply_chat_template` or `TextGenerationPipeline`, you can choose
to format the chat so the model will continue the final message in the chat instead of starting a new one. This is done
by removing any end-of-sequence tokens that indicate the end of the final message, so that the model will simply
extend the final message when it begins to generate text. This is useful for "prefilling" the model's response.
Here's an example:
```python
chat = [
{"role": "user", "content": "Can you format the answer in JSON?"},
{"role": "assistant", "content": '{"name": "'},
]
formatted_chat = tokenizer.apply_chat_template(chat, tokenize=True, return_dict=True, return_tensors="pt", continue_final_message=True)
model.generate(**formatted_chat)
```
The model will generate text that continues the JSON string, rather than starting a new message. This approach
can be very useful for improving the accuracy of the model's instruction-following when you know how you want
it to start its replies.
Because `add_generation_prompt` adds the tokens that start a new message, and `continue_final_message` removes any
end-of-message tokens from the final message, it does not make sense to use them together. As a result, you'll
get an error if you try!
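For example, this call should fail (a sketch; the exact exception type and message depend on your version of Transformers):
```python
try:
    tokenizer.apply_chat_template(chat, continue_final_message=True, add_generation_prompt=True)
except ValueError as e:
    print(e)  # the two options are mutually exclusive
```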
<Tip>
The default behaviour of `TextGenerationPipeline` is to set `add_generation_prompt=True` so that it starts a new
message. However, if the final message in the input chat has the "assistant" role, it will assume that this message is
a prefill and switch to `continue_final_message=True` instead, because most models do not support multiple
consecutive assistant messages. You can override this behaviour by explicitly passing the `continue_final_message`
argument when calling the pipeline.
</Tip>
## Can I use chat templates in training?
Yes! This is a good way to ensure that the chat template matches the tokens the model sees during training.
We recommend that you apply the chat template as a preprocessing step for your dataset. After this, you
can simply continue like any other language model training task. When training, you should usually set
`add_generation_prompt=False`, because the added tokens to prompt an assistant response will not be helpful during
training. Let's see an example:
```python
from transformers import AutoTokenizer
from datasets import Dataset
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
chat1 = [
{"role": "user", "content": "Which is bigger, the moon or the sun?"},
{"role": "assistant", "content": "The sun."}
]
chat2 = [
{"role": "user", "content": "Which is bigger, a virus or a bacterium?"},
{"role": "assistant", "content": "A bacterium."}
]
dataset = Dataset.from_dict({"chat": [chat1, chat2]})
dataset = dataset.map(lambda x: {"formatted_chat": tokenizer.apply_chat_template(x["chat"], tokenize=False, add_generation_prompt=False)})
print(dataset['formatted_chat'][0])
```
And we get:
```text
<|user|>
Which is bigger, the moon or the sun?</s>
<|assistant|>
The sun.</s>
```
From here, just continue training like you would with a standard language modelling task, using the `formatted_chat` column.
<Tip>
By default, some tokenizers add special tokens like `<bos>` and `<eos>` to text they tokenize. Chat templates should
already include all the special tokens they need, and so additional special tokens will often be incorrect or
duplicated, which will hurt model performance.
Therefore, if you format text with `apply_chat_template(tokenize=False)`, you should set the argument
`add_special_tokens=False` when you tokenize that text later. If you use `apply_chat_template(tokenize=True)`, you don't need to worry about this!
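For example (a minimal sketch):
```python
formatted_chat = tokenizer.apply_chat_template(chat, tokenize=False)
# The template has already inserted any special tokens the model needs,
# so don't let the tokenizer add them again
inputs = tokenizer(formatted_chat, add_special_tokens=False, return_tensors="pt")
```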
</Tip>
@ -0,0 +1,289 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Multimodal Chat Templates for Vision and Audio LLMs
In this section, we'll explore how to use chat templates with multimodal models, enabling your templates to handle a variety of inputs such as text, images, and audio. Multimodal models provide richer, more interactive experiences, and understanding how to effectively combine these inputs within your templates is key. We'll walk through how to work with different modalities, configure your templates for optimal performance, and tackle common challenges along the way.
Just like with text-only LLMs, multimodal models expect a chat with **messages**, each of which includes a **role** and **content**. However, for multimodal models, chat templates are a part of the [Processor](./main_classes/processors) class. Let's see how we can format our prompts when there are images or videos in the input along with text.
## Image inputs
For models such as [LLaVA](https://huggingface.co/llava-hf), the prompts can be formatted as below. Notice that the only difference from text-only models is that we also need to pass a placeholder for input images. To accommodate extra modalities, each **content** is a list containing either a text or an image **type**.
Let's make this concrete with a quick example using the `llava-hf/llava-onevision-qwen2-0.5b-ov-hf` model:
```python
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration
model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
processor = AutoProcessor.from_pretrained(model_id)
messages = [
{
"role": "system",
"content": [{"type": "text", "text": "You are a friendly chatbot who always responds in the style of a pirate"}],
},
{
"role": "user",
"content": [
{"type": "image"},
{"type": "text", "text": "What are these?"},
],
},
]
formatted_prompt = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
print(formatted_prompt)
```
This yields a string in LLaVA's expected input format, with an `<image>` placeholder token inserted before the text.
```text
'<|im_start|>system
You are a friendly chatbot who always responds in the style of a pirate<|im_end|><|im_start|>user <image>
What are these?<|im_end|>
```
### Image paths or URLs
To incorporate images into your chat templates, you can pass them as file paths or URLs. This method automatically loads the image, processes it, and prepares the necessary pixel values to create ready-to-use inputs for the model. This approach simplifies the integration of images, enabling seamless multimodal functionality.
Let's see how it works with an example using the same model as above. This time we'll indicate an image URL with the `"url"` key in the message's **content** and ask the chat template to `tokenize` and `return_dict`. Currently, "base64", "url", and "path" are supported image sources.
```python
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration
model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
model = LlavaOnevisionForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)
messages = [
{
"role": "system",
"content": [{"type": "text", "text": "You are a friendly chatbot who always responds in the style of a pirate"}],
},
{
"role": "user",
"content": [
{"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
{"type": "text", "text": "What are these?"},
],
},
]
processed_chat = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt")
print(processed_chat.keys())
```
This yields a dictionary with inputs processed and ready to be further passed into [`~GenerationMixin.generate`] to generate text.
```text
dict_keys(["input_ids", "attention_mask", "pixel_values", "image_sizes"])
```
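From here, generation works just like the text-only case. Here is a short sketch continuing the example above:
```python
out = model.generate(**processed_chat, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```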
## Video inputs
Some vision models support videos as inputs as well as images. The message format is very similar to the image-only models with tiny differences to handle loading videos from a URL. We can continue using the same model as before since it supports videos.
### Sampling with fixed number of frames
Here's an example of how to set up a conversation with video inputs. Notice the extra `kwargs` passed to `processor.apply_chat_template()`. The key parameter here is `num_frames`, which controls how many frames to sample uniformly from the video. Each model checkpoint has a maximum frame count it was trained with, and exceeding this limit can significantly impact generation quality. So, it's important to choose a frame count that fits both the model's capacity and your computational resources. If you don't specify `num_frames`, the entire video will be loaded without any frame sampling.
You also have the option to choose a specific framework to load the video, depending on your preferences or needs. Currently, we support `decord`, `pyav` (the default), `opencv`, and `torchvision`. For this example, we'll use `decord`, as it's a bit faster than `pyav`.
<Tip>
Note that if you are trying to load a video from a URL, you can only decode it with the `pyav` or `decord` backends.
</Tip>
```python
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration
model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
model = LlavaOnevisionForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)
messages = [
{
"role": "system",
"content": [{"type": "text", "text": "You are a friendly chatbot who always responds in the style of a pirate"}],
},
{
"role": "user",
"content": [
{"type": "video", "url": "https://test-videos.co.uk/vids/bigbuckbunny/mp4/h264/720/Big_Buck_Bunny_720_10s_10MB.mp4"},
{"type": "text", "text": "What do you see in this video?"},
],
},
]
processed_chat = processor.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
num_frames=32,
video_load_backend="decord",
)
print(processed_chat.keys())
```
### Sampling with FPS
When working with long videos, you might want to sample more frames for better representation. Instead of a fixed number of frames, you can specify `video_fps`, which determines how many frames per second to extract. For example, if a video is **10 seconds long** and you set `video_fps=2`, the model will sample **20 frames** (2 per second, uniformly spaced).
Using the same model as above, here is how to apply the chat template so that it samples 2 frames per second:
```python
processed_chat = processor.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
video_fps=2,
video_load_backend="decord",
)
print(processed_chat.keys())
```
### Custom Frame Sampling with a Function
Not all models sample frames **uniformly** — some require more complex logic to determine which frames to use. If your model follows a different sampling strategy, you can **customize** frame selection by providing a function:
🔹 Use the `sample_indices_fn` argument to pass a **callable function** for sampling.
🔹 If provided, this function **overrides** standard `num_frames` and `fps` methods.
🔹 It receives all the arguments passed to `load_video` and must return **valid frame indices** to sample.
You should use `sample_indices_fn` when:
- You need a custom sampling strategy (e.g., **adaptive frame selection** instead of uniform sampling).
- Your model prioritizes **key moments** in a video rather than evenly spaced frames.
Here's an example of how to implement it:
```python
def sample_indices_fn(metadata, **kwargs):
# samples only the first and the second frame
return [0, 1]
processed_chat = processor.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
sample_indices_fn=sample_indices_fn,
video_load_backend="decord",
)
print(processed_chat.keys())
```
By using `sample_indices_fn`, you gain **full control** over frame selection, making your model **more adaptable** to different video scenarios. 🚀
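As a slightly more realistic sketch, here is a function that samples roughly one frame per second. It assumes the metadata object exposes `fps` and `total_num_frames` attributes; check the video utilities in your version for the exact field names:
```python
def sample_one_frame_per_second(metadata, **kwargs):
    # Assumes `metadata` carries `fps` and `total_num_frames` (field names are assumptions)
    step = max(int(metadata.fps), 1)
    return list(range(0, metadata.total_num_frames, step))

processed_chat = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    sample_indices_fn=sample_one_frame_per_second,
    video_load_backend="decord",
)
```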
### List of image frames as video
Sometimes, instead of having a full video file, you might only have a set of sampled frames stored as images.
You can pass a list of image file paths, and the processor will automatically concatenate them into a video. Just make sure that all images have the same size, as they are assumed to be from the same video.
```python
frames_paths = ["/path/to/frame0.png", "/path/to/frame5.png", "/path/to/frame10.png"]
messages = [
{
"role": "system",
"content": [{"type": "text", "text": "You are a friendly chatbot who always responds in the style of a pirate"}],
},
{
"role": "user",
"content": [
{"type": "video", "path": frames_paths},
{"type": "text", "text": "What do you see in this video?"},
],
},
]
processed_chat = processor.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
)
print(processed_chat.keys())
```
## Multimodal conversational pipeline
[`ImageTextToTextPipeline`] currently accepts images as inputs but we are planning to add support for video inputs in the future. The pipeline supports chat inputs in the same format as we have seen above. Apart from that, the pipeline will accept chats in OpenAI format. This format is supported exclusively within the pipeline to make inference easier and more accessible.
Here is how the OpenAI conversation format looks:
```python
messages = [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is in this image?",
},
{
"type": "image_url",
"image_url": {"url": f"http://images.cocodataset.org/val2017/000000039769.jpg"},
},
],
}
]
```
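For example, you could pass the OpenAI-format messages above straight to the pipeline (a sketch; the exact output structure may vary across versions):
```python
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/llava-onevision-qwen2-0.5b-ov-hf")
out = pipe(text=messages, max_new_tokens=32)
print(out[0]["generated_text"][-1])  # the assistant's reply
```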
## Best Practices for Multimodal Template Configuration
To add a custom chat template for your multimodal LLM, simply create your template using [Jinja](https://jinja.palletsprojects.com/en/3.1.x/templates/) and set it with `processor.chat_template`. If you're new to writing chat templates or need some tips, check out our [tutorial here](./chat_template_advanced) for helpful guidance.
In some cases, you may want your template to handle a **list of content** from multiple modalities, while still supporting a plain string for text-only inference. Here's an example of how you can achieve that, using the [Llama-Vision](https://huggingface.co/collections/meta-llama/metas-llama-32-multimodal-models-675bfd70e574a62dd0e4059b) chat template.
```
{% for message in messages %}
{% if loop.index0 == 0 %}{{ bos_token }}{% endif %}
{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' }}
{% if message['content'] is string %}
{{ message['content'] }}
{% else %}
{% for content in message['content'] %}
{% if content['type'] == 'image' %}
{{ '<|image|>' }}
{% elif content['type'] == 'text' %}
{{ content['text'] }}
{% endif %}
{% endfor %}
{% endif %}
{{ '<|eot_id|>' }}
{% endfor %}
{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}
```
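Once you're happy with the template, assigning and saving it is straightforward (a short sketch; `my-model-checkpoint` is a placeholder path):
```python
with open("template.jinja") as f:
    processor.chat_template = f.read()

# The chat template is saved alongside the rest of the processor files
processor.save_pretrained("my-model-checkpoint")
```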
@ -0,0 +1,410 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Expanding Chat Templates with Tools and Documents
The only argument that `apply_chat_template` requires is `messages`. However, you can pass any keyword
argument to `apply_chat_template` and it will be accessible inside the template. This gives you a lot of freedom to use
chat templates for many things. There are no restrictions on the names or the format of these arguments - you can pass
strings, lists, dicts or whatever else you want.
That said, there are some common use-cases for these extra arguments,
such as passing tools for function calling, or documents for retrieval-augmented generation. In these common cases,
we have some opinionated recommendations about what the names and formats of these arguments should be, which are
described in the sections below. We encourage model authors to make their chat templates compatible with this format,
to make it easy to transfer tool-calling code between models.
## Tool use / function calling
"Tool use" LLMs can choose to call functions as external tools before generating an answer. When passing tools
to a tool-use model, you can simply pass a list of functions to the `tools` argument:
```python
from datetime import datetime
def current_time():
"""Get the current local time as a string."""
return str(datetime.now())
def multiply(a: float, b: float):
"""
A function that multiplies two numbers
Args:
a: The first number to multiply
b: The second number to multiply
"""
return a * b
tools = [current_time, multiply]
model_input = tokenizer.apply_chat_template(
messages,
tools=tools
)
```
In order for this to work correctly, you should write your functions in the format above, so that they can be parsed
correctly as tools. Specifically, you should follow these rules:
- The function should have a descriptive name
- Every argument must have a type hint
- The function must have a docstring in the standard Google style (in other words, an initial function description
followed by an `Args:` block that describes the arguments, unless the function does not have any arguments.)
- Do not include types in the `Args:` block. In other words, write `a: The first number to multiply`, not
`a (int): The first number to multiply`. Type hints should go in the function header instead.
- The function can have a return type and a `Returns:` block in the docstring. However, these are optional
because most tool-use models ignore them.
### Passing tool results to the model
The sample code above is enough to list the available tools for your model, but what happens if it wants to actually use
one? If that happens, you should:
1. Parse the model's output to get the tool name(s) and arguments.
2. Add the model's tool call(s) to the conversation.
3. Call the corresponding function(s) with those arguments.
4. Add the result(s) to the conversation.
### A complete tool use example
Let's walk through a tool use example, step by step. For this example, we will use an 8B `Hermes-2-Pro` model,
as it is one of the highest-performing tool-use models in its size category at the time of writing. If you have the
memory, you can consider using a larger model instead like [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-v01)
or [Mixtral-8x22B](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1), both of which also support tool use
and offer even stronger performance.
First, let's load our model and tokenizer:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
checkpoint = "NousResearch/Hermes-2-Pro-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")
```
Next, let's define a list of tools:
```python
def get_current_temperature(location: str, unit: str) -> float:
"""
Get the current temperature at a location.
Args:
location: The location to get the temperature for, in the format "City, Country"
unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
Returns:
The current temperature at the specified location in the specified units, as a float.
"""
return 22. # A real function should probably actually get the temperature!
def get_current_wind_speed(location: str) -> float:
"""
Get the current wind speed in km/h at a given location.
Args:
location: The location to get the temperature for, in the format "City, Country"
Returns:
The current wind speed at the given location in km/h, as a float.
"""
return 6. # A real function should probably actually get the wind speed!
tools = [get_current_temperature, get_current_wind_speed]
```
Now, let's set up a conversation for our bot:
```python
messages = [
{"role": "system", "content": "You are a bot that responds to weather queries. You should reply with the unit used in the queried location."},
{"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
]
```
Now, let's apply the chat template and generate a response:
```python
inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
```
And we get:
```text
<tool_call>
{"arguments": {"location": "Paris, France", "unit": "celsius"}, "name": "get_current_temperature"}
</tool_call><|im_end|>
```
The model has called the function with valid arguments, in the format requested by the function docstring. It has
inferred that we're most likely referring to the Paris in France, and it remembered that, as the home of SI units,
the temperature in France should certainly be displayed in Celsius.
<Tip>
The output format above is specific to the `Hermes-2-Pro` model we're using in this example. Other models may emit different
tool call formats, and you may need to do some manual parsing at this step. For example, `Llama-3.1` models will emit
slightly different JSON, with `parameters` instead of `arguments`. Regardless of the format the model outputs, you
should add the tool call to the conversation in the format below, with `tool_calls`, `function` and `arguments` keys.
</Tip>
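If you do need to parse the raw output yourself, a small regex over the model's tool-call markers is often enough. Here is a minimal sketch for the Hermes-style `<tool_call>` format shown above:
```python
import json
import re

raw = tokenizer.decode(out[0][len(inputs["input_ids"][0]):])
match = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", raw, re.DOTALL)
if match:
    parsed = json.loads(match.group(1))
    print(parsed["name"], parsed["arguments"])
```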
Next, let's append the model's tool call to the conversation.
```python
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
```
<Tip warning={true}>
If you're familiar with the OpenAI API, you should pay attention to an important difference here - the `tool_call` is
a dict, but in the OpenAI API it's a JSON string. Passing a string may cause errors or strange model behaviour!
</Tip>
Now that we've added the tool call to the conversation, we can call the function and append the result to the
conversation. Since we're just using a dummy function for this example that always returns 22.0, we can just append
that result directly.
```python
messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
```
<Tip>
Some model architectures, notably Mistral/Mixtral, also require a `tool_call_id` here, which should be
9 randomly-generated alphanumeric characters, and assigned to the `id` key of the tool call
dictionary. The same key should also be assigned to the `tool_call_id` key of the tool response dictionary below, so
that tool calls can be matched to tool responses. So, for Mistral/Mixtral models, the code above would be:
```python
tool_call_id = "9Ae3bDc2F" # Random ID, 9 alphanumeric characters
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "id": tool_call_id, "function": tool_call}]})
```
and
```python
messages.append({"role": "tool", "tool_call_id": tool_call_id, "name": "get_current_temperature", "content": "22.0"})
```
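If you need to generate such an ID yourself, something like this will do (a sketch):
```python
import random
import string

tool_call_id = "".join(random.choices(string.ascii_letters + string.digits, k=9))
```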
</Tip>
Finally, let's let the assistant read the function outputs and continue chatting with the user:
```python
inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
```
And we get:
```text
The current temperature in Paris, France is 22.0 ° Celsius.<|im_end|>
```
Although this was a simple demo with dummy tools and a single call, the same technique works with
multiple real tools and longer conversations. This can be a powerful way to extend the capabilities of conversational
agents with real-time information, computational tools like calculators, or access to large databases.
### Understanding tool schemas
Each function you pass to the `tools` argument of `apply_chat_template` is converted into a
[JSON schema](https://json-schema.org/learn/getting-started-step-by-step). These schemas
are then passed to the model chat template. In other words, tool-use models do not see your functions directly, and they
never see the actual code inside them. What they care about is the function **definitions** and the **arguments** they
need to pass to them - they care about what the tools do and how to use them, not how they work! It is up to you
to read their outputs, detect if they have requested to use a tool, pass their arguments to the tool function, and
return the response in the chat.
Generating JSON schemas to pass to the template should be automatic and invisible as long as your functions
follow the specification above, but if you encounter problems, or you simply want more control over the conversion,
you can handle the conversion manually. Here is an example of a manual schema conversion.
```python
from transformers.utils import get_json_schema
def multiply(a: float, b: float):
"""
A function that multiplies two numbers
Args:
a: The first number to multiply
b: The second number to multiply
"""
return a * b
schema = get_json_schema(multiply)
print(schema)
```
This will yield:
```json
{
"type": "function",
"function": {
"name": "multiply",
"description": "A function that multiplies two numbers",
"parameters": {
"type": "object",
"properties": {
"a": {
"type": "number",
"description": "The first number to multiply"
},
"b": {
"type": "number",
"description": "The second number to multiply"
}
},
"required": ["a", "b"]
}
}
}
```
If you wish, you can edit these schemas, or even write them from scratch yourself without using `get_json_schema` at
all. JSON schemas can be passed directly to the `tools` argument of
`apply_chat_template` - this gives you a lot of power to define precise schemas for more complex functions. Be careful,
though - the more complex your schemas, the more likely the model is to get confused when dealing with them! We
recommend simple function signatures where possible, keeping arguments (and especially complex, nested arguments)
to a minimum.
Here is an example of defining schemas by hand, and passing them directly to `apply_chat_template`:
```python
# A simple function that takes no arguments
current_time = {
"type": "function",
"function": {
"name": "current_time",
"description": "Get the current local time as a string.",
"parameters": {
'type': 'object',
'properties': {}
}
}
}
# A more complete function that takes two numerical arguments
multiply = {
'type': 'function',
'function': {
'name': 'multiply',
'description': 'A function that multiplies two numbers',
'parameters': {
'type': 'object',
'properties': {
'a': {
'type': 'number',
'description': 'The first number to multiply'
},
'b': {
'type': 'number', 'description': 'The second number to multiply'
}
},
'required': ['a', 'b']
}
}
}
model_input = tokenizer.apply_chat_template(
messages,
tools = [current_time, multiply]
)
```
## Retrieval-augmented generation
"Retrieval-augmented generation" or "RAG" LLMs can search a corpus of documents for information before responding
to a query. This allows models to vastly expand their knowledge base beyond their limited context size. Our
recommendation for RAG models is that their template
should accept a `documents` argument. This should be a list of documents, where each "document"
is a single dict with `title` and `text` keys, both of which are strings. Because this format is much simpler
than the JSON schemas used for tools, no helper functions are necessary.
Here's an example of a RAG template in action:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the model and tokenizer
model_id = "CohereForAI/c4ai-command-r-v01-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
device = model.device # Get the device the model is loaded on
# Define conversation input
conversation = [
{"role": "user", "content": "What has Man always dreamed of?"}
]
# Define documents for retrieval-based generation
documents = [
{
"title": "The Moon: Our Age-Old Foe",
"text": "Man has always dreamed of destroying the moon. In this essay, I shall..."
},
{
"title": "The Sun: Our Age-Old Friend",
"text": "Although often underappreciated, the sun provides several notable benefits..."
}
]
# Tokenize conversation and documents using a RAG template, returning PyTorch tensors.
input_ids = tokenizer.apply_chat_template(
conversation=conversation,
documents=documents,
chat_template="rag",
tokenize=True,
add_generation_prompt=True,
return_tensors="pt").to(device)
# Generate a response
gen_tokens = model.generate(
input_ids,
max_new_tokens=100,
do_sample=True,
temperature=0.3,
)
# Decode and print the generated text along with generation prompt
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
<Tip>
The `documents` input for retrieval-augmented generation is not widely supported, and many models have chat templates which simply ignore this input.
To verify if a model supports the `documents` input, you can read its model card, or `print(tokenizer.chat_template)` to see if the `documents` key is used anywhere.
One model class that does support it, though, is Cohere's [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-08-2024) and [Command-R+](https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024), through their `rag` chat template. You can see additional examples of grounded generation using this feature in their model cards.
</Tip>
File diff suppressed because it is too large
@ -30,7 +30,7 @@ DeepSpeed compiles CUDA C++ code and it can be a potential source of errors when
<Tip>
For any other installation issues, please [open an issue](https://github.com/microsoft/DeepSpeed/issues) with the DeepSpeed team.
For any other installation issues, please [open an issue](https://github.com/deepspeedai/DeepSpeed/issues) with the DeepSpeed team.
</Tip>
@ -89,7 +89,7 @@ sudo ln -s /usr/bin/g++-7 /usr/local/cuda-10.2/bin/g++
If you're still having issues with installing DeepSpeed or if you're building DeepSpeed at run time, you can try to prebuild the DeepSpeed modules before installing them. To make a local build for DeepSpeed:
```bash
git clone https://github.com/microsoft/DeepSpeed/
git clone https://github.com/deepspeedai/DeepSpeed/
cd DeepSpeed
rm -rf build
TORCH_CUDA_ARCH_LIST="8.6" DS_BUILD_CPU_ADAM=1 DS_BUILD_UTILS=1 pip install . \
@ -141,7 +141,7 @@ It is also possible to not specify `TORCH_CUDA_ARCH_LIST` and the build program
For training on multiple machines with the same setup, you'll need to make a binary wheel:
```bash
git clone https://github.com/microsoft/DeepSpeed/
git clone https://github.com/deepspeedai/DeepSpeed/
cd DeepSpeed
rm -rf build
TORCH_CUDA_ARCH_LIST="8.6" DS_BUILD_CPU_ADAM=1 DS_BUILD_UTILS=1 \
@ -28,7 +28,7 @@ This guide will walk you through how to deploy DeepSpeed training, the features
## Installation
DeepSpeed is available to install from PyPI or Transformers (for more detailed installation options, take a look at the DeepSpeed [installation details](https://www.deepspeed.ai/tutorials/advanced-install/) or the GitHub [README](https://github.com/microsoft/deepspeed#installation)).
DeepSpeed is available to install from PyPI or Transformers (for more detailed installation options, take a look at the DeepSpeed [installation details](https://www.deepspeed.ai/tutorials/advanced-install/) or the GitHub [README](https://github.com/deepspeedai/DeepSpeed#installation)).
<Tip>
@ -114,10 +114,10 @@ DeepSpeed works with the [`Trainer`] class by way of a config file containing al
<Tip>
Find a complete list of DeepSpeed configuration options on the [DeepSpeed Configuration JSON](https://www.deepspeed.ai/docs/config-json/) reference. You can also find more practical examples of various DeepSpeed configuration examples on the [DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples) repository or the main [DeepSpeed](https://github.com/microsoft/DeepSpeed) repository. To quickly find specific examples, you can:
Find a complete list of DeepSpeed configuration options on the [DeepSpeed Configuration JSON](https://www.deepspeed.ai/docs/config-json/) reference. You can also find more practical examples of various DeepSpeed configuration examples on the [DeepSpeedExamples](https://github.com/deepspeedai/DeepSpeedExamples) repository or the main [DeepSpeed](https://github.com/deepspeedai/DeepSpeed) repository. To quickly find specific examples, you can:
```bash
git clone https://github.com/microsoft/DeepSpeedExamples
git clone https://github.com/deepspeedai/DeepSpeedExamples
cd DeepSpeedExamples
find . -name '*json'
# find examples with the Lamb optimizer
@ -303,7 +303,7 @@ For more information about initializing large models with ZeRO-3 and accessing t
[ZeRO-Infinity](https://hf.co/papers/2104.07857) allows offloading model states to the CPU and/or NVMe to save even more memory. Smart partitioning and tiling algorithms allow each GPU to send and receive very small amounts of data during offloading such that a modern NVMe can fit an even larger total memory pool than is available to your training process. ZeRO-Infinity requires ZeRO-3.
Depending on the CPU and/or NVMe memory available, you can offload both the [optimizer states](https://www.deepspeed.ai/docs/config-json/#optimizer-offloading) and [parameters](https://www.deepspeed.ai/docs/config-json/#parameter-offloading), just one of them, or none. You should also make sure the `nvme_path` is pointing to an NVMe device, because while it still works with a normal hard drive or solid state drive, it'll be significantly slower. With a modern NVMe, you can expect peak transfer speeds of ~3.5GB/s for read and ~3GB/s for write operations. Lastly, [run a benchmark](https://github.com/microsoft/DeepSpeed/issues/998) on your training setup to determine the optimal `aio` configuration.
Depending on the CPU and/or NVMe memory available, you can offload both the [optimizer states](https://www.deepspeed.ai/docs/config-json/#optimizer-offloading) and [parameters](https://www.deepspeed.ai/docs/config-json/#parameter-offloading), just one of them, or none. You should also make sure the `nvme_path` is pointing to an NVMe device, because while it still works with a normal hard drive or solid state drive, it'll be significantly slower. With a modern NVMe, you can expect peak transfer speeds of ~3.5GB/s for read and ~3GB/s for write operations. Lastly, [run a benchmark](https://github.com/deepspeedai/DeepSpeed/issues/998) on your training setup to determine the optimal `aio` configuration.
The example ZeRO-3/Infinity configuration file below sets most of the parameter values to `auto`, but you could also manually add these values.
@ -586,6 +586,20 @@ You can choose the communication data type by setting the `communication_data_ty
}
```
### Universal Checkpointing
[Universal Checkpointing](https://www.deepspeed.ai/tutorials/universal-checkpointing) is an efficient and flexible feature for saving and loading model checkpoints. It enables seamless model training continuation and fine-tuning across different model architectures, parallelism techniques, and training configurations.
Resume training with a universal checkpoint by setting [load_universal](https://www.deepspeed.ai/docs/config-json/#checkpoint-options) to `true` in the config file.
```yaml
{
"checkpoint": {
"load_universal": true
}
}
```
## Deployment
DeepSpeed can be deployed by different launchers such as [torchrun](https://pytorch.org/docs/stable/elastic/run.html), the `deepspeed` launcher, or [Accelerate](https://huggingface.co/docs/accelerate/basic_tutorials/launch#using-accelerate-launch). To deploy, add `--deepspeed ds_config.json` to the [`Trainer`] command line. It's recommended to use DeepSpeed's [`add_config_arguments`](https://deepspeed.readthedocs.io/en/latest/initialize.html#argument-parsing) utility to add any necessary command line arguments to your code.
@ -1143,7 +1157,7 @@ For Transformers>=4.28, if `synced_gpus` is automatically set to `True` if multi
## Troubleshoot
When you encounter an issue, you should consider whether DeepSpeed is the cause of the problem because often it isn't (unless it's super obvious and you can see DeepSpeed modules in the exception)! The first step should be to retry your setup without DeepSpeed, and if the problem persists, then you can report the issue. If the issue is a core DeepSpeed problem and unrelated to the Transformers integration, open an Issue on the [DeepSpeed repository](https://github.com/microsoft/DeepSpeed).
When you encounter an issue, you should consider whether DeepSpeed is the cause of the problem because often it isn't (unless it's super obvious and you can see DeepSpeed modules in the exception)! The first step should be to retry your setup without DeepSpeed, and if the problem persists, then you can report the issue. If the issue is a core DeepSpeed problem and unrelated to the Transformers integration, open an Issue on the [DeepSpeed repository](https://github.com/deepspeedai/DeepSpeed).
For issues related to the Transformers integration, please provide the following information:
@ -1213,7 +1227,7 @@ This means the DeepSpeed loss scaler is unable to find a scaling coefficient to
## Resources
DeepSpeed ZeRO is a powerful technology for training and loading very large models for inference with limited GPU resources, making it more accessible to everyone. To learn more about DeepSpeed, feel free to read the [blog posts](https://www.microsoft.com/en-us/research/search/?q=deepspeed), [documentation](https://www.deepspeed.ai/getting-started/), and [GitHub repository](https://github.com/microsoft/deepspeed).
DeepSpeed ZeRO is a powerful technology for training and loading very large models for inference with limited GPU resources, making it more accessible to everyone. To learn more about DeepSpeed, feel free to read the [blog posts](https://www.microsoft.com/en-us/research/search/?q=deepspeed), [documentation](https://www.deepspeed.ai/getting-started/), and [GitHub repository](https://github.com/deepspeedai/DeepSpeed).
The following papers are also a great resource for learning more about ZeRO:
Some files were not shown because too many files have changed in this diff