Compare commits


259 Commits

Author SHA1 Message Date
1523e08a9e install quant docker 2024-11-21 15:27:59 +01:00
4e90b99ed9 Refactor StarCoder2 using modular (#34015)
* Create modular_starcoder2.py

* Update modular_starcoder2.py

* update

* finalize modular

* revert # no-unravel

* Add support

* style

* Update modular_model_converter.py

* update docstring
2024-11-21 14:52:39 +01:00
18871599c9 Fix heuristic scheduling for UAG (#34805)
* fix heuristic schedule

* fix style

* fix format
2024-11-21 14:46:35 +01:00
d6a5c23f71 Fix ds nvme (#34444)
* skip nested deepspeed.zero.Init call

* make fixup

* solve conflict

* solve conflict

* put back local

* use context managers instead of thread-local state

* Skip recursive calls to deepspeed.zero.Init

* Skip recursive calls to deepspeed.zero.Init

* back to old notebooks

* make style
2024-11-21 13:52:22 +01:00
ae5cbf804b Improve gguf tensor processing (#34515)
* add tensor processing system to separate logic for models

* format refactoring

* small fix

* make some methods private

* move custom methods to processors

* refactor tensor processing

* format fix
2024-11-21 13:40:49 +01:00
c57eafdaa1 Add Nemotron GGUF Loading Support (#34725)
* Add Nemotron GGUF Loading Support

* fix the Nemotron architecture assignation

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-21 11:37:34 +01:00
d4e1acbb7c Change logging level from warning to info for max_steps overriding num_train_epochs (#34810)
Update trainer.py
2024-11-21 11:37:02 +01:00
28fb02fc05 VLMs: enable generation tests - last batch (#34484)
* add tests for 3 more vlms

* fix fuyu back

* skip test
2024-11-21 11:00:22 +01:00
40821a2478 Fix CI slack reporting issue (#34833)
* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-20 21:36:13 +01:00
3cb8676a91 Fix CI by tweaking torchao tests (#34832) 2024-11-20 20:28:51 +01:00
bf42c3bd4b Fix hyperparameter search when optuna+deepspeed (#34642)
* Fix hyperparameter search when optuna+deepspeed

* Adding free_memory to the search setup

---------

Co-authored-by: Corentin-Royer <corentin.royer@ibm.com>
2024-11-20 18:02:58 +01:00
67890de3b8 Torchao weights only + prequantized compatibility (#34355)
* weights only compatibility

* better tests from code review

* pin torch version

* add weights_only check
2024-11-20 17:24:45 +01:00
f297af55df Fix: take into account meta device (#34134)
* Do not load for meta device

* Make some minor improvements

* Add test

* Update tests/utils/test_modeling_utils.py

Update test parameters

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Make the test simpler

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-20 11:32:07 +01:00
8cadf76e1c fix(DPT,Depth-Anything) torch.export (#34103)
* Fix torch.export issue in dpt based models

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Simplify the if statements

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Move activation definitions of zoe_depth to init()

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add test_export for dpt and zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add depth anything

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Remove zoedepth non-automated zoedepth changes and zoedepth test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything, zoedepth

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-20 11:31:21 +01:00
9d16441e4f Fix the memory usage issue of logits in generate() (#34813) 2024-11-20 11:25:37 +01:00
9470d65324 Fix low memory beam search (#34746)
* fix

* higher max positions in tests
2024-11-20 07:46:35 +01:00
145fbd46cb LLaVA OV: fix unpadding precision (#34779)
* fix

* propagate

* type check
2024-11-20 07:46:13 +01:00
3033509327 Translate attention.md into Chinese (#34716)
* try

* tryagain

* tryagggain

* translated

* translated2

* Update docs/source/zh/attention.md

Co-authored-by: Huazhong Ji <hzji210@gmail.com>

---------

Co-authored-by: Huazhong Ji <hzji210@gmail.com>
2024-11-19 10:03:12 -08:00
befbbf2f98 Added image-text-to-text pipeline to task guide (#34783)
* Added image-text-to-text pipeline to task guide

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Merge codeblocks

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-19 09:49:10 -08:00
469eddbe2d Fix check_training_gradient_checkpointing (#34806)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-19 17:48:34 +01:00
05ebe8b9b0 Run test_medium_seamless_m4t_pt in subprocess to avoid many failures (#34812)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-19 17:32:10 +01:00
eedc113914 Add Image Processor Fast Deformable DETR (#34353)
* add deformable detr image processor fast

* add fast processor to doc

* fix copies

* nit docstring

* Add tests gpu/cpu and fix docstrings

* fix docstring

* import changes from detr

* fix imports

* rebase and fix

* fix input data format change in detr and rtdetr fast
2024-11-19 11:18:58 -05:00
b99ca4d28b Add support for OpenAI api "image_url" input in chat for image-text-to-text pipeline (#34562)
* add support for openai api image_url input

* change continue to elif

* Explicitly add support for OpenAI/TGI chat format

* rewrite content to transformers chat format and add tests

* Add support for typing of image type in chat templates

* add base64 to possible image types

* refactor nesting
2024-11-19 11:08:37 -05:00
15dd625a0f Bump aiohttp from 3.10.2 to 3.10.11 in /examples/research_projects/decision_transformer (#34792)
Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.11.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.2...v3.10.11)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-19 16:08:07 +00:00
dc42330388 fix crash in tiiuae/falcon-11B-vlm image-to-text generation (#34728)
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-11-19 16:51:32 +01:00
427b62ed1a Fix post process function called in the instance segmentation example of mask2former (#34588)
* Fix post process function called in the instance segmentation example of mask2former

* fix description and additional notes for post_process_instance_segmentation of maskformers

* remove white space in maskformers post_process_instance_segmentation doc

* change image.size[::-1] to height and width for clarity in segmentation examples
2024-11-19 16:49:25 +01:00
fdb9230485 Add do_convert_rgb to vit (#34523)
* Add: do_convert_rgb

* Add: doc string

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/vit/image_processing_vit.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Add: do_convert_rgb to fast

* Add: convert_to_rgb

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-11-19 16:48:05 +01:00
7b9e51c1a0 Feature: print tokens per second during training (#34507)
* Log tokens per second during training

* Nitpicks

* Move logic into _maybe_log_save_evaluate

* Use speed_metrics
2024-11-19 16:46:04 +01:00
5fa4f64605 🚨🚨🚨 fix(Mask2Former): torch export 🚨🚨🚨 (#34393)
* fix(Mask2Former): torch export

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* revert level_start_index and create a level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Add a comment to explain the level_start_index_list

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Address comment

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* add torch.export.export test

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* rename arg

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* remove spatial_shapes

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* Use the version check from pytorch_utils

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] mask2former

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-11-19 16:44:53 +01:00
581524389a MLU devices: Checks if mlu is available via a cndev-based check which won't trigger the drivers and leave mlu (#34326)
* add Cambricon MLUs support

* fix mlu device rng state

* up for quality check

* up mlu to support fp16

* fix mlu device dependency error

* fix mlu device dependency error

* enable mlu device for bf16

* fix mlu device memory tracker

* Cambricon support SDPA and flash_attn

* MLU devices: Checks if `mlu` is available via a `cndev-based` check which won't trigger the drivers and leave mlu
2024-11-19 16:37:39 +01:00
e3a5889ef0 Modular fix (#34802)
* Modular fix

* style

* remove logger warning

* Update modular_model_converter.py
2024-11-19 16:08:57 +01:00
ce1d328e3b Fix cache_utils for optimum.quanto kvcache quantization (#34750)
* add co-author

Co-authored-by: w3rew <w3rew@users.noreply.github.com>

* fix docs

* fix cache

* remove print

---------

Co-authored-by: w3rew <w3rew@users.noreply.github.com>
2024-11-19 14:16:34 +01:00
4bff54f921 Gemma capping (#34282)
* softcapping

* soft cap before the mask

* style

* ...

* super nit

* update

* fixes

* update

* small issue with modular

* fix modular imports

* update

* fixup

* simplify a hell lot

* simplify cleaning imports

* finish fixing

* update our design

* nits

* use a deprecation cycle

* updates

* Fix modular (recursive deps need to always be computed after merges!)

* push

* fix

* update

* fix modular order

* make fix-copies

* updates

* update

* ?

* don't compile for now

* ?

* fix some stuff

* donc!

* fix copies

* update

* fixup

* ?

* fix two tests

* fix?

* for now, don't use head info

* eager when output attention and sdpa or flash as it's the simplest behaviour (for our tests as well :))

* fix-copies

* revert sdpa check

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* rebase, fix-copies and push

* add a slow integration test

* update the test

* fix left padding issue

* fix test

* remove duplicate scaling

* quality

* add a small test and make sure it works

* 2b

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-19 13:52:38 +01:00
54739a320e Self-speculation (Layer-Skip Llama) (#34240)
* 😅

* early exit (#34244)

* mvp

* docs and tests

* a few fixes

* no shared cache

* Apply suggestions from code review

Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>

* docs

* make fix-copies

* cohere fix

* [test all]

* [test all] consistent model code copies

* [test all] make fix-copies :D

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>

* Update src/transformers/generation/candidate_generator.py

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* [test all] don't use a stand-alone attribute; fix test

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-11-19 12:20:07 +00:00
5de58d5955 fix cpu bnb path (#34647)
* fix cpu bnb path

* Update src/transformers/generation/utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix awq quantizer env check

* fix awq quantizer device check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-19 12:44:44 +01:00
3cd78be34e Fix: siglip image processor rgb_convert is not being applied correctly. (#34301)
Fix: do_convert_rgb
2024-11-19 12:40:36 +01:00
0db91c3c8d Support gradient checkpointing in Qwen2VL ViT (#34724)
* Support gradient checkpointing in Qwen2VL ViT

* Enable gradient checkpoint tests for Qwen2VL

* [run-slow] qwen2_vl
2024-11-19 12:30:44 +01:00
1a0cd69435 feat: allow to use hf-hub models for timm backbone (#34729)
Currently a backbone name like 'hf-hub:bioptimus/H-optimus-0' throws an
error, even though it could work.
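
For reference, the `hf-hub:` prefix is how timm itself resolves Hub checkpoints; a minimal sketch of that usage (the checkpoint name is just a small public example, and the transformers-side plumbing added by this PR is not shown):

```python
# Requires `pip install timm torch`; downloads weights from the Hugging Face Hub.
import timm
import torch

# timm accepts "hf-hub:<repo_id>" model names and fetches the weights from the Hub;
# this PR lets such names pass through as backbone identifiers instead of erroring.
backbone = timm.create_model("hf-hub:timm/resnet18.a1_in1k", pretrained=True, num_classes=0)
features = backbone(torch.randn(1, 3, 224, 224))
print(features.shape)  # (1, 512) pooled features from the ResNet-18 backbone
```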

Co-authored-by: Christian Gebbe <>
2024-11-19 10:26:35 +00:00
d8a5d31d9c Trainer hyperparameter search kwargs docs update (#34459)
* doc: Trainer.hyperparameter_search docstring discrepancy solved

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-19 11:23:03 +01:00
dadb286f06 protect tensor parallel usage (#34800)
protect
2024-11-19 09:54:11 +01:00
eed11f34ab Fix Whisper CI (#34617)
* Revert "Revert "Fix Whisper CI" (#34605)"

This reverts commit 74d3824cc0725829e7d92e1d43b97be1f18454f8.

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-18 21:37:50 +01:00
759a378ee5 Allow handling files as args for a tool created with Tool.from_space (#34687)
* Allow handling files as args for a tool created with `Tool.from_space`
2024-11-18 20:15:35 +01:00
20142ab542 Simplify Tensor Parallel implementation with PyTorch TP (#34184)
* Simplify Tensor Parallel implementation with PyTorch TP

* Move tp_plan to config

* Lint

* Format and warning

* Disable copy-from check

* Conditionally get attr from config

* make fix-copies

* Move base_model_tp_plan to PretrainedConfig

* Move TP into from_pretrained

* Add device context for load

* Do not serialize

* Move _tp_plan setting to post_init

* Add has_tp_plan

* Add test_tp

* Add 'Multi-gpu inference' doc

* Add backward support for device type identification

* Auto-detect accelerator

* supports_tp_plan

* copyright year

* Fix copy
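
A hedged usage sketch of the loading path these bullets describe; the `tp_plan="auto"` keyword is inferred from the bullet names and the PR's 'Multi-gpu inference' doc, so treat it as an assumption rather than confirmed API, and run it under torchrun with one process per GPU:

```python
# torchrun --nproc-per-node 4 run_tp.py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # example; needs a config with a base_model_tp_plan
model = AutoModelForCausalLM.from_pretrained(model_id, tp_plan="auto", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Tensor parallelism shards each layer across GPUs.", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```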
2024-11-18 19:51:49 +01:00
7df93d6ffb fix: Wrong task mentioned in docs (#34757) 2024-11-18 18:42:28 +00:00
7693b62268 Fix callback key name (#34762)
Fixes typo.
2024-11-18 18:41:12 +00:00
1ef6c5f1c5 fix: Update pixel_values parameter in hf_model input (#34782) 2024-11-18 18:40:01 +00:00
e80a65ba4f [tests] add XPU part to testing (#34778)
add XPU part to testing

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-11-18 09:59:11 -08:00
9568a9dfc5 [docs] add XPU besides CUDA, MPS etc. (#34777)
add XPU
2024-11-18 09:58:50 -08:00
8568bf1bcf [docs] make empty_cache device-agnostic (#34774)
make device-agnostic
2024-11-18 09:58:26 -08:00
36759f3312 make sure to disable gradients for integer tensor (#32943) 2024-11-18 16:49:37 +01:00
1c471fc307 Fix skip of test_training_gradient_checkpointing (#34723)
19d58d31f introduced a context manager to manage subtests of
test_training_gradient_checkpointing. However, the test body was not
moved under the "with" statement. Thus, while the tests were correctly
marked as skipped, their bodies were still executed; in some cases,
as with llama, this caused attribute errors.
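
A minimal sketch of the pitfall, using unittest's subTest as the skip scope (the real test relies on transformers' own helpers, so the names here are only illustrative):

```python
import unittest

def run_variant(variant: str) -> bool:
    # Stand-in for the real gradient-checkpointing training step.
    return True

class ExampleTest(unittest.TestCase):
    def test_gradient_checkpointing(self):
        for variant in ("plain", "use_reentrant=False"):
            with self.subTest(variant=variant):
                if variant == "use_reentrant=False":
                    self.skipTest("not supported for this model")
                # Only code indented under the `with` is protected by the skip.
                self.assertTrue(run_variant(variant))
            # Code placed here would still run even when the subtest above was
            # reported as skipped -- the shape of the bug this commit fixes.

if __name__ == "__main__":
    unittest.main()
```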

Fixes: #34722
Fixes: 19d58d31f ("Add MLLama (#33703)")

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-11-18 15:45:40 +01:00
c772d4d91e fix a typo bug where 'id2label' was incorrectly written as 'i2label' when reading config (#34637)
fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from pretrained config
2024-11-18 14:41:48 +01:00
eb0ab3ed4b Fix broken link (#34618) 2024-11-18 14:13:26 +01:00
1646ffb4d1 VLMs: patch_size -> num_image_tokens in processing (#33424)
* use num additional tokens

* fix copies + docs

* another fix copies :)

* add docs

* move order for BC
2024-11-18 13:21:07 +01:00
3ee24e2208 Add OLMo November 2024 (#34551)
* Add model skeleton with transformers-cli add-new-model-like

* Convert config to modular, add rms_norm_eps, delete clip_qkv

* Convert model to modular, add RMSNorm

* Add flash attention with qk norm and no qkv clipping

* Add decoder layer with RMSNorm after attention/feedforward layers

* Add base and causal model

* Add converter improvements from OLMo repo

* Update weight loading in OLMo to HF converter

* Set correct default for rms_norm_eps

* Set correct pipeline_model_mapping in test

* Run make fixup

* Fix model type

* Re-run modular conversion

* Manually set config docs to fix build errors

* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors

* Start updating tests

* Update tests

* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124

* Rename input_layernorm and post_attention_layernorm to reflect their ops better

* Use correct tokenizer

* Remove test unsupported by GPT2 tokenizer

* Create GenerationConfig outside of from_pretrained call

* Use simpler init file structure

* Add explicit __all__ to support simplified init

* Make safetensor serialization the default

* Update OLMo November 2024 docs
2024-11-18 10:43:10 +01:00
13493215ab 🧼 remove v4.44 deprecations (#34245)
* remove v4.44 deprecations

* PR comments

* deprecations scheduled for v4.50

* hub version update

* make fixup

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-15 23:07:24 +01:00
8d50fda644 Remove FSDP wrapping from sub-models. (#34452)
* Remove FSDP wrapping from sub-models.

* solve conflict trainer.py

* make fixup

* add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size

* put back extract_model_from_parallel

* use transformers unwrap_model
2024-11-15 23:00:03 +01:00
b0c0ba7b4d FSDP grad accum fix (#34645)
* add gradient accumulation steps tests for fsdp

* invert no_sync context to fix training for fsdp
2024-11-15 22:28:06 +01:00
52ea4aa589 add xpu path for awq (#34712)
* add xpu path for awq

* update readme
2024-11-15 15:45:24 +01:00
7b3d615bc2 fix(wandb): pass fake dataset to avoid exception in trainer (see #34455) (#34720) 2024-11-15 15:44:02 +01:00
f5dbfab7f3 Update llava.md (#34749)
LLava -> Llava
2024-11-15 15:39:57 +01:00
8ba3e1505e Retain newlines in chat template when continue_final_message=True (#34253)
* Retain newlines in chat template when

* Add try/except

* Add regression test

* Simplify test

* Apply suggestions from code review

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-11-15 14:27:04 +00:00
a3d69a8994 [docs] add xpu device check (#34684)
* add XPU path

* use accelerate API

* Update docs/source/en/tasks/semantic_segmentation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update more places with accelerate API

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-13 14:16:59 -08:00
68f8186a89 Fix example in EsmConfig docstring (#34653) 2024-11-13 13:55:58 -08:00
e7c36a9d57 [docs] Broken link in generation_strategies (#34717)
[docs] Broken link
2024-11-13 13:44:42 -08:00
be8748a53c 🌐 [i18n-KO] Translated marian.md to Korean (#34698)
* initial translation

* removed english

* Fixed Trivial Typos, updated _toctree.yml
2024-11-13 13:14:23 -08:00
33eef99250 Agents: Small fixes in streaming to gradio + add tests (#34549)
* Better support transformers.agents in gradio: small fixes and additional tests
2024-11-11 20:52:09 +01:00
6de2a4d1f1 [i18n-ar] Translated file : docs/source/ar/torchscript.md into Arabic (#33079)
* Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Merge troubleshooting.md with this Branch

* Update _toctree.yml

* Update torchscript.md

* Update troubleshooting.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-11-11 10:41:01 -08:00
25f510a9c6 [docs] update not-working model revision (#34682)
update revision
2024-11-11 07:09:31 -08:00
3ea3ab62d8 Agents: turn any Space into a Tool with Tool.from_space() (#34561)
* Agents: you can now load a Space as a tool
2024-11-10 12:22:40 +01:00
134ba90da9 Update llm_engine.py (#33332)
* Update llm_engine.py
- Added support for optional token and max_tokens parameters in the constructor.
- Provided usage examples and detailed documentation for each method.
2024-11-10 12:19:20 +01:00
768f3c016e [i18n-ar] Translated file : docs/source/ar/trainer.md into Arabic (#33080)
* Add docs/source/ar/trainer.md to Add_docs_source_ar_trainer.md

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update trainer.md

* Update trainer.md

* Update trainer.md

* Create _toctree.yml

* Delete docs/source/ar/_toctree.yml

* Update _toctree.yml - add trainer

* Update _toctree.yml

* merge serialization.md into this branch

* merge sagemaker.md into this PR

* Update _toctree.yml

* Update docs/source/ar/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ar/trainer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-09 11:26:28 -08:00
a06a0d1263 🌐 [i18n-KO] Translated bert.md to Korean (#34627)
* Translated bert.md, Need additional check

* Translation 2nd ver, changed _toctree.yml

* Fixed Typo

* Update bert.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update bert.md

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update bert.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update bert.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-07 18:56:09 -08:00
1cf17077bf 🌐 [i18n-KO] Translated timesformer.md to Korean (#33972)
* docs: ko: model_doc/timesformer.md

* feat: nmt draft

* fix: manual edits

* fix_toctree

* fix toctree on Video Models
2024-11-07 11:04:27 -08:00
6938524a28 fix(dvclive): pass fake dataset to avoid exception in trainer init (#34455)
fix(dvclive): pass fake dataset to avoid exception in trainer
2024-11-07 15:57:34 +01:00
7bbc624743 🌐 [i18n-KO] Translated convbert.md to Korean (#34599)
* docs: ko: convbert.md

* Update _toctree.yml

* feat: nmt draft
2024-11-05 09:32:17 -08:00
e83aaaa86b Fix use_parallel_residual and qkv_bias for StableLM GGUF config extraction (#34450)
* fix stablelm qkv_bias

* fix stablelm qkv_bias and use_parallel_residual

* remove original_model.config for stablelm gguf test
2024-11-05 18:26:20 +01:00
9f28d0c5d0 Fix torchvision interpolation CI (#34539)
fix-torch-interpolation-ci
2024-11-05 11:02:14 -05:00
d2bae7ee9d Changing __repr__ in torchao to show quantized Linear (#34202)
* Changing __repr__ in torchao

* small update

* make style

* small update

* add LinearActivationQuantizedTensor

* remove some cases

* update imports & handle return None

* update
2024-11-05 16:11:02 +01:00
f2d5dfbab2 Remove @slow for test_eager_matches_sdpa_inference (#34558)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-05 16:10:42 +01:00
082e57e0d4 Fix #34494 assistant tokens when truncated (#34531)
* Fix assistant tokens when truncated

* fix test

* fix test

* step
2024-11-05 15:10:15 +00:00
74d3824cc0 Revert "Fix Whisper CI" (#34605)
Revert "Fix Whisper CI (#34541)"

This reverts commit eb811449a2389e48930c45f84c88fd041735cf92.
2024-11-05 15:12:47 +01:00
45b0c7680c Remove unused test_dataset (#34516) 2024-11-05 14:01:25 +00:00
663c851239 DistilBERT is ExecuTorch compatible (#34475)
* DistilBERT is ExecuTorch compatible

* [run_slow] distilbert

* [run_slow] distilbert

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-11-05 13:41:48 +01:00
893ad04fad Load sub-configs from composite configs (#34410)
* save/load sub-configs

* nit forgot these

* fix copies

* move test to common

* use dict for sub-configs

* add load-save-laod test

* clean up modeling check

* oops this are correct keys

* fix some tests, missed some composite configs

* this model was missed
2024-11-05 11:34:01 +01:00
5e1fd4e204 FIX: Broken repr of TorchAoConfig (#34560)
FIX Broken repr of TorchAoConfig

The __repr__ method references a non-existent self.kwargs. This is now
fixed.

There does not appear to be a uniform way of defining __repr__ for
quantization configs. I copied the method as implemented for HQQ:

e2ac16b28a/src/transformers/utils/quantization_config.py (L285-L287)
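
A minimal sketch of that `__repr__` pattern (class name plus a JSON dump of the config's fields); the class and attributes below are illustrative, not the actual TorchAoConfig:

```python
import json

class ExampleQuantConfig:
    def __init__(self, quant_type: str = "int4_weight_only", group_size: int = 128):
        self.quant_type = quant_type
        self.group_size = group_size

    def to_dict(self) -> dict:
        return dict(self.__dict__)

    def __repr__(self) -> str:
        # Same shape as the HQQ-style repr referenced above: "<ClassName> {json of fields}".
        return f"{self.__class__.__name__} {json.dumps(self.to_dict(), indent=2, sort_keys=True)}\n"

print(ExampleQuantConfig())
```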
2024-11-05 10:26:13 +01:00
d0b1d8d888 Skip DeepSpeed ZeRO Stage 3 model initialization when bnb (#34395)
* Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized.

* Propagate the quantization state using a context manager

* make fixup
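
A minimal sketch of propagating such state with a context manager instead of thread-local storage; the names are illustrative, not the actual transformers/DeepSpeed integration code:

```python
from contextlib import contextmanager

_skip_zero3_init = False  # module-level flag read by the model-init code path

def should_skip_zero3_init() -> bool:
    return _skip_zero3_init

@contextmanager
def skip_zero3_init():
    # Temporarily signal that deepspeed.zero.Init should be skipped, then restore.
    global _skip_zero3_init
    previous, _skip_zero3_init = _skip_zero3_init, True
    try:
        yield
    finally:
        _skip_zero3_init = previous

with skip_zero3_init():
    assert should_skip_zero3_init()
assert not should_skip_zero3_init()
```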
2024-11-05 10:06:07 +01:00
eb811449a2 Fix Whisper CI (#34541)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-04 21:35:37 +01:00
bfa021be05 fix TrainerState doc because num_input_tokens_seen is unused by defau… (#34593)
fix TrainerState doc because num_input_tokens_seen is unused by default config

Co-authored-by: kangsheng <kangsheng@meituan.com>
2024-11-04 09:42:20 -08:00
0a6795af12 🌐 [i18n-KO] Update README_ko.md (#33098)
* Update README_ko.md

Delete the blank paragraph in the language selection button and Edit to synchronize with the English version of README.md

* [i18n-KO] Update README_ko.md

* Additional edit to keep consistency with the main [documentation](https://huggingface.co/docs/transformers/v4.44.2/ko/index).

* Update README_ko.md

Additional update.
* Change docs link to Korean translated page if it exists.

* Change doc link to korean translated if it exists.

Change the doc link and delete the 'migration' row of the 'Learn more' table, since it does not exist in the main version of the doc.

* modify a link of the main README.md

from
`https://huggingface.co/docs/transformers/index#supported-frameworks`

to
`https://huggingface.co/docs/transformers/index#supported-models-and-frameworks`

since the title of 'supported table' changed.

* [i18n-ko] edit links and sync with main `README.md`

* docs/change comment to Korean1

Change English comment to Korean

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* docs/change comment to Korean2

Change English comment to Korean

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* revise to original

to separate `edit_README_ko_md` and `README.md`

* Synchronization with English documentation.

Synchronization with English documentation, and translated a line of comment from English to Korean.

---------

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
2024-11-04 09:42:07 -08:00
1112c54604 🌐 [i18n-KO] Translated perf_train_special.md to Korean (#34590)
* Translated to Ko, 1st version

* updated _toctree.yml
2024-11-04 09:41:44 -08:00
a86bd6f2d8 [i18n-HI] Translated TFLite page to Hindi (#34572)
* [i18n-HI] Translated TFLite page to Hindi

* [i18n-HI] Translated TFLite page to Hindi

* Update docs/source/hi/tflite.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

---------

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
2024-11-04 09:40:30 -08:00
48831b7d11 Add text support to the Trainer's TensorBoard integration (#34418)
* feat: add text support to TensorBoardCallback

* feat: ignore long strings in trainer progress

* docs: add docstring for max_str_len

* style: remove trailing whitespace

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-11-04 17:36:27 +01:00
34927b0f73 MPS: isin_mps_friendly can support 0D tensors (#34538)
* apply fix

* tested

* make fixup
2024-11-04 16:18:50 +00:00
187439c3fa VLM: special multimodal Tokenizer (#34461)
* kinda works

* update

* add tests

* update

* use special tokens in processors

* typo

* fix copies

* fix

* fix moshi after rebase

* update

* fix tests

* update

* Update docs/source/en/main_classes/tokenizer.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update docs

* test for load time adding tokens

* fix some more tests which are now fetched better

* one more fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-11-04 16:37:51 +01:00
ef976a7e18 Update trainer for easier handling of accumulate, compile fixes, and proper reporting (#34511)
* Update trainer for easier handling of accumulate + proper reporting

* test

* Fixup tests

* Full fix

* Fix style

* rm comment

* Fix tests

* Minimize test + remove py 311 check

* Unused import

* Forward contrib credits from discussions

* Fix reported metrics

* Refactor, good as it's going to get

* rm pad tok id check

* object detection and audio are being annoying

* Fin

* Fin x2

---------

Co-authored-by: Gyanateet Dutta <Ryukijano@users.noreply.github.com>
2024-11-04 07:47:34 -05:00
33868a057c [i18n-HI] Translated accelerate page to Hindi (#34443)
* [i18n-HI] Translated accelerate page to Hindi

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

* Update docs/source/hi/accelerate.md

Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>

---------

Co-authored-by: Kay <kay@Kays-MacBook-Pro.local>
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
2024-11-01 08:26:45 -07:00
e2ac16b28a Large modular logic refactoring (#34487)
* rework converter

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* Update modular_model_converter.py

* cleaning

* cleaning

* finalize imports

* imports

* Update modular_model_converter.py

* Better renaming to avoid visiting same file multiple times

* start converting files

* style

* address most comments

* style

* remove unused stuff in get_needed_imports

* style

* move class dependency functions outside class

* Move main functions outside class

* style

* Update modular_model_converter.py

* rename func

* add augmented dependencies

* Update modular_model_converter.py

* Add types_to_file_type + tweak annotation handling

* Allow assignment dependency mapping + fix regex

* style + update modular examples

* fix modular_roberta example (wrong redefinition of __init__)

* slightly correct order in which dependencies will appear

* style

* review comments

* Performance + better handling of dependencies when they are imported

* style

* Add advanced new classes capabilities

* style

* add forgotten check

* Update modeling_llava_next_video.py

* Add priority list ordering in check_conversion as well

* Update check_modular_conversion.py

* Update configuration_gemma.py
2024-11-01 10:13:51 +01:00
86701f2b6f 🔴 🔴 fix query_pre_attn_scalar different from num_heads in default gemma2 config (#34540)
* fix query_pre_attn_scalar different from num_heads in default config

* propagate modular changes

* fix copies

* fix modular copies

* fix copies?

* correct copies fix
2024-11-01 09:06:17 +01:00
4cc0813e28 BLIP: enable generation tests (#34174)
* blip2 tests

* instructblips

* copies

* fix slow tests

* fix

* uncomment this

* clean up after rebase

* should be model main input

* fix overwritten tests

* oops len should be multiple of frame number

* style

* fix some tests
2024-11-01 08:54:48 +01:00
6beb3f1691 Blip: get/set input embeddings correctly (#34152)
* set-get embeds

* add tests

* fix tests

* remove

* return dict True

* fix tests

* why did i remove this

* enabel torchscript tests
2024-11-01 08:39:39 +01:00
b53e44e847 [i18n-ar] Translated file : docs/source/ar/multilingual.md into Arabic (#33048)
* Add docs/source/ar/multilingual.md to Add_docs_source_ar_multilingual.md

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/multilingual.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update _toctree.yml

* Update _toctree.yml

* Add Translated files to branch for merg

* Update _toctree.yml

* Update _toctree.yml

* Update custom_models.md

* Update chat_templating.md

* Update docs/source/ar/create_a_model.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update create_a_model.md

* Update gguf.md

* Update gguf.md

* Update gguf.md

* Update gguf.md

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-31 16:10:09 -07:00
2801d7bcf6 update doc (#34478)
* update doc

* Update docs/source/en/perf_train_cpu.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* delete closing tip

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-31 15:59:23 -07:00
df8640cedb [CLIPSeg] Make interpolate_pos_encoding default to True (#34419)
* Remove interpolate_pos_encoding

* Make fixup

* Make interpolate_pos_encoding default to True

* Reuse existing interpolation

* Add integration test
2024-10-31 22:15:04 +01:00
203e27059b Add image text to text pipeline (#34170)
* Standardize image-text-to-text-models-output

add post_process_image_text_to_text to chameleon and cleanup

Fix legacy kwarg behavior and deprecation warning

add post_process_image_text_to_text to qwen2_vl and llava_onevision

Add post_process_image_text_to_text to idefics3, mllama, pixtral processor

* nit var name post_process_image_text_to_text udop

* nit fix deprecation warnings

* Add image-text-to-text pipeline

* add support for image url in chat template for pipeline

* Reformat to be fully compatible with chat templates

* Add tests chat template

* Fix imports and tests

* Add pipeline tag

* change logic handling of single prompt and multiple images

* add pipeline mapping to models

* fix batched inference

* fix tests

* Add manual batching for preprocessing

* Fix outputs with nested images

* Add support for all common processing kwargs

* Add default padding when multiple text inputs (batch size>1)

* nit change version deprecation warning

* Add support for text only inference

* add chat_template warnings

* Add pipeline tests and add copied from post process function

* Fix batched pipeline tests

* nit

* Fix pipeline tests blip2

* remove unnecessary max_new_tokens

* revert processing kosmos2 and remove unnecessary max_new_tokens

* fix pipeline tests idefics

* Force try loading processor if pipeline supports it

* revert load_processor change

* hardcode loading only processor

* remove unnecessary try except

* skip imagetexttotext tests for kosmos2 as tiny model causes problems

* Make code clearer

* Address review comments

* remove preprocessing logic from pipeline

* fix fuyu

* add BC resize fuyu

* Move post_process_image_text_to_text to ProcessorMixin

* add guard in post_process

* fix zero shot object detection pipeline

* add support for generator input in pipeline

* nit

* change default image-text-to-text model to llava onevision

* fix owlv2 size dict

* Change legacy deprecation warning to only show when True
2024-10-31 15:48:11 -04:00
c443d8d536 Bug Fix for issue #34294 (#34295)
Update SiglipVisionEmbeddings.forward to cast input to correct dtype before embedding it.
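
A minimal sketch of the cast; the module below is illustrative, not the actual SiglipVisionEmbeddings code:

```python
import torch
import torch.nn as nn

class PatchEmbeddings(nn.Module):
    def __init__(self, hidden_size: int = 768, patch_size: int = 16, num_channels: int = 3):
        super().__init__()
        self.patch_embedding = nn.Conv2d(num_channels, hidden_size, kernel_size=patch_size, stride=patch_size)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        # Cast the input to the embedding weights' dtype so bf16/fp16 checkpoints
        # accept fp32 pixel values without a dtype-mismatch error.
        pixel_values = pixel_values.to(dtype=self.patch_embedding.weight.dtype)
        return self.patch_embedding(pixel_values).flatten(2).transpose(1, 2)

embed = PatchEmbeddings().to(torch.bfloat16)
print(embed(torch.randn(1, 3, 224, 224)).dtype)  # torch.bfloat16
```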
2024-10-31 18:51:15 +01:00
114dd812dd make test_eager_matches_sdpa_inference less flaky (#34512)
* try

* try

* try

* try

* try

* try

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 18:34:00 +01:00
294c170ff9 feat: add benchmarks pg indexes (#34536)
* feat: add benchmarks pg indexes

* refactor: remove debug `df -h`
2024-10-31 17:41:06 +01:00
b5919e12f7 fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests (#34518)
* fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>

* [run_slow] dpt, depth_anything

---------

Signed-off-by: Phillip Kuznetsov <philkuz@gimletlabs.ai>
2024-10-31 16:47:58 +01:00
4ca004eac6 Qwen2VL: skip base input_ids-inputs_embeds equivalence check (#34535)
it has complex inputs_embeds computation
2024-10-31 15:42:13 +00:00
ab98f0b0a1 avoid calling gc.collect and cuda.empty_cache (#34514)
* update

* update

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-31 16:36:13 +01:00
dca93ca076 Fix step shifting when accumulate gradient (#33673)
* replace total_batched_samples with step while counting grad accum step

* remove unused variable

* simplify condition for update step

* fix format by ruff

* simplify update step condition using accelerator.sync_gradients

* simplify update condition using do_sync_step

* remove print for test
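
A minimal sketch of the condition these bullets describe, using Hugging Face Accelerate (`sync_gradients` is a real Accelerator attribute; the tiny model and fake data are illustrative only):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator(gradient_accumulation_steps=4)
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = [(torch.randn(2, 8), torch.randn(2, 1)) for _ in range(16)]
model, optimizer = accelerator.prepare(model, optimizer)

global_step = 0
for inputs, targets in data:
    with accelerator.accumulate(model):
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        accelerator.backward(loss)
        optimizer.step()   # no-op on non-sync micro-batches once prepared by Accelerate
        optimizer.zero_grad()
    if accelerator.sync_gradients:  # True only on steps that complete an accumulation cycle
        global_step += 1

print(global_step)  # 4 update steps for 16 micro-batches
```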

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-31 09:53:23 -04:00
1b86772de5 Fix: img size mismatch caused by incorrect unpadding in LLaVA-Next (#34522)
Fix: unpadding img mismatch
2024-10-31 14:32:45 +01:00
f38531619d enable QA bf16 pipeline (#34483)
* enable QA bf16 pipeline

* add tests
2024-10-31 12:55:53 +00:00
405b562698 UPDATE Documentation for #TRANSLATING.md Documentation into Multiple Languages. (Changes made) (#34226)
* Update TRANSLATING.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update TRANSLATING.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-30 12:37:39 -07:00
48872fd6ae Add Image Processor Fast RT-DETR (#34354)
* add fast image processor rtdetr

* add gpu/cpu test and fix docstring

* remove prints

* add to doc

* nit docstring

* avoid iterating over images/annotations several times

* change torch typing

* Add image processor fast documentation
2024-10-30 13:49:47 -04:00
9f06fb0505 Fix super tiny extra space typo (#34440)
Update training_args.py
2024-10-30 16:55:16 +01:00
5251fe6271 Add GGUF for Mamba (#34200)
* add mamba architecture for gguf

* add logic for weights conversion, some fixes and refactoring

* add lm_head layers, unit test refactoring

* more fixes for tests

* remove lm_head creation

* remove unused comments
2024-10-30 16:52:17 +01:00
eab6c491d4 Use torch 2.5 in scheduled CI (#34465)
* torch 2.5

* try

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-30 14:54:10 +01:00
241d79026f fix pixtral processor (#34486)
* fix pixtral processor

* test out full length batches + remove undue ValueError

* fix up processing

* fix tests

* fix

* last fixup

* style

* [run-slow] pixtral

* [run-slow] pixtral

* fix config key

* skip torchscript tests

* [run-slow] pixtral

* add missing key

* [run-slow] pixtral

* fix docs

* [run-slow] pixtral

* fix wrong url for integration test

* [run-slow] pixtral

* pixtralVisionModel does not have a lm head

* [run-slow] pixtral
2024-10-30 14:17:20 +01:00
8a734ea2c3 Tests: move generate tests to the right mixin and delete redundant tests (#34464)
* tmp commit

* tmp commit

* cull overwrites of deleted tests

* typo

* more specific docstring

* make fixup

* parameterize at the top?

* correction

* more deletions :D

* tmp commit

* for VLMs too

* fix _check_outputs

* test nit

* make fixup

* fix another flaky

* test_generate_from_inputs_embeds -- handle missing attention mask
2024-10-30 10:59:08 +00:00
913330ca9f VLMs: fix number of image tokens (#34332)
* fix

* fix tests

* add tests

* style

* style

* fix qwen after rebase

* fix video llava
2024-10-30 10:21:37 +01:00
0f764a5af7 Mllama: update docs (#34334)
* update docs

* be more explicit

* use avaialble methods
2024-10-30 10:11:50 +01:00
25a9fc584a Fix format mistake in string repr of tokenizer objects (#34493)
* fix repr string format for tokenizer objects

The repr of tokenizer objects is misleading, like this: `Tokenizer(...), added_tokens_decoder={1: ..., 2: ...}`. The dict that is the value of the added_tokens_decoder attribute falls outside the parentheses of the tokenizer object, whereas all other attributes are inside the parentheses, as they should be.

This commit fixes this bug.

* cos: add newline before closing parenthesis of repr string
2024-10-30 10:03:41 +01:00
cd277618d4 Roberta is ExecuTorch compatible (#34425)
* Roberta is ExecuTorch compatible

* [run_slow] roberta

---------

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-30 08:36:45 +00:00
9bee9ff5db Un-deprecate timeout arg in pipelines (#34382)
* Un-deprecate timeout

* Put "timeout" on the allowed list

* make fixup
2024-10-29 18:45:14 +00:00
e4449bb790 fix incorrect warning (#34416) 2024-10-29 14:08:42 -04:00
f55595b177 Fix performance in get_imports regexp (#34298)
* fix: Fix performance in get_imports regexp

* Minimize get_imports content regexp
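
A hedged sketch of the kind of anchored, line-oriented patterns that keep this sort of scanning fast (these are illustrative, not the exact patterns now used in `get_imports`):

```python
import re

# Anchored, line-oriented patterns avoid the pathological backtracking that
# broader ".*"-style expressions can hit on large source files.
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)", re.MULTILINE)
FROM_IMPORT_RE = re.compile(r"^\s*from\s+([\w.]+)\s+import\b", re.MULTILINE)

source = "import os\nfrom collections import OrderedDict\nx = 1\n"
print(IMPORT_RE.findall(source) + FROM_IMPORT_RE.findall(source))  # ['os', 'collections']
```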
2024-10-29 17:29:24 +00:00
4e2e8809ff Bump werkzeug from 3.0.3 to 3.0.6 in /examples/research_projects/decision_transformer (#34420)
Bump werkzeug in /examples/research_projects/decision_transformer

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to 3.0.6.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/3.0.3...3.0.6)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-29 16:42:40 +00:00
e9ad460494 Adding optimizer_cls_and_kwargs to Trainer.__init__ (#34358)
* Adding `optimizer_cls_and_kwargs` to `Trainer.__init__`

* formatting

* make fix-copies docstring

* added more docs for optimizer_cls_and_kwargs

* add docs for Trainer(optimizer_cls_and_kwargs)

* reverting anchor names
2024-10-29 16:23:16 +01:00
f339042b0b Albert is ExecuTorch compatible (#34476)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:22:13 +01:00
34620e8f0a MobileBERT is ExecuTorch compatible (#34473)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 16:14:31 +01:00
56c45d5757 Bug fix for drop path decay rate in swin transformer (#34291)
* potential bug fix for drop path

* variable name change

* forgot to rename the variables

* back to original

* modify dpr properly

* check_copies auto fix

* corresponsing swin2 changes

* auto fix

* linting

* default value for drop_path_rate as 0.0

* Update src/transformers/models/glm/modeling_glm.py

* maskformer fix

* ruff format

* changes made to tf code as well

* lint

---------

Co-authored-by: abhijit deo <167164474+deo-abhijit@users.noreply.github.com>
2024-10-29 16:09:18 +01:00
0ab0a42651 fix-qwen2vl-no-position_ids (#33487) 2024-10-29 15:27:34 +01:00
8755dd26b7 manual head_dim for mixtral model (#34281) 2024-10-29 14:31:36 +01:00
5392f12e16 Bert is ExecuTorch compatible (#34424)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-29 14:30:02 +01:00
004530aa05 Fix regression loading dtype (#34409)
* fix regression

* add test for torchao

* expected output

* better fix
2024-10-29 11:41:04 +01:00
9e3d704e23 Fixes for Modular Converter on Windows (#34266)
* Separator in regex

* Standardize separator for relative path in auto generated message

* open() encoding

* Replace `\` on `os.path.abspath`

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-29 11:40:41 +01:00
626c610a4d Fix perplexity computation in perplexity.md (#34387)
fix average NLL in perplexity.md
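
For context, perplexity is the exponential of the average per-token negative log-likelihood; a minimal sketch with made-up numbers, weighting each evaluation chunk by the number of target tokens it covers:

```python
import math

# (summed NLL, number of target tokens) per evaluation chunk -- illustrative values only.
chunks = [(10.4, 8), (51.0, 40), (2.1, 2)]

total_nll = sum(nll for nll, _ in chunks)
total_tokens = sum(n for _, n in chunks)
perplexity = math.exp(total_nll / total_tokens)  # exp of the per-token average NLL
print(round(perplexity, 3))
```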
2024-10-29 11:10:10 +01:00
439334c8fb Simplify running tests in a subprocess (#34213)
* check

* check

* check

* check

* add docstring

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-29 10:48:57 +01:00
a1835195d1 🚨🚨🚨 [SuperPoint] Fix keypoint coordinate output and add post processing (#33200)
* feat: Added int conversion and unwrapping

* test: added tests for post_process_keypoint_detection of SuperPointImageProcessor

* docs: changed docs to include post_process_keypoint_detection method and switched from opencv to matplotlib

* test: changed test to not depend on SuperPointModel forward

* test: added missing require_torch decorator

* docs: changed pyplot parameters for the keypoints to be more visible in the example

* tests: changed import torch location to make test_flax and test_tf

* Revert "tests: changed import torch location to make test_flax and test_tf"

This reverts commit 39b32a2f69500bc7af01715fc7beae2260549afe.

* tests: fixed import

* chore: applied suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* tests: fixed import

* tests: fixed import (bis)

* tests: fixed import (ter)

* feat: added choice of type for target_size and changed tests accordingly

* docs: updated code snippet to reflect the addition of target size type choice in post process method

* tests: fixed imports (...)

* tests: fixed imports (...)

* style: formatting file

* docs: fixed typo from image[0] to image.size[0]

* docs: added output image and fixed some tests

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* fix: included SuperPointKeypointDescriptionOutput in TYPE_CHECKING if statement and changed tests results to reflect changes to SuperPoint from absolute keypoints coordinates to relative

* docs: changed SuperPoint's docs to print output instead of just accessing

* style: applied make style

* docs: added missing output type and precision in docstring of post_process_keypoint_detection

* perf: deleted loop to perform keypoint conversion in one statement

* fix: moved keypoint conversion at the end of model forward

* docs: changed SuperPointInterestPointDecoder to SuperPointKeypointDecoder class name and added relative (x, y) coordinates information to its method

* fix: changed type hint

* refactor: removed unnecessary brackets

* revert: SuperPointKeypointDecoder to SuperPointInterestPointDecoder

* Update docs/source/en/model_doc/superpoint.md

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

---------

Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
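
A minimal sketch of the coordinate handling described in the bullets (relative keypoints rescaled to pixel coordinates and cast to int); the function name and signature are illustrative, not the actual post-processing API:

```python
import torch

def rescale_keypoints(relative_keypoints: torch.Tensor, image_size: tuple) -> torch.Tensor:
    # relative_keypoints: (num_keypoints, 2) with (x, y) in [0, 1]; image_size: (height, width).
    height, width = image_size
    scale = torch.tensor([width, height], dtype=relative_keypoints.dtype)
    return (relative_keypoints * scale).int()

keypoints = torch.tensor([[0.25, 0.5], [0.75, 0.1]])
print(rescale_keypoints(keypoints, (480, 640)))  # integer pixel coordinates
```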
2024-10-29 09:36:03 +00:00
655bec2da7 use a tiny model to test generation config which avoids timeout (#34482)
* use a tiny model to test generation config which avoids timeout

* remove trailing whitespace
2024-10-29 09:39:06 +01:00
63ca6d9771 Fix CI (#34458)
* fix

* fix mistral
2024-10-29 08:26:04 +01:00
808d6c50f8 Generation: fix test (#34369)
* fix test

* fix copies
2024-10-29 07:57:10 +01:00
fe76b60370 LLaVA: latency issues (#34460)
* fix llavas

* code style

* green ci
2024-10-29 07:54:51 +01:00
a769ed45e1 Add post_process_depth_estimation for GLPN (#34413)
* add depth postprocessing for GLPN

* remove previous temp fix for glpn tests

* Style changes for GLPN's `post_process_depth_estimation`

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* additional style fix

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-28 19:44:20 +01:00
6cc4a67b3d feat: run benchmarks on A100 (#34287) 2024-10-28 19:33:17 +01:00
d21dbd1520 enable average tokens across devices (#34373)
* enable average tokens across devices

* reduce earlier in case model needs it

* simplify if statement

* reformat code to make ruff happy

* add doc for argument: average_tokens_across_devices

* cannot find world size when pytorch is unavailable

* format code

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-28 18:59:38 +01:00
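The `average_tokens_across_devices` flag added in the entry above is exposed through `TrainingArguments`; a minimal configuration sketch (path and batch size are placeholders, not recommendations):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",                    # placeholder path
    per_device_train_batch_size=8,
    average_tokens_across_devices=True,  # average token counts across devices when normalizing the loss
)
```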
a17f287ac0 [i18n-ar] Translated file : docs/source/ar/fast_tokenizers.md into Arabic (#33034)
* Add docs/source/ar/fast_tokenizers.md to Add_docs_source_ar_fast_tokenizers.md

* Update _toctree.yml

* Update _toctree.yml

* Update docs/source/ar/_toctree.yml

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

* Update docs/source/ar/fast_tokenizers.md

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>

---------

Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
2024-10-28 10:54:37 -07:00
084e946cfd Apply linting to the important code blocks to make it readable (#34449)
Enhance user experience using py-linting
2024-10-28 10:48:18 -07:00
1f7539c829 🌐 [i18n-KO] Translated model_doc/barthez.md to Korean (#33980)
* docs: ko: model_doc/barthez.md

* feat: nmt draft

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-28 10:46:49 -07:00
fc1ae7f30f [docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details (#34322)
* [docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details

* [docs] correct input documentation for MISTRAL model to reference `input_ids` instead of `decoder_input_ids`

* [docs] clarify cache_position description in MISTRAL model documentation
2024-10-28 09:14:07 -07:00
c1753436db New option called "best" for args.save_strategy. (#31817)
* Add _determine_best_metric and new saving logic.

1. Logic to determine the best metric was separated out from
`_save_checkpoint`.
2. In `_maybe_log_save_evaluate`, whether or not a new best metric was
achieved is determined after each evaluation, and if the save strategy
is "best" then the TrainerControl is updated accordingly.

* Added SaveStrategy.

Same as IntervalStrategy, but with a new attribute called BEST.

* IntervalStrategy -> SaveStrategy

* IntervalStrategy -> SaveStrategy for save_strat.

* Interval -> Save in docstring.

* Updated docstring for save_strategy.

* Added SaveStrategy and made according changes.

`save_strategy` previously followed `IntervalStrategy` but now follows
`SaveStrategy`.

Changes were made accordingly to the code and the docstring.

* Changes from `make fixup`.

* Removed redundant metrics argument.

* Added new test_save_best_checkpoint test.

1. Checks for both cases where `metric_for_best_model` is explicitly
provided and when it's not provided.
2. The first case should have two checkpoints saved, whereas the second
should have three saved.

* Changed should_training_end saving logic.

The Trainer saves a checkpoint at the end of training by default as
long as `save_strategy != SaveStrategy.NO`. This condition was modified
to include `SaveStrategy.BEST`, because it would be counterintuitive if
we only wanted the best checkpoint to be saved but the last one were
saved as well.

* `args.metric_for_best_model` default to loss.

* Undo metric_for_best_model update.

* Remove checking metric_for_best_model.

* Added test cases for loss and no metric.

* Added error for metric and changed default best_metric.

* Removed unused import.

* `new_best_metric` -> `is_new_best_metric`

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Applied `is_new_best_metric` to all.

Changes were made for consistency and also to fix a potential bug.

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-28 16:02:22 +01:00
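A rough usage sketch for the new "best" save strategy described above, assuming evaluation runs regularly and a metric such as `eval_loss` is reported (all values here are placeholders):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",                   # placeholder path
    eval_strategy="epoch",
    save_strategy="best",               # save a checkpoint only when a new best metric is achieved
    metric_for_best_model="eval_loss",  # assumed metric name
    greater_is_better=False,
)
```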
8b3b9b48fc exclude fsdp from delay_optimizer_creation (#34140)
* exclude fsdp from delay_optimizer_creation

* add test case for trainer: FSDP mode and fp8 as mixed precision

* rearrange imports

* ruff formatted

* adapt _init_fsdp to fp8

* use _init_fsdp only when resume_from_checkpoint

* In case of FSDP, self.layer will be a CheckpointWrapper, which has no len() method

* delete _init_fsdp

* solve conflict

* fix conflict

* make fixup
2024-10-28 13:50:16 +01:00
92bcdff2ef Fix batch size handling in prediction_loop for DataLoaderShard (#34343)
* Fix batch size handling in prediction_loop for DataLoaderShard

Updated the prediction_loop method in the Trainer class to correctly handle batch size when using DataLoaderShard. This ensures that the batch size is retrieved from total_batch_size for distributed training scenarios, preventing TypeError related to NoneType during evaluation.

* Update src/transformers/trainer.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Applied the fix to remove unused imports

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-10-28 13:23:52 +01:00
9360f1827d Tiny update after #34383 (#34404)
* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 12:01:05 +01:00
fc465bb196 pin tensorflow_probability<0.22 in docker files (#34381)
0.21

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 11:59:46 +01:00
fddbd3c13c Fix pix2struct (#34374)
* fix

* fix and test use_cache test

* style

* remove atol
2024-10-28 11:24:56 +01:00
1d06379331 [docs] Cache implementations (#34325)
cache
2024-10-25 08:52:45 -07:00
6a62a6d1b5 Fix typos in agents_advanced.md (#34405) 2024-10-25 08:52:29 -07:00
f73f5e62e2 Avoid check expected exception when it is on CUDA (#34408)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-25 17:14:07 +02:00
e447185b1f Fix bnb training test failure (#34414)
* Fix bnb training test: compatibility with OPTSdpaAttention
2024-10-25 10:23:20 -04:00
186b8dc190 Tests: upgrade test_eager_matches_sdpa_generate (#34386) 2024-10-25 11:55:07 +01:00
8814043c8c SynthID: better example (#34372)
* better example

* Update src/transformers/generation/configuration_utils.py

* Update src/transformers/generation/logits_process.py

* nits
2024-10-25 11:46:46 +01:00
223855314f no filter (#34391)
* no filter

* no filter

* no filter

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-25 12:32:39 +02:00
9f365fe0ac Fix right padding in LLaVA models (#34305)
* fix right pad llavas

* device mismatch
2024-10-25 11:02:07 +02:00
5779bac4c4 Fix onnx non-exportable inplace aten op (#34376)
* fix onnx non-exportable inplace op

* mistral, qwen2, qwen2_vl, starcoder2

* fixup copies
2024-10-25 09:44:09 +02:00
940a6bd343 Use non nested images and batched text Idefics2/3 (#34222)
* add support for non nested images and add tests

* add tests error scenario

* fix style

* added single and no image to error tests
2024-10-24 20:00:13 -04:00
3d99f1746e Fix glm (#34388)
* Fix duplicated

* fix import
2024-10-24 19:17:52 +02:00
a308d28d39 [auto. ping] Avoid sending empty info + add more team members (#34383)
* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-24 19:07:23 +02:00
4c6e0c9252 Correct the new defaults (#34377)
* Correct the new defaults

* CIs

* add check

* Update utils.py

* Update utils.py

* Add the max_length in generate test checking shape without passing length

* style

* CIs

* fix fx CI issue
2024-10-24 18:42:03 +02:00
1c5918d910 Fix torch.fx issue related to the new loss_kwargs keyword argument (#34380)
* Fix FX

* Unskip tests
2024-10-24 18:34:28 +02:00
d9989e0b9a [PEFT] Add warning for missing key in LoRA adapter (#34068)
When loading a LoRA adapter, so far, there was only a warning when there
were unexpected keys in the checkpoint. Now, there is also a warning
when there are missing keys.

This change is consistent with
https://github.com/huggingface/peft/pull/2118 in PEFT and the planned PR
https://github.com/huggingface/diffusers/pull/9622 in diffusers.

Apart from this change, the error message for unexpected keys was
slightly altered for consistency (it should be more readable now). Also,
besides adding a test for the missing keys warning, a test for
unexpected keys warning was also added, as it was missing so far.
2024-10-24 17:56:40 +02:00
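A minimal sketch of where the new missing-keys warning would surface, assuming `peft` is installed and an adapter directory whose checkpoint only partially matches the base model (the path below is hypothetical):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Loading a LoRA adapter whose checkpoint lacks some expected keys should now
# log a warning about the missing keys, in addition to the existing warning
# about unexpected keys. Requires the `peft` package.
model.load_adapter("path/to/partial-lora-adapter")  # hypothetical local path
```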
fe35073319 Ignore unsupported kwarg in ProcessorMixin call (#34285)
Fix accept any common kwargs
2024-10-24 11:46:39 -04:00
e288616606 refactor: remove redundant if-condition and improve type correctness for convert_tokens_to_ids (#34030)
* chore: remove redundant if-condition

* fix: import `Iterable`
2024-10-24 17:40:26 +02:00
450b9cbfac Add code sample docstrings and checkpoint reference for GLM models (#34360)
* Add code sample docstrings and checkpoint reference for GLM models

* Update modular_glm.py

* Update modeling_glm.py
2024-10-24 17:28:51 +02:00
6432ad8bb5 Fix pil_torch_interpolation_mapping import in image_processing_detr_fast (#34375)
fix pil_torch_interpolation_mapping import
2024-10-24 09:22:50 -04:00
dd267fca72 Add T5 GGUF loading support (#33389)
* add: GGUFT5Converter

* add: tensormapping for t5

* add: test code for t5

* fix: Remove whitespace from blank line

* add: t5 fp16 tests

* fix: whitespace formatting

* fix: minor formatting

* fix: testing every weights
2024-10-24 15:10:59 +02:00
30c76d5b28 add code generation to natural language processing section (#34333) 2024-10-24 14:42:47 +02:00
2112027d0c Zamba is an LM (#34342)
* Zamba is an LM

* Addition
2024-10-24 14:29:33 +02:00
b29c24ff1e CI: fix failures (#34371)
fix
2024-10-24 13:44:53 +02:00
f0b3ef9e2e translated gguf.md into chinese (#34163)
* translated gguf.md into chinese

* Apply suggestions from code review

I have updated the PR accordingly. Thank you very much for the detailed guidance, and I'll pay more attention to the details next time.

Co-authored-by: Isotr0py <2037008807@qq.com>

* Apply suggestions from code review

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Isotr0py <2037008807@qq.com>
2024-10-24 11:47:58 +02:00
9643069465 v4.47.0.dev0 2024-10-24 11:23:29 +02:00
f0e640adfa Drop support for Python 3.8 (#34314)
* drop python 3.8

* update docker files

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-24 11:16:55 +02:00
05863817d6 Better defaults (#34026)
* be nice to our users

* nit

* fixup

* default to -1

* oups

* turbo nit

* auto infer framework
2024-10-24 11:11:55 +02:00
65753d6065 Remove graph breaks for torch.compile() in flash_attention_forward when Lllama Model is padding free tuned (#33932)
* fix: fixes for graph breaks

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: formatting

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: import error

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: Add Fa2Kwargs

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* Revert "PR changes"

This reverts commit 39d2868e5c93cc5f3f3c7c6ff981b66614c0e0e4.

* PR changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: FlashAttentionKwarg

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* PR Changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* addition of documentation

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* change in _flash_attention_forward

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* revert make fix-copies

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix copies

* style

* loss kwargs typing

* style and pull latest changes

---------

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-24 11:02:54 +02:00
b0f0c61899 Add SynthID (watermerking by Google DeepMind) (#34350)
* Add SynthIDTextWatermarkLogitsProcessor

* Resolving comments.

* Resolving comments.

* Resolving commits.

* Improving SynthIDWatermark tests.

* switch to PT version

* detector as pretrained model + style

* update training + style

* rebase

* Update logits_process.py

* Improving SynthIDWatermark tests.

* Shift detector training to wikitext negatives and stabilize with lower learning rate.

* Clean up.

* in for 7B

* cleanup

* Support Python 3.8.

* README and final cleanup.

* HF Hub upload and initialize.

* Update requirements for synthid_text.

* Adding SynthIDTextWatermarkDetector.

* Detector testing.

* Documentation changes.

* Copyrights fix.

* Fix detector api.

* ironing out errors

* ironing out errors

* training checks

* make fixup and make fix-copies

* docstrings and add to docs

* copyright

* BC

* test docstrings

* move import

* protect type hints

* top level imports

* watermarking example

* direct imports

* tpr fpr meaning

* process_kwargs

* SynthIDTextWatermarkingConfig docstring

* assert -> exception

* example updates

* no immutable dict (cant be serialized)

* pack fn

* einsum equivalent

* import order

* fix test on gpu

* add detector example

---------

Co-authored-by: Sumedh Ghaisas <sumedhg@google.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: sumedhghaisas2 <138781311+sumedhghaisas2@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
2024-10-23 21:18:52 +01:00
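A rough usage sketch for the SynthID text watermarking added above, assuming a `SynthIDTextWatermarkingConfig` can be passed to `generate` as a generation argument (the checkpoint and key values below are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, SynthIDTextWatermarkingConfig

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

watermarking_config = SynthIDTextWatermarkingConfig(
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57, 29],  # placeholder watermarking keys
    ngram_len=5,
)

inputs = tokenizer("The weather today is", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=20,
    watermarking_config=watermarking_config,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```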
e50bf61dec Fix red CI: benchmark script (#34351)
* don't trigger always

* fux

* oups

* update

* ??

* ?

* aie
2024-10-23 18:33:52 +02:00
c42b3223db skip test_pipeline_depth_estimation temporarily (#34316)
skip

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-23 17:27:51 +02:00
d9f733625c Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283)
* Enable grad accum fix across all models + trainer fully in forward()

* handle peft case

* Account for DDP: need to run scale tests

* Use accelerator state

* Quality

* Guard

* Experiment w/ only fairseq fix

* Fairseq only

* Revert multiply_grads fix

* Mult by grad accum to fully bring back solution

* Style

* Good to go now

* Skip fx tests for now

* Bookmark

* Working now
2024-10-23 11:24:57 -04:00
1fb575fcf0 Support boolean tool args (#34208)
Support boolean tool arguments
2024-10-23 16:48:21 +02:00
343c8cb86f Added Deberta model type support (#34308)
* Added Deberta model type for 'add_prefix_space' functionality

* housekeeping

---------

Co-authored-by: Filippos Ventirozos <filippos.ventirozos@autotrader.co.uk>
2024-10-23 11:15:36 +02:00
5ba85de7a4 [docs] Fix Korean toctree (#34324)
fix
2024-10-23 10:52:51 +02:00
049682a5a6 Example doc for token classification of Llama and Dependent/Copied Models (#34139)
* Added Example Doc for token classification on all tokenClassificationModels copied from llama

* Refactor code to add code sample docstrings for Gemma and Gemma2 models (including modular Gemma)

* Refactor code to update model checkpoint names for Qwen2 models
2024-10-22 10:26:16 -07:00
644d5287b2 🌐 [i18n-KO] Translated model_doc/bartpho.md to Korean (#33981)
* docs: ko: model_doc/bartpho.md

* feat: nmt draft

* Update docs/source/ko/model_doc/bartpho.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:52 -07:00
b03dc0a87e 🌐 [i18n-KO] Translated bert japanese.md to Korean (#33890)
* docs: ko: bert-japanese.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

---------

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:31 -07:00
4b14aa1bcd 🌐 [i18n-KO] Translated executorch.md to Korean (#33888)
* docs: ko: executorch.md

* Update _toctree.yml

* fix: manual edits

* Update docs/source/ko/main_classes/executorch.md

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/_toctree.yml

---------

Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com>
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-22 09:46:20 -07:00
688eeac81e [docs] fix typo (#34235)
fix typo
2024-10-22 09:46:07 -07:00
a65a6ce7fe fix error in _get_eval_sampler when group_by_length enabled (#34237)
* remove self in _get_eval_sampler

* remove self in front of _get_eval_sampler
2024-10-22 18:02:42 +02:00
e7c3fa7f57 Fix continue_final_message for image-text-to-text chat templates (#34236)
* fix continue_final_message for vlms

* Add one test for vlms continue_final_message chat template
2024-10-22 11:57:44 -04:00
96f67c068b Feature: Add MLFLOW_MAX_LOG_PARAMS to MLflowCallback (#34279) 2024-10-22 16:34:17 +02:00
eef6b0ba42 Add option for running ffmpeg_microphone_live as a background process (#32838)
* Add option for running ffmpeg_microphone_live as a background process

* Code quality checks for audio_utils

* Code clean up for audio_utils

* Fixing logic in ffmpeg_microphone calls in audio_utils

* Allowing any arbitrary arguments to be passed to ffmpeg_microphone_live

* Formatting

* Fixing last problems with adding ffmpeg_additional_args

* Fixing default arguments and formatting issues

* Fixing comments for ffmpeg_additional_args

* Adding two shorts tests for ffmpeg_microphone_live

* Fixing test bug
2024-10-22 15:56:41 +02:00
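For context, `ffmpeg_microphone_live` lives in `transformers.pipelines.audio_utils` and yields microphone audio chunks for streaming inference; a rough sketch of feeding it into an ASR pipeline (requires `ffmpeg` on the system; the new `ffmpeg_additional_args` parameter from the entry above should be checked against the source for its exact signature):

```python
from transformers import pipeline
from transformers.pipelines.audio_utils import ffmpeg_microphone_live

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny.en")

# Stream microphone audio in ~5 s chunks; parameter values are illustrative.
mic = ffmpeg_microphone_live(
    sampling_rate=asr.feature_extractor.sampling_rate,
    chunk_length_s=5.0,
    stream_chunk_s=1.0,
)

for item in asr(mic):
    print(item["text"])
```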
c14ccbcd64 Olmo is ExecuTorch Compatible (#34181)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:53:01 +02:00
7a08a772cc Qwen2.5 is ExecuTorch Compatible (#34102)
Qwen2 is ExecuTorch Compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-22 15:52:23 +02:00
c31a6ff474 Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (#32550)
* add colorize_depth and matplotlib availability check

* add post_process_depth_estimation for zoedepth + tests

* add post_process_depth_estimation for DPT + tests

* add post_process_depth_estimation in DepthEstimationPipeline & special case for zoedepth

* run `make fixup`

* fix import related error on tests

* fix more import related errors on test

* forgot some `torch` calls in declarations

* remove `torch` call in zoedepth tests that caused error

* updated docs for depth estimation

* small fix for `colorize` input/output types

* remove `colorize_depth`, fix various names, remove matplotlib dependency

* fix formatting

* run fixup

* different images for test

* update examples in `forward` functions

* fixed broken links

* fix output types for docs

* possible format fix inside `<Tip>`

* Readability related updates

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Readability related update

* cleanup after merge

* refactor `post_process_depth_estimation` to return dict; simplify ZoeDepth's `post_process_depth_estimation`

* rewrite dict merging to support python 3.8

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-10-22 15:50:54 +02:00
104599d7a8 Fix: tensor of examples of the same length triggers invalid stacking (#34166)
* Fix issue where tensor of examples of the same length triggers invalid stacking

* Update data_collator.py
2024-10-22 15:49:21 +02:00
51e395d13e Fix FA2 attention for models supporting sliding window (#34093)
Fix FA2
2024-10-22 15:37:21 +02:00
eb6a734995 [RT-DETR] Fix onnx inference bug for Optype (Where) (#33877)
* feat: [RT-DETR] Add onnx runtime config and fix onnx inference bug Optype (Where)

* fix lint

* use dtype instead of torch.float32

* add doc

* remove onnx config

* use dtype info

* use tensor to fix lint
2024-10-22 15:14:07 +02:00
84b17e03f1 Update PR templates (#34065)
update PR template
2024-10-22 15:11:54 +02:00
681fc43713 Sync video classification pipeline with huggingface_hub spec (#34288)
* Sync video classification pipeline

* Add disclaimer
2024-10-22 13:33:49 +01:00
93352e81f5 Fix Korean doc _toctree.yml (#34293)
Fix korean doc _toctree.yml
2024-10-22 11:05:56 +02:00
b644178ed4 [docs] Fix GenerationConfig params (#34299)
fix generationconfigs
2024-10-22 11:03:25 +02:00
73d65e637b T5 compile compatibilty (#34089)
* this worked in normal generation, needs more tests

* fix almost all tests in t5

* nit

* longt5, umt5, mt5

* style

* udop, pix2struct

* more models

* fix some tests

* fix onnx tests

* tracing tests fixed

* compile enabled and tested for t5 models

* fix small bug in slow tests

* [run-slow] t5

* uncomment

* style

* update with new generation refactoring

* nit

* fix copies

* this is the fix, had to change t5 to fix copies

* update

* [run-slow] t5

* [run-slow] t5

* update

* add test for encoder only T5

* clean up after rebase

* fix pop2piano

* add comment

* style

* fix copies after rebase

* fix copies  missed this one
2024-10-22 08:23:53 +02:00
5077bc034f VLM: add more modularity (#34175)
* update

* fix tests + fix copies

* fix tests once more
2024-10-22 07:56:35 +02:00
21d5025826 Attn implementation for composite models (#32238)
* first try

* codestyle

* idefics2 is happy

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma

* fix-copies

* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo

* blip-2 needs to init vision from config

* when was this removed O_o

* minor fix

* tests

* this way?

* tests

* model-agnostic code

* codestyle

* add tests for idefics

* modify general test for VLMs

* no generation test for vlm yet!

* no generation test here also

* warn in ViT-SDPA if output attn

* add more tests

* user can pass dict as attn impl

* repo consistency

* update

* musicgen

* no prints

* forgot speech enc-dec and clip

* how many composite models we have?

* musicgen melody is same as musicgen

* +siglip

* fix tests + add some more

* remove idefics custom overridden code

* make idefics2 automappable

* nits

* skip tests

* doctests

* Update src/transformers/models/idefics2/configuration_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/clip/test_modeling_clip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/idefics2/test_modeling_idefics2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* major update, no need for automap

* clean up

* add FA2 test

* more tests

* style

* skip tests

* why did these start failing now?

* no attributes for FA2 needed

* one tiny test

* address comment about FA2 false warning

* style

* add new models and resolve conflicts

* fix copies

* let it be this way for now, come back tomorrow to review

* some more fixes

* update

* more updates

* update

* fix copies

* style and tests

* another big update

* fix tests

* fix tests

* update

* another update

* fix tests

* fix copies

* fix tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-22 06:54:44 +02:00
32590b5ecb Fix method name which changes in tutorial (#34252)
The method `model_download_tool` was called `model_download_counter` earlier in the tutorial, which raises an error when following the code.
2024-10-21 14:21:52 -03:00
f701b98e4a Add a doc section on writing generation prompts (#34248)
Add a section on writing generation prompts
2024-10-21 14:35:57 +01:00
a4122813d1 Add DetrImageProcessorFast (#34063)
* add fully functionning image_processing_detr_fast

* Create tensors on the correct device

* fix copies

* fix doc

* add tests equivalence cpu gpu

* fix doc en

* add relative imports and copied from

* Fix copies and nit
2024-10-21 09:05:05 -04:00
24bdc94da5 Change Paligemma import logging to work with modular (#34211)
* change import logging

* fix CI
2024-10-21 08:55:27 -04:00
ca541bd4f4 Generation tests: don't rely on main input name (#34228)
* don't rely on main input name

* update
2024-10-21 10:00:14 +02:00
816f442496 Only cast logits to float when computing loss (#34147)
* Only cast logits to float when computing loss

Some misses from #31292 and #33902

* Move logits.float() into existing if labels is not None branch
2024-10-18 18:15:26 +02:00
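The pattern referenced above, sketched generically (not the exact upstream code): keep the lm-head output in its compute dtype and upcast to float32 only inside the `labels is not None` branch where the loss is computed.

```python
import torch
import torch.nn.functional as F

def lm_forward(hidden_states, lm_head, labels=None):
    logits = lm_head(hidden_states)  # stays in bf16/fp16 for pure generation
    loss = None
    if labels is not None:
        logits = logits.float()  # upcast only when a loss is actually computed
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()
        loss = F.cross_entropy(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1),
        )
    return logits, loss
```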
e46e3bc173 Fix UDOP dtype issue (#34180)
* Trigger UDOP tests

* Try forcing dtype in LayoutLMV3

* Do checks to see where uint8 is getting in

* Do checks to see where uint8 is getting in

* Found it!

* Add .astype(np.float32)

* Remove forced check, make fixup

* Checking where exactly the uint8 creeps in

* More checking on the uint8 issues

* Manually upcast in rescale()

* Remove UDOP trigger
2024-10-18 16:54:58 +01:00
6604764007 add Glm (#33823)
* Create modular_glm.py

* Update modular_glm.py

* Finalize architecture without all attentions

* Add all attentions modules

* Finalize modular

* Update given last version

* Last update

* Finalize model

* Finalize converter

* Update convert_glm_weights_to_hf.py

* style

* style

* Create __init__.py

* Add all inits

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Correct the rotary embeddings

* Remove apply_residual_connection_post_layernorm (always false)

* remove use_rms_norm (always true)

* remove past_layer_norm (always true)

* Update __init__.py

* Update config and license

* start adding tests and doc

* Add doc + style

* Update test_modeling_glm.py

* Add dummies

* Apply correct modeling

* Refactor attention to follow llama

* Update __init__.py

* Update convert_glm_weights_to_hf.py

* Correct bias

* remove linear_bias and pdrop (never used)

* apply modular

* Simplify converter

* remove dummies + style

* add model_input_names

* Add pretraining_tp to config for when eager attention is used

* Update modular to remove all pretraining_tp

* Update test_modeling_glm.py

* Update the __all__

* Update __all__

* Update __init__.py

* Update test_modeling_glm.py

* add revisions

* Add the correct repos and revisions

* style

* Update __init__.py

* update exports

* remove import of modular files

* style

* Apply Llama changes + refine converter

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* Update convert_glm_weights_to_hf.py

* style

* Use new modular converter

* add pretrainedmodel to init

* style

* Update test_modeling_glm.py

* Move config outside modular to please CI about docstrings

* Add dummies to please CI

* Update glm.md

* Update glm.md
2024-10-18 17:41:12 +02:00
e95ea479ee Informative 2 (#34154)
* Informative

* style

* Informative 2

* Apply suggestions from code review

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

---------

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2024-10-18 14:12:15 +02:00
0437d6cd03 Fix broken test decorator require_torch_up_to_2_accelerators (#34201)
* fix broken require_torch_up_to_2_accelerators

* make style
2024-10-18 13:54:55 +02:00
5a5b590d06 BLIP: fix input expansion logic (#34225)
fix
2024-10-18 12:17:30 +02:00
b54109c746 Fix-red-ci (#34230)
* fix copies, skip fx for llama

* style

* re-fix copies

* last?

* style
2024-10-17 23:38:35 +02:00
6ba31a8a94 Enable users to use their own loss functions + deal with prefetching for grad accum (#34198)
* bookmark

* Bookmark

* Bookmark

* Actually implement

* Pass in kwarg explicitly

* Adjust for if we do or don't have labels

* Bookmark fix for od

* bookmark

* Fin

* closer

* Negate accelerate grad accum div

* Fixup not training long enough

* Add in compute_loss to take full model output

* Document

* compute_loss -> compute_loss_fn

* Add a test

* Refactor

* Refactor

* Uncomment tests

* Update tests/trainer/test_trainer.py

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-10-17 17:01:56 -04:00
7a06d07e14 Support Llama 3.2 conversion (text models) (#33778)
* Support Llama 3.2 conversion (text models)

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Fix rope factor

* Update chat template

Initialize from a well-known template.
The guidance is that the changes should be applied to 3.1 models as
well.

* Remove import

* Support Llama Guard 3 conversion

* Tokenizer details

* Fix eos added token in base models

* Fix generation config for base models

* Specify revision for known tokenizers

* Style

* Reuse chat templates for older models

* Improve error when converting tokenizer < Llama 3

---------

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2024-10-17 22:37:37 +02:00
c1c7e89620 Fix Gradient Accumulation issue (#34191)
* quick fix

* 3 losses

* oups

* fix

* nits

* check how it scales for special models

* propagate for conditiona detr

* propagate

* propagate

* propagate

* fixes

* propagate changes

* update

* fixup

* nits

* f string

* fixes

* more fixes

* ?

* nit

* arg annoying f string

* nits

* grumble

* update

* nit

* refactor

* fix fetch tests

* nit

* nit

* Update src/transformers/loss/loss_utils.py

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

* update

* nit

* fixup

* make pass

* nits

* port code to more models

* fixup

* ntis

* arf

* update

* update

* nits

* update

* fix

* update

* nits

* fine

* agjkfslga.jsdlkgjklas

* nits

* fix fx?

* update

* update

* styel

* fix imports

* update

* update

* fixup to fix the torch fx?

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2024-10-17 22:34:40 +02:00
f51ac9e059 Generate: visit non-llm prepare_inputs_for_generation (#34199)
* tmp

* all visited

* test all

* Update src/transformers/models/moshi/modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* delete another one :D

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-17 16:53:48 +01:00
1d2c29f0b3 Fix bus error when using GPT2 on M1 macs (#34031)
There's a bug on M1 Macs with transformers >= 4.43.0 and torch >= 2.1.0, where if a model has tied embeddings, then the fast loading from #31771 causes a bus error when the model is actually run. This can be solved by disabling `_supports_param_buffer_assignment` for these models.

More info in comments in #33357
2024-10-17 17:39:04 +02:00
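For illustration only: `_supports_param_buffer_assignment` is a private class attribute, and the fix above flips it inside the library for the affected models. A hypothetical user-side workaround along the same lines (it relies on a private attribute, so treat it as a sketch):

```python
from transformers import GPT2LMHeadModel

# Mirror the library fix: disable the fast param/buffer assignment path
# before loading, to avoid the bus error on affected M1 setups.
GPT2LMHeadModel._supports_param_buffer_assignment = False
model = GPT2LMHeadModel.from_pretrained("gpt2")
```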
9470c00042 Llama3 and Llama2 are ExecuTorch compatible (#34101)
Llama3_1b and Llama2_7b are ExecuTorch compatible

Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-17 17:33:19 +02:00
7f5088503f removes decord (#33987)
* removes decord dependency

optimize

np

Revert "optimize"

This reverts commit faa136b51ec4ec5858e5b0ae40eb7ef89a88b475.

helpers as documentation

pydoc

missing keys

* make fixup

* require_av

---------

Co-authored-by: ad <hi@arnaudiaz.com>
2024-10-17 17:27:34 +02:00
f2846ad2b7 Fix for tokenizer.apply_chat_template with continue_final_message=True (#34214)
* Strip final message

* Do full strip instead of rstrip

* Retrigger CI

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-10-17 15:45:07 +01:00
b57c7bce21 fix(Wav2Vec2ForCTC): torch export (#34023)
* fix(Wav2Vec2ForCTC): torch export

Resolves the issue described in #34022 by implementing the
masking of the hidden states using an elementwise multiplication
rather than indexing with assignment.

The torch.export functionality seems to mark the tensor as frozen
even though the update is legal.

* This change is a workaround for now to allow the export of the
model as an FxGraph. Further investigation is required to find
the real solution in PyTorch.

* [run-slow] hubert, unispeech, unispeech_sat, wav2vec2
2024-10-17 15:41:55 +01:00
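The masking change described above, as a generic sketch (not the exact Wav2Vec2 code, which writes a learned mask embedding rather than zeros): indexed assignment is replaced by an elementwise multiply, which torch.export handles.

```python
import torch

hidden_states = torch.randn(2, 10, 16)
mask = torch.zeros(2, 10, dtype=torch.bool)
mask[:, :3] = True  # positions to mask

# Export-unfriendly: in-place indexed assignment
# hidden_states[mask] = 0.0

# Export-friendly equivalent: multiply by the inverted mask
hidden_states = hidden_states * (~mask).unsqueeze(-1).to(hidden_states.dtype)
```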
fce1fcfe71 Ping team members for new failed tests in daily CI (#34171)
* ping

* fix

* fix

* fix

* remove runner

* update members

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-17 16:11:52 +02:00
aa3e35ac67 Fix warning message for fp32_cpu_offloading in bitsandbytes configs (#34079)
* change cpu offload warning for fp8 quantization

* change cpu offload warning for fp4 quantization

* change cpu offload variable name for fp8 and fp4 quantization
2024-10-17 15:11:33 +02:00
6d2b203339 Update trainer._get_eval_sampler() to support group_by_length arg (#33514)
Update 'trainer._get_eval_sampler()' to support 'group_by_length' argument

Trainer didn't support grouping by length for evaluation, which made evaluation slow with 'eval_batch_size' > 1.

The updated 'trainer._get_eval_sampler()' method was based on 'trainer._get_train_sampler()'.
2024-10-17 14:43:29 +02:00
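A minimal configuration sketch for the case this change targets (values are placeholders): with `group_by_length=True`, evaluation batches can now also be grouped by sequence length, not only training batches.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",              # placeholder path
    group_by_length=True,          # now also honored by the evaluation sampler
    per_device_eval_batch_size=8,
)
```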
3f06f95ebe Revert "Fix FSDP resume Initialization issue" (#34193)
Revert "Fix FSDP resume Initialization issue (#34032)"

This reverts commit 4de1bdbf637fe6411c104c62ab385f660bfb1064.
2024-10-16 15:25:18 -04:00
3a10c6192b Avoid using torch's Tensor or PIL's Image in chat template utils if not available (#34165)
* fix(utils): Avoid using torch Tensor or PIL Image if not available

* Trigger CI

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-10-16 16:01:18 +01:00
bd5dc10fd2 Fix wrong name for llava onevision and qwen2_vl in tokenization auto (#34177)
* nit fix wrong llava onevision name in tokenization auto

* add qwen2_vl and fix style
2024-10-16 16:48:52 +02:00
cc7d8b87e1 Revert accelerate error caused by 46d09af (#34197)
Revert `accelerate` bug
2024-10-16 16:13:41 +02:00
98bad9c6d6 [fix] fix token healing tests and usage errors (#33931)
* auto-gptq requirement is removed & model is changed & tokenizer pad token is assigned

* values func is changed with extensions & sequence key value bug is fixed

* map key value check is added in ExtensionsTree

* empty trimmed_ids bug is fixed

* tail_id IndexError is fixed

* empty trimmed_ids bug fix is updated for failed test

* too much specific case for specific tokenizer is removed

* input_ids check is updated

* require auto-gptq import is removed

* key error check is changed with empty list check

* empty input_ids check is added

* empty trimmed_ids fix is checked with numel function

* usage change comments are added

* test changes are commented

* comment style and quality bugs are fixed

* test comment style and quality bug is fixed
2024-10-16 14:22:55 +02:00
9ba021ea75 Moshi integration (#33624)
* clean mimi commit

* some nits suggestions from Arthur

* make fixup

* first moshi WIP

* converting weights working + configuration + generation configuration

* finalize converting script - still missing tokenizer and FE and processor

* fix saving model w/o default config

* working generation

* use GenerationMixin instead of inheriting

* add delay pattern mask

* fix right order: moshi codes then user codes

* unconditional inputs + generation config

* get rid of MoshiGenerationConfig

* blank user inputs

* update convert script:fix conversion, add  tokenizer, feature extractor and bf16

* add and correct Auto classes

* update modeling code, configuration and tests

* make fixup

* fix some copies

* WIP: add integration tests

* add dummy objects

* propose better readability and code organisation

* update tokenization tests

* update docstrings, eval and modeling

* add .md

* make fixup

* add MoshiForConditionalGeneration to ignore Auto

* revert mimi changes

* re

* further fix

* Update moshi.md

* correct md formating

* move prepare causal mask to class

* fix copies

* fix depth decoder causal

* fix and correct some tests

* make style and update .md

* correct config checkpoint

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/moshi/test_tokenization_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* make style

* Update src/transformers/models/moshi/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup

* change firm in copyrights

* update config with nested dict

* replace einsum

* make style

* change split to True

* add back split=False

* remove tests in convert

* Update tests/models/moshi/test_modeling_moshi.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add default config repo + add model to FA2 docstrings

* remove logits float

* fix some tokenization tests and ignore some others

* make style tokenization tests

* update modeling with sliding window + update modeling tests

* [run-slow] moshi

* remove prepare for generation frol CausalLM

* isort

* remove copied from

* ignore offload tests

* update causal mask and prepare 4D mask aligned with recent changes

* further test refine + add back prepare_inputs_for_generation for depth decoder

* correct conditional use of prepare mask

* update slow integration tests

* fix multi-device forward

* remove previous solution to device_map

* save_load is flaky

* fix generate multi-devices

* fix device

* move tensor to int

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-10-16 11:21:49 +02:00
d087165db0 IDEFICS: support inputs embeds (#34043)
* support embeds

* use cache from config

* style...

* fix tests after rebase
2024-10-16 09:25:26 +02:00
9d6998c759 🌐 [i18n-KO] Translated blip-2.md to Korean (#33516)
* docs: ko: model_doc/blip-2

* feat: nmt draft

* Apply suggestions from code review

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>

* Update docs/source/ko/model_doc/blip-2.md

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2024-10-15 11:21:22 -07:00
554ed5d1e0 🌐 [i18n-KO] Translated trainer_utils.md to Korean (#33817)
* docs: ko: trainer_utils.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2024-10-15 11:21:05 -07:00
8c33cf4eec 🌐 [i18n-KO] Translated gemma2.md to Korean (#33937)
* docs: ko: gemma2.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions
2024-10-15 11:20:46 -07:00
67acb0b123 🌐 [i18n-KO] Translated vivit.md to Korean (#33935)
* docs: ko: model_doc/vivit.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits
2024-10-15 10:31:44 -07:00
0f49deacbf [feat] LlavaNext add feature size check to avoid CUDA Runtime Error (#33608)
* [feat] add feature size check to avoid CUDA Runtime Error

* [minor] add error handling to all llava models

* [minor] avoid nested if else

* [minor] add error message to Qwen2-vl and chameleon

* [fix] token dimension for check

* [minor] add feature dim check for videos too

* [fix] dimension check

* [fix] test reference values

---------

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
2024-10-15 16:19:18 +02:00
d00f1ca860 Fix optuna ddp hp search (#34073) 2024-10-15 15:42:07 +02:00
65442718c4 Add support for inheritance from class with different suffix in modular (#34077)
* add support for different suffix in modular

* add dummy example, pull new changes for modular

* nide lines order change
2024-10-15 14:55:09 +02:00
d314ce70bf Generate: move logits to same device as input_ids (#34076)
tmp commit
2024-10-15 14:32:09 +02:00
5ee9e786d1 Fix default behaviour in TextClassificationPipeline for regression problem type (#34066)
* update code

* update docstrings

* update tests
2024-10-15 13:06:20 +01:00
4de1bdbf63 Fix FSDP resume Initialization issue (#34032)
* Fix FSDP Initialization for resume training

* Added init_fsdp function to work with dummy values

* Fix FSDP initialization for resuming training

* Added CUDA decorator for tests

* Added torch_gpu decorator to FSDP tests

* Fixup for failing code quality tests
2024-10-15 13:48:10 +02:00
293e6271c6 Add sdpa for Vivit (#33757)
* chore:add sdpa to vivit

* fix:failing slow test_inference_interpolate_pos_encoding(failing on main branch too)

* chore:fix nits

* ci:fix repo consistency failure

* chore:add info and benchmark to model doc

* [run_slow] vivit

* chore:revert interpolation test fix for new issue

* [run_slow] vivit

* [run_slow] vivit

* [run_slow] vivit

* chore:add fallback for output_attentions being True

* [run_slow] vivit

* style:make fixup

* [run_slow] vivit
2024-10-15 11:27:54 +02:00
23874f5948 Idefics: enable generation tests (#34062)
* add idefics

* conflicts after merging main

* enable tests but need to fix some

* fix tests

* no print

* fix/skip some slow tests

* continue not skip

* rebasing broken smth, this is the fix
2024-10-15 11:17:14 +02:00
dd4216b766 Update README.md with Enterprise Hub (#34150) 2024-10-15 10:45:22 +02:00
696 changed files with 47055 additions and 20615 deletions

View File

@@ -55,7 +55,7 @@ body:
- deepspeed: HF Trainer/Accelerate: @muellerzr
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc @MekkCyber
Documentation: @stevhliu

View File

@@ -59,7 +59,7 @@ Integrations:
- deepspeed: HF Trainer/Accelerate: @muellerzr
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc @MekkCyber
Documentation: @stevhliu

View File

@@ -16,23 +16,22 @@ env:
jobs:
benchmark:
name: Benchmark
strategy:
matrix:
group: [aws-g5-4xlarge-cache, aws-p4d-24xlarge-plus]
runs-on:
group: aws-g5-4xlarge-cache
group: ${{ matrix.group }}
if: |
(github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark') )||
(github.event_name == 'push' && github.ref == 'refs/heads/main')
container:
image: huggingface/transformers-pytorch-gpu
options: --gpus all --privileged --ipc host
steps:
- name: Get repo
if: github.event_name == 'pull_request'
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Get repo
if: github.event_name == 'push'
uses: actions/checkout@v4
with:
ref: ${{ github.sha }}
ref: ${{ github.event.pull_request.head.sha || github.sha }}
- name: Install libpq-dev & psql
run: |
@@ -67,6 +66,9 @@ jobs:
python3 benchmark/llama.py "${{ github.head_ref || github.ref_name }}" "$commit_id" "$commit_msg"
env:
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
# Enable this to see debug logs
# HF_HUB_VERBOSITY: debug
# TRANSFORMERS_VERBOSITY: debug
PGHOST: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGHOST }}
PGUSER: transformers_benchmarks
PGPASSWORD: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGPASSWORD }}

View File

@@ -3,7 +3,7 @@ name: Build docker images (scheduled)
on:
push:
branches:
- build_ci_docker_image*
- update-quantization-docker
repository_dispatch:
workflow_call:
inputs:
@@ -18,341 +18,341 @@ concurrency:
cancel-in-progress: false
jobs:
latest-docker:
name: "Latest PyTorch + TensorFlow [dev]"
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-all-latest-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-all-latest-gpu${{ inputs.image_postfix }}
# Push CI images still need to be re-built daily
-
name: Build and push (for Push CI) in a daily basis
# This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# The later case is useful for manual image building for debugging purpose. Use another tag in this case!
if: inputs.image_postfix != '-push-ci'
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-all-latest-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-all-latest-gpu-push-ci
# latest-docker:
# name: "Latest PyTorch + TensorFlow [dev]"
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# -
# name: Build and push
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-all-latest-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-all-latest-gpu${{ inputs.image_postfix }}
# # Push CI images still need to be re-built daily
# -
# name: Build and push (for Push CI) in a daily basis
# # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# # The later case is useful for manual image building for debugging purpose. Use another tag in this case!
# if: inputs.image_postfix != '-push-ci'
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-all-latest-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-all-latest-gpu-push-ci
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the transformers-all-latest-gpu-push-ci docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
# title: 🤗 Results of the transformers-all-latest-gpu-push-ci docker build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-torch-deepspeed-docker:
name: "Latest PyTorch + DeepSpeed"
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-pytorch-deepspeed-latest-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-pytorch-deepspeed-latest-gpu${{ inputs.image_postfix }}
# latest-torch-deepspeed-docker:
# name: "Latest PyTorch + DeepSpeed"
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# -
# name: Build and push
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-pytorch-deepspeed-latest-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-pytorch-deepspeed-latest-gpu${{ inputs.image_postfix }}
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER}}
title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER}}
# title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu docker build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# Can't build 2 images in a single job `latest-torch-deepspeed-docker` (for `nvcr.io/nvidia`)
latest-torch-deepspeed-docker-for-push-ci-daily-build:
name: "Latest PyTorch + DeepSpeed (Push CI - Daily Build)"
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
# Push CI images still need to be re-built daily
-
name: Build and push (for Push CI) in a daily basis
# This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# The later case is useful for manual image building for debugging purpose. Use another tag in this case!
if: inputs.image_postfix != '-push-ci'
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-pytorch-deepspeed-latest-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
# # Can't build 2 images in a single job `latest-torch-deepspeed-docker` (for `nvcr.io/nvidia`)
# latest-torch-deepspeed-docker-for-push-ci-daily-build:
# name: "Latest PyTorch + DeepSpeed (Push CI - Daily Build)"
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# # Push CI images still need to be re-built daily
# -
# name: Build and push (for Push CI) in a daily basis
# # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# # The later case is useful for manual image building for debugging purpose. Use another tag in this case!
# if: inputs.image_postfix != '-push-ci'
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-pytorch-deepspeed-latest-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu-push-ci docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
# title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu-push-ci docker build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
doc-builder:
name: "Doc builder"
# Push CI doesn't need this image
if: inputs.image_postfix != '-push-ci'
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-doc-builder
push: true
tags: huggingface/transformers-doc-builder
# doc-builder:
# name: "Doc builder"
# # Push CI doesn't need this image
# if: inputs.image_postfix != '-push-ci'
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# -
# name: Build and push
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-doc-builder
# push: true
# tags: huggingface/transformers-doc-builder
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-doc-builder docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
# title: 🤗 Results of the huggingface/transformers-doc-builder docker build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-pytorch:
name: "Latest PyTorch [dev]"
# Push CI doesn't need this image
if: inputs.image_postfix != '-push-ci'
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-pytorch-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-pytorch-gpu
# latest-pytorch:
# name: "Latest PyTorch [dev]"
# # Push CI doesn't need this image
# if: inputs.image_postfix != '-push-ci'
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# -
# name: Build and push
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-pytorch-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-pytorch-gpu
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-pytorch-gpu docker build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
# title: 🤗 Results of the huggingface/transformers-pytorch-gpu docker build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-pytorch-amd:
name: "Latest PyTorch (AMD) [dev]"
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-pytorch-amd-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-pytorch-amd-gpu${{ inputs.image_postfix }}
# Push CI images still need to be re-built daily
-
name: Build and push (for Push CI) on a daily basis
# This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# The latter case is useful for manual image building for debugging purposes. Use another tag in this case!
if: inputs.image_postfix != '-push-ci'
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-pytorch-amd-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-pytorch-amd-gpu-push-ci
# latest-pytorch-amd:
# name: "Latest PyTorch (AMD) [dev]"
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# -
# name: Build and push
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-pytorch-amd-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-pytorch-amd-gpu${{ inputs.image_postfix }}
# # Push CI images still need to be re-built daily
# -
# name: Build and push (for Push CI) on a daily basis
# # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# # The latter case is useful for manual image building for debugging purposes. Use another tag in this case!
# if: inputs.image_postfix != '-push-ci'
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-pytorch-amd-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-pytorch-amd-gpu-push-ci
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-pytorch-amd-gpu-push-ci build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
# title: 🤗 Results of the huggingface/transformers-pytorch-amd-gpu-push-ci build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-tensorflow:
name: "Latest TensorFlow [dev]"
# Push CI doesn't need this image
if: inputs.image_postfix != '-push-ci'
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-tensorflow-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-tensorflow-gpu
# latest-tensorflow:
# name: "Latest TensorFlow [dev]"
# # Push CI doesn't need this image
# if: inputs.image_postfix != '-push-ci'
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# -
# name: Build and push
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-tensorflow-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-tensorflow-gpu
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the huggingface/transformers-tensorflow-gpu build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
# title: 🤗 Results of the huggingface/transformers-tensorflow-gpu build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-pytorch-deepspeed-amd:
name: "PyTorch + DeepSpeed (AMD) [dev]"
runs-on:
group: aws-general-8-plus
steps:
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
-
name: Check out code
uses: actions/checkout@v4
-
name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
-
name: Build and push
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-pytorch-deepspeed-amd-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-pytorch-deepspeed-amd-gpu${{ inputs.image_postfix }}
# Push CI images still need to be re-built daily
-
name: Build and push (for Push CI) on a daily basis
# This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# The latter case is useful for manual image building for debugging purposes. Use another tag in this case!
if: inputs.image_postfix != '-push-ci'
uses: docker/build-push-action@v5
with:
context: ./docker/transformers-pytorch-deepspeed-amd-gpu
build-args: |
REF=main
push: true
tags: huggingface/transformers-pytorch-deepspeed-amd-gpu-push-ci
# latest-pytorch-deepspeed-amd:
# name: "PyTorch + DeepSpeed (AMD) [dev]"
# runs-on:
# group: aws-general-8-plus
# steps:
# -
# name: Set up Docker Buildx
# uses: docker/setup-buildx-action@v3
# -
# name: Check out code
# uses: actions/checkout@v4
# -
# name: Login to DockerHub
# uses: docker/login-action@v3
# with:
# username: ${{ secrets.DOCKERHUB_USERNAME }}
# password: ${{ secrets.DOCKERHUB_PASSWORD }}
# -
# name: Build and push
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-pytorch-deepspeed-amd-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-pytorch-deepspeed-amd-gpu${{ inputs.image_postfix }}
# # Push CI images still need to be re-built daily
# -
# name: Build and push (for Push CI) on a daily basis
# # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
# # The latter case is useful for manual image building for debugging purposes. Use another tag in this case!
# if: inputs.image_postfix != '-push-ci'
# uses: docker/build-push-action@v5
# with:
# context: ./docker/transformers-pytorch-deepspeed-amd-gpu
# build-args: |
# REF=main
# push: true
# tags: huggingface/transformers-pytorch-deepspeed-amd-gpu-push-ci
- name: Post to Slack
if: always()
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
title: 🤗 Results of the transformers-pytorch-deepspeed-amd-gpu build
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
# - name: Post to Slack
# if: always()
# uses: huggingface/hf-workflows/.github/actions/post-slack@main
# with:
# slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
# title: 🤗 Results of the transformers-pytorch-deepspeed-amd-gpu build
# status: ${{ job.status }}
# slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
latest-quantization-torch-docker:
name: "Latest Pytorch + Quantization [dev]"

View File

@ -0,0 +1,129 @@
name: Process failed tests
on:
workflow_call:
inputs:
docker:
required: true
type: string
start_sha:
required: true
type: string
env:
HF_HOME: /mnt/cache
TRANSFORMERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
# For gated repositories, we still need to agree to share information on the Hub repo page in order to get access.
# This token is created under the bot `hf-transformers-bot`.
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
TF_FORCE_GPU_ALLOW_GROWTH: true
RUN_PT_TF_CROSS_TESTS: 1
CUDA_VISIBLE_DEVICES: 0,1
jobs:
run_models_gpu:
name: " "
runs-on:
group: aws-g4dn-2xlarge-cache
container:
image: ${{ inputs.docker }}
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- uses: actions/download-artifact@v4
with:
name: ci_results_run_models_gpu
path: /transformers/ci_results_run_models_gpu
- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
- name: Get target commit
working-directory: /transformers/utils
run: |
echo "END_SHA=$(TOKEN=${{ secrets.ACCESS_REPO_INFO_TOKEN }} python3 -c 'import os; from get_previous_daily_ci import get_last_daily_ci_run_commit; commit=get_last_daily_ci_run_commit(token=os.environ["TOKEN"]); print(commit)')" >> $GITHUB_ENV
- name: Checkout to `start_sha`
working-directory: /transformers
run: git fetch && git checkout ${{ inputs.start_sha }}
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Environment
working-directory: /transformers
run: |
python3 utils/print_env.py
- name: Show installed libraries and their versions
working-directory: /transformers
run: pip freeze
- name: Check failed tests
working-directory: /transformers
run: python3 utils/check_bad_commit.py --start_commit ${{ inputs.start_sha }} --end_commit ${{ env.END_SHA }} --file ci_results_run_models_gpu/new_model_failures.json --output_file new_model_failures_with_bad_commit.json
- name: Show results
working-directory: /transformers
run: |
ls -l new_model_failures_with_bad_commit.json
cat new_model_failures_with_bad_commit.json
- name: Checkout back
working-directory: /transformers
run: |
git checkout ${{ inputs.start_sha }}
- name: Process report
shell: bash
working-directory: /transformers
env:
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
run: |
python3 utils/process_bad_commit_report.py
- name: Process report
shell: bash
working-directory: /transformers
env:
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
run: |
{
echo 'REPORT_TEXT<<EOF'
python3 utils/process_bad_commit_report.py
echo EOF
} >> "$GITHUB_ENV"
- name: Send processed report
if: ${{ !endsWith(env.REPORT_TEXT, '{}') }}
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
with:
# Slack channel id, channel name, or user id to post message.
# See also: https://api.slack.com/methods/chat.postMessage#channels
channel-id: '#transformers-ci-feedback-tests'
# For posting a rich message using Block Kit
payload: |
{
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "${{ env.REPORT_TEXT }}"
}
}
]
}
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}

View File

@ -562,3 +562,13 @@ jobs:
ci_event: ${{ inputs.ci_event }}
secrets: inherit
check_new_model_failures:
if: ${{ always() && inputs.ci_event == 'Daily CI' && inputs.job == 'run_models_gpu' && needs.send_results.result == 'success' }}
name: Check new model failures
needs: send_results
uses: ./.github/workflows/check_failed_model_tests.yml
with:
docker: ${{ inputs.docker }}
start_sha: ${{ github.sha }}
secrets: inherit

View File

@ -132,7 +132,7 @@ You will need basic `git` proficiency to contribute to
manual. Type `git --help` in a shell and enjoy! If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.
You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main/setup.py#L449)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:
You'll need **[Python 3.9](https://github.com/huggingface/transformers/blob/main/setup.py#L449)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:
1. Fork the [repository](https://github.com/huggingface/transformers) by
clicking on the **[Fork](https://github.com/huggingface/transformers/fork)** button on the repository's page. This creates a copy of the code

View File

@ -128,10 +128,10 @@ incredible projects built in the vicinity of transformers.
If you own or use a project that you believe should be part of the list, please open a PR to add it!
## If you are looking for custom support from the Hugging Face team
## Serious about AI in your organisation? Build faster with the Hugging Face Enterprise Hub.
<a target="_blank" href="https://huggingface.co/support">
<img alt="HuggingFace Expert Acceleration Program" src="https://cdn-media.huggingface.co/marketing/transformers/new-support-improved.png" style="max-width: 600px; border: 1px solid #eee; border-radius: 4px; box-shadow: 0 1px 2px 0 rgba(0, 0, 0, 0.05);">
<a target="_blank" href="https://huggingface.co/enterprise">
<img alt="Hugging Face Enterprise Hub" src="https://github.com/user-attachments/assets/247fb16d-d251-4583-96c4-d3d76dda4925">
</a><br>
## Quick tour
@ -249,7 +249,7 @@ The model itself is a regular [Pytorch `nn.Module`](https://pytorch.org/docs/sta
### With pip
This repository is tested on Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.
This repository is tested on Python 3.9+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.
You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).

File diff suppressed because it is too large Load Diff

View File

@ -7,6 +7,10 @@ CREATE TABLE IF NOT EXISTS benchmarks (
created_at timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
);
CREATE INDEX IF NOT EXISTS benchmarks_benchmark_id_idx ON benchmarks (benchmark_id);
CREATE INDEX IF NOT EXISTS benchmarks_branch_idx ON benchmarks (branch);
CREATE TABLE IF NOT EXISTS device_measurements (
measurement_id SERIAL PRIMARY KEY,
benchmark_id int REFERENCES benchmarks (benchmark_id),
@ -17,6 +21,8 @@ CREATE TABLE IF NOT EXISTS device_measurements (
time timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
);
CREATE INDEX IF NOT EXISTS device_measurements_branch_idx ON device_measurements (benchmark_id);
CREATE TABLE IF NOT EXISTS model_measurements (
measurement_id SERIAL PRIMARY KEY,
benchmark_id int REFERENCES benchmarks (benchmark_id),
@ -24,3 +30,4 @@ CREATE TABLE IF NOT EXISTS model_measurements (
time timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
);
CREATE INDEX IF NOT EXISTS model_measurements_branch_idx ON model_measurements (benchmark_id);

View File

@ -96,17 +96,21 @@ def run_benchmark(branch: str, commit_id: str, commit_msg: str, num_tokens_to_ge
)
conn.commit()
benchmark_id = cur.fetchone()[0]
logger.info(f"running benchmark #{benchmark_id} on {gpu_name}")
metrics_thread = Thread(target=collect_metrics, args=[benchmark_id, continue_metric_collection])
metrics_thread.start()
logger.info("started background thread to fetch device metrics")
os.environ["TOKENIZERS_PARALLELISM"] = "false" # silence warnings when compiling
device = "cuda"
ckpt = "meta-llama/Llama-2-7b-hf"
logger.info("downloading weights")
# This is to avoid counting download in model load time measurement
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16)
gen_config = GenerationConfig(do_sample=False, top_p=1, temperature=1)
logger.info("loading model")
start = perf_counter()
model = AutoModelForCausalLM.from_pretrained(
ckpt, torch_dtype=torch.float16, generation_config=gen_config

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -9,7 +9,7 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).
ARG PYTORCH='2.4.0'
ARG PYTORCH='2.5.1'
# (not always a valid torch version)
ARG INTEL_TORCH_EXT='2.3.0'
# Example: `cu102`, `cu113`, etc.
@ -26,7 +26,7 @@ RUN git clone https://github.com/huggingface/transformers && cd transformers &&
# 1. Put several commands in a single `RUN` to avoid image/layer exporting issue. Could be revised in the future.
# 2. Regarding `torch` part, We might need to specify proper versions for `torchvision` and `torchaudio`.
# Currently, let's not bother to specify their versions explicitly (so installed with their latest release versions).
RUN python3 -m pip install --no-cache-dir -U tensorflow==2.13 protobuf==3.20.3 tensorflow_text tensorflow_probability && python3 -m pip install --no-cache-dir -e ./transformers[dev,onnxruntime] && [ ${#PYTORCH} -gt 0 -a "$PYTORCH" != "pre" ] && VERSION='torch=='$PYTORCH'.*' || VERSION='torch'; echo "export VERSION='$VERSION'" >> ~/.profile && echo torch=$VERSION && [ "$PYTORCH" != "pre" ] && python3 -m pip install --no-cache-dir -U $VERSION torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/$CUDA || python3 -m pip install --no-cache-dir -U --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/$CUDA
RUN python3 -m pip install --no-cache-dir -U tensorflow==2.13 protobuf==3.20.3 "tensorflow_text<2.16" "tensorflow_probability<0.22" && python3 -m pip install --no-cache-dir -e ./transformers[dev,onnxruntime] && [ ${#PYTORCH} -gt 0 -a "$PYTORCH" != "pre" ] && VERSION='torch=='$PYTORCH'.*' || VERSION='torch'; echo "export VERSION='$VERSION'" >> ~/.profile && echo torch=$VERSION && [ "$PYTORCH" != "pre" ] && python3 -m pip install --no-cache-dir -U $VERSION torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/$CUDA || python3 -m pip install --no-cache-dir -U --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/$CUDA
RUN python3 -m pip uninstall -y flax jax
@ -43,7 +43,7 @@ RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/pef
RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/optimum@main#egg=optimum
# For video model testing
RUN python3 -m pip install --no-cache-dir decord av==9.2.0
RUN python3 -m pip install --no-cache-dir av==9.2.0
# Some slow tests require bnb
RUN python3 -m pip install --no-cache-dir bitsandbytes

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -11,7 +11,7 @@ ARG REF=main
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF
# If set to nothing, will install the latest version
ARG PYTORCH='2.4.0'
ARG PYTORCH='2.5.1'
ARG TORCH_VISION=''
ARG TORCH_AUDIO=''
# Example: `cu102`, `cu113`, etc.

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -9,12 +9,12 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).
ARG PYTORCH='2.2.1'
ARG PYTORCH='2.4.1'
# Example: `cu102`, `cu113`, etc.
ARG CUDA='cu118'
RUN apt update
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python python3-pip ffmpeg
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python3 python3-pip ffmpeg
RUN python3 -m pip install --no-cache-dir --upgrade pip
ARG REF=main
@ -53,7 +53,7 @@ RUN python3 -m pip install --no-cache-dir gguf
# Add autoawq for quantization testing
# >=v0.2.3 needed for compatibility with torch 2.2.1
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.3/autoawq-0.2.3+cu118-cp38-cp38-linux_x86_64.whl
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.3/autoawq-0.2.3+cu118-cp310-cp310-linux_x86_64.whl
# Add quanto for quantization testing
RUN python3 -m pip install --no-cache-dir optimum-quanto

View File

@ -1,4 +1,4 @@
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
LABEL maintainer="Hugging Face"
ARG DEBIAN_FRONTEND=noninteractive
@ -18,7 +18,7 @@ RUN [ ${#TENSORFLOW} -gt 0 ] && VERSION='tensorflow=='$TENSORFLOW'.*' || VERSIO
RUN python3 -m pip uninstall -y torch flax
RUN python3 -m pip install -U "itsdangerous<2.1.0"
RUN python3 -m pip install --no-cache-dir -U tensorflow_probability
RUN python3 -m pip install --no-cache-dir -U "tensorflow_probability<0.22"
# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.

View File

@ -276,14 +276,14 @@ building the return.
Here's an example of a single value return:
```
```python
Returns:
`List[int]`: A list of integers in the range [0, 1] --- 1 for a special token, 0 for a sequence token.
```
Here's an example of a tuple return, comprising several objects:
```
```python
Returns:
`tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
- ** loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --
@ -322,10 +322,9 @@ includes an example of how to transcribe speech to text in the
The syntax for Example docstrings can look as follows:
```
```python
Example:
```python
>>> from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
>>> from datasets import load_dataset
>>> import torch
@ -347,7 +346,6 @@ The syntax for Example docstrings can look as follows:
>>> transcription = processor.batch_decode(predicted_ids)
>>> transcription[0]
'MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL'
```
```
The docstring should give a minimal, clear example of how the respective model

View File

@ -1,57 +1,70 @@
### Translating the Transformers documentation into your language
# Translating the Transformers documentation into your language
As part of our mission to democratize machine learning, we'd love to make the Transformers library available in many more languages! Follow the steps below if you want to help translate the documentation into your language 🙏.
As part of our mission to democratize machine learning, we aim to make the Transformers library available in many more languages! Follow the steps below to help translate the documentation into your language.
**🗞️ Open an issue**
## Open an Issue
To get started, navigate to the [Issues](https://github.com/huggingface/transformers/issues) page of this repo and check if anyone else has opened an issue for your language. If not, open a new issue by selecting the "Translation template" from the "New issue" button.
1. Navigate to the Issues page of this repository.
2. Check if anyone has already opened an issue for your language.
3. If not, create a new issue by selecting the "Translation template" from the "New issue" button.
4. Post a comment indicating which chapters you'd like to work on, and we'll add your name to the list.
Once an issue exists, post a comment to indicate which chapters you'd like to work on, and we'll add your name to the list.
## Fork the Repository
1. First, fork the Transformers repo by clicking the Fork button in the top-right corner.
2. Clone your fork to your local machine for editing with the following command:
**🍴 Fork the repository**
```bash
git clone https://github.com/YOUR-USERNAME/transformers.git
```
Replace `YOUR-USERNAME` with your GitHub username.
First, you'll need to [fork the Transformers repo](https://docs.github.com/en/get-started/quickstart/fork-a-repo). You can do this by clicking on the **Fork** button on the top-right corner of this repo's page.
## Copy-paste the English version with a new language code
Once you've forked the repo, you'll want to get the files on your local machine for editing. You can do that by cloning the fork with Git as follows:
The documentation files are organized in the following directory:
```bash
git clone https://github.com/YOUR-USERNAME/transformers.git
```
- **docs/source**: This contains all documentation materials organized by language.
**📋 Copy-paste the English version with a new language code**
To copy the English version to your new language directory:
The documentation files are in one leading directory:
1. Navigate to your fork of the repository:
- [`docs/source`](https://github.com/huggingface/transformers/tree/main/docs/source): All the documentation materials are organized here by language.
```bash
cd ~/path/to/transformers/docs
```
You'll only need to copy the files in the [`docs/source/en`](https://github.com/huggingface/transformers/tree/main/docs/source/en) directory, so first navigate to your fork of the repo and run the following:
Replace `~/path/to` with your actual path.
```bash
cd ~/path/to/transformers/docs
cp -r source/en source/LANG-ID
```
2. Run the following command:
Here, `LANG-ID` should be one of the ISO 639-1 or ISO 639-2 language codes -- see [here](https://www.loc.gov/standards/iso639-2/php/code_list.php) for a handy table.
```bash
cp -r source/en source/LANG-ID
```
**✍️ Start translating**
Replace `LANG-ID` with the appropriate ISO 639-1 or ISO 639-2 language code (see [this table](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) for reference).
The fun part comes - translating the text!
## Start translating
The first thing we recommend is translating the part of the `_toctree.yml` file that corresponds to your doc chapter. This file is used to render the table of contents on the website.
Begin translating the text!
> 🙋 If the `_toctree.yml` file doesn't yet exist for your language, you can create one by copy-pasting from the English version and deleting the sections unrelated to your chapter. Just make sure it exists in the `docs/source/LANG-ID/` directory!
1. Start with the `_toctree.yml` file that corresponds to your documentation chapter. This file is essential for rendering the table of contents on the website.
The fields you should add are `local` (with the name of the file containing the translation; e.g. `autoclass_tutorial`), and `title` (with the title of the doc in your language; e.g. `Load pretrained instances with an AutoClass`) -- as a reference, here is the `_toctree.yml` for [English](https://github.com/huggingface/transformers/blob/main/docs/source/en/_toctree.yml):
- If the `_toctree.yml` file doesn't exist for your language, create one by copying the English version and removing unrelated sections.
- Ensure it is placed in the `docs/source/LANG-ID/` directory.
```yaml
- sections:
- local: pipeline_tutorial # Do not change this! Use the same name for your .md file
title: Pipelines for inference # Translate this!
...
title: Tutorials # Translate this!
```
Here's an example structure for the `_toctree.yml` file:
Once you have translated the `_toctree.yml` file, you can start translating the [MDX](https://mdxjs.com/) files associated with your docs chapter.
```yaml
- sections:
- local: pipeline_tutorial # Keep this name for your .md file
title: Pipelines for Inference # Translate this
...
title: Tutorials # Translate this
```
> 🙋 If you'd like others to help you with the translation, you should [open an issue](https://github.com/huggingface/transformers/issues) and tag @stevhliu.
2. Once you've translated the `_toctree.yml`, move on to translating the associated MDX files.
## Collaborate and share
If you'd like assistance with your translation, open an issue and tag `@stevhliu`. Feel free to share resources or glossaries to ensure consistent terminology.

View File

@ -108,38 +108,38 @@
# title: LLM prompting guide
# title: Prompting
# title: Task guides
# - sections:
# - local: fast_tokenizers
# title: Use fast tokenizers from 🤗 Tokenizers
# - local: multilingual
# title: Run inference with multilingual models
# - local: create_a_model
# title: Use model-specific APIs
# - local: custom_models
# title: Share a custom model
# - local: chat_templating
# title: Templates for chat models
# - local: trainer
# title: Trainer
# - local: sagemaker
# title: Run training on Amazon SageMaker
# - local: serialization
# title: Export to ONNX
# - local: tflite
# title: Export to TFLite
# - local: torchscript
# title: Export to TorchScript
- sections:
- local: fast_tokenizers
title: Use fast tokenizers from 🤗 Tokenizers
- local: multilingual
title: Run inference with multilingual models
- local: create_a_model
title: Use model-specific APIs
- local: custom_models
title: Share a custom model
- local: chat_templating
title: Templates for chat models
- local: trainer
title: Trainer
- local: sagemaker
title: Run training on Amazon SageMaker
- local: serialization
title: Export to ONNX
- local: tflite
title: Export to TFLite
- local: torchscript
title: Export to TorchScript
# - local: benchmarks
# title: Benchmarks
# - local: notebooks
# title: Notebooks with examples
# - local: community
# title: Community resources
# - local: troubleshooting
# title: Troubleshooting
# - local: gguf
# title: Interoperability with GGUF files
# title: Developer guides
- local: troubleshooting
title: Troubleshooting
- local: gguf
title: Interoperability with GGUF files
title: Developer guides
# - sections:
# - local: quantization/overview
# title: Overview

View File

@ -464,7 +464,7 @@ image = image_generator(prompt=improved_prompt)
before finally generating the image:
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png" />
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp" />
> [!WARNING]
> gradio-tools requires *textual* inputs and outputs even when working with different modalities such as image and audio objects. Image and audio inputs and outputs are currently incompatible.

View File

@ -0,0 +1,835 @@
# Chat Templates
## Introduction
An increasingly common use case for large language models (LLMs) is **chat**. In a chat context, rather than continuing a single string of text (as is the case with a standard language model), the model instead continues a conversation that consists of one or more messages, each of which includes a role, like "user" or "assistant", as well as message text.
Just as with tokenization, different models expect very different input formats for chat. This is why we added **chat templates** as a feature. Chat templates are part of the tokenizer: they specify how to convert conversations, represented as lists of messages, into a single tokenizable string in the format that the model expects.
Let's make this concrete with a quick example using the `BlenderBot` model. BlenderBot has an extremely simple default template, which mostly just adds whitespace between rounds of dialogue:
```python
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
>>> chat = [
... {"role": "user", "content": "Hello, how are you?"},
... {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
... {"role": "user", "content": "I'd like to show off how chat templating works!"},
... ]
>>> tokenizer.apply_chat_template(chat, tokenize=False)
" Hello, how are you? I'm doing great. How can I help you today? I'd like to show off how chat templating works!</s>"
```
Notice how the entire chat is condensed into a single string. If we use `tokenize=True`, which is the default setting, that string will also be tokenized for us. To see a more complex template in action, though, let's use the `mistralai/Mistral-7B-Instruct-v0.1` model.
```python
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
>>> chat = [
... {"role": "user", "content": "Hello, how are you?"},
... {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
... {"role": "user", "content": "I'd like to show off how chat templating works!"},
... ]
>>> tokenizer.apply_chat_template(chat, tokenize=False)
"<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]</s>"
```
Note how the tokenizer has added the control tokens `[INST]` and `[/INST]` to indicate the start and end of user messages (but not assistant messages!), and the entire conversation is condensed into a single string. If we use `tokenize=True`, which is the default setting, that string will also be tokenized.
Now, try the same code, but swap in the `HuggingFaceH4/zephyr-7b-beta` model instead, and you will get:
```text
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
I'd like to show off how chat templating works!</s>
```
Both Zephyr and Mistral-Instruct were fine-tuned from the same base model, Mistral-7B-v0.1, but they were trained with totally different chat formats. Without chat templates, you would have to write manual formatting code for each model, and it's very easy to make minor errors that hurt performance! Chat templates handle the formatting details for you, letting you write universal code that works with any model.
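As a minimal sketch of what such universal code can look like (it simply reuses the two checkpoints discussed above and prints the formatted string for each), the same conversation can be rendered for both models with identical code:
```python
from transformers import AutoTokenizer

chat = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
]

# The exact same call works for both models; each tokenizer applies its own chat template.
for checkpoint in ["mistralai/Mistral-7B-Instruct-v0.1", "HuggingFaceH4/zephyr-7b-beta"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    print(checkpoint)
    print(tokenizer.apply_chat_template(chat, tokenize=False))
```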
## How do I use chat templates?
As you saw in the example above, chat templates are easy to use. Simply build a list of messages, with `role` and `content` keys, and then pass it to [`~PreTrainedTokenizer.apply_chat_template`]. Once you do that, you'll get output that's ready to go! When using chat templates as input for model generation, it's also a good idea to use `add_generation_prompt=True` to add a [generation prompt](#what-are-generation-prompts).
Here's an example of preparing the input for `model.generate()`, using Zephyr again:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
checkpoint = "HuggingFaceH4/zephyr-7b-beta"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # You may want to use bfloat16 and/or move to GPU here
messages = [
{
"role": "system",
"content": "You are a friendly chatbot who always responds in the style of a pirate",
},
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
print(tokenizer.decode(tokenized_chat[0]))
```
This will yield a string in the input format that Zephyr expects.
```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
```
Now that our input is formatted correctly for Zephyr, we can use the model to generate a response to the user's question:
```python
outputs = model.generate(tokenized_chat, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))
```
This will yield:
```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all.
```
That was easy after all!
## Is there an automated pipeline for chat?
Yes, there is! Our text generation pipelines support chat inputs, which makes it easy to use chat models. In the past, we used a dedicated "ConversationalPipeline" class, but this has now been deprecated and its functionality has been merged into the [`TextGenerationPipeline`]. Let's try the Zephyr example again, but this time using a pipeline:
```python
from transformers import pipeline
pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
messages = [
{
"role": "system",
"content": "You are a friendly chatbot who always responds in the style of a pirate",
},
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
print(pipe(messages, max_new_tokens=128)[0]['generated_text'][-1]) # Print the assistant's response
```
```text
{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
```
The pipeline takes care of all the details of tokenization and of calling `apply_chat_template` for you. Once the model has a chat template, all you need to do is initialize the pipeline and pass it the list of messages!
## What are "generation prompts"?
You may have noticed that the `apply_chat_template` method has an `add_generation_prompt` argument. This argument tells the template to add tokens that indicate the start of a bot response. For example, consider the following chat:
```python
messages = [
{"role": "user", "content": "Hi there!"},
{"role": "assistant", "content": "Nice to meet you!"},
{"role": "user", "content": "Can I ask a question?"}
]
```
Here's what this looks like without a generation prompt, for a model that uses the standard "ChatML" format:
```python
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
"""
```
And here's what it looks like **with** a generation prompt:
```python
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
```
Note that this time, we've added the tokens that indicate the start of a bot response. This ensures that when the model generates text, it will write a bot response instead of doing something unexpected, like continuing the user's message. Remember, chat models are still just language models - they're trained to continue text, and chat is just a special kind of text to them! You need to guide them with appropriate control tokens, so they know what they're supposed to be doing.
Not all models require generation prompts. Some models, like LLaMA, don't have any special tokens before bot responses. In these cases, the `add_generation_prompt` argument will have no effect. The exact effect that `add_generation_prompt` has will depend on the template being used.
## What does "continue_final_message" do?
When passing a list of messages to `apply_chat_template` or `TextGenerationPipeline`, you can choose to format the chat so the model will continue the final message in the chat instead of starting a new one. This is done by removing any end-of-sequence tokens that indicate the end of the final message, so that the model will simply extend the final message when it begins generating text. This is useful for "prefilling" the model's response.
Here's an example:
```python
chat = [
{"role": "user", "content": "Can you format the answer in JSON?"},
{"role": "assistant", "content": '{"name": "'},
]
formatted_chat = tokenizer.apply_chat_template(chat, tokenize=True, return_dict=True, continue_final_message=True)
model.generate(**formatted_chat)
```
The model will generate text that continues the JSON string, rather than starting a new message. This approach can be very useful for improving the accuracy of the model's instruction-following when you know how you want it to start its replies.
Because `add_generation_prompt` adds the tokens that start a new message, and `continue_final_message` removes any end-of-message tokens from the final message, it does not make sense to use them together. As a result, you'll get an error if you try!
The default behaviour of `TextGenerationPipeline` is to set `add_generation_prompt=True` so that it starts a new message. However, if the final message in the input chat has the "assistant" role, it will assume that this message is a prefill and switch to `continue_final_message=True` instead, because most models do not support multiple consecutive assistant messages. You can override this behaviour by explicitly passing the `continue_final_message` argument when calling the pipeline.
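As a rough sketch of that override (assuming your installed `transformers` version exposes the `continue_final_message` argument on the pipeline call, as recent releases do), it could look like this:
```python
from transformers import pipeline

pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
chat = [
    {"role": "user", "content": "Can you format the answer in JSON?"},
    {"role": "assistant", "content": '{"name": "'},  # an assistant prefill
]

# Default behaviour: the trailing assistant message is treated as a prefill and extended.
print(pipe(chat, max_new_tokens=32)[0]["generated_text"][-1])

# Explicit override: start a new assistant message instead of extending the prefill.
print(pipe(chat, max_new_tokens=32, continue_final_message=False)[0]["generated_text"][-1])
```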
## Can I use chat templates in training?
Yes! This is a good way to ensure that the chat template matches the tokens the model sees during training. We recommend that you apply the chat template as a preprocessing step for your dataset. After this, you can simply continue like you would with any other language model training task. When training, you should usually set `add_generation_prompt=False`, because the added tokens that prompt an assistant response will not be helpful during training. Let's see an example:
```python
from transformers import AutoTokenizer
from datasets import Dataset
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
chat1 = [
{"role": "user", "content": "Which is bigger, the moon or the sun?"},
{"role": "assistant", "content": "The sun."}
]
chat2 = [
{"role": "user", "content": "Which is bigger, a virus or a bacterium?"},
{"role": "assistant", "content": "A bacterium."}
]
dataset = Dataset.from_dict({"chat": [chat1, chat2]})
dataset = dataset.map(lambda x: {"formatted_chat": tokenizer.apply_chat_template(x["chat"], tokenize=False, add_generation_prompt=False)})
print(dataset['formatted_chat'][0])
```
And we get:
```text
<|user|>
Which is bigger, the moon or the sun?</s>
<|assistant|>
The sun.</s>
```
From here, just continue training like you would with a standard language modelling task, using the `formatted_chat` column.
<Tip>
By default, some tokenizers add special tokens like `<bos>` and `<eos>` to text they tokenize. Chat templates should already include all the special tokens they need, so additional special tokens will often be incorrect or duplicated, which will hurt model performance.
Therefore, if you format text with `apply_chat_template(tokenize=False)`, you should set the argument `add_special_tokens=False` when you tokenize that text later. If you use `apply_chat_template(tokenize=True)`, you don't need to worry about this!
</Tip>
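As a minimal sketch of that next step (continuing from the `dataset` and `tokenizer` defined in the example above; the `max_length` value here is an arbitrary illustration), the formatted column can be tokenized with the settings the tip recommends:
```python
# The chat template already added every special token we need,
# so we disable the tokenizer's own special-token insertion.
def tokenize_formatted(example):
    return tokenizer(
        example["formatted_chat"],
        add_special_tokens=False,
        truncation=True,
        max_length=2048,
    )

tokenized_dataset = dataset.map(tokenize_formatted, remove_columns=dataset.column_names)
```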
## Advanced: extra inputs to chat templates
The only argument that `apply_chat_template` requires is `messages`. However, you can pass any keyword argument to `apply_chat_template` and it will be accessible inside the template. This gives you a lot of flexibility to use chat templates for many things. There are no restrictions on the names or formats of these arguments - you can pass strings, lists, dicts or anything else you want.
That said, there are some common use-cases for these extra arguments, such as passing tools for function calling, or documents for retrieval-augmented generation. In these common cases, we have some specific recommendations about what the names and formats of these arguments should be, which are described in the sections below. We encourage model authors to make their chat templates compatible with this format, to make it easy to transfer tool-calling code between models.
## Advanced: tool use / function calling
"Tool use" models can choose to call functions as external tools before generating an answer. When passing tools to a tool-use model, you can simply pass a list of functions to the `tools` argument:
```python
from datetime import datetime
def current_time():
"""Get the current local time as a string."""
return str(datetime.now())
def multiply(a: float, b: float):
"""
A function that multiplies two numbers
Args:
a: The first number to multiply
b: The second number to multiply
"""
return a * b
tools = [current_time, multiply]
model_input = tokenizer.apply_chat_template(
messages,
tools=tools
)
```
For this to work correctly, you should write your functions in the format above, so that they can be parsed correctly as tools. Specifically, you should follow these rules:
- The function should have a descriptive name.
- Every argument must have a type hint.
- The function must have a docstring in the standard Google style (in other words, an initial function description followed by an `Args:` block that describes the arguments, unless the function does not accept any arguments).
- Do not include types in the `Args:` block. In other words, write `a: The first number to multiply`, not `a (int): The first number to multiply`. Type hints should go in the function header instead.
- The function can have a return type and a `Returns:` block in the docstring. However, these are optional because most tool-use models ignore them.
### Passing tool results to the model
The sample code above is enough to list the available tools for your model, but what happens if it actually wants to use one? If that happens, you should:
1. Parse the model's output to get the tool name(s) and arguments.
2. Add the model's tool call(s) to the conversation.
3. Call the corresponding function(s) with those arguments.
4. Add the result(s) to the conversation.
### A complete tool use example
Let's walk through a tool use example, step by step. For this example, we will use an 8-billion-parameter `Hermes-2-Pro` model, since it is one of the highest-performing tool-use models in its size class at the time of writing. If you have enough memory, you can consider using a larger model instead, such as `Command-R` or `Mixtral-8x22B`, both of which also support tool use and offer even stronger performance.
First, let's load our model and tokenizer:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
checkpoint = "NousResearch/Hermes-2-Pro-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")
```
Next, let's set up the conversation we want the model to respond to:
```python
messages = [
{"role": "system", "content": "You are a bot that responds to weather queries. You should reply with the unit used in the queried location."},
{"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
]
```
Now, let's apply the chat template and generate a response:
```python
inputs = tokenizer.apply_chat_template(messages, chat_template="tool_use", tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
```
And we get:
```text
<tool_call>
{"arguments": {"location": "Paris, France", "unit": "celsius"}, "name": "get_current_temperature"}
</tool_call><|im_end|>
```
The model has called the function with valid arguments, in the format requested by the function's docstring. It has inferred that we're most likely referring to the Paris in France, and it remembered that, as the home of SI units, the temperature in France should be displayed in Celsius.
Let's append the model's tool call to the conversation. Note that we generate a random tool call ID here. Not all models use these, but they allow models to issue several tool calls at once and keep track of which response corresponds to which call. You can generate these IDs any way you like, but they should be unique within each chat.
```python
tool_call_id = "vAHdf3" # Random ID, should be unique for each tool call
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"id": tool_call_id, "type": "function", "function": tool_call}]})
```
Now that we've added the tool call to the conversation, we can call the function and append the result to the conversation. Since we're just using a dummy function for this example that always returns 22.0, we can simply append that result directly. Note the tool call ID - it should match the ID used in the tool call above.
```python
messages.append({"role": "tool", "tool_call_id": tool_call_id, "name": "get_current_temperature", "content": "22.0"})
```
Finally, let's have the assistant read the function output and continue the chat with the user:
```python
inputs = tokenizer.apply_chat_template(messages, chat_template="tool_use", tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
```
And we get:
```text
The current temperature in Paris, France is 22.0 ┬░ Celsius.<|im_end|>
```
<Tip>
Not all of the tool-calling features shown above are used by every tool-use model. Some use tool call IDs, others simply use the function name and match tool calls to results using the ordering, and several models use neither and only issue one tool call at a time to avoid confusion. If you want your code to be compatible with as many models as possible, we recommend structuring your tool calls as shown here, and returning tool results in the order that the model issued them. The chat templates on each model should handle the rest.
</Tip>
### Understanding tool schemas
Each function you pass to the `tools` argument of `apply_chat_template` is converted into a [JSON schema](https://json-schema.org/learn/getting-started-step-by-step). These schemas are then passed to the model chat template. In other words, tool-use models do not see your functions directly, and they never see the actual code inside them. What they care about is the function **definitions** and the **arguments** they need to pass to them - they care about what the tools do and how to use them, not how they work! It is up to you to read their outputs, detect if they have requested to use a tool, pass their arguments to the tool function, and return the response in the chat.
Generating JSON schemas to pass to the template should be automatic and invisible as long as your functions follow the specification above, but if you run into problems, or simply want more control over the conversion, you can handle the conversion manually. Here is an example of a manual schema conversion:
```python
from transformers.utils import get_json_schema
def multiply(a: float, b: float):
"""
A function that multiplies two numbers
Args:
a: The first number to multiply
b: The second number to multiply
"""
return a * b
schema = get_json_schema(multiply)
print(schema)
```
This will yield:
```json
{
"type": "function",
"function": {
"name": "multiply",
"description": "A function that multiplies two numbers",
"parameters": {
"type": "object",
"properties": {
"a": {
"type": "number",
"description": "The first number to multiply"
},
"b": {
"type": "number",
"description": "The second number to multiply"
}
},
"required": ["a", "b"]
}
}
}
```
If you wish, you can edit these schemas, or even write them from scratch yourself without using `get_json_schema` at all. JSON schemas can be passed directly to the `tools` argument of `apply_chat_template` - this gives you a lot of power to define precise schemas for more complex functions. Be careful, though - the more complex your schemas, the more likely the model is to get confused when dealing with them! We recommend simple function signatures where possible, keeping arguments (and especially complex, nested arguments) to a minimum.
Here is an example of defining schemas by hand, and passing them directly to `apply_chat_template`:
```python
# A simple function that takes no arguments
current_time = {
"type": "function",
"function": {
"name": "current_time",
"description": "Get the current local time as a string.",
"parameters": {
'type': 'object',
'properties': {}
}
}
}
# A more complete function that takes two numerical arguments
multiply = {
'type': 'function',
'function': {
'name': 'multiply',
'description': 'A function that multiplies two numbers',
'parameters': {
'type': 'object',
'properties': {
'a': {
'type': 'number',
'description': 'The first number to multiply'
},
'b': {
'type': 'number', 'description': 'The second number to multiply'
}
},
'required': ['a', 'b']
}
}
}
model_input = tokenizer.apply_chat_template(
messages,
tools = [current_time, multiply]
)
```
## Advanced: Retrieval-augmented generation
"Retrieval-augmented generation" or "RAG" LLMs can search a corpus of documents for information before responding to a query. This allows models to vastly expand their knowledge base beyond their limited context size. Our recommendation for RAG models is that their template should accept a `documents` argument. This should be a list of documents, where each "document" is a single dict with `title` and `contents` keys, both of which are strings. Because this format is much simpler than the JSON schemas used for tools, no helper functions are necessary.
Here's an example of a RAG template in action:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the model and tokenizer
model_id = "CohereForAI/c4ai-command-r-v01-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
device = model.device  # Get the device the model is loaded on
# Define conversation input
conversation = [
{"role": "user", "content": "What has Man always dreamed of?"}
]
# Define documents for retrieval-based generation
documents = [
{
"title": "The Moon: Our Age-Old Foe",
"text": "Man has always dreamed of destroying the moon. In this essay, I shall..."
},
{
"title": "The Sun: Our Age-Old Friend",
"text": "Although often underappreciated, the sun provides several notable benefits..."
}
]
# Tokenize conversation and documents using a RAG template, returning PyTorch tensors.
input_ids = tokenizer.apply_chat_template(
conversation=conversation,
documents=documents,
chat_template="rag",
tokenize=True,
add_generation_prompt=True,
return_tensors="pt").to(device)
# Generate a response
gen_tokens = model.generate(
input_ids,
max_new_tokens=100,
do_sample=True,
temperature=0.3,
)
# Decode and print the generated text
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
The `documents` input for retrieval-augmented generation is not widely supported, and many models have chat templates which simply ignore this input.
To verify whether a model supports the `documents` input, you can read its model card, or `print(tokenizer.chat_template)` to see if the `documents` key is used anywhere.
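As a rough sketch (a simple heuristic check, not an official API), you can search the template string for that key:
```python
template = tokenizer.chat_template
# Heuristic: templates that support RAG usually reference the `documents` variable somewhere
supports_documents = template is not None and "documents" in str(template)
print(supports_documents)
```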
<Tip>
One class of models that does support it, though, is Cohere's [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-08-2024) and [Command-R+](https://huggingface.co/CohereForAI/c4ai-command-r-pluse-08-2024), through their `rag` chat template. You can see additional examples of grounded generation using this feature in their model cards.
</Tip>
## Advanced: How do chat templates work?
The chat template for a model is stored on the `tokenizer.chat_template` attribute. If no chat template is set, the default template for that model class is used instead. Let's take a look at a `Zephyr` chat template, though note this one is slightly simplified from the actual template!
```
{%- for message in messages %}
{{- '<|' + message['role'] + '|>\n' }}
{{- message['content'] + eos_token }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|assistant|>\n' }}
{%- endif %}
```
If you've never seen one of these before, this is a [Jinja template](https://jinja.palletsprojects.com/en/3.1.x/templates/). Jinja is a templating language that allows you to write simple code that generates text. In many ways, the code and syntax resemble Python. In pure Python, this template would look something like this:
```python
for message in messages:
print(f'<|{message["role"]}|>')
print(message['content'] + eos_token)
if add_generation_prompt:
print('<|assistant|>')
```
Effectively, the template does three things:
- For each message, print the role enclosed in `<|` and `|>`, like `<|user|>` or `<|assistant|>`.
- Next, print the content of the message, followed by the end-of-sequence token `eos_token`.
- Finally, if `add_generation_prompt` is set, print the assistant token, so that the model knows to start generating an assistant response.
This is a pretty simple template, but Jinja gives you a lot of flexibility to do more complex things! Let's see a Jinja template that can format inputs similarly to the way LLaMA formats them (note that the real LLaMA template includes handling for default system messages and slightly different system message handling in general - don't use this one in your actual code!)
```
{%- for message in messages %}
{%- if message['role'] == 'user' %}
{{- bos_token + '[INST] ' + message['content'] + ' [/INST]' }}
{%- elif message['role'] == 'system' %}
{{- '<<SYS>>\\n' + message['content'] + '\\n<</SYS>>\\n\\n' }}
{%- elif message['role'] == 'assistant' %}
{{- ' ' + message['content'] + ' ' + eos_token }}
{%- endif %}
{%- endfor %}
```
Hopefully if you stare at this for a little while you can see what this template is doing - it adds specific tokens like `[INST]` and `[/INST]` based on the role of each message. User, assistant and system messages are clearly distinguishable to the model because of the tokens they're wrapped in.
## Advanced: Adding and editing chat templates
### How do I create a chat template?
Simple, just write a Jinja template and set `tokenizer.chat_template`. You may find it easier to start with an existing template from another model and simply edit it for your needs! For example, we could take the LLaMA template above and add `[ASST]` and `[/ASST]` to assistant messages:
```
{%- for message in messages %}
{%- if message['role'] == 'user' %}
{{- bos_token + '[INST] ' + message['content'].strip() + ' [/INST]' }}
{%- elif message['role'] == 'system' %}
{{- '<<SYS>>\\n' + message['content'].strip() + '\\n<</SYS>>\\n\\n' }}
{%- elif message['role'] == 'assistant' %}
{{- '[ASST] ' + message['content'] + ' [/ASST]' + eos_token }}
{%- endif %}
{%- endfor %}
```
Now, simply set the `tokenizer.chat_template` attribute. Next time you use [`~PreTrainedTokenizer.apply_chat_template`], it will use your new template! This attribute will be saved in the `tokenizer_config.json` file, so you can use [`~utils.PushToHubMixin.push_to_hub`] to upload your new template to the Hub and make sure everyone's using the right template for your model!
```python
template = tokenizer.chat_template
template = template.replace("SYS", "SYSTEM")  # Change the system token
tokenizer.chat_template = template  # Set the new template
tokenizer.push_to_hub("model_name")  # Upload your new template to the Hub!
```
The method [`~PreTrainedTokenizer.apply_chat_template`] which uses your chat template is called by the [`TextGenerationPipeline`] class, so once you set the correct chat template, your model will automatically become compatible with [`TextGenerationPipeline`].
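As a minimal sketch of that integration (reusing the checkpoint from the example above; the exact output structure of chat pipelines can vary between library versions):
```python
from transformers import pipeline

pipe = pipeline("text-generation", model="NousResearch/Hermes-2-Pro-Llama-3-8B", device_map="auto")
chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hey, what's the temperature in Paris right now?"},
]
# The pipeline applies the tokenizer's chat template before generating
out = pipe(chat, max_new_tokens=64)
print(out[0]["generated_text"][-1])  # the newly generated assistant message
```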
<Tip>
If you're fine-tuning a model for chat, in addition to setting a chat template, you should probably add any new chat control tokens as special tokens in the tokenizer. Special tokens are never split, which ensures that your control tokens are always handled as single tokens rather than being tokenized in pieces. You should also set the tokenizer's `eos_token` attribute to the token that marks the end of assistant generations in your template. This will ensure that text generation tools can correctly figure out when to stop generating text.
</Tip>
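Building on the tip above, here is a minimal sketch of registering new control tokens before fine-tuning (the token strings below are placeholders for whatever your template actually emits):
```python
# Hypothetical control tokens used by the new chat template
new_tokens = ["<|im_start|>", "<|im_end|>"]
tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})
tokenizer.eos_token = "<|im_end|>"  # the token that ends assistant turns in this template
# Resize the embedding matrix so the model has rows for the newly added tokens
model.resize_token_embeddings(len(tokenizer))
```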
### Why do some models have multiple templates?
Some models use different templates for different use cases. For example, they might use one template for normal chat and another for tool use or retrieval-augmented generation. In these cases, `tokenizer.chat_template` is a dictionary. This can cause some confusion, and where possible, we recommend using a single template for all use cases. You can use Jinja statements like `if tools is defined` and `{% macro %}` definitions to easily wrap multiple code paths in a single template.
When a tokenizer has multiple templates, `tokenizer.chat_template` will be a `dict`, where each key is the name of a template. The `apply_chat_template` method has special handling for certain template names: specifically, it will look for a template named `default` in most cases, and will raise an error if it can't find one. However, if a template named `tool_use` exists and the user has passed a `tools` argument, it will use that template instead. To access templates with other names, pass the name of the template you want to the `chat_template` argument of `apply_chat_template()`.
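As a small illustration (a sketch assuming a tokenizer whose `chat_template` is such a dict, reusing the `messages` and `documents` defined earlier; not every model ships a `rag` template):
```python
# Inspect which named templates the tokenizer ships with
if isinstance(tokenizer.chat_template, dict):
    print(list(tokenizer.chat_template.keys()))  # e.g. ['default', 'tool_use', 'rag']

# Explicitly select a named template
prompt = tokenizer.apply_chat_template(
    messages,
    chat_template="rag",
    documents=documents,
    add_generation_prompt=True,
    tokenize=False,
)
```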
We find that this can be a bit confusing for users, though - so if you're writing a template yourself, we recommend trying to put it all in a single template where possible!
## What template should I use?
When setting the template for a model that's already been trained for chat, you should ensure that the template exactly matches the message formatting that the model saw during training, or else you will probably experience performance degradation. This is true even if you're training the model further - you will probably get the best performance if you keep the chat tokens constant. This is very analogous to tokenization - you generally get the best performance for inference or fine-tuning when you precisely match the tokenization used during training.
If you're training a model from scratch, or fine-tuning a base language model for chat, on the other hand, you have a lot of freedom to choose an appropriate template! LLMs are smart enough to learn to handle lots of different input formats. One popular choice is the `ChatML` format, and this is a good, flexible choice for many use cases. It looks like this:
```
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
```
If you like this one, here it is in one-liner form, ready to copy into your code. The one-liner also includes handy support for [generation prompts](#what-are-generation-prompts), but note that it doesn't add BOS or EOS tokens! If your model expects those, they won't be added automatically by `apply_chat_template` - in other words, the text will be tokenized with `add_special_tokens=False`. This is to avoid potential conflicts between the template and the `add_special_tokens` logic. If your model expects special tokens, make sure to add them to the template!
```python
tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
```
This template wraps each message in `<|im_start|>` and `<|im_end|>` tokens and simply writes the role as a string, which allows for flexibility in the roles you train with. The output looks like this:
```text
<|im_start|>system
You are a helpful chatbot that will do its best not to say anything so stupid that people tweet about it.<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
I'm doing great!<|im_end|>
```
The "user", "system" and "assistant" roles are the standard roles for chat, and we recommend using them when it makes sense, particularly if you want your model to operate well with [`TextGenerationPipeline`]. However, you are not limited to these roles - templating is extremely flexible, and any string can be a role.
## I want to add some chat templates! How should I get started?
If you have any chat models, you should set their `tokenizer.chat_template` attribute and test it using [`~PreTrainedTokenizer.apply_chat_template`], then push the updated tokenizer to the Hub. This applies even if you're not the model owner - if you're using a model with an empty chat template, or one that's still using the default class template, please open a [pull request](https://huggingface.co/docs/hub/repositories-pull-requests-discussions) to the model repository so that this attribute can be set properly!
Once the attribute is set, that's it, you're done! `tokenizer.apply_chat_template` will now work correctly for that model, which means it is also automatically supported in places like `TextGenerationPipeline`!
By ensuring that models have this attribute, we can make sure that the whole community gets to use the full power of open-source models. Formatting mismatches have been haunting the field and silently harming performance for too long - it's time to put an end to them!
## ┘Е╪к┘В╪п┘Е: ┘Ж╪╡╪з╪ж╪н ┘Д┘Г╪к╪з╪и╪й ╪з┘Д┘В┘И╪з┘Д╪и
<Tip>
The easiest way to get started with writing Jinja templates is to take a look at some existing ones. You can use `print(tokenizer.chat_template)` for any chat model to see what template it's using. In general, models that support tool use have much more complex templates than other models - so when you're just getting started, they're probably a bad example to learn from! You can also take a look at the [Jinja documentation](https://jinja.palletsprojects.com/en/3.1.x/templates/#synopsis) for details of general Jinja formatting and syntax.
</Tip>
Jinja templates in `transformers` are identical to Jinja templates elsewhere. The main thing to know is that the conversation history will be accessible inside your template as a variable called `messages`. You will be able to access `messages` in your template just like you can in Python, which means you can loop over it with `{% for message in messages %}` or access individual messages with `{{ messages[0] }}`, for example.
You can also use the following tips to write clean, efficient Jinja templates:
### Trimming whitespace
By default, Jinja will print any whitespace that comes before or after a block. This can be a problem for chat templates, which generally want to be very precise with whitespace! To avoid this, we strongly recommend writing your templates like this:
```
{%- for message in messages %}
{{- message['role'] + message['content'] }}
{%- endfor %}
```
rather than like this:
```
{% for message in messages %}
{{ message['role'] + message['content'] }}
{% endfor %}
```
Adding `-` will strip any whitespace that comes before the block. The second example looks innocent, but the newline and indentation may end up being included in the output, which is probably not what you want!
### Special variables
Inside your template, you will have access to several special variables. The most important of these is `messages`, which contains the chat history as a list of message dicts. However, there are several others. Not every variable will be used in every template. The most common other variables are listed below, and a short sketch combining them follows the list:
- `tools` contains a list of tools in JSON schema format. Will be `None` or undefined if no tools are passed.
- `documents` contains a list of documents in the format `{"title": "Title", "contents": "Contents"}`, used for retrieval-augmented generation. Will be `None` or undefined if no documents are passed.
- `add_generation_prompt` is a bool that is `True` if the user has requested a generation prompt, and `False` otherwise. If this is set, your template should add the header for an assistant message to the end of the conversation. If your model doesn't have a specific header for assistant messages, you can ignore this flag.
- **Special tokens** like `bos_token` and `eos_token`. These are extracted from `tokenizer.special_tokens_map`. The exact tokens available inside each template will differ depending on the parent tokenizer.
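All of these variables are simply forwarded from the arguments of `apply_chat_template`. The following minimal sketch (reusing the `messages`, `tools` and `documents` defined earlier in this page) shows where each one comes from:
```python
prompt = tokenizer.apply_chat_template(
    messages,                     # available inside the template as `messages`
    tools=tools,                  # available as `tools` (a list of JSON schemas)
    documents=documents,          # available as `documents`
    add_generation_prompt=True,   # available as `add_generation_prompt`
    tokenize=False,
)
print(prompt)
```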
<Tip>
You can actually pass any `kwarg` to `apply_chat_template`, and it will be accessible inside the template as a variable. In general, we recommend trying to stick to the core variables above, as it will make your model harder to use if users have to write custom code to pass model-specific `kwargs`. However, we're aware that this field moves quickly, so if you have a new use case that doesn't fit in the core API, feel free to use a new `kwarg` for it! If a new `kwarg` becomes common, we may promote it into the core API and create a standard, documented format for it.
</Tip>
### Callable functions
There is also a short list of callable functions available to you inside your templates (a small usage sketch follows the list). These are:
- `raise_exception(msg)`: Raises a `TemplateException`. This is useful for debugging, and for telling users when they're doing something that your template doesn't support.
- `strftime_now(format_str)`: Equivalent to `datetime.now().strftime(format_str)` in Python. This is used for getting the current date/time in a specific format, which is sometimes included in system messages.
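As a toy illustration (a sketch only - the template below is deliberately minimal and is passed as a string rather than saved on the tokenizer):
```python
# A tiny template that stamps today's date into the prompt and refuses tool messages
demo_template = (
    "{{- 'Today is ' + strftime_now('%Y-%m-%d') + '.\n' }}"
    "{%- for message in messages %}"
    "{%- if message['role'] == 'tool' %}"
    "{{- raise_exception('This template does not support tool messages') }}"
    "{%- endif %}"
    "{{- message['role'] + ': ' + message['content'] + '\n' }}"
    "{%- endfor %}"
)
print(tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hi!"}], chat_template=demo_template, tokenize=False
))
```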
### Compatibility with non-Python Jinja
There are multiple implementations of Jinja in various languages. They generally have the same syntax, but a key difference is that when you're writing a template in Python you can use Python methods, such as `.lower()` on strings or `.items()` on dicts. This will break if someone tries to use your template on a non-Python implementation of Jinja. Non-Python implementations are particularly common in deployment environments, where JS and Rust are very popular.
Don't panic, though! There are a few easy changes you can make to your templates to ensure they're compatible across all implementations of Jinja (a short example follows this list):
- Replace Python methods with Jinja filters. These usually have the same name; for example, `string.lower()` becomes `string|lower`, and `dict.items()` becomes `dict|items`. One notable change is that `string.strip()` becomes `string|trim`. See the [list of built-in filters](https://jinja.palletsprojects.com/en/3.1.x/templates/#builtin-filters) in the Jinja documentation for more.
- Replace `True`, `False` and `None`, which are Python-specific, with `true`, `false` and `none`.
- Directly rendering a dict or a list may give different results in other implementations (for example, string entries might change from single-quoted to double-quoted). Adding the `tojson` filter can help to ensure consistency here.
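For instance, a portable version of a simple template might look like this (a sketch that uses only filters and lowercase literals, so it should behave the same under Python and non-Python Jinja implementations; it can be set as `tokenizer.chat_template` or passed via the `chat_template` argument):
```python
portable_template = (
    "{%- for message in messages %}"
    "{{- message['role']|lower + ': ' + message['content']|trim + '\n' }}"
    "{%- endfor %}"
    "{%- if tools %}"
    "{{- tools|tojson + '\n' }}"
    "{%- endif %}"
)
```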
## Writing generation prompts
We mentioned above that `add_generation_prompt` is a special variable that will be accessible inside your template, and is controlled by the user setting the `add_generation_prompt` flag. If your model expects a header for assistant messages, then your template must support adding the header when `add_generation_prompt` is set.
Here is an example of a template that formats messages ChatML-style, with generation prompt support:
```text
{{- bos_token }}
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
```
The exact content of the assistant header will depend on your specific model, but it should always be the string that represents the start of an assistant message, so that if the user applies your template with `add_generation_prompt=True` and then generates text, the model will write an assistant response. Also note that some models do not need a generation prompt, because assistant messages always begin immediately after user messages. This is particularly common for LLaMA and Mistral models, where assistant messages begin immediately after the `[/INST]` token that ends user messages. In these cases, the template can ignore the `add_generation_prompt` flag.
Generation prompts are important! If your model requires a generation prompt but it is not set in the template, then model generations will likely be severely degraded, or the model may display unusual behaviour like continuing the final user message!
### Writing and debugging larger templates
When this feature was introduced, most templates were quite small, the Jinja equivalent of a "one-liner" script. However, with new models and features like tool use and RAG, some templates can be 100 lines long or more. When writing templates like these, it's a good idea to write them in a separate file, using a text editor. You can easily extract a chat template to a file:
```python
open("template.jinja", "w").write(tokenizer.chat_template)
```
Or load the edited template back into the tokenizer:
```python
tokenizer.chat_template = open("template.jinja").read()
```
As an added bonus, when you write a long, multi-line template in a separate file, line numbers in that file will exactly correspond to line numbers in template parsing or execution errors. This will make it much easier to identify the source of issues.
### Writing templates for tools
Although chat templates do not enforce a specific API for tools (or for anything, really), we recommend template authors try to stick to a standard API where possible. The whole point of chat templates is to allow code to be transferable across models, so deviating from the standard tools API means users will have to write custom code to use tools with your model. Sometimes it's unavoidable, but often with clever templating you can make the standard API work!
Below, we'll list the elements of the standard API, and give tips on writing templates that will work well with it.
#### Tool definitions
Your template should expect that the variable `tools` will either be null (if no tools are passed), or a list of JSON schema dicts. Our chat template methods allow users to pass tools either as JSON schemas or as Python functions, but when functions are passed, we automatically generate JSON schemas and pass those to your template. As a result, the `tools` variable that your template receives will always be a list of JSON schemas. Here is a sample tool JSON schema:
```json
{
"type": "function",
"function": {
"name": "multiply",
"description": "╪п╪з┘Д╪й ╪к╪╢╪▒╪и ╪╣╪п╪п┘К┘Ж",
"parameters": {
"type": "object",
"properties": {
"a": {
"type": "number",
"description": "╪з┘Д╪▒┘В┘Е ╪з┘Д╪г┘И┘Д ┘Д┘Д╪╢╪▒╪и"
},
"b": {
"type": "number",
"description": "╪з┘Д╪▒┘В┘Е ╪з┘Д╪л╪з┘Ж┘К ┘Д┘Д╪╢╪▒╪и"
}
},
"required": ["a", "b"]
}
}
}
```
And here is some example code for handling tools in your chat template. Remember, this is just an example for a specific format - your model will probably need a different format!
```text
{%- if tools %}
{%- for tool in tools %}
{{- '<tool>' + tool['function']['name'] + '\n' }}
{%- for argument in tool['function']['parameters']['properties'] %}
{{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
{%- endfor %}
{{- '\n</tool>' }}
{%- endif %}
{%- endif %}
```
The specific tokens and tool descriptions your template renders should of course be chosen to match the ones your model was trained with. There is no requirement that your **model** understands JSON schema input, only that your template can translate JSON schema into your model's format. For example, Command-R was trained with tools defined using Python function headers, but the Command-R tool template accepts JSON schema, converts types internally, and renders the input tools as Python headers. You can do a lot with templates!
#### Tool calls
Tool calls, if present, will be a list attached to a message with the "assistant" role. Note that `tool_calls` is always a list, even though most tool-calling models only support single tool calls at a time, which means the list will usually only have a single element. Here is a sample message dict containing a tool call:
```json
{
"role": "assistant",
"tool_calls": [
{
"type": "function",
"function": {
"name": "multiply",
"arguments": {
"a": 5,
"b": 6
}
}
}
]
}
```
And a common pattern for handling them would be something like this:
```text
{%- if message['role'] == 'assistant' and 'tool_calls' in message %}
{%- for tool_call in message['tool_calls'] %}
{{- '<tool_call>' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n</tool_call>' }}
{%- endfor %}
{%- endif %}
```
Again, you should render the tool call with the formatting and special tokens that your model expects.
#### Tool responses
Tool responses have a simple format: they are a message dict with the "tool" role, a "name" key giving the name of the function that was called, and a "content" key containing the result of the tool call. Here is a sample tool response:
```json
{
"role": "tool",
"name": "multiply",
"content": "30"
}
```
You don't need to use all of the keys in the tool response. For example, if your model doesn't expect the function name to be included in the tool response, then rendering it can be as simple as:
```text
{%- if message['role'] == 'tool' %}
{{- "<tool_result>" + message['content'] + "</tool_result>" }}
{%- endif %}
```
Again, remember that the actual formatting and special tokens are model-specific - you should take a lot of care to ensure that tokens, whitespace and everything else exactly match the format your model was trained with!


# Create a custom architecture
An [`AutoClass`](model_doc/auto) automatically infers the model architecture and downloads pretrained configuration and weights. Generally, we recommend using an `AutoClass` to produce checkpoint-agnostic code. But users who want more control over specific model parameters can create a custom 🤗 Transformers model from just a few base classes. This could be particularly useful for anyone who is interested in studying, training or experimenting with a 🤗 Transformers model. In this guide, dive deeper into creating a custom model without an `AutoClass`. Learn how to:
- Load and customize a model configuration.
- Create a model architecture.
- Create a slow and fast tokenizer for text.
- Create an image processor for vision tasks.
- Create a feature extractor for audio tasks.
- Create a processor for multimodal tasks.
## Configuration
A [configuration](main_classes/configuration) refers to a model's specific attributes. Each model configuration has different attributes; for instance, all NLP models have the `hidden_size`, `num_attention_heads`, `num_hidden_layers` and `vocab_size` attributes in common. These attributes specify things like the number of attention heads or hidden layers to construct a model with.
Get a closer look at [DistilBERT](model_doc/distilbert) by accessing [`DistilBertConfig`] to inspect its attributes:
```py
>>> from transformers import DistilBertConfig
>>> config = DistilBertConfig()
>>> print(config)
DistilBertConfig {
"activation": "gelu",
"attention_dropout": 0.1,
"dim": 768,
"dropout": 0.1,
"hidden_dim": 3072,
"initializer_range": 0.02,
"max_position_embeddings": 512,
"model_type": "distilbert",
"n_heads": 12,
"n_layers": 6,
"pad_token_id": 0,
"qa_dropout": 0.1,
"seq_classif_dropout": 0.2,
"sinusoidal_pos_embds": false,
"transformers_version": "4.16.2",
"vocab_size": 30522
}
```
[`DistilBertConfig`] displays all the default attributes used to build a base [`DistilBertModel`]. All attributes are customizable, creating space for experimentation. For example, you can customize a default model to:
- Try a different activation function with the `activation` parameter.
- Use a higher dropout ratio for the attention probabilities with the `attention_dropout` parameter.
```py
>>> my_config = DistilBertConfig(activation="relu", attention_dropout=0.4)
>>> print(my_config)
DistilBertConfig {
"activation": "relu",
"attention_dropout": 0.4,
```
Pretrained model attributes can be modified in the [`~PretrainedConfig.from_pretrained`] function:
```py
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
```
Once you are satisfied with your model configuration, you can save it with [`~PretrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory:
```py
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
```
To reuse the configuration file, load it with [`~PretrainedConfig.from_pretrained`]:
```py
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
```
<Tip>
You can also save your configuration file as a dictionary, or even just the difference between your custom configuration attributes and the default configuration attributes! See the [configuration](main_classes/configuration) documentation for more details.
</Tip>
## Model
The next step is to create a [model](main_classes/models). The model - also loosely referred to as the architecture - defines what each layer is doing and what operations are happening. Attributes like `num_hidden_layers` from the configuration are used to define the architecture. Every model shares the base class [`PreTrainedModel`] and a few common methods like resizing input embeddings and pruning self-attention heads. In addition, all models are also either a [`torch.nn.Module`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html), [`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model) or [`flax.linen.Module`](https://flax.readthedocs.io/en/latest/api_reference/flax.linen/module.html) subclass. This means models are compatible with each of their respective framework's usage.
<frameworkcontent>
<pt>
Load your custom configuration attributes into the model:
```py
>>> from transformers import DistilBertModel
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
>>> model = DistilBertModel(my_config)
```
This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful yet until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.
Create a pretrained model with [`~PreTrainedModel.from_pretrained`]:
```py
>>> model = DistilBertModel.from_pretrained("distilbert/distilbert-base-uncased")
```
When you load pretrained weights, the default model configuration is automatically loaded if the model is provided by 🤗 Transformers. However, you can still replace - some or all of - the default model configuration attributes with your own if you'd like:
```py
>>> model = DistilBertModel.from_pretrained("distilbert/distilbert-base-uncased", config=my_config)
```
</pt>
<tf>
Load your custom configuration attributes into the model:
```py
>>> from transformers import TFDistilBertModel
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")
>>> tf_model = TFDistilBertModel(my_config)
```
This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful yet until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.
Create a pretrained model with [`~TFPreTrainedModel.from_pretrained`]:
```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert/distilbert-base-uncased")
```
When you load pretrained weights, the default model configuration is automatically loaded if the model is provided by 🤗 Transformers. However, you can still replace - some or all of - the default model configuration attributes with your own if you'd like:
```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert/distilbert-base-uncased", config=my_config)
```
</tf>
</frameworkcontent>
### Model heads
At this point, you have a base DistilBERT model which outputs the *hidden states*. The hidden states are passed as inputs to a model head to produce the final output. 🤗 Transformers provides a different model head for each task as long as a model supports the task (i.e., you can't use DistilBERT for a sequence-to-sequence task like translation).
<frameworkcontent>
<pt>
For example, [`DistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.
```py
>>> from transformers import DistilBertForSequenceClassification
>>> model = DistilBertForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased")
```
Easily reuse this checkpoint for another task by switching to a different model head. For a question answering task, you would use the [`DistilBertForQuestionAnswering`] model head. The question answering head is similar to the sequence classification head, except it is a linear layer on top of the hidden states output.
```py
>>> from transformers import DistilBertForQuestionAnswering
>>> model = DistilBertForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
</pt>
<tf>
For example, [`TFDistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.
```py
>>> from transformers import TFDistilBertForSequenceClassification
>>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased")
```
Easily reuse this checkpoint for another task by switching to a different model head. For a question answering task, you would use the [`TFDistilBertForQuestionAnswering`] model head. The question answering head is similar to the sequence classification head, except it is a linear layer on top of the hidden states output.
```py
>>> from transformers import TFDistilBertForQuestionAnswering
>>> tf_model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased")
```
</tf>
</frameworkcontent>
## Tokenizer
The last base class you need before using a model for textual data is a [tokenizer](main_classes/tokenizer) to convert raw text to tensors. There are two types of tokenizers you can use with 🤗 Transformers:
- [`PreTrainedTokenizer`]: a Python implementation of a tokenizer.
- [`PreTrainedTokenizerFast`]: a tokenizer from our Rust-based [🤗 Tokenizers](https://huggingface.co/docs/tokenizers/python/latest/) library. This tokenizer type is significantly faster, especially during batch tokenization, thanks to its Rust implementation. The fast tokenizer also offers additional methods like *offset mapping*, which maps tokens to their original words or characters.
Both tokenizers support common methods such as encoding and decoding, adding new tokens, and managing special tokens.
<Tip warning={true}>
Not every model supports a fast tokenizer. Take a look at this [table](index#supported-frameworks) to check whether a model has fast tokenizer support.
</Tip>
If you trained your own tokenizer, you can create one from your *vocabulary* file:
```py
>>> from transformers import DistilBertTokenizer
>>> my_tokenizer = DistilBertTokenizer(vocab_file="my_vocab_file.txt", do_lower_case=False, padding_side="left")
```
It is important to remember that the vocabulary of a custom tokenizer will be different from the vocabulary generated by a pretrained model's tokenizer. You need to use a pretrained model's vocabulary if you are using a pretrained model, otherwise the inputs won't make sense. Create a tokenizer with a pretrained model's vocabulary with the [`DistilBertTokenizer`] class:
```py
>>> from transformers import DistilBertTokenizer
>>> slow_tokenizer = DistilBertTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
```
Create a fast tokenizer with the [`DistilBertTokenizerFast`] class:
```py
>>> from transformers import DistilBertTokenizerFast
>>> fast_tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert/distilbert-base-uncased")
```
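As a quick sanity check (a minimal sketch; either tokenizer works the same way here), you can round-trip a sentence through encoding and decoding:
```py
>>> encoding = fast_tokenizer("Custom architectures are fun!")
>>> tokens = fast_tokenizer.convert_ids_to_tokens(encoding.input_ids)
>>> text = fast_tokenizer.decode(encoding.input_ids)
```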
<Tip>
By default, [`AutoTokenizer`] will try to load a fast tokenizer. You can disable this behavior by setting `use_fast=False` in `from_pretrained`.
</Tip>
## Image processor
An image processor processes vision inputs. It inherits from the base [`~image_processing_utils.ImageProcessingMixin`] class.
To use, create an image processor associated with the model you're using. For example, create a default [`ViTImageProcessor`] if you are using [ViT](model_doc/vit) for image classification:
```py
>>> from transformers import ViTImageProcessor
>>> vit_extractor = ViTImageProcessor()
>>> print(vit_extractor)
ViTImageProcessor {
"do_normalize": true,
"do_resize": true,
"image_processor_type": "ViTImageProcessor",
"image_mean": [
0.5,
0.5,
0.5
],
"image_std": [
0.5,
0.5,
0.5
],
"resample": 2,
"size": 224
}
```
<Tip>
If you aren't looking for any customization, just use the `from_pretrained` method to load a model's default image processor parameters.
</Tip>
Modify any of the [`ViTImageProcessor`] parameters to create your custom image processor:
```py
>>> from transformers import ViTImageProcessor
>>> my_vit_extractor = ViTImageProcessor(resample="PIL.Image.BOX", do_normalize=False, image_mean=[0.3, 0.3, 0.3])
>>> print(my_vit_extractor)
ViTImageProcessor {
"do_normalize": false,
"do_resize": true,
"image_processor_type": "ViTImageProcessor",
"image_mean": [
0.3,
0.3,
0.3
],
"image_std": [
0.5,
0.5,
0.5
],
"resample": "PIL.Image.BOX",
"size": 224
}
```
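As a quick usage sketch (illustrative only - a randomly generated image stands in for a real one, so nothing needs to be downloaded), the default processor created above turns an image into model-ready tensors:
```py
>>> import numpy as np
>>> from PIL import Image
>>> dummy_image = Image.fromarray(np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8))
>>> inputs = vit_extractor(images=dummy_image, return_tensors="pt")
>>> inputs["pixel_values"].shape
torch.Size([1, 3, 224, 224])
```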
## Backbone
<div style="text-align: center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Backbone.png">
</div>
Computer vision models consist of a backbone, a neck, and a head. The backbone extracts features from an input image, the neck combines and enhances the extracted features, and the head is used for the main task (e.g., object detection). Start by initializing a backbone in the model config and specify whether you want to load pretrained weights or randomly initialized weights. Then you can pass the model config to the model head.
For example, to load a [ResNet](../model_doc/resnet) backbone into a [MaskFormer](../model_doc/maskformer) model with an instance segmentation head:
<hfoptions id="backbone">
<hfoption id="pretrained weights">
Set `use_pretrained_backbone=True` to load pretrained ResNet weights for the backbone.
```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="microsoft/resnet-50", use_pretrained_backbone=True)  # backbone and neck config
model = MaskFormerForInstanceSegmentation(config)  # head
```
</hfoption>
<hfoption id="random weights">
Set `use_pretrained_backbone=False` to randomly initialize a ResNet backbone.
```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="microsoft/resnet-50", use_pretrained_backbone=False)  # backbone and neck config
model = MaskFormerForInstanceSegmentation(config)  # head
```
You could also load the backbone config separately and then pass it to the model config.
```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation, ResNetConfig
backbone_config = ResNetConfig()
config = MaskFormerConfig(backbone_config=backbone_config)
model = MaskFormerForInstanceSegmentation(config)
```
</hfoption>
<hfoption id="timm backbone">
[timm](https://hf.co/docs/timm/index) models are loaded within a model with `use_timm_backbone=True` or with [`TimmBackbone`] and [`TimmBackboneConfig`].
Use `use_timm_backbone=True` and `use_pretrained_backbone=True` to load pretrained timm weights for the backbone.
```python
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="resnet50", use_pretrained_backbone=True, use_timm_backbone=True) # ╪к┘Г┘И┘К┘Ж ╪з┘Д╪м╪▓╪б ╪з┘Д╪г╪│╪з╪│┘К ┘И╪з┘Д╪м╪▓╪б ╪з┘Д┘И╪│┘К╪╖
model = MaskFormerForInstanceSegmentation(config) # ╪м╪▓╪б ╪з┘Д┘Е╪╣╪з┘Д╪м╪й ╪з┘Д┘Ж┘З╪з╪ж┘К
```
Set `use_timm_backbone=True` and `use_pretrained_backbone=False` to load a randomly initialized timm backbone.
```python
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone="resnet50", use_pretrained_backbone=False, use_timm_backbone=True) # ╪к┘Г┘И┘К┘Ж ╪з┘Д╪м╪▓╪б ╪з┘Д╪г╪│╪з╪│┘К ┘И╪з┘Д╪м╪▓╪б ╪з┘Д┘И╪│┘К╪╖
model = MaskFormerForInstanceSegmentation(config) # ╪м╪▓╪б ╪з┘Д┘Е╪╣╪з┘Д╪м╪й ╪з┘Д┘Ж┘З╪з╪ж┘К
```
You could also load the backbone config and use it to create a `TimmBackbone`, or pass it to the model config. Timm backbones load pretrained weights by default. Set `use_pretrained_backbone=False` to load randomly initialized weights.
```python
from transformers import TimmBackboneConfig, TimmBackbone
backbone_config = TimmBackboneConfig("resnet50", use_pretrained_backbone=False)
# Create a backbone instance
backbone = TimmBackbone(config=backbone_config)
# Create a model with a timm backbone
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation
config = MaskFormerConfig(backbone_config=backbone_config)
model = MaskFormerForInstanceSegmentation(config)
```
## Feature extractor
A feature extractor processes audio inputs. It inherits from the base [`~feature_extraction_utils.FeatureExtractionMixin`] class, and may also inherit from the [`SequenceFeatureExtractor`] class for processing audio inputs.
To use, create a feature extractor associated with the model you're using. For example, create a default Wav2Vec2 feature extractor if you are using [Wav2Vec2](model_doc/wav2vec2) for audio classification:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> w2v2_extractor = Wav2Vec2FeatureExtractor()
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
"do_normalize": true,
"feature_extractor_type": "Wav2Vec2FeatureExtractor",
"feature_size": 1,
"padding_side": "right",
"padding_value": 0.0,
"return_attention_mask": false,
"sampling_rate": 16000
}
```
<Tip>
If you aren't looking for any customization, just use the `from_pretrained` method to load a model's default feature extractor parameters.
</Tip>
Modify any of the [`Wav2Vec2FeatureExtractor`] parameters to create your custom feature extractor:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> w2v2_extractor = Wav2Vec2FeatureExtractor(sampling_rate=8000, do_normalize=False)
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
"do_normalize": false,
"feature_extractor_type": "Wav2Vec2FeatureExtractor"╪М
"feature_size": 1╪М
"padding_side": "right"╪М
"padding_value": 0.0╪М
"return_attention_mask": false╪М
"sampling_rate": 8000
}
```
## Processor
For models that support multimodal tasks, 🤗 Transformers offers a processor class that conveniently wraps processing classes such as a feature extractor and a tokenizer into a single object. For example, let's use the [`Wav2Vec2Processor`] for an automatic speech recognition (ASR) task. ASR transcribes audio to text, so you will need a feature extractor and a tokenizer.
Create a feature extractor to handle the audio inputs:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> feature_extractor = Wav2Vec2FeatureExtractor(padding_value=1.0, do_normalize=True)
```
Create a tokenizer to handle the text inputs:
```py
>>> from transformers import Wav2Vec2CTCTokenizer
>>> tokenizer = Wav2Vec2CTCTokenizer(vocab_file="my_vocab_file.txt")
```
Combine the feature extractor and tokenizer in [`Wav2Vec2Processor`]:
```py
>>> from transformers import Wav2Vec2Processor
>>> processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)
```
With two basic classes - configuration and model - plus an additional preprocessing class (a tokenizer, image processor, feature extractor, or processor), you can create any of the models supported by 🤗 Transformers. Each of these base classes is configurable, allowing you to use the specific attributes you want. You can easily set up a model for training from scratch or modify an existing pretrained model to fine-tune.
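As a quick, hypothetical illustration of those two paths (the DistilBERT checkpoint and the hyperparameter values below are placeholders, not a recommendation), you could either instantiate a freshly configured model to train from scratch, or load pretrained weights and only swap the task head for fine-tuning:
```py
from transformers import DistilBertConfig, DistilBertForSequenceClassification

# train from scratch: every weight is initialized randomly from your custom configuration
my_config = DistilBertConfig(n_layers=4, n_heads=8)
scratch_model = DistilBertForSequenceClassification(my_config)

# fine-tune: load pretrained weights, only the new classification head is initialized randomly
finetune_model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=3
)
```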


@ -0,0 +1,323 @@
# Building custom models
The 🤗 Transformers library is designed to be easily extensible. Every model is fully coded in a given subfolder of the repository with no abstraction, so you can easily copy a modeling file and tweak it to your needs.
If you are writing a brand new model, it might be easier to start from scratch. In this tutorial, we will show you how to write a custom model and its configuration so it can be used within Transformers, and how you can share it with the community (with the code it relies on) so that anyone can use it, even if it is not present in the 🤗 Transformers library. We will see how to build upon transformers and extend the framework with the hooks and custom code it exposes.
We will illustrate all of this on a ResNet model, by wrapping the ResNet class of the
[timm library](https://github.com/rwightman/pytorch-image-models) into a [`PreTrainedModel`].
## Writing a custom configuration
Let's begin by writing the model configuration. The configuration of a model is an object that holds all the information needed to build the model. As we will see in the next section, the model needs a `config` object to be initialized, so this object really needs to be as complete as possible.
<Tip>
Models in the `transformers` library follow the convention of accepting a `config` object in their `__init__` method, and then passing the whole `config` to the sub-layers of the model, rather than breaking it up into multiple arguments. Writing your model in this style results in simpler code with a clear "source of truth" for any hyperparameter, and makes it easier to reuse code from other models in `transformers`.
</Tip>
In our example, we will take a couple of arguments of the ResNet class that we might want to tweak. Different configurations will then give us the different possible types of ResNets. We simply store those arguments, after checking the validity of a few of them.
```python
from transformers import PretrainedConfig
from typing import List
class ResnetConfig(PretrainedConfig):
model_type = "resnet"
def __init__(
self,
block_type="bottleneck",
layers: List[int] = [3, 4, 6, 3],
num_classes: int = 1000,
input_channels: int = 3,
cardinality: int = 1,
base_width: int = 64,
stem_width: int = 64,
stem_type: str = "",
avg_down: bool = False,
**kwargs,
):
if block_type not in ["basic", "bottleneck"]:
raise ValueError(f"`block_type` must be 'basic' or bottleneck', got {block_type}.")
if stem_type not in ["", "deep", "deep-tiered"]:
raise ValueError(f"`stem_type` must be '', 'deep' or 'deep-tiered', got {stem_type}.")
self.block_type = block_type
self.layers = layers
self.num_classes = num_classes
self.input_channels = input_channels
self.cardinality = cardinality
self.base_width = base_width
self.stem_width = stem_width
self.stem_type = stem_type
self.avg_down = avg_down
super().__init__(**kwargs)
```
The three important things to remember when writing your own configuration are:
- you have to inherit from `PretrainedConfig`,
- the `__init__` of your `PretrainedConfig` must accept any kwargs,
- those kwargs need to be passed to the superclass `__init__`.
The inheritance is to make sure you get all the functionality from the 🤗 Transformers library, while the other two constraints come from the fact that a `PretrainedConfig` has more fields than the ones you are setting. When reloading a config with the `from_pretrained` method, those fields need to be accepted by your config and then sent to the superclass.
Defining a `model_type` for your configuration (here `model_type="resnet"`) is not mandatory, unless you want to register your model with the auto classes (see the last section).
With this done, you can easily create and save your configuration like you would do with any other model config of the library. Here is how we can create a resnet50d config and save it:
```py
resnet50d_config = ResnetConfig(block_type="bottleneck", stem_width=32, stem_type="deep", avg_down=True)
resnet50d_config.save_pretrained("custom-resnet")
```
This will save a file named `config.json` inside the folder `custom-resnet`. You can then reload your config with the `from_pretrained` method:
```py
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
```
You can also use any other method of the [`PretrainedConfig`] class, like [`~PretrainedConfig.push_to_hub`], to directly upload your config to the Hub.
## Writing a custom model
Now that we have our ResNet configuration, we can go on writing the model. We will actually write two: one that extracts the hidden features from a batch of images (like [`BertModel`]) and one that is suitable for image classification (like [`BertForSequenceClassification`]).
As mentioned before, we will only write a loose wrapper of the model to keep this example simple. The only thing we need to do before writing this class is a map between the block types and the actual block classes. Then the model is defined from the configuration by passing everything to the `ResNet` class:
```py
from transformers import PreTrainedModel
from timm.models.resnet import BasicBlock, Bottleneck, ResNet
from .configuration_resnet import ResnetConfig
BLOCK_MAPPING = {"basic": BasicBlock, "bottleneck": Bottleneck}
class ResnetModel(PreTrainedModel):
config_class = ResnetConfig
def __init__(self, config):
super().__init__(config)
block_layer = BLOCK_MAPPING[config.block_type]
self.model = ResNet(
block_layer,
config.layers,
num_classes=config.num_classes,
in_chans=config.input_channels,
cardinality=config.cardinality,
base_width=config.base_width,
stem_width=config.stem_width,
stem_type=config.stem_type,
avg_down=config.avg_down,
)
def forward(self, tensor):
return self.model.forward_features(tensor)
```
For the model that will classify images, we only change the forward method:
```py
import torch
class ResnetModelForImageClassification(PreTrainedModel):
config_class = ResnetConfig
def __init__(self, config):
super().__init__(config)
block_layer = BLOCK_MAPPING[config.block_type]
self.model = ResNet(
block_layer,
config.layers,
num_classes=config.num_classes,
in_chans=config.input_channels,
cardinality=config.cardinality,
base_width=config.base_width,
stem_width=config.stem_width,
stem_type=config.stem_type,
avg_down=config.avg_down,
)
def forward(self, tensor, labels=None):
logits = self.model(tensor)
if labels is not None:
loss = torch.nn.functional.cross_entropy(logits, labels)
return {"loss": loss, "logits": logits}
return {"logits": logits}
```
In both cases, notice how we inherit from `PreTrainedModel` and call the superclass initialization with the `config` (much like when you write a regular `torch.nn.Module`). The line that sets `config_class` is not mandatory, unless you want to register your model with the auto classes (see the last section).
<Tip>
If your model is very similar to a model inside the library, you can re-use the same configuration as that model.
</Tip>
Your model can return anything you want, but returning a dictionary like we did for
`ResnetModelForImageClassification`, with the loss included when labels are passed, will make your model directly usable inside the [`Trainer`] class (a quick sanity check of this output format is sketched after the next code block). Using another output format is fine as long as you are planning on using your own training loop or another library for training.
Now that we have our model class, let's create one:
```py
resnet50d = ResnetModelForImageClassification(resnet50d_config)
```
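As that quick sanity check (the tensors below are purely illustrative dummy data), calling the model with `labels` returns the loss/logits dictionary that [`Trainer`] expects:
```py
import torch

pixel_values = torch.randn(2, 3, 224, 224)  # a fake batch of 2 RGB images
labels = torch.tensor([0, 1])  # made-up class labels

outputs = resnet50d(pixel_values, labels=labels)
print(outputs["loss"], outputs["logits"].shape)  # scalar loss and logits of shape (2, num_classes)
```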
Again, you can use any of the [`PreTrainedModel`] methods, like [`~PreTrainedModel.save_pretrained`] or
[`~PreTrainedModel.push_to_hub`]. We will use the second in the next section, and see how to push the model weights alongside the code of our model. But first, let's load some pretrained weights inside our model.
In your own use case, you will probably be training your custom model on your own data. To go fast in this tutorial,
we will use the pretrained version of resnet50d. Since our model is just a wrapper around it, it's easy to transfer those weights:
```py
import timm
pretrained_model = timm.create_model("resnet50d", pretrained=True)
resnet50d.model.load_state_dict(pretrained_model.state_dict())
```
Now let's see how to make sure that when we do [`~PreTrainedModel.save_pretrained`] or [`~PreTrainedModel.push_to_hub`], the code of the model is saved.
## Registering a model with custom code to the auto classes
If you are writing a library that extends 🤗 Transformers, you may want to extend the auto classes to include your own model. This is different from pushing the code to the Hub in the sense that users will need to import your library to get the custom models (contrarily to automatically downloading the model code from the Hub).
As long as your config has a `model_type` attribute that is different from existing model types, and your model classes have the right `config_class` attributes, you can just add them to the auto classes like this:
```py
from transformers import AutoConfig, AutoModel, AutoModelForImageClassification
AutoConfig.register("resnet", ResnetConfig)
AutoModel.register(ResnetConfig, ResnetModel)
AutoModelForImageClassification.register(ResnetConfig, ResnetModelForImageClassification)
```
Note that the first argument used when registering your custom config to [`AutoConfig`] needs to match the `model_type`
of your custom config, and the first argument used when registering your custom models to any auto model class needs
to match the `config_class` of those models.
## Sending the code to the Hub
<Tip warning={true}>
This API is experimental and may have some slight breaking changes in the next releases.
</Tip>
First, make sure your model is fully defined in a `.py` file. It can rely on relative imports to some other files as long as all the files are in the same directory (we don't support submodules for this feature yet). For our example, we'll define a `modeling_resnet.py` file and a `configuration_resnet.py` file in a folder named `resnet_model` of the current working directory. The configuration file contains the code for `ResnetConfig` and the modeling file contains the code of `ResnetModel` and `ResnetModelForImageClassification`.
```
.
└── resnet_model
    ├── __init__.py
    ├── configuration_resnet.py
    └── modeling_resnet.py
```
The `__init__.py` can be empty; it's just there so that Python detects `resnet_model` can be used as a module.
<Tip warning={true}>
If copying modeling files from the library, you will need to replace all the relative imports at the top of the file
to import from the `transformers` package.
</Tip>
Note that you can re-use (or subclass) an existing configuration/model.
To share your model with the community, follow these steps: first import the ResNet model and config from the newly created files:
```py
from resnet_model.configuration_resnet import ResnetConfig
from resnet_model.modeling_resnet import ResnetModel, ResnetModelForImageClassification
```
Then you have to tell the library you want to copy the code files of those objects when using the `save_pretrained`
method and properly register them with a given auto class (especially for models). Just run:
```py
ResnetConfig.register_for_auto_class()
ResnetModel.register_for_auto_class("AutoModel")
ResnetModelForImageClassification.register_for_auto_class("AutoModelForImageClassification")
```
Note that there is no need to specify an auto class for the configuration (there is only one auto class for them,
[`AutoConfig`]), but it's different for models. Your custom model could be suitable for many different tasks, so you
have to specify which of the auto classes is the correct one for your model.
<Tip>
Use `register_for_auto_class()` if you want the code files to be copied. If you instead prefer to use code on the Hub from another repo,
you don't need to call it. In cases where there is more than one auto class, you can modify the `config.json` directly using the
following structure:
```json
"auto_map": {
"AutoConfig": "<your-repo-name>--<config-name>",
"AutoModel": "<your-repo-name>--<config-name>",
"AutoModelFor<Task>": "<your-repo-name>--<config-name>",
},
```
</Tip>
Next, let's create the config and models as we did before:
```py
resnet50d_config = ResnetConfig(block_type="bottleneck", stem_width=32, stem_type="deep", avg_down=True)
resnet50d = ResnetModelForImageClassification(resnet50d_config)
pretrained_model = timm.create_model("resnet50d", pretrained=True)
resnet50d.model.load_state_dict(pretrained_model.state_dict())
```
Now to send the model to the Hub, make sure you are logged in. Either run in your terminal:
```bash
huggingface-cli login
```
or from a notebook:
```py
from huggingface_hub import notebook_login
notebook_login()
```
You can then push to your own namespace (or an organization you are a member of) like this:
```py
resnet50d.push_to_hub("custom-resnet50d")
```
On top of the modeling weights and the configuration in json format, this also copied the modeling and configuration `.py` files in the folder `custom-resnet50d` and uploaded the result to the Hub. You can check the result in this [model repo](https://huggingface.co/sgugger/custom-resnet50d).
See the [sharing tutorial](model_sharing) for more information on the push-to-Hub method.
### Using a model with custom code
You can use any configuration, model, or tokenizer with custom code files in its repository with the auto classes and the `from_pretrained` method. All files and code uploaded to the Hub are scanned for malware (refer to the [Hub security documentation](https://huggingface.co/docs/hub/security#malware-scanning) for more information), but you should still review the model code and author to avoid executing malicious code on your machine. Set `trust_remote_code=True` to use a model with custom code:
```py
from transformers import AutoModelForImageClassification
model = AutoModelForImageClassification.from_pretrained("sgugger/custom-resnet50d", trust_remote_code=True)
```
It is also strongly encouraged to pass a commit hash as a `revision` to make sure the author of the model did not update the code later with some malicious new lines (unless you fully trust the authors of the model):
```py
commit_hash = "ed94a7c6247d8aedce4647f00f20de6875b5b292"
model = AutoModelForImageClassification.from_pretrained(
"sgugger/custom-resnet50d"╪М trust_remote_code=True╪М revision=commit_hash
)
```
Note that when browsing the commit history of the model repo on the Hub, there is a button to easily copy the commit hash of any commit.


@ -0,0 +1,51 @@
# Use tokenizers from 🤗 Tokenizers
The [`PreTrainedTokenizerFast`] depends on the [🤗 Tokenizers](https://huggingface.co/docs/tokenizers) library. Tokenizers obtained from the 🤗 Tokenizers library can be loaded very simply into 🤗 Transformers.
Before getting into the specifics, let's first start by creating a dummy tokenizer in a few lines:
```python
>>> from tokenizers import Tokenizer
>>> from tokenizers.models import BPE
>>> from tokenizers.trainers import BpeTrainer
>>> from tokenizers.pre_tokenizers import Whitespace
>>> tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
>>> trainer = BpeTrainer(special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"])
>>> tokenizer.pre_tokenizer = Whitespace()
>>> files = [...]
>>> tokenizer.train(files, trainer)
```
We now have a tokenizer trained on the files we defined. We can either continue using it in that runtime, or save it to a JSON file for future re-use.
## Loading directly from the tokenizer object
Let's see how to leverage this tokenizer object in the 🤗 Transformers library. The [`PreTrainedTokenizerFast`] class allows for easy instantiation by accepting the instantiated *tokenizer* object as an argument:
```python
>>> from transformers import PreTrainedTokenizerFast
>>> fast_tokenizer = PreTrainedTokenizerFast(tokenizer_object=tokenizer)
```
This object can now be used with all the methods shared by the 🤗 Transformers tokenizers! Head to the [tokenizer page](main_classes/tokenizer) for more information.
## Loading from a JSON file
In order to load a tokenizer from a JSON file, let's first start by saving our tokenizer:
```python
>>> tokenizer.save("tokenizer.json")
```
The path to which we saved this file can be passed to the [`PreTrainedTokenizerFast`] initialization method using the `tokenizer_file` parameter:
```python
>>> from transformers import PreTrainedTokenizerFast
>>> fast_tokenizer = PreTrainedTokenizerFast(tokenizer_file="tokenizer.json")
```
This object can now be used with all the methods shared by the 🤗 Transformers tokenizers! Head to the [tokenizer page](main_classes/tokenizer) for more information.

89 docs/source/ar/gguf.md Normal file

@ -0,0 +1,89 @@
# GGUF and interaction with Transformers
The GGUF file format is used to store models for inference with [GGML](https://github.com/ggerganov/ggml) and other libraries that depend on it, like the very popular [llama.cpp](https://github.com/ggerganov/llama.cpp) or [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
It is a file format [supported by the Hugging Face Hub](https://huggingface.co/docs/hub/en/gguf) with features allowing for quick inspection of tensors and metadata within the file.
This file format is designed as a "single-file format" where a single file usually contains the configuration attributes, the tokenizer vocabulary and other attributes, as well as all the tensors to be loaded in the model. These files come in different formats according to the quantization type of the file. We briefly go over some of them [here](https://huggingface.co/docs/hub/en/gguf#quantization-types).
## Support within Transformers
We have added the ability to load `gguf` files within `transformers` in order to offer further training/fine-tuning capabilities to gguf models, before converting them back to `gguf` for use within the `ggml` ecosystem. When loading a model, we first dequantize it to fp32 before loading the weights to be used in PyTorch.
> [!NOTE]
> The support is still very exploratory and we welcome contributions in order to solidify it across quantization types and model architectures.
Here are the supported model architectures and quantization types:
### Supported quantization types
The initially supported quantization types are decided according to the popular quantized files that have been shared on the Hub.
- F32
- F16
- BF16
- Q4_0
- Q4_1
- Q5_0
- Q5_1
- Q8_0
- Q2_K
- Q3_K
- Q4_K
- Q5_K
- Q6_K
- IQ1_S
- IQ1_M
- IQ2_XXS
- IQ2_XS
- IQ2_S
- IQ3_XXS
- IQ3_S
- IQ4_XS
- IQ4_NL
> [!NOTE]
> To support gguf dequantization, installing `gguf>=0.10.0` is required.
### Supported model architectures
For now, the supported model architectures are those that have been very popular on the Hub, namely:
- LLaMa
- Mistral
- Qwen2
- Qwen2Moe
- Phi3
- Bloom
- Falcon
- StableLM
- GPT2
- Starcoder2
- T5
## Example usage
In order to load `gguf` files in `transformers`, you should specify the `gguf_file` argument in the `from_pretrained` methods of both tokenizers and models. Here is how one would load a tokenizer and a model, which can be loaded from the exact same file:
```py
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
```
Now you have access to the full, unquantized version of the model in the PyTorch ecosystem, where you can combine it with a plethora of other tools.
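For example, here is a brief sketch (the prompt is arbitrary) of running generation with the dequantized model loaded above:
```py
text = "Hello, my name is"
inputs = tokenizer(text, return_tensors="pt")
# the model now behaves like any other PyTorch causal LM
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```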
In order to convert back to a `gguf` file, we recommend using the [`convert-hf-to-gguf.py`](https://github.com/ggerganov/llama.cpp/blob/master/convert-hf-to-gguf.py) script from llama.cpp.
Here's how you would complete the script above to save the model and export it back to `gguf`:
```py
tokenizer.save_pretrained('directory')
model.save_pretrained('directory')
!python ${path_to_llama_cpp}/convert-hf-to-gguf.py ${directory}
```


@ -28,7 +28,7 @@ picture-in-picture" allowfullscreen></iframe>
```py
>>> model = AutoModel.from_pretrained(
... "julien-c/EsperBERTo-small", revision="v2.0.1" # ╪з╪│┘Е ╪з┘Д╪╣┘Д╪з┘Е╪й╪М ╪г┘И ╪з╪│┘Е ╪з┘Д┘Б╪▒╪╣╪М ╪г┘И ╪к╪м╪▓╪ж╪й ╪з┘Д╪з┘Д╪к╪▓╪з┘Е
... "julien-c/EsperBERTo-small", revision="4c77982" # ╪з╪│┘Е ╪з┘Д╪╣┘Д╪з┘Е╪й╪М ╪г┘И ╪з╪│┘Е ╪з┘Д┘Б╪▒╪╣╪М ╪г┘И ╪к╪м╪▓╪ж╪й ╪з┘Д╪з┘Д╪к╪▓╪з┘Е
... )
```


@ -0,0 +1,160 @@
# Multilingual models for inference
There are several multilingual models in 🤗 Transformers, and their inference usage differs from monolingual models. Not *all* multilingual model usage is different, though. Some models, like [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased), can be used just like a monolingual model. This guide will show you how to use multilingual models whose usage differs for inference.
## XLM
XLM has ten different checkpoints, only one of which is monolingual. The nine remaining model checkpoints can be split into two categories: the checkpoints that use language embeddings and those that don't.
### XLM with language embeddings
The following XLM models use language embeddings to specify the language used at inference:
- `FacebookAI/xlm-mlm-ende-1024` (masked language modeling, English-German)
- `FacebookAI/xlm-mlm-enfr-1024` (masked language modeling, English-French)
- `FacebookAI/xlm-mlm-enro-1024` (masked language modeling, English-Romanian)
- `FacebookAI/xlm-mlm-xnli15-1024` (masked language modeling, XNLI languages)
- `FacebookAI/xlm-mlm-tlm-xnli15-1024` (masked language modeling + translation, XNLI languages)
- `FacebookAI/xlm-clm-enfr-1024` (causal language modeling, English-French)
- `FacebookAI/xlm-clm-ende-1024` (causal language modeling, English-German)
Language embeddings are represented as a tensor of the same shape as the `input_ids` passed to the model. The values in these tensors depend on the language used and are identified by the tokenizer's `lang2id` and `id2lang` attributes.
In this example, load the `FacebookAI/xlm-clm-enfr-1024` checkpoint (causal language modeling, English-French):
```py
>>> import torch
>>> from transformers import XLMTokenizer, XLMWithLMHeadModel
>>> tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-clm-enfr-1024")
>>> model = XLMWithLMHeadModel.from_pretrained("FacebookAI/xlm-clm-enfr-1024")
```
The `lang2id` attribute of the tokenizer displays this model's languages and their ids:
```py
>>> print(tokenizer.lang2id)
{'en': 0, 'fr': 1}
```
Next, create an example input:
```py
>>> input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")]) # batch size of 1
```
Set the language id as `"en"` and use it to define the language embedding. The language embedding is a tensor filled with `0` since that is the language id for English. This tensor should be the same size as `input_ids`.
```py
>>> language_id = tokenizer.lang2id["en"] # 0
>>> langs = torch.tensor([language_id] * input_ids.shape[1]) # torch.tensor([0, 0, 0, ..., 0])
>>> # We reshape it to be of size (batch_size, sequence_length)
>>> langs = langs.view(1, -1) # is now of shape [1, sequence_length] (we have a batch size of 1)
```
Now you can pass the `input_ids` and language embedding to the model:
```py
>>> outputs = model(input_ids, langs=langs)
```
The [run_generation.py](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation/run_generation.py) script can generate text with language embeddings using the `xlm-clm` checkpoints.
### XLM without language embeddings
The following XLM models do not require language embeddings during inference:
- `FacebookAI/xlm-mlm-17-1280` (masked language modeling, 17 languages)
- `FacebookAI/xlm-mlm-100-1280` (masked language modeling, 100 languages)
These models are used for generic sentence representations, unlike the previous XLM checkpoints.
## BERT
The following BERT models can be used for multilingual tasks:
- `google-bert/bert-base-multilingual-uncased` (masked language modeling + next sentence prediction, 102 languages)
- `google-bert/bert-base-multilingual-cased` (masked language modeling + next sentence prediction, 104 languages)
These models do not require language embeddings during inference. They should identify the language from the context and infer accordingly.
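For instance, a minimal sketch (the masked sentence is an arbitrary example) of running the fill-mask pipeline with multilingual BERT, which picks up the language purely from context:
```py
>>> from transformers import pipeline

>>> fill_mask = pipeline("fill-mask", model="google-bert/bert-base-multilingual-cased")
>>> preds = fill_mask("Paris is the [MASK] of France.")
>>> print(preds[0]["token_str"])
```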
## XLM-RoBERTa
The following XLM-RoBERTa models can be used for multilingual tasks:
- `FacebookAI/xlm-roberta-base` (masked language modeling, 100 languages)
- `FacebookAI/xlm-roberta-large` (masked language modeling, 100 languages)
XLM-RoBERTa was trained on 2.5TB of newly created and cleaned CommonCrawl data in 100 languages. It provides strong gains over previously released multilingual models like mBERT or XLM on downstream tasks like classification, sequence labeling, and question answering.
## M2M100
The following M2M100 models can be used for multilingual translation:
- `facebook/m2m100_418M` (Translation)
- `facebook/m2m100_1.2B` (Translation)
In this example, load the `facebook/m2m100_418M` checkpoint to translate text from Chinese to English. You can set the source language in the tokenizer:
```py
>>> from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
>>> en_text = "Do not meddle in the affairs of wizards, for they are subtle and quick to anger."
>>> chinese_text = "ф╕НшжБцПТцЙЛх╖лх╕лчЪДф║ЛхЛЩ, хЫачВ║ф╗ЦхАСцШпх╛охжЩчЪД, х╛Их┐лх░▒цЬГчЩ╝цАТ."
>>> tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M", src_lang="zh")
>>> model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
```
Tokenize the text:
```py
>>> encoded_zh = tokenizer(chinese_text, return_tensors="pt")
```
M2M100 forces the target language id as the first generated token to translate to the target language. Set `forced_bos_token_id` to `en` in the `generate` method to translate to English:
```py
>>> generated_tokens = model.generate(**encoded_zh, forced_bos_token_id=tokenizer.get_lang_id("en"))
>>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
'Do not interfere with the matters of the witches, because they are delicate and will soon be angry.'
```
## MBart
The following MBart models can be used for multilingual translation:
- `facebook/mbart-large-50-one-to-many-mmt` (One-to-many multilingual machine translation, 50 languages)
- `facebook/mbart-large-50-many-to-many-mmt` (Many-to-many multilingual machine translation, 50 languages)
- `facebook/mbart-large-50-many-to-one-mmt` (Many-to-one multilingual machine translation, 50 languages)
- `facebook/mbart-large-50` (Multilingual translation, 50 languages)
- `facebook/mbart-large-cc25`
In this example, load the `facebook/mbart-large-50-many-to-many-mmt` checkpoint to translate text from Finnish to English. You can set the source language in the tokenizer:
```py
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> en_text = "Do not meddle in the affairs of wizards, for they are subtle and quick to anger."
>>> fi_text = "├Дl├д sekaannu velhojen asioihin, sill├д ne ovat hienovaraisia ja nopeasti vihaisia."
>>> tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-50-many-to-many-mmt", src_lang="fi_FI")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
```
Tokenize the text:
```py
>>> encoded_fi = tokenizer(fi_text, return_tensors="pt")
```
MBart forces the target language id as the first generated token to translate to the target language. Set `forced_bos_token_id` to `en` in the `generate` method to translate to English:
```py
>>> generated_tokens = model.generate(**encoded_fi, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
>>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
"Don't interfere with the wizard's affairs, because they are subtle, will soon get angry."
```
If you are using the `facebook/mbart-large-50-many-to-one-mmt` checkpoint, you don't need to force the target language id as the first generated token; otherwise the usage is the same.
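For example, a minimal sketch of that many-to-one usage, reusing the Finnish sentence from above:
```py
>>> tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-50-many-to-one-mmt", src_lang="fi_FI")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")
>>> generated_tokens = model.generate(**tokenizer(fi_text, return_tensors="pt"))
>>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
```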


@ -0,0 +1,8 @@
# Run training on Amazon SageMaker
The documentation has been moved to [hf.co/docs/sagemaker](https://huggingface.co/docs/sagemaker). This page will be removed in `transformers` 5.0.
### Table of contents
- [Train Hugging Face models on Amazon SageMaker with the SageMaker Python SDK](https://huggingface.co/docs/sagemaker/train)
- [Deploy Hugging Face models on Amazon SageMaker with the SageMaker Python SDK](https://huggingface.co/docs/sagemaker/inference)


@ -0,0 +1,170 @@
# Export to ONNX
Deploying 🤗 Transformers models in production environments often requires, or can benefit from, exporting the models into a serialized format that can be loaded and executed on specialized runtimes and hardware.
🤗 Optimum is an extension of Transformers that enables exporting models from PyTorch or TensorFlow to serialized formats such as ONNX and TFLite through its `exporters` module. 🤗 Optimum also provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency.
This guide demonstrates how you can export 🤗 Transformers models to ONNX with 🤗 Optimum; for the guide on exporting models to TFLite, please refer to the [Export to TFLite page](tflite).
## Export to ONNX
[ONNX (Open Neural Network Exchange)](http://onnx.ai) is an open standard that defines a common set of operators and a common file format to represent deep learning models in a wide variety of frameworks, including PyTorch and TensorFlow. When a model is exported to the ONNX format, these operators are used to construct a computational graph (often called an _intermediate representation_) which represents the flow of data through the neural network.
By exposing a graph with standardized operators and data types, ONNX makes it easy to switch between frameworks. For example, a model trained in PyTorch can be exported to ONNX format and then imported in TensorFlow (and vice versa).
Once exported to the ONNX format, a model can be:
- optimized for inference via techniques such as [graph optimization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/optimization) and [quantization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/quantization).
- run with ONNX Runtime via the [`ORTModelForXXX` classes](https://huggingface.co/docs/optimum/onnxruntime/package_reference/modeling_ort), which follow the same `AutoModel` API as the one you are used to in 🤗 Transformers.
- run with [optimized inference pipelines](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/pipelines), which have the same API as the [`pipeline`] function in 🤗 Transformers.
🤗 Optimum provides support for the ONNX export by leveraging configuration objects. These configuration objects come ready-made for a number of model architectures, and are designed to be easily extendable to other architectures.
For the list of ready-made configurations, please refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/overview).
There are two ways to export a 🤗 Transformers model to ONNX; here we show both:
- export with 🤗 Optimum via the CLI.
- export with 🤗 Optimum with `optimum.onnxruntime`.
### Exporting a 🤗 Transformers model to ONNX with the CLI
To export a 🤗 Transformers model to ONNX, first install an extra dependency:
```bash
pip install optimum[exporters]
```
To check out all available arguments, refer to the [🤗 Optimum docs](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli), or view the help in the command line:
```bash
optimum-cli export onnx --help
```
To export a model's checkpoint from the 🤗 Hub, for example, `distilbert/distilbert-base-uncased-distilled-squad`, run the following command:
```bash
optimum-cli export onnx --model distilbert/distilbert-base-uncased-distilled-squad distilbert_base_uncased_squad_onnx/
```
You should see logs indicating progress and showing where the resulting `model.onnx` is saved, like this:
```bash
Validating ONNX model distilbert_base_uncased_squad_onnx/model.onnx...
-[✓] ONNX model output names match reference model (start_logits, end_logits)
- Validating ONNX Model output "start_logits":
-[✓] (2, 16) matches (2, 16)
-[✓] all values close (atol: 0.0001)
- Validating ONNX Model output "end_logits":
-[✓] (2, 16) matches (2, 16)
-[✓] all values close (atol: 0.0001)
The ONNX export succeeded and the exported model was saved at: distilbert_base_uncased_squad_onnx
```
The example above illustrates exporting a checkpoint from the 🤗 Hub. When exporting a local model, first make sure that you saved both the model's weights and tokenizer files in the same directory (`local_path`). When using the CLI, pass the `local_path` to the `model` argument instead of the checkpoint name on the 🤗 Hub and provide the `--task` argument. You can review the list of supported tasks in the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/task_manager). If the `task` argument is not provided, it will default to the model architecture without any task-specific head.
```bash
optimum-cli export onnx --model local_path --task question-answering distilbert_base_uncased_squad_onnx/
```
The resulting `model.onnx` file can then be run on one of the [many accelerators](https://onnx.ai/supported-tools.html#deployModel) that support the ONNX standard. For example, we can load and run the model with [ONNX Runtime](https://onnxruntime.ai/) as follows:
```python
>>> from transformers import AutoTokenizer
>>> from optimum.onnxruntime import ORTModelForQuestionAnswering
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> model = ORTModelForQuestionAnswering.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> inputs = tokenizer("What am I using?", "Using DistilBERT with ONNX Runtime!", return_tensors="pt")
>>> outputs = model(**inputs)
```
The process is identical for TensorFlow checkpoints on the Hub. For instance, here's how you would export a pure TensorFlow checkpoint from the [Keras organization](https://huggingface.co/keras-io):
```bash
optimum-cli export onnx --model keras-io/transformers-qa distilbert_base_cased_squad_onnx/
```
### Exporting a 🤗 Transformers model to ONNX with `optimum.onnxruntime`
As an alternative to the CLI, you can export a 🤗 Transformers model to ONNX programmatically like so:
```python
>>> from optimum.onnxruntime import ORTModelForSequenceClassification
>>> from transformers import AutoTokenizer
>>> model_checkpoint = "distilbert_base_uncased_squad"
>>> save_directory = "onnx/"
>>> # Load a model from transformers and export it to ONNX
>>> ort_model = ORTModelForSequenceClassification.from_pretrained(model_checkpoint, export=True)
>>> tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
>>> # Save the onnx model and tokenizer
>>> ort_model.save_pretrained(save_directory)
>>> tokenizer.save_pretrained(save_directory)
```
### Exporting a model for an unsupported architecture
If you wish to contribute by adding support for a model that cannot currently be exported, you should first check whether it is supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview), and if it is not, [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute) directly.
### Exporting a model with `transformers.onnx`
<Tip warning={true}>
`transformers.onnx` is no longer maintained; please export models with 🤗 Optimum as described above. This section will be removed in future versions.
</Tip>
To export a 🤗 Transformers model to ONNX with `transformers.onnx`, install the extra dependencies:
```bash
pip install transformers[onnx]
```
Use the `transformers.onnx` package as a Python module to export a checkpoint using a ready-made configuration:
```bash
python -m transformers.onnx --model=distilbert/distilbert-base-uncased onnx/
```
This exports an ONNX graph of the checkpoint defined by the `--model` argument. Pass any checkpoint on the 🤗 Hub or one that's stored locally.
The resulting `model.onnx` file can then be run on one of the many accelerators that support the ONNX standard. For example, load and run the model with ONNX Runtime as follows:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```
The required output names (such as `["last_hidden_state"]`) can be obtained by taking a look at the ONNX configuration of each model. For example, for DistilBERT we have:
```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```
The process is identical for TensorFlow checkpoints on the Hub. For example, export a pure TensorFlow checkpoint like so:
```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```
To export a model that's stored locally, save the model's weights and tokenizer files in the same directory (e.g. `local-pt-checkpoint`), then export it to ONNX by pointing the `--model` argument of the `transformers.onnx` package to the desired directory:
```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```

40 docs/source/ar/tflite.md Normal file

@ -0,0 +1,40 @@
# Export to TFLite
[TensorFlow Lite](https://www.tensorflow.org/lite/guide) is a lightweight framework for deploying machine learning models on resource-constrained devices, such as mobile phones, embedded systems, and Internet of Things (IoT) devices. TFLite is designed to optimize and run models efficiently on these devices with limited computational power, memory, and power consumption.
A TensorFlow Lite model is represented in a special efficient portable format identified by the `.tflite` file extension.
🤗 Optimum offers functionality to export 🤗 Transformers models to TFLite through the `exporters.tflite` module. For the list of supported model architectures, please refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/tflite/overview).
To export a model to TFLite, install the required dependencies:
```bash
pip install optimum[exporters-tf]
```
To check out all available arguments, refer to the [🤗 Optimum docs](https://huggingface.co/docs/optimum/main/en/exporters/tflite/usage_guides/export_a_model), or view the help in the command line:
```bash
optimum-cli export tflite --help
```
To export a model's checkpoint from the 🤗 Hub, for example, `google-bert/bert-base-uncased`, run the following command:
```bash
optimum-cli export tflite --model google-bert/bert-base-uncased --sequence_length 128 bert_tflite/
```
You should see logs indicating progress and showing where the resulting `model.tflite` is saved, like this:
```bash
Validating TFLite model...
-[✓] TFLite model output names match reference model (logits)
- Validating TFLite Model output "logits":
-[✓] (1, 128, 30522) matches (1, 128, 30522)
-[x] values not close enough, max diff: 5.817413330078125e-05 (atol: 1e-05)
The TensorFlow Lite export succeeded with the warning: The maximum absolute difference between the output of the reference model and the TFLite exported model is not within the set tolerance 1e-05:
- logits: max diff = 5.817413330078125e-05.
The exported model was saved at: bert_tflite
```
The example above illustrates exporting a checkpoint from the 🤗 Hub. When exporting a local model, first make sure that you saved both the model's weights and tokenizer files in the same directory (`local_path`). When using the CLI, pass the `local_path` to the `model` argument instead of the checkpoint name on the 🤗 Hub.


@ -0,0 +1,154 @@
# Export to TorchScript
<Tip>
This is the very beginning of our experiments with TorchScript and we are still exploring its capabilities with variable-input-size models. It is a focus of interest to us and we will deepen our analysis in upcoming releases, with more code examples, a more flexible implementation, and benchmarks comparing Python-based code with compiled TorchScript.
</Tip>
┘И┘Б┘В┘Л╪з ┘Д┘А [┘И╪л╪з╪ж┘В TorchScript](https://pytorch.org/docs/stable/jit.html):
> TorchScript ┘З┘К ╪╖╪▒┘К┘В╪й ┘Д╪е┘Ж╪┤╪з╪б ┘Ж┘Е╪з╪░╪м ┘В╪з╪и┘Д╪й ┘Д┘Д╪к╪│┘Д╪│┘Д ┘И╪з┘Д╪к╪н╪│┘К┘Ж ┘Е┘Ж ╪к╪╣┘Д┘К┘Е╪з╪к PyTorch ╪з┘Д╪и╪▒┘Е╪м┘К╪й.
┘З┘Ж╪з┘Г ┘И╪н╪п╪к╪з┘Ж ┘Е┘Ж PyTorch╪М [JIT and TRACE](https://pytorch.org/docs/stable/jit.html)╪М ╪к╪к┘К╪н╪з┘Ж ┘Д┘Д┘Е╪╖┘И╪▒┘К┘Ж ╪к╪╡╪п┘К╪▒ ┘Ж┘Е╪з╪░╪м┘З┘Е ┘Д╪е╪╣╪з╪п╪й ╪з╪│╪к╪о╪п╪з┘Е┘З╪з ┘Б┘К ╪и╪▒╪з┘Е╪м ╪г╪о╪▒┘Й ┘Е╪л┘Д ╪и╪▒╪з┘Е╪м C++ ╪з┘Д┘Е┘П╪н╪│┘С┘Ж╪й ┘Д┘Д╪г╪п╪з╪б.
┘Ж┘В╪п┘Е ┘И╪з╪м┘З╪й ╪к╪к┘К╪н ┘Д┘Г ╪к╪╡╪п┘К╪▒ ┘Ж┘Е╪з╪░╪м ЁЯдЧ Transformers ╪е┘Д┘Й TorchScript ╪и╪н┘К╪л ┘К┘Е┘Г┘Ж ╪е╪╣╪з╪п╪й ╪з╪│╪к╪о╪п╪з┘Е┘З╪з ┘Б┘К ╪и┘К╪ж╪й ┘Е╪о╪к┘Д┘Б╪й ╪╣┘Ж ╪и╪▒╪з┘Е╪м Python ╪з┘Д┘В╪з╪ж┘Е╪й ╪е┘Д┘Й PyTorch. ┘З┘Ж╪з ┘Ж╪┤╪▒╪н ┘Г┘К┘Б┘К╪й ╪к╪╡╪п┘К╪▒ ┘Ж┘Е╪з╪░╪м┘Ж╪з ┘И╪з╪│╪к╪о╪п╪з┘Е┘З╪з ╪и╪з╪│╪к╪о╪п╪з┘Е TorchScript.
┘К╪к╪╖┘Д╪и ╪к╪╡╪п┘К╪▒ ┘Ж┘Е┘И╪░╪м ╪г┘Е╪▒┘К┘Ж:
- ╪к┘З┘К╪ж╪й ┘Е╪л┘К┘Д ┘Д┘Д┘Ж┘Е┘И╪░╪м ╪и╪з╪│╪к╪о╪п╪з┘Е ╪╣┘Д╪з┘Е╪й `torchscript`
- ╪к┘Е╪▒┘К╪▒ ┘Е┘П╪п╪о┘Д╪з╪к ┘И┘З┘Е┘К╪й (dummy inputs) ╪о┘Д╪з┘Д ╪з┘Д┘Ж┘Е┘И╪░╪м
╪к┘Ж╪╖┘И┘К ┘З╪░┘З ╪з┘Д╪╢╪▒┘И╪▒╪з╪к ╪╣┘Д┘Й ╪╣╪п╪й ╪г┘Е┘И╪▒ ┘К╪м╪и ╪╣┘Д┘Й ╪з┘Д┘Е╪╖┘И╪▒┘К┘Ж ╪к┘И╪о┘К ╪з┘Д╪н╪░╪▒ ╪и╪┤╪г┘Ж┘З╪з ┘Г┘Е╪з ┘З┘И ┘Е┘Б╪╡┘Д ╪г╪п┘Ж╪з┘З.
## ╪╣┘Д╪з┘Е╪й TorchScript ┘И╪з┘Д╪г┘И╪▓╪з┘Ж ╪з┘Д┘Е╪▒╪к╪и╪╖╪й
╪╣┘Д╪з┘Е╪й `torchscript` ╪╢╪▒┘И╪▒┘К╪й ┘Д╪г┘Ж ┘Е╪╣╪╕┘Е ┘Ж┘Е╪з╪░╪м ╪з┘Д┘Д╪║╪й ЁЯдЧ Transformers ┘Д┘З╪з ╪г┘И╪▓╪з┘Ж ┘Е╪▒╪к╪и╪╖╪й ╪и┘К┘Ж ╪╖╪и┘В╪й `Embedding` ┘И╪╖╪и┘В╪й `Decoding`. ┘Д╪з ┘К╪│┘Е╪н ┘Д┘Г TorchScript ╪и╪к╪╡╪п┘К╪▒ ╪з┘Д┘Ж┘Е╪з╪░╪м ╪░╪з╪к ╪з┘Д╪г┘И╪▓╪з┘Ж ╪з┘Д┘Е╪▒╪к╪и╪╖╪й╪М ┘Д╪░┘Д┘Г ┘Е┘Ж ╪з┘Д╪╢╪▒┘И╪▒┘К ┘Б╪╡┘Д ╪з┘Д╪г┘И╪▓╪з┘Ж ┘И┘Ж╪│╪о┘З╪з ┘Е╪│╪и┘В┘Л╪з.
╪з┘Д┘Ж┘Е╪з╪░╪м ╪з┘Д┘Е┘П┘З┘К╪г╪й ╪и╪з╪│╪к╪о╪п╪з┘Е ╪╣┘Д╪з┘Е╪й `torchscript` ┘Д┘З╪з ╪╖╪и┘В╪й `Embedding` ┘И╪╖╪и┘В╪й`Decoding` ┘Е┘Ж┘Б╪╡┘Д╪к┘К┘Ж╪М ┘Е┘Е╪з ┘К╪╣┘Ж┘К ╪г┘Ж┘З ┘Д╪з ┘К┘Ж╪и╪║┘К ╪к╪п╪▒┘К╪и┘З╪з ┘Д╪з╪н┘В┘Л╪з. ╪│┘К╪д╪п┘К ╪з┘Д╪к╪п╪▒┘К╪и ╪е┘Д┘Й ╪╣╪п┘Е ╪к╪▓╪з┘Е┘Ж ╪з┘Д╪╖╪и┘В╪к┘К┘Ж╪М ┘Е┘Е╪з ┘К╪д╪п┘К ╪е┘Д┘Й ┘Ж╪к╪з╪ж╪м ╪║┘К╪▒ ┘Е╪к┘И┘В╪╣╪й.
┘З╪░╪з ┘Д╪з ┘К┘Ж╪╖╪и┘В ╪╣┘Д┘Й ╪з┘Д┘Ж┘Е╪з╪░╪м ╪з┘Д╪к┘К ┘Д╪з ╪к╪н╪к┘И┘К ╪╣┘Д┘Й ╪▒╪г╪│ ┘Ж┘Е┘И╪░╪м ╪з┘Д┘Д╪║╪й╪М ╪н┘К╪л ┘Д╪з ╪к┘Е┘Д┘Г ╪г┘И╪▓╪з┘Ж┘Л╪з ┘Е╪▒╪к╪и╪╖╪й. ┘К┘Е┘Г┘Ж ╪к╪╡╪п┘К╪▒ ┘З╪░┘З ╪з┘Д┘Ж┘Е╪з╪░╪м ╪и╪г┘Е╪з┘Ж ╪п┘И┘Ж ╪╣┘Д╪з┘Е╪й `torchscript`.
## ╪з┘Д┘Е╪п╪о┘Д╪з╪к ╪з┘Д┘И┘З┘Е┘К╪й ┘И╪з┘Д╪г╪╖┘И╪з┘Д ╪з┘Д┘В┘К╪з╪│┘К╪й
╪к┘П╪│╪к╪о╪п┘Е ╪з┘Д┘Е┘П╪п╪о┘Д╪з╪к ╪з┘Д┘И┘З┘Е┘К╪й ┘Д╪к┘Е╪▒┘К╪▒ ╪г┘Е╪з┘Е┘К ╪о┘Д╪з┘Д ╪з┘Д┘Ж┘Е┘И╪░╪м. ╪г╪л┘Ж╪з╪б ╪з┘Ж╪к╪┤╪з╪▒ ┘В┘К┘Е ╪з┘Д┘Е┘П╪п╪о┘Д╪з╪к ╪╣╪и╪▒ ╪з┘Д╪╖╪и┘В╪з╪к╪М ┘К╪к╪к╪и╪╣ PyTorch ╪з┘Д╪╣┘Е┘Д┘К╪з╪к ╪з┘Д┘Е╪о╪к┘Д┘Б╪й ╪з┘Д╪к┘К ┘К╪к┘Е ╪к┘Ж┘Б┘К╪░┘З╪з ╪╣┘Д┘Й ┘Г┘Д ┘Е╪╡┘Б┘И┘Б╪й(tensor). ╪л┘Е ┘К╪к┘Е ╪з╪│╪к╪о╪п╪з┘Е ┘З╪░┘З ╪з┘Д╪╣┘Е┘Д┘К╪з╪к ╪з┘Д┘Е┘П╪│╪м┘Д╪й ╪и╪╣╪п ╪░┘Д┘Г ┘Д╪е┘Ж╪┤╪з╪б *╪г╪л╪▒* ╪з┘Д┘Ж┘Е┘И╪░╪м.
┘К╪к┘Е ╪е┘Ж╪┤╪з╪б ╪з┘Д╪к╪к╪и╪╣ ╪и╪з┘Д┘Ж╪│╪и╪й ┘Д╪г╪и╪╣╪з╪п ╪з┘Д┘Е┘П╪п╪о┘Д╪з╪к. ┘И╪и╪з┘Д╪к╪з┘Д┘К╪М ┘Б┘З┘И ┘Е┘П┘В┘К┘С╪п ╪и╪г╪и╪╣╪з╪п ╪з┘Д┘Е┘П╪п╪о┘Д╪з╪к ╪з┘Д┘И┘З┘Е┘К╪й╪М ┘И┘Д┘Ж ┘К╪╣┘Е┘Д ┘Д╪г┘К ╪╖┘И┘Д ╪к╪│┘Д╪│┘Д ╪г┘И ╪н╪м┘Е ╪п┘Б╪╣╪й ┘Е╪о╪к┘Д┘Б. ╪╣┘Ж╪п ╪з┘Д┘Е╪н╪з┘И┘Д╪й ╪и╪н╪м┘Е ┘Е╪о╪к┘Д┘Б╪М ┘К╪к┘Е ╪▒┘Б╪╣ ╪з┘Д╪о╪╖╪г ╪з┘Д╪к╪з┘Д┘К:
```
`The expanded size of the tensor (3) must match the existing size (7) at non-singleton dimension 2`
```
We recommend tracing the model with a dummy input size at least as large as the largest input that will be fed to the model during inference. Padding can help fill in the missing values. However, because the model is traced with a larger input size, the matrix dimensions will also be large, resulting in more computation.
Be careful of the total number of operations performed on each input, and follow the performance closely when exporting models with varying sequence lengths.
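As a rough sketch (the 128-token length below is an assumption; pick the largest size you expect at inference), you can let the tokenizer pad the dummy input to a fixed length before tracing:
```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("google-bert/bert-base-uncased")

# pad the dummy input to a fixed length so the trace covers the largest expected input
dummy = tokenizer(
    "Who was Jim Henson? Jim Henson was a puppeteer",
    padding="max_length",
    max_length=128,
    return_tensors="pt",
)
# dummy["input_ids"] and dummy["attention_mask"] both have shape (1, 128)
```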
## Using TorchScript in Python
This section demonstrates how to save and load models, as well as how to use the trace for inference.
### Saving a model
To export a `BertModel` with TorchScript, instantiate `BertModel` from the `BertConfig` class and then save it to disk under the filename `traced_bert.pt`:
```python
from transformers import BertModel, BertTokenizer, BertConfig
import torch
enc = BertTokenizer.from_pretrained("google-bert/bert-base-uncased")
# Tokenizing input text
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = enc.tokenize(text)
# Masking one of the input tokens
masked_index = 8
tokenized_text[masked_index] = "[MASK]"
indexed_tokens = enc.convert_tokens_to_ids(tokenized_text)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
# Creating a dummy input
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])
dummy_input = [tokens_tensor, segments_tensors]
# Initializing the model with the torchscript flag
# Flag set to True even though it is not necessary as this model does not have an LM Head.
config = BertConfig(
vocab_size_or_config_json_file=32000,
hidden_size=768,
num_hidden_layers=12,
num_attention_heads=12,
intermediate_size=3072,
torchscript=True,
)
# Instantiating the model
model = BertModel(config)
# The model needs to be in evaluation mode
model.eval()
# If you are instantiating the model with *from_pretrained* you can also easily set the TorchScript flag
model = BertModel.from_pretrained("google-bert/bert-base-uncased", torchscript=True)
# Creating the trace
traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors])
torch.jit.save(traced_model, "traced_bert.pt")
```
### Loading a model
You can now load the previously saved `BertModel`, `traced_bert.pt`, from disk and use it on the previously initialized `dummy_input`:
```python
loaded_model = torch.jit.load("traced_bert.pt")
loaded_model.eval()
all_encoder_layers, pooled_output = loaded_model(*dummy_input)
```
### Using a traced model for inference
Use the traced model for inference through its `__call__` dunder method:
```python
traced_model(tokens_tensor, segments_tensors)
```
## Deploy Hugging Face TorchScript models to AWS with the Neuron SDK
AWS introduced the [Amazon EC2 Inf1](https://aws.amazon.com/ec2/instance-types/inf1/) instance family for low-cost, high-performance machine learning inference in the cloud. Inf1 instances are powered by the AWS Inferentia chip, a custom-built hardware accelerator specializing in deep learning inference workloads. [AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/#) is the SDK for Inferentia that supports tracing and optimizing transformers models for deployment on Inf1. The Neuron SDK provides:
1. An easy-to-use API with a one-line code change to trace and optimize a TorchScript model for inference in the cloud.
2. Out-of-the-box performance optimizations for [improved cost-performance](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/benchmark/).
3. Support for Hugging Face transformers models built with either [PyTorch](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.html) or [TensorFlow](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html).
### Implications
Transformers models based on the [BERT (Bidirectional Encoder Representations from Transformers)](https://huggingface.co/docs/transformers/main/model_doc/bert) architecture, or its variants such as [distilBERT](https://huggingface.co/docs/transformers/main/model_doc/distilbert) and [roBERTa](https://huggingface.co/docs/transformers/main/model_doc/roberta), run best on Inf1 for non-generative tasks such as extractive question answering, sequence classification, and token classification. However, text generation tasks can still be adapted to run on Inf1 according to this [AWS Neuron MarianMT tutorial](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/transformers-marianmt.html). More information about models that can be converted out of the box on Inferentia can be found in the [Model Architecture Fit](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/models/models-inferentia.html#models-inferentia) section of the Neuron documentation.
### Dependencies
Using AWS Neuron to convert models requires a [Neuron SDK environment](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/index.html#installation-guide), which comes preconfigured on the [AWS Deep Learning AMI](https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-inferentia-launching.html).
### Converting a model for AWS Neuron
Convert a model for AWS NEURON using the same code from [Using TorchScript in Python](torchscript#using-torchscript-in-python) to trace a `BertModel`. Import the `torch.neuron` framework extension to access the components of the Neuron SDK through a Python API:
```python
from transformers import BertModel, BertTokenizer, BertConfig
import torch
import torch.neuron
```
All you need to do is modify the following line:
```diff
- torch.jit.trace(model, [tokens_tensor, segments_tensors])
+ torch.neuron.trace(model, [tokens_tensor, segments_tensors])
```
This enables the Neuron SDK to trace the model and optimize it for Inf1 instances.
To learn more about AWS Neuron SDK features, tools, example tutorials, and the latest updates, please see the [AWS NeuronSDK documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html).

720
docs/source/ar/trainer.md Normal file
View File

@ -0,0 +1,720 @@
# Trainer
The [`Trainer`] class provides a complete training and evaluation loop for PyTorch models implemented in the Transformers library. You only need to pass it the components required for training (a model, a tokenizer, a dataset, an evaluation function, training hyperparameters, etc.), and the [`Trainer`] class takes care of the rest. This makes it easier to start training quickly without writing your own training loop by hand. At the same time, [`Trainer`] is highly customizable and offers many training options, so you can tailor it to your exact training needs.
<Tip>
In addition to the [`Trainer`] class, the Transformers library also provides a [`Seq2SeqTrainer`] class for sequence-to-sequence tasks like translation or summarization. There is also the [`~trl.SFTTrainer`] class from the [TRL](https://hf.co/docs/trl) library, which wraps the [`Trainer`] class and is optimized for training language models like Llama-2 and Mistral with autoregressive techniques. [`~trl.SFTTrainer`] also supports features like sequence packing, LoRA, quantization, and DeepSpeed for efficient training at very large model sizes.
<br>
Feel free to check out the [API reference](./main_classes/trainer) for these other [`Trainer`]-type classes to learn more about when to use which one. In general, [`Trainer`] is the most versatile option and is appropriate for a broad range of tasks. [`Seq2SeqTrainer`] is designed for sequence-to-sequence tasks, and [`~trl.SFTTrainer`] is designed for training large language models.
</Tip>
Before you start, make sure [Accelerate](https://hf.co/docs/accelerate) is installed - a library that enables running PyTorch training in distributed environments.
```bash
pip install accelerate
# upgrade
pip install accelerate --upgrade
```
This guide provides an overview of the [`Trainer`] class.
## Basic usage
[`Trainer`] includes all the code you would find in a basic training loop:
1. perform a training step to calculate the loss
2. calculate the gradients with the [`~accelerate.Accelerator.backward`] method
3. update the weights based on the gradients
4. repeat this process until you have reached a predetermined number of epochs
The [`Trainer`] class abstracts all of this code away so you don't have to worry about manually writing a training loop every time, or if you're just getting started with PyTorch and training. All you need to do is provide the essential components required for training, such as a model and a dataset, and the [`Trainer`] class handles everything else.
If you want to specify any training options or hyperparameters, you can find them in the [`TrainingArguments`] class. For example, let's define where to save the model in `output_dir` and push the model to the Hub after training with `push_to_hub=True`.
```py
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="your-model",
learning_rate=2e-5,
per_device_train_batch_size=16,
per_device_eval_batch_size=16,
num_train_epochs=2,
weight_decay=0.01,
    eval_strategy="epoch",
    save_strategy="epoch",
load_best_model_at_end=True,
push_to_hub=True,
)
```
Pass `training_args` to the [`Trainer`] along with a model, a dataset, something to preprocess the dataset with (depending on your data type, it could be a tokenizer, feature extractor, or image processor), a data collator, and a function to compute the metrics you want to track during training.
Finally, call [`~Trainer.train`] to start training!
```py
from transformers import Trainer
trainer = Trainer(
model=model,
args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
tokenizer=tokenizer,
data_collator=data_collator,
compute_metrics=compute_metrics,
)
trainer.train()
```
### Checkpoints
The [`Trainer`] class saves the model checkpoints to the directory specified in the `output_dir` parameter of [`TrainingArguments`]. You'll find the checkpoints saved in a `checkpoint-000` subfolder, where the numbers at the end correspond to the training step. Saving checkpoints is useful for resuming training later.
```py
# resume from the latest checkpoint
trainer.train(resume_from_checkpoint=True)
# resume from a specific checkpoint saved in the output directory
trainer.train(resume_from_checkpoint="your-model/checkpoint-1000")
```
You can save your own checkpoints (the optimizer state is not saved by default) to the Hub by setting `push_to_hub=True` in [`TrainingArguments`] to commit and push them. Other options for deciding how your checkpoints are saved are set up in the [`hub_strategy`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.hub_strategy) parameter, as in the short sketch after this list:
* `hub_strategy="checkpoint"` pushes the latest checkpoint to a subfolder named "last-checkpoint" from which you can resume training
* `hub_strategy="all_checkpoints"` pushes all checkpoints to the directory defined in `output_dir` (you'll see one checkpoint per folder in your model repository)
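A minimal sketch of wiring this up (the `output_dir` value is just a placeholder):
```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="your-model",
    push_to_hub=True,
    hub_strategy="checkpoint",  # push the latest checkpoint to a "last-checkpoint" subfolder
)
```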
When you resume training from a checkpoint, the [`Trainer`] tries to keep the Python, NumPy, and PyTorch RNG states the same as they were when the checkpoint was saved. But because PyTorch has various non-deterministic default settings, the RNG states aren't guaranteed to be the same. If you want to enable full determinism, take a look at the [Controlling sources of randomness](https://pytorch.org/docs/stable/notes/randomness#controlling-sources-of-randomness) guide to learn what you can enable to make your training fully deterministic. Keep in mind, though, that by making certain settings deterministic, training may be slower.
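As a small illustration, you can at least fix the seeds that Transformers manages before building the [`Trainer`] (full determinism still requires the PyTorch settings linked above):
```py
from transformers import set_seed

set_seed(42)  # seeds the Python, NumPy, and PyTorch RNGs
```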
## Customize the Trainer
While the [`Trainer`] class is designed to be accessible and easy to use, it also offers a lot of customizability for more adventurous users. Many of the [`Trainer`]'s methods can be subclassed and overridden to support the functionality you want, without having to rewrite the entire training loop from scratch to accommodate it. These methods include:
* [`~Trainer.get_train_dataloader`] creates a training DataLoader
* [`~Trainer.get_eval_dataloader`] creates an evaluation DataLoader
* [`~Trainer.get_test_dataloader`] creates a test DataLoader
* [`~Trainer.log`] logs information about the various objects that watch training
* [`~Trainer.create_optimizer_and_scheduler`] creates an optimizer and learning rate scheduler if they weren't passed in `__init__`; these can also be customized separately with [`~Trainer.create_optimizer`] and [`~Trainer.create_scheduler`] respectively
* [`~Trainer.compute_loss`] computes the loss on a batch of training inputs
* [`~Trainer.training_step`] performs the training step
* [`~Trainer.prediction_step`] performs the prediction and test step
* [`~Trainer.evaluate`] evaluates the model and returns the evaluation metrics
* [`~Trainer.predict`] makes predictions (with metrics if labels are available) on the test set
For example, if you want to customize the [`~Trainer.compute_loss`] method to use a weighted loss instead.
```py
import torch
from torch import nn
from transformers import Trainer
class CustomTrainer(Trainer):
def compute_loss(self, model, inputs, return_outputs=False):
labels = inputs.pop("labels")
# forward pass
outputs = model(**inputs)
logits = outputs.get("logits")
# compute custom loss for 3 labels with different weights
loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 3.0], device=model.device))
loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
return (loss, outputs) if return_outputs else loss
```
### Callbacks
Another option for customizing the [`Trainer`] is to use [callbacks](callbacks). Callbacks *do not change* anything in the training loop. They inspect the state of the training loop and then execute some action (early stopping, logging results, etc.) depending on that state. In other words, a callback can't be used to implement something like a custom loss function; for that you need to subclass and override the [`~Trainer.compute_loss`] method.
For example, if you want to add an early stopping callback to the training loop after 10 steps.
```py
from transformers import TrainerCallback
class EarlyStoppingCallback(TrainerCallback):
def __init__(self, num_steps=10):
self.num_steps = num_steps
    def on_step_end(self, args, state, control, **kwargs):
        if state.global_step >= self.num_steps:
            control.should_training_stop = True
        return control
```
Then pass it to the [`Trainer`]'s `callbacks` parameter.
```py
from transformers import Trainer
trainer = Trainer(
model=model,
args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback()],
)
```
## Logging
<Tip>
Check out the [logging](./main_classes/logging) API reference for more information about the different logging levels.
</Tip>
[`Trainer`] is set to `logging.INFO` by default, which reports errors, warnings, and other basic information. A [`Trainer`] replica - in distributed environments - is set to `logging.WARNING`, which only reports errors and warnings. You can change the logging level with the [`log_level`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.log_level) and [`log_level_replica`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.log_level_replica) parameters in [`TrainingArguments`].
To configure the log level setting for each node, use the [`log_on_each_node`](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.TrainingArguments.log_on_each_node) parameter to determine whether to use the log level on each node or only on the main node.
<Tip>
[`Trainer`] sets the log level separately for each node in the [`Trainer.__init__`] method, so you may want to consider setting this earlier if you use other Transformers functionality before creating the [`Trainer`] object.
</Tip>
For example, to set your main code and modules to use the same log level according to each node:
```py
logger = logging.getLogger(__name__)
logging.basicConfig(
    format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
    datefmt="%m/%d/%Y %H:%M:%S",
handlers=[logging.StreamHandler(sys.stdout)],
)
log_level = training_args.get_process_log_level()
logger.setLevel(log_level)
datasets.utils.logging.set_verbosity(log_level)
transformers.utils.logging.set_verbosity(log_level)
trainer = Trainer(...)
```
Use different combinations of `log_level` and `log_level_replica` to configure what gets logged on each of the nodes.
<hfoptions id="logging">
<hfoption id="single node">
```bash
my_app.py ... --log_level warning --log_level_replica error
```
</hfoption>
<hfoption id="multi-node">
Add the `log_on_each_node 0` parameter for multi-node environments.
```bash
my_app.py ... --log_level warning --log_level_replica error --log_on_each_node 0
# set to only report errors
my_app.py ... --log_level error --log_level_replica error --log_on_each_node 0
```
</hfoption>
</hfoptions>
## NEFTune
[NEFTune](https://hf.co/papers/2310.05914) is a technique that can improve performance by adding noise to the embedding vectors during training. To enable it in [`Trainer`], set the `neftune_noise_alpha` parameter in [`TrainingArguments`] to control how much noise is added.
```py
from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(..., neftune_noise_alpha=0.1)
trainer = Trainer(..., args=training_args)
```
NEFTune is disabled after training to restore the original embedding layer and avoid any unexpected behavior.
## Liger Kernel
[Liger-Kernel](https://github.com/linkedin/Liger-Kernel) is a collection of Triton kernels developed by LinkedIn, designed specifically for LLM training. Hugging Face-compatible implementations of RMSNorm, RoPE, SwiGLU, CrossEntropy, and FusedLinearCrossEntropy are available, with more to come. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. The kernels work out of the box with Flash Attention, PyTorch FSDP, and Microsoft DeepSpeed.
Gain +20% throughput and a 60% memory reduction when training LLaMA 3-8B models. Achieve longer context lengths and larger batch sizes. It is also useful if you want to scale up your model to multi-head training or large vocabulary sizes. Unleash multi-head training (medusa) and more. See details and examples in [Liger](https://github.com/linkedin/Liger-Kernel/tree/main/examples).
First, make sure the official Liger repository is installed:
```bash
pip install liger-kernel
```
You must pass `use_liger_kernel=True` to apply the Liger kernel to your model, for example:
```python
from transformers import TrainingArguments
training_args = TrainingArguments(
output_dir="your-model",
learning_rate=2e-5,
per_device_train_batch_size=16,
per_device_eval_batch_size=16,
num_train_epochs=2,
weight_decay=0.01,
eval_strategy="epoch",
save_strategy="epoch",
load_best_model_at_end=True,
push_to_hub=True,
use_liger_kernel=True
)
```
The kernel supports the Llama, Gemma, Mistral, and Mixtral model architectures. The most up-to-date list of supported models can be found [here](https://github.com/linkedin/Liger-Kernel). When `use_liger_kernel` is set to `True`, the corresponding layers in the original model are patched with Liger's efficient implementation, so you don't need to do anything extra other than setting the argument value.
## Optimizers
You can choose a built-in optimizer for training using:
```python
from transformers import TrainingArguments
training_args = TrainingArguments(..., optim="adamw_torch")
```
See [`OptimizerNames`](https://github.com/huggingface/transformers/blob/main/src/transformers/training_args.py) for a full list of choices. We include advanced examples in the sections below.
You can also use an arbitrary PyTorch optimizer via:
```python
import torch
optimizer_cls = torch.optim.AdamW
optimizer_kwargs = {
"lr": 4e-3,
"betas": (0.9, 0.999),
"weight_decay": 0.05,
}
from transformers import Trainer
trainer = Trainer(..., optimizer_cls_and_kwargs=(optimizer_cls, optimizer_kwargs))
```
### GaLore
Gradient Low-Rank Projection (GaLore) is a memory-efficient low-rank training strategy that allows full-parameter learning while being more memory-efficient than common low-rank adaptation methods, such as LoRA.
First, make sure the official GaLore repository is installed:
```bash
pip install galore-torch
```
Then simply add one of `["galore_adamw", "galore_adafactor", "galore_adamw_8bit"]` to `optim`, together with `optim_target_modules`, which can be a list of strings, regexes, or full paths corresponding to the names of the target modules you want to adapt. Below is an end-to-end example script (make sure to `pip install trl datasets`):
```python
import torch
import datasets
import trl
from transformers import TrainingArguments, AutoConfig, AutoTokenizer, AutoModelForCausalLM
train_dataset = datasets.load_dataset('imdb', split='train')
args = TrainingArguments(
    output_dir="./test-galore",
    max_steps=100,
    per_device_train_batch_size=2,
    optim="galore_adamw",
    optim_target_modules=[r".*.attn.*", r".*.mlp.*"]
)
model_id = "google/gemma-2b"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_config(config).to(0)
trainer = trl.SFTTrainer(
model=model,
args=args,
train_dataset=train_dataset,
dataset_text_field='text',
max_seq_length=512,
)
trainer.train()
```
To pass extra arguments supported by GaLore, you should pass `optim_args` correctly, for example:
```python
import torch
import datasets
import trl
from transformers import TrainingArguments, AutoConfig, AutoTokenizer, AutoModelForCausalLM
train_dataset = datasets.load_dataset('imdb', split='train')
args = TrainingArguments(
output_dir="./test-galore",
max_steps=100,
per_device_train_batch_size=2,
optim="galore_adamw",
optim_target_modules=[r".*.attn.*", r".*.mlp.*"],
optim_args="rank=64, update_proj_gap=100, scale=0.10",
)
model_id = "google/gemma-2b"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_config(config).to(0)
trainer = trl.SFTTrainer(
model=model,
args=args,
train_dataset=train_dataset,
dataset_text_field='text',
max_seq_length=512,
)
trainer.train()
```
You can read more about the method in the [original repository](https://github.com/jiaweizzhao/GaLore) or the [paper](https://arxiv.org/abs/2403.03507).
Currently, you can only train Linear layers that are considered GaLore layers; these are trained with low-rank decomposition, while the remaining layers are optimized in the conventional manner.
Note that it will take a bit of time before training starts (~3 minutes for a 2B model on an NVIDIA A100), but training should proceed smoothly afterwards.
You can also perform layer-wise optimization by appending `layerwise` to the optimizer name, as shown below:
```python
import torch
import datasets
import trl
from transformers import TrainingArguments, AutoConfig, AutoTokenizer, AutoModelForCausalLM
train_dataset = datasets.load_dataset('imdb', split='train')
args = TrainingArguments(
    output_dir="./test-galore",
    max_steps=100,
    per_device_train_batch_size=2,
    optim="galore_adamw_layerwise",
    optim_target_modules=[r".*.attn.*", r".*.mlp.*"]
)
model_id = "google/gemma-2b"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_config(config).to(0)
trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=512,
)
trainer.train()
```
Note that layer-wise optimization is somewhat experimental and does not support DDP (Distributed Data Parallel); as a result, you can only run the training script on a single GPU. Please see [this appropriate section](https://github.com/jiaweizzhao/GaLore?tab=readme-ov-file#train-7b-model-with-a-single-gpu-with-24gb-memory) for more details. Other features such as gradient clipping, DeepSpeed, etc., may not be supported out of the box. Please [raise an issue on GitHub](https://github.com/huggingface/transformers/issues) if you encounter such a problem.
### LOMO optimizer
The LOMO optimizers were introduced in [Full Parameter Fine-Tuning for Large Language Models with Limited Resources](https://hf.co/papers/2306.09782) and [AdaLomo: Low-memory Optimization with Adaptive Learning Rate](https://hf.co/papers/2310.10195).
Both are efficient full-parameter fine-tuning methods. The LOMO optimizers fuse the gradient computation and the parameter update into one step to reduce memory usage. The supported LOMO optimizers are `"lomo"` and `"adalomo"`. First install LOMO from pypi with `pip install lomo-optim`, or install it from source with `pip install git+https://github.com/OpenLMLab/LOMO.git`.
<Tip>
According to the authors, it is recommended to use `AdaLomo` without `grad_norm` to get better performance and higher throughput.
</Tip>
Below is a simple script demonstrating how to fine-tune [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the IMDB dataset in full precision:
```python
import torch
import datasets
from transformers import TrainingArguments, AutoTokenizer, AutoModelForCausalLM
import trl
train_dataset = datasets.load_dataset('imdb', split='train')
args = TrainingArguments(
    output_dir="./test-lomo",
    max_steps=100,
    per_device_train_batch_size=4,
    optim="adalomo",
    gradient_checkpointing=True,
    logging_strategy="steps",
    logging_steps=1,
    learning_rate=2e-6,
    save_strategy="no",
    run_name="lomo-imdb",
)
model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, low_cpu_mem_usage=True).to(0)
trainer = trl.SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    dataset_text_field='text',
    max_seq_length=1024,
)
trainer.train()
```
### GrokAdamW optimizer
The GrokAdamW optimizer is designed to enhance training performance and stability, particularly for models that benefit from `grokking` signal functions. To use `GrokAdamW`, first install the optimizer package with `pip install grokadamw`.
<Tip>
GrokAdamW is particularly useful for models that require advanced optimization techniques to achieve better performance and stability.
</Tip>
Below is a simple script demonstrating how to fine-tune [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the IMDB dataset using the GrokAdamW optimizer:
```python
import torch
import datasets
from transformers import TrainingArguments, AutoTokenizer, AutoModelForCausalLM, Trainer
# Load the IMDB dataset
train_dataset = datasets.load_dataset('imdb', split='train')
# Define the training arguments
args = TrainingArguments(
output_dir="./test-grokadamw",
max_steps=1000,
per_device_train_batch_size=4,
optim="grokadamw",
logging_strategy="steps",
logging_steps=1,
learning_rate=2e-5,
save_strategy="no",
run_name="grokadamw-imdb",
)
# Load the model and tokenizer
model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, low_cpu_mem_usage=True).to(0)
# Initialize the Trainer
trainer = Trainer(
model=model,
args=args,
train_dataset=train_dataset,
)
# Train the model
trainer.train()
```
This script demonstrates how to fine-tune the google/gemma-2b model on the IMDB dataset using the GrokAdamW optimizer. The `TrainingArguments` are configured to use GrokAdamW, and the dataset is passed to the `Trainer` for training.
### Schedule Free Optimizer
The Schedule Free optimizers were introduced in [The Road Less Scheduled](https://hf.co/papers/2405.15682).
Schedule-Free learning replaces the momentum of the base optimizer with a combination of averaging and interpolation, completely removing the need to anneal the learning rate with a traditional schedule.
The supported SFO optimizers are `"schedule_free_adamw"` and `"schedule_free_sgd"`. First install `schedulefree` from pypi with `pip install schedulefree`.
Below is a simple script demonstrating how to fine-tune [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the IMDB dataset in full precision:
```python
import torch
import datasets
from transformers import TrainingArguments, AutoTokenizer, AutoModelForCausalLM
import trl
train_dataset = datasets.load_dataset('imdb', split='train')
args = TrainingArguments(
output_dir="./test-schedulefree",
max_steps=1000,
per_device_train_batch_size=4,
optim="schedule_free_adamw",
gradient_checkpointing=True,
logging_strategy="steps",
logging_steps=1,
learning_rate=2e-6,
save_strategy="no",
run_name="sfo-imdb",
)
model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, low_cpu_mem_usage=True).to(0)
trainer = trl.SFTTrainer(
model=model,
args=args,
train_dataset=train_dataset,
dataset_text_field='text',
max_seq_length=1024,
)
trainer.train()
```
## Accelerate and Trainer
The [`Trainer`] class is powered by [Accelerate](https://hf.co/docs/accelerate), a library for easily training PyTorch models in distributed environments, with support for integrations such as [FullyShardedDataParallel (FSDP)](https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/) and [DeepSpeed](https://www.deepspeed.ai/).
<Tip>
Learn more about FSDP sharding strategies, CPU offloading, and more with the [`Trainer`] in the [Fully Sharded Data Parallel guide](fsdp).
</Tip>
To use Accelerate with [`Trainer`], run the [`accelerate.config`](https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-config) command to set up training for your training environment. This command creates a `config_file.yaml` that will be used when you launch your training script. For example, some of the configurations you can set up are:
<hfoptions id="config">
<hfoption id="DistributedDataParallel">
```yml
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
downcast_bf16: 'no'
gpu_ids: all
machine_rank: 0 #change rank as per the node
main_process_ip: 192.168.20.1
main_process_port: 9898
main_training_function: main
mixed_precision: fp16
num_machines: 2
num_processes: 8
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
<hfoption id="FSDP">
```yml
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
fsdp_backward_prefetch_policy: BACKWARD_PRE
fsdp_forward_prefetch: true
fsdp_offload_params: false
fsdp_sharding_strategy: 1
fsdp_state_dict_type: FULL_STATE_DICT
fsdp_sync_module_states: true
fsdp_transformer_layer_cls_to_wrap: BertLayer
fsdp_use_orig_params: true
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
<hfoption id="DeepSpeed">
```yml
compute_environment: LOCAL_MACHINE
deepspeed_config:
deepspeed_config_file: /home/user/configs/ds_zero3_config.json
zero3_init_flag: true
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
<hfoption id="DeepSpeed with Accelerate plugin">
```yml
compute_environment: LOCAL_MACHINE
deepspeed_config:
gradient_accumulation_steps: 1
gradient_clipping: 0.7
offload_optimizer_device: cpu
offload_param_device: cpu
zero3_init_flag: true
zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
</hfoption>
</hfoptions>
The [`accelerate_launch`](https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-launch) command is the recommended way to launch your training script on a distributed system with Accelerate and [`Trainer`], using the parameters specified in `config_file.yaml`. This file is saved to the Accelerate cache folder and is automatically loaded when you run `accelerate_launch`.
For example, to run the [run_glue.py](https://github.com/huggingface/transformers/blob/f4db565b695582891e43a5e042e5d318e28f20b8/examples/pytorch/text-classification/run_glue.py#L4) training script with the FSDP configuration:
```bash
accelerate launch \
./examples/pytorch/text-classification/run_glue.py \
--model_name_or_path google-bert/bert-base-cased \
--task_name $TASK_NAME \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```
You can also specify the parameters from the `config_file.yaml` file directly on the command line:
```bash
accelerate launch --num_processes=2 \
--use_fsdp \
--mixed_precision=bf16 \
--fsdp_auto_wrap_policy=TRANSFORMER_BASED_WRAP \
--fsdp_transformer_layer_cls_to_wrap="BertLayer" \
--fsdp_sharding_strategy=1 \
--fsdp_state_dict_type=FULL_STATE_DICT \
./examples/pytorch/text-classification/run_glue.py \
--model_name_or_path google-bert/bert-base-cased \
--task_name $TASK_NAME \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```
Check out the [Launching your Accelerate scripts](https://huggingface.co/docs/accelerate/basic_tutorials/launch) tutorial to learn more about `accelerate_launch` and custom configurations.

View File

@ -0,0 +1,171 @@
# Troubleshoot
Errors happen sometimes, but we are here to help! This guide covers some of the most common issues we've seen and how you can resolve them. However, this guide isn't meant to be a comprehensive collection of every 🤗 Transformers issue. For more help troubleshooting your issue, try the following:
<Youtube id="S2EEG3JIt2A"/>
1. Ask for help on the [forums](https://discuss.huggingface.co/). There are specific categories you can post your question to, such as [Beginners](https://discuss.huggingface.co/c/beginners/5) or [🤗 Transformers](https://discuss.huggingface.co/c/transformers/9). Make sure you write a good, descriptive forum post with some reproducible code to maximize the likelihood that your problem is solved!
<Youtube id="_PAli-V4wj0"/>
2. Create an [Issue](https://github.com/huggingface/transformers/issues/new/choose) on the 🤗 Transformers repository if the problem is a bug related to the library. Try to include as much information describing the bug as possible to help us better figure out what's wrong and how to fix it.
3. Check the [Migration](migration) guide if you're using an older version of the 🤗 Transformers library, since some important changes have been introduced between versions.
For more details about troubleshooting and getting help, take a look at [Chapter 8](https://huggingface.co/course/chapter8/1?fw=pt) of the Hugging Face course.
## Firewalled environments
Some GPU instances on cloud and intranet setups are firewalled from external connections, resulting in a connection error. When your script attempts to download model weights or datasets, the download will hang and then time out with an error like:
```
ValueError: Connection error, and we cannot find the requested files in the cached path.
Please try again or make sure your Internet connection is on.
```
In this case, you should try running 🤗 Transformers in [offline mode](installation#offline-mode) to avoid the connection error.
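For example (a sketch; `your_script.py` is a placeholder, and the model files must already be in the local cache):
```bash
HF_HUB_OFFLINE=1 TRANSFORMERS_OFFLINE=1 python your_script.py
```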
## CUDA out of memory
Training large models with millions of parameters can be challenging without the appropriate hardware. A common error you may encounter when the GPU runs out of memory is:
```
CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 11.17 GiB total capacity; 9.70 GiB already allocated; 179.81 MiB free; 9.85 GiB reserved in total by PyTorch)
```
Here are some potential solutions you can try to reduce memory use:
- Reduce the [`per_device_train_batch_size`](main_classes/trainer#transformers.TrainingArguments.per_device_train_batch_size) value in [`TrainingArguments`].
- Try using [`gradient_accumulation_steps`](main_classes/trainer#transformers.TrainingArguments.gradient_accumulation_steps) in [`TrainingArguments`] to effectively increase the overall batch size, as in the sketch after this list.
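A minimal sketch combining both options (the numbers are placeholders; 1 x 8 gives an effective per-device batch size of 8):
```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="your-model",
    per_device_train_batch_size=1,  # smaller batches need less GPU memory
    gradient_accumulation_steps=8,  # accumulate gradients to keep the effective batch size
)
```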
<Tip>
Refer to the Performance [guide](performance) for more details about memory-saving techniques.
</Tip>
## Unable to load a saved TensorFlow model
TensorFlow's [model.save](https://www.tensorflow.org/tutorials/keras/save_and_load#save_the_entire_model) method saves the entire model - architecture, weights, training configuration - in a single file. However, when you load the model file again, you may run into an error because the 🤗 Transformers library may not load all the TensorFlow-related objects in the model file. To avoid issues with saving and loading TensorFlow models, we recommend you:
- Save the model weights as an `h5` file with [`model.save_weights`](https://www.tensorflow.org/tutorials/keras/save_and_load#save_the_entire_model) and then reload the model with [`~TFPreTrainedModel.from_pretrained`]:
```python
>>> from transformers import TFPreTrainedModel
>>> from tensorflow import keras
>>> model.save_weights("some_folder/tf_model.h5")
>>> model = TFPreTrainedModel.from_pretrained("some_folder")
```
- Save the model with [`~TFPretrainedModel.save_pretrained`] and load it again with [`~TFPreTrainedModel.from_pretrained`]:
```python
>>> from transformers import TFPreTrainedModel
>>> model.save_pretrained("path_to/model")
>>> model = TFPreTrainedModel.from_pretrained("path_to/model")
```
## ImportError
Another common error you may encounter, especially if it is a newly released model, is `ImportError`:
```
ImportError: cannot import name 'ImageGPTImageProcessor' from 'transformers' (unknown location)
```
For these error types, check that you have the latest version of the Hugging Face Transformers library installed to access the most recent models:
```bash
pip install transformers --upgrade
```
## CUDA error: device-side assert triggered
Sometimes you may run into a generic CUDA error about an error in the device code.
```
RuntimeError: CUDA error: device-side assert triggered
```
You should try running the code on a CPU first to get a more descriptive error message. Add the following environment variable at the beginning of your code to switch to the CPU:
```python
>>> import os
>>> os.environ["CUDA_VISIBLE_DEVICES"] = ""
```
Another option is to get a better traceback from the GPU. Add the following environment variable at the beginning of your code to get a traceback that points to the source of the error:
```python
>>> import os
>>> os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```
## Incorrect output when padding tokens aren't masked
In some cases, the output `hidden_state` may be incorrect if the `input_ids` include padding tokens. To demonstrate, load a model and tokenizer. You can access a model's `pad_token_id` to see its value. The `pad_token_id` may be `None` for some models, but you can always set it manually.
```python
>>> from transformers import AutoModelForSequenceClassification
>>> import torch
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-uncased")
>>> model.config.pad_token_id
0
```
The following example shows the output without masking the padding tokens:
```python
>>> input_ids = torch.tensor([[7592, 2057, 2097, 2393, 9611, 2115], [7592, 0, 0, 0, 0, 0]])
>>> output = model(input_ids)
>>> print(output.logits)
tensor([[ 0.0082, -0.2307],
[ 0.1317, -0.1683]], grad_fn=<AddmmBackward0>)
```
Here is the actual output of the second sequence:
```python
>>> input_ids = torch.tensor([[7592]])
>>> output = model(input_ids)
>>> print(output.logits)
tensor([[-0.1008, -0.4061]], grad_fn=<AddmmBackward0>)
```
Most of the time, you should provide an `attention_mask` to your model to ignore the padding tokens and avoid this silent error. Now the output of the second sequence matches its actual output:
<Tip>
By default, the tokenizer creates the `attention_mask` for you based on your specific tokenizer's defaults.
</Tip>
```python
>>> attention_mask = torch.tensor([[1, 1, 1, 1, 1, 1], [1, 0, 0, 0, 0, 0]])
>>> output = model(input_ids, attention_mask=attention_mask)
>>> print(output.logits)
tensor([[ 0.0082, -0.2307],
[-0.1008, -0.4061]], grad_fn=<AddmmBackward0>)
```
🤗 Transformers doesn't automatically create an `attention_mask` to mask a padding token if one is provided, because:
- Some models don't have a padding token.
- For some use cases, users want the model to attend to a padding token.
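For reference, here is a hedged sketch of how the tokenizer builds that mask for you when you let it do the padding (1 marks real tokens, 0 marks padding):
```py
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> batch = tokenizer(["hello we will help you", "hello"], padding=True, return_tensors="pt")
>>> batch["attention_mask"]  # zeros appear only where the shorter sequence was padded
```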
## ValueError: Unrecognized configuration class XYZ for this kind of AutoModel
In general, we recommend using the [`AutoModel`] class to load pretrained instances of models. This class can automatically infer and load the correct architecture from a given checkpoint based on its configuration. If you see this `ValueError` when loading a model from a checkpoint, it means the Auto class couldn't find a mapping from the configuration in the given checkpoint to the kind of model you're trying to load. Most commonly, this happens when a checkpoint doesn't support a given task.
For instance, you'll see this error in the following example because there is no GPT2 model for question answering:
```py
>>> from transformers import AutoProcessor, AutoModelForQuestionAnswering
>>> processor = AutoProcessor.from_pretrained("openai-community/gpt2-medium")
>>> model = AutoModelForQuestionAnswering.from_pretrained("openai-community/gpt2-medium")
ValueError: Unrecognized configuration class <class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'> for this kind of AutoModel: AutoModelForQuestionAnswering.
Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig, BigBirdPegasusConfig, BloomConfig, ...
```
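As a hedged sketch of the fix, choose an Auto class for a task the checkpoint actually supports - for GPT-2 that is causal language modeling:
```py
>>> from transformers import AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2-medium")
```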

View File

@ -112,7 +112,7 @@ Bevor Sie irgendwelchen Code schreiben, empfehlen wir Ihnen dringend, die besteh
You need basic `git` skills to contribute to 🤗 Transformers. While `git` is not the easiest tool to use, it has a very good manual. Type `git --help` in a shell and enjoy! If you prefer books, [Pro Git](https://git-scm.com/book/en/v2) is a good place to start.
You need **[Python 3.8](https://github.com/huggingface/transformers/blob/main/setup.py#L426)** or higher to contribute to 🤗 Transformers. Follow the steps below to start contributing:
You need **[Python 3.9](https://github.com/huggingface/transformers/blob/main/setup.py#L426)** or higher to contribute to 🤗 Transformers. Follow the steps below to start contributing:
1. Fork the [repository](https://github.com/huggingface/transformers) by clicking the **[Fork](https://github.com/huggingface/transformers/fork)** button on the repository page. This creates a copy of the code under your GitHub account.

View File

@ -43,7 +43,7 @@ Folglich können Sie eine bestimmte Modellversion mit dem Parameter "Revision" l
```py
>>> model = AutoModel.from_pretrained(
... "julien-c/EsperBERTo-small", revision="v2.0.1" # tag name, or branch name, or commit hash
... "julien-c/EsperBERTo-small", revision="4c77982" # tag name, or branch name, or commit hash
... )
```

View File

@ -218,6 +218,8 @@
title: CPU inference
- local: perf_infer_gpu_one
title: GPU inference
- local: perf_infer_gpu_multi
title: Multi-GPU inference
title: Optimizing inference
- local: big_models
title: Instantiate a big model
@ -414,6 +416,8 @@
title: Gemma
- local: model_doc/gemma2
title: Gemma2
- local: model_doc/glm
title: GLM
- local: model_doc/openai-gpt
title: GPT
- local: model_doc/gpt_neo
@ -512,6 +516,8 @@
title: Nyströmformer
- local: model_doc/olmo
title: OLMo
- local: model_doc/olmo_1124
title: OLMo November 2024
- local: model_doc/olmoe
title: OLMoE
- local: model_doc/open-llama
@ -604,6 +610,8 @@
title: XLNet
- local: model_doc/yoso
title: YOSO
- local: model_doc/zamba
title: Zamba
title: Text models
- isExpanded: false
sections:
@ -713,8 +721,6 @@
title: ViTMSN
- local: model_doc/yolos
title: YOLOS
- local: model_doc/zamba
title: Zamba
- local: model_doc/zoedepth
title: ZoeDepth
title: Vision models
@ -740,6 +746,8 @@
title: Mimi
- local: model_doc/mms
title: MMS
- local: model_doc/moshi
title: Moshi
- local: model_doc/musicgen
title: MusicGen
- local: model_doc/musicgen_melody
@ -969,4 +977,4 @@
- local: internal/time_series_utils
title: Utilities for Time Series
title: Internal Helpers
title: API
title: API

View File

@ -332,7 +332,7 @@ This code can quickly be converted into a tool, just by wrapping it in a functio
from transformers import tool
@tool
def model_download_counter(task: str) -> str:
def model_download_tool(task: str) -> str:
"""
This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.
It returns the name of the checkpoint.
@ -345,7 +345,7 @@ def model_download_counter(task: str) -> str:
```
The function needs:
- A clear name. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's put `model_download_counter`.
- A clear name. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's put `model_download_tool`.
- Type hints on both inputs and output
- A description, that includes an 'Args:' part where each argument is described (without a type indication this time, it will be pulled from the type hint).
All these will be automatically baked into the agent's system prompt upon initialization: so strive to make them as clear as possible!
@ -367,7 +367,7 @@ You get the following:
======== New task ========
Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?
==== Agent is executing the code below:
most_downloaded_model = model_download_counter(task="text-to-video")
most_downloaded_model = model_download_tool(task="text-to-video")
print(f"The most downloaded model for the 'text-to-video' task is {most_downloaded_model}.")
====
```

View File

@ -66,10 +66,10 @@ manager_agent.run("Who is the CEO of Hugging Face?")
Let's take again the tool example from the main documentation, for which we had implemented a `tool` decorator.
If you need to add variation, like custom attributes for your too, you can build your tool following the fine-grained method: building a class that inherits from the [`Tool`] superclass.
If you need to add variation, like custom attributes for your tool, you can build your tool following the fine-grained method: building a class that inherits from the [`Tool`] superclass.
The custom tool needs:
- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name is `model_download_counter`.
- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
- An attribute `description` is used to populate the agent's system prompt.
- An `inputs` attribute, which is a dictionary with keys `"type"` and `"description"`. It contains information that helps the Python interpreter make educated choices about the input.
- An `output_type` attribute, which specifies the output type (see the sketch after this list).
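For illustration, here is a minimal sketch of such a class; the `HFModelDownloadsTool` name, the exact `inputs` layout and the `huggingface_hub.list_models` call are illustrative choices rather than a prescribed implementation:
```python
from huggingface_hub import list_models
from transformers import Tool


class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = (
        "This tool returns the most downloaded model of a given task on the Hugging Face Hub. "
        "It returns the name of the checkpoint."
    )
    inputs = {
        "task": {
            "type": "string",
            "description": "the task category (such as text-classification or depth-estimation)",
        }
    }
    output_type = "string"

    def forward(self, task: str):
        # pick the most downloaded checkpoint for the requested task
        most_downloaded = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return most_downloaded.id
```
You can then instantiate the class and call it like a function, just like the decorated version above.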
@ -123,6 +123,54 @@ from transformers import load_tool, CodeAgent
model_download_tool = load_tool("m-ric/hf-model-downloads")
```
### Import a Space as a tool 🚀
You can directly import a Space from the Hub as a tool using the [`Tool.from_space`] method!
You only need to provide the id of the Space on the Hub, its name, and a description that will help your agent understand what the tool does. Under the hood, this will use the [`gradio-client`](https://pypi.org/project/gradio-client/) library to call the Space.
For instance, let's import the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) Space from the Hub and use it to generate an image.
```python
from transformers import Tool
image_generation_tool = Tool.from_space(
"black-forest-labs/FLUX.1-dev",
name="image_generator",
description="Generate an image from a prompt")
image_generation_tool("A sunny beach")
```
And voilà, here's your image! 🏖️
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp">
Then you can use this tool just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit` and generate an image of it.
```python
from transformers import ReactCodeAgent
agent = ReactCodeAgent(tools=[image_generation_tool])
agent.run(
"Improve this prompt, then generate an image of it.", prompt='A rabbit wearing a space suit'
)
```
```text
=== Agent thoughts:
improved_prompt could be "A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background"
Now that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.
>>> Agent is executing the code below:
image = image_generator(prompt="A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background")
final_answer(image)
```
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp">
How cool is this? 🤩
### Use gradio-tools
[gradio-tools](https://github.com/freddyaboulton/gradio-tools) is a powerful library that allows using Hugging
@ -140,36 +188,6 @@ gradio_prompt_generator_tool = StableDiffusionPromptGeneratorTool()
prompt_generator_tool = Tool.from_gradio(gradio_prompt_generator_tool)
```
Now you can use it just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit`.
```python
image_generation_tool = load_tool('huggingface-tools/text-to-image')
agent = CodeAgent(tools=[prompt_generator_tool, image_generation_tool], llm_engine=llm_engine)
agent.run(
"Improve this prompt, then generate an image of it.", prompt='A rabbit wearing a space suit'
)
```
The model adequately leverages the tool:
```text
======== New task ========
Improve this prompt, then generate an image of it.
You have been provided with these initial arguments: {'prompt': 'A rabbit wearing a space suit'}.
==== Agent is executing the code below:
improved_prompt = StableDiffusionPromptGenerator(query=prompt)
while improved_prompt == "QUEUE_FULL":
improved_prompt = StableDiffusionPromptGenerator(query=prompt)
print(f"The improved prompt is {improved_prompt}.")
image = image_generator(prompt=improved_prompt)
====
```
Before finally generating the image:
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png">
> [!WARNING]
> gradio-tools require *textual* inputs and outputs even when working with different modalities like image and audio objects. Image and audio inputs and outputs are currently incompatible.
@ -179,7 +197,7 @@ We love Langchain and think it has a very compelling suite of tools.
To import a tool from LangChain, use the `from_langchain()` method.
Here is how you can use it to recreate the intro's search result using a LangChain web search tool.
This tool will need `pip install google-search-results` to work properly.
```python
from langchain.agents import load_tools
from transformers import Tool, ReactCodeAgent
@ -188,7 +206,7 @@ search_tool = Tool.from_langchain(load_tools(["serpapi"])[0])
agent = ReactCodeAgent(tools=[search_tool])
agent.run("How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?")
agent.run("How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?")
```
## Display your agent run in a cool Gradio interface
@ -240,4 +258,4 @@ with gr.Blocks() as demo:
if __name__ == "__main__":
demo.launch()
```
```

View File

@ -943,6 +943,35 @@ all implementations of Jinja:
- Directly rendering a dict or list may give different results in other implementations (for example, string entries
might change from single-quoted to double-quoted). Adding the `tojson` filter can help to ensure consistency here.
### Writing generation prompts
We mentioned above that `add_generation_prompt` is a special variable that will be accessible inside your template,
and is controlled by the user setting the `add_generation_prompt` flag. If your model expects a header for
assistant messages, then your template must support adding the header when `add_generation_prompt` is set.
Here is an example of a template that formats messages ChatML-style, with generation prompt support:
```text
{{- bos_token }}
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
```
The exact content of the assistant header will depend on your specific model, but it should always be **the string
that represents the start of an assistant message**, so that if the user applies your template with
`add_generation_prompt=True` and then generates text, the model will write an assistant response. Also note that some
models do not need a generation prompt, because assistant messages always begin immediately after user messages.
This is particularly common for LLaMA and Mistral models, where assistant messages begin immediately after the `[/INST]`
token that ends user messages. In these cases, the template can ignore the `add_generation_prompt` flag.
Generation prompts are important! If your model requires a generation prompt but it is not set in the template, then
model generations will likely be severely degraded, or the model may display unusual behaviour like continuing
the final user message!
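To see the effect of the flag, you can render the same conversation with and without it; the checkpoint below is only an example of a chat model whose template supports the flag:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
messages = [{"role": "user", "content": "Hi there!"}]

# without the generation prompt, the rendered text stops after the user turn
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False))

# with it, the assistant header is appended, so generation continues as the assistant
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```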
### Writing and debugging larger templates
When this feature was introduced, most templates were quite small, the Jinja equivalent of a "one-liner" script.

View File

@ -403,7 +403,7 @@ culture, and they allow us to design the'
This guide illustrates the main parameters that enable various decoding strategies. More advanced parameters exist for the
[`generate`] method, which gives you even further control over the [`generate`] method's behavior.
For the complete list of the available parameters, refer to the [API documentation](./main_classes/text_generation.md).
For the complete list of the available parameters, refer to the [API documentation](./main_classes/text_generation).
### Speculative Decoding
@ -416,16 +416,6 @@ Assisted decoding assumes the main and assistant models have the same tokenizer,
Currently, only greedy search and sampling are supported with assisted decoding, and assisted decoding doesn't support batched inputs.
To learn more about assisted decoding, check [this blog post](https://huggingface.co/blog/assisted-generation).
#### Universal Assisted Decoding
Universal Assisted Decoding (UAD) adds support for main and assistant models with different tokenizers.
To use it, simply pass the tokenizers using the `tokenizer` and `assistant_tokenizer` arguments (see below).
Internally, the main model input tokens are re-encoded into assistant model tokens, then candidate tokens are generated in the assistant encoding, which are
in turn re-encoded into main model candidate tokens. Validation then proceeds as explained above.
The re-encoding steps involve decoding token ids into text and then encoding the text using a different tokenizer.
Since re-encoding the tokens may result in tokenization discrepancies, UAD finds the longest common subsequence between the source and target encodings,
to ensure the new tokens include the correct prompt suffix.
To enable assisted decoding, set the `assistant_model` argument with a model.
```python
@ -445,26 +435,6 @@ To enable assisted decoding, set the `assistant_model` argument with a model.
['Alice and Bob are sitting in a bar. Alice is drinking a beer and Bob is drinking a']
```
If the main and assistant models have different tokenizers, use Universal Assisted Decoding.
```python
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> prompt = "Alice and Bob"
>>> checkpoint = "google/gemma-2-9b"
>>> assistant_checkpoint = "double7/vicuna-68m"
>>> assistant_tokenizer = AutoTokenizer.from_pretrained(assistant_checkpoint)
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
>>> inputs = tokenizer(prompt, return_tensors="pt")
>>> model = AutoModelForCausalLM.from_pretrained(checkpoint)
>>> assistant_model = AutoModelForCausalLM.from_pretrained(assistant_checkpoint)
>>> outputs = model.generate(**inputs, assistant_model=assistant_model, tokenizer=tokenizer, assistant_tokenizer=assistant_tokenizer)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Alice and Bob are sitting in a bar. Alice is drinking a beer and Bob is drinking a']
```
When using assisted decoding with sampling methods, you can use the `temperature` argument to control the randomness,
just like in multinomial sampling. However, in assisted decoding, reducing the temperature may help improve the latency.
@ -486,9 +456,63 @@ just like in multinomial sampling. However, in assisted decoding, reducing the t
['Alice and Bob, a couple of friends of mine, who are both in the same office as']
```
#### Universal Assisted Decoding
Universal Assisted Decoding (UAD) adds support for main and assistant models with different tokenizers.
To use it, simply pass the tokenizers using the `tokenizer` and `assistant_tokenizer` arguments (see below).
Internally, the main model input tokens are re-encoded into assistant model tokens, then candidate tokens are generated in the assistant encoding, which are
in turn re-encoded into main model candidate tokens. Validation then proceeds as explained above.
The re-encoding steps involve decoding token ids into text and then encoding the text using a different tokenizer.
Since re-encoding the tokens may result in tokenization discrepancies, UAD finds the longest common subsequence between the source and target encodings,
to ensure the new tokens include the correct prompt suffix.
```python
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> prompt = "Alice and Bob"
>>> checkpoint = "google/gemma-2-9b"
>>> assistant_checkpoint = "double7/vicuna-68m"
>>> assistant_tokenizer = AutoTokenizer.from_pretrained(assistant_checkpoint)
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
>>> inputs = tokenizer(prompt, return_tensors="pt")
>>> model = AutoModelForCausalLM.from_pretrained(checkpoint)
>>> assistant_model = AutoModelForCausalLM.from_pretrained(assistant_checkpoint)
>>> outputs = model.generate(**inputs, assistant_model=assistant_model, tokenizer=tokenizer, assistant_tokenizer=assistant_tokenizer)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Alice and Bob are sitting in a bar. Alice is drinking a beer and Bob is drinking a']
```
#### Prompt Lookup
Alternatively, you can also set the `prompt_lookup_num_tokens` to trigger n-gram based assisted decoding, as opposed
to model based assisted decoding. You can read more about it [here](https://twitter.com/joao_gante/status/1747322413006643259).
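A short sketch of triggering it (the checkpoint and the value of `prompt_lookup_num_tokens` are illustrative):
```python
>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> checkpoint = "EleutherAI/pythia-1.4b-deduped"
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
>>> model = AutoModelForCausalLM.from_pretrained(checkpoint)
>>> inputs = tokenizer("Alice and Bob", return_tensors="pt")

>>> # candidate tokens are looked up as n-grams in the prompt instead of coming from an assistant model
>>> outputs = model.generate(**inputs, prompt_lookup_num_tokens=3, max_new_tokens=20)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
```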
#### Self-Speculative Decoding
An LLM can be trained to also use its language modeling head with earlier hidden states as input, effectively
skipping layers to yield a lower-quality output -- a technique called early exiting.
We use the lower-quality early exit output as an assistant output, and apply self-speculation to fix the output using the remaining layers. The final generation of that self-speculative solution is the same (or has the same distribution) as the original model's generation.
If the model you're using was trained to do early exit, you can pass
`assistant_early_exit` (integer). In this case, the assistant model will be the same model but exiting early, hence the
"self-speculative" name. Because the assistant model is a portion of the target model, caches and weights can be shared, which results in lower memory requirements. As in other assisted generation methods, the final generated result has the same quality as if no assistant had been used.
```python
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> prompt = "Alice and Bob"
>>> checkpoint = "facebook/layerskip-llama3.2-1B"
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
>>> inputs = tokenizer(prompt, return_tensors="pt")
>>> model = AutoModelForCausalLM.from_pretrained(checkpoint)
>>> outputs = model.generate(**inputs, assistant_early_exit=4, do_sample=False, max_new_tokens=20)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Alice and Bob are sitting in a bar. Alice is drinking a beer and Bob is drinking a']
```
### DoLa Decoding
**D**ecoding by C**o**ntrasting **La**yers (DoLa) is a contrastive decoding strategy to improve the factuality and reduce the
@ -508,10 +532,11 @@ See the following examples for DoLa decoding with the 32-layer LLaMA-7B model.
```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed
>>> import torch
>>> from accelerate.test_utils.testing import get_backend
>>> tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
>>> model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b", torch_dtype=torch.float16)
>>> device = 'cuda' if torch.cuda.is_available() else 'cpu'
>>> device, _, _ = get_backend() # automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> model.to(device)
>>> set_seed(42)

View File

@ -85,6 +85,9 @@ For now the supported model architectures are the architectures that have been v
- StableLM
- GPT2
- Starcoder2
- T5
- Mamba
- Nemotron
## Example usage

View File

@ -19,7 +19,7 @@ State-of-the-art Machine Learning for [PyTorch](https://pytorch.org/), [TensorFl
🤗 Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models. Using pretrained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities, such as:
📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.<br>
📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, code generation, summarization, translation, multiple choice, and text generation.<br>
🖼️ **Computer Vision**: image classification, object detection, and segmentation.<br>
🗣️ **Audio**: automatic speech recognition and audio classification.<br>
🐙 **Multimodal**: table question answering, optical character recognition, information extraction from scanned documents, video classification, and visual question answering.
@ -150,6 +150,7 @@ Flax), PyTorch, and/or TensorFlow.
| [Gemma](model_doc/gemma) | ✅ | ❌ | ✅ |
| [Gemma2](model_doc/gemma2) | ✅ | ❌ | ❌ |
| [GIT](model_doc/git) | ✅ | ❌ | ❌ |
| [GLM](model_doc/glm) | ✅ | ❌ | ❌ |
| [GLPN](model_doc/glpn) | ✅ | ❌ | ❌ |
| [GPT Neo](model_doc/gpt_neo) | ✅ | ❌ | ✅ |
| [GPT NeoX](model_doc/gpt_neox) | ✅ | ❌ | ❌ |
@ -223,6 +224,7 @@ Flax), PyTorch, and/or TensorFlow.
| [MobileNetV2](model_doc/mobilenet_v2) | ✅ | ❌ | ❌ |
| [MobileViT](model_doc/mobilevit) | ✅ | ✅ | ❌ |
| [MobileViTV2](model_doc/mobilevitv2) | ✅ | ❌ | ❌ |
| [Moshi](model_doc/moshi) | ✅ | ❌ | ❌ |
| [MPNet](model_doc/mpnet) | ✅ | ✅ | ❌ |
| [MPT](model_doc/mpt) | ✅ | ❌ | ❌ |
| [MRA](model_doc/mra) | ✅ | ❌ | ❌ |
@ -238,6 +240,7 @@ Flax), PyTorch, and/or TensorFlow.
| [Nougat](model_doc/nougat) | ✅ | ✅ | ✅ |
| [Nyströmformer](model_doc/nystromformer) | ✅ | ❌ | ❌ |
| [OLMo](model_doc/olmo) | ✅ | ❌ | ❌ |
| [OLMo November 2024](model_doc/olmo_1124) | ✅ | ❌ | ❌ |
| [OLMoE](model_doc/olmoe) | ✅ | ❌ | ❌ |
| [OmDet-Turbo](model_doc/omdet-turbo) | ✅ | ❌ | ❌ |
| [OneFormer](model_doc/oneformer) | тЬЕ | тЭМ | тЭМ |

View File

@ -185,6 +185,9 @@ generation.
[[autodoc]] SuppressTokensLogitsProcessor
- __call__
[[autodoc]] SynthIDTextWatermarkLogitsProcessor
- __call__
[[autodoc]] TemperatureLogitsWarper
- __call__
@ -418,5 +421,18 @@ A [`Constraint`] can be used to force the generation to include specific tokens
## Watermark Utils
[[autodoc]] WatermarkingConfig
- __call__
[[autodoc]] WatermarkDetector
- __call__
[[autodoc]] BayesianDetectorConfig
[[autodoc]] BayesianDetectorModel
- forward
[[autodoc]] SynthIDTextWatermarkingConfig
[[autodoc]] SynthIDTextWatermarkDetector
- __call__

View File

@ -348,6 +348,99 @@ model = AutoModelForCausalLM.from_pretrained(
)
```
### Fine-Tuning with torch.compile and Padding-Free Data Collation
In addition to optimizing inference, you can also enhance the training efficiency of large language models by leveraging torch.compile during fine-tuning and using a padding-free data collator. This approach can significantly speed up training and reduce computational overhead.
Here's how you can fine-tune a Llama model using SFTTrainer from the TRL library, with torch_compile enabled and a padding-free data collator:
```python
#################### IMPORTS ###################
import math
import datasets
import dataclasses
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
TrainingArguments
)
from trl import SFTConfig, SFTTrainer, DataCollatorForCompletionOnlyLM
#################### MODEL LOADING WITH FLASH ATTENTION ###################
model_name = "meta-llama/Llama-3.2-1B"
model = AutoModelForCausalLM.from_pretrained(
model_name,
attn_implementation="flash_attention_2" # Enables FlashAttention-2
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
#################### DATA PREPROCESSING (PADDING-FREE) ###################
response_template = "\n### Label:"
response_template_ids = tokenizer.encode(
response_template, add_special_tokens=False
)[2:] # Exclude special tokens
data_collator = DataCollatorForCompletionOnlyLM(
response_template_ids=response_template_ids,
tokenizer=tokenizer,
ignore_index=-100,
padding_free=True # Enables padding-free collation
)
def format_dataset(example):
return {
"output": example["output"] + tokenizer.eos_token
}
data_files = {"train": "path/to/dataset"} # Replace with your dataset path
json_dataset = datasets.load_dataset("json", data_files=data_files)
formatted_train_dataset = json_dataset["train"].map(format_dataset)
################# TRAINING CONFIGURATION ############################
train_args = TrainingArguments(
num_train_epochs=5,
per_device_train_batch_size=4,
per_device_eval_batch_size=4,
gradient_accumulation_steps=4,
learning_rate=1e-5,
weight_decay=0.0,
warmup_ratio=0.03,
lr_scheduler_type="cosine",
logging_steps=1,
include_tokens_per_second=True,
save_strategy="epoch",
output_dir="output",
torch_compile=True, # Enables torch.compile
torch_compile_backend="inductor",
torch_compile_mode="default"
)
# Convert TrainingArguments to SFTConfig
transformer_train_arg_fields = [x.name for x in dataclasses.fields(SFTConfig)]
transformer_kwargs = {
k: v
for k, v in train_args.to_dict().items()
if k in transformer_train_arg_fields
}
training_args = SFTConfig(**transformer_kwargs)
####################### FINE-TUNING #####################
trainer = SFTTrainer(
model=model,
tokenizer=tokenizer,
train_dataset=formatted_train_dataset,
data_collator=data_collator,
dataset_text_field="output",
args=training_args,
)
trainer.train()
```
### PyTorch scaled dot product attention
Scaled dot product attention (SDPA) is automatically enabled in PyTorch 2.0 and it supports FlashAttention, xFormers, and PyTorch's C++ implementation. SDPA chooses the most performant attention algorithm if you're using a CUDA backend. For other backends, SDPA defaults to the PyTorch C++ implementation.
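For example, you can explicitly opt in to SDPA when loading a model; this is only a sketch, and the checkpoint name is illustrative:
```python
import torch
from transformers import AutoModelForCausalLM

# attn_implementation="sdpa" requests PyTorch's scaled dot product attention kernel
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",
    torch_dtype=torch.float16,
    attn_implementation="sdpa",
).to("cuda")
```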

View File

@ -18,6 +18,49 @@ rendered properly in your Markdown viewer.
An image processor is in charge of preparing input features for vision models and post processing their outputs. This includes transformations such as resizing, normalization, and conversion to PyTorch, TensorFlow, Flax and Numpy tensors. It may also include model specific post-processing such as converting logits to segmentation masks.
Fast image processors are available for a few models and more will be added in the future. They are based on the [torchvision](https://pytorch.org/vision/stable/index.html) library and provide a significant speed-up, especially when processing on GPU.
They have the same API as the base image processors and can be used as drop-in replacements.
To use a fast image processor, you need to install the `torchvision` library, and set the `use_fast` argument to `True` when instantiating the image processor:
```python
from transformers import AutoImageProcessor
processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50", use_fast=True)
```
When using a fast image processor, you can also set the `device` argument to specify the device on which the processing should be done. By default, the processing is done on the same device as the inputs if the inputs are tensors, or on the CPU otherwise.
```python
from torchvision.io import read_image
from transformers import DetrImageProcessorFast
images = read_image("image.jpg")
processor = DetrImageProcessorFast.from_pretrained("facebook/detr-resnet-50")
images_processed = processor(images, return_tensors="pt", device="cuda")
```
Here are some speed comparisons between the base and fast image processors for the `DETR` and `RT-DETR` models, and how they impact overall inference time:
<div class="flex">
<div>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/benchmark_results_full_pipeline_detr_fast_padded.png" />
</div>
<div>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/benchmark_results_full_pipeline_detr_fast_batched_compiled.png" />
</div>
</div>
<div class="flex">
<div>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/benchmark_results_full_pipeline_rt_detr_fast_single.png" />
</div>
<div>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/benchmark_results_full_pipeline_rt_detr_fast_batched.png" />
</div>
</div>
These benchmarks were run on an [AWS EC2 g5.2xlarge instance](https://aws.amazon.com/ec2/instance-types/g5/), utilizing an NVIDIA A10G Tensor Core GPU.
## ImageProcessingMixin

View File

@ -478,6 +478,12 @@ Pipelines available for multimodal tasks include the following.
- __call__
- all
### ImageTextToTextPipeline
[[autodoc]] ImageTextToTextPipeline
- __call__
- all
### MaskGenerationPipeline
[[autodoc]] MaskGenerationPipeline

View File

@ -41,8 +41,6 @@ like token streaming.
- validate
- get_generation_mode
[[autodoc]] generation.WatermarkingConfig
## GenerationMixin
[[autodoc]] GenerationMixin

View File

@ -51,6 +51,25 @@ token space (e.g., getting the index of the token comprising a given character o
to a given token).
# Multimodal Tokenizer
Apart from that each tokenizer can be a "multimodal" tokenizer which means that the tokenizer will hold all relevant special tokens
as part of tokenizer attributes for easier access. For example, if the tokenizer is loaded from a vision-language model like LLaVA, you will
be able to access `tokenizer.image_token_id` to obtain the special image token used as a placeholder.
To enable extra special tokens for any type of tokenizer, you have to add the following lines and save the tokenizer. Extra special tokens do not
have to be modality related and can be anything that the model often needs access to. In the code below, the tokenizer saved at `output_dir` will have direct access
to three more special tokens.
```python
from transformers import AutoTokenizer

vision_tokenizer = AutoTokenizer.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    extra_special_tokens={"image_token": "<image>", "boi_token": "<image_start>", "eoi_token": "<image_end>"}
)
vision_tokenizer.save_pretrained("output_dir")  # the tokenizer saved at `output_dir` now exposes the extra special tokens
print(vision_tokenizer.image_token, vision_tokenizer.image_token_id)
("<image>", 32000)
```
## PreTrainedTokenizer
[[autodoc]] PreTrainedTokenizer

View File

@ -40,6 +40,10 @@ The original code can be found [here](https://github.com/salesforce/LAVIS/tree/5
- BLIP-2 can be used for conditional text generation given an image and an optional text prompt. At inference time, it's recommended to use the [`generate`] method.
- One can use [`Blip2Processor`] to prepare images for the model, and decode the predicted tokens ID's back to text.
> [!NOTE]
> BLIP models after release v4.46 will raise warnings about adding `processor.num_query_tokens = {{num_query_tokens}}` and expanding the model embeddings layer to add the special `<image>` token. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or to open a PR if it is not owned by you. Adding these attributes means that BLIP will add the number of query tokens required per image and expand the text with as many `<image>` placeholders as there will be query tokens. Usually it is around 500 tokens per image, so make sure that the text is not truncated as otherwise there will be a failure when merging the embeddings.
The attributes can be obtained from model config, as `model.config.num_query_tokens` and model embeddings expansion can be done by following [this link](https://gist.github.com/zucchini-nlp/e9f20b054fa322f84ac9311d9ab67042).
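As a rough, hedged sketch of what adding the attribute could look like for a checkpoint you own (the token registration and embedding resizing shown here are illustrative; follow the linked gist for the exact procedure):
```python
from transformers import Blip2ForConditionalGeneration, Blip2Processor

checkpoint = "Salesforce/blip2-opt-2.7b"  # illustrative checkpoint
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint)
processor = Blip2Processor.from_pretrained(checkpoint)

# expose the number of query tokens on the processor, read from the model config
processor.num_query_tokens = model.config.num_query_tokens

# register the special <image> placeholder token and resize the embeddings accordingly
processor.tokenizer.add_special_tokens({"additional_special_tokens": ["<image>"]})
model.resize_token_embeddings(len(processor.tokenizer), pad_to_multiple_of=64)

processor.save_pretrained("blip2-with-num-query-tokens")
model.save_pretrained("blip2-with-num-query-tokens")
```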
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with BLIP-2.

View File

@ -54,6 +54,12 @@ If you're interested in submitting a resource to be included here, please feel f
- preprocess
- post_process_object_detection
## DeformableDetrImageProcessorFast
[[autodoc]] DeformableDetrImageProcessorFast
- preprocess
- post_process_object_detection
## DeformableDetrFeatureExtractor
[[autodoc]] DeformableDetrFeatureExtractor

View File

@ -84,27 +84,24 @@ If you want to do the pre- and postprocessing yourself, here's how to do that:
>>> with torch.no_grad():
... outputs = model(**inputs)
... predicted_depth = outputs.predicted_depth
>>> # interpolate to original size
>>> prediction = torch.nn.functional.interpolate(
... predicted_depth.unsqueeze(1),
... size=image.size[::-1],
... mode="bicubic",
... align_corners=False,
>>> # interpolate to original size and visualize the prediction
>>> post_processed_output = image_processor.post_process_depth_estimation(
... outputs,
... target_sizes=[(image.height, image.width)],
... )
>>> # visualize the prediction
>>> output = prediction.squeeze().cpu().numpy()
>>> formatted = (output * 255 / np.max(output)).astype("uint8")
>>> depth = Image.fromarray(formatted)
>>> predicted_depth = post_processed_output[0]["predicted_depth"]
>>> depth = (predicted_depth - predicted_depth.min()) / (predicted_depth.max() - predicted_depth.min())
>>> depth = depth.detach().cpu().numpy() * 255
>>> depth = Image.fromarray(depth.astype("uint8"))
```
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with Depth Anything.
- [Monocular depth estimation task guide](../tasks/depth_estimation)
- [Monocular depth estimation task guide](../tasks/monocular_depth_estimation)
- A notebook showcasing inference with [`DepthAnythingForDepthEstimation`] can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/Depth%20Anything/Predicting_depth_in_an_image_with_Depth_Anything.ipynb). 🌎
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.

View File

@ -78,27 +78,24 @@ If you want to do the pre- and post-processing yourself, here's how to do that:
>>> with torch.no_grad():
... outputs = model(**inputs)
... predicted_depth = outputs.predicted_depth
>>> # interpolate to original size
>>> prediction = torch.nn.functional.interpolate(
... predicted_depth.unsqueeze(1),
... size=image.size[::-1],
... mode="bicubic",
... align_corners=False,
>>> # interpolate to original size and visualize the prediction
>>> post_processed_output = image_processor.post_process_depth_estimation(
... outputs,
... target_sizes=[(image.height, image.width)],
... )
>>> # visualize the prediction
>>> output = prediction.squeeze().cpu().numpy()
>>> formatted = (output * 255 / np.max(output)).astype("uint8")
>>> depth = Image.fromarray(formatted)
>>> predicted_depth = post_processed_output[0]["predicted_depth"]
>>> depth = (predicted_depth - predicted_depth.min()) / (predicted_depth.max() - predicted_depth.min())
>>> depth = depth.detach().cpu().numpy() * 255
>>> depth = Image.fromarray(depth.astype("uint8"))
```
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with Depth Anything.
- [Monocular depth estimation task guide](../tasks/depth_estimation)
- [Monocular depth estimation task guide](../tasks/monocular_depth_estimation)
- [Depth Anything V2 demo](https://huggingface.co/spaces/depth-anything/Depth-Anything-V2).
- A notebook showcasing inference with [`DepthAnythingForDepthEstimation`] can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/Depth%20Anything/Predicting_depth_in_an_image_with_Depth_Anything.ipynb). 🌎
- [Core ML conversion of the `small` variant for use on Apple Silicon](https://huggingface.co/apple/coreml-depth-anything-v2-small).

View File

@ -181,6 +181,15 @@ If you're interested in submitting a resource to be included here, please feel f
- post_process_instance_segmentation
- post_process_panoptic_segmentation
## DetrImageProcessorFast
[[autodoc]] DetrImageProcessorFast
- preprocess
- post_process_object_detection
- post_process_semantic_segmentation
- post_process_instance_segmentation
- post_process_panoptic_segmentation
## DetrFeatureExtractor
[[autodoc]] DetrFeatureExtractor

View File

@ -0,0 +1,99 @@
<!--Copyright 2024 The GLM & ZhipuAI team and The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# GLM
## Overview
The GLM Model was proposed
in [ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools](https://arxiv.org/html/2406.12793v1)
by GLM Team, THUDM & ZhipuAI.
The abstract from the paper is the following:
*We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report
primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most
capable models that are trained with all the insights and lessons gained from the preceding three generations of
ChatGLM. To date, the GLM-4 models are pre-trained on ten trillions of tokens mostly in Chinese and English, along with
a small set of corpus from 24 languages, and aligned primarily for Chinese and English usage. The high-quality alignment
is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human
feedback. Evaluations show that GLM-4 1) closely rivals or outperforms GPT-4 in terms of general metrics such as MMLU,
GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval, 3)
matches GPT-4 Turbo (128K) and Claude 3 for long context tasks, and 4) outperforms GPT-4 in Chinese alignments as
measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide
when and which tool(s) to use (including web browser, Python interpreter, text-to-image model, and user-defined
functions) to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All
Tools in tasks like accessing online information via web browsing and solving math problems using Python interpreter.
Over the course, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M),
GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging face in the year 2023 alone.*
Tips:
- This model was contributed by [THUDM](https://huggingface.co/THUDM). The most recent code can be
found [here](https://github.com/thudm/GLM-4).
## Usage tips
`GLM-4` can be found on the [Huggingface Hub](https://huggingface.co/collections/THUDM/glm-4-665fcf188c414b03c2f7e3b7)
In the following, we demonstrate how to use `glm-4-9b-chat` for inference. Note that we have used the ChatML format for dialog; in this demo we show how to leverage `apply_chat_template` for this purpose.
```python
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> device = "cuda" # the device to load the model onto
>>> model = AutoModelForCausalLM.from_pretrained("THUDM/glm-4-9b-chat", device_map="auto")
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat")
>>> prompt = "Give me a short introduction to large language model."
>>> messages = [{"role": "user", "content": prompt}]
>>> text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
>>> model_inputs = tokenizer([text], return_tensors="pt").to(device)
>>> generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512, do_sample=True)
>>> generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
>>> response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
## GlmConfig
[[autodoc]] GlmConfig
## GlmModel
[[autodoc]] GlmModel
- forward
## GlmForCausalLM
[[autodoc]] GlmForCausalLM
- forward
## GlmForSequenceClassification
[[autodoc]] GlmForSequenceClassification
- forward
## GlmForTokenClassification
[[autodoc]] GlmForTokenClassification
- forward

View File

@ -33,6 +33,10 @@ The original code can be found [here](https://github.com/salesforce/LAVIS/tree/m
InstructBLIP uses the same architecture as [BLIP-2](blip2) with a tiny but important difference: it also feeds the text prompt (instruction) to the Q-Former.
> [!NOTE]
> BLIP models after release v4.46 will raise warnings about adding `processor.num_query_tokens = {{num_query_tokens}}` and expanding the model embeddings layer to add the special `<image>` token. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or to open a PR if it is not owned by you. Adding these attributes means that BLIP will add the number of query tokens required per image and expand the text with as many `<image>` placeholders as there will be query tokens. Usually it is around 500 tokens per image, so make sure that the text is not truncated as otherwise there will be a failure when merging the embeddings.
The attributes can be obtained from model config, as `model.config.num_query_tokens` and model embeddings expansion can be done by following [this link](https://gist.github.com/zucchini-nlp/e9f20b054fa322f84ac9311d9ab67042).
## InstructBlipConfig
[[autodoc]] InstructBlipConfig

View File

@ -35,6 +35,10 @@ The original code can be found [here](https://github.com/salesforce/LAVIS/tree/m
- The model was trained by sampling 4 frames per video, so it's recommended to sample 4 frames
> [!NOTE]
> BLIP models after release v4.46 will raise warnings about adding `processor.num_query_tokens = {{num_query_tokens}}` and expanding the model embeddings layer to add the special `<image>` token. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or to open a PR if it is not owned by you. Adding these attributes means that BLIP will add the number of query tokens required per image and expand the text with as many `<image>` placeholders as there will be query tokens. Usually it is around 500 tokens per image, so make sure that the text is not truncated as otherwise there will be a failure when merging the embeddings.
The attributes can be obtained from model config, as `model.config.num_query_tokens` and model embeddings expansion can be done by following [this link](https://gist.github.com/zucchini-nlp/e9f20b054fa322f84ac9311d9ab67042).
## InstructBlipVideoConfig
[[autodoc]] InstructBlipVideoConfig

View File

@ -40,6 +40,13 @@ The original code can be found [here](https://github.com/haotian-liu/LLaVA/tree/
- Note that the model has not been explicitly trained to process multiple images in the same prompt; although this is technically possible, you may experience inaccurate results.
> [!NOTE]
> LLaVA models after release v4.46 will raise warnings about adding `processor.patch_size = {{patch_size}}`, `processor.num_additional_image_tokens = {{num_additional_image_tokens}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or to open a PR if it is not owned by you.
Adding these attributes means that LLaVA will try to infer the number of image tokens required per image and expand the text with as many `<image>` placeholders as there will be tokens. Usually it is around 500 tokens per image, so make sure that the text is not truncated as otherwise there will be a failure when merging the embeddings.
The attributes can be obtained from model config, as `model.config.vision_config.patch_size` or `model.config.vision_feature_select_strategy`. The `num_additional_image_tokens` should be `1` if the vision backbone adds a CLS token or `0` if nothing extra is added to the vision patches.
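As a hedged sketch of attaching these attributes to a processor you own, reading the values from the model config as described above (the save path is illustrative):
```python
from transformers import AutoProcessor, LlavaForConditionalGeneration

checkpoint = "llava-hf/llava-1.5-7b-hf"  # illustrative checkpoint
model = LlavaForConditionalGeneration.from_pretrained(checkpoint)
processor = AutoProcessor.from_pretrained(checkpoint)

processor.patch_size = model.config.vision_config.patch_size
processor.vision_feature_select_strategy = model.config.vision_feature_select_strategy
processor.num_additional_image_tokens = 1  # 1 if the vision backbone adds a CLS token, otherwise 0

processor.save_pretrained("llava-1.5-7b-hf-updated-processor")
```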
### Single image inference
For best results, we recommend users to use the processor's `apply_chat_template()` method to format your prompt correctly. For that you need to construct a conversation history, passing in a plain string will not format your prompt. Each message in the conversation history for chat templates is a dictionary with keys "role" and "content". The "content" should be a list of dictionaries, for "text" and "image" modalities, as follows:
@ -85,10 +92,10 @@ LLaVa also supports batched inference. Here is how you can do it:
import requests
from PIL import Image
import torch
from transformers import AutoProcessor, LLavaForConditionalGeneration
from transformers import AutoProcessor, LlavaForConditionalGeneration
# Load the model in half-precision
model = LLavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf", torch_dtype=torch.float16, device_map="auto")
model = LlavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf", torch_dtype=torch.float16, device_map="auto")
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")
# Get two different images

View File

@ -53,6 +53,12 @@ The original code can be found [here](https://github.com/haotian-liu/LLaVA/tree/
</Tip>
> [!NOTE]
> LLaVA models after release v4.46 will raise warnings about adding `processor.patch_size = {{patch_size}}`, `processor.num_additional_image_tokens = {{num_additional_image_tokens}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or to open a PR if it is not owned by you.
Adding these attributes means that LLaVA will try to infer the number of image tokens required per image and expand the text with as many `<image>` placeholders as there will be tokens. Usually it is around 500 tokens per image, so make sure that the text is not truncated as otherwise there will be a failure when merging the embeddings.
The attributes can be obtained from model config, as `model.config.vision_config.patch_size` or `model.config.vision_feature_select_strategy`. The `num_additional_image_tokens` should be `1` if the vision backbone adds a CLS token or `0` if nothing extra is added to the vision patches.
- Note that each checkpoint has been trained with a specific prompt format, depending on which large language model (LLM) was used. You can use the processor's `apply_chat_template` to format your prompts correctly. For that you have to construct a conversation history, passing a plain string will not format your prompt. Each message in the conversation history for chat templates is a dictionary with keys "role" and "content". The "content" should be a list of dictionaries, for "text" and "image" modalities. Below is an example of how to do that and the list of formats accepted by each checkpoint.
We will use [llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf) and a conversation history of text and image. Each content field has to be a list of dicts, as follows:

View File

@ -50,6 +50,12 @@ The original code can be found [here](https://github.com/LLaVA-VL/LLaVA-NeXT/tre
</Tip>
> [!NOTE]
> LLaVA models after release v4.46 will raise warnings about adding `processor.patch_size = {{patch_size}}`, `processor.num_additional_image_tokens = {{num_additional_image_tokens}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or to open a PR if it is not owned by you.
Adding these attributes means that LLaVA will try to infer the number of image tokens required per image and expand the text with as many `<image>` placeholders as there will be tokens. Usually it is around 500 tokens per image, so make sure that the text is not truncated as otherwise there will be a failure when merging the embeddings.
The attributes can be obtained from model config, as `model.config.vision_config.patch_size` or `model.config.vision_feature_select_strategy`. The `num_additional_image_tokens` should be `1` if the vision backbone adds a CLS token or `0` if nothing extra is added to the vision patches.
- Note that each checkpoint has been trained with a specific prompt format, depending on which large language model (LLM) was used. You can use tokenizer's `apply_chat_template` to format your prompts correctly. Below is an example of how to do that.
We will use [LLaVA-NeXT-Video-7B-hf](https://huggingface.co/llava-hf/LLaVA-NeXT-Video-7B-hf) and a conversation history of videos and images. Each content field has to be a list of dicts, as follows:

View File

@ -66,4 +66,4 @@ The original code can be found [here](https://github.com/kyutai-labs/moshi).
[[autodoc]] MimiModel
- decode
- encode
- forward
- forward

View File

@ -30,6 +30,25 @@ The Llama 3.2-Vision collection of multimodal large language models (LLMs) is a
- The text passed to the processor should have the `"<|image|>"` tokens where the images should be inserted.
- The processor has its own `apply_chat_template` method to convert chat messages to text that can then be passed as text to the processor.
<Tip warning={true}>
Mllama has an extra token used as a placeholder for image positions in the text. It means that input ids and an input embedding layer will have an extra token. But since the weights for input and output embeddings are not tied, the `lm_head` layer has one less token and will fail if you want to calculate loss on image tokens or apply some logit processors. In case you are training, make sure to mask out special `"<|image|>"` tokens in the `labels` as the model should not be trained on predicting them.
Otherwise, if you see CUDA-side index errors when generating, use the code below to expand the `lm_head` by one more token.
```python
old_embeddings = model.get_output_embeddings()
num_tokens = model.vocab_size + 1
resized_embeddings = model._get_resized_lm_head(old_embeddings, new_num_tokens=num_tokens, mean_resizing=True)
resized_embeddings.requires_grad_(old_embeddings.weight.requires_grad)
model.set_output_embeddings(resized_embeddings)
```
</Tip>
## Usage Example
#### Instruct model

View File

@ -0,0 +1,183 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Moshi
## Overview
The Moshi model was proposed in [Moshi: a speech-text foundation model for real-time dialogue](https://kyutai.org/Moshi.pdf) by Alexandre Défossez, Laurent Mazaré, Manu Orsini, Amélie Royer, Patrick Pérez, Hervé Jégou, Edouard Grave and Neil Zeghidour.
Moshi is a speech-text foundation model that casts spoken dialogue as speech-to-speech generation. Starting from a text language model backbone, Moshi generates speech as tokens from the residual quantizer of a neural audio codec, while modeling separately its own speech and that of the user into parallel streams. This allows for the removal of explicit speaker turns, and the modeling of arbitrary conversational dynamics. Moshi also predicts time-aligned text tokens as a prefix to audio tokens. This "Inner Monologue" method significantly improves the linguistic quality of generated speech and provides streaming speech recognition and text-to-speech. As a result, Moshi is the first real-time full-duplex spoken large language model, with a theoretical latency of 160ms, 200ms in practice.
<div style="text-align: center">
<img src="https://huggingface.co/datasets/ylacombe/benchmark-comparison/resolve/main/moshi_architecture.png">
</div>
The abstract from the paper is the following:
*We introduce Moshi, a speech-text foundation model and full-duplex spoken dialogue framework. Current systems for spoken dialogue rely on pipelines of independent components, namely voice activity detection, speech recognition, textual dialogue and text-to-speech. Such frameworks cannot emulate the experience of real conversations. First, their complexity induces a latency of several seconds between interactions. Second, text being the intermediate modality for dialogue, non-linguistic information that modifies meaning, such as emotion or non-speech sounds, is lost in the interaction. Finally, they rely on a segmentation into speaker turns, which does not take into account overlapping speech, interruptions and interjections. Moshi solves these independent issues altogether by casting spoken dialogue as speech-to-speech generation. Starting from a text language model backbone, Moshi generates speech as tokens from the residual quantizer of a neural audio codec, while modeling separately its own speech and that of the user into parallel streams. This allows for the removal of explicit speaker turns, and the modeling of arbitrary conversational dynamics. We moreover extend the hierarchical semantic-to-acoustic token generation of previous work to first predict time-aligned text tokens as a prefix to audio tokens. Not only this "Inner Monologue" method significantly improves the linguistic quality of generated speech, but we also illustrate how it can provide streaming speech recognition and text-to-speech. Our resulting model is the first real-time full-duplex spoken large language model, with a theoretical latency of 160ms, 200ms in practice, and is available at github.com/kyutai-labs/moshi.*
Moshi deals with 3 streams of information:
1. The user's audio
2. Moshi's audio
3. Moshi's textual output
Similarly to [`~MusicgenModel`], audio is represented with audio codebooks, which can be interpreted like tokens. The main difference between text tokens and audio codebooks is that audio codebooks introduce an additional dimension of information.
Text tokens are typically of dim `(batch_size, sequence_length)` but audio tokens are of dim `(batch_size, num_codebooks, sequence_length)`.
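A tiny illustration of the shape difference (the sizes below are made up):
```python
import torch

batch_size, num_codebooks, sequence_length = 2, 8, 10  # illustrative sizes
text_tokens = torch.zeros(batch_size, sequence_length, dtype=torch.long)
audio_codes = torch.zeros(batch_size, num_codebooks, sequence_length, dtype=torch.long)
print(text_tokens.shape)   # torch.Size([2, 10])
print(audio_codes.shape)   # torch.Size([2, 8, 10])
```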
Moshi's made of 3 components:
**1. The main decoder (Helium in the paper)**
It corresponds to [`MoshiForCausalLM`]. It is strictly a classic text LLM that uses an architecture similar to [`~GemmaForCausalLM`]. In other words, it takes text tokens, embeds them, and passes them through the decoder and a language head to get text logits.
**2. The depth decoder**
On its own, it's also a classic LLM, but this time, instead of generating over the time dimension, it generates over the codebook dimension.
It also means that its context length is `num_codebooks`; thus, it can't generate more than `num_codebooks` tokens.
Note that each timestamp, i.e. each codebook, gets its own set of linear layers and embeddings.
**3. [`MimiModel`]**
It's the audio encoder from Kyutai, recently integrated into Transformers, which is used to "tokenize" audio. It plays the same role that [`~EncodecModel`] plays in [`~MusicgenModel`].
## Tips:
The original checkpoints can be converted using the conversion script `src/transformers/models/moshi/convert_moshi_transformers.py`
### How to use the model:
This implementation has two main aims:
1. quickly test model generation by simplifying the original API
2. simplify training. A training guide will come soon, but user contributions are welcomed!
<Tip>
It is designed for intermediate use. We strongly recommend using the original [implementation](https://github.com/kyutai-labs/moshi) to infer the model in real-time streaming.
</Tip>
**1. Model generation**
Moshi is a streaming auto-regressive model with two streams of audio. To put it differently, one audio stream corresponds to what the model said/will say and the other audio stream corresponds to what the user said/will say.
[`MoshiForConditionalGeneration.generate`] thus needs 3 inputs:
1. `input_ids` - corresponding to the text token history
2. `moshi_input_values` or `moshi_audio_codes`- corresponding to the model audio history
3. `user_input_values` or `user_audio_codes` - corresponding to the user audio history
These three inputs must be synchronized, meaning that their lengths must correspond to the same number of tokens.
You can dynamically use the 3 inputs depending on what you want to test:
1. Simply check the model response to a user prompt - in that case, `input_ids` can be filled with pad tokens and `user_input_values` can be a zero tensor of the same shape as the user prompt.
2. Test more complex behaviour - in that case, you must be careful about how the input tokens are synchronized with the audio.
<Tip>
The original model synchronizes text with audio by padding the text in between each token enunciation.
To follow the example of the following image, `"Hello, I'm Moshi"` could be transformed to `"Hello,<pad><unk>I'm Moshi"`.
</Tip>
<div style="text-align: center">
<img src="https://huggingface.co/datasets/ylacombe/benchmark-comparison/resolve/main/moshi_text_sync.png">
</div>
[`MoshiForConditionalGeneration.generate`] then auto-regressively feeds to itself its own audio stream, but since it doesn't have access to the user input stream while using `transformers`, it will thus **assume that the user is producing blank audio**.
```python
>>> from datasets import load_dataset, Audio
>>> import torch, math
>>> from transformers import MoshiForConditionalGeneration, AutoFeatureExtractor, AutoTokenizer
>>> # load the model, the audio feature extractor and the text tokenizer
>>> # (the checkpoint name, device and dtype below are placeholders - adapt them to your setup)
>>> device = "cuda" if torch.cuda.is_available() else "cpu"
>>> dtype = torch.bfloat16
>>> checkpoint = "kyutai/moshiko-pytorch-bf16"
>>> model = MoshiForConditionalGeneration.from_pretrained(checkpoint, torch_dtype=dtype).to(device)
>>> feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
>>> # ratio used to convert a number of audio samples into a number of text tokens
>>> # (Mimi produces roughly 12.5 audio frames - hence 12.5 text tokens - per second of audio)
>>> waveform_to_token_ratio = 12.5 / feature_extractor.sampling_rate
>>> librispeech_dummy = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> # prepare user input audio
>>> librispeech_dummy = librispeech_dummy.cast_column("audio", Audio(sampling_rate=feature_extractor.sampling_rate))
>>> audio_sample = librispeech_dummy[-1]["audio"]["array"]
>>> user_input_values = feature_extractor(raw_audio=audio_sample, sampling_rate=feature_extractor.sampling_rate, return_tensors="pt").to(device=device, dtype=dtype)
>>> # prepare moshi input values - we suppose moshi didn't say anything while the user spoke
>>> moshi_input_values = torch.zeros_like(user_input_values.input_values)
>>> # prepare moshi input ids - we suppose moshi didn't say anything while the user spoke
>>> num_tokens = math.ceil(moshi_input_values.shape[-1] * waveform_to_token_ratio)
>>> input_ids = torch.ones((1, num_tokens), device=device, dtype=torch.int64) * tokenizer.encode("<pad>")[0]
>>> # generate 25 new tokens (around 2s of audio)
>>> output = model.generate(input_ids=input_ids, user_input_values=user_input_values.input_values, moshi_input_values=moshi_input_values, max_new_tokens=25)
>>> text_tokens = output.sequences
>>> audio_waveforms = output.audio_sequences
```
**2. Model training**
Most of the work has to be done during data creation/pre-processing, because of the need to align/synchronize streams.
Once it's done, you can simply forward `text_labels` and `audio_labels` to [`MoshiForConditionalGeneration.forward`], alongside the usual inputs, to get the model loss.
A training guide will come soon, but user contributions are welcome!
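In the meantime, here is a minimal sketch of what such a forward pass could look like once the streams are aligned (the checkpoint name, shapes and dummy tensors below are placeholders, not a real data pipeline):
```python
import torch
from transformers import MoshiForConditionalGeneration

# the checkpoint name is an assumption for illustration purposes
model = MoshiForConditionalGeneration.from_pretrained("kyutai/moshiko-pytorch-bf16")

batch_size, num_codebooks, seq_len = 1, 8, 8  # Moshi typically uses 8 audio codebooks per stream
# dummy, already-synchronized streams; a real pipeline would provide these
input_ids = torch.ones(batch_size, seq_len, dtype=torch.long)
moshi_audio_codes = torch.zeros(batch_size, num_codebooks, seq_len, dtype=torch.long)
user_audio_codes = torch.zeros(batch_size, num_codebooks, seq_len, dtype=torch.long)

outputs = model(
    input_ids=input_ids,
    moshi_audio_codes=moshi_audio_codes,
    user_audio_codes=user_audio_codes,
    text_labels=input_ids,           # targets for the text stream
    audio_labels=moshi_audio_codes,  # targets for the model audio stream
)
outputs.loss.backward()
```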
### How does the model forward the inputs / generate:
1. The input streams are embedded and combined into `inputs_embeds`.
2. `inputs_embeds` is passed through the main decoder, which processes it like a normal LLM would.
3. The main decoder outputs `text logits` but also its `last hidden state`, which is called the `temporal context` in the paper.
4. The depth decoder switches the dimension on which we forward / generate (codebooks instead of time). It uses the token generated from the `text logits` and the `temporal context` to auto-regressively generate audio codebooks, as sketched below.
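A minimal, self-contained sketch of that two-stage step, using plain `torch.nn.Linear` placeholders instead of the real sub-modules (shapes and module names are purely illustrative, and the real depth decoder additionally conditions each codebook on the previously generated ones):
```python
import torch

batch_size, hidden_size, num_codebooks, vocab_size, codebook_size = 1, 16, 8, 100, 50

# placeholders standing in for the real sub-modules
main_decoder = torch.nn.Linear(hidden_size, hidden_size)
text_head = torch.nn.Linear(hidden_size, vocab_size)
depth_steps = torch.nn.ModuleList(torch.nn.Linear(hidden_size, codebook_size) for _ in range(num_codebooks))

inputs_embeds = torch.randn(batch_size, 1, hidden_size)  # 1. combined text + audio streams
temporal_context = main_decoder(inputs_embeds)           # 2./3. main decoder hidden state
text_token = text_head(temporal_context).argmax(-1)      # 3. text logits -> next text token

codebook_tokens = []
for step in depth_steps:                                  # 4. depth decoder: one step per codebook
    codebook_tokens.append(step(temporal_context).argmax(-1))
audio_codes = torch.stack(codebook_tokens, dim=1)         # (batch_size, num_codebooks, 1)
print(text_token.shape, audio_codes.shape)
```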
This model was contributed by [Yoach Lacombe (ylacombe)](https://huggingface.co/ylacombe).
The original code can be found [here](https://github.com/kyutai-labs/moshi).
## MoshiConfig
[[autodoc]] MoshiConfig
## MoshiDepthConfig
[[autodoc]] MoshiDepthConfig
## MoshiModel
[[autodoc]] MoshiModel
- forward
## MoshiForCausalLM
[[autodoc]] MoshiForCausalLM
- forward
## MoshiForConditionalGeneration
[[autodoc]] MoshiForConditionalGeneration
- forward
- generate
- get_unconditional_inputs

View File

@ -0,0 +1,46 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# OLMo November 2024
## Overview
The OLMo November 2024 model is a successor of the OLMo model, which was proposed in
[OLMo: Accelerating the Science of Language Models](https://arxiv.org/abs/2402.00838).
The architectural changes from the original OLMo model to this model are:
- RMSNorm is used instead of standard layer norm.
- Norm is applied to attention queries and keys.
- Norm is applied after attention/feedforward layers rather than before.
This model was contributed by [shanearora](https://huggingface.co/shanearora).
The original code can be found [here](https://github.com/allenai/OLMo/tree/main/olmo).
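A minimal generation sketch (the checkpoint name below is an assumption for illustration; substitute the checkpoint you actually want to load):
```python
import torch
from transformers import AutoTokenizer, Olmo1124ForCausalLM

# hypothetical checkpoint name, used only to illustrate the API
checkpoint = "allenai/OLMo-2-1124-7B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = Olmo1124ForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)

inputs = tokenizer("Language modeling is ", return_tensors="pt")
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```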
## Olmo1124Config
[[autodoc]] Olmo1124Config
## Olmo1124Model
[[autodoc]] Olmo1124Model
- forward
## Olmo1124ForCausalLM
[[autodoc]] Olmo1124ForCausalLM
- forward

View File

@ -46,7 +46,7 @@ Initially, an image is processed using a pre-trained convolutional neural networ
>>> from PIL import Image
>>> from transformers import RTDetrForObjectDetection, RTDetrImageProcessor
>>> url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
>>> url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
>>> image = Image.open(requests.get(url, stream=True).raw)
>>> image_processor = RTDetrImageProcessor.from_pretrained("PekingU/rtdetr_r50vd")
@ -57,7 +57,7 @@ Initially, an image is processed using a pre-trained convolutional neural networ
>>> with torch.no_grad():
... outputs = model(**inputs)
>>> results = image_processor.post_process_object_detection(outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.3)
>>> results = image_processor.post_process_object_detection(outputs, target_sizes=torch.tensor([(image.height, image.width)]), threshold=0.3)
>>> for result in results:
... for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
@ -95,6 +95,12 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
- preprocess
- post_process_object_detection
## RTDetrImageProcessorFast
[[autodoc]] RTDetrImageProcessorFast
- preprocess
- post_process_object_detection
## RTDetrModel
[[autodoc]] RTDetrModel

View File

@ -86,24 +86,32 @@ model = SuperPointForKeypointDetection.from_pretrained("magic-leap-community/sup
inputs = processor(images, return_tensors="pt")
outputs = model(**inputs)
image_sizes = [(image.height, image.width) for image in images]
outputs = processor.post_process_keypoint_detection(outputs, image_sizes)
for i in range(len(images)):
image_mask = outputs.mask[i]
image_indices = torch.nonzero(image_mask).squeeze()
image_keypoints = outputs.keypoints[i][image_indices]
image_scores = outputs.scores[i][image_indices]
image_descriptors = outputs.descriptors[i][image_indices]
for output in outputs:
for keypoints, scores, descriptors in zip(output["keypoints"], output["scores"], output["descriptors"]):
print(f"Keypoints: {keypoints}")
print(f"Scores: {scores}")
print(f"Descriptors: {descriptors}")
```
You can then print the keypoints on the image to visualize the result :
You can then print the keypoints on the image of your choice to visualize the result:
```python
import cv2
for keypoint, score in zip(image_keypoints, image_scores):
keypoint_x, keypoint_y = int(keypoint[0].item()), int(keypoint[1].item())
color = tuple([score.item() * 255] * 3)
image = cv2.circle(image, (keypoint_x, keypoint_y), 2, color)
cv2.imwrite("output_image.png", image)
import matplotlib.pyplot as plt
plt.axis("off")
plt.imshow(image_1)
plt.scatter(
outputs[0]["keypoints"][:, 0],
outputs[0]["keypoints"][:, 1],
c=outputs[0]["scores"] * 100,
s=outputs[0]["scores"] * 50,
alpha=0.8
)
plt.savefig(f"output_image.png")
```
![image/png](https://cdn-uploads.huggingface.co/production/uploads/632885ba1558dac67c440aa8/ZtFmphEhx8tcbEQqOolyE.png)
This model was contributed by [stevenbucaille](https://huggingface.co/stevenbucaille).
The original code can be found [here](https://github.com/magicleap/SuperPointPretrainedNetwork).
@ -123,6 +131,7 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] SuperPointImageProcessor
- preprocess
- post_process_keypoint_detection
## SuperPointForKeypointDetection

View File

@ -54,6 +54,12 @@ This model was contributed by [RaushanTurganbay](https://huggingface.co/RaushanT
The original code can be found [here](https://github.com/PKU-YuanGroup/Video-LLaVA).
> [!NOTE]
> LLaVA models after release v4.46 will raise warnings about adding `processor.patch_size = {{patch_size}}`, `processor.num_additional_image_tokens = {{num_additional_image_tokens}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or open a PR if it is not owned by you.
Adding these attributes means that LLaVA will try to infer the number of image tokens required per image and expand the text with as many `<image>` placeholders as there will be tokens. Usually it is around 500 tokens per image, so make sure the text is not truncated, otherwise merging the embeddings will fail.
The attributes can be obtained from the model config, e.g. `model.config.vision_config.patch_size` or `model.config.vision_feature_select_strategy`. The `num_additional_image_tokens` should be `1` if the vision backbone adds a CLS token, or `0` if nothing extra is added to the vision patches.
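For example, if you own the checkpoint, the attributes could be set and saved along these lines (a sketch assuming `processor` and `model` were loaded from the same LLaVA-family checkpoint and that the vision backbone adds a CLS token):
```python
# hypothetical: propagate the values from the model config onto the processor
processor.patch_size = model.config.vision_config.patch_size
processor.vision_feature_select_strategy = model.config.vision_feature_select_strategy
processor.num_additional_image_tokens = 1  # 1 if the vision backbone adds a CLS token, otherwise 0
processor.save_pretrained("path/to/checkpoint")  # or processor.push_to_hub(...) to update the Hub repo
```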
## Usage example
### Single Media Mode

View File

@ -39,6 +39,12 @@ This model was contributed by [Younes Belkada](https://huggingface.co/ybelkada)
- Note that the model has not been explicitly trained to process multiple images in the same prompt; although this is technically possible, you may experience inaccurate results.
> [!NOTE]
> LLaVA models after release v4.46 will raise warnings about adding `processor.patch_size = {{patch_size}}`, `processor.num_additional_image_tokens = {{num_additional_image_tokens}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. It is strongly recommended to add the attributes to the processor if you own the model checkpoint, or open a PR if it is not owned by you.
Adding these attributes means that LLaVA will try to infer the number of image tokens required per image and expand the text with as many `<image>` placeholders as there will be tokens. Usually it is around 500 tokens per image, so make sure the text is not truncated, otherwise merging the embeddings will fail.
The attributes can be obtained from the model config, e.g. `model.config.vision_config.patch_size` or `model.config.vision_feature_select_strategy`. The `num_additional_image_tokens` should be `1` if the vision backbone adds a CLS token, or `0` if nothing extra is added to the vision patches.
- For better results, we recommend users to use the processor's `apply_chat_template()` method to format your prompt correctly. For that you need to construct a conversation history, passing in a plain string will not format your prompt. Each message in the conversation history for chat templates is a dictionary with keys "role" and "content". The "content" should be a list of dictionaries, for "text" and "image" modalities, as follows:
```python

View File

@ -23,6 +23,43 @@ The abstract from the paper is the following:
This model was contributed by [jegormeister](https://huggingface.co/jegormeister). The original code (written in JAX) can be found [here](https://github.com/google-research/scenic/tree/main/scenic/projects/vivit).
### Using Scaled Dot Product Attention (SDPA)
PyTorch includes a native scaled dot-product attention (SDPA) operator as part of `torch.nn.functional`. This function
encompasses several implementations that can be applied depending on the inputs and the hardware in use. See the
[official documentation](https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html)
or the [GPU Inference](https://huggingface.co/docs/transformers/main/en/perf_infer_gpu_one#pytorch-scaled-dot-product-attention)
page for more information.
SDPA is used by default for `torch>=2.1.1` when an implementation is available, but you may also set
`attn_implementation="sdpa"` in `from_pretrained()` to explicitly request SDPA to be used.
```python
import torch
from transformers import VivitModel
model = VivitModel.from_pretrained("google/vivit-b-16x2-kinetics400", attn_implementation="sdpa", torch_dtype=torch.float16)
...
```
For the best speedups, we recommend loading the model in half-precision (e.g. `torch.float16` or `torch.bfloat16`).
On a local benchmark (A100-40GB, PyTorch 2.3.0, OS Ubuntu 22.04) with `float32` and the `google/vivit-b-16x2-kinetics400` model, we saw the following speedups during training and inference.
### Training
| num_training_steps | batch_size | is cuda | Speedup (%) | Eager peak mem (MB) | sdpa peak mem (MB) | Mem saving (%) |
|---------------------:|-------------:|----------:|--------------:|----------------------:|---------------------:|-----------------:|
| 100 | 1 | True | 7.122 | 2575.28 | 5932.54 | 130.364 |
### Inference
| num_batches | batch_size | is cuda | is half | Speedup (%) | Mem eager (MB) | Mem BT (MB) | Mem saved (%) |
|---------------|--------------|-----------|-----------|---------------|------------------|---------------|-----------------|
| 20 | 1 | True | False | 15.422 | 715.807 | 317.079 | 125.75 |
| 20 | 2 | True | False | 17.146 | 1234.75 | 447.175 | 176.122 |
| 20 | 4 | True | False | 18.093 | 2275.82 | 709.864 | 220.6 |
| 20 | 8 | True | False | 19.284 | 4358.19 | 1233.24 | 253.393 |
## VivitConfig
[[autodoc]] VivitConfig

View File

@ -39,54 +39,66 @@ The original code can be found [here](https://github.com/isl-org/ZoeDepth).
The easiest way to perform inference with ZoeDepth is by leveraging the [pipeline API](../main_classes/pipelines.md):
```python
from transformers import pipeline
from PIL import Image
import requests
>>> from transformers import pipeline
>>> from PIL import Image
>>> import requests
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)
pipe = pipeline(task="depth-estimation", model="Intel/zoedepth-nyu-kitti")
result = pipe(image)
depth = result["depth"]
>>> pipe = pipeline(task="depth-estimation", model="Intel/zoedepth-nyu-kitti")
>>> result = pipe(image)
>>> depth = result["depth"]
```
Alternatively, one can also perform inference using the classes:
```python
from transformers import AutoImageProcessor, ZoeDepthForDepthEstimation
import torch
import numpy as np
from PIL import Image
import requests
>>> from transformers import AutoImageProcessor, ZoeDepthForDepthEstimation
>>> import torch
>>> import numpy as np
>>> from PIL import Image
>>> import requests
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)
image_processor = AutoImageProcessor.from_pretrained("Intel/zoedepth-nyu-kitti")
model = ZoeDepthForDepthEstimation.from_pretrained("Intel/zoedepth-nyu-kitti")
>>> image_processor = AutoImageProcessor.from_pretrained("Intel/zoedepth-nyu-kitti")
>>> model = ZoeDepthForDepthEstimation.from_pretrained("Intel/zoedepth-nyu-kitti")
# prepare image for the model
inputs = image_processor(images=image, return_tensors="pt")
>>> # prepare image for the model
>>> inputs = image_processor(images=image, return_tensors="pt")
with torch.no_grad():
outputs = model(**inputs)
predicted_depth = outputs.predicted_depth
>>> with torch.no_grad():
... outputs = model(pixel_values)
# interpolate to original size
prediction = torch.nn.functional.interpolate(
predicted_depth.unsqueeze(1),
size=image.size[::-1],
mode="bicubic",
align_corners=False,
)
>>> # interpolate to original size and visualize the prediction
>>> ## ZoeDepth dynamically pads the input image. Thus we pass the original image size as argument
>>> ## to `post_process_depth_estimation` to remove the padding and resize to original dimensions.
>>> post_processed_output = image_processor.post_process_depth_estimation(
... outputs,
... source_sizes=[(image.height, image.width)],
... )
# visualize the prediction
output = prediction.squeeze().cpu().numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)
>>> predicted_depth = post_processed_output[0]["predicted_depth"]
>>> depth = (predicted_depth - predicted_depth.min()) / (predicted_depth.max() - predicted_depth.min())
>>> depth = depth.detach().cpu().numpy() * 255
>>> depth = Image.fromarray(depth.astype("uint8"))
```
<Tip>
<p>In the <a href="https://github.com/isl-org/ZoeDepth/blob/edb6daf45458569e24f50250ef1ed08c015f17a7/zoedepth/models/depth_model.py#L131">original implementation</a> ZoeDepth model performs inference on both the original and flipped images and averages out the results. The <code>post_process_depth_estimation</code> function can handle this for us by passing the flipped outputs to the optional <code>outputs_flipped</code> argument:</p>
<pre><code class="language-Python">&gt;&gt;&gt; with torch.no_grad():
... outputs = model(pixel_values)
... outputs_flipped = model(pixel_values=torch.flip(inputs.pixel_values, dims=[3]))
&gt;&gt;&gt; post_processed_output = image_processor.post_process_depth_estimation(
... outputs,
... source_sizes=[(image.height, image.width)],
... outputs_flipped=outputs_flipped,
... )
</code></pre>
</Tip>
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with ZoeDepth.

View File

@ -43,7 +43,7 @@ As a result, you can load a specific model version with the `revision` parameter
```py
>>> model = AutoModel.from_pretrained(
... "julien-c/EsperBERTo-small", revision="v2.0.1" # tag name, or branch name, or commit hash
... "julien-c/EsperBERTo-small", revision="4c77982" # tag name, or branch name, or commit hash
... )
```

View File

@ -0,0 +1,68 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
тЪая╕П Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Multi-GPU inference
Built-in Tensor Parallelism (TP) is now available with certain models using PyTorch. Tensor parallelism shards a model onto multiple GPUs, enabling larger model sizes, and parallelizes computations such as matrix multiplication.
To enable tensor parallelism, pass the argument `tp_plan="auto"` to [`~AutoModelForCausalLM.from_pretrained`]:
```python
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
# Initialize distributed
rank = int(os.environ["RANK"])
device = torch.device(f"cuda:{rank}")
torch.distributed.init_process_group("nccl", device_id=device)
# Retrieve tensor parallel model
model = AutoModelForCausalLM.from_pretrained(
model_id,
tp_plan="auto",
)
# Prepare input tokens
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = "Can I help"
inputs = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# Distributed run
outputs = model(inputs)
```
You can use `torchrun` to launch the above script with multiple processes, each mapping to a GPU:
```
torchrun --nproc-per-node 4 demo.py
```
PyTorch tensor parallel is currently supported for the following models:
* [Llama](https://huggingface.co/docs/transformers/model_doc/llama#transformers.LlamaModel)
You can request to add tensor parallel support for another model by opening a GitHub Issue or Pull Request.
### Expected speedups
You can benefit from considerable speedups for inference, especially for inputs with large batch size or long sequences.
For a single forward pass on [Llama](https://huggingface.co/docs/transformers/model_doc/llama#transformers.LlamaModel) with a sequence length of 512 and various batch sizes, the expected speedup is as follows:
<div style="text-align: center">
<img src="huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Meta-Llama-3-8B-Instruct, seqlen = 512, python, w_ compile.png">
</div>

View File

@ -42,6 +42,7 @@ FlashAttention-2 is currently supported for the following architectures:
* [Chameleon](https://huggingface.co/docs/transformers/model_doc/chameleon#transformers.Chameleon)
* [CLIP](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPModel)
* [Cohere](https://huggingface.co/docs/transformers/model_doc/cohere#transformers.CohereModel)
* [GLM](https://huggingface.co/docs/transformers/model_doc/glm#transformers.GLMModel)
* [Dbrx](https://huggingface.co/docs/transformers/model_doc/dbrx#transformers.DbrxModel)
* [DistilBert](https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilBertModel)
* [Gemma](https://huggingface.co/docs/transformers/model_doc/gemma#transformers.GemmaModel)
@ -70,13 +71,16 @@ FlashAttention-2 is currently supported for the following architectures:
* [MBart](https://huggingface.co/docs/transformers/model_doc/mbart#transformers.MBartModel)
* [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral#transformers.MistralModel)
* [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral#transformers.MixtralModel)
* [Moshi](https://huggingface.co/docs/transformers/model_doc/moshi#transformers.MoshiModel)
* [Musicgen](https://huggingface.co/docs/transformers/model_doc/musicgen#transformers.MusicgenModel)
* [MusicGen Melody](https://huggingface.co/docs/transformers/model_doc/musicgen_melody#transformers.MusicgenMelodyModel)
* [Nemotron](https://huggingface.co/docs/transformers/model_doc/nemotron)
* [NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)
* [OLMo](https://huggingface.co/docs/transformers/model_doc/olmo#transformers.OlmoModel)
* [OLMo November 2024](https://huggingface.co/docs/transformers/model_doc/olmo_1124#transformers.Olmo1124Model)
* [OLMoE](https://huggingface.co/docs/transformers/model_doc/olmoe#transformers.OlmoeModel)
* [OPT](https://huggingface.co/docs/transformers/model_doc/opt#transformers.OPTModel)
* [PaliGemma](https://huggingface.co/docs/transformers/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration)
* [Phi](https://huggingface.co/docs/transformers/model_doc/phi#transformers.PhiModel)
* [Phi3](https://huggingface.co/docs/transformers/model_doc/phi3#transformers.Phi3Model)
* [PhiMoE](https://huggingface.co/docs/transformers/model_doc/phimoe#transformers.PhimoeModel)
@ -86,6 +90,10 @@ FlashAttention-2 is currently supported for the following architectures:
* [Qwen2Audio](https://huggingface.co/docs/transformers/model_doc/qwen2_audio#transformers.Qwen2AudioEncoder)
* [Qwen2MoE](https://huggingface.co/docs/transformers/model_doc/qwen2_moe#transformers.Qwen2MoeModel)
* [Qwen2VL](https://huggingface.co/docs/transformers/model_doc/qwen2_vl#transformers.Qwen2VLModel)
* [RAG](https://huggingface.co/docs/transformers/model_doc/rag#transformers.RagModel)
* [SpeechEncoderDecoder](https://huggingface.co/docs/transformers/model_doc/speech_encoder_decoder#transformers.SpeechEncoderDecoderModel)
* [VisionEncoderDecoder](https://huggingface.co/docs/transformers/model_doc/vision_encoder_decoder#transformers.VisionEncoderDecoderModel)
* [VisionTextDualEncoder](https://huggingface.co/docs/transformers/model_doc/vision_text_dual_encoder#transformers.VisionTextDualEncoderModel)
* [Whisper](https://huggingface.co/docs/transformers/model_doc/whisper#transformers.WhisperModel)
* [Wav2Vec2](https://huggingface.co/docs/transformers/model_doc/wav2vec2#transformers.Wav2Vec2Model)
* [Hubert](https://huggingface.co/docs/transformers/model_doc/hubert#transformers.HubertModel)
@ -215,6 +223,7 @@ For now, Transformers supports SDPA inference and training for the following arc
* [CamemBERT](https://huggingface.co/docs/transformers/model_doc/camembert#transformers.CamembertModel)
* [Chameleon](https://huggingface.co/docs/transformers/model_doc/chameleon#transformers.Chameleon)
* [CLIP](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPModel)
* [GLM](https://huggingface.co/docs/transformers/model_doc/glm#transformers.GLMModel)
* [Cohere](https://huggingface.co/docs/transformers/model_doc/cohere#transformers.CohereModel)
* [data2vec_audio](https://huggingface.co/docs/transformers/main/en/model_doc/data2vec#transformers.Data2VecAudioModel)
* [Dbrx](https://huggingface.co/docs/transformers/model_doc/dbrx#transformers.DbrxModel)
@ -222,6 +231,7 @@ For now, Transformers supports SDPA inference and training for the following arc
* [Dinov2](https://huggingface.co/docs/transformers/en/model_doc/dinov2)
* [DistilBert](https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilBertModel)
* [Dpr](https://huggingface.co/docs/transformers/model_doc/dpr#transformers.DprReader)
* [EncoderDecoder](https://huggingface.co/docs/transformers/model_doc/encoder_decoder#transformers.EncoderDecoderModel)
* [Falcon](https://huggingface.co/docs/transformers/model_doc/falcon#transformers.FalconModel)
* [Gemma](https://huggingface.co/docs/transformers/model_doc/gemma#transformers.GemmaModel)
* [Gemma2](https://huggingface.co/docs/transformers/model_doc/gemma2#transformers.Gemma2Model)
@ -230,21 +240,28 @@ For now, Transformers supports SDPA inference and training for the following arc
* [GPTNeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox#transformers.GPTNeoXModel)
* [Hubert](https://huggingface.co/docs/transformers/model_doc/hubert#transformers.HubertModel)
* [Idefics](https://huggingface.co/docs/transformers/model_doc/idefics#transformers.IdeficsModel)
* [Idefics2](https://huggingface.co/docs/transformers/model_doc/idefics2#transformers.Idefics2Model)
* [Idefics3](https://huggingface.co/docs/transformers/model_doc/idefics3#transformers.Idefics3Model)
* [Granite](https://huggingface.co/docs/transformers/model_doc/granite#transformers.GraniteModel)
* [GraniteMoe](https://huggingface.co/docs/transformers/model_doc/granitemoe#transformers.GraniteMoeModel)
* [JetMoe](https://huggingface.co/docs/transformers/model_doc/jetmoe#transformers.JetMoeModel)
* [Jamba](https://huggingface.co/docs/transformers/model_doc/jamba#transformers.JambaModel)
* [Llama](https://huggingface.co/docs/transformers/model_doc/llama#transformers.LlamaModel)
* [Llava](https://huggingface.co/docs/transformers/model_doc/llava)
* [Llava-NeXT](https://huggingface.co/docs/transformers/model_doc/llava_next)
* [Llava-NeXT-Video](https://huggingface.co/docs/transformers/model_doc/llava_next_video)
* [LLaVA-Onevision](https://huggingface.co/docs/transformers/model_doc/llava_onevision)
* [M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100#transformers.M2M100Model)
* [Mimi](https://huggingface.co/docs/transformers/model_doc/mimi)
* [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral#transformers.MistralModel)
* [Mllama](https://huggingface.co/docs/transformers/model_doc/mllama#transformers.MllamaForConditionalGeneration)
* [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral#transformers.MixtralModel)
* [Moshi](https://huggingface.co/docs/transformers/model_doc/moshi#transformers.MoshiModel)
* [Musicgen](https://huggingface.co/docs/transformers/model_doc/musicgen#transformers.MusicgenModel)
* [MusicGen Melody](https://huggingface.co/docs/transformers/model_doc/musicgen_melody#transformers.MusicgenMelodyModel)
* [NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)
* [OLMo](https://huggingface.co/docs/transformers/model_doc/olmo#transformers.OlmoModel)
* [OLMo November 2024](https://huggingface.co/docs/transformers/model_doc/olmo_1124#transformers.Olmo1124Model)
* [OLMoE](https://huggingface.co/docs/transformers/model_doc/olmoe#transformers.OlmoeModel)
* [OPT](https://huggingface.co/docs/transformers/en/model_doc/opt)
* [PaliGemma](https://huggingface.co/docs/transformers/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration)
@ -273,11 +290,17 @@ For now, Transformers supports SDPA inference and training for the following arc
* [Musicgen](https://huggingface.co/docs/transformers/model_doc/musicgen#transformers.MusicgenModel)
* [MusicGen Melody](https://huggingface.co/docs/transformers/model_doc/musicgen_melody#transformers.MusicgenMelodyModel)
* [Nemotron](https://huggingface.co/docs/transformers/model_doc/nemotron)
* [SpeechEncoderDecoder](https://huggingface.co/docs/transformers/model_doc/speech_encoder_decoder#transformers.SpeechEncoderDecoderModel)
* [VideoLlava](https://huggingface.co/docs/transformers/model_doc/video_llava)
* [VipLlava](https://huggingface.co/docs/transformers/model_doc/vipllava)
* [VisionEncoderDecoder](https://huggingface.co/docs/transformers/model_doc/vision_encoder_decoder#transformers.VisionEncoderDecoderModel)
* [ViT](https://huggingface.co/docs/transformers/model_doc/vit#transformers.ViTModel)
* [ViTHybrid](https://huggingface.co/docs/transformers/model_doc/vit_hybrid#transformers.ViTHybridModel)
* [ViTMAE](https://huggingface.co/docs/transformers/model_doc/vit_mae#transformers.ViTMAEModel)
* [ViTMSN](https://huggingface.co/docs/transformers/model_doc/vit_msn#transformers.ViTMSNModel)
* [VisionTextDualEncoder](https://huggingface.co/docs/transformers/model_doc/vision_text_dual_encoder#transformers.VisionTextDualEncoderModel)
* [VideoMAE](https://huggingface.co/docs/transformers/model_doc/videomae#transformers.VideoMAEModel)
* [ViViT](https://huggingface.co/docs/transformers/model_doc/vivit#transformers.VivitModel)
* [wav2vec2](https://huggingface.co/docs/transformers/model_doc/wav2vec2#transformers.Wav2Vec2Model)
* [Whisper](https://huggingface.co/docs/transformers/model_doc/whisper#transformers.WhisperModel)
* [XLM-RoBERTa](https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.XLMRobertaModel)

View File

@ -18,11 +18,11 @@ rendered properly in your Markdown viewer.
This guide focuses on training large models efficiently on CPU.
## Mixed precision with IPEX
Mixed precision uses single (fp32) and half-precision (bf16/fp16) data types in a model to accelerate training or inference while still preserving much of the single-precision accuracy. Modern CPUs such as 3rd and 4th Gen Intel® Xeon® Scalable processors natively support bf16, so you should get more performance out of the box by enabling mixed precision training with bf16.
Mixed precision uses single (fp32) and half-precision (bf16/fp16) data types in a model to accelerate training or inference while still preserving much of the single-precision accuracy. Modern CPUs such as 3rd, 4th, and 5th Gen Intel® Xeon® Scalable processors natively support bf16. 6th Gen Intel® Xeon® Scalable processors natively support bf16 and fp16. You should get more performance out of the box by enabling mixed precision training with bf16 or fp16.
To further maximize training performance, you can use Intel® Extension for PyTorch (IPEX), which is a library built on PyTorch and adds additional CPU instruction level architecture (ISA) level support such as Intel® Advanced Vector Extensions 512 Vector Neural Network Instructions (Intel® AVX512-VNNI), and Intel® Advanced Matrix Extensions (Intel® AMX) for an extra performance boost on Intel CPUs. However, CPUs with only AVX2 (e.g., AMD or older Intel CPUs) are not guaranteed to have better performance under IPEX.
Auto Mixed Precision (AMP) for CPU backends has been enabled since PyTorch 1.10. AMP support for bf16 on CPUs and bf16 operator optimization is also supported in IPEX and partially upstreamed to the main PyTorch branch. You can get better performance and user experience with IPEX AMP.
Auto Mixed Precision (AMP) for CPU backends has been enabled since PyTorch 1.10. AMP support for bf16/fp16 on CPUs and bf16/fp16 operator optimization is also supported in IPEX and partially upstreamed to the main PyTorch branch. You can get better performance and user experience with IPEX AMP.
Check more detailed information for [Auto Mixed Precision](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/features/amp.html).
@ -32,10 +32,10 @@ IPEX release is following PyTorch, to install via pip:
| PyTorch Version | IPEX version |
| :---------------: | :----------: |
| 2.1.x | 2.1.100+cpu |
| 2.0.x | 2.0.100+cpu |
| 1.13 | 1.13.0+cpu |
| 1.12 | 1.12.300+cpu |
| 2.5.0 | 2.5.0+cpu |
| 2.4.0 | 2.4.0+cpu |
| 2.3.0 | 2.3.0+cpu |
| 2.2.0 | 2.2.0+cpu |
Please run `pip list | grep torch` to get your `pytorch_version`, so you can get the `IPEX version_name`.
```bash
@ -46,7 +46,7 @@ You can check the latest versions in [ipex-whl-stable-cpu](https://developer.int
Check more approaches for [IPEX installation](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/installation.html).
### Usage in Trainer
To enable auto mixed precision with IPEX in Trainer, users should add `use_ipex`, `bf16` and `no_cuda` in training command arguments.
To enable auto mixed precision with IPEX in Trainer, users should add `use_ipex`, `bf16` or `fp16`, and `no_cuda` in training command arguments.
Take an example of the use cases on [Transformers question-answering](https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering)
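If you drive the same example from Python instead of the command line, the equivalent switches on [`TrainingArguments`] would look roughly like this (a sketch showing only the IPEX and mixed-precision related arguments):
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs",
    use_ipex=True,  # enable Intel® Extension for PyTorch optimizations
    bf16=True,      # or fp16=True on CPUs that support fp16 mixed precision
    no_cuda=True,   # make sure training runs on CPU
)
```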

View File

@ -30,46 +30,32 @@ Check more detailed information for [oneccl_bind_pt](https://github.com/intel/to
Wheel files are available for the following Python versions:
| Extension Version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 | Python 3.10 |
| :---------------: | :--------: | :--------: | :--------: | :--------: | :---------: |
| 2.1.0 | | √ | √ | √ | √ |
| 2.0.0 | | √ | √ | √ | √ |
| 1.13.0 | | √ | √ | √ | √ |
| 1.12.100 | | √ | √ | √ | √ |
| 1.12.0 | | √ | √ | √ | √ |
| Extension Version | Python 3.7 | Python 3.8 | Python 3.9 | Python 3.10 | Python 3.11 |
| :---------------: | :--------: | :--------: | :--------: | :---------: | :---------: |
| 2.5.0 | | √ | √ | √ | √ |
| 2.4.0 | | √ | √ | √ | √ |
| 2.3.0 | | √ | √ | √ | √ |
| 2.2.0 | | √ | √ | √ | √ |
Please run `pip list | grep torch` to get your `pytorch_version`.
```bash
pip install oneccl_bind_pt=={pytorch_version} -f https://developer.intel.com/ipex-whl-stable-cpu
```
where `{pytorch_version}` should be your PyTorch version, for instance 2.1.0.
where `{pytorch_version}` should be your PyTorch version, for instance 2.4.0.
Check more approaches for [oneccl_bind_pt installation](https://github.com/intel/torch-ccl).
Versions of oneCCL and PyTorch must match.
<Tip warning={true}>
oneccl_bindings_for_pytorch 1.12.0 prebuilt wheel does not work with PyTorch 1.12.1 (it is for PyTorch 1.12.0)
PyTorch 1.12.1 should work with oneccl_bindings_for_pytorch 1.12.100
</Tip>
## Intel® MPI library
Use this standards-based MPI implementation to deliver flexible, efficient, scalable cluster messaging on Intel® architecture. This component is part of the Intel® oneAPI HPC Toolkit.
oneccl_bindings_for_pytorch is installed along with the MPI tool set. You need to source the environment before using it.
for Intel® oneCCL >= 1.12.0
```bash
oneccl_bindings_for_pytorch_path=$(python -c "from oneccl_bindings_for_pytorch import cwd; print(cwd)")
source $oneccl_bindings_for_pytorch_path/env/setvars.sh
```
for Intel® oneCCL whose version < 1.12.0
```bash
torch_ccl_path=$(python -c "import torch; import torch_ccl; import os; print(os.path.abspath(os.path.dirname(torch_ccl.__file__)))")
source $torch_ccl_path/env/setvars.sh
```
#### Intel® Extension for PyTorch installation
Intel Extension for PyTorch (IPEX) provides performance optimizations for CPU training with both Float32 and BFloat16 (refer to the [single CPU section](./perf_train_cpu) to learn more).
@ -155,7 +141,7 @@ This example assumes that you have:
The snippet below is an example of a Dockerfile that uses a base image that supports distributed CPU training and then
extracts a Transformers release to the `/workspace` directory, so that the example scripts are included in the image:
```dockerfile
FROM intel/intel-optimized-pytorch:2.3.0-pip-multinode
FROM intel/intel-optimized-pytorch:2.4.0-pip-multinode
RUN apt-get update -y && \
apt-get install -y --no-install-recommends --fix-missing \
@ -165,7 +151,7 @@ RUN apt-get update -y && \
WORKDIR /workspace
# Download and extract the transformers code
ARG HF_TRANSFORMERS_VER="4.44.0"
ARG HF_TRANSFORMERS_VER="4.46.0"
RUN pip install --no-cache-dir \
transformers==${HF_TRANSFORMERS_VER} && \
mkdir transformers && \
@ -319,4 +305,4 @@ with the job, the PyTorchJob resource can be deleted from the cluster using `kub
This guide covered running distributed PyTorch training jobs using multiple CPUs on bare metal and on a Kubernetes
cluster. Both cases utilize Intel Extension for PyTorch and Intel oneCCL Bindings for PyTorch for optimal training
performance, and can be used as a template to run your own workload on multiple nodes.
performance, and can be used as a template to run your own workload on multiple nodes.

View File

@ -53,7 +53,7 @@ sections we go through the steps to run inference on CPU and single/multi-GPU se
* [Inference on a single CPU](perf_infer_cpu)
* [Inference on a single GPU](perf_infer_gpu_one)
* [Multi-GPU inference](perf_infer_gpu_one)
* [Multi-GPU inference](perf_infer_gpu_multi)
* [XLA Integration for TensorFlow Models](tf_xla)

View File

@ -107,7 +107,8 @@ max_length = model.config.n_positions
stride = 512
seq_len = encodings.input_ids.size(1)
nlls = []
nll_sum = 0.0
n_tokens = 0
prev_end_loc = 0
for begin_loc in tqdm(range(0, seq_len, stride)):
end_loc = min(begin_loc + max_length, seq_len)
@ -124,13 +125,19 @@ for begin_loc in tqdm(range(0, seq_len, stride)):
# to the left by 1.
neg_log_likelihood = outputs.loss
nlls.append(neg_log_likelihood)
# Accumulate the total negative log-likelihood and the total number of tokens
num_valid_tokens = (target_ids != -100).sum().item() # number of valid tokens in target_ids
batch_size = target_ids.size(0)
num_loss_tokens = num_valid_tokens - batch_size # subtract batch_size due to internal label shift
nll_sum += neg_log_likelihood * num_loss_tokens
n_tokens += num_loss_tokens
prev_end_loc = end_loc
if end_loc == seq_len:
break
ppl = torch.exp(torch.stack(nlls).mean())
avg_nll = nll_sum / n_tokens # average negative log-likelihood per token
ppl = torch.exp(avg_nll)
```
Running this with the stride length equal to the max input length is equivalent to the suboptimal, non-sliding-window
@ -139,5 +146,5 @@ and the better the reported perplexity will typically be.
When we run the above with `stride = 1024`, i.e. no overlap, the resulting PPL is `19.44`, which is about the same
as the `19.93` reported in the GPT-2 paper. By using `stride = 512` and thereby employing our striding window
strategy, this jumps down to `16.45`. This is not only a more favorable score, but is calculated in a way that is
strategy, this jumps down to `16.44`. This is not only a more favorable score, but is calculated in a way that is
closer to the true autoregressive decomposition of a sequence likelihood.

View File

@ -45,19 +45,19 @@ In short, supporting a wide range of quantization methods allows you to pick the
Use the table below to help you decide which quantization method to use.
| Quantization method | On the fly quantization | CPU | CUDA GPU | RoCm GPU (AMD) | Metal (Apple Silicon) | torch.compile() support | Number of bits | Supports fine-tuning (through PEFT) | Serializable with 🤗 transformers | 🤗 transformers support | Link to library |
|-------------------------------------|-------------------------|-----|----------|----------------|-----------------------|-------------------------|----------------|-------------------------------------|--------------|------------------------|---------------------------------------------|
| [AQLM](./aqlm) | 🔴 | 🟢 | 🟢 | 🔴 | 🔴 | 🟢 | 1 / 2 | 🟢 | 🟢 | 🟢 | https://github.com/Vahe1994/AQLM |
| [AWQ](./awq) | 🔴 | 🔴 | 🟢 | 🟢 | 🔴 | ? | 4 | 🟢 | 🟢 | 🟢 | https://github.com/casper-hansen/AutoAWQ |
| [bitsandbytes](./bitsandbytes) | 🟢 | 🟡 * | 🟢 | 🟡 * | 🔴 ** | 🔴 (soon!) | 4 / 8 | 🟢 | 🟢 | 🟢 | https://github.com/bitsandbytes-foundation/bitsandbytes |
| [compressed-tensors](./compressed_tensors) | 🔴 | 🟢 | 🟢 | 🟢 | 🔴 | 🔴 | 1 - 8 | 🟢 | 🟢 | 🟢 | https://github.com/neuralmagic/compressed-tensors |
| [EETQ](./eetq) | 🟢 | 🔴 | 🟢 | 🔴 | 🔴 | ? | 8 | 🟢 | 🟢 | 🟢 | https://github.com/NetEase-FuXi/EETQ |
| GGUF / GGML (llama.cpp) | 🟢 | 🟢 | 🟢 | 🔴 | 🟢 | 🔴 | 1 - 8 | 🔴 | [See GGUF section](../gguf) | [See GGUF section](../gguf) | https://github.com/ggerganov/llama.cpp |
| [GPTQ](./gptq) | 🔴 | 🔴 | 🟢 | 🟢 | 🔴 | 🔴 | 2 - 3 - 4 - 8 | 🟢 | 🟢 | 🟢 | https://github.com/AutoGPTQ/AutoGPTQ |
| [HQQ](./hqq) | 🟢 | 🟢 | 🟢 | 🔴 | 🔴 | 🟢 | 1 - 8 | 🟢 | 🔴 | 🟢 | https://github.com/mobiusml/hqq/ |
| [Quanto](./quanto) | 🟢 | 🟢 | 🟢 | 🔴 | 🟢 | 🟢 | 2 / 4 / 8 | 🔴 | 🔴 | 🟢 | https://github.com/huggingface/quanto |
| [FBGEMM_FP8](./fbgemm_fp8.md) | 🟢 | 🔴 | 🟢 | 🔴 | 🔴 | 🔴 | 8 | 🔴 | 🟢 | 🟢 | https://github.com/pytorch/FBGEMM |
| [torchao](./torchao.md) | 🟢 | | 🟢 | 🔴 | partial support (int4 weight only) | | 4 / 8 | | 🟢🔴 | 🟢 | https://github.com/pytorch/ao |
| Quantization method | On the fly quantization | CPU | CUDA GPU | RoCm GPU (AMD) | Metal (Apple Silicon) | Intel GPU | torch.compile() support | Number of bits | Supports fine-tuning (through PEFT) | Serializable with 🤗 transformers | 🤗 transformers support | Link to library |
|-------------------------------------|-------------------------|-----|----------|----------------|-----------------------|-----------|-------------------------|----------------|-------------------------------------|--------------|------------------------|---------------------------------------------|
| [AQLM](./aqlm) | 🔴 | 🟢 | 🟢 | 🔴 | 🔴 | 🔴 | 🟢 | 1 / 2 | 🟢 | 🟢 | 🟢 | https://github.com/Vahe1994/AQLM |
| [AWQ](./awq) | 🔴 | 🟢 | 🟢 | 🟢 | 🔴 | 🟢 | ? | 4 | 🟢 | 🟢 | 🟢 | https://github.com/casper-hansen/AutoAWQ |
| [bitsandbytes](./bitsandbytes) | 🟢 | 🟡 * | 🟢 | 🟡 * | 🔴 ** | 🟡 * | 🔴 (soon!) | 4 / 8 | 🟢 | 🟢 | 🟢 | https://github.com/bitsandbytes-foundation/bitsandbytes |
| [compressed-tensors](./compressed_tensors) | 🔴 | 🟢 | 🟢 | 🟢 | 🔴 | 🔴 | 🔴 | 1 - 8 | 🟢 | 🟢 | 🟢 | https://github.com/neuralmagic/compressed-tensors |
| [EETQ](./eetq) | 🟢 | 🔴 | 🟢 | 🔴 | 🔴 | 🔴 | ? | 8 | 🟢 | 🟢 | 🟢 | https://github.com/NetEase-FuXi/EETQ |
| GGUF / GGML (llama.cpp) | 🟢 | 🟢 | 🟢 | 🔴 | 🟢 | 🔴 | 🔴 | 1 - 8 | 🔴 | [See GGUF section](../gguf) | [See GGUF section](../gguf) | https://github.com/ggerganov/llama.cpp |
| [GPTQ](./gptq) | 🔴 | 🔴 | 🟢 | 🟢 | 🔴 | 🔴 | 🔴 | 2 - 3 - 4 - 8 | 🟢 | 🟢 | 🟢 | https://github.com/AutoGPTQ/AutoGPTQ |
| [HQQ](./hqq) | 🟢 | 🟢 | 🟢 | 🔴 | 🔴 | 🔴 | 🟢 | 1 - 8 | 🟢 | 🔴 | 🟢 | https://github.com/mobiusml/hqq/ |
| [optimum-quanto](./quanto) | 🟢 | 🟢 | 🟢 | 🔴 | 🟢 | 🔴 | 🟢 | 2 / 4 / 8 | 🔴 | 🔴 | 🟢 | https://github.com/huggingface/optimum-quanto |
| [FBGEMM_FP8](./fbgemm_fp8.md) | 🟢 | 🔴 | 🟢 | 🔴 | 🔴 | 🔴 | 🔴 | 8 | 🔴 | 🟢 | 🟢 | https://github.com/pytorch/FBGEMM |
| [torchao](./torchao.md) | 🟢 | | 🟢 | 🔴 | partial support (int4 weight only) | 🔴 | | 4 / 8 | | 🟢🔴 | 🟢 | https://github.com/pytorch/ao |
<Tip>

View File

@ -14,21 +14,21 @@ rendered properly in your Markdown viewer.
-->
# Quanto
# Optimum-quanto
<Tip>
Try Quanto + transformers with this [notebook](https://colab.research.google.com/drive/16CXfVmtdQvciSh9BopZUDYcmXCDpvgrT?usp=sharing)!
Try optimum-quanto + transformers with this [notebook](https://colab.research.google.com/drive/16CXfVmtdQvciSh9BopZUDYcmXCDpvgrT?usp=sharing)!
</Tip>
[🤗 Quanto](https://github.com/huggingface/quanto) library is a versatile pytorch quantization toolkit. The quantization method used is the linear quantization. Quanto provides several unique features such as:
[🤗 optimum-quanto](https://github.com/huggingface/optimum-quanto) library is a versatile pytorch quantization toolkit. The quantization method used is the linear quantization. Quanto provides several unique features such as:
- weights quantization (`float8`,`int8`,`int4`,`int2`)
- activation quantization (`float8`,`int8`)
- modality agnostic (e.g CV,LLM)
- device agnostic (e.g CUDA,MPS,CPU)
- device agnostic (e.g CUDA,XPU,MPS,CPU)
- compatibility with `torch.compile`
- easy to add custom kernel for specific device
- supports quantization aware training
@ -37,12 +37,12 @@ Try Quanto + transformers with this [notebook](https://colab.research.google.com
Before you begin, make sure the following libraries are installed:
```bash
pip install quanto accelerate transformers
pip install optimum-quanto accelerate transformers
```
Now you can quantize a model by passing a [`QuantoConfig`] object to the [`~PreTrainedModel.from_pretrained`] method. This works for any model in any modality, as long as it contains `torch.nn.Linear` layers.
The integration with transformers only supports weights quantization. For more complex use cases such as activation quantization, calibration and quantization-aware training, you should use the [quanto](https://github.com/huggingface/quanto) library instead.
The integration with transformers only supports weights quantization. For more complex use cases such as activation quantization, calibration and quantization-aware training, you should use the [optimum-quanto](https://github.com/huggingface/optimum-quanto) library instead.
```py
from transformers import AutoModelForCausalLM, AutoTokenizer, QuantoConfig
@ -55,7 +55,7 @@ quantized_model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cud
Note that serialization is not supported yet with transformers but it is coming soon! If you want to save the model, you can use quanto library instead.
Quanto library uses linear quantization algorithm for quantization. Even though this is a basic quantization technique, we get very good results! Have a look at the following benchmark (llama-2-7b on perplexity metric). You can find more benchmarks [here](https://github.com/huggingface/quanto/tree/main/bench/generation)
Optimum-quanto library uses linear quantization algorithm for quantization. Even though this is a basic quantization technique, we get very good results! Have a look at the following benchmark (llama-2-7b on perplexity metric). You can find more benchmarks [here](https://github.com/huggingface/optimum-quanto/tree/main/bench/generation)
<div class="flex gap-4">
<div>

View File

@ -360,8 +360,8 @@ One particularly cool ЁЯдЧ Transformers feature is the ability to save a model a
```py
>>> from transformers import AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
```
</pt>
<tf>
@ -369,8 +369,8 @@ One particularly cool ЁЯдЧ Transformers feature is the ability to save a model a
```py
>>> from transformers import TFAutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
```
</tf>
</frameworkcontent>

View File

@ -386,9 +386,9 @@ The use and prompting for the conversational use is very similar to using the ba
```py
>>> import torch
>>> from transformers import IdeficsForVisionText2Text, AutoProcessor
>>> from accelerate.test_utils.testing import get_backend
>>> device = "cuda" if torch.cuda.is_available() else "cpu"
>>> device, _, _ = get_backend() # automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> checkpoint = "HuggingFaceM4/idefics-9b-instruct"
>>> model = IdeficsForVisionText2Text.from_pretrained(checkpoint, torch_dtype=torch.bfloat16).to(device)
>>> processor = AutoProcessor.from_pretrained(checkpoint)

View File

@ -256,8 +256,9 @@ image
Prepare image for the model.
```python
device = "cuda" if torch.cuda.is_available() else "cpu"
from accelerate.test_utils.testing import get_backend
# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
device, _, _ = get_backend()
inputs = processor(images=image, return_tensors="pt").to(device)
pixel_values = inputs.pixel_values
```

View File

@ -26,7 +26,7 @@ after a natural disaster, monitoring crop health, or helping screen medical imag
This guide illustrates how to:
1. Fine-tune [ViT](model_doc/vit) on the [Food-101](https://huggingface.co/datasets/food101) dataset to classify a food item in an image.
1. Fine-tune [ViT](../model_doc/vit) on the [Food-101](https://huggingface.co/datasets/food101) dataset to classify a food item in an image.
2. Use your fine-tuned model for inference.
<Tip>

View File

@ -43,8 +43,9 @@ Let's see the pipeline in action. First, initialize the pipeline. If you don't p
```python
import torch
from transformers import pipeline
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
from accelerate.test_utils.testing import get_backend
# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
DEVICE, _, _ = get_backend()
pipe = pipeline(task="image-feature-extraction", model_name="google/vit-base-patch16-384", device=DEVICE, pool=True)
```

View File

@ -120,6 +120,46 @@ print(generated_texts)
## ['User: What do we see in this image? \nAssistant: In this image we can see two cats on the nets. \nUser: And how about this image? \nAssistant: In this image we can see flowers, plants and insect.']
```
## Pipeline
The fastest way to get started is to use the [`Pipeline`] API. Specify the `"image-text-to-text"` task and the model you want to use.
```python
from transformers import pipeline
pipe = pipeline("image-text-to-text", model="llava-hf/llava-interleave-qwen-0.5b-hf")
```
The example below uses chat templates to format the text inputs.
```python
messages = [
{
"role": "user",
"content": [
{
"type": "image",
"image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg",
},
{"type": "text", "text": "Describe this image."},
],
},
{
"role": "assistant",
"content": [
{"type": "text", "text": "There's a pink flower"},
],
},
]
```
Pass the chat template formatted text and image to [`Pipeline`] and set `return_full_text=False` to remove the input from the generated output.
```python
outputs = pipe(text=messages, max_new_tokens=20, return_full_text=False)
outputs[0]["generated_text"]
# with a yellow center in the foreground. The flower is surrounded by red and white flowers with green stems
```
## Streaming
We can use [text streaming](./generation_strategies#streaming) for a better generation experience. Transformers supports streaming with the [`TextStreamer`] or [`TextIteratorStreamer`] classes. We will use the [`TextIteratorStreamer`] with IDEFICS-8B.

View File

@ -37,8 +37,9 @@ We can now initialize the pipeline with a [Swin2SR model](https://huggingface.co
```python
from transformers import pipeline
import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
from accelerate.test_utils.testing import get_backend
# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
device, _, _ = get_backend()
pipe = pipeline(task="image-to-image", model="caidas/swin2SR-lightweight-x2-64", device=device)
```

View File

@ -58,7 +58,7 @@ from transformers import TrainingArguments, Trainer
import torch
import torch.nn as nn
import torch.nn.functional as F
from accelerate.test_utils.testing import get_backend
class ImageDistilTrainer(Trainer):
def __init__(self, teacher_model=None, student_model=None, temperature=None, lambda_param=None, *args, **kwargs):
@ -66,7 +66,7 @@ class ImageDistilTrainer(Trainer):
self.teacher = teacher_model
self.student = student_model
self.loss_function = nn.KLDivLoss(reduction="batchmean")
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device, _, _ = get_backend() # automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
self.teacher.to(device)
self.teacher.eval()
self.temperature = temperature

View File

@ -125,9 +125,9 @@ the processor.
```python
from transformers import SamModel, SamProcessor
import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
from accelerate.test_utils.testing import get_backend
# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
device, _, _ = get_backend()
model = SamModel.from_pretrained("facebook/sam-vit-base").to(device)
processor = SamProcessor.from_pretrained("facebook/sam-vit-base")
```

View File

@ -53,8 +53,9 @@ Instantiate a pipeline from a [checkpoint on the Hugging Face Hub](https://huggi
```py
>>> from transformers import pipeline
>>> import torch
>>> device = "cuda" if torch.cuda.is_available() else "cpu"
>>> from accelerate.test_utils.testing import get_backend
# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> device, _, _ = get_backend()
>>> checkpoint = "depth-anything/Depth-Anything-V2-base-hf"
>>> pipe = pipeline("depth-estimation", model=checkpoint, device=device)
```
@ -126,97 +127,34 @@ Pass the prepared inputs through the model:
... outputs = model(pixel_values)
```
Let's post-process and visualize the results.
We need to pad and then resize the outputs so that predicted depth map has the same dimension as the original image. After resizing we will remove the padded regions from the depth.
Let's post-process the results to remove any padding and resize the depth map to match the original image size. The `post_process_depth_estimation` outputs a list of dicts containing the `"predicted_depth"`.
```py
>>> import numpy as np
>>> import torch.nn.functional as F
>>> # ZoeDepth dynamically pads the input image. Thus we pass the original image size as argument
>>> # to `post_process_depth_estimation` to remove the padding and resize to original dimensions.
>>> post_processed_output = image_processor.post_process_depth_estimation(
... outputs,
... source_sizes=[(image.height, image.width)],
... )
>>> predicted_depth = outputs.predicted_depth.unsqueeze(dim=1)
>>> height, width = pixel_values.shape[2:]
>>> height_padding_factor = width_padding_factor = 3
>>> pad_h = int(np.sqrt(height/2) * height_padding_factor)
>>> pad_w = int(np.sqrt(width/2) * width_padding_factor)
>>> if predicted_depth.shape[-2:] != pixel_values.shape[-2:]:
>>> predicted_depth = F.interpolate(predicted_depth, size= (height, width), mode='bicubic', align_corners=False)
>>> if pad_h > 0:
predicted_depth = predicted_depth[:, :, pad_h:-pad_h,:]
>>> if pad_w > 0:
predicted_depth = predicted_depth[:, :, :, pad_w:-pad_w]
>>> predicted_depth = post_processed_output[0]["predicted_depth"]
>>> depth = (predicted_depth - predicted_depth.min()) / (predicted_depth.max() - predicted_depth.min())
>>> depth = depth.detach().cpu().numpy() * 255
>>> depth = Image.fromarray(depth.astype("uint8"))
```
We can now visualize the results (the function below is taken from the [GaussianObject](https://github.com/GaussianObject/GaussianObject/blob/ad6629efadb57902d5f8bc0fa562258029a4bdf1/pred_monodepth.py#L11) framework).
```py
import matplotlib

def colorize(value, vmin=None, vmax=None, cmap='gray_r', invalid_val=-99, invalid_mask=None, background_color=(128, 128, 128, 255), gamma_corrected=False, value_transform=None):
    """Converts a depth map to a color image.

    Args:
        value (torch.Tensor, numpy.ndarray): Input depth map. Shape: (H, W) or (1, H, W) or (1, 1, H, W). All singular dimensions are squeezed
        vmin (float, optional): vmin-valued entries are mapped to start color of cmap. If None, value.min() is used. Defaults to None.
        vmax (float, optional): vmax-valued entries are mapped to end color of cmap. If None, value.max() is used. Defaults to None.
        cmap (str, optional): matplotlib colormap to use. Defaults to 'gray_r'.
        invalid_val (int, optional): Specifies value of invalid pixels that should be colored as 'background_color'. Defaults to -99.
        invalid_mask (numpy.ndarray, optional): Boolean mask for invalid regions. Defaults to None.
        background_color (tuple[int], optional): 4-tuple RGB color to give to invalid pixels. Defaults to (128, 128, 128, 255).
        gamma_corrected (bool, optional): Apply gamma correction to colored image. Defaults to False.
        value_transform (Callable, optional): Apply transform function to valid pixels before coloring. Defaults to None.

    Returns:
        numpy.ndarray, dtype - uint8: Colored depth map. Shape: (H, W, 4)
    """
    if isinstance(value, torch.Tensor):
        value = value.detach().cpu().numpy()

    value = value.squeeze()
    if invalid_mask is None:
        invalid_mask = value == invalid_val
    mask = np.logical_not(invalid_mask)

    # normalize valid pixels to the vmin..vmax range
    vmin = np.percentile(value[mask], 2) if vmin is None else vmin
    vmax = np.percentile(value[mask], 85) if vmax is None else vmax
    if vmin != vmax:
        value = (value - vmin) / (vmax - vmin)  # vmin..vmax
    else:
        # Avoid 0-division
        value = value * 0.

    # grey out the invalid values
    value[invalid_mask] = np.nan
    cmapper = matplotlib.colormaps.get_cmap(cmap)
    if value_transform:
        value = value_transform(value)
    value = cmapper(value, bytes=True)  # (H, W, 4)

    img = value[...]
    img[invalid_mask] = background_color

    if gamma_corrected:
        # gamma correction
        img = img / 255
        img = np.power(img, 2.2)
        img = img * 255
        img = img.astype(np.uint8)
    return img
>>> result = colorize(predicted_depth.cpu().squeeze().numpy())
>>> Image.fromarray(result)
```
<Tip>
In the [original implementation](https://github.com/isl-org/ZoeDepth/blob/edb6daf45458569e24f50250ef1ed08c015f17a7/zoedepth/models/depth_model.py#L131) the ZoeDepth model performs inference on both the original and flipped images and averages out the results. The `post_process_depth_estimation` function can handle this for us by passing the flipped outputs to the optional `outputs_flipped` argument:
```py
>>> with torch.no_grad():
...     outputs = model(pixel_values)
...     outputs_flipped = model(pixel_values=torch.flip(inputs.pixel_values, dims=[3]))
>>> post_processed_output = image_processor.post_process_depth_estimation(
...     outputs,
...     source_sizes=[(image.height, image.width)],
...     outputs_flipped=outputs_flipped,
... )
```
</Tip>
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/depth-visualization-zoe.png" alt="Depth estimation visualization"/>

View File

@ -1488,7 +1488,9 @@ Now that you have finetuned a model, evaluated it, and uploaded it to the Huggin
Load the model and image processor from the Hugging Face Hub (skip this step if you already trained a model in this session):
```py
>>> device = "cuda"
>>> from accelerate.test_utils.testing import get_backend
# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> device, _, _ = get_backend()
>>> model_repo = "qubvel-hf/detr_finetuned_cppe5"
>>> image_processor = AutoImageProcessor.from_pretrained(model_repo)

View File

@ -689,7 +689,9 @@ Reload the dataset and load an image for inference.
We will now see how to infer without a pipeline. Process the image with an image processor and place the `pixel_values` on a GPU:
```py
>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # use GPU if available, otherwise use a CPU
>>> from accelerate.test_utils.testing import get_backend
# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> device, _, _ = get_backend()
>>> encoding = image_processor(image, return_tensors="pt")
>>> pixel_values = encoding.pixel_values.to(device)
```

View File

@ -282,10 +282,10 @@ containing the corresponding speaker embedding.
>>> import os
>>> import torch
>>> from speechbrain.inference.classifiers import EncoderClassifier
>>> from accelerate.test_utils.testing import get_backend
>>> spk_model_name = "speechbrain/spkrec-xvect-voxceleb"
>>> device = "cuda" if torch.cuda.is_available() else "cpu"
>>> device, _, _ = get_backend() # automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> speaker_model = EncoderClassifier.from_hparams(
... source=spk_model_name,
... run_opts={"device": device},

View File

@ -363,10 +363,11 @@ GPU, if available, which we didn't need to do earlier when training, as [`Traine
```py
>>> from transformers import AutoProcessor, Blip2ForConditionalGeneration
>>> import torch
>>> from accelerate.test_utils.testing import get_backend
>>> processor = AutoProcessor.from_pretrained("Salesforce/blip2-opt-2.7b")
>>> model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16)
>>> device = "cuda" if torch.cuda.is_available() else "cpu"
>>> device, _, _ = get_backend() # automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> model.to(device)
```

View File

@ -182,7 +182,7 @@ There are three main components to Mask2Former:
The mask predictions are generated by combining the pixel embeddings with the final decoder hidden states. The sigmoid cross-entropy and dice losses are calculated between the logits and the ground truth mask to find the most likely mask.
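To make that flow concrete, here is a minimal inference sketch (the checkpoint below is only an illustrative Mask2Former checkpoint); the image processor's post-processing step combines the predicted class and mask logits into a per-pixel segmentation map:
```py
import torch
import requests
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# illustrative checkpoint; any Mask2Former checkpoint should behave similarly
checkpoint = "facebook/mask2former-swin-tiny-ade-semantic"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = image_processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# combine the class and mask logits into a (height, width) segmentation map
segmentation_map = image_processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
```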
Ready to try your hand at object detection? Check out our complete [image segmentation guide](tasks/semantic_segmentation) to learn how to finetune SegFormer and use it for inference!
Ready to try your hand at image segmentation? Check out our complete [image segmentation guide](tasks/semantic_segmentation) to learn how to finetune SegFormer and use it for inference!
### Depth estimation
@ -292,4 +292,4 @@ Ready to try your hand at translation? Check out our complete [translation guide
For more information about text generation, check out the [text generation strategies](generation_strategies) guide!
</Tip>
</Tip>

View File

@ -428,7 +428,7 @@ pytest --instafail
### To GPU or not to GPU
On a GPU-enabled setup, to test in CPU-only mode add `CUDA_VISIBLE_DEVICES=""`:
On a GPU-enabled setup, to test in CPU-only mode add `CUDA_VISIBLE_DEVICES=""` for CUDA GPUs:
```bash
CUDA_VISIBLE_DEVICES="" pytest tests/utils/test_logging.py
@ -441,10 +441,12 @@ second gpu if you have gpus `0` and `1`, you can run:
CUDA_VISIBLE_DEVICES="1" pytest tests/utils/test_logging.py
```
For Intel GPUs, use `ZE_AFFINITY_MASK` instead of `CUDA_VISIBLE_DEVICES` in the above example.
This is handy when you want to run different tasks on different GPUs.
Some tests must be run on CPU-only, others on either CPU or GPU or TPU, yet others on multiple-GPUs. The following skip
decorators are used to set the requirements of tests CPU/GPU/TPU-wise:
decorators are used to set the requirements of tests CPU/GPU/XPU/TPU-wise:
- `require_torch` - this test will run only under torch
- `require_torch_gpu` - as `require_torch` plus requires at least 1 GPU

View File

@ -174,7 +174,7 @@ trainer = Trainer(
processing_class=tokenizer,
data_collator=data_collator,
compute_metrics=compute_metrics,
callback=[EarlyStoppingCallback()],
callbacks=[EarlyStoppingCallback()],
)
```
@ -252,7 +252,70 @@ trainer = Trainer(..., args=training_args)
NEFTune is disabled after training to restore the original embedding layer to avoid any unexpected behavior.
## GaLore
## Liger Kernel
[Liger Kernel](https://github.com/linkedin/Liger-Kernel) is a collection of Triton kernels developed by LinkedIn specifically for LLM training. We have implemented Hugging Face-compatible RMSNorm, RoPE, SwiGLU, CrossEntropy, FusedLinearCrossEntropy, and more to come. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. The kernel works out of the box with Flash Attention, PyTorch FSDP, and Microsoft DeepSpeed.
<Tip>
Gain +20% throughput and reduce memory usage by 60% on LLaMA 3-8B model training. Achieve longer context lengths and larger batch sizes. It's also useful if you want to scale up your model to multi-head training (Medusa) or to large vocabulary sizes. See details and examples in [Liger](https://github.com/linkedin/Liger-Kernel/tree/main/examples).
</Tip>
First, make sure to install the official Liger kernel package:
```bash
pip install liger-kernel
```
Pass `use_liger_kernel=True` to apply the Liger kernel to your model, for example:
```py
from transformers import TrainingArguments
training_args = TrainingArguments(
output_dir="your-model",
learning_rate=2e-5,
per_device_train_batch_size=16,
per_device_eval_batch_size=16,
num_train_epochs=2,
weight_decay=0.01,
eval_strategy="epoch",
save_strategy="epoch",
load_best_model_at_end=True,
push_to_hub=True,
use_liger_kernel=True
)
```
The kernel supports the Llama, Gemma, Mistral, and Mixtral model architectures. The most up-to-date list of supported models can be found [here](https://github.com/linkedin/Liger-Kernel). When `use_liger_kernel` is set to `True`, the corresponding layers in the original model will be patched with Liger's efficient implementation, so you don't need to do anything extra other than setting the argument value.
## Optimizers
You can choose a built-in optimizer for training using:
```python
from transformers import TrainingArguments
training_args = TrainingArguments(..., optim="adamw_torch")
```
See [`OptimizerNames`](https://github.com/huggingface/transformers/blob/main/src/transformers/training_args.py) for a full list of choices. We include advanced examples in the sections below.
You can also use an arbitrary PyTorch optimizer via:
```python
import torch
optimizer_cls = torch.optim.AdamW
optimizer_kwargs = {
"lr": 4e-3,
"betas": (0.9, 0.999),
"weight_decay": 0.05,
}
from transformers import Trainer
trainer = Trainer(..., optimizer_cls_and_kwargs=(optimizer_cls, optimizer_kwargs))
```
### GaLore
Gradient Low-Rank Projection (GaLore) is a memory-efficient low-rank training strategy that allows full-parameter learning while using less memory than common low-rank adaptation methods such as LoRA.
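A minimal sketch of how GaLore is typically selected through the training arguments, assuming the `galore-torch` package is installed (the output directory and target-module patterns below are only illustrative):
```py
from transformers import TrainingArguments

# a minimal sketch: pick a GaLore variant via `optim` and point `optim_target_modules`
# at the modules to project; "attn" and "mlp" are illustrative regex patterns
training_args = TrainingArguments(
    output_dir="galore-run",
    optim="galore_adamw",
    optim_target_modules=["attn", "mlp"],
)
```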
@ -382,42 +445,7 @@ trainer.train()
Note that layerwise optimization is somewhat experimental and does not support DDP (Distributed Data Parallel); thus, you can run the training script only on a single GPU. Please see [this appropriate section](https://github.com/jiaweizzhao/GaLore?tab=readme-ov-file#train-7b-model-with-a-single-gpu-with-24gb-memory) for more details. Other features such as gradient clipping, DeepSpeed, etc. might not be supported out of the box. Please [raise an issue on GitHub](https://github.com/huggingface/transformers/issues) if you encounter such an issue.
## Liger Kernel
[Liger Kernel](https://github.com/linkedin/Liger-Kernel) is a collection of Triton kernels developed by LinkedIn specifically for LLM training. We have implemented Hugging Face-compatible RMSNorm, RoPE, SwiGLU, CrossEntropy, FusedLinearCrossEntropy, and more to come. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. The kernel works out of the box with Flash Attention, PyTorch FSDP, and Microsoft DeepSpeed.
<Tip>
Gain +20% throughput and reduce memory usage by 60% on LLaMA 3-8B model training. Achieve longer context lengths and larger batch sizes. It's also useful if you want to scale up your model to multi-head training (Medusa) or to large vocabulary sizes. See details and examples in [Liger](https://github.com/linkedin/Liger-Kernel/tree/main/examples).
</Tip>
First, make sure to install the official Liger kernel package:
```bash
pip install liger-kernel
```
Pass `use_liger_kernel=True` to apply the Liger kernel to your model, for example:
```py
from transformers import TrainingArguments
training_args = TrainingArguments(
output_dir="your-model",
learning_rate=2e-5,
per_device_train_batch_size=16,
per_device_eval_batch_size=16,
num_train_epochs=2,
weight_decay=0.01,
eval_strategy="epoch",
save_strategy="epoch",
load_best_model_at_end=True,
push_to_hub=True,
use_liger_kernel=True
)
```
The kernel supports the Llama, Gemma, Mistral, and Mixtral model architectures. The most up-to-date list of supported models can be found [here](https://github.com/linkedin/Liger-Kernel). When `use_liger_kernel` is set to `True`, the corresponding layers in the original model will be patched with Liger's efficient implementation, so you don't need to do anything extra other than setting the argument value.
## LOMO optimizer
### LOMO optimizer
The LOMO optimizers have been introduced in [Full Parameter Fine-Tuning for Large Language Models with Limited Resources](https://hf.co/papers/2306.09782) and [AdaLomo: Low-memory Optimization with Adaptive Learning Rate](https://hf.co/papers/2310.10195).
Both are efficient full-parameter fine-tuning methods that fuse the gradient computation and the parameter update into one step to reduce memory usage. The supported LOMO optimizers are `"lomo"` and `"adalomo"`. First, either install LOMO from PyPI with `pip install lomo-optim` or install it from source with `pip install git+https://github.com/OpenLMLab/LOMO.git`.
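A minimal sketch of selecting one of them via `TrainingArguments` (the output directory and learning rate are illustrative):
```py
from transformers import TrainingArguments

# a minimal sketch: LOMO optimizers are chosen by name via `optim` ("lomo" or "adalomo")
training_args = TrainingArguments(output_dir="lomo-run", optim="adalomo", learning_rate=1e-3)
```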
@ -467,7 +495,7 @@ trainer = trl.SFTTrainer(
trainer.train()
```
## GrokAdamW optimizer
### GrokAdamW optimizer
The GrokAdamW optimizer is designed to enhance training performance and stability, particularly for models that benefit from grokking signal functions. To use GrokAdamW, first install the optimizer package with `pip install grokadamw`.
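A minimal sketch of enabling it once the package is installed (the output directory is illustrative):
```py
from transformers import TrainingArguments

# a minimal sketch: GrokAdamW is selected by name via `optim` after `pip install grokadamw`
training_args = TrainingArguments(output_dir="grokadamw-run", optim="grokadamw")
```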
@ -518,7 +546,7 @@ trainer.train()
This script demonstrates how to fine-tune the `google/gemma-2b` model on the IMDB dataset using the GrokAdamW optimizer. The `TrainingArguments` are configured to use GrokAdamW, and the dataset is passed to the `Trainer` for training.
## Schedule Free Optimizer
### Schedule Free Optimizer
The Schedule Free optimizers have been introduced in [The Road Less Scheduled](https://hf.co/papers/2405.15682).
Schedule-Free learning replaces the momentum of the base optimizer with a combination of averaging and interpolation, to completely remove the need to anneal the learning rate with a traditional schedule.
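A minimal sketch of selecting a Schedule-Free variant (the output directory is illustrative; since the schedule is folded into the optimizer, a constant learning-rate schedule is typically paired with it):
```py
from transformers import TrainingArguments

# a minimal sketch: choose a Schedule-Free variant via `optim` and keep the learning rate constant
training_args = TrainingArguments(
    output_dir="schedule-free-run",
    optim="schedule_free_adamw",
    lr_scheduler_type="constant",
)
```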

View File

@ -287,9 +287,10 @@ model.fit(tf_dataset)
At this point, you may need to restart your notebook or execute the following code to free some memory:
```py
from accelerate.utils.memory import clear_device_cache
del model
del trainer
torch.cuda.empty_cache()
clear_device_cache()
```
Next, manually postprocess `tokenized_dataset` to prepare it for training.
@ -364,8 +365,9 @@ Lastly, specify `device` to use a GPU if you have access to one. Otherwise, trai
```py
>>> import torch
>>> from accelerate.test_utils.testing import get_backend
>>> device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
>>> device, _, _ = get_backend() # automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)
>>> model.to(device)
```

View File

@ -43,7 +43,7 @@ As a result, you can load a specific version of the model with the parameter
```py
>>> model = AutoModel.from_pretrained(
... "julien-c/EsperBERTo-small", revision="v2.0.1" # tag name, or branch name, or commit hash
... "julien-c/EsperBERTo-small", revision="4c77982" # tag name, or branch name, or commit hash
... )
```

View File

@ -1,3 +1,7 @@
- sections:
- local: pipeline_tutorial
title: Run inference with pipelines
title: Run inference with pipelines
- local: accelerate
title: Set up distributed training with 🤗 Accelerate
- local: tflite
title: Export to TFLite

View File

@ -0,0 +1,136 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Distributed training with 🤗 Accelerate
As models get bigger, parallelism has emerged as a strategy for training larger models on limited hardware and for speeding up training by several orders of magnitude. At Hugging Face, we created the [🤗 Accelerate](https://huggingface.co/docs/accelerate) library to help users easily train a 🤗 Transformers model on any kind of distributed setup, whether it is multiple GPUs on one machine or multiple GPUs across several machines. In this tutorial, learn how to customize your native PyTorch training loop to enable training in a distributed environment.
## Setup
Get started by installing 🤗 Accelerate:
```bash
pip install accelerate
```
Then import and create an [`~accelerate.Accelerator`] object. The [`~accelerate.Accelerator`] will automatically detect your type of distributed setup and initialize all the necessary components for training. You don't need to explicitly place your model on a device.
```py
>>> from accelerate import Accelerator
>>> accelerator = Accelerator()
```
## Prepare to accelerate
The next step is to pass all the relevant training objects to the [`~accelerate.Accelerator.prepare`] method. This includes your training and evaluation DataLoaders, a model, and an optimizer:
```py
>>> train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
... train_dataloader, eval_dataloader, model, optimizer
... )
```
## Backward
The last addition is to replace the typical `loss.backward()` in your training loop with 🤗 Accelerate's [`~accelerate.Accelerator.backward`] method:
```py
>>> for epoch in range(num_epochs):
... for batch in train_dataloader:
... outputs = model(**batch)
... loss = outputs.loss
... accelerator.backward(loss)
... optimizer.step()
... lr_scheduler.step()
... optimizer.zero_grad()
... progress_bar.update(1)
```
As you can see in the following code, you only need to add four additional lines of code to your training loop to enable distributed training!
```diff
+ from accelerate import Accelerator
from transformers import AdamW, AutoModelForSequenceClassification, get_scheduler
+ accelerator = Accelerator()
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
optimizer = AdamW(model.parameters(), lr=3e-5)
- device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
- model.to(device)
+ train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
+ train_dataloader, eval_dataloader, model, optimizer
+ )
num_epochs = 3
num_training_steps = num_epochs * len(train_dataloader)
lr_scheduler = get_scheduler(
"linear",
optimizer=optimizer,
num_warmup_steps=0,
num_training_steps=num_training_steps
)
progress_bar = tqdm(range(num_training_steps))
model.train()
for epoch in range(num_epochs):
for batch in train_dataloader:
- batch = {k: v.to(device) for k, v in batch.items()}
outputs = model(**batch)
loss = outputs.loss
- loss.backward()
+ accelerator.backward(loss)
optimizer.step()
lr_scheduler.step()
optimizer.zero_grad()
progress_bar.update(1)
```
## Train
Once you've added the relevant lines of code, launch your training in a script or a notebook like Colaboratory.
### Train with a script
If you are running your training from a script, run the following command to create and save a configuration file:
```bash
accelerate config
```
Then launch your training with:
```bash
accelerate launch train.py
```
### Train with a notebook
🤗 Accelerate can also run in a notebook if you're planning on using Colaboratory's TPUs. Wrap all the code responsible for training in a function, and pass it to [`~accelerate.notebook_launcher`]:
```py
>>> from accelerate import notebook_launcher
>>> notebook_launcher(training_function)
```
For more information about 🤗 Accelerate and its rich features, refer to the [documentation](https://huggingface.co/docs/accelerate).

docs/source/hi/tflite.md (new file, 55 lines)
View File

@ -0,0 +1,55 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Export to TFLite
[TensorFlow Lite](https://www.tensorflow.org/lite/guide) is a lightweight framework for deploying machine learning models on resource-constrained devices, such as mobile phones, embedded systems, and Internet of Things (IoT) devices. TFLite is designed to optimize and run models efficiently on these devices with limited compute, memory, and power consumption. A TensorFlow Lite model is represented in a special efficient portable format identified by the `.tflite` file extension.
🤗 Optimum offers functionality to export 🤗 Transformers models to TFLite through its `exporters.tflite` module. For the list of supported model architectures, please refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/tflite/overview).
To export a model to TFLite, install the required dependencies:
```bash
pip install optimum[exporters-tf]
```
To check out all available arguments, refer to the [🤗 Optimum docs](https://huggingface.co/docs/optimum/main/en/exporters/tflite/usage_guides/export_a_model),
or view the help in the command line:
```bash
optimum-cli export tflite --help
```
To export a model's checkpoint from the 🤗 Hub, for example, `google-bert/bert-base-uncased`, run the following command:
```bash
optimum-cli export tflite --model google-bert/bert-base-uncased --sequence_length 128 bert_tflite/
```
You should see logs indicating the progress and showing where the resulting `model.tflite` has been saved, like this:
```bash
Validating TFLite model...
-[✓] TFLite model output names match reference model (logits)
- Validating TFLite Model output "logits":
-[✓] (1, 128, 30522) matches (1, 128, 30522)
-[x] values not close enough, max diff: 5.817413330078125e-05 (atol: 1e-05)
The TensorFlow Lite export succeeded with the warning: The maximum absolute difference between the output of the reference model and the TFLite exported model is not within the set tolerance 1e-05:
- logits: max diff = 5.817413330078125e-05.
The exported model was saved at: bert_tflite
```
The example above illustrates exporting a checkpoint from the 🤗 Hub. When exporting a local model, first make sure that you saved both the model's weights and tokenizer files in the same directory (`local_path`). When using the CLI, pass the `local_path` to the `model` argument instead of the checkpoint name.
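For instance, a local checkpoint could be prepared along these lines (the directory name is illustrative) and then passed to `optimum-cli export tflite --model <local_path> ...`:
```py
from transformers import AutoModel, AutoTokenizer

# a minimal sketch: save the weights and the tokenizer into one directory,
# then point the CLI's `model` argument at that directory instead of a Hub checkpoint name
local_path = "local_bert_checkpoint"  # illustrative directory name
model = AutoModel.from_pretrained("google-bert/bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
model.save_pretrained(local_path)
tokenizer.save_pretrained(local_path)
```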

View File

@ -43,7 +43,7 @@ As a result, you can load a specific version of a model with the parameter
```py
>>> model = AutoModel.from_pretrained(
... "julien-c/EsperBERTo-small", revision="v2.0.1" # nome di un tag, di un branch, o commit hash
... "julien-c/EsperBERTo-small", revision="4c77982" # nome di un tag, di un branch, o commit hash
... )
```

Some files were not shown because too many files have changed in this diff.