update dependencies version info (#7206)
The release versions are now available; update the instructions from "use the master branch" to the minimum required versions instead. Also link the example: https://github.com/deepspeedai/DeepSpeedExamples/pull/964 --------- Signed-off-by: inkcherry <mingzhi.liu@intel.com>
@@ -48,9 +48,15 @@ Figure 2 illustrates the basic flowchart, The division of TP and ZeRO is impleme
# Usage
Although we evaluated AutoTP training with Llama2 & Llama3 models in this blog, we expect compatibility with other Hugging Face models, especially [those](https://www.deepspeed.ai/tutorials/automatic-tensor-parallelism/) previously validated with AutoTP inference.
**Requirements**
- `deepspeed >= 0.16.4`
- `transformers >= 4.50.1`
- `accelerate >= 1.6.0`
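A quick way to confirm that the installed packages meet these minimums is a small check like the following (a convenience sketch, not part of DeepSpeed; it assumes the `packaging` library is installed):

```
from importlib.metadata import version
from packaging.version import Version

# Minimum versions listed above.
minimums = {"deepspeed": "0.16.4", "transformers": "4.50.1", "accelerate": "1.6.0"}

for pkg, minimum in minimums.items():
    installed = version(pkg)
    assert Version(installed) >= Version(minimum), f"{pkg} {installed} < required {minimum}"
```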
**Enable TP training**
Similar to ZeRO, AutoTP training is enabled using the [deepspeed configuration file](https://www.deepspeed.ai/docs/config-json/) by specifying ```[tensor_parallel][autotp_size]```.
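As a minimal sketch of what that looks like (the TP degree of 4 and the surrounding keys are illustrative placeholders, not required values), the config can be written out as JSON or passed as a dict to HF ```TrainingArguments(deepspeed=...)```:

```
import json

# Illustrative config: the "tensor_parallel"/"autotp_size" entry enables AutoTP
# training; the other keys here are common placeholders, not required values.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {"stage": 1},
    "tensor_parallel": {"autotp_size": 4},
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```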
@@ -113,12 +119,10 @@ Models saved this way can be directly used for HF format inference without inter
Saving checkpoints remains compatible with HF transformers. Use [trainer.save_state()](https://huggingface.co/docs/transformers/v4.49.0/en/main_classes/trainer#transformers.Trainer.save_state) or set the save interval for automatic saving; the resulting checkpoints can be used to resume training.
```
trainer.train(resume_from_checkpoint="your_saved_path/checkpoint-1200")
```
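For the automatic-save path, a minimal sketch (the values below are placeholders) is to set the interval through ```TrainingArguments```; ```trainer.save_state()``` can additionally be called manually at any point:

```
from transformers import TrainingArguments

# Placeholder values: write a full checkpoint every 1200 steps, so a later run
# can resume with resume_from_checkpoint="your_saved_path/checkpoint-1200".
training_args = TrainingArguments(
    output_dir="your_saved_path",
    save_strategy="steps",
    save_steps=1200,
)
```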
# Example
We validated AutoTP training using a supervised fine-tuning (SFT) task: [stanford_alpaca](https://github.com/tatsu-lab/stanford_alpaca). The original benchmark model used in this project is Llama2-7B. The example code is also available [here](https://github.com/deepspeedai/DeepSpeedExamples/tree/master/training/tensor_parallel).
**Training Loss curve**
@@ -216,7 +220,7 @@ The following loss curves depict SFT training, where gbs is uniformly set to 32,
# Miscellaneous
If you define your own dataloader, please ensure data consistency within ```deepspeed.utils.groups.get_tensor_model_parallel_group()```. DeepSpeed provides basic validation functions to assist with this.
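As an illustration (a hand-rolled sketch, not DeepSpeed's built-in validation helper), one way to check this is to broadcast one rank's batch inside the tensor-parallel group and compare it locally:

```
import torch
import torch.distributed as dist
from deepspeed.utils import groups

def assert_tp_batch_consistent(batch: torch.Tensor) -> None:
    # Every rank in the same tensor-parallel group must receive identical input data.
    tp_group = groups.get_tensor_model_parallel_group()
    reference = batch.clone()
    # Broadcast the batch held by the first rank of this TP group, then compare locally.
    src = dist.get_global_rank(tp_group, 0)
    dist.broadcast(reference, src=src, group=tp_group)
    assert torch.equal(batch, reference), "dataloader is inconsistent within the TP group"
```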
Furthermore, if you are not using the transformers library, you can replace the ```TensorParallel_Layer``` layer and its subclasses as needed. See the ```prepare_tp_model``` function in ```unit/model_parallelism/test_autotp_training.py```. You can also define different shard and gather behavior for subclasses of ```TensorParallel_Layer```.