Mirror of https://github.com/deepspeedai/DeepSpeed.git (synced 2025-10-20 23:53:48 +08:00)
Update path for BingBertSquad from DeepSpeedExamples (#6746)
In https://github.com/microsoft/DeepSpeedExamples/pull/245, the DeepSpeedExamples directory structure was refactored. This commit updates the DeepSpeedExamples paths referenced in DeepSpeed to match those changes.
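For checkouts that may still pin an older DeepSpeedExamples submodule, a small fallback can tolerate both layouts. The following is only a sketch and not part of this commit: the function name `resolve_bingbertsquad_dir` is hypothetical, and it assumes the example lives either at `training/BingBertSquad` (new layout) or `BingBertSquad` (old layout) under the examples root.

```shell
# Hypothetical helper (not part of this commit): print the BingBertSquad
# directory under a DeepSpeedExamples checkout, preferring the new layout.
resolve_bingbertsquad_dir() {
    examples_root="$1"
    # Try the refactored path first, then the pre-refactor path.
    for candidate in "training/BingBertSquad" "BingBertSquad"; do
        if [ -d "${examples_root}/${candidate}" ]; then
            printf '%s\n' "${examples_root}/${candidate}"
            return 0
        fi
    done
    echo "BingBertSquad not found under ${examples_root}" >&2
    return 1
}
```

A test script could then set its default with something like `export BingBertSquad_DIR="$(resolve_bingbertsquad_dir ../../../DeepSpeedExamples)"` instead of hard-coding one layout.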
@@ -10,14 +10,14 @@ In this tutorial we will be adding DeepSpeed to the BingBert model for the SQuAD

 If you don't already have a copy of the DeepSpeed repository, please clone in
 now and checkout the DeepSpeedExamples submodule the contains the BingBertSquad
-example (DeepSpeedExamples/BingBertSquad) we will be going over in the rest of
+example (DeepSpeedExamples/training/BingBertSquad) we will be going over in the rest of
 this tutorial.

 ```shell
 git clone https://github.com/microsoft/DeepSpeed
 cd DeepSpeed
 git submodule update --init --recursive
-cd DeepSpeedExamples/BingBertSquad
+cd DeepSpeedExamples/training/BingBertSquad
 ```

 ### Pre-requisites
@@ -136,7 +136,7 @@ You can also use a pre-trained BERT model checkpoint from either DeepSpeed, [Hug

 ### 2.1 Running BingBertSQuAD with DeepSpeed and 1-bit Adam

-We provide example scripts under [DeepSpeedExamples/BingBertSquad/1-bit_adam/](https://github.com/microsoft/DeepSpeedExamples/tree/master/BingBertSquad/1-bit_adam). There are 3 sets of scripts corresponding to NCCL-based implementation, MPI-based implementation on Ethernet systems, and MPI-based implementation on InfiniBand systems. For MPI-based implementation, we provide both example scripts when launching with deepspeed or mpirun.
+We provide example scripts under [DeepSpeedExamples/training/BingBertSquad/1-bit_adam/](https://github.com/microsoft/DeepSpeedExamples/tree/master/training/BingBertSquad/1-bit_adam). There are 3 sets of scripts corresponding to NCCL-based implementation, MPI-based implementation on Ethernet systems, and MPI-based implementation on InfiniBand systems. For MPI-based implementation, we provide both example scripts when launching with deepspeed or mpirun.

 <!-- The main part of training is done in `nvidia_run_squad_deepspeed.py`, which has
 already been modified to use DeepSpeed. The `run_squad_deepspeed.sh` script
@@ -157,7 +157,7 @@ To enable the 1-bit compressed training, 1-bit Adam uses an MPI library (E.g. MV

 ### Launch with deepspeed

-The following helper script in the DeepSpeedExamples/BingBertSQuAD will launch the training without the need for setting any `mpirun` parameters. The number of nodes and GPUs will be automatically detected and the job will be launched on all the available resources.
+The following helper script in the DeepSpeedExamples/training/BingBertSQuAD will launch the training without the need for setting any `mpirun` parameters. The number of nodes and GPUs will be automatically detected and the job will be launched on all the available resources.

 ```shell
 bash run_squad_deepspeed_onebitadam.sh <PATH_TO_OUTPUT_DIR>
@@ -93,7 +93,7 @@ done

 # Validate path to BingBertSquad script
 if [ -z "${BingBertSquad_DIR+x}" ]; then
-export BingBertSquad_DIR=../../../../DeepSpeedExamples/BingBertSquad
+export BingBertSquad_DIR=../../../../DeepSpeedExamples/training/BingBertSquad
 echo "BingBertSquad_DIR environment variable not set; trying default: ${BingBertSquad_DIR}"
 fi
 validate_folder ${BingBertSquad_DIR} "BingBertSquad_DIR"
@@ -94,7 +94,7 @@ done

 # Validate path to BingBertSquad script
 if [ -z "${BingBertSquad_DIR+x}" ]; then
-export BingBertSquad_DIR=../../../DeepSpeedExamples/BingBertSquad
+export BingBertSquad_DIR=../../../DeepSpeedExamples/training/BingBertSquad
 echo "BingBertSquad_DIR environment variable not set; trying default: ${BingBertSquad_DIR}"
 fi
 validate_folder ${BingBertSquad_DIR} "BingBertSquad_DIR"
@@ -31,7 +31,7 @@ validate_folder() {

 # Validate path to BingBertSquad script
 if [ -z "${BingBertSquad_DIR+x}" ]; then
-export BingBertSquad_DIR=../../../DeepSpeedExamples/BingBertSquad
+export BingBertSquad_DIR=../../../DeepSpeedExamples/training/BingBertSquad
 echo "BingBertSquad_DIR environment variable not set; trying default: ${BingBertSquad_DIR}"
 fi
 validate_folder ${BingBertSquad_DIR} "BingBertSquad_DIR"
@@ -10,11 +10,11 @@ import sys
 import pytest
 import json

-sys.path.append("../../../DeepSpeedExamples/BingBertSquad")
+sys.path.append("../../../DeepSpeedExamples/training/BingBertSquad")
 import evaluate as eval

 squad_dir = "/data/BingBertSquad"
-base_dir = "../../../DeepSpeedExamples/BingBertSquad"
+base_dir = "../../../DeepSpeedExamples/training/BingBertSquad"

 script_file_name = "run_squad_deepspeed.sh"
 model_file_name = "training_state_checkpoint_162.tar"