!242 Update the trl multi-node documentation

Merge pull request !242 from mamba_chen/master
mamba_chen
2025-06-18 09:35:33 +00:00
committed by i-robot
parent 14e75e69e2
commit 49302340c8

@@ -14,8 +14,10 @@ open-r1 is the official Hugging Face open-source project for the DeepSeek-R1 model, a fully open
|-----------|----------------------------------------------------------------------------------------------------------|
| Python | [3.10](https://www.python.org/downloads/) |
| CANN | In-development version* |
| NNAL | In-development version* |
| torch-npu | In-development version* |
| torch | [2.6.0](https://github.com/pytorch/pytorch/releases/tag/v2.6.0) |
| torchvision | 0.21.0 |
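* *For in-development versions, contact the relevant personnel to obtain them and get the best performance currently available.
@@ -24,8 +26,8 @@ open-r1 is the official Hugging Face open-source project for the DeepSeek-R1 model, a fully open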
```shell
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout 83f9ce4932247647c7ff91b457a5d72682d68f0d
pip install -r requirements-build.txt
git checkout 68bb122eb
pip install -r requirements/build.txt
VLLM_TARGET_DEVICE=empty pip install -e .
```
@@ -34,7 +36,7 @@ VLLM_TARGET_DEVICE=empty pip install -e .
```shell
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
git checkout c805d9a0c318bb42dec64263686be5066facf02f
git checkout c3d1a3782
COMPILE_CUSTOM_KERNELS=0 pip install -e .
```
@@ -43,7 +45,7 @@ COMPILE_CUSTOM_KERNELS=0 pip install -e .
```shell
git clone https://github.com/huggingface/trl.git
cd trl
git checkout ad3b24b0fdfb28a3089a5d8cd88eedf667cf32c3
git checkout 27adc3016
pip install -e .
```
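The checkouts above pin each repository to an abbreviated commit hash. As a minimal sketch for confirming that a working tree really is on its pinned commit, the helper below compares a full hash against a pinned prefix. `hash_matches` is a hypothetical name introduced here for illustration; it is not part of vllm, vllm-ascend, or trl.

```shell
# Hypothetical helper (not part of any of the repositories above):
# succeed when the full commit hash $1 begins with the pinned prefix $2.
hash_matches() {
  # An empty prefix would match every hash, so reject it explicitly.
  [ -n "$2" ] || return 1
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

One possible use, run from the directory that contains the trl clone: `hash_matches "$(git -C trl rev-parse HEAD)" 27adc3016 && echo "trl is on the pinned commit"`.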
@@ -54,8 +56,9 @@ pip install -e .
git clone https://github.com/huggingface/open-r1.git
cd open-r1
git checkout e128cd5edcdcb86d577250b14848357e3af807f1
# Copy some files from this project into the local open-r1 repository
cp -r ../recipes/Qwen2.5-7B-Instruct ./recipes/Qwen2.5-7B-Instruct
cp ../setup.py ./set_up.py
cp ../setup.py ./setup.py
pip install -e ".[dev]"
```
@@ -64,11 +67,12 @@ pip install -e ".[dev]"
### Single node
```shell
cd open-r1
#Start the inference server
# Run from the trl directory
# Start the inference server
trl vllm-serve --model path/to/Qwen2.5-7B-Instruct --tensor_parallel_size 1
# Run from the open-r1 directory
# Start training
ASCEND_RT_VISIBLE_DEVICES=1,2,3,4,5,6,7 ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero2.yaml --num_processes 7 \
src/open_r1/grpo.py \
@@ -80,11 +84,13 @@ ASCEND_RT_VISIBLE_DEVICES=1,2,3,4,5,6,7 ACCELERATE_LOG_LEVEL=info accelerate lau
Run on the main node:
```shell
cd open-r1
cd trl
#Start the inference server
# Run from the trl directory
# Start the inference server
trl vllm-serve --model path/to/Qwen2.5-7B-Instruct --tensor_parallel_size 1
# Run from the open-r1 directory
# Start training
ASCEND_RT_VISIBLE_DEVICES=1,2,3,4,5,6,7 ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero2.yaml \
--num_processes 14 --num_machines 2 --main_process_ip x.x.x.x(main node IP) --main_process_port 12345 --machine_rank 0 \
@@ -95,8 +101,8 @@ ASCEND_RT_VISIBLE_DEVICES=1,2,3,4,5,6,7 ACCELERATE_LOG_LEVEL=info accelerate lau
Run on the secondary node:
```shell
cd open-r1
# Run from the open-r1 directory
# Start training
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero2.yaml \
--num_processes 14 --num_machines 2 --main_process_ip x.x.x.x(main node IP) --main_process_port 12345 --machine_rank 1 \