mirror of
https://github.com/volcengine/verl.git
synced 2025-10-20 13:43:50 +08:00
[sglang] feat: add preparation for sglang+verl (#3506)
### What does this PR do?

Support NPU for verl + sglang.

```bash
bash examples/grpo_trainer/run_qwen3_8b_grpo_sglang_1k_npu.sh
```

### Accuracy test

8b:

<img width="747" height="842" alt="8b" src="https://github.com/user-attachments/assets/f36ef25a-b32f-4c76-97d0-2e5fe53ff183" />

30b:

<img width="759" height="850" alt="30b" src="https://github.com/user-attachments/assets/97979002-7ebf-47fa-ae57-3e9b6637f12c" />

### Test

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)

---------

Signed-off-by: lbk-sys <hello_lbk@163.com>
Co-authored-by: 1StepForever <wangww1Step@foxmail.com>
113
docs/ascend_tutorial/ascend_sglang_quick_start.rst
Normal file
@ -0,0 +1,113 @@
verl x Ascend
===================================

Last updated: 09/25/2025.

We have added support for Huawei Ascend devices to verl.

Hardware Support
-----------------------------------

Atlas 200T A2 Box16

Atlas 900 A2 PODc

Atlas 800T A3


Installation
-----------------------------------

Basic Environment Setup
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+-----------+-------------+
| software  | version     |
+-----------+-------------+
| Python    | == 3.11     |
+-----------+-------------+
| CANN      | == 8.3.RC1  |
+-----------+-------------+
| HDK       | == 25.3.RC1 |
+-----------+-------------+
| torch     | == 2.6.0    |
+-----------+-------------+
| torch_npu | == 2.6.0    |
+-----------+-------------+

**The sglang NPU backend in verl currently supports only the HDK, CANN, and PTA (torch_npu) versions listed above. Commercially released versions are expected in October 2025.**

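
The Python-level pins in the table above can be checked with a small script before installing anything else. The following is a minimal sketch, not part of verl: the helper names ``version_matches`` and ``report`` are hypothetical, and it treats a pin such as ``== 3.11`` as a prefix match (so ``3.11.9`` passes). CANN and HDK are system-level components and are not checked here.

.. code-block:: bash

    # Succeeds when the actual version ($2) equals the pin ($1) or extends it
    # with ".<suffix>" or "+<local tag>" (for example 2.6.0+cpu).
    version_matches() {
        case "$2" in
            "$1"|"$1".*|"$1"+*) return 0 ;;
            *) return 1 ;;
        esac
    }

    # report <name> <expected> <actual>: print one status line per component.
    report() {
        if version_matches "$2" "$3"; then
            echo "ok: $1 $3"
        else
            echo "MISMATCH: $1 expected $2, found ${3:-nothing}"
            return 1
        fi
    }

    # Usage (requires python3 with the packages already installed):
    #   report Python 3.11 "$(python3 -c 'import platform; print(platform.python_version())')"
    #   report torch 2.6.0 "$(python3 -c 'from importlib.metadata import version; print(version("torch"))')"
    #   report torch_npu 2.6.0 "$(python3 -c 'from importlib.metadata import version; print(version("torch_npu"))')"

The prefix match is a deliberate choice so that patch releases of Python 3.11 are accepted; use a plain string comparison instead if exact equality is required.
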

To use sglang in verl, install sglang, torch_memory_saver, and verl with the commands below.

Install sglang
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

    # sglang
    git clone https://github.com/sgl-project/sglang.git
    cd sglang
    # Back up the default pyproject.toml and use pyproject_other.toml in its place
    mv python/pyproject.toml python/pyproject.toml.backup
    mv python/pyproject_other.toml python/pyproject.toml
    # Editable install with the srt_npu extra
    pip install -e "python[srt_npu]"


Install torch_memory_saver
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

    # torch_memory_saver (built from the sgl-kernel-npu repository)
    git clone https://github.com/sgl-project/sgl-kernel-npu.git
    cd sgl-kernel-npu
    bash build.sh -a memory-saver
    pip install output/torch_memory_saver*.whl


Install verl
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

    git clone https://github.com/volcengine/verl.git
    cd verl
    # Install verl itself without resolving dependencies, then the NPU requirements
    pip install --no-deps -e .
    pip install -r requirements-npu.txt


Notes on Other Third-Party Libraries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+---------------+---------------+
| software      | version       |
+---------------+---------------+
| transformers  | v4.56.1       |
+---------------+---------------+
| triton_ascend | v3.2.0        |
+---------------+---------------+

1. sglang depends on transformers v4.56.1.
2. sglang depends on triton_ascend v3.2.0.
3. Multimodal models are not yet supported; uninstall the related packages torchvision and timm.

.. code-block:: bash

    # Remove the multimodal packages that are not supported yet
    pip uninstall torchvision
    pip uninstall timm
    # Remove the stock triton package; triton-ascend is installed below
    pip uninstall triton

    pip install transformers==4.56.1
    pip install -i https://test.pypi.org/simple/ triton-ascend==3.2.0.dev20250925

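
After running the commands above, it can be useful to confirm that none of the removed packages are still present. The sketch below is illustrative only: ``find_conflicts`` is a hypothetical helper that reads ``name==version`` lines (the format printed by ``pip list --format=freeze``) and prints any of torchvision, timm, or triton that it finds. It compares the full package name, so ``triton-ascend`` is not mistaken for ``triton``.

.. code-block:: bash

    # Read "name==version" lines on stdin and print the names of packages
    # that this guide asks to uninstall.
    find_conflicts() {
        while IFS= read -r line || [ -n "$line" ]; do
            name="${line%%==*}"   # keep only the part before "=="
            case "$name" in
                torchvision|timm|triton) echo "$name" ;;
            esac
        done
    }

    # Usage: empty output means no conflicting package is installed.
    #   pip list --format=freeze | find_conflicts
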

Quick Start
-----------------------------------

Before production use, we recommend running a trial GRPO training of Qwen3-8B to verify that the environment setup and installation are correct.

1. Download the dataset and preprocess it into parquet format so that it contains the fields required to compute RL rewards.

   .. code-block:: bash

       python3 examples/data_preprocess/gsm8k.py --local_save_dir ~/data/gsm8k

2. Run the training.

   .. code-block:: bash

       # Run from the verl repository root, as in step 1
       bash examples/grpo_trainer/run_qwen3_8b_grpo_sglang_1k_npu.sh
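
Before launching the training script, one can confirm that the preprocessing step actually produced its output. This is a minimal sketch under one assumption: that the gsm8k script writes ``train.parquet`` and ``test.parquet`` into the directory passed via ``--local_save_dir``. The helper name ``check_dataset`` is hypothetical.

.. code-block:: bash

    # Succeeds only if both parquet files exist and are non-empty.
    # Assumption: the preprocessing script names them train.parquet and test.parquet.
    check_dataset() {
        dir="$1"
        for split in train test; do
            if [ ! -s "$dir/$split.parquet" ]; then
                echo "missing or empty: $dir/$split.parquet" >&2
                return 1
            fi
        done
        echo "dataset ok: $dir"
    }

    # Usage:
    #   check_dataset "$HOME/data/gsm8k"
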
@ -133,6 +133,7 @@ verl is fast with:
   ascend_tutorial/ascend_quick_start.rst
   ascend_tutorial/ascend_profiling_zh.rst
   ascend_tutorial/ascend_profiling_en.rst
   ascend_tutorial/ascend_sglang_quick_start.rst

.. toctree::
   :maxdepth: 1
