frozenleaves/verl - verl - Gitea: Git for Me

mirror of https://github.com/volcengine/verl.git synced 2025-10-20 13:43:50 +08:00

Author	SHA1	Message	Date
Joel	6ff2b43d13	[ci] feat: upgrade sglang to 0.5.2 (#3613 ) ### What does this PR do? Solve https://github.com/volcengine/verl/pull/3530#issuecomment-3332840437	2025-09-26 09:25:53 +08:00
Chi Zhang	c4f4caf0cd	[misc] feat: prototype deprecate DataProto and replace with Tensordict: part 1 (#2733 ) ### What does this PR do? - Add TensorDict utilities and tests to cover the current DataProto functionalities. - Add nested tensor example to remove padding throughout the system - Add image example - Upgrade tensordict to v0.10 ### Checklist Before Starting - [ ] Search for similar PRs. Paste at least one query link here: ... - [ ] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI) - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data` - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]` - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test` - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title. - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching` ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc. ### API and Usage Example > Demonstrate how the API changes if any, and provide usage example(s) if possible. ```python # Add code snippet or script demonstrating how to use this ``` ### Design & Code Changes > Demonstrate the high-level design if this PR is complex, and list the specific changes. ### Checklist Before Submitting > [!IMPORTANT] > Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review. - [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md). - [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always` - [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs). - [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ... - [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).) --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-09 14:47:32 +08:00
Huapeng Zhou	27b63c724a	[env, sglang] feat: Bump new sglang version to fix vlm OOM (#3216 ) ### What does this PR do? - Bump new version of sglang - This version's sglang can fix vlm OOM issue, detail are in: https://github.com/sgl-project/sglang/issues/9365 ### Test Using instruction following https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/verl/multi-turn/release_log/latest_sglang.md Now we have new version of sglang: <img width="786" height="154" alt="image" src="https://github.com/user-attachments/assets/bcec557e-196c-40c0-aa0f-c19d9f5c3e98" /> `gsm8k`: using `verl/examples/sglang_multiturn/run_qwen2.5-3b_gsm8k_multiturn.sh` [Wandb](https://wandb.ai/popsoda-university-of-washington/multi-turn-grpo-qwen2.5-3b-sglang/runs/dtcdin9b?nw=nwuserpopsoda) <img width="532" height="329" alt="image" src="https://github.com/user-attachments/assets/12f67d1a-a57e-497d-bfe5-6ff8c642e83f" /> It can work well. ### Checklist Before Submitting > [!IMPORTANT] > Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review. - [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md). - [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always` - [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs). - [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ... - [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)	2025-08-26 13:29:36 +08:00
Stefan He	06bc679a57	[sglang] chore: bump transformer formers 4.54.0 and fix QWen VL issues (#2869 ) ### What does this PR do? Do not merge before: https://github.com/volcengine/verl/pull/2720 > Add concise overview of what this PR aims to achieve or accomplish. Reference related GitHub issues and PRs that help with the review. Bump xformers, fixing patched model code ### Checklist Before Starting - [x] Search for similar PRs. Paste at least one query link here: ... - [ ] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI) - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data` - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]` - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test` - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title. - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching` ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc. ### API and Usage Example > Demonstrate how the API changes if any, and provide usage example(s) if possible. ```python # Add code snippet or script demonstrating how to use this ``` ### Design & Code Changes > Demonstrate the high-level design if this PR is complex, and list the specific changes. ### Checklist Before Submitting > [!IMPORTANT] > Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review. - [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md). - [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always` - [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs). - [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ... - [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)	2025-08-03 18:30:51 +08:00
Yuge Zhang	473d8ff0c1	[env] fix: bump tensordict to 0.9.1 (#2541 ) ### What does this PR do? Bump to tensordict 0.9.1 and ban 0.9.0 per discussions in #2460. This bug: https://github.com/pytorch/tensordict/issues/1374 has an impact on dp_actor, making it crash because of the wrong batch size. ### Checklist Before Starting - [x] Search for similar PRs. Paste at least one query link here: ... - [x] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI) - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data` - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]` - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test` - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title. - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching` ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc. ### API and Usage Example > Demonstrate how the API changes if any, and provide usage example(s) if possible. ```python # Add code snippet or script demonstrating how to use this ``` ### Design & Code Changes > Demonstrate the high-level design if this PR is complex, and list the specific changes. ### Checklist Before Submitting > [!IMPORTANT] > Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review. - [x] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md). - [x] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always` - [x] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs). - [x] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ... - [x] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).	2025-07-15 19:04:07 +08:00
Chi Zhang	de38ed4218	[env] feat: upgrade tensordict version (#2460 ) ### What does this PR do? Upgrade tensordict to latest ### Checklist Before Starting - [ ] Search for similar PRs. Paste at least one query link here: ... - [ ] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI) - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data` - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]` - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test` - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title. - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching` ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc. ### API and Usage Example > Demonstrate how the API changes if any, and provide usage example(s) if possible. ```python # Add code snippet or script demonstrating how to use this ``` ### Design & Code Changes > Demonstrate the high-level design if this PR is complex, and list the specific changes. ### Checklist Before Submitting > [!IMPORTANT] > Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review. - [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md). - [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always` - [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs). - [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ... - [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).	2025-07-10 16:12:02 -07:00
Blue Space	ebb21b7fc7	[docker] refactor: Migrate images to verlai, support latest flash attention and newer CUDA versions in future (#2085 ) ### Checklist Before Starting - [ ] Searched for similar PR(s). - [ ] Checked PR Title format - In format of: [modules] type: Title - modules are in `fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data` - type is in `feat, fix, refactor, chore, test` - can involve multiple modules, seperated by `,` or space, like `[megatron, fsdp, doc] feat: xxx` ### What does this PR do? Migrate images to verlai, upgrade CUDA support to 12.6 and support latest flash attention ```txt docker ├── README.md ├── verl0.4-cu124-torch2.6-fa2.7.4 │ ├── Dockerfile.app.sglang.vllm.mcore0.12 │ ├── Dockerfile.app.sglang.vllm.mcore0.13.preview │ ├── Dockerfile.app.vllm.mcore0.12 │ ├── Dockerfile.app.vllm.mcore0.13.preview │ ├── Dockerfile.base │ └── README.md ├── verl0.5-cu126-torch2.7.1-fa2.8.0 │ ├── Dockerfile.app.sglang.mcore0.12 │ ├── Dockerfile.app.sglang.mcore0.13.preview │ ├── Dockerfile.base.fi0.2.6 │ └── README.md └── verl0.5-preview-cu128-torch2.7.1-fa2.8.0 ├── Dockerfile.app.sglang.megatron ├── Dockerfile.base.fi0.2.6 └── README.md ``` - verlai/verl - verl0.4 - base - app.sglang.vllm.mcore - app.vllm.mcore - verl0.5 - base - app.sglang.mcore - app.vllm.mcore [may not support now, for debug] - verl0.5-preview - base - app.sglang.mcore - app.vllm.mcore [may not support now, for debug] ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluatuion results, etc. ### High-Level Design > Demonstrate the high-level design if this PR is complex. ### Specific Changes > List the specific changes. ### API > Demonstrate how the API changes if any. ### Usage Example > Provide usage example(s) for easier usage. ```python # Add code snippet or script demonstrating how to use this ``` ### Checklist Before Submitting - [ ] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide). - [ ] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting). - [ ] Add `[BREAKING]` to the PR title `description` if it breaks any API. - [ ] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs). - [ ] New CI unit test(s) are added to cover the code path. - [ ] Rely on existing unit tests on CI that covers the code path.	2025-07-04 14:32:02 +08:00
Yuzhen Zhou	4de247fe4d	[sglang] refactor: Unify async rollout under SGLangRollout, and support sglang==0.4.6.post5 (#1717 ) ### Checklist Before Starting - [x] Search for similar PR(s). ### What does this PR do? - Unify the functionality of SGLangRollout and AsyncSGLangRollout, remove original SGLangRollout and rename AsyncSGLangRollout to SGLangRollout. - Make trivial changes due to modification in sglang==0.4.6.post5. ### High-Level Design > Demonstrate the high-level design if this PR is complex. ### Specific Changes > List the specific changes. ### API > Demonstrate how the API changes if any. ### Usage Example > Provide usage example(s) for easier usage. ```python # Add code snippet or script demonstrating how to use this ``` ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluatuion results, etc. ### Additional Info. - Issue Number: Fixes issue # or discussion # if any. - Training: [Note which backend this PR will affect: FSDP, Megatron, both, or none] - Inference: [Note which backend this PR will affect: vLLM, SGLang, both, or none] ### Checklist Before Submitting - [ ] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide). - [ ] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting). - [ ] Add `[BREAKING]` to the PR title if it breaks any API. - [ ] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs). - [ ] Add CI test(s) if necessary. --------- Co-authored-by: zyzshishui <@qq.com> Co-authored-by: Xiang Long <mindsculptor@yeah.net> Co-authored-by: ocss884 <ocss.lin@gmail.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> Co-authored-by: H <linhaibin.eric@gmail.com>	2025-05-31 19:47:25 -07:00
Changling	c5bc81b692	[sglang] Feat: Search Tool Invocation in Multi-Turn RL Training (#1682 )	2025-05-31 12:51:19 +08:00
Xiang Long	8160ec6a58	Bump to sglang 0.4.6.post4 & unified generate sequences ability between sgl and sgl async (#1577 ) ### Checklist Before Starting - [x] Search for similar PR(s). - Thanks to: - close #1558 due to mix of prs - close #1449 due to partial fix sgl new version issue - close #1300 which is part of current pr - This pr is co-authored with @ocss884 ### What does this PR do? > Add one-line overview of what this PR aims to achieve or accomplish. - bump sglang to 0.4.6.post4 - unified sglang and sglang_async `generate_sequences` api behavior, e.g. image support - fix warning for cuda barrier at start of fsdp_workers ### Checklist Before Submitting - [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide). - [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting). - [x] Add `[BREAKING]` to the PR title if it breaks any API. - [x] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs). - [x] Add CI test(s) if necessary. --------- Co-authored-by: ocss884 <ocss.lin@gmail.com>	2025-05-20 09:39:07 +08:00
Blue Space	43782a24bd	[Doc/Docker Image] Update mcore image to use vLLM which support qwen3 and rewrite installation from conda (#1505 ) ### Checklist Before Starting - [x] Search for similar PR(s). ### What does this PR do? Update mcore image to use vLLM which support qwen3 and rewrite installation from conda ### High-Level Design > Demonstrate the high-level design if this PR is complex. ### Specific Changes Docker image and docs ### API > Demonstrate how the API changes if any. ### Usage Example > Provide usage example(s) for easier usage. ```python # Add code snippet or script demonstrating how to use this ``` ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluatuion results, etc. ### Additional Info. - Issue Number: Fixes issue # or discussion # if any. - Training: both - Inference: both ### Checklist Before Submitting - [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide). - [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting). - [ ] Add `[BREAKING]` to the PR title if it breaks any API. - [x] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs). - [x] Add CI test(s) if neccessary.	2025-05-14 14:40:13 +08:00
湛露先生	5c3802687f	distro: clean req packages. (#1253 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2025-04-25 07:14:00 -07:00
Junrong Lin	7753d37d71	[sglang] ci: upgrade to sglang 0.4.4.post4 (#941 )	2025-04-06 10:33:53 -07:00
Qunhong Zeng	6974bbaeea	[dataset] refactor: use hf Dataset instead of pandas DataFrame in RLHFDataset for speedup (#890 ) HF Dataset provides better memory management and can handle larger datasets. It also supports multi-process acceleration during map/filter operations (while pandas requires version >2.0). Now we can specify `filter_overlong_prompts` on large-scale datasets when set `filter_overlong_prompts_workers` to a appreciate num. --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-03 21:51:53 -07:00
Xiang Long	5138a22c66	[sglang] fix: add memory saver support to sglang rollout to avoid OOMs (#756 ) as title --------- Co-authored-by: ocss884 <ocss.lin@gmail.com>	2025-03-30 08:36:16 -07:00

15 Commits