frozenleaves/verl - verl - Gitea: Git for Me

mirror of https://github.com/volcengine/verl.git synced 2025-10-20 13:43:50 +08:00

Author	SHA1	Message	Date
Chi Zhang	515f2255ac	[ci] fix: use local models/configs/datasets to increase stability (#3616 ) ### What does this PR do? - As title ### Checklist Before Starting - [ ] Search for similar PRs. Paste at least one query link here: ... - [ ] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI) - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data` - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]` - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test` - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title. - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching` ### Test > For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc. ### API and Usage Example > Demonstrate how the API changes if any, and provide usage example(s) if possible. ```python # Add code snippet or script demonstrating how to use this ``` ### Design & Code Changes > Demonstrate the high-level design if this PR is complex, and list the specific changes. ### Checklist Before Submitting > [!IMPORTANT] > Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review. - [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md). - [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always` - [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs). 
- [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ... - [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)	2025-09-25 22:14:56 +08:00
Chi Zhang	6e6fafdc74	[model] feat: add FSDP/Megatron critic worker with model engine (#3439 ) ### What does this PR do? - As title - Add a test to compare the output of FSDP/Megatron engine with huggingface model --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-13 12:18:58 +08:00
Chi Zhang	b03866768f	[ci] feat: move more tests to volcano engine (#3455 )	2025-09-12 18:54:55 +08:00
Chi Zhang	5c46f4f437	[model] feat: replace DataProto with TensorDict in engine (#3422 )	2025-09-09 22:28:25 +08:00
Chi Zhang	d7a0469977	[model] feat: polish model engine (#3321 )	2025-09-03 20:44:39 +08:00
Chi Zhang	91ee0a2c08	[fsdp, model] feat: support FSDP model engine (#3270 ) ### What does this PR do? - Support FSDPEngine and FSDPEngineWithLMHead - Add tests and show that fsdp engine matches with mcore and huggingface on QWen 2.5 0.5b model --------- Co-authored-by: ziheng.jiang <ziheng.jiang@bytedance.com>	2025-09-01 16:17:45 +08:00
Chi Zhang	1065a29d14	[megatron, model] feat: add MegatronEngine, MegatronEngineForCausalLM (#3235 )	2025-08-28 19:36:05 +08:00
Huapeng Zhou	27b63c724a	[env, sglang] feat: Bump new sglang version to fix vlm OOM (#3216 ) ### What does this PR do? - Bump sglang to a new version - This sglang version fixes the VLM OOM issue; details are in https://github.com/sgl-project/sglang/issues/9365 ### Test Followed the instructions at https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/verl/multi-turn/release_log/latest_sglang.md The new sglang version is now installed: <img width="786" height="154" alt="image" src="https://github.com/user-attachments/assets/bcec557e-196c-40c0-aa0f-c19d9f5c3e98" /> `gsm8k`: using `verl/examples/sglang_multiturn/run_qwen2.5-3b_gsm8k_multiturn.sh` [Wandb](https://wandb.ai/popsoda-university-of-washington/multi-turn-grpo-qwen2.5-3b-sglang/runs/dtcdin9b?nw=nwuserpopsoda) <img width="532" height="329" alt="image" src="https://github.com/user-attachments/assets/12f67d1a-a57e-497d-bfe5-6ff8c642e83f" /> It works well.	2025-08-26 13:29:36 +08:00
Blue Space	9b6a07fa77	[docker] feat: update to vllm 0.10.0, mcore 0.13, transformers 4.55.4 (#3192 )	2025-08-26 05:17:57 +08:00
Blue Space	ae46f5a41a	[ci] fix: model tests, transformers 4.55 has troubles with backward (#3139 ) ### What does this PR do? [ci] fix: model tests, transformers 4.55 has troubles with backward	2025-08-20 13:33:12 +08:00
杨睿	3ebe6717ad	[megatron] fix: retain MLA config in mcore config converter (#2933 ) ### What does this PR do? - The current `check_and_disable_incompatible_configs` function drops any config entry that is not an attribute of `TransformerConfig`. When `MLATransformerConfig` is used, it therefore drops MLA settings such as `q_lora_rank`, which causes many problems in the downstream pipeline. - This PR refactors `check_and_disable_incompatible_configs` into a factory function, `check_and_construct_configs`, which accepts a class type bounded by `TransformerConfig` and returns a `TransformerConfig` instance. @ETOgaosion --------- Co-authored-by: gaoziyuan <gaoziyuan.955@bytedance.com>	2025-08-07 12:35:18 +08:00
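The factory pattern described in commit 3ebe6717ad can be illustrated with a minimal sketch. The two dataclasses below are simplified stand-ins for the Megatron-Core config classes, and `construct_config` is a hypothetical name chosen for this illustration; none of this is the code from the PR itself.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional, Type, TypeVar


@dataclass
class TransformerConfig:
    """Simplified stand-in for the real config class (assumption)."""
    num_layers: int = 2
    hidden_size: int = 64


@dataclass
class MLATransformerConfig(TransformerConfig):
    """Stand-in subclass that adds one MLA-specific field (assumption)."""
    q_lora_rank: Optional[int] = None


T = TypeVar("T", bound=TransformerConfig)


def construct_config(config_cls: Type[T], raw: dict[str, Any]) -> T:
    """Build `config_cls` from `raw`, dropping keys that class does not declare."""
    # fields() includes inherited fields, so subclass-only keys survive
    # when the subclass is the one being constructed.
    allowed = {f.name for f in fields(config_cls)}
    kept = {key: value for key, value in raw.items() if key in allowed}
    return config_cls(**kept)


raw = {"num_layers": 4, "hidden_size": 128, "q_lora_rank": 512, "unknown_flag": True}
base_cfg = construct_config(TransformerConfig, raw)
mla_cfg = construct_config(MLATransformerConfig, raw)
```

Because the filter is computed from the class passed in rather than from the base class, `q_lora_rank` is kept for the MLA config and dropped only for the plain one.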
Blue Space	f32e54deaa	[docker] feat: Upgrade sglang 0.4.9 + transformers 4.53.2 (#2794 ) ### What does this PR do? feat: Upgrade sglang 0.4.9 + transformers 4.53.2 ### Checklist Before Starting - [x] Search for similar PRs. Paste at least one query link here: ... - [x] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)	2025-07-31 00:49:27 +08:00
H	f98ee1c697	[cfg] fix: fix failing rollout config test on main (#2771 ) ### What does this PR do? The CPU unit test broke when https://github.com/volcengine/verl/pull/2757/files was merged. --------- Co-authored-by: gaoziyuan <gaoziyuan.955@bytedance.com>	2025-07-28 16:43:56 +08:00
Blue Space	4879d619fc	[docker] feat: upgrade to torch 2.7, sglang 0.4.8 (#2617 ) ### What does this PR do? [docker] feat: upgrade to torch 2.7, sglang 0.4.8 Stage 2: vllm 0.9.1 Stage 3: mcore 0.13.0 --------- Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>	2025-07-24 14:53:24 -07:00
Blue Space	69a467f934	[docker] fix: downgrade TransformerEngine version 2.2.1 to allow mcore image using rope fusion and provide another set of v0.5 image (#2611 ) ### What does this PR do? Downgrade TransformerEngine version to allow mcore image using rope fusion and provide another set of v0.5 image.	2025-07-18 17:23:19 +08:00
Blue Space	ebb21b7fc7	[docker] refactor: Migrate images to verlai, support latest flash attention and newer CUDA versions in future (#2085 ) ### What does this PR do? Migrate images to verlai, upgrade CUDA support to 12.6 and support latest flash attention ```txt docker ├── README.md ├── verl0.4-cu124-torch2.6-fa2.7.4 │ ├── Dockerfile.app.sglang.vllm.mcore0.12 │ ├── Dockerfile.app.sglang.vllm.mcore0.13.preview │ ├── Dockerfile.app.vllm.mcore0.12 │ ├── Dockerfile.app.vllm.mcore0.13.preview │ ├── Dockerfile.base │ └── README.md ├── verl0.5-cu126-torch2.7.1-fa2.8.0 │ ├── Dockerfile.app.sglang.mcore0.12 │ ├── Dockerfile.app.sglang.mcore0.13.preview │ ├── Dockerfile.base.fi0.2.6 │ └── README.md └── verl0.5-preview-cu128-torch2.7.1-fa2.8.0 ├── Dockerfile.app.sglang.megatron ├── Dockerfile.base.fi0.2.6 └── README.md ``` - verlai/verl - verl0.4 - base - app.sglang.vllm.mcore - app.vllm.mcore - verl0.5 - base - app.sglang.mcore - app.vllm.mcore [may not support now, for debug] - verl0.5-preview - base - app.sglang.mcore - app.vllm.mcore [may not support now, for debug]	2025-07-04 14:32:02 +08:00
H	cfc5ff2452	[ci] fix: add tests for vllm (#2036 ) ### Checklist Before Starting - [x] Searched for similar PR(s). - [x] Checked PR Title format ### What does this PR do? Fix the failing vllm test ### Test Added one more test to make sure that a problematic tool class fails during initialization --------- Co-authored-by: wuxibin <wuxibin@bytedance.com>	2025-06-16 18:27:28 +08:00
H	5fa911b3ce	[ci] refactor: setup testing guidance (#1958 )	2025-06-12 06:16:58 -07:00
OC	9afa8d6dff	fix error when ci failed by incorrect sgl-kernel version (#1872 ) ### Checklist Before Starting - [ done ] Search for similar PR(s). ### What does this PR do? Fix ci failure from incorrect sgl-kernel version in docker image: ``` File "/usr/local/lib/python3.10/dist-packages/sglang/srt/utils.py", line 647, in assert_pkg_version raise Exception( Exception: sgl-kernel is installed with version 0.1.0, which is less than the minimum required version 0.1.1. Please reinstall the latest version with `pip install sgl-kernel --force-reinstall` ```	2025-06-06 13:55:08 +08:00
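The failure quoted in commit 9afa8d6dff comes from a minimum-version assertion. A minimal, stdlib-only sketch of that kind of check is shown below; `parse_release` and `assert_min_version` are hypothetical helpers written for this illustration, not sglang's `assert_pkg_version`, and suffixes such as `.post4` are deliberately ignored rather than ordered.

```python
import re


def parse_release(text: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '0.4.6.post4' -> (0, 4, 6)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unparseable version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_older(installed: str, minimum: str) -> bool:
    """True when `installed` is strictly below `minimum`."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.1' and '0.1.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def assert_min_version(package: str, installed: str, minimum: str) -> None:
    """Raise if the installed version is below the required minimum."""
    if is_older(installed, minimum):
        raise RuntimeError(
            f"{package} is installed with version {installed}, "
            f"which is less than the minimum required version {minimum}."
        )


# The combination from the traceback above fails this check.
try:
    assert_min_version("sgl-kernel", "0.1.0", "0.1.1")
except RuntimeError as err:
    message = str(err)
```

Running a check like this when the image is built, rather than at import time inside CI, would surface a mismatched pin earlier.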
Blue Space	7b0426a738	[Docker Image] update images and fix sglang installation (#1606 ) ### What does this PR do? update images and fix sglang installation, the latest image: `whatcanyousee/verl:ngc-cu124-vllm0.8.5-sglang0.4.6-mcore0.12.0-te2.3` ### Specific Changes - vLLM: 0.8.5.post1 - SGLang: 0.4.6.post4, fix installation - Megatron: core_v0.12.0 announcement - TransformerEngine: 2.3 ### Checklist Before Submitting - [x] Add CI test(s) if necessary.	2025-05-21 09:13:51 +08:00
Blue Space	b8bd596811	[Docker Image] use latest vLLM (0.8.5) to fully support Qwen3 moe (#1544 )	2025-05-17 07:28:55 +08:00
Blue Space	43782a24bd	[Doc/Docker Image] Update mcore image to use vLLM which support qwen3 and rewrite installation from conda (#1505 ) ### Checklist Before Starting - [x] Search for similar PR(s). ### What does this PR do? Update the mcore image to use a vLLM version that supports Qwen3, and rewrite installation from conda. ### Specific Changes Docker image and docs ### Additional Info. - Training: both - Inference: both ### Checklist Before Submitting - [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide). - [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting). - [ ] Add `[BREAKING]` to the PR title if it breaks any API. - [x] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs). - [x] Add CI test(s) if necessary.	2025-05-14 14:40:13 +08:00
Hongpeng Guo	d4a11ebb44	[utils] Enrich and fix utils from `fsdp_utils` and `seqlen_balancing` (#1495 )

### Checklist Before Starting

- [x] Search for similar PR(s).

### What does this PR do?

Enrich and fix utility functions in `verl/utils/fsdp_utils.py` and `verl/utils/seqlen_balancing.py`.

* In `get_fsdp_wrap_policy`, introduce a unified `_get_attr` helper so that both dict-based (OmegaConf) and dataclass-style configs work.
* In `rearrange_micro_batches`, add two new parameters (`same_micro_num_in_dp`, `min_num_micro_batch`).
* Also reorganize the workflow pipeline structure so that it aligns better with the verl file structure.

### API

In `verl.utils.seqlen_balancing.rearrange_micro_batches`, add two new parameters (`same_micro_num_in_dp`, `min_num_micro_batch`).

### Usage Example

```python
# A very toy example; input_ids and attention_mask are assumed to be existing tensors
from verl import DataProto
from verl.utils.seqlen_balancing import rearrange_micro_batches

dataproto = DataProto.from_single_dict({"input_ids": input_ids, "attention_mask": attention_mask})
micros, idx_map = rearrange_micro_batches(dataproto.batch, max_token_len=300, same_micro_num_in_dp=False, min_num_micro_batch=2)
```

### Test

* Added in `tests/utils/gpu_tests/test_seqlen_balancing.py`

### Checklist Before Submitting

- [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [x] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add CI test(s) if necessary.

---------

Signed-off-by: Hongpeng Guo <hg5@illinois.edu>
	2025-05-13 17:01:16 +08:00
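The unified `_get_attr` helper mentioned in the commit above could look roughly like the following. This is a minimal sketch, not the code from `verl/utils/fsdp_utils.py`: the function name `get_attr`, its signature, and the `WrapPolicyConfig` dataclass with its `min_num_params` field are hypothetical, chosen only to show how one accessor can read a field from either a mapping-style or an attribute-style config.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any


def get_attr(config: Any, name: str, default: Any = None) -> Any:
    """Read `name` from a mapping-style or attribute-style config (hypothetical helper)."""
    if config is None:
        return default
    if isinstance(config, Mapping):
        # Mapping-style configs: a plain dict or any other Mapping implementation.
        return config.get(name, default)
    # Dataclass-style configs, or any object that exposes attributes.
    return getattr(config, name, default)


@dataclass
class WrapPolicyConfig:  # hypothetical config type, for illustration only
    min_num_params: int = 0


dict_cfg = {"min_num_params": 1000}
dataclass_cfg = WrapPolicyConfig(min_num_params=2000)

print(get_attr(dict_cfg, "min_num_params"))
print(get_attr(dataclass_cfg, "min_num_params"))
print(get_attr(dict_cfg, "missing", default=-1))
```

Callers then never need to branch on the config type; a missing key or attribute falls back to `default` in both cases.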
H	249c26fdc8	[tests] BREAKING: move recipe.dapo.src to recipe.dapo; move test files to their own namespaces (tests/verl/xxx -> tests/xxx) (#1392 )	2025-05-10 11:21:53 +08:00
Shawn/Yuxuan Tong	8bb009bf47	[CI] feat: separate FSDP2 test & fix: CI trigger (#1389 ) ### Checklist Before Starting - [x] Search for similar PR(s). ### What does this PR do? 1. Separate the FSDP2 test to avoid blocking other tests. 2. Fix the CI trigger rule to avoid redundant runs (I found that the original PR triggered unrelated tests, so I fixed the rule based on [the doc](https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#onpushpull_requestpull_request_targetpathspaths-ignore)). ### Test For 2, I tested by commenting out the matching path for the workflow `.yml` and confirmed that only related workflows are triggered: Before: <img width="870" alt="image" src="https://github.com/user-attachments/assets/2f7dbe0c-f638-4a75-8cbc-a364081271fc" /> After: <img width="869" alt="image" src="https://github.com/user-attachments/assets/f5a35d85-f03c-452e-abed-3ca3ce22d699" /> ### Additional Info. - Issue Number: https://github.com/volcengine/verl/issues/1388 - Training: FSDP - Inference: none ### Checklist Before Submitting - [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide). - [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting). - [x] Add `[BREAKING]` to the PR title if it breaks any API. - [x] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs). - [x] Add CI test(s) if necessary.	2025-05-05 07:20:35 -07:00
Xiang Long	e0d035cd4a	[sglang] feat: Add SGLang async multi-turn rollout with tool support (#1037 ) A redesigned version of #917 ## Current Status [Develop log & Tracker](https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/issues/113) What Has Been Done - Async Rollout Refactoring: Integrate with the tool server to coordinate tool calls during generation, leveraging request IDs for state and progress tracking, and support async multi-turn conversations in agentic RL training (with tool support). - Async Request Management: Encapsulate rollout requests into a unified structure, enabling efficient tracking and handling of concurrent multi-turn dialogues with ChatML-style messages. - Extensible Tools: A modular design for adapting tools in the OpenAIFunctionTool format, which is supported by both SGLang and vLLM. Each tool creates a separate instance, executes when a tool call occurs, calculates a score according to the tool environment state, and releases its resources. - Multi-turn support has been implemented for the GSM8K task (a new version is in progress). However, training has not yet converged, and we hope the community will join in investigating the issue. What Is WIP - [x] Merge loss mask to training process from last version - [x] Add more user-friendly tool config and e2e tests for gsm8k with tool training - [ ] We are going to validate our multi-turn feature in open-source sandbox environments. ## Key features to be introduced in future versions - Integrate a Ray-based agent trainer to enable explicit separation of the rollout and training pipeline. Provide support for partial rollout handling and fine-grained request state management. - Extend the framework to support simulated user interactions (e.g., roleplay, interactive feedback) and more complex environment-in-the-loop RL tasks. 
Future Plan [Discussion Thread](https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/issues/74#issuecomment-2763192625) [RFC doc](https://github.com/SwordFaith/verl-sglang-dev-log/blob/main/rlhf/verl/multi-turn/veRL-multiturn-rollout-RFC.md) will be updated soon. ## Contributors & Acknowledgement - Xiang Long [mid.of.change@gmail.com](mailto:mid.of.change@gmail.com) @SwordFaith (Design RFC & core-dev of refactor part) - Yuzhen Zhou [zyzshishui@gmail.com](mailto:zyzshishui@gmail.com) @zyzshishui (Core-dev) - Chenyang Zhao [zhaochen20@outlook.com](mailto:zhaochen20@outlook.com) @zhaochenyang20 (PM) - Guanhua Wang @WANG-GH - Junrong Lin @ocss884 (verl-sglang support) - Hanchen Zhang [zhanghanchen77@gmail.com](mailto:zhanghanchen77@gmail.com) - Haoran Wang [ubecwang@gmail.com](mailto:ubecwang@gmail.com) - Rui Lu [learningrate1@gmail.com](mailto:learningrate1@gmail.com) - Yujiang Li [liyujiang2020@gmail.com](mailto:liyujiang2020@gmail.com) - Jiajun Li [guapisolo@gmail.com](mailto:guapisolo@gmail.com) - Jin Pan [jpan236@wisc.edu](mailto:jpan236@wisc.edu) - Zhi Zheng [zhengzhi@modelbest.cn](mailto:zhengzhi@modelbest.cn) @zh-zheng --------- Co-authored-by: zyzshishui <492129152@qq.com> Co-authored-by: guanhua <281484683@qq.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> Co-authored-by: ocss884 <ocss.lin@gmail.com> Co-authored-by: Shawn/Yuxuan Tong <tongyuxuan361@gmail.com> Co-authored-by: HL <linhaibin.eric@gmail.com>	2025-04-29 13:20:06 -07:00
Blue Space	0234d8e3ab	fix reward model and add CI test (#1252 ) Fix bugs related to #1165 . The Megatron backend reward model had no CI test, so one is added to the current PPO trainer tests. Fix `micro_batch_size_per_gpu`, though it is not certain that this is right for the reward config. The output format is also not right with the current `forward_micro_batch` implementation.	2025-04-29 21:20:21 +08:00
Shawn/Yuxuan Tong	fbb93e44b1	[CI] feat: only test for push to main (#1271 )	2025-04-27 09:51:09 +08:00
Blue Space	a35c044627	Migrate to new image with FlashInfer 0.2.2 + vLLM 0.8.3 + SGLang 0.4.5 + MCore 0.12.0 + TE 2.2 + cuDNN 9.8.0 (#1237 ) Since both are supported, we now let TE choose the attention backend. New Image: `whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2`	2025-04-24 16:14:48 +08:00
HL	5313d96f9b	[CI] fix: add additional pre-commit test before ppo trainer tests (#1175 )	2025-04-20 11:16:19 -07:00
HL	568239fb38	CI: limit ruff checks and enable push tests (#1157 )	2025-04-19 13:54:45 +08:00
Shawn/Yuxuan Tong	5ba1dbc606	[ci] feat: improve CI speed to 1-2min per test (#1032 )

### Summary

#### Minimize Test Workloads

This PR minimizes the test workloads while keeping them meaningful, reducing the time cost of a test from >10 min to 1~2 min. Specifically, we

1. set batch sizes and steps as small but still meaningful numbers:

```bash
train_traj_micro_bsz_per_gpu=2 # b
n_resp_per_prompt=4 # g
train_traj_micro_bsz=$((train_traj_micro_bsz_per_gpu * NUM_GPUS)) # b * n
train_traj_mini_bsz=$((train_traj_micro_bsz * 2)) # 2 * b * n
train_prompt_mini_bsz=$((train_traj_mini_bsz * n_resp_per_prompt)) # 2 * b * n / g
train_prompt_bsz=$((train_prompt_mini_bsz * 2)) # 4 * b * n / g
# ...
TOT_TRAIN_STEPS=${TOT_TRAIN_STEPS:-1}
```

2. disable validation (this costs a lot!) / saving / resuming for training tests by default and leave them to specialized tests

```bash
# Validation
VAL_BEFORE_TRAIN=${VAL_BEFORE_TRAIN:-False}
TEST_FREQ=${TEST_FREQ:--1}
# Save & Resume
RESUME_MODE=${RESUME_MODE:-disable}
SAVE_FREQ=${SAVE_FREQ:--1}
```

#### Improve Triggering Mode

This PR introduces a more comprehensive triggering logic. Specifically, we

1. consider all Python code by default
2. include related entrypoints (the workflow config, scripts used by it and hydra config, etc.)
3. exclude unrelated Python code from other components (e.g., recipes, examples, Megatron, SFT, generation, evaluation, etc. for FSDP training)

An example from `e2e_ppo_trainer`:

```yaml
on:
  paths:
    - "**/*.py"
    # Entrypoints
    - ".github/workflows/e2e_ppo_trainer.yml"
    - "examples/data_preprocess/gsm8k.py"
    - "examples/data_preprocess/geo3k.py"
    - "tests/e2e/ppo_trainer"
    - "verl/trainer/main_ppo.py"
    - "verl/trainer/config/ppo_trainer.yaml"
    - "!examples"
    - "!verl/trainer/main_*.py"
    - "!verl/trainer/fsdp_sft_trainer.py"
    # Recipes
    - "!recipe"
    # Megatron
    - "!verl/workers/**/megatron_*.py"
```

#### Avoid missing errors

Some test scripts didn't end with the main Python command and could therefore miss its error. To address this issue, this PR introduces the following options:

```bash
set -xeuo pipefail
```

which mean:

- `x`: Print each command before executing it (useful for debugging)
- `e`: Exit immediately if any command fails (returns non-zero exit status)
- `u`: Treat unset variables as an error
- `o pipefail`: Return the exit status of the last command in a pipeline that failed, or zero if all succeeded

Together, these options make the script fail fast and provide verbose output, which helps with debugging and ensures the script doesn't continue after encountering errors.

#### Others

Besides, we also

1. unify runner labels into `"L20x8"` to enable preemptive scheduling of jobs
2. merge test scripts with minimal differences, grouped by entrypoint (e.g. `ppo_trainer`, `ppo_megatron_trainer`, recipes, etc.), into a base script with options
	2025-04-14 09:48:10 -07:00
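The effect of the `e`, `u` and `pipefail` options described in the commit above can be checked with a small experiment. The following is a minimal sketch, assuming `bash` is available on `PATH`; the helper `run_bash` and the variable name `SKETCH_UNSET_VAR` are made up for illustration and are not part of the verl test scripts.

```python
import subprocess


def run_bash(script: str) -> subprocess.CompletedProcess:
    """Run a snippet in a fresh non-interactive bash and capture its output."""
    return subprocess.run(["bash", "-c", script], capture_output=True, text=True)


# Without pipefail, a pipeline reports only the status of its last command,
# so the failure of `false` is hidden by the succeeding `true`.
plain = run_bash("false | true")

# With pipefail, the failing `false` determines the status of the pipeline.
strict = run_bash("set -o pipefail; false | true")

# With -e, the script stops at the first failing command, so `echo` never runs.
fail_fast = run_bash("set -e; false; echo reached")

# With -u, expanding an unset variable is an error instead of an empty string.
unset_var = run_bash('unset SKETCH_UNSET_VAR; set -u; echo "$SKETCH_UNSET_VAR"')

print("plain pipeline status:", plain.returncode)
print("pipefail pipeline status:", strict.returncode)
print("set -e status and stdout:", fail_fast.returncode, repr(fail_fast.stdout))
print("set -u failed:", unset_var.returncode != 0)
```

This is the situation the commit describes: a test script whose failing command is not the last one can still exit with status zero unless these options are set.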
Shawn/Yuxuan Tong	866e9808d4	[CI] feat: unify CI label to enable preemptive schedule for jobs (#1072 )	2025-04-14 16:52:30 +08:00
Blue Space	f976b1853d	Update vllm 0.8.2 with megatron 0.11.0 (#1054 ) Part of #851 . Includes a minimal upgrade: 1. vllm 0.8.2 with megatron 2. part of per-tensor allgather and load weights 3. fix bugs with context parallel caused by the dataloader random seed; the behavior seems to have changed in torch 2.6.0	2025-04-14 09:27:35 +08:00
HL	d882b62b01	tests: add import utils tests (#1042 )	2025-04-11 18:55:54 -07:00
Chi Zhang	c9e3c57cf8	[megatron] feat: optimize entropy loss (#1007 )	2025-04-11 09:37:37 +08:00
HL	526c0908be	[ci] chore: reduce CI load (#934 )	2025-04-06 10:06:10 -07:00
Blue Space	5d0a7eaf6d	[feat] Megatron checkpoint support for current Llama and Qwen models (#687 ) # Intro Support Megatron checkpoint for Model, Optimizer States and RNG states, with a new layer of abstraction: `MegatronCheckpointManager` like FSDP. Also add checkpoint tests. # Involved Issues and PRs This solved issue #682 #605 , including PR #510 #634 #368 #330 . Thanks for the great efforts of @uygnef, @ShareLer and @caaatch22 in these contributions. # TODOs - [ ] Support Megatron dist checkpointing mechanism, now use torch.save/load to store/restore model weights. - [x] Quick: Also store hf format model. --------- Co-authored-by: caaatch22 <mr.liumingjie@gmail.com> Co-authored-by: Yu Feng <admin@fengyu.org> Co-authored-by: ShareLer <sharele@163.com>	2025-03-23 14:36:05 +08:00
Lumeng Wu	3f6d45d95b	fix: support transformers==4.50.0 (#704 ) https://github.com/volcengine/verl/issues/703	2025-03-22 13:54:34 +08:00
Chi Zhang	0cc2bdada0	[misc] feat: add allgather method to dataproto (#497 ) - Add allgather method to dataproto - Add tests - Replace existing raw allgather with this function	2025-03-06 22:05:51 +08:00
Chi Zhang	c15c6447ca	[ci] feat: add ci timeout (#487 ) Set timeout in CI to avoid infinite hang. close #468	2025-03-06 08:52:05 +08:00
zhou fan	d36422be5c	feat: add support for ulysses sequence parallel for transformers >= 4.48 (#357 )

close #312

Add support for ulysses sp for transformers >= 4.48.

I've tested transformers 4.45.0, 4.46.0, 4.47.0, 4.48.0 and 4.49.0, using sp=2, with the following script in my local env:

```bash
#!/bin/bash
set -ex

VERSIONS=("4.45.0" "4.46.0" "4.47.0" "4.48.0" "4.49.0")

for version in "${VERSIONS[@]}"; do
    echo "Testing with Transformers version ${version}"
    echo "----------------------------------------"
    pip install "transformers==${version}"
    PYTHONPATH=./ torchrun --nproc_per_node=2 tests/model/test_transformers_ulysses.py
    echo "----------------------------------------"
    echo "Completed testing for version ${version}"
    echo ""
done
```
	2025-02-24 18:54:39 +08:00
HL	0a1b16f800	distro: bump up version to v0.2.0.dev, limit vllm version (#327 )	2025-02-20 15:21:43 +08:00
Willem Jiang	dd09d47fe2	Added content permissions of the workflow (#303 ) We need to specify the minimum permission in the workflow.	2025-02-19 10:23:22 +08:00
Guangming Sheng	27484a7bbb	[misc] feat: add ckpt manager in utils (#216 ) - Support FSDPCheckpointManager - Support hdfs_io import if installed - Add CI for FSDPCheckpointManager TODO: - Will integrate in the next PR	2025-02-07 09:09:03 +08:00
Guangming Sheng	695bdbb030	[misc] fix: gradient accumulation in seq balance and modify default vllm log level (#141 ) - Previous gradient accumulation value is computed by micro_batch_size, which is wrong when using dynamic_bsz - Fix ci script to avoid overlooking this issue - Change vLLM state log default value to True to disable log. - We will check the `self.config.actor.ppo_mini_batch_size % self.config.actor.ppo_micro_batch_size_per_gpu == 0` after normalization in fsdp_workers instead of in dp_actor and dp_critic.	2025-01-27 21:44:25 +08:00
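The gradient accumulation issue described in the commit above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not verl's implementation: `pack_by_token_budget` is a hypothetical greedy packer standing in for the real sequence-length balancing code, and the sequence lengths and token budget are invented numbers.

```python
def pack_by_token_budget(seq_lens: list[int], max_token_len: int) -> list[list[int]]:
    """Greedily pack sequence indices into micro-batches under a token budget (illustrative only)."""
    micro_batches: list[list[int]] = []
    current: list[int] = []
    current_tokens = 0
    for idx, n_tokens in enumerate(seq_lens):
        # Start a new micro-batch when adding this sequence would exceed the budget.
        if current and current_tokens + n_tokens > max_token_len:
            micro_batches.append(current)
            current, current_tokens = [], 0
        current.append(idx)
        current_tokens += n_tokens
    if current:
        micro_batches.append(current)
    return micro_batches


seq_lens = [120, 80, 200, 50, 50, 300, 30, 170]  # one mini-batch of 8 sequences
mini_batch_size = len(seq_lens)
micro_batch_size = 2

# Fixed-size micro-batches: the number of accumulation steps follows from the two sizes.
fixed_accum_steps = mini_batch_size // micro_batch_size

# Dynamic batching: the number of micro-batches depends on how the sequences pack,
# so it has to be taken from the packing result rather than from micro_batch_size.
micro_batches = pack_by_token_budget(seq_lens, max_token_len=400)
dynamic_accum_steps = len(micro_batches)

print(fixed_accum_steps, dynamic_accum_steps, micro_batches)
```

With these numbers the two counts differ, so scaling the loss by the fixed-size count would mis-scale the accumulated gradient whenever dynamic batching is enabled.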
Guangming Sheng	ff0c7ccd41	[ci] fix: add force stop in ray e2e ci to clean env (#112 ) - As titled	2025-01-17 21:41:50 +08:00
Guangming Sheng	1facb9d2fb	[misc] feat: support different flash_attn versions with variable num returns (#100 ) * add ci * fix reward model and write more ci script * support different flash_attn version with variable num returns * update transformers rmpad workflow * balance workload * lint * lint	2025-01-13 16:38:51 +08:00
Guangming Sheng	569210e06c	[misc] feat: support rmpad/data-packing in FSDP with transformers (#91 ) * init commit of rmpad * add rmpad test * support rmpad in actor model * add test for value model * support rmpad in critic and rm * fix actor return and fix num_labels and clean not used rmpad * fix critic and benchmark * update script * fix critic * lint * fix util issue * fix unnecessary unpad * address issues * fix args * update test and update rmpad support model list * fix typo * fix typo and fix name * rename rmpad to rename padding * fix arch to model_type * add ci for e2e rmpad and fix typo * lint * fix ci * fix typo * update tests for customize tokenizer in actor * fix rmpad test * update requirement of transformers as hf_rollout may have issue	2025-01-11 16:50:15 +08:00

49 Commits