[rollout, sglang] feat: Add sync mode for bash (#3186)

### What does this PR do?
- Use `sync` mode for `dapo`, `gsm8k` and `geo`
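
As a rough illustration of what the scripts now pass to the trainer, here is a minimal sketch. `rollout_overrides` is a hypothetical helper written for this example and is not part of verl; the two override keys and the values `sync` and `0.1` are taken from the diffs in this commit. The last line only prints the command (a dry run) rather than launching training.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of verl): prints the rollout overrides
# touched by this change, one per line.
rollout_overrides() {
    local mode="${1:-sync}"        # rollout mode; this change sets it to "sync"
    local over_sample_rate="${2:-}" # optional; only some scripts set it
    printf '%s\n' "actor_rollout_ref.rollout.mode=${mode}"
    if [ -n "${over_sample_rate}" ]; then
        printf '%s\n' "actor_rollout_ref.rollout.over_sample_rate=${over_sample_rate}"
    fi
}

# Dry run: print the command instead of launching training.
# The substitution is left unquoted on purpose so that each override
# becomes its own argument (the overrides contain no spaces).
echo python3 -m verl.trainer.main_ppo $(rollout_overrides sync 0.1)
```

Calling `rollout_overrides` with no arguments emits only the `mode=sync` override, which mirrors the script that adds the mode without changing the over-sample rate.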

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review,
otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting):
`pre-commit install && pre-commit run --all-files --show-diff-on-failure
--color=always`
Huapeng Zhou
2025-08-24 23:43:11 -04:00
committed by GitHub
parent 2c7a9c5708
commit 7ff2386987
3 changed files with 5 additions and 1 deletion

@@ -40,6 +40,7 @@ python3 -m verl.trainer.main_ppo \
 actor_rollout_ref.rollout.n=5 \
 actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=20 \
 actor_rollout_ref.ref.fsdp_config.param_offload=True \
+actor_rollout_ref.rollout.mode=sync \
 algorithm.use_kl_in_reward=False \
 trainer.critic_warmup=0 \
 trainer.logger='["console","wandb"]' \

@@ -48,7 +48,8 @@ python3 -m verl.trainer.main_ppo \
 actor_rollout_ref.rollout.n=16 \
 actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=32 \
 actor_rollout_ref.ref.fsdp_config.param_offload=True \
-actor_rollout_ref.rollout.over_sample_rate=0 \
+actor_rollout_ref.rollout.over_sample_rate=0.1 \
+actor_rollout_ref.rollout.mode=sync \
 algorithm.use_kl_in_reward=False \
 trainer.critic_warmup=0 \
 trainer.logger='["console","wandb"]' \

@@ -35,6 +35,8 @@ python3 -m verl.trainer.main_ppo \
 actor_rollout_ref.rollout.name=sglang \
 actor_rollout_ref.rollout.gpu_memory_utilization=0.5 \
 actor_rollout_ref.rollout.n=16 \
+actor_rollout_ref.rollout.over_sample_rate=0.1 \
+actor_rollout_ref.rollout.mode=sync \
 actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=32 \
 actor_rollout_ref.ref.fsdp_config.param_offload=True \
 algorithm.use_kl_in_reward=False \