mirror of
https://github.com/volcengine/verl.git
synced 2025-11-12 01:04:44 +08:00
This PR lets users pass arbitrary vLLM/SGLang engine arguments and uses them to speed up Qwen3 rollout.

1. Deprecate the default values of the previous `engine_kwargs`.
2. Pass all `engine_kwargs` through to the vLLM/SGLang engine.
3. Optimize Qwen3-235B rollout speed by setting TP=8 and enabling expert parallelism (EP).

From top to bottom: TP=16 without EP, TP=8 without EP, and TP=8 with EP.

<img width="1000" height="808" alt="image" src="https://github.com/user-attachments/assets/6b096be4-3896-4e96-8916-d8d6e13a58cc" />

PS: DeepSeek-V3's rollout slows down after enabling expert parallelism.
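
To illustrate the pass-through idea, here is a minimal sketch of how user-supplied engine kwargs could be overlaid on a set of base engine arguments. It is not the code from this PR: `merge_engine_kwargs`, `base`, and `user` are hypothetical names, and skipping `None` values is an assumed convention for "not set". The keys `tensor_parallel_size` and `enable_expert_parallel` follow vLLM's engine argument names; `dtype` and `swap_space` are included only as example entries.

```python
from typing import Any, Optional


def merge_engine_kwargs(
    base_args: dict[str, Any],
    engine_kwargs: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Overlay every non-None user-supplied engine kwarg on top of base_args.

    Hypothetical helper: the real PR may merge arguments differently.
    """
    merged = dict(base_args)  # copy so the caller's dict is not mutated
    for key, value in (engine_kwargs or {}).items():
        if value is None:
            # Assumption: None means "not set", so the base value is kept.
            continue
        merged[key] = value
    return merged


# Base arguments a rollout worker might start from (illustrative values).
base = {"tensor_parallel_size": 16, "dtype": "bfloat16"}

# User overrides mirroring the Qwen3-235B setting described above:
# TP=8 with expert parallelism enabled.
user = {
    "tensor_parallel_size": 8,
    "enable_expert_parallel": True,
    "swap_space": None,  # unset, so it is dropped
}

merged = merge_engine_kwargs(base, user)
print(merged)
```

Forwarding every key rather than an allow-list means new engine options need no change on the caller's side, at the cost of pushing validation down to the engine itself.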