Files
verl/docs/README_vllm0.8.md
H 52065c6405 [BREAKING][rollout] refactor: drop vllm v0.5.4 and v0.6.3 support (#2257)
### What does this PR do?

This PR removes support for vLLM versions 0.5.4 and 0.6.3 from the verl
repository, completing a comprehensive cleanup of legacy
version-specific code branches. The changes simplify the codebase by
eliminating conditional logic and version-specific implementations,
requiring users to upgrade to vLLM 0.7.0 or later (recommended: vLLM
0.8.3+).

**Key Changes:**
- Deleted legacy rollout implementations (`fire_vllm_rollout.py`,
`vllm_rollout.py`, `test_vllm_hf_loader.py`)
- Removed version-specific directories (`vllm_v_0_5_4`, `vllm_v_0_6_3`) 
- Simplified sharding managers by removing `customized_vllm` flag
conditionals
- Updated configuration files to remove deprecated options
(`use_fire_sampling`)
- Cleaned up documentation and environment variable exports

### Checklist Before Starting

- [x] Search for similar PRs: No similar PRs found for this specific
cleanup
- [x] Format the PR title as `[BREAKING][vllm, rollout, worker]
refactor: Remove vLLM 0.5.4 and 0.6.3 support`
  - Modules: `vllm`, `rollout`, `worker` (primary affected components)
  - Type: `refactor` (code cleanup and simplification)
  - Breaking: Yes, requires vLLM version upgrade

### Test

This PR has been validated through:
- **CI Pipeline**: All existing tests pass with vLLM 0.7.0+ (27 checks
pending/running)
- **Version Detection**: New version check logic properly rejects vLLM
0.5.4/0.6.3 with clear error messages
- **Merge Conflict Resolution**: Successfully resolved complex conflicts
during main branch merge
- **Pre-commit Checks**: All linting and formatting requirements
satisfied

### API and Usage Example

**Breaking Changes:**
- **vLLM Version Requirement**: Minimum supported version is now 0.7.0
(recommended: 0.8.3+)
- **Removed Configuration Options**: `use_fire_sampling` no longer
available in config files
- **Environment Variables**: `VLLM_ATTENTION_BACKEND=XFORMERS` exports
removed (not needed for vLLM 0.7.0+)

**Migration Guide:**
```bash
# Before: vLLM 0.5.4/0.6.3 with custom flags
pip install vllm==0.6.3
export VLLM_ATTENTION_BACKEND=XFORMERS

# After: vLLM 0.8.3+ with V1 API
pip install vllm>=0.8.3
export VLLM_USE_V1=1  # Recommended for optimal performance
```

**Updated Configuration:**
```yaml
# generation.yaml - removed use_fire_sampling option
rollout:
  name: vllm_rollout
  # use_fire_sampling: False  # <- REMOVED
  
# Use standard vLLM rollout without legacy options
```

### High-Level Design

```mermaid
graph TB
    subgraph "Before: Multi-Version Support"
        A1[vLLM Version Check] --> B1{Version 0.5.4?}
        A1 --> B2{Version 0.6.3?}
        A1 --> B3{Version 0.7.0+?}
        B1 --> C1[Legacy vllm_v_0_5_4 Code]
        B2 --> C2[Legacy vllm_v_0_6_3 Code]
        B3 --> C3[Modern vLLM Code]
    end
    
    subgraph "After: Simplified Support"
        A2[vLLM Version Check] --> B4{Version >= 0.7.0?}
        B4 -->|Yes| C4[Modern vLLM Code Only]
        B4 -->|No| C5[Clear Error Message]
    end
```

### Specific Changes

**Deleted Files:**
- `verl/workers/rollout/vllm_rollout/fire_vllm_rollout.py`
- `verl/workers/rollout/vllm_rollout/vllm_rollout.py` 
- `tests/workers/rollout/rollout_vllm/test_vllm_hf_loader.py`
- `verl/third_party/vllm/vllm_v_0_5_4/` (entire directory)
- `verl/third_party/vllm/vllm_v_0_6_3/` (entire directory)
- `pytest.ini`

**Modified Core Files:**
- `verl/third_party/vllm/__init__.py`: Simplified version detection with
clear error messages
- `verl/workers/rollout/vllm_rollout/vllm_rollout_spmd.py`: Removed
cache engine management and version conditionals
- `verl/workers/sharding_manager/fsdp_vllm.py`: Dropped
`customized_vllm` flag logic
- `verl/workers/sharding_manager/megatron_vllm.py`: Simplified weight
loading and cache management

**Configuration Updates:**
- `verl/trainer/config/generation.yaml`: Removed `use_fire_sampling`
option
- `verl/trainer/config/ppo_trainer.yaml`: Removed `use_fire_sampling`
option
- `tests/special_sanity/check_api_docs.py`: Removed `LLMEngine` from
whitelist

**Documentation Updates:**
- `docs/start/install.rst`: Updated to recommend vLLM 0.8.3+ with
`VLLM_USE_V1=1`
- `docs/perf/perf_tuning.rst`: Updated performance recommendations
- Removed 42+ `VLLM_ATTENTION_BACKEND=XFORMERS` exports from bash
scripts

**Reverted Changes:**
- `.github/workflows/vllm.yml`: Restored original container image names
- `docs/faq/faq.rst`: Restored original apptainer commands
- `docs/ascend_tutorial/ascend_quick_start.rst`: Reverted all
modifications
- `examples/tuning/*/`: Restored original `nproc_per_gpu` settings

### Checklist Before Submitting

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide)
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting):
`pre-commit run --all-files --show-diff-on-failure --color=always`
- [x] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs):
Updated install and performance tuning docs
- [x] Add unit or end-to-end test(s): Existing CI tests validate the
changes; legacy-specific tests were removed as intended
- [x] **CI Request**: Once PR is ready, message will be sent to
`ci-request` channel in verl Slack workspace

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2025-06-29 19:27:22 -07:00

1.5 KiB

Upgrading to vLLM >= 0.8

Last updated: 05/04/2025.

Installation

Note: This version of verl+vLLM 0.8+ supports FSDP for training and vLLM for rollout.

# Create the conda environment
conda create -n verl python==3.10
conda activate verl

# Install verl
git clone https://github.com/volcengine/verl.git
cd verl
pip3 install -e .

# Install the latest stable version of vLLM
pip3 install vllm==0.8.3

# Install flash-attn
pip3 install flash-attn --no-build-isolation

We have a pre-built docker image for verl+vLLM 0.8.3. You can direct import it with the following command:

docker pull hiyouga/verl:ngc-th2.6.0-cu126-vllm0.8.3-flashinfer0.2.2-cxx11abi0

Features

vLLM 0.8+ supports cuda graph and V1 engine by default in verl. To enable these features, remember to add the following lines to the bash script:

actor_rollout_ref.rollout.enforce_eager=False \
actor_rollout_ref.rollout.free_cache_engine=True \

and also remove the environment variable if it exists:

Notes

When you just directly upgrade vllm>=0.8, some dependency packages may undergo version changes. If you encounter the following problems:

in <module> from torch.multiprocessing.reductions import ForkingPickler ImportError: cannot import name 'ForkingPickler' from 'torch.multiprocessing.reductions' (/opt/conda/lib/python3.11/site-packages/torch/multiprocessing/reductions.py)

You need to upgrade tensordict to version 0.6.2 using the command pip install tensordict==0.6.2.