9 Commits

Author SHA1 Message Date
H
01624c6da7 [doc] fix: colocation documentation updates (#2465)
### What does this PR do?

Update docs and awesome work.
2025-07-11 08:11:17 +08:00
H
52065c6405 [BREAKING][rollout] refactor: drop vllm v0.5.4 and v0.6.3 support (#2257)
### What does this PR do?

This PR removes support for vLLM versions 0.5.4 and 0.6.3 from the verl
repository, completing a comprehensive cleanup of legacy
version-specific code branches. The changes simplify the codebase by
eliminating conditional logic and version-specific implementations,
requiring users to upgrade to vLLM 0.7.0 or later (recommended: vLLM
0.8.3+).

**Key Changes:**
- Deleted legacy rollout implementations (`fire_vllm_rollout.py`,
`vllm_rollout.py`, `test_vllm_hf_loader.py`)
- Removed version-specific directories (`vllm_v_0_5_4`, `vllm_v_0_6_3`) 
- Simplified sharding managers by removing `customized_vllm` flag
conditionals
- Updated configuration files to remove deprecated options
(`use_fire_sampling`)
- Cleaned up documentation and environment variable exports

### Checklist Before Starting

- [x] Search for similar PRs: No similar PRs found for this specific
cleanup
- [x] Format the PR title as `[BREAKING][vllm, rollout, worker]
refactor: Remove vLLM 0.5.4 and 0.6.3 support`
  - Modules: `vllm`, `rollout`, `worker` (primary affected components)
  - Type: `refactor` (code cleanup and simplification)
  - Breaking: Yes, requires vLLM version upgrade

### Test

This PR has been validated through:
- **CI Pipeline**: All existing tests pass with vLLM 0.7.0+ (27 checks
pending/running)
- **Version Detection**: New version check logic properly rejects vLLM
0.5.4/0.6.3 with clear error messages
- **Merge Conflict Resolution**: Successfully resolved complex conflicts
during main branch merge
- **Pre-commit Checks**: All linting and formatting requirements
satisfied

### API and Usage Example

**Breaking Changes:**
- **vLLM Version Requirement**: Minimum supported version is now 0.7.0
(recommended: 0.8.3+)
- **Removed Configuration Options**: `use_fire_sampling` no longer
available in config files
- **Environment Variables**: `VLLM_ATTENTION_BACKEND=XFORMERS` exports
removed (not needed for vLLM 0.7.0+)

**Migration Guide:**
```bash
# Before: vLLM 0.5.4/0.6.3 with custom flags
pip install vllm==0.6.3
export VLLM_ATTENTION_BACKEND=XFORMERS

# After: vLLM 0.8.3+ with V1 API
pip install vllm>=0.8.3
export VLLM_USE_V1=1  # Recommended for optimal performance
```

**Updated Configuration:**
```yaml
# generation.yaml - removed use_fire_sampling option
rollout:
  name: vllm_rollout
  # use_fire_sampling: False  # <- REMOVED
  
# Use standard vLLM rollout without legacy options
```

### High-Level Design

```mermaid
graph TB
    subgraph "Before: Multi-Version Support"
        A1[vLLM Version Check] --> B1{Version 0.5.4?}
        A1 --> B2{Version 0.6.3?}
        A1 --> B3{Version 0.7.0+?}
        B1 --> C1[Legacy vllm_v_0_5_4 Code]
        B2 --> C2[Legacy vllm_v_0_6_3 Code]
        B3 --> C3[Modern vLLM Code]
    end
    
    subgraph "After: Simplified Support"
        A2[vLLM Version Check] --> B4{Version >= 0.7.0?}
        B4 -->|Yes| C4[Modern vLLM Code Only]
        B4 -->|No| C5[Clear Error Message]
    end
```

### Specific Changes

**Deleted Files:**
- `verl/workers/rollout/vllm_rollout/fire_vllm_rollout.py`
- `verl/workers/rollout/vllm_rollout/vllm_rollout.py` 
- `tests/workers/rollout/rollout_vllm/test_vllm_hf_loader.py`
- `verl/third_party/vllm/vllm_v_0_5_4/` (entire directory)
- `verl/third_party/vllm/vllm_v_0_6_3/` (entire directory)
- `pytest.ini`

**Modified Core Files:**
- `verl/third_party/vllm/__init__.py`: Simplified version detection with
clear error messages
- `verl/workers/rollout/vllm_rollout/vllm_rollout_spmd.py`: Removed
cache engine management and version conditionals
- `verl/workers/sharding_manager/fsdp_vllm.py`: Dropped
`customized_vllm` flag logic
- `verl/workers/sharding_manager/megatron_vllm.py`: Simplified weight
loading and cache management

**Configuration Updates:**
- `verl/trainer/config/generation.yaml`: Removed `use_fire_sampling`
option
- `verl/trainer/config/ppo_trainer.yaml`: Removed `use_fire_sampling`
option
- `tests/special_sanity/check_api_docs.py`: Removed `LLMEngine` from
whitelist

**Documentation Updates:**
- `docs/start/install.rst`: Updated to recommend vLLM 0.8.3+ with
`VLLM_USE_V1=1`
- `docs/perf/perf_tuning.rst`: Updated performance recommendations
- Removed 42+ `VLLM_ATTENTION_BACKEND=XFORMERS` exports from bash
scripts

**Reverted Changes:**
- `.github/workflows/vllm.yml`: Restored original container image names
- `docs/faq/faq.rst`: Restored original apptainer commands
- `docs/ascend_tutorial/ascend_quick_start.rst`: Reverted all
modifications
- `examples/tuning/*/`: Restored original `nproc_per_gpu` settings

### Checklist Before Submitting

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide)
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting):
`pre-commit run --all-files --show-diff-on-failure --color=always`
- [x] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs):
Updated install and performance tuning docs
- [x] Add unit or end-to-end test(s): Existing CI tests validate the
changes; legacy-specific tests were removed as intended
- [x] **CI Request**: Once PR is ready, message will be sent to
`ci-request` channel in verl Slack workspace

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2025-06-29 19:27:22 -07:00
H
7559a6a938 [doc] fix: add time info for each doc, assert sphinx warning in CI (#2255)
### What does this PR do?

add time info for each doc, assert sphinx warning in CI.
The time info is helpful for the community to identify docs that may be
too old before it's actually removed or updated.

### Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as `[{modules}] {type}: {description}` (This
will be checked by the CI)
- `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`,
`trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`,
`ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`,
`env`, `tool`, `ckpt`, `doc`, `data`
- If this PR involves multiple modules, separate them with `,` like
`[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
- If this PR breaks any API (CLI arguments, config, function signature,
etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`


### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review,
otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting):
`pre-commit install && pre-commit run --all-files --show-diff-on-failure
--color=always`
- [x] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add unit or end-to-end test(s) to [the CI
workflow](https://github.com/volcengine/verl/tree/main/.github/workflows)
to cover all the code. If not feasible, explain why: ...
- [x] Once your PR is ready for CI, send a message in [the `ci-request`
channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the
`verl` Slack
workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2025-06-29 11:58:35 +08:00
0e127b208b chore: fix typos across codebase (#1805)
Fixed typos across codebase.
2025-06-02 21:05:07 +08:00
91fa2a6b94 [docs] fix: typo (#1391) 2025-05-04 12:15:07 -07:00
HL
9b5ffe42ca docs: doc improvements via Openhands, add SimpleRL-Zoo (#764) 2025-03-26 13:02:52 +08:00
99fb2dde77 fix: 2 typos (#435) 2025-03-01 22:18:16 +08:00
610c20c7ba [doc] fix document deprecated link (#235)
- As titled
2025-02-09 21:12:05 +08:00
HL
e842b73dae docs: add programming model guide (#230) 2025-02-09 19:10:47 +08:00