Files
verl/docs/examples/multi_modal_example.rst
7559a6a938 [doc] fix: add time info for each doc, assert sphinx warning in CI (#2255)
### What does this PR do?

Add time info for each doc, and assert on Sphinx warnings in CI.
The time info helps the community identify docs that may be outdated
before they are actually removed or updated.

### Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
    - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`


### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting):
`pre-commit install && pre-commit run --all-files --show-diff-on-failure
--color=always`
- [x] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add unit or end-to-end test(s) to [the CI
workflow](https://github.com/volcengine/verl/tree/main/.github/workflows)
to cover all the code. If not feasible, explain why: ...
- [x] Once your PR is ready for CI, send a message in [the `ci-request`
channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the
`verl` Slack
workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2025-06-29 11:58:35 +08:00


Multi-Modal Example Architecture
=================================

Last updated: 04/28/2025.

Introduction
------------

verl now supports multi-modal training. You can use FSDP together with
vLLM or SGLang to start a multi-modal RL task; Megatron support is also
on the way.

Follow the steps below to quickly start a multi-modal RL task.


Step 1: Prepare dataset
-----------------------

.. code:: bash

   # the processed dataset will be saved in the $HOME/data/geo3k folder
   python examples/data_preprocess/geo3k.py
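
If you want to sanity-check the preprocessing output before training, a quick
look at the generated files can help. The snippet below is an optional,
illustrative check: it assumes the script writes ``train.parquet`` and
``test.parquet`` under ``$HOME/data/geo3k`` (file names may differ) and that
``pandas`` with parquet support is installed.

.. code:: bash

   # optional sanity check (assumes train.parquet is produced in $HOME/data/geo3k)
   ls -lh $HOME/data/geo3k/
   python3 -c "import pandas as pd; df = pd.read_parquet('$HOME/data/geo3k/train.parquet'); print(len(df), 'rows'); print(df.columns.tolist())"
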

Step 2: Download Model
----------------------

.. code:: bash

   # download the model from huggingface
   python3 -c "import transformers; transformers.pipeline(model='Qwen/Qwen2.5-VL-7B-Instruct')"
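
The ``transformers.pipeline`` call above downloads the checkpoint into the
local Hugging Face cache as a side effect. If you prefer an explicit download,
the standard Hugging Face CLI shown below is an alternative (it is not required
by verl; any method that places the model in the local cache works).

.. code:: bash

   # alternative: explicit download into the local Hugging Face cache
   huggingface-cli download Qwen/Qwen2.5-VL-7B-Instruct
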

Step 3: Perform GRPO training with multi-modal model on Geo3K Dataset
----------------------------------------------------------------------

.. code:: bash

   # run the task
   bash examples/grpo_trainer/run_qwen2_5_vl-7b.sh
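
The script wraps verl's Hydra-style trainer entrypoint. The sketch below only
illustrates the kind of command-line overrides such a script typically passes
(dataset paths, model path, rollout backend, GRPO advantage estimator); it is
not a copy of ``run_qwen2_5_vl-7b.sh``, so take the exact keys and values from
the script itself.

.. code:: bash

   # illustrative sketch only -- see run_qwen2_5_vl-7b.sh for the actual settings
   python3 -m verl.trainer.main_ppo \
       algorithm.adv_estimator=grpo \
       data.train_files=$HOME/data/geo3k/train.parquet \
       data.val_files=$HOME/data/geo3k/test.parquet \
       actor_rollout_ref.model.path=Qwen/Qwen2.5-VL-7B-Instruct \
       actor_rollout_ref.rollout.name=vllm \
       trainer.n_gpus_per_node=8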