Welcome to verl's documentation!
================================================

.. _hf_arxiv: https://arxiv.org/pdf/2409.19256

verl is a flexible, efficient and production-ready RL training framework designed for post-training of large language models (LLMs). It is an open-source implementation of the `HybridFlow <hf_arxiv_>`_ paper.

verl is flexible and easy to use with:

- **Easy extension of diverse RL algorithms**: The hybrid programming model combines the strengths of single-controller and multi-controller paradigms to enable flexible representation and efficient execution of complex post-training dataflows, allowing users to build RL dataflows in a few lines of code.

- **Seamless integration of existing LLM infra with modular APIs**: Decouples computation and data dependencies, enabling seamless integration with existing LLM frameworks, such as PyTorch FSDP, Megatron-LM and vLLM. Moreover, users can easily extend to other LLM training and inference frameworks.

- **Flexible device mapping and parallelism**: Supports various placements of models onto different sets of GPUs for efficient resource utilization and scalability across different cluster sizes.

- Ready integration with popular HuggingFace models


verl is fast with:

- **State-of-the-art throughput**: By seamlessly integrating existing SOTA LLM training and inference frameworks, verl achieves high generation and training throughput.

- **Efficient actor model resharding with 3D-HybridEngine**: Eliminates memory redundancy and significantly reduces communication overhead during transitions between training and generation phases.

--------------------------------------------

.. _Contents:

.. toctree::
   :maxdepth: 5
   :caption: Quickstart

   start/install
   start/quickstart
   start/multinode

.. toctree::
   :maxdepth: 4
   :caption: Programming guide

   hybrid_flow

.. toctree::
   :maxdepth: 5
   :caption: Data Preparation

   preparation/prepare_data
   preparation/reward_function

.. toctree::
   :maxdepth: 5
   :caption: Configurations

   examples/config

.. toctree::
   :maxdepth: 2
   :caption: PPO Example

   examples/ppo_code_architecture
   examples/gsm8k_example

.. toctree::
   :maxdepth: 1
   :caption: PPO Trainer and Workers

   workers/ray_trainer
   workers/fsdp_workers
   workers/megatron_workers
   workers/sglang_worker

.. toctree::
   :maxdepth: 1
   :caption: Performance Tuning Guide

   perf/perf_tuning
   README_vllm0.8.md
   perf/device_tuning

.. toctree::
   :maxdepth: 1
   :caption: Experimental Results

   experiment/ppo

.. toctree::
   :maxdepth: 1
   :caption: Advanced Usage and Extension

   advance/placement
   advance/dpo_extension
   advance/fsdp_extension
   advance/megatron_extension
   advance/checkpoint

.. toctree::
   :maxdepth: 1
   :caption: API References

   data.rst


.. toctree::
   :maxdepth: 1
   :caption: FAQ

   faq/faq

Contribution
-------------

verl is free software; you can redistribute it and/or modify it under the terms
of the Apache License 2.0. We welcome contributions.
Join us on `GitHub <https://github.com/volcengine/verl>`_, `Slack <https://join.slack.com/t/verlgroup/shared_invite/zt-2w5p9o4c3-yy0x2Q56s_VlGLsJ93A6vA>`_ and `WeChat <https://raw.githubusercontent.com/eric-haibin-lin/verl-community/refs/heads/main/WeChat.JPG>`_ for discussions.

Contributions from the community are welcome! Please check out our `project roadmap <https://github.com/volcengine/verl/issues/710>`_ and `good first issues <https://github.com/volcengine/verl/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22>`_ to see where you can contribute.

Code Linting and Formatting
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. warning::

   We are migrating to ``ruff`` as the linter and formatter and ``pre-commit`` as the managing tool (see `PR #1010 <https://github.com/volcengine/verl/pull/1010>`_).

   If your branch is based on a previous commit that used ``yapf`` and ``pylint``, simply merging might trigger an overwhelming number of linting errors, while **you are only expected to resolve those in the files related to your PR**.

   To work around this, try the following steps so that the PR includes only the files you **really changed**:

   1. In your branch, fix linting and formatting with ``ruff``: ``ruff check --fix && ruff format``
   2. Squash into a new single commit: ``git reset --soft $(git merge-base main HEAD) && git add -A && git commit -m "feat: ..."``
   3. Merge with the latest main: ``git merge origin/main``
   4. Force push to your branch: ``git push --force``

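
The squash step of the workaround above can be wrapped in a small shell function. This is only a sketch: the function name ``squash_onto_base`` and its two parameters are hypothetical and not part of verl, and the other steps are left as comments so that nothing is merged or pushed by accident.

.. code-block:: bash

   # Hypothetical helper (not part of verl): collapse every commit made since
   # the current branch forked from BASE into one commit with message MSG.
   squash_onto_base() {
       local base="$1" msg="$2"
       local fork_point
       fork_point="$(git merge-base "$base" HEAD)" || return 1
       git reset --soft "$fork_point" &&
           git add -A &&
           git commit -q -m "$msg"
   }

   # Typical sequence, run from your feature branch:
   #   ruff check --fix && ruff format        # step 1
   #   squash_onto_base main "feat: ..."      # step 2
   #   git merge origin/main                  # step 3
   #   git push --force                       # step 4

Because ``git reset --soft`` keeps the working tree and the index untouched, the new commit contains exactly the changes your branch made relative to its fork point.
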
We use pre-commit to help improve code quality. To initialize pre-commit, run:

.. code-block:: bash

   pip install pre-commit
   pre-commit install

You can also run pre-commit manually:

.. code-block:: bash

   pre-commit run

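
By default, ``pre-commit run`` only checks the files that are currently staged. To check everything your branch touched, you can pass an explicit file list with ``--files``. The sketch below assumes a base branch named ``main``; the helper name ``changed_files`` is hypothetical and not part of verl or pre-commit.

.. code-block:: bash

   # Hypothetical helper: list files added, copied, modified or renamed on the
   # current branch since it forked from BASE. Deleted files are skipped,
   # since there is nothing left to lint in them.
   changed_files() {
       local base="${1:-main}"
       git diff --name-only --diff-filter=ACMR "$(git merge-base "$base" HEAD)" HEAD
   }

   # Usage (assumes file names without spaces):
   #   pre-commit run --files $(changed_files main)
   #   pre-commit run --all-files    # check the whole repository instead

Running against only the changed files keeps the output focused on the files related to your PR.
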
Adding CI tests
^^^^^^^^^^^^^^^^^^^^^^^^

If possible, please add CI test(s) for your new feature:

1. Find the most relevant workflow yml file, which usually corresponds to a ``hydra`` default config (e.g. ``ppo_trainer``, ``ppo_megatron_trainer``, ``sft_trainer``, etc.).
2. Add related path patterns to the ``paths`` section if not already included.
3. Minimize the workload of the test script(s) (see existing scripts for examples).

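
For the first step, a quick text search is often enough to locate candidate workflow files. The sketch below assumes the workflows live in ``.github/workflows`` (the standard GitHub Actions location); the helper name ``find_workflows`` is hypothetical and not part of verl.

.. code-block:: bash

   # Hypothetical helper: print the workflow files under DIR whose contents
   # mention NAME (for example a hydra config name such as ppo_trainer).
   find_workflows() {
       local name="$1" dir="${2:-.github/workflows}"
       find "$dir" -type f \( -name '*.yml' -o -name '*.yaml' \) \
           -exec grep -l -F -- "$name" {} + | sort
   }

   # Usage:
   #   find_workflows ppo_megatron_trainer

Once you have found the file, check whether its ``paths`` section already covers the files your feature changes.
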
We are HIRING! Send us an `email <mailto:haibin.lin@bytedance.com>`_ if you are interested in internship/FTE opportunities in MLSys/LLM reasoning/multimodal alignment.