fix: 2 typos (#435)

Lumeng Wu
2025-03-01 22:18:16 +08:00
committed by GitHub
parent cef4c2dea5
commit 99fb2dde77

@@ -15,7 +15,7 @@ We use dataflow to represent RL systems. [4]_.
DataFlow
~~~~~~~~~~~~~~~~~~~~
-Dataflow is an abstraction of computations. Neural Netowork training is a typical dataflow. It can be represented by computational graph.
+Dataflow is an abstraction of computations. Neural Network training is a typical dataflow. It can be represented by computational graph.
.. image:: https://github.com/eric-haibin-lin/verl-community/blob/main/docs/dataflow.jpeg?raw=true
:alt: The dataflow graph from CS231n 2024 lecture 4
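
As a small illustration of the computational-graph idea this hunk refers to (an addition here, not part of the verl docs), PyTorch's autograd records exactly such a dataflow graph during the forward pass and traverses it in reverse to compute gradients:

.. code-block:: python

   # Illustrative only: autograd builds a computational (dataflow) graph
   # for the forward pass, then backprop flows gradients through it.
   import torch

   x = torch.randn(4, 3, requires_grad=True)
   w = torch.randn(3, 2, requires_grad=True)
   loss = (x @ w).relu().sum()   # forward pass records the graph
   loss.backward()               # backward pass traverses it in reverse
   print(w.grad.shape)           # torch.Size([3, 2])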
@@ -94,7 +94,7 @@ Code: https://github.com/volcengine/verl/blob/main/verl/trainer/main_ppo.py
In this file, we define a remote function `main_task` that serves as the controller (driver) process as shown in the above figure. We also define a ``RewardManager``, where users can customize their reward function based on the data source in the dataset. Note that `RewardManager` should return the final token-level reward that is optimized by RL algorithms. Note that users can combine model-based rewards and rule-based rewards.
The ``main_task`` constructs a RayPPOTrainer instance and launch the fit. Note that ``main_task`` **runs as a single process**.
-We highly recommend that the ``main_task`` is NOT schduled on the head of the ray cluster because ``main_task`` will consume a lot of memory but the head usually contains very few resources.
+We highly recommend that the ``main_task`` is NOT scheduled on the head of the ray cluster because ``main_task`` will consume a lot of memory but the head usually contains very few resources.
Ray trainer
~~~~~~~~~~~~~~~~~~~~
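
The second hunk describes the controller pattern in ``main_ppo.py``: a remote ``main_task`` driver that builds a ``RewardManager`` and a ``RayPPOTrainer``, then launches ``fit`` as a single process. Below is a minimal, self-contained sketch of that pattern; ``SimpleRewardManager``, ``ToyPPOTrainer``, and every signature here are illustrative assumptions, not verl's actual API.

.. code-block:: python

   # Sketch of the single-controller pattern; all names and signatures are stand-ins.
   import ray
   import torch


   class SimpleRewardManager:
       """Rule-based stub: returns one token-level reward tensor per response,
       with the scalar score placed on the final response token."""

       def __call__(self, response_texts, response_lengths, data_sources):
           rewards = []
           for text, length, source in zip(response_texts, response_lengths, data_sources):
               r = torch.zeros(length)
               if length > 0 and source == "gsm8k":   # the rule can differ per data source
                   r[-1] = 1.0 if "####" in text else 0.0
               rewards.append(r)
           return rewards


   class ToyPPOTrainer:
       """Stand-in for a Ray-based PPO trainer; ``fit`` would run the RL loop."""

       def __init__(self, config, reward_fn):
           self.config = config
           self.reward_fn = reward_fn

       def fit(self):
           print("training with config:", self.config)


   @ray.remote(num_cpus=1)
   def main_task(config):
       # Runs as a single driver process; keep it off the Ray head node,
       # which usually has few spare resources.
       trainer = ToyPPOTrainer(config=config, reward_fn=SimpleRewardManager())
       trainer.fit()


   if __name__ == "__main__":
       ray.init()
       ray.get(main_task.remote({"total_epochs": 1}))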