verl/requirements_sglang.txt
Qunhong Zeng 6974bbaeea [dataset] refactor: use hf Dataset instead of pandas DataFrame in RLHFDataset for speedup (#890)
HF Dataset provides better memory management and can handle larger
datasets. It also supports multi-process acceleration during map/filter
operations (while pandas requires version >2.0).

Now we can enable `filter_overlong_prompts` on large-scale datasets
by setting `filter_overlong_prompts_workers` to an appropriate number.

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-03 21:51:53 -07:00

22 lines
321 B
Plaintext

# requirements.txt records the full set of dependencies for development
accelerate
codetiming
datasets
dill
flash-attn
hydra-core
numpy
pandas
datasets
peft
pyarrow>=15.0.0
pybind11
pylatexenc
ray[default]>=2.10
tensordict<=0.6.2
torchdata
torchvision
transformers
wandb
sglang[all]==0.4.4.post3
torch-memory-saver>=0.0.5