Commit Graph - verl - Gitea: Git for Me

frozenleaves/verl

mirror of https://github.com/volcengine/verl.git synced 2025-10-20 13:43:50 +08:00

Commit Graph

dependabot/pip/sglang-all--0.5.3.post1

dependabot/pip/torchvision-0.23.0

main

recipe/dapo

recipe/entropy-mechanism

recipe/one_step_off_async

revert-3769-fix-async-reward

v0.2.x

v0.3.x

v0.4.1.x

v0.4.x

v0.5.x

v0.6.x

wuxibin/fix_agent_loop

wuxibin/rollout_mode

v0.1

v0.1rc

v0.2

v0.2.0.post1

v0.2.0.post2

v0.3.0.post0

v0.3.0.post1

v0.3.0.rc0

v0.4.0

v0.4.1

v0.5.0

v0.6.0

4f1c489e45 [algo] fix: remove torch.quantile-based percentile metrics to resolve tensor size limit error (#3810) main Yingru Li 2025-10-20 13:04:57 +08:00
53aed3eea1 [doc] fix: update install instruction and retool readme (#3824) OC 2025-10-20 11:43:11 +08:00
65eb019a81 [trainer] fix: Add data.seed to config (#3815) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-20 04:57:14 +03:00
8235425094 Revert "[worker] fix: create a new event loop if none exists when building rollouts" (#3820) Chi Zhang 2025-10-20 09:19:49 +08:00
1546ce23ae [rollout, vllm] fix: make LoRA with async vLLM work properly (#3821) listar2000 2025-10-19 20:18:35 -05:00
25060e9f63 Revert "[trainer] fix: address serialization issues when using async reward f…" revert-3769-fix-async-reward Chi Zhang 2025-10-19 07:40:05 +08:00
f209c6f656 [ci] fix: Install mlflow dependency (#3817) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-19 02:21:54 +03:00
4da0d3d318 [misc] fix: Sanitize MLFlow metric names (#3736) Pratik Sharma 2025-10-17 19:12:05 -07:00
5b417da543 [megatron] fix: fix logits process error when disable pack_seqs (#3777) HaochenYuan 2025-10-18 10:11:36 +08:00
f0539a5121 [trainer] fix: address serialization issues when using async reward function and ray ppo trainer (#3769) ben 2025-10-17 17:22:59 -07:00
e0e352b566 [worker] fix: create a new event loop if none exists when building rollouts (#3803) ChangyWen 2025-10-18 08:20:57 +08:00
85d5b2ee2e [doc] feat: update fully async experiment message (#3804) arron 2025-10-18 06:20:01 +08:00
b25bb7d4f3 [trainer, recipe] feat: fully async training recipe (#2981) arron 2025-10-17 22:29:18 +08:00
dd8864f9ee [megatron] feat: script of qwen3vl 235b (#3799) Yan Bai 2025-10-17 16:46:45 +08:00
ae5d8504d4 [trainer] feat: ReMax support using reward model for baseline (#3780) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-17 07:07:05 +03:00
a80ed95e70 [trainer] fix: batch size mismatch with n>1 when gen_max for ReMax (#3779) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-17 05:05:12 +03:00
9078a533c6 [vllm] fix: catch exception of vllm async engine (#3789) 杨睿 2025-10-17 09:50:34 +08:00
4abae2d77a [doc] chore: add agent loop get started tutorial (#3790) Joel 2025-10-17 08:30:10 +08:00
7e3898fef2 [recipe] fix: fix the gpt-oss-20b training script for agent loop recipe (#3793) HEJIAN SANG 2025-10-16 17:09:45 -07:00
65b8bf1bc0 [misc] fix: sft SFT E2E CI test failure due to megatron engine (#3786) Houmin Wei 2025-10-17 06:27:39 +08:00
acfcf98ed0 [doc] fix: actor_rollout_ref.critic is not correct (#3778) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-16 06:12:45 +03:00
e81e7db725 [docker] feat: update Dockerfile.rocm7 (#3781) vickytsang 2025-10-15 20:02:43 -07:00
061535208c [recipe] feat: Add example for gpt-oss training using agent loop (#3774) HEJIAN SANG 2025-10-15 01:45:11 -07:00
55f651c94d [misc] feat: bump version to 0.7.0.dev (#3772) Chi Zhang 2025-10-15 13:40:12 +08:00
ddd86f527a [misc] chore: bump version to v0.6.0 (#3773) v0.6.0 v0.6.x Chi Zhang 2025-10-15 13:19:38 +08:00
22d082f9a4 [recipe] feat: add open math reasoning (#3767) Chi Zhang 2025-10-15 12:11:41 +08:00
8ec9bf64a1 [ci] fix: fix test_engine ci (#3771) Chi Zhang 2025-10-15 12:11:17 +08:00
231d725f69 Revert "[trainer] feat: set interleave to False in dapo trainer" (#3770) Chi Zhang 2025-10-15 11:41:33 +08:00
d69164e1cb [misc] feat: bump version to 0.6.0.dev (#3768) Chi Zhang 2025-10-15 10:47:13 +08:00
2181d5b33a [recipe] fix: update readme for gmpo-trainer (#3764) Liu Yue 2025-10-15 10:24:24 +08:00
33eb86f54f [megatron] feat: support qwen3vl (#3763) Yan Bai 2025-10-15 10:19:22 +08:00
67f9a21b8e [trainer] feat: set interleave to False in dapo trainer (#3760) jiaqiw09 2025-10-14 21:13:57 +08:00
d2c51dc186 Add Meta-Bandit-LLM, a long-horizon multiturn interative awesome use case of verl (#3756) Sanxing Chen 2025-10-14 00:01:13 -04:00
16c2a21064 Add ARES and Revisual-R1 two awesome multimodal reasoning work using verl. (#3755) 凪 2025-10-14 10:51:32 +08:00
3abcc09d44 [sglang, recipe] feat: add SGLang as rollout engine for one-step-off-policy (#3531) KAMiPan 2025-10-14 10:48:29 +08:00
5d378b5f95 [rollout] refactor: rename "clip" mode back to "mask" mode (#3750) Yingru Li 2025-10-14 02:06:36 +08:00
3bee096da2 build(deps): bump sglang[all] from 0.5.2 to 0.5.3.post1 dependabot/pip/sglang-all--0.5.3.post1 dependabot[bot] 2025-10-13 17:14:16 +00:00
21271aabb9 [BREAKING][rollout, trainer, algo] feat: comprehensive rollout importance sampling implementation (#3694) Yingru Li 2025-10-13 17:05:29 +08:00
7f27789961 [fsdp,doc] refactor: rename warmup_style@FSDPOptimizerConfig -> lr_scheduler_type (#3739) yangbaoxing 2025-10-13 15:58:59 +08:00
e9ee6b39c6 [model] fix: qwen3vl models shape mismatch error with SP (#3735) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-13 08:09:10 +03:00
9d4554b931 [model] fix: qwen3vl training stuck with mixed text-image data (#3734) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-13 08:08:13 +03:00
71cf69e7ad [ci] feat: increase sft e2e time (#3738) Chi Zhang 2025-10-13 11:29:39 +08:00
7ddb9b29f0 [misc] feat: prototype deprecate DataProto and replace with Tensordict: part 3 (#3600) Houmin Wei 2025-10-13 08:18:09 +08:00
8cc9e3af67 [misc] feat: support offline generation with server mode (#3732) Chi Zhang 2025-10-12 11:00:33 +08:00
f07596c02e [misc] feat: support build DataProto from TensordDict (#3726) Huazhong 2025-10-11 17:28:18 +08:00
656f4e6705 [rollout] chore: Misc changes for extending internal compatibility (#3701) Peng Wu 2025-10-11 01:08:39 -07:00
d36d3b9cbe [rollout] feat: add default agent name for agent loop (#3716) Joel 2025-10-11 14:45:30 +08:00
e960fbaeab [rollout] feat: Add gpt-oss tool parser to enable agent loop training for gpt-oss models (#3705) HEJIAN SANG 2025-10-10 20:53:10 -07:00
d87602432c [fsdp] fix: Handle dict type for per_tensor_param in LoRA weight sync (#3712) Pouria Mistani 2025-10-10 06:58:30 -07:00
e01376663b [megatron] feat: add ascend megatron merge support (#3722) jiaqiw09 2025-10-10 21:54:27 +08:00
152ce6a1de [misc] fix: Allow HF model ID with use_shm (#3663) EduardDurech 2025-10-10 07:44:53 +02:00
2d72c52e1b [misc] fix: model reassign to inner model in vllm patch file (#3668) Changlong Yu 2025-10-09 21:13:49 -07:00
eb06fda2a9 [data] fix: merge metrics from all workers in DataProto.concat() (#3699) Yingru Li 2025-10-10 11:45:08 +08:00
7ffd413734 [megatron, model] fix: VLMs using mbridge together with fused kernels (#3700) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-10 06:05:32 +03:00
cf619d68d4 [recipe] fix: move all collabllm files into recipe directory (#3706) OC 2025-10-09 18:50:37 +08:00
23877bcc64 [worker] fix: create a new event loop if none exists (#3703) Huazhong 2025-10-09 17:11:58 +08:00
e56e3df071 [worker] refactor: Add kwargs to checkpoint related functions in BaseEngine and its subclasses (#3662) Hongpeng Guo 2025-10-08 23:56:22 -07:00
54fed7fec7 [rollout] feat: support async mode for multimodal data inference (#3702) xichengpro 2025-10-09 14:11:09 +08:00
f06ef09f1c [rollout] fix: Add LoRA datatype based on rollout model type to the LoRA config (#3675) mgilmore-relace 2025-10-08 20:48:32 -07:00
fc489dbaef [rollout] fix: add batch_data_id default value check in AsyncRolloutRequest (#3657) Pandeng Yao 2025-10-09 10:56:10 +08:00
d45d04946b [rollout,sglang] fix: get_tool_call_parser_type for gpt-oss models in sglang rollout (#3661) HEJIAN SANG 2025-10-08 19:51:37 -07:00
baf7506cff [worker] fix: support for vllm V0 deprecation version (#3687) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-09 05:44:31 +03:00
798a6f8ba0 [trainer] feat: Enabled fused adamw (#3692) Puneesh Khanna 2025-10-07 23:13:46 +04:00
ab10eb2671 [model] fix: qwen3vl patch (#3686) Yaowei Zheng 2025-10-07 03:32:53 +08:00
7904d0b672 [ci] fix: fix checkpoint converter ci (#3685) Chi Zhang 2025-10-06 19:42:47 +13:00
1216ce4599 [ci] fix: merge pre-commit-full into pre-commit (#3684) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-06 05:56:11 +03:00
42c55ac6b3 [model] feat: add qwen3vl (#3681) Yaowei Zheng 2025-10-06 10:21:19 +08:00
327e813136 [rollout] fix: qwen2_vl position_ids shape mismatch (#3653) m-Just 2025-10-05 16:03:12 +08:00
83aebcc133 [ci] fix: disable workflows with self-host machines to run on fork (#3677) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-04 12:02:41 +03:00
4e9faafc94 [model] fix: stuck issue with mixed text-image data (#3670) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-10-04 02:47:09 +03:00
f50e5c2e8f [sglang] feat: add preparation for sglang+verl (#3506) lbk-sys 2025-09-29 10:21:01 +08:00
aa19c1afc4 [recipe] feat: add multiturn scripts for vllm backend; fix progess bar in dapo (#3644) jiaqiw09 2025-09-28 20:28:25 +08:00
9e2072d120 [megatron, training_utils] fix: encoder pp is removed in mcore >= 0.14 (#3640) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-09-28 07:59:32 +03:00
39e531f29e [rollout,vllm] fix: Add LoRA Loading to Async vLLM (#3639) Kion Fallah 2025-09-27 19:13:40 -07:00
abca659ec7 [megatron, worker] fix: use extract_multi_modal_inputs method for handling multi_modal_inputs (#3641) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-09-28 05:08:51 +03:00
4ff3ce2fed [algo, perf] feat: Vectorize GRPO Advantage Estimator - 13～26x Speedup (#3635) CedricHuang 2025-09-27 17:21:08 +08:00
c03dcb0f8f [model] feat: add glm4v (#3291) Lambert 2025-09-27 04:12:14 +08:00
84d5619f99 [2/N][rollout] feat: support vllm/sglang DP+EP in server mode (#3530) Joel 2025-09-26 21:52:03 +08:00
64a9860be2 [trainer] fix: Ref to #3596. More import fix for transformers version higher than 4.55.0 (#3608) A1waysBeenHere 2025-09-26 21:37:46 +08:00
e51305883d [rollout] refactor: Update rollout and reward configs to reuse vllm/sglang replicas (#3625) Yuyang Ding 2025-09-26 17:43:45 +08:00
2234810235 [megatron] feat: add mindspeed engine and support sft (#3599) Huazhong 2025-09-26 14:39:10 +08:00
377bbb84f0 [recipe] fix: Fix a Typo in One_Step_Off_Policy and Add async of Generative Reward Model in Response Generation (#3369) Zhichao Wang 2025-09-25 22:22:00 -07:00
096ab6dc1b [CI] fix: changed the model used in the PPO test case to Qwen2.5-0.5B to avoid the huggingface download error (#3631) Huazhong 2025-09-26 13:20:40 +08:00
231e18948d [tool] feat: support load local datasets when preparing datasets (#3621) Huazhong 2025-09-26 11:42:53 +08:00
fbfdc81f9a [ci] feat: increase timeout of e2e_sft (#3630) Chi Zhang 2025-09-26 10:23:25 +08:00
6ff2b43d13 [ci] feat: upgrade sglang to 0.5.2 (#3613) Joel 2025-09-26 09:25:53 +08:00
14c397f474 [doc] feat: Adding Table-R1 to the Awesome work (#3627) FlowRays 2025-09-25 23:26:26 +08:00
21536f2b03 [ci] fix: fix sanity ci (#3626) Chi Zhang 2025-09-25 23:15:10 +08:00
515f2255ac [ci] fix: use local models/configs/datasets to increase stability (#3616) Chi Zhang 2025-09-25 22:14:56 +08:00
bf7aac2fa7 [rollout, tool] feat: export rollout rewards to total rewards (#3563) Qizhi Chen 2025-09-25 17:33:03 +08:00
616e933e29 [worker] fix: correctly determine is_vlm_model if sp > 1 (#3282) ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 2025-09-25 12:21:40 +03:00
90154aeeb6 [doc] fix: fix doc (#3614) Chi Zhang 2025-09-25 16:11:43 +08:00
7731c5c6ec [rollout] fix: remove code responsible for tool response duplication (#3604) mgilmore-relace 2025-09-25 01:10:36 -07:00
4d0999c161 [ci] chore: Use local dataset and models in e2e_ascend CI (#3601) Zhen 2025-09-25 15:14:45 +08:00
3dfa28ae32 [doc] feat: add model engine doc (#3611) Chi Zhang 2025-09-25 14:25:44 +08:00
25d78fa913 [recipe] feat: CollabLLM integration for multiturn training (#3574) Shirley Wu 2025-09-24 18:53:39 -07:00
ba8555120a [trainer] fix: Import flash attn utils for Transformers higher than 4.55.0 (#3596) A1waysBeenHere 2025-09-24 23:27:48 +08:00
634bd9352b [CI] chore: reopen ppo test in e2e_ascend CI (#3588) Zhen 2025-09-24 17:46:30 +08:00
26a734e740 [algo, perf] feat: Vectorize RLOO Advantage Estimator - 20x Speedup (#3555) EduardDurech 2025-09-24 11:36:41 +02:00
69b0127b74 [misc] feat: prototype deprecate DataProto and replace with Tensordict: part 2 (#3567) Houmin Wei 2025-09-24 17:12:31 +08:00