71db0d49e9
feat: add benchmark v2 ci with results pushed to dataset ( #41672 )
2025-10-20 08:56:58 +01:00
307c523854
further improve utils/check_bad_commit.py ( #41658 ) ( #41690 )
...
* fix
* Update utils/check_bad_commit.py
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com >
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com >
2025-10-17 23:07:00 +02:00
b9bd8c45a1
[CI] Build translated docs ( #41632 )
...
fix
2025-10-16 14:01:33 +02:00
2b2c20f315
Update issue template ( #41573 )
...
* update
* fix
2025-10-15 13:54:37 +02:00
94df0e6560
Benchmark overhaul ( #41408 )
...
* Big refactor, still classes to move around and script to re-complexify
* Move to streamer, isolate benches, propagate num tokens
* Some refacto
* Added compile mode to name
* Re-order
* Move to dt_tokens
* Better format
* Fix and disable use_cache by default
* Fixed compile and SDPA backend default
* Refactor results format
* Added default compile mode
* Always use cache
* Fixed cache and added flex
* Plan for missing modules
* Experiments: no cg and shuffle
* Disable compile for FA
* Remove wall time, add sweep mode, get git commit
* Review compliance, start
* Apply suggestions from code review
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com >
* Update benchmark_v2/framework/benchmark_runner.py
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com >
* Disable workflow
* Pretty print
* Added some pretty names to have pretty logs
* Review n2 compliance (end?)
* Style and end of PR
---------
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com >
2025-10-14 21:41:43 +02:00
1a3a5f5289
Remove SigOpt ( #41479 )
...
* remove sigopt
* style
2025-10-09 18:05:55 +02:00
42bcc81ba2
Minor security fix for ssh-runner.yml ( #41317 )
...
security issue
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-03 14:14:34 +02:00
7adb43e60a
Build doc in 2 jobs: en and other languages ( #41290 )
...
* separate
* separate
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-02 14:33:57 +00:00
e1f1d32af0
Remove some previous team members from allow list of triggering Github Actions ( #41263 )
...
* delete
* delete
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-02 16:32:28 +02:00
639ad8ccd9
feat: use `aws-highcpu-32-priv` for amd docker img build ( #41285 )
...
* feat: use `aws-highcpu-32-priv` for amd docker img build
* feat: add `workflow_dispatch` event to docker build CI
2025-10-02 12:53:14 +00:00
9d8f693c7e
add peft team members to issue/pr template ( #41262 )
...
* add
* Update .github/PULL_REQUEST_TEMPLATE.md
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
2025-10-01 17:26:59 +00:00
8e7b0655f1
update code owners ( #41221 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-30 16:21:19 +02:00
1f1e93e095
Align pull request template to bug report template ( #41220 )
...
The only difference is that I don't direct users to https://discuss.huggingface.co/ for hub issues.
2025-09-30 14:25:41 +02:00
399c589dfa
Separate docker images for Nvidia and AMD in benchmarking ( #41119 )
...
Separate docker images for Nvidia and AMD
2025-09-29 17:03:27 +02:00
2dcb20dcec
CI Runners - move amd runners mi355 and 325 to runner group ( #41193 )
...
* Update CI workflows to use devmi355 branch
* Add workflow trigger for AMD scheduled CI caller
* Remove unnecessary blank line in workflow YAML
* Add trigger for workflow_run on main branch
* Update workflow references from devmi355 to main
* Change runner_scale_set to runner_group in CI config
2025-09-29 11:14:19 +02:00
03c92884b5
Update team member list for some CI workflows ( #41094 )
...
* update list
* update list
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-23 09:48:40 +00:00
1bb69cce82
Fix CI jobs being all red 🔴 (false positive) ( #41059 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-22 16:51:00 +02:00
b9d337b6f3
Add write token for uploading benchmark results to the Hub ( #41047 )
...
* Separate write token for Hub upload
* Address review comments
* Address review comments
2025-09-22 14:13:46 +00:00
67097bf340
Fix benchmark runner argument name ( #41012 )
2025-09-20 10:53:56 +02:00
96a3e898cd
RUFF fix on CI scripts ( #40805 )
...
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com >
2025-09-19 13:50:26 +00:00
fce746512b
[docs] rm stray tf/flax autodocs references ( #40999 )
...
rm tf references
2025-09-19 12:04:12 +01:00
61eff450d3
Benchmarking v2 GH workflows ( #40716 )
...
* WIP benchmark v2 workflow
* Container was missing
* Change to sandbox branch name
* Wrong place for image name
* Variable declarations
* Remove references to file logging
* Remove unnecessary step
* Fix deps install
* Syntax
* Add workdir
* Add upload feature
* typo
* No need for hf_transfer
* Pass in runner
* Runner config
* Runner config
* Runner config
* Runner config
* Runner config
* mi325 caller
* Name workflow runs properly
* Copy-paste error
* Add final repo IDs and schedule
* Review comments
* Remove wf params
* Remove parametrization from workflow files
* Fix callers
* Change push trigger to pull_request + label
* Add back schedule event
* Push to the same dataset
* Simplify parameter description
2025-09-19 08:54:49 +00:00
5ac3c5171a
Track the CI (model) jobs that don't produce test output files (process being killed etc.) ( #40981 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-18 18:27:27 +02:00
738b223f57
Add captured actual outputs to CI artifacts ( #40965 )
...
* fix
* fix
* Remove `# TODO: ???` as it makes me `???`
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-18 15:40:53 +02:00
270da89708
Remove runner_map ( #40880 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-16 15:18:07 +02:00
96d3795cfc
Update model tags and integration references in bug report ( #40881 )
2025-09-15 12:08:29 +02:00
9c804f7ec4
Redirect MI355 CI results to dummy dataset ( #40862 )
2025-09-14 18:42:49 +02:00
d8f670583e
Change docker image to preview for the MI355 CI ( #40693 )
...
* Change docker image to preview for the MI355 CI
* Use pushed image
2025-09-04 17:23:09 +02:00
30a4b8707d
CircleCI docker images cleanup / update / fix ( #40681 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-04 10:42:18 +02:00
ca9b36a9c1
Avoid nightly torch CI not running because of an irrelevant docker image failing to build ( #40677 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-04 09:06:37 +02:00
8c60a7c385
Add collated reports job to Nvidia CI ( #40470 )
...
* Add collated reports job to Nvidia CI
* machine_type
* Move collated reports job to model_jobs
* Propagate repo id variable
* assign runner_type is self-scheduled-caller
2025-09-02 14:25:22 +02:00
3c3dac3c12
Add Copilot instructions ( #40432 )
...
* Add copilot-instructions.md
* Fix typo
* Update .github/copilot-instructions.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2025-09-01 14:09:54 +01:00
db6821b79c
Allow remi-or to run-slow ( #40590 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-01 12:30:53 +02:00
821384d5d4
Fix the CI workflow of merge to main ( #40503 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-08-27 18:35:12 +02:00
304225aa15
Collated reports: no need to upload artifact ( #40502 )
...
No need to upload collated reports as gh artifact
2025-08-27 18:31:55 +02:00
80f4c0c6a0
CI when PR merged to main ( #40451 )
...
* up
* up
* up
* up
* up
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-08-27 10:56:18 +02:00
ff8b88a948
Fix nightly torch CI ( #40469 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-08-26 22:02:15 +02:00
74ad608a2b
Not to shock AMD team by the cancelled workflow run notification ❤️ 💖 ( #40467 )
2025-08-26 20:53:24 +02:00
6b5eab70e4
Remove working-dir from collated reports job ( #40435 )
2025-08-25 18:14:35 +02:00
1a35d07f56
Update collated reports working directory and --path ( #40433 )
2025-08-25 15:18:26 +00:00
5d906740d2
Update CI with nightly torch workflow file ( #40306 )
...
* fix nightly ci
* Apply suggestions from code review
Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com >
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com >
2025-08-20 16:59:00 +02:00
28746cdc7b
Remove MI300 CI ( #40270 )
...
Remove MI300 CI (in history if we need it back)
2025-08-19 08:23:39 +00:00
e472efb9ac
Fix benchmark workflow ( #40254 )
...
Correct init_db.sql path
Co-authored-by: Akos Hadnagy <akoshuggingface@mi325x8-123.atl1.do.cpe.ice.amd.com >
2025-08-18 18:14:16 +00:00
2fe43376cd
AMD scheduled CI ref env file ( #40243 )
...
* Reference env-file to be used in docker running the CI
* Disable MI300 CI for now
2025-08-18 15:23:27 +02:00
e446372f76
Create self-scheduled-amd-mi355-caller.yml ( #40134 )
2025-08-14 01:33:45 +02:00
ebceef343a
Collated reports ( #40080 )
...
* Add initial collated reports script and job definition
* provide commit hash for this run. Also use hash in generated artifact name. Json formatting
* tidy
* Add option to upload collated reports to hf hub
* Add glob pattern for test report folders
* Fix glob
* Use machine_type as path filter instead of glob. Include machine_type in collated report
2025-08-13 14:48:15 +02:00
801e869b67
send some feedback when manually building doc via comment ( #39889 )
...
* fix
* fix
* fix
* Update .github/workflows/pr_build_doc_with_comment.yml
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
2025-08-04 18:20:48 +00:00
0d511f7a77
Use comment to build doc on PRs ( #39846 )
...
* try
* try
* try
* try
* try
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-08-04 10:24:45 +02:00
33aa49df9d
[docs] Ko doc fixes after toc update ( #39660 )
...
* update docs
* doc builder working
* make fixup
2025-07-29 17:05:26 +01:00
63b3200779
Use --gpus all in workflow files ( #39752 )
...
gpu all
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-07-29 14:53:33 +02:00