MoE + vllm = 😻 (#40132)

* update modeling mixtral

* oups

* fix

* better naming?

* compute softmax and top_k inside the experts
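
The bullet above is the core of the refactor: the sparse MoE block only produces router logits, and the experts module itself applies the softmax and top-k before dispatching tokens. A minimal sketch of that layout (illustrative class and parameter names, not the exact transformers code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleExperts(nn.Module):
    """Illustrative experts module that owns the softmax + top-k routing step."""

    def __init__(self, num_experts: int, hidden: int, inter: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, inter), nn.SiLU(), nn.Linear(inter, hidden))
            for _ in range(num_experts)
        )

    def forward(self, hidden_states: torch.Tensor, router_logits: torch.Tensor) -> torch.Tensor:
        # the softmax and top-k selection now live inside the experts, not in the MoE block
        routing_weights = F.softmax(router_logits, dim=-1)
        top_w, top_idx = torch.topk(routing_weights, self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(hidden_states)
        for i, expert in enumerate(self.experts):
            token_idx, slot = torch.where(top_idx == i)
            if token_idx.numel():
                out[token_idx] += top_w[token_idx, slot, None] * expert(hidden_states[token_idx])
        return out
```

With that split, the surrounding MoE block reduces to a gate that produces logits and a single `experts(hidden_states, router_logits)` call.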

* update minimax as well

* models that will need an update

* more models that need a fix

* stash

* fix mixtral

* update olmoe

* update

* update

* current changes

* nits

* molmoe is now fixed

* olmoe is good to go!

* refactor qwen2_moe

* fixes

* fixed moe

* fix qwen2 modular

* nit

* qwen2_moe test script works

* tricky rope!

* fix qwen3

* DeepSeek v3 MoE Standardization (#40538)

* DeepSeek-v3

Shared

Shared

* Dependents of DS3

* Standardize GLM4V MoE (#40539)

* up

* Standardize VitPose's MoE (#40549)

* VitPose

* outside

* outside

* outside

* fix

* update dbrx

* dbrx... the magic

* Refactor Ernie 4.5's MoE (#40547)

* Isolate Ernie fixes

* fix moe

---------

Co-authored-by: Vasqu <antonprogamer@gmail.com>

* fix style

* style

* fix copies

* style

* latest changes

* fixes

* had to stage

* current updaters

* up

* another modular

* modular graniteMoe

* some update

* draft another modular moe

* updaters

* up

* fix nit

* q3 nit

* fix phi moe

* we're going up up up up it's our mooooment

* fix switch transformers this time around

* up

* gptsan japanese is deprecated, forget about it

* fix mixtral to not be a linear (gives us more freedom)
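
One reading of the bullet above, sketched with illustrative names (a hedged sketch, not the exact change): keep the expert weights as stacked parameters instead of per-expert nn.Linear modules, which leaves backends such as vLLM free to fuse the expert matmuls.

```python
import torch
import torch.nn as nn


class StackedExperts(nn.Module):
    """Illustrative: expert weights held as plain parameters rather than nn.Linear modules."""

    def __init__(self, num_experts: int, hidden: int, inter: int):
        super().__init__()
        # one 3D tensor per projection; a backend is free to run this as a grouped/fused matmul
        self.gate_up_proj = nn.Parameter(torch.randn(num_experts, hidden, 2 * inter) * 0.02)
        self.down_proj = nn.Parameter(torch.randn(num_experts, inter, hidden) * 0.02)
        self.act = nn.SiLU()

    def expert_forward(self, expert_idx: int, x: torch.Tensor) -> torch.Tensor:
        gate, up = (x @ self.gate_up_proj[expert_idx]).chunk(2, dim=-1)
        return (self.act(gate) * up) @ self.down_proj[expert_idx]
```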

* update

* fix copies gone wrong try catch nothing

* fix mixtral

* new refactor again

* update aria as well

* up dbrx and deepseekv3

* nit

* fix phimoe?

* fix deepseek v3

* nits

* don't bother with this one please

* up olmoe

* ??

* fix olmoe

* yups

* fixup

* ish

* hot patch

* new qwen3

* updates

* up

* nit

* fix copies

* fix

* nits

* we're going up up up

* nits

* switch_transformers edge case

* lol modular gptsan?

* fix deepseek

* finally all modeling match modular

* update

* up

* up

* dang

* up

* up aria

* fix dbrx

* nits here and there

* finish fixing dbrx

* fix deepseek

* upd

* up

* fix flex olmo

* updated

* update jamba

* JAMBA is still a bit of a todo

* forward forward

* fix dots1

* update

* fix hunyuan

* fix some other

* update phimoe

* fuck you phimoe you are now submitted

* submit granitemoe as well

* try to fix some other models, reduces some of the failures

* fix olmoe and qwen2moe

* up

* up

* fix qwen2_moe

* update modular, make it simpler again

* nits

* up

* up

* fix

* some switch reductions

* up

* fix qwen3vl

* some fixes to jetmoe

* these should be shipped to the modular to fix jetmoe

* fix most of the nllb failures

* more nllb fixes

* fix the modular

* remove nllb modular as it sucks for now

* ?

* fix granitemoe

* granitemoehybrid doesn't have rope

* use rope when rope, no rope when no rope
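
In other words, rotary embeddings are applied only when the layer was actually built with them. A hedged sketch (the wrapper function here is illustrative; `apply_rotary_pos_emb` is the helper transformers ships in its Llama modeling code):

```python
from transformers.models.llama.modeling_llama import apply_rotary_pos_emb


def maybe_rotate(query, key, position_embeddings=None):
    # "use rope when rope": only layers built with rotary embeddings receive (cos, sin)
    if position_embeddings is not None:
        cos, sin = position_embeddings
        query, key = apply_rotary_pos_emb(query, key, cos, sin)
    # "no rope when no rope": everything else passes through untouched
    return query, key
```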

* updates

* finish fixing granitemoe

* fix most of minimax

* fix

* update modular

* ?

* up

* up jetmoe still broken

* up

* fix, now align the moe

* fix jetmoe

* fix styling and qwen3 repo consistency

* update

* up up

* update ruff?

* nits

* modeling is good now for switch

* fix

* more fixes to switch!

* fix some switch tests

* ?

* ?

* up

* fix switch modular!

* nit?

* up

* subtest

* can't believe I wasted so much time on this...

* fix

* updates

* nits

* nit jamba is fucking annoying

* ?

* fix?

* oups

* good good

* styling

* up

* make sure qwen2 sliding works!
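
Qwen2-style sliding-window attention restricts each query to the most recent window of keys on top of the causal mask. A small illustrative sketch of such a mask (not the transformers implementation):

```python
import torch


def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask where entry [i, j] is True if query i may attend to key j."""
    idx = torch.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]          # no attending to the future
    recent = idx[:, None] - idx[None, :] < window  # only the last `window` positions
    return causal & recent
```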

* fix dbrx small

* lol

* nits

* fix one test

* fix load balancing loss issue
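
For reference, the auxiliary load-balancing loss used by these MoE models follows the Switch Transformers recipe: the fraction of tokens routed to each expert times the mean router probability for that expert, scaled by the number of experts. A hedged, standalone sketch (function name is illustrative):

```python
import torch
import torch.nn.functional as F


def aux_load_balancing_loss(router_logits: torch.Tensor, top_k: int) -> torch.Tensor:
    """Switch-Transformers-style auxiliary loss: num_experts * sum_e f_e * P_e."""
    num_experts = router_logits.shape[-1]
    probs = F.softmax(router_logits, dim=-1)                 # (num_tokens, num_experts)
    _, selected = torch.topk(probs, top_k, dim=-1)           # experts actually used per token
    expert_mask = F.one_hot(selected, num_experts).float()   # (num_tokens, top_k, num_experts)
    tokens_per_expert = expert_mask.mean(dim=0)              # f_e per top-k slot
    router_prob_per_expert = probs.mean(dim=0)               # P_e
    return num_experts * torch.sum(tokens_per_expert * router_prob_per_expert)
```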

* fix jamba

* fix nllbmoe

* fix jamba consistency and doc?

* up

* these are correct

* up

* up

* up

* some of the final cleanup

* update

* up

* fix some reverts in granitemoe

* bring back attention multipliers for the granite family; we'll see later on if they need removal
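
The granite family scales attention scores by a configurable `attention_multiplier` instead of the usual 1/sqrt(head_dim). A hedged sketch of that scaling choice (function name illustrative):

```python
import math

import torch


def attention_scores(query, key, attention_multiplier=None):
    # granite-style: use the configured multiplier if present, else the usual 1/sqrt(head_dim)
    scaling = attention_multiplier if attention_multiplier is not None else 1.0 / math.sqrt(query.shape[-1])
    return torch.matmul(query, key.transpose(-1, -2)) * scaling
```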

* small jamba fix: docstring and typing

* fix phimoe

* yup

* fix unk return_dict in granitemoes

* up

* fix qwen config

* fix phimoe check quality

* nits

* update based on caught non-relative imports!

* fix dbrx

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* fix copies

* fixup

* fix dots1 regression!

* fix phimoe issue

* fix phi moe

* fix float() for some models
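
The float() fixes likely refer to the usual MoE numerical-stability pattern: run the router softmax in float32, then cast the routing weights back to the activation dtype before mixing experts. A hedged sketch (names illustrative):

```python
import torch
import torch.nn.functional as F


def routing_weights(router_logits: torch.Tensor, top_k: int, dtype: torch.dtype):
    # softmax in float32 for stability, independent of the model's compute dtype
    probs = F.softmax(router_logits, dim=-1, dtype=torch.float)
    top_w, top_idx = torch.topk(probs, top_k, dim=-1)
    top_w = top_w / top_w.sum(dim=-1, keepdim=True)
    # cast back so the expert outputs keep the activations' dtype
    return top_w.to(dtype), top_idx
```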

* fix jamba regression

* up

* more dtype issues

* fix deepseek2 and 3?

* proper update

* fix modular deepseek!

* jamba jambaaaaaa

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Author: Arthur
Committed by: GitHub
Date: 2025-10-02 12:12:44 +02:00
Commit: 7938e91faa (parent e6a8e7debe)
86 changed files with 9207 additions and 9959 deletions

In utils/check_config_attributes.py:

```diff
@@ -54,7 +54,7 @@ SPECIAL_CASES_TO_ALLOW = {
         "expert_layer_period",
     ],
     "Qwen2Config": ["use_sliding_window", "max_window_layers"],
-    "Qwen2MoeConfig": ["use_sliding_window"],
+    "Qwen2MoeConfig": ["use_sliding_window", "max_window_layers"],
     "Qwen2VLTextConfig": ["use_sliding_window", "max_window_layers"],
     "Qwen2_5_VLTextConfig": ["use_sliding_window", "max_window_layers"],
     "Qwen2_5OmniTextConfig": ["use_sliding_window", "max_window_layers"],
@@ -65,8 +65,10 @@ SPECIAL_CASES_TO_ALLOW = {
     # generation configs (TODO joao)
     "Gemma2Config": ["tie_word_embeddings", "cache_implementation"],
     "Cohere2Config": ["cache_implementation"],
+    "JetMoeConfig": ["output_router_logits"],
     # Dropout with this value was declared but never used
     "Phi3Config": ["embd_pdrop"],
+    "PhimoeConfig": ["max_position_embeddings"],
     # used to compute the property `self.chunk_length`
     "EncodecConfig": ["overlap"],
     # used to compute `frame_rate`
```

In utils/check_modular_conversion.py:

@ -197,10 +197,25 @@ if __name__ == "__main__":
# Process files with diff
num_workers = min(args.num_workers, len(files_to_check))
with multiprocessing.Pool(num_workers) as p:
is_changed_flags = p.map(
partial(compare_files, show_diff=not args.fix_and_overwrite),
files_to_check,
)
try:
is_changed_flags = p.map(
partial(compare_files, show_diff=not args.fix_and_overwrite),
files_to_check,
)
except Exception as e:
console.print(
f"[bold red]Failed to convert one or more files in batch: {files_to_check}[/bold red]"
)
console.print(f"[bold red]Error: {e}[/bold red]")
# Try to process files individually to identify which one failed
is_changed_flags = []
for file_path in files_to_check:
try:
result = compare_files(file_path, show_diff=not args.fix_and_overwrite)
is_changed_flags.append(result)
except Exception as individual_error:
console.print(f"[bold red]Failed to convert {file_path}: {individual_error}[/bold red]")
is_changed_flags.append(0) # Mark as no change to continue processing
# Collect changed files and their original paths
for is_changed, file_path in zip(is_changed_flags, files_to_check):

In utils/modular_model_converter.py:

```diff
@@ -1220,9 +1220,14 @@ class ModularFileMapper(ModuleMapper):
                     if import_module not in self.model_specific_modules:
                         if "models" not in import_module:
                             import_module = "models." + import_module
-                        if "transformers" not in import_module:
+                        if not import_module.startswith("transformers"):
                             import_module = "transformers." + import_module
-                        source_code = get_module_source_from_name(import_module)
+                        try:
+                            source_code = get_module_source_from_name(import_module)
+                        except ModuleNotFoundError as e:
+                            raise ModuleNotFoundError(
+                                f"Failed to visit import from for: {self.python_module.code_for_node(node)}. Tried to import {import_module} but failed."
+                            ) from e
                         tree = cst.parse_module(source_code)
                         self.model_specific_modules[import_module] = tree
                     imported_object = self.python_module.code_for_node(imported_.name)
```