MoE + vllm = 😻 (#40132)

* update modeling mixtral

* oups

* fix

* better naming?

* compute softmax and top_k inside the experts
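
The bullet above is the core of the refactor: the sparse MoE block only produces router logits, and the experts module itself applies the softmax and top-k before dispatching tokens. A minimal sketch of that layout (illustrative class and parameter names, not the exact transformers code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleExperts(nn.Module):
    """Illustrative experts module that owns the softmax + top-k routing step."""

    def __init__(self, num_experts: int, hidden: int, inter: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, inter), nn.SiLU(), nn.Linear(inter, hidden))
            for _ in range(num_experts)
        )

    def forward(self, hidden_states: torch.Tensor, router_logits: torch.Tensor) -> torch.Tensor:
        # the softmax and top-k selection now live inside the experts, not in the MoE block
        routing_weights = F.softmax(router_logits, dim=-1)
        top_w, top_idx = torch.topk(routing_weights, self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(hidden_states)
        for i, expert in enumerate(self.experts):
            token_idx, slot = torch.where(top_idx == i)
            if token_idx.numel():
                out[token_idx] += top_w[token_idx, slot, None] * expert(hidden_states[token_idx])
        return out
```

With that split, the surrounding MoE block reduces to a gate that produces logits and a single `experts(hidden_states, router_logits)` call.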

* update minimax as well

* models that will need an update

* more models that need a fix

* stash

* fix mixtral

* update olmoe

* update

* update

* current changes

* nits

* molmoe is now fixed

* olmoe is good to go!

* refactor qwen2_moe

* fixes

* fixed moe

* fix qwen2 modular

* nit

* qwen2_moe test script works

* tricky rope!

* fix qwen3

* DeepSeek v3 MoE Standardization (#40538)

* DeepSeek-v3

Shared

Shared

* Dependents of DS3

* Standardize GLM4V MoE (#40539)

* up

* Standardize VitPose's MoE (#40549)

* VitPose

* outside

* outside

* outside

* fix

* update dbrx

* dbrx... the magic

* Refactor Ernie 4.5's MoE (#40547)

* Isolate Ernie fixes

* fix moe

---------

Co-authored-by: Vasqu <antonprogamer@gmail.com>

* fix style

* style

* fix copies

* style

* latest changes

* fixes

* had to stage

* current updaters

* up

* another modular

* modular graniteMoe

* some update

* draft another modular moe

* updaters

* up

* fix nit

* q3 nit

* fix phi moe

* we're going up up up up it's our mooooment

* fix switch transformers this time around

* up

* gptsan japanese is deprecated, forget about it

* fix mixtral to not be a linear (gives us more freedom)
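
One reading of the bullet above, sketched with illustrative names (a hedged sketch, not the exact change): keep the expert weights as stacked parameters instead of per-expert nn.Linear modules, which leaves backends such as vLLM free to fuse the expert matmuls.

```python
import torch
import torch.nn as nn


class StackedExperts(nn.Module):
    """Illustrative: expert weights held as plain parameters rather than nn.Linear modules."""

    def __init__(self, num_experts: int, hidden: int, inter: int):
        super().__init__()
        # one 3D tensor per projection; a backend is free to run this as a grouped/fused matmul
        self.gate_up_proj = nn.Parameter(torch.randn(num_experts, hidden, 2 * inter) * 0.02)
        self.down_proj = nn.Parameter(torch.randn(num_experts, inter, hidden) * 0.02)
        self.act = nn.SiLU()

    def expert_forward(self, expert_idx: int, x: torch.Tensor) -> torch.Tensor:
        gate, up = (x @ self.gate_up_proj[expert_idx]).chunk(2, dim=-1)
        return (self.act(gate) * up) @ self.down_proj[expert_idx]
```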

* update

* fix copies gone wrong try catch nothing

* fix mixtral

* new refactor again

* update aria as well

* up dbrx and deepseekv3

* nit

* fix phimoe?

* fix deepseek v3

* nits

* don't bother with this one please

* up olmoe

* ??

* fix olmoe

* yups

* fixup

* ish

* hot patch

* new qwen3

* updates

* up

* nit

* fix copies

* fix

* nits

* we're going up up up

* nits

* switch_transformers edge case

* lol modular gptsan?

* fix deepseek

* finally all modeling match modular

* update

* up

* up

* dang

* up

* up aria

* fix dbrx

* nits here and there

* finish fixing dbrx

* fix deepseek

* upd

* up

* fix flex olmo

* updated

* update jamba

* JAMBA is still a bit of a todo

* forward forward

* fix dots1

* update

* fix hunyuan

* fix some other

* update phimoe

* fuck you phimoe you are now submitted

* submit granitemoe as well

* try to fix some other models, reduces some of the failures

* fix olmoe and qwen2moe

* up

* up

* fix qwen2_moe

* update modular, make it simpler again

* nits

* up

* up

* fix

* some switch reductions

* up

* fix qwen3vl

* some fixes to jetmoe

* these should be shipped to the modular to fix jetmoe

* fix most of the nllb failures

* more nllb fixes

* fix the modular

* remove nllb modular as it sucks for now

* ?

* fix granitemoe

* granitemoehybrid doesn't have rope

* use rope when rope, no rope when no rope
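
In other words, rotary embeddings are applied only when the layer was actually built with them. A hedged sketch (the wrapper function here is illustrative; `apply_rotary_pos_emb` is the helper transformers ships in its Llama modeling code):

```python
from transformers.models.llama.modeling_llama import apply_rotary_pos_emb


def maybe_rotate(query, key, position_embeddings=None):
    # "use rope when rope": only layers built with rotary embeddings receive (cos, sin)
    if position_embeddings is not None:
        cos, sin = position_embeddings
        query, key = apply_rotary_pos_emb(query, key, cos, sin)
    # "no rope when no rope": everything else passes through untouched
    return query, key
```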

* updates

* finish fixing granitemoe

* fix most of minimax

* fix

* update modular

* ?

* up

* up jetmoe still broken

* up

* fix, now align the moe

* fix jetmoe

* fix styling and qwen3 repo consistency

* update

* up up

* update ruff?

* nits

* modeling is good now for switch

* fix

* more fixes to switch!

* fix some switch tests

* ?

* ?

* up

* fix switch modular!

* nit?

* up

* subtest

* can't believe I wasted so much time on this...

* fix

* updates

* nits

* nit jamba is fucking annoying

* ?

* fix?

* oups

* good good

* styling

* up

* make sure qwen2 sliding works!
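
Qwen2-style sliding-window attention restricts each query to the most recent window of keys on top of the causal mask. A small illustrative sketch of such a mask (not the transformers implementation):

```python
import torch


def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask where entry [i, j] is True if query i may attend to key j."""
    idx = torch.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]          # no attending to the future
    recent = idx[:, None] - idx[None, :] < window  # only the last `window` positions
    return causal & recent
```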

* fix dbrx small

* lol

* nits

* fix one test

* fix load balancing loss issue
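
For reference, the auxiliary load-balancing loss used by these MoE models follows the Switch Transformers recipe: the fraction of tokens routed to each expert times the mean router probability for that expert, scaled by the number of experts. A hedged, standalone sketch (function name is illustrative):

```python
import torch
import torch.nn.functional as F


def aux_load_balancing_loss(router_logits: torch.Tensor, top_k: int) -> torch.Tensor:
    """Switch-Transformers-style auxiliary loss: num_experts * sum_e f_e * P_e."""
    num_experts = router_logits.shape[-1]
    probs = F.softmax(router_logits, dim=-1)                 # (num_tokens, num_experts)
    _, selected = torch.topk(probs, top_k, dim=-1)           # experts actually used per token
    expert_mask = F.one_hot(selected, num_experts).float()   # (num_tokens, top_k, num_experts)
    tokens_per_expert = expert_mask.mean(dim=0)              # f_e per top-k slot
    router_prob_per_expert = probs.mean(dim=0)               # P_e
    return num_experts * torch.sum(tokens_per_expert * router_prob_per_expert)
```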

* fix jamba

* fix nllbmoe

* fix jamba consistency and doc?

* up

* these are correct

* up

* up

* up

* some of the final cleanup

* update

* up

* fix some reverts in granitemoe

* bring back attention multipliers for the granite family; we'll see later on if they need removal
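
The granite family scales attention scores by a configurable `attention_multiplier` instead of the usual 1/sqrt(head_dim). A hedged sketch of that scaling choice (function name illustrative):

```python
import math

import torch


def attention_scores(query, key, attention_multiplier=None):
    # granite-style: use the configured multiplier if present, else the usual 1/sqrt(head_dim)
    scaling = attention_multiplier if attention_multiplier is not None else 1.0 / math.sqrt(query.shape[-1])
    return torch.matmul(query, key.transpose(-1, -2)) * scaling
```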

* small jamba fix: docstring and typing

* fix phimoe

* yup

* fix unk return_dict in granitemoes

* up

* fix qwen config

* fix phimoe check quality

* nits

* update based on caught non-relative imports!

* fix dbrx

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* fix copies

* fixup

* fix dots1 regression!

* fix phimoe issue

* fix phi moe

* fix float() for some models
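
The float() fixes likely refer to the usual MoE numerical-stability pattern: run the router softmax in float32, then cast the routing weights back to the activation dtype before mixing experts. A hedged sketch (names illustrative):

```python
import torch
import torch.nn.functional as F


def routing_weights(router_logits: torch.Tensor, top_k: int, dtype: torch.dtype):
    # softmax in float32 for stability, independent of the model's compute dtype
    probs = F.softmax(router_logits, dim=-1, dtype=torch.float)
    top_w, top_idx = torch.topk(probs, top_k, dim=-1)
    top_w = top_w / top_w.sum(dim=-1, keepdim=True)
    # cast back so the expert outputs keep the activations' dtype
    return top_w.to(dtype), top_idx
```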

* fix jamba regression

* up

* more dtype issues

* fix deepseek2 and 3?

* proper update

* fix modular deepseek!

* jamba jambaaaaaa

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Author: Arthur
Committed by: GitHub
Date: 2025-10-02 12:12:44 +02:00
Commit: 7938e91faa (parent e6a8e7debe)
86 changed files with 9207 additions and 9959 deletions

In utils/check_config_attributes.py:

```diff
@@ -54,7 +54,7 @@ SPECIAL_CASES_TO_ALLOW = {
         "expert_layer_period",
     ],
     "Qwen2Config": ["use_sliding_window", "max_window_layers"],
-    "Qwen2MoeConfig": ["use_sliding_window"],
+    "Qwen2MoeConfig": ["use_sliding_window", "max_window_layers"],
     "Qwen2VLTextConfig": ["use_sliding_window", "max_window_layers"],
     "Qwen2_5_VLTextConfig": ["use_sliding_window", "max_window_layers"],
     "Qwen2_5OmniTextConfig": ["use_sliding_window", "max_window_layers"],
@@ -65,8 +65,10 @@ SPECIAL_CASES_TO_ALLOW = {
     # generation configs (TODO joao)
     "Gemma2Config": ["tie_word_embeddings", "cache_implementation"],
     "Cohere2Config": ["cache_implementation"],
+    "JetMoeConfig": ["output_router_logits"],
     # Dropout with this value was declared but never used
     "Phi3Config": ["embd_pdrop"],
+    "PhimoeConfig": ["max_position_embeddings"],
     # used to compute the property `self.chunk_length`
     "EncodecConfig": ["overlap"],
     # used to compute `frame_rate`
```

In utils/check_modular_conversion.py:

@ -197,10 +197,25 @@ if __name__ == "__main__":
# Process files with diff
num_workers = min(args.num_workers, len(files_to_check))
with multiprocessing.Pool(num_workers) as p:
is_changed_flags = p.map(
partial(compare_files, show_diff=not args.fix_and_overwrite),
files_to_check,
)
try:
is_changed_flags = p.map(
partial(compare_files, show_diff=not args.fix_and_overwrite),
files_to_check,
)
except Exception as e:
console.print(
f"[bold red]Failed to convert one or more files in batch: {files_to_check}[/bold red]"
)
console.print(f"[bold red]Error: {e}[/bold red]")
# Try to process files individually to identify which one failed
is_changed_flags = []
for file_path in files_to_check:
try:
result = compare_files(file_path, show_diff=not args.fix_and_overwrite)
is_changed_flags.append(result)
except Exception as individual_error:
console.print(f"[bold red]Failed to convert {file_path}: {individual_error}[/bold red]")
is_changed_flags.append(0) # Mark as no change to continue processing
# Collect changed files and their original paths
for is_changed, file_path in zip(is_changed_flags, files_to_check):

In utils/modular_model_converter.py:

```diff
@@ -1220,9 +1220,14 @@ class ModularFileMapper(ModuleMapper):
                     if import_module not in self.model_specific_modules:
                         if "models" not in import_module:
                             import_module = "models." + import_module
-                        if "transformers" not in import_module:
+                        if not import_module.startswith("transformers"):
                             import_module = "transformers." + import_module
-                        source_code = get_module_source_from_name(import_module)
+                        try:
+                            source_code = get_module_source_from_name(import_module)
+                        except ModuleNotFoundError as e:
+                            raise ModuleNotFoundError(
+                                f"Failed to visit import from for: {self.python_module.code_for_node(node)}. Tried to import {import_module} but failed."
+                            ) from e
                         tree = cst.parse_module(source_code)
                         self.model_specific_modules[import_module] = tree
                     imported_object = self.python_module.code_for_node(imported_.name)
```