# `torch._inductor.analysis`

Contains scripts for inductor performance analysis.

## Analysis

This will analyze a chrome trace to create a table useful for performance work. We mainly care about . Currently, it adds the flops and the memory reads of each kernel, estimated by formula (it's not looking at program counters or anything). These, combined with the kernel duration, can be used to calculate achieved flops, achieved memory bandwidth, and roofline calculations.

### Usage

```
python profile_analysis.py --analysis
```

### Arguments

- `input_json_profile`: The json profile file generated by `torch.profiler.profile.export_chrome_trace()`.
- `default_dtype`: The default dtype of the model. Sometimes the dtypes of the kernel inputs are not available in the profile, so we use the default dtype to infer the dtypes of the inputs.

## Diff

This mode will diff two different profiles and output a table of the differences. It groups by kernel name, which can fail to match kernels properly across hardware vendors. More intelligent grouping is planned.

### Usage

```
python profile_analysis.py --diff --name_limit 50
```

### Arguments

- `json_profile_1` `json_profile_2`: The json profile files generated by `torch.profiler.profile.export_chrome_trace()`.
- `profile_name_1` `profile_name_2`: The names of the profiles. These are used to identify each profile in the output table.
- `default_dtype`: The default dtype of the model. Sometimes the dtypes of the kernel inputs are not available in the profile, so we use the default dtype to infer the dtypes of the inputs.
- `name_limit`: The maximum number of characters shown for a kernel name (names can be quite lengthy and hard to read).

## Augment

This mode will add post-hoc analysis to a profile. Currently, it adds the flops and the memory reads of each kernel, estimated by formula (it's not looking at program counters or anything). These, combined with the kernel duration, can be used to calculate achieved flops, achieved memory bandwidth, and roofline calculations.

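
To make the relationship between these quantities concrete, here is a minimal sketch of how achieved flops, achieved bandwidth, and a roofline bound could be derived from a kernel's estimated flops, estimated bytes read, and duration. It is not taken from `profile_analysis.py`: the function name, the parameter names, and the peak hardware numbers are all hypothetical, and the duration is assumed to be in microseconds, as in the chrome trace format.

```python
def achieved_metrics(flops, bytes_read, duration_us, peak_flops_per_s, peak_bytes_per_s):
    """Illustrative roofline arithmetic (hypothetical helper, not the tool's API).

    flops and bytes_read are formula-based estimates for one kernel;
    duration_us is the kernel duration in microseconds (must be positive).
    """
    seconds = duration_us / 1e6
    achieved_flops_per_s = flops / seconds
    achieved_bytes_per_s = bytes_read / seconds

    # Arithmetic intensity: flops performed per byte read.
    intensity = flops / bytes_read if bytes_read else float("inf")

    # Roofline bound: the kernel is limited either by peak compute or by
    # how fast memory can feed it at this arithmetic intensity.
    roofline_flops_per_s = min(peak_flops_per_s, intensity * peak_bytes_per_s)

    return {
        "achieved_flops_per_s": achieved_flops_per_s,
        "achieved_bytes_per_s": achieved_bytes_per_s,
        "arithmetic_intensity": intensity,
        "roofline_flops_per_s": roofline_flops_per_s,
        "fraction_of_roofline": achieved_flops_per_s / roofline_flops_per_s,
    }


# Example with made-up numbers: a 1 ms kernel doing 2 GFLOP while reading 1 GB,
# on hypothetical hardware with 10 TFLOP/s peak compute and 1 TB/s peak bandwidth.
metrics = achieved_metrics(2e9, 1e9, 1000, 1e13, 1e12)
```

With these made-up inputs the arithmetic intensity is 2 flops per byte, so the memory term (2 × peak bandwidth) is smaller than peak compute and the kernel would be classified as memory bound.
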
### Usage

```
python profile_analysis.py --augment_trace
```

### Arguments

- `input_json_profile`: The json profile file generated by `torch.profiler.profile.export_chrome_trace()`.
- `output_json_profile`: Where the augmented profile is written.
- `default_dtype`: The default dtype of the model. Sometimes the dtypes of the kernel inputs are not available in the profile, so we use the default dtype to infer the dtypes of the inputs.
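
The role of `default_dtype` can be illustrated with a small sketch of a formula-based memory-read estimate that falls back to a default when an input's dtype is missing. This is only an assumption about how such a fallback might look: the function name, the dtype table, and the input representation are hypothetical and are not the script's actual code.

```python
from math import prod

# Bytes per element for a few common dtype names (illustrative subset).
DTYPE_SIZES = {"float64": 8, "float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def estimate_read_bytes(input_shapes, input_dtypes, default_dtype="float32"):
    """Estimate bytes read by a kernel from its input shapes (hypothetical helper).

    input_dtypes may be shorter than input_shapes or contain None entries,
    mimicking a profile where dtype information is missing; in that case
    default_dtype is used instead.
    """
    total = 0
    for i, shape in enumerate(input_shapes):
        dtype = input_dtypes[i] if i < len(input_dtypes) else None
        if dtype is None:
            dtype = default_dtype  # fall back when the profile lacks the dtype
        total += prod(shape) * DTYPE_SIZES[dtype]
    return total


# A (2, 3) float16 input plus a length-4 input with unknown dtype,
# which is treated as float32 via the default.
read_bytes = estimate_read_bytes([(2, 3), (4,)], ["float16", None], "float32")
```

Choosing the wrong default would scale the estimated memory traffic for every input of unknown dtype, so the value passed should match the dtype the model actually runs in.
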